JOIN Process
To access this process:
- Data ribbon >> Data Tools >> Relational >> Join.
See this process in the Command Table.
Process Overview
Performs a relational join, subset join, weave or subset weave of two input files into an output file. The type of join required is controlled by the values of the optional parameters @SUBSETR AND @SUBSETF.
This process belongs to a group of four similar ones
within the Datamine process collection; JOIN,
SUBJOI, WEAVE
and SUBWVE. Each provides a different
outcome, as described by the following diagram:
By default, with @SUBSETR=0 and @SUBSETF=0, JOIN writes out all records and all fields from both input files. Records are compared on the specified keyfields, and if a match is found, the two records are combined into one on the output file. If both files have identical Data Definitions then the record from the second input file is the one written out. Thus in this case the second file updates the first. If no match is found then records are written out with absent data codes for the missing fields.
-
With @SUBSETR=1 and @SUBSETF=0, a relational subset join is carried out. See process SUBJOI for further details.
-
With @SUBSETR=0 and @SUBSETF=1, a relational weave is carried out. See process WEAVE for further details.
-
With @SUBSETR=1 and @SUBSETF=1, a relational subset weave is carried out. See process SUBWVE for further details.
At least one keyfield must be specified and must appear in both input files as an explicit field. The keyfield may be up to 5 words long, and support is provided for up to 30 fields. If a field is specified which does not exist in both input files, it is ignored, providing at least one field matches.
Both input files must be sorted in the order of the keyfields before they can be joined. If this is not the case, the process will exit with an error message.
Input Files
Name |
Description |
I/O Status |
Required |
Type |
IN1 |
First file to be updated (sorted on required keyfields). |
Input |
Yes |
Table |
IN2 |
Second file (update file) (sorted on required keyfields). |
Input |
Yes |
Table |
Output Files
Name |
I/O Status |
Required |
Type |
Description |
OUT |
Output |
Yes |
Table |
Output file. |
Fields
Name |
Description |
Source |
Required |
Type |
Default |
KEY1 to 30 |
Keyfields for matching. Up to 30 keyfields can be specified. |
IN1, IN2 |
No |
Any |
Undefined |
Parameters
Name |
Description |
Required |
Default |
Range |
Values |
SUBSETR |
Controls whether all records or a subset are written to the output file. If set to (0) all records are written. |
No |
0 |
0,1 |
0,1 |
SUBSETF |
Controls whether all fields or a subset is written to the output file. If set to (0) all fields are written. With SUBSETR and SUBSETF set to 0 JOIN writes out all records and all fields from both input files. With SUBSETR=1 and SUBSETF=0 a relational subset join is carried out. With SUBSETR=0 and SUBSETF=1 a relational weave is carried out. With SUBSETR=1 and SUBSETF=1 a relational subset weave is carried out. |
No |
0 |
0,1 |
0,1 |
CARTJOIN |
If set to (0) and if no keyfields are specified the process will terminate with an error. If set to 1 the full Cartesian product is produced and written to the output file. No keyfields should be specified to produce the Cartesian product. |
No |
0 |
0,1 |
0,1 |
KEYTOL |
KEYTOL is the tolerance value used to test whether numeric key values are equal. It must be greater than or equal to zero. It replaces the previous heuristic comparison method. If KEYTOL is set to a negative value then zero is used. In a macro KEYTOL can be set to absent using -. "@KEYTOL=-" This will revert to legacy behaviour and use a heuristic comparison in relational commands and zero in sort. |
No |
0.00001 |
0,+ |
Undefined |
|
>=1 Display messages on Data definitions. Default is (0) |
No |
0 |
0,1 |
0,1 |
Example
!JOIN &IN1(ASSAYS),&IN2(COLLARS.S),&OUT(HOLES),*KEY1(BHID)
Error and Warning Messages
Message |
Description |
Solution |
>>> INPUT FILE nn NOT SORTED ON KEY FIELD(S) >>> POSITION IN FILE 1 IS nnnnnnnn >>> POSITION IN FILE 2 IS nnnnnnnn |
Fatal; the process is exited. The position numbers are the record numbers reached in each file. |
|
>>> FILE ffffffff CANNOT BE USED AS BOTH <<< >>> INPUT AND OUTPUT BY THIS PROCESS <<< >>> ERR 130 <<< ( fileno) IN JOIN |
Either the first or second input file has the same name as the output file. Fatal; the process is exited. |
|
>>> ERR 47 <<< ( 0) IN FNDKEY |
Warning; none of the specified key fields exist in the input files. The full Cartesian product is produced and written to the output file. |
|
>>> KEYFIELD aaaaaaaa MISSING FROM FILE ffffffff |
A warning message that is produced if @PRINT >=1. The keyfield is ignored and processing continues. |
|
Related topics and activities