Process Help EXTRA - create and modify fields using algebraic expressions |
Process Name |
Menu Path |
Link to Command Table |
EXTRA |
Data ribbon | Data Tools | Expressions |
Introduction
EXTRA is a general purpose EXpression TRAnslator that allows you to transform the contents of files by modifying fields and creating new ones based on the values of existing fields.
Fields can be an alphanumeric or numeric, and are restricted to 24 characters. If you attempt to add a field name that already exists in the selected object, a warning will be displayed and it will not be added. Similarly, restricted Datamine field names cannot be used.
|
How to use
EXTRA makes it easy to calculate new fields in any file, based on existing fields and values. You specify how you want the fields transformed in a simple and natural manner, for example to create a new field "C" which calculates the average of two fields "A" and "B", you can say
C = (A + B)/2
EXTRA will interpret the expression, create the new field and calculate a value for each record.
- Expressions and fields
- Starting to process records
- Arithmetic Operations
- Character Operations
- Ambiguities
- Relational Operations
- Conditional execution
- Functions
- Numeric functions
- "Special" values
- Record selection functions
- Finding the default value of a field
- Using @APPROX
- Functions for alphanumeric fields
- Procedures
Expressions and fields
Expressions can involve constants, field values or function values. These may be either numeric or alphanumeric. If an expression results in a numeric value, it must be assigned to a numeric field. Similarly, alphanumeric expressions must be assigned to alphanumeric fields.
An expression may be assigned to a field that already exists, or a new field may be created. New numeric fields may be created simply by assigning to them. To reinforce the fact that a new numeric field is being created, you may add ";n" to the end of the field name. For example
C;n = (A + B)/2
will create a new numeric field called "C" and assign the result of the expression to it. If you are creating a new alphanumeric field, you must append a field specification of the form ";alen" to the field name, where "len" is the maximum number of characters that the field is to contain. For example
BHID;a12 = join("BH", string(N, 0))
creates a new alphanumeric field BHID with a maximum of 12 characters. (For a description of the functions used in the expression, see below). In EXTRA alphanumeric fields are limited to a maximum of 80 characters.
Starting
EXTRA will continue to accept and compile transformations until you tell it to start processing the file. If you are using EXTRA interactively then processing is started by clicking the OK button. If you are running via a script or a macro then you must insert the keyword go after all transformation expressions to tell EXTRA to start processing. If you are recording to a script or a macro then the keyword go will be inserted automatically for you.
Arithmetic Operations
EXTRA allows you to use a full set of arithmetic and relational operations on values. There is an inferred precedence of operations:
-
Negation (e.g. A = -B)
-
Multiplication and division
-
Addition and subtraction
Therefore, an expression like "A = B * C + D * E" first multiplies the values of fields B and C, then multiplies D and E together, and adds the two results. You can use brackets where necessary to change the order in which transformations are performed, for example "A = B * (C + D) * E". There is no limit to the complexity of the expression that you can use.
There is a limit to the number of expressions that you can use, however because EXTRA compiles your transformations into an internal form that can be executed quickly, it is not easy to say what the limit is!
You can extend an expression over as many lines as is necessary. You may insert a line break anywhere, except within a word.
Character Operations
EXTRA offers a range of functions to manipulate alphanumeric (character) fields. The value assigned to a character field must be a character constant, or the result of a character expression, such as
join("RDH", substr(BHID,4,3))
A character constant is a string of characters enclosed in double quotes (e.g. "RDH034"). EXTRA does not recognise single quotes as character constant delimeters.
Ambiguities
If a field name includes a reserved character or conflicts with a function name then the field name should be enclosed in single quotes as the following three examples illustrate:Missing, dash or minus?
The character "-" has three meanings within your application and EXTRA. In your application, it is used as shorthand for "-infinity", which is the value used to denote missing values of numeric fields. It is a legal character in a field name (such as "NUM-FLDS"), and in EXTRA it is used as a subtraction operator. EXTRA resolves these ambiguities as follows:
- Normally, "-" is an arithmetic subtraction operator
- If used in a field name, the name must be enclosed in single quotes, e.g. 'NUM-FLDS' = 2
- To check for a missing value for a numeric field, compare with the value returned by the absent() function (see below).
However if you want to create an alpha field then it requires the field type descriptor. If you wish to create field of this type containing a minus character, you will need to also include the type descriptor within the quotes, e.g.:
'A-1;a4' = "abcd" is correct, whereas;
'A-1';a4 = "abcd" will generate a syntax error, and is incorrect.
Other reserved characters?
As well as - (minus), the following are reserved characters: + / * = < > ( ). Therefore if a field name includes one of these characters it must be enclosed in single quotes. For example: 'g/t'
Field or function?
Sometimes, you might want to use or create a field with the same name as one of the built-in functions. For example, your input file might contain a field called "max", and you want to update this by comparing it with another grade field. You need to help EXTRA to tell when max is a field name, and when it is a function name. Do this by placing the field name in single quotes, for example:
'max' = max('max', grade)
Relational Operations
In addition to arithmetic operators, EXTRA also provides relational operators. These are normally used in evaluating conditions ("if" tests), as described below, but they may be used elsewhere. The relational operators are:
== | equal to |
!= | not equal to |
< | less than |
<= | less than or equal to |
> | greater than |
>= | greater than or equal to |
If a relationship is "true", then the relational operator evaluates to 1.0, otherwise if a relationship is "false", the operator evaluates to 0.0. For example, the following creates a new field called CHECK, and assigns it the value 1.0 if the value of field FROM is greater than or equal to the value of the field TO:
CHECK = (FROM >= TO)
(It is, in fact, unnecessary to use brackets here, because relational operations have a lower precedence than arithmetic or assignment operations. We used them to clarify what happens).
Combining relational expressions
You can use the keywords "and" and "or" to combine relational (and other) expressions. For example, you can say
CHECK = (FROM >= TO and FROM >= prev(FROM))
which assigns 1.0 to the field CHECK if FROM is greater than or equal to TO and FROM is greater than or equal to the same field on the previous record.
Conditional Execution
EXTRA provides full support for conditional execution of transformations. This is achieved through the use of the if, else, elseif and end keywords. An example of where this might be used is as follows:
ERRORS = 0
if (FROM < TO or FROM < prev(FROM))
ERRORS = ERRORS + 1
end
This fragment begins by initializing a field called ERRORS to zero. (If the field does not exist, it is created automatically). EXTRA then compares the FROM and TO fields and the FROM field with the value on the previous record. If this is "true", EXTRA executes transformations following the if, up to a matching end. There can, of course, be any number of transformations after the if test.
The transformations could be extended, for example
ERRORS = 0
if (FROM < TO or FROM < prev(FROM))
ERRORS = ERRORS + 1
elseif (GRADE > 20)
ERRORS = ERRORS + 1
end
Note that for each if there must be a matching end. There may be as many elseif's as you wish, and one else if necessary. Note that elseif is one word. EXTRA will recognise else if as the beginning of an else followed by a new (nested) if. The new if must be completed by it's own matching end.
If...end constructions may be embedded within other if...end's. There is no limit to the depth to which you may sink while doing this!
Functions
EXTRA provides a long list of functions that can be used in transforming fields. A function is identified by its name, followed by a list of zero or more arguments enclosed in brackets. The brackets must be specified, even if the number of values passed to the function is zero. A function always returns a value. This value may be assigned to a field, or it may be used as a term in a transformation. Examples of EXTRA statements using functions are:
sx = sqrt(X)
c = sqrt(a*a + b*b - 4*a*c*cos(ac))
BIGEST = max(G1, (G2+G3)/2, G4, G5)
Numeric Functions
EXTRA provides the following functions which all take numeric values as arguments and return a numeric value. For trigonometrically functions all angles are in degrees.
Function name | Arguments | Returns |
abs | X | The absolute (positive) value of X |
absent | none | The Datamine "absent data" value |
acos | X | The angle whose cosine is X |
asin | X | The angle whose sine is X |
atan | X | The angle whose tangent is X |
cos | X | The cosine of X |
exp | X | The constant "e" raised to the power of X |
int | X | The integer part of X |
log | X | The base-10 logarithm of X |
loge, logn | X | The base-"e" logarithm of X |
max | X1, X2, X3, ... | The maximum value of X1, X2, X3, ... |
maxia | X1, X2, X3, ... | The maximum value of X1, X2, X3, ... ignoring absent arguments. |
min | X1, X2, X3, ... | The minimum value of X1, X2, X3, ... ignoring absent arguments. For example:
|
minia | X1, X2, X3, ... | The minimum value of X1, X2, X3, ... |
mod, modc | X, Y | The remainder when X is divided by Y |
phi | X | The inverse normal distribution function |
pow, rais | X, Y | The value of X raised to the power of Y |
round | X, Y | Round off the value X to the nearest multiple of Y. Equivalent to Y*int(X/Y + 0.5) |
sin | X | The sine of X |
special | X | One of your application's "special" values (see below) |
sqrt | X | The square root of X |
tan | X | The tangent of X |
"Special" Values
The special(X) function returns a "special" numeric value, depending on the argument given. The argument must evaluate to a character string.
Argument | Value returned |
"TOP" or "+" | The Datamine value for +infinity, sometimes referred to as "Top" |
"BOTTOM" or "-" | The Datamine value for -infinity, sometimes referred to as "Bottom". This is the same value as is used for missing (absent) values for numeric fields. |
"TR" or "DL" | The Datamine "trace" value (1.0e-30) |
"PI" | The numeric constant "pi" (3.14159...) |
"E" | The numeric constant "e" (2.71828...) |
Record Selection Functions
The following functions may be used to select field values from adjacent records. The functions return a numeric or alphanumeric value, depending on the type of the field supplied as an argument. All values passed to these functions must be the names of fields.
Function name | Arguments | Returns |
prev | X | The value of field X from the previous record |
next | X | The value of field X from the next record |
If processing the first record, prev(X) returns a "missing value" (i.e. the same value as absent()). Similarly, when processing the last record (i.e. the last record that matches current retrieval criteria), next(X) returns a missing value.
Do not use the prev function if the previous record has been deleted using delete procedure as the result will be undetermined.
The following functions may be used to test whether a record is the first or last in a file.
Function name | Arguments | Returns |
first | none | If processing the first record, returns 1.0, otherwise returns 0.0 |
last | none | If processing the last record, returns 1.0, otherwise returns 0.0 |
In all cases with these functions, "first" and "last" mean "the first (or last) record that matches current retrieval conditions".
The first() and last() functions are normally used within conditional tests, e.g.
if (first())
sumX = X
else
sumX = sumX + prev(X)
end
Finding the default value of a field
The default() function returns the default value of the field specified. The value passed to this function must be the name of a field in the current file, otherwise the error is reported and EXTRA terminates.
Function name | Arguments | Returns |
default | X | The default value of field X. The value returned will be either numeric or alphanumeric, depending on the type of the field passed as the argument. |
Note that default() is a function, and it returns the default value of a field; you cannot set the default value of a field using EXTRA. In other words, it is incorrect to say:
default(AU) = 10.0# This is incorrect!
Using @APPROX
Rounding errors are sometimes introduced by numerical calculations or when legacy single precision files are converted to extended precision. Setting @APPROX=1 specifies that EXTRA will allow for rounding errors when making comparisons between numeric values.
Functions for alphanumeric fields
The following functions are provided for the manipulation of alphanumeric fields:
Function name | Arguments | Returns |
decode | A | The numeric value of the first set of numeric characters embedded in A |
field | F1 | The contents of a field. The field name must be supplied as a string - for example, field ("AU"). This allows field names with otherwise prohibited characters, such as '-', to be referenced. This function can only be used for getting the value of a field - not for setting it. |
join | A1, A2, A3, ... | The concatenation of the alphanumeric values A1, A2, A3, ... |
lcase | A | Returns the string given as an argument, with all upper case letters converted to lower case |
len | A | The length (up to the first trailing space) of A |
match | A, P | The position where a regular expression P matches characters in A |
string | N, M | The numeric value N, converted to a string with M decimals. If M is omitted, the function uses a general-purpose numeric format. |
substr | A, N, M | Returns a sub-string, consisting of M characters taken from A, starting with character N. If M is omitted, all characters from N to the end are returned. |
trim | A | Returns the value of A with trailing spaces removed |
ucase | A | Returns the string given as an argument, with all lower case letters converted to upper case |
Notes for Alphanumeric functions
- The match() function uses character patterns known as "regular expressions". A regular expression consists of a sequence of literal characters to be matched, or one or more of the following elements:
% | Matches the beginning of the field value |
$ | Matches the end of the field value |
* | Zero or more occurrences of the preceding pattern element |
? | Any single character |
[ ... ] | Any one of the characters enclosed in the square brackets. The shorthand notation "a-z" means any lowercase letter. |
[^ ... ] | Any character except one of these |
The special meaning of a character (e.g. "*") is lost if the character is preceded by "\", hence to match a literal "*", use "\*".
- Be careful when you use the trim() function to create a new character field. Character fields in a database file are always padded out to the declared length with spaces. For example, if the command.
STRING1;A12 = trim(" ABC ")
is used, then the trim() function will remove both the leading and trailing blanks, but assigning the result to the STRING1 field will add 9 trailing spaces to the stored value ("ABC "). The trim command is designed to be used as an argument to another command. For example:
STRING2;A12 = join(trim(STRING1), "XYZ")
This will create STRING2 by adding the characters "XYZ" after the last non blank character in STRING1.
Procedures
A "procedure", unlike a function, does not return any value. It just carries out some action. EXTRA provides three procedures:
Procedure | Arguments | Description |
erase | F1, F2, F3, ... | Erases the fields F1, F2, F3, ... from the output file |
saveonly | F1, F2, F3, ... | Erases all fields except F1, F2, F3, ... from the output file. If used more than once, only the values in the last saveonly are honoured. |
none | Deletes the current record |
Adding Comments
EXTRA statements can be supported by comments, e.g.: !START M1!EXTRA &IN(MOD2), &OUT(t)# Comment 1
NEWFLD = 1
# Comment 2
NEWFLD2=9999
# Commetn 3
GO
!END Note that comments are not currently supported within IF loops. The following statement, for example, will cause EXTRA to fail:VAL2 = PVALUE*1.5
if (VAL2 == 8)
# comment is here
VAL5 = 5
ENDFiles, Fields and Parameters
Input Files
Name | Description | I/O Status | Required | Type |
IN | Input file. | Input | Yes | Undefined |
Output Files
Name | I/O Status | Required | Type | Description |
OUT | Output | Yes | Undefined | Output file. |
Parameters
Name | Description | Required | Default | Range | Values |
Display output to command processor | No | 0 | 0 or 1 | 0 = do nothing 1 = show all code in the output window during processing. | |
@APPROX | Allow for rounding errors when making comparisons | No | 0 | undefined | undefined |
Notes
No additional notes.
Examples
The ways in which EXTRA can be used are endless. This section provides a few ideas of what it can do:
Degrees, minutes and seconds
EXTRA can convert degrees, minutes and seconds to decimal degrees and vice-versa. Starting with fields dd, mm, and ss, EXTRA can easily create a field containing decimal degrees:
# Change dd, mm, ss to decimal degrees
dms = dd + mm/60 + ss/3600The field dms now contains decimal degrees. To change them back, use the following:
dd = int(dms)
mm = mod(int(60*dms), 60)
ss = int(10*mod(3600*dms, 60) + 0.5)/10The final expression rounds-off the seconds field, ss, to the nearest 1/10th.
Don't forget that if you are writing the expressions in a script or a macro then the keyword go must be entered on the last line.
Sort fields into order of increasing values
The SORTX or MGSORT process can be used to sort records into increasing or decreasing order. But what if you have five fields, a, b, c, d and e, and you want to shuffle their values such that the value assigned to a is less than or equal to the value assigned to b, and b is <= c, c <= d and d <= e. The following EXTRA transformations do this, and also show that it is not necessary to place every transformation of a separate line:
if (a > b) t = a a = b b = t end
if (b > c) t = b b = c c = t end
if (c > d) t = c c = d d = t end
if (d > e) t = d d = e e = t end
if (a > b) t = a a = b b = t end
if (b > c) t = b b = c c = t end
if (c > d) t = c c = d d = t end
if (a > b) t = a a = b b = t end
if (b > c) t = b b = c c = t end
if (a > b) t = a a = b b = t end
erase(t)
Add a record-number field
The process COPYNR copies a file, adding a RECORDNO field as it does. The same can be achieved using EXTRA. This example shows how the first() and prev() function can be used:
if (first())
RECORDNO = 1
else
RECORDNO = prev(RECORDNO)+1
end
Alphanumeric strings
The following example illustrates the functions for alphanumeric fields. It creates a new alphanumeric field, s1, with a length of 40 characters and assigns to it the comma-separated list of field values. The trim() function is used to remove trailing spaces from the BHID field. The string() function is used to convert numeric fields to character strings (with two decimal places), and the join() function is used to concatenate the various bits into a single alphanumeric value. Note that all unwanted fields can be erased with a single use of the erase() procedure:
s1;a40 = join(trim(BHID), ", ",
string(FROM, 2), ", ",
string(TO, 2), ", ",
string(Cu, 2), ", ",
string(Zn, 2))
erase(BHID, FROM, TO, Cu, Zn)The resulting file could be OUTPUT, and then loaded into Microsoft Excel or another package as a comma-separated values (CSV) file. (This function is available directly in OUTPUT, but this method gives finer control on the number of decimals, etc.)
Error and Warning Messages
Message | Description | Solution |
>>> Fatal error: No room to create new field: A257 | Not enough room left in file to add field | Remove unnecessary fields. |
|
|
|
Related Topics | |
|