EXTRA Process
To access this process:
- Data ribbon >> Data Tools >> Expressions.
- Enter "EXTRA" into the Command Line and press <ENTER>.
-
Display the Find Command screen, locate EXTRA and click Run.
See this process in the Command Table.
Process Overview
EXTRA is a general purpose EXpression TRAnslator that allows you to transform the contents of files by modifying fields and creating new ones based on the values of existing fields. EXTRA also makes it easy to calculate new fields in any file, based on existing fields and values.
Tip: If you're looking for examples of EXTRA usage, see EXTRA examples.
Before EXTRA starts to process records from the input file, it must define which fields will appear in the output file. It is important to understand that all decisions about fields in the output file are made before any records are processed.
Normally, the fields that appear in the output are:
-
All fields that are in the input file
-
Fields created in the transforms you specify.
EXTRA Key Concepts
Transform
EXTRA applies a “transform” to the contents of a file. It does this through one or more “statements”.
Statements
A transform consists of multiple statements. Statements are separated by spaces or new-lines. There are three kinds of statements
Assignment statements
Assignment statements evaluate an expression and assign the result to a field.
“IF” statements
“If” statements use the value of one or more expressions to execute other statements conditionally.
Procedure statements
Procedure statements do something but do not return a value. Some procedures (”erase”, “keep”, “saveonly”) require a list of field names (not expressions). The other procedures (”delete”, “exit”) don’t require any other information.
Expressions
Expressions are used in assignment and “If” statements. There are 4 types.
Simple expressions
A simple expression is a number, text string or field reference.
Arithmetic expressions
The value of an arithmetic expression is the result of applying the arithmetic operator (+, -, * or /) to the expressions on the left and right of the operator.
Relational expressions
The value of a relational expression, like “X < 2”, is 1.0 if the expression is “true” and 0.0 if it is “false”. They may be used anywhere an arithmetic expression may be used.
Function expressions
A function expression takes zero or more expressions in brackets to calculate a new value. For example, the value of “min(X, 2)” is the smaller of the values of the two expressions “X” and “2”.
The Expression Translator Screen
As a process, EXTRA displays a familiar "files, fields and parameters" type screen to allow an INput and OUTput file to be specified (you can also define retrieval criteria to filter the records to be processed. EXTRA is unusual as a process in that it triggers the display of another screen, the Expression Translator. This screen is used to determine the actual expression to be applied to the input data records that aren't filtered out by retrieval criteria.
Using the Expression Translator, you specify how you want the fields transformed in a simple and natural manner, for example, to create a new field "C" which calculates the average of two fields "A" and "B", you can create the following expression:
C = (A + B)/2
Clicking Execute in the Expression translator feeds the expression back into the EXTRA process, where the new field is created, and a value for each record is calculated.
EXTRA will continue to accept and compile transformations until you tell it to start processing the file. If you are using EXTRA interactively then processing is started by clicking the Execute button in the Expression Translator.
If you
are running via a script or a macro then you must insert the keyword go
after
all transformation expressions to tell EXTRA to start processing. If
you are recording to a script or a macro then the keyword go
is inserted automatically for you.
See Expression Translator (EXTRA).
Expressions and Fields
Expressions can involve constants, field values or function values. These may be either numeric or alphanumeric. If an expression results in a numeric value, it must be assigned to a numeric field. Similarly, alphanumeric expressions must be assigned to alphanumeric fields.
An expression may be assigned to a field that already exists, or a new field may be created. New numeric fields may be created simply by assigning to them. To reinforce the fact that a new numeric field is being created, you may add ";n" to the end of the field name. For example
C;n = (A + B)/2
will create a new numeric field called "C" and assign the
result of the expression to it. If you are creating a new alphanumeric
field, you must append a field specification of the form a[len]
to the field name, where [len]
is the maximum
number of characters that the field can contain. For example
BHID;a12 = join("BH", string(N, 0))
creates a new alphanumeric field BHID with a maximum of 12 characters ("a12").
Note: In EXTRA alphanumeric fields are limited to a maximum of 80 characters.
EXTRA can also be used to create implicit fields. See EXTRA examples.
Missing Fields
If EXTRA tries to use a field that does not exist, it will report an error and stop processing the file. This can be an inconvenience if you want to erase a field and are not sure if it exists.
There are now two ways to deal with this: either using the @FLDFAIL parameter or using the type() function's return value. See EXTRA examples.
Arithmetic Operations
EXTRA allows you to use a set of arithmetic and relational operations on values. There is an inferred precedence of operations:
-
Negation (e.g. A = -B)
-
Multiplication and division
-
Addition and subtraction
Therefore, an expression like "A = B * C + D * E
" first
multiplies the values of fields B
and C
, then multiplies
D
and E
together, and adds the two results. You can use
brackets where necessary to change the order in which transformations
are performed, for example "A = B * (C + D) * E
". There
is no limit to the complexity of the expression that you can use.
Note: extend an expression over as many lines as is necessary. You may insert a line break anywhere, except within a word.
Character Operations
EXTRA offers a range of functions to manipulate alphanumeric (character) fields. The value assigned to a character field must be a character constant, or the result of a character expression, such as
join("RDH", substr(BHID,4,3))
A character constant is a string of characters enclosed in double quotes (e.g. "RDH034"). EXTRA does not recognise single quotes as character constant delimeters.
Missing, dash or minus?
The character "-" has three meanings within your application and EXTRA. In your application, it is used as shorthand for "-infinity", which is the value used to denote missing values of numeric fields. It is a legal character in a field name (such as "NUM-FLDS"), and in EXTRA it is used as a subtraction operator.
EXTRA resolves these ambiguities as follows:
-
Normally, "-" is an arithmetic subtraction operator.
-
If used in a field name, the name must be enclosed in single quotes, e.g.
'NUM-FLDS' = 2
-
To check for a missing value for a numeric field, compare with the value returned by the absent() function (see below).
To create an alphanumeric field, the field type descriptor is required. If you wish to create field of this type containing a minus character, you will need to also include the type descriptor within the quotes, for example:
'A-1;a4' = "abcd"
is correct, whereas;
'A-1';a4 = "abcd"
is incorrect, and will generate a syntax error.
Reserved characters
As well as - (minus), the following are reserved characters: + / *
= < > ( )
. If a field name includes one of
these characters it must be enclosed in single quotes. For
example: 'g/t'
Field or function?
Sometimes, you might want to use or create a field with the same name
as one of the built-in functions. For example, your input file
might contain a field called "max
", and you want to update
this by comparing it with another grade field. You need to
help EXTRA to tell when max
is a field name, and when it is a function
name. Do this by placing the field name in single quotes,
for example:
'max' = max('max', grade)
Relational Operations
In addition to arithmetic operators, EXTRA also provides relational operators. These are normally used in evaluating conditions ("if" tests), as described below, but they may be used elsewhere. The relational operators are:
== |
equal to |
!= |
not equal to |
<> |
not equal to |
< |
less than |
<= |
less than or equal to |
> |
greater than |
>= |
greater than or equal to |
If a relationship is "true", then the relational operator evaluates to 1.0, otherwise if a relationship is "false", the operator evaluates to 0.0. For example, the following creates a new field called CHECK, and assigns it the value 1.0 if the value of field FROM is greater than or equal to the value of the field TO:
CHECK = (FROM >= TO)
(It is, in fact, unnecessary to use brackets here, because relational operations have a lower precedence than arithmetic or assignment operations. They appear here to clarify what happens).
Combining relational expressions
You can use the keywords "and
"
and "or
" to combine relational (and other) expressions. For
example, you can say
CHECK = (FROM >= TO and FROM >= prev(FROM))
This assigns 1.0
to the field CHECK
if FROM
is greater than or equal to TO
and FROM
is greater than or equal to the same field on the previous
record.
Conditional Execution
EXTRA provides full support for conditional execution of transformations.
This is achieved through the use of the if
, else
, elseif
and end
keywords. An example of where this might be used is as
follows:
ERRORS = 0
if (FROM < TO or FROM < prev(FROM))
ERRORS = ERRORS + 1
end
This fragment begins by initializing a field called ERRORS
to
zero. (If the field does not exist, it is created automatically). EXTRA
then compares the FROM
and TO
fields and the FROM
field with the value on the previous record. If this is "true",
EXTRA executes transformations following the if
, up to a matching
end
. There can, of course, be any number of transformations after
the if
test.
The transformations could be extended, for example
ERRORS = 0
if (FROM < TO or FROM < prev(FROM))
ERRORS = ERRORS + 1
elseif (GRADE > 20)
ERRORS = ERRORS + 1
end
Note that for each if
there must be
a matching end
. There may be as many elseif
's as you wish,
and one else
if necessary.
Tip: IF-ELSEIF-END statements should be used where multiple conditions need to be evaluated. This is because EXTRA can detect a matching condition in a single 'run', whilst having multiple END-IF statements instead will mean EXTRA must evaluate every condition independently, which can be much slower if there are many conditions to check. See EXTRA examples.
Note: elseif
is one word.
EXTRA will recognise else if
as the beginning of an else
followed by a new (nested) if
. The new if
must be completed
by it's own matching end
.
If...end
constructions may be embedded within other if...end
's.
There is no limit to the depth to which you may sink while doing this!
Functions
EXTRA provides a long list of functions that can be used in transforming fields. A function is identified by its name, followed by a list of zero or more arguments enclosed in brackets. The brackets must be specified, even if the number of values passed to the function is zero. A function always returns a value. This value may be assigned to a field, or it may be used as a term in a transformation. Examples of EXTRA statements using functions are:
sx = sqrt(X)
c = sqrt(a*a + b*b - 4*a*c*cos(ac))
BIGEST = max(G1, (G2+G3)/2, G4, G5)
Numeric Functions
EXTRA provides the following functions which all take numeric values as arguments and return a numeric value. For trigonometrically functions all angles are in degrees.
Function name |
Arguments |
Returns |
abs |
X |
The absolute (positive) value of X |
absent |
none |
The Datamine "absent data" value |
acos |
X |
The angle whose cosine is X |
asin |
X |
The angle whose sine is X |
atan |
X |
The angle whose tangent is X |
cos |
X |
The cosine of X |
exp |
X |
The constant "e" raised to the power of X |
int |
X |
The integer part of X |
log |
X |
The base-10 logarithm of X |
loge, logn |
X |
The base-"e" logarithm of X |
max |
X1, X2, X3, ... |
The maximum value of X1, X2, X3, ... |
maxia |
X1, X2, X3, ... |
The maximum value of X1, X2, X3, ... ignoring absent arguments. |
min |
X1, X2, X3, ... |
The minimum value of X1, X2, X3, ... ignoring absent arguments. For example:
|
minia |
X1, X2, X3, ... |
The minimum value of X1, X2, X3, ... |
mod, modc |
X, Y |
The remainder when X is divided by Y |
phi |
X |
The inverse normal distribution function |
pow, rais |
X, Y |
The value of X raised to the power of Y |
round |
X, Y |
Round off the value X to the nearest multiple of Y. Equivalent to Y*int(X/Y + 0.5) |
sin |
X |
The sine of X |
special |
X |
One of your application's "special" values (see below) |
sqrt |
X |
The square root of X |
tan |
X |
The tangent of X |
Scientific Notation
EXTRA will interpret numbers like "1.23e-4" as being "1.23 times 10 to the power of -4" = 0.000123. As such "e" can be used to define an exponent of a numeric value.
"Special" Values
The special(X) function returns a "special" numeric value, depending on the argument given. The argument must evaluate to a character string.
Argument |
Value returned |
"TOP" or "+" |
The Datamine value for +infinity, sometimes referred to as "Top" |
"BOTTOM" or "-" |
The Datamine value for -infinity, sometimes referred to as "Bottom". This is the same value as is used for missing (absent) values for numeric fields. |
"TR" or "DL" |
The Datamine "trace" value (1.0e-30) |
"PI" |
The numeric constant "pi" (3.14159...) |
"E" |
The numeric constant "e" (2.71828...) |
Record Selection Functions
The following functions may be used to select field values from adjacent records. The functions return a numeric or alphanumeric value, depending on the type of the field supplied as an argument. All values passed to these functions must be the names of fields.
Function name |
Arguments |
Returns |
prev |
X |
The value of field X from the previous record |
next |
X |
The value of field X from the next record |
If processing the first record, prev(X)
returns a "missing
value" (i.e. the same value as absent()
). Similarly, when
processing the last record (i.e. the last record that matches current
retrieval criteria), next(X)
returns a missing value.
Do not use the prev function if the previous record has been deleted using delete procedure as the result will be undetermined.
The following functions may be used to test whether a record is the first or last in a file.
Function name |
Arguments |
Returns |
first |
none |
If processing the first record, returns 1.0, otherwise returns 0.0 |
last |
none |
If processing the last record, returns 1.0, otherwise returns 0.0 |
In all cases with these functions, "first" and "last" mean "the first (or last) record that matches current retrieval conditions".
The first()
and last()
functions are normally used within
conditional tests, e.g.
if (first())
sumX = X
else
sumX = sumX + prev(X)
end
Default Field Values
The default()
function returns the default value of the field
specified. The value passed to this function must be the name of a field
in the current file, otherwise the error is reported and EXTRA terminates.
Function name |
Arguments |
Returns |
default |
X |
The default value of field X. The value returned will be either numeric or alphanumeric, depending on the type of the field passed as the argument. |
Note: default()
is a function, and it returns the
default value of a field; you cannot set the default value of a
field using EXTRA. In other words, it is incorrect to specify default(AU) = 10.0#
Using @APPROX
Rounding errors are sometimes introduced by numerical calculations or when legacy single precision files are converted to extended precision. Setting @APPROX=1 specifies that EXTRA will allow for rounding errors when making comparisons between numeric values.
Functions for Alphanumeric Fields
The following functions are provided for the manipulation of alphanumeric fields:
Function name |
Arguments |
Returns |
decode |
A |
The numeric value of the first set of numeric characters embedded in A |
field |
F1 |
The contents of a field. The field name must be supplied as a string - for example, field ("AU"). This allows field names with otherwise prohibited characters, such as '-', to be referenced. This function can only be used for getting the value of a field - not for setting it. |
join |
A1, A2, A3, ... |
The concatenation of the alphanumeric values A1, A2, A3, ... |
lcase |
A |
Returns the string given as an argument, with all upper case letters converted to lower case |
len |
A |
The length (up to the first trailing space) of A |
match |
A, P |
The position where a regular expression P matches characters in A |
string |
N, M |
The numeric value N, converted to a string with M decimals. If M is omitted, the function uses a general-purpose numeric format. |
substr |
A, N, M |
Returns a sub-string, consisting of M characters taken from A, starting with character N. If M is omitted, all characters from N to the end are returned. |
trim |
A |
Returns the value of A with trailing spaces removed |
ucase |
A |
Returns the string given as an argument, with all lower case letters converted to upper case |
Notes for Alphanumeric functions
- The
match()
function uses character patterns known as "regular expressions". A regular expression consists of a sequence of literal characters to be matched, or one or more of the following elements:
% |
Matches the beginning of the field value |
$ |
Matches the end of the field value |
* |
Zero or more occurrences of the preceding pattern element |
? |
Any single character |
[...] |
Any one of the characters enclosed in the square brackets. The shorthand notation "a-z" means any lowercase letter. |
[^...] |
Any character except one of these |
The special meaning of a character (e.g. "*") is lost if the character is preceded by "\", hence to match a literal "*", use "\*".
-
Be careful when you use the
trim()
function to create a new character field. Character fields in a database file are always padded out to the declared length with spaces. For example, consider the command:STRING1;A12 = trim(" ABC ")
In this case, the
trim()
function removes both the leading and trailing blanks, but assigning the result to theSTRING1
field will add 9 trailing spaces to the stored value ("ABC "). The trim command is designed to be used as an argument to another command. For example:STRING2;A12 = join(trim(STRING1), "XYZ")
This will create
STRING2
by adding the characters "XYZ" after the last non blank character inSTRING1
.
Procedures
A "procedure", unlike a function, does not return any value. It just carries out some action. EXTRA provides three procedures:
Procedure |
Arguments |
Description |
erase |
F1, F2, F3, ... |
Erases the fields F1, F2, F3, ... from the output file |
saveonly |
F1, F2, F3, ... |
Erases all fields except F1, F2, F3, ... from the output file. If used more than once, only the values in the last saveonly are honoured. |
delete |
none |
Deletes the current record |
Adding Comments
EXTRA statements can be supported by comments, for example:
!START M1
!EXTRA &IN(MOD2), &OUT(t)
# Comment 1
NEWFLD = 1
# Comment 2
NEWFLD2=9999
# Comment 3
GO
!END
Comments are not supported within IF loops. The following statement, for example, will cause EXTRA to fail:
VAL2 = PVALUE*1.5
if (VAL2 == 8)
# comment is here
VAL5 = 5
END
Input Files
Name | Description | I/O Status | Required | Type |
IN | Input file. | Input | Yes | Undefined |
Output Files
Name | I/O Status | Required | Type | Description |
OUT | Output | Yes | Undefined | Output file. |
Parameters
Name | Description | Required | Default | Range | Values |
APPROX | Allow for rounding errors when making comparisons 0 = Exact comparisons 1 = Approximate comparisons | No | 0 | 0, 1 | 0, 1 |
SEED | Seed number for pseudo-random number sequence | No | Undefined | Undefined | Undefined |
FLDFAIL | Controls if missing fields cause an error or not. 0 = Missing fields do not cause an error. 1 = Missing fields cause an error. | No | 1 | 0, 1 | 0, 1 |
Display output to command processor 0 = do nothing 1 = show all code in the output window during processing. | No | 0 | 0, 1 | 0, 1 |
EXTRA Examples
See EXTRA examples.
Error and Warning Messages
Message | Description | Solution |
>>> Fatal error: No room to create new field: A257 | Not enough room left in file to add field | Remove unnecessary fields or consider activating .dmx file support. |
Related topics and activities