Project and File Precision
Your application can read two types of file: single precision and extended precision. Studio products can only generate extended precision data.
The two types differ in how numeric data is defined, specifically in the resolution of that data.
Single-precision files, generated by legacy applications, hold floating point values at a lower resolution (24-bit) than extended precision files (53-bit). In return, they place a lower overhead on your system and can be parsed more quickly by some Studio processes. With modern PC technology, the difference in system performance has become negligible, so Studio applications no longer create single-precision data.
The type of file that is generated depends on the setting assigned to the active project: extended-precision projects generate extended-precision files, and single-precision projects generate single-precision files.
The choice of which type of file to support within your own projects is up to you. Before you decide, read the information below to ensure that your project is as efficient and accurate as it needs to be.
Studio Precision History
Studio 2, itself a successor to even earlier Studio versions, introduced the concept of single and extended precision support.
This continued into Studio 3. Now discontinued, Studio 3 also supported the concept of mixed precision projects, in which default data precision settings were either single- or extended-precision (blocking the saving of extended-precision data files in single-precision projects). This arrangement was subsequently simplified in more modern versions of Studio (RM/EM/OP/UG) whereby a single project type was available, which supported the use of either single- or extended-precision files, or any mixture of both.
Product | Project Precisions Read | Project Precisions Created | Data Precisions Generated | Data Precisions Imported
Studio 2 | Single or Extended | Single or Extended | Single or Extended (choice at save point) | Single (in single- or extended-precision projects) and Extended (extended-precision projects only)
Studio 3 | Single or Extended | Single or Extended | Single or Extended (choice at save point) | Single (in single- or extended-precision projects) and Extended (extended-precision projects only)
Studio EM, RM, UG etc. | Single or Extended | Extended | | Single or Extended
Comparing for Equality in Computer Software
In computing, floating point describes a system for numerical representation in which a string of digits (or bits) represents a rational number.
The term floating point refers to the fact that the decimal point can "float": that is, it can be placed anywhere relative to the significant digits of the number.
Unfortunately, floating point maths is not exact. Simple values like 0.2 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results will differ depending on the details of your environment. If you do a calculation and then compare the results against some expected value it is highly unlikely that you will get exactly the result you intended.
In other words, if you do a calculation and then do this comparison:
if (result == expectedResult)
...it is unlikely the comparison will be true. If the comparison is true then it is probably unstable – tiny changes in the input values, compiler, or CPU may change the result and make the comparison false.
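As a small illustration of how the order of operations can change a result, the following Python sketch sums the same three values in two different orders. The variable names are illustrative only, and the exact digits shown assume standard IEEE-754 double precision arithmetic.

```python
# The same three values, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(repr(left_to_right))              # 0.6000000000000001
print(repr(right_to_left))              # 0.6
print(left_to_right == right_to_left)   # False
```

Neither answer is wrong as such; each is the nearest representable result of the steps that produced it.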
It is standard practice within software code to compare floating point numbers within a tolerance, and there are various accepted methodologies for determining this tolerance dynamically depending on the size of the numbers being compared. Such methods are also employed within Datamine’s software.
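A tolerance-based comparison might look like the following Python sketch. It is not Datamine's implementation: `nearly_equal` is a hypothetical helper, and the tolerance values are arbitrary assumptions chosen for illustration. The standard library function `math.isclose` applies the same idea.

```python
import math

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Hypothetical helper: the allowed difference scales with the larger value."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

result = 0.1 + 0.2
expected_result = 0.3

print(result == expected_result)               # False: exact comparison fails
print(nearly_equal(result, expected_result))   # True: equal within tolerance
print(math.isclose(result, expected_result, rel_tol=1e-9, abs_tol=1e-12))  # True
```

Because the tolerance is relative, the same helper can be used for very large and very small values without hand-tuning a fixed threshold for each case.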
Floating Point Representations and Mantissa
The mantissa (also coefficient or significand) is the part of a floating-point number that contains its significant digits. Depending on the interpretation of the exponent, the mantissa may be considered to be an integer or a fraction.
For example, the number 123.45 can be represented as a decimal floating-point number with integer significand 12345 and exponent −2. Its value is given by the arithmetic:
$12345 \times 10^{-2}$
This same value could also be represented in normalized form with the fractional (non-integer) coefficient 1.2345 and exponent +2:
$1.2345 \times 10^{+2}$
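The two representations of 123.45 can be checked with Python's standard `decimal` module, which stores a number as a tuple of digits plus a power-of-ten exponent. This is only an illustration of the significand/exponent idea in decimal; the variable names are assumptions.

```python
from decimal import Decimal

value = Decimal("123.45")

# as_tuple() exposes the integer significand digits and the exponent.
sign, digits, exponent = value.as_tuple()
print(digits, exponent)   # (1, 2, 3, 4, 5) -2

# scaleb(n) multiplies by 10**n, so both forms rebuild the same value.
integer_form = Decimal(12345).scaleb(-2)        # 12345 x 10^-2
normalized_form = Decimal("1.2345").scaleb(2)   # 1.2345 x 10^+2
print(integer_form == value, normalized_form == value)   # True True
```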
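The IEEE-754 standard lays down the following definition of floating point formats:
Format | Total bits | Sign bits | Exponent bits | Mantissa bits of precision
Single precision (float) | 32 | 1 | 8 | 24 (23 stored, 1 implied)
Double precision (double) | 64 | 1 | 11 | 53 (52 stored, 1 implied)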
All Datamine software conforms to these representations of floating point numbers as implemented by the compilers which it uses to create the processor executables.
Significant Digits
As can be seen from the table above, the mantissa of a single precision float has 24 bits of precision while a double has 53 bits. Converting from bits to decimal digits gives about 7 decimal digits for floats (used in single precision Studio) and about 16 for doubles (used in extended precision Studio).
For example, 24 bits leads to a decimal accuracy of $1/2^{24}$, or about $6 \times 10^{-8}$, while 53 bits leads to a decimal accuracy of $1/2^{53}$, or about $1 \times 10^{-16}$. While the binary precision is exact, the decimal ‘mileage’ will vary slightly depending on the actual number being considered.
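The conversion from bits to decimal digits can be reproduced with a few lines of Python. This is a minimal sketch: the digit count is simply the number of bits multiplied by log10(2), and `sys.float_info.mant_dig` reports the mantissa width of Python's own double precision floats.

```python
import math
import sys

for name, bits in (("single", 24), ("extended", 53)):
    resolution = 2.0 ** -bits                 # smallest relative step
    decimal_digits = bits * math.log10(2)     # bits converted to decimal digits
    print(f"{name}: 2^-{bits} = {resolution:.2e}, about {decimal_digits:.1f} decimal digits")

# single: 2^-24 = 5.96e-08, about 7.2 decimal digits
# extended: 2^-53 = 1.11e-16, about 16.0 decimal digits

print(sys.float_info.mant_dig)   # 53 on platforms using IEEE-754 doubles
```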
To put this into context consider that 16 significant figures is sufficient in metres to compare the circumference of the earth (4e7 metres) to a human hair (1e-4 metres), and still have a few significant figures left over.
Data and their Representations
Floating point format allows computer programs to use a very wide range of values while requiring only a small amount of memory to store each number. The standard used by Datamine for single precision values uses only 4 bytes to represent any positive or negative number with a magnitude between about $10^{-38}$ and $10^{+38}$, with a precision of approximately seven decimal digits. Double precision values allow for greater precision and a bigger range of values but use 8 bytes per number.
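The storage sizes and the single precision range limit can be observed with Python's standard `struct` module, which packs values into IEEE-754 byte layouts. This sketch illustrates the general formats only and says nothing about the layout of Datamine files themselves; the variable names are assumptions.

```python
import struct

# "<f" is a 4-byte IEEE-754 single, "<d" an 8-byte double.
print(struct.calcsize("<f"), struct.calcsize("<d"))   # 4 8

# Largest finite single precision value (bit pattern 0x7F7FFFFF, little-endian).
largest_single = struct.unpack("<f", bytes.fromhex("ffff7f7f"))[0]
print(largest_single)   # roughly 3.4e+38

# A value beyond that range cannot be packed into 4 bytes...
try:
    struct.pack("<f", 1e39)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)   # True

# ...but fits comfortably in 8 bytes.
packed_double = struct.pack("<d", 1e39)
print(len(packed_double))   # 8
```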
Floating point numbers can represent such a wide range of values because they separate the “exponent” from the “fraction”. In decimal, a number like “3.09” has an exponent (power of ten) of 1 and a fraction of 0.309. The exponent is used to put the decimal point in the right place. In other words:
$3.09 = 10^{1} \times 0.309$
In greater detail, we can write
$3.09 = 10^{1} \times [ (3 \times 10^{-1}) + (0 \times 10^{-2}) + (9 \times 10^{-3}) ]$
Computers work in binary and have to store floating point numbers in binary. To represent 3.09 in binary, the exponent (power of 2) is “2” and the binary fraction is “1100010…”. That is:
$3.09 = 2^{2} \times [ (1 \times 2^{-1}) + (1 \times 2^{-2}) + (0 \times 2^{-3}) + (0 \times 2^{-4}) + (0 \times 2^{-5}) + (1 \times 2^{-6}) + (0 \times 2^{-7}) + \dots ]$
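The binary digits of the fraction can be generated by repeated doubling, as in the Python sketch below. It uses exact rational arithmetic from the standard `fractions` module so that the digits themselves are not affected by floating point rounding. The helper name `binary_fraction_bits` is hypothetical.

```python
from fractions import Fraction

def binary_fraction_bits(value, count):
    """Return the first `count` binary digits after the point, for 0 <= value < 1."""
    bits = []
    for _ in range(count):
        value *= 2
        bit = int(value)      # 1 if the doubled value reached 1, otherwise 0
        bits.append(str(bit))
        value -= bit
    return "".join(bits)

# 3.09 = 2^2 x 0.7725, so the fraction to expand is 3.09 / 4.
fraction = Fraction(309, 100) / 2**2
bits = binary_fraction_bits(fraction, 24)
print(bits)   # 110001011100001010001111
```

The expansion does not terminate: 3.09 has no finite binary representation, so any fixed number of bits can only approximate it.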
Single precision floating point format stores 23 binary digits (“bits”) of the binary fraction which, together with one implied leading bit, gives 24 bits of precision. This limit can lead to some unexpected results. It turns out that a decimal number like 3.09 cannot be exactly represented in binary using this many bits. The nearest representable value is $2^{2} \times 0.110001011100001010001111_2$, which in decimal is 3.08999991416931.
This means that the number 3.09 is actually held in a single precision file as approximately 3.0899999. If that stored value is copied unchanged into a Datamine extended precision file (or another double precision data format), it appears as 3.08999991416931.
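The stored value can be reproduced by round-tripping 3.09 through a 4-byte single precision encoding. The Python sketch below uses the standard `struct` module; `to_single` is a hypothetical helper name, and the example shows IEEE-754 behaviour in general rather than Datamine's file handling.

```python
import struct

def to_single(value):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

stored = to_single(3.09)

print(f"{stored:.15g}")   # 3.08999991416931
print(f"{stored:.8g}")    # 3.0899999
print(stored == 3.09)     # False: the single precision value is not 3.09
```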
Which Data File Precision is Correct?
In short, either. The choice of precision will depend on the type of data you wish to represent, and your decision may also be based on the processing power you have available in order to manipulate data files.
Moving from Single- to Extended-precision Data Files
To convert single precision data files to extended precision, Datamine has written specific code that is additional to the float-to-double conversion provided by standard software compilers. This code is required to ensure that the data created in the extended precision file represents the value that was intended in the single precision file. Using the above example, this code is required to ensure that an intended value of 3.09 is displayed rather than 3.0899999. All data exported by Studio applications will be extended precision by default.
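Datamine's conversion code is not described here, so the following Python sketch shows only one commonly used approach to the same problem, as an assumption: find the shortest decimal number that maps back to the same single precision value, and store that decimal as a double. Both `to_single` and `widen_preserving_decimal` are hypothetical names.

```python
import struct

def to_single(value):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

def widen_preserving_decimal(single_value):
    """Hypothetical helper: recover the shortest decimal that yields the same single.

    Nine significant digits are always enough to identify a single precision
    value, so the loop tries 1 to 9 digits and stops at the first match.
    """
    for digits in range(1, 10):
        candidate = float(f"{single_value:.{digits}g}")
        try:
            if to_single(candidate) == single_value:
                return candidate
        except OverflowError:
            continue   # candidate rounded above the single precision range
    return single_value   # fall back to the plain widened value

stored = to_single(3.09)
naive = stored                              # plain float-to-double widening
recovered = widen_preserving_decimal(stored)

print(f"{naive:.15g}")   # 3.08999991416931
print(recovered)         # 3.09
```

The trade-off in this approach is that it assumes the original author intended the shortest decimal; a value that was genuinely meant to be 3.0899999 would also be widened to 3.09.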
The single-precision project format supported in both Studio 2 and Studio 3 is automatically enhanced when that project is loaded into a more modern Studio product (RM/EM/OP/UG). This conversion will include the transition from single- to extended-precision project format. Data files associated with that legacy project will be unaffected, as the new product will support either, or a mixture of both within the same project.