Reading Formatted Axis Values

The process of converting a formatted coordinate value for a FrameFrame axis, such as might be produced by astFormatastFormat (§7.6), back into a numerical (double) value ready for processing is performed by astUnformatastUnformat. However, although this process is essentially the inverse of that performed by astFormat, there are a number of additional difficulties that must be addressed in practice.

The main use for astUnformat is in reading formatted coordinate values which have been entered by the user of a program, or read from a file. As such, we can rarely assume that the values are neatly formatted in the way that astFormat would produce. Instead, it is usually desirable to allow considerable flexibility in the form of input that can be accommodated, so as to permit “free-format” data input by the user. In addition, we may need to extract individual coordinate values embedded in other textual data.

Underlying these requirements is the root difficulty that the textual format used to represent a coordinate value will depend on the class of Frame we are considering. For example, for a basic Frame, astUnformat may have to read a value like “1.25e-6”, whereas for a more specialised Frame representing celestial coordinates it may have to handle a value like “-07d 49m 13s”. Of course, the format might also depend on which axis is being considered.

Ideally, we would like to write software that can handle any kind of Frame. However, this makes it a little more difficult to analyse textual input data to extract individual coordinate values, since we cannot make assumptions about how the values are formatted. It would not be safe, for example, simply to assume that the values being read are separated by white space. This is not just because they might be separated by some other character, but also because celestial coordinate values might themselves contain spaces. In fact, to be completely safe, we cannot make any assumptions about how a formatted coordinate value is separated from the surrounding text, except that it should be separated in some way which is not ambiguous.

This is the very basic assumption upon which astUnformat works. It is invoked as follows:


\begin{terminalv}
int n;
\par
...
\par
n = astUnformat( frame, iaxis, string, &value );
\end{terminalv}

It is supplied with a Frame pointer (“frame”), the number of an axis (“iaxis”) and a character string to be read (“string”). If it succeeds in reading a value, astUnformat returns the resulting coordinate to the address supplied via the final argument (“&value”). The returned function value indicates how many characters were read from the string in order to obtain this result.

The string is read as follows:

  1. Any white space at the start is skipped over.

  2. Further characters are considered, one at a time, until the next character no longer matches any of the acceptable forms of input (given the characters that precede it). The longest sequence of characters which matches is then considered “read”.

  3. If a suitable sequence of characters was read successfully, it is converted into a coordinate value which is returned. Any white space following this sequence is then skipped over and the total number of characters consumed is returned as the function value.

  4. If the sequence of characters read is empty, or insufficient to define a coordinate value, then the string does not contain a value to read. In this case, the read is aborted and astUnformat returns a function value of zero and no coordinate value. However, it returns without error.

Note that failing to read a coordinate value does not constitute an error, at least so far as astUnformat is concerned. However, an error can occur if the sequence of characters read appears to have the correct form but cannot be converted into a valid coordinate value. Typically, this will be because it violates some constraint, such as a limit on the value of one of its fields. The resulting error message will give details.

For any given Frame axis, astUnformat does not necessarily always use the same algorithm for converting the sequence of characters it reads into a coordinate value. This is because some forms of input (particularly free-format input) can be ambiguous and might be interpreted in several ways depending on the context. For example, the celestial longitude “12:34:56.7” could represent an angle in degrees or a right ascension in hours. To decide which to use, astUnformat may examine the Frame's attributes and, in particular, the appropriate Format(axis)Format(axis) string which is used by astFormat when formatting coordinate values (§7.6). This is done in order that astFormat and astUnformat should complement each other—so that formatting a value and then un-formatting it will yield the original value, subject to any rounding error.

To give a simple (but crucially incomplete!) example, consider reading a value for the axis of a basic Frame, as follows:


\begin{terminalv}
n = astUnformat( frame, iaxis, '' 1.5e6 -99.0'', &value );
\end{terminalv}

astUnformat will skip over the initial space in the string supplied and then examine each successive character. It will accept the sequence “1.5e6” as input, but reject the space which follows because it does not form part of the format of a floating point number. It will then convert the characters “1.5e6” into a coordinate value and skip over the three spaces which follow them. The returned function value will therefore be 9, equal to the total number of characters consumed. This result may be used to address the string during a subsequent read, so as to commence reading at the start of “-99.0”.

Most importantly, however, note that if the user of a program mistakenly enters the string “ 1.5r6...” instead of “ 1.5e6...”, a coordinate value of 1.5 and a function result of 4 will be returned, because the “r” would prematurely terminate the attempt to read the value. Because this sort of mistake does not automatically result in an error but can produce incorrect results, it is vital to check the returned function value to ensure that the expected number of characters have been read.[*] For example, if the string is expected to contain exactly one value, and nothing else, then the following would suffice:


\begin{terminalv}
n = astUnformat( frame, iaxis, string, &value );
if ( astOK ) ...
... {
<error in input data>
} else {
<value read correctly>
}
}
\end{terminalv}

If astUnformat does not detect an error itself, we check that it has read to the end-of-string and consumed at least one character (which traps the case of a zero-length input string). If this reveals an error, the value of “n” indicates where it occurred.

Another common requirement is to obtain a position by reading a list of coordinates from a string which contains one value for each axis of a Frame. We assume that the values are separated in some unambiguous manner, perhaps using white space and/or some unspecified single-character separator. The choice of separator is up to the data supplier, who must choose it so as not to conflict with the format of the coordinate values, but our software does not need to know what it is. The following is a template algorithm for reading data in this form:


\begin{terminalv}
const char *s;
double values[ 10 ];
\par
...
\par
/* Initialis...
...{
<error in input data>
} else {
<values read correctly>
}
}
\end{terminalv}

In this case, “s” will point to the location of any input error.

Note that this algorithm is insensitive to the precise format of the data and will therefore work with any class of Frame and any reasonably unambiguous input data. For example, here is a range of suitable input data for a 3-dimensional basic Frame:


\begin{terminalv}
1 2.5 3
3.1,3.2,3.3
1.5, 2.6, -9.9e2
-1.1+0.4-1.8
.1/.2/.3
44.0 ; 55.1 -14
\end{terminalv}