SPP Reference Manual
- Authors:
Douglas Tody, Kitt Peak National Observatory
- Date:
January 1983 (revised September 1983)
- Abstract:
The IRAF subset preprocessor language (SPP) implements a subset of the proposed full IRAF scientific programming language. This paper defines the language and gives several examples of procedures written in the language. The differences between the subset language and the full language are summarized. IRAF tasks or programs are discussed, and a working example of an IRAF program is shown. The XC compiler for the SPP language on the UNIX software development system is introduced.
Introduction
The IRAF subset preprocessor language (SPP) implements a subset of the proposed IRAF preprocessor language, described in the paper “The Role of the Preprocessor”. The subset does not include pointers, structures, and dynamic and virtual arrays. Subset preprocessor programs should be written with eventual conversion to the full preprocessor language in mind. In general, this conversion will not be difficult, provided one generates simple and straightforward code.
One of the best ways to learn to program in a new language is to read the source listings of existing programs. Many examples of procedures, programs, and packages written in the SPP language can be found in the IRAF system source directories. We shall only attempt to summarize the basic features of the language here.
Getting Started
The best way to get started is to build and run a simple program, before attempting to learn all the details of the language. Here is our version of the “hello world” program from the C book:
# Simple program to print "hello, world" on the standard
# output.
task hello # CL callable task
procedure hello() # common procedure
begin
    call printf ("hello, world\n")
end
On the UNIX system, this program would be placed in a file with the extension .x and compiled with the command XC (X Compiler) as follows:
xc hello.x
The XC compiler will translate the program into Fortran, call the Fortran compiler to generate the object file (hello.o), and call the loader to link the object file with modules from the IRAF system libraries to produce the executable program hello. XC may be used to compile C and Fortran programs as well as X programs, and in general behaves very much like CC or F77 (note that the -o flag is not required; by default the name of the output module is the base name of the first file name on the command line). The -F flag may be used to inspect the Fortran generated by the preprocessor; this is sometimes necessary to interpret error messages from the F77 compiler.
Fundamentals of the Language
The SPP language is based on the Ratfor language. The lexical form, operators, and control flow constructs are identical to those provided by Ratfor. The major differences are the data types, the form of a procedure, the addition of inline strings and character constants, the use of square brackets for arrays, and of course the TASK statement. The i/o facilities provided are quite different.
Lexical Form
A subset preprocessor program consists of a sequence of lines of text. The length of a line is arbitrary, but the SPP is guaranteed to be able to handle only lines of up to 160 characters in length. The end of each line is marked by the NEWLINE character. Both upper and lower case characters are permitted. Case is significant.
WHITESPACE is defined as one or more tabs or spaces. Newline normally marks the end of a statement, and is not considered to be whitespace. Whitespace always delimits tokens, i.e., keywords and operators will not be recognized as such if they contain embedded whitespace.
Continuation
Statements may span several lines. A line which ends with an operator (excluding /) or punctuation character (comma or semicolon) is automatically understood to be continued on the following line.
Integer Constants
A decimal integer constant is a sequence of one or more of the digits 0-9. An octal constant is a sequence of one or more of the digits 0-7, followed by the letter b or B. A hexadecimal integer constant is one of the digits 0-9, followed by zero or more of the digits 0-9, the letters a-f, or the letters A-F, followed by the letter x or X. The following notation more concisely summarizes these definitions:
decimal constant        [0-9]+
octal constant          [0-7]+[bB]
hexadecimal constant    [0-9][0-9a-fA-F]*[xX]
identifier              [a-zA-Z][a-zA-Z0-9_]*
In the notation used above, + means 1 or more, * means zero or more, - implies a range, and | means “or”. Brackets ([]) define a class of characters. Thus, [0-9]+ reads “one or more of the characters 0 through 9”.
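The integer constant rules above can be checked mechanically. The following is a minimal Python sketch, not part of SPP or the IRAF system: it transcribes the three definitions into regular expressions and converts a token to its value. The function name parse_integer_constant is hypothetical.

```python
import re

# Patterns transcribed from the prose definitions of integer constants.
DECIMAL = re.compile(r"[0-9]+")
OCTAL = re.compile(r"[0-7]+[bB]")
HEX = re.compile(r"[0-9][0-9a-fA-F]*[xX]")


def parse_integer_constant(token):
    """Return the integer value of a decimal, octal, or hex constant.

    Hypothetical helper for illustration; raises ValueError when the
    token matches none of the three forms.
    """
    if HEX.fullmatch(token):
        return int(token[:-1], 16)   # drop the trailing x or X
    if OCTAL.fullmatch(token):
        return int(token[:-1], 8)    # drop the trailing b or B
    if DECIMAL.fullmatch(token):
        return int(token, 10)
    raise ValueError(f"not an integer constant: {token!r}")
```

Because a hexadecimal constant must begin with a digit, a token such as FFx is rejected (it would otherwise look like an identifier), while 0FFx is accepted.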
Floating Point Constants
A floating point constant (type REAL or DOUBLE) consists of a decimal integer, followed by a decimal point, followed by a decimal fraction, followed by one of (e|E|d|D), followed by a decimal integer, which may be negative. Either the decimal integer or the decimal fraction part must be present. The number must contain either the decimal point or the exponent (or both). Embedded whitespace is not permitted.
The following are all legal floating point numbers:
.01
100.
100.01
1E5
1E-5
1.00D5
1.0D0
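As a cross-check of the rule stated above, here is a hedged Python sketch of a recognizer for floating point constants. It is an illustration only, not the preprocessor's own lexer. It assumes that the exponent may carry an optional minus sign but no plus sign, since the text mentions only negative exponents. The names is_float_constant and float_value are hypothetical.

```python
import re

# Either a form containing a decimal point (with an optional exponent),
# or a digits-only mantissa that must then carry an exponent.
FLOAT_CONSTANT = re.compile(
    r"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eEdD]-?[0-9]+)?"
    r"|[0-9]+[eEdD]-?[0-9]+"
)


def is_float_constant(token):
    """True if the whole token is a floating point constant as described."""
    return FLOAT_CONSTANT.fullmatch(token) is not None


def float_value(token):
    """Convert a constant to a Python float (D exponents are read as E)."""
    if not is_float_constant(token):
        raise ValueError(f"not a floating point constant: {token!r}")
    return float(token.replace("D", "E").replace("d", "e"))
```

All seven constants listed above are accepted by this pattern, while a bare integer such as 100 is not, because it has neither a decimal point nor an exponent.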
A floating constant may also be given in sexagesimal format, i.e., in hours and minutes, or in hours, minutes, and seconds. The number of colon separated fields must be two or three, and the number of decimal digits in the second field and in the integer part of the third field is limited to exactly two. The decimal point is optional:
00:01 = 0.017
00:00:01 = 0.00028
01:00:00 = 1.0
01:00:00.00 = 1.0
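The conversion implied by these examples can be sketched in Python as follows. This is an illustrative model only, under two assumptions not stated in the text: the first field is one or more digits, and no sign is handled. The name sexagesimal_to_float is hypothetical.

```python
import re

# Two or three colon separated fields; the second field and the integer
# part of the third field are exactly two digits, as the text requires.
SEXAGESIMAL = re.compile(
    r"([0-9]+):([0-9]{2})(?::([0-9]{2}(?:\.[0-9]*)?))?"
)


def sexagesimal_to_float(token):
    """Convert hh:mm or hh:mm:ss[.fff] to a decimal value."""
    match = SEXAGESIMAL.fullmatch(token)
    if match is None:
        raise ValueError(f"not a sexagesimal constant: {token!r}")
    first = int(match.group(1))
    minutes = int(match.group(2))
    seconds = float(match.group(3)) if match.group(3) else 0.0
    return first + minutes / 60.0 + seconds / 3600.0
```

With this model, 00:01 evaluates to 1/60 (about 0.017) and 00:00:01 to 1/3600 (about 0.00028), in agreement with the values quoted above.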
Character Constants
A character constant consists of from 1 to 4 characters delimited at front and rear by the single quote (', as opposed to the double quotes used to delimit string constants). A character constant is numerically equivalent to the corresponding decimal integer, and may be used wherever an integer constant would be used.
'a'       integer equivalent of the letter ‘a’
'\n'      integer equiv. of the newline character
'\007'    the octal integer 07B
'\\'      the integer equiv. of the character ‘\’
The backslash character (\) is used to form “escape sequences”. The following escape sequences are defined:
\b    backspace
\f    formfeed
\n    newline
\r    carriage return
\t    tab
String Constants
A string constant is a sequence of characters enclosed in double quotes. The double quote itself may be included in the string by escaping it with backslash. All of the escape sequences given above are recognized. The backslash character itself must be escaped to be included in the string. A string constant may not span several lines of text.
Identifiers
An identifier is an upper or lower case letter, followed by zero or more upper or lower case letters, digits, or the underscore character. Identifiers may be as long as desired, but only the first five characters and the last character are significant.
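One practical consequence of this rule is that two long identifiers can collide. The following Python sketch models only what the text states (the first five characters and the last character are significant); the real preprocessor may apply further rules that are not described here. The helper names significant_key and find_collisions are hypothetical.

```python
def significant_key(identifier):
    """Reduce an identifier to its significant characters.

    Identifiers of six characters or fewer are fully significant;
    longer ones keep the first five characters and the last one.
    """
    if len(identifier) <= 6:
        return identifier
    return identifier[:5] + identifier[-1]


def find_collisions(names):
    """Return pairs of distinct names that share a significant key."""
    first_seen = {}
    collisions = []
    for name in names:
        key = significant_key(name)
        if key in first_seen and first_seen[key] != name:
            collisions.append((first_seen[key], name))
        else:
            first_seen.setdefault(key, name)
    return collisions
```

For example, under this model read_header and read_helper both reduce to read_r and would be indistinguishable, whereas read_data reduces to read_a and remains distinct.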
The following identifiers are reserved (though some are not actually used at present):
auto        do          include     short
begin       double      int         sizeof
bool        else        long        static
break       end         map         struct
call        entry       next        switch
case        extern      plot        task
char        false       printf      true
clgetpar    for         procedure   union
clputpar    getpix      putpix      unmap
common      goto        real        until
complex     if          repeat      virtual
data        iferr       return      vstruct
define      imstruct    scan        while
Data Types
The subset preprocessor language supports a fairly wide range of data types. The actual mapping of an SPP data type into a Fortran data type depends on what the target compiler has to offer.
bool       boolean (Fortran LOGICAL)
char       character (8 bit signed)
short      short integer
int        integer (Fortran INTEGER)
long       long integer
real       single precision floating (Fortran REAL)
double     double precision floating (DOUBLE PRECISION)
complex    single precision complex (Fortran COMPLEX)
The only permissible values for a boolean variable are true and false. The CHAR data type belongs to the family of integer data types, i.e., a CHAR variable or array behaves like an integer variable or array. The value of a CHAR variable may range from -127 to 127. CHAR and SHORT are signed integer data types (i.e., they may take on negative values).
In addition to the seven primitive data types, the SPP language provides the abstract type POINTER. The SPP language makes no distinction between pointers to different types of objects, unlike more strongly typed languages such as C (and the full preprocessor). The SPP implementation of the POINTER data type is a stopgap measure.
Declarations
The SPP language implements named procedures with formal parameters and local variables. Global common and dynamic memory allocation may be used to share data amongst procedures. A procedure may return a value, but may not return an array or string. Declarations are included for procedures, variables, arrays, strings, typed procedures, external procedures, and global common areas. Storage for local and global variables and arrays may be assumed to be statically allocated.
Variable, Array, and Function Declarations
Although the language does not require that parameters be declared before local variables and functions, it is a good practice to follow. The syntax of a type declaration is the same for parameters, variables, and procedures:
type_spec object [, object [,... ]]
Here, type_spec may be any of the seven primitive data types, a derived type such as POINTER, or EXTERN. A list of one or more data objects follows. An object may be a variable, array, or procedure. The declaration for each type of object (variable, array, or procedure) has a unique syntax, as follows:
variable identifier
array identifier "[" dimension_list "]"
procedure identifier "()"
Procedures may be passed to other procedures as formal parameters. If a procedure is to be passed to a called procedure as a formal parameter, it must be declared in the calling procedure as an object of type EXTERN.
Array Declarations
Arrays are one-indexed. The storage order is fixed in such a way that when the elements of the array are accessed in storage order, the leftmost subscript varies fastest. Arrays of up to three dimensions are permitted.
The size of each dimension of an array may be specified by any compile time constant expression, or by an integer parameter or parameters, if the array is a formal parameter to the procedure. If the array is declared as a formal parameter, and the size of the highest dimension is unknown, the size of that dimension should be given as ARB (for arbitrary):
real data[ARB] # length of array is unknown
short raster[NPIX*2,128] # 2-dim array
The declared dimensionality of an array passed as a formal parameter to a procedure may be less than or equal to the actual dimensionality of the array.
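The storage order described in this section (one-indexed subscripts, leftmost subscript varying fastest) can be modeled with a short Python sketch. It is an illustration of the addressing arithmetic only, not code from the IRAF system, and the name storage_offset is hypothetical.

```python
def storage_offset(subscripts, dims):
    """Zero-based storage offset of a one-indexed array element.

    The leftmost subscript varies fastest, so each subscript is scaled
    by the product of the sizes of the dimensions to its left.
    """
    if len(subscripts) != len(dims):
        raise ValueError("subscript count must match dimensionality")
    offset = 0
    stride = 1
    for subscript, size in zip(subscripts, dims):
        if not 1 <= subscript <= size:
            raise IndexError(f"subscript {subscript} out of range 1..{size}")
        offset += (subscript - 1) * stride
        stride *= size
    return offset
```

For an array declared as a[3,2], the elements are stored in the order a[1,1], a[2,1], a[3,1], a[1,2], a[2,2], a[3,2]. Note that the size of the last dimension never enters the offset itself (here it is used only for the bounds check), which is consistent with allowing ARB for the highest dimension only.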
String Declarations
A string is an EOS delimited array of type CHAR (EOS stands for End Of String). Strings may contain only character data (values 0 through 127 decimal), and must be EOS delimited. A character string may be declared in either of two ways, depending on whether initialization is desired:
char input_file[SZ_FNAME]
string legal_codes "efgdox"
The preprocessor automatically adds 1 to the declared array size, to allow space for the EOS marker. The space used by the EOS marker is not considered part of the string. Thus, the array char x[10] will contain space for ten characters, plus the EOS marker.
Global Common Declarations
Global common provides a means for sharing data between separately compiled procedures. The COMMON statement is a declaration, and must be used only in the declarations section of a procedure. Each procedure referencing the same common must declare the common in the same way:
common /common_name/ object [, object [, ... ]]
To avoid the possibility of two procedures declaring the same common area differently, the COMMON declaration should be placed in an INCLUDE file (include files are discussed in a later section).
Procedure Declarations
The form of a PROCEDURE declaration is shown below. The data_type field must be included if the procedure returns a value. The BEGIN keyword separates the declarations section from the executable body of the procedure, and is required. The END keyword must follow the last executable statement:
[data_type] PROCEDURE proc_name ([p1 [, p2 [,... ]]])
(declarations for parameters)
(declarations for local variables and functions)
(initialization)
BEGIN
(executable statements)
END
All parameters, variables, and typed procedures must be declared. The SPP language does not permit implicit typing of parameters, variables, or procedures (unlike Fortran).
If a procedure has formal parameters, they should agree in both number and type in the procedure declaration and when the procedure is called. In particular, beware of SHORT or CHAR parameters in argument lists. An INT may be passed as a parameter to a procedure expecting a SHORT integer on some machines, but this usage is NOT PORTABLE, and is not detected by the compiler. The compiler does not verify that a procedure is declared and used consistently.
If a procedure returns a value, the calling program must declare the procedure in a type declaration, and reference the procedure in an expression. If a procedure does not return a value, the calling program may reference the procedure only in a CALL statement.
Example 1: The sinc Function
This example demonstrates how to declare a typed procedure, which in this case returns a single real value. Note the inclusion of the pair of parentheses (()) in the declaration of the function SIN, to make it clear that a function is being declared, rather than a local variable. Note also the use of the RETURN statement to return the value of the function SINC:
real procedure sinc (x)
real x
real sin() # typed function; the () marks it as a function
begin
    if (x == 0.0)
        return (1.0)
    else
        return (sin(x) / x)
end
Multiple Entry Points
Procedures with multiple entry points are permitted in the subset preprocessor language because they provide an attractive alternative to global common when several procedures must have access to the same data. The multiple entry point mechanism is a primitive form of block structuring. The advantages of multiple entry points over global common are:
Access to the database is restricted to calls to the defined entry points. A secure database can thus be assured.
Initialization of data in a procedure with multiple entry points is permissible at compile time, whereas global common cannot (reliably) be initialized at compile time.
Nonetheless, the multiple entry point construct is only useful for small problems. If the problem grows too large, an enormous procedure with many entry points results, which is unacceptable.
The form of a procedure with multiple entry points is shown below. Either all entry points should be untyped, as in the example, or all entry points should return values of the same type. Control should only flow forward. Each entry point should be terminated by a RETURN statement, or by a GOTO to a common section of code which all entry points share. The shared section of code should be terminated by a single RETURN which all entry points share.
Example 2: Multiple Entry Points
procedure push (datum)
int datum # value to be pushed or popped
int stack[SZ_STACK] # the stack
int sp # the stack pointer
data sp/0/
begin
    (push datum on the stack, check for overflow)
    return
entry pop (datum)
    (pop stack into "datum", check for underflow)
    return
end
Initialization
Local variables, arrays, and character strings may be initialized at compile time with the DATA statement. Data in a global common may NOT be initialized at compile time. If initialization of data in a global common is required, it must be done at run time by an initialization procedure.
The syntax of the DATA statement is defined in the Fortran 77 standard. Some simple examples follow:
real x, y[2]
char ch[2]
data x/0/, y/1.0,2.0/, ch/'a','b',EOS/
Control Flow Constructs
The subset preprocessor provides a full set of control flow constructs, such as are found in most modern languages. Some of these have already appeared in the examples.
An SPP control flow construct executes a “statement” either conditionally or repetitively. The “statement” to be executed may be a simple one line statement, a COMPOUND STATEMENT enclosed in curly brackets or braces ({}), or the NULL STATEMENT (; on a line by itself).
conditional constructs:    IF, IF ELSE, SWITCH CASE
repetitive constructs:     DO, FOR, REPEAT UNTIL, WHILE
branching:                 BREAK, NEXT, GOTO, RETURN
Two special statements are provided to interrupt the flow of control through one of the repetitive constructs. BREAK causes an immediate exit from the loop, by jumping to the statement following the loop. NEXT shifts control to the next iteration of the loop. If BREAK and NEXT are embedded in a conditional construct, which is in turn embedded in a repetitive construct, it is the outer repetitive construct which will define where control is shifted to.
Conditional Execution
The IF and IF ELSE constructs are shown below. The expr part may be any boolean expression. The statement part may be a simple statement, compound statement enclosed in braces, or the null statement. The control flow constructs may be nested indefinitely.
IF construct
if (expr)
    statement
IF ELSE construct
if (expr)
    statement
else
    statement
ELSE IF construct
The ELSE IF construct is useful for selecting one statement to be executed from a group of possible choices. This construct is a more general form of the SWITCH CASE construct.
if (expr)
    statement
else if (expr)
    statement
else if (expr)
    statement
SWITCH CASE construct
The SWITCH CASE construct evaluates an integer expression once, then branches to the matching case. Each case must be a unique integer constant. The maximum number of cases is limited only by table space within the compiler.
A case may consist of a single integer constant, or a list of integer constants, delimited by the colon character (:). The special case DEFAULT, if included, is selected if the switch value does not match any of the other cases. If the switch value does not match any case, and there is no default case, control passes to the statement following the body of the SWITCH statement.
Each case of the SWITCH statement may consist of an arbitrary number of statements, which do not have to be enclosed in braces. The body of the switch statement, however, must be enclosed in braces as shown:
switch (int_expr) {
case int_const_list:
    statements
case int_const_list:
    statements
default:
    statements
}
example:
switch (operator) {
case '+':
    c = a + b
case '-':
    c = a - b
default:
    call error (1, "unknown operator")
}
The SWITCH construct will execute most efficiently if the cases form a monotonically increasing sequence without large gaps between the cases (i.e., case 1, case 2, case 3, etc.). The cases should, of course, be defined parameters or character constants, rather than explicit numbers.
Error Handling
The SPP language provides support for error actions, error handling and error recovery. Knowledge of the SPP error handling procedures is necessary to correctly deal with error actions initiated by the system library routines.
A recoverable error condition is asserted by a call to the ERROR statement. An irrecoverable error condition is asserted with the FATAL statement. Error recovery is implemented using the IFERR and IFNOERR statements. If an error handler is not “posted” by a call to IFERR or IFNOERR, a system defined error handler will be called, returning system resources, closing files, deleting temporary files, and aborting the program.
errchk proc1, proc2, ... # errchk declaration
iferr (procedure call or assignment statement)
    <error_action_statement>
iferr {
    <any statements, including IFERR>
} then
    <error_action_statement>
Language support includes the IFERR and IFNOERR statements and the ERRCHK declaration. The IFERR and IFNOERR statements are grammatically equivalent to the IF statement. The meaning of the IFERR statement is “if an error occurs during the processing of the enclosed code,…”. IFNOERR is equivalent, except that the sense of the test is reversed. Note that the condition to be tested in an IFERR statement may be a single or compound procedure call or assignment statement, while the IF statement tests a boolean expression.
If a procedure calls a subprocedure which may directly or indirectly take an error action, then the subprocedure must be named in an ERRCHK declaration in the calling procedure. If an error occurs during the processing of a subprocedure and an error handler is posted somewhere back up the chain of procedure calls, then control must revert immediately back up the chain of procedures to the procedure which posted the error handler. This will work only if all intermediate procedures include ERRCHK declarations for the next lower procedure in the chain.
Graphically, assume that procedure A calls B, that B in turn calls C, and so on as shown below:
A               (A posts error handler with IFERR)
    B           (B must ERRCHK procedure C)
        C       (C must ERRCHK procedure D)
            D   (D calls ERROR)
As indicated by the diagram, procedure D calls ERROR, “taking an error action”. If no handler is posted, the error action will consist of the system error recovery actions, terminating with the abort of the current program. But if an error handler is posted, as is done by procedure A in the example, then control should revert immediately to procedure A. The error handler in A might try again with slightly different parameters, perform special cleanup actions and abort, print a more meaningful error message and take another error action, print a warning message, or whatever. If the ERRCHK declaration is omitted in procedure B or C, control will not revert immediately to procedure A, and processing will erroneously continue in the intermediate procedure, as if an error had not occurred.
Several library procedures are provided in the system library for use in error handlers. The ERRACT procedure may be called in an error handler to issue the error message posted by the original ERROR call as a warning message, or to cause a particular error action to be taken. The error actions are defined in the include file <error.h>. ERRCODE returns either OK or the integer code of the posted error.
Library procedures related to error handling:
error (errcode, error_message) (language)
fatal (errcode, error_message) (library)
erract (severity) (library)
val = errcode () (library)
ERRACT severity codes <error.h>:
EA_WARN # issue a warning message
EA_ERROR # assert recoverable error
EA_FATAL # assert fatal error
An arithmetic exception (X_ARITH) will be trapped by an IFERR statement, provided the posted handler(s) return without causing error restart. X_INT and X_ACV (interrupt and access violation) may be caught only by posting an exception handler with XWHEN.
Repetitive Execution
An assortment of repetitive constructs is provided for convenience. The simplest constructs are WHILE, which tests at the top of the loop, and REPEAT UNTIL, which tests at the bottom. The DO construct is convenient for simple sequential operations on arrays. The most general repetitive construct is the FOR statement.
WHILE construct
while (expr)
    statement
REPEAT UNTIL construct
repeat {
    statements
} until (expr)
Infinite REPEAT loop
repeat {
    statements (exit with BREAK, RETURN, etc)
}
FOR loop
The FOR construct consists of an initialization part, a test part, and a loop control part. The initialization part consists of a statement which is executed once before entering the loop. The test part is a boolean expression, which is tested before each iteration of the loop. The loop control statement is executed after the last statement in the body of the FOR, before branching to the test at the beginning of the loop. When used in a FOR statement, NEXT causes a branch to the loop control statement.
The FOR construct is very general, because of the lack of restrictions on the type of initialization and loop control statements chosen. Any or all of the three parts of the FOR may be omitted, but the semicolon delimiters must be present.
for (init; test; control)
    statement
example:
for (ip=strlen(str); str[ip] != 'z' && ip > 0; ip=ip-1)
    ;
The example demonstrates the flexibility of the FOR construct. The FOR statement shown searches the string str backwards until the character ‘z’ is encountered, or until the beginning of the string is reached. Note the use of the null statement for the body of the FOR, since everything has already been done in the FOR itself. The STRLEN procedure is shown in a later example.
DO loop
The DO construct is a special case of the FOR construct. DO is ideal for simple array operations, and since it is implemented with the Fortran DO statement, its use should result in particularly efficient code.
Only INTEGER loop control expressions are permitted in the DO statement. General expressions are permitted. The loop may run forwards or backwards, with any step size. The value of the loop control parameter is UNDEFINED upon exit from the loop. The body of the DO will be executed zero times, if the initial value of the loop control parameter satisfies the termination condition.
do lcp = initial_value, final_value [, step_size]
    statement
example:
do i = 1, NPIX
    a[i] = abs (a[i])
Expressions
Every expression is characterized by a data type and a value. The data type is fixed at compile time, but the value may be either fixed at compile time, or calculated at run time. An expression may be a constant, a string constant, an array reference, a call to a typed procedure, or any combination of the above elements, in combination with one or more unary or binary operators.
Operators
Special Operators
    proc (arglist)          procedure call
    array [arglist]         array reference
Unary Operators
    -                       negation
    !                       boolean not
Binary Operators
    **                      exponentiation
    /  *  -  +              arithmetic
    ==  !=  <  >  <=  >=    boolean comparison
    &&  ||                  boolean and, or
Parentheses may be used to force the compiler to evaluate the parts of an expression in a certain order. In the absence of parentheses, the “precedence” of an operator determines the order of evaluation of an expression. The highest precedence operators are evaluated first. The precedence of the SPP operators is defined by the order in which the operators appear in the table above (procedure call has the highest precedence).
The “arglist” in a procedure or array reference consists of a list of general expressions separated by commas. If an expression contains calls to two or more procedures, the order in which the procedures are evaluated is undefined.
Mixed Mode Expressions
The binary operators combine two expressions into a single expression. If the two input expressions are of different data types, the expression is said to be a “mixed mode” expression. The data type of a mixed mode expression is defined by the order in which the types of the two input expressions appear in the table of data types given in the Data Types section. The data type which appears furthest down in this table will be the data type of the combined expression. For example, an integer plus a real produces a real. Mixed mode expressions involving booleans are illegal.
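The promotion rule can be expressed compactly. The Python sketch below is a model of the rule as stated, assuming the data type table order bool, char, short, int, long, real, double, complex; it is not taken from the preprocessor sources, and the name mixed_mode_type is hypothetical.

```python
# Non-boolean types in the order in which they appear in the data type table.
TYPE_ORDER = ["char", "short", "int", "long", "real", "double", "complex"]


def mixed_mode_type(left, right):
    """Data type of a binary expression combining two operand types."""
    if left == right:
        return left                      # not a mixed mode expression
    if "bool" in (left, right):
        raise TypeError("mixed mode expressions involving booleans are illegal")
    # The type that appears furthest down the table wins.
    return max(left, right, key=TYPE_ORDER.index)
```

For instance, combining int with real yields real, and combining short with long yields long, while combining bool with int is rejected.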
Type Coercion
The term “type coercion” refers to the conversion of an object from one data type to another. Such conversions may involve loss of information, and hence are not always reversible. Type coercion occurs automatically in mixed mode expressions, and in assignment statements. Type coercion is not permitted between booleans and the other data types.
The data type of an expression may be coerced by a call to an intrinsic function. The names of these intrinsic functions are the same as the names of the data types. Thus, int(x), where x is of type REAL, coerces x to type INT, while double(x) produces a double precision result.
The Assignment Statement
The assignment statement assigns the value of the general expression on the right side to the variable or array element given on the left side. Automatic type coercion will occur during the assignment if necessary (and legal). Multiple assignments may not be made on the same line.
Some Examples
We have now finished discussing the fundamentals of the subset preprocessor language. The following examples demonstrate two complete procedures written in the SPP language. Additional examples are given in appendix B, and in the IRAF source directories.
Example 3: Length of a String
This example demonstrates the declaration and use of a function to compute the length of a character string passed as a formal parameter. STRLEN simply inspects each character in the string, until the end of string marker (EOS) is reached:
int procedure strlen (string)
char string[ARB]
int ip
begin
    ip = 1
    while (string[ip] != EOS)
        ip = ip + 1
    return (ip - 1)
end
The code fragment shown below shows how the function STRLEN might be used in another procedure. STRLEN is called to get the index of the last character in the string, then the string is truncated by overwriting the last character with EOS. EOS is a predefined constant, which should be considered part of the language:
char string[SZ_LINE]
int string_length, strlen()
begin
    string_length = strlen (string)
    if (string_length >= 1)
        string[string_length] = EOS
Example 4: Min and Max of a Real Array
This example shows how to declare a procedure which returns its output via formal parameters, rather than as the function value. Note the use of square brackets to declare and reference arrays. If the limiting values of the data cannot be computed, the special value INDEF is returned, signifying that the limiting values are indefinite. INDEF is another predefined constant:
procedure limits (data, npix, minval, maxval)
real data[npix] # input data array
int npix # length of array
real minval, maxval # output values
int i
begin
    if (npix >= 1) {
        minval = data[1]
        maxval = data[1]
        for (i=2; i <= npix; i=i+1) {
            if (data[i] < minval)
                minval = data[i]
            if (data[i] > maxval)
                maxval = data[i]
        }
    } else {
        minval = INDEF
        maxval = INDEF
    }
end
The generalization of this procedure to handle indefinites in the input data array is left up to the reader.
Program Structure
An SPP source file may contain any number of PROCEDURE declarations, zero or one TASK statements, any number of DEFINE or INCLUDE statements, and any number of HELP text segments. By convention, global definitions and include file references should appear at the beginning of the file, followed by the task statement, if any, and the procedure declarations.
include <ctype.h> # character type definitions
include "widgets.h" # package definitions file
# This file contains the source for the tasks making up the
# Widgets analysis package (describe the contents of the file).
define MAX_WIDGETS 50 # local definitions
define NPIX 512
define LONGITUDE 7:32:23.42
task alpha, beta, epsilon=eps
# ALPHA -- (describe the alpha task)
procedure alpha()
...
Include Files
Include files are referenced at the beginning of a file to include global definitions that must be shared amongst separately compiled files, and within procedures to reference common block definitions. The INCLUDE statement is effectively replaced by the contents of the named file. Includes may be nested at least 5 deep.
The name of the file to be included must be delimited by either angle brackets (<file>) or quotation marks ("file"). The first form is used to reference the IRAF system include files. The second, more general, form may be used to include any file.
Macro Definitions
Macro definitions are invaluable for “information hiding”, and can do much to enhance the modifiability of a program. The effective use of macros also tends to improve the readability of a program. By convention, the names of macros are always upper case, to make it clear that a macro is being used, and to avoid redefinitions of ordinary variables and procedures.
There are two kinds of macros – those with arguments, and those without. Macros without arguments are the most common, and are used primarily to turn explicit constants into symbolic parameters. Examples are shown above.
Macros may also be used to reference the field of a structure, or to define inline code fragments (similar to Fortran statement functions). In the SPP, the arguments of a macro are referenced as $1, $2, and so on, in the following manner:
define I_TYPE $1[1]
define I_NPIX $1[2]
define I_COEFF $1[10]
if (I_TYPE(coeff) == LINEAR)
...
In this example, the array coeff is actually a simple structure, containing the fields i_type, i_npix, …, and i_coeff. It greatly enhances the readability of the program to refer to the fields of this structure by name, rather than by offset (coeff[2]), and furthermore makes it trivial to modify the structure.
Macros with arguments may also be used to define inline functions. For example, here are a couple of definitions of character classes from the system include file ctype.h:
define IS_UPPER ($1>='A'&&$1<='Z')
define IS_LOWER ($1>='a'&&$1<='z')
define IS_DIGIT ($1>='0'&&$1<='9')
usage:
if (IS_DIGIT(string[i])) {
...
Note that these definitions work for ASCII, but not for EBCDIC (IBM). By using macros, we have concentrated this machine dependent knowledge of the character set into a single file.
Note
In the current implementation of the SPP, macro definitions may not include string constants. All other types of constants, constant expressions, array and procedure references, are allowed. The domain of definition of a macro extends from the line following the macro, to the end of the file (except for include files). Macros are recursive. Redefinitions of macros are silently permitted.
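To illustrate a macro with several arguments, the following sketch defines a hypothetical IN_RANGE macro and uses it as an inline function. The macro and the CHECK_COUNT procedure are illustrative only and are not part of the IRAF system. Each argument is parenthesized in the definition so that an expression passed as an argument is evaluated as a unit.

```spp
define  MAX_WIDGETS     50
define  IN_RANGE        (($1)>=($2)&&($1)<=($3))        # hypothetical macro

# CHECK_COUNT -- Return YES if n is a legal widget count, NO otherwise.

int procedure check_count (n)

int     n                       # count to be tested

begin
        if (IN_RANGE (n, 1, MAX_WIDGETS))
            return (YES)
        else
            return (NO)
end
```

If the legal range ever changes, only the definitions at the head of the file need to be edited.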
The Task Statement, and Tasks
The TASK statement is used to make an IRAF task. A file need not contain a task statement, and may not contain more than a single task statement. Files without task statements are separately compiled to produce object modules, which may subsequently be linked together to make a task, or which may be installed in a library.
A single physical task (ptask) may contain one or more LOGICAL TASKS (ltasks). These tasks need not be related. Several ltasks may be grouped together into a single ptask merely to save disk storage, or to minimize the overhead of task execution. Ltasks should communicate with one another only via disk files, even if they reside in the same ptask.
task ltask1, ltask2, ltask3=proc3
The task statement defines a set of ltasks, and associates each with a compiled procedure. If only the name of the ltask is given in the task statement, the associated procedure is assumed to have the same name. A file may contain any number of ordinary procedures which are not associated (directly) with an ltask. The source for the procedure associated with a given ltask need not reside in the same file as the task statement.
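A minimal sketch of a complete source file containing two ltasks is shown below. The ltask and procedure names are hypothetical. The ltask alpha is bound to the procedure of the same name, while the ltask epsilon is bound to the procedure eps.

```spp
task    alpha, epsilon=eps

# ALPHA -- Ltask whose procedure has the same name as the ltask.

procedure alpha()

begin
        call printf ("this is alpha\n")
end


# EPS -- Procedure which is run when the ltask "epsilon" is executed.

procedure eps()

begin
        call printf ("this is epsilon\n")
end
```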
A procedure associated with an ltask must not have any arguments. An ltask procedure gets its parameters from the CL via the CL interface. The CLGETx procedures are the most commonly used. The CLPUTx procedures may be used to change the values of parameters.
task    alpha, beta, epsilon=eps

procedure alpha()

int     npix, clgeti()
real    lcut, clgetr()
char    file[SZ_FNAME]

begin
        npix = clgeti ("npix")
        lcut = clgetr ("lower_cutoff")
        call clgstr ("input_file", file, SZ_FNAME)
        ...
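The CLPUTx procedures are used in much the same way as CLGETx. The sketch below assumes that CLPUTI takes the parameter name followed by the new integer value, and that the ltask has an integer parameter named npix; the ltask name TWICE is hypothetical.

```spp
task    twice

# TWICE -- Read the integer parameter "npix" from the CL, double it,
# and write the new value back to the same parameter.

procedure twice()

int     npix, clgeti()

begin
        npix = clgeti ("npix")
        call clputi ("npix", npix * 2)
end
```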
An IRAF task may be run by the CL or called from the command interpreter provided by the host operating system, without change. Parameter requests and i/o to the standard input and output will function properly in both cases. When running without the CL, of course, the interface is much more primitive.
To run an IRAF task directly, without the CL (especially useful for debugging purposes), begin by simply running the task. The task will sense that it is being run without the CL, and issue a prompt:
> ?
alpha beta epsilon
> alpha
npix: (response)
lower_cutoff: (response)
input_file: (response)
(ltask "alpha" continues)
> bye
Every IRAF task has two special commands built in. The command ? will list the names of the ltasks recognized by the interpreter. The command bye is used to exit the interpreter.
Help Text
Documentation may be embedded in an XPP source file either by commenting out the lines of text, or by enclosing the lines of text within .help and .endhelp directives. If there are only a few lines of text, it is probably most convenient to comment them out. Large blocks of text should be enclosed by the help directives, making the text easier to edit, and accessible to the online documentation and text processing tools.
# (everything from the '#' to end of line is a comment)
.help [keyword [qualifier [package_description_string]]]
(help text)
.endhelp
The preprocessor ignores comments, and everything between .help and .endhelp directives. The directives must occur at the beginning of a line to be recognized. In both cases, the preprocessor ignores the remainder of the line. The arguments to .help are used by the HELP, MANPAGE, and LISTING utilities, but are ignored by XPP.
Help text may be typed in as it is to appear on the terminal or printer, or it may contain text processing directives. A filter (LISTING) is available to strip help text out when making listings, or to replace help text containing directives with nicely formatted text. See the LROFF documentation for a description of the IRAF text processing directives.
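As a sketch of the two styles of embedded documentation, the hypothetical file fragment below uses a comment for a one line description and a help block for a longer manual page. The procedure WCOUNT and the arguments given to .help are invented for illustration.

```spp
# WCOUNT -- Count the nonzero elements of an integer array.

.help wcount 2 "widgets package"
NAME
    wcount -- count the nonzero elements of an integer array
USAGE
    n = wcount (list, nitems)
DESCRIPTION
    Returns the number of elements of LIST[1:NITEMS] that are nonzero.
.endhelp

int procedure wcount (list, nitems)

int     list[ARB]               # array to be searched
int     nitems                  # number of elements in list
int     i, n

begin
        n = 0
        do i = 1, nitems
            if (list[i] != 0)
                n = n + 1

        return (n)
end
```

Both directives start in the first column, as required for the preprocessor to recognize them.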
Manual pages for ltasks may be stored either directly in the source file as help text segments, or in separate files. If separate source and help files are used, both files should reside in the same directory and should have the same root name, and the help text file should have the extension .hlp.
Anachronisms
Certain constructs in the subset preprocessor language are not likely to survive in their present form in the full preprocessor. These include:
the STRING declaration
the DATA statement
the COMMON statement
the POINTER data type
The STRING declaration will disappear at the same time as the DATA statement. Both will be replaced by initializations of the form:
real x = 0.0, y[] = {1.,2.,4.}
char opcodes[SZ_OPCODES] = {'f','g','e','d'}
COMMON declarations, in their present form, are cumbersome and dangerous to use. The global data capability provided by COMMON will be present in the full preprocessor in a more structured form.
The POINTER data type will be replaced by a strongly typed (and therefore much more reliable) implementation of pointers, patterned after C.
Notes on Topics not Discussed
The present version of the SPP reference manual omits a discussion of the basic i/o facilities, some of which require language support. Dynamic memory management and pointers will be covered in a later revision of the manual. Data structuring is possible in the SPP, using macros, and is discussed in the design documentation for VSIO.
Programs written in the subset preprocessor language should adhere to the (currently informal) coding standard being developed for IRAF. The coding standard has not yet been documented. Try to style procedures after those shown in the examples, and in the IRAF system source directories.
Appendix A: Predefined Constants
The subset preprocessor language includes a number of predefined symbolic constants. Included are various machine dependent constants describing the hardware and data types. Other symbolic constants are used for basic file i/o. All predefined constants are of type integer.
language and machine definitions

ARB             arbitrary (array dimension)
BOF, BOFL       beginning of file
EOF, EOFL       end of file
EOS             end of string
EPSILON         smallest real x s.t. 1+x > 1
EPSILOND        double precision epsilon
ERR             error status return
INDEF           indefinite of type REAL
INDEF[SILRDX]   indefinites for all types
MAX_DIGITS      number of digits of precision (DOUBLE)
MAX_EXPONENT    largest positive exponent
MAX_INT         largest positive integer
MAX_LONG        largest positive long integer
MAX_REAL        largest real or double
MAX_SHORT       largest short integer
MIN_REAL        smallest representable real number
NBYTES_CHAR     number of machine bytes per character
NO              opposite of YES
NULL            invalid pointer
OK              status return, opposite of ERR
SZ_BOOL         nchars per BOOL
SZ_CHAR         nchars per CHAR
SZ_COMPLEX      nchars per COMPLEX
SZ_DOUBLE       nchars per DOUBLE
SZ_FNAME        size of a file name string, chars
SZ_INT          nchars per INT
SZ_LINE         size of a file line buffer, chars
SZ_LONG         nchars per LONG
SZ_REAL         nchars per REAL
SZ_SHORT        nchars per SHORT
TY_BOOL         code for type BOOL
TY_CHAR         code for type CHAR
TY_COMPLEX      code for type COMPLEX
TY_DOUBLE       code for type DOUBLE
TY_INT          code for type INT
TY_LONG         code for type LONG
TY_REAL         code for type REAL
TY_SHORT        code for type SHORT
YES             opposite of NO
file i/o definitions

APPEND          file access mode
BINARY_FILE     file type
NEW_FILE        file access mode
READ_ONLY       file access mode
READ_WRITE      file access mode
STDERR          standard error output
STDGRAPH        standard graphics output
STDIMAGE        standard image display output
STDIN           standard input
STDOUT          standard output
STDPLOTTER      standard plotter output
TEXT_FILE       file type
WRITE_ONLY      file access mode
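A short sketch of how several of these constants are typically used together is given below: SZ_LINE dimensions the line buffer, STDIN names the input file, and EOF is the end of file status. The ltask name NLINES is hypothetical, and the GETLINE, PRINTF, and PARGI calls are assumed to be the standard file and formatted output procedures, which this manual does not describe.

```spp
task    nlines

# NLINES -- Count the lines of text read from the standard input.

procedure nlines()

char    linebuf[SZ_LINE]        # buffer for one line of text
int     n, getline()

begin
        n = 0
        while (getline (STDIN, linebuf) != EOF)
            n = n + 1

        call printf ("%d lines\n")
            call pargi (n)
end
```

Using the symbolic names rather than their numeric values keeps the procedure independent of the machine and of the current values of the constants.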
Appendix B: Detailed Examples
Example 5: Matrix Inversion
An SPP translation of Bevington’s routine to invert a matrix by Gaussian elimination with partial pivoting is shown below. The help text is shown with text formatter commands inserted. The restriction of this procedure to matrices of a fixed size is unfortunate, but we have kept it that way to conform to Bevington’s original code.
.help matinv 2 "math library"
.nf ____________________________________________________________________
NAME
    matinv -- invert a symmetric matrix and calculate its determinant.

SOURCE
    Bevington, pages 302-303.

USAGE
    call matinv (array, order, determinant)

PARAMETERS
    array       (double) Input matrix of fixed size 10 by 10 (smaller
                matrices may be placed in this matrix).  Replaced by the
                inverse upon output.
    order       (int) The number of rows and columns in the actual matrix.
    determinant (real) Determinant of input matrix.

DESCRIPTION
    The input matrix, which must be dimensioned [10,10] in the calling
    program, is inverted, and its determinant is calculated.  The
    inverse overwrites the input matrix.  The algorithm used is
    gaussian elimination with partial pivoting.
.endhelp _______________________________________________________________
define  MAX_ORDER       10              # maximum size of matrix

procedure matinv (array, order, determinant)

double  array[MAX_ORDER,MAX_ORDER]
int     order
real    determinant
int     ik[MAX_ORDER], jk[MAX_ORDER]
int     i, j, k, l
double  maxval, temp

begin
        determinant = 1.

        do k = 1, order {
            # Find largest element array[i,j] in rest of matrix.
            maxval = 0.
            repeat {
                do i = k, order
                    do j = k, order
                        if (abs(maxval) <= abs(array[i,j])) {
                            maxval = array[i,j]
                            ik[k] = i
                            jk[k] = j
                        }
                if (maxval == 0) {              # abnormal return
                    determinant = 0.0
                    return
                }

                # Interchange rows and columns to put maxval in
                # array[k,k].
                i = ik[k]
                if (i >= k) {
                    if (i != k)
                        do j = 1, order {
                            temp = array[k,j]
                            array[k,j] = array[i,j]
                            array[i,j] = -temp
                        }
                    j = jk[k]
                    if (j >= k)
                        break
                }
            }

            if (j != k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = array[i,j]
                    array[i,j] = -temp
                }

            # Accumulate elements of inverse matrix.
            do i = 1, order
                if (i != k)
                    array[i,k] = -array[i,k] / maxval
            do i = 1, order
                do j = 1, order
                    if (i != k && j != k)
                        array[i,j] = array[i,j] + array[i,k] * array[k,j]
            do j = 1, order
                if (j != k)
                    array[k,j] = array[k,j] / maxval
            array[k,k] = 1.0 / maxval
            determinant = determinant * maxval
        }

        # Restore ordering of matrix.
        do l = 1, order {
            k = order - l + 1
            j = ik[k]
            if (j > k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = -array[i,j]
                    array[i,j] = temp
                }
            i = jk[k]
            if (i > k)
                do j = 1, order {
                    temp = array[k,j]
                    array[k,j] = -array[i,j]
                    array[i,j] = temp
                }
        }
end
Example 6: Pattern Matching
The next example was selected for inclusion here because it demonstrates most of the control flow constructs, as well as the use of defined parameters. The STRMATCH procedure searches a string for the specified pattern. The pattern may contain several metacharacters, or characters which are not matched but rather which tell STRMATCH what constitutes a match. For example:
if (strmatch (line_buffer, "^{naxis}#=") > 0)
...
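In this case, STRMATCH would search for the string naxis = at the beginning of the line, ignoring case and allowing optional whitespace before the equals sign. It returns the index of the first character following the matched substring, or zero if there is no match. The metacharacters are defined in the INCLUDE file pattern.h, as follows: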
# Pattern Matching Metacharacters (STRMATCH, PATMATCH)
define CH_BOL '^' # beginning of line symbol
define CH_NOT '^' # not, in character classes
define CH_EOL '$' # end of line symbol
define CH_ANY '?' # match any single character
define CH_CLOSURE '*' # zero or more occurrences
define CH_CCL '[' # begin character class
define CH_CCLEND ']' # end character class
define CH_RANGE '-' # as in [a-z]
define CH_ESCAPE '\\' # escape character
define CH_WHITESPACE '#' # match optional whitespace
define CH_IGNORECASE '{' # begin ignoring case
define CH_MATCHCASE '}' # begin checking case
The source for the STRMATCH procedure, in file strmatch.x, follows. Though this is not a good example of modular code (the control flow is too complex), it does serve to illustrate the use of many of the control flow constructs:
include <ctype.h>
include <pattern.h>
.help strmatch, gstrmatch
.nf __________________________________________________________________
STRMATCH -- Find the first occurrence of the string A in the string B.
If not found, return zero, else return the index of the first
character following the matched substring.

GSTRMATCH -- More general version of strmatch.  The indices of the
first and last characters matched are returned as arguments.  The
function value is the same as for STRMATCH.

STRMATCH recognizes the metacharacters BOL, EOL, ANY, WHITESPACE,
IGNORECASE, and MATCHCASE (BOL and EOL are special only as the first
and last chars in the pattern).  The null pattern matches any string.
Metacharacters can be escaped.
.endhelp _____________________________________________________________

# STRMATCH -- Search a string for a pattern.

int procedure strmatch (str, pat)

char    pat[ARB], str[ARB]
int     first_char, last_char
int     gstrmatch()

begin
        return (gstrmatch (str, pat, first_char, last_char))
end


# GSTRMATCH -- Generalized strmatch which returns the indices of the
# matched substring.

int procedure gstrmatch (str, pat, first_char, last_char)

char    pat[ARB], str[ARB]
int     first_char, last_char
bool    ignore_case, bolflag
char    ch, pch                         # string, pattern characters
int     i, ip, initial_pp, pp

begin
        ignore_case = false
        bolflag = false
        ip = 1
        initial_pp = 1

        if (pat[1] == CH_BOL) {         # match at beginning of line?
            bolflag = true
            initial_pp = 2
        }

        # Try to match pattern starting at each character offset in
        # string.
        for (first_char=ip; str[ip] != EOS; ip=ip+1) {
            i = ip

            # Compare pattern to string str[ip].
            for (pp=initial_pp; pat[pp] != EOS; pp=pp+1) {
                switch (pat[pp]) {
                case CH_WHITESPACE:
                    while (IS_WHITE (str[i]))
                        i = i + 1
                case CH_ANY:
                    if (str[i] != '\n')
                        i = i + 1
                case CH_IGNORECASE:
                    ignore_case = true
                case CH_MATCHCASE:
                    ignore_case = false

                default:
                    pch = pat[pp]
                    if (pch == CH_ESCAPE && pat[pp+1] != EOS) {
                        pp = pp + 1
                        pch = pat[pp]
                    } else if (pch == CH_EOL || pch == '\n')
                        if (pat[pp+1] == EOS && str[i] == '\n') {
                            first_char = ip
                            last_char = i
                            return (last_char + 1)
                        }

                    ch = str[i]
                    i = i + 1

                    # Compare ordinary characters.  The comparison is
                    # trivial unless case insensitivity is required.
                    if (ignore_case) {
                        if (IS_UPPER (ch)) {
                            if (IS_UPPER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_LOWER (ch))
                                break
                        } else if (IS_LOWER (ch)) {
                            if (IS_LOWER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_UPPER (ch))
                                break
                        } else {
                            if (pch != ch)
                                break
                        }
                    } else {
                        if (pch != ch)
                            break
                    }
                }
            }

            # If the above loop was exited before the end of the pattern
            # was reached, the pattern did not match.
            if (pat[pp] == EOS) {
                first_char = ip
                last_char = i-1
                return (i)
            } else if (bolflag || str[i] == EOS)
                break
        }

        return (0)                      # no match
end
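As a usage sketch for STRMATCH, the hypothetical function below relies on the fact that the function value is the index of the first character following the matched substring, and uses that index to decode the number after the keyword. The name GET_NAXIS is invented for illustration; IS_WHITE and IS_DIGIT are assumed to come from ctype.h, as in the source above.

```spp
include <ctype.h>

# GET_NAXIS -- Return the unsigned integer which follows "naxis =" at the
# start of a line of text, or ERR if the line does not match the pattern.

int procedure get_naxis (line)

char    line[ARB]               # line of text to be searched
int     ip, value
int     strmatch()

begin
        # On a match, ip indexes the first character after the "=".
        ip = strmatch (line, "^{naxis}#=")
        if (ip == 0)
            return (ERR)

        while (IS_WHITE (line[ip]))
            ip = ip + 1

        # Accumulate decimal digits; stops at EOS or any non-digit.
        value = 0
        while (IS_DIGIT (line[ip])) {
            value = value * 10 + (line[ip] - '0')
            ip = ip + 1
        }

        return (value)
end
```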
Example 7: Error Handling
The following simple procedure reads a list of file names from the CL, and attempts to delete each file. The DELETE library procedure will take an error action if it cannot delete a file; this is not what is desired, so we post an error handler and reissue the error message from DELETE as a warning message:
include <error.h>

# DELETE_FILES -- Delete a list of files.

procedure delete_files()

char    filename[SZ_FNAME]              # name of file to be deleted
int     list, clpopns(), clgfil()

begin
        # Fetch template and open it as a list of files.
        list = clpopns ("template")

        # Read successive file names from the list, and delete each
        # file.
        while (clgfil (list, filename, SZ_FNAME) != EOF)
            iferr (call delete (filename))
                call erract (EA_WARN)

        call clpcls (list)
end
The Fortran output for the DELETE_FILES procedure is shown below. Note the implementation of the template string, the mapping of long identifiers into 6 character Fortran identifiers, and the implementation of the while statement using GOTO.
      subroutine delets()
      integer*2 filene(33 +1)
      integer list, clpops, clgfil
      integer*2 st0001(9)
      logical xerpop
      data st0001 /116,101,109,112,108, 97,116,101, 0/
      save
      list = clpops (st0001)
110   if (.not.(clgfil (list, filene, 33 ) .ne. (-2))) goto 111
      call xerpsh
      call delete (filene)
      if (.not.xerpop()) goto 120
      call erract (3 )
120   continue
      goto 110
111   continue
      call clpcls (list)
100   return
      end
C   delets   delete_files
C   filene   filename
C   clpops   clpopns
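The statement executed by IFERR when an error occurs may also be a compound statement. The sketch below is a hypothetical variant of DELETE_FILES which counts the files that could not be deleted. It assumes that a braced block is accepted as the error handler, as for any other statement, and it uses only the library procedures appearing in the example above.

```spp
include <error.h>

# DELETE_COUNT -- Delete a list of files, and return the number of files
# which could not be deleted.

int procedure delete_count()

char    filename[SZ_FNAME]              # name of file to be deleted
int     list, nfailed
int     clpopns(), clgfil()

begin
        list = clpopns ("template")
        nfailed = 0

        while (clgfil (list, filename, SZ_FNAME) != EOF)
            iferr (call delete (filename)) {
                # Reissue the error as a warning and count the failure.
                call erract (EA_WARN)
                nfailed = nfailed + 1
            }

        call clpcls (list)
        return (nfailed)
end
```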
Comments
A comment begins with the character # and ends at the end of the line.