SPP Reference Manual

Authors:: Douglas Tody, Kitt Peak National Observatory
Date:: January 1983 (revised September 1983)
Abstract:: The IRAF subset preprocessor language (SPP) implements a subset of the proposed full IRAF scientific programming language. This paper defines the language and gives several examples of procedures written in the language. The differences between the subset language and the full language are summarized. IRAF tasks or programs are discussed, and a working example of an IRAF program is shown. The XC compiler for the SPP language on the UNIX software development system is introduced.

Introduction

The IRAF subset preprocessor language (SPP) implements a subset of the proposed IRAF preprocessor language, described in the paper “The Role of the Preprocessor”. The subset does not include pointers, structures, and dynamic and virtual arrays. Subset preprocessor programs should be written with eventual conversion to the full preprocessor language in mind. In general, this conversion will not be difficult, provided one generates simple and straightforward code.

One of the best ways to learn to program in a new language is to read the source listings of existing programs. Many examples of procedures, programs, and packages written in the SPP language can be found in the IRAF system source directories. We shall only attempt to summarize the basic features of the language here.

Getting Started

The best way to get started is to build and run a simple program, before attempting to learn all the details of the language. Here is our version of the “hello world” program from the C book:

# Simple program to print "hello, world" on the standard
# output.

task    hello                   # CL callable task


procedure hello()               # common procedure

begin
        call printf ("hello, world\n")
end

On the UNIX system, this program would be placed in a file with the extension .x and compiled with the command XC (X Compiler) as follows:

xc hello.x

The XC compiler will translate the program into Fortran, call the Fortran compiler to generate the object file (hello.o), and call the loader to link the object file with modules from the IRAF system libraries to produce the executable program hello. XC may be used to compile C and Fortran programs as well as X programs, and in general behaves very much like CC or F77 (note that the -o flag is not required; by default the name of the output module is the base name of the first file name on the command line). The -F flag may be used to inspect the Fortran generated by the preprocessor; this is sometimes necessary to interpret error messages from the F77 compiler.

Fundamentals of the Language

The SPP language is based on the Ratfor language. The lexical form, operators, and control flow constructs are identical to those provided by Ratfor. The major differences are the data types, the form of a procedure, the addition of inline strings and character constants, the use of square brackets for arrays, and of course the TASK statement. The i/o facilities provided are quite different.

Lexical Form

A subset preprocessor program consists of a sequence of lines of text. The length of a line is arbitrary, but the SPP is guaranteed to be able to handle only lines of up to 160 characters in length. The end of each line is marked by the NEWLINE character. Both upper and lower case characters are permitted. Case is significant.

WHITESPACE is defined as one or more tabs or spaces. Newline normally marks the end of a statement, and is not considered to be whitespace. Whitespace always delimits tokens, i.e., keywords and operators will not be recognized as such if they contain embedded whitespace.

Comments

A comment begins with the character # and ends at the end of the line.

Continuation

Statements may span several lines. A line which ends with an operator (excluding /) or punctuation character (comma or semicolon) is automatically understood to be continued on the following line.

Integer Constants

A decimal integer constant is a sequence of one or more of the digits 0-9. An octal constant is a sequence of one or more of the digits 0-7, followed by the letter b or B. A hexadecimal integer constant is one of the digits 0-9, followed by zero or more of the digits 0-9, the letters a-f, or the letters A-F, followed by the letter x or X. The following notation more concisely summarizes these definitions:

decimal constant	`[0-9]+`
octal constant	`[0-7]+('b'|'B')`
hexadecimal constant	`[0-9][0-9a-fA-F]*('x'|'X')`
identifier	`[a-zA-Z][a-zA-Z_0-9]*`

In the notation used above, + means 1 or more, * means zero or more, - implies a range, and | means “or”. Brackets ([]) define a class of characters. Thus, [0-9]+ reads “one or more of the characters 0 through 9”.
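The definitions above translate directly into regular expressions. The following Python sketch is only an illustration of the notation, not part of the SPP; the function name integer_value is hypothetical.

```python
import re

# Patterns transcribed from the notation above.
DECIMAL = re.compile(r"[0-9]+")
OCTAL = re.compile(r"[0-7]+[bB]")
HEX = re.compile(r"[0-9][0-9a-fA-F]*[xX]")


def integer_value(token):
    """Return the value of an integer constant, or None if it is not one."""
    if OCTAL.fullmatch(token):
        return int(token[:-1], 8)     # drop the trailing b or B
    if HEX.fullmatch(token):
        return int(token[:-1], 16)    # drop the trailing x or X
    if DECIMAL.fullmatch(token):
        return int(token, 10)
    return None


print(integer_value("100"))    # decimal
print(integer_value("177b"))   # octal
print(integer_value("0FFx"))   # hexadecimal
print(integer_value("FFx"))    # None: a hexadecimal constant must start with a digit
```

The leading digit required of a hexadecimal constant is what distinguishes it from an identifier such as FFx.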

Floating Point Constants

A floating point constant (type REAL or DOUBLE) consists of a decimal integer, followed by a decimal point, followed by a decimal fraction, followed by one of (e|E|d|D), followed by a decimal integer, which may be negative. Either the decimal integer or the decimal fraction part must be present. The number must contain either the decimal point or the exponent (or both). Embedded whitespace is not permitted.

The following are all legal floating point numbers:

.01
100.
100.01
1E5
1E-5
1.00D5
1.0D0
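The rules for floating point constants can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the names is_float_constant and float_value are hypothetical, and only a minus sign is accepted in the exponent, since that is the only sign the text mentions.

```python
import re

# Optional integer part, optional point, optional fraction, optional exponent.
FLOAT = re.compile(
    r"(?P<integer>[0-9]*)(?P<point>\.?)(?P<fraction>[0-9]*)"
    r"(?P<exponent>[eEdD]-?[0-9]+)?"
)


def is_float_constant(token):
    """Return True if token satisfies the rules stated above."""
    match = FLOAT.fullmatch(token)
    if match is None:
        return False
    # Either the integer part or the fraction must be present.
    has_digits = bool(match.group("integer") or match.group("fraction"))
    # The number must contain the decimal point or the exponent (or both).
    has_point_or_exponent = bool(match.group("point") or match.group("exponent"))
    return has_digits and has_point_or_exponent


def float_value(token):
    """Convert a legal constant to a Python float (D exponents are read as E)."""
    if not is_float_constant(token):
        raise ValueError("not a floating point constant: " + token)
    return float(token.replace("d", "e").replace("D", "E"))


for text in [".01", "100.", "100.01", "1E5", "1E-5", "1.00D5", "1.0D0", "100"]:
    print(text, is_float_constant(text))
```

The last token, 100, is rejected because it has neither a decimal point nor an exponent; it is an integer constant.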

A floating constant may also be given in sexagesimal format, i.e., in hours and minutes, or in hours, minutes, and seconds. The number of colon separated fields must be two or three, and the number of decimal digits in the second field and in the integer part of the third field is limited to exactly two. The decimal point is optional:

00:01           = 0.017
00:00:01        = 0.00028
01:00:00        = 1.0
01:00:00.00     = 1.0
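The value of a sexagesimal constant is the first field, plus the second field divided by 60, plus the third field (if any) divided by 3600. The Python sketch below illustrates that arithmetic; the name sexagesimal_value is hypothetical, and negative values are not handled since the text does not describe them.

```python
def sexagesimal_value(token):
    """Return the decimal value of a constant such as 7:32:23.42."""
    fields = token.split(":")
    if len(fields) not in (2, 3):
        raise ValueError("expected two or three colon separated fields")
    # Fields after the first must have exactly two digits before any point.
    for field in fields[1:]:
        if len(field.split(".")[0]) != 2:
            raise ValueError("field must have two digits: " + field)
    value = 0.0
    for power, field in enumerate(fields):
        value += float(field) / 60 ** power
    return value


print(round(sexagesimal_value("00:01"), 3))      # one minute, in hours
print(round(sexagesimal_value("00:00:01"), 5))   # one second, in hours
print(sexagesimal_value("01:00:00"))
```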

Character Constants

A character constant consists of from 1 to 4 characters delimited at front and rear by the single quote (', as opposed to the double quotes used to delimit string constants). A character constant is numerically equivalent to the corresponding decimal integer, and may be used wherever an integer constant would be used.

`'a'`	integer equivalent of the letter ‘a’
`'\n'`	integer equiv. of the newline character
`'\007'`	the octal integer 07B
`'\\'`	the integer equiv. of the character ‘\’

The backslash character (\) is used to form “escape sequences”. The following escape sequences are defined:

`\b`	backspace
`\f`	formfeed
`\n`	newline
`\r`	carriage return
`\t`	tab
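Since a character constant is just an integer, its value can be computed from its spelling. The Python sketch below assumes the ASCII character set and handles only the forms shown above (a single character, a listed escape sequence, an escaped backslash, or a backslash followed by octal digits); the name char_constant_value is hypothetical.

```python
# Escape sequences listed above, plus the escaped backslash shown in the table.
ESCAPES = {"b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "\\": 92}


def char_constant_value(token):
    """Return the integer equivalent of a quoted character constant."""
    if len(token) < 3 or token[0] != "'" or token[-1] != "'":
        raise ValueError("not a character constant: " + token)
    body = token[1:-1]
    if len(body) > 4:
        raise ValueError("too many characters: " + token)
    if body[0] != "\\":
        if len(body) != 1:
            raise ValueError("not a character constant: " + token)
        return ord(body)              # ASCII is assumed here
    rest = body[1:]
    if rest and all(ch in "01234567" for ch in rest):
        return int(rest, 8)           # octal form such as '\007'
    if rest in ESCAPES:
        return ESCAPES[rest]
    raise ValueError("unknown escape sequence: " + token)


print(char_constant_value("'a'"))
print(char_constant_value(r"'\n'"))
print(char_constant_value(r"'\007'"))
```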

String Constants

A string constant is a sequence of characters enclosed in double quotes. The double quote itself may be included in the string by escaping it with backslash. All of the escape sequences given above are recognized. The backslash character itself must be escaped to be included in the string. A string constant may not span several lines of text.

Identifiers

An identifier is an upper or lower case letter, followed by zero or more upper or lower case letters, digits, or the underscore character. Identifiers may be as long as desired, but only the first five characters and the last character are significant.
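Because only the first five characters and the last character are significant, two long identifiers can silently refer to the same object. The Python sketch below takes the rule literally; it is an illustration only, the names significant_form and collisions are hypothetical, and no assumption is made about how the SPP treats case or underscores internally.

```python
def significant_form(name):
    """Return the part of an identifier that is significant."""
    if len(name) <= 6:
        return name
    return name[:5] + name[-1]        # first five characters plus the last


def collisions(names):
    """Return pairs of distinct identifiers that share a significant form."""
    first_seen = {}
    clashes = []
    for name in names:
        other = first_seen.setdefault(significant_form(name), name)
        if other != name:
            clashes.append((other, name))
    return clashes


print(significant_form("input_file"))
print(collisions(["image_width", "image_depth", "npix"]))
```

Running such a check over the identifiers of a procedure is a cheap way to catch names like image_width and image_depth, which differ only in characters that are not significant.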

The following identifiers are reserved (though some are not actually used at present):

auto	do	include	short
begin	double	int	sizeof
bool	else	long	static
break	end	map	struct
call	entry	next	switch
case	extern	plot	task
char	false	printf	true
clgetpar	for	procedure	union
clputpar	getpix	putpix	unmap
common	goto	real	until
complex	if	repeat	virtual
data	iferr	return	vstruct
define	imstruct	scan	while

Data Types

The subset preprocessor language supports a fairly wide range of data types. The actual mapping of an SPP data type into a Fortran data type depends on what the target compiler has to offer.

bool	boolean (Fortran LOGICAL)
char	character (8 bit signed)
short	short integer
int	integer (Fortran INTEGER)
long	long integer
real	single precision floating (Fortran REAL)
double	double precision floating (DOUBLE PRECISION)
complex	single precision complex (Fortran COMPLEX)

The only permissible values for a boolean variable are true and false. The CHAR data type belongs to the family of integer data types, i.e., a CHAR variable or array behaves like an integer variable or array. The value of a CHAR variable may range from -127 to 127. CHAR and SHORT are signed integer data types (i.e., they may take on negative values).

In addition to the eight primitive data types listed above, the SPP language provides the abstract type POINTER. The SPP language makes no distinction between pointers to different types of objects, unlike more strongly typed languages such as C (and the full preprocessor). The SPP implementation of the POINTER data type is a stopgap measure.

Declarations

The SPP language implements named procedures with formal parameters and local variables. Global common and dynamic memory allocation may be used to share data amongst procedures. A procedure may return a value, but may not return an array or string. Declarations are included for procedures, variables, arrays, strings, typed procedures, external procedures, and global common areas. Storage for local and global variables and arrays may be assumed to be statically allocated.

Variable, Array, and Function Declarations

The language does not require that parameters be declared before local variables and functions, but doing so is good practice. The syntax of a type declaration is the same for parameters, variables, and procedures:

type_spec       object [, object [,... ]]

Here, type_spec may be any of the eight primitive data types, a derived type such as POINTER, or EXTERN. A list of one or more data objects follows. An object may be a variable, array, or procedure. The declaration for each type of object (variable, array, or procedure) has a unique syntax, as follows:

variable        identifier
array           identifier "[" dimension_list "]"
procedure       identifier "()"

Procedures may be passed to other procedures as formal parameters. If a procedure is to be passed to a called procedure as a formal parameter, it must be declared in the calling procedure as an object of type EXTERN.

Array Declarations

Arrays are one-indexed. The storage order is fixed in such a way that when the elements of the array are accessed in storage order, the leftmost subscript varies fastest. Arrays of up to three dimensions are permitted.
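The storage order rule determines where each element lies relative to the first. The Python sketch below computes the zero-based offset of an element from its one-indexed subscripts; it is an illustration of the rule only, and the name storage_offset is hypothetical.

```python
def storage_offset(subscripts, dims):
    """Return the zero-based position in storage order of an array element.

    subscripts are one-indexed; dims gives the size of each dimension.
    """
    if len(subscripts) != len(dims) or not 1 <= len(dims) <= 3:
        raise ValueError("expected one to three matching subscripts")
    offset = 0
    stride = 1
    for index, size in zip(subscripts, dims):
        if not 1 <= index <= size:
            raise IndexError("subscript out of range")
        offset += (index - 1) * stride   # leftmost subscript has stride 1
        stride *= size
    return offset


# For an array declared [4,3], stepping the leftmost subscript moves
# one element, while stepping the second subscript moves four elements.
print(storage_offset((2, 1), (4, 3)))
print(storage_offset((1, 2), (4, 3)))
```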

The size of each dimension of an array may be specified by any compile time constant expression, or by an integer parameter or parameters, if the array is a formal parameter to the procedure. If the array is declared as a formal parameter, and the size of the highest dimension is unknown, the size of that dimension should be given as ARB (for arbitrary):

real    data[ARB]               # length of array is unknown
short   raster[NPIX*2,128]      # 2-dim array

The declared dimensionality of an array passed as a formal parameter to a procedure may be less than or equal to the actual dimensionality of the array.

String Declarations

A string is an EOS delimited array of type CHAR (EOS stands for End Of String). Strings may contain only character data (values 0 through 127 decimal), and must be EOS delimited. A character string may be declared in either of two ways, depending on whether initialization is desired:

char    input_file[SZ_FNAME]
string  legal_codes "efgdox"

The preprocessor automatically adds 1 to the declared array size, to allow space for the EOS marker. The space used by the EOS marker is not considered part of the string. Thus, the array char x[10] will contain space for ten characters, plus the EOS marker.

Global Common Declarations

Global common provides a means for sharing data between separately compiled procedures. The COMMON statement is a declaration, and must be used only in the declarations section of a procedure. Each procedure referencing the same common must declare the common in the same way:

common /common_name/ object [, object [, ... ]]

To avoid the possibility of two procedures declaring the same common area differently, the COMMON declaration should be placed in an INCLUDE file (include files are discussed in a later section).

Procedure Declarations

The form of a PROCEDURE declaration is shown below. The data_type field must be included if the procedure returns a value. The BEGIN keyword separates the declarations section from the executable body of the procedure, and is required. The END keyword must follow the last executable statement:

[data_type] PROCEDURE proc_name ([p1 [, p2 [,... ]]])

(declarations for parameters)
(declarations for local variables and functions)
(initialization)

BEGIN
    (executable statements)
END

All parameters, variables, and typed procedures must be declared. The SPP language does not permit implicit typing of parameters, variables, or procedures (unlike Fortran).

If a procedure has formal parameters, they should agree in both number and type in the procedure declaration and when the procedure is called. In particular, beware of SHORT or CHAR parameters in argument lists. An INT may be passed as a parameter to a procedure expecting a SHORT integer on some machines, but this usage is NOT PORTABLE, and is not detected by the compiler. The compiler does not verify that a procedure is declared and used consistently.

If a procedure returns a value, the calling program must declare the procedure in a type declaration, and reference the procedure in an expression. If a procedure does not return a value, the calling program may reference the procedure only in a CALL statement.

Example 1: The sinc Function

This example demonstrates how to declare a typed procedure, which in this case returns a single real value. Note the inclusion of the empty parentheses (()) in the declaration of the function SIN, to make it clear that a function is being declared, rather than a local variable. Note also the use of the RETURN statement to return the value of the function SINC:

real procedure sinc (x)

real    x, sin()

begin

    if (x == 0.0)
        return (1.0)
    else
        return (sin(x) / x)

end

Multiple Entry Points

Procedures with multiple entry points are permitted in the subset preprocessor language because they provide an attractive alternative to global common when several procedures must have access to the same data. The multiple entry point mechanism is a primitive form of block structuring. The advantages of multiple entry points over global common are:

Access to the database is restricted to calls to the defined entry points. A secure database can thus be assured.
Initialization of data in a procedure with multiple entry points is permissible at compile time, whereas global common cannot (reliably) be initialized at compile time.

Nonetheless, the multiple entry point construct is only useful for small problems. If the problem grows too large, an enormous procedure with many entry points results, which is unacceptable.

The form of a procedure with multiple entry points is shown below. Either all entry points should be untyped, as in the example, or all entry points should return values of the same type. Control should only flow forward. Each entry point should be terminated by a RETURN statement, or by a GOTO to a common section of code which all entry points share. The shared section of code should be terminated by a single RETURN which all entry points share.

Example 2: Multiple Entry Points

procedure push (datum)

int     datum                   # value to be pushed or popped
int     stack[SZ_STACK]         # the stack
int     sp                      # the stack pointer
data    sp/0/

begin
    (push datum on the stack, check for overflow)
    return

entry   pop (datum)
    (pop stack into "datum", check for underflow)
    return
end
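The data hiding provided by Example 2 can be modeled in Python with a closure: the stack and the stack pointer are visible only to the two entry points. This is an analogy rather than a translation; the name make_stack and the value given to SZ_STACK are assumptions, and pop returns the datum instead of storing it in a formal parameter.

```python
SZ_STACK = 100                        # assumed value, for illustration only


def make_stack(size=SZ_STACK):
    """Return the two entry points (push, pop) sharing one private stack."""
    stack = [0] * size                # the stack
    sp = 0                            # the stack pointer, initialized once

    def push(datum):
        nonlocal sp
        if sp >= size:
            raise OverflowError("stack overflow")
        stack[sp] = datum
        sp += 1

    def pop():
        nonlocal sp
        if sp <= 0:
            raise IndexError("stack underflow")
        sp -= 1
        return stack[sp]

    return push, pop


push, pop = make_stack()
push(10)
push(20)
print(pop(), pop())
```

As in the SPP version, no code outside the two entry points can read or modify the stack directly.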

Initialization

Local variables, arrays, and character strings may be initialized at compile time with the DATA statement. Data in a global common may NOT be initialized at compile time. If initialization of data in a global common is required, it must be done at run time by an initialization procedure.

The syntax of the DATA statement is defined in the Fortran 77 standard. Some simple examples follow:

real    x, y[2]
char    ch[2]
data    x/0/, y/1.0,2.0/, ch/'a','b',EOS/

Control Flow Constructs

The subset preprocessor provides a full set of control flow constructs, such as are found in most modern languages. Some of these have already appeared in the examples.

An SPP control flow construct executes a “statement” either conditionally or repetitively. The “statement” to be executed may be a simple one line statement, a COMPOUND STATEMENT enclosed in curly brackets or braces ({}), or the NULL STATEMENT (; on a line by itself).

conditional constructs:	IF, IF ELSE, SWITCH CASE
repetitive constructs:	DO, FOR, REPEAT UNTIL, WHILE
branching:	BREAK, NEXT, GOTO, RETURN

Two special statements are provided to interrupt the flow of control through one of the repetitive constructs. BREAK causes an immediate exit from the loop, by jumping to the statement following the loop. NEXT shifts control to the next iteration of the loop. If BREAK and NEXT are embedded in a conditional construct, which is in turn embedded in a repetitive construct, it is the outer repetitive construct which will define where control is shifted to.
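The effect of BREAK and NEXT inside a conditional can be seen in the following Python sketch, in which Python's break and continue play the roles of BREAK and NEXT. The function sum_until_zero is a hypothetical example, not a library procedure.

```python
def sum_until_zero(values):
    """Sum the positive values, stopping at the first zero."""
    total = 0
    for value in values:
        if value < 0:
            continue                  # like NEXT: go on to the next iteration
        if value == 0:
            break                     # like BREAK: leave the enclosing loop
        total += value
    return total


print(sum_until_zero([3, -1, 4, 0, 5]))
```

Both statements sit inside an IF, yet they act on the loop that encloses the IF, which is the behavior described above.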

Conditional Execution

The IF and IF ELSE constructs are shown below. The expr part may be any boolean expression. The statement part may be a simple statement, compound statement enclosed in braces, or the null statement. The control flow constructs may be nested indefinitely.

IF construct

if (expr)
    statement

IF ELSE construct

if (expr)
    statement
else
    statement

ELSE IF construct

The ELSE IF construct is useful for selecting one statement to be executed from a group of possible choices. This construct is a more general form of the SWITCH CASE construct.

if (expr)
    statement
else if (expr)
    statement
else if (expr)
    statement

SWITCH CASE construct

The SWITCH CASE construct evaluates an integer expression once, then branches to the matching case. Each case must be a unique integer constant. The maximum number of cases is limited only by table space within the compiler.

A case may consist of a single integer constant, or a list of integer constants, delimited by the character :. The special case DEFAULT, if included, is selected if the switch value does not match any of the other cases. If the switch value does not match any case, and there is no default case, control passes to the statement following the body of the SWITCH statement.

Each case of the SWITCH statement may consist of an arbitrary number of statements, which do not have to be enclosed in braces. The body of the switch statement, however, must be enclosed in braces as shown:

switch (int_expr) {
case int_const_list:
    statements
case int_const_list:
    statements
default:
    statements
}

example:

switch (operator) {
case '+':
    c = a + b
case '-':
    c = a - b
default:
    call error (1, "unknown operator")
}

The SWITCH construct will execute most efficiently if the cases form a monotonically increasing sequence without large gaps between the cases (i.e., case 1, case 2, case 3, etc.). The cases should, of course, be defined parameters or character constants, rather than explicit numbers.
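The selection rules for SWITCH can be summarized in a small Python model. It assumes, as the example suggests, that control leaves the SWITCH after the statements of the selected case have been executed. The names run_switch and apply_operator are hypothetical.

```python
def run_switch(value, cases, default=None):
    """Select and run one case.

    cases is a list of (constants, action) pairs; default is an optional
    action. Returns None if nothing matches and there is no default.
    """
    seen = set()
    for constants, _ in cases:
        for constant in constants:
            if constant in seen:
                raise ValueError("each case must be a unique integer constant")
            seen.add(constant)
    for constants, action in cases:
        if value in constants:
            return action()
    if default is not None:
        return default()
    return None                       # fall out of the SWITCH


def apply_operator(operator, a, b):
    """Python counterpart of the operator example above."""
    def unknown():
        raise ValueError("unknown operator")

    cases = [((ord("+"),), lambda: a + b),
             ((ord("-"),), lambda: a - b)]
    return run_switch(ord(operator), cases, default=unknown)


print(apply_operator("+", 2, 3))
print(apply_operator("-", 2, 3))
```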

Error Handling

The SPP language provides support for error actions, error handling and error recovery. Knowledge of the SPP error handling procedures is necessary to correctly deal with error actions initiated by the system library routines.

A recoverable error condition is asserted by a call to the ERROR statement. An irrecoverable error condition is asserted with the FATAL statement. Error recovery is implemented using the IFERR and IFNOERR statements. If an error handler is not “posted” by a call to IFERR or IFNOERR, a system defined error handler will be called, returning system resources, closing files, deleting temporary files, and aborting the program.

errchk  proc1, proc2, ...               # errchk declaration

iferr (procedure call or assignment statement)
    <error_action_statement>

iferr {
    <any statements, including IFERR>
} then
    <error_action_statement>

Language support includes the IFERR and IFNOERR statements and the ERRCHK declaration. The IFERR and IFNOERR statements are grammatically equivalent to the IF statement. The meaning of the IFERR statement is “if an error occurs during the processing of the enclosed code,…”. IFNOERR is equivalent, except that the sense of the test is reversed. Note that the condition to be tested in an IFERR statement may be a single or compound procedure call or assignment statement, while the IF statement tests a boolean expression.

If a procedure calls a subprocedure which may directly or indirectly take an error action, then the subprocedure must be named in an ERRCHK declaration in the calling procedure. If an error occurs during the processing of a subprocedure and an error handler is posted somewhere back up the chain of procedure calls, then control must revert immediately back up the chain of procedures to the procedure which posted the error handler. This will work only if all intermediate procedures include ERRCHK declarations for the next lower procedure in the chain.

Graphically, assume that procedure A calls B, that B in turn calls C, and so on as shown below:

A                       (A posts error handler with IFERR)
    B                   (B must ERRCHK procedure C)
        C               (C must ERRCHK procedure D)
            D           (D calls ERROR)

As indicated by the diagram, procedure D calls ERROR, “taking an error action”. If no handler is posted, the error action will consist of the system error recovery actions, terminating with the abort of the current program. But if an error handler is posted, as is done by procedure A in the example, then control should revert immediately to procedure A. The error handler in A might try again with slightly different parameters, perform special cleanup actions and abort, print a more meaningful error message and take another error action, print a warning message, or whatever. If the ERRCHK declaration is omitted in procedure B or C, control will not revert immediately to procedure A, and processing will erroneously continue in the intermediate procedure, as if an error had not occurred.
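The consequence of omitting an ERRCHK declaration can be illustrated with a toy model in Python. The model assumes that a posted error is recorded in a flag which a procedure tests after a call only if it has declared ERRCHK for the called procedure; this mechanism is an assumption made for illustration, and the name run_chain is hypothetical.

```python
def run_chain(errchk_in_b, errchk_in_c):
    """Trace the chain A -> B -> C -> D when D takes an error action."""
    trace = []
    error = {"code": 0}               # 0 plays the role of OK

    def d():
        error["code"] = 1             # D calls ERROR
        trace.append("D: error posted")

    def c():
        d()
        if errchk_in_c and error["code"]:
            return                    # return at once to the caller
        trace.append("C: continues after D")

    def b():
        c()
        if errchk_in_b and error["code"]:
            return
        trace.append("B: continues after C")

    def a():
        b()                           # called under IFERR
        if error["code"]:
            trace.append("A: handler runs")

    a()
    return trace


print(run_chain(True, True))
print(run_chain(False, True))
```

With both declarations present, control passes straight from D back to the handler in A. With the declaration missing from B, the trace shows B continuing as if no error had occurred.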

Several library procedures are provided in the system library for use in error handlers. The ERRACT procedure may be called in an error handler to issue the error message posted by the original ERROR call as a warning message, or to cause a particular error action to be taken. The error actions are defined in the include file <error.h>. ERRCODE returns either OK or the integer code of the posted error.

Library procedures related to error handling:

        error (errcode, error_message)        (language)
        fatal (errcode, error_message)        (library)
       erract (severity)                      (library)
val = errcode ()                              (library)

ERRACT severity codes <error.h>:

EA_WARN                 # issue a warning message
EA_ERROR                # assert recoverable error
EA_FATAL                # assert fatal error

An arithmetic exception (X_ARITH) will be trapped by an IFERR statement, provided the posted handler(s) return without causing error restart. X_INT and X_ACV (interrupt and access violation) may be caught only by posting an exception handler with XWHEN.

Repetitive Execution

An assortment of repetitive constructs is provided for convenience. The simplest constructs are WHILE, which tests at the top of the loop, and REPEAT UNTIL, which tests at the bottom. The DO construct is convenient for simple sequential operations on arrays. The most general repetitive construct is the FOR statement.

WHILE construct

while (expr)
    statement

REPEAT UNTIL construct

repeat {
    statements
} until (expr)

Infinite REPEAT loop

repeat {
    statements                  (exit with BREAK, RETURN, etc)
}

FOR loop

The FOR construct consists of an initialization part, a test part, and a loop control part. The initialization part consists of a statement which is executed once before entering the loop. The test part is a boolean expression, which is tested before each iteration of the loop. The loop control statement is executed after the last statement in the body of the FOR, before branching to the test at the beginning of the loop. When used in a FOR statement, NEXT causes a branch to the loop control statement.

The FOR construct is very general, because of the lack of restrictions on the type of initialization and loop control statements chosen. Any or all of the three parts of the FOR may be omitted, but the semicolon delimiters must be present.

for (init;  test;  control)     # FOR construct
    statement

example:

for (ip=strlen(str);  str[ip] != 'z' && ip > 0;  ip=ip-1)
    ;

The example demonstrates the flexibility of the FOR construct. The FOR statement shown searches the string str backwards until the character ‘z’ is encountered, or until the beginning of the string is reached. Note the use of the null statement for the body of the FOR, since everything has already been done in the FOR itself. The STRLEN procedure is shown in a later example.
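The three parts of the FOR correspond to the initialization, test, and control steps of an ordinary WHILE loop, as the following Python sketch of the same search shows. The name last_index_of is hypothetical; the index returned is one-based, as in the SPP example, and the sketch tests the index before reading the character so that nothing before the first character is examined.

```python
def last_index_of(text, target="z"):
    """Return the one-based index of the last target in text, or 0."""
    ip = len(text)                                # initialization part
    while ip > 0 and text[ip - 1] != target:      # test part
        ip = ip - 1                               # loop control part
    return ip


print(last_index_of("jazz"))
print(last_index_of("abc"))
```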

DO loop

The DO construct is a special case of the FOR construct. DO is ideal for simple array operations, and since it is implemented with the Fortran DO statement, its use should result in particularly efficient code.

Only INTEGER loop control parameters are permitted in the DO statement. General expressions are permitted for the initial value, final value, and step size. The loop may run forwards or backwards, with any step size. The value of the loop control parameter is UNDEFINED upon exit from the loop. The body of the DO will be executed zero times, if the initial value of the loop control parameter satisfies the termination condition.

do lcp = initial_value, final_value [, step_size]
    statement

example:

do i = 1, NPIX                  # DO construct
    a[i] = abs (a[i])
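The sequence of values taken by the loop control parameter can be modeled in Python as shown below. The sketch is an illustration of the rules stated above, including the zero trip case; the name do_values is hypothetical.

```python
def do_values(initial, final, step=1):
    """Return the values the loop control parameter takes, in order."""
    if step == 0:
        raise ValueError("step size must be nonzero")
    if step > 0:
        return list(range(initial, final + 1, step))
    return list(range(initial, final - 1, step))


print(do_values(1, 5))
print(do_values(5, 1, -2))
print(do_values(3, 1))        # empty: the body is executed zero times

# Counterpart of the example above, using zero-based Python lists.
a = [-1.5, 2.0, -3.0]
for i in do_values(1, len(a)):
    a[i - 1] = abs(a[i - 1])
print(a)
```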

Expressions

Every expression is characterized by a data type and a value. The data type is fixed at compile time, but the value may be either fixed at compile time, or calculated at run time. An expression may be a constant, a string constant, an array reference, a call to a typed procedure, or any combination of the above elements, in combination with one or more unary or binary operators.

Operators

Special Operators

`(` arglist `)`	procedure call
`[` arglist `]`	array reference

Unary Operators

`-`	negation
`!`	boolean not

Binary Operators

`**`	exponentiation
`/` `*` `+` `-`	arithmetic
`==` `!=` `<=` `>=` `<` `>`	boolean comparison
`&&` `||`	boolean and, or

Parentheses may be used to force the compiler to evaluate the parts of an expression in a certain order. In the absence of parentheses, the “precedence” of an operator determines the order of evaluation of an expression. The highest precedence operators are evaluated first. The precedence of the SPP operators is defined by the order in which the operators appear in the table above (procedure call has the highest precedence).

The “arglist” in a procedure or array reference consists of a list of general expressions separated by commas. If an expression contains calls to two or more procedures, the order in which the procedures are evaluated is undefined.

Mixed Mode Expressions

The binary operators combine two expressions into a single expression. If the two input expressions are of different data types, the expression is said to be a “mixed mode” expression. The data type of a mixed mode expression is defined by the order in which the types of the two input expressions appear in the table of data types given in the section on Data Types. The data type which appears furthest down in this table will be the data type of the combined expression. For example, an integer plus a real produces a real. Mixed mode expressions involving booleans are illegal.
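The rule for the data type of a mixed mode expression amounts to a table lookup. The Python sketch below applies the rule to the names of the data types in the order they are listed under Data Types; the name mixed_mode_type is hypothetical.

```python
# The data types, in the order in which they are listed under Data Types.
TYPE_ORDER = ["bool", "char", "short", "int", "long",
              "real", "double", "complex"]


def mixed_mode_type(left, right):
    """Return the data type of an expression combining left and right."""
    for name in (left, right):
        if name not in TYPE_ORDER:
            raise ValueError("unknown data type: " + name)
    if left != right and "bool" in (left, right):
        raise TypeError("mixed mode expressions involving booleans are illegal")
    # The type furthest down the table is the type of the result.
    return max(left, right, key=TYPE_ORDER.index)


print(mixed_mode_type("int", "real"))
print(mixed_mode_type("short", "long"))
```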

Type Coercion

The term “type coercion” refers to the conversion of an object from one data type to another. Such conversions may involve loss of information, and hence are not always reversible. Type coercion occurs automatically in mixed mode expressions, and in assignment statements. Type coercion is not permitted between booleans and the other data types.

The data type of an expression may be coerced by a call to an intrinsic function. The names of these intrinsic functions are the same as the names of the data types. Thus, int(x), where x is of type REAL, coerces x to type INT, while double(x) produces a double precision result.

The Assignment Statement

The assignment statement assigns the value of the general expression on the right side to the variable or array element given on the left side. Automatic type coercion will occur during the assignment if necessary (and legal). Multiple assignments may not be made on the same line.

Some Examples

We have now finished discussing the fundamentals of the subset preprocessor language. The following examples demonstrate two complete procedures written in the SPP language. Additional examples are given in appendix B, and in the IRAF source directories.

Example 3: Length of a String

This example demonstrates the declaration and use of a function to compute the length of a character string passed as a formal parameter. STRLEN simply inspects each character in the string, until the end of string marker (EOS) is reached:

int procedure strlen (string)

char    string[ARB]
int     ip

begin
        ip = 1
        while (string[ip] != EOS)
            ip = ip + 1
        return (ip - 1)
end

The code fragment shown below shows how the function STRLEN might be used in another procedure. STRLEN is called to get the index of the last character in the string, then the string is truncated by overwriting the last character with EOS. EOS is a predefined constant, which should be considered part of the language:

char    string[SZ_LINE]
int     string_length, strlen()

begin
        string_length = strlen (string)
        if (string_length >= 1)
            string[string_length] = EOS

Example 4: Min and Max of a Real Array

This example shows how to declare a procedure which returns its output via formal parameters, rather than as the function value. Note the use of square brackets to declare and reference arrays. If the limiting values of the data cannot be computed, the special value INDEF is returned, signifying that the limiting values are indefinite. INDEF is another predefined constant:

procedure limits (data, npix, minval, maxval)

real    data[npix]              # input data array
int     npix                    # length of array
real    minval, maxval          # output values
int     i

begin
        if (npix >= 1) {
            minval = data[1]
            maxval = data[1]
            for (i=2;  i <= npix;  i=i+1) {
                if (data[i] < minval)
                    minval = data[i]
                if (data[i] > maxval)
                    maxval = data[i]
            }
        } else {
            minval = INDEF
            maxval = INDEF
        }
end

The generalization of this procedure to handle indefinites in the input data array is left up to the reader.
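One possible approach to that generalization is sketched below in Python: indefinite input values are skipped, and the result is indefinite only if no definite value is found. The sketch uses None as a stand-in for INDEF, which is an assumption; this paper does not describe how an SPP program tests a value for INDEF.

```python
INDEF = None                          # stand-in for the SPP constant INDEF


def limits(data):
    """Return (minval, maxval) of data, ignoring indefinite values."""
    minval = INDEF
    maxval = INDEF
    for value in data:
        if value is INDEF:
            continue                  # skip indefinite input values
        if minval is INDEF or value < minval:
            minval = value
        if maxval is INDEF or value > maxval:
            maxval = value
    return minval, maxval


print(limits([3.0, INDEF, -1.0, 2.5]))
print(limits([INDEF, INDEF]))
```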

Program Structure

An SPP source file may contain any number of PROCEDURE declarations, zero or one TASK statements, any number of DEFINE or INCLUDE statements, and any number of HELP text segments. By convention, global definitions and include file references should appear at the beginning of the file, followed by the task statement, if any, and the procedure declarations.

include <ctype.h>               # character type definitions
include "widgets.h"             # package definitions file

# This file contains the source for the tasks making up the
# Widgets analysis package (describe the contents of the file).

define  MAX_WIDGETS     50      # local definitions
define  NPIX            512
define  LONGITUDE       7:32:23.42


task    alpha, beta, epsilon=eps


# ALPHA -- (describe the alpha task)

procedure alpha()
        ...

Include Files

Include files are referenced at the beginning of a file to include global definitions that must be shared amongst separately compiled files, and within procedures to reference common block definitions. The INCLUDE statement is effectively replaced by the contents of the named file. Includes may be nested at least 5 deep.

The name of the file to be included must be delimited by either angle brackets (<file>) or quotation marks ("file"). The first form is used to reference the IRAF system include files. The second, more general, form may be used to include any file.
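As a sketch of the second use, a common block can be declared once in an include file and then referenced from each procedure that needs it. The file name widgets.com, the common block name wdgcom, the variables, and the procedure WDG_RESET are all hypothetical, and the COMMON syntax is assumed to follow the Fortran form:

```spp
# ---- file widgets.com (hypothetical) ----
# Common block shared by the widget procedures.  The COMMON syntax is
# assumed to be the same as in Fortran.

int     nwidgets                # number of widgets currently defined
real    wscale                  # widget scale factor
common  /wdgcom/ nwidgets, wscale


# ---- file wdgreset.x (hypothetical) ----
# WDG_RESET -- Reset the shared widget state.

procedure wdg_reset()

include "widgets.com"           # brings in the common declarations

begin
        nwidgets = 0
        wscale = 1.0
end
```

Keeping the declarations in one include file means that every procedure sees the same layout of the common block.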

Macro Definitions

Macro definitions are invaluable for “information hiding”, and can do much to enhance the modifiability of a program. The effective use of macros also tends to improve the readability of a program. By convention, the names of macros are always upper case, to make it clear that a macro is being used, and to avoid redefinitions of ordinary variables and procedures.

There are two kinds of macros – those with arguments, and those without. Macros without arguments are the most common, and are used primarily to turn explicit constants into symbolic parameters. Examples are shown above.

Macros may also be used to reference the fields of a structure, or to define inline code fragments (similar to Fortran statement functions). In the SPP, the arguments of a macro are referenced as $1, $2, and so on, as follows:

define  I_TYPE          $1[1]
define  I_NPIX          $1[2]
define  I_COEFF         $1[10]


if (I_TYPE(coeff) == LINEAR)
    ...

In this example, the array coeff is actually a simple structure, containing the fields i_type, i_npix, …, and i_coeff. It greatly enhances the readability of the program to refer to the fields of this structure by name, rather than offset (coeff[2]), and furthermore makes it trivial to modify the structure.

Macros with arguments may also be used to define inline functions. For example, here are a couple of definitions of character classes from the system include file ctype.h:

define  IS_UPPER        ($1>='A'&&$1<='Z')
define  IS_LOWER        ($1>='a'&&$1<='z')
define  IS_DIGIT        ($1>='0'&&$1<='9')

usage:

if (IS_DIGIT(string[i])) {
    ...

Note that these definitions work for ASCII, but not for EBCDIC (IBM). By using macros, we have concentrated this machine dependent knowledge of the character set into a single file.
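The following sketch shows such macros used as inline functions inside a loop. The procedure name LCASE_SKETCH is hypothetical, and TO_LOWER is assumed to be defined in ctype.h alongside IS_UPPER:

```spp
include <ctype.h>               # IS_UPPER, TO_LOWER (assumed present)

# LCASE_SKETCH -- Convert a string to lower case in place
# (hypothetical procedure name, for illustration only).

procedure lcase_sketch (str)

char    str[ARB]                # string to be converted
int     i

begin
        for (i=1;  str[i] != EOS;  i=i+1)
            if (IS_UPPER(str[i]))
                str[i] = TO_LOWER(str[i])
end
```

Since IS_UPPER substitutes its argument twice, the argument should be a simple reference such as str[i], free of side effects.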

Note

In the current implementation of the SPP, macro definitions may not include string constants. All other types of constants, constant expressions, array and procedure references, are allowed. The domain of definition of a macro extends from the line following the macro, to the end of the file (except for include files). Macros are recursive. Redefinitions of macros are silently permitted.

The Task Statement, and Tasks

The TASK statement is used to make an IRAF task. A file need not contain a task statement, and may not contain more than a single task statement. Files without task statements are separately compiled to produce object modules, which may subsequently be linked together to make a task, or which may be installed in a library.

A single physical task (ptask) may contain one or more LOGICAL TASKS (ltasks). These tasks need not be related. Several ltasks may be grouped together into a single ptask merely to save disk storage, or to minimize the overhead of task execution. Ltasks should communicate with one another only via disk files, even if they reside in the same ptask.

task    ltask1, ltask2, ltask3=proc3

The task statement defines a set of ltasks, and associates each with a compiled procedure. If only the name of the ltask is given in the task statement, the associated procedure is assumed to have the same name. A file may contain any number of ordinary procedures which are not associated (directly) with an ltask. The source for the procedure associated with a given ltask need not reside in the same file as the task statement.

The procedure associated with an ltask MUST NOT have any arguments. An ltask procedure gets its parameters from the CL via the CL interface. The CLGETx procedures are the most commonly used. The CLPUTx procedures may be used to change the values of parameters.

task    alpha, beta, epsilon=eps


procedure alpha()

int     npix, clgeti()
real    lcut, clgetr()
char    file[SZ_FNAME]

begin
        npix = clgeti ("npix")
        lcut = clgetr ("lower_cutoff")
        call clgstr ("input_file", file, SZ_FNAME)
                      ...
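A minimal sketch of the CLPUTx side of the interface is given here. The task name TWICE and its procedure are hypothetical. CLPUTI is taken to be the integer member of the CLPUTx family, and its argument order (parameter name, then value) is an assumption:

```spp
task    twice = t_twice

# T_TWICE -- Read the integer parameter "npix" from the CL, double it,
# and store the new value back in the same parameter (hypothetical).

procedure t_twice()

int     npix, clgeti()

begin
        npix = clgeti ("npix")
        call clputi ("npix", 2 * npix)  # assumed order: name, value
end
```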

An IRAF task may be run by the CL or called from the command interpreter provided by the host operating system, without change. Parameter requests and i/o to the standard input and output will function properly in both cases. When running without the CL, of course, the interface is much more primitive.

To run an IRAF task directly, without the CL (especially useful for debugging purposes), begin by simply running the task. The task will sense that it is being run without the CL, and issue a prompt:

> ?
alpha beta epsilon
> alpha
npix: (response)
lower_cutoff: (response)
input_file: (response)
    (ltask "alpha" continues)
> bye

Every IRAF task has two special commands built in. The command ? will list the names of the ltasks recognized by the interpreter. The command bye is used to exit the interpreter.

Help Text

Documentation may be embedded in an SPP source file either by commenting out the lines of text, or by enclosing the lines of text within .help and .endhelp directives. If there are only a few lines of text, it is probably most convenient to comment them out. Large blocks of text should be enclosed by the help directives, making the text easier to edit, and accessible to the online documentation and text processing tools.

# (everything from the '#' to end of line is a comment)

.help [keyword [qualifier [package_description_string]]]
        (help text)
.endhelp

The preprocessor ignores comments, and everything between .help and .endhelp directives. The directives must occur at the beginning of a line to be recognized. In both cases, the preprocessor ignores the remainder of the line. The arguments to .help are used by the HELP, MANPAGE, and LISTING utilities, but are ignored by XPP.

Help text may be typed in as it is to appear on the terminal or printer, or it may contain text processing directives. A filter (LISTING) is available to strip help text out when making listings, or to replace help text containing directives with nicely formatted text. See the LROFF documentation for a description of the IRAF text processing directives.

Manual pages for ltasks may be stored either directly in the source file as help text segments, or in separate files. If separate source and help files are used, both files should reside in the same directory and should have the same root name, and the help text file should have the extension .hlp.

Anachronisms

Certain constructs in the subset preprocessor language are not likely to survive in their present form in the full preprocessor. These include:

the STRING declaration
the DATA statement
the COMMON statement
the POINTER data type

The STRING declaration will disappear at the same time as the DATA statement. Both will be replaced by initializations of the form:

real    x = 0.0, y[] = {1.,2.,4.}
char    opcodes[SZ_OPCODES] = {'f','g','e','d'}

COMMON declarations, in their present form, are cumbersome and dangerous to use. The global data capability provided by COMMON will be present in the full preprocessor in a more structured form.

The POINTER data type will be replaced by a strongly typed (and therefore much more reliable) implementation of pointers, patterned after C.
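For comparison, the initializations shown above might be written in the present subset language roughly as follows. This is a sketch under assumptions: the procedure name is hypothetical, the STRING declaration is assumed to take a name followed by a quoted value, and the DATA statement is assumed to follow the Fortran form:

```spp
# INIT_SKETCH -- Initialization using the present STRING and DATA
# statements (hypothetical procedure; syntax as assumed above).

procedure init_sketch()

real    x, y[3]
string  opcodes "fged"          # char array initialized to "fged"
data    x /0.0/
data    y /1., 2., 4./

begin
        call printf ("opcodes = %s\n")
            call pargstr (opcodes)
end
```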

Notes on Topics not Discussed

The present version of the SPP reference manual omits a discussion of the basic i/o facilities, some of which require language support. Dynamic memory management and pointers will be covered in a later revision of the manual. Data structuring is possible in the SPP, using macros, and is discussed in the design documentation for VSIO.

Programs written in the subset preprocessor language should adhere to the (currently informal) coding standard being developed for IRAF. The coding standard has not yet been documented. Try to style procedures after those shown in the examples, and in the IRAF system source directories.

Appendix A: Predefined Constants

The subset preprocessor language includes a number of predefined symbolic constants. Included are various machine dependent constants describing the hardware and data types. Other symbolic constants are used for basic file i/o. Unless the description indicates otherwise (as for EPSILON, INDEF, and MAX_REAL), predefined constants are of type integer.

language and machine definitions

ARB	arbitrary (array dimension)
BOF, BOFL	beginning of file
EOF, EOFL	end of file
EOS	end of string
EPSILON	smallest real x s.t. 1+x > 1
EPSILOND	double precision epsilon
ERR	error status return
INDEF	indefinite of type REAL
INDEF[SILRDX]	indefinites for all types
MAX_DIGITS	number of digits of precision (DOUBLE)
MAX_EXPONENT	largest positive exponent
MAX_INT	largest positive integer
MAX_LONG	largest positive long integer
MAX_REAL	largest real or double
MAX_SHORT	largest short integer
MIN_REAL	smallest representable real number
NBYTES_CHAR	number of machine bytes per character
NO	opposite of YES
NULL	invalid pointer
OK	status return, opposite of ERR
SZ_BOOL	nchars per BOOL
SZ_CHAR	nchars per CHAR
SZ_COMPLEX	nchars per COMPLEX
SZ_DOUBLE	nchars per DOUBLE
SZ_FNAME	size of a file name string, chars
SZ_INT	nchars per INT
SZ_LINE	size of a file line buffer, chars
SZ_LONG	nchars per LONG
SZ_REAL	nchars per REAL
SZ_SHORT	nchars per SHORT
TY_BOOL	code for type BOOL
TY_CHAR	code for type CHAR
TY_COMPLEX	code for type COMPLEX
TY_DOUBLE	code for type DOUBLE
TY_INT	code for type INT
TY_LONG	code for type LONG
TY_REAL	code for type REAL
TY_SHORT	code for type SHORT
YES	opposite of NO

file i/o definitions

APPEND	file access mode
BINARY_FILE	file type
NEW_FILE	file access mode
READ_ONLY	file access mode
READ_WRITE	file access mode
STDERR	standard error output
STDGRAPH	standard graphics output
STDIMAGE	standard image display output
STDIN	standard input
STDOUT	standard output
STDPLOTTER	standard plotter output
TEXT_FILE	file type
WRITE_ONLY	file access mode

Appendix B: Detailed Examples

Example 5: Matrix Inversion

An SPP translation of Bevington's routine to invert a matrix by Gaussian elimination with partial pivoting is shown below. The help text is shown with text formatter commands inserted. The restriction of this procedure to matrices of a fixed size is unfortunate, but we have kept it that way to conform to Bevington's original code.

.help matinv 2 "math library"
.nf ____________________________________________________________________
NAME
     matinv -- invert a symmetric matrix and calculate its determinant.

SOURCE
     Bevington, pages 302-303.

USAGE
     call matinv (array, order, determinant)

PARAMETERS

     array   (double) Input matrix of fixed size 10 by 10 (smaller
             matrices may be placed in this matrix).   Replaced  by  the
             inverse upon output.

     order   The number of rows and columns in the actual matrix.

     determinant
             (real) Determinant of input matrix.


DESCRIPTION
     The  input matrix, which must be dimensioned [10,10] in the calling
     program, is inverted,  and  its  determinant  is  calculated.   The
     inverse  overwrites  the  input  matrix.   The  algorithm  used  is
     gaussian elimination with partial pivoting.
.endhelp _______________________________________________________________

define  MAX_ORDER       10      # maximum size of matrix


procedure matinv (array, order, determinant)

double  array[MAX_ORDER,MAX_ORDER]
int     order
real    determinant

int     ik[MAX_ORDER], jk[MAX_ORDER]
int     i, j, k, l
double  maxval, temp

begin
        determinant = 1.

        do k = 1, order {

            # Find largest element array[i,j] in rest of matrix.

            maxval = 0.
            repeat {
                do i = k, order
                    do j = k, order
                        if (abs(maxval) <= abs(array[i,j])) {
                            maxval = array[i,j]
                            ik[k] = i
                            jk[k] = j
                        }

                if (maxval == 0) {              # abnormal return
                    determinant = 0.0
                    return
                }

                # Interchange rows and columns to put maxval in
                # array[k,k].

                i = ik[k]
                if (i >= k) {
                    if (i != k)
                        do j = 1, order {
                            temp = array[k,j]
                            array[k,j] = array[i,j]
                            array[i,j] = -temp
                        }
                    j = jk[k]
                    if (j >= k)
                        break
                }
            }

            if (j != k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = array[i,j]
                    array[i,j] = -temp
                }

            # Accumulate elements of inverse matrix.

            do i = 1, order
                if (i != k)
                    array[i,k] = -array[i,k] / maxval

            do i = 1, order
                do j = 1, order
                    if (i != k && j != k)
                        array[i,j] = array[i,j] + array[i,k] * array[k,j]

            do j = 1, order
                if (j != k)
                    array[k,j] = array[k,j] / maxval

            array[k,k] = 1.0 / maxval
            determinant = determinant * maxval
        }

        # Restore ordering of matrix.

        do l = 1, order {
            k = order - l + 1
            j = ik[k]
            if (j > k)
                do i = 1, order {
                    temp = array[i,k]
                    array[i,k] = -array[i,j]
                    array[i,j] = temp
                }

            i = jk[k]
            if (i > k)
                do j = 1, order {
                    temp = array[k,j]
                    array[k,j] = -array[i,j]
                    array[i,j] = temp
                }
        }
end
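A short calling sequence for MATINV is sketched here. The test procedure T_MATINV is hypothetical and is assumed to be compiled in the same file as MATINV, so that MAX_ORDER is defined. PARGR and PARGD are assumed to be the formatted output calls for real and double arguments. The 2 by 2 test matrix sits in the corner of the fixed size array, as the help text allows:

```spp
# T_MATINV -- Invert a small symmetric matrix with MATINV and print
# the results (hypothetical test procedure).

procedure t_matinv()

double  a[MAX_ORDER,MAX_ORDER]  # must be dimensioned [10,10]
real    det

begin
        a[1,1] = 4.0
        a[1,2] = 2.0
        a[2,1] = 2.0
        a[2,2] = 3.0

        call matinv (a, 2, det)         # inverse overwrites a

        call printf ("determinant = %g\n")
            call pargr (det)
        call printf ("inverse = %g %g / %g %g\n")
            call pargd (a[1,1])
            call pargd (a[1,2])
            call pargd (a[2,1])
            call pargd (a[2,2])
end
```

Mathematically, the determinant of this matrix is 4*3 - 2*2 = 8, which gives a simple check on the value returned.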

Example 6: Pattern Matching

The next example was selected for inclusion here because it demonstrates most of the control flow constructs, as well as the use of defined parameters. The STRMATCH procedure searches a string for the specified pattern. The pattern may contain several metacharacters, that is, characters which are not themselves matched but which tell STRMATCH what constitutes a match. For example:

if (strmatch (line_buffer, "^{naxis}#=") > 0)
    ...

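In this case, STRMATCH would search for the string naxis = at the beginning of the line, ignoring case within the braces and allowing optional whitespace before the equals sign. It returns the index of the first character following the matched substring, or zero if there is no match. The metacharacters are defined in the INCLUDE file pattern.h, as follows: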

# Pattern Matching Metacharacters (STRMATCH, PATMATCH)

define  CH_BOL          '^'             # beginning of line symbol
define  CH_NOT          '^'             # not, in character classes
define  CH_EOL          '$'             # end of line symbol
define  CH_ANY          '?'             # match any single character
define  CH_CLOSURE      '*'             # zero or more occurrences
define  CH_CCL          '['             # begin character class
define  CH_CCLEND       ']'             # end character class
define  CH_RANGE        '-'             # as in [a-z]
define  CH_ESCAPE       '\\'            # escape character
define  CH_WHITESPACE   '#'             # match optional whitespace
define  CH_IGNORECASE   '{'             # begin ignoring case
define  CH_MATCHCASE    '}'             # begin checking case

The source for the STRMATCH procedure, in file strmatch.x, follows. Though this is not a good example of modular code (the control flow is too complex), it does serve to illustrate the use of many of the control flow constructs:

include <ctype.h>
include <pattern.h>

.help strmatch, gstrmatch
.nf __________________________________________________________________
STRMATCH -- Find the first occurrence of the string A in the string B.
If not found, return zero, else return the index of the first
character following the matched substring.

GSTRMATCH -- More general version of strmatch.  The indices of the
first and last characters matched are returned as arguments.  The
function value is the same as for STRMATCH.

STRMATCH recognizes the metacharacters BOL, EOL, ANY, WHITESPACE,
IGNORECASE, and MATCHCASE (BOL and EOL are special only as the first
and last chars in the pattern).  The null pattern matches any string.
Metacharacters can be escaped.
.endhelp _____________________________________________________________


# STRMATCH -- Search a string for a pattern.

int procedure strmatch (str, pat)

char    pat[ARB], str[ARB]
int     first_char, last_char
int     gstrmatch()

begin
        return (gstrmatch (str, pat, first_char, last_char))
end


# GSTRMATCH -- Generalized strmatch which returns the indices of the
# matched substring.

int procedure gstrmatch (str, pat, first_char, last_char)

char    pat[ARB], str[ARB]
int     first_char, last_char
bool    ignore_case, bolflag
char    ch, pch                         # string, pattern characters
int     i, ip, initial_pp, pp

begin
        ignore_case = false
        bolflag = false
        ip = 1
        initial_pp = 1

        if (pat[1] == CH_BOL) {         # match at beginning of line?
            bolflag = true
            initial_pp = 2
        }

        # Try to match pattern starting at each character offset in
        # string.

        for (first_char=ip;  str[ip] != EOS;  ip=ip+1) {
            i = ip
            # Compare pattern to string str[ip].
            for (pp=initial_pp;  pat[pp] != EOS;  pp=pp+1) {
                switch (pat[pp]) {
                case CH_WHITESPACE:
                    while (IS_WHITE (str[i]))
                        i = i + 1
                case CH_ANY:
                    if (str[i] != '\n')
                        i = i + 1
                case CH_IGNORECASE:
                    ignore_case = true
                case CH_MATCHCASE:
                    ignore_case = false

                default:
                    pch = pat[pp]
                    if (pch == CH_ESCAPE && pat[pp+1] != EOS) {
                        pp = pp + 1
                        pch = pat[pp]
                    } else if (pch == CH_EOL || pch == '\n')
                        if (pat[pp+1] == EOS && str[i] == '\n') {
                            first_char = ip
                            last_char = i
                            return (last_char + 1)
                        }

                    ch = str[i]
                    i = i + 1

                    # Compare ordinary characters.  The comparison is
                    # trivial unless case insensitivity is required.

                    if (ignore_case) {
                        if (IS_UPPER (ch)) {
                            if (IS_UPPER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_LOWER (ch))
                                    break
                        } else if (IS_LOWER (ch)) {
                            if (IS_LOWER (pch)) {
                                if (pch != ch)
                                    break
                            } else if (pch != TO_UPPER (ch))
                                    break
                        } else {
                            if (pch != ch)
                                break
                        }
                    } else {
                        if (pch != ch)
                            break
                    }
                }
            }

            # If the above loop was exited before the end of the pattern
            # was reached, the pattern did not match.

            if (pat[pp] == EOS) {
                first_char = ip
                last_char = i-1
                return (i)

            } else if (bolflag || str[i] == EOS)
                break
        }

        return (0)                      # no match
end
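Because STRMATCH returns the index of the first character after the match, the remainder of the string can be passed directly to another procedure. The sketch below illustrates this. The procedure SHOW_NAXIS is hypothetical, and PARGSTR is assumed to be the formatted output call for string arguments:

```spp
# SHOW_NAXIS -- Print the text following "naxis =" in a header line
# (hypothetical procedure, for illustration only).

procedure show_naxis (line)

char    line[ARB]               # header line to be searched
int     ip
int     strmatch()

begin
        ip = strmatch (line, "^{naxis}#=")

        if (ip > 0) {
            # line[ip] is the first character after the matched text.
            call printf ("value field: %s\n")
                call pargstr (line[ip])
        } else
            call printf ("no naxis keyword found\n")
end
```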

Example 7: Error Handling

The following simple procedure reads a list of file names from the CL, and attempts to delete each file. The DELETE library procedure will take an error action if it cannot delete a file; this is not what is desired, so we post an error handler and reissue the error message from DELETE as a warning message:

include <error.h>

# DELETE_FILES -- Delete a list of files.

procedure delete_files()

char    filename[SZ_FNAME]              # name of file to be deleted
int     list, clpopns(), clgfil()

begin
        # Fetch template and open it as a list of files.
        list = clpopns ("template")

        # Read successive file names from the list, and delete each
        # file.
        while (clgfil (list, filename, SZ_FNAME) != EOF)
            iferr (call delete (filename))
                call erract (EA_WARN)

        call clpcls (list)
end

The Fortran output for the DELETE_FILES procedure is shown below. Note the implementation of the template string, the mapping of long identifiers into six-character Fortran identifiers, and the implementation of the while statement using GOTO.

      subroutine delets()
      integer*2 filene(33 +1)
      integer list, clpops, clgfil
      integer*2 st0001(9)
      logical xerpop
      data st0001 /116,101,109,112,108, 97,116,101, 0/
      save
      list = clpops (st0001)
110   if (.not.(clgfil (list, filene, 33 ) .ne. (-2))) goto 111
      call xerpsh
      call delete (filene)
      if (.not.xerpop()) goto 120
         call erract (3 )
120         continue
            goto 110
111      continue
         call clpcls (list)
100      return
      end
C     delets  delete_files
C     filene  filename
C     clpops  clpopns
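A variant of the SPP procedure that also counts the failures is sketched here. It assumes that IFERR accepts a compound statement as its body, as the other control constructs do. The procedure name DELETE_COUNT is hypothetical, and PARGI is assumed to be the formatted output call for integer arguments:

```spp
include <error.h>

# DELETE_COUNT -- Delete a list of files, counting the files that
# could not be deleted (hypothetical variant of DELETE_FILES).

procedure delete_count()

char    filename[SZ_FNAME]      # name of file to be deleted
int     list, nfailed
int     clpopns(), clgfil()

begin
        nfailed = 0
        list = clpopns ("template")

        while (clgfil (list, filename, SZ_FNAME) != EOF)
            iferr (call delete (filename)) {
                call erract (EA_WARN)   # reissue the error as a warning
                nfailed = nfailed + 1
            }

        call clpcls (list)

        if (nfailed > 0) {
            call printf ("%d file(s) could not be deleted\n")
                call pargi (nfailed)
        }
end
```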