tselect: Create a new table from selected rows of a table.

Package: nttools

Usage

tselect intable outtable expr

Description

This task creates a new table from a subset of rows in an input table. The rows are selected on the basis of a boolean expression whose variables are table column names. If, after substituting the values associated with a particular row into the column name variables, the expression evaluates to yes, that row is included in the output table. Boolean operators can be used in the expression in either their Fortran or SPP forms. The following boolean operators can be used in the expression:

equal           .eq.  ==        not equal               .ne.  !=
less than       .lt.  <         less than or equal      .le.  <=
greater than    .gt.  >         greater than or equal   .ge.  >=
or              .or.  ||        and                     .and. &&
negation        .not. !         pattern match                 ?=

The pattern match operator (?=) has no corresponding Fortran form. It takes a string expression as its first argument and a pattern as its second argument. The result is "yes" if the pattern is contained in the string expression. Patterns are strings which may contain meta-characters (i.e., wildcard characters used in pattern matching). The meta-characters themselves can be matched by preceeding them with the escape character (\). The meta-characters are:

beginning of string     ^       end of string           $
one character           ?       zero or more characters *
white space             #       escape character        \
begin ignoring case     {       end ignore case         }
begin character class   [       end character class     ]
not, in char class      ^       range, in char class    -

The expression may also include arithmetic operators and functions. Trigonometric functions use degrees, not radians. The following arithmetic operators and functions can be used in the expression:

addition                +       subtraction             -
multiplication          *       division                /
negation                -       exponentiation          **
concatenation           //      date difference         delta(x,y)
absolute value          abs(x)  cosine                  cos(x)
sine                    sin(x)  tangent                 tan(x)
arc cosine              acos(x) arc sine                asin(x)
arc tangent             atan(x) arc tangent             atan2(x,y)
exponential             exp(x)  square root             sqrt(x)
natural log             log(x)  common log              log10(x)
modulo                  mod(x)  minimum                 min(x,y)
row number              row()   maximum                 max(x,y)
nearest integer         nint(x) convert to integer      int(x)
convert to real         real(x) convert to string       str(x)

The row number function returns an integer value corresponding to the row number in the table. The date difference function returns a real value corresponding to the Julian date of the first argument minus the Julian date of the second argument; the arguments to the data function must be in CDBS date format: i.e., character strings of the form YYYYMMDD:HHMMSSCC. Any field after the colon is optional. The last date field (CC) is hundreths of a second.

One concept in most databases and in STSDAS tables is the concept of a null value: a value which indicates that the element is unknown or non-existent. An element in an STSDAS table is null if it is INDEF in a numeric column or a zero length string in a text column. Evaluating expressions involving nulls requires a three valued logic: true, false, and unknown. Any arithmetic operation on a null element should return another null and any comparison operation should return an unknown. Unfortunately, tselect does not implement a true three valued logic correctly. The code instead evaluates any expression containing a null element as unknown. Since tselect only returns rows for which the expression is true, all such rows are excluded from the output of tselect. This is usually right, but sometimes wrong, as in the case where two comparisons are joined by an or and one evaluates to true and the other evaluates to unknown. It also sometimes returns nonintuitive results, as when checking that a column is not equal to INDEF.

Parameters

intable [file name template]
Table(s) from which rows are copied. If input is redirected, this parameter will ignored and input will be read from STDIN instead.
outtable [file name template]
The new table(s) containing the copied rows. If more than one input table was used, then the number of output tables must equal the number of input tables.
expr [string]
The boolean expression which determines which rows are copied to the new table. The expression may be placed in a list file and the name of the file passed to this parameter (preceded by the "@" character).

Examples

1. Extract all binary stars brighter than fifth magnitude from a catalog:

tt> tselect starcat.tab binary.tab "binary && mag <= 5."

2. Create a new set of spectra where all measurements with errors greater than ten percent are excluded. Use file name editing to create new tables with the extension ".tbl" instead of ".tab":

tt> tselect  *.tab  *.%tab%tbl%  "ERROR / (FLUX + .001) < .1"

3. Create a table of engineering parameters whose names begin with a digit:

tt> tselect datalog.tab sublog.tab "name ?= '^[0-9]'"

4. Return all observations in a schedule for the day of Dec 31, 1989:

tt> tselect schedule.tab week.tab "abs(delta(date,'19891231:12'))<.5"

Bugs

Column names must be set off from operators by blanks in the expression so that they can be correctly parsed by the expression evaluator. Expressions involving nulls may evaluate incorrectly, see above for a discussion.

References

This task was written by Bernie Simon.

See also

tproject, tjoin, tproduct

Type "help tables opt=sys" for a higher-level description of the 'tables' package.