Entries in functions (3)

Friday
Feb132009

Revision Time (Missing and Undefined Values)

Missing and undefined values is a source of confusion, for those previously using SPSS because it is different and for those from traditional programming environments where the concept are not so well used. The following is extracted from the SIR 2 Manual but still remains true.

The user may specify one, two or three missing values for each variable in the data base with the MISSING VALUES command. This capability is also available for summary variables during Retrieval.

For example:

MISSING VALUES VAR1 TO VAR5 (7,8,9)
 RO01 TO R025 (-1,-2)
 DEPT JOBNO HOURS (BLANK)

When a value of 7 is input or computed for VAR1, SIR will recognize it as a missing value.

In SIR, missing values are distinguished from "undefined" values. The latter can arise in several ways:

a. as the result of undefined computations such as by division zero, square root of a negative, logarithm of zero or negative, tangent of pi/2, etc.

COMPUTE V = SQRT(Z/Y)

V will be undefined if Y is zero or Z/Y is negative.

b. as the result of a computation using missing or undefined values.

COMPUTE RADIUS = Ql + 7*Q2

If either Ql or Q2 is missing or undefined, RADIUS will be set to undefined.

c. as the result of a computation which sets a variable to an invalid or out-of-range value.

VAR RANGES MEAS1 TO MEAS5 (1,50)
...
...
COMPUTE MEASl = (BAK + BAL) 

If the result of the expression on the right is outside the range 1 through 50, MEAS1 will be set to undefined.

d. as the result of a Batch Data Input operation which the user has chosen to ACCEPT records containing invalid or out-of-range values. The records are stored in the data base with "undefined" substituted for the "bad" data.

SIR provides the user with several functions that detect missing and undefined values.

EXISTS (X)

This function returns the value 1 if X is defined and the value 0 if X is missing or undefined. For example, the logical expression (EXISTS(XNUM) EQ 0) will be true if XNUM contains a missing or undefined value.

MISNUM (X)
This function returns the value 0 if X is undefined and the value 1, 2 or 3 if X is the contains the first, second or third missing value. If X is defined, MISNUM returns undefined. For example, if the missing values for ANSO1 are defined by:

MISSING VALUES ANS01(97,98,99)

then the statement

XANS = MISNUM(ANS01)

will set XANS to 3 if ANS01 is 99.

MISSING (X)
This function returns the actual missing value of X when X contains one of its three missing values. Otherwise, MISSING returns undefined. The MISSING function is the only way to retrieve a missing value for further use in SIR. For example, suppose the missing values for ANSO1 are 97, 98 and 99 (as before), and ANSO1 in a particular record contains 99,
then the statement

XANS = ANSO1

would set XANS to undefined, but the statement

XANS = MISSING(ANS01)

would set XANS to 99.

And the following will allow comparison with a specific missing value:

IF (MISSING (ANSO1)=99)

It is important to check for missing and undefined values because they generate undefined results and undefined whenever they appear in arithmetic and logical expressions.

Manipulation of missing values in calculations
There are also several functions that detect and ignore missing and undefined values. The group of functions listed below compute various aggregates for two or more variables.

CNT : count the defined values
FST : find first defined value

LST : find last defined value
MAX : find maximum defined value
MEAN: mean of the defined values
MIN : find minimum defined value
STD : standard deviation
SUM : sum of the defined values

The CNT function is particularly useful for detecting simultaneously defined (or undefined) values. For example, suppose we want to compute R only if A, Band C are all defined; if anyone of them is undefined or missing we want R set to zero. The EXISTS function (see above) could be used as follows:

COMPUTE R = 0
IF (EXISTS (A) EQ 1 AND EXISTS (B) EQ 1 AND EXISTS (C) EQ 1) R = SQRT(A + B + C)

However the CNT functions provides a more concise and elegant form:

COMPUTE R = 0
IF (CNT(A,B,C) EQ 3) R = SQRT(A + B + C)

There are several other functions (available in Retrieval' only) that compute aggregates of defined quantities across several records of a given type.

CNTR : count defined quantities
FSTR : get first defined quantity
LSTR : get last defined quantity

MAXR : get maximum defined quantity
MINR : get minimum defined quantity

MEANR : get mean defined quantities

STDEVR: get standard deviation of defined quantities

SUMR : get sum of the defined quantities

Tuesday
Sep162008

Using VARPUT function in a called procedure

A generic procedure called UPDATE_VAR was created to update the value of a variable as defined in parameter 1 to a value defined in parameter 2.

OLD REC IS ……….
. COMPUTE X=VARPUT( '<1>',<2>)
END REC

We wanted to call the member either by entering a variable name in parameter 1 or by extracting the variable name held as the value of another variable.

. CALL UPDATE_vAR(HCREL00,3)
. CALL UPDATE_VAR(VARGET(VARNAME), RELOPPS(3))

These calls were not working correctly as the quotes surrounding <1> in the called procedure meant that in the second call above the variable name was assumed to be VARGET(VARNAME). The value in parameter 2 (i.e. the value to update the variable to) is required to be a text value in the VARPUT command. This worked well when a specific figure was included in the CALL command (as in the first call statement above) but when an integer variable was included the VARPUT statement in the called procedure did not work correctly.

The solution was to amend the procedure UPDATE_VAR to include a statement to assign the value of parameter 1 to a new variable VARTOUPD and to apply the format function to the value to update to (parameter 2).

COMPUTE VARTOUPD=<1>
OLD REC IS ……….
. COMPUTE X=VARPUT(VARTOUPD, format(<2>))
END REC

The CALL commands now need to be in the following forms; the name of the variable to be updated should be entered either as text enclosed in quotes or the name of the variable containing the name of the variable to be updated.

. CALL UPDATE_VAR("HCREL00",3)
. CALL UPDATE_VAR(VARNAME,RELOPPS(3))

Thursday
Jul312008

Process All Database Record Types

Written by Tom Shriver to create globals for every record number in a database
Output: 4 globals: GREC1, GREC2, GREC3 and GREC4 (gotta use 4 cause SIR will only read up to a certain a length. databases with a lot of record types will go over the line length that SIR can read

program
. write ['Making Global Recs ' + dbname(0)]
. integer * 2 REC#C POS# MAXLEN# GLINE#
. string * 210 REC1 REC2 REC3 REC4
. set GLINE# (1) MAXLEN# (200) REC1 REC2 REC3 REC4 ('')
. REC#C = nrecs(0)
. for REC# = 1, REC#C
. ifthen (numrecs(REC#) > 0)
. REC1 to REC4(GLINE#) = REC1 to REC4(GLINE#) + format(REC#) + ','
. if (len(REC1 to REC4(GLINE#)) > MAXLEN#) GLINE# = GLINE# + 1
. endif
. end for
. REC1 to REC4(GLINE#) = sbst(REC1 to REC4(GLINE#),1,len(REC1 to REC4(GLINE#)) - 1)
. x = dglobal('GREC1')
. x = dglobal('GREC2')
. x = dglobal('GREC3')
. x = dglobal('GREC4')
. x = globals('GREC1',REC1)
. ifthen (len(REC2) > 0)
. x = globals('GREC2',REC2)
. ifthen (len(REC3) > 0)
. x = globals('GREC3',REC3)
. ifthen (len(REC4) > 0)
. x = globals('GREC4',REC4)
. endif
. endif
. endif
. x = globals('GCASE',trim(keyname(0,1)))
end program

and to call it (as a member)...

call DCCPROCF.SYSTEM.ALLRECS
retrieval
. process cases
. do repeat x = <GREC1>
<GREC2>
<GREC3>
<GREC4>
. process rec !x
. write ["Case " + varget(varname(0,1)) + "has a " + trim(recname(!x)) + " record"]
. end rec
. end repeat
. exit case
. end cases
end retrieval