Revision Time (Missing and Undefined Values)
Friday, February 13, 2009 at 12:00PM Missing and undefined values is a source of confusion, for those previously using SPSS because it is different and for those from traditional programming environments where the concept are not so well used. The following is extracted from the SIR 2 Manual but still remains true.
The user may specify one, two or three missing values for each variable in the data base with the MISSING VALUES command. This capability is also available for summary variables during Retrieval.
For example:
MISSING VALUES VAR1 TO VAR5 (7,8,9)
RO01 TO R025 (-1,-2)
DEPT JOBNO HOURS (BLANK)
When a value of 7 is input or computed for VAR1, SIR will recognize it as a missing value.
In SIR, missing values are distinguished from "undefined" values. The latter can arise in several ways:
a. as the result of undefined computations such as by division zero, square root of a negative, logarithm of zero or negative, tangent of pi/2, etc.
COMPUTE V = SQRT(Z/Y)
V will be undefined if Y is zero or Z/Y is negative.
b. as the result of a computation using missing or undefined values.
COMPUTE RADIUS = Ql + 7*Q2
If either Ql or Q2 is missing or undefined, RADIUS will be set to undefined.
c. as the result of a computation which sets a variable to an invalid or out-of-range value.
VAR RANGES MEAS1 TO MEAS5 (1,50)
...
...
COMPUTE MEASl = (BAK + BAL)
If the result of the expression on the right is outside the range 1 through 50, MEAS1 will be set to undefined.
d. as the result of a Batch Data Input operation which the user has chosen to ACCEPT records containing invalid or out-of-range values. The records are stored in the data base with "undefined" substituted for the "bad" data.
SIR provides the user with several functions that detect missing and undefined values.
EXISTS (X)
This function returns the value 1 if X is defined and the value 0 if X is missing or undefined. For example, the logical expression (EXISTS(XNUM) EQ 0) will be true if XNUM contains a missing or undefined value.
MISNUM (X)
This function returns the value 0 if X is undefined and the value 1, 2 or 3 if X is the contains the first, second or third missing value. If X is defined, MISNUM returns undefined. For example, if the missing values for ANSO1 are defined by:
MISSING VALUES ANS01(97,98,99)
then the statement
XANS = MISNUM(ANS01)
will set XANS to 3 if ANS01 is 99.
MISSING (X)
This function returns the actual missing value of X when X contains one of its three missing values. Otherwise, MISSING returns undefined. The MISSING function is the only way to retrieve a missing value for further use in SIR. For example, suppose the missing values for ANSO1 are 97, 98 and 99 (as before), and ANSO1 in a particular record contains 99,
then the statement
XANS = ANSO1
would set XANS to undefined, but the statement
XANS = MISSING(ANS01)
would set XANS to 99.
And the following will allow comparison with a specific missing value:
IF (MISSING (ANSO1)=99)
It is important to check for missing and undefined values because they generate undefined results and undefined whenever they appear in arithmetic and logical expressions.
Manipulation of missing values in calculations
There are also several functions that detect and ignore missing and undefined values. The group of functions listed below compute various aggregates for two or more variables.
CNT : count the defined values
FST : find first defined value
LST : find last defined value
MAX : find maximum defined value
MEAN: mean of the defined values
MIN : find minimum defined value
STD : standard deviation
SUM : sum of the defined values
The CNT function is particularly useful for detecting simultaneously defined (or undefined) values. For example, suppose we want to compute R only if A, Band C are all defined; if anyone of them is undefined or missing we want R set to zero. The EXISTS function (see above) could be used as follows:
COMPUTE R = 0
IF (EXISTS (A) EQ 1 AND EXISTS (B) EQ 1 AND EXISTS (C) EQ 1) R = SQRT(A + B + C)
However the CNT functions provides a more concise and elegant form:
COMPUTE R = 0
IF (CNT(A,B,C) EQ 3) R = SQRT(A + B + C)
There are several other functions (available in Retrieval' only) that compute aggregates of defined quantities across several records of a given type.
CNTR : count defined quantities
FSTR : get first defined quantity
LSTR : get last defined quantity
MAXR : get maximum defined quantity
MINR : get minimum defined quantity
MEANR : get mean defined quantities
STDEVR: get standard deviation of defined quantities
SUMR : get sum of the defined quantities
PQL,
Revision,
functions,
missing values,
undefined; Revision 
