Entries in administration (2)

Tuesday
Sep 02, 2008

Reloading an Unload

For reasons I can't work out, the GUI sometimes flakes out when reloading an unload, with the error

Cannot store variable because record/cir is locked. (Warning 738)

I'm not sure why this is happening, but it's been reported to SIR. In the meantime you can still reload from the command line with

RELOAD FILE dbname
filename='XXX.UNL'

Wednesday
Aug 06, 2008

Importing SPSS data into SIR

SIR does not support importing SPSS data files directly. Technically it is possible using ODBC, but no one has ever got that to work reliably, and even if you did, you would not be able to import the variable labels, value labels, missing values, etc.

Various other ways have been used, including editing SPSS Data Lists and importing the raw batch data, or exporting from SPSS as CSV. Whilst some of them work, it is all rather unsatisfactory, and recent versions of SPSS have made it even harder.

As part of the migration of 3 British cohort studies into SIR, I wrote a couple of programs in Python that take an SPSS portable file, create a SIR schema, and import the data into the defined SIR record. They are available from the CLS website.

Limitations

A few things are not supported by the program, so the SPSS file must meet these constraints:

  • No strings longer than 80 characters anywhere
  • Strings must have no missing values and no value labels
  • Times and dates must be positive
  • The CASEID has to be the first variable in the SPSS file
  • No more than 3 discrete missing values actually in the data

The program is run from the command line with

python spss_parser.py -f <input file> -s <sir configuration file>
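
The limitations listed above could be checked before a conversion is attempted. The following is only a rough sketch and is not taken from spss_parser.py: the Variable class, its fields, and check_limitations are hypothetical names invented for illustration. It looks at variable metadata only, so it checks declared missing values rather than those actually present in the data, and it does not check that times and dates are positive.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Hypothetical description of one SPSS variable (not the real parser's type)."""
    name: str
    kind: str                                   # "string", "numeric", "date" or "time"
    width: int = 8                              # width in characters (used for strings)
    missing_values: list = field(default_factory=list)
    value_labels: dict = field(default_factory=dict)


def check_limitations(variables, caseid_name):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    # The CASEID has to be the first variable in the file.
    if not variables or variables[0].name != caseid_name:
        problems.append(f"{caseid_name} must be the first variable")
    for var in variables:
        if var.kind == "string":
            if var.width > 80:
                problems.append(f"{var.name}: string longer than 80 characters")
            if var.missing_values:
                problems.append(f"{var.name}: string with missing values")
            if var.value_labels:
                problems.append(f"{var.name}: string with value labels")
        elif len(var.missing_values) > 3:
            problems.append(f"{var.name}: more than 3 discrete missing values")
    return problems


if __name__ == "__main__":
    variables = [
        Variable("serial", "string", width=7),
        Variable("age", "numeric", missing_values=[-1, -8, -9]),
    ]
    for problem in check_limitations(variables, "serial"):
        print(problem)
```

Returning a list of problems rather than stopping at the first one means a file can be fixed in a single pass in SPSS before it is re-exported.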

The program uses a helper 'configuration' file, which specifies the inputs and outputs, a number of parameters such as the record name, and the BDI parameters. It can optionally be used to create the database. For example:

# SIR SCHEMA NAME
sir_schema_name|mcs1.pqo
# SIR BATCH INPUT NAME
sir_batch_input|mcs1.dat
# SIR SCHEMA HEADER INFORMATION
RUN NAME|Conversion from SPSS file mcs1.por
TASK NAME|INITIALIZATION
NEW FILE|mcs1
JOURNAL|On
N OF CASES|25000
MAX INPUT COLS|104
caseid|serial
caseid_type|A
rectype_cols_start|100
rectype_cols_end|102
MAX REC TYPES|100
MAX REC COUNT|50
MAX KEY SIZE|20
COMMON VARS|serial (A7)/
DOCUMENT|MCS Sweep One
record_number|1
record_name|mcs1
keyfield|1
KEY FIELDS|mpflag
TASK NAME|RECORD 1 (MCS1) SCHEMA DEFINITION
RECORD SCHEMA|1 MCS1
SEQUENCE CHECK|OFF
MAX REC COUNT|1200
#other info
default_data_width|90
#this should be greater than the width of the CASEID
default_data_indent|10
# only for use on first record created
db_setup|1
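
The configuration file above is a series of key|value lines, with comment lines starting with #. As a rough illustration of how such a file could be read (this is a sketch under that assumption, not the code used in spss_parser.py; parse_config and first_value are hypothetical names), the entries are kept as an ordered list because keys such as TASK NAME and MAX REC COUNT appear more than once:

```python
def parse_config(text):
    """Parse 'key|value' lines into an ordered list of (key, value) pairs."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the first '|' only, so values may themselves contain '|'.
        key, sep, value = line.partition("|")
        if not sep:
            raise ValueError(f"expected 'key|value', got: {raw!r}")
        entries.append((key.strip(), value.strip()))
    return entries


def first_value(entries, key, default=None):
    """Return the value of the first entry with the given key, or default."""
    for entry_key, value in entries:
        if entry_key == key:
            return value
    return default


if __name__ == "__main__":
    sample = """\
# SIR SCHEMA NAME
sir_schema_name|mcs1.pqo
TASK NAME|INITIALIZATION
MAX REC COUNT|50
TASK NAME|RECORD 1 (MCS1) SCHEMA DEFINITION
MAX REC COUNT|1200
"""
    entries = parse_config(sample)
    print(first_value(entries, "sir_schema_name"))
    print([value for key, value in entries if key == "MAX REC COUNT"])
```

A plain dictionary would silently keep only the last MAX REC COUNT, which is why the sketch preserves order and duplicates.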