26
Using MS-ACCESS® Metadata to Drive Automated SAS® Data Processing Gary N. Weeks Centers for Disease Control Atlanta, Georgia

Using MS-ACCESS® Metadata to Drive Automated SAS® Data Processing Gary N. Weeks Centers for Disease Control Atlanta, Georgia

Embed Size (px)

Citation preview

Using MS-ACCESS® Metadata to Drive Automated

SAS® Data Processing

Gary N. WeeksCenters for Disease Control

Atlanta, Georgia

PRAMS Pregnancy Risk Assessment

Monitoring System

Pregnancy and birth outcome data Grown to 32 participating agencies Mixed mode data collection Unique data structures

PRAMS Paticipatants

PRAMS Participants

Metadata in Microsoft Access database

Ease of maintenance Updates incorporated immediately Scaleable

Scripting Automation

Read manifest file Perform initial file management Call SAS processing program

ManifestSDN_POST

datetime="7/5/2005 11:57:29 AM"

Comment=""

UserFilename="ME198.zip"

type="upload"

state="ME"

batch_number="198"

birth_year="2004“

email_address=“[email protected]"

.

.

.

SAS Processing

Read flat files into SAS data sets Check variable ranges Validate skip patterns Log Errors Recode data values Combine data Generate and send processing

reports

Read Flat Files

4 fixed format files 1 state specific format file Input using positional input

information from metadata

Basic Metadata ElementsStateAbb Variable QRangeML QRangePH

FL BC_WHEN4 1-2,9,._ 1-2,8,9,._

MD FEEL_PG 1-4,9 1-4,8,9

MN PREG_TRY 1-2,9 1-2,8,9

SC BC_WHEN4 1-2,9,._ 1-2,8,9,._

NJ BC_WHEN4 1-2,9,._ 1-2,8,9,._

ME BC_WHEN4 1-2,9,._ 1-2,8,9,._

OH BC_WHEN4 1-2,9,._ 1-2,8,9,._

NC BC_WHEN4 1-2,9,._ 1-2,8,9,._

Filtered Metadata

Revision Phase StateAbb Variable QRangeML QRangePH

1.1 5 GA MH_PPHLP 1-2,9 1-2,8,9

1.2 5 GA MH_PPHLP 1-2,9 1-2,8,9

1.1 5 GA TOD_MTH 1-12,.N 1-12,88,99,.R,.N

1.2 5 GA TOD_MTH 1-12,.N 1-12,88,99,.R,.N

1.1 5 GA TOD_DAY 1-31,.N 1-31,88,99,.R,.N

1.2 5 GA TOD_DAY 1-31,.N 1-31,88,99,.R,.N

1.1 5 GA TOD_YR 0-2005,.N 0-2005,8888

1.2 5 GA TOD_YR 0-2005,.N 0-2005

Put Metadata into Macro Variablesproc sql noprint ;

SELECT Variable, StartCol, Length, BCLabel, BCRange, BCRecode, Type

INTO

:Variable1-:Variable&NUM_OBS,

:StartCol1-:StartCol&NUM_OBS,

:Length1-:Length&NUM_OBS,

:BCLabel1-:BCLabel&NUM_OBS,

:BCRange1-:BCRange&NUM_OBS,

:BCRecode1-:BCRecode&NUM_OBS,

:Type1-:Type&NUM_OBS

FROM bc_descrip;

QUIT;

Variable StartCol Length BCLabel BCRange BCRecode Type

SEX 22 1 Gender of Infant

1-2,9 udotnine 1

PLURAL 23 1 Plurality 1-3,9 udotnine 1

MAT_ED 26 1 Maternal Education

1-5,9 udotnine 1

PAT_ED 28 1 Paternal Education

1-5,9 udotnine 1

%Let varRead= INPUT ; %DO I=1 %TO &NUM_OBS %By 1 ; %if &&Type&I =1 %then %Let LengthFt=&&Length&I...; %else %if &&Type&I =2 %then %Let LengthFt=$&&Length&I...; %Let varRead= &varRead @&&StartCol&I &&Variable&I &LengthFt;%END;

data state.bc_&state&batch; infile MDATA missover; &varRead; run;

Building An Input Statement

%Let varRead= INPUT ;

%DO I=1 %TO &NUM_OBS %By 1 ;

%if &&Type&I =1 %then %Let LengthFt=&&Length&I...;

%else %if &&Type&I =2 %then %Let LengthFt=$&&Length&I...;

%Let varRead= &varRead @&&StartCol&I &&Variable&I &LengthFt;

%END;

&varRead INPUT @90 ADM_NNC 1. @79 BCOUNTY 2. @77 BW 1. @83 DD_MDOB 2.

@87 INTERVAL 3. @81 MM_MDOB 2. @78 PAY 1. @85 YY_MDOB 2. @67

HISP_BC 1. @68 MM_HBP 1. @69 MM_BLEED 1. @70 MM_DIAB 1. @71 MM_FEVER 1. @72 MM_PROM 1. @73 MM_ABNOR 1. @74 MM_NOMD 1. @75 MM_NOLD 1. @76 POB 1. @1 MM_DOB 2. @3 DD_DOB 2. @5 YY_DOB 2. @7 MM_LMP 2. @9 DD_LMP 2. @11 YY_LMP 2. @13 PNC_MTH 2. @15 PNC_VST 2. @17 GRAM 4. @21 MARRIED 1. @22 SEX 1. @23 PLURAL 1. @24 MAT_AGE 2. @26 MAT_ED 1. @27 MAT_RACE $1. @28 PAT_ED 1. @29 MM_LLB 2. @31 YY_LLB 2. @33 PRE_LB 2. @35 GEST_WK 2. @37 MOMSMOKE 1. @38 MOMCIG 2. @40 MOMDRINK 1. @41 MOMDRKS 2. @43 MOMLBS 2. @45 DEFECT 1. @46 OTH_TERM 1. @47 DEL_VAG 1. @48 DEL_VCS 1. @49 DEL_1CS 1. @50 DEL_RCS 1. @51 DEL_FORC 1. @52 DEL_VACM 1. @53 MOM_RES 3. @56 HOSPITAL $3. @59 BC 8.

&varRead (Human Version)

INPUT @90 ADM_NNC 1.

@79 BCOUNTY 2.

@77 BW 1.

@83 DD_MDOB 2.

@87 INTERVAL 3.

@81 MM_MDOB 2.

@78 PAY 1.

@85 YY_MDOB 2.

.

.

.

Create the Data Set

data state.bc_&state&batch;

infile MDATA missover;

&varRead;

run;

Formats

Stored in metadata Used to recode values Read using the Cntlin option

FormatsFMTNAME START END LABEL HLO

TWENTYONEHI 21 HIGH .B H

zzTWEFIVH . . .B

TWENTYONEHI .A .Z .B

TWENTYONEHI . . .B

THREEHI .A .Z .B

THREEHI . . .B

Formatslibname RcdFrmts access "J:\path\PRAMS_MetaData.mdb";

data RecodeFormats;

set RcdFrmts.Recodeformats (drop=HLO);

start=compress(start);

end=compress(end);

run;

proc sort data=RecodeFormats nodup;

out=recodeFmts;

by fmtname start end;

run;

proc format cntlin=recodeFmts;

run;

Skip Pattern Validation

Some survey questions may be unanswered The questionnaire is incomplete Skipped because previous a

question makes it irrelevant

24. Did you have any of these problems during your most recent pregnancy?

a. High blood sugar (diabetes) that started before this pregnancy

b. High blood sugar (diabetes) that started during this pregnancy

c. Vaginal bleeding

d. Kidney or bladder (urinary tract) infection

e. Severe nausea, vomiting, or dehydration

f. Cervix had to be sewn shut (incompetent cervix)

g. High blood pressure, hypertension (including pregnancy-induced hypertension [PIH], preeclampsia, or toxemia)

h. Problems with the placenta (such as abruptio placentae or placenta previa)

i. Labor pains more than 3 weeks before my baby was due (preterm or early labor)

j. Water broke more than 3 weeks before my baby was due (premature rupture of membranes [PROM])

k. I had to have a blood transfusion

l. I was hurt in a car accident

If you did not have any of these problems, go to Question 26.

Lead Question

Follow Up25. Did you do any of the following things because of

these problems?

a. I went to the hospital or emergency room and stayed less than 1 day

b. I went to the hospital and stayed 1 to 7 days

c. I went to the hospital and stayed more than 7 days

d. I stayed in bed at home more than 2 days because of my doctor’s or nurse’s advice

Skip Patterns

Validation Error output specification

Integrity Constraintsproc datasets library=state nolist;

modify ckskp1;

ic create MORB5LT1217_skp = check (where=(

(((MORB5LT1 = ._ and MORB_BLD in (1 )) or

(MORB5LT1 ^= ._ and MORB_BLD not in (1 ))) and type=1) or

(((MORB5LT1 = ._ and MORB_BLD in (1,8 )) or

(MORB5LT1 ^= ._ and MORB_BLD not in (1,8 ))) and type=2)

or

(((MORB5LT1 = ._ and MORB_BP in (1 )) or

(MORB5LT1 ^= ._ and MORB_BP not in (1 ))) and type=1) or

(((MORB5LT1 = ._ and MORB_BP in (1,8 )) or

(MORB5LT1 ^= ._ and MORB_BP not in (1,8 ))) and type=2)

.

.

.

Skip Pattern Validation

StateAbb Variable Ref SRangeML SRangePH

ME MORB5LT1 MORB_CAR 1,9 1,8,9

ME MORB5LT1 MORB_DID 1,9 1,8,9

ME MORB5LT1 MORB_BP 1,9 1,8,9

ME MORB5LT1 MORB_CRV 1,9 1,8,9

ME MORB5LT1 MORB_KID 1,9 1,8,9

ME MORB5LT1 MORB_LAB 1,9 1,8,9

ME MORB5LT1 MORB_NAU 1,9 1,8,9

ME MORB5LT1 MORB_PLA 1,9 1,8,9

ME MORB5LT1 MORB_PRM 1,9 1,8,9

ME MORB5LT1 MORB_TRN 1,9 1,8,9

ME MORB5LT1 MORB_DIB 1,9 1,8,9

ME MORB5LT1 MORB_BLD 1,9 1,8,9