DATA ENTRY AND CLEANING
Objective: to help prepare effective data entry programs that minimize the time spent on cleaning after data entry.
Elements of clean data
Consistent and logical:
• All variables are consistent.
• Expenditure and other continuous variables are realistic.
• Coding is consistent.
• All missing values are justified and documented.
Good questionnaire layout facilitates the data entry design and the subsequent data entry and cleaning:
1. Involve a Data Management Specialist from the beginning.
2. Delineate the questionnaire into sections.
3. Pre-code all variables directly on the questionnaires.
4. Enumerate each variable clearly.
5. Create entry boxes for response fields.
6. Integrate logical skips and test them during the pilot.
Quality controls
Design an effective data entry program:
1. Data entry screens must match the questionnaire.
2. Concurrent controls (real-time controls).
3. Range checks.
No  Questions                                       Codes         Skip to
1   Did you receive any private funds last year?    1 Yes
                                                    2 No          >>q3
2   How much?
3   Date of birth
4   Year started teaching in this school
5   Year started teaching
6   What is your salary?                                          >>10
    (Enter “9” for refuse to answer)
7   Relationship with district                      1 Excellent
                                                    2 Good
                                                    3 Fair
                                                    4 Bad
Real-time controls
Checks are done at data entry time.
e.g. If S1Q1 = 2 then
       skip to S1Q3
     Endif
e.g. If S1Q1 = 1 and S1Q2 < 0 then
       reject  (if S1Q1 = 1 then an amount must exist)
     Endif
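The two real-time controls above could be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary-based record, and the treatment of a missing amount as `None` are hypothetical and do not come from any particular data entry package. The amount is taken to be question 2 ("How much?") of the sample questionnaire.

```python
def next_question_after_s1q1(answer):
    """Skip rule: 'No' (code 2) jumps over the amount question."""
    return "S1Q3" if answer == 2 else "S1Q2"


def check_amount(record):
    """Return a list of error messages for the amount question.

    `record` is a hypothetical dict keyed by question name.
    Assumption: a missing (None) or negative amount is rejected
    whenever the respondent reported receiving funds (S1Q1 = 1).
    """
    errors = []
    if record.get("S1Q1") == 1:
        amount = record.get("S1Q2")
        if amount is None or amount < 0:
            errors.append("S1Q2: an amount must exist when S1Q1 = 1")
    return errors


print(next_question_after_s1q1(2))
print(check_amount({"S1Q1": 1, "S1Q2": None}))
```

Running such checks while the operator is keying means the questionnaire is still at hand to resolve the problem, rather than discovering it during cleaning.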
Range checks
Limit all out-of-range values.
• Most out-of-range values come from data entry miskeys and carelessness.
Soft check
• Warns the data entry operator, who can override it.
Hard check
• Cannot be overridden.
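The difference between soft and hard checks can be illustrated with a short Python sketch. The function name `range_check`, its parameters, and the limits used in the example calls are all hypothetical.

```python
def range_check(value, low, high, hard=True, override=False):
    """Return "ok", "warn", or "reject" for one keyed value.

    hard=True  -> an out-of-range value is always rejected.
    hard=False -> an out-of-range value triggers a warning unless the
                  operator has confirmed it (override=True).
    """
    if low <= value <= high:
        return "ok"
    if hard:
        return "reject"
    return "ok" if override else "warn"


# Hypothetical limits: year of birth as a hard check, salary as a soft check.
print(range_check(1850, 1930, 2010))                               # reject
print(range_check(900000, 0, 500000, hard=False))                  # warn
print(range_check(900000, 0, 500000, hard=False, override=True))   # ok
```

A soft check suits variables such as salary, where unusual values can be genuine; a hard check suits values that cannot be valid under any circumstances.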
Use controls or reference tables whenever available
Sample codes
• 1, 24, 45, 60, 90, 154, 766, 980
Other identification variables
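A reference-table check can be as simple as a membership test. The sketch below uses the sample codes listed above; the names `VALID_SAMPLE_CODES` and `is_valid_sample_code` are hypothetical.

```python
# Reference table of valid sample codes, taken from the list above.
VALID_SAMPLE_CODES = {1, 24, 45, 60, 90, 154, 766, 980}


def is_valid_sample_code(code):
    """Accept a keyed code only if it appears in the reference table."""
    return code in VALID_SAMPLE_CODES


print(is_valid_sample_code(45))   # True
print(is_valid_sample_code(54))   # False, e.g. a transposed miskey of 45
```

Unlike a range check, this also rejects values that fall inside the overall range (1 to 980) but are not in the sample.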
Simple consistency checks
Q1: When did you start teaching in this school? = 2000
Q2: When did you start teaching? = 2002
Inconsistent: a teacher cannot have started in this school before starting to teach.
Consistency check:
If S4Q1 < S4Q2 then
  reject  Message (S4Q1 must be >= S4Q2)
Endif
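As a minimal Python sketch, the check below enforces the rule stated in the message, namely that S4Q1 must be greater than or equal to S4Q2. The function and parameter names are hypothetical.

```python
def check_teaching_years(year_this_school, year_first_taught):
    """Return an error message, or None if the two years are consistent.

    year_this_school  -> S4Q1 (year started teaching in this school)
    year_first_taught -> S4Q2 (year started teaching)
    """
    if year_this_school < year_first_taught:
        return "S4Q1 must be >= S4Q2"
    return None


print(check_teaching_years(2000, 2002))  # S4Q1 must be >= S4Q2
print(check_teaching_years(2005, 2002))  # None
```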
Conclusion
• Data entry screens must match the questionnaire
• Real-time controls
• Range checks
• Simple consistency checks
Software
• The data analyst must select the software to be used for statistical analysis, e.g. SPSS, SAS, MS Access, MS Excel.
• The analyst will be guided by the researchers on the sort of tables to be generated.
ANALYSIS OF PETS
Primary analysis should focus on your PET objectives:
• Measure leakage of funds tracked on their way to schools and analyze the causes.
• Use simple parameters (simple averages, percentages and standard deviations), but complex relationships can be developed.
• Analyze equity in leakage distribution (urban/rural divide, etc.).
• Compare resources disbursed at various levels (central, district, subdistrict, subcounty, schools) against entitlements.
• Calculate average differences between levels.
• Determine how these differences vary over time and space.
• What are the explanations?
• What are the proposed interventions?
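The simple parameters mentioned above (averages, percentages, standard deviations, and average differences between levels) can be computed with Python's standard library. The sketch below is illustrative only: the data are invented, and the field names and the two helper functions are hypothetical.

```python
from statistics import mean, stdev

# Invented data: for each school, the entitlement and the amount
# observed at each level on the way down.
schools = [
    {"entitled": 1000, "central": 1000, "district": 900, "school": 700},
    {"entitled": 1000, "central": 1000, "district": 800, "school": 600},
    {"entitled": 2000, "central": 2000, "district": 1900, "school": 1500},
]
LEVELS = ["central", "district", "school"]


def share_of_entitlement(rows, level):
    """Percentage of the entitlement observed at `level`, per school."""
    return [100 * r[level] / r["entitled"] for r in rows]


def average_drop(rows, upper, lower):
    """Average difference, in percentage points, between two levels."""
    upper_shares = share_of_entitlement(rows, upper)
    lower_shares = share_of_entitlement(rows, lower)
    return mean(u - l for u, l in zip(upper_shares, lower_shares))


for level in LEVELS:
    shares = share_of_entitlement(schools, level)
    print(level, round(mean(shares), 1), round(stdev(shares), 1))
print("district -> school:", round(average_drop(schools, "district", "school"), 1))
```

Repeating the same calculation per year or per region would show how the differences vary over time and space.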
LEAKAGES
Amount actually received / Amount entitled
• Calculated by administrative level: region, district, school.
• Explanations for the leakages.
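The ratio can be aggregated by administrative level as in the following Python sketch. It rests on assumptions: leakage is taken to be 1 minus (amount received / amount entitled), which the slides do not state explicitly, and the records and the function name `leakage_by` are invented for illustration.

```python
from collections import defaultdict

# Invented school-level records.
records = [
    {"region": "North", "district": "A", "received": 600, "entitled": 1000},
    {"region": "North", "district": "B", "received": 900, "entitled": 1000},
    {"region": "South", "district": "C", "received": 400, "entitled": 800},
]


def leakage_by(rows, key):
    """Leakage share per group, assumed to be 1 - received / entitled."""
    received = defaultdict(float)
    entitled = defaultdict(float)
    for r in rows:
        received[r[key]] += r["received"]
        entitled[r[key]] += r["entitled"]
    return {group: 1 - received[group] / entitled[group] for group in entitled}


print(leakage_by(records, "region"))
print(leakage_by(records, "district"))
```

Summing received and entitled amounts before dividing weights each group by the funds involved; averaging the school-level ratios instead would weight every school equally.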
Analysis beyond leakages
Leakage is the primary focus, but there are many aspects of service delivery that are equally important:
• Textbooks procured and at school but not used, or stored outside school.
• Teachers are hired and paid for but not teaching.
• Students enrolled but absent most of the time.
• Students at school but parents not providing for uniform and lunches.
• Health workers paid for but not attending to patients when drugs are out of stock.
• Rude nurses scare pregnant women who want to deliver at the health center.
• Health centers opened on time but waiting time to see a health worker too long.
• Most patients opt for traditional healers instead of public health centers.
All these call for a combination of research methodologies and interventions to investigate and resolve the above.