Sampling
MICS3 Regional Workshop
“Survey Design”
MICS Sample Design
MICS is a complex survey with a multi-stage stratified design. Because MICS is a worldwide program, consistency and comparability are important issues. We will discuss only a few of the highlights, including:
- Sample size determination
- Stratification and sample allocation
- Number of Primary Sampling Units and cluster sizes
- Use of an existing sample or a new sample
- A few special topics
Sample Size for MICS
Sample size is the most important feature of MICS with respect to survey costs. We will discuss:
DETERMINANTS – factors, constraints
INDICATORS to use
FORMULA to calculate sample size
Determinants of Sample Size (Factors and Constraints)
Sample size (households) depends on many factors:
- Expected size estimate of indicators
- Expected size estimate of target population(s)
- Average household size
- Margin of error wanted
- Level of confidence wanted
- “Design effect” (increase in sampling error due to use of a cluster survey instead of a simple random sample)
- Expected non-response rate
- Number of clusters or PSUs
- Cluster size (number of households per sample cluster)
- Number of sub-national areas for separate estimates (domains)
- Survey budget and implementing capability
MICS Recommendations on Sample Size Determinants
FACTOR                                              RECOMMENDATION
1. Expected size estimate of indicators             (next slide)
2. Expected size estimate of target population      12-23 mos [3%]
3. Average household size                           6 persons
4. Relative margin of error wanted                  12% of coverage rate
5. Level of confidence wanted                       95 percent
6. Design effect in cluster surveys                 1.5
7. Expected non-response rate                       10 percent
8. Number of clusters or PSUs - minimum             [300-400]
9. Cluster size                                     [15-35]
10. Number of estimation “domains” wanted           [5 or fewer]
11. Survey budget                                   (country specific)
For items 2, 3, 6, 7 use available country data (recent survey or census); if not available, use value above.
Indicators for Sample Size Determination
- Sample size is different for each MICS indicator.
- Must choose a key indicator, since only one sample size can be used in MICS.
Recommendations for choosing key indicator:
- Choose from among main indicators of interest in your country.
- Choose the one which will yield the largest sample size.
- Usually for a single-year age group, and
- Usually DPT, measles, polio or tuberculosis immunization - or birth weight below 2.5 kg
Exceptions:
- Do not choose infant or maternal mortality rates as the key indicators.
- Do not choose a low coverage indicator that is desirably low (such as malnutrition prevalence).
- Do not choose breast-feeding indicators for 4-month age groups.
Checklist for Target Group and Indicator
To decide on the appropriate target group and indicator that you need to determine your sample size:
1. Pick children 12-23 months old - the target population that comprises the smallest percentage of the total population – probably about 3 percent.
2. For that target group, pick the lowest from among the following coverage rates:
   - DPT immunization level
   - Measles immunization level
   - Polio immunization level
   - Tuberculosis immunization level
3. Do not pick a desirably low coverage indicator that is already acceptably low.
Formula for Sample Size
- Different formula than MICS2000
- MICS2005 formula emphasizes relative margin of error* instead of 5% absolute error (high coverage indicator) or 3% for low coverage indicator.
   - Less confusing
   - Does not depend on high or low coverage
* The relative margin of error is the tolerable difference between the estimated proportion and its true value, expressed as a percentage of the proportion, at a given confidence level. It determines the relative length of the confidence interval.
Formula
n = [4 (r) (1 - r) (deff) (1.1)] / [(0.12r)^2 (p) (ave-size)]
where
- n is the required sample size, expressed as number of households, for the KEY indicator,
- 4 is the factor to achieve the 95 percent level of confidence,
- r is the anticipated prevalence (coverage) rate for the key indicator,
- 1.1 is the factor to raise the sample size by 10 percent for potential nonresponse,
- deff is the shortened symbol for design effect,
- 0.12r is the margin of error to be tolerated, defined as 12 percent of r (12 percent thus represents the relative sampling error of r),
- p is the proportion of the total population that the smallest group comprises, and
- ave-size is the average household size.
You may use the table on the next page instead of formula if all conditions are satisfied for that table in your country.
Sample Size (Households) Calculation for Proportion Estimation Using Smallest Target Population

Average household size   Coverage rate   Coverage rate   Coverage rate   Coverage rate
(number of persons)      r = 0.15        r = 0.20        r = 0.30        r = 0.40
4.5                      19,239          13,580          7,922           5,093
5.0                      17,315          12,222          7,130           4,583
5.5                      15,741          11,111          6,481           4,167
6.0                      14,429          10,185*         5,941           3,819
6.5                      13,319          9,402           5,484           3,526
Use this table when:
1. Your target population is 3 percent of total population; this is generally children 12-23 months old
2. Sample design effect, deff, is assumed to be 1.5 and nonresponse is expected to be 10 percent
3. Relative margin of error is set at 12 percent of the estimate of coverage rate, r
Example 1
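- Target group: Children 12 to 23 months old
- Percent of population: 3 percent
- Key indicator: DPT immunization coverage
- Prevalence (coverage): 30 percent
- Deff: No information
- Non-response: No information
- Average household size: 6
With no country information on deff or non-response, the recommended values (1.5 and 10 percent) apply, so the table can be used.
Checking the table (household size 6.0, r = 0.30) => n = 5,941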
Checklist for Use of Sample Size Formula
The formula to determine your sample size:
n = [4 (r) (1 - r) (f) (1.1)] / [(0.12r)^2 (p) (nh)]
Use it if any (one or more) of the following applies in your country:
1) p – the proportion of one-year-old children is other than 3%
2) nh – the average household size is less than 4.5 persons or greater than 6.5
3) r – the coverage rate of your key indicator is under 20 or over 40 percent
4) f – the sample design effect for your key indicator is different from 1.5, according to accepted estimates from other surveys in your country
5) your anticipated non-response rate is more or less than 10 percent.
Example 2
- Target group: Children 12 to 23 months old
- Percent of population: 3.5 percent
- Key indicator: DPT immunization coverage
- Prevalence (coverage): 25 percent
- Deff: 1.6
- Non-response adjustment: 1.05 (response rate 95%)
- Average household size: 6
n = [4 (0.25) (0.75) (1.6) (1.05)] / [(0.12 × 0.25)^2 (0.035) (6)] = 1.26 / 0.000189 = 6,667
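The two worked examples can be checked with a few lines of Python. This is only a sketch of the formula as stated on the slides; the function name and the argument names are illustrative, and the defaults are the recommended values (deff 1.5, 10 percent nonresponse, 12 percent relative margin of error).

```python
def mics_sample_size(r, p, ave_size, deff=1.5, nonresponse_factor=1.1, rel_error=0.12):
    """Required number of households for the key indicator.

    r         -- anticipated coverage rate of the key indicator
    p         -- proportion of total population in the smallest target group
    ave_size  -- average household size
    The factor 4 corresponds to the 95 percent level of confidence.
    """
    numerator = 4 * r * (1 - r) * deff * nonresponse_factor
    denominator = (rel_error * r) ** 2 * p * ave_size
    return numerator / denominator

# Example 1: defaults apply, so the result should match the table.
example_1 = mics_sample_size(r=0.30, p=0.03, ave_size=6)

# Example 2: country-specific p, deff and non-response adjustment.
example_2 = mics_sample_size(r=0.25, p=0.035, ave_size=6,
                             deff=1.6, nonresponse_factor=1.05)

print(round(example_1), round(example_2))  # 5941 6667
```

Running the same function over several candidate indicators is a quick way to see which one yields the largest sample size.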
Stratification & Sample Allocation
- Stratification is the process of grouping similar PSUs into sub-groups (strata).
- Effects: better precision, flexible design, small sub-population coverage (or over sampling).
- How to do stratification? (region) X (residence type)
- Sample allocation: proportional, power allocation, equal size allocation (if budget is too tight).
- Implicit stratification: sort the sampling frame according to certain characteristics such as regions, urban-rural residence, sub-regions, districts, etc., then select a pps sample.
- There is no unique rule for stratification; it depends on the country situation.
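The three allocation rules can be viewed as one rule with a different exponent. The sketch below assumes the usual definition of power allocation (sample proportional to stratum size raised to a power between 0 and 1); the stratum names and household counts are hypothetical, and the slides do not prescribe a particular exponent.

```python
def allocate(stratum_sizes, total_sample, power=1.0):
    """Allocate total_sample across strata in proportion to size ** power.

    power=1.0 gives proportional allocation, power=0.0 gives equal allocation,
    and values in between (0.5 is a common choice) give power allocation.
    """
    measures = {name: size ** power for name, size in stratum_sizes.items()}
    total_measure = sum(measures.values())
    exact = {name: total_sample * m / total_measure for name, m in measures.items()}
    # Round down, then give the leftover units to the largest remainders
    # so that the allocation sums exactly to total_sample.
    allocation = {name: int(value) for name, value in exact.items()}
    leftover = total_sample - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical household counts for four strata (region x residence type).
strata = {"North-urban": 100_000, "North-rural": 400_000,
          "South-urban": 250_000, "South-rural": 250_000}

proportional = allocate(strata, 6000, power=1.0)
power_half = allocate(strata, 6000, power=0.5)
equal = allocate(strata, 6000, power=0.0)
```

Compared with proportional allocation, power allocation shifts sample toward the smaller strata, which is what makes small sub-population coverage possible.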
Number of PSUs and Cluster Size
Survey costs depend not only on number of households but their distribution among Primary Sampling Units (PSUs).
In general, the more PSUs the better for reliability but the greater the cost (usually travel costs).
We recommend 300 to 400 PSUs or more.
Number of PSUs also depends on cluster size.
Cluster size should be as small as practical for reliability.
Example: 8000 households selected in 400 PSUs of 20 households each is a much more reliable sample than 8000 households in 200 PSUs of 40 each, but more expensive.
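One way to see why the 400 x 20 design is more reliable is the standard approximation deff = 1 + (m - 1) x rho, where m is the cluster size and rho is the intra-cluster correlation. The slides do not give a value for rho; the 0.05 used below is purely an assumed figure for illustration.

```python
def design_effect(cluster_size, rho):
    """Approximate design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * rho

def effective_sample_size(n_psus, cluster_size, rho):
    """Size of the simple random sample with the same precision."""
    return n_psus * cluster_size / design_effect(cluster_size, rho)

RHO = 0.05  # assumed intra-cluster correlation, not a MICS recommendation

for n_psus, cluster_size in [(400, 20), (200, 40)]:
    deff = design_effect(cluster_size, RHO)
    n_eff = effective_sample_size(n_psus, cluster_size, RHO)
    print(f"{n_psus} PSUs x {cluster_size}: deff = {deff:.2f}, "
          f"effective sample = {n_eff:.0f}")
```

Under this assumption both designs interview 8000 households, but the design with smaller clusters has the lower design effect and therefore the larger effective sample.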
MICS Sampling Option 1
USE AN EXISTING SAMPLE
- Piggy-back MICS onto DHS or other survey if timely and feasible.
- Or, use sample from a previous survey and re-interview households for MICS.
- Or, use old survey sample EAs and construct new listing of households to select for MICS.
- Old sample must be probability-based, national in scope.
- Possibilities – DHS, other national health survey, recent labour force or household expenditure surveys
- Important: design parameters must be known (such as selection probability, stratification, etc.)
OPTION 1 - USE OF AN EXISTING SAMPLE, continued
Advantages of old sample
- cost savings
- maps available for interviewers
- design rigor
- simplicity
Limitations of old sample
- burden on respondents
- sample design may need modification
   * sample size
   * sub-national coverage
   * number of PSUs or clusters
=> Balance between loss and gain
MICS Sampling Option 2
USE NEW SAMPLE WITH HOUSEHOLD LISTING OPERATION
- Design new MICS sample based on prototype
- Two stages with census as frame (see comprehensive discussion in Chapter 4 on frame construction and up-dating old frames)
- Use of implicit stratification, systematic selection of census EAs at first stage with pps
- Create standard segments (DHS approach)
- List households in selected segments
- Select households systematically from list
- Interview only the selected households; no replacement will be allowed
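The first-stage selection described above (sort the frame for implicit stratification, then select EAs systematically with pps) can be sketched as follows. This is a generic textbook version of systematic pps selection, not code taken from the MICS manual; the frame records and the function name are hypothetical.

```python
import random

def pps_systematic(frame, n_select, rng=None):
    """Select n_select units with probability proportional to size.

    frame: list of (unit_id, size) pairs, already sorted in the order that
    provides implicit stratification.
    """
    rng = rng or random.Random()
    total = sum(size for _, size in frame)
    interval = total / n_select
    start = rng.random() * interval          # random start in [0, interval)
    points = [start + k * interval for k in range(n_select)]

    selected, cumulative, next_point = [], 0, 0
    for unit_id, size in frame:
        cumulative += size
        # A unit is hit once for every selection point inside its size range,
        # so a unit larger than the interval can be selected more than once.
        while next_point < n_select and points[next_point] < cumulative:
            selected.append(unit_id)
            next_point += 1
    return selected

# Hypothetical frame records: (region, residence, EA id, census households).
records = [
    ("North", "urban", "EA-01", 120), ("South", "rural", "EA-02", 90),
    ("North", "rural", "EA-03", 150), ("South", "urban", "EA-04", 110),
    ("North", "rural", "EA-05", 80),  ("South", "rural", "EA-06", 130),
    ("North", "urban", "EA-07", 100), ("South", "urban", "EA-08", 140),
]
records.sort(key=lambda rec: (rec[0], rec[1]))       # implicit stratification
frame = [(ea_id, households) for _, _, ea_id, households in records]
sample_eas = pps_systematic(frame, n_select=3, rng=random.Random(2005))
```

Because the selection points are evenly spaced along the sorted frame, the sample is spread across regions and residence types without defining explicit strata.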
OPTION 2 - NEW SAMPLE WITH HOUSEHOLD LISTING, continued
Advantages of option 2
- simple design
- probability-based
- if possible self-weighting (national level)
Limitations of option 2
- expense of listing households
- time necessary to list households
[Example: a sample size of 5000 households may need 25000 to 50000 households to be listed.]
DHS Method - Option 2
- Create “standard” segments.
- Divide census population in each EA by 500 to determine number of standard segments.
- Map sketch segments in each EA.
- Choose 1 segment at random.
- List households in selected segment only (instead of entire EA).
- Purpose is to reduce listing workload to a manageable size.
MICS Sampling Option 3
USE NEW SAMPLE WITHOUT HOUSEHOLD LISTING OPERATION
(Modified Segment, or Cluster, Design)
- Design new MICS sample based on prototype.
- Two stages with census as frame
- Use of implicit stratification, systematic selection of census EAs at first stage with pps
- Pre-determine number of segments based on desired cluster size.
- Map sketch segments in each EA.
- Choose 1 segment at random.
- Interview all households in selected segment
OPTION 3 - NEW SAMPLE WITHOUT HOUSEHOLD LISTING, continued
Illustration:
- Suppose desired cluster size is 20 households.
- Suppose first sample EA contains 112 census households (according to frame).
- Divide 112 by 20 = 5.6 (round to 6).
- Map sketch exactly 6 segments based on canvass of EA.
- Select one segment at random.
- Interview all households (no matter how many are currently in the selected segment).
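The arithmetic of the illustration is small enough to sketch directly. The function name is illustrative, and the "at least one segment" rule for very small EAs is an assumption added here, not something stated on the slide.

```python
import math
import random

def number_of_segments(census_households, desired_cluster_size):
    """Segments to sketch in an EA: census households / cluster size, rounded.

    Rounds half up; the floor of 1 for very small EAs is an assumption.
    """
    return max(1, math.floor(census_households / desired_cluster_size + 0.5))

segments = number_of_segments(112, 20)        # 112 / 20 = 5.6, rounds to 6
chosen_segment = random.randint(1, segments)  # each segment has chance 1/6
```

The chance of 1 in 6 for the chosen segment is part of that segment's overall selection probability, so it should be recorded for use in weighting.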
OPTION 3 - NEW SAMPLE WITHOUT HOUSEHOLD LISTING, continued
Advantages of option 3
- avoids listing completely
- probability-based
- self-weighting (national level)
Limitations of option 3
- less reliable than option 2 (households are “clustered” together in compact segments)
- segmentation itself can be time-consuming and complicated
- difficult to control sample size
Special Topics
- Sub-national estimates, domains
- Water and sanitation estimates
- Survey weighting, sampling errors
- Other – sample frame construction, selection techniques
- Country examples
Sub-national Estimates, Domains
- Number of separate areas (domains) for which separate, equally reliable estimates are wanted affects sample size.
- If, say, 5 regional estimates are wanted, then, theoretically, sample should be increased by factor of 5.
- Must be careful therefore in producing separate estimates for domains.
- Either limit number of domains to avoid large increase in sample size,
- Or be prepared to accept domain estimates with much higher sampling errors than national.
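The trade-off between the two choices can be put in numbers. The sketch below assumes the national sample is split equally among the domains and that coverage rate and design effect are the same in every domain, so that sampling error grows with the square root of the number of domains; real domains will differ.

```python
import math

def domain_relative_error(national_rel_error, n_domains):
    """Relative margin of error per domain when the national sample is
    split equally among n_domains (same r and deff assumed everywhere)."""
    return national_rel_error * math.sqrt(n_domains)

def sample_for_equal_reliability(national_sample, n_domains):
    """Total sample needed if every domain must match national precision."""
    return national_sample * n_domains

# With 5 regions and the 12 percent national relative margin of error:
per_domain_error = domain_relative_error(0.12, 5)       # about 0.27
total_needed = sample_for_equal_reliability(5941, 5)    # 29705 households
```

So with the Example 1 sample of 5,941 households, five regional estimates would each carry a relative margin of error of roughly 27 percent instead of 12 percent, unless the sample is increased about fivefold.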
Water and Sanitation Estimates
- These are an important component of MICS.
- Sampling errors will be high, however (extremely high in some cases).
- MICS sample is designed primarily for person variables rather than household variables such as water/sanitation.
- Sample design effects for water and sanitation indicators will be much higher than for other indicators.
- Consequently, sampling reliability is very low.
- Estimates can nevertheless be useful to estimate trends in water/sanitation if previous surveys exist with which to make comparisons.
Survey Weighting and Sampling Errors
- All analysis based on survey data must apply survey weights in order to prevent biased results.
- Survey weighting is design-specific. Non-response must be taken into account.
- Formulas for calculating weights depend on the exact sample design used in each country.
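As an illustration only, the sketch below computes a household weight for one common case: EAs selected with pps at the first stage and households selected from a listing at the second stage, followed by a simple cluster-level non-response adjustment. The actual formula depends on the country design (strata, segmentation, and so on), and all names and figures here are hypothetical.

```python
def design_weight(n_eas_selected, ea_size, total_size, hh_selected, hh_listed):
    """Inverse of the overall selection probability in a two-stage design.

    Stage 1: EA selected with probability n_eas_selected * ea_size / total_size.
    Stage 2: household selected with probability hh_selected / hh_listed.
    In a stratified design these quantities would be taken within the stratum.
    """
    p_ea = n_eas_selected * ea_size / total_size
    p_household = hh_selected / hh_listed
    return 1 / (p_ea * p_household)

def nonresponse_adjusted(weight, hh_selected, hh_responding):
    """Inflate the weight so responding households represent all selected ones."""
    return weight * hh_selected / hh_responding

# Hypothetical EA: 150 census households out of 1,500,000 in the frame,
# 400 EAs selected, 20 households selected from 160 listed, 18 responded.
base = design_weight(400, 150, 1_500_000, 20, 160)    # about 200
final = nonresponse_adjusted(base, 20, 18)            # about 222
```

If a segment is chosen within the EA, as in options 2 and 3, its selection probability would be an additional factor in the product.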
Sampling Error Estimation
- Calculation of sampling errors is necessary to evaluate reliability of survey estimates
- Should be done for 30-50 important indicators
- Methodology is complex and design-specific
- There are several options for sampling error calculations:
   - May use existing software (Clusters, WesVar, CenVar, PCCarp, etc.)
   - The latest version of SPSS is currently being evaluated to determine whether its new routines on sampling error are appropriate for MICS3 surveys
   - Routines in CSPro can be used
   - Or use a simple variance spreadsheet that will be available on the MICS website, www.childinfo.org
Sampling Error Estimation, continued
With spreadsheet, only necessary to enter:
- Survey weights for each cluster
- Unweighted indicator estimate for each cluster
Then:
- Sampling error automatically calculated
- Confidence limits, design effect automatically calculated
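The slides do not describe how the spreadsheet works internally. The sketch below shows one standard approach that uses the same kind of cluster-level inputs: the linearized "ultimate cluster" variance estimator for a weighted proportion. It ignores stratification, and the per-cluster counts are an extra input assumed here so that the estimate and the design effect can be computed.

```python
import math

def cluster_sampling_error(weights, counts, proportions):
    """Linearized (ultimate cluster) standard error of a weighted proportion.

    weights[i]     -- survey weight applied to cluster i
    counts[i]      -- number of target-group members observed in cluster i
    proportions[i] -- unweighted indicator estimate in cluster i
    """
    n_clusters = len(weights)
    x = [w * n for w, n in zip(weights, counts)]       # weighted counts
    y = [xi * p for xi, p in zip(x, proportions)]      # weighted cases
    estimate = sum(y) / sum(x)
    # Linearized residual of each cluster; these sum to zero by construction.
    z = [(yi - estimate * xi) / sum(x) for yi, xi in zip(y, x)]
    variance = n_clusters / (n_clusters - 1) * sum(zi ** 2 for zi in z)
    se = math.sqrt(variance)
    srs_variance = estimate * (1 - estimate) / sum(counts)
    return {
        "estimate": estimate,
        "se": se,
        "lower": estimate - 2 * se,     # approximate 95 percent limits
        "upper": estimate + 2 * se,
        "deff": variance / srs_variance,
    }

# Tiny made-up data set: four clusters of ten children each.
result = cluster_sampling_error(
    weights=[1.0, 1.0, 1.0, 1.0],
    counts=[10, 10, 10, 10],
    proportions=[0.2, 0.4, 0.6, 0.8],
)
```

The limits use plus or minus 2 standard errors, which matches the factor 4 (that is, 2 squared) used for 95 percent confidence in the sample size formula.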
Other Topics
Other key information to be included in the MICS3 manual for the sampling statistician to review:
- Sample frame construction
   - When new sample is used for MICS
   - Especially important if frame is old
- Selection techniques
   - Details of systematic sampling
   - PPS sampling (probability proportionate to size)
- Country examples from MICS2000
   - Papua New Guinea, Lebanon, Angola