Question of interest:
In stratified random sampling from a population of size N, what sample size n is necessary to determine a proportion π to within the maximum allowable difference D with confidence P?
To answer this, we want to use the following parameters:
N = Each project area’s population size
D = The maximum allowable difference (or margin of error)
π = Estimated proportion screened
P = Confidence level (1 − alpha)
Getting the sample size
Take the smallest subgroup of STD and smallest subgroup of FP (in other words, the STD or FP project area with the smallest eligible population (N)), and calculate their required sample size (n) based on the parameters we set.
Whatever percentage n is of that area’s N will be applied to all the other subgroups of STD or FP.
Eligible population for:
STD: females under 30 yrs
FP: females under 25 yrs for initial or annual exam
Implementation
Using information provided, project areas included were:
STD: NJ, NYS, NYC
• NYS had the smallest eligible population
• Range of estimated proportion screened was 83.8% - 95.7%
Family planning: NJ, NYS, and NYC [USVI FP provided eligible population, but it was so small that it would have greatly increased the sample size for all other areas].
• NYC had the smallest eligible population
• Range of estimated proportion screened was 74.0% - 91.1%
Parameters used (STD)
• NYS clinics 2005: N=4,792
• D: Set at .03
We’re allowing a maximum allowable difference of 3% around the proportion estimate.
• π: Set at .90
We’ll estimate that, of all eligible women in the population who have an MD visit, 90% are screened for Ct.
• P: Set at .95
We’ll estimate our proportion (within ±3%) with 95% confidence; in other words, if we repeated this exercise 100 times, about 95 of those times our estimate would fall within 3 percentage points of the true proportion (between 87% and 93% if the true proportion is 90%).
• If the observed proportion is lower than 87%, we will perform a statistical test (z-score) to assess if the difference between the estimated and observed proportions is statistically significant.
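The z-score check mentioned for an observed proportion below 87% can be sketched as a one-sample z-test for a proportion. This is a minimal illustration, not necessarily the test the authors ran: the function name is hypothetical, the 85% observed value is invented for the example, and the standard error ignores any finite population correction.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(observed_p, expected_p, n):
    """Return (z, lower-tail p-value) for H0: true proportion == expected_p."""
    # Standard error under the null hypothesis (no finite population correction).
    se = sqrt(expected_p * (1 - expected_p) / n)
    z = (observed_p - expected_p) / se
    # Lower tail: how likely is an observed proportion this far below expected?
    p_value = NormalDist().cdf(z)
    return z, p_value


# Hypothetical audit result: 85% screened in a sample of 356 charts,
# compared with the estimated proportion of 90%.
z, p_value = one_sample_proportion_z(observed_p=0.85, expected_p=0.90, n=356)
print(round(z, 2), p_value < 0.05)
```

A strongly negative z with a small p-value would indicate that the observed screening proportion is significantly below the estimate.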
Calculated sample size and sample sizes for all other groups
Using a power and sample size calculator: n = 356
356 is 7.4% of 4,792 (NYS STD clinics’ N).
Thus, we want to take a 7.4% sample from all other project areas.
Subgroup    N       n (=N*7.4%)
NYS STD     4792    362
NYC STD     7448    551
NJ STD      5301    394
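The slides do not say which calculator or formula produced n = 356. As a rough cross-check, the sketch below assumes the common normal-approximation formula for a proportion with a finite population correction; the function name and this choice of formula are assumptions, not the authors' stated method.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(N, D, p, confidence):
    """Sample size to estimate proportion p within +/- D in a population of N."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * p * (1 - p) / D ** 2
    # Finite population correction shrinks n0 when N is not large.
    n = n0 / (1 + (n0 - 1) / N)
    return ceil(n)


# STD parameters from the slides: N=4,792, D=.03, proportion=.90, P=.95
n_std = sample_size_for_proportion(N=4792, D=0.03, p=0.90, confidence=0.95)
print(n_std)  # 356
```

Under the same assumptions, the FP parameters (N=5,621, D=.05, proportion .85) give 190, one less than the 191 reported for FP, which suggests the calculator may use a slightly different method or rounding.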
Parameters used (FP)
• NYC clinics 2005: N=5,621
• D: Set at .05
We’re allowing a maximum allowable difference of 5% around the proportion estimate.
• π: Set at .85
We’ll estimate that, of all eligible women in the population who have an MD visit, 85% are screened for Ct.
• P: Set at .95
We’ll estimate our proportion (within ±5%) with 95% confidence; in other words, if we repeated this exercise 100 times, about 95 of those times our estimate would fall within 5 percentage points of the true proportion (between 80% and 90% if the true proportion is 85%).
• If the observed proportion is lower than 80%, we will perform a statistical test (z-score) to assess if the difference between the estimated and observed proportions is statistically significant.
Calculated sample size and sample sizes for all other groups
Using a power and sample size calculator: n = 191
191 is 3.4% of 5,621 (NYC FP clinics’ N).
Thus, we want to take a 3.4% sample from all other project areas.
Subgroup    N        n (=N*3.4%)
NYC FP      5621     191
NYS FP      45133    1535
NJ FP       68410    2326
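Applying the smallest area's sampling percentage to every other project area is simple arithmetic, sketched below for the FP figures. The function name is hypothetical, and the sketch assumes the rounded rate of 3.4% is used and that results are rounded to the nearest whole chart.

```python
def allocate_by_rate(populations, rate):
    """Apply one sampling rate to every subgroup, rounding to whole records."""
    return {name: round(N * rate) for name, N in populations.items()}


# Eligible populations (N) for the FP project areas, as listed in the slides.
fp_populations = {"NYC FP": 5621, "NYS FP": 45133, "NJ FP": 68410}

fp_samples = allocate_by_rate(fp_populations, rate=0.034)
print(fp_samples)  # {'NYC FP': 191, 'NYS FP': 1535, 'NJ FP': 2326}
```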
Choosing a random sample
• Once you have a sample size, you can draw the random sample based on how the medical records are put in order (by date of visit, alphabetically by name, etc.)
• Microsoft Excel
• Random number table
3680 2231 8846 5418 0498 5245 7071 2597
If we wanted to sample two records from records numbered 1 to 48, we would read off the digits in pairs:
36 80 22 31 88 46 54 18 04 98 52 45 70 71 25 97
If we wanted to sample two records from a much longer list with 140 records in it, we would need to read the digits off in groups of three:
368 022 318 846 541 804 985 245 707 125 97
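Reading the random number table can be sketched in a few lines. The slides do not state what to do with groups that fall outside the record range, so this sketch assumes they are skipped, along with repeats and any incomplete trailing group; the function name is hypothetical.

```python
def pick_from_table(digits, population_size, sample_size):
    """Read fixed-width groups from a digit string and keep usable record numbers."""
    # Group width matches the number of digits in the largest record number.
    width = len(str(population_size))
    stream = digits.replace(" ", "")
    chosen = []
    # Step through complete groups only; a short trailing group is ignored.
    for i in range(0, len(stream) - width + 1, width):
        number = int(stream[i:i + width])
        # Assumption: skip out-of-range values (including 0) and repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


TABLE_ROW = "3680 2231 8846 5418 0498 5245 7071 2597"
print(pick_from_table(TABLE_ROW, 48, 2))   # [36, 22]
print(pick_from_table(TABLE_ROW, 140, 2))  # [22, 125]
```

With records numbered 1 to 48 the first two usable pairs are 36 and 22; with 140 records the first two usable triples are 022 and 125.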
Choosing a systematic sample
• In a random sample every member of the population has an equal chance of being chosen. This is not the case with a systematic sample, but a systematic sample is almost always accepted as being random.
• Suppose you want to sample 8 charts from a population of 120 charts. 120/8=15, so every 15th chart is chosen after a random starting point between 1 and 15. If the random starting point is 11, then the charts selected are 11, 26, 41, 56, 71, 86, 101, and 116.
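The systematic selection in the 120-chart example can be sketched as follows. The function name is hypothetical, and the sketch assumes the sampling interval is the population size divided by the sample size, rounded down when the division is not exact.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Pick every k-th record after a random start between 1 and k."""
    interval = population_size // sample_size  # k, e.g. 120 // 8 = 15
    if start is None:
        start = random.randint(1, interval)  # inclusive on both ends
    return [start + k * interval for k in range(sample_size)]


# Reproduce the worked example: 8 charts from 120, starting at chart 11.
print(systematic_sample(120, 8, start=11))
# [11, 26, 41, 56, 71, 86, 101, 116]
```

Leaving `start` unset draws the starting chart at random, which is how the method would be used in practice.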