
Session V Analyzing Data Session Overview Analysis planning Descriptive epidemiology –Attack rates Analytic epidemiology –Measures of association –Tests


Session V

Analyzing Data


Session Overview

• Analysis planning

• Descriptive epidemiology
  – Attack rates

• Analytic epidemiology
  – Measures of association
  – Tests of significance


Learning Objectives

• Understand what an analytic study contributes to an epidemiological outbreak investigation

• Know why and how to generate measures of association for cohort and case-control studies

• Understand how to interpret measures of association (risk ratios, odds ratios) and corresponding confidence intervals

• Understand how to interpret tests of significance


Basic Steps of an Outbreak Investigation

1. Verify the diagnosis and confirm the outbreak

2. Define a case and conduct case finding

3. Tabulate and orient data: time, place, person

4. Take immediate control measures

5. Formulate and test hypotheses

6. Plan and execute additional studies

7. Implement and evaluate control measures

8. Communicate findings


Analysis Planning


Analysis Planning – An invaluable investment of time

– Helps you select the most appropriate epidemiologic methods

– Helps ensure that the work leading up to analysis yields a database whose structure and content your analysis software can use to run analysis programs successfully


Analysis Planning

Several factors influence, and sometimes limit, your approach to data analysis:

– Research question

– Exposure and outcome variables

– Study design

– Sample population selection


Analysis Planning

Three key considerations as you plan your analysis:

1. Work backwards from the research question(s) to design the most efficient data collection instrument

2. Study design will determine which statistical tests and measures of association you evaluate in the analysis output

3. Consider the need to present, graph, or map data


Analysis Planning

1. Work backwards from the research question(s) to design the most efficient data collection instrument

• Develop a sound data collection instrument

• Collect pieces of information that can be counted, sorted, and recoded or stratified

• Analysis phase is not the time to realize that you should have asked questions differently!


Analysis Planning

2. Study design will determine which statistical tools you will use

• Use risk ratio (RR) with cohort studies and odds ratio (OR) with case-control studies

• Some sampling methods (e.g., matching in case-control studies) require special types of analysis


Analysis Planning

3. Consider the need to present, graph, or map data

• Even if you collect continuous data, you may later categorize it so you can generate a bar graph and assess frequency distributions

• If you plan to map data, you may need X and Y coordinates or denominator data


Data Cleaning

• Check for accuracy
  – Outliers

• Check for completeness
  – Missing values

• Determine whether or not to create or collapse data categories

• Get to know the basic descriptive findings


Data Cleaning: Outliers

• Outliers can be cases at the very beginning and end that may not appear to be related
  – First check to make certain they are not due to a collection, coding, or data entry error

• If they are not an error, they may represent
  – Baseline level of illness
  – Outbreak source
  – A case exposed earlier than the others
  – An unrelated case
  – A case exposed later than the others
  – A case with a long incubation period


Data Cleaning: Distribution of Variables

[Figure: Illness Onset for Outbreak of Gastrointestinal Illness at a Nursing Home. Bar chart of number of cases (y-axis, 0 to 8) by day of onset (x-axis), with an "Outlier" label on the chart.]


Data Cleaning: Missing Values

• Distinguish missing values that are expected from those that are due to problems in data collection or entry

• Frequency distributions also show the number of missing values for each variable
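One way to get those counts is a simple frequency table per variable. The sketch below assumes the line list is a list of dictionaries and that missing values are stored as None or an empty string; the records and the helper name `frequency_table` are hypothetical and only illustrate the idea.

```python
from collections import Counter


def frequency_table(records, variable):
    """Count each value of `variable`, pooling None and '' as missing."""
    counts = Counter()
    for record in records:
        value = record.get(variable)
        key = "<missing>" if value is None or value == "" else value
        counts[key] += 1
    return counts


# Hypothetical line list, for illustration only.
line_list = [
    {"id": 1, "sex": "F", "onset_day": 3},
    {"id": 2, "sex": "M", "onset_day": None},
    {"id": 3, "sex": "", "onset_day": 4},
    {"id": 4, "sex": "F", "onset_day": 4},
]

sex_counts = frequency_table(line_list, "sex")
onset_counts = frequency_table(line_list, "onset_day")
print(dict(sex_counts))
print(dict(onset_counts))
```

Running a table like this for every variable before analysis shows at a glance which questions have gaps that need follow-up.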


Data Cleaning: Data Categories

• Which variables are continuous versus categorical?

• Should existing categories be collapsed into fewer?

• Should categories be created from continuous variables (e.g., age)?
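Creating categories from a continuous variable can be sketched as follows. The cutpoints (5, 18, 65) and the function name `age_group` are assumptions chosen for illustration; an investigation would pick age bands that suit its own population and hypotheses.

```python
def age_group(age, cutpoints=(5, 18, 65)):
    """Return a label such as '18-64' for an age in whole years."""
    if age is None:
        return "missing"
    lower = 0
    for upper in cutpoints:
        if age < upper:
            return f"{lower}-{upper - 1}"
        lower = upper
    return f"{lower}+"


# Hypothetical ages, including one missing value.
ages = [2, 17, 18, 40, 64, 65, 90, None]
groups = [age_group(age) for age in ages]
print(groups)
```

Keeping the original continuous value in the database and deriving the category at analysis time makes it easy to try different cutpoints later.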


Attack Rates


Attack Rates (AR)

AR = (# of cases of a disease) / (# of people at risk, for a limited period of time)

Food-specific AR = (# of people who ate a food and became ill) / (# of people who ate that food)


Food-Specific Attack Rates

CDC. Outbreak of foodborne streptococcal disease. MMWR 23:365, 1974.

              Consumed Item            Did Not Consume Item
Item          Ill   Total   AR (%)     Ill   Total   AR (%)
Chicken        12      46       26      17      29       59
Cake           26      43       61      20      32       63
Water          10      24       42      33      51       65
Green Salad    42      54       78       3      21       14
Asparagus       4       6       67      42      69       61

This food is probably not the source of infection
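The food-specific attack rates can be recomputed from the ill and total counts in the table. This is a minimal sketch: the counts are copied from the table, while the dictionary layout and the names `attack_rate` and `largest_gap` are illustrative choices, not part of the original material.

```python
# (ill, total) among those who consumed and did not consume each item.
foods = {
    "Chicken": ((12, 46), (17, 29)),
    "Cake": ((26, 43), (20, 32)),
    "Water": ((10, 24), (33, 51)),
    "Green Salad": ((42, 54), (3, 21)),
    "Asparagus": ((4, 6), (42, 69)),
}


def attack_rate(ill, total):
    """Attack rate as a percentage of the people at risk."""
    return 100.0 * ill / total


def ar_difference(item):
    """AR among those who ate the item minus AR among those who did not."""
    ate, did_not_eat = foods[item]
    return attack_rate(*ate) - attack_rate(*did_not_eat)


for item in foods:
    ate, did_not_eat = foods[item]
    print(
        f"{item:12s} ate: {attack_rate(*ate):5.1f}%  "
        f"did not eat: {attack_rate(*did_not_eat):5.1f}%  "
        f"difference: {ar_difference(item):+6.1f}"
    )

# The item with the largest gap between the two groups is the leading suspect.
largest_gap = max(foods, key=ar_difference)
print("Largest difference:", largest_gap)
```

A suspect food typically shows a high attack rate among those who ate it and a low attack rate among those who did not, which is why the comparison looks at both columns rather than the first one alone.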


Hypothesis Generation vs. Hypothesis Testing


Formulate hypotheses
– Occurs after having spoken with some case-patients and public health officials
– Based on information from literature review
– Based on descriptive epidemiology (step #3)

Test hypotheses
– Occurs after hypotheses have been generated
– Based on analytic epidemiology


Descriptive Epidemiology                      Analytic Epidemiology
Search for clues                              Clues available
Formulate hypotheses                          Test hypotheses
No comparison group                           Comparison group
Answers: How much, who, what, when, where     Answers: How, why


Measures of Association

• Assess the strength of an association between an exposure and the outcome of interest

• Two widely used measures:

– Risk ratio (a.k.a. relative risk, RR)
  • Used with cohort studies

– Odds ratio (OR)
  • Used with case-control studies


2 x 2 Tables

Used to summarize counts of disease and exposure in order to calculate measures of association

                 Outcome
Exposure   Yes      No       Total
Yes        a        b        a + b
No         c        d        c + d
Total      a + c    b + d    a + b + c + d


2 x 2 Tables

a = number who are exposed and have the outcome
b = number who are exposed and do not have the outcome
c = number who are not exposed and have the outcome
d = number who are not exposed and do not have the outcome

                 Outcome
Exposure   Yes      No       Total
Yes        a        b        a + b
No         c        d        c + d
Total      a + c    b + d    a + b + c + d


2 x 2 Tables

a + b = total number who are exposed
c + d = total number who are not exposed
a + c = total number who have the outcome
b + d = total number who do not have the outcome
a + b + c + d = total study population

                 Outcome
Exposure   Yes      No       Total
Yes        a        b        a + b
No         c        d        c + d
Total      a + c    b + d    a + b + c + d
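The cell counts and margins can be tallied directly from a line list. The sketch below assumes each record has been reduced to an (exposed, has_outcome) pair of booleans; the records and the function name `two_by_two` are hypothetical.

```python
def two_by_two(records):
    """Return cell counts (a, b, c, d) from (exposed, has_outcome) pairs."""
    a = b = c = d = 0
    for exposed, has_outcome in records:
        if exposed and has_outcome:
            a += 1  # exposed, has the outcome
        elif exposed:
            b += 1  # exposed, does not have the outcome
        elif has_outcome:
            c += 1  # not exposed, has the outcome
        else:
            d += 1  # not exposed, does not have the outcome
    return a, b, c, d


# Hypothetical records, for illustration only.
records = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
]

a, b, c, d = two_by_two(records)
margins = {
    "exposed": a + b,
    "unexposed": c + d,
    "with_outcome": a + c,
    "without_outcome": b + d,
    "total": a + b + c + d,
}
print((a, b, c, d), margins)
```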


Risk Ratio

             Ill    Not Ill   Total
Exposed      A      B         A+B
Unexposed    C      D         C+D

Risk Ratio = [A/(A+B)] / [C/(C+D)]


Interpreting a Risk Ratio

• RR = 1.0: no association between exposure and disease

• RR > 1.0: positive association

• RR < 1.0: negative association / protective effect


Risk Ratio Example

                              Ill   Well   Total
Ate alfalfa sprouts            43     11      54
Did not eat alfalfa sprouts     3     18      21
Total                          46     29      75

RR = (43 / 54) / (3 / 21) = 5.6
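The same calculation as a short function, using the A, B, C, D cell layout from the Risk Ratio slide. The function name `risk_ratio` is an illustrative choice.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from 2 x 2 cell counts (a, b exposed; c, d unexposed)."""
    risk_exposed = a / (a + b)      # attack rate among the exposed
    risk_unexposed = c / (c + d)    # attack rate among the unexposed
    return risk_exposed / risk_unexposed


# Alfalfa sprouts example: 43 ill and 11 well among those who ate sprouts,
# 3 ill and 18 well among those who did not.
rr = risk_ratio(43, 11, 3, 18)
print(f"RR = {rr:.1f}")
```

Read this way, people who ate alfalfa sprouts were about 5.6 times as likely to become ill as people who did not.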


Odds Ratio

             Cases   Controls
Exposed      A       B
Unexposed    C       D

Odds Ratio = (A/C) / (B/D) = (A*D) / (B*C)


Interpreting an Odds Ratio

The odds ratio is interpreted in the same way as a risk ratio:

• OR = 1.0: no association between exposure and disease

• OR > 1.0: positive association

• OR < 1.0: negative association


Odds Ratio Example

                              Case   Control   Total
Ate at restaurant X             60        25      85
Did not eat at restaurant X     18        55      73
Total                           78        80     158

OR = (60 / 18) / (25 / 55) = 7.3
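The odds ratio for this table can be checked with the cross-product form (A*D)/(B*C) from the Odds Ratio slide. The function name `odds_ratio` is an illustrative choice.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from 2 x 2 cell counts using the cross-product (a*d)/(b*c)."""
    return (a * d) / (b * c)


# Restaurant X example: 60 exposed cases, 25 exposed controls,
# 18 unexposed cases, 55 unexposed controls.
or_estimate = odds_ratio(60, 25, 18, 55)
print(f"OR = {or_estimate:.1f}")

# The ratio-of-odds form gives the same value (up to floating-point rounding).
same_value = (60 / 18) / (25 / 55)
```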


What to do with a Zero Cell

                              Case   Control   Total
Ate at restaurant X             60         0      60
Did not eat at restaurant X     18        55      73
Total                           78        55     133

• Try to recruit more study participants

• Add 1 to each cell*

*Remember to document / report this!
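A short sketch of why the zero cell is a problem and what the add-1 adjustment does. With no exposed controls the cross-product denominator is zero, so the odds ratio cannot be computed; adding 1 to every cell, as the slide suggests, gives a finite estimate. The function names and the `increment` parameter are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from 2 x 2 cell counts using the cross-product (a*d)/(b*c)."""
    return (a * d) / (b * c)


def odds_ratio_with_increment(a, b, c, d, increment=1):
    """Odds ratio after adding `increment` to every cell (report this if used)."""
    return odds_ratio(a + increment, b + increment, c + increment, d + increment)


# Zero-cell table: no controls ate at restaurant X.
try:
    raw = odds_ratio(60, 0, 18, 55)
except ZeroDivisionError:
    raw = None  # undefined: the denominator b*c is zero

adjusted = odds_ratio_with_increment(60, 0, 18, 55)
print("Unadjusted OR:", raw)
print(f"OR after adding 1 to each cell: {adjusted:.1f}")
```

The adjusted value is sensitive to the size of the increment, which is one reason the adjustment should always be documented alongside the result.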


Tests of Significance

• Indication of reliability of the association that was observed

• Answers the question “How likely is it that the observed association may be due to chance?”

• Two main tests:
  1. 95% Confidence Intervals (CI)
  2. p-values


Confidence Intervals

• Allow the investigator to:
  – Evaluate statistical significance
  – Assess the precision of the estimate (the odds ratio or risk ratio)

• Consist of a lower bound and an upper bound
  – Example: RR=1.9, 95% CI: 1.1-3.1


Confidence Intervals

• Provide information on precision of estimate
  – Narrow confidence intervals = more precise
    • Example: OR=10, 95% CI: 9.0 - 11.0
  – Wide confidence intervals = less precise
    • Example: OR=10, 95% CI: 0.9 - 44.0
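The slides do not say how the intervals are calculated; analysis software normally reports them. As one common approach, the sketch below assumes a Wald-type interval computed on the log scale, applied to the restaurant X and alfalfa sprouts tables from the earlier examples. The function names are illustrative, and other methods (for example, exact intervals) may give somewhat different bounds, especially with small cell counts.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(a, b, c, d, z=Z_95):
    """Odds ratio with a log-scale (Wald-type) confidence interval."""
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def risk_ratio_ci(a, b, c, d, z=Z_95):
    """Risk ratio with a log-scale (Wald-type) confidence interval."""
    estimate = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


or_est, or_low, or_high = odds_ratio_ci(60, 25, 18, 55)   # restaurant X
rr_est, rr_low, rr_high = risk_ratio_ci(43, 11, 3, 18)    # alfalfa sprouts

print(f"OR = {or_est:.1f}, 95% CI: {or_low:.1f} - {or_high:.1f}")
print(f"RR = {rr_est:.1f}, 95% CI: {rr_low:.1f} - {rr_high:.1f}")
```

An interval that excludes 1.0 corresponds to a statistically significant association at the 5% level, and the width of the interval reflects how precisely the study estimated the measure.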


p-values

• The p-value is a measure of how likely the observed association would be to occur by chance alone, in the absence of a true association

• A very small p-value means that you are very unlikely to observe such an RR or OR if there were no true association

• A p-value of 0.05 means that, if there were no true association, an RR or OR at least as extreme as the one observed would occur by chance alone about 5% of the time
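The slides do not name a specific test. As an illustration, the sketch below assumes a Pearson chi-square test for a 2 x 2 table without continuity correction, applied to the alfalfa sprouts table from the risk ratio example. For one degree of freedom the p-value can be obtained from the complementary error function, so only the standard library is needed. The function name is an illustrative choice; with small expected cell counts an exact test is usually preferred.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction) and p-value."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Alfalfa sprouts example: 43 ill / 11 well exposed, 3 ill / 18 well unexposed.
chi2, p_value = chi_square_2x2(43, 11, 3, 18)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

A p-value this far below 0.05 says that an association as strong as the one observed would be very unusual if eating sprouts were unrelated to illness.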


Plan and Execute Additional Studies

• To gather more specific info
  – Example: Salmonella muenchen

• Intervention study
  – Example: Implement intensive hand-washing


Session V Summary

Analysis planning will ensure that you get the most valuable / useful data out of your investigation.

Attack rates are descriptive statistics used in cohort studies that are useful for comparing the risk of disease in groups with different exposures (such as consumption of individual food items).


Analytic epidemiology allows you to test the hypotheses generated via review of descriptive statistics and the medical literature.

The measures of association for case-control and cohort analytic studies, respectively, are odds ratios and risk ratios.

Confidence intervals and p-values that accompany measures of association evaluate the statistical significance of the measures.
