How Many Cases Are Too Many? Detection of Disease Outbreaks and Clusters Lance A. Waller, Department of Biostatistics, Rollins School of Public Health, Emory University [email protected]

How Many Cases Are Too Many? Detection of Disease ...web1.sph.emory.edu/users/lwaller/ClusterIOL.pdf · for cases to occur in clusters. Tests to detect clusters: find most likely


How Many Cases Are Too Many?

Detection of Disease Outbreaks and Clusters

Lance A. Waller, Department of Biostatistics, Rollins School of Public Health, Emory University

[email protected]


How many are too many?

What sets off the public health “alarm”?

For anthrax and smallpox…

ONE (no statistics needed)

(rare enough and dangerous enough)


What about…

…a more subtle pattern?

5 flu cases in a single day.

20 acute asthma attacks in one neighborhood.

We want to detect anomalies: patterns of cases that differ from the “usual” pattern.


What are we looking for?

Among the “Epidemiologic clues that may signal a covert bioterrorism attack” in CDC’s The Public Health Response to Biological and Chemical Terrorism: Interim Planning Guidance for State Public Health Officials (July 2001):

“Disease with unusual geographic or seasonal distribution”

http://www.bt.cdc.gov/Documents/Planning/PlanningGuidance.PDF


John Snow, M.D. 1854 map

Snow, J. (1949) Snow on Cholera. Oxford University Press: London.


What we want...

Statistical assessments of the “unusualness” of observed patterns in space and time.

Suggests statistical tests of:

H0: No clusters in the data.

Yes/no answer?

Easy to ask, harder to answer.


Distributed “by chance”…

Need to “operationalize” H0

What sort of data arise under H0?

What counts as evidence against H0?

Simple random (uniform) pattern?


Scan statistics

Count events in a moving window.

In time:

Consideration: Cluster “anywhen”, or outbreak now?

4 3 2 20

Wallenstein, S. (1980) A test for detection of clustering over time. American Journal of Epidemiology 111, 367-372.
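The moving-window count in time can be sketched as follows. This is a minimal illustration, not Wallenstein's test itself: the function name `max_window_count`, the window width, and the onset days are all hypothetical, and windows are only started at observed event times. The maximum count is the scan statistic; judging whether it is unusually large still needs a reference distribution under H0.

```python
from typing import Sequence, Tuple


def max_window_count(event_times: Sequence[float], width: float) -> Tuple[int, float]:
    """Return (largest count, window start) over windows [t, t + width)
    whose left edge t sits on an observed event time."""
    times = sorted(event_times)
    best_count, best_start = 0, float("nan")
    right = 0  # index of the first event at or beyond the current window's end
    for left, start in enumerate(times):
        while right < len(times) and times[right] < start + width:
            right += 1
        count = right - left  # events falling in [start, start + width)
        if count > best_count:
            best_count, best_start = count, start
    return best_count, best_start


# Hypothetical onset days, for illustration only.
onset_days = [1, 3, 4, 9, 15, 16, 16, 17, 18, 30]
print(max_window_count(onset_days, width=4))
```

For "outbreak now" rather than "cluster anywhen", one would look only at the window ending at the current time instead of maximizing over all windows.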


Scan statistic in space

[Figure: map with moving spatial windows containing 2, 0, 3, and 1 events]

Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics-Theory and Methods 26, 1481-1496.


Complication

Heterogeneous population density


Refine the question…

“Are there clusters in the data?” to

“Are there clusters in the data after adjusting for heterogeneities in the population at risk?”


Complication:

Where is “where”?

Which location for each case?

Example: Maxcy (1926) study of endemic typhus fever in Montgomery, AL, 1922-1925.

Lilienfeld, D.E. and Stolley, P.D. (1994) Foundations of Epidemiology, Third Edition. Oxford University Press: New York, pp. 136-140.

Maxcy, K.F. (1926) “An epidemiological study of endemic typhus (Brill’s disease) in the Southeastern United States with special reference to its mode of transmission.” Public Health Reports 41, 2967-2995.


[Figure: two maps of the cases, one plotted by residence location and one by place of employment]


Refine the question…

“Are there clusters in the data after adjusting for heterogeneities in the population at risk?” to…

“Are there clusters of case residences in the data after adjusting for heterogeneities in the population at risk?”

We’re building a conceptual model…


What we have…

Disease surveillance (ongoing collection, monitoring, and analysis of disease data).

• Vital statistics (birth/death certificates)
• Notifiable diseases (required reporting)
• Registries (link multiple sources of information on each case, e.g. SEER)
• Health surveys (NHANES, NHIS, BRFSS)

Teutsch, S.M. and Churchill, R.E. (1994) Principles and Practice of Public Health Surveillance. Oxford University Press: New York.


Data components

Types of location (time or space) data:

Point data (case locations)
• Latitude/longitude
• Street address
• Confidentiality?

Regional count data
• Counts for enumeration districts


Background data

Types of background data:

Point locations for non-cases (“controls”)
• Is the spatial distribution of cases close to that of controls?

Regional census counts
• Is the observed number of cases close to the number expected under H0?


Point data

Case locations geocoded from registry or billing records.

Controls:
• All non-cases (e.g., birth records)
• Sample (perhaps matched) of non-cases
• Different outcome (e.g., nonrespiratory ED visits, compared to respiratory ED visits)


Regional Count Data

Aggregate to regional counts, often to preserve confidentiality.

[Figure: map of regions, each labeled with its case count]


Complication:

Counts lose some resolution...

[Figure: map of regions, each labeled with its case count]


Modifiable Areal Unit Problem

Different aggregations can lead to different results.

[Figure: the same cases aggregated under different sets of regional boundaries, giving different counts]
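A one-dimensional toy version of the modifiable areal unit problem, as a rough sketch: the case positions and both sets of region boundaries below are hypothetical. The same cases are counted under two different partitions of the same stretch of space.

```python
import numpy as np

# Hypothetical case positions along a single axis, for illustration only.
case_x = np.array([0.9, 1.1, 1.2, 1.8, 2.1, 2.2, 3.9])

# Two hypothetical ways of drawing four regions over the same cases.
edges_a = [0.0, 1.0, 2.0, 3.0, 4.0]
edges_b = [0.5, 1.5, 2.5, 3.5, 4.5]

counts_a, _ = np.histogram(case_x, bins=edges_a)
counts_b, _ = np.histogram(case_x, bins=edges_b)

print("aggregation A:", counts_a.tolist())
print("aggregation B:", counts_b.tolist())
```

Under the first set of boundaries the cases look spread over all four regions; under the second, two regions hold most of them and one is empty, even though no case has moved.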


MAUP example: John Snow


Monmonier, M (1991) How to Lie with Maps. University of Chicago Press: Chicago. p. 142.


Operationalizing H0:

Case/control point data: random labeling hypothesis.

Say n0 control and n1 case locations.

H0: Case/control labels are randomly assigned to the n = n0 + n1 total locations.
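Simulating one data set under the random labeling hypothesis amounts to shuffling the case/control labels over the fixed set of locations. A minimal sketch, assuming NumPy; the function name `random_labeling` and the sample sizes are illustrative choices, not part of the original slides.

```python
import numpy as np


def random_labeling(n_cases: int, n_controls: int, rng: np.random.Generator) -> np.ndarray:
    """Return a boolean array of length n_cases + n_controls with exactly
    n_cases True entries (the "case" labels) in a random order."""
    labels = np.zeros(n_cases + n_controls, dtype=bool)
    labels[:n_cases] = True
    return rng.permutation(labels)


rng = np.random.default_rng(0)
# Hypothetical sizes: n1 = 30 case locations, n0 = 113 control locations.
simulated_is_case = random_labeling(30, 113, rng)
print(simulated_is_case.sum(), simulated_is_case.size)
```

The locations themselves never change; only which of them are called cases does, so any spatial pattern shared by cases and controls (such as population density) is preserved in every simulated data set.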


Operationalizing H0:

Regional count data: constant risk hypothesis.

Each individual is subject to the same risk.

Expected count = (risk) * (population size).

Variable total: Poisson counts.

Fixed total: multinomial counts.

[Figure: three example maps of regional case counts]
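The constant risk calculation and the two simulation schemes can be sketched as below. The populations and observed counts are hypothetical, and the overall risk is estimated from the data (total cases divided by total population), which is one common choice rather than the only one.

```python
import numpy as np

# Hypothetical regional populations and observed case counts.
population = np.array([1200, 800, 3000, 500, 1500])
observed = np.array([4, 1, 2, 1, 2])

# Constant risk: every individual has the same risk, estimated here from the data.
risk = observed.sum() / population.sum()
expected = risk * population  # expected count = (risk) * (population size)

rng = np.random.default_rng(1)

# Variable total: independent Poisson counts with the expected counts as means.
poisson_sim = rng.poisson(expected)

# Fixed total: multinomial counts, conditioning on the observed total number of cases.
multinomial_sim = rng.multinomial(observed.sum(), population / population.sum())

print(expected.round(2), poisson_sim, multinomial_sim)
```

The multinomial draw always reproduces the observed total; the Poisson draw lets the total vary from one simulated data set to the next.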


H0 drives type of test

Random labeling: often compare the observed spatial intensities (expected number of events per unit area) of cases and controls.

Constant risk: compare observed counts to expected counts (goodness of fit).


What deviation from H0?

Tests of clustering: check tendency for cases to occur in clusters.

Tests to detect clusters: find most likely cluster(s).

General tests: detect clusters or clustering anywhere.

Focused tests: detect clusters or clustering around suspected foci.

Besag, J. and Newell, J. (1991) “The detection of clusters in rare diseases”. Journal of the Royal Statistical Society, Series A 154, 143-155.


How weird? (Monte Carlo test)

Random labeling/constant risk: simulate data sets under H0.

For any test statistic, calculate its value in the observed data, Tobs.

Simulate many data sets under H0, and calculate the test statistic for each (T1, T2, …, Tnumsim).

p-value = proportion of test statistics from simulated data sets exceeding Tobs (fraction of T’s > Tobs).
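The Monte Carlo recipe can be written generically, with the H0 simulator and the test statistic passed in as functions. This is a sketch under assumptions: the function name `monte_carlo_p_value`, the example counts, and the choice of "largest regional count" as the statistic are all illustrative.

```python
from typing import Callable

import numpy as np


def monte_carlo_p_value(
    t_obs: float,
    simulate: Callable[[np.random.Generator], np.ndarray],
    statistic: Callable[[np.ndarray], float],
    num_sim: int,
    rng: np.random.Generator,
) -> float:
    """Fraction of simulated test statistics that exceed the observed one."""
    t_sim = np.array([statistic(simulate(rng)) for _ in range(num_sim)])
    return float(np.mean(t_sim > t_obs))


# Hypothetical example: 10 cases over 5 equal-population regions.
observed = np.array([6, 1, 1, 1, 1])
probs = np.full(5, 0.2)


def simulate_constant_risk(rng: np.random.Generator) -> np.ndarray:
    # Fixed total, equal populations: multinomial counts under constant risk.
    return rng.multinomial(observed.sum(), probs)


p_value = monte_carlo_p_value(
    observed.max(), simulate_constant_risk, np.max, 999, np.random.default_rng(42)
)
print(p_value)
```

A commonly used variant counts the observed value as one of the simulations and uses ties, (1 + number of T's >= Tobs) / (numsim + 1), which avoids reporting a p-value of exactly zero.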


Example: Regional Counts

Comparing observed to expected.

Pearson’s chi-square statistic:

$X^2 = \sum_i (O_i - E_i)^2 / E_i$

But $X^2$ ignores the location of the lack of fit.
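A small sketch of Pearson's statistic, with hypothetical counts, showing why it cannot see location: rearranging which region holds which count leaves the value unchanged.

```python
import numpy as np


def pearson_chi_square(observed, expected) -> float:
    """Sum over regions of (O_i - E_i)^2 / E_i."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(expected, dtype=float)
    return float(np.sum((o - e) ** 2 / e))


# Hypothetical counts for five regions, each with expected count 2.
expected = [2, 2, 2, 2, 2]
original = [4, 1, 2, 1, 2]
rearranged = [2, 1, 4, 2, 1]  # same counts assigned to different regions

print(pearson_chi_square(original, expected))
print(pearson_chi_square(rearranged, expected))
```

Both calls give the same value, whether the high-count regions are neighbors or far apart.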


Spatial goodness-of-fit

Instead of squaring (Oi – Ei), what if we link (Oi – Ei) and (Ok – Ek) by the proximity of regions i and k?

Say, $\sum_i \sum_k w_{ik} (O_i - E_i)(O_k - E_k)$, where $w_{ik}$ gives the link between regions i and k.

This (essentially) gives Tango’s index of clustering.

Tango, T. (1990) An index for cancer clustering. Environmental Health Perspectives 87, 157-162.
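The weighted cross-product sum can be sketched as a quadratic form. This is an illustration of the idea, not Tango's published index: the exponential-decay weights, the scale parameter `tau`, the function name, and the data are all assumptions made for the example.

```python
import numpy as np


def spatial_fit_index(observed, expected, coords, tau: float) -> float:
    """Sum over i, k of w_ik (O_i - E_i)(O_k - E_k), with the assumed
    weights w_ik = exp(-d_ik / tau), d_ik = distance between regions i and k."""
    resid = np.asarray(observed, dtype=float) - np.asarray(expected, dtype=float)
    xy = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    weights = np.exp(-dist / tau)
    return float(resid @ weights @ resid)


# Four hypothetical regions in a row, each with expected count 2.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
expected = [2, 2, 2, 2]
clustered = spatial_fit_index([4, 4, 0, 0], expected, coords, tau=1.0)
alternating = spatial_fit_index([4, 0, 4, 0], expected, coords, tau=1.0)
print(clustered, alternating)
```

Both arrangements have identical Pearson chi-square values, but the index is larger when the two excess regions sit next to each other, because nearby residuals of the same sign reinforce one another.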


Finding spatial clusters?

Spatial scan statistic (SaTScan)
• Scan on windows with distance radii.

Turnbull et al.’s Cluster Evaluation Permutation Procedure (CEPP)
• Scan on window of constant population size (e.g., 10,000 people at risk).

Besag and Newell’s approach
• Scan on window of constant number of cases (e.g., 10 cases).

All seek the collection least consistent with H0.
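A rough sketch of the first approach, scanning distance-based windows over regional counts and scoring each with a Poisson log likelihood ratio in the style of the spatial scan statistic. It is not SaTScan: the function names, the nearest-neighbor construction of windows around region centroids, the 50% cap on window size, and the data are all assumptions for illustration.

```python
import numpy as np


def poisson_llr(c: float, e: float, total: float) -> float:
    """Log likelihood ratio for a window with c observed and e expected cases
    (expected counts scaled to sum to `total`); zero unless c exceeds e."""
    if c <= e:
        return 0.0
    inside = c * np.log(c / e)
    outside = 0.0 if c >= total else (total - c) * np.log((total - c) / (total - e))
    return float(inside + outside)


def most_likely_cluster(observed, expected, coords, max_fraction: float = 0.5):
    """Grow a window around each region by adding nearest neighbors one at a
    time; return (largest LLR, sorted region indices of that window)."""
    o = np.asarray(observed, dtype=float)
    xy = np.asarray(coords, dtype=float)
    total = o.sum()
    e = np.asarray(expected, dtype=float)
    e = e * total / e.sum()  # rescale so expected and observed totals match
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    best_llr, best_members = 0.0, ()
    for center in range(len(o)):
        order = np.argsort(dist[center], kind="stable")
        c_in = e_in = 0.0
        for k, region in enumerate(order):
            c_in += o[region]
            e_in += e[region]
            if e_in > max_fraction * total:
                break  # window would cover too much of the study area
            llr = poisson_llr(c_in, e_in, total)
            if llr > best_llr:
                best_llr = llr
                best_members = tuple(sorted(int(j) for j in order[: k + 1]))
    return best_llr, best_members


# Hypothetical data: five regions along a line, equal expected counts.
coords = [(0.0, 0.0), (1.0, 0.0), (2.2, 0.0), (3.0, 0.0), (4.5, 0.0)]
llr, members = most_likely_cluster([1, 1, 8, 9, 1], [4, 4, 4, 4, 4], coords)
print(llr, members)
```

The returned window is only the most likely cluster; whether it is significant would still be judged by recomputing the maximum LLR on data sets simulated under H0.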


New York Leukemia

592 cases 1978-1982, 8 counties, 790 census regions, ~ 1 million people.


Example: case/control point data

Kelsall and Diggle (1995)
• Compare ratio of case intensity to control intensity.
• Random labeling simulations.
• Identify locations where case intensity significantly exceeds control intensity (pointwise test of significance).

Approach to detect clusters.

Kelsall, J.E. and Diggle, P.J. (1995) Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine 14, 2335-2342.
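The intensity comparison can be sketched with a simple kernel estimate. This is an illustration only: the Gaussian kernel, the shared bandwidth, the normalization by sample size, the function names, and the point locations are assumptions, and the sketch omits edge correction and the random labeling significance step.

```python
import numpy as np


def kernel_intensity(points, grid, bandwidth: float) -> np.ndarray:
    """Gaussian kernel estimate of intensity (events per unit area) at each grid location."""
    pts = np.asarray(points, dtype=float)
    g = np.asarray(grid, dtype=float)
    sq_dist = ((g[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dist / (2 * bandwidth**2)) / (2 * np.pi * bandwidth**2)
    return kernel.sum(axis=1)


def log_relative_risk(cases, controls, grid, bandwidth: float) -> np.ndarray:
    """Log ratio of the case density to the control density at each grid location."""
    f = kernel_intensity(cases, grid, bandwidth) / len(cases)
    g = kernel_intensity(controls, grid, bandwidth) / len(controls)
    return np.log(f) - np.log(g)


# Hypothetical locations: cases concentrated near (1, 1), controls spread out.
cases = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
controls = [(1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
grid = [(1.0, 1.0), (9.0, 9.0)]
lrr = log_relative_risk(cases, controls, grid, bandwidth=1.0)
print(lrr)
```

Positive values mark locations where cases are relatively more concentrated than controls. Pointwise significance would come from recomputing the surface after randomly relabeling cases and controls many times.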


Archeology data

Alt and Vach (1991)
• 143 grave sites, 30 with affected teeth (“cases”)
• Question: families buried together?
• Tested question: Do gravesites with affected teeth cluster?

Alt, K.W., and Vach, W. (1991) “The reconstruction of ‘genetic kinship’ in prehistoric burial complexes – problems and statistics”. In Classification, Data Analysis, and Knowledge Organization: Models and Methods with Applications. H.-H. Bock and P. Ihm (eds.) Springer: Berlin.


Map


Case and control intensities

[Figure: two panels of estimated intensity surfaces with site locations overlaid. Panel “Affected, bw = 500”: intensity f, affected sites marked *. Panel “Non-affected, bw = 500”: intensity g, non-affected sites marked o. Axis ticks on both panels run from 4000 to 10000.]


Relative risk surface

[Figure: “Relative risk surface, bw = 500”: surface r with affected sites marked * and non-affected sites marked o. Axis ticks run from 4000 to 8000.]


Spatial scan statistic

Most likely cluster (p-value = 0.067)


Important ideas

• What question do I want to answer?
• What data can I get?
• What statistical method will I use?
• What question can I answer with the data I have and the method?
• Does this match my first question?


Additional important ideas

Results depend on data structure (MAUP).

Every test involves a specific definition of “cluster”… ask yourself:

What data result from H0 (the model of “no clustering”)?

• Can you simulate data from H0?

What constitutes evidence against H0 (the model of “clustering”)?

• Do your data appear consistent with H0?


Reading list

Besag, J. and Newell, J. (1991) The detection of clusters in rare diseases. Journal of the Royal Statistical Society, Series A 154, 143-155.

Kelsall, J.E. and Diggle, P.J. (1995) Non-parametric estimation of spatial variation in relative risk. Statistics in Medicine 14, 2335-2342.

Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics-Theory and Methods 26, 1481-1496.

Neutra, R.R. (1990) Counterpoint from a cluster buster. American Journal of Epidemiology 132, 1-8.

Rothman, K. (1990) A sobering start for the cluster busters’ conference. American Journal of Epidemiology 132 (Supplement), S6-S13.

Snow, J. (1946) Snow on Cholera. Oxford University Press.

Tango, T. (1990) An index for cancer clustering. Environmental Health Perspectives 87, 157-162.

Turnbull, B.W., Iwano, E.J., Burnett, W.S., Howe, H.L., and Clark, L.C. (1990) Monitoring for clusters of disease: application to leukemia incidence in upstate New York. American Journal of Epidemiology 132 (Supplement), S136-S143.

Wallenstein, S. (1980) A test for detection of clustering over time. American Journal of Epidemiology 111, 367-372.

Waller, L.A. and Jacquez, G.M. (1995) Disease models implicit in statistical tests of disease clustering. Epidemiology 6, 584-590.

Waller, L.A. (2002) Methods for detecting disease clustering in time or space. In Statistical Methods and Principles in Public Health Surveillance. R. Brookmeyer and D. Stroup (eds). Oxford University Press.