

  • 1

    10.05.2004 EPIET Workshop Bordeaux, May 2004
    Intro analysis of surveillance data, EPISOUTH, Madrid, September 2007

    Spatial Analysis of Surveillance Data

    Fernando Simón, Francisco Luquero, Victor Flores, Denis Coulombier

    Early Neonatal Mortality

    Cartogram: Malaria. N. cases

  • 2

    Purpose of spatial analysis

    Describe spatial distribution of data
    – Counts, rates, RR …
    – Mapping

    Identify spatial association of cases
    – Identify clusters, OB …
    – Analysis of point processes

    Estimating
    – Counts, rates, RR …
    – Geostatistics

    Descriptive Analysis

    Dot-density maps for count of cases
    Administrative area maps for rates
    – Choice of administrative areas
    – Rates to account for population
    – Standardised rates to account for population structure
    “Isorate” maps for sentinel surveillance
    GIS when case coordinates available

    Dot Density Map

    Notification of Tuberculosis in France, 1996
    4-Week Period Ending 31/12/1996

  • 3

    Spatial Distribution of Polio Cases
    Albania, April-September 1996

    (Map panels: April, May, June, July, August, September)

    Descriptive Analysis: Place and Rates

    Count of cases does not represent risk
    Administrative areas have different populations
    Population may vary over time
    – Seasons
    – Population influx (refugees)

    Rates allow risks to be compared
    Choice of administrative areas (problem of small numbers of cases)
    Choice of ranges
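The point about counts versus rates can be illustrated with a minimal sketch. The two areas and their figures below are hypothetical, not taken from the slides; the function name `rate_per_100k` is likewise only illustrative.

```python
def rate_per_100k(cases, population):
    """Crude rate per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


# Hypothetical areas as (cases, population); not data from the slides.
areas = {
    "A": (50, 1_000_000),
    "B": (50, 100_000),
}

rates = {name: rate_per_100k(cases, pop) for name, (cases, pop) in areas.items()}

for name, rate in rates.items():
    cases = areas[name][0]
    print(f"Area {name}: {cases} cases, {rate:.1f} per 100,000")
```

Both areas report the same count, yet the rate in B is ten times the rate in A, which is why a map of counts alone can mislead.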

    Notification Rate of Tuberculosis in France, 1996

    Cases/100,000

    Distribution of cases of PERTUSSIS
    Lebanon, as of week 2003-15

    (Map)

    Cases/100,000/Y
    0.178 – 0.554
    0.555 – 0.872
    0.873 – 1.741
    1.742 – 3.554
    No report

  • 4

    Choosing map data

    If we map number of cases instead of rates:

    Is the disease risk in A really similar to B?

    Misleading because underlying population may be greater in A

    A B

    Use of Standardised Rates

    (Diagram: age structure linked to both disease and place)

    Population structure varies across places, independently of disease

    Disease occurrence varies across ages, independently of place

    Confounding
    Age, independently related to disease and to location

    Use of Standardised Rates

    Direct standardisation

    Indirect standardisation

    Value of rate affected by the reference population

    For comparison only
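As a rough illustration of the two approaches, the sketch below computes a directly standardised rate and an indirectly standardised ratio (observed over expected cases). All age groups, counts, populations and reference rates are hypothetical, and the function names are assumptions made for this example.

```python
# Hypothetical age-specific data for one area: (cases, population).
area = {"0-14": (10, 20_000), "15-64": (30, 60_000), "65+": (60, 20_000)}

# Hypothetical reference population sizes and reference rates per age group.
reference_pop = {"0-14": 30_000, "15-64": 60_000, "65+": 10_000}
reference_rate = {"0-14": 0.0004, "15-64": 0.0005, "65+": 0.0020}


def crude_rate(area):
    """Total cases divided by total population."""
    cases = sum(c for c, _ in area.values())
    population = sum(p for _, p in area.values())
    return cases / population


def direct_standardised_rate(area, reference_pop):
    """Apply the area's age-specific rates to the reference population."""
    expected = sum(c / p * reference_pop[age] for age, (c, p) in area.items())
    return expected / sum(reference_pop.values())


def indirect_standardised_ratio(area, reference_rate):
    """Observed cases divided by cases expected under the reference rates."""
    observed = sum(c for c, _ in area.values())
    expected = sum(reference_rate[age] * p for age, (_, p) in area.items())
    return observed / expected


crude = crude_rate(area) * 100_000
direct = direct_standardised_rate(area, reference_pop) * 100_000
ratio = indirect_standardised_ratio(area, reference_rate)
print(f"crude {crude:.1f}, direct {direct:.1f} per 100,000, ratio {ratio:.2f}")
```

With these made-up figures the crude rate is 100 per 100,000 and the directly standardised rate is 75 per 100,000. A different reference population would give a different standardised value, which is why such rates serve for comparison only.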

    Distribution of Death by Falls by Province, Canada, 1998

    Age Standardized Rate per 100,000
    Crude death rate per 100,000

  • 5

    General mortality, 1995-1997

    Not smoothed

    Smoothed

    BoxMap

    Equal interval
    Equal n. of records
    Natural breaks

    Choropleths: colors or patterns

    Choropleth Maps

    Natural break

    The story we tell depends upon how we choose to create the legend

    Equal ranges

    Equal counts

    Choice of Data Breakdown in Classes

    Equal area
    Equal interval
    Natural breaks
    Quartiles
    Mean / St. Dev.

    (Figure: class breaks by method. Axis ticks: 0, 5, 10, 15, 20. Scale values: 0.378, 7.400, 14.423, 21.445, 28.467)
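To make the effect of the classification method concrete, here is a small sketch comparing equal-interval breaks with quartile breaks on a hypothetical, skewed set of rates. The data and function names are assumptions for illustration only. The scale values shown on the slide (0.378 to 28.467) appear consistent with four equal-width classes, which the last lines check.

```python
import statistics


def equal_interval_breaks(lowest, highest, n_classes):
    """Class boundaries of equal width between lowest and highest."""
    width = (highest - lowest) / n_classes
    return [lowest + i * width for i in range(n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Interior cut points giving roughly equal numbers of records per class."""
    return statistics.quantiles(values, n=n_classes)


# Hypothetical skewed rates per area; not data from the slides.
rates = [0.4, 0.9, 1.2, 1.5, 2.1, 2.8, 3.3, 4.0, 6.5, 9.7, 15.2, 28.5]

interval = equal_interval_breaks(min(rates), max(rates), 4)
quartile = quantile_breaks(rates, 4)

print("equal interval:", [round(b, 3) for b in interval])
print("quartile cuts: ", [round(b, 3) for b in quartile])

# Four equal-width classes over the range shown on the slide's scale.
slide_scale = equal_interval_breaks(0.378, 28.467, 4)
print("slide range:   ", [round(b, 2) for b in slide_scale])
```

On skewed data the equal-interval scheme puts most areas in the lowest class, while quartiles spread the areas evenly across classes, so the same data can produce quite different maps.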

  • 6

    Testing Hypotheses about Place

    Remove confounding (standardisation)
    Detection of clusters
    – Unexpected events: dot-maps
      • Test for spatial correlation by nearest neighbour
    – Events with baseline historical data
      • Test for spatial correlation by contiguity analysis

    Risk factor identification
    – Overlaying exposure and outcome
    – Test for cross-correlation

    Distribution of cases of Botulism, France, Week 42-45, 2000

    Testing for Contiguities: Grimson Method

    Sudden Infant Death Syndrome by County, North Carolina, 1974-1978

    Death per 100000
    0.000000-0.001250
    0.001251-0.002051
    0.002052-0.002592
    0.002593-0.006114

    Observed contiguity in high risk counties: 24
    Expected contiguity in high risk counties: 16.3
    Contiguity standard deviation: 3.46
    z statistic: 2.07, p=0.038
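The relation between the z statistic and the p-value quoted for this example can be sketched as follows, assuming a two-sided test against the standard normal distribution (the slide does not say which test was used). Note that the plain ratio (observed − expected) / SD computed from the rounded figures shown comes out slightly above the reported 2.07; the slide does not state how its z was obtained, so both are printed.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Figures quoted on the slide.
observed, expected, sd = 24, 16.3, 3.46
reported_z = 2.07

# Plain standardised difference from the rounded figures (an assumption
# about the formula; the slide does not give it).
plain_z = (observed - expected) / sd

print(f"plain z    = {plain_z:.2f}, two-sided p = {two_sided_p(plain_z):.3f}")
print(f"reported z = {reported_z:.2f}, two-sided p = {two_sided_p(reported_z):.3f}")
```

A z of 2.07 corresponds to a two-sided p of about 0.038, matching the value on the slide.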

    Point Processes

  • 7

    Why identify geographic patterns?

    Geographic patterns range from completely clustered to completely dispersed. A pattern between these extremes is said to be random.

    Knowing there is a pattern in your data is useful if you need to gain a better understanding of a geographic phenomenon

    Spatial Processes: questions

    Is there any systematic pattern, or are my data distributed at random?

    Possibilities: clustering, regularity

    Scale of the clustering

    The pattern is due to:

    • natural variation in the population
    • obvious a priori heterogeneity
    • association with proximity to other features of interest

    Are events that aggregate in space also clustered in time?

    Is this clustered?

    YES!

    Is this clustered?

    NO:

    It's regularly dispersed

  • 8

    Is this clustered?

    Maybe: (Completely Spatially Random)

    Point processes

    N(A): number of events in region A

    The process is stationary if the joint distribution of N(A) is invariant to translation by an arbitrary amount x

    The process is isotropic if the joint distribution of N(A) is invariant to rotation through an arbitrary angle

    Point processes

    Theorem 2: For a homogeneous planar Poisson process

    Point processes

  • 9

    Point processes

    Point processes

    RANDOM UNIFORM CLUSTERED
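One common way to tell these three patterns apart numerically is a nearest-neighbour index in the style of Clark and Evans: the observed mean nearest-neighbour distance divided by the value expected under complete spatial randomness, 0.5 / sqrt(density). The slides do not name this index, so treat the sketch below as one possible illustration. The point sets are simulated on a unit square, and no edge correction is applied.

```python
import math
import random


def mean_nearest_neighbour_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def nearest_neighbour_ratio(points, area=1.0):
    """Observed mean NN distance over its expectation under randomness."""
    density = len(points) / area
    expected = 0.5 / math.sqrt(density)
    return mean_nearest_neighbour_distance(points) / expected


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simulated patterns on the unit square (hypothetical data).
random_pts = [(rng.random(), rng.random()) for _ in range(200)]
uniform_pts = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
centres = [(rng.random(), rng.random()) for _ in range(5)]
clustered_pts = [
    (cx + rng.gauss(0, 0.02), cy + rng.gauss(0, 0.02))
    for cx, cy in centres
    for _ in range(40)
]

ratios = {
    "random": nearest_neighbour_ratio(random_pts),
    "uniform": nearest_neighbour_ratio(uniform_pts),
    "clustered": nearest_neighbour_ratio(clustered_pts),
}
for name, value in ratios.items():
    print(f"{name:9s} ratio = {value:.2f}")
```

A ratio near 1 suggests a random pattern, a ratio well above 1 a regular (uniform) one, and a ratio well below 1 clustering.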

    Global and Local Tests

    Cluster detection methods

    Global (first-order) tests detect the presence or absence of clustering over the whole study region without specifying the spatial location.

    Local (second-order) tests additionally specify the location and, if extended to consider temporal patterns, can specify spatio-temporal clusters.

    A special case of local tests is the focused test, which is used to detect raised incidence of disease around some pre-specified source, such as an incinerator.

  • 10

    Choosing a method

    There are two critical aspects: statistical power and confounding.

    Methods that can control for known confounding effects should be used in the first instance.

    Statistical power is the ability to detect a real effect.

    The literature describes both the ability of methods to identify true clusters (true positives) and the frequency with which they report clusters falsely (false positives).

    Comparative evaluations of statistical power, often made by running competing cluster methods against a set of simulated data with known properties, can provide guidance in the choice and application of particular methods.

    Confounding arises when a factor related to both an exposure and a disease outcome leads to the erroneous attribution of a disease cluster. Examples:
    – Change in background population density
    – Demographic factors such as age, gender or ethnicity
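The idea of evaluating power on simulated data with known properties can be sketched as below. The test used here is a one-sided nearest-neighbour z test in the style of Clark and Evans, chosen only for illustration and not a method named in the slides. The simulation settings (50 points, 200 runs, 5 cluster centres, spread 0.03) are arbitrary assumptions, and no edge correction is applied, which a real analysis would need to consider.

```python
import math
import random


def mean_nn_distance(points):
    """Average distance from each point to its closest other point."""
    return sum(
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i)
        for i, (xi, yi) in enumerate(points)
    ) / len(points)


def nearest_neighbour_z(points, area=1.0):
    """z statistic: negative values indicate clustering."""
    n = len(points)
    density = n / area
    expected = 0.5 / math.sqrt(density)
    standard_error = 0.26136 / math.sqrt(n * density)
    return (mean_nn_distance(points) - expected) / standard_error


def simulate_random(rng, n):
    """Points placed independently and uniformly on the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def simulate_clustered(rng, n, n_parents=5, spread=0.03):
    """Points scattered around a few randomly placed parent locations."""
    parents = [(rng.random(), rng.random()) for _ in range(n_parents)]
    return [
        (px + rng.gauss(0, spread), py + rng.gauss(0, spread))
        for px, py in (rng.choice(parents) for _ in range(n))
    ]


def rejection_rate(simulate, rng, n_points=50, n_sims=200, z_crit=-1.645):
    """Share of simulated data sets in which the test declares clustering."""
    rejections = sum(
        nearest_neighbour_z(simulate(rng, n_points)) < z_crit
        for _ in range(n_sims)
    )
    return rejections / n_sims


rng = random.Random(1)  # fixed seed for reproducibility
false_positive_rate = rejection_rate(simulate_random, rng)
power = rejection_rate(simulate_clustered, rng)
print(f"false positive rate = {false_positive_rate:.3f}, power = {power:.3f}")
```

The rejection rate on truly random data estimates the false positive rate, and the rejection rate on truly clustered data estimates the power against that particular kind of clustering.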

    Maps for Sentinel Systems

    Incidence of diarrhea in France, 1995

    Cases / 100,000 population

    Source: Réseau National Télématique des Maladies Transmissibles

    Distribution of Syndromic Influenza, France, Week 1-19, 2003

    Interpretation of Significant Tests

    The role of artefacts
    – Errors…

    The role of confounding
    – Rates (time)
    – Standardised rates (place)

    The role of chance
    – Statistical testing (place dependency)

    True disease pattern

  • 11

    George W. Comstock

    The art of epidemiological thinking is to draw conclusions from imperfect data

    Epi Map Mapping

    Shapefiles + Data file (.mdb)

    Slides based on a previous presentation by Paolo D’Ancona for EPIET

    Epi Map

    GIS component of Epi-Info
    Programmed with MapObjects language (Shapefiles), developed by ESRI (makers of ArcGIS® and ArcView)
    Compatible with ArcView 3.x and ArcGIS

    Shapefiles

    GIS data set
    – Data represented by coordinates
    – Point, line, area (polygon)
    – Attributes stored in separate dbf file
      • Names and other information

  • 12

    Structure of a simple dataset

    Main file: xy.shp
    Index file: xy.shx
    dBASE file or Access: xy.dbf or xy.mdb

    The project file
    – In ArcView: xy.apr
    – In Epi Map: xy.map
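Because a shapefile dataset is spread over several files sharing one base name, a quick check that the main, index and dBASE files are all present can save time before loading a map. The helper below is a hypothetical convenience written for this example, not part of Epi Map or any ESRI tool, and it checks only the three files listed above.

```python
from pathlib import Path

# The three component files listed on the slide.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(base):
    """Return the names of required component files that do not exist.

    `base` is the dataset path without extension, for example "xy".
    """
    base = Path(base)
    return [
        base.with_suffix(suffix).name
        for suffix in REQUIRED_SUFFIXES
        if not base.with_suffix(suffix).exists()
    ]


# Example call; the result depends on the files in the working directory.
print(missing_shapefile_parts("xy"))
```

An empty list means all three files were found next to each other under the same base name.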

    ESRI shapefile
    EPI MAP

    Europe.shp Polygons

    Europe.dbf Attributes (names)

    Europe.shx File Structure

    Greece

    France

  • 13

    ESRI shapefile & Access data table
    EPI MAP

    Europe.shp Polygons

    Europe.dbf Attributes

    Europe.shx File Structure

    Attributes to match
    Count variables
    Other Information (Size, Pop...)

    Greece

    France

  • 14

    Data file (.mdb)

    Contains
    – Geographical variable
      • To link data to specific features in shapefile
      • Unique relationship
    – At least one numeric variable
      • Disease count, rate, etc.

    Individual data
    – Must be processed to produce a summary file
    – Only one record per geographic entity
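The step from individual records to a summary file with one record per geographic entity can be sketched as follows. The line list, the field names `case_id`, `country` and `cases`, and the helper `build_summary` are hypothetical; the country names echo the Greece and France examples used in the shapefile slides.

```python
from collections import Counter

# Hypothetical individual case records (one row per case).
line_list = [
    {"case_id": 1, "country": "France"},
    {"case_id": 2, "country": "Greece"},
    {"case_id": 3, "country": "France"},
]

# Names of the features in the shapefile attribute table (assumed).
feature_names = ["France", "Greece"]


def build_summary(records, key, feature_names):
    """Aggregate records to exactly one row per shapefile feature."""
    counts = Counter(record[key] for record in records)
    unmatched = sorted(set(counts) - set(feature_names))
    if unmatched:
        # Every value must match a feature, or the link to the map fails.
        raise ValueError(f"no matching map feature for: {unmatched}")
    return [{key: name, "cases": counts.get(name, 0)} for name in feature_names]


summary = build_summary(line_list, "country", feature_names)
for row in summary:
    print(row)
```

Raising an error on unmatched names is a design choice for this sketch: it surfaces spelling differences between the data file and the shapefile before they silently drop cases from the map.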

    = Epi-Info Map file

    Dot density
    Choropleth