Designing multi-objective surveillance systems
Samuel V. Scarpino Omidyar Fellow Santa Fe Institute
[email protected] @svscarpino scarpino.github.io
Where will the next pandemic emerge?
Where will the next influenza pandemic emerge?
What factors are important for influenza?
Air travel
Density of pig farms
Worldwide pig density
and it happened again with Ebola
Knobloch et al. 1982
Disease surveillance stands at a crossroads
Four Step Program
1. Formalize the public health objectives
2. Specify candidate data sources & acquire historical data
3. Simulate data where missing
4. Select the most informative data sources and validate!
Scarpino, Dimitrov, Meyers 2012
1. Surveillance goal - influenza in Texas
2. Data - ILINet, Hospitalizations, & Google Flu Trends (G.F.T.)
3. Simulate data
1. Obtain a hospitalization time series.
2. Integrate over published estimates for the hospitalization rate of influenza.
3. Account for noise.
4. Account for imperfect reporting.
$$\mathrm{ILI} = C_1 \times \mathrm{PerfectInformation} + C_0 + N(0, \sigma^2)$$
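The simulation step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_provider_ili`, the per-week reporting probability `report_prob`, and the bell-shaped example signal are all hypothetical and do not come from the slides.

```python
import numpy as np

def simulate_provider_ili(perfect_information, c1, c0, sigma, report_prob, rng):
    """Simulate one provider's ILI series.

    Implements ILI = c1 * perfect_information + c0 + N(0, sigma^2).
    Assumption: each week is reported independently with probability
    report_prob; unreported weeks are returned as NaN.
    """
    perfect_information = np.asarray(perfect_information, dtype=float)
    # Observational noise around the scaled, shifted signal.
    noise = rng.normal(0.0, sigma, size=perfect_information.shape)
    ili = c1 * perfect_information + c0 + noise
    # Imperfect reporting: mask weeks the provider did not report.
    reported = rng.random(perfect_information.shape) < report_prob
    return np.where(reported, ili, np.nan)

rng = np.random.default_rng(0)
weeks = np.arange(52)
# Hypothetical seasonal signal standing in for a hospitalization-derived series.
signal = 100.0 * np.exp(-0.5 * ((weeks - 26) / 6.0) ** 2)
series = simulate_provider_ili(signal, c1=0.3, c0=2.0, sigma=1.5,
                               report_prob=0.8, rng=rng)
```

Setting `sigma=0` and `report_prob=1` recovers the noiseless linear map, which is a convenient sanity check.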
3. Simulate data - provider behavior
Reporting Non-Reporting
3. Simulate data - empirical reporting rates
4. Select providers
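$$R^2(\mathrm{Hosp}, S, \hat{\xi}) = \frac{\mathrm{Var}(\mathrm{Hosp}) - \mathrm{Var}\left(\mathrm{Hosp} - \sum_{i \in S} \alpha_i P_i(\hat{\xi})\right)}{\mathrm{Var}(\mathrm{Hosp})}$$

$$\max_{S \subseteq P} \; E_{\xi}\left[ R^2(\mathrm{Hosp}, S, \hat{\xi}) \right]$$

$\mathrm{Hosp}$ = goal time series (state-wide hospitalizations)
$S$ = subset of selected providers
$\hat{\xi}$ = observational noise and reporting
$\alpha_i$ = regression coefficients
$P$ = mock provider pool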
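One way to approach the subset maximization is greedy forward selection, the strategy named in the later slides. The sketch below is an assumption-laden illustration, not the authors' implementation: `r_squared` and `greedy_select` are hypothetical names, the regression has no intercept (matching the formula above), and the data are synthetic.

```python
import numpy as np

def r_squared(hosp, providers):
    """R^2 = (Var(hosp) - Var(hosp - providers @ alpha)) / Var(hosp),
    with alpha fit by ordinary least squares (no intercept)."""
    alpha, *_ = np.linalg.lstsq(providers, hosp, rcond=None)
    residual = hosp - providers @ alpha
    return 1.0 - np.var(residual) / np.var(hosp)

def greedy_select(hosp, pool, k):
    """Greedy forward selection: repeatedly add the provider (column of
    pool) that gives the largest R^2 together with those already chosen."""
    selected, scores = [], []
    for _ in range(k):
        best_i, best_score = None, -np.inf
        for i in range(pool.shape[1]):
            if i in selected:
                continue
            score = r_squared(hosp, pool[:, selected + [i]])
            if score > best_score:
                best_i, best_score = i, score
        selected.append(best_i)
        scores.append(best_score)
    return selected, scores

rng = np.random.default_rng(42)
pool = rng.normal(size=(200, 10))        # synthetic mock provider pool
hosp = 2.0 * pool[:, 3] + pool[:, 7]     # synthetic goal time series
selected, scores = greedy_select(hosp, pool, k=2)
```

Greedy selection does not guarantee the optimal subset in general; it is used here because it is cheap and easy to read.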
4. Select providers - account for provider behavior
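$$R^2(\mathrm{Hosp}, S, \hat{\xi}) = \frac{\mathrm{Var}(\mathrm{Hosp}) - \mathrm{Var}\left(\mathrm{Hosp} - \sum_{i \in S} \alpha_i P_i(\hat{\xi})\right)}{\mathrm{Var}(\mathrm{Hosp})}$$

$$\max_{S \subseteq P} \; E_{\xi}\left[ R^2(\mathrm{Hosp}, S, \hat{\xi}) \right]$$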
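The expectation over noise and reporting can be approximated by Monte Carlo. The sketch below assumes, purely for illustration, that each draw adds Gaussian noise to every selected provider and that a provider either reports for the whole period or not at all, with a per-provider probability `report_prob`. The name `expected_r_squared` and these modeling choices are hypothetical, not taken from the slides.

```python
import numpy as np

def r_squared(hosp, providers):
    """R^2 of the least-squares fit of hosp on the provider columns."""
    alpha, *_ = np.linalg.lstsq(providers, hosp, rcond=None)
    residual = hosp - providers @ alpha
    return 1.0 - np.var(residual) / np.var(hosp)

def expected_r_squared(hosp, pool, subset, report_prob, sigma, n_draws, rng):
    """Monte Carlo estimate of the expected R^2 for a provider subset."""
    report_prob = np.asarray(report_prob, dtype=float)
    scores = []
    for _ in range(n_draws):
        # One draw of observational noise for the selected providers.
        noise = rng.normal(0.0, sigma, size=(len(hosp), len(subset)))
        observed = pool[:, subset] + noise
        # One draw of provider behavior: who reports this time (assumption).
        reporting = rng.random(len(subset)) < report_prob[subset]
        if not reporting.any():
            scores.append(0.0)  # no data, so no variance explained
            continue
        scores.append(r_squared(hosp, observed[:, reporting]))
    return float(np.mean(scores))

rng = np.random.default_rng(1)
pool = rng.normal(size=(100, 4))         # synthetic mock provider pool
hosp = pool[:, 0] + pool[:, 1]           # synthetic goal time series
estimate = expected_r_squared(hosp, pool, [0, 1],
                              report_prob=np.full(4, 0.7),
                              sigma=0.5, n_draws=200, rng=rng)
```

Under this scoring, a provider that is informative but rarely reports contributes less than its noiseless R^2 would suggest.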
An optimized network is more informative
[Figure: performance (R^2, 0.0 to 1.0) versus number of providers (50 to 200). Legend: Cover, CDC - Greedy, CDC - Random, RSquare.]
Scarpino et al. 2012
An optimized network is more efficient
Obs. ILINet – 82 Providers – R^2 ~ 0.6
RSquare – 38 Providers – R^2 ~ 0.6
Scarpino et al. 2012
But what about pandemic influenza?
Situational Awareness
Early Detection
The RSquare network doesn’t detect early cases
[Figure: early influenza cases detected (0 to 900) versus number of providers (0 to 200). Legend: Situational Awareness, Early Detection.]
But a mixed network can!
[Figure: early influenza cases detected (0 to 900) versus number of providers (0 to 200). Legend: Mixed, Situational Awareness, Early Detection.]
And we can benchmark ILINet
[Figure: early influenza cases detected (0 to 900) versus number of providers (0 to 200). Legend: Mixed, Situational Awareness, Early Detection, ILINet (obs.).]
And the results hold for situational awareness
[Figure: proportion of variation explained (0.00 to 1.00) versus number of providers (0 to 200). Legend: Mixed, Situational Awareness, Early Detection, ILINet (obs.).]
Dengue in Puerto Rico
[Figure: dengue virus counts (0 to 800) by year, 1991 to 2005.]
Greedy selection providers
[Figure: three panels of cases (0 to 100) over 1995 to 2005, annotated R^2 = 0.61, R^2 = 0.61, and R^2 = 0.89.]
Island-wide incidence
[Figure: performance (out-of-sample R^2, 0.0 to 1.0) versus number of clinics (1 to 75). Legend: Island (Island-wide), Regional (H.S.R.), Serotypes (Serotypes), Multi-objective (Island-wide), Multi-objective (H.S.R.), Multi-objective (Serotypes).]
Multi-Objective Surveillance
[Figure: performance (out-of-sample R^2, 0.0 to 1.0) versus number of clinics (1 to 75). Legend: Island (Island-wide), Regional (H.S.R.), Serotypes (Serotypes), Multi-objective (Island-wide), Multi-objective (H.S.R.), Multi-objective (Serotypes).]
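The slides do not say how the island-wide, regional, and serotype objectives are combined into one score. As a purely illustrative assumption, the sketch below scores a clinic subset by the unweighted mean R^2 across all target series; `multi_objective_score` and the synthetic targets are hypothetical.

```python
import numpy as np

def r_squared(target, providers):
    """R^2 of the least-squares fit of target on the provider columns."""
    alpha, *_ = np.linalg.lstsq(providers, target, rcond=None)
    residual = target - providers @ alpha
    return 1.0 - np.var(residual) / np.var(target)

def multi_objective_score(targets, providers):
    """Unweighted mean R^2 across objectives.

    Assumption: equal weights. A real design might weight objectives by
    public health priority instead.
    """
    return float(np.mean([r_squared(t, providers) for t in targets.values()]))

rng = np.random.default_rng(7)
clinics = rng.normal(size=(300, 5))      # synthetic clinic time series
targets = {
    "island_wide": clinics[:, 0] + clinics[:, 1],  # synthetic objective
    "regional": clinics[:, 0],                     # synthetic objective
}
score = multi_objective_score(targets, clinics[:, [0]])
```

A score of this kind can be dropped into a greedy selection loop in place of a single-target R^2, so that each added clinic is judged on all objectives at once.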
Regional surveillance
[Figure: performance (out-of-sample R^2, 0.0 to 1.0) versus number of clinics (1 to 75). Legend: Island (Island-wide), Regional (H.S.R.), Serotypes (Serotypes), Multi-objective (Island-wide), Multi-objective (H.S.R.), Multi-objective (Serotypes).]
Serotype surveillance
[Figure: performance (out-of-sample R^2, 0.0 to 1.0) versus number of clinics (1 to 75). Legend: Island (Island-wide), Regional (H.S.R.), Serotypes (Serotypes), Multi-objective (Island-wide), Multi-objective (H.S.R.), Multi-objective (Serotypes).]
Performance without redundancy
[Figure: proportion of cases covered (0.0 to 1.0) versus number of clinics (0 to 75). Legend: Island, Regional, Serotype, Multi-objective.]
Alternative design algorithms
[Figure: performance (R^2 for island-wide incidence, 0.0 to 1.0) by design algorithm: Island, Multi-objective, Diversity, Regional, Serotype, Volume, Population.]
Similar pattern for the other objectives
[Figure: three panels of performance (0.0 to 1.0) by design algorithm (Island, Multi-objective, Diversity, Regional, Serotype, Volume, Population), measured as R^2 for island-wide incidence, serotype-specific incidence, and H.S.R. incidence.]
Hidden validation
[Figure: observed cases versus validation estimates, 2006 to 2012.
A.) Island-wide incidence
B.) Serotype-specific incidence: DENV1, DENV2, DENV3, DENV4
C.) Regional incidence: Aguadilla, Arecibo, Bayamon, Caguas, Fajardo, Mayaguez, Ponce, San Juan
Legend: Observed, Validation estimate.]
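Out-of-sample validation of this kind can be sketched as follows: fit the regression coefficients on a design period, then score the fixed coefficients on a later period that played no part in the design. The function name `out_of_sample_r_squared`, the split index `n_train`, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def out_of_sample_r_squared(target, providers, n_train):
    """Fit coefficients on the first n_train time points and report R^2
    on the remaining, held-out time points."""
    alpha, *_ = np.linalg.lstsq(providers[:n_train], target[:n_train],
                                rcond=None)
    # Apply the frozen coefficients to the held-out period.
    held_out_residual = target[n_train:] - providers[n_train:] @ alpha
    return 1.0 - np.var(held_out_residual) / np.var(target[n_train:])

rng = np.random.default_rng(3)
clinics = rng.normal(size=(120, 3))                   # synthetic clinic series
incidence = clinics @ np.array([1.0, 2.0, 0.5])       # synthetic target
incidence = incidence + rng.normal(0.0, 0.3, size=120)  # observation noise
validation_r2 = out_of_sample_r_squared(incidence, clinics, n_train=80)
```

Unlike the in-sample value, this R^2 can be negative when the selected clinics carry no real signal, which makes it a stricter check on a proposed network.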
Chikungunya Surveillance
Texas Arbovirus Risk Map
Four Step Program
1. Formalize the public health objectives
2. Specify candidate data sources & acquire historical data
3. Simulate data where missing
4. Select the most informative data sources and validate!
Scarpino, Dimitrov, Meyers 2012
Acknowledgments
Ned Dimitrov Lauren Ancel Meyers Michael Johansson
Bruce Clements Lyn Finelli
Alison Galvani
NIH MIDAS DTRA
The Santa Fe Institute The Omidyar Group
Questions?
Samuel V. Scarpino Omidyar Fellow
Santa Fe Institute
[email protected] @svscarpino
scarpino.github.io