RIFF - A Social Network and Collaborative Platform For Public Health Disease Surveillance
Photo credit: IRMA (Integrated Risk Management for Africa)
Taha Kass-Hout and Nicolas di Tada, Summer 2008, Washington, DC, USA.
What is public health disease surveillance?
“Public health surveillance is the ongoing systematic collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practice, closely integrated with the timely dissemination of these data to those who need to know. The final link in the surveillance chain is the application of these data to prevention and control. A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programs.”
What is syndromic surveillance?
US CDC defines syndromic surveillance as “surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.”
3
Thacker, S.B., and Berkelman, R.L. "Public Health Surveillance in the United States." Epidemiologic Reviews 10 (1988): 164-90.
Current systems
The design, analysis and evaluation of disease surveillance systems have been geared towards specific data sources and detection algorithms, not humans.
Much less effort has gone towards interaction with responders and domain experts across agencies and at multiple levels.
Systems often provide contradictory interpretations of ongoing events.
We have disease surveillance systems in place for those threats we have been faced with before.
We are more vulnerable to those we know about but have not faced on a major scale.
We are even more vulnerable to those that we don’t know about.
4
The likelihood of disasters and disease outbreaks is growing
According to a recent Oxfam report, there has been a four-fold increase in the annual number of natural disasters
30 new infectious diseases identified since 1973
Potential impact is getting greater
Impact on health, economies & security
Capable of spreading faster than ever before
http://www.oxfam.org/en/policy/briefingpapers/bp108_climate_change_alarm_0711
5
Address these challenges by adopting a social and collaborative decision-making approach that facilitates early characterization and identification of potential health threats, and their verification, assessment and investigation, in order to recommend measures (public health and others) to control them
6
Event-based - ad-hoc unstructured reports issued by formal or informal sources
Indicator-based - (number of cases, rates, proportion of strains…)
Timeliness, Representativeness, Completeness, Predictive Value, Quality, Cost, Feasibility, …
7
Identified risks: Mandatory notification, Laboratory surveillance
Emerging risks: Syndromic surveillance, Mortality monitoring, Healthcare activity monitoring, Prescription monitoring
Non healthcare based: Veterinary surveillance, Behavioral surveillance, Environmental surveillance, Poison centers, Food safety/water supply…
Domestic: Media, NGOs, Field Epi points
International: Distribution lists, ProMed (English, Chinese, Spanish, Russian, etc.)
International agencies: WHO, OIE, CDC, NASA (e.g., remote sensing, weather, population migration, bird migration, population density, plant, animal)
Confidential/Limited mailing list dissemination: ProMed (e.g., MBDS), International health regulation agencies (WHO, OIE, CDC, NASA), Threat bulletin (EWARN, ECDC)
Public dissemination: News, blogs, articles, Health ministry press releases sites, Weekly releases (Eurosurveillance)
Adopted from WHO
8
Reduce Morbidity and Mortality and Improve Health
Adopted from WHO
9
Exposed
Clinical specimen
Symptoms
Pos. specimen
Infected
Seek medical attention
Report
1000 Shigella infections (100%)
50 Shigella notifications (5%)
Main attributes: Representativeness, Completeness, Predictive value positive
Get as close to the bottom of the pyramid
as possible
Urge frequent reporting
Specificity / Reliability
Sensitivity / Timeliness
10
Exposed
Clinical specimen
Symptoms
Pos. specimen
Infected
Seek medical attention
Report
Analyze and interpret Signal as early
as possible
Automated analysis/thresholds
Time
Main attributes Timeliness
11
Clickstream/Keyword Searching
Blogs/Chatrooms
News Sources (Local, National, International)
Curated mailing lists (ProMED)
Multi-national surveillance (Eurosurveillance)
Validated official global alerts (WHO)
12
Sensitivity / Timeliness
Specificity / Reliability
Main attributes Data quality
[Figure: epidemic curve of cases (0 to 90) by day (1 to 40), marking the first case, detection/reporting, lab confirmation, response, and the opportunity for control.]
Adopted from WHO
13
[Figure: epidemic curve of cases (0 to 90) by day (1 to 40), marking the first case, detection/reporting, confirmation, investigation, response, and the opportunity for control.]
Adopted from WHO
14
15
Nov 2002 Mar 2003
Progression of outbreak
Electronic Surveillance
Cases of atypical pneumonia, Foshan: Nov 16th
Infected Chinese doctor, Hong Kong hotel: Feb 21st
305 cases of acute resp, Guangdong Province: Feb 11th
Pharma report, Guangdong Province: November 27
Media reports, Guangdong Province: Feb 10
Astute physician on ProMED: Feb 10
Initial WHO Report: Feb 25
Official WHO Report: March 10
Adopted from Brownstein, et al.
Figure 1. Reported cases of campylobacteriosis (n=45) in Svolvær, Norway, by date of onset, January and February 1997. [Bar chart; x-axis: 22 January to 10 February; legend: primary case, secondary household case.]
16
News articles
Alerts
Disease reports
17
18
9/20, 15213, cough/cold, …
9/21, 15207, antifever, …
9/22, 15213, CC = cough, ...
1,000,000 more records…
Huge mass of data Detection algorithm Too many alerts
Duplicative and uni-directional
channels
Uncoordinated response
Hybrid: Machine- and Human-based
Social, collaborative and cross-disciplinary
Web 2.0/3.0 platform
19
Better detection model Better response model
Source: http://www.pbs.org/wgbh/pages/frontline/shows/georgia/outbreak/matrix.html
Source: www.sociology.columbia.edu/pdf-files/bearmanarticle.pdf
20
News item 345
Field alerts
Disease report
Health News
Field alerts
News sources
Alerts
Data + Metadata
Collaboration and multi-directional communication between interested groups
Interactions beyond that allowed by original sources and with controlled visibility
Customizable, secure ‘social’ and ‘professional’ metadata around information
21
22
9/20, 15213, cough/cold, …
9/21, 15207, antifever, …
9/22, 15213, CC = cough, ...
1,000,000 more records…
Huge mass of data
Feedback loop
Fewer and more actionable alerts
Effective and coordinated response
Multi-directional communication
23
Feature extraction (including geo-location)
Tags
Comments
Location
Flags/Alerts/Bookmarks
Environment Factors
Animal Health Factors
Remote Sensing
Riff Bot
Event Classification and Detection
Previous Event Training Data
Previous Event Control Data
Metadata extraction
Other reference information
Machine learning
Show event characterizations
Social network
Other inferred information
…
Professional network feedback
Professional feedback
Anomaly detection
Multiple data streams (multi-lingual)
User-Generated and Machine Learning Metadata
Existing Social Network (e.g., Comm. of interest)
24
Kass-Hout and di Tada: Best Poster Award for Improving Public Health Investigation and Response at the Seventh Annual ISDS Conference, December 3-5, 2008 at the Raleigh Conference Civic Center. http://kasshout.blogspot.com/2008/12/best-poster-award-for-improving-public.html and http://www.isdsjournal.org/article/viewArticle/3308
25
Search: _____
{tag Cloud}Terms tagged by human collaborators or source
{Event Tag cloud}
X Diarrhea
X Cholera
X Influenza
X Respiratory Illness
X Fever
[Show me unusual distributions]
26
27
28
Filters
Item (e.g., disease report, news article, alert) summary and location(s)
Tag cloud
29
Subscriptions
SMS alerts
Ratings, comments, alerts, flags
Tags (automatic + human classification)
Thread (related Items)
LOCATIONS HEATMAP
30
31
32
33
34
35
36
Tracking the Avian Influenza Outbreak in Egypt (reports started to appear late January 2009).
Current classifications (automated and corrected by human experts) include:
7 syndromes
10 transmission modes
> 100 infectious diseases
> 180 micro-organisms
> 140 symptoms
> 50 chemicals
37
HFOSS Disease Ontology Prediction Project http://2009.hfoss.org/Evolve_-_Disease_Ontology_Prediction
Over the summer, the Humanitarian FOSS (HFOSS) Project Summer Institute 2008 (May '08 - July '08) carried out an internship project mentored by InSTEDD and a number of HFOSS faculty. During this internship, Juan Pablo Mendoza and Qianqian Lin developed the ALPACA Light Parsing And Classifying Application (ALPACA) to:
1. Transform raw unstructured documents (e.g., news reports, ProMED mail, etc.) into machine-readable and analyzable data using a text parsing module
2. Categorize documents using an SVM classifier (libSVM) for:
a) Classification into a predetermined (user-defined) list of categories as described above (syndromes, symptoms, routes of transmission, diseases, etc.), and
b) Suggesting additional tags and/or topics using a Naive Bayes classifier, given existing topics and monitoring human input and review. This is especially helpful with new (emerging) threats, or with threats that we know about but experience at a much bigger scale than usual (e.g., a far more virulent flu virus than we’ve experienced over the past few years)
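The tag-suggestion step can be illustrated with a small Naive Bayes sketch. This is not ALPACA's code: the class name `NaiveBayesTagger`, the tokenizer, the add-one smoothing and the four training sentences are all assumptions made for illustration, using only the Python standard library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTagger:
    """Hypothetical multinomial Naive Bayes tagger with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per tag
        self.word_counts = defaultdict(Counter)  # word frequencies per tag
        self.vocabulary = set()

    def train(self, text, tag):
        words = tokenize(text)
        self.doc_counts[tag] += 1
        self.word_counts[tag].update(words)
        self.vocabulary.update(words)

    def scores(self, text):
        """Return an unnormalized log-probability for every known tag."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for tag, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[tag].values())
            log_p = math.log(n_docs / total_docs)  # prior P(tag)
            for w in words:
                count = self.word_counts[tag][w]
                log_p += math.log((count + 1) / (total_words + len(self.vocabulary)))
            result[tag] = log_p
        return result

    def suggest(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Made-up training items standing in for human-reviewed documents.
tagger = NaiveBayesTagger()
tagger.train("patients report fever cough and sore throat", "respiratory")
tagger.train("cluster of cough and breathing difficulty in school", "respiratory")
tagger.train("outbreak of diarrhea and vomiting after contaminated water", "gastrointestinal")
tagger.train("cholera cases with severe diarrhea and dehydration", "gastrointestinal")

suggested = tagger.suggest("new reports of watery diarrhea and dehydration")
print(suggested)
```

Each accepted or rejected suggestion could be passed back through `train`, which is one simple way to model the human review loop described above.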
38
We tested ALPACA against two widely accepted early sources of information in the public health community: Reuters news and ProMED mail. Results are shown here:
ALPACA is extensible through a plug-in functionality that provides a simple way to add additional parsers and classifiers to the application. We are continuously adding and testing additional algorithms and we welcome your contribution to help us better calibrate existing classifiers and parsers as well as introduce additional ones (you can visit our collaborative space here.)
39
To date, we have:
480 registered users
394 collaboration spaces
694 streams of information sources (RSS, SMS, etc.)
900,000 items [e.g., news articles, disease reports] analyzed
443,151 geo-coded locations
700 terms [tags] ‘trained’ [accept/reject] by human experts
12,000+ tags ‘suggested’ by human experts
40
Technical considerations Collaboration Workflow
Organizational considerations
Evaluation framework
41
Latest Progress
Ontologies (e.g., BioCaster, SNOMED, ICD)
Event reporting, analysis and public announcements (e.g., Thomson Reuters Foundation’s Emergency Information Service (EIS) deployment during the Haiti Response, 2010)
Planned Steps
API for external extensions and interactions
Full support for structured data
Automatic field data collection through forms, SMS, etc.
Anomaly detections (e.g., EARS)
42
http://alertnet.org/db/blogs/1564/2010/00/24-120746-1.htm
http://ndt.instedd.org/search/label/eis
43
44
Taha Kass-Hout, MD, MS
http://kasshout.blogspot.com
Nicolás di Tada
Riff
http://riff.instedd.org
[Software: http://code.google.com/p/riff-evolve Code license: GNU General Public License v3, Content license: Creative Commons 3.0 BY-SA]
Cambodia, Photo taken by Taha Kass-Hout, October 2008
“this pic says it all- our kids are all the same- they deserve the same”, Comment by Robert Gregg on Facebook, October 2008
45
How to communicate information to users of the system. Typically there are three basic components:
• Time-series graphs
• Maps
• Data tables
However, depending on the primary focus (detection-based, information-based, or a hybrid of both) there can be more components as follows:
Detection-focused visualization: Individual alert listings, Summary alerts, Alerts in time-series graph, Mapping alerts
Information-based visualization (visualizing data and information): Data query, Data stratification, Time-series graphs, Data line listing, Matrix portal, Mapping
48
Additionally, we propose building a hybrid solution that combines both detection and information-based systems, which supports the following:
Detection-based systems (alert listings and maps of current anomalies are the two most important visualization components)
What-if scenarios
Automatic anomaly detection
Statistical anomaly
System believes there is an anomaly of interest to the user
Information-based systems (GIS, time-series graphs, data tables, query wizards, and real-time displays are the most important visualization components)
Create new case definition
Select different processing options
Customize presentation to meet users' needs
49
Early detection, Spatio-temporal, Belief Networks (BNs), Simulation and modeling
Provide earlier notification of a change in the normal levels of observed counts of the desired health indicator.
Emphasis on the importance of matching the analytic process to the data type so as to achieve the performance needed for early identification of the event with minimum false alarms (Type I and Type II errors).
Performance evaluation of analytic processes using accepted metrics.
50
Statistical decisions of the analytic data monitor include:
which combinations of data sources to test
which algorithms to use with respect to characteristics of the data background
how to achieve sensitivity over many locations within a manageable false alert rate
how much corroboration among data streams is required to achieve a threshold for escalating the information
A multiplicity of data sources has appeal because consistent evidence may be employed to suggest inferential accuracy. In practice, however, multiple data sources can be contradictory. Decision requirements for the prospective analytic data monitor involve when and how deeply to investigate a data anomaly as well as when to escalate the information (as an alert) for action. Unambiguous, corroborated data spikes are the exception rather than the rule. For single data streams, univariate algorithms employ data modeling and hypothesis tests to provide systematic signal escalation protocols.
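A univariate escalation rule of the kind described above can be sketched as a moving-baseline test: compare today's count with the mean and standard deviation of the preceding days and flag large deviations. The function name `flag_anomalies`, the 7-day baseline, the threshold of 3 standard deviations, the floor on the standard deviation and the sample counts are all assumptions for illustration. This is in the spirit of moving-baseline methods, not a reproduction of any particular system's algorithm.

```python
from statistics import mean, stdev


def flag_anomalies(counts, baseline_days=7, threshold=3.0):
    """Return indices of days whose count exceeds baseline mean + threshold * sd."""
    flagged = []
    for day in range(baseline_days, len(counts)):
        baseline = counts[day - baseline_days:day]
        mu = mean(baseline)
        # Floor the sd (an assumed safeguard) so a flat baseline cannot divide by zero.
        sigma = max(stdev(baseline), 0.5)
        z = (counts[day] - mu) / sigma
        if z > threshold:
            flagged.append(day)
    return flagged


# Made-up daily counts for one health indicator; day 9 carries a spike.
daily_counts = [4, 5, 3, 6, 4, 5, 4, 5, 4, 18, 6, 5]
alerts = flag_anomalies(daily_counts)
print(alerts)
```

Raising `threshold` trades sensitivity for fewer false alerts, which is the tuning decision the list above refers to.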
51
Use of application-linked and hyperlinked fields for integration of analysis and visualization tools
Commenting capability
Notification via a “publish and subscribe” capability
Shared group definitions and calendars
Shared access to key artifacts
Support for mobile devices (e.g., SMS) and VoIP
52
The enforcement of the business rules for distributing and validating alerts, escalation, and the definition of tasks
Keeping the business logic encapsulated in a business engine, as opposed to “coding it into” the core applications
Modification of operations “on the fly”, and supporting different modes of operation depending on the current level of emergency
53
Overall measures
Situation Awareness Global Assessment Technique (SAGAT)
The Situation Awareness Rating Technique (SART)
Individual processes measures
Network parameters
55
Which automated systems generated the most reliable alerts, and for what types of conditions?
Which human users were the most effective in identifying conditions?
Which indicators are the most effective in identifying a health event?
What factors help to minimize or aggravate a health event?
Which elements of the biosurveillance lifecycle require the most time and/or collaboration?
The network history will provide a common point of evaluation for a variety of surveillance and response techniques
56
System description: Purpose (detection- and information-based), Stakeholders, Operations
Health-related event detection: Timeliness, Validity (validation approach, statistical assessment of validity), Data quality
System experience: System usefulness, Flexibility, Acceptability, Portability, Stability, Costs
Evaluation here is primarily for the timely detection of health-related event and effectiveness of response. We have to keep in mind the flexibility of the system and how it can meet both regular and advanced users. Advanced users often want control in order to customize queries, modify graphic presentation, adjust sensitivity levels of detection algorithms, etc.
57
Source Type
Non-Specific Syndromic
Specific Case Definition
59
Note: All tags can follow a hierarchical construct
Ontology
Example: A subset of disease ontology, showing relationships between the various forms of pneumonia.
Pneumonia and influenza
Pneumonia due to Staphylococcus aureus
Other bacterial pneumonia
Pneumococcal pneumonia
Pneumonia due to Hemophilus influenzae
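A hierarchical tag construct can be held as a simple parent-to-children mapping. The sketch below uses the five pneumonia terms from this example; the specific nesting (which term sits under which) is an assumption, since the original diagram's edges are not recoverable here, and the helper names `ancestors` and `expand` are hypothetical.

```python
# Parent -> children. The nesting is assumed for illustration only.
ONTOLOGY = {
    "Pneumonia and influenza": [
        "Pneumococcal pneumonia",
        "Other bacterial pneumonia",
    ],
    "Other bacterial pneumonia": [
        "Pneumonia due to Staphylococcus aureus",
        "Pneumonia due to Hemophilus influenzae",
    ],
}


def ancestors(tag, ontology=ONTOLOGY):
    """Return the chain of broader tags above `tag`, nearest parent first."""
    parent_of = {child: parent for parent, kids in ontology.items() for child in kids}
    chain = []
    while tag in parent_of:
        tag = parent_of[tag]
        chain.append(tag)
    return chain


def expand(tags, ontology=ONTOLOGY):
    """Add every broader tag implied by the given tags."""
    expanded = set(tags)
    for tag in tags:
        expanded.update(ancestors(tag, ontology))
    return expanded


print(ancestors("Pneumonia due to Staphylococcus aureus"))
```

Expanding an item's tags this way lets a search for a broad term also match items tagged only with a narrower one.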
Human Prodromal Clinical Morbidity and Mortality
Animal
Environmental (or Climate)
Allied Professional Source
60
Building/vessel contamination
Continuous or intermittent release of an agent
Contagious person-to-person
Commercially distributed products
Waterborne
Vector/host borne
Sexually transmitted
Other
Large-scale bioaerosol
Premonitory release of agent
…
61
CLIMATE
62
Temperature change
Precipitation change
Wind change …
PEOPLE
Die-offs observed
Sentinels tested
…
ANIMAL
Increased mortality rate
Increased presentations for treatment
…
Building/vessel contamination
Continuous or intermittent release of an agent
Contagious person-to-person
Commercially distributed products
Waterborne
Vector/host borne
Sexually transmitted
Other
Large-scale bioaerosol
Premonitory release of agent
…
TRANSMISSION ROUTE
Note: All tags can follow a hierarchical construct
RESPIRATORY
Breathing difficulty, Hemoptysis, Asthma attack, Croup, Pneumonia, Wheezing, Runny or stuffy nose, Pleuritic pain, Sore throat, URI, …
CONSTITUTIONAL
Irritable baby, Fever, Weakness, Anorexia, Viral syndrome, Faintness, Malaise, Body aches, General illness, Chills, Lymphadenopathy, Sweating, …
GI
Abdominal pain, Diarrhea, Vomiting, Nausea, Gastroenteritis, Dehydration, …
INFLUENZA-LIKE ILLNESS (OR ILI)
Cough, Sore throat, Fever, Weakness, Viral syndrome, Body aches, Bronchiolitis, Pneumonia, Upper respiratory infection, Malaise, Chills, Influenza, …
63
Note: All tags can follow a hierarchical construct
UNDIAGNOSED: GI, Respiratory, …
DIAGNOSED: Influenza, Avian influenza, …
64
Can be mapped to standards, such as:
• Unified Medical Language System (UMLS) [which supports SNOMED, LOINC, ICDs, etc.] http://www.nlm.nih.gov/research/umls/
• PHIN VADS (http://www.cdc.gov/PHIN)
Case Definition:
• Probable
• Possible
• Confirmed
Note: All tags can follow a hierarchical construct
65
Admin configures a new inference:
If Item has:
Cough [13 of 130]
Runny Nose [20 of 130]
Fever [23 of 130]
Then tag it with:
Flu [10 of 130]
The system will analyze the existing tagged items and find the probability of an item being flu given that it has cough, runny nose and fever.
User sees a suggestion for a new item:
Tags inferred
Flu [85% confidence because of cough, runny nose and fever]
Influenza [55% confidence because of cough and headache]
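The confidence attached to an inferred tag can be estimated by counting: among already-tagged items that carry all the evidence tags, what fraction also carries the prediction tag? The sketch below assumes items are plain sets of tag strings; the function name `inference_confidence` and the six sample items are hypothetical and do not reproduce the counts shown on the slide.

```python
def inference_confidence(items, evidence, prediction):
    """Estimate P(prediction | all evidence tags) from already-tagged items."""
    evidence = set(evidence)
    matching = [tags for tags in items if evidence <= tags]
    if not matching:
        return None  # no tagged item carries every evidence tag
    hits = sum(1 for tags in matching if prediction in tags)
    return hits / len(matching)


# Made-up tagged items; each item is the set of tags it carries.
items = [
    {"cough", "runny nose", "fever", "flu"},
    {"cough", "runny nose", "fever", "flu"},
    {"cough", "runny nose", "fever"},
    {"cough", "fever", "flu"},
    {"diarrhea", "fever"},
    {"cough"},
]

confidence = inference_confidence(items, ["cough", "runny nose", "fever"], "flu")
print(f"flu confidence: {confidence:.0%}")
```

With so few matching items the estimate is unstable, so a real deployment would presumably want a minimum support count before showing a suggestion.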
66
Cough
Longitude
Latitude
Fever
3 items clustered because of their proximity and similar symptoms
Note: This is actually done in an n-dimensional space, n being the number of tags available, plus the number of relevant words detected, plus a possible spatio-temporal dimension
67
Time
Each item is represented by a vector of the relevant words it contains, with their corresponding frequencies.
Each tag gets its own linear classifier, which needs at least one positive and one negative sample. Classification is based on the vector for each item: the linear classifier creates a hyperplane that divides the n-space in two, for positive and negative predictions.
Whenever a user corrects or confirms a suggestion, we feed that back to the classifier.
Any number of BNs can be created to map some “evidence” tags to a “prediction” tag. For each item, the system will measure the probability of having that tag based on the existence of previous tags.
The vectors for the items can be grouped to find clusters. Items in the same cluster are near each other in the n-space, so they have similar values for their word content and tags.
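The word-vector representation and the per-tag linear classifier with user feedback can be sketched as follows. A perceptron update is swapped in here as the simplest linear classifier that learns from corrections; it is not the SVM mentioned elsewhere in this deck, and the class name `TagClassifier` and the sample texts are assumptions for illustration.

```python
import re
from collections import Counter


def to_vector(text):
    """Represent an item as a sparse {word: frequency} vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


class TagClassifier:
    """Hypothetical linear classifier for one tag: +1 (applies) or -1."""

    def __init__(self):
        self.weights = Counter()
        self.bias = 0.0

    def score(self, vector):
        # Signed distance (unnormalized) from the separating hyperplane.
        return self.bias + sum(self.weights[w] * f for w, f in vector.items())

    def predict(self, vector):
        return 1 if self.score(vector) > 0 else -1

    def feedback(self, vector, label):
        """Perceptron update: move the hyperplane only when the prediction was wrong."""
        if self.predict(vector) != label:
            for w, f in vector.items():
                self.weights[w] += label * f
            self.bias += label


# Made-up positive and negative samples for an assumed "flu-like" tag.
samples = [
    ("fever cough and chills reported in village", 1),
    ("road closed after heavy flooding", -1),
    ("children with high fever and cough", 1),
    ("election results announced today", -1),
]

flu_like = TagClassifier()
for _ in range(10):  # a few passes over the confirmed/rejected samples
    for text, label in samples:
        flu_like.feedback(to_vector(text), label)

prediction = flu_like.predict(to_vector("two new cases of fever and cough"))
print(prediction)
```

Each user confirmation or correction maps to one `feedback` call, so the hyperplane keeps adjusting as experts review suggestions.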
68
This is just an initial approach; there are a number of alternative implementations:
Automatic tagging can be done using clustering: we create clusters for each tag, and for new items we measure which cluster centroid the item is closest to.
Automatic tagging can also be done using BNs: our evidence can be the words, and we can measure the probability of a certain tag based on the words contained in the item.
New tag suggestions can be done using clustering instead of BNs: clustering all the items and suggesting tags that some of the items in the cluster have and the others don’t.
Provided we implement the algorithms abstractly enough, it should be simple to interchange them and see what works best.
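The clustering alternative for automatic tagging (assign a new item the tag whose cluster centroid is nearest) can be sketched with sparse vectors and Euclidean distance. The function names, the distance measure and the four tagged vectors are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def centroid(vectors):
    """Average a list of sparse {feature: value} vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds values feature by feature
    return {k: val / len(vectors) for k, val in total.items()}


def distance(a, b):
    """Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))


def nearest_tag(item, tagged_items):
    """Return the tag whose cluster centroid is closest to `item`."""
    by_tag = defaultdict(list)
    for vector, tag in tagged_items:
        by_tag[tag].append(vector)
    centroids = {tag: centroid(vs) for tag, vs in by_tag.items()}
    return min(centroids, key=lambda tag: distance(item, centroids[tag]))


# Made-up word-frequency vectors for already-tagged items.
tagged = [
    ({"cough": 2, "fever": 1}, "respiratory"),
    ({"cough": 1, "wheezing": 1}, "respiratory"),
    ({"diarrhea": 2, "vomiting": 1}, "gi"),
    ({"diarrhea": 1, "nausea": 1}, "gi"),
]

tag = nearest_tag({"diarrhea": 1, "vomiting": 1}, tagged)
print(tag)
```

Because `nearest_tag` takes the same kind of item vector as a classifier would, it could be swapped in behind a common interface, which is the interchangeability point made above.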
69
70
P(malaria) = 22%
P(influenza) = 13%
P(other ILI) = 33%
Classifiers
Clustering
Bayesian Statistics
Neural Networks
Genetic Algorithms
71
72
cold
fever
Map items to vectors (Feature extraction)
Normalize those vectors
Train the classifier
Measure the results with new information
Feedback the classifier
Separate classes in feature space
73
74
Support vectors define the separator
75
76
Φ: x → φ(x)
Map to higher-dimension space
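A toy illustration of why mapping to a higher-dimensional space helps: points that no single threshold can separate on a line become separable after the map. The specific map φ(x) = (x, x²), the sample points and the helper `separable_by_threshold` are assumptions for illustration, not taken from the slides.

```python
def phi(x):
    """Assumed feature map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)


def separable_by_threshold(values_with_labels):
    """True if one threshold splits the +1 values from the -1 values (either side)."""
    positives = [v for v, label in values_with_labels if label == 1]
    negatives = [v for v, label in values_with_labels if label == -1]
    return min(positives) > max(negatives) or min(negatives) > max(positives)


# Positives sit on both sides of the negatives along the original axis.
points = [(-3, 1), (-2, 1), (-1, -1), (0, -1), (1, -1), (2, 1), (3, 1)]

in_original_space = separable_by_threshold(points)
# After mapping, the second coordinate (x^2) alone separates the classes.
in_mapped_space = separable_by_threshold([(phi(x)[1], label) for x, label in points])
print(in_original_space, in_mapped_space)
```

In the mapped space a horizontal line between x² = 1 and x² = 4 acts as the linear separator.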
77
Classifier
Document 1
Document 2
Document 3
Positives
Negatives
Training Documents
Map items to vectors (Feature extraction)
Normalization
Agglomerative or Partitional
78
79
80
81
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
P(A | B): probability of disease A (flu) once symptom B (fever) is observed
P(B | A): probability of fever once flu is confirmed
P(A): probability of flu (prior or marginal)
P(B): probability of fever (prior or marginal)
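A short worked example of Bayes' rule for this slide, P(A|B) = P(B|A)·P(A) / P(B), with A = flu and B = fever. The three input probabilities are made-up numbers chosen for illustration, not estimates from any data set.

```python
def posterior(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


# Hypothetical inputs (assumptions, for illustration only).
p_fever_given_flu = 0.90  # P(B|A): fever once flu is confirmed
p_flu = 0.05              # P(A): prior probability of flu
p_fever = 0.15            # P(B): marginal probability of fever

p_flu_given_fever = posterior(p_fever_given_flu, p_flu, p_fever)
print(round(p_flu_given_fever, 3))
```

Under these assumed numbers, observing fever raises the probability of flu from the 5% prior to about 30%.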
Given a set of stimuli, train a system to produce a given output…
82
Hidden Layer
Output Layer
Input Layer
83
[…]
[…]
{I0,I1,……In}
{O0,O1,……On}
Weight
Weight
$$H_n = \sum_{i=0}^{I} (I_i \cdot w_{in})$$
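The weighted sum that feeds each hidden unit can be sketched as one layer of a forward pass. The function name `layer`, the `tanh` activation and the example weights are assumptions; the slide itself shows only the weighted sum of inputs.

```python
import math


def layer(inputs, weights, activation=math.tanh):
    """Compute one layer: unit n gets activation(sum_i inputs[i] * weights[n][i])."""
    return [activation(sum(x * w for x, w in zip(inputs, row))) for row in weights]


# Two inputs feeding two hidden units; weights are made-up numbers.
inputs = [1.0, 2.0]
hidden_weights = [
    [0.5, 0.25],   # weights into hidden unit 0
    [-1.0, 0.5],   # weights into hidden unit 1
]

# With the identity activation the output is exactly the weighted sum.
raw_sums = layer(inputs, hidden_weights, activation=lambda s: s)
hidden = layer(inputs, hidden_weights)
print(raw_sums, hidden)
```

Training then amounts to adjusting the weights so that a given set of stimuli produces the desired output.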
84
Event?
Define the model that you want to optimize
Create the fitness function
Evolve the gene pool testing against the fitness function.
Select the best individual
85
Model the transmission process using a set of parameters (e.g., an infectious disease):
Onset time between an infection and illness
Latency period
Incubation period
Symptomatic period
Infectious period
86
(Onset, Latency, Incubation, Symptomatic , Infectious)
( 2 days, 3 days, 1 day, 4 days, 3 days)
87
Fitness = 1/Area
1. Create an initial population of candidates
2. Use operators to generate new candidates (mating and mutation)
3. Discard worst individuals or select best individuals in generation
4. Repeat from 2 until you find a candidate that satisfies the solution searched
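The four steps above can be sketched as a small genetic algorithm over 5-tuples of (onset, latency, incubation, symptomatic, infectious) days. Several choices here are assumptions for illustration: the "area" is approximated by the summed absolute gap to a stand-in target tuple, fitness is 1/(1 + area) to avoid dividing by zero, the loop runs a fixed number of generations instead of stopping on a satisfying candidate, and the population size, mutation rate and seed are arbitrary.

```python
import random

PARAMS = ("onset", "latency", "incubation", "symptomatic", "infectious")
OBSERVED = (2, 3, 1, 4, 3)  # stand-in target, in days (assumed for this sketch)


def fitness(candidate):
    """Higher is better; 'area' is approximated by the summed absolute gap."""
    area = sum(abs(c - o) for c, o in zip(candidate, OBSERVED))
    return 1.0 / (1.0 + area)


def mate(a, b, rng):
    """Single-point crossover between two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(candidate, rng, rate=0.2):
    """Replace each gene with a random value in 1..6 with probability `rate`."""
    return tuple(rng.randint(1, 6) if rng.random() < rate else g for g in candidate)


def evolve(generations=40, size=20, seed=0):
    rng = random.Random(seed)
    # 1. Create an initial population of candidates.
    population = [tuple(rng.randint(1, 6) for _ in PARAMS) for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # 3. Keep the best half; the worst individuals are discarded.
        survivors = population[: size // 2]
        # 2. Generate new candidates by mating and mutation.
        children = [
            mutate(mate(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(size - len(survivors))
        ]
        population = survivors + children  # 4. repeat
    return max(population, key=fitness), history


best, history = evolve()
print(best, fitness(best))
```

Because the best half always survives unchanged, the best fitness per generation never decreases in this sketch.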
88
(4,5,6,3,5) (4,3,6,2,5)
89
(5,3,4,6,2) (2,4,6,3,5) (4,3,6,5,2)
(2,3,4,6,5) (3,4,5,2,6)
(3,5,4,6,2) (4,5,3,6,2) (5,4,2,3,6)
(4,6,3,2,5) (3,4,2,6,5) (3,6,5,1,4)
(5,3,2,6,5)
(3,4,4,6,2)
(5,3,2,6,5)
(3,4,4,6,2)
90
• Each "pill" is a hypothesis; it has the event tags on top, followed by the author.
• Each hypothesis can have many responses to it; a response means someone disagrees or wants to add something to the proposed hypothesis.
• The red area shows items that have been rejected by the responder to the previous hypothesis. Something like "Yes, I agree with you, but this and this don't belong here".
• The blue area shows items that have been added to the responded hypothesis, like "Yes, you are right, but these items should also be included in this event".
92
• A hypothesis response can include changes in the event tags: "You are right, there's definitely something going on with all those items, but I don't think it's airborne, I think it's waterborne"
• The list of people below the pill are all the users that have subscribed or agreed to that hypothesis: "Yes, Doc. James is right, I put my signature here."
• The size of the subscribers below is relative to their "reputation" inside Riff, the whole area below a pill gives an idea of how well supported a hypothesis is. We could also grow the pill proportionally.
93
• The "bold" pill is the confirmed one, once there's a confirmation, that should increase the reputation of all the subscribers to that hypothesis.
• Hovering over each pill's area should display a summary of the number of items there, maybe the area and the items' tags.
• Clicking should pop up a small navigation of the actual items contained there.
94
Can trend analysis predict outbreaks?
Recent studies show that Internet search has:
…considerable potential as one of the earliest indicators for syndromic surveillance
…the potential to predict population-based events relevant to public health
…the potential for a higher sensitivity compared to other early sources (e.g., media, ProMed)
96
Many individuals experiencing symptoms of illness conduct Internet search prior to seeking medical attention
Wilson K, Brownstein JS. Early detection of disease outbreaks using the Internet. CMAJ 2009;180:829-831
Ginsberg J, Mohebbi MH, Patel RS, et al. Detecting influenza epidemics using search engine query data. Nature 2009;457:1012-4
Mostashari F. Can Internet searches provide useful data for public health surveillance? Advances in Disease Surveillance 2007;2:209
Wethington H, Bartlett P. Usage and data collection patterns for a novel web-based foodborne-disease surveillance system. J Environ Health 2006 Mar;68(7):25-9
Cooper CP, Mallon KP, Leadbetter S, Pollack LA, Peipins LA. Cancer Internet search activity on a major search engine, United States 2001-2003. J Med Internet Res 2005;7:e36
Li CS, Aggarwal C, Campbell M, et al. Site-Based Biosurveillance. MMWR September 24, 2004 / 53(Suppl);249
Eysenbach G, Kohler C. What is the prevalence of health-related searches on the World Wide Web? Qualitative and quantitative analysis of search engine queries on the Internet. Proc AMIA Annu Fall Symp 2003;225-9
97
Internet search for allergies and ragweed search terms increase in the spring, and allergy and pollen search terms increase significantly in the fall. It would also appear that Texas and Oklahoma are leading locales for ragweed.
98
Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209
99
A search for the term “leptospirosis” in the United States finds dramatically higher search rates from Honolulu, Hawaii, consistent with the epidemiology of the illness in the United States (more than half of all national cases are reported from Hawaii).
Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209
Internet search for “contact lens” increased in Singapore in February 2006, prior to the notification from CDC of the first US cases of contact lens-associated Fusarium keratitis in March 2006, and prior to widespread news coverage in April 2006.
100
Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209
Following large anti-war protests on the Mall in Washington DC in late September 2005, multiple environmental sensors watching for bioterror events detected the presence of Francisella tularensis. Interestingly, queries appear to have increased prior to discovery of the sensor findings by public health officials on September 30th.
101
Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209
While uncommon words like “croup” readily reveal the expected seasonal pattern of Internet search, more common words like “cough” or “throat” require logical modifiers to rule out more common search phrases.
102
Source: Mostashari F. Can Internet searches provide useful data for public health surveillance?. Advances in Disease Surveillance 2007;2:209