Oswego in the Cloud: Scenario Script

A near-real time data exchange between multiple instances of Epi Info™ was enabled by configuring Mesh4x (http://code.google.com/p/mesh4x/) for Internet cloud (e.g., Amazon’s EC2, Google cloud/App Engine) and for peer-to-peer (over SMS) synchronization. A client-based tool can easily be used by an epidemiologist to build and configure a mesh environment, without requiring prior technical knowledge.

Oswego in the Cloud: Epi Info™—Mesh4x Synchronization Prototype

By: Taha Kass-Hout, MD, MS (InSTEDD)

Many epidemiologists are familiar with the foodborne outbreak in Oswego, New York, U.S.A., on April 18th, 1940. In this outbreak, over half of the participants at a pot-luck church supper developed a gastrointestinal illness. A survey was created and interviews were conducted with 75 of the 80 people known to have been present to determine the source of the contamination. While the Oswego study focused on a single region, the value of data synchronization can be demonstrated by expanding this scenario so that interviews and data entry are conducted in different localities. Therefore, we expanded the scenario by populating fictitious addresses spread across five counties in upstate New York. The actual church supper was held in Oswego County; however, in our imaginary scenario we added four neighboring counties: Jefferson, Lewis, Oneida, and Wayne. In this scenario, two epidemiologists conducted the outbreak investigation: an Epidemic Intelligence Service (EIS) officer investigated the outbreak in Oneida County, and another officer at the state health department investigated the other four counties. Before the data synchronization, the Oneida County EIS officer could not determine the cause of the outbreak because his partial data showed two competing factors. After data synchronization, both investigators had a clear picture of the spread of the illness (over space and time), and the Oneida County EIS officer was able to identify the actual cause of the outbreak: the vanilla ice cream.

The Scenario Script

DISCLAIMER: All places used in the modified (imaginary) scenario are fictitious. Any similarity to real places is a coincidence.

Outbreak investigation in Oneida County

Scenario: Oneida Medical Officer (Dr. Roger Mir) receives a call from local doctors about patients reporting to clinics with gastrointestinal illness. Dr. Mir launches the investigation and begins interviews.

Analysis of Outbreak Data: Dr. Mir opens the Analysis module in Epi Info™

Roger: I have returned to the office from the interviews and have done some preliminary analysis of the data. I will use Epi Info™ to show you my initial results. First, I need to open the Epi Info™ Analysis tool and READ the data into Epi Info™:

Epi Info™ Command: READ epiinfo.mdb: viewOswego

Roger: In the Output window, I find a Record Count of 28.

Frequencies of COUNTY and ILL fields

Epi Info™ Command: FREQ COUNTY ILL

Roger: The results show that all 28 records are in Oneida County: 12 interviewees who were ILL (Ill = 1) and 16 interviewees who were NOT ILL (Ill = 0).
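As a rough illustration of what a frequency command such as FREQ reports, the following Python sketch tallies one field at a time and attaches percentages. It is not Epi Info™ code. The `records` list is a hypothetical stand-in for the Oneida view, built only to match the 12 ill and 16 not ill counts above, and `freq` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical stand-in for the Oneida view: 12 ill (ILL = 1), 16 not ill (ILL = 0).
records = [{"COUNTY": "Oneida", "ILL": 1}] * 12 + [{"COUNTY": "Oneida", "ILL": 0}] * 16


def freq(rows, field):
    """Return {value: (count, percent)} for one field, similar in spirit to FREQ."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {value: (n, round(100.0 * n / total, 1)) for value, n in counts.items()}


county_freq = freq(records, "COUNTY")
ill_freq = freq(records, "ILL")
```

Calling `freq` once per field mirrors how the command lists COUNTY and ILL as two separate frequency tables.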

Roger: I will create a 2 by 2 table of BAKED HAM by ILL:

Epi Info™ Command: TABLES BAKEDHAM ILL

The results show that BAKED HAM is significant (at the P = 0.05 level).

Roger: I will now create a 2 by 2 table for VANILLA by ILL.

Epi Info™ Command: TABLES VANILLA ILL

These results show that VANILLA is also significant (at the P = 0.05 level).
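To make the idea of a "significant" 2 by 2 table concrete, here is a minimal Python sketch that computes an odds ratio, a risk ratio, and an uncorrected chi-square p-value for one exposure. The cell counts are hypothetical (the script does not list them) and were chosen only so that the margins add up to 12 ill and 16 not ill. The function name `two_by_two` is illustrative, and Epi Info™ may report additional or different statistics.

```python
import math


def two_by_two(a, b, c, d):
    """Summarize a 2x2 table (assumes no zero cells).

    a = exposed and ill,     b = exposed and not ill
    c = unexposed and ill,   d = unexposed and not ill
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / (a + b)) / (c / (c + d))
    # Uncorrected Pearson chi-square for a 2x2 table (1 degree of freedom).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return {"odds_ratio": odds_ratio, "risk_ratio": risk_ratio,
            "chi2": chi2, "p_value": p_value}


# Hypothetical cell counts; only the margins (12 ill, 16 not ill) come from the script.
result = two_by_two(a=11, b=6, c=1, d=10)
significant = result["p_value"] < 0.05
```

With counts this small, an exact test would usually be preferred over chi-square; the sketch keeps to the simplest formula to show where the P = 0.05 comparison comes from.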

Roger: Having analyzed the data for my county, I found at least two potential sources of the illness: BAKED HAM and VANILLA ICE CREAM.

Generate local Google Earth map for Oneida county

Roger: To gain a perspective on the cases, I will create a map of all the data and display it in Google Earth (or KML map layers). To generate the map, I will launch the Epi Info™ Mesh4x Data Synchronization tool and generate a KML file as follows:

Click the “Map Exchange” tab.

Specify the data source (EpiInfo.mdb) and select the Oswego table.

Click the “Create Map” button which will automatically generate a spatio-temporal map.

It takes approximately 30 seconds for the tool to automatically geocode all the addresses in Dr. Mir's database and generate the map layers (a Google Earth KML file).
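The kind of file this step produces can be sketched in Python with the standard library. This is not the Mesh4x tool's implementation: it assumes the addresses have already been geocoded, uses made-up coordinates and patient names, and assumes Google-hosted pushpin icon URLs for the red and yellow pins. The helper names `child` and `build_kml` are illustrative.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# Assumed pushpin icon URLs; swap in any icons you prefer.
ICONS = {
    1: "http://maps.google.com/mapfiles/kml/pushpin/red-pushpin.png",  # ill
    0: "http://maps.google.com/mapfiles/kml/pushpin/ylw-pushpin.png",  # not ill
}


def child(parent, tag, text=None):
    """Append a KML-namespaced child element and return it."""
    element = ET.SubElement(parent, f"{{{KML_NS}}}{tag}")
    if text is not None:
        element.text = text
    return element


def build_kml(cases):
    """Return a KML string with one color-coded Placemark per case."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = child(kml, "Document")
    for case in cases:
        placemark = child(document, "Placemark")
        child(placemark, "name", case["name"])
        # An onset date, when present, lets a viewer animate cases over time.
        if case.get("onset"):
            child(child(placemark, "TimeStamp"), "when", case["onset"])
        icon = child(child(child(placemark, "Style"), "IconStyle"), "Icon")
        child(icon, "href", ICONS[case["ill"]])
        # KML expects longitude first, then latitude.
        child(child(placemark, "Point"), "coordinates",
              f"{case['lon']},{case['lat']}")
    return ET.tostring(kml, encoding="unicode")


# Hypothetical, already-geocoded records (coordinates are made up).
cases = [
    {"name": "Patient1", "ill": 1, "onset": "1940-04-18", "lat": 43.10, "lon": -75.23},
    {"name": "Patient2", "ill": 0, "onset": None, "lat": 43.21, "lon": -75.46},
]
kml_text = build_kml(cases)
```

Writing `kml_text` to a `.kml` file gives something Google Earth can open; the time stamp on each ill case is what makes the layer spatio-temporal rather than a static set of pins.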

Roger: To show the map, I will now open Google Earth and load the file generated by the Epi Info™ Mesh4x tool. The resulting map has 28 records in Oneida County:

12 patients reported with GI illness (Red pins) between April 18th and 19th

16 interviewees did not show any illness (Yellow pins)

Dr. Mir logs into Epi-X, CDC's Epidemic Information Exchange collaboration environment, and discovers a discussion about an outbreak in the neighboring county of Oswego. He calls Dr. Nitschke (with the State Health Department) to discuss the situation further.

Roger: I logged into Epi-X and discovered a discussion about an outbreak in my neighboring counties. Could you share the data for these counties with me?

Outbreak investigation in the other counties

Scenario: The NY State Epidemiologist asks an EIS officer (Dr. David Nitschke) to help investigate a GI outbreak that has appeared in the counties of Oswego, Lewis, Jefferson, and Wayne. David goes through steps similar to those described above for all counties except Oneida.

Analysis of Outbreak Data: Dr. Nitschke opens the Analysis module in Epi Info™

David: Having returned to the office from the interviews and run some preliminary analyses, I will use Epi Info™ to show you my initial results. First, I need to open Analysis and READ the data into Epi Info™:

Epi Info™ Command: READ epiinfo.mdb: viewOswego

David: In the Output window, I find a Record Count of 47 across the 4 counties. Notice that I do not yet have data for Oneida County.

Frequencies of County and ILL

Epi Info™ Command: FREQ COUNTY ILL

David: I create a 2 by 2 table for BAKEDHAM by ILL and for VANILLA by ILL:

Epi Info™ Command: TABLES BAKEDHAM ILL

Epi Info™ Command: TABLES VANILLA ILL

These results show only VANILLA to be significant (at the P = 0.05 level); BAKEDHAM was not significant. Now I will share this information with Roger for further collaboration on the outbreak investigation.

Generate local Google Earth map for the other counties

David: To gain a perspective on the cases, I will create a map of all the data and display it in Google Earth (or KML map layers). To generate the map, I will launch the Epi Info™ Mesh4x tool and generate a KML file as follows:

Click the “Map Exchange” tab

Specify the data source (EpiInfo.mdb) and select the Oswego table

Click the “Create Map” button which will automatically generate a spatio-temporal map

It takes less than a minute for the tool to automatically geocode all the addresses in Dr. Nitschke's database and generate the map layers (a KML file).

David: To show the map, I will now open Google Earth and load the file generated by the Epi Info™ Mesh4x tool. The resulting map has 47 records in the 4 counties:

33 patients reported with GI illness (Red pins) between April 18th and 19th

14 interviewees did not show any illness (Yellow pins)

Dr. Nitschke has just received a call from Dr. Mir and learned that Oneida County has 28 records, but Dr. Mir is unable to determine the source of the outbreak based on his information.

David: While speaking to Dr. Mir, I noticed that I don't have any cases in Oneida County. I also learned that he is unable to determine the source of the outbreak given the information he has and the analysis he ran. We both agree to share our information to further collaborate and investigate the situation.

Data synchronization between Oneida County and the other neighboring counties

Dr. Nitschke prepares to launch the Epi Info™ Mesh4x tool and sync data over the Amazon EC2/S3 cloud (the State's available online data).

David: I will first create a mesh (a collaborative space where I can share and synchronize my data with others) by visiting this site: http://sync.staging.instedd.org:8080/mesh4x.

I will create an Epi Info collaborative mesh space (Epiinfo):

Then I will create the data feed (Oswego) and the fields I want to show on the map (mapping)

Here are the fields I want to share on a map (in the Mapping field):

<item.title>patient name: {Oswego/Name}</item.title>
<item.description>adress: {Oswego/Address}</item.description>
<geo.location>{geoLocation(Oswego/Address)}</geo.location>
<geo.longitude>{geoLongitude(Oswego/Address)}</geo.longitude>
<geo.latitude>{geoLatitude(Oswego/Address)}</geo.latitude>
<patient.ill>{Oswego/ILL}</patient.ill>
<patient.updateTimestamp>{Oswego/DateOnset}</patient.updateTimestamp>
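One way to read the mapping is as a template in which each `{Oswego/Field}` placeholder is replaced by that field's value for a given record. The Python sketch below illustrates that reading only. It is an assumption about how the placeholders behave, not Mesh4x's implementation; it handles plain field references and deliberately leaves function-style placeholders such as `{geoLatitude(Oswego/Address)}` untouched. The sample record and the names `FIELD_REF` and `expand` are hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Matches plain field references such as {Oswego/Name}.
FIELD_REF = re.compile(r"\{Oswego/(\w+)\}")


def expand(template, record):
    """Replace each {Oswego/Field} placeholder with the record's (escaped) value."""
    return FIELD_REF.sub(lambda m: escape(str(record[m.group(1)])), template)


template = (
    "<item.title>patient name: {Oswego/Name}</item.title>"
    "<patient.ill>{Oswego/ILL}</patient.ill>"
)
record = {"Name": "Patient19", "ILL": 0}  # hypothetical record
expanded = expand(template, record)
```

Escaping the substituted values keeps a name or address containing `&` or `<` from breaking the resulting feed item.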

Then, I will check the data in the cloud (the State's data ready to be shared) by visiting this site: http://sync.staging.instedd.org:8080/mesh4x/feeds/Epiinfo/Oswego. There are currently no records in the State's online database.

David: I will now make my data available on our online database (over the Amazon EC2/S3 cloud for this exercise) so Dr. Mir can synchronize with our records. I will now select the Data Exchange tab in the Epi Info™ Mesh4x tool and click Synchronize.

David: I will check whether my records are now available in the online database for Dr. Mir to synchronize. I refresh the page (http://sync.staging.instedd.org:8080/mesh4x/feeds/Epiinfo/Oswego) and here are all 47 records from all the counties except Oneida.

Roger (on the phone with David): Now that I can see the data in the cloud, I will synchronize my information with yours.

Roger: I prepare to launch the Epi Info™ Mesh4x tool and sync data over the Amazon EC2/S3 cloud (the State's available online data).

Select the Data Exchange tab and click Synchronize

Roger: Now I have the data from the other counties. The State database on the mesh should now have all of the data from Oneida County. I return to the web browser and refresh the mesh URL to show the full complement of 75 records.
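The push-then-pull exchange described above can be pictured with a small Python sketch. This is not Mesh4x's actual algorithm. It assumes each record has a unique key and a hypothetical `rev` counter, and it resolves differences by keeping the higher revision. The record keys and contents are placeholders; only the counts (47 and 28) come from the scenario.

```python
def merge(target, source):
    """Copy into target every record that is new or has a higher revision in source."""
    for key, record in source.items():
        if key not in target or record["rev"] > target[key]["rev"]:
            target[key] = dict(record)


def synchronize(local, cloud):
    """Two-way exchange: push local changes to the cloud feed, then pull the rest."""
    merge(cloud, local)
    merge(local, cloud)


# Placeholder records: 47 from the State database and 28 from Oneida County.
state = {f"state-{i}": {"ILL": 0, "rev": 1} for i in range(47)}
oneida = {f"oneida-{i}": {"ILL": 0, "rev": 1} for i in range(28)}
cloud = {}

synchronize(state, cloud)    # the feed now holds the State's 47 records
after_state_push = len(cloud)
synchronize(oneida, cloud)   # Oneida pushes its 28 records and pulls the 47
synchronize(state, cloud)    # the State pulls Oneida's 28 records

# A later edit bumps the revision, so it wins on the next exchange.
state["state-19"] = {"ILL": 1, "rev": 2}
synchronize(state, cloud)
synchronize(oneida, cloud)
```

After these calls, all three stores hold the same 75 records, and the edited record reaches Oneida without either investigator sending files to the other.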

Analysis of Outbreak Data for ALL Counties: Dr. Mir and Dr. Nitschke open the Analysis module in Epi Info™

Roger: To show that I have the data, I can rerun the statistics from before. I see in my Epi Info™ database that I now have 75 records across all five counties (46 ILL and 29 NOT ILL).

Epi Info™ Command: READ 'C:\epiinfo\data\epiinfo.mdb': viewOswego

Epi Info™ Command: FREQ COUNTY ILL

Roger: I will re-run the analysis from earlier.

Epi Info™ Command: TABLES BAKEDHAM ILL

Epi Info™ Command: TABLES VANILLA ILL

The results show that BAKEDHAM is no longer significant, while VANILLA clearly remains the likely source of the outbreak. I will call David and update him with the results.

David has already synchronized his records with the cloud, following the same steps as above to get the Oneida County data, re-run the analysis, and reached the same conclusion as Roger.

Generate Google Earth map for ALL counties

Roger: To gain a perspective on the cases for ALL counties, I will re-create the map and display it in Google Earth (or KML map layers). To generate the new map, I will follow the same steps as before by launching the Epi Info™ Mesh4x tool and generating a KML file:

Click the “Map Exchange” tab

Specify the data source (EpiInfo.mdb) and select the Oswego table

Click the “Create Map” button which will automatically generate a spatio-temporal map

It takes less than a minute for the tool to automatically geocode the new addresses in Dr. Mir's database and generate the map layers (a KML file).

Roger: To show the map, I will now open Google Earth and load the file generated by the Epi Info™ Mesh4x tool. The resulting map has 75 records in the 5 counties:

46 patients reported with GI illness (Red pins) between April 18th and 19th

29 interviewees did not show any illness (Yellow pins)

Updating Records

Scenario: Dr. Nitschke receives a call about a new incident case from Lewis County. Upon further investigation, he learns that a person previously recorded as not ill is now ill because he ate vanilla ice cream that he took home from the church supper.

David: I will update the information for Patient19 in the Epi Info™ database.

David: I will now make the update available on our online database (over the Amazon EC2/S3 cloud for this exercise) so Dr. Mir can have the latest update. I will now select the Data Exchange tab in the Epi Info™ Mesh4x tool and click Synchronize.

David: I will check whether my records are updated in the online database so that Dr. Mir can re-initiate the synchronization process. I refresh the page and check the illness and vanilla status for Patient19.

Roger: I will re-launch the Epi Info™ Mesh4x tool to get the updated record:

Sync data over the Amazon EC2/S3 cloud (State’s online available data)

Roger: I will check the data in my Epi Info™ database, and I see that I have the most up-to-date information on Patient19. Next, I will regenerate the map as previously described. The resulting map shows 75 cases across ALL counties:

46 patients reported with GI illness (Red pins) between April 18th and 19th

29 interviewees did not show any illness (Yellow pins)

Patient19's status has changed to ill (Red pin)

Sharing Data with the Centers for Disease Control and Prevention (CDC)

Scenario: Dr. Mark Berndt, an epidemiologist at the CDC foodborne branch, is now involved in the investigation and wants to access State data.

Mark: I will launch the Epi Info™ Mesh4x tool, synchronize to receive NY State's data, and then create a map of the data:

Select the Data Exchange tab and click Synchronize

Select the Map Exchange tab and click Create Map

The results map shows 75 cases in NY state across 5 counties:

47 patients reported with GI illness (Red pins) between April 18th and 19th

28 interviewees did not show any illness (Yellow pins)

Competing interests

InSTEDD was supported by research funding from Google.org and the Rockefeller Foundation.