
An Infectious Disease Surveillance Simulation (IDSS) in the Cloud



3/25/2015. This is a project for the CS7930 class, Distributed Systems Development with Research.

Prepared by: Jorge Edison Lascano

PhD student at Utah State University
Associate Professor at Universidad de las Fuerzas Armadas ESPE-Ecuador

Presented at the Industry Day of the Utah State University Computer Science Department


Introduction

•  Objective
   –  Understand the behavior of Vector Timestamps in a distributed system deployed in the cloud.
•  Results
   –  Infectious Disease Surveillance System
      •  Implemented and tested
      •  Set up for automated deployment to the cloud (AWS EC2 instances with S3 storage)
      •  Demonstrated effective use of Vector Timestamps


Agenda

•  Project Overview
•  Tasks Details
•  Technology
•  Outstanding Risks and Issues
•  Demo (GitHub available)
•  Results
•  Conclusion


Project Overview

•  Project description
   –  This project is divided into three phases:
      •  Implement an IDSS using communication protocols such as UDP, TCP, and HTTP
      •  Implement automated deployment that
         –  Starts up virtual machines in a cloud
         –  Deploys software to those VMs
         –  Launches the IDSS
      •  Study the resulting Vector Timestamps
   –  This project allowed us to observe the behavior of asynchronous calls in a distributed environment and to deploy and run its components automatically


IDSS Diagram


Vector Timestamp


Source http://compquiz.blogspot.com/2010/12/logical-and-vector-timestamps-in.html
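The usual vector timestamp update rules can be sketched in a few lines of JavaScript. This is a minimal illustration, not the code from the project repository: the class name `VectorClock`, its methods, and the three-process setup (one EMR, one HDS, one DOA) are assumptions made for the example.

```javascript
// Minimal vector clock sketch. VectorClock, tick, send, and receive are
// illustrative names and are not taken from the project repository.
class VectorClock {
  constructor(processIndex, processCount) {
    this.index = processIndex;                      // position of this process in the vector
    this.clock = new Array(processCount).fill(0);   // one counter per process
  }

  // Local event: advance only this process's own entry.
  tick() {
    this.clock[this.index] += 1;
    return this.clock.slice();
  }

  // Send event: count the send and return a copy of the clock to attach to the message.
  send() {
    return this.tick();
  }

  // Receive event: take the element-wise maximum with the received clock,
  // then count the receive as a local event.
  receive(remoteClock) {
    for (let i = 0; i < this.clock.length; i++) {
      this.clock[i] = Math.max(this.clock[i], remoteClock[i]);
    }
    return this.tick();
  }
}

// Assumed layout: index 0 = EMR, index 1 = HDS, index 2 = DOA.
const emr = new VectorClock(0, 3);
const hds = new VectorClock(1, 3);
const doa = new VectorClock(2, 3);

const reportStamp = emr.send();     // EMR reports a case to its HDS
hds.receive(reportStamp);
const forwardStamp = hds.send();    // HDS forwards the information to a DOA
doa.receive(forwardStamp);

console.log(emr.clock, hds.clock, doa.clock);
```

After this exchange the DOA's timestamp reflects the events at the EMR and the HDS that causally precede it, while the EMR's own timestamp still knows nothing about the other two processes.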


Tasks Details

Ord. | Task | Description
1 | Implement HDS | A Health District System (HDS) collects infectious disease information from EMRs and keeps track of diseases. It also sends information to every DOA (DOA1: influenza, DOA2: chicken pox, DOA3: measles) and saves its Vector Timestamp to the log file.
2 | Implement EMR | The Electronic Medical Record (EMR) Simulator periodically sends information about diseases to its HDS and saves its Vector Timestamp to the log file.
3 | Implement DOA | Every DOA keeps track of a disease rate. If a threshold is reached, it sends a notification to every HDS. It saves its Vector Timestamp to the log file.
4 | Local Tests | The EMR, HDS, and DOA processes generate traffic and Vector Timestamps locally.
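The DOA behavior described in task 3 could look roughly like the sketch below. Everything specific here is an assumption for illustration: the class name `DoaTracker`, the use of a simple case count as the "rate", the threshold value, and the HDS names are not taken from the project.

```javascript
// Hypothetical DOA sketch: the count-based "rate", the threshold value, and
// the HDS names below are assumptions for illustration only.
class DoaTracker {
  constructor(disease, threshold, hdsNames) {
    this.disease = disease;
    this.threshold = threshold;
    this.hdsNames = hdsNames;
    this.caseCount = 0;
  }

  // Record cases reported by an HDS. Returns the notifications to send:
  // one per HDS once the threshold has been reached, otherwise an empty list.
  // In this sketch every report at or above the threshold triggers notifications.
  report(newCases) {
    this.caseCount += newCases;
    if (this.caseCount < this.threshold) return [];
    return this.hdsNames.map((hds) => ({
      to: hds,
      disease: this.disease,
      cases: this.caseCount,
    }));
  }
}

const doa1 = new DoaTracker('Influenza', 10, ['hds1', 'hds2', 'hds3']);
const first = doa1.report(4);   // below the threshold: nothing to send
const second = doa1.report(7);  // 11 cases in total: notify every HDS
console.log(first.length, second.length);
```

In a full system each notification would be sent over the network and would carry the DOA's current vector timestamp; that part is left out to keep the sketch short.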


Tasks Details

Ord. | Task | Description
5 | Launching / starting instances | A number of EC2 instances are launched. These instances host the EMR, HDS, and DOA processes.
6 | Name Resolution | A shell script registers the 9 EMRs, 3 HDSs, and 3 DOAs and updates the hosts file on every EC2 instance. The number of launched instances varies from 1 to 15. A simple round-robin algorithm is used when fewer than 15 instances are launched.
7 | Deployment | A script updates the system with the latest changes and distributes it automatically to the EC2 instances.
8 | Execution | After the system is deployed, a script starts each of the 15 processes on the host given by the name resolution.
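The round-robin assignment from task 6 can be sketched as follows. The project implements this step as a shell script; the JavaScript version below only illustrates the idea, and the process names (`emr1` to `emr9`, `hds1` to `hds3`, `doa1` to `doa3`) and IP addresses are hypothetical.

```javascript
// Process names and IP addresses are hypothetical; the project performs this
// step in a shell script that rewrites the hosts file on each instance.
function processNames() {
  const names = [];
  for (let i = 1; i <= 9; i++) names.push(`emr${i}`);
  for (let i = 1; i <= 3; i++) names.push(`hds${i}`);
  for (let i = 1; i <= 3; i++) names.push(`doa${i}`);
  return names;
}

// Assign each of the 15 processes to an instance in round-robin order and
// return hosts-file style lines ("<ip> <name>").
function buildHostsEntries(instanceIps) {
  if (instanceIps.length < 1 || instanceIps.length > 15) {
    throw new RangeError('expected between 1 and 15 instances');
  }
  return processNames().map(
    (name, i) => `${instanceIps[i % instanceIps.length]} ${name}`
  );
}

// With 4 instances, processes 1, 5, 9, and 13 share the first instance.
const entries = buildHostsEntries(['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4']);
console.log(entries.join('\n'));
```

With exactly 15 instances every process gets its own host; with a single instance all 15 names resolve to the same address.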


Technology

•  Framework: node.js
•  IDE: WebStorm
•  Cloud Provision: AWS (S3 + EC2 + EBS)
•  Languages
   –  JavaScript (node.js)
   –  Shell script (bash, awk, sed)
•  Communication Protocols
   –  UDP, HTTP
   –  ssh, AWS CLI
•  Partial Ordering Algorithm: Vector Timestamps


Outstanding Risks and Issues

•  Project Risks
   –  Phase 1: Implementation of the system
      •  Risk: new language, new IDE
      •  Solution: tutorials
   –  Phase 2: Distribution in the cloud
      •  Risk: running out of money (I still cannot find my money)
      •  Solution: monitor, detach, and shut down every resource
•  Project Issues
   –  Phase 1:
      •  Debugging
   –  Phase 2:
      •  Testing
      •  S3 sync will not update a file if its size is unchanged


Demo

•  https://github.com/elascano/HW2VectorTimestamp_on_AWS

[Screenshots: EC2 instances; Vector Timestamps]


Results


•  A set of vector timestamps that can be used to partially order the execution of the events.

•  These vector timestamps show which processes carry most of the load in the whole system.

•  A non-blocking, asynchronous system that can process information from any client at any point in time. The asynchronous behavior was easy to implement using node.js.

•  An automated process for deploying distributed systems.
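The first two results can be illustrated with a short sketch that works on logged timestamps. The function names `compareTimestamps` and `eventCounts` and the sample log are assumptions for this example; the actual log format of the project is not shown in the slides.

```javascript
// Compare two vector timestamps of equal length.
// Returns 'before' if a happened before b, 'after' if b happened before a,
// 'concurrent' if neither precedes the other, and 'equal' if they match.
function compareTimestamps(a, b) {
  if (a.length !== b.length) {
    throw new Error('timestamps must have the same length');
  }
  let aSmallerSomewhere = false;
  let bSmallerSomewhere = false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) aSmallerSomewhere = true;
    if (a[i] > b[i]) bSmallerSomewhere = true;
  }
  if (aSmallerSomewhere && bSmallerSomewhere) return 'concurrent';
  if (aSmallerSomewhere) return 'before';
  if (bSmallerSomewhere) return 'after';
  return 'equal';
}

// Estimate how many events each process handled: the element-wise maximum
// over all logged timestamps gives the highest event counter seen per process.
function eventCounts(loggedTimestamps) {
  const counts = new Array(loggedTimestamps[0].length).fill(0);
  for (const stamp of loggedTimestamps) {
    stamp.forEach((value, i) => {
      counts[i] = Math.max(counts[i], value);
    });
  }
  return counts;
}

// Hypothetical log for three processes.
const log = [[1, 0, 0], [1, 1, 0], [1, 2, 0], [1, 2, 1]];
console.log(compareTimestamps([1, 0, 0], [1, 2, 1]));  // causally ordered pair
console.log(compareTimestamps([2, 0, 0], [1, 2, 0]));  // no causal order between them
console.log(eventCounts(log));
```

Because the ordering is only partial, some pairs of events come back as concurrent; that is expected and does not indicate an error in the timestamps.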


Conclusion

•  This project helped us understand the order (and disorder) in which processes and their events execute, based on the communication model, in a real distributed environment.

•  If there is no direct or indirect communication between processes x and y, then the y-th element of x's timestamp does not get updated.
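The second conclusion can be checked with a small replay of messages. The sketch below is a simplification under stated assumptions: the function `replay` is hypothetical, messages are delivered immediately in list order, and only send and receive events are counted.

```javascript
// Replay a list of [from, to] messages over n processes and return the final
// vector clock of every process. Delivery is assumed to be immediate.
function replay(n, messages) {
  const clocks = Array.from({ length: n }, () => new Array(n).fill(0));
  for (const [from, to] of messages) {
    clocks[from][from] += 1;                 // send event at the sender
    const stamp = clocks[from].slice();      // timestamp carried by the message
    for (let i = 0; i < n; i++) {
      clocks[to][i] = Math.max(clocks[to][i], stamp[i]);
    }
    clocks[to][to] += 1;                     // receive event at the receiver
  }
  return clocks;
}

const X = 0, Y = 1, Z = 2;

// Case 1: only z talks to x, so nothing from y ever reaches x.
const isolated = replay(3, [[Z, X]]);

// Case 2: y talks to z, then z talks to x, so x learns about y indirectly.
const indirect = replay(3, [[Y, Z], [Z, X]]);

console.log(isolated[X], indirect[X]);
```

In the first case the y-th element of x's timestamp stays at 0; in the second it becomes 1 even though x and y never exchange a message directly.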
