3/25/2015 This is a project for the CS7930 class: Distributed Systems Development with Research
An Infectious Disease Surveillance Simulation (IDSS) in the Cloud
Prepared by: Jorge Edison Lascano
PhD student at Utah State University; Associate Professor at Universidad de las Fuerzas Armadas ESPE, Ecuador
Presented at Industry Day, Utah State University Computer Science Department
3/25/2015 [Infectious Disease Surveillance Simulation in the Cloud]
Introduction
• Objective
  – Understand the behavior of vector timestamps in a distributed system deployed in the cloud.
• Results
  – Infectious Disease Surveillance System
    • Implemented and tested
    • Set up for automated deployment to the cloud (AWS EC2 instances with S3 storage)
    • Demonstrated effective use of vector timestamps
Agenda
• Project Overview
• Tasks Details
• Technology
• Outstanding Risks and Issues
• Demo (available on GitHub)
• Results
• Conclusion
Project Overview
• Project description
  – This project is divided into three phases:
    • Implement an IDSS using communication protocols such as UDP, TCP, and HTTP
    • Implement automated deployment that
      – Starts up virtual machines in a cloud
      – Deploys software to those VMs
      – Launches the IDSS
    • Study the resulting vector timestamps
  – This project allowed us to observe the behavior of asynchronous calls in a distributed environment and to deploy and run its components automatically
IDSS Diagram
Vector Timestamp
Source http://compquiz.blogspot.com/2010/12/logical-and-vector-timestamps-in.html
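The update rules behind a vector timestamp can be sketched in a few lines of node.js. This is a minimal illustration, not the project's code: the class name `VectorClock`, its methods, and the three process ids are assumptions chosen for the example.

```javascript
// Minimal vector clock sketch. Class, method, and process names are illustrative.
class VectorClock {
  constructor(processIds, selfId) {
    this.selfId = selfId;
    this.clock = {};
    for (const id of processIds) this.clock[id] = 0;
  }

  // Local event: advance only this process's own entry.
  tick() {
    this.clock[this.selfId] += 1;
    return { ...this.clock };
  }

  // Sending counts as an event; the returned copy travels with the message.
  send() {
    return this.tick();
  }

  // Receiving: take the element-wise maximum, then count the receive event.
  receive(remote) {
    for (const id of Object.keys(this.clock)) {
      this.clock[id] = Math.max(this.clock[id], remote[id] || 0);
    }
    return this.tick();
  }
}

const ids = ['EMR1', 'HDS1', 'DOA1'];
const emr = new VectorClock(ids, 'EMR1');
const hds = new VectorClock(ids, 'HDS1');

const stamp = emr.send();                // EMR1 sends a report
const afterReceive = hds.receive(stamp); // HDS1 merges and counts the receive
console.log(stamp, afterReceive);
```

Each process only ever increments its own entry; every other entry changes solely through merging timestamps carried by incoming messages.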
Tasks Details
Ord. | Task | Description
1 | Implement HDS | A Health District System (HDS) collects infectious disease information from EMRs, keeps track of diseases, and sends information to every DOA (DOA1: Influenza, DOA2: Chicken pox, DOA3: Measles). It saves its vector timestamp to the log file.
2 | Implement EMR | The Electronic Medical Record (EMR) simulator periodically sends information about diseases to its HDS. It saves its vector timestamp to the log file.
3 | Implement DOA | Every DOA keeps track of a disease rate; if a threshold is reached, it sends a notification to every HDS. It saves its vector timestamp to the log file.
4 | Local Tests | EMR, HDS, and DOA processes generate traffic and vector timestamps locally.
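The DOA behavior in task 3 might look roughly like the following sketch. It is a simplification under stated assumptions: the class name `Doa`, the threshold value, the notification shape, and the use of a simple case count in place of a disease rate are all hypothetical and do not come from the project.

```javascript
// Hypothetical DOA sketch: counts reported cases and notifies every HDS
// once a threshold is reached. Names and values are illustrative only.
class Doa {
  constructor(disease, threshold, hdsNames) {
    this.disease = disease;
    this.threshold = threshold;
    this.hdsNames = hdsNames;
    this.count = 0;
  }

  // Add reported cases; return one notification per HDS when the
  // running count has reached the threshold, otherwise an empty list.
  report(cases) {
    this.count += cases;
    if (this.count < this.threshold) return [];
    return this.hdsNames.map((to) => ({
      to,
      disease: this.disease,
      count: this.count,
    }));
  }
}

const doa = new Doa('Influenza', 5, ['HDS1', 'HDS2', 'HDS3']);
const quiet = doa.report(3);  // below the threshold: nothing to send
const alerts = doa.report(2); // threshold reached: one notification per HDS
console.log(quiet.length, alerts.map((a) => a.to));
```

In the real system each notification would be sent over the network and stamped with the DOA's vector timestamp; here the notifications are only returned so the logic stays easy to follow.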
Tasks Details
Ord. | Task | Description
5 | Launching / starting instances | A number of EC2 instances are launched; these instances host the EMR, HDS, and DOA processes.
6 | Name Resolution | A shell script registers the 9 EMRs, 3 HDSs, and 3 DOAs and updates the hosts file on every EC2 instance. The number of launched instances varies from 1 to 15. A simple round-robin algorithm is used when fewer than 15 instances are launched.
7 | Deployment | A script pushes changes and automatically distributes the system to the EC2 instances.
8 | Execution | After the system is deployed, a script starts each of the 15 processes on the instance assigned to it by the name resolution.
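The round-robin name resolution in task 6 can be illustrated as follows. The project implements this step as a shell script; the sketch below uses node.js only to show the idea, and the function names and IP addresses are assumptions made for the example.

```javascript
// Build the 15 process names from the table: 9 EMRs, 3 HDSs, 3 DOAs.
function processNames() {
  const names = [];
  for (let i = 1; i <= 9; i++) names.push(`EMR${i}`);
  for (let i = 1; i <= 3; i++) names.push(`HDS${i}`);
  for (let i = 1; i <= 3; i++) names.push(`DOA${i}`);
  return names;
}

// Assign each process name to an instance address in round-robin order.
function assignRoundRobin(names, addresses) {
  if (addresses.length === 0) {
    throw new Error('at least one instance is required');
  }
  const hosts = {};
  names.forEach((name, i) => {
    hosts[name] = addresses[i % addresses.length];
  });
  return hosts;
}

// Render hosts-file style lines: "<ip> <name>".
function hostsLines(hosts) {
  return Object.entries(hosts).map(([name, ip]) => `${ip} ${name}`);
}

// Example with 4 instances (addresses are made up).
const hosts = assignRoundRobin(processNames(), [
  '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4',
]);
console.log(hostsLines(hosts).join('\n'));
```

With 15 instances every process gets its own host; with a single instance every name resolves to the same address, so the same scripts work at any scale from 1 to 15.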
Technology
• Framework: node.js
• IDE: WebStorm
• Cloud provisioning: AWS (S3 + EC2 + EBS)
• Languages
  – JavaScript (node.js)
  – Shell script (bash, awk, sed)
• Communication protocols
  – UDP, HTTP
  – ssh, AWS CLI
• Partial ordering algorithm: vector timestamps
Outstanding Risks and Issues
• Project Risks
  – Phase 1: Implementation of the system
    • Risk: New language, new IDE
    • Solution: Tutorials
  – Phase 2: Distribution in the cloud
    • Risk: Running out of money (I still cannot find my money)
    • Solution: Monitor, detach, and shut down every resource
• Project Issues
  – Phase 1:
    • Debugging
  – Phase 2:
    • Testing
    • S3 sync will not update if the file size is the same
Demo
• https://github.com/elascano/HW2VectorTimestamp_on_AWS
EC2 instances
Vector Timestamps
Results
• A set of vector timestamps that can be used to partially order the execution of the events.
• These vector timestamps show which processes carry most of the load in the whole system.
• A non-blocking, asynchronous system that can process information from any client at any time. The asynchronous behavior was easy to implement using node.js.
• An automated process for deploying distributed systems.
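The first two results can be illustrated with a short sketch. It assumes timestamps are stored as equal-length arrays, one entry per process; the function names `compare` and `busiest` are hypothetical and not taken from the project.

```javascript
// Compare two vector timestamps (arrays of equal length).
// Returns 'before', 'after', 'equal', or 'concurrent'.
function compare(a, b) {
  if (a.length !== b.length) {
    throw new Error('timestamps must have the same length');
  }
  let aLess = false;
  let bLess = false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] < b[i]) aLess = true;
    if (a[i] > b[i]) bLess = true;
  }
  if (aLess && !bLess) return 'before';  // a happened before b
  if (bLess && !aLess) return 'after';   // b happened before a
  if (!aLess && !bLess) return 'equal';
  return 'concurrent';                   // no causal order between a and b
}

// finalClocks[i] is the last timestamp logged by process i. Its own entry,
// finalClocks[i][i], counts the events that process i executed.
function busiest(finalClocks) {
  let best = 0;
  finalClocks.forEach((clock, i) => {
    if (clock[i] > finalClocks[best][best]) best = i;
  });
  return best;
}

console.log(compare([1, 0, 0], [1, 1, 0]));
console.log(compare([2, 0, 0], [0, 1, 0]));
console.log(busiest([[3, 1, 0], [3, 5, 2], [0, 0, 2]]));
```

The ordering is only partial: two timestamps where each has some entry larger than the other belong to concurrent events, and no order between them can be inferred.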
Conclusion
• This project helped us understand the order (and disorder) in which processes and their events execute, depending on the communication model, in a real distributed environment.
• If there is no direct or indirect communication from process y to process x, then the y-th element of x's timestamp is never updated.
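The last point can be checked with a small simulation. The three processes, the helper names, and the message sequence below are assumptions made for illustration; they are not taken from the project's logs.

```javascript
// Three processes x, y, z, each with a vector timestamp of length 3.
const X = 0, Y = 1, Z = 2;
const clocks = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];

// A local event advances only the process's own entry.
function localEvent(p) {
  clocks[p][p] += 1;
}

// Deliver a message immediately: the sender ticks, the receiver merges
// the sender's timestamp element-wise and then ticks for the receive.
function sendMessage(from, to) {
  clocks[from][from] += 1;
  const stamp = clocks[from].slice();
  clocks[to] = clocks[to].map((v, i) => Math.max(v, stamp[i]));
  clocks[to][to] += 1;
}

// y is busy but talks to nobody; x and z exchange messages.
localEvent(Y);
localEvent(Y);
sendMessage(X, Z);
sendMessage(Z, X);
const isolated = clocks[X].slice(); // x's y-th element is still 0

// Now y reaches x indirectly, through z.
sendMessage(Y, Z);
sendMessage(Z, X);
console.log(isolated, clocks[X]);
```

As long as no chain of messages leads from y to x, x has no knowledge of y's events and its y-th entry stays at zero; the first indirect path (y to z, then z to x) is enough to update it.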