Cloud Computing at Amazon’s EC2 Joe Steele jrsteele@unomaha.edu


Page 1: Cloud Computing at Amazon’s EC2 Joe Steele jrsteele@unomaha.edu

Cloud Computing at Amazon’s EC2

Joe Steele jrsteele@unomaha.edu

Page 2:

Grid Computing

• Shared resources – many computer clusters transferring data and running jobs.
• Geographically distributed.
• Cross-grid collaboration.
• The idea is analogous to the electric power network (grid), where power generators are distributed, but users access electric power without bothering about the source of energy and its location.

Page 3:

LHC Computing Grid (LCG)

Page 4:

Cloud Computing

• What if I don’t have my own cluster?
• Cloud computing refers to a cluster that invites users to send jobs (SaaS – Software as a Service).
• Computation, software, data access, and storage services that do not require user knowledge of the location or configuration of the system.
• The term comes from the cloud drawing once used to represent the telephone network and later the Internet.

Page 5:

Cloud Computing

Page 6:

Cloud Computing

• Private companies operate large data centers.
• When considering operational costs, 50k servers are cheaper per CPU than 1k servers (5 to 7 times cheaper).

Amazon:
• $0.085/cpu-hour
• No minimum, no maximum
• No contract
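
As a rough check on what a job might cost, the sketch below multiplies hours of use by the hourly rate. The function name ec2_cost is made up for this example, and the default rate is the $0.085/cpu-hour figure quoted on this slide; storage and data transfer charges are not included.

```shell
# Estimate a compute bill: hours of use times the hourly rate.
# The default rate is the $0.085/cpu-hour quoted on the slide (an assumption
# for illustration; check the pricing page for current figures).
ec2_cost() {   # usage: ec2_cost HOURS [RATE_PER_HOUR]
    hours=$1
    rate=${2:-0.085}
    awk -v h="$hours" -v r="$rate" 'BEGIN { printf "%.2f\n", h * r }'
}

ec2_cost 8      # one working day on one CPU
ec2_cost 720    # a VM left running for 30 days
```

The second call shows why a forgotten VM matters: the bill grows with every hour the machine stays up, whether or not it is doing work.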

Page 7:

Amazon EC2

• aws.amazon.com
• Computing cluster – create an account and provide a credit card.
• Let Amazon take care of the hardware.

Page 8:

Cloud BioLinux

JCVI (J. Craig Venter Institute) created a cloud version of the NERC BioLinux VM.

An Ubuntu machine with over 100 NEBC software packages. The image is stored at EC2 and is available for EC2 users to copy at no charge.

Page 9:

http://aws.amazon.com

Page 10:

Create a new account

Page 11:

Enter your information

Page 12:

Sign up for an EC2 account

Page 13:

Click on “Sign up for Amazon EC2”

Page 14:

EC2 Account

• Signing up for EC2 automatically signs you up for Amazon Simple Storage Service and Amazon Virtual Private Cloud.
• Requires credit card information.
• No charges until you start using the services.
• Amazon will email you with Access Identifiers and instructions for your first log in.

Page 15:

Click on “AWS Management Console”

Page 16:

Click the EC2 Tab

Page 17:

Launch an Instance

Page 18:

I recommend biolinux

Page 19:

Click “Select”

Page 20:

Pricing

• Amazon has a variety of VM sizes available – pricing is at: http://aws.amazon.com/ec2/pricing/

• You are charged for CPU usage, for data storage, and for data transferred to or from Amazon. Charges continue until a VM is “Terminated”.

• You can set up a small test VM for free – select “Micro” for the size.

Page 21:

Kernel defaults are fine

Page 22:

Create a Key Pair
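
ssh refuses a private key file that other users can read, so it helps to restrict the downloaded .pem file right away. A minimal sketch, using an empty placeholder file in a scratch directory in place of the real key (the name key_pair_name.pem follows the ssh example in these slides; the scratch directory is only for the demonstration):

```shell
# Demonstration on a placeholder file; with a real key, only the chmod
# line is needed, run against your own downloaded .pem file.
workdir=$(mktemp -d)
key="$workdir/key_pair_name.pem"
: > "$key"                          # empty stand-in for the private key
chmod 400 "$key"                    # owner may read; no access for anyone else
perms=$(ls -l "$key" | cut -c1-10)  # first ten characters are the mode string
echo "$perms"
rm -rf "$workdir"
```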

Page 23:

Create security group

Page 24:

Launch

Page 25:

Machine info

Page 26:

“Terminate” to end charges
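
These slides launch and terminate through the web console. The same two steps might also be scripted, assuming the AWS command line tool (aws) is installed and configured, which these slides do not cover. In the sketch below the helper names, the image ID, instance ID, instance type, and group name are all placeholders, and by default the commands are only printed, not run.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Launch one VM. All four arguments are placeholders supplied by the caller.
launch_instance() {   # usage: launch_instance AMI_ID TYPE KEY_NAME GROUP
    run aws ec2 run-instances --image-id "$1" --instance-type "$2" \
        --key-name "$3" --security-groups "$4" --count 1
}

# Terminate a VM so that charges for it stop.
terminate_instance() {   # usage: terminate_instance INSTANCE_ID
    run aws ec2 terminate-instances --instance-ids "$1"
}

launch_instance ami-EXAMPLE t1.micro key_pair_name my-group
terminate_instance i-EXAMPLE
```

Setting DRY_RUN=0 would execute the commands for real, which starts billing; printing first makes it easy to check the IDs before anything is launched or destroyed.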

Page 27:

ssh to the machine

A window opens, telling you how to connect to your new VM, e.g.:

ssh -i key_pair_name.pem [email protected]

However, for biolinux, do:

ssh -i key_pair_name.pem ubuntu@ec2-76-202-01-919.compute-1.amazonaws.com
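
To avoid retyping the key file and the long host name, the connection details can be kept in the ssh client configuration. A minimal sketch: the function name ssh_config_entry, the alias biolinux, and the host name ec2-host.example.com are made up for illustration, and the user ubuntu is the BioLinux login given above.

```shell
# Print an ssh client configuration entry for a BioLinux VM.
ssh_config_entry() {   # usage: ssh_config_entry ALIAS HOSTNAME KEY_FILE
    cat <<EOF
Host $1
    HostName $2
    User ubuntu
    IdentityFile $3
EOF
}

# Placeholder host name; substitute the public DNS name of your own VM.
ssh_config_entry biolinux ec2-host.example.com "$HOME/.ssh/key_pair_name.pem"
```

Appending the printed entry to ~/.ssh/config should let "ssh biolinux" stand in for the full command. The public DNS name changes each time a new VM is launched, so the entry needs updating then.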

Page 28:

NX

Use NX for the graphical display (already built in to biolinux). It is open source and can be found at http://www.nomachine.com/

You must ssh into the VM FIRST, using the key pair, then run:
>adduser <username>
>groups
>usermod -G <grp1>,<grp2>,ssh <username>
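
The usermod -G step replaces the user's whole list of supplementary groups, which is why the existing groups are listed again in front of ssh. A small sketch of building that list; the function name group_list, the user alice, and the groups adm and sudo are made up for illustration, and the final command is printed rather than run.

```shell
# Build the comma-separated list that usermod -G expects:
# the groups passed in, followed by "ssh".
group_list() {   # usage: group_list "grp1 grp2 ..."
    printf '%s ssh\n' "$1" | tr -s ' ' ','
}

current="adm sudo"   # stand-in for the groups reported by: groups
echo "usermod -G $(group_list "$current") alice"
```

An alternative on systems whose usermod supports it is "usermod -a -G ssh <username>", which appends ssh and leaves the existing groups alone.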

Page 29:

Start NX

Page 30:

“Configure”

Page 31:

BioLinux over NX

Page 32:

Data Stored at Amazon

There are large datasets stored at Amazon, available for use – free of charge (mostly). You are charged for any data you copy.

Go to http://aws.amazon.com/datasets to search through them.

Page 33:

http://aws.amazon.com/datasets

Page 34:

Datasets

Human DNA sequences:
• 1000 Genomes Project (7,300 GB)
• Ensembl Annotated Human Genome - FASTA (115 GB)
• Ensembl Annotated Human Genome - MySQL (200 GB)
• GenBank (200 GB)
• Human Liver Cohort (Sage Bionetworks) (0.6 GB)
• Illumina - Jay Flatley's Human Genome Data Set (350 GB)
• YRI Trio Data - complete genome sequence for three individuals (700 GB)

Other (might include some human data):
• Ensembl - FASTA DB (100 GB)
• Influenza Virus (including Swine Flu) - from NCBI (1 GB)
• UniGene - from NCBI (10 GB)
• PubChem Library - from NCBI (230 GB)

Page 35:

Public Snapshots

Page 36:
Page 37:

Select “Volumes”

Page 38:

Create a Volume

Page 39:

Instance Information

Page 40:

Attach it to your Instance
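
A volume can only be attached to an instance in the same availability zone, so the zone shown under Instance Information is the one to choose when creating the volume. The console steps might also be scripted, assuming the AWS command line tool (aws) is installed and configured, which these slides do not cover. The helper names and every ID and zone below are placeholders, and by default the commands are only printed.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Create a volume from a public snapshot, in the zone where the VM runs.
create_volume() {   # usage: create_volume SNAPSHOT_ID ZONE
    run aws ec2 create-volume --snapshot-id "$1" --availability-zone "$2"
}

# Attach the volume to the VM as /dev/sdf, the device used in these slides.
attach_volume() {   # usage: attach_volume VOLUME_ID INSTANCE_ID
    run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device /dev/sdf
}

create_volume snap-EXAMPLE us-east-1a
attach_volume vol-EXAMPLE i-EXAMPLE
```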

Page 41:

Mount the Volume

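From your VM:
>sudo mkdir /mnt/datasets
>sudo mount -t ext3 /dev/sdf /mnt/datasets

Only a new, empty volume needs a file system first (>sudo mkfs -t ext3 /dev/sdf). Do not run mkfs on a volume created from a dataset snapshot: it erases the data.

200 GB of GenBank data are now in /mnt/datasets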
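
Because mkfs erases whatever is on the device, it is worth checking whether the volume already holds a file system before formatting. One way is "sudo file -s /dev/sdf", which reports just "data" for a device with no recognizable file system. The sketch below only interprets that output; the function name is_blank_device is made up, and the two sample strings were typed in for illustration, not captured from a VM.

```shell
# Decide from the output of `sudo file -s DEVICE` whether a device is blank.
is_blank_device() {   # usage: is_blank_device "OUTPUT OF file -s"
    case "$1" in
        *": data") return 0 ;;   # nothing recognizable on the device
        *)         return 1 ;;   # some file system or other content found
    esac
}

for sample in "/dev/sdf: data" \
              "/dev/sdf: Linux rev 1.0 ext3 filesystem data"; do
    if is_blank_device "$sample"; then
        echo "blank: format before mounting"
    else
        echo "has a file system: mount without formatting"
    fi
done
```

On a real VM the argument would be "$(sudo file -s /dev/sdf)" in place of the sample strings.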