Amazon Web Services Introduction
AWS IMMERSION DAY
Facilitators
Julia Knox, Baccarelli Lab Staff
Raj Chary, AWS Solutions Architect
Dana Koch, AWS Senior Account Executive - Education
January 31, 2019
AWS Immersion Day Schedule
Morning
● 10:00 am to 10:10 am - Welcome/Introduction
● Current setup/challenges
○ Current cluster setup
○ Ability to run jobs when needed
○ Backup and centralized storage
● 10:10 am to 10:40 am - Set the stage on AWS
● 10:40 am to 10:55 am - Audience Poll
○ Challenges and areas of focus
● 11:00 am to 12:00 pm
○ Demo - show and tell
■ R computing
■ S3 for backup and storage
Afternoon
● 12:00 pm to 12:30 pm - [CATERED LUNCH]
● 12:30 pm to 2:30 pm - Lab session - Demo/Show and Tell, 120 mins (Laptops Required)
■ R
■ S3
■ EC2
● 2:30 pm to 3:30 pm
○ Challenge - TBD
● 3:30 pm to 4:00 pm
○ Final Wrap-up and Q&A
Welcome to AWS Immersion Day!
Current Set-up Challenges
HPC at Columbia
$ ssh [email protected]
$ qlogin -l mem=2G,time=:20:
$ cd /ifs/scratch/c2b2/my_lab/user
$ ./workhard
● Running a job on the cluster requires specifying memory and time
● Charges apply for time over 100K CPU hours and for storage
● Heavy dependence on the command line
AWS Workflow Examples
Introduction: AWS Computing Overview
AWS Computing Jobs:
● Run on virtual servers, known as instances, enabled by Amazon Elastic Compute Cloud (EC2)
● Approximately 4-minute launch time for instances
○ Instantly launch Amazon Machine Images (AMIs)
■ Images preconfigured to suit research needs
● Call objects from S3 (storage) on instances using RStudio
AWS Advantage: Time
Columbia Cluster Computing Jobs:
● Compute the memory-factored CPU hours (MFCPU) using the formula N * max(1.0, M / 3G) * H.
● Example: if you submit a job for 12G x 12 hours and it runs for 10 hours, this would be (12G / 3G) * 10 = 40 MFCPU. Here M is the memory requested, H is the hours the job actually runs, and N is 1.
● Monthly limit is 100K CPU hours
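The MFCPU formula can be checked with a few lines of Python. This is a minimal sketch, not the cluster's own accounting code: the function name is made up for illustration, and reading N as the number of cores is an assumption, since the slide only shows the N = 1 case.

```python
def mfcpu_hours(cores: int, mem_gb: float, hours: float) -> float:
    """Memory-factored CPU hours: N * max(1.0, M / 3G) * H.

    Assumption: N is the number of cores; the slide does not define it.
    """
    # Memory is charged in 3G units, but never at less than a factor of 1.0.
    memory_factor = max(1.0, mem_gb / 3.0)
    return cores * memory_factor * hours


# Slide example: 12G requested, the job runs for 10 hours.
print(mfcpu_hours(cores=1, mem_gb=12, hours=10))  # 40.0

# A job under 3G is still charged at a factor of 1.0.
print(mfcpu_hours(cores=1, mem_gb=2, hours=5))    # 5.0
```

Note that the charge uses the hours actually run (10), not the 12 hours requested.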
AWS Computing Jobs:
● Run on virtual servers, known as instances, enabled by Amazon Elastic Compute Cloud
● Instant launch of HPC clusters
● Scaling options
● Multiple payment choices
○ Enables you to scale by priority
■ E.g., with on-demand launching, you pay by the hour of compute time.
■ If you have a job that can tolerate interruptions, you can also bid on unused capacity, called “Spot Instances”, which are normally priced 50% to 93% lower than the on-demand price.
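To get a feel for the payment choices above, the 50% to 93% Spot discount range can be turned into a rough cost estimate. This is only a sketch: the hourly price used below is a made-up placeholder, not a real AWS rate, and actual Spot prices vary by instance type, region, and time.

```python
def estimate_costs(on_demand_hourly: float, hours: float) -> dict:
    """Compare on-demand cost with the Spot range quoted on the slide.

    The 50% and 93% figures come from the slide; the hourly price is
    whatever the caller supplies (hypothetical in the example below).
    """
    on_demand = on_demand_hourly * hours
    return {
        "on_demand": round(on_demand, 2),
        # Best case: Spot is 93% cheaper than on-demand.
        "spot_low": round(on_demand * (100 - 93) / 100, 2),
        # Worst case in the quoted range: Spot is 50% cheaper.
        "spot_high": round(on_demand * (100 - 50) / 100, 2),
    }


# Hypothetical price of $0.10 per hour for a 100-hour job.
costs = estimate_costs(on_demand_hourly=0.10, hours=100)
for label, dollars in costs.items():
    print(f"{label}: ${dollars:.2f}")
```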
AWS Advantage: Cost
Elasticity/Scalability Options
Pay for what you use
Lab member data needs (based on recent survey)
Estimated costs: approximately $1,000 per year or less
AWS Advantage: Data Transfers
Amazon AWS
● Allow a researcher from a different organization to access your AWS environment while you maintain control of what data and systems they can see and what they can do within the AWS account.
● This cross-account access is enabled by AWS Identity and Access Management (IAM).
Columbia HPC
● Hard drive
● Does not enable API-enabled instance access for file sharing
Lab member data transfer needs (based on recent survey)
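One common way to set up the cross-account access described above is an IAM role that the outside researcher's account is allowed to assume, paired with a permissions policy that limits what the role can see. The sketch below only builds the two JSON policy documents; it does not call AWS. The account ID, bucket name, and prefix are placeholders, and this is not necessarily how the lab's own account is configured.

```python
import json


def trust_policy(collaborator_account_id: str) -> dict:
    """Trust policy: lets another AWS account assume a role in yours."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{collaborator_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Permissions policy: read-only access to one folder of one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Allow listing only the shared folder.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {   # Allow downloading objects inside that folder.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Placeholder values, for illustration only.
print(json.dumps(trust_policy("111122223333"), indent=2))
print(json.dumps(read_only_policy("example-lab-bucket", "shared"), indent=2))
```

The role owner keeps control: removing the trust statement or narrowing the permissions policy changes what the collaborator can do without touching their account.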
Audience Poll
What is your biggest challenge with...
○ ...data organization?
○ ...sharing data?
○ ...receiving data?
○ ...performing large analyses?
AWS System
AWS Reference In Development - GitHub Page
S3 for Backup and Storage
Example S3 Bucket from Baccarelli Lab AWS Set-up
This is Allison’s bucket. Only she and the people named in her “bucket policy” have access to its objects.
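A bucket policy is a JSON document attached to the bucket that lists who may do what. The sketch below shows roughly what such a policy could look like for read access; the bucket name and user ARNs are hypothetical, and the actual policy on the lab's bucket is not shown in these slides.

```python
import json


def bucket_read_policy(bucket: str, allowed_user_arns: list) -> dict:
    """Build a bucket policy granting read access to the named IAM users."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowNamedUsersToReadObjects",
            "Effect": "Allow",
            # Only the principals listed here are granted access.
            "Principal": {"AWS": list(allowed_user_arns)},
            "Action": "s3:GetObject",
            # The trailing /* applies the rule to every object in the bucket.
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }


# Hypothetical bucket and collaborator, for illustration only.
policy = bucket_read_policy(
    "example-lab-bucket",
    ["arn:aws:iam::111122223333:user/example-collaborator"],
)
print(json.dumps(policy, indent=2))
```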
Catered Lunch
Lab Session - Laptops Required
■ R
■ S3
■ EC2
Challenge
R and RStudio Computing
● Accessing code from S3 buckets in an EC2 instance
● Running an AMI-configured EC2 instance with RStudio
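The lab session itself uses R and RStudio; the Python sketch below only illustrates the first bullet, pulling a script from an S3 bucket onto an EC2 instance. It assumes the third-party `boto3` package is installed and that the instance has credentials allowing it to read the bucket. The bucket and file names are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid S3 object URI: {uri!r}")
    return parsed.netloc, key


def download_from_s3(uri: str, local_path: str) -> str:
    """Copy one S3 object to the instance's local disk."""
    # Imported here so the URI helper above works without boto3 installed.
    import boto3

    bucket, key = split_s3_uri(uri)
    boto3.client("s3").download_file(bucket, key, local_path)
    return local_path


# Hypothetical object, for illustration only.
print(split_s3_uri("s3://example-lab-bucket/scripts/analysis.R"))
```

Once the file is on the instance, it can be opened and run from RStudio like any local script.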
Q & A Session
Raj Chary, AWS Solutions Architect
Dana Koch, AWS Senior Account Executive - Education
Fun Activity
● Groups of 3-4
● Simple Questions
● Using Echo
Thanks, Everyone!