© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Adrian Hornsby, Technical Evangelist @ AWS Twitter: @adhorn Email: [email protected] AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch: Simplifying Batch Computing in the Cloud


Page 1: AWS Batch: Simplifying Batch Computing in the Cloud

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Adrian Hornsby, Technical Evangelist @ AWS

Twitter: @adhorn

Email: [email protected]

AWS Batch: Simplifying Batch Computing in the Cloud

Page 2: AWS Batch: Simplifying Batch Computing in the Cloud

• Technical Evangelist, Developer Advocate, … Software Engineer

• My @home is in Finland

• Previously:

• Solutions Architect @AWS

• Lead Cloud Architect @Dreambroker

• Director of Engineering, Software Engineer, DevOps, Manager, ... @Hdm

• Researcher @Nokia Research Center

• and a bunch of other stuff.

• Love climbing and ginger shots.

Page 3: AWS Batch: Simplifying Batch Computing in the Cloud

What to expect from this session

• Batch processing overview

• AWS Batch platform walkthrough

• API overview

• Demo(s)

• Show me the code!

• Usage patterns

Page 4: AWS Batch: Simplifying Batch Computing in the Cloud

What is batch computing?

Page 5: AWS Batch: Simplifying Batch Computing in the Cloud

What is batch computing?

Run jobs asynchronously and automatically across one or more computers.

Jobs may have dependencies, making the sequencing and scheduling of multiple jobs complex and challenging.
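One way to see why sequencing gets hard is to write the dependencies down and let a tool derive a valid run order. The sketch below uses the standard `tsort` utility for that. The four job names are hypothetical and are not taken from the talk.

```shell
# Each input line is "prerequisite dependent": the first job must finish
# before the second may start. Job names are made up for illustration.
deps='extract transform
extract validate
transform load
validate load'

# tsort prints one valid execution order (a topological sort of the graph).
order=$(printf '%s\n' "$deps" | tsort)
printf '%s\n' "$order"
```

With only four jobs the order is easy to see by eye. The difficulty grows once hundreds of jobs with shared prerequisites compete for the same machines.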

Page 6: AWS Batch: Simplifying Batch Computing in the Cloud

Early Batch APIs (19th Century)

• Processing of data stored on decks of punch cards

• Tabulating machine by Herman Hollerith, used for the 1890 United States Census.

• Each card stored a separate record of data with different fields.

• Cards were processed by the machine one by one, all in the same way, as a batch.

Image: IBM Type 285 tabulators (1936) used for batch processing of punch cards (stacked on each machine), with human operators at the U.S. Social Security Administration

Page 7: AWS Batch: Simplifying Batch Computing in the Cloud

Batch in Linux

echo "cc -o foo foo.c" | at 1145 jan 31

Page 8: AWS Batch: Simplifying Batch Computing in the Cloud

Batch in Linux

echo "cc -o foo foo.c" | at 1145 jan 31

> job 1 at Wed Jan 31 11:45:00 2018

Page 9: AWS Batch: Simplifying Batch Computing in the Cloud

Batch in Linux

echo "cc -o foo foo.c" | at 1145 jan 31

> job 1 at Wed Jan 31 11:45:00 2018

$ at 1145 jan 31

at> cc -o foo foo.c

at> ^D

$ atq                  # list pending jobs

$ atrm <job_number>    # remove a pending job

Page 10: AWS Batch: Simplifying Batch Computing in the Cloud

Batch computing today

• In-house compute clusters powered by open source or commercial job schedulers.

• Often composed of a large array of identical, undifferentiated processors, all of the same vintage and built to the same specifications.

Page 11: AWS Batch: Simplifying Batch Computing in the Cloud

Batch computing today …

It’s like trying to fit a square into a circle

Page 12: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch

Overview & Concepts

Page 13: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch in a nutshell

• Fully managed batch primitives

• Focus on your applications (shell scripts, Linux executables, Docker images) and their resource requirements

• We take care of the rest!

Page 14: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch advantages

• Reduces operational complexities

• Saves time

• Reduces costs

Page 15: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch Components

• Jobs

• Job definitions

• Job queues

• Job Scheduler

• Compute environments

Page 16: AWS Batch: Simplifying Batch Computing in the Cloud

Components relation

[Diagram. Labels: Job Definition 1, 2, 3 … n, each with a Container Property; Job 1 and Job 2, linked by "Depends On"; Batch Queue (0), Batch Queue (1), Batch Queue (2), ordered by priority; Batch Compute Environment ** with its Compute Resources.]

** regional service

Page 17: AWS Batch: Simplifying Batch Computing in the Cloud

Jobs

Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2.

Containerized jobs can reference a container image, command, and parameters.

Or, users can fetch a .zip containing their application and run it on an Amazon Linux container.

Page 18: AWS Batch: Simplifying Batch Computing in the Cloud

Submit Job

aws batch submit-job --cli-input-json file://submit_job.json --region us-east-1
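The slide does not show the contents of `submit_job.json`. A minimal sketch of what such a file could look like follows. The field names are those accepted by `submit-job`, but every value (job name, queue, job definition, command, environment variable) is a placeholder, not something from the talk.

```shell
# Write the input file that the submit-job command on the slide reads.
# All values are placeholders chosen for illustration.
cat > submit_job.json <<'EOF'
{
  "jobName": "example-job",
  "jobQueue": "example-queue",
  "jobDefinition": "example-job-def",
  "containerOverrides": {
    "command": ["echo", "hello from AWS Batch"],
    "environment": [
      { "name": "STAGE", "value": "demo" }
    ]
  }
}
EOF

# Then, with AWS credentials configured, run the command from the slide:
#   aws batch submit-job --cli-input-json file://submit_job.json --region us-east-1
```

`containerOverrides` is optional. Without it, the job runs with the command and environment from its job definition.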

Page 19: AWS Batch: Simplifying Batch Computing in the Cloud

Submit Job with dependency

aws batch submit-job --cli-input-json file://submit_job.json --region us-east-1
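A dependency is expressed in the input JSON through the `dependsOn` field, which lists the IDs of jobs that must complete first. The sketch below assumes a first job has already been submitted. The file name `submit_dependent_job.json`, the job names, and the default job ID are placeholders.

```shell
# ID of the job that must finish first. It could be captured from the first
# submission by adding "--query jobId --output text" to that command.
# The default below is only a placeholder.
FIRST_JOB_ID="${FIRST_JOB_ID:-00000000-0000-0000-0000-000000000000}"

# Unquoted heredoc delimiter so that ${FIRST_JOB_ID} is expanded.
cat > submit_dependent_job.json <<EOF
{
  "jobName": "example-step-2",
  "jobQueue": "example-queue",
  "jobDefinition": "example-job-def",
  "dependsOn": [
    { "jobId": "${FIRST_JOB_ID}" }
  ]
}
EOF

# Submit it the same way as any other job:
#   aws batch submit-job --cli-input-json file://submit_dependent_job.json --region us-east-1
```

The dependent job stays in the PENDING state until the job it depends on has completed.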

Page 20: AWS Batch: Simplifying Batch Computing in the Cloud

Job States

Jobs submitted to a queue can have the following states:

SUBMITTED: Accepted into the queue, but not yet evaluated for execution

PENDING: Your job has dependencies on other jobs which have not yet completed

RUNNABLE: Your job has been evaluated by the scheduler and is ready to run

STARTING: Your job is in the process of being scheduled to a compute resource

RUNNING: Your job is currently running

SUCCEEDED: Your job has finished with exit code 0

FAILED: Your job finished with a non-zero exit code, was cancelled or terminated.
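When scripting around these states, the usual question is whether a job is finished or still moving through the pipeline. The helper below is a small sketch of that check. The function name `is_terminal_state` is made up for this example.

```shell
# is_terminal_state STATE
# Exit 0 if the job will not change state again, 1 if it is still in
# progress, 2 if the state name is not one of those listed on the slide.
is_terminal_state() {
  case "$1" in
    SUCCEEDED|FAILED) return 0 ;;
    SUBMITTED|PENDING|RUNNABLE|STARTING|RUNNING) return 1 ;;
    *) echo "unknown job state: $1" >&2; return 2 ;;
  esac
}

# The current state of a job can be read with the AWS CLI
# (needs credentials and a real job ID in $JOB_ID):
#   aws batch describe-jobs --jobs "$JOB_ID" --query 'jobs[0].status' --output text
```

A polling script would call the helper on each result and stop once it returns 0.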

Page 21: AWS Batch: Simplifying Batch Computing in the Cloud

Job Definition

AWS Batch job definitions specify how jobs are to be run.

Some of the attributes specified in a job definition:

• IAM role associated with the job

• vCPU and memory requirements

• Mount points

• Container properties

• Environment variables

• Retry strategy

While each job must reference a job definition, many parameters can be overridden.

Page 22: AWS Batch: Simplifying Batch Computing in the Cloud

Create Job Definition

aws batch register-job-definition --region us-east-1 --cli-input-json file://job_def.json
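The contents of `job_def.json` are not shown on the slide. The sketch below covers the attributes listed on the previous slide (IAM role, vCPU and memory, mount points, container properties, environment variables, retry strategy). It uses the `vcpus` and `memory` fields from the API as it stood when this deck was written. The image, role ARN, account ID, and all names are placeholders.

```shell
# Write a job definition input file. All values are illustrative placeholders.
cat > job_def.json <<'EOF'
{
  "jobDefinitionName": "example-job-def",
  "type": "container",
  "containerProperties": {
    "image": "amazonlinux",
    "vcpus": 1,
    "memory": 512,
    "command": ["echo", "hello"],
    "jobRoleArn": "arn:aws:iam::123456789012:role/example-batch-job-role",
    "environment": [
      { "name": "STAGE", "value": "demo" }
    ],
    "volumes": [
      { "name": "scratch", "host": { "sourcePath": "/tmp" } }
    ],
    "mountPoints": [
      { "sourceVolume": "scratch", "containerPath": "/scratch", "readOnly": false }
    ]
  },
  "retryStrategy": { "attempts": 3 }
}
EOF

# Register it with the command from the slide:
#   aws batch register-job-definition --region us-east-1 --cli-input-json file://job_def.json
```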

Page 23: AWS Batch: Simplifying Batch Computing in the Cloud

Job Queue

Jobs are submitted to a job queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.

Job queues support priorities and multiple queues can schedule work to the same compute environment.

Page 24: AWS Batch: Simplifying Batch Computing in the Cloud

Create Job Queue

aws batch create-job-queue --region us-east-1 --cli-input-json file://job_queue.json
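A possible `job_queue.json` for the command above is sketched below. The queue name, priority value, and compute environment name are placeholders. Among queues that share a compute environment, the one with the higher `priority` integer is evaluated first.

```shell
# Write a job queue input file. Names and the priority value are placeholders.
cat > job_queue.json <<'EOF'
{
  "jobQueueName": "example-queue",
  "state": "ENABLED",
  "priority": 10,
  "computeEnvironmentOrder": [
    { "order": 1, "computeEnvironment": "example-compute-env" }
  ]
}
EOF

# Create the queue with the command from the slide:
#   aws batch create-job-queue --region us-east-1 --cli-input-json file://job_queue.json
```

`computeEnvironmentOrder` can list several compute environments. The scheduler tries them in ascending `order`.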

Page 25: AWS Batch: Simplifying Batch Computing in the Cloud

Job Scheduler

The scheduler evaluates when, where, and how to run jobs that have been submitted to a job queue.

Jobs run in approximately the order in which they are submitted, as long as all dependencies on other jobs have been met.

Page 26: AWS Batch: Simplifying Batch Computing in the Cloud

Compute Environment

Job queues are mapped to one or more compute environments.

Managed compute environments enable you to describe your business requirements (instance types, min/max/desired vCPUs, and Spot Instance bid as a % of the On-Demand price) and we launch and scale resources on your behalf.

You can choose specific instance types or choose “optimal” and AWS Batch launches appropriately sized instances.

Page 27: AWS Batch: Simplifying Batch Computing in the Cloud

Create Compute Environment

aws batch create-compute-environment --cli-input-json file://job_env.json --region us-east-1
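The sketch below shows one way `job_env.json` could describe a managed Spot compute environment using the requirements named on the previous slide: instance types, min/max/desired vCPUs, and a bid percentage. Subnet and security group IDs, role names, the account ID, and the numeric values are all placeholders.

```shell
# Write a managed compute environment input file.
# Every identifier and number below is a placeholder for illustration.
cat > job_env.json <<'EOF'
{
  "computeEnvironmentName": "example-compute-env",
  "type": "MANAGED",
  "state": "ENABLED",
  "computeResources": {
    "type": "SPOT",
    "bidPercentage": 60,
    "minvCpus": 0,
    "maxvCpus": 64,
    "desiredvCpus": 0,
    "instanceTypes": ["optimal"],
    "subnets": ["subnet-00000000"],
    "securityGroupIds": ["sg-00000000"],
    "instanceRole": "ecsInstanceRole",
    "spotIamFleetRole": "arn:aws:iam::123456789012:role/example-spot-fleet-role"
  },
  "serviceRole": "arn:aws:iam::123456789012:role/example-batch-service-role"
}
EOF

# Create the environment with the command from the slide:
#   aws batch create-compute-environment --cli-input-json file://job_env.json --region us-east-1
```

Setting `minvCpus` to 0 lets the environment scale down to no instances when the queues are empty.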

Page 28: AWS Batch: Simplifying Batch Computing in the Cloud

Customer Provided AMIs

Customer Provided AMIs let you set the AMI that is launched as part of a managed compute environment.

This makes it possible to configure Docker settings, mount EBS/EFS volumes, and configure drivers for GPU jobs.

AMIs must be Linux-based, use HVM virtualization, and have a working ECS agent installation.

Page 29: AWS Batch: Simplifying Batch Computing in the Cloud

Resource Limits

Page 30: AWS Batch: Simplifying Batch Computing in the Cloud

Deployment

Page 31: AWS Batch: Simplifying Batch Computing in the Cloud

Pricing

Page 32: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch: Demo

Fetch&Run

Page 33: AWS Batch: Simplifying Batch Computing in the Cloud

Fetch & Run Demo

[Architecture diagram. Labels: Developer, Submit job, Job definition, AWS Batch Queue, AWS Batch Scheduler, AWS Batch Compute Env., AWS Batch execution Container, Fetch Script, Amazon S3, Amazon DynamoDB, IAM Role (Read/Write).]
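The idea behind Fetch & Run is that the container image stays generic: at start-up it downloads a job script from Amazon S3 and executes it. The sketch below illustrates that pattern only and is not the demo's actual script. The function names `fetch_script` and `fetch_and_run` are made up, and the download step assumes the AWS CLI is installed in the container.

```shell
# fetch_script SRC DEST
# Download the job script. Assumption: SRC is an s3:// URL and the container's
# IAM role allows reading it.
fetch_script() {
  aws s3 cp "$1" "$2"
}

# fetch_and_run SRC [ARGS...]
# Fetch the script to a temporary file, run it with the remaining arguments,
# clean up, and return the script's exit code so AWS Batch can mark the job
# SUCCEEDED or FAILED.
fetch_and_run() {
  src="$1"; shift
  tmp=$(mktemp) || return 1
  if ! fetch_script "$src" "$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  rc=0
  sh "$tmp" "$@" || rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Hypothetical use as a container entrypoint:
#   fetch_and_run "s3://example-bucket/myjob.sh" arg1 arg2
```

Keeping the download in its own function makes the pattern easy to try locally by replacing `fetch_script` with a plain file copy.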

Page 34: AWS Batch: Simplifying Batch Computing in the Cloud

Show me the code!

Page 35: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch: Typical Use cases

Page 36: AWS Batch: Simplifying Batch Computing in the Cloud

AWS Batch Use Cases

High Performance Computing

Post-Trade Analytics

Fraud Surveillance

Drug Screening

DNA Sequencing

Rendering

Transcoding

Media Supply Chain

Page 37: AWS Batch: Simplifying Batch Computing in the Cloud

Financial Services: Automate the analysis of the day’s transactions for fraud surveillance.

Page 38: AWS Batch: Simplifying Batch Computing in the Cloud

Life Sciences: Drug Screening for Biopharma

Rapidly search libraries of small molecules for drug discovery.

Page 39: AWS Batch: Simplifying Batch Computing in the Cloud

Digital Media: Visual Effects Rendering

Automate content rendering workloads and reduce the need for human intervention due to execution dependencies or resource scheduling.

Page 40: AWS Batch: Simplifying Batch Computing in the Cloud

Thank you!

Twitter: @adhorn

Email: [email protected]