© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Andrew Spyker (@aspyker) 12/1/2016 Container Scheduling, Execution and AWS Integration

Re:invent 2016 Container Scheduling, Execution and AWS Integration


Page 1: Re:invent 2016 Container Scheduling, Execution and AWS Integration

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Andrew Spyker (@aspyker)

12/1/2016

Container Scheduling, Execution and AWS Integration

Page 2: Re:invent 2016 Container Scheduling, Execution and AWS Integration

What to Expect from the Session

• Why containers?
  • Including current use cases and scale

• How did we get there?
  • Overview of our container cloud platform

• Collaboration with ECS

Page 3: Re:invent 2016 Container Scheduling, Execution and AWS Integration

About Netflix

• 86.7M members
• 1000+ developers
• 190+ countries
• > ⅓ NA internet download traffic
• 500+ Microservices
• Over 100,000 VM’s
• 3 regions across the world

Page 4: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Why containers?

Given our VM architecture comprised of …

amazingly resilient, microservice driven, cloud native, CI/CD devops enabled, elastically scalable

do we really need containers?

Page 5: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Our Container System Provides Innovation Velocity

• Iterative local development, deploy when ready

• Manage app and dependencies easily and completely

• Simpler way to express resources, let system manage

Page 6: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Innovation Velocity - Use Cases

• Media Encoding - encoding research development time
  • Using VM’s - 1 month, using containers - 1 week

• Niagara
  • Build all Netflix codebases in hours
  • Saves development 100’s of hours of debugging

• Edge Rearchitecture with NodeJS
  • Focus returns to app development
  • Simplifies, speeds test and deployment

Page 7: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Why not use existing container mgmt solution?

• Most solutions are focused on the datacenter

• Most solutions are
  • Working to abstract datacenter and cross-cloud
  • Delivering more than cluster manager
  • Not yet at our level of scale

• Wanted to leverage our existing cloud platform

• Not appropriate for Netflix

Page 8: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Batch

Page 9: Re:invent 2016 Container Scheduling, Execution and AWS Integration

What do batch users want?

• Simple shared resources, run till done, job files

• NOT
  • EC2 Instance sizes, autoscaling, AMI OS’s

• WHY
  • Offloads resource management ops, simpler

Page 10: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Historic use of containers

• General Workflow (Meson), Stream Processing (Mantis)

• Proven using cgroups and Mesos

• With simple isolation

• Using specific packaging formats

Linux cgroups

Page 11: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Enter Titus

Job Management

Batch

Resource Management & Optimization

Container Execution

Integration

Page 12: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Sample batch use cases

• Algorithm Model Training

Page 13: Re:invent 2016 Container Scheduling, Execution and AWS Integration

GPU usage

• Personalization and recommendation
  • Deep learning with neural nets/mini batch

• Titus
  • Added g2 support using nvidia-docker-plugin
  • Mounts nvidia drivers and devices into Docker container
  • Distribution of training jobs and infrastructure made self service

• Recently moved to p2.8xl instances
  • 2X performance improvement with same CUDA based code

Page 14: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Sample batch use cases

• Media Encoding Experimentation

• Digital Watermarking

Page 15: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Sample batch use cases

Ad hoc Reporting

Open Connect CDN Reporting

Page 16: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Lessons learned from batch

• Docker helped generalize use cases
• Cluster autoscaling adds efficiency
• Advanced scheduling required
• Initially ignored failures (with retries)
• Time sensitive batch came later

Page 17: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Titus Batch Usage (Week of 11/7)

• Started ~ 300,000 containers during the week
• Peak of 1000 containers per minute
• Peak of 3,000 instances (mix of r3.8xls and m4.4xls)

Page 18: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Services

Page 19: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Adding Services to Titus

Job Management

Batch

Resource Management & Optimization

Container Execution

Integration

Service

Page 20: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Services are just long running batch, right?

Page 21: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Services more complex

Services resize constantly and run forever
• Autoscaling
• Hard to upgrade underlying hosts

Have more state
• Ready for traffic vs. just started/stopped
• Even harder to upgrade

Existing well defined dev, deploy, runtime & ops tools

Page 22: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Real Networking is Hard

Page 23: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Multi-Tenant Networking is Hard

• IP per container
• Security group support
• IAM role support
• Network bandwidth isolation

Page 24: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Solutions

• VPC Networking driver
  • Supports ENI’s - full IP functionality
  • With scheduling - security groups
  • Support traffic control (isolation)

• EC2 Metadata proxy
  • Adds container “node” identity
  • Delivers IAM roles

Page 25: Re:invent 2016 Container Scheduling, Execution and AWS Integration

VPC Networking Integration with Docker

Titus Executor

Titus Networking Driver

- Create and attach ENI with
  - security group
  - IP address

create net namespace

Page 26: Re:invent 2016 Container Scheduling, Execution and AWS Integration

VPC Networking Integration with Docker

Titus Executor

Titus Networking Driver

- Launch “pod root” container with
  - IP address
  - Using “pause” container
  - Using net=none

Pod Root Container

Docker

create net namespace

Page 27: Re:invent 2016 Container Scheduling, Execution and AWS Integration

VPC Networking Integration with Docker

Titus Executor

Titus Networking Driver

- Create virtual ethernet
- Configure routing rules
- Configure metadata proxy iptables NAT
- Configure traffic control for bandwidth

pod_root_id

Pod Root Container

Page 28: Re:invent 2016 Container Scheduling, Execution and AWS Integration

VPC Networking Integration with Docker

Titus Executor

Pod Root Container (pod_root_id)

Docker

App Container

create container with --net=container:pod_root_id
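The two Docker steps on the preceding slides (start a "pod root" with net=none, then attach the app container with --net=container:pod_root_id) can be sketched as command lines. This is only an illustration, not the Titus executor: the image name `example/pause` and both function names are hypothetical, and the ENI, veth, routing and traffic control work done by the networking driver between the two steps is not shown. The commands are built as argument lists and not executed.

```python
from typing import List

# Hypothetical image name for the "pause" container; the slides do not name one.
PAUSE_IMAGE = "example/pause"


def pod_root_command(name: str) -> List[str]:
    """Build the docker command that starts a pod root container.

    --net=none asks Docker for a network namespace with no Docker-managed
    interfaces, so a separate networking driver can wire it up afterwards.
    """
    return ["docker", "run", "-d", "--name", name, "--net=none", PAUSE_IMAGE]


def app_container_command(pod_root_id: str, image: str) -> List[str]:
    """Build the docker command that creates the app container.

    --net=container:<id> makes the app container share the pod root's
    network namespace, and therefore its IP address.
    """
    return ["docker", "create", f"--net=container:{pod_root_id}", image]


if __name__ == "__main__":
    print(" ".join(pod_root_command("pod-root-1")))
    print(" ".join(app_container_command("pod-root-1", "example/app:latest")))
```

Keeping the network namespace in a separate, long-lived pod root means the networking setup does not depend on the app container's lifecycle.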

Page 29: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Metadata Proxy

container

Amazon Metadata Service

(169.254.169.254)

Titus Metadata Proxy

What is my IP, instanceid, hostname?
- Return Titus assigned

What is my ami, instance type, etc.
- Unknown

Give me my role credentials
- Assume role to container role, return credentials

Give me anything else
- Proxy

veth<id>

169.254.169.254:80

host_ip:9999

iptables/NAT
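The four request classes on this slide amount to a small dispatch table, sketched below. The metadata paths are standard EC2 instance metadata paths, but which exact paths Titus overrides is an assumption, as are the function names, the `identity` dictionary and the choice of the PREROUTING chain for the NAT rule. The proxy port 9999 comes from the slide. The iptables command is only built as an argument list, not run.

```python
from typing import Dict, List, Optional, Tuple

METADATA_IP = "169.254.169.254"

# Paths answered with Titus-assigned values (assumed mapping).
TITUS_ASSIGNED = {
    "/latest/meta-data/local-ipv4": "ip",
    "/latest/meta-data/instance-id": "instance_id",
    "/latest/meta-data/hostname": "hostname",
}
# Host properties that are not meaningful for a container (assumed set).
HOST_ONLY = {"/latest/meta-data/ami-id", "/latest/meta-data/instance-type"}
CREDENTIALS_PREFIX = "/latest/meta-data/iam/security-credentials/"


def route(path: str, identity: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Classify a metadata request into one of the four actions on the slide."""
    if path in TITUS_ASSIGNED:
        return ("titus", identity[TITUS_ASSIGNED[path]])
    if path in HOST_ONLY:
        return ("unknown", None)
    if path.startswith(CREDENTIALS_PREFIX):
        # The real proxy would assume this role and return its credentials.
        return ("assume_role", identity["iam_role"])
    return ("proxy", path)


def nat_rule(veth: str, host_ip: str, port: int = 9999) -> List[str]:
    """Build an iptables DNAT rule sending metadata traffic to the proxy."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-i", veth, "-p", "tcp", "-d", METADATA_IP, "--dport", "80",
        "-j", "DNAT", "--to-destination", f"{host_ip}:{port}",
    ]


if __name__ == "__main__":
    me = {"ip": "10.0.0.7", "instance_id": "titus-task-1",
          "hostname": "task-1", "iam_role": "example-role"}
    print(route("/latest/meta-data/local-ipv4", me))
    print(" ".join(nat_rule("veth123", "10.0.0.5")))
```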

Page 30: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Putting it all together

[Diagram: a virtual machine host with three ENIs (ENI1 sg=A, ENI2 sg=X, ENI3 sg=Y,Z) carrying a non-routable IP plus IP1, IP2 and IP3. Container 1 to Container 4 each consist of an app container and a pod root, connected to the host through a veth<id>; the containers are labelled sg=X, sg=X, sg=Y,Z and "Nonroutable IP, sg=A". Linux Policy Based Routing + Traffic Control sits between the veths and the ENIs, and traffic to 169.254.169.254 is NATed to the metadata proxy.]

Page 31: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Additional AWS Integrations

• Live and rotated to S3 log file access
• Multi-tenant resource isolation (disk)
• Environmental context
• Automatic instance type selection
• Elastic scaling of underlying resource pool

Page 32: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Netflix Infrastructure Integration

• Spinnaker CI/CD
• Atlas telemetry
• Discovery/IPC
• Edda (and dependent systems)
• Healthcheck, system metrics pollers
• Chaos testing

Page 33: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Why? Single consistent cloud platform

[Diagram labels: AWS (VPC, EC2); Virtual Machines with AWS Autoscaler running Service Applications on Cloud Platform Libraries (metrics, IPC, health); Titus Job Control running Containers on VM’s for Service Applications (Cloud Platform Libraries: metrics, IPC, health) and Batch Applications (Cloud Platform Libraries: metrics, IPC); Edda, Eureka, Atlas]

Page 34: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Titus Spinnaker Integration

Page 35: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Deploy Based On New Docker Registry Tags

Page 36: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Deployment Strategies Same as ASG’s

IAM Roles and Sec Groups Per Container

Basic Resource Requirements

Page 37: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Easily See Healthcheck & Service Discovery Status

Page 38: Re:invent 2016 Container Scheduling, Execution and AWS Integration
Page 39: Re:invent 2016 Container Scheduling, Execution and AWS Integration
Page 40: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Fenzo – The heart of Titus scheduling

Extensible Library for Scheduling Frameworks

• Plugins based scheduling objectives
  • Bin packing, etc.

• Heterogeneous resources & tasks

• Cluster autoscaling
  • Multiple instance types

• Plugins based constraints evaluator
  • Resource affinity, task locality, etc.

• Single offer mode added in support of ECS

Page 41: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Fenzo scheduling strategy

For each task

  On each host
    Validate hard constraints
    Eval fitness and soft constraints

  Until fitness “good enough”, and
    A minimum #hosts evaluated

Plugins
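The loop on this slide can be sketched in a few lines. This is not Fenzo's API (Fenzo is a Java library). All names below are hypothetical, soft constraints are assumed to be folded into the fitness score, and "hosts evaluated" is assumed to count only hosts that pass the hard constraints. The hard constraint and the fitness function are passed in, mirroring the plugin idea.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Host:
    name: str
    total_cpus: float
    used_cpus: float


@dataclass
class Task:
    name: str
    cpus: float


HardConstraint = Callable[[Task, Host], bool]
Fitness = Callable[[Task, Host], float]  # 0.0 (poor) to 1.0 (perfect)


def has_capacity(task: Task, host: Host) -> bool:
    """Hard constraint: the task must fit in the host's free CPUs."""
    return host.used_cpus + task.cpus <= host.total_cpus


def bin_packing_fitness(task: Task, host: Host) -> float:
    """Fitness plugin: prefer the host that ends up fullest."""
    return (host.used_cpus + task.cpus) / host.total_cpus


def pick_host(task: Task, hosts: Sequence[Host],
              hard_constraints: Sequence[HardConstraint], fitness: Fitness,
              good_enough: float = 0.9, min_hosts: int = 2) -> Optional[Host]:
    """Return the best host seen, stopping early once the score is good enough."""
    best: Optional[Host] = None
    best_score = -1.0
    evaluated = 0
    for host in hosts:
        if not all(check(task, host) for check in hard_constraints):
            continue  # host is ineligible, skip scoring
        evaluated += 1
        score = fitness(task, host)
        if score > best_score:
            best, best_score = host, score
        if best_score >= good_enough and evaluated >= min_hosts:
            break  # good enough, and enough hosts were looked at
    return best


if __name__ == "__main__":
    hosts = [Host("a", 8, 0), Host("b", 8, 6), Host("c", 8, 7)]
    chosen = pick_host(Task("t", 2), hosts, [has_capacity], bin_packing_fitness)
    print(chosen.name if chosen else None)
```

Stopping at "good enough" trades placement quality for scheduling speed, which matters when many tasks and hosts are evaluated per cycle.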

Page 42: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Scheduling – Capacity Guarantees

Titus maintains …

Critical tier
• guaranteed capacity & start latencies

Flex tier
• more dynamic capacity & variable start latency

[Diagram labels: Desired, Max; Titus Master, Scheduler, Fenzo]

Page 43: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Scheduling – Bin Packing, Elastic Scaling

User adds work tasks

• Titus does bin packing to ensure that we can downscale entire hosts efficiently

[Diagram labels: Min, Desired, Max; ✖ ✖ ✖ ✖ Can terminate; Titus Master, Scheduler, Fenzo]
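A small sketch of why bin packing helps downscaling: if each task goes to the fullest host that still has room, the remaining hosts stay empty and can be terminated. The best-fit rule, the CPU-only resource model and the function names are assumptions made for illustration. The slides do not describe the actual packing algorithm.

```python
from typing import List


def pack(task_cpus: List[int], host_capacity: int, host_count: int) -> List[int]:
    """Best-fit packing: return the CPUs used on each host after placement."""
    used = [0] * host_count
    for cpus in sorted(task_cpus, reverse=True):  # place large tasks first
        candidates = [i for i in range(host_count)
                      if used[i] + cpus <= host_capacity]
        if not candidates:
            raise ValueError("not enough capacity for task")
        # Pick the fullest host that can still take the task.
        target = max(candidates, key=lambda i: used[i])
        used[target] += cpus
    return used


def terminable(used: List[int]) -> List[int]:
    """Indexes of hosts with no work, which an autoscaler could terminate."""
    return [i for i, cpus in enumerate(used) if cpus == 0]


if __name__ == "__main__":
    usage = pack([4, 4, 2, 2, 2, 2], host_capacity=8, host_count=4)
    print(usage, terminable(usage))
```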

Page 44: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Scheduling – Constraints including AZ Balancing

User specifies constraints

• AZ Balancing
• Resource and Task affinity
• Hard and soft

[Diagram labels: Availability Zone A, Availability Zone B; Min, Desired; Titus Master, Scheduler, Fenzo]
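AZ balancing can be illustrated with a simple rule: place each new task in the zone that currently runs the fewest tasks of the job. This is only a sketch of the idea. The zone names and functions are hypothetical, and the real constraint may be hard or soft and weighed against other objectives.

```python
from typing import Dict, List


def pick_zone(tasks_per_zone: Dict[str, int]) -> str:
    """Return the least-loaded zone; ties go to the alphabetically first zone."""
    return min(sorted(tasks_per_zone), key=lambda zone: tasks_per_zone[zone])


def place(num_tasks: int, zones: List[str]) -> Dict[str, int]:
    """Place tasks one at a time, always into the least-loaded zone."""
    counts = {zone: 0 for zone in zones}
    for _ in range(num_tasks):
        counts[pick_zone(counts)] += 1
    return counts


if __name__ == "__main__":
    print(place(5, ["zone-a", "zone-b"]))
```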

Page 45: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Scheduling – Rolling new Titus code

Operator updates Titus agent codebase

• New scheduling on new cluster
• Batch jobs drain
• Service tasks are migrated via Spinnaker pipelines
• Old cluster autoscales down

[Diagram labels: ASG version 001, ASG version 002; Min, Desired; ✖ ✖; Titus Master, Scheduler, Fenzo]

Page 46: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Current Service Usage

• Approach
  • Started with internal applications
  • Moved on to line-of-fire NodeJS (shadow first, prod 1Q17)
  • Moved on to stream processing (prod 4Q)

• Current - ~ 2000 long running containers

[Timeline labels, in slide order: 1Q, Batch, 2Q, Service pre-prod, 3Q, Service shadow, Service Prod, 4Q]

Page 47: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Collaboration with ECS

Page 48: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Why ECS?

• Decrease operational overhead of underlying cluster state management

• Allow open source collaboration on ECS Agent

• Work with Amazon and others on EC2 enablement
  • GPUs, VPC, Sec Groups, IAM Roles, etc.
  • Over time this enablement should result in less maintenance

Page 49: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Titus Today

Container Host

mesos-agent

Titus executor

container container container

Mesos master

Titus Scheduler

EC2 Integration

Outbound
- Launch/Terminate Container
- Reconciliation

Inbound
- Container Host Events (and offers)
- Container Events

Page 50: Re:invent 2016 Container Scheduling, Execution and AWS Integration

First Titus ECS Implementation

Container Host

ECS agent

Titus executor

container container container

ECS

Titus Scheduler

EC2 integration

Outbound
- Launch/Terminate Container
- Polling for
  - Container Host Events
  - Container Events

Page 51: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Collaboration with ECS team starts

• Collaboration on ECS “event stream” that could provide
  • “Real time” task & container instance state changes
  • Event based architecture more scalable than polling

• Great engineering collaboration
  • Face to face focus
  • Monthly interlocks
  • Engineer to engineer focused

Page 52: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Current Titus ECS Implementation

Container Host

ECS agent

Titus executor

container container container

ECS

Titus Scheduler

EC2 Integration

Outbound
- Launch/Terminate Container
- Reconciliation

Inbound
- Container Host Events
- Container Events

Cloud Watch Events

SQS
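On the inbound side of this design, the scheduler consumes state changes delivered through CloudWatch Events and SQS instead of polling. The sketch below only parses one message body into a small tuple. The `detail-type` strings and field names follow the ECS event format as I understand it and should be checked against the AWS documentation. The function name and return shape are hypothetical, and receiving from SQS is left out.

```python
import json
from typing import Optional, Tuple


def parse_ecs_event(message_body: str) -> Optional[Tuple[str, str, str]]:
    """Map an ECS event (JSON text) to (kind, resource ARN, status).

    Returns None for event types this sketch does not handle.
    """
    event = json.loads(message_body)
    detail = event.get("detail", {})
    kind = event.get("detail-type")
    if kind == "ECS Task State Change":
        return ("task", detail["taskArn"], detail["lastStatus"])
    if kind == "ECS Container Instance State Change":
        return ("container_instance",
                detail["containerInstanceArn"], detail["status"])
    return None


if __name__ == "__main__":
    body = json.dumps({
        "source": "aws.ecs",
        "detail-type": "ECS Task State Change",
        "detail": {"taskArn": "arn:aws:ecs:example:task/1",
                   "lastStatus": "RUNNING"},
    })
    print(parse_ecs_event(body))
```

Periodic reconciliation, listed under Outbound, would still be needed to recover from missed or reordered events.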

Page 53: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Analysis - Periodic Reconciliation

For tasks in listTasks
  describeTasks (batches of 100)

Number of API calls: 1 + num tasks / 100 per reconcile

1280 containers across 40 nodes
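Applying the slide's formula to the 1280-container cluster is a one-line calculation. Rounding the division up (a partial batch still costs one describeTasks call) is my assumption, since the slide writes the division without rounding. The function name is hypothetical.

```python
import math


def reconcile_api_calls(num_tasks: int, batch_size: int = 100) -> int:
    """One listTasks call plus one describeTasks call per batch of tasks."""
    return 1 + math.ceil(num_tasks / batch_size)


if __name__ == "__main__":
    # 1280 tasks need 13 describeTasks batches, plus 1 listTasks call.
    print(reconcile_api_calls(1280))
```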

Page 54: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Analysis - Scheduling

• Number of API calls: 2X number of tasks
  • registerTaskDefinition and startTask

• Largest Titus historical job
  • 1000 tasks per minute
  • Possible with increased rate limits
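The same kind of arithmetic shows what the largest historical job implies for API rate. Two calls per task at 1000 tasks per minute is 2000 calls per minute, or roughly 33 calls per second. The function names are hypothetical and the per-second figure assumes an even spread over the minute.

```python
def scheduling_api_calls(num_tasks: int) -> int:
    """registerTaskDefinition plus startTask for every task."""
    return 2 * num_tasks


def calls_per_second(tasks_per_minute: int) -> float:
    """Average API call rate if launches are spread evenly over a minute."""
    return scheduling_api_calls(tasks_per_minute) / 60


if __name__ == "__main__":
    print(scheduling_api_calls(1000))
    print(round(calls_per_second(1000), 1))
```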

Page 55: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Continued areas of scheduling collaboration

• Combining/batching registerTaskDefinition and startTask

• More resource types in the control plane
  • Disk, Network Bandwidth, ENI’s

• To fit with existing scheduler approach
  • Extensible message fields in task state transitions
  • Named tasks (beyond ARN’s) for terminate
  • Starting vs. Started state

Page 56: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Possible phases of ECS support in Titus

• Work in progress
  • ECS completing scheduling collaboration items
  • Complete transition to ECS for overall cluster manager
  • Allows us to contribute to ECS agent open source Netflix cloud platform and EC2 integration points

• Future
  • Provide Fenzo as the ECS task placement service
  • Extend Titus Job Management features to ECS

Page 57: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Titus Future Focus

Page 58: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Future Strategy of Titus

• Service Autoscaling and global traffic integration
• Service/Batch SLA management
  • Capacity guarantees, fair shares and pre-emption
• Trough / Internal spot market management
• Exposing pods to users
• More use cases and scale

Page 59: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Questions?

Andrew Spyker (@aspyker)

Page 60: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Thank you!

Page 61: Re:invent 2016 Container Scheduling, Execution and AWS Integration

Remember to complete your evaluations!