© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Sathiya Shunmugasundaram @ Capital One
Gnani Dathathreya @ Capital One
December 2016
Operations Automation and
Infrastructure Management with Amazon ECS
CON311
What to Expect from the Session
• Microservices, Docker, and the Amazon ECS journey
• Container stack evolution
• Container stack operations automation
• Infrastructure creation automation
• AMI update automation
• Blue/green deployment automation
• Canary deployment automation
• Lessons and looking forward
We use Docker and ECS-based container technologies to advance
microservices adoption and increase the efficiency of cloud resources:
• Microservices architecture: We embraced a microservices architecture
for our cloud applications, and this is driving adoption of Docker
container technology.
• Federated operating model: Ours is a federated organization with a
"You Build, You Own" operating model that gives delivery teams autonomy
and speed.
• Self-service automation tools: We developed self-service container
management automation tools based on ECS to accelerate application
delivery by the federated teams.
Amazon ECS is the most adopted container management solution
in Capital One
• ECS and Docker implementations at Capital One include Credit Card servicing, Auto
Loan Servicing, and Enterprise Open Source office applications
• We run microservices, event-driven applications, batch applications, real-time APIs
and web applications using ECS and Docker solutions
• ECS is adopted in multiple lines of business for both internal and customer
applications
• ECS simplified the containerization journey in Capital One
• We leverage ECS’s integration with CloudWatch, IAM and other native services for
seamless integration with operations
• With ECS and our automation tooling, Docker apps can be deployed with a production
hardened container stack in minutes
Container stacks are integrated with Enterprise DevOps tools
providing an end-to-end automation solution for containerized
microservices.
[Diagram: delivery pipeline from developers to clients. Components shown:
SCM, Build, Code/Binary Repo, Docker Image Repo, Cluster (Scheduler,
Cluster manager, Service Discovery, Software LB), ELB, API Gateway.]
Container management solution components
Capability: Solution
• SCM: GitHub Enterprise
• Build: Jenkins
• Repos: Nexus, Docker registry
• Compute cluster: EC2 instances
• Cluster manager, container scheduler: ECS
• Dynamic service discovery: Consul, target group
• Load balancer: Nginx, Application Load Balancer
• Load balancer: Elastic Load Balancing
• API gateway: API Gateway
Our container stack evolved along with the ECS and ELB
advancement.
• Stack 1 (simple stack, mutual SSL): ECS + Classic load balancer
• Stack 2 (high density packing, mutual SSL): ECS + Classic load
balancer + Consul + Registrator + Nginx
• Stack 3 (simple stack, high density packing): ECS + Application load
balancer
ECS and Classic Load Balancer is a simpler solution for running
containers; however, fixed host port mapping limits each service to
one task per ECS instance.
[Diagram for stack 1: an ECS cluster in an Auto Scaling group across
Availability Zones A and B, with one ECS instance per zone. Service 1
and Service 2 each have their own ELB. The instances run SV1Task1,
SV1Task2, SV2Task1, and SV2Task2; two further positions are marked X.]
• The ELB's fixed listener port allows only one task per service on
each ECS instance.
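The port constraint described above can be illustrated with a small, self-contained simulation. This is only a sketch: the function name `place_with_fixed_port`, the instance IDs, and the port numbers are hypothetical and do not come from the talk.

```python
def place_with_fixed_port(instance_ids, used_ports, host_port, desired_count):
    """Place tasks that all bind the same host port.

    Because the host port is fixed, an instance can hold at most one
    task of the service. Returns the instances that received a task.
    """
    placed = []
    for instance_id in instance_ids:
        if len(placed) == desired_count:
            break
        ports = used_ports.setdefault(instance_id, set())
        if host_port in ports:
            continue  # port already bound on this instance
        ports.add(host_port)
        placed.append(instance_id)
    return placed


used = {}  # instance id -> set of host ports already bound
sv1 = place_with_fixed_port(["i-a", "i-b"], used, 8080, desired_count=3)
sv2 = place_with_fixed_port(["i-a", "i-b"], used, 9090, desired_count=2)
print(sv1)  # only two of the three requested SV1 tasks fit
print(sv2)
```

With two instances, a third task of the same service has nowhere to go, so scaling a service means adding instances.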
ECS, Classic Load Balancer, Consul, Nginx, and Registrator
solution provides dynamic service discovery and load balancing;
however, it involves managing several components.
[Diagram for stack 2: an ECS cluster in an Auto Scaling group across
Availability Zones A and B. Each ECS instance runs Nginx, a Consul
agent, Registrator, and Consul templates alongside tasks of services
SV1 and SV2, behind a single services ELB. A separate Consul cluster
runs one instance in each of Availability Zones A, B, and C, in an
Auto Scaling group behind a Consul ELB.]
• The ELB's fixed listener port is mapped to the Nginx running on
each ECS instance, so one ELB can serve multiple services.
• The Nginx configuration routes service requests to the appropriate
service containers/tasks.
• Registrator, Consul, and Consul templates dynamically discover
containers/tasks and configure Nginx.
• The Consul cluster provides dynamic service discovery.
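To make the discovery-to-Nginx step concrete, here is a minimal sketch of turning a discovered service catalog into an Nginx configuration. It is not Consul template syntax and not the presenters' configuration: the function `render_nginx_config`, the catalog shape, the listen port 80, and the path-prefix routing are all assumptions for illustration.

```python
def render_nginx_config(services):
    """Render upstream and location blocks from discovered endpoints.

    services maps a service name to a list of (host, port) pairs, as a
    discovery catalog might report them. Services with no endpoints
    are skipped, since an empty upstream block is not valid.
    """
    live = {name: eps for name, eps in services.items() if eps}
    lines = []
    for name in sorted(live):
        lines.append(f"upstream {name} {{")
        for host, port in sorted(live[name]):
            lines.append(f"    server {host}:{port};")
        lines.append("}")
    lines.append("server {")
    lines.append("    listen 80;")  # assumed listener port
    for name in sorted(live):
        lines.append(f"    location /{name}/ {{")
        lines.append(f"        proxy_pass http://{name};")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


catalog = {
    "sv1": [("10.0.0.1", 32768), ("10.0.0.2", 32770)],
    "sv2": [("10.0.0.1", 32769)],
    "sv3": [],
}
config = render_nginx_config(catalog)
print(config)
```

Each time a container starts or stops, the configuration has to be regenerated and Nginx reloaded, which is part of the operational overhead this stack carries.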
ECS and Application Load Balancer solution provides a simpler,
more efficient solution with dynamic service discovery and load
balancing capabilities.
[Diagram for stack 3: an ECS cluster in an Auto Scaling group across
Availability Zones A and B behind one Application Load Balancer. One
ECS instance runs SV1Task1, SV2Task2, and SV1Task3; the other runs
SV1Task2, SV2Task1, and SV2Task3.]
• For each service, a target group is created with a routing rule.
Service containers with dynamic ports are added to the target group.
• The Application Load Balancer routes requests to service containers
based on routing rules and target groups.
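The relationship between routing rules, target groups, and dynamically assigned ports can be modeled in a few lines. This is an in-memory sketch, not an AWS API call: the class names, path patterns, and port numbers are hypothetical, and path-based rules with round-robin selection are assumed for the illustration.

```python
import fnmatch


class TargetGroup:
    """Targets are (instance_id, host_port) pairs with dynamic ports."""

    def __init__(self, name):
        self.name = name
        self.targets = []
        self._next = 0

    def register(self, instance_id, host_port):
        self.targets.append((instance_id, host_port))

    def pick(self):
        # Round-robin over the registered targets.
        target = self.targets[self._next % len(self.targets)]
        self._next += 1
        return target


class LoadBalancer:
    def __init__(self):
        self.rules = []  # (path_pattern, target_group); first match wins

    def add_rule(self, path_pattern, target_group):
        self.rules.append((path_pattern, target_group))

    def route(self, path):
        for pattern, group in self.rules:
            if fnmatch.fnmatchcase(path, pattern):
                return group.pick()
        raise LookupError(f"no rule matches {path}")


sv1 = TargetGroup("sv1")
sv1.register("i-a", 32768)
sv1.register("i-a", 32769)  # second SV1 task on the same instance
sv1.register("i-b", 32768)

alb = LoadBalancer()
alb.add_rule("/sv1/*", sv1)
first = alb.route("/sv1/orders")
second = alb.route("/sv1/orders")
print(first, second)
```

Because each task gets its own host port, two tasks of the same service can share an instance, which is what enables high density packing without the extra components of stack 2.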
Infrastructure automation lets developers focus on application
development and spend less time on infrastructure coding:
• Lambda functions and Jenkins jobs for container stack creation and termination
• Blue/green and canary deployment automation tools
• AMI update automation
• Container health checks, alerts, and actions
• Integration with enterprise logging solution
• Monitoring solution with CloudWatch
• JVM stats monitoring with CloudWatch
• Automatic scaling of ECS containers
• Automatic scaling of ECS Instances
• Test apparatus self-service tool for performance testing
Automation tooling provides a consistent and repeatable way for
users to create container stacks without writing a single line of
infrastructure code
• Users provide parameters for container stack creation, such as
subnets and security groups.
• The S3 put event for the parameters triggers Lambda.
• Lambda executes Terraform with the parameters for infrastructure
creation (Lambda + Terraform).
• The container stack is created in the VPC (virtual private cloud).
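The trigger flow above could look roughly like the following sketch. The talk does not show how Terraform is packaged or invoked inside Lambda, so this only locates the parameters object from the S3 event and builds a Terraform command line; the function names, bucket name, and object key are hypothetical.

```python
import urllib.parse


def parameters_object(event):
    """Return (bucket, key) of the parameters file from an S3 put event."""
    s3 = event["Records"][0]["s3"]
    bucket = s3["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(s3["object"]["key"])
    return bucket, key


def terraform_command(params):
    """Build a non-interactive `terraform apply` with one -var per parameter."""
    command = ["terraform", "apply", "-input=false", "-auto-approve"]
    for name in sorted(params):
        command += ["-var", f"{name}={params[name]}"]
    return command


event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-params-bucket"},
                "object": {"key": "stacks/my+app.tfvars"},
            }
        }
    ]
}
location = parameters_object(event)
command = terraform_command({"asg_min": "3", "asg_max": "9"})
print(location)
print(command)
```

A real handler would download the object, parse it, and run the command (for example with `subprocess.run`); those steps are left out here because they depend on packaging choices the slides do not describe.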
Users provide information like subnets, security groups, metrics,
alarms, and alerts as parameters for a container stack creation tool.
instance_type="m3.medium"
server_subnets="subnet-ab12,subnet-ab12"
ecs_sg="sg-sg1234"
asg_min="3"
asg_max="9"
asg_desired="6"
sns_topic="my-alerts"
scalein_adjustment="-1"
scaleout_adjustment="1"
scalein_cooldown="300"
scaleout_cooldown="300"
scaleout_alarm_cpu_interval_secs="900"
scaleout_cpu_percent="80"
scalein_cpu_percent="40"
custom_script_location="my_s3_bucket"
custom_script_name="custom-script"
X509_cert_location="my_s3_bucket"
X509_cert_files="cert1.cer,cert2.cer"
ecs_cluster_name="my-app-cluster"
iam_role="my_app_Iam_role"
docker_registry="my-docker-registry"
proxy_server="my-co.proxycom"
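A parameter file in this name="value" form is easy to read and sanity-check before any infrastructure is created. The sketch below is an assumption about how such a check might be written, not the presenters' tool: `parse_parameters` and `validate_scaling` are hypothetical names, and the two validation rules are illustrative choices.

```python
def parse_parameters(text):
    """Parse name="value" lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got: {line!r}")
        # Tolerate straight and typographic double quotes around values.
        params[name.strip()] = value.strip().strip('"\u201c\u201d')
    return params


def validate_scaling(params):
    """Check Auto Scaling bounds and CPU thresholds for consistency."""
    low = int(params["asg_min"])
    desired = int(params["asg_desired"])
    high = int(params["asg_max"])
    if not low <= desired <= high:
        raise ValueError("need asg_min <= asg_desired <= asg_max")
    if int(params["scalein_cpu_percent"]) >= int(params["scaleout_cpu_percent"]):
        raise ValueError("scale-in threshold must be below scale-out threshold")
    return low, desired, high


sample = '''
asg_min="3"
asg_max="9"
asg_desired="6"
scaleout_cpu_percent="80"
scalein_cpu_percent="40"
server_subnets="subnet-ab12,subnet-ab12"
'''
parsed = parse_parameters(sample)
bounds = validate_scaling(parsed)
print(bounds)
```

Rejecting an inconsistent file up front is cheaper than discovering the problem after the stack has been partially created.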
Three Lambda functions make up the core of the stack creation; these
microfunctions decouple the compute cluster, Application Load
Balancer, and ECS service so they can have their own lifecycles.
• Compute cluster: a Lambda function creates the ECS cluster and EC2
instances. The compute cluster is rehydrated independently.
• Application Load Balancer: a Lambda function creates the load
balancer with a default target group (TG). End users get the same
endpoints.
• ECS service: a Lambda function creates the target group, ECS
service, and rule. This covers target group and service deployments.
[Diagrams: each layer is shown on an ECS cluster in an Auto Scaling
group across Availability Zones A and B, with service1 and service2
tasks running on the ECS instances.]
Users perform regular AMI updates without outages using
automation tooling Lambda functions
[Diagram: old AMI container stack. One ECS cluster in an Auto Scaling
group across Availability Zones A and B; both ECS instances run the old
AMI and host the service1 and service2 tasks.]
AMI update: Lambda function creates new ECS cluster and EC2
instances with the new AMI.
[Diagram, step 1: Lambda creates updated EC2 instances. A new ECS
cluster with new AMI instances stands next to the old AMI cluster,
which still runs the service1 and service2 tasks.]
AMI update: Lambda function replicates ECS services from old
AMI cluster to new AMI-based ECS cluster
[Diagram, step 2: Lambda replicates the ECS services to the new AMI
instances. Both the old AMI cluster and the new AMI cluster now run
the service1 and service2 tasks.]
AMI update: Lambda function drains and deletes ECS services from
the old ECS cluster and terminates the old AMI EC2 instances
[Diagram, step 3: Lambda deletes the ECS services and the old
instances. The old AMI cluster is marked X; the new AMI cluster keeps
running the service1 and service2 tasks.]
AMI update: This completes the AMI update for the stack without
causing any outages
[Diagram: new AMI container stack. One ECS cluster in an Auto Scaling
group across Availability Zones A and B; both ECS instances run the new
AMI and host the service1 and service2 tasks.]
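The three AMI update steps can be summarized as an in-memory model. This is a sketch only: the `Cluster` class and function names are hypothetical, and in a real Lambda function each step would call AWS APIs and wait for the new tasks to become healthy before moving on.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    ami: str
    services: dict = field(default_factory=dict)  # service -> desired tasks


def create_cluster(old, new_ami):
    """Step 1: create a new, empty cluster on the new AMI."""
    return Cluster(name=f"{old.name}-{new_ami}", ami=new_ami)


def replicate_services(old, new):
    """Step 2: run the same services and task counts on the new cluster."""
    new.services = dict(old.services)


def drain_and_delete(old):
    """Step 3: scale old services to zero, then delete them."""
    for name in list(old.services):
        old.services[name] = 0  # drain
        del old.services[name]


def update_ami(old, new_ami):
    new = create_cluster(old, new_ami)
    replicate_services(old, new)
    # Only retire the old cluster once the new one mirrors it.
    if new.services != old.services:
        raise RuntimeError("replication incomplete; keeping old cluster")
    drain_and_delete(old)
    return new


old_cluster = Cluster("my-app-cluster", "ami-old", {"service1": 3, "service2": 3})
new_cluster = update_ami(old_cluster, "ami-new")
print(new_cluster)
```

Because the old cluster keeps serving until the new one carries the same services, there is no point in the sequence where capacity drops to zero.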
Blue/green deployment reduces downtime and risk by running two
environments called Blue and Green and toggling between them
Image Courtesy: http://martinfowler.com/bliki/BlueGreenDeployment.html
Users perform Blue/Green deployments and rollbacks using
automation tooling Lambda functions
[Diagram: an ECS cluster in an Auto Scaling group across Availability
Zones A and B; the ECS instances run the blue service tasks
service1Task1 and service1Task2.]
Blue/Green: Lambda function creates a beta ELB and green service;
users can test green service with Beta ELB
[Diagram: Lambda creates the beta load balancer and the green service.
Blue and green service1 tasks run in the same ECS cluster across
Availability Zones A and B.]
Blue/Green: Lambda function adds green service to the original
ELB; traffic flows to green service
[Diagram: Lambda adds the green service to the original load balancer.
Blue and green service1 tasks run in the same ECS cluster across
Availability Zones A and B.]
Blue/Green: Lambda function deletes blue services and the Beta
ELB; traffic flows to green service
[Diagram: Lambda deletes the beta load balancer and the blue services.
Only the green service1 tasks remain in the ECS cluster.]
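The blue/green sequence above amounts to a small state machine over load balancer attachments. The sketch below models it in memory; the class and method names are hypothetical, and the `rollback` step is an assumption, since the slides mention rollbacks without describing how they work.

```python
class BlueGreen:
    """Tracks which services exist and which load balancer serves them."""

    def __init__(self):
        self.services = {"blue"}
        self.load_balancers = {"original": {"blue"}}

    def create_green(self):
        """Create the green service behind a beta load balancer for testing."""
        self.services.add("green")
        self.load_balancers["beta"] = {"green"}

    def promote_green(self):
        """Attach green to the original load balancer so it takes traffic."""
        self.load_balancers["original"].add("green")

    def retire_blue(self):
        """Delete the blue service and the beta load balancer."""
        self.load_balancers["original"].discard("blue")
        self.services.discard("blue")
        self.load_balancers.pop("beta", None)

    def rollback(self):
        """Assumed rollback: detach green from the original load balancer."""
        self.load_balancers["original"].discard("green")


deployment = BlueGreen()
deployment.create_green()
after_create = {name: set(s) for name, s in deployment.load_balancers.items()}
deployment.promote_green()
deployment.retire_blue()
print(deployment.load_balancers)
```

Keeping the original load balancer throughout means end users never see an endpoint change; only the set of services behind it changes.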
Canary deployment is a way of releasing a new version of an
application by mixing new and old versions and gradually
increasing the percentage of the new version
[Diagram: three stages behind a load balancer.]
• Load Balancer: V1 V1 V1 V1 V2
• Load Balancer: V1 V1 V2 V2 V2
• Load Balancer: V2 V2 V2 V2 V2
Canary deployment automation allows applications to roll out new
versions in a very controlled manner
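A gradual rollout like this can be expressed as a schedule of old and new task counts. The following is a minimal sketch; the function name `canary_steps` and the percentage values are hypothetical, and rounding the new-version count up is an assumed policy.

```python
def canary_steps(total_tasks, percentages):
    """Return (old_count, new_count) for each rollout stage.

    percentages must be non-decreasing and end at 100. The new-version
    count is rounded up so every stage runs at least one new task.
    """
    percentages = list(percentages)
    if not percentages or percentages != sorted(percentages):
        raise ValueError("percentages must be non-empty and non-decreasing")
    if percentages[-1] != 100:
        raise ValueError("the final stage must reach 100 percent")
    steps = []
    for percent in percentages:
        new_count = -(-total_tasks * percent // 100)  # integer ceiling
        steps.append((total_tasks - new_count, new_count))
    return steps


schedule = canary_steps(5, [20, 60, 100])
for old_count, new_count in schedule:
    print("V1 " * old_count + "V2 " * new_count)
```

With five tasks and stages of 20, 60, and 100 percent, the schedule moves from one new task to three and then to all five.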
Canary release automation uses Lambda, ECS, and other AWS
services. Flexible, serverless, and lean.
[Diagram: canary release pipeline with steps numbered 1 to 12.
Components shown: Deployment Request Bucket, Deployment JSON, S3
Trigger Lambda function, Deployment SQS Queue, SQS Poll Lambda
function, ECS Service, ECS Instances, Deployment State Bucket, SNS
Topic.]
Lessons learned and looking forward
• Amazon ECS has significantly reduced our container stack operations
• With ECS and our automation tooling, Docker apps can be deployed
with production hardened container stack in minutes
We would like to see the following ECS features that will
accelerate our enterprise adoption
• Container-level security groups
• Container placement constraints
• Balancing placements with scale-in, scale-out actions