© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. November 13, 2014 | Las Vegas Elastic Load Balancing Deep Dive & Best Practices David Brown, Director, Software Engineering

(SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014


DESCRIPTION

Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances for fault tolerance and load distribution. In this session, we go into detail about Elastic Load Balancing's configuration and day-to-day management, as well as its use in conjunction with Auto Scaling. We explain how to make decisions about the service's many customization choices. We also share best practices and useful tips for success.


Page 1: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014


November 13, 2014 | Las Vegas

Elastic Load Balancing

Deep Dive & Best Practices

David Brown, Director, Software Engineering

Page 2: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances.

Page 3: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Secure, Elastic, Integrated, Cost Effective

Page 4: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

[Diagram: a single EC2 instance]

Page 5: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

A load balancer is used to route incoming requests to multiple EC2 instances.

[Diagram: ELB in front of three EC2 instances]

Page 6: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

EC2-Classic

• Load balance over classic EC2 instances.
• Support for public IP addresses only.
• No control over the load balancer security group.

EC2-VPC

• Load balance over EC2 instances within a VPC.
• Support for both public and private IP addresses.
• Full control over the load balancer security group.
• Tightly integrated into the associated VPC and subnets.

Page 7: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Architecture

[Diagram: Amazon Route 53; ELB VPC with two ELB nodes; Customer VPC with two EC2 instances; Availability Zones us-west-1a and us-west-1b]

Page 8: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

TCP/SSL

• Incoming client connection bound to server connection
• No header modification
• Proxy Protocol prepends source and destination IP and ports to request
• Round robin algorithm used for request routing

HTTP/HTTPS

• Connection terminated at the load balancer and pooled to the server
• Headers may be modified
• X-Forwarded-For header contains client IP address
• Least outstanding requests algorithm used for request routing
• Sticky session support available
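
For a back-end application, the practical difference between the two listener types is where the client address shows up. The sketch below is illustrative only and is not taken from the talk: it assumes the human-readable Proxy Protocol version 1 header for TCP/SSL listeners and takes the left-most entry of X-Forwarded-For for HTTP/HTTPS listeners. The names `ProxyInfo`, `parse_proxy_v1`, and `client_ip_from_xff` are hypothetical.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    protocol: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> ProxyInfo:
    """Parse a Proxy Protocol version 1 header line.

    Assumed layout: PROXY <TCP4|TCP6> <src ip> <dst ip> <src port> <dst port>,
    terminated by CRLF. Any other shape (including "PROXY UNKNOWN") is rejected.
    """
    text = line.decode("ascii").rstrip("\r\n")
    parts = text.split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError(f"not a Proxy Protocol v1 header: {text!r}")
    _, protocol, src_ip, dst_ip, src_port, dst_port = parts
    return ProxyInfo(protocol, src_ip, dst_ip, int(src_port), int(dst_port))


def client_ip_from_xff(header_value: str) -> str:
    """Return the left-most address of an X-Forwarded-For header value."""
    return header_value.split(",")[0].strip()
```

With a TCP listener the application reads the header line once per connection, before any application data; with an HTTP listener it reads the header on every request. Only trust the left-most X-Forwarded-For entry when all traffic arrives through the load balancer.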

Page 9: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Health checks allow for traffic to be shifted away from failed instances

Page 10: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Health Checks

Health checks ensure that request traffic is shifted away from a failed instance.

[Diagram: ELB in front of three EC2 instances]

Page 11: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Health Checks

Support for TCP and HTTP health checks.

Customize the frequency and failure thresholds.

Must return a 2xx response.

Consider the depth and accuracy of your health checks.
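
To make the failure thresholds concrete, here is a minimal sketch of how consecutive health check results could flip an instance between healthy and unhealthy. It is a toy model, not the service's implementation: the class name `HealthTracker` and the default threshold values are assumptions chosen for illustration, and a check counts as passed only on a 2xx status, as the slide states.

```python
class HealthTracker:
    """Track one instance's state from a stream of health check results."""

    def __init__(self, unhealthy_threshold: int = 2, healthy_threshold: int = 10):
        # Threshold defaults are illustrative assumptions, not service defaults.
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, status_code: int) -> bool:
        """Record one check result and return the resulting state."""
        passed = 200 <= status_code < 300
        if passed == self.healthy:
            # Result agrees with the current state, so any streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        limit = self.healthy_threshold if passed else self.unhealthy_threshold
        if self._streak >= limit:
            self.healthy = passed
            self._streak = 0
        return self.healthy
```

A low unhealthy threshold removes a failed instance quickly, while a higher healthy threshold keeps a flapping instance out of rotation until it has proven itself stable.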

Page 12: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Idle timeouts allow for connections to be closed by the load balancer when no longer in use.

Page 13: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Idle Timeouts

Length of time that an idle connection should be kept open.

For both client and back-end connections.

Defaults to 60 seconds but can be set between 1 and 3,600 seconds.

Timeouts should decrease as you go up the stack.
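
The range and default above can be captured in a small helper. This is a sketch under stated assumptions: the function name `idle_timeout_attributes` is hypothetical, and the returned dictionary follows the `ConnectionSettings` / `IdleTimeout` attribute shape used by the classic load balancer API in boto3.

```python
DEFAULT_IDLE_TIMEOUT = 60          # seconds, per the slide
MIN_IDLE_TIMEOUT = 1               # seconds
MAX_IDLE_TIMEOUT = 3600            # seconds


def idle_timeout_attributes(seconds: int = DEFAULT_IDLE_TIMEOUT) -> dict:
    """Build a load balancer attributes payload for the given idle timeout."""
    if not MIN_IDLE_TIMEOUT <= seconds <= MAX_IDLE_TIMEOUT:
        raise ValueError(
            f"idle timeout must be between {MIN_IDLE_TIMEOUT} and "
            f"{MAX_IDLE_TIMEOUT} seconds, got {seconds}"
        )
    return {"ConnectionSettings": {"IdleTimeout": seconds}}


# Applying it with boto3 might look like this ("my-load-balancer" is a
# placeholder name, and valid AWS credentials are assumed):
#
#   import boto3
#   boto3.client("elb").modify_load_balancer_attributes(
#       LoadBalancerName="my-load-balancer",
#       LoadBalancerAttributes=idle_timeout_attributes(120),
#   )
```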

Page 14: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Idle Timeouts

[Diagram: idle timeout values (15s, 3s, 9s) annotated on the connections between the ELB, EC2 instances, and Amazon S3, Amazon RDS, and Amazon SWF]

Page 15: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Using multiple Availability Zones

Page 16: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Multiple Availability Zones

[Diagram: Amazon Route 53; ELB VPC with two ELB nodes; Customer VPC with two EC2 instances; Availability Zones us-west-1a and us-west-1b]

Page 17: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Multiple Availability Zones

[Diagram: Amazon Route 53; ELB VPC with two ELB nodes; Customer VPC with one EC2 instance; Availability Zones us-west-1a and us-west-1b]

Page 18: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Always associate two or more subnets in different zones with the load balancer

Page 19: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Using multiple Availability Zones does bring a few challenges.

Page 20: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Traffic Imbalances

[Chart: request count over time]

Page 21: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Imbalanced Instance Capacity

[Diagram: Amazon Route 53; ELB VPC with two ELB nodes; Customer VPC with one EC2 instance and a group of EC2 instances; Availability Zones us-west-1a and us-west-1b]

Page 22: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Cross-Zone Load Balancing

[Diagram: Amazon Route 53; ELB VPC with two ELB nodes; Customer VPC with one EC2 instance and a group of EC2 instances; Availability Zones us-west-1a and us-west-1b]

Page 23: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Traffic Imbalances: Cross-Zone Enabled

[Chart: request count over time]

Page 24: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Cross-Zone Load Balancing

Load balancer absorbs impact of DNS caching.

Eliminates imbalances in back-end instance utilization.

Requests distributed evenly across multiple Availability Zones.

Check connection limits before enabling.

No additional bandwidth charge for cross-zone traffic.
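
A small calculation shows why cross-zone load balancing evens out instance utilization. This is a simplified model rather than the service's routing logic: it assumes each zone's load balancer node receives a fixed share of requests from DNS, that every zone has at least one instance, and that requests are spread evenly over whichever instances a node may use. The function name and the figures in the usage note are hypothetical.

```python
def per_instance_load(zone_requests: dict, zone_instances: dict, cross_zone: bool) -> dict:
    """Return the number of requests each instance receives, keyed by zone.

    zone_requests:  requests arriving at the load balancer node in each zone
    zone_instances: number of back-end instances in each zone (all >= 1)
    """
    if cross_zone:
        # Every node may route to every instance, so load is shared globally.
        share = sum(zone_requests.values()) / sum(zone_instances.values())
        return {zone: share for zone in zone_instances}
    # Without cross-zone, a node only routes to instances in its own zone.
    return {
        zone: zone_requests[zone] / zone_instances[zone]
        for zone in zone_instances
    }
```

For example, with 600 requests arriving in us-west-1a (one instance) and 400 in us-west-1b (three instances), the single instance in us-west-1a handles 600 requests without cross-zone, but every instance handles 250 with it enabled.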

Page 25: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Understanding DNS

Each load balancer domain may contain multiple records.

Round robin is used to balance traffic between Availability Zones.

DNS records will change over time; never target IP addresses directly.

After being removed from DNS, IP addresses are drained and quarantined for up to 7 days.

Page 26: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

DNS Optimization

DNS caching by clients and ISPs can often cause clients to target a specific IP address or to stop resolving at all.

Register a wildcard CNAME or ALIAS within Amazon Route 53.

// Create a wildcard CNAME or ALIAS in Route 53.
*.example.com ALIAS … elb-12345.us-east-1.elb.amazonaws.com
*.example.com CNAME elb-12345.us-east-1.elb.amazonaws.com

// Prepend random content for each lookup made by the application.
PROMPT> dig +short 25a8ade5-6557-4a54-a60e-8f51f3b195d1.example.com
192.0.2.1
192.0.2.2
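
Inside an application, the random-prefix trick could be sketched as follows. This assumes a wildcard record for `example.com` already exists as described above; the function names are hypothetical, and `resolve_all` needs network access and a resolvable name to return anything useful.

```python
import socket
import uuid


def cache_busting_hostname(domain: str = "example.com") -> str:
    """Return a unique name under the wildcard record, e.g. <uuid>.example.com."""
    return f"{uuid.uuid4()}.{domain}"


def resolve_all(hostname: str, port: int = 80) -> list:
    """Return the distinct IP addresses the resolver currently reports."""
    results = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Because every lookup uses a name that no cache has seen before, intermediate resolvers cannot serve a stale answer, and the client sees the load balancer's current set of records.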

Page 27: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

SSL Offloading

Support for both SSL and HTTPS is provided.

Support for the latest ciphers and protocols, including Elliptic Curve ciphers and Perfect Forward Secrecy.

Ability to fully customize ciphers and protocols to be used by each load balancer.

SSL Negotiation Suites provided to remove complexity of selecting ciphers and protocols.

Page 28: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

SSL Negotiation Policies

Provide a selection of ciphers and protocols that adhere to the latest industry best practices.

Balance security best practices against clients' ability to negotiate a connection; generated using traffic to Amazon.com.

Released on a regular cadence or when new vulnerabilities are published.

Default for all new load balancers.

Page 29: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

POODLE Mitigation

Within 24 hours, 62% of load balancers migrated to the latest SSL Negotiation Policy, disabling SSLv3.

Page 30: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

“@awscloud Thank-you #AWS for making it so easy to prevent #sslv3 #poodleattack Only took about 3 clicks of my mouse.”

@granticini

Page 31: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Amazon CloudWatch Metrics

13 CloudWatch metrics provided for each load balancer.

Provide detailed insight into the health of the load balancer and application stack.

CloudWatch alarms can be configured to notify or take action should any metric go outside of the acceptable range.

All metrics provided at 1-minute granularity.

Page 32: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

HealthyHostCount

The count of healthy instances in each Availability Zone.

The most common cause of unhealthy hosts is health checks exceeding the allocated timeout.

Test by making repeated requests to the back-end instance from another EC2 instance.

View at the zonal dimension.

Page 33: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Latency

Measures the time elapsed in seconds after the request leaves the load balancer until the response is received.

Test by sending requests to the back-end instance from another instance.

The min, average, and max CloudWatch statistics provide upper and lower bounds for latency.

Debug individual requests using Access Logs.

Page 34: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

SurgeQueue and Spillovers

Count of the number of requests that could not be sent to back-end instances.

Queue up to 1024 requests per load balancer node, after which 503 errors will be returned.

Often caused by not being able to open connections to the back-end instance.

Normally a sign of an under-scaled application.
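
The relationship between the surge queue and spillovers can be illustrated with a bounded queue. This is a toy model of a single load balancer node, not the service's implementation: the class and method names are hypothetical, and only the 1024-request limit and the 503 outcome come from the slide.

```python
from collections import deque

SURGE_QUEUE_LIMIT = 1024  # requests per load balancer node, per the slide


class LoadBalancerNode:
    """Toy model of one node's surge queue and spillover counter."""

    def __init__(self, limit: int = SURGE_QUEUE_LIMIT):
        self.limit = limit
        self.surge_queue = deque()
        self.spillover_count = 0

    def enqueue(self, request) -> bool:
        """Queue a request that cannot reach a back end yet.

        Returns True if queued, False if it spilled over (client gets a 503).
        """
        if len(self.surge_queue) >= self.limit:
            self.spillover_count += 1
            return False
        self.surge_queue.append(request)
        return True

    def dispatch(self):
        """Hand the oldest queued request to a back end, or None if empty."""
        return self.surge_queue.popleft() if self.surge_queue else None
```

A queue that keeps filling means requests arrive faster than back ends accept connections, which is why a rising queue length followed by spillovers usually points to an under-scaled application.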

Page 35: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

CloudWatch and Auto Scaling

All load balancer metrics can be used for Auto Scaling.

Allow you to scale dynamically based on the load balancer's view of the application.

Consider all metrics when using Auto Scaling; scaling on one metric may leave you unaware of resource contention on another.

You may be at peak multiple times a day.

Page 36: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Access Logs

Provide detailed information on each request processed by the load balancer.

Includes request time, client IP address, latencies, request path, and server responses.

Delivered to an Amazon S3 bucket every 5 or 60 minutes.

Page 37: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Access Logs

Logs indexed by date but include the IP address of the load balancer node itself.

[Diagram: ELB nodes in the ELB VPC delivering logs to Amazon S3]

Page 38: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Access Logs

• timestamp
• elb name
• client:port
• backend:port
• request_processing_time
• backend_processing_time
• response_processing_time
• elb_status_code
• backend_status_code
• received_bytes
• sent_bytes
• "request"

2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1"
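
A log entry in this layout can be split into named fields with the standard library. The sketch below is illustrative: the function name is hypothetical, the sample line assumes the back-end field reads `10.0.0.1:80`, and it assumes HTTP listener entries where the status codes and timings are numeric.

```python
import shlex

FIELDS = [
    "timestamp", "elb_name", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request",
]
FLOAT_FIELDS = FIELDS[4:7]   # the three processing times, in seconds
INT_FIELDS = FIELDS[7:11]    # status codes and byte counts


def parse_access_log_line(line: str) -> dict:
    """Split one access log entry into a dict keyed by field name."""
    # shlex keeps the quoted "request" field together as a single token.
    record = dict(zip(FIELDS, shlex.split(line)))
    for name in FLOAT_FIELDS:
        record[name] = float(record[name])
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


SAMPLE = (
    '2014-02-15T23:39:43.945958Z my-test-loadbalancer '
    '192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 '
    '200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1"'
)
```

Summing the three processing times gives the total time the request spent in the load balancer and back end, which is useful when debugging an individual slow request.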

Page 39: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

“Everything fails all the time” (Werner Vogels, CTO, Amazon.com)

Page 40: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Be prepared to do nothing!

Page 41: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Mitigation, Isolation, Restore Redundancy

Page 42: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Mitigation

All load balancers are scaled to handle the loss of a single Availability Zone.

Amazon Route 53 health checks shift traffic away from the failed Availability Zone.

Completed within 150 seconds.

No other external or control plane dependencies.

Page 43: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Isolation

Other zones must remain unaffected.

Avoid dependencies between zones.

Be careful of work generated as a result of the event.

Operating at reduced capacity but stable.

Page 44: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Constant Work

Health checkers and edge locations perform the same volume of activity whether endpoints are healthy or unhealthy.

[Chart: system activity over time, with the time to react marked]

Work on Failure

When nothing is failing, the volume of API calls is zero. When failure occurs, the volume of API calls spikes.

[Chart: system activity over time, with the time to react marked]
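
The contrast between the two designs can be shown by counting operations per cycle. This is a deliberately simplified sketch, not a description of how the health checkers are built: both function names are hypothetical, and "work" is reduced to one operation per endpoint status that gets published.

```python
def constant_work_ops(previous: dict, current: dict) -> int:
    """Constant work: publish the full status table every cycle.

    The cost depends only on the number of endpoints, never on how many changed.
    """
    return len(current)


def work_on_failure_ops(previous: dict, current: dict) -> int:
    """Work on failure: issue one call per endpoint whose status changed."""
    return sum(
        1 for endpoint, healthy in current.items()
        if previous.get(endpoint) != healthy
    )
```

With four healthy endpoints, the constant-work design performs four operations in a quiet cycle and four when all of them fail at once, while the work-on-failure design jumps from zero to four exactly when the system is already under stress.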

Page 45: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Restore Redundancy

Restoring the system back to full capacity.

Avoid putting additional load on the system by rushing this step.

Ensure that recovered resources are left in a consistent state.

Fully recovered when done.

Page 46: (SDD423) Elastic Load Balancing Deep Dive and Best Practices | AWS re:Invent 2014

Please give us your feedback on this presentation


Join the conversation on Twitter with #reinvent

SDD423