Elastic Load Balancing Deep Dive & Best Practices Iftach Ragoler Senior Manager Elastic Load Balancing AWS Loft – Tel Aviv March 2016

Elastic Load Balancing Deep Dive and Best Practices - Pop-up Loft Tel Aviv



Hardware Load Balancers

• Plan for peak time
• High Cost
• High Maintenance

Software Load Balancers

• Manual Scaling
• High Maintenance
• Lack of fault tolerance

Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances.

Secure, Elastic, Integrated, Cost Effective

[Diagram: a single EC2 instance]

Load Balancer used to route incoming requests to multiple EC2 instances.

[Diagram: an ELB in front of three EC2 instances]

EC2-Classic

• Load balance over classic EC2 instances.
• Support for public IP addresses only.
• No control over the load balancer security group.

EC2-VPC

• Load balance over EC2 instances within a VPC.
• Support for both public and private IP addresses.
• Full control over the load balancer security group.
• Tightly integrated into the associated VPC and subnets.

Architecture

[Diagram: Amazon Route 53, ELB nodes in the ELB VPC, and EC2 instances in the customer VPC, spread across Availability Zones us-west-1a and us-west-1b]

TCP/SSL

• Incoming client connection bound to server connection
• No header modification
• Proxy Protocol prepends source and destination IP and ports to request
• Round robin algorithm used for request routing

HTTP/HTTPS

• Connection terminated at the load balancer and pooled to the server
• Headers may be modified
• X-Forwarded-For header contains client IP address
• Least outstanding requests algorithm used for request routing
• Sticky session support available
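
As a rough illustration of how a back-end instance might recover the client address in each listener mode, the sketch below parses a Proxy Protocol version 1 header line (the human-readable form) and splits an X-Forwarded-For header into its chain of addresses. The names `ProxyInfo`, `parse_proxy_v1`, and `forwarded_chain` are hypothetical helpers, not part of any AWS SDK, and the addresses are documentation placeholders.

```python
from typing import List, NamedTuple, Optional


class ProxyInfo(NamedTuple):
    family: str    # "TCP4" or "TCP6"
    src_ip: str    # address of the original client
    dst_ip: str    # address the client connected to
    src_port: int
    dst_port: int


def parse_proxy_v1(line: bytes) -> Optional[ProxyInfo]:
    """Parse one Proxy Protocol v1 header line, e.g.
    b"PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\\r\\n".
    Returns None for "PROXY UNKNOWN" or anything malformed."""
    parts = line.decode("ascii").rstrip("\r\n").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        return None
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    return ProxyInfo(family, src_ip, dst_ip, int(src_port), int(dst_port))


def forwarded_chain(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its addresses, left to right."""
    return [part.strip() for part in header_value.split(",") if part.strip()]
```

Note that earlier entries in X-Forwarded-For can be supplied by the client itself, so with a single load balancer hop the last entry is the one the load balancer appended.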

Health checks allow for traffic to be shifted away from failed instances

Health Checks

Health checks ensure that request traffic is shifted away from a failed instance.

[Diagram: an ELB in front of three EC2 instances]

Health Checks

Support for TCP and HTTP health checks.

Customize the frequency and failure thresholds.

Must return a 2xx response.

Consider the depth and accuracy of your health checks.
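
The interaction between check results and failure thresholds can be sketched as a small state machine. This is a simplified model under stated assumptions, not the load balancer's actual implementation: the class name `HealthTracker` and the default threshold values are hypothetical, and a check counts as passing only if it returns a 2xx status before timing out.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    healthy_threshold: int = 3    # consecutive passes needed to enter service (assumed value)
    unhealthy_threshold: int = 2  # consecutive failures needed to leave service (assumed value)
    in_service: bool = False
    _streak: int = 0              # consecutive results that contradict the current state

    def record(self, status_code: int, timed_out: bool = False) -> bool:
        """Record one health check result and return the resulting state."""
        ok = (not timed_out) and 200 <= status_code < 300
        if ok == self.in_service:
            self._streak = 0      # result agrees with the current state
            return self.in_service
        self._streak += 1
        needed = self.healthy_threshold if ok else self.unhealthy_threshold
        if self._streak >= needed:
            self.in_service = ok  # enough consecutive evidence: flip the state
            self._streak = 0
        return self.in_service
```

A single failed check does not remove the instance; only a run of failures that reaches the threshold does, which is why frequency and thresholds together determine how quickly traffic shifts away.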


Idle timeouts allow for connections to be closed by the load balancer when no longer in use.

Idle Timeouts

Length of time that an idle connection should be kept open.

For both client and back-end connections.

Defaults to 60 seconds but can be set between 1 and 3,600 seconds.

Timeouts should decrease as you go up the stack.
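
A minimal sketch of validating an idle timeout against the range quoted above before applying it. The function name `idle_timeout_attributes` is hypothetical. The returned dictionary is assumed to match the `LoadBalancerAttributes` shape accepted by the boto3 classic ELB client's `modify_load_balancer_attributes` call; check the SDK documentation before relying on it.

```python
MIN_IDLE_TIMEOUT = 1      # seconds, lower bound quoted on the slide
MAX_IDLE_TIMEOUT = 3600   # seconds, upper bound quoted on the slide
DEFAULT_IDLE_TIMEOUT = 60


def idle_timeout_attributes(seconds: int = DEFAULT_IDLE_TIMEOUT) -> dict:
    """Build the attribute payload for an idle timeout, rejecting
    values outside the supported range."""
    if not MIN_IDLE_TIMEOUT <= seconds <= MAX_IDLE_TIMEOUT:
        raise ValueError(
            f"idle timeout must be between {MIN_IDLE_TIMEOUT} and "
            f"{MAX_IDLE_TIMEOUT} seconds, got {seconds}"
        )
    # Assumed payload shape; verify against the SDK documentation.
    return {"ConnectionSettings": {"IdleTimeout": seconds}}
```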

Idle Timeouts

[Diagram: example idle timeouts along the request path through the ELB, EC2 instances, and downstream services (Amazon S3, Amazon RDS, Amazon SWF), labeled with values of 15s, 9s, and 3s]


Using multiple Availability Zones

Multiple Availability Zones

[Diagram: Amazon Route 53, two ELB nodes in the ELB VPC, and two EC2 instances in the customer VPC, across Availability Zones us-west-1a and us-west-1b]

Multiple Availability Zones

[Diagram: Amazon Route 53, two ELB nodes in the ELB VPC, and one EC2 instance in the customer VPC, across Availability Zones us-west-1a and us-west-1b]

Always associate two or more subnets in different zones with the load balancer


Using multiple Availability Zones does bring a few challenges.

Traffic Imbalances

[Chart: request count over time]

Imbalanced Instance Capacity

[Diagram: Amazon Route 53 and two ELB nodes in the ELB VPC across us-west-1a and us-west-1b; the customer VPC has a single EC2 instance in one zone and several EC2 instances in the other]

Cross-Zone Load Balancing

[Diagram: Amazon Route 53 and two ELB nodes in the ELB VPC across us-west-1a and us-west-1b; the customer VPC has a single EC2 instance in one zone and several EC2 instances in the other]

Traffic Imbalances: Cross-Zone Enabled

[Chart: request count over time]

Cross-Zone Load Balancing

Load balancer absorbs impact of DNS caching.

Eliminates imbalances in back-end instance utilization.

Requests distributed evenly across multiple Availability Zones.

Check connection limits before enabling.

No additional bandwidth charge for cross-zone traffic.
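
To make the imbalance concrete, the sketch below computes the share of total traffic each instance receives with and without cross-zone load balancing. It is a simplified model, assuming DNS splits traffic evenly between zones and each load balancer node spreads its traffic evenly over the instances it can reach. The function name `per_instance_share` and the instance counts are hypothetical.

```python
from typing import Dict


def per_instance_share(instances_per_zone: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Return, per zone, the fraction of total traffic handled by ONE instance there."""
    zones = [zone for zone, count in instances_per_zone.items() if count > 0]
    total_instances = sum(instances_per_zone[zone] for zone in zones)
    if cross_zone:
        # Every node can reach every instance, so all instances share equally.
        return {zone: 1.0 / total_instances for zone in zones}
    # Each zone receives an equal slice from DNS and divides it locally.
    zone_slice = 1.0 / len(zones)
    return {zone: zone_slice / instances_per_zone[zone] for zone in zones}


fleet = {"us-west-1a": 1, "us-west-1b": 3}
without = per_instance_share(fleet, cross_zone=False)
with_cz = per_instance_share(fleet, cross_zone=True)
```

With one instance in the first zone and three in the second, the lone instance handles half of all traffic when cross-zone is disabled and a quarter when it is enabled.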

Understanding DNS

Each load balancer domain may contain multiple records.

Round robin used to balance traffic between Availability Zones.

DNS records will change over time; never target IP addresses directly.

After being removed from DNS, IP addresses are drained and quarantined for up to 7 days.

DNS Optimization

DNS caching by clients and ISPs can often cause clients to target a specific IP address or stop resolving at all.

Register a wildcard CNAME or ALIAS within Amazon Route 53.

// Create a wildcard CNAME or ALIAS in Route 53.
*.example.com ALIAS … elb-12345.us-east-1.elb.amazonaws.com
*.example.com CNAME elb-12345.us-east-1.elb.amazonaws.com

// Prepend random content for each lookup made by the application.
PROMPT> dig +short 25a8ade5-6557-4a54-a60e-8f51f3b195d1.example.com
192.0.2.1
192.0.2.2
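
The random-prefix technique can be sketched in application code as follows. This assumes a wildcard record for `example.com` already exists, as in the dig example. The function names `cache_busting_hostname` and `resolve` are hypothetical, and `resolve` needs network access to return anything.

```python
import socket
import uuid
from typing import List


def cache_busting_hostname(wildcard_domain: str = "example.com") -> str:
    """Prepend a random UUID label so that intermediate resolvers
    cannot answer the lookup from a cached entry."""
    return f"{uuid.uuid4()}.{wildcard_domain}"


def resolve(hostname: str, port: int = 80) -> List[str]:
    """Resolve a hostname to its current set of IP addresses.
    Call this per lookup rather than storing the result long term."""
    results = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

Each call to `cache_busting_hostname()` yields a name that has never been looked up before, so the wildcard record is resolved fresh.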

SSL Offloading

Support for both SSL and HTTPS is provided.

Support for latest ciphers and protocols including Elliptic Curve Ciphers and Perfect Forward Secrecy.

Ability to fully customize ciphers and protocols to be used by each load balancer.

SSL Negotiation Suites provided to remove complexity of selecting ciphers and protocols.

SSL Negotiation Policies

Provide a selection of ciphers and protocols that adhere to the latest industry best practices.

Balance security best practices against clients' ability to negotiate a connection; the selection is generated using traffic to Amazon.com.

Released on a regular cadence or when new vulnerabilities are published.

Default for all new load balancers.

AWS Certificate Manager (ACM) Integration with ELB

• aws acm request-certificate --domain-name demo2.example.us --idempotency-token demo2 --endpoint-url <endpoint URL> --region ap-southeast-2

• Verify certificate via link attached to email

• aws elb create-load-balancer-listeners --load-balancer-name Demo2 --listeners Protocol=HTTPS,LoadBalancerPort=443,InstanceProtocol=HTTP,InstancePort=80,SSLCertificateId=arn:aws:acm:ap-southeast-2:015209794502:certificate/6beab518-24b4-4893-9b81-85af49c9f977 --region ap-southeast-2

• aws acm describe-certificate --certificate-arn arn:aws:acm:ap-southeast-2:015209794502:certificate/6beab518-24b4-4893-9b81-85af49c9f977 --region ap-southeast-2

• aws elb set-load-balancer-listener-ssl-certificate --load-balancer-name <your elb name, e.g. ELB2> --load-balancer-port 443 --ssl-certificate-id arn:aws:acm:us-east-1:<your ACM cert ARN>

Provision, manage, and deploy SSL/TLS certificates to ELB with ACM

Makes it very simple to manage certificates when offloading SSL to ELB

POODLE Mitigation

Within 24 hours, 62% of load balancers migrated to the latest SSL Negotiation Policy, disabling SSLv3.

“@awscloud Thank-you #AWS for making it so easy to prevent #sslv3 #poodleattack Only took about 3 clicks of my mouse.” @granticini

Amazon CloudWatch Metrics

13 CloudWatch metrics provided for each load balancer.

Provide detailed insight into the health of the load balancer and application stack.

CloudWatch alarms can be configured to notify or take action should any metric go outside of the acceptable range.

All metrics provided at 1-minute granularity.

HealthyHostCount

The count of the number of healthy instances in each Availability Zone.

The most common cause of unhealthy hosts is a health check exceeding its allocated timeout.

Test by making repeated requests to the back-end instance from another EC2 instance.

View at the zonal dimension.

Latency

Measures the time elapsed in seconds after the request leaves the load balancer until the response is received.

Test by sending requests to the back-end instance from another instance.

The min, average, and max CloudWatch statistics provide upper and lower bounds for latency.

Debug individual requests using Access Logs.

SurgeQueue and Spillovers

Count of the number of requests that could not be sent to back-end instances.

Queue up to 1024 requests per load balancer node, after which 503 errors will be returned.

Often caused by not being able to open connections to the back-end instance.

Normally a sign of an under-scaled application.
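
The relationship between the surge queue and spillovers can be modeled with a bounded queue. This is an illustrative sketch, not the load balancer's implementation: the class `SurgeQueue` and its method names are hypothetical, and only the 1024 limit comes from the slide.

```python
from collections import deque

SURGE_QUEUE_LIMIT = 1024  # per load balancer node, as quoted on the slide


class SurgeQueue:
    """Holds requests that cannot yet be sent to a back-end instance."""

    def __init__(self, limit: int = SURGE_QUEUE_LIMIT) -> None:
        self.limit = limit
        self.pending = deque()
        self.spillover_count = 0  # requests rejected because the queue was full

    def offer(self, request) -> bool:
        """Queue a request; return False (the 503 case) when the queue is full."""
        if len(self.pending) >= self.limit:
            self.spillover_count += 1
            return False
        self.pending.append(request)
        return True

    def take(self):
        """Hand the oldest queued request to a back-end connection that freed up."""
        return self.pending.popleft()
```

A queue length that stays near the limit, or any non-zero spillover count, suggests the back end cannot accept connections as fast as requests arrive.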

CloudWatch and Auto Scaling

All load balancer metrics can be used for Auto Scaling.

Allow you to scale dynamically based on the load balancer's view of the application.

Consider all metrics when using Auto Scaling: scaling on one metric alone may miss resource contention that shows up in another.

You may be at peak multiple times a day.

Access Logs

Provide detailed information on each request processed by the load balancer.

Includes request time, client IP address, latencies, request path, server responses, negotiated cipher.

Delivered to your Amazon S3 bucket every 5 or 60 minutes.

Access Logs

[Diagram: ELB nodes in the ELB VPC delivering logs to Amazon S3]

Logs indexed by date but include the IP address of the load balancer node itself.

Access Logs

• timestamp
• elb name
• client:port
• backend:port
• request_processing_time
• backend_processing_time
• response_processing_time
• elb_status_code
• backend_status_code
• received_bytes
• sent_bytes
• "request"
• negotiated cipher and protocol

2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1"
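
A minimal sketch of parsing one access log entry into the fields listed above. The names `AccessLogEntry` and `parse_access_log_line` are hypothetical. The sample line mirrors the slide's example, with the back-end address written as `10.0.0.1:80`, which is an assumption about the intended value. Only the first twelve fields are read; anything after the quoted request is ignored.

```python
import shlex
from typing import NamedTuple, Optional


class AccessLogEntry(NamedTuple):
    timestamp: str
    elb_name: str
    client: str                  # client:port
    backend: str                 # backend:port
    request_processing_time: float
    backend_processing_time: float
    response_processing_time: float
    elb_status_code: Optional[int]
    backend_status_code: Optional[int]
    received_bytes: int
    sent_bytes: int
    request: str


def _status(value: str) -> Optional[int]:
    # A "-" placeholder is treated as "no status code available".
    return None if value == "-" else int(value)


def parse_access_log_line(line: str) -> AccessLogEntry:
    """Split a space-delimited log line, keeping the quoted request intact."""
    f = shlex.split(line)
    if len(f) < 12:
        raise ValueError(f"expected at least 12 fields, got {len(f)}")
    return AccessLogEntry(
        f[0], f[1], f[2], f[3],
        float(f[4]), float(f[5]), float(f[6]),
        _status(f[7]), _status(f[8]),
        int(f[9]), int(f[10]), f[11],
    )


SAMPLE = (
    "2014-02-15T23:39:43.945958Z my-test-loadbalancer 192.168.131.39:2817 "
    '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1"'
)
entry = parse_access_log_line(SAMPLE)
```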

“Everything fails all the time”

Werner Vogels, CTO, Amazon.com


Be prepared to do nothing!

Mitigation, Isolation, Restore Redundancy

Mitigation

All load balancers scaled to handle loss of single Availability Zone.

Amazon Route 53 health checks shift traffic away from the failed Availability Zone.

Completed within max of 150 seconds.

No other external or control plane dependencies.


Isolation

Other zones must remain unaffected.

Avoid dependencies between zones.

Be careful of work generated as a result of the event.

Operating at reduced capacity but stable.

Constant Work

Health checkers and edge locations perform the same volume of activity whether endpoints are healthy or unhealthy.

[Chart: system activity over time, with the time to react marked]

Work on Failure

When nothing is failing, volume of API calls is zero. When failure occurs, volume of API calls spikes.

[Chart: system activity over time, with the time to react marked]
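
The difference between the two patterns can be sketched by counting operations per time step. This is a toy model with hypothetical function names: in the constant-work variant the complete health state is pushed once every tick, while in the work-on-failure variant one call is issued for each endpoint whose state changed.

```python
from typing import List, Sequence, Tuple


def constant_work_calls(health_by_tick: Sequence[Tuple[bool, ...]]) -> List[int]:
    """One full-state push per tick, regardless of what the state is."""
    return [1 for _ in health_by_tick]


def work_on_failure_calls(health_by_tick: Sequence[Tuple[bool, ...]]) -> List[int]:
    """One call per endpoint whose health changed since the previous tick."""
    calls: List[int] = []
    previous = None
    for states in health_by_tick:
        if previous is None:
            calls.append(0)  # nothing to compare against yet
        else:
            calls.append(sum(a != b for a, b in zip(previous, states)))
        previous = states
    return calls


# Four endpoints are healthy for two ticks, then all fail at once.
ticks = [(True,) * 4, (True,) * 4, (False,) * 4, (False,) * 4]
steady = constant_work_calls(ticks)
spiky = work_on_failure_calls(ticks)
```

The constant-work series stays flat through the failure, while the work-on-failure series jumps from zero to one call per failed endpoint at the moment the system is already under stress.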

Restore Redundancy

Restoring the system back to full capacity.

Avoid putting additional load on the system by rushing this step.

Ensure that recovered resources are left in a consistent state.

Fully recovered when done.

ClassicLink: Migration from Classic to VPC Network

• VPC has better isolation, management and functionality

• ClassicLink enables you to register EC2-Classic instances behind a VPC ELB

• Helps with gradual migration of complex stacks


Thank you!