54
Global Server Load Balancing Dima Krioukov [[email protected]] Alex Kit [[email protected]] October 24, 2000

Title Subtitle

Embed Size (px)

Citation preview

Page 1: Title Subtitle

Global Server Load Balancing

Dima Krioukov [[email protected]]

Alex Kit [[email protected]]

October 24, 2000

Page 2: Title Subtitle

GSLB - 2

Purpose

• Existing methods

• New technique

• Analysis

• Applicability considerations

Page 3: Title Subtitle

GSLB - 3

Plan• Introduction

– What are ASPs?– Requirements to IDCs

• LSLB– Load Sharing NAT (LSNAT)– Direct Server Return (DSR)– Tunneling

• GSLB– DNS Based– Host Route Injection (HRI)– Triangle Data Flow (TDF)– Latest Trends

• New Technique – Virtual Block Injection (VBI)– Description– Testing– Analysis

• Applicability Considerations

• Conclusions and References

Page 4: Title Subtitle

GSLB - 4

Abbreviations

• LB = Load Balancing/Balancer

• SLB = Server LB

• LSLB = Local SLB

• GSLB = Global SLB

• HA = High Availability

• RS = Real Server/Service

• VS = Virtual Server/Service

• VIP = VS IP address

• LSNAT = Load Sharing NAT

• DSR = Direct Server Return

• PRP = Proximity Report Protocol

• LRP = Load Report Protocol

• LPRP = PRP + LRP

• HRI = Host Route Injection

• VBI = Virtual Block Injection

• TDF = Triangle Data Flow

• IDC = Internet Data Center

• CDN = Content Delivery Network

• ASP = Application Service Provider

• CASP = Content/Collocation and Application Service Provider

• AIP = Application Infrastructure Provider

• xyP = ?

Page 5: Title Subtitle

GSLB - 5

1. Introduction

Logic: GSLB IDC ASP Hosting

Page 6: Title Subtitle

GSLB - 6

Hosting

Infrastructure

Web User Content Owner IDC Owner ISP

OSS

Page 7: Title Subtitle

GSLB - 7

ASP

Infrastructure

End Customer ASP Applications

Operations

ISP/Backbone

Access

IDC

Page 8: Title Subtitle

GSLB - 8

IDC

IDCCore

(Routing)

Distribution(L3 Switching)

Tier Tier Tier

LB TierLoad Balancing(L4 Switching)

Port Density(L2 Switching)

Servers

SAN

Page 9: Title Subtitle

GSLB - 9

Requirements to IDCs

• Load Balancing (LB)

IDC1 IDC2

Client

– Local– Global

– Local– Global

· Proximity (“including” congestion)· Load

• High Availability (HA)

HA ⊂ LB

Page 10: Title Subtitle

GSLB - 10

2. Generic SLB and LSLBSLB = VS RS

• Health Checking– Layer 2– Layer 3– Layer 4– Layer 7

• SLB Algorithm– Round Robin– Least Connections– Server Response Time– Server Load– Hashing

• SLB Forwarding– Session Tables– Timers

Page 11: Title Subtitle

GSLB - 11

LSLB Forwarding

• LSNAT

• DSR

• Tunneling

Page 12: Title Subtitle

GSLB - 12

LSNAT

Router

LB

S1 S2 S3

src/dst

Layer Ingress

Client_PortS1_Portdst

Client_IPS1_IPdst

LB_MACS1_MACdst

Client_PortVirtual_Portdst

Client_IPVirtual_IPdst

dst Router_MACVirtual_MAC

Client_Port

Client_IP

LB_MAC

Client_Port

Client_IP

Router_MAC

S1_IPsrcL3

src

src

src

src

src

Virtual_IPL3

S1_PortL4

Virtual_PortL4

S1_MACL2

Y

Virtual_MACL2

X

EgressSegment

X

Y

Page 13: Title Subtitle

GSLB - 13

LSNAT + Source NAT

Router

LB

S1 S2 S3

src/dst

Layer Ingress

LB_V_PortS1_Portdst

LB_V_IPS1_IPdst

LB_V_MACS1_MACdst

Client_PortVirtual_Portdst

Client_IPVirtual_IPdst

dst Router_MACVirtual_MAC

LB_V_Port

LB_V_IP

LB_V_MAC

Client_Port

Client_IP

Router_MAC

S1_IPsrcL3

src

src

src

src

src

Virtual_IPL3

S1_PortL4

Virtual_PortL4

S1_MACL2

Y

Virtual_MACL2

X

EgressSegment

X

Y

Page 14: Title Subtitle

GSLB - 14

DSR

Router

LB

S1 S2 S3 Virtual_Port

Client_Port

Virtual_IP

Client_IP

S1_MAC

Virtual_MAC

2

Client_Port

Virtual_Port

Client_IP

Virtual_IP

Router_MAC

S1_MAC

3src/dst

Layer 1

Virtual_Portdst

Virtual_IPdst

dst Virtual_MAC

Client_Port

Client_IP

Router_MAC

src

src

src

L3

L4

L21

23

Page 15: Title Subtitle

GSLB - 15

Tunneling

Router

LB

S1 S2 S3

Int: V_IP

Int: C_IP

V_Port

C_Port

Ext: S1_IP

Ext: LB_IP

S1_MAC

LB_MAC

2

C_Port

V_Port

C_IP

V_IP

R_MAC

S1_MAC

3src/dst

Layer 1

V_Portdst

V_IPdst

dst V_MAC

C_Port

C_IP

R_MAC

src

src

src

L3

L4

L21

23

Page 16: Title Subtitle

GSLB - 16

3. GSLB

• DNS Based

• HRI

• TDF

• Latest Trends

Page 17: Title Subtitle

GSLB - 17

3.1 DNS Based

GSLB = Name VS (DNS+)

• Smart DNS– Load and availability awareness Load Report Protocol (LRP)– Proximity and congestion awareness Proximity Report Protocol

(PRP)

• LB DNS Functionality– DNS Server– DNS Proxy

• Caching– DNS Traffic Intercept

Page 18: Title Subtitle

GSLB - 18

LPRP• Transport

– UDP– TCP– HTTP

• Operation– Periodic Updates– Periodic Requests– Triggered Updates

IDC1

LB

IDC2

LB

IDC3

LB

Page 19: Title Subtitle

GSLB - 19

PRP

• RTT

• Effective bandwidth

• Number of hops

• Number of AS hops

• IGP metric

Proximity to the client LDNS, not to the client

Page 20: Title Subtitle

GSLB - 20

LRP

• VS Health– Up– Down– Backup only

• VS Load– Number of sessions– Response Time

• LB Load– Number of sessions– Capacity threshold– CPU

• RS/Content Load

• Network Load– bps– pps

• QoS

• Security

Page 21: Title Subtitle

GSLB - 21

How it works

IDC1

IDC2

LB

IDC3

LB

CustomerLDNS

ADNS

Client

RDNS

1

2 3

455

6

6

6

Page 22: Title Subtitle

GSLB - 22

How it works

IDC1

IDC2

LB

IDC3

LB

CustomerLDNS

ADNS

Client

RDNS

7

7

810

119

Page 23: Title Subtitle

GSLB - 23

Analysis

Pros

• Accurate load info

• Accurate proximity info

• Perfect solution… in some cases and if certain conditions are met

Cons• DNS – wrong target

• Proximity between client and its LDNS

• Caching– LB– LDNS– Application

• Complexity

• Hard to find optimal values for various timers (TTL, cache timeouts, etc.) and prefix lengths

Page 24: Title Subtitle

GSLB - 24

3.2 HRI

GSLB = Routing+

• To what?– BGP– IGP

• By what?– RS– Router– LB

Page 25: Title Subtitle

GSLB - 25

To what

• IGP?

• BGP– Route filtering (both ways)– No ECMP

Client

IDC1

IDC2

Router

Page 26: Title Subtitle

GSLB - 26

By what

RS

IDC1

Router

RS

BGP

IDC2

Router

RS

BGP

Page 27: Title Subtitle

GSLB - 27

By what

Router

IDC1

Router

RS

IDC2

Router

RS RS

LB

Page 28: Title Subtitle

GSLB - 28

By what

LB

IDC2

Router

RS RS

LB

IDC1

Router

RS RS

LB

BGP BGP

Page 29: Title Subtitle

GSLB - 29

Analysis

Pros

• Simplicity

• No new protocols are needed

• Proximity is handled by routing

• Load handling?

Cons

• Single backbone*– Its own– Single ISP

• Too many routes

• Less accurate load and proximity info– Only local load– Optimal routing?

• Route flapping*

Page 30: Title Subtitle

GSLB - 30

3.3 TDF

GSLB = X + TDF

• NAT Based

• Tunneling

Client

IDC1, “wrong”

IDC2,“right”

Page 31: Title Subtitle

GSLB - 31

Why “wrong” IDC?

• Failure of, disabled or non-implemented LPRP

• Cached DNS records

• Other retardation effects (LPRP, BGP)

Page 32: Title Subtitle

GSLB - 32

NAT Based

Client

IDC1, “wrong”

V1.1; V1.2

IDC2,“right”

3

21

1

V1.1

C

CV2.2dst

V1.1CsrcL3

32

V2.1; V2.2

Page 33: Title Subtitle

GSLB - 33

“Remote Servers”

Client

IDC1, “wrong”

V1.1

IDC2,“right”

21

C

V1.1

41

V1.1

C

V1.1V2.1dst

V2.1V1.1srcL3

32

V2.1

3

4

Page 34: Title Subtitle

GSLB - 34

Tunneling

Next section

Page 35: Title Subtitle

GSLB - 35

Analysis

Pros

• Fixes errors optimally

Cons

•ip verify reverse-path

Client

IDC1, “wrong”

IDC2,“right”

Router

Router

Page 36: Title Subtitle

GSLB - 36

Analysis

Pros

• Fixes errors optimally

Cons

•ip verify reverse-path

Client

IDC1, “wrong”

IDC2,“right”

Router

Router

Page 37: Title Subtitle

GSLB - 37

3.4 Latest Trends, Radicalism

• Internet infiltration

• Going to the client edge

• Going to the client

• Modifying the client

• LB presence in strategic locations (HydraGPS, Speedera)

• LDNS modifications (Speedera)

• Application modifications (SRV RRs)

Page 38: Title Subtitle

GSLB - 38

Internet Infiltrations

IDC2

LB

IDC1

LB

Customer

LB

LB

LB

ClientLB

LB

LB

Page 39: Title Subtitle

GSLB - 39

Internet Infiltrations

IDC2

LB

IDC1

LB

Customer

LB

LB

LB

Client

LB

LB

Page 40: Title Subtitle

GSLB - 40

LDNS modifications in CDNs

IDC2

LB

IDC1

LB

CustomerLDNSClient

ASP Backbone

Page 41: Title Subtitle

GSLB - 41

4. Virtual Block Injection (VBI)

• Inject not VS host routes, but blocks of GSLB’ed VSs IDC (LB) failures are handled by the routing protocol

• Use tunneling TDF in case of individual VS failure

Page 42: Title Subtitle

GSLB - 42

How it works

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

Page 43: Title Subtitle

GSLB - 43

How it works

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3

Client

Page 44: Title Subtitle

GSLB - 44

How it works

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

Page 45: Title Subtitle

GSLB - 45

Testing

Needed

• LB

• BGP

• Tunnels

Linux

• Linux Virtual Server (LVS,Wensong Zhang,Julian Anastasov)

• Zebra

• Tunnels

Page 46: Title Subtitle

GSLB - 46

Test Network

Page 47: Title Subtitle

GSLB - 47

Analysis

Pros

• All of HRI, plus

• No host route injection

• Working TDF

• Perfect VS health handling

• VS load LRP

• Obvious simplifications in more “ideal” cases

Cons

• LB load stop advertisement?

• BGP – proximity tool?

• Discontinuous AS?

• Route flapping!

Page 48: Title Subtitle

GSLB - 48

Route Flapping

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

RouterUDPTCP

Page 49: Title Subtitle

GSLB - 49

Solution for UDPSession table entry exchange for long sessions

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

Router

Page 50: Title Subtitle

GSLB - 50

Solution for UDPSession table entry exchange for long sessions

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

Router

Page 51: Title Subtitle

GSLB - 51

Solution for TCPIf LB receives packet

• Destined to a VS

• No SYN

• No session table entry

• Not via the tunnels

Forward via all the tunnels

ISP1 ISP2

IDC1, R1/20 IDC2, R2/20

AS1 AS2

V/20, AS3V/20, AS3

Client

Router

Page 52: Title Subtitle

GSLB - 52

5. Applicability Considerations

GSLB of

• Small number of VSs (or RSs) – by an ISP*– by its customer

• Big number of VSs (between IDCs)– CASP = ISP– CASP ≠ ISP

• CASP has its own backbone- CASP does not have control over customer access- CASP has control over customer access**

• CASP does not have its own backbone- CASP is multihomed to the same ISP- CASP is multihomed to different ISPs*

Page 53: Title Subtitle

GSLB - 53

6. Conclusions

• No ideal GSLB method

• For some “ideal” network scenarios, there are some “ideal” solutions

• For realistic network scenarios, there are rapidly improving realistic solutions

• Good competition

• Lack of comparative testing in the production-like environment