Towards Efficient Load Balancing in Structured P2P Systems Yingwu Zhu, Yiming Hu University of Cincinnati


Page 1:

Towards Efficient Load Balancing in Structured P2P Systems

Yingwu Zhu, Yiming Hu

University of Cincinnati

Page 2:

Outline

• Motivation and Preliminaries

• Load balancing scheme

• Evaluation

Page 3:

Why Load Balancing?

• Structured P2P systems, e.g., Chord, Pastry:

– Object IDs and node IDs are produced by a uniform hash function.

– This results in an O(log N) load imbalance in the number of objects stored at each node.

• Skewed distribution of node capacity:

– Nodes should carry loads proportional to their capacities.

• Other problems: different object sizes, non-uniform distribution of object IDs.

Page 4:

Virtual Servers (VS)

• First introduced in Chord/CFS.

• A VS is responsible for a contiguous region of the ID space.

• A node can host multiple VSs.

[Figure: Chord ring with Node A, Node B, and Node C.]

Page 5:

Virtual Server Reassignment

• Virtual server is the basic unit of load movement, allowing load to be transferred between nodes.

• L – Load, T – Target Load.

[Figure: Chord ring with Node A, Node B, and Node C. Node loads: L=45, L=41, L=3; one node is marked Heavy. Target loads: T=50, T=35, T=15. Virtual server loads: 30, 20, 11, 3, 10, 15.]

Page 7:

Virtual Server Reassignment

• Virtual server is the basic unit of load movement, allowing load to be transferred between nodes.

• L – Load, T – Target Load.

[Figure: Chord ring with Node A, Node B, and Node C after reassignment. Target loads: T=50, T=15, T=35. Load labels: L=45, L=31, L=14, L=30. Virtual server loads: 30, 20, 11, 3, 10, 15.]
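
The reassignment step can be sketched in a few lines of Python. The slides do not say which virtual servers sit on which node, so the grouping below is an assumption chosen only so that the node totals match the figure labels (45, 41, 3 before; 45, 30, 14 after). The names `Node`, `move_virtual_server`, and the node names X, Y, Z are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    target: int                                   # T: target load
    vs_loads: list = field(default_factory=list)  # one entry per hosted VS

    @property
    def load(self) -> int:
        """L: the node's load is the sum of its virtual servers' loads."""
        return sum(self.vs_loads)

    def is_heavy(self) -> bool:
        return self.load > self.target


def move_virtual_server(src: Node, dst: Node, vs_load: int) -> None:
    """Move one whole virtual server (identified here by its load)."""
    src.vs_loads.remove(vs_load)
    dst.vs_loads.append(vs_load)


# Assumed grouping (not stated in the slides): 30+15=45, 20+11+10=41, 3.
x = Node("X", target=50, vs_loads=[30, 15])
y = Node("Y", target=35, vs_loads=[20, 11, 10])
z = Node("Z", target=15, vs_loads=[3])

assert y.is_heavy()            # 41 > 35
move_virtual_server(y, z, 11)  # transfer the VS with load 11
print([(n.name, n.load, n.is_heavy()) for n in (x, y, z)])
```

Under this assumed grouping, moving the single virtual server with load 11 brings every node at or below its target without splitting any virtual server.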

Page 8:

Advantages of Virtual Servers

• Flexible: load is moved in units of a virtual server.

• Simple:

– VS movement is supported by all structured P2P systems.

– Simulated by a leave operation followed by a join operation.

Page 9:

Current Load Balancing Solutions

• Some use the concept of virtual servers.

• However:

– Either they ignore the heterogeneity of node capabilities.

– Or they transfer loads without considering proximity relationships between nodes.

– Or both.

Page 10:

Goals

• Goals:

– Keep each node’s load below its target load (the maximum load a node is willing to take).

– High-capacity nodes take more load.

– Perform load balancing in a proximity-aware manner, to minimize the overhead of load movement (bandwidth usage) and allow faster, more efficient load balancing.

• Load: depends on the particular P2P system.

– E.g., storage, network bandwidth, and CPU cycles.

Page 11:

Assumptions

• Nodes in system are cooperative.

• Only one bottlenecked resource, e.g., storage or network bandwidth.

• The load of each virtual server is stable over the timescale when load balancing is performed.

Page 12:

Overview of Design

• Step 1: Load balancing information (LBI) aggregation, e.g., load and capacity info.

• Step 2: Node classification, e.g., heavy nodes, light nodes, neutral nodes.

• Step 3: Virtual server assignment (VSA).

• Step 4: Virtual server transferring (VST).

• Proximity-aware load balancing:

– VSA is proximity-aware.

Page 13:

LBI Aggregation and Node Classification

• Rely on a fully decentralized, self-repairing, and fault-tolerant K-nary tree built on top of a DHT (distributed hash table).

• Each K-nary tree node is planted in a DHT node.

• <L, C, Lmin> represents the load, the capacity, and the minimum load of virtual servers, respectively.

[Figure: bottom-up LBI aggregation in the K-nary tree, leaves first, root last.]

<12,10,2> <15,8,3> <20,10,5> <15,20,4>

<27,18,2> <35,30,4>

<62,48,2>
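
The tuples in the figure are consistent with a simple aggregation rule: sum the loads, sum the capacities, and take the minimum of the Lmin values. The sketch below assumes exactly that rule (it is inferred from the numbers, not stated in the slides) and uses a hypothetical `LBI` tuple type and `aggregate` helper.

```python
from typing import NamedTuple, Sequence


class LBI(NamedTuple):
    load: float         # L
    capacity: float     # C
    min_vs_load: float  # Lmin


def aggregate(children: Sequence[LBI]) -> LBI:
    """Combine the children's <L, C, Lmin> tuples at a K-nary tree node."""
    return LBI(
        load=sum(c.load for c in children),
        capacity=sum(c.capacity for c in children),
        min_vs_load=min(c.min_vs_load for c in children),
    )


# Leaf tuples from the figure, aggregated through a tree with K = 2.
leaves = [LBI(12, 10, 2), LBI(15, 8, 3), LBI(20, 10, 5), LBI(15, 20, 4)]
left = aggregate(leaves[:2])
right = aggregate(leaves[2:])
root = aggregate([left, right])
print(left, right, root)
```

With the figure's leaves this reproduces the intermediate tuples <27,18,2> and <35,30,4> and the root tuple <62,48,2>.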

Page 14:

LBI Aggregation and Node Classification

• Relying on a fully decentralized, self-repairing, and fault-tolerant K-nary tree built on top of a DHT.

• Each K-nary tree node is planted in a DHT node.

• <L, C, Lmin> represents the load, the capacity, and the minimum load of virtual servers.

[Figure: the root tuple <62,48,2> is disseminated down the tree to every node. The leaves <12,10,2>, <15,8,3>, <20,10,5>, <15,20,4> are classified Light, Heavy, Heavy, Light.]

• Target load of node i: $T_i = (L/C + \epsilon) \cdot C_i$
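
A minimal sketch of the classification step, assuming the term after the plus sign in the target-load formula is a small slack parameter (called `epsilon` here and set to 0 for the example). It only separates heavy nodes from the rest, because the slides do not give the threshold that distinguishes neutral nodes; `target_load` and `classify` are hypothetical names.

```python
def target_load(total_load, total_capacity, capacity, epsilon=0.0):
    """T_i = (L/C + epsilon) * C_i, using the system-wide L and C."""
    return (total_load / total_capacity + epsilon) * capacity


def classify(load, capacity, total_load, total_capacity, epsilon=0.0):
    """Return 'heavy' if the node's load exceeds its target, else 'light'."""
    target = target_load(total_load, total_capacity, capacity, epsilon)
    return "heavy" if load > target else "light"


# (load, capacity) of the four leaves; system-wide totals L=62, C=48.
nodes = [(12, 10), (15, 8), (20, 10), (15, 20)]
labels = [classify(l, c, 62, 48) for l, c in nodes]
print(labels)  # ['light', 'heavy', 'heavy', 'light']
```

With `epsilon = 0` the targets are roughly 12.9, 10.3, 12.9, and 25.8, which gives the same Light, Heavy, Heavy, Light labeling as the figure.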

Page 15:

Virtual Server Assignment

[Figure: VSA over the K-nary tree. Leaf entries, left to right: H1 (V11, V12), L1 (C1), H2 (V21), H3 (V31, V32), Ln (Cn), Ln+1 (Cn+1), Hm (Vm1, Vm2), Hm+1 (Vm+1), … VSA information flows upward to rendezvous points, which pair entries using best-fit heuristics. Unpaired VSA information is passed up to the final rendezvous point. Annotations: "Logically close"; "VSA happens earlier between logically closer nodes".]
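
One possible reading of the "best-fit heuristics" at a rendezvous point is sketched below: each virtual server offered by a heavy node goes to the light node whose spare capacity fits it most tightly, and anything that cannot be placed is returned so it can be forwarded upward. The largest-first ordering, the tuple layouts, and the name `best_fit_pairing` are assumptions for illustration, not details taken from the slides.

```python
def best_fit_pairing(vs_offers, spare):
    """Pair virtual servers with light nodes at one rendezvous point.

    vs_offers: list of (heavy_node, vs_id, load)
    spare:     dict mapping light_node -> spare capacity
    Returns (assignments, unpaired_offers, remaining_spare).
    """
    remaining = dict(spare)
    assignments, unpaired = [], []
    # Assumed ordering: place the largest virtual servers first.
    for heavy, vs, load in sorted(vs_offers, key=lambda t: t[2], reverse=True):
        fits = [(cap, node) for node, cap in remaining.items() if cap >= load]
        if not fits:
            unpaired.append((heavy, vs, load))  # forwarded up the tree
            continue
        cap, node = min(fits)                   # tightest fit wins
        assignments.append((vs, heavy, node))
        remaining[node] = cap - load
    return assignments, unpaired, remaining


offers = [("H1", "V11", 11), ("H1", "V12", 4), ("H2", "V21", 20)]
spare = {"L1": 12, "L2": 5}
assignments, unpaired, remaining = best_fit_pairing(offers, spare)
print(assignments, unpaired, remaining)
```

In this made-up example V11 and V12 are placed on L1 and L2, while V21 fits nowhere and would travel to the next rendezvous point as unpaired VSA information.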

Page 16:

Virtual Server Assignment

• DHT identifier space-based VSA:

– VSA happens earlier between logically closer nodes.

– Proximity-ignorant, because nodes that are logically close in the DHT are NOT necessarily physically close.

[Figure: heavy nodes H1, H2 and light nodes L1, L2, L3, L4, with virtual servers V1, V2, V3. Nodes in the same color are physically close to each other. H – heavy nodes, L – light nodes, Vi – virtual servers.]

Page 17:

Proximity-Aware VSA

• Nodes in the same color are physically close to each other.

• H – heavy node, L – light node, Vi – virtual server.

• VSs are assigned between physically close nodes.

[Figure: heavy nodes H1, H2 and light nodes L1, L2, L3, L4, with virtual servers V1, V2, V3.]

Page 18:

Proximity-Aware VSA

• Use landmark clustering to generate proximity information, e.g., landmark vectors.

• Use space-filling curves (e.g., the Hilbert curve): landmark vectors → Hilbert numbers, used as DHT keys.

• Heavy nodes and light nodes each put (map) their VSA info into the underlying DHT under the resulting DHT keys, aligning physical closeness with logical closeness.

• Each virtual server independently reports the VSA info that is mapped into its responsible region, rather than its own node’s VSA info.
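
To illustrate the landmark-vector-to-key mapping, here is a sketch restricted to two landmarks, so that the standard 2-D Hilbert curve index can be used. The grid size `n`, the latency bound `max_ms`, and the function names are assumptions for illustration; real landmark vectors usually have more dimensions and would need a higher-dimensional curve.

```python
def hilbert_index(n, x, y):
    """Distance along the 2-D Hilbert curve of cell (x, y) in an n x n grid.

    n must be a power of two. This is the standard iterative xy-to-d mapping.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate/flip the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def landmark_vector_to_key(vector_ms, max_ms=256.0, n=16):
    """Quantize a 2-landmark latency vector onto the grid, then index it."""
    x, y = (min(n - 1, int(v / max_ms * n)) for v in vector_ms)
    return hilbert_index(n, x, y)


# Nodes with similar latencies to both landmarks get nearby keys.
print(landmark_vector_to_key((10.0, 20.0)), landmark_vector_to_key((12.0, 35.0)))
```

Because consecutive Hilbert numbers always belong to adjacent grid cells, publishing VSA information under these keys tends to bring physically close nodes together in the identifier space.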

Page 19:

Proximity-Aware Virtual Server Assignment

[Figure: VSA over the K-nary tree. Leaf entries, left to right: H1 (V11, V12), L1 (C1), H2 (V21), H3 (V31, V32), Ln (Cn), Ln+1 (Cn+1), Hm (Vm1, Vm2), Hm+1 (Vm+1), … VSA information flows upward to rendezvous points, which pair entries using best-fit heuristics. Unpaired VSA information is passed up to the final rendezvous point. Annotations: "Physically close"; "VSA happens earlier between physically closer nodes".]

Page 20:

Experimental Setup

• A K-nary tree built on top of a DHT (Chord), with K = 2 and K = 8.

• Two node capacity distributions:

– Gnutella-like capacity profile, 5-level capacities.

– Zipf-like capacity profile.

• Two load distributions of virtual servers:

– Gaussian distribution and Pareto distribution.

• Two transit-stub topologies (5,000 nodes):

– “ts5k-large” and “ts5k-small”.

Page 21:

High Capacity Nodes Carry More Loads

Gaussian load distribution + Gnutella-like capacity profile

Page 22:

High Capacity Nodes Carry More Loads

Pareto load distribution + Zipf-like capacity profile

Page 23:

Proximity-Aware Load Balancing

[Figure: CDF of moved load distribution in ts5k-large. Panels: Gaussian load distribution and Gnutella-like capacity profile; Pareto load distribution and Zipf-like capacity profile.]

More load is moved within shorter distances by proximity-aware load balancing.

Page 24:

Benefit of Proximity-Aware Scheme

• Load movement cost:

LM(d) denotes the load moved over a distance of d hops.

• Benefit:

• Results:

– For ts5k-large: B = 37-65%

– For ts5k-small: B = 11-20%

Page 25:

Other Results

• Quantify the overhead of K-nary tree construction:

– Link stress, node stress.

• The latencies of LBI aggregation and VSA are bounded by O(log N) time.

• The effect of the pairing threshold at rendezvous points.

Page 26:

Conclusions

• Current load balancing approaches using virtual servers have limitations:

– Either they ignore node capacity heterogeneity.

– Or they transfer loads without considering proximity relationships between nodes.

– Or both.

• Our solution:

– A fully decentralized, self-repairing, and fault-tolerant K-nary tree is built on top of DHTs for performing load balancing.

– Nodes carry loads in proportion to their capacities.

– The first work to address the load balancing issue in a proximity-aware manner, thereby minimizing the overhead of load movement and allowing more efficient load balancing.

Page 27:

Questions?