Distributed Hash-based Lookup for Peer-to-Peer Systems

Preview:

Citation preview

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Distributed Hash-based Lookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

March 10, 2006

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Agenda

I Peer-to-Peer Sytems

I Initial approaches to Peer-to-Peer Systems

I Their Limitations

I CAN - Content Addressable Network

I Applications of CAN

I Design Details

I Benefits

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Peer-to-Peer Systems

I Distributed and a decentralized architecture

I No centralized Server (unlike Client-ServerArchitecture)

I Any participating host can behave as the server

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Napster

I P2P file sharing system

I Central Server stores the index of all the filesavailable on the network

I To retrieve a file, central server contacted toobtain location of desired file

I Not completely decentralized system

I Central directory not scalable

I Single point of failure

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Napster

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Napster

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Napster

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Napster

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Gnutella

I P2P file sharing system

I No Central Server store to index the files availableon the network

I File location process decentralized as well

I Requests for files are flooded on the network

I No Single point of failure

I Flooding on every request not scalable

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Gnutella

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Content Addressable Network (CAN)

I In any P2P system, File transfer process isinherently scalable

I However, the indexing scheme which maps filenames to location crucial for scalablility

I Content Addressable Network (CAN) provides ascalable indexing mechanism

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Applications of CAN

DNSI Conventional DNS lookup system

I Requires special root serversI Closely ties naming scheme to the manner in which a

name is resolved to an IP address

I DNS lookup system using hash based lookupI CAN can provide DNS lookup system without any special

serversI naming scheme independent of the name resolution

process

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Applications of CAN (..continued)

File Systems for P2P systems

I The file system would store files and their metadata across nodes in the P2P network

I The nodes containing blocks of files could belocated using hash based lookup

I The blocks would then be fetched from thosenodes

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

What is CAN?

I CAN is a distributed infrastructure that provideshash table like functionality

I CAN is composed of many individual nodes

I Each CAN node stores a chunk(zone) of theentire hash table

I Request for a particular key is routed byintermediate CAN nodes whose zone containsthat key

I The design can be implemented in applicationlevel (no changes to kernel required)

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Design of CAN

I Involves a virtual d-dimensional CartesianCo-ordinate space

I The co-ordinate space is completely logical

I The co-ordinate space is partitioned into zonesamong all nodes in the system

I Every node in the system owns a distinct zone

I The distribution of zones into nodes forms anoverlay network

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Co-ordinate space in CAN

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Design of CAN (..continued)

I To store (Key,value) pairs, keys are mappeddeterministically onto a point P in co-ordinatespace using a hash function

I The (Key,value) pair is then stored at the nodewhich owns the zone containing P

I To retrieve an entry corresponding to Key K, thesame hash function is applied to map K to thepoint P

I The retrieval request is routed to node owningzone containing P

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Routing in CAN

I Every CAN node holds IP address and virtualco-ordinates of each of it’s neighbours

I Every message to be routed holds the destinationco-ordinates

I Using it’s neighbour’s co-ordinate set, a noderoutes a message towards the neighbour withco-ordinates closest to the destinationco-ordinates

I Even if one of the neighbours fails, messages canbe routed through other neighbours in thatdirection

I How close the message gets closer to thedestination after being routed to one of theneighbours is called Progress

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Routing in CAN (..continued)

I For a d-dimensional space partitioned into n equalzones, routing path length = O(d.n1/d) hops

I Every node has 2d neighbours

I With increase in no. of nodes, per node statedoes not change

I With increase in no. of nodes, routing pathlength grows as O(n1/d)

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Allocation of a new node to a zone

I The new node randomly choses a point P in theco-ordinate space

I It sends a JOIN request to point P via anyexisting CAN node

I The request is forwarded using CAN routingmechanism to the node D owning the zonecontaining P

I D then splits it’s node into half and assigns onehalf to new node

I The new neighbour information is determined forboth the nodes

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Before a node joins CAN

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

After a node joins CAN

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Failure of a node

I If a node leaves CAN, the zone it occupies istaken over by the remaining nodes

I If a node leaves voluntarily, it can handover it’sdatabase to some other node

I When a node simply becomes unreachable, thedatabase of the failed node is lost

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Design Improvements

I Multidimensional co-ordinate spaces

I Multiple co-ordinate spaces

I Better routing metrics

I Overloading co-ordinate zones

I Multiple hash functions

I Topologically sensitive construction of overlaynetwork

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Multidimensional co-ordinate spaces

I Increase the dimensions of the CAN co-ordinatespace

I This reduces routing path length

I As a result, path latency reduces

I Fault tolerance increases

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Multidimensional co-ordinate spaces

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Multiple co-ordinate spaces

I Multiple, independent co-ordinate spaces(Reality) can be maintained

I Each node assigned a different zone in eachco-ordinate space

I Contents of hash table replicated on each reality

I With r realities, data is unavailable only when thenode corresponding to it in each of the r realitiesis unavailable

I Thus, improved fault tolerance

I During routing, a node forwards the message tothe neighbour closest to the destination (oneither reality)

I Thus, reduced path length

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Multiple co-ordinate spaces

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Better routing metrics

I Instead of using only progress made towards thedestination, use ratio of progress to RTT as ametric

I Forward to neighbour with maximum ratio ofprogress to RTT

I This aims at reducing the latency of individualhops

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Overloading co-ordinate zones

I Allow more than 1 node to share a zone

I The contents of the hash table can be replicatedamong the nodes in the same zone to increaseavailability

I Overloading AdvantagesI Reduced number of hopsI Reduces per hop latency, because node can select closest

node from among those in the neighbouring zonesI Improved fault tolerance - Zone vacant only when all

nodes in it fail

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Multiple hash functions

I Map each key onto nodes using k different hashfunctions

I Replicate (Key,Value) pair at k different nodes

I Queries for a particular key can then be sent toall k nodes parallelly

I However, size of (key,value) database increases

I Also, Query traffic increases by a factor of k(Improvement possible)

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Topologically sensitive construction of overlaynetwork

I Earlier design approaches try to reduce pathlatency for a given overlay network

I However, they do not improve the overlaynetwork itself

I In an overlay network, neighbouring nodes can beactually far enough

I Also, topologically close nodes can be far enoughin the co-ordinate space

I Hence, need to locate topologically close nodes innearby zones of the co-ordinate space

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Topologically sensitive construction of overlaynetwork ..continued

I Locate m well known machines on the network

I Measure RTT of every CAN node from these mmachines

I Group nodes into zones based on their relativeordering from the m machines

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Other Design Improvements

I Uniform Partitioning

I Caching

I Replication

Distributed Hash-basedLookup

for Peer-to-Peer Systems

Sandeep Shelke 05305402Shrirang Shirodkar 05305007

MTech I CSE

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Benefits of CAN

I No centralized control/coordination

I Control information/state stored at nodesindependent of number of nodes in the system

I Improved fault tolerance

I Scalable

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Chord: A Scalable Peer-to-Peer LookupProtocol for Internet Applications

March 10, 2006

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Features

I Addresses a fundamental problem in Peer-to-PeerI Efficient location of the node that stores desired data itemI One operation: Given a key, maps it onto a nodeI Data location by associating a key with each data item

I Adapts EfficientlyI Dynamic with frequent node arrivals and departuresI Automatically adjusts internal tables to ensure availability

I Uses Consistent HashingI Load balancing in assigning keys to nodesI Little movement of keys when nodes join and leave

I Efficient RoutingI Distributed routing tableI Maintains information about only O(logN) nodesI Resolves lookups via O(logN) messages

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Features (continued)

I ScalableI Communication cost and state maintained at each node

scales logarithmically with number of nodes

I Flexible NamingI Flat key-space gives applications flexibility to map their

own names to Chord keys

I DecentralizedI Fully distributedI No “super” nodes

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Example Applications

I Cooperative MirroringI Multiple content providers spreading the load evenlyI Low cost since each participant provides for average load

I Time-shared StorageI Nodes with intermittent connectivityI Data stored elsewhere when node is disconnectedI Data’s name serves as key to identify node

I Distributed IndexesI Keyword search like Napster

I Large-scale Combinatorial SearchI Code breakingI Keys are candidate solutions such as cryptographic keysI Keys mapped to testing machines

I Chord File System (CFS)I Stores files and meta-data in a peer-to-peer systemI Good lookup performance despite dynamic system

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

The Chord Protocol

I Consistent Hashing

I Simple Key Location

I Scalable Key Location

I Node Joins and Stabilization

I Failure and Replication

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Some Terminology

I KeyI Hash key or its image under hash function, as per contextI m-bit identifier, using SHA-1 as a base hash function

I NodeI Actual node or its identifier under the hash functionI Length m such that low probability of a hash conflict

I Chord RingI The identifier circle for ordering of 2m node identifiers

I Successor NodeI First node whose identifier is equal to or follows key k in

the identifier space

I Virtual NodeI Introduced to limit the bound on keys per node to K/NI Each real node runs Ω(logN) virtual nodes with its own

identifier

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Example: Chord Ring

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Consistent Hashing

I A consistent hash function is one which changesminimally with changes in the range of keys and atotal remapping is not required

I Desirable propertiesI High probability that the hash function balances loadI Minimum disruption, only O(1/N) of the keys moved

when a nodes joins or leavesI Every node need not know about every other node, but a

small amount of “routing” information

I m-bit identifier for each node and key

I Key k assigned to Successor Node

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Simple Key Location

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Example

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Scalable Key Location

I Accelerates loopupsI Maintains additional routing informationI Not essential for correctness

I Finger tableI Routing table containing upto m entries, where m is

number of bits in the key/node identifierI i th entry at node n contains identity of the first node s

that succeeds n by at least 2i−1 on the identifier circleI First finger referred to as successor

I Each node stores information only about smallnumber of other nodes and more about nodesclosely following

I The number of nodes that must be contacted tofind successor in an N-node network is O(logN)

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Pseudocode

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Example

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Stabilization

I Ensures that lookups execute correctly, as the setof participating nodes changes

I Runs periodically in the background and updatesChord’s finger tables and successor pointers

I If any sequence of join operations is executedinterleaved with stabilizations, then at some timeafter the last join the successor pointers will froma cycle on all the nodes

I In practiceI Chord ring will never be in a stable stateI No time to stabilizeI “almost stable”

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Pseudocode

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Pseudocode (continued)

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Impact of Node Joins on Lookups: Correctness

I For a lookup before stabilization has finished,

I Case 1: All finger table entries inolved in thelookup are reasonably current, and lookup findscorrect successor in O(logN) steps

I Case 2: Successor pointers are correct, but fingerpointers are inaccurate. Yields correct lookupsbut may be slower

I Case 3: Incorrect successor pointers or keys notmigrated yet to newly joined nodes. Lookup mayfail

I Option of retrying after a quick pause, duringwhich stabilization fixes successor pointers

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Impact of Node Joins on Lookups: Performance

I After stabilization, no effect other than increasingthe value of N in O(logN)

I Before stabilization is complete

I Possibly incorrect finger table entries

I Does not significantly affect lookup speed, sincedistance halving property depends only onID-space distance

I If new nodes’ IDs are between the targetpredecessor and the target, then lookup speed isinfluenced

I Still takes O(logN) time for N new nodes

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Handling Failures

I Maintain successor list of size r , containing thenode’s first r successors

I If immediate successor does not respond,substitute the next entry in the successor list

I Modified version of stabilize protocol to maintainthe successor list

I Modified closest preceding node to search notonly finger table but also successor list for mostimmediate predecessor

I Voluntary Node DeparturesI Transfer keys to successor before departureI Notify predecessor p and successor s before leaving

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Simulation

I Implements Iterative StyleI Node resolving a lookup initiates all communication unlike

Recursive Style, where intermediate nodes forward request

I OptimizationsI During stabilization, a node updates its immediate

successor and 1 other entry in successor list or finger tableI Each entry out of k unique entries gets refreshed once in

’k’ stabilization roundsI Size of successor list is 1I Immediate notification of predecessor change to old

predecessor, without waiting for next stabilization round

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Load Balance

I Test ability of consistent hashing, to allocate keysto nodes evenly

I Number of keys per node exhibits large variations,that increase linearly with the number of keys

I Association of keys with Virtual NodesI Makes the number of keys per node more uniformI Significantly improves load balanceI Asymptotic value of query path length not affected muchI Total identifier space covered remains same on averageI Worst-case number of queries does not changeI Not much increase in routing state maintainedI Asymptotic number of control messages not affected

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

In Absence of Virtual Nodes

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

In Presence of Virtual Nodes

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Path Length

I Number of nodes that must be visited to resolvea query, measured as the query path length

I As per theorem, this number is O(logN), whereN is the number of nodes

I Let N = 2k nodes, vary k from 3 to 14

I Each node picks random set of keys to queryI Observed Results

I Mean query path length increases logarithmically withnumber of nodes

I Same as expected average query path length

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Graph

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Future Work

I Resilience against Network PartitionsI Detect and heal partitionsI For every node have a set of initial nodesI Maitain a long term memory of a random set of nodesI Likely to include nodes from other partition

I Handle Threats to Availability of dataI Malicious participants could present incorrect view of dataI Periodical Global Consistency Checks for each node

I Better EfficiencyI O(logN) messages per lookup too many for some appsI Increase the number of fingers

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Conclusion

I Decentralized Solution

I Efficient Lookup Mechanism

I Simplicity

I Provable Performance

I Provable Correctness

I A Powerful Primitive

I Applications

Chord: A ScalablePeer-to-Peer LookupProtocol for Internet

Applications

Peer-to-Peer Systems

Content Addressable Network(CAN)

Routing in CAN

CAN Details

Design Improvements in CAN

CAN: Benefits

Introduction

The Chord Protocol

Consistent Hashing

Simple Key Location

Scalable Key Location

Node Joins and Stabilization

Failure and Replication

Test Results

Simulation

Load Balance

Path Length

Future Work

Conclusion

Thank You !

Recommended