49
Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Embed Size (px)

Citation preview

Page 1: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

T-110.455 Network Application Frameworks and XML

Distributed Hash Tables (DHTs) and multi-homing

16.2.2005

Sasu Tarkoma

Based on slides by Pekka Nikander

Page 2: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Contents

Introduction Layered model, routing table vs. address

Multi-homing Overlay networks Summary

Page 3: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Introduction

A name determines the node / service An address gives its location

May change in mobility A path determines how to get there Routing determines the path from an

address Directories convert names to addresses

You can think of routing tables and directories as distributed databases

Page 4: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Layered-model revisited

Object API

Firewall bypass

End-to-end

Routing

Congestion control

Presentation DNS names

IP addresses

Routing paths

XML presentation

Finding, meta-data, invoking, syncing, mobility. Web Services

Page 5: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Topology in address vs. routing table

Reactive AD HOC (MANET)

routing

Pure source routing

ATM PNNI CIDR

Original IP routing

Proactive ad hoc

(MANET) routing

Host-based hop-by-hop

Page 6: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Identity/Locator split

Process

Transport

ID Layer

IP Layer

Link Layer

identifier

locator

New name space for IDs Maybe based on DNS Maybe a separate

namespace Maybe IP addresses are

used for location Good for hiding IP versions

Communication end-points (sockets) bound to identifiers

Page 7: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Multi-addressing

Multi-homing Server has multiple addresses on different

networks for increased reliability Client has multiple addresses

Multi-homing requires support for address change

Topology change can cause renumbering From old prefix to new prefix Changing the IP host addresses of each device

within the network

Related with multi-homing and must be supported by mobility protocols

Page 8: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Multi-homing

Mobility

Onehost

Single subnet

Parts oftopology

Allhosts

End-hostmultihoming

SoHo sitemultihoming

enterprisemultihoming

Movingmulti-homed

networks

ad hocnetworks

End-hostmobility

Moving networks(NEMO)

reliability, load-sharing

HIP, SCTP

Mobile IP, HIP, SCTP

Site multihoming challenge: prefixes

create additional stateMULTI6 in IETF for

IPv6 multihoming

Page 9: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

ISP1

ISP2

Multi-homedsite

NAT1

NAT2

Site multi-homing

Multi-homing Examples

Wireless Host

Internet

WLAN

GPRS

End-host multi-homing

Page 10: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Multi-homing

End-host multihoming Transport-level (SCTP)

Good when there is a connection Needs mobility management for boot-strapping

Locator / Identity split Host Identity Protocol Natural way to decouple topological

location/locations and identity Needs directories to solve bootstrapping problems

We need directories for coping with mobility and multi-homing Scalability, low latency updates

Page 11: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Overlays

Motivation Terminology Overlays

Clusters and wide-area Applications for DHTs

Internet Indirection Infrastructure (i3) Delegation Oriented Architecture (DOA)

Page 12: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Motivation for Overlays

Directories are needed Name resolution Mobility support

Fast updates

Required properties Fast updates Scalability

DNS has limitations Update latency Administration

One solution Overlay networks

Page 13: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Terminology

Peer-to-peer (P2P) Different from client-server model Each peer has both client/server features

Overlay networks Routing systems that run on top of another network,

such as the Internet. Distributed Hash Tables (DHT)

An algorithm for creating efficient distributed hash tables (lookup structures)

Used to implement overlay networks Typical features of P2P / overlays

Scalability, resilience, high availability, and they tolerate frequent peer connections and disconnections

Page 14: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Peer-to-peer in more detail

A P2P system is distributed No centralized control Nodes are symmetric in functionality

Large faction of nodes are unreliable Nodes come and go

P2P enabled by evolution in data communications and technology

Current challenges: Security, IPR issues

P2P systems are decentralized overlays

Page 15: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Evolution of P2P systems

Started from centralized servers Napster

Centralized directory Single point of failure

Second generation used flooding Gnutella

Local directory for each peer High cost, worst-case O(N) messages for lookup

Research systems use DHTs Chord, Tapestry, CAN, .. Decentralization, scalability

Page 16: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Challenges

Challenges in the wide-area Scalability Increasing number of users, requests, files,

traffic Resilience: more components -> more

failures Management: intermittent resource

availability -> complex management

Page 17: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Overlay Networks

Origin in Peer-to-Peer (P2P) Builds upon Distributed Hash Tables

(DHTs) Easy to deploy

No changes to routers or TCP/IP stack Typically on application layer

Overlay properties Resilience Fault-tolerance Scalability

Page 18: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Upper layers

Overlay

Congestion

End-to-end

Routing

DNS names, customidentifiers

Overlay addresses

IP addresses

Routing paths

Page 19: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Cluster vs. Wide-area

Clusters are single, secure, controlled, administrative

domains engineered to avoid network partitions low-latency, high-throughput SANs predictable behaviour, controlled environment

Wide-area Heterogeneous networks Unpredictable delays and packet drops Multiple administrative domains Network partitions possible

Page 20: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Wide-area requirements

Easy deployment Scalability to millions of nodes and

billions of data items Availability

Copes with routine faults Self-configuring, adaptive to network

changes Takes locality into account

Page 21: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Distributed Data Structures (DSS)

DHTs are an example of DSS DHT algorithms are available for clusters

and wide-area environments They are different!

Cluster-based solutions Ninja LH* and variants

Wide-area solutions Chord, Tapestry, .. Flat DHTs, peers are equal Maintain a subset of peers in a routing table

Page 22: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Distributed Data Structures (DSS)

Ninja project (UCB) New storage layer for cluster services Partition conventional data structure across

nodes in a cluster Replicate partitions with replica groups in

cluster Availability

Sync replicas to disk (durability) Other DSS for data / clusters

LH* Linear Hashing for Distributed Files Redundant versions for high-availability

Page 23: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Cluster-based Distributed Hash Tables (DHT)

Directory for non-hierarchical data Several different ways to implement Each “brick” maintains a partial map Overlay addresses used to direct to the

right “brick” Resilience through parallel, unrelated

mappings

Page 24: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

client client client client

service

DSS lib

service

DSS lib

storage“brick”

storage“brick”

storage“brick”

storage“brick”

storage“brick”

storage“brick”

SAN

Service interactswith DSS lib

Hash table API

Redundant, low latency, high throughput

network

Brick = single-node, durable

hash table,replicated

clients interact with anyservice

“front-end”

Page 25: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

DHT interfaces

DHTs offer typically two functions put(key, value) get(key) --> value delete(key)

Supports wide range of applications Similar interface to UDP/IP

Send(IP address, data) Receive(IP address) --> data

No restrictions are imposed on the semantics of values and keys

An arbitrary data blob can be hashed to a key Key/value pairs are persistent and global

Page 26: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Distributed applications

Distributed Hash Table (DHT)

Node Node Node Node

put(key, value) get(key) valueDHT balances keys and

data across nodes

Page 27: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Some DHT applications

File sharing Web caching Censor-resistant data storage Event notification Naming systems Query and indexing Communication primitives Backup storage Web archive

Page 28: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Chord

Chord is an overlay algorithm from MIT Stoica et. al., SIGCOMM 2001

Chord is a lookup structure (a directory) Resembles binary search

Uses consistent hashing to map keys to nodes Keys are hashed to m-bit identifiers Nodes have m-bit identifiers

IP-address is hashed SHA-1 is used as the baseline algorithm

Support for rapid joins and leaves Churn Maintains routing tables

Page 29: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Chord routing I

A node has a well determined place within the ring

Identifiers are ordered on an identifier circle modulo 2m

The Chord ring A node has a predecessor and a successor A node stores the keys between its

predecessor and itself The (key, value) is stored on the successor node of

key A routing table (finger table) keeps track of

other nodes

Page 30: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Finger Table

Each node maintains a routing table with at most m entries

The i:th entry of the table at node n contains the identity of the first node, s, that succeeds n by at least 2i-1 on the identifier circle

s = successor(n + 2i-1) The i:th finger of node n

Page 31: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

N1

N8

N14

N21

N32N38

N42

N51

N56

2m-1 0

+1+2

+4

+8

+16

+32

Finger Maps to Real node

1,2,3

4

5

6

x+1,x+2,x+4

x+8

x+16

x+32

N14

N21

N32

N42

m=6

for j=1,...,m thefingers of p+2j-1

Predecessor node

Page 32: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Chord routing II

Routing steps check whether k is found between n and the

successor of n if not, forward the request to the closest finger

preceding k Each know knows a lot about nearby

nodes and less about nodes farther away

The target node will be eventually found

Page 33: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Chord lookup

N1

N8

N14

N21

N32N38

N42

N51

N56

2m-1 0m=6

Page 34: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Chord Properties

Each node is responsible for K/N keys (K is the number of keys, N is the number of nodes)

When a node joins or leaves the network only O(K/N) keys will be relocated

Lookups take O(log N) messages To re-establish routing invariants after

join/leave O(log2 N) messages are needed

Page 35: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Tapestry DHT developed at UCB

Zhao et. al., UC Berkeley TR 2001 Used in OceanStore

Secure, wide-area storage service Tree-like geometry Suffix - based hypercube

160 bits identifiers Suffix routing from A to B

hop(h) shares suffix with B of length digits 1245 routes to 5432 via

5432 -> 3235 -> 1245 -> 6245 -> 1245

Tapestry Core API: publishObject(ObjectID,[serverID]) routeMsgToObject(ObjectID) routeMsgToNode(NodeID)

Page 36: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Pastry I

A DHT based on a circular flat identifier space

Prefix-routing Message is sent towards a node which is

numerically closest to the target node Procedure is repeated until the node is found Prefix match: number of identical digits before

the first differing digit Prefix match increases by every hop

Similar performance to Chord

Page 37: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Pastry II

Routing a message If leaf set has the prefix --> send to

local else send to the identifier in the

routing table with the longest common prefix (longer then the current node)

else query leaf set for a numerically closer node with the same prefix match as the current node

Used in event notification (Scribe)

Page 38: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Security Considerations

Malicious nodes Attacker floods DHT with data Attacker returns incorrect data

self-authenticating data Attacker denies data exists or supplies

incorrect routing info Basic solution: using redundancy What if attackers have quorum?

Need a way to control creation of node Ids Solution: secure node identifiers

Use public keys

Page 39: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Applications for DHTs

DHTs are used as a basic building block for an application-level infrastructure Internet Indirection Infrastructure (i3)

New forwarding infrastructure based on Chord DOA (Delegation Oriented Architecture)

New naming and addressing infrastructure based on overlays

Page 40: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Internet Indirection Infrastructure (i3)

A DHT - based overlay network Based on Chord

Aims to provide more flexible communication model than current IP addressing

Also a forwarding infrastructure i3 packets are sent to identifiers each identifier is routed to the i3 node responsible

for that identifier the node maintains triggers that are installed by

receivers when a matching trigger is found the packet is

forwarded to the receiver

Page 41: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

i3 II

An i3 identifier may be bound to a host, object, or a session

i3 has been extended with ROAM Robut Overlay Architecture for Mobility Allows end hosts to control the placement of

rendezvous-points (indirection points) for efficient routing and handovers

Legacy application support user level proxy for encapsulating IP packets

to i3 packets

Page 42: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Source: http://i3.cs.berkeley.edu/

R inserts a trigger (id, R) and receives all packets with identifier id.

the host changes its address from R1 to R2, it updates its trigger from (id, R1) to (id, R2).

Mobility is transparent for the sender

Page 43: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Source: http://i3.cs.berkeley.edu/

A multicast tree using a hierarchy of triggers

Page 44: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Source: http://i3.cs.berkeley.edu/

Anycast using the longest matching prefix rule.

Page 45: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Source: http://i3.cs.berkeley.edu/

Sender-driven service composition using a stack of identifiers

Receiver-driven service composition using a stack of identifiers

Page 46: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

DOA (Delegation Oriented Architecture)

Proposed to circumvent the harmful side-effects of middleboxes

Middleboxes are used to improve scalability, but they also constrain the Internet architecture

Two changes persistent host identifiers a mechanism for resolving these

Each host has a unique 160 bit EID (End-point Identifier)

Works in a similar fashion to HIP EID hash of a public key, identity/locator split, Self-certification

Page 47: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

DOA II DOA header is located between IP and TCP

headers and carries source and destination EIDs The mapping service maps EIDs to IP addresses This allows the introduction of various

middleboxes to the routing of packets Outsourcing of intermediaries

Ideally clients may select the most useful network elements to use

Differences to i3 Not a forwarding infrastructure i3 does not provide a primitive for specifying

intermediaries

Page 48: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Layered-model

Object API

Firewall bypass

End-to-end

Routing

Congestion control

Presentation

IP addresses

Routing paths

Host Identity, public key

Page 49: T-110.455 Network Application Frameworks and XML Distributed Hash Tables (DHTs) and multi-homing 16.2.2005 Sasu Tarkoma Based on slides by Pekka Nikander

Summary

Mobility and multi-homing require directories Scalability, low-latency updates

Overlay networks have been proposed Searching, storing, routing, notification,.. Lookup (Chord, Tapestry, Pastry), coordination

primitives (i3), middlebox support (DOA) Logarithmic scalability, decentralised,…

Host/identity split and overlays for directories seem good candidates for solving many of the issues with mobility and multi-homing