Overlay Networks and Overlay Multicast

May 2006

Definition Network

- defines addressing, routing, and service model for communication between hosts

Overlay network- A network built on top of one or

more existing networks- adds an additional layer of

indirection/virtualization- changes properties in one or

more areas of underlying network

Alternative- change an existing network

A Historical Example

Internet is an overlay network

- goal: connect local area networks

- built on local area networks (e.g., Ethernet), phone lines

- add an Internet Protocol header to all packets

Benefits

Do not have to deploy new equipment, or modify existing software/protocols

- probably have to deploy new software on top of existing software

- e.g., adding IP on top of Ethernet does not require modifying Ethernet protocol or driver

- allows bootstrapping

• expensive to develop entirely new networking hardware/software

• all networks after the telephone (Ethernet) have begun as overlay networks

Benefits Do not have to deploy at every node

- not every node needs/wants overlay network service all the time

• e.g., QoS guarantees for best-effort traffic- overlay network may be too heavyweight for

some nodes• e.g., consumes too much memory, cycles, or

bandwidth- overlay network may have unclear security

properties• e.g., may be used for service denial attack

- overlay network may not scale (not exactly a benefit)

• e.g. may require n2 state or communication

Costs Adds overhead

- adds a layer in networking stack• additional packet headers, processing

- sometimes, additional work is redundant• e.g., an IP packet contains both Ethernet (48 + 48

bits) and IP addresses (32 + 32 bits)• eliminate Ethernet addresses from Ethernet header

and assume IP header(?) Adds complexity

- layering does not eliminate complexity, it only manages it

- more layers of functionality more possible unintended interaction between layers

- e.g., corruption drops on wireless interpreted as congestion drops by TCP

Applications

Mobility

- MIPv4: pretends mobile host is in home network

Routing – P2P, … Addressing Security Multicast

Applications: Routing

Flat space- every node has a route to every other node- n2 state and communication, constant distance

Hierarchy- every node routes through its parent- constant state and communication, log(n) distance- too much load on root

Mesh (e.g., CAN-Content Addressable Network)- every node routes through 2d other nodes- O(d) state and communication, n1/d distance

Chord- every node routes through O(log n) other nodes- O(log n) state and communication, O(log n) distance

Applications: Increasing Routing Robustness

Resilient Overlay Networks [Anderson et al 2001]- overlay nodes form a

complete graph- nodes probe other nodes

for lowest latency- knowledge of complete

graph lower latency routing than IP, faster recovery from faults

Applications: Addressing

provide more address space than underlying network 6bone

- IPv6 on IPv4- requires NAT-like gateways for IPv6-only hosts to communicate with IPv4-only hosts- main current deployment of IPv6

TRIAD, IP-NL- enhanced NAT- separate Internet into realms, each with its

own IPv4 address space- use overlay network for inter-realm routing

Internet v4

6/4NAT

Applications: Security (VPN) provide more security than underlying network privacy (e.g., IPSEC)

- overlay encrypts traffic between nodes- only useful when end hosts cannot be secure

anonymity (e.g., Zero Knowledge)- overlay prevents receiver from knowing which host is

the sender, while still being able to reply- receiver cannot determine sender exactly without

compromising every overlay node along path service denial resistance (e.g., FreeNet)

- overlay replicates content so that loss of a single node does not prevent content distribution

Multicast Outline

Problems with IP Multicast Advantage: Highly efficient, Good delay Scales poorly with number of groups

- A router must maintain state for every group Supporting higher level functionality is difficult

- IP Multicast: best-effort multi-point delivery service

- Reliability and congestion control for IP Multicast complicated

• scalable, end-to-end approach for heterogeneous receivers is very difficult

• hop-by-hop approach requires more state and processing in routers

Deployment is difficult and slow- ISP’s reluctant to turn on IP Multicast

Overlay Multicast

Provide multicast functionality above the IP layer overlay or application level multicast

Potential Benefits over IP Multicast - Quick deployment- All multicast state in end systems- Computation at forwarding points simplifies

support for higher level functionality Challenge: do this efficiently Narada [Yang-hua Chu et al, 2000 CMU]

- Multi-source multicast- Involves only end hosts- Small group sizes <= hundreds of nodes- Typical application: chat

Narada: End System Multicast

Stanford

Overlay treeGatech

Berkeley

GatechStan1

End System Multicast: Narada

A distributed protocol for constructing efficient overlay trees among end systems

Caveat: assume applications with small and sparse groups

- Around tens to hundreds of members

Potential Benefits Scalability

- routers do not maintain per-group state- end systems do, but they participate in few groups

Easier to deploy- only requires adding software to end hosts

Potentially simplifies support for higher level functionality

- use hop-by-hop approach, but end hosts are routers- leverage computation and storage of end systems- e.g., packet buffering, transcoding of media streams,

ACK aggregation- leverage solutions for unicast congestion control and

reliability

Performance Concerns

Duplicate Packets:

Bandwidth Wastage

Gatech

Delay from CMU to

Berk1 increases

Stanford

Berkeley

GatechStan1

Overlay Tree Delay between the source and receivers is small The number of redundant packets on any

physical link is low Heuristic:

- Every member in the tree has a small degree - Degree chosen to reflect bandwidth of connection to

Internet

Gatech

“Efficient” overlay

Berk1Berk1

High degree (unicast)

Gatech

Stan2CMU

High latency

Gatech

Overlay Construction Problems

Dynamic changes in group membership

- Members may join and leave dynamically

- Members may die Dynamic changes in network conditions and

topology

- Delay between members may vary over time due to congestion, routing changes

Knowledge of network conditions is member specific

- Each member must determine network conditions for itself

Solution

Two step design

- Build a mesh that includes all participating end-hosts

• what they call a mesh is just a graph

• members probe each other to learn network related information

• overlay must self-improve as more information available

- Build source routed distribution trees

Advantages:- Offers a richer topology robustness; don’t

need to worry too much about failures- Don’t need to worry about cycles

Desired properties - Members have low degrees- Shortest path delay between any pair of

members along mesh is small

Berk2 Berk1

Gatech

Stan1Stan2

Overlay Trees Source routed minimum spanning tree on

mesh Desired properties

- Members have low degree- Small delays from source to receivers

Berk2 GatechBerk1

Stan1Stan2

Berk2Berk1

Gatech

Stan1Stan2

Narada Components/Techniques

Mesh Management: - Ensures mesh remains connected in face of

membership changes Mesh Optimization:

- Distributed heuristics for ensuring shortest path delay between members along the mesh is small

Spanning tree construction:- Routing algorithms for constructing data-

delivery trees - Distance vector routing, and reverse path

forwarding

Definitions

Utility gain of adding a link based on- The number of members to which routing

delay improves- How significant the improvement in delay to

each member is Cost of dropping a link based on

- The number of members to which routing delay increases, for either neighbor

Add/Drop Thresholds are functions of:- Member’s estimation of group size - Current and maximum degree of member in

the mesh

Optimizing Mesh Quality

Members periodically probe other members at random

New link added ifUtility_Gain of adding link > Add_Threshold

Members periodically monitor existing links Existing link dropped if

Cost of dropping link < Drop Threshold

Stan2CMU

Gatech1

Gatech2

A poor overlay topology:Long path from Gatech2 to CMU

Desirable properties of heuristics Stability: A dropped link will not be immediately re-added Partition avoidance: A partition of the mesh is unlikely to

be caused as a result of any single link being dropped

Delay improves to Stan1, CMU

but marginally.

Do not add link!

Delay improves to CMU, Gatech1

and significantly.

Add link!

Stan2CMU

Gatech1

Gatech2

Stan2CMU

Gatech1

Gatech2

Example

Used by Berk1 to reach only Gatech2 and vice versa: Drop!!

Gatech1Berk1

Stan2CMU

Gatech2

Gatech1Berk1

Stan2CMU

Gatech2

Simulation Results

Simulations

- Group of 128 members

- Delay between 90% pairs < 4 times the unicast delay

- No link caries more than 9 copies Experiments (in Internet)

- Group of 13 members

- Delay between 90% pairs < 1.5 times the unicast delay

Summary

End-system multicast (NARADA) : aimed to small-sized groups- Application example: chat

Multi source multicast model No need for infrastructure Properties

- low performance penalty compared to IP Multicast

- potential to simplify support for higher layer functionality

- allows for application-specific customizations

Summary Narada demonstrates the flexibility of the

application level multicast- I.e., the ability to optimize the multicast

distribution to the application needs Issues

- 4x unicast delay could be a problem for interactive applications

- reliability and congestion control for heterogeneous receivers not demonstrated

- sender access control solution not demonstrated

- overhead of probes is low for one group, what about for n groups on same host?

- is stress really an important metric?

Other Projects Overcast [Jannotti et al, Cisco, 2000]

- Single source tree- Uses an infrastructure; end hosts are not part of

multicast tree- Large groups ~ millions of nodes- Typical application: content distribution

Scattercast (Chawathe et al, UC Berkeley)- Emphasis on infrastructural support and proxy-

based multicast- Uses a mesh like Narada, but differences in protocol

details Yoid (Paul Francis, Cornell)

- Uses a shared tree among participating members- Distributed heuristics for managing and optimizing

tree constructions

Overcast Designed for throughput intensive content

delivery

- Streaming, file distribution Single source multicast; like Express Solution: build a server based infrastructure Tree building objective: high throughput

Tree Building Protocol Goal: Maximize bandwidth to root for all nodes Idea: Add a new node as far away from the route as possible without

compromising the throughput

0.80.8 1

Join (new, root) { current = root; B = bandwidth(root, new); do { B1 = 0; for all n in children(current) { B1 = bandwidth(n, new); if (B1 >= B) { current = n; break; } } while (B1 >= B); new->parent = root; }

Details A node periodically reevaluates its position by

measuring bandwidth to its- Siblings- Parent- Grandparent

The Up/Down protocol: track membership- Each node maintains info about all nodes in it sub-

tree plus a log of changes• Memory cheap

- Each node sends periodical alive messages to its parent

- A node propagates info up-stream, when• Hears first time from a children• If it doesn’t hear from a children for a present

interval• Receives updates from children

Details Problem: root single point of failure Solution: replicate root to have a backup source Problem: only root maintain complete info about the

tree; need also protocol to replicate this info Elegant solution: maintain a tree in which first levels

have degree one- Advantage: all nodes at these levels maintain full info

about the tree- Disadvantage: may increase delay, but this is not

important for application supported by Overcast

Nodes maintaining full Status info about tree

Some Results

Network load < twice the load of IP multicast (600 node network)

Convergence: a 600 node network converges in ~ 45 rounds

Summary Overcast: aimed to large groups and high

throughput applications

- Examples: video streaming, software download

Single source multicast model Deployed as an infrastructure Properties

- Low performance penalty compared to IP multicast

- Robust & customizable (e.g., use local disks for aggressive caching)

Enabling Conferencing Applications

on the Internet using an Overlay Multicast Architecture

Yang-hua Chu, Sanjay Rao, Srini Seshan and Hui Zhang

Carnegie Mellon University

Past Work Self-organizing protocols

- Yoid (ACIRI), Narada (CMU), Scattercast (Berkeley), Overcast (CISCO), Bayeux (Berkeley), …

- Construct overlay trees in distributed fashion

- Self-improve with more network info

Performance results showed promise, but…

- Evaluation conducted in simulation

- Did not consider impact of network dynamics on overlay performance

Focus of This Paper

Can End System Multicast support real-world applications on the Internet?

- Study in context of conferencing applications

- Show performance acceptable even in a dynamic and heterogeneous Internet environment

First detailed Internet evaluation to show the feasibility of End System Multicast

Why Conferencing?

Important and well-studied

- Early goal and use of multicast (vic, vat) Representative of interactive apps

- E.g., distance learning, on-line games Stringent performance requirements

- High bandwidth, low latency

Roadmap

Enhancing self-organizing protocols for conferencing applications

Evaluation methodology Results from Internet experiments

Supporting Conferencing in ESM (End System Multicast)

Framework- Unicast congestion control on each overlay link

- Adapt to the data rate using transcoding

Objective- High bandwidth and low latency to all receivers

along the overlay

B2 Mbps

2 Mbps 0.5 MbpsSource rate

2 MbpsUnicast congestion control

Transcoding

Enhancements of Overlay Design

Two new issues addressed

- Dynamically adapt to changes in network conditions

- Optimize overlays for multiple metrics

• Latency and bandwidth Study in the context of the Narada protocol

(Sigmetrics 2000) - CRZ00

- Techniques presented apply to all self-organizing protocols

• Capture the long term performance of a link – Exponential smoothing, Metric discretization

Adapt to Dynamic Metrics Adapt overlay trees to changes in network condition

- Monitor avail bandwidth (eg, with Spruce) and latency of overlay links

- Link measurements can be noisy- Aggressive adaptation may cause overlay instability

raw estimatesmoothed estimatediscretized estimate

transient: do not react

persistent:react

Optimize Overlays for Dual Metrics

Prioritize bandwidth over latency Break tie with shorter latency

SourceReceiver X

30ms, 1Mbps

60ms, 2MbpsSource rate

2 Mbps

Example of Protocol BehaviorM

Reach a stable overlay• Acquire network

info• Self-organization

Adapt to network congestion

All members join at time 0 Single sender, CBR traffic

Evaluation Overview

Evaluation Goals- Can ESM provide application level performance

comparable to IP Multicast?- What network metrics must be considered while

constructing overlays?- What is the network cost and overhead?

Compare performance of our scheme with- Benchmark (IP Multicast)- Other overlay schemes that consider fewer metrics

Evaluate schemes in different scenarios- Vary host set, source rate

Performance metrics- Application perspective: latency, bandwidth- Network perspective: resource usage, overhead

Benchmark Scheme

IP Multicast not deployed (Mbone is an overlay) Sequential Unicast: an approximation

- Bandwidth and latency of unicast path from source to each receiver

- Performance similar to IP Multicast with ubiquitous (well spread out) deployment

Source

Overlay Schemes

Overlay Scheme Choice of Metrics

Bandwidth Latency

Bandwidth-Latency

Bandwidth-Only

Latency-Only

Random

Experiment Methodology

Compare different schemes on the Internet

- Ideally: run different schemes concurrently

- Interleave experiments of different schemes

- Repeat same experiments at different time of day

- Average results over 10 experiments For each experiment

- All members join at the same time

- Single source CBR traffic with TFRC adaptation

- Each experiment lasts for 20 minutes

Application Level Metrics

Bandwidth (throughput) observed by each receiver

RTT between source and each receiver along overlay

Source Data path

RTT measurement

These measurements include queueing and processing delays at end systems

Performance of Overlay Scheme

“Quality” of overlay tree produced by a scheme Sort (“rank”) receivers based on performance Take mean and std. dev. on performance of same rank

across multiple experiments Std. dev. shows variability of tree quality

Rank1 2

Different runs of the same scheme mayproduce different but “similar quality” trees

Harvard

Exp2Exp1

Exp1Exp2

Mean Std. Dev.

Factors Affecting Performance

Heterogeneity of host set

- Primary Set: 13 university hosts in U.S. and Canada

- Extended Set: 20 hosts, which includes hosts in Europe, Asia, and behind ADSL

Source rate

- Fewer Internet paths can sustain higher source rate

- More intelligence required in overlay constructions

Three Scenarios Considered

Does ESM work in different scenarios? How do different schemes perform under

various scenarios? Is it sufficient to consider just a single metric?

- Bandwidth-Only, Latency-Only

Primary Set1.2 Mbps

Primary Set2.4 Mbps

Extended Set2.4 Mbps

(lower) “stress” to overlay schemes (higher)

Primary Set1.2 Mbps

BW (Primary Set, 1.2 Mbps)

Random scheme performs poorly even in a less “stressful” scenario

RTT results show similar trend

Internet pathology

BW (Extended Set, 2.4 Mbps)

no strong correlation betweenlatency and bandwidth

Optimizing only for latency has poor bandwidth performance

RTT (Extended Set, 2.4Mbps)

Bandwidth-Only cannot avoidpoor latency links or long path length

Optimizing only for bandwidth has poor latency performance

Internet tools used in the design

From: “Measurement-Based Optimization Techniques for Bandwidth-Demanding Peer-to-Peer Systems” T. S. Eugene et al, Infocom 2003

• RTT probing:

• a Ping pkt is sent to each candidate peer; select lowest RTT peer

• 10KB TCP:

• TCP connection open, 10KB downloaded, fastest download peer selected

• BNBW (Bottleneck Bw):

• Nettimer (a packet pair scheme) used; peer with largest BNBW is selected

Summary so far…

For best application performance: adapt dynamically to both latency and bandwidth metrics

Bandwidth-Latency performs comparably to IP Multicast (Sequential-Unicast)

What is the network cost and overhead?

Resource Usage (RU)

Captures consumption of network resource of overlay tree Overlay link RU = propagation delay Tree RU = sum of link RU

U.Pitt

U. Pitt

Efficient (RU = 42ms)

Inefficient (RU = 80ms)

IP Multicast 1.0Bandwidth-Latency 1.49Random 2.24Naïve Unicast 2.62

Scenario: Primary Set, 1.2 Mbps(normalized to IP Multicast RU)

Protocol Overhead

Results: Primary Set, 1.2 Mbps

- Average overhead = 10.8%

- 92.2% of overhead is due to bandwidth probe Current scheme employs active probing for

available bandwidth

- Simple heuristics to eliminate unnecessary probes

- Focus of our current research

Protocol overhead = total non-data traffic (in bytes)

total data traffic (in bytes)

Contribution

First detailed Internet evaluation to show the feasibility of End System Multicast architecture

- Study in context of a/v conferencing

- Performance comparable to IP Multicast

Impact of metrics on overlay performance

- For best performance: use both latency and bandwidth

More info: http://www.cs.cmu.edu/~narada

Overlay Networks and Overlay Multicast

Documents

Overlay Multicast for MANETs Using Dynamic Virtual Mesh

Efficient Overlay Multicast Protocol in Mobile Ad hoc Networks

Docker Overlay Networks

ROMaN: Revenue driven Overlay Multicast Networking

Designing Overlay Multicast Networks for Streaming

OVERLAY MULTICAST FOR REAL-TIME DISTRIBUTED …

Quality-of-Service Multicast Overlay Spanning Tree Algorithms …rodolakis/publications/qmost.pdf · Quality-of-Service Multicast Overlay Spanning Tree Algorithms for Wireless Ad

Multicast ad hoc networks

Designing Overlay Multicast Networks for Commercial Streaming

Overlay Networks for Multimedia Contents Distribution · Mesh-based Multicast Networks Tree-based Multicast Networks DHT-based Multicast Networks References Overcast (Cisco, 2000)

ANF Overlay Multicast Working Group Activity

ROMA: Reliable Overlay Multicast with Loosely Coupled TCP Connections

Service Overlay Forest Embedding for Software … Overlay Forest Embedding for Software-Deﬁned Cloud Networks ... Virtual Network Functions (VNFs) ... Network-based multicast IPTV

A Management Method of IP Multicast in Overlay …conferences.sigcomm.org/sigcomm/2012/paper/hotsdn/p91.pdfA Management Method of IP Multicast in Overlay Networks ... switch with VXLAN

Overlay Multicast Trees

Overlay Networks and Overlay Multicast May 2008. 2 Definition Network -defines addressing, routing, and service model for communication between hosts

EECS 122: Introduction to Computer Networks Multicast and ...inst.eecs.berkeley.edu/~ee122/fa08/notes/24-Multicastx2.pdf · Multicast and Overlay Networks Ion Stoica (and Brighten

CS 268: Overlay Networks: Introduction and Multicast Ion Stoica April 15-17, 2003

Impact of Topology on Overlay Multicast Suat Mercan

Peer-to-Peer Overlay Networks. Outline Overview of P2P overlay networks Applications of overlay networks Classification of overlay networks – Structured