Overlay Networks and Overlay Multicast
May 2006
Definition
Network
- defines addressing, routing, and service model for communication between hosts
Overlay network
- A network built on top of one or more existing networks
- adds an additional layer of indirection/virtualization
- changes properties in one or more areas of underlying network
Alternative
- change an existing network layer
A Historical Example
Internet is an overlay network
- goal: connect local area networks
- built on local area networks (e.g., Ethernet), phone lines
- add an Internet Protocol header to all packets
Benefits
Do not have to deploy new equipment, or modify existing software/protocols
- probably have to deploy new software on top of existing software
- e.g., adding IP on top of Ethernet does not require modifying Ethernet protocol or driver
- allows bootstrapping
• expensive to develop entirely new networking hardware/software
• all networks after the telephone (Ethernet) have begun as overlay networks
Benefits
Do not have to deploy at every node
- not every node needs/wants overlay network service all the time
• e.g., QoS guarantees for best-effort traffic
- overlay network may be too heavyweight for some nodes
• e.g., consumes too much memory, cycles, or bandwidth
- overlay network may have unclear security properties
• e.g., may be used for service denial attack
- overlay network may not scale (not exactly a benefit)
• e.g., may require n^2 state or communication
Costs
Adds overhead
- adds a layer in networking stack
• additional packet headers, processing
- sometimes, additional work is redundant
• e.g., an IP packet contains both Ethernet (48 + 48 bits) and IP addresses (32 + 32 bits)
• eliminate Ethernet addresses from Ethernet header and assume IP header(?)
Adds complexity
- layering does not eliminate complexity, it only manages it
- more layers of functionality mean more possible unintended interactions between layers
- e.g., corruption drops on wireless interpreted as congestion drops by TCP
Applications
Mobility
- MIPv4: pretends mobile host is in home network
Routing
- P2P, …
Addressing
Security
Multicast
Applications: Routing
Flat space
- every node has a route to every other node
- n^2 state and communication, constant distance
Hierarchy
- every node routes through its parent
- constant state and communication, log(n) distance
- too much load on root
Mesh (e.g., CAN, the Content Addressable Network)
- every node routes through 2d other nodes
- O(d) state and communication, n^(1/d) distance
Chord
- every node routes through O(log n) other nodes
- O(log n) state and communication, O(log n) distance
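To make the trade-off between the four routing structures concrete, the sketch below tabulates per-node state and path length for a given n. It is only an illustration: constant factors are dropped, the hierarchy is assumed to be a binary tree, and the flat case reports per-node state (n - 1 routes, so roughly n^2 across the whole overlay). The function name `routing_costs` is made up for this example.

```python
import math


def routing_costs(n: int, d: int = 2) -> dict:
    """Per-node state and path length (overlay hops) for n nodes.

    Constant factors are omitted, so the values are orders of
    magnitude, not exact counts.
    """
    log_n = math.log2(n)
    return {
        # Flat: a route to every other node, always one hop away.
        "flat": {"state": n - 1, "distance": 1},
        # Hierarchy: only the parent is known; paths climb the tree.
        "hierarchy": {"state": 1, "distance": log_n},
        # CAN-style mesh: 2d neighbours in a d-dimensional space.
        "mesh": {"state": 2 * d, "distance": n ** (1 / d)},
        # Chord: O(log n) fingers and O(log n) hops.
        "chord": {"state": log_n, "distance": log_n},
    }


for name, cost in routing_costs(n=1024, d=2).items():
    print(f"{name:10s} state={cost['state']:8.0f} distance={cost['distance']:6.0f}")
```

For n = 1024 and d = 2 the mesh keeps only 4 neighbours per node but needs on the order of 32 hops, while Chord needs on the order of 10 of each.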
Applications: Increasing Routing Robustness
Resilient Overlay Networks [Anderson et al 2001]
- overlay nodes form a complete graph
- nodes probe other nodes for lowest latency
- knowledge of complete graph gives lower latency routing than IP, faster recovery from faults
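A minimal sketch of how probed latencies over a complete graph can yield a better route than the direct path. It assumes each node holds a full latency table and considers at most one intermediate relay; both the table values and the one-relay restriction are assumptions of this example, not details taken from the slide.

```python
def best_overlay_path(latency, src, dst):
    """Choose between the direct path and every one-relay detour.

    `latency` maps node -> {other node -> probed latency in ms}.
    Returns (path as a list of nodes, total latency).
    """
    best_path, best_cost = [src, dst], latency[src][dst]
    for relay in latency:
        if relay in (src, dst):
            continue
        cost = latency[src][relay] + latency[relay][dst]
        if cost < best_cost:
            best_path, best_cost = [src, relay, dst], cost
    return best_path, best_cost


# Hypothetical probe results: the direct A-B path is slow (or failing).
probes = {
    "A": {"B": 120.0, "C": 30.0},
    "B": {"A": 120.0, "C": 40.0},
    "C": {"A": 30.0, "B": 40.0},
}
print(best_overlay_path(probes, "A", "B"))  # (['A', 'C', 'B'], 70.0)
```

The same comparison handles fault recovery: if probes to B start timing out, the entry for A to B can be set to a very large value and the detour through C is picked automatically.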
Applications: Addressing
provide more address space than underlying network
6bone
- IPv6 on IPv4
- requires NAT-like gateways for IPv6-only hosts to communicate with IPv4-only hosts
- main current deployment of IPv6
TRIAD, IP-NL
- enhanced NAT
- separate Internet into realms, each with its own IPv4 address space
- use overlay network for inter-realm routing
[Figure: IPv6 (6) and dual IPv6/IPv4 (6/4) nodes connected across the IPv4 Internet, with a 6/4 NAT]
Applications: Security (VPN)
provide more security than underlying network
privacy (e.g., IPSEC)
- overlay encrypts traffic between nodes
- only useful when end hosts cannot be secure
anonymity (e.g., Zero Knowledge)
- overlay prevents receiver from knowing which host is the sender, while still being able to reply
- receiver cannot determine sender exactly without compromising every overlay node along path
service denial resistance (e.g., FreeNet)
- overlay replicates content so that loss of a single node does not prevent content distribution
Multicast Outline
Problems with IP Multicast
Advantage: highly efficient, good delay
Scales poorly with number of groups
- A router must maintain state for every group
Supporting higher level functionality is difficult
- IP Multicast: best-effort multi-point delivery service
- Reliability and congestion control for IP Multicast complicated
• scalable, end-to-end approach for heterogeneous receivers is very difficult
• hop-by-hop approach requires more state and processing in routers
Deployment is difficult and slow
- ISPs reluctant to turn on IP Multicast
Overlay Multicast
Provide multicast functionality above the IP layer: overlay or application level multicast
Potential Benefits over IP Multicast
- Quick deployment
- All multicast state in end systems
- Computation at forwarding points simplifies support for higher level functionality
Challenge: do this efficiently
Narada [Yang-hua Chu et al, 2000 CMU]
- Multi-source multicast
- Involves only end hosts
- Small group sizes <= hundreds of nodes
- Typical application: chat
Narada: End System Multicast
[Figure: end hosts Stan1 and Stan2 (Stanford), Berk1 and Berk2 (Berkeley), CMU, and Gatech, shown on the physical topology and as an overlay tree]
End System Multicast: Narada
A distributed protocol for constructing efficient overlay trees among end systems
Caveat: assume applications with small and sparse groups
- Around tens to hundreds of members
Potential Benefits
Scalability
- routers do not maintain per-group state
- end systems do, but they participate in few groups
Easier to deploy
- only requires adding software to end hosts
Potentially simplifies support for higher level functionality
- use hop-by-hop approach, but end hosts are routers
- leverage computation and storage of end systems
- e.g., packet buffering, transcoding of media streams, ACK aggregation
- leverage solutions for unicast congestion control and reliability
Performance Concerns
Duplicate packets: bandwidth wastage
Delay from CMU to Berk1 increases
[Figure: overlay among Stan1, Stan2, Berk1, Berk2, CMU, and Gatech drawn over the physical topology (Stanford, Berkeley, Gatech, CMU)]
Overlay Tree
Delay between the source and receivers is small
The number of redundant packets on any physical link is low
Heuristic:
- Every member in the tree has a small degree
- Degree chosen to reflect bandwidth of connection to Internet
[Figure: three overlay trees over CMU, Gatech, Stan1, Stan2, Berk1, and Berk2: an “Efficient” overlay, a high degree (unicast) overlay, and a high latency overlay]
Overlay Construction Problems
Dynamic changes in group membership
- Members may join and leave dynamically
- Members may die
Dynamic changes in network conditions and topology
- Delay between members may vary over time due to congestion, routing changes
Knowledge of network conditions is member specific
- Each member must determine network conditions for itself
Solution
Two step design
- Build a mesh that includes all participating end-hosts
• what they call a mesh is just a graph
• members probe each other to learn network related information
• overlay must self-improve as more information becomes available
- Build source routed distribution trees
Mesh
Advantages:
- Offers a richer topology, which gives robustness; don’t need to worry too much about failures
- Don’t need to worry about cycles
Desired properties
- Members have low degrees
- Shortest path delay between any pair of members along mesh is small
[Figure: mesh connecting Berk1, Berk2, CMU, Gatech, Stan1, and Stan2]
Overlay Trees
Source routed minimum spanning tree on mesh
Desired properties
- Members have low degree
- Small delays from source to receivers
[Figure: the mesh over Berk1, Berk2, CMU, Gatech, Stan1, and Stan2, and a spanning tree built on it]
Narada Components/Techniques
Mesh Management:
- Ensures mesh remains connected in face of membership changes
Mesh Optimization:
- Distributed heuristics for ensuring shortest path delay between members along the mesh is small
Spanning tree construction:
- Routing algorithms for constructing data-delivery trees
- Distance vector routing, and reverse path forwarding
Definitions
Utility gain of adding a link based on
- The number of members to which routing delay improves
- How significant the improvement in delay to each member is
Cost of dropping a link based on
- The number of members to which routing delay increases, for either neighbor
Add/Drop Thresholds are functions of:
- Member’s estimation of group size
- Current and maximum degree of member in the mesh
Optimizing Mesh Quality
Members periodically probe other members at random
New link added if
Utility_Gain of adding link > Add_Threshold
Members periodically monitor existing links
Existing link dropped if
Cost of dropping link < Drop_Threshold
[Figure: a poor overlay topology among Berk1, Stan1, Stan2, CMU, Gatech1, and Gatech2: long path from Gatech2 to CMU]
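The add rule can be illustrated with a small sketch that scores a candidate link by how many members get closer and by how much. The scoring formula used here (sum of relative delay improvements) is one plausible reading of the definitions on the previous slide, and the mesh, delays, and function names are hypothetical; the mesh is assumed to be connected.

```python
import heapq


def shortest_delays(mesh, source):
    """Dijkstra over an undirected mesh given as {node: {neighbor: delay}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in mesh[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def utility_gain(mesh, me, peer, probed_delay):
    """Sum of relative delay improvements if the link me-peer were added."""
    before = shortest_delays(mesh, me)
    # Evaluate on a copy so the real mesh is left untouched.
    trial = {u: dict(nbrs) for u, nbrs in mesh.items()}
    trial[me][peer] = probed_delay
    trial[peer][me] = probed_delay
    after = shortest_delays(trial, me)
    gain = 0.0
    for member, new_delay in after.items():
        if member != me and new_delay < before[member]:
            gain += (before[member] - new_delay) / before[member]
    return gain


# Hypothetical chain A - B - C - D with 10 ms links; A probes D at 5 ms.
mesh = {
    "A": {"B": 10.0},
    "B": {"A": 10.0, "C": 10.0},
    "C": {"B": 10.0, "D": 10.0},
    "D": {"C": 10.0},
}
add_threshold = 0.5  # assumed value; the slides make it depend on group size and degree
print(utility_gain(mesh, "A", "D", 5.0) > add_threshold)  # True
```

A member would run the same comparison after each random probe, and a symmetric cost function over existing links would drive the drop rule.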
Desirable properties of heuristics
Stability: A dropped link will not be immediately re-added
Partition avoidance: A partition of the mesh is unlikely to be caused as a result of any single link being dropped
[Figure: two probes on the mesh of Berk1, Stan1, Stan2, CMU, Gatech1, and Gatech2]
- Delay improves to Stan1, CMU but marginally. Do not add link!
- Delay improves to CMU, Gatech1 and significantly. Add link!
Example
Used by Berk1 to reach only Gatech2 and vice versa: Drop!!
[Figure: the mesh of Berk1, Stan1, Stan2, CMU, Gatech1, and Gatech2 before and after the link is dropped]
Simulation Results
Simulations
- Group of 128 members
- Delay between 90% pairs < 4 times the unicast delay
- No link carries more than 9 copies
Experiments (in Internet)
- Group of 13 members
- Delay between 90% pairs < 1.5 times the unicast delay
Summary
End-system multicast (NARADA): aimed at small-sized groups
- Application example: chat
Multi source multicast model
No need for infrastructure
Properties
- low performance penalty compared to IP Multicast
- potential to simplify support for higher layer functionality
- allows for application-specific customizations
Summary
Narada demonstrates the flexibility of the application level multicast
- I.e., the ability to optimize the multicast distribution to the application needs
Issues
- 4x unicast delay could be a problem for interactive applications
- reliability and congestion control for heterogeneous receivers not demonstrated
- sender access control solution not demonstrated
- overhead of probes is low for one group, what about for n groups on same host?
- is stress really an important metric?
Other Projects
Overcast [Jannotti et al, Cisco, 2000]
- Single source tree
- Uses an infrastructure; end hosts are not part of multicast tree
- Large groups ~ millions of nodes
- Typical application: content distribution
Scattercast (Chawathe et al, UC Berkeley)
- Emphasis on infrastructural support and proxy-based multicast
- Uses a mesh like Narada, but differences in protocol details
Yoid (Paul Francis, Cornell)
- Uses a shared tree among participating members
- Distributed heuristics for managing and optimizing tree constructions
Overcast
Designed for throughput intensive content delivery
- Streaming, file distribution
Single source multicast; like Express
Solution: build a server based infrastructure
Tree building objective: high throughput
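Tree Building Protocol
Goal: Maximize bandwidth to root for all nodes
Idea: Add a new node as far away from the root as possible without compromising the throughput
[Figure: example tree below the Root with per-link bandwidths between 0.5 and 1]
Join (new, root) {
    current = root;
    B = bandwidth(root, new);    // bandwidth the new node gets directly from the root
    do {
        B1 = 0;
        for all n in children(current) {
            B1 = bandwidth(n, new);
            if (B1 >= B) {       // child n is at least as good: descend one level
                current = n;
                break;
            }
        }
    } while (B1 >= B);           // stop when no child of current qualifies
    new->parent = current;       // attach at the deepest qualifying node
}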
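The join procedure can also be written as a small runnable sketch. The `Node` class and the `bandwidth` callback are stand-ins invented for this example (a real node would measure bandwidth over the network), and the loop uses an explicit flag instead of the B1 variable so that it also terminates when a node has no children.

```python
class Node:
    """Hypothetical overlay node: a name, a parent, and a list of children."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []


def join(new, root, bandwidth):
    """Attach `new` as deep in the tree as possible without losing bandwidth.

    `bandwidth(a, b)` is an assumed measurement callback returning the
    bandwidth from node a to node b.
    """
    current = root
    target = bandwidth(root, new)  # what the new node gets straight from the root
    moved = True
    while moved:
        moved = False
        for child in current.children:
            if bandwidth(child, new) >= target:
                current = child  # descend one level and search again
                moved = True
                break
    new.parent = current
    current.children.append(new)
    return current


# Hypothetical tree Root -> A -> B and measured bandwidths to the new node.
root, a, b, new = Node("Root"), Node("A"), Node("B"), Node("New")
root.children.append(a)
a.children.append(b)
measured = {"Root": 1.0, "A": 1.0, "B": 0.5}
parent = join(new, root, lambda src, dst: measured[src.name])
print(parent.name)  # A
```

The new node ends up under A because B would cut its bandwidth from 1.0 to 0.5.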
Details
A node periodically reevaluates its position by measuring bandwidth to its
- Siblings
- Parent
- Grandparent
The Up/Down protocol: track membership
- Each node maintains info about all nodes in its sub-tree plus a log of changes
• Memory cheap
- Each node sends periodical alive messages to its parent
- A node propagates info up-stream, when
• Hears first time from a child
• It doesn’t hear from a child for a preset interval
• Receives updates from children
Details
Problem: root single point of failure
Solution: replicate root to have a backup source
Problem: only root maintains complete info about the tree; need also protocol to replicate this info
Elegant solution: maintain a tree in which first levels have degree one
- Advantage: all nodes at these levels maintain full info about the tree
- Disadvantage: may increase delay, but this is not important for application supported by Overcast
[Figure: nodes maintaining full status info about tree]
Some Results
Network load < twice the load of IP multicast (600 node network)
Convergence: a 600 node network converges in ~ 45 rounds
Summary
Overcast: aimed at large groups and high throughput applications
- Examples: video streaming, software download
Single source multicast model
Deployed as an infrastructure
Properties
- Low performance penalty compared to IP multicast
- Robust & customizable (e.g., use local disks for aggressive caching)
Enabling Conferencing Applications on the Internet using an Overlay Multicast Architecture
Yang-hua Chu, Sanjay Rao, Srini Seshan and Hui Zhang
Carnegie Mellon University
Past Work
Self-organizing protocols
- Yoid (ACIRI), Narada (CMU), Scattercast (Berkeley), Overcast (CISCO), Bayeux (Berkeley), …
- Construct overlay trees in distributed fashion
- Self-improve with more network info
Performance results showed promise, but…
- Evaluation conducted in simulation
- Did not consider impact of network dynamics on overlay performance
Focus of This Paper
Can End System Multicast support real-world applications on the Internet?
- Study in context of conferencing applications
- Show performance acceptable even in a dynamic and heterogeneous Internet environment
First detailed Internet evaluation to show the feasibility of End System Multicast
Why Conferencing?
Important and well-studied
- Early goal and use of multicast (vic, vat)
Representative of interactive apps
- E.g., distance learning, on-line games
Stringent performance requirements
- High bandwidth, low latency
Roadmap
Enhancing self-organizing protocols for conferencing applications
Evaluation methodology
Results from Internet experiments
Supporting Conferencing in ESM (End System Multicast)
Framework
- Unicast congestion control on each overlay link
- Adapt to the data rate using transcoding
Objective
- High bandwidth and low latency to all receivers along the overlay
[Figure: overlay among hosts A, B, C, and D with a 2 Mbps source rate; links labeled 2 Mbps, 2 Mbps, and 0.5 Mbps (DSL), annotated with unicast congestion control and transcoding]
Enhancements of Overlay Design
Two new issues addressed
- Dynamically adapt to changes in network conditions
- Optimize overlays for multiple metrics
• Latency and bandwidth
Study in the context of the Narada protocol (Sigmetrics 2000) - CRZ00
- Techniques presented apply to all self-organizing protocols
Adapt to Dynamic Metrics
Adapt overlay trees to changes in network condition
- Monitor avail bandwidth (e.g., with Spruce) and latency of overlay links
- Link measurements can be noisy
- Aggressive adaptation may cause overlay instability
• Capture the long term performance of a link
- Exponential smoothing, Metric discretization
[Figure: bandwidth vs. time showing the raw estimate, smoothed estimate, and discretized estimate; transient change: do not react; persistent change: react]
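A minimal sketch of how exponential smoothing and discretization can separate a transient dip from a persistent change. The smoothing weight `alpha`, the level width `step`, and the class name are assumptions chosen for illustration; the slide does not give the parameters, and a real implementation would likely add hysteresis around level boundaries.

```python
class LinkEstimator:
    """Smoothed, discretized estimate of one overlay link's bandwidth."""

    def __init__(self, alpha=0.2, step=1.0):
        self.alpha = alpha    # weight given to the newest raw sample
        self.step = step      # width of one discretization level (Mbps)
        self.smoothed = None  # exponentially weighted moving average

    def update(self, sample):
        """Feed one raw measurement; return the discretized estimate."""
        if self.smoothed is None:
            self.smoothed = sample
        else:
            self.smoothed = self.alpha * sample + (1 - self.alpha) * self.smoothed
        level = int(self.smoothed // self.step)
        return level * self.step


est = LinkEstimator(alpha=0.2, step=1.0)
for raw in [2.4, 1.0, 1.0]:
    # The overlay would react only when the discretized value changes.
    print(raw, est.update(raw))
```

With these parameters a single low sample leaves the discretized value at 2.0, while a second consecutive low sample drops it to 1.0, which is the point where the overlay would adapt.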
Optimize Overlays for Dual Metrics
Prioritize bandwidth over latency
Break tie with shorter latency
[Figure: two paths from Source to Receiver X, one with 30ms, 1Mbps and one with 60ms, 2Mbps; source rate 2 Mbps]
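The selection rule can be sketched as a two-part sort key: bandwidth first, latency as the tie-breaker. Capping bandwidth at the source rate is an assumption added here (capacity beyond what the source sends brings no benefit); the function name and tuple layout are also made up for the example.

```python
def choose_path(paths, source_rate):
    """Pick the best (latency_ms, bandwidth_mbps) path.

    Bandwidth is compared first, capped at the source rate (assumption),
    and ties are broken in favour of the shorter latency.
    """
    def key(path):
        latency, bandwidth = path
        usable = min(bandwidth, source_rate)
        return (-usable, latency)  # higher usable bandwidth, then lower latency

    return min(paths, key=key)


# The two paths from the slide's figure, with a 2 Mbps source.
print(choose_path([(30, 1.0), (60, 2.0)], source_rate=2.0))  # (60, 2.0)
```

The 60 ms path wins despite its higher latency because the 30 ms path cannot carry the full 2 Mbps stream.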
Example of Protocol Behavior
All members join at time 0
Single sender, CBR traffic
[Figure: mean receiver bandwidth over time; the overlay first reaches a stable state (acquire network info, self-organization) and later adapts to network congestion]
Evaluation Overview
Evaluation Goals
- Can ESM provide application level performance comparable to IP Multicast?
- What network metrics must be considered while constructing overlays?
- What is the network cost and overhead?
Compare performance of our scheme with
- Benchmark (IP Multicast)
- Other overlay schemes that consider fewer metrics
Evaluate schemes in different scenarios
- Vary host set, source rate
Performance metrics
- Application perspective: latency, bandwidth
- Network perspective: resource usage, overhead
Benchmark Scheme
IP Multicast not deployed (Mbone is an overlay)
Sequential Unicast: an approximation
- Bandwidth and latency of unicast path from source to each receiver
- Performance similar to IP Multicast with ubiquitous (well spread out) deployment
[Figure: Source with unicast paths to receivers A, B, and C]
Overlay Schemes
Choice of metrics considered by each overlay scheme:
Overlay Scheme       Bandwidth   Latency
Bandwidth-Latency    yes         yes
Bandwidth-Only       yes         no
Latency-Only         no          yes
Random               no          no
Experiment Methodology
Compare different schemes on the Internet
- Ideally: run different schemes concurrently
- Interleave experiments of different schemes
- Repeat same experiments at different time of day
- Average results over 10 experiments
For each experiment
- All members join at the same time
- Single source CBR traffic with TFRC adaptation
- Each experiment lasts for 20 minutes
Application Level Metrics
Bandwidth (throughput) observed by each receiver
RTT between source and each receiver along overlay
These measurements include queueing and processing delays at end systems
[Figure: overlay among Source, A, B, C, and D showing the data path and the RTT measurement]
Performance of Overlay Scheme
“Quality” of overlay tree produced by a scheme
Sort (“rank”) receivers based on performance
Take mean and std. dev. on performance of same rank across multiple experiments
Std. dev. shows variability of tree quality
Different runs of the same scheme may produce different but “similar quality” trees
[Figure: trees over CMU, MIT, and Harvard from Exp1 and Exp2, with receiver RTTs of 30ms and 40ms in one run and 32ms and 42ms in the other; mean and std. dev. of RTT plotted for ranks 1 and 2]
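The ranking procedure can be sketched in a few lines: sort each experiment's per-receiver values, then aggregate position by position. The RTT values are taken from the figure labels, but which run produced which pair is an assumption, as is the use of the population standard deviation (the slide does not say which variant was used).

```python
from statistics import mean, pstdev


def rank_statistics(experiments):
    """Return [(mean, std_dev), ...] per rank across experiments.

    Each experiment is a list of per-receiver values (e.g. RTT in ms).
    Receivers are sorted within a run, so rank 1 is the best receiver
    of that run regardless of which host it was.
    """
    ranked = [sorted(run) for run in experiments]
    return [(mean(column), pstdev(column)) for column in zip(*ranked)]


runs = [
    [30.0, 40.0],  # one experiment's receiver RTTs (ms)
    [42.0, 32.0],  # another experiment, receivers in a different order
]
for rank, (avg, dev) in enumerate(rank_statistics(runs), start=1):
    print(f"rank {rank}: mean={avg} ms, std dev={dev} ms")
```

Because hosts are compared by rank rather than by name, two runs that build different trees of similar quality produce a small standard deviation.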
Factors Affecting Performance
Heterogeneity of host set
- Primary Set: 13 university hosts in U.S. and Canada
- Extended Set: 20 hosts, which includes hosts in Europe, Asia, and behind ADSL
Source rate
- Fewer Internet paths can sustain higher source rate
- More intelligence required in overlay constructions
Three Scenarios Considered
Does ESM work in different scenarios?
How do different schemes perform under various scenarios?
Is it sufficient to consider just a single metric?
- Bandwidth-Only, Latency-Only
Scenarios, from lower to higher “stress” to overlay schemes:
- Primary Set, 1.2 Mbps
- Primary Set, 2.4 Mbps
- Extended Set, 2.4 Mbps
BW (Primary Set, 1.2 Mbps)
Random scheme performs poorly even in a less “stressful” scenario
RTT results show similar trend
Internet pathology
BW (Extended Set, 2.4 Mbps)
No strong correlation between latency and bandwidth
Optimizing only for latency has poor bandwidth performance
RTT (Extended Set, 2.4 Mbps)
Bandwidth-Only cannot avoid poor latency links or long path length
Optimizing only for bandwidth has poor latency performance
Internet tools used in the design
From: “Measurement-Based Optimization Techniques for Bandwidth-Demanding Peer-to-Peer Systems”, T. S. Eugene Ng et al, Infocom 2003
• RTT probing:
• a Ping pkt is sent to each candidate peer; select lowest RTT peer
• 10KB TCP:
• TCP connection open, 10KB downloaded, fastest download peer selected
• BNBW (Bottleneck Bw):
• Nettimer (a packet pair scheme) used; peer with largest BNBW is selected
Summary so far…
For best application performance: adapt dynamically to both latency and bandwidth metrics
Bandwidth-Latency performs comparably to IP Multicast (Sequential-Unicast)
What is the network cost and overhead?
Resource Usage (RU)
Captures consumption of network resource of overlay tree
Overlay link RU = propagation delay
Tree RU = sum of link RU
[Figure: two overlay trees over CMU, U. Pitt, and UCSD: an efficient tree with links of 40ms and 2ms (RU = 42ms) and an inefficient tree with two 40ms links (RU = 80ms)]
Scenario: Primary Set, 1.2 Mbps (normalized to IP Multicast RU)
Scheme               Normalized RU
IP Multicast         1.0
Bandwidth-Latency    1.49
Random               2.24
Naïve Unicast        2.62
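The RU definition amounts to summing propagation delays over the tree's links. The sketch below reproduces the two totals from the figure; which pair of hosts each link connects is an assumption made for illustration (the extracted figure only preserves the delays), and the function names are invented.

```python
def tree_resource_usage(links):
    """Tree RU: sum of the propagation delays (ms) of its overlay links."""
    return sum(links.values())


def normalized_ru(tree_ru, baseline_ru):
    """RU relative to a baseline such as the IP Multicast tree."""
    return tree_ru / baseline_ru


# Assumed link endpoints; the delays are the ones shown in the figure.
efficient = {("UCSD", "CMU"): 40, ("CMU", "U. Pitt"): 2}
inefficient = {("UCSD", "CMU"): 40, ("UCSD", "U. Pitt"): 40}

print(tree_resource_usage(efficient))    # 42
print(tree_resource_usage(inefficient))  # 80
```

The inefficient tree crosses the long-haul path twice, which is what the metric is meant to penalize.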
Protocol Overhead
Protocol overhead = total non-data traffic (in bytes) / total data traffic (in bytes)
Results: Primary Set, 1.2 Mbps
- Average overhead = 10.8%
- 92.2% of overhead is due to bandwidth probe
Current scheme employs active probing for available bandwidth
- Simple heuristics to eliminate unnecessary probes
- Focus of our current research
Contribution
First detailed Internet evaluation to show the feasibility of End System Multicast architecture
- Study in context of a/v conferencing
- Performance comparable to IP Multicast
Impact of metrics on overlay performance
- For best performance: use both latency and bandwidth
More info: http://www.cs.cmu.edu/~narada