Dynamic Multicast Tree Construction in OceanStore
Puneet Mehra and Satrajit Chatterjee
Advanced Topics in Computer Systems Final Project
EECS Department, CS Division, University of California, Berkeley
Motivation
• The Inner Ring serializes all client modifications
• Clients can sacrifice consistency for read performance
• Secondary Replica servers provide loosely-consistent cached copies of data
• Updates flow from the Inner Ring to Secondary Replicas
• Link Replicas to the Inner Ring using D-trees
• The Problem: Given a set of nodes that want updates, we must form an efficient tree to deliver updates from the root to all other nodes.
Design Heuristics
• Adaptation
  – OceanStore is large and must be self-maintaining
    • Nodes may come and go
    • Links may get congested
  – We adapt the d-tree to optimize for a given metric (latency or bandwidth)
• Awareness of Network Topology
  – Not all OceanStore nodes are created equal
    • Servers in the network core may have more resources than clients
    • The underlying network (Internet) has a natural hierarchy
  – D-Tree structured to exploit knowledge of the network
    • Place replicas to reduce network resource utilization
Adaptation Mechanism
1. Each node periodically probes its siblings and grandparent. (In the figure, B is probing.)
2. A node switches parents if doing so yields at least a 10% improvement in the chosen metric (e.g., latency or bandwidth). (B has switched parents to C.)
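The parent-switching rule above can be sketched as follows. This is a minimal illustration with hypothetical names (`best_new_parent`, the `latency` map); the actual D-tree implementation is not shown in the slides.

```python
# Sketch of the parent-switching heuristic (names assumed, not from the
# slides). Each node probes its siblings and grandparent and adopts a new
# parent only when the measured metric improves by at least 10%.

SWITCH_THRESHOLD = 0.10  # required relative improvement before switching

def best_new_parent(current_parent, candidates, latency):
    """Return a candidate whose latency beats the current best by >= 10%,
    or None if no candidate clears the threshold.

    `latency` maps a node to the latency measured through that node."""
    best = current_parent
    for node in candidates:  # the probed siblings and grandparent
        if latency[node] < latency[best] * (1 - SWITCH_THRESHOLD):
            best = node
    return None if best is current_parent else best

# Example: B's current parent offers 100 ms; sibling C offers 80 ms,
# the grandparent 95 ms. Only C clears the 10% threshold.
lat = {"parent": 100.0, "C": 80.0, "grandparent": 95.0}
print(best_new_parent("parent", ["C", "grandparent"], lat))  # -> C
```

The 10% hysteresis keeps nodes from oscillating between parents whose metrics differ only by measurement noise.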
Exploiting Network Topology
• Model the Internet as a transit-stub network.
• Data enters a stub domain through its stub nodes; transit nodes pass data between domains.
• Placing Replicas at transit or stub nodes can decrease network utilization.
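The placement preference above can be reduced to a tiny sketch. The ranking and `pick_parent` helper are hypothetical, not part of OceanStore's API; the point is only that a new receiver should attach under a transit node where possible, then a stub node, so shared updates cross expensive inter-domain links once rather than per receiver.

```python
# Minimal sketch (assumed names) of topology-aware attachment: prefer a
# parent at a transit node, then a stub node, falling back to ordinary hosts.

RANK = {"transit": 0, "stub": 1, "host": 2}  # lower rank = preferred placement

def pick_parent(candidates):
    """candidates: list of (node_name, node_kind) pairs."""
    return min(candidates, key=lambda c: RANK[c[1]])[0]

print(pick_parent([("h3", "host"), ("s2", "stub"), ("t1", "transit")]))  # -> t1
```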
Naïve Algorithm
[Figure: dissemination tree built by the naïve algorithm over the transit-stub topology (root R, transit nodes T, stub nodes S)]
Tapestry Algorithm
[Figure: dissemination tree built by the Tapestry algorithm over the same transit-stub topology]
Transit-Stub Algorithm
[Figure: dissemination tree built by the transit-stub algorithm over the same transit-stub topology]
Simulation Framework
• Simulator
  – Built on top of ns-2
  – Used a static Tapestry implementation
• Network Characteristics
  – 196 nodes: 4 transit nodes, 24 stub nodes; domain size varied
  – 10 Mb/s links between nodes within a stub domain
  – 45 Mb/s links between stub and transit nodes
  – 100 Mb/s links between all transit nodes
• D-tree Architecture
  – D-tree structure maintained through heartbeats
  – Cycles in the tree detected using a timestamp from the root
• Workload (best-effort data delivery)
  1. Single-source "streaming" application: one 3000-byte data packet per second; a fraction of the nodes joined the tree in random order
  2. Multiple-producer model: one 3000-byte update every 4 seconds
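The timestamp-based cycle detection mentioned in the D-tree bullets can be sketched roughly as below. The details (class name, timeout value) are assumptions, not from the slides: the root stamps each heartbeat and nodes forward it downstream, so a node caught in a cycle is cut off from the root and stops seeing fresh timestamps.

```python
# Sketch (details assumed) of cycle detection via root timestamps: a node
# whose last-seen root timestamp stops advancing concludes it is in a cycle
# or partition and should rejoin the tree.

HEARTBEAT_TIMEOUT = 5.0  # seconds without a fresher root timestamp

class TreeNode:
    def __init__(self, now):
        self.last_root_ts = now

    def on_heartbeat(self, root_ts):
        # Only a strictly newer root timestamp counts as progress.
        if root_ts > self.last_root_ts:
            self.last_root_ts = root_ts

    def suspects_cycle(self, now):
        return now - self.last_root_ts > HEARTBEAT_TIMEOUT

n = TreeNode(0.0)
n.on_heartbeat(3.0)
print(n.suspects_cycle(4.0), n.suspects_cycle(9.0))  # -> False True
```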
Average Bandwidth
• All algorithms perform well for few receivers
• 16.54% difference between best and worst for a large receiver set
• The topology-aware algorithms almost provide the full streaming rate
• Adaptive algorithms maintain performance regardless of receiver-set size
Dissemination Efficiency
• Efficiency = (Σ data update bytes received at nodes) / (total network utilization)
• Adaptation provides better network utilization
• 4x improvement using both topology and adaptation heuristics
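The efficiency metric defined above is a straightforward ratio; a one-function sketch makes the denominator explicit (every byte a link carries, including duplicate traversals of shared links, counts against it). The numbers in the example are hypothetical.

```python
# Dissemination efficiency as defined on the slide: useful bytes delivered
# to receivers divided by total bytes placed on the network.

def dissemination_efficiency(bytes_received_per_node, total_network_bytes):
    return sum(bytes_received_per_node) / total_network_bytes

# Hypothetical example: 3 receivers each get a 3000-byte update that
# traversed 5 links in total, so 15000 bytes of network utilization.
print(dissemination_efficiency([3000, 3000, 3000], 5 * 3000))  # -> 0.6
```

A perfectly efficient tree would approach 1.0 only if every link carried bytes destined for a receiver below it exactly once; redundant traversals push the ratio down.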
Multiple Producer Workload
• Experimental Setup
  – 10 experiments: 4 randomly selected nodes generate 3000-byte updates every 4 seconds
  – Done using Adaptive Transit-Stub with 60% of nodes joining
  – Each node keeps a time-ordered list of "tentative" updates
• Results of Using Tentative Updates
  – Benefits: 10% improvement in update latency
  – Costs: 0.2% increase in network utilization
  – Localized tentative ordering matched Inner Ring serialization
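The per-node time-ordered list of tentative updates can be sketched with a sorted insert. The class name and structure are assumptions for illustration: updates are applied optimistically in timestamp order, and the slides report that this localized ordering matched the Inner Ring's eventual serialization.

```python
import bisect

# Sketch (structure assumed) of a node's tentative-update log: updates are
# kept sorted by timestamp so they can be applied optimistically in order
# and later reconciled against the Inner Ring's serialized order.

class TentativeLog:
    def __init__(self):
        self._log = []  # sorted list of (timestamp, update) pairs

    def add(self, timestamp, update):
        bisect.insort(self._log, (timestamp, update))  # O(n) sorted insert

    def order(self):
        return [u for _, u in self._log]

log = TentativeLog()
log.add(4.0, "u2")   # arrives first, but was produced later
log.add(1.0, "u1")
print(log.order())  # -> ['u1', 'u2']
```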