Client/Server Model Hides Many Files on the Internet
2 × 10^18 bytes/year are generated on the Internet.
Only 3 × 10^12 bytes/year are available to the public via servers and search engines (0.00015%).
Google searches only 1.3 × 10^8 Web pages.
(source: IEEE Internet Computing, 2001)
Peer-to-Peer Aims at File Sharing without Central Controls
Each peer is both a client (consumer) and a server (producer).
Peers have the freedom to join and leave at any time.
Huge peer diversity: service ability, storage space, network bandwidth, and service demand.
A widely decentralized system built on top of the physical networks (overlay networks).
Objectives and Benefits of P2P
As long as there is no physical break in the network, a targeted file will always be found (strong and massive connectivity).
Adding more content to P2P further improves its performance (information scalability).
Peers' coming and going in P2P will not affect its performance (robustness).
"Troubles" in P2P File Sharing
Copyright violations: many ongoing lawsuits against P2P file sharing.
Inefficient bandwidth usage: a huge amount of traffic can be generated (many large universities have to prohibit P2P activities from time to time).
Information leaking: easily accessible files can be abused (many government agencies prohibit P2P activities for this reason).
Relaxed security protection: viruses and malicious code can spread easily and quickly via P2P networks.
Non-deterministic Factors Hinder P2P R&D
Mutual trusts are not guaranteed.
Self-organization lacks coordination.
Dynamics of peers are not predictable.
Service qualities of peers are unknown.
P2P R&D has lost momentum and become less significant due to this inherent non-determinism.
P2P Resource Sharing in Deterministically Dynamic Environment
A P2P environment with high non-determinism will not attract serious applications.
R&D efforts on such an environment would be neither productive nor high-impact.
We argue that improving P2P performance in a deterministically dynamic environment is much more meaningful and useful:
P2P resource sharing in Internet/distributed systems with minimal unpredictable human interaction.
Outline
Overview of resources in distributed systems
What is the source of unbalanced supply/demand?
How can P2P effectively balance supply/demand?
Case studies:
  Network RAM
  Cooperative caching in network file systems
  Multilevel caching in distributed databases
  Proxy caching for streaming media
Conclusion
Major Resources in Computer and Distributed Systems
CPU cycles: very rich, oversupplied for most apps.
Cache capacity: always limited.
Memory bandwidth: improved dramatically.
Memory capacity: increasingly large and low cost.
I/O bandwidth: improved dramatically.
Disk capacity: huge and cheap.
Cluster and Internet bandwidths: very rich.
Improvement of data access latencies lags behind.
CPU-DRAM Gap is No Longer the Major Bottleneck
[Chart: CPU vs. DRAM performance, 1980-2004, log scale; CPU speed improves about 50% per year while DRAM lags far behind.]
• Cache optimization alone cannot close the gap.
• Limited cache capacity cannot hold the working sets of data-intensive applications.
• Caches are already highly efficient, leaving little space for improvement.
• Cache miss latency has been reduced to 50-90 ns.
• Memory with a large capacity becomes the working place for fast data accesses.
Device Name         1980 (ns)     2000 (ns)    Improvement
CPU Cycle Time      1,000         1.6          625.00x
SRAM Access Time    300           2            150.00x
DRAM Access Time    375           60           6.25x
Disk Seek Time      87,000,000    8,000,000    10.87x
Latency Gaps Among CPU, Cache, DRAM, and Disk
[Chart: latencies of CPU cycle, SRAM access, DRAM access, and disk seek in ns, 1980-2000, log scale.]
[Chart: latencies of SRAM, DRAM, and disk seek measured in CPU cycles, 1980-2000; the disk seek grows from 87,000 cycles in 1980 to 5,000,000 cycles in 2000.]
Limited by its mechanical components, the disk's performance seriously lags behind the CPU and memory. In 1980 one disk seek cost 87,000 CPU cycles; in 2000 one disk seek cost over 5,000,000 cycles. Relative to the CPU, the disks of 2000 are more than 57 times "SLOWER" than their ancestors of 1980.
Quantifying the I/O Bottleneck
Disk access time is limited by mechanical delays. For a fast Seagate Cheetah X15 disk (15,000 rpm):
  Average seek time: 3.9 ms; rotational latency: 2 ms; internal transfer time for a stripe unit (8 KB): 0.16 ms.
  Total disk latency: 6.06 ms.
External transfer rate increases 40% per year; from disk to DRAM: 160 MB/s (UltraSCSI I/O bus).
Getting 8 KB from disk to DRAM takes 6.11 ms: 10,000+ times slower than the same transfer from DRAM (90 ns for a 128 B cache-line miss), or 12 million+ cycles of a 2 GHz CPU.
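The arithmetic above can be checked in a few lines; the figures are the slide's, and the 2 GHz clock is the one the slide assumes:

```python
# Reproducing the slide's arithmetic for one 8 KB read from a Seagate
# Cheetah X15 (figures from the slide; 2 GHz CPU as stated).
SEEK_MS = 3.9            # average seek time
ROTATION_MS = 2.0        # rotational latency at 15,000 rpm
TRANSFER_MS = 0.16       # internal transfer of an 8 KB stripe unit
BUS_BYTES_PER_S = 160e6  # UltraSCSI disk-to-DRAM transfer rate

disk_ms = SEEK_MS + ROTATION_MS + TRANSFER_MS   # 6.06 ms on the disk
bus_ms = 8 * 1024 / BUS_BYTES_PER_S * 1e3       # ~0.05 ms over the bus
total_ms = disk_ms + bus_ms                     # ~6.11 ms end to end

dram_miss_ns = 90                               # 128 B cache-line miss
slowdown = total_ms * 1e6 / dram_miss_ns        # well over 10,000x
cpu_cycles = total_ms * 1e-3 * 2e9              # cycles at 2 GHz

print(f"total latency: {total_ms:.2f} ms")
print(f"{slowdown:,.0f}x slower than a DRAM miss")
print(f"{cpu_cycles / 1e6:.1f} million CPU cycles")
```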
Data Communication in Computer Systems
[Diagram: data transfer from a source to a destination, showing the transfer (bandwidth) time and the latency time.]
Client-perceived latency reduction is still limited due to the imbalanced improvement of bandwidth and latency.
Latency Lags Bandwidth (CACM, Patterson)
• In the last 20 years: 100-2000x improvement in bandwidth, but only 5-20x improvement in latency.
Between CPU and on-chip L2 cache: bandwidth 2250x increase, latency 20x reduction.
Between L3 cache and DRAM: bandwidth 125x increase, latency 4x reduction.
Between DRAM and disk: bandwidth 150x increase, latency 8x reduction.
Between two nodes via a LAN: bandwidth 100x increase, latency 15x reduction.
Internet Latency
North America, coast to coast: CalTech -- Duke
Asia, across the Pacific: Taiwan -- Pittsburgh
(source: NLANR)
RTT has been reduced little, although Internet bandwidth has increased a lot.
Internet Bandwidth
(source: Nielsen//NetRatings)
More and more people connect to the Internet through broadband; around 70% of US homes are expected to use broadband this year.
Internet Caching and Replication
Caching (client, proxy, server):
  + reduces both latency and bandwidth
  - limited performance and potential staleness
Prefetching (client/peer):
  + reduces latency effectively
  + reduces network congestion
  - increases server overhead
  - complicated implementation
Replication (CDN surrogate):
  + reduces server congestion and latency
  - dedicated machines are expensive
Making a Balanced System
To cope with the bandwidth-latency imbalance, we must build buffers to exploit locality wherever necessary. Three effective techniques:
  Caching: reuse data in a relatively close place.
  Replication: utilize large memory/storage capacity.
  Prefetching: utilize rich bandwidth to hide latency.
Data sharing via interconnection networks can be faster than local disk accesses (e.g., SGI NUMALink transfer time of a data block: 50 ns, faster than a local memory access).
Effective caching techniques are critical to the success.
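Of the three techniques, caching is the one the rest of the talk leans on most. A minimal LRU buffer sketch (capacity and keys are illustrative, not from the talk) shows the reuse-in-a-close-place idea:

```python
from collections import OrderedDict

# Minimal LRU buffer cache: keep recently reused data close, evict the
# least recently used block when capacity is exceeded.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # miss: caller fetches from the slower level
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used block

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" becomes most recently used
cache.put("c", 3)      # evicts "b", the least recently used
print(cache.get("b"))  # None: a miss
print(cache.get("a"))  # 1: a hit
```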
Where are Buffers in the Deep Memory Hierarchy?
[Diagram: buffers at every level of the hierarchy: CPU registers, TLB, L1/L2/L3 caches, DRAM row buffer, bus-adapter and controller buffers, the OS buffer cache, the I/O controller buffer, and the disk cache; they are managed by algorithm implementations, the compiler, the microarchitecture, and the operating system.]
Outline
Overview of resources in distributed systems
What is the source of unbalanced supply/demand?
How can P2P effectively balance supply/demand?
Case studies:
  Network RAM
  Cooperative caching in network file systems
  Multilevel caching in distributed databases
  Proxy caching for streaming media
Conclusion
Distributed Memory Hierarchies: P2P in Cluster Systems (Network RAM)
[Diagram: cluster nodes CPU1-CPU3 with local disks connected by a network; a job whose working set overflows local memory pages to idle remote memory (remote paging) instead of paging to the local disk.]
Issues and Challenges of Network RAM
Memory-based resource allocations.
Ownership and autonomy of node memory.
Global memory management (centralized or decentralized?).
Network traffic reduction.
Proximity and location optimization for latency.
Adopting P2P in operating systems for memory resource sharing.
P2P Memory Sharing in Network RAM
Each node contributes its own memory space but also uses memory in other nodes (producer/consumer).
Each node reliably stays connected (low dynamics).
Low node diversity (predictable service quality).
Nodes can be connected physically or by overlays.
P2P searching for memory resources:
  Flooding based search (TPDS, 05).
  DHT based: idle memory registration / remote searching.
  Directory based: for small scales without overlays.
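The DHT option above can be sketched as a consistent-hash ring on which nodes register their idle memory; the node names, sizes, and the fallback rule below are illustrative assumptions, not details from the talk:

```python
import hashlib
from bisect import bisect_right

def ring_key(name):
    """Hash a node name or page ID onto the DHT ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

class MemoryDHT:
    def __init__(self, nodes):
        self.ring = sorted((ring_key(n), n) for n in nodes)
        self.idle = {}                     # node -> advertised idle MB

    def responsible(self, item):
        """The first node clockwise of the item's key owns its registration."""
        keys = [k for k, _ in self.ring]
        i = bisect_right(keys, ring_key(item)) % len(self.ring)
        return self.ring[i][1]

    def register(self, node, idle_mb):
        self.idle[node] = idle_mb          # node advertises idle memory

    def find_host(self, page_id, need_mb):
        """Prefer the responsible node; fall back to any donor with room."""
        target = self.responsible(page_id)
        if self.idle.get(target, 0) >= need_mb:
            return target
        return next((n for n, mb in self.idle.items() if mb >= need_mb), None)

dht = MemoryDHT(["node-1", "node-2", "node-3"])
dht.register("node-2", 512)                # only node-2 has idle memory
print(dht.find_host("page-42", 64))        # node-2
```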
Outline
Overview of resources in distributed systems
What is the source of unbalanced supply/demand?
How can P2P effectively balance supply/demand?
Case studies:
  Network RAM
  Cooperative caching in network file systems
  Multilevel caching in distributed databases
  Proxy caching for streaming media
Conclusion
Distributed Memory Hierarchy: P2P Caching in Network File Systems
I/O data is cached in DRAM: buffer caches.
Cooperative caching: sharing buffer caches in NFS.
Cooperation principle:
  Keep data with strong locality in the local buffer cache.
  Store data to be used sometime later in a remote cache.
  Replace data used neither locally nor remotely.
Challenges and issues:
  Decisions on keeping, forwarding, and replacing data.
  Consistency for duplicated data.
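The cooperation principle reduces to a per-eviction decision; a hypothetical sketch (the reuse score, threshold, and Peer class are illustrative assumptions) is:

```python
# Toy remote buffer cache with a fixed number of block slots.
class Peer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []

    def has_space(self):
        return len(self.blocks) < self.capacity

    def store(self, block):
        self.blocks.append(block)

def on_local_eviction(block, peers, reuse_score, threshold=0.5):
    """Decide the fate of a block evicted from the local buffer cache."""
    if reuse_score >= threshold:
        # Likely to be used sometime later: forward to a peer with space.
        for peer in peers:
            if peer.has_space():
                peer.store(block)
                return "forwarded"
    # Weak locality both locally and remotely: replace (disk has a copy).
    return "replaced"

peers = [Peer(0), Peer(2)]                                 # first peer is full
print(on_local_eviction("blk-1", peers, reuse_score=0.9))  # forwarded to peers[1]
print(on_local_eviction("blk-2", peers, reuse_score=0.1))  # replaced
```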
Cooperative P2P Caching in NFS
[Diagram 1: when a node's buffer cache overflows, replaced blocks are forwarded over the network to peer caches with free space instead of going to disk.]
[Diagram 2: the most frequently used data is accessed locally; other data is accessed remotely from peer caches; data no longer used locally or remotely is replaced.]
P2P Management in NFS
A node distinguishes the locality strengths of its (1) replaced, (2) forwarded, and (3) kept blocks.
P2P searching for idle buffer cache space:
  Flooding based.
  DHT based.
  Directory based.
Adapting to the demands of VM and buffer cache:
  Return space back to owners under high VM pressure.
  A memory load index in each node is updated periodically.
Outline
Overview of resources in distributed systems
What is the source of unbalanced supply/demand?
How can P2P effectively balance supply/demand?
Case studies:
  Network RAM
  Cooperative caching in network file systems
  Multilevel caching in distributed databases
  Proxy caching for streaming media
Conclusion
Distributed Memory Hierarchy: P2P in Multi-Level Buffer Caching
[Diagram: clients connect through the network to a front-tier server, an end-tier server, and a disk array, forming a multi-level buffer cache hierarchy.]
Challenges to Improve Multi-Level Caching Performance
[Diagram: independent LRU caches at levels L1-L4; hit rates concentrate at L1 while the lower levels contribute little.]
(1) Can the hit rate of hierarchical caches achieve the hit rate of a single first-level cache whose size equals the aggregate size of the hierarchy?
(2) Can we make caches close to clients contribute more to the hit rate?
P2P Coordination in Multilevel Buffer Caching
Locality strength index exchange among levels:
  Directory based (Jiang et al., ICDCS'04).
Dynamic data moving based on locality strengths:
  Store data with strong locality close to clients.
  The server can replace data with weak locality at any level.
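The placement rule can be sketched in a few lines, with the exchanged locality index simplified to a per-block score; block IDs, scores, and level sizes are illustrative assumptions:

```python
# Locality-driven placement across cache levels, with level 0 (L1 in the
# slide) closest to clients: the hottest blocks go to the upper levels so
# they contribute more to the aggregate hit rate.
def place_blocks(blocks, level_sizes):
    ranked = sorted(blocks, key=lambda b: b["score"], reverse=True)
    levels, start = [], 0
    for size in level_sizes:
        levels.append([b["id"] for b in ranked[start:start + size]])
        start += size
    return levels

blocks = [
    {"id": "a", "score": 9},   # strongest locality
    {"id": "b", "score": 1},
    {"id": "c", "score": 5},
    {"id": "d", "score": 7},
]
print(place_blocks(blocks, [2, 2]))   # [['a', 'd'], ['c', 'b']]
```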
Outline
Overview of resources in distributed systems
What is the source of unbalanced supply/demand?
How can P2P effectively balance supply/demand?
Case studies:
  Network RAM
  Cooperative caching in network file systems
  Multilevel caching in distributed databases
  Proxy caching for streaming media
Conclusion
Distributed Memory Hierarchy: P2P-Assisted Proxy Caching
[Diagram: server, intermediary, client; the intermediary reduces response time to the client, network traffic, and the server's load.]
Existing Internet Status
Servers: media objects with very large sizes and very rigorous real-time delivery constraints (small startup latency, continuous delivery).
Intermediaries: a large number of proxies with disk, memory, and CPU cycles.
Clients: diverse access devices (computers, PDAs, cell phones).
A Missing Component in Proxy Caching
Current streaming media delivery status (WWW'05):
  Client-server model (poor scalability, heavy traffic): downloading (about 90%); many media servers limit downloading to protect copyrights.
  CDNs (very expensive, idle resources): dedicated services.
• Why not leverage the existing proxy resources?
P2P-Assisted Streaming Proxy (Guo et al., ICDCS'04)
[Diagram: a media server on the Internet, a media proxy behind the firewall, and an intranet P2P overlay of DHT peers (a Content Addressable Network).]
System Components
Streaming proxy:
  Interface between the system and media servers.
  Bootstrap site of the system.
P2P overlay of users, in which each peer is:
  A client.
  A streaming server.
  An index server and router.
[Diagram: the media proxy fetches media data from the media server, caches media objects by segment, serves media data to clients, and acts as the bootstrap site for newly joining clients.]
Peer as a Streaming Server
[Diagram: a peer receives requests and streams media data from its local cache to clients; its segment index maps segment IDs to metadata and pointers to the serving peers.]
Peer as an Index Server/Router
[Diagram: for each lookup key, a peer checks whether the segment ID falls in its own DHT key space; if yes, it answers from its index; if no, it forwards the request via its routing table.]
Basic Operations
Publishing and unpublishing media segments: publish(segment_id, location), unpublish(segment_id, location).
Requesting and serving media segments: request(segment_id, URL).
Getting and updating segment metadata: update_info(segment_id, data), get_info(segment_id).
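A hypothetical sketch of the per-peer segment index behind these operations follows; the field names and the simplified request() (which drops the URL argument) are assumptions, not the system's actual interfaces:

```python
# Per-peer index over the segments in this peer's DHT key space:
# segment_id -> serving locations plus segment metadata.
class SegmentIndex:
    def __init__(self):
        self.entries = {}    # segment_id -> {"locations": set, "meta": dict}

    def _entry(self, segment_id):
        return self.entries.setdefault(
            segment_id, {"locations": set(), "meta": {}})

    def publish(self, segment_id, location):
        self._entry(segment_id)["locations"].add(location)

    def unpublish(self, segment_id, location):
        if segment_id in self.entries:
            self.entries[segment_id]["locations"].discard(location)

    def request(self, segment_id):
        """Return some peer currently serving the segment, or None."""
        locations = self.entries.get(segment_id, {}).get("locations", set())
        return next(iter(locations), None)

    def update_info(self, segment_id, data):
        self._entry(segment_id)["meta"].update(data)

    def get_info(self, segment_id):
        return self.entries.get(segment_id, {}).get("meta", {})

idx = SegmentIndex()
idx.publish("seg-1", "peerA")
idx.publish("seg-1", "peerB")
idx.update_info("seg-1", {"length_s": 120})
idx.unpublish("seg-1", "peerB")
print(idx.request("seg-1"))      # peerA is the only serving peer left
```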
Peer Serves Streaming
[Diagram: a request is routed over the DHT overlay to the indexing peer, which returns a serving peer; after confirming the peer is ready, the client streams from it point to point.]
Proxy Fetches Data
[Diagram: when overlay routing finds no peer holding the segment (a NULL index entry), the indexing peer asks the proxy to fetch the data from the media server and publishes the new location.]
Conclusion
P2P R&D should be done in a trusted domain.
P2P has been quietly and inevitably applied in computer systems design at all levels.
P2P memory/storage sharing can be very effective: latency is reduced by utilizing global memory.
Decentralized data and memory management:
  Autonomy and resource ownership of each node.
  Data consistency for duplications.
  Decisions on keeping, forwarding, and elimination.
P2P is becoming a principle in systems design.