Understanding the Performance of Web Caching System with an Analysis Model and Simulation
Xiaosong Hu, Nur Zincir-Heywood
Sep 20 2003
Outline
• Web Cache Background
• Cooperating Web Caching System
• The model of the Hierarchical System
• Performance Analysis
• Simulation
• Comparison
• Conclusion
Web Cache Background
• The WWW has become the dominant application of the Internet
• Demand for bandwidth outstrips the supply
• Congestion and server overloading
• Web Caching --- storing some popular pages somewhere close to the clients
• Browser caching and proxy caching
Cooperating Web Caching System
• Can we configure the proxy cache servers to query each other first, before going to the source server?
• Based on research so far, web caching systems can be divided into
--- Hierarchical System
--- Distributed System
--- Hybrid System
Hierarchical System
[Figure: hierarchical caching tree. A client request passes through the institutional cache, the regional cache, and the national cache on its way to the source server, and the response returns along the same path. Legend: router without associated cache server; router with associated cache server.]
** The figure shows the network system with O = 2, h = 2.
Performance Analysis ---Parameters---
• Hit Ratio: The probability that a requested document can be retrieved from the system
• Latency: The average time to retrieve a document from the Internet
• Traffic: The average traffic at each link over a time unit
• Average Hop: The average distance for a request to be satisfied
Performance Analysis ---Assuming---
No.  Parameter name                                               Parameter value
1    Nodal outdegree of the tree (O)                              3
2    Hops between two neighboring levels of cache (h)             2
3    Hops from the national cache to the source server (z)        10 [10]
4    Average request rate from the clients (βI)                   2 requests/sec
5    Total document number                                        1 million
6    Average document update time (∆)                             12 h
7    Skew factor of Zipf distribution (α)                         0.64 [9]
8    Cache content updating function                              LFU
Performance Analysis ---Formula for Hit Ratio---
• The most popular Ci documents will be cached.
• A request is a hit only if it is for a document in Ci and its interval is less than ∆.
• Hit Ratio: $H = \sum_{i=1}^{C_i} \left( P_N(i) \, P(L) \right)$
--- PN(i) is the probability that a request is for document i
--- P(L) is the probability that the request for document i arrives within ∆
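Performance Analysis ---PN(i) & P(L)---
• The probability of a request for document i follows a Zipf-like distribution:
$P_N(i) = \dfrac{\Omega}{i^{\alpha}}, \qquad \Omega = \left( \sum_{i=1}^{N} \dfrac{1}{i^{\alpha}} \right)^{-1}$
• The arriving requests follow a Poisson distribution. Within a short time $\tau$:
$P(L \mid \tau) = 1 - e^{-\beta_{I,i}\,\tau}$
$P(L) = \dfrac{1}{\Delta} \int_{0}^{\Delta} P(L \mid \tau)\, d\tau = 1 - \dfrac{1}{\beta_{I,i}\,\Delta} \left( 1 - e^{-\beta_{I,i}\,\Delta} \right)$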
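The Zipf-like popularity, the Poisson term P(L), and the hit-ratio sum can be evaluated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that the per-document request rate is the client rate scaled by popularity, β(I,i) = βI · PN(i), which the slides do not state. The function names and the reduced document count in the usage lines are illustrative only.

```python
import math


def zipf_probabilities(n_docs, alpha):
    """Return [P_N(1), ..., P_N(n_docs)] with P_N(i) = Omega / i**alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    omega = 1.0 / sum(weights)  # normalising constant Omega
    return [omega * w for w in weights]


def p_within_delta(rate, delta):
    """P(L) = 1 - (1 - exp(-rate * delta)) / (rate * delta)."""
    x = rate * delta
    if x == 0:
        return 0.0  # no requests at all, so no request falls within delta
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x
    return 1.0 - (-math.expm1(-x)) / x


def hit_ratio(cache_size, n_docs, alpha, beta_total, delta):
    """Sum P_N(i) * P(L) over the cache_size most popular documents.

    ASSUMPTION: the rate for document i is beta_total * P_N(i).
    """
    probs = zipf_probabilities(n_docs, alpha)
    return sum(p * p_within_delta(beta_total * p, delta)
               for p in probs[:cache_size])


if __name__ == "__main__":
    # alpha, the request rate, and delta follow the parameter table;
    # the document count is reduced here to keep the run short.
    delta_seconds = 12 * 3600
    for size in (100, 1_000, 10_000):
        h = hit_ratio(size, 100_000, 0.64, 2.0, delta_seconds)
        print(size, round(h, 4))
```

Because both factors in the sum are non-negative, the computed hit ratio can only grow as the cache size grows, which is the behaviour the model is meant to quantify.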
Performance Analysis ---average hit ratio---
• Institutional cache (i): H(i) = H(ii) + H(ir) + H(in)
• Regional cache (r): H(r) = H(rr) + H(rn)
• National cache (n): H(n) = H(nn)
• Average: $H_{average} = \dfrac{H(i) \cdot O^{2h} + H(r) \cdot O^{h} + H(n)}{O^{2h} + O^{h} + 1}$
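The averaging formula can be checked with a few lines of arithmetic. This sketch reads the weights O^(2h), O^h, and 1 as the number of institutional, regional, and national caches in the tree, which is an interpretation rather than something the slide states. The function names are hypothetical.

```python
def cache_counts(outdegree, hops):
    """Weights used in the average, read here as caches per level."""
    return {
        "institutional": outdegree ** (2 * hops),
        "regional": outdegree ** hops,
        "national": 1,
    }


def average_hit_ratio(h_inst, h_reg, h_nat, outdegree, hops):
    """[H(i)*O^(2h) + H(r)*O^h + H(n)] / (O^(2h) + O^h + 1)."""
    counts = cache_counts(outdegree, hops)
    weighted = (h_inst * counts["institutional"]
                + h_reg * counts["regional"]
                + h_nat * counts["national"])
    return weighted / sum(counts.values())


if __name__ == "__main__":
    # O = 3 and h = 2 come from the parameter table; the three per-level
    # hit ratios below are placeholder inputs, not results from the slides.
    print(cache_counts(3, 2))
    print(average_hit_ratio(0.4, 0.5, 0.6, 3, 2))
```

With O = 3 and h = 2 the institutional level carries 81 of the 91 weight units, so the average is dominated by H(i).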
Simulation
• The web cache simulator
--- Input & Output Component
--- Simulation Component
--- Network Cache System
• Data Sets:
--- half a million, one million, and two million requests
--- login file
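To make the idea of a cache simulation concrete, here is a much smaller stand-in for the simulator described above. It is a sketch under several assumptions. It models a single cache rather than the hierarchy, it draws synthetic Zipf-like requests instead of reading a recorded data set, it ignores document expiry (∆), and it uses a simple LFU rule in which a new document replaces the least frequently requested cached one only if it has been requested more often. None of these details are taken from the authors' simulator.

```python
import random
from collections import Counter


def simulate_lfu_hit_ratio(n_docs, alpha, cache_size, n_requests, seed=0):
    """Replay synthetic Zipf-like requests against one LFU cache."""
    if cache_size <= 0 or n_requests <= 0:
        return 0.0
    rng = random.Random(seed)  # seeded so that runs are repeatable
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    requests = rng.choices(range(n_docs), weights=weights, k=n_requests)

    freq = Counter()  # request count per document seen so far
    cached = set()    # documents currently held in the cache
    hits = 0
    for doc in requests:
        freq[doc] += 1
        if doc in cached:
            hits += 1
            continue
        if len(cached) < cache_size:
            cached.add(doc)  # free slot: admit the document
        else:
            # Evict the least frequently requested cached document, but
            # only when the newcomer has been requested more often.
            victim = min(cached, key=freq.__getitem__)
            if freq[doc] > freq[victim]:
                cached.remove(victim)
                cached.add(doc)
    return hits / n_requests


if __name__ == "__main__":
    for size in (10, 100, 500):
        print(size, simulate_lfu_hit_ratio(2_000, 0.64, size, 50_000))
```

The linear scan for the eviction victim keeps the sketch short; a simulator handling millions of requests would more likely use a priority structure.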
Comparison
Conclusion
• The hit ratio is a logarithmic function, or a function with a small power, of the cache size
• The results from the model are compatible with those from the simulation
• The model can be used to further analyze the cache system mechanism