Page 1: Objective-Optimal Algorithms for Long-term Web Prefetching

1

Objective-Optimal Algorithms for Long-term Web Prefetching

Bin Wu & Ajay Kshemkalyani

Dept. of Computer Science, Univ. of Illinois at Chicago

[email protected]

Page 2: Objective-Optimal Algorithms for Long-term Web Prefetching

2

Outline

• Problem definition and background
• Web prefetching algorithms
• Performance metrics
• Objective-Greedy algorithms (O(n) time)
  – Hit rate greedy (also hit rate optimal)
  – Bandwidth greedy (also bandwidth optimal)
  – H/B greedy
• H/B-Optimal algorithm (expected O(n) time)
• Simulation results
• Conclusions

Page 3: Objective-Optimal Algorithms for Long-term Web Prefetching

3

Introduction

• Web caching reduces user-perceived latency
  – Client-server mode
  – Bottleneck occurs at the server side
  – Means of improving performance: local cache, proxy server, server farm, etc.
  – Cache management: LRU, Greedy Dual-Size, etc.
• On-demand caching vs. (long-term) prefetching
  – Prefetching is effective in dynamic environments.
  – Clients subscribe to web objects.
  – The server "pushes" fresh copies into web caches.
  – Selection of prefetched objects is based on long-term statistical characteristics, maintained by the content distribution servers (CDS).

Page 4: Objective-Optimal Algorithms for Long-term Web Prefetching

4

Introduction

• Web prefetching
  – Caches web objects in advance
  – Updated by web server
  – Reduces retrieval latency and user access time
  – Requires more bandwidth and increases traffic
• Performance metrics
  – Hit rate
  – Bandwidth usage
  – Balance of the two

Page 5: Objective-Optimal Algorithms for Long-term Web Prefetching

5

Object Selection Criteria

• Popularity (access frequency)
• Lifetime
• Good Fetch
• APL

Page 6: Objective-Optimal Algorithms for Long-term Web Prefetching

6

Web Object Characteristics

• Access frequency
  – A Zipf-like request model is used in web traffic modeling.
  – The relationship between access frequency p_i and popularity rank i of a web object:

$$p_i = \frac{k}{i^{\alpha}}, \quad \text{where } k = \frac{1}{\sum_{j} 1/j^{\alpha}}$$

Page 7: Objective-Optimal Algorithms for Long-term Web Prefetching

7

Web Object Characteristics

The generalized Zipf-like distribution of web requests is calculated as:

$$p_i = \frac{k}{i^{\alpha}}, \quad \text{where } k = \frac{1}{\sum_{j} 1/j^{\alpha}}$$

k is a normalization constant, i is the object ID (popularity rank), and α is the Zipf parameter. Reported values of α:
  – 0.986 (Cunha et al.)
  – 0.75 (Nishikawa et al.)
  – 0.64 (Breslau et al.)
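The Zipf-like model above can be sketched in a few lines of Python. This is only an illustration: the function name `zipf_frequencies` and the object count of 1000 are assumptions made here, not part of the slides.

```python
def zipf_frequencies(n, alpha):
    """Access frequencies p_1..p_n with p_i = k / i**alpha, normalized to sum to 1."""
    weights = [1.0 / (i ** alpha) for i in range(1, n + 1)]
    k = 1.0 / sum(weights)  # normalization constant k
    return [k * w for w in weights]


if __name__ == "__main__":
    # The three alpha values quoted on the slide; n = 1000 is an arbitrary choice.
    for alpha in (0.986, 0.75, 0.64):
        p = zipf_frequencies(1000, alpha)
        print(f"alpha={alpha}: p_1={p[0]:.4f}, p_1000={p[-1]:.6f}")
```

A larger α concentrates more of the request mass on the top-ranked objects, which is why the choice of α matters for any popularity-based selection.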

Page 8: Objective-Optimal Algorithms for Long-term Web Prefetching

8

Web Object Characteristics

• Size of objects
  – Average object size: 10–15 KB.
  – No strong correlation between object size and its access frequency.
• Lifetime of web objects
  – Average time interval between updates
  – Weak correlation between access frequency and lifetime.

Page 9: Objective-Optimal Algorithms for Long-term Web Prefetching

9

Caching Architecture

• Prefetching selection algorithms use these global statistics as input:
  – Estimates of object reference frequencies
  – Estimates of object lifetimes

• Content distribution servers cooperate to maintain these statistics

• When an object is updated in the original server, the new version will be sent to any cache that has subscribed to it.

Page 10: Objective-Optimal Algorithms for Long-term Web Prefetching

10

Solution space for web prefetching

• Two extreme cases:
  – Passive caches (non-prefetching): least network bandwidth and lowest cache hit rate
  – Prefetching all objects: 100% cache hit rate, but a huge amount of unnecessary bandwidth
• Existing algorithms use different object-selection criteria and fetch objects exceeding the threshold.

Page 11: Objective-Optimal Algorithms for Long-term Web Prefetching

11

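Steady State Properties

• Steady state hit rate for object i:

$$h_i = \begin{cases} 1 & \text{if object } i \text{ is prefetched} \\ \dfrac{a p_i l_i}{a p_i l_i + 1} & \text{if object } i \text{ is not prefetched} \end{cases}$$

• The ratio $\dfrac{a p_i l_i}{a p_i l_i + 1}$ is defined as the freshness factor, f(i).
• Overall hit rate:

$$H = \sum_i p_i h_i$$

• In particular, for on-demand caching:

$$H_{demand} = \sum_i p_i f(i)$$

(Venkataramani et al.)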
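A minimal sketch of the steady-state hit rate, assuming the freshness factor reads f(i) = a·p_i·l_i / (a·p_i·l_i + 1) and that `a` is an aggregate access-rate constant (the slides do not define `a`, so that reading is an assumption). The function names and the sample numbers are hypothetical.

```python
def freshness(a, p, l):
    """Freshness factor f(i) = a*p*l / (a*p*l + 1)."""
    x = a * p * l
    return x / (x + 1.0)


def overall_hit_rate(a, ps, ls, prefetched=frozenset()):
    """H = sum_i p_i * h_i, with h_i = 1 for prefetched objects and f(i) otherwise."""
    total = 0.0
    for i, (p, l) in enumerate(zip(ps, ls)):
        h = 1.0 if i in prefetched else freshness(a, p, l)
        total += p * h
    return total


if __name__ == "__main__":
    a = 10.0                 # assumed aggregate access rate
    ps = [0.5, 0.3, 0.2]     # access frequencies (sum to 1)
    ls = [1.0, 2.0, 4.0]     # lifetimes, in the same time unit as 1/a
    print("on-demand:", overall_hit_rate(a, ps, ls))
    print("prefetch object 0:", overall_hit_rate(a, ps, ls, {0}))
```

With an empty prefetched set the function returns H_demand; prefetching every object drives H to 1, matching the "prefetch all" extreme described earlier.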

Page 12: Objective-Optimal Algorithms for Long-term Web Prefetching

12

Steady State Properties

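• Steady state bandwidth for object i:

$$b_i = \begin{cases} \dfrac{s_i}{l_i} & \text{if object } i \text{ is prefetched} \\ a p_i (1 - f(i))\, s_i & \text{if object } i \text{ is not prefetched} \end{cases}$$

• Total bandwidth:

$$BW = \sum_i b_i$$

• In particular, for on-demand caching:

$$BW_{demand} = \sum_i a p_i (1 - f(i))\, s_i$$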
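The bandwidth model can be sketched the same way, assuming b_i = s_i / l_i for a prefetched object and b_i = a·p_i·(1 − f(i))·s_i otherwise, with f(i) = a·p_i·l_i / (a·p_i·l_i + 1). Names and sample values are illustrative assumptions.

```python
def freshness(a, p, l):
    """Freshness factor f(i) = a*p*l / (a*p*l + 1)."""
    x = a * p * l
    return x / (x + 1.0)


def total_bandwidth(a, ps, ls, ss, prefetched=frozenset()):
    """BW = sum_i b_i for a given set of prefetched object indices."""
    total = 0.0
    for i, (p, l, s) in enumerate(zip(ps, ls, ss)):
        if i in prefetched:
            total += s / l  # one push of size s_i per update interval l_i
        else:
            total += a * p * (1.0 - freshness(a, p, l)) * s  # fetch on each miss
    return total


if __name__ == "__main__":
    a, ps, ls, ss = 10.0, [0.5, 0.3, 0.2], [1.0, 2.0, 4.0], [10.0, 12.0, 15.0]
    print("on-demand:", total_bandwidth(a, ps, ls, ss))
    print("prefetch all:", total_bandwidth(a, ps, ls, ss, {0, 1, 2}))
```

Under this reading the on-demand term a·p_i·(1 − f(i))·s_i simplifies to (s_i / l_i)·f(i), so prefetching an object can only raise its bandwidth, from (s_i / l_i)·f(i) to s_i / l_i.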

Page 13: Objective-Optimal Algorithms for Long-term Web Prefetching

13

Objective Metrics

• Hit rate – benefit
• Bandwidth – cost
• H/B model – balance of benefit and cost (Jiang et al.)
  – Basic H/B:

$$H/B = \frac{Hit_{Prefetching} / Hit_{Demand}}{BW_{Prefetching} / BW_{Demand}}$$

  – Enhanced H/B:

$$H/B_k = \frac{\left(Hit_{Prefetching} / Hit_{Demand}\right)^k}{BW_{Prefetching} / BW_{Demand}}$$
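As a small illustration of the H/B metric, the sketch below assumes the basic form (Hit_Prefetching / Hit_Demand) / (BW_Prefetching / BW_Demand) and the enhanced form with the hit-rate ratio raised to the power k. The function name and the sample numbers are hypothetical.

```python
def hb_metric(hit_pref, hit_demand, bw_pref, bw_demand, k=1.0):
    """Enhanced H/B: (hit ratio)**k / (bandwidth ratio); k = 1 gives the basic H/B."""
    hit_ratio = hit_pref / hit_demand
    bw_ratio = bw_pref / bw_demand
    return hit_ratio ** k / bw_ratio


if __name__ == "__main__":
    # Hypothetical figures: prefetching lifts the hit rate from 0.6 to 0.9
    # while bandwidth grows from 2.0 to 3.0 units.
    print(hb_metric(0.9, 0.6, 3.0, 2.0))        # basic H/B
    print(hb_metric(0.9, 0.6, 3.0, 2.0, k=2))   # hit rate weighted more heavily
```

Values above 1 mean the relative hit-rate gain outweighs the relative bandwidth cost; raising k shifts the balance toward hit rate.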

Page 14: Objective-Optimal Algorithms for Long-term Web Prefetching

14

Existing Prefetching Algorithms

• Popularity [Markatos et al.]
  – Keep the most popular objects in the system
  – Update these objects immediately when they change
  – Criterion: the object's popularity
  – Expected to achieve a high hit rate
• Lifetime [Jiang et al.]
  – Keep the objects with the longest lifetimes
  – Mostly considers the network resource demands
  – Threshold: the expected lifetime of the object
  – Expected to minimize bandwidth usage

Page 15: Objective-Optimal Algorithms for Long-term Web Prefetching

15

Existing Prefetching Algorithms

• Good Fetch [Venkataramani et al.]
  – Computes the probability that an object is accessed before it changes.
  – Prefetches objects with "high probability of being accessed during their average lifetime".
  – Prefetch object i if the probability exceeds the threshold.
  – Objects with higher access frequencies and longer update intervals are more likely to be prefetched.
  – Balances the benefit (hit rate increase) against the cost (bandwidth increase) of keeping an object.

Page 16: Objective-Optimal Algorithms for Long-term Web Prefetching

16

Existing Prefetching Algorithms

• APL [Jiang et al.]
  – Computes apl values of web objects.
  – The apl of an object represents the "expected number of accesses during its lifetime".
  – Prefetch object i if its apl exceeds the threshold.
  – Tends to improve hit rate; attempts to balance benefit (hit rate) against cost (bandwidth).

Page 17: Objective-Optimal Algorithms for Long-term Web Prefetching

17

Existing Prefetching Algorithms

• Enhanced APL: the selection criterion is $a\,p_i^{\,n}\,l_i$
  – n > 1 prefers objects with higher popularity (emphasizes hit rate)
  – n < 1 prefers objects with longer lifetime (emphasizes network bandwidth)
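A threshold-based selection in the style of APL and Enhanced APL might look like the sketch below. It assumes the criterion reads a·p_i^n·l_i (with n = 1 giving the plain apl, the expected number of accesses during an object's lifetime); that reading, the function name, and the sample values are assumptions, not taken verbatim from the slides.

```python
def enhanced_apl_select(a, ps, ls, threshold, n=1.0):
    """Return indices of objects whose a * p**n * l exceeds the threshold."""
    return [
        i
        for i, (p, l) in enumerate(zip(ps, ls))
        if a * (p ** n) * l > threshold
    ]


if __name__ == "__main__":
    a = 100.0
    ps = [0.5, 0.01]   # object 0 is popular, object 1 is rarely requested
    ls = [0.1, 10.0]   # object 0 changes often, object 1 is long-lived
    print(enhanced_apl_select(a, ps, ls, threshold=6.0))          # plain apl
    print(enhanced_apl_select(a, ps, ls, threshold=1.0, n=2.0))   # popularity-weighted
```

In this toy input the plain criterion favors the long-lived object, while n = 2 flips the choice to the popular one, which is the trade-off the slide describes.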

Page 18: Objective-Optimal Algorithms for Long-term Web Prefetching

18

Objective-Greedy Algorithms

• Existing algorithms choose prefetching criteria based on intuitions

• These intuitions are not aimed at any specific performance metrics

• These intuitions consider only individual objects’ characteristics, not the global impact

• None of them gives optimal performance under any metric
  – Simple counter-examples can be shown

Page 19: Objective-Optimal Algorithms for Long-term Web Prefetching

19

Objective-Greedy Algorithms

• Objective-Greedy algorithms select criteria to intentionally improve performance based on various metrics.

• E.g., the Hit Rate-Greedy algorithm aims to improve the overall hit rate and thus reduce the latency of object requests.

Page 20: Objective-Optimal Algorithms for Long-term Web Prefetching

20

H/B-Greedy Prefetching

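• Consider the H/B value of on-demand caching:

$$\left(\frac{H}{B}\right)_{demand} = \frac{Hit_{demand}}{BW_{demand}} = \frac{\sum_{i \in S} p_i f(i)}{\sum_{i \in S} \frac{s_i}{l_i} f(i)}$$

• If object j is prefetched, then H/B is updated to:

$$\left(\frac{H}{B}\right)_j = \frac{\sum_{i \in S} p_i f(i) + p_j (1 - f(j))}{\sum_{i \in S} \frac{s_i}{l_i} f(i) + \frac{s_j}{l_j} (1 - f(j))} = \left(\frac{H}{B}\right)_{demand} \cdot \frac{1 + \dfrac{p_j (1 - f(j))}{\sum_{i \in S} p_i f(i)}}{1 + \dfrac{\frac{s_j}{l_j} (1 - f(j))}{\sum_{i \in S} \frac{s_i}{l_i} f(i)}}$$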

Page 21: Objective-Optimal Algorithms for Long-term Web Prefetching

21

H/B-Greedy Prefetching

• We define

$$incr(j) = \frac{1 + \dfrac{p_j (1 - f(j))}{\sum_{i \in S} p_i f(i)}}{1 + \dfrac{\frac{s_j}{l_j} (1 - f(j))}{\sum_{i \in S} \frac{s_i}{l_i} f(i)}}$$

as the increase factor of object j, incr(j).

• incr(j) indicates the factor by which H/B is increased if object j is selected.

Page 22: Objective-Optimal Algorithms for Long-term Web Prefetching

22

H/B-Greedy Prefetching

• H/B-Greedy prefetching prefetches those m objects with greatest increase factors.

• The selection is based on the effect on the hit rate by prefetching individual objects.

• H/B-Greedy is still not an optimal algorithm in terms of H/B value.
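The H/B-Greedy rule can be sketched as follows, assuming incr(j) = [1 + p_j(1 − f(j)) / Σ p_i f(i)] / [1 + (s_j/l_j)(1 − f(j)) / Σ (s_i/l_i) f(i)] with f(i) = a·p_i·l_i / (a·p_i·l_i + 1). Function names are hypothetical, and `heapq.nlargest` is used for brevity; it runs in O(n log m), so a linear-time selection routine would be needed to match the O(n) bound claimed in the outline.

```python
import heapq


def increase_factors(a, ps, ls, ss):
    """incr(j) for every object j, relative to on-demand caching."""
    f = [a * p * l / (a * p * l + 1.0) for p, l in zip(ps, ls)]
    hit_demand = sum(p * fi for p, fi in zip(ps, f))
    bw_demand = sum(s / l * fi for s, l, fi in zip(ss, ls, f))
    factors = []
    for p, l, s, fi in zip(ps, ls, ss, f):
        hit_gain = p * (1.0 - fi) / hit_demand        # relative hit-rate gain
        bw_cost = (s / l) * (1.0 - fi) / bw_demand    # relative bandwidth cost
        factors.append((1.0 + hit_gain) / (1.0 + bw_cost))
    return factors


def hb_greedy(a, ps, ls, ss, m):
    """Indices of the m objects with the greatest increase factors."""
    factors = increase_factors(a, ps, ls, ss)
    return heapq.nlargest(m, range(len(ps)), key=factors.__getitem__)
```

Each factor is computed against the on-demand baseline only, not against the objects already chosen, which is one way to see why the greedy choice need not be H/B-optimal.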

Page 23: Objective-Optimal Algorithms for Long-term Web Prefetching

23

Page 24: Objective-Optimal Algorithms for Long-term Web Prefetching

24

Hit Rate-Greedy Prefetching

• To maximize the overall hit rate given the number of objects to prefetch, m, we select the m objects with the greatest hit rate contribution:

$$HR\_Contr(i) = p_i (1 - f(i)) = \frac{p_i}{a p_i l_i + 1}$$

• This algorithm is optimal in terms of hit rate.
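A minimal sketch of Hit Rate-Greedy selection, assuming HR_Contr(i) = p_i / (a·p_i·l_i + 1). The function name and inputs are illustrative; `heapq.nlargest` is O(n log m) rather than strictly linear.

```python
import heapq


def hit_rate_greedy(a, ps, ls, m):
    """Indices of the m objects with the greatest hit-rate contribution."""
    def contribution(i):
        return ps[i] / (a * ps[i] * ls[i] + 1.0)

    return heapq.nlargest(m, range(len(ps)), key=contribution)


if __name__ == "__main__":
    # Object 0 is popular but long-lived, so an on-demand copy is usually fresh
    # and prefetching it adds little; object 1 changes often and gains the most.
    print(hit_rate_greedy(1.0, [0.5, 0.3, 0.2], [10.0, 0.1, 1.0], 2))
```

Because the overall hit rate is a plain sum of independent per-object terms, picking the m largest contributions is optimal for this metric, as the slide states.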

Page 25: Objective-Optimal Algorithms for Long-term Web Prefetching

25

Bandwidth-Greedy Prefetching

• To minimize the total bandwidth given m, the number of objects to prefetch, we select the m objects with the least bandwidth contribution:

$$BW\_Contr(i) = \frac{s_i}{l_i} (1 - f(i)) = \frac{s_i}{a p_i l_i^{2} + l_i}$$

• Bandwidth-Greedy Prefetching is optimal in terms of bandwidth consumption.
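Bandwidth-Greedy selection mirrors the hit-rate case. The sketch assumes BW_Contr(i) = s_i / (a·p_i·l_i² + l_i); names and inputs are hypothetical.

```python
import heapq


def bandwidth_greedy(a, ps, ls, ss, m):
    """Indices of the m objects with the least bandwidth contribution."""
    def contribution(i):
        return ss[i] / (a * ps[i] * ls[i] ** 2 + ls[i])

    return heapq.nsmallest(m, range(len(ps)), key=contribution)


if __name__ == "__main__":
    # Equal sizes; the long-lived object 0 is the cheapest to keep fresh.
    print(bandwidth_greedy(1.0, [0.5, 0.3, 0.2], [10.0, 0.1, 1.0], [10.0, 10.0, 10.0], 2))
```

The contribution is the extra bandwidth an object costs once it is prefetched instead of fetched on demand, so the smallest values are the cheapest additions.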

Page 26: Objective-Optimal Algorithms for Long-term Web Prefetching

26

H/B-Optimal Prefetching

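• The optimal algorithm for the H/B metric is provided by a solution to the following selection problem:

$$S'_{pref} = \arg\max_{S' \subset S,\ |S'| = m} \frac{H}{B} = \arg\max_{S' \subset S,\ |S'| = m} \frac{\sum_{i \in S} p_i f(i) + \sum_{j \in S'} p_j (1 - f(j))}{\sum_{i \in S} \frac{s_i}{l_i} f(i) + \sum_{j \in S'} \frac{s_j}{l_j} (1 - f(j))}$$

• This is equivalent to the maximum weighted average problem with pre-selected items.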

Page 27: Objective-Optimal Algorithms for Long-term Web Prefetching

27

Maximum Weighted Average

Maximum Weighted Average Problem:
• n courses in total, with different credit hours and scores
• Select m (m < n) courses
• Maximize the GPA of the m selected courses

Solution:

• If m = 1, select the course with the highest score.
• What if m > 1? A misleading intuition: select the m courses with the highest scores.

Page 28: Objective-Optimal Algorithms for Long-term Web Prefetching

28

A Course Selection Problem

• If m = 2:
  – If we select the 2 courses with the highest scores, C and B, then GPA: 93.33
  – But if we select C and D, then GPA: 93.57
• Question: how to select m courses such that the GPA is maximized?
  – Answer: Eppstein & Hirschberg solved this.

Courses       A    B    C    D    E    F    G    H
Credit hours  5.0  3.0  6.0  1.0  2.0  4.0  3.0  6.0
Scores        70   90   95   85   75   60   65   80
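The two GPA figures on this slide can be checked with an exhaustive search over the course table (credit-weighted average of scores). This is a brute-force verification only, not the Eppstein & Hirschberg algorithm; the helper names are made up for the example.

```python
from itertools import combinations

CREDITS = {"A": 5.0, "B": 3.0, "C": 6.0, "D": 1.0, "E": 2.0, "F": 4.0, "G": 3.0, "H": 6.0}
SCORES = {"A": 70, "B": 90, "C": 95, "D": 85, "E": 75, "F": 60, "G": 65, "H": 80}


def gpa(courses):
    """Credit-weighted average score of the given courses."""
    total_credits = sum(CREDITS[c] for c in courses)
    return sum(CREDITS[c] * SCORES[c] for c in courses) / total_credits


def best_subset(m):
    """Try every m-course subset and return the one with the highest GPA."""
    return max(combinations(sorted(CREDITS), m), key=gpa)


if __name__ == "__main__":
    print(round(gpa(("B", "C")), 2))  # two highest scores
    print(round(gpa(("C", "D")), 2))  # better pair
    print(best_subset(2))
```

Course D has a lower score than B but only one credit hour, so it dilutes C's 95 less; that is why picking the top scores is a misleading intuition.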

Page 29: Objective-Optimal Algorithms for Long-term Web Prefetching

29

With Pre-selected items

Maximum Weighted Average with pre-selected items:
• n courses in total, with different credit hours and scores
• Courses A and E (for example) must be selected, plus:
• Select m additional courses (m is given, m < n), such that the resulting GPA is maximized

Courses       A    B    C    D    E    F    G    H
Credit hours  5.0  3.0  6.0  1.0  2.0  4.0  3.0  6.0
Scores        70   90   95   85   75   60   65   80

Page 30: Objective-Optimal Algorithms for Long-term Web Prefetching

30

Pre-selection is not trivial

1) Selection domain B~I, no pre-selection, m=2
   optimal subset: {B,C}, GPA: 88.33

2) Selection domain B~I, A is pre-selected, m=2
   one candidate subset: {A,D,H}, GPA: 75.61
   better than: {A,B,C}, GPA: 70.625

Conclusion: {B,C} is not contained in the optimal subset for the pre-selected problem.

Course  A    B    C    D     E    F    G    H    I
Credit  5.0  1.0  2.0  10.0  1.5  2.5  2.0  3.0  4.0
Score   60   95   85   83    63   71   80   77   65
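The effect of pre-selection can be reproduced by brute force on the table above. The sketch is for verification only; `best_additions` is a hypothetical helper that tries every m-subset of the selection domain on top of the pre-selected courses.

```python
from itertools import combinations

CREDITS = {"A": 5.0, "B": 1.0, "C": 2.0, "D": 10.0, "E": 1.5,
           "F": 2.5, "G": 2.0, "H": 3.0, "I": 4.0}
SCORES = {"A": 60, "B": 95, "C": 85, "D": 83, "E": 63,
          "F": 71, "G": 80, "H": 77, "I": 65}


def gpa(courses):
    """Credit-weighted average score of the given courses."""
    total_credits = sum(CREDITS[c] for c in courses)
    return sum(CREDITS[c] * SCORES[c] for c in courses) / total_credits


def best_additions(domain, m, preselected=()):
    """Best m courses from `domain` to add to the pre-selected ones."""
    return max(
        combinations(domain, m),
        key=lambda extra: gpa(tuple(preselected) + extra),
    )


if __name__ == "__main__":
    print(best_additions("BCDEFGHI", 2))           # no pre-selection
    print(best_additions("BCDEFGHI", 2, ("A",)))   # A pre-selected
```

On this data the exhaustive search with A pre-selected returns B and D as the additions (GPA 76.5625), which beats both candidates listed on the slide and still does not contain {B,C}, so the slide's conclusion holds.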

Page 31: Objective-Optimal Algorithms for Long-term Web Prefetching

31

H/B-Optimal vs. Course selection

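• The problem is formulated as:

$$S' = \arg\max_{S' \subset S,\ |S'| = m} \frac{v_0 + \sum_{j \in S'} v_j}{w_0 + \sum_{j \in S'} w_j}$$

where v0 = 5.0*70 + 2.0*75 = 500 and w0 = 5.0 + 2.0 = 7.0 in the previous example.

• Equivalent to the H/B-Optimal selection problem:

$$S' = \arg\max_{S' \subset S,\ |S'| = m} \frac{\sum_{i \in S} p_i f(i) + \sum_{j \in S'} p_j (1 - f(j))}{\sum_{i \in S} \frac{s_i}{l_i} f(i) + \sum_{j \in S'} \frac{s_j}{l_j} (1 - f(j))}$$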

Page 32: Objective-Optimal Algorithms for Long-term Web Prefetching

32

H/B-Optimal vs. Course selection

Page 33: Objective-Optimal Algorithms for Long-term Web Prefetching

33

H/B-Optimal algorithm design

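• The selection of m courses is not trivial
• For course i, we define an auxiliary function:

$$r_i(x) = \left(v_i + \frac{v_0}{m}\right) - \left(w_i + \frac{w_0}{m}\right) x$$

• And for a given number m, we define a utility function:

$$F(x) = \max_{S' \subset S,\ |S'| = m} \sum_{i \in S'} r_i(x)$$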
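Assuming r_i(x) = (v_i + v0/m) − (w_i + w0/m)·x and F(x) equal to the largest possible sum of m of those values, F can be evaluated by summing the m largest r_i(x). The sketch below uses that reading; function names are hypothetical, and v_i is taken as credit × score with w_i as credit hours, as in the course example.

```python
import heapq


def r_values(x, v, w, m, v0=0.0, w0=0.0):
    """r_i(x) = (v_i + v0/m) - (w_i + w0/m) * x for every candidate i."""
    return [(vi + v0 / m) - (wi + w0 / m) * x for vi, wi in zip(v, w)]


def utility(x, v, w, m, v0=0.0, w0=0.0):
    """F(x): the maximum over m-subsets of the summed r_i(x)."""
    return sum(heapq.nlargest(m, r_values(x, v, w, m, v0, w0)))


if __name__ == "__main__":
    credits = [5.0, 3.0, 6.0, 1.0, 2.0, 4.0, 3.0, 6.0]   # courses A..H
    scores = [70, 90, 95, 85, 75, 60, 65, 80]
    v = [c * s for c, s in zip(credits, scores)]
    for x in (90.0, 655.0 / 7.0, 95.0):
        print(x, utility(x, v, credits, 2))
```

For any fixed subset the sum of r_i(x) equals (v0 + Σ v_i) − (w0 + Σ w_i)·x, which is zero exactly when x is that subset's weighted average; this is why the sign of F(x) locates the maximum GPA.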

Page 34: Objective-Optimal Algorithms for Long-term Web Prefetching

34

H/B-Optimal algorithm

• Lemma 1

Suppose A* is the maximum GPA we are computing. Then for any subset S' ⊆ S with |S'| = m:

  1) $\sum_{i \in S'} r_i(A^*) \le 0$;
  2) $\sum_{i \in S'} r_i(A^*) = 0$ iff S' is the optimal subset.

Lemma 1 indicates that the optimal subset contains those courses that have the m largest ri(A*) values.

Page 35: Objective-Optimal Algorithms for Long-term Web Prefetching

35

H/B-Optimal algorithm design

• n=6, m=4
• Each line is ri(x)
• Assume we know A*
• The optimal subset has the 4 courses with the largest ri(A*) values.
• Dilemma: A* is unknown

Page 36: Objective-Optimal Algorithms for Long-term Web Prefetching

36

H/B-Optimal algorithm design

• Lemma 2:
  1) F(x) > 0 iff x < A*
  2) F(x) = 0 iff x = A*
  3) F(x) < 0 iff x > A*
• Lemma 2 narrows the range of A*; (xl, xr) is the current A*-range.

Page 37: Objective-Optimal Algorithms for Long-term Web Prefetching

37

H/B-Optimal algorithm design

• If F (xl) > 0 and F (xr) < 0, then A* in (xl, xr)

• Compute the value of F((xl+xr)/2)

- if F((xl+xr)/2) > 0, then A* > (xl+xr)/2

- if F((xl+xr)/2) < 0, then A* < (xl+xr)/2

- if F((xl+xr)/2) = 0, then A* = (xl+xr)/2; (Lemma 2)

• Narrow down the range of A* by half
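The halving step can be turned into a plain bisection that finds A* numerically and then picks the m candidates with the largest r_i values. This is a simplified sketch, not the randomized expected-linear-time algorithm developed on the following slides. It assumes every weight w_j is positive, that r_i(x) = (v_i + v0/m) − (w_i + w0/m)·x, and it uses a fixed iteration count in place of an exact termination test; all names are hypothetical.

```python
import heapq


def max_weighted_average(v, w, m, v0=0.0, w0=0.0, iterations=100):
    """Return (A*, chosen indices) maximizing (v0 + sum v_j) / (w0 + sum w_j), |S'| = m.

    Assumes every w_j > 0 and w0 >= 0.
    """
    def r(i, x):
        return (v[i] + v0 / m) - (w[i] + w0 / m) * x

    def utility(x):
        return sum(heapq.nlargest(m, (r(i, x) for i in range(len(v)))))

    ratios = [vi / wi for vi, wi in zip(v, w)]
    if w0 > 0:
        ratios.append(v0 / w0)
    lo, hi = min(ratios), max(ratios)  # A* is a weighted average of these ratios
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if utility(mid) > 0:
            lo = mid   # F(mid) > 0 means A* > mid
        else:
            hi = mid   # F(mid) <= 0 means A* <= mid
    chosen = heapq.nlargest(m, range(len(v)), key=lambda i: r(i, hi))
    best = (v0 + sum(v[i] for i in chosen)) / (w0 + sum(w[i] for i in chosen))
    return best, sorted(chosen)


if __name__ == "__main__":
    credits = [5.0, 3.0, 6.0, 1.0, 2.0, 4.0, 3.0, 6.0]   # courses A..H
    scores = [70, 90, 95, 85, 75, 60, 65, 80]
    v = [c * s for c, s in zip(credits, scores)]
    print(max_weighted_average(v, credits, 2))
```

Each bisection step costs one evaluation of F, so the running time depends on the number of iterations rather than being linear; the slides' randomized approach avoids that dependence.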

Page 38: Objective-Optimal Algorithms for Long-term Web Prefetching

38

H/B-Optimal algorithm design

• Why keep on narrowing down the range of A*?
  – If the intersection of rj(x) and rk(x) falls outside the range, then the ordering of rj(x) and rk(x) is determined within the range, and so is the ordering of rj(A*) and rk(A*), by comparing their slopes.
  – If the range is narrow enough that there are no intersections of r(x) lines within the range, then the total ordering of all r(A*) values is determined.
  – Now our optimization problem is solved: just select the m candidates with the highest r(A*) values.
• This is the main idea for solving the optimization problem.

Page 39: Objective-Optimal Algorithms for Long-term Web Prefetching

39

H/B-Optimal algorithm design

• However, the total ordering requires O(n^2) time complexity
• A randomized approach is used instead. This randomized algorithm:
  – Iteratively reduces the problem domain into a smaller one.
  – Maintains 4 sets: X, Y, E, Z, initially empty

Page 40: Objective-Optimal Algorithms for Long-term Web Prefetching

40

H/B-Optimal algorithm design

In each iteration, randomly select a course i and compare it with each of the other courses, k. There are 4 possibilities:

1). if rk(A*) > ri(A*): insert k into set X
2). if rk(A*) < ri(A*): insert k into set Y
3). if wk=wi and vk=vi: insert k into set E
4). if undetermined: insert k into set Z

Now do the following loop:

loop:
    narrow the range of A* by half
    compare ri(A*) with rk'(A*) for k' in Z
    if appropriate, move k' to X or Y, accordingly
until |Z| is sufficiently small (i.e., |Z| < |S|/32)

Page 41: Objective-Optimal Algorithms for Long-term Web Prefetching

41

H/B-Optimal algorithm design

• The sets X or Y have enough members.

• Next, examine and compare the sizes of X, Y and E:

Page 42: Objective-Optimal Algorithms for Long-term Web Prefetching

42

H/B-Optimal algorithm design

1). If |X|+|E| > m:

There are at least m courses whose r(A*) values are greater than the r(A*) values of all courses in Y. All members of Y may be removed. Then: |S| = |S| - |Y|

Page 43: Objective-Optimal Algorithms for Long-term Web Prefetching

43

H/B-Optimal algorithm design

2). If |Y|+|E| > |S|-m: All members in X are among the top m courses. All members in X must be in the optimal set. Collapse X into a single course (This course is included in the final optimal set). Then:

|S| = |S| - |X| + 1;

m = m - |X| + 1.

Page 44: Objective-Optimal Algorithms for Long-term Web Prefetching

44

H/B-Optimal algorithm design

• In either case, the resulting domain has reduced size.
• By iteratively removing or collapsing courses, the problem domain finally has only one course remaining: a course formed by collapsing all courses in the optimal set.
• Complexity: expected time complexity, briefly (assume Sb is the domain before an iteration and Sa the domain after):
  1). Each iteration takes expected time O(|Sb|)
  2). Expected size |Sa| = (207/256) |Sb|
• The recurrence relation of the iteration: T(n) = O(n) + T[(207/256)n], which resolves to linear time complexity.
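The step from the recurrence to a linear bound can be spelled out by unrolling it, writing the O(n) term as cn for some constant c (the constant is introduced here only for the derivation):

$$T(n) \le cn + T\!\left(\tfrac{207}{256}\,n\right) \le cn \sum_{k \ge 0} \left(\tfrac{207}{256}\right)^{k} = \frac{cn}{1 - \tfrac{207}{256}} = \tfrac{256}{49}\,cn = O(n)$$

The geometric series converges because each iteration shrinks the expected domain size by a constant factor below 1.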

Page 45: Objective-Optimal Algorithms for Long-term Web Prefetching

45

H/B-Greedy vs. H/B-Optimal

• H/B-greedy is an approximation to H/B-Optimal

• H/B-greedy achieves a higher H/B metric than any existing algorithm.

• H/B-greedy is easier to implement than H/B-Optimal.

Page 46: Objective-Optimal Algorithms for Long-term Web Prefetching

46

Simulation Results

• Evaluation of H/B-Greedy prefetching
  – Figure 1: H/B, for total object number = 1,000.
  – Figure 2: H/B, for total object number = 10,000.
  – Figure 3: H/B, for total object number = 100,000.
  – Figure 4: H/B, for total object number = 1,000,000.
• Evaluation of the H-Greedy and B-Greedy algorithms
  – Figure 5: H-Greedy algorithm.
  – Figure 6: B-Greedy algorithm.
  – Figure 7: B-Greedy algorithm, zoomed in.

Page 47: Objective-Optimal Algorithms for Long-term Web Prefetching

47

Figure 1: H/B, for total object number=1,000

Page 48: Objective-Optimal Algorithms for Long-term Web Prefetching

48

Figure 2: H/B, for total object number=10,000

Page 49: Objective-Optimal Algorithms for Long-term Web Prefetching

49

Figure 3: H/B, total object number=100,000

Page 50: Objective-Optimal Algorithms for Long-term Web Prefetching

50

Figure 4: H/B, total object number=1,000,000

Page 51: Objective-Optimal Algorithms for Long-term Web Prefetching

51

Figure 5: H-Greedy algorithm

Page 52: Objective-Optimal Algorithms for Long-term Web Prefetching

52

Figure 6: B-Greedy algorithm

Page 53: Objective-Optimal Algorithms for Long-term Web Prefetching

53

Figure 7: B-Greedy, Bandwidth magnified

Page 54: Objective-Optimal Algorithms for Long-term Web Prefetching

54

Performance Comparison

 

Table 1. Performance comparison of different algorithms in terms of various metrics. (Lower values represent better performance.)

Page 55: Objective-Optimal Algorithms for Long-term Web Prefetching

55

Conclusions

• Proposed a family of Objective-Greedy prefetching algorithms that are superior to Popularity, Good Fetch, APL, and Lifetime
  – Hit rate greedy (this is also optimal)
  – Bandwidth greedy (this is also optimal)
  – H/B greedy
• All of the above have O(n) complexity
• Proposed an H/B-Optimal algorithm that also runs in O(n) expected time
• Experimental evaluation shows significant gains over existing algorithms
• H/B-greedy is almost as good as H/B-optimal

Page 56: Objective-Optimal Algorithms for Long-term Web Prefetching

56