Placement of Web-Server Proxies with Consideration of Read and Update Operations on the Internet


Parallel Programming Team 6

B92902039 莊謹譽
B92902054 李苡嬋
B92902092 張又仁

Outline

Introduction
Problem Formulation
Optimal Placement of k Proxies
Optimal Number of Proxies

Introduction (1/3)

Caching alleviates traffic congestion and improves the response time of Web servers.

Client-based: caches the files recently accessed by the clients.

Server-based: a server proxy stores replicated data and acts as a "front end" of the Web server to its clients.

Introduction (2/3)

Server-based caching

Advantages:
improves client response times
distributes the workload of the Web server
reduces the network traffic

Disadvantages:
the data at the proxies must be updated, which raises the cost.

Introduction (3/3)

Two subproblems are discussed:

1. Given k proxies, find the optimal placement of the proxies in the network.

2. For an unconstrained number of proxies, find the optimal number of proxies and their placement.

In both cases the goal is to minimize the overall access cost (read and update). Both are solved using dynamic programming.

Notation (1/3)

s: the server.
w: the update frequency of the server.
r(v): the access frequency of node v to s.
SPT: the shortest path tree.

Multicast vs. unicast: updates are multicast over the SPT, so the leaf nodes of the SPT are only proxies and only one copy of the data traverses each edge of it.

[Figure: T_s and the SPT for updating proxies]

Notation (2/3)

$\pi(u,v)$: the path connecting u and v in $T_s$.
$d(u,v) = \sum_{(x,y) \in \pi(u,v)} d(x,y)$: the distance function.
$p(v,s)$: the first proxy that is met while going from client v to s along tree $T_s$.
$\rho$: the hit ratio of a proxy.

Equation (for read)

The cost for client v to access s:

$$\rho \, r(v) \, d(v, p(v,s)) + (1-\rho) \, r(v) \, d(v,s)$$

Total cost for all clients in $T_s$ to access s (equation for reads):

$$\sum_{v \in T_s} \left[ \rho \, r(v) \, d(v, p(v,s)) + (1-\rho) \, r(v) \, d(v,s) \right]$$
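As a rough illustration of the read equation, the sketch below evaluates it on a small tree. The representation is an assumption made only for this example: `parent` maps each node to its parent (the server has parent `None`), and `length[x]` is the length of the edge from x to its parent. The four-node tree and its frequencies are invented for illustration.

```python
def read_cost(parent, length, r, rho, proxies):
    """Sum over v of rho*r(v)*d(v, p(v,s)) + (1 - rho)*r(v)*d(v, s)."""
    proxies = set(proxies)
    total = 0.0
    for v in parent:
        d_to_proxy = 0.0    # d(v, p(v, s))
        d_to_server = 0.0   # d(v, s)
        seen_proxy = False
        x = v
        while parent[x] is not None:      # walk up until the server s
            if x in proxies:
                seen_proxy = True         # first proxy met on the way up
            if not seen_proxy:
                d_to_proxy += length[x]
            d_to_server += length[x]
            x = parent[x]
        # If no proxy is met, the server answers and both distances are equal.
        total += rho * r[v] * d_to_proxy + (1 - rho) * r[v] * d_to_server
    return total


# Hypothetical example tree: s is the server, a and c are its children, b is a child of a.
parent = {"s": None, "a": "s", "b": "a", "c": "s"}
length = {"a": 1, "b": 2, "c": 3}
r = {"s": 0, "a": 1, "b": 4, "c": 2}

no_proxy = read_cost(parent, length, r, rho=0.5, proxies=[])
proxy_at_b = read_cost(parent, length, r, rho=0.5, proxies=["b"])
```

With no proxies every read travels the full distance to s; a proxy at b removes the hit fraction of b's read traffic.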

Notation (3/3)

P: a set of proxies in the network.
SPT(s,P): the SPT rooted at s that connects s to the proxies in P.

Equation (for update)

The overall cost to update the proxies (equation for update):

$$w \sum_{(x,y) \in SPT(s,P)} d(x,y)$$
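The update equation charges each edge of SPT(s,P) once, which is what multicast buys over unicast. The sketch below contrasts the two under an assumed representation (a `parent` map with edge lengths keyed by the child node); the unicast function is included only for comparison and is not part of the slides' model. The tree and the value w = 12 are illustrative.

```python
def path_edges(parent, v):
    """Edges on the tree path from v to the root, each named by its child node."""
    edges = []
    while parent[v] is not None:
        edges.append(v)
        v = parent[v]
    return edges


def multicast_update_cost(parent, length, w, proxies):
    # SPT(s, P) is the union of the proxy-to-server paths; each edge is counted once.
    tree_edges = set()
    for p in proxies:
        tree_edges.update(path_edges(parent, p))
    return w * sum(length[e] for e in tree_edges)


def unicast_update_cost(parent, length, w, proxies):
    # Comparison only: one separate copy per proxy, so shared edges are counted repeatedly.
    return w * sum(length[e] for p in proxies for e in path_edges(parent, p))


# Hypothetical example tree: s is the server, a and c are its children, b is a child of a.
parent = {"s": None, "a": "s", "b": "a", "c": "s"}
length = {"a": 1, "b": 2, "c": 3}

multicast = multicast_update_cost(parent, length, w=12, proxies=["a", "b"])
unicast = unicast_update_cost(parent, length, w=12, proxies=["a", "b"])
```

The edge between s and a is shared by both proxies, so multicast pays for it once and unicast pays twice.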

Equation (total cost)

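Total cost of all clients in $T_s$ to access s with a set of proxies P:

$$C(T_s,P) = \sum_{v \in T_s} \left( \rho \, r(v) \, d(v,p(v,s)) + (1-\rho) \, r(v) \, d(v,s) \right) + w \sum_{(x,y) \in SPT(s,P)} d(x,y)$$

The first sum is the read cost and the second term is the update cost.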

Optimal Placement of k Proxies

Notation: $T_v$ is the subtree of $T_s$ rooted at v.

[Figure: a tree illustrating $T_v$ and the relation "u is to the left of v"]

Optimal Placement of k Proxies

Notation:
$L_{v,u} = \{x : x \in T_v \text{ and } x \text{ is to the left of } u\}$
$R_{v,u} = \{x : x \in T_v \text{ and } x \notin T_u \cup L_{v,u}\}$
$L_{v,u,x} = \{y : y \in R_{v,u} \text{ and } y \text{ is to the left of } x\}$

Optimal Placement of k Proxies

The term $(1-\rho) \, r(v) \, d(v,s)$ in $C(T_s,P)$ is irrelevant to the proxy placement P.

Define the placement of proxies recursively. The server is regarded as one of the proxies.

$C(T_v,k)$ (or $C^*(R_{v,u},k)$) is the minimal access cost of placing k proxies in $T_v$ (or $R_{v,u}$).

Optimal Placement of k Proxies

When t = 1, v is the only proxy in $T_v$, so every node in $T_v$ reads from v:

$$C(T_v,1) = \sum_{x \in T_v} r(x) \, d(x,v)$$

Optimal Placement of k Proxies

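When t > 1, we can always find a node u, $u \in T_v$ and $u \neq v$, which satisfies: a proxy is placed at u, no proxy is placed in $L_{v,u}$, and no proxy is placed on $\pi(v,u) - \{v,u\}$.

That is:

$$C(T_v,t) = C(L_{v,u}) + C(T_u,t') + C^*(R_{v,u},t-t') + w \, d(v,u)$$

where

$$C(L_{v,u}) = \sum_{x \in L_{v,u}} r(x) \, d(x,v)$$

is a constant when u and v are given.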

Optimal Placement of k Proxies

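The recursive equations, for t > 1:

$$C(T_v,t) = \min_{u \in T_v,\, u \neq v,\, 0 < t' < t} \left\{ C(L_{v,u}) + C(T_u,t') + C^*(R_{v,u},t-t') + w \, d(v,u) \right\} \quad (4)$$

$$C^*(R_{v,u},t) = \min_{x \in R_{v,u},\, x \neq v,\, 0 < t' < t} \left\{ C(L_{v,u,x}) + C(T_x,t') + C^*(R_{v,x},t-t') + w \, d(x,\pi(u,v)) \right\} \quad (5)$$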

Optimal Placement of k Proxies

Theorem 1. Equations (4) and (5) are correct.

Proof. We prove equation (5). When t = 1, it is trivially correct. When t > 1, define $C^{*\prime}(R_{v,u},t)$ as the right-hand side of formula (5).

To prove: $C^*(R_{v,u},t) = C^{*\prime}(R_{v,u},t)$

Optimal Placement of k Proxies

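$C^*(R_{v,u},t) \geq C^{*\prime}(R_{v,u},t)$ (1/2)

Let $P_{opt}$ be an optimal placement of t proxies in $R_{v,u}$, so that $C^*(R_{v,u},t) = C^*(R_{v,u},P_{opt})$. Since $P_{opt}$ is known, we can find a proxy node x in $R_{v,u}$ such that $L_{v,u,x}$ has no proxy.

Optimal Placement of k Proxies

$C^*(R_{v,u},t) \geq C^{*\prime}(R_{v,u},t)$ (2/2)

$$C^*(R_{v,u},P_{opt}) = C(L_{v,u,x}) + C(T_x, P_{opt} \cap T_x) + C^*(R_{v,x}, P_{opt} \cap R_{v,x}) + w \, d(x,\pi(u,v)) \quad (7)$$

Since the sub-placements may not be optimal, with $l = |P_{opt} \cap T_x|$:

$$C(T_x, P_{opt} \cap T_x) \geq C(T_x, l)$$
$$C^*(R_{v,x}, P_{opt} \cap R_{v,x}) \geq C^*(R_{v,x}, t-l)$$

Substituting these into (7) gives $C^*(R_{v,u},t) \geq C^{*\prime}(R_{v,u},t)$.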

Optimal Placement of k Proxies

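$C^*(R_{v,u},t) \leq C^{*\prime}(R_{v,u},t)$ (1/2)

Find an x such that:
$L_{v,u,x}$ has no proxy.
$T_x$ has an optimal placement $P_{opt}^{x}$ with t' proxies.
$R_{v,x}$ has an optimal placement $P_{opt}^{v,x}$ with t - t' proxies.

Their union $P_{opt}^{x} \cup P_{opt}^{v,x}$ is a placement of t proxies in $R_{v,u}$, but it may not be optimal:

$$C^*(R_{v,u},t) \leq C^*(R_{v,u}, P_{opt}^{x} \cup P_{opt}^{v,x})$$

$C^*(R_{v,u},t) \leq C^{*\prime}(R_{v,u},t)$ (2/2)

Because x and t' are arbitrary values, the inequality also holds for the x and t' that minimize the right-hand side of (5), which gives $C^*(R_{v,u},t) \leq C^{*\prime}(R_{v,u},t)$.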

Optimal Placement of k Proxies

Algorithm 1

Initialize.
Case 0: the value has already been computed; return it.
Case 1: only one proxy.
Case 2: recursion of eq. (4).
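For small trees, the result of the dynamic program can be cross-checked against an exhaustive search. The sketch below is not Algorithm 1: it is a brute-force reference that evaluates C(T_s,P) for every set of k nodes. The tree representation (`parent` map, edge lengths keyed by child node) and the example data are assumptions, and here k counts proxies other than the server, whereas the slides count the server as one of the proxies.

```python
from itertools import combinations


def total_cost(parent, length, r, w, rho, proxies):
    """C(T_s, P): read cost of all clients plus multicast update cost."""
    proxies = set(proxies)
    read, spt_edges = 0.0, set()
    for v in parent:
        d_proxy = d_server = 0.0
        hit = False
        climb = []                     # edges from v up to the server
        x = v
        while parent[x] is not None:
            hit = hit or x in proxies  # first proxy met on the way up
            if not hit:
                d_proxy += length[x]
            d_server += length[x]
            climb.append(x)
            x = parent[x]
        read += rho * r[v] * d_proxy + (1 - rho) * r[v] * d_server
        if v in proxies:
            spt_edges.update(climb)    # union of proxy-to-server paths
    return read + w * sum(length[e] for e in spt_edges)


def best_placement(parent, length, r, w, rho, k):
    """Exhaustive search over all k-subsets of non-server nodes."""
    candidates = [v for v in parent if parent[v] is not None]
    best = min(combinations(candidates, k),
               key=lambda P: total_cost(parent, length, r, w, rho, P))
    return set(best), total_cost(parent, length, r, w, rho, best)


# Hypothetical example tree: s is the server, a and c are its children, b is a child of a.
parent = {"s": None, "a": "s", "b": "a", "c": "s"}
length = {"a": 1, "b": 2, "c": 3}
r = {"s": 0, "a": 1, "b": 4, "c": 2}

one_proxy = best_placement(parent, length, r, w=1, rho=0.5, k=1)
two_proxies = best_placement(parent, length, r, w=1, rho=0.5, k=2)
```

The search is exponential in k, so it is only useful as a check on tiny inputs.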

Algorithm 1 – complexity

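C[ Tv, t ]: 2-D array with n*k entries.
C*[ Rv,u, t ]: 3-D array with n*n*k entries.
Each entry is computed by trying n choices of the dividing node and k choices of t', with constant work per choice.

$$O((nk + n^2 k) \cdot n \cdot k) = O(n^3 k^2)$$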

Optimal Number of Proxies

Let P be a set of proxies placed in Tv

( the size of P is unknown )

note: v is always a proxy.

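Optimal Number of Proxies

Two cases are distinguished: no proxy is placed in Tv except v, or some proxies are placed in Tv.

Obviously, the optimal cost is the smaller of the costs of the two cases.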

Optimal Number of Proxies

Consider the second case: when there are proxies other than v in $T_v$, we can always find a node u, $u \in T_v$, $u \neq v$, which satisfies:
a proxy is placed at u;
no proxy is placed in $L_{v,u}$ ($L_{v,u}$ could be empty);
no proxy is placed in $\pi(v,u) - \{v,u\}$.

The cost for updating proxy node u is $w \, d(u,v)$.

$T_v$ is partitioned into $L_{v,u}$, $T_u$, and $R_{v,u}$.

Optimal Number of Proxies

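Thus we have equation (10), the recursion for the optimal cost in $T_v$, taken over all possible dividing points u.

Optimal Number of Proxies

Equation (11) is the corresponding recursion for $R_{v,u}$.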

Optimal Number of Proxies

Theorem 3. Equations (10) and (11) are correct.

Proof. We prove equation (11). From the definition in (11), we obtain equation (12).

Optimal Number of Proxies

Comparing (11) with (12), we need to prove that the two sides are equal.

We prove the two directions, $\geq$ and $\leq$, separately.

Optimal Number of Proxies

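Prove $\geq$ (1/2)

Let $P_{opt}$ be the optimal placement in $R_{v,u}$. Since $P_{opt}$ is known, find the proxy node x such that $L_{v,u,x}$ has no proxy.

The cost for updating proxy x is $w \, d(x,\pi(u,v))$.

$R_{v,u}$ is partitioned into three parts by x: $L_{v,u,x}$, $T_x$, and $R_{v,x}$.

Optimal Number of Proxies

Prove $\geq$ (2/2)

The parts of $P_{opt}$ that fall in $T_x$ and $R_{v,x}$ may not be optimal for those parts, so replacing them with optimal placements cannot increase the cost.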

Optimal Number of Proxies

Prove $\leq$ (1/2)

For any $x \in R_{v,u}$:
$P_{opt}^{x}$: the optimal placement in $T_x$
$P_{opt}^{v,x}$: the optimal placement in $R_{v,x}$

Consider their union $P_{opt}^{x} \cup P_{opt}^{v,x}$: it is a placement in $R_{v,u}$, but may not be optimal.

Optimal Number of Proxies

Prove $\leq$ (2/2)

Because x is an arbitrary value, the inequality also holds for the x that gives the minimum.

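Optimal Number of Proxies

Complexity:
The table of optimal costs for $T_v$ has n entries.
The table of optimal costs for $R_{v,u}$ has n*n entries.
Each entry is computed by trying n dividing points.

$$O((n + n^2) \cdot n) = O(n^3)$$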

A numerical example

Update frequency : 12

Simulation Setup

Inet topology generator, http://topology.eecs.umich.edu/inet/

Default n = 3037
r(v): randomly generated in [0, 100]
w: the number of update operations
$\alpha$: the read-write ratio, $\alpha = w / \sum_{v \in T_s} r(v)$
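A minimal sketch of how such a workload could be generated, assuming that α relates w to the total read frequency as w = α · Σ r(v). The function name, the seed parameter, and the use of real-valued (rather than integer) read frequencies are assumptions made for illustration; the topology itself would come from the Inet generator and is not modeled here.

```python
import random


def make_workload(nodes, alpha, seed=0):
    """Draw r(v) uniformly from [0, 100] and derive w from the ratio alpha."""
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    r = {v: rng.uniform(0, 100) for v in nodes}
    w = alpha * sum(r.values())          # assumed relation: alpha = w / sum of r(v)
    return r, w


r, w = make_workload(range(3037), alpha=0.001)
```

With α = 0 the server is never updated, so w is 0 and only the read cost matters.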

Performance comparisons

Traffic reduction ratio:

$$R = \frac{C(T_s,\emptyset) - C(T_s,P)}{C(T_s,\emptyset)} \times 100\%$$

Three algorithms are compared: Greedy, Optimal, Random.
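The traffic reduction ratio can be computed directly from two cost values. The helper below is a small sketch; the function name and the example numbers are hypothetical, and the two arguments stand for the total cost without proxies and the total cost with placement P.

```python
def traffic_reduction_ratio(cost_without_proxies, cost_with_proxies):
    """R in percent: relative saving of placement P over having no proxies."""
    if cost_without_proxies <= 0:
        raise ValueError("baseline cost must be positive")
    saved = cost_without_proxies - cost_with_proxies
    return saved / cost_without_proxies * 100.0


good = traffic_reduction_ratio(20.0, 15.0)   # placement lowers the cost
bad = traffic_reduction_ratio(20.0, 22.0)    # placement raises the cost
```

A negative value means the placement costs more than having no proxies at all, which can happen when the update cost outweighs the read savings.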

Performance comparisons (R vs. k)

α = 0
The difference between opt and greedy is usually within 10%.
R rises sharply at a small number of proxies.
R does not change much as the access frequency changes.

Performance comparisons (Bell Lab)

Why are the results different? The traffic may be even or uneven.

Performance comparisons (R vs. k )

ρ (hit ratio) : fixed 40%

Performance comparisons (R vs. α)

ρ (hit ratio) : fixed 40%

Observation

Randomly placing the proxies just makes things worse.

R increases sharply when k is small and becomes saturated when k reaches about 5.

Placing too many proxies would degrade the system performance if the update frequency is relatively high.

Performance comparisons ( R vs. ρ)

R improves significantly as ρ increases

Finding the optimal number of proxies

The optimal number of proxies depends on n, α and ρ.

For the next 3 figures, two y-axes are used:
Left: the optimal number of proxies required in the system (denoted by k)
Right: the corresponding traffic reduction ratio (denoted by R)

The optimal number of proxies ( diff. n )

ρ = 40%
α = 0.001 and 0.0001
The k-curve for α = 0.001 remains almost flat.
The k-curve for α = 0.0001 shows a stable increase.
The two R-curves are quite flat.

The optimal number of proxies (diff. α)

ρ = 40%
The need for proxies drops dramatically as the update frequency of the Web data increases.
We could predict that the k-curve would eventually reach 0.

The optimal number of proxies (diff. ρ)

α = 0.001
The k-curve and the R-curve both show a stable increase as the hit ratio increases.
Placing more proxies should come together with an improvement of the cache hit ratio.

Discussion

Stability of routing
If routes are stable, the routes used to access the Web server form an SPT rooted at the server.

In reality:
80% of routes change less often than once a day.
93% of the routes are actually stable (from Bell Labs' Web server to 13,533 destinations).

This reduces the arbitrary network to a tree.

Discussion (cont.)

The placement of en-route proxies in the routers requires static configuration work.
Although the client population changes significantly from time to time, the outgoing traffic remains fairly stable.
The optimal locations for the proxies do not change much as time progresses.

Discussion (cont.)

The multicast model does not consider the building and maintenance cost of proxies.
Once a proxy is placed at node u, the nodes on the π(u, s) path can have a proxy without increasing the update cost, which only decreases the read cost on those nodes.

Solution: consider the monetary cost and the maintenance cost.

Conclusion

Placing k proxies: time complexity $O(n^3 k^2)$, where k is the number of proxies and n is the number of nodes in the system.

Finding the optimal number of proxies, given the read frequencies of all clients and the update frequency of the server: time complexity $O(n^3)$.

2007 COMMUNICATION OPTIMIZATION FOR PARALLEL PROCESSING
