Job Allocation Schemes in Computational Grids based on Cost Optimization

Satish Penmatsa

Joint work with: Dr. A. T. Chronopoulos

In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS05)

Job Scheduling: problem formulation

• Given a large number of jobs, find the allocation of jobs to computers that optimizes a given objective function (e.g. total execution time or total cost).

Talk Outline

• Introduction to Grid Computing

• Pricing Model

• System Model

• Price based Job Allocation Schemes

• Experimental results

• Conclusions

Grid Computing

• Grid is a type of parallel / distributed system.

– Enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime.

– Depends on the resource availability, capability, performance, cost, and users' quality-of-service requirements.

• Difference between a Grid and a Cluster?

– The key distinction is in the way resources are managed.

• Computational grid: Tries to solve problems or run applications by allocating idle computing resources over a network or the internet.

• These computational resources have different owners, who can be brought into the grid through an automated negotiation mechanism run by the grid controllers.

Pricing Model

[Ghosh et al. '04]

• Incomplete information, alternating-offer, non-cooperative bargaining game

• Players are the Grid Servers and the Computers

• Reserved valuations

• The server has to play an independent game with each computer associated with it to form the price per unit resource vector, $p_j$.

• In a system with $m$ servers and $n$ computers at time $t$, we have $m \times n$ bargaining games.

[Figure 1: Bargaining game mapping between the grid servers and computers. The Grid Server Pool holds servers S1, S2, ..., Sm and the Computing Resource Pool holds computers C1, C2, ..., Cn. Ci is the i-th computer, Sj is the j-th grid server, and BGji is the bargaining game between the j-th server and the i-th computer (BG11, ..., BGmn).]

The Bargaining Protocol

• One of the players starts the game.

• If the server starts the game, it proposes an offer which will be much less than its own reserved valuation.

• If the offered price ≥ the computer's standard price with the highest expected surplus, then the computer accepts the offer.

• Else, the computer makes a counter offer.

• If this counter offer ≤ the server's standard price with the highest expected surplus, then the server accepts.

• Else the server counter offers again.

• This procedure continues until an agreement is reached.

• Grid Server: E[Surplus] = (reserved valuation of server − standard price of server) × probability(standard price)

• Computer: E[Surplus] = (standard price of computer − reserved valuation of computer) × probability(standard price)

• Standard prices are the different candidate offer prices that each player uses to compute its expected surplus.

• probability(standard price) is the probability, as estimated by the player making the offer, that the other player will accept that standard price.

Example

• Reserved valuation of the Grid server: $100;

• Reserved valuation of the Computer: $60;

• Let the Computer make an initial offer of $110;

Offered Price ($)    Probability    Expected Surplus ($)
40                   0.10           6
60                   0.40           16
80                   0.70           14
90                   0.90           9
100                  1.00           0

Table 1: Grid Server's computation for making decision

Offered Price ($)    Probability    Expected Surplus ($)
60                   1.00           0
70                   0.90           9
80                   0.70           14
90                   0.40           12
110                  0.10           5

Table 2: Computer's computation for making decision

Offered Price ($)    Updated Probability      Expected Surplus ($)
40                   0.10 - 0.30 = 0.00       0
60                   0.40 - 0.30 = 0.10       4
80                   0.70 - 0.30 = 0.40       8
90                   0.90 - 0.30 = 0.60       6
100                  1.00 - 0.30 = 0.70       0

Table 3: Grid Server's computation using modified probability for making decision
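The expected-surplus arithmetic behind Tables 1 to 3 can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the 0.30 probability shift is copied from Table 3 as given, and flooring the shifted probability at zero is inferred from the first row of that table.

```python
def server_surplus(reserved, offers):
    """Rows of (price, probability, expected surplus) for the grid server."""
    return [(price, prob, (reserved - price) * prob) for price, prob in offers]


def computer_surplus(reserved, offers):
    """Rows of (price, probability, expected surplus) for the computer."""
    return [(price, prob, (price - reserved) * prob) for price, prob in offers]


def best_row(rows):
    """Row with the highest expected surplus."""
    return max(rows, key=lambda row: row[2])


SERVER_RESERVED = 100   # reserved valuation of the grid server ($)
COMPUTER_RESERVED = 60  # reserved valuation of the computer ($)

server_offers = [(40, 0.10), (60, 0.40), (80, 0.70), (90, 0.90), (100, 1.00)]
computer_offers = [(60, 1.00), (70, 0.90), (80, 0.70), (90, 0.40), (110, 0.10)]

table1 = server_surplus(SERVER_RESERVED, server_offers)
table2 = computer_surplus(COMPUTER_RESERVED, computer_offers)

# Table 3: every probability is reduced by 0.30 (value taken from the slide),
# floored at zero (assumption inferred from the 40-dollar row).
shifted = [(price, max(0.0, prob - 0.30)) for price, prob in server_offers]
table3 = server_surplus(SERVER_RESERVED, shifted)

for name, rows in (("Table 1", table1), ("Table 2", table2), ("Table 3", table3)):
    price, prob, surplus = best_row(rows)
    print(f"{name}: best standard price ${price}, expected surplus {surplus:.2f}")
```

In the tables, the highest expected surplus is at $60 for the server (Table 1), at $80 for the computer (Table 2), and at $80 for the server once the probabilities are modified (Table 3).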

[Figure 2: Expected surplus of the Grid server vs Offered prices. x-axis: Grid Server's Offered Price; y-axis: Grid Server's Expected Surplus; curves for t=0, t=1, t=2, t=3.]

[Figure 3: Expected surplus of the Computer vs Offered prices. x-axis: Computer's Offered Price; y-axis: Computer's Expected Surplus; curves for t=0, t=1, t=2, t=3.]

[Figure 4: Grid System Model. Labels: Grid Community, Grid Servers S1, S2, ..., Sm, Computers C1, C2, ..., Cn, load fractions s11, ..., smn, Job Assignment.]

Notations & Assumptions

• $m$ Grid Servers

• $n$ Computers

• $\phi_j$: Job arrival rate at server $j$; $j = 1, \ldots, m$

• $\Phi = \sum_{j=1}^{m} \phi_j$: Total job arrival rate of the system

• $\mu_i$: Average processing rate of computer $i$; $i = 1, \ldots, n$

• Each computer is modeled as an M/M/1 queuing system

• $\Phi < \sum_{i=1}^{n} \mu_i$

• $p_{ji}$: Price per unit resource as agreed between server $j$ and computer $i$

Notations & Assumptions (cont’d)

• $s_{ji}$: Fraction of workload (jobs) that server $j$ sends to computer $i$

• $s_j = (s_{j1}, s_{j2}, \ldots, s_{jn})$ denotes the workload fractions of server $j$

• The vector $s = (s_1, s_2, \ldots, s_m)$ denotes the load fractions of all the servers

• The expected response time at computer $i$ is given by:

\[
F_i(s) = \frac{1}{\mu_i - \sum_{j=1}^{m} s_{ji}\phi_j} \tag{1}
\]

• Thus the overall expected cost of server $j$ is given by:

\[
D_j(s) = \sum_{i=1}^{n} k_i p_{ji} s_{ji} F_i(s) = \sum_{i=1}^{n} \frac{k_i p_{ji} s_{ji}}{\mu_i - \sum_{k=1}^{m} s_{ki}\phi_k} \tag{2}
\]

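• The overall expected cost of the system (i.e. of all the servers) is given by:

\[
D(s) = \frac{1}{\Phi} \sum_{j=1}^{m} \phi_j D_j(s) \tag{3}
\]

which is equivalent to

\[
D(s) = \frac{1}{\Phi} \sum_{j=1}^{m} \sum_{i=1}^{n} \frac{k_i p_{ji} \phi_j s_{ji}}{\mu_i - \sum_{k=1}^{m} s_{ki}\phi_k} \tag{4}
\]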

subject to the constraints:

\[
s_{ji} \ge 0, \quad i = 1, \ldots, n, \; j = 1, \ldots, m \tag{5}
\]

\[
\sum_{i=1}^{n} s_{ji} = 1, \quad j = 1, \ldots, m \tag{6}
\]

\[
\sum_{j=1}^{m} s_{ji}\phi_j < \mu_i, \quad i = 1, \ldots, n \tag{7}
\]
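To make the cost model concrete, equations (1) to (4) and constraints (5) to (7) can be evaluated directly. The sketch below uses plain Python lists; the function names and the small example values are illustrative assumptions, not part of the talk.

```python
def response_times(s, phi, mu):
    """Equation (1): expected response time F_i(s) at every computer i."""
    m, n = len(phi), len(mu)
    return [1.0 / (mu[i] - sum(s[j][i] * phi[j] for j in range(m)))
            for i in range(n)]


def server_cost(j, s, phi, mu, price, k):
    """Equation (2): expected cost D_j(s) of server j."""
    f = response_times(s, phi, mu)
    return sum(k[i] * price[j][i] * s[j][i] * f[i] for i in range(len(mu)))


def system_cost(s, phi, mu, price, k):
    """Equations (3)/(4): arrival-rate-weighted average of the server costs."""
    total_rate = sum(phi)
    return sum(phi[j] * server_cost(j, s, phi, mu, price, k)
               for j in range(len(phi))) / total_rate


def is_feasible(s, phi, mu, tol=1e-9):
    """Constraints (5)-(7): positivity, conservation, and stability."""
    m, n = len(phi), len(mu)
    positivity = all(s[j][i] >= 0 for j in range(m) for i in range(n))
    conservation = all(abs(sum(s[j]) - 1.0) <= tol for j in range(m))
    stability = all(sum(s[j][i] * phi[j] for j in range(m)) < mu[i]
                    for i in range(n))
    return positivity and conservation and stability


# Illustrative values only: one server splitting 5 jobs/sec evenly
# over two identical computers with unit prices and unit constants.
mu = [10.0, 10.0]
phi = [5.0]
price = [[1.0, 1.0]]
k = [1.0, 1.0]
s = [[0.5, 0.5]]

print(is_feasible(s, phi, mu), system_cost(s, phi, mu, price, k))
```

With these values each computer carries 2.5 jobs/sec, so $F_i = 1/7.5$ and both $D_1(s)$ and $D(s)$ come out to roughly 0.133.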

Price based Job Allocation Schemes

1. Global Optimal Scheme with Pricing (GOSP)

2. Nash Scheme with Pricing (NASHP)

1. Global Optimal Scheme with Pricing (GOSP)

• The load fractions ($s$) are obtained by solving the nonlinear optimization problem $D(s)$ (4), which gives the optimum expected cost of the system.

• Let $\mu_i^j = \mu_i - \sum_{k=1, k \ne j}^{m} s_{ki}\phi_k$ be the available processing rate at computer $i$ as seen by server $j$.

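D(s) Solution:

Theorem 1: Assuming that computers are ordered in decreasing order of their available processing rates ($\mu_1^j \ge \mu_2^j \ge \ldots \ge \mu_n^j$), the load fractions for server $j$ are given by:

\[
s_{ji} =
\begin{cases}
\dfrac{1}{\phi_j}\left(\mu_i^j - \sqrt{k_i p_{ji} \mu_i}\,\dfrac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k}}\right) & \text{if } 1 \le i < c_j \\[2ex]
0 & \text{if } c_j \le i \le n
\end{cases}
\tag{8}
\]

where $c_j$ is the minimum index that satisfies the inequality:

\[
\mu_{c_j}^j \le \sqrt{k_{c_j} p_{j c_j} \mu_{c_j}}\,\frac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k}} \tag{9}
\]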

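Algorithm for solving D(s)

BEST-FRACTIONS($\mu_1^j, \ldots, \mu_n^j, \phi_j, p_{j1}, \ldots, p_{jn}, k_1, \ldots, k_n$)

Input: Available processing rates: $\mu_1^j, \mu_2^j, \ldots, \mu_n^j$;
       Total arrival rate: $\phi_j$
       The price per unit resource vector: $p_{j1}, p_{j2}, \ldots, p_{jn}$
       The constants vector: $k_1, k_2, \ldots, k_n$

Output: Load fractions: $s_{j1}, s_{j2}, \ldots, s_{jn}$;

1. Sort the computers in decreasing order of $\mu_i^j / \sqrt{\mu_i k_i p_{ji}}$, i.e. $\frac{\mu_1^j}{\sqrt{\mu_1 k_1 p_{j1}}} \ge \ldots \ge \frac{\mu_n^j}{\sqrt{\mu_n k_n p_{jn}}}$;

2. $t \leftarrow \dfrac{\sum_{i=1}^{n} \mu_i^j - \phi_j}{\sum_{i=1}^{n} \sqrt{\mu_i p_{ji} k_i}}$

3. while ( $t \ge \frac{\mu_n^j}{\sqrt{\mu_n k_n p_{jn}}}$ ) do
       $s_{jn} \leftarrow 0$
       $n \leftarrow n - 1$
       $t \leftarrow \dfrac{\sum_{i=1}^{n} \mu_i^j - \phi_j}{\sum_{i=1}^{n} \sqrt{\mu_i p_{ji} k_i}}$

4. for $i = 1, \ldots, n$ do
       $s_{ji} \leftarrow \dfrac{1}{\phi_j}\left(\mu_i^j - t\sqrt{\mu_i p_{ji} k_i}\right)$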
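The four steps of BEST-FRACTIONS translate almost line by line into Python. The version below is a sketch under a few assumptions: it takes the raw rates $\mu_i$ as an extra argument (the pseudocode uses them under the square root without listing them as inputs), it returns the fractions in the original computer order rather than the sorted order, and the names `avail`, `q`, and `threshold` are hypothetical.

```python
import math


def best_fractions(avail, mu, phi_j, price_j, k):
    """Load fractions of one server, following steps 1-4 of BEST-FRACTIONS.

    avail[i]   available rate mu_i^j seen by this server
    mu[i]      raw processing rate mu_i
    phi_j      job arrival rate at this server
    price_j[i] agreed price p_ji
    k[i]       constant k_i
    """
    n = len(avail)
    q = [math.sqrt(mu[i] * k[i] * price_j[i]) for i in range(n)]

    # Step 1: sort computers by decreasing avail / sqrt(mu * k * p).
    order = sorted(range(n), key=lambda i: avail[i] / q[i], reverse=True)

    def threshold(count):
        # Steps 2 and 3: t computed over the first `count` sorted computers.
        idx = order[:count]
        return (sum(avail[i] for i in idx) - phi_j) / sum(q[i] for i in idx)

    # Step 3: drop the last computer while it would get a non-positive share.
    active = n
    t = threshold(active)
    while active > 1 and t >= avail[order[active - 1]] / q[order[active - 1]]:
        active -= 1
        t = threshold(active)

    # Step 4: closed-form fractions for the computers that remain.
    s = [0.0] * n
    for i in order[:active]:
        s[i] = (avail[i] - t * q[i]) / phi_j
    return s


# Illustrative inputs only.
even = best_fractions([10.0, 10.0], [10.0, 10.0], 5.0, [1.0, 1.0], [1.0, 1.0])
skewed = best_fractions([100.0, 1.0], [100.0, 1.0], 1.0, [1.0, 1.0], [1.0, 1.0])
print(even, skewed)
```

Two identical computers split the load about evenly, while in the second call the slow computer is dropped in step 3 and the fast one receives essentially all of the load.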

A Distributed Algorithm

Server $j$, ($j = 1, \ldots, m$) executes:

1. Initialization:
       $s_j^{(0)} \leftarrow 0$;
       $D_j^{(0)} \leftarrow 0$;
       $l \leftarrow 0$;
       norm ← 1;
       sum ← 0;
       tag ← CONTINUE;
       left = [($j - 2$) mod $m$] + 1;
       right = [$j$ mod $m$] + 1;

2. while ( 1 ) do
       if ($j = 1$) {server 1}
           if ($l \ne 0$)
               Recv(left, (norm, l, tag));
               if (norm < $\varepsilon$)
                   Send(right, (norm, l, STOP));
                   exit;
           sum ← 0;
           $l \leftarrow l + 1$;
       else {the other servers}
           Recv(left, (sum, l, tag));
           if (tag = STOP)
               if ($j \ne m$) Send(right, (sum, l, STOP));
               exit;
       for $i = 1, \ldots, n$ do
           Obtain $\mu_i^j$ by inspecting the run queue of each computer
           ($\mu_i^j \leftarrow \mu_i - \sum_{k=1, k \ne j}^{m} s_{ki}\phi_k$);
       $s_j^{(l)} \leftarrow$ BEST-FRACTIONS($\mu_1^j, \ldots, \mu_n^j, \phi_j$);
       Compute $D_j^{(l)}$;
       sum ← sum + $|D_j^{(l-1)} - D_j^{(l)}|$;
       Send(right, (sum, l, CONTINUE));
   endwhile
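The ring protocol can be tried out in a single process by replacing Send/Recv with an ordinary loop over the servers. The sketch below makes several assumptions: one pass of the inner loop stands for one trip of the token around the ring, `best_fractions` is a compact Python rendering of BEST-FRACTIONS that also takes $\mu_i$, the prices, and the constants, and `max_rounds` is a safety cap added here because the slides state no iteration bound. All input values are made up.

```python
import math


def best_fractions(avail, mu, phi_j, price_j, k):
    """Compact BEST-FRACTIONS (mu_i under the square root, as in the slides)."""
    n = len(avail)
    q = [math.sqrt(mu[i] * k[i] * price_j[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: avail[i] / q[i], reverse=True)

    def threshold(count):
        idx = order[:count]
        return (sum(avail[i] for i in idx) - phi_j) / sum(q[i] for i in idx)

    active = n
    t = threshold(active)
    while active > 1 and t >= avail[order[active - 1]] / q[order[active - 1]]:
        active -= 1
        t = threshold(active)
    s = [0.0] * n
    for i in order[:active]:
        s[i] = (avail[i] - t * q[i]) / phi_j
    return s


def simulate_ring(mu, phi, price, k, eps=1e-6, max_rounds=200):
    """Sequential stand-in for the token ring: servers update one at a time."""
    m, n = len(phi), len(mu)
    s = [[0.0] * n for _ in range(m)]   # s_j^(0) <- 0
    cost = [0.0] * m                    # D_j^(0) <- 0
    norm, rounds = 1.0, 0
    while norm >= eps and rounds < max_rounds:
        rounds += 1                     # l <- l + 1 at server 1
        total_change = 0.0              # sum <- 0 at server 1
        for j in range(m):
            # Available rate at each computer as seen by server j.
            avail = [mu[i] - sum(s[x][i] * phi[x] for x in range(m) if x != j)
                     for i in range(n)]
            s[j] = best_fractions(avail, mu, phi[j], price[j], k)
            load = [sum(s[x][i] * phi[x] for x in range(m)) for i in range(n)]
            new_cost = sum(k[i] * price[j][i] * s[j][i] / (mu[i] - load[i])
                           for i in range(n))
            total_change += abs(cost[j] - new_cost)
            cost[j] = new_cost
        norm = total_change             # value server 1 receives from server m
    return s, cost, rounds


MU = [10.0, 20.0, 50.0]
K = [1.0, 2.0, 5.0]
PHI = [10.0, 15.0]
PRICE = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]

fractions, costs, rounds = simulate_ring(MU, PHI, PRICE, K)
print(rounds, fractions, costs)
```

Whatever the number of rounds, every update keeps constraints (5) to (7) satisfied, because each server only ever claims part of the capacity the others have left free.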

2. Nash Scheme with Pricing (NASHP)

• In this scheme each server tries to minimize the total cost of its jobs independently of the others.

• The load fractions are obtained by formulating the problem as a non-cooperative game among the servers.

• The goal of server $j$ is to find a feasible job allocation strategy $s_j$ such that $D_j(s)$ (2) is minimized.

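Dj(s) Solution:

Theorem 2: Assuming that computers are ordered in decreasing order of their available processing rates ($\mu_1^j \ge \mu_2^j \ge \ldots \ge \mu_n^j$), the solution $s_j$ of the optimization problem $D_j(s)$ is given by:

\[
s_{ji} =
\begin{cases}
\dfrac{1}{\phi_j}\left(\mu_i^j - \sqrt{k_i p_{ji} \mu_i^j}\,\dfrac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k^j}}\right) & \text{if } 1 \le i < c_j \\[2ex]
0 & \text{if } c_j \le i \le n
\end{cases}
\tag{10}
\]

where $c_j$ is the minimum index that satisfies the inequality:

\[
\sqrt{\mu_{c_j}^j} \le \sqrt{k_{c_j} p_{j c_j}}\,\frac{\sum_{k=1}^{c_j} \mu_k^j - \phi_j}{\sum_{k=1}^{c_j} \sqrt{k_k p_{jk} \mu_k^j}} \tag{11}
\]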
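A brief sketch of where the closed form in (10) comes from may help. It is the usual Lagrangian argument and is not taken from the slides; $\alpha$ is a multiplier for constraint (6) introduced here for illustration, and the sums in the last line run over the computers that receive load from server $j$.

```latex
% Server j's cost (2), with the other servers' loads folded into \mu_i^j
D_j(s) = \sum_{i=1}^{n} \frac{k_i p_{ji} s_{ji}}{\mu_i^j - s_{ji}\phi_j}

% Stationarity for every computer with s_{ji} > 0
\frac{\partial D_j}{\partial s_{ji}}
  = \frac{k_i p_{ji}\,\mu_i^j}{\left(\mu_i^j - s_{ji}\phi_j\right)^2} = \alpha
\quad\Longrightarrow\quad
\mu_i^j - s_{ji}\phi_j = \sqrt{\frac{k_i p_{ji}\,\mu_i^j}{\alpha}}

% Summing over the loaded computers and using \sum_i s_{ji} = 1
\sum_{k} \mu_k^j - \phi_j = \frac{1}{\sqrt{\alpha}} \sum_{k} \sqrt{k_k p_{jk}\,\mu_k^j}
```

Eliminating $1/\sqrt{\alpha}$ between the last two lines gives the first case of (10). A computer for which that expression would be non-positive receives $s_{ji} = 0$, which is the situation that inequality (11) detects.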

Experimental Results

Performance metrics:

• Expected Response Time

• Fairness Index ($I(C)$)

\[
I(C) = \frac{\left[\sum_{j=1}^{m} C_j\right]^2}{m \sum_{j=1}^{m} C_j^2} \tag{12}
\]
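Equation (12) is easy to evaluate for a given vector $C$. The sketch below assumes $C_j$ is whatever per-server quantity is being compared (the slides do not define it at this point), and the function name is illustrative.

```python
def fairness_index(values):
    """Fairness index of equation (12) for a list of per-server values C_j."""
    m = len(values)
    total = sum(values)
    squares = sum(v * v for v in values)
    return (total * total) / (m * squares)


equal = fairness_index([2.0, 2.0, 2.0, 2.0])     # all servers treated alike
lopsided = fairness_index([1.0, 0.0, 0.0, 0.0])  # one server carries everything
print(equal, lopsided)
```

The index is 1 when all $C_j$ are equal and falls to $1/m$ when a single server accounts for the whole total.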

System configuration

• 32 computers

• 20 servers

Relative µi      1    2    3    4    5    7    8    10
#computers       7    6    5    4    3    3    2    2
µi (jobs/sec)    10   20   30   40   50   70   80   100
ki               1    2    3    4    5    6    7    8
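The configuration table can be expanded into per-computer lists for use in a simulation. In this sketch the tuple layout and function names are illustrative, and system utilization is assumed to mean $\Phi / \sum_i \mu_i$, a definition the slides do not spell out.

```python
# (relative rate, number of computers, mu_i in jobs/sec, k_i) per table column
CONFIG = [
    (1, 7, 10, 1), (2, 6, 20, 2), (3, 5, 30, 3), (4, 4, 40, 4),
    (5, 3, 50, 5), (7, 3, 70, 6), (8, 2, 80, 7), (10, 2, 100, 8),
]


def expand(config):
    """Per-computer processing rates and constants, one entry per computer."""
    mu, k = [], []
    for _relative, count, rate, constant in config:
        mu.extend([float(rate)] * count)
        k.extend([float(constant)] * count)
    return mu, k


def total_arrival_rate(mu, utilization):
    """Total job arrival rate for a utilization in [0, 1) (assumed definition)."""
    return utilization * sum(mu)


mu, k = expand(CONFIG)
print(len(mu), sum(mu), total_arrival_rate(mu, 0.5))
```

The expansion yields the 32 computers listed under the system configuration, with an aggregate processing rate of 1220 jobs/sec.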

System utilization vs Expected Price

[Plot: Price (y-axis) vs System Utilization (%) (x-axis); curves: Ascending, Descending, Random.]

System utilization vs Expected Response Time

[Plot: Expected Response Time (y-axis) vs System Utilization (%) (x-axis); curves: GOSP, NASHP.]

System utilization vs Fairness Index

[Plot: Fairness Index (y-axis) vs System Utilization (%) (x-axis); curves: GOSP, NASHP.]

Heterogeneity vs Expected Price

[Plot: Price (y-axis) vs Max Speed/Min Speed (x-axis); curves: Ascending, Descending, Random.]

Heterogeneity vs Expected Response Time

[Plot: Expected Response Time (y-axis) vs Max Speed/Min Speed (x-axis); curves: GOSP, NASHP.]

Conclusions

• We proposed two job allocation schemes based on pricing for computational grids.

• The GOSP scheme tries to minimize the cost of the entire grid system and so is advantageous when the system optimum is required. However, it is not fair to the servers, and hence not to the users.

• The NASHP scheme minimizes the cost for each server. This is fair to the servers, and hence to the users.
