34
1 A Study of Deadline Scheduling for Client- Server Systems on the Computational Grid Atsuko Takefusa, Henri Casanova, Satoshi Matsuoka, Francine Berman Presenter: Pushpinder Kaur Chouhan

A Study of Deadline Scheduling for Client-Server Systems on the Computational Grid

Embed Size (px)

DESCRIPTION

A Study of Deadline Scheduling for Client-Server Systems on the Computational Grid. Atsuko Takefusa, Henri Casanova, Satoshi Matsuoka, Francine Berman. Presenter: Pushpinder Kaur Chouhan. Outline. Computational Grid & NES Overview of Bricks A Deadline-scheduling algorithm - PowerPoint PPT Presentation

Citation preview

Page 1: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

1

A Study of Deadline Scheduling for Client-Server Systems on the Computational Grid

Atsuko Takefusa, Henri Casanova, Satoshi Matsuoka, Francine Berman

Presenter: Pushpinder Kaur Chouhan

Page 2: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

2

Outline

Computational Grid & NES Overview of Bricks A Deadline-scheduling algorithm

Load Correction mechanism Fallback mechanism

Experiments Failure rates Resource cost Multiple Fallbacks

Conclusion Future work View

Page 3: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

3

The Computational Grid

A promising platform for the deployment of HPC applications

Effective use of resources A crucial issue is Scheduling

Page 4: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

4

NES: Network-Enabled Server Grid software which provides a

service on the network Client-server architecture RPC-style programming model Many high-profile applications from

science and engineering are amenable

Scheduling in multi-client multi-server scenario?

Page 5: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

5

Scheduling for NES Resource economy model

Grid currency allow owners to “charge” for usage

$$$$$$$$$$$ Choice?

No actual economical model is implemented Deadline-scheduling algorithm

Users specify deadlines for the task of their apps.

Page 6: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

6

Approach Goal is to minimize

The overall occurrences of deadline misses The reduction of resource cost

Assumptions Each request comes with a deadline

requirement Resources are priced proportionally to their

performance Simulation framework

Bricks: Performance evaluating system

Page 7: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

7

The Bricks Architecture

Scheduler

NetworkMonitor ServerMonitor

ClientNetwork

NetworkServer

Scheduling UnitScheduling Unit

Global Computing EnvironmentGlobal Computing Environment

ResourceDB

NetworkPredictorServerPredictor

Predictor

Page 8: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

8

Scheduler

Predictor

NetworkMonitorServerMonitor

ClientNetwork

NetworkServer

ResourceDB

Grid Computing Environment Client

Represents the user machine, upon which global computing tasks are initiated by the user program.

Server Represents computational resources

Network Represents the network interconnecting the Client and the Server

Represented using queues

Page 9: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

9

An Example of the Global Computing Environment Model

Page 10: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

10

Communication/Server Models using queues in Bricks

Extraneous data/job modelCongestion represented by adjusting the amount of arrival data/jobs from other nodes/users

Trace modelBandwidth/performance at each step = trace data such as observed parameters of real environment.

Need to specify several parametersGreater accuracy requires larger simulation cost

Network/Server behaves as if real network/serverSimulation cost lower than the previous modelNeed to accumulate the measurements

Page 11: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

11

Scheduler

Predictor

NetworkMonitorServerMonitor

ClientNetwork

NetworkServer

ResourceDB

Scheduling Unit Network/ServerMonitor

Measures/monitors network/server status on the Grid

ResourceDBServes as scheduling-specific database, storing the values of various measurements.

PredictorReads the measured resource information from ResourceDB, and predicts availability of resources.

SchedulerAllocates a new task invoked by a client

on suitable server machine(s)

Page 12: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

12

Overview of Grid Computing Simulation with Bricks

Scheduler

NetworkMonitor ServerMonitor

ClientNetwork

NetworkServer

Scheduling UnitScheduling Unit

Grid Computing EnvironmentGrid Computing Environment

ResourceDB

Monitor periodically

Store observed informationInquire

suitable server

Query available servers

Query predictions

Execute taskReturn results

NetworkPredictor

Invoke task

ServerPredictor

Return scheduling info

Predictor

Send the task

Predictions Perform Predictions

Page 13: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

13

Deadline-Scheduling

Many NES scheduling strategies Greedy assigns requests to the server that completes

it the earliest Deadline-scheduling:

Aims at meeting user-supplied job deadline specifications

Server 1

Server 2

Server 3

Job execution time

Deadline$

$$$$$

$$

Page 14: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

14

A Deadline-Scheduling Algorithm for multi-client/server NES

1 Estimate job processing timeTsi on each server Si :Tsi = Wsend/Psend + Wrecv/Precv + Ws/Pserv (0 i < n)Wsend, Wrecv, Ws: send/recv data size, and logical comp. costPsend, Precv, Pserv: estimated send/recv throughput, and performance

2 Compute Tuntil deadline:Tuntil deadline = Tdeadline - now

Server 1Server 2Server 3

Comp.

Estimated job execution time

Send Recv

now Tuntil deadlineDeadline

Page 15: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

15

A Deadline-Scheduling Algorithm (cont.)

3 Compute target processing time Ttarget:Ttarget = Tuntil deadline x Opt (0 < Opt 1)

4 Select suitable server Si:Conditions : MinDiff=Min(Diff si) where Diff si=Ttarget–Tsi0

Min(|Diff|) otherwise

now

Tuntil deadline

Ttarget

Server 1Server 2Server 3

Comp.

Estimated job execution time

Send Recv

Page 16: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

16

Factors in Deadline-Scheduling Failures

Accuracy of predictions is not guaranteed Monitoring systems do not perceive, load

change instantaneously Tasks might be out-of-order in FCFS

queues

Page 17: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

17

Ideas to improve schedule performance Scheduling decisions will result in an

increase in load of scheduled nodes

Server can estimate whether it will be able to complete the task by the deadline

Load Correction: Use corrected load values

Fallback: Transfered a scheduling functionality to server

Page 18: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

18

The Load Correction Mechanism Modify load predictions from monitoring system, Loa

dSi, as follows:LoadSi corrected = LoadSi + Njobs Si x ploadNjobs Si: the number of scheduled and unfinished jobs on the

server Si

Pload (= 1): arbitrary value that determines the magnitude

Scheduler

NetworkMonitor ServerMonitor

ResourceDB

NetworkPredictorServerPredictor

PredictorCorrected prediction

Page 19: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

19

The Fallback Mechanism Server can estimate whether it will be able to complet

e the task by the deadline Fallback happens when:

Tuntil deadline < Tsend + ETexec + ETrecv

Nmax. fallbacks NfallbacksTsend : Comm. duration (send)ETexec, ETrecv: Estimated comm. (recv) and comp. duration Nfallbacks, Nmax. fallbacks : Total/Max. number of fallbacks

Client Server

SchedulerServer

FallbackRe-submit

Page 20: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

20

Experiments Performance criteria:

Failure rate: Percentage of requests that missed their deadline

Resource cost: Avg. resource cost over all requestscost = machine performanceE.g. select 100 Mops/s and 300 Mops/s

servers Resource cost=200

Multiple Fallbacks: Resubmission of the request

Page 21: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

21

Scheduling Algorithms

Greedy: Typical NES scheduling strategy Deadline (Opt = 0.5, 0.6, 0.7, 0.8, 0.9) Load Correction (on/off) Fallback (Nmax fallbacks = 0/1/2/3/4/5)

Page 22: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

22

Configurations of the Bricks Simulation Grid Computing Environment (75 nodes, 5

Grids) # of local domain: 10, # of local domain nodes: 5-10 Avg. LAN bandwidth: 50-100[Mbits/s] Avg. WAN bandwidth: 500-1000[Mbits/s] Avg. server performance: 100-500[Mops/s] Avg. server Load: 0.1

Characteristics of client jobs Send/recv data size: 100-5000[Mbits] # of instructions: 1.5-1080[Gops] Avg. intervals of invoking:

60(high load), 90(medium load), 120(low load) [min]

Page 23: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

23

Simulation Environment The Presto II cluster:

128PEs at Matsuoka Lab., Tokyo Institute of Technology. Dual Pentium III 800MHz Memory: 640MB Network: 100Base/TX

Use APST[Casanova ’00] to deploy Bricks simulations

24 hour simulation x 2,500 runs(1 sim. takes 30-60 [min] with Sun JVM 1.3.0+HotSpot)

Page 24: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

24

Comparison of Failure Rates(load: medium)

0

10

20

30

40

50

60

70

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Failu

re R

ate

[%]

x/x

L/xx/F

L/F

Typical NES scheduling

Fallback leads tosignificant reductions

Load Correctionis NOT useful

Page 25: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

25

Comparison of Failure Rates(Load: high, medium, low)

“Low” load leads to improved failure rates All show similar characteristics

0

10

20

30

40

50

60

70

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Failu

re R

ate[

%] x/x

L/x

x/F

L/F

0

10

20

30

40

50

60

70

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Failu

re R

ate

[%] x/x

L/x

x/F

L/F

0

10

20

30

40

50

60

70

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Failu

re R

ate

[%]

x/x

L/x

x/F

L/F

High LowMedium

Page 26: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

26

Comparison of Resource Costs

0

50

100

150

200

250

300

350

400

450

500

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Avg

. Re

sou

rce

Co

st

x/x

L/x

x/F

L/F

Greedy leads to higher costs

Costs decrease when the algorithm becomes less conservative

Page 27: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

27

Comparison of Failure Rates(x/F, Nmax. fallbacks = 0-5)

0

10

20

30

40

50

60

70

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Failu

re R

ate

[%]

0

1

2

3

4

5

Multiple fallbacks causesignificant improvement

Page 28: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

28

Comparison of Resource Costs(x/F, Nmax. fallbacks = 0-5)

0

50

100

150

200

250

300

350

400

450

500

Greedy D-0.5 D-0.6 D-0.7 D-0.8 D-0.9

Avg

. Res

ourc

e C

ost

0

1

2

3

4

5

Multiple fallbacks lead to“small” increases in costs

Page 29: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

29

Conclusions Proposed a deadline-scheduling algorithm

for multi-client/server NES systems, and Load Correction and Fallback mechanisms

Investigated performance in multi-client multi-server scenarios with the improved Bricks

The experiments showed It is possible to make a trade-off between

failure-rate and resource cost Load correction mechanism may not be useful NES systems should use deadline-scheduling

with multiple fallbacks

Page 30: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

30

Future Work

Make Bricks support more sophisticated economy models

Investigate their feasibility and improve deadline-scheduling algorithms

Implement the deadline-scheduling algorithm within actual NES systems

Page 31: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

31

Views Deadline scheduling is a good

approach. Overall performance can be

increased. Resource cost can be decreased. Load correction mechanism is light. Real implementation is difficult.

Page 32: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

32

Scheduling Framework

client

serverclient

client

server

WAN

task

result

Page 33: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

34

Performance of a Deadline Scheduling Scheme Traditional scheduling deadline

scheduling Charging mechanisms will be adopted.

The Grids consists of various resources. The resources are shared by various users.

Grid Users want the lowest cost machines which process jobs in a fixed period of time

Deadline scheduling meets a deadline for returning the results of

jobs

Page 34: A Study of Deadline Scheduling for Client-Server Systems  on the Computational Grid

35

A Hierarchical Network Topology on Bricks

Client

ServerClient

Server

Client

Server

Client

Server

Client

ClientServer

Client

Server

Client

Server

Client

Server

Client

Server

Server

Client

WAN

LAN

Local Domain

NetworkNetwork

trace