Scheduling Algorithms for Grid Computing


    Technical Report No. 2006-504

Scheduling Algorithms for Grid Computing:
State of the Art and Open Problems

Fangpeng Dong and Selim G. Akl

School of Computing, Queen's University
Kingston, Ontario, January 2006

Abstract:

Thanks to advances in wide-area network technologies and the low cost of computing resources, Grid computing came into being and is currently an active research area. One motivation of Grid computing is to aggregate the power of widely distributed resources, and provide non-trivial services to users. To achieve this goal, an efficient Grid scheduling system is an essential part of the Grid. Rather than covering the whole Grid scheduling area, this survey provides a review of the subject mainly from the perspective of scheduling algorithms. In this review, the challenges for Grid scheduling are identified. First, the architecture of components involved in scheduling is briefly introduced to provide an intuitive image of the Grid scheduling process. Then various Grid scheduling algorithms are discussed from different points of view, such as static vs. dynamic policies, objective functions, application models, adaptation, QoS constraints, strategies dealing with dynamic behavior of resources, and so on. Based on a comprehensive understanding of the challenges and the state of the art of current research, some general issues worthy of further exploration are proposed.

1. Introduction

The popularity of the Internet and the availability of powerful computers and high-speed networks as low-cost commodity components are changing the way we use computers today. These technical opportunities have led to the possibility of using geographically distributed and multi-owner resources to solve large-scale problems in science, engineering, and commerce. Recent research on these topics has led to the emergence of a new paradigm known as Grid computing [9].

To achieve the promising potential of tremendous distributed resources, effective and efficient scheduling algorithms are fundamentally important. Unfortunately, scheduling algorithms in traditional parallel and distributed systems, which usually run on homogeneous and dedicated resources, e.g., computer clusters, cannot work well in the new circumstances [2]. In this paper, the state of current research on scheduling algorithms for the new generation of computational environments will be surveyed and open problems will be discussed.

The remainder of this paper is organized as follows. An overview of the Grid scheduling problem is presented in Section 2 with a generalized scheduling architecture. In Section 3, the progress made to date in the design and analysis of scheduling algorithms for Grid computing is reviewed. A summary and some research opportunities are offered in Section 4.

2. Overview of the Grid Scheduling Problem

A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities [45]. It is a shared environment implemented via the deployment of a persistent, standards-based service infrastructure that supports the creation of, and resource sharing within, distributed communities. Resources can be computers, storage space, instruments, software applications, and data, all connected through the Internet and a middleware software layer that provides basic services for security, monitoring, resource management, and so forth. Resources owned by various administrative organizations are shared under locally defined policies that specify what is shared, who is allowed to access what, and under what conditions [48]. The real and specific problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations [44].

From the point of view of scheduling systems, a higher level abstraction of the Grid can be applied by ignoring some infrastructure components such as authentication, authorization, resource discovery and access control. Thus, in this paper, the following definition of the term Grid is adopted: "a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements" [10].

To facilitate the discussion, the following frequently used terms are defined:

• A task is an atomic unit to be scheduled by the scheduler and assigned to a resource.

• The properties of a task are parameters like CPU/memory requirement, deadline, priority, etc.

• A job (or metatask, or application) is a set of atomic tasks that will be carried out on a set of resources. Jobs can have a recursive structure, meaning that jobs are composed of sub-jobs and/or tasks, and sub-jobs can themselves be decomposed further into atomic tasks. In this paper, the terms job, application and metatask are interchangeable.

• A resource is something that is required to carry out an operation, for example: a processor for data processing, a data storage device, or a network link for data transporting.

• A site (or node) is an autonomous entity composed of one or multiple resources.

• A task scheduling is the mapping of tasks to a selected group of resources, which may be distributed in multiple administrative domains.
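To make these terms concrete, the following minimal Python sketch (ours, with illustrative field names; nothing here is prescribed by the report) shows one way the entities defined above might be represented:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Task:
        # An atomic unit to be scheduled; the property names are illustrative.
        name: str
        length: float                   # e.g., number of instructions
        memory: float = 0.0             # memory requirement
        deadline: float = float("inf")
        priority: int = 0

    @dataclass
    class Job:
        # A job (metatask): a set of atomic tasks; sub-jobs make it recursive.
        tasks: List[Task] = field(default_factory=list)
        subjobs: List["Job"] = field(default_factory=list)

    @dataclass
    class Resource:
        # Something required to carry out an operation, e.g., a processor.
        name: str
        speed: float                    # instructions per unit time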


2.1 The Grid Scheduling Process and Components

A Grid is a system of high diversity, which is rendered by various applications, middleware components, and resources. But from the point of view of functionality, we can still find a logical architecture of the task scheduling subsystem in the Grid. For example, Zhu [23] proposes a common Grid scheduling architecture. We can also generalize the scheduling process in the Grid into three stages: resource discovering and filtering, resource selecting and scheduling according to certain objectives, and job submission [94]. As a study of scheduling algorithms is our primary concern here, we focus on the second step. Based on these observations, Fig. 1 depicts a model of a Grid scheduling system in which functional components are connected by two types of data flow: resource/application information flow and task or task scheduling command flow.

Fig. 1: A logical Grid scheduling architecture: broken lines show resource or application information flows and solid lines show task or task scheduling command flows.

Basically, a Grid scheduler (GS) receives applications from Grid users, selects feasible resources for these applications according to acquired information from the Grid Information Service module, and finally generates application-to-resource mappings, based on certain objective functions and predicted resource performance. Unlike their counterparts in traditional parallel and distributed systems, Grid schedulers usually cannot control Grid resources directly, but work like brokers or agents [3], or are even tightly coupled with the applications, as the application-level scheduling scheme proposes [?][105]. They are not necessarily located in the same domain with the resources which are visible to them. Fig. 1 only shows one Grid scheduler, but in reality multiple such schedulers might be deployed, and organized to form different structures (centralized, hierarchical and decentralized [55]) according to different concerns, such as performance or scalability. Although a Grid level scheduler (or Metascheduler, as it is sometimes referred to in the literature, e.g., in [?]) is not an indispensable component in the Grid infrastructure (e.g., it is not included in the Globus Toolkit [25], the de facto standard in the Grid computing community), there is no doubt that such a scheduling component is crucial for harnessing the potential of Grids as they are expanding quickly, incorporating resources from supercomputers to desktops. Our discussion on scheduling algorithms is based on the assumption that there are such schedulers in a Grid.

Information about the status of available resources is very important for a Grid scheduler to make a proper schedule, especially when the heterogeneous and dynamic nature of the Grid is taken into account. The role of the Grid information service (GIS) is to provide such information to Grid schedulers. GIS is responsible for collecting and predicting resource state information, such as CPU capacities, memory size, network bandwidth, software availabilities and the load of a site in a particular period. GIS can answer queries for resource information or push information to subscribers. The Globus Monitoring and Discovery System (MDS) [33] is an example of GIS.

Besides raw resource information from GIS, application properties (e.g., approximate instruction quantity, memory and storage requirements, subtask dependency in a job and communication volumes) and the performance of a resource for different application species are also necessary for making a feasible schedule. Application profiling (AP) is used to extract properties of applications, while analogical benchmarking (AB) provides a measure of how well a resource can perform a given type of job [6][96]. On the basis of knowledge from AP and AB, and following a certain performance model [4], cost estimation computes the cost of candidate schedules, from which the scheduler chooses those that can optimize the objective functions.
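As a rough illustration of this cost-estimation step, the following sketch assumes a hypothetical performance-model object exposing exec_time and staging_time estimates (names are our assumptions, not an API from any cited system); the scheduler simply picks the candidate mapping with the lowest estimated cost:

    def estimated_cost(mapping, perf_model):
        # Total predicted cost of one candidate schedule: execution plus
        # data-staging time for every task-to-resource placement.
        return sum(perf_model.exec_time(task, res) + perf_model.staging_time(task, res)
                   for task, res in mapping.items())

    def choose_schedule(candidate_mappings, perf_model):
        # Pick the candidate mapping whose estimated cost is minimal.
        return min(candidate_mappings, key=lambda m: estimated_cost(m, perf_model))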

The Launching and Monitoring (LM) module (also known as the ...

... massively parallel processor computers (MPP) and clusters of workstations (COW). Looking back at such efforts, we find that scheduling algorithms have evolved along with the architecture of parallel and distributed systems. Table 1 captures some important features of parallel and distributed systems and the typical scheduling algorithms they adopt.

Table 1: Evolution of scheduling algorithms with parallel and distributed computing systems

  Typical Architecture            | DSM, MPP                  | COW                       | Grid
  Chronology                      | Late 1970s                | Late 1980s                | Mid 1990s
  Typical System Interconnect     | Bus, Switch               | Commercial LAN, ATM       | WAN/Internet
  ...                             | Very Low                  | Low                       | Usually Not High
  Cost of Interconnection         | Negligible                | Negligible                | Not Negligible
  Interconnection Heterogeneity   | None                      | Low                       | High
  Node Heterogeneity              | None                      | Low                       | High
  Single System Image             | Yes                       | Yes                       | No
  Resource Pool Static/Dynamicity | Predetermined and Static  | Predetermined and Static  | Not Predetermined and Dynamic
  Resource Management Policy      | Monotone                  | Monotone                  | Diverse
  Typical Scheduling Algorithms   | Homogeneous Scheduling Algorithms | Heterogeneous Scheduling Algorithms | Grid Scheduling Algorithms

Although we can look for inspiration in previous research, traditional scheduling models generally produce poor Grid schedules in practice. The reason can be found by going through the assumptions underlying traditional systems [4]:

• All resources reside within a single administrative domain.

• To provide a single system image, the scheduler controls all of the resources.

• The resource pool is invariant.

• Contention caused by incoming applications can be managed by the scheduler according to some policies, so that its impact on the performance that the site can provide to each application can be well predicted.

• Computations and their data reside in the same site, or data staging is a highly predictable process, usually from a predetermined source to a predetermined destination, which can be viewed as a constant overhead.

Unfortunately, all these assumptions fail to hold in Grid circumstances. In Grid computing, many unique characteristics make the design of scheduling algorithms more challenging [23], as explained in what follows.

• Heterogeneity and Autonomy

Although heterogeneity is not new to scheduling algorithms even before the emergence of Grid computing, it is still far from fully addressed and remains a big challenge for scheduling algorithm design and analysis. In Grid computing, because resources are distributed in multiple domains in the Internet, not only the computational and storage nodes but also the underlying networks connecting them are heterogeneous. The heterogeneity results in different capabilities for job processing and data access.

In traditional parallel and distributed systems, the computational resources are usually managed by a single control point. The scheduler not only has full information about all running/pending tasks and resource utilization, but also manages the task queue and resource pool. Thus it can easily predict the behaviours of resources, and is able to assign tasks to resources according to certain performance requirements. In a Grid, however, resources are usually autonomous and the Grid scheduler does not have full control of the resources. It cannot violate the local policies of resources, which makes it hard for the Grid scheduler to estimate the exact cost of executing a task on different sites. The autonomy also results in diversity in local resource management and access control policies, such as, for example, the priority settings for different applications and the resource reservation methods. Thus, a Grid scheduler is required to be adaptive to different local policies. The heterogeneity and autonomy on the Grid user side are represented by various parameters, including application types, resource requirements, performance models, and optimization objectives. In this situation, new concepts such as application-level scheduling and Grid economy [20] have been proposed and applied to Grid scheduling.

• Performance Dynamism*

Making a feasible schedule usually depends on an estimate of the performance that candidate resources can provide, especially when the algorithms are static. Grid schedulers work in a dynamic environment where the performance of available resources is constantly changing. The change comes from site autonomy and the competition by applications for resources. Because of resource autonomy, Grid resources are usually not dedicated to a Grid application. For example, a Grid job submitted remotely to a computer cluster might be interrupted by a cluster-internal job which has a higher priority; new resources may join which can provide better services, or some other resources may become unavailable. The same problem happens to the networks connecting Grid resources: the available bandwidth can be heavily affected by Internet traffic flows which are not relevant to Grid jobs. For a Grid application, this kind of contention results in performance fluctuation, which makes it a hard job to evaluate Grid scheduling performance under classic performance models. From the point of view of job scheduling, performance fluctuation might be the most important characteristic of Grid computing compared with traditional systems. A feasible scheduling algorithm should be able to adapt to such dynamic behaviors. Some other measures are also provided to mitigate the impact of this problem, such as QoS negotiation, resource reservation (provided by the underlying resource management system) and rescheduling. We discuss algorithms related to these mechanisms in Section 3.

• Resource Selection and Computation-Data Separation

In traditional systems, the executable codes of applications and their input/output data usually reside in the same site, or the input sources and output destinations are determined before the application is submitted. Thus the cost of data staging can be neglected, or it is a constant determined before execution, and scheduling algorithms need not consider it. But in a Grid, which consists of a large number of heterogeneous computing sites (from supercomputers to desktops) and storage sites connected via wide area networks, the computation sites of an application are usually selected by the Grid scheduler according to resource status and certain performance models. Additionally, in a Grid, the communication bandwidth of the underlying network is limited and shared by a host of background loads, so the inter-domain communication cost cannot be neglected. Further, many Grid applications are data intensive, so the data staging cost is considerable. This situation brings about the computation-data separation problem: the advantage brought by selecting a computational resource that can provide low computational cost may be neutralized by its high access cost to the storage site.

* We use the term Dynamism in this paper to refer to the dynamic change in Grid resource performance provided to a Grid application.

These challenges depict unique characteristics of Grid computing, and put significant obstacles in the way of designing and implementing efficient and effective Grid scheduling systems. It is believed, however, that research achievements on traditional scheduling problems can still provide stepping-stones when a new generation of scheduling systems is being constructed [8].

3. Grid Scheduling Algorithms: State of the Art

It is well known that the complexity of a general scheduling problem is NP-Complete [42]. As mentioned in Section 1, the scheduling problem becomes more challenging because of some unique characteristics belonging to Grid computing. In this section, we provide a survey of scheduling algorithms in Grid computing, which will form a basis for the discussion of open issues in the next section.

3.1 A Taxonomy of Grid Scheduling Algorithms

In [24], Casavant et al. propose a hierarchical taxonomy for scheduling algorithms in general-purpose parallel and distributed computing systems. Since the Grid is a special kind of such systems, scheduling algorithms in the Grid fall into a subset of this taxonomy. From the top to the bottom, this subset can be identified as follows.

• Local vs. Global

At the highest level, a distinction is drawn between local and global scheduling. The local scheduling discipline determines how the processes resident on a single CPU are allocated and executed; a global scheduling policy uses information about the system to allocate processes to multiple processors to optimize a system-wide performance objective. Obviously, Grid scheduling falls into the global scheduling branch.

• Static vs. Dynamic

The next level in the hierarchy (under global scheduling) is a choice between static and dynamic scheduling. This choice indicates the time at which the scheduling decisions are made. In the case of static scheduling, information regarding all resources in the Grid as well as all the tasks in an application is assumed to be available by the time the application is scheduled. By contrast, in the case of dynamic scheduling, the basic idea is to perform task allocation on the fly as the application executes. This is useful when it is impossible to determine the execution time, the direction of branches and the number of iterations in a loop, as well as in the case where jobs arrive in a real-time mode. These variances introduce forms of non-determinism into the running program [42]. Both static and dynamic scheduling are widely adopted in Grid computing. For example, static scheduling algorithms are studied in [?], [23] and [8], and dynamic scheduling algorithms are presented in [68], [106], [28] and [9].

Fig. 2: A hierarchical taxonomy for scheduling algorithms. Branches covered by Grid scheduling algorithms to date are denoted in italics. Examples of each covered branch are shown at the leaves.

o Static Scheduling

In the static mode, every task comprising the job is assigned once to a resource. Thus, the placement of an application is static, and a firm estimate of the cost of the computation can be made in advance of the actual execution. One of the major benefits of the static model is that it is easier to program from a scheduler's point of view. The assignment of tasks is fixed a priori, and estimating the cost of jobs is also simplified. The static model allows a global view of tasks and their costs; in a real Grid, however, fluctuation in resource performance is quite possible and beyond the capability of a traditional scheduler running static scheduling policies. To alleviate this problem, some auxiliary mechanisms such as rescheduling [3] are introduced, at the cost of overhead for task migration. Another side-effect of introducing these measures is that the gap between static scheduling and dynamic scheduling becomes less important [52].

o Dynamic Scheduling

Dynamic scheduling is usually applied when it is difficult to estimate the cost of applications, or when jobs arrive online dynamically (in this case, it is also called online scheduling). A good example of these scenarios is the job queue management in some metacomputing systems like Condor [2] and Legion [26]. Dynamic task scheduling has two major components [8]: system state estimation (rather than the cost estimation of static scheduling) and decision making. System state estimation involves collecting state information throughout the Grid and constructing an estimate. On the basis of the estimate, decisions are made to assign a task to a selected resource. Since the cost of an assignment is not available, a natural way to keep the whole system healthy is to balance the loads of all resources. The advantage of dynamic load balancing over static scheduling is that the system need not be aware of the run-time behavior of the application before execution. It is particularly useful in a system where the primary performance goal is maximizing resource utilization, rather than minimizing the runtime of individual jobs [64]. If a resource is assigned too many tasks, it may invoke a balancing policy to decide whether to transfer some tasks to other resources, and which tasks to transfer. According to who initiates the balancing process, there are two different approaches: sender-initiated, where a node that receives a new task but does not want to run it initiates the task transfer, and receiver-initiated, where a node that is willing to receive a new task initiates the process [95]. According to how the dynamic load balancing is achieved, there are four basic approaches [42]:

• Unconstrained First-In-First-Out (FIFO, also known as First-Come-First-Served)

• Balance-constrained techniques

• Cost-constrained techniques

• Hybrids of static and dynamic techniques

Unconstrained FIFO: In the unconstrained FIFO approach, the resource with the currently shortest waiting queue or the smallest waiting queue time is selected for the incoming task. This policy is also called opportunistic load balancing (OLB) [6] or the myopic algorithm. The major advantage of this approach is its simplicity, but it is often far from optimal.
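A minimal sketch of this policy (using queue length as the load measure; a real system might use estimated queue time instead):

    def olb_assign(task, resources, queues):
        # Opportunistic load balancing: send the incoming task to the
        # resource whose waiting queue is currently shortest.
        target = min(resources, key=lambda r: len(queues[r]))
        queues[target].append(task)
        return target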

Balance-constrained: The balance-constrained approach attempts to rebalance the loads on all resources by periodically shifting waiting tasks from one waiting queue to another. In a massive system such as the Grid, this could be very costly due to the considerable communication delay, so some adaptive local rebalancing heuristic can be applied. For example, tasks are initially distributed to all resources, and then, instead of computing the global rebalance, the rebalancing only happens inside a ... In this way, tasks can be quickly distributed to all resources and started quickly; the rebalancing process is distributed and scalable; and the communication delay of rebalancing can be reduced, since task shifting only happens among resources that are near each other.

Cost-constrained: ... required resources for pending jobs.

Hybrid: A further improvement is static-dynamic hybrid scheduling. The main idea behind hybrid techniques is to take advantage of static schedules and at the same time capture the uncertain behaviors of applications and resources. For the scenario of an application with uncertain behavior, static scheduling is applied to those parts that always execute. At run time, scheduling is done using statically computed estimates that reduce run-time overhead. That is, static scheduling is done on the always-executed tasks, and dynamic scheduling on the others. For example, in those cases where some tasks have special QoS requirements, the static phase can be used to map those tasks with QoS requirements, and dynamic scheduling can be used for the remaining tasks. For the scenario of poorly predictable resource behaviors, static scheduling is used to initiate task assignment at the beginning, and dynamic balancing is activated when the performance estimate on which the static scheduling is based fails. Spring et al. show an example of this scenario in [100].

Some other dynamic online scheduling algorithms, such as those in [2] and [?], consider the case of resource reservation, which is popular in Grid computing as a way to get a degree of certainty in resource performance. The algorithms in these two examples aim to minimize the makespan of incoming jobs which consist of a set of tasks. Mateescu [?] uses a resource selector to find a co-reservation for jobs requiring multiple resources. The job queue is managed in a FIFO fashion with dynamic priority correction. If co-reservation fails for a job in a scheduling cycle, the job's priority will be promoted in the queue for the next scheduling round. The resource selector ranks candidate resources by their number of processors and memory size. Aggarwal's method [2] is introduced in subsection 3.3.

• Optimal vs. Suboptimal

In the case that all information regarding the state of resources and the jobs is known, an optimal assignment could be made based on some criterion function, such as minimum makespan or maximum resource utilization. But due to the NP-Complete nature of scheduling algorithms and the difficulty in Grid scenarios of making the reasonable assumptions usually required to prove the optimality of an algorithm, current research tries to find suboptimal solutions, which can be further divided into the following two general categories.

• Approximate vs. Heuristic

The approximate algorithms use formal computational models, but instead of searching the entire solution space for an optimal solution, they are satisfied when a solution that is sufficiently good is found. In the case where a metric is available for evaluating a solution, this technique can be used to decrease the time taken to find an acceptable schedule. The factors which determine whether this approach is worthy of pursuit include [24]:

• Availability of a function to evaluate a solution.

• The time required to evaluate a solution.

• The ability to judge the value of an optimal solution according to some metric.

• Availability of a mechanism for intelligently pruning the solution space.

If traditional evaluation metrics, e.g., makespan, are used for task scheduling in Grid computing, the dynamic nature of Grid computing will violate the above conditions (see 3.2), so that there are no such approximation algorithms known to date. The only approximate algorithms in Grid scheduling at the time of this writing are based on a newly proposed objective function: Total Processor Cycle Consumption [50][5].

The other branch in the suboptimal category is called heuristic. This branch represents the class of algorithms which make the most realistic assumptions about a priori knowledge concerning process and system loading characteristics. It also represents solutions to the scheduling problem which cannot give optimal answers but only require a reasonable amount of cost and other system resources to perform their function. The evaluation of this kind of solution is usually based on experiments in the real world or on simulation. Not restricted by formal assumptions, heuristic algorithms are more adaptive to Grid scenarios where both resources and applications are highly diverse and dynamic, so most of the algorithms to be discussed in what follows are heuristics.

• Distributed vs. Centralized

In dynamic scheduling scenarios, the responsibility for making global scheduling decisions may lie with one centralized scheduler, or be shared by multiple distributed schedulers. In a computational Grid, there might be many applications submitted or required to be rescheduled simultaneously. The centralized strategy has the advantage of ease of implementation, but suffers from a lack of scalability and fault tolerance, and the possibility of becoming a performance bottleneck. For example, Sabin et al. [88] propose a centralized metascheduler which uses backfill to schedule parallel jobs in multiple heterogeneous sites. Similarly, Arora et al. [6] present a completely decentralized, dynamic and sender-initiated scheduling and load balancing algorithm for the Grid environment. A property of this algorithm is that it uses a smart search strategy to find partner nodes to which tasks can migrate. It also overlaps this decision-making process with the actual execution of ready jobs, thereby saving precious processor cycles.

• Cooperative vs. Non-cooperative

If a distributed scheduling algorithm is adopted, the next issue that should be considered is whether the nodes involved in job scheduling are working cooperatively or independently (non-cooperatively). In the non-cooperative case, individual schedulers act alone as autonomous entities and arrive at decisions regarding their own optimum objects, independent of the effects of those decisions on the rest of the system. Good examples of such schedulers in the Grid are application-level schedulers, which are tightly coupled with particular applications and optimize their private individual objectives.

In the cooperative case, each Grid scheduler has the responsibility to carry out its own portion of the scheduling task, but all schedulers are working toward a common system-wide goal. Each Grid scheduler's local policy is concerned with making decisions in concert with the other Grid schedulers in order to achieve some global goal, instead of making decisions which will only affect local performance or the performance of a particular job. An example of cooperative Grid scheduling is presented in [95], where the efficiency of sender-initiated and receiver-initiated algorithms adopted by distributed Grid schedulers is compared with that of centralized scheduling and local scheduling.

The hierarchical taxonomy classifies scheduling algorithms mainly from the system's point of view, such as dynamic or static, distributed or centralized. There are still many other important aspects forming a scheduling algorithm that cannot be covered by this method. Casavant et al. [24] call them flat classification characteristics. In this paper, we discuss the following properties and related examples, which are rendered by current scheduling algorithms when they are confronted with the new challenges of the Grid computing scenario: What is the goal of scheduling? Is the algorithm adaptive? Is there dependency among the tasks in an application? How are large volumes of input and output data dealt with during scheduling? How do QoS requirements influence the scheduling product? How does the scheduler fight against dynamism in the Grid? Finally, what new methodologies are applied to the Grid scheduling problem?

3.2 Objective Functions

Fig. 3: Objective functions covered in this survey.

The two major parties in Grid computing, namely, resource consumers who submit various applications, and resource providers who share their resources, usually have different motivations when they join the Grid. These incentives are represented by objective functions in scheduling. Currently, most objective functions in Grid computing are inherited from traditional parallel and distributed systems. Grid users are basically concerned with the performance of their applications, for example the total cost to run a particular application, while resource providers usually pay more attention to the performance of their resources, for example the resource utilization in a particular period. Thus objective functions can be classified into two categories: application-centric and resource-centric [23]. Fig. 3 shows the objective functions we will meet in the following discussion.

• Application-Centric

Scheduling algorithms adopting an application-centric scheduling objective function aim to optimize the performance of each individual application, as application-level schedulers do. Most current Grid applications' concerns are about time, for example the makespan, which is the time spent from the beginning of the first task in a job to the end of the last task of the job. Makespan is one of the most popular measurements of scheduling algorithms, and many of the examples given in the following discussion adopt it. As economic models [8][9][43][24] are introduced into Grid computing, the economic cost that an application needs to pay for resource utilization becomes a concern of some Grid users. This objective function is widely adopted by Grid economic models, which are mainly discussed in Subsection 3.6. Besides these simple functions, many applications use compound objective functions; for example, some want both shorter execution time and lower economic cost. The primary difficulty facing the adoption of this kind of objective function lies in the normalization of two different measurements: time and money. Such situations make scheduling in the Grid much more complicated. It is required that Grid schedulers be adaptive enough to deal with such compound missions. At the same time, the development of the Grid infrastructure has shown a service-oriented tendency [49], so the quality of service (QoS) becomes a big concern of many Grid applications in such a non-dedicated dynamic environment. The meaning of QoS is highly dependent on particular applications, ranging from hardware capacity to software existence. Usually, QoS is a constraint imposed on the scheduling process instead of the final objective function. The involvement of QoS usually affects the resource selection step in the scheduling process, and then influences the final objective optimization. Such scenarios will be discussed in 3.7.

• Resource-Centric

Scheduling algorithms adopting resource-centric scheduling objective functions aim to optimize the performance of the resources. Resource-centric objectives are usually related to resource utilization, for example, throughput, which is the ability of a resource to process a certain number of jobs in a given period, and utilization, which is the percentage of time a resource is busy. Low utilization means a resource is idle and wasted. For a multiprocessor resource, utilization differences among processors also describe the load balance of the system and decrease the throughput. Condor is a well-known system adopting throughput as the scheduling objective [2]. As economic models are introduced into Grid computing, economic profit (the economic benefit resource providers can get by attracting Grid users to submit applications to their resources) also comes under the purview of resource management policies.

In Grid computing environments, due to autonomy both in Grid users and in resource providers, application-centric objectives and resource-centric objectives are often at odds. Legion [26] provides a methodology allowing each group to express their desires, and acts as a mediator to find a resource allocation that is acceptable to both parties, through a flexible, modular approach to scheduling support.

The objective functions mentioned above were widely adopted before the emergence of Grid computing, and many efforts have been made to approach an approximation [2][30][59][10] or to get an optimal solution under one assumption, namely, that the resources are dedicated so that they can provide constant performance to an application. But as we have emphasized in Section 2, this assumption does not hold in Grid computing. This violation weakens the previous results. For example, assume an optimal schedule with makespan OPT can be found if the resources involved are stable. If the Grid resources are suddenly slowed down at OPT due to some reason (interrupted by resources' local jobs, network contention, or whatever) and the slow-speed situation continues for a long period, then the makespan of the actual schedule is far from OPT and cannot be bounded by scheduling algorithms that cannot predict the performance change. So, if the objective function of a schedule is makespan, and there is no bound either on resource performance or on the time period of the change, in other words, if we cannot predict the performance fluctuations, there will be no makespan approximation algorithm in general that applies to a Grid [5].

To escape from this predicament, a novel scheduling criterion for Grid computing is proposed in [5]: total processor cycle consumption (TPCC), which is the total number of instructions the Grid could compute from the starting time of executing the schedule to its completion time. TPCC represents the total computing power consumed by an application. In this new criterion, the length of a task is the number of instructions in the task; the speed of a processor is the number of instructions computed per unit time; and processors in a Grid are heterogeneous, so they have various speeds. In addition, the speed of each processor varies over time due to the contention expected in an open system. Let s_{p,t} be the speed of processor p during the time interval [t, t+1), where t is a non-negative integer. Without loss of generality, it can be assumed that the speed of each processor does not vary during the interval [t, t+1), for every t, by adopting an interval as short as the unit time. It is also assumed that the value of any s_{p,t} cannot be known in advance. Fig. 4(a) shows an example of a set of tasks. Fig. 4(b) shows an example of a processor speed distribution.

Fig. 4: A new Grid scheduling criterion: TPCC [50].

Let T be a set of n independent tasks with the same length L. Let S be a schedule of T in a Grid with m processors. Let M be the makespan of S. The speed of processor p during the time interval [t, t+1) is s_{p,t}. Then, the TPCC of S is defined as:

    \sum_{p=1}^{m} \sum_{t=0}^{\lfloor M \rfloor - 1} s_{p,t} \;+\; \sum_{p=1}^{m} \left( M - \lfloor M \rfloor \right) s_{p,\lfloor M \rfloor}

So the TPCC of the schedule in Fig. 4(c) is obtained by summing the per-interval speeds of all processors over the makespan, including the fractional final interval (the numeric values come from the figure).
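The definition translates directly into code. The following sketch (ours) computes TPCC from a per-processor speed table, assuming speeds[p][t] holds s_{p,t}:

    import math

    def tpcc(speeds, makespan):
        # speeds[p][t]: speed of processor p during [t, t+1), in
        # instructions per unit time. Sums the whole intervals up to
        # floor(M), plus the fractional last interval, as defined above.
        whole = int(math.floor(makespan))
        total = 0.0
        for row in speeds:
            total += sum(row[t] for t in range(whole))
            if makespan > whole:
                total += (makespan - whole) * row[whole]
        return total

    # Example: two processors, makespan 2.5 time units.
    # tpcc([[2, 2, 1], [1, 1, 1]], 2.5)
    #   == (2 + 2 + 0.5*1) + (1 + 1 + 0.5*1) == 7.0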


The advantage of TPCC is that it is little affected by the variance of resource performance, yet it is still related to the makespan. Since the total number of instructions needed to run a job is constant, approximation algorithms based on this criterion can be developed. In [50], a (1 + m(log_e(m) + 1)/n)-approximation algorithm is given for coarse-grained independent tasks in the Grid. A (1 + L_cp(n) m(log_e(m) + 1)/n)-approximation algorithm for scheduling coarse-grained tasks with precedence orders is described in [5], where L_cp(n) is the critical path of the task graph.

3.3 Adaptive Scheduling

An adaptive solution to the scheduling problem is one in which the algorithms and parameters used to make scheduling decisions change dynamically according to the previous, current and/or future resource status [24]. In Grid computing, the demand for scheduling adaptation comes from three points: the heterogeneity of candidate resources, the dynamism of resource performance, and the diversity of applications, as Fig. 5 shows. Corresponding to these three points, we can find three kinds of examples as well.

Fig. 5: Taxonomy for adaptive scheduling algorithms in Grid computing.

Resource Adaptation: Because of resource heterogeneity and application diversity, discovering available resources and selecting an application-appropriate subset of those resources are very important for achieving high performance or reducing cost; for example, Su et al. [102] show how the selection of a data storage site affects the network transmission delay. Dail et al. [34] propose a resource selection algorithm in which available resources are grouped first into disjoint subsets according to the network delays between the subsets. Then, inside each subset, resources are ranked according to their memory size and computational power. Finally, an appropriately-sized resource group is selected from the sorted lists. The upper bound for this exhaustive search procedure is given and claimed acceptable in computational Grid circumstances. Subhlok et al. [104] show algorithms to jointly analyze computation and communication resources for different application demands, and a framework for automatic node selection. The algorithms are adaptive to demands like selecting a set of nodes to maximize the minimum available bandwidth between any pair of nodes, or selecting a set of nodes to maximize the minimum available fractional compute and communication capacities. The complexity of these algorithms is also analyzed, and the results show it is insignificant in comparison with the execution time of the applications that they are applied to.
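A sketch of this two-phase selection, assuming the subsets are already ordered by network delay and that rank_key encodes the memory/compute ranking (both assumptions of ours, not details from [34]):

    def select_resources(delay_ordered_subsets, group_size, rank_key):
        # Within each delay-grouped subset, rank the resources (e.g. by
        # memory size, then compute power) and return the first subset
        # able to supply a group of the requested size.
        for subset in delay_ordered_subsets:
            ranked = sorted(subset, key=rank_key, reverse=True)
            if len(ranked) >= group_size:
                return ranked[:group_size]
        return None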


Dynamic Performance Adaptation: Adaptation to the dynamic performance of resources is mainly exhibited in: (i) changing scheduling policies or rescheduling [100][8] (for example, switching between static scheduling algorithms which use predicted resource information and dynamic ones which balance the static scheduling results), (ii) distributing workload according to application-specific performance models [?], and (iii) finding a proper number of resources to be used [22][5]. Applications to which these adaptive strategies are applied usually adopt some kind of divide-and-conquer approach to solve a certain problem [8]. In the divide-and-conquer approach, the initial problem can be recursively divided into sub-problems which can be solved more easily. As a special case of the divide-and-conquer approach, a model for applications following the manager/worker model is shown in Fig. 6 [63]. From an initial task (node A in Fig. 6), a number of tasks (nodes B, C, D and E) are launched to execute on pre-selected or dynamically assigned resources. Each task may receive a discrete set of data, fulfil its computational task independently, and deliver its output (node F). Examples of such applications include parameter sweep applications [22][23][60], and data stripe processing [?][100]. In contrast with the manager/worker model (where the manager is in charge of the behaviors of its workers), an active adaptation method named Cluster-aware Random Stealing (CRS) for Grid computing systems is proposed in [8], based on the traditional Random Stealing (RS) algorithm. CRS allows an idle resource to steal jobs not only from the local cluster but also from remote ones, with a very limited amount of wide-area communication. Thus, load balancing among nodes running a divide-and-conquer application is achieved. In reviewing experiences gained from application-level scheduling in Grid computing, Berman et al. [3] note that via schedule adaptation, it is possible to use sophisticated scheduling heuristics, like list-scheduling approaches which are sensitive to performance prediction errors, for Grid environments in which resource availabilities change over time.

Fig. 6: The parallel workflow of a divide-and-conquer application.

Application Adaptation: To achieve high performance, application-level schedulers in the Grid (e.g., AppLeS [3]) are usually tightly integrated with the application itself and are not easily applied to other applications. As a result, each scheduler is application-specific. Noticing this limitation, Dail et al. [34] explicitly decouple the scheduler core (the searching procedure introduced at the beginning of this subsection) from the application-specific (e.g., performance models) and platform-specific (e.g., resource information collection) components used by the core. The key feature implementing this decoupling (while still keeping awareness of application characteristics) is that application characteristics are recorded and/or discovered by components such as a specialized compiler and Grid-enabled libraries. These application characteristics are communicated via well-defined interfaces to schedulers, so that schedulers can be general-purpose while still providing services that are appropriate to the application at hand. Aggarwal et al. [2] consider another case that applications in Grid computing often meet, namely, resource reservation, and develop a generalized Grid scheduling algorithm that can efficiently assign jobs having arbitrary inter-dependency constraints and arbitrary processing durations to resources having prior reservations. Their algorithm also takes into account arbitrary delays in the transfer of data from parent tasks to child tasks. In fact, this is a heuristic list algorithm, which we will discuss in the next subsection. In [3], Wu et al. give a very good example of how a self-adaptive scheduling algorithm cooperates with long-term resource performance prediction [54][105]. Their algorithm is adaptive to indivisible single sequential jobs, jobs that can be partitioned into independent parallel tasks, and jobs that have a set of indivisible tasks. When the prediction error of the system utilization reaches a threshold, the scheduler will try to reallocate tasks.

3.4 Task Dependency of an Application

When the relations among tasks in a Grid application are considered, a common dichotomy used is dependency vs. independency. Usually, dependency means there are precedence orders existing among tasks, that is, a task cannot start until all its parents are done. Dependency has a crucial impact on the design of scheduling algorithms, so in this subsection, algorithms are discussed following the same dichotomy, as shown in Fig. 7.

Fig. 7: Task dependency taxonomy of Grid scheduling algorithms.

3.4.1 Independent Task Scheduling

As a set of independent tasks arrive, from the system's point of view, a common strategy is to assign them according to the load of resources in order to achieve high system throughput. This approach was discussed under the dynamic branch in Subsection 3.1. From the point of view of applications, some static heuristic algorithms based on execution cost estimates can be applied [?].

• Examples of Static Algorithms with Performance Estimates

MET (Minimum Execution Time): MET assigns each task to the resource with the best expected execution time for that task, no matter whether this resource is available or not at the present time. The motivation behind MET is to give each task its best machine. This can cause a severe load imbalance among machines. Even worse, this heuristic is not applicable to heterogeneous computing environments where resources and tasks are characterized as consistent, which means that a machine that can run a task faster will run all the other tasks faster.

MCT (Minimum Completion Time): MCT assigns each task, in an arbitrary order, to the resource with the minimum expected completion time for that task. This causes some tasks to be assigned to machines that do not have the minimum execution time for them. The intuition behind MCT is to combine the benefits of opportunistic load balancing (OLB) and MET, while avoiding the circumstances in which OLB and MET perform poorly.
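The two heuristics can be sketched as follows; exec_time is an assumed estimate of a task's execution time on a resource, and ready tracks when each resource becomes free:

    def met_assign(task, resources, exec_time):
        # MET: the resource with minimum expected execution time for this
        # task, regardless of its current load (may cause load imbalance).
        return min(resources, key=lambda r: exec_time(task, r))

    def mct_assign(task, resources, exec_time, ready):
        # MCT: the resource with minimum expected completion time, i.e.
        # the time the resource becomes free plus the task's execution
        # time there; ready times are updated as tasks are assigned.
        best = min(resources, key=lambda r: ready[r] + exec_time(task, r))
        ready[best] += exec_time(task, best)
        return best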

Min-min: The Min-min heuristic begins with the set U of all unmapped tasks. Then, the set of minimum completion times M for each task in U is found. Next, the task with the overall minimum completion time in M is selected and assigned to the corresponding machine (hence the name Min-min). Last, the newly mapped task is removed from U, and the process repeats until all tasks are mapped (i.e., U is empty). Min-min is based on the minimum completion time, as is MCT. However, Min-min considers all unmapped tasks during each mapping decision, while MCT only considers one task at a time. Min-min maps the tasks in the order that changes the machine availability status by the least amount that any assignment could. Let t_i be the first task mapped by Min-min onto an empty system. The machine that finishes t_i the earliest, say m_j, is also the machine that executes t_i the fastest. For every task that Min-min maps after t_i, the Min-min heuristic changes the availability status of m_j by the least possible amount for every assignment. Therefore, the percentage of tasks assigned to their first choice (on the basis of execution time) is likely to be higher for Min-min than for Max-min (see below). The expectation is that a smaller makespan can be obtained if more tasks are assigned to the machines that complete them the earliest and also execute them the fastest.

Max-min: The Max-min heuristic is very similar to Min-min. It also begins with the set U of all unmapped tasks. Then, the set of minimum completion times M is found. Next, the task with the overall maximum in M is selected and assigned to the corresponding machine (hence the name Max-min). Last, the newly mapped task is removed from U, and the process repeats until all tasks are mapped (i.e., U is empty). Intuitively, Max-min attempts to minimize the penalties incurred by performing tasks with longer execution times. Assume, for example, that the job being mapped has many tasks with very short execution times and one task with a very long execution time. Mapping the task with the longer execution time to its best machine first allows this task to be executed concurrently with the remaining tasks (with shorter execution times). For this case, this would be a better mapping than a Min-min mapping, where all of the shorter tasks would execute first, and then the longer running task would be executed while several machines sit idle. Thus, in cases similar to this example, the Max-min heuristic may give a mapping with a more balanced load across machines and a better makespan.
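Since Max-min differs from Min-min only in which task is picked each round, both can be sketched in one routine (exec_time again being an assumed execution-time estimate):

    def min_min(tasks, resources, exec_time, use_max_min=False):
        # Min-min (or Max-min when use_max_min=True) for independent tasks.
        # Each round: compute every unmapped task's minimum completion
        # time (MCT) over all resources, then map the task whose MCT is
        # smallest (Min-min) or largest (Max-min), and update the chosen
        # resource's ready time.
        ready = {r: 0.0 for r in resources}
        unmapped = set(tasks)
        mapping = {}
        while unmapped:
            best = {}
            for t in unmapped:
                r = min(resources, key=lambda res: ready[res] + exec_time(t, res))
                best[t] = (ready[r] + exec_time(t, r), r)
            select = max if use_max_min else min
            t = select(unmapped, key=lambda task: best[task][0])
            completion, r = best[t]
            mapping[t] = r
            ready[r] = completion
            unmapped.remove(t)
        return mapping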

Min-min and Max-min are simple and can be easily amended to adapt to different scenarios. For example, in [5], a QoS Guided Min-min heuristic is presented which can guarantee the QoS requirements of particular tasks and minimize the makespan at the same time. Wu, Shu and Zhang [5] give a Segmented Min-min algorithm, in which tasks are first ordered by their expected completion times (it could be the maximum ECT, minimum ECT or average ECT over all of the resources), then the ordered sequence is segmented, and finally Min-min is applied to each of these segments. The segmentation improves the performance of typical Min-min when the lengths of the tasks are dramatically different, by giving longer tasks a chance to be executed earlier than in the case where typical Min-min is adopted.

XSuffrage: Another popular heuristic for independent scheduling is called Suffrage

[6]. The rationale behind Suffrage is that a task should be assigned to a certain host because, if it does not go to that host, it will suffer the most. For each task, its suffrage value is defined as the difference between its best MCT and its second-best MCT. Tasks with high suffrage values take precedence. But when there is input and output data for the tasks, and resources are clustered, conventional suffrage algorithms may have problems. In this case, intuitively, tasks should be assigned to resources as near as possible to the data source to reduce the makespan. But if the resources are clustered, and nodes in the same cluster have nearly identical performance, then the best and second-best MCTs are also nearly identical, which makes the suffrage value close to zero and gives the task low priority. Other tasks might be assigned to these nodes, so that the task might be pushed away from its data source. To fix this problem, Casanova et al. give an improvement called XSuffrage in [23], which gives a cluster-level suffrage value to each task. Experiments show that XSuffrage outperforms the conventional Suffrage not only in the case where large data files are needed, but also when the resource information cannot be predicted very accurately.
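A sketch of the suffrage computation (ready and exec_time as in the earlier sketches; ties and data placement are ignored here):

    def suffrage_order(tasks, resources, exec_time, ready):
        # Rank tasks by suffrage value: second-best MCT minus best MCT.
        # A task that would suffer most if denied its best host goes first.
        def suffrage(task):
            cts = sorted(ready[r] + exec_time(task, r) for r in resources)
            return cts[1] - cts[0] if len(cts) > 1 else float("inf")
        return sorted(tasks, key=suffrage, reverse=True)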

Task Grouping: The above algorithms are usually used to schedule applications that consist of a set of independent coarse-grained compute-intensive tasks. This is the ideal case for which the computational Grid was designed. But there are other cases in which applications consist of a large number of lightweight jobs. The overall processing of these applications involves a high overhead cost in terms of scheduling and transmission to or from Grid resources. Muthuvelu et al. [9] propose a dynamic task grouping scheduling algorithm to deal with these cases. Once a set of fine-grained tasks is received, the scheduler groups them according to their requirements for computation (measured in number of instructions) and the processing capability that a Grid resource can provide in a certain time period. All tasks in the same group are submitted to the same resource, which can finish them all in the given time. By this means, the overhead for scheduling and job launching is reduced and resource utilization is increased.
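A simplified sketch of the grouping step, assuming for brevity a single capacity value (the number of instructions the target resource can process in the chosen time period), whereas the algorithm described above derives this per resource:

    def group_tasks(task_lengths, capacity):
        # Greedily pack fine-grained tasks (lengths in instructions) into
        # groups whose total length fits within the given capacity.
        groups, current, used = [], [], 0.0
        for length in task_lengths:
            if current and used + length > capacity:
                groups.append(current)
                current, used = [], 0.0
            current.append(length)
            used += length
        if current:
            groups.append(current)
        return groups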

The Problem of Heterogeneity: In heterogeneous environments, the performance of the above algorithms is also affected by the rate of heterogeneity of the tasks and the resources, as well as the consistency of the tasks' estimated completion times on different machines. The study in [6] shows that no single algorithm can have a permanent advantage in all scenarios. This result clearly leads to the conclusion that if performance as high as possible is wanted in a computational Grid, the scheduler should have the ability to adapt to different application/resource heterogeneities.

• Algorithms without Performance Estimates

The algorithms introduced above use predicted performance to make task assignments. In [103] and [9], two algorithms are proposed that do not use performance estimates but instead adopt the idea of duplication, which is feasible in the Grid environment, where computational resources are usually abundant but mutable.

Subramani et al [103] introduce a simple distributed duplication scheme for independent job scheduling in the Grid. A Grid scheduler distributes each job to the K least-loaded sites. Each of these sites schedules the job locally. When a job is able to start at any of the sites, that site informs the scheduler at the job-originating site, which in turn contacts the other K-1 sites to cancel the job from their respective queues. By placing each job in the queue at multiple sites, the expected results are improved system utilization and reduced average job makespan. The parameter K can be varied depending upon the scalability required. Silva et al [7] propose a resource-information-free algorithm called Workqueue with Replication (WQR) for independent job scheduling in the Grid. The WQR algorithm uses task replication to cope with the heterogeneity of hosts and tasks, and also with the dynamic variation of resource availability due to the load generated by other users in the Grid. Unlike the scheme in [103], where no duplicated tasks are actually running, in WQR an idle resource will replicate tasks that are still running on other resources. Tasks


are replicated until a predefined maximum number of replicas is reached. When a task replica finishes, the other replicas are cancelled. In this approach, performance is increased in situations where tasks are assigned to slow or busy hosts, because when a task is replicated there is a greater chance that some replica is assigned to a fast, idle host. Another advantage of this scheme is that it increases immunity to performance change, since the possibility that all sites change at once is much smaller than for a single site.
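A rough sketch of the dispatching rule just described, assuming the scheduler tracks waiting and running tasks and a predefined replica limit (names hypothetical, not taken from the WQR paper):

    def wqr_step(idle_host, waiting, running, max_replicas):
        # Plain workqueue behaviour: hand out unstarted tasks first.
        if waiting:
            task = waiting.pop(0)
            running.setdefault(task, []).append(idle_host)
            return task
        # No waiting tasks: replicate a running task still below its cap.
        for task, hosts in running.items():
            if len(hosts) < max_replicas:
                hosts.append(idle_host)   # start one more replica of this task
                return task
        return None  # nothing to do; when a replica finishes, cancel its siblings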

3.4.2 Dependent Task Scheduling

When tasks composing a job have precedence orders, a popular model applied is the directed acyclic graph (DAG), in which a node represents a task and a directed edge denotes the precedence order between its two vertices. In some cases, weights can be added to nodes and edges to express computational costs and communication costs, respectively. As Grid computing infrastructures become more mature and powerful, support for complicated workflow applications, which can usually be modeled by DAGs, is provided. We can find such tools as Condor DAGMan [2], CoG [4], Pegasus [3], GridFlow [2] and ASKALON [10]. A comprehensive survey of these systems is given in [2], while we continue to focus on their scheduling algorithm components in what follows.
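For concreteness, the sketches in the rest of this subsection assume a weighted DAG stored in three plain mappings such as the following (toy numbers, hypothetical names):

    # comp[t]: computational cost of task t; comm[(u, v)]: communication cost
    # of edge u -> v; succ[t]: immediate successors of t in the DAG.
    comp = {"t1": 4, "t2": 3, "t3": 5, "t4": 2}
    comm = {("t1", "t2"): 6, ("t1", "t3"): 1, ("t2", "t4"): 2, ("t3", "t4"): 4}
    succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}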

@ Grid Systems Supporting Dependent Task Scheduling

To run a workflow in a Grid, we need to consider two problems: 1) how the tasks in the workflow are scheduled, and 2) how to submit the scheduled tasks to Grid resources without violating the structure of the original workflow. Grid workflow generators address the first problem, and Grid workflow engines are used to deal with the second.

o Grid Workflow Engines

Grid workflow engines are responsible for the second problem. CoG is a set of APIs which can be used to submit concrete workflows to the Grid, where a concrete workflow means that the tasks in a DAG are already mapped to the resource locations where they are to be executed; thus CoG itself does not consider the optimization problem of workflows. DAGMan works similarly to CoG. It accepts DAG description files representing workflows and then, following the order of tasks and the dependency constraints in the description files, submits tasks to Condor-G, which schedules them onto the best machines available in a FIFO strategy without any long-term optimization, just as it does with common Condor tasks.

o Grid Workflow Generators

Pegasus provides a bridge between Grid users and workflow execution engines like DAGMan. In Pegasus, there are two kinds of workflows: abstract workflows, which are composed of tasks (referred to as application components in Pegasus) and their dependencies reflecting the data dependencies of the tasks, and concrete workflows, which are the mappings of abstract workflows to Grid resources. Pegasus' main concern is to generate these two kinds of workflows according to the demands by users for certain data products. It does so by searching the available application components which can produce the required data products, and the available input and intermediate data replicas in the Grid. To this end, it provides a Concrete Workflow Generator (CWG) [36]. CWG performs the mapping from an abstract workflow to a concrete workflow and generates the corresponding DAGMan submit files. It automatically identifies physical locations for both application components and data, finds appropriate resources to execute the components relying on GIS, and generates an executable workflow that can be submitted to Condor-G through DAGMan. When there are multiple appropriate resources available, CWG supports a few standard selection algorithms: random, round-robin and min-min [38]. Resource selection algorithms are pluggable components in Pegasus, so third-party developed algorithms can be applied according to different concerns. As an example, Blythe et al [6] present a multiple-round mixed min-min and max-min algorithm for resource selection, in which the final mapping


selected is the one that has the minimal makespan. Considering the dynamism of the Grid, instead of submitting the whole task graph at once, Pegasus applies a workflow partitioning method that submits layer-partitioned subgraphs iteratively. But, as shown in the discussion below, layered partitioning may not exploit the advantages of locality and task independence and, as a result, may produce bad schedules, especially when the DAG is unbalanced. This weakness is also demonstrated in [10].

Similarly to Pegasus, ICENI [8] also adopts pluggable algorithms for mapping abstract workflows to concrete workflows, and in [7], random, best-of-n-random, simulated annealing and game theory algorithms are tested. The latter two algorithms will be discussed in Subsection 3.6.

In GridFlow [2], workflow scheduling is conducted hierarchically by a global Grid workflow manager and local Grid sub-workflow schedulers. The global Grid workflow manager receives requests from users with the workflow description in XML, and then simulates the workflow execution to find a near-optimal schedule in terms of makespan. The simulation is done by polling the local Grid schedulers, which can estimate the finish times of sub-workflows on their local sites. A fuzzy timing technique is used to get the estimate, and the possibility of conflict on a shared resource among tasks from different sub-workflows is considered. The advantage of fuzzy functions is that they can be computed very fast and are suitable for scheduling time-critical Grid applications, although they do not necessarily provide the best scheduling solution. GridFlow also provides rescheduling functionality when the real execution is delayed too far from the estimate.

We see from the above discussion that most current efforts have been directed towards supporting workflows at the programming level, thus providing potential opportunities for algorithm designers (as they allow scheduling algorithms to be plugged in). As Grid computing inherits problems from traditional systems, a natural question to ask is: what can be learned from the extensive studies on DAG scheduling algorithms in heterogeneous computing? A complete survey of these algorithms is beyond the scope of this paper, but some ideas and common examples are discussed below to show the problems we are still confronted with in the Grid.

@ Taxonomy of Algorithms for Dependent Task Scheduling

Considering communication delays when making scheduling decisions introduces a big challenge: the trade-off between taking advantage of maximal parallelism and minimizing communication delay. This problem is also called the max-min problem [42]. High parallelism means dispatching more tasks simultaneously to different resources, thus increasing the communication cost, especially when the communication delay is very high. However, clustering tasks on only a few resources means low resource utilization. To deal with this problem in heterogeneous computing systems, three kinds of heuristic algorithms were previously proposed.

o List Heuristics

In general, list scheduling is a class of scheduling heuristics in which tasks are assigned priorities and placed in a list ordered by decreasing priority. Whenever tasks contend for processing, the selection of the tasks to be processed immediately is done on the basis of priority, with higher-priority tasks being assigned resources first [42]. The differences among the various list heuristics lie mainly in how the priority is defined and when a task is considered ready for assignment.

An important issue in DAG scheduling is how to rank (or weight) the nodes and edges (when communication delay is considered). The rank of a node is used as its priority in the scheduling. Once the nodes and edges are ranked, a task-to-resource assignment can be found by considering the following two problems to minimize the makespan: how to


parallelize those tasks having no precedence orders among them in the graph, and how to make the time cost along the critical path in the DAG as small as possible. Many list heuristics have been invented; some new proposals, together with comparisons against older algorithms, can be found in [10], [7] and [83].

Two Classic Examples

HEFT: Topcuoglu et al [10] present a heuristic called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm. The HEFT algorithm selects the task with the highest upward rank (the upward rank is defined as the maximum distance from the current node to an exit node, including both computational and communication costs) at each step. The selected task is then assigned to the processor which minimizes its earliest finish time, using an insertion-based approach which considers the possible insertion of a task into the earliest idle time slot between two already-scheduled tasks on the same resource. The time complexity of HEFT is O(e×p), where e is the number of edges and p is the number of resources.

HEFT might be one of the most frequently cited list algorithms aiming to reduce the makespan of tasks in a DAG. For example, it is tested in ASKALON and compared with a genetic algorithm and a myopic algorithm [10], and the results show its effectiveness in Grid scenarios, especially when the task graph is unbalanced.
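A compact sketch of the upward-rank computation that drives HEFT, reusing the toy comp/comm/succ mappings introduced earlier; comp here stands for the mean execution cost over processors, as in the original formulation (names hypothetical):

    from functools import lru_cache

    def upward_ranks(comp, comm, succ):
        # rank_u(t) = comp[t] + max over successors s of (comm[(t, s)] + rank_u(s));
        # exit tasks get rank_u(t) = comp[t].
        @lru_cache(maxsize=None)
        def rank(t):
            return comp[t] + max((comm[(t, s)] + rank(s) for s in succ[t]), default=0)
        return {t: rank(t) for t in comp}

    # HEFT considers tasks in decreasing rank order, then places each task on
    # the processor giving the earliest insertion-based finish time.
    ranks = upward_ranks(comp, comm, succ)
    schedule_order = sorted(comp, key=ranks.get, reverse=True)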

FCP: Radulescu et al [83] present another list heuristic, called Fast Critical Path (FCP), intended to reduce the complexity of list heuristics while maintaining the scheduling performance. The motivation of FCP is based on the following observation regarding the complexity of list heuristics. Basically, a list heuristic has the following phases: the O(e + v) time ranking phase, the O(v log v) time ordering phase, and finally the O((e + v)·p) time resource selection phase, where e is the number of edges, v is the number of tasks and p is the number of resources. Usually the third term is larger than the second. The FCP algorithm does not sort all the tasks at the beginning, but maintains only a limited number of tasks sorted at any given time. Instead of considering all processors as possible targets for a given task, the choice is restricted to either the processor from which the last message to the given task arrives, or the processor which becomes idle the earliest. As a result, the time complexity is reduced to O(v log p + e).
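The processor-restriction rule that yields this reduced complexity can be sketched as follows (hypothetical names: last_message_from maps a task to the processor of its last-arriving parent message, and idle_at maps each processor to the time it becomes free):

    def fcp_candidate_processors(task, last_message_from, idle_at):
        # FCP considers only two processors for each task: the one that sent
        # the last message to this task, and the one idle the earliest.
        earliest_idle = min(idle_at, key=idle_at.get)
        return {last_message_from[task], earliest_idle}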

The Problem of Heterogeneity:

A critical issue in list heuristics for DAGs is how to compute a node's rank. In a heterogeneous environment, the execution time of the same task will differ on different resources, as will the communication cost via different network links. So for a particular node, its rank will also be different if it is assigned to different resources. The problem is how to choose the proper value used to make the ordering decision. These values could be the mean value (like the original HEFT in [10]), the median value [62], the worst value, the best value and so on. But Zhao et al [22] have shown that different choices can affect the performance of list heuristics such as HEFT dramatically (the makespan can change by 41.2% for certain graphs). Motivated by this observation, Sakellariou et al [7] gave a Hybrid algorithm which is less sensitive to the different approaches for ranking nodes. In this algorithm, tasks are ranked upward and sorted in decreasing order. Then the sorted tasks are grouped along the sorted sequence such that in every group the tasks are independent of each other. Finally, each group can be assigned to resources using heuristics for independent tasks.
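A sketch of the grouping pass of this Hybrid algorithm (hypothetical names; is_descendant(t, u) is assumed to test whether t depends, directly or transitively, on u):

    def hybrid_groups(tasks_by_rank, is_descendant):
        # Walk tasks in decreasing upward-rank order; cut a new group whenever
        # a task depends on a task already in the current group, so that every
        # group contains only mutually independent tasks.
        groups, current = [], []
        for t in tasks_by_rank:
            if any(is_descendant(t, u) for u in current):
                groups.append(current)
                current = []
            current.append(t)
        if current:
            groups.append(current)
        return groups  # schedule each group with an independent-task heuristic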

The above algorithms have explored how the heterogeneity of resources and tasks impacts the scheduling algorithm, but they consider only the heterogeneity of computational resources and miss the heterogeneity of communication links.

Instances of List Heuristics in Grid Computing

Previous research on DAG scheduling algorithms is very helpful when we consider the same problem in the Grid scenario. For example, a list scheduling algorithm is proposed in [8], which is similar to the HEFT algorithm but changes the method of computing the


level of a task node: it not only includes the node's longest path to an exit node, but also takes the incoming communication cost from its parents into account. In [5], Ma et al propose a new list algorithm called the Extended Dynamic Critical Path (xDCP) algorithm, a Grid-enabled version of the Dynamic Critical Path (DCP) algorithm, which was originally applied in homogeneous environments. The idea behind DCP is to continuously shorten the critical path in the task graph, by scheduling tasks on the current CP to a resource where a task on the critical path can obtain an earlier start time. The xDCP algorithm was proposed for scheduling parameter-sweep applications in a heterogeneous Grid. The improvements include: (i) initially shuffling tasks to multiple resources when the scheduling begins, instead of keeping them on one node; (ii) using the finish time instead of the start time to rank task nodes, to adapt to heterogeneous resources; and (iii) multiple rounds of scheduling to improve the current schedule, instead of scheduling only once. The complexity of xDCP is O(v³), which is the same as that of DCP.


o Duplication-Based Algorithms

An alternative way to shorten the makespan is to duplicate tasks on different resources. The main idea behind duplication-based scheduling is to utilize resource idle time to duplicate predecessor tasks. This may avoid the transfer of results from a predecessor to a successor, thus reducing the communication cost. In this way, duplication can help solve the max-min problem.

Duplication-based algorithms differ according to their task selection strategies for duplication. Originally, the algorithms in this group were usually designed for an unbounded number of identical processors, such as distributed-memory multiprocessor systems. They also have higher complexity than the algorithms discussed above. For example, Darbha et al [35] present such an algorithm, named TDS (Task Duplication-based Scheduling), for a distributed-memory machine, with a complexity of O(v²). (Note that this complexity is for homogeneous environments.)

TDS: In [35], for each node in a DAG, its earliest start time (est), earliest completion time (ect), latest allowable start time (last), latest allowable completion time (lact), and favorite predecessor (fpred) are computed. The last value is the latest time at which a task may start; otherwise, the successors of this task will be delayed (that is, their est will be violated). The favorite predecessors of a node i are those predecessors of i such that, if i is assigned to the same processor on which they run, est(i) will be minimized. The level value of a node (the length of the longest path from that node to an exit node (also known as a sink node), ignoring the communication costs along that path) is used as the priority determining the processing order of the tasks. To compute these values, the whole DAG of the job is traversed, and the complexity needed for this step is O(e + v). Based on these values, task clusters are created iteratively. The clustering step is like a depth-first search from an unassigned node having the lowest level value towards an entry node. Once an entry node is reached, a cluster is generated, and tasks in the same cluster will be assigned to the same resource. In this step, the last and lact values are used to determine whether duplication is needed. For example, if j is a favorite predecessor of i and (last(i) - lact(j)) < c(j,i), where c(j,i) is the communication cost between j and i, then i will be assigned to the same processor as j, and if j has already been assigned to another processor, it will be duplicated onto i's processor. In the clustering step, the DAG is traversed similarly to a depth-first search from the exit node, and the complexity of this step is the same as that of a general search algorithm, which is also O(v + e). So the overall complexity is O(v + e). In a dense DAG, the number of edges is proportional to O(v²), which gives the worst-case complexity of the duplication algorithm. Note that in the clustering step, the number of resources available is always assumed to be no smaller than required; that is, the number of resources is unbounded.
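The duplication test at the heart of this clustering step can be sketched as follows (hypothetical names; last, lact and comm are the per-task tables defined above):

    def duplicate_favorite_predecessor(i, j, last, lact, comm):
        # j is the favorite predecessor of i. If j's result cannot arrive over
        # the network before i's latest allowable start time, j is placed (or
        # duplicated) on i's processor instead of paying comm[(j, i)].
        return (last[i] - lact[j]) < comm[(j, i)]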

TANH: To exploit the duplication idea in heterogeneous environments, a new


algorithm called TANH (Task duplication-based scheduling Algorithm for a Network of Heterogeneous systems) is presented in [84] and [7]. Compared to the version for homogeneous resources, the heterogeneous version has a higher complexity, which is O(v²p). This is reasonable, since the execution time of a task differs on different resources. A new parameter is introduced for each task: the favorite processor (fp), which is the processor that can complete the task earliest. The other parameters of a task are computed based on the value of fp. In the clustering step, the initial task of a cluster is assigned to its first fp; if the first fp has already been assigned, then it goes to the second, and so on. A processor reduction algorithm is also provided in [7], which is used to merge clusters when the number of processors available is less than the number of clusters generated.

Duplication-based algorithms are very useful in Grid environments. The computational Grid usually has abundant computational resources (recall that the number of resources is unbounded in some duplication algorithms) but high communication costs, which can make task duplication very cost-effective. Duplication has already received some attention (see, for example, [103] and [7]), but current duplication-based scheduling algorithms in the Grid only deal with independent jobs. There are opportunities to create new algorithms for scheduling complicated DAGs in an environment that is not only heterogeneous but also dynamic.

o Clustering Heuristics

In parallel and distributed systems, clustering is an efficient way to reduce the communication delay in DAGs: heavily communicating tasks are grouped into the same labeled clusters, and all tasks in a cluster are then assigned to the same resource. In general, clustering algorithms have two phases: the task clustering phase, which partitions the original task graph into clusters, and a post-clustering phase, which can refine the clusters produced in the previous phase and produce the final task-to-resource map.

     

     


Fig. 8: (a) A DAG with computational and communication costs. (b) A linear clustering. (c) A nonlinear clustering [53].

Algorithms for Task Clustering

At the beginning, each node in a task graph is an independent cluster. In each iteration, the previous clusters are refined by merging some of them. Generally, clustering algorithms map the tasks in a given DAG to an unlimited number of resources. In practice, an additional cluster merging step is needed after the clusters are generated, so that the number of clusters can be made equal to the number of processors. A task cluster can be linear or nonlinear. A clustering is called nonlinear if two independent tasks are mapped to the same cluster; otherwise it is called linear. Fig. 8 shows a DAG with computational and communication costs (Fig. 8(a)), a linear clustering with three clusters {n1, n2, n7}, {n3, n4, n6}, {n5} (Fig. 8(b)), and a nonlinear clustering with clusters {n1, n7}, {n3, n4, n5, n6}, and {n2} (Fig. 8(c)) [53]. The problem of obtaining an optimal clustering of a general task graph is NP-complete, so heuristics are designed to deal with this problem ([53], [?], [?], [2]).

DSC: Yang et al [?] propose a clustering heuristic called the Dominant Sequence Clustering (DSC) algorithm. The critical path of a scheduled DAG is called the Dominant Sequence (DS), to distinguish it from the critical path of a clustered DAG. The critical path of a clustered graph is the longest path in that graph, including both the non-zero communication edge costs and the task weights on that path. The makespan in executing a clustered DAG is determined by the Dominant Sequence, not by the critical path of the clustered DAG. Fig. 9(a) shows the critical path of a clustered graph, which consists of {n1, n2, n7} with a length of 9. Fig. 9(b) is a schedule of this clustered graph, and Fig. 9(c) gives the DS of the scheduled task graph, which consists of {n1, n3, n4, n5, n6, n7} with a length of 10 [53].


Fig. 9: (a) The clustered DAG and its CP shown in thick arrows. (b) The Gantt chart of a schedule. (c) The scheduled DAG and the DS shown in thick arrows [53].

In the DSC algorithm, task priorities are dynamically computed as the sum of their top level and bottom level. The top level and bottom level are the sums of the computation and communication costs along the longest path from the given task to an entry task and to an exit task, respectively. While the bottom level is statically computed at the beginning, the top level is computed incrementally during the scheduling process. Tasks are scheduled in the order of their priorities; the current node is the unassigned node with the highest priority. Since the entry node always has the longest path to the exit node, clustering always begins with the entry node. The current node is merged into the cluster of one of its predecessors such that its top level value is minimized. If every possible merging would increase the top level value, the current node remains in its own cluster. After the current node is clustered, the priorities of all its successors are updated. The time complexity of DSC is O((e + v) log v), in which the O(log v) factor comes from the priority update at each step using a binary heap, and (e + v) is for the graph traversal in the clustering iterations. So for a dense task graph, the complexity is roughly O(v² log v).
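A sketch of the statically computed half of this priority (hypothetical names; the incremental top-level maintenance is omitted for brevity):

    def blevel(t, comp, comm, succ, memo=None):
        # Bottom level: longest computation+communication path from t down to
        # an exit node, including comp[t] itself; computed once, statically.
        if memo is None:
            memo = {}
        if t not in memo:
            memo[t] = comp[t] + max(
                (comm[(t, s)] + blevel(s, comp, comm, succ, memo) for s in succ[t]),
                default=0)
        return memo[t]

    # DSC priority of a task = tlevel(t) + blevel(t), where tlevel (the longest
    # path from an entry node to t) is updated incrementally as clusters merge.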

CASS-II: Liou et al [?] present a clustering algorithm called CASS-II, which employs a two-step approach. In the first step, CASS-II computes for each node v a value s(v), which is the length of the longest path from an entry node to v (excluding the execution time of v). Thus, s(v) is the start time of v before clustering, and s(v) is 0 if v is an entry node. The second step is the clustering step. Just like DSC, it consists of a sequence of refinement steps, where each refinement step creates a new cluster or "grows" an existing cluster. Unlike DSC, CASS-II constructs the clusters bottom-up, i.e., starting from the exit nodes. To construct the clusters, the algorithm computes for each node v a value f(v), which is the length of the longest path from v to an exit node in the current partially clustered DAG. Let l(v) = f(v) + s(v). The algorithm uses l(v) to determine whether the node v can be considered for clustering at the current refinement step. The clustering begins by placing every exit node in its own cluster, and then goes through a sequence of iterations. In each iteration, it considers for clustering every node v whose immediate successors have all been clustered. Node v is merged into the cluster of the successor which gives it the minimum l(v) value, provided the merge does not increase the l(v) value. CASS-II does not re-compute the critical path in each refinement step. Therefore, the algorithm has a complexity of O(e + v log v), and it is shown to be 3 to 5 times faster than DSC in the experiments reported in [?].

Algorithms for the Post-clustering Phase

In [2], the steps after task clustering are studied, which include cluster merging, processor assignment and task ordering on the local processors. For cluster merging, three strategies are compared, namely: load balancing (LB), communication traffic minimization (CTM), and random (RAND).

o LB: Define the (computational) workload of a cluster as the sum of the execution times of the tasks in the cluster. At each merging step, choose a cluster, C1, that has the minimum workload among all clusters, and find a cluster, C2, that has the minimum workload among those clusters which have a communication edge between themselves and C1. Then the pair of clusters C1 and C2 are merged (a sketch of this step is given below).

o CTM: Define the (communication) traffic of a pair of clusters (C1, C2) as the sum of the communication times of the edges from C1 to C2 and from C2 to C1. At each merging step, merge the pair of clusters which have the most traffic.

o RAND: At each merging step, merge a random pair of clusters.

For the processor assignment, a simple heuristic is applied to find a one-to-one mapping between clusters and processors: (1) assign the cluster with the largest total communication traffic with all other clusters to a processor; (2) choose an unassigned cluster having the largest communication traffic with an assigned cluster and place it on the processor closest to its communicating partner; (3) repeat (2) until all clusters have been assigned to processors.
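A sketch of one LB merging step, as promised above (hypothetical names; the step is repeated until the number of clusters matches the number of processors):

    def lb_merge_step(clusters, workload, connected):
        # clusters: cluster ids; workload[c]: sum of task execution times in c;
        # connected(a, b): True if some edge joins a task in a to a task in b.
        c1 = min(clusters, key=lambda c: workload[c])
        partners = [c for c in clusters if c != c1 and connected(c, c1)]
        if not partners:
            return None          # c1 communicates with no other cluster
        c2 = min(partners, key=lambda c: workload[c])
        return (c1, c2)          # merge this pair and update its workload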

Experimental results in [?] and [2] indicate that the performance of clustering heuristics, evaluated by makespan, depends on the granularity of the tasks of a graph. The granularity of a task is the ratio of its execution time to the overhead incurred when communicating with other tasks. This result means that an adaptive ability will help the scheduler provide higher scheduling quality when jobs have high diversity.

The Problem of Heterogeneity

According to the basic idea of task clustering, clustering heuristics need not consider the heterogeneity of the resources in the clustering phase. But in the subsequent cluster merging and resource assignment phases, heterogeneity will definitely affect the final performance. Obviously, the research in [2] does not consider this problem, and to our knowledge no other research on this problem has been performed. Clustering heuristics have not yet been improved for Grid computing either, where communication is usually costly and the performance of resources varies over time. Therefore this remains an interesting topic for research in the Grid computing environment. Another value of the clustering heuristic for Grid scheduling is its multi-phase nature, which gives the Grid scheduler more flexibility to employ different strategies according to the configuration and organization of the underlying resources.

@ Algorithms Considering the Dynamism of the Grid

However, there is an important issue for Grid computing which has not been discussed yet: resource performance dynamism. All the algorithms mentioned in this subsection schedule whole task graphs on the basis of static resource performance estimates, which could be jeopardized by resource performance changes during the execution period. Usually the performance dynamism results from competition among jobs sharing the same resource. This problem can be reconciled by considering the possibility of conflicts when the scheduling decision is made. He et al [56] show an example of this approach. Their algorithm considers the optimization of DAG makespan on multiclusters which have their own local schedulers and queues shared by other background workloads, which arrive as a linear function of time. The motivation is to map as many tasks as possible to the same cluster in order to fully utilize the parallel processing capability, and at the same time to reduce the inter-cluster communication. The schedulers have a hierarchical structure: the global scheduler is responsible for mapping tasks to different clusters according to their latest finish times, in order to minimize the excess over the length of the critical path. The local scheduler on each multicluster provides the estimated finish time of a particular task on that cluster, reports it to the global scheduler upon queries, and manages its local queue in a FIFO way. The time complexity of the global mapping algorithm is O(p(n+1)n² + e), where p is the number of multiclusters.


Another approach to dealing with the dynamic problem is to apply dynamic algorithms. In [5], the authors propose a pM-S algorithm which extends the traditional dynamic Master-Slave scheduling model. In the pM-S algorithm, two queues are used by the master: the unscheduled queue and the ready queue. Only tasks in the ready queue can be directly dispatched to slave nodes, and a task in the unscheduled queue can only be put into the ready queue when all of its parents are already in the ready queue or dispatched. The dispatching order in the ready queue is based on the tasks' priorities. When a task is finished, the priorities of all its children's ancestors are dynamically promoted.
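A minimal sketch of the two-queue discipline just described (hypothetical names; priority ordering inside the ready queue is elided):

    def promote_ready_tasks(unscheduled, ready, dispatched, parents):
        # A task leaves the unscheduled queue for the ready queue once all of
        # its parents are already in the ready queue or have been dispatched.
        for task in list(unscheduled):
            if all(p in ready or p in dispatched for p in parents[task]):
                unscheduled.remove(task)
                ready.append(task)   # the ready queue is kept sorted by priority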

In [62], another dynamic algorithm is proposed for scheduling DAGs in a shared heterogeneous distributed system. Unlike the previous works, in which a unique global scheduler exists, in this work the authors consider multiple independent