
[IEEE 2011 IEEE 9th International Conference on Dependable, Autonomic and Secure Computing (DASC) - Sydney, Australia (2011.12.12-2011.12.14)] 2011 IEEE Ninth International Conference


Priority-based scheduling for Large-Scale Distribute Systems with Energy Awareness

Masnida Hussin, Young Choon Lee, Albert Y. Zomaya Centre for Distributed and High Performance Computing,

School of Information Technologies The University of Sydney

NSW 2006, Australia [email protected], {young.lee, albert.zomaya}@sydney.edu.au

Abstract—Large-scale distributed computing systems (LDSs), such as grids and clouds, are primarily designed to provide massive computing capacity. These systems often dissipate excessive energy, both to power and to cool them. Concerns over greening these systems have prompted a call for scheduling policies with energy awareness (e.g., energy proportionality). The dynamic and heterogeneous nature of resources and tasks in LDSs is a major hurdle to overcome when designing energy-efficient scheduling policies. In this paper, we address the problem of scheduling tasks with different priorities (deadlines) for energy efficiency by exploiting resource heterogeneity. Specifically, our investigation focuses on two issues: (1) balancing the workload so that utilization is maximized and (2) managing power by controlling the execution of tasks on processors so that energy is optimally consumed. We form a hierarchical scheduler that exploits the multi-core architecture for effective scheduling. Our scheduling approach exploits the diversity of task priorities for proper load balancing across heterogeneous processors while observing energy consumption in the system. Simulation experiments demonstrate the efficacy of our approach, and the comparison results indicate that our scheduling policy helps improve the energy efficiency of the system.

Keywords– energy efficiency, dynamic scheduling, task priority.

I. INTRODUCTION

Large-scale distributed computing systems (LDSs) are composed of a set of heterogeneous resources that greatly facilitate the effective exploitation of various parallel applications. In general, high performance is the main purpose of provisioning such large-scale computing capacity. This massive computing capacity is achieved primarily by increasing the volume of system components, e.g., processors, memory and disks. Only recently have the vast amounts of power consumed by LDSs drawn attention from not only the ICT community but also society as a whole. The excessive heat emitted by these systems causes notable energy consumption for cooling them. Moreover, those systems may not always run at high utilization (often less than 20%) [1, 2]. Nonetheless, they dissipate a substantial portion of energy simply to remain available. With the explosive increase in energy consumption, energy efficiency for high performance computing in LDSs (also known as ‘green computing’) has become a prominent issue.

Concerns over greening LDSs have prompted a call for scheduling policies with energy awareness (e.g., energy proportionality). Typically, scheduling and resource allocation in LDSs need to address deadline constraints and system heterogeneity while enhancing users’ satisfaction (e.g., response time). The efficacy of a scheduling policy is greatly influenced by the availability of prior information about tasks and resources, as well as by the accuracy of this information [3]. The problem of scheduling and resource allocation becomes further complicated when energy consumption is added as a system objective. Hence, the scheduling policy must be designed to meet processing requirements while providing energy efficiency to the system. Energy consumption also varies with the system workload. For instance, the system consumes more energy if numerous tasks arrive within a short time interval, yet it is underutilized most of the time. Clearly, it is hard to obtain optimal scheduling decisions (in terms of energy consumption) with dynamic workloads.

In this work, we study the use of dynamic scheduling with task priority consideration, aiming for better performance and energy efficiency. Our scheduling strategy is constructed using heuristics based on mapping analysis and task execution on processors. We exploit the multi-core processor architecture for effective scheduling while bringing new optimization opportunities to energy management. This is realized in two ways. First, we determine appropriate processors that can satisfy the urgency of tasks to improve successful execution (deadline satisfaction). Second, we address the issue of energy-aware scheduling by continuously monitoring the processors being used. Their computational power is determined by the capacity of their cores and is changed dynamically based on the workload in the system. As such, the approach sustains performance growth while observing energy use in the processors, such that energy is optimally consumed.

The results obtained from our comparative evaluation study clearly show that our adaptive (priority-based) scheduling outperforms other scheduling schemes in terms of performance by a noticeable margin. It is shown that energy efficiency is realized without significantly affecting response time.

2011 Ninth IEEE International Conference on Dependable, Autonomic and Secure Computing

978-0-7695-4612-4/11 $26.00 © 2011 IEEE

DOI 10.1109/DASC.2011.96

The remainder of this paper is organized as follows. Section II describes related work on energy efficiency using task-scheduling strategies. Section III details the models used in the paper. Our adaptive scheduling approach is presented in Section IV. Experimental settings and results are presented in Section V. Finally, Section VI concludes the paper.

II. RELATED WORK

Energy management for computing components has been an active area of research over the years. Processors are the major energy consumers [2, 4]; thus, accounting for their energy consumption is necessary for energy-efficient computing. Numerous research efforts on energy efficiency use scheduling algorithms to optimize energy consumption (e.g., [1, 5-8]). Since the performance of LDSs greatly influences processing time, heuristics are most popularly adopted in the scheduling model. This is because heuristic methods tend to produce competitive solutions with lower time complexity and react competently in a highly heterogeneous environment.

Generally, a resource manager or scheduler can contribute much to the overall energy efficiency of the system. It has the ability to work within the processing requirements/constraints put forth by the system users. The processing constraints must be handled effectively, particularly in the presence of dynamic computing; otherwise, they may lead to load imbalance, over-provisioning of resources, and system unreliability. The schedulers in [5, 6] adaptively deal with processing constraints for reliable execution while minimizing energy consumption. The work in [5] developed a software framework to implement and evaluate various scheduling techniques that save energy with minimal performance impact. The EASY backfilling policy is proposed in [7] to increase resource utilization by continuously monitoring the load in the system. It adaptively selects a certain number of resources to be put into ‘sleep mode.’ In [8], a meta-scheduler is used to select the most energy-efficient resource site. The scheduler decides the time slot in which tasks should be executed at the minimum CPU frequency to save energy. A task consolidation technique is proposed in [1], aiming to maximize resource utilization. The approach offers promising energy-saving capability, as energy consumption is significantly reduced when a task is consolidated with one or more other tasks.

These approaches have demonstrated their effectiveness in minimizing energy consumption while still meeting certain performance goals. However, their efficacy in dealing with system dynamicity is limited. The scope of energy efficiency should also be stretched further to incorporate the dynamicity and heterogeneity of both resources and tasks. In a large-scale dynamic environment, it is beneficial for a scheduler to measure its performance and adapt accordingly. The scheduler typically strives to minimize response time and ensure fairness among the running tasks in the system. The impending widespread use of multiple processor cores appears to be an excellent opportunity for realizing performance and energy benefits [9]. To seek optimal performance, the scheduler needs to actively adapt to the topologies of multi-core processors while being aware of task characteristics. Because cores in a processor are in very close proximity to each other [10], load balancing can be done very effectively, reducing the idle time of processors. Dynamic scheduling can be a very effective approach for proper load balancing while tracking energy consumption. While most previous energy efficiency solutions deal with homogeneous resources and/or adopt a static scheduling policy, our scheduling approach explicitly takes into account processing priority with heterogeneous resources.

III. MODELS

In this section, we describe the application, system and energy models used in our study.

A. Application model

Tasks are assumed to be sequential applications, each of which requires no more than one core for its execution. The completion time of a task on a particular processor denotes the elapsed time from the task’s arrival at the scheduler until it completes execution entirely:

$$CT = wait\_t + exe\_t, \qquad (1)$$

where wait_t is the elapsed time between the task’s submission and the start of its execution, and exe_t is the actual execution time of the task. Tasks arrive according to a Poisson process. We assume that the task’s profile is available and can be provided by the user using job profiling, analytical models or historical information [12]. Each task ti requires a different processing capacity for its completion, and this determines the priority of that task.

We use workload traces from real systems available from the Parallel Workloads Archive (PWA) [11] to model the distributed application. From the traces, we obtain the submission time, requested time and actual runtime of the tasks. We use the actual runtime as the deadline (or upper bound). However, the workload traces do not contain information about processing priority. Hence, we synthetically assign a priority to each task. The priority is determined based on the requested time rt and the actual runtime art. A task ti is set to high priority if its art is at least 70% of rt. If art is at most 20% of rt, the priority is considered low. Otherwise, the task is set to medium priority. Such a task priority reflects the consequence of missing the deadline.
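The completion-time definition and the synthetic priority rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the names `Priority`, `completion_time` and `assign_priority` are hypothetical, and rejecting a non-positive requested time is an assumption that the text does not address.

```python
from enum import Enum


class Priority(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def completion_time(wait_t: float, exe_t: float) -> float:
    """Equation (1): CT = wait_t + exe_t."""
    return wait_t + exe_t


def assign_priority(rt: float, art: float) -> Priority:
    """Synthetic priority from requested time rt and actual runtime art.

    high   if art is at least 70% of rt
    low    if art is at most 20% of rt
    medium otherwise
    """
    if rt <= 0:
        # Assumption: trace entries without a usable requested time are rejected.
        raise ValueError("requested time must be positive")
    ratio = art / rt
    if ratio >= 0.7:
        return Priority.HIGH
    if ratio <= 0.2:
        return Priority.LOW
    return Priority.MEDIUM
```

For example, a job that requested 100 time units and ran for 50 would be placed in the medium-priority class under this rule.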


B. System Model

The target system used in this work consists of a number of sites, in each of which a set of p heterogeneous processors is fully interconnected (Figure 1). The model allows any set of tasks to arrive at the system and to be executed on available processors. For the sake of simplicity, we form a hierarchical scheduler that handles tasks from system users and maps them onto processors. Hence, there are two types of scheduler: the global scheduler GS and the local scheduler LS. Specifically, the local scheduler LS of each site is responsible for scheduling tasks after their allocation by the global scheduler GS.

Figure 1. The System Model

Each processor is composed of a different number of cores. The processing speed of the cores in a particular processor is homogeneous and expressed in million instructions per second (MIPS). The processing capacity of processor pj is defined as:

[Equation (2): $pc_j$ expressed in terms of $L$ and $\sum_{s=1}^{C} cs_s$]

where L is the total number of tasks completed within some observation period, cs is the speed of a core and C is the total number of cores in processor pj. The processing capacity of a processor is also subject to its availability. If a processor is capable of receiving and executing a task given its energy use, the processor is set to available; otherwise, it is considered unavailable. For a given processor, both processing capacity and availability fluctuate. Therefore, the actual completion time of a task on a particular processor is difficult, if not impossible, to determine a priori.

C. Energy model

The concept of energy efficiency in our approach is defined as the subjective probability by which a processor executes a given task with optimal energy consumption. The efficiency of a processor is subject to task execution time and utilization. Hence, the energy consumption ECj of a processor pj is defined as

$$EC_j = \left(p_{max} \times \sum_{i=1}^{L} exe\_t_i\right) + \left(p_{min} \times idle\right) \qquad (3)$$

where pmax is the power usage at 100% utilization (busy), pmin is the power usage when processor pj is idle, and idle is the total idle time of pj. It is assumed that for a given processor the peak power (80 to 95 W) [13] is proportional to its processing capacity. Typically, the power consumption in the idle state is about 50% of the peak power [14]. Hence, we use pmin and pmax of 48 W and 95 W, respectively, which are common for processors used in data centers.

The scheduling scheme requires tracking and adaptive mechanisms to schedule tasks onto resources for energy efficiency. The number of incoming tasks in the system is difficult to determine accurately. This dynamic nature degrades resource performance and increases processor idle time. In response, we introduce a processing power limit for each processor, the power-threshold, to promote effective energy use. It helps monitor the level of energy use without significant performance degradation. Different processors have different numbers of cores, which clearly indicates varying levels of processing capacity. The power-threshold ptj is determined differently in two workload scenarios: lightly loaded and heavily loaded. These scenarios are determined based on the average waiting time in the local scheduler LS during a particular period of time. The system is set as lightly loaded if the difference between the average wait_t and the minimum wait_t is at least 80% of the average wait_t. Otherwise, it is considered heavily loaded. Consequently, the power-threshold ptj is set to the average speed of the cores in the processor when the system is lightly loaded and to the total speed of the cores when it is heavily loaded. Since the threshold is related to the number of cores in the processor, load balancing among processors is achieved.
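The energy model and the two load scenarios can be illustrated with a short Python sketch. It is a hedged reading of the text, not the authors' simulator: the function names are hypothetical, the energy is computed as busy power times total execution time plus idle power times idle time, and the result is in watts multiplied by whatever time unit the inputs use. An empty list of waiting times is not handled (`statistics.mean` raises in that case).

```python
from statistics import mean
from typing import Sequence

P_MAX = 95.0  # power at 100% utilization, in watts (value used in the text)
P_MIN = 48.0  # power when idle, in watts (value used in the text)


def energy_consumption(exe_times: Sequence[float], idle_time: float,
                       p_max: float = P_MAX, p_min: float = P_MIN) -> float:
    """Busy power times total execution time plus idle power times idle time."""
    return p_max * sum(exe_times) + p_min * idle_time


def is_lightly_loaded(wait_times: Sequence[float]) -> bool:
    """Lightly loaded if (average wait - minimum wait) >= 80% of the average wait."""
    avg = mean(wait_times)
    return (avg - min(wait_times)) >= 0.8 * avg


def power_threshold(core_speeds: Sequence[float],
                    wait_times: Sequence[float]) -> float:
    """Average core speed when lightly loaded, total core speed when heavily loaded."""
    if is_lightly_loaded(wait_times):
        return float(mean(core_speeds))
    return float(sum(core_speeds))
```

With four 50 MIPS cores, the threshold in this sketch is 50 when the local queue is lightly loaded and 200 when it is heavily loaded, which is how the number of cores enters the load-balancing decision.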

IV. PROPOSED SCHEDULING APPROACH

This section begins by describing our priority-based scheduling strategy and then details the energy-efficiency solution incorporated into our approach.

A. Priority-based scheduling scheme

Schedulers and processors are the two main components in the system hierarchy (Figure 2). The queues holding the waiting tasks of the global and local schedulers are called the global-queue gq and the local-queue lq, respectively. When tasks dispatched from the global scheduler GS reach the processing level, a local scheduler LS assigns these tasks to run on a set of processors. We assume that the local scheduler LS has all necessary information about the processors currently in the system. Corresponding to the two hierarchical schedulers, there are two kinds of scheduling: global scheduling, which is executed by GS, and local allocation (matching and scheduling), which is performed by LS.

[Figure 1 labels: Users; Global scheduler; Local scheduler; Processor]

[Figure 2 labels: Multi-core processors; global-queue, gq; local-queue, lq; high-priority queue; medium-priority queue; low-priority queue]

Figure 2. Hierarchical scheduler

Every submitted task first stays in gq. It is assumed that tasks in gq are prioritized by their arrival time (i.e., First Come First Served). For the sake of simplicity, the matching process in global scheduling is not considered. GS then dispatches tasks to appropriate local schedulers. The local scheduler LS is responsible for matching and assigning those tasks to suitable processors based on the local scheduling policy. The scheduling process at LS is more complicated than that at GS because tasks come with diverse processing priorities. In response, we consider three different waiting-queues, all maintained by the local scheduler: a low-priority queue (low-pq), a medium-priority queue (med-pq) and a high-priority queue (high-pq). LS adopts two different scheduling policies. Under the first, a task is assigned to a processor based on processing capacity, aiming to achieve reliable execution. We introduce a suitability/fitness value between processor pj and task ti, defined as:

[Equation (4): $fit\text{-}val$ expressed in terms of $rt_i$, $pc_j$ and $CR$]

where CR is the number of successful tasks divided by the total number of tasks executed on processor pj within some observation period. A task is assigned to the processor that gives the highest fit-val in order to meet its processing requirements. Under the second scheduling policy, tasks are randomly assigned to any available processor without considering suitability, in order to minimize waiting time and maximize resource utilization. Tasks from high-pq are mapped based on the suitability value, while tasks from low-pq are assigned randomly. For med-pq, task assignment follows either the suitability value or the random policy. If the waiting time in med-pq is found to increase constantly, the random scheduling policy is applied; otherwise, assignment continues to rely on the suitability value. Tasks from the priority queues are mapped simultaneously, and this repeats until no processing requirements remain to be executed. If more than one task competes for the same processor, the higher-priority task is executed first. This scheduling strategy is intended to increase resource utilization and the success rate.
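The choice between the two local policies can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: `select_processor` and `waiting_time_increasing` are hypothetical names, the suitability value is passed in as a callable rather than computed from Equation (4), and "constantly increases" is interpreted here as every recent waiting-time sample exceeding the previous one.

```python
import random
from typing import Callable, Optional, Sequence


def waiting_time_increasing(samples: Sequence[float]) -> bool:
    """Assumption: 'constantly increases' means each sample exceeds the previous one."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def select_processor(queue: str,
                     available: Sequence[str],
                     fit_val: Callable[[str], float],
                     med_wait_samples: Sequence[float] = (),
                     rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick a processor for the task at the head of the given priority queue.

    queue is one of 'high-pq', 'med-pq' or 'low-pq'; fit_val maps a processor
    id to its suitability for the task (supplied by the caller).
    """
    if not available:
        return None
    rng = rng or random.Random()
    use_random = (queue == "low-pq" or
                  (queue == "med-pq" and waiting_time_increasing(med_wait_samples)))
    if use_random:
        # Random policy: any available processor, ignoring suitability.
        return rng.choice(list(available))
    if queue not in ("high-pq", "med-pq"):
        raise ValueError(f"unknown queue: {queue}")
    # Suitability policy: the processor with the highest fit value.
    return max(available, key=fit_val)
```

Passing the suitability function as a parameter keeps the sketch independent of how the fitness value is actually computed.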

B. Energy consumption constraint

The priority-based scheduling scheme by itself cannot bring efficiency to the energy model. The scheduler needs to schedule tasks effectively in terms of both performance and energy consumption. For each processor, we monitor its energy consumption by controlling the processing requirements assigned to it. That is, task assignment is subject to the power-threshold of the processor. If a processor reaches its power-threshold, the task is assigned to another processor. The power-threshold also serves to control when the processor runs at higher energy consumption. Hence, we introduce the processing rate of processor j, given as:

[Equation (5): $PP_j$ expressed in terms of $exe\_t$ and $L$]

LS regularly checks whether the processing rate PPj of processor j exceeds its power-threshold ptj. A task is assigned only when the processing rate of the processor satisfies this threshold. The scheduler allows a processor to receive and execute further tasks if its energy consumption is considered moderate (i.e., PPj ≤ ptj). In this state, the processor is considered available; otherwise, it is marked as unavailable. The resource availability therefore fluctuates according to the number of tasks executed by the processors. Because the power-threshold ptj varies with the load (lightly loaded or heavily loaded), the processors are able to adapt their processing power to changes in the workload. As such, the processing capacity in the system is used efficiently and load balancing is implicitly achieved. The power-threshold of a particular processor contributes to energy efficiency only if the processing rate PPj that optimizes the processor’s resource capacity is actually realized.
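The availability check described above amounts to a simple comparison per processor. The Python sketch below is illustrative only: the `Processor` class and `available_processors` function are hypothetical, and the processing rate is taken as a monitored input rather than computed from Equation (5).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Processor:
    name: str
    processing_rate: float   # PP_j, supplied by monitoring (not computed here)
    power_threshold: float   # pt_j, which depends on the current load scenario

    @property
    def available(self) -> bool:
        """Available while PP_j <= pt_j; unavailable once the rate exceeds the threshold."""
        return self.processing_rate <= self.power_threshold


def available_processors(processors: Iterable[Processor]) -> List[Processor]:
    """Processors to which the local scheduler may still assign tasks."""
    return [p for p in processors if p.available]
```

Because the threshold is recomputed as the load scenario changes, the same processor can move between the available and unavailable sets over time.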

V. PERFORMANCE EVALUATION

In this section, we describe the experiment configuration and present results.

A. Experimental Settings

To evaluate the performance of our scheduling approach, we have conducted extensive simulations with two real workload traces, SDSC Blue and HPC2N. SDSC Blue (San Diego Supercomputer Center) is located in San Diego and has 1152 processors; its workload trace covers the period from April 2000 to January 2003. The workload trace of HPC2N (High Performance Computing Center North) was collected from July 2002 to January 2006. This system is located in Sweden and is composed of 240 compute nodes. Both traces were chosen for the extensive time periods they cover on LDSs.

In our simulation system, there are 4 to 8 resource sites, each with its own scheduler. The number of processors in each site ranges from 8 to 20, and the number of cores in a given processor is dynamically chosen from 2 to 8 in steps of 2. The speed of cores cs in processor pj is homogeneous but may differ from that of cores in other processors, and is selected within the range of 50 to 100 MIPS. We varied the (average) inter-arrival times of jobs by multiplying the job submission times in the trace files by the inverse of a load factor. This factor is increased in increments of 0.1 from a low value to a maximum value that limits the number of jobs taken from the traces to 5000.

We first study the performance of our scheduling approach (Two-selection) for energy efficiency by comparing it with three other resource allocation or selection rules: Fit-selection, Min-min [15] and Random-selection. In Fit-selection, the suitable processor for a particular task is determined according to the suitability or fitness value fit-val, without considering the random scheduling policy. The Min-min heuristic maps a task to the processor that gives the smallest fitness value fit-val from its minimum list. It allows low-priority tasks to be executed first in order to achieve better successful execution. In Random-selection, tasks are randomly mapped to available processors.

The performance metrics used for the experiment are successful execution rate and energy consumption. The successful execution rate denotes the fraction of tasks that met their deadline, given as

$$\text{successful execution rate} = \frac{\sum_{i=1}^{L} \delta_i}{L} \qquad (6)$$

It is used to measure the degree of reliable execution and to identify how well the resource allocation approach deals with tasks of various priorities. Next, we evaluate the queuing strategy of our priority-based scheduling. To understand the advantage of multiple queues for arriving tasks of different priorities, we compare our approach with the Fit and Min-min policies, each of which is associated with a single queue. For this purpose, we introduce queuing time as another metric for the performance of the scheduler. It denotes the total time a task resides in the global and local waiting-queues, which includes the waiting time before being executed. The probabilities of the three task priorities (low, medium and high) are varied in different experiments.
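Two of the experimental steps above lend themselves to a short Python sketch: scaling trace submission times by the inverse of the load factor, and computing the successful execution rate of Equation (6). The function names are hypothetical, a task is counted as successful when its completion time does not exceed its requested time (following the definition that accompanies Equation (6)), and returning 0 for an empty set of tasks is an assumption.

```python
from typing import List, Sequence


def scale_submission_times(submit_times: Sequence[float],
                           load_factor: float) -> List[float]:
    """Multiply trace submission times by the inverse of the load factor."""
    if load_factor <= 0:
        raise ValueError("load factor must be positive")
    return [t / load_factor for t in submit_times]


def successful_execution_rate(completion_times: Sequence[float],
                              requested_times: Sequence[float]) -> float:
    """Fraction of tasks whose completion time CT_i does not exceed rt_i."""
    if len(completion_times) != len(requested_times):
        raise ValueError("one requested time is needed per completion time")
    if not completion_times:
        return 0.0  # assumption: an empty observation period counts as rate 0
    met = sum(1 for ct, rt in zip(completion_times, requested_times) if ct <= rt)
    return met / len(completion_times)
```

In this sketch a load factor below 1 stretches the submission times, so jobs arrive further apart and the simulated system is more lightly loaded.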

B. Results

Experimental results are presented in two different ways based on energy efficiency and load factor.

Experiment 1: The impact of selection rule on energy efficiency

As shown in Figures 3 and 4, the proposed approach outperformed the others in terms of successful execution rate. The superior performance of our approach is primarily achieved by mapping tasks based on their priority, which leads to finding the most suitable processing capacity. Interestingly, although the overall job size in HPC2N is smaller than that of the SDSC Blue workload, its success rate is lower. This is because the waiting time of the HPC2N workload is higher in low-pq, which reduces the suitability in many matching decisions. On the other hand, the two scheduling policies deal more effectively with the SDSC Blue workload, which exhibits huge variation in job size and load pattern.

[Plot: successful execution rate (y-axis, 0.2 to 1) versus load factor (x-axis, 0.1 to 0.6); series: Two-selection, Fit-selection, Min-Min, Random-selection]

Figure 3. Successful execution rate (SDSC Blue) with different scheduling approaches

[Plot: successful execution rate (y-axis, 0.3 to 0.9) versus load factor (x-axis, 0.1 to 0.6); series: Two-selection, Fit-selection, Min-Min, Random-selection]

Figure 4. Successful execution rate (HPC2N) with different scheduling approaches

Figures 5 and 6 show the energy consumption plotted against load factor for SDSC Blue and HPC2N, respectively. Our approach obtains favorable energy consumption for both workload traces. Specifically, the smaller the job, the shorter the execution time; hence, on the basis of our energy model, the Two-selection policy with HPC2N leads in minimizing energy consumption. We also observe that Fit-selection is comparable with the Min-Min strategy. This is because both strategies rely on processing capacity and task characteristics during their matching processes for optimal scheduling.

In Equation (6), $\delta_i = 1$ if $CT_i \le rt_i$, and $\delta_i = 0$ otherwise.

Figure 5. Energy consumption with SDSC Blue workload. [Plot: energy consumption (in millions) versus load factor (0.1 to 0.6); series: Two-selection, Fit-selection, Min-Min, Random-selection.]

Figure 6. Energy consumption with HPC2N workload. [Plot: energy consumption (in millions) versus load factor (0.1 to 0.6); series: Two-selection, Fit-selection, Min-Min, Random-selection.]

Experiment 2: The impact of queuing strategy on different patterns of workloads

Figures 7 and 8 show that a low average queuing time is another strength of Two-selection. The reason is that the local scheduler maps tasks from the different queues (i.e., low-pq, med-pq and high-pq) to the processors simultaneously, so tasks experience less waiting time at lq. According to Figure 8, the resource selection strategies have comparable average queuing times, differing by about 20% on average; this holds particularly under light load. This outcome is primarily a consequence of the workload pattern of HPC2N, whose jobs are relatively small overall.
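The simultaneous mapping from the three priority queues can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheduler evaluated in the experiments: the function `dispatch_round`, the rule of taking one task per queue per pass, the service order and the task and processor names are all hypothetical; only the queue names low-pq, med-pq and high-pq come from the text.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

QUEUE_ORDER = ("high-pq", "med-pq", "low-pq")  # assumed service order


def dispatch_round(
    queues: Dict[str, Deque[str]], idle_processors: List[str]
) -> List[Tuple[str, str]]:
    """Map queued tasks to idle processors, drawing from every queue in turn.

    Each pass takes at most one task from each non-empty queue, so no queue
    has to wait for another queue to drain before it is served.
    """
    free = deque(idle_processors)
    assignments: List[Tuple[str, str]] = []
    while free and any(queues[name] for name in QUEUE_ORDER):
        for name in QUEUE_ORDER:
            if not free:
                break
            if queues[name]:
                task = queues[name].popleft()
                assignments.append((task, free.popleft()))
    return assignments


# Hypothetical tasks and processors.
queues = {
    "high-pq": deque(["h1", "h2"]),
    "med-pq": deque(["m1"]),
    "low-pq": deque(["l1", "l2"]),
}
result = dispatch_round(queues, ["p0", "p1", "p2", "p3"])
print(result)
```

In this toy run the head of every queue is mapped in the first pass, which is the behaviour the text credits for the shorter waiting time.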

Figure 7. Queue waiting time (SDSC Blue) with different queuing strategies. [Plot: average queuing time (t unit) versus load factor (0.1 to 0.6); series: Two-selection, Fit (one-queue), Min-Min (one-queue).]

Figure 8. Queue waiting time (HPC2N) with different queuing strategies. [Plot: average queuing time (t unit) versus load factor (0.1 to 0.6); series: Two-selection, Fit (one-queue), Min-Min (one-queue).]

Figures 9 and 10 show the benefit of separate queues for effective energy management. Although all strategies consider task characteristics during the mapping process, multiple waiting queues significantly increase resource utilization. Note that energy consumption reflects resource utilization implicitly. Increases in utilization lead to energy efficiency without significant performance degradation.

We extend the analysis of the Two-selection approach to different core-count settings. In addition to the heterogeneous numbers of cores set up earlier, we introduce another setting with four cores in each processor. According to the results in Figure 11, energy efficiency with the multi-core processor architecture is considerably improved. Interestingly, the identical-core setting for HPC2N is comparable with SDSC Blue configured with diverse numbers of cores. This result strongly indicates that the workload pattern and the number of cores directly influence energy consumption.
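To illustrate why higher utilization tends to improve energy efficiency, the sketch below uses a simple linear power model in which a processor draws a fixed idle power plus a utilization-proportional share. This model, the function names and the wattage figures are assumptions made purely for illustration; they are not the energy model or the parameters used in the experiments.

```python
def power_draw(
    utilization: float, p_idle: float = 100.0, p_max: float = 200.0
) -> float:
    """Assumed linear model: idle power plus a utilization-proportional part."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must lie in [0, 1]")
    return p_idle + (p_max - p_idle) * utilization


def energy_per_unit_work(utilization: float) -> float:
    """Power divided by work rate; the work rate is taken to equal utilization."""
    if utilization <= 0.0:
        raise ValueError("no useful work is done at zero utilization")
    return power_draw(utilization) / utilization


# Hypothetical utilization levels.
for u in (0.2, 0.5, 1.0):
    print(u, energy_per_unit_work(u))
```

Under these assumed numbers the energy spent per unit of work falls as utilization rises, because the fixed idle power is amortized over more work.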


Figure 9. Energy consumption (SDSC Blue) with different queuing strategies. [Plot: energy consumption (in millions) versus load factor (0.1 to 0.6); series: Two-selection, Fit (one-queue), Min-Min (one-queue).]

Figure 10. Energy consumption (HPC2N) with different queuing strategies. [Plot: energy consumption (in millions) versus load factor (0.1 to 0.6); series: Two-selection, Fit (one-queue), Min-Min (one-queue).]

Figure 11. Energy consumption under different workload patterns with different core settings. [Plot: energy consumption (in millions) versus load factor (0.1 to 0.6); series: SDSC (identical-core), HPC2N (identical-core), SDSC (various-core), HPC2N (various-core).]

VI. CONCLUSION

Distributed applications in science and engineering rely on ever larger configurations of parallel computers. The continued exponential growth in the number of computing devices leads to an explosive increase in energy consumption. Energy consumption in LDSs has recently gained considerable attention due to its significant performance, environmental and economic implications. In this paper, we address the energy efficiency issue in the context of scheduling. We have modelled a hierarchical scheduler that explicitly considers the urgency of tasks. In addition, a power constraint is devised to control the energy consumption of processors with the least possible performance degradation. Based on our simulation results, our priority-based scheduling demonstrates good performance while actively exploiting the heterogeneous nature of both tasks and resources. Our scheduling approach is also robust in the sense that energy efficiency remains at a good level across different workload patterns.

REFERENCES

[1] Y. C. Lee, and A. Y. Zomaya, “Energy efficient utilization of resources in cloud computing systems,” Journal of Supercomputing, pp. 1-13, 2010.

[2] L. A. Barroso, and U. Hölzle, “The Case for Energy-Proportional Computing,” Computer, vol. 40, no. 12, pp. 33-37, 2007.

[3] Y. C. Lee, and A. Y. Zomaya, “Scheduling in grid environments,” Handbook of parallel computing: models, algorithms and applications, S. Rajasekaran and J. Reif, eds., pp. 21.1-21.19: CRC Press, Boca Raton, Florida, USA, 2008.

[4] K. W. Cameron, R. Ge, and X. Feng, “High-Performance Power-Aware Distributed Computing for Scientific Applications,” IEEE Computer, pp. 40-47, 2005.

[5] R. Ge, X. Feng, and K. W. Cameron, “Performance-constrained Distributed DVS Scheduling for Scientific Applications on Power-aware Clusters,” in Proc. of the 2005 ACM/IEEE Conference on Supercomputing, pp. 34, 2005.

[6] K. H. Kim, R. Buyya, and J. Kim, “Power Aware Scheduling of Bag-of-Tasks Applications with Deadline Constraints on DVS-enabled Clusters,” in 7th IEEE/ACM Int'l Symposium on Cluster Computing and the Grid (CCGRID2007), 2007.

[7] B. Lawson, and E. Smirni, “Power-aware Resource Allocation in High-end Systems via Online Simulation,” in 19th Int'l Conf. on Supercomputing (Boston, USA), pp. 229-238, 2005.

[8] S. K. Garg, and R. Buyya, “Exploiting Heterogeneity in Grid Computing for Energy-Efficient Resource Allocation,” in Proc. of the 17th Int'l Conf. on Advanced Computing and Communications (ADCOM), Bengaluru, India, 2009.

[9] I. Ahmad, S. Ranka, and S. U. Khan, “Using game theory for scheduling tasks on multi-core processors for simultaneous optimization of performance and energy,” in IEEE Int'l Sym. on Parallel and Distributed Processing (IPDPS), Miami, FL, pp. 1-6, 2008.

[10] N. Aggarwal, P. Ranganathan, N. P. Jouppi et al., “Configurable Isolation: Building High Availability Systems with Commodity Multi-core Processors,” in Proc. of the 34th Annual Int'l Sym. on Computer Architecture, pp. 470-481, 2007.

[11] PWA: Parallel workloads archive, http://www.cs.huji.ac.il/labs/parallel/workload/logs.html.

[12] M. Hussin, Y. C. Lee, and A. Y. Zomaya, “ADREA: A Framework for Adaptive Resource Allocation in Distributed Computing Systems,” in 11th Int'l Conf. on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Wuhan, China, pp. 50-57, 2010.

[13] P. Schreier, “An Energy Crisis in HPC,” Scientific Computing World: HPC Projects, Dec. 2008/Jan. 2009.

[14] L. A. Barroso, and U. Hölzle, “The Case for Energy-Proportional Computing,” Computer, vol. 40, no. 12, 2007.

[15] T. D. Braun, H. J. Siegel, and N. Beck, “A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems,” Journal of Parallel and Distributed Computing, vol. 61, pp. 810-837, 2001.
