
Study on New Task Scheduling Strategy in Cloud Computing Environment based on the Simulator CloudSim

He Zhongtang 1,a, Zhang Xiaoqing* 2,b, Zhang Hengxi 3,c, Xu Zhiwei 4

1 Guangdong Electronics Industry Institute, Institute of Computing Technology, Chinese Academy of Sciences, China

2 School of Computer Science and Technology, Wuhan University of Technology, China

3 Department of Fundamental, Xuzhou Air Force Academy, China

4 Institute of Computing Technology, Chinese Academy of Sciences, China

a [email protected], b [email protected], c [email protected]

Keywords: Cloud Computing; Task Scheduling; CloudSim; Simulation

Abstract. This paper analyzes the principle and mechanism of CloudSim, with emphasis on its layered architecture: the user code layer, cloud resource layer, cloud service layer, network layer, virtual machine service layer and user interface structure layer. On this basis, taking task scheduling in the cloud environment as the research object, five task scheduling algorithms are proposed and added to the CloudSim platform as extensions. A specific cloud simulation scenario is then deployed and five simulation experiments are carried out. This work makes it easier to simulate cloud computing and to extend the task scheduling strategies in CloudSim.

Introduction

As a next-generation data center model, cloud computing has become the hottest research area since it was put forward in the third quarter of 2007, and it is regarded as a commercial realization of grid computing [1]. The industry has not yet reached a consensus on the definition of cloud computing. A representative view is [2]: clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and services); these resources can be dynamically reconfigured to adjust to a variable load, allowing also for an optimum resource utilization; this pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the Infrastructure Provider by means of customized SLAs. This definition focuses on several characteristics of cloud computing, such as large scale, virtualization, scalability, versatility and on-demand services. At present, many large IT companies have released their own cloud platforms, and cloud infrastructures such as Google App Engine [3], Amazon EC2 [4], IBM Blue Cloud [5] and Microsoft Azure [6] are widely used. Research on cloud computing focuses on virtualization, mass data storage, resource management, task scheduling, the energy efficiency of cloud data centers and cloud security. Since building a real cloud computing environment is a large-scale systems engineering effort, using simulation tools is a feasible and effective alternative. Because cloud computing is an emerging Internet computing model, few cloud simulation tools exist. Two typical simulation platforms are SimCloud [7] and CloudSim [8,9]. SimCloud is a commercial (paid) simulation platform aimed at enterprises. It integrates many enterprise-level IT resources, including engineering numerical simulation software and servers. By comparison, CloudSim is an open source, extensible simulation framework that allows seamless modeling, simulation and experimentation of emerging cloud computing infrastructures and application services.

Because CloudSim is open source software (OSS) and widely used, this paper first analyzes its layered architecture and implementation mechanism, and then proposes five task scheduling algorithms and implements them by extending the platform. Simulation experiments with these algorithms are carried out in cloud computing scenarios.

Advanced Materials Research Vol. 651 (2013) pp 829-834. Online available since 2013/Jan/25 at www.scientific.net. © (2013) Trans Tech Publications, Switzerland. doi:10.4028/www.scientific.net/AMR.651.829

All rights reserved. No part of contents of this paper may be reproduced or transmitted in any form or by any means without the written permission of TTP, www.ttp.net.


As a widely distributed system, a key objective of the grid [1] is to match the capabilities of grid resources to the requirements of grid users. However, due to the heterogeneity and dynamic nature of grid entities, resource allocation [2] is a complicated matter in the grid. Much research has studied resource allocation using ideas from game theory, because resource management in the grid resembles social economic activity. Game theory formalizes grid resource allocation as a utility optimization problem solved by searching for an equilibrium solution.

Layered Architecture of CloudSim

CloudSim is a discrete-event cloud computing simulation engine developed from GridSim [10]. Fig.1 shows the multi-layered design of the CloudSim software framework and its architectural components. Initial releases of CloudSim used SimJava as the discrete event simulation engine, which supports several core functionalities such as queuing and processing of events, creation of cloud system entities, communication between components, and management of the simulation clock.

[Fig.1 Layered Architecture of CloudSim 2.0 [8]. From top to bottom: User code (simulation specification: cloud scenario, user requirements, application configuration; scheduling policy: user or datacenter broker). CloudSim (user interface structures: cloudlet, virtual machine; VM services: cloudlet execution, VM management; cloud services: VM provisioning, CPU allocation, memory allocation, storage allocation, bandwidth allocation; cloud resources: host, datacenter, events handling, sensor, cloud coordinator; network: network topology, message delay calculation). CloudSim core simulation engine.]

User Code Layer

The top-most layer of CloudSim is the user code layer, which exposes basic entities such as hosts (number of machines and their specifications), applications (number of tasks and their requirements), virtual machines, the number of users and their application types, and the broker's scheduling policies. By extending these entities, cloud developers can: 1) generate load distributions and application configuration requests; 2) model available cloud scenarios and test robustness under customized configurations; 3) implement customized application scheduling techniques for clouds and interclouds.

Cloud Resource Layer

This layer models the physical resources and behaviors of the underlying cloud components. The hardware infrastructure of a cloud computing environment is simulated mainly by extending the data center entity. The data center manages a large number of hosts, which represent the physical computers in the cloud. Each host is pre-configured with processing capacity, memory, storage and a policy for scheduling its processing cores. Hosts are allocated to virtual machines according to the virtual machine allocation policy defined by the cloud service provider. The host component supports both single-core and multi-core nodes. In CloudSim, an entity is an instance of a component, and a component is a class or a set of classes, such as a data center or a host.

Cloud Service Layer

In contrast to grid computing, cloud computing contains an extra layer, the cloud service layer (or virtualization layer), which acts as an execution, management and hosting environment for application services. Although virtual machines are contextually isolated in practice (in physical and secondary memory space), they still share the processing cores and the system bus. Hence, the amount of hardware resources available to each virtual machine is constrained by the total processing power and system bandwidth available within the host. A key requirement of the provisioning process is to avoid creating a virtual machine that demands more processing power than is available within the host. To support different scheduling strategies in different environments, CloudSim supports virtual machine provisioning at two levels: the host level and the virtual machine level. At the host level, it is possible to specify how much of the overall processing power of each core is assigned to each virtual machine. At the virtual machine level, the virtual machine assigns a fixed amount of its available processing power to the individual application services hosted within its execution engine. In this case, a task unit can be regarded as a finer abstraction of an application service hosted in the virtual machine.
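The host-level provisioning constraint described above can be illustrated with a minimal sketch. It is not CloudSim code: the class `Host`, the method `try_allocate` and the MIPS figures are hypothetical names and values chosen only to show the check that a virtual machine must not request more processing power than the host still has available.

```python
class Host:
    """Hypothetical host model that tracks MIPS granted to virtual machines."""

    def __init__(self, total_mips):
        self.total_mips = total_mips
        self.allocated = {}  # vm_id -> MIPS granted to that VM

    def available_mips(self):
        # Processing power not yet granted to any VM.
        return self.total_mips - sum(self.allocated.values())

    def try_allocate(self, vm_id, requested_mips):
        # Refuse to create a VM that demands more than the host has left.
        if requested_mips > self.available_mips():
            return False
        self.allocated[vm_id] = requested_mips
        return True


host = Host(total_mips=1000)          # illustrative capacity
first = host.try_allocate(0, 600)     # fits
second = host.try_allocate(1, 500)    # rejected: only 400 MIPS remain
```

In this sketch the second request is rejected because the first one has already consumed 600 of the 1000 MIPS.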

Network Layer

This layer models the network topology that connects cloud simulation entities such as hosts, storage and end users. Inter-networking of cloud entities in CloudSim is based on a conceptual network abstraction. In this model, no actual network components such as routers or switches are simulated. Instead, the network latency that a message experiences on its path from one CloudSim entity to another is simulated using the information stored in a latency matrix. Matrix (1) shows a latency matrix consisting of four CloudSim entities. At any time, CloudSim maintains an m×n matrix for all CloudSim entities. The element Ei,j of the matrix denotes the latency that a message undergoes when it is transferred from entity i to entity j over the network. CloudSim is an event-driven simulator whose event management engine uses the inter-entity network latency information to induce delays when transmitting messages to entities.
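As a rough illustration of how a latency matrix can drive message delays, the sketch below looks up the latency between two entities and adds it to the send time. The matrix values, the constant `LATENCY` and the function `delivery_time` are assumptions made for this example; they are not the values of Matrix (1) and not part of the CloudSim API.

```python
# Hypothetical latency matrix for four entities (arbitrary time units).
# LATENCY[i][j] is the delay of a message sent from entity i to entity j.
LATENCY = [
    [0.0, 40.0, 120.0, 80.0],
    [40.0, 0.0, 60.0, 100.0],
    [120.0, 60.0, 0.0, 30.0],
    [80.0, 100.0, 30.0, 0.0],
]


def delivery_time(send_time, src, dst, latency=LATENCY):
    """Return the simulated time at which a message from src reaches dst."""
    return send_time + latency[src][dst]


arrival = delivery_time(10.0, 0, 2)  # sent at t=10 from entity 0 to entity 2
```

An event-driven engine would schedule the receive event at `arrival` rather than at the send time, which is how the latency information induces a delay.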

Virtual Machine Service Layer

This layer supports the operation of task units and the lifecycle management of virtual machines,

such as allocating hosts to virtual machines, creating virtual machines, destroying virtual machines

and migrating virtual machines.

User Interface Structure Layer

This layer implements the interfaces between task units and virtual machine entities.

Entity Communication of Task Scheduling

Fig.2 shows the entity communication process of task scheduling in CloudSim. Because the Cloud Information Service (CIS) provides the registration function for cloud providers, each data center entity (DataCenter) registers with the CIS in the initial stage of the simulation. Then the DataCenterBroker, acting on behalf of users, consults the CIS to obtain the list of cloud providers that can offer infrastructure services matching the application's QoS, hardware and software requirements. When a match is found, the DataCenterBroker deploys the user's application on the cloud suggested by the CIS. While task units are being processed by their virtual machines, their progress must be continuously updated and monitored. To do this, each DataCenter entity invokes the method updateVMsProcessing() on every host it manages, so that the virtual machines on that host update the processing of their currently active tasks. At the host level, updateVMsProcessing() triggers the method updateCloudletsProcessing(), which directs every virtual machine to update the status of its task units (finished, suspended or executing) with the DataCenter. updateCloudletsProcessing() has the same function as updateVMsProcessing(), but operates at the virtual machine level. Once it is invoked, each virtual machine returns the next expected completion time of the task units it currently manages. The smallest completion time among all the computed values is then sent to the DataCenter.
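The update cascade described above can be sketched as follows. This is a simplified stand-in, not the CloudSim implementation: the classes `Vm` and `Host`, the snake_case method names and the space-shared assumption (each VM runs one queued task unit at a time) are choices made for this example only.

```python
import math


class Vm:
    """Hypothetical space-shared VM: runs its queued task units one at a time."""

    def __init__(self, mips, cloudlet_lengths):
        self.mips = mips
        self.queue = list(cloudlet_lengths)  # remaining MI per task unit
        self.last_update = 0.0

    def update_cloudlets_processing(self, now):
        # Credit the work done since the last update to the queued task units.
        work = self.mips * (now - self.last_update)
        self.last_update = now
        while self.queue and work > 0:
            done = min(work, self.queue[0])
            self.queue[0] -= done
            work -= done
            if self.queue[0] <= 1e-9:
                self.queue.pop(0)  # task unit finished
        if not self.queue:
            return math.inf        # nothing left to complete
        # Next expected completion time of the task unit now running.
        return now + self.queue[0] / self.mips


class Host:
    """Hypothetical host that forwards the update to each of its VMs."""

    def __init__(self, vms):
        self.vms = vms

    def update_vms_processing(self, now):
        # The smallest next completion time is what the data center receives.
        times = [vm.update_cloudlets_processing(now) for vm in self.vms]
        return min(times, default=math.inf)


host = Host([Vm(100, [1000, 500]), Vm(200, [1000])])
t1 = host.update_vms_processing(0.0)
t2 = host.update_vms_processing(t1)
t3 = host.update_vms_processing(t2)
```

Each returned value is the time of the next event that the data center would schedule, so calling the update at that time advances the simulation to the next task completion.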


[Fig.2 Entity communication process of task scheduling. Sequence between DataCenter, CISRegistry and DataCenterBroker: registration, query, available datacenter, get characteristics, create VM, task scheduling, task completion, destroy VM. Within the DataCenter, Host0 calls UpdateVMsProcessing(); VM0 to VMn each call UpdateCloudletsProcessing() and return the time of their next event; the smallest time of next event is returned.]

Extension of CloudSim and Task Scheduling

We have extended the open source cloud computing simulation platform CloudSim 2.0 and implemented five task scheduling algorithms in its source code. After recompiling the new platform, we ran simulation programs for these five algorithms.

Simulation Environment

The hardware and software configuration of the local machine is: OS: Windows XP; CPU: Intel P4 2.8 GHz; Memory: 1 GB; Storage: 60 GB. Java development environment: JDK 1.6.0_26 + Eclipse SDK 3.4.1. Cloud computing simulation toolkit: CloudSim 2.1.1.

Experimental Parameter

For the simulation experiments on cloud task scheduling, we simulated one data center comprising one host, with four virtual machine requests and eight cloud tasks. Tables 1 to 4 give the detailed simulation parameters.

Table 1 Parameter list of cloud user tasks (cloudlets)

CloudletId  Length (MI)  FileSize  OutputSize
0           18375        300       300
1           48624        400       300
2           31384        240       200
3           45628        300       300
4           14562        400       500
5           21075        600       250
6           31579        250       200
7           31864        350       200

Table 2 Virtual machine list

VMId  CPU  MIPS  Ram   Storage  Bw   Share-Policy
0     1    267   2048  10000    100  Space-Shared
1     1    285   2048  10000    100  Space-Shared
2     1    210   2048  10000    100  Space-Shared
3     1    291   2048  10000    100  Space-Shared

Table 3 Datacenter list

Architecture  OS     VMM  Cost
X86           Linux  Xen  3.0

Table 4 Host list

HostId  MIPS  Ram   Storage  Bw    Share-Policy
0       1200  2048  100000   1000  Time-Shared
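The cloudlet lengths in Table 1 and the VM speeds in Table 2 are enough to estimate how long each task would run on each virtual machine, since a length in MI divided by a speed in MIPS gives seconds. The sketch below builds that execution-time matrix; the function name `execution_time_matrix` is an assumption for this example, and the estimate ignores file transfer and any sharing of the host between VMs.

```python
# Cloudlet lengths in MI (Table 1) and VM speeds in MIPS (Table 2).
CLOUDLET_LENGTHS_MI = [18375, 48624, 31384, 45628, 14562, 21075, 31579, 31864]
VM_MIPS = [267, 285, 210, 291]


def execution_time_matrix(lengths, mips):
    """Return E where E[i][j] is the run time in seconds of cloudlet i on VM j."""
    return [[length / m for m in mips] for length in lengths]


E = execution_time_matrix(CLOUDLET_LENGTHS_MI, VM_MIPS)
```

For instance, `E[0][0]` is 18375 / 267, roughly 68.8 seconds, and every cloudlet runs fastest on VM 3, the VM with the highest MIPS rating.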

Task Scheduling Algorithm and Experimental Results

In CloudSim, a task (cloudlet) can be bound to and executed on a specified virtual machine by extending the method bindCloudletToVm(int cloudletId, int vmId) of the DatacenterBroker class. Fig.3 shows the task scheduling model of CloudSim. Building on this method, the following five task scheduling algorithms are presented:


[Fig.3 Task scheduling model in CloudSim platform. Cloud users (User1, User2, User3, ..., Userk) submit cloudlets (T1, T2, T3, T4, ..., Tn), which are mapped to virtual machines (VM1, VM2, VM3, VM4, ..., VMm) running on the hosts (Host1, Host2, Host3) of a datacenter.]

1) Sequenced Scheduling (SS): Tasks are assigned to the VMs one after another in order; once every VM has been given a task, allocation of new tasks starts again from the first VM. SS tries to ensure that each VM runs the same number of tasks in order to balance the load. However, SS does not take the differences in task requirements and VM performance into account.

2) First Come First Service (FCFS): When all VMs are running tasks, each subsequent task is assigned directly to the VM that finishes its task first, and so on until all tasks are finished. FCFS avoids excessive VM idle time, but high-performance VMs may be overloaded.

3) Shortest Task First (STF): All tasks are sorted in ascending order of task length and then assigned to the VMs sequentially. The algorithm ensures that short tasks are finished first. However, it does not consider the differences in task demands either.

4) Balance Scheduling (BS): All tasks are sorted in descending order of task length and all VMs are sorted in descending order of processing performance. SS is then applied to the sorted task set and VM set. The aim of the algorithm is to assign long tasks to high-performance VMs and short tasks to low-performance VMs, enforcing a balanced match between tasks and VMs. However, it ignores VM utilization.

5) Greedy Scheduling (GS): All tasks are sorted in descending order of task length, and all VMs are sorted in ascending order of processing performance (MIPS). A matrix E(i,j) is defined that stores the sorted values of task length/MIPS, i.e. the execution time of task i on VM j. The greedy strategy is applied to E(i,j): starting from the task in row 0, each task is tentatively assigned to the VM in the last column, which is the fastest VM. If this choice is optimal compared with the other choices, the assignment is kept. Otherwise, the task is assigned to the VM that gives the best current result. The aim of this algorithm is to assign complex tasks to high-performance VMs, which removes the bottleneck caused by complex tasks, so that the finish time of all tasks approaches the shortest possible. The drawback is that high-performance VMs may be overloaded and energy consumption is high.
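The five strategies above can be sketched as functions that map cloudlet ids to VM ids. This is one reading of the descriptions, not the authors' CloudSim extension: the function names are hypothetical, FCFS is interpreted as "give the next task to the VM that becomes free first", GS as "give each task, longest first, to the VM on which it would finish earliest", and every VM is assumed to run its tasks one after another at its Table 2 speed without interference from the host.

```python
LENGTHS = [18375, 48624, 31384, 45628, 14562, 21075, 31579, 31864]  # Table 1, MI
MIPS = [267, 285, 210, 291]                                         # Table 2


def round_robin(task_order, vm_order):
    """Assign tasks cyclically to VMs; returns {cloudlet_id: vm_id}."""
    return {t: vm_order[k % len(vm_order)] for k, t in enumerate(task_order)}


def ss(lengths, mips):
    return round_robin(list(range(len(lengths))), list(range(len(mips))))


def stf(lengths, mips):
    tasks = sorted(range(len(lengths)), key=lambda t: lengths[t])
    return round_robin(tasks, list(range(len(mips))))


def bs(lengths, mips):
    tasks = sorted(range(len(lengths)), key=lambda t: -lengths[t])
    vms = sorted(range(len(mips)), key=lambda v: -mips[v])
    return round_robin(tasks, vms)


def assign_in_order(task_order, lengths, mips, by_completion):
    """Assign each task to the VM that is free first, or finishes it first."""
    ready = [0.0] * len(mips)  # time at which each VM becomes idle
    mapping = {}
    for t in task_order:
        if by_completion:
            vm = min(range(len(mips)), key=lambda v: ready[v] + lengths[t] / mips[v])
        else:
            vm = min(range(len(mips)), key=lambda v: ready[v])
        ready[vm] += lengths[t] / mips[vm]
        mapping[t] = vm
    return mapping


def fcfs(lengths, mips):
    return assign_in_order(range(len(lengths)), lengths, mips, False)


def gs(lengths, mips):
    tasks = sorted(range(len(lengths)), key=lambda t: -lengths[t])
    return assign_in_order(tasks, lengths, mips, True)


def makespan(mapping, lengths, mips):
    """Total finish time: the busiest VM's accumulated execution time."""
    busy = [0.0] * len(mips)
    for t, vm in mapping.items():
        busy[vm] += lengths[t] / mips[vm]
    return max(busy)


results = {f.__name__: makespan(f(LENGTHS, MIPS), LENGTHS, MIPS)
           for f in (ss, fcfs, stf, bs, gs)}
```

Under these assumptions the sketch gives GS the shortest total finish time (about 237 s) and STF the longest (about 318 s), with BS in between and SS equal to FCFS. A different tie-breaking rule or a time-shared host model would change the individual numbers.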

Fig.4 shows the finish time of each cloudlet under the five scheduling algorithms, and Fig.5 shows the total finish time of all tasks. Because GS applies the greedy strategy to every cloudlet across all virtual machines, its total finish time is the shortest. STF does not consider the differences in VM processing performance, so its total finish time is the longest. The total finish time of GS is 25% less than that of STF. Because tasks are allocated sequentially at the beginning, the finish time of SS is close to that of FCFS. As the number of cloudlets increases, FCFS performs better. Because it takes the differences in task length and VM processing performance into account, BS is the suboptimal (second-best) algorithm.

[Fig.4 The finish time of each cloudlet. x-axis: CloudletId (0 to 7); y-axis: finish time of task (0 to 350); series: SS, FCFS, STF, BS, GS.]

[Fig.5 The total finish time of all cloudlets. x-axis: task scheduling algorithm (SS, FCFS, STF, BS, GS); y-axis: total finish time (0 to 350).]


Conclusion

This paper introduces the simulation mechanism of the cloud computing platform CloudSim, with emphasis on its layered architecture. On this basis, taking task scheduling under CloudSim as the research object, five task scheduling algorithms are presented. We extend the CloudSim platform and implement these five algorithms. Simulation experiments and performance analysis are carried out after deploying a specified cloud scenario. This work provides ideas and solutions for task scheduling strategies and for the code extension of new algorithms in cloud computing environments based on CloudSim.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant No. 60970064), the Guangdong and Hong Kong invited bidding special project for Dongguan (Grant No. 201120510101), the Dongguan Major Science and Technology Special Project (No. 2009215102001), the Guangdong Province high-tech zone development guide special project (No. 2011B010700043) and a Chinese Academy of Sciences cooperation project.

References

[1] Liu Peng. Cloud Computing (Second Edition) [M]. Beijing: Electronic Industry Press, 2011.

[2] Vaquero, L.M., L. Rodero-Merino, J. Caceres, et al. A break in the clouds: towards a cloud definition. SIGCOMM Computer Communication Review, 2008, 39(1): 50-55. http://dx.doi.org/10.1145/1496091.1496100

[3] Google App Engine. http://appengine.google.com.

[4] Amazon EC2. http://www.amazon.com/ec2.

[5] IBM Cloud. http://www.ibm.com/cloud-computing/us/en/.

[6] Microsoft Azure. http://www.microsoft.com/azure.

[7] SimCloud Platform. http://simcloud.com/.

[8] Calheiros, R.N., R. Ranjan, A. Beloglazov, et al. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software-Practice & Experience, 2011, 41(1): 23-50.

[9] Buyya, R., R. Ranjan, R.N. Calheiros. Modeling and simulation of scalable Cloud computing environments and the CloudSim toolkit: Challenges and opportunities. In International Conference on High Performance Computing & Simulation, HPCS'09. 2009: 1-11.

[10] Buyya, R., M. Murshed. GridSim: a toolkit for the modeling and simulation of distributed resource management and scheduling for Grid computing. Concurrency and Computation-Practice & Experience, 2002, 14(13-15): 1175-1220.