Study on New Task Scheduling Strategy in Cloud Computing Environment based on the Simulator CloudSim
He Zhongtang1,a, Zhang Xiaoqing*2,b, Zhang Hengxi3,c, Xu Zhiwei4
1Guangdong Electronics Industry Institute, Institute of Computing Technology, Chinese Academy of
Sciences, China
2School of Computer Science and Technology, Wuhan University of Technology, China
3Department of Fundamental, Xuzhou Air Force Academy, China
4Institute of Computing Technology Chinese Academy of Sciences, China
[email protected], [email protected], [email protected]
Keywords: Cloud Computing; Task Scheduling; CloudSim; Simulation
Abstract. This paper analyzes the principle and mechanism of CloudSim, with emphasis on its layered
architecture: the user code layer, cloud resource layer, cloud service layer, network layer, virtual
machine service layer and user interface structure layer. On this basis, taking task scheduling in
the cloud environment as the research object, five task scheduling algorithms are proposed and added
to the CloudSim platform as extensions. A specific cloud simulation scenario is then deployed and
five simulation experiments are carried out. This work makes cloud computing simulation and the
extension of task scheduling strategies in CloudSim easier.
Introduction
As a next-generation data center model, cloud computing has become the hottest research area since
it was put forward in the third quarter of 2007, and it is regarded as a commercialized realization
of grid computing [1]. The industry has not yet reached a consensus on the definition of cloud
computing. A representative view is [2]: Clouds are a large pool of easily usable and accessible
virtualized resources (such as hardware, development platforms and services); these resources can be
dynamically reconfigured to adjust to a variable load, allowing also for an optimum resource
utilization; this pool of resources is typically exploited by a pay-per-use model in which
guarantees are offered by the Infrastructure Provider by means of customized SLAs. This definition
highlights several characteristics of cloud computing, such as large scale, virtualization,
scalability, versatility and on-demand service. At present, many large IT companies have released
their own cloud platforms, and cloud infrastructures such as Google App Engine [3], Amazon EC2 [4],
IBM Blue Cloud [5] and Microsoft Azure [6] are widely used. Research on cloud computing focuses on
virtualization, mass data storage, resource management, task scheduling, energy efficiency of cloud
data centers and cloud security. Since building a real cloud computing environment is a large-scale
systems engineering effort, using simulation tools is a feasible and effective alternative. Because
cloud computing is an emerging Internet computing mode, few cloud simulation tools exist. Two
typical simulation platforms are SimCloud [7] and CloudSim [8,9]. SimCloud is a paid simulation
platform aimed at enterprises. It integrates many enterprise-level IT resources, including
engineering numerical simulation software and servers. By comparison, CloudSim is an open source
simulation platform: a new and extensible simulation framework that allows seamless modeling,
simulation, and experimentation of emerging cloud computing infrastructures and application services.
Because CloudSim is open source software (OSS) and widely used, we first analyze its layered
architecture and implementation mechanism, and then propose five task scheduling algorithms and
implement them by extending the platform. Simulation experiments on these algorithms are carried out
in cloud computing scenarios.
Advanced Materials Research Vol. 651 (2013) pp 829-834. Online available since 2013/Jan/25 at www.scientific.net. © (2013) Trans Tech Publications, Switzerland. doi:10.4028/www.scientific.net/AMR.651.829
In a widely distributed system such as a grid [1], a key objective is to match the capabilities of
grid resources to the requirements of grid users. However, due to the heterogeneous and dynamic
nature of grid entities, resource allocation [2] is a complicated task in a grid. Much research has
studied resource allocation using ideas from game theory, because resource management in a grid
resembles social economic activity. Game theory formalizes grid resource allocation as a utility
optimization problem that is solved by searching for an equilibrium solution.
Layered Architecture of CloudSim
CloudSim is a discrete-event cloud computing simulation engine developed from GridSim [10]. Fig.1
shows the multi-layered design of the CloudSim software framework and its architectural components.
Initial releases of CloudSim used SimJava as the discrete event simulation engine, which supports
several core functionalities, such as queuing and processing of events, creation of cloud system
entities, communication between components, and management of the simulation clock.
[Fig.1, layered diagram, top to bottom. User code: simulation specification (cloud scenario, user requirements, application configuration) and scheduling policy (user or datacenter broker). CloudSim: user interface structures (cloudlet, virtual machine); VM services (cloudlet execution, VM management); cloud services (VM provisioning; CPU, memory, storage and bandwidth allocation); cloud resources (host, datacenter, events handling, sensor, cloud coordinator); network (network topology, message delay calculation). CloudSim core simulation engine.]
Fig.1 Layered Architecture of CloudSim 2.0 [8]
User Code Layer
The top-most layer of CloudSim is the user code layer, which exposes basic entities such as hosts
(number of machines and their features), applications (number of tasks and their requirements),
virtual machines, the number of users and their application types, and the broker's scheduling
strategies. By extending these entities, cloud developers can: 1) generate workload distributions
and application configuration requests; 2) model the available cloud scenarios and test their
robustness under customized configurations; 3) implement customized application scheduling
techniques for clouds and interclouds.
Cloud Resource Layer
This layer is responsible for modeling the physical resources of the underlying cloud components and
their behavior. The hardware infrastructure of the cloud computing environment is simulated mainly
by extending the data center entity. The data center manages a large number of hosts, which are the
actual physical computers in the cloud. Each host is pre-configured with processing capacity,
memory, storage and a scheduling strategy for its processing cores. Hosts are allocated to virtual
machines according to the virtual machine allocation strategy defined by the cloud service provider.
The host component supports both single-core and multi-core nodes. In CloudSim, an entity is an
instance of a component, which is a class or a set of classes such as a data center or a host.
Cloud Service Layer
In contrast to grid computing, cloud computing contains an extra layer, the cloud service layer (or
virtualization layer), which acts as an execution, management and hosting environment for
application services. Although virtual machines are contextually isolated (in physical and secondary
memory space) in practice, they still have to share the processing cores and system bus. Hence, the
amount of hardware resource available to each virtual machine is constrained by the total
processing power and system bandwidth available within the host. The key point to consider in the
provisioning process is to avoid creating a virtual machine that demands more processing power than
is available within the host. To support different scheduling strategies in different environments,
CloudSim supports two levels of virtual machine provisioning: the host level and the virtual machine
level. At the host level, it is possible to specify how much of the overall processing power of each
core will be assigned to each virtual machine. At the virtual machine level, the virtual machine
assigns a fixed amount of the available processing power to the individual application services that
are hosted within its execution engine. In this case, the task unit can be regarded as a finer
abstraction of an application service hosted in the virtual machine.
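The provisioning constraint described above can be illustrated with a minimal sketch. It assumes a deliberately simple rule (a VM is admitted only if its requested MIPS fits into the host's remaining MIPS) and uses the MIPS values from Table 2 and Table 4. The function name can_provision is hypothetical and is not part of the CloudSim API, whose provisioners are more detailed.

```python
def can_provision(host_mips, allocated_mips, requested_mips):
    """Return True if the host still has enough free MIPS for the request.

    Assumption: capacity is a single pooled MIPS figure per host.
    """
    free_mips = host_mips - sum(allocated_mips)
    return requested_mips <= free_mips


# Host capacity from Table 4 and VM requirements from Table 2.
host_mips = 1200
vm_requests = [267, 285, 210, 291]

allocated = []
for mips in vm_requests:
    # Admit the VM only when its demand does not exceed what is left.
    if can_provision(host_mips, allocated, mips):
        allocated.append(mips)

print("provisioned VMs:", len(allocated))
print("remaining MIPS:", host_mips - sum(allocated))
```

With these values all four virtual machines fit on the single host, and a further request is rejected as soon as it exceeds the remaining capacity.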
Network Layer
This layer is responsible for modeling the complex network topology that connects cloud computing
simulation entities, such as hosts, storage and end-users. The inter-networking of cloud entities in
CloudSim is based on a conceptual network abstraction. In this model, no actual network components
such as routers or switches are simulated. Instead, the network latency that a message experiences
on its path from one CloudSim entity to another is simulated based on the information stored in a
latency matrix. Matrix (1) shows a latency matrix consisting of four CloudSim entities. At any time,
CloudSim maintains an m × n matrix for all CloudSim entities. The element Ei,j of the matrix denotes
the latency that a message undergoes when it is transferred from entity i to entity j over the
network. CloudSim is an event-driven simulator, and its event management engine uses the
inter-entity network latency information to introduce delays when transmitting messages to entities.
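The latency lookup can be sketched as follows. The matrix values are invented for illustration (they are not the values of Matrix (1)), and the names LATENCY and delivery_time are hypothetical rather than CloudSim identifiers.

```python
# Hypothetical latency matrix for four entities, in simulated time units.
# LATENCY[i][j] is the delay of a message sent from entity i to entity j.
LATENCY = [
    [0.0, 2.0, 5.0, 1.0],
    [2.0, 0.0, 3.0, 4.0],
    [5.0, 3.0, 0.0, 6.0],
    [1.0, 4.0, 6.0, 0.0],
]


def delivery_time(send_time, src, dst, latency=LATENCY):
    """Time at which a message sent at send_time from src arrives at dst."""
    return send_time + latency[src][dst]


# A message sent by entity 0 at t = 10 is delivered to entity 2 after E(0,2).
print(delivery_time(10.0, 0, 2))
```

An event engine built this way would schedule the receive event at the returned time instead of simulating routers or switches explicitly.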
Virtual Machine Service Layer
This layer supports the operation of task units and the lifecycle management of virtual machines,
such as allocating hosts to virtual machines, creating virtual machines, destroying virtual machines
and migrating virtual machines.
User Interface Structure Layer
This layer implements the interfaces between task units and virtual machine entities.
Entity Communication of Task Scheduling
Fig.2 shows the entity communication process of task scheduling in CloudSim. Since the Cloud
Information Service (CIS) provides an information registration function for cloud providers, in the
initial stage of the simulation each data center entity (DataCenter) registers with the CIS. Then
the DataCenterBroker, acting on behalf of users, consults the CIS to get the list of cloud providers
who can offer infrastructure services that match the application's QoS, hardware, and software
requirements. When a match is found, the DataCenterBroker deploys the user's application with the
cloud suggested by the CIS. While task units are being handled by their respective virtual machines,
their progress must be continuously updated and monitored. To do so, each DataCenter entity invokes
a method called updateVMsProcessing() for every host it manages. The virtual machines on that host
then update the processing of their currently active tasks with the host. At the host level, the
method updateVMsProcessing() triggers a method updateCloudletsProcessing(), which directs every
virtual machine to update the status of its task units (finished, suspended or executing) with the
DataCenter. This method performs the same function as updateVMsProcessing(), the difference being
that updateCloudletsProcessing() works at the virtual machine level. Once the method is invoked,
each virtual machine returns the next expected completion time of the task units it currently
manages. The smallest completion time among all the computed values is then sent to the DataCenter.
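The propagation of the smallest next completion time can be sketched as follows. This is a simplified model, assuming that every running task unit receives the full MIPS of its virtual machine; the data layout and the function names are hypothetical and only mirror the roles of updateCloudletsProcessing() and updateVMsProcessing() described above.

```python
def vm_next_completion(now, vm):
    """Earliest expected completion time among the VM's running task units.

    vm is a dict with "mips" and "running" (remaining lengths in MI).
    Returns None when the VM is idle.
    """
    if not vm["running"]:
        return None
    return now + min(vm["running"]) / vm["mips"]


def host_next_completion(now, vms):
    """Smallest next completion time over all VMs of one host."""
    times = [vm_next_completion(now, vm) for vm in vms]
    times = [t for t in times if t is not None]
    return min(times) if times else None


def datacenter_next_event(now, hosts):
    """Smallest next completion time over all hosts of the data center."""
    times = [host_next_completion(now, vms) for vms in hosts]
    times = [t for t in times if t is not None]
    return min(times) if times else None


# One host with two busy VMs and one idle VM (illustrative values).
hosts = [[
    {"mips": 200, "running": [1000]},
    {"mips": 100, "running": [300, 800]},
    {"mips": 150, "running": []},
]]
print(datacenter_next_event(0.0, hosts))
```

The data center would schedule its next internal event at the returned time and repeat the update when that event fires.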
[Fig.2, sequence diagram between DataCenter, CISRegistry and DataCenterBroker. Messages: registration, query, available datacenter, get characteristic, create VM, task scheduling, task completion, destroy VM. Inside the DataCenter, Host0 calls UpdateVMsProcessing() and each of VM0 ... VMn calls UpdateCloudletsProcessing() and returns the time of its next event; the smallest time of next event is returned to the DataCenter.]
Fig.2 Entity communication process of task scheduling
Extension of CloudSim and Task Scheduling
We have extended the open cloud computing simulation platform CloudSim 2.0 and implemented five
task scheduling algorithms in its source code. After recompiling the new platform, we ran
simulation programs for these five algorithms.
Simulation Environment
The software and hardware configuration of the local machine is: OS: Windows XP, CPU: Intel P4
2.8 GHz, Memory: 1 GB, Storage: 60 GB. Java build environment: JDK 1.6.0_26 + Eclipse SDK 3.4.1.
Simulation toolkit for the cloud computing environment: CloudSim 2.1.1.
Experimental Parameter
To run the simulation experiments on task scheduling in cloud computing, we simulated one data
center comprising one host, with four virtual machine requests and eight cloud tasks. Tables 1-4
give the detailed simulation parameters.
Table 1 Parameter list of cloud user tasks (cloudlets)
CloudletId  Length (MI)  FileSize  OutputSize
0           18375        300       300
1           48624        400       300
2           31384        240       200
3           45628        300       300
4           14562        400       500
5           21075        600       250
6           31579        250       200
7           31864        350       200
Table 2 Virtual machine list
VMId  CPU  MIPS  Ram   Storage  Bw   Share-Policy
0     1    267   2048  10000    100  Space-Shared
1     1    285   2048  10000    100  Space-Shared
2     1    210   2048  10000    100  Space-Shared
3     1    291   2048  10000    100  Space-Shared
Table 3 Datacenter list
Architecture  OS     VMM  Cost
X86           Linux  Xen  3.0
Table 4 Host list
HostId  MIPS  Ram   Storage  Bw    Share-Policy
0       1200  2048  100000   1000  Time-Shared
Task Scheduling Algorithm and Experimental Results
In CloudSim, the method bindCloudletToVm(int cloudletId, int vmId) of the DatacenterBroker class
binds a task (cloudlet) to a specified virtual machine, on which it is then executed. Fig.3 shows
the task scheduling model of CloudSim. By overriding this method, the following five task
scheduling algorithms are implemented:
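[Fig.3, four-column diagram: cloud users (User1, User2, User3, ..., Userk) submit cloudlets (T1, T2, T3, T4, ..., Tn), which are mapped to VMs (VM1, VM2, VM3, VM4, ..., VMm) running on the hosts of the datacenter (Host1, Host2, ...).]
Fig.3 Task scheduling model in CloudSim platform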
1) Sequenced Scheduling (SS): Tasks are assigned to the VMs in order; once every VM has been given
a task, allocation of new tasks starts again from the first VM. SS tries to ensure that each VM
runs the same number of tasks so as to even out the load. However, SS does not take differences in
task requirements and VM performance into account.
2) First Come First Served (FCFS): When all VMs are running tasks, the next task is assigned
directly to the VM that finishes its task first, and so on, until all tasks are finished. FCFS
avoids excessive VM idle time, but high-performance VMs may be overloaded.
3) Shortest Task First (STF): All tasks are sorted in ascending order of task length and are then
assigned to the VMs sequentially. The algorithm ensures that short tasks are finished first.
However, it also does not consider differences in task demands.
4) Balance Scheduling (BS): All tasks are sorted in descending order of task length and all VMs are
sorted in descending order of processing performance. SS is then applied to the sorted task set and
VM set. The aim of the algorithm is to assign long tasks to high-performance VMs and short tasks to
low-performance VMs, enforcing a balanced match between tasks and VMs. However, it overlooks VM
utilization.
5) Greedy Scheduling (GS): All tasks are sorted in descending order of task length, and all VMs are
sorted in ascending order of processing performance (MIPS). A matrix E(i,j) is defined whose
entries store task length/MIPS, i.e. the execution time of task i on VM j, in this sorted order. A
greedy strategy is applied to E(i,j): starting with the task in row 0, each task is first
tentatively assigned to the VM in the last column. If this choice is optimal compared with the
other choices, the allocation is kept. Otherwise, the task is assigned to the VM that gives the
best current result. The aim of this algorithm is to assign complex tasks to high-performance VMs,
which removes the bottleneck caused by complex tasks, while keeping the finish time of all tasks
close to the shortest possible. The drawback is that high-performance VMs may be overloaded and
energy consumption is high.
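The five strategies can be compared with a small standalone model. This is a sketch under stated assumptions, not the CloudSim extension itself: each VM runs its tasks one after another (space-shared), execution time is task length divided by MIPS, transfer costs are ignored, and GS is interpreted as "assign each task, longest first, to the VM on which it would finish earliest". The task lengths and MIPS values come from Table 1 and Table 2; all function names are hypothetical.

```python
LENGTHS = [18375, 48624, 31384, 45628, 14562, 21075, 31579, 31864]  # Table 1 (MI)
MIPS = [267, 285, 210, 291]                                         # Table 2


def round_robin(task_order, vm_order):
    """Assign tasks, in the given order, to VMs in cyclic order."""
    return {t: vm_order[i % len(vm_order)] for i, t in enumerate(task_order)}


def earliest_free(task_order, lengths, mips):
    """Assign each task to the VM that becomes free first."""
    ready = [0.0] * len(mips)
    plan = {}
    for t in task_order:
        vm = min(range(len(mips)), key=lambda v: ready[v])
        plan[t] = vm
        ready[vm] += lengths[t] / mips[vm]
    return plan


def earliest_finish(task_order, lengths, mips):
    """Assign each task to the VM on which it would finish first."""
    ready = [0.0] * len(mips)
    plan = {}
    for t in task_order:
        vm = min(range(len(mips)), key=lambda v: ready[v] + lengths[t] / mips[v])
        plan[t] = vm
        ready[vm] += lengths[t] / mips[vm]
    return plan


def total_finish_time(plan, lengths, mips):
    """Time at which the last VM finishes its queue of tasks."""
    busy = [0.0] * len(mips)
    for task, vm in plan.items():
        busy[vm] += lengths[task] / mips[vm]
    return max(busy)


tasks = list(range(len(LENGTHS)))
vms = list(range(len(MIPS)))
ascending = sorted(tasks, key=lambda t: LENGTHS[t])
descending = sorted(tasks, key=lambda t: LENGTHS[t], reverse=True)
fastest_first = sorted(vms, key=lambda v: MIPS[v], reverse=True)

plans = {
    "SS": round_robin(tasks, vms),
    "FCFS": earliest_free(tasks, LENGTHS, MIPS),
    "STF": round_robin(ascending, vms),
    "BS": round_robin(descending, fastest_first),
    "GS": earliest_finish(descending, LENGTHS, MIPS),
}
totals = {name: total_finish_time(p, LENGTHS, MIPS) for name, p in plans.items()}

for name, value in totals.items():
    print(f"{name}: {value:.2f}")
```

Under this simplified model GS gives the shortest and STF the longest total finish time for the Table 1 and Table 2 data. The absolute values are not expected to match the CloudSim results, which depend on the simulator's own timing and sharing policies.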
Fig.4 shows the finish time of each cloudlet under the five scheduling algorithms. Fig.5 shows the
total finish time of all tasks under the five scheduling algorithms. Because GS applies a greedy
strategy to the cloudlets across all virtual machines, its finish time is the shortest. STF does
not consider differences in VM processing performance, so its finish time is the longest. The
finish time of GS is about 25% shorter than that of STF. Because tasks are allocated sequentially
at the beginning, the finish time of SS is close to that of FCFS. As the number of cloudlets
increases, FCFS performs better. Because it takes differences in both task length and VM processing
performance into account, BS is a suboptimal algorithm.
[Fig.4, plot: finish time of task (0-350) versus CloudletId (0-7), one series each for SS, FCFS, STF, BS and GS. Fig.5, plot: total finish time (0-350) for each task scheduling algorithm (SS, FCFS, STF, BS, GS).]
Fig.4 The finish time of each cloudlet Fig.5 The total finish time of all cloudlets
Conclusion
This paper introduces the simulation mechanism of the cloud computing platform CloudSim, with
emphasis on its layered architecture. On this basis, taking task scheduling under CloudSim as the
research object, five task scheduling algorithms are presented. We extend the CloudSim platform and
implement these five algorithms. Simulation experiments and performance analysis are carried out
after deploying a specified cloud scenario. This work provides ideas and solutions for task
scheduling strategies and for extending CloudSim with new algorithms in cloud computing
environments.
Acknowledgment
This work was supported by the National Natural Science Foundation of China Grant No. 60970064,
Guangdong and Hong Kong invited bidding special for Dongguan Grant No. 201120510101, Dongguan Major
Science and Technology Special Project No. 2009215102001, Guangdong province high-tech zone
development guide special Project No. 2011B010700043 and Chinese Academy of Sciences cooperation
project.
References
[1] Liu Peng. Cloud Computing (Second Edition)[M]. Beijing: Electronic Industry Press, 2011.
[2] Vaquero, L.M., L. Rodero-Merino, J. Caceres, et al. A break in the clouds: towards a cloud
definition. SIGCOMM Computer Communication Review, 2008, 39(1): 50-55.
http://dx.doi.org/10.1145/1496091.1496100
[3] Google App Engine. http://appengine.google.com.
[4] Amazon EC2. http://www.amazon.com/ec2.
[5] IBM Cloud. http://www.ibm.com/cloud-computing/us/en/.
[6] Microsoft Azure. http://www.microsoft.com/azure.
[7] SimCloud Platform. http://simcloud.com/.
[8] Calheiros, R.N., R. Ranjan, A. Beloglazov, et al. CloudSim: a toolkit for modeling and
simulation of cloud computing environments and evaluation of resource provisioning
algorithms. Software-Practice & Experience, 2011, 41(1): 23-50.
[9] Buyya, R., R. Ranjan, R.N. Calheiros. Modeling and simulation of scalable Cloud computing
environments and the CloudSim toolkit: Challenges and opportunities. In International
Conference on High Performance Computing & Simulation, HPCS'09. 2009: 1-11.
[10] Buyya, R., M. Murshed. GridSim: a toolkit for the modeling and simulation of distributed
resource management and scheduling for Grid computing. Concurrency and
Computation-Practice & Experience, 2002, 14(13-15): 1175-1220.