
EDR: An Energy-Aware Runtime Load Distribution System for Data-Intensive Applications in the Cloud

Bo Li, Virginia Tech

Shuaiwen Leon Song, Pacific Northwest National Lab

Ivona Bezakova, Rochester Institute of Technology

Kirk W. Cameron, Virginia Tech

Abstract: Data centers account for a growing percentage of US power consumption. Energy efficiency is now a first-class design constraint for the data centers that support cloud services. Service providers must distribute their data efficiently across multiple data centers. This includes creation of data replicas that provide multiple copies of data for efficient access. However, selecting replicas to maximize performance while minimizing energy waste is an open problem. State-of-the-art replica selection approaches either do not address energy, lack scalability, and/or are vulnerable to crashes due to use of a centralized coordinator. Therefore, we propose, develop and evaluate a simple cost-oriented decentralized replica selection system named EDR (Energy-Aware Distributed Running system), implemented with two distributed optimization algorithms. We demonstrate experimentally the cost differences in various replica selection scenarios and show that our novel approach is as fast as the best available decentralized approach, DONAR, while additionally considering dynamic energy costs. We show that an average of 12% savings on total system energy costs can be achieved by using EDR for several data-intensive applications.

I. Introduction

Cloud computing is built upon an infrastructure of geographically distributed data centers throughout the world. The most common way to protect against data loss in the cloud is through replication, that is, maintaining multiple copies of data [1]. When a user needs access to their data in the cloud, the system responds by selecting a copy of the data from various geographically distributed locations and providing the data to the end user. During this process (called replica selection), the system selects the copy of the data that it believes will result in the lowest latency (or fastest data transfer), least packet loss, etc. Service providers can use replica selection to distribute load generated by user requests, thus globally optimizing the use of available resources.

Replica selection requires access to a coordinator that keeps track of where account data resides globally. As expected, for large systems with a large number of users, the data needed by a coordinator can grow unwieldy. Additionally, as data is moved dynamically at any time, the coordinator data must be kept up to date continuously.

Replica selection coordinators are typically implemented using either a centralized or a distributed approach. Centralized coordinators are typically simpler and often faster than distributed implementations but suffer from a single point of failure [2] and poor scalability. Distributed coordinators are typically more complicated and potentially slower than centralized implementations, but they can be more scalable [3].

Current replica selection implementations focus on optimizing for bandwidth and latency and do not consider the cost of the power necessary to access a replica. However, according to the prediction from the 2007 EPA report [4], the energy used to power data centers will likely exceed 3% of the total US energy use by 2013. Power is now the biggest cost item in a data center [5][6], and the costs per kWh also vary widely by region globally [7]. Hence, from the perspective of the data center operators (i.e., service providers), not all data costs the same to serve. Substantial research has been conducted to optimize these data centers both locally [8] and globally [9] in attempts to reduce the number of machines powered on 24x7.

To reduce costs globally, future data center operators will require replica selection that considers not only bandwidth capacity and network latency but also the dollar cost (e.g., power) of accessing different replicas. In this paper, we propose a decentralized replica selection system named EDR, which considers data transfer under varied regional power costs, bandwidth capacity and network latency of the data centers. We demonstrate experimentally the cost differences in various replica selection scenarios and show that our novel approach performs as well as the best available decentralized approach, "DONAR" [3], while additionally considering dynamic energy costs. In particular, we show how an average of 12% savings can be achieved using our methods for data-intensive applications such as online video streaming and distributed file sharing.

The rest of this paper is organized as follows. In Section II, we discuss related work. Then, we present our methodology for solving the energy-aware replica selection problem in Section III. After that, Section IV shows the evaluation of performance as well as the energy cost reduction of our EDR framework. Finally, in Section V, we conclude with a brief summary and describe our future work.

II. Related Work

A. Energy Efficiency in a Data Center

Energy efficiency in the cloud has been studied as an important issue by others. Efforts to reduce energy cost have addressed hardware, software, and networking aspects [10]. For example, resource allocation [11] and scheduling algorithms considering QoS [12][13] can improve the energy efficiency of a data center as well as guarantee the quality of services. However, such work does not investigate the workload distribution of client requests among all the data centers, which can also affect the total energy cost even if the resources have already been optimally allocated. In particular, for data-intensive services, the distribution of workload to each data center can significantly affect the energy cost. Zong et al. [14] apply a buffer-disk to schedule storage system tasks, so that energy consumption can be reduced by keeping a large number of data disks idle. But they only consider the relationship between disk state and power consumption, regardless of the different workload types for the disk. Kim et al. [15] try to reduce data center energy by introducing low-power devices. But they do not consider the electricity prices at different geographical locations. Rao et al. [9] incorporate multiple electricity prices into their energy model for data centers. However, they do not consider the bandwidth capacity and do not provide a decentralized solution to their optimization model. In order to minimize the energy cost of data centers, Liu et al. [16] take the workload and the number of active servers in each data center into consideration. However, they assume that the energy consumption of a single server does not depend on the traffic load, which is not practical for modeling data-intensive applications in the cloud.

978-1-4799-0898-1/13/$31.00 ©2013 IEEE

Smith and Sommerville [17] studied the impacts of various types of applications on the energy consumption of different subcomponents within a server. Furthermore, a linear relationship between data-intensive workload and energy consumption for hard disks in server systems is validated in [18]. However, we cannot assume that a linear relationship exists between workload and the energy consumption of other components in data centers. For instance, the majority of network devices are far from being energy proportional [19], [20].

B. Replica Selection in Cloud

Ruiz-Alvarez and Humphrey [21] present an approach for selecting storage services in the cloud. However, this approach has some security issues when accessing multiple data centers at runtime, so it is hard to apply to replica selection problems. Le et al. [22], Chen et al. [23], and Rajamani et al. [24] have studied the load distribution problem in clusters and data centers. However, they have not considered a decentralized system architecture. The works [3] and [25] present decentralized replica selection systems for load distribution. These systems have been validated to work well in the wild. However, energy efficiency issues are not considered in their frameworks.

III. Methodology

In this section, we first present a simple energy cost model for data centers used by our runtime scheduler. Based on this model, we then formulate the replica selection problem as a convex optimization problem which minimizes the total energy cost of data centers subject to the bandwidth capacity and network latency of each data center. After that, we propose a simple decentralized framework named EDR where the distributed nodes cooperate with each other to solve the global optimization problem in parallel. Finally, we adapt two parallel algorithms into EDR and give a detailed analysis of their individual communication complexity and theoretical convergence rate.

Some important model-related parameters in this section are summarized in Table I and mapped to the system service architecture in Fig. 1.

TABLE I. Notations

Symbol | Description
$C$ | Set of all clients
$N$ | Set of all replicas
$E_g$ | Total energy consumption of all the replicas in the cloud
$E_n$ | Total energy consumption for replica $n$
$p_{c,n}$ | Traffic load mapped from client $c$ to replica $n$
$P_n$ | Constraint sets on replica $n$
$B_n$ | Bandwidth capacity on replica $n$
$T$ | User-defined max tolerable network latency
$R_c$ | Traffic load of the request from client $c$
$t_{c,n}$ | Network latency from client $c$ to replica $n$
$u_n$ | Unit price (¢) of power in replica $n$
$a_n$ | Weight value of replica $n$ in consensus-based algorithm
$\alpha_n, \beta_n$ | Weight scalars for the energy consumption of servers and network devices in replica $n$
$\gamma_n$ | Parameter to correlate traffic load to network devices' energy consumption for replica $n$. The value of $\gamma_n$ depends on the underlying device's architecture

A. Energy Cost Model for Data Center

In order to model the energy cost of data-intensive applications in the cloud, we build a correlation between energy consumption and the workload from the clients. Since the primary goal of the model is to help clients make runtime decisions on how to efficiently distribute workload among replicas, we want to keep the model simple and only reflect the major energy-consuming components which are significantly impacted by the traffic loads. Therefore, we make the following assumptions for modeling data center energy consumption:

1). From a system component-level modeling perspective, it can be nontrivial to describe the total energy consumption of a single server node, which normally consists of CPU, memory, hard disk, motherboard, accelerators, system and CPU fans, etc. Detailed system-level modeling has been conducted in work such as [26][27], which often involves modeling on-chip and off-chip time and power as affected by workload and hardware frequency. In our case, since we are primarily targeting data- and network-intensive applications, we can assume they provide relatively consistent workload intensity for individual servers during each period of execution time. Thus, we can assume the power consumption for each replica is constant and the time spent on each replica is linear in the workload. Consequently, it is reasonable to assume that the relationship between the energy consumption of each replica and its workload is linear as well.

2). Network devices' energy is also a significant contributor to the overall data center energy consumption. It depends on several factors, including traffic load, temperature, quality of service (QoS) policies and floor space. In this work, we only consider the most influential factor, traffic load [28]. Restrepo et al. [29] have studied data centers' network energy profiles and categorized the relationships between network energy and traffic loads. For instance, switch architectures such as Batcher and Crossbar [30] generally follow the "Linear" relationship, while the "Cubic" relationship often corresponds to common data-intensive workloads on network devices in the cloud. The value of $\gamma_n$ heavily depends on the underlying network architecture. In this work, we only consider the energy cost from network devices such as NICs, routers and switches while ignoring the energy spent on the optical transmitters.

3). Infrastructure energy consumption such as cooling is another major part of the data center operating costs. One report [4] indicates that cooling alone can contribute as much as 33% of the entire data center's energy consumption. However, cooling energy can be very complicated to model because it is not directly impacted by the workload. Based on PUE [31], we can treat this part of the energy cost as a fraction of the total system energy consumption. Since it does not affect the runtime scheduling decision, we simply ignore its effects in the final model.

Based on the assumptions above, we can build a simple energy cost model for data centers that considers three major energy-consuming components: server nodes, network devices (routers, switches, etc.), and the cooling system. Following the discussion of cooling energy in Section III-A(3), the cooling term is left out of the final model, which is therefore a weighted combination of a linear relationship (for servers) and a degree-$\gamma_n$ polynomial relationship (for network devices) between energy consumption and network traffic load. The total energy consumption of all the replicas can be modeled as:

$$E_g = \sum_{n \in N} u_n \cdot \Big( \alpha_n \sum_{c \in C} p_{c,n} + \beta_n \Big( \sum_{c \in C} p_{c,n} \Big)^{\gamma_n} \Big) \tag{1}$$

where $\alpha_n$ and $\beta_n$ are weight scalars for the energy consumption of servers and network devices in replica $n$. The goal of our problem is to minimize $E_g$ for the clients' requests to the data centers. The global optimization problem can be formulated as:

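$$\begin{aligned}
\underset{p_{c,n}}{\text{minimize}} \quad & E_g = \sum_{n \in N} E_n \\
\text{subject to} \quad & f_n(P) = \sum_{c \in C} p_{c,n} - B_n \le 0, \quad \forall n \in N \\
& h_c(P) = \sum_{n \in N} p_{c,n} - R_c = 0, \quad \forall c \in C \\
& e_{c,n}(P) = t_{c,n} - T \le 0, \quad \forall n \in N,\ c \in C
\end{aligned} \tag{2}$$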

where $E_n = u_n \cdot \big( \alpha_n \sum_{c \in C} p_{c,n} + \beta_n ( \sum_{c \in C} p_{c,n} )^{\gamma_n} \big)$ is the energy consumption of replica $n$, $f_n(P)$ is the bandwidth capacity constraint of replica $n$, $h_c(P)$ is the request constraint of client $c$, and $e_{c,n}(P)$ represents the network latency constraint from client $c$ to replica $n$. The problem turns out to have a degree-$\gamma_n$ polynomial objective function (a convex function) with several linear equality and inequality constraints.
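As a minimal sketch of how the objective in (1) and the constraints in (2) could be evaluated for a candidate assignment, the Python fragment below computes $E_g$ and checks feasibility. The function names and the matrix layout `p[c][n]` are illustrative assumptions, not part of EDR. The latency check is also an interpretation: it only rejects a replica that actually receives load from a client whose latency to it exceeds $T$.

```python
from typing import List

Matrix = List[List[float]]  # p[c][n]: load sent from client c to replica n


def replica_load(p: Matrix, n: int) -> float:
    """Total traffic load assigned to replica n (sum over all clients)."""
    return sum(row[n] for row in p)


def total_energy_cost(p: Matrix, u: List[float], alpha: List[float],
                      beta: List[float], gamma: List[float]) -> float:
    """E_g from equation (1): price-weighted server and network terms."""
    cost = 0.0
    for n in range(len(u)):
        load = replica_load(p, n)
        cost += u[n] * (alpha[n] * load + beta[n] * load ** gamma[n])
    return cost


def is_feasible(p: Matrix, requests: List[float], bandwidth: List[float],
                latency: Matrix, max_latency: float,
                tol: float = 1e-9) -> bool:
    """Check the three constraint families of problem (2)."""
    for n in range(len(bandwidth)):            # f_n: bandwidth capacity
        if replica_load(p, n) > bandwidth[n] + tol:
            return False
    for c, row in enumerate(p):
        if abs(sum(row) - requests[c]) > tol:  # h_c: request fully served
            return False
        for n, load in enumerate(row):         # e_{c,n}: latency bound
            # Assumption: the bound only matters for replicas actually used.
            if load > tol and latency[c][n] > max_latency:
                return False
    return True
```

For example, with two clients and two replicas, `total_energy_cost([[10, 0], [20, 5]], [2, 1], [1, 1], [0.01, 0.01], [3, 3])` evaluates the price-weighted cost of sending 30 units to the first replica and 5 units to the second.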

B. EDR System Architecture

The EDR system is built on top of the common data center infrastructure without additional devices, as shown in Fig. 1. In the system, each replica keeps listening for the clients' incoming requests. Once the requests are received, the replicas start cooperating with each other to solve the global optimization problem.

In EDR, the replica selection service is transparent to the clients, which means the clients do not need to know which replica(s) they are communicating with. This is decided at runtime by EDR. Higher system reliability and better scalability can also be achieved through EDR's decentralized architecture. If the replica selection work is assigned to a single central agent, the crash of that agent can cause the failure of the entire replica selection system. Such a failure is unlikely in a decentralized environment with efficient fault tolerance mechanisms unless all the replicas malfunction. However, the decentralized solution does not always outperform the centralized method. For a lower runtime calculation workload (e.g., fewer clients and a smaller workload in each request), the communication and synchronization overhead may result in a performance degradation for the decentralized system. Fortunately, users and companies care more about the energy consumption during the peak service hours, which dominate the entire operating cost. Still, selecting high-performance distributed algorithms for solving the global optimization problem in EDR is essential for minimizing the energy cost during the decision-making phase at runtime. The detailed descriptions of the distributed algorithms used in EDR can be found in Section III-D.

Fig. 1. An illustration of a general service-oriented cloud with multiple replicas and clients.

Fig. 2. The EDR server side components diagram.

C. EDR System Design

The replica selection system involves a client side and a server side. Both are designed as multi-threaded programs that use TCP/IP sockets for communication. The structure of the server-side program is illustrated in Fig. 2. The ClientListener thread listens for new requests from clients. The ReplicaListener thread listens for requests for solution information from other replicas. The FileDownload thread handles the sending of requested files to the clients. When a new client request arrives, the client first communicates with the ClientListener thread and then waits for the solution that specifies how to distribute its requested load. Once the solution is reached, the client side creates new threads that communicate with all the replicas at the same time to download the computed amount of load from each.

In the current EDR design, we guarantee the reliability of the system by using a combination of a time-out mechanism and a ring fault-tolerance structure. The ReplicaListener thread is used to communicate between replicas. Once a replica malfunctions, the other replicas detect the failure and remove the dead replica from their "active member lists" and from the ring structure. After that, EDR performs the runtime scheduling again based on the new ring of replicas.
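The time-out and ring bookkeeping described above can be sketched as follows. This is a hypothetical illustration only: the class name `RingMembership`, its methods, and the use of explicit timestamps are assumptions introduced here, and the paper does not specify how EDR stores its active member list.

```python
from typing import Dict, List, Optional


class RingMembership:
    """Active member list kept by one replica (hypothetical sketch)."""

    def __init__(self, members: List[str], timeout: float) -> None:
        self.timeout = timeout
        self.ring: List[str] = list(members)  # ring order of live replicas
        self.last_seen: Dict[str, float] = {m: 0.0 for m in members}

    def heartbeat(self, member: str, now: float) -> None:
        """Record that a message from `member` arrived at time `now`."""
        if member in self.last_seen:
            self.last_seen[member] = now

    def prune(self, now: float) -> List[str]:
        """Drop replicas that have been silent for longer than the time-out."""
        dead = [m for m in self.ring
                if now - self.last_seen[m] > self.timeout]
        for m in dead:
            self.ring.remove(m)
            del self.last_seen[m]
        return dead

    def successor(self, member: str) -> Optional[str]:
        """Next live replica after `member` in the ring, if there is one."""
        if member not in self.ring or len(self.ring) < 2:
            return None
        i = self.ring.index(member)
        return self.ring[(i + 1) % len(self.ring)]
```

After `prune` reports a non-empty list, a replica would re-run the scheduling step over the remaining members, which corresponds to the rescheduling on the new ring described above.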

D. Solving Global Optimization Problem: LDDM vs. CDPSM

Solving an optimization problem with constraints in a distributed environment is not as easy as on a single node. In order to solve (2) in a distributed manner, we consider two candidate methods: the Lagrangian dual decomposition method (LDDM) [32] and the consensus-based distributed projected subgradient method (CDPSM) [33]. LDDM is an iterative method for solving a convex optimization problem in parallel. Rather than forming the dual problem, CDPSM uses a consensus mechanism among distributed agents to solve a convex optimization problem whose objective function is the sum of several local objective functions. Both can be adapted to solve our constrained convex optimization problem in parallel. In this paper, we implement both algorithms and then compare their communication complexity and convergence speed.

1) Consensus-based Distributed Projected Subgradient Method (CDPSM): This method was originally proposed to solve constrained optimization problems in multi-agent networks. In our paper, we adapt this method to our EDR system. The objective function $E_g$ in our replica selection problem is the sum of functions which are local objective functions for replicas, in the form of $E_g = \sum_{n \in N} E_n$. Each replica works on solving its own local optimization problem $E_n$, which is subject to the local constraints $p_{c,n} \in P_n$, where $P_n$ is the subset of the constraint sets that involve the local variables of replica $n$. The optimization problem in replica $n$ can be formulated as:

$$\begin{aligned}
\underset{p_{c,n}}{\text{minimize}} \quad & E_n \\
\text{subject to} \quad & p^{n}_{c,n} \in P_n
\end{aligned}$$

The main idea of this algorithm is to use a consensus mechanism among distributed replicas to split the computation work. Each distributed replica keeps working on solving a subproblem of the global problem. The consensus mechanism combines the solutions of the subproblems to form the global optimization solution. Given that $p_{c,n}$ is the solution to the global optimization problem, each replica $n$ starts from its own estimate $\{p_{c,n} \mid c \in C,\ n \in N\}^{n} \in P_n$ and updates its solution $p_{c,n}$ iteratively by cooperating with other replicas. The consensus and projection procedure for the iterative estimation can be denoted by the following equation:

$$p^{n}_{c,n}(k+1) = \mathrm{Proj}_{P_n}\Big[ \sum_{j=1}^{N} a^{j}_{n} \cdot p^{j}_{c,n}(k) - d_k \cdot g_n(k) \Big]^{+} \tag{3}$$

where $a^{j}_{n}$ are the weights of all the replicas, $d_k > 0$ is the step size, and $g_n(k)$ is the subgradient of the local objective function $E_n$. Since the objective function of our problem is twice differentiable, we can use the gradient instead of a subgradient as $g_n(k)$. The symbol $\mathrm{Proj}_{P_n}[\cdot]^{+}$ denotes the projection operation. We have:

$$\mathrm{Proj}_{P_n}[p^{*}_{c,n}]^{+} = \arg\min_{p_{c,n} \in P_n} \| p^{*}_{c,n} - p_{c,n} \|$$

By projecting the solution $p_{c,n}$ back into its own local constraint set $P_n$, the algorithm guarantees that the solution is feasible in each iteration.

Based on this method, every replica in our system keeps running to handle client requests and the consensus mechanism. We can present the algorithm for each replica as follows:

Algorithm 1 Algorithm of CDPSM
1: Initialization: Set the unit price of replica $n$.
2: repeat
3:   Collect the requests from the clients.
4:   Collect the solutions $p^{i}_{c,n}$ from the other replicas $i$.
5:   Get the consensus solution $V_{c,n} = \sum_{i} a_i\, p^{i}_{c,n}$, where $\sum_{i} a_i = 1$.
6:   Update the solution by $p_{c,n} = V_{c,n} - d \cdot g(V_{c,n})$, where $d$ is the step size and $g(V_{c,n})$ is the gradient of the function $E_n$ at $V_{c,n}$.
7:   Project $p_{c,n}$ onto the constraint sets following the projection rule $\mathrm{Proj}_{P_n}[p^{*}_{c,n}]^{+}$.
8: until $p_{c,n}$ does not change.
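The loop of Algorithm 1 can be illustrated with a small single-process simulation. The sketch below is a simplification under stated assumptions, not the EDR implementation: it considers a single client with request $R$, fixes $\gamma_n = 3$, uses uniform consensus weights $a_i = 1/|N|$, and takes every local constraint set to be $\{x : \sum_n x_n = R,\ 0 \le x_n \le B_n\}$ (the non-negativity bound is an added assumption). The names `project`, `cdpsm` and `energy_cost` are hypothetical.

```python
from typing import List


def project(v: List[float], total: float, caps: List[float]) -> List[float]:
    """Euclidean projection onto {x : sum(x) = total, 0 <= x_i <= caps_i}.

    The projection has the form x_i = clip(v_i - tau, 0, caps_i); the shift
    tau is found by bisection. Assumes sum(caps) >= total >= 0.
    """
    def clipped(tau: float) -> List[float]:
        return [min(max(vi - tau, 0.0), ci) for vi, ci in zip(v, caps)]

    lo = min(v) - max(caps)  # every entry sits at its cap: sum >= total
    hi = max(v)              # every entry sits at zero:    sum <= total
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sum(clipped(mid)) > total:
            lo = mid
        else:
            hi = mid
    return clipped(0.5 * (lo + hi))


def energy_cost(loads: List[float], prices: List[float],
                alpha: float = 1.0, beta: float = 0.01) -> float:
    """Sum of E_n = u_n * (alpha * p_n + beta * p_n**3) over replicas."""
    return sum(u * (alpha * p + beta * p ** 3)
               for u, p in zip(prices, loads))


def cdpsm(prices: List[float], caps: List[float], request: float,
          alpha: float = 1.0, beta: float = 0.01,
          step: float = 0.05, iters: int = 2000) -> List[float]:
    """Simulate all replicas running the consensus/gradient/projection loop."""
    n = len(prices)
    start = project([request / n] * n, request, caps)  # even split
    estimates = [list(start) for _ in range(n)]        # one copy per replica
    for _ in range(iters):
        # Step 5: consensus with uniform weights a_i = 1/n.
        consensus = [sum(est[i] for est in estimates) / n for i in range(n)]
        new_estimates = []
        for j in range(n):
            trial = list(consensus)
            # Step 6: gradient of E_j only touches replica j's own load.
            grad = prices[j] * (alpha + 3.0 * beta * consensus[j] ** 2)
            trial[j] -= step * grad
            # Step 7: project back onto the constraint set.
            new_estimates.append(project(trial, request, caps))
        estimates = new_estimates
    return [sum(est[i] for est in estimates) / n for i in range(n)]
```

Calling `cdpsm([1, 8, 1, 6], [100] * 4, 60)` shifts load toward the cheaper replicas while keeping the total equal to the request. A constant step size is used here, as in the comparison made later in this section.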

The size of the solution $p_{c,n}$ in each replica is $O(|C| \cdot |N|)$. The consensus mechanism requires each distributed replica to request the solutions from the other replicas. So the communication complexity of each iteration is of size $O(|C| \cdot |N| \cdot (|N|-1) \cdot |N|)$, which is approximately $O(|C| \cdot |N|^3)$, where $|C|$ is the number of clients and $|N|$ is the number of replicas.

2) Lagrangian Dual Decomposition Method (LDDM): Since there are dependencies in the global variables among replicas, we need to decouple them in order to solve the problem in parallel. LDDM provides us with a way to solve such a problem. Given the original problem (2), we can formulate the Lagrangian dual problem from the global optimization problem as:

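$$\begin{aligned}
\underset{p_{c,n}}{\text{minimize}} \quad & L(p_{c,n}, \mu) = \sum_{n=1}^{N} E_n + \sum_{c=1}^{C} \mu_c \cdot h_c(P) \\
\text{subject to} \quad & f_n(P) = \sum_{c \in C} p_{c,n} - B_n \le 0, \quad \forall n \in N \\
& e_{c,n}(P) = t_{c,n} - T \le 0, \quad \forall n \in N,\ c \in C
\end{aligned} \tag{4}$$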

By using the Lagrangian multiplier $\mu$, the equality constraints that contain the global coupling variables of the original problem are moved into the objective function of its dual problem (4). Each replica in our system therefore only needs to solve its local optimization problem and periodically obtain the updated $\mu$ from the clients. The local optimization subproblem is defined as (in replica $n$):

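$$\begin{aligned}
\underset{p_{c,n}}{\text{minimize}} \quad & E_n + \sum_{c=1}^{C} \mu_c \cdot p_{c,n} \\
\text{subject to} \quad & \sum_{c \in C} p_{c,n} - B_n \le 0 \\
& t_{c,n} - T \le 0, \quad \forall c \in C
\end{aligned} \tag{5}$$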

where $\{p_{c,n} \mid c \in C\}$ are the local variables in replica $n$. The task of updating $\mu$ is assigned to the clients since the equality constraints in the original problem (2) are associated with each client request. The update of $\mu$ is done by solving problem (6). Because $\mu$ can be any real vector, problem (6) is unconstrained and a gradient method can be used to solve it.

$$\begin{aligned}
\underset{\mu}{\text{maximize}} \quad & g(\mu) = \inf_{p_{c,n}} L(p_{c,n}, \mu) \\
\text{subject to} \quad & \mu \in \mathbb{R}^{C}
\end{aligned} \tag{6}$$

We implement the algorithm as below:

Algorithm 2 Algorithm of LDDM (Replica $n$)
1: Initialization: Set the unit price of replica $n$.
2: Collect the clients' requests and their values of $\mu$ and inform the other replicas.
3: repeat
4:   Solve the local optimization problem (5).
5:   Send solution $p_{c,n}$ to each client $c$.
6:   Request the new $\mu_c$ from the client $c$.
7:   Stop if $\{p_{c,n} \mid c \in C\}$ does not change.
8: until $p_{c,n}$ does not change.
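To make the interplay between the local problem (5) and the multiplier update concrete, the following sketch runs dual decomposition for a simplified instance. The simplifications are assumptions made for illustration: a single client with request $R$, $\gamma_n = 3$, a lower bound $p_n \ge 0$ on each load, and a constant step size. Under these assumptions the local problem has a closed-form solution, and the client updates $\mu$ with a gradient step on the request residual $h_c(P)$. The names `local_solve` and `lddm` are hypothetical.

```python
import math
from typing import List, Tuple


def local_solve(mu: float, price: float, cap: float,
                alpha: float = 1.0, beta: float = 0.01) -> float:
    """Replica side: argmin over 0 <= p <= cap of E_n(p) + mu * p.

    With E_n(p) = price * (alpha * p + beta * p**3), the derivative
    price * (alpha + 3 * beta * p**2) + mu is increasing in p, so the
    minimizer is either 0 or the stationary point clipped to the cap.
    """
    slack = -mu / price - alpha
    if slack <= 0.0:
        return 0.0
    return min(math.sqrt(slack / (3.0 * beta)), cap)


def lddm(prices: List[float], caps: List[float], request: float,
         step: float = 0.1, iters: int = 2000,
         tol: float = 1e-9) -> Tuple[List[float], float]:
    """Alternate local solves (replicas) and multiplier updates (client)."""
    mu = 0.0
    loads = [0.0] * len(prices)
    for _ in range(iters):
        # Each replica solves its own subproblem for the current mu.
        loads = [local_solve(mu, u, b) for u, b in zip(prices, caps)]
        # Client side: residual of the request constraint h_c(P).
        residual = sum(loads) - request
        if abs(residual) < tol:
            break
        # Gradient ascent on the dual function g(mu).
        mu += step * residual
    return loads, mu
```

In this sketch the only value exchanged between the client and each replica per iteration is the scalar $\mu$ and the replica's load, which reflects the pairwise client-replica coordination of LDDM.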

To achieve higher performance for distributed algorithms, both low communication complexity and a high convergence rate are required. Compared with CDPSM, the system implemented with LDDM has lower complexity. Its runtime coordination is between pairs of clients and replicas, so there is little communication among the replicas. The size of the solution of each replica is $O(|C|)$. The communication complexity of each iteration is $O(|C| \cdot |N|)$, which is lower than the complexity of CDPSM shown in the previous subsection. In theory, LDDM also has a higher convergence rate than CDPSM. Fig. 5 shows the comparison of the simulated convergence rates of these two methods. To illustrate the concept, we conduct a simulation experiment with three replicas using MATLAB. For solving the same optimization problem, CDPSM converges more slowly than LDDM. So, theoretically speaking, LDDM is expected to have higher performance for solving our problem. This expectation is validated in Section IV by experiments that run data-intensive applications on real-world machines.
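The gap between the two per-iteration communication costs can be made concrete with a short calculation. The helper names below are hypothetical, and the counts are the leading terms quoted in the text with constant factors ignored.

```python
def cdpsm_volume(num_clients: int, num_replicas: int) -> int:
    """Leading term |C| * |N| * (|N| - 1) * |N| for one CDPSM iteration."""
    return num_clients * num_replicas * (num_replicas - 1) * num_replicas


def lddm_volume(num_clients: int, num_replicas: int) -> int:
    """Leading term |C| * |N| for one LDDM iteration."""
    return num_clients * num_replicas


# Example: 100 clients and 8 replicas.
ratio = cdpsm_volume(100, 8) // lddm_volume(100, 8)
```

For 100 clients and 8 replicas the two terms are 44800 and 800, so the ratio is $|N| \cdot (|N|-1) = 56$, independent of the number of clients.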

Additionally, when solving a convex minimization problem, the step size chosen in the algorithm can affect the convergence speed or even determine whether the algorithm converges at all. To guarantee the fairness of the comparison, we use a constant step size for both algorithms in this paper.

IV. Experiment Results and Analysis

In this section, we first present a system which can emulate the behavior of data centers in the cloud in terms of energy consumption. Then, we use two types of data-intensive applications, video streaming and distributed file services, to evaluate data centers' performance and energy cost with EDR. Finally, we demonstrate experimentally the cost differences in various replica selection scenarios and show that EDR performs as well as DONAR while additionally taking system energy costs into consideration.

Fig. 5. Simulation results for CDPSM and LDDM methods in our EDR system. Different convergence rates have been shown here for comparison.

A. System Setup and Assumptions

1) Assumptions: In the following experiments, we use a single cluster node to emulate a real replica. We assume that, for data-intensive applications, the energy cost model of a single cluster node is very similar to that of a data center. We justify this as follows. Assuming we have workload $p$, based on equation (1) we can describe the energy consumption ($E_s$) of a single cluster machine as:

$$E_s = \alpha p + \beta p^{3} \tag{7}$$

We assume $\gamma_n = 3$ here for data-intensive applications. If we are using a data center to handle $p$ client requests, the task can be split into $p_i$ where $\sum_{i=1}^{N} p_i = p$ and $N$ is the number of nodes involved with this task in this data center. So the energy consumption ($E_d$) of this data center for request $p$ is:

$$E_d = \sum_{i=1}^{N} (\alpha p_i + \beta p_i^{3}) = \alpha \sum_{i=1}^{N} p_i + \beta \sum_{i=1}^{N} p_i^{3} = \alpha p + \beta \sum_{i=1}^{N} p_i^{3} \tag{8}$$

In reality, the energy consumption of network devices is much lower than that of servers in a data center. Therefore, we can assume that the value of $\beta$ is much smaller than $\alpha$ in equation (1), so that $E_s \approx E_d$. Based on this assumption, it is reasonable for us to use a cluster node to model the energy behavior of a real replica in the cloud.

2) System Setup: We use eight nodes of our SystemG cluster as replicas to conduct our experiments. The SystemG cluster is a 22.8 TFLOPS supercomputer providing a research platform for the development of high-performance software and simulation tools. Each node is equipped with two quad-core 2.8 GHz Intel Xeon processors, 8 GB of RAM, and a 6 MB cache. SystemG is also equipped with both Ethernet and InfiniBand adapters and switches. In this experiment, we use the Dominion PX Intelligent Power Distribution Units to dynamically profile the power consumption of the controlled machines. The power sampling rate is approximately 50 times/sec.

The model parameters used in this section are defined as follows: 1) for the electricity prices (¢/kWh) in this study, we randomly generate an integer between 1 and 20 for each of the 8 replicas in every experiment, in order to simulate the various power prices of data centers in different geographical locations; 2) the bandwidth cap for our SystemG Ethernet is approximately 100 MB/s; 3) we set the user-defined max tolerable network latency $T$ to 1.8 ms, the worst-case scenario for one full-size frame (1518 bytes) under heavy workload on SystemG; 4) according to the measurements on SystemG, we set the scalars $\alpha_n = 1$ and $\beta_n = 0.01$.

In our experiments, we use two types of data-intensive applications: video streaming and a distributed file service. The size per request differs between these two applications: approximately 100 MBytes for video streaming and approximately 10 MBytes for the distributed file service.

Fig. 3. Runtime power profile for individual replica using CDPSM (distributed file service)

Fig. 4. Runtime power profile for individual replica using LDDM (distributed file service)

B. Performance and Power Analysis

In this subsection, we use a data-intensive application (the distributed file service) as an example to study the power and performance characteristics of our EDR system implemented with CDPSM and LDDM.

The power profiles of 8 replicas running the distributed file service application using our EDR system are shown in Fig. 3 (CDPSM) and Fig. 4 (LDDM). In most cases, system energy is consumed by both the replica selection phase (including local solution calculation and global solution synchronization) and the file transfer phase after the selection. The "valleys" shown in these two figures represent the time when only the replica selection process is running or the system is listening for new requests. The "peaks" represent the time when replicas are accepting new clients' requests or transferring files to the previous clients. The execution time of each replica shown in the figures depends on both the assigned workload and the time spent on solution calculation and synchronization. We can observe that, when handling the same number of client requests, the EDR system implemented with LDDM runs faster than the system with CDPSM (in most of the individual cases, LDDM finishes earlier). This is consistent with LDDM having a lower communication complexity and a better convergence rate than CDPSM. Also, the average power when using LDDM is lower than when using CDPSM. The reason is that, compared to LDDM, CDPSM needs to coordinate with all other replicas and clients at every iteration in order to make the runtime scheduling decision, which results in a constantly higher workload intensity. This also indicates that CDPSM's system complexity is higher than that of LDDM. In Fig. 4, we can also observe that the power consumption of replicas 3 and 5 remains low during the entire execution. This is because neither of these two replicas was selected as a download target by EDR at runtime.

C. Energy Cost Analysis

In order to show that EDR can effectively reduce the total energy cost of the data centers, we conduct experiments with 8 replicas and real-time client requests. The pattern of data-intensive requests follows YouTube commercial workload patterns [34]. Based on the traffic data, we evaluate the total energy cost of all 8 replicas running LDDM- and CDPSM-based EDR, and then compare the results with those of the baseline algorithm, Round-Robin. For example, Fig. 6 and Fig. 7 show the energy cost of each of the 8 replicas running the video streaming and distributed file service applications under the three different algorithms. The electricity prices (¢/kWh) for replicas No. 1 to No. 8 are: 1, 8, 1, 6, 1, 5, 2, 3. These are randomly generated according to Section IV-A(2).
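For reference, the Round-Robin baseline can be sketched as a price-oblivious assignment whose cost is then evaluated with the energy model of Section III-A. The sketch assumes that each request is served entirely by one replica, chosen cyclically, and that $\gamma_n = 3$ with $\alpha_n = 1$ and $\beta_n = 0.01$ as in Section IV-A. The function names are hypothetical, and the resulting cost is in model units rather than measured energy.

```python
from typing import List


def round_robin_loads(request_sizes: List[float],
                      num_replicas: int) -> List[float]:
    """Assign request i to replica i mod |N|, ignoring price and latency."""
    loads = [0.0] * num_replicas
    for i, size in enumerate(request_sizes):
        loads[i % num_replicas] += size
    return loads


def energy_cost(loads: List[float], prices: List[float],
                alpha: float = 1.0, beta: float = 0.01,
                gamma: float = 3.0) -> float:
    """Model cost: sum of u_n * (alpha * p_n + beta * p_n**gamma)."""
    return sum(u * (alpha * p + beta * p ** gamma)
               for u, p in zip(prices, loads))


# Prices (cents/kWh) of replicas No. 1 to No. 8 quoted in the text.
PRICES = [1, 8, 1, 6, 1, 5, 2, 3]
baseline = energy_cost(round_robin_loads([10.0] * 16, 8), PRICES)
```

Because Round-Robin spreads load evenly, the expensive replicas (for example the one priced at 8 ¢/kWh) receive as much traffic as the cheap ones, which is the inefficiency that a price-aware scheduler targets.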

Fig. 6 and Fig. 7 show that the traffic loads are successfully distributed among replicas under EDR runtime scheduling. Here we take the video streaming application as an example, shown in Fig. 6. The pattern of distribution is determined by our energy cost model at runtime, considering the varying electricity prices, the bandwidth capacity of each replica, and the network latency. In Fig. 6, most of the traffic load is assigned to replicas 3, 5, and 7, primarily due to their relatively lower electricity prices, but also in relation to the bandwidth cap and network latency.

Fig. 6. Energy cost of each replica for the video streaming application under three different scheduling approaches.

Fig. 7. Energy cost of each replica for the distributed file service application under three different scheduling approaches.

Fig. 8 shows the total energy consumption and cost of all eight replicas running with LDDM- and CDPSM-based EDR and the baseline system running with the Round-Robin method for two data-intensive applications. Fig. 8(a) shows that LDDM outperforms both the CDPSM and Round-Robin approaches in terms of total energy cost. This is because LDDM generally converges faster than CDPSM, which means less communication and synchronization overhead. Fig. 8(b) shows an interesting phenomenon: for the video streaming case, CDPSM actually consumes less total energy (joules) than LDDM. This result is still consistent with our design, because our objective function minimizes the total energy cost (cents) rather than the total energy consumption. Also, even though CDPSM requires additional energy for computation and communication, it still outperforms the Round-Robin method because CDPSM does provide a globally optimized solution for workload distribution. The observations from Fig. 8 are consistent with the other 40 runs under various configurations using EDR. Across all the runs, the LDDM-based EDR saves an average of 12% in energy cost compared to the Round-Robin method, while CDPSM-based EDR saves an average of 22.64% in energy consumption.
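The distinction between energy cost (cents) and energy consumption (joules) can be illustrated with a small worked example. The sketch below uses two replicas with prices of 1 and 8 ¢/kWh, taken from the price list in this section; the per-replica energy figures for the two plans are hypothetical and chosen only to show that a plan can consume more energy overall and still cost less.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


def total_energy_kwh(energy_kwh):
    """Total energy consumed across replicas, in kWh."""
    return sum(energy_kwh)


def total_cost_cents(energy_kwh, prices_cents_per_kwh):
    """Total electricity bill: each replica's energy times its local price."""
    return sum(e * p for e, p in zip(energy_kwh, prices_cents_per_kwh))


PRICES = [1, 8]          # cents per kWh for a cheap and an expensive replica
PLAN_A = [10.0, 2.0]     # hypothetical kWh per replica: less energy overall
PLAN_B = [13.0, 0.5]     # hypothetical kWh per replica: shifted to cheap site

if __name__ == "__main__":
    for name, plan in (("A", PLAN_A), ("B", PLAN_B)):
        kwh = total_energy_kwh(plan)
        print(name, "energy (J):", kwh * JOULES_PER_KWH,
              "cost (cents):", total_cost_cents(plan, PRICES))
```

Plan B uses more energy than plan A (13.5 kWh versus 12 kWh) but costs less (17 cents versus 26 cents), so an optimizer that minimizes cost would prefer it. This is the same kind of divergence between the two metrics that the text describes for Fig. 8.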

Fig. 8. The total data center energy cost and energy consumption comparisons under three different scheduling approaches for two data-intensive applications.

D. System Performance Analysis

Because a decentralized architecture may introduce communication overhead into the replica selection system, we compare the performance of EDR against another efficient decentralized replica selection system, DONAR [3]. Unlike EDR, DONAR does not consider energy cost reduction in its runtime scheduling. In this experiment, we use three replicas in EDR and three mapping nodes in DONAR. These mapping nodes function as distributed coordinators that split the load across different replicas. The requests also follow the pattern of YouTube.com. The system performance results for EDR and DONAR are shown in Fig. 9.

Fig. 9. System performance of DONAR and EDR in terms of response time as the number of client requests scales.

From Fig. 9, we can observe that the performance of EDR is very close to that of DONAR, which has been validated to be as efficient as a centralized system [3]. The response time per request is less than 200 ms, and it increases almost linearly as the number of client requests grows. As we mentioned previously, EDR is implemented with LDDM, which has a communication complexity of O(|C| · |N|), whereas for DONAR it is O(|C| · |N| · |M|), where |M| is the number of mapping nodes. Therefore, as the system size |M| increases, we expect EDR to eventually outperform DONAR in a large-scale cloud system.
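The gap between the two bounds can be sketched numerically. The snippet below evaluates only the leading terms of the stated complexities, with all hidden constants set to 1, which is an assumption; the parameters c, n, and m stand for |C|, |N|, and |M|, and the function names and sample sizes are hypothetical.

```python
def edr_messages(c, n):
    """Leading term of the O(|C| * |N|) bound stated for LDDM-based EDR."""
    return c * n


def donar_messages(c, n, m):
    """Leading term of the O(|C| * |N| * |M|) bound stated for DONAR."""
    return c * n * m


if __name__ == "__main__":
    c, n = 1000, 8  # hypothetical sizes
    for m in (1, 3, 10, 100):
        ratio = donar_messages(c, n, m) / edr_messages(c, n)
        print(f"|M| = {m:3d}: DONAR/EDR leading-term ratio = {ratio:.0f}")
```

Under this simplification the ratio equals |M|, so the difference is negligible for the three mapping nodes used in the experiment and only becomes significant as the number of mapping nodes grows.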

V. Conclusion and future work

In this paper, we propose EDR, an energy-aware runtime scheduling system for data-intensive applications in the cloud. EDR provides a decentralized architecture to solve the replica selection problem and considers not only the bandwidth capacity and network latency but also the total energy cost of the entire cloud when forming the data center energy cost model. Our experiments show that EDR can effectively reduce the total energy cost with efficiency comparable to DONAR, the best available decentralized replica selection system. In the future, we plan to port EDR to a large-scale, real-world commercial cloud such as Amazon EC2, and to incorporate constraints beyond bandwidth capacity and latency.

VI. Acknowledgment

This material is based upon work supported by the National Science Foundation under Grant Nos. 0905187 and 0910784.

References

[1] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The cost of a cloud: research problems in data center networks,” SIGCOMM Comput. Commun. Rev., vol. 39, no. 1, pp. 68–73, Dec. 2008. [Online]. Available: http://doi.acm.org/10.1145/1496091.1496103

[2] M. Litzkow, M. Livny, and M. Mutka, “Condor-a hunter of idle workstations,” in Distributed Computing Systems, 1988., 8th International Conference on, Jun. 1988, pp. 104–111.

[3] P. Wendell, J. W. Jiang, M. J. Freedman, and J. Rexford, “DONAR: decentralized server selection for cloud services,” SIGCOMM Comput. Commun. Rev., vol. 41, pp. 231–242, August 2010. [Online]. Available: http://doi.acm.org/10.1145/2043164.1851211

[4] The U.S. EPA ENERGY STAR Program. (2007) http://www.energystar.gov. [Online]. Available: http://www.energystar.gov

[5] J. G. Koomey, C. Belady, M. Patterson, and A. Santos, “Assessing trends over time in performance, costs, and energy use for servers,” 2009.

[6] J. Koomey, “Growth in data center electricity use 2005 to 2010,” Tech. Rep., July 2011. [Online]. Available: http://fulltextreports.com/2011/08/04/growth-in-data-center-electricity-use-2005-to-2010/

[7] A. Qureshi, “Plugging Into Energy Market Diversity,” in 7th ACM Workshop on Hot Topics in Networks (HotNets), Calgary, Canada, October 2008.

[8] R. Bianchini and R. Rajamony, “Power and energy management for server systems,” Computer, vol. 37, no. 11, pp. 68–76, Nov. 2004.

[9] L. Rao, X. Liu, L. Xie, and W. Liu, “Minimizing electricity cost: Optimization of distributed internet data centers in a multi-electricity-market environment,” in INFOCOM, 2010 Proceedings IEEE, Mar. 2010, pp. 1–9.

[10] A. Berl, E. Gelenbe, M. Di Girolamo, G. Giuliani, H. De Meer, M. Q. Dang, and K. Pentikousis, “Energy-efficient cloud computing,” The Computer Journal, vol. 53, no. 7, pp. 1045–1051, 2009. [Online]. Available: http://comjnl.oxfordjournals.org/cgi/doi/10.1093/comjnl/bxp080

[11] R. Urgaonkar, U. Kozat, K. Igarashi, and M. Neely, “Dynamic resource allocation and power management in virtualized data centers,” in Network Operations and Management Symposium (NOMS), 2010 IEEE, Apr. 2010, pp. 479–486.

[12] R. Buyya, A. Beloglazov, and J. H. Abawajy, “Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges,” CoRR, vol. abs/1006.0308, 2010.

[13] A. Beloglazov and R. Buyya, “Energy efficient resource management in virtualized cloud data centers,” in Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, ser. CCGRID ’10. Washington, DC, USA: IEEE Computer Society, 2010, pp. 826–831. [Online]. Available: http://dx.doi.org/10.1109/CCGRID.2010.46

[14] Z. Zong, M. Briggs, N. O’Connor, and X. Qin, “An energy-efficient framework for large-scale parallel storage systems,” in Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International, Mar. 2007, pp. 1–7.

[15] H. S. Kim, D. I. Shin, Y. J. Yu, H. Eom, and H. Y. Yeom, “Towards energy proportional cloud for data processing frameworks,” in Proceedings of the First USENIX conference on Sustainable information technology, ser. SustainIT’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 4–4. [Online]. Available: http://dl.acm.org/citation.cfm?id=1863159.1863163

[16] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. Andrew, “Greening geographical load balancing,” in Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, ser. SIGMETRICS ’11. New York, NY, USA: ACM, 2011, pp. 233–244. [Online]. Available: http://doi.acm.org/10.1145/1993744.1993767

[17] J. W. Smith and I. Sommerville, “Workload classification & software energy measurement for efficient scheduling on private cloud platforms,” CoRR, vol. abs/1105.2584, 2011.

[18] A. Lewis, S. Ghosh, and N.-F. Tzeng, “Run-time energy consumption estimation based on workload in server systems,” in Proceedings of the 2008 conference on Power aware computing and systems, ser. HotPower’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 4–4. [Online]. Available: http://dl.acm.org/citation.cfm?id=1855610.1855614

[19] P. Mahadevan, P. Sharma, S. Banerjee, and P. Ranganathan, “Energy aware network operations,” in INFOCOM Workshops 2009, IEEE, Apr. 2009, pp. 1–6.

[20] S. Seetharaman, “Energy conservation in multi-tenant networks through power virtualization,” in Proceedings of the 2010 international conference on Power aware computing and systems, ser. HotPower’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 1–8. [Online]. Available: http://dl.acm.org/citation.cfm?id=1924920.1924924

[21] A. Ruiz-Alvarez and M. Humphrey, “An automated approach to cloud storage service selection,” in Proceedings of the 2nd international workshop on Scientific cloud computing, ser. ScienceCloud ’11. New York, NY, USA: ACM, 2011, pp. 39–48. [Online]. Available: http://doi.acm.org/10.1145/1996109.1996117

[22] K. Le, R. Bianchini, M. Martonosi, and T. Nguyen, “Cost- and energy-aware load distribution across data centers,” Proceedings of HotPower, 2009.

[23] G. Chen, W. He, J. Liu, S. Nath, L. Rigas, L. Xiao, and F. Zhao, “Energy-aware server provisioning and load dispatching for connection-intensive internet services,” in Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, ser. NSDI’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 337–350. [Online]. Available: http://dl.acm.org/citation.cfm?id=1387589.1387613

[24] K. Rajamani and C. Lefurgy, “On evaluating request-distribution schemes for saving energy in server clusters,” in Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, ser. ISPASS ’03. Washington, DC, USA: IEEE Computer Society, 2003, pp. 111–122. [Online]. Available: http://dl.acm.org/citation.cfm?id=1153924.1154555

[25] F. Dabek, R. Cox, F. Kaashoek, and R. Morris, “Vivaldi: a decentralized network coordinate system,” in Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications, ser. SIGCOMM ’04. New York, NY, USA: ACM, 2004, pp. 15–26. [Online]. Available: http://doi.acm.org/10.1145/1015467.1015471

[26] S. Song, C.-Y. Su, R. Ge, A. Vishnu, and K. W. Cameron, “Iso-energy-efficiency: An approach to power-constrained parallel computation,” in Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium, ser. IPDPS ’11. Washington, DC, USA: IEEE Computer Society, 2011, pp. 128–139. [Online]. Available: http://dx.doi.org/10.1109/IPDPS.2011.22

[27] S. Song and K. W. Cameron, “System-level power-performance efficiency modeling for emergent GPU architectures,” in PACT, 2012, pp. 473–474.

[28] S. Nedevschi, L. Popa, G. Iannaccone, S. Ratnasamy, and D. Wetherall, “Reducing network energy consumption via sleeping and rate-adaptation,” in Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, ser. NSDI’08. Berkeley, CA, USA: USENIX Association, 2008, pp. 323–336. [Online]. Available: http://dl.acm.org/citation.cfm?id=1387589.1387612

[29] J. Restrepo, C. Gruber, and C. Machuca, “Energy profile aware routing,” in Communications Workshops, 2009. ICC Workshops 2009. IEEE International Conference on, Jun. 2009, pp. 1–5.

[30] T. Ye, L. Benini, and G. De Micheli, “Analysis of power consumption on switch fabrics in network routers,” in Design Automation Conference, 2002. Proceedings. 39th, 2002, pp. 524–529.

[31] T. G. Grid, “The Green Grid Data Center Power Efficiency Metrics: PUE and DCiE,” Tech. Rep., 2007.

[32] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and distributed computation: numerical methods. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1989.

[33] A. Nedic, A. Ozdaglar, and P. Parrilo, “Constrained consensus and optimization in multi-agent networks,” Automatic Control, IEEE Transactions on, vol. 55, no. 4, pp. 922–938, Apr. 2010.

[34] P. Gill, M. Arlitt, Z. Li, and A. Mahanti, “YouTube traffic characterization: a view from the edge,” in Proceedings of the 7th ACM SIGCOMM conference on Internet measurement, ser. IMC ’07. New York, NY, USA: ACM, 2007, pp. 15–28.
