15
DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision Process Ms. A. Anitha 1 , Mrs. G. Sangeethalakshmi 2 , Ms. P.Shobana 3 1 Research Scholar, Dept of Computer Science and Applications, D.K.M College for Women (Autonomous), Vellore, Tamilnadu, India 2 Assitant Professor, Dept of Computer Science and Applications, D.K.M College for Women (Autonomous), Vellore, Tamilnadu, India 3 Research Scholar, Dept of Computer science and Applications, D.K.M College for Women (Autonomous), Vellore, Tamilnadu, India AbstractThe number and scale of data centers explode with the dramatically surging demand for cloud computing services, resulting in huge electricity consumption as well as an enormous impact on sustainability. While numerous efforts have been dedicated to decreasing the carbon footprint of data centers, there is a surprising and also embarrassing lack of attention to the enormity of data center water consumption despite its emergence as a critical concern for future sustainability. In this paper, we take the first step towards the data center water efficiency. We first identify two characteristics of data center water efficiency: water efficiency varies by location and also over time. The three factors, i.e., task assignment, data placement and data movement, deeply influence the operational expenditure of data centers. In this paper, a Markov Decision Process algorithm for Water Sustainability has been designed for a sustainable data collection centers while cost minimization in the network dynamically schedules workloads to water-efficient data centers for improving the overall water usage effectiveness (an emerging metric for quantifying data center water efficiency) while satisfying the electricity cost constraint. We also perform a trace-based simulation study to validate the analysis. The result shows that compared to the state-of-the-art cost- minimizing MDP significantly improves the water efficiency (by 60%) and reduces the water consumption (by 51%). KeywordsElectricity consumption, Water consumption, Markov Decision Process, Water Sustainability. I. INTRODUCTION Extended droughts have become a norm worldwide. For instance, California has been experiencing its fourth year of drought during a row, mandating water restrictions throughout the state. Meanwhile, drought is rising as a hidden threat to several trade sectors, as well as data centers. Data centers, housing tens of thousands of servers to satiate the ever increasing demand for internet and cloud computing services, are ill-famed for high energy consumption. It varies from day to day, however a typical data center of this size will use many thousands of gallons every day. This raises serious issues for data centers’ operational costs and property impacts because of carbon footprints embedded in electricity usage. Whereas several recent studies have centered on decreasing energy consumption furthermore as carbon footprint of data centers for property, an equal, if no more, necessary however typically neglected side of data center property is that the large water footprint. 1.1. Data Centers Not Just Power Hungry, They're Thirsty, Too Data centers, that house pc data systems, are energy hogs that still plump out due to our new love of cloud computing. Facilities often usually run at full power 24x 7 in spite of demand, gobbling up 2 percent of the nation’s electricity, and their backup generators square measure typically high-

Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

DOI: 10.23883/IJRTER.2017.3177.HFFT8 6

Water Saving In Geo-Distributed Data Centers Using Markov Decision

Process

Ms. A. Anitha1, Mrs. G. Sangeethalakshmi

2, Ms. P.Shobana

3

1Research Scholar, Dept of Computer Science and Applications, D.K.M College for Women

(Autonomous), Vellore, Tamilnadu, India 2Assitant Professor, Dept of Computer Science and Applications, D.K.M College for Women

(Autonomous), Vellore, Tamilnadu, India 3Research Scholar, Dept of Computer science and Applications, D.K.M College for Women

(Autonomous), Vellore, Tamilnadu, India

Abstract— The number and scale of data centers explode with the dramatically surging demand for

cloud computing services, resulting in huge electricity consumption as well as an enormous impact

on sustainability. While numerous efforts have been dedicated to decreasing the carbon footprint of

data centers, there is a surprising and also embarrassing lack of attention to the enormity of data

center water consumption despite its emergence as a critical concern for future sustainability. In this

paper, we take the first step towards the data center water efficiency. We first identify two

characteristics of data center water efficiency: water efficiency varies by location and also over time.

The three factors, i.e., task assignment, data placement and data movement, deeply influence the

operational expenditure of data centers. In this paper, a Markov Decision Process algorithm for

Water Sustainability has been designed for a sustainable data collection centers while cost

minimization in the network dynamically schedules workloads to water-efficient data centers for

improving the overall water usage effectiveness (an emerging metric for quantifying data center

water efficiency) while satisfying the electricity cost constraint. We also perform a trace-based

simulation study to validate the analysis. The result shows that compared to the state-of-the-art cost-

minimizing MDP significantly improves the water efficiency (by 60%) and reduces the water

consumption (by 51%).

Keywords—Electricity consumption, Water consumption, Markov Decision Process, Water

Sustainability.

I. INTRODUCTION

Extended droughts have become a norm worldwide. For instance, California has been experiencing

its fourth year of drought during a row, mandating water restrictions throughout the state.

Meanwhile, drought is rising as a hidden threat to several trade sectors, as well as data centers. Data

centers, housing tens of thousands of servers to satiate the ever increasing demand for internet and

cloud computing services, are ill-famed for high energy consumption. It varies from day to day,

however a typical data center of this size will use many thousands of gallons every day. This raises

serious issues for data centers’ operational costs and property impacts because of carbon footprints

embedded in electricity usage. Whereas several recent studies have centered on decreasing energy

consumption furthermore as carbon footprint of data centers for property, an equal, if no more,

necessary however typically neglected side of data center property is that the large water footprint.

1.1. Data Centers Not Just Power Hungry, They're Thirsty, Too Data centers, that house pc data systems, are energy hogs that still plump out due to our new love of

cloud computing. Facilities often usually run at full power 24x 7 in spite of demand, gobbling up 2

percent of the nation’s electricity, and their backup generators square measure typically high-

Page 2: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 7

powered by diesel – not the greenest of fuels. But how data centers use water, and how much they

use, is less well known?

There are nearly three million information centers within the us, ranging in size from many servers in

a workplace closet to a 1.4 million square foot in-construction Facebook data center in Iowa. All of

them use water, some directly, some indirectly. That is, several giant data centers withdraw water on

to cool their servers, whereas nearly all use electricity generated at power plants – that rely on water

for his or her own cooling wants – to power their servers and cooling systems.

For example, Amazon estimated back in 2009 that a 15 megawatt data center can need up to 360,000

gallons of water daily, leading one among the company’s data center designers to admit that "water

consumption (in data centers) is super embarrassing. It simply doesn’t feel accountable."

The water use at data centers area unit calculated by metric known as Water Use potency, or WUE,

has been embraced by data center managers.

WUE = Annual Water Usage (L)/IT Equipment Energy (kWh)

The ensuing variety can tell you how several liters of water per kilowatt hour are used onsite to

control the data center. If you wish to urge a more correct WUE, and you add the amount of water

needed to come up with the electricity used at the data center in question (The green Grid provides

all the figures and formulas needed).

The ideal WUE is in fact zero, meaning that no water is employed to control the data center.

However considering our current water-reliant energy system and the tendency of data centers to

suppose water for direct cooling.

1.2. Data Center Water Usage: An Emerging Challenge According to the New York Times, a minimum of 302 firms will be asked to report their water use to

the Carbon revelation Project. CDP Water revelation can evoke info related to overall water use,

risks and opportunities in operations and provide chains, as well as water management and

improvement plans.

While PUE is presently making headlines as data center efficiency metric, water use within the data

center is rising because the latest speech act challenge, reports data Center. Data centers use

enormous amounts of water to cool server farms and water management can become a more pressing

issue as water usage is being pushed to air par with carbon emissions.

Cloud computing has additionally magnified the number of water required to stay data centers cool.

Google and Microsoft have already been noted for his or her steps taken to cut back water usage in a

number of their facilities. As water usage is followed a lot of closely, firms are getting to ought to

follow in their footsteps.

Cooling system whereas advanced cooling systems (e.g., air economizer]) area unit smart at water

saving, most data centers are situated in places wherever installation of such cooling systems isn't

possible or economical. Thus, it's common that data centers, adore AT&T and eBay admit water-

intensive strategies for cooling (e.g., water-side saver and cool chillers). it had been calculable that a

15MW data center may consume 360,000 gallons of cooling water every day, whereas another report

aforesaid that the U.S. National Security data center in American state would need up to 1.7 million

gallons of water for cooling every day (enough to satiate 10,000 households’ daily needs).

1.3. Limitations of the existing research.

• First, they need applicable climate conditions and/or fascinating locations that aren't

applicable for all knowledge centers (e.g., “free air cooling” is ideally appropriate in cold

areas such as Dublin wherever Google has one data center).

Page 3: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 8

• Second, they are doing not address, and will even increase, indirect off-site water

consumption (e.g., on-site facilities for treating business water or seawater save fresh

however might consume additional electricity.

• Third, building water treatment facilities, usually need substantial capital investments

which will not be reasonable for all data center operators.

1.4. Proposed Technique.

Our approach relies on the following two techniques.

• Geographic load balancing (GLB): dynamically dispatching workloads to geo-distributed

data centers.

• Power proportionality: dynamically turning on/off servers in accordance with workloads.

By exploiting temporal and abstraction diversities of water potency, we have a tendency to propose a

unique geographical load balancing (GLB) approach, known as GLB for Water property (GLB-WS),

that dynamically schedules workloads to water economical data centers for rising the general water

potency whereas satisfying the electricity price constraint. Our approach is software-based and

fundamentally differs from the prevailing engineering-based techniques that specialize in facility

innovation. we have a tendency to additionally perform a trace primarily based simulation study to

enhance the analysis. The result's consistent with our analysis: it shows that GLB-WS will

effectively direct workloads from water-consuming data centers to water economical ones which,

compared to the progressive cost-minimizing GLB approach, GLB- WS considerably improves the

water potency (by roughly 60%) and reduces the water consumption (by roughly 51%).

II. WATER CONSUMPTION IN DATA CENTERS In this section, we offer a short background for data center water consumption, introduce associate

degree rising metric for quantifying water potency, and additionally establish two key characteristics

of water potency in data centers.

2.1. Water consumption We would wish to first draw the readers’ attention to the refined distinction between water

withdrawal and water consumption. the previous refers to getting water from somewhere (e.g., public

water facilities), whereas the latter refers to “losing” water (e.g., into the setting via evaporation) and

manufacturing waste water (e.g., into waste systems1) during this paper, we have a tendency to

specialize in water consumption (also interchangeably named as water usage where applicable) that

bears a direct impact on the recent clean water availableness. In general, knowledge centers consume

water each directly and indirectly.

2.1.1. Direct water consumption

Cooling systems (especially cool excitation systems that use evaporation as the heat rejection

mechanism) in knowledge centers use water directly for example, a pair of even with outside air

cooling, Facebook’s water potency remains 0.22L/KWh (for cooling systems) in its latest knowledge

center in Prineville, OR [6], whereas eBay uses 1.62L/kWh (as of may 29, 2016).

2.1.2. Indirect water consumption Indirect water usage stems from the method of thermoelectricity generation that employs evaporation

for cooling and hence consumes an astonishing quantity of water. whereas sure varieties of

electricity energy (e.g., by star photograph voltaic and wind) consume nearly zero water, “water-

free” electricality solely takes up a very tiny portion within the total electric generation capability

(e.g., under ten within the U.S.). Moreover, though a lot of of the water withdrawn by power plants

for steam condensation eventually returns to the system (hence, not thought-about as “consumed”), a

non-negligible fraction of the withdrawn water is “lost/consumed” by evaporation: e.g.,, the U.S., the

national average water consumption is one.8L/kWh (which is additionally spoken as Energy Water

Intensity issue, or EWIF).

Page 4: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 9

Adding up direct and indirect water consumption, data centers are more and more “thirsty” for water

and the web water consumption needs immediate attention.

2.2. Characteristics of data center water efficiency Data center water potency exhibits each spatial and temporal diversities, as such that below.

2.2.1. Spatial diversity Both direct and indirect WUEs demonstrate a big variation across completely different geographic

locations. as an instance, Fig. 1(b) shows the spatial diversity of the common EWIF in state level

(because some states use a lot of water-efficient technologies or turn out a lot of solar/wind

electricity, whereas alternative states turn out a lot of water intense thermal and nuclear electricity).

(a) California Electricity fuel mix on June 2016, (b) State Level EWIF versue CO2 emission in US

Figure 1. California electricity fuel mix and EWIF versus carbon emissions.

Both direct and indirect WUEs demonstrate a big variation across completely different geographic

locations. as an instance, Fig. 1(b) shows the spatial diversity of the common EWIF in state level

(because some states use a lot of water-efficient technologies or turn out a lot of solar/wind

electricity, whereas alternative states turn out a lot of water intense thermal and nuclear electricity).

Figure 2. Direct WUE of Facebook’s data center in Prineville.

2.2.2. Temporal diversity Fig. 1 shows a 90-day history of average daily (direct) WUE, and that we will see that the WUE

changes drastically over the time. whereas there's no public data for period of time EWIF in

numerous cities/states, it's different time-varying furthermore as a result of, as in Fig. 1(a), electricity

fuel combine is time-varying and completely different electricity generation ways consume different

amounts of water (excluding electricity that water consumption is troublesome to judge, nuclear

electricity consumes the foremost water, followed by coal-based thermal electricity and so star

PV/wind electricity).

The existing GLB techniques that concentrate on either carbon potency or electricity value

minimization don't essentially optimize the water potency, as a result of carbon/electricity-efficient

data centers might not be water-efficient. This will be seen from Fig. 1(b), within which the

(indirect) water efficiency isn't in proportion to carbon potency. The relation between electricity cost

and water potency is analogous, however not shown here for brevity. Therefore, a brand new GLB is

required for optimizing data center water potency that we'll address within the following sections.

Page 5: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 10

III. MODEL

We take into account a discrete-time model with equal-length time slots indexed by t =1, 2, ⋅⋅⋅ , every

of that includes a period that matches the timescale that data center operator will accurately predict

the longer term information (including the work arrival rate, on-the-scene renewable energy provide

and/or electricity price) and update its resource management call. Within the following analysis, we

tend to principally target hour-ahead prediction and hourly choices for the convenience of

presentation. The time index t is omitted where applicable while not inflicting ambiguity.

Next, we present the modeling details for the data centers and workloads. Key notations are

summarized in Table 1.

Table 1. List of Notations

3.1. Server

We consider N geo-distributed data centers, indexed by � =1, 2, ⋅⋅⋅ N,. Each data center � is part

steam-powered by on-site renewable energy plants (e.g., solar panel and/or wind turbines) and

contains �� servers that area unit homogeneous at intervals the data center, whereas heterogeneous

servers is simply captured by extending our model. Processing speed of a server is quantified in step

with the service rate (rather than the particular clock rates), i.e., the typical range of jobs processed

during a unit time. Specifically, the service rate of a server in data center � is �i.

We denote by ���� ∈ [0, ��] the number of servers turned on in data center � at time t. While in

theory ���� should be integers, approximating it as continuous values does not affect the

optimization result significantly because there are usually tens of thousands of servers in a data

center. In our study, we have a tendency to concentrate on server power consumption, whereas the

power consumption of other parts are captured by the ( possibly time-varying) power usage

effectiveness (PUE) issue that, multiplied by the server power consumption, gives the total server

power consumption of data center � during time t by �������, �����, which can be expressed as

�������, ����� = ����. ���,� + ��,�����

������ , (1)

Where ���� = ∑ "� ,# ��$#%& is the total amount of workloads dispatched to data center � (with

"�,# �� being the amount of workloads originating from the j-th gateway, as will be specified in the

next sub section), ��,� is the static server power regardless of the workloads ( as long as a server is

turned on) and ��,� is the computing power incurred only when a server is processing workloads in

data center �.

3.2. Electricity cost

We denote the electricity price in data center � at time t by '���, which is known to the data

center operator no later than the beginning of time t and may change over time if the data centres

participate in real-time electricity markets (e.g., hourly market). Given (��� ∈ )0, (�,��*+ amount of

obtainable on-site renewable energy in knowledge center i, we are able to categorical the incurred

electricity value of data center � as

�������, ����� = ',�t)γ,�t ∙ p,�a,�t, m,�t� − r,�4+5 (2)

Where γ,�t is the PUE of data center �, and [∙]5 = max {∙ ,0} indicates that no electricity are going

to be drawn from the ability grid if on-the-scene renewable energy is already sufficient . While we

tend to use Eqn. (2) to represent the electricity value of data center i at time t, our analysis isn't

restricted to a linear electricity value operate and might also model different electricity value

Page 6: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 11

functions appreciate nonlinear convex functions (e.g., data centers are charged at a higher value if it

consumes a lot of power).

3.3. Water consumption. Water is consumed each directly (i.e., by cooling system) and indirectly (i.e., by electricity

generation) in data centers. The direct water consumption will be simply obtained by multiplying the

server power consumption with direct WUE, whereas the indirect water consumption depends on the

electricity usage and also the native EWIF. Specifically, we will specific the water consumption of

data center � at time t as

9��� = :�,;�� ∙ �������, ����� + :�,�;�� ∙ )<��� ∙ �������, ����� − (���+5 , (3)

Where :�,;�t is the direct WUE at time t and :�,�;�t is the EWIF of the electricity powering data

center�. Note that, while the value of :�,�;�t is determined supported the energy fuel combine and

might be obtained by inquiring the native utility company, exploit the worth of :�,�;�t is not simple

as a result of it appears to depend upon varied factors comparable to cooling technology, humidness

and temperature and there's no publically obtainable study on this (to our greatest data and

additionally as supported by green Grid’s claim that the study of WUE continues to be at the terribly

starting ). During this paper, we tend to assume that :�,;�t is known at the beginning of time slot t,

while noting that it could be a potential separate research topic to find the key factors

affecting :�,;�t.

3.4. Workload We take into account a state of affairs during which there are J gateways, each of that represents a

geographically targeted supply of workloads (e.g., a state or province so forwards the incoming

workloads to the N geo-distributed data centers. The term “workload” is generic, representing a

synthesis of computing tasks/jobs (e.g., search requests, video streaming). We denote the work-load

arrival rate at the j-th gateway by"#�� = [0, "#,��*], and the workload is dispatched to data center � at a rate of "�,#�� that we shall optimize. We assume that "#� � available (e.g., by using regression

based prediction) at the beginning of each time slot t, as widely consider in prior work.

We quantify the overall end-to-end delay performance for processing workloads from gateway j in

data center j using the average delay=�,#�����, ���� , which is intuitively increasing in ���� =∑ "�,#��>

#%& and decreasing in ���� where ���� is the number of (homogenous) servers turned on

in data center � . As a concrete example we are able to model the service method at every server as an

M/G/1/PS queue. Then, by incorporating the network transmission delay, the end-to-end average

delay of workloads regular from gateway j to data center � is =�,#�����, ����� = &

��?����/���� + A�,# ��, (4)

Where ���� = ∑ "�,#��>#%& represents the total workloads processed in data center i , and A�,#�� is

average network delay approximated in proportion to the distance between data center i and the j-th

gateway, which might be calculable by varied approaches appreciate mapping and artificial

coordinate approaches. It ought to be more created clear that our delay model is principally supposed

to characterize the overall trend of the general end-to-end performance and to facilitate the server

provisioning call, whereas the delay performance for specific tenants/applications are handled

victimization separate techniques on the far side the scope of our study.

IV. GEOGRAPHIC LOAD BALANCING In this section, we have a tendency to initial gift optimization objective, constraints, yet because the

drawback formulation for minimizing WUE via geographic load leveling. Then, we develop

associate degree economical rule to resolve the matter.

Page 7: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 12

4.1. Problem Formulation In this section, we have a tendency to initial specify the improvement objective as well as

constraints, and so present the problem formulation.

Objective. supported the metric recently developed for measuring data center water potency, we

specialize in maximizing the general (hourly) WUE of all the data centers, specific as follows.

B�"��, ���� = ∑ C���D�EF

∑ G������,����D�EF

, (5)

Where 9��� is the water usage (both directly and indirectly) and ���"���, ���� is the server power

consumption in data center � given by 4 and 2, respectively. As our study makes the primary try to

rigorously optimize information center water efficiency from the resource management perspective

(fundamentally differing from the pricey engineering approach of renovating the cooling system, we

build the subsequent three remarks to clarify our objective. First, whereas WUE was originally

projected as a metric for or quantifying data center water potency over a year, we take the position

that optimizing hourly WUE provides additional timely info and should facilitate data center

managers to enhance their operations additional promptly. This position has additionally been strong

by Facebook’s recent move to in public report their hourly (direct) WUE. Second, though we choose

to optimize the combined WUE of all the data centers (because all the water usage are attributed to

the common data center owner, love Google and Microsoft), various objectives such as (weighted)

add WUE of individual data centers and total water consumption may be optimized as variants of our

study. Third, whereas our current study focuses on optimizing water potency (which is without doubt

an rising critical issue for property computing) while not expressly taking into account electricity

energy minimization, we don't shall downplay the seriousness of the soaring electricity consumption

in data centers. In fact, one among our constraints is obligatory on the whole electricity cost, that

implicitly bounds the most electricity consumption. we will collectively optimize the energy and

water potency in our future work..

4.2. Markov Decision Process (MDP) For Load Balancing MDP consists of States (s), actions (a), transition probabilities (P) and rewards (R).Load States.PM-

State is the load state of a PM based on totally different resources. VM-state is the resource

utilization level of a VM. Three levels for each resource – high, medium and low. Total number of

states = L^R. Objective of the algorithm rule – make sure that utilization of each resource of the PM

is below a certain threshold. Action - migration of a VM in a particular state, or no migration at all.

Transition Probability - probability that an action a will lead to state s’. Reward – given after

transition to state s’ from state s by taking action a.

Figure 3. Example of simple MDP

(a) States and Actions The state and action set remain constants and PM first determines its own state. It determines the

state of all its VMs. MDP finds an optimal action and is ready to sustain this state. For T1 = 0.3, T2

= 0.8, State changes from PM high->medium.

Page 8: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 13

(b) Transition Probabilities

To Determine the probability of transitioning to each state after action a Need to be stable and

calculated by a central server using a trace of the states of the VMs being migrated and Changes in

PM state after migration

(c) Rewards Encourages PMs to maximize rewards. Rewards are two types, In Positive rewards Transition from a

high state to low or medium state and No action in medium or low state. In Negative rewards

Transition to high state and No action in high state.

Main Challenge. Addressing data centers’ water footprint within the face of extended droughts

needs long-term efforts, however the long-term nature additionally creates challenges because the

desired water capping constraint in (9) couples the work management selections across completely

different time slots: while GLB and power quotient decisions have to be compelled to be created

while not foreseeing the way future, this decisions can implicitly have an effect on the longer term

selections (e.g., mistreatment a lot of water at this point slot can lead to less water budget on the

market for future uses). Accurate prediction of such offline info (e.g., volatile outside wet bulb

temperature that affects WUE, non-stationary work arrival, etc.) is sort of difficult, if not possible

[19], necessitating an online approach.

Solution. To address the shortage of long-term future data, we tend to leverage the sample-path

Lyapunov technique to develop a web algorithm that produces GLB and power proportionality

choices solely supported presently accessible information. Originally planned for establishing system

stability, Lyapunov technique was later extended to attain long-term queuing stability in networks ,

with a salient feature that it doesn't need future data once creating control choices. Here, we are able

to treat the data center water footprint in each time slot as “job arrivals” to a virtual water queue, and

consider the desired water usage as “job departures”. Thus, if the virtual queue lengths are often

pushed to zero at the tip, then the specified long-term water footprint capping is achieved in a web

manner.

V. PERFORMANCE EVALUATION

This section presents a trace-based simulation study to validate our analysis. We first present our

data set and then compare GLB-WS with previous analysis

5.1. Data set We contemplate four geographically distributed data centers situated in: (#1) Prineville, OR, (#2)

Northlake, IL, (#3) Forest city, NC, and (#4) county, NJ. The four data centers have peak powers of

25MW, 18MW, 30MW and 20MW, respectively. For the convenience of illustration, the PUEs are

chosen as 1.30 for all the data centers. In these four data centers, every server has a most power of

200W, 160W, 220W, and 250W, respectively, and static/idle server power takes up hour of the

maximum power. The normalized service rates of every server within the four data centers are

chosen to be 1.00, 0.90, 1.25 and 1.10, severally. All the workloads are distributed to the four

information centers by the front-end entryway in St. Luis, MO, which has comparable distances to

any or all the info centers. By default, the common reaction time constraint is 500ms (as within the

case of net service). The data center capability provisioning and load distribution choices are updated

hourly.

5.1.1. Workloads We acquire a 24-hour workload trace by identification the server usage log of Florida International

University (FIU, an outsized public university in the U.S. with over 50,000 students) on day, 2016,

and scale the FIU workload proportionately. Figure. 4(a) shows the trace normalized with reference

to the overall computing capability of the four data centers. Other synthetic workloads also are tested

and also the results are similar.

5.1.2. On-site renewable energy

The four sets of the hourly renewable energies (generated through star panels and wind turbines) on

may 1 of 2016 from four locations that are closest to the data centers, and scale them proportionally

Page 9: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 14

such on-site renewable energy provide takes up or so the maximum of the most energy demand in

every data center on the average.

(a) FIU workload trace on May 1, 2016 (b) WUE comparison Between GLB-WS

(c) Water consumption Comparison (d) Electricity Comparison between GLB-WS

and GLB-Cost GLB-WS and GLB-Cost

Figure 4. FIU workload trace and performance comparison between GLB-WS and GLB-Cost.

5.1.3. Electricity price We obtain from hourly electricity prices from four trading nodes closest to our considered data

centers on May 1, 2016.

5.1.4. Indirect EWIF and direct WUE for cooling system Due to the shortage of access to actual EWIF data within the four data center locations, we have a

tendency to use the (time-varying) state-level EWIF values. Since solely Face book has simply

begun to report (hourly) direct WUE for its data center in Prineville, OR, we have a tendency to use

the data bestowed for shrewd the onsite water usage in data center #1. For the other 3 data centers,

we use 1.62L/kWh (the same as eBay’s direct WUE in could twenty nine, 2016), 0.30L/kWh, and

1.00L/kWh, respectively, whereas randomly adding up to 20 noises to incorporate the temporal

diversity.

5.2. Results

In this subsection, we have a tendency to compare GLB-WS with the progressive research

exploitation the on top of trace data. GLB has been extensively studied for data center optimization

from varied perspectives: e.g., minimizing electricity value, maximizing renewable energy utilization

minimizing each electricity and delay value, and capping long-run energy consumption. While it's

inconceivable to check GLB-WS against all the existing GLB techniques, we choose GLB for

electricity value minimization, called GLB-Cost, as our benchmark, because it is one amongst the

most widely-considered GLB techniques. Comparison against different GLB techniques is similar

and hence omitted for brevity.

5.2.1. Improved water efficiency

We first see from Fig. 3(b) that GLB-WS significantly improves the data center water potency

compared to GLB-Cost: GLB-WS reduces the average WUE by nearly hour than GLB-Cost. will be

as a result of GLB-WS can dynamically schedule workloads to water economical data centers, while

Page 10: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 15

GLB-Cost is water-oblivious and should “inappropriately” process additional workloads in water

consuming (but cost-effective) data centers.

5.2.2. Reduced water consumption While the express objective of GLB-WS is minimizing the WUE by scheduling workloads to water

efficient data centers, an on the spot byproduct of GLB-WS is the reduced water consumption. We

have a tendency to see from Fig. 4(c) that, compared to GLB-Cost, GLB-WS reduces the water

consumption remarkably (by over 51 on averages), making data centers far more water efficient.

(a) Normalized WUE in each data center (b) Workload distribution under GLB-WS and GLB-Cost

Figure 5. Water efficiency and distributed workloads in each data center.

5.2.3. Comparable electricity Cost and energy consumption. The focus of GLB-WS is not minimizing the electricity cost, but

rather satisfying the given budget constraint. Thus, as can be seen from Fig. 4(d), GLB-WS incurs a

higher electricity bill (by approximately 5 − 20%) compared to GLB-Cost that explicitly minimizes

the electricity cost. While electricity cost is absolutely important for data centers, we argue that

sustainability, in particular water sustainability, is also critical and needs to be taken into

consideration in the future design of data centers. Nonetheless, as water-efficient data centers is

typically different from cost-effective ones (as can be seen from Fig. 5(b)), it may not be possible to

optimize both metrics simultaneously, and an inherent tradeoff exists between (water) sustainability

and data center operational cost, pointing to a potential research direction as our future work.

Although not apparent in the paper for brevity, we note that the average electricity energy

consumptions by GLB-WS and GLB-Cost are almost the same (i.e., within 1% in our case study).

5.2.4. Water-driven scheduling Just as GLB-Cost is cost-driven and schedules workloads to efficient data centers, GLB-WS is

water-driven and may effectively schedule workloads to water-efficient data centers. we have a

tendency to show in Fig. 5(a) the average normalized WUE (i.e., water usage per unit computing

capacity): information centers #3 and #4 are water economical (but price inefficient), whereas data

centers #1 and #2 area unit on the opposite facet. Thus, it will be seen from Fig. 5(b) that GLB-WS

will schedule additional workloads to data centers #3 and #4, whereas GLB-Cost favors data centers

#1 and #2 for process workloads.

This intuition is further highlighted when we remove the electricity budget constraint for GLB-WS:

as shown in Fig. 4(b), without considering budget constraint, GLB-WS will schedule almost all

workloads to water-efficient data centers (i.e., #3 and #4).

To sum up, GLB-WS focuses on a novel side of data center operation, i.e., water potency. It will

effectively schedule workloads to water-efficient data centers to reduce the full water consumption,

thereby up the water potency. Even so, aside from water potency, we don't imply that GLB-WS

outperforms all the present GLB techniques in each other side (e.g., GLB-Cost minimizes the

electricity value, whereas the GLB focuses on carbon footprint). Instead, we emphasize that GLB-

WS is complementary to the present research which water sustainability deserves more attention

from the research community.

Page 11: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 16

VI. RELATED WORK We provide a snapshot of the related work from the following aspects. Data center optimization.

There has been a growing interest in optimizing data center operation from numerous perspectives

adore cutting electricity bills minimizing brown energy consumption, and minimizing response

times. maybe, “power proportionality” via dynamically turning on/off servers based on the

workloads has been extensively studied and advocated as a promising approach to reducing the

energy value of data centers. By exploring spatial diversities of electricity costs and/or energy

“greenness”, study geographical load leveling among multiple data centers to reduce energy value,

caps the long-term energy consumption supported foretold workloads, and propose to cut back

brown energy usage by scheduling workloads to data centers with green energies.

Water efficiency in data centers. To our best data, there are no analysis activities that explicitly

optimize water potency in data centers. the sole analysis works that area unit loosely relevant to data

center water consumption are either point out the criticality of conservation or develop a dashboard

for visualizing the water potency however no effective solutions are proposed towards water

property in data centers. in public familiar efforts for water data in knowledge centers area unit

principally restricted to engineering approaches and include putting in innovative cooling systems

(e.g., outside air economizer), using recycled water, and powering data centers with on-site

renewable energies to cut back the consumption of electricity. These engineering ways are expensive

and orthogonal to our proposal: we are employing a “software” approach that improves water

potency from the perspective of employment management without substantial capital investments.

To sum up, our work takes the first step to carefully address water property in data centers. whereas

our projected GLB-WS might not possibly outmatch all the present GLB techniques in each side

(e.g., GLB-WS incurs a better electricity price than GLB-Cost that notably minimizes the cost),

GLB-WS explicitly focuses on water potency that's changing into a important concern in future data

centers in light-weight of the worldwide water shortage trend. Moreover, our analysis on data center

water potency provides a crucial, unique and complementary perspective to the present data center

analysis, and that we take the freedom of envisioning that incorporating water potency is

increasingly essential in future analysis efforts.

5.1DFT

Figure 6. DFTOF MDP

Task Scheduler

View Transaction

Schedule File

Load Prediction

Skewness Calculation

Green Computing

Owner

Choose plan

Select SLA and Plan

File upload

File Update

File Delete

View File

Login

File Search

Download Request

Download File

View Downloaded file

User Cloud

Approve Memory

and SLA to Owner

View Translation

View all file

Cloud Request

Page 12: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 17

5.2 SCREENSHOT

HOMEPAGE

PURCHASECLOUD

MEMORYSLAAGREEMENT

OWNERLOGIN

Page 13: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 18

OWNERCLOUDLOGIN

UPLOADFILE

UPLOADVIEW

SCHEDULERLOGIN

TASKSCHEDULER

Page 14: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 19

SCHEDULEVIEWFILE

SCHEDULEFILE

TRANSCTION

VII. CONCLUSION In this paper, we tend to created the primary step towards water property in data centers. we tend to

showed that data center WUE conjointly exhibits spatial and temporal diversities. investment these

characteristics, we tend to proposed GLB-WS, a GLB technique that may dynamically dispatch

Page 15: Water Saving In Geo-Distributed Data Centers Using Markov … · 2017-07-12 · DOI: 10.23883/IJRTER.2017.3177.HFFT8 6 Water Saving In Geo-Distributed Data Centers Using Markov Decision

International Journal of Recent Trends in Engineering & Research (IJRTER)

Volume 03, Issue 05; May - 2017 [ISSN: 2455-1457]

@IJRTER-2016, All Rights Reserved 20

workloads to water economical data centers whereas satisfying the electricity cost and delay

constraint. we tend to conjointly performed a trace-based simulation study to enrich the analysis. The

result was per our analysis: compared to the progressive cost- minimizing GLB approach, GLB-WS

considerably improves water potency and reduces water consumption. A natural future analysis

direction is combining water potency with alternative metrics (e.g., carbon potency, electricity cost)

in data center optimization.

REFERENCES [1] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. Andrew,“Greening geographical load balancing,” in

SIGMETRICS, 2011.

[2] N. Buchbinder, N. Jain, and I. Menache, “Online job migration for reducing the electricity bill in the cloud,” in IFIP

Networking, 2011.

[3] C. Ren, D. Wang, B. Urgaonkar, and A. Sivasubramaniam,“Carbon-aware energy capacity planning for datacenters,”

in MASCOTS, 2012.

[4] P. X. Gao, A. R. Curtis, B. Wong, and S. Keshav, “It’s not easy being green,” SIGCOMM Comput. Commun. Rev.,

vol. 42, no. 4, pp. 211–222, Aug. 2012. [Online]. Available: http://doi.acm.org/10.1145/2377677.2377719

[5] T. Evans, “The different technologies for cooling data centers,” http:/ /www. apcmedia. com/salestools/ VAVR-

5UDTU5/VAVR-5UDTU5 R2 EN.pdf.

[6] “AT&T Sustainability,” http://www.att.com/genlanding-pages?pid=24188.

[7] “eBay,” http://dse.ebay.com.

[8] “Facebook data center dashboard,” http://www.fbpuewue.com.

[9] U.S. Dept. of Energy, “Energy demands on water resources,” Dec.2006.

[10] The Green Grid, “Water usage effectiveness (WUE): A green grid data center sustainability metric,” Whitepaper,

2011.

[11] Water Facts: Water - Water.org. http://water.org/water-crisis/water-facts/water/.

[12] McKinsey Report, “Charting our water future: Economic frameworks to inform decision-making,” 2009.

[13] P. Folger, B. A. Cody, and N. T. Carter, “Drought in the United States: Causes and issues for congress,” Apr. 2013,

http://www.fas.org/sgp/crs/misc/RL34580.pdf.