RECON: A TOOL TO RECOMMEND DYNAMIC SERVER CONSOLIDATION IN MULTI-CLUSTER DATACENTERS
Sameep Mehta, Anindya Neogi
IBM India Research Lab
IEEE Network Operations and Management Symposium, 2008
Presented by: Yun Liaw



Outline

  Introduction
  ReCon Overview
  Mapping Logic
  Experimental Validation
  Related Works
  Discussions
  Conclusion and Comments

112/04/19


Introduction

Server virtualization has regained popularity for several reasons
  Virtual machines (VMs) support more flexible and finer-grained resource allocation
  The management cost and total cost of ownership (TCO) of physical servers have gone up drastically
  Virtualization enables consolidation of a number of smaller machines as VMs on one large server
    Leads to more efficient utilization of hardware resources
    Saves floor space and management cost


Introduction (cont’d)

ReCon: a tool that uses historical resource usage monitoring data to recommend a dynamic or static consolidation plan on servers


[Figure legend] White: high utilization; Black: low utilization


ReCon Overview

Trace data: a set of measurements taken from the system, typically in a time-series format
  E.g., CPU, memory, etc.

Cost
  Static cost: the base cost of running a physical server with its associated workload
  Dynamic cost: the cost that varies with the utilization
  VM migration cost


ReCon Overview (cont’d)

Constraints: restrict the space of possible mappings between VMs and physical servers
  System constraints
  Application-level constraints
  Legal constraints

"What-if" input configuration: lets users tweak the input parameters and review the impact on the consolidation
  Time window size of the dynamic consolidation
  The period for which a server must have no workload before it is considered for turning off


ReCon Overview (cont’d)

Optimal Mapping Algorithm: takes all parameters, costs, constraints, and configurations, and processes the trace data to generate a static or dynamic server consolidation plan
  Consolidation window: the non-overlapping time windows into which the historical data is divided for dynamic consolidation
    For each time window, an optimal mapping from VMs to physical servers is created
  In static consolidation, the time window is assumed to be the entire trace
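
The windowing described on this slide can be illustrated with a small sketch. It assumes the trace is a plain list of samples and that a trailing partial window is kept; the function name `split_into_windows` and the sample values are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def split_into_windows(trace: Sequence[float], window: int) -> List[List[float]]:
    """Split a time series into consecutive, non-overlapping windows.

    The last window may be shorter when len(trace) is not a multiple of window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [list(trace[s:s + window]) for s in range(0, len(trace), window)]


# Hypothetical CPU utilization trace (percent), one sample per time step.
cpu_trace = [10, 20, 30, 40, 50, 60, 70]

# Dynamic consolidation: one mapping would be computed per window.
dynamic_windows = split_into_windows(cpu_trace, 3)

# Static consolidation: a single window covering the entire trace.
static_windows = split_into_windows(cpu_trace, len(cpu_trace))
```

Treating static consolidation as the special case of one window keeps the rest of a planning pipeline identical for both modes.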


Mapping Logic – Basic Notations

Let VM = {VM1, VM2,…, VMN}
  Each VMi observes and stores K variables O = {O(1,i), O(2,i),…, O(K,i)}
  Each VMi is monitored for T time steps, so the jth sensor of VMi generates a time series of T samples of O(j,i)


Informal Problem Statement
  Given N application VMs, find n physical machines, where n < N, such that each VM is assigned to one physical machine while satisfying domain-specified constraints


Mapping Logic - Constraints

Virtual machine constraints: each VMi is associated with a list of Mi constraints
  $VC_i = \{VC_{(1,i)}, VC_{(2,i)}, \dots, VC_{(M_i,i)}\}$

Physical server constraints: each physical server Pi is associated with a list of Li constraints
  $PC_i = \{PC_{(1,i)}, PC_{(2,i)}, \dots, PC_{(L_i,i)}\}$

Let $C_{(j,i)}^{[t_1,t_2]}$ denote the jth constraint of VMi, which should hold in the interval [t1, t2]

The constraint is said to be satisfied if
  $eval(C_{(j,i)}^{[t_1,t_2]}, P) = 1$

where P is the set of properties of the environment/architecture in the interval [t1, t2]
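
As an illustration of the eval notation, the sketch below models a constraint as a predicate over the environment properties P and returns 1 when it holds. The `Constraint` class, the dictionary representation of P, and the two example constraints are assumptions made for this example; the slides do not say how ReCon encodes constraints.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# P: properties of the environment in [t1, t2]; a plain dict is an assumption.
Properties = Dict[str, float]


@dataclass
class Constraint:
    name: str
    t1: int  # start of the interval in which the constraint must hold
    t2: int  # end of the interval
    predicate: Callable[[Properties], bool]


def eval_constraint(c: Constraint, props: Properties) -> int:
    """Return 1 if the constraint holds for the given properties, else 0."""
    return 1 if c.predicate(props) else 0


def all_satisfied(constraints: List[Constraint], props: Properties) -> bool:
    """A mapping is admissible only if every constraint evaluates to 1."""
    return all(eval_constraint(c, props) == 1 for c in constraints)


# Hypothetical constraint list for one VM over the interval [1, 12].
vc = [
    Constraint("peak CPU fits host", 1, 12,
               lambda p: p["peak_cpu"] <= p["host_cpu_capacity"]),
    Constraint("memory fits host", 1, 12,
               lambda p: p["mem_gb"] <= p["host_mem_gb"]),
]
props = {"peak_cpu": 65.0, "host_cpu_capacity": 80.0,
         "mem_gb": 8.0, "host_mem_gb": 4.0}

results = [eval_constraint(c, props) for c in vc]
admissible = all_satisfied(vc, props)
```

Here the CPU constraint holds and the memory constraint does not, so this candidate placement would be rejected.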


Mapping Logic – Optimization Problem Formulation

Assume that initially each VM (application) is hosted by one physical machine, and each physical machine hosts exactly one VM
  |VM| = |P| = N
  n is not known a priori, and N is the upper bound of n

A: an N×N matrix such that Ai,j = 1 specifies that VMi is assigned to Pj
  Initially, A is a diagonal matrix (the identity)

Y: a |P|-bit vector such that Yi = 1 implies that Pi is currently running some VMs
  Initially, Y is a vector of all 1s
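
The initial state of A and Y, and how Y follows from A after a migration, can be sketched as follows. The helper names are hypothetical and indices are zero-based, unlike the one-based notation on the slide.

```python
from typing import List

Matrix = List[List[int]]


def initial_assignment(n: int) -> Matrix:
    """A[i][j] = 1 iff VM i is on server j; initially VM i runs on server i."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def active_servers(a: Matrix) -> List[int]:
    """Y[j] = 1 iff at least one VM is assigned to server j."""
    return [1 if any(row[j] for row in a) else 0 for j in range(len(a[0]))]


def migrate(a: Matrix, vm: int, target: int) -> Matrix:
    """Return a copy of A in which vm is assigned to target only."""
    b = [row[:] for row in a]
    b[vm] = [1 if j == target else 0 for j in range(len(b[vm]))]
    return b


A = initial_assignment(3)   # identity matrix
Y = active_servers(A)       # all servers active

A2 = migrate(A, 2, 0)       # move the third VM onto the first server
Y2 = active_servers(A2)     # the third server is now idle
n_active = sum(Y2)          # this is n, the number of servers still needed
```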


Mapping Logic – Optimization Problem Formulation

Costi: the fixed cost incurred if Pi is active

MCosti,j: the cost of migrating VMi to Pj

F: a function that calculates the dynamic cost of a physical server when one or more VMs are assigned to it
  Currently this function uses only the CPU utilization to compute the dynamic cost

The benefit B attained by the consolidation is a function of the following annotated terms
  The cost of the initial setting
  The fixed cost of running the physical servers
  The cost of migrating VMs to Pj


Mapping Logic – Optimization Problem Formulation

The first term of B is fixed and does not change while maximizing the function; therefore the objective can be transformed into minimizing the remaining terms, i.e., the cost of running the physical servers plus the cost of migrating VMs
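
The exact equations are not preserved in this transcript, so the sketch below is only one form consistent with the described terms: the benefit is the cost of the initial setting minus the cost of running the active servers and minus the migration cost. The function names, the linear F, and all numbers are assumptions for illustration, not the paper's definitions. The initial host of VM i is taken to be server i, as in the initial diagonal A.

```python
from typing import Callable, List

Matrix = List[List[int]]


def running_cost(a: Matrix, vm_cpu: List[float], cost: List[float],
                 f: Callable[[float], float]) -> float:
    """Fixed plus dynamic cost of every server that hosts at least one VM."""
    total = 0.0
    for j in range(len(a[0])):
        hosted = [vm_cpu[i] for i in range(len(a)) if a[i][j] == 1]
        if hosted:  # equivalent to Y[j] = 1
            total += cost[j] + f(sum(hosted))
    return total


def migration_cost(a: Matrix, mcost: List[List[float]]) -> float:
    """Cost of every VM i placed on a server j other than its initial server i."""
    return sum(mcost[i][j]
               for i, row in enumerate(a)
               for j, x in enumerate(row) if x == 1 and i != j)


def objective(a, vm_cpu, cost, mcost, f) -> float:
    """Quantity to minimize: everything in B except the fixed first term."""
    return running_cost(a, vm_cpu, cost, f) + migration_cost(a, mcost)


def benefit(a, vm_cpu, cost, mcost, f) -> float:
    """Assumed form of B: initial cost minus the minimized objective."""
    n = len(a)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return running_cost(identity, vm_cpu, cost, f) - objective(a, vm_cpu, cost, mcost, f)


# Hypothetical inputs: three VMs, identical servers, linear dynamic cost.
vm_cpu = [20.0, 30.0, 10.0]
cost = [100.0, 100.0, 100.0]
mcost = [[0.0, 5.0, 5.0], [5.0, 0.0, 5.0], [5.0, 5.0, 0.0]]
f = lambda cpu: 0.5 * cpu

plan = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]  # all VMs packed onto the first server
obj = objective(plan, vm_cpu, cost, mcost, f)
b = benefit(plan, vm_cpu, cost, mcost, f)
```

Because the initial cost does not depend on the plan, maximizing `benefit` and minimizing `objective` select the same assignment.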


Mapping Logic – Dynamic/Static Consolidation

Dynamic Consolidation
  Assume the consolidation window size is Ts
  First, minimize the optimization function over the time interval [1, Ts] and generate the assignment matrix A[1,Ts]
  When consolidating for the time interval [Ts+1, 2*Ts], use the new set of constraints and A[1,Ts] as the starting point for the optimization

Static Consolidation
  Set all migration costs to zero
  Set the consolidation window to cover the whole time period
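
The window-by-window chaining can be sketched as a loop in which each window's result becomes the starting point of the next. The solver here is a toy first-fit-decreasing heuristic on per-window peak CPU, swapped in for the real optimization (the paper uses AMPL and CPLEX); `solve_window`, `consolidate`, the capacity of 100, and the traces are all hypothetical.

```python
from typing import List, Tuple


def solve_window(peaks: List[float], start: List[int], capacity: float) -> List[int]:
    """Toy stand-in for the optimizer, NOT the paper's method.

    First-fit decreasing: place the largest VMs first, prefer servers already
    in use in this window, then the VM's current server, then any other.
    """
    n = len(peaks)
    load = [0.0] * n
    in_use: List[int] = []
    placement = [0] * n
    for vm in sorted(range(n), key=lambda i: -peaks[i]):
        unused = [s for s in range(n) if s not in in_use]
        unused.sort(key=lambda s: s != start[vm])  # current host first
        for s in in_use + unused:
            if load[s] + peaks[vm] <= capacity:
                load[s] += peaks[vm]
                placement[vm] = s
                if s not in in_use:
                    in_use.append(s)
                break
        else:
            raise ValueError("VM does not fit on any server")
    return placement


def consolidate(traces: List[List[float]], window: int,
                capacity: float) -> List[Tuple[List[int], int]]:
    """Return (placement, number of migrations) for each consolidation window."""
    current = list(range(len(traces)))  # initially VM i runs on server i
    plans = []
    for s in range(0, len(traces[0]), window):
        peaks = [max(t[s:s + window]) for t in traces]
        nxt = solve_window(peaks, current, capacity)
        plans.append((nxt, sum(1 for a, b in zip(current, nxt) if a != b)))
        current = nxt  # the result is the starting point for the next window
    return plans


# Hypothetical CPU traces (percent) for three VMs over four time steps.
traces = [[50, 60, 20, 10], [30, 20, 70, 80], [10, 10, 10, 20]]

dynamic_plan = consolidate(traces, window=2, capacity=100)
static_plan = consolidate(traces, window=4, capacity=100)  # one window = whole trace
```

In this toy run the dynamic plan needs one server in the first window and two in the second, while the static plan must provision for the whole-trace peaks and keeps two servers throughout.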


Experimental Validation – Data Set

The trace data was collected using the Model Driven Monitoring System (MDMS) [8]
  186 physical servers
  35 clusters, with each cluster supporting one application
  Approximately 15 parameters are monitored for every server
    But in this paper, the authors use CPU utilization data only
  Parameters are sampled at 5-minute intervals

The optimization problem solver: AMPL and CPLEX

[8] B. Krishnamurthy, et al., “Data tagging architecture for system monitoring in dynamic environments,” in IEEE NOMS, 2008


Experimental Validation – Evaluation Metrics

Time efficiency
  Measures how fast the tool works given the size of the data
  The efficiency ε is defined as TR/TS
    TS: the consolidation window size
    TR: the time taken by ReCon to generate the consolidation plan
    ε ~ 0 for a highly efficient tool
    ε ≥ 1 renders the tool useless for all practical purposes, since the plan would take at least as long to compute as the window it covers

Effectiveness
  The percentage of physical machines that can be turned off by packing N VMs onto ni physical machines while satisfying all constraints in the corresponding consolidation window i
  The effectiveness S is given by (N - ni)/N
    S ~ 1 implies most of the physical machines can be turned off
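
Both metrics are simple ratios; a minimal sketch with hypothetical numbers (not results from the paper) is:

```python
def time_efficiency(t_r: float, t_s: float) -> float:
    """epsilon = TR / TS, with both times in the same unit."""
    if t_s <= 0:
        raise ValueError("window size must be positive")
    return t_r / t_s


def effectiveness(n_vms: int, n_active: int) -> float:
    """S = (N - n_i) / N, the fraction of servers that can be turned off."""
    if n_vms <= 0:
        raise ValueError("N must be positive")
    return (n_vms - n_active) / n_vms


# Hypothetical values: a 60-minute window planned in 3 minutes,
# and 100 VMs packed onto 40 physical machines.
eps = time_efficiency(3.0, 60.0)
s = effectiveness(100, 40)
usable = eps < 1.0  # the plan is ready before the window it covers has elapsed
```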


Experimental Validation – Change in recommendations vs. migration cost

Recommendation: the merging of two VMs onto the same physical server

Migration cost:
  Inter-cluster migration cost is normalized to 100
  Intra-cluster migration cost is varied as a percentage of the inter-cluster migration cost
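
One way to build such a migration cost matrix is sketched below. It assumes VM i initially sits on server i and that each server belongs to exactly one cluster; `migration_costs` and the cluster labels are hypothetical.

```python
from typing import List


def migration_costs(cluster_of: List[str], intra_pct: float,
                    inter_cost: float = 100.0) -> List[List[float]]:
    """MCost[i][j] for moving VM i (initially on server i) to server j.

    Staying put costs nothing, a move within the same cluster costs intra_pct
    percent of the inter-cluster cost, and a move across clusters costs
    inter_cost.
    """
    intra = inter_cost * intra_pct / 100.0
    n = len(cluster_of)
    return [[0.0 if i == j
             else intra if cluster_of[i] == cluster_of[j]
             else inter_cost
             for j in range(n)]
            for i in range(n)]


# Hypothetical layout: two servers in one cluster, one server in another.
mcost = migration_costs(["web", "web", "db"], intra_pct=20)
```

Sweeping `intra_pct` from 0 to 100 would reproduce the kind of experiment named on this slide, with the optimizer re-run for each value.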


Experimental Validation – Efficiency Results

[Figure: time taken; axes: VMs (1~175), consolidation window (10~240 min)]


Experimental Validation – Change in recommendations over time period

To study how the recommendations vary with changes in the consolidation window size

Results:
  As the time window is increased, the number of recommendations decreases
    More samples make it more difficult to satisfy the constraints

Time Window Size
  The time window should not be too large, so that the dynamic behavior is captured
  The time window should not be too small, so that the optimization engine is not invoked repeatedly without any gain
  Recommended value: T = 300 minutes


[Figure] X axis: time window; Y axis: cost saving


Experimental Validation – Change in Cluster Over Time

To study the effect of recommendations on individual clusters

Based on the mean and standard deviation of the saving, the clusters can be categorized into four groups
  Low Variation – Low Saving
  Low Variation – High Saving
  High Variation – Low Saving
  High Variation – High Saving
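
The four-way grouping can be sketched with the standard library. The slides do not give the cut-off values, so the two thresholds below, the use of the population standard deviation, and the sample savings are assumptions for illustration only.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def categorize(savings: List[float], mean_threshold: float,
               std_threshold: float) -> Tuple[str, str]:
    """Return (variation, saving) labels, each either "Low" or "High"."""
    variation = "High" if pstdev(savings) > std_threshold else "Low"
    saving = "High" if mean(savings) > mean_threshold else "Low"
    return variation, saving


# Hypothetical per-window cost savings for four clusters.
clusters = {
    "c1": [10, 10, 10, 10],
    "c2": [50, 52, 48, 50],
    "c3": [0, 40, 0, 40],
    "c4": [20, 80, 30, 90],
}

# Hypothetical thresholds: mean saving above 30 is "High",
# standard deviation above 5 is "High".
groups = {name: categorize(s, mean_threshold=30, std_threshold=5)
          for name, s in clusters.items()}
```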


[Figure: four panels: Low Variation – Low Saving; Low Variation – High Saving; High Variation – Low Saving; High Variation – High Saving. X axis: time window; Y axis: cost saving]


Related Works

There is significant work in the capacity planning and runtime resource management domain that does not bring in the aspect of virtualization

VMware's Distributed Resource Scheduler (DRS)

Bobroff et al. describe algorithms for reconsolidation in a dynamic setting while managing SLA violations

In static consolidation, several bin-packing heuristics have been used to map VMs to physical servers


Discussion and Future Work

Handling multiple attributes
  The implementation cannot exploit the relationships or correlations among attributes
    E.g., the time-lag relation between high CPU utilization and high I/O utilization

Runtime reconfiguration tool
  To convert the planning tool into a real-time decision module, a highly efficient implementation and forecasting logic are needed
    Machine learning and time series forecasting techniques are the candidates for the authors' next step

Extending what-if analysis


Conclusion

A VM consolidation planning tool called ReCon is presented
  It analyzes the historical resource consumption data

The consolidation problem is formulated in an optimization framework
  Time-varying constraints are easily incorporated to reflect temporal changes in workload characteristics
  Different migration cost functions and virtualization models can be plugged into the tool


Comment

The problem is well formulated
  But the cost functions mentioned are mysterious
