State Monitoring in Cloud Datacenters
Shicong Meng (Student Member, IEEE), Ling Liu (Senior Member, IEEE), Ting Wang (Student Member, IEEE)
IEEE Transactions on Knowledge and Data Engineering, Vol. 23, No. 9, September 2011
Presented by Kartik Babu Boga




Introduction

• A datacenter is a facility that houses computer systems and their associated components, such as telecommunication and storage systems.

• A cloud datacenter is an advancement of the datacenter that provisions system resources and computing on demand.

• A cloud application delivers Software as a Service (SaaS) over the Internet.

Ex 1) The National Climatic Data Center (NCDC) is a public data center that maintains the world's largest archive of weather information.

Ex 2) Amazon's Elastic Compute Cloud (EC2) is designed to make web-scale computing easier for developers.


• The scale of cloud datacenters and the diversity of application-specific metrics pose significant challenges to both the system and data aspects of datacenter monitoring, for the following reasons.

Event capturing: The tremendous number of events raises a number of system-level issues.

Resource consumption: Large-scale monitoring involves processing large amounts of data.

Reliability: System failures raise system-level issues in datacenter monitoring.


Challenges in Large-Scale Monitoring

• Distributed Aggregation:

It is hard to summarize the voluminous monitored values.

• Shared Aggregation:

Some monitoring tasks share similarities, yet they perform monitoring in isolation, leading to unnecessary resource consumption.

In this paper, we study state monitoring at cloud datacenters, which can be viewed as a cloud state management issue.


State Monitoring

• A key challenge for efficient state monitoring is meeting two demanding objectives: a high level of correctness, which ensures a zero or very low error rate, and high communication efficiency, which requires minimal communication cost.


• Detecting whether some aspect of a distributed application, such as the overall request rate, deviates from a normal state is the type of monitoring we refer to as state monitoring. State monitoring is widely used in many applications. Examples are:

Traffic engineering

Quality of service

Botnet detection

• One intuitive state monitoring approach is instantaneous state monitoring, which triggers a state alert whenever a predefined threshold is violated.

• Example: in Internet applications, where monitored values often contain momentary bursts, this approach causes frequent and unnecessary state alerts.


Instantaneous State Monitoring

• An intuitive approach to state monitoring.

• Triggers a state alert whenever a predefined threshold is violated.

• Most of the existing work deals with this approach.
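
To make the definition concrete, here is a minimal sketch of instantaneous state monitoring, assuming the monitored state is the sum of per-node values and all values are visible in one place. The function name, the trace, and the threshold of 100 are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, List, Sequence


def instantaneous_alerts(values_per_tick: Iterable[Sequence[float]],
                         threshold: float) -> List[int]:
    """Return every tick at which the summed node values exceed the threshold."""
    alerts = []
    for tick, node_values in enumerate(values_per_tick):
        # An alert fires as soon as the aggregate crosses the threshold,
        # even if the violation lasts for a single tick only.
        if sum(node_values) > threshold:
            alerts.append(tick)
    return alerts


# Hypothetical request rates of three nodes over six ticks.
trace = [(10, 20, 30), (50, 40, 30), (10, 10, 10),
         (40, 40, 40), (45, 40, 40), (50, 40, 40)]
alerts = instantaneous_alerts(trace, threshold=100)
print(alerts)
```

Note that the one-tick burst at tick 1 raises an alert in exactly the same way as the sustained violation that starts at tick 3.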


WIndow-based StatE monitoring (WISE)

• In this paper, we introduce the concept of window-based state monitoring and devise a distributed WISE framework for cloud datacenters.

• WISE triggers state alerts only when the state violation is continuous.

• The basic WISE may not scale well in the presence of a large number of monitoring nodes.

• Thus we present an improved window-based monitoring approach that improves our basic approach along several dimensions.

• We develop a set of optimization techniques to optimize the performance of the fully distributed WISE.

• We also compare the original WISE with the improved WISE on various aspects. Our results suggest that the improved WISE is more desirable for large-scale datacenter monitoring.
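
The window-based alert condition can be sketched as follows. This shows only the semantics (alert when the aggregate stays above the threshold for a whole window), evaluated centrally; it is not the distributed WISE algorithm. The name `window_alerts`, the parameter `window`, and the sample trace are assumptions made for illustration.

```python
from typing import Iterable, List, Sequence


def window_alerts(values_per_tick: Iterable[Sequence[float]],
                  threshold: float, window: int) -> List[int]:
    """Return ticks at which the aggregate has exceeded the threshold
    for at least `window` consecutive ticks."""
    alerts = []
    run = 0  # length of the current run of consecutive violations
    for tick, node_values in enumerate(values_per_tick):
        run = run + 1 if sum(node_values) > threshold else 0
        if run >= window:
            alerts.append(tick)
    return alerts


# Hypothetical trace: a one-tick burst at tick 1, then a sustained violation.
burst_trace = [(10, 20, 30), (50, 40, 30), (10, 10, 10),
               (40, 40, 40), (45, 40, 40), (50, 40, 40)]
sustained = window_alerts(burst_trace, threshold=100, window=3)
print(sustained)
```

With a window of three ticks, the momentary burst at tick 1 no longer triggers an alert; only the violation that persists through ticks 3 to 5 does.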


Three Unique Contributions of WISE

• Employ a novel distributed state monitoring algorithm that achieves communication efficiency.

• Use distributed parameter tuning and a cost model to reduce communication cost.

• Develop a set of optimization techniques.


Example


Problem Description

• The focus of existing work is to find optimal local threshold values that minimize the overall communication cost.

• As monitored values often contain momentary bursts and outliers, instantaneous state monitoring is prone to causing frequent and unnecessary state alerts, which could further lead to unnecessary countermeasures.

• The problem is challenging because careful handling of monitoring windows at distributed nodes is required to ensure both communication efficiency and monitoring correctness.

• We start with the most intuitive approach, applying the instantaneous monitoring algorithm.
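
As a rough illustration of how local thresholds save communication, the sketch below simulates instantaneous monitoring with one local threshold per node. The message accounting (one message per local violation report, two per polled node) is an assumed cost model, and all names and numbers are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence, Tuple


def monitor_with_local_thresholds(values_per_tick: Sequence[Sequence[float]],
                                  global_threshold: float,
                                  local_thresholds: Sequence[float]
                                  ) -> Tuple[List[int], int]:
    """Simulate instantaneous monitoring with local thresholds.

    Returns (ticks with a global violation, total messages exchanged).
    """
    if sum(local_thresholds) > global_threshold:
        raise ValueError("local thresholds must sum to at most the global threshold")
    n = len(local_thresholds)
    alerts, messages = [], 0
    for tick, node_values in enumerate(values_per_tick):
        violators = [i for i, (v, t) in enumerate(zip(node_values, local_thresholds))
                     if v > t]
        if not violators:
            # Every value is within its local threshold, so the sum cannot
            # exceed the global threshold: no communication is needed.
            continue
        messages += len(violators)            # local violation reports
        messages += 2 * (n - len(violators))  # global poll: request plus reply
        if sum(node_values) > global_threshold:
            alerts.append(tick)
    return alerts, messages


sample = [(10, 20, 25), (50, 40, 30), (10, 10, 10), (40, 10, 10)]
alerts_found, cost = monitor_with_local_thresholds(sample, 90, (30, 30, 30))
print(alerts_found, cost)
```

Under these assumptions the run costs 9 messages, against 12 if every node reported its value at every tick. Tick 3 shows the price of a poorly chosen threshold: a local violation forces a global poll even though the global threshold holds.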


Approach

• Three technical developments form the core of the WISE monitoring approach:

The WISE monitoring algorithm

Reports only partial information on local violation series at the monitor node side to save communication cost.

The monitoring parameter tuning schemes

A centralized parameter tuning scheme handles cases such as a node that often observes higher monitored values than other nodes. Because the search space grows exponentially, we also develop a distributed parameter tuning scheme that avoids centralized information collection and parameter searching.


Performance optimization techniques

To further minimize the communication cost between a coordinator node and its monitoring nodes, we use two techniques: the staged global poll and the termination message.

• To achieve the best communication efficiency, local monitoring parameters need to be tuned according to the given monitoring task and the monitored value distributions.

• The WISE monitoring algorithm guarantees monitoring correctness.

• Monitor Algorithm

– WISE uses two separate algorithms, one for the monitor nodes and one for the coordinator node.

– It relies on filtering windows and skeptical windows.
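
The slides do not spell out how a filtering window works, so the following is only a simplified sketch under our own reading: a monitor node stays silent until its local violations fill a window of `filter_size` consecutive ticks, and only then reports. The function name, the parameters, and the reset behavior after a report are assumptions, and the coordinator-side skeptical window logic is not shown.

```python
from typing import List, Sequence


def filtering_window_reports(local_values: Sequence[float],
                             local_threshold: float,
                             filter_size: int) -> List[int]:
    """Return the ticks at which a monitor node would send a report.

    Assumed rule: report only after `filter_size` consecutive local
    violations; any non-violating value discards the pending window.
    """
    reports = []
    run = 0  # consecutive local violations seen in the current window
    for tick, value in enumerate(local_values):
        if value > local_threshold:
            run += 1
            if run == filter_size:
                reports.append(tick)
                run = 0  # assumption: a new window starts after a report
        else:
            run = 0  # a single normal value filters out the pending window
    return reports


node_values = [5, 12, 11, 4, 15, 16, 17, 18, 3]
reports = filtering_window_reports(node_values, local_threshold=10, filter_size=3)
naive_reports = sum(v > 10 for v in node_values)
print(reports, naive_reports)
```

In this trace the node exceeds its local threshold at six ticks, yet only one filtering window fills up, so a single report is sent instead of six.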


Performance Evaluation

• WISE achieves a reduction of 50 to 90 percent in communication cost compared with the instantaneous monitoring algorithm.

• The centralized parameter tuning scheme effectively improves the communication efficiency.

• The optimization techniques further improve the communication efficiency of WISE.

• The actual gain is generally better (50 to 90 percent reduction in communication cost) with parameter tuning and optimized subroutines.

• The centralized scheme suffers from scalability issues with a large number of monitor nodes.


Basic WISE and Improved WISE

• WISE scales better than instantaneous monitoring in terms of communication overhead, and even more so with distributed parameter tuning.

• Although the distributed parameter tuning scheme performs worse than the centralized scheme, its scalability makes it the choice for large-scale distributed systems.

• Communication efficiency improves further when the two optimization techniques are combined with distributed parameter tuning.


Distributed Tuning versus Centralized Tuning

• The distributed scheme performs even better than the centralized scheme when the number of nodes is large.

• The distributed tuning scheme is a desirable alternative as it provides comparable communication efficiency and better scalability.


Related Research

• The early work [18] by Dilman and Raz proposes a Simple Value scheme, which sets every local threshold Ti to T/n, and an Improved Value scheme, which sets Ti to a value lower than T/n.

• Jain et al. [26] discuss the challenges of implementing distributed triggering mechanisms for network monitoring; they use local constraints of T/n to detect violations.

• The more recent work of Sharfman et al. [20] presents a geometric approach for monitoring threshold functions.

• The most recent work, by Kashyap et al. [22], addresses detecting distributed constraint violations.
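
The reason local thresholds can stand in for the global threshold fits in one line. The notation below is a sketch of our own: v_i(t) for the value observed at node i at time t is an assumed symbol, while Ti, T, and n are the ones used above.

```latex
\[
v_i(t) \le T_i \ \text{ for all } i
\;\Longrightarrow\;
\sum_{i=1}^{n} v_i(t) \;\le\; \sum_{i=1}^{n} T_i \;\le\; T,
\qquad \text{and for } T_i = \frac{T}{n}: \quad \sum_{i=1}^{n} T_i = T .
\]
```

So as long as no node exceeds its local threshold, no global violation is possible and no communication is needed; the Simple Value choice Ti = T/n uses up the whole budget T evenly.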


Conclusion and Future Work

• The increasing use of consolidation and virtualization is driving the development of technologies for managing cloud applications and services.

• State monitoring is a crucial functionality for on-demand provisioning of resources and services in cloud datacenters.

• WISE is not only resilient to bursts and outliers but also saves communication. Experimental results show that WISE achieves a 50 to 90 percent reduction in communication cost.


• The current results cover monitoring window-based state violations for one application running over a collection of distributed computing nodes.

• Future research includes scheduling state monitoring tasks for multiple applications.

• Another direction is failure-resilient state monitoring.


References

• Amazon, “Amazon Elastic Compute Cloud (Amazon EC2),” 2008.

• “Amazon CloudWatch Beta,” http://aws.amazon.com/cloudwatch/, 2011.

• S. Meng, T. Wang, and L. Liu, “Monitoring Continuous State Violation in Datacenters: Exploring the Time Dimension,” Proc. IEEE 26th Int’l Conf. Data Eng. (ICDE), 2010.

• http://en.wikipedia.org/wiki/Cloud_computing


Questions???