CMPE 283 – Project 2
DRS and DPM Implementation in a Virtualized Environment: Large-Scale Performance Statistics Gathering and Monitoring
Submitted to
Professor Simon Shim
Submitted By
Team 01
Akshay Wattal, Apoorva Gouni, Gopika Gogineni,
Pratyusha Mandapati
Table of Contents
1. Introduction
2. Background
3. Requirements
4. Design
5. Implementation
6. Assumptions
7. Limitations
8. Future Work and its Extension
9. Individual Contribution
10. Installation and Execution Manual
11. Unit, System and Integration Testing
12. Screenshots
13. Conclusion
14. References
1. INTRODUCTION:
Virtualization has eased IT computing by delivering significant cost savings, and it can greatly enhance an organization's business agility. Companies that employ partitioning, workload management and other virtualization techniques are better positioned to respond to changing demands in their business.
1.1. Goals:
Some of the key challenges the virtualization industry faces are managing efficient utilization of resources, proper consumption of power, and the collection of logs from the virtualized environment for monitoring and analysis. In this project we try to understand these challenges and propose a reasonable solution.
1.2. Objectives:
The objective of this project can be divided into two parts:
• Develop a simple DRS (Distributed Resource Scheduler) and DPM (Distributed Power Management) function.
• Develop a real-time statistics gathering and analysis framework that is capable of capturing metrics from the vHosts and the virtual machines and can visualize them in an efficient manner.
1.3. Needs:
A distributed resource scheduler and distributed power management are essential for balancing the load in a virtualized environment; they prevent over- and under-utilization of resources. With millions of performance records generated by the virtual machines, it is important to have a performance statistics collector and analyzer framework to monitor the health of the infrastructure and thus support smooth DevOps.
2. BACKGROUND:
The Distributed Resource Scheduler (DRS) helps distribute the load across the different hosts available in the vCenter. Distributed Power Management (DPM) runs on top of DRS; it periodically monitors the CPU usage of the hosts and virtual machines and powers off the host with the least utilization after migrating its virtual machines elsewhere. In addition, large-scale statistics are collected and displayed graphically for analytics purposes.
3. REQUIREMENTS
3.1. Functional Requirements
1. Collection of the performance statistics metrics of the hosts and virtual machines.
2. The data should be stored in a scalable NoSQL database.
3. Logstash should be used for log filtering and parsing.
4. An agent aggregator should read and aggregate data from the NoSQL database at intervals of 5 minutes, 1 hour and 1 day, and store the results in a relational MySQL database.
5. The data from the MySQL database should be presented in the form of charts using a visualization tool.
6. The visualization tool must be able to present the data on a real-time basis by updating itself at regular time intervals.

3.2. Non-Functional Requirements

1. The system must be designed to scale automatically for future use and should not have a single point of failure.
2. The visualization should be clear and convey meaningful insights.
4. DESIGN

Part 1 – DRS & DPM
Initially, the vHosts and the virtual machines under each vHost are listed. The CPU usage metric is then calculated for each vHost. When a new virtual machine is added, it is placed under the vHost with the lowest CPU usage, thereby balancing the load; likewise, virtual machines are migrated toward the vHost with the lower CPU usage.
The sample code below shows the Distributed Resource Scheduling logic when a new vHost is added to the vCenter.
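A minimal sketch of this host-selection step, assuming the vijava (VI Java) API on which the project's classes are built; the vCenter URL and credentials are placeholders, and for brevity the CPU figure comes from the host quick stats rather than the full PerformanceManager query used elsewhere in this report:

import java.net.URL;

import com.vmware.vim25.mo.HostSystem;
import com.vmware.vim25.mo.InventoryNavigator;
import com.vmware.vim25.mo.ManagedEntity;
import com.vmware.vim25.mo.ServiceInstance;

public class InitialPlacement {
    public static void main(String[] args) throws Exception {
        // Placeholder vCenter URL and credentials.
        ServiceInstance si = new ServiceInstance(
                new URL("https://vcenter.example.com/sdk"), "user", "password", true);
        ManagedEntity[] hosts = new InventoryNavigator(si.getRootFolder())
                .searchManagedEntities("HostSystem");

        // Pick the vHost with the lowest overall CPU usage (reported in MHz).
        HostSystem target = null;
        int lowestUsage = Integer.MAX_VALUE;
        for (ManagedEntity entity : hosts) {
            HostSystem host = (HostSystem) entity;
            Integer usage = host.getSummary().getQuickStats().getOverallCpuUsage();
            if (usage != null && usage < lowestUsage) {
                lowestUsage = usage;
                target = host;
            }
        }

        System.out.println("Place the new VM on " + target.getName()
                + " (CPU usage " + lowestUsage + " MHz)");
        si.getServerConnection().logout();
    }
}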
Part 2 – Architecture Diagram
The diagram below illustrates the solution architecture for the large-scale statistics gathering and analysis tool. It highlights the different components and tools used to implement the solution, and also shows the solution flow and the deployment scheme.
Figure: Solution Architecture of the framework
Components
Listed below are the major components that form the basis of the system and solution architecture:

1. Java Agent Collector
The Java Agent Collector is essentially a Java program that uses the VMware Infrastructure APIs; it initializes a PerformanceManager to fetch the vital performance statistics of the vHosts and the VMs and then writes them to a log file. This collector is deployed independently on each of the virtual machines.

2. Log File
As mentioned above, the log file acts as local storage on the virtual machine for the captured performance statistics. Each log file stored on a VM is named <vHost Name>.log, e.g. 130.65.132.131.log.

3. Logstash
Logstash is an open source tool for managing events and logs. It provides the capability to collect logs, parse them and store them for later use. In this framework, it polls the log files for "events", where an event is a single line written to the log file. As soon as an event is detected, it is pushed to the MongoDB cloud server. Using Logstash in our implementation also helps avoid a single point of failure: if the connection to MongoDB cannot be established, Logstash retries, and once the connection is back up it automatically syncs the missing log entries from the log files to MongoDB.

4. MongoDB Cloud Storage (MongoLab)
MongoLab provides MongoDB-as-a-Service. Deployed on top of AWS, it serves as our primary client-side database for storing the raw log data, consisting of the performance statistics of the vHosts and virtual machines. It is highly scalable and reliable. As explained above, Logstash pushes a large amount of data to this MongoDB cloud infrastructure. MongoLab also provides a web-based UI for managing the database (remote-connection configuration, deletion, etc.).

5. Java Agent Analyzer
The Agent Analyzer is a multi-threaded Java program that spawns three threads. These threads connect to the MongoDB cloud infrastructure at fixed intervals (5 minutes, 1 hour and 24 hours), perform roll-ups/aggregation on the data they fetch, and store the results in the MySQL database. Having an analyzer ensures that only processed data reaches the final database, which allows for easy and effective visualization.

6. MySQL Database
MySQL is the relational database that forms the primary server-side store, holding all of the valuable aggregated, processed data.

7. Visualization Block
The visualization block forms an important part of this large-scale framework. It gives the user the capability to view and drill down into the key performance statistics. In our solution the following pieces are integrated together to generate data insights:
• PHP
PHP is a server-side scripting language; our PHP scripts connect to the MySQL database and serve the requests sent from the UI.
• UI/Bootstrap
The UI is developed using Bootstrap, a front-end template framework. It sends AJAX (GET) calls to fetch data through the PHP/MySQL interface.
• Apache httpd Server
This is an open source HTTP server from Apache that hosts the PHP and UI files. It handles the communication between the UI and the PHP scripts.

Key Workflows
To understand the solution we can divide the architecture into three key workflows:
1. Data Collection
The diagram below shows the data collection workflow. Each collector agent running on a virtual machine obtains the performance data (for both the vHost and the VM) using the PerformanceManager and the metric IDs, then writes the data to a log file, which acts as local storage. The Logstash process running on the VM reads the log file and pushes the received data into the MongoDB cloud storage platform.

Figure: Data Collection Workflow

2. Data Aggregation
The next workflow is data aggregation. The Java Agent Analyzer is a multi-threaded program that spawns three threads for 5-minute, 1-hour and 24-hour data aggregation. Its main job is to fetch the data from the MongoDB datastore and aggregate it into the MySQL database. Each thread sleeps for its specific time interval before performing the next aggregation pass.

Figure: Data Aggregation Workflow
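As a rough illustration of one such aggregation thread (a sketch, not the project's actual analyzer code), the following assumes the legacy MongoDB Java driver and MySQL Connector/J; the connection strings, the cpuUsage field name and the single-metric roll-up stand in for the full set of metrics:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;

public class FiveMinuteRollup implements Runnable {
    public void run() {
        while (true) {
            try {
                aggregateOnce();
                Thread.sleep(5 * 60 * 1000);   // sleep until the next 5-minute roll-up
            } catch (Exception e) {
                e.printStackTrace();           // keep the thread alive on errors
            }
        }
    }

    private void aggregateOnce() throws Exception {
        // Placeholder MongoDB URI; database and collection names follow the report.
        MongoClient mongo = new MongoClient(
                new MongoClientURI("mongodb://user:password@host:27017/vmwaredb"));
        DBCollection logs = mongo.getDB("vmwaredb").getCollection("mylogcollection");

        double cpuSum = 0;
        int count = 0;
        DBCursor cursor = logs.find();          // in practice, filter by the 5-minute window
        while (cursor.hasNext()) {
            DBObject doc = cursor.next();
            Object cpu = doc.get("cpuUsage");   // assumed field name in the log events
            if (cpu == null) continue;
            cpuSum += Double.parseDouble(cpu.toString());
            count++;
        }
        mongo.close();
        if (count == 0) return;

        // Placeholder MySQL connection details; only cpuUsage is rolled up here.
        Connection sql = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/vmware_monitoring", "root", "password");
        PreparedStatement ps = sql.prepareStatement(
                "INSERT INTO vHost_monitoring_5min (`timestamp`, `name`, `cpuUsage`) VALUES (NOW(), ?, ?)");
        ps.setString(1, "130.65.132.131");
        ps.setDouble(2, cpuSum / count);
        ps.executeUpdate();
        sql.close();
    }
}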
3. Data Visualization
The final workflow is visualizing the aggregated data. PHP scripts act as an abstraction on top of the MySQL database, fetching the data requested by the UI (which is written in JavaScript). The communication between the UI and PHP takes place over HTTP using REST web services, and the Apache httpd server hosting the web application serves this HTTP traffic. Highcharts, DataTables and Google Maps are used for visualizing the data insights. The dashboards refresh every 5 minutes to show the latest data.

Figure: Data Visualization Workflow

Database Schema Design

1. Client-Side MongoDB
For storing the large volume of raw data in MongoDB, the following schema is used:
2. Server-Side MySQL
It is essential that the server-side schema is designed to scale and to capture fine-grained information for presentation in the UI. The following schema design is used:

Figure: MySQL Schema Design

5. IMPLEMENTATION

Part 2

5.1. Environment
The project has been implemented in the following environment:
• One datacenter consisting of two hosts:
  o vHost 130.65.132.131
  o vHost 130.65.132.132
• The vHosts have VMware ESXi 5.0 installed.
• Each host has two virtual machines:
  o T01-VM01-Ubuntu01 and T01-VM01-Ubuntu02 under vHost 130.65.132.131
  o T01-VM01-Ubuntu03 and T01-VM01-Ubuntu04 under vHost 130.65.132.132
• Each VM has the following OS and tools:
  o Ubuntu 10.04
  o Java (version 1.8.0_25)
  o Logstash 1.4.2
• JavaScript is used for the client-side UI.

5.2. Tools

The following tools were used for development, debugging and testing:
• vSphere client and server – for connecting to and hosting the virtualized environment.
• Eclipse IDE – for developing the Java code and .jar files.
• Logstash 1.4.2 – acts as the log management framework.
• MongoLab – cloud storage for MongoDB (MongoDB-as-a-Service).
• MySQL 5.1.75 – used as the server-side storage.
• PHP 5.4.31 – used as the server-side scripting language.
• Bootstrap – UI template framework for creating the dashboard.
• Highcharts – for visualizing the key statistics.
• DataTables – for visualizing the key statistics.
• Google Maps – for plotting a machine's location from its IP address.
• Apache httpd 2.2.27 – for hosting the web application.
5.3. Implementation Approach:
The statistics of the individual virtual machines and their corresponding host machine are collected by deploying a jar file, designed and compiled for this purpose, on each virtual machine. This jar file (a Java project) is designed to accept the virtual machine name as an argument at runtime. The project design covers the following main aspects: retrieving the statistics, saving them to a log file on the individual virtual machines, saving them to the client-side database system, and retrieving them from the server-side database for visualization.

Statistics collection: The jar file deployed on each virtual machine collects the statistics of that virtual machine and its host machine. This jar file is a Java project with the code structure described below:
• This Java project has three classes: statsLog.java, pThread.java and stats.java.
• statsLog.java: This class defines the main() function. It accepts a valid VM name as an argument and obtains the corresponding host name from a dynamically created HashMap of vHosts and VMs. A thread is then created using the pThread class to run the collection in a separate thread of operation.
• pThread.java: This thread is started in statsLog.java and holds the virtual machine and host system objects. Its run() method creates a stats.java object and passes it the virtual machine and host system objects. The run() method executes in an infinite loop, blocking between iterations, to collect the statistics at a fixed time interval.
• stats.java: This class defines the methods that collect statistics from the virtual machine (printVMdetails) and from its host system (printHOSTdetails). These methods use the following prebuilt VMware classes (a condensed query sketch follows this list):
  o PerformanceManager: returns the performance manager object for the vCenter service instance.
  o PerfProviderSummary: queries and returns the provider summary for the entity passed in.
  o PerfQuerySpec: builds the query for the corresponding metric and retrieves the metric data.
  o The following metric IDs are used to collect the statistics:
    § 2: CPU usage
    § 6: CPU usage in MHz
    § 24: Memory usage
    § 29: Memory granted
    § 33: Memory active
    § 98: Memory consumed
    § 125: Disk usage
    § 130: Disk read
    § 131: Disk write
    § 143: Net usage
    § 148: Net received
    § 149: Net transmitted
    § 155: System uptime
    § 480: System resources CPU usage
  o All of these statistics are saved to a log file on each virtual machine.
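The condensed sketch below shows how such a query can be issued with these classes via the vijava API; the metric IDs match the list above, while the class name, the helper structure and the 20-second fallback interval are illustrative rather than the project's exact stats.java code:

import com.vmware.vim25.PerfEntityMetricBase;
import com.vmware.vim25.PerfMetricId;
import com.vmware.vim25.PerfProviderSummary;
import com.vmware.vim25.PerfQuerySpec;
import com.vmware.vim25.mo.ManagedEntity;
import com.vmware.vim25.mo.PerformanceManager;
import com.vmware.vim25.mo.ServiceInstance;

public class StatsQuery {
    // IDs from the list above: 2 = CPU usage, 24 = memory usage, 125 = disk usage, 143 = net usage, ...
    private static final int[] METRIC_IDS =
            {2, 6, 24, 29, 33, 98, 125, 130, 131, 143, 148, 149, 155, 480};

    public static PerfEntityMetricBase[] query(ServiceInstance si, ManagedEntity entity) throws Exception {
        PerformanceManager perfMgr = si.getPerformanceManager();

        // Ask the provider summary for the refresh rate supported by this entity.
        PerfProviderSummary summary = perfMgr.queryPerfProviderSummary(entity);
        int interval = summary.getRefreshRate() != null ? summary.getRefreshRate() : 20;

        PerfMetricId[] metricIds = new PerfMetricId[METRIC_IDS.length];
        for (int i = 0; i < METRIC_IDS.length; i++) {
            metricIds[i] = new PerfMetricId();
            metricIds[i].setCounterId(METRIC_IDS[i]);
            metricIds[i].setInstance("");        // aggregate instance
        }

        PerfQuerySpec spec = new PerfQuerySpec();
        spec.setEntity(entity.getMOR());
        spec.setMetricId(metricIds);
        spec.setIntervalId(interval);
        spec.setMaxSample(1);                    // only the latest sample per call

        return perfMgr.queryPerf(new PerfQuerySpec[]{spec});
    }
}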
Log Parsing: The log file saved at a specified location on each virtual machine is shipped to MongoDB in the cloud (MongoLab) using Logstash. For this, a configuration file is created that specifies:
• the input file path,
• the output, i.e. the connection to MongoDB, and
• the database name and the collection name in which to store the data.

The format of the configuration file is as follows:

input {
  file {
    codec => "json"
    stat_interval => 0
    path => "/home/administrator/Desktop/Stats/130_65_132_131.log"
  }
}
output {
  mongodb {
    collection => "mylogcollection"
    database => "vmwaredb"
    uri => "mongodb://Apoorva:[email protected]:61200/vmwaredb"
  }
}

Log Storage:
• MongoDB is used to collect and store the log data received from Logstash.
• Received logs are stored in the "mylogcollection" collection in the "vmwaredb" database.
• The MongoLab instance of MongoDB is configured to allow remote connections.
Log Aggregation: This is done by the T01_Analyser.java program, a multi-threaded Java program that spawns three threads for 5-minute, hourly and daily aggregation. These are recurring threads that execute after every fixed, configurable interval. Once an aggregation pass is done, the thread opens a connection to the MySQL database over JDBC (MySQL Connector/J) and executes two insert statements per time interval (one for the VMs and one for the vHosts), resulting in updates to six tables in total.

Data Storage and Retrieval: Data storage is done using the MySQL database; the schema creation script is as follows:

CREATE DATABASE IF NOT EXISTS `vmware_monitoring` /*!40100 DEFAULT CHARACTER SET latin1 */;
USE `vmware_monitoring`;

DROP TABLE IF EXISTS `vHost_monitoring_5min`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_5min` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_5min`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_5min` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `vHost_monitoring_1hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_1hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_1hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_1hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `vHost_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `vHost_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

DROP TABLE IF EXISTS `VM_monitoring_24hour`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `VM_monitoring_24hour` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `timestamp` varchar(255) NOT NULL,
  `vHost_name` varchar(255) DEFAULT NULL,
  `vm_name` varchar(255) DEFAULT NULL,
  `cpuUsage` decimal(11,3) DEFAULT NULL,
  `cpuUsagemhz` decimal(11,3) DEFAULT NULL,
  `memUsage` decimal(11,3) DEFAULT NULL,
  `memGranted` decimal(11,3) DEFAULT NULL,
  `memActive` decimal(11,3) DEFAULT NULL,
  `memConsumed` decimal(11,3) DEFAULT NULL,
  `diskUsage` decimal(11,3) DEFAULT NULL,
  `diskRead` decimal(11,3) DEFAULT NULL,
  `diskWrite` decimal(11,3) DEFAULT NULL,
  `netUsage` decimal(11,3) DEFAULT NULL,
  `netReceived` decimal(11,3) DEFAULT NULL,
  `netTrasnmitted` decimal(11,3) DEFAULT NULL,
  `sysUptime` decimal(11,3) DEFAULT NULL,
  `sysResourcesCpuUsage` decimal(11,3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `vm_name` (`vm_name`)
) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

For retrieving the data, the following PHP scripts were created. These scripts define the data model for returning data to the UI when a web service request is made, and they connect to the MySQL database to provide easy access to the data.

User Interface (UI): The UI is implemented using Bootstrap. We call the dashboard "VMware DevOps". It makes REST calls to retrieve the data and renders it on the screen. Highcharts and DataTables are used for visualizing the data, and the Google Maps API is used to determine the location of a server based on its IP address. The statistics are updated automatically every 5 minutes.
Part 1 – DRS
Initially, we set the monitoring interval, create the performance query specification, and then loop over the samples received from the PerformanceManager. Here is the sample code for it.

After obtaining the CPU usage of the vHosts, DRS is started in order to balance the load. DRS checks the load on the vHosts and then clones virtual machines as required by the load-balancing decision. The sample code for it is shown here.
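A hedged sketch of that cloning step, again assuming vijava; the source VM, target host, datacenter and clone name are parameters supplied by the caller, and the relocate spec simply points the clone at the vHost chosen by the load-balancing check above:

import com.vmware.vim25.VirtualMachineCloneSpec;
import com.vmware.vim25.VirtualMachineRelocateSpec;
import com.vmware.vim25.mo.ComputeResource;
import com.vmware.vim25.mo.Datacenter;
import com.vmware.vim25.mo.HostSystem;
import com.vmware.vim25.mo.Task;
import com.vmware.vim25.mo.VirtualMachine;

public class CloneToSelectedHost {
    public static void cloneTo(VirtualMachine sourceVm, HostSystem targetHost,
                               Datacenter datacenter, String cloneName) throws Exception {
        // The relocate spec points the clone at the selected (least-loaded) vHost
        // and at that host's resource pool.
        ComputeResource cr = (ComputeResource) targetHost.getParent();
        VirtualMachineRelocateSpec relocate = new VirtualMachineRelocateSpec();
        relocate.setHost(targetHost.getMOR());
        relocate.setPool(cr.getResourcePool().getMOR());

        VirtualMachineCloneSpec spec = new VirtualMachineCloneSpec();
        spec.setLocation(relocate);
        spec.setPowerOn(true);
        spec.setTemplate(false);

        // Clone into the datacenter's VM folder and wait for the task to finish.
        Task task = sourceVm.cloneVM_Task(datacenter.getVmFolder(), cloneName, spec);
        System.out.println("Clone finished with status: " + task.waitForTask());
    }
}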
The second part of the DRS algorithm can be illustrated with the screenshots below. When the program runs against the vCenter using the given credentials, it adds a new vHost from the admin console using the SSL thumbprint of the host to be added.
When a new host is added, DRS checks the CPU usage of each vHost, and the virtual machines under the host with the highest CPU usage are migrated to the new vHost. Metrics are collected for every vHost to obtain its CPU usage. Since host 130.65.132.131 has the highest CPU usage, the virtual machine Ubuntu02 is migrated to the new host 130.65.132.133. While migrating a virtual machine to the new host, a few conditions are checked:
1. If it is powered on: live-migrate the virtual machine.
2. If it is powered off: cold-migrate the virtual machine.
The lower threshold for host CPU usage is set to 30%. If a host's usage is below 30%, no migration takes place. Sample code showing the 30% CPU usage threshold is as follows:
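A simplified sketch of that decision, assuming vijava; the 30% lower threshold and the live/cold rule follow the description above, while the method and variable names are illustrative:

import com.vmware.vim25.VirtualMachineMovePriority;
import com.vmware.vim25.VirtualMachinePowerState;
import com.vmware.vim25.mo.ComputeResource;
import com.vmware.vim25.mo.HostSystem;
import com.vmware.vim25.mo.Task;
import com.vmware.vim25.mo.VirtualMachine;

public class MigrationPolicy {
    private static final double LOWER_THRESHOLD = 30.0;   // percent CPU usage

    public static void maybeMigrate(VirtualMachine vm, HostSystem sourceHost, HostSystem newHost,
                                    double sourceCpuUsagePercent) throws Exception {
        // Hosts below the lower threshold are left alone.
        if (sourceCpuUsagePercent < LOWER_THRESHOLD) {
            System.out.println(sourceHost.getName() + " is under 30% CPU usage; no migration.");
            return;
        }

        // Powered-on VMs are live-migrated (vMotion); powered-off VMs are cold-migrated.
        VirtualMachinePowerState state = vm.getRuntime().getPowerState();
        ComputeResource cr = (ComputeResource) newHost.getParent();
        Task task = vm.migrateVM_Task(cr.getResourcePool(), newHost,
                VirtualMachineMovePriority.highPriority, state);
        System.out.println("Migration of " + vm.getName() + " finished: " + task.waitForTask());
    }
}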
When a vHost has only one virtual machine, migration is not performed on that virtual machine. The screenshot below shows that the virtual machine under the second vHost with the highest CPU usage is migrated successfully. Thus, load balancing of the vHosts is achieved using Distributed Resource Scheduling.

Part 1 – DPM

Initially, DPM checks all the vHosts and, if any vHost has no VMs under it, deletes that vHost. It then determines the CPU usage of all the vHosts. Here is the sample code for it:
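A minimal sketch of this clean-up step, assuming vijava; in practice the host is also disconnected before removal, a step only noted in a comment here:

import com.vmware.vim25.mo.Folder;
import com.vmware.vim25.mo.HostSystem;
import com.vmware.vim25.mo.InventoryNavigator;
import com.vmware.vim25.mo.ManagedEntity;
import com.vmware.vim25.mo.Task;

public class RemoveEmptyHosts {
    public static void removeEmptyHosts(Folder rootFolder) throws Exception {
        ManagedEntity[] hosts = new InventoryNavigator(rootFolder).searchManagedEntities("HostSystem");
        for (ManagedEntity entity : hosts) {
            HostSystem host = (HostSystem) entity;
            if (host.getVms() == null || host.getVms().length == 0) {
                String name = host.getName();
                // In practice the host is disconnected first (DisconnectHost_Task);
                // destroying the parent ComputeResource then removes the vHost
                // from the vCenter inventory.
                Task destroy = host.getParent().destroy_Task();
                destroy.waitForTask();
                System.out.println("Deleted empty vHost: " + name);
            }
        }
    }
}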
Whenever a new VM is added, DPM checks the CPU usage of all the vHosts at that point in time.

It then sorts the vHosts from lowest to highest CPU usage. All the VMs under any vHost with less than 30% CPU usage are migrated to another vHost, and that old vHost is deleted. The sample code for it is shown here.
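A hedged sketch of this consolidation loop: hosts are sorted by CPU usage, every VM on a host below the 30% threshold is migrated to the busiest host, and the drained host is then deleted. The CPU-usage calculation below uses the host quick stats, and migrate/removeHost are assumed helpers along the lines of the earlier migration and deletion sketches:

import java.util.Comparator;
import java.util.List;

import com.vmware.vim25.HostListSummary;
import com.vmware.vim25.mo.HostSystem;
import com.vmware.vim25.mo.VirtualMachine;

public class DpmConsolidation {
    private static final double LOWER_THRESHOLD = 30.0;   // percent CPU usage

    public static void consolidate(List<HostSystem> hosts) throws Exception {
        // Sort the vHosts from lowest to highest CPU usage.
        hosts.sort(Comparator.comparingDouble(DpmConsolidation::getCpuUsagePercent));
        HostSystem busiest = hosts.get(hosts.size() - 1);

        for (HostSystem host : hosts) {
            if (host == busiest || getCpuUsagePercent(host) >= LOWER_THRESHOLD) {
                continue;   // only under-utilized hosts (below 30%) are drained
            }
            for (VirtualMachine vm : host.getVms()) {
                migrate(vm, busiest);           // live or cold migration, as in the DRS part
            }
            removeHost(host);                    // delete the now-empty vHost
        }
    }

    // Percentage CPU usage derived from the host quick stats.
    private static double getCpuUsagePercent(HostSystem host) {
        HostListSummary summary = host.getSummary();
        double usedMhz = summary.getQuickStats().getOverallCpuUsage();
        double totalMhz = (double) summary.getHardware().getCpuMhz()
                * summary.getHardware().getNumCpuCores();
        return 100.0 * usedMhz / totalMhz;
    }

    // Assumed helpers; see the migration and host-deletion sketches above.
    private static void migrate(VirtualMachine vm, HostSystem target) { }
    private static void removeHost(HostSystem host) { }
}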
6. ASSUMPTIONS:
Part 1 – DRS:
The assumptions made to implement DRS Part 1 are:
• Initially, the vCenter has two hosts, each with two virtual machines:
  o vHost 1: 130.65.132.131
    § VM1: T01-VM01-Ubuntu01
    § VM2: T01-VM01-Ubuntu02
  o vHost 2: 130.65.132.132
    § VM1: T01-VM01-Ubuntu03
    § VM2: T01-VM01-Ubuntu04
• vMotion is enabled on all the vHosts, and VMware Tools is installed on each VM.
• The Prime95 program is running on each virtual machine.

Part 1 – DPM
• In this part, we have assumed the minimum CPU usage threshold to be 30%.
• The number of vHosts is three.
• There are no VMs under the third vHost.
Part 2:
The solution was built in a controlled environment consisting of 2 virtual machines and 2 hosts. The following assumptions were taken into consideration while building this solution:

1. Only virtual machines in the powered-on state are considered for log monitoring and analysis, since the Agent Collector must be running on the virtual machine to capture the performance data.
2. The virtual machines are packaged with the Logstash and Agent Collector (.jar) installation files. These files are executed using shell scripts, which helps prevent a single point of failure.
3. The interval for collecting performance data is set to 20 seconds, since this is part of the project requirement and avoids excessive I/O on the log files.
4. The Aggregator program is executed from a physical machine and is assumed to be highly robust and redundant, so as to withstand failures.
5. To simulate stress on the virtual machines, programs like Prime95 and Folding@home are used.
6. The MongoLab cloud infrastructure is used for storing the raw logs; it is considered highly scalable and elastic.
7. The UI should be dynamic, and the visualizations illustrating the performance statistics should be clear and crisp.
7. LIMITATIONS
Part 1:
• The current implementation is limited to a controlled environment, so the real execution of DRS and DPM in industry could not be studied.
• The algorithm does not take performance metrics other than CPU usage into consideration.
Part 2:
Following are the limitations of the current solution architecture:
1. The Agent Collector runs on the virtual machines, capturing the performance metrics. This leads to higher CPU utilization and slows down the other tasks running on the VM.
2. The performance metrics captured in this implementation are pre-defined; there is no way to dynamically add new performance metric IDs.
3. Using MongoDB-as-a-Service (MongoLab) acts as a black box in terms of scaling and security. Even though authentication keys and credentials are used, there is no visibility into where the data is being stored.
4. In case of a Logstash failure, the captured data (in the log files) will not be passed on to the MongoDB database server for storage.
8. FUTURE WORK AND ITS EXTENSION
Part 1:
The DRS algorithm can be extended to run over many vHosts and the virtual machines under them, so as to simulate a real industry environment. In addition, the behavior of the DRS and DPM execution can be monitored and triggered using a remote UI.
Part 2:
This project can be scaled to a complete log analyzing and visualization enterprise solution. Listed below are some of the extensions for the project:
1. Providing the capability to analyze logs per VM from the user interface (UI).
2. Dynamic or easy onboarding of new performance metrics.
3. Providing a feature to connect to a vCenter through the UI and fetch all the performance metrics of the corresponding vHosts and VMs dynamically.
4. Extending the monitoring and visualization capabilities to other virtualization platforms, e.g. Hyper-V, Xen, etc.
9. INDIVIDUAL CONTRIBUTION
Akshay Wattal
• Overall design of the system framework and architecture
• Performed PoCs for the different architectural components
• Implementation of the visualization framework
• Integration and system testing
• Agent Analyzer implementation
• Documentation

Apoorva Gouni
• Overall design of the system framework and architecture
• Agent Collector implementation
• Configuration of Logstash on the VMs
• MongoLab connection
• Unit testing
• Documentation

Gopika Gogineni
• Overall design of the system framework and architecture
• DPM implementation
• Documentation

Pratyusha Mandapati
• Overall design of the system framework and architecture
• DRS – create VM scenario
• Documentation
10. INSTALLATION AND EXECUTION MANUAL
• Java Installation: Java is installed on each virtual machine. The steps followed to install Java are:
  o Download and save jdk-8u25-linux-i586.tar.gz
  o Switch to the directory where the file was saved
  o Uncompress the file
  o Launch Java

• Logstash 1.4.2 Installation: Logstash is installed on each virtual machine. The steps followed to install Logstash are:
  o Download the Logstash tar file
  o Extract the Logstash tar file
  o Launch Logstash
  o Download logstash-contrib-1.4.2
  o Extract logstash-contrib-1.4.2
  o Run the contrib plug-in

Execution: The following commands should be executed at the command prompt on each virtual machine.

java -jar stats.jar VM_name : starts collecting the statistics and saving them to the log file at the specified location. Here "stats.jar" is the name of the jar file and VM_name is the name of the virtual machine on which the jar is being run.
bin/logstash agent -f stats.conf : runs Logstash and ships the log from the virtual machine to MongoDB in the cloud. Here "stats.conf" is the name of the configuration file.

• MongoLab Configuration: The detailed steps are provided at http://docs.mongolab.com/

• MySQL Installation: MySQL is installed on a physical machine running Mac OS, using the third-party package manager Homebrew:
brew install mysql
Execution: To start the MySQL service run the following command -‐ mysql.server restart
• PHP Installation: PHP is installed on the physical machine running Mac OS, also using Homebrew. The command is as follows (detailed link: https://github.com/Homebrew/homebrew-php):
brew install php56
• Apache HTTP Server Installation and Execution: Apache httpd is built into Mac OS. To start the Apache server, first deploy all the PHP and UI files (html, css, js) into the document root folder, then execute the following command to start the service:

./apachectl start

To stop the server, run: ./apachectl stop
11. UNIT, SYSTEM AND INTEGRATION TESTING:
The test cases were executed in a controlled environment consisting of 2 vHosts and 2 virtual machines.

Test Case 1
• Description: When a virtual machine is powered on, the program should return the virtual machine name and its corresponding host name, and the performance statistics of both the virtual machine and its host should be saved in a log file at the specified location.
• Expected Result: The statistics of the vHost and the VM are successfully fetched.
• Result: Pass
• Details: The virtual machine name and its corresponding vHost name were printed successfully, and the performance metrics of both were saved to a log file at the specified location. See screenshot Test-1.

Test Case 2
• Description: Check the connection to MongoDB and the storage of data in MongoDB. Run the command to execute Logstash; data from the specified input path should be shipped to MongoDB.
• Expected Result: A connection is established with MongoDB, and data is parsed from the specified location and saved to MongoDB.
• Result: Pass
• Details: Screenshot Test-2 shows the expected output.

Test Case 3
• Description: Test remote connectivity to the MongoLab server so that the Aggregator program can connect remotely.
• Expected Result: The remote connection succeeds (verified with the Robomongo tool).
• Result: Pass
• Details: Screenshot Test-3 shows the expected output.

Test Case 4
• Description: Ensure that the Aggregator is able to connect to MongoDB and insert aggregated statistics into the MySQL database.
• Expected Result: The data from MongoDB is aggregated and pushed into MySQL.
• Result: Pass
• Details: Screenshot Test-4 shows the data aggregated into the different MySQL tables corresponding to each timeframe.

Test Case 5 – Visualization

Test Case 5.1
• Description: Test whether the UI REST web services can connect to the deployed PHP scripts and get the data back.
• Expected Result: The VMware DevOps dashboard loads without errors and data is fetched for the charts.
• Result: Pass
• Details: Screenshot Test-5.1 shows the dashboard loading successfully without any errors.

Test Case 5.2
• Description: Test data consistency between the MySQL data and the data populated into the visualization widgets.
• Expected Result: The data points on the charts and tables are correct and as expected.
• Result: Pass
• Details: Screenshot Test-5.2 shows correct and matching data loaded in the UI.

Test Case 5.3
• Description: Check that the graphs and data tables auto-refresh after every fixed interval of time.
• Expected Result: The values in the graphs are updated after the configured time interval.
• Result: Pass
• Details: Screenshot Test-5.3 shows that the graphs are auto-refreshed.
Testing Screenshots:

Test-1:

Test-2:

Test-3:

Test-4:

Test-5.1:

Test-5.2:

Test-5.3:
12. SCREENSHOTS

Part 1: DRS 1, Initial placement
Running Prime95 to increase CPU utilization:

Comparison of the vHosts' CPU utilization for initial placement and selection of the target vHost:
Virtual Machine added to the selected vHost:

DRS 2, Load balancing:

Adding a new vHost, 130.65.132.133, to the vCenter.

Collecting the metrics for the vHosts and VMs and displaying the CPU usage along with the timestamp.

Checks performed for live and cold migration, and migration of the virtual machine T01-VM01-Ubuntu02 to the new vHost.

VM migrated successfully.

Checks performed on the second vHost, 130.65.132.132, to migrate the virtual machine T01-VM01-Ubuntu03 to the new vHost.

DPM (Distributed Power Management):

The vHost 130.65.132.133 is deleted as it has no VMs.

Below is the code execution of the above deletion procedure.

Migrating VM T01-VM01-Ubuntu03 from 130.65.132.132 to 130.65.132.131.

Below is the code execution of the above migration procedure.

Migrating VM T01-VM01-Ubuntu04 from 130.65.132.132 to 130.65.132.131.

Below is the code execution of the above migration procedure.

Deleting the vHost 130.65.132.132, since no VMs are left.

Below is the code execution of the above deletion procedure.

At the end of DPM, only one vHost with four VMs is left.

Part 2:
Note: Screenshots of the running collector, the generated log file, the Logstash execution, the MongoDB storage and the aggregation are already captured in the Testing section above; they are not repeated here to limit the length of the document.

Visualization
Below are the screenshots of the VMware DevOps dashboard that we have created.
This is the VMware vHost Top Statistics dashboard; its primary purpose is to show the top three vHosts consuming the most resources in terms of CPU, memory, disk I/O and usage.

This is the VMware VMs Top Statistics dashboard; its primary purpose is to show the top three virtual machines consuming the most resources in terms of CPU, memory, disk I/O and usage.

Next, from the left-side panel we can expand VMware Inventory and drill down to each ESXi host or VM. The following screenshot shows the overview of a monitored vHost.

CPU overview of the vHosts; it consists of two charts, each with three timelines: by minute, hour and day.

Memory overview of the vHosts; it consists of four charts, each with three timelines: by minute, hour and day.

Disk I/O overview of the vHosts; it consists of three charts, each with three timelines: by minute, hour and day.

Network overview of the vHosts; it consists of three charts, each with three timelines: by minute, hour and day.

And finally, the system overview of the vHosts; it consists of three charts, each with three timelines: by minute, hour and day.

Next, from the left-side panel we can expand VMware Inventory -> Virtual Machines. The first page is the VMs overview, providing quick stats.

CPU overview of a VM; it consists of two charts, each with three timelines: by minute, hour and day.

Memory overview of a VM; it consists of four charts, each with three timelines: by minute, hour and day.

Disk I/O overview of a VM; it consists of three charts, each with three timelines: by minute, hour and day. Note that to compare, or to view only one VM's statistics, we can simply enable or disable a VM from the chart legend, as shown below.

Network overview of a VM; it consists of three charts, each with three timelines: by minute, hour and day.

And finally, the system overview of a VM; it consists of three charts, each with three timelines: by minute, hour and day.

Next, to purge the MongoDB collections we can simply use the UI to clear all records. An extension of this could be deletion based on timestamp.

The last feature of this dashboard is that you can find the location of any server using its IP address.
13. CONCLUSION:
1. Migrating the VMs as part of DPM in real time while maintaining synchronization with the MySQL tables at the same time was highly challenging.
2. In real time, generating the graphs dynamically made the response slow.
3. Configuring Logstash was very challenging; the difficult part was determining the output formats using the plug-ins.
4. We had no hands-on experience with creating new resource pools and managing them within our vCenter from the initial labs, so we found that task challenging.
5. As the lab infrastructure provides every team with limited storage, we had to constantly make sure our machines did not run out of space by purging the log files that had been collected.
6. It took us a considerable amount of time just to sort out which statistics needed to be collected as per the project requirements.
7. Devising an algorithm that would best implement DRS and DPM was really challenging and took a considerable amount of time; once we figured it out, implementing DRS and DPM was easy.
14. REFERENCES:
1. http://logstash.net
2. CMPE 283 Project 2 Description PDF
3. https://mongolab.com/welcome/
4. http://php.net/manual/en/ref.pdo-mysql.php
5. http://startbootstrap.com/