DASM Self-Service Framework Technical White Paper May 2009


www.datasynapse.com p. 2

Confidentiality Notice and Disclaimer

Copyright © 2009 DataSynapse, Inc. All Rights Reserved.

Neither this document nor any of its contents may be used or disclosed without the express written consent of DataSynapse. This document does

not carry any right of publication or disclosure to any other party. While the information provided herein is believed to be accurate and reliable,

DataSynapse makes no representations or warranties, express or implied, as to the accuracy or completeness of such information.

Only those representations and warranties contained in a definitive license agreement shall have any legal effect. In furnishing this document,

DataSynapse reserves the right to amend or replace it at any time and undertakes no obligation to provide the recipient with access to any

additional information. Nothing contained within this document is or should be relied upon as a promise or representation as to the future.

GridServer® is a registered trademark. DataSynapse, the DataSynapse logo, GridClient, GridBroker, DASM, FabricBroker, LiveCluster, VersaUtility,

VersaVision, SpeedLink and RTI Design are trademarks. GRIDesign is a registered servicemark of DataSynapse, Inc. All other product names are

trademarks or registered trademarks of their respective companies.

DataSynapse products are protected by U.S. Patent No. 6,757,730, U.S. Patent No. 7,093,004 and U.S. Patent No. 7,130,891; other patents pending.


Table of Contents

Introduction
Key Insights
Comparative Analysis
Benefits
Roles and Responsibilities
Operating Model
Service Levels


Introduction

DASM Self-Service Framework (SSF) provides an elastic IT operating environment to create and manage

application infrastructure on system resources at the click of a button. Unlike existing cloud and

virtualization management solutions, SSF allows for rapid re-purposing of multiple physical or virtual

machines with complex multi-component application infrastructure configurations.

This approach is unique in that the multi-component application infrastructure configurations are managed

independent of the physical or virtual machines in which they reside. This paper addresses the model, use-

case, roles, responsibilities, and underlying technology that underpin this framework.

Feature | Production Environments | Non-Production Environments
Need for repurposing systems for different purposes | Low to Medium | Very High
Change Controls | Very High | Low in Development; Medium in Quality Assurance; High in Pre-Production Staging
Scalability Requirements | High | Low in general; High in benchmark testing
Availability Requirements | Very High | Low to Medium

It is important to note that because of the inherent differences between Production and Non-Production

environments, various features of SSF provide different levels of benefits within Production and Non-

Production environments. Nevertheless, in this whitepaper, we will discuss SSF as it applies across the entire

spectrum of a typical IT datacenter.

Before we dive into the details of how SSF works, we will focus on the key insights required to understand SSF. These insights prepare us for a comparative analysis of SSF against other solutions available in the application infrastructure landscape, and they lay the groundwork for understanding the benefits of implementing SSF.

Following the discussion on benefits, we will discuss the SSF system architecture, roles and responsibilities,

the operational model, impact on service levels, and the capacity planning model required to realize the SSF

system architecture in practice.

Key Insights

The key insights required to understand DASM SSF are as follows:

- DASM SSF provides 1-click capability to create complex distributed application infrastructure configuration images from templates and run them within the SSF environment

- SSF images and templates do not contain any OS or application platform binary distributions, and both the templates and the images are typically only a few megabytes in size

- An SSF image is run by transiently inserting the image into any relevant operating system (OS) image set: the complete runtime stack, comprising the OS layer, application platform layer and application components, is automatically assembled at the time of image startup and is disassembled at the time of image shutdown


- If changes are made to a running SSF image, these changes are automatically captured within the SSF image at the time of image shutdown

- The interconnections among distributed SSF image components are dynamically established at the time of image startup

- Clustered components within a running SSF image are capable of dynamic clustering in response to varying load experienced by the applications hosted within the cluster

- Multiple, time-stamped versions of an SSF image can be saved in an image repository, and any saved version can be restored from the repository and run within any SSF environment

- SSF images can be easily copied from one environment to another environment

- Data associated with some SSF image components, such as database servers or LDAP directory servers, may optionally be mapped to non-transient shared SAN or NAS storage, even though the runtime processes are always inserted into a transient OS image
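
The startup and shutdown behavior described in the insights above can be pictured with a minimal Python sketch. It is illustrative only: the class and function names (SSFImage, RunningStack, start_image, shutdown_image) and the sample settings are assumptions made for this example and are not part of any DASM API.

```python
from dataclasses import dataclass, field


@dataclass
class SSFImage:
    """Hypothetical stand-in for an SSF image: configuration only, no binaries."""
    name: str
    config: dict = field(default_factory=dict)


@dataclass
class RunningStack:
    """Runtime stack assembled at startup: OS layer, platform layer, image config."""
    os_image: str
    platform: str
    live_config: dict


def start_image(image: SSFImage, os_image: str, platform: str) -> RunningStack:
    # Assemble the stack; the image configuration is copied in transiently.
    return RunningStack(os_image, platform, dict(image.config))


def shutdown_image(image: SSFImage, stack: RunningStack) -> None:
    # Capture changes made while running, then disassemble the stack.
    image.config = dict(stack.live_config)
    stack.live_config.clear()


image = SSFImage("websphere-cell", {"heap_mb": 512})
stack = start_image(image, os_image="linux-vm-01", platform="websphere")
stack.live_config["heap_mb"] = 1024  # change made to the running image
shutdown_image(image, stack)

# The same image can now be inserted into a different OS image set.
stack2 = start_image(image, os_image="linux-blade-07", platform="websphere")
print(stack2.live_config)
```

The point of the sketch is that the image carries only configuration, so nothing about the first OS image set needs to survive for the second startup to succeed.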

Let us try to visualize the insights we have noted above in the context of an example.

Example WebSphere Cell SSF Image

Let us discuss the key insights in the context of the example WebSphere Cell SSF image, as shown in Figure 1.

This example SSF image defines the following application infrastructure configuration:

- An IBM HTTP Web Server, with a WebSphere Plug-in capable of routing traffic to WebSphere cluster members

- An IBM WebSphere Deployment Manager controlling the WebSphere cell, including the WebSphere cluster

- A WebSphere Cluster with two application server nodes, each node running a cluster member, whereby each node is part of the WebSphere cell defined by the Deployment Manager

- The cluster has dynamic clustering capability, and new nodes and cluster members can be added on the fly if the applications hosted within the cluster require horizontal scaling. Information about cluster members added to or removed from the cluster is dynamically propagated to the IBM HTTP web server plug-in

- An Oracle 10g database that is connected with non-transient data storage in a SAN

- A Sun Directory server that is connected with non-transient data storage in a SAN

This SSF image is transiently inserted into the OS image set shown in Figure 1.

However, this SSF image can be lifted from the current OS image set and inserted into an

entirely different compatible OS image set with a completely different network profile.
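
As a rough illustration of the dynamic clustering behavior in this example, the following Python sketch shows cluster membership changes being propagated to the web server plug-in. The names (WebServerPlugin, Cluster, scale), the node naming scheme and the load threshold are hypothetical and chosen only for this example; the real propagation mechanism is not described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class WebServerPlugin:
    """Hypothetical routing table held by the HTTP server plug-in."""
    routes: list = field(default_factory=list)


@dataclass
class Cluster:
    plugin: WebServerPlugin
    members: list = field(default_factory=list)

    def add_member(self, node: str) -> None:
        self.members.append(node)
        self._propagate()

    def remove_member(self, node: str) -> None:
        self.members.remove(node)
        self._propagate()

    def _propagate(self) -> None:
        # Push the current membership to the plug-in so routing stays current.
        self.plugin.routes = list(self.members)


def scale(cluster: Cluster, load_per_member: float, threshold: float) -> None:
    """Add one member when average load exceeds an assumed threshold."""
    if load_per_member > threshold:
        cluster.add_member(f"node{len(cluster.members) + 1}")


plugin = WebServerPlugin()
cluster = Cluster(plugin)
cluster.add_member("node1")
cluster.add_member("node2")
scale(cluster, load_per_member=0.9, threshold=0.75)
print(plugin.routes)  # ['node1', 'node2', 'node3']
```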


Figure 1 Self-Service Image


Comparative Analysis

BladeLogic and Tivoli Comparative Analysis

In non-virtualized infrastructure environments, the capability to create and run complex, distributed application infrastructure configurations from templates is available in the form of a persistent install of distributed components into specific OS images, using solutions such as those offered by BladeLogic or Tivoli. This approach can be characterized as a fast-provisioning approach, whereby a complex distributed configuration can be created fairly quickly from standardized configurations.

Under this approach, once the distributed application infrastructure configuration is installed into an OS image set, it is locked into that OS image set. Any changes made to this working configuration cannot be copied out of the underlying OS image set. If you want to create a copy of a working application infrastructure configuration, you have to start all over again. If you have to move the configuration to a new OS image set, you also have to start all over again.

In non-volatile, stable, production environments, this approach works reasonably well, but it is sub-optimal

for non-production environments, where change frequency is very high. This approach is also not the best

approach for production environments characterized by volatile application loads, because this approach

does not lend itself to dynamic clustering.

In summary, this approach offers limited flexibility for copying and migrating configurations within the application development life cycle.

VMware Lab Manager Comparative Analysis

Readers may spot many common concepts between SSF and what is offered by virtual infrastructure management tools, such as VMware Lab Manager. Therefore, the following questions need to be discussed:

- Why bother creating SSF images that are disconnected from OS images?

- Why not store the distributed application infrastructure configuration within virtual machines created from templates? In fact, VMware Lab Manager offers exactly such a capability.

The two questions raised above are central to understanding the motivation behind the unique approach

offered by DASM SSF and we offer answers to these questions below.

We first start by examining how VMware Lab Manager works. We have selected VMware Lab Manager

because it is the most mature product in its genre. Other products in this area are still being developed by

companies such as Microsoft and Citrix. The important point to note is that regardless of the vendor,

VMware Lab Manager is representative of an approach whereby application infrastructure configuration is

stored locked in with the OS image set.

VMware Lab Manager can quickly create distributed application infrastructure topologies from configurations stored in an image library. Each configuration in the image library comprises one or more machine templates, whereby each template contains a guest operating system and relevant application components. Before a configuration can be used to create a working topology, a copy of the configuration is checked out from the image library and deployed to create the working topology. To optimize storage, each change to the working copy is stored as a delta disk change from the base configuration image.

The VMware Lab Manager approach works reasonably well, at least for some specific use cases in labs (hence the name of the product), but consider the following limitations of this approach:

1. How do you take an existing working copy of a configuration within VMware Lab Manager and move it to a completely different environment that is outside VMware infrastructure? That is not possible, and the root cause of this issue is that the OS image set and the application infrastructure configuration are locked together. This creates problems during life-cycle stage management. By way of contrast, the SSF approach has no such limitation.

2. Imagine a situation where you need to perform benchmark testing on the configuration shown in Figure 1, and you want to be able to dynamically scale the cluster during your benchmark test to find the minimum number of cluster members needed to meet the baseline load on the application. How do you do that under VMware Lab Manager? You will need to manually clone and configure a new virtual machine into the working copy of the configuration. By way of contrast, the SSF approach offers dynamic clustering as an integral feature.

3. Imagine a situation where there are five configurations like the one shown in Figure 1, but they differ only in minor aspects of some security settings and performance settings within the Deployment Manager. Within VMware Lab Manager, you will need to create five different images within the configuration library, and each of the five configuration images will essentially be the same except for some minor differences within the Deployment Manager configuration. This will result in a proliferation of configurations within the image library, and the root cause of this proliferation can be directly attributed to the fact that VMware Lab Manager does not separate application components from the OS images.

4. Now imagine a situation where there are hundreds of working copies of various configurations in use within the Lab Manager environment and you have to apply a patch to an application platform binary distribution. This means you now have to use some facility to apply patches to possibly hundreds of virtual machines. By way of contrast, the SSF approach requires patching a single application platform binary distribution.
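
The third limitation can be made concrete with a small Python sketch. When configuration is kept separate from OS images, the five near-identical configurations can in principle be expressed as one base description plus five small sets of differences. The dictionary layout, setting names and the derive_image helper are assumptions made for illustration; they do not reflect the actual SSF template format.

```python
# Hypothetical base description of the Figure 1 configuration.
BASE_TEMPLATE = {
    "web_server": {"type": "ibm-http-server"},
    "deployment_manager": {"security": "standard", "jvm_heap_mb": 512},
    "cluster": {"members": 2},
}


def derive_image(base: dict, overrides: dict) -> dict:
    """Return a new configuration: base values with per-component overrides."""
    image = {component: dict(settings) for component, settings in base.items()}
    for component, settings in overrides.items():
        image.setdefault(component, {}).update(settings)
    return image


# Five variants that differ only in Deployment Manager settings.
variants = [
    {"deployment_manager": {"security": "strict"}},
    {"deployment_manager": {"jvm_heap_mb": 1024}},
    {"deployment_manager": {"security": "strict", "jvm_heap_mb": 1024}},
    {"deployment_manager": {"session_timeout_min": 15}},
    {"deployment_manager": {"security": "relaxed"}},
]

images = [derive_image(BASE_TEMPLATE, v) for v in variants]
for image in images:
    print(image["deployment_manager"])
```

Each variant stores only its differences, so the shared parts of the configuration exist once rather than five times.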

Before we leave this specific discussion, it is important to note that we are not minimizing the numerous benefits offered by VMware virtual infrastructure. In fact, SSF is completely integrated with VMware virtual infrastructure and leverages many virtualization benefits offered by VMware in creating and managing the OS image set.

Summary of Comparative Analysis

DASM SSF is unique in its approach of creating a complex, distributed application infrastructure configuration

image from a template and running the SSF image by transiently inserting the image into any relevant OS

image set.

The following table summarizes the comparative analysis.


Feature | DASM SSF | VMware Lab Manager | BladeLogic, Tivoli
1-Click creation of complex, distributed, application infrastructure configurations from templates | Yes | Yes | Yes, but with less automation, compared to other two approaches
Ability to run application infrastructure configuration image within any relevant virtual or physical infrastructure | Yes | No | No
Ability to copy an image from one environment to another environment | Yes | Yes, but all environments have to be based on VMware virtual infrastructure | No
Ability to save and restore multiple versions of a working image | Yes | Yes | No
Ability to convert a working image into a template | Yes | Yes | No
Ability to patch all images dependent on an application platform by patching a single application platform binary distribution | Yes | No | No
Ability to discard physical or virtual machines, while retaining a working image | Yes | Yes | No


Benefits

This brings us to one of the most important topics within this paper: What are the benefits of DASM SSF? In this section, we will discuss all the major DASM SSF benefits.

DASM SSF benefits are impacted by a number of business and IT drivers. We will discuss such drivers and note their impact on SSF benefits. As we noted at the outset of this paper, a given benefit may not be uniformly applicable across Production and Non-Production environments, so we will note the relative importance of a given benefit within Production and Non-Production environments.

DASM SSF benefits can be analyzed from an absolute standpoint in the context of a standalone SSF solution, or from a comparative standpoint in the context of other solutions. In this section, we will do both. We present the discussion on benefits in the table shown below.

Benefit | Business and IT Drivers | Relative Importance | Analysis

Benefit: Agility in creating complex, distributed application infrastructure configurations

Business and IT Drivers:
Key business drivers that directly impact this benefit are:
1. Rate of growth
2. Competitive landscape
Key IT drivers that directly impact this benefit are:
1. Number of IT applications
2. Rate of change in applications
3. Level of innovation

Relative Importance:
Non-Production: High. Non-production environments are characterized by a high level of stand-up and tear-down of complex configurations, and multiple copies of the same configurations.
Production: Medium. Production environments are relatively stable in terms of stand-up and tear-down of complex configurations. The exception to this general rule is that at the time of roll-out of new applications, the level of flux even in the production environment can be fairly high.

Analysis:
The faster the rate of growth, the more intense the competitive landscape, the greater the number of applications, and the higher the rate of change and level of innovation, the greater the level of this benefit. To quantify this benefit, two financial models are relevant:
1. NPV of accelerated earnings achieved through faster time to market
2. Opportunity cost of delays in time to market in the face of competition


Benefit: Lower infrastructure-related capital and operational expenditure

Business and IT Drivers:
Key business drivers that directly impact this benefit are:
1. Level of infrastructure sharing accepted by business practices
2. Pattern of IT resource consumption by business over a 24-hour period
3. Regulatory constraints on sharing infrastructure
4. Internal managerial cost accounting practices
Key IT drivers that directly impact this benefit are:
1. Current level of infrastructure spending
2. Current level of server consolidation
3. Current level of infrastructure virtualization
4. Ability to consolidate disparate application platforms

Relative Importance:
Non-Production: High. The need for repurposing infrastructure is very high, which creates many opportunities for re-use of infrastructure resources for multiple objectives.
Production: Medium. A certain base level consumption of resources is always needed, and opportunities for repurposing and reusing infrastructure resources in the context of multiple applications depend on the 24-hour pattern of resource consumption. Shared contingency resources for production environments and sharing use of DR environments with non-production use offer additional opportunities for deriving this benefit in production environments.

Analysis:
The more the business is willing to share IT infrastructure resources, especially for non-production environments, the higher the level of this benefit. The level of this benefit is incrementally lower when extensive server consolidation has already been achieved through conventional physical server consolidation, or through use of virtualized infrastructure resources. To quantify this benefit, three financial modeling approaches are possible.
1. In the absence of extensive server consolidation already in place, this benefit can be modeled as a conventional server consolidation exercise.
2. If extensive server consolidation is already in place, we can model this benefit in the context of selected application projects that are willing to share resources. To do so, we identify infrastructure


Benefit: Horizontal scaling to satisfy volatile demand

Business and IT Drivers:
Key business drivers that directly impact this benefit are:
1. Nature of load on business applications
2. Impact of volatile demand on business
Key IT drivers that impact this benefit are:
1. Can horizontal scaling of application platforms satisfy volatile demand?
2. Is overflow capacity available for horizontal scaling?

Relative Importance:
Production: High, if horizontal scaling of application platforms can satisfy volatile demand.
Non-Production: Low, except in benchmark testing.

Analysis:
If applications are architected such that application service levels can be satisfied through horizontal scaling of application platforms, then the level of this benefit is very high.

Benefit: Flexibility in the use of physical and virtual infrastructure resources, OS platforms and application platforms

Business and IT Drivers:
Key business drivers that directly impact this benefit are:
1. Need for innovation
Key technology drivers that directly impact this benefit are:
1. Ability to absorb IT innovation

Relative Importance:
Production: Low
Non-Production: High

Analysis:
Conventional solutions such as BladeLogic offer limited flexibility in being able to change infrastructure resources, change OS platforms and change application platforms. Innovation requires change, and this lack of flexibility can have huge opportunity costs.
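
The first financial model named in the table, NPV of accelerated earnings achieved through faster time to market, can be sketched as follows. All figures (planning horizon, monthly earnings, discount rate and launch months) are invented purely for illustration, and the function names are hypothetical; an actual model would use an organization's own forecasts.

```python
def npv(cash_flows, rate, first_period=1):
    """Net present value of cash flows received in consecutive periods."""
    return sum(cf / (1 + rate) ** (first_period + i)
               for i, cf in enumerate(cash_flows))


# Illustrative assumptions only.
HORIZON_MONTHS = 24          # planning horizon
MONTHLY_EARNINGS = 100_000.0 # earnings per month once the application is live
MONTHLY_RATE = 0.01          # monthly discount rate


def npv_for_launch(launch_month: int) -> float:
    """NPV of earnings from launch_month through the end of the horizon."""
    months_earning = HORIZON_MONTHS - launch_month + 1
    return npv([MONTHLY_EARNINGS] * months_earning, MONTHLY_RATE,
               first_period=launch_month)


baseline = npv_for_launch(7)      # launch in month 7 without faster provisioning
accelerated = npv_for_launch(4)   # launch in month 4, three months earlier
benefit = accelerated - baseline
print(f"NPV of accelerated earnings: {benefit:,.0f}")
```

Under these assumptions the benefit is simply the discounted value of the three extra months of earnings gained by launching earlier.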


System Architecture

The system architecture of SSF comprises the following important components:

- DASM Broker console, which acts as the centralized control center for the framework

- DASM Reporting Database, which stores important data collected by DASM

- Self-Service Portal, which provides the web-based front-end for SSF users

- Self-Service Repository, which stores all the SSF images created and used in the self-service framework. This is the master repository for all SSF images

- Physical and/or virtual servers: DASM and the Self-Service Framework can operate on legacy hardware

- For virtual deployments:

  a. VMware vCenter Server, which is used by DASM Broker to manage virtual Resource Pools and ESX server clusters. DASM Broker uses the vCenter Server to automatically create, power on, power off, shut down, and destroy any virtual machines it uses to run various SSF image components

  b. The virtual machines run in VMware virtual infrastructure Resource Pools, defined within ESX server clusters

  c. A standby pool of ESX servers, which is automatically drained and refilled by DASM Broker through the vCenter Server, as the demand for ESX server hosts fluctuates within the VMware Resource Pools managed by DASM Broker through vCenter Server

- For non-virtual deployments:

  a. A pool of physical machines running DASM Engines. These could be legacy x86 machines that are not properly configured for virtual environments, or non-x86 machine types
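
The draining and refilling of the standby ESX pool can be pictured with the following Python sketch. It models only the bookkeeping of host counts; the HostPools type and the rebalance function are hypothetical, and the sketch makes no calls to vCenter Server or to any DASM interface.

```python
from dataclasses import dataclass


@dataclass
class HostPools:
    active: int    # ESX hosts currently serving the Resource Pools
    standby: int   # ESX hosts held in the standby pool


def rebalance(pools: HostPools, hosts_needed: int) -> HostPools:
    """Move hosts between standby and active pools to track demand."""
    if hosts_needed > pools.active:
        # Demand rose: drain the standby pool, limited by what it holds.
        moved = min(hosts_needed - pools.active, pools.standby)
        return HostPools(pools.active + moved, pools.standby - moved)
    # Demand fell or is unchanged: return surplus hosts to the standby pool.
    released = pools.active - hosts_needed
    return HostPools(pools.active - released, pools.standby + released)


pools = HostPools(active=4, standby=2)
print(rebalance(pools, hosts_needed=5))   # one host drained from standby
print(rebalance(pools, hosts_needed=3))   # one host returned to standby
```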

The system architecture for SSF is shown in detail in Figure 2 below.


Roles and Responsibilities

Functional roles envisaged for the SSF operational model are described below:

System Administrator

- This role is responsible for installing and maintaining DASM Brokers and Engines within physical or virtual infrastructure. This role is responsible for installation and maintenance of operating systems on bare metal, or as guest operating systems in virtual machines, and for creation and maintenance of VMware templates containing DASM Engine software

- This role is responsible for integrating DASM Broker with VMware vCenter Server and, if necessary, existing workflow systems and datacenter reporting tools

- This role is responsible for security configuration of DASM Brokers. This includes authentication configuration based on an internal database or external LDAP directory, and optional SSL configuration of browser communications with the DASM Broker administrator console, programmatic client communication with Broker web services, communication between DASM Brokers and Engines, and communication between DASM Primary and Secondary Brokers

- This role is responsible for ensuring availability of DASM Broker and Engine instances and hosts, and will need to use appropriate monitoring tools and alerting mechanisms to ensure availability

- The skill sets required for this role are general system administration skills for relevant operating systems, and the ability to work with a web-based administrative console

- This role needs to be trained in the installation, maintenance, and system configuration of DASM Broker. This role does not need to be trained in any operational and functional aspects of DASM Broker

- The responsibilities for this role are unchanged across development, UAT, pre-production and production environments

IT Architect and Developer

This role defines the application templates offered under SSF services. This role has the following responsibilities:

- Interact with relevant application development teams and discover requirements for defining a new application template

- Specify the application infrastructure configuration using application template meta-data

- Specify the customizable attributes of the application template that require user input

- Specify the process workflow for orchestrating the startup and shutdown of the SSF image created from the application template

- Build from scratch, customize, or procure (from DataSynapse) the following DASM software packages:

  - DASM Distributions required by the application template component

  - DASM Containers required by the application template component

  - Application templates required to build distributed components of the application infrastructure configuration

- Configure default orchestration rules for startup and shutdown procedures of the SSF image defined by the application templates


- The skill sets required for this role are a combination of architect and developer skill sets that include Java, XML, the Eclipse IDE toolset, and an understanding of relevant application infrastructure platforms for which distributions, containers, and domains are needed

- This skill set requires a combination of architectural skills and hands-on capabilities. If needed, this role may be split into two separate roles: IT Architect and Application Developer

- It is possible to outsource this role to DataSynapse Consulting Services

- The responsibilities for this role are unchanged across development, UAT, pre-production and production environments

Framework Administrator

- A Framework Administrator is responsible for deploying all software artifacts to a DASM Broker

- A Framework Administrator is responsible for defining and administering all DASM Broker policies. This includes adding DASM domains to, and removing them from, the various policies

- A Framework Administrator is responsible for scheduling DASM Broker policies or manually activating and deactivating them, as needed

- A Framework Administrator is responsible for uploading application templates to the Self-Service Portal and assigning permissions to them

- A Framework Administrator is responsible for all administrative activities associated with the Self-Service Portal

- The skill sets required for this role are a combination of DASM Broker administration skills and general control management skills. This role requires an understanding of available infrastructure and an understanding of the resource needs of relevant applications

- The responsibilities for this role are unchanged across development, UAT, pre-production and production environments. Of course, the relevant policies defined across various environments would reflect the needs of the associated environments

Framework User

- A Framework User is anyone who is authorized to create and manage SSF images using the SSF web-based User Console

- No special skill set or training is required for this role


Operating Model

First we will enumerate the key elements of the operating model and then elaborate on each element. The

key elements of the operating model are as follows:

- New application templates are defined through an interaction between the SSF team and application development teams, and are offered within the Self-Service Portal

- Each application template can be used to create an SSF image, which may contain one or more distributed components

- After a new application template is ready for deployment within SSF, a capacity impact analysis is undertaken to estimate the additional capacity required to bring a new application into SSF. Details of the capacity planning model are beyond the scope of this document

- Application templates offered within SSF may be configured for instant provisioning or for administrative workflow processing

- Framework Users submit requests for various application templates through the Self-Service Portal User Console

- Requests for application images with automated workflow processing are processed automatically, and Framework Users get access to their requested SSF images in a matter of minutes

- Application image requests requiring approval are processed by Framework Administrators, and once processing is complete, Framework Users get access to their requested images

- To deploy applications, Framework Users directly interact with web-based administrative consoles or OS consoles of relevant distributed components

- Application images may be bound to an end date, and the application image resources can be released after the end date. The Framework User can extend the life span of an application image by requesting a new end date
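
The end-date element of the operating model can be illustrated with a short Python sketch. The ImageLease class and the releasable helper are hypothetical names introduced for this example; the paper does not describe how SSF tracks end dates internally.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ImageLease:
    """Hypothetical record tying an application image to its end date."""
    image_name: str
    end_date: date

    def is_expired(self, today: date) -> bool:
        # Resources can be released only after the end date has passed.
        return today > self.end_date

    def extend(self, new_end_date: date) -> None:
        if new_end_date <= self.end_date:
            raise ValueError("new end date must be later than the current one")
        self.end_date = new_end_date


def releasable(leases, today):
    """Names of images whose resources may be released."""
    return [lease.image_name for lease in leases if lease.is_expired(today)]


leases = [
    ImageLease("qa-cell", date(2009, 5, 1)),
    ImageLease("dev-cell", date(2009, 6, 30)),
]
leases[0].extend(date(2009, 7, 15))   # user requests a new end date
print(releasable(leases, today=date(2009, 7, 1)))  # ['dev-cell']
```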


New Application Template Definition Process

The key steps in the new application template definition process are shown in Figure 3 and listed below.

Figure 3 New Application Template Definition

1. Gather application infrastructure requirements from application development teams

2. Specify application infrastructure, including application infrastructure topology components and user inputs for customizing topology

3. Validate application infrastructure specification with application development teams. Decision: Template spec is valid? If NO, return to an earlier step; if YES, continue

4. Build templates using DataSynapse Studio

5. Test application infrastructure template in DataSynapse Studio. Decision: Template tests passed? If NO, return to an earlier step; if YES, continue

6. Build FabricServer Distributions and Containers required using DataSynapse Studio, or procure them from DataSynapse. Customize as needed

7. Test Distributions and Containers in FabricServer Broker. Decision: Distribution and Container tests passed? If NO, return to an earlier step; if YES, continue

8. Deploy application infrastructure templates in Self Service Portal, specifying either automated or non-automated process workflow

9. Application Template Complete


User Image Request Process

The key steps in the user image request process are shown in Figure 4 and listed below.

Figure 4 User Request

1. User logs on to the SSF portal, selects an offered SSF template, selects a target SSF runtime environment and creates a new SSF image

2. Decision: Do the requested SSF template and target environment allow automated processing? If YES, go to step 4. If NO, go to step 3

3. SSF Admin receives an email notification about the user request. Decision: Request approved by the SSF Admin? If YES, go to step 4. If NO, the user is informed through email about the rejected request, and the user request is complete

4. SSF image is created and deployed within the target environment and the requesting user is informed through email

5. User can access SSF image components and deploy applications within relevant components. User can start up, shut down, redeploy, capture, save captured versions and restore saved versions of the SSF image

6. User Request Complete
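
The decision logic of the user image request process can be sketched in Python as follows. The function name, the Outcome enumeration and the notification strings are assumptions made for illustration; they describe the flow in Figure 4, not the portal's actual implementation.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    DEPLOYED = "deployed"
    REJECTED = "rejected"


def process_request(automated: bool, admin_approves: Optional[bool] = None):
    """Return (outcome, notifications) for one image request."""
    notifications = []
    if not automated:
        # Non-automated requests go to the SSF Admin for a decision.
        notifications.append("email to SSF Admin: new request")
        if admin_approves is None:
            raise ValueError("admin decision required for non-automated requests")
        if not admin_approves:
            notifications.append("email to user: request rejected")
            return Outcome.REJECTED, notifications
    # Automated or approved: create and deploy the image, then notify the user.
    notifications.append("email to user: image created and deployed")
    return Outcome.DEPLOYED, notifications


print(process_request(automated=True))
print(process_request(automated=False, admin_approves=False))
```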


Capture, Save and Restore Application Images

Once a user is informed that a new application image has been created and deployed, the user can start up the image. Once the application image is running within the SSF environment, the user can access the web-based consoles or OS consoles of the relevant image components and deploy business logic.

Once the configuration of the application components is complete, the application image can be captured and optionally saved into the SSF repository. Any saved version of an image within the SSF repository can be restored to become the current version within the SSF runtime environment.
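As a rough illustration of the capture, save, and restore lifecycle, the following Python sketch models an image as a configuration dictionary. The class `ApplicationImage`, its attributes, and the version labels are assumptions made for this example; they do not describe the actual SSF repository or its interfaces.

```python
import copy


class ApplicationImage:
    """Hypothetical model of an image with a current, a captured, and saved versions."""

    def __init__(self, config):
        self.current = dict(config)   # version active in the runtime environment
        self.captured = None          # most recent capture, not yet saved
        self.repository = {}          # saved versions, keyed by a version label

    def capture(self):
        # Snapshot the current configuration.
        self.captured = copy.deepcopy(self.current)

    def save(self, label):
        # Saving is optional, but only a captured version can be saved.
        if self.captured is None:
            raise ValueError("capture the image before saving")
        self.repository[label] = copy.deepcopy(self.captured)

    def restore(self, label):
        # Make a saved version the current version again.
        self.current = copy.deepcopy(self.repository[label])
```

A typical sequence would be `capture()` followed by `save("baseline")` after configuration is complete, and `restore("baseline")` later to roll the current version back.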

Service Levels

Many service level attributes are favorably impacted by SSF, and none is negatively impacted. The service level attributes most significantly affected by the SSF operating model are discussed below.

Mean Recovery Time for SSF Image

SSF offers no explicit service level objectives around mean recovery time for an image. Barring capacity constraints, however, the mean recovery time is expected to be consistent with or better than current times. This is because complete recovery of an application image is the same as a shutdown and restart of the image, the mean time for which is generally well known within IT.

It is important not to confuse this mean recovery time with high availability capabilities at the physical infrastructure level, such as VMware HA Cluster or VMware VMotion. Any capability that prevents hardware failure is orthogonal to this mean recovery time, which starts only after a hardware or software failure has actually occurred and recovery is needed.

Mean Planned Down Time

SSF offers no explicit service level related to mean planned down time. However, SSF favorably impacts many planned down time activities, such as maintenance and upgrade of the application infrastructure topology, and is therefore expected to reduce mean planned down time.

Key Performance Indicator Thresholds

One of the key service level objectives offered by SSF applications is KPI measurement for selected application infrastructure components, together with the specification of threshold activation rules. These rules can be used with all SSF application images, whether or not the components are clustered.

When threshold activation rules are used with clustered components, they enable dynamic clustering that automatically stabilizes KPIs around an optimal level. Such KPI-driven dynamic clustering may directly impact the responsiveness of applications hosted within the application infrastructure. The details of this service level are always application specific.
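To make the idea of KPI-driven dynamic clustering concrete, here is a minimal Python sketch of a threshold activation rule. It assumes a simple two-threshold rule that grows or shrinks a cluster by one instance at a time; the names `ThresholdRule` and `next_cluster_size` and all field names are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    # Hypothetical rule: keep the KPI between the two thresholds.
    kpi: str
    scale_up_above: float
    scale_down_below: float
    min_size: int
    max_size: int


def next_cluster_size(rule, kpi_value, current_size):
    """Return the cluster size suggested by one KPI measurement."""
    if kpi_value > rule.scale_up_above:
        # KPI too high: add an instance, up to the configured maximum.
        return min(current_size + 1, rule.max_size)
    if kpi_value < rule.scale_down_below:
        # KPI comfortably low: release an instance, down to the minimum.
        return max(current_size - 1, rule.min_size)
    # KPI within the band: leave the cluster unchanged.
    return current_size
```

Applying such a rule repeatedly to successive measurements keeps the KPI within the band between the two thresholds, which is one simple way to read "stabilization around an optimal level". The threshold values themselves would be application specific.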


Capacity Planning Model

SSF needs adequate capacity to deliver its offered application infrastructure. This requires a capacity planning model that helps SSF deliver its application infrastructure in a manner that is economically efficient, yet delivers adequate service levels to achieve business objectives.

Capacity is measured in CPU, memory, shared storage, and network bandwidth. The essential questions for any capacity model are as follows:

How much capacity do I need at this instant, say time t0, to run the applications in scope at a level of throughput and response time that meets business objectives at this instant?

How much capacity do I need at some future time t1 > t0 to run the applications in scope at a level of throughput and response time that will meet business objectives at time t1?

One conventional approach to capacity planning is to measure historical trends and project them into the future. This works well over short periods, say weeks or months, but poorly over longer periods, say years, because trends last only for a period of time and then change unpredictably.

The approach of the SSF capacity planning model can be summarized as evidence-based incremental capacity expansion. The details of this model are as follows:

In the SSF capacity model, the main objective is not to add capacity early, but to add it just in time and only when SSF provides objective evidence that more capacity is needed

The SSF capacity model assumes an environment that includes VMware virtual infrastructure and other virtual and physical infrastructure, but in particular relies on a Standby pool of VMware ESX servers to implement an elastic capacity planning model

The SSF capacity model assumes a very quick procurement process. Ideally, one should be able to procure a new VMware ESX server resource within 5 to 7 business days

The SSF capacity model recommends starting out with a small Standby pool and growing capacity in the smallest economically feasible chunks. For example, a recommended chunk for adding CPU and memory capacity is an ESX server with 4 Dual Core CPUs and 8 GB memory. The recommended shared storage increment is 1 Terabyte

In the SSF capacity model, the recommended initial capacity of the Standby VMware ESX server pool is four machines, each a 4 Dual Core CPU machine with 8 GB memory. The recommended initial storage is 1 TB

Of these 4 machines, 2 should be online and 2 should be in the Standby ESX server pool (see Figure 2)

Each SSF product should track relevant KPIs and have thresholds associated with these KPIs, even when no dynamic clustering is needed. Violations of KPI thresholds are captured in the DASM reporting database and offer objective evidence of actual application throughput and response times

The SSF system architecture will automatically drain and fill the Standby pool as needed. If the Standby pool is drained down to 1 machine or fewer at least 30 percent of the time over a week, it is time to procure 1 more machine

If the Standby pool is drained down to 0 machines for at least 50 percent of the time over a week, it is time to procure 1 more machine and add it to the Standby pool

If the Standby pool has more than 2 machines at least 50 percent of the time over a week, it is time to remove 1 machine from the Standby pool, as long as a minimum of 2 machines is left in the Standby pool


If any topology component shows KPI threshold violations that are not automatically resolved, consider adjusting the cluster size for clustered topology components and then analyze the impact on the Standby pool
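The Standby pool rules above can be expressed as a short Python sketch. It assumes the pool size is sampled at regular intervals over one week, that the last sample reflects the current pool size, and that a procurement signal takes precedence over a removal signal when both apply; these assumptions and the function name `recommend_pool_action` are illustrative and not part of the SSF product.

```python
def recommend_pool_action(samples):
    """Return "procure", "remove", or "hold" from one week of Standby pool sizes.

    samples: pool sizes (number of standby machines) observed at regular
    intervals over the week, oldest first.
    """
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    empty = sum(1 for s in samples if s == 0)   # pool fully drained
    low = sum(1 for s in samples if s <= 1)     # pool at 1 machine or fewer
    high = sum(1 for s in samples if s > 2)     # pool above 2 machines

    # Integer comparisons avoid floating-point rounding at the exact thresholds.
    if empty * 100 >= 50 * n or low * 100 >= 30 * n:
        return "procure"
    # Remove a machine only if at least 2 remain in the pool afterwards.
    if high * 100 >= 50 * n and samples[-1] - 1 >= 2:
        return "remove"
    return "hold"
```

Note that a pool drained to 0 machines at least 50 percent of the time also satisfies the "1 machine or fewer at least 30 percent of the time" rule, so both rules lead to the same recommendation in this sketch.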