Managing Your Cloud with Confidence - Mark Rivington, n•fluence 2012

MANAGING YOUR CLOUD WITH CONFIDENCE
MARK RIVINGTON, CTO, NIMSOFT (Final-May1)

Agenda

1. HOW DID IT COME TO THIS?
2. PARTICULAR MANAGEMENT CHALLENGES IN THE NEW WORLD OF CLOUD COMPUTING
3. SPECIFIC APPROACHES TO MANAGING THE DATACENTRE PLUS CLOUD AND CLOUD SERVICES
4. FOCUSED ON MONITORING AND PERFORMANCE MANAGEMENT
5. WITH REFERENCE TO REAL WORLD CUSTOMER EXAMPLES

@mrivingt | #nfluence

How did IT come to this?

[Diagram: Public and Private Clouds. Private (Internal) and Public (External) clouds offering IaaS, SaaS and PaaS, alongside Datacenters. Labels shown: vCloud Director, NRE Coalition, HP BladeSystem Matrix.]

The cloud effect on IT

Traditional systems management is based on complete control of all components and resources

The physical datacenter embodied this principle of control

The cloud dissipates the datacenter and disseminates control beyond organizational boundaries

Now the “datacenter” is a heterogeneous mix of disparate computing environments

Controlling across cloud boundaries is the challenge

Unified monitoring is a major goal

[Diagram: Public and Private Clouds. Private (Internal) and Public (External) clouds offering IaaS, SaaS and PaaS, alongside Datacenters. Labels shown: vCloud Director, NRE Coalition, HP BladeSystem Matrix.]

Cloud layers

“Abstraction”

Application

Private Virtual Data Center

Virtualized Infrastructure

Physical Infrastructure and Components

“Virtualization”

Cloud layers – Depth of vision

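[Diagram: cloud layers (Application, Private Virtual Data Center, Virtualized Infrastructure, Physical Infrastructure and Components), with the labels “Abstraction” and “Virtualization” and markers 1, 2 and 3 indicating three depths of vision.]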

Depth of vision

Public Cloud
– SaaS and PaaS
– IaaS
– For the benefit of the consumer
– For the service provider themselves

Private Cloud
– In the traditional private datacenter
– Provided by service providers

DataCenter
– Full Visibility

[Diagram: Private (Internal) and Public (External) clouds (IaaS, SaaS, PaaS) and the datacenter, marked with depth-of-vision levels 1, 2 and 3.]

Monitoring SaaS / PaaS services

In-depth visibility into the performance, availability and status of your instances

SaaS and PaaS
– URL and web service response
– End user experience – passive and synthetic
– Transaction performance counters: # transactions, latency, service time
– Analysis and predictive reporting
– Subscription status
– SLA measurement and reporting

SaaS – You don’t “own” the application
– Specific SaaS application APIs

PaaS – You do “own” the application
– Application instrumentation
– Application frameworks generally expose specific performance metrics

[Examples shown: Backup as a Service, force.com]
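As a rough illustration of "SLA measurement and reporting" built on synthetic checks, the sketch below turns a list of check results into an availability figure. The function name, the result fields and the thresholds are all assumptions made for this example; they are not taken from any Nimsoft product.

```python
def sla_report(checks, target_availability_pct, max_latency_ms):
    """Summarise synthetic check results against an SLA target.

    `checks` is an iterable of (ok, latency_ms) pairs (a hypothetical shape).
    A check counts as "good" only if it succeeded AND answered within
    `max_latency_ms`, so a slow service is treated like an unavailable one.
    """
    checks = list(checks)
    if not checks:
        raise ValueError("no checks to report on")
    good = sum(1 for ok, latency in checks if ok and latency <= max_latency_ms)
    availability = 100.0 * good / len(checks)
    return {
        "checks": len(checks),
        "availability_pct": availability,
        "sla_met": availability >= target_availability_pct,
    }


# Four made-up synthetic checks: two good, one failed, one too slow.
sample_checks = [(True, 120.0), (True, 480.0), (False, 0.0), (True, 650.0)]
report = sla_report(sample_checks, target_availability_pct=99.9, max_latency_ms=500.0)
```

Counting slow responses as failures is a design choice for the sketch; a real SLA would define its own rules.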

Monitoring IaaS infrastructure (consumer)

Exposed by Cloud APIs:
– Virtual server instances: Network, CPU, Storage details (read/write)
– Additional global IaaS offering performance
– Server start up times
– Availability of servers / instance types – by location
– Usage data

You’ll need more, just as a datacenter:
– Detailed Server Monitoring
– Application – Exchange, SharePoint, AD, Notes, DB, etc.
– Web Server – IIS, Apache, WebSphere, WebLogic etc.
– Multi-tier web application views
– End user experience and transactions
– Plus workflow, automation, usage metering, integration with Service Desk, CMDB…

Monitoring must behave well in the cloud

Zero touch configuration and deployment of monitoring for new instances

Registration and graceful de-registration of agents

Monitoring policies obtained at instantiation time (no stale images)

Connect to management server and begin reporting

Connect securely back to data centers if they exist

[Diagram: a Server Instance exchanges Register, Policy, Report and De-register messages with a Cloud Hub, which links to the Data Centers.]
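The register, policy, report and de-register lifecycle can be sketched in a few lines. This is a minimal in-memory model. The `CloudHub` and `Agent` classes, their methods and the policy format are hypothetical names chosen for illustration, not a real agent API.

```python
from dataclasses import dataclass, field


@dataclass
class CloudHub:
    """Hypothetical in-memory stand-in for the management server ("Cloud Hub")."""
    policies: dict                                  # role -> monitoring policy (assumed shape)
    agents: dict = field(default_factory=dict)      # instance id -> role
    reports: list = field(default_factory=list)     # (instance id, metrics) tuples

    def register(self, instance_id, role):
        # The policy is looked up at registration time, so it is never
        # baked into a (possibly stale) machine image.
        policy = self.policies[role]
        self.agents[instance_id] = role
        return policy

    def report(self, instance_id, metrics):
        if instance_id not in self.agents:
            raise KeyError(f"unregistered agent: {instance_id}")
        self.reports.append((instance_id, metrics))

    def deregister(self, instance_id):
        # Graceful de-registration: removing an unknown agent is a no-op.
        self.agents.pop(instance_id, None)


class Agent:
    """Zero-touch agent: it needs only its instance id, its role and a hub."""

    def __init__(self, instance_id, role, hub):
        self.instance_id, self.role, self.hub = instance_id, role, hub
        self.policy = None

    def __enter__(self):
        self.policy = self.hub.register(self.instance_id, self.role)
        return self

    def __exit__(self, exc_type, exc, tb):
        # De-register even if the instance is shutting down because of an error.
        self.hub.deregister(self.instance_id)
        return False  # do not swallow exceptions

    def report(self, sample):
        # Send only the metrics that the policy asks for.
        wanted = {k: v for k, v in sample.items() if k in self.policy["metrics"]}
        self.hub.report(self.instance_id, wanted)


hub = CloudHub(policies={"web": {"metrics": ["cpu", "latency_ms"]}})
with Agent("i-001", "web", hub) as agent:
    agent.report({"cpu": 42.0, "latency_ms": 120, "disk": 0.7})
```

Modelling the agent as a context manager ties de-registration to instance shutdown, which is one way to avoid orphaned entries on the hub.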

A model for IaaS monitoring

[Diagram: IaaS Cloud Monitoring Architecture, 12/14/2009, Mark Rivington. Components: Monitoring, Data Aggregation, Visualization, Provisioning, Service Desk, Service Reporting, Configuration Management, Portal Integration. Flows and labels: Performance Data, Service Data, Self Service Dashboard, Reports, Launch/Terminate, Dynamic State, “Surge Computing” Automation, Incidents, Reporting, Performance Aggregation, Service Views, Direct Visualization, Monitoring Policy.]

A customer example of active management

– Brand name consumer media streaming company

– Highly asymmetric workloads and user demands

– Heavily utilized datacenters

– Capital intensive datacenter costs

– Early users of Public IaaS

– Shift to Operational Expense

– Used monitoring to determine and control overspill into the cloud

Configuration data
– Defines the SQL query that measures the load (single value), e.g. select average(cpu) from server where server name like 'CDN%'
– Sets the low and high thresholds for the value, e.g. 60 and 80.
– Defines the actions (launch or terminate) that occur below the “low” and above the “high” thresholds.
– Sets the number of instances to be “launched” or “terminated”
– Sets the image name to be launched
– Provides any other parameters needed for launching or terminating instances.

Notes
The ‘Thermostat’ process implements the logic of the system and the ‘Cloud Control’ process (CCP) interfaces with the specific cloud services.

The Thermostat requests functions from the CCP via NMS call-back functions.

There are delays configured within the system to prevent repeated requests for launch or terminate.

[Diagram: the Thermostat Process selects NMS performance data and reads the configuration data. While the value is within range it loops after a delay. When the value is out of range (“Not Enough” or “Too many”) it sends launch/terminate requests to the Cloud Control Process, which launches or terminates instances through cloud provider specific “plug-ins”: Amazon, Rackspace, Savvis, (other cloud providers).]
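The thermostat logic described above can be sketched as a small control step. This is an illustrative reconstruction under assumptions, not the customer's implementation. The class names, the `measure` and `cloud_control` callables (stand-ins for the NMS query and the Cloud Control Process) and the image name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ThermostatConfig:
    query: str               # SQL query returning a single load value
    low: float               # low threshold, e.g. 60
    high: float              # high threshold, e.g. 80
    below_low_action: str    # "launch" or "terminate"
    above_high_action: str   # "launch" or "terminate"
    instance_count: int      # instances to launch or terminate per request
    image_name: str          # image to launch (hypothetical name below)
    cooldown_s: float        # delay that prevents repeated requests


class Thermostat:
    def __init__(self, config: ThermostatConfig,
                 measure: Callable[[str], float],
                 cloud_control: Callable[[str, int, str], None]):
        self.config = config
        self.measure = measure              # stand-in for the NMS performance query
        self.cloud_control = cloud_control  # stand-in for the Cloud Control Process
        self._last_request_at: Optional[float] = None

    def step(self, now: float) -> Optional[str]:
        """Run one loop iteration; return the action requested, if any."""
        cfg = self.config
        value = self.measure(cfg.query)
        if value < cfg.low:
            action = cfg.below_low_action
        elif value > cfg.high:
            action = cfg.above_high_action
        else:
            return None  # within range: nothing to do until the next loop
        # Out of range, but still inside the configured delay: do not repeat.
        if (self._last_request_at is not None
                and now - self._last_request_at < cfg.cooldown_s):
            return None
        self.cloud_control(action, cfg.instance_count, cfg.image_name)
        self._last_request_at = now
        return action


# Usage with canned load values in place of a real database and cloud provider.
requests = []
loads = iter([70.0, 85.0, 90.0, 50.0])
config = ThermostatConfig(
    query="select average(cpu) from server where server name like 'CDN%'",
    low=60, high=80,
    below_low_action="terminate", above_high_action="launch",
    instance_count=2, image_name="cdn-edge-image", cooldown_s=300,
)
thermostat = Thermostat(
    config,
    measure=lambda query: next(loads),
    cloud_control=lambda action, count, image: requests.append((action, count, image)),
)
results = [thermostat.step(now=t) for t in (0, 60, 120, 400)]
```

Passing `now` in explicitly keeps the step testable; a real loop would presumably sleep between iterations and read the clock itself.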

Monitoring IaaS infrastructure (service provider)

– Key requirement is to offer self service monitoring of cloud instances to the consumer

– Graduated levels of monitoring service with appropriate pricing

– Monitoring must be driven through provisioning

– Multi-tenancy and scalability are vital

– Performance and availability data must be accessible through CSP Portal

– Direct data access or portal to portal integration

[Diagram: a Master Cloud View over Client A to Client F data, with separate Customer A, Customer B and Customer C views. Also labelled: SLM DS.]
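The multi-tenant idea (each customer sees only their own data, while the provider sees everything) can be sketched as two functions over the same sample store. The sample records and function names are invented for illustration; a real service would enforce tenancy in its data layer, not in a list comprehension.

```python
from collections import defaultdict
from statistics import mean

# Made-up samples; the "tenant" field is the isolation key in this sketch.
SAMPLES = [
    {"tenant": "A", "instance": "a-1", "cpu": 40.0},
    {"tenant": "A", "instance": "a-2", "cpu": 60.0},
    {"tenant": "B", "instance": "b-1", "cpu": 90.0},
]


def customer_view(samples, tenant):
    """Return only the samples that belong to one tenant."""
    return [s for s in samples if s["tenant"] == tenant]


def master_view(samples):
    """Provider-side view: average CPU per tenant across all data."""
    by_tenant = defaultdict(list)
    for s in samples:
        by_tenant[s["tenant"]].append(s["cpu"])
    return {tenant: mean(values) for tenant, values in by_tenant.items()}


view_a = customer_view(SAMPLES, "A")
overview = master_view(SAMPLES)
```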

Provisioning drives monitoring

– It is all about the APIs

– Templated (e.g. good, better, best) monitoring policies deployed at instantiation

– Modifiable through specific API calls

– Driven entirely through external automation or provisioning system

[Diagram: monitoring platform stack. Layers: Presentation Information (Reporting, Dashboards, Portals and Widgets); SLA and Business Service Mgmt; Correlation and Root Cause Analysis; Performance & Availability; Event and Alarm Management; Workflow Automation. Monitored domains: Datacenter, Virtualization, End User Experience, Cloud and SaaS, Power and Facilities, Custom. Also labelled: API.]
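A minimal sketch of "provisioning drives monitoring" follows, assuming a provisioning system that calls hooks when instances come and go. The template contents, the `MonitoringAPI` class and its method names are hypothetical; only the good/better/best tiers come from the slide.

```python
import copy

# Hypothetical tier templates; intervals and metric names are illustrative.
POLICY_TEMPLATES = {
    "good":   {"interval_s": 300, "metrics": ["availability"]},
    "better": {"interval_s": 60,  "metrics": ["availability", "cpu", "memory"]},
    "best":   {"interval_s": 30,  "metrics": ["availability", "cpu", "memory",
                                              "disk", "url_response"]},
}


class MonitoringAPI:
    """Stand-in for a monitoring service driven entirely by external automation."""

    def __init__(self):
        self._policies = {}

    def on_instance_provisioned(self, instance_id, tier):
        # Deploy the tier's template at instantiation. The deep copy keeps
        # later per-instance changes from leaking back into the template.
        self._policies[instance_id] = copy.deepcopy(POLICY_TEMPLATES[tier])
        return self._policies[instance_id]

    def update_policy(self, instance_id, **changes):
        # Policies stay modifiable afterwards through explicit API calls.
        self._policies[instance_id].update(changes)
        return self._policies[instance_id]

    def on_instance_terminated(self, instance_id):
        self._policies.pop(instance_id, None)

    def policy_for(self, instance_id):
        return self._policies.get(instance_id)


api = MonitoringAPI()
api.on_instance_provisioned("vm-1", "better")
api.update_policy("vm-1", interval_s=15)
api.on_instance_provisioned("vm-2", "good")
api.on_instance_terminated("vm-2")
```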

SoftLayer as an example

Private cloud

vCloud Director

NRE Coalition

HP Bladesystem Matrix

Effectively a combination of consumer and service provider IaaS monitoring requirements

Plus classical datacenter monitoring for internal private cloud infrastructure

Need to support specific branded infrastructure stack solutions, e.g. VCE Vblock

Vblock specific monitoring (as an example)

Discovery and Deployment
– Auto-discovery, auto-monitoring, pre-built templates

Operational
– Under usage, over-commitment identification
– Vblock root cause analysis

Chassis
– Monitoring of all aspects of the rack
– Compute: Cisco UCS blades and elements
– Storage: EMC’s CLARiiON™, Symmetrix™ and Celerra
– Networking and interconnects: Cisco routers, SAN switches and Nexus™ soft switches

Visualization of the whole stack is key

A customer example of private cloud monitoring

– Global Investment Bank

– Long term user of “other 3” systems management suite

– Moving from physical to virtual to private cloud

– Transformation from 6 weeks to 6 minutes in terms of server delivery

– Needed a more flexible monitoring solution

– Key was integration with new configuration management application

– Self Service monitoring is vital to private cloud

– Currently has over 28,000 servers under management and is still growing

Review of cloud types and monitoring requirements

Requirements compared across cloud types (columns: IaaS, PaaS, SaaS, Private):

– Self Service
– Integration with config/provisioning/etc.
– Zero Touch Monitoring Activation
– Very high scalability
– Dynamic Registration
– Data Aggregation and Reporting
– Multi-Tenancy (Varies)
– End User Monitoring: Synthetic Transactions, Real User Monitoring
– URL and Web Service Response Monitoring (Varies)
– Application Specific Instrumentation: Application dashboards, URL data gathering, App. specific metrics (Varies)
– SLA reporting for customers (Varies)
– Compliance SLA/SLO reporting on business impact (Varies)
– Integration with existing Datacenter monitoring