VMworld Europe 2013. Thirumalesh Reddy, VMware; Venkat Gopalakrishnan, VMware. Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare
Moving Enterprise Application Dev/Test
to VMware’s Internal Private Cloud
Architecture and Operations Transformation
Thirumalesh Reddy, VMware
Venkat Gopalakrishnan, VMware
OPT6594
#OPT6594
Executive Summary
Key Lesson Learned
Invest in Agility, and Service Quality and Cost will Improve
AppOps Team: Deploys integrated, complex SDLC instances to support 600 developers.
Challenge: Process is manual, siloed, slow, unreliable. Reduces developer efficiency. Increases risk.
Two Fundamentally Different Options
1. Fix the “human middleware” on traditional infrastructure
2. Replace and automate on private cloud SDDC
Results From Choice to Replace and Automate on SDDC
Process time – dropped from 4 weeks to 36 hours
Developer productivity – increased 20% or more
Project schedule risk - eliminated
Infrastructure and operating costs - reduced by $6M annually
Agenda
Business Case
Options
Solution, Architecture and Implementation
Transforming Operations
Results
Key Takeaways
Corporate IT Application Group
Manage portfolio of enterprise applications used by global business functions
AppOps team: 27 engineers
Customer: 600 developers
Role: Provision 16 different dev/test instances that include 80+ app components.
Infrastructure footprint: ~4,000 non-production VMs, ~500 production VMs
Enterprise Application Portfolio
SaaS 65
IT tools 50
Business 100
Total 215
AppOps Provisions Environments Across SDLC
Support 30 to 50 major development projects per year
Team of 27 engineers manually builds each SDLC instance
Each project needs an SDLC instance multiple times
Enterprise App Development Project - 9 months
20 Major Steps
3 to 5 Weeks in Traditional Virtualized Environment
Request for SDLC Instance
Infrastructure Verification
Hardware Setup
Build VMs – new or clone
DNS Entries
Install, Setup, Configure Workload
Database Refresh
Latest Code Deployment
Load Balancer Entries
Web Server Configuration
Firewall Changes
External Interface & Integration
PPM Tasks
Workload Monitoring Setup
Security – VM access control
Functional Testing
Environmental Testing
Environments: Production, Dev, Test, UAT, Stage, Load Test
Human Middleware Problem – Customer View
Variable Quality: Variations in calendar and service quality
Schedule Risk: Late projects cause domino effect with constrained resources
Unpredictable: AppOps may say “No” to some requests
Disruptive: Developers must work around 3+ week gaps up to 5x per project
“I can’t develop”
“I can’t test”
“My project is late”
“I’m waiting for the software I need to run my business…”
Human Middleware Problem – AppOps team view
Global Team Management: Project manage around PTO, holidays, variable skills
Capacity Constrained: Only 4-6 projects in parallel
Slow and Error Prone: Many manual steps. Ticketing systems. Human error.
Handoffs: Silos. Globally distributed teams. Multiple application experts.
100% Task Automation - Not Going to Meet Needs
Request
Infrastructure Verification
Hardware Setup
Build VMs – New or Clone
DNS Entries
Install, Setup, Configure Workload
Database Refresh
Latest Code Deployment
Load Balancer Entries
Web Server Configuration
Firewall Changes
External Interface & Integration
PPM Tasks
Workload Monitoring Setup
Security – VM access control
Testing
1–2 days; 3–5 days; 2–4 weeks; 3–5 days
1–2 days; 4–7 days; 2–3 days; 2–5 days
2–5 days; 1–2 days; 2–4 days; 1–2 days
3–7 days; 2–3 days; 1 day; 5–6 days
Legend: Task time, Wait time
Opportunity – Replace with Automation
Original time for initial request: 1–2 days
Entire process after automation: 1–2 days
Two Fundamentally Different Options
Option 1: Fix the “human middleware” on traditional infrastructure
Option 2: Replace and Automate end-to-end provisioning on SDDC Private Cloud
Phased Project Approach
Phase 1 - Completed
Deploy automation and management capabilities
Create 5+1 vDCs
Blueprints for 80+ applications
Service catalog with 16 instances
Transition 2,800 VMs - Dev, Test, UAT
Key Milestone – 4 months
• 1st automated instance @ 172 hours
Phase 2 - H1 2014
Expand service profiles – using expanded virtual network and storage in IaaS
Financial transformation – chargeback
Advanced analytics, performance management
Transition 1,200 VMs - Stage, Load Test
Environments: Production, Dev, Test, UAT, Stage, Load Test
© 2009 VMware Inc. All rights reserved
Cloud Automation and Management
Architecture and Implementation
Thirumalesh Reddy, Sr. Director VMware
Two Project Goals
Transition to Private Cloud – 4,000 VMs
Automate the Process – 24 hours
Key Dependency: Need SDDC to automate the process
Project “OneCloud” - Explosive Tenant Growth
Corp IT AppOps = Tenant #4
Very low cost per VM
“Cloud first” policy in IT
Tenants:
AppOps SDLC provisioning
Hands On Labs Hol.vmware.com
Services & Support Customer environment reproduction
Sales Engineering Demo Pods
VMworld 2013
Management BU Field Testing
TechSummit 2013
Tech Ops Mini R&D Cloud
Training LiveFire
Private Cloud IaaS Software Defined Data Center
Timeline:
June 2012: Launched, built on vCloud Suite
Jan. 2013: 4 tenants, 10,000 VMs
Today: 9 tenants, 38,000 VMs
End 2013: 12 tenants, 50,000 VMs
2014: More services
Private Cloud IaaS Software Defined Data Center
Bring Your Own - Application Ops
Three-tier Ops Model
Different Tenants
Different Application Ops
Application Ops (Provided by Tenant)
Now an infrastructure service consumer. Provisioning.
Monitoring. Configuring. Upgrades. Maintenance.
Many typical ops tasks still required.
Infrastructure Ops (Provided by OneCloud infrastructure team)
Network, storage, compute availability. Deliver to SLA.
Tenant/Service Ops (Provided by OneCloud service team)
Common service definitions, SLA, tenant onboarding, tenant management
Bring Your Own - Cloud Automation and Management
Tenant - needs different service levels, automation and management capabilities
IaaS - needs automation and management capabilities
Service Manager: Decides what goes in service catalog.
Service Catalog: Mechanism to request service.
Policy: Logic used to guide automation.
Cloud Automation and Management: Manage workloads and underlying services.
Tenant 1, Tenant 2, Tenant 3
Private Cloud IaaS Software Defined Data Center
Phased Project Approach
Q3 – Q4 12 (Done)
• Cloud Automation
• Dev, Test instances
• Policy based provisioning
• Storage Mgmt.
• 30+ App Blueprints
• 500+ VMs
• Non-prod Environments
• One Cloud
Capabilities: Management; Cloud Virtual Infrastructure; Provisioning Automation; Cloud Storage Virtualization
Q1 – Q2 13 (Done)
• Cloud Automation & Management
• Dev, Test, UAT instances
• Scaling/Upgrades
• Policy based provisioning
• Storage Mgmt.
• Security Mgmt.
• Monitoring & Analytics
• VM Asset Mgmt.
• 50+ App Blueprints
• 2500+ VMs
• Non-prod Environments
• One Cloud
• VCHS IaaS Validation
Capabilities: Management; Cloud Virtual Infrastructure; Provisioning/Scaling/Upgrade Automation; Monitoring; Cloud Storage Virtualization; Cloud Security; Service Catalog
Q3 – Q4 13 (In-progress)
• Stage and load test instances
• Service catalog
• Performance management
• Network, Storage & Security Virtualization
• 80+ App Blueprints
• 3500+ VMs
Capabilities: Management; Cloud Virtual Infrastructure; Monitoring; Cloud Storage Virtualization; Cloud Network Virtualization; Cloud Security; Provisioning/Scaling/Upgrade Automation; Performance Mgmt; Service Catalog
2014
• Cloud Automation Scaling/Upgrades
• Cloud Storage & Network Mgmt. & Scaling
• Cloud Security Mgmt.
• Cloud Performance Mgmt.
• Usage & Charge-back
• Analytic and Correlations
• 100+ App Blueprints
• 4500+ VMs
• All Non-prod & Prod Environments
Capabilities: Cloud Virtual Infrastructure; Management; Monitoring; Cloud Storage Virtualization; Cloud Network Virtualization; Cloud Security; Provisioning/Scaling/Upgrade Automation; Performance Management; Big Data Operational and Biz Analytic and Correlations; Service Catalog
Automation and Management – Based on vCloud
[Architecture diagram. Labels: vCenter Orchestrator; VMware Application Director (VCAC Enterprise); Dev, Test, UAT, Stage, Mgmt.; Cloud Providers; App Blueprints (130+); Deployment Plans; VMware vCloud Director / VCAC; VMware vCloud Automation Center; Access Control; Provisioning Policies; Service Catalog; vCAC Workflows; Cloud Administrator; Blueprint Manager; AppOps/Biz/Dev Consumer; One Cloud - Private Cloud IaaS Software Defined Data Center; vCloud Suite]
Cloud Automation and Management - Extensibility
Load Balancer (F5/vShield)
IPAM (Men & Mice)
LDAP (Lotus)
Config (GIT Repo)
Other 3rd Party Plug-Ins
Configure VCAC Service Catalogs and Policies
Configure Monitor Agents and Collector Feeds
Export App D Services and Blueprints
Export Provision and De-provision Workflows
Export App D Update Harness
Container
VMware / Non-VMware Components
Plug-In Mgmt.
Service Profile and Provision Tasks/Services Mgmt.
Provisioned Workloads Unified Asset Inventory
Provisioned Workloads Monitoring Mgmt.
Provisioned Workloads Monitoring
Provision/De-provisioning Audits
Extension 3rd party components
Private Cloud IaaS Software-defined Data Center
vCloud Suite
…
Cloud Automation & Mgmt. Platform: Life-cycle Diagram
[Diagram. Layers: Extension; vCloud Director; Application Director (VCAC Enterprise); vCenter Orchestrator; vCloud Automation Center. Phases: Build Configure Phase; Provision Phase; Monitor & Manage Phase]
Policy Driven Business Workloads Provisioning
Self-service portal for consumers to request services on demand from the published Service Catalog
Request services for a specific term, as a leased consumption model
Request services with a choice of Service Profiles based on cost and performance needs
Policy-driven approval process
Fully configured business workloads provisioned on request without manual intervention
Workloads provisioned with integrated monitoring provide deep insight and visibility
Policy-driven alerts and notifications
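A minimal sketch of the request flow these bullets describe, under stated assumptions: the catalog entries, the auto-approval rule and the lease limit are all hypothetical values chosen for illustration, and none of the names below come from vCAC or Application Director. It only shows a leased, profile-based request passing through a policy check.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical catalog and policy values, for illustration only.
CATALOG = {"oracle-erp-dev", "oracle-erp-test"}
AUTO_APPROVE_PROFILES = {"Silver"}   # assumed: cheaper profiles need no sign-off
MAX_LEASE_DAYS = 90                  # assumed policy limit on the lease term


@dataclass
class ServiceRequest:
    service: str
    profile: str        # e.g. "Platinum", "Gold", "Silver"
    lease_days: int
    requested_on: date

    @property
    def lease_end(self) -> date:
        """Date on which the leased instance is due for de-provisioning."""
        return self.requested_on + timedelta(days=self.lease_days)


def evaluate(request: ServiceRequest) -> str:
    """Return 'rejected', 'approved', or 'needs_approval' for a request."""
    if request.service not in CATALOG:
        return "rejected"
    if request.lease_days > MAX_LEASE_DAYS:
        return "rejected"
    if request.profile in AUTO_APPROVE_PROFILES:
        return "approved"
    return "needs_approval"
```

A request for a Silver instance with a 30-day lease would be approved without a human step in this sketch, while a Gold request would be routed to an approver.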
[Diagram: provisioning flow. Labels: Cloud Automation & Management; Access Control; Provisioning Policies; Users; Catalogs; VCAC Service Catalog Manager; vCenter Orchestrator; App Director; Scripts/Tasks; Application Blueprints; Deployment Tasks; Cloud Provider; SDLC Instance; Extension Module; Rabbit MQ Server; MQ; Spring Module; Controller Modules; Message Handlers; Async call; Audits; Postgres DB; IPAM Client; LB Client; CM Client; LSPA Client; Config Management; 3rd Party; LB; IPAM; Analytics; Lotus LDAP Server; vShield Edge; vCOps; Log Insight; Hyperic]
Policy Driven Business Workloads De-Provisioning
Self-service portal for extension of lease term.
Fully automated de-provisioning once the lease term expires.
Reclamation of resources helps reduce future CAPEX investments.
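The lease lifecycle above can be sketched as follows. This is an illustration under assumptions, not the platform's implementation: the `Lease` record, its fields and the VM counts are hypothetical, and real de-provisioning would also purge records in the integrated systems shown in the diagram.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Lease:
    instance: str
    vm_count: int       # hypothetical size of the instance
    expires_on: date


def extend(lease: Lease, new_expiry: date) -> Lease:
    """Self-service extension: only a later expiry date is accepted."""
    if new_expiry <= lease.expires_on:
        raise ValueError("new expiry must be later than the current one")
    return Lease(lease.instance, lease.vm_count, new_expiry)


def reclaim_expired(leases: List[Lease], today: date) -> Tuple[List[Lease], int]:
    """Drop expired leases and count the VMs that become reclaimable."""
    active = [l for l in leases if l.expires_on >= today]
    reclaimed = sum(l.vm_count for l in leases if l.expires_on < today)
    return active, reclaimed
```

Running `reclaim_expired` on a schedule is one simple way to get the "fully automated" behaviour: nothing has to be requested for an expired instance to be returned to the pool.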
[Diagram: de-provisioning flow. Labels: Cloud Automation & Management; Access Control; Provisioning Policies; Users; Catalogs; VCAC Service Catalog Manager; vCenter Orchestrator; App Director; Scripts/Tasks; Application Blueprints; Deployment Tasks; Cloud Provider; SDLC Instance; Extension Module; Rabbit MQ Server; MQ; Spring Module; Purge Controller Modules; Purge Message Handlers; Purge Records; Read VM Profiles; Async call; Purge; Audits; Postgres DB; IPAM Client; LB Client; LDAP Client; CM Client; Config Management; 3rd Party; LB; IPAM; Analytics; Lotus LDAP Server; vShield Edge]
Provisioning with VMware Application Director
vCAC
VMs: VM1, VM2, VM3
Web: Guest Cust, Copy Files, Install Web, Config Web, Start Web
App: Guest Cust, Copy Files, Install App, Config App, Start App
DB: Guest Cust, Copy Files, Install DB, Config DB, Start DB
OneCloud IaaS Software Defined Data Center
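One way to picture the task ordering on this slide is as a dependency graph that is sorted before execution. The sketch below uses Python's standard `graphlib`; the task names are hypothetical labels, and the cross-tier dependencies (the app tier waits for the database, the web tier waits for the app) are an assumption, since the slide text does not spell them out.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish first.
# Install -> Config -> Start within a tier mirrors the DB tasks on the slide;
# the cross-tier edges are assumed for illustration.
PLAN = {
    "db:install": set(),
    "db:config": {"db:install"},
    "db:start": {"db:config"},
    "app:install": set(),
    "app:config": {"app:install", "db:start"},
    "app:start": {"app:config"},
    "web:install": set(),
    "web:config": {"web:install", "app:start"},
    "web:start": {"web:config"},
}


def execution_order(plan):
    """Return one valid order in which the deployment tasks can run."""
    return list(TopologicalSorter(plan).static_order())
```

Tasks with no mutual dependency, such as the three install steps, may appear in any relative order, which is what allows them to run in parallel across VMs.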
Policy Driven Management
Policy-driven storage provisioning leveraging the vCloud stack
Depending on the Service Profile requested, storage is provisioned to the appropriate storage profile.
Reduced OPEX through automation, and reduced CAPEX because workloads are provisioned with the right resources
Set up different service profile policies, such as Platinum, Gold and Silver, driven by the cost and performance needs of the enterprise.
Service Profile = Storage + Network + Monitor + Existing Services
Set up approval and access policies
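The "Service Profile = Storage + Network + Monitor + Existing Services" idea can be sketched as a small lookup table. Only the profile names and the three pVDC tiers appear in the deck; which profile maps to which tier, and the network and monitoring values, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    name: str
    storage_tier: str   # provider vDC tier backing the vApp
    network: str        # hypothetical network class
    monitoring: str     # hypothetical monitoring level


# Hypothetical profile table; the tier assignments are assumed.
PROFILES = {
    "Platinum": ServiceProfile("Platinum", "pVDC Tier 1", "isolated", "full"),
    "Gold": ServiceProfile("Gold", "pVDC Tier 2", "routed", "standard"),
    "Silver": ServiceProfile("Silver", "pVDC Tier 3", "shared", "basic"),
}


def placement_for(profile_name: str) -> str:
    """Pick the provider vDC tier for a requested service profile."""
    try:
        return PROFILES[profile_name].storage_tier
    except KeyError:
        raise ValueError(f"unknown service profile: {profile_name}") from None
```

Keeping the mapping in data rather than in workflow logic means a new profile can be offered by adding a row, without touching the provisioning code.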
[Diagram: policy-driven provisioning. Labels: vSphere; VMware Application Director; Access Control; Provisioning Policies; Users; Catalogs; vCenter Organization; pVDC Tier 3; pVDC Tier 2; pVDC Tier 1; Application Blueprints; Cloud Providers; Deployment Profiles; Workflows; Policies; vCenter Orchestrator; VCAC Service Catalog Manager; vApps provisioned in Tier 1, Tier 2 and Tier 3 Org vDCs; Organization vDCs; Organization Network; vApp Network; External Networks; Resource Pools; Datastores; Port Groups or dvPort Groups; Cloud System Admin]
Integrated Monitoring
Workload Monitoring: Monitor Dev, Test, Load or Production environment
Layer Level Monitoring: Monitor application layers, e.g. Portal, SOA, EBS
Metrics Correlation: View metrics from vCOps and Log Insight in a single graph
Real time view: Real-time monitoring of key performance indicators
Create New Dashboards: Allows users to create new dashboards for the required metrics
Drill down: Drill down to resource level from aggregated view
Log Insight, Hyperic, vCenter Operations Management Suite, 3rd Party
Lessons Learned
Separate the run team from the automation build team.
Develop a phased approach with clear scope for the services to be provided, in either infrastructure or applications, for a successful implementation.
Develop the architecture on a cohesive, integrated stack such as the vCloud stack, so that the automation and management platform delivers all capabilities at lower cost and faster time to market.
Define the cost model for the services to serve different cost and performance needs such as Dev, UAT and Mission Critical.
Automation and Ops Transformation
Venkat Gopalakrishnan – Director IT
Traditional Operations Functions – Provided by AppOps
People, Process, Governance
Extension via API and SDK
3rd Party Components
Cloud Automation and Management
vCloud Suite
Private Cloud IaaS Software Defined Data Center
Why Standardize and Automate Service Provisioning?
Run Book: Provision, QA, Staging, Release. 40 work weeks effort, per release. 4 weeks.
Service Request: Service Definition, Blueprint, Policy, POC1, POC2, To Catalog. 20 work weeks effort, once. 36 hours.
Virtual Data Center, Virtual Server
It takes less effort/time to convert the runbook into blueprints than it takes to “run” the runbook.
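The effort comparison on this slide reduces to simple arithmetic, sketched below. The two constants are the slide's figures; the assumption that blueprint upkeep after the initial build is negligible is mine and is not stated in the deck.

```python
RUNBOOK_WEEKS_PER_RELEASE = 40   # from the slide: effort per release
BLUEPRINT_WEEKS_ONCE = 20        # from the slide: one-time effort


def cumulative_effort(releases: int) -> tuple:
    """Return (runbook, blueprint) work weeks after a number of releases.

    Assumes no further blueprint effort after the initial build.
    """
    if releases < 0:
        raise ValueError("releases must be non-negative")
    runbook = RUNBOOK_WEEKS_PER_RELEASE * releases
    blueprint = BLUEPRINT_WEEKS_ONCE if releases > 0 else 0
    return runbook, blueprint
```

Under these assumptions the blueprint route is already cheaper at the first release, and the gap widens by 40 work weeks with each release after that.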
Transformation – Process
Challenges
First version of automation platform did not meet all automation needs
Actively deploying instances while building machines
Difficulty in managing integration with SaaS apps
High inflow of demand
Action
Automation platform capability being enhanced – actions in parallel
Additional test suite functions being automated – environmental and functional
Continuous process improvement in place, with root cause action after every cycle
Instance provisioning treated as a ‘release’
Documentation is key to achieving predictability
Total Cycle Time - Improvements
Improvements
1. Re-provision instead of repair, and cross-training teams
2. Improve blueprints to drive down defects, automate functional and environmental testing
3. Additional automation platform capabilities
Plan to get to 24 hour goal
• More automation and management changes
• Improve QA testing process
2013 Goal
Provision – 16 hours
QA – 8 hours
[Chart: provisioning time (hours, axis 0 to 200) per SDLC instance, Oracle ERP with Portal, with improvements 1 to 3 marked. Instances (date): Test13 (05/07), Dev14 (05/22), Test14 (05/27), Dev15 (06/19), Test15 (06/25), Dev16 (07/22), Test16 (08/05)]
Process – Details
Results
4 weeks to 36 hours.
24 hours (Provisioning 16 hours, Testing 8 hours) by Q4’13
Streamline demand intake process
Created bandwidth to provision an instance per week
Key Takeaways
Automate the end-to-end process; do not focus on individual tasks
Empower global team
Don’t skimp on Blueprints
Transformation – People
Challenges
New People Roles and Change in Skill Sets
New role for Blueprint creation and management
Automation requires global coverage to manage process
Scarcity of skilled resources to perform new role
Most top skills were in one location
Action
IT resources obtained vCloud certification
Technical skills assessment to create a well-balanced global team
Created subject matter expertise in installation and configuration of tech stack/application
Team built solid shell scripting, task automation and troubleshooting skills
People – Details
Results
AppOps team size: 27 originally, now 22, goal 5 (old instances still in use)
Provisioning can be initiated and executed from any global location
Employees performing high value work like blueprint management
Key Takeaways
Promote the vision and help people internalize it to get in lock step
Mental shift – fix the blueprint and re-provision vs. fix the problem
Transformation – Governance
Challenges
Functional test failures
Blueprint changes resulting in manual work
Lack of service definition and process to track cost per service
Action
Avoided changes during provisioning cycle
Re-provision instead of repair
Initiated programs to transform to IT-as-a-Service (ITaaS)
Governance: Details
Results
Predictable delivery of 36 hours, targeting 24 hours by Q4’13
Improvement in functional testing, lower defect count
15 instances provisioned in 4 months
Key Takeaways
Spend time blueprinting all apps; there are no shortcuts
“Disposable Infrastructure” reduces IT CAPEX
Results
[Charts over Phase 1 and Phase 2, with a “Today” marker]
Cycle Time (hours per SDLC instance): 672 hours (4 weeks); 172; 36 today; goal 24 hours
Virtual Machines Transitioned To Private Cloud: chart values 2,800 and 2,200; goal 4,000
AppOps Team (# of engineers): 27; 22 today; goal 5
Reduced provision time: 95% (4 weeks to 36 hours)
Improved productivity of 600 developers: 20%
Reduced IT operations costs: $1.5M/year
Able to say “yes” to developer requests
Reduced the cost of a VM/month: 80% ($133 to $20)
Reduced infrastructure costs: $4.5M/year
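As a check on the headline figures, the percentages can be recomputed from the before and after values quoted in the deck. This is a small arithmetic sketch; the helper name and variable names are mine. By the same formula the $133 to $20 change comes to about 85%, a little above the 80% quoted, which may reflect rounding or inputs not shown on the slide.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


HOURS_PER_WEEK = 24 * 7                   # calendar hours, matching 672 hours = 4 weeks
baseline_hours = 4 * HOURS_PER_WEEK       # 672 hours
cycle_time_cut = percent_reduction(baseline_hours, 36)   # about 94.6
vm_cost_cut = percent_reduction(133, 20)                 # about 85.0
annual_savings_musd = 1.5 + 4.5           # operations plus infrastructure, in $M
```

The two annual savings figures sum to the $6M quoted in the executive summary.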
Bottom Line
Agility is Self-Sustaining
Key Takeaways (advice)
Share results of early automation with developers (customers). Show how the effort will help them.
Training is key. The blueprint management role becomes a key SME. Help them become experts.
Don’t try to automate individual tasks. Take a holistic approach – system’s footprint view.
SDDC provides greater flexibility not possible with server virtualization: software-controlled infrastructure.
Additional Resources
Blogs.vmware.com/cloudops
http://www.vmware.com/solutions/vmware-it-journey/
@vmwarecloudops #cloudops
THANK YOU