© 2012 SunGard Availability Services LP. - All Rights Reserved - www.sungardas.com

Anatomy of Cloud-Based Recovery, A Detailed Walk-Through


Topics

Challenges with Recovery

Adoption of Cloud-Based Recovery

Anatomy of Cloud Recovery

Cloud Business Drivers & Different Types of Clouds


Cloud Business Drivers

Op/Ex and Cap/Ex Budget Reductions

Architecture Resiliency

Improved Service Levels

Quicker Time-to-Market

Cost Control

Pay-Per-Use Model

Service Contraction and Expansion

Able to Focus on Core Business Competencies

[Diagram labels: Co-Location, Enterprise, On-Premise]


The Four Types of Clouds

COMMUNITY: Shared infrastructure for organizations in a specific community with common concerns.

PUBLIC: Provider of services and resources, including apps and storage, available to the general public via the Internet.

PRIVATE: Infrastructure operated solely for a single organization, managed and hosted either internally or by a third party.

HYBRID: Several unique cloud entities bound together to offer the benefits of multiple deployment models.


Cloud Offerings: Three Flavors

Software as a Service (SaaS)
Software solution delivered via web browser, not requiring any in-house component.
Examples: Salesforce.com, Google Apps

Platform as a Service (PaaS)
Programming environment where clients build applications that leverage the Cloud.
Examples: Microsoft Azure, Google App Engine, Amazon Web Services

Infrastructure as a Service (IaaS)
Provider that supplies a Cloud datacenter to deploy your virtual servers, applications and policies.
Examples: SunGard Enterprise Cloud, AT&T, Amazon EC2




Holistic Recovery Planning Demands Consideration of a Full Range of Failure Incidents

File Missing … Application Failure … Datacenter Disruption … Facility Outage


Protection Complexity Has Increased Over the Last Few Decades

What methods do you use?

Traditional backup

Tape rotation

Tiered disk-to-disk

Replication - Which types?

Snapshot

Archive

Online services

Separate BC/DR

The result in most environments: high complexity, cost, and administrative burden, without a consistent recovery result


Typical Complex Environment At A Glance

[Diagram: a layered view of a typical environment. Layers: Business Process, Application Layer, Data, Storage Layer, Protection Layer. Labels shown: CRM, Financial Systems, Manufacturing, Back-Office, SAP, Oracle Financials, Custom App, Exchange, SharePoint, SQL Server, File and Print, Oracle DB, Multi-Server Oracle DB, SAN, NAS, LAN, High-Speed Disk, SATA Disk, Backup, VTL, Tape Library, Replication, Archive.]


Recovery Site

[Diagram: the layered environment from the previous slide (Business Process, Application Layer, Data, Storage Layer, Protection Layer), shown twice.]

Recovery can double your investment and complexity…


Recovery Requirements

… and demands resources which may be unavailable or too costly.

[Diagram labels: Time, Procedures, Expertise, Capacity, Coherent Data, People]


Backup Remains “Broken”…

Costly - Purchases include hardware, software, maintenance, support, training, power, space, and cooling costs

Off-Site Transit - Risk of loss for tapes transported off-site for recovery

The Never-Ending Backup - More data is being backed up as growth rates continue to climb for all companies

Labor Intensive - Setup, ongoing management, and problem resolution when the jobs fail


…And Recovery Is Even More Challenging

Recovery is expensive, requiring dedicated staff and resources

Recovery is risky, both technically and operationally; tapes fail

Recovery is a thankless task: no upside, lots of downside

Devoting resources to recovery has an opportunity cost


What solves the problems… Programmatic approach to recovery life cycle management

Restoration Procedure Documentation
• Application to hardware mapping
• Identifying interdependencies
• Collecting and sharing knowledge
• Restoration “recipe” / Run Books

Life Cycle Management
• Recovery Change Management
• Maintaining Procedures
• Updating configurations
• Involved in customer’s Change Control process

Test Management
• Test Planning
• Test Monitoring and Execution
• Post-test analysis
• Remediation efforts to fill gaps

Procedure Execution
• Start up Operating Systems
• Install Backup environments
• Restore Operating environments
• Restore Application Data




Data Movement Predicts Recovery Points and Recovery Times

[Diagram: timeline around a Failure Event. Recovery-point markers before the failure: 1 Week, 1 Day, 1 Min.]

Tape: RPO is last tape rotation; RTO is time to restore data from tape (>24 Hrs)

Online Disk: RPO is last successful backup; RTO is time to restore data from disk (8-24 Hrs)

Replication: RPO is last transaction; RTO is time to copy online file(s) or restart the application (15min-4 Hrs)
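To make the RPO and RTO definitions on this slide concrete, here is a minimal Python sketch. The function names and the sample timestamps are hypothetical and do not come from the deck; the only assumption is that each data movement method leaves behind a list of recovery-point timestamps.

```python
from datetime import datetime, timedelta


def data_loss_window(recovery_points, failure_time):
    """Achieved recovery point: time between the last usable copy and the failure."""
    usable = [p for p in recovery_points if p <= failure_time]
    if not usable:
        raise ValueError("no recovery point exists before the failure")
    return failure_time - max(usable)


def service_restored_at(failure_time, restore_duration):
    """Achieved recovery time: the failure time plus however long the restore takes."""
    return failure_time + restore_duration


# Hypothetical example: nightly backups at 22:00, failure at 14:00 the next day.
failure = datetime(2012, 5, 10, 14, 0)
nightly = [datetime(2012, 5, 8, 22, 0), datetime(2012, 5, 9, 22, 0)]

print(data_loss_window(nightly, failure))                # 16:00:00
print(service_restored_at(failure, timedelta(hours=8)))  # 2012-05-10 22:00:00
```

The same two functions apply to any of the three methods; only the spacing of the recovery points and the restore duration change.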


Backup and Restore to Cloud - Solution Overview (Recover2Cloud for Vaulting)

Target Use Case

• SMB and SME environments with <70 Windows, Linux, AIX, HP-UX, Solaris, and OS/400 servers

• <10TB of storage

• <24 hour RTO

• <24 hour RPO

• Customer looking for RaaS

How it works…

1) All servers are backed up with an EVault block-level differential daily backup job, which writes to a local vault appliance and is sized for 35-day retention.

2) Once the local vault appliance applies the writes, it replicates over a secure VPN to the remote vault at SunGard, which can be sized to any retention period.

3) ATOT/D Vaulting automation procedures are used to rehydrate each server's backup into a virtual machine (VM) or LPAR.

4) ATOT/D end users access the recovered environment by way of a Client Based VPN or site-to-site VPN to an alternative facility.

ATOT/D Failover

• Create VM/LPAR

• Hydrate data to provisioned VM/LPAR

• Customer connects to recovered VM/LPAR remotely

[Diagram labels: Customer site (physical and virtual Windows and Linux machines; HP, SUN, pSeries, iSeries; LAN; Local Vault holding customer backup data); backup traffic over a site-to-site IPsec VPN across the Internet, through the backup firewall; Service Provider (SunGard R2C Vault; recovered virtual machines (VMs) or LPARs in Private Cloud; ATOT/D firewall); ATOT/D access via client-based VPN (licenses for remote users) or site-to-site VPN to a remote facility.]
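The two retention windows described in the vaulting steps can be sketched as follows. This is an illustration only: the function name, the `remote_retention_days` parameter, and the treatment of day 35 as still inside the local window are assumptions, not details from the deck, and replication lag between the two vaults is ignored.

```python
from datetime import date, timedelta

LOCAL_RETENTION_DAYS = 35  # from the slide: local vault sized for 35-day retention


def classify_restore_points(backup_dates, today, remote_retention_days):
    """Split daily backups into local, remote-only, and expired restore points.

    Assumes the remote vault keeps at least as much history as the local one.
    """
    if remote_retention_days < LOCAL_RETENTION_DAYS:
        raise ValueError("sketch assumes remote retention >= local retention")
    local, remote_only, expired = [], [], []
    for backup_date in sorted(backup_dates):
        age_days = (today - backup_date).days
        if age_days < 0:
            raise ValueError("backup date is in the future")
        if age_days <= LOCAL_RETENTION_DAYS:
            local.append(backup_date)        # restorable from the local vault
        elif age_days <= remote_retention_days:
            remote_only.append(backup_date)  # only the remote vault still has it
        else:
            expired.append(backup_date)      # aged out of both vaults
    return local, remote_only, expired


# Hypothetical example with a 90-day remote retention period.
today = date(2012, 6, 30)
backups = [today - timedelta(days=n) for n in (0, 35, 36, 90, 91)]
local, remote_only, expired = classify_restore_points(backups, today, 90)
print(len(local), len(remote_only), len(expired))  # 2 2 1
```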


Host Replication to Dormant VMs in Cloud - Solution Overview (Recover2Cloud for Server Replication)

Target Use Case

• SMB and SME environments with <70 virtual Windows or Linux servers or physical Windows servers

• <4 hour RTO

• Near-real-time to 15 minute RPO

• Customer looking for RaaS

• Ability to recover applications in one of two modes:

  • “Crash Consistency” - recovery for any Windows or RH Linux application to seconds before it crashed

  • “Application Consistency” - recovery for any application that supports a quiesce command to a known good bookmark

How it works…

1) Replication agent is installed on the source Windows/RH Linux server(s). The agent compresses and sends OS, Apps and user data changes to “Master Target” via a secure Internet connection (VPN).

2) For environments with very high change rates, a Process Server can be used to off-load replication & compression.

3) “Master Target” stores all the changes in individual VMDK files for each protected server.

4) Upon Failover Request, SunGard promotes the VMDK file(s) from the Master Target to production VMs.

5) ATOT/D end users access the recovered environment by way of a Client Based VPN or site-to-site VPN to an alternative facility.

ATOT/D Failover

• Promote VMDK from Master Target

• Create production VMs

[Diagram labels: Customer site (Windows or Linux virtual machines; physical Windows server; optional Process Server; LAN); replication traffic over a site-to-site IPsec VPN across the Internet; Service Provider (Master Target VM with a 3 Day CDP Log; protected servers stored as “Dormant VMs” holding system, apps, and data; recovered Windows and Linux servers in Private Cloud); ATOT/D access via client-based VPN connection or site-to-site VPN to a remote facility.]
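The difference between the two recovery modes on this slide can be illustrated with a small Python sketch. It is not the replication product's API: `JournalEntry`, `pick_recovery_point`, and the mode strings are hypothetical names, and the journal is modelled as a plain list of timestamped entries, some of which are flagged as quiesced bookmarks. The 3-day window comes from the slide's "3 Day CDP Log" label.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

CDP_WINDOW = timedelta(days=3)  # from the slide: "3 Day CDP Log"


@dataclass(frozen=True)
class JournalEntry:
    timestamp: datetime
    is_bookmark: bool = False  # True if the application was quiesced at this point


def pick_recovery_point(journal: List[JournalEntry], failure_time: datetime,
                        mode: str) -> Optional[JournalEntry]:
    """Return the newest usable entry inside the CDP window, or None.

    mode="crash":       newest replicated change before the failure.
    mode="application": newest quiesced bookmark before the failure.
    """
    if mode not in ("crash", "application"):
        raise ValueError("mode must be 'crash' or 'application'")
    oldest_allowed = failure_time - CDP_WINDOW
    candidates = [e for e in journal
                  if oldest_allowed <= e.timestamp <= failure_time]
    if mode == "application":
        candidates = [e for e in candidates if e.is_bookmark]
    return max(candidates, key=lambda e: e.timestamp, default=None)
```

With a journal that has a bookmark at 13:00 and an unflagged change at 13:59:50, a failure at 14:00 recovers to 13:59:50 in crash mode and to 13:00 in application mode.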


Site Recovery Manager Replication to Cloud - Solution Overview (Recover2Cloud for SRM)

Target Use Case

• SME environments with <1000 VMware VMs

• <4 hour RTO per 250 Always-On VMs

• <8 hour RTO per 250 On-Demand VMs

• Near-real-time to 15 minute RPO

• Customer looking for RaaS

How it works…

1) VMware’s Storage-based Site Recovery Manager is used to replicate to a target storage array and DR SRM instance at SunGard over a secure VPN.

2) ATOT/D for:

  • Always-On Recover2Cloud compute infrastructure: VMware’s SRM orchestration engine is used to recover VMs into the Always-On infrastructure.

  • On-Demand Recover2Cloud compute infrastructure: the target storage array and DR SRM instance are attached to the On-Demand infrastructure, then VMware’s SRM orchestration engine is used to recover the VMs.

3) ATOT/D end users access the recovered environment by way of a Client Based VPN or site-to-site VPN to an alternative facility.

ATOT/D Failover

• On-Demand or Always-On compute resources are used to recover Customer’s VMware environment

[Diagram labels: Customer site (SRM, vCenter, SAN, LAN); replication traffic over a site-to-site IPsec VPN across the Internet; Service Provider (Customer DR SRM Instance with SRM, vCenter, and Customer Dedicated SAN; iSCSI or Fibre Channel); ATOT/D access via client-based VPN connection or site-to-site VPN to a remote facility.]
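The Target Use Case figures quoted on the three Recover2Cloud slides can be compared side by side with a short Python sketch. This is a rough screening aid under stated assumptions, not a sizing tool: the `Offering` class and `candidate_offerings` function are hypothetical, operating systems, hypervisor, and storage limits are ignored, and the "per 250 VMs" scaling on the SRM slide is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    name: str
    max_servers: int    # exclusive upper bound quoted on the slide
    rto_hours: float    # quoted RTO bound
    rpo_minutes: float  # upper end of the quoted RPO


OFFERINGS = (
    Offering("Recover2Cloud for Vaulting", 70, 24, 24 * 60),
    Offering("Recover2Cloud for Server Replication", 70, 4, 15),
    # Slide quotes "<4 hour RTO per 250 Always-On VMs"; scaling is not modelled.
    Offering("Recover2Cloud for SRM", 1000, 4, 15),
)


def candidate_offerings(server_count, rto_target_hours, rpo_target_minutes):
    """Names of offerings whose quoted limits fit the stated requirements."""
    return [
        o.name for o in OFFERINGS
        if server_count < o.max_servers
        and o.rto_hours <= rto_target_hours
        and o.rpo_minutes <= rpo_target_minutes
    ]


print(candidate_offerings(50, 4, 15))
# ['Recover2Cloud for Server Replication', 'Recover2Cloud for SRM']
```

A real selection would also check the supported platforms listed on each slide, which this sketch deliberately leaves out.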


SunGard/AT&T solution for a Manufacturing client in AL

Background:

Large manufacturing company in AL with multiple locations along the Gulf Coast region and an internal IT DR solution.

Windows and iSeries environment with two tiers of recovery (<4 hour RTO and near-zero RPO for critical applications, and 48 to 72 hour RTO for non-critical applications).

The April 27, 2011 tornadoes created a heightened sense of awareness of the need for:

• a comprehensive “worry free” DR strategy that was a Managed Service and backed by SLAs; and

• an onsite mobile workplace for end users.

Product/Solution

• Recover2Cloud Server Replication for tiered applications

• Replication of AS400 from Birmingham to PA, including a DS3

• Traditional Hot Site Recovery for remaining infrastructure, along with (2) Mobile Recovery Units

• Consulting Services for Technology Migration

The client now has a Recovery Solution to keep their operations running 24x7




How SunGard and AT&T View Cloud: An Evolutionary Step for Recovery-as-a-Service (RaaS)

RaaS after the cloud

Pay as a service

Private

Flexible

Scalable

Heterogeneous

Vendor agnostic

Online

Application recovery

Cloud is a new platform that enables more cost-effective application recovery


Cloud: Who Owns the Compute Assets? Who Owns the Responsibility for Recovery?

Differs by service provider

Differs by client

Sometimes, clients want to contract for the resources but manage everything on their own

Other times, clients want the service provider to handle everything for them

[Diagram: Cloud Service Levels: IaaS, PaaS, RaaS]


For Recovery, Cloud Platforms Lower Cost and Improve Scalability

What Cloud Changes

Lower cost platform

Faster response to real-time fluctuations on-demand

Compute, network, storage

OPEX rather than CAPEX investment

Need to manage service providers – and define who has responsibility

Need to balance secure (private) vs lowered cost (shared)

What Cloud Does Not Change

Need to modernize data movement

Need to analyze application value and the business impact of downtime

Need to tier applications and prioritize recovery investment

Need for application expertise to plan and implement successful recovery

Need to maintain and test recovery procedures


Advice for Controlling Cloud Recovery Costs

Invest at the level of responsibility that you want

Invest with the RPO/RTO service levels that you need

Invest according to the business value of each application

Add to your own recovery expertise and procedures

Or, completely replace your need to operate recovery


Stepping Into Cloud

1. Evaluate the business impact of applications, and tier them

2. Establish RPO/RTO targets for each tier

3. Select data movement technologies which match RPO/RTO targets

4. Consider the cost-savings and expertise of service providers

• Is the cloud flexible, including virtual, cloud, and hybrid infrastructure?

• Is the responsibility level clear?

• Are SLAs in writing?

• How many actual disaster declarations have they managed?
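Steps 1 to 3 can be sketched in Python as below. Everything here is illustrative: the cost threshold used for tiering is hypothetical, the per-tier RTO targets are borrowed from the deck's manufacturing case study (4 hours for critical, 72 hours for non-critical), and the selection rule simply picks the slowest data movement method whose RTO range on the earlier data movement slide begins at or below the target. A range that begins below a target does not guarantee the target is met.

```python
# Per-tier RTO targets in hours, borrowed from the deck's case study.
TIER_RTO_HOURS = {"critical": 4, "non-critical": 72}


def tier_for(downtime_cost_per_hour, critical_threshold):
    """Step 1: tier an application by its (hypothetical) hourly downtime cost."""
    if downtime_cost_per_hour >= critical_threshold:
        return "critical"
    return "non-critical"


def data_movement_for(rto_target_hours):
    """Step 3: slowest method whose quoted RTO range starts within the target.

    Quoted ranges: tape >24 hrs, online disk 8-24 hrs, replication 15 min-4 hrs.
    """
    if rto_target_hours > 24:
        return "tape"
    if rto_target_hours >= 8:
        return "online disk"
    return "replication"


def recovery_plan(apps, critical_threshold):
    """Steps 1-3 for a mapping of application name to hourly downtime cost."""
    plan = {}
    for name, cost in apps.items():
        tier = tier_for(cost, critical_threshold)
        rto = TIER_RTO_HOURS[tier]  # step 2: target for the tier
        plan[name] = (tier, rto, data_movement_for(rto))
    return plan


# Hypothetical applications and threshold.
print(recovery_plan({"ERP": 50_000, "Intranet wiki": 200}, critical_threshold=10_000))
```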


Want to hear more and discuss your requirements?

If you are a current AT&T client please contact:

• Lorilee Ressler, AT&T Advanced Cloud Solutions

[email protected] 412-759-8071

If you are a current SunGard client please contact:

• Rick McAdoo, [email protected] 412-309-1252


Appendix


Data Points: The 2011 Digital Universe Study [1]

In 2010: Crossed the zettabyte barrier

In 2011: Growth will surpass 1.8 ZBs (1.8 trillion GBs)

500 quadrillion “files” or containers

Commercial organizations responsible for 80% of information

Growth of data is more than doubling every two years

Outpaces the growth of storage

Growth of files is faster than growth of data, by a factor of 8 over the next five years

IT staff resources will grow by <1.5

[1] IDC 2011 Digital Universe Study, sponsored by EMC and published June 2011
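Two of the figures on this slide can be checked with simple arithmetic. The Python sketch below uses decimal (SI) units, which is an assumption about how the study counts bytes, and the function names are illustrative. It confirms that 1.8 ZB equals 1.8 trillion GB, and shows that doubling every two years corresponds to roughly 41% growth per year, so "more than doubling" implies an annual rate above that.

```python
import math

BYTES_PER_GB = 10**9   # SI gigabyte
BYTES_PER_ZB = 10**21  # SI zettabyte


def zettabytes_to_gigabytes(zettabytes):
    """Convert zettabytes to gigabytes using decimal units."""
    return zettabytes * BYTES_PER_ZB / BYTES_PER_GB


def annual_growth_from_doubling(years_to_double):
    """Compound annual growth rate implied by a given doubling period."""
    return 2 ** (1 / years_to_double) - 1


print(f"{zettabytes_to_gigabytes(1.8):.2e} GB")            # 1.80e+12 GB
print(f"{annual_growth_from_doubling(2):.1%} per year")    # 41.4% per year
```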