WHITE PAPER

Understanding Disaster Recovery Options

WHITE PAPER: UNDERSTANDING DISASTER RECOVERY OPTIONS

2 © 2013 TwinStrata, Inc. | www.twinstrata.com

Table of Contents

Overview
  Dimensions of Disaster Recovery
  Disaster Recovery Tiers
  Levels of Criticality for Data
Example Disaster Recovery Options
Data Recovery Only Options
  Offsite Tape Backup
  Online Disk-based Backup
  Cloud-Based Backup
Data and Application Recovery Options
  Remote Site Disaster Recovery and Business Continuity
  Cold-Standby Disaster Recovery and Business Continuity Site
  Hot-Standby Disaster Recovery and Business Continuity Site
  Disaster Recovery as a Service
Summary of Disaster Recovery Solutions
About TwinStrata CloudArray Disaster Recovery as a Service
Try CloudArray for Free
About TwinStrata


Overview

The process of choosing a disaster recovery (DR) solution can be daunting. Every organization has different requirements around what applications and data are critical, how quickly they need to recover (recovery time objective), how much data they’re willing to risk (recovery point objective) and how much they’re willing to spend. And because the tipping point for each organization is slightly different, it can be difficult to identify the “right” solution for your particular environment. Even worse, analysts and vendors serving this market have created a nearly impenetrable wall of confusing terms, classifications, products and services, making the task of choosing a solution even more formidable.

There are multiple dimensions to the DR problem that must be considered and addressed at different levels. This paper outlines several disaster recovery options and explains how each option fits within the context of those dimensions. Additionally, it describes where and how a cloud-based disaster recovery as a service (DRaaS) solution fits into the DR ecosystem and outlines some of the advantages of using the cloud for disaster recovery.

Dimensions of Disaster Recovery

The three primary dimensions that govern the choice of a disaster recovery solution are Recovery Point Objective (RPO), Recovery Time Objective (RTO) and cost.

Recovery Point Objective (RPO) is the “what data am I willing to risk” dimension. Specifically, it refers to the amount of tolerable data loss in the event of a disaster. For example, with a tape backup taken once per day, the RPO is up to 24 hours.

Recovery Time Objective (RTO) is the “how quickly do I need to recover my data” dimension. This describes the amount of downtime that is tolerable in a disaster recovery solution. For example, with a tape backup shipped offsite the RTO is the amount of time it takes to physically ship the tape to the recovery site plus the amount of time it takes to recover from tape and start up the affected applications.

Finally, cost. The cost of a disaster recovery solution varies tremendously depending on the solution. In addition to the business costs implied by RTO and RPO, the cost of equipment and required services must be factored in. For example, a solution with near-zero downtime and data loss requires an additional site equipped not only with appropriately mirrored storage, networking and compute, but also with power conditioning equipment, cooling and backup infrastructure, all of which must be running all of the time.
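The RPO and RTO examples above reduce to simple arithmetic. The following Python sketch makes that explicit. The function names and the shipping, restore and startup durations are illustrative assumptions, not figures from this paper.

```python
from datetime import timedelta


def worst_case_rpo(backup_interval: timedelta) -> timedelta:
    """Worst-case data loss: the disaster strikes just before the next backup."""
    return backup_interval


def offsite_tape_rto(shipping: timedelta, restore: timedelta,
                     app_startup: timedelta) -> timedelta:
    """Downtime for offsite tape: ship the media, restore it, restart applications."""
    return shipping + restore + app_startup


# Daily tape backup, as in the RPO example above.
rpo = worst_case_rpo(timedelta(hours=24))

# Hypothetical durations, chosen only for illustration.
rto = offsite_tape_rto(
    shipping=timedelta(hours=36),
    restore=timedelta(hours=10),
    app_startup=timedelta(hours=2),
)

print(f"Worst-case RPO: {rpo}")
print(f"Estimated RTO: {rto}")
```

Replacing the assumed durations with an organization's own measurements gives a first estimate to compare against its recovery targets.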

Disaster Recovery Tiers

The SHARE users group (www.share.org) defines seven tiers of disaster recovery. Each tier is characterized by the RTO and RPO discussed in Dimensions of Disaster Recovery above. This tiered structure acts as a tool to define a common, industry-standard set of terminology for categorizing solutions. Not all solutions are likely to fit neatly into a single tier. The different DR tiers are outlined in Table 1 below.


| TIER | DESCRIPTION | RTO/RPO/COST |
|---|---|---|
| Tier-0 – No off site data | No recovery plans and no off site data | Not applicable |
| Tier-1 – Data backup with no hot standby site | Referred to as the pickup truck access method (PTAM), backups are stored to tape or disk and physically moved offsite. Recovery involves physically moving the media for restoration. | Days-Weeks of downtime / Days of data loss / Low cost |
| Tier-2 – Data backup with a hot standby site | Incorporates the same premise as Tier-1 except there is a hot standby site that includes the hardware needed to restore operations from backup. | Days of downtime / Days of data loss / Moderate cost |
| Tier-3 – Electronic Vaulting | Blends Tier-2 with the idea that some of the more business critical data is moved in electronic form to the remote site from which it is then backed up to media. This data can be restored without physically moving it from another site. | 1 Day of downtime / Days of data loss / High cost |
| Tier-4 – Electronic Vaulting with Point in Time copy | Combines Tier-3 with disk based electronic vaulting solutions such as point in time snapshots, which are typically taken at a higher frequency than tape based solutions. | 1 Day of downtime / 1 Day of data loss / High cost |
| Tier-5 – Multi-site Transaction Integrity | Includes all aspects of Tier-4 along with application level data consistency guarantees maintained between sites. This is done via a high bandwidth link between sites. Only data in flight is lost in a disaster. | Minutes to hours of downtime / Minutes to hours of data loss / Extremely high cost |
| Tier-6 – Zero Data Loss | Incorporates all aspects of Tier-5 solutions combined with the notion that even data in flight between sites is not lost in a disaster. | Seconds of downtime / Zero data loss / Extremely High cost |
| Tier-7 – Automated & Business Integrated solution | Tier-6 level solution with the additional element of automated application failover and failback. | Zero downtime / Zero data loss / Extremely High cost |

Table 1: Seven tiers of disaster recovery solutions
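One way to use Table 1 is to look for the lowest (and therefore cheapest) tier that still meets a required RTO and RPO. The Python sketch below illustrates that lookup. Table 1 gives only qualitative ranges such as "days" or "minutes to hours", so every numeric bound in `TIERS` is an assumption made for illustration, and the function name is hypothetical.

```python
from datetime import timedelta
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    number: int
    max_rto: timedelta  # assumed upper bound on downtime
    max_rpo: timedelta  # assumed upper bound on data loss


# Tiers 1-7 from Table 1, in ascending order of cost. The numeric bounds are
# illustrative assumptions standing in for "days", "hours", "seconds", etc.
TIERS = [
    Tier(1, timedelta(weeks=2), timedelta(days=7)),
    Tier(2, timedelta(days=7), timedelta(days=7)),
    Tier(3, timedelta(days=1), timedelta(days=7)),
    Tier(4, timedelta(days=1), timedelta(days=1)),
    Tier(5, timedelta(hours=4), timedelta(hours=4)),
    Tier(6, timedelta(seconds=30), timedelta(0)),
    Tier(7, timedelta(0), timedelta(0)),
]


def lowest_sufficient_tier(rto: timedelta, rpo: timedelta) -> Optional[int]:
    """Return the lowest (cheapest) tier whose assumed bounds meet the targets."""
    for tier in TIERS:
        if tier.max_rto <= rto and tier.max_rpo <= rpo:
            return tier.number
    return None  # only reachable for negative targets


# An application that tolerates one day of downtime and one day of data loss.
print(lowest_sufficient_tier(timedelta(days=1), timedelta(days=1)))
```

Because the tiers are listed in ascending order of cost, the first match is also the least expensive option under these assumed bounds.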


Levels of Criticality for Data

There are four categories or levels of criticality for application data protected by a DR solution. These classifications are used to determine the tier of DR solution to implement. They are outlined in Table 2 below.

| DATA TIER | DESCRIPTION |
|---|---|
| Critical | Critical to business revenue generating operations. Loss of access to this tier of data translates to lost revenue for the business or inability to conform to regulatory compliance. |
| Important | Important to business processes and internal operations. |
| Semi-Important | Important to internal business process, but can be re-created from other sources. |
| Non-Important | Data that can be re-created from other sources such as reports. |

Table 2: Data Classifications

Example Disaster Recovery Options

There are numerous options around DR services and the best option is dependent upon the Business Continuity and DR planning goals for the business. In larger enterprises, multiple solutions are implemented due to differing requirements across the organization. This holds true at times even in a smaller organization.

At its simplest, the traditional disaster recovery market can be bifurcated into two categories: strategies that provide offsite data protection (and consequently recovery) only, and those that enable the recovery of both data and associated applications. Not surprisingly, as recovery time decreases, solution costs begin to increase exponentially.

While every organization aspires to a zero-downtime environment in the event of a disaster, the reality is that the associated costs and resources required to achieve that goal are far out of the reach of the typical organization.

For small and mid-market businesses in particular, having a true disaster recovery strategy that includes application recovery has often been unrealistic.

Figure 1: Example DR options


The sections that follow give a brief overview of four traditional data center-based DR service options and how cloud-based options contrast in terms of both cost and RTO. For simplicity, this paper focuses primarily on RTO over RPO, with the implicit understanding that the two often correlate. These example options are illustrated in Figure 1 above.

Data Recovery Only Options

Offsite data protection is a critical component to disaster recovery. After all, what good are your applications if you don’t have your data? However, offsite data protection can range widely in terms of RPO, RTO and costs depending on your organization’s implementation.

Generally speaking, however, data recovery options such as offsite tape backup and online disk-based backup carry lower costs but much higher recovery times than solutions that recover both data and applications.

Offsite Tape Backup

The most basic, common, low-cost, and low-tiered DR solution is traditional backup. Typically backups are taken daily to tape and physically shipped offsite.

There are two fundamental problems with offsite backup. The first is that the RTO is very long, especially if there is no standby site. ChemPoint.com, for example, estimated that its previous tape-based disaster recovery strategy carried a recovery point objective of one week and a potential recovery time objective of two weeks.

The second problem lies in the difficulty of testing and verifying recovery procedures. Because the RTO is measured in days, testing a recovery from an offsite backup is expensive in terms of time and resources.

Online Disk-based Backup

A slightly higher-tiered data recovery solution is online disk-based backup. In a traditional environment, online disk-based backup operates very similarly to offsite tape backup, with one notable exception: unlike tape, backups are accessible online, making it possible to decrease recovery time for the data only. However, this decrease in recovery time comes with a corresponding increase in costs. Moreover, this method of backup still focuses solely on data protection. In the event of a true disaster, organizations will still need to bring their applications back online.

A subtle but important consideration in the RTO of online disk-based backup is how long the data takes to be restored across the network. Because all data is typically required for the restoration to complete, there may be instances where it is more practical to ship data on physical media than try to restore over the network.
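The trade-off between restoring over the network and shipping physical media can be estimated with a short calculation. The Python sketch below is one way to do that. The function names, the 80% link efficiency and the example figures are all assumptions, and the shipping time passed in should also cover reading the media back at the destination.

```python
def network_restore_hours(data_tb: float, link_mbps: float,
                          efficiency: float = 0.8) -> float:
    """Hours to pull a full restore across a link.

    data_tb uses decimal terabytes (1 TB = 8e12 bits); efficiency is an
    assumed fraction of nominal bandwidth actually achieved.
    """
    bits = data_tb * 8e12
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


def shipping_is_faster(data_tb: float, link_mbps: float,
                       shipping_hours: float) -> bool:
    """True when shipping physical media beats restoring over the network."""
    return shipping_hours < network_restore_hours(data_tb, link_mbps)


# Hypothetical scenario: 10 TB of backups, a 100 Mbit/s link, 48-hour courier.
hours = network_restore_hours(10, 100)
print(f"Network restore: about {hours:.0f} hours")
print("Ship media instead:", shipping_is_faster(10, 100, shipping_hours=48))
```

Under these assumed figures the network restore takes well over a week, which is why shipping media can be the more practical choice for large data sets.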

Cloud-Based Backup

As cloud-based technology has become more widely accepted, more organizations are turning to cloud storage, especially for backup. Like online disk backup, this has the advantage of being immediately accessible, significantly reducing recovery time, but at a much lower cost.

Organizations that have secondary locations (in the form of a remote office, for example) can immediately access their data online, using the secondary site as a failover for applications from the afflicted site. In the previous ChemPoint.com example, the company reduced its RTO from two weeks to just a couple of hours by combining cloud-integrated storage in all three of its offices.

Moreover, some cloud-integrated storage technologies have the ability to operate in cloud compute environments, making it possible to access data in the cloud without having a secondary site at all. This approach has significant RTO and RPO advantages over tape-based backup and significant cost advantages over online disk-based backup.

Also, cloud-integrated storage can recall individual data blocks immediately and may not require a full restore across the network for partial recovery starting with critical workloads. This may be particularly valuable for larger data sets where it is not practical to transfer all data across a network.
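The idea of recalling individual blocks on demand, rather than restoring everything first, can be sketched as a read-through cache. The Python example below is a toy model only: the `LazyVolume` class is hypothetical, a dictionary stands in for the cloud store, and it does not describe how any particular product is implemented.

```python
from typing import Dict


class LazyVolume:
    """Read-through cache: a block is fetched from the cloud only on first read."""

    def __init__(self, cloud_blocks: Dict[int, bytes]) -> None:
        self._cloud = cloud_blocks          # stand-in for a remote object store
        self._local: Dict[int, bytes] = {}  # blocks already recalled
        self.fetch_count = 0

    def read(self, block_id: int) -> bytes:
        if block_id not in self._local:
            self._local[block_id] = self._cloud[block_id]
            self.fetch_count += 1
        return self._local[block_id]


# A hypothetical 1,000-block volume where recovery touches only three blocks.
cloud = {i: f"block-{i}".encode() for i in range(1000)}
volume = LazyVolume(cloud)
for block_id in (7, 42, 7, 999):
    volume.read(block_id)

print(f"Fetched {volume.fetch_count} of {len(cloud)} blocks")
```

Only the blocks a critical workload actually reads cross the network, so that workload can start before the rest of the data set has been transferred.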

Data and Application Recovery Options

For organizations that require a more comprehensive disaster recovery and business continuity strategy, application recovery in addition to data recovery is critical. These solutions require access to some kind of secondary location with the infrastructure needed to reconstitute both the applications and the data lost at the afflicted site.

Costs here can vary widely based on organization size and complexity, and on whether recovery is required within seconds or minutes or within hours.

Remote Site Disaster Recovery and Business Continuity

Organizations with multiple locations are at a distinct advantage when it comes to disaster recovery and business continuity. Typically, such organizations are able to temporarily use infrastructure from a remote (unafflicted) office until systems can once again be brought up at the disaster site. The primary requirement here is access to data stored at the disaster site (easily done if using a cloud storage solution), as well as the ability to access or install any affected applications.

Cold-Standby Disaster Recovery and Business Continuity Site

Cold standby disaster recovery and business continuity sites afford an RPO of hours and an RTO of hours to a day. Typically, the solution uses a journal or log to mirror writes to a local disk, which are then moved over a dedicated link to a mirrored volume at another site. The solution is inherently expensive because another site is involved. Recovery testing, however, is easier than with backups because the data already resides on storage at the remote site, and the remote site can be used to test the solution. Hosted environments are an example of a cold standby DR site.
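The journal-based mirroring described above can be modeled in a few lines. The Python sketch below is a simplified illustration, assuming an in-memory journal and dictionaries in place of real volumes. The `JournaledVolume` class and its method names are hypothetical.

```python
from typing import Dict, List, Tuple


class JournaledVolume:
    """Asynchronous mirroring: writes land locally and in a journal, then ship later."""

    def __init__(self) -> None:
        self.primary: Dict[int, bytes] = {}
        self.replica: Dict[int, bytes] = {}  # stand-in for the remote mirrored volume
        self.journal: List[Tuple[int, bytes]] = []  # writes not yet shipped

    def write(self, block_id: int, data: bytes) -> None:
        self.primary[block_id] = data
        self.journal.append((block_id, data))

    def ship_journal(self) -> int:
        """Replay pending writes on the replica in order; return how many shipped."""
        shipped = len(self.journal)
        for block_id, data in self.journal:
            self.replica[block_id] = data
        self.journal.clear()
        return shipped


vol = JournaledVolume()
vol.write(1, b"orders-v1")
vol.ship_journal()          # replica now matches the primary
vol.write(1, b"orders-v2")  # written after the last shipment

# If the primary site failed now, the replica would still hold the older data.
print(vol.replica[1], "pending writes:", len(vol.journal))
```

Whatever sits in the journal at the moment of a disaster is the data at risk, so in this model the interval between shipments determines the RPO.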

Hot-Standby Disaster Recovery and Business Continuity Site

The highest-tiered solution is a hot standby DR/BC site with synchronous replication. This is typically a tier-6 or tier-7 solution. The solution mirrors writes in real time over a high-speed link to a hot standby site. The RTO and RPO for such a solution are usually in the range of minutes, but zero-loss solutions are also possible. Although extremely expensive to develop and maintain, hot standbys represent the lowest data and time risk to an organization and are much easier to test than backups.

A modern, lower cost model of this is an always-on cloud compute environment. Similar to its physical counterpart, an in-cloud hot standby requires recurring cloud-based infrastructure costs and secondary application and virtualization licenses to maintain the environment. In addition, many cloud compute environments aren’t easily compatible with on-premises VMware environments – leading to complex, risky orchestration and a reliance on conversion or migration tools that may or may not work.


Disaster Recovery as a Service

Disaster Recovery as a Service (DRaaS) is a flexible alternative to traditional DR solutions in that the cloud is used as the standby site. This allows for a much lower cost solution for the tier-2 through tier-5 range of DR options. It also allows for an RPO and RTO in the range of hours at a price point much lower than traditional multi-site mirroring alternatives.

There are a number of options for implementing DRaaS and, as with any DR solution, the right one for an organization depends on the desired RTO and RPO. Table 3 below outlines a few example DRaaS implementations.

| TYPE | DESCRIPTION | RTO / RPO / COST |
|---|---|---|
| Backup to Cloud / Restore on premise | Data is backed up to the cloud and restored on premise. | Days of downtime / Days of data loss / Moderate cost |
| Replicate to Cloud / Restore on premise | Data is mirrored to the cloud and restored on premise. | 1-2 Days of downtime / 1-2 Days of data loss / Moderate cost |
| Backup to Cloud / Restore in Cloud | Data is backed up to the cloud and restored in the cloud. The cloud has standby VMs, bare metal, or both to restore to. | 4-24 Hours of downtime / 1 Day of data loss / Low cost |
| Replicate to Cloud / Restore in Cloud | Data is mirrored to the cloud or the cloud is used as primary storage. The cloud has standby VMs, bare metal or both. | 4-24 Hours of downtime / Hours of data loss / Low cost |

Table 3: DRaaS examples

In an ideal DRaaS setup, data and VMs are replicated to the cloud in a virtualized environment, either through traditional backup software or through data replication such as vSphere Replication (the latter providing a lower RPO than the former). Applications can then be quickly spun up on an as-needed basis in a complementary virtualization environment in order to ensure compatibility. For a VMware-based organization, for example, the cloud compute environment should be VMware-based as well. While other solutions may spin up applications in a heterogeneous virtual environment, they increase risk by creating dependencies on virtual machine conversion or migration tools.

Such a setup provides the recovery time advantages of a warm standby solution without the astronomical costs. The on-demand nature of the solution also minimizes waste from underutilization.



Summary of Disaster Recovery Solutions

The “right” disaster recovery strategy for a particular organization cannot be determined generically. Rather, each organization needs to weigh its tolerance for risk, its available resources and its budget to determine what best fits its requirements. For many organizations, the right answer will be a hybrid of more than one of the options outlined above, varying by application or data type according to importance to the business.

With the advent and maturation of cloud-based technologies, however, organizations now have a wider variety of choices at lower cost points, making it far easier to reduce both RPO and RTO without a commensurate explosion of overall costs.

About TwinStrata CloudArray Disaster Recovery as a Service

Having recognized a critical hole in the market – a need for a robust but affordable DR solution that provides greater application and data recovery – TwinStrata has developed a Disaster Recovery as a Service offering that delivers VMware-virtualized environments with an always-available way to recover data and applications at a moment’s notice – without racking up ongoing secondary infrastructure costs.

Included with every license of TwinStrata CloudArray, CloudArray DRaaS provides on-demand disaster recovery that can spin up a VMware-based infrastructure in 2-4 hours.

Try CloudArray for Free

Download a free copy of CloudArray today to start enjoying all the benefits of cloud storage. You’ll receive a fully functional release of CloudArray, complete with a free cloud storage account for the evaluation period. http://www.twinstrata.com/free-trial

About TwinStrata

TwinStrata delivers cloud-integrated storage solutions that seamlessly combine the flexibility of cloud-based technologies with the robustness of traditional storage. As a result, customers benefit from significant reductions in IT costs, time and administrative requirements. Customers of all sizes use TwinStrata to capitalize on the cloud’s scalability and economy advantages without sacrificing the security, performance and peace of mind of local storage. More information is available at twinstrata.com.
