
Sorting Through the Confusion: Replacing Tape with Disk for Backups

WHITE PAPER


Table of Contents

Introduction
Considerations When Examining Disk-Based Backup Approaches
Backup Requirements
Backup or Cloud Services
Disk Staging
Primary Storage SNAPs
Backup Application Deduplication in the Media Server
Backup Application Client Side Deduplication
Purpose-Built Target Side Deduplication Appliances
Summary
About ExaGrid


Introduction

The reason a 50-year-old technology like tape is still around is simple: it's cheap. But there is increasing pressure on businesses to fix their backups, as detailed in many sources including the report "Best Practices for Addressing the Broken State of Backup" by Dave Russell, research vice president at Gartner. He found that "for many organizations, backup has become an increasingly daunting and brittle task fraught with significant challenges."

The pressure of data growth has increased sharply as businesses need to store both onsite and offsite copies of their data. This can mean storing 40 to 100 times the volume of their primary dataset, due to keeping weeks of retention onsite and weeks, months or, in some cases, years of retention offsite. Onsite copies are kept in order to recover a deleted or overwritten file, or to recover from a system outage, hardware failure or data corruption that goes unnoticed until the data is needed again, perhaps weeks later. Offsite copies are kept in order to recover data if the primary site has a disaster. Maintaining more copies with longer-term retention is driven by business needs such as SEC audits; regulations such as the Gramm-Leach-Bliley Act (GLBA), the Health Insurance Portability and Accountability Act (HIPAA) and Sarbanes-Oxley (SOX); legal requirements such as legal discovery; service level agreements (SLAs); contractual obligations; and many other business and legal reasons.

Labeling tapes, transporting them, storing them and ultimately finding the right tape when requested is a challenge in itself. This is compounded by the fact that the data may not even be on the tapes, since 30% of tapes are found corrupted, damaged or blank.

Tape has some intrinsic problems

- The number of simultaneous jobs that can write to tape is determined by the number of drives in the tape library, resulting in unnecessarily long backup times.
- Restores fail about 30% of the time from tapes, which can be missing files, have corrupted files, or be unreadable or blank.
- Tapes are physically transported repeatedly and can be lost, misplaced or stolen in transit.
- Tape labels are handwritten, making them subject to human error. They can also fall off or be unreadable.
- Tapes can be damaged by wear through overuse in the rotation, heat, humidity, magnetic fields, dirt and other environmental conditions.
- Data at rest on tapes is not encrypted, and tape encryption dramatically increases the backup window. Tapes can be password-protected, but that requires a system to track passwords, which is itself subject to human error.
- It takes time to restore from tapes, and even more time if, for any reason, a tape set is bad. In that case, you have to fall back to an earlier tape set and start the restore again.
- The time to first find, then retrieve, tapes in transit or at remote locations must be factored in as well.

Inertia and confusion have kept tape alive

The market is full of affordable alternatives to tape, but at the same time there is a lot of confusion about technologies like deduplication. Disk has been making steady inroads on tape as the primary target for backup software, for reasons that go much deeper than price relative to tape, or being faster and easier to manage than tape. Because of the ambiguity surrounding tape alternatives, some businesses are confused when looking to replace tape and tape solutions. Disk is just one part of a whole new equation in which near real-time business continuity and disaster recovery are the desired end results. Disk eliminates the daily grind and uncertainty that typically surround backup to tape. IT staffs get relief from worrying about whether backups and restores are completing successfully or whether backup jobs have failed.

Disk by its very nature fixes all the intrinsic problems of tape

- Volumes or NAS shares are virtual, and a large number of backup jobs can be written in parallel, reducing the backup window.
- Backups and restores are reliable with disk. With tape, up to 15% of backups and up to 30% of restores fail. With disk, virtually 100% of backups complete and 100% of restores succeed.
- Disks sit in hermetically sealed cases inside a temperature- and humidity-controlled data center, eliminating the environmental degradation issues of tape.
- Disks reside in a rack in a data center, which is in turn secured by physical and network security. The security issues of tape moving around are therefore eliminated.
- There are no handwritten labels that can fall off. The software automatically tags all jobs that have been written to disk.
- In addition to physical and network data center security, data stored on disk can be encrypted with only a 3-4% performance reduction. Encrypting while writing to tape dramatically slows down backups.
- Restoring data from backups, including incremental backups, is fast from disk. No time is lost finding or retrieving tapes in transit or at remote locations. Disk is not only more reliable to restore from, it is also random access, whereas tape is sequential access.

Over the last decade, a range of technologies has emerged that makes it feasible for disk to replace tape. Disk-based solutions now offer the benefits that once only tape offered, such as infinite capacity, portability and manageability.

Make better use of resources

The use of disk storage to augment tape, or of disk storage and deduplication to either augment or eliminate tape, is becoming a more logical investment for organizations.


Scarce resources once used to deliver "just" data protection can be repurposed for the strategic business initiatives of disaster recovery and business continuity. Those responsible for planning or carrying out backups are looking for tape alternatives that offer:

- Less IT staff time spent on backups, freeing time to focus on other valuable IT initiatives
- Faster backups
- More reliable backups
- Faster and more reliable restores
- The ability to meet all financial, governmental and legal retention requirements
- All of the above without major changes to the current environment that could create work, risk or change

Considerations When Examining Disk-Based Backup Approaches

Now that it is economically feasible to move from a tape-based to a disk-based backup approach, a large number of vendors with varying approaches have emerged. This has caused a great amount of confusion for IT managers looking to adopt a disk-based backup system for their organization. To help clear away this confusion, this white paper first presents a general overview of several different deduplication approaches. This section shows how deduplication can store far less data on a given amount of disk, using new technologies that minimize the amount of disk required. The result is a cost for disk that is about the same as the cost of tape. For reference, a chart is included that lists the backup requirements of each of these approaches. Next, each of the six potential solutions often considered to replace tape with disk is presented in turn, including the pros and cons of each approach. These six approaches are:

- Backup services or cloud backup services
- Disk staging - storing data on disk that has been inserted between the media servers and the tape libraries
- Primary storage SNAPs
- Backup application data deduplication in the media server, writing to standard disk
- Backup application data deduplication in server agents (client side), writing to standard disk
- Purpose-built target side appliances with deduplication

Data Deduplication Overview

One of the few remaining arguments for tape is that tape libraries will technically never "run out of retention capacity." As soon as a tape cartridge fills up, it can be replaced with another tape cartridge and the full cartridges can be stored. When writing to disk, storing the same amount of data that is stored on tape would require a massive amount of disk, resulting in high cost. However, if you could use a fraction of the space required to store the data on disk and bring the cost of disk storage close to the cost of tape, then disk is clearly the better alternative.

From week to week, only about 2% of the bytes change. With tape backup, however, the 98% of the data that is unchanged is backed up repeatedly, saving identical data dozens and even hundreds of times. With disk, deduplication software can intelligently save only the 2% of the data that changes from week to week. The net result of using disk storage and data deduplication together is that you only need 1/20th to 1/50th of the storage you would need on tape. Since tape costs about 1/20th the price of disk per TB of usable capacity, data deduplication effectively neutralizes the price gap between tape and disk by using far less disk space than is required to store the same data on tape.

There are many methods of data deduplication, including:

- Fixed data block (64KB to 128KB) - used in backup software applications
- Changed storage blocks - used in primary storage SNAPs
- Byte level - used in target side appliances
- Data block with variable content splitting - used in target side appliances
- Zone-byte level - used in target side appliances

All of these methods reduce redundant data in backups. For example, if a full backup of 50TB of data is completed every Friday night and 10 weeks are kept onsite, it would take 500TB of disk space to store the backups. However, most of each full backup is unchanged from week to week. Only the data that has been changed, edited or created that week needs to be stored. On average, only about 2% of the data changes from week to week; in this example, 2% is about 1TB per week.
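To make the mechanism concrete, here is a minimal sketch in Python of the fixed-block approach, simulating the example above (ten weekly fulls, roughly 2% change per week). It is an illustration only, not any vendor's implementation; the 10MB "full backup" and all names are stand-ins.

    import hashlib
    import os
    import random

    BLOCK_SIZE = 64 * 1024  # 64KB fixed blocks, as in some backup applications

    def dedup_write(stream, store):
        """Store only blocks whose content has not been seen; return bytes written."""
        written = 0
        for i in range(0, len(stream), BLOCK_SIZE):
            block = stream[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()  # fingerprint the block content
            if digest not in store:                  # unseen content: keep it
                store[digest] = block
                written += len(block)
        return written

    # Ten weekly full backups in which ~2% of the blocks change each week.
    full = bytearray(os.urandom(10 * 1024 * 1024))   # stand-in for one full backup
    store, logical, physical = {}, 0, 0
    for week in range(10):
        logical += len(full)
        physical += dedup_write(bytes(full), store)
        for _ in range(max(1, len(full) // BLOCK_SIZE * 2 // 100)):
            full[random.randrange(len(full))] ^= 0xFF  # mutate a few blocks

    print(f"deduplication ratio ~ {logical / physical:.1f}:1")

Week one stores everything; each later week stores only the handful of changed blocks, so the ratio climbs with retention. Note that fixed blocks only match when boundaries line up: a single byte inserted near the start of a file shifts every later block, which is one reason fixed-block approaches average far lower ratios than variable content splitting.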

Figure 1 - Data Deduplication Taxonomy


If you were to take out all of the redundant data, over time the storage required can be reduced by as much as 50:1, depending on the deduplication method used.

Factors Impacting Deduplication Results

In general, the higher the deduplication ratio, the better. A higher deduplication ratio uses less disk space over time and needs far less WAN bandwidth to replicate data to the offsite disaster recovery site.

Deduplication Approach

The deduplication approach selected impacts the amount of storage savings that will result:

- 64KB to 128KB fixed block will average about a 7:1 reduction in data storage
- Byte, segment-block and zone-level approaches will average about 20:1 to 50:1

Data Mix Affects Results

The deduplication ratio can range from 10:1 to as much as 50:1, depending on the mix of data types being backed up. Databases can achieve very high deduplication ratios of over 100:1. Unstructured file data will see an average ratio of 7-10:1. Deduplicating compressed or encrypted files does not yield a high ratio or significant space savings.

Retention Period

The longer the retention period, the higher the deduplication ratio will be.
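A simplified model shows why retention drives the ratio. Assume, as above, that about 2% of the data changes each week (call this c) and that N weekly full backups of size F are retained; the deduplicated store then holds roughly one full copy plus the weekly changes:

\[
\text{dedup ratio} \approx \frac{N \cdot F}{F + (N-1) \cdot c \cdot F} = \frac{N}{1 + (N-1)\,c}
\]

With c = 0.02, four weeks of retention gives about 3.8:1, 18 weeks about 13:1, and a year about 26:1, approaching the 1/c = 50:1 ceiling for very long retention.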

Getting the Best Results

The best deduplication ratios will be achieved in environments that are:

- Using byte, data block or zone-level deduplication
- Backing up no compressed or encrypted data
- Retaining data for longer-term periods, on the order of 18 weeks

The worst deduplication ratios will be achieved in environments that are:

- Using 64KB or 128KB fixed block deduplication
- Backing up a large amount of compressed or encrypted data
- Retaining data for shorter-term periods, on the order of 4 weeks or less

The net is that not all deduplication approaches achieve the same results. Deduplication ratios are clearly impacted by data types and retention periods. All of these factors need to be taken into consideration when choosing the proper disk backup approach.

Figure 2 - Deduplication Reduces Storage over Time


Backup Requirements

The chart below shows the top backup requirements of most IT shops, arranged in priority order, with each approach, including staying with tape, in its own column. Not all approaches can meet all requirements. The key is to list your requirements and match them against each solution to see which ones best meet your needs. The following sections show the strengths and limitations of each of the six disk solutions.


Backup or Cloud Services

There are many backup or cloud services to which backup can be outsourced, and the market is evolving as new players enter the field. These services require replacing the server agents used by the backup application; the service can then remotely manage the backup environment. At the start, one complete backup of the data needs to be sent to the backup server. The logistics of this data transfer can be troublesome due to the large, sustained bandwidth required. After the initial full backup is transferred, only the changes in the data need to be uploaded to the outsourced service; most of these agents move only changed bytes once the initial full backup is at the service provider (in the cloud).

Before a cloud backup recovery strategy is implemented, two key factors should be considered. First, what is the recovery point objective (RPO) for the business service in question? Second, what is the recovery time objective (RTO) for that service? Also evaluate carefully the claims made in cloud service contracts. The most important of these contractual promises are the availability of the service, the provider's service level agreements (SLAs), and the security of your data. According to a Yankee Group report [1], "cloud contracts are rife with disclaimers, misleading uptime guarantees, and questionable privacy policies…"

Strengths

- Frees up IT staff to do other core/critical IT tasks

Weaknesses

- Requires changing all the server agents from your existing backup application to the outsourced service's backup agents. Any change to the agents will require weeks or months of tweaking.

[1] http://www.yankeegroup.com/about_us/press_releases/2010-04-21.html

Figure 3 - Typical Cloud Backup


- Good for small amounts of data, typically under 1TB. The best fit is small IT shops or a large company's small remote office, not multi-TB environments. This limitation is due to the time needed to recover the data over the Internet. Under normal operation, only the changed bytes or blocks are sent for replication. However, if a full restore is required, it would take about 31 days to retrieve 1TB of data over a 3Mbps Internet connection (see the worked calculation after this list). It is key to note that what matters is not the bandwidth between your sites, but rather your bandwidth to the Internet.

- If the data is over a few TB, most service providers need to place a hardware appliance (cache) in the IT environment to keep at least one week of backups (including a full backup) onsite, to overcome the recovery bottleneck presented by bandwidth to the Internet. The cost of the cache appliance plus the monthly fees makes a backup or cloud service the most expensive backup choice if you have more than a few TBs of data to protect.
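The 31-day figure is straightforward arithmetic, assuming the link runs continuously at the full 3Mbps with no protocol overhead:

\[
\frac{1\,\text{TB}}{3\,\text{Mbps}} = \frac{8 \times 10^{12}\ \text{bits}}{3 \times 10^{6}\ \text{bits/s}} \approx 2.7 \times 10^{6}\ \text{s} \approx 31\ \text{days}
\]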

Summary

- For consumers, small IT environments (<1TB) and small remote offices with a small or nonexistent IT staff, a small data center (if any) and low bandwidth, a backup service is the best way to go.
- With a reasonable amount of data (>1TB), these services become too cumbersome and too costly.


Disk Staging

Disk staging places disk between the media servers or storage nodes and the tape library. This is also considered tape augmentation. All backup applications can write directly to a disk volume or NAS share, so disk staging works natively with all backup applications. Disk staging reduces the perceived backup window at the client level, reduces the backup verification window at the server level, and provides high speed recovery of files from disk rather than tape.

Strengths

- By placing disk between the media servers/storage nodes and the tape library, many problems are solved:
  - Multiple parallel jobs can be handled without being limited by the number of physical tape drives. This results in faster backups, assuming the media servers can keep up.
  - Reliable backups and reliable restores are assured using disk.

Weaknesses

- Disk staging becomes expensive very quickly:
  - Disk staging does not eliminate the use of tape onsite or offsite. It simply augments tape onsite.
  - There is no data deduplication with disk staging, so the amount of disk grows very quickly and becomes extremely expensive with any level of retention.
  - For example, two weeks of nightly backups and weekly full backups require storing four times the size of the primary data on disk. This assumes a rotation of full backups for databases and email nightly, incremental backups on files nightly, and full backups on Friday.
  - Each night, a combination of incremental backups of files and full backups of databases and email will equal about 25% of a full backup. These Monday through Thursday nightly backups add up to roughly the size of a full backup.

Figure 4 - Disk Staging Concept Overview


Using 40TB of data as an example, nightly backups after four nights will be 40TB and a Friday full backup will be 40TB. Together, they will require a total of 80TB of disk storage. After two weeks, this expands to 160TB of disk storage required. Therefore, 90% of customers using disk staging keep between one and two weeks of data on disk.
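The arithmetic behind this example can be sketched in a few lines of Python (the 25% nightly figure and the rotation schedule are the assumptions stated above; the names are illustrative):

    PRIMARY_TB = 40          # size of the primary data set
    NIGHTLY_FRACTION = 0.25  # file incrementals plus database/email fulls per night
    NIGHTS_PER_WEEK = 4      # Monday through Thursday
    WEEKS_RETAINED = 2       # typical disk staging retention

    nightly_total = PRIMARY_TB * NIGHTLY_FRACTION * NIGHTS_PER_WEEK  # ~40TB per week
    weekly_total = nightly_total + PRIMARY_TB                        # plus Friday full
    disk_needed = weekly_total * WEEKS_RETAINED

    print(f"disk needed without deduplication: {disk_needed:.0f}TB")  # prints 160TB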

Summary

- Disk staging is good for one to two weeks of onsite retention on disk.
- It is estimated that about 70% of tape users use disk staging.
- For retention beyond one or two weeks, or for onsite tape replacement, an organization must use data deduplication in order to store only unique data (not the redundant data), using far less disk and thus reducing the cost impact.


Primary Storage SNAPs

Primary storage SNAPs (quick logical copies, or snapshots) are useful primarily for short-term retention. They are just the first line of defense in a layered backup scheme that includes long-term backups. SNAPs save changed storage blocks on a periodic basis (e.g., hourly), allowing roll back to the last period. Primary storage SNAPs are not intended for long-term or historical backup.

Strengths

- SNAPs allow rolling back to earlier points and are more granular than a nightly backup.
- SNAPs can be replicated offsite for disaster recovery of short-term, periodic SNAP points.

Weaknesses

- SNAPs write into the same volume as the primary data, so they do not offer protection against a system crash, virus attack, data corruption or other event that destroys the primary data; the SNAPs would be destroyed along with the primary data. This is why 99% of IT environments keep a backup copy on a separate system onsite (tape or disk).
- SNAPs are not good for long-term retention uses such as legal discovery, regulatory compliance or SEC audits. When years of retention are required, a traditional backup approach is needed, due to the need to store data at specific points in time but not every interval in between, such as monthly backups for 3 years and then yearly backups for 4 additional years.

Summary

- Primary storage SNAPs and long-term traditional backup can co-exist as part of a multi-layered approach to backup, tailored to the specific requirements of the business. Primary storage SNAPs provide fine-granularity backup points onsite, and also offsite if replicated.
- It is estimated that 99% of IT environments use a traditional, longer-term backup system. About 50% of IT environments deploy some type of primary SNAPs as well.

Figure 5 - SNAPS Concept


Backup Application Deduplication in the Media Server

Some backup applications have a data deduplication feature that can be deployed as an agent in the media server. The intent is to eliminate tape using standard disk in conjunction with the backup application.

Data deduplication is a very compute intensive process. If deduplication is run in the media server, resource utilization increases significantly, which can slow backups down dramatically. To avoid this hit to overall backup performance, backup software uses a form of deduplication that results in a lower reduction rate. Using the least possible processor and memory resources for the deduplication process avoids starving the media server tasks of resources, but at the cost of lowering deduplication effectiveness. Typically this approach uses 64KB or 128KB fixed blocks and yields a data reduction ratio of about 6-7:1. By comparison, target-side appliances that use byte, zone-byte or segment-block deduplication with variable length content splitting average from about 20:1 to as much as 50:1, or a minimum of approximately three times the reduction of the software approach.

In addition, software deduplication can only process data that comes from its own proprietary agents. It cannot deduplicate data from other sources, including other backup applications, utilities or database dumps. Some vendors bundle the media server software on a storage server that includes a CPU, memory and disk; this changes neither the deduplication rate nor the inability to handle heterogeneous data sources.

Strengths

- Relatively simple to manage through the backup application
- Good for environments that have less than 3TB of data to back up, use a single backup application and do not plan to replicate to a second site for disaster recovery

Weaknesses

- Disk usage is high, as the deduplication ratio is only 6-7:1. Over time the disk space required grows sharply.

Figure 6 - Running Deduplication on Media Server


- Bandwidth needed to send backups to a second site is high, as the deduplication ratio is only 6-7:1, versus the 20:1 to 50:1 achieved by target-side appliances.
- Cannot deduplicate data from:
  - Veeam, VizionCore Lightspeed, SQL Safe, Redgate
  - Direct SQL dumps, direct Oracle RMAN dumps
  - Bridgehead for Meditech data
  - Direct UNIX TAR files
  - Other traditional backup applications

Summary

- Deduplication in the backup software is good for short-term retention and low amounts of data, in environments that are not heterogeneous and where offsite disaster recovery data is not required.


Backup Application Client Side Deduplication

Some backup applications offer a form of data deduplication in the application server agents, or clients. The intent is to eliminate tape using standard disk along with the backup application. The deduplication occurs at the backup agent/client on each application server.

Data deduplication is a very compute intensive process. Resource utilization increases significantly if deduplication is run in the application server (client side), slowing backups down dramatically. To minimize this impact, client side deduplication software uses a less efficient form of deduplication, typically 64KB or 128KB fixed blocks, achieving a data reduction ratio of about 6-7:1, versus the 20:1 to 50:1 average of target-side appliances that use byte, zone-byte or segment-block deduplication with variable length content splitting.

Running a compute intensive deduplication process on your application servers creates other performance and availability challenges. Furthermore, databases and email, which make up about 80% of the Monday through Thursday backups, are still sent as full backups, so only about 20% of the nightly data is actually reduced by client side deduplication during the week. The real benefit comes with the Friday night full backup, where 80% of the data is unstructured file data. In addition, this software approach to deduplication can only process data that comes from its own proprietary agents. It cannot deduplicate data from other sources, including other backup applications, utilities or database dumps.

Strengths

- Great fit for deduplicating data from small remote sites, then replicating it back to a corporate datacenter for backup.
- This approach can shorten the backup window, but only for the Friday full backup. During the week, backups of databases and email are still full backups.

Weaknesses

- Requires new agents on servers, with the added risk and cost of changing agents.

Figure 7 - Client Side Deduplication


- Deduplication ratio is only 6-7:1, and the disk space required increases quickly.
- Bandwidth usage to a second site is high, as the deduplication ratio is only 6-7:1, versus the 20:1 to 50:1 achieved by target-side appliances.
- Cannot deduplicate data from:
  - Veeam, VizionCore Lightspeed, SQL Safe, Redgate
  - Direct SQL dumps, direct Oracle RMAN dumps
  - Bridgehead for Meditech data
  - Direct UNIX TAR files
  - Other traditional backup applications

Summary

- Very good for replicating remote site data back to a corporate datacenter.
- Very few businesses actually use this approach, due to its risks to application servers and its other weaknesses.


Purpose-Built Target Side Deduplication Appliances

Target-side deduplication appliances are built specifically to replace the tape library in the backup process onsite and, optionally, offsite. Because they are dedicated appliances, the hardware and the deduplication methods used can be optimized for that single purpose. Future disk space requirements to deal with data growth are drastically reduced, because deduplication ratios from 20:1 to as much as 50:1 can be achieved. Only the data that changes, about 2% of the backup size, is replicated offsite, requiring far less bandwidth. In addition, target-side appliances can process data from a variety of utilities and backup applications.

Strengths

- No change to your backup environment.
- Use all backup applications, utilities and dumps you are currently using. Can take in data from:
  - Traditional backup applications
  - Veeam, VizionCore Lightspeed, Redgate, SQLSafe
  - SQL dumps, Oracle RMAN dumps
  - Direct UNIX TAR files
  - Many other backup applications and utilities
- 20:1 to as much as 50:1 deduplication ratios use less disk space and far less bandwidth for replication (see the worked bandwidth example after these lists).
- Special features for:
  - Tracking data to the offsite disaster recovery site
  - Improving disaster recovery RPO (recovery point objective) and RTO (recovery time objective)
  - Purging data as the retention policy calls for aging out data

Weaknesses

- The backup window improves over using a tape library, but not by as much as client side deduplication improves the Friday night full backup.
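To put the replication bandwidth in perspective, take the paper's own example numbers (a 50TB full backup with about 2% changing per week) and assume, for the sake of the sketch, that replication is spread evenly across the week:

\[
\frac{0.02 \times 50\,\text{TB}}{1\ \text{week}} = \frac{8 \times 10^{12}\ \text{bits}}{6.05 \times 10^{5}\ \text{s}} \approx 13\ \text{Mbps average}
\]

An undeduplicated copy of the same weekly full would need roughly 50 times that.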

Figure 8 - Target Side Deduplication Appliance


Summary

When evaluating different approaches to replacing tape with disk, take the time to ask the right questions and understand the strengths and weaknesses of each alternative.


About ExaGrid

ExaGrid is the leader in cost-effective disk-based backup solutions. A highly scalable system that works with existing backup applications, the ExaGrid system is ideal for companies looking to quickly eliminate the hassles of tape backup while reducing their existing backup windows. ExaGrid's innovative approach minimizes the amount of data to be stored by providing standard data compression for the most recent backups, along with zone-level data deduplication for all previous backups. Customers can deploy ExaGrid at primary and secondary sites to supplement or eliminate offsite tapes with live data repositories, or for disaster recovery. With offices and distribution worldwide, ExaGrid has more than 3,500 systems installed and hundreds of published customer success stories and testimonial videos available at www.exagrid.com.


ExaGrid Systems, Inc | 2000 West Park Drive | Westborough, MA 01581 | 1-800-868-6985 | www.exagrid.com © 2011 ExaGrid Systems, Inc. All rights reserved. ExaGrid is a registered trademark of ExaGrid Systems, Inc.