These materials are the copyright of John Wiley & Sons, Inc. and any dissemination, distribution, or unauthorized use is strictly prohibited.

by Lawrence C. Miller, CISSP

Archiving For Dummies®

ORACLE SPECIAL EDITION

Archiving For Dummies®, Oracle Special Edition

Published by John Wiley & Sons, Inc. 111 River St. Hoboken, NJ 07030-5774 www.wiley.com

Copyright © 2012 by John Wiley & Sons, Inc., Hoboken, New Jersey

Published by John Wiley & Sons, Inc., Hoboken, New Jersey

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Trademarks: Wiley, the Wiley logo, For Dummies, the Dummies Man logo, A Reference for the Rest of Us!, The Dummies Way, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries, and may not be used without written permission. Oracle is a registered trademark of Oracle International Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services, please contact our Business Development Department in the U.S. at 317-572-3205. For details on how to create a custom For Dummies book for your business or organization, contact [email protected]. For information about licensing the For Dummies brand for products or services, contact BrandedRights&[email protected].

ISBN: 978-1-118-28494-0 (pbk); ISBN: 978-1-118-28765-1 (ebk)

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

Contents at a Glance

Introduction

Chapter 1: Recognizing Today’s IT Challenges
    Explosive Data Growth
    Diverse Data Types and Uses
    Legal and Regulatory Requirements
    Logical and Physical Data Migration
    Rising Costs

Chapter 2: Archive 101
    How Does an Archive Differ from a Backup?
    What Are the Different Types of Archives?
    Why Is Tape the Best Archive Media?

Chapter 3: Archive Components
    Archive Software
    Tape Software
    Tape Libraries
    Drives and Media

Chapter 4: Archive Use Cases
    Healthcare
    Media and Entertainment
    Telecommunications
    High-Performance Computing (HPC)

Chapter 5: Ten Key Factors to Consider in Implementing Your Archive

Publisher’s Acknowledgments

We’re proud of this book and of the people who worked on it. For details on how to create a custom For Dummies book for your business or organization, contact [email protected]. For details on licensing the For Dummies brand for products or services, contact BrandedRights&[email protected].

Some of the people who helped bring this book to market include the following:

Publishing and Editorial for Technology Dummies

Richard Swadley, Vice President and Executive Group Publisher
Andy Cummings, Vice President and Publisher
Mary Bednarek, Executive Director, Acquisitions
Mary C. Corder, Editorial Director

Publishing and Editorial for Consumer Dummies

Kathleen Nebenhaus, Vice President and Executive Publisher

Composition Services

Debbie Stailey, Director of Composition Services

Business Development

Lisa Coleman, Director, New Market and Brand Development

Acquisitions, Editorial, and Vertical Websites

Senior Project Editor: Zoë Wykes
Editorial Manager: Rev Mengle
Acquisitions Editor: Katie Feltman
Senior Business Development Representative: Karen L. Hattan
Custom Publishing Project Specialist: Michael Sullivan

Composition Services

Senior Project Coordinator: Kristie Rees
Layout and Graphics: Lavonne Roberts
Proofreader: John Greenough
Special Help from Oracle: Scott Allen, Doug Chamberlain, Donna Harland, Cindy McCurley, Arthur Pasquinelli, Christine Rogers, Allison Roth, Mark Schaffer, Kerstin Woods

Introduction

Since the beginning of time, mankind has communicated written ideas and information with symbols. From the cave paintings of the Paleolithic Age and the hieroglyphs of ancient Egypt, to modern alphabets around the world, information becomes more or less permanent when it is written. All that is required to “read” these permanent records is the ability to see and interpret them.

Today, enormous amounts of information, whether trivial or profound, are written and recorded digitally in thousands of different applications and formats at an absolutely stunning pace. Yet, ironically, this digital information is written as symbols (1’s and 0’s on magnetic media) that represent other symbols (alphabets, for example) and cannot possibly be “seen” by human eyes, let alone interpreted, without the proper tools: computers and their associated software and applications.

Managing these vast repositories and archives for our use today is a challenge in and of itself. But what computers and technology will exist 50, 100, or even 1,000 years from now to interpret the wealth of information that modern society has amassed? What will be the predominant file format? Will your expensive enterprise hard disks be unreadable fossils in the next millennium? Or will all of our achievements over the last 50 years be lost to future generations in what Popular Mechanics has called the Digital Ice Age?

While this book can’t answer all of these questions for the ages, it can help you solve your organization’s archive and data management challenges today — and for at least the foreseeable future!

About This Book

This book consists of five short chapters, covering today’s data archiving challenges, the basics of archives, archive components, use cases, and key factors to consider for your archive solution. Each chapter is written as a stand-alone chapter, so feel free to start reading anywhere and skip around throughout the book!

Icons Used in This Book

Throughout this book, we occasionally use icons to call attention to important information that is particularly worth noting. Here’s what to expect.

This icon points out information that may well be worth committing to your nonvolatile memory!

If you’re an insufferable insomniac or vying to be the life of a World of Warcraft party, take note. This icon explains the jargon beneath the jargon.

Thank you for reading, hope you enjoy the book, please take care of your writers! Seriously, this icon points out helpful suggestions and useful nuggets of information.

Chapter 1

Recognizing Today’s IT Challenges

In This Chapter
▶ Seeing the “data” forest, and all its trees
▶ Using, and re-using, different types of data
▶ Complying with data retention regulations
▶ Keeping data formats and media current
▶ Managing data storage costs

Data retention in our modern digital era is a major challenge for businesses and organizations of all sizes, in all industries, worldwide. Common issues include the explosive growth of digital data, different data types and uses, complex regulatory requirements, data migration difficulties, and rising power, space, cooling, and management costs.

This chapter explores these data retention challenges in depth.

Explosive Data Growth

The march of digital data growth continues at a stunning pace. The 2011 IDC Digital Universe Study estimates that by 2020 the total amount of digital information created, captured, and replicated will grow to 35,000 exabytes (see Figure 1-1). Just to put that in context, it would take almost 1.9 quadrillion (yes, quadrillion) trees to print 35,000 exabytes of data! That’s nearly 5,000 times the number of trees on the entire planet (which NASA estimates at approximately 400 billion)!

A terabyte is equal to 1024 gigabytes, a petabyte is equal to 1024 terabytes, and an exabyte is equal to 1024 petabytes.
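The tree comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (35,000 exabytes, 1.9 quadrillion trees, 400 billion trees on the planet) together with the 1024-based unit definitions; the variable names are illustrative and not part of any tool.

```python
# Unit definitions from the text: each step up is a factor of 1024.
GIGABYTE = 1024 ** 3            # bytes
TERABYTE = 1024 * GIGABYTE
PETABYTE = 1024 * TERABYTE
EXABYTE = 1024 * PETABYTE

# Figures quoted in the text.
projected_exabytes = 35_000
trees_needed = 1.9e15           # "almost 1.9 quadrillion"
trees_on_planet = 400e9         # "approximately 400 billion"

projected_bytes = projected_exabytes * EXABYTE
ratio = trees_needed / trees_on_planet

print(f"35,000 exabytes = {projected_bytes:.3e} bytes")
print(f"Trees needed / trees on planet = {ratio:,.0f}")
```

The ratio works out to 4,750, which is consistent with the "nearly 5,000 times" figure.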

Figure 1-1: The nature of storage and data management has to change!

In many industries (such as health care, life sciences, media/entertainment, and energy) and in specialized markets, such as video surveillance and product life cycle management, the shift to digital content is now beyond the point of no return. These digital transformations are already spurring exponential increases in image data and associated content.

The expanding use of automated sensors, high resolution medical scanners, earth observation satellites, and high performance technical computing applications across a broad range of industries is likewise driving much of this data growth.

At the same time, companies are leveraging more collaboration, social networking, and web-based business applications to boost productivity and improve customer support. Large databases are at the heart of many of these applications. Data mining and analysis of these databases for business intelligence to improve efficiencies and market opportunities is driving the need for storage-intensive data warehousing.

Diverse Data Types and Uses

Enterprises must not only manage the growth of data, but also recognize the value and types of data, and its anticipated uses, within their organizations.

It is widely estimated that more than 80 percent of all organizational data is unstructured. This means that the vast majority of your storage capacity is being used for e-mail, documents, images, and audio and video files. This unstructured data probably has a different value than the data in your business-critical databases, for example. Rather than treating all of your data equally, shouldn’t your lower value data have a correspondingly lower storage cost?

According to IDC, unstructured data is projected to grow at a compound annual growth rate (CAGR) of more than 60 percent, compared to approximately 20 percent CAGR for transactional data.
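To see what those two growth rates mean in practice, the following sketch compounds each rate over five years. The 100 TB starting capacity and the function name are hypothetical, chosen only for illustration; the 60 percent and 20 percent rates are the ones quoted above.

```python
def grow(initial, cagr, years):
    """Return the size after compounding `cagr` once per year for `years` years."""
    return initial * (1 + cagr) ** years

start_tb = 100.0  # hypothetical starting capacity, in terabytes

for year in range(6):
    unstructured = grow(start_tb, 0.60, year)
    transactional = grow(start_tb, 0.20, year)
    print(f"year {year}: unstructured {unstructured:8.1f} TB, "
          f"transactional {transactional:6.1f} TB")
```

At these rates, unstructured data grows roughly tenfold in five years, while transactional data grows about two and a half times.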

Data use and re-use presents another challenge, or opportunity, for organizations seeking innovative solutions to their growing data storage costs. Eighty percent of all data (both structured and unstructured) is never again used or accessed after 90 days. How often do you look at an e-mail message, a sales transaction, or a shipping manifest that is more than 90 days old? Yet this data is frequently stored on the same high-speed, high-performance, high-cost disks as the rest of your active production data.

At the same time, when you do need a file from last year, it holds high value to you again. Therefore, you cannot simply delete all of that data. And with advances and new ways to search and analyze data coming every day, the data you consider inactive today may hold untapped value just around the corner. Archive data must still be retained, protected, and readily accessible when needed, but there are lower-cost alternatives that are better suited to data that is infrequently (or never again) accessed.
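One way to get a feel for how much of your own data falls past the 90-day mark is to scan a directory tree for files that have not been accessed recently. The sketch below is a minimal illustration, not a feature of any archive product; the function name and parameters are hypothetical. It relies on the file system's last-access time, which some systems update lazily or not at all (for example, volumes mounted with `relatime` or `noatime`), so treat the results as a rough estimate.

```python
import os
import time

def find_archive_candidates(root, max_idle_days=90, now=None):
    """Yield (path, idle_days) for files under `root` not accessed within `max_idle_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                last_access = os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if last_access < cutoff:
                yield path, (now - last_access) / 86400
```

A report could then be produced with something like `for path, days in find_archive_candidates("/data"): print(path, round(days))`, where `/data` is a placeholder path.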

Legal and Regulatory Requirements

Increasingly stringent data retention and protection regulations and complex compliance requirements also contribute to the data growth problem. These include the U.S. Health Insurance Portability and Accountability Act (HIPAA), Gramm-Leach-Bliley Act (GLBA), Sarbanes-Oxley (SOX), Canada’s Management of Information Technology Security (MITS) directive, the EU Statutory Audit and Company Reporting Directive (EuroSox), and Japan’s Financial Instruments and Exchange Law (J-SOX), among others.

According to the Storage Networking Industry Association (SNIA), 80 percent of organizations participating in a recent survey responded that they are required to retain data for more than 50 years, and 68 percent of companies require a 100-year archive! These include governments, digital libraries, research organizations, and industries that need to keep track of data on population-wide drug interactions or individual aircraft for 10, 20, or 50 years, for example.

How long does your organization’s archive horizon need to be? Challenge your retention requirements to ensure that they do not expose your organization to excess costs and liability, but still meet your business needs.

Not only do organizations today have longer data retention requirements, but they also have to have their archives readily available and easily accessible, not just locked away in a cave somewhere. It is absolutely critical that your organization have the right combination of archive hardware and software to ensure your data can be archived efficiently, securely, and reliably.

You must be able to accurately catalog and index the contents of your archives and quickly restore to online storage or other media when needed. In the event of litigation, this capability will help your organization reduce the scope of legal discovery and quickly comply with a subpoena while controlling legal costs.

Logical and Physical Data Migration

Long-term retention of digital information also creates unique technical issues for organizations. These include the logical and physical migration of archive data.

Data must be updated, typically every three to five years, to newer formats that are supported by current and future applications. This cycle is known as logical data migration.

Although most common applications today provide some level of backward compatibility for data created and saved with older versions of that application, there are limits to that compatibility, particularly for proprietary applications and unstructured data.

For example, many popular word processing, CAD (computer-aided design), and graphics file formats that were in popular use just 10 or 15 years ago are now obsolete and unreadable.

One solution to this problem is to convert data to a common plain-text format, such as ASCII (American Standard Code for Information Interchange) or Unicode. However, these formats do not maintain the original data structure or metadata, and cannot support rich-text features and graphic images.

Physical data migration refers to the need to copy archived data to newer storage media in order to preserve its integrity over time. Like logical data migration, this typically happens every three to five years, depending on the media type.

Physical data migration is also necessary to ensure that current media formats are used, and that current backup and archiving software can read, write, and catalog the data properly.

Both logical and physical data migrations require extensive time and resources. As the volume of organizational data continues to grow, so too do the resources required to migrate that data.
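The text does not prescribe how to confirm that a physical migration preserved the data. One common approach, assumed here purely for illustration, is to compare a cryptographic checksum of each file before and after the copy. The function names below are hypothetical, and a real migration to tape would go through archive software rather than a plain file copy.

```python
import hashlib
import shutil

def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def migrate_file(source, destination):
    """Copy `source` to `destination` and confirm the copy is bit-for-bit identical."""
    before = sha256_of(source)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps where possible
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"checksum mismatch migrating {source}")
    return after
```

Storing the returned digest alongside the archive catalog entry would let a later migration cycle verify the file again without access to the original media.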

Rising Costs

Although the cost per gigabyte of storage has steadily decreased over time, energy and storage management costs are increasing.

Storage consumes almost half of all data center power today, and it is growing at a rapid rate. Within ten years, the total power consumed by storage will easily represent the majority of the energy consumed in the data center.

The cost of managing this data is exploding as well. The increasing number of data sources, data formats, government and industry regulations, and the business-critical nature of data are driving up management costs year over year, even faster than energy utilization rates. Data and storage management will soon become the number one cost within many data centers.

Considering that 80 percent of all data older than 90 days is never looked at again, you need a better way to deal with massive amounts of data storage.

It is more important than ever to align the value of data with the capabilities and cost of the media on which it is stored. This can best be achieved with storage and archive solutions that:
✓ Drive the cost of storage used for data that is almost never accessed again to virtually zero
✓ Assure access to valuable content that needs to be accessed over the long term
✓ Increase the amount of data that storage and database administrators can manage

Chapter 2

Archive 101

In This Chapter
▶ Defining archives versus backups
▶ Using disk-based and tape-based archives
▶ Choosing the best archive media

This chapter explains exactly what an archive is and helps you to differentiate archives from backups. You also find out about the different archive types and why tape is the best media for long-term archive data.

How Does an Archive Differ from a Backup?

An archive is data storage that is used for long-term retention of permanent records and information. Archives consist of data that is no longer modified or regularly accessed but is still important and has value to the organization. Archive data is retained for a period of time, as defined by organizational policy (or indefinitely) for future reference, and for legal or regulatory compliance. Archives must be cataloged, fully indexed, and searchable, so that data can be easily located and retrieved when needed.

Data archives are sometimes confused with data backups. Although both archives and backups may employ similar hardware and software technologies, they are distinctly different in several ways.

Data archives are used for long-term retention of permanent records. In contrast, data backup is a copy of data that is still in production and is regularly accessed or modified. Archive data is analogous to finished product, whereas production data (and its associated backups) is analogous to work-in-progress (WIP).

A backup is a copy of data. An archive is the data.

Because production data is regularly accessed and modified, it is susceptible to corruption or destruction. In such an event, the backup copy is used to restore the original data.

The purpose of an archive is long-term retention of permanent records. The purpose of a backup is to create a short-term copy of production data in case the original data is corrupted or destroyed.

Organizations typically employ a combination of different backup routines to maintain an accurate copy of production data. These include
✓ Full backups: All of the data is copied.
✓ Incremental backups: Only data that has changed since the last backup is copied.
✓ Differential backups: Only data that has changed since the last full backup is copied.
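The difference between the three routines comes down to which reference point a file's modification time is compared against. The sketch below is a simplified illustration with hypothetical names (`FileInfo`, `select_files`) and made-up timestamps; real backup software tracks changes in more sophisticated ways.

```python
from dataclasses import dataclass

@dataclass
class FileInfo:
    name: str
    modified: float  # modification time, in arbitrary time units

def select_files(files, kind, last_full, last_backup):
    """Return the files copied by a 'full', 'incremental', or 'differential' backup."""
    if kind == "full":
        return list(files)                                    # everything
    if kind == "incremental":
        return [f for f in files if f.modified > last_backup]  # since the last backup of any kind
    if kind == "differential":
        return [f for f in files if f.modified > last_full]    # since the last full backup
    raise ValueError(f"unknown backup kind: {kind}")

files = [FileInfo("a.doc", 10), FileInfo("b.xls", 25), FileInfo("c.ppt", 35)]

# Assume the last full backup ran at t=20 and the most recent backup at t=30.
for kind in ("full", "incremental", "differential"):
    print(kind, [f.name for f in select_files(files, kind, last_full=20, last_backup=30)])
```

With these sample values, the full backup copies all three files, the incremental copies only `c.ppt`, and the differential copies `b.xls` and `c.ppt`.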

By comparison, archiving simply moves data to a separate repository, based on a pre-defined policy, such as the last time a file was accessed or modified.

Both archives and backups must be cataloged so that data can be located when needed. However, archives also require robust indexing and searching capabilities. A typical archive request may be: “I need to locate all files that contain the phrase ‘clinical drug trials’ created between 1995 and 2002.” A similar request for a file restore from backup should result in the requester being banned from ever using a computer again: “I just accidentally deleted a file that contained the word ‘oops’ created between 1995 and 2002, but I have no idea what the complete name of the file was, what directory it was in, or what server it was on. Can you drop everything and restore it for me?”

Speed is important to both archives and backups, but for different reasons.

The ability to quickly index files and perform accurate full-text searches of extremely large (several terabytes or more) data repositories is critical for locating archive data. Archive data is, by definition, data that is not regularly accessed or modified, so it can be migrated to an archive from the production environment at pretty much any time.

Backups today are increasingly being performed on production data in near real-time as backup systems and software become more robust and sophisticated. But regardless of the backup systems and software, backups can still limit access to certain files while running, and can adversely affect system and network performance. For these reasons, most backups typically still occur in a backup window during nonproduction hours. Speed is critical to ensure that all production data can be backed up during the allotted backup window. Speed is also critical when restoring backups. In the event of a disaster, quickly and correctly restoring systems and data can be a daunting task that is of utmost importance to the continuity of business operations. On a much smaller scale, “individual disasters” happen almost every day, requiring a fast recovery capability: “Eek, I just deleted my presentation for our sales meeting and it’s only an hour away!”

Finally, archives and backups often use similar types of storage media. However, archives and backups each have unique characteristics that should more clearly dictate the storage media that is most appropriate for each use.

Archive data is written to media only once, but may be accessed many times over a period of many, many years. Over time, the amount of archive data within an organization typically grows exponentially (refer to Chapter 1). For these reasons, your primary factors for selecting archive media (in order) should be
✓ Reliability
✓ Cost
✓ Speed

Backup tapes are constantly handled and rotated through a backup cycle that performs numerous high-speed reads and writes of data. This significantly shortens the life of a backup tape. Although you may be replacing backup tapes on a regular basis, archive tapes are not normally subjected to that same level of wear and tear. Archive tapes typically have a 30-year life, though you may perform data migrations more frequently (refer to Chapter 1).

Backup data is written to the same media many times over a relatively short period, defined by your organization’s backup cycle. For example, your organization may have a six-week backup cycle that enables it to recover data or system configurations up to six weeks old. Typically, a backup must be completed during a limited backup window to minimize its impact during production hours. For these reasons, your primary factors for selecting backup media (in order) should be
✓ Speed
✓ Reliability
✓ Cost

Do not confuse a backup cycle with a recovery point objective (RPO). A backup cycle defines the oldest version of data that can be recovered. An RPO, used in disaster recovery and business continuity plans, defines the most current version of data that can be recovered.
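The distinction can be made concrete with a small example. The sketch below assumes a hypothetical schedule of weekly backups, with six retained, and a made-up failure time; the function name is illustrative. The oldest retained backup corresponds to the reach of the backup cycle, while the newest one before the failure corresponds to the recovery point.

```python
from datetime import datetime, timedelta

def recovery_window(backup_times, failure_time):
    """Return (oldest, newest) backups that can be used at `failure_time`."""
    usable = sorted(t for t in backup_times if t <= failure_time)
    if not usable:
        raise ValueError("no backup exists before the failure")
    return usable[0], usable[-1]

# Hypothetical schedule: one backup per week, six retained.
failure = datetime(2012, 3, 14, 9, 0)
last_backup = datetime(2012, 3, 11, 23, 0)
backups = [last_backup - timedelta(weeks=n) for n in range(6)]

oldest, newest = recovery_window(backups, failure)
print("Oldest recoverable version (backup cycle):", oldest)
print("Most current recoverable version (recovery point):", newest)
print("Work at risk since the last backup:", failure - newest)
```

Any changes made between the newest backup and the failure are the data at risk, which is why disaster recovery plans set the RPO separately from the retention period.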

What Are the Different Types of Archives?

Archives can be either disk-based or tape-based.

A disk-based archive usually consists of large disk subsystems or storage arrays and is typically implemented with a tiered storage system.

A tiered storage system maintains an organization’s production data on its highest performance drives, such as serial-attached SCSI (SAS) or solid-state drives (SSD), and automatically moves archive data to slower drives, such as serial ATA (SATA).

Disk-based archive data must also be maintained at an off-site location for disaster recovery purposes. This requires a similar disk configuration at a secondary data center with sufficient network bandwidth for copying and replication between the two sites. Disk-based archives can be very costly to acquire, operate, and maintain.

A tape-based archive is usually implemented with a tape library.

Tape-based archive data can be quickly accessed and restored when needed. Today’s tape technology reduces the latency to access data to very acceptable times for most organizations.

Today’s enterprise-class tape libraries have advanced capabilities that include automatic compression, WORM (Write-once, Read-many) technology, and encryption. These tape libraries can be automatically managed to augment expensive disk storage capacity with less costly tape-based storage.

Finally, tapes containing archive data can be easily and securely copied locally, and then transported to an off-site location for disaster recovery purposes. Alternatively, an additional copy of the archive data can be created at a remote location.

Why Is Tape the Best Archive Media?

Disk space is an important part of any enterprise data storage strategy, but it is simply not practical or even desirable to use disk exclusively for all of your enterprise storage needs.

Enterprise storage is not an “either/or” proposition. Flash storage, disk, and tape all have their place in an enterprise tiered-storage strategy, and you have to use the right tool for the job. Flash storage is ideal for tasks with intensive I/O requirements where speed is the most critical factor. Disk works best for primary storage and as a staging area for backups. And tape is ideal for backups and archives.

Tape and disk storage systems can and should coexist in a tiered-storage strategy.

Many storage vendors paint tape storage as an inferior solution to disk, a last-generation technology on the verge of extinction: a dinosaur, if you will. But the reality is that tape storage is not a dinosaur. Tape storage continues to be a key component in the enterprise data center, and most of the world’s information is actually stored on tape! This has been true for many years, and will be well into the future.

Tape also has better error correction rates and longer refresh cycles than disk (see Table 2-1).

Table 2-1 Tape versus Disk Performance

Characteristic                        Disk       Tape
Max shelf life (bit rot)              10 yrs     30 yrs
Best practices for data migration     4-5 yrs    10-15 yrs
Uncorrected bit error rate            10^-14     10^-19
Power and cooling                     290x       1x
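To put the bit error rates in Table 2-1 in perspective, the sketch below estimates how many uncorrected errors to expect when reading one petabyte from each medium. It assumes the table's values are powers of ten (10^-14 for disk, 10^-19 for tape) and that the rate means expected uncorrected errors per bit read; both are interpretations, since the table gives no units.

```python
PETABYTE_BYTES = 1024 ** 5          # using the 1024-based units from Chapter 1
bits_read = PETABYTE_BYTES * 8

# Uncorrected bit error rates as read from Table 2-1 (assumed to be exponents).
disk_ber = 1e-14
tape_ber = 1e-19

disk_errors = bits_read * disk_ber
tape_errors = bits_read * tape_ber

print(f"Bits in one petabyte: {bits_read:.3e}")
print(f"Expected uncorrected errors, disk: {disk_errors:.2f}")
print(f"Expected uncorrected errors, tape: {tape_errors:.6f}")
```

Under these assumptions, a full petabyte read from disk would be expected to hit about 90 uncorrected bit errors, compared with well under one for tape, a difference of five orders of magnitude.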

Finally, tape has a significantly lower total cost of own-ership (TCO) compared to disk. The cost and perfor-mance advantages of tape include ✓ Acquisition costs. The Clipper Group (www.

clipper.com) estimates that the cost to imple-ment a disk-based archive is 15 times more than the cost of a tape-based archive.

✓ Energy savings. An enterprise-class tape library uses much less energy (290 times less according to the Clipper Group!) than disk because it doesn’t spin 24/7 like disk. In a 2010 study, the Clipper Group concluded that the cost of energy alone for the average disk-based solution exceeds the entire TCO for the average tape-based solution.

✓ Management savings. Tape has a higher ratio of petabytes managed per storage administrator than disk. This translates to lower overall labor costs.

✓ Longevity. No matter how you store your data, eventually it has to be moved either due to obsolescence or deterioration of the storage media. It is not uncommon for archive data to remain on tape for up to a decade (though the tape itself can last up to 30 years), whereas disk archives typically need to be replaced every three to five years.

Page 25: Archiving for Dummies

✓ Scalability. Tape storage systems are highly scalable: simply add more tapes for additional capacity and more drives for performance. The amount of tape and capacity that can be stored in a tape library dwarfs the capacity of comparable disk storage systems. Thus, you get more petabytes of storage per square foot in the datacenter with tape than with disk.

✓ Data integrity and auditing. Assuming the data is good when you archive it and the storage media is properly maintained, with tape, WYSIWYG (what you see is what you get) becomes “what you store is what you get.” But disk is constantly subject to corruption due to bad sectors, disk failure, malware, or accidental overwrites.

Debunking five myths about tape storage

Myth #1: Tape is more expensive than disk. Tape costs less per terabyte, consumes less energy, and is less expensive to operate than disk.

Myth #2: Tape is cheaper to buy, but more expensive to operate. The Data Mobility Group reports that the TCO for a Serial ATA-based disk storage system is 11 times higher than that of an LTO-based tape configuration over a seven-year period.

Page 26: Archiving for Dummies

Myth #3: Tape has gone away; no enterprise data center uses it. Most enterprise organizations use a tiered storage strategy with tape as the foundation layer, and nearly half of the world’s data is stored on magnetic tape.

Myth #4: Tape is unreliable. The bit error rate (BER) for Oracle’s enterprise tape products is more than 4 million times better than enterprise disk.

Myth #5: Tape is a greater security risk than disk. Tape is designed to be portable and therefore has a higher potential for loss. As a result, tape encryption became a necessity long before other storage media encryption and is far more advanced. Tape encryption is built into the tape drive and runs without performance degradation.


Page 27: Archiving for Dummies

Chapter 3

Archive Components

In This Chapter

▶ Managing your archive data
▶ Managing your tape data
▶ Checking out tape libraries
▶ Comparing tape drives and media

In this chapter, you learn about the key components of an enterprise archive: archive software, tape libraries, and drives and media.

Archive Software

Archive software components consist of content and data management software.

Content management software

Oracle’s WebCenter Content unifies data into a single repository where organizations can track information uniformly using metadata and logging. This information can then be integrated with business processes and enterprise applications.

Page 28: Archiving for Dummies

WebCenter Content offers best-in-class capabilities for managing data logically throughout its lifecycle based on business needs. Questions such as when data will need to be accessed, when it needs to be stored, and when it needs to be deleted apply to every instance of data a company manages or generates. WebCenter Content automatically manages those lifecycle decisions based on organizational policies to help organizations extract more value from the data.

Archiving best practices are to always have at least two copies of your archive data on tape. With WebCenter Content, you can keep up to four copies on tape.

WebCenter Content provides

✓ Intelligent content management
✓ Collaboration and re-use access to multiple applications
✓ Content management policies based on content
✓ Central search engine capabilities

Data management software

Working in conjunction with content management software applications, such as WebCenter Content (see the preceding section), Oracle’s Sun Storage Archive Manager (SAM) provides physical storage management and is used to optimize data placement across multiple tiers of storage, which can include tape and remote storage, as well as high-capacity disk storage.

Page 29: Archiving for Dummies

SAM presents the file system as if all data is located on primary disk. When an application accesses data that resides only on archive devices, SAM dynamically stages the data to the primary disk or directly to the application for immediate access.

SAM works transparently in the background with tiered storage and makes archive copies based on policies that define file system characteristics. SAM can manage

✓ Thousands of SAN clients
✓ Hundreds of file systems
✓ Billions of files
✓ Petabytes of disk cache
✓ Exabytes of archive

WebCenter Content manages what the data does and what it means; SAM dynamically manages exactly where the data resides in a hierarchy of storage mediums and protects the data with advanced features that include integrity checks, WORM, and encryption.
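The archive, release, and stage behavior described in this section can be pictured with a toy model. The sketch below is illustrative only: the class, method, and volume names are hypothetical and do not reflect SAM's actual interface or configuration. It assumes a policy of two archive copies before disk space may be freed, in line with the best practice mentioned earlier in this chapter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FileEntry:
    """Catalog entry that stays visible to users at all times."""
    name: str
    disk_data: Optional[bytes]  # None once released from the disk cache
    archive_volumes: List[str] = field(default_factory=list)


class ToyArchiveManager:
    """Illustrative only; this is not SAM's interface or on-disk layout."""

    def __init__(self, copies_required: int = 2) -> None:
        self.copies_required = copies_required
        self.catalog: Dict[str, FileEntry] = {}
        self.archive: Dict[Tuple[str, str], bytes] = {}  # (volume, name) -> data

    def write(self, name: str, data: bytes) -> None:
        self.catalog[name] = FileEntry(name, data)

    def make_archive_copies(self, name: str, volumes: List[str]) -> None:
        entry = self.catalog[name]
        if entry.disk_data is None:
            raise RuntimeError("file must be on disk to be archived")
        for volume in volumes:
            self.archive[(volume, name)] = entry.disk_data
            entry.archive_volumes.append(volume)

    def release(self, name: str) -> None:
        """Free disk cache, but only once the policy's copy count is met."""
        entry = self.catalog[name]
        if len(entry.archive_volumes) < self.copies_required:
            raise RuntimeError("policy requires more archive copies first")
        entry.disk_data = None

    def read(self, name: str) -> bytes:
        """Return file data, staging it back to disk if it was released."""
        entry = self.catalog[name]
        if entry.disk_data is None:
            volume = entry.archive_volumes[0]
            entry.disk_data = self.archive[(volume, name)]
        return entry.disk_data


manager = ToyArchiveManager(copies_required=2)
manager.write("scan-0001.dcm", b"image bytes")
manager.make_archive_copies("scan-0001.dcm", ["VOL001", "VOL002"])
manager.release("scan-0001.dcm")          # disk cache freed, entry still listed
print(manager.read("scan-0001.dcm"))      # staged back from the archive copy
```

The point of the model is that the catalog entry never disappears: the caller reads the file by name and does not need to know which tier currently holds the data.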

Tape Software

Tape software components consist of tape analytics and the Linear Tape File System (LTFS).

Tape analytics

Oracle’s StorageTek Tape Analytics software simplifies tape storage management, taking a proactive approach to eliminate library, drive, and media errors through an intelligent monitoring software application exclusively available for Oracle StorageTek tape libraries.

Page 30: Archiving for Dummies

With StorageTek Tape Analytics software, you gain insight into detailed health information that helps you to make decisions about your tape environment prior to device failures (see Figure 3-1). Efficiently monitoring your storage environment is key to cost management. When archive applications encounter problems due to tape drive or media exchange errors, assets sit idle, administrators scramble, and data transfer to end-users is delayed. Any of these setbacks may have significant costs associated with them, leaving storage budgets depleted and users frustrated. With StorageTek Tape Analytics’ proactive approach to tape monitoring, errors are reduced, data flows freely, and the cost of managing an archive is ultimately reduced.

StorageTek Tape Analytics is built to meet four key needs of all archive storage environments:

Figure 3-1: Oracle StorageTek Tape Analytics.

Page 31: Archiving for Dummies

✓ Smart: Intelligent algorithms compute hardware health recommendations.

✓ Secure: Out-of-band tape monitoring adds zero risk for implementation.

✓ Simple: A tool that monitors tape so customers don’t have to. Easy to deploy with a single IP connection to each library and a single pane-of-glass interface.

✓ Scalable: Supports multiple libraries and multiple sites, and is designed to meet needs ranging from those of a single-library user to those of the world’s largest archives.

Linear Tape File System (LTFS)

In order to present a complete file image to a user, two types of data need to be stored: the file metadata, which contains the file structure, file names, file format, and other data elements that are indexed to simplify access to the data on the tape; and the file data, which is the raw file content that is stored on the tape.

A tape that is LTFS-formatted is designed so that it may be split into two partitions. The smaller of the two partitions, at the beginning of the tape, holds all of the file metadata for all of the files on the tape. In the metadata partition, files are stored in a hierarchical directory structure. The rest of the tape, the second partition, is dedicated to data storage, as tape storage has done for decades. Because LTFS is an open format, anyone with a compatible tape drive and the drivers to operate it can read an LTFS tape without assistance from any other software. Oracle’s open source StorageTek Linear Tape File System (LTFS), Open Edition software enables customers to write files to tape in this self-describing format, much the same way files are written to disk and flash storage devices.

Page 32: Archiving for Dummies

When a piece of tape media is loaded into a tape drive, the complete file folder image is displayed, with the file structure being pulled from the first partition and the raw file content being accessed from the second partition. StorageTek LTFS is extremely flexible, with support for all three major tape drive offerings: Oracle’s StorageTek T10000C tape drive and Oracle’s StorageTek LTO-5 tape drives from HP and from IBM.
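The two-partition idea can be sketched with a toy in-memory model. This is only an illustration of the concept described above: the class and field names are hypothetical, and the real LTFS on-tape format is considerably richer than a Python dictionary and a byte buffer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Extent:
    offset: int  # byte offset within the data partition
    length: int  # number of bytes belonging to the file


@dataclass
class ToyTape:
    # First partition: metadata only (path -> where the content lives).
    index: Dict[str, Extent] = field(default_factory=dict)
    # Second partition: raw file content, written sequentially.
    data: bytearray = field(default_factory=bytearray)

    def write_file(self, path: str, content: bytes) -> None:
        extent = Extent(offset=len(self.data), length=len(content))
        self.data.extend(content)   # content is appended to the data partition
        self.index[path] = extent   # the index records where it landed

    def list_files(self) -> List[str]:
        # A directory listing is answered from the index partition alone.
        return sorted(self.index)

    def read_file(self, path: str) -> bytes:
        extent = self.index[path]
        return bytes(self.data[extent.offset:extent.offset + extent.length])


tape = ToyTape()
tape.write_file("/projects/a/report.txt", b"quarterly numbers")
tape.write_file("/projects/b/scan.img", b"\x00\x01\x02")
print(tape.list_files())
print(tape.read_file("/projects/a/report.txt"))
```

Because the listing comes from the small index partition, a user can browse the tape's folder structure without reading through the file content itself.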

Tape Libraries

A tape library is a key infrastructure component in a tiered-storage strategy. Tiered storage aligns the value of your data assets with the most appropriate storage media in order to reduce cost and effectively manage data throughout its lifecycle (see Figure 3-2). Tape libraries provide comprehensive, highly scalable storage solutions for backup and archive applications in enterprise, midrange, distributed, and entry-level data center environments.

Figure 3-2: Tiered-storage is comprised of disk and tape.

Page 33: Archiving for Dummies

To learn more about tiered storage, download your free copy of Storage Tiering For Dummies, Oracle Special Edition at www.oracle.com/us/products/servers-storage/storage/index.html.

Large-scale archive (more than 500 TB)

For enterprise data centers storing more than 500 TB, Oracle offers enterprise-class tape libraries: Oracle’s StorageTek SL3000 and StorageTek SL8500 modular library systems.

For large archives (greater than 5 PB), the Oracle StorageTek SL8500 modular library system, the world’s first exabyte storage solution, delivers significant value through heterogeneous data consolidation and multi-generation media support in an ultra-dense footprint.

Both the StorageTek SL8500 and StorageTek SL3000 use a unique centerline architecture in which drives are kept at the center of the library, thereby alleviating robot contention. Robots travel one-third to one-half the distance required by other libraries, thereby improving cartridge-to-drive performance by up to 50 percent over other libraries.

StorageTek SL8500 and SL3000: At a glance

The StorageTek SL8500 and StorageTek SL3000 modular tape library systems (see the following table) are flexible, highly scalable storage solutions that feature

✓ Scalability and performance with capacity on demand so that you can install physical capacity in advance, then tap into it incrementally when you need it

Page 34: Archiving for Dummies

✓ RealTime Growth capability to non-disruptively add more cartridge slots, drives, and robotics

✓ Easy consolidation with Any Cartridge Any Slot technology for seamless mixed-media support that allows you to combine heterogeneous data sources and media types slot by slot for optimal consolidation

✓ Industry-leading availability with redundant and hot-swappable robotics and library control cards

                                 SL8500                                   SL3000
Cartridge Slots:                 Up to 100,000                            Up to 5925
Capacity (Compressed):           Up to 1 exabyte                          Up to 60 petabytes
Tape Drives:                     Up to 640                                Up to 56
Native Throughput (TB/hr):       Up to 552.9                              Up to 48.4
Tape Drive Choices:              T10000C/B/A, T9840D/C/B/A, LTO 5/4/3/2   T10000C/B/A, T9840D/C, LTO 5/4/3
Number of Physical Partitions:   Eight                                    Eight
Redundant Components:            Robotics, electronics, control path      Robotics, electronics, control path
                                 CAPS, fans, power                        CAPS, fans, power
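As a rough cross-check of the capacity and throughput rows, the following sketch multiplies slot and drive counts by per-cartridge and per-drive figures. It assumes every slot holds a T10000C cartridge (5 TB native) and every drive sustains the 240 MB/sec host data rate, both taken from Table 3-1 later in this chapter. The 2:1 compression ratio is an assumption made here for illustration, as are the function names.

```python
def library_capacity_tb(slots: int, cartridge_tb: float,
                        compression_ratio: float = 1.0) -> float:
    """Total capacity in TB if every slot holds one cartridge."""
    return slots * cartridge_tb * compression_ratio


def library_throughput_tb_per_hr(drives: int, drive_mb_per_sec: float) -> float:
    """Aggregate throughput in TB/hr if every drive streams at full rate."""
    return drives * drive_mb_per_sec * 3600 / 1_000_000


# (slots, drives) per library, from the table above
libraries = {"SL8500": (100_000, 640), "SL3000": (5925, 56)}

for name, (slots, drives) in libraries.items():
    capacity = library_capacity_tb(slots, cartridge_tb=5, compression_ratio=2)
    throughput = library_throughput_tb_per_hr(drives, drive_mb_per_sec=240)
    print(f"{name}: {capacity / 1000:,.2f} PB compressed, "
          f"{throughput:.2f} TB/hr native")
```

Under these assumptions the results land close to the table's "up to" figures, which suggests the published maximums describe fully populated libraries.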

Page 35: Archiving for Dummies

StorageTek SL500: At a glance

The SL500 is a reliable, scalable, simple rack-mounted tape automation solution that features

✓ Enterprise-class reliability with superior robotics and easy serviceability

✓ Cartridge and drive expansion modules and capacity on demand for industry leading scalability

✓ Ideal for consolidation with up to eight native partitions and up to 575 LTO cartridges, providing a maximum native (uncompressed) capacity of over 860 TB (LTO-5) and more than 9 TB per hour of native (uncompressed) throughput

When integrated with a tiered-storage strategy that includes disk, Oracle’s Storage Archive Manager (SAM) and Oracle’s StorageTek T10000C tape media, the StorageTek SL8500 delivers a highly scalable enterprise-class archive system.

Mid-range archive (50 – 500 TB)

Oracle’s StorageTek SL500 tape library is ideal for consolidation of distributed environments, which helps you save time, space, and energy by consolidating multiple libraries and applications into a central location. The StorageTek SL500 is also ideal for rack-based D2D2T (disk-to-disk-to-tape) solutions when combined with SAM and Oracle’s Pillar Axiom 600.

For organizations with mid-range (50-500 TB) data archiving needs, the StorageTek SL500 provides a flexible and scalable archive solution.

Page 36: Archiving for Dummies

However, for customers in this segment who have high availability requirements or also need mainframe connectivity, the StorageTek SL3000 (discussed in the previous section) is the right archive solution.

Small-scale archive (less than 50 TB)

Oracle’s StorageTek SL24 and SL48 tape libraries are designed to meet the data storage demands (including backup, archiving, and disaster recovery) of fast-growing businesses, workgroups, and remote offices.

StorageTek SL24 and SL48: At a glance

The SL48 tape library and SL24 tape autoloader (see the following table) provide reliability, simplicity, and value.

                                 SL48                                  SL24
Cartridge Slots:                 Up to 48                              Up to 24
Capacity (Native):               Up to 72 terabytes                    Up to 36 terabytes
Tape Drives:                     Two full-height or four half-height   One full-height or two half-height
Native Throughput (TB/hr):       1.92                                  0.96
Tape Drive Choices:              LTO 5/4/3                             LTO 5/4/3
Number of Physical Partitions:   Four                                  Two

Page 37: Archiving for Dummies

These small form-factor, rack-mounted libraries offer a number of interchangeable parts and are ideal for small and medium-sized businesses or remote office locations.

Drives and Media

The most common tape drive and media format in use today is LTO (Linear Tape-Open). The latest version is LTO-5 with a native capacity of 1.5 TB (uncompressed) and a maximum speed of 140 MB/s. LTO-5 tape supports dual partitioning and the Linear Tape File System (LTFS, discussed earlier in this chapter), which enables creation of tape-based file systems that are similar to disk-based file systems.

For organizations that grow beyond the scalability of LTO, enterprise-class drives (such as Oracle’s StorageTek T10000C) and cartridges (such as Oracle’s StorageTek T10000 T2) provide higher capacity and throughput performance. Tape drive capacity and throughput are two key considerations when comparing the overall expense of different tape storage solutions.

Other factors to consider when comparing tape drive technology include

✓ Acquisition cost. It’s important to evaluate the combined cost of all drives, media, and library slots, not just individual drive and media costs, as the drives have different capacity and performance points.

Page 38: Archiving for Dummies

✓ Media re-use. Tape drives have different media re-use strategies. Some drives are able to write to previous generation media, while some tape drives allow you to reuse existing media at the full, higher capacity of future drive generations.

✓ Data integrity. StorageTek enterprise tape drives, like the StorageTek T10000C, have many features to improve reliability in archiving environments, including data integrity validation (DIV), which ensures that data is not corrupted while traveling along the data path, and StorageTek T10000 T2 media, which boasts 30+ years shelf life.

✓ Reliability. Drives are designed for different duty cycles and have different features to improve overall reliability.

Table 3-1 summarizes the characteristics of StorageTek T10000C drives and StorageTek LTO-5 tape drives.

Table 3-1 Tape Drive Characteristics

                             T10000C Drive   LTO-5
Capacity (uncompressed)      5 TB*           1.5 TB
Throughput (uncompressed)    252 MB/sec**    140 MB/sec

* 5.5 TB with StorageTek Maximum Capacity feature.
** Native sustained data rate. 240 MB/sec full file host data rate, includes wrap turnarounds.
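To see how capacity and throughput interact, the sketch below uses the figures from Table 3-1 to estimate how many cartridges a one-petabyte archive needs and how long a single drive takes to fill one cartridge. It assumes uncompressed data, decimal units (1 PB = 1,000 TB, 1 TB = 1,000,000 MB), and uninterrupted streaming at the listed rate. The function names are illustrative.

```python
import math


def cartridges_needed(archive_tb: float, cartridge_tb: float) -> int:
    """Whole cartridges required to hold an archive of the given size."""
    return math.ceil(archive_tb / cartridge_tb)


def hours_to_fill(cartridge_tb: float, rate_mb_per_sec: float) -> float:
    """Hours for one drive to write a full cartridge at a steady rate."""
    seconds = cartridge_tb * 1_000_000 / rate_mb_per_sec
    return seconds / 3600


# (capacity in TB, throughput in MB/sec) from Table 3-1
drives = {"T10000C": (5.0, 252.0), "LTO-5": (1.5, 140.0)}

for name, (capacity_tb, rate) in drives.items():
    count = cartridges_needed(1000, capacity_tb)
    hours = hours_to_fill(capacity_tb, rate)
    print(f"{name}: {count} cartridges per PB, "
          f"about {hours:.1f} hours to fill one cartridge")
```

Fewer, larger cartridges mean fewer library slots for the same archive, which is one reason the text recommends comparing the combined cost of drives, media, and slots.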

Page 39: Archiving for Dummies

Oracle’s Optimized Solution for Lifecycle Content Management

For unstructured content management across any industry, Oracle’s Optimized Solution for Lifecycle Content Management (see the following figure) brings together many of the archive components (discussed in this chapter) into a ready-to-implement architecture that removes guesswork for customers. From content ingestion and creation to long-term retention, this architecture brings together best-of-breed components in a streamlined solution for the best TCO, speed to implementation, and reduced risk.

Page 40: Archiving for Dummies

Page 41: Archiving for Dummies

Chapter 4

Archive Use Cases

In This Chapter

▶ Examining archive requirements for healthcare
▶ Looking out for media and entertainment needs
▶ Talking about telecommunications
▶ Crunching data for research and HPC applications

Archive systems are critical for managing data in virtually every business and organization, in every industry.

Data archiving needs are driven by many factors, including customer needs (such as online access to digitized check images for bank customers), legal requirements (such as legal holds for pending litigation), and regulatory compliance (such as Sarbanes-Oxley and the Health Insurance Portability and Accountability Act, or HIPAA).

Not only does archive data need to be retained for long (or indefinite) periods, but it must be fully indexed, searchable, and easily retrievable.

In this chapter, we explore several industry use cases for archives.

Page 42: Archiving for Dummies

Healthcare

Medical records and documents are typically managed through an enterprise content-management system that requires securely storing and accessing documents and records from multiple sources. Archived records are frequently compared with current results throughout the life of the patient, which requires quick and reliable access to all patient data, regardless of where it may be stored. Retention requirements for medical data extend well beyond the life of the patient.

SAM is a proven solution for PACS (Picture Archiving and Communication System) and has been for more than ten years. Refer to Chapter 3 for more about SAM.

Media and Entertainment

The media and entertainment industry requires streaming input and non-linear editing of very large files.

University of Michigan Radiology Department praises StorageTek SL8500

“Oracle’s StorageTek SL8500 Modular Library System is part of our tiered storage strategy, providing a very cost-effective, high-performance archive of our critical patient data. The library provides us with an extremely reliable, scalable, and cost-effective storage solution within a very constrained datacenter environment with limited power and cooling.”

Steve Ramsey, Director of Image Management and Computing Services, University of Michigan

Page 43: Archiving for Dummies

Having an archive server behind a Digital Asset Manager (DAM) solution gives this industry the ability to access more files using less costly storage. The ability to share files among editors improves time to revenue, as well as providing access to older data to generate new revenue.

Telecommunications

Various regulations, such as the European Union’s Directive 2006/24/EC, require companies operating in the telecommunications industry to retain detailed communications data for various periods.

Given the volume of communications traffic today (whether telephony or Internet) from both mobile and landline sources, the need for robust, highly scalable archive solutions to store the vast amounts of data being generated is of paramount importance in the telecommunications industry.

Thought Equity Motion gets disk performance on tape with SAM

Thought Equity Motion (www.thoughtequity.com) is a leading provider of cloud-based video management and licensing services for master-quality video.

“SAM has helped Thought Equity Motion treat multi-petabyte libraries as nearly equivalent to spinning disk without the cost, and with a much longer data management horizon. SAM streamlines the interaction between our applications and the data, eliminating the need for costly direct integrations with storage applications.”

Mark Lemmons, CTO, Thought Equity Motion

Page 44: Archiving for Dummies

High-Performance Computing (HPC)

HPC environments, such as those found in many research settings and in the public sector, use multiple types of data. One example is streaming input from instrumentation that generates video and sound from sensors and other data collection devices; this data is shared by multiple groups for immediate analysis and is also archived for easy access in future comparisons. Another example typical of HPC environments is data that is originally stored on a parallel file system. Parallel jobs are run to analyze the information, and the raw data as well as the results of analysis are archived for future access and analysis.

Oracle StorageTek solutions help ECMWF forecast data archiving needs

The European Centre for Medium-Range Weather Forecasts (ECMWF) develops medium-range and seasonal forecasting using numerical methods and has produced operational medium-range weather forecasts since 1979.

Challenges

✓ Ensure the safe and cost-effective storage of 23 TB of worldwide meteorological data, collected daily

✓ Enable fast access to 20 PB of archive data for 2,300 users and researchers across Europe and the world to improve the accuracy of forecasting and severe weather warnings

Page 45: Archiving for Dummies

✓ Ensure that the ECMWF can continue to collect, store, retrieve, and analyze weather forecasting data, even with a projected 60 percent annual data growth rate

Solution

Implemented three Oracle StorageTek SL8500 modular library systems to provide scalable storage

Results

✓ Provided researchers with fast and easy access to a well-organized archive holding at least 50 years of data collected from weather stations around the globe

✓ Stored data in a safe and secure environment that minimized electricity usage and that has grown from 14 terabytes annually in 1995 to 23 terabytes daily in 2011

✓ Enabled fast retrieval of data held on thousands of tapes, with the system typically handling 9,000 tape mounts per day and rising to 12,000 daily at peak times

✓ Accelerated access to 20 PB of archive data enabling researchers to improve accuracy of forecasts and severe weather warnings

“Oracle’s StorageTek SL8500 modular library system offers us an excellent balance in terms of cost, access, speed, and ease of use. The data stored in it is very close to being online, but without the prohibitive expense of spinning storage.”

Francis Dequenne, Principal Systems Analyst, Data Handling Systems, ECMWF
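A growth rate like the one in this case study compounds quickly. The sketch below is a simple projection that assumes, purely for illustration, that the 60 percent annual growth rate applies to the 20 PB archive as a whole and holds steady from year to year. The function names are hypothetical.

```python
def projected_size_pb(start_pb: float, annual_growth: float, years: int) -> float:
    """Archive size after compounding growth for a number of years."""
    return start_pb * (1 + annual_growth) ** years


def years_to_reach(start_pb: float, annual_growth: float, target_pb: float) -> int:
    """Whole years of compounding needed to reach or exceed a target size."""
    size, years = start_pb, 0
    while size < target_pb:
        size *= 1 + annual_growth
        years += 1
    return years


print(f"After 5 years: about {projected_size_pb(20, 0.60, 5):.0f} PB")
print(f"Years to reach 1,000 PB: {years_to_reach(20, 0.60, 1000)}")
```

Projections like this are a planning aid rather than a forecast; the value lies in seeing how soon a library's slot and drive capacity would need to expand.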

Page 46: Archiving for Dummies

Page 47: Archiving for Dummies

Chapter 5

Ten Key Factors to Consider in Implementing Your Archive

In This Chapter

▶ Knowing what’s important in an archive solution

Enterprise archive requirements go beyond capacity considerations alone. Here are ten things you should consider for your archive.

Availability

An archive needs to provide high availability for your organization’s archive data and fast, reliable access for your users, when they need it. An enterprise archive must have the following capabilities:

✓ Full-text indexing of all archive data
✓ Strong search engine with simple and advanced user-defined search variables
✓ Access to different audiences as different uses are determined for data
✓ Version control for multiple copies of data

Page 48: Archiving for Dummies

Integrity

An archive must preserve the integrity of the original data without data loss or corruption. Archive hardware and software should be fully integrated and work together to minimize silent data corruption (data that is altered without logging an error), sustain data through time, and guarantee data integrity.
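One common, general-purpose way to detect silent corruption is to record a checksum when data enters the archive and compare it on every later read. The sketch below illustrates that idea with SHA-256 from Python's standard library. It is a generic technique, not a description of how any Oracle product validates data, and the function names are illustrative.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Checksum recorded alongside the data at archive time."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, recorded: str) -> bool:
    """True if the data read back still matches its recorded checksum."""
    return fingerprint(data) == recorded


original = b"patient scan 0001"
recorded = fingerprint(original)

# Simulate silent corruption: flip a single bit without raising any error.
corrupted = bytearray(original)
corrupted[0] ^= 0x01

print(verify(original, recorded))          # unchanged data passes
print(verify(bytes(corrupted), recorded))  # the flipped bit is detected
```

The check only detects a problem; recovering from it still depends on having a second good copy, which is why multiple archive copies matter.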

Authenticity

Data that is stored in an archive must be saved in its original format, including all versions and associated metadata. Additionally, an enterprise archive solution should provide the option to preserve the data not only in its original format, but also in a transformed format that conforms to a universal archive standard.

Reusability and Collaboration

Archive data must be available for all users within the organization based on permissions managed with role-based access. Archive data should be stored in open formats that facilitate reusability and collaboration between different user groups. Re-using data allows organizations to save money, for example, by not having to collect data again to run new tests. Companies shorten the product-to-market cycle and hit the mark with new products through analysis of collaborative data.

Archiving America’s Archives

“Reliable retrieval is key: An archive that only stores content is indistinguishable from a landfill. We need technology that reliably delivers all the content whenever requested and tells us proactively if there are issues affecting the retrieval of archived content.”

Scott Rife, U.S. Library of Congress

Page 49: Archiving for Dummies

Security

An enterprise archive solution must protect data against corruption. Robust security features, including access control and encryption, protect the confidentiality, integrity, and availability of your archives.

Sustainability

An archive system must be sustainable through technology changes that occur over time. The “100-year” archive is increasingly becoming a standard requirement in many enterprises. Your archive must be sustainable and able to migrate stored data through numerous inevitable technology changes over a period of many, many years. An open (rather than proprietary) format helps ensure sustainability and facilitates transforming your archives to new formats in the future.

Trustworthiness

Your archive vendor must be committed for the long term. While there are never any guarantees that even the most reputable and financially stable companies will be around in 50 or 100 years, investing in bleeding-edge archive technology from the latest overnight dot.com sensation is a recipe for disaster. Look for a vendor with an established reputation in the industry and a known commitment to your archive success.

Page 50: Archiving for Dummies

Cost-effectiveness

When selecting your archive solution, you must consider the total cost of ownership. This includes not only your initial capital expenditures, but the ongoing operating expenses as well.

Automated tape is the most cost-effective storage medium — between 8 and 12 cents per gigabyte. The Clipper Group (www.clipper.com) estimates this to be up to 15:1 savings over disk.
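The per-gigabyte figures above translate into archive-level costs with simple arithmetic. The sketch below assumes decimal units (1 PB = 1,000,000 GB) and reads the "up to 15:1" estimate as an upper bound on what the same capacity would cost on disk. The function name is illustrative, and the results are rough planning numbers, not quotes.

```python
GB_PER_PB = 1_000_000  # decimal convention (an assumption)


def storage_cost_dollars(capacity_gb: int, cents_per_gb: int) -> float:
    """Cost in dollars for a capacity priced in whole cents per gigabyte."""
    return capacity_gb * cents_per_gb / 100


tape_low = storage_cost_dollars(GB_PER_PB, 8)
tape_high = storage_cost_dollars(GB_PER_PB, 12)
disk_upper_bound = tape_high * 15  # assumes the full 15:1 ratio applies

print(f"Tape: ${tape_low:,.0f} to ${tape_high:,.0f} per PB")
print(f"Disk: up to about ${disk_upper_bound:,.0f} per PB")
```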

Automated Data and Storage Management

Tiered storage enables you to take advantage of disk and tape in your archive. Your archive software must integrate with all of your storage hardware to manage your archive data efficiently and dynamically.

Infrastructure Analytics

Maintaining a healthy archive environment is critical. Analytic software proactively monitors the health of your archive devices.


Page 51: Archiving for Dummies