Presentation delivered by Audax Group CIO to Gartner Symposium ITxpo on managing the Copy Data Explosion with Actifio
This presentation, including any supporting materials, is owned by Gartner, Inc. and/or its affiliates and is for the sole use of the intended Gartner audience or other authorized recipients. This presentation may contain information that is confidential, proprietary or otherwise legally protected, and it may not be further copied, distributed or publicly displayed without the express written permission of Gartner, Inc. or its affiliates. © 2012 Gartner, Inc. and/or its affiliates. All rights reserved.
Erik-Jan Dubóvik
Chief Information Officer
Audax Group
CIO Perspectives: Opportunities in Managing the Copy Data Explosion
About Audax Group
• Background
- Founded 1999, ~140 people, offices in Boston & New York
- Investor in lower-middle market companies
- Manage over $5B of assets through our private equity, mezzanine debt, and private senior debt businesses
Copy Data Management Visualized
Infrastructure-Centric Data Management (Status Quo)
1 Redundant – Multiple silos, same 4 primitives
2 Complex – Keep adding to relieve “symptoms”
3 Slow – Moving lots of data across networks
Information-Centric Data Management
1 Flexible – Any environment (virtual, hybrid…)
2 Simple – One integrated data protection app
3 Fast – Data mounts directly to production
DUPLICATION + INFRASTRUCTURE + OPERATIONS + COMPLEXITY + COST
A whole new market category…
To go from good to great, storage administrators should evaluate these types of tools:
“Copy data management: These products can perform a host of functions, including backup, archiving, replication and creation of test data using a minimal number of copies.”
13 March 2013 ID:G00248888
…And a ‘Best Practice’
Dave Russell, VP Distinguished Analyst
“The notion of copy data management — which reduces the proliferation of secondary copies of data for backup, disaster recovery, testing and reporting — is becoming increasingly important to contain costs and to improve infrastructure agility.” 15 August 2013 G00252768
Best Practices for Repairing the Broken State of Backup
Copy Data Growth Drivers
Q: What are the reasons for growth of secondary data copies?
[Bar chart: % of respondents (N=556), axis from 0% to 80%; bar values not recoverable. Response categories:]
- Increased number of applications
- More copies per application are created
- Larger size of secondary copies to be created
- Regulatory requirements to store data for a specific period of time
- New/expanded use of business analytics
- Lack of data copy management tools and/or practices
- Other
The Power of Copy Data Management
Tools Landscape
[Figure: vendor point tools grouped by function: Tiering, Snapshot, Backup, Dedup, Replication. Products named: RecoverPoint, SRDF, MirrorView, DataDomain, Avamar, Networker, FAST, Timefinder, RM, SnapView, Remote Copy, Continuous Access, DataProtector, AdaptiveOptimization, Virtual Copy, EVA, Snapshot, SnapMirror, SyncSort, CommVault, NetBackup, AST, Inmage, True Copy, SmartTiers, Shadow Image, CoW, BackupExec, VxFS DST, PureDisk, StoreOnce, HDIM, VVR, RealTime.]
Context and Problem
• Situation
- Resource & time intensive business processes require immediate systems performance and limited downtime
- 5 ESX Hosts, 50 servers, 16TB storage, Dual LTO4
- 500k emails/mo (3,500/FTE); annual data growth 10%
• The Problem
- Backup window entering business day
- Business continuity technology didn’t protect all systems & relied on tape* for server restoration
- Level 1 RTOs range 5hrs (SQL) to 12hrs (email), 48hrs (file)
- Backup email service not acceptable for multi-hour use
* If tapes are corrupt, RPO grows to 7 days or longer.
Objectives
• Justification & Business Case
- Fully protect all company systems
- Eliminate need for expensive Tier 1 storage
- Establish Co-Lo for systems and personnel
- Free-up expensive real estate (i.e., NY Server Room)
- Avoid growing IT staff
• Specific Goals & Timeline
- 3 month project start-to-finish
- Major improvement of RPO/RTO
Objectives: RPO/RTO
                     Level 1 RPO/RTO   Level 2 RPO/RTO   Level 3 RPO/RTO
Actifio Target       3hr/15min         18hr/30min        24hr/45min
Previous Capability  24hr/80hr         24hr/8.7 days     24hr/19.6 days
Graphic source: Wikipedia
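To make the gap between the previous and target recovery times concrete, the short Python sketch below converts each RTO above to hours and computes how many times shorter the target is. The dictionary names and the `improvement_factor` helper are illustrative assumptions and do not come from the presentation; only the RTO figures do.

```python
HOURS_PER_DAY = 24

# RTO values taken from the slide, converted to hours.
previous_rto_hours = {
    "Level 1": 80,
    "Level 2": 8.7 * HOURS_PER_DAY,
    "Level 3": 19.6 * HOURS_PER_DAY,
}
target_rto_hours = {
    "Level 1": 15 / 60,
    "Level 2": 30 / 60,
    "Level 3": 45 / 60,
}


def improvement_factor(previous_hours, target_hours):
    """Return how many times shorter the target RTO is (hypothetical helper)."""
    return previous_hours / target_hours


for level in previous_rto_hours:
    factor = improvement_factor(previous_rto_hours[level], target_rto_hours[level])
    print(f"{level}: target RTO is about {factor:.0f}x shorter")
```

For Level 1, 80 hours against 15 minutes works out to a 320-fold reduction.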
Approach: Options
• Alternatives Considered
- Expand existing host-based replication software (DoubleTake, WANSync)
- Veeam + new storage
  • Pushing limits of tech at a comparatively higher cost
• Considerations
- Failover: How long to “spin up” server in Production site? DR?
- Application support: Linux, Exchange, SQL Server, SharePoint?
- Storage: How much required? De-dupe/compression (important if using one device for backup)?
- Replication: Site-to-site on-premise capable? Site-to-Cloud (if so, what limitations, if any)?
- Severability vs. Integration: Acceptable risk if part of VM environment (vs. standalone)?
- Data Restore: Server vs. item-level? Number of snapshots? How long to “spin up” server?
- Cost: Savings from HW/SW elimination, avoidance & downsizing? Staff optimization?
- Timing: Natural refresh cycle of related HW/SW (e.g., storage, dedupe, backup, data center)?
- Connectivity: Local environment (Fibre vs. iSCSI)? WAN (1MB/5/10/100/1GB)?
Approach
• Strategies
- Engage business management to participate in people/process change and define system priorities
- Embrace opportunity around architecture change
• Technologies Leveraged
- Actifio, VMware, Cisco, Metro-E (100MB)
Our Actifio Environment
[Diagram: SITE A: PRODUCTION and SITE B: FAILOVER]
Ingest server ONCE
Capture only changed blocks (zero backup window)
Store only unique blocks (10X lower storage)
Move only unique blocks (70% less bandwidth)
Recreate data on demand
Instantly mount recovered data (zero restore window)
Incremental restore for BC
Challenges & Results
• Biggest Challenges
- Overly aggressive protection SLAs @ start
- Multiple power outages during transition
- Metro-E providers didn’t provide “true” Layer 2
• How Did We Overcome Them (Or Not)?
- Increased RPOs for Level 2 & 3 systems
- Stopped synchronization for 18 hours to re-index system
- Implemented Network Interface Devices (NIDs) to route all Layer 2 traffic (necessary for Metro-E High Availability)
Challenges & Results
• Results: $ and Intangible
- Increased short-term costs, but $150k less than alternative
- Met all RPO/RTO objectives; didn’t meet timeline
  • Metro-E networking issues were unforeseen
• Upside Surprises
- Added near real-time restoration of item-level objects from any backup of Exchange & SharePoint
- Decided to move Production to Co-Lo; new storage implementation to be handled through Actifio
Lessons Learned & Recommendations
• Lessons Learned
- Engage telecom carrier Engineering early on
- Use project as opportunity to review Business Continuity on a holistic basis
- Partner w/ cross-functional vendor (storage, backup)
• What Would We Do Differently?
- Less aggressive with Level 2 & 3 SLAs @ start
- Test network technology earlier & more often
Quantifying The Problem
The Copy Data Ratio (CDR)
CDR = (Total Data in Environment (TB) / Total Amount of Production Data (TB)) x 100
Example: (45TB / 8TB) x 100 ≈ 563
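The CDR arithmetic can be sketched in a few lines of Python. The function name `copy_data_ratio` and its parameters are illustrative assumptions; only the formula and the 45TB / 8TB example come from the slide. The exact result is 562.5, so the sketch rounds half up to match the slide's 563 (Python's built-in `round` would give 562 because it rounds halves to the nearest even number).

```python
import math


def copy_data_ratio(total_tb, production_tb):
    """Return the Copy Data Ratio: total data divided by production data, times 100.

    Hypothetical helper; both arguments are in terabytes.
    """
    if production_tb <= 0:
        raise ValueError("production data must be greater than zero")
    return total_tb / production_tb * 100


cdr = copy_data_ratio(45, 8)        # slide example: 45TB total, 8TB production
rounded_cdr = math.floor(cdr + 0.5)  # round half up
print(cdr, rounded_cdr)              # 562.5 563
```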
Quantifying The Problem
The Copy Data Ratio (CDR): What’s Your Number?
CDR range      Assessment
100 – 150      Optimistic
150 – 350      Opportunistic
350 – 700      Urgency
700 – 1,000    Crisis
Example: 563
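Mapping a CDR score to its band can be sketched as a simple lookup. The slide's ranges share their boundary values (150, 350, 700), so this sketch assumes each band includes its lower bound and excludes its upper bound, and that scores at or above 1,000 still count as "Crisis". Both choices, along with the names `CDR_BANDS` and `classify_cdr`, are assumptions rather than anything stated in the presentation.

```python
# Bands from the slide; lower bound inclusive, upper bound exclusive (assumption).
CDR_BANDS = [
    (100, 150, "Optimistic"),
    (150, 350, "Opportunistic"),
    (350, 700, "Urgency"),
    (700, 1000, "Crisis"),
]


def classify_cdr(cdr):
    """Return the slide's label for a CDR score (hypothetical helper)."""
    if cdr < 100:
        # A CDR below 100 would mean total data is smaller than production data.
        raise ValueError("CDR cannot be below 100")
    for low, high, label in CDR_BANDS:
        if low <= cdr < high:
            return label
    # Assumption: anything at or above the top of the scale is still "Crisis".
    return "Crisis"


print(classify_cdr(563))  # Urgency
```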
Evaluating CDR Score in Relation to Operational Complexity
[Quadrant chart: Copy Data Ratio (Low to High) on one axis, Tools in Use (Low to High) on the other; the example CDR of 563 is plotted.]
1 Transformational opportunity for savings, efficiency gains
2 Large opportunity for savings, efficiency gains
3 Opportunity for savings, some efficiency gains
4 Limited savings, efficiency opportunities
Summary
• Copy data is a source of significant spend and inefficiency in the enterprise
• Impact felt most severely on revenue-generating and business-agility initiatives
• Delays / issues due to resource drain from copy data sprawl
• Important to understand the magnitude of the problem
• Calculating the Copy Data Ratio (CDR) can help influence an action plan based on effort / impact analysis