April 24, 2014
Gideon Senderov Director, Advanced Storage Products
NEC Corporation of America
Scale-out Object Store for PB/hr Backups and Long Term Archive
Long-Term Data in the Data Center
Page 2 © NEC Corporation of America 2014
[Figure: Consumption of Enterprise Disk Capacity by Type, in EB, 2011-2016. Series: Content Depots & Cloud, Unstructured data, Replicated data, Structured data. CAGR labels: 22.5%, 47.5%, 28.8%, 46.0%; an additional label reads 83%. Source: IDC, November 2012]
PBBA Market Forecast (CY2012 - CY2017)
(Source: IDC)
Market is growing fast.
Archive Market Trend
▌ Rapid growth is expected in the disk-based archive market, driven by wide use of active archive: CAGR of 35%, reaching $5.7B worldwide in 2016
▌ File- and Object-Based Storage (FOBS) market exceeding $23B in 2013, projected to grow at a CAGR of 25% and reach $38B worldwide in 2017 (IDC, August 2013)
[Figure: WW Disk-based Archive sales forecast. Sales revenue ($M), 2010-2016, axis from 0 to 6,000; CAGR 35%. Source: IDC, June 2013]
Storage Infrastructure Spending
Scale-out Object Storage
▐ Object Storage: a storage architecture that manages data as objects, as opposed to other storage architectures such as file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks.
Object storage systems allow relatively inexpensive, scalable and self-healing retention of massive amounts of unstructured data.
Source: Wikipedia, April 2014
Legacy Solutions – Scalability Limitations
▐ Inadequate scalability of capacity & performance
  • Cannot scale performance to keep up with data growth
  • Multiple products with different architectures
  • More siloed capacity to manage
▐ Limited deduplication scope
  • Limited scalability proliferates duplicate data across appliances
  • Lower deduplication ratio for large environments
Local vs. Global Deduplication
▐ Single deduplication repository across entire solution
▐ Data deduplication across ALL data from ALL nodes
  • Cross-node deduplication for greater efficiency
  • Cross-application deduplication leveraging application-aware deduplication
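The difference between local and global deduplication can be illustrated with a small sketch. It is not the product's implementation: the chunk contents, the use of SHA-256 as the fingerprint, and the function names are all assumptions chosen for illustration. It counts how many unique chunks must be stored when each appliance keeps its own repository versus when all nodes share one.

```python
import hashlib

def fingerprint(chunk: bytes) -> str:
    """Content hash used as the chunk's identity (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()

def stored_chunks_local(streams_per_appliance):
    """Each appliance deduplicates only against its own repository."""
    total = 0
    for streams in streams_per_appliance:
        repo = set()
        for stream in streams:
            repo.update(fingerprint(c) for c in stream)
        total += len(repo)
    return total

def stored_chunks_global(streams_per_appliance):
    """All nodes deduplicate against one shared repository."""
    repo = set()
    for streams in streams_per_appliance:
        for stream in streams:
            repo.update(fingerprint(c) for c in stream)
    return len(repo)

# Hypothetical workload: two appliances receive backups that share chunks.
appliance_a = [[b"os-image", b"db-page-1", b"db-page-2"]]
appliance_b = [[b"os-image", b"db-page-1", b"mail-1"]]
workload = [appliance_a, appliance_b]

print(stored_chunks_local(workload))   # chunks kept with per-appliance repositories
print(stored_chunks_global(workload))  # chunks kept with a single global repository
```

In this toy workload the shared chunks are stored once per appliance in the local case and once overall in the global case, which is the effect the slide describes.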
Legacy Solutions – Resiliency Limitations
[Diagram: a pool of unique blocks 1-8 referenced by deduplicated backups (Day 1: Full, Day 2: Incremental, Day 3: Incremental, … Day 7: Full); block #1 is shared across the backups]
Q: What data can be restored if block #1 is lost?
A: NONE!
Traditional RAID is not sufficient for deduplicated data
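The point of the question above can be sketched in a few lines. The block maps below are hypothetical, not the ones in the slide's diagram: each backup is a list of references into a shared pool of unique blocks, and a backup is restorable only if none of its referenced blocks is lost.

```python
def restorable(backups, lost_blocks):
    """Return the names of backups that reference none of the lost blocks."""
    lost = set(lost_blocks)
    return [name for name, blocks in backups.items() if lost.isdisjoint(blocks)]

# Hypothetical block maps; every backup happens to reference block 1.
backups = {
    "day1_full": [1, 2, 3, 4, 5, 6, 7, 8],
    "day2_incr": [1, 4, 6, 2],
    "day3_incr": [7, 1, 4, 6],
    "day7_full": [8, 1, 4, 7, 5],
}

print(restorable(backups, lost_blocks=[1]))  # losing a widely shared block
print(restorable(backups, lost_blocks=[3]))  # losing a block used by one backup
```

Because deduplication stores each block once, the loss of a single widely referenced block affects every backup that points to it, which is why the slide argues for stronger protection than traditional RAID.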
Scale-out Scalable Grid Storage Architecture
Community of Smart Nodes
▐ Nodes
  • Industry-standard servers
  • Multiple types allowed
  • Heterogeneous & open
▐ Unrestricted Scalability
  • Performance scalability
  • Capacity scalability
▐ Intelligent Management SW
  • Fully distributed system
  • Self-aware & self-organizing
  • Data management services
  • Virtualizes hardware platform
[Diagram labels: Accelerator Nodes, Storage Nodes, Hybrid Nodes]
Adaptive Grid Storage For Long-Term Data
▐ Enable in-place technology refresh with no data migration
▐ Ever greener storage with faster, denser components
▐ Enable continuous data availability
▐ Reduce CapEx and OpEx with deduplication
▐ Non-disruptive scaling through dynamic auto-provisioning of storage
[Diagram: Online Upgrade/Expansion with Multi-generation Nodes. Non-disruptive Add/Upgrade, Non-disruptive Remove. V2 Grid + V3 + V4 + Vx = 1 System]
Hands-Free Management
▐ Simple, fast deployment
  • < 45 minutes to backup & archive
▐ Self-discovering capacity
  • No storage provisioning
  • No tape emulation tasks
▐ Self-tuning and Resource Management
  • Optimized performance & capacity
▐ Self-healing
  • Automatic recovery across resources
▐ Web-browser GUI
  • Monitoring and Planning
Advanced Erasure-Coded Data Resiliency
▐ “User dialable” disk/node protection
  • Default protection against 3 concurrent failures
  • Dynamically allocated
  • Intermix of multiple resiliency levels (1-6) for different applications
▐ Greater protection with lower overhead
  • Default setting: 25% capacity overhead
  • 1.5x greater protection than traditional RAID 6 with lower overhead and faster recovery
  • No idle spare drives
▐ Faster self-healing with less performance degradation
  • Only data is reconstructed rather than entire drive
  • Data is reconstructed across multiple spindles
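The default figures above (3 concurrent failures, 25% capacity overhead, 1.5x the protection of RAID 6) can be checked with simple arithmetic. The slide does not state the fragment layout, so the sketch below assumes 9 data fragments plus 3 parity fragments, which is one layout consistent with those numbers; the function names are illustrative only.

```python
from fractions import Fraction

def parity_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Fraction of raw capacity consumed by redundancy fragments."""
    return Fraction(parity_fragments, data_fragments + parity_fragments)

def tolerated_failures(parity_fragments: int) -> int:
    """A maximum-distance-separable erasure code survives the loss of
    as many fragments as it has parity fragments."""
    return parity_fragments

# ASSUMPTION: 9 data + 3 parity fragments; the layout is not given on the slide.
data, parity = 9, 3
RAID6_TOLERATED_FAILURES = 2  # RAID 6 survives two concurrent drive failures

print(float(parity_overhead(data, parity)))                    # share of raw capacity
print(tolerated_failures(parity) / RAID6_TOLERATED_FAILURES)   # relative to RAID 6
```

Under this assumed layout, a different resiliency level would simply change the number of parity fragments, trading capacity overhead against the number of failures survived.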
Lab Benchmark Test
[Diagram: two Accelerator Nodes (AN) and four Storage Nodes (SN)]
Scale-out vs. Scale-up Data Deduplication
Scale-up [Diagram: one Controller with four Shelves]
▐ Single or Multi-Controller
▐ Fixed maximum throughput
▐ Scalability limited by controller physical capabilities
Scale-out
▐ Modular scale-out front-end
▐ Scalable maximum throughput
▐ Scalability independent of physical capabilities with linear performance
▐ Prevents silos of deduped data
Scale-out Inline Global Data Deduplication
▐ Distributed Two-Tier Architecture
  • Independent linear scalability of performance & capacity
▐ Global Deduplication
  • Data deduplication across ALL data from ALL nodes
▐ Distributed Hash Table
  • Data routed to responsible Storage Node
  • Deduplication & hash table processing scales linearly with Storage Nodes
[Diagram: two Accelerator Nodes (AN) and four Storage Nodes (SN)]
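A minimal sketch of hash-based routing shows why a distributed hash table yields global deduplication: identical content always maps to the same Storage Node, so only that node needs to check for a duplicate. The class names, the SHA-256 fingerprint, and the modulo placement rule are assumptions for illustration, not the product's actual design.

```python
import hashlib

class StorageNode:
    def __init__(self, name):
        self.name = name
        self.chunks = {}  # fingerprint -> chunk bytes (this node's slice of the hash table)

    def put(self, fp, chunk):
        """Store the chunk unless it is already present; return True if it was new."""
        if fp in self.chunks:
            return False
        self.chunks[fp] = chunk
        return True

class Grid:
    def __init__(self, node_count):
        self.nodes = [StorageNode(f"SN{i}") for i in range(node_count)]

    def responsible_node(self, fp):
        # Hypothetical placement rule: fingerprint modulo node count.
        return self.nodes[int(fp, 16) % len(self.nodes)]

    def write(self, chunk):
        """Route a chunk by its fingerprint; return (node name, was_new)."""
        fp = hashlib.sha256(chunk).hexdigest()
        node = self.responsible_node(fp)
        return node.name, node.put(fp, chunk)

grid = Grid(node_count=4)
first = grid.write(b"block-A")
second = grid.write(b"block-A")  # same content, possibly arriving via a different front-end node
print(first, second)
```

Each node only searches its own slice of the table, so lookup work is spread across nodes as the grid grows. Plain modulo placement would remap most chunks when a node is added, so a real system would need a placement scheme that limits such movement.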
Linear Performance Scalability
Scale-out Deduplication Performance
Application Metadata Impact
[Diagram labels: Clients’ Files, Agent Operation, Backup Server, File Aggregation (tar), Blocking, Storage Media, Metadata Filtering]
▐ Application metadata makes user data appear different
▐ Inserted metadata reduces deduplication efficiency
Application-Aware Deduplication
[Figure: Capacity Optimization through Enhanced Deduplication. Data (Bytes) over Time (Weeks); series: Original Data, Generic Dedupe, Application-Aware Dedupe]
▐ Application-aware deduplication leverages format awareness to filter metadata inserted by the application and deduplicate the data payload separately
▐ Application awareness can increase the reduction ratio by 130% or more
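The effect of inserted metadata can be sketched with a made-up stream format. Everything about the format here is hypothetical (an 8-byte job-specific header in front of each fixed-size payload); real backup formats differ. The sketch compares format-unaware chunk hashing with hashing after the header is filtered out.

```python
import hashlib

HEADER_LEN = 8     # hypothetical per-record metadata inserted by the backup application
PAYLOAD_LEN = 24   # hypothetical fixed payload size
RECORD_LEN = HEADER_LEN + PAYLOAD_LEN

def make_stream(job_id: int, payloads):
    """Build a backup stream: each payload is prefixed with job-specific metadata."""
    header = job_id.to_bytes(HEADER_LEN, "big")
    return b"".join(header + p.ljust(PAYLOAD_LEN, b"\0") for p in payloads)

def fingerprints(chunks):
    return {hashlib.sha256(c).hexdigest() for c in chunks}

def generic_chunks(stream):
    """Format-unaware: hash each record-sized window, metadata included."""
    return [stream[i:i + RECORD_LEN] for i in range(0, len(stream), RECORD_LEN)]

def payload_chunks(stream):
    """Format-aware: skip the header and hash only the payload."""
    return [stream[i + HEADER_LEN:i + RECORD_LEN] for i in range(0, len(stream), RECORD_LEN)]

files = [b"alpha", b"beta", b"gamma"]
monday = make_stream(1, files)
tuesday = make_stream(2, files)  # same user data, different inserted metadata

generic_shared = fingerprints(generic_chunks(monday)) & fingerprints(generic_chunks(tuesday))
aware_shared = fingerprints(payload_chunks(monday)) & fingerprints(payload_chunks(tuesday))
print(len(generic_shared), len(aware_shared))
```

In this toy case the generic approach finds no chunks in common between the two backups, while the format-aware approach recognizes all three payloads as duplicates.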
[Diagram labels: WAN-optimized Replication; Geo-Distributed Grid; Site A, Site B, Site C]
Disaster Recovery and Cloud
WAN-Optimized Replication
[Diagram: Accelerator Nodes and Storage Nodes; many-to-one replication and many-to-many replication]
▐ Asynchronous grid-to-grid WAN-optimized replication for DR
▐ Deduplication across all replicated HYDRAstor grids
  • Minimizes network bandwidth requirements
  • Minimizes DR site capacity requirements
▐ Policy-based data selection – File System Granularity
▐ In-flight encryption
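The bandwidth saving from deduplicated replication can be sketched as follows. This is an illustration under assumptions, not the RepliGrid protocol: the target store is a plain dictionary, SHA-256 stands in for the fingerprint, and encryption in flight is not modeled. Only chunks whose fingerprints the target lacks are sent.

```python
import hashlib

def fp(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()

def replicate(source_chunks, target_store):
    """Copy chunks to the target, sending payload bytes only for fingerprints it lacks.

    Returns the number of payload bytes that crossed the (simulated) WAN.
    """
    sent = 0
    for chunk in source_chunks:
        key = fp(chunk)
        if key not in target_store:  # chunks already at the target are skipped
            target_store[key] = chunk
            sent += len(chunk)
    return sent

# Hypothetical DR grid that already holds data replicated from another site.
dr_store = {fp(c): c for c in [b"os-image", b"db-page-1"]}
site_b_backup = [b"os-image", b"db-page-1", b"mail-1"]

bytes_sent = replicate(site_b_backup, dr_store)
print(bytes_sent, sum(len(c) for c in site_b_backup))  # bytes sent vs. logical size
```

Because the DR store deduplicates across every site that replicates into it, data already received from one site does not need to be sent or stored again for another, which reflects both bullets about bandwidth and DR capacity.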
Customer Configuration Example
▐ Replication across primary and secondary datacenter grid systems
▐ Replication from remote sites within the U.S., and across the Atlantic and the Pacific
HYDRAstor Gen-4 Family

Model           Throughput   Throughput (OST)   Raw capacity   Effective capacity
HS3-410         3.2TB/hr     3.6TB/hr           8-24TB         104-312TB
HS8-4001        4.9TB/hr     6.3TB/hr           12-48TB        156-624TB
HS8-4002-96     9.7TB/hr     12.6TB/hr          96TB           1.2PB
HS8-4001-96     4.9TB/hr     5.6TB/hr           96TB           1.2PB
HS8-4002-192    9.7TB/hr     12.6TB/hr          192TB          2.5PB
HS8-4006-720    29.2TB/hr    37.8TB/hr          720TB          9.4PB
HS8-4165-7920   802TB/hr     1,040TB/hr         7,920TB        103PB

[Diagram: Single Node Model and Grid Model plotted by Capacity and Performance. Node Building Blocks: Hybrid Node, Storage Node]
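The relationship between raw and effective capacity in the model list can be checked with a short calculation. The figures are taken from the slide; treating 1 PB as 1,000 TB and the helper name `effective_ratio` are assumptions. The slide does not state which data reduction ratio the effective figures assume, so the sketch only derives the ratio implied by the published numbers.

```python
# (model, raw TB, effective TB) from the slide; 1 PB is treated as 1,000 TB (assumption).
MODELS = [
    ("HS3-410 (min)", 8, 104),
    ("HS3-410 (max)", 24, 312),
    ("HS8-4001 (min)", 12, 156),
    ("HS8-4001 (max)", 48, 624),
    ("HS8-4002-192", 192, 2500),
    ("HS8-4006-720", 720, 9400),
    ("HS8-4165-7920", 7920, 103000),
]

def effective_ratio(raw_tb: float, effective_tb: float) -> float:
    """Effective capacity divided by raw capacity."""
    return effective_tb / raw_tb

for model, raw_tb, eff_tb in MODELS:
    print(f"{model:16s} {effective_ratio(raw_tb, eff_tb):5.2f}:1")
```

The implied ratio is close to 13:1 for every entry listed here. The two 96TB models are left out because their effective capacity is rounded to 1.2PB, which works out to 12.5:1.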
Common Code Base and Functionality
▐ Common code/features and modular scalability across models
▐ Intermix of multi-generation nodes in the same grids
▐ All Software Features supported for entire product line
  • DRD™ protection – Advanced erasure-coded data resiliency
  • DataRedux™ – Inline application-aware global data deduplication and compression
  • Cloning/Snapshot – Instant fully deduplicated file system or file R/W copy
  • Dynamic Data Shredding – Data shredding for deleted classified data
  • AN Failover – Front-end failover for High Availability
  • RepliGrid™ (Option) – WAN-optimized replication with in-flight encryption
  • HYDRAlock™ (Option) – WORM file system functionality
  • Encryption at Rest (Option) – Encryption to protect data at rest
  • HYDRAstor OpenStorage Suite (Options)
      • Express I/O – Lightweight data transport for high throughput
      • Dynamic I/O – Adaptive I/O load balancing across nodes
      • Optimized Copy – WAN-optimized copy services
      • Optimized Synthetics – Storage-synthesized full backups
Scale-out Storage for Long-Term Data
1 PB/hr 100 PB
Potential Enhancement Directions
[Diagram: axes labeled Performance and Functionality]