Roberto Tolini - NetApp Business Solutions Architect EMEA
NetApp Distributed Content Repositories: What Are We Doing in Real Life?
2
“Big Content” and Object Storage
StorageGRID: Overview and Architecture
Where does it fit? Use cases and target markets
Competition overview
How to Prove it works? PoC, Test and Demo capabilities
Summary, resources, and contacts for EMEA
Agenda
3
An Introduction
Big Content and Object Storage
4
What Does Your Corporate Data Look Like?
Other: 5%
Structured Content: 15%
File Data: 80%
Human-generated and machine-generated file data represent ~80% of all corporate data
This data cannot be deleted, even though…
…97% of this data will never be touched again
It’s too expensive to keep this data on primary storage
5
Some numbers to start
6
All That Data Is Stressing the Infrastructure
Challenges
Rapid, untamed growth of unstructured data
Perpetually retain large and growing datasets
Distributed users and app environment
Needs PB scale, billions of objects,
reduced operational overhead, efficient management
Policy-based placement, seamless technology refresh
Predictable, location-independent access anywhere, anytime
7
So: what exactly is Object Storage?
Block: specific location on disks / memory (tracks, sectors)
File: specific folder in fixed logical order (file path, file name, date)
Object: flexible container size (data and metadata, unique ID)
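To make the "data and metadata, unique ID" idea concrete, here is a minimal in-memory sketch of an object store. It is only an illustration: the class and method names (`ObjectStore`, `put`, `get`, `find`) and the metadata keys are hypothetical and are not the StorageGRID API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Flat namespace: objects are addressed by ID, not by path or disk location."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())  # unique ID handed back to the caller
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


# Hypothetical metadata keys, chosen only for the example.
store = ObjectStore()
xray_id = store.put(b"...image bytes...", modality="x-ray", patient="p-001")
note_id = store.put(b"...text...", modality="note", patient="p-001")
```

The caller never chooses a folder or a file name; it keeps the returned ID and can later look objects up by their metadata.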
8
Distributed Content Repositories Based on NetApp StorageGRID Software
Large content repository for big, unstructured data
Billions of data sets, dozens of petabytes
Create, manage and consume content globally
Predictable access to data independent of location
Policy-controlled data stores at each site
Intelligent data classification and access
Metadata-based management
9
NetApp StorageGRID: Overview and Architecture
10
StorageGRID
− Acquisition of Bycast Inc. in 2010 with a decade of object storage innovation
− Footprint in long-term archive, healthcare market
− Since 2010, expansion from healthcare to telecom/service providers
− IBM OEM customers transitioned to NetApp-branded product
− Currently in version 9 (9.0.2)
− First product to support CDMI, the industry-standard object storage interface
A bit of history
11
NetApp StorageGRID Solution
[Diagram: applications at Site 1, Site 2, Site 3 … Site N each access the grid via CIFS, NFS, or HTTP/CDMI; back-end targets are disk storage and tape]
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
MULTIPLE: TARGETS and TIERS
MULTIPLE: TENANTS and POLICIES and ADMINISTRATORS
12
StorageGRID®
ILM Policy Management
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
[Diagram: Site 1 and Site 2, App1 and App2, each with CIFS / NFS / HTTP access]
ILM Evaluation…
1. Number of copies
2. Storage location
3. Storage tier
4. Retention period
Policy
File1 Metadata: FPTH starts with “/app1share/*”
File2 Metadata: XTYP equals “bronze”
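The ILM evaluation above can be sketched as a first-match rule list keyed on object metadata. This is a minimal sketch, not the product's policy engine: only the `FPTH` and `XTYP` conditions come from the slide, while the rule names, copy counts, sites, tiers, retention periods, and the default placement are hypothetical values picked for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Placement:
    copies: int            # 1. number of copies
    locations: tuple       # 2. storage location
    tier: str              # 3. storage tier
    retention_days: int    # 4. retention period


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]
    placement: Placement


# Placement values below are assumptions, not taken from the slide.
RULES = [
    Rule("app1-share",
         lambda md: md.get("FPTH", "").startswith("/app1share/"),
         Placement(2, ("Site 1", "Site 2"), "disk", 3650)),
    Rule("bronze",
         lambda md: md.get("XTYP") == "bronze",
         Placement(1, ("Site 2",), "tape", 365)),
]
DEFAULT = Placement(2, ("Site 1", "Site 2"), "disk", 365)


def evaluate(metadata):
    """Return the placement of the first rule that matches the metadata."""
    for rule in RULES:
        if rule.matches(metadata):
            return rule.placement
    return DEFAULT


file1 = evaluate({"FPTH": "/app1share/report.doc"})
file2 = evaluate({"XTYP": "bronze"})
```

Because the decision depends only on metadata, the same object can be re-evaluated later (for example when its age changes) without the application being involved.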
13
Distributed Content Repository
StorageGRID features and capabilities summary
StorageGRID
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
MULTIPLE: TARGETS + VENDORS + TIERS
MULTIPLE: TENANTS + POLICIES + ADMINISTRATORS
[Diagram: applications at Site 1, Site 2, Site 3 … Site N; back-end targets are tape and NetApp E-Series Storage Systems]
Technical Overview
− Multi-protocol: CIFS, NFS, RESTful HTTP
− Scale-out architecture: capacity, count, sites, tiers, tenants, throughput
− Policy-management: copies, locations & tiers on ingest / over time
− Object storage: compression, encryption, fingerprint + metadata, WORM
− HA & DR: NDO, active+active for data and metadata, self-healing
14
Example: SG Data Flows in Multi-Site
[Diagram: data centres, edge sites and a remote site linked by WAN network routers. Each site has a Grid Internal LAN plus a Customer LAN or Remote Site LAN. Node types shown: Gateway Node, Storage Node, Control Node, Admin Node. Clients shown: Storage Clients, Admin Clients.]
15
HTTP API via Gateway
Simple Client Implementation
Gateway load-balances sessions across available Storage Nodes
Storage Nodes perform HTTP API transactions
[Diagram: HTTP API Client on the Client Network; Gateway Node, Storage Node, Admin Node and Control Node on the Grid Network]
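As an illustration of a gateway spreading sessions across available Storage Nodes, here is a small sketch. The slide does not say which balancing algorithm the Gateway Node uses, so the round-robin choice, the `Gateway` class, and the node names are all assumptions.

```python
import itertools


class Gateway:
    """Hands each new session to the next available Storage Node (round-robin)."""

    def __init__(self, storage_nodes):
        self._nodes = list(storage_nodes)
        self._available = set(self._nodes)
        self._cycle = itertools.cycle(self._nodes)

    def mark_down(self, node):
        self._available.discard(node)

    def mark_up(self, node):
        if node in self._nodes:
            self._available.add(node)

    def pick_node(self):
        if not self._available:
            raise RuntimeError("no Storage Node available")
        # Advance the shared cycle, skipping nodes that are currently down.
        for node in self._cycle:
            if node in self._available:
                return node


gateway = Gateway(["sn1", "sn2", "sn3"])  # hypothetical node names
first_round = [gateway.pick_node() for _ in range(4)]
gateway.mark_down("sn2")
after_failure = [gateway.pick_node() for _ in range(2)]
```

The client only ever talks to the gateway address, which is what keeps the client implementation simple.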
16
Sample Code – Storing Data
Application Code
• HTTP “PUT” Request
Grid Response
• “PUT” accepted
Grid Response
• Object received
• UUID returned to app
Data Transfer
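The store flow above can be sketched end to end with the Python standard library. This is a sketch under stated assumptions: the server is a local mock, not a grid; the `/objects` path, the 201 status, and the plain-text UUID response body are hypothetical; and the separate "PUT accepted" step before data transfer is collapsed into a single request for brevity.

```python
import http.client
import threading
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

STORED = {}  # object ID -> payload, standing in for the storage layer


class MockGridHandler(BaseHTTPRequestHandler):
    """Mock endpoint: accepts a PUT, stores the body, answers with a UUID."""

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        object_id = str(uuid.uuid4())
        STORED[object_id] = body
        self.send_response(201)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(object_id.encode())

    def log_message(self, *args):
        pass  # keep the example quiet


def put_object(host, port, payload):
    """Application side: send the data with HTTP PUT, return (status, UUID)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("PUT", "/objects", body=payload)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), MockGridHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
status, object_id = put_object("127.0.0.1", server.server_port, b"scan-0001")
server.shutdown()
server.server_close()
```

The application keeps the returned UUID as its handle to the object; it never learns which node or site holds the data.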
17
StorageGRID
Is an Object Storage software solution
Is a software component (Bycast)
Runs on a computing layer (default option: VMs)
Holds the “intelligence”; manages data according to defined policies
Data (objects) are stored in a storage layer; different types supported.
The whole “solution” is referred to as “NetApp DCR.”
What Is StorageGRID?
18
StorageGRID Software
Servers or Hosts
Storage
Network
DCR Solution components
Example: 2 Sites - DC and DR solution
19
NetApp StorageGRID 9.0 (9.0.2 today)
SUSE Linux Enterprise Software 11 SP2
VMware ESXi 5.0 (upd1)
Building Block Software – 9.0 (main)
VMware ESXi
SG SG
20
Where does it fit? Use cases and target markets
21
Where is DCR Solution a good fit?
− In general: wherever there is a need for preservation, compliance, data integrity, distributed repositories (multi-site, multi-tenant), high availability, scalability, etc.
And where it’s not…
− In general: highly transactional data, “dumb” storage, “none of the above”, etc…
Target markets and use case examples:
− HealthCare: PACS (imaging), Electronic Health Records
− File and email archiving
− “Dropbox-like”, iCloud-like cloud services (sharing, synchronizing)
− Cloud archiving, backup: legal archiving, service providers, knowledge preservation
DCR Solution: Target Markets
22
Customer existing Application(s)
− StorageGRID is mainly accessed by applications rather than users directly. Example: PACS, Document management, etc...
− Is there an existing integration/reference with customer application? (Using API integration vs filesystem)
Customer needs or problems to solve
− We might need to propose an application that can solve customer problems and leverage StorageGRID capabilities. Examples: archiving, “dropbox”
− It might make sense to bring in a partner ISV or develop an ad hoc solution with a NetApp partner
What Do I Need to Consider?
23
StorageGRID can be accessed in two ways:
− FileSystem: by exporting a CIFS/NFS share to users/applications
− HTTP API (SG API or CDMI): by presenting a URL
Why does it matter? What are the differences?
− File System: simple and immediate. No integration needed.
− API: needs a “connector” to the application (integration)
Why develop or integrate with API?
− Almost infinite scalability, truly unique namespace, can leverage metadata in a much more efficient way.
File System vs. API: Why Should I Care?
24
Main reasons for the choice:
− Applications can leverage Object Storage to enhance metadata use for data management
− Applications can use a truly global namespace via API
− StorageGRID provides real distributed content management (DR, ILM for data, HTTP access)
− Performances and scalability model (scale-out with “blocks”). Almost infinite scalability.
− Long-term data integrity guarantee built-in
Why NetApp StorageGRID? General Considerations
25
Business requirements:
− Preserve medical records for long term (integrity guarantee)
− Ensure compliance to regulations (HIPAA, EU, etc...)
− Guaranteed data accessibility, distribution and sharing
Solution:
− A PACS and/or EHR application implemented on NetApp StorageGRID infrastructure
Use Case 1: Healthcare (PACS and EHR)
26
Description
Billing (for Cloud model or managed service)
Infrastructure
Flat subscription (level-of-service)
Per effective usage (GB OR Objects stored/retrieved)
On Premises/Local Cloud service (example Iron Mountain) Managed Service
NetApp StorageGRID
Front-end application (PACS)
(optional) Middleware application (e.g. DeJearnette, ForeCare, etc...)
• Doctors and healthcare professionals
• Small-Medium Hospitals
• Large hospitals (DR)
• Service Providers
Administrator
• Web-based configuration for infrastructure (local and centralized), capacity, access, etc...
Offer elements
Levels of service (onsite infrastructure vs remote infrastructure, retention, etc...)
«Compliance» on data (audit, WORM, managed lifecycle, etc...)
Core Optionals
Delivery Models
• Solution allows storing patient health records (PACS, others) in a «cloud» (Grid). Data can be totally offsite (only a local «cache» installed at the customer), totally onsite, or both onsite and offsite («cache» + local storage at the customer).
• Data integrity guarantee (self-healing), DR and compliance (HIPAA, etc...). Managed object lifecycle.
Target Market Segments
Solution Summary: One-Pager
Managed Health Records repository
Hospital
Medical record generation (Exam, X-Ray, etc...)
Patients
Interface with PACS
Local AND/OR Cloud Archive, SP facilities
27
Business requirements:
− Offload of less accessed files from primary storage
− Archiving of files (with or without legal value)
− Archiving of emails (MS Outlook, Lotus Domino, etc...).
Solution:
− An application for “file (or email) archiving” implemented on NetApp StorageGRID infrastructure
Use Case 2: File and Email Archiving
28
It is a solution that includes a file (and, in some cases, email) archiving/tiering application that enables offload of user content from primary storage to other data storage tier(s).
NetApp StorageGRID is used as secondary tier and provides the distributed content infrastructure
Application moves data from primary storage based on different parameters (age, metadata, etc...) to StorageGRID.
Solution can leverage StorageGRID data management features (ILM, data protection, self-healing, multiple sites synchronization, etc...)
Use Case (Solution) Overview
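A minimal sketch of the age-based offload described above follows. It is an illustration only: a local directory stands in for the StorageGRID tier, the `.stub` file convention is hypothetical, and real archiving applications apply richer policies (metadata, events, user actions) than a single modification-age threshold.

```python
import os
import shutil
import tempfile
import time


def archive_inactive(primary_dir, archive_dir, max_age_days, leave_stub=True):
    """Move files not modified for max_age_days from primary to the archive tier."""
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for name in sorted(os.listdir(primary_dir)):
        src = os.path.join(primary_dir, name)
        if not os.path.isfile(src) or name.endswith(".stub"):
            continue
        if os.path.getmtime(src) < cutoff:
            dst = os.path.join(archive_dir, name)
            shutil.move(src, dst)
            if leave_stub:
                # The stub records where the content went (hypothetical format).
                with open(src + ".stub", "w") as stub:
                    stub.write(dst)
            moved.append(name)
    return moved


# Demo data: one file last touched 400 days ago, one touched yesterday.
primary = tempfile.mkdtemp()
archive = tempfile.mkdtemp()
for name, age_days in [("old.doc", 400), ("new.doc", 1)]:
    path = os.path.join(primary, name)
    with open(path, "w") as f:
        f.write(name)
    mtime = time.time() - age_days * 86400
    os.utime(path, (mtime, mtime))

moved = archive_inactive(primary, archive, max_age_days=365)
```

Passing `leave_stub=False` models the "not stubbed" variant, where the application tracks the new location itself.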
29
File Archiving: Theory Of Operations
Inactive files are moved to StorageGRID (stubbed or not stubbed depending on the methodology used)
Event-based, policy-based. User-initiated, MS SharePoint, etc...
HTTP/CDMI
StorageGRID
30
File Archiving: Theory Of Operations #2
31
File archiving and Email Archiving
− Symantec EV (API integration)
File archiving/Tiering
− NTP Software OSCC (API integration)
− PoINT Software Storage Manager (API integration)
− F5 ARX (CIFS “integration”, validated architecture)
Other solutions (general approach)
− FSG: CIFS/NFS shares are used whenever there is not a specific API integration (any other application)
Solutions Examples: A Real-life “Taste”
32
Business requirements:
− A “Cloud File Hosting” Service for retail customers (end users) and/or businesses (“private Dropbox”-style).
Solution:
− A partner application for “Cloud File Hosting” implemented on NetApp StorageGRID infrastructure
Use Case Example 3: Cloud File Sharing
33
In general it is a file hosting solution for individuals (retail customers) or enterprises; it was developed for NetApp StorageGRID
It synchronizes files across desktops, laptops and smartphones (iPhone, Android, Blackberry and Windows Mobile), allowing users to share and access them from anywhere
Can be “white labelled”, customized, run stand-alone or integrated with billing systems, CRM, LDAP (for user access)
NetApp StorageGRID provides the distributed content infrastructure
What Is Cloud File Sharing?
34
Description
Billing
Infrastructure provider
Flat subscription
Per effective usage (GB OR Objects stored/retrieved)
Self-provisioning through provider Web portal (user creation, level of service, etc...)
On Line (SP portal) Direct/Indirect sales (B2B)
NetApp StorageGRID
Front-end application (Partner or SP-customized)
Professionals
Small-Medium Enterprises
End users
Large Corporates
Online folder creation
User authentication
Users invited and files shared via e-mail or SMS (with or without password)
• File synchronization between PC, tablet and smartphone. Search and archive capabilities
• Access to folders limited to selected groups (employees, suppliers, customers)
Administrator
• Web-based configuration for infrastructure capacity, access, etc...
Offer elements
Levels of service
«Compliance» on data (audit, WORM, managed lifecycle, etc...)
Core Optionals
Sales Model
Activation
Solution allows sharing of documents and information within working groups inside the company or with external entities
Multi-channel access (desktop, web, mobile, etc...) with content synchronization
Target Market Segments
Solution Summary: One-Pager
Secure file sharing
Private «Dropbox»
Users
35
Turk Telekom “BuluttDepo” (MRD “Nimbus” application)
− API-based integration with StorageGRID
− Developed specifically for Service Providers
Mezeo Cloud
− API-based integration with StorageGRID
− Multi-purpose “Cloud” application
Other solutions (general approach)
− FSG: CIFS/NFS shares are used whenever there is not a specific API integration (any other application)
Solutions Examples: A Real-life "Taste"
36
Competition overview
37
Well, first and foremost ourselves...but we’re improving
− “Blurred” border between object storage and “scale-out NAS” solutions. It is not always easy to understand which is the best fit. We often end up competing with both Object Storage and NAS solutions.
Main competitors:
− EMC: Atmos and Centera (typically in banking sector), Isilon (typically in Service Providers)
− HDS: HCP (Hitachi Content Platform) and HUS (Unified Storage, now with HTTP and object interface)
− IBM SONAS (scale-out NAS)
− DDN WOS (Web Object Scaler) and others (less in EMEA, more in U.S.)
Who Is The “Enemy”?
38
Understand the workload
− Object counts & sizes
Sweet-spot object >500KB
Counts up to 8B, Capacity to 35PB
− Performance requirements
100MB/s ingest per file system namespace
10Gbps aggregate ingest/retrieve via object APIs
Look for the ISV that completes the puzzle
− Enterprise Archive
− Media
− Healthcare
Strategies to Win
Do not undervalue E-Series!
– Rock-solid enterprise arrays
350,000 systems deployed WW
– Density and performance
1.8PB – 2.4PB per rack
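The workload figures on this slide lend themselves to a quick back-of-the-envelope check. The sketch below uses only the limits quoted above (8 billion objects, 35PB, >500KB sweet spot, 10Gbps aggregate via object APIs, 100MB/s per file system namespace); the decimal unit conversions, the reading of "8B" as 8 billion, and the sample workload are assumptions.

```python
# Limits quoted on the slide, converted with decimal units (an assumption).
MAX_OBJECTS = 8_000_000_000
MAX_CAPACITY_BYTES = 35 * 10**15          # 35 PB
SWEET_SPOT_BYTES = 500 * 10**3            # 500 KB
AGGREGATE_API_BITS_PER_S = 10 * 10**9     # 10 Gbps via object APIs
FS_INGEST_BYTES_PER_S = 100 * 10**6       # 100 MB/s per file system namespace


def assess(object_count, avg_object_bytes):
    """Compare a candidate workload with the quoted limits and ingest rates."""
    total = object_count * avg_object_bytes
    return {
        "total_bytes": total,
        "within_object_limit": object_count <= MAX_OBJECTS,
        "within_capacity_limit": total <= MAX_CAPACITY_BYTES,
        "in_sweet_spot": avg_object_bytes > SWEET_SPOT_BYTES,
        "api_ingest_days": total * 8 / AGGREGATE_API_BITS_PER_S / 86400,
        "fs_ingest_days": total / FS_INGEST_BYTES_PER_S / 86400,
    }


# Hypothetical workload: one billion objects averaging 2 MB each (2 PB in total).
report = assess(object_count=1_000_000_000, avg_object_bytes=2_000_000)
```

For this sample workload the API path needs roughly 18.5 days of sustained ingest at the quoted aggregate rate, against roughly 231 days through a single file system namespace, which is one reason the object count and size conversation should happen early.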
39
Object Storage Vendor Ecosystem
Scale-out Object Store (Traditional)
Scale-out Object Store (Startups)
Open Source
Key Value Store
(Centera) (Atmos)
40
We have some good material on FieldPortal
− Forrester Report: Total Economic impact of NetApp DCR Solution
https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=91451&contentID=122053
− ESG Lab Validation report NetApp DCR
https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=80710&contentID=99326
− EMC Atmos: CAT Competitive presentation
https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=94618&contentID=129854
Other resources are currently available internally only (just ask if you need them), but they will soon be made available on FieldPortal.
Competitive Resources
41
How to prove it works?
Test, PoC and performance testing guidelines
42
StorageGRID: demo / PoC capabilities
Lab-on-demand
− Targeted for online demos (1-2 hours)
− Requires access to NetApp LoD
StorageGRID-in-a-Laptop (SiL)
− Complete set of functions, can be done «on-the-fly»
− Grid Nodes consolidated and pre-configured
− Fits in a laptop, low resource consumption
StorageGRID-in-a-Box (SiB)
− Complete set of functions, can be done onsite
− Needs server (or servers)
− Allows for higher performance
43
StorageGRID: demo / PoC capabilities (cont)
“Full system” (SG full stack of components)
− Full Grid deployment
− Allows for full performance
− Needs server (or servers) and E-Series storage
− Needs onsite work for implementation
− Lead time impacted:
  − Purchase of demo equipment
  − HW delivery time
  − Talk to your TPM!
44
NetApp StorageGRID: Lab-on-Demand
− https://labondemand.netapp.com
− Needs registration (partner, possibly customer)
− Full guided Lab (1h) or «free session»
45
StorageGRID-in-a-Laptop (SiL): overview
− Pre-packaged set of 2 VMs to be installed on VMware Workstation 7/8 or ESXi 5
− Can be deployed at customer site
− Can be installed on a laptop or server
− Includes a set of prepackaged test cases/scripts (additional Linux VM) and PoC guide
− Limited customization options
− Compressed images ~7GB
− Needed space: ~120GB (max)
− Ask us for details
46
NetApp StorageGRID: SiL topology
Additional Linux VM for running tests/scripts
Individual VMs can be spread across multiple servers if needed (smaller servers, using existing resources)
Some nodes can be shut down if needed
VMware ESX/ESXi/Workstation/Server
47
NetApp StorageGRID: SiL overview
48
StorageGRID-in-a-Box (SiB): overview
− Pre-packaged set of VMs to install on VMware ESXi 5
− Can be deployed at customer site
− Needs at least one server to be installed on
− Includes a set of prepackaged test cases/scripts (additional Linux VM) and PoC guide
− Can be customized according to test needs
− Can be spread across multiple servers/sites
− Compressed images ~40GB, needed space 800GB-1.2TB
− Ask us for details
49
SiB Configuration (example)
One Intel server with two 6-core Xeon processors, 48GB memory, 8x600GB SAS disks, four GbE NICs
One 8-port 1GbE switch (or customer switch)
VMware ESXi 5
StorageGRID Software License
50
Typical Test Cases and proof points
File System access:
− CIFS/NFS basic access, FSG cache behavior, replication, WORM, etc…
HTTP API access
− SG API and/or CDMI ingest/retrieve, metadata update, etc…
ILM (object lifecycle management)
− Automatic content placement based on metadata, etc…
Data integrity (content self-healing and inclusive protection)
HA and “no single-point-of-failure design”
Integration with application(s): involve application vendors or developers!
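For the data integrity proof point, a test script can compare each copy against a stored fingerprint and repair a damaged copy from an intact one. The sketch below shows that idea only: the slides mention a fingerprint without naming an algorithm, so SHA-256, the function names, and the site labels are assumptions, and this is not how the grid implements self-healing internally.

```python
import hashlib


def fingerprint(data):
    """Content fingerprint (SHA-256 here, chosen as an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify_and_heal(replicas, expected):
    """Replace every replica whose fingerprint is wrong with an intact copy.

    replicas maps a site name to the bytes held there; returns repaired sites.
    """
    good = next((d for d in replicas.values() if fingerprint(d) == expected), None)
    if good is None:
        raise ValueError("no intact replica left")
    repaired = []
    for site, data in list(replicas.items()):
        if fingerprint(data) != expected:
            replicas[site] = good
            repaired.append(site)
    return repaired


original = b"payload"
expected = fingerprint(original)          # recorded at ingest time
replicas = {"Site 1": b"payload", "Site 2": b"pay1oad"}  # Site 2 is corrupted
repaired = verify_and_heal(replicas, expected)
```

In a PoC, the equivalent check is to damage or remove one copy deliberately and confirm that the object is still retrievable and that the copy count is restored.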
51
Summary, Resources, and Contacts for EMEA
52
Learn what StorageGRID is and in which use cases we can position it most effectively
Understand the critical points to address in each of them and leverage existing experience
Get in touch with people who can help you in EMEA. (You’re welcome.)
Key Takeaways: What Should I Remember?
53
NetApp DCR Solution:
− http://www.netapp.com/us/solutions/big-data/distributed-content-repositories.html
NetApp Fieldportal DCR solution landing page:
− https://fieldportal.netapp.com/applications/storagegrid.aspx#14550
Contacts for EMEA
− Roberto Tolini: [email protected]
− Philippe Wackers: [email protected]
− Or contact your local NetApp office/TPM
Resources and Information
54
© 2013 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, and Go further, faster, are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such.