© 2013 IBM Corporation
Storage and “The Cloud”
1. What is driving IT / Businesses to Cloud
2. Traditional IT Organization Impact
3. Traditional vs. Design-for-Fail, On-premise vs. Off-premise
4. IBM Big Data / Cloud Storage Products and Directions
IBM Cloud Storage Briefing - December 3, 2013
Provided by: John Sing, Executive IT Consultant, San Jose, California [email protected]
What is driving IT and Businesses to Cloud
Value delivered
Storage Provisioning
Continuous Access to data
From traditional
Weeks
To cloud
Minutes
For users
Reduced storage admin costs: up to 50% savings
For IT
Reduced energy costs: up to 36%
Increased storage utilization: up to 90%, from 50%
Localized, any time, anywhere
Dynamic (elastic)
Centralized
Fixed
Storage capacity
Modern 21st Century Cloud Business Value
Time-to-Delivery, Competitive Advantage
Revenue: “Time is Money”
Primary drivers for move to cloud = business reasons
http://www.kpmg.com/global/en/issuesandinsights/articlespublications/cloud-service-providers-survey/pages/service-providers.aspx
Competitive Advantage, Revenue
Bandwidth availability is the tipping point for adoption of “The Cloud”
Worldwide broadband bandwidth availability is becoming commonplace
Facilitates a pervasive web services delivery model (i.e. “The Cloud”)
Hosted in mega data centers with massive amounts of processors, storage, and network
Today, when the above three come together in a geography:
– Small and medium on-premise data centers worldwide are rapidly disappearing, off-premise, into the cloud
The real question:
– Is traditional IT re-capturing / replacing workloads when they move off-premise to the cloud?
Cloud Mega Data Centers = new modular IT implementation style
Internet-scale centers:
– Data: 10s / 100s of petabytes
– Servers: 100,000s
– Workloads: require server clusters of 100s, 1000s, 10,000, or more
Modular implementation
Amazon Web Services
Amazon Perdix Modular Datacenter
EC2 17K core, 240 teraflop cluster: 42nd fastest supercomputer in the world
1Q12: 450,000 servers (estimated)
1Q13: > 2 trillion objects in S3
1Q13: 1.1 million requests/sec
http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html
http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
http://aws.typepad.com/aws/2013/04/amazon-s3-two-trillion-objects-11-million-requests-second.html
Growth of The Cloud by 2016
Mobile
Geo-locational
Real-time data
Shift to cloud mega-data centers
Source: http://www.datacenterknowledge.com/archives/2012/10/23/cisco-releases-2nd-annual-global-cloud-index/
> 50% in cloud
Cisco already knows > 50% of workload is in the cloud
Cloud: No longer exploratory
Expectations: Cloud computing will be "just computing" by 2018
• Cloud is at the end of its beginning phase and has gotten serious
• Private cloud is growing, but giving way to hybrid cloud
• Service providers, VARs, SIs are rising to the cloud opportunity
• Cloud adoption is strong across large enterprise as well as SMB
So, What is a Cloud, really?
Why does it impact Traditional On-Premise IT organization so heavily?
Extracted from presentation: “Building a 21st Century Cloud Storage Service” by John Sing: http://snjgsa.ibm.com/~singj/public/2013_Berlin_System_Storage_x_Pure_Symposium/sCS05_John_Sing_Building_21st_Century_Cloud_Storage_Service_Industry_Best_Practice.ppt
To users, cloud seems “easy”, “instant”, “self-service”. So what has to happen in the background?
Some would say that virtualization = cloud
Some IT traditionalists would say that cloud is nothing more than much better managed, centralized, automated data centers
Unfortunately, such statements severely understate the essential organizational element
Key message: to provide true cloud services, you must also execute a significant shift in:
– Organizational lines
– Processes
– Workflows
– Workload types
– Required skill sets
This is the cloud-enabled data center journey
1. Virtualized
2. Deployed
3. Optimized
4. Enhanced
5. Monetized
Cloud adoption maturity levels
Level of cloud capability (macropatterns)
IBM Redpaper: http://www.redbooks.ibm.com/abstracts/redp4893.html
What’s most important: cloud macropattern workflows
1. Simple IaaS
2. Cloud Mgmt
3. Adv IaaS
4. ITIL Managed IaaS
Are you ready?
Cloud micro-pattern workflows
IBM Storwize V7000, SVC, XIV
Tivoli Storage Manager
Tivoli Storage Productivity Center
Smart Cloud Storage Access
Problem! Traditional IT organization looks nothing like this workflow!
IBM Redpapers: Building Cloud Enabled Data Center / Service Provider
http://www.redbooks.ibm.com/abstracts/redp4912.html
http://www.redbooks.ibm.com/abstracts/redp4893.html
http://www.redbooks.ibm.com/abstracts/redp4873.html
Example: IBM Storage products within the Cloud workflow
Non-technical users: self-provisioning requests for Windows or Linux OS and end user consumption
Server and application owners, developers, etc.: provisioning requests for LUNs to be assigned to and consumed by either physical or virtual servers (DB2, SAP, ERPs)
TPC / Storage Admin
P9: IBM SmartCloud Storage Access
P8: IBM Tivoli Storage Productivity Center
P0: IBM SVC / Storwize V7000 U: virtualizes IBM or 3rd party storage arrays (HP, NetApp, EMC, etc.)
P0: IBM SONAS: file (CIFS / NFS)
P0: IBM XIV: block (LUNs)
Ethernet network
Key Cloud organizational learning point:
Cloud involves major re-alignment of IT organization, skills
Re-alignment of IT processes to facilitate real-time, elastic management, monitoring, and delivery based on a service catalog
– Aligned with the Lines of Business revenue generation / competitive advantage needs (requires full-time liaison positions)
Creation of a service catalog requires IT to invest different efforts into design/automation of IT capability
– New, additional skill requirements, aligned along a very different organizational structure, metrics, and speed criteria
Provide governance that addresses the risk of unauthorized or rogue access to services
– Only appropriate approvals and credentials, thus a new emphasis on network + security
Addressing resistance to change within the IT organization is the biggest success factor
If the on-premise IT organization is unable to change, this is also a major off-premise cloud driver
This organizational shift is a main reason why “ready-to-go” cloud workflow products (such as OpenStack) are so attractive:
Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
OpenStack already has all cloud workflows ready for production
OpenStack is composed of seven core projects that form a complete Cloud Infrastructure as a Service (IaaS) solution
Compute (Nova), Block Storage (Cinder), Network (Neutron): provision and manage virtual resources
Dashboard (Horizon): self-service portal
Image (Glance): catalog and manage server images
Identity (Keystone): unified authentication, integrates with existing systems
Object Storage (Swift): petabytes of secure, reliable object storage
Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
Understand OpenStack to understand IBM Cloud Storage directions
Did you know: two different types of IT architectures have emerged
Design-for-Fail IT implementation has some similarities to, but clearly isn’t the same as, Traditional IT architecture
Knowledge Check
Today there are two major types of IT Cloud architectures and workloads:
Transactional IT (“Systems of Record”) and Internet Scale Workloads (“Systems of Engagement”)
Transactional IT:
– Cloud, High Availability, Resiliency, Disaster Recovery characteristics: can be adapted to Cloud “agnostic / after the fact”
– Data Strategy: can leverage traditional tools/concepts to understand / implement cloud; storage/server virtualization and pooling
– Automation: end to end automation of server / storage virtualization
– Commonality: apply master vision and lessons learned from internet scale data centers
The other major type of IT Cloud architecture and workload is Internet Scale Workloads (“Systems of Engagement”), compared here with Transactional IT (“Systems of Record”):
Cloud, High Availability, Resiliency, Disaster Recovery characteristics
– Transactional IT: can be designed “agnostic / after the fact” using server or storage virtualization, replication
– Internet scale workloads: cloud capabilities are “designed into software stack from the beginning”
Data Strategy
– Transactional IT: use traditional tools/concepts to understand / know data; storage/server virtualization and pooling
– Internet scale workloads: proven Open Source toolset used to implement failure tolerance and redundancy in the application stack
Automation
– Transactional IT: end to end automation of server / storage virtualization and replication
– Internet scale workloads: end to end automation of the application software stack providing failure tolerance
Commonality
– Both: apply master vision and lessons learned from internet scale data centers
Today: two different types of IT
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Internet scale workloads / Transactional IT
Today’s two major IT workload types
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Transactional IT / Internet scale workloads
How to build these two different IT architectures
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Transactional IT / Internet scale workloads
What you (the consumer) get with these different approaches:
Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/
Transactional IT / Internet scale workloads
Policy-based Clouds and Design-for-Fail Clouds are workload-optimized architectural choices
Policy-based Clouds
• Purpose optimized for longer-lived virtual machines managed by Server Administrator
• Centralizes enterprise server virtualization administration tasks
• High degree of flexibility designed to accommodate virtualization of all workloads
• Significant focus on managing availability and QoS for long-lived workloads with a level of isolation
• Characteristics derived from exploiting enterprise class hardware
• Legacy applications
Design-for-fail Clouds
• Purpose optimized for shorter-term virtual machines managed via end-user or automated process
• Decentralized control, embraces eventual consistency, focus on making “good enough” decisions
• High degree of standardization
• Significant focus on ensuring availability of control plane
• Characteristics driven by software
• New applications
Transactional IT / Internet scale workloads
Example: Traditional IT vs. Hadoop for Big Data
Traditional approach: move data to program
Big Data approach: move function/programs to data
[Diagram, traditional: User request -> Application server -> Query Data -> Database server (Data) -> return Data -> process Data -> Send result]
[Diagram, Big Data: User request -> Master node -> Send Function to process on Data -> Data nodes (Data): Query & process Data -> Send -> Consolidate result]
Traditional approach
• Application server and Database server are separate
• Analysis Program can run on multiple Application servers
• Network is still in the middle
• Data has to go through network
• Designed to analyze TBs of data
Big Data approach
• Analysis Program runs where the data is: on Data Node
• Only Analysis Program has to go through the network
• Analysis Program is executed on every DataNode
• Designed to analyze PBs of data
• Highly scalable: 1000s of nodes, petabytes and more
Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and Francois Gibello/France/IBM for the use of this slide
Example: Traditional IT vs. Hadoop for Big Data
Example: For how many hours does Clint Eastwood appear across all the movies he has made?
Task: All movies need to be parsed to find Clint’s face
• Traditional approach:
1) Upload a movie to the application server through the network
2) The Analysis Program compares Clint’s picture with every frame of the loaded movie
3) Repeat the 2 previous steps for every movie
• Big Data approach:
1) Send the Analysis Program and Clint’s picture to all the DataNodes
2) The Analysis Program in every DataNode (all in parallel) compares Clint’s picture with every frame of the movies stored on that node
3) The results from all DataNodes are consolidated into a single result
Note: Hadoop typically uses direct attached storage
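The "move function to data" steps above can be sketched in plain Java, without Hadoop. This is only an illustration under stated assumptions: the names `MoveFunctionToData`, `DataNode`, `consolidate`, and `countMatches` are hypothetical, frames are stood in for by text labels, and label equality stands in for face matching. Each node runs the shipped function on its local data and returns only a small partial result, which the master then consolidates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch: the analysis function travels to the nodes,
// and only small partial results travel back.
class MoveFunctionToData {

    // A data node owns a local slice of the data (here: frame labels).
    static final class DataNode {
        final String name;
        final List<String> localFrames;

        DataNode(String name, List<String> localFrames) {
            this.name = name;
            this.localFrames = localFrames;
        }

        // Run the shipped function against local data only.
        int run(ToIntFunction<List<String>> analysis) {
            return analysis.applyAsInt(localFrames);
        }
    }

    // Master node: send the function to every node in parallel, then consolidate.
    static int consolidate(List<DataNode> nodes, ToIntFunction<List<String>> analysis) {
        return nodes.parallelStream()
                    .mapToInt(node -> node.run(analysis))
                    .sum();
    }

    // Stand-in for the face-matching step: count frames carrying the target label.
    static int countMatches(List<String> frames, String target) {
        return (int) frames.stream().filter(target::equals).count();
    }

    public static void main(String[] args) {
        List<DataNode> nodes = Arrays.asList(
            new DataNode("node1", Arrays.asList("clint", "other", "clint")),
            new DataNode("node2", Arrays.asList("other", "other")),
            new DataNode("node3", Arrays.asList("clint")));
        int total = consolidate(nodes, frames -> countMatches(frames, "clint"));
        System.out.println("Matching frames across all nodes: " + total);
    }
}
```

In the traditional approach the frame lists themselves would cross the network; here only one integer per node does.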
Hadoop principles: Storage, HDFS and MapReduce
Hadoop Distributed File System = HDFS: where Hadoop stores the data
– HDFS file system spans all the nodes in a cluster with locality awareness
Hadoop data storage, computation model
– Data stored in a distributed file system, spanning many inexpensive computers
– Send function/program to the data nodes
– i.e. distribute application to compute resources where the data is stored
– Scalable to thousands of nodes and petabytes of data
MapReduce Application
1. Map phase (break job into small parts)
2. Shuffle (transfer interim output for final processing)
3. Reduce phase (boil all output down to a single result set)
Return a single result set
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Both classes are nested inside an enclosing job class (not shown on the slide).

// Map phase: emit (word, 1) for every token in the input line
public static class TokenizerMapper extends Mapper<Object,Text,Text,IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(val.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);
    }
  }
}

// Reduce phase: sum the counts emitted for each word
public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
  private IntWritable result = new IntWritable();

  public void reduce(Text key, Iterable<IntWritable> val, Context context) {
    int sum = 0;
    for (IntWritable v : val) {
      sum += v.get();
    }
    // . . . (the rest of the reducer is elided on the slide)
  }
}
Distribute map tasks to cluster
Hadoop Data Nodes
Data is loaded, spread, resident in Hadoop cluster
Performance = tuning MapReduce workflow, network, application, servers, and storage
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
http://blog.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/
http://www.slideshare.net/allenwittenauer/2012-lihadoopperf
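To make the three phases on this slide concrete, here is a minimal single-process sketch in plain Java collections. It does not use the Hadoop API and is not how Hadoop schedules work across a cluster; the class name `MapReducePhases` and its method names are hypothetical. It only shows what each phase does to the data, using word counting as the job.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process illustration of Map, Shuffle, and Reduce.
class MapReducePhases {

    // 1. Map phase: break each input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new SimpleEntry<>(word, 1));
                }
            }
        }
        return pairs;
    }

    // 2. Shuffle: group the interim values by key for final processing.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // 3. Reduce phase: boil each group down to a single value per key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int sum = 0;
            for (int v : group.getValue()) {
                sum += v;
            }
            result.put(group.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the cloud", "the data in the cloud");
        System.out.println(reduce(shuffle(map(lines))));
    }
}
```

In a real cluster the map calls run on the data nodes that hold each input split, and the shuffle is the step that moves interim output over the network, which is one reason it shows up in performance tuning.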
Two different types of cloud tooling
Cloud storage tooling will most likely reside:
In the external shared storage stack for policy-based traditional transactional IT:
– External IBM Smarter Storage hardware and software for block and file storage
In the virtualized server, direct attach storage, application stack for design-for-fail:
– IBM SmartCloud software, IBM participation in OpenStack, IBM SoftLayer
Both are appropriate; match each to the proper environment
Transactional IT / Internet scale workloads
http://www.slideshare.net/johnsing1/s-bd03-infinitybeyond2internetscaleworkloadsdatacenterdesignv6speaker
Read all about it. Google published this information into the public domain in 2009. The 2nd Edition of this book was published in July 2013 (includes Flash storage)
By Google:
– Luiz André Barroso
– Urs Hölzle
Available to all, free of charge
Download original edition at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
New! 2nd Edition published July 2013: http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024
Video of Luiz giving one of these lectures: http://inst-tech.engin.umich.edu/leccap/view/cse-dls-08/4903
http://www.barroso.org/
Size of Cloud Market:
Magnitude of On-premise vs. Off-premise
Size of Server, Storage, Networking aggregate marketplaces
Compound Growth Rate 2013-2017
– Cloud Service Provider (CSP): 25%
– Enterprise Private Cloud (EPC): 23%
– Non-Cloud: -7%
– Total: 3%
Source: IBM
Total market: $104B (2013) to $117B (2017)
37% is for Storage
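As a quick cross-check of the figures on this slide, the total compound growth rate can be recomputed from the two market sizes. The sketch below assumes four compounding periods between 2013 and 2017; the class and method names are hypothetical. Under that assumption, growth from $104B to $117B works out to roughly 3% per year, consistent with the "Total 3%" line.

```java
// Hypothetical helper for checking the slide's compound growth figures.
class GrowthRate {

    // Compound growth rate: (end / start)^(1 / years) - 1
    static double compoundGrowthRate(double start, double end, int years) {
        return Math.pow(end / start, 1.0 / years) - 1.0;
    }

    // Value after compounding `rate` once per year for `years` years.
    static double project(double start, double rate, int years) {
        return start * Math.pow(1.0 + rate, years);
    }

    public static void main(String[] args) {
        // Assumption: 2013 -> 2017 is treated as 4 annual periods.
        double rate = compoundGrowthRate(104.0, 117.0, 4);
        System.out.printf("Implied compound growth rate: %.2f%%%n", rate * 100);
        System.out.printf("$104B at 3%% for 4 years: $%.1fB%n", project(104.0, 0.03, 4));
    }
}
```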
Cloud adoption continues acceleration through 2017
On-premise vs. off-premise spend (chart dated September 2013)
Cloud server, storage, networking: $57B, 24% CGR, 48% of total
– EPC (enterprise on-premise): $24B, 23% CGR
– CSP (off-premise cloud IaaS / cloud services): $33B, 25% CGR
Non-Cloud (enterprise on-premise): $60B, -7% CGR, 52% of total
Source: IBM
Off-premise is clearly the growth area
IBM Big Data / Analytics Storage Positioning
We are building real-time, integrated stream computing on massive scale
Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf
[Diagram: Big Data reference architecture]
Data in Motion, Data at Rest, Data in Many Forms
Information Ingestion and Operational Information: Stream Processing, Data Integration, Master Data, Streams
Landing Area, Analytics Zone, and Archive: Raw Data, Structured Data, Text Analytics, Data Mining, Entity Analytics, Machine Learning
Real-time Analytics: Video/Audio, Network/Sensor, Entity Analytics, Predictive
Exploration, Integrated Warehouse, and Mart Zones: Discovery, Deep Reflection, Operational, Predictive
Decision Management, BI and Predictive Analytics, Navigation and Discovery, Intelligence Analysis
Information Governance, Security and Business Continuity
Batch parallel Big Data processing
Real-time in-memory servers
Data Warehouse (Traditional IT)
However, note there are multiple types of Big Data
IBM end to end Big Data portfolio
[Diagram: the same Big Data reference architecture (Data in Motion, Data at Rest, Data in Many Forms), overlaid with IBM products]
IBM BigInsights
IBM InfoSphere Streams
IBM Data Warehouse products
IBM STG: x, p, PureSystems, Platform Computing
IBM SWG
IBM Big Data Storage positioning
Hadoop
o Storage for Hadoop
– IBM Big Data Networked Storage Solution for Hadoop
o PureSystems
– IBM PureData System for Hadoop with pre-installed IBM BigInsights
– Generally Available September 2013
Optimized Multi-Temperature Data Warehouse
o All Flash
– FlashSystem
o Hybrid
– DS8000 EasyTier
– Storwize EasyTier
– FlashSystem Solution (SVC + FlashSystem)
– XIV
o PureSystems
– PureFlex (Storwize w/EasyTier)
– PureData for Transactions (Storwize)
– PureData for Analytics (Netezza)
Customer disk GB cost expectation (USA): 10 to 15 cents/GB with direct or SAS attach, extreme density
Customer disk GB cost expectation (USA): 30 to 70 cents/GB
IBM Cloud Storage Directions
Data Growth Types in the Cloud
Worldwide File-based vs Block-based Storage Capacity Shipments 2008-2015
Block: traditional data is structured and managed by OS, i.e. database
File: high growth data is unstructured and managed by OS, i.e. file system
Object: higher growth data is unstructured and managed by application
Object Storage – fundamental type of storage for Cloud
Object Storage
Network “Best Case” delivery
Best usage = data that doesn’t change
i.e. backups, archives, digital images, virtual machine images….
Distance limited only to acceptable network latency
Servers
Applications
Object storage features are minimal compared to NAS or SAN:
– Store, retrieve, copy, delete files
– Control which users can do what
Protocol is usually an HTTP interface: Object Storage API (RESTful API)
– Can be in URL format for WWW access
Application is responsible for tracking object unique IDs and supplying that unique ID to retrieve data from object storage
Typically longer response times than either NAS or SAN
– Slower throughput compared to a traditional file system means object storage is unsuitable for data that changes frequently
Typical usages: great fit for data that doesn't change much:
– Backups, archives, video and audio, VM images
– i.e. internet-scale repositories of data
– This is why it is so essential to Cloud
No concept of a file system. Rather, the application saves an object (files + additional metadata) to the object store via a PUT API command and gets a unique key for the saved file; the application must provide that unique key to a GET API command to retrieve the files
Can embed searchable metadata directly into the object storage system
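The PUT / GET contract described above can be sketched as a small in-memory store. This is an illustration only, not any product's API: the class `InMemoryObjectStore` and its method names are hypothetical, and a real object store would expose these operations over HTTP. The point is that there is no path or directory lookup; the caller must keep the key returned by `put` and present it to `get`, and metadata stored with the object can be searched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for an object store.
class InMemoryObjectStore {

    // An object is the data plus its additional metadata.
    private static final class StoredObject {
        final byte[] data;
        final Map<String, String> metadata;

        StoredObject(byte[] data, Map<String, String> metadata) {
            this.data = data.clone();
            this.metadata = new HashMap<>(metadata);
        }
    }

    private final Map<String, StoredObject> objects = new HashMap<>();

    // PUT: save the object and hand back a unique key the application must track.
    String put(byte[] data, Map<String, String> metadata) {
        String key = UUID.randomUUID().toString();
        objects.put(key, new StoredObject(data, metadata));
        return key;
    }

    // GET: retrieval works only with the unique key; returns null if unknown.
    byte[] get(String key) {
        StoredObject stored = objects.get(key);
        return stored == null ? null : stored.data.clone();
    }

    // DELETE: returns true if an object was removed.
    boolean delete(String key) {
        return objects.remove(key) != null;
    }

    // Search the metadata embedded with each object; returns matching keys.
    List<String> findByMetadata(String name, String value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, StoredObject> entry : objects.entrySet()) {
            if (value.equals(entry.getValue().metadata.get(name))) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }
}
```

Note that there is no update operation: changing an object means replacing it with a new PUT, which matches the "data that doesn't change much" usage pattern.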
Objects are a natural fit to “born on cloud” data (mobile, social)
Objects are written once and never modified (although they can be replaced) – this describes most born on the cloud data
– Pictures, e-mails, movies, tweets, blog posts, web pages, etc.
– This data is both consumer and enterprise
– Much of this data is accessed from mobile devices
Hence Object Storage is essential to participate in Cloud Storage world
Pictures Collaboration Backup Archive
Rackspace
Consumer Apps Business Apps
Storage: SAN / NAS / Object
SAN (Storage Area Network): Application -> File system (file I/O) -> block I/O over Fibre Channel SAN or iSCSI (FICON, FC, iSCSI, FCoE) -> Storage
NAS (Network Attached Storage): Application -> file I/O over IP network (CIFS, NFS, HTTP) -> File system -> block I/O -> Storage
Object Storage (HTTP): Object application -> Object API (object I/O) over IP network -> Object container -> block I/O -> Storage
IBM Cloud Storage – current products and future directions
Traditional IT:
IBM Smart Cloud Storage Access - to provide P9 and P8 Self-Service Automation (storage)
IBM Tivoli Storage Productivity Center – to provide P6 Storage Virtualization Management
IBM Storwize Family and XIV – provide P0 storage virtualization including enterprise best-in-class OpenStack exploitation
IBM SONAS and V7000 Unified - provide P0 storage virtualization for file storage
Cloud Storage and Object Storage Directions:
Exploitation of OpenStack Cinder for block storage
Exploitation of OpenStack Swift for software-defined object storage approach
Best-in-class OpenStack enterprise exploitation
Design for Fail / Cloud Native / Internet scale IT :
Exploit SoftLayer for Cloud Native
Migrate IBM SmartCloud workloads into the SoftLayer workflow approach over time
OpenStack components; IBM Storage strategic exploitation
OpenStack components: Horizon, Nova, Cinder, Swift, Neutron, Keystone, Glance
New in Havana: Metering (Ceilometer); Basic Cloud Orchestration & Service Definition (Heat)
Oslo: shared services
IBM Storage: SVC / Storwize, XIV
Future directions: software defined object storage
OpenStack Object Storage component – “Swift”
An open source, highly available, distributed, eventually consistent object store
– Two-tier architecture consisting of client-facing proxies and storage servers
– Information protected through three-way replication (by default)
– Supports geo-distribution
– The dominant design for scale-out object stores
Swift was developed as pure software disconnected from hardware
– Typically implemented on storage-rich servers, e.g. IBM x3630 M4
Swift in production at SoftLayer, Rackspace, Korea Telecom, Wikimedia, UCSD, Internap, Sonian, MercadoLibre, . . .
Internet or intranet: clients send REST requests
Proxy layer (public face): authenticates and forwards to appropriate storage server(s) using the ring
Private network
Storage servers (account, container and object): store, serve and manage data and metadata, partitioned based upon the ring
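The idea of a proxy deterministically mapping an object to a set of storage servers, with three replicas by default, can be illustrated with a much simplified placement function. This is not Swift's actual ring implementation (which uses a prebuilt partition table with zones and device weights); the class `ReplicaPlacement`, its methods, and the server names are hypothetical. The sketch hashes the object path and picks consecutive servers starting from the hashed position.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for ring-based replica placement.
class ReplicaPlacement {

    // Hash the object path to a non-negative int (first 4 bytes of its MD5 digest).
    static int hash(String path) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                                    .digest(path.getBytes(StandardCharsets.UTF_8));
            int h = ((d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16)
                  | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
            return h & 0x7FFFFFFF;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Choose `replicas` distinct servers, starting at the hashed position.
    static List<String> place(String path, List<String> servers, int replicas) {
        if (replicas > servers.size()) {
            throw new IllegalArgumentException("not enough servers for the replica count");
        }
        int start = hash(path) % servers.size();
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            chosen.add(servers.get((start + i) % servers.size()));
        }
        return chosen;
    }
}
```

Because the mapping depends only on the path and the server list, every proxy computes the same placement without consulting a central directory. For example, `ReplicaPlacement.place("/account/container/object", servers, 3)` returns the same three servers on every call for a fixed `servers` list.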
IBM Object Storage Cloud and IBM OpenStack directions
2014 directions: a pure IBM Storage Software offering, based on OpenStack Swift, with IBM value-add, providing an object storage interface with highly available, cost-effective, scale-out storage features
– Leverage open source assets for a lightweight, flexible, interoperable foundation
Target markets
– Telco/CSP, MSP, HealthCare, FSS
Scope
– Simple and easy-to-use management
• Ease of use: XIV/Storwize GUI
• Build on community tools
• Smart Swift infrastructure management
• Cloud support: provisioning, metering
– Multi-tenant security
• Authentication and management isolation
– Compliance
• Object retention
– Architecturally able to scale
• To thousands of nodes
• Initial offerings much smaller
Private network: Zone 1, Zone 2, ... Zone n
Object URL call:
http://<host>/<api versions>/<account>/<container>/<object>
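The URL pattern above can be assembled with a small helper. This is a sketch under assumptions: the class `ObjectUrl` is hypothetical, the host, account, container, and object values in the usage note are made-up placeholders, and each name is percent-encoded as a single path segment (so a slash inside an object name would be encoded, which is a simplification).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that fills in the object URL pattern from the slide.
class ObjectUrl {

    // Percent-encode one path segment (URLEncoder uses '+' for spaces, so fix that up).
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    // http://<host>/<api version>/<account>/<container>/<object>
    static String build(String host, String apiVersion,
                        String account, String container, String object) {
        return "http://" + host
             + "/" + apiVersion
             + "/" + encodeSegment(account)
             + "/" + encodeSegment(container)
             + "/" + encodeSegment(object);
    }
}
```

For instance, `ObjectUrl.build("example.com", "v1", "AUTH_demo", "photos", "cat.jpg")` yields `http://example.com/v1/AUTH_demo/photos/cat.jpg`; an application would then issue HTTP PUT, GET, or DELETE requests against that URL.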
IBM SmartCloud capabilities for major IT architectures
Cloud Enabled: scalable, virtualized, automated lifecycle, heterogeneous infrastructure
– Existing middleware workloads
– Compatibility with existing systems (“Systems of Record”)
– IBM SCE+ (Traditional IT)
Cloud Native: elastic, multi-tenant, integrated lifecycle, standardized infrastructure
– Emerging platform workloads
– Exploitation of new environments (“Systems of Engagement”)
– IBM SoftLayer (Internet scale workloads)
SoftLayer provides world-wide services with a standardized modular infrastructure, triple network architecture, and powerful automation.
World-Wide Services
• 13 data centers with 100,000 servers and 22,000,000 domains in the US, Amsterdam and Singapore
• 19 network points of presence in 5 countries to facilitate response times
• 21,000 customers
* Sold in US English, US $ pricing
Locations: Tokyo, Hong Kong, Singapore, Seattle, San Jose, Los Angeles, Denver, Dallas (6), Houston (2), Chicago, New York City, Washington DC, Atlanta, Miami, London, Amsterdam, Frankfurt
Flexible, Automated Infrastructure
Data Center & Pods
• Standardized, modular hardware configurations
• Globally consistent service portfolio
Triple Network
• Public network for cloud services
• VPN for secure management
• Private network for communications and shared services
IMS (Automation Software)
• Bare metal provisioning
• Integrated BSS/OSS
• Comprehensive network management
Learning Points
Cloud is being driven not only by cost, but more importantly by:
– Time-to-market
– Elasticity
– Change business process
– Competitive imperatives
Cloud is a significant shift in:
– Organizational lines
– Processes
– Workflows
– Workload types
– Required skill sets
Cannot deliver true cloud services with a traditional IT organization
– The workflow, process, responsibility, and reporting lines are all different in cloud
– To provide elastic capacity, self-service E2E automation
Changing focus from on-premise (traditional IT) to off-premise (cloud)
IBM Cloud Storage products / directions include:
– Traditional IT (on-prem or off-prem):
• SmartCloud Storage Access, TPC, Storwize, XIV
• OpenStack exploitation
– Object Storage
• Software defined object storage
– Design for Fail, Cloud Native IT:
• OpenStack + XIV/Storwize
• SoftLayer
For more reading and reference, full decks by John Sing:
“Building a 21st Century Cloud Storage Service – Industry Best Practices” (external customer conference presentation):
– http://www.slideshare.net/johnsing1/building21stcenturycloudstorageservicejohnsingv4
“State of the Cloud - Internet Scale Data Center Workloads – Comparison to Traditional IT” (external customer conference presentation):
– http://www.slideshare.net/johnsing1/s-ge01-toinfinityandbeyond2012bigdatainternetscaleupdatev2johnsing-23463356
“Disruptive Innovation in the Modern IT World”:
– http://www.slideshare.net/johnsing1/a-india-csii2012disruptiveinnovationinthemodernitworldv3plenarypresentation
“Hadoop – it’s not just Internal Storage”:
– http://www.slideshare.net/johnsing1/hadoopitsnotjustinternalstoragev14
Thank You
Appendix: Disruptive Innovation
With all this opportunity... Why is this Disruptive Change flat-lining traditional consumer PC / desktop manufacturers?
PC / laptop stalwarts unsuccessful in shift to mobile
http://gigaom.com/2012/09/01/hp-dell-and-the-paradox-of-the-disrupted/
PC / laptop market value: big decreases
Cloud / mobile market value: *bigger increases*
[Chart axis: Market Capitalization]
Observe: how fast mobile internet grows by 2014
By 2014: mobile will be the main way of connecting to the Internet
Inter-Disciplinary
http://www.digitalbuzzblog.com/2011-mobile-statistics-stats-facts-marketing-infographic
Disruptive Innovation
Definition:
Creates a new market and value
Eventually disrupts the existing market
Displaces earlier technology
Clayton Christensen, Harvard Business School
http://en.wikipedia.org/wiki/Disruptive_innovation
Disruptive Innovation
Not “advanced technologies”
Inferior yet “good enough”
Novel combinations
Starts low end
Grows up-market: “low end disruption”
Clayton Christensen, Harvard Business School
http://en.wikipedia.org/wiki/Disruptive_innovation
Disruptive Innovation
Learn lessons
Watch today’s world
Illustrative examples only
Disruptive Innovation
“Consumerization”
Not just technology
Delivery models (cloud)
Business models
Ecosystems
Clayton Christensen, Harvard Business School
http://en.wikipedia.org/wiki/Disruptive_innovation
Mobile has affected all business models…
Mobile =
Geo-locational superfood
Real-time analytics
http://www.digitalbuzzblog.com/2011-mobile-statistics-stats-facts-marketing-infographic
Cloud-scale Data Centers required for: Data Supertransformagicability
Examples: TaxiWiz, HousingMaps, Weatherbug
Source: http://mashable.com/2007/07/11/google-maps-mashups-2/
By 2016, how much mobile data? What kind?
2012:
– Mobile-connected devices > number of people
2016:
– 10 billion mobile devices
– (world population: 7.3 B)
http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html
Smartphones: 48%
Web data, video: 70%
Disruptive Innovation
Big Data / Cloud on disruptive path
Traditional IT still around but….
Newer technologies disrupt all platforms
Clayton Christensen, Harvard Business School
What will the effect be on your IT organization?
Inter-Disciplinary
Internet Scale Workload Characteristics - 1
Embarrassingly parallel Internet workload
– Immense data sets, but relatively independent records being processed
  • Example: billions of web pages, billions of log / cookie / click entries
– Web requests from different users essentially independent of each other
  • Creating natural units of data partitioning and concurrency
  • Lends itself well to cluster-level scheduling / load-balancing
– Independence = peak server performance not important
– What’s important is aggregate throughput of 100,000s of servers
i.e. Very low inter-process communication
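The partitioning idea above can be sketched in a few lines of Python. This is an illustrative sketch only, not how any particular provider does it: the record layout `(user, url)`, the function names, and the use of threads as stand-ins for separate servers are all assumptions made for the example.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_workers):
    """Assign each (user, url) record to a shard by hashing the user key."""
    shards = [[] for _ in range(num_workers)]
    for user, url in records:
        # crc32 is deterministic across runs, unlike the built-in hash() for str.
        shard_id = zlib.crc32(user.encode("utf-8")) % num_workers
        shards[shard_id].append((user, url))
    return shards


def count_clicks(shard):
    """Process one shard with no reference to any other shard."""
    return Counter(url for _user, url in shard)


def aggregate_clicks(records, num_workers=4):
    """Fan the shards out to workers, then merge the partial results."""
    shards = partition(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_clicks, shards))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    clicks = [("alice", "/home"), ("bob", "/home"), ("carol", "/search")]
    print(aggregate_clicks(clicks))
```

The workers never talk to each other while processing; the only communication is the final merge. That is why aggregate throughput, rather than the speed of any single server, is the figure that matters.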
Workload Churn
– Well-defined, stable high level APIs (i.e. simple URLs)
– Software release cycles on the order of every couple of weeks
  • Means Google’s entire core of search services rewritten in 2 years
– Great for rapid innovation
  • Expect significant software re-writes to fix problems on an ongoing basis
– New products hyper-frequently emerge
  • Often with workload-altering characteristics, example = YouTube
Internet Scale Workload Characteristics - 2
Platform Homogeneity
– Single company owns, has technical capability, runs entire platform end-to-end including an ecosystem
– Most Web applications more homogeneous than traditional IT
– With immense number of independent worldwide users
1% - 2% of all Internet requests fail*
Users can’t tell difference between Internet down and your system down
Hence 99% good enough
*The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
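The "99% good enough" argument can be checked with simple arithmetic. The sketch below is an illustration, not a calculation from the cited book: it assumes network failures and service failures are independent, takes the lower 1% figure for the network, and uses a hypothetical function name.

```python
def end_to_end_success(network_success: float, service_success: float) -> float:
    """Probability that a request succeeds end to end.

    Assumes the network and the service fail independently (an assumption
    made for this sketch, not a claim from the source).
    """
    return network_success * service_success


if __name__ == "__main__":
    network = 0.99  # slide: 1% - 2% of Internet requests fail; use 1% here
    for service in (0.99, 0.999, 0.9999):
        seen_by_user = end_to_end_success(network, service)
        print(f"service {service:.4f} -> user sees {seen_by_user:.4f}")
```

Under these assumptions the user-visible success rate can never exceed the network's 99%, so each extra "nine" of service availability buys less and less that the user can actually notice.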
Fault-free operation via application middleware
– Some type of failure every few hours, including software bugs
– All hidden from users by fault-tolerant middleware
– Means hardware, software doesn’t have to be perfect
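One common way middleware hides failures is to retry a request against another replica. The following Python sketch shows that idea under stated assumptions: the names `call_with_failover` and `AllReplicasFailed`, and the modelling of a replica as a plain callable, are hypothetical and not drawn from any specific product.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]],
                       request: str) -> str:
    """Try each replica in turn and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # a hardware or software fault in one replica
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed for {request!r}")


def broken_replica(request: str) -> str:
    """Hypothetical replica that is always down."""
    raise ConnectionError("replica unavailable")


def healthy_replica(request: str) -> str:
    """Hypothetical replica that answers normally."""
    return "response to " + request


if __name__ == "__main__":
    # The caller gets an answer even though the first replica failed.
    print(call_with_failover([broken_replica, healthy_replica], "GET /object"))
```

The caller sees a normal response and never learns that one replica failed, which is the sense in which individual hardware and software components do not have to be perfect.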
Immense scale:
– Workload can’t be held within 1 server, or within max size tightly-clustered memory-shared SMP
– Requires clusters of 1000s, 10000s of servers with corresponding PBs storage, network, power, cooling, software
– Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
Internet Scale data center power components…
Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management - a facilities perspective,” presentation at ITHERM, San Diego, CA, June 1, 2006.
“The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-1, p.40 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Breakdown of data center energy overheads
Image courtesy of ASHRAE. “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-2, p.49 Barroso, Holzle
http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
Chiller alone is 33% of the cost
UPS alone is 18% of construction cost
Physical cooling, UPS dominate the electrical power cost
construction cost of Internet Scale Data Center is Power / Cooling
Facebook’s North Carolina Data Center Goes Live
Facebook: Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012
Facebook – Prineville, Oregon
Has spent $1B on its data centers
Open Compute Project
Reducing power profile reduces construction cost
Wow. Given that fact…..
Whose data centers are most power efficient?
Reducing power profile = lowers initial CAPEX SIGNIFICANTLY
Therefore, fundamental Internet Scale Data Center goal is:
Decrease Power Usage Effectiveness (PUE)
PUE = (Total building power consumed) / (IT power consumed)
http://gigaom.com/cloud/whose-data-centers-are-more-efficient-facebooks-or-googles/
Google claims its data centers use 50% less energy than competitors
Power Usage Effectiveness
– PUE = 1.14 means power overhead is only 14%
– Industry average is around 1.8
http://venturebeat.com/2012/03/26/google-data-centers-use-less-energy/
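The PUE definition and the 14% overhead figure can be worked through in code. This is a minimal Python sketch: the function names and the 1,000 kW IT load are illustrative assumptions, while the PUE values 1.14 and 1.8 are the ones quoted in the slides.

```python
def pue(total_building_kw: float, it_kw: float) -> float:
    """PUE = total building power consumed / IT power consumed."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_building_kw / it_kw


def overhead_fraction(pue_value: float) -> float:
    """Share of extra power (cooling, UPS, etc.) per unit of IT power."""
    return pue_value - 1.0


def building_power_kw(it_kw: float, pue_value: float) -> float:
    """Total building power needed to run a given IT load at a given PUE."""
    return it_kw * pue_value


if __name__ == "__main__":
    it_load_kw = 1000.0  # hypothetical IT load, chosen for illustration
    for label, value in (("PUE 1.14", 1.14), ("industry average", 1.8)):
        total = building_power_kw(it_load_kw, value)
        print(f"{label}: overhead {overhead_fraction(value):.0%}, "
              f"total {total:.0f} kW for {it_load_kw:.0f} kW of IT load")
```

A PUE of 1.0 would mean every watt goes to IT equipment, so lower is better: at 1.8 the same IT load carries 80% overhead instead of 14%.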
Industry average PUE is about 1.8
http://www.datacenterknowledge.com/archives/2011/05/10/uptime-institute-the-average-pue-is-1-8/