© 2013 IBM Corporation

Storage and “The Cloud”

1. What is driving IT / Businesses to Cloud
2. Traditional IT Organization Impact
3. Traditional vs. Design-for-Fail, On-premise vs. Off-premise
4. IBM Big Data / Cloud Storage Products and Directions

IBM Cloud Storage Briefing - December 3, 2013

Provided by: John Sing, Executive IT Consultant, San Jose, California

What is driving IT and Businesses to Cloud



Value delivered: from traditional to cloud

For users:
• Storage provisioning: from weeks to minutes
• Continuous access to data: localized, any time, anywhere
• Storage capacity: from fixed to dynamic (elastic), centralized

For IT:
• Reduced storage admin costs: up to 50% savings
• Reduced energy costs: up to 36%
• Increased storage utilization: up to 90%, from 50%

Modern 21st Century Cloud Business Value:
• Time-to-Delivery, Competitive Advantage
• Revenue: “Time is Money”



Primary drivers for move to cloud = business reasons

http://www.kpmg.com/global/en/issuesandinsights/articlespublications/cloud-service-providers-survey/pages/service-providers.aspx

Competitive Advantage, Revenue



Bandwidth availability is the tipping point for adoption of “The Cloud”

Worldwide broadband bandwidth availability is becoming commonplace

Facilitates a pervasive web services delivery model (i.e. “The Cloud”)

Hosted in mega data centers with massive amounts of:
– Processors, Storage, Network

Today, when the above three come together in a geography:
– We are seeing small and medium on-premise data centers worldwide rapidly disappearing, off-premise, into the cloud

The real question:
– Is traditional IT re-capturing / replacing workloads when they move off-premise to Cloud?



Cloud Mega Data Centers = new modular IT implementation style…

Internet-scale centers:

Data:
– 10s / 100s of petabytes

Servers:
– 100,000s

Workloads:
– Require server clusters of 100s, 1,000s, 10,000s, or more

Modular implementation



Amazon Web Services

1Q12: 450,000 servers (estimated)

Amazon Perdix Modular Datacenter

EC2 17K-core, 240-teraflop cluster: 42nd fastest supercomputer in the world

1Q13: > 2 trillion objects in S3

1Q13: 1.1 M requests/sec

http://aws.typepad.com/aws/2012/04/amazon-s3-905-billion-objects-and-650000-requestssecond.html
http://gigaom.com/cloud/how-big-is-amazon-web-services-bigger-than-a-billion/
http://aws.typepad.com/aws/2013/04/amazon-s3-two-trillion-objects-11-million-requests-second.html



Growth of The Cloud by 2016

Mobile

Geo-locational

Real-time data

Shift to cloud mega-data centers

Source: http://www.datacenterknowledge.com/archives/2012/10/23/cisco-releases-2nd-annual-global-cloud-index/

> 50% in cloud

Cisco already knows > 50% of workload is in the cloud



Cloud: No longer exploratory

Expectations: Cloud computing will be "just computing" by 2018

• Cloud is at the end of its beginning phase and has gotten serious
• Private cloud is growing, but giving way to hybrid cloud
• Service providers, VARs, SIs are rising to the cloud opportunity
• Cloud adoption is strong across large enterprise as well as SMB



So, What is a Cloud, really?

Why does it impact Traditional On-Premise IT organization so heavily?

Extracted from presentation: “Building a 21st Century Cloud Storage Service” by John Sing: http://snjgsa.ibm.com/~singj/public/2013_Berlin_System_Storage_x_Pure_Symposium/sCS05_John_Sing_Building_21st_Century_Cloud_Storage_Service_Industry_Best_Practice.ppt



To users, cloud seems “easy”, “instant”, “self-service”. So what has to happen in the background?

Some would say that virtualization = cloud

Some IT traditionalists would say that cloud is nothing more than much better managed centralized, automated data centers

Unfortunately, such statements severely understate the essential organizational element

Key message: to provide true cloud services, you must also execute a significant shift in:
– Organizational lines
– Processes
– Workflows
– Workload types
– Required skill sets



This is the cloud-enabled data center journey

Cloud adoption maturity levels:

1. Virtualized
2. Deployed
3. Optimized
4. Enhanced
5. Monetized

Level of cloud capability (macropatterns)

IBM Redpaper: http://www.redbooks.ibm.com/abstracts/redp4893.html



What’s most important: cloud macropattern workflows

1. Simple IaaS
2. Cloud Mgmt
3. Adv IaaS
4. ITIL Managed IaaS



Are you ready?

Cloud micro-pattern workflows

IBM Storwize V7000, SVC, XIV

Tivoli Storage Manager

Tivoli Storage Productivity Center

Smart Cloud Storage Access

Problem! Traditional IT organization looks nothing like this workflow!



IBM Redpapers: Building Cloud Enabled Data Center / Service Provider

http://www.redbooks.ibm.com/abstracts/redp4912.html
http://www.redbooks.ibm.com/abstracts/redp4893.html
http://www.redbooks.ibm.com/abstracts/redp4873.html



Example: IBM Storage products within the Cloud workflow

[Figure: storage provisioning workflow over an Ethernet network. Recoverable labels:]

• Non-technical users: self-provisioning requests for Windows or Linux OS and end-user consumption (file, CIFS / NFS)
• Server and application owners, developers, users, etc.: provisioning requests for LUNs to be assigned to and consumed by either physical or virtual servers (eMail, DB2, SAP, ERPs)
• TPC / storage admin
• P9: IBM SmartCloud Storage Access
• P8: IBM Tivoli Storage Productivity Center
• P0: IBM SVC / Storwize V7000 U (virtualizes IBM or 3rd-party storage arrays: HP, NetApp, EMC, etc.)
• P0: IBM SONAS (file)
• P0: IBM XIV (block)



Key Cloud organizational learning point:

Cloud involves major re-alignment of IT organization, skills

Re-alignment of IT processes, to facilitate real-time, elastic management, monitoring, delivery based on service catalog

– Aligned with the Lines of Business revenue generation / competitive advantage needs (requires full-time liaison positions)

Creation of service catalog requires IT to invest different efforts into design/automation of IT capability

– New, additional skill requirements, aligned along a very different organizational structure, metrics, and speed criteria

Provide governance that addresses risk of unauthorized or rogue access to services
– Only appropriate approvals and credentials, thus new emphasis on network + security

Addressing resistance to change within IT organization is the biggest success factor

If the on-premise IT organization is unable to change, this is also a major off-premise cloud driver



This organizational shift is a main reason why “ready-to-go” cloud workflow products (such as OpenStack) are so attractive:

Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/

OpenStack already has all cloud workflows ready for production



OpenStack comprises seven core projects that form a complete Cloud Infrastructure as a Service (IaaS) solution:

• Compute (Nova), Block Storage (Cinder), Network (Neutron): provision and manage virtual resources
• Dashboard (Horizon): self-service portal
• Image (Glance): catalog and manage server images
• Identity (Keystone): unified authentication, integrates with existing systems
• Object Storage (Swift): petabytes of secure, reliable object storage

Source: http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/

Understand OpenStack to understand IBM Cloud Storage directions



Did you know: two different types of IT architectures have emerged

Design-for-Fail IT implementation has some similarities to, but clearly isn’t the same as, Traditional IT architecture

Knowledge Check



Today there are two major types of IT Cloud architectures and workloads:

• Transactional IT: “Systems of Record”
• Internet Scale Workloads: “Systems of Engagement”

Transactional IT (“Systems of Record”):

• Cloud, High Availability, Resiliency, Disaster Recovery characteristics: can be adapted to Cloud “agnostic / after the fact”
• Data Strategy: can leverage traditional tools/concepts to understand / implement cloud; storage/server virtualization and pooling
• Automation: end-to-end automation of server / storage virtualization
• Commonality: apply master vision and lessons learned from internet scale data centers



The other major type of IT Cloud architecture and workload is Internet Scale Workloads (“Systems of Engagement”), compared here with Transactional IT (“Systems of Record”):

• Cloud, High Availability, Resiliency, Disaster Recovery characteristics
  – Transactional IT: can be designed “agnostic / after the fact” using server or storage virtualization, replication
  – Internet Scale Workloads: cloud capabilities are “designed into software stack from the beginning”
• Data Strategy
  – Transactional IT: use traditional tools/concepts to understand / know data; storage/server virtualization and pooling
  – Internet Scale Workloads: proven open source toolset used to implement failure tolerance and redundancy in the application stack
• Automation
  – Transactional IT: end-to-end automation of server / storage virtualization and replication
  – Internet Scale Workloads: end-to-end automation of the application software stack providing failure tolerance
• Commonality
  – Both: apply master vision and lessons learned from internet scale data centers



Today: two different types of IT

Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/

Transactional IT / Internet scale workloads



Today’s two major IT workload types

Source: http://it20.info/2012/02/the-cloud-magic-rectangle-tm/

Transactional IT / Internet scale workloads



How to build these two different IT architectures

Transactional IT / Internet scale workloads



What You (Consumer) Get with These different approaches:





Policy-based Clouds and Design-for-Fail Clouds are workload-optimized architectural choices

Policy-based Clouds

• Purpose optimized for longer-lived virtual machines managed by Server Administrator

• Centralizes enterprise server virtualization administration tasks

• High degree of flexibility designed to accommodate virtualization of all workloads

• Significant focus on managing availability and QoS for long-lived workloads with a level of isolation

• Characteristics derived from exploiting enterprise class hardware

• Legacy applications

Design-for-fail Clouds

• Purpose optimized for shorter-term virtual machines managed via end-user or automated process

• Decentralized control, embraces eventual consistency, focus on making “good enough” decisions

• High degree of standardization

• Significant focus on ensuring availability of control plane

• Characteristics driven by software

• New applications




Example: Traditional IT vs. Hadoop for Big Data

Traditional approach: move data to program
[Figure: user request, application server, database server; query data, return data, process data, send result]

Big Data approach: move function/programs to data
[Figure: user request, master node, data nodes; send function to process on data, query and process data, send consolidated result]

Traditional approach
• Application server and database server are separate
• Analysis program can run on multiple application servers
• Network is still in the middle
• Data has to go through the network
• Designed to analyze TBs of data

Big Data approach
• Analysis program runs where the data is: on the data node
• Only the analysis program has to go through the network
• Analysis program is executed on every DataNode
• Designed to analyze PBs of data
• Highly scalable: 1000s of nodes, petabytes and more

Thank you to: Pascal VEZOLLE/France/IBM@IBMFR and Francois Gibello/France/IBM for the use of this slide



Example: Traditional IT vs. Hadoop for Big Data

Example: For how many hours does Clint Eastwood appear in all the movies he has made?

Task: All movies need to be parsed to find Clint’s face

• Traditional approach:

1) Upload a movie to the application server through the network
2) The Analysis Program compares Clint’s picture with every frame of the loaded movie
3) Repeat the two previous steps for every movie

• Big Data approach:

1) Send the Analysis Program and Clint’s picture to all the DataNodes
2) The Analysis Program in every DataNode (all in parallel) compares Clint’s picture with every frame of the movies stored on that node
3) The results from all DataNodes are consolidated into a single result
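The three Big Data steps above can be sketched in plain Java. This is a toy model and not Hadoop code: the `DataNode` class, the `analyze` method and the frame labels are hypothetical names introduced only for illustration, and a parallel stream stands in for a cluster of machines.

```java
import java.util.List;
import java.util.function.ToIntFunction;

/** Toy model of "move the function to the data"; not Hadoop code. */
final class MoveFunctionToData {

    /** A hypothetical data node holding its own local shard of records. */
    static final class DataNode {
        private final List<String> localFrames;

        DataNode(List<String> localFrames) {
            this.localFrames = localFrames;
        }

        /** Runs the shipped function next to the data and returns only a small partial result. */
        int runLocally(ToIntFunction<List<String>> analysis) {
            return analysis.applyAsInt(localFrames);
        }
    }

    /** Ships the analysis to every node in parallel, then consolidates the partial results. */
    static int analyze(List<DataNode> nodes, ToIntFunction<List<String>> analysis) {
        return nodes.parallelStream()
                .mapToInt(node -> node.runLocally(analysis))
                .sum();
    }

    /** Example analysis: count the frames tagged as containing the actor. */
    static int countMatches(List<String> frames) {
        int matches = 0;
        for (String frame : frames) {
            if (frame.equals("clint")) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<DataNode> cluster = List.of(
                new DataNode(List.of("clint", "other", "clint")),
                new DataNode(List.of("other")),
                new DataNode(List.of("clint")));
        System.out.println(analyze(cluster, MoveFunctionToData::countMatches));
    }
}
```

Only the function reference and one integer per node cross the simulated network boundary; the frame data never leaves its node, which is the point the slide makes.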

Note: Hadoop typically uses direct attached storage



Hadoop principles: Storage, HDFS and MapReduce

Hadoop Distributed File System = HDFS: where Hadoop stores the data
– HDFS file system spans all the nodes in a cluster with locality awareness

Hadoop data storage, computation model
– Data stored in a distributed file system, spanning many inexpensive computers
– Send function/program to the data nodes
– i.e. distribute application to compute resources where the data is stored
– Scalable to thousands of nodes and petabytes of data

MapReduce Application

1. Map Phase (break job into small parts)
2. Shuffle (transfer interim output for final processing)
3. Reduce Phase (boil all output down to a single result set)

Return a single result set

// Mapper: emits (word, 1) for every token in the input line
public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text val, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(val.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}

// Reducer: sums the counts emitted for each word
public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> val, Context context) {
        int sum = 0;
        for (IntWritable v : val) {
            sum += v.get();

. . .
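To make the three phases concrete without a Hadoop installation, here is a minimal in-memory sketch of the same word count. It does not use the Hadoop API: the class `MapReducePhases` and its methods are illustrative names, and everything runs in one JVM rather than across data nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory illustration of the map, shuffle and reduce phases; not the Hadoop API. */
final class MapReducePhases {

    /** Map phase: break one input line into (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    /** Shuffle: group the interim values by key so each key reaches one reducer. */
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: boil each key's values down to a single sum. */
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int v : entry.getValue()) {
                sum += v;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    /** Runs all three phases over the input lines. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> interim = new ArrayList<>();
        for (String line : lines) {
            interim.addAll(map(line));
        }
        return reduce(shuffle(interim));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("the cloud", "the data")));
    }
}
```

In a real cluster the map calls run on the nodes that hold each input split and the shuffle moves interim pairs over the network, which is why the slide lists the network among the things to tune.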

Distribute map tasks to cluster

Hadoop Data Nodes: data is loaded, spread, resident in Hadoop cluster

Performance = tuning MapReduce workflow, network, application, servers, and storage

http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/
http://blog.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/
http://www.slideshare.net/allenwittenauer/2012-lihadoopperf



Two different types of cloud tooling

Cloud storage tooling will most likely reside:

In the external shared storage stack for policy-based traditional transactional IT:
– External IBM Smarter Storage hardware and software for block and file storage

In the virtualized server, direct attach storage, application stack for design-for-fail:
– IBM SmartCloud software, IBM participation in OpenStack, IBM SoftLayer

Both are appropriate; match each to the proper environment


http://www.slideshare.net/johnsing1/s-bd03-infinitybeyond2internetscaleworkloadsdatacenterdesignv6speaker



Read all about it. Google published this information into the public domain in 2009. The 2nd Edition of this book was published in July 2013 (includes Flash storage).

By Google:
– Luiz André Barroso
– Urs Hölzle

Available to all, free of charge

Download original edition at: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006

New! 2nd Edition published July 2013: http://www.morganclaypool.com/doi/abs/10.2200/S00516ED2V01Y201306CAC024

Video of Luiz giving one of these lectures: http://inst-tech.engin.umich.edu/leccap/view/cse-dls-08/4903

http://www.barroso.org/



Size of Cloud Market:

Magnitude of On-premise vs. Off-premise



Size of Server, Storage, Networking aggregate marketplaces

2013: $104B
2017: $117B

37% is for Storage

Compound Growth Rate 2013-2017
• Cloud Service Provider (CSP): 25%
• Enterprise Private Cloud (EPC): 23%
• Non-Cloud: -7%
• Total: 3%

Source: IBM
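As a quick consistency check on these figures, the total growth rate can be recomputed from the two market sizes with the standard compound annual growth rate formula. This is a sketch under the assumption that the period is four years (2013 to 2017); the class and method names are illustrative.

```java
/** Checks the slide's totals with the standard compound annual growth rate formula. */
final class MarketGrowthCheck {

    /** CAGR = (end / start)^(1 / years) - 1. */
    static double cagr(double start, double end, int years) {
        return Math.pow(end / start, 1.0 / years) - 1.0;
    }

    public static void main(String[] args) {
        // $104B (2013) to $117B (2017), assumed to span four annual periods
        double total = cagr(104.0, 117.0, 4);
        System.out.println("Total compound growth rate: " + total);
    }
}
```

The result is just under 0.03, which agrees with the 3% total quoted on the slide.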



Cloud adoption continues acceleration through 2017

On premise vs. off premise spend

• Cloud server, storage, networking: $57B, 24% CGR, 48% of total
  – EPC: $24B, 23% CGR
  – CSP: $33B, 25% CGR
• Non-Cloud: $60B, -7% CGR, 52% of total

[Figure labels: Enterprise On-premise Non-Cloud; Cloud IaaS; Cloud Services; On premise; Off premise]

Off-premise is clearly the growth area

Source: IBM (September 2013)



IBM Big Data / Analytics Storage Positioning



We are building real-time, integrated stream computing on massive scale

Chart in public domain: IEEE Massive File Storage presentation, author: Bill Kramer, NCSA: http://storageconference.org/2010/Presentations/MSST/1.Kramer.pdf



However, note there are multiple types of Big Data

[Figure: big data reference architecture. Recoverable labels: Data in Motion; Data at Rest; Data in Many Forms; Information Ingestion and Operational Information; Landing Area, Analytics Zone, Archive (raw data, structured data, text analytics, data mining, entity analytics, machine learning); Real-time Analytics (video/audio, network/sensor, entity analytics, predictive); Exploration, Integrated Warehouse, and Mart Zones (discovery, deep reflection, operational, predictive); Stream Processing; Data Integration; Master Data; Streams; Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Information Governance, Security and Business Continuity]

• Batch parallel Big Data processing
• Real-time in-memory servers
• Data warehouse (traditional IT)



IBM end to end Big Data portfolio

[Figure: the same big data reference architecture, annotated with IBM products]

• IBM BigInsights
• IBM InfoSphere Streams
• IBM Data Warehouse products
• IBM STG: x, p, PureSystems, Platform Computing
• IBM SWG

IBM Big Data Storage positioning

Hadoop
o Storage for Hadoop
  – IBM Big Data Networked Storage Solution for Hadoop
o PureSystems
  – IBM PureData System for Hadoop with pre-installed IBM BigInsights
  – Generally Available September 2013

Customer disk GB cost expectation (USA): 10 to 15 cents/GB with direct or SAS attach, extreme density

Optimized Multi-Temperature Data Warehouse
o All Flash
  – FlashSystem
o Hybrid
  – DS8000 EasyTier
  – Storwize EasyTier
  – FlashSystem Solution (VSC + FlashSystem)
  – XIV
o PureSystems
  – PureFlex (Storwize w/EasyTier)
  – PureData for Transactions (Storwize)
  – PureData for Analytics (Netezza)

Customer disk GB cost expectation (USA): 30 to 70 cents/GB



IBM Cloud Storage Directions



Data Growth Types in the Cloud

[Figure: Worldwide File-based vs Block-based Storage Capacity Shipments 2008-2015]

• Block – Traditional data is structured and managed by OS, i.e. Database
• File – High growth data is unstructured and managed by OS, i.e. File System
• Object – Higher growth data is unstructured and managed by Application



Object Storage – fundamental type of storage for Cloud

[Figure: servers and applications reaching object storage over the network]
• Network “best case” delivery
• Best usage = data that doesn’t change, i.e. backups, archives, digital images, virtual machine images
• Distance limited only by acceptable network latency

Object storage features are minimal compared to NAS or SAN:
– store, retrieve, copy, delete files
– control which users can do what

Protocol is usually an HTTP interface: Object Storage API (RESTful API)
– Can be in URL format for WWW access

Application is responsible for tracking object unique IDs and supplying that unique ID to retrieve data from object storage

Typically longer response times than either NAS or SAN
– Slower throughput compared to a traditional file system means object storage is unsuitable for data that changes frequently

Typical usages: great fit for data that doesn't change much:
– backups, archives, video and audio, VM images
– i.e. internet-scale repositories of data
– This is why it is so essential to Cloud

No concept of file system. Rather, the application saves an object (files + additional metadata) to the object store via a PUT API command, the application gets a unique key for the saved file, and the application must provide that unique key to a GET API command to retrieve files

Can embed searchable metadata directly into the object storage system
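The PUT / GET / metadata behaviour described above can be sketched as a toy in-memory store. This is an illustration only, not the API of any real object storage product: `ToyObjectStore`, its method names and the use of a random UUID as the unique key are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Toy in-memory object store; illustrative only, not a real object storage API. */
final class ToyObjectStore {

    /** An immutable object: payload bytes plus application-supplied metadata. */
    static final class StoredObject {
        final byte[] data;
        final Map<String, String> metadata;

        StoredObject(byte[] data, Map<String, String> metadata) {
            this.data = data.clone();
            this.metadata = Map.copyOf(metadata);
        }
    }

    private final Map<String, StoredObject> objects = new HashMap<>();

    /** PUT: stores the object and hands back the unique key the application must keep. */
    String put(byte[] data, Map<String, String> metadata) {
        String key = UUID.randomUUID().toString();
        objects.put(key, new StoredObject(data, metadata));
        return key;
    }

    /** GET: retrieves by key only; there is no directory tree to browse. */
    byte[] get(String key) {
        StoredObject object = objects.get(key);
        if (object == null) {
            throw new IllegalArgumentException("No object with key " + key);
        }
        return object.data.clone();
    }

    /** Metadata search: returns the keys of objects whose metadata has the given value. */
    List<String> findByMetadata(String name, String value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, StoredObject> entry : objects.entrySet()) {
            if (value.equals(entry.getValue().metadata.get(name))) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        ToyObjectStore store = new ToyObjectStore();
        byte[] image = "vm-image-bytes".getBytes(StandardCharsets.UTF_8);
        String key = store.put(image, Map.of("type", "vm-image"));
        System.out.println(new String(store.get(key), StandardCharsets.UTF_8));
        System.out.println(store.findByMetadata("type", "vm-image").contains(key));
    }
}
```

There is deliberately no update method: changing an object means writing a replacement, which matches the slide's point that object storage suits data that rarely changes.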



Objects are a natural fit to “born on cloud” data (mobile, social)

Objects are written once and never modified (although they can be replaced) – this describes most born-on-the-cloud data
– Pictures, e-mails, movies, tweets, blog posts, web pages, etc.
– This data is both consumer and enterprise
– Much of this data is accessed from mobile devices

Hence Object Storage is essential to participate in the Cloud Storage world

[Figure labels: Pictures, Collaboration, Backup, Archive; Rackspace; Consumer Apps, Business Apps]



Storage: SAN / NAS / Object

[Figure: I/O paths for the three storage types]

• SAN (Storage Area Network): FICON, FC, iSCSI, FCoE. Application, file I/O to the file system, then block I/O over Fibre Channel SAN or iSCSI to storage
• NAS (Network Attached Storage): CIFS, NFS, HTTP. Application, file I/O over the IP network to the file system, then block I/O to storage
• Object Storage (HTTP): object application, object API over the IP network to the object container (object I/O), then block I/O to storage



IBM Cloud Storage – current products and future directions

Traditional IT:

IBM SmartCloud Storage Access – to provide P9 and P8 Self-Service Automation (storage)

IBM Tivoli Storage Productivity Center – to provide P6 Storage Virtualization Management

IBM Storwize Family and XIV – provide P0 storage virtualization including enterprise best-in-class OpenStack exploitation

IBM SONAS and V7000 Unified - provide P0 storage virtualization for file storage

Cloud Storage and Object Storage Directions:

Exploitation of OpenStack Cinder for block storage

Exploitation of OpenStack Swift for software-defined object storage approach

Best-in-class OpenStack enterprise exploitation

Design for Fail / Cloud Native / Internet scale IT :

Exploit SoftLayer for Cloud Native

Migrate IBM SmartCloud workloads into the SoftLayer workflow approach over time



OpenStack components; IBM Storage strategic exploitation

[Figure: OpenStack components: Horizon, Nova, Cinder, Swift, Neutron, Keystone, Glance, Oslo Shared Services]

New in Havana:
• Metering (Ceilometer)
• Basic Cloud Orchestration & Service Definition (Heat)

IBM Storage:
• SVC / Storwize
• XIV
• Software Defined Object (future directions)



OpenStack Object Storage component – “Swift”

An open source, highly available, distributed, eventually consistent object store
– Two-tier architecture consisting of client-facing proxies and storage servers
– Information protected through three-way replication (by default)
– Supports geo-distribution
– The dominant design for scale-out object stores

Swift was developed as pure software disconnected from hardware
– Typically implemented on storage-rich servers, e.g., IBM x3630 M4

Swift in production at SoftLayer, Rackspace, Korea Telecom, Wikimedia, UCSD, Internap, Sonian, MercadoLibre, . . .

[Figure: Swift request path]
• Clients send REST requests over the Internet or an intranet
• Proxy Layer (public face) authenticates and forwards to appropriate storage server(s) using ring
• Storage Servers (account, container and object), on a private network, store, serve and manage data and metadata partitioned based upon ring
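The idea of a ring that maps each object to a partition, and each partition to three storage servers, can be sketched as follows. This is a deliberately simplified illustration and not Swift's actual ring implementation: the class `ToyRing`, the round-robin replica placement and the server names are assumptions, and a real ring also accounts for zones, device weights and rebalancing.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

/** Simplified hash ring with three-way replication; an illustration, not Swift's ring code. */
final class ToyRing {
    private static final int REPLICAS = 3;

    private final int partitionPower;      // the ring has 2^partitionPower partitions
    private final List<String> servers;    // hypothetical, distinct storage server names

    ToyRing(int partitionPower, List<String> servers) {
        if (partitionPower < 1 || partitionPower > 30) {
            throw new IllegalArgumentException("partitionPower must be between 1 and 30");
        }
        if (servers.size() < REPLICAS) {
            throw new IllegalArgumentException("Need at least " + REPLICAS + " servers");
        }
        this.partitionPower = partitionPower;
        this.servers = List.copyOf(servers);
    }

    /** Maps an object path to a partition using the top bits of its MD5 hash. */
    int partitionFor(String path) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(path.getBytes(StandardCharsets.UTF_8));
            int top32 = ((digest[0] & 0xFF) << 24) | ((digest[1] & 0xFF) << 16)
                    | ((digest[2] & 0xFF) << 8) | (digest[3] & 0xFF);
            return top32 >>> (32 - partitionPower);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    /** Returns the servers that hold the replicas of the given object path. */
    List<String> serversFor(String path) {
        int partition = partitionFor(path);
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < REPLICAS; i++) {
            replicas.add(servers.get((partition + i) % servers.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        ToyRing ring = new ToyRing(8, List.of("s1", "s2", "s3", "s4", "s5"));
        System.out.println(ring.serversFor("/account/container/object"));
    }
}
```

Because placement is a pure function of the object path, any proxy can compute where an object lives without consulting a central directory, which is what lets the proxy layer scale out.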



IBM Object Storage Cloud and IBM OpenStack directions

2014 directions: a pure IBM Storage Software offering, based on OpenStack Swift, with IBM value-add, providing object storage interface with highly available, cost effective, scale out storage features.

– Leverage open source assets for a lightweight and flexible, interoperable foundation

Target Markets
– Telco/CSP, MSP, HealthCare, FSS

Scope
– Simple and easy-to-use management
  • Ease of use: XIV/Storwize GUI
  • Build on community tools
  • Smart Swift infrastructure management
  • Cloud support: provisioning, metering
– Multi-tenant security
  • Authentication and management isolation
– Compliance
  • Object retention
– Architecturally able to scale
  • To thousands of nodes
  • Initial offerings much smaller

[Figure: storage nodes in Zone 1, Zone 2, … Zone n connected by a private network]

Object URL call:

http://<host>/<api versions>/<account>/<container>/<object>
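A small helper shows how an application might assemble a URL of this shape. It is a sketch only: the host, API version, account, container and object values below are placeholders, and `ObjectUrl` is an illustrative class, not part of any OpenStack client library.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds an object URL following the pattern on the slide; all values are placeholders. */
final class ObjectUrl {

    /** Percent-encodes one path segment (URLEncoder emits '+' for spaces, so fix that up). */
    static String encodeSegment(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Returns http://host/apiVersion/account/container/object with each segment encoded. */
    static String build(String host, String apiVersion, String account,
                        String container, String object) {
        return "http://" + host
                + "/" + encodeSegment(apiVersion)
                + "/" + encodeSegment(account)
                + "/" + encodeSegment(container)
                + "/" + encodeSegment(object);
    }

    public static void main(String[] args) {
        System.out.println(build("swift.example.com", "v1", "AUTH_demo", "photos", "cat 01.jpg"));
    }
}
```

Encoding each segment separately keeps characters such as spaces inside an object name from being misread as part of the URL structure.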



IBM SmartCloud capabilities for major IT architectures

Cloud Enabled (Traditional IT): IBM SCE+
• Scalable
• Virtualized
• Automated Lifecycle
• Heterogeneous Infrastructure
• Existing middleware workloads
• Compatibility with existing systems: “Systems of Record”

Cloud Native (Internet scale workloads): IBM SoftLayer
• Elastic
• Multi-tenant
• Integrated Lifecycle
• Standardized Infrastructure
• Emerging platform workloads
• Exploitation of new environments: “Systems of Engagement”



SoftLayer provides worldwide services with a standardized modular infrastructure, triple network architecture and powerful automation.

World-Wide Services
• 13 data centers with 100,000 servers and 22,000,000 domains in the US, Amsterdam and Singapore
• 19 network points of presence in 5 countries to facilitate response times
• 21,000 customers
* Sold in US English, US $ pricing

[Map locations: Tokyo, Hong Kong, Singapore, Seattle, San Jose, Los Angeles, Denver, Dallas (6), Houston (2), Chicago, New York City, Washington DC, Atlanta, Miami, London, Amsterdam, Frankfurt]

Flexible, Automated Infrastructure

Data Center & Pods
• Standardized, modular hardware configurations
• Globally consistent service portfolio

Triple Network
• Public network for cloud services
• VPN for secure management
• Private network for communications and shared services

IMS (Automation Software)
• Bare metal provisioning
• Integrated BSS/OSS
• Comprehensive network management



Learning Points

Cloud is being driven not only by cost, but more importantly by:
– Time-to-market
– Elasticity
– Change business process
– Competitive imperatives

Cloud is a significant shift in:
– Organizational lines
– Processes
– Workflows
– Workload types
– Required skill sets

Cannot deliver true cloud services with a traditional IT organization
– The workflow, process, responsibility, reporting lines are all different in cloud
– To provide elastic capacity, self-service E2E automation

Changing focus from on-premise (traditional IT) to off-premise (cloud)

IBM Cloud Storage products / directions include:
– Traditional IT (on-prem or off-prem):
  • SmartCloud Storage Access, TPC, Storwize, XIV
  • OpenStack exploitation
– Object Storage
  • Software-defined object storage
– Design for Fail, Cloud Native IT:
  • OpenStack + XIV/Storwize
  • SoftLayer



For more reading and reference, full decks by John Sing:

“Building a 21st Century Cloud Storage Service – Industry Best Practices” (external customer conference presentation):
– http://www.slideshare.net/johnsing1/building21stcenturycloudstorageservicejohnsingv4

“State of the Cloud - Internet Scale Data Center Workloads – Comparison to Traditional IT” (external customer conference presentation):
– http://www.slideshare.net/johnsing1/s-ge01-toinfinityandbeyond2012bigdatainternetscaleupdatev2johnsing-23463356

“Disruptive Innovation in the Modern IT World”:
– http://www.slideshare.net/johnsing1/a-india-csii2012disruptiveinnovationinthemodernitworldv3plenarypresentation

“Hadoop – it’s not just Internal Storage”:
– http://www.slideshare.net/johnsing1/hadoopitsnotjustinternalstoragev14



Thank You





Appendix: Disruptive Innovation



With all this opportunity……. Why is this Disruptive Change flat-lining traditional consumer PC / desktop manufacturers?

PC / laptop stalwarts unsuccessful in shift to mobile

http://gigaom.com/2012/09/01/hp-dell-and-the-paradox-of-the-disrupted/

[Figure: market capitalization]
• PC / laptop market value: big decreases
• Cloud / mobile market value: *bigger increases*



Observe: how fast mobile internet grows by 2014

By 2014: mobile will be the main way of connecting to the Internet

Inter-Disciplinary

http://www.digitalbuzzblog.com/2011-mobile-statistics-stats-facts-marketing-infographic



Disruptive Innovation

Definition:

Create new market and value

Eventually disrupts existing

Displaces earlier technology

Clayton Christensen, Harvard Business School

http://en.wikipedia.org/wiki/Disruptive_innovation




Not “advanced technologies”

Inferior yet “good enough”

Novel combinations

Starts low end

Grows up-market – “low end disruption”






Learn lessons

Watch today’s world

Illustrative examples only




“Consumerization”

Not just technology

Delivery models (cloud)

Business models

Ecosystems





Mobile has affected all business models…

Mobile =

Geo-locational superfood

Real-time analytics

http://www.digitalbuzzblog.com/2011-mobile-statistics-stats-facts-marketing-infographic



Cloud-scale Data Centers required for: Data Supertransformagicability

TaxiWiz

HousingMaps

Source: http://mashable.com/2007/07/11/google-maps-mashups-2/

Weatherbug



By 2016, how much mobile data? What kind?

2012:
– Mobile-connected devices > # people

2016:
– 10 billion mobile devices
– (world population: 7.3 B)

http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-520862.html

Smartphones 48%

Web data, video 70%




Big Data / Cloud on disruptive path

Traditional IT still around but….

Newer technologies disrupt all platforms


What will the effect be on your IT organization?

Inter-Disciplinary



Internet Scale Workload Characteristics - 1

Embarrassingly parallel Internet workload

– Immense data sets, but relatively independent records being processed
  • Example: billions of web pages, billions of log / cookie / click entries
– Web requests from different users essentially independent of each other
  • Creating natural units of data partitioning and concurrency
  • Lends itself well to cluster-level scheduling / load-balancing
– Independence = peak server performance not important
– What’s important is aggregate throughput of 100,000s of servers

i.e. very low inter-process communication
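
The partitioning idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the click records, the `partition` and `count_clicks` helpers, and the use of threads in one process as a stand-in for separate servers are all hypothetical and not taken from the briefing. Each partition is processed with no communication between workers, and only the small partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_partitions):
    """Split (user_id, url) records into buckets keyed by user id."""
    buckets = [[] for _ in range(n_partitions)]
    for user_id, url in records:
        # Stable key: sum of the user id's bytes (hypothetical choice).
        buckets[sum(user_id.encode()) % n_partitions].append((user_id, url))
    return buckets


def count_clicks(bucket):
    """Process one partition on its own; no other partition is consulted."""
    return Counter(url for _, url in bucket)


def aggregate_clicks(records, n_partitions=4):
    """Run every partition independently, then merge the partial counts."""
    buckets = partition(records, n_partitions)
    # Threads stand in for independent servers in this sketch.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(count_clicks, buckets))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    clicks = [("alice", "/home"), ("bob", "/home"),
              ("alice", "/cart"), ("carol", "/home")]
    print(aggregate_clicks(clicks))
```

Because no worker waits on another, adding partitions (servers) raises aggregate throughput without requiring any single worker to be faster.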

Workload Churn

– Well-defined, stable high-level APIs (i.e. simple URLs)
– Software release cycles on the order of every couple of weeks
  • Means Google’s entire core of search services rewritten in 2 years
– Great for rapid innovation
  • Expect significant software re-writes to fix problems on an ongoing basis
– New products hyper-frequently emerge
  • Often with workload-altering characteristics, example = YouTube



Internet Scale Workload Characteristics - 2

Platform Homogeneity
– Single company owns, has technical capability, runs entire platform end-to-end including an ecosystem
– Most Web applications more homogeneous than traditional IT
– With immense number of independent worldwide users

1% - 2% of all Internet requests fail*

Users can’t tell difference between Internet down and your system down

Hence 99% good enough

*The Data Center as a Computer: Introduction to Warehouse Scale Computing, p.81 Barroso, Holzle

http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006

Fault-free operation via application middleware
– Some type of failure every few hours, including software bugs
– All hidden from users by fault-tolerant middleware
– Means hardware, software doesn’t have to be perfect

Immense scale:
– Workload can’t be held within 1 server, or within max size tightly-clustered memory-shared SMP
– Requires clusters of 1000s, 10000s of servers with corresponding PBs storage, network, power, cooling, software
– Scale of compute power also makes possible apps such as Google Maps, Google Translate, Amazon Web Services EC2, Facebook, etc.
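
One simple way to picture fault-tolerant middleware hiding failures from users is failover across replicas. The sketch below is an assumption for illustration, not a description of any vendor's middleware: `ReplicaError`, `call_with_failover`, and the `flaky` / `healthy` replicas are hypothetical names. It also shows why imperfect components can be acceptable: if replicas fail independently with probability p, all k of them fail together with probability p to the power k.

```python
class ReplicaError(Exception):
    """Raised when a single replica cannot serve a request (hypothetical)."""


def call_with_failover(replicas, request):
    """Try each replica in turn and return the first successful response."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaError as err:
            # Hide this failure from the caller and move to the next replica.
            last_error = err
    raise ReplicaError(f"all {len(replicas)} replicas failed") from last_error


def all_fail_probability(p_single, n_replicas):
    """Chance that every replica fails, assuming independent failures."""
    return p_single ** n_replicas


def flaky(request):
    """Stand-in for a server with a failed component."""
    raise ReplicaError("disk failure")


def healthy(request):
    """Stand-in for a working server."""
    return f"ok:{request}"


if __name__ == "__main__":
    print(call_with_failover([flaky, healthy], "search?q=cloud"))
    print(all_fail_probability(0.01, 2))
```

The independence assumption is the weak point of this arithmetic; correlated failures (shared power, shared software bug) make the real figure worse.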



Internet Scale data center power components…

Image courtesy of DLB Associates: D. Dyer, “Current trends/challenges in datacenter thermal management: a facilities perspective,” presentation at ITHERM, San Diego, CA, June 1, 2006. “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 4-1, p.40, Barroso, Holzle

http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006



Breakdown of data center energy overheads

Image courtesy of ASHRAE. “The Data Center as a Computer: Introduction to Warehouse Scale Computing”, figure 5-2, p.49, Barroso, Holzle

http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006

Chiller alone is 33% of the cost

UPS alone is 18% of construction cost

Physical cooling and UPS dominate the electrical power cost



construction cost of Internet Scale Data Center is Power / Cooling

Facebook’s North Carolina Data Center Goes Live

Facebook: Lulea, Sweden - 290K sq ft (27K sq meters) by late 2012

Facebook – Prineville, Oregon

Has spent $1B on its data centers

Open Compute Project

Reducing power profile reduces construction cost



Wow. Given that fact…..

Whose data centers are most power efficient?

Reducing power profile lowers initial CAPEX SIGNIFICANTLY

Therefore, fundamental Internet Scale Data Center goal is:

Decrease Power Usage Effectiveness (PUE)

PUE = (Total Building Power consumed) / (IT power consumed)

http://gigaom.com/cloud/whose-data-centers-are-more-efficient-facebooks-or-googles/



Google claims its data centers use 50% less energy than competitors

Power Usage Effectiveness
– PUE=1.14 means power overhead is only 14%
– Industry average is around 1.8

http://venturebeat.com/2012/03/26/google-data-centers-use-less-energy/

Industry average PUE is about 1.8

http://www.datacenterknowledge.com/archives/2011/05/10/uptime-institute-the-average-pue-is-1-8/
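
The PUE figures quoted above can be worked through with a short Python sketch. The function names and the 1,000 kW IT load are assumptions chosen for illustration; only the PUE values 1.14 and 1.8 come from the slides. The overhead is everything the building draws beyond the IT equipment itself, so it equals PUE minus 1 per unit of IT power.

```python
def pue(total_building_power_kw, it_power_kw):
    """Power Usage Effectiveness: total building power / IT power."""
    if it_power_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_building_power_kw / it_power_kw


def overhead_fraction(pue_value):
    """Non-IT power (cooling, UPS losses, etc.) per unit of IT power."""
    return pue_value - 1.0


def total_power_kw(it_power_kw, pue_value):
    """Building power needed to run a given IT load at a given PUE."""
    return it_power_kw * pue_value


if __name__ == "__main__":
    it_load_kw = 1000.0  # hypothetical IT load, not from the slides
    for label, value in [("PUE 1.14", 1.14), ("industry average", 1.8)]:
        print(label,
              f"overhead={overhead_fraction(value):.0%}",
              f"total={total_power_kw(it_load_kw, value):.0f} kW")
```

For the same IT load, the gap between the two totals is the power (and the matching cooling and UPS capacity) that the more efficient facility does not have to build or pay for.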
