MANAGING UNSTRUCTURED DATA AT PETABYTE-SCALE

Joachim Schröder
Manager Solution Architects, DACH
Email: [email protected]
November 14th, 2013



Non confidential 2 2 RHS: Managing Unstructured Data

WHO’S RED HAT? - RED HAT PORTFOLIO

CERTIFIED CLOUD PROVIDERS

ON-PREMISE

JBOSS ENTERPRISE MIDDLEWARE

RED HAT ENTERPRISE MRG: MESSAGING, REALTIME LINUX, GRID

CLOUDFORMS: HYBRID CLOUD MANAGEMENT

RED HAT ENTERPRISE VIRTUALIZATION

RED HAT STORAGE

RED HAT ENTERPRISE LINUX

RED HAT OPENSHIFT: PLATFORM AS A SERVICE

RED HAT OPENSTACK: INFRASTRUCTURE AS A SERVICE

RED HAT NETWORK SATELLITE: LINUX STACK MANAGEMENT

CONSISTENT ENVIRONMENT


THE INFORMATION EXPLOSION

Main growth drivers: Virtualisation, Cloud, Mobile Computing and Big Data

Source: IDC's Digital Universe Study, Dec 2012


CORNERSTONE OF THE NEW SOFTWARE DEFINED DATACENTER

COMPUTE: SOFTWARE-DEFINED / BASED COMPUTE (Virtualization)

STORAGE: SOFTWARE-DEFINED / BASED STORAGE

NETWORKING: SOFTWARE-DEFINED / BASED NETWORKING

ENVIRONMENTAL: SOFTWARE-DEFINED / BASED FACILITIES

DATA CENTER FABRIC

DATACENTER EVOLUTION


WHAT IS RED HAT STORAGE?

Open source

Scale-out NAS (Network Attached Storage)

Deployable in on-premise, virtualized and cloud environments

Based on GlusterFS

Running on standard x86 hardware


ENTERPRISE MOBILITY

INCREASE DATA, APPLICATION AND INFRASTRUCTURE AGILITY

CLOUD APPLICATIONS

CONVERGED COMPUTE AND STORAGE

FILE SERVICES OPEN OBJECT APIs

OPEN, SOFTWARE-DEFINED STORAGE PLATFORM

SCALE-OUT STORAGE ARCHITECTURE

PHYSICAL

Standard x86 systems

Scale-out NAS solutions

VIRTUAL

Include idle or legacy resources

CLOUD

EBS

BIG DATA WORKLOADS

ENTERPRISE APPLICATIONS

DATA SERVICES

PERSISTENT DATA STORES


RED HAT STORAGE FOR ON-PREMISE

• Single namespace

• Aggregates CPU, memory, and network capacity.

• Deploys on Red Hat-supported servers and underlying storage: DAS, JBOD.

• Scales out linearly.

• Scales out performance and capacity as needed.

• Replicates synchronously and asynchronously.

RED HAT STORAGE DEPLOYMENT ON-PREMISE

[Diagram: RED HAT STORAGE FOR ON-PREMISE. Servers (CPU/MEM) with 1TB disks form a SINGLE GLOBAL NAMESPACE; horizontal axis: scale out performance, capacity, and availability; vertical axis: scale up capacity]


RED HAT STORAGE SERVER FOR PUBLIC CLOUD

[Diagram: RED HAT STORAGE FOR PUBLIC CLOUD on EBS; horizontal axis: scale out performance, capacity, and availability; vertical axis: scale up capacity]

• GlusterFS Amazon Machine Images (AMIs)

• The only way to achieve high availability of Elastic Block Storage (EBS)

• Multiple EBS devices pooled

• POSIX compatible (no application rewrite required to run on Amazon EC2)

• Scale out capacity and performance as needed

RED HAT STORAGE DEPLOYMENT ON AMAZON CLOUD

[Diagram: EC2 instances pooled under a SINGLE GLOBAL NAMESPACE]


[Diagram labels: ADMINISTRATOR, RED HAT STORAGE CLI, SSH; USERS, NFS, CIFS, FUSE, OpenStack Swift; RED HAT STORAGE POOL (VIRTUAL, PHYSICAL) with three Cloud Volume Manager (glusterd) instances and nine Brick (glusterfsd) instances]

RED HAT STORAGE—50,000 FOOT OVERVIEW


[Diagram labels: CIFS, HADOOP ENABLEMENT, REPLICATION, MULTI-SITE DR, MULTI-TENANT: NAMESPACE AND ENCRYPT, MULTI-TENANT: QoS (CGROUPS), VOLUME SNAPSHOT; CLIENT / PRESENTATION and BACKEND / PERSIST TRANSLATORS; SAMBA, USER APP, QEMU, SWIFT, MANAGE, FUSE, NFS; GLUSTERFS FRAMEWORK, GLUSTERFS; NETWORK STACK, NETWORK DEVICE, PLATFORM, BLOCK DEVICE, HARDWARE ENABLEMENT; LOCAL FILESYSTEM (XFS, OTHER), LOGICAL VOLUME MANAGEMENT; RED HAT ENTERPRISE LINUX; side labels: PLATFORM, MANAGEABILITY]

RED HAT STORAGE TECHNOLOGY STACK


RED HAT STORAGE SCALABILITY

[Charts: Sequential Read Transfer Rates and Sequential Write Transfer Rates, in MB/s (0 to 14000) against the number of servers (2 to 16), for glusterfs and Gluster-NFS, each with repl=1 and repl=2]


RHS-C Management Console


DESIGNED FOR MANAGING UNSTRUCTURED DATA

SUPPORTING A WIDE RANGE OF ENTERPRISE AND EMERGING WORKLOADS


RED HAT STORAGE FOR OPENSTACK


ENSURE GLOBAL DATA PROTECTION AND AVAILABILITY

TRANSPARENTLY DISTRIBUTE DATA GLOBALLY

SITE A SITE B

REMOTE SITE / DR


BRING APPLICATIONS CLOSER TO THE DATA

CONVERGING COMPUTE AND STORAGE

INCREASE AGILITY

REDUCE LATENCY

PROCESS DATA LOCALLY

REDUCE COSTS

STORAGE RESIDENT APPLICATIONS


HIGHLY AVAILABLE CLOUD STORAGE FOR AMAZON EC2

LEVERAGE THE ELASTICITY OF THE CLOUD WITHOUT RE-WRITING YOUR APPLICATIONS

[Diagram regions: US East (N. Virginia) (shown twice), Ireland (Cork)]

CREATING HIGHLY AVAILABLE, SCALABLE EBS STORAGE POOLS ACROSS ZONES

Now available as AWS test-drive


DELIVER COST EFFECTIVE ELASTIC CAPACITY AND PERFORMANCE

53% - 78% REDUCTION IN COSTS

SOURCE: IDC REPORT – THE ECONOMICS OF SOFTWARE BASED STORAGE

[Chart, horizontal axis: YEAR]


“Red Hat worked with us the entire way as we designed and built our architectures, helping with best practices, design considerations and layout, performance testing, and migration.”

MOHIT ANCHLIA, ARCHITECT, INTUIT TURBO TAX

PROBLEM

NEEDED A FAST, RELIABLE, AND COST-EFFECTIVE STORAGE SOLUTION TO MEET GROWING SAAS LINE OF BUSINESS

TAX RETURNS AND OTHER DATA WERE BEING STORED AS BLOBS IN AN EXPENSIVE ORACLE DB

SOLUTION

RED HAT STORAGE SERVER 2.0 FOR ON-PREMISE OBJECT STORAGE

HP DL2000s AND APACHE CASSANDRA

BENEFITS

SCALABLE ON-DEMAND STORAGE FOR UNSTRUCTURED DATA

COST EFFECTIVE SOLUTION THAT LEVERAGES COMMODITY HARDWARE

MEET GROWING CAPACITY AND PEAK PERFORMANCE NEEDS

ACHIEVE MULTI-SITE DISASTER RECOVERY

MANAGING SPRAWLING UNSTRUCTURED FINANCIAL DATA


IS THE OPPORTUNITY REAL?


DELIVERING AGILITY AND COST ADVANTAGE

FOUNDATION FOR HYBRID CLOUD AND BIG DATA

DEPLOY DATA ANYWHERE: PHYSICAL, VIRTUAL, CLOUD

ELASTIC CAPACITY AND PERFORMANCE

MODERN, SECURE FILE AND OBJECT STORAGE

ENSURE DATA PROTECTION AND AVAILABILITY

CONVERGE COMPUTE AND STORAGE

RED HAT STORAGE SERVER

DELIVERING THE NEXT GENERATION OF OPEN SOFTWARE-DEFINED STORAGE TODAY

DESIGNED FOR TODAY'S IT & DATA ECONOMICS

MANAGING UNSTRUCTURED DATA AT SCALE


RED HAT STORAGE – DEVELOPING 3rd PARTY ECO-SYSTEM


RED HAT LEADS THROUGH OPEN INNOVATION


COMMUNITY INNOVATION

SNAPSHOTTING

CHANGE DETECTION

COMPRESSION

pNFS AND NFSv4 SUPPORT

GLUSTER.ORG COMMUNITY FORGE ENHANCEMENTS AND PROJECTS

MULTI-MASTER GEO-REPLICATION

FILE VERSIONING

3-WAY REPLICATION

ERASURE CODING

SMB 3.0 SUPPORT

TRANSLATORS EXTENSION FOR PYTHON

PUPPET MANAGEMENT MODULE

GTOP - MONITORING

GLUSTER PROFILING

NDMP SERVER

SELINUX SUPPORT

PMUX – LIGHTWEIGHT MAP REDUCE


RED HAT STORAGE INFORMATION RESOURCES

RED HAT STORAGE PRODUCT INFORMATION

HTTP://WWW.REDHAT.COM/PRODUCTS/STORAGE-SERVER/

RED HAT STORAGE SOLUTIONS

HTTP://WWW.REDHAT.COM/PROMO/LIBERATE/SOLUTIONS.HTML

RED HAT STORAGE CUSTOMER SUCCESS STORIES

HTTP://WWW.REDHAT.COM/PROMO/LIBERATE/RESOURCES.HTML

RED HAT STORAGE SERVICES

HTTP://WWW.REDHAT.COM/PROMO/LIBERATE/SERVICES.HTML

GLUSTER COMMUNITY FORGE

HTTP://FORGE.GLUSTER.ORG

GLUSTER COMMUNITY

HTTP://WWW.GLUSTER.ORG


BACKUP


How Does GlusterFS Work Without Metadata?

Files are placed on one or more bricks in the cluster based on a calculation

All native clients have an algorithm built-in

All storage nodes have an algorithm built-in

Files can then be retrieved based on the same calculation

For non-native clients, the server handles retrieval and placement
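The idea of placement by calculation can be sketched in a few lines. This is a simplified illustration only, not the actual GlusterFS algorithm: it assumes an MD5 hash of the file name and a modulo over a fixed brick list, and the brick names are hypothetical. The point it shows is that any client or node holding the same brick list computes the same location, so no metadata server has to be consulted.

```python
import hashlib
from typing import Sequence

# Hypothetical bricks in hostname:/dir form; the names are illustrative only.
BRICKS = ["node1:/export1", "node2:/export1", "node3:/export1"]


def place(filename: str, bricks: Sequence[str]) -> str:
    """Choose a brick from the file name alone, without any metadata lookup.

    Simplification: MD5 plus modulo stands in for the real hashing scheme.
    """
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    # Interpret the first four bytes as an unsigned 32-bit hash value.
    hash_value = int.from_bytes(digest[:4], "big")
    return bricks[hash_value % len(bricks)]


if __name__ == "__main__":
    for name in ("report.pdf", "song.mp3", "log-2013-11-14.txt"):
        # Writing and reading both run the same calculation,
        # so they always agree on the brick.
        print(name, "->", place(name, BRICKS))
```

A plain modulo remaps most files when the brick list changes, which is one reason a production system needs a more careful scheme than this sketch.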


Bricks

A brick is the combination of a node and a file system: hostname:/dir

Each brick inherits the limits of the underlying file system (XFS)

RHS operates at the brick level, not at the node level

Ideally, each brick in a cluster should be the same size

Storage Node 1: /export1, /export2, /export3 (3 bricks)

Storage Node 2: /export1, /export2, /export3, /export4, /export5 (5 bricks)

Storage Node 3: /export1, /export2, /export3, /export4 (4 bricks)
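As a minimal sketch of the brick notation, the snippet below parses `hostname:/dir` strings and counts bricks per node for the three-node layout on this slide. The host names `node1` to `node3` and the `Brick` helper are assumptions made for illustration; they are not part of Red Hat Storage.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Brick(NamedTuple):
    """A brick is a node plus an exported directory: hostname:/dir."""
    host: str
    path: str

    @classmethod
    def parse(cls, spec: str) -> "Brick":
        host, sep, path = spec.partition(":")
        if not sep or not host or not path.startswith("/"):
            raise ValueError(f"expected hostname:/dir, got {spec!r}")
        return cls(host, path)


def bricks_per_node(bricks: Iterable[Brick]) -> Counter:
    """Count how many bricks each storage node contributes."""
    return Counter(brick.host for brick in bricks)


# Layout from the slide: 3, 5 and 4 bricks on three nodes (host names assumed).
LAYOUT = {"node1": 3, "node2": 5, "node3": 4}
BRICKS = [
    Brick.parse(f"{host}:/export{i}")
    for host, count in LAYOUT.items()
    for i in range(1, count + 1)
]

if __name__ == "__main__":
    print(len(BRICKS), "bricks in total")
    print(dict(bricks_per_node(BRICKS)))
```

Because the unit of management is the brick rather than the node, a node with more exported directories simply contributes more entries to the list.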


Volumes

A volume consists of one or more bricks and is exported with Gluster.

Volumes have administrator-assigned export names.

A brick is a member of only one volume.

A namespace can have one or more volumes.

A namespace can consist of replicated and distributed volumes.

Data in different volumes physically exists on different bricks.

Volumes can be mounted on clients using NFS, CIFS, and/or the GlusterFS native (FUSE) client.

Storage Node: /export1, /export2, /export3 (3 bricks)

Storage Node: /export1, /export2, /export3, /export4, /export5 (5 bricks)

Storage Node: /export1, /export2, /export3, /export4 (4 bricks)

Volume “homeshares”: 6-brick replica, exported as /homeshares

Volume “scratchspace”: 6-brick distribute, exported as /scratchspace
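The volume rules above can be modelled roughly as follows. This is a sketch under stated assumptions: the slide does not give the replica count, so `replica=2` is assumed for “homeshares”, and the 1 TB brick size, the host names, and the assignment of bricks to volumes are hypothetical. The `create_command` helper only builds a `gluster volume create` command string for inspection and runs nothing.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Volume:
    name: str
    bricks: List[str]
    replica: int = 1  # 1 means pure distribute; 2 is an assumed replica count

    def usable_capacity_tb(self, brick_size_tb: float) -> float:
        """Raw capacity divided by the number of copies kept of each file."""
        return len(self.bricks) * brick_size_tb / self.replica

    def create_command(self) -> str:
        """Build (but do not run) a `gluster volume create` command line."""
        parts = ["gluster", "volume", "create", self.name]
        if self.replica > 1:
            parts += ["replica", str(self.replica)]
        return " ".join(parts + self.bricks)


def check_exclusive(volumes: List[Volume]) -> Dict[str, str]:
    """Enforce the rule that a brick is a member of only one volume."""
    owner: Dict[str, str] = {}
    for volume in volumes:
        for brick in volume.bricks:
            if brick in owner:
                raise ValueError(f"{brick} is already in volume {owner[brick]}")
            owner[brick] = volume.name
    return owner


# Hypothetical split of the 12 bricks on the slide into two 6-brick volumes.
homeshares = Volume(
    "homeshares",
    ["node1:/export1", "node2:/export1", "node1:/export2",
     "node2:/export2", "node1:/export3", "node2:/export3"],
    replica=2,
)
scratchspace = Volume(
    "scratchspace",
    ["node2:/export4", "node2:/export5", "node3:/export1",
     "node3:/export2", "node3:/export3", "node3:/export4"],
)

if __name__ == "__main__":
    check_exclusive([homeshares, scratchspace])
    for vol in (homeshares, scratchspace):
        print(vol.name, vol.usable_capacity_tb(1.0), "TB usable")
        print(" ", vol.create_command())
```

With these assumptions the replicated volume offers half the usable capacity of the distributed one for the same number of bricks, which is the trade-off between protection and capacity.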


Data Flow with NFS/CIFS Client


Data Flow with Native Client


Seamless Integration for Hadoop Deployments

Built using the Hadoop file system API

Requires simple configuration file changes

The GlusterFS client C library enables direct GlusterFS access

Provides Java bindings for Hadoop compatibility

GlusterFS can co-exist with HDFS

Does not use the NameNode metadata server


Hadoop architecture overview


DATACENTER EVOLUTION

Today's modern data center is increasingly defined by and based on software.

Compute was the first, with virtualization, to begin to abstract data center resources aiding with


So, to summarize the requirements for the big data challenge:

We need a highly available storage solution that can handle hardware failure.

It needs open standards and the possibility to be replicated and accessed over geographical distance.

Automatic self-healing if a component fails, to maintain the protection level.

Automatic management, or minimal manual management, would be preferred.

Deployment agnostic: run your solution in private or public environments, or replicate between them; don't be locked in.

Avoid massive migrations when you lifecycle hardware. Migrations might not be an option if you have a large volume.


RED HAT STORAGE FOR OPENSTACK

The different module components:

- Horizon: Management dashboard
- Nova: Computing resources
- Glance: Image service
- Swift: Object store
- Quantum: Networking module
- Cinder: Volume service
- Keystone: Authentication

Red Hat Storage fits in as an infrastructural component below...


BRING APPLICATIONS CLOSER TO THE DATA


Red Hat Storage Allows You to Bring Application

For years, traditional storage companies have promised to let you use the compute power locked up in storage appliances to support application workloads.

With today's new data landscape

REDUCE LATENCY: Gain increased performance for datasets by eliminating the network hop introduced by traditional architectures.

PROCESS DATA LOCALLY

REDUCE COSTS: You can reduce costs by eliminating an entire tier of infrastructure.

INCREASE AGILITY

CUSTOMER EXAMPLE:

Customers are doing this today. In one example, a financial services company is taking advantage of this capability to provide new levels of availability and performance for their Splunk implementation.

If you are not familiar with Splunk: it is a search analytics product used to cull through thousands and thousands of logs in an organization, analyze the data, and present the results in business dashboards to drive anything from IT and security analytics to business analysis.


CUSTOMER USE CASE: IntelliTEK


Presentation path: Pandora serves up all of its music files through Red Hat Storage.

Imagine the scalability challenges Pandora faces. Each stored song needs to be transcoded into 12 different file formats, depending on the device (phone, tablet, computer, etc.) accessing it.

Pandora needs to scale up immediately to accommodate a peak in traffic and, at the same time, accommodate long-tail content access as well.

There is a publicly referenceable case study related to this customer. There is no formal write-up available.

Page 58

IS THE OPPORTUNITY REAL?

Page 59

DELIVERING AGILITY AND COST ADVANTAGE

FOUNDATION FOR HYBRID CLOUD AND BIG DATA

DEPLOY DATA ANYWHERE: PHYSICAL, VIRTUAL, CLOUD

ELASTIC CAPACITY AND PERFORMANCE

MODERN, SECURE FILE AND OBJECT STORAGE

ENSURE DATA PROTECTION AND AVAILABILITY

CONVERGE COMPUTE AND STORAGE

RED HAT STORAGE SERVER

DELIVERING THE NEXT GENERATION OF OPEN SOFTWARE-DEFINED STORAGE TODAY

DESIGNED FOR TODAY'S IT & DATA ECONOMICS

MANAGING UNSTRUCTURED DATA AT SCALE

Page 60

RED HAT STORAGE – DEVELOPING 3rd PARTY ECO-SYSTEM

Page 61

RED HAT LEADS THROUGH OPEN INNOVATION

Page 62

COMMUNITY INNOVATION

SNAPSHOTTING

CHANGE DETECTION

COMPRESSION

pNFS AND NFSv4 SUPPORT

GLUSTER.ORG COMMUNITY FORGE ENHANCEMENTS AND PROJECTS

MULTI-MASTER GEO-REPLICATION

FILE VERSIONING

3-WAY REPLICATION

ERASURE CODING

SMB 3.0 SUPPORT

TRANSLATORS EXTENSION FOR PYTHON

PUPPET MANAGEMENT MODULE

GTOP - MONITORING

GLUSTER PROFILING

NDMP SERVER

SELINUX SUPPORT

PMUX – LIGHTWEIGHT MAP REDUCE

Page 63

Page 64

BACKUP

Page 65

How Does GlusterFS Work Without Metadata?

Files are placed on one or more bricks in the cluster based on a calculation

All native clients have an algorithm built-in

All storage nodes have an algorithm built-in

Files can then be retrieved based on the same calculation

For non-native clients, the server handles retrieval and placement
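The placement idea above can be illustrated with a minimal Python sketch. It is not GlusterFS's actual algorithm: the MD5 hash, the equal-sized hash ranges, and the brick and file names are all assumptions chosen for illustration. The point is only that any party running the same calculation over the same brick list reaches the same answer without consulting a metadata server.

```python
import hashlib

HASH_SPACE = 2**32  # illustrative 32-bit hash space (an assumption)


def file_hash(name: str) -> int:
    """Map a file name to a number in the hash space (MD5 is a stand-in)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def locate(name: str, bricks: list[str]) -> str:
    """Return the brick whose equal-sized hash range contains the file's hash."""
    index = file_hash(name) * len(bricks) // HASH_SPACE
    return bricks[index]


# Hypothetical bricks in hostname:/dir form.
bricks = ["node1:/export1", "node2:/export1", "node3:/export1"]

# The writer and a later reader run the same calculation independently
# and agree on the location, with no metadata lookup in between.
placed_on = locate("tax-return-2012.pdf", bricks)
found_on = locate("tax-return-2012.pdf", bricks)
```

A non-native client (NFS or CIFS) would not run `locate` itself; in this sketch the storage node it talks to would call it on the client's behalf.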

Page 66

For use only by a student enrolled in a Red Hat training course taught by Red Hat, Inc. or a Red Hat Certified Training Partner. No part of this publication may be photocopied, duplicated, stored in a retrieval system, or otherwise reproduced without prior written consent of Red Hat, Inc. If you believe Red Hat training materials are being improperly used, copied, or distributed please email <[email protected]> or phone toll-free (USA) +1 (866) 626 2994 or +1 (919) 754 3700.

Page 67

Page 68

Page 69

Past decade: Linux + volume x86 servers transformed the server market

Displaced costly proprietary RISC/UNIX systems

Enabled new classes of workloads

Showed superior economics

Current decade: Open-source-based storage + volume x86 servers transform the storage market

Displacing costly proprietary SAN and NAS systems

Cost 1/3 to 1/2 the price of alternative proprietary solutions

Enabling new classes of workloads

Helping realize the true potential of hybrid clouds

Page 70

Bricks

A brick is the combination of a node and a file system: hostname:/dir

Each brick inherits the limits of the underlying file system (XFS)

RHS operates at the brick level, not at the node level

Ideally, each brick in a cluster should be the same size

Storage Node 1: /export1, /export2, /export3 (3 bricks)

Storage Node 2: /export1, /export2, /export3, /export4, /export5 (5 bricks)

Storage Node 3: /export1, /export2, /export3, /export4 (4 bricks)
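As a rough illustration of the `hostname:/dir` brick notation, the following Python sketch parses brick identifiers and counts bricks per node for the three-node layout on this slide. The `Brick` type, the `bricks_per_node` helper, and the host names `node1` to `node3` are hypothetical names introduced here, not part of any Red Hat Storage API.

```python
from collections import Counter
from typing import NamedTuple


class Brick(NamedTuple):
    """A brick is a node plus a directory on that node: hostname:/dir."""
    host: str
    path: str

    @classmethod
    def parse(cls, spec: str) -> "Brick":
        host, sep, path = spec.partition(":")
        if not sep or not host or not path.startswith("/"):
            raise ValueError(f"expected hostname:/dir, got {spec!r}")
        return cls(host, path)


def bricks_per_node(specs):
    """Count how many bricks each storage node contributes."""
    return Counter(Brick.parse(spec).host for spec in specs)


# The layout from the slide: 3, 5, and 4 bricks on three nodes.
specs = (
    [f"node1:/export{i}" for i in range(1, 4)]
    + [f"node2:/export{i}" for i in range(1, 6)]
    + [f"node3:/export{i}" for i in range(1, 5)]
)
counts = bricks_per_node(specs)
```

Because the unit of management is the brick rather than the node, a node contributing five bricks simply appears five times in the list.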

Page 71

Volumes

A volume consists of one or more bricks and is exported with Gluster

Volumes have administrator-assigned export names

A brick is a member of only one volume

A namespace can have one or more volumes

A namespace can consist of replicated and distributed volumes

Data in different volumes physically exists on different bricks

Volumes can be mounted on clients using NFS, CIFS, and/or the GlusterFS native (FUSE) client

Storage Node: /export1, /export2, /export3 (3 bricks)

Storage Node: /export1, /export2, /export3, /export4, /export5 (5 bricks)

Storage Node: /export1, /export2, /export3, /export4 (4 bricks)

Volume “homeshares”: 6-brick replica, exported as /homeshares

Volume “scratchspace”: 6-brick distribute, exported as /scratchspace
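The two rules that matter most here, that a brick belongs to only one volume and that a replicated volume trades capacity for copies, can be sketched in Python. Everything below is an illustrative model rather than Gluster's implementation: the `Cluster` class, the `usable_capacity` helper, the replica count of 2, the 1 TB brick size, and the assignment of particular bricks to each volume are all assumptions, since the slide does not state them.

```python
class Cluster:
    """Tracks which volume owns each brick (hypothetical model)."""

    def __init__(self):
        self.owner = {}  # brick identifier -> volume name

    def create_volume(self, name, bricks, replica=1):
        if len(bricks) % replica != 0:
            raise ValueError("brick count must be a multiple of the replica count")
        taken = [b for b in bricks if b in self.owner]
        if taken:
            # A brick is a member of only one volume.
            raise ValueError(f"bricks already in a volume: {taken}")
        for b in bricks:
            self.owner[b] = name
        return {"name": name, "bricks": list(bricks), "replica": replica}


def usable_capacity(volume, brick_size_tb):
    """Raw capacity divided by the number of copies kept of each file."""
    return len(volume["bricks"]) * brick_size_tb / volume["replica"]


cluster = Cluster()

# Assumed split of the 12 bricks: two per node for the replica volume,
# the remaining six for the distribute volume.
home_bricks = [f"node{n}:/export{i}" for n in (1, 2, 3) for i in (1, 2)]
scratch_bricks = ["node1:/export3", "node2:/export3", "node2:/export4",
                  "node2:/export5", "node3:/export3", "node3:/export4"]

homeshares = cluster.create_volume("homeshares", home_bricks, replica=2)
scratchspace = cluster.create_volume("scratchspace", scratch_bricks)

home_tb = usable_capacity(homeshares, brick_size_tb=1.0)
scratch_tb = usable_capacity(scratchspace, brick_size_tb=1.0)
```

Under these assumptions the replicated volume offers half the usable space of the distributed one from the same number of bricks, which is the cost of keeping two copies of every file.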

Page 72

Data Flow with NFS/CIFS Client

Page 73

Data Flow with Native Client

Page 74

Page 75

Seamless Integration for Hadoop Deployments

Built using the Hadoop file system API

Requires simple configuration file changes

C library GlusterFS client enables direct GlusterFS access

Provides Java binding for Hadoop compatibility

GlusterFS can co-exist with HDFS

Does not use the NameNode metadata server

Page 76

Hadoop architecture overview