
Elastic “Big Data” Infrastructure Rackspace® Private Cloud powered by OpenStack® Use Case

by Natasha Gajic

Analytical Compute Grid (ACG)

October 17, 2012

RACKSPACE® HOSTING | WWW.RACKSPACE.COM

Rackspace’s EBI Environment

Current Environment

Windows and Linux operating systems

Oracle and Microsoft database solutions

Microsoft and Oracle replication technology

SSIS

Informatica

Dedicated servers

Rapid data set growth

“Big Data” Problem

Cost of purchasing additional licenses

Time required to set up new hardware

Increased demand for DBA resources

System performance

System scalability

Capacity


Analytical Compute Grid (ACG) Features

• Host an ever-growing set of data

• Quick data collection and retrieval

• Rapid scalability

• Ease of maintenance

• Provide a standard data access API


• Ability to provide a variety of storage types:

  • Columnar

  • Relational

  • HDFS

• Enable users to select the optimal storage type for the information collected

• Leverage Rackspace® Private Cloud powered by OpenStack® and open source technology


Analytical Compute Grid (ACG) Quality Attributes


ACG on Rackspace® Private Cloud powered by OpenStack®

High Level Architecture


8 Hypervisor Servers each:

Dual Socket Six Core 2.4GHz Processors

96GB RAM

Terabytes of Storage

*The environment will grow significantly next year


Image


Database Engine Selection

Columnar: Cassandra

Relational: PostgreSQL

HDFS: Hadoop


Node


Controller


API


Indexing Structure


• ACG Indexing Structure:

• Resides on a set of Rackspace® Private Cloud powered by OpenStack® instances

• It is a set of pointers ultimately addressing database entities

• ACG Controller manages Indexing Structure

• Dynamically expands vertically and horizontally to address a growing data set


• ACG Indexing Structure Enables:

• Distribution of databases across many instances

• Splitting large data sets across many instances

• Parallelization of large data set queries

• Deploying data stores with optimal configuration, minimizing maintenance

• Accessing data residing in a variety of storage types via a uniform interface
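
The bullets above describe the indexing structure as a set of pointers that lets data be split across instances and queried in parallel. The following is a minimal sketch of that idea only. The slides do not say how ACG routes keys, so hash-based routing is an assumption here, plain dictionaries stand in for database instances, and `IndexingStructure`, `put`, `get`, and `parallel_query` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib


class IndexingStructure:
    """Toy index: maps a record key to the instance that holds it.

    Each 'instance' is a plain dict standing in for a data store node.
    Hash routing is an assumption made for illustration only.
    """

    def __init__(self, instance_count):
        self.instances = [dict() for _ in range(instance_count)]

    def _pointer(self, key):
        # Stable hash so the same key always resolves to the same instance.
        return zlib.crc32(key.encode("utf-8")) % len(self.instances)

    def put(self, key, value):
        self.instances[self._pointer(key)][key] = value

    def get(self, key):
        return self.instances[self._pointer(key)].get(key)

    def parallel_query(self, predicate):
        """Run the same filter on every instance concurrently, then combine."""
        def scan(instance):
            return [(k, v) for k, v in instance.items() if predicate(k, v)]

        with ThreadPoolExecutor(max_workers=len(self.instances)) as pool:
            partials = list(pool.map(scan, self.instances))
        return sorted(row for part in partials for row in part)


index = IndexingStructure(instance_count=4)
for i in range(100):
    index.put(f"device-{i}", i)

high = index.parallel_query(lambda key, value: value >= 97)
```

Because callers only go through `put`, `get`, and `parallel_query`, the number of instances behind the index can change without the caller's code changing, which is the property the slides attribute to the indexing structure.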


Sorter & Aggregator


• ACG Sorter & Aggregator Enables:

• Joining the results from multiple ACG nodes

• Result sorting and aggregation

• Together with a temporary segment, it will support joining heterogeneous data sets
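
As a rough illustration of combining results from multiple nodes, the sketch below merges per-node result lists that are each already sorted, then aggregates the merged rows. It is not taken from the ACG implementation: the row layout, the field names (`ts`, `host`, `down_s`), and the helper names `merge_sorted` and `aggregate` are all assumptions.

```python
import heapq
from collections import defaultdict


def merge_sorted(partials, key):
    """Merge per-node result lists (each already sorted by `key`) into one sorted list."""
    return list(heapq.merge(*partials, key=key))


def aggregate(rows, group_key, value_key):
    """Sum `value_key` per `group_key` across all rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


# Hypothetical partial results returned by three nodes, each sorted by timestamp.
node_results = [
    [{"ts": 1, "host": "a", "down_s": 5}, {"ts": 4, "host": "b", "down_s": 2}],
    [{"ts": 2, "host": "a", "down_s": 1}],
    [{"ts": 3, "host": "b", "down_s": 7}, {"ts": 5, "host": "a", "down_s": 4}],
]

merged = merge_sorted(node_results, key=lambda row: row["ts"])
totals = aggregate(merged, group_key="host", value_key="down_s")
```

A k-way merge keeps the combined result ordered without re-sorting everything, which matters when each node returns a large, pre-sorted slice.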


Quality Attributes


Quality Attributes - Performance


Rackspace® Private Cloud powered by OpenStack®:

• Creates ACG node in 30 seconds

• Creates ACG nodes concurrently

ACG:

• Controlled data set size resulting in:

  • Quick data distribution

  • Query parallelization

  • Fast data retrieval
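
To show why concurrent creation matters, here is a small sketch in which several node requests are submitted at once, so the total wait stays close to a single boot time instead of growing with the node count. `create_node` is a placeholder that only sleeps; it is not an OpenStack API call, and the node names and the scaled-down delay are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_node(name, boot_seconds=0.05):
    """Placeholder for a node-provisioning request.

    A real implementation would call the cloud's compute API; here we only
    sleep to imitate the wait, scaled down from the 30 seconds on the slide.
    """
    time.sleep(boot_seconds)
    return {"name": name, "status": "ACTIVE"}


def create_nodes_concurrently(names):
    """Submit all requests at once so the total wait is close to one boot time."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(create_node, names))


start = time.perf_counter()
nodes = create_nodes_concurrently([f"acg-node-{i}" for i in range(4)])
elapsed = time.perf_counter() - start
```

With four requests run one after another, the wait would be roughly four boot times; submitted together, `elapsed` should stay near one.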


Quality Attributes – Scalability


Rackspace® Private Cloud powered by OpenStack®:

• Quick and concurrent ACG node creation

• Ability to re-size existing nodes

• Ability to remove nodes

ACG:

• Indexing structure and controlled data set size allow ACG to stabilize quickly as it expands or contracts


Quality Attributes – Availability


Rackspace® Private Cloud powered by OpenStack®:

• Rapidly replace failed ACG nodes

ACG:

• Deploys data store native availability mechanisms (replication, data distribution…)


Quality Attributes – Maintainability


Rackspace® Private Cloud powered by OpenStack®:

• Adding ACG nodes expands:

  • Storage capacity

  • CPU power

  • RAM

• No DBA or system administrator activity required

ACG:

• Controlled data set size enables:

  • Optimal and stable data store configuration

  • Reducing demand for managing data store objects

  • Stable query execution plans


Quality Attributes – Flexibility


• Variety of storage types:

  • Columnar – Cassandra: time series data

  • Relational – PostgreSQL: relational data

  • HDFS – Hadoop: unstructured data

• Ability to select the optimal storage type for an individual use case
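
The pairing of data kind and storage type listed above can be written down as a simple lookup. This is only a sketch of the selection idea; the slides do not describe how ACG chooses a store, and the dictionary keys and the `choose_store` function are hypothetical.

```python
# Mapping taken from the slide: data kind -> (storage type, engine).
STORE_BY_DATA_KIND = {
    "time_series": ("columnar", "Cassandra"),
    "relational": ("relational", "PostgreSQL"),
    "unstructured": ("HDFS", "Hadoop"),
}


def choose_store(data_kind):
    """Return (storage type, engine) for a data kind, or raise if unknown."""
    try:
        return STORE_BY_DATA_KIND[data_kind]
    except KeyError:
        raise ValueError(f"no storage type registered for {data_kind!r}") from None


storage_type, engine = choose_store("time_series")
```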


Quality Attributes – Usability


• Standard interfaces:

  • SQL language

  • JDBC API

  • Data store native calls

  • Native bulk loader utility

• ACG will support joining heterogeneous data sets
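
The slides state that ACG will support joining heterogeneous data sets but give no detail on how. One plausible approach, shown here purely as an assumption, is to stage rows from each store in a temporary relational area and join them with standard SQL. An in-memory SQLite database stands in for that temporary area, and the sample rows, table names, and `join_via_temp_segment` are hypothetical.

```python
import sqlite3

# Hypothetical rows, as if fetched from two different data stores.
columnar_rows = [("web-1", 120), ("web-2", 0), ("db-1", 45)]  # host, downtime seconds
relational_rows = [("web-1", "Acme"), ("web-2", "Acme"), ("db-1", "Globex")]  # host, customer


def join_via_temp_segment(downtime, owners):
    """Stage both inputs in temporary tables, then join and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TEMP TABLE downtime (host TEXT, seconds INTEGER)")
        conn.execute("CREATE TEMP TABLE owners (host TEXT, customer TEXT)")
        conn.executemany("INSERT INTO downtime VALUES (?, ?)", downtime)
        conn.executemany("INSERT INTO owners VALUES (?, ?)", owners)
        cursor = conn.execute(
            "SELECT o.customer, SUM(d.seconds) "
            "FROM downtime d JOIN owners o ON o.host = d.host "
            "GROUP BY o.customer ORDER BY o.customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()


per_customer = join_via_temp_segment(columnar_rows, relational_rows)
```

Staging in a relational scratch area lets the caller keep using plain SQL regardless of where each input originally lived.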


Rackspace Use Case


• Subject:

• Complex availability calculation sourcing 3 months of monitoring data and creating 1 billion records in initial calculation


• Environment 1

• Data warehouse on a Microsoft SQL Server database

• SSIS data loading

• A SQL Server instance with 24 CPUs and 250 GB RAM was dedicated to the initial calculation

• A SQL Server stored procedure performed the calculation

• Source and result are stored in a traditional data warehouse structure


• Environment 2

• In 30 seconds, ACG Node Manager instantiated a new columnar data store consisting of 4 Cassandra nodes and registered it in the ACG Indexing Structure

• Each ACG node has 2 CPUs and 8 GB RAM

• Informatica data loading

• Calculation developed in Java

• Source and result are stored in a columnar structure suitable for time series data

Rackspace Use Case - Result

• Calculation duration:

  • Microsoft SQL Server: 5 days

  • ACG: 3.5 hours

• Storage size:

  • Microsoft SQL Server: 500 GB

  • ACG: 20 GB

• Complexity of the calculation:

  • A columnar data store is optimal for time series data. Sourcing from the columnar data store resulted in a relatively simple Java calculation process compared to the SQL Server stored procedure.
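
The figures quoted for the two environments can be turned into ratios with a few lines of arithmetic. This sketch uses only the numbers on the slides and assumes "5 days" means 120 hours of elapsed time; the variable names are illustrative.

```python
# Figures quoted on the slides.
sql_server_hours = 5 * 24  # 5 days, assumed to be continuous elapsed time
acg_hours = 3.5
sql_server_storage_gb = 500
acg_storage_gb = 20

speedup = sql_server_hours / acg_hours
storage_ratio = sql_server_storage_gb / acg_storage_gb

# Total resources: one 24-CPU / 250 GB server vs. four nodes of 2 CPUs / 8 GB each.
acg_cpus = 4 * 2
acg_ram_gb = 4 * 8

print(f"speedup: {speedup:.1f}x, storage reduction: {storage_ratio:.0f}x")
print(f"ACG used {acg_cpus} CPUs and {acg_ram_gb} GB RAM in total")
```

Under that assumption the calculation ran roughly 34 times faster and used 25 times less storage, on 8 CPUs and 32 GB RAM in total rather than 24 CPUs and 250 GB.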

Rackspace Use Case - Conclusion

• Selecting the optimal data store for the use case resulted in:

  • Substantial performance improvement

  • Reduced storage demand

  • Simplified processes

  • Ability to process terabytes of data per day close to real-time and on-demand

  • Improved trending and reporting:

    • Enhanced support capabilities

    • Improved Rackspace customer experience

  • Significant cost reduction

RACKSPACE® HOSTING | 5000 WALZEM ROAD | SAN ANTONIO, TX 78218

US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM

RACKSPACE® HOSTING | © RACKSPACE US, INC. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM