The IBM Data Engine for NoSQL on IBM Power Systems™

The use of NoSQL has exploded in recent years to meet user expectations for real-time response at scale.

The massive size and growth of mobile and social applications built around cloud architectures have driven the adoption of NoSQL databases for their speed, capacity, resiliency and simplicity.

Many industries, including banking, defense, biotech, web, telecom and others, have adopted NoSQL database capabilities.

NoSQL databases can fall into one of the following categories:

• Key value store (Redis, Memcached)
• Column store (Cassandra, Bigtable)
• Document store (MongoDB, CouchDB)
• Graph (Neo4j, Titan)

Many high-performance databases run in-memory to meet the demands of analytics, web, mobile and social applications that need lightning-fast response; therefore, memory capacity defines the size of the data set that can be processed.

• NoSQL databases in particular run entirely in-memory or rely heavily on memory as a cache to meet application performance requirements.

• These solutions can get expensive and hard to scale, and the latency associated with traditional I/O attached storage can degrade application performance.

IBM has a solution: The IBM Data Engine for NoSQL

The IBM Data Engine for NoSQL is an integrated platform for large, fast-growing key value store (KVS) NoSQL databases (Redis). By using a combination of DRAM and Coherent Accelerator Processor Interface (CAPI)–attached flash memory, this integrated platform creates a new tier of memory of up to 40 TB capacity. The IBM Data Engine for NoSQL offers significantly lower deployment and operational costs and improved computing performance for super-scalable, high-performing KVS memory databases (Redis, provided by Redis Labs) on a scale-out infrastructure.

[Figure: Relative Performance and Cost as a Function of Memory/Flash Ratio. The horizontal axis runs from All Flash to All Memory; the curves show typical performance and typical cost. Chart annotations: 10%, 3x, 80% Flash.]

With the IBM Data Engine for NoSQL, large databases are faster and cheaper to run.

By reducing the number of nodes required for the solution by up to 24 times, there is a dramatic reduction in the total cost of operation (TCO) for networking, floor space, energy, cooling and operations overhead.* A 12 TB database is one-third the cost of a traditional deployment, while maintaining a very high ratio of performance to cost.

*For KVS workloads only.

What is the Coherent Accelerator Processor Interface (CAPI)?

A key innovation in the IBM POWER8® architecture, CAPI is a method of adding a processing engine to a POWER8 system.

• A CAPI accelerator acts as a peer to POWER8 cores, sharing the same memory space and greatly reducing device communication overhead.
• CAPI devices can accelerate applications beyond the capabilities of a general-purpose processor.
• CAPI accelerators can participate like POWER8 processors, with direct access to memory, greatly reducing overhead.
• Simplified addressing makes CAPI easy to use and easy to program.
• Monte Carlo algorithms, key value stores, and financial and medical algorithms are ideal for CAPI.
• CAPI can also be used as a foundation for flash memory expansion.
• A wide variety of application domains can take advantage of CAPI, including database acceleration and fast storage, data analytics and pattern recognition, visual/biometric analysis, and high-performance computing applications in healthcare, weather, finance and insurance, oil and gas and manufacturing.

What is Redis?

Redis (REmote DIctionary Server) is an in-memory, key value store NoSQL database that offers high performance, scalability and persistent storage on disk.

Redis supports several kinds of values, from simple strings to more complex data structures. These include binary-safe strings, lists, sets, sorted sets, hashes, bit arrays or bitmaps and HyperLogLogs. It also supports a lightweight, easy-to-use publish/subscribe mechanism for broadcasting messages, and client libraries are available for all major languages. Redis is used by a number of organizations, including Twitter, Instagram, Pinterest, GitHub, Craigslist and Stack Overflow.

About Redis Labs

Redis Labs is the leading commercial provider of open-source Redis. Redis Labs Enterprise Cluster (RLEC) is the only on-premises, enterprise-grade deployment environment for Redis OSS, enabling super-fast performance, seamless scalability, true high availability, reliability and best-in-class expertise. Redis Labs has 4,200 customers in 40 countries, 24,000 free trial customers and over 80,000 databases, and provides 24/7 support. It is headquartered in Mountain View, CA, with R&D in Tel Aviv.

[Figure: Hardware Components of the IBM Data Engine for NoSQL. Labeled elements: Application, Flash APIs, POWER8, DRAM, PSL, Flash AFU, Flash Array.]

What are the hardware components of the IBM Data Engine for NoSQL?

The design enables the processor main memory to provide the fast response times that applications require by using main memory to cache or hold the most frequently accessed data, while leveraging the flash storage attached via CAPI to store the remaining in-memory data*.

• IBM FlashSystem® 840 Storage solution, firmware version 1.1.3.0 or later
• FlashSystem storage array
• CAPI adapter card
• FPGA chip
• Fibre Channel I/O ports

*Providing the POWER8 processors with direct access to both DRAM and flash enables application software to adjust memory and flash usage ratios to optimize performance and cost.
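
The tiering idea described above can be illustrated with a toy model. The following is a minimal Python sketch, not IBM's implementation: the class name `TieredStore`, the `dram_capacity` parameter and the least-recently-used eviction policy are all assumptions made for illustration, and both tiers are ordinary in-process dictionaries. It only shows how hot keys can stay in a small fast tier while colder keys spill to a larger one.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier key value store (hypothetical, for illustration only).

    A small 'DRAM' tier holds the most recently used entries; everything
    else is demoted to a 'flash' tier. Both tiers are plain dictionaries.
    """

    def __init__(self, dram_capacity):
        if dram_capacity < 1:
            raise ValueError("dram_capacity must be at least 1")
        self.dram_capacity = dram_capacity  # max entries kept in the fast tier
        self.dram = OrderedDict()           # ordered oldest -> most recently used
        self.flash = {}                     # stand-in for CAPI-attached flash
        self.dram_hits = 0
        self.flash_hits = 0

    def put(self, key, value):
        # A write lands in the fast tier; any stale flash copy is dropped.
        self.flash.pop(key, None)
        self.dram[key] = value
        self.dram.move_to_end(key)
        self._evict_if_needed()

    def get(self, key):
        if key in self.dram:
            self.dram_hits += 1
            self.dram.move_to_end(key)      # mark as most recently used
            return self.dram[key]
        if key in self.flash:
            self.flash_hits += 1
            value = self.flash.pop(key)
            self.dram[key] = value          # promote the entry to the fast tier
            self._evict_if_needed()
            return value
        return None

    def _evict_if_needed(self):
        # Demote least recently used entries until the fast tier fits.
        while len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self.flash[old_key] = old_value
```

In this sketch, `dram_capacity` plays the role of the memory/flash ratio: a larger value keeps more data in the fast tier at a higher cost, while a smaller value pushes more reads to the slower tier. Comparing `dram_hits` with `flash_hits` for a given workload gives a rough feel for that trade-off.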

[Figure: Software Components of the IBM Data Engine for NoSQL. Labeled elements: Redis Configuration/Setup/Provisioning, Redis Instance, KV Fcn, Block Fcn, Disk Utility, Master Context, Linux Kernel, Firmware, Adapter, PSL, AFU, and up to 40 TB of fiber-attached flash. The legend distinguishes data flows, configuration paths and error flows.]

What are the software components of the IBM Data Engine for NoSQL?

This software arrangement gives the application direct access to the flash memory through a set of developer APIs that provide key value and raw block I/O interfaces to manage and access the data in flash memory.

Management Layer: Consists of the initialization scripts invoked at system boot and shutdown.

Master Context: Daemon that initializes the adapter, completes logical unit number (LUN) discovery and mapping, does error recovery and health checking, addresses uncorrectable errors and manages link events on behalf of client application software.

Block I/O APIs: Handle read/write requests for specific blocks and issue commands directly to the accelerator function unit (AFU) to read/write data on a logical address in flash memory.

Key Value Storage APIs: Provide a generic key value database that forms the bridge between Redis and the block I/O APIs.

Redis Instance: A commercial grade Redis implementation provided by Redis Labs.
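
To make the layering concrete, here is a minimal Python sketch of a key value layer sitting on top of a block interface. It is not the IBM API: `BlockDevice`, `KeyValueLayer`, their method names and the 4096-byte `BLOCK_SIZE` are hypothetical names and values chosen for illustration, and the "device" is just a list in memory. The sketch only shows the general idea of mapping each key to one or more fixed-size blocks.

```python
BLOCK_SIZE = 4096  # bytes per block; an assumed value for illustration


class BlockDevice:
    """In-memory stand-in for a raw block I/O interface (hypothetical)."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block write must be exactly BLOCK_SIZE bytes")
        self.blocks[lba] = bytes(data)

    def read_block(self, lba):
        return self.blocks[lba]


class KeyValueLayer:
    """Maps each key to a run of blocks on the device (hypothetical)."""

    def __init__(self, device):
        self.device = device
        self.index = {}  # key -> (list of block addresses, value length in bytes)
        self.free = list(range(len(device.blocks)))  # unused block addresses

    def set(self, key, value):
        needed = max(1, -(-len(value) // BLOCK_SIZE))  # ceiling division
        reusable = len(self.index[key][0]) if key in self.index else 0
        if needed > len(self.free) + reusable:
            raise MemoryError("not enough free blocks for this value")
        self.delete(key)  # release blocks held by a previous value, if any
        lbas = [self.free.pop() for _ in range(needed)]
        for i, lba in enumerate(lbas):
            chunk = value[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            # Pad the final chunk so every write is a whole block.
            self.device.write_block(lba, chunk.ljust(BLOCK_SIZE, b"\0"))
        self.index[key] = (lbas, len(value))

    def get(self, key):
        entry = self.index.get(key)
        if entry is None:
            return None
        lbas, length = entry
        data = b"".join(self.device.read_block(lba) for lba in lbas)
        return data[:length]  # strip the padding added on write

    def delete(self, key):
        entry = self.index.pop(key, None)
        if entry is None:
            return False
        self.free.extend(entry[0])
        return True
```

Values are passed as `bytes`. The index lives in an ordinary dictionary here; a real key value layer would also need to persist its index and handle concurrency and errors, which this sketch leaves out.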

Built for Linux

IBM has introduced a line of Linux®-only scale-out servers that include the POWER8 processors optimized for Linux. This means nearly seamless swapping of POWER8 into any infrastructure built on Linux. Specifically:

• Hardware-agnostic applications written in scripting or interpreted languages (Java, Perl, Python, PHP) run as is on IBM Power Systems™, just as they do on x86.

• Most x86/Linux applications written in C/C++ require only a recompile.

The POWER8 Difference

Building on the collaboration with the OpenPOWER Foundation, IBM is uniquely positioned to deliver a higher-performing stack by working with key component providers while still allowing interchangeability of the components.* Here are the facts for POWER8 vs. x86:

• 10x increase in bandwidth
• 7x reduction in latency
• From 1-second processing to 140 milliseconds
• CAPI, SMT and NVIDIA GPU accelerators
• OpenPOWER Foundation

*Based on a POWER8 S824 with 24 cores, 256 GB Memory, 3.52 GHz, RHEL 7.0, WAS 8.5.5.2, DB2 9.7, JDK 7.0 FP1 compared to an Ivy Bridge EP 24 cores, 256 GB Memory, 2.7 GHz, RHEL 6.5, WAS 8.5.5.1, DB2 9.7, JDK 7.0 FP1.