Best practices for using flash in hyperscale software storage architectures
Brandon Hoang, Solutions Architect
Flash Memory Summit 2015, Santa Clara, CA

Software-Defined Storage (SDS)

Software + Commodity Servers = Software-Defined Storage

Double-digit growth: a $2.8 billion market by 2017 (IDC, 2015)

Who is Hedvig?

• Founded in 2012 by Avinash Lakshman
  – Co-inventor of Amazon Dynamo and inventor of Apache Cassandra
• Develops the Hedvig Distributed Storage Platform
  – A software-defined storage solution

Anatomy of a distributed, hyperscale storage system

Application tier
• Hosts access storage tier via storage proxies

Scale-out storage tier with commodity servers
• Each node hosts:
  – Metadata
  – Data

Taking advantage of flash w/ SDS

• At the storage server node
  – Store metadata on SSD: fast lookups and tracking
  – Write optimization: sequentialize random I/O
  – Auto-tier and cache active data on SSDs: speed access to hot data and buffer HDD capacity tier
  – Provision volumes on “all-flash” persistent storage (aka “pin to flash”): dedicated, consistent performance for latency-sensitive apps
• At the application host
  – Cache hot data on local SSD or PCIe flash to accelerate access and avoid network latencies
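The auto-tiering bullet above can be illustrated with a minimal sketch. This is not Hedvig's implementation: the class name `TieredStore`, the `promote_after` read threshold, and the "evict the coldest block" rule are all assumptions chosen for illustration. The HDD tier always holds the data, and the SSD tier holds copies of blocks that have been read often enough to count as hot.

```python
from collections import Counter


class TieredStore:
    """Toy two-tier store: a small SSD tier in front of an HDD capacity tier.

    All names and thresholds here are hypothetical, for illustration only.
    """

    def __init__(self, ssd_capacity, promote_after=2):
        self.ssd_capacity = ssd_capacity    # max number of blocks kept on SSD
        self.promote_after = promote_after  # reads before a block counts as hot
        self.ssd = {}                       # block_id -> data (hot copies)
        self.hdd = {}                       # block_id -> data (capacity tier)
        self.reads = Counter()              # per-block read counts

    def write(self, block_id, data):
        # Data lands on the capacity tier; drop any stale SSD copy.
        self.hdd[block_id] = data
        self.ssd.pop(block_id, None)

    def read(self, block_id):
        """Return (data, tier_that_served_the_read)."""
        self.reads[block_id] += 1
        if block_id in self.ssd:
            return self.ssd[block_id], "ssd"
        data = self.hdd[block_id]
        if self.reads[block_id] >= self.promote_after:
            self._promote(block_id, data)
        return data, "hdd"

    def _promote(self, block_id, data):
        if len(self.ssd) >= self.ssd_capacity:
            coldest = min(self.ssd, key=lambda b: self.reads[b])
            # Keep the resident block unless the newcomer is strictly hotter.
            if self.reads[coldest] >= self.reads[block_id]:
                return
            del self.ssd[coldest]
        self.ssd[block_id] = data
```

A real system would also age the counters so that blocks that were hot long ago eventually cool down; the sketch omits that to stay short.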

Biggest flash benefit to hyperscale: Sequentializing random I/O

1. Application writes data in random blocks, and gets immediate ack from cluster
2. Storage cluster sequentializes incoming blocks (in RAM+SSD) into larger chunks
3. Larger sequentialized data chunks are written to underlying disks according to policy

Diagram labels: Application Server, Hyperscale client, Hyperscale nodes
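The three steps above can be sketched as a small write coalescer. This is a toy model under stated assumptions, not the Hedvig write path: the class name `WriteCoalescer`, the `chunk_blocks` flush threshold (standing in for "policy"), and the choice to sort each chunk by block address are all hypothetical. A log-structured design might instead append blocks in arrival order.

```python
class WriteCoalescer:
    """Toy model of the three-step write path (all names are hypothetical)."""

    def __init__(self, chunk_blocks=4):
        self.chunk_blocks = chunk_blocks  # assumed policy: flush every N writes
        self.log = []    # stands in for the RAM+SSD write log, in arrival order
        self.disk = []   # stands in for the HDD tier: a list of written chunks

    def write(self, lba, data):
        # Step 1: append the random write to the log and ack immediately.
        self.log.append((lba, data))
        if len(self.log) >= self.chunk_blocks:
            self.flush()
        return "ack"

    def flush(self):
        if not self.log:
            return
        # Step 2: collapse overwrites (the last write to an address wins)
        # and order the surviving blocks into one larger chunk.
        latest = dict(self.log)
        chunk = sorted(latest.items())
        # Step 3: write the whole chunk to the underlying disk in one pass.
        self.disk.append(chunk)
        self.log.clear()


wc = WriteCoalescer(chunk_blocks=4)
for lba, data in [(90, b"a"), (7, b"b"), (42, b"c"), (7, b"d")]:
    wc.write(lba, data)
# wc.disk now holds a single chunk: blocks 7, 42, 90, with the later write to 7
```

The application sees four small acknowledged writes, while the disk tier sees one larger write, which is the property the slide highlights.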

Three ways flash is used in hyperscale systems

• Read/write cache on storage nodes
• “Pin to flash” dedicated primary storage volume
• Client-side read cache

Option #1: Node OS storage

Type of flash: SLC/MLC SSDs or PCIe flash

Use of flash:
– Store metadata for fast operations: dedupe, compression, snaps, clones
– Autotiering to ensure hot data is migrated to flash
– Write logs for metadata and data

Typical configuration: 2x 300GB MLC SAS/SATA SSDs

Option #2: Node volume storage

Type of flash: SLC/MLC SSDs or PCIe flash

Use of flash:
– All-flash virtual volumes for dedicated, consistent performance on a per-app basis
– Flash performance for read and write operations

Typical configuration: 2x 800GB MLC SAS/SATA SSDs

Option #3: Client-side cache

Type of flash: SLC/MLC SSDs or PCIe flash

Use of flash:
– Write-through cache to store hot blocks
– Local metadata storage

Typical configuration: 800GB MLC SAS/SATA SSDs

A typical application server can deliver 65K IOPS, vs. 15K without flash
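A write-through cache for hot blocks, as listed above, could look roughly like the following sketch. The class name `WriteThroughCache`, the dict-like `backend` standing in for the storage cluster, and the LRU eviction rule are assumptions made for illustration; the slides do not say how the client-side cache chooses what to evict.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Toy client-side cache in front of a remote store (names are hypothetical)."""

    def __init__(self, backend, capacity):
        self.backend = backend      # dict-like stand-in for the storage cluster
        self.capacity = capacity    # max number of blocks held on local flash
        self.cache = OrderedDict()  # block_id -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.hits += 1
            self.cache.move_to_end(block_id)  # mark as most recently used
            return self.cache[block_id]
        # Miss: pay the network round trip, then keep a local copy.
        self.misses += 1
        data = self.backend[block_id]
        self._insert(block_id, data)
        return data

    def write(self, block_id, data):
        # Write-through: the cluster is updated first, so the cache never
        # holds data that the cluster lacks.
        self.backend[block_id] = data
        self._insert(block_id, data)

    def _insert(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
```

Because every write reaches the cluster before it is cached, losing the local flash device costs only read performance, not data.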

Results: Law Firm

• Challenge:
  – Needed quick, reliable indexing and lookups across a massive set of 100 million active client legal docs
  – Traditional NAS could not meet the required access time
  – Standalone servers with flash performed well, but predictably ran out of space
• Solution/Result:
  – Hedvig software-defined storage with SSD/HDD and client-side flash caching
  – ~9x faster performance with flash
  – Scale-out architecture simplifies growth and expansion

Thank you!

For more on Hedvig visit:
• Web: hedviginc.com
• Twitter: @hedviginc