SPDK FTL & Marvell OCSSD for Noisy Neighbor Problem

David Recker, Marketing VP
Circuit Blvd., Inc.
April 2019

4/17/2019 © 2019 Circuit Blvd., Inc.


Who We Are


Sunnyvale, CA, U.S.A.

Industry: Enterprise/Cloud Database and Storage

Year Founded: 2017

Mission: We develop next gen database/storage systems leveraging expertise in memory semiconductor, solid-state storage system, and operating systems

Open Source Contributions:
• Linux LightNVM, OCSSD 2.0 specification, OpenSSD FPGA platform
• SPDK (since SPDK v17.10)
• RocksDB

OCSSD with SPDK FTL

• SPDK FTL on Marvell’s OCSSD Platform
  • We have been evaluating SPDK FTL on Marvell's SSD SoC platform since Jan ’19
  • SPDK FTL (Flash Translation Layer): The Flash Translation Layer library provides block device access on top of non-block SSDs implementing Open Channel interface. It handles the logical to physical address mapping, responds to the asynchronous media management events, and manages the defragmentation process*
  • We measured various performance metrics of the initial prototype and demonstrate how SPDK-driven OCSSDs can solve the noisy neighbor problem in multi-tenant environments
  • We share experimental data based on our current implementation (both SPDK FTL and Marvell’s controller are being continuously improved)

• (Demo) SPDK Driven OCSSD Comparison (Isolation vs Non-Isolation)
  • Demo table outside (please feel free to drop by for further questions)


* SPDK FTL definition: https://spdk.io/doc/ftl.html
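
The two duties named in the FTL definition (logical to physical mapping, and tracking which data is still valid so that defragmentation can reclaim space) can be illustrated with a toy model. This is a minimal sketch and not SPDK's implementation: the class `ToyFtl`, its method names, and the append-only placement policy are all hypothetical and chosen only for illustration.

```python
"""Toy flash translation layer: illustrative only, not SPDK's FTL."""


class ToyFtl:
    def __init__(self, num_chunks: int, blocks_per_chunk: int) -> None:
        self.blocks_per_chunk = blocks_per_chunk
        self.num_physical = num_chunks * blocks_per_chunk
        self.l2p = {}       # logical block -> physical block
        self.p2l = {}       # physical block -> logical block (valid data only)
        self.media = {}     # physical block -> payload
        self.write_ptr = 0  # next physical block to program (append-only)

    def write(self, lba: int, data: bytes) -> int:
        """Place data at the write pointer and remap the logical block."""
        if self.write_ptr >= self.num_physical:
            raise RuntimeError("no free blocks; a real FTL would defragment first")
        old = self.l2p.get(lba)
        if old is not None:
            del self.p2l[old]  # the previous copy becomes invalid
        ppa = self.write_ptr
        self.write_ptr += 1
        self.media[ppa] = data
        self.l2p[lba] = ppa
        self.p2l[ppa] = lba
        return ppa

    def read(self, lba: int) -> bytes:
        """Look up the current physical location and return its payload."""
        return self.media[self.l2p[lba]]

    def valid_blocks(self, chunk: int) -> int:
        """Count blocks in a chunk that still hold live data."""
        start = chunk * self.blocks_per_chunk
        end = start + self.blocks_per_chunk
        return sum(1 for ppa in range(start, end) if ppa in self.p2l)
```

Overwriting a logical block leaves a stale physical copy behind; a chunk with few valid blocks is the natural candidate for defragmentation, since little data has to be moved before the chunk can be reset.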

Hardware Setup

• SuperMicro X11DPG
  • 2 * Xeon Scalable Gold 6126 2.6 GHz (12 cores)
    • hyperthreading disabled
  • 8 * 32 GB DIMM 2666 MT/s

• 2 * OCSSD 2.0
  • Marvell 88SS1098 controller
  • PCIe Gen3x4 slot to each CPU package
  • nvme id-ns
    • LBADS=12 (4KiB), MS=0
  • ocssd geometry
    • 8 grp (3), 8 pu (3), 1478 chk (11), 6144 lbk (13)
    • () means bit length in LBAF
    • ws_opt=24 (96KiB)
  • 3D TLC NAND
    • write unit: 96KiB (one shot program)
    • read unit: 32KiB


[Figure: host topology showing CPU1, CPU2, OCSSD1, and OCSSD2]
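
The geometry figures above imply the chunk size, the optimal write size in bytes, and the raw address-space capacity of each drive. The following is a small sketch of that arithmetic; the `Geometry` class and its field names are illustrative and not taken from any library, and the result is raw addressable space, not usable capacity (bad chunks and over-provisioning are not accounted for).

```python
from dataclasses import dataclass

KIB = 1024


@dataclass(frozen=True)
class Geometry:
    num_grp: int = 8            # groups
    num_pu: int = 8             # parallel units per group
    num_chk: int = 1478         # chunks per parallel unit
    num_lbk: int = 6144         # logical blocks per chunk
    lbk_bytes: int = 4 * KIB    # LBADS=12 -> 2**12 bytes
    ws_opt: int = 24            # optimal write size, in logical blocks

    @property
    def total_pus(self) -> int:
        return self.num_grp * self.num_pu

    @property
    def chunk_bytes(self) -> int:
        return self.num_lbk * self.lbk_bytes

    @property
    def ws_opt_bytes(self) -> int:
        return self.ws_opt * self.lbk_bytes

    @property
    def raw_bytes(self) -> int:
        return self.total_pus * self.num_chk * self.chunk_bytes


geo = Geometry()
print("parallel units:", geo.total_pus)
print("chunk size (MiB):", geo.chunk_bytes // (KIB * KIB))
print("ws_opt (KiB):", geo.ws_opt_bytes // KIB)
print("optimal writes per chunk:", geo.num_lbk // geo.ws_opt)
print("raw capacity (GiB):", geo.raw_bytes // KIB**3)
```

With the values from the slide this gives 64 parallel units, 24 MiB chunks, a 96 KiB optimal write (matching the one-shot program unit), 256 optimal writes per chunk, and 2217 GiB of raw address space per drive.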

OCSSD Geometry

[Figure: OCSSD geometry. The drive attaches over PCIe and exposes 8 groups (grp 0-7), 64 parallel units (pu 0-63), 1478 chunks per parallel unit (chk 0-1477), and 6144 logical blocks per chunk (lbk 0-6143, 4KiB each)]

• 3D TLC NAND, 2 plane blocks
  • 64 (layer) * 4 (wordline) * 3 (page) * 8 (lbk) = 6144
  • NAND op: tR < tPROG < tBERS

• OCSSD LBA (64 bits)

          grp   pu   chk   lbk
  bits     3     3    11    13
  ex)      7     1     1     5

• ws_opt=24
• rs_opt=8 (*)

• pu ranges for pblk & spdk ftl
  • (1) pu range = 0-15, 16-31, 32-47, 48-63
  • (2) pu range = 0-63

*: rs_opt (optimal read size) is not defined in OCSSD spec 2.0
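
The bit widths on this slide (grp 3, pu 3, chk 11, lbk 13) describe how an address is packed into the OCSSD LBA. The sketch below assumes the field order shown on the slide, with grp in the most significant position and lbk in the least significant bits; that ordering is an assumption here and should be checked against the device's reported LBA format. The function names are hypothetical.

```python
GRP_BITS, PU_BITS, CHK_BITS, LBK_BITS = 3, 3, 11, 13


def pack_lba(grp: int, pu: int, chk: int, lbk: int) -> int:
    """Pack (grp, pu, chk, lbk) into one integer, lbk in the low bits (assumed order)."""
    fields = (("grp", grp, GRP_BITS), ("pu", pu, PU_BITS),
              ("chk", chk, CHK_BITS), ("lbk", lbk, LBK_BITS))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return ((grp << (PU_BITS + CHK_BITS + LBK_BITS))
            | (pu << (CHK_BITS + LBK_BITS))
            | (chk << LBK_BITS)
            | lbk)


def unpack_lba(lba: int) -> tuple:
    """Inverse of pack_lba: return (grp, pu, chk, lbk)."""
    lbk = lba & ((1 << LBK_BITS) - 1)
    chk = (lba >> LBK_BITS) & ((1 << CHK_BITS) - 1)
    pu = (lba >> (LBK_BITS + CHK_BITS)) & ((1 << PU_BITS) - 1)
    grp = (lba >> (LBK_BITS + CHK_BITS + PU_BITS)) & ((1 << GRP_BITS) - 1)
    return grp, pu, chk, lbk


# The slide's example address: grp=7, pu=1, chk=1, lbk=5
print(hex(pack_lba(7, 1, 1, 5)))
```

Note that the field widths are larger than the valid ranges: 11 bits can hold 2048 chunk indices but only 1478 exist, and 13 bits can hold 8192 block indices but only 6144 exist, so the packed address space has holes that a host must not address.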

Software Setup

• linux 4.17 for pblk
  • with Marvell’s patches applied

• linux 5.0 for SPDK

• fio 3.13
  • isolcpus=6-11,18-23 for fio threads
  • cpus_allowed=6-11 or 18-23

• SPDK
  • master: 7b0579d (4/9/2019)

• Additional changes to SPDK master
  • Marvell specific patch (CircuitBlvd’s)
    • num_chk=1478 as posted on SPDK github issue
    • OCSSD identification quirk & edlp=0
    • vector reset (0x90) to DSM deallocate (0x09)
    • erase should be done in synchronous mode
    • vendor specific cmd to build chunk info
  • Optimal read size (rsopt) patch
    • will be posted to SPDK gerrithub
  • Cherry picks to avoid chk wptr error (Intel’s)
    • https://review.gerrithub.io/c/spdk/spdk/+/449068
    • https://review.gerrithub.io/c/spdk/spdk/+/450174
    • https://review.gerrithub.io/c/spdk/spdk/+/449239

Throughput Comparison

• single bdev on 64 PUs
• 2 * W (128k write T1Q64)
• 1 * R (128k read T1Q64)
• spdk ftl isn’t aware of TLC read unit as posted on SPDK Trello

[Figure: throughput (GiB/s, 0 to 3.5) over time (seconds, 0 to 4000) for pblk, spdk ftl, and spdk ftl-rsopt; phases: 1st write, 2nd write, read]

* pblk target created with op=20
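
The workload on this slide (two 128k sequential write passes followed by one 128k sequential read pass) can be expressed as fio invocations. The sketch below only builds the command lines; it reads "T1Q64" as one thread at queue depth 64, which is an interpretation of the slide's shorthand. The device path `/dev/example_bdev`, the `libaio` engine, and the helper name `fio_args` are placeholders, since the slides do not state which fio engine or target path was used.

```python
import shlex


def fio_args(name: str, rw: str, bs: str, threads: int, iodepth: int,
             cpus_allowed: str, filename: str) -> list:
    """Build an fio argument list for one pass (engine and path are assumptions)."""
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",
        f"--bs={bs}",
        f"--numjobs={threads}",
        f"--iodepth={iodepth}",
        f"--cpus_allowed={cpus_allowed}",
        "--ioengine=libaio",   # assumption: not stated on the slides
        "--direct=1",
        "--thread",
        "--group_reporting",
    ]


TARGET = "/dev/example_bdev"  # hypothetical target
PASSES = [
    fio_args("write-1", "write", "128k", 1, 64, "6-11", TARGET),
    fio_args("write-2", "write", "128k", 1, 64, "6-11", TARGET),
    fio_args("read-1", "read", "128k", 1, 64, "6-11", TARGET),
]

for args in PASSES:
    print(" ".join(shlex.quote(a) for a in args))
```

Pinning with `--cpus_allowed=6-11` follows the software setup, where those cores are isolated for fio threads; the second range (18-23) would be used for the drive attached to the other CPU package.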

Noisy Neighbor Problem Solved by OCSSD

• 4k randread T3Q64 & randwrite T1Q64 on four partitions
• each partition is pre-conditioned with 128k write
• not isolated: single bdev on 64 PUs
• isolated by 2 channels: four bdevs of 16 PUs each

[Figure: four panels comparing pblk and spdk ftl, not isolated vs. isolated; X: seconds (7100 to 7200), Y: K IOPS (0 to 250); series: 3 reads, 1 write]
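
The isolated configuration divides the 64 parallel units into four equal, non-overlapping ranges, one per bdev. A minimal sketch of that split follows; the function names are hypothetical, and how a linear PU index maps onto a (group, PU) pair is not stated on the slides, so the code deals only with the linear ranges.

```python
from typing import List, Tuple


def split_pu_ranges(total_pus: int, num_bdevs: int) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) PU ranges, one per bdev."""
    if num_bdevs <= 0 or total_pus % num_bdevs != 0:
        raise ValueError("total_pus must divide evenly among bdevs")
    size = total_pus // num_bdevs
    return [(i * size, (i + 1) * size - 1) for i in range(num_bdevs)]


def format_ranges(ranges: List[Tuple[int, int]]) -> List[str]:
    """Render ranges in the 'first-last' form used on the slides."""
    return [f"{first}-{last}" for first, last in ranges]


print("isolated:    ", format_ranges(split_pu_ranges(64, 4)))
print("not isolated:", format_ranges(split_pu_ranges(64, 1)))
```

Because the ranges do not overlap, a writer confined to one range cannot occupy the NAND dies that serve readers in the other ranges, which is the isolation property the comparison is meant to show.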

Contributions & Future Works

• OCSSD 2.0 API & FTL
  • https://github.com/spdk/spdk/commits?author=youngtack
  • https://github.com/spdk/spdk/commits?author=iClaire

• FTL issues
  • https://trello.com/c/Osol93ZU
  • https://github.com/spdk/spdk/issues/created_by/youngtack
  • https://github.com/spdk/spdk/issues/created_by/iClaire

• Future works
  • random IOPS bottleneck analysis
  • ANM analysis once Marvell firmware supports it
  • CPU affinity per FTL bdev analysis
  • PMDK and ZNS support of FTL bdev


Acknowledgement

• Wojciech Malikowski (Intel) – SPDK FTL

• Matias Bjørling (Western Digital) – QEMU NVMe, LightNVM PBLK

• Luan Ton-That (Marvell) - OCSSD firmware

• John Schadegg (Marvell) - OCSSD EVB


Open-Channel SSD Roadmap

• 2011: Jasmine OpenSSD (Indilinx SoC, SATA)
• 2014: Cosmos OpenSSD (FPGA w/ PCIe Gen 2)
• 2015: OCSSD Spec (LightNVM Architecture)
• 2018: OCSSD Projects (Alibaba OCSSD, Microsoft Denali)
• 2019: OCSSD w/ SPDK (Marvell SoC w/ SPDK FTL)
• 2020 ~: Cinabro™ Storage Appliance (SPDK FTL + PMDK, OCSSD / ZNS, Optane DIMM)

Cinabro™ Architecture

[Figure: SW Stack and Storage Appliance; components shown: App, OS, SPDK FTL / PMDK, OCSSD / ZNS, Optane DIMM; 20 ~ 30 SSDs]

Summary

• SPDK + OCSSD shows promise in alleviating the noisy neighbor problem.

• SPDK OCSSD Reference Platform Availability: 2H ’19

• For inquiries or more information:
  [email protected]
  www.circuitblvd.com

Marvell Confidential

Marvell Data Center & Enterprise Open Channel SSD Controller

SPDK 2019

Agenda


• Marvell 88SS1098 Datacenter NVMe SSD Controller

• Marvell OC Drive (Prototype)


88SS1098 - Marvell Datacenter NVMe SSD Controller

Feature                            88SS1098
Capacity                           8TB/8CH or 16TB/16CH (via 2x4GB/s MCI)
PCIe                               Gen 3x4, single and dual port
NVMe                               1.3, 64 VF; 64 IO queues, 256 commands
Virtualization                     64 VF
Metadata                           T10 / DIF / DIX
Program/Erase Suspend & Resume     Natively supported including out-of-order transfers
CPU                                Quad Cortex-R5 ARM
NAND I/F speed                     800 MT/s
Reliability                        Gen4 LDPC
SGL                                Yes
IO Determinism                     Yes
T10 E2E DIX                        Yes
Encryption                         AES-XTS

88SS1098 - Marvell Datacenter NVMe SSD Controller

                     88SS1098
128K Seq Write       2.73 GB/s
128K Seq Read        3.31 GB/s
4K Random Write      500 KIOPS
4K Random Read       650 KIOPS

NAND: Toshiba BICS3 TLC, NFIF: 533 MT/s
8 Channels, 64 dies

Marvell OC Drive (Prototype)


• Host: Linux PC with PCIe 3.0

• Drive: M.2 SSD, PCIe 3.0 x4

• Approach:

– Align with Linux open-source community and SPDK

– Evaluate open-channel SSD solution with prototype

• Targets:

– Support open-channel SSD interface v2.0

» In-house modification to support v2.0 read/write/erase operations

» Aligned with Linux upstream kernel 4.17, 4.18, 5.0

– Integrate with Marvell SSD controller and expose as a block device using the pblk path in LightNVM

» Multi pblk instances support

[Figure: Linux host (Ubuntu Linux PC) connected over a PCIe 3.0 x4 interface to the Marvell SSD device/drive; blocks shown in the drive: NVMe controller, OC SSD media FW, back end controller, NAND]

NVMe Command Support

Operation                 NVMe Command
Read                      Read Chunk
Write                     Write Chunk
Erase                     Reset Chunk (Free or Vacant)
Get Geometry              Geometry
Get Chunk Information     Get Log Page (Chunk Information)
Media Feedback            Get/Set Features (Media Feedback)

OC Prototype Performance

                     88SS1098 OC Drive Prototype
128K Seq Write       2.7 GB/s
128K Seq Read        2.3 GB/s
4K Random Write      594 KIOPS
4K Random Read       448 KIOPS

NAND: Toshiba BICS3 TLC, NFIF: 533 MT/s
8 Channels, 64 dies

We can achieve maximum possible chip performance with future product code.
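
To put the prototype numbers next to the conventional 88SS1098 figures quoted earlier in this deck, the ratio of each metric can be computed directly. This is plain arithmetic on the two tables; both list the same NAND configuration, but other test conditions are not stated, so the ratios are indicative only. The dictionary and function names are illustrative.

```python
# Figures copied from the two performance slides in this deck.
CONTROLLER = {
    "128K Seq Write (GB/s)": 2.73,
    "128K Seq Read (GB/s)": 3.31,
    "4K Random Write (KIOPS)": 500,
    "4K Random Read (KIOPS)": 650,
}
OC_PROTOTYPE = {
    "128K Seq Write (GB/s)": 2.7,
    "128K Seq Read (GB/s)": 2.3,
    "4K Random Write (KIOPS)": 594,
    "4K Random Read (KIOPS)": 448,
}


def relative(prototype: dict, reference: dict) -> dict:
    """Return prototype/reference for every metric in the reference table."""
    return {name: prototype[name] / reference[name] for name in reference}


for name, ratio in relative(OC_PROTOTYPE, CONTROLLER).items():
    print(f"{name:26s} {ratio:.2f}x")
```

On these figures the prototype is close to parity on sequential write, ahead on random write, and at roughly 70% of the conventional numbers on both read metrics.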

Planned Features for OCSSD


• Vector I/O and asynchronous erase

– High performance

• NAND error recovery

– Highly efficient error recovery algorithms for best QoS and drive life

– Reusable, compatible and tested with all major NAND vendors

• Meta support

– To store host LBA in NAND

• Performance tuning


Summary


• Marvell 88SS1098 controller is a perfect fit for both conventional enterprise and open channel SSD products

• Marvell has highly efficient FW components

– Unified HAL : Provides access and exercises all HW features

– Full featured media management and NAND error recovery

– FW for NVMe block and other IPs


Q & A

The information contained in this presentation is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided “AS IS”, without warranty of any kind, express or implied. This information is based on Marvell’s current product roadmap, which is subject to change by Marvell without notice. Marvell assumes no obligation to update or otherwise correct or revise this information. Marvell shall not be responsible for any direct, indirect, special, consequential or other damages arising out of the use of, or otherwise related to, this presentation or any other documentation even if Marvell is expressly advised of the possibility of such damages. Marvell makes no representations or warranties with respect to the contents of the presentation and assumes no responsibility for any inaccuracies, errors or omissions that may appear in this presentation.

6-May-19