Become a SSD expert in minutes! Ryan Smith [email protected] 408-205-8889


2 / ? YYYY.MM.DD / 홍길동 책임 / xxxxxx팀

What is an SSD?

• SSD = Solid State Drive
• RAM-based SSDs were introduced in the 1970s
• Flash-based versions followed in the 1990s
• Today, an SSD typically uses NAND Flash
• 2012 is a big year for SSDs
• Don’t complicate it: it’s just a really fast drive!

[Chart: # of SSDs sold, in millions (axis 0 to 35), for 2010, 2011, and 2012, by segment: PC, Server, Storage. Source: Samsung]


Why an SSD?

• Three things that dictate the speed of your PC/Server:

• CPU, DRAM, and HDD

Everything is speeding up, except the HDD.

Processor:

• Multi-core
• Higher bandwidth

Memory:

• Larger footprint
• Higher bandwidth

Storage:

• Minor throughput improvements
• Currently solved with spindles

[Chart: Performance vs. Time. Caption: Closing the gap with Solid State Storage]


Why an SSD?

• Lower response times (latency)
• Higher IOPS and throughput
• Lower power
• No RVI issues, more reliable

[Chart: Random Performance (IOPS), SM825 vs. 15K RPM HDD, for Read, Write, and 70:30 workloads. Data labels: X100, X60, X30; 11K, 23K, 43K. Test environment: Intel SR2600UR Server / IOMeter2008]

[Chart: Power Consumption (Watt), SM825 vs. 15K RPM HDD, Active and Idle. Data labels: -87%, -75%; 8.5, 12.6, 3.2, 1.1. Test environment: Intel SR2600UR Server / IOMeter2008 / 4KB RND R70:W30]

Source: Samsung


So what’s there to know about an SSD?

SSD Key Characteristics

SSD Components

NAND Characteristics

P/E Cycles

WAF

TBW

SMART

Host Interface

Sustained vs. Peak Performance

Benchmarking

SSD Influencers

TRIM

Over-provisioning

Changing Workload


SSD Key Characteristics


SSD Components

• Host/NAND Controller
• Firmware
• NAND Flash
• DRAM
• Capacitors (optional)

All components work closely together.

[Figure: SSD board with labels for Host Interface, Controller, Firmware, DRAM, and NAND Flash. Image source: Anandtech]


NAND Characteristics

• Types of NAND

• TLC

• MLC

• E-MLC

• SLC

Geometry / Lithography

• 4xnm, 3xnm, 2xnm

• Smaller = Less Cost

NAND Hierarchy

• Pages: Smallest unit that can be read/written (e.g., 8KB)

• Erase block: Groups of pages (e.g., 64 pages @ 8KB = 512KB)

P/E cycles and data retention by NAND type, from PC to Enterprise:

• TLC: 500-1K P/E cycles, 1 year retention
• MLC: 3-5K P/E cycles, 1 year retention
• E-MLC: 10-30K P/E cycles, 3 month retention
• SLC: 90-100K P/E cycles, 3 month to 1 year retention


P/E Cycles

Program / Erase Cycles: the number of times a given NAND cell can be programmed and erased.

• As geometries shrink, error correction must get better
• It’s like a car warranty!
• 3 years or 50,000 miles
• 3 years or 3,000 P/E cycles

Not a useful characteristic by itself.

[Chart: ECC requirements across 3xnm, 2xnm, and 2ynm geometries]


Write Amplification Factor (WAF)

Write Amplification Factor: bytes written to NAND versus bytes written from the PC/server.

WAF = (Bytes written to NAND) / (Bytes written from Host)

• WAF 1 means 1MB from the host writes 1MB to NAND
• WAF 5 means 1MB from the host writes 5MB to NAND

• Factors that can affect WAF:

• Controller
• Flash Translation Layer (FTL)
• Wear Leveling
• Over-provisioning
• Garbage Collection
• Host Application Write Profile (Random vs. Sequential)
• Free user space / TRIM
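The WAF ratio can be sketched in a few lines of Python. The function name `write_amplification` and its parameters are illustrative and not part of any SSD tool; the two calls mirror the WAF 1 and WAF 5 cases on this slide.

```python
def write_amplification(nand_bytes_written: float, host_bytes_written: float) -> float:
    """Return WAF = bytes written to NAND / bytes written from the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


MB = 1_000_000  # illustrative unit; any consistent unit works

waf_one = write_amplification(1 * MB, 1 * MB)   # 1MB from host, 1MB to NAND
waf_five = write_amplification(5 * MB, 1 * MB)  # 1MB from host, 5MB to NAND
print(waf_one, waf_five)
```

Because WAF is a ratio, the counters can be in bytes, MB, or GB as long as both use the same unit.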


Write Amplification (WAF) Example

• The example below illustrates a WAF of 6

[Figure: Host and SSD (Cache and Flash) over time. A flash block holds pages A, B, C, D, E, F, with A at LBA 0; the host replaces A with Z, and the block ends up holding Z, B, C, D, E, F.]

• Host wants to update LBA 0
• No more free pages, so the entire block must be erased
• Read existing data to cache
• Erase block
• Write modified page and old pages back to flash

4KB from host, 24KB to NAND
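The read, erase, and write-back sequence in this example can be simulated with a toy model. This is only a sketch under stated assumptions: the 4KB page size is inferred from "4KB from Host", the six-page block comes from pages A to F in the figure, and the function `update_page_in_full_block` is hypothetical. Real controllers handle this in the FTL with far more machinery.

```python
PAGE_KB = 4  # assumed page size, inferred from "4KB from Host"


def update_page_in_full_block(block, index, new_data):
    """Update one page of a block that has no free pages.

    Returns (new_block, nand_kb_written).
    """
    cache = list(block)            # read existing data to cache
    cache[index] = new_data        # apply the host's update in cache
    flash = [None] * len(block)    # erase the entire block
    nand_kb = 0
    for i, data in enumerate(cache):
        flash[i] = data            # write modified page and old pages back
        nand_kb += PAGE_KB
    return flash, nand_kb


block = ["A", "B", "C", "D", "E", "F"]
new_block, nand_kb = update_page_in_full_block(block, 0, "Z")
host_kb = PAGE_KB                  # the host only sent one page
waf = nand_kb / host_kb
print(new_block, nand_kb, waf)
```

One 4KB host write turns into six 4KB NAND writes, which is where the WAF of 6 comes from.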


TBW

TeraBytes Written: the number of terabytes you can write to the drive over its useful life.

TBW = ((Capacity GB / 1000) x PE Cycles) / WAF

Examples:

• ((128GB / 1000) * 3000) / 5 = 76.8 TBW
• ((128GB / 1000) * 3000) / 2.5 = 153.6 TBW
• ((256GB / 1000) * 3000) / 5 = 153.6 TBW
• ((128GB / 1000) * 30000) / 5 = 768 TBW
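A short Python sketch of the TBW formula, applied to the four example configurations on this slide. The function name `tbw` and its argument names are illustrative.

```python
def tbw(capacity_gb: float, pe_cycles: float, waf: float) -> float:
    """TBW = ((capacity in GB / 1000) * P/E cycles) / WAF."""
    if waf <= 0:
        raise ValueError("waf must be positive")
    return (capacity_gb / 1000) * pe_cycles / waf


# (capacity GB, P/E cycles, WAF) taken from the slide's examples
examples = [(128, 3000, 5), (128, 3000, 2.5), (256, 3000, 5), (128, 30000, 5)]
results = [round(tbw(c, pe, w), 1) for c, pe, w in examples]
print(results)
```

Halving WAF, doubling capacity, or using NAND with 10x the P/E cycles each scale TBW proportionally, which is why the second and third examples give the same result.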


SMART

ID Attribute Name

5 Reallocated Sector Count

9 Power-on Hours

12 Power-on Count

177 Wear Leveling Count

179 Used Reserved Block Count

180 Unused Reserved Block Count

181 Program Fail Count

182 Erase Fail Count

187 Uncorrectable Error Count

195 ECC Error Count

199 CRC Error Count

241 Total LBA Written

• Look at health and various statistics
• Allows for predictable maintenance windows
• Calculate WAF, TBW

Host GB written = [ID241] / 2 / 1024 / 1024 (assuming 512-byte LBAs)

NAND GB written = [ID177] * Capacity GB

WAF = NAND GB / Host GB

Expected Life (yrs) = Warranty PE * ([ID9]/24/365) / [ID177]
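The SMART-based estimates can be sketched as a small Python helper. Several things here are assumptions: 512-byte LBAs for ID 241, the raw value of ID 177 read as the average number of P/E cycles consumed (as the slide's formulas imply; this varies by vendor), and every input value in the usage example, which is made up for illustration. The function `smart_estimates` is hypothetical and does not read a real drive.

```python
def smart_estimates(total_lbas_written, wear_leveling_count, power_on_hours,
                    capacity_gb, warranty_pe_cycles, lba_bytes=512):
    """Estimate host/NAND writes, WAF, and expected life from raw SMART values."""
    if total_lbas_written <= 0 or wear_leveling_count <= 0:
        raise ValueError("SMART counters must be positive to form the ratios")
    host_gb = total_lbas_written * lba_bytes / 1024**3   # ID 241
    nand_gb = wear_leveling_count * capacity_gb          # ID 177 x capacity
    years_in_service = power_on_hours / 24 / 365         # ID 9
    return {
        "host_gb": host_gb,
        "nand_gb": nand_gb,
        "waf": nand_gb / host_gb,
        "expected_life_years": warranty_pe_cycles * years_in_service / wear_leveling_count,
    }


# Made-up raw values for a hypothetical 128GB drive rated for 3,000 P/E cycles
est = smart_estimates(total_lbas_written=21_474_836_480, wear_leveling_count=100,
                      power_on_hours=8760, capacity_gb=128, warranty_pe_cycles=3000)
print(est)
```

With these sample inputs the drive has used 100 of its 3,000 rated cycles in one year of power-on time, so the estimate extrapolates to 30 years at the same write rate.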


Host Interface

• This is how you communicate with the SSD
• So many choices:

• SATA

• SAS

• PCIe (NVMe, SCSIe, SATAe, Proprietary)

Which is right for you?

• PC: SATA, PCIe
• Server: SATA, SAS, PCIe
• External Storage: SATA + SAS bridge, SAS


Sustained vs. Peak Performance

• There can be significant differences between sustained and peak performance
• Run an enterprise benchmark (e.g., SNIA RTP 2.0)
• Or even better, run your own workload (real or simulated)

[Chart: Random performance @ 4KB, in IOPS, for 4KB random R/W 100/0, 65/35, and 0/100 (NCQ=16). Drives: PM830 128GB (Value SSD), SM825 200GB (Mainstream SSD), Vendor “X” 160GB (Value SSD)]

[Chart: Sequential performance @ 1MB, in MB/s, for 1MB sequential R/W 100/0, 65/35, and 0/100 (NCQ=16). Drives: PM830 128GB, SM825 200GB, Vendor “X” 160GB]

Callouts on the charts:

• 99% below peak
• 94% below peak
• 95% below peak (shown twice)
• Over 10,000 IOPS!
• Samsung PM830 vs. Vendor “X”: 11x sustained random writes
• Samsung PM830 vs. Vendor “X”: 2x sustained sequential writes

Source: Samsung / SNIA RTP2.0 Benchmark

There is a BIG difference between “Value” and “Mainstream/Enterprise” SSDs when you have any degree of writes in your workload.


Benchmarking

Run a synthetic or actual workload and take measurements.

Benchmark: URL

SNIA RTP 2.0: http://www.snia.org/tech_activities/standards/curr_standards/pts

Iometer: http://sourceforge.net/projects/iometer/

ATTO Disk: http://www.attotech.com/products/product.php?sku=Disk_Benchmark

CrystalDiskMark: http://crystalmark.info/software/CrystalDiskMark/index-e.html

HD Tune Pro: http://www.hdtune.com/

AS SSD (SSD): http://alex-is.de/PHP/fusion/downloads.php?download_id=9

Anvil (SSD): http://thessdreview.com/latest-buzz/anvil-storage-utilities-releases-new-storage-and-ssd-benchmark/

Scripts: have multiple “dd” processes running with a best-guess workload, capturing timing/speeds

Real Workload: capture a trace during the real workload and play it back (ioapps, blktrace/btreplay)
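As a rough illustration of the "Scripts" approach, the Python sketch below times a sequential write and reports throughput. It is a stand-in for a dd-style script, not a replacement for the benchmarks listed above: the function `timed_sequential_write`, the 8MB total, and the 1MB block size are all arbitrary choices for illustration, and the temporary directory may not live on the drive you want to test, so point `path` at the device under test in real use.

```python
import os
import tempfile
import time


def timed_sequential_write(path, total_bytes, block_bytes):
    """Write total_bytes to path in block_bytes chunks; return throughput in MB/s."""
    buf = os.urandom(block_bytes)  # random data avoids trivially compressible writes
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            f.write(buf)
            written += block_bytes
        f.flush()
        os.fsync(f.fileno())       # wait for the data to leave the OS page cache
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1_000_000


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bench.bin")
    mbps = timed_sequential_write(target, total_bytes=8 * 1024 * 1024,
                                  block_bytes=1024 * 1024)
    size = os.path.getsize(target)

print(f"{mbps:.1f} MB/s for {size} bytes")
```

A run this short mostly measures peak behavior; to see sustained performance, the write volume needs to be much larger than in this sketch.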


SSD Reviewers

• Good SSD Review sites available..

SSD Influencers


TRIM

• Helps the SSD know which blocks aren’t used
• Widely supported standard: Windows, Mac OS X, Linux, hdparm
• Better sustained performance and extends TBW
• Without TRIM, the SSD only knows a block isn’t used once the same LBA is written to

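[Figure: two timelines comparing the host’s view and the SSD’s view as “Hi” is followed by “Bye”. Panel “No TRIM needed”: the new data is written to the same LBA (LBA 0). Panel “TRIM makes SSD aware”: the new data goes to a different LBA (LBA 1), and TRIM tells the SSD that LBA 0 is no longer used.]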
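The difference TRIM makes can be shown with a toy model of what the drive believes is valid data. This is a sketch only: `ToySSD` is a hypothetical class that tracks LBAs in a dictionary, and it ignores pages, blocks, and garbage collection entirely.

```python
class ToySSD:
    """Tracks which LBAs the drive believes hold valid data (not a real FTL)."""

    def __init__(self):
        self.valid = {}  # LBA -> data the SSD must keep preserving

    def write(self, lba, data):
        self.valid[lba] = data      # overwriting an LBA replaces its old data

    def trim(self, lba):
        self.valid.pop(lba, None)   # host declares this LBA unused


# Same LBA rewritten: no TRIM needed, the overwrite itself invalidates "Hi".
same_lba = ToySSD()
same_lba.write(0, "Hi")
same_lba.write(0, "Bye")

# File at LBA 0 deleted on the host only, new data lands on LBA 1, no TRIM sent.
no_trim = ToySSD()
no_trim.write(0, "Hi")
no_trim.write(1, "Bye")

# Same sequence, but the host sends TRIM for LBA 0 when the file is deleted.
with_trim = ToySSD()
with_trim.write(0, "Hi")
with_trim.trim(0)
with_trim.write(1, "Bye")

print(same_lba.valid, no_trim.valid, with_trim.valid)
```

In the no-TRIM case the drive keeps treating "Hi" as live data and has to carry it through later erase cycles, which is the extra work that raises WAF.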


Over-Provisioning

• Helps a few things:

• Improves Write Performance

• Reduces WAF, Increases TBW

[Figure: a 128GB SSD divided into User Area and Reserved O/P. With a 100GB user area, 28GB is reserved (28% O/P).]

Base-2 to base-10 conversion: 137,438,953,472 to 128,000,000,000 bytes (6.9%)

Sample 128GB SSD              120GB    100GB    Change
Over-Provisioning             7%       28%
Random Read (8K) IOPS         80K      80K
Random Write (8K) IOPS        1,800    6,300    3.5x
Sequential Read (64K) MB/s    500      500
Sequential Write (64K) MB/s   400      400
4KB Random WAF                5        1.35     -73%
4KB Random TBW                15       45       3x

These performance numbers are fictitious but do represent the actual benefits seen during tests.
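The over-provisioning percentages on this slide can be reproduced with a little arithmetic. In this sketch, the function name `over_provisioning_pct` is illustrative, and the choice to express O/P relative to the user area is inferred from the slide's own figures (28GB reserved on a 100GB user area is 28%).

```python
def over_provisioning_pct(physical_gb: float, user_gb: float) -> float:
    """Reserved space as a percentage of the user-visible area."""
    if user_gb <= 0:
        raise ValueError("user_gb must be positive")
    return (physical_gb - user_gb) / user_gb * 100


op_120 = round(over_provisioning_pct(128, 120))  # 128GB of NAND sold as 120GB
op_100 = round(over_provisioning_pct(128, 100))  # 128GB of NAND sold as 100GB

# Base-2 vs. base-10: 128 GiB of NAND exposed as 128 GB of user space
raw_bytes = 128 * 2**30
user_bytes = 128 * 10**9
hidden_pct = round((raw_bytes - user_bytes) / raw_bytes * 100, 1)

print(op_120, op_100, raw_bytes, hidden_pct)
```

The last figure is taken relative to the raw base-2 capacity, which is how the slide's 6.9% works out.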


Change Write Workload

• Write sequentially instead of randomly to reduce WAF
• If you have control of the I/O to the disk, this will pay off

Random vs. sequential, MLC 512GB SSD: 60 TBW vs. 1250 TBW (20x)

Change block alignment:

• Align your writes with the page boundaries (e.g., 8KB)
• If alignment is too hard to implement, just increase your I/O size

[Figure: host writes mapped onto 8K SSD pages. A write starting at LBA 0 or LBA 16 needs only 1 page; a write starting at LBA 8 needs 2 pages.]
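The alignment effect can be checked by counting how many NAND pages a write touches. This sketch assumes 512-byte LBAs, so that an 8KB page spans 16 LBAs as the LBA 0 / LBA 8 / LBA 16 labels suggest; the function `pages_touched` is illustrative.

```python
LBA_BYTES = 512    # assumed sector size
PAGE_BYTES = 8192  # 8KB page, as in the slide's example


def pages_touched(start_lba, length_bytes, page_bytes=PAGE_BYTES, lba_bytes=LBA_BYTES):
    """Return how many NAND pages a write of length_bytes at start_lba overlaps."""
    if length_bytes <= 0:
        raise ValueError("length_bytes must be positive")
    first_byte = start_lba * lba_bytes
    last_byte = first_byte + length_bytes - 1
    return last_byte // page_bytes - first_byte // page_bytes + 1


aligned = pages_touched(0, 8192)        # starts on a page boundary
misaligned = pages_touched(8, 8192)     # starts in the middle of a page
large_misaligned = pages_touched(8, 65536)  # 64KB write, same misalignment

print(aligned, misaligned, large_misaligned)
```

A misaligned 8KB write touches twice as many pages as an aligned one, while a misaligned 64KB write touches 9 pages for 8 pages of data, which is why a larger I/O size softens the penalty when alignment is hard to fix.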

Applications of SSDs


HDD Replacement

• Replace boot drive or main storage
• Fastest and easiest way to experience SSDs

[Figure: Server and Storage, each shown with an HDD and an SSD]


Caching Appliance

• Read and/or write cache
• Sits between servers and storage, typically in a SAN
• Used to speed up legacy or slower storage

[Figure: Servers, Cache (SSD), and Storage (HDD)]


Tiered Storage

• An external storage device (NAS, SAN)
• Only puts “hot” or “critical” data on SSD
• Most of the storage is still on HDD

[Figure: Servers and Storage, where the storage holds one SSD and multiple HDDs]


All Flash Storage

• External storage based on 100% SSD/Flash
• Typically uses MLC and de-duplication/compression to achieve better pricing
• Designers of these systems are Flash experts

[Figure: Servers and Storage, where the storage holds only SSDs]
