Differentiated Storage Services. Michael Mesnier, Jason Akers, Feng Chen (Intel Corporation); Tian Luo (The Ohio State University). 23rd ACM Symposium on Operating Systems Principles (SOSP), October 23-26, 2011, Cascais, Portugal.


Page 1: Differentiated Storage Services

Differentiated Storage Services

Michael Mesnier, Jason Akers, Feng Chen
Intel Corporation

Tian Luo
The Ohio State University

23rd ACM Symposium on Operating Systems Principles (SOSP)

October 23-26, 2011, Cascais, Portugal

Page 2: Differentiated Storage Services

An analogy: moving & shipping

Why should computer storage be any different?

Technology overview
– Classification
– Policy assignment
– Policy enforcement

Page 3: Differentiated Storage Services

Differentiated Storage Services

Classifier-to-policy table (offline)

Classifier     QoS policy
Metadata       Low latency
Boot files     Low latency
Small files    High throughput
Media files    High bandwidth
…              …

[Figure: architecture diagram. Computer system: applications or DB, file system, operating system, each with I/O classification. Storage system: management firmware, storage controller, QoS policies, QoS mechanisms, storage pools A, B, and C. Legend: highlighted components = current & future research.]

Classify each I/O in-band

Page 4: Differentiated Storage Services

The SCSI CDB

– 5 bits: 32 classes
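As a rough illustration of how a 5-bit class could ride inside a command, the sketch below packs a class into a 10-byte WRITE(10) CDB. It assumes the class travels in the GROUP NUMBER field (the low five bits of byte 6), which matches the "grp" values shown in the traces later in the deck. The function names `build_write10` and `cdb_class` are hypothetical and not taken from the authors' patch.

```c
#include <stdint.h>
#include <string.h>

#define WRITE_10_OPCODE 0x2A   /* SCSI WRITE(10) operation code */
#define GROUP_MASK      0x1F   /* 5 bits: classes 0..31 */

/* Fill a 10-byte WRITE(10) CDB and tag it with an I/O class.
 * Returns 0 on success, -1 if the class does not fit in 5 bits.
 * Hypothetical helper, for illustration only. */
int build_write10(uint8_t cdb[10], uint32_t lba, uint16_t nblocks,
                  unsigned io_class)
{
    if (io_class > GROUP_MASK)
        return -1;
    memset(cdb, 0, 10);
    cdb[0] = WRITE_10_OPCODE;
    cdb[2] = (uint8_t)(lba >> 24);              /* LBA, big-endian */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[6] = (uint8_t)(io_class & GROUP_MASK);  /* assumed home of the class */
    cdb[7] = (uint8_t)(nblocks >> 8);           /* transfer length, big-endian */
    cdb[8] = (uint8_t)nblocks;
    return 0;
}

/* Read the class back out on the storage side. */
unsigned cdb_class(const uint8_t cdb[10])
{
    return cdb[6] & GROUP_MASK;
}
```

Because the field is only five bits wide, any class above 31 is rejected rather than silently truncated.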

Page 6: Differentiated Storage Services

Filesystem prototypes (Ext3 & NTFS)

Classify each I/O in-band

Classifier       Cache priority
Metadata         0
Journal          0
Directories      0
Files <= 4KB     1
Files <= 16KB    2
Files <= 64KB    3
…                …
Files > 1GB      Lowest

[Figure: the architecture diagram from the overview slide (computer system and storage system), here labeled FS classification, FS policy assignment, and FS policy enforcement, with Disk and SSD labels. Legend: highlighted components = current & future research.]

Page 7: Differentiated Storage Services

Database prototype (PostgreSQL)

Classify each I/O in-band

Classifier                Cache priority
System tables             0
Temp. tables (on write)   1
Random tables             2
Temp. tables (on read)    3
Sequential tables         Bypass
Index files               Bypass

[Figure: the architecture diagram from the overview slide (computer system and storage system), here labeled DB classification, DB policy assignment, and DB policy enforcement, with Disk and SSD labels. Legend: highlighted components = current & future research.]

Page 8: Differentiated Storage Services

Selective cache algorithms

Selective allocation
– Always allocate high-priority classes
– E.g., FS metadata and DB system tables are always allocated
– Conditionally allocate low-priority classes
– Depends on cache pressure, cache contents, etc.
– High/low cutoff is a tunable parameter

Selective eviction
– Evict in priority order (lowest priority first)
– E.g., temporary DB tables are evicted before system tables
– Trivially implemented by managing one LRU per class
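The two policies above can be sketched in a few lines of C. This is a minimal sketch under stated assumptions, not the authors' implementation: cache pressure is modeled as a simple fill ratio, and the struct fields, `NUM_PRIORITIES`, `should_allocate`, and `pick_victim_priority` are all hypothetical names. Priority 0 is the highest, as in the classifier tables.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_PRIORITIES 13   /* hypothetical: priorities 0 (highest) .. 12 (lowest) */

struct cache_state {
    size_t used_blocks;                   /* blocks currently cached */
    size_t capacity_blocks;               /* total cache size in blocks */
    size_t per_priority[NUM_PRIORITIES];  /* blocks held in each priority's LRU list */
    int    cutoff;                        /* tunable: priorities <= cutoff count as high */
    double pressure_limit;                /* tunable: fill ratio treated as "under pressure" */
};

/* Selective allocation: high-priority classes always get cache space;
 * low-priority classes get it only while the cache is not under pressure.
 * Using the fill ratio as the pressure signal is an assumption. */
bool should_allocate(const struct cache_state *c, int priority)
{
    if (priority <= c->cutoff)
        return true;
    if (c->capacity_blocks == 0)
        return false;
    double fill = (double)c->used_blocks / (double)c->capacity_blocks;
    return fill < c->pressure_limit;
}

/* Selective eviction: choose the lowest-priority non-empty LRU list.
 * The caller would then evict that list's least recently used block.
 * Returns -1 if nothing is cached. */
int pick_victim_priority(const struct cache_state *c)
{
    for (int p = NUM_PRIORITIES - 1; p >= 0; p--)
        if (c->per_priority[p] > 0)
            return p;
    return -1;
}
```

Keeping one list per priority means eviction never has to scan individual blocks: finding a victim is a walk over at most `NUM_PRIORITIES` counters.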

Page 9: Differentiated Storage Services


Technology development

Page 10: Differentiated Storage Services

Ext3 prototype

OS changes (block layer)
– Add classifier to I/O requests
– Only coalesce like-class requests
– Copy classifier into SCSI CDB

Ext3 changes
– 18 classes identified
– Optimized for a file server: small files & metadata

A small kernel patch. A one-time change to the FS.

Ext3 class        Group number   Cache priority
Unclassified      0              12
Superblock        1              0
Group desc.       2              0
Bitmap            3              0
Inode             4              0
Indirect block    5              0
Directories       6              0
Journal           7              0
File <= 4KB       8              1
File <= 16KB      9              2
File <= 64KB      10             3
…                 …              …
File > 1GB        18             11
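The "only coalesce like-class requests" rule can be pictured as one extra condition in a merge check. The sketch below is a user-space illustration with a hypothetical `struct io_req`; it is not the Linux block-layer API and not the authors' patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified I/O request. */
struct io_req {
    uint64_t lba;       /* starting logical block address */
    uint32_t nblocks;   /* length in blocks */
    bool     is_write;  /* direction */
    uint8_t  io_class;  /* classifier attached by the file system */
};

/* Two requests may be coalesced only if b starts where a ends, both go
 * in the same direction, and both carry the same class. */
bool can_coalesce(const struct io_req *a, const struct io_req *b)
{
    return a->is_write == b->is_write &&
           a->io_class == b->io_class &&
           a->lba + a->nblocks == b->lba;
}

/* Append b to a. Returns false and leaves a untouched if not allowed. */
bool coalesce(struct io_req *a, const struct io_req *b)
{
    if (!can_coalesce(a, b))
        return false;
    a->nblocks += b->nblocks;
    return true;
}
```

Without the class check, a merged request would have to carry a single classifier for blocks of different classes, so part of the request would reach the storage system mislabeled.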

Page 11: Differentiated Storage Services

Ext3 classification illustrated

echo 'Hello, world!' >> foo; sync

– READ_10(lba 231495 len 8 grp 9) <=4KB
– WRITE_10(lba 231495 len 8 grp 9) <=4KB
– WRITE_10(lba 16519223 len 8 grp 8) Journal
– WRITE_10(lba 16519231 len 8 grp 8) Journal
– WRITE_10(lba 16519239 len 8 grp 8) Journal
– WRITE_10(lba 16519247 len 8 grp 8) Journal
– WRITE_10(lba 8279 len 8 grp 5) Inode

7 I/Os (28KB) to write 13 bytes
– Metadata accounts for most of the overhead

I/O classification shows read-modify-write and metadata updates.

NTFS classification is implemented with Windows filter drivers.
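The 28KB figure for the seven-request trace can be checked with a little arithmetic. The sketch below assumes that `len` counts 512-byte sectors (the slide does not state the unit); the type and function names are hypothetical.

```c
#include <stddef.h>

#define SECTOR_BYTES 512   /* assumption: len is counted in 512-byte sectors */

struct trace_entry {
    unsigned long lba;   /* starting block address */
    unsigned      len;   /* request length in sectors */
    unsigned      grp;   /* group number (class) from the CDB */
};

/* The seven requests from the slide, in order. */
const struct trace_entry hello_trace[] = {
    {   231495, 8, 9 },  /* READ_10,  file data */
    {   231495, 8, 9 },  /* WRITE_10, file data */
    { 16519223, 8, 8 },  /* WRITE_10, journal */
    { 16519231, 8, 8 },  /* WRITE_10, journal */
    { 16519239, 8, 8 },  /* WRITE_10, journal */
    { 16519247, 8, 8 },  /* WRITE_10, journal */
    {     8279, 8, 5 },  /* WRITE_10, inode */
};

/* Total bytes transferred by a trace. */
size_t trace_bytes(const struct trace_entry *t, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (size_t)t[i].len * SECTOR_BYTES;
    return total;
}

/* Bytes transferred by the requests carrying one group number. */
size_t trace_bytes_in_group(const struct trace_entry *t, size_t n, unsigned grp)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        if (t[i].grp == grp)
            total += (size_t)t[i].len * SECTOR_BYTES;
    return total;
}
```

Under that assumption the total is 7 × 8 × 512 = 28,672 bytes (28KB). The two file-data requests account for 8KB of it, and the journal and inode writes account for the remaining 20KB.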

Page 12: Differentiated Storage Services

PostgreSQL prototype

Classification API: scatter/gather I/O

OS changes (block layer)
– Add O_CLASSIFIED file flag
– Extract classifier from SG I/O

A small OS & DB patch. A one-time change to the OS & DB.

Preliminary DB classes

PostgreSQL class    Group number
Unclassified        0
Transaction log     19
System table        20
Free space map      21
Temporary table     22
Random table        23
Sequential table    24
Index file          25
Reserved            26-31

    #include <fcntl.h>      /* open(); O_CLASSIFIED exists only in the patched kernel */
    #include <sys/uio.h>    /* struct iovec, writev() */

    unsigned char class = 19;               /* transaction log */
    struct iovec myiov[2];
    int fd = open("foo", O_RDWR | O_CLASSIFIED, 0666);
    myiov[0].iov_base = &class;             /* first element carries the class */
    myiov[0].iov_len = 1;
    myiov[1].iov_base = "Hello, world!";    /* second element carries the data */
    myiov[1].iov_len = 13;
    writev(fd, myiov, 2);
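On the receiving side, "extract classifier from SG I/O" amounts to peeling the first element off the vector. The function below is a user-space sketch of that step under the convention shown in the slide's example (element 0 is a one-byte class, the rest is data). `extract_class` is a hypothetical name, and the real kernel change is not shown in the deck.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Split a classified scatter/gather write into its class and payload.
 * Assumed convention: iov[0] is a single byte holding the class and
 * iov[1..] hold the data. On success, stores the class in *io_class,
 * points *data at the payload elements, and returns how many there are.
 * Returns -1 if the vector does not follow the convention. */
int extract_class(const struct iovec *iov, int iovcnt,
                  uint8_t *io_class, const struct iovec **data)
{
    if (iov == NULL || iovcnt < 2 ||
        iov[0].iov_base == NULL || iov[0].iov_len != 1)
        return -1;

    uint8_t c = *(const uint8_t *)iov[0].iov_base;
    if (c > 31)                 /* only 5 bits are available for the class */
        return -1;

    *io_class = c;
    *data = &iov[1];
    return iovcnt - 1;
}
```

Passing the class as its own vector element leaves the data buffers untouched, so the application does not need to copy or re-pack what it writes.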

Page 13: Differentiated Storage Services

Cache implementations

Fully associative read/write LRU cache
– Insert(), Lookup(), Delete(), etc.
– Hash table maps disk LBA to SSD LBA
– Syncer daemon asynchronously cleans cache

– Monitors cache pressure for selective allocate
– Maintains multiple LRU lists for selective evict

Front-ends: iSCSI (OS independent) and Linux MD
– MD cache module (RAID-9)

Striping: mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde
Mirroring: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd /dev/sde
RAID-9: mdadm --create /dev/md0 --level=9 --raid-devices=2 <cache> <base>
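The disk-LBA-to-SSD-LBA hash table mentioned above could look roughly like the chained table below. This is a minimal sketch, not the authors' data structure: the table size, the hash function, and the `map_*` names are assumptions, and locking, dirty-block tracking, and LRU links are left out.

```c
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 1024   /* hypothetical table size (2^10) */

struct map_entry {
    uint64_t disk_lba;          /* key: block address on the base device */
    uint64_t ssd_lba;           /* value: block address in the cache */
    struct map_entry *next;     /* chain for colliding keys */
};

struct lba_map {
    struct map_entry *bucket[NBUCKETS];
};

/* Multiplicative hash; the top 10 bits select one of 1024 buckets. */
static size_t bucket_of(uint64_t disk_lba)
{
    return (size_t)((disk_lba * 0x9E3779B97F4A7C15ULL) >> 54);
}

/* Lookup: returns 1 and stores the SSD LBA on a hit, 0 on a miss. */
int map_lookup(const struct lba_map *m, uint64_t disk_lba, uint64_t *ssd_lba)
{
    const struct map_entry *e;
    for (e = m->bucket[bucket_of(disk_lba)]; e != NULL; e = e->next) {
        if (e->disk_lba == disk_lba) {
            *ssd_lba = e->ssd_lba;
            return 1;
        }
    }
    return 0;
}

/* Insert or update a mapping. Returns 0 on success, -1 if out of memory. */
int map_insert(struct lba_map *m, uint64_t disk_lba, uint64_t ssd_lba)
{
    size_t b = bucket_of(disk_lba);
    struct map_entry *e;
    for (e = m->bucket[b]; e != NULL; e = e->next) {
        if (e->disk_lba == disk_lba) {
            e->ssd_lba = ssd_lba;
            return 0;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->disk_lba = disk_lba;
    e->ssd_lba = ssd_lba;
    e->next = m->bucket[b];
    m->bucket[b] = e;
    return 0;
}

/* Delete: returns 1 if a mapping was removed, 0 if none existed. */
int map_delete(struct lba_map *m, uint64_t disk_lba)
{
    struct map_entry **pp = &m->bucket[bucket_of(disk_lba)];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->disk_lba == disk_lba) {
            struct map_entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

A map like this must start out zeroed (all buckets empty). Because any disk block can map to any SSD block, the cache stays fully associative and placement is decided entirely by the allocation and eviction policies.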

Page 14: Differentiated Storage Services


Evaluation

Page 15: Differentiated Storage Services

Experimental setup

Host OS (Xeon, 2-way, quad-core, 12GB RAM)
– Linux 2.6.34 (patched as described)

Target storage system
– HW RAID array + X25-E cache

Workloads and cache sizes
– SPECsfs: 18GB (10% of 184GB working set)
– TPC-H: 8GB (28% of 29GB working set)

Comparison
– LRU versus LRU-S (LRU with selective caching)

Page 16: Differentiated Storage Services

SPECsfs I/O breakdown

[Figure: two I/O breakdown charts, one for LRU and one for LRU-S.]

– LRU: large files pollute the cache (metadata and small files are evicted)
– LRU-S: fences off large file I/O

Page 17: Differentiated Storage Services

SPECsfs performance metrics

[Figure: four panels comparing LRU and LRU-S: syncer overhead, I/O throughput, hit rate, and running time. One panel also includes an HDD bar.]

1.8x speedup

Page 18: Differentiated Storage Services

SPECsfs file latencies

[Figure: two panels comparing LRU and LRU-S: reduction in write latency over HDD and reduction in read latency over HDD.]

– LRU suffers from write outliers (from eviction overheads)
– LRU-S reduces read latency (most small files are cached)

Page 19: Differentiated Storage Services

TPC-H I/O breakdown

[Figure: two I/O breakdown charts, one for LRU and one for LRU-S.]

– LRU: indexes pollute the cache (user tables are evicted)
– LRU-S: fences off index files

Page 20: Differentiated Storage Services

TPC-H performance metrics

[Figure: four panels comparing LRU and LRU-S: syncer overhead, I/O throughput, hit rate, and running time. One panel also includes an HDD bar.]

1.2x speedup

Page 21: Differentiated Storage Services


Conclusion & future work

Intelligent caching is just the beginning
– Other types of performance differentiation
– Security, reliability, retention, …

Other applications we're looking at
– Databases
– Hypervisors
– Cloud storage
– Big Data (NoSQL DB)

Work already underway in T10

Open source coming soon…

Thank you!

Questions?