27
LIO AND TCMU 1 LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover <[email protected]> @iamagrover March 11, 2015

LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover @iamagrover March

  • Upload
    others

  • View
    43

  • Download
    0

Embed Size (px)

Citation preview

Page 1: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU1

LIO and the TCMU Userspace Passthrough:The Best of Both Worlds

Andy Grover <[email protected]>@iamagroverMarch 11, 2015

Page 2: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU2

What is LIO?

Multi-protocol in-kernel SCSI target

Page 3: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU3

Multi-protocol in-kernel SCSI target

Unlike other targets like IET, tgt, and SCST, LIO is entirely kernel code.

Page 4: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU4

iSCSI commands

User

Kernel

disk.img

/dev/vg0/vol0

Configuration

Fabrics

tcm_fciscsi

/dev/sdc

block

file

pscsi

rtslib

targetcli

LIO core

/sys/kernel/config/target

Backstores

Page 5: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU5

Why add userspace command handling?

● Enable wider variety of backstores without kernel code

● Clustered network storage filesystems like Ceph, GlusterFS, & other things that have shared libraries available

● File formats beyond .img, such as qcow2 & vmdk for more interesting features & compatibility

● SCSI devices beyond mass storage● Enable experimentation just like FUSE did for

filesystems

Page 6: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU6

Userspace handling challenges: perf

● I/O latency● I/O throughput● Parallelism within a dev (OoO cmd completion)● Parallelism across devs, we're good.

Page 7: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU7

Userspace handling challenges: usability

● Configuration as simple as existing backstores● Userspace daemon failure● Userspace daemon activation/restart● Balance ultimate flexibility with common use● Avoid “multiple personalities”● Reasonable resource usage

Page 8: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU8

iSCSI commands

User

Kernel

disk.img

/dev/vg0/vol0

Configuration

Fabrics

tcm_fciscsi

/dev/sdc

block

file

pscsi

rtslib

targetcli

LIO core

/sys/kernel/config/targettcm-user

uio0 uio1

cmd handling daemon

SCSI Command processing

Backstores

NEW!NEW!

Page 9: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU9

tcm-user

cmd handling daemon

/dev/uio0 /sys/class/uio/uio0

uio0

● read to get configuration

● read (wait for cmd)● write (cmds done)● mmap (get SMR)

from LIO core

Shared Memory Region Layout(not to scale)

MailboxMailbox

Command RingCommand Ring

Data AreaData Area

cmdr_off

cmdr_size

cmd_head

cmd_tail

cmd_entry

...

opcode iovec[]

in/out data

User/Kernel Communication

...

dev add/remove

(netlink)

Page 10: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU10

tcm-user merged in 3.18

● Initial patch as simple as possible● Performance tuning not done, BUT an interface

flexible enough to not block likely perf optimization strategies

● Acceptance enabled phase 2: userspace usability pieces

Page 11: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU11

Performance Opportunities for Later

● Larger vmalloc()ed shared mem region● -> Demand-allocate pages to avoid bloat

● Block size == PAGE_SIZE preferred

● Complete commands out of order● Must handle data area fragmentation

● Fabrics copy into already-mapped pages● Just moves the cache misses?● Fabrics don't have per-lun device queues

anyway. Just IB reimplemented poorly?● Userspace busywait on ring cmd_head

Page 12: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU12

Our user-kernel API ended up flexible, but fraught with danger!

● Ring operations easy to mess up● Make every handler write daemon boilerplate?

● And support Netlink?

● And maybe D-Bus???

Page 13: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU13

tcmu-runner: A standard handler daemon

● Handle the messy ring bits● Expose a C plugin API● Implement library routines for common handler

code, e.g. mandatory SCSI commands● Permissively licensed: Apache 2.0

● Doubles as sample code for do-it-yourself-ers● Needed a prototype daemon in any case

Page 14: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU14

iSCSI commands

User

Kernel

disk.img

/dev/vg0/vol0

Configuration

Fabrics

tcm_fciscsi

/dev/sdc

block

file

pscsi

rtslib

targetcli

LIO core

/sys/kernel/config/targettcm-user

uio0 uio1

Backstores

tcmu-runner

tape vmdk mmc smc glfs

NEW!NEW!

Page 15: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU15

Handlers Need Config Info

● All configuration should still be through standard LIO mechanisms (i.e. targetcli)

● User backstore create includes a URL-like “configstring” that gives handler and per-device handler-specific stuf

● This is published in uio sysfs, and netlink add_device message

● Also has info to allow going from uio dev back to matching LIO backstore

● Handler needs attribs, block size, etc.

Page 16: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU16

Config tools need Info on Handlers Too!

● Users should call “backstores/foo create x y z” not “backstores/user <configstring>”

● targetcli needs param and help strings for xyz● Verify backstore params are correct before

creating the device● Must be loosely coupled – no hard dependencies

on targetcli or tcmu-handler● Solution: D-Bus!

Page 17: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU17

iSCSI commands

User

Kernel

disk.img

/dev/vg0/vol0

Configuration

Fabrics

tcm_fciscsi

/dev/sdc

block

file

pscsi

rtslib

targetcli

LIO core

/sys/kernel/config/targettcm-user

uio0 uio1

Backstores

tcmu-runner

tape vmdk mmc smc glfsNEW!NEW!

DBus

Page 18: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU18

The QEMU Question

● QEMU has great support for many image formats and other backstores

● Can we reuse or integrate somehow?● How?

● Build qemu handler code separately and integrate as a tcmu-runner handler?

● Extend qemu to implement TCMU directly, and possibly also enable it to configure LIO exports?

● Deferred for now

Page 19: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU19

Getting Involved

● Give feedback!● [email protected], [email protected]

● Check out tcmu-runner: https://github.com/agrover/tcmu-runner and its included sample handlers.

● Use github PRs and issue tracking.● Much help needed, esp. QEMU hackers!

● Start doing some TCMU performance benchmarking● Start thinking of interesting device types, userspace

libraries to use, or weird things to do with a SCSI command sandbox

Page 20: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU20

Thanks!Thanks!Questions?Questions?

Page 21: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU21

ISCSI commands

User

Kernel

disk.img

/dev/vg0/vol0Fabrics

tcm_fciscsi

/dev/sdc

file

pscsi

LIO core

/sys/kernel/config/targettcm-user

tcmu-runner

block

rtslib

targetcli targetd

liblvm

lsmcli (libstoragemgmt) Remote

Local

NEW!NEW!

Page 22: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU22

tcm-user

tcmu-runner

/dev/uio0 /sys/class/uio/uio0

uio0

● read to get configuration

● read (wait for cmd)● write (cmds done)● mmap (get SMR)

from LIO core

Shared Memory Region Layout(not to scale)

MailboxMailbox

Command RingCommand Ring

Data AreaData Area

cmdr_off

cmdr_size

cmd_head

cmd_tail

cmd_entry

...

opcode iovec[]

in/out data

User/Kernel Communication

...

Page 23: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU23

SCSIA set of standards for physically connecting and

transferring data between computers and peripheral devices*

* http://en.wikipedia.org/wiki/SCSI

Page 24: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU24

SCSI targetInitiator sends commands,

Target handles them

Page 25: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU25

Multi-protocol SCSI targetSCSI commands & data can be sent over many

types of physical links and protocols. e.g:

● SCSI Parallel Interface (original)● iSCSI (over TCP/IP)● SAS (over SATA cables)● Fibre Channel (over FCP)● FCoE (over Ethernet)● SRP (over Infiniband)● SBP-2 (over Firewire)

Page 26: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU26

tgtd

iSCSI commandsfrom initiator

User

KernelBacked by local filesor block devices

libglfs

librbdNEW!NEW!

disk.img /dev/vg0/vol0

tgtadmConfiguration

Page 27: LIO and the TCMU Userspace Passthrough: The Best of Both ...LIO and the TCMU Userspace Passthrough: The Best of Both Worlds Andy Grover  @iamagrover March

LIO AND TCMU27

ISCSI commands

User

Kernel

disk.img

/dev/vg0/vol0Fabrics

tcm_fciscsi

/dev/sdc

file

pscsi

LIO core

/sys/kernel/config/targettcm-user

tcmu-runner

librbd libglfs qemu-lio-tcmu

block

qcow2 vmdk vdi rbd* glfs*

others

Coming

Soon!Coming

Soon!