Ceph BlueStore: a new storage type in Ceph / Maxim Vorontsov,...

Ceph new store: BlueStore

Maxim Vorontsov

About me

● Chief engineer for computing systems
● 8 years of working with Linux
● WAS/DB2/MQ and all that
● Many different projects

About RedSys

● Business integrator
● In business for more than 20 years
● Offices in MOW, LED, OVB, GOJ, ROV, KHV
● RED = Responsibility + Efficiency + Development
● Industries: fuel and energy, defense, government, telecom, etc.

Customers

TOC

● Before Ceph
● Ceph first advent
● Ceph temptations
● BlueStore prophecy
● Ceph FileStore vs BlueStore
● Let's fight
● Results
● Awaiting Ceph second advent

Software Defined Storage

● Unlimited scalability
● Storage virtualization
● Policy-driven administration
● API services
● Support for block, file and object data types

IBM definition

«SDS in today's business context refers to IT storage that goes beyond typical array interfaces (for example, command line and graphic user) to operate within a higher architectural construct.»

Examples

● AWS S3
● EMC ScaleIO
● Ceph
● GlusterFS
● Huawei FusionStorage
● IBM ElasticStorage
● NexentaStor

Issue

● DB2 on z/OS
● XML in DB2
● Signed XML in DB2 (no way)
● You really shouldn't store blobs in relational store

To find a way

● More money to IBM?
● More money to someone else?
● Something else?

Which one?

● AWS S3
● Ceph
● IBM ElasticStorage
● Huawei OceanStor
● Swift

Why this one?

Standing on the shoulders of giants

● CERN
● Cisco
● Deutsche Telekom
● Yahoo
● Cloudmouse.ru
● ...

Preborn

7 guests in VMware:
● 1 MON
● 3 OSD
● 1 ActiveMQ
● 1 Tomcat
● 1 ElasticStorage

Long story short

Long long story about…

What is English for «импортозамещение»?

Catch up and overtake z/OS

What is Russian for LTFS?

What is Russian for WORM?

BlueStore prophecy

Ceph Jewel Preview: a new store is coming, BlueStore

Ceph scheme

OSD scheme

FileStore scheme

BlueStore scheme

BlueStore advanced scheme

Mount directory structure

$ ls -R /var/lib/ceph/osd/ceph-0 | wc -l

FileStore    BlueStore
18656        16
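The counts above come from `ls -R`, which also counts the per-directory header lines and blank separators, so they slightly overstate the number of files. A minimal sketch of a stricter count, assuming a `find` that supports `-mindepth` (GNU and BSD do); `count_entries` is a hypothetical helper name, not a Ceph tool:

```shell
# count_entries DIR: print the number of files and directories below DIR.
# Unlike `ls -R DIR | wc -l`, this skips "dir:" headers and blank lines.
count_entries() {
    find "$1" -mindepth 1 | wc -l | tr -d ' '
}

# Example (path taken from the slide; adjust the OSD id as needed):
# count_entries /var/lib/ceph/osd/ceph-0
```

Either way of counting shows the same contrast between the two backends: thousands of entries in the FileStore mount versus a handful in the BlueStore one.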

HW test

$ sudo dd bs=1G count=1 oflag=direct \
    if=/dev/zero of=zerofile
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB) copied, 10,275 s, 105 MB/s
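The throughput that `dd` reports is just bytes divided by elapsed time. A small sketch of that arithmetic, assuming `awk` is available; `throughput_mb_s` is a hypothetical helper, and it accepts the decimal comma that the localized `dd` output uses:

```shell
# throughput_mb_s BYTES DURATION: print decimal megabytes per second.
# DURATION is in seconds and may use a decimal comma, as in the dd output.
throughput_mb_s() {
    secs=$(printf '%s' "$2" | tr ',' '.')
    LC_ALL=C awk -v b="$1" -v s="$secs" \
        'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mb_s 1073741824 10,275    # prints 104.5
```

The result, 104.5 MB/s, matches the 105 MB/s that `dd` prints after rounding.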

HW test

$ iperf3 -c osd00
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval        Transfer     Bandwidth       Retr
[ 4]  0.00-10.00 sec  7.40 GBytes  6.35 Gbits/sec  3278  sender
[ 4]  0.00-10.00 sec  7.39 GBytes  6.35 Gbits/sec        receiver

$ iperf3 -c osd00-ci
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval        Transfer     Bandwidth       Retr
[ 4]  0.00-10.00 sec  15.5 GBytes  13.3 Gbits/sec  64    sender
[ 4]  0.00-10.00 sec  15.5 GBytes  13.3 Gbits/sec        receiver
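To put the network figures next to the disk figure from `dd`, it helps to convert Gbit/s to MB/s. A rough sketch, assuming decimal units (1 Gbit = 1000 Mbit) and ignoring protocol overhead; `gbit_to_mb_s` is a hypothetical helper:

```shell
# gbit_to_mb_s GBITS: convert Gbit/s to decimal MB/s (1 byte = 8 bits).
gbit_to_mb_s() {
    LC_ALL=C awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1000 / 8 }'
}

gbit_to_mb_s 6.35    # link measured via osd00
gbit_to_mb_s 13.3    # link measured via osd00-ci
```

On these numbers alone, even the slower link carries several times the roughly 105 MB/s a single disk sustained, which suggests the disks rather than the network would saturate first in this setup.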

Ceph tests

$ ceph osd pool create radosbench 64
$ rados bench -p radosbench 300 write \
    --no-cleanup
$ rados bench -p radosbench 300 seq
$ rados bench -p radosbench 300 rand
$ rbd create fio_test --size 10G
$ fio rbd.fio
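The slides do not show the contents of `rbd.fio`. The following is only a sketch of what such a job file could look like, using fio's `rbd` ioengine options (`clientname`, `pool`, `rbdname`). The pool name `rbd`, the `admin` user, and the 4k random-write workload are assumptions, not values from the talk; `write_rbd_fio_job` is a hypothetical helper:

```shell
# write_rbd_fio_job FILE: write an example fio job for the fio_test image.
# Assumed values: the default "rbd" pool, the "admin" cephx user, and an
# arbitrary 4k random-write workload; none of these come from the talk.
write_rbd_fio_job() {
    cat > "$1" <<'EOF'
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k
runtime=300
time_based

[rbd-iodepth32]
iodepth=32
EOF
}

# Usage (run fio only against a test cluster):
# write_rbd_fio_job rbd.fio && fio rbd.fio
```

The 300-second runtime mirrors the duration used for `rados bench` above so the runs are roughly comparable.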

Results

Results

Not so fast

$ ceph-disk prepare --bluestore /dev/sdd /dev/sdb
$ ls /dev/disk/by-partlabel/ -l
osd-device-2-block -> ../../sdb2
osd-device-2-data -> ../../sdd1

Not so fast

$ ceph-disk prepare --bluestore /dev/sdd /dev/sdb
$ ls /dev/disk/by-partlabel/ -l
ceph%20data -> ../../sdb1
ceph%20block -> ../../sdb2

Not so fast

Here be dragons

Tech preview

CPU regression on too fast disks ;-)

Did you do a backup today?

How to reach me

Mail + hangouts: 6012030@gmail.com

mail: maxim.vorontsov@redsys.ru

http://redsys.ru

● https://www.redbooks.ibm.com/abstracts/redp5121.html
● http://www.sersc.org/journals/IJMUE/vol10_no11_2015/27.pdf
● http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf
● https://ceph.com
● https://www.sebastien-han.fr/blog/
● https://cds.cern.ch/record/2015206/files/CephScaleTestMarch2015.pdf
● http://rocksdb.org
