Ceph Object Storage at Spreadshirt June 2015 Jens Hadlich Chief Architect




Page 1: Ceph Object Storage at Spreadshirt

Ceph Object Storage at Spreadshirt

June 2015

Jens Hadlich Chief Architect

Page 2: Ceph Object Storage at Spreadshirt

About Spreadshirt


Spread it with Spreadshirt

A global e-commerce platform for everyone to create, sell and buy ideas on clothing and accessories across many points of sale.

•  12 languages, 11 currencies
•  19 markets
•  150+ shipping regions
•  Community of >70,000 active sellers
•  €72M revenue (2014)
•  >3.3M items shipped (2014)

Page 3: Ceph Object Storage at Spreadshirt

Object Storage at Spreadshirt

•  What?
   –  Store and read primarily user-generated content, mostly images
•  Typical sizes:
   –  a few dozen KB to a few MB
•  Some tens of terabytes (TB) of data
•  Read > write
•  "Never change a running system"?
   –  The current solution from the early days (big storage + lots of files / directories) doesn't work anymore
      •  Regular UNIX tools become unusable in practice
      •  Not designed for "the cloud" (e.g. replication is an issue)
   –  Growing number of users → more content
   –  Build a truly global platform (multiple regions and data centers)


Page 4: Ceph Object Storage at Spreadshirt

Ceph

•  Why Ceph?
   –  Vendor independent
   –  Open source
   –  Runs on commodity hardware
   –  Local installation for minimal latency
   –  Existing knowledge and experience
   –  S3 API
      •  Simple bucket-to-bucket replication
   –  A good fit also for < 1 petabyte
   –  Easy to add more storage
   –  (Can be used later for block storage)


Page 5: Ceph Object Storage at Spreadshirt

Ceph Object Storage Architecture


Overview

[Architecture diagram: a client talks HTTP (S3 or Swift API) to the Ceph Object Gateway, which sits in front of RADOS (reliable autonomic distributed object store): 3 monitors plus a lot of OSD nodes and disks, connected via a public network and a separate cluster network.]

Page 6: Ceph Object Storage at Spreadshirt

Ceph Object Storage Architecture


A little more detailed

[Architecture diagram: the client talks HTTP (S3 or Swift API) to RadosGW (the Ceph Object Gateway), which uses librados to reach RADOS (reliable autonomic distributed object store). An odd number of monitors, here 3, for quorum. Each OSD node has some SSDs (for journals) and more HDDs as JBOD (no RAID). Public network: 1G; cluster network: 10G (the more the better).]

Page 7: Ceph Object Storage at Spreadshirt

Ceph Object Storage Architecture


Initial Setup (planned)

[Architecture diagram: clients reach the cluster over HTTP (S3 or Swift API) through HAProxy on the public network (2 x 1G, IPv4). A RadosGW instance runs on each of the 5 cluster nodes (2 x 10G, IPv6); 3 of the nodes also run monitors. Each cluster node has 3 x SSD (journal / index) and 9 x HDD (data). OSD replication uses a dedicated cluster network (2 x 10G, IPv6).]
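A minimal HAProxy configuration for the setup above might look as follows. This is a sketch, not the actual Spreadshirt config: the server addresses (TEST-NET) and the RadosGW civetweb port 7480 are assumptions.

```
frontend s3
    bind :80
    mode http
    default_backend radosgw

backend radosgw
    mode http
    balance roundrobin
    option httpchk GET /
    server rgw1 192.0.2.11:7480 check
    server rgw2 192.0.2.12:7480 check
    server rgw3 192.0.2.13:7480 check
    server rgw4 192.0.2.14:7480 check
    server rgw5 192.0.2.15:7480 check
```

Round-robin spreads requests evenly across the gateway instances, and the health check takes a failed RadosGW out of rotation automatically.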

Page 8: Ceph Object Storage at Spreadshirt

Ceph Object Storage Performance


Some smoke tests

•  How fast is RadosGW? Get an impression.
   –  Response times (read / write)
      •  Average?
      •  Percentiles (P99)?
   –  Compared to AWS S3?
•  A very minimalistic test setup
   –  3 VMs (KVM), all running RadosGW, a monitor and an OSD
      •  2 cores, 4 GB RAM, 1 OSD each (15 GB + 5 GB), 10G network between nodes, HAProxy (round-robin), LAN, HTTP
   –  No further optimizations

Page 9: Ceph Object Storage at Spreadshirt

Ceph Object Storage Performance


Some smoke tests

•  How fast is RadosGW?
   –  Random read and write
   –  Object size: 4 KB
•  Results: pretty promising!
   –  E.g. 16 parallel threads, read:
      •  Avg: 9 ms
      •  P99: 49 ms
      •  >1,300 requests/s
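Aggregates like the average and P99 above can be derived from the raw per-request timings. A minimal sketch using the nearest-rank percentile definition; the sample latencies below are made up for illustration, not the measured data:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample that is >= p% of all samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical per-request read latencies in milliseconds.
latencies_ms = [5, 7, 9, 8, 6, 12, 49, 10, 9, 7]
avg_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
print(f"avg {avg_ms:.1f} ms, P99 {p99_ms} ms")  # → avg 12.2 ms, P99 49 ms
```

With small sample counts the P99 is effectively the worst observed request, which is why percentiles only become meaningful with many requests per measurement run.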

Page 10: Ceph Object Storage at Spreadshirt

Ceph Object Storage Performance


Some smoke tests

•  Compared to Amazon S3?
   –  Comparing apples and oranges (unfair, but interesting)
      •  http vs. https, LAN vs. WAN, etc.
•  Response times
   –  Random read, object size: 4 KB, 4 parallel threads, client location: Leipzig

                 Ceph S3    AWS S3 (eu-central-1)    AWS S3 (eu-west-1)
   Location      Leipzig    Frankfurt                Ireland
   Avg           6 ms       25 ms                    56 ms
   P99           47 ms      128 ms                   374 ms
   Requests/s    405        143                      62

Page 11: Ceph Object Storage at Spreadshirt

Global Availability


•  1 Ceph cluster per data center

•  S3 bucket-to-bucket replication

•  Multiple regions, local delivery
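Conceptually, bucket-to-bucket replication reduces to diffing the source and target object listings and copying whatever is missing or changed. A minimal sketch of that planning step, with plain dicts (key to ETag) standing in for real S3 listing results; the keys and ETags are invented for illustration:

```python
def plan_sync(src, dst):
    """Return the object keys that must be copied from src to dst.

    src and dst are object listings mapping key -> ETag (content hash).
    A key needs copying when it is missing from dst or its ETag differs,
    i.e. the object changed at the source.
    """
    return [key for key, etag in src.items() if dst.get(key) != etag]

# "b.png" changed at the source, "c.png" is new, "a.png" is in sync.
src = {"a.png": "etag-1", "b.png": "etag-2", "c.png": "etag-3"}
dst = {"a.png": "etag-1", "b.png": "etag-old"}
print(plan_sync(src, dst))  # → ['b.png', 'c.png']
```

Comparing ETags rather than timestamps avoids clock-skew issues between data centers; a real replicator would additionally handle deletions and paginate the listings.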

Page 12: Ceph Object Storage at Spreadshirt

To be continued ...


Page 13: Ceph Object Storage at Spreadshirt

Thank You! [email protected]