Ceph Fundamentals
Ross Turk, VP Community, Inktank

Ceph Day Santa Clara: Ceph Fundamentals


DESCRIPTION

Ross Turk, VP, Marketing & Community, Inktank

Ceph is an open source distributed object store, network block device, and file system designed for reliability, performance, and scalability. It runs on standard hardware, has no single point of failure, and is supported by the Linux kernel. It also works great with OpenStack and CloudStack. If you’ve heard of Ceph but aren’t sure where it fits into your plans, this is the talk for you. Designed for those who are new to Ceph, this talk will cover Ceph’s design principles, overall architecture, and integration with other operational systems.


Page 1: Ceph Day Santa Clara: Ceph Fundamentals

Ceph Fundamentals
Ross Turk
VP Community, Inktank

Page 2: Ceph Day Santa Clara: Ceph Fundamentals

2

ME ME ME ME ME ME.

Ross Turk
VP Community, Inktank

[email protected]
@rossturk

inktank.com | ceph.com

Page 3: Ceph Day Santa Clara: Ceph Fundamentals

3

Ceph Architectural Overview
Ah! Finally, 32 slides in and he gets to the nerdy stuff.

Page 4: Ceph Day Santa Clara: Ceph Fundamentals

4

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

Page 5: Ceph Day Santa Clara: Ceph Fundamentals

5

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

Page 6: Ceph Day Santa Clara: Ceph Fundamentals

6

DISK

FS

DISK DISK

OSD

DISK DISK

OSD OSD OSD OSD

FS FS FS FS
btrfs xfs ext4

M M M

Page 7: Ceph Day Santa Clara: Ceph Fundamentals

7

M

M

M

HUMAN

Page 8: Ceph Day Santa Clara: Ceph Fundamentals

8

Monitors:
• Maintain cluster membership and state
• Provide consensus for distributed decision-making
• Small, odd number
• These do not serve stored objects to clients

M

OSDs:
• 10s to 10000s in a cluster
• One per disk (or one per SSD, RAID group…)
• Serve stored objects to clients
• Intelligently peer to perform replication and recovery tasks

Page 9: Ceph Day Santa Clara: Ceph Fundamentals

9

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

Page 10: Ceph Day Santa Clara: Ceph Fundamentals

10

LIBRADOS

M

M

M

APP

socket

Page 11: Ceph Day Santa Clara: Ceph Fundamentals

LIBRADOS
• Provides direct access to RADOS for applications
• C, C++, Python, PHP, Java, Erlang
• Direct access to storage nodes
• No HTTP overhead
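
To make "direct access, no HTTP overhead" concrete, here is a minimal sketch using the Python librados bindings; the pool name 'app-data' and the default ceph.conf path are illustrative assumptions, not something the slides specify.

import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('app-data')           # I/O context bound to one pool
    try:
        ioctx.write_full('greeting', b'hello ceph')  # store an object
        print(ioctx.read('greeting'))                # read it back: b'hello ceph'
    finally:
        ioctx.close()
finally:
    cluster.shutdown()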

Page 12: Ceph Day Santa Clara: Ceph Fundamentals

12

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

Page 13: Ceph Day Santa Clara: Ceph Fundamentals

13

M

M

M

LIBRADOS

RADOSGW

APP

socket

REST

Page 14: Ceph Day Santa Clara: Ceph Fundamentals

14

RADOS Gateway:
• REST-based object storage proxy
• Uses RADOS to store objects
• API supports buckets, accounts
• Usage accounting for billing
• Compatible with S3 and Swift applications
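
Because the gateway is S3- and Swift-compatible, existing object-storage clients can usually point at it unchanged. Below is a hedged sketch using boto3 against radosgw's S3 API; the endpoint URL and credentials are placeholders for whatever your own gateway provides.

import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:7480',  # placeholder radosgw endpoint
    aws_access_key_id='ACCESS_KEY',              # placeholder credentials
    aws_secret_access_key='SECRET_KEY',
)

s3.create_bucket(Bucket='demo')
s3.put_object(Bucket='demo', Key='hello.txt', Body=b'hello from radosgw')
print(s3.list_objects_v2(Bucket='demo')['Contents'][0]['Key'])  # -> hello.txt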

Page 15: Ceph Day Santa Clara: Ceph Fundamentals

15

Page 16: Ceph Day Santa Clara: Ceph Fundamentals

16

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

Page 17: Ceph Day Santa Clara: Ceph Fundamentals

17

M

M

M

VM

LIBRADOS

LIBRBD

HYPERVISOR

Page 18: Ceph Day Santa Clara: Ceph Fundamentals

18

LIBRADOS

M

M

M

LIBRBD

HYPERVISOR

LIBRADOS

LIBRBD

HYPERVISOR VM

Page 19: Ceph Day Santa Clara: Ceph Fundamentals

19

LIBRADOS

M

M

M

KRBD (KERNEL MODULE)

HOST

Page 20: Ceph Day Santa Clara: Ceph Fundamentals

20

RADOS Block Device:
• Storage of disk images in RADOS
• Decouples VMs from host
• Images are striped across the cluster (pool)
• Snapshots
• Copy-on-write clones
• Support in:
• Mainline Linux Kernel (2.6.39+)
• Qemu/KVM, native Xen coming soon
• OpenStack, CloudStack, Nebula, Proxmox
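
Outside a hypervisor, the same images can be created and written programmatically. The sketch below uses the Python librbd bindings and assumes an existing pool named 'rbd' with default client credentials; the image name is illustrative.

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                      # assumed pool name
try:
    rbd.RBD().create(ioctx, 'disk-001', 10 * 1024**3)  # 10 GiB image
    with rbd.Image(ioctx, 'disk-001') as image:
        image.write(b'\x00' * 4096, 0)                 # write one 4 KiB block at offset 0
        print(image.size())                            # -> 10737418240
finally:
    ioctx.close()
    cluster.shutdown()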

Page 21: Ceph Day Santa Clara: Ceph Fundamentals

21

RADOS

A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS

A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD

A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS

A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW

A bucket-based REST gateway, compatible with S3 and Swift

APP APP HOST/VM CLIENT

Page 22: Ceph Day Santa Clara: Ceph Fundamentals

22

M

M

M

CLIENT

0110

data
metadata

Page 23: Ceph Day Santa Clara: Ceph Fundamentals

23

Metadata Server
• Manages metadata for a POSIX-compliant shared filesystem
• Directory hierarchy
• File metadata (owner, timestamps, mode, etc.)
• Stores metadata in RADOS
• Does not serve file data to clients
• Only required for shared filesystem

Page 24: Ceph Day Santa Clara: Ceph Fundamentals

24

What Makes Ceph Unique?
Part one: it never, ever remembers where it puts stuff.

Page 25: Ceph Day Santa Clara: Ceph Fundamentals

25

APP??

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

Page 26: Ceph Day Santa Clara: Ceph Fundamentals

26

How Long Did It Take You To Find Your Keys This Morning?
azmeen, Flickr / CC BY 2.0

Page 27: Ceph Day Santa Clara: Ceph Fundamentals

27

APP

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

Page 28: Ceph Day Santa Clara: Ceph Fundamentals

28

Dear Diary: Today I Put My Keys on the Kitchen Counter
Barnaby, Flickr / CC BY 2.0

Page 29: Ceph Day Santa Clara: Ceph Fundamentals

29

APP

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

DC

A-G

H-N

O-T

U-Z

F*

Page 30: Ceph Day Santa Clara: Ceph Fundamentals

30

I Always Put My Keys on the Hook By the Door
vitamindave, Flickr / CC BY 2.0

Page 31: Ceph Day Santa Clara: Ceph Fundamentals

31

HOW DO YOU FIND YOUR KEYS
WHEN YOUR HOUSE IS
INFINITELY BIG AND
ALWAYS CHANGING?

Page 32: Ceph Day Santa Clara: Ceph Fundamentals

32

The Answer: CRUSH!!!!!
pasukaru76, Flickr / CC SA 2.0

Page 33: Ceph Day Santa Clara: Ceph Fundamentals

33

OBJECT

10 10 01 01 10 10 01 11 01 10

hash(object name) % num pg

CRUSH(pg, cluster state, rule set)
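
Those two lines are the entire lookup path: a hash maps the object name to a placement group, and CRUSH maps the placement group to a set of OSDs, so any client can compute an object's location without asking anyone. The toy Python sketch below mimics that two-step shape for illustration only; it is not the real CRUSH algorithm, which also consults the cluster map, rule set, topology, and weights.

import hashlib

def object_to_pg(obj_name: str, num_pgs: int) -> int:
    # Step 1: hash(object name) % num_pgs -- stable, computed by any client, no lookup table.
    return int(hashlib.md5(obj_name.encode()).hexdigest(), 16) % num_pgs

def pg_to_osds(pg: int, osds: list, replicas: int = 3) -> list:
    # Step 2: a deterministic pseudo-random ranking of OSDs for this PG.
    # Real CRUSH also uses the cluster map, rule set, failure domains, and weights.
    ranked = sorted(osds, key=lambda osd: hashlib.md5(f'{pg}:{osd}'.encode()).hexdigest())
    return ranked[:replicas]

pg = object_to_pg('my-object', num_pgs=128)
print(pg, pg_to_osds(pg, osds=list(range(12))))  # every client computes the same answer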

Page 34: Ceph Day Santa Clara: Ceph Fundamentals

34

OBJECT

10 10 01 01 10 10 01 11 01 10

Page 35: Ceph Day Santa Clara: Ceph Fundamentals

35

CRUSH
• Pseudo-random placement algorithm
• Fast calculation, no lookup
• Repeatable, deterministic
• Statistically uniform distribution
• Stable mapping
• Limited data migration on change
• Rule-based configuration
• Infrastructure topology aware
• Adjustable replication
• Weighting

Page 36: Ceph Day Santa Clara: Ceph Fundamentals

36

CLIENT

??

Page 37: Ceph Day Santa Clara: Ceph Fundamentals

37

Page 38: Ceph Day Santa Clara: Ceph Fundamentals

38

Page 39: Ceph Day Santa Clara: Ceph Fundamentals

39

Page 40: Ceph Day Santa Clara: Ceph Fundamentals

40

CLIENT

??

Page 41: Ceph Day Santa Clara: Ceph Fundamentals

41

What Makes Ceph Unique?
Part two: it has smart block devices for all those impatient, selfish VMs.

Page 42: Ceph Day Santa Clara: Ceph Fundamentals

42

LIBRADOS

M

M

M

VM

LIBRBD

HYPERVISOR

Page 43: Ceph Day Santa Clara: Ceph Fundamentals

43

HOW DO YOU SPIN UP
THOUSANDS OF VMs INSTANTLY
AND EFFICIENTLY?

Page 44: Ceph Day Santa Clara: Ceph Fundamentals

44

144 0 0 0 0

instant copy

= 144

Page 45: Ceph Day Santa Clara: Ceph Fundamentals

45

4 144

CLIENT

write

write

write

= 148

write

Page 46: Ceph Day Santa Clara: Ceph Fundamentals

46

4 144

CLIENT
read

read

read

= 148
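
The arithmetic on the last three slides (an instant copy that still totals 144, growing to 148 only after new writes) is RBD's copy-on-write cloning: a snapshot of a template image is protected and cloned, and the clone stores only the blocks later written to it. A minimal sketch with the Python librbd bindings, assuming a pool named 'rbd' and a hypothetical template image 'gold' with the layering feature enabled:

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                 # assumed pool name
try:
    with rbd.Image(ioctx, 'gold') as parent:      # 'gold' is a hypothetical template image
        parent.create_snap('base')                # freeze the golden image
        parent.protect_snap('base')               # clones require a protected snapshot
    # The clone itself is instant: no data is copied until the new VM writes to it.
    rbd.RBD().clone(ioctx, 'gold', 'base', ioctx, 'vm-0001')
finally:
    ioctx.close()
    cluster.shutdown()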

Page 47: Ceph Day Santa Clara: Ceph Fundamentals

47

What Makes Ceph Unique?
Part three: it has powerful friends, ones you probably already know.

Page 48: Ceph Day Santa Clara: Ceph Fundamentals

48

M

M

M

APACHE CLOUDSTACK

HYPERVISOR

PRIMARY STORAGE POOL

SECONDARY STORAGE POOL

snapshots

templates

images

Page 49: Ceph Day Santa Clara: Ceph Fundamentals

49

M

M

M

OPENSTACK

KEYSTONE API

SWIFT API CINDER API GLANCE API

NOVA API

HYPERVISOR

RADOSGW

Page 50: Ceph Day Santa Clara: Ceph Fundamentals

50

What Makes Ceph Unique?
Part four: clustered metadata

Page 51: Ceph Day Santa Clara: Ceph Fundamentals

51

M

M

M

CLIENT

0110

Page 52: Ceph Day Santa Clara: Ceph Fundamentals

52

M

M

M

Page 53: Ceph Day Santa Clara: Ceph Fundamentals

53

one tree

three metadata servers

??

Page 54: Ceph Day Santa Clara: Ceph Fundamentals

54

Page 55: Ceph Day Santa Clara: Ceph Fundamentals

55

Page 56: Ceph Day Santa Clara: Ceph Fundamentals

56

Page 57: Ceph Day Santa Clara: Ceph Fundamentals

57

Page 58: Ceph Day Santa Clara: Ceph Fundamentals

58

DYNAMIC SUBTREE PARTITIONING

Page 59: Ceph Day Santa Clara: Ceph Fundamentals

59

Questions?

Ross Turk
VP Community, Inktank

[email protected]
@rossturk

inktank.com | ceph.com