Agenda
• About Ambedded
• What are the issues of using a single server node with multiple Ceph OSDs?
• The 1 microserver : 1 OSD architecture
• The benefits
• The basic high-availability Ceph cluster
• Scale it out
• Why does the network matter?
• How fast this architecture self-heals a failed OSD
• How much energy we save
• An easy way to run a Ceph cluster
About Ambedded Technology
• Y2013: Founded in Taipei, Taiwan; office in the National Taiwan University Innovation Center
• Y2014: Launched the Gen 1 microserver storage server architecture; product demo at the ARM Global Partner Meeting in Cambridge, UK
• Y2015: Partnership with a European customer for cloud storage service; 1,800+ microservers and 5.5PB in operation since 2014
• Y2016: Launched the first-ever Ceph storage appliance powered by the Gen 2 ARM microserver; awarded Best of Interop 2016 Las Vegas storage product, beating VMware Virtual SAN
Issues of Using a Single Server Node with Multiple Ceph OSDs
• The smallest failure domain is the set of OSDs inside one server: a single server failure takes many OSDs down at once.
• CPU utilization is only 30-40% when the network is saturated; the bottleneck is the network, not the computing.
• Power consumption and thermal output eat your money.
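The network-bottleneck claim can be sanity-checked with rough numbers. The per-disk throughput and disk count below are illustrative assumptions, not figures from the slides:

```python
# Rough bottleneck check for a single server hosting many OSDs
# (disk count and per-disk rate are illustrative assumptions).
DISKS_PER_SERVER = 12          # assumed number of OSD disks in one chassis
HDD_MBPS = 150                 # assumed sequential throughput per HDD, MB/s
NIC_GBPS = 10                  # a single 10Gb uplink

disk_mbps = DISKS_PER_SERVER * HDD_MBPS   # aggregate disk throughput: 1800 MB/s
nic_mbps = NIC_GBPS * 1000 / 8            # 10 Gb/s is about 1250 MB/s

# The disks can deliver more than the NIC can carry, so the uplink
# saturates first while the CPUs sit partly idle.
print(disk_mbps, nic_mbps, disk_mbps > nic_mbps)
```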
High Availability: Minimize the HW Failure Domain
[Diagram: ARM microserver clusters (40Gb uplink each) vs. traditional x86 servers #1-#3 (10Gb uplink each), both serving clients over the network]
• ARM microserver cluster: 1 microserver fails, ONLY 1 disk fails.
• x86 server: 1 motherboard fails, a whole set of disks fails.
• The microserver architecture gives N small independent failure domains instead of one big one (N >> 1).
The Benefits of the One-to-One Architecture
• Truly no single point of failure
• The smallest failure domain is one OSD
• The MTBF of a microserver is much higher than that of an all-in-one motherboard
• Dedicated hardware resources give each OSD a stable service
• Aggregated network bandwidth with failover
• Low power consumption and cooling cost
• OSD, MON, and gateway all run in the same boxes
• 3 units form a high-availability cluster
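The smaller failure domain also bounds how much data the cluster must re-replicate after a hardware fault. A quick comparison (disk counts and sizes are illustrative assumptions, not from the slides):

```python
# Data Ceph must regenerate to restore full redundancy after one
# hardware failure (illustrative disk counts and sizes).
DISK_TB = 10
DISKS_PER_X86_SERVER = 12      # all disks share one motherboard
DISKS_PER_MICROSERVER = 1      # the one-to-one architecture

x86_rebuild_tb = DISKS_PER_X86_SERVER * DISK_TB    # one board dies: 120 TB to heal
micro_rebuild_tb = DISKS_PER_MICROSERVER * DISK_TB # one microserver dies: 10 TB to heal
print(x86_rebuild_tb, micro_rebuild_tb)
```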
The Basic High Availability Cluster
Basic configuration: three chassis, each running 1x MON and 7x OSDs.
Scale Out the Cluster
Scale-out test: random 4K IOPS vs. number of SSDs

SSDs    Random 4K Read IOPS    Random 4K Write IOPS
7       62,546                 8,955
14      125,092                17,910
21      187,639                26,866
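The scale-out test results above scale almost perfectly linearly with SSD count, which is easy to verify:

```python
# Verify the near-linear scaling of the scale-out test results.
read_iops = {7: 62_546, 14: 125_092, 21: 187_639}
write_iops = {7: 8_955, 14: 17_910, 21: 26_866}

for label, data in (("4K read", read_iops), ("4K write", write_iops)):
    per_ssd = {n: round(iops / n) for n, iops in data.items()}
    print(label, per_ssd)
# Both workloads deliver an almost constant per-SSD figure
# (~8,935 read IOPS and ~1,279 write IOPS per SSD at every cluster size),
# i.e. performance scales linearly as OSDs are added.
```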
Scale Out Unified Virtual Storage: More OSDs, More Capacity, More Performance
[Chart: capacity and performance both grow as the cluster scales from 3x to 6x Mars 200 units, shown for both all-HDD and all-flash (SSD) configurations]
Network Does Matter! (HGST He 10TB HDDs, 16x OSD)

Workload              20Gb uplink          40Gb uplink          IOPS
                      BW (MB/s)  IOPS      BW (MB/s)  IOPS      increase
4K write, 1 client     7.2        1,800     11         2,824     57%
4K write, 2 clients   13          3,389     20         5,027     48%
4K write, 4 clients   22          5,570     35         8,735     57%
4K write, 10 clients  39          9,921     60        15,081     52%
4K write, 20 clients  53         13,568     79        19,924     47%
4K write, 30 clients  63         15,775     90        22,535     43%
4K write, 40 clients  68         16,996     96        24,074     42%

The purpose of this test is to measure how much performance improves when the uplink bandwidth is increased from 20Gb to 40Gb. The Mars 200 has 4x 10Gb uplink ports. The results show a 42-57% improvement in IOPS.
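The "increase" figures above can be re-derived directly from the measured IOPS:

```python
# Re-derive the IOPS gain from moving 20Gb -> 40Gb uplinks,
# using the measured values from the table above.
rows = {  # clients: (iops_at_20gb, iops_at_40gb)
    1: (1_800, 2_824),
    2: (3_389, 5_027),
    4: (5_570, 8_735),
    10: (9_921, 15_081),
    20: (13_568, 19_924),
    30: (15_775, 22_535),
    40: (16_996, 24_074),
}
for clients, (iops_20, iops_40) in rows.items():
    gain = (iops_40 - iops_20) / iops_20 * 100
    print(f"{clients:>2} clients: +{gain:.0f}%")   # gains fall in the 42-57% range
```

As a cross-check on units, 1,800 IOPS at 4KB per operation is 7.2 MB/s, which matches the BW column, so the bandwidth figures are in MB/s.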
Self-Healing Is Much Faster & Safer than RAID

Test item                       Ambedded Ceph Storage              Disk Array
# of HDDs / capacity            16x HDD, 10TB each                 16x HDD, 3TB each
Data protection                 Replica = 2                        RAID 5
Data stored on each HDD         3TB                                Any amount (the whole
                                                                   disk is rebuilt regardless)
Recovery time after losing      5 hours 10 minutes                 41 hours
one HDD                         (system keeps working normally)
Admin intervention              None: auto-detection and           Someone must replace the
                                autonomic self-healing             damaged disk
Recovery method                 Self-healing: the cluster          The whole disk must be
                                regenerates the lost copies from   rebuilt by the entire
                                the remaining healthy disks        disk array
Recovery time vs. # of disks    More disks, SHORTER time           More disks, LONGER time
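The "more disks, shorter time" row reflects Ceph's many-to-many recovery: every surviving OSD contributes in parallel, while a RAID rebuild funnels all writes through one replacement disk. A toy model makes the contrast concrete (the per-disk rate is an illustrative assumption, and the results are not the slide's measured figures):

```python
# Toy model: Ceph parallel recovery vs. single-disk RAID rebuild.
PER_DISK_MBPS = 50            # assumed sustained recovery rate per disk

def ceph_recovery_hours(n_disks, lost_tb):
    # Surviving OSDs regenerate the lost copies in parallel.
    aggregate_mbps = (n_disks - 1) * PER_DISK_MBPS
    return lost_tb * 1e6 / aggregate_mbps / 3600

def raid_rebuild_hours(disk_tb):
    # The replacement disk's write speed caps the whole rebuild.
    return disk_tb * 1e6 / PER_DISK_MBPS / 3600

print(f"Ceph, 16 disks, 3TB lost:   {ceph_recovery_hours(16, 3):.1f} h")
print(f"RAID, rebuild one 3TB disk: {raid_rebuild_hours(3):.1f} h")
```

In this model, adding disks increases Ceph's aggregate recovery rate and shortens recovery, while the RAID rebuild time is fixed by the single disk no matter how large the array grows.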
Mars 200: 8-Node ARM Microserver Cluster
• 8x hot-swappable microservers, each with: 1.6GHz dual-core ARMv7 CPU, 2GB DRAM, 8GB flash, 5Gbps LAN, <5W power consumption
• Storage: 8x hot-swappable SATA3 HDD/SSD + 8x SATA3 journal SSD
• 300W redundant power supply
• OOB BMC port
• Dual hot-swappable uplink switches: 4x 10Gbps total, SFP+/10GBase-T combo ports
Green Storage: Saving More in Operation
(320W - 60W) x 24h x 365 days / 1000 x $0.2 USD/kWh x 40 units x 2 (power & cooling) = ~USD 36,000 per rack per year.
This electricity cost is based on the Taiwan rate; it could be double or triple in Japan or Germany.
Note: the 320W comparison server is a SuperMicro 6028R-WTRT with 8x 3.5" HDD bays.
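The savings formula works out as follows:

```python
# Annual electricity saving per rack, using the figures from the slide.
delta_watts = 320 - 60          # the slide's (320W - 60W) per-unit difference
hours_per_year = 24 * 365
usd_per_kwh = 0.2               # Taiwan electricity rate used on the slide
units_per_rack = 40
power_and_cooling = 2           # every watt saved also saves cooling

saving = (delta_watts * hours_per_year / 1000
          * usd_per_kwh * units_per_rack * power_and_cooling)
print(f"USD {saving:,.0f} per rack per year")   # USD 36,442, i.e. ~USD 36,000
```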
What You Can Do with UVS Manager
• Deploy OSDs, MONs, and MDSs
• Create pools, RBD images, iSCSI LUNs, and S3 users
• Support for replicas (1-10) and erasure code (K+M)
• OpenStack backend storage management
• Create CephFS
• Snapshot, clone, and flatten images
• CRUSH map configuration
• Scale out your cluster