20140509 cern open_stack_linuxtag_v3

Tim Bell

[email protected]

@noggin143

09/05/2014 LinuxTag 2014 2

CERN was founded 1954: 12 European States

“Science for Peace”

Today: 21 Member States

Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Israel, Italy, the Netherlands, Norway, Poland, Portugal, Slovakia, Spain, Sweden, Switzerland and the United Kingdom

Candidate for Accession: Romania

Associate Members in Pre-Stage to Membership: Serbia

Applicant States for Membership or Associate Membership: Brazil, Cyprus (awaiting ratification), Pakistan, Russia, Slovenia, Turkey, Ukraine

Observers to Council: India, Japan, Russia, Turkey, United States of America; European Commission and UNESCO

~ 2,300 staff

~ 1,000 other paid personnel

> 11,000 users

Budget (2013) ~1,000 MCHF

Collisions

A Big Data Challenge

In 2014,

• 100PB archive with additional 35PB/year

• 10,000 servers

• 75,000 disk drives

• 45,000 tapes

In 2015,

• Run 2 of LHC expected to double data rates

• But many limits and limitations…
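To put the figures above in perspective, here is a rough projection of the archive size. It is only a sketch: it assumes linear growth, and it assumes that "double data rates" translates directly into double the yearly archive growth, which the slide does not state. The function name `project_archive_pb` is made up for this illustration.

```python
def project_archive_pb(start_pb, yearly_pb, years, rate_multiplier=1.0):
    """Return the archive size in PB at the end of each year (linear growth)."""
    sizes = []
    total = start_pb
    for _ in range(years):
        total += yearly_pb * rate_multiplier
        sizes.append(total)
    return sizes


# Figures from the slide: 100 PB archive growing by 35 PB/year in 2014.
current_rate = project_archive_pb(100, 35, years=3)

# Assumption (not from the slide): doubled data rates double the yearly growth.
doubled_rate = project_archive_pb(100, 35, years=3, rate_multiplier=2.0)

print("Growth at 2014 rate:   ", current_rate)
print("Growth at doubled rate:", doubled_rate)
```

Under these assumptions the archive would pass 300 PB within three years at the doubled rate, compared with roughly 200 PB at the 2014 rate.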

The CERN Meyrin Data Centre

[Diagram; components shown: Bamboo; Koji, Mock; AIMS/PXE; Foreman; Yum repo; Pulp; Puppet-DB; mcollective, yum; JIRA; Lemon / Hadoop / LogStash / Kibana; git; OpenStack Nova; Hardware database; Puppet; Active Directory / LDAP]

Status

• Multi-data centre cloud in production since July 2013 (Geneva and Budapest)

• Currently running OpenStack Havana

• KVM and Hyper-V deployed

• All configured automatically with Puppet

• 65,000 cores in CERN IT Private Cloud

• 3PB Ceph pool available for volumes, images and other physics storage

[Diagram; components shown: Microsoft Active Directory; CERN DB on Demand; CERN Network Database; Account mgmt system; Horizon; Keystone; Glance; Network; Compute; Scheduler; Cinder; Nova; Block Storage; Ceph & NetApp; CERN Accounting; Ceilometer]

Monitoring - Flume, Elasticsearch, Kibana

[Diagram; components shown: OpenStack infrastructure; Flume gateway; HDFS; elasticsearch; Kibana]

Scaling Architecture Overview

[Diagram; elements shown: Load Balancer (Geneva, Switzerland); Top Cell - controllers (Geneva, Switzerland); Child Cell (Geneva, Switzerland) with controllers and compute-nodes; Child Cell (Budapest, Hungary) with controllers and compute-nodes]

Architecture Components

Top Cell

Controller

- rabbitmq
- Glance api
- Glance registry
- Keystone
- Nova api
- Nova consoleauth
- Nova novncproxy
- Nova cells
- Horizon
- Ceilometer api
- Cinder api
- Cinder volume
- Cinder scheduler
- Flume

Children Cells

Controller

- rabbitmq
- Keystone
- Nova api
- Nova conductor
- Nova scheduler
- Nova network
- Nova cells
- Glance api
- Ceilometer agent-central
- Ceilometer collector
- Flume

Compute node

- Nova compute
- Ceilometer agent-compute
- Flume

Other components shown

- HDFS
- Elastic Search
- Kibana
- MySQL
- MongoDB
- Stacktach
- Ceph

Some Caution on Cells

• Single cell limits around 1,000 hypervisors

• Can be adapted using Bluehost alternative approach with MySQL replication

• Significant function gap being worked on

• Flavors, Availability zones, Scheduling, Ceilometer need workarounds

• Tested in the OpenStack gate

• Not blocking, so a local QA environment is needed

Scheduling at Scale

• CERN users want more sophisticated scheduling:

• Processor architecture

• Private network subnets

• Varying memory/core/disk ratios

• Hardware with more redundancy

• Servers should be used fully

• Tetris-like problem to find the matches

• Packing is more difficult the nearer to 100% used

• Cells scheduler is rather simple currently

• Try Cell X, if no match, try Cell Y…
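The "try Cell X, if no match, try Cell Y" behaviour is essentially a first-fit search. The sketch below illustrates that idea only; it is not the Nova cells scheduler code. The `Cell` and `Request` classes, their fields, and the capacities are all hypothetical, and a whole cell is modelled as a single pool of cores and RAM, which ignores per-hypervisor packing.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical, simplified view of a cell as one pool of capacity."""
    name: str
    free_cores: int
    free_ram_gb: int
    architectures: tuple


@dataclass
class Request:
    """Hypothetical VM request with a few of the constraints from the slide."""
    cores: int
    ram_gb: int
    architecture: str


def fits(cell, req):
    """True if the cell offers the architecture and has enough free capacity."""
    return (req.architecture in cell.architectures
            and req.cores <= cell.free_cores
            and req.ram_gb <= cell.free_ram_gb)


def first_fit(cells, req):
    """Try each cell in order; place the request in the first one that fits."""
    for cell in cells:
        if fits(cell, req):
            cell.free_cores -= req.cores
            cell.free_ram_gb -= req.ram_gb
            return cell.name
    return None  # no cell matched


cells = [
    Cell("geneva", free_cores=4, free_ram_gb=8, architectures=("x86_64",)),
    Cell("budapest", free_cores=16, free_ram_gb=64, architectures=("x86_64",)),
]
print(first_fit(cells, Request(cores=8, ram_gb=16, architecture="x86_64")))
print(first_fit(cells, Request(cores=4, ram_gb=8, architecture="x86_64")))
print(first_fit(cells, Request(cores=16, ram_gb=32, architecture="x86_64")))
```

The third request fails even though 8 cores remain free in total, which is the packing problem the slide describes: the closer the cloud gets to 100% used, the harder it is to find a match.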

Upgrade Strategy

• Surely “OpenStack can’t be upgraded”

• Our Essex, Folsom and Grizzly clouds were ‘tear-down’ migrations

• Puppet managed VMs are typical Cattle cases – re-create

• User VMs: snapshot, download the image and upload it to the new instance

• One month window to migrate

• Users of production services expect more

• Physicists accept not creating/changing VMs for a short period

• Running VMs must not be affected

Phased Migration

• Migrated by Component

• Choose an approach (online with load balancer, offline)

• Spin up ‘teststack’ instance with production software

• Clone production databases to test environment

• Run through upgrade process

• Validate existing functions, Puppet configuration and monitoring

• Order by complexity and need

• Ceilometer, Glance, Keystone

• Cinder, Client CLIs, Horizon

• Nova
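The per-component procedure above can be written down as a simple checklist generator. This is a sketch for illustration only: the component order and the steps are taken from the slide, while the names `COMPONENT_ORDER`, `STEPS` and `build_plan` are invented here and do not correspond to any CERN tooling.

```python
# Component order as listed on the slide (ordered by complexity and need).
COMPONENT_ORDER = [
    "Ceilometer", "Glance", "Keystone",
    "Cinder", "Client CLIs", "Horizon",
    "Nova",
]

# Steps applied to each component, paraphrased from the slide.
STEPS = [
    "choose an approach (online with load balancer, or offline)",
    "spin up a teststack instance with production software",
    "clone production databases to the test environment",
    "run through the upgrade process",
    "validate existing functions, Puppet configuration and monitoring",
]


def build_plan(components, steps):
    """Return a flat checklist of (component, step) pairs in execution order."""
    return [(component, step) for component in components for step in steps]


plan = build_plan(COMPONENT_ORDER, STEPS)
for component, step in plan:
    print(f"{component}: {step}")
```

Each component goes through every step before the next component starts, so Nova, the most complex piece, is upgraded last.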

Upgrade Experience

• No significant outage of the cloud

• During upgrade window, creation not possible

• Small incidents (see blog for details)

• Puppet can be enthusiastic! - we told it to be

• Community response has been great

• Bugs have been fixed and open points are on the Juno design summit agenda

• Rolling upgrades in Icehouse will make it easier

OpenStack Federation

• OpenStack clouds in many high energy physics sites

• 2 more clouds at CERN in experiment areas (>20K cores each)

• Many collaborating sites adopting OpenStack

• Rackspace collaboration in Openlab

• Aim for seamless cloud resources (CERN, sites, public)

• All code to be included as open source in core OpenStack

• Federation building blocks (authentication, images, compute)

• Authentication included in Icehouse

• More to come…

Next Steps

• Scaling to >100,000 cores by 2015

• Around 100 hypervisors per week with fixed staff

• Deploying and configuring the latest features

• Kerberos / X.509 certificate authentication

• Delegated quota management

• Orchestration

• Database as a Service

• Cells scaling and scheduling

• Federation

Summary

• OpenStack at CERN is in production for thousands of physicists to analyse the results of the LHC

• Rapid innovation around OpenStack gives new function at an incredible rate

• Upgrades have already been done at scale and should become close to transparent in future

• Collaboration around vibrant open source communities has delivered production quality services

Questions ?

• Details at http://openstack-in-production.blogspot.fr

• CERN User guide at http://information-technology.web.cern.ch/book/cern-private-cloud-user-guide

• Previous presentations at http://information-technology.web.cern.ch/book/cern-private-cloud-user-guide/openstack-information

Service Models

• Pets are given names like pussinboots.cern.ch

• They are unique, lovingly hand raised and cared for

• When they get ill, you nurse them back to health

• Cattle are given numbers like vm0042.cern.ch

• They are almost identical to other cattle

• When they get ill, you get another one

http://www.eucalyptus.com/blog/2013/04/02/cy13-q1-community-analysis-%E2%80%94-openstack-vs-opennebula-vs-eucalyptus-vs-cloudstack

Tier-0 (CERN): data recording, initial data reconstruction, data distribution

Tier-1 (11 centres): permanent storage, re-processing, analysis

Tier-2 (~200 centres): simulation, end-user analysis

• Data is recorded at CERN and Tier-1s and analysed in the Worldwide LHC Computing Grid

• On a normal day, the grid provides 100,000 CPU days executing over 2 million jobs
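The two grid figures above imply a rough average job size. The arithmetic below is a back-of-the-envelope sketch: it treats "over 2 million jobs" as exactly 2 million, so the result is an upper bound on the average, and the variable names are chosen only for this example.

```python
# Figures from the slide, per calendar day.
cpu_days_per_day = 100_000
jobs_per_day = 2_000_000  # the slide says "over 2 million", so this is a lower bound

# 100,000 CPU days delivered in one day means about 100,000 cores busy on average.
average_busy_cores = cpu_days_per_day

# Average CPU time per job, converted from CPU days to CPU hours.
avg_cpu_hours_per_job = cpu_days_per_day * 24 / jobs_per_day

print(f"Average busy cores: about {average_busy_cores:,}")
print(f"Average CPU time per job: at most about {avg_cpu_hours_per_job:.1f} hours")
```

So the typical grid job uses a little over one CPU hour, which is why the workload is dominated by a very large number of independent, relatively short jobs.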

Page 32: 20140509 cern open_stack_linuxtag_v3

09/05/2014 LinuxTag 2014 32

Page 33: 20140509 cern open_stack_linuxtag_v3

Training for Newcomers

Buy the book rather than guru mentoring

What are the Origins of Mass ?

Matter/Anti Matter Symmetric?

Where is 95% of the Universe?

New Data Centre in Budapest

Monitoring - Kibana

Metering at Scale

• Ceilometer provides metering functions for OpenStack

• Requires careful configuration for cells

I/O at Scale

• Most hypervisors are recycled servers

• Most have 2 SATA disks of 1-2 TB each

• Some SSD but limited capacity

• IOPS limited with local storage

• Some guest tuning e.g. Linux scheduler

• General approach to use remote storage

• Ceph storage

• Network protocols such as webdav
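As an illustration of the guest tuning point, the sketch below inspects which Linux I/O scheduler a guest's block device is using. On Linux this is exposed in `/sys/block/<device>/queue/scheduler`, with the active scheduler shown in square brackets. The device name `vda` is an assumption (a typical virtio disk name), and the slides do not say which scheduler CERN selected, so this only reads the setting.

```python
from pathlib import Path


def parse_scheduler(line):
    """Parse a sysfs scheduler line such as 'noop deadline [cfq]'.

    Returns (active, available): the bracketed scheduler and the full list.
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read the scheduler setting for a block device, or None if not present."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    if not path.exists():
        return None
    return parse_scheduler(path.read_text())


if __name__ == "__main__":
    # "vda" is an assumed device name; adjust it for the guest being inspected.
    print(read_scheduler("vda"))
```

Changing the scheduler is done by writing one of the available names to the same file as root; which choice helps depends on the workload and on the storage behind the hypervisor.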
