OpenStack benchmarking tool: Rally shows you how to detect OpenStack bottlenecks and design issues.
Benchmarking OpenStack [at scale]
Boris Pavlovic, Mirantis, 2013
Agenda
● Benchmarking OpenStack at scale
○ What? Why? How?
● Rally
○ What is Rally?
○ Vision
○ Examples and results
● How to ensure that OpenStack works at scale?
● How to detect performance issues quickly and improve OpenStack scalability?
Benchmarking OpenStack
● Generate load from concurrent users
● Capture key metrics: avg/max time, failure rate
○ VM provisioning
○ Floating IP allocation
○ Snapshot creation
● Verify that the cloud works fine
...
● PROFIT!!!
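The metrics above reduce to a small aggregation over per-request samples. A minimal sketch of that aggregation, with illustrative names (this is not Rally's API):

```python
# Hypothetical sketch: summarize benchmark samples into the metrics the
# slide names (average time, maximum time, failure rate).

def summarize(samples):
    """samples: list of (duration_seconds, succeeded) tuples."""
    durations = [d for d, ok in samples if ok]
    failures = sum(1 for _, ok in samples if not ok)
    return {
        "avg": sum(durations) / len(durations) if durations else None,
        "max": max(durations) if durations else None,
        "failure_rate": failures / len(samples),
    }

# e.g. three successful boots and one timeout
stats = summarize([(4.2, True), (5.8, True), (30.0, False), (5.0, True)])
```

Failed requests are excluded from the timing stats but counted in the failure rate, so one hung provision call cannot hide inside an inflated average.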
A straightforward way to benchmark OpenStack

… but what if it breaks apart?
Incorrect deployment setup?
Non-optimal hardware?
Bug in the code?
Did you take enough time to educate yourself? ;)
RTFM
Really?
Read the docs… (after an hour)
There should be an easier way…
Improve OS cloud performance and scalability
● 3 common approaches:
○ Use better hardware
○ Deploy better
○ Make the code better
● But we need data points:
○ Which part of the code is the bottleneck?
○ What hardware limits are hit, if any?
○ How does deployment topology influence performance?
RALLY
Shine a light in the darkness
What is Rally?
● Rally is a community-based project that lets OpenStack developers and operators get relevant, repeatable benchmarking data on how their cloud operates at scale.
● Wiki: https://wiki.openstack.org/wiki/Rally
Relevant to both devs and operators
● Different types of user-defined workloads
○ For developers: synthetic tests, stress tests
○ For operators: real-life cloud usage patterns
● Flexible reporting
○ For developers: low-level profiling data, bottlenecks
○ For operators: high-level data about cloud performance, highlights of bottlenecks within their use case
How Rally works
● Deploy the OpenStack cloud with a pluggable deploy engine: DevStack, Fuel, Dummy, …
● Server providers supply the machines: Virsh, OpenStack, LXC, Amazon, …
● Run the specified scenarios, parameterized by: number of users, number of tenants, concurrency, type of workload, duration
● Get results: execution time breakdown, failure rates, graphics, profiling data
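As an illustration, a Rally task definition of roughly this shape ties the pieces together: a scenario name, its arguments, a runner with the load parameters, and a context describing users and tenants. The field names here follow later Rally releases ("runner", "context"); this is a sketch, not the exact 2013 syntax.

```json
{
  "NovaServers.boot_and_delete_server": [
    {
      "args": {
        "flavor": {"name": "m1.tiny"},
        "image": {"name": "cirros"}
      },
      "runner": {"type": "constant", "times": 100, "concurrency": 10},
      "context": {"users": {"tenants": 5, "users_per_tenant": 2}}
    }
  ]
}
```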
Benchmarking scenarios
[Diagram: real-life and synthetic workloads (Workload 1, 2, 3) run against the OpenStack cloud and produce results]
● Data for developers: low-level profiling, Tomograph results, graphs
● Data for stakeholders: historical data, SLAs, bottlenecks
Synthetic tests for developers
● Put stress on various OpenStack components
○ Large number of provisioned VMs per second
○ Large number of provisioned volumes per second
○ Large number of uploaded images per second
○ Large number of active resources (VMs/images/volumes)
● Expose bottlenecks and uncover design issues in OpenStack
● Create a golden standard for everyone in the community to validate against
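A stress test of this kind boils down to firing many provisioning calls with bounded concurrency and counting failures. A minimal driver sketch, where `boot_vm` is a hypothetical stand-in for a real OpenStack client call (not Rally's API):

```python
# Minimal stress-driver sketch: issue `times` requests with at most
# `concurrency` in flight at once, then count failures.
from concurrent.futures import ThreadPoolExecutor

def boot_vm(i):
    """Stand-in for a real 'boot server' API call (hypothetical)."""
    return True  # pretend the provision succeeded

def stress(times, concurrency, action=boot_vm):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(action, range(times)))
    return {"requests": times, "failures": results.count(False)}

report = stress(times=50, concurrency=8)
```

Sweeping `times` and `concurrency` upward until failure rates or latencies degrade is exactly how the "VMs per second" limits above get located.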
How did we deploy OpenStack?
● Using Fuel
● On real hardware
● 3 physical controllers
● 500+ physical compute nodes
● In HA deployment mode with Galera, HAProxy, Corosync, Pacemaker
Large number of active VMs
[Chart: a large number of active VMs shouldn't affect provisioning of new VMs]

Large number of concurrent users
[Chart: average time of booting and deleting VMs with different numbers of concurrent users]
Profiling with Tomograph and Zipkin
Highlights:
● Launch 3 VMs
○ 336 DB queries
○ 74 RPC calls
● Delete 3 VMs under high load
○ 1-minute global DB lock on the quotas table
Why real workloads in addition to synthetic?
● Rationale
○ In the real world, scenarios are more complicated than an immediate "boot-destroy"
○ Workloads rarely change; OpenStack and its topology/configuration change often
○ Profiles are specific to each business
● Expected outcome
○ Let companies specify their existing workload and benchmark the cloud against it
○ Let companies share their workloads
What to benchmark
1. Provision VMs
2. Use VMs
3. Destroy VMs
For each measured step: How long (on average)? How long (maximum)? Success rate?
Detailed benchmark of each step
Provision VMs → Use VMs → Destroy VMs
● Provision path: nova-api (1s), nova-db (2s), scheduler (9s), compute (4s), network (8s), glance (2m)
● Destroy path: nova-api (1s), nova-db (2s), compute (9s), network (4s), nova-db (8s)
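Assuming the in-order mapping of the slide's step times to components, the provisioning breakdown can be folded into an end-to-end total and a "biggest area of concern" in a few lines:

```python
# Per-step timings (seconds) from the slide's provisioning breakdown,
# assuming times map to components in the order shown.
steps = {
    "nova-api": 1, "nova-db": 2, "scheduler": 9,
    "compute": 4, "network": 8, "glance": 120,  # 2 minutes = 120 s
}
total_s = sum(steps.values())        # end-to-end provisioning time
slowest = max(steps, key=steps.get)  # the biggest area of concern
```

Under that mapping, glance dominates the total, which is the kind of per-component conclusion this breakdown is meant to surface.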
Another workload representation
What it shows
● Areas of biggest concern
● A baseline for all future changes (OpenStack version, deployment topology, Neutron plugin)
What we ultimately want to achieve
● Provide a mechanism to easily define workloads
● Let users benchmark their cloud within specified workload
● Provide historical data on all applied optimizations to see whether they lead to better performance
Roadmap
● Greatly improve profiling capabilities to quickly pinpoint problem locations
● Extend workload definitions to support richer and more realistic tests; combine workloads
● Support historical data and provide means of comparison/analytics
● Better correlation between business KPIs and reporting
Join Rally community
● It’s up to you to make Rally better
● Join our team:
○ Wiki: https://wiki.openstack.org/wiki/Rally
○ Project space: https://launchpad.net/rally
○ IRC chat: #openstack-rally on irc.freenode.net