Rally: OpenStack Benchmarking at Scale


Rally, an OpenStack benchmarking tool, shows you how to detect OpenStack bottlenecks and design issues.


Boris Pavlovic, Mirantis, 2013

Agenda

● Benchmarking OpenStack at scale
  ○ What? Why? How?
● Rally
  ○ What is Rally?
  ○ Vision
  ○ Examples and results
● How to ensure that OpenStack works at scale?
● How to detect performance issues quickly and improve OpenStack scalability?

Benchmarking OpenStack

● Generate load from concurrent users
● Capture key metrics: avg/max time, failure rate
  ○ VM provisioning
  ○ Floating IP allocation
  ○ Snapshot creation
● Verify that the cloud works fine
● ...
● PROFIT!!!
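The load-and-measure loop above can be sketched in a few lines of Python. Here `provision_vm` is a stand-in for a real operation such as booting a VM through the Nova API, and the sleep and failure pattern are illustrative only:

```python
import concurrent.futures
import statistics
import time

def provision_vm(i):
    """Stand-in for a real operation such as booting a VM via the Nova API."""
    time.sleep(0.01)          # simulate the operation taking time
    if i % 10 == 9:           # simulate an occasional failure
        raise RuntimeError("provisioning failed")

def timed(operation, i):
    """Run one operation; return (duration, error-or-None)."""
    start = time.monotonic()
    try:
        operation(i)
        return time.monotonic() - start, None
    except Exception as exc:
        return time.monotonic() - start, exc

def run_benchmark(operation, iterations=50, concurrency=5):
    """Generate concurrent load and capture avg/max time and failure rate."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda i: timed(operation, i), range(iterations)))
    durations = [d for d, _ in results]
    failures = sum(1 for _, err in results if err is not None)
    return {
        "avg_time": statistics.mean(durations),
        "max_time": max(durations),
        "failure_rate": failures / iterations,
    }
```

Rally automates exactly this pattern against a live cloud, with real OpenStack operations in place of the stub.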


… but what if it breaks apart?

A straightforward way to benchmark OpenStack

Incorrect deployment setup?

Non-optimal hardware?

Bug in the code?

Did you take enough time to educate yourself?

;)

RTFM

Really?

Read the docs… (after an hour)

There should be an easier way…

Improve OpenStack cloud performance and scalability

● 3 common approaches:
  ○ Use better hardware
  ○ Deploy better
  ○ Make the code better
● But we need data points:
  ○ Which part of the code is a bottleneck?
  ○ What hardware limits are hit, if any?
  ○ How does deployment topology influence performance?

RALLY

Shine a light in the darkness

What is Rally?

● Rally is a community-based project that allows OpenStack developers and operators to get relevant and repeatable benchmarking data on how their cloud operates at scale.
● Wiki: https://wiki.openstack.org/wiki/Rally

Relevant to both devs and operators

● Different types of user-defined workloads
  ○ For developers: synthetic tests, stress tests
  ○ For operators: real-life cloud usage patterns
● Flexible reporting
  ○ For developers: low-level profiling data, bottlenecks
  ○ For operators: high-level data about cloud performance, highlights of bottlenecks within their use case

How Rally works

Deploy OpenStack cloud → Run specified scenarios → Get results

● Deploy engines: DevStack, Fuel, Dummy
● Server providers: Virsh, OpenStack, LXC, Amazon
● Parameters: number of users, number of tenants, concurrency, type of workload, duration
● Results: execution time breakdown, failure rates, graphics, profiling data
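Concretely, those parameters end up in a small task definition file. The sketch below shows the general shape of a Rally-style task; the scenario name, argument fields, and schema are illustrative and vary across Rally versions:

```python
import json

# Illustrative Rally-style task definition; the scenario name and field
# names sketch the general shape rather than any specific Rally release.
task = {
    "NovaServers.boot_and_delete_server": [
        {
            "args": {"flavor": {"name": "m1.tiny"}, "image": {"name": "cirros"}},
            "runner": {"type": "constant", "times": 100, "concurrency": 10},
            "context": {"users": {"tenants": 2, "users_per_tenant": 5}},
        }
    ]
}

# Rally tasks are typically stored as JSON (or YAML) files on disk:
print(json.dumps(task, indent=2))
```

The runner controls how much load is generated (how many iterations, how many in parallel), while the context describes the tenants and users the load is generated from.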

Benchmarking scenarios

Real-life and synthetic workloads (Workload 1, Workload 2, Workload 3, …) run against the OpenStack cloud and produce results:

● Data for developers: low-level profiling, Tomograph results, graphs
● Data for stakeholders: historical data, SLAs, bottlenecks

Synthetic tests for developers

● Stress-test various OpenStack components
  ○ Large number of provisioned VMs per second
  ○ Large number of provisioned volumes per second
  ○ Large number of uploaded images per second
  ○ Large number of active resources (VMs/images/volumes)
● Expose bottlenecks and uncover design issues in OpenStack
● Create a golden standard for everyone in the community to validate against

How did we deploy OpenStack?

● Using Fuel● On real hardware● 3 physical controllers● 500+ physical compute nodes● In HA deployment mode with Galera,

HAProxy, Corosync, Pacemaker

Large number of active VMs

A large number of active VMs shouldn’t affect provisioning of new VMs.

Large number of concurrent users

Average time of booting and deleting VMs with different numbers of concurrent users.

Profiling with Tomograph and Zipkin

Highlights:
● Launching 3 VMs: 336 DB queries, 74 RPC calls
● Deleting 3 VMs under high load: a 1-minute global DB lock on the quotas table

Why real workloads in addition to synthetic?

● Rationale
  ○ In the real world, scenarios are more complicated than “boot, then destroy immediately”
  ○ Workloads rarely change; OpenStack and its topology/configuration change often
  ○ Profiles are specific to businesses
● Expected outcome
  ○ Let companies specify their existing workload and benchmark the cloud according to it
  ○ Let companies share

What to benchmark

1. Provision VMs
2. Use VMs
3. Destroy VMs

For each step: How long (on average)? How long (maximum)? Success rate?
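Those three questions are answered per step by aggregating raw run records. A minimal sketch, using made-up numbers purely for illustration:

```python
from collections import defaultdict

# Hypothetical raw benchmark records: (step, duration in seconds, succeeded)
records = [
    ("provision", 12.0, True), ("provision", 15.5, True), ("provision", 30.0, False),
    ("use",        2.0, True), ("use",        2.5, True),
    ("destroy",    5.0, True), ("destroy",    7.0, False),
]

def per_step_metrics(records):
    """Compute avg time, max time, and success rate for each benchmark step."""
    grouped = defaultdict(list)
    for step, duration, ok in records:
        grouped[step].append((duration, ok))
    return {
        step: {
            "avg": sum(d for d, _ in runs) / len(runs),
            "max": max(d for d, _ in runs),
            "success_rate": sum(1 for _, ok in runs if ok) / len(runs),
        }
        for step, runs in grouped.items()
    }
```

Grouping by step rather than pooling everything is what lets a slow "provision" phase stand out instead of being averaged away by many fast "use" operations.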

Detailed benchmark of each step

Provision VMs → Use VMs → Destroy VMs

Provision VMs, broken down by component:
nova-api 1s, nova-db 2s, scheduler 9s, compute 4s, network 8s, glance 2m

Destroy VMs, broken down by component:
nova-api 1s, nova-db 2s, compute 9s, network 4s, nova-db 8s
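With a per-component breakdown like this, finding the bottleneck is mechanical: take the component with the largest share of total time. The timings below mirror the provisioning row on the slide:

```python
# Per-component timings (seconds) for the provisioning step, as on the slide.
provision_breakdown = {
    "nova-api": 1, "nova-db": 2, "scheduler": 9,
    "compute": 4, "network": 8, "glance": 120,  # 2m = 120s
}

# The bottleneck is the component with the largest share of total time.
bottleneck = max(provision_breakdown, key=provision_breakdown.get)
share = provision_breakdown[bottleneck] / sum(provision_breakdown.values())
print(bottleneck, f"{share:.0%}")  # glance dominates the provisioning path
```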

Another workload representation

What it shows:
● Areas of biggest concern
● A baseline for all future changes (OpenStack version, deployment topology, Neutron plugin)

What we ultimately want to achieve

● Provide a mechanism to easily define workloads
● Let users benchmark their cloud within a specified workload
● Provide historical data on all applied optimizations to see whether they lead to better performance
● Greatly improve profiling capabilities to quickly pinpoint problem locations

Roadmap

● Extend workload definitions to support richer and more realistic tests; combine workloads
● Support historical data and provide means of comparison/analytics
● Better correlation between business KPIs and reporting

Join the Rally community

● It’s up to you to make Rally better
● Join our team:
  ○ Wiki: https://wiki.openstack.org/wiki/Rally
  ○ Project space: https://launchpad.net/rally
  ○ IRC chat: #openstack-rally on irc.freenode.net
