Icpe2015 weiyi shang (1)

Preview:

Citation preview

1

Automated Detection of Performance Regressions Using Regression Models on

Clustered Performance Counters

Weiyi Shang, Ahmed E. HassanQueen’s University, Kingston, Canada

Mohamed Nasser, Parminder FloraBlackBerry, Waterloo, Canada

2

What is a performance regression?

Old version New version

Does the new version have worse performance than the old version?

3

How to detect performance regression?

Old Version

New version

requests

requests

requests

Performance counters

Performance counters

4

Large software systems generate large amounts of performance counters

5

Performance engineers pick counters to compare using T-test

Picking the wrong counters will not detect performance regression!

Are these counters significantly

different between old and new

versions?

6

Performance engineers build a model using all counters

Target Counter Independent Counters

CPUMemory I/O read I/O write

7

Target counter

Old version New version

Select target counter

Build model

Verify model

Model Prediction error

Target counter is selected by experience!

Model-based performance regression detection

8

Selecting a wrong target counter will fail to detect performance regression

Memory-related performance regressions will not be detected by this model!

Memory has a low correlation with CPU

CPUMemory I/O read I/O write

I/O write has a high correlation with CPU

9

Our approach

Target counter

Old version New version

Select target counter

Build model

Verify model

Model Prediction error

Reduce counters

Reduced counters

Clustercounters

Clusters of counters For each cluster

10

Select target counter

Build model

Verify model

Reduce counters

ClusterCounters

Removing zero variance counters

Removing redundant counters

Available CPU cores:4, 4, 4, …4, 4

I/O write byte/sec= a*I/O write op/sec +b*CPU

Step 1: Reduce counters

11

Step 2: Cluster counters

Hierarchical clustering Dendrogram cutting

CPU I/O Memory CPU I/O Memory

Select target counter

Build model

Verify model

Reduce counters

ClusterCounters

12

Step 3: Select target counter

Select target counter

Build model

Verify model

Reduce counters

ClusterCounters

CPU from old version

CPU from new version

Kolmogorov–Smirnov test

P-VALUE

We select the counter that has significant difference between two versions with highest certainty.

We select the counter with

smallest p-value

13

Step 3: Build model Select target counter

Build model

Verify model

Reduce counters

ClusterCounters

CPUMemory I/O read I/O write

We build a linear regression model.

+ c+ ba

14

Step 4: Verify model

Linear regression models for each cluster

Calculate prediction error

New version counters left in cluster 1

Linear regression model for cluster 1

… …

Mean prediction error

< threshold

> threshold

Select target counter

Build model

Verify model

Reduce counters

ClusterCounters

15

Case study: subject systems

DELL DVD Store

Enterprise Application

CPU overhead Memory overhead I/O overhead

Removing text index Removing column index

Real-life performance regression

16

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

17

Our approach picks a small number of target counters

DELL DVD Store

Our approach picks 2 to 4 target counters.

Performance engineers cannot pick target counters based on experience

Picked target counters are different across runs of our approach.

18

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

19

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

20

Our approach can detect performance regressions

DELL DVD Store

Max prediction error4% to 11% without regression

24% to 1622% with regression

Our approach is not heavily impacted by the choice of threshold value.

Without performance regression

3%

9% Max prediction error

With performance regression

2%

916%Max prediction error

21

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

Our approach can detect performance regressions and is not heavily impacted by threshold value.

22

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

Our approach can detect performance regressions and is not heavily impacted by threshold value.

23

Our approach outperforms picking one target counter to build a model

DELL DVD Store

Without performance regression

0%

With performance regression

0%

Building one model using memory as target counter may fail to detect performance regression

24

T-test does not perform well in detecting performance regressions.

DELL DVD Store

Without performance regression

3 to 6 counters are flagged

There are a large number of counters with significant differences in the T-test results even though no regressions exist.

Enterprise Application

Without performance regression

32% counters are flagged

25

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

Our approach can detect performance regressions and is not heavily impacted by threshold value.

Our approach out performs traditional approaches.

26

How to detect performance regression?

Old Version

New version

requests

requests

requests

Performance counters

Performance counters

27

28

Target counter

Old version New version

Select target counter

Build model

Verify model

Model Prediction error

Target counter is selected by experience!

Model-based performance regression detection

29

30

Our approach

Target counter

Old version New version

Select target counter

Build model

Verify model

Model Prediction error

Reduce counters

Reduced counters

Clustercounters

Clusters of counters For each cluster

31

32

Can our approach detect performance

regressions?

AccuracyApplicability

How many target counters does our

approach pick?

Is our approach better than traditional

approaches?

Comparison

Our approach picks a small number of target counters.

Our approach can detect performance regressions and is not heavily impacted by threshold value.

Our approach out performs traditional approaches.

33

weiyishang.wordpress.comswy@cs.queensu.ca

Recommended