1 Performance Engineering Prof. Jerry Breecher MEASUREMENT AND STATISTICS


Performance Engineering

Prof. Jerry Breecher

MEASUREMENT AND STATISTICS


Measurement and Statistics

In order to get you in the mood for doing some measuring, statistics, and estimating, here are some quotations with the right flavour:

 

 

"Figures don't lie, but liars figure." Mark Twain

 

"There are three kinds of untruths; lies, damn lies, and statistics." Mark Twain

 

The following are from "Policy Paradox and Political Reason" by Deborah Stone.

 

"Numerals hide all the difficult choices that go into a measurement."

 

"Certain kinds of numbers, big ones, numbers with decimal points, ones not multiples of ten, seemingly advertise the prowess of the measurer."

 

"How accurate a number is depends on the cost of acquiring it and on how important it is."

 


"Numbers are a form of poetry. Symbols are another."

 

"No number is innocent, for it is impossible to count without making categorization."

 

"Every number is a political statement about where to draw the line."

 

"The first number you measure becomes the status quo."


Purpose:

 

This section is about the methodology of measurement: what goes into designing an experiment, gathering some numbers, interpreting the results, and presenting those results to management in a way that allows them to make the necessary decisions.

 

 

 

Warm-up Experiment:

 

Divide into teams and measure the length of an object in the classroom. To do so you will need to make team decisions about tools, techniques, and reporting metrics.

 

Upon completion, discuss what can be learned from this experiment.

FUNDAMENTAL QUESTIONS ABOUT MEASUREMENT:

1. What kind of accuracy can you expect from a computer (or any other) measurement?
2. When you make a measurement, can you believe the result? How sure are you of the result?
3. How should you state the result of an experiment? How do you reflect your belief in its accuracy?
4. Can one number represent the performance of a product?
5. When have you measured enough?
6. Figures don't lie, but liars figure. How do you extrapolate from what you know to what you'd like to know?

FUNDAMENTAL QUESTIONS ABOUT MEASUREMENT (continued):

7. How do you know what tools to use?
8. Is everything in a computer measurable?
9. How do you know what to measure?
10. Should you always know the result of a measurement before you make it?
11. How do you figure out dependencies; how does one variable depend on another?
12. So after all this talk about the details of measurement, how do you actually design an experiment?


1. What kind of accuracy can you expect from a computer (or any other) measurement?

Associated questions are:
• What are some sources of uncertainty when measuring a computer and its software?
• Is a computer deterministic? (What is the meaning of deterministic? Do a detour on predictable, deterministic, stochastic and chaotic.)
• What are the pros and cons of taking all the variation out of an environment? Repeatability vs. believability.

Here are some factors that lead to experimental variation:
• System/Component/Molecule/Atom: how granular is the measurement?
• Background activity.
• End effects and incomplete cycle effects. Measurement error.
• Randomness doesn't mean equality (stochastic process).
  Example: Travelling around a Monopoly board.
• Randomness from resource contention (stochastic process).
  Example: Six processes do nothing but read randomly from a single disk. Do they each make approximately the same number of accesses after 1 second? 1 minute? 1 hour? 1 day?
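The six-process question can be explored with a toy simulation. The sketch below is only an illustration under a strong assumption that is not in the slides: every disk access is granted to one of the processes chosen uniformly at random. The function names and the access counts (60, 6,000, 600,000) are hypothetical stand-ins for "short" and "long" runs.

```python
import random

def simulate_disk_contention(num_processes, total_accesses, seed=0):
    """Grant each access to one randomly chosen process; return per-process counts.

    Assumption (not from the slides): the winner of each access is uniform random.
    """
    rng = random.Random(seed)
    counts = [0] * num_processes
    for _ in range(total_accesses):
        counts[rng.randrange(num_processes)] += 1
    return counts

def relative_spread(counts):
    """(max - min) / mean: how unequal the per-process totals are."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean

# Short, medium, and long runs: the absolute gap between processes tends to
# grow, while the relative spread tends to shrink as the run gets longer.
for total in (60, 6_000, 600_000):
    counts = simulate_disk_contention(6, total)
    print(total, counts, round(relative_spread(counts), 4))
```

Under this model the processes are rarely exactly equal, but the percentage difference between them falls as the measurement interval grows, which is one reason longer runs are more repeatable.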


Here are some factors that lead to experimental variation (continued):
• Changing hardware.
  Example: Variations in fullness of a disk, CPU boards, interrupt traffic.
• Tool granularity.
  Example: Our experiment in class.
  Example: You write a program that measures time in seconds. What percentage accuracy can you get from your experiment?
  Example: You want to measure the time required to execute a routine and have available a system call named get_time_of_day. get_time_of_day returns time in units of 1/65535 seconds, or about 15.3 microseconds. The time required to execute the get_time_of_day routine itself is 100 microseconds. What is the shortest routine that can be measured with this tool? How would you do it?

Bottom Line: Never believe a real system number to better than 5 - 10%. Artificial numbers can sometimes be repeated to 1 - 2%, but are susceptible to spurious factors.
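One common way to measure a routine that is shorter than the clock's granularity or overhead is to run it many times between two clock reads and divide. The sketch below assumes Python's `time.perf_counter_ns` as a stand-in for the `get_time_of_day` call in the example; the function names and repetition counts are illustrative choices, not part of the original exercise.

```python
import time

def timer_overhead_ns(samples=10_000):
    """Estimate the cost of one clock read by timing back-to-back reads."""
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    end = time.perf_counter_ns()
    return (end - start) / samples

def time_routine_ns(routine, repetitions=100_000):
    """Average time per call, measured over many calls with only two clock reads.

    The result still includes the loop and call overhead; timing a do-nothing
    routine the same way gives a baseline that can be subtracted.
    """
    start = time.perf_counter_ns()
    for _ in range(repetitions):
        routine()
    end = time.perf_counter_ns()
    return (end - start) / repetitions

def short_routine():
    return sum(range(10))

baseline = time_routine_ns(lambda: None)
measured = time_routine_ns(short_routine)
print("clock read cost (ns):", timer_overhead_ns())
print("routine cost above baseline (ns):", measured - baseline)
```

Because the two clock reads are amortized over all repetitions, their cost and the clock's granularity contribute only a small fraction of the per-call figure.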


2. When you make a measurement, can you believe the result? How sure are you of the result?

Suppose you make several determinations of some measure. If you can answer yes to the following questions, then you can have some faith in your measurement:

• Can you explain why the numbers vary? (“Handwaving” isn't allowed here, but “statistics” may be a valid answer.)

• If variations are greater than 10%, can you figure out what's causing the variation and could you eliminate it if time allowed?

• If the granularity of your tool is greater than the measurement variations, is that acceptable? (Your granularity then becomes your uncertainty.)

But How Much Do You Trust It?

To answer this we need a brief digression into some math. Suppose we've taken a number of measurements $m_1, m_2, \ldots, m_n$.


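Then the mean and standard deviation are:

$$M = E[m] = \frac{1}{n} \sum_{i=1}^{n} m_i$$

$$SD = \left( E[m^2] - E[m]^2 \right)^{0.5} = \left( \frac{\sum_{i=1}^{n} (m_i - M)^2}{n - 1} \right)^{0.5}$$

$$s = \sigma^2 = \text{variance} = SD^2$$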

The first form of the Standard Deviation is the form of the underlying data. The second form is that of the measured data. They are the same for an infinite amount of data and close enough for a large set of numbers. NOTE: Use of these equations assumes that the measurements are independent of each other.
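As a small illustration of the two forms, the sketch below computes both on the same data. The five measurements are hypothetical values chosen only to make the arithmetic easy to follow; the function names are likewise illustrative.

```python
import math

def mean(values):
    """M = E[m] = (1/n) * sum of the measurements."""
    return sum(values) / len(values)

def sd_underlying(values):
    """First form: sqrt(E[m^2] - E[m]^2), which divides by n."""
    m = mean(values)
    mean_of_squares = sum(v * v for v in values) / len(values)
    return math.sqrt(max(0.0, mean_of_squares - m * m))

def sd_measured(values):
    """Second form: sqrt(sum((m_i - M)^2) / (n - 1))."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

# Hypothetical measurements, not taken from the slides.
measurements = [10.0, 12.0, 11.0, 13.0, 9.0]
print(mean(measurements), sd_underlying(measurements), sd_measured(measurements))
```

With only five samples the two forms differ noticeably (the second is larger by a factor of sqrt(n / (n - 1))); the gap closes as n grows.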


Confidence Intervals: We'd like to say “I'm p% sure that with n samples the actual value is within d of the mean of the measurements.” In this section, we develop simple ways to be able to make that statement.

Example of Standard Deviations using Normal Distributions: By quoting the standard deviation of a measurement, we say we're 68% sure the true mean is within a standard deviation of the measured mean. Unfortunately, that 68% depends on having a large number of samples. For smaller numbers, the percentage will change.

[Figure: Normal distribution showing mean and variance.]


Distributions: Student-T

Both the normal and Student-T distributions describe how random data are expected to be distributed. The difference lies in how many samples are taken; the Normal Distribution assumes a very large (like infinite) number of samples, while the Student T is for n (less than infinite) samples. As you see in the examples on subsequent pages, n is used as part of the confidence calculation.

The derivation of the t-distribution was first published in 1908 by William Sealy Gosset, while he worked at a Guinness Brewery in Dublin. He was not allowed to publish under his own name, so the paper was written under the pseudonym Student. The t-test and the associated theory became well-known through the work of R.A. Fisher, who called the distribution "Student's distribution".

[Figure: T distribution showing dependence only on number of samples.]


The Burns Co. is now making laptop computers in its Shelbyville plant. Mr. Burns is too cheap to wreck too many computers in a test, so he's letting his QA guru, Homer, smash five of them. Homer is to record from how high in the air he can drop each laptop on the floor before it won't work anymore.

Mr. Burns wants laptops that can survive a fall from his height of five feet, two inches. The t-test will tell us if we can accept that the average breaking point for a Burns laptop is greater than 5'2", given what we know about the sample.

Let's say the five computers broke at drops of:
• 4 feet, 8 inches
• 5 feet, 1 inch
• 2 feet, 3 inches
• 6 feet, 10 inches
• 7 feet, 1 inch


Using the formula:

        (avg. of sample) - (presumed avg. of larger pop.)
t = ---------------------------------------------------
        (st. dev. of sample) / (sq. root of sample size)

• we get an average breaking height of 62.2 inches, St Dev of 23.4, and a t-score of 0.0191.

• Let's go to the t-score table. There we find the t-value for four degrees of freedom and a 90-percent confidence interval (that's p=.05, since taking .05 off each side of the bell curve leaves us with .90 in the middle). That value is 2.13.

• Since the value we calculated is less than the table's t-value, we cannot accept the assumption that all Burns laptops together have an average breaking drop of over 62 inches, even though our sample's average came in (just) over that.
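The laptop numbers can be checked with a few lines of code. This sketch uses only the standard library; the critical value 2.13 is copied from the t-table lookup described above rather than computed, and the function name is an illustrative choice.

```python
import math
import statistics

# Breaking heights from the example, converted to inches.
drops_in_inches = [4 * 12 + 8, 5 * 12 + 1, 2 * 12 + 3, 6 * 12 + 10, 7 * 12 + 1]
required_height = 5 * 12 + 2   # Mr. Burns' 5'2"
t_critical = 2.13              # table value for 4 degrees of freedom, p = .05

def one_sample_t(sample, presumed_mean):
    """t = (sample mean - presumed mean) / (sample st. dev. / sqrt(n))."""
    n = len(sample)
    return (statistics.mean(sample) - presumed_mean) / (
        statistics.stdev(sample) / math.sqrt(n)
    )

t_score = one_sample_t(drops_in_inches, required_height)
print("mean:", statistics.mean(drops_in_inches))
print("st. dev.:", round(statistics.stdev(drops_in_inches), 1))
print("t-score:", round(t_score, 4))
print("exceeds table value:", t_score > t_critical)
```

`statistics.stdev` divides by n - 1, which matches the sample standard deviation used in the example.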


Example - Use of Student-T:

As part of our ongoing regression test package, monitoring the performance of PRODUCT X, we run tests that tickle a number of code paths. In this table, higher numbers are better: they represent the number of transactions completed, that is, throughput.

RESULTS
Model -->               110     120     130        140        150        160
Product X, Version A    3.25    6.34    9.37 (a)   11.8 (b)   14.3       16.6 (c)
Product X, Version B    3.20    6.30    9.22 (d)   11.8 (e)   14.4 (f)   16.8

Here are the raw numbers which went into making up the averages indicated above:

a       b        c        d       e        f
9.36    11.76    16.59    9.21    11.83    14.40
9.37    11.80    16.59    9.22    11.82    14.29
9.38    11.79    16.58    9.20    11.85    14.43
9.35    11.77    16.63    9.22    11.82    14.36
9.38    11.85    16.66    9.23    11.88    14.44


Example - Use of Student-T: Let's work through in detail the numbers in "f". We find the

mean = (14.40 + 14.29 + 14.43 + 14.36 + 14.44) / 5 = 14.38

SD = SQRT( (.02^2 + .09^2 + .05^2 + .02^2 + .06^2) / 4 ) = SQRT( 0.00375 ) = 0.061

s = variance = SD^2 = 0.00375

Suppose we want to find the confidence interval for 95% confidence. With 5 samples, we have n - 1 = 4 degrees of freedom. Read the table for t(0.975) (there's 2.5% UNconfidence on each side of the curve), giving 2.78.

d = t * SQRT( s / n ) = 2.78 * SQRT( 0.00375 / 5 ) = 2.78 * 0.027 = 0.076

The number is 14.38 +- 0.076 with 95% confidence. (How should you round off this number to accurately reflect your confidence?)
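The same confidence-interval arithmetic can be sketched in code. The t value 2.78 is taken from the table lookup in the text, not computed; the helper name is illustrative. Small differences from the hand calculation come from the rounded deviations used there.

```python
import math
import statistics

f_samples = [14.40, 14.29, 14.43, 14.36, 14.44]  # column "f" of the raw data
t_975_4dof = 2.78  # table value: 95% confidence, 4 degrees of freedom

def confidence_half_width(sample, t_value):
    """d = t * sqrt(s / n), where s is the sample variance."""
    return t_value * math.sqrt(statistics.variance(sample) / len(sample))

mean_f = statistics.mean(f_samples)
half_width = confidence_half_width(f_samples, t_975_4dof)
print(f"{mean_f:.3f} +- {half_width:.3f} with 95% confidence")
```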

Example of Normal Distribution:

 

Suppose we’ve been making measurements as shown in the first column in the Table below. By inserting those numbers in Excel, the spreadsheet will calculate all kinds of things for us automatically.

Measurements (Sorted): 1.9, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.1, 3.2, 3.2, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 4.1, 4.1, etc.

From Excel's Functions:

Mean                  3.90
Standard Deviation    0.95

From Tools->Data_Analysis->Descriptive Statistics (Note: Excel has eliminated the outlying value.):

1.9
Mean                  3.961290323
Standard Error        0.159749631
Median                3.9
Mode                  2.8
Standard Deviation    0.889448301
Sample Variance       0.79111828
Kurtosis              -0.674508941
Skewness              0.419790238
Range                 3.2
Minimum               2.7
Maximum               5.9
95% Confidence        0.32


COMPARING TWO SETS OF MEASUREMENTS:

You’ve just measured the performance of the latest release of your product. The numbers are better than they were when you measured them on the last release. But what does “better” mean? How do you show that, of two sets of numbers with lots of uncertainty in each, one set really is better than the other?

First of all, here’s the easy way. With your two sets, calculate their means and their confidence intervals (the % confidence you use is up to you). Visually plot these results as shown in the three examples below:

[Figure: three plots, labeled A, B, and C, each showing two means with their confidence intervals.]

A. Here the confidence intervals don’t overlap. The results are different from each other.

B. The results are such that the mean of one set is within the confidence interval of the other set. The two sets are NOT different.

C. The confidence intervals overlap but the means are not inside the CI of the other set. Need to do a more complex test.
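The three visual cases can also be written as a small decision function. This is a sketch of the rule of thumb above, assuming each result is summarized as a mean plus a confidence half-width; the function name and the sample numbers are hypothetical.

```python
def compare_intervals(mean_1, half_width_1, mean_2, half_width_2):
    """Classify two results as case A, B, or C from the easy visual test."""
    low_1, high_1 = mean_1 - half_width_1, mean_1 + half_width_1
    low_2, high_2 = mean_2 - half_width_2, mean_2 + half_width_2

    # Case A: the intervals do not overlap at all.
    if high_1 < low_2 or high_2 < low_1:
        return "A: different"
    # Case B: the mean of one set lies inside the other set's interval.
    if low_2 <= mean_1 <= high_2 or low_1 <= mean_2 <= high_1:
        return "B: not different"
    # Case C: intervals overlap, but neither mean is inside the other interval.
    return "C: needs a more complex test"

# Hypothetical (mean, half-width) pairs chosen to hit each case.
print(compare_intervals(10.0, 1.0, 13.0, 1.0))
print(compare_intervals(10.0, 2.0, 11.0, 2.0))
print(compare_intervals(10.0, 2.0, 13.0, 2.0))
```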


When the simple comparison of confidence intervals is not conclusive, use a t-test. In essence this is a way to combine the confidences for the two data sets so as to determine the confidence in the difference between the two sets.

 

Excel can do a t-test as shown in the data below:

Data Set 1    Data Set 2
5.36          19.12
16.57         3.52
0.62          3.38
1.41          2.50
0.64          3.60
7.26          1.74

5.31          5.64      <-- Average =AVERAGE(A3:A8)
6.16          6.64      <-- Standard Deviation =STDEV(A3:A8)

0.465703                <-- Result of the t-test =TTEST(A3:A8,B3:B8,1,1)

The t-test result of about 0.47 says that, if the two sets really came from the same distribution, a difference at least this large would still show up about 47% of the time. (The last two arguments of TTEST select a one-tailed, paired test.)

So for these sets of data, the answer is inconclusive. We can’t tell if there’s a significant difference between the data sets.
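For readers without Excel, the sketch below reproduces the paired, one-tailed t-test on the two data sets using only the standard library. The tail probability is obtained by numerically integrating the Student-t density with Simpson's rule, which is a substitute technique chosen here for self-containment, not a description of how Excel computes TTEST. Function names are illustrative.

```python
import math
import statistics

set_1 = [5.36, 16.57, 0.62, 1.41, 0.64, 7.26]
set_2 = [19.12, 3.52, 3.38, 2.50, 3.60, 1.74]

def paired_t(a, b):
    """t statistic of the pairwise differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

def t_pdf(t, dof):
    """Density of the Student-t distribution with dof degrees of freedom."""
    coeff = math.gamma((dof + 1) / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return coeff * (1 + t * t / dof) ** (-(dof + 1) / 2)

def one_tailed_p(t, dof, steps=1000):
    """P(T >= |t|): 0.5 minus the area under the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 - total * h / 3

t_score = paired_t(set_1, set_2)
p_value = one_tailed_p(t_score, len(set_1) - 1)
print("t:", round(t_score, 4), "one-tailed p:", round(p_value, 4))
```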


CHECKING A SERIES OF VALUES:

We'd like to know if a series of values matches a predicted distribution. In other words, we have a theory of what an experiment should give - do the results in fact match the theory? Chi-Squared tables are available for this purpose.

 

Calculate Chi-Squared:

$$\chi^2 = \sum_{n} \frac{(O_n - E_n)^2}{E_n}$$

where O = Observed and E = Expected.


Example: Suppose a random number generator is invoked 200 times and produces values shown in this table:

Range        Number of Values
0.0 - 0.1    23
0.1 - 0.2    22
0.2 - 0.3    19
0.3 - 0.4    15
0.4 - 0.5    22
0.5 - 0.6    21
0.6 - 0.7    20
0.7 - 0.8    16
0.8 - 0.9    21
0.9 - 1.0    21

Plugging this into the equation (with E = 200/10 = 20 expected values in each range) gives:

$$\chi^2 = \frac{3^2 + 2^2 + 1^2 + 5^2 + 2^2 + 1^2 + 0^2 + 4^2 + 1^2 + 1^2}{20} = \frac{9 + 4 + 1 + 25 + 4 + 1 + 0 + 16 + 1 + 1}{20} = \frac{62}{20} = 3.1$$

There are nine degrees of freedom. In a chi-squared distribution table, look along the 9-degree row and find that 3.1 is between 3.325 (0.050) and 2.700 (0.025), interpolated as approximately 0.040. In other words, a truly uniform generator would give a chi-squared value this small or smaller only about 4% of the time, and a larger value about 96% of the time. The observed counts therefore give no reason to reject the hypothesis that the distribution is uniform.

Exercise: Do this same calculation using the Chi Squared Function in Excel.
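The chi-squared statistic for the random number generator is easy to reproduce. This sketch computes only the statistic and the degrees of freedom; the probability lookup is still done against a table, as in the text. The function name is an illustrative choice.

```python
# Observed counts per range, from the table in the example.
observed = [23, 22, 19, 15, 22, 21, 20, 16, 21, 21]

def chi_squared(observed_counts, expected_counts):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))

# A uniform generator should put the same number of values in every range.
expected = [sum(observed) / len(observed)] * len(observed)
statistic = chi_squared(observed, expected)
degrees_of_freedom = len(observed) - 1
print("chi-squared:", round(statistic, 3), "degrees of freedom:", degrees_of_freedom)
```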


3. How should you state the result of an experiment? How do you reflect your belief in its accuracy?

Pat has developed a new product, "rabbit", about which she wishes to determine performance. There is special interest in comparing the new product, rabbit, to the old product, turtle, since the product was rewritten for performance reasons. (Pat had used Performance Engineering techniques and thus knew that rabbit was "about twice as fast" as turtle.) The measurements showed:

 

Performance Comparisons

Product    Transactions / second    Seconds / transaction    Seconds to process transaction
Turtle     30                       0.0333                   3
Rabbit     60                       0.0167                   1

Which of the following statements reflect the performance comparison of rabbit and turtle?

o Rabbit is 100% faster than turtle.
o Rabbit is twice as fast as turtle.
o Rabbit takes 1/2 as long as turtle.
o Rabbit takes 1/3 as long as turtle.
o Rabbit takes 100% less time than turtle.
o Rabbit takes 200% less time than turtle.
o Turtle is 50% as fast as rabbit.
o Turtle is 50% slower than rabbit.
o Turtle takes 200% longer than rabbit.
o Turtle takes 300% longer than rabbit.
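One way to test each statement is to compute the ratios and percentage changes directly from the table. The sketch below does that arithmetic; which baseline a phrase like "faster than" implies is a matter of convention, so the helper `percent_change` (a hypothetical name) makes the baseline explicit.

```python
turtle_tps, rabbit_tps = 30, 60          # transactions / second
turtle_seconds, rabbit_seconds = 3, 1    # seconds to process a transaction

def percent_change(new, old):
    """Change from old to new, as a percentage of old (the baseline)."""
    return 100.0 * (new - old) / old

throughput_ratio = rabbit_tps / turtle_tps                          # rabbit vs. turtle
rabbit_faster_pct = percent_change(rabbit_tps, turtle_tps)          # baseline: turtle
turtle_slower_pct = percent_change(turtle_tps, rabbit_tps)          # baseline: rabbit
time_ratio = rabbit_seconds / turtle_seconds                        # rabbit vs. turtle
rabbit_less_time_pct = percent_change(rabbit_seconds, turtle_seconds)
turtle_longer_pct = percent_change(turtle_seconds, rabbit_seconds)

print("throughput ratio:", throughput_ratio)
print("rabbit throughput change (%):", rabbit_faster_pct)
print("turtle throughput change (%):", turtle_slower_pct)
print("processing time ratio:", round(time_ratio, 3))
print("rabbit processing time change (%):", round(rabbit_less_time_pct, 1))
print("turtle processing time change (%):", turtle_longer_pct)
```

Note that the throughput columns and the "seconds to process" column give different ratios (2x versus 3x), so the metric has to be named along with the percentage.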

• The guiding principle in stating a result is to keep it simple.

• State the accuracy using the same methods we've just discussed. Use Means, Standard Deviations, and Confidence Intervals.

• Include the number of decimal places that reflects the accuracy of your answer. Avoid things like 7.365 with standard deviation of 2.

• It goes without saying that reflecting your belief in the accuracy presupposes you’ve done the experiment correctly. Some simple guidelines:

A. In my experience, you always do the experiment wrong the first five times. Through experience you learn to look critically at your result to see if it makes sense. If not, then you go figure out what went wrong. Usually it’s some parameter that wasn’t controlled.

B. Only vary one parameter at a time.

C. Watch out for interactions between parameters. Changing one parameter can cause some other parameter to change as well.

D. Don’t do too many or too few experiments.

E. Get someone else to check your results – by the time you finish a measurement you have too much invested in it and are very likely to miss something obvious.
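The advice about decimal places can be mechanized. The sketch below is one possible convention, assumed here rather than taken from the slides: round the uncertainty to one significant digit and round the value to the same decimal place. The function name `format_result` is hypothetical.

```python
import math

def format_result(value, uncertainty):
    """Report value +- uncertainty with digits that match the uncertainty.

    Assumed convention: keep one significant digit of the uncertainty and
    round the value to that same decimal place.
    """
    exponent = math.floor(math.log10(abs(uncertainty)))
    decimals = max(0, -exponent)
    rounded_value = round(value, -exponent)
    rounded_uncertainty = round(uncertainty, -exponent)
    return f"{rounded_value:.{decimals}f} +- {rounded_uncertainty:.{decimals}f}"

print(format_result(7.365, 2))       # the "avoid" example from the bullet above
print(format_result(14.384, 0.076))  # the Student-T example
```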

4. Can one number represent the performance of a product?

Answer: No, but you'll be asked to do it anyway.

 

Preparation For This Section – some definitions:

 

Mean or Expected Value:

$$\text{Mean} = E[x] = \sum_{i=1}^{n} x_i \, p(x_i) = \int x f(x) \, dx$$

Median: That value for which there’s an equal probability of being above it and below it.

Mode: The most likely value. The value with the highest probability.

[Figure: a distribution with the mode, median, and mean marked.]


Example: The Performance Group at the XYZ Corporation has developed a synthetic workload that they feel reflects the kind of computer work done by XYZ's "typical" customer. This workload is composed of various programs driven by a remote terminal emulator (RTE). The RTE can both initiate programs and log when the programs complete.

This workload was run last week with results shown in the table:

Results of XYZ Corp Performance Benchmark

Transaction Type            Time to complete transaction
Edit a file                 14 sec
Compile and link a file     143 sec
Run compiled program        17 sec
200 disk reads              6 sec
1000 process reschedules    3 sec
100 physical page faults    10 sec
Send and receive mail       57 sec
TOTAL TIME                  250 sec

NOTE: Because all these programs are started simultaneously, there is contention for resources.

The time reported to management was 250 seconds.
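To see how much a single number hides, it can help to compute a few candidate summaries of the same table. The sketch below only does arithmetic on the reported times; it does not model the resource contention mentioned in the note, and the variable names are illustrative.

```python
import statistics

# Times from the benchmark table, in seconds.
times_sec = {
    "Edit a file": 14,
    "Compile and link a file": 143,
    "Run compiled program": 17,
    "200 disk reads": 6,
    "1000 process reschedules": 3,
    "100 physical page faults": 10,
    "Send and receive mail": 57,
}

total = sum(times_sec.values())
mean_time = statistics.mean(times_sec.values())
median_time = statistics.median(times_sec.values())
slowest = max(times_sec, key=times_sec.get)
slowest_share = times_sec[slowest] / total

print("total:", total, "mean:", round(mean_time, 1), "median:", median_time)
print(f"{slowest} accounts for {slowest_share:.0%} of the total")
```

The total, the mean, and the median are all "one number", yet they differ widely, and the total is dominated by a single transaction type.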


Example:

 

Questions:

 • Is this a good performance indicator?

• If yes, then sit and relax a few minutes.

• If no, how would you express the results of these tests? How might you revamp the tests?

What guidelines can be derived for producing one-number performance metrics?

5. When have you measured enough?

This is really two questions:

 

a) When have you measured enough to get the accuracy of answer that management expects at this time?

This is a matter of setting the correct expectations before you start. Many times the answer is in response to a “what if” question, and you can get the appropriate accuracy in one hour. Other times you’ll need weeks of design/setup/measurement/analysis to get the expected accuracy.

 

NOTE: Only a small amount of the total experimental time is in the measurement. Most time goes for design and elimination of unwanted factors. So this question could be stated as “How complicated should an experiment be?”

b) When have you measured enough to get the degree of accuracy you expected for the experiment?

 

You can use the confidence measures we discussed before. In essence, confidence is

$$\text{Confidence} \propto \frac{1}{\sqrt{n}}$$


 

The relationship between the number of required samples and experimental parameters is:

$$n = \left( \frac{100 \, z \, s}{r \, x_{mean}} \right)^2$$

Here
n = number of samples required
z = the number of standard deviations corresponding to the desired confidence
s = Standard Deviation
r = the desired accuracy in percent
x_mean = the mean of the measurement

NOTE: See that the more accuracy you want (smaller r), the more measurements you need.
NOTE: If your numbers all come out the same, stop. Measurement uncertainty is not the largest part of the error in your metric.
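A short sketch of the sample-size relationship follows, assuming the squared form n = (100 z s / (r x_mean))^2 and rounding up to a whole number of samples. The function name is hypothetical, and z = 1.96 (95% confidence on a normal distribution) and r = 0.5% are assumed inputs; the standard deviation and mean come from the "f" data earlier in this section.

```python
import math

def required_samples(z, s, r_percent, x_mean):
    """n = (100 * z * s / (r * x_mean))^2, rounded up to a whole sample."""
    return math.ceil((100.0 * z * s / (r_percent * x_mean)) ** 2)

# Round numbers that are easy to check by hand: (100*2*10 / (5*100))^2 = 16.
print(required_samples(z=2, s=10, r_percent=5, x_mean=100))

# The "f" measurements (SD 0.061, mean 14.38) at an assumed 0.5% accuracy.
print(required_samples(z=1.96, s=0.061, r_percent=0.5, x_mean=14.38))

# Halving r (asking for twice the accuracy) roughly quadruples n.
print(required_samples(z=2, s=10, r_percent=2.5, x_mean=100))
```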


6. Figures don't lie, but liars figure. How do you extrapolate from what you know to what you'd like to know?

Often we need a result that is unmeasurable, or would require eons to determine. Is it legal to guess?

Answer: Sure - as long as you also estimate the uncertainty of your guess.

Here are a few practice situations that will help you improve your powers of estimation. Remember, there is no RIGHT answer.

1. Estimate how many people will come to this class next week. More important than the answer are the assumptions you use for your answer.

2. Approximately how many cars were in the parking lot outside this building when you came in tonight? How many are there now?

3. What is the probability that you will be killed in a car accident?

4. I recently saw a lawn service truck that had printed on its side “Over 7 trillion blades cut.” Is this a reasonable claim for them to make?


5. Here is a comic strip version of an approximation problem. It contains a model, and then an estimation of the required parameters in the model. [Figure: comic strip.]

6. But be careful; sometimes the model doesn’t work.

7. How do you know what tools to use?

• We'll do a lot more on tools later, but for right now, the best answer is to measure the simplest way possible.

• Usually tools are easier to come by than environments.

• Make sure the granularity of the tool is finer than the required uncertainty.

8. Is everything in a computer measurable?

• Some electrical signals may not be available.

• The place to make a measurement may be in code not under your control.

• We have a very poor sense of typical/normal. We don't know what our users typically do with the machine.

• The measurement may perturb the system and destroy what we wanted to know.

• Available measurements may not relate to what I want to know. For instance, which disk blocks are being accessed by each of the processes on a system.

9. How do you know what to measure?

• This is the hardest question of all. To know what to measure you must have a picture or model of your product. Most of the rest of this course will deal with various kinds of pictures.

• Often an adequate model is a causal one: first procedure A executes; this causes hardware B to produce an effect; then interrupt code handles the hardware result; etc.

• Things to keep in mind include:

• Interaction between variables – do you expect a change in X to produce a change in Y? You should have a guess as to the result before you make the measurement.

• Changing one variable at a time, and measuring it at 10 different values, can be extremely wasteful and time consuming.

• Change only the variables that matter. If you don’t know, try changing something, just once, and see what happens.

• Example: You wish to design an experiment that will measure the time required to execute a program on various Intel processors. What parameters would you need to vary to try different processors and configurations? DESIGN THE TESTS TO BE RUN.

10. Should you always know the results of a measurement before you make it?

• You should always have a guess so you can tell if your result is way off. That guess should be the result of a model/theory of how the mechanism you are measuring is working.

11. How do you figure out dependencies; how does one variable depend on another?

This whole topic is something called linear regression. It says that if you can plot two variables, x and y, and there’s a simple relationship between the variables, then you can define the dependency between them.

A linear regression means that we can fit a curve of the form y = a + bx. The quality of the fit (error) can be defined as the sum of the y distances between the fitting-curve and the experimental data.  

[Figure: three example fits, labeled "Good SIMPLE Model", "Good Complicated Model", and "BAD Model".]


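So the “best fit” is defined to be the curve that minimizes the sum of errors squared,

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$

with the constraint that

$$\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a - b x_i) = 0$$

When you solve this, you can immediately determine the values of a and b from the experimental data:

$$b = \frac{\sum xy - n \bar{x} \bar{y}}{\sum x^2 - n \bar{x}^2} \quad \text{and} \quad a = \bar{y} - b \bar{x}$$

with

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{and} \quad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$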


Let’s use as an example the following pairs of data: (14,2), (16,5), (27,7), (42,9), (39,10), (50,13), (83,20).

We COULD use the equation above to determine a and b. Or, Excel can be used in the same way and gives the same results.

EXAMPLE OF LINEAR REGRESSION - Using LINEST(B3:B9, A3:A9, True, True)

X     Y
14    2     0.25449   Slope
16    5     0.036     Standard error of the slope (second row of the LINEST output)
27    7
42    9
39    10
50    13

The slope of 0.25449 corresponds to the six pairs tabulated here, for which the least-squares intercept is about -0.31. The equation in this case is therefore Y = -0.31 + 0.25449 X.
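The closed-form expressions for a and b can be applied directly to the example pairs. The sketch below is a plain-Python version of those formulas (the function name is illustrative); it fits both the six tabulated pairs and all seven pairs, since the two give somewhat different lines.

```python
def least_squares(points):
    """Return (a, b) for y = a + b*x using the closed-form least-squares solution."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    b = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar * x_bar)
    a = y_bar - b * x_bar
    return a, b

pairs = [(14, 2), (16, 5), (27, 7), (42, 9), (39, 10), (50, 13), (83, 20)]

a_six, b_six = least_squares(pairs[:6])   # the six pairs shown in the table
a_all, b_all = least_squares(pairs)       # all seven pairs, including (83, 20)

print("six pairs:   a =", round(a_six, 4), " b =", round(b_six, 5))
print("seven pairs: a =", round(a_all, 4), " b =", round(b_all, 5))
```

Comparing the two fits is a quick check on how sensitive the line is to a single point at the far end of the x range.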


Also, if you know what you’re doing, you can use “Tools->Data Analysis->Regression” and Excel will give you all kinds of statistics evaluating the goodness of fit of the straight line. (Note that you may need to use Tools->Options to bring in the analysis tools.)

 If the model you’re expecting isn’t a straight line, then you’ll need to do more sophisticated analysis, but the method follows in the same way as we’ve just done.


12. So after all this talk about the details of measurement, how do you actually design an experiment?

We’re going to follow through these steps and recommend that you use them in your experiments. (These are originally due to Jain.)

1. State Goals and Define The System

a. What is it you hope to accomplish? Why is it worth doing?

b. What is the hardware and software (the system) that you will use to achieve these goals?

2. List Services and Outcomes

a. For the system you’ve chosen, what are the services provided? For instance, if you’re studying a disk subsystem, it can absorb data (write) or present you with data (read) or give an error.

b. By outcomes here are meant very high level statements. The outcome of a disk read is DATA. No performance figure or quantifiable answer is expected here.

3. Select Metrics

a. What are the criteria you want to use to compare performance? This is still not a quantifiable value, but simply what it is you will measure. This could be a speed metric, or an accuracy metric.


4. List Parameters

a. What parameters affect performance? If you’re measuring disks, then the model of disk determines its seek time, its rotational latency, etc. This is a system parameter.

b.       The kind of test you use, determined by the workload you use, can also define parameters. These might be requested IO’s per second, random or sequential blocks, etc.

 

5. Select Factors to Study

a.       A factor is a parameter that you vary.

b.      So, for the parameters you’ve just listed – all of which you COULD vary, which ones will you actually modify during the course of the experiment?

6. Select Evaluation Technique

a.      You could do this experiment by modeling. You would mathematically represent the system under study and modify parameters in this model.

b.      You could do this experiment by simulation. You would write a program that represented the system. Again you could modify parameters and look at results.

c.      You could do this experiment by measurement. Here you have a real system, drive it with some kind of workload, and get the results.

d.     In practice, industry tends to value only measurements. It’s generally cheaper to use the real system than it is to build a mathematical model or a simulation of it.
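To make the first two techniques concrete, here is a minimal sketch that estimates the average rotational latency of a disk both ways. The 7200 RPM spindle speed is an assumed value, and the simulation assumes the target sector is equally likely to be anywhere on the track. A measurement would need a real disk and is not shown.

```python
import random

RPM = 7200                        # assumed spindle speed
ROTATION_MS = 60_000 / RPM        # time for one full rotation, in ms

# Modeling: on average the head waits half a rotation for the sector.
model_latency_ms = ROTATION_MS / 2

# Simulation: draw many random waits between zero and one full rotation
# and average them.
rng = random.Random(1)            # fixed seed so runs are repeatable
samples = [rng.uniform(0, ROTATION_MS) for _ in range(100_000)]
simulated_latency_ms = sum(samples) / len(samples)

print(f"model:      {model_latency_ms:.3f} ms")
print(f"simulation: {simulated_latency_ms:.3f} ms")
```

The two numbers should agree closely. If they did not, either the model or the simulation would contain a mistake.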


7. Select Workload

a.       How will you drive the system under test?

b.     The answer depends on the evaluation technique. With a simulation, you may have collected some data that you can feed into your program.

c.      For a measurement evaluation, you will have some kind of software that drives the system you’re testing. You will need to find a workload that tickles the parameter of interest to you.
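A workload driver for the disk example might start from something like the sketch below, which only generates the sequence of block numbers to request. The function name `make_workload` and its arguments are hypothetical, and issuing the requests to a real device is left out.

```python
import random

def make_workload(n_requests, n_blocks, pattern, seed=0):
    """Return the block numbers to request, in order."""
    if pattern == "sequential":
        # Walk through the blocks in order, wrapping around at the end.
        return [i % n_blocks for i in range(n_requests)]
    if pattern == "random":
        rng = random.Random(seed)     # seeded so the workload is repeatable
        return [rng.randrange(n_blocks) for _ in range(n_requests)]
    raise ValueError(f"unknown pattern: {pattern}")

seq = make_workload(5, 1000, "sequential")
rnd = make_workload(5, 1000, "random")
print("sequential:", seq)
print("random:    ", rnd)
```

Seeding the generator matters: a workload that changes between runs adds variation that has nothing to do with the system under test.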

 

8. Design Experiments

a.       What experiments will you do to collect the data you want?

b.     This means selecting the actual values (levels) each factor will take. If one of your factors is the type/model of disk, then how many different disks will you use?
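One common way to lay out the experiments is a full factorial design, in which every level of every factor is combined with every level of the others. The sketch below enumerates such a design. The factor names, levels, and repetition count are hypothetical.

```python
from itertools import product

# Hypothetical factors and the levels chosen for each.
factors = {
    "disk_model": ["model_A", "model_B", "model_C"],
    "access_pattern": ["random", "sequential"],
    "request_rate_iops": [100, 1000],
}
REPETITIONS = 3     # assumed number of repeated runs per combination

# Full factorial design: one experiment per combination of levels.
names = list(factors)
experiments = [dict(zip(names, levels)) for levels in product(*factors.values())]
total_runs = len(experiments) * REPETITIONS

print(len(experiments), "experiments,", total_runs, "runs in total")
for experiment in experiments[:3]:
    print(experiment)
```

The count grows multiplicatively with each added factor or level, which is why the choice of factors in step 5 has to be made with the available time in mind.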


9. Make A Guess What The Result Will Be

a.     Many people take a measurement and say “Oh, that must be right.” The only sound basis for that statement is to work out what should happen before you measure, and then see whether you get what you expected.

b.       If you get what’s expected, then you can be confident that:

      Your picture of how the system is working is sound.

      You did your measurements correctly.

c.       If you DON’T get what’s expected, then at least one of the following is true:

      You didn’t understand the system and so you need to form a new picture.

      You did the measurement wrong – there’s some experimental error.
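The habit of guessing first can be built into the measurement script itself. This is a minimal sketch: the helper `check_against_guess`, the 20% tolerance, and all the numbers are assumptions made for illustration.

```python
def check_against_guess(predicted, measured, tolerance=0.20):
    """Return (relative_error, within_tolerance) for a prediction."""
    relative_error = abs(measured - predicted) / abs(predicted)
    return relative_error, relative_error <= tolerance

predicted_ms = 4.17     # the guess, written down BEFORE measuring

ok_err, ok = check_against_guess(predicted_ms, 4.40)     # hypothetical result
bad_err, bad = check_against_guess(predicted_ms, 9.80)   # hypothetical result

print(f"4.40 ms: error {ok_err:.1%}, as expected: {ok}")
print(f"9.80 ms: error {bad_err:.1%}, as expected: {bad}")
```

A result outside the tolerance is not a failure. It is the signal to revisit either the picture of the system or the measurement procedure.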

 

10. Conduct the Measurement, Analyze and Interpret Data

a.       Now actually do the measurement, simulation, or whatever you’ve designed.

b.       It’s rare that you just get a number and you’re all done.

c.       There is always interpretation to be done:

      What does the data mean?

      Is this the result I would expect?

d.       There are always statistics to be done:

      Is the data valid?

      What is the uncertainty in the measurements?
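One way to put a number on the uncertainty is a confidence interval around the sample mean. The sketch below uses the normal approximation, which is an assumption that is reasonable only for larger samples (roughly 30 or more). For a sample as small as the one shown, a Student's t quantile would give a wider and more honest interval. The response times are made-up data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, confidence=0.95):
    """Return (sample mean, half-width) using the normal approximation."""
    n = len(samples)
    m = mean(samples)
    s = stdev(samples)                              # sample std dev (n - 1)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, z * s / sqrt(n)

# Hypothetical response times in milliseconds.
response_ms = [10, 12, 11, 13, 9, 11, 10, 12]
m, half_width = confidence_interval(response_ms)
print(f"mean = {m:.2f} ms, 95% interval = +/- {half_width:.2f} ms")
```

Reporting the interval together with the mean tells the reader how much weight a difference between two results can bear.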


11. Figure Out What You Want To Talk About

a.     Know your audience. Are they management types (who want only an overview) or technical people (who want all the details)? Proper targeting is important!

b.      From all the data you have, choose the pieces that are most relevant. Don’t forget to make it interesting!

 

12. Present Final Results

a.       As you know, in the real world, it’s not what you do, it’s what others think you do.

b.       Presentation is everything.


BONUS: There are various terms we never got around to formally defining. Here they are.

Definitions of Measured Data

These are some basic terms to define so we have a common lingo.

Independent Events Two events are independent if the occurrence of one has no effect on the probability of the other.

Random Variate A variable that can take on one of a particular set of values with a specified probability.

Cumulative Distribution Function The CDF maps a given value a to the probability that the variable x has a value equal to or less than a.

$F_x(a) = P(x \le a)$

Probability Density Function The derivative of the CDF.

$f(x) = \frac{dF(x)}{dx}$

Integrating the PDF gives the probability of x being in the interval (x1, x2).

$P(x_1 < x \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$
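The relationship between the CDF and the PDF can be checked numerically. The sketch below assumes an exponential distribution with rate 2.0, chosen only because its CDF has a simple closed form. It computes the probability of the interval (x1, x2) once as a difference of CDF values and once by integrating the PDF.

```python
from math import exp

RATE = 2.0    # assumed rate parameter of the exponential distribution

def pdf(x):
    """Probability density f(x)."""
    return RATE * exp(-RATE * x) if x >= 0 else 0.0

def cdf(a):
    """Cumulative distribution F(a) = P(x <= a)."""
    return 1.0 - exp(-RATE * a) if a >= 0 else 0.0

def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

x1, x2 = 0.5, 1.5
by_cdf = cdf(x2) - cdf(x1)        # F(x2) - F(x1)
by_pdf = integrate(pdf, x1, x2)   # integral of f(x) from x1 to x2

print(f"from the CDF: {by_cdf:.6f}")
print(f"from the PDF: {by_pdf:.6f}")
```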


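Probability Mass Function The equivalent of the PDF but used for discrete variables. Here $p_i$ is the probability that x takes the value $x_i$.

Mean or Expected Value

$\mu = E(x) = \sum_{i=1}^{n} p_i x_i = \int_{-\infty}^{\infty} x f(x)\,dx$

Variance A measure of the deviation of the values from the mean.

$Var(x) = \sigma^2 = E[(x - \mu)^2] = \sum_{i=1}^{n} p_i (x_i - \mu)^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$

In both formulas the sum applies to discrete variables and the integral to continuous ones.

Standard Deviation This is another measure of the deviation of values. Represented by $\sigma$, the square root of the variance.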
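As a small worked example of the discrete forms of the mean and variance, the sketch below uses a fair six-sided die. The die is an assumed example, not one from the lecture. Exact fractions are used so the results can be compared with hand calculation.

```python
from fractions import Fraction
from math import sqrt

# Probability mass function of a fair die: each value has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# Mean: sum of p_i * x_i.
mean = sum(p * x for p, x in zip(pmf, values))

# Variance: sum of p_i * (x_i - mean)^2.
variance = sum(p * (x - mean) ** 2 for p, x in zip(pmf, values))

# Standard deviation: square root of the variance.
std_dev = sqrt(float(variance))

print("mean     =", mean)
print("variance =", variance)
print("std dev  =", round(std_dev, 4))
```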


Covariance Given two random variables x and y with means $\mu_x$ and $\mu_y$, their covariance is

$Cov(x, y) = \sigma_{xy}^2 = E[(x - \mu_x)(y - \mu_y)] = E(xy) - E(x)E(y)$

For independent variables, the covariance is 0.

Correlation Coefficient Another measure of how two variables are interdependent.

$Correlation(x, y) = \rho_{xy} = \frac{\sigma_{xy}^2}{\sigma_x \sigma_y}$
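The covariance and correlation coefficient can be computed directly from paired observations. The sketch below treats each observation as equally likely, so it computes population values, and uses made-up data in which the response time rises exactly linearly with the request rate.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance E(xy) - E(x)E(y) of equally likely pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - mean_x * mean_y

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    # covariance(v, v) is the variance of v.
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical measurements: request rate and observed response time.
request_rate = [100, 200, 300, 400, 500]
response_ms = [5.0, 7.0, 9.0, 11.0, 13.0]

cov = covariance(request_rate, response_ms)
corr = correlation(request_rate, response_ms)
print("covariance  =", cov)
print("correlation =", corr)
```

The covariance depends on the units of both variables, while the correlation coefficient does not, which makes the latter easier to compare across experiments.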

Median That value for which there’s an equal probability of being above it and below it.

Mode The most likely value. The value with the highest probability.

Normal Distribution The most commonly used distribution. The sum of a large number of independent observations from any distribution with finite variance is approximately normally distributed (the central limit theorem).
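This behaviour of sums is easy to see in a small simulation. The sketch below adds up 48 uniform random numbers many times over. The counts and the seed are arbitrary choices. It then checks two things a normal distribution would predict: the mean and standard deviation of the sums, and the fraction of sums falling within one standard deviation of the mean (about 68%).

```python
import random
from statistics import mean, stdev

rng = random.Random(42)    # fixed seed so runs are repeatable
N_TERMS = 48               # observations added together in each sum
N_SUMS = 20_000            # number of sums generated

# Each term is uniform on [0, 1): mean 1/2, variance 1/12.
sums = [sum(rng.random() for _ in range(N_TERMS)) for _ in range(N_SUMS)]

expected_mean = N_TERMS * 0.5            # 24.0
expected_std = (N_TERMS / 12) ** 0.5     # 2.0

m, s = mean(sums), stdev(sums)
within_one_sigma = sum(abs(x - m) <= s for x in sums) / N_SUMS

print(f"mean = {m:.3f} (expected {expected_mean})")
print(f"std  = {s:.3f} (expected {expected_std})")
print(f"within one std dev: {within_one_sigma:.1%}")
```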