Lecture 8 Page 1 CS 239, Spring 2007 Experiment Design CS 239 Experimental Methodologies for System Software Peter Reiher May 1, 2007



Designing Experiments

• Introduction

• A digression into multilinear regression

• Types of experiment designs

• Simple experimental design

• Full factorial design


Introduction To Experiment Design

• You know your metrics

• You know your factors

• You know your levels

• You’ve got your instrumentation and test loads

• Now what?


Goals in Experiment Design

• Obtain maximum information

• With minimum work

– Typically meaning minimum number of experiments

• More experiments aren’t better if you’re the one who has to perform them

• Well-designed experiments are also easier to analyze


Basic Experiment Design Terminology

• Response variable – quantitative description of outcome of experiment

• Factors – Variables that affect the response variable (aka predictors)

• Levels – Values a factor can assume


More Terminology

• Primary factors – the most important ones (i.e., those you test)

• Secondary factors – the less important ones (i.e., those you don’t test)

• Replication – one run of an experiment

– Multiple replications for statistical validation


What Is Experiment Design?

• Specification of:

– Number of experiments

– Factor level combinations for experiments

– Number of replications of each experiment


An Example

• Consider Time Warp

• The response variable is run time

• The factors are number of nodes used, dynamic load management, lazy vs. aggressive cancellation, maybe others

• The primary factors will be number of nodes and use of dynamic load management


Continuing the Example

• What are the levels?

– For number of nodes, if we have a 64-node parallel processor, could choose any integer between 1 and 64

• Dynamic load management is either on or off


So What Might Be the Experiment Design?

• 6 experiments

• All combinations of:

– Nodes chosen from (8,32,64)

– Dynamic load management (off,on)

• Three replications of each experiment
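The design above can be written out mechanically. The following is a minimal sketch that only lists the runs; the names `NODE_LEVELS`, `DLM_LEVELS`, and `REPLICATIONS` are hypothetical labels for the levels given on the slide, and no simulation is actually executed:

```python
from itertools import product

NODE_LEVELS = (8, 32, 64)     # levels of the "number of nodes" factor
DLM_LEVELS = ("off", "on")    # levels of the dynamic load management factor
REPLICATIONS = 3              # replications of each experiment

# One experiment per combination of factor levels.
experiments = list(product(NODE_LEVELS, DLM_LEVELS))

# Every experiment is repeated REPLICATIONS times.
runs = [(nodes, dlm, rep)
        for nodes, dlm in experiments
        for rep in range(1, REPLICATIONS + 1)]

print(len(experiments), "experiments,", len(runs), "runs in total")
```

Listing the runs up front makes the cost of the design visible before any machine time is spent.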


Interactions

• If there is more than one primary factor, factors might interact

• Might be relationships between how various factors affect performance

• E.g., in Time Warp case, perhaps dynamic load management is more effective with more nodes

• Vital to understand factor interactions

– By end of experiments, if not before

• Clearly complicates experiment design


Basic Problem in Designing Experiments

• You have chosen some number of factors

• They may or may not interact

• How can you design an experiment that captures the full range of the levels?

– With minimum amount of work

• Which combination or combinations of the levels of the factors do you measure?


Common Mistakes in Experimentation

• Ignoring experimental error

• Uncontrolled parameters

• Not isolating effects of different factors

• One-factor-at-a-time experiment designs

• Interactions ignored

• Designs requiring too many experiments


A Digression Into Multilinear Regression

• Like linear regression, but assumes multiple control variables

• Solution generally uses simple matrix math

• Introduces an issue not present in simple linear regression: multicollinearity


Multiple Linear Regression

• Models with more than one predictor variable

• But each predictor variable has a linear relationship to the response variable

• Conceptually, plotting a regression line in n-dimensional space, instead of 2-dimensional


Basic Multiple Linear Regression Formula

Response y is a function of k predictor variables x_1, x_2, ..., x_k:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k + e$$


A Multiple Linear Regression Model

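Given a sample of n observations

$$\{(x_{11}, x_{21}, \ldots, x_{k1}, y_1), \ldots, (x_{1n}, x_{2n}, \ldots, x_{kn}, y_n)\}$$

the model consists of n equations:

$$\begin{aligned}
y_1 &= b_0 + b_1 x_{11} + b_2 x_{21} + \cdots + b_k x_{k1} + e_1 \\
y_2 &= b_0 + b_1 x_{12} + b_2 x_{22} + \cdots + b_k x_{k2} + e_2 \\
&\;\;\vdots \\
y_n &= b_0 + b_1 x_{1n} + b_2 x_{2n} + \cdots + b_k x_{kn} + e_n
\end{aligned}$$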


Looks Like It’s Matrix Arithmetic Time

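$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}$$

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{11} & x_{21} & \cdots & x_{k1} \\
1 & x_{12} & x_{22} & \cdots & x_{k2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n} & x_{2n} & \cdots & x_{kn}
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_k \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$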


Analysis of Multiple Linear Regression

• The analysis formulas are listed in Box 15.1 of Jain

• Not terribly important (for our purposes) how they were derived

– This isn’t a class on statistics

• But you need to know how to use them

• Mostly matrix analogs to simple linear regression results


Multiple Linear Regression Example

• Internet Movie Database keeps popularity ratings of movies (in numerical form)

• Postulate popularity of Academy Award winning films is based on two factors

– Age

– Running time

• Produce a regression

rating = b0 + b1(age) + b2(length)


Some Sample Data

• I selected 35 Oscar winners spanning 78 years

• Used IMDB data to get run times and user ratings

– As of 4/29/07

• User rating is our y

• Run time and age are our control variables

• What’s the regression, and how good is it?

• Complete data in spreadsheet posted on class web page


Now For Some Tedious Matrix Arithmetic

• We need to calculate X, X^T, X^T X, (X^T X)^{-1}, and X^T y

Because

$$\mathbf{b} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$

I shall spare you the details, but

b = (7.34, -.009, .005)

Meaning the regression predicts

rating = 7.34 - .009*age + .005*length
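The matrix arithmetic can be sketched in a few lines of plain Python. This is an illustrative sketch only: the helper names (`transpose`, `matmul`, `solve`, `fit`) are hypothetical, the data rows are made up and are NOT the lecture's IMDB sample, and the code solves the normal equations (X^T X) b = X^T y by elimination, which gives the same b as forming the inverse explicitly:

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def solve(a, rhs):
    """Solve a * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit(predictors, y):
    """Return b satisfying the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(row) for row in predictors]  # leading 1 multiplies b0
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * v for x, v in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)

# Hypothetical (age, length) rows, invented for illustration.
predictors = [(10, 120), (20, 150), (35, 100), (50, 180), (70, 130)]
# Ratings generated exactly from 7.0 - 0.01*age + 0.005*length,
# so the fit should recover those three coefficients.
ratings = [7.0 - 0.01 * age + 0.005 * length for age, length in predictors]

b = fit(predictors, ratings)
print([round(v, 4) for v in b])
```

Because the made-up ratings contain no noise, the fit recovers the generating coefficients; real data such as the Oscar sample would leave nonzero errors.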


How Good Is This Regression Model?

• How accurately does the model predict the rating of a film based on its age and running time?

• Best way to determine this analytically is to calculate the errors

$$SSE = \sum e_i^2 = 15.03$$

or

$$SSE = \mathbf{y}^T \mathbf{y} - \mathbf{b}^T \mathbf{X}^T \mathbf{y}$$


Now Calculate R^2

• SSY = $\sum y_i^2 = 2107.42$

• SS0 = $n \bar{y}^2 = 2089.03$

• SST = SSY – SS0 = 2107.42 – 2089.03 = 18.39

• SSR = SST – SSE = 18.39 – 15.03 = 3.36

$$R^2 = \frac{SSR}{SST} = \frac{3.36}{18.39} = .18$$

• In other words, the regression stinks
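The arithmetic on this slide can be checked directly. A small sketch using only the sums quoted in the lecture (the variable names are chosen here for illustration):

```python
SSY = 2107.42   # sum of squared ratings, from the slide
SS0 = 2089.03   # n * (mean rating)^2, from the slide
SSE = 15.03     # sum of squared errors, from the previous slide

SST = SSY - SS0   # total variation
SSR = SST - SSE   # variation explained by the regression
R2 = SSR / SST    # coefficient of determination

print(f"SST={SST:.2f}  SSR={SSR:.2f}  R^2={R2:.2f}")
```

An R^2 this low says the two predictors account for less than a fifth of the variation in rating.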


Why Does It Stink?

• Let’s look at the properties of the regression parameters

$$s_e = \sqrt{\frac{SSE}{n - 3}} = \sqrt{\frac{15.03}{32}} = .69$$

• Now calculate standard deviations of the regression parameters


Calculating STDEV of Regression Parameters

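• Estimations only, since we’re working with a sample

• Estimated stdev of b0: $s_{b_0} = s_e c_{00} = .69 \times .97 = .67$

• Estimated stdev of b1: $s_{b_1} = s_e c_{11} = .69 \times .000044 = .00003$

• Estimated stdev of b2: $s_{b_2} = s_e c_{22} = .69 \times .000041 = .00003$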


Calculating Confidence Intervals of STDEVs

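• At the 90% level, for instance

• Confidence intervals for the regression parameters:

$$b_0:\quad 7.34 \pm 1.311(.67) = (6.46,\ 8.23)$$

$$b_1:\quad -.009 \pm 1.311(.00003) = (-.0087,\ -.0086)$$

$$b_2:\quad .005 \pm 1.311(.000028) = (.0052,\ .0053)$$

• All three are statistically significant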


So Why Does Regression “Fail”?

• Look at magnitude of standard deviations of regression parameters

• Almost all deviation is within the b0 parameter

• The other parameters are minimally predictive of the value

• Trying to fit a line to “noise” in the variable

– Or to some other parameter not yet identified


Analysis of Variance

• Are the predictor parameters significant predictors of the variance in the data?

• F-tests can be used to answer this question

– Determine if the SSR is significantly higher than the SSE

– Equivalent to testing that y does not depend on any of the predictor variables

• Discussed in detail in book


Running an F-Test

• Need to calculate SSR and SSE

• From those, calculate mean squares of the regression (MSR) and the errors (MSE)

• MSR/MSE has an F distribution

• If MSR/MSE > F-table, predictors explain a significant fraction of response variation

– F-tables in appendices A6-A8

• Note typo in book’s table 15.3– SST matches not

y y y y


F-Test for Our Example

• SSR = 3.36

• SSE = 15.03

• MSR = SSR/k = 3.36/2 = 1.68

• MSE = SSE/(n-k-1) = 15.03/(35 - 2 - 1) = .47

• F-computed = MSR/MSE = 3.58

• F[90; 2,32] = 2.268 (at 90%)

• So it passes the F-test at 90%

• Which means the predictor variables explain some of the variation with 90% confidence
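The F-test steps above can be sketched as follows. All inputs are the values quoted on the slides; `F_TABLE` is simply the table value the slide cites, not something computed here:

```python
SSR = 3.36        # variation explained by the regression
SSE = 15.03       # variation left in the errors
n, k = 35, 2      # observations and predictor variables
F_TABLE = 2.268   # table value quoted on the slide for the 90% level

MSR = SSR / k              # mean square of the regression
MSE = SSE / (n - k - 1)    # mean square of the errors
F_computed = MSR / MSE

passes = F_computed > F_TABLE
print(f"MSR={MSR:.2f}  MSE={MSE:.2f}  F={F_computed:.2f}  passes={passes}")
```

Passing the F-test only says the predictors explain a non-negligible part of the variation; it does not contradict the low R^2 found earlier.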


Multicollinearity

• If two predictor variables are linearly dependent, they are collinear

– Meaning they are related

– And thus the second does not improve the regression

– In fact, it can make it worse

• Typical symptom is inconsistent results from various significance tests


Finding Multicollinearity

• Must test correlation between predictor variables

• If it’s high, eliminate one and repeat the regression without it

• If the significance of regression improves, it’s probably due to collinearity between the variables
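The first step, testing correlation between predictor variables, can be sketched with a plain Pearson correlation. The function name `correlation`, the sample values, and the `THRESHOLD` cutoff are all assumptions made for illustration; the lecture does not give a specific cutoff or this data:

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical predictor samples, invented for illustration.
age = [5, 12, 20, 33, 41, 56, 64, 78]
length = [118, 195, 122, 161, 103, 170, 129, 152]

THRESHOLD = 0.8   # assumed cutoff for "high" correlation
r = correlation(age, length)
if abs(r) > THRESHOLD:
    print(f"r = {r:.2f}: drop one predictor and repeat the regression")
else:
    print(f"r = {r:.2f}: no strong sign of collinearity between these two")
```

With more than two predictors the same check would be repeated for each pair.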


Why Didn’t Regression Work Well Here?

• Check the scatter plots

– Rating vs. age

– Rating vs. length

• Regardless of how good or bad regressions look, always check the scatter plots


Rating Vs. Age

[Figure: scatter plot of rating (y-axis, 4 to 10) vs. age (x-axis, 0 to 90)]


Rating vs. Length

[Figure: scatter plot of rating (y-axis, 4 to 10) vs. length (x-axis, 90 to 210)]


What Do These Charts Tell Us?

• Trends in age and length don’t explain much

– A little, but not much

• Not much difference between lower and upper ranges of each

• Great deal of variation across small sub-ranges of each


Types of Experimental Designs

• Simple designs

• Full factorial design

• Fractional factorial design


Simple Designs

• Vary one factor at a time

• For k factors with ith factor having n_i levels, the number of experiments is

$$n = 1 + \sum_{i=1}^{k} (n_i - 1)$$

• Perhaps with multiple replications of experiments


A Simple Design for TW Experiment

• 2 factors (# of nodes, DLM)

• Factor 1 has three levels

• Factor 2 has two levels

• Run following 4 experiments:

– (8 nodes, DLM off), (32 nodes, DLM off), (64 nodes, DLM off)

– (8 nodes, DLM on)

• Might be better to use (64 nodes, DLM on), instead
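The four runs above follow from a simple rule: start from a baseline configuration and change one factor at a time. A minimal sketch, assuming a hypothetical helper `simple_design` and the baseline (8 nodes, DLM off) used on the slide:

```python
def simple_design(levels, baseline):
    """Return the baseline run plus one run per non-baseline level of each factor."""
    runs = [dict(baseline)]
    for factor, values in levels.items():
        for value in values:
            if value != baseline[factor]:
                run = dict(baseline)   # copy the baseline...
                run[factor] = value    # ...and vary exactly one factor
                runs.append(run)
    return runs

levels = {"nodes": [8, 32, 64], "dlm": ["off", "on"]}
baseline = {"nodes": 8, "dlm": "off"}

runs = simple_design(levels, baseline)
for run in runs:
    print(run)
```

Choosing `{"nodes": 64, "dlm": "off"}` as the baseline instead would produce the (64 nodes, DLM on) run that the slide suggests might be the better choice.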


Critique of Simple Designs

• Assumes factors don’t interact

• Often more effort than required

• Don’t use it, usually

• In TW case, clearly you don’t determine whether DLM works better or worse with differing numbers of nodes


Full Factorial Designs

• For k factors with ith factor having n_i levels, the number of experiments is

$$n = \prod_{i=1}^{k} n_i$$

• Again, possibly with multiple replications per experiment
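To compare the cost of the two design types, the experiment counts can be computed from the list of level counts. A short sketch; the function names are hypothetical, and the simple-design count is the one-factor-at-a-time formula 1 + sum(n_i - 1) from the Simple Designs slide:

```python
from math import prod

def simple_count(level_counts):
    """Experiments in a simple (one-factor-at-a-time) design."""
    return 1 + sum(n - 1 for n in level_counts)

def full_factorial_count(level_counts):
    """Experiments in a full factorial design."""
    return prod(level_counts)

tw_levels = [3, 2]   # Time Warp example: 3 node levels, 2 DLM levels
print(simple_count(tw_levels), full_factorial_count(tw_levels))

# The gap widens quickly as factors are added, e.g. five factors of four levels.
print(simple_count([4] * 5), full_factorial_count([4] * 5))
```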


Full Factorial Design of TW Experiment

• 2 factors (# of nodes, DLM)

• Factor 1 has three levels

• Factor 2 has two levels

• Run the following 6 experiments:

– (8 nodes, DLM off), (32 nodes, DLM off), (64 nodes, DLM off)

– (8 nodes, DLM on), (32 nodes, DLM on), (64 nodes, DLM on)


Critique of Full Factorial Design

• Test every possible combination of factors’ levels

• Captures full information about interaction

• A hell of a lot of work, though

– Usually more than needed


Reducing the Work in Full Factorial Designs

• Reduce number of levels per factor

– Generally a good choice

– Especially if you know which factors are most important - use more levels for them

• Reduce the number of factors

– But don’t drop important ones

• Use fractional factorial designs


Fractional Factorial Designs

• Only measure some combination of the levels of the factors

• Must design carefully to best capture any possible interactions

• Less work, but more chance of inaccuracy

• Especially useful if some factors are known not to interact


2^k Factorial Designs

• Used to determine the effect of k factors

– Each with two alternatives or levels

• Often used as a preliminary to a larger performance study

– Each factor measured at its maximum and minimum level

– Perhaps offering insight on importance and interaction of various factors


Unidirectional Effects

• Effects that only increase as the level of a factor increases

– Or vice versa

• If this characteristic is known to apply, a 2^k factorial design at minimum and maximum levels is useful

• Shows whether the factor has a significant effect


2^2 Factorial Designs

• Two factors with two levels each

• Simplest kind of factorial experiment design

• Concepts developed here generalize

• A form of regression can be easily used here

• Simplest to show with an example


2^2 Factorial Design Example

• Reduce the TW experiment we discussed earlier

• Use two levels of nodes (8, 64)

• And two levels of DLM (off, on)

• Will require 4 experiments

– Possibly with multiple replications

• Analyze with simple regression model


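Defining Variables for the 2^2 Factorial TW Example

$$x_A = \begin{cases} -1 & \text{if 8 nodes} \\ 1 & \text{if 64 nodes} \end{cases}$$

$$x_B = \begin{cases} -1 & \text{if no dynamic load management} \\ 1 & \text{if dynamic load management used} \end{cases}$$

• Values are set to -1 and 1 for ease in modeling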


Sample Data For Example

• Single runs of one benchmark simulation

                 8 Nodes (-1)    64 Nodes (1)
No DLM (-1)      820             217
DLM (1)          776             197


The Sign Table Method

Another way to look at it shown in this table -

Experiment A B y

1 -1 -1 y1

2 1 -1 y2

3 -1 1 y3

4 1 1 y4


Regression Model for Example

• y = q0 + qAxA + qBxB + qABxAxB

• Note this is a nonlinear model

– The last term describes the interaction between the variables

820 = q0 - qA - qB + qAB

217 = q0 + qA - qB - qAB

776 = q0 - qA + qB - qAB

197 = q0 + qA + qB + qAB

• 4 equations in 4 unknowns


Solving the Equations

q0 = 1/4(820 + 217 + 776 + 197) = 502.5

qA = 1/4(-820 + 217 - 776 + 197) = -295.5

qB = 1/4(-820 - 217 + 776 + 197) = -16

qAB = 1/4(820 - 217 - 776 + 197) = 6

So,

y = 502.5 - 295.5xA - 16xB + 6xAxB


The Sign Table Method

I        A         B       AB     y
1        -1        -1      1      820
1        1         -1      -1     217
1        -1        1       -1     776
1        1         1       1      197
2010     -1182     -64     24     Total
502.5    -295.5    -16     6      Total/4

Same results as using equations:

y = 502.5 - 295.5xA - 16xB + 6xAxB
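The sign table computation is mechanical enough to script. A minimal sketch using the four single-run measurements from the example; the function name `effects` and the tuple layout are choices made here for illustration:

```python
# (xA, xB) signs for experiments 1..4, and the measured run times.
SIGNS = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
y = [820, 217, 776, 197]

def effects(signs, responses):
    """Return (q0, qA, qB, qAB) by the sign table method for a 2^2 design."""
    n = len(responses)
    q0 = sum(responses) / n
    qA = sum(a * v for (a, _), v in zip(signs, responses)) / n
    qB = sum(b * v for (_, b), v in zip(signs, responses)) / n
    qAB = sum(a * b * v for (a, b), v in zip(signs, responses)) / n
    return q0, qA, qB, qAB

q0, qA, qB, qAB = effects(SIGNS, y)
print(q0, qA, qB, qAB)
```

Each effect is the dot product of one sign column with the y column, divided by 4, which is exactly the Total/4 row of the table.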


Allocation of Variation for 2^2 Model

• Calculate the sample variance of y

$$s_y^2 = \frac{\sum_{i=1}^{2^2} (y_i - \bar{y})^2}{2^2 - 1}$$

Numerator is the SST - total variation

$$SST = 2^2 q_A^2 + 2^2 q_B^2 + 2^2 q_{AB}^2$$

• We can use this to explain what causes the variation in y


Terms in the SST

• $2^2 q_A^2$ is part of variation explained by the effect of A - SSA

• $2^2 q_B^2$ is part of variation explained by the effect of B - SSB

• $2^2 q_{AB}^2$ is part of variation explained by the effect of the interaction of A and B - SSAB

SST = SSA + SSB + SSAB


Variations in Our Example

• SST = 350449

• SSA = 349281

• SSB = 1024

• SSAB = 144

• We can now calculate the fraction of the total variation caused by each effect


Fractions of Variation in Our Example

• Fraction explained by A is 99.67%

• Fraction explained by B is 0.29%

• Fraction explained by the interaction of A and B is 0.04%

• So almost all the variation comes from the number of nodes

• So if you want to run faster, apply more nodes rather than turning on dynamic load management
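The allocation of variation can be reproduced from the three effects alone. A short sketch using the effect values computed earlier in the lecture (variable names are illustrative):

```python
qA, qB, qAB = -295.5, -16.0, 6.0   # effects from the sign table
N = 2 ** 2                         # number of experiments in a 2^2 design

SSA = N * qA ** 2     # variation explained by A (number of nodes)
SSB = N * qB ** 2     # variation explained by B (dynamic load management)
SSAB = N * qAB ** 2   # variation explained by the A-B interaction
SST = SSA + SSB + SSAB

for name, ss in (("A", SSA), ("B", SSB), ("AB", SSAB)):
    print(f"{name}: SS = {ss:.0f}, fraction = {100 * ss / SST:.2f}%")
```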


2^2 r Factorial Designs

• 2 factors, 2 levels each, with r replications at each of the four combinations

• y = q0 + qAxA + qBxB + qABxAxB + e

• Now we need to compute effects, estimate the errors, and allocate variation

• We can also produce confidence intervals for effects and predicted responses


Computing Effects for 2^2 r Factorial Experiments

• We can use the sign table method

• But instead of single observations, regress off the mean of the r observations

• Compute errors for each replication using similar tabular method

• Similar methods used for allocation of variance and calculating confidence intervals


Example of 2^2 r Factorial Design With Replications

• Same Time Warp system as before, but with 4 replications at each point (r=4)

• No DLM, 8 nodes: 820, 822, 813, 809

• DLM, 8 nodes: 776, 798, 750, 755

• No DLM, 64 nodes: 217, 228, 215, 221

• DLM, 64 nodes: 197, 180, 220, 185


2^2 r Factorial Example Analysis Matrix

I         A         B         AB      y                    Mean
1         -1        -1        1       (820,822,813,809)    816
1         1         -1        -1      (217,228,215,221)    220.25
1         -1        1         -1      (776,798,750,755)    769.75
1         1         1         1       (197,180,220,185)    195.5
2001.5    -1170     -71       21.5                         Total
500.4     -292.5    -17.75    5.4                          Total/4

q0 = 500.4    qA = -292.5    qB = -17.75    qAB = 5.4


Estimation of Errors for 2^2 r Factorial Example

• Figure differences between predicted and observed values for each replication

$$e_{ij} = y_{ij} - \hat{y}_i = y_{ij} - q_0 - q_A x_{Ai} - q_B x_{Bi} - q_{AB} x_{Ai} x_{Bi}$$

• Now calculate SSE

$$SSE = \sum_{i=1}^{2^2} \sum_{j=1}^{r} e_{ij}^2 = 2606$$
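The effects and the error sum of squares for the replicated example can be recomputed from the raw observations. A minimal sketch; the dictionary layout and the `predict` helper are illustrative choices, not part of the lecture:

```python
# Observations keyed by (xA, xB): xA = -1/1 for 8/64 nodes, xB = -1/1 for no DLM/DLM.
observations = {
    (-1, -1): [820, 822, 813, 809],
    (1, -1): [217, 228, 215, 221],
    (-1, 1): [776, 798, 750, 755],
    (1, 1): [197, 180, 220, 185],
}

# Sign table method applied to the mean of the r observations in each cell.
means = {cell: sum(ys) / len(ys) for cell, ys in observations.items()}
q0 = sum(means.values()) / 4
qA = sum(a * m for (a, b), m in means.items()) / 4
qB = sum(b * m for (a, b), m in means.items()) / 4
qAB = sum(a * b * m for (a, b), m in means.items()) / 4

def predict(a, b):
    """Predicted response for one factor-level combination."""
    return q0 + qA * a + qB * b + qAB * a * b

# Sum of squared differences between each observation and its prediction.
SSE = sum((y - predict(a, b)) ** 2
          for (a, b), ys in observations.items()
          for y in ys)

print(q0, qA, qB, qAB, SSE)
```

Carried to full precision, the sum comes out as 2606.5, which the slide shows as 2606.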


Allocating Variation

• We can determine the percentage of variation due to each factor’s impact

– Just like 2^2 designs without replication

• But we can also isolate the variation due to experimental errors

• Methods are similar to other regression techniques for allocating variation


Variation Allocation in Example

• We’ve already figured SSE

• We also need SST, SSA, SSB, and SSAB

• Also, SST = SSA + SSB + SSAB + SSE

• Use same formulae as before for SSA, SSB, and SSAB

$$SST = \sum_{i,j} (y_{ij} - \bar{y}_{..})^2$$


Sums of Squares for Example

• SST = SSY - SS0 = 1,377,009.75

• SSA = 1,368,900

• SSB = 5041

• SSAB = 462.25

• Percentage of variation for A is 99.4%

• Percentage of variation for B is 0.4%

• Percentage of variation for A/B interaction is 0.03%

• And 0.2% (approx.) is due to experimental errors
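These sums of squares can be checked from the effects and the SSE. A short sketch, assuming the unrounded effect values (qAB = 5.375 rather than the displayed 5.4) and the unrounded SSE of 2606.5 that the raw observations give:

```python
r = 4                              # replications per experiment
qA, qB, qAB = -292.5, -17.75, 5.375
SSE = 2606.5                       # shown rounded as 2606 on the earlier slide

N = 2 ** 2 * r                     # total number of observations
SSA = N * qA ** 2
SSB = N * qB ** 2
SSAB = N * qAB ** 2
SST = SSA + SSB + SSAB + SSE

for name, ss in (("A", SSA), ("B", SSB), ("AB", SSAB), ("errors", SSE)):
    print(f"{name}: SS = {ss:,.2f}, share = {100 * ss / SST:.2f}%")
```

With these inputs the total matches the 1,377,009.75 quoted on the slide.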


Confidence Intervals For Effects

• Computed effects are random variables

• Thus, we would like to specify how confident we are that they are correct

• Using the usual confidence interval methods

• First, must figure Mean Square of Errors

$$s_e^2 = \frac{SSE}{2^2 (r - 1)}$$


Calculating Variances of Effects

• Variance of all effects is the same -

$$s_{q_0}^2 = s_{q_A}^2 = s_{q_B}^2 = s_{q_{AB}}^2 = \frac{s_e^2}{2^2 r}$$

• So standard deviation is also the same

• In calculations, use t- or z-value for 2^2(r-1) degrees of freedom


Calculating Confidence Intervals for Example

• At 90% level, using the t-value for 12 degrees of freedom, 1.782

• And standard deviation of effects is 3.68

• Confidence intervals are qi ± (1.782)(3.68)

• q0: (493.8, 506.9)

• qA: (-299.1, -285.9)

• qB: (-24.3, -11.2)

• qAB: (-1.2, 11.9)
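The interval calculation can be sketched end to end. The t-value is taken from the slide rather than computed, the effects are the unrounded values from the analysis matrix, and the final zero-crossing check is a standard reading of a confidence interval added here for illustration:

```python
from math import sqrt

r = 4
SSE = 2606.5      # unrounded; the slide shows 2606
T_VALUE = 1.782   # t-value for 12 degrees of freedom, as quoted on the slide
effects = {"q0": 500.375, "qA": -292.5, "qB": -17.75, "qAB": 5.375}

dof = 2 ** 2 * (r - 1)         # degrees of freedom for the errors
s_e = sqrt(SSE / dof)          # standard deviation of errors
s_q = s_e / sqrt(2 ** 2 * r)   # standard deviation of every effect
half_width = T_VALUE * s_q

intervals = {name: (q - half_width, q + half_width)
             for name, q in effects.items()}

for name, (low, high) in intervals.items():
    note = "includes zero" if low <= 0 <= high else "excludes zero"
    print(f"{name}: ({low:.1f}, {high:.1f})  {note}")
```

Only the interval for qAB spans zero, so at this confidence level the interaction effect cannot be distinguished from no effect.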


Visual Tests for Verifying Assumptions

• What assumptions have we been making?

– Model errors are statistically independent

– Model errors are additive

– Errors are normally distributed

– Errors have constant standard deviation

– Effects of factors are additive

• Which boils down to independent, normally distributed observations with constant variance


Testing for Independent Errors

• Compute residuals and make a scatter plot

• Trends indicate a dependence of errors on factor levels

– But if residuals are an order of magnitude below the predicted response, trends can be ignored

• Sometimes a good idea to plot residuals vs. experiment number


Example Plot of Residuals vs. Predicted Response

[Figure: scatter plot of residuals (y-axis, -30 to 40) vs. predicted response (x-axis, 0 to 900)]


Example Plot of Residuals Vs. Experiment Number

[Figure: scatter plot of residuals (y-axis, -30 to 40) vs. experiment number (x-axis, 0 to 5)]


Testing for Normally Distributed Errors

• As usual, do a quantile-quantile chart

– Against the normal distribution

• If it’s close to linear, this assumption is good
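The points of such a chart can be generated with the standard library. A sketch under stated assumptions: the residuals are taken as each observation minus its cell mean (which is the model's prediction in this 2^2 r example), and the plotting position (i - 0.5)/n is one common convention, not necessarily the one used for the lecture's chart:

```python
from statistics import NormalDist

observations = {
    ("no DLM", 8): [820, 822, 813, 809],
    ("no DLM", 64): [217, 228, 215, 221],
    ("DLM", 8): [776, 798, 750, 755],
    ("DLM", 64): [197, 180, 220, 185],
}

# Residual = observation minus the mean of its cell.
residuals = sorted(y - sum(ys) / len(ys)
                   for ys in observations.values()
                   for y in ys)

n = len(residuals)
# Quantiles of the standard normal at evenly spaced probabilities.
normal_quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each pair is one point: residual on one axis, normal quantile on the other.
for residual, quantile in zip(residuals, normal_quantiles):
    print(f"{residual:8.2f}  {quantile:6.2f}")
```

If the printed pairs fall close to a straight line when plotted, the normality assumption is reasonable.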


Quantile-Quantile Plot for Example

[Figure: quantile-quantile plot of residuals (x-axis, -30 to 40) against normal quantiles (y-axis, -2.5 to 2.5), with fitted trend line y = 0.0731x + 4E-17, R^2 = 0.9426]


Assumption of Constant Variance

• Checking homoscedasticity

• Go back to the scatter plot and check for an even spread


The Scatter Plot, Again

[Figure: scatter plot of residuals (y-axis, -30 to 40) vs. predicted response (x-axis, 0 to 900)]


Example Shows Residuals Are Function of Predictors

• What to do about it?

• Maybe apply a transform?

• To determine if we should, plot standard deviation of errors vs. various transformations of the mean

• Here, dynamic load management seems to introduce greater variance

– Transforms not likely to help

– Probably best not to describe with regression
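The observation about dynamic load management can be checked by tabulating the mean and standard deviation of each cell of the replicated example. A minimal sketch; the dictionary keys are illustrative labels:

```python
from statistics import mean, stdev

observations = {
    ("no DLM", 8): [820, 822, 813, 809],
    ("no DLM", 64): [217, 228, 215, 221],
    ("DLM", 8): [776, 798, 750, 755],
    ("DLM", 64): [197, 180, 220, 185],
}

# Mean and sample standard deviation of the r observations in each cell.
spread = {cell: (mean(ys), stdev(ys)) for cell, ys in observations.items()}

for (dlm, nodes), (m, s) in spread.items():
    print(f"{dlm:>6}, {nodes:2d} nodes: mean = {m:7.2f}, stdev = {s:5.2f}")
```

The spread tracks whether DLM is on rather than the size of the mean, which is consistent with the slide's remark that a transform of the response is unlikely to help.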