
Chapter 6 The 2^k Factorial Design

6.1 Introduction

• The 2^k design is a special case of the general factorial design (Chapter 5)

• k factors, each at only two levels

• Levels:

– quantitative (temperature, pressure, …) or qualitative (machine, operator, …)

– high and low

– each replicate has 2 × 2 × … × 2 = 2^k observations

• Assumptions: (1) the factors are fixed, (2) the design is completely randomized, and (3) the usual normality assumptions are satisfied

• Widely used in factor screening experiments

6.2 The 2^2 Factorial Design

• Two factors, A and B, and each factor has two levels, low and high.

• Example: the concentration of reactant vs. the amount of the catalyst (Page 208)

• “-” and “+” denote the low and high levels of a factor, respectively

• Low and high are arbitrary terms

• Geometrically, the four runs form the corners of a square

• Factors can be quantitative or qualitative, although their treatment in the final model will be different

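• Average effect of a factor = the change in response produced by a change in the level of that factor, averaged over the levels of the other factors.

• (1), a, b and ab: the totals of the n replicates taken at each treatment combination.

• The main effects:

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{ab + a}{2n} - \frac{b + (1)}{2n} = \frac{1}{2n}\{[ab - b] + [a - (1)]\} = \frac{1}{2n}[ab + a - b - (1)]$$

$$B = \bar{y}_{B^+} - \bar{y}_{B^-} = \frac{ab + b}{2n} - \frac{a + (1)}{2n} = \frac{1}{2n}\{[ab - a] + [b - (1)]\} = \frac{1}{2n}[ab + b - a - (1)]$$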

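• The interaction effect:

$$AB = \frac{1}{2n}\{[ab - b] - [a - (1)]\} = \frac{1}{2n}[ab + (1) - a - b] = \frac{ab + (1)}{2n} - \frac{a + b}{2n}$$

• In that example, A = 8.33, B = -5.00 and AB = 1.67

• Analysis of variance

• The total effects (contrasts):

$$\text{Contrast}_A = ab + a - b - (1)$$

$$\text{Contrast}_B = ab + b - a - (1)$$

$$\text{Contrast}_{AB} = ab + (1) - a - b$$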

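• Sum of squares:

$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}$$

$$SS_B = \frac{[ab + b - a - (1)]^2}{4n}$$

$$SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$

$$SS_T = \sum_{i=1}^{2}\sum_{j=1}^{2}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{4n}$$

$$SS_E = SS_T - SS_A - SS_B - SS_{AB}$$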
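The contrast, effect and sum-of-squares formulas for the 2^2 design can be checked with a few lines of arithmetic. The sketch below is only illustrative: the four treatment totals and n = 3 are assumptions (they are not printed on the slides) chosen because they reproduce the effects A = 8.33, B = -5.00, AB = 1.67 and the sums of squares quoted for the example.

```python
# Treatment totals over n replicates.
# ASSUMPTION: these totals and n = 3 are illustrative values, not taken from
# the slides; they reproduce the effects and sums of squares quoted there.
n = 3
one, a, b, ab = 80.0, 100.0, 60.0, 90.0

# Contrasts ("total effects") for A, B and AB.
contrast = {
    "A": ab + a - b - one,
    "B": ab + b - a - one,
    "AB": ab + one - a - b,
}

# Effect = contrast / (2n); sum of squares = contrast^2 / (4n).
effect = {name: c / (2 * n) for name, c in contrast.items()}
ss = {name: c ** 2 / (4 * n) for name, c in contrast.items()}

for name in contrast:
    print(f"{name}: effect = {effect[name]:.2f}, SS = {ss[name]:.2f}")
```

Each sum of squares carries one degree of freedom, so the mean squares for A, B and AB equal the sums of squares themselves.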

Response: Conversion
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source       Sum of Squares   DF   Mean Square   F Value   Prob > F
Model        291.67           3    97.22         24.82     0.0002
A            208.33           1    208.33        53.19     < 0.0001
B            75.00            1    75.00         19.15     0.0024
AB           8.33             1    8.33          2.13      0.1828
Pure Error   31.33            8    3.92
Cor Total    323.00           11

Std. Dev.   1.98     R-Squared        0.9030
Mean        27.50    Adj R-Squared    0.8666
C.V.        7.20     Pred R-Squared   0.7817
PRESS       70.50    Adeq Precision   11.669

The F-test for the “model” source is testing the significance of the overall model; that is, is either A, B, or AB or some combination of these effects important?


• Table of plus and minus signs:

I A B AB

(1) + – – +

a + + – –

b + – + –

ab + + + +
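A table of plus and minus signs like this one can be generated for any number of factors. The following is a minimal sketch, assuming runs and effect columns are listed in standard order; the function name `sign_table` is a hypothetical helper, not something from the slides.

```python
def sign_table(factors="AB"):
    """Return (columns, rows) of the +/- sign table for a 2^k design.

    Runs and effect columns are both generated in standard order.
    """
    k = len(factors)
    # Effect columns in standard order: "", A, B, AB, C, AC, BC, ABC, ...
    effects = [""]
    for f in factors:
        effects += [e + f for e in effects]

    rows = {}
    for run in range(2 ** k):
        # Bit j of the run index gives the level of factor j (+1 high, -1 low).
        level = {f: (1 if (run >> j) & 1 else -1) for j, f in enumerate(factors)}
        label = "".join(f.lower() for f in factors if level[f] == 1) or "(1)"
        signs = []
        for e in effects:
            s = 1  # the identity column I is all +1
            for f in e:
                s *= level[f]
            signs.append(s)
        rows[label] = signs
    return ["I"] + effects[1:], rows


columns, rows = sign_table("AB")
print("    ", " ".join(columns))
for label, signs in rows.items():
    print(f"{label:>4}", " ".join("+" if s > 0 else "-" for s in signs))

# Orthogonality: the inner product of any two distinct columns is zero.
cols = list(zip(*rows.values()))
orthogonal = all(
    sum(x * y for x, y in zip(cols[i], cols[j])) == 0
    for i in range(len(cols))
    for j in range(i + 1, len(cols))
)
```

Calling `sign_table("ABC")` gives the corresponding eight-run table for three factors.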

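• The regression model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

– x1 and x2 are coded variables that represent the two factors, i.e. x1 (or x2) takes only the values -1 and +1:

$$x_1 = \frac{\text{Conc} - (\text{Conc}_{low} + \text{Conc}_{high})/2}{(\text{Conc}_{high} - \text{Conc}_{low})/2}$$

$$x_2 = \frac{\text{Catalyst} - (\text{Catalyst}_{low} + \text{Catalyst}_{high})/2}{(\text{Catalyst}_{high} - \text{Catalyst}_{low})/2}$$

– Use the least squares method to estimate the coefficients

– For that example,

$$\hat{y} = 27.5 + \left(\frac{8.33}{2}\right)x_1 + \left(\frac{-5.00}{2}\right)x_2$$

– Model adequacy: residuals (Pages 213~214)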
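Because the coded columns are orthogonal, the least squares coefficients reduce to simple averages, and each slope is half of the corresponding effect. The sketch below illustrates this under an assumption: the cell means are derived from illustrative totals 80, 100, 60, 90 with n = 3, which are not printed on the slides.

```python
# Cell means at (1), a, b, ab.
# ASSUMPTION: derived from illustrative totals 80, 100, 60, 90 with n = 3.
y = [80 / 3, 100 / 3, 60 / 3, 90 / 3]
x1 = [-1, 1, -1, 1]   # coded concentration
x2 = [-1, -1, 1, 1]   # coded catalyst

# The columns 1, x1, x2 are mutually orthogonal, so X'X = 4I and each
# least squares coefficient is (column . y) / 4.
N = len(y)
b0 = sum(y) / N
b1 = sum(u * v for u, v in zip(x1, y)) / N
b2 = sum(u * v for u, v in zip(x2, y)) / N

print(f"y_hat = {b0:.2f} + {b1:.3f} x1 + ({b2:.3f}) x2")
```

The intercept is the grand average, and b1, b2 equal A/2 and B/2 because an effect measures the change from x = -1 to x = +1, a span of two coded units.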

• Response surface plot:

– Figure 6.3

$$\hat{y} = 18.33 + 0.8333\,\text{Conc} - 5.00\,\text{Catalyst}$$

6.3 The 2^3 Design

• Three factors, A, B and C, each at two levels (Figure 6.4 (a))

• Design matrix (Figure 6.4 (b))

• Treatment combinations: (1), a, b, ab, c, ac, bc, abc

• 7 degrees of freedom: 1 for each main effect and 1 for each interaction

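• Estimate main effect:

$$A = \bar{y}_{A^+} - \bar{y}_{A^-} = \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} = \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc]$$

• Estimate two-factor interaction: one-half the difference between the average A effects at the two levels of B

$$AB = \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n} = \frac{1}{4n}[abc - bc + ab - b - ac + c - a + (1)]$$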

• Three-factor interaction:

$$ABC = \frac{1}{4n}\{[abc - bc] - [ac - c] - [ab - b] + [a - (1)]\} = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)]$$

• Contrast: Table 6.3

– Equal number of plus and minus signs in every column except I

– The inner product of any two columns = 0

– I is an identity element

– The product of any two columns yields another column

– Orthogonal design

• Sum of squares: SS = (Contrast)^2/(8n)

Table of – and + Signs for the 2^3 Factorial Design (pg. 218)

Treatment      Factorial Effect
Combination    I      A      B      AB     C      AC     BC     ABC
(1)            +      –      –      +      –      +      +      –
a              +      +      –      –      –      –      +      +
b              +      –      +      –      –      +      –      +
ab             +      +      +      +      –      –      –      –
c              +      –      –      +      +      –      –      +
ac             +      +      –      –      +      +      –      –
bc             +      –      +      –      +      –      +      –
abc            +      +      +      +      +      +      +      +
Contrast              24     18     6      14     2      4      4
Effect                3.00   2.25   0.75   1.75   0.25   0.50   0.50
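The Contrast and Effect rows of the table are linked by Effect = Contrast/(4n), and each sum of squares follows from SS = (Contrast)^2/(8n). A short sketch of that arithmetic, where n = 2 is an inference from the table itself (24/3.00 = 8 = 4n) rather than a value stated on the slide:

```python
# Contrasts read from the Contrast row of the 2^3 sign table.
contrasts = {"A": 24, "B": 18, "AB": 6, "C": 14, "AC": 2, "BC": 4, "ABC": 4}

# ASSUMPTION: n = 2 replicates, inferred from Effect = Contrast / (4n).
n = 2

effects = {name: c / (4 * n) for name, c in contrasts.items()}
sums_of_squares = {name: c ** 2 / (8 * n) for name, c in contrasts.items()}

for name in contrasts:
    print(f"{name:>3}: effect = {effects[name]:.2f}, SS = {sums_of_squares[name]:.2f}")
```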


• Example 6.1

A = gap, B = Flow, C = Power, y = Etch Rate

• The regression model and response surface:

– The regression model:

$$\hat{y} = 776.0625 + \left(\frac{-101.625}{2}\right)x_1 + \left(\frac{306.125}{2}\right)x_3 + \left(\frac{-153.625}{2}\right)x_1 x_3$$

– Response surface and contour plot (Figure 6.7)

6.4 The General 2^k Design

• k factors, each at two levels

• Interactions: $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, …, and 1 k-factor interaction

• The standard order for a 2^4 design: (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd

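• The general approach for the statistical analysis:

– Estimate factor effects

– Form initial model (full model)

– Perform analysis of variance (Table 6.9)

– Refine the model

– Analyze residuals

– Interpret results

$$\text{Contrast}_{AB \cdots K} = (a \pm 1)(b \pm 1) \cdots (k \pm 1)$$

The sign in each factor is negative if that factor appears in the effect and positive otherwise; after expansion, "1" is replaced by (1).

$$AB \cdots K = \frac{2}{n 2^k}\left(\text{Contrast}_{AB \cdots K}\right)$$

$$SS_{AB \cdots K} = \frac{1}{n 2^k}\left(\text{Contrast}_{AB \cdots K}\right)^2$$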
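The general contrast, effect and sum-of-squares formulas translate directly into a loop over the sign table. The following is a minimal sketch, assuming the treatment totals are supplied in standard order; `factorial_effects` is a hypothetical helper name, and the demonstration data are synthetic, generated from a known model so the answer can be checked.

```python
from math import prod


def factorial_effects(totals, n=1):
    """Contrast, effect and sum of squares for every effect of a 2^k design.

    `totals` are treatment totals in standard order: (1), a, b, ab, c, ...
    """
    N = len(totals)
    k = N.bit_length() - 1
    if 2 ** k != N:
        raise ValueError("number of totals must be a power of two")
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:k]

    result = {}
    for mask in range(1, N):  # each nonzero bit mask names one effect
        name = "".join(letters[j] for j in range(k) if (mask >> j) & 1)
        contrast = 0.0
        for run, total in enumerate(totals):
            # Sign = product of the +/-1 levels of the factors in the effect.
            sign = prod(
                1 if (run >> j) & 1 else -1
                for j in range(k)
                if (mask >> j) & 1
            )
            contrast += sign * total
        result[name] = {
            "contrast": contrast,
            "effect": 2 * contrast / (n * N),
            "ss": contrast ** 2 / (n * N),
        }
    return result


# Synthetic single-replicate 2^3 data from y = 10 + 2*xA - 3*xB*xC,
# so the true effects are A = 4, BC = -6 and zero for everything else.
totals = []
for run in range(8):
    xa, xb, xc = (1 if (run >> j) & 1 else -1 for j in range(3))
    totals.append(10 + 2 * xa - 3 * xb * xc)

estimates = factorial_effects(totals)
for name, est in estimates.items():
    print(f"{name:>3}: effect = {est['effect']:+.2f}, SS = {est['ss']:.2f}")
```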

6.5 A Single Replicate of the 2^k Design

• These are 2^k factorial designs with one observation at each corner of the “cube”

• An unreplicated 2^k factorial design is also sometimes called a “single replicate” of the 2^k

• If the factor levels are spaced too closely, it increases the chances that the noise will overwhelm the signal in the data

• Lack of replication causes potential problems in statistical testing

– Replication admits an estimate of “pure error” (a better phrase is an internal estimate of error)

– With no replication, fitting the full model results in zero degrees of freedom for error

• Potential solutions to this problem

– Pooling high-order interactions to estimate error (sparsity of effects principle)

– Normal probability plotting of effects (Daniel, 1959)

• Example 6.2 (A single replicate of the 2^4 design)

– A 2^4 factorial was used to investigate the effects of four factors on the filtration rate of a resin

– The factors are A = temperature, B = pressure, C = concentration of formaldehyde, D = stirring rate

• Estimates of the effects

Status   Term        Effect     SumSqr     % Contribution
Model    Intercept
Error    A           21.625     1870.56    32.6397
Error    B           3.125      39.0625    0.681608
Error    C           9.875      390.062    6.80626
Error    D           14.625     855.563    14.9288
Error    AB          0.125      0.0625     0.00109057
Error    AC          -18.125    1314.06    22.9293
Error    AD          16.625     1105.56    19.2911
Error    BC          2.375      22.5625    0.393696
Error    BD          -0.375     0.5625     0.00981515
Error    CD          -1.125     5.0625     0.0883363
Error    ABC         1.875      14.0625    0.245379
Error    ABD         4.125      68.0625    1.18763
Error    ACD         -1.625     10.5625    0.184307
Error    BCD         -2.625     27.5625    0.480942
Error    ABCD        1.375      7.5625     0.131959

Lenth's ME 6.74778    Lenth's SME 13.699
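The "Lenth's ME" figure can be reproduced from the effect estimates alone. The sketch below follows Lenth's pseudo standard error procedure as it is commonly described; this is an assumption about what the software did, supported only by the fact that the result matches the printed margin of error. The t quantile is hard-coded to avoid a dependency on a statistics library.

```python
from statistics import median

# Effect estimates from the table of effects for the 2^4 filtration example.
effects = {
    "A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
    "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
    "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
    "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375,
}

abs_effects = [abs(e) for e in effects.values()]

# Step 1: initial robust scale estimate.
s0 = 1.5 * median(abs_effects)

# Step 2: pseudo standard error, using only effects that look like noise.
pse = 1.5 * median([e for e in abs_effects if e < 2.5 * s0])

# Step 3: margin of error with m/3 = 15/3 = 5 degrees of freedom.
# ASSUMPTION: 2.570582 is the 0.975 quantile of Student's t with 5 df.
T_975_5DF = 2.570582
margin_of_error = T_975_5DF * pse

flagged = [name for name, e in effects.items() if abs(e) > margin_of_error]
print(f"PSE = {pse:.4f}, ME = {margin_of_error:.5f}, flagged: {flagged}")
```

The effects exceeding the margin of error are the same ones retained in the reduced model later in the example.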

• The normal probability plot of the effects

[Figure: Design-Expert normal plot for filtration rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate). Horizontal axis: effect, from -18.12 to 21.62; vertical axis: normal % probability. Labeled points: A, C, D, AC, AD.]

[Figure: Design-Expert interaction graph for filtration rate, A: Temperature by C: Concentration (C- = -1.000, C+ = 1.000), with B: Pressure = 0.00 and D: Stirring Rate = 0.00. Horizontal axis: A: Temperature, coded from -1.00 to 1.00; vertical axis: filtration rate.]

[Figure: Design-Expert interaction graph for filtration rate, A: Temperature by D: Stirring Rate (D- = -1.000, D+ = 1.000), with B: Pressure = 0.00 and C: Concentration = 0.00. Horizontal axis: A: Temperature, coded from -1.00 to 1.00; vertical axis: filtration rate.]

• B is not significant and all interactions involving B are negligible

• Design projection: 2^4 design => 2^3 design in A, C and D (with two replicates, since B is dropped)

• ANOVA table (Table 6.13)

Response: Filtration Rate
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source      Sum of Squares   DF   Mean Square   F Value   Prob > F
Model       5535.81          5    1107.16       56.74     < 0.0001
A           1870.56          1    1870.56       95.86     < 0.0001
C           390.06           1    390.06        19.99     0.0012
D           855.56           1    855.56        43.85     < 0.0001
AC          1314.06          1    1314.06       67.34     < 0.0001
AD          1105.56          1    1105.56       56.66     < 0.0001
Residual    195.12           10   19.51
Cor Total   5730.94          15

Std. Dev.   4.42      R-Squared        0.9660
Mean        70.06     Adj R-Squared    0.9489
C.V.        6.30      Pred R-Squared   0.9128
PRESS       499.52    Adeq Precision   20.841

• The regression model:

Final Equation in Terms of Coded Factors:

Filtration Rate =
  +70.06250
  +10.81250 * Temperature
  +4.93750 * Concentration
  +7.31250 * Stirring Rate
  -9.06250 * Temperature * Concentration
  +8.31250 * Temperature * Stirring Rate

• Residual Analysis (P. 235)

• Response surface (P. 236)
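The fitted equation in coded factors can be evaluated directly to predict the filtration rate at any setting and to search the corners of the design. A minimal sketch, with `filtration_rate` as a hypothetical function name and all inputs in coded units between -1 and +1:

```python
from itertools import product


def filtration_rate(temperature, concentration, stirring_rate):
    """Predicted filtration rate; arguments are coded levels in [-1, 1]."""
    return (
        70.0625
        + 10.8125 * temperature
        + 4.9375 * concentration
        + 7.3125 * stirring_rate
        - 9.0625 * temperature * concentration
        + 8.3125 * temperature * stirring_rate
    )


# Evaluate the model at the eight corners of the A, C, D cube.
corners = list(product([-1, 1], repeat=3))
best = max(corners, key=lambda x: filtration_rate(*x))
print("center prediction:", filtration_rate(0, 0, 0))
print("best corner (A, C, D):", best, "->", filtration_rate(*best))
```

Each coefficient is half of the corresponding effect estimate, for example 21.625/2 = 10.8125 for temperature.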


• Half-normal plot: the absolute value of the effect estimates against the cumulative normal probabilities.

[Figure: Design-Expert half normal plot for filtration rate (A: Temperature, B: Pressure, C: Concentration, D: Stirring Rate). Horizontal axis: |Effect|, from 0.00 to 21.63; vertical axis: half normal % probability. Labeled points: A, C, D, AC, AD.]

• Example 6.3 (Data transformation in a Factorial Design)

A = drill load, B = flow, C = speed, D = type of mud, y = advance rate of the drill

• The normal probability plot of the effect estimates

[Figure: Design-Expert half normal plot for adv._rate (A: load, B: flow, C: speed, D: mud). Horizontal axis: |Effect|, from 0.00 to 6.44; vertical axis: half normal % probability. Labeled points: B, C, D, BC, BD.]

• Residual analysis

[Figure: Design-Expert residual plots for adv._rate. Normal plot of residuals (residuals from -1.96375 to 2.58625 against normal % probability) and residuals vs. predicted (predicted values from 1.69 to 13.71).]

• The residual plots indicate that there are problems with the equality of variance assumption

• The usual approach to this problem is to employ a transformation on the response

• In this example, $y^* = \ln y$

[Figure: Design-Expert half normal plot for Ln(adv._rate) (A: load, B: flow, C: speed, D: mud). Horizontal axis: |Effect|, from 0.00 to 1.16; vertical axis: half normal % probability. Labeled points: B, C, D.]

Three main effects are large

No indication of large interaction effects

What happened to the interactions?

Response: adv._rate    Transform: Natural log    Constant: 0.000

ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source      Sum of Squares   DF   Mean Square   F Value   Prob > F
Model       7.11             3    2.37          164.82    < 0.0001
B           5.35             1    5.35          371.49    < 0.0001
C           1.34             1    1.34          93.05     < 0.0001
D           0.43             1    0.43          29.92     0.0001
Residual    0.17             12   0.014
Cor Total   7.29             15

Std. Dev.   0.12    R-Squared        0.9763
Mean        1.60    Adj R-Squared    0.9704
C.V.        7.51    Pred R-Squared   0.9579
PRESS       0.31    Adeq Precision   34.391

• Following Log transformation

Final Equation in Terms of Coded Factors:

Ln(adv._rate) =
  +1.60
  +0.58 * B
  +0.29 * C
  +0.16 * D
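Because the model is fitted on the log scale, predictions of the advance rate itself require exponentiating. The sketch below shows the back-transformation; the function names are hypothetical, and the inputs are coded levels between -1 and +1.

```python
import math


def ln_advance_rate(b, c, d):
    """Predicted ln(advance rate) from the coded-factor equation."""
    return 1.60 + 0.58 * b + 0.29 * c + 0.16 * d


def advance_rate(b, c, d):
    """Back-transformed prediction on the original scale."""
    return math.exp(ln_advance_rate(b, c, d))


low = ln_advance_rate(-1, -1, -1)
high = ln_advance_rate(1, 1, 1)
print(f"ln scale: {low:.2f} to {high:.2f}")
print(f"original scale: {advance_rate(-1, -1, -1):.2f} to {advance_rate(1, 1, 1):.2f}")

# On the original scale the factors act multiplicatively: moving B from
# -1 to +1 multiplies the predicted rate by exp(2 * 0.58).
flow_multiplier = math.exp(2 * 0.58)
```

An additive model in ln(y) with no interaction terms corresponds to a multiplicative model in y, which is one way to read the disappearance of the BC and BD interactions after the transformation.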

[Figure: Design-Expert residual plots for Ln(adv._rate). Normal plot of residuals (residuals from -0.166184 to 0.194177 against normal % probability) and residuals vs. predicted (predicted values from 0.57 to 2.63).]

• Example 6.4:

– Two factors (A and C) affect the mean number of defects

– A third factor (B) affects variability

– Residual plots were useful in identifying the dispersion effect

– The magnitude of the dispersion effects:

$$F_i^* = \ln\frac{S^2(i^+)}{S^2(i^-)}$$

– When the variances of the residuals at the positive and negative levels of factor i are equal, this statistic has an approximately normal distribution
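The dispersion statistic is simple to compute once the residuals are split by the sign of a factor. The sketch below uses made-up residuals purely for illustration (they are not the data of Example 6.4), and `dispersion_effect` is a hypothetical helper name.

```python
import math
from statistics import variance


def dispersion_effect(residuals, levels):
    """F* = ln(S^2(i+) / S^2(i-)) for one factor.

    `levels` holds the coded level (+1 or -1) of the factor for each run;
    S^2 is the sample variance of the residuals at that level.
    """
    plus = [r for r, x in zip(residuals, levels) if x > 0]
    minus = [r for r, x in zip(residuals, levels) if x < 0]
    return math.log(variance(plus) / variance(minus))


# ASSUMPTION: hypothetical residuals, twice as spread out at the high level.
levels = [-1, -1, -1, -1, 1, 1, 1, 1]
residuals = [-1.0, 1.0, -1.0, 1.0, -2.0, 2.0, -2.0, 2.0]

f_star = dispersion_effect(residuals, levels)
print(f"F* = {f_star:.3f}")
```

A value near zero suggests the factor does not affect variability; a large positive or negative value points to a dispersion effect.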

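6.7 2^k Designs are Optimal Designs

• Consider the 2^2 design with one replicate.

• Fit the following model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$

• Matrix form:

$$y = X\beta + \varepsilon, \quad y = \begin{bmatrix} (1) \\ a \\ b \\ ab \end{bmatrix}, \quad X = \begin{bmatrix} 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \end{bmatrix}$$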

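• The LS estimation:

$$\hat{\beta} = (X'X)^{-1}X'y = \begin{bmatrix} \dfrac{(1) + a + b + ab}{4} \\ \dfrac{a + ab - b - (1)}{4} \\ \dfrac{b + ab - a - (1)}{4} \\ \dfrac{(1) - a - b + ab}{4} \end{bmatrix}$$

• D-optimal criterion, |X'X|: the volume of the joint confidence region that contains all coefficients is inversely proportional to the square root of |X'X|.

• G-optimal design: minimizes the maximum prediction variance,

$$\min \max \text{Var}(\hat{y}), \quad \text{Var}(\hat{y}) = \frac{\sigma^2}{4}\left(1 + x_1^2 + x_2^2 + x_1^2 x_2^2\right)$$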
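Both criteria can be checked numerically for the 2^2 design. The sketch below builds X'X from the model matrix, uses the fact that it is diagonal to get the determinant, and evaluates the scaled prediction variance Var(ŷ)/σ² at a few points; the function name is hypothetical.

```python
# Model matrix for runs (1), a, b, ab with columns 1, x1, x2, x1*x2.
X = [
    [1, -1, -1, 1],
    [1, 1, -1, -1],
    [1, -1, 1, -1],
    [1, 1, 1, 1],
]
p = len(X[0])

# X'X, computed entry by entry.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

# The columns are orthogonal, so X'X is diagonal (4I) and its determinant
# is the product of the diagonal entries.
is_diagonal = all(XtX[i][j] == 0 for i in range(p) for j in range(p) if i != j)
det = 1
for i in range(p):
    det *= XtX[i][i]


def scaled_prediction_variance(x1, x2):
    """Var(y_hat)/sigma^2 = x'(X'X)^{-1}x; valid here because X'X is diagonal."""
    x = [1, x1, x2, x1 * x2]
    return sum(v * v / XtX[i][i] for i, v in enumerate(x))


print("X'X diagonal:", is_diagonal, " |X'X| =", det)
print("at a corner:", scaled_prediction_variance(1, 1))
print("at the center:", scaled_prediction_variance(0, 0))
```

Over the region -1 ≤ x1, x2 ≤ 1 the prediction variance is largest at the corners, where it equals σ².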

6.8 The Addition of Center Points to the 2^k Design

• Based on the idea of replicating some of the runs in a factorial design

• Runs at the center provide an estimate of error and allow the experimenter to distinguish between two possible models:

First-order model (interaction):

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j}\sum \beta_{ij} x_i x_j + \varepsilon$$

Second-order model:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i<j}\sum \beta_{ij} x_i x_j + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \varepsilon$$

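To detect the possibility of quadratic effects: add center points

Let $\bar{y}_F$ be the average of the $n_F$ factorial runs and $\bar{y}_C$ the average of the $n_C$ center runs.

$$\bar{y}_F = \bar{y}_C \Rightarrow \text{no "curvature"}$$

The hypotheses are:

$$H_0: \sum_{i=1}^{k} \beta_{ii} = 0$$

$$H_1: \sum_{i=1}^{k} \beta_{ii} \neq 0$$

$$SS_{\text{Pure Quad}} = \frac{n_F n_C (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}$$

This sum of squares has a single degree of freedom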

• Example 6.6

Refer to the original experiment shown in Table 6.10. Suppose that four center points are added to this experiment, and at the points x1 = x2 = x3 = x4 = 0 the four observed filtration rates were 73, 75, 66, and 69. The average of these four center points is 70.75, and the average of the 16 factorial runs is 70.06. Since these two averages are very similar, we suspect that there is no strong curvature present.

Here $n_C = 4$. Usually between 3 and 6 center points will work well

Design-Expert provides the analysis, including the F-test for pure quadratic curvature
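The curvature check for Example 6.6 can be worked by hand from the numbers given. In the sketch below, the factorial average 70.0625 is the intercept of the fitted coded-factor model; using the sample variance of the four center points as the error estimate is an assumption about how the F-test is formed, and software may pool other error terms.

```python
from statistics import mean, variance

center_runs = [73, 75, 66, 69]   # observed filtration rates at the center
n_f = 16                         # factorial runs in the 2^4 design
n_c = len(center_runs)

ybar_f = 70.0625                 # average of the 16 factorial runs
ybar_c = mean(center_runs)

# Single-degree-of-freedom sum of squares for pure quadratic curvature.
ss_pure_quad = n_f * n_c * (ybar_f - ybar_c) ** 2 / (n_f + n_c)

# ASSUMPTION: error is estimated from the center points only (n_c - 1 = 3 df).
ms_error = variance(center_runs)
f_ratio = ss_pure_quad / ms_error

print(f"ybar_C = {ybar_c}, SS_PureQuad = {ss_pure_quad:.4f}")
print(f"MS_E = {ms_error}, F = {f_ratio:.4f}")
```

An F ratio this far below 1 agrees with the informal conclusion that there is no strong curvature.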


• If curvature is significant, augment the design with axial runs to create a central composite design. The CCD is a very effective design for fitting a second-order response surface model
