7/29/2019 Linear models questions.pdf
QUESTION 1.
1) Obtain the likelihood function for the sample observations $Y_1,\ldots,Y_n$ given $X_1,\ldots,X_n$ if the normal model is assumed to be applicable.
2) Obtain the maximum likelihood estimators for $\beta_0$ and $\beta_1$.

Solution:
1) Under the normal model $Y_i \sim N(\beta_0+\beta_1X_i,\ \sigma^2)$, with the corresponding density function given by
$$f(y_i)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(y_i-\beta_0-\beta_1X_i)^2}{2\sigma^2}\right).$$
Hence the likelihood function for the normal error model, given the sample observations $Y_1,\ldots,Y_n$, is
$$L(\beta_0,\beta_1,\sigma^2)=\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(Y_i-\beta_0-\beta_1X_i)^2}{2\sigma^2}\right).$$
2) To find the MLE we use the log-likelihood
$$\ln L(\beta_0,\beta_1,\sigma^2)=-\frac{n}{2}\ln(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum(Y_i-\beta_0-\beta_1X_i)^2.$$
Then
$$\frac{\partial}{\partial\beta_0}\ln L=\frac{1}{\sigma^2}\sum(Y_i-\beta_0-\beta_1X_i)=\frac{1}{\sigma^2}\Big(\sum Y_i-n\beta_0-\beta_1\sum X_i\Big)$$
and
$$\frac{\partial}{\partial\beta_1}\ln L=\frac{1}{\sigma^2}\sum(Y_i-\beta_0-\beta_1X_i)X_i=\frac{1}{\sigma^2}\Big(\sum X_iY_i-\beta_0\sum X_i-\beta_1\sum X_i^2\Big).$$
Setting both partial derivatives equal to zero gives the equations
$$\sum Y_i=n\beta_0+\beta_1\sum X_i,\qquad \sum X_iY_i=\beta_0\sum X_i+\beta_1\sum X_i^2.$$
From the first equation we get $\beta_0=\bar{Y}-\beta_1\bar{X}$. Using it in the second we get
$$\sum X_iY_i=(\bar{Y}-\beta_1\bar{X})\sum X_i+\beta_1\sum X_i^2,$$
hence
$$\hat{\beta}_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}.$$
The MLEs for $\beta_0$ and $\beta_1$ are therefore
$$\hat{\beta}_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}},\qquad \hat{\beta}_0=\bar{Y}-\hat{\beta}_1\bar{X},$$
the same as the estimators obtained using the least squares method.
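The coincidence of the MLE with the least squares estimators can be checked numerically. The sketch below uses a small made-up data set (not from any question in this set) and compares the closed-form estimators derived above with numpy's built-in least squares polynomial fit:

```python
import numpy as np

# Hypothetical sample data, purely for illustration
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n = len(X)

# Closed-form MLE / least-squares estimators from the derivation above
b1 = (np.sum(X * Y) - np.sum(X) * np.sum(Y) / n) / (np.sum(X**2) - np.sum(X)**2 / n)
b0 = np.mean(Y) - b1 * np.mean(X)

# numpy's own least-squares line fit, for comparison
slope, intercept = np.polyfit(X, Y, 1)
print(b0, b1)
```

Both routes produce the same slope and intercept up to floating-point error.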
QUESTION 2.
Data from a study of the relation between the size of a bid in million rands ($X$) and the cost to the firm of preparing the bid in thousand rands ($Y$) for 12 recent bids are presented in the table below:

i    1     2     3     4     5    6     7     8     9      10   11    12
Xi   2.13  1.21  11.0  6.0   5.6  6.91  2.97  3.35  10.39  1.1  4.36  8.0
Yi   15.5  11.1  62.6  35.4  24.9 28.1  15.0  23.2  42.0   10   20    47.5

The scatter plot strongly suggests that the error variance increases with $X$. Fit the weighted least squares regression line using weights $w_i=1/X_i^2$.

Solution:

Xi      Yi      wi = 1/Xi^2   wi Xi      wi Yi      wi Xi Yi   wi Xi^2
2.13    15.5    0.220415      0.469484   3.416430   7.276995   1
1.21    11.1    0.683013      0.826446   7.587449   9.173554   1
11      62.6    0.008264      0.090909   0.517355   5.690909   1
6       35.4    0.027778      0.166667   0.983333   5.900000   1
5.6     24.9    0.031888      0.178571   0.794005   4.446429   1
6.91    28.1    0.020943      0.144718   0.588505   4.066570   1
2.97    15      0.113367      0.336700   1.700507   5.050505   1
3.35    23.2    0.089107      0.298507   2.067276   6.925373   1
10.39   42      0.009263      0.096246   0.389061   4.042348   1
1.1     10      0.826446      0.909091   8.264463   9.090909   1
4.36    20      0.052605      0.229358   1.052100   4.587156   1
8       47.5    0.015625      0.125000   0.742188   5.937500   1
Totals: 63.02   335.3   2.098715   3.871698   28.09667   72.18825   12

$$b_1=\frac{\sum w_iX_iY_i-\frac{\sum w_iX_i\sum w_iY_i}{\sum w_i}}{\sum w_iX_i^2-\frac{(\sum w_iX_i)^2}{\sum w_i}}=\frac{72.18825-\frac{3.871698\times 28.09667}{2.098715}}{12-\frac{3.871698^2}{2.098715}}=4.1906$$

$$b_0=\frac{\sum w_iY_i-b_1\sum w_iX_i}{\sum w_i}=\frac{28.09667-4.1906\times 3.871698}{2.098715}=5.6568$$

Hence $\hat{Y}=5.6568+4.1906X$.
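A quick numerical check of the weighted least squares fit above, using the raw bid data and the same summation formulas (numpy is assumed to be available; small differences in the last digit come from rounding in the hand computation):

```python
import numpy as np

# Bid data from Question 2; weights w_i = 1/X_i^2
X = np.array([2.13, 1.21, 11.0, 6.0, 5.6, 6.91, 2.97, 3.35, 10.39, 1.1, 4.36, 8.0])
Y = np.array([15.5, 11.1, 62.6, 35.4, 24.9, 28.1, 15.0, 23.2, 42.0, 10.0, 20.0, 47.5])
w = 1 / X**2

# Weighted normal equations in summation form, as in the solution
sw, swx, swy = np.sum(w), np.sum(w * X), np.sum(w * Y)
b1 = (np.sum(w * X * Y) - swx * swy / sw) / (np.sum(w * X**2) - swx**2 / sw)
b0 = (swy - b1 * swx) / sw
print(b0, b1)
```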
QUESTION 3.
The following data were obtained in a certain study.

i    1    2    3    4    5    6     7     8     9     10    11    12
Xi   1    1    1    2    2    3     3     3     3     5     5     5
Yi   4.8  4.9  5.1  7.9  8.3  10.9  10.8  11.3  11.1  16.5  17.3  17.1

Summary calculational results are: $\sum X_i=34$, $\sum Y_i=126$, $\sum X_i^2=122$, $\sum Y_i^2=1554.66$, $\sum X_iY_i=434$.
1) Fit a linear regression function.
2) Perform an F test to determine whether or not there is lack of fit of a linear regression function. Use $\alpha=0.05$.

Solution:
1) We have
$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{434-\frac{34\times 126}{12}}{122-\frac{34^2}{12}}=3$$
and
$$b_0=\bar{Y}-b_1\bar{X}=\tfrac{1}{12}(126)-3\cdot\tfrac{1}{12}(34)=2.$$
Therefore $\hat{Y}=2+3X$.
2) F test for lack of fit. We have $c=4$ levels of $X$, with 3 replicates at the $X=1$ level, 2 replicates at $X=2$, 4 replicates at $X=3$ and 3 replicates at $X=5$, and $n=12$. Hence, with $\bar{Y}_j=\frac{1}{n_j}\sum_{i=1}^{n_j}Y_{i,j}$ the mean at the $j$th level of $X$:
$$\bar{Y}_1=\tfrac{1}{3}(4.8+4.9+5.1)=4.9333\ \text{at}\ X=1$$
$$\bar{Y}_2=\tfrac{1}{2}(7.9+8.3)=8.1\ \text{at}\ X=2$$
$$\bar{Y}_3=\tfrac{1}{4}(10.9+10.8+11.3+11.1)=11.025\ \text{at}\ X=3$$
$$\bar{Y}_4=\tfrac{1}{3}(16.5+17.3+17.1)=16.967\ \text{at}\ X=5$$
$$SSPE=\sum_{j=1}^{c}\sum_{i=1}^{n_j}(Y_{i,j}-\bar{Y}_j)^2=(4.8-4.9333)^2+(4.9-4.9333)^2+(5.1-4.9333)^2+(7.9-8.1)^2+(8.3-8.1)^2$$
$$+(10.9-11.025)^2+(10.8-11.025)^2+(11.3-11.025)^2+(11.1-11.025)^2+(16.5-16.967)^2+(17.3-16.967)^2+(17.1-16.967)^2=0.62083$$
$$MSPE=\frac{SSPE}{n-c}=\frac{0.62083}{12-4}=0.077604$$
$$SSE=\sum(Y_i-\hat{Y}_i)^2=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i=1554.66-2(126)-3(434)=0.66$$
$$SSLF=SSE-SSPE=0.66-0.62083=0.03917,\qquad MSLF=\frac{SSLF}{c-2}=\frac{0.03917}{4-2}=0.019585$$
The hypotheses:
$$H_0:\ E(Y)=\beta_0+\beta_1X\qquad H_a:\ E(Y)\ne\beta_0+\beta_1X$$
Test statistic:
$$F=\frac{MSLF}{MSPE}=\frac{0.019585}{0.077604}=0.25237$$
The decision rule:
If $F\le F(1-\alpha;\,c-2,\,n-c)$, conclude H0.
If $F>F(1-\alpha;\,c-2,\,n-c)$, conclude Ha.
$F(1-\alpha;c-2,n-c)=F(0.95;2,8)=4.46$. Since $F=0.25237\le 4.46$ we conclude H0: there is no lack of fit.
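The lack-of-fit decomposition above can be reproduced from the raw data. A minimal sketch (numpy assumed available):

```python
import numpy as np

# Data of Question 3: c = 4 distinct X levels with replicates
X = np.array([1, 1, 1, 2, 2, 3, 3, 3, 3, 5, 5, 5], dtype=float)
Y = np.array([4.8, 4.9, 5.1, 7.9, 8.3, 10.9, 10.8, 11.3, 11.1, 16.5, 17.3, 17.1])
n, c = len(X), len(np.unique(X))

# Least squares fit
b1 = (np.sum(X * Y) - np.sum(X) * np.sum(Y) / n) / (np.sum(X**2) - np.sum(X)**2 / n)
b0 = np.mean(Y) - b1 * np.mean(X)

# Pure error: deviations of replicates from their level means
sspe = sum(np.sum((Y[X == x] - Y[X == x].mean())**2) for x in np.unique(X))
sse = np.sum((Y - (b0 + b1 * X))**2)
sslf = sse - sspe
F = (sslf / (c - 2)) / (sspe / (n - c))
print(round(F, 5))
```

Since F is far below F(0.95; 2, 8) = 4.46, there is no evidence of lack of fit.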
QUESTION 4.
1) State the simple normal linear regression model in matrix terms.
2) Prove the following formula for SSE: $SSE=Y'Y-b'X'Y$.
3) Prove that for $\hat{Y}_h=X_h'b$ the variance is, in matrix notation, $\sigma^2(\hat{Y}_h)=\sigma^2X_h'(X'X)^{-1}X_h$.

Solution:
1) Let
$$Y=\begin{pmatrix}Y_1\\Y_2\\\vdots\\Y_n\end{pmatrix},\qquad X=\begin{pmatrix}1&X_1\\1&X_2\\\vdots&\vdots\\1&X_n\end{pmatrix},\qquad \beta=\begin{pmatrix}\beta_0\\\beta_1\end{pmatrix},\qquad \varepsilon=\begin{pmatrix}\varepsilon_1\\\varepsilon_2\\\vdots\\\varepsilon_n\end{pmatrix};$$
then the model is
$$\underset{n\times 1}{Y}=\underset{n\times 2}{X}\ \underset{2\times 1}{\beta}+\underset{n\times 1}{\varepsilon},$$
where $\beta$ is the vector of parameters; $X$ is a matrix of known constants, namely the values of the independent variable; and $\varepsilon$ is a vector of independent normal random variables with $E(\varepsilon)=0$ and $\sigma^2(\varepsilon)=\sigma^2I$.

2) We know that $SSE=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i$. Let us notice that if $Y=(Y_1,Y_2,\ldots,Y_n)'$ then $Y'=(Y_1\ Y_2\ \cdots\ Y_n)$, and with $X$ as above
$$X'=\begin{pmatrix}1&1&\cdots&1\\X_1&X_2&\cdots&X_n\end{pmatrix}.$$
Hence
$$Y'Y=\sum Y_i^2\qquad\text{and}\qquad X'Y=\begin{pmatrix}\sum Y_i\\\sum X_iY_i\end{pmatrix}.$$
Using this with $b=(b_0,b_1)'$ we have
$$Y'Y-b'X'Y=\sum Y_i^2-(b_0\ \ b_1)\begin{pmatrix}\sum Y_i\\\sum X_iY_i\end{pmatrix}=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i=SSE,$$
which completes the proof.

Alternatively for 2): we know that $(A')'=A$, $(A+B)'=A'+B'$ and $(AB)'=B'A'$, and the normal equations give $X'Xb=X'Y$. Hence
$$X'Xb-X'Y=\begin{pmatrix}0\\0\end{pmatrix}\qquad\text{and so}\qquad (X'Xb-X'Y)'=b'X'X-Y'X=(0\ \ 0).$$
From the definition,
$$SSE=e'e=(Y-Xb)'(Y-Xb)=(Y'-b'X')(Y-Xb)=Y'Y-Y'Xb-b'X'Y+b'X'Xb.$$
Since $Y'Xb$ is a scalar, $Y'Xb=(Y'Xb)'=b'X'Y$, so
$$SSE=Y'Y-b'X'Y+(b'X'X-Y'X)b=Y'Y-b'X'Y+(0\ \ 0)\begin{pmatrix}b_0\\b_1\end{pmatrix}=Y'Y-b'X'Y.$$

3) We know that if $W$ is a random vector obtained by premultiplying the random vector $Y$ by a constant matrix $A$, $W=AY$, then
$$\sigma^2(W)=\sigma^2(AY)=A\,\sigma^2(Y)\,A'.\qquad (*)$$
Since $\hat{Y}_h=X_h'b$, using $(*)$ with $A=X_h'$ applied to $b$ we get
$$\sigma^2(\hat{Y}_h)=X_h'\,\sigma^2(b)\,X_h.$$
Using the fact that $\sigma^2(b)=\sigma^2(X'X)^{-1}$, we get
$$\sigma^2(\hat{Y}_h)=X_h'\,\sigma^2(X'X)^{-1}X_h=\sigma^2X_h'(X'X)^{-1}X_h.$$
QUESTION 5.
The fitted values and residuals of a regression analysis are given below:

t     1      2      3      4      5      6      7      8      9      10
Yt^   21.96  4.15   7.36   22.11  10.98  22.06  47.35  47.05  73.40  69.79
et    -1.45  -0.26  -0.16  -0.20  0.32   0.63   0.24   0.55   -0.50  -0.65

t     11     12     13     14     15     16     17     18     19     20
Yt^   83.83  87.09  75.64  76.15  69.08  32.24  47.30  52.29  78.03  77.78
et    0.06   -0.09  -0.24  -1.03  0.02   0.56   0.80   0.11   0.57   0.72

Assume that the simple linear regression model with the random terms following a first-order autoregressive process is appropriate. Conduct a formal test for positive autocorrelation using $\alpha=0.05$.

Solution:
The hypotheses:
$$H_0:\ \rho=0\qquad H_a:\ \rho>0$$
The Durbin-Watson test statistic:
$$D=\frac{\sum_{t=2}^{n}(e_t-e_{t-1})^2}{\sum_{t=1}^{n}e_t^2}=\frac{6.5025}{6.7072}=0.96948,$$
where
$$\sum_{t=1}^{n}e_t^2=(-1.45)^2+(-0.26)^2+(-0.16)^2+\cdots+(0.57)^2+(0.72)^2=6.7072$$
and
$$\sum_{t=2}^{n}(e_t-e_{t-1})^2=(-0.26-(-1.45))^2+(-0.16-(-0.26))^2+\cdots+(0.72-0.57)^2=6.5025.$$
The decision rule:
If $D>d_U$, conclude H0.
If $D<d_L$, conclude Ha.
If $d_L\le D\le d_U$, the test is inconclusive.
For $n=20$ and one independent variable ($p=2$ parameters), $d_L=1.20$ and $d_U=1.41$. Since $D=0.96948<d_L=1.20$ we conclude Ha: the error terms are positively autocorrelated.
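The statistic is a one-liner on the residual vector; a sketch using the residuals of Question 5 (numpy assumed available):

```python
import numpy as np

# Residuals e_1, ..., e_20 from Question 5
e = np.array([-1.45, -0.26, -0.16, -0.20, 0.32, 0.63, 0.24, 0.55, -0.50, -0.65,
              0.06, -0.09, -0.24, -1.03, 0.02, 0.56, 0.80, 0.11, 0.57, 0.72])

# Durbin-Watson: sum of squared successive differences over sum of squares
D = np.sum(np.diff(e)**2) / np.sum(e**2)
print(round(D, 5))
```

Comparing D against the tabulated bounds (dL = 1.20, dU = 1.41) reproduces the conclusion above.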
QUESTION 6.
The following data were obtained in a certain experiment:

i    Xi,1  Xi,2  Yi
1    1     2     2.5
2    1     2     3
3    1     2     3.5
4    2     1     3
5    2     1     4
6    0     1     1
7    0     1     1.5
8    0     1     2
9    1     0     1.5
10   1     0     2
11   1     0     2.5

The data summary is given below in matrix form:
$$X'X=\begin{pmatrix}11&10&11\\10&14&10\\11&10&17\end{pmatrix},\qquad (X'X)^{-1}=\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix},\qquad X'Y=\begin{pmatrix}26.5\\29\\29.5\end{pmatrix},\qquad Y'Y=72.25.$$
Assume that a first-order regression model with independent normal errors is appropriate.
1) Find the estimated regression coefficients.
2) Obtain an ANOVA table and use it to test whether there is a regression relation, using $\alpha=0.05$.
3) Estimate $\beta_1$ and $\beta_2$ jointly by the Bonferroni procedure using an 80 percent family confidence coefficient.

Solution:
1)
$$b=\begin{pmatrix}b_0\\b_1\\b_2\end{pmatrix}=(X'X)^{-1}X'Y=\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix}\begin{pmatrix}26.5\\29\\29.5\end{pmatrix}=\begin{pmatrix}1.0\\1.0\\0.5\end{pmatrix}.$$
2) $1'Y=\sum Y_i=26.5$ (we get it from $X'Y$).
$$SSTO=Y'Y-\tfrac{1}{n}(1'Y)^2=72.25-\tfrac{1}{11}(26.5)^2=8.4091$$
$$SSE=Y'Y-b'X'Y=72.25-(1\ \ 1\ \ 0.5)\begin{pmatrix}26.5\\29\\29.5\end{pmatrix}=72.25-70.25=2.0$$
$$SSR=b'X'Y-\tfrac{1}{n}(1'Y)^2=70.25-\tfrac{1}{11}(26.5)^2=6.4091$$
ANOVA table:

source of variation   SS              df           MS
regression            SSR = 6.4091    p - 1 = 2    MSR = SSR/(p-1) = 3.2046
error                 SSE = 2         n - p = 8    MSE = SSE/(n-p) = 0.25
total                 SSTO = 8.4091   n - 1 = 10

Hypotheses: H0: $\beta_1=\beta_2=0$; Ha: not both $\beta_1$ and $\beta_2$ are equal to zero.
Test statistic:
$$F=\frac{MSR}{MSE}=\frac{3.2046}{0.25}=12.818$$
The decision rule:
If $F\le F(1-\alpha;\,p-1,\,n-p)$, conclude H0.
If $F>F(1-\alpha;\,p-1,\,n-p)$, conclude Ha.
$F(1-\alpha;p-1,n-p)=F(0.95;2,8)=4.46$. Since $F=12.818>4.46$ we conclude Ha (not both $\beta_1$ and $\beta_2$ are equal to zero), which means that there is a linear association between $X$ and $Y$.
3) If $g$ parameters are to be estimated jointly (where $g\le p$), the Bonferroni confidence limits with family confidence coefficient $1-\alpha$ are
$$b_k\pm B\,s\{b_k\},\qquad B=t(1-\alpha/2g;\,n-p),$$
and we get $s^2\{b_1\}$, $s^2\{b_2\}$ from $s^2\{b\}=MSE\,(X'X)^{-1}$. In our case $g=2$, $b_1=1$, $b_2=0.5$, and
$$s^2\{b\}=0.25\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix}=\begin{pmatrix}0.10648&-0.046296&-0.041667\\-0.046296&0.050926&0\\-0.041667&0&0.041667\end{pmatrix},$$
so $s^2\{b_1\}=0.050926$, $s\{b_1\}=\sqrt{0.050926}=0.22567$, and $s^2\{b_2\}=0.041667$, $s\{b_2\}=\sqrt{0.041667}=0.20411$.
$$B=t(1-0.2/4;\,8)=t(0.95;8)=1.860$$
Hence the limits for $\beta_1$ and $\beta_2$ are $(0.58025,\ 1.4197)$ and $(0.12036,\ 0.87964)$ respectively, since
$$1\pm 1.860(0.22567)\qquad\text{and}\qquad 0.5\pm 1.860(0.20411).$$
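All the matrix computations of Question 6 can be reproduced directly from the raw data. A minimal sketch (numpy assumed available):

```python
import numpy as np

# Raw data of Question 6
X1 = np.array([1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1], dtype=float)
X2 = np.array([2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0], dtype=float)
Y  = np.array([2.5, 3, 3.5, 3, 4, 1, 1.5, 2, 1.5, 2, 2.5])
n, p = len(Y), 3

# Design matrix with intercept column
X = np.column_stack([np.ones(n), X1, X2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y                     # estimated coefficients

sse = Y @ Y - b @ X.T @ Y                 # SSE = Y'Y - b'X'Y
mse = sse / (n - p)
s2b = mse * XtX_inv                       # variance-covariance matrix of b
print(b, mse)
```

This reproduces b = (1, 1, 0.5)' and MSE = 0.25; the diagonal of `s2b` gives the variances used in the Bonferroni limits.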
QUESTION 7.
For a certain experiment the first-order regression model with two independent variables was used. The calculated diagonal elements of the hat matrix are:
i      1      2      3      4      5      6      7      8
hi,i   0.237  0.237  0.237  0.237  0.137  0.137  0.137  0.137

i      9      10     11     12     13     14     15     16
hi,i   0.137  0.137  0.137  0.137  0.237  0.237  0.237  0.237

1) Describe the use of the hat matrix for identifying outlying X observations.
2) Identify any outlying X observations using the hat matrix method.

Solution:
1) The hat matrix H is given by $H=X(X'X)^{-1}X'$. The diagonal element $h_{i,i}$ of the hat matrix is called the leverage of the $i$th observation. A large leverage value $h_{i,i}$ indicates that the $i$th observation is distant from the center of the X observations. The mean leverage value is
$$\bar{h}=\frac{\sum h_{i,i}}{n}=\frac{p}{n}.$$
Hence leverage values greater than $2p/n$ are considered by this rule to indicate outlying observations with regard to the X values.
2) In our case $n=16$ and $p=3$, so the critical value is
$$\frac{2p}{n}=\frac{6}{16}=0.375.$$
Since all leverage values in our case are less than 0.375, this method identifies no outlying X observations.

QUESTION 8.
Prove that:
1) $SSR(X_1,X_2,X_3)=SSR(X_1)+SSR(X_2,X_3\mid X_1)$
2) $SSR(X_1)+SSR(X_2\mid X_1)=SSR(X_2)+SSR(X_1\mid X_2)$

Solution:
1) We know that
$$SSR(X_2,X_3\mid X_1)=SSE(X_1)-SSE(X_1,X_2,X_3)$$
and
$$SSTO=SSE(X_1)+SSR(X_1)=SSE(X_1,X_2,X_3)+SSR(X_1,X_2,X_3).$$
Hence
$$\text{LHS}=SSR(X_1,X_2,X_3)=SSTO-SSE(X_1,X_2,X_3)=SSR(X_1)+SSE(X_1)-SSE(X_1,X_2,X_3)$$
$$=SSR(X_1)+SSR(X_2,X_3\mid X_1)=\text{RHS},$$
which completes the proof.
2) We know that
$$SSR(X_2\mid X_1)=SSE(X_1)-SSE(X_1,X_2),\qquad SSR(X_1\mid X_2)=SSE(X_2)-SSE(X_1,X_2)$$
and
$$SSTO=SSE(X_1)+SSR(X_1)=SSE(X_2)+SSR(X_2).$$
Hence
$$\text{LHS}=SSR(X_1)+SSR(X_2\mid X_1)=SSR(X_1)+SSE(X_1)-SSE(X_1,X_2)=SSTO-SSE(X_1,X_2)$$
$$=SSE(X_2)+SSR(X_2)-SSE(X_1,X_2)=SSR(X_2)+SSR(X_1\mid X_2)=\text{RHS},$$
which completes the proof.
QUESTION 9.
The measure $r^2$ is called the coefficient of determination and is given by the formula
$$r^2=\frac{SSTO-SSE}{SSTO}=\frac{SSR}{SSTO}=1-\frac{SSE}{SSTO}.$$
The square root (with a plus or minus sign attached to this measure according to whether the slope of the fitted regression line is positive or negative), $r=\pm\sqrt{r^2}$, is called the coefficient of correlation. Prove the following:
$$r=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\big[\sum(X_i-\bar{X})^2\sum(Y_i-\bar{Y})^2\big]^{1/2}}.$$
Solution:
We are going to use the following formulas:
$$SSR=b_1\sum(X_i-\bar{X})(Y_i-\bar{Y}),\qquad b_1=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2},\qquad SSTO=\sum(Y_i-\bar{Y})^2.$$
Let us notice that $b_1>0$ if $\sum(X_i-\bar{X})(Y_i-\bar{Y})>0$ and $b_1<0$ if $\sum(X_i-\bar{X})(Y_i-\bar{Y})<0$. Then
$$r^2=\frac{SSR}{SSTO}=\frac{b_1\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(Y_i-\bar{Y})^2}=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}\cdot\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(Y_i-\bar{Y})^2}=\frac{\big[\sum(X_i-\bar{X})(Y_i-\bar{Y})\big]^2}{\sum(X_i-\bar{X})^2\sum(Y_i-\bar{Y})^2}.$$
Hence
$$r=\operatorname{sign}(b_1)\left[\frac{\big(\sum(X_i-\bar{X})(Y_i-\bar{Y})\big)^2}{\sum(X_i-\bar{X})^2\sum(Y_i-\bar{Y})^2}\right]^{1/2}=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\big[\sum(X_i-\bar{X})^2\sum(Y_i-\bar{Y})^2\big]^{1/2}}.$$
QUESTION 10.
The following data were obtained in a certain experiment:

i    Yi   Xi,1  Xi,2
1    64   4     2
2    73   4     4
3    61   4     2
4    76   4     4
5    72   6     2
6    80   6     4
7    71   6     2
8    83   6     4
9    83   8     2
10   89   8     4
11   86   8     2
12   93   8     4
13   88   10    2
14   95   10    4
15   94   10    2
16   100  10    4

The data summary is given below in matrix form:
$$X'X=\begin{pmatrix}16&112&48\\112&864&336\\48&336&160\end{pmatrix},\qquad (X'X)^{-1}=\begin{pmatrix}\frac{99}{80}&-\frac{7}{80}&-\frac{3}{16}\\-\frac{7}{80}&\frac{1}{80}&0\\-\frac{3}{16}&0&\frac{1}{16}\end{pmatrix},\qquad X'Y=\begin{pmatrix}1308\\9510\\3994\end{pmatrix},\qquad Y'Y=108896.$$
Assume that a first-order regression model with independent normal errors is appropriate.
1) Find the estimated regression coefficients.
2) Obtain an ANOVA table and use it to test whether there is a regression relation, using $\alpha=0.05$.
3) Estimate $\beta_1$ and $\beta_2$ jointly by the Bonferroni procedure using an 80 percent family confidence coefficient.

Solution:
1)
$$b=\begin{pmatrix}b_0\\b_1\\b_2\end{pmatrix}=(X'X)^{-1}X'Y=\begin{pmatrix}\frac{99}{80}&-\frac{7}{80}&-\frac{3}{16}\\-\frac{7}{80}&\frac{1}{80}&0\\-\frac{3}{16}&0&\frac{1}{16}\end{pmatrix}\begin{pmatrix}1308\\9510\\3994\end{pmatrix}=\begin{pmatrix}\frac{753}{20}\\\frac{177}{40}\\\frac{35}{8}\end{pmatrix}=\begin{pmatrix}37.65\\4.425\\4.375\end{pmatrix}.$$
2) $1'Y=\sum Y_i=1308$ (we get it from $X'Y$).
$$SSTO=Y'Y-\tfrac{1}{n}(1'Y)^2=108896-\tfrac{1}{16}(1308)^2=1967$$
$$SSE=Y'Y-b'X'Y=108896-(37.65\ \ 4.425\ \ 4.375)\begin{pmatrix}1308\\9510\\3994\end{pmatrix}=108896-108801.7=94.3$$
$$SSR=b'X'Y-\tfrac{1}{n}(1'Y)^2=108801.7-106929=1872.7$$
ANOVA table:

source of variation   SS              df           MS
regression            SSR = 1872.7    p - 1 = 2    MSR = SSR/(p-1) = 936.35
error                 SSE = 94.3      n - p = 13   MSE = SSE/(n-p) = 7.2538
total                 SSTO = 1967     n - 1 = 15

Hypotheses: H0: $\beta_1=\beta_2=0$; Ha: not both $\beta_1$ and $\beta_2$ are equal to zero.
Test statistic:
$$F=\frac{MSR}{MSE}=\frac{936.35}{7.2538}=129.08$$
The decision rule:
If $F\le F(1-\alpha;\,p-1,\,n-p)$, conclude H0.
If $F>F(1-\alpha;\,p-1,\,n-p)$, conclude Ha.
$F(1-\alpha;p-1,n-p)=F(0.95;2,13)=3.81$. Since $F=129.08>3.81$ we conclude Ha (not both $\beta_1$ and $\beta_2$ are equal to zero), which means that there is a linear association between $X$ and $Y$.
3) If $g$ parameters are to be estimated jointly (where $g\le p$), the Bonferroni confidence limits with family confidence coefficient $1-\alpha$ are
$$b_k\pm B\,s\{b_k\},\qquad B=t(1-\alpha/2g;\,n-p),$$
and we get $s^2\{b_1\}$, $s^2\{b_2\}$ from $s^2\{b\}=MSE\,(X'X)^{-1}$. In our case $g=2$, $b_1=4.425$, $b_2=4.375$, and
$$s^2\{b\}=7.2538\begin{pmatrix}\frac{99}{80}&-\frac{7}{80}&-\frac{3}{16}\\-\frac{7}{80}&\frac{1}{80}&0\\-\frac{3}{16}&0&\frac{1}{16}\end{pmatrix}=\begin{pmatrix}8.9766&-0.63471&-1.3601\\-0.63471&0.090673&0\\-1.3601&0&0.45337\end{pmatrix},$$
so $s^2\{b_1\}=0.090673$, $s\{b_1\}=\sqrt{0.090673}=0.30112$, and $s^2\{b_2\}=0.45337$, $s\{b_2\}=\sqrt{0.45337}=0.67332$.
$$B=t(1-0.2/4;\,13)=t(0.95;13)=1.771$$
Hence the limits for $\beta_1$ and $\beta_2$ are $(3.8917,\ 4.9583)$ and $(3.1825,\ 5.5675)$ respectively, since
$$4.425\pm 1.771(0.30112)\qquad\text{and}\qquad 4.375\pm 1.771(0.67332).$$
QUESTION 11.
The following data were obtained in a certain experiment:

i    Xi,1  Xi,2  Yi
1    1     2     2.5
2    1     2     3
3    1     2     3.5
4    2     1     3
5    2     1     4
6    0     1     1
7    0     1     1.5
8    0     1     2
9    1     0     1.5
10   1     0     2
11   1     0     2.5

The data summary is given below in matrix form:
$$X'X=\begin{pmatrix}11&10&11\\10&14&10\\11&10&17\end{pmatrix},\qquad (X'X)^{-1}=\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix},\qquad X'Y=\begin{pmatrix}26.5\\29\\29.5\end{pmatrix},\qquad Y'Y=72.25.$$
Assume that a first-order regression model with independent normal errors is appropriate.
1) Find the estimated regression coefficients.
2) Obtain an ANOVA table and use it to test whether there is a regression relation, using $\alpha=0.05$.
3) Estimate $\beta_1$ and $\beta_2$ jointly by the Bonferroni procedure using an 80 percent family confidence coefficient.

Solution:
1)
$$b=\begin{pmatrix}b_0\\b_1\\b_2\end{pmatrix}=(X'X)^{-1}X'Y=\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix}\begin{pmatrix}26.5\\29\\29.5\end{pmatrix}=\begin{pmatrix}1.0\\1.0\\0.5\end{pmatrix}.$$
2) $1'Y=\sum Y_i=26.5$ (we get it from $X'Y$).
$$SSTO=Y'Y-\tfrac{1}{n}(1'Y)^2=72.25-\tfrac{1}{11}(26.5)^2=8.4091$$
$$SSE=Y'Y-b'X'Y=72.25-(1\ \ 1\ \ 0.5)\begin{pmatrix}26.5\\29\\29.5\end{pmatrix}=2.0$$
$$SSR=b'X'Y-\tfrac{1}{n}(1'Y)^2=70.25-\tfrac{1}{11}(26.5)^2=6.4091$$
ANOVA table:

source of variation   SS              df           MS
regression            SSR = 6.4091    p - 1 = 2    MSR = SSR/(p-1) = 3.2046
error                 SSE = 2         n - p = 8    MSE = SSE/(n-p) = 0.25
total                 SSTO = 8.4091   n - 1 = 10

Hypotheses: H0: $\beta_1=\beta_2=0$; Ha: not both $\beta_1$ and $\beta_2$ are equal to zero.
Test statistic:
$$F=\frac{MSR}{MSE}=\frac{3.2046}{0.25}=12.818$$
The decision rule:
If $F\le F(1-\alpha;\,p-1,\,n-p)$, conclude H0.
If $F>F(1-\alpha;\,p-1,\,n-p)$, conclude Ha.
$F(1-\alpha;p-1,n-p)=F(0.95;2,8)=4.46$. Since $F=12.818>4.46$ we conclude Ha (not both $\beta_1$ and $\beta_2$ are equal to zero), which means that there is a linear association between $X$ and $Y$.
3) If $g$ parameters are to be estimated jointly (where $g\le p$), the Bonferroni confidence limits with family confidence coefficient $1-\alpha$ are
$$b_k\pm B\,s\{b_k\},\qquad B=t(1-\alpha/2g;\,n-p),$$
and we get $s^2\{b_1\}$, $s^2\{b_2\}$ from $s^2\{b\}=MSE\,(X'X)^{-1}$. In our case $g=2$, $b_1=1$, $b_2=0.5$, and
$$s^2\{b\}=0.25\begin{pmatrix}\frac{23}{54}&-\frac{5}{27}&-\frac{1}{6}\\-\frac{5}{27}&\frac{11}{54}&0\\-\frac{1}{6}&0&\frac{1}{6}\end{pmatrix}=\begin{pmatrix}0.10648&-0.046296&-0.041667\\-0.046296&0.050926&0\\-0.041667&0&0.041667\end{pmatrix},$$
so $s^2\{b_1\}=0.050926$, $s\{b_1\}=0.22567$, and $s^2\{b_2\}=0.041667$, $s\{b_2\}=0.20411$.
$$B=t(1-0.2/4;\,8)=t(0.95;8)=1.860$$
Hence the limits for $\beta_1$ and $\beta_2$ are $(0.58025,\ 1.4197)$ and $(0.12036,\ 0.87964)$ respectively, since
$$1\pm 1.860(0.22567)\qquad\text{and}\qquad 0.5\pm 1.860(0.20411).$$
QUESTION 12.
For a certain experiment the first-order regression model with two independent variables was used. The calculated diagonal elements of the hat matrix are:

i      1      2      3      4      5      6      7      8      9      10
hi,i   0.91   0.194  0.131  0.268  0.149  0.141  0.429  0.067  0.135  0.165

i      11     12     13     14     15     16     17     18     19     20
hi,i   0.179  0.059  0.110  0.156  0.095  0.128  0.97   0.230  0.112  0.073

1) Describe the use of the hat matrix for identifying outlying X observations.
2) Identify any outlying X observations using the hat matrix method.

Solution:
1) The hat matrix H is given by $H=X(X'X)^{-1}X'$. The diagonal element $h_{i,i}$ of the hat matrix is called the leverage of the $i$th observation. A large leverage value $h_{i,i}$ indicates that the $i$th observation is distant from the center of the X observations. The mean leverage value is
$$\bar{h}=\frac{\sum h_{i,i}}{n}=\frac{p}{n}.$$
Hence leverage values greater than $2p/n$ are considered by this rule to indicate outlying observations with regard to the X values.
2) In our case $n=20$ and $p=3$, so the critical value is
$$\frac{2p}{n}=\frac{6}{20}=0.3.$$
Since the leverage values corresponding to the results from trials 1, 7 and 17 are greater than our critical value, we classify observations 1, 7 and 17 as outlying.
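The screening rule is easy to apply mechanically; a sketch over the leverage values of Question 12 (numpy assumed available):

```python
import numpy as np

# Diagonal hat-matrix elements from Question 12
h = np.array([0.91, 0.194, 0.131, 0.268, 0.149, 0.141, 0.429, 0.067, 0.135, 0.165,
              0.179, 0.059, 0.110, 0.156, 0.095, 0.128, 0.97, 0.230, 0.112, 0.073])
n, p = len(h), 3

cutoff = 2 * p / n                               # rule of thumb: h_ii > 2p/n
outlying = [i + 1 for i in range(n) if h[i] > cutoff]   # 1-based trial numbers
print(cutoff, outlying)
```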
QUESTION 13.
The fitted values and residuals of a regression analysis are given below:

i     1     2      3     4     5     6     7
Yi^   2.92  2.33   2.25  1.58  2.08  3.51  3.34
ei    0.18  -0.03  0.75  0.32  0.42  0.19  0.06

i     8      9     10     11     12     13     14     15
Yi^   2.42   2.84  2.50   3.59   2.16   1.91   2.50   3.26
ei    -0.42  0.06  -0.20  -0.39  -0.36  -0.51  -0.50  0.54

Assume that the simple linear regression model with the random terms following a first-order autoregressive process is appropriate. Conduct a formal test for positive autocorrelation using $\alpha=0.05$.

Solution:
The hypotheses:
$$H_0:\ \rho=0\qquad H_a:\ \rho>0$$
The Durbin-Watson test statistic:
$$D=\frac{\sum_{t=2}^{n}(e_t-e_{t-1})^2}{\sum_{t=1}^{n}e_t^2}=\frac{3.4604}{2.2179}=1.5602,$$
where
$$\sum_{t=1}^{n}e_t^2=(0.18)^2+(-0.03)^2+(0.75)^2+\cdots+(0.54)^2=2.2179$$
and
$$\sum_{t=2}^{n}(e_t-e_{t-1})^2=(-0.03-0.18)^2+(0.75-(-0.03))^2+\cdots+(0.54-(-0.50))^2=3.4604.$$
The decision rule:
If $D>d_U$, conclude H0.
If $D<d_L$, conclude Ha.
If $d_L\le D\le d_U$, the test is inconclusive.
For $n=15$ and one independent variable ($p=2$ parameters), $d_L=1.08$ and $d_U=1.36$. Since $D=1.5602>d_U=1.36$ we conclude H0: the error terms are not positively autocorrelated.

QUESTION 14.
Prove the following statements:
1) The sum of the observed values $Y_i$ equals the sum of the fitted values $\hat{Y}_i$:
$$\sum_{i=1}^{n}Y_i=\sum_{i=1}^{n}\hat{Y}_i$$
2) The regression line always goes through the point $(\bar{X},\bar{Y})$.

Solution:
1) This condition is implicit in the first normal equation $\sum Y_i=nb_0+b_1\sum X_i$:
$$\sum\hat{Y}_i=\sum(b_0+b_1X_i)=nb_0+b_1\sum X_i=\sum_{i=1}^{n}Y_i.$$
2) The estimated regression line is $\hat{Y}=b_0+b_1X$. We have to show that $\bar{Y}=b_0+b_1\bar{X}$. Dividing both sides of the first normal equation $\sum Y_i=nb_0+b_1\sum X_i$ by $n$ we get
$$\frac{1}{n}\sum Y_i=b_0+b_1\frac{1}{n}\sum X_i,$$
that is, $\bar{Y}=b_0+b_1\bar{X}$, which completes the proof.

QUESTION 15.
The following data were obtained in a certain study.

i    1    2    3    4    5    6     7     8     9     10    11    12
Xi   1    1    1    2    2    3     3     3     3     5     5     5
Yi   4.8  4.9  5.1  7.9  8.3  10.9  10.8  11.3  11.1  16.5  17.3  17.1

Summary calculational results are: $\sum X_i=34$, $\sum Y_i=126$, $\sum X_i^2=122$, $\sum Y_i^2=1554.66$, $\sum X_iY_i=434$.
1) Fit a linear regression function.
2) Perform an F test to determine whether or not there is lack of fit of a linear regression function. Use $\alpha=0.05$.

Solution:
1) We have
$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{434-\frac{34\times 126}{12}}{122-\frac{34^2}{12}}=3$$
and
$$b_0=\bar{Y}-b_1\bar{X}=\tfrac{1}{12}(126)-3\cdot\tfrac{1}{12}(34)=2.$$
Therefore $\hat{Y}=2+3X$.
2) F test for lack of fit. We have $c=4$ levels of $X$, with 3 replicates at the $X=1$ level, 2 replicates at $X=2$, 4 replicates at $X=3$ and 3 replicates at $X=5$, and $n=12$. Hence, with $\bar{Y}_j=\frac{1}{n_j}\sum_{i=1}^{n_j}Y_{i,j}$ the mean at the $j$th level of $X$:
$$\bar{Y}_1=\tfrac{1}{3}(4.8+4.9+5.1)=4.9333\ \text{at}\ X=1$$
$$\bar{Y}_2=\tfrac{1}{2}(7.9+8.3)=8.1\ \text{at}\ X=2$$
$$\bar{Y}_3=\tfrac{1}{4}(10.9+10.8+11.3+11.1)=11.025\ \text{at}\ X=3$$
$$\bar{Y}_4=\tfrac{1}{3}(16.5+17.3+17.1)=16.967\ \text{at}\ X=5$$
$$SSPE=\sum_{j=1}^{c}\sum_{i=1}^{n_j}(Y_{i,j}-\bar{Y}_j)^2=(4.8-4.9333)^2+(4.9-4.9333)^2+(5.1-4.9333)^2+(7.9-8.1)^2+(8.3-8.1)^2$$
$$+(10.9-11.025)^2+(10.8-11.025)^2+(11.3-11.025)^2+(11.1-11.025)^2+(16.5-16.967)^2+(17.3-16.967)^2+(17.1-16.967)^2=0.62083$$
$$MSPE=\frac{SSPE}{n-c}=\frac{0.62083}{12-4}=0.077604$$
$$SSE=\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i=1554.66-2(126)-3(434)=0.66$$
$$SSLF=SSE-SSPE=0.66-0.62083=0.03917,\qquad MSLF=\frac{SSLF}{c-2}=\frac{0.03917}{4-2}=0.019585$$
The hypotheses:
$$H_0:\ E(Y)=\beta_0+\beta_1X\qquad H_a:\ E(Y)\ne\beta_0+\beta_1X$$
Test statistic:
$$F=\frac{MSLF}{MSPE}=\frac{0.019585}{0.077604}=0.25237$$
The decision rule:
If $F\le F(1-\alpha;\,c-2,\,n-c)$, conclude H0.
If $F>F(1-\alpha;\,c-2,\,n-c)$, conclude Ha.
$F(1-\alpha;c-2,n-c)=F(0.95;2,8)=4.46$.
Since $F=0.25237\le F(1-\alpha;c-2,n-c)=4.46$ we conclude H0: there is no lack of fit.

QUESTION 16.
Consider the simple linear regression model expressed in matrix terms. Prove the following formula for SSE: $SSE=Y'Y-b'X'Y$.

Solution:
We know that $SSE=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i$. Let us notice that if $Y=(Y_1,Y_2,\ldots,Y_n)'$ then $Y'=(Y_1\ Y_2\ \cdots\ Y_n)$, and if
$$X=\begin{pmatrix}1&X_1\\1&X_2\\\vdots&\vdots\\1&X_n\end{pmatrix}\qquad\text{then}\qquad X'=\begin{pmatrix}1&1&\cdots&1\\X_1&X_2&\cdots&X_n\end{pmatrix}.$$
Hence
$$Y'Y=\sum Y_i^2\qquad\text{and}\qquad X'Y=\begin{pmatrix}\sum Y_i\\\sum X_iY_i\end{pmatrix}.$$
Using this with $b=(b_0,b_1)'$ we have
$$Y'Y-b'X'Y=\sum Y_i^2-(b_0\ \ b_1)\begin{pmatrix}\sum Y_i\\\sum X_iY_i\end{pmatrix}=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i=SSE,$$
which completes the proof.
QUESTION 17.
Prove the following theorem.

Theorem: MSE is an unbiased estimator of $\sigma^2$ for the simple linear regression model.

Solution:
Proof: We know that
$$SSE=\sum(Y_i-\hat{Y}_i)^2=\sum(Y_i-b_0-b_1X_i)^2=\sum\big(Y_i-\bar{Y}+b_1\bar{X}-b_1X_i\big)^2$$
$$=\sum Y_i^2-n\bar{Y}^2+\Big(\sum X_i^2-n\bar{X}^2\Big)b_1^2+2n\bar{X}b_1\bar{Y}-2b_1\sum X_iY_i,$$
hence
$$E(SSE)=\sum E(Y_i^2)-nE(\bar{Y}^2)+\Big(\sum X_i^2-n\bar{X}^2\Big)E(b_1^2)+2n\bar{X}E(b_1\bar{Y})-2\sum X_iE(Y_ib_1).$$
Write $b_1=\sum k_iY_i$ with $k_i=\dfrac{X_i-\bar{X}}{\sum(X_j-\bar{X})^2}$, and note the following facts, which we use below:
$$\sum k_i=0,\qquad \sum k_iX_i=1,\qquad \sum k_i^2=\frac{1}{\sum(X_j-\bar{X})^2},\qquad \sum X_i^2-n\bar{X}^2=\sum(X_i-\bar{X})^2.$$
We evaluate each term in turn.
First, $E(Y_i^2)=\sigma^2+(\beta_0+\beta_1X_i)^2$, so
$$\sum E(Y_i^2)=n\sigma^2+n\beta_0^2+2n\beta_0\beta_1\bar{X}+\beta_1^2\sum X_i^2.$$
Second, $E(\bar{Y}^2)=\dfrac{\sigma^2}{n}+(\beta_0+\beta_1\bar{X})^2$, so
$$nE(\bar{Y}^2)=\sigma^2+n\beta_0^2+2n\beta_0\beta_1\bar{X}+n\beta_1^2\bar{X}^2.$$
Third, $E(b_1^2)=\sigma^2(b_1)+[E(b_1)]^2=\dfrac{\sigma^2}{\sum(X_j-\bar{X})^2}+\beta_1^2$, so
$$\Big(\sum X_i^2-n\bar{X}^2\Big)E(b_1^2)=\sigma^2+\beta_1^2\sum X_i^2-n\beta_1^2\bar{X}^2.$$
Fourth, since $\sigma(b_1,\bar{Y})=\operatorname{Cov}\big(\sum k_jY_j,\ \tfrac{1}{n}\sum Y_j\big)=\tfrac{\sigma^2}{n}\sum k_j=0$, we have $E(b_1\bar{Y})=E(b_1)E(\bar{Y})=\beta_1(\beta_0+\beta_1\bar{X})$, so
$$2n\bar{X}E(b_1\bar{Y})=2n\beta_0\beta_1\bar{X}+2n\beta_1^2\bar{X}^2.$$
Finally, using independence of the $Y_i$, $E(Y_iY_j)=(\beta_0+\beta_1X_i)(\beta_0+\beta_1X_j)$ for $i\ne j$ and $E(Y_i^2)=\sigma^2+(\beta_0+\beta_1X_i)^2$, so
$$E(Y_ib_1)=\sum_j k_jE(Y_iY_j)=k_i\sigma^2+(\beta_0+\beta_1X_i)\beta_1$$
and therefore
$$2\sum X_iE(Y_ib_1)=2\sigma^2\sum X_ik_i+2\beta_0\beta_1\sum X_i+2\beta_1^2\sum X_i^2=2\sigma^2+2n\beta_0\beta_1\bar{X}+2\beta_1^2\sum X_i^2.$$
Adding the five contributions with their signs, every term involving $\beta_0$ or $\beta_1$ cancels (the $n\beta_0^2$ terms cancel in pairs, the $2n\beta_0\beta_1\bar{X}$ terms sum to zero, and likewise the $\beta_1^2\sum X_i^2$ and $n\beta_1^2\bar{X}^2$ terms), leaving only the $\sigma^2$ terms:
$$E(SSE)=n\sigma^2-\sigma^2+\sigma^2-2\sigma^2=(n-2)\sigma^2.$$
Finally
$$E(MSE)=E\Big(\frac{SSE}{n-2}\Big)=\frac{E(SSE)}{n-2}=\frac{(n-2)\sigma^2}{n-2}=\sigma^2,$$
which completes the proof.
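The theorem can also be checked by simulation: averaging MSE over many simulated samples from a known model should approach $\sigma^2$. A sketch under assumed (illustrative) parameter values $\beta_0=2$, $\beta_1=3$, $\sigma^2=4$:

```python
import numpy as np

# Illustrative true model; these values are assumptions for the demonstration
rng = np.random.default_rng(0)
beta0, beta1, sigma2 = 2.0, 3.0, 4.0
X = np.arange(1.0, 11.0)                  # n = 10 fixed X values
n = len(X)

mses = []
for _ in range(20000):
    Y = beta0 + beta1 * X + rng.normal(0.0, np.sqrt(sigma2), n)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean())**2)
    b0 = Y.mean() - b1 * X.mean()
    sse = np.sum((Y - b0 - b1 * X)**2)
    mses.append(sse / (n - 2))            # MSE = SSE/(n-2)
print(np.mean(mses))
```

The average sits close to 4.0; dividing SSE by n instead of n - 2 would show a visible downward bias.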
QUESTION 18.
Assume that the normal regression model is applicable. For the following data:

i    1   2   3   4   5   6
Xi   4   1   2   3   3   4
Yi   16  5   10  15  13  22

using matrix methods find:
1) $Y'Y$  2) $X'X$  3) $X'Y$  4) $b$
5) Test H0: $\beta_1=0$ versus Ha: $\beta_1\ne 0$ using ANOVA, with $\alpha=0.05$.
6) The variance-covariance matrix $s^2\{b\}$.

Solution:
Here
$$X=\begin{pmatrix}1&4\\1&1\\1&2\\1&3\\1&3\\1&4\end{pmatrix},\qquad Y=\begin{pmatrix}16\\5\\10\\15\\13\\22\end{pmatrix},\qquad 1'Y=16+5+10+15+13+22=81.$$
1) $Y'Y=16^2+5^2+10^2+15^2+13^2+22^2=1259$
2) $X'X=\begin{pmatrix}6&17\\17&55\end{pmatrix}$
3) $X'Y=\begin{pmatrix}81\\261\end{pmatrix}$
4) Using $A^{-1}=\frac{1}{\det A}\operatorname{adj}A$ with $\det(X'X)=6(55)-17^2=41$,
$$(X'X)^{-1}=\begin{pmatrix}\frac{55}{41}&-\frac{17}{41}\\-\frac{17}{41}&\frac{6}{41}\end{pmatrix},\qquad b=\begin{pmatrix}b_0\\b_1\end{pmatrix}=(X'X)^{-1}X'Y=\begin{pmatrix}\frac{18}{41}\\\frac{189}{41}\end{pmatrix}=\begin{pmatrix}0.43902\\4.6098\end{pmatrix}.$$
5)
$$SSR=b'X'Y-\tfrac{1}{n}(1'Y)^2=(0.43902\ \ 4.6098)\begin{pmatrix}81\\261\end{pmatrix}-\tfrac{1}{6}(81)^2=145.2$$
$$SSTO=Y'Y-\tfrac{1}{n}(1'Y)^2=1259-\tfrac{1}{6}(81)^2=165.5$$
$$SSE=Y'Y-b'X'Y=1259-1238.7=20.3$$
ANOVA table:

source of variation   SS             df   MS
regression            SSR = 145.2    1    MSR = SSR/1 = 145.2
error                 SSE = 20.3     4    MSE = SSE/(n-2) = 5.075
total                 SSTO = 165.5   5

H0: $\beta_1=0$; Ha: $\beta_1\ne 0$.
$$F=\frac{MSR}{MSE}=\frac{145.2}{5.075}=28.611$$
The decision rule:
If $F\le F(1-\alpha;1,n-2)$, conclude H0.
If $F>F(1-\alpha;1,n-2)$, conclude Ha.
$F(1-\alpha;1,n-2)=F(0.95;1,4)=7.71$. Since $F=28.611>F(0.95;1,4)=7.71$ we conclude Ha ($\beta_1\ne 0$), which means that there is a linear association between $X$ and $Y$.
6)
$$s^2\{b\}=MSE\,(X'X)^{-1}=5.075\begin{pmatrix}\frac{55}{41}&-\frac{17}{41}\\-\frac{17}{41}&\frac{6}{41}\end{pmatrix}=\begin{pmatrix}6.8079&-2.1043\\-2.1043&0.74268\end{pmatrix}.$$
QUESTION 19.
Prove the following statements:
1) The sum of the residuals is zero:
$$\sum_{i=1}^{n}e_i=0$$
2) The sum of the weighted residuals is zero when the residual in the $i$th trial is weighted by the level of the independent variable in the $i$th trial:
$$\sum_{i=1}^{n}X_ie_i=0$$

Solution:
1) Let us recall the first normal equation $\sum Y_i=nb_0+b_1\sum X_i$. Putting all terms on one side we get $\sum Y_i-nb_0-b_1\sum X_i=0$. Hence
$$\sum_{i=1}^{n}e_i=\sum_{i=1}^{n}(Y_i-\hat{Y}_i)=\sum_{i=1}^{n}(Y_i-b_0-b_1X_i)=\sum Y_i-nb_0-b_1\sum X_i=0.$$
2) Let us recall the second normal equation $\sum X_iY_i=b_0\sum X_i+b_1\sum X_i^2$. Putting all terms on one side we get $\sum X_iY_i-b_0\sum X_i-b_1\sum X_i^2=0$. Hence
$$\sum_{i=1}^{n}X_ie_i=\sum_{i=1}^{n}X_i(Y_i-b_0-b_1X_i)=\sum X_iY_i-b_0\sum X_i-b_1\sum X_i^2=0.$$
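Both identities are easy to confirm numerically on any fitted line; a sketch using the data set that appears in Question 21 (numpy assumed available):

```python
import numpy as np

# Data set from Question 21
X = np.array([30, 20, 60, 80, 40, 50, 60, 30, 70, 60], dtype=float)
Y = np.array([73, 50, 128, 170, 87, 108, 135, 69, 148, 132], dtype=float)
n = len(X)

# Least squares fit
b1 = (np.sum(X * Y) - np.sum(X) * np.sum(Y) / n) / (np.sum(X**2) - np.sum(X)**2 / n)
b0 = Y.mean() - b1 * X.mean()
e = Y - (b0 + b1 * X)                      # residuals

# Both sums vanish up to floating-point error
print(np.sum(e), np.sum(X * e))
```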
QUESTION 20.
The results of a certain experiment are shown below:

i    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   20
Xi   5.5  4.8  4.7  3.9  4.5  6.2  6.0  5.2  4.7  4.3  4.9  5.4  5.0  6.3  4.6  4.3  5.0  5.9  4.1  4.7
Yi   3.1  2.3  3.0  1.9  2.5  3.7  3.4  2.6  2.8  1.6  2.0  2.9  2.3  3.2  1.8  1.4  2.0  3.8  2.2  1.5

Summary calculation results are: $\sum X_i=100.0$, $\sum Y_i=50.0$, $\sum X_i^2=509.12$, $\sum Y_i^2=134.84$, $\sum X_iY_i=257.66$.
a) Obtain the least squares estimates of $\beta_0$ and $\beta_1$, and state the estimated regression function.
b) Obtain the point estimate of the mean of $Y$ when the $X$ score is 5.0.
c) What is the point estimate of the change in the mean response when the $X$ score increases by one?

Solution:
a)
$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}=\frac{257.66-\frac{100\times 50}{20}}{509.12-\frac{100^2}{20}}=0.83991$$
$$b_0=\frac{1}{n}\Big(\sum Y_i-b_1\sum X_i\Big)=\bar{Y}-b_1\bar{X}=\tfrac{1}{20}\big(50-0.83991\times 100\big)=-1.6996$$
$$\hat{Y}=-1.6996+0.83991X$$
b) $\hat{Y}=-1.6996+0.83991(5)=2.5$
c) $0.83991$ ($=b_1$)

QUESTION 21.
For the following set of data:
Xi   30   20   60   80   40   50   60   30   70   60
Yi   73   50   128  170  87   108  135  69   148  132

1) Obtain the estimated regression function. 2) Interpret $b_0$ and $b_1$.

Solution:
1)

Xi     Yi     Xi Yi   Xi^2
30     73     2190    900
20     50     1000    400
60     128    7680    3600
80     170    13600   6400
40     87     3480    1600
50     108    5400    2500
60     135    8100    3600
30     69     2070    900
70     148    10360   4900
60     132    7920    3600
Totals: 500   1100    61800   28400

$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}=\frac{61800-\frac{500\times 1100}{10}}{28400-\frac{500^2}{10}}=\frac{6800}{3400}=2.0$$
$$b_0=\bar{Y}-b_1\bar{X}=\tfrac{1}{10}(1100)-2\cdot\tfrac{1}{10}(500)=110-100=10$$
$$\hat{Y}=10+2X$$
2) $b_0$: since we don't know whether the scope of the model covers $X=0$, we cannot give any particular interpretation of $b_0$.
$b_1=2$: the mean of $Y$ increases by 2 for each unit increase in $X$.

QUESTION 22.
Prove the following formula for SSE: $SSE=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i$.

Solution:
By the definition,
$$SSE=\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2=\sum_{i=1}^{n}(Y_i-b_0-b_1X_i)^2=\sum_{i=1}^{n}e_i^2.$$
Hence
$$SSE=\sum(Y_i-\hat{Y}_i)^2=\sum(Y_i-\hat{Y}_i)Y_i-\sum(Y_i-\hat{Y}_i)\hat{Y}_i.$$
The second sum vanishes, since
$$\sum(Y_i-\hat{Y}_i)\hat{Y}_i=\sum e_i(b_0+b_1X_i)=b_0\sum e_i+b_1\sum X_ie_i=0,$$
where we used the properties $\sum(Y_i-\hat{Y}_i)=\sum e_i=0$ and $\sum X_i(Y_i-\hat{Y}_i)=\sum X_ie_i=0$. For the first sum,
$$\sum(Y_i-\hat{Y}_i)Y_i=\sum Y_i^2-\sum(b_0+b_1X_i)Y_i=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i.$$
Hence $SSE=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i$, which completes the proof.
QUESTION 23.
1) State the normal error model.
2) Find the distribution of $Y_i$ under the normal error model.
3) Show that $b_0$ as defined in (3.19) is an unbiased estimator of $\beta_0$.

Solution:
1) The normal error model is as follows:
$$Y_i=\beta_0+\beta_1X_i+\varepsilon_i,$$
where $Y_i$ is the value of the response in the $i$th trial; $\beta_0$ and $\beta_1$ are parameters; $X_i$ is a known constant, namely the value of the independent variable in the $i$th trial; and the $\varepsilon_i$ are independent $N(0,\sigma^2)$, $i=1,2,\ldots,n$.
2) Since $Y_i$ is a linear transformation of the normally distributed random variable $\varepsilon_i$, $Y_i$ has a normal distribution, with
$$E(Y_i)=E(\beta_0+\beta_1X_i+\varepsilon_i)=\beta_0+\beta_1X_i+E(\varepsilon_i)=\beta_0+\beta_1X_i$$
$$\operatorname{Var}(Y_i)=\operatorname{Var}(\beta_0+\beta_1X_i+\varepsilon_i)=\operatorname{Var}(\varepsilon_i)=\sigma^2.$$
Hence under the normal model $Y_i\sim N(\beta_0+\beta_1X_i,\ \sigma^2)$.
3) We have to show that $E(b_0)=\beta_0$. From 2) above, $Y_i\sim N(\beta_0+\beta_1X_i,\sigma^2)$, where $X_i$, $\beta_0$, $\beta_1$ are constants. With
$$b_1=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2},$$
we get
$$E(b_1)=\frac{\sum(X_i-\bar{X})E(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}=\frac{\beta_1\sum(X_i-\bar{X})(X_i-\bar{X})}{\sum(X_i-\bar{X})^2}=\beta_1,$$
where we used
$$E(Y_i-\bar{Y})=E(Y_i)-\frac{1}{n}\sum E(Y_j)=\beta_0+\beta_1X_i-\beta_0-\beta_1\bar{X}=\beta_1(X_i-\bar{X}).$$
Since $b_0=\bar{Y}-b_1\bar{X}$,
$$E(b_0)=E(\bar{Y})-\bar{X}E(b_1)=\frac{1}{n}\sum(\beta_0+\beta_1X_i)-\beta_1\bar{X}=\beta_0+\beta_1\bar{X}-\beta_1\bar{X}=\beta_0.$$
QUESTION 24.
Prove that:
1) $SSR(X_1,X_2,X_3)=SSR(X_1)+SSR(X_2,X_3\mid X_1)$
2) $SSR(X_1)+SSR(X_2\mid X_1)=SSR(X_2)+SSR(X_1\mid X_2)$

Solution:
1) We know that
$$SSR(X_2,X_3\mid X_1)=SSE(X_1)-SSE(X_1,X_2,X_3)$$
and
$$SSTO=SSE(X_1)+SSR(X_1)=SSE(X_1,X_2,X_3)+SSR(X_1,X_2,X_3).$$
Hence
$$\text{LHS}=SSR(X_1,X_2,X_3)=SSTO-SSE(X_1,X_2,X_3)=SSR(X_1)+SSE(X_1)-SSE(X_1,X_2,X_3)=SSR(X_1)+SSR(X_2,X_3\mid X_1)=\text{RHS},$$
which completes the proof.
2) We know that
$$SSR(X_2\mid X_1)=SSE(X_1)-SSE(X_1,X_2),\qquad SSR(X_1\mid X_2)=SSE(X_2)-SSE(X_1,X_2)$$
and
$$SSTO=SSE(X_1)+SSR(X_1)=SSE(X_2)+SSR(X_2).$$
Hence
$$\text{LHS}=SSR(X_1)+SSR(X_2\mid X_1)=SSR(X_1)+SSE(X_1)-SSE(X_1,X_2)=SSTO-SSE(X_1,X_2)=SSR(X_2)+SSR(X_1\mid X_2)=\text{RHS},$$
which completes the proof.

QUESTION 25.
For the following set of data:
Xi   30   20   60   80   40   50   60   30   70   60
Yi   73   50   128  170  87   108  135  69   148  132

1) Obtain the estimated regression function.
2) Interpret $b_0$ and $b_1$.
3) Find the 95% confidence interval for $\beta_0$.
4) Test H0: $\beta_1=0$ versus Ha: $\beta_1\ne 0$ using $t$ and $\alpha=0.05$.
5) Find the 90% confidence interval for $\beta_1$ and interpret it.

Solution:
1)

Xi     Yi     Xi Yi   Xi^2   Yi^ = b0 + b1 Xi   ei = Yi - Yi^   ei^2
30     73     2190    900    70                 3               9
20     50     1000    400    50                 0               0
60     128    7680    3600   130                -2              4
80     170    13600   6400   170                0               0
40     87     3480    1600   90                 -3              9
50     108    5400    2500   110                -2              4
60     135    8100    3600   130                5               25
30     69     2070    900    70                 -1              1
70     148    10360   4900   150                -2              4
60     132    7920    3600   130                2               4
Totals: 500   1100    61800  28400   1100       0               60

Therefore $\bar{X}=\frac{1}{n}\sum X_i=\frac{500}{10}=50$ and $\bar{Y}=\frac{1}{n}\sum Y_i=\frac{1100}{10}=110$. To calculate $b_0$ and $b_1$ we use the following formulas:
$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{61800-\frac{500\times 1100}{10}}{28400-\frac{500^2}{10}}=2.0$$
$$b_0=\bar{Y}-b_1\bar{X}=110-2(50)=10$$
So $\hat{Y}_i=10+2X_i$.
2) Since we do not know whether the scope of the model includes $X=0$, we are unable to provide any particular meaning for $b_0$. $b_1=2$ can be interpreted as follows: the change in the mean of the probability distribution of $Y$ is equal to 2 per unit increase in $X$.
3) $SSE=\sum(Y_i-\hat{Y}_i)^2=\sum e_i^2=60$ and $MSE=\frac{SSE}{n-2}=\frac{60}{8}=7.5$. Then
$$s^2\{b_0\}=MSE\,\frac{\sum X_i^2}{n\sum(X_i-\bar{X})^2}=7.5\times\frac{28400}{10\times 3400}=6.2647$$
(one can also use $s^2\{b_0\}=MSE\Big[\frac{1}{n}+\frac{\bar{X}^2}{\sum(X_i-\bar{X})^2}\Big]=7.5\Big[\frac{1}{10}+\frac{50^2}{3400}\Big]=6.2647$),
so $s\{b_0\}=\sqrt{6.2647}=2.5029$.
The 95% confidence interval for $\beta_0$ is
$$\big(b_0-t(1-\alpha/2;n-2)\,s\{b_0\},\ b_0+t(1-\alpha/2;n-2)\,s\{b_0\}\big)=\big(10-2.306(2.5029),\ 10+2.306(2.5029)\big)=(4.2283,\ 15.772),$$
where $t(1-\alpha/2;n-2)=t(0.975;8)=2.306$.
4) Test for $\beta_1$: H0: $\beta_1=0$; Ha: $\beta_1\ne 0$.
$$s^2\{b_1\}=\frac{MSE}{\sum(X_i-\bar{X})^2}=\frac{MSE}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{7.5}{3400}=0.002206,\qquad s\{b_1\}=\sqrt{0.002206}=0.046968$$
The test statistic:
$$t=\frac{b_1}{s\{b_1\}}=\frac{2}{0.046968}=42.582$$
The decision rule in our case is:
If $|t|\le t(1-\alpha/2;n-2)$, conclude H0.
If $|t|>t(1-\alpha/2;n-2)$, conclude Ha.
In our case $|t|=42.582>2.306=t(0.975;8)$, so we conclude Ha ($\beta_1\ne 0$), which means that there is a linear association between $X$ and $Y$.
5) The 90% confidence interval for $\beta_1$ is
$$\big(b_1-t(1-\alpha/2;n-2)\,s\{b_1\},\ b_1+t(1-\alpha/2;n-2)\,s\{b_1\}\big)=\big(2-1.860(0.046968),\ 2+1.860(0.046968)\big)=(1.9126,\ 2.0874),$$
where $t(1-\alpha/2;n-2)=t(0.95;8)=1.860$. With 90% family confidence, the mean of $Y$ increases by between about 1.91 and 2.09 for each unit increase in $X$.
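The interval estimates and the t test of Question 25 can be verified numerically; the t quantiles below are hard-coded from tables (numpy assumed available):

```python
import numpy as np

# Data set from Question 25
X = np.array([30, 20, 60, 80, 40, 50, 60, 30, 70, 60], dtype=float)
Y = np.array([73, 50, 128, 170, 87, 108, 135, 69, 148, 132], dtype=float)
n = len(X)

b1 = (np.sum(X * Y) - np.sum(X) * np.sum(Y) / n) / (np.sum(X**2) - np.sum(X)**2 / n)
b0 = Y.mean() - b1 * X.mean()
mse = np.sum((Y - b0 - b1 * X)**2) / (n - 2)

sxx = np.sum((X - X.mean())**2)
s_b0 = np.sqrt(mse * np.sum(X**2) / (n * sxx))
s_b1 = np.sqrt(mse / sxx)

t_stat = b1 / s_b1
ci_b0 = (b0 - 2.306 * s_b0, b0 + 2.306 * s_b0)   # 95%: t(0.975; 8) = 2.306
ci_b1 = (b1 - 1.860 * s_b1, b1 + 1.860 * s_b1)   # 90%: t(0.95; 8) = 1.860
print(t_stat, ci_b0, ci_b1)
```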
QUESTION 26.
Prove the following formula for SSR: $SSR=b_1^2\sum(X_i-\bar{X})^2$.

Solution:
By definition, $SSR=\sum(\hat{Y}_i-\bar{Y})^2$. Since $\sum\hat{Y}_i=\sum Y_i=n\bar{Y}$ (where $\bar{Y}=\frac{1}{n}\sum\hat{Y}_i=\frac{1}{n}\sum Y_i$),
$$SSR=\sum\hat{Y}_i^2-2\bar{Y}\sum\hat{Y}_i+n\bar{Y}^2=\sum\hat{Y}_i^2-2n\bar{Y}^2+n\bar{Y}^2=\sum\hat{Y}_i^2-n\bar{Y}^2.$$
Using $\hat{Y}_i=b_0+b_1X_i$ and $b_0=\bar{Y}-b_1\bar{X}$,
$$\sum\hat{Y}_i^2=\sum(b_0+b_1X_i)^2=nb_0^2+2b_0b_1\sum X_i+b_1^2\sum X_i^2$$
$$=n(\bar{Y}-b_1\bar{X})^2+2(\bar{Y}-b_1\bar{X})b_1n\bar{X}+b_1^2\sum X_i^2$$
$$=n\bar{Y}^2-2nb_1\bar{X}\bar{Y}+nb_1^2\bar{X}^2+2nb_1\bar{X}\bar{Y}-2nb_1^2\bar{X}^2+b_1^2\sum X_i^2$$
$$=n\bar{Y}^2-nb_1^2\bar{X}^2+b_1^2\sum X_i^2.$$
Hence
$$SSR=b_1^2\Big(\sum X_i^2-n\bar{X}^2\Big)=b_1^2\sum(X_i-\bar{X})^2,$$
which completes the proof.
QUESTION 27.
Show that $b_0$ as defined by $b_0=\bar{Y}-b_1\bar{X}$ is an unbiased estimator of $\beta_0$.

Solution:
Under the normal model $Y_i\sim N(\beta_0+\beta_1X_i,\ \sigma^2)$, where $X_i$, $\beta_0$, $\beta_1$ are constants. With
$$b_1=\frac{\sum(X_i-\bar{X})(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2},$$
we get
$$E(b_1)=\frac{\sum(X_i-\bar{X})E(Y_i-\bar{Y})}{\sum(X_i-\bar{X})^2}=\frac{\beta_1\sum(X_i-\bar{X})(X_i-\bar{X})}{\sum(X_i-\bar{X})^2}=\beta_1,$$
where we used
$$E(Y_i-\bar{Y})=E(Y_i)-\frac{1}{n}\sum E(Y_j)=\beta_0+\beta_1X_i-\beta_0-\beta_1\bar{X}=\beta_1(X_i-\bar{X}).$$
Since $b_0=\bar{Y}-b_1\bar{X}$,
$$E(b_0)=E(\bar{Y})-\bar{X}E(b_1)=\frac{1}{n}\sum E(Y_i)-\beta_1\bar{X}=\frac{1}{n}\sum(\beta_0+\beta_1X_i)-\beta_1\bar{X}=\beta_0+\beta_1\bar{X}-\beta_1\bar{X}=\beta_0.$$
QUESTION 28.
In a test of the alternatives H0: $\beta_1\le 0$ versus Ha: $\beta_1>0$, a student concluded H0. Does this conclusion imply that there is no linear association between $X$ and $Y$?

Solution:
The null hypothesis H0 comprises two cases, $\beta_1<0$ and $\beta_1=0$. Only the second case supports the statement that there is no linear association between $X$ and $Y$; under the first case there is a (negative) linear association. Therefore the result of the test does not imply that there is no linear association between $X$ and $Y$.
QUESTION 29.
The following data were obtained in a certain study.

i    1    2    3    4    5    6     7     8     9     10    11    12
Xi   1    1    1    2    2    2     2     4     4     4     5     5
Yi   6.2  5.8  6    9.7  9.8  10.3  10.2  17.8  17.9  18.3  21.9  22.1

Summary calculational results are: $\sum X_i=33$, $\sum Y_i=156$, $\sum X_i^2=117$, $\sum Y_i^2=2448.5$, $\sum X_iY_i=534$.
1) Fit a linear regression function.
2) Perform an F test to determine whether or not there is lack of fit of a linear regression function. Use $\alpha=0.05$.

Solution:
1) We have
$$b_1=\frac{\sum X_iY_i-\frac{\sum X_i\sum Y_i}{n}}{\sum X_i^2-\frac{(\sum X_i)^2}{n}}=\frac{534-\frac{33\times 156}{12}}{117-\frac{33^2}{12}}=4$$
and
$$b_0=\bar{Y}-b_1\bar{X}=\tfrac{1}{12}(156)-4\cdot\tfrac{1}{12}(33)=2.$$
Therefore $\hat{Y}=2+4X$.
2) F test for lack of fit. We have $c=4$ levels of $X$, with 3 replicates at the $X=1$ level, 4 replicates at $X=2$, 3 replicates at $X=4$ and 2 replicates at $X=5$, and $n=12$. Hence, with $\bar{Y}_j=\frac{1}{n_j}\sum_{i=1}^{n_j}Y_{i,j}$ the mean at the $j$th level of $X$:
$$\bar{Y}_1=\tfrac{1}{3}(6.2+5.8+6)=6.0\ \text{at}\ X=1$$
$$\bar{Y}_2=\tfrac{1}{4}(9.7+9.8+10.3+10.2)=10.0\ \text{at}\ X=2$$
$$\bar{Y}_3=\tfrac{1}{3}(17.8+17.9+18.3)=18.0\ \text{at}\ X=4$$
$$\bar{Y}_4=\tfrac{1}{2}(21.9+22.1)=22.0\ \text{at}\ X=5$$
$$SSPE=\sum_{j=1}^{c}\sum_{i=1}^{n_j}(Y_{i,j}-\bar{Y}_j)^2=(6.2-6)^2+(5.8-6)^2+(6-6)^2+(9.7-10)^2+(9.8-10)^2+(10.3-10)^2+(10.2-10)^2$$
$$+(17.8-18)^2+(17.9-18)^2+(18.3-18)^2+(21.9-22)^2+(22.1-22)^2=0.5$$
$$MSPE=\frac{SSPE}{n-c}=\frac{0.5}{12-4}=0.0625$$
$$SSE=\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2=\sum Y_i^2-b_0\sum Y_i-b_1\sum X_iY_i=2448.5-2(156)-4(534)=0.5$$
$$SSLF=SSE-SSPE=0.5-0.5=0,\qquad MSLF=\frac{SSLF}{c-2}=\frac{0}{4-2}=0$$
The hypotheses:
$$H_0:\ E(Y)=\beta_0+\beta_1X\qquad H_a:\ E(Y)\ne\beta_0+\beta_1X$$
Test statistic:
$$F=\frac{MSLF}{MSPE}=\frac{0}{0.0625}=0$$
The decision rule:
If $F\le F(1-\alpha;\,c-2,\,n-c)$, conclude H0.
If $F>F(1-\alpha;\,c-2,\,n-c)$, conclude Ha.
$F(1-\alpha;c-2,n-c)=F(0.95;2,8)=4.46$. Since $F=0\le 4.46$ we conclude H0: there is no lack of fit.

QUESTION 30.
The results of a certain experiment are shown below.

i   1   2   3   4   5   6   7   8   9   10
Xi  5.5 4.8 4.7 3.9 4.5 6.2 6.0 5.2 4.7 4.3
Yi  3.1 2.3 3.0 1.9 2.5 3.7 3.4 2.6 2.8 1.6

i   11  12  13  14  15  16  17  18  19  20
Xi  4.9 5.4 5.0 6.3 4.6 4.3 5.0 5.9 4.1 4.7
Yi  2.0 2.9 2.3 3.2 1.8 1.4 2.0 3.8 2.2 1.5

Summary calculational results are: ΣXi = 100, ΣYi = 50, ΣXi² = 509.12, ΣYi² = 134.84, ΣXiYi = 257.66.
1) Obtain the estimated regression function.
2) Set up the ANOVA table.
3) Conduct the F test of H0: β1 = 0 versus Ha: β1 ≠ 0 using α = 0.05.
Solution
1) We have
b1 = (ΣXiYi − ΣXiΣYi/n) / (ΣXi² − (ΣXi)²/n) = (257.66 − 100·50/20) / (509.12 − 100²/20) = 0.83991
and
b0 = Ȳ − b1X̄ = (1/20)(50) − 0.83991·(1/20)(100) = −1.6996
Therefore Ŷi = −1.6996 + 0.83991Xi.

2) ANOVA table

source of variation   SS     df    MS
regression            SSR    1     MSR = SSR/1
error                 SSE    n−2   MSE = SSE/(n−2)
total                 SSTO   n−1

where
SSR = Σ(Ŷi − Ȳ)² = b1(ΣXiYi − ΣXiΣYi/n) = 0.83991(257.66 − 100·50/20) = 6.4337
SSE = Σ(Yi − Ŷi)² = ΣYi² − b0ΣYi − b1ΣXiYi = 134.84 − (−1.6996)(50) − 0.83991(257.66) = 3.4088
MSE = SSE/(n−2) = 3.4088/18 = 0.18938
SSTO = Σ(Yi − Ȳ)² = ΣYi² − (ΣYi)²/n = 134.84 − 50²/20 = 9.84

Hence

source of variation   SS            df   MS
regression            SSR = 6.4337  1    MSR = 6.4337
error                 SSE = 3.4088  18   MSE = 0.18938
total                 SSTO = 9.84   19

3) F test. Hypotheses: H0: β1 = 0, Ha: β1 ≠ 0.
Test statistic: F = MSR/MSE = 6.4337/0.18938 = 33.972
The decision rule:
If F ≤ F(1−α; 1, n−2), conclude H0
If F > F(1−α; 1, n−2), conclude Ha
In our case F(1−α; 1; n−2) = F(0.95; 1; 18) = 4.41.
Since F = 33.972 > 4.41 we conclude Ha (β1 ≠ 0), that means that there is a linear association
between X and Y.
QUESTION 31.
The following data were obtained in the study of solution concentration.

i   1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
Xi  9    9    9    7    7    7    5    5    5    3    3    3    1    1    1
Yi  0.07 0.09 0.08 0.16 0.17 0.21 0.49 0.58 0.53 1.22 1.15 1.07 2.84 2.57 3.1

Summary calculational results are: ΣXi = 75, ΣYi = 14.33, ΣXi² = 495, ΣYi² = 29.2117, ΣXiYi = 32.77.
1) Fit a linear regression function.
2) Perform an F test to determine whether or not there is lack of fit of a linear regression function. Use α = 0.05.
Solution.
1) We have
b1 = (ΣXiYi − ΣXiΣYi/n) / (ΣXi² − (ΣXi)²/n) = (32.77 − 75·14.33/15) / (495 − 75²/15) = −0.324
and
b0 = Ȳ − b1X̄ = (1/15)(14.33) − (−0.324)·(1/15)(75) = 2.5753
Therefore Ŷi = 2.5753 − 0.324Xi.

2) F test for lack of fit.
We have c = 5 levels for X with 3 replicates at each level (hence each nj = 3) and n = 15.
Hence, with Ȳj = (1/nj) Σi Yij the mean at the jth level of X:
Ȳ1 = (1/3)(3.1 + 2.57 + 2.84) = 2.8367 at level X = 1
Ȳ2 = (1/3)(1.07 + 1.15 + 1.22) = 1.1467 at level X = 3
Ȳ3 = (1/3)(0.53 + 0.58 + 0.49) = 0.53333 at level X = 5
Ȳ4 = (1/3)(0.21 + 0.17 + 0.16) = 0.18 at level X = 7
Ȳ5 = (1/3)(0.08 + 0.09 + 0.07) = 0.08 at level X = 9

SSPE = Σj Σi (Yij − Ȳj)² = (2.84 − 2.8367)² + (2.57 − 2.8367)² + (3.1 − 2.8367)² + (1.22 − 1.1467)² + (1.15 − 1.1467)² + (1.07 − 1.1467)² + (0.49 − 0.53333)² + (0.58 − 0.53333)² + (0.53 − 0.53333)² + (0.16 − 0.18)² + (0.17 − 0.18)² + (0.21 − 0.18)² + (0.07 − 0.08)² + (0.09 − 0.08)² + (0.08 − 0.08)² = 0.1574
MSPE = SSPE/(n−c) = 0.1574/(15−5) = 0.01574

SSE = ΣYi² − b0ΣYi − b1ΣXiYi = 29.2117 − 2.5753(14.33) − (−0.324)(32.77) = 2.9251
SSLF = SSE − SSPE = 2.9251 − 0.1574 = 2.7677
MSLF = SSLF/(c−2) = 2.7677/(5−2) = 0.92257

The hypotheses:
H0: E(Y) = β0 + β1X
Ha: E(Y) ≠ β0 + β1X
Test statistic: F = MSLF/MSPE = 0.92257/0.01574 = 58.613
The decision rule:
If F ≤ F(1−α; c−2, n−c), conclude H0
If F > F(1−α; c−2, n−c), conclude Ha
F(1−α; c−2; n−c) = F(0.95; 3; 10) = 3.71
Since F = 58.613 > 3.71 we conclude Ha: there is lack of fit.

QUESTION 31.
A large discount department store chain advertises on television (X1), on the radio (X2), and in newspapers (X3). A sample of 12 of its stores in a certain area showed the following advertising expenditures and revenues during a given month. (All figures are in thousands of rands.) (Table 1.)
1) Find the estimated regression coefficients.
2) Test whether there is a regression relation using α = 0.01.
3) Estimate β1, β2 and β3 jointly by the Bonferroni procedure using a 99 percent family confidence coefficient.
4) Obtain an interval estimate of E(Yh) when Xh,1 = 11, Xh,2 = 6 and Xh,3 = 2. Use a 90 percent level of confidence.
5) Obtain an ANOVA table and use it to test whether there is a regression relation using α = 0.01.
6) Obtain the residuals.
7) Calculate the coefficient of multiple determination R².
8) Obtain the simultaneous interval estimates for two levels of X:

       1   2
Xh,1   11  15
Xh,2   7   9
Xh,3   2   3

using a 90 percent level of confidence.
Table 1

i   Revenues (Yi)  Xi,1  Xi,2  Xi,3
1   84             13    5     2
2   84             13    7     1
3   80             8     6     3
4   50             9     5     3
5   20             9     3     1
6   68             13    5     1
7   34             12    7     2
8   30             10    3     2
9   54             8     5     2
10  40             10    5     3
11  57             5     6     2
12  46             5     7     2

The data summary is given below in matrix form:

X'X =
[ 12    115    64    24  ]
[ 115   1191   610   222 ]
[ 64    610    362   129 ]
[ 24    222    129   54  ]

X'Y = ( 647, 6393, 3600, 1292 )'

Y'Y = 39973

(X'X)⁻¹ =
[  51821/12801   −2089/12801   −3326/12801   −2166/4267  ]
[ −2089/12801     164/12801      16/12801      72/4267   ]
[ −3326/12801      16/12801     626/12801     −83/12801  ]
[ −2166/4267       72/4267      −83/12801    7306/38403  ]
Solution
1) The design matrix and response vector are

X =
[ 1  13  5  2 ]
[ 1  13  7  1 ]
[ 1   8  6  3 ]
[ 1   9  5  3 ]
[ 1   9  3  1 ]
[ 1  13  5  1 ]
[ 1  12  7  2 ]
[ 1  10  3  2 ]
[ 1   8  5  2 ]
[ 1  10  5  3 ]
[ 1   5  6  2 ]
[ 1   5  7  2 ]

Y = (84, 84, 80, 50, 20, 68, 34, 30, 54, 40, 57, 46)'

Then X'X is the 4×4 matrix given in the data summary, with

det(X'X) = 115209

and (X'X)⁻¹ as given in the data summary. Also

X'Y = (647, 6393, 3600, 1292)'

so the estimated coefficients are

b = (X'X)⁻¹X'Y = (−11518/753, 1973/753, 5690/753, 4294/2259)' ≈ (−15.296, 2.6202, 7.5564, 1.9009)'

and the estimated regression function is Ŷ = −15.296 + 2.6202X1 + 7.5564X2 + 1.9009X3.
5) ANOVA
Y'Y = 39973, 1'Y = ΣYi = 647, and
b'X'Y = (−11518/753)(647) + (1973/753)(6393) + (5690/753)(3600) + (4294/2259)(1292) = 82483577/2259 = 36513

SSTO = Y'Y − (1/n)Y'11'Y = 39973 − 647²/12 = 5088.9
SSE = Y'Y − b'X'Y = 39973 − 36513 = 3460.0
SSR = b'X'Y − (1/n)Y'11'Y = 36513 − 647²/12 = 1628.9

The ANOVA table

Source of variation   SS             df          MS
Regression            SSR = 1628.9   p − 1 = 3   MSR = SSR/(p−1) = 542.97
Error                 SSE = 3460.0   n − p = 8   MSE = SSE/(n−p) = 432.5
Total                 SSTO = 5088.9  n − 1 = 11

Test of regression relation. Hypotheses:
H0: β1 = β2 = β3 = 0
Ha: not all βk = 0
We use the test statistic F = MSR/MSE = 542.97/432.5 = 1.2554.
The decision rule:
If F ≤ F(1−α; p−1, n−p), conclude H0
If F > F(1−α; p−1, n−p), conclude Ha
With α = 0.01 the table gives F(1−α; p−1, n−p) = F(0.99; 3; 8) = 7.59.
Since F = 1.2554 ≤ 7.59 we conclude H0: there is not a linear regression relation.
3) s²{b} = MSE·(X'X)⁻¹ = 432.5·(X'X)⁻¹ =
[ 1750.8    −70.58    −112.37   −219.54 ]
[ −70.58     5.541     0.54058    7.2979 ]
[ −112.37    0.54058   21.15     −2.8043 ]
[ −219.54    7.2979   −2.8043    82.281  ]

Hence
s²{b1} = 5.541, s{b1} = √5.541 = 2.3539
s²{b2} = 21.15, s{b2} = √21.15 = 4.5989
s²{b3} = 82.281, s{b3} = √82.281 = 9.0709

If g parameters are to be estimated jointly (where g ≤ p), the confidence limits with family confidence coefficient 1−α are
bk ± B s{bk}, where B = t(1 − α/(2g); n − p).
For g = 3, 1 − 0.01/(2·3) = 0.99833 ≈ 0.99, and from the table B = t(0.99; 8) = 2.896.
So for β1: 1973/753 ± 2.896(2.3539) = (−4.1967, 9.4371)
for β2: 5690/753 ± 2.896(4.5989) = (−5.762, 20.875)
for β3: 4294/2259 ± 2.896(9.0709) = (−24.368, 28.17)
4) Here Xh = (1, 11, 6, 2)'.
The point estimate of the mean of Y is
Ŷh = Xh'b = (1, 11, 6, 2)(−11518/753, 1973/753, 5690/753, 4294/2259)' = 141563/2259 = 62.666
The estimated variance is
s²{Ŷh} = Xh' s²{b} Xh = 57.586, so s{Ŷh} = √57.586 = 7.5885.
For a 90 percent confidence interval, t(1 − α/2; n − p) = t(0.95; 8) = 1.860, and the limits Ŷh ± t(1−α/2; n−p)s{Ŷh} are
62.666 ± 1.860(7.5885), i.e. (48.551, 76.781).
6) Fitted values
Ŷ = Xb = (60.35, 73.562, 56.707, 51.77, 32.856, 58.449, 72.843, 37.377, 47.249, 54.39, 46.945, 54.502)'

The residuals are
e = Y − Ŷ = (84 − 60.35, 84 − 73.562, 80 − 56.707, 50 − 51.77, 20 − 32.856, 68 − 58.449, 34 − 72.843, 30 − 37.377, 54 − 47.249, 40 − 54.39, 57 − 46.945, 46 − 54.502)'
= (23.65, 10.438, 23.293, −1.77, −12.856, 9.551, −38.843, −7.377, 6.751, −14.39, 10.055, −8.502)'
7) Coefficient of multiple determination:
R² = SSR/SSTO = 1628.9/5088.9 = 0.32009
8) Simultaneous interval estimates for two levels of X:

       A   B
Xh,1   11  15
Xh,2   7   9
Xh,3   2   3

In this case g = 2. To determine which simultaneous prediction intervals are best here, we find S and B for confidence coefficient 0.90:
S² = g F(1−α; g; n−p) = 2 F(0.90; 2; 8) = 2(3.11) = 6.22, so S = √6.22 = 2.494
B = t(1 − α/(2g); n−p) = t(1 − 0.10/4; 8) = t(0.975; 8) = 2.306
Hence the Bonferroni limits are more efficient here (they give shorter intervals).

For explanatory-variable level A we have XA = (1, 11, 7, 2)'.
The point estimate of the mean of Y is
ŶA = XA'b = 158633/2259 = 70.223
and
s²{ŶA} = XA' s²{b} XA = 108.47, with MSE = 432.5.
Hence s²{ŶA(new)} = MSE + s²{ŶA} = 432.5 + 108.47 = 540.97 and s{ŶA(new)} = √540.97 = 23.259.

For level B we have XB = (1, 15, 9, 3)'.
ŶB = XB'b = 24527/251 = 97.717
s²{ŶB} = XB' s²{b} XB = 645.24
Hence s²{ŶB(new)} = MSE + s²{ŶB} = 432.5 + 645.24 = 1077.7 and s{ŶB(new)} = √1077.7 = 32.828.

The simultaneous Bonferroni prediction intervals with confidence coefficient 0.90 are Ŷh ± B s{Ŷh(new)}, so
70.223 − 2.306(23.259) ≤ YA(new) ≤ 70.223 + 2.306(23.259)
97.717 − 2.306(32.828) ≤ YB(new) ≤ 97.717 + 2.306(32.828)
or
16.588 ≤ YA(new) ≤ 123.858
22.016 ≤ YB(new) ≤ 173.418
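As a cross-check of the matrix arithmetic in this question, the coefficients, SSE, R² and the overall F statistic can be recomputed with a short NumPy sketch (not part of the original solution):

```python
import numpy as np

# Advertising data from the question (12 stores): intercept, X1, X2, X3.
X = np.array([
    [1, 13, 5, 2], [1, 13, 7, 1], [1, 8, 6, 3], [1, 9, 5, 3],
    [1, 9, 3, 1], [1, 13, 5, 1], [1, 12, 7, 2], [1, 10, 3, 2],
    [1, 8, 5, 2], [1, 10, 5, 3], [1, 5, 6, 2], [1, 5, 7, 2],
], dtype=float)
Y = np.array([84, 84, 80, 50, 20, 68, 34, 30, 54, 40, 57, 46], dtype=float)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y                       # least squares coefficients
resid = Y - X @ b
sse = resid @ resid                         # error sum of squares
ssto = ((Y - Y.mean()) ** 2).sum()          # total sum of squares
ssr = ssto - sse
r2 = ssr / ssto
mse = sse / (len(Y) - X.shape[1])
F = (ssr / 3) / mse                         # overall F, p - 1 = 3
```

The values agree with the hand computation: b ≈ (−15.296, 2.620, 7.556, 1.901), SSE ≈ 3460, R² ≈ 0.320, F ≈ 1.255.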
QUESTION 32.
Assume that the normal regression model is applicable. For the following data:

i   1   2   3    4   5
Xi  8   4   0   −4  −8
Yi  7.8 9  10.2 11  11.7

using matrix methods find:
1) Y'Y
2) X'X
3) X'Y
4) b
5) Test H0: β1 = 0 versus Ha: β1 ≠ 0 using ANOVA, with α = 0.05
6) the variance-covariance matrix s²{b}
Solution

X =
[ 1   8 ]
[ 1   4 ]
[ 1   0 ]
[ 1  −4 ]
[ 1  −8 ]

Y = (7.8, 9, 10.2, 11, 11.7)'

1) Y'Y = 7.8² + 9² + 10.2² + 11² + 11.7² = 503.77

2) X'X =
[ 5    0  ]
[ 0   160 ]

3) X'Y = (49.7, −39.2)'

4) (X'X)⁻¹ = (1/800)·
[ 160  0 ]   [ 1/5   0    ]
[ 0    5 ] = [ 0    1/160 ]

b = (X'X)⁻¹X'Y = (49.7/5, −39.2/160)' = (9.94, −0.245)'

5) ANOVA table.
SSTO = Y'Y − (1/n)Y'11'Y = 503.77 − 49.7²/5 = 9.752
SSE = Y'Y − b'X'Y = 503.77 − [9.94(49.7) + (−0.245)(−39.2)] = 0.15
SSR = SSTO − SSE = 9.752 − 0.15 = 9.602

(ANOVA table):

source of variation   SS            df   MS
regression            SSR = 9.602   1    MSR = SSR/1 = 9.602
error                 SSE = 0.15    3    MSE = SSE/(n−2) = 0.15/3 = 0.05
total                 SSTO = 9.752  4

H0: β1 = 0, Ha: β1 ≠ 0
F = MSR/MSE = 9.602/0.05 = 192.04
The decision rule:
If F ≤ F(1−α; 1, n−2), conclude H0
If F > F(1−α; 1, n−2), conclude Ha
F(1−α; 1, n−2) = F(0.95; 1; 3) = 10.13
Since F = 192.04 > 10.13 we conclude Ha: there is a linear association between X and Y.

6) s²{b} = MSE·(X'X)⁻¹ = 0.05·
[ 1/5   0    ]   [ 0.01   0         ]
[ 0    1/160 ] = [ 0      0.0003125 ]
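The matrix quantities above can be verified with a few lines of NumPy (a sketch, assuming the data as given):

```python
import numpy as np

X = np.array([[1, 8], [1, 4], [1, 0], [1, -4], [1, -8]], dtype=float)
Y = np.array([7.8, 9.0, 10.2, 11.0, 11.7])

YtY = Y @ Y                          # Y'Y
XtX = X.T @ X                        # X'X
XtY = X.T @ Y                        # X'Y
b = np.linalg.solve(XtX, XtY)        # b = (X'X)^{-1} X'Y
sse = YtY - b @ XtY                  # SSE = Y'Y - b'X'Y
ssto = YtY - Y.sum() ** 2 / len(Y)
F = (ssto - sse) / (sse / (len(Y) - 2))
```

Note that keeping SSE at full precision gives F ≈ 194.7 rather than the 192.04 obtained above after rounding SSE to 0.15; the test conclusion is unchanged.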
QUESTION 33.
Prove that SSE = Y'Y − b'X'Y.
Solution:
We know that (A')' = A, (A + B)' = A' + B', and (AB)' = B'A'. Also the normal equations are
X'Xb = X'Y.
From the definition, with b = (b0, b1)',
SSE = e'e = (Y − Xb)'(Y − Xb) = Y'Y − Y'Xb − b'X'Y + b'X'Xb.
Since Y'Xb is a scalar, Y'Xb = (Y'Xb)' = b'X'Y, so
SSE = Y'Y − 2b'X'Y + b'X'Xb.
By the normal equations, b'X'Xb = b'X'Y, hence
SSE = Y'Y − 2b'X'Y + b'X'Y = Y'Y − b'X'Y.
QUESTION 34.
Let Ŷh = Xh'b. Prove that the variance of Ŷh is, in matrix notation,
σ²{Ŷh} = σ² Xh'(X'X)⁻¹Xh.
Proof: We know that if W is a random vector obtained by premultiplying the random vector Y by a constant matrix A,
(*) W = AY,
then σ²{W} = σ²{AY} = A σ²{Y} A'.
Since Ŷh = Xh'b, using (*) with A = Xh' we get
σ²{Ŷh} = Xh' σ²{b} Xh.
Using the fact that σ²{b} = σ²(X'X)⁻¹, we get
σ²{Ŷh} = Xh' σ²(X'X)⁻¹ Xh = σ² Xh'(X'X)⁻¹Xh.
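The identity SSE = Y'Y − b'X'Y proved above is easy to confirm numerically. The sketch below (illustrative simulated data, not from the original text) computes SSE both directly from the residuals and via the identity:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), np.arange(8.0)])   # intercept plus one predictor
Y = 2.0 + 0.5 * X[:, 1] + rng.normal(size=8)        # hypothetical responses

b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b
sse_direct = e @ e                                  # e'e
sse_identity = Y @ Y - b @ (X.T @ Y)                # Y'Y - b'X'Y
```

The two values agree up to floating-point round-off, as the algebraic proof guarantees.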
QUESTION 35.
The results of a certain experiment are shown below.

i   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20
Xi  5.5 4.8 4.7 3.9 4.5 6.2 6.0 5.2 4.7 4.3 4.9 5.4 5.0 6.3 4.6 4.3 5.0 5.9 4.1 4.7
Yi  3.1 2.3 3.0 1.9 2.5 3.7 3.4 2.6 2.8 1.6 2.0 2.9 2.3 3.2 1.8 1.4 2.0 3.8 2.2 1.5

Summary calculation results are: ΣXi = 100.0, ΣYi = 50.0, ΣXi² = 509.12, ΣYi² = 134.84, ΣXiYi = 257.66.
a) Obtain the least squares estimates of β0 and β1, and state the estimated regression function.
b) Obtain the point estimate of the mean of Y when the X score is 5.0.
c) What is the point estimate of the change in the mean response when the X score increases by one?

Solution.
a) b1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = Σ(Xi−X̄)(Yi−Ȳ)/Σ(Xi−X̄)² = (257.66 − 100·50/20)/(509.12 − 100²/20) = 0.83991
b0 = (1/n)(ΣYi − b1ΣXi) = Ȳ − b1X̄ = (1/20)(50 − 0.83991·100) = −1.6996
Ŷ = −1.6996 + 0.83991X
b) Ŷ = −1.6996 + 0.83991(5) = 2.5
c) 0.83991 (= b1)
QUESTION 36.
For the following set of data:

Xi  30  20  60  80  40  50  60  30  70  60
Yi  73  50  128 170 87  108 135 69  148 132

1) Obtain the estimated regression function. 2) Interpret b0 and b1.
Solution
1)

Xi    Yi    XiYi   Xi²
30    73    2190   900
20    50    1000   400
60    128   7680   3600
80    170   13600  6400
40    87    3480   1600
50    108   5400   2500
60    135   8100   3600
30    69    2070   900
70    148   10360  4900
60    132   7920   3600
500   1100  61800  28400   Totals

b1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = (61800 − 500·1100/10)/(28400 − 500²/10) = 6800/3400 = 2.0
b0 = (1/n)(ΣYi − b1ΣXi) = Ȳ − b1X̄ = (1/10)(1100 − 2.0·500) = 10.0
Ŷ = 10.0 + 2.0X
2) b0: since we don't know whether the scope of the model covers X = 0, we cannot give any interpretation of b0.
b1 = 2.0: the mean of Y increases by 2.0 when X increases by 1.
QUESTION 37.
The results of a certain experiment are shown below.

i   1   2   3   4   5   6   7   8   9   10
Xi  1   0   2   0   3   1   0   1   2   0
Yi  16  9   17  12  22  13  8   15  19  11

1) Obtain the estimated regression function. 2) Plot the estimated regression function and the data. 3) Interpret b0 and b1. 4) Find the 95% confidence intervals for β0 and β1, and interpret them. 5) Test H0: β1 = 0 versus Ha: β1 ≠ 0 using t and ANOVA, with α = 0.05. 6) Find the 95% confidence interval for the mean of the response variable corresponding to the level of the explanatory variable equal to 3. 7) Find 90% prediction limits for a new observation of the response variable corresponding to the level of the explanatory variable equal to 3. 8) Obtain the residuals ei. 9) Estimate σ² and σ. 10) Compute Σei².
Solution
1)

Xi   Yi    XiYi  Xi²  Yi²   Ŷi    ei     ei²
1    16    16    1    256   14.2  1.8    3.24
0    9     0     0    81    10.2  −1.2   1.44
2    17    34    4    289   18.2  −1.2   1.44
0    12    0     0    144   10.2  1.8    3.24
3    22    66    9    484   22.2  −0.2   0.04
1    13    13    1    169   14.2  −1.2   1.44
0    8     0     0    64    10.2  −2.2   4.84
1    15    15    1    225   14.2  0.8    0.64
2    19    38    4    361   18.2  0.8    0.64
0    11    0     0    121   10.2  0.8    0.64
10   142   182   20   2194        0      17.6   totals

b1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = (182 − 10·142/10)/(20 − 10²/10) = 4
b0 = (1/n)(ΣYi − b1ΣXi) = (1/10)(142 − 4·10) = 10.2
Ŷ = 10.2 + 4X

2)
[Scatter plot of Y versus X with the fitted regression line Ŷ = 10.200 + 4.0000·X; correlation r = 0.94916.]
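The fitted line and the correlation coefficient reported in the plot can be reproduced with a short script (a sketch, not part of the original solution):

```python
import math

X = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
Y = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]
n = len(X)

sxy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
sxx = sum(x * x for x in X) - sum(X) ** 2 / n
syy = sum(y * y for y in Y) - sum(Y) ** 2 / n

b1 = sxy / sxx                      # slope
b0 = (sum(Y) - b1 * sum(X)) / n     # intercept
r = sxy / math.sqrt(sxx * syy)      # correlation coefficient
```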
3) b0 = 10.2: since 0 is included in the scope for X (we have observations there), this represents the estimated mean value of Y at X = 0.
b1 = 4: the mean of Y increases by 4 when X increases by 1.
4)
SSE = Σ(Yi − Ŷi)² = Σ(Yi − b0 − b1Xi)² = Σei² = 17.6
MSE = SSE/(n−2) = 17.6/(10−2) = 2.2

s²{b0} = MSE·ΣXi²/(nΣ(Xi−X̄)²) = MSE(1/n + X̄²/(ΣXi² − (ΣXi)²/n)) = 2.2(1/10 + 1²/(20 − 10²/10)) = 0.44
s{b0} = √0.44 = 0.66332
The confidence interval for β0 is b0 ± t(1−α/2; n−2)s{b0}. For 95%, α = 0.05 and t(1−α/2; n−2) = t(0.975; 8) = 2.306, so the 95% confidence interval for β0 is
(10.2 − 2.306(0.66332), 10.2 + 2.306(0.66332)) = (8.6706, 11.729)
Interpretation: we are 95% confident that the mean of Y corresponding to the value 0 of the explanatory variable X lies in (8.6706, 11.729).

Confidence interval for β1:
s²{b1} = MSE/Σ(Xi−X̄)² = MSE/(ΣXi² − (ΣXi)²/n) = 2.2/(20 − 10²/10) = 0.22
s{b1} = √0.22 = 0.46904
The limits are b1 ± t(1−α/2; n−2)s{b1}, where t(1−α/2; n−2) = t(0.975; 8) = 2.306, so the 95% confidence interval for β1 is
(4 − 2.306(0.46904), 4 + 2.306(0.46904)) = (2.9184, 5.0816)
Interpretation: we are 95% confident that β1 lies in (2.9184, 5.0816). Since this interval does not contain 0, with 95% confidence there is a linear association between Y and X.

5) Using t:
H0: β1 = 0, Ha: β1 ≠ 0
Test statistic: t* = b1/s{b1} = 4/0.46904 = 8.5281
The decision rule (at the level of significance α):
if |t*| ≤ t(1−α/2; n−2), conclude H0
if |t*| > t(1−α/2; n−2), conclude Ha
where t(1−α/2; n−2) = t(0.975; 8) = 2.306.
Since |t*| = 8.5281 > 2.306 we conclude Ha (β1 ≠ 0), that means that there is a linear association between X and Y.
ANOVA
source of variation   SS                   df    MS
regression            SSR = Σ(Ŷi − Ȳ)²     1     MSR = SSR/1
error                 SSE = Σ(Yi − Ŷi)²    n−2   MSE = SSE/(n−2)
total                 SSTO = Σ(Yi − Ȳ)²    n−1

SSR = Σ(Ŷi − Ȳ)² = b1(ΣXiYi − ΣXiΣYi/n) = 4(182 − 10·142/10) = 160.0
SSE = Σ(Yi − Ŷi)² = Σ(Yi − b0 − b1Xi)² = Σei² = 17.6
MSE = SSE/(n−2) = 17.6/(10−2) = 2.2
SSTO = ΣYi² − (ΣYi)²/n = 2194 − 142²/10 = 177.6

source of variation   SS            df   MS
regression            SSR = 160     1    MSR = 160
error                 SSE = 17.6    8    MSE = 2.2
total                 SSTO = 177.6  9

H0: β1 = 0, Ha: β1 ≠ 0
Test statistic: F = MSR/MSE = 160/2.2 = 72.727
The decision rule:
If F ≤ F(1−α; 1, n−2), conclude H0
If F > F(1−α; 1, n−2), conclude Ha
where F(1−α; 1, n−2) = F(0.95; 1; 8) = 5.32.
Since F = 72.727 > 5.32 we conclude Ha (β1 ≠ 0), that means that there is a linear association between X and Y.
6) A 1−α confidence interval for E(Yh) is given by:
Ŷh ± t(1−α/2; n−2) s{Ŷh}
where Ŷh = b0 + b1Xh = 10.2 + 4(3) = 22.2,
s²{Ŷh} = MSE(1/n + (Xh − X̄)²/Σ(Xi−X̄)²) = 2.2(1/10 + (3 − 10/10)²/(20 − 10²/10)) = 1.1
and s{Ŷh} = √1.1 = 1.0488,
t(1−α/2; n−2) = t(0.975; 8) = 2.306.
Hence the 95% confidence interval for the mean of the response variable corresponding to the level of the explanatory variable equal to 3 is
(22.2 − 2.306(1.0488), 22.2 + 2.306(1.0488)) = (19.781, 24.619)

7) If the parameters were known, the limits for a new observation would be E(Yh) ± z(1−α/2)σ. When the parameters are unknown, the 1−α prediction limits are
Ŷh ± t(1−α/2; n−2) s{Ŷh(new)}
where
s²{Ŷh(new)} = MSE + s²{Ŷh} = MSE(1 + 1/n + (Xh − X̄)²/Σ(Xi−X̄)²) = 2.2(1 + 1/10 + (3 − 1)²/10) = 3.3
s{Ŷh(new)} = √3.3 = 1.8166
and t(1−α/2; n−2) = t(0.95; 8) = 1.860.
Hence the 90% prediction interval is
(22.2 − 1.860(1.8166), 22.2 + 1.860(1.8166)) = (18.821, 25.579)

QUESTION 38.
Data from a study of computer-assisted learning by 12 students, showing the total number of responses in completing a lesson (X) and the cost of computer time (Y, in cents), follow:
i   1   2   3   4   5   6   7   8   9   10  11  12
Xi  16  14  22  10  14  17  10  13  19  12  18  11
Yi  77  70  85  50  62  70  52  63  88  57  81  54

The scatter plot strongly suggests that the error variance increases with X. Fit the weighted least squares regression line using weights wi = 1/Xi².
Solution.

Xi   Yi   wi = 1/Xi²  wiXi      wiYi      wiXiYi    wiXi²
16   77   0.003906    0.0625    0.300781  4.8125    1
14   70   0.005102    0.071429  0.357143  5         1
22   85   0.002066    0.045455  0.17562   3.863636  1
10   50   0.01        0.1       0.5       5         1
14   62   0.005102    0.071429  0.316327  4.428571  1
17   70   0.00346     0.058824  0.242215  4.117647  1
10   52   0.01        0.1       0.52      5.2       1
13   63   0.005917    0.076923  0.372781  4.846154  1
19   88   0.00277     0.052632  0.243767  4.631579  1
12   57   0.006944    0.083333  0.395833  4.75      1
18   81   0.003086    0.055556  0.25      4.5       1
11   54   0.008264    0.090909  0.446281  4.909091  1
Totals: ΣXi = 176, ΣYi = 809, Σwi = 0.066619, ΣwiXi = 0.868988, ΣwiYi = 4.120748, ΣwiXiYi = 56.05918, ΣwiXi² = 12

b1 = (ΣwiXiYi − ΣwiXiΣwiYi/Σwi)/(ΣwiXi² − (ΣwiXi)²/Σwi)
   = (56.05918 − 0.868988(4.120748)/0.066619)/(12 − 0.868988²/0.066619) = 3.4711
b0 = (ΣwiYi − b1ΣwiXi)/Σwi = (4.120748 − 3.4711(0.868988))/0.066619 = 16.578
Hence Ŷ = 16.578 + 3.4711X.
QUESTION 39.
Consider the normal simple regression model expressed in matrix terms.
Prove that:
1) SSTO = Y'Y − (1/n)Y'11'Y
2) SSE = Y'Y − b'X'Y
Solution
1) We know that
SSTO = ΣYi² − nȲ² = ΣYi² − (ΣYi)²/n.
We also know that Y'Y = ΣYi². Let 1 be the n×1 column vector of ones. Then
(1/n)Y'11'Y = (1/n)(Y1 + Y2 + ... + Yn)(Y1 + Y2 + ... + Yn) = (ΣYi)²/n.
Hence SSTO = Y'Y − (1/n)Y'11'Y.
2) We know that (A')' = A, (A + B)' = A' + B', and (AB)' = B'A'. Also the normal equations are
X'Xb = X'Y,
where b = (b0, b1)'.
From the definition,
SSE = e'e = (Y − Xb)'(Y − Xb) = Y'Y − Y'Xb − b'X'Y + b'X'Xb.
Since Y'Xb is a scalar, Y'Xb = (Y'Xb)' = b'X'Y, so
SSE = Y'Y − 2b'X'Y + b'X'Xb.
By the normal equations, b'X'Xb = b'X'Y, hence
SSE = Y'Y − 2b'X'Y + b'X'Y = Y'Y − b'X'Y.
QUESTION 40.
The results of a certain experiment are shown below.

i   1   2   3   4   5   6   7   8   9   10
Xi  1   0   2   0   3   1   0   1   2   0
Yi  16  9   17  12  22  13  8   15  19  11

Summary calculational results are: ΣXi = 10, ΣYi = 142, ΣXi² = 20, ΣYi² = 2194, ΣXiYi = 182.
1) Obtain the estimated regression function.
2) Find the 95% confidence interval for β1.
3) Test H0: β1 = 0 versus Ha: β1 ≠ 0 using t and α = 0.05.
Solution
1) To calculate b0 and b1 we use the following formulas:
b1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = (182 − 10·142/10)/(20 − 10²/10) = 4
b0 = Ȳ − b1X̄ = (1/10)(142) − 4·(1/10)(10) = 10.2
So Ŷ = 10.2 + 4X.
2) In our case
SSE = Σ(Yi − Ŷi)² = ΣYi² − b0ΣYi − b1ΣXiYi = 2194 − 10.2(142) − 4(182) = 17.6
MSE = SSE/(n−2) = 17.6/8 = 2.2
s²{b1} = MSE/Σ(Xi−X̄)² = MSE/(ΣXi² − (ΣXi)²/n) = 2.2/(20 − 10²/10) = 0.22
s{b1} = √0.22 = 0.46904
The 95% confidence interval for β1 is
(b1 − t(1−α/2; n−2)s{b1}, b1 + t(1−α/2; n−2)s{b1}) = (4 − 2.306(0.46904), 4 + 2.306(0.46904)) = (2.9184, 5.0816)
where t(1−α/2; n−2) = t(0.975; 8) = 2.306.
3) Test for β1:
H0: β1 = 0, Ha: β1 ≠ 0
The test statistic is t* = b1/s{b1} = 4/0.46904 = 8.5281.
The decision rule in our case is:
If |t*| ≤ t(1−α/2; n−2), conclude H0
If |t*| > t(1−α/2; n−2), conclude Ha
In our case |t*| = 8.5281 > 2.306 = t(0.975; 8), so we conclude Ha (β1 ≠ 0), that means that there is a linear association between X and Y.
QUESTION 41.
The following data were obtained in a certain experiment:

i   Xi,1  Xi,2  Yi
1   1     2     2.5
2   1     2     3
3   1     2     3.5
4   2     1     3
5   2     1     4
6   0     1     1
7   0     1     1.5
8   0     1     2
9   1     0     1.5
10  1     0     2
11  1     0     2.5

The data summary is given below in matrix form:

X'X =
[ 11  10  11 ]
[ 10  14  10 ]
[ 11  10  17 ]

(X'X)⁻¹ =
[ 23/54   −5/27   −1/6 ]
[ −5/27   11/54    0   ]
[ −1/6     0      1/6  ]

X'Y = (26.5, 29, 29.5)'

Y'Y = 72.25

Assume that a first-order regression model with independent normal errors is appropriate.
1) Find the estimated regression coefficients.
2) Obtain an ANOVA table and use it to test whether there is a regression relation using α = 0.05.
3) Estimate β1 and β2 jointly by the Bonferroni procedure using an 80 percent family confidence coefficient.
4) Test the lack of fit of E(Y) = β0 + β1X1 + β2X2 using α = 0.05.
Solution
1) b = (b0, b1, b2)' = (X'X)⁻¹X'Y =
[ 23/54   −5/27   −1/6 ] [ 26.5 ]   [ 1.0 ]
[ −5/27   11/54    0   ] [ 29   ] = [ 1.0 ]
[ −1/6     0      1/6  ] [ 29.5 ]   [ 0.5 ]

2) 1'Y = ΣYi = 26.5 (we get it from X'Y).
SSTO = Y'Y − (1/n)Y'11'Y = 72.25 − 26.5²/11 = 8.4091
SSE = Y'Y − b'X'Y = 72.25 − [1(26.5) + 1(29) + 0.5(29.5)] = 2.0
SSR = b'X'Y − (1/n)Y'11'Y = 70.25 − 26.5²/11 = 6.4091

ANOVA table

source of variation   SS             df          MS
regression            SSR = 6.4091   p − 1 = 2   MSR = SSR/(p−1) = 3.2046
error                 SSE = 2.0      n − p = 8   MSE = SSE/(n−p) = 0.25
total                 SSTO = 8.4091  n − 1 = 10

Hypotheses:
H0: β1 = β2 = 0
Ha: not both β1 and β2 are equal to zero
Test statistic: F = MSR/MSE = 3.2046/0.25 = 12.818
The decision rule:
If F ≤ F(1−α; p−1, n−p), conclude H0
If F > F(1−α; p−1, n−p), conclude Ha
F(1−α; p−1, n−p) = F(0.95; 2; 8) = 4.46
Since F = 12.818 > 4.46 we conclude Ha (not both β1 and β2 are equal to zero), that means that there is a linear regression relation.

3) If g parameters are to be estimated jointly (where g ≤ p), the Bonferroni confidence limits with family confidence coefficient 1−α are
bk ± B s{bk}, where B = t(1 − α/(2g); n − p),
and we get s²{b1} and s²{b2} from s²{b} = MSE·(X'X)⁻¹.
In our case g = 2, b1 = 1, b2 = 0.5, and

s²{b} = MSE·(X'X)⁻¹ = 0.25·(X'X)⁻¹ =
[ 0.10648     −0.046296   −0.041667 ]
[ −0.046296    0.050926    0        ]
[ −0.041667    0           0.041667 ]

so s²{b1} = 0.050926, s{b1} = √0.050926 = 0.22567
and s²{b2} = 0.041667, s{b2} = √0.041667 = 0.20411.
B = t(1 − 0.2/(2·2); n−p) = t(0.95; 8) = 1.860
Hence the limits for β1 and β2 are (0.58025, 1.4197) and (0.12036, 0.87964) respectively, since
1 ± 1.860(0.22567) = (0.58025, 1.4197)
0.5 ± 1.860(0.20411) = (0.12036, 0.87964)

4) First level (X1 = 1, X2 = 2): observations 2.5, 3, 3.5; n1 = 3, Ȳ1 = 3.
Squared deviations at this level: (2.5−3)² + (3−3)² + (3.5−3)² = 0.5
Second level (X1 = 2, X2 = 1): observations 3, 4; n2 = 2, Ȳ2 = 3.5.
Squared deviations at this level: (3−3.5)² + (4−3.5)² = 0.5
Third level (X1 = 0, X2 = 1): observations 1, 1.5, 2; n3 = 3, Ȳ3 = 1.5.
Squared deviations at this level: (1−1.5)² + (1.5−1.5)² + (2−1.5)² = 0.5
Fourth level (X1 = 1, X2 = 0): observations 1.5, 2, 2.5; n4 = 3, Ȳ4 = 2.
Squared deviations at this level: (1.5−2)² + (2−2)² + (2.5−2)² = 0.5

c = 4, n = 11, p = 3.
SSPE = ΣΣ(Yij − Ȳj)² = 0.5 + 0.5 + 0.5 + 0.5 = 2
MSPE = SSPE/(n−c) = 2/(11−4) = 0.286
SSE = 2.0, SSLF = SSE − SSPE = 0, MSLF = SSLF/(c−p) = 0
Test statistic: F = MSLF/MSPE = 0
The hypotheses:
H0: E(Y) = β0 + β1X1 + β2X2
Ha: E(Y) ≠ β0 + β1X1 + β2X2
The decision rule:
If F ≤ F(1−α; c−p, n−c), conclude H0
If F > F(1−α; c−p, n−c), conclude Ha
F(1−α; c−p, n−c) = F(0.95; 1; 7) = 5.59
Since F = 0 ≤ 5.59 we conclude H0.
Therefore there is no lack of fit.
QUESTION 42.
For a certain experiment the first-order regression model with two independent variables was used. The calculated diagonal elements of the hat matrix are:
i 1 2 3 4 5 6 7 8
hi,i 0.237 0.237 0.237 0.237 0.137 0.137 0.137 0.137
i 9 10 11 12 13 14 15 16
hi,i 0.137 0.137 0.137 0.137 0.237 0.237 0.237 0.237
1) Describe the use of the hat matrix for identifying outlying X observations.
2) Identify any outlying X observations using the hat matrix method.
Solution
1) The hat matrix H is given by:
H = X(X'X)⁻¹X'
The diagonal element hi,i of the hat matrix is called the leverage of the ith observation. A large leverage value hi,i indicates that the ith observation is distant from the center of the X observations. The mean leverage value is
h̄ = Σhi,i/n = p/n.
Hence leverage values greater than 2p/n are considered by this rule to indicate outlying observations with regard to the X values.
2) In our case n = 16 and p = 3, so the critical value is 2p/n = 6/16 = 0.375.
Since all leverage values in our case are less than 0.375, this method identifies no outlying X observations.
QUESTION 43.
Consider the following functions of the random variables Y1, Y2 and Y3:
W1 = Y1 + Y2 + Y3
W2 = Y1 − Y2
W3 = Y1 − Y2 − Y3
where Y1, Y2, Y3 are independent identically distributed random variables with the N(0,1) distribution.
1) State the above in matrix notation.
2) Find the expectation of the random vector W = (W1, W2, W3)'.
3) Find the variance-covariance matrix of W.
Solution
1)
    [ W1 ]   [ 1   1   1 ] [ Y1 ]
W = [ W2 ] = [ 1  −1   0 ] [ Y2 ] = AY
    [ W3 ]   [ 1  −1  −1 ] [ Y3 ]
2)
E(W) = (E(Y1) + E(Y2) + E(Y3), E(Y1) − E(Y2), E(Y1) − E(Y2) − E(Y3))' = (0, 0, 0)'
3) Let us notice that
         [ σ²{Y1}     σ{Y1,Y2}  σ{Y1,Y3} ]   [ 1  0  0 ]
σ²{Y} =  [ σ{Y2,Y1}   σ²{Y2}    σ{Y2,Y3} ] = [ 0  1  0 ]
         [ σ{Y3,Y1}   σ{Y3,Y2}  σ²{Y3}   ]   [ 0  0  1 ]
Since W = AY, then
σ²{W} = σ²{AY} = A σ²{Y} A' = A A' =
[ 1   1   1 ] [ 1   1   1 ]   [  3   0  −1 ]
[ 1  −1   0 ] [ 1  −1  −1 ] = [  0   2   2 ]
[ 1  −1  −1 ] [ 1   0  −1 ]   [ −1   2   3 ]
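Since Cov(Y) = I here, the result σ²{W} = A A' can be confirmed directly (a NumPy sketch, not part of the original solution):

```python
import numpy as np

# W = A Y with Y1, Y2, Y3 iid N(0,1), so Cov(Y) = I and Cov(W) = A Cov(Y) A' = A A'.
A = np.array([
    [1,  1,  1],   # W1 = Y1 + Y2 + Y3
    [1, -1,  0],   # W2 = Y1 - Y2
    [1, -1, -1],   # W3 = Y1 - Y2 - Y3
])
cov_Y = np.eye(3)
cov_W = A @ cov_Y @ A.T
```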
QUESTION 44.
Consider the multiple regression model:
Yi = β1Xi,1 + β2Xi,2 + εi
where the εi are independent normally distributed random errors with N(0, σ²). Obtain the maximum likelihood estimators of β1 and β2.
Solution
The distribution of Yi is given by the density
f(yi; β1Xi,1 + β2Xi,2, σ²) = (1/√(2πσ²)) exp(−(yi − β1Xi,1 − β2Xi,2)²/(2σ²))
The likelihood function is
L = Πi (1/√(2πσ²)) exp(−(yi − β1Xi,1 − β2Xi,2)²/(2σ²))
ln L = −n ln(√(2πσ²)) − Σ(yi − β1Xi,1 − β2Xi,2)²/(2σ²)
∂lnL/∂β1 = Σ(yi − β1Xi,1 − β2Xi,2)Xi,1/σ²
∂lnL/∂β2 = Σ(yi − β1Xi,1 − β2Xi,2)Xi,2/σ²
Hence ∂lnL/∂β1 = 0 and ∂lnL/∂β2 = 0 give the following equations:
β1ΣXi,1² + β2ΣXi,1Xi,2 = ΣYiXi,1
β1ΣXi,1Xi,2 + β2ΣXi,2² = ΣYiXi,2
Solving, we get
β̂2 = (ΣYiXi,2 ΣXi,1² − ΣYiXi,1 ΣXi,1Xi,2) / (ΣXi,2² ΣXi,1² − (ΣXi,1Xi,2)²)
and
β̂1 = (ΣYiXi,1 − β̂2 ΣXi,1Xi,2) / ΣXi,1²
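The two normal equations can be solved numerically. In the sketch below the data and the true coefficients (β1 = 2, β2 = 3) are hypothetical, chosen only to illustrate that the closed-form estimators recover them exactly when there is no noise:

```python
# MLE / least-squares fit of Y = b1*X1 + b2*X2 (no intercept) via the
# two normal equations derived above. Data are illustrative only.
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
X2 = [2.0, 1.0, 4.0, 3.0, 5.0]
Y = [2 * x1 + 3 * x2 for x1, x2 in zip(X1, X2)]   # noise-free responses

s11 = sum(x * x for x in X1)                      # sum Xi1^2
s22 = sum(x * x for x in X2)                      # sum Xi2^2
s12 = sum(a * b for a, b in zip(X1, X2))          # sum Xi1*Xi2
sy1 = sum(y * x for y, x in zip(Y, X1))           # sum Yi*Xi1
sy2 = sum(y * x for y, x in zip(Y, X2))           # sum Yi*Xi2

b2 = (sy2 * s11 - sy1 * s12) / (s22 * s11 - s12 ** 2)
b1 = (sy1 - b2 * s12) / s11
```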
QUESTION 45.
1) Obtain the likelihood function for the sample observations Y1,...,Yn given X1,...,Xn if the normal model is assumed to be applicable.
2) Obtain the maximum likelihood estimators of β0 and β1.
Solution
1) Under the normal model Yi has the N(β0 + β1Xi, σ²) distribution, with the corresponding density function given by
f(yi) = (1/√(2πσ²)) exp(−(yi − β0 − β1Xi)²/(2σ²))
Hence the likelihood function for the normal error model, given the sample observations Y1,...,Yn, is:
L(β0, β1, σ²) = Πi (1/√(2πσ²)) exp(−(1/(2σ²))(Yi − β0 − β1Xi)²)
2) In order to find the MLEs we use
ln L(β0, β1, σ²) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σ(Yi − β0 − β1Xi)²
∂lnL/∂β0 = (1/σ²) Σ(Yi − β0 − β1Xi)
∂lnL/∂β1 = (1/σ²) Σ(Yi − β0 − β1Xi)Xi
From ∂lnL/∂β0 = 0 and ∂lnL/∂β1 = 0 we get the following equations:
ΣYi = nβ0 + β1ΣXi
ΣXiYi = β0ΣXi + β1ΣXi²
From the first one we get β̂0 = Ȳ − β̂1X̄. Using it in the second we get
ΣXiYi = (Ȳ − β̂1X̄)ΣXi + β̂1ΣXi²
hence
β̂1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = Σ(Xi−X̄)(Yi−Ȳ)/Σ(Xi−X̄)²
The MLEs of β0 and β1 are
β̂1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) and β̂0 = Ȳ − β̂1X̄,
the same as the estimators obtained using the least squares method.
QUESTION 46.
Data from a study of the relation between the size of a bid in million rands (X) and the cost to the firm of preparing the bid in thousands of rands (Y) for 12 recent bids are
presented in the table below:

i   1    2    3    4   5   6    7    8    9     10  11   12
Xi  2.13 1.21 11.0 6.0 5.6 6.91 2.97 3.35 10.39 1.1 4.36 8.0
Yi  15.5 11.1 62.6 35.4 24.9 28.1 15.0 23.2 42.0 10  20   47.5

The scatter plot strongly suggests that the error variance increases with X. Fit the weighted least squares regression line using weights wi = 1/Xi².
Solution

Xi     Yi    wi = 1/Xi²  wiXi      wiYi      wiXiYi    wiXi²
2.13   15.5  0.220415    0.469484  3.41643   7.276995  1
1.21   11.1  0.683013    0.826446  7.587449  9.173554  1
11     62.6  0.008264    0.090909  0.517355  5.690909  1
6      35.4  0.027778    0.166667  0.983333  5.9       1
5.6    24.9  0.031888    0.178571  0.794005  4.446429  1
6.91   28.1  0.020943    0.144718  0.588505  4.06657   1
2.97   15    0.113367    0.3367    1.700507  5.050505  1
3.35   23.2  0.089107    0.298507  2.067276  6.925373  1
10.39  42    0.009263    0.096246  0.389061  4.042348  1
1.1    10    0.826446    0.909091  8.264463  9.090909  1
4.36   20    0.052605    0.229358  1.0521    4.587156  1
8      47.5  0.015625    0.125     0.742188  5.9375    1
Totals: ΣXi = 63.02, ΣYi = 335.3, Σwi = 2.098715, ΣwiXi = 3.871698, ΣwiYi = 28.09667, ΣwiXiYi = 72.18825, ΣwiXi² = 12

b1 = (ΣwiXiYi − ΣwiXiΣwiYi/Σwi)/(ΣwiXi² − (ΣwiXi)²/Σwi)
   = (72.18825 − 3.871698(28.09667)/2.098715)/(12 − 3.871698²/2.098715) = 4.1906
b0 = (ΣwiYi − b1ΣwiXi)/Σwi = (28.09667 − 4.1906(3.871698))/2.098715 = 5.6568
Hence Ŷ = 5.6568 + 4.1906X.
QUESTION 47.
The following data were obtained in a certain study.
i   1   2   3   4   5   6    7    8    9    10   11   12
Xi  1   1   1   2   2   3    3    3    3    5    5    5
Yi  4.8 4.9 5.1 7.9 8.3 10.9 10.8 11.3 11.1 16.5 17.3 17.1

Summary calculational results are: ΣXi = 34, ΣYi = 126, ΣXi² = 122, ΣYi² = 1554.66, ΣXiYi = 434.
1) Fit a linear regression function.
2) Perform an F test to determine whether or not there is lack of fit of a linear regression function. Use α = 0.05.
Solution
1) We have
b1 = (ΣXiYi − ΣXiΣYi/n)/(ΣXi² − (ΣXi)²/n) = (434 − 34·126/12)/(122 − 34²/12) = 3
and
b0 = Ȳ − b1X̄ = (1/12)(126) − 3·(1/12)(34) = 2
Therefore Ŷ = 2 + 3X.

2) F test for lack of fit.
We have c = 4 levels for X, with 3 replicates at level X = 1, 2 replicates at X = 2, 4 replicates at X = 3 and 3 replicates at X = 5, and n = 12.
Hence, with Ȳj = (1/nj) Σi Yij the mean at the jth level of X:
Ȳ1 = (1/3)(4.8 + 4.9 + 5.1) = 4.9333 at level X = 1
Ȳ2 = (1/2)(7.9 + 8.3) = 8.1 at level X = 2
Ȳ3 = (1/4)(10.9 + 10.8 + 11.3 + 11.1) = 11.025 at level X = 3
Ȳ4 = (1/3)(16.5 + 17.3 + 17.1) = 16.967 at level X = 5

SSPE = Σj Σi (Yij − Ȳj)² = (4.8 − 4.9333)² + (4.9 − 4.9333)² + (5.1 − 4.9333)² + (7.9 − 8.1)² + (8.3 − 8.1)² + (10.9 − 11.025)² + (10.8 − 11.025)² + (11.3 − 11.025)² + (11.1 − 11.025)² + (16.5 − 16.967)² + (17.3 − 16.967)² + (17.1 − 16.967)² = 0.62083
MSPE = SSPE/(n−c) = 0.62083/(12−4) = 0.077604

SSE = ΣYi² − b0ΣYi − b1ΣXiYi = 1554.66 − 2(126) − 3(434) = 0.66
SSLF = SSE − SSPE = 0.66 − 0.62083 = 0.03917
MSLF = SSLF/(c−2) = 0.03917/(4−2) = 0.019585

The hypotheses:
H0: E(Y) = β0 + β1X
Ha: E(Y) ≠ β0 + β1X
Test statistic: F = MSLF/MSPE = 0.019585/0.077604 = 0.25237
The decision rule:
If F ≤ F(1−α; c−2, n−c), conclude H0
If F > F(1−α; c−2, n−c), conclude Ha
F(1−α; c−2; n−c) = F(0.95; 2; 8) = 4.46
Since F = 0.25237 ≤ 4.46 we conclude H0.
There is no lack of fit.
QUESTION 48.
1) State the simple normal linear regression model in matrix terms.
2) Prove the following formula for SSE: SSE = Y'Y − b'X'Y.
3) Prove that for Ŷh = Xh'b the variance is, in matrix notation, σ²{Ŷh} = σ²Xh'(X'X)⁻¹Xh.
Solution
1) Let
Y
Y1
Y2
:
Yn
X
1 X1
1 X2
: :
1 Xn
o
1
1
2
:
n
then
n1
Y n2
X21
n1
where: is thevector of parametersX - matrix of knownconstants,namely,thevalues of theindependentvariable
is avector of independent normal randomvariableswithE 0and2 2I.
2)We know thatSSE Yi2 boYi b1XiYiLet us noticethat
if Y
Y1Y2
:
Yn
then Y Y1 Y2 .. . Yn
and
if X
1 X1
1 X2
: :
1 Xn
then X 1 1 ... 1
X1 X2 .. . Xn.
Hence
-59-
7/29/2019 Linear models questions.pdf
60/120
Y Y Y1 Y2 .. . Yn
Y1
Y2
:
Yn
Yi2 Yi2
and
X Y 1 1 ... 1
X1 X2 .. . Xn
Y1
Y2
:
Yn
YiXiYi
Using this with b bo
b1wehave
Y Y bX Y Yi2 bo b1 Yi
XiYi
Yi2 boYi b1XiYi SSEwhichcompletes theproof.or2)Weknow thatA A, A B A B, and AB BA
andthenormal equation:X Xb X Y
hence
X Xb X Y 0
0
where b bo
b1
so
X Xb X Y bX X Y X 0
0
0 0
FromdefinitionSSE ee Y Xb Y Xb Y bX Y Xb
Y Y Y Xb bX Y bX Xb Y Y bX Y bX Xb Y Xb Y Y bX Y bX X Y Xb
Y Y bX Y 0 0bo
b1
Y Y bX Y 0 Y Y bX Y
3) We know that if W is a random vector obtained by premultiplying the random vector Y by a constant matrix A, W = AY, then

(*) σ²(W) = σ²(AY) = A σ²(Y) A′.

Since Ŷₕ = Xₕ′b, using (*) with A = Xₕ′ we get

σ²(Ŷₕ) = Xₕ′ σ²(b) Xₕ.

Using the fact that σ²(b) = σ²(X′X)⁻¹ we get

σ²(Ŷₕ) = Xₕ′ σ²(X′X)⁻¹ Xₕ = σ² Xₕ′(X′X)⁻¹Xₕ.
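The variance formula can also be illustrated by simulation. The sketch below uses hypothetical true parameters (β = (2, 3)′, σ = 0.5, X levels 1, 2, 3, 5, and Xₕ = (1, 4)′ — none of these come from the question) and compares the Monte Carlo variance of Ŷₕ with σ²Xₕ′(X′X)⁻¹Xₕ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: true line beta = (2, 3)', error sd sigma = 0.5.
x = np.array([1.0, 2.0, 3.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
beta, sigma = np.array([2.0, 3.0]), 0.5
Xh = np.array([1.0, 4.0])                      # predict at X_h = 4

# Theoretical variance of Yhat_h from the formula just proved.
XtX_inv = np.linalg.inv(X.T @ X)
var_theory = sigma**2 * Xh @ XtX_inv @ Xh

# Monte Carlo: refit many simulated samples, look at the spread of Yhat_h.
yhats = []
for _ in range(20000):
    y = X @ beta + rng.normal(0.0, sigma, size=len(x))
    b = np.linalg.solve(X.T @ X, X.T @ y)
    yhats.append(Xh @ b)
var_mc = np.var(yhats)
print(round(var_theory, 4), round(var_mc, 4))  # the two should be close
```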
QUESTION 49.
The fitted values and residuals of a regression analysis are given below:

t    1      2      3      4      5      6      7      8      9      10
Ŷt   21.96  4.15   7.36   22.11  10.98  22.06  47.35  47.05  73.40  69.79
et   -1.45  -0.26  -0.16  -0.20  0.32   0.63   0.24   0.55   -0.50  -0.65

t    11     12     13     14     15     16     17     18     19     20
Ŷt   83.83  87.09  75.64  76.15  69.08  32.24  47.30  52.29  78.03  77.78
et   0.06   -0.09  -0.24  -1.03  0.02   0.56   0.80   0.11   0.57   0.72

Assume that the simple linear regression model with the random terms following a first-order autoregressive process is appropriate. Conduct a formal test for positive autocorrelation using α = 0.05.
Solution:
The hypotheses:
H₀: ρ = 0
Hₐ: ρ > 0

The Durbin-Watson test statistic:

D = Σₜ₌₂ⁿ (eₜ − eₜ₋₁)² / Σₜ₌₁ⁿ eₜ² = 6.5025/6.7072 = 0.96948

since

Σₜ₌₁ⁿ eₜ² = (−1.45)² + (−0.26)² + (−0.16)² + (−0.20)² + 0.32² + 0.63² + 0.24² + 0.55² + (−0.50)² + (−0.65)² + 0.06² + (−0.09)² + (−0.24)² + (−1.03)² + 0.02² + 0.56² + 0.80² + 0.11² + 0.57² + 0.72² = 6.7072

and

Σₜ₌₂ⁿ (eₜ − eₜ₋₁)² = (−0.26 + 1.45)² + (−0.16 + 0.26)² + (−0.20 + 0.16)² + (0.32 + 0.20)² + (0.63 − 0.32)² + (0.24 − 0.63)² + (0.55 − 0.24)² + (−0.50 − 0.55)² + (−0.65 + 0.50)² + (0.06 + 0.65)² + (−0.09 − 0.06)² + (−0.24 + 0.09)² + (−1.03 + 0.24)² + (0.02 + 1.03)² + (0.56 − 0.02)² + (0.80 − 0.56)² + (0.11 − 0.80)² + (0.57 − 0.11)² + (0.72 − 0.57)² = 6.5025
The decision rule:
If D > d_U, conclude H₀.
If D < d_L, conclude Hₐ.
If d_L ≤ D ≤ d_U, the test is inconclusive.
For p = 2, n = 20 and α = 0.05: d_L = 1.20 and d_U = 1.41.
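The statistic is straightforward to reproduce in code (a sketch; the residuals are the eₜ tabulated above):

```python
# Durbin-Watson statistic for the residuals tabulated above.
e = [-1.45, -0.26, -0.16, -0.20, 0.32, 0.63, 0.24, 0.55, -0.50, -0.65,
     0.06, -0.09, -0.24, -1.03, 0.02, 0.56, 0.80, 0.11, 0.57, 0.72]

# Numerator: squared successive differences; denominator: squared residuals.
num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
den = sum(et ** 2 for et in e)
D = num / den

d_L, d_U = 1.20, 1.41   # table values for p = 2, n = 20, alpha = 0.05
print(round(num, 4), round(den, 4), round(D, 5))
print("conclude Ha" if D < d_L else "conclude Ho" if D > d_U else "inconclusive")
```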
Since D = 0.96948 < d_L = 1.20, we conclude Hₐ: the error terms are positively autocorrelated.

QUESTION 50.
The following data were obtained in a certain experiment:
i    Xi,1  Xi,2  Yi
1    1     2     2.5
2    1     2     3
3    1     2     3.5
4    2     1     3
5    2     1     4
6    0     1     1
7    0     1     1.5
8    0     1     2
9    1     0     1.5
10   1     0     2
11   1     0     2.5

The data summary is given below in matrix form:
X′X =
[ 11  10  11 ]
[ 10  14  10 ]
[ 11  10  17 ]

(X′X)⁻¹ =
[ 23/54   −5/27   −1/6 ]
[ −5/27   11/54    0   ]
[ −1/6     0      1/6  ]

X′Y =
[ 26.5 ]
[ 29   ]
[ 29.5 ]

Y′Y = 72.25
Assume that the first-order regression model with independent normal errors is appropriate.
1) Find the estimated regression coefficients.
2) Obtain an ANOVA table and use it to test whether there is a regression relation, using α = 0.05.
3) Estimate β₁ and β₂ jointly by the Bonferroni procedure using an 80 percent family confidence coefficient.

Solution:
1) b = (b₀, b₁, b₂)′ = (X′X)⁻¹X′Y =

[ 23/54   −5/27   −1/6 ] [ 26.5 ]   [ 1.0 ]
[ −5/27   11/54    0   ] [ 29   ] = [ 1.0 ]
[ −1/6     0      1/6  ] [ 29.5 ]   [ 0.5 ]
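The hand computation in 1) can be verified with numpy (a quick check, assuming numpy is available):

```python
import numpy as np

# Data from Question 50; verify that b = (X'X)^{-1} X'Y = (1, 1, 0.5)'.
X1 = [1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1]
X2 = [2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0]
Y = np.array([2.5, 3, 3.5, 3, 4, 1, 1.5, 2, 1.5, 2, 2.5])

X = np.column_stack([np.ones(len(Y)), X1, X2])   # columns: 1, X1, X2
b = np.linalg.solve(X.T @ X, X.T @ Y)            # normal equations
print(np.round(b, 6))                            # coefficients b0, b1, b2
```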
2) Ȳ = (1/n)ΣYᵢ = 26.5/11 = 2.4091, where ΣYᵢ = 26.5 (we get it from the first component of X′Y).