
Stat364 Solution to Final Exam

1. True/False Problems (1 point each) [Answer and reason your solution.]

(a) The correlation between two quantitative measurements was found to be zero. It can be concluded that there exists no relationship between the two measurements.

Answer: False. Zero correlation implies only the absence of a linear relationship. The two measurements may still be strongly nonlinearly related.
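The point in answer (a) can be illustrated with a small sketch. The data below are made up for illustration (they are not from the exam): y is an exact function of x, yet the sample correlation comes out as zero because the relationship is symmetric rather than linear.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative data (an assumption, not exam data): y = x^2 exactly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(f"r = {r:.4f}")  # essentially zero despite the perfect quadratic link
```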

(b) For a given random sample, a 90% confidence interval for p is usually wider than a 95% confidence interval.

Answer: False. For the same sample, a 90% confidence interval is narrower than a 95% confidence interval, because it uses a smaller critical value.

(c) In determining the sample size for the estimation of a population mean or population proportion, the smaller the pre-specified confidence level 1 − α, the larger the sample size required.

Answer: False. A smaller confidence level implies a larger error rate α and hence a smaller critical value z_{α/2}. Consequently, a smaller sample size is required.
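Answers (b) and (c) both rest on how the critical value z_{α/2} changes with the confidence level. The following is a minimal sketch, assuming the usual sample-size formula for a mean, n = (z_{α/2} σ / E)², where σ and the margin of error E are hypothetical planning values that do not appear in the exam.

```python
from math import ceil
from statistics import NormalDist


def z_critical(confidence):
    """Upper alpha/2 critical value of the standard normal distribution."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size_for_mean(confidence, sigma, margin):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = z_critical(confidence)
    return ceil((z * sigma / margin) ** 2)


# Hypothetical planning values, chosen only for illustration.
sigma, margin = 15.0, 2.0

z90, z95 = z_critical(0.90), z_critical(0.95)
n90 = sample_size_for_mean(0.90, sigma, margin)
n95 = sample_size_for_mean(0.95, sigma, margin)

print(f"critical values: 90% -> {z90:.3f}, 95% -> {z95:.3f}")
print(f"sample sizes:    90% -> {n90}, 95% -> {n95}")
```

Because the interval half-width is proportional to z_{α/2}, the smaller critical value at 90% gives both a narrower interval for a fixed sample and a smaller required n for a fixed margin.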

(d) In testing the independence of two qualitative variables, if the chi-square test statistic results in rejecting the null hypothesis, then we can conclude that these variables are directly related.

Answer: False. Rejecting the null hypothesis of independence implies that the two qualitative variables are dependent. However, it does not necessarily imply that they are directly related. A latent variable may directly affect both qualitative variables.

2. (3 points) The following graphs show the probability density functions of three quantitative measurements in the left panels and the summary statistics of samples taken from the three populations (not in any particular order). Match each sample with the proper probability density function. Reason your answers.


[Figure: three panels showing the probability density functions f1(x), f2(x), and f3(x), each plotted for x from −10 to 20.]

Data Set   Mean   Median
#A         4.99   6.02
#B         5.01   4.26
#C         5.06   5.05

f1(x) is linked to data set #C (symmetric; the sample mean and sample median are about the same). f2(x) is linked to data set #B (positively skewed; the sample mean is typically greater than the sample median), and f3(x) is linked to data set #A (negatively skewed; the sample mean is typically smaller than the sample median).

3. (2 points) Consider a 5 by 4 contingency table for testing the independence of two qualitative measurements. What are the degrees of freedom for the chi-square test statistic?

Answer: (5 − 1) × (4 − 1) = 12.


4. (3 points) The breaking strength of a rivet has a normal distribution with a mean of 10,000 psi and a standard deviation of 500 psi. What is the probability that the sample mean breaking strength for a random sample of 36 rivets is between 9900 and 10,200?

The sample mean X̄ has a normal distribution with mean 10,000 psi and standard deviation 500/√36 = 83.333 psi. Hence,

$$P(9900 \le \bar{X} \le 10200) = \Phi\!\left(\frac{10200 - 10000}{83.333}\right) - \Phi\!\left(\frac{9900 - 10000}{83.333}\right) = \Phi(2.40) - \Phi(-1.20) = 0.9918 - 0.1151 = 0.8767.$$
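The table lookup in Problem 4 can be cross-checked numerically. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10_000, 500, 36       # population mean, population sd, sample size
standard_error = sigma / sqrt(n)     # sd of the sample mean: 500 / 6

# Sampling distribution of the sample mean for a normal population.
xbar_dist = NormalDist(mu, standard_error)

z_upper = (10_200 - mu) / standard_error
z_lower = (9_900 - mu) / standard_error
probability = xbar_dist.cdf(10_200) - xbar_dist.cdf(9_900)

print(f"standard error = {standard_error:.3f}")
print(f"z-scores: {z_lower:.2f} and {z_upper:.2f}")
print(f"P(9900 <= mean <= 10200) = {probability:.4f}")
```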

5. Osteoporosis, or a loss of bone minerals, is a common cause of broken bones in the elderly. A researcher on aging conjectures that bone mineral loss can be reduced by regular physical therapy or by certain kinds of physical activity. A study is conducted on 200 elderly subjects of approximately the same age, divided into control, physical therapy, and physical activity groups. After a suitable period of time, the nature of change in bone mineral content is observed.

(numbers in parentheses are expected counts)

                      Change in bone mineral
            Appreciable   Little        Appreciable
            loss          change        increase      Total
Control     38 (22.50)    15 (23.10)     7 ( ? )       60
Therapy     22 (26.25)    32 (26.95)    16 (16.80)     70
Activity    15 (26.25)    30 (26.95)    25 (16.80)     70
Total       75            77            48            200

The test statistic is given by (one row of terms per group, in table order):

X² = 10.68 + 2.84 + 3.80
   + 0.69 + 0.95 + 0.04
   + 4.82 + 0.34 + 4.00 = 28.16

(a) (1 point) There is a missing expected count. Fill it in. Show your work.

The missing expected count (row total times column total, divided by the grand total) is

$$E_{13} = \frac{60 \times 48}{200} = 14.40.$$
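The expected counts and the cell contributions to X² can be reproduced from the observed counts alone. This is a minimal sketch in plain Python; the dictionary layout and variable names are illustrative choices, not part of the exam.

```python
# Observed counts: columns are appreciable loss, little change, appreciable increase.
observed = {
    "Control":  [38, 15, 7],
    "Therapy":  [22, 32, 16],
    "Activity": [15, 30, 25],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count under H0: (row total * column total) / grand total.
expected = {
    group: [row_totals[group] * ct / grand_total for ct in col_totals]
    for group in observed
}

# Each cell contributes (observed - expected)^2 / expected to the statistic.
contributions = {
    group: [(o - e) ** 2 / e for o, e in zip(observed[group], expected[group])]
    for group in observed
}
chi_square = sum(sum(cells) for cells in contributions.values())

for group in observed:
    exp_row = ", ".join(f"{e:.2f}" for e in expected[group])
    con_row = ", ".join(f"{c:.2f}" for c in contributions[group])
    print(f"{group:<9} expected: {exp_row} | contributions: {con_row}")
print(f"X^2 = {chi_square:.2f}")
```

Individual contributions may differ from the printed solution in the last digit because of rounding.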

(b) (2 points) Is there evidence to indicate that the change in bone mineral varies for the different groups? If your answer is yes, comment on the group (or groups) that contributes most to the rejection of the null hypothesis. Use α = 0.01.


Answer: Yes. The degrees of freedom for the chi-square test are (3 − 1)(3 − 1) = 4. For the test of homogeneity at α = 0.01, the null hypothesis is rejected since X² = 28.16 > χ²_{0.01,4} = 13.2767 (a MINITAB calculation gave a P-value of 0.00001). The control group contributes most to the rejection, with the Appreciable loss cell making the largest contribution (10.68): its observed count (38) is well above the expected count (22.50) under H0 (homogeneity). The Appreciable loss cell of the physical activity group also contributes substantially (4.82), though less than the cell just mentioned; there, the observed count (15) is lower than expected (26.25).
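The critical value and P-value quoted in the answer can be checked without statistical software. The sketch below relies on the fact that, for 4 degrees of freedom specifically, the chi-square upper-tail probability has the closed form P(X > x) = e^{−x/2}(1 + x/2); this shortcut is an addition for illustration and does not carry over to other degrees of freedom unchanged.

```python
from math import exp


def chi2_upper_tail_df4(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 4 df.

    Valid only for 4 degrees of freedom (closed form for even df).
    """
    return exp(-x / 2) * (1 + x / 2)


alpha = 0.01
test_statistic = 28.16     # X^2 from the solution
critical_value = 13.2767   # chi-square critical value quoted in the solution

p_value = chi2_upper_tail_df4(test_statistic)
tail_at_critical = chi2_upper_tail_df4(critical_value)
reject_h0 = p_value < alpha

print(f"tail area beyond the critical value = {tail_at_critical:.4f}")
print(f"P-value = {p_value:.2e}")
print(f"reject H0 at alpha = {alpha}: {reject_h0}")
```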

6. A rocket motor is manufactured by bonding together two types of propellants, an igniter and a sustainer. It is important to study the shear strength (in psi) of the bond (y) and the age (in weeks) of the propellant (x) when the motor is cast. Twenty observations are taken and a straight line is fitted through the data. Selected computer output from a simple linear regression analysis is attached under "Computer Output for Problem 6". Use the computer output to answer the following questions (yes, no computation is needed!).

(a) (2 points) Does it appear that a straight line relationship is reasonable? Use α = 0.01. (Hint: Perform the model utility test and inspect the scatterplot.)

Answer: Yes. The model utility test has a t statistic of −12.31 with a P-value of 0.000, which is below α = 0.01. The straight line does provide significant information for predicting the response. From the graph, the straight line appears to fit the data well. The evidence that the straight-line model fits reasonably well is strengthened by the large R² (89.4%).

(b) (1 point) Give a 95% confidence interval for the mean shear strength of all propellants at an age of 16 weeks. Use α = 0.05.

From the computer output, the confidence interval is (1985.1, 2085.4).
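Although the question requires no computation, it may help to see how the reported intervals relate to the other numbers in the output. The sketch below assumes the standard formulas Fit ± t × SE Fit for the confidence interval and Fit ± t × √(S² + SE Fit²) for the prediction interval, and it hard-codes the t critical value for 18 degrees of freedom (2.101, taken from a t table, not from the output). Small discrepancies in the last digit come from rounding in the printed output.

```python
from math import sqrt

# Values read from the regression output.
intercept, slope = 2628.17, -37.058
s = 100.485            # residual standard deviation S
se_fit = 23.9          # SE Fit at age = 16
age = 16.0

# Assumption: t critical value for 95% confidence with 20 - 2 = 18 df (t table).
t_crit = 2.101

fit = intercept + slope * age

ci_half_width = t_crit * se_fit
pi_half_width = t_crit * sqrt(s ** 2 + se_fit ** 2)

ci = (fit - ci_half_width, fit + ci_half_width)
pi = (fit - pi_half_width, fit + pi_half_width)

print(f"fit at age 16 = {fit:.1f}")
print(f"95% CI for the mean response: ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"95% PI for a new observation: ({pi[0]:.1f}, {pi[1]:.1f})")
```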

(c) (1 point) Determine the sample correlation coefficient between the shear strength and age.

The straight line slopes downward (the estimated slope is negative). Hence, the correlation coefficient is the negative square root of R²:

$$r = -\sqrt{0.894} = -0.9455.$$
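The same value can be recovered from the output in code. This sketch takes the sign of r from the estimated slope and, as a consistency check, also recomputes R² from the ANOVA sums of squares and compares t² for the slope with the F statistic (in simple linear regression these agree up to rounding).

```python
from math import copysign, sqrt

# Values read from the regression output.
slope = -37.058
r_squared = 0.894                      # R-Sq = 89.4%
ss_regression, ss_total = 1530786, 1712536
t_slope, f_statistic = -12.31, 151.60

# r has the same sign as the fitted slope.
r = copysign(sqrt(r_squared), slope)

# Cross-checks against the ANOVA table.
r_squared_from_ss = ss_regression / ss_total
t_squared = t_slope ** 2

print(f"r = {r:.4f}")
print(f"R^2 from sums of squares = {r_squared_from_ss:.4f}")
print(f"t^2 = {t_squared:.2f} versus F = {f_statistic:.2f}")
```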

(d) (2 points) Is it wise to predict the shear strength of a propellant at an age of 30 weeks? Reason your answer.

Answer: No. The age of propellants in the data set ranges (approximately) from 2 to 26 weeks. A propellant of age 30 weeks is well beyond the data range. Since no information is available beyond this range, it is unwise to predict its shear strength. Extrapolation is dangerous!


Computer Output for Problem 6

[Figure: scatterplot of strength versus age; axis ticks run from 5 to 25 for age and from 1800 to 2600 for strength.]

Regression Analysis: strength versus age

The regression equation is
strength = 2628 - 37.1 age

Predictor   Coef      SE Coef   T        P
Constant    2628.17   46.00     57.13    0.000
age         -37.058   3.010     -12.31   0.000

S = 100.485 R-Sq = 89.4% R-Sq(adj) = 88.8%

Analysis of Variance

Source           DF   SS        MS        F        P
Regression        1   1530786   1530786   151.60   0.000
Residual Error   18    181751     10097
Total            19   1712536

Predicted Values for New Observations

New Obs   age    Fit      SE Fit   95% CI             95% PI
1         16.0   2035.2   23.9     (1985.1, 2085.4)   (1818.3, 2252.2)
