ST7003-1
TRINITY COLLEGE DUBLIN THE UNIVERSITY OF DUBLIN
Faculty of Engineering, Mathematics and Science
School of Computer Science and Statistics
Postgraduate Certificate in Statistics Hilary Term 2015
DESIGN AND ANALYSIS OF EXPERIMENTS
Wednesday 29 April 2015 Sports Centre 14.00 – 17.00
Professor Stuart, Professor Parnell
______________________________________________________________________
Instructions to Candidates: Answer all 3 questions. Questions 1 and 2 carry 30 marks each. Question 3 carries 40 marks. Answer each question in a separate answer book. Appendix 1, pages 12 - 14, gives tables of critical values of the t distribution and selected critical values of the F distribution.
Materials permitted for this examination:
Non-programmable calculators are permitted for this examination; please indicate the make and model of your calculator on each answer book used.
2
1 Chemicals are used to increase the water retention capacity of meats (or "preserve the integrity of the moisture content in meats"). An experiment was conducted using two such chemicals, identified as A and B. Each chemical was used at three levels in a 3x3 factorial design, in duplicate. Water retention was measured in millilitres. The results are shown in Table 1.
Table 1: Water retention capacity (ml H2O) using three levels of Factor A and three levels of Factor B, in duplicate.
                     Factor B
Factor A        1       2       3
   1          1.14    2.23    0.74
              1.05    2.30    0.50
   2          1.87    3.13    1.43
              1.60    3.00    1.00
   3          1.70    2.80    0.10
              1.80    1.95    0.05
An analysis of variance produced the following results.
Analysis of Variance for Water Retention Capacity
Source DF SS MS F P
A 2 1.6631 0.8315 13.86 0.002
B 2 11.2170 5.6085 93.45 0.000
A*B 4 0.9487 0.2372 3.95 0.040
Error 9 0.5401 0.0600
Total 17 14.3690
S = 0.244983
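The sums of squares in this table can be checked directly from the Table 1 readings. The following is a minimal sketch in plain Python, not the software output itself. It assumes that the rows of Table 1 are the levels of Factor A and the columns are the levels of Factor B, an assignment inferred here from the fact that it reproduces the printed A and B sums of squares. All variable names are illustrative.

```python
# Table 1 readings: data[i][j] holds the duplicate pair for
# Factor A level i+1 (rows) and Factor B level j+1 (columns).
# The row/column assignment is an assumption inferred from the ANOVA table.
data = [
    [(1.14, 1.05), (2.23, 2.30), (0.74, 0.50)],
    [(1.87, 1.60), (3.13, 3.00), (1.43, 1.00)],
    [(1.70, 1.80), (2.80, 1.95), (0.10, 0.05)],
]
n_a, n_b, n_rep = 3, 3, 2


def mean(values):
    values = list(values)
    return sum(values) / len(values)


all_obs = [y for row in data for cell in row for y in cell]
grand = mean(all_obs)
a_means = [mean(y for cell in row for y in cell) for row in data]
b_means = [mean(y for row in data for y in row[j]) for j in range(n_b)]

# Sums of squares for a balanced two-factor design with replication.
ss_total = sum((y - grand) ** 2 for y in all_obs)
ss_a = n_b * n_rep * sum((m - grand) ** 2 for m in a_means)
ss_b = n_a * n_rep * sum((m - grand) ** 2 for m in b_means)
ss_error = sum((y - mean(cell)) ** 2 for row in data for cell in row for y in cell)
ss_ab = ss_total - ss_a - ss_b - ss_error

# Degrees of freedom and F ratios (each term tested against Error).
df_a, df_b = n_a - 1, n_b - 1
df_ab = df_a * df_b
df_error = n_a * n_b * (n_rep - 1)
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b),
                     ("A*B", ss_ab, df_ab), ("Error", ss_error, df_error)]:
    print(f"{name:6s} DF={df}  SS={ss:.4f}  MS={ss / df:.4f}")
print(f"F(A)={f_a:.2f}  F(B)={f_b:.2f}  F(A*B)={f_ab:.2f}  S={ms_error ** 0.5:.4f}")
```

The results agree with the printed table to rounding, which supports the row and column assignment assumed above.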
(a) Provide a brief report on the statistical significance of the results (3 marks)
(b) Find the 5% critical value for the F ratio for interaction and explain how it
relates to the p-value. (3 marks)
(c) Why are there 9 degrees of freedom for Error? (2 marks)
Table 2 on page 3 shows summary data.
(d) Draw an interaction plot showing profiles of levels of Factor B with levels
of Factor A on the horizontal axis. (4 marks)
3
Table 2: Mean water retention capacity (ml H2O) using three levels of Factor A and three levels of Factor B
                     Factor B
Factor A        1        2        3       Means
   1          1.095    2.265    0.620    1.327
   2          1.735    3.065    1.215    2.005
   3          1.750    2.375    0.075    1.400
 Means        1.527    2.568    0.637    1.577
(e) Provide a brief interpretation of the interaction plot of part (d). (4 marks)
(f) Explain why interpretation of main effects is not recommended in this
case. (2 marks)
(g) Identify the optimum combination of factor levels (assuming high water
retention capacity is desirable) and calculate a 95% confidence interval for
the mean water retention capacity when using that combination. (5 marks)
The following diagnostic plot was produced along with the analysis of variance.
The cases with "deleted" residuals approximating +4 and –4 correspond to the
duplicate design points with Factor A at level 3 and Factor B at level 2.
[Diagnostic plot: Deleted Residual (vertical axis, -4 to 5) versus Fitted Value (horizontal axis, 0.0 to 3.0).]
4
(h) Explain the advantage of using "deleted" residuals as distinct from ordinary (raw)
residuals in plots such as this. (2 marks)
(i) Provide a brief interpretive comment on the plot. (2 marks)
(j) What action(s) would you recommend based on your interpretation of this plot? (3 marks)
5
2. In an experimental incineration plant, three versions of the basic burner were
evaluated with a view to identifying the most efficient version. The measure of
efficiency used in this case was the residual amount of a particular toxic
chemical; smaller is better. A complete burning cycle took approximately two
hours so that at most three burns could be completed in a single working day.
To allow for the possibility that burner efficiency might be subject to variation
depending on changing conditions from day to day, all three burners were used,
in random order, on each of four successive days. The results were as follows;
efficiency is recorded as per cent multiplied by 100. (Thus, the efficiency recorded
using Burner 1 on Day 1 was 0.21%.)
                      Burner
Day (Block)      B1       B2       B3      Mean
     1           21       23       23     22.33
     2           18       19       22     19.67
     3           18       21       20     19.67
     4           17       20       21     19.33
   Mean        18.50    20.75    21.50    20.25
An analysis of variance was calculated using Minitab, with the following results.
Two-way ANOVA: Efficiency versus Burner, Day (Block)
Source DF SS MS F P
Burner 2 19.5000 9.75000 11.32 0.009
Day (Block) 3 17.5833 5.86111 6.81 0.023
Error 6 5.1667 0.86111
Total 11 42.2500
Tukey 95.0% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Burner
Burner = 1 subtracted from:
Burner Lower Center Upper ------+---------+---------+---------+
2 0.2363 2.250 4.264 (---------*---------)
3 0.9863 3.000 5.014 (---------*---------)
------+---------+---------+---------+
0.0 2.0 4.0 6.0
Burner = 2 subtracted from:
Burner Lower Center Upper ------+---------+---------+---------+
3 -1.264 0.7500 2.764 (---------*---------)
------+---------+---------+---------+
0.0 2.0 4.0 6.0
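The two-way analysis of variance above can be reproduced from the burner-by-day data. This is a minimal sketch in plain Python, not Minitab's own computation. It assumes a randomized block layout with one observation per burner per day, and all variable names are illustrative.

```python
# Residual toxic chemical (per cent x 100): rows are days (blocks),
# columns are burners B1, B2, B3.
y = [
    [21, 23, 23],
    [18, 19, 22],
    [18, 21, 20],
    [17, 20, 21],
]
n_days, n_burners = len(y), len(y[0])

grand = sum(map(sum, y)) / (n_days * n_burners)
day_means = [sum(row) / n_burners for row in y]
burner_means = [sum(row[j] for row in y) / n_days for j in range(n_burners)]

# Sums of squares for a randomized block design (no replication).
ss_total = sum((v - grand) ** 2 for row in y for v in row)
ss_burner = n_days * sum((m - grand) ** 2 for m in burner_means)
ss_day = n_burners * sum((m - grand) ** 2 for m in day_means)
ss_error = ss_total - ss_burner - ss_day

df_burner, df_day = n_burners - 1, n_days - 1
df_error = df_burner * df_day
ms_error = ss_error / df_error
f_burner = (ss_burner / df_burner) / ms_error
f_day = (ss_day / df_day) / ms_error

print(f"Burner      DF={df_burner}  SS={ss_burner:.4f}  F={f_burner:.2f}")
print(f"Day (Block) DF={df_day}  SS={ss_day:.4f}  F={f_day:.2f}")
print(f"Error       DF={df_error}  SS={ss_error:.4f}  MS={ms_error:.5f}")
```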
6
(a) Report on the statistical significance of the results shown in the Analysis
of Variance table, in terms of F ratios and p-values. (6 marks)
(b) Discuss the F test for Burner effect:
what is measured / estimated by the Error Mean Square (MS)?
what is measured / estimated by the Burner Mean Square (MS)?
what is measured by the Burner F-ratio?
what hypothesis is tested by the Burner F-ratio? (6 marks)
(c) Summarise the results of the Tukey pairwise comparisons.
(3 marks)
(d) Briefly explain how and why the simultaneous confidence intervals shown
above differ from confidence intervals for differences between individual
pairs of means. (3 marks)
A one-way analysis of variance, ignoring blocking, resulted as follows.
Source DF SS MS F P
Burner 2 19.50 9.75 3.86 0.062
Error 9 22.75 2.53
Total 11 42.25
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Burner
Burner = 1 subtracted from:
Burner Lower Center Upper ---+---------+---------+---------+---
2 -0.890 2.250 5.390 (----------*---------)
3 -0.140 3.000 6.140 (---------*---------)
---+---------+---------+---------+---
-3.0 0.0 3.0 6.0
Burner = 2 subtracted from:
Burner Lower Center Upper ---+---------+---------+---------+---
3 -2.390 0.750 3.890 (----------*---------)
---+---------+---------+---------+---
-3.0 0.0 3.0 6.0
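The arithmetic link between the two analyses can be sketched from the printed values alone. The following plain Python snippet is a check on the tables, not part of the original output. It uses only numbers shown in the two analysis of variance tables and the Tukey intervals, and the variable names are illustrative.

```python
# Values copied from the two-way (blocked) analysis of variance table.
ss_burner, df_burner = 19.50, 2
ss_day, df_day = 17.5833, 3
ss_error_blocked, df_error_blocked = 5.1667, 6

# Ignoring blocks moves the Day (Block) variation into the error term.
ss_error_oneway = ss_day + ss_error_blocked
df_error_oneway = df_day + df_error_blocked
ms_error_oneway = ss_error_oneway / df_error_oneway
f_oneway = (ss_burner / df_burner) / ms_error_oneway

# Half-widths of the printed Tukey intervals for Burner 2 minus Burner 1.
half_width_blocked = (4.264 - 0.2363) / 2
half_width_unblocked = (5.390 - (-0.890)) / 2

print(f"One-way Error: DF={df_error_oneway}  SS={ss_error_oneway:.2f}  "
      f"MS={ms_error_oneway:.2f}  F={f_oneway:.2f}")
print(f"Tukey half-width: blocked={half_width_blocked:.3f}  "
      f"unblocked={half_width_unblocked:.3f}")
```

The pooled error sum of squares and degrees of freedom match the one-way table, and the interval half-widths can be compared directly.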
7
(e) Compare the results of the two-way analysis with those of the one-way
analysis, referring to both analysis of variance and Tukey pairwise
comparisons. Discuss the benefits of blocking in the light of these
comparisons. (4 marks)
(f) Explain why randomization might be used in experiments such as this and
how it achieves its goal.
(4 marks)
(g) Describe how a spreadsheet might be used to implement the
randomization in this case. (4 marks)
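For comparison with a spreadsheet approach, a within-day randomization of run order can also be sketched in code. This is a minimal illustration in Python, not a prescribed method. The seed value and all names are hypothetical and are used only to make the sketch reproducible.

```python
import random

burners = ["B1", "B2", "B3"]
days = [1, 2, 3, 4]

# Hypothetical seed, fixed only so that the sketch gives the same plan each run.
rng = random.Random(2015)

schedule = {}
for day in days:
    order = burners[:]   # every burner is used exactly once on each day (block)
    rng.shuffle(order)   # independent random run order within the day
    schedule[day] = order

for day, order in schedule.items():
    print(f"Day {day}: " + ", ".join(order))
```

Shuffling separately within each day keeps the blocking intact: each burner still appears once per day, and only the run order within the day is left to chance.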
8
3 When weeds occur in fields where food crops are being grown, there is
competition between the weeds and the food crops for nutrients supplied via the
soil in which the crops are planted and any added fertilisers. Evidence suggests
that different weed species may have different competitive effects. Experiments
may be carried out in which a standard wheat variety is grown in combination
with different weed species in different plots and the wheat yields from the
different plots are compared with a view to estimating the differential weed
species effects.
A complicating factor is that irrigation has an effect on wheat yield and this effect
may vary, depending on the competing weed species.
The matter is further complicated by the fact that, whereas wheat seed may be
sown combined with various weed seed combinations in relatively small plots of
land, the water piping arrangements required for irrigation mean that irrigated
areas will necessarily be larger. This means that plots treated with the same
level of irrigation (Irrigation or No irrigation) will be made up of a number of the
smaller plots treated with the different weed species.
An experiment was conducted in which a single variety of wheat was sown in
combination with three weed species and none in two sets of four neighbouring
plots, with one set being irrigated while the other was not irrigated, and the whole
arrangement was replicated four times, resulting in four blocks of eight plots
each. The weed species were black-grass (Bg), cleavers (Cl) and chickweed
(Cw), with no weed being designated as Nw. Irrigation was applied to one half of
each block, selected at random, while the other half was not irrigated. Weed
species (and none) were applied randomly within each set of four plots. The
yield (Y) of grain from each plot at 85% dry matter in tonnes per hectare was
measured. The results of the experiment are shown in Table 3 that follows.
9
Table 3 Wheat yields corresponding to different weed species for both irrigated
and non irrigated areas in four blocks.
                 Block I         Block II        Block III       Block IV
Irrigation       Y      N        Y      N        Y      N        Y      N
Weed Species
Bg             3.62   4.12     2.52   3.19     1.97   2.92     2.73   3.71
Cl             4.49   7.59     4.70   5.72     2.20   6.90     4.91   6.51
Cw             5.70   6.77     5.91   6.32     4.91   6.64     5.78   6.65
Nw             7.92   9.11     7.05   8.02     5.54   8.18     8.22   7.16
(a) Sketch a layout for this experiment showing a plausible assignment of
irrigation levels in the four blocks and a plausible assignment of weed
species in one of the blocks. (5 marks)
(b) Identify the whole plots and the whole plot treatments, the sub plots and
the subplot treatments. (4 marks)
(c) Show the plot and treatment structure diagram for these data. (5 marks)
(d) Write down the components of a Minitab style model for these data,
separating the terms in accordance with the plot structure and identifying
random term(s). (3 marks)
(e) Indicate how a split plot design facilitates assessing how the effect of
irrigation on wheat yield may vary, depending on the competing weed
species. (2 marks)
The data are illustrated in Figure 1 that follows.
10
[Figure 1 legend: Irrigation]
Figure 1 Wheat yields corresponding to different weed species for both irrigated
and non irrigated areas in four blocks.
(f) Ignoring statistical significance, provide a commentary on the results with
respect to effects of all factors, including the blocking factor, and key
interactions. Refer to evidence in the graphs to support your commentary.
(6 marks)
The Analysis of Variance produced by Minitab for these data is as follows, where
B, I, W represent Block, Irrigation, Weed species, respectively.
[Figure 1 panels: Block I, Block II, Block III, Block IV. Each panel plots Wheat Yield (vertical axis) against Weed Species (Bg, Cl, Cw, Nw; horizontal axis).]
11
Source DF Adj SS Adj MS F-Value P-Value
B 3 6.647 2.2158 1.48 0.378
I 1 14.231 14.2311 9.48 0.054
B*I 3 4.504 1.5012 5.75 0.006
W 3 85.926 28.6419 109.73 0.000
I*W 3 4.371 1.4571 5.58 0.007
Error 18 4.699 0.2610
Total 31 120.378
(Note: The B*W interaction was not at all significant and so was omitted
from the analysis.)
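The printed F ratios can be checked against the printed mean squares. The sketch below, in plain Python, assumes that B and I are tested against B*I and that the remaining terms are tested against Error. This pairing is inferred here from the arithmetic of the table rather than stated in the output, and the dictionary names are illustrative.

```python
# Adjusted mean squares copied from the analysis of variance table.
ms = {
    "B": 2.2158,
    "I": 14.2311,
    "B*I": 1.5012,
    "W": 28.6419,
    "I*W": 1.4571,
    "Error": 0.2610,
}

# Assumed denominator (error term) for each F ratio; an inference, not source text.
denominator = {
    "B": "B*I",
    "I": "B*I",
    "B*I": "Error",
    "W": "Error",
    "I*W": "Error",
}

f_ratio = {term: ms[term] / ms[den] for term, den in denominator.items()}

for term, value in f_ratio.items():
    print(f"{term:4s} F = MS({term}) / MS({denominator[term]}) = {value:.2f}")
```

Small differences in the last decimal place are to be expected, since the printed mean squares are themselves rounded.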
(g) Identify the error terms corresponding to each of the other terms in the
model underlying the analysis. Confirm the values of the relevant F ratios;
show the relevant calculations. With reference to your answer to part (f),
comment on the validity of the whole plots error term. (5 marks)
(h) Discuss the statistical significance of the results. Make cross references
to your answer to part (f). (5 marks)
(i) Comment on the effectiveness of blocking in this case. (1 mark)
(j) Calculate a new Whole Plot variation by combining the B and B*I sources
of variation. Use this to recalculate the F ratio for Irrigation. Assess its
statistical significance by reference to the tables of the F distribution at the
end of this paper. Comment. (4 marks)
12
Appendix 1 Statistical Tables
Selected critical values for the t-distribution
α is the proportion of values in a t distribution with ν degrees of freedom
which exceed in magnitude the tabled value. For example, 25% of the
values in a t distribution with 1 degree of freedom are outside ±2.41.
    α =      .25     .10     .05     .02     .01    .002    .001
ν =   1     2.41    6.31   12.71   31.82   63.66  318.32  636.61
      2     1.60    2.92    4.30    6.96    9.92   22.33   31.60
      3     1.42    2.35    3.18    4.54    5.84   10.22   12.92
      4     1.34    2.13    2.78    3.75    4.60    7.17    8.61
      5     1.30    2.02    2.57    3.36    4.03    5.89    6.87
      6     1.27    1.94    2.45    3.14    3.71    5.21    5.96
      7     1.25    1.89    2.36    3.00    3.50    4.79    5.41
      8     1.24    1.86    2.31    2.90    3.36    4.50    5.04
      9     1.23    1.83    2.26    2.82    3.25    4.30    4.78
     10     1.22    1.81    2.23    2.76    3.17    4.14    4.59
     12     1.21    1.78    2.18    2.68    3.05    3.93    4.32
     15     1.20    1.75    2.13    2.60    2.95    3.73    4.07
     20     1.18    1.72    2.09    2.53    2.85    3.55    3.85
     24     1.18    1.71    2.06    2.49    2.80    3.47    3.75
     30     1.17    1.70    2.04    2.46    2.75    3.39    3.65
     40     1.17    1.68    2.02    2.42    2.70    3.31    3.55
     60     1.16    1.67    2.00    2.39    2.66    3.23    3.46
    120     1.16    1.66    1.98    2.36    2.62    3.16    3.37
      ∞     1.15    1.64    1.96    2.33    2.58    3.09    3.29
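The worked example for 1 degree of freedom can be checked without tables, because a t distribution with 1 degree of freedom is the standard Cauchy distribution, whose distribution function has a closed form. The function name below is illustrative, and the sketch covers only the 1 degree of freedom row.

```python
import math


def two_sided_tail_t1(t):
    """P(|T| > t) for a t distribution with 1 degree of freedom (standard Cauchy)."""
    return 1.0 - 2.0 * math.atan(t) / math.pi


# Tabled values from the 1 degree of freedom row, paired with their column headings.
for t, heading in [(2.41, 0.25), (6.31, 0.10), (12.71, 0.05)]:
    print(f"t = {t:5.2f}: two-sided tail = {two_sided_tail_t1(t):.4f} (tabled as {heading})")
```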
13
Selected critical values for the F distribution
with ν1 numerator and ν2 denominator degrees of freedom
For example, 10% of the values in an F distribution with 1 numerator and 2 denominator degrees of freedom exceed 8.5.
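This example can also be verified in closed form. An F variable with 1 numerator and 2 denominator degrees of freedom is the square of a t variable with 2 degrees of freedom, for which P(|T| > t) = 1 - t / sqrt(2 + t^2). The sketch below applies only to this one pair of degrees of freedom, and the function name is illustrative.

```python
import math


def upper_tail_f_1_2(f):
    """P(F > f) for an F distribution with 1 numerator and 2 denominator df."""
    # Uses F(1, 2) = T^2 with T ~ t(2), so P(F > f) = P(|T| > sqrt(f)).
    return 1.0 - math.sqrt(f / (f + 2.0))


# Entries for 1 numerator and 2 denominator df from the four tables that follow.
for f, level in [(8.5, 0.10), (18.5, 0.05), (38.5, 0.025), (98.5, 0.01)]:
    print(f"f = {f:5.1f}: upper tail = {upper_tail_f_1_2(f):.4f} (table level {level})")
```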
10% critical values for the F distribution
1 2 3 4 5 6 7 8 10 12 24 ∞
1 39.9 49.5 53.6 55.8 57.2 58.2 58.9 59.4 60.2 60.7 62.0 63.3
2 8.5 9.0 9.2 9.2 9.3 9.3 9.3 9.4 9.4 9.4 9.4 9.5
3 5.5 5.5 5.4 5.3 5.3 5.3 5.3 5.3 5.2 5.2 5.2 5.1
4 4.5 4.3 4.2 4.1 4.1 4.0 4.0 4.0 3.9 3.9 3.8 3.8
5 4.1 3.8 3.6 3.5 3.5 3.4 3.4 3.3 3.3 3.3 3.2 3.1
6 3.8 3.5 3.3 3.2 3.1 3.1 3.0 3.0 2.9 2.9 2.8 2.7
7 3.6 3.3 3.1 3.0 2.9 2.8 2.8 2.8 2.7 2.7 2.6 2.5
8 3.5 3.1 2.9 2.8 2.7 2.7 2.6 2.6 2.5 2.5 2.4 2.3
9 3.4 3.0 2.8 2.7 2.6 2.6 2.5 2.5 2.4 2.4 2.3 2.2
10 3.3 2.9 2.7 2.6 2.5 2.5 2.4 2.4 2.3 2.3 2.2 2.1
12 3.2 2.8 2.6 2.5 2.4 2.3 2.3 2.2 2.2 2.1 2.0 1.9
15 3.1 2.7 2.5 2.4 2.3 2.2 2.2 2.1 2.1 2.0 1.9 1.8
20 3.0 2.6 2.4 2.2 2.2 2.1 2.0 2.0 1.9 1.9 1.8 1.6
40 2.8 2.4 2.2 2.1 2.0 1.9 1.9 1.8 1.8 1.7 1.6 1.4
120 2.7 2.3 2.1 2.0 1.9 1.8 1.8 1.7 1.7 1.6 1.4 1.2
∞ 2.7 2.3 2.1 1.9 1.8 1.8 1.7 1.7 1.6 1.5 1.4 1.0
5% critical values for the F distribution
1 2 3 4 5 6 7 8 10 12 24 ∞
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 241.9 243.9 249.1 254.3
2 18.5 19.0 19.2 19.2 19.3 19.3 19.4 19.4 19.4 19.4 19.5 19.5
3 10.1 9.6 9.3 9.1 9.0 8.9 8.9 8.8 8.8 8.7 8.6 8.5
4 7.7 6.9 6.6 6.4 6.3 6.2 6.1 6.0 6.0 5.9 5.8 5.6
5 6.6 5.8 5.4 5.2 5.1 5.0 4.9 4.8 4.7 4.7 4.5 4.4
6 6.0 5.1 4.8 4.5 4.4 4.3 4.2 4.1 4.1 4.0 3.8 3.7
7 5.6 4.7 4.3 4.1 4.0 3.9 3.8 3.7 3.6 3.6 3.4 3.2
8 5.3 4.5 4.1 3.8 3.7 3.6 3.5 3.4 3.3 3.3 3.1 2.9
9 5.1 4.3 3.9 3.6 3.5 3.4 3.3 3.2 3.1 3.1 2.9 2.7
10 5.0 4.1 3.7 3.5 3.3 3.2 3.1 3.1 3.0 2.9 2.7 2.5
12 4.7 3.9 3.5 3.3 3.1 3.0 2.9 2.8 2.8 2.7 2.5 2.3
15 4.5 3.7 3.3 3.1 2.9 2.8 2.7 2.6 2.5 2.5 2.3 2.1
20 4.4 3.5 3.1 2.9 2.7 2.6 2.5 2.4 2.3 2.3 2.1 1.8
30 4.2 3.3 2.9 2.7 2.5 2.4 2.3 2.3 2.2 2.1 1.9 1.6
40 4.1 3.2 2.8 2.6 2.4 2.3 2.2 2.2 2.1 2.0 1.8 1.5
120 3.9 3.1 2.7 2.4 2.3 2.2 2.1 2.0 1.9 1.8 1.6 1.3
∞ 3.8 3.0 2.6 2.4 2.2 2.1 2.0 1.9 1.8 1.8 1.5 1.0
14
2.5% critical values for the F distribution
1 2 3 4 5 6 7 8 10 12 24 ∞
1 647.8 799.5 864.2 899.6 921.8 937.1 948.2 956.6 968.6 976.7 997.3 1018.3
2 38.5 39.0 39.2 39.2 39.3 39.3 39.4 39.4 39.4 39.4 39.5 39.5
3 17.4 16.0 15.4 15.1 14.9 14.7 14.6 14.5 14.4 14.3 14.1 13.9
4 12.2 10.6 10.0 9.6 9.4 9.2 9.1 9.0 8.8 8.8 8.5 8.3
5 10.0 8.4 7.8 7.4 7.1 7.0 6.9 6.8 6.6 6.5 6.3 6.0
6 8.8 7.3 6.6 6.2 6.0 5.8 5.7 5.6 5.5 5.4 5.1 4.8
7 8.1 6.5 5.9 5.5 5.3 5.1 5.0 4.9 4.8 4.7 4.4 4.1
8 7.6 6.1 5.4 5.1 4.8 4.7 4.5 4.4 4.3 4.2 3.9 3.7
9 7.2 5.7 5.1 4.7 4.5 4.3 4.2 4.1 4.0 3.9 3.6 3.3
10 6.9 5.5 4.8 4.5 4.2 4.1 3.9 3.9 3.7 3.6 3.4 3.1
12 6.6 5.1 4.5 4.1 3.9 3.7 3.6 3.5 3.4 3.3 3.0 2.7
15 6.2 4.8 4.2 3.8 3.6 3.4 3.3 3.2 3.1 3.0 2.7 2.4
20 5.9 4.5 3.9 3.5 3.3 3.1 3.0 2.9 2.8 2.7 2.4 2.1
30 5.6 4.2 3.6 3.2 3.0 2.9 2.7 2.7 2.5 2.4 2.1 1.8
40 5.4 4.1 3.5 3.1 2.9 2.7 2.6 2.5 2.4 2.3 2.0 1.6
120 5.2 3.8 3.2 2.9 2.7 2.5 2.4 2.3 2.2 2.1 1.8 1.3
∞ 5.0 3.7 3.1 2.8 2.6 2.4 2.3 2.2 2.0 1.9 1.6 1.0
1% critical values for the F distribution
1 2 3 4 5 6 7 8 10 12 24 ∞
1 4052.2 4999.3 5403.5 5624.3 5764.0 5859.0 5928.3 5981.0 6055.9 6106.7 6234.3 6365.6
2 98.5 99.0 99.2 99.3 99.3 99.3 99.4 99.4 99.4 99.4 99.5 99.5
3 34.1 30.8 29.5 28.7 28.2 27.9 27.7 27.5 27.2 27.1 26.6 26.1
4 21.2 18.0 16.7 16.0 15.5 15.2 15.0 14.8 14.5 14.4 13.9 13.5
5 16.3 13.3 12.1 11.4 11.0 10.7 10.5 10.3 10.1 9.9 9.5 9.0
6 13.7 10.9 9.8 9.1 8.7 8.5 8.3 8.1 7.9 7.7 7.3 6.9
7 12.2 9.5 8.5 7.8 7.5 7.2 7.0 6.8 6.6 6.5 6.1 5.6
8 11.3 8.6 7.6 7.0 6.6 6.4 6.2 6.0 5.8 5.7 5.3 4.9
9 10.6 8.0 7.0 6.4 6.1 5.8 5.6 5.5 5.3 5.1 4.7 4.3
10 10.0 7.6 6.6 6.0 5.6 5.4 5.2 5.1 4.8 4.7 4.3 3.9
11 9.6 7.2 6.2 5.7 5.3 5.1 4.9 4.7 4.5 4.4 4.0 3.6
12 9.3 6.9 6.0 5.4 5.1 4.8 4.6 4.5 4.3 4.2 3.8 3.4
14 8.9 6.5 5.6 5.0 4.7 4.5 4.3 4.1 3.9 3.8 3.4 3.0
16 8.5 6.2 5.3 4.8 4.4 4.2 4.0 3.9 3.7 3.6 3.2 2.8
18 8.3 6.0 5.1 4.6 4.2 4.0 3.8 3.7 3.5 3.4 3.0 2.6
20 8.1 5.8 4.9 4.4 4.1 3.9 3.7 3.6 3.4 3.2 2.9 2.4
25 7.8 5.6 4.7 4.2 3.9 3.6 3.5 3.3 3.1 3.0 2.6 2.2
30 7.6 5.4 4.5 4.0 3.7 3.5 3.3 3.2 3.0 2.8 2.5 2.0
40 7.3 5.2 4.3 3.8 3.5 3.3 3.1 3.0 2.8 2.7 2.3 1.8
120 6.9 4.8 3.9 3.5 3.2 3.0 2.8 2.7 2.5 2.3 2.0 1.4
∞ 6.6 4.6 3.8 3.3 3.0 2.8 2.6 2.5 2.3 2.2 1.8 1.0
© UNIVERSITY OF DUBLIN 2015.