arnold-woods
• State the ‘null hypothesis’
• State the ‘alternative hypothesis’
• State either one-tailed or two-tailed test
• State the chosen statistical test with reasons
• State the level of significance (usually 5%)
• Present calculations
• Draw conclusions – accept or reject the ‘null hypothesis’
Nominal level data (categorical)
Expected frequencies in any cell should not fall below 5
Testing Genetic Ratios
The Chi-squared test is commonly used for comparing experimental genetic data with that expected from predicted ratios
In a breeding experiment in which tall, purple flowered pea plants were allowed to self pollinate, a Mendelian ratio of 9 : 3 : 3 : 1 was predicted among the progeny as follows:
Tall/Purple : Tall/White : Dwarf/Purple : Dwarf/White = 9 : 3 : 3 : 1
The Chi-squared test was used to determine how well the observations fitted the predicted ratio; in this context,
Chi-squared is used as a test of ‘goodness of fit’
Null Hypothesis: There is no difference between the results of the genetic cross and the predicted Mendelian ratio of 9 : 3 : 3 : 1
Alternative Hypothesis: The experimental results differ from the predicted 9 : 3 : 3 : 1 ratio
The basis of the Chi-squared test is the difference between the observed results (O) and the expected results (E) predicted by the ‘null hypothesis’:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Results of the Experimental Genetic Cross
Tall with purple flowers 245
Tall with white flowers 75
Dwarf with purple flowers 63
Dwarf with white flowers 27
• Calculate the expected results (according to a 9 : 3 : 3 : 1 ratio)
• Calculate the differences between the observed and expected results (O – E) and square the differences, (O – E)²
• Divide each (O – E)² by the relevant expected result and sum the values obtained to obtain the χ² (Chi-squared) value
• Calculate the number of degrees of freedom (n – 1) = number of categories – 1
• Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit, at the 5% significance level
Chi-squared result from a statistics package
Observed (O)   Ratio   Expected (E)   (O – E)²   (O – E)²/E
245            9       230.625        206.641    0.896
75             3       76.875         3.51563    0.04573
63             3       76.875         192.516    2.50427
27             1       25.625         1.89063    0.07378
Σ = 3.519782
Probability = 0.3182053
Degrees of Freedom = 3
Critical Value = 7.81
The χ² value (3.52) is less than the critical value, χ²crit (7.81), and thus we accept the ‘null hypothesis’; the experimental results do conform to a 9 : 3 : 3 : 1 ratio
The value of p (probability) is > 0.05, so there is no significant difference between the experimental results and a 9 : 3 : 3 : 1 ratio
• The probability value gives us a measure of the extent to which chance has caused any difference between the observed and expected results
• For a χ² value of 3.52 and 3 degrees of freedom there is between a 30% and 50% probability that chance alone has caused the difference between the observed and expected values, i.e. p lies between 0.30 and 0.50 (the actual value is 0.32); there is no significant difference between observed and expected results
• The χ² value (3.52) is less than the critical value, χ²crit (7.81), for a 5% (0.05) level of significance and thus we accept the ‘null hypothesis’
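The goodness-of-fit calculation for the pea cross can be sketched in a few lines of Python. This is a minimal illustration, not the statistics package used for the slide: the function names `chi_squared_gof` and `p_value_3df` are hypothetical, and the p-value uses a closed-form upper-tail expression that holds only for 3 degrees of freedom.

```python
import math

def chi_squared_gof(observed, ratio):
    """Return (expected, chi2) for a goodness-of-fit test against a ratio."""
    total = sum(observed)
    parts = sum(ratio)
    # Expected counts: share the total out according to the predicted ratio
    expected = [total * r / parts for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

def p_value_3df(chi2):
    """Upper-tail probability for chi-squared with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(chi2 / 2))
            + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

# Tall/Purple, Tall/White, Dwarf/Purple, Dwarf/White
observed = [245, 75, 63, 27]
expected, chi2 = chi_squared_gof(observed, [9, 3, 3, 1])
p = p_value_3df(chi2)

CRITICAL_5PCT_3DF = 7.81  # critical value quoted on the slide
print("expected:", expected)
print("chi-squared:", round(chi2, 4), "p:", round(p, 4))
print("reject null hypothesis:", chi2 > CRITICAL_5PCT_3DF)
```

The computed χ² and p should agree closely with the package output above (about 3.52 and 0.32), so the null hypothesis is not rejected.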
A group of students investigated the response of Daphnia to light by introducing approximately 12 organisms into a water filled tube, and counting the numbers present in the weakly illuminated and dark halves after a period of 100 seconds
Null Hypothesis: Lighting conditions have no effect on the distribution of Daphnia
Alternative Hypothesis: Lighting conditions do have an effect on the distribution of Daphnia
As nominal (categorical) data was obtained from the experimental procedure the Chi-squared test was used to
determine whether the null hypothesis should be accepted
The Yates’ correction is applied before squaring the differences between observed and expected results
In this example there is only one degree of freedom and thus the Yates’ correction should be applied to enhance the accuracy of the χ² value
The Yates’ correction
• The Yates’ correction is applied to enhance the accuracy of the χ² value when there is only ONE degree of freedom
• The differences between the observed and expected results (O – E) are calculated in the normal way but any negative difference is converted to a positive value; this is sometimes written |O – E| and is called the absolute value or modulus of (O – E)
• When all positive values of (O – E) have been determined, the value 0.5 is subtracted from each of the positive differences before these quantities are squared
The χ² value calculation therefore becomes:
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
Daphnia Distribution Results
Number of Daphnia
Group              1   2   3    4   5   6   Totals
Illuminated area   8   7   10   8   6   8
Dark area          4   5   3    4   6   5
• Calculate the totals and complete the table below
• Use the χ² value and the critical values table to determine χ²crit at the 5% significance level for one degree of freedom
• Use a statistics package to compare your values
• Interpret the findings
Category           O    E    |O – E|   |O – E| – 0.5   (|O – E| – 0.5)²   (|O – E| – 0.5)²/E
Illuminated area   47   37   10        9.5             90.25              2.439
Dark area          27   37   10        9.5             90.25              2.439
Σ = 4.87
The χ² value is 4.87
Daphnia Distribution Results
• For a χ² value of 4.87 and 1 degree of freedom, there is between a 1% and 5% probability that chance alone has caused the difference between the observed values and those expected if the ‘null hypothesis’ is true, i.e. p lies between 0.01 and 0.05 (the actual value is 0.02); there is a significant difference between observed and expected results
• The χ² value (4.87) is greater than the critical value, χ²crit (3.84), for a 5% (0.05) level of significance, and thus we reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ that lighting conditions do have an effect on the distribution of Daphnia, with more organisms congregating in the weakly illuminated areas than in the darker areas
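The Daphnia calculation with the Yates’ correction can be checked with a short Python sketch. It is only an illustration under stated assumptions: the helper name `chi_squared_yates` is hypothetical, the expected counts assume an equal split between the two halves of the tube (as the null hypothesis implies), and the p-value formula is the closed form valid only for 1 degree of freedom.

```python
import math

def chi_squared_yates(observed, expected):
    """Chi-squared with Yates' correction: subtract 0.5 from |O - E| before squaring."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# Counts from the six student groups
illuminated = [8, 7, 10, 8, 6, 8]
dark = [4, 5, 3, 4, 6, 5]

observed = [sum(illuminated), sum(dark)]
# Null hypothesis: light has no effect, so half the animals are expected in each area
expected = [sum(observed) / 2] * 2

chi2 = chi_squared_yates(observed, expected)
# Upper-tail probability for 1 degree of freedom only
p = math.erfc(math.sqrt(chi2 / 2))

CRITICAL_5PCT_1DF = 3.84  # critical value quoted on the slide
print("observed:", observed, "expected:", expected)
print("chi-squared:", round(chi2, 3), "p:", round(p, 3))
print("reject null hypothesis:", chi2 > CRITICAL_5PCT_1DF)
```

The unrounded χ² comes out very close to the 4.87 shown in the table, and p falls between 0.01 and 0.05, matching the interpretation above.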
A student investigated the relationship between eye colour and hair colour for a biology project
Null Hypothesis: There is no association between eye colour and hair colour
Alternative Hypothesis: There is an association between eye colour and hair colour
The student performed the Chi-squared test of association to determine if there was a significant relationship between eye colour and hair colour
Eye colour    Hair colour
              fair/red   brown   black
blue          53         17      9
green/grey    21         34      13
brown         7          14      23
• Calculate the row and column totals
• Calculate the grand total for the data
Eye colour      fair/red   brown   black   Row totals
blue            53         17      9       79
green/grey      21         34      13      68
brown           7          14      23      44
Column totals   81         65      45      191 (Grand total)
• Calculate the expected frequency, E, for each data cell using the formula:
$$\text{Expected Frequency} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$
e.g. the expected value for brown hair & blue eyes = (79 x 65) ÷ 191 = 26.885
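The expected-frequency formula can be applied to every cell at once. The following Python sketch is illustrative only; the function name `expected_frequencies` is hypothetical and the table is the eye colour and hair colour data given above (rows: blue, green/grey, brown eyes; columns: fair/red, brown, black hair).

```python
def expected_frequencies(table):
    """Expected count for each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

observed = [
    [53, 17, 9],    # blue eyes
    [21, 34, 13],   # green/grey eyes
    [7, 14, 23],    # brown eyes
]

expected = expected_frequencies(observed)
for row in expected:
    print("  ".join(f"{e:7.3f}" for e in row))
```

The value in the first row, second column corresponds to the worked example for brown hair and blue eyes.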
Observed frequencies (O) with expected frequencies (E) in brackets
Eye colour    fair/red       brown          black
blue          53 (33.503)    17 (26.885)    9 (18.613)
green/grey    21 (28.838)    34 (23.141)    13 (16.021)
brown         7 (18.660)     14 (14.974)    23 (10.366)
Draw up a table and calculate (O – E)² ÷ E for each data cell
Sum together the values of (O – E)² ÷ E to obtain the χ² test value
Observed frequency (O)   Expected frequency (E)   (O – E)² ÷ E
53                       33.503                   11.35
17                       26.885                   3.63
9                        18.613                   4.96
21                       28.838                   2.13
34                       23.141                   5.10
13                       16.021                   0.57
7                        18.660                   7.29
14                       14.974                   0.06
23                       10.366                   15.40
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 11.35 + 3.63 + 4.96 + 2.13 + 5.10 + 0.57 + 7.29 + 0.06 + 15.40 = 50.49$$
Number of degrees of freedom
df = (Number of rows – 1) x (Number of columns – 1)
df = (3 – 1) x (3 – 1)
df = 2 x 2 = 4
Use the critical values table to determine the critical value, χ²crit, corresponding to four degrees of freedom at the 5% (0.05) significance level
Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit, at the 5% significance level
For 4 df, the χ² value (50.49) is greater than the critical value, χ²crit (9.49), for a 5% (0.05) level of significance and thus we reject the ‘null hypothesis’
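The whole test of association for the eye colour and hair colour data can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `chi_squared_association` is hypothetical, no Yates’ correction is applied because there is more than one degree of freedom, and the critical value is the one quoted above.

```python
def chi_squared_association(table):
    """Return (chi2, df) for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e
    # Degrees of freedom: (rows - 1) x (columns - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

observed = [
    [53, 17, 9],    # blue eyes: fair/red, brown, black hair
    [21, 34, 13],   # green/grey eyes
    [7, 14, 23],    # brown eyes
]

chi2, df = chi_squared_association(observed)
CRITICAL_5PCT_4DF = 9.49  # critical value quoted on the slide
print("chi-squared:", round(chi2, 2), "df:", df)
print("reject null hypothesis:", chi2 > CRITICAL_5PCT_4DF)
```

The result should be close to the 50.49 obtained by hand; any small difference comes from rounding the individual terms before summing.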
Within the species Cepaea nemoralis (land snail), there is much variation in the colours and banding patterns of the shells
A group of students investigated the distribution of banded and unbanded snails in two different localities - a deciduous woodland and open grassland
Data was entered into a 2 x 2 contingency table
Null Hypothesis: There is no relationship between banding pattern and locality
Alternative Hypothesis: There is an association between banding pattern and locality
The students wanted to determine if there is an association between banding pattern and locality and performed the Chi-squared test of association
Location     Snail Type
             Banded   Unbanded
Woodland     36       105
Grassland    52       18
Snail Distribution Results
• Calculate the row and column totals
• Calculate the grand total for the snail data
Location        Banded   Unbanded   Row totals
Woodland        36       105        141
Grassland       52       18         70
Column totals   88       123        211 (Grand total)
• Calculate the expected frequency, E, for each data cell using the formula:
$$\text{Expected Frequency} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$
e.g. the expected value for banded snails in the woodland = (141 x 88) ÷ 211 = 58.806
The Yates’ correction is applied before squaring the differences between observed and expected results
In this example there is only one degree of freedom and thus the Yates’ correction should be applied to enhance the accuracy of the χ² value
Table showing expected and observed results: observed frequencies (O) with expected frequencies (E) in brackets
Location     Banded         Unbanded
Woodland     36 (58.806)    105 (82.194)
Grassland    52 (29.194)    18 (40.806)
Calculate the absolute values of (O – E), i.e. convert negative values to positive, and then subtract 0.5 from each of the differences: |O – E| – 0.5
Square the corrected differences: (|O – E| – 0.5)²
Draw up a table and calculate (|O – E| – 0.5)² ÷ E for each data cell
Sum together the values of (|O – E| – 0.5)² ÷ E to obtain the χ² test value
Observed frequency (O)   Expected frequency (E)   (|O – E| – 0.5)² ÷ E
36                       58.806                   8.46
105                      82.194                   6.05
52                       29.194                   17.04
18                       40.806                   12.19
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E} = 8.46 + 6.05 + 17.04 + 12.19 = 43.74$$
Number of degrees of freedom
df = (Number of rows – 1) x (Number of columns – 1)
df = (2 – 1) x (2 – 1)
df = 1 x 1 = 1
Use the critical values table to determine the critical value, χ²crit, corresponding to one degree of freedom at the 5% (0.05) significance level
Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit, at the 5% significance level
For 1 df, the χ² value (43.74) is greater than the critical value, χ²crit (3.84), for a 5% (0.05) level of significance and thus we reject the ‘null hypothesis’
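The snail calculation can be reproduced with a short Python sketch of a 2 x 2 test of association with the Yates’ correction. The function name `chi_squared_2x2_yates` is hypothetical and the code is an illustration only; it follows the steps described above (expected frequencies from the row and column totals, then 0.5 subtracted from each absolute difference before squaring).

```python
def chi_squared_2x2_yates(table):
    """Chi-squared for a 2 x 2 contingency table with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (abs(o - e) - 0.5) ** 2 / e              # Yates' correction
    return chi2

observed = [
    [36, 105],  # woodland: banded, unbanded
    [52, 18],   # grassland: banded, unbanded
]

chi2 = chi_squared_2x2_yates(observed)
CRITICAL_5PCT_1DF = 3.84  # critical value quoted on the slide
print("chi-squared:", round(chi2, 2))
print("reject null hypothesis:", chi2 > CRITICAL_5PCT_1DF)
```

The computed value should be close to the 43.74 obtained by hand; small differences arise because the hand calculation rounds each term before summing.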