• State the ‘null hypothesis’

• State the ‘alternative hypothesis’

• State either one-tailed or two-tailed test

• State the chosen statistical test with reasons

• State the level of significance (usually 5%)

• Present calculations

• Draw conclusions: accept or reject the ‘null hypothesis’

Nominal level data (categorical)

Expected frequencies in any cell should not fall below 5

Testing Genetic Ratios

The Chi-squared test is commonly used for comparing experimental genetic data with that expected from predicted ratios

In a breeding experiment in which tall, purple flowered pea plants were allowed to self pollinate, a Mendelian ratio of 9 : 3 : 3 : 1 was predicted among the progeny as follows:

Tall/Purple : Tall/White : Dwarf/Purple : Dwarf/White = 9 : 3 : 3 : 1

The Chi-squared test was used to determine how well the observations fitted the predicted ratio; in this context, Chi-squared is used as a test of ‘goodness of fit’

Null Hypothesis: There is no difference between the results of the genetic cross and the predicted Mendelian ratio of 9 : 3 : 3 : 1

Alternative Hypothesis: The experimental results differ from the predicted 9 : 3 : 3 : 1 ratio

The basis of the Chi-squared test is the difference between the observed results (O) and the expected results (E) predicted by the ‘null hypothesis’:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Results of the Experimental Genetic Cross

Tall with purple flowers 245

Tall with white flowers 75

Dwarf with purple flowers 63

Dwarf with white flowers 27

• Calculate the expected results (according to a 9 : 3 : 3 : 1 ratio)

• Calculate the differences between the observed and expected results (O – E) and square the differences (O – E)²

• Divide each (O – E)² by the relevant expected result and sum the values obtained to obtain the χ² (Chi-squared) value

• Calculate the number of degrees of freedom (n – 1) = number of categories – 1

• Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit, at the 5% significance level
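The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the statistics package used for these slides; the function name `chi_squared_goodness_of_fit` is hypothetical, and the critical value 7.81 is the 5% table value for 3 degrees of freedom quoted in this document.

```python
def chi_squared_goodness_of_fit(observed, ratio):
    """Return (expected counts, chi-squared) for observed counts vs a predicted ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected count for each category: its share of the ratio times the total
    expected = [total * r / ratio_sum for r in ratio]
    # Sum of (O - E)^2 / E over all categories
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2


# Tall/Purple, Tall/White, Dwarf/Purple, Dwarf/White
observed = [245, 75, 63, 27]
ratio = [9, 3, 3, 1]

expected, chi2 = chi_squared_goodness_of_fit(observed, ratio)
df = len(observed) - 1          # number of categories - 1
CRITICAL_5_PERCENT = 7.81       # table value for df = 3 (taken from this document)

print("Expected:", expected)
print("Chi-squared:", round(chi2, 2), "df:", df)
if chi2 > CRITICAL_5_PERCENT:
    print("Reject the null hypothesis")
else:
    print("Accept the null hypothesis")
```

With the pea plant data the calculated value falls below the critical value, so the sketch takes the "accept" branch.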

Chi-squared result from a statistics package

Observed (O)   Ratio   Expected (E)   (O – E)²   (O – E)²/E
245            9       230.625        206.641    0.896
75             3       76.875         3.51563    0.04573
63             3       76.875         192.516    2.50427
27             1       25.625         1.89063    0.07378

χ² = 3.519782

Probability = 0.3182053

Degrees of Freedom = 3

Critical Value = 7.81

The χ² value (3.52) is less than the critical value, χ²crit (7.81), and thus we accept the ‘null hypothesis’; the experimental results do conform to a 9 : 3 : 3 : 1 ratio

The value of p (probability) is > 0.05, so there is no significant difference between the experimental results and a 9 : 3 : 3 : 1 ratio

• The probability value (p) is the probability of obtaining a difference between the observed and expected results at least as large as the one found if chance alone is operating, i.e. if the ‘null hypothesis’ is true

• For a χ² value of 3.52 and 3 degrees of freedom, p lies between 0.30 and 0.50 (the actual value is 0.32): a difference this large would arise by chance alone in between 30% and 50% of such experiments, so there is no significant difference between observed and expected results

• The χ² value (3.52) is less than the critical value, χ²crit (7.81), for a 5% (0.05) level of significance and thus we accept the ‘null hypothesis’
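The probability quoted above can be reproduced without a statistics package. The sketch below uses the standard closed-form expressions for the upper tail of the Chi-squared distribution with a whole number of degrees of freedom; the function name `chi2_p_value` is hypothetical and only Python's standard `math` module is assumed.

```python
import math


def chi2_p_value(chi2, df):
    """Upper-tail probability P(X >= chi2) for a Chi-squared variable with integer df."""
    if chi2 < 0 or df < 1:
        raise ValueError("chi2 must be >= 0 and df must be a positive integer")
    half = chi2 / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


p = chi2_p_value(3.519782, 3)   # genetic cross: about 0.32
print(round(p, 2))
print("significant at 5%" if p < 0.05 else "not significant at 5%")
```

Because p is greater than 0.05, the sketch reports the difference as not significant, in line with the comparison against the critical value.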


A group of students investigated the response of Daphnia to light by introducing approximately 12 organisms into a water filled tube, and counting the numbers present in the weakly illuminated and dark halves after a period of 100 seconds

Null Hypothesis: Lighting conditions have no effect on the distribution of Daphnia

Alternative Hypothesis: Lighting conditions do have an effect on the distribution of Daphnia

As nominal (categorical) data was obtained from the experimental procedure, the Chi-squared test was used to determine whether the null hypothesis should be accepted

The Yates’ correction is applied before squaring the differences between observed and expected results

In this example there is only one degree of freedom and thus the Yates’ correction should be applied to enhance the accuracy of the χ² value

The Yates’ correction

• The Yates’ correction is applied to enhance the accuracy of the χ² value when there is only ONE degree of freedom

• The differences between the observed and expected results (O – E) are calculated in the normal way but any negative difference is converted to a positive value; this is sometimes written |O – E| and is called the absolute value or modulus of (O – E)

• When all positive values of (O – E) have been determined, the value 0.5 is subtracted from each of the positive differences before these quantities are squared

The χ² value calculation therefore becomes:

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$

Daphnia Distribution Results

Number of Daphnia

Group              1    2    3    4    5    6    Totals
Illuminated area   8    7    10   8    6    8
Dark area          4    5    3    4    6    5

• Calculate the totals and complete the table below

• Use the χ² value and critical values table to determine χ²crit at the 5% significance level for one degree of freedom

• Use a statistics package to compare your values

• Interpret the findings

Category           O    E    |O – E|   |O – E| – 0.5   (|O – E| – 0.5)²   (|O – E| – 0.5)²/E
Illuminated area
Dark area
                                                                          Σ =

Category           O    E    |O – E|   |O – E| – 0.5   (|O – E| – 0.5)²   (|O – E| – 0.5)²/E
Illuminated area   47   37   10        9.5             90.25              2.439
Dark area          27   37   10        9.5             90.25              2.439
                                                                          Σ = 4.87

The χ² value is 4.87
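The completed table can be checked with a short Python sketch. It assumes, as the table does, that under the ‘null hypothesis’ the Daphnia are expected to divide equally between the two halves of the tube; the function name `yates_chi_squared` is hypothetical.

```python
def yates_chi_squared(observed, expected):
    """Chi-squared with the Yates' correction: sum of (|O - E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))


# Counts from the six student groups
illuminated = [8, 7, 10, 8, 6, 8]
dark = [4, 5, 3, 4, 6, 5]

observed = [sum(illuminated), sum(dark)]          # totals for each half of the tube
# Assumption from the null hypothesis: equal numbers expected in each half
expected = [sum(observed) / 2] * len(observed)

chi2 = yates_chi_squared(observed, expected)
df = len(observed) - 1

print("Observed:", observed)
print("Expected:", expected)
print("Chi-squared (Yates):", round(chi2, 3), "df:", df)
```

The unrounded result is close to 4.878; the table's 4.87 comes from adding the two cell values after rounding each to 2.439.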

Daphnia Distribution Results

• For a χ² value of 4.87 and 1 degree of freedom, p lies between 0.01 and 0.05 (the actual value is approximately 0.027): if the ‘null hypothesis’ is true, there is between a 1% and 5% probability that chance alone would produce a difference this large between the observed and expected values; there is a significant difference between observed and expected results

• The χ² value (4.87) is greater than the critical value, χ²crit (3.84), for a 5% (0.05) level of significance, and thus we reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ that lighting conditions do have an effect on the distribution of Daphnia, with more organisms congregating in the weakly illuminated areas than in the darker areas

A student investigated the relationship between eye colour and hair colour for a biology project

Null Hypothesis: There is no association between eye colour and hair colour

Alternative Hypothesis: There is an association between eye colour and hair colour

The student performed the Chi-squared test of association to determine if there was a significant relationship between eye colour and hair colour

Eye colour      Hair colour
                fair/red   brown   black
blue            53         17      9
green/grey      21         34      13
brown           7          14      23

Eye colour      fair/red   brown   black   Row totals
blue            53         17      9
green/grey      21         34      13
brown           7          14      23
Column totals
Grand total

• Calculate the row and column totals

• Calculate the grand total for the data


Eye colour      fair/red   brown   black   Row totals
blue            53         17      9       79
green/grey      21         34      13      68
brown           7          14      23      44
Column totals   81         65      45      191 (grand total)

• Calculate the expected frequency, E, for each data cell using the formula:

$$\text{Expected frequency} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$




e.g. the expected value for brown hair & blue eyes = (79 x 65) ÷ 191 = 26.885
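The same formula can be applied to every cell at once. The sketch below is an illustration only; `expected_frequencies` is a hypothetical helper name, and the final check reflects the rule stated earlier in this document that expected frequencies should not fall below 5.

```python
def expected_frequencies(table):
    """Expected count per cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: blue, green/grey, brown eyes; columns: fair/red, brown, black hair
eye_hair = [
    [53, 17, 9],
    [21, 34, 13],
    [7, 14, 23],
]

expected = expected_frequencies(eye_hair)
for row in expected:
    print([round(e, 3) for e in row])

# Chi-squared is only appropriate if no expected frequency falls below 5
all_cells_ok = all(e >= 5 for row in expected for e in row)
print("All expected frequencies >= 5:", all_cells_ok)
```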

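Observed frequency (O) for each cell, with the expected frequency (E) in brackets

Eye colour      fair/red      brown         black
blue            53 (33.503)   17 (26.885)   9 (18.613)
green/grey      21 (28.838)   34 (23.141)   13 (16.021)
brown           7 (18.660)    14 (14.974)   23 (10.366)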

Draw up a table and calculate (O – E)² ÷ E for each data cell

Sum together the values of (O – E)² ÷ E to obtain the χ² test value

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 11.35 + 3.63 + 4.96 + 2.13 + 5.10 + 0.57 + 7.29 + 0.06 + 15.40 = 50.49$$

Observed frequency (O)   Expected frequency (E)   (O – E)² ÷ E
53                       33.503                   11.35
17                       26.885                   3.63
9                        18.613                   4.96
21                       28.838                   2.13
34                       23.141                   5.10
13                       16.021                   0.57
7                        18.660                   7.29
14                       14.974                   0.06
23                       10.366                   15.40

Number of degrees of freedom, df = (Number of rows – 1) x (Number of columns – 1)

df = (3 – 1) x (3 – 1)

df = 2 x 2 = 4
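Putting the cell-by-cell calculation and the degrees of freedom together, the whole test statistic for the eye colour and hair colour table can be sketched as follows. The function name `chi_squared_association` is hypothetical, and no Yates’ correction is applied because this table has more than one degree of freedom.

```python
def chi_squared_association(table):
    """Return (chi-squared, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    # df = (number of rows - 1) x (number of columns - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


eye_hair = [
    [53, 17, 9],    # blue eyes: fair/red, brown, black hair
    [21, 34, 13],   # green/grey eyes
    [7, 14, 23],    # brown eyes
]

chi2, df = chi_squared_association(eye_hair)
print(f"chi-squared = {chi2:.2f}, df = {df}")
```

The result agrees with the hand calculation to within rounding of the individual cell values.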

Use the critical values table to determine the critical value, χ²crit, corresponding to four degrees of freedom at the 5% (0.05) significance level


Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit


For 4 df, the χ² value (50.49) is greater than the critical value, χ²crit (9.49), for a 5% (0.05) level of significance and thus we reject the ‘null hypothesis’
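The decision rule used throughout these examples can be written as a small lookup. This is a sketch only: the name `decide` is hypothetical, and the table holds the 5% critical values for 1 to 4 degrees of freedom (3.84, 7.81 and 9.49 are quoted in this document; 5.99 for 2 degrees of freedom is the standard table value and is an addition here).

```python
# 5% critical values of chi-squared, keyed by degrees of freedom
CRITICAL_VALUES_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def decide(chi2, df):
    """Compare a chi-squared value with the 5% critical value for df."""
    critical = CRITICAL_VALUES_5_PERCENT[df]
    if chi2 > critical:
        return "reject the null hypothesis"
    return "accept the null hypothesis"


print(decide(3.52, 3))    # genetic cross
print(decide(4.87, 1))    # Daphnia distribution
print(decide(50.49, 4))   # eye colour and hair colour
```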


Within the species Cepaea nemoralis (land snail), there is much variation in the colours and banding patterns of the shells

A group of students investigated the distribution of banded and unbanded snails in two different localities - a deciduous woodland and open grassland

Data was entered into a 2 x 2 contingency table

Null Hypothesis: There is no relationship between banding pattern and locality

Alternative Hypothesis: There is an association between banding pattern and locality

The students wanted to determine if there is an association between banding pattern and locality and performed the Chi-squared test of association

Location     Snail Type
             Banded   Unbanded
Woodland     36       105
Grassland    52       18

Snail Distribution Results

Location        Banded   Unbanded   Row totals
Woodland        36       105
Grassland       52       18
Column totals
Grand total

• Calculate the row and column totals

• Calculate the grand total for the snail data

Location        Banded   Unbanded   Row totals
Woodland        36       105        141
Grassland       52       18         70
Column totals   88       123        211 (grand total)


$$\text{Expected frequency} = \frac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$$

e.g. the expected value for banded snails in the woodland = (141 x 88) ÷ 211 = 58.806

The Yates’ correction is applied before squaring the differences between observed and expected results

In this example there is only one degree of freedom and thus the Yates’ correction should be applied to enhance the accuracy of the χ² value

Table showing expected and observed results: observed frequency (O) for each cell, with the expected frequency (E) in brackets

Location     Banded        Unbanded
Woodland     36 (58.806)   105 (82.194)
Grassland    52 (29.194)   18 (40.806)

Calculate the absolute values of (O – E), i.e. convert negative values to positive, and then subtract 0.5 from each of the differences: |O – E| – 0.5

Square the corrected differences: (|O – E| – 0.5)²

Draw up a table and calculate (|O – E| – 0.5)² ÷ E for each data cell

Sum together the values of (|O – E| – 0.5)² ÷ E to obtain the χ² test value


Observed frequency (O)   Expected frequency (E)   (|O – E| – 0.5)² ÷ E
36                       58.806                   8.46
105                      82.194                   6.05
52                       29.194                   17.04
18                       40.806                   12.19

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E} = 8.46 + 6.05 + 17.04 + 12.19 = 43.74$$

Number of degrees of freedom, df = (Number of rows – 1) x (Number of columns – 1)

df = (2 – 1) x (2 – 1)

df = 1 x 1 = 1
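For the 2 x 2 snail table, the expected frequencies, the Yates’ correction and the degrees of freedom can be combined in one short sketch. The function name `yates_chi_squared_2x2` is hypothetical; the correction is applied here because the table has only one degree of freedom.

```python
def yates_chi_squared_2x2(table):
    """Return (Yates-corrected chi-squared, df) for a 2 x 2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Yates' correction: subtract 0.5 from |O - E| before squaring
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


snails = [
    [36, 105],   # woodland: banded, unbanded
    [52, 18],    # grassland: banded, unbanded
]

chi2, df = yates_chi_squared_2x2(snails)
print(f"chi-squared (Yates) = {chi2:.2f}, df = {df}")
```

The unrounded total differs from the hand calculation only in the second decimal place, because the hand calculation rounds each cell value before adding.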

Use the critical values table to determine the critical value, χ²crit, corresponding to one degree of freedom at the 5% (0.05) significance level

Reject the ‘null hypothesis’ and accept the ‘alternative hypothesis’ if the χ² value is greater than the critical value, χ²crit



For 1 df, the χ² value (43.74) is greater than the critical value, χ²crit (3.84), for a 5% (0.05) level of significance and thus we reject the ‘null hypothesis’

