THE CHI-SQUARE TEST JANUARY 28, 2013

Page 1: THE CHI-SQUARE TEST

THE CHI-SQUARE TEST

JANUARY 28, 2013

Page 2: THE CHI-SQUARE TEST

Chi-Square Test

• A fundamental problem in genetics is determining whether the experimentally determined data fits the results expected from theory (i.e. Mendel’s laws as expressed in the Punnett square).

Page 3: THE CHI-SQUARE TEST

• How can you tell if the number of an observed set of offspring is legitimately the result of a given underlying simple ratio?

Page 4: THE CHI-SQUARE TEST

• Let’s say that you have mated two heterozygous tall parents and they have 20 children.

• What is the expected genotypic ratio of the monohybrid cross?

• 1:2:1

• What is the expected ratio of their 20 offspring?

• 5:10:5

Page 5: THE CHI-SQUARE TEST

• We expected 5:10:5 (TT:Tt:tt)

• But suppose we actually observe the following:

1. 4:11:6 or

2. 2:17:1

• How could we determine mathematically which results are likely and which are unlikely?

Page 6: THE CHI-SQUARE TEST

The Critical Question

• How do you tell a really odd but correct result from a WRONG result?

• Most of the time results fit expectations pretty well, but occasionally very skewed distributions of data occur even though you performed the experiment correctly, based on the correct theory.

Page 7: THE CHI-SQUARE TEST

• The simple answer is: you can never tell for certain that a given result is “wrong”, that the result you got was completely impossible based on the theory you used.

• All we can do is determine whether a given result is likely or unlikely.

Page 8: THE CHI-SQUARE TEST

Goodness of Fit

• Mendel had no way of solving this problem. Shortly after the rediscovery of his work in 1900, Karl Pearson and R.A. Fisher developed the “chi-square” test for this purpose.

Page 9: THE CHI-SQUARE TEST

• It’s not always true that what we expect will be what we actually observe. Sometimes there are deviations.

• But the question is: Are the deviations that we observe due purely to chance or is there some other force affecting our observations?

• The Chi-square test helps us to determine this.

Page 10: THE CHI-SQUARE TEST

• The chi-square test is a “goodness of fit” test: it answers the question of how well experimental data fit expectations.

Page 11: THE CHI-SQUARE TEST

Calculating and Using the Chi-Squared Value

Page 12: THE CHI-SQUARE TEST

• For example, suppose you crossed Pp x Pp and you observe 290 purple flowers and 110 white flowers in the offspring.

• This is pretty close to a 3/4 : 1/4 ratio (3:1), but how do you formally define "pretty close"?

• What about 250:150?

Page 13: THE CHI-SQUARE TEST

Null Hypothesis

• We start with a theory for how the offspring will be distributed: the “null hypothesis” (e.g. the deviation between expected and observed results is due only to chance).

• We will discuss the offspring of a self-pollination of a heterozygote. The null hypothesis is that the offspring will appear in a ratio of 3/4 dominant to 1/4 recessive, right?

Page 14: THE CHI-SQUARE TEST

Formula

• First determine how many individuals of each phenotype have been observed and how many would be expected given basic genetic theory.

• Then calculate the chi-square statistic using this formula. You need to memorize the formula!

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

Page 15: THE CHI-SQUARE TEST

• The “Χ” is the Greek letter chi.

• The “∑” is a sigma; it means to sum the following terms for all phenotypes.

• “obs” is the number of individuals of the given phenotype observed.

• “exp” is the number of that phenotype expected from the null hypothesis.

• Note that you must use the number of individuals, the counts, and NOT proportions, ratios, or frequencies.
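
The formula can also be written as a short function. The following is a minimal sketch in Python; the name `chi_square` and its argument names are illustrative choices and do not come from the slides. It is applied here to the 2:17:1 outcome from the tall-parent example, against the expected 5:10:5.

```python
def chi_square(observed, expected):
    """Sum (obs - exp)^2 / exp over all phenotype classes.

    Both arguments are sequences of counts (numbers of individuals),
    not proportions, ratios, or frequencies.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per class")
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))


# Observed 2:17:1 against the expected 5:10:5 (TT:Tt:tt).
value = chi_square([2, 17, 1], [5, 10, 5])
print(round(value, 2))  # 9.9
```

A perfect match between observed and expected counts gives a value of 0; the value grows as the fit gets worse.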

Page 16: THE CHI-SQUARE TEST

Example

• The offspring of an F2 generation from the parents Pp x Pp were counted and there were 290 purple and 110 white flowers. This is a total of 400 offspring.

• Determine whether the deviation from the expected ratio is due to chance (as it should be) or to some other factor.

Page 17: THE CHI-SQUARE TEST

• We expect a 3/4 : 1/4 or 3:1 ratio.

• Calculate the expected numbers by multiplying the total offspring by the expected proportions.

• Thus we expect 400 * 3/4 = 300 purple, and 400 * 1/4 = 100 white.

       P     p
P      PP    Pp
p      Pp    pp
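
Turning an expected ratio into expected counts is a single multiplication per class. A minimal sketch, assuming a hypothetical helper named `expected_counts`; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def expected_counts(total, ratio):
    """Split `total` offspring according to the parts of an expected ratio."""
    parts = sum(ratio)
    return [total * Fraction(part, parts) for part in ratio]


purple, white = expected_counts(400, (3, 1))
print(purple, white)  # 300 100
```

The same helper gives 5, 10 and 5 for 20 offspring in a 1:2:1 ratio.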

Page 18: THE CHI-SQUARE TEST

• So we expected 300 Purple (P_) and 100 White (pp).

• But we observed 290 Purple and 110 White.

• Is this difference just due to chance or has some other factor affected our values?

Page 24: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP
Purple    290   300   -10       100          0.33
White     110   100    10       100          1.00
Σ                                            1.33

Page 25: THE CHI-SQUARE TEST

• Plugging the values directly into the formula:

χ² = (290 - 300)² / 300 + (110 - 100)² / 100

= (-10)² / 300 + (10)² / 100

= 100 / 300 + 100 / 100

= 0.333 + 1.000

= 1.333

• This is our chi-square value. To see what it means and how to use it, we must use a chi-squared table.
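
The same arithmetic can be checked with a few lines of code. This sketch rebuilds the columns of the worksheet for the purple and white classes; the variable names are illustrative.

```python
classes = ["Purple", "White"]
observed = [290, 110]
expected = [300, 100]

chi_sq = 0.0
print(f"{'CLASS':<8}{'OBS':>5}{'EXP':>5}{'OBS-EXP':>9}{'SQUARED':>9}{'/EXP':>8}")
for name, obs, exp in zip(classes, observed, expected):
    diff = obs - exp            # OBS-EXP column
    term = diff ** 2 / exp      # (OBS-EXP)^2 / EXP column
    chi_sq += term              # running sum (the sigma row)
    print(f"{name:<8}{obs:>5}{exp:>5}{diff:>9}{diff ** 2:>9}{term:>8.3f}")

print(f"chi-square = {chi_sq:.3f}")  # chi-square = 1.333
```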

Page 26: THE CHI-SQUARE TEST

• Key point: There are 2 ways of getting a high chi-square value: an unusual result from the correct theory, or a result from the wrong theory.

• These are indistinguishable; because of this fact, statistics is never able to discriminate between true and false with 100% certainty.

Page 27: THE CHI-SQUARE TEST

• Using the example here, how can you tell if your 290: 110 offspring ratio really fits a 3/4 : 1/4 ratio (as expected from selfing a heterozygote) or whether it was the result of a mistake or accident-- a 1/2 : 1/2 ratio from a backcross for example?

• You can’t be certain, but you can at least determine whether your result is reasonable.
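
One way to see the difference is to compute the chi-square value of the same data under both candidate ratios. A minimal sketch; the function and variable names are illustrative.

```python
def chi_square(observed, expected):
    """Sum (obs - exp)^2 / exp over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [290, 110]                        # purple, white
total = sum(observed)                        # 400 offspring

selfing = [total * 3 / 4, total * 1 / 4]     # 3:1 ratio -> 300, 100
backcross = [total / 2, total / 2]           # 1:1 ratio -> 200, 200

chi_selfing = chi_square(observed, selfing)      # about 1.33
chi_backcross = chi_square(observed, backcross)  # 81.0
print(chi_selfing, chi_backcross)
```

The 1/2 : 1/2 hypothesis gives a far larger value, meaning the data fit it far worse than they fit the 3/4 : 1/4 hypothesis.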

Page 28: THE CHI-SQUARE TEST

Reasonable

• What is a “reasonable” result is subjective and arbitrary.

• For most work (and for the purposes of this class), a result is said to not differ significantly from expectations if it could happen at least 1 time in 20. That is, if the difference between the observed results and the expected results is small enough that it would be seen at least 1 time in 20 over thousands of experiments, we “fail to reject” the null hypothesis.

Page 29: THE CHI-SQUARE TEST

• For technical reasons, we use “fail to reject” instead of “accept”.

• “1 time in 20” can be written as a probability value p = 0.05, because 1/20 = 0.05.

• Put another way: if your results deviate from expectation more than 95% of all results obtained under the null hypothesis would, the null hypothesis is rejected, because it may be incorrect.

Page 30: THE CHI-SQUARE TEST

Degrees of Freedom

• A critical factor in using the chi-square test is the “degrees of freedom”, which is essentially the number of independent random variables involved.

• Degrees of freedom is simply the number of classes of offspring minus 1.

• For our example, there are 2 classes of offspring: purple and white. Thus, degrees of freedom (d.f.) = 2 -1 = 1.

Page 31: THE CHI-SQUARE TEST

Critical Chi-Square

• Critical values for chi-square are found on tables, sorted by degrees of freedom and probability levels. Be sure to use p = 0.05.

• If your calculated chi-square value is greater than the critical value from the table, you “reject the null hypothesis”.

• If your chi-square value is less than the critical value, you “fail to reject” the null hypothesis (that is, your data are consistent with the expected ratio from your genetic theory).
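
The decision rule can be sketched as a table lookup followed by a comparison. The critical values below are the p = 0.05 entries of a standard chi-square table for 1 to 3 degrees of freedom; the names `CRITICAL_05` and `decide` are illustrative and not from the slides.

```python
# Critical chi-square values at p = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def decide(chi_sq, n_classes):
    """Compare a calculated chi-square value with the p = 0.05 critical value."""
    df = n_classes - 1                  # degrees of freedom = classes - 1
    critical = CRITICAL_05[df]          # only 1 to 3 d.f. are tabulated here
    if chi_sq > critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(1.333, 2))  # fail to reject the null hypothesis
```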

Page 32: THE CHI-SQUARE TEST

• With this chi-square table, you are determining the probability that a deviation at least as large as the one observed would arise by chance alone.

• If the probability is 0.05 or greater, then the deviation is not considered significant, and the proposed hypothesis is not rejected.

• On the other hand, a probability of less than 0.05 is significant because chance alone is not likely to result in such observed ratios.

Page 33: THE CHI-SQUARE TEST

Chi-Square Table

Page 34: THE CHI-SQUARE TEST

• p – probability of getting a deviation at least this large by chance alone.

• df – degrees of freedom

Page 35: THE CHI-SQUARE TEST

Using the Table

• In our example of 290 purple to 110 white, we calculated a chi-square value of 1.333, with 1 degree of freedom.

• Looking at the table, 1 d.f. is the first row, and p = 0.05 is the sixth column. Here we find the critical chi-square value, 3.841.

Page 36: THE CHI-SQUARE TEST

• Since our calculated chi-square, 1.333, is less than the critical value, 3.841, we “fail to reject” the null hypothesis.

• Thus, an observed ratio of 290 purple to 110 white is consistent with a 3/4 to 1/4 ratio, and the variation can be attributed to chance.

Page 37: THE CHI-SQUARE TEST

• The probability of getting a deviation at least this large by chance alone is

• 30% > p > 20% (p is about 0.25)
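
For 1 degree of freedom the probability can be computed directly instead of being bracketed from the table. This sketch relies on a standard identity that is not stated in the slides: with 1 d.f., the upper-tail probability of a chi-square value x equals erfc(sqrt(x / 2)). The function name is illustrative.

```python
import math


def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2))


p = p_value_df1(1.333)
print(round(p, 2))  # 0.25
```

A value of about 0.25 is well above 0.05, which agrees with failing to reject the null hypothesis.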

Page 38: THE CHI-SQUARE TEST

Another example

• For example, let’s look at the phenotypic ratios of the 144 offspring obtained from a cross between the following parents:

• AaDd x AaDd

• What are the possible gametes of both parents?

Page 39: THE CHI-SQUARE TEST

LET’S SOLVE

        AD      Ad      aD      ad
AD      AADD    AADd    AaDD    AaDd
Ad      AADd    AAdd    AaDd    Aadd
aD      AaDD    AaDd    aaDD    aaDd
ad      AaDd    Aadd    aaDd    aadd

Page 40: THE CHI-SQUARE TEST

• The expected phenotypic ratios would be:

• 9 A_D_

• 3 A_dd

• 3 aaD_

• 1 aadd

• Let’s divide 144 by the sum of the ratio parts (9 + 3 + 3 + 1 = 16): 144/16 = 9 offspring per part.
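
The 9:3:3:1 ratio can be confirmed by enumerating the 16 boxes of the Punnett square. A minimal sketch, assuming complete dominance at both genes; the function name `phenotype` is illustrative.

```python
from collections import Counter
from itertools import product

gametes = ["AD", "Ad", "aD", "ad"]   # gametes produced by an AaDd parent


def phenotype(gamete1, gamete2):
    """The dominant trait shows if either gamete carries the capital allele."""
    a = "A_" if "A" in (gamete1[0], gamete2[0]) else "aa"
    d = "D_" if "D" in (gamete1[1], gamete2[1]) else "dd"
    return a + d


# Count phenotypes over all 4 x 4 gamete combinations.
counts = Counter(phenotype(g1, g2) for g1, g2 in product(gametes, repeat=2))

# Scale the 16 boxes up to 144 offspring (144 / 16 = 9 per box).
expected = {ph: 144 * n // 16 for ph, n in counts.items()}
print(counts)
print(expected)
```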

Page 41: THE CHI-SQUARE TEST

• According to Mendelian inheritance, the expected results should be

• A_D_ = 81

• A_dd = 27

• aaD_ = 27

• aadd = 9

• However, we might observe results such as:

• A_D_ = 72

• A_dd = 36

• aaD_ = 20

• aadd = 16

Page 42: THE CHI-SQUARE TEST

• Let’s perform the chi-squared test to determine if this difference between observed and expected results was due to chance or some actual factor.

Page 49: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP
A_D_      72    81    -9        81           1.00
A_dd      36    27     9        81           3.00
aaD_      20    27    -7        49           1.81
aadd      16     9     7        49           5.44
Σ                                            11.25
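
The four terms can be recomputed as a check. In this sketch the dictionary keys use the phenotype labels A_D_, A_dd, aaD_ and aadd; the variable names are illustrative.

```python
observed = {"A_D_": 72, "A_dd": 36, "aaD_": 20, "aadd": 16}
expected = {"A_D_": 81, "A_dd": 27, "aaD_": 27, "aadd": 9}

# One (obs - exp)^2 / exp term per phenotype class.
terms = {ph: (observed[ph] - expected[ph]) ** 2 / expected[ph] for ph in observed}
chi_sq = sum(terms.values())

for ph, term in terms.items():
    print(f"{ph}: {term:.2f}")
print(f"chi-square = {chi_sq:.2f}")  # chi-square = 11.26
```

Summing the unrounded terms gives about 11.26; adding the terms after rounding each to two decimals gives 11.25. The conclusion is the same either way.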

Page 50: THE CHI-SQUARE TEST

• The number of classes is 4, so the degrees of freedom is 4-1 = 3.

• Let’s use the chi-squared table to determine whether the difference between the observed and expected values was due to chance (remember to use p = 0.05).

Page 51: THE CHI-SQUARE TEST

Chi-Square Table

Page 52: THE CHI-SQUARE TEST

• The calculated chi-squared value, 11.25, is greater than the critical value for 3 degrees of freedom at p = 0.05, which is 7.815, so the deviation is unlikely to be due to chance alone; some other factor is probably involved.

• The probability of getting a deviation this large by chance alone is only 5% > p > 1% (poor odds), so we REJECT THE NULL HYPOTHESIS.
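
The bracket 5% > p > 1% can be checked numerically. This sketch uses a closed form that holds specifically for 3 degrees of freedom, a standard identity that is not stated in the slides: the upper-tail probability of x is erfc(sqrt(x / 2)) + sqrt(2x / pi) * exp(-x / 2). The function name is illustrative.

```python
import math


def p_value_df3(chi_sq):
    """Upper-tail probability of a chi-square value with 3 degrees of freedom."""
    half = chi_sq / 2
    tail = math.erfc(math.sqrt(half))
    correction = math.sqrt(2 * chi_sq / math.pi) * math.exp(-half)
    return tail + correction


p = p_value_df3(11.25)
print(0.01 < p < 0.05)  # True
```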

Page 53: THE CHI-SQUARE TEST

SUMMARY

Page 54: THE CHI-SQUARE TEST

• The chi-squared test is the means by which the statistical validity of results can be tested.

• It measures the extent of any deviation between expected and observed results.

• This measure of deviation is called the chi-squared value.

Page 55: THE CHI-SQUARE TEST

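• A large chi-squared value corresponds to a low probability in the chi-squared table.

• p = the probability of getting a chi-squared value equal to or greater than the tabulated one by chance alone; df = degrees of freedom.

• The chi-squared table is used to compare the calculated value with the critical value.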

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

Page 56: THE CHI-SQUARE TEST

1. Read the question carefully.

2. Write the observed frequencies in column OBS.

3. Determine expected frequencies and write them in EXP.

4. Complete the other columns of the table and find the chi-squared value.

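5. Find the degrees of freedom (number of classes - 1).

6. Find the critical value in the chi-squared table at p = 0.05.

7. If your chi-square value is equal to or greater than the table value, reject the null hypothesis: differences in your data are not due to chance alone. Otherwise, fail to reject the null hypothesis.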
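
Steps 2 to 7 can be strung together in one function. This is a sketch under stated assumptions, not a definitive implementation: the name `goodness_of_fit` is hypothetical, the expected ratio is passed in as whole-number parts, and only the p = 0.05 critical values for 1 to 3 degrees of freedom (from a standard table) are included.

```python
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}   # p = 0.05 critical values


def goodness_of_fit(observed, ratio):
    """Return (chi-square, df, critical value, reject?) for counts vs. a ratio."""
    total = sum(observed)                                   # step 2: OBS column
    expected = [total * part / sum(ratio) for part in ratio]  # step 3: EXP column
    chi_sq = sum((o - e) ** 2 / e
                 for o, e in zip(observed, expected))       # step 4
    df = len(observed) - 1                                  # step 5
    critical = CRITICAL_05[df]                              # step 6
    reject = chi_sq >= critical                             # step 7
    return chi_sq, df, critical, reject


# The two worked examples from the slides.
print(goodness_of_fit([290, 110], (3, 1)))
print(goodness_of_fit([72, 36, 20, 16], (9, 3, 3, 1)))
```

The first call does not reject the 3:1 hypothesis; the second rejects the 9:3:3:1 hypothesis.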

Page 57: THE CHI-SQUARE TEST

QUESTION 1

• A cross between two pea plants heterozygous for seed colour and morphology was done. Both parents had yellow, round seed phenotypes. Let Y = yellow, y = green, R = round and r = wrinkled. The offspring produced were 315 YR, 101 yR, 108 Yr and 32 yr.

1. Using a Punnett square, determine the expected phenotypic ratios of the offspring of the parents. (7 marks)

2. Use the chi-squared test to determine if the results were due to chance alone (that is, whether you fail to reject the null hypothesis). Include a probability statement. (9 marks)

Page 58: THE CHI-SQUARE TEST
Page 59: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP

Page 60: THE CHI-SQUARE TEST

QUESTION 2

• Domestic fowl with walnut combs were crossed with each other. The expected phenotypic ratios of the offspring were: 9 walnut, 3 rose, 3 pea and 1 single. The 160 offspring comprised 93 walnut, 24 rose, 36 pea and 7 single combs. Apply the chi-squared test to determine if these results are non-significant (due only to chance) or significant. Include a probability statement. (10 marks)

Page 61: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP

Page 62: THE CHI-SQUARE TEST

QUESTION 3

• Work in pairs. Toss a single coin 100 times. Tally the number of times that the coin lands on heads and on tails. It is expected that the coin should land on heads 50 times and on tails 50 times. Use the chi-squared test to determine if your results are significant or non-significant. Use the table and conclude with a probability statement.

Page 63: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP

Page 64: THE CHI-SQUARE TEST

QUESTION 4

• Work in pairs. Each person will have the same type of coin. Flip both coins together 100 times. Tabulate the number of times the coins land on HH, TT or HT. The coins are expected to land in the ratio 1 HH : 2 HT : 1 TT.

• Use the chi-squared test to determine if your results are significant or non-significant. Use the table and conclude with a probability statement.

Page 65: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP

Page 66: THE CHI-SQUARE TEST

QUESTION 5

• Work in pairs. Each person will have a different type of coin. Flip both coins together 100 times. Tabulate the number of times the coins land on HH, TT or HT. The coins are expected to land in the ratio 1 HH : 2 HT : 1 TT.

• Use the chi-squared test to determine if your results are significant or non-significant. Use the table and conclude with a probability statement.

Page 67: THE CHI-SQUARE TEST

CLASSES   OBS   EXP   OBS-EXP   (OBS-EXP)²   (OBS-EXP)²/EXP