06. Hypothesis testing - | Department of Zoology at UBCmfscott/lectures/06... · Hypothesis testing usually assumes that sampling is random. Null hypothesis (H o): a specific statement

Assignment #2

Chapter 3: 17, 21
Chapter 4: 18, 20
Due this Friday, Oct. 2nd, by 2pm in your TA’s homework box

Assignment #1

Chapter 1: 14, 17, 19, 20
Chapter 2: 25, 32
Answers are now posted on the webpage.
Late assignments (even 1 min.) will not be accepted.

Assignment #3

Chapter 5: 28, 36, 37
Chapter 6: 16, 18, 19
Due next Friday, Oct. 9th, by 2pm in your TA’s homework box

Reading

For today: Chapter 6
For Thursday: Chapter 7

Chapter 5 Review

The probability of an event is its true relative frequency: the proportion of times the event would occur if we repeated the same process over and over again.

A probability distribution describes the true relative frequency (a.k.a. the probability) of all possible values of a random variable.

[Figure: probability distribution of the sum of two dice. x-axis: sum of two dice; y-axis: probability]
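The two-dice distribution can be worked out by listing every outcome. The following is a minimal sketch, assuming two fair six-sided dice (the slide does not say so explicitly); the name `dice_sum_probs` is ours, chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice
# (an assumption) and count how often each sum occurs.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each sum = (number of outcomes giving that sum) / 36.
dice_sum_probs = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, prob in dice_sum_probs.items():
    print(f"sum = {total:2d}  Pr = {float(prob):.3f}")
```

The probabilities add up to 1, as they must for a probability distribution, and a sum of 7 is the most likely value.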

Chapter 5 Review

The probability of A OR B involves addition.

If the two are mutually exclusive: Pr(A or B) = Pr(A) + Pr(B)

If the two are not mutually exclusive: Pr(A or B) = Pr(A) + Pr(B) – Pr(A and B)

The probability of A AND B involves multiplication.

If the two are independent: Pr(A and B) = Pr(A) Pr(B)

If the two are dependent: Pr(A and B) = Pr(A) Pr(B|A)

Chapter 5 Review

Two consecutive draws from the same deck of cards:

Independent (first card is put back), mutually exclusive:

•  Not possible

Independent (first card is put back), not mutually exclusive:

•  First draw is king OR second draw is king: 4/52 + 4/52 – (4/52 × 4/52) = 0.15

•  First draw is king AND second draw is king: 4/52 × 4/52 = 0.006

Dependent (first card is not put back), mutually exclusive:

•  First draw is ace of hearts OR second draw is ace of hearts: (1/52) + (51/52 × 1/51) = 0.04

•  First draw is ace of hearts AND second draw is ace of hearts: 1/52 × 0 = 0

Dependent (first card is not put back), not mutually exclusive:

•  First draw is king OR second draw is king: 4/52 + ((48/52 × 4/51) + (4/52 × 3/51)) – (4/52 × 3/51) = 0.15

•  First draw is king AND second draw is king: 4/52 × 3/51 = 0.005
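The card calculations can be checked with exact fractions. This is only a sketch, assuming a standard 52-card deck with 4 kings and 1 ace of hearts; all variable names are illustrative.

```python
from fractions import Fraction as F

# Independent draws: the first card is put back before the second draw.
king = F(4, 52)
indep_and = king * king                 # Pr(A and B) = Pr(A) Pr(B)
indep_or = king + king - indep_and      # Pr(A) + Pr(B) - Pr(A and B)

# Dependent draws: the first card is NOT put back.
# Ace of hearts on the first OR second draw (mutually exclusive events).
ace_or = F(1, 52) + F(51, 52) * F(1, 51)
ace_and = F(1, 52) * 0                  # the same card cannot be drawn twice

# King on the first OR second draw (not mutually exclusive).
dep_and = F(4, 52) * F(3, 51)           # Pr(A) Pr(B|A)
second_king = F(48, 52) * F(4, 51) + F(4, 52) * F(3, 51)
dep_or = F(4, 52) + second_king - dep_and

for name, value in [("indep_or", indep_or), ("indep_and", indep_and),
                    ("ace_or", ace_or), ("ace_and", ace_and),
                    ("dep_or", dep_or), ("dep_and", dep_and)]:
    print(f"{name:10s} = {float(value):.4f}")
```

Working in fractions avoids rounding until the final print, so the results can be compared directly with the rounded values on the slide.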

Chapter 5 Review

•  The conditional probability of an event is the probability of that event occurring given that a condition is met.

•  Law of total probability: $\Pr[A] = \sum_{\text{all values of } B} \Pr[A \mid B]\,\Pr[B]$

•  Bayes' theorem: $\Pr[A \mid B] = \dfrac{\Pr[B \mid A]\,\Pr[A]}{\Pr[B]}$
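As a worked illustration of both rules, the two-king draw without replacement from the card review can be reused. This is a sketch under that assumption (standard 52-card deck, first card not put back); the event labels and variable names are ours.

```python
from fractions import Fraction as F

# Events (two draws, first card not put back):
#   "first"  = first draw is a king
#   "second" = second draw is a king
pr_first = F(4, 52)
pr_not_first = 1 - pr_first
pr_second_given_first = F(3, 51)       # one king already gone
pr_second_given_not_first = F(4, 51)   # all four kings remain

# Law of total probability: sum Pr[second | first status] * Pr[first status]
# over both possible outcomes of the first draw.
pr_second = (pr_second_given_first * pr_first
             + pr_second_given_not_first * pr_not_first)

# Bayes' theorem: Pr[first | second] = Pr[second | first] Pr[first] / Pr[second]
pr_first_given_second = pr_second_given_first * pr_first / pr_second

print(pr_second, pr_first_given_second)
```

The unconditional probability that the second card is a king comes out equal to 4/52, the same as for the first card, and knowing that the second card is a king lowers the probability that the first was a king to 3/51.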

Hypothesis testing

Hypothesis testing compares data to what we would expect to see if a specific null hypothesis were true. If the data are too unusual, compared to what we would expect to see if the null hypothesis were true, then the null hypothesis is rejected.

Hypothesis testing in a nutshell

Population: We want to know something about this population, say, are men and women the same height, on average?

Sample: We can't measure everyone; it would take too long and cost too much. So we take a sample and measure those. For these we estimate the difference between men's and women's mean height.

But we have a problem: the sample doesn't have the same properties as the population, because of chance errors.

So we need to know how good the sample is, and how likely it is that it is much different from the population.

So we imagine making an infinite number of samples from a distribution where men and women have the same height.

We make an estimate from each of these samples, and from these we can calculate the sampling distribution of the estimate.

[Figure: sampling distribution of the estimate. x-axis: difference in mean height; y-axis: frequency]

If the actual sample value is very different from what we would expect such samples to look like, then we can say that the men in this population are on average taller than the women.

[Figure: x-axis: difference in mean height; y-axis: frequency]

Hypotheses are about populations, but are tested with data from samples.

Hypothesis testing usually assumes that sampling is random.

Null hypothesis (H0): a specific statement about a population parameter made for the purposes of argument.

Alternate hypothesis (HA): represents all other possible parameter values except that stated in the null hypothesis.

The null hypothesis is usually the simplest statement, whereas the alternative hypothesis is usually the statement of greatest interest.

A good null hypothesis would be interesting if proven wrong.

A null hypothesis is specific; an alternate hypothesis is not.

A test statistic is a number calculated from the data that is used to evaluate how compatible the data are with the result expected under the null hypothesis.

Null distribution: the sampling distribution of outcomes for a test statistic under the assumption that the null hypothesis is true

P-value

A P-value is the probability of getting the data, or something as or more unusual, if the null hypothesis were true.

Hypothesis testing: an example

Does a red shirt help win wrestling?

The experiment and the results

•  Animals use red as a sign of aggression

•  Does red influence the outcome of wrestling, taekwondo, and boxing?

–  16 of 20 rounds had more red-shirted than blue-shirted winners in these sports in the 2004 Olympics

–  Shirt color was randomly assigned

Hill, RA, and RA Barton. 2005. Red enhances human performance in contests. Nature 435:293.

Stating the hypotheses

H0: Red- and blue-shirted athletes are equally likely to win (proportion = 0.5).

HA: Red- and blue-shirted athletes are not equally likely to win (proportion ≠ 0.5).

Estimating the value

•  16 of 20 is a proportion, proportion = 0.8

•  This is a discrepancy of 0.3 from the proportion proposed by the null hypothesis, proportion = 0.5

Is this discrepancy due to chance alone?

Estimating the probability of such an extreme result

We use the null distribution to estimate the probability of getting the data, or something as or more unusual, if the null hypothesis were true (i.e. to get the P-value for our test statistic).

The null distribution of the sample proportion

Calculating the P-value from the null distribution

The P-value is calculated as

P = 2 × [Pr(16) + Pr(17) + Pr(18) + Pr(19) + Pr(20)] = 0.012.
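The calculation can be reproduced in a few lines. This is a sketch that assumes the null distribution is binomial with n = 20 rounds and a success probability of 0.5, which matches the stated null hypothesis but is not spelled out on the slide; `binom_prob` and the other names are ours.

```python
from math import comb

n = 20          # rounds
observed = 16   # rounds with more red-shirted than blue-shirted winners
p_null = 0.5    # proportion proposed by the null hypothesis

def binom_prob(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Null distribution: probability of every possible number of red wins.
null_dist = [binom_prob(k, n, p_null) for k in range(n + 1)]

# One tail: results as or more extreme than the one observed (16..20).
upper_tail = sum(null_dist[observed:])

# Two-tailed P-value: double the tail, as in the formula above.
p_value = 2 * upper_tail
print(round(p_value, 3))  # → 0.012
```

Doubling the tail accounts for equally extreme results in the other direction (4 or fewer red wins), which would also count against the null hypothesis.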

Statistical significance

The significance level, α, is a probability used as a criterion for rejecting the null hypothesis. If the P-value for a test is less than or equal to α, then the null hypothesis is rejected.

α is often 0.05

Significance for the red shirt example

•  P = 0.012

•  P < α, so we can reject the null hypothesis

•  Athletes in red shirts were more likely to win.

Larger samples give more information

•  A larger sample will tend to give an estimate with a narrower confidence interval

•  A larger sample will give more power to reject a false null hypothesis
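The effect of sample size on power can be illustrated with an exact calculation. This is a sketch only: the true proportion of 0.7 and the sample sizes of 20 and 100 are hypothetical values chosen for illustration, not figures from the lecture, and the two-sided P-value is taken as twice the smaller tail of a binomial null distribution.

```python
from math import comb

def binom_prob(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p(k, n, p_null=0.5):
    """Two-sided P-value: twice the smaller tail, capped at 1."""
    lower = sum(binom_prob(i, n, p_null) for i in range(0, k + 1))
    upper = sum(binom_prob(i, n, p_null) for i in range(k, n + 1))
    return min(1.0, 2 * min(lower, upper))

def power(n, p_true, alpha=0.05, p_null=0.5):
    """Probability of rejecting H0 when the true proportion is p_true."""
    rejected = [k for k in range(n + 1) if two_sided_p(k, n, p_null) <= alpha]
    return sum(binom_prob(k, n, p_true) for k in rejected)

# Hypothetical scenario: the true proportion is 0.7, not 0.5.
power_small = power(20, 0.7)
power_large = power(100, 0.7)
print(f"n = 20:  power = {power_small:.2f}")
print(f"n = 100: power = {power_large:.2f}")
```

With the same true proportion, the larger sample rejects the false null hypothesis far more often than the smaller one.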

Hypothesis testing: another example

Do dogs resemble their owners?

Common wisdom holds that dogs resemble their owners. Is this true?

•  41 dog owners approached in parks; photos taken of dog and owner separately

•  Photos of owner and dog, along with a photo of another dog, shown to students to match

Roy, M.M., & Christenfeld, N.J.S. (2004). Do dogs resemble their owners? Psychological Science, 15, 361–363

Hypotheses

H0: The proportion of correct matches is 0.5.

HA: The proportion of correct matches is different from 0.5.

Data

Of 41 matches, 23 were correct and 18 were incorrect.

Estimating the proportion

sample proportion = 23/41 = 0.56

Null distribution for dog/owner resemblance

The P-value: P = 0.53.

We do not reject the null hypothesis that dogs do not resemble their owners.
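A null distribution can also be approximated by simulation, which mirrors the idea of imagining many samples drawn under the null hypothesis. The sketch below assumes each of the 41 matches is an independent coin flip with probability 0.5 under H0; the number of simulated samples and the seed are arbitrary choices of ours.

```python
import random

random.seed(1)      # fixed seed so the sketch is repeatable (arbitrary choice)
n = 41              # matches attempted
observed = 23       # correct matches in the real data
trials = 20_000     # number of simulated samples (arbitrary choice)

def simulate_correct(n):
    """Number of correct matches in one sample if H0 (proportion 0.5) is true."""
    return sum(random.random() < 0.5 for _ in range(n))

expected = n * 0.5
observed_dev = abs(observed - expected)

# Count simulated samples at least as far from the expected value as the data,
# in either direction (a two-tailed test).
as_extreme = 0
for _ in range(trials):
    if abs(simulate_correct(n) - expected) >= observed_dev:
        as_extreme += 1

p_sim = as_extreme / trials
print(f"simulated P-value: {p_sim:.2f}")
```

The simulated value should land close to the P = 0.53 reported above; it varies slightly with the seed and the number of simulated samples.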

Jargon

Significance level

•  The acceptable probability of rejecting a true null hypothesis

•  Called α

•  For many purposes, α = 0.05 is acceptable

“Statistically significant”

•  P ≤ α

•  We can “reject the null hypothesis”

We never “accept the null hypothesis”

Type I error

•  Rejecting a true null hypothesis

•  Probability of Type I error is α (the significance level)

Type II error

•  Not rejecting a false null hypothesis

•  The probability of a Type II error is β.

•  The smaller β, the more power a test has.

Power

•  The ability of a test to reject a false null hypothesis

•  Power = 1 – β

One- and two-tailed tests

•  Most tests are two-tailed tests.

•  This means that a deviation in either direction would reject the null hypothesis.

•  Normally α is divided into α/2 on one side and α/2 on the other.

[Figure: null distribution of the test statistic, with 2.5% of the area in each tail. x-axis: test statistic]

One-tailed tests

•  Only used when the other tail is nonsensical

•  For example, comparing grades on a multiple choice test to those expected by random guessing

Test Statistic

•  A number calculated to represent the match between a set of data and the null hypothesis

•  Can be compared to a general distribution to infer probability

Critical value

•  The value of a test statistic beyond which the null hypothesis can be rejected
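For a concrete case, critical values can be computed for the red shirt example. This is a sketch that assumes the test statistic is the number of red wins out of 20, a binomial null distribution with proportion 0.5, and α = 0.05 split into α/2 per tail; the function names are ours.

```python
from math import comb

def binom_prob(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def critical_values(n, alpha=0.05, p_null=0.5):
    """Return (lower, upper): reject H0 if count <= lower or count >= upper."""
    half = alpha / 2
    # Smallest count whose upper-tail probability is at most alpha/2.
    upper = next(k for k in range(n + 1)
                 if sum(binom_prob(i, n, p_null) for i in range(k, n + 1)) <= half)
    # Largest count whose lower-tail probability is at most alpha/2.
    lower = next(k for k in range(n, -1, -1)
                 if sum(binom_prob(i, n, p_null) for i in range(0, k + 1)) <= half)
    return lower, upper

lower, upper = critical_values(20)
print(lower, upper)  # → 5 15
```

The observed 16 red wins lies beyond the upper critical value of 15, which agrees with rejecting the null hypothesis in that example.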

Correlation does not automatically imply causation

Life expectancy by country:

Confounding variable

An unmeasured variable that may cause both X and Y

Observations vs. Experiments

Statistical significance ≠ Biological importance

Significant and important:

•  Polio vaccine reduces incidence of polio

Significant but unimportant:

•  Things you don’t care about, or already well-known things

Insignificant but important:

•  Small study shows a possible effect, leading to a larger study which finds significance

•  Large study showing no effect of a drug that was thought to be beneficial

Insignificant and unimportant:

•  Studies with small sample size and high P-value

•  Things you don’t care about