Hypothesis and Test

HYPOTHESIS TEST ERRORS AND BASIC TEST Avjinder Singh Kaler and Kristi Mai

Hypothesis Test Errors

Testing a Claim about a Population Proportion

Testing the Difference between Two Population Proportions

A Type I error is the mistake of rejecting the null hypothesis when it is actually true.

The symbol 𝛼 is used to represent the probability of a type I error.

The significance level, 𝛼, is the probability of rejecting 𝐻0 when it is actually true, thus:

𝛼 = 𝑃(𝑇𝑦𝑝𝑒 𝐼 𝐸𝑟𝑟𝑜𝑟) = 𝑃(𝑅𝑒𝑗𝑒𝑐𝑡 𝐻0 | 𝐻0 𝑇𝑟𝑢𝑒)

A Type II error is the mistake of failing to reject the null hypothesis when it is actually false.

The symbol β (beta) is used to represent the probability of a type II error.

𝛽 = 𝑃(𝑇𝑦𝑝𝑒 𝐼𝐼 𝐸𝑟𝑟𝑜𝑟) = 𝑃(𝐹𝑎𝑖𝑙 𝑡𝑜 𝑅𝑒𝑗𝑒𝑐𝑡 𝐻0 | 𝐻0 𝐹𝑎𝑙𝑠𝑒)

Assume that we are conducting a hypothesis test of the claim that a method of gender selection increases the likelihood of a baby girl, so that the probability of a baby girl is p > 0.5.

Here are the null and alternative hypotheses:

H0: p = 0.5

H1: p > 0.5

a) Identify a type I error.

b) Identify a type II error.

a) A type I error is the mistake of rejecting a true null hypothesis: We conclude the probability of having a girl is greater than 50%, when in reality, it is not. Our data misled us.

b) A type II error is the mistake of failing to reject the null hypothesis when it is false: There is no evidence to conclude the probability of having a girl is greater than 50% (our data misled us), but in reality, the probability is greater than 50%.
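To make the two error types concrete, here is a minimal sketch that computes both probabilities exactly from the binomial distribution for the gender-selection test. The sample size n = 100, the significance level 0.05, and the "true" proportion 0.6 used for the type II error are illustrative assumptions, not values from the example above.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical setup (not from the slides): n = 100 births, alpha = 0.05.
n, p0, alpha = 100, 0.5, 0.05

# Right-tailed test: reject H0 when z = (p_hat - p0) / sqrt(p0*q0/n) exceeds z_alpha.
z_alpha = NormalDist().inv_cdf(1 - alpha)
cutoff = n * p0 + z_alpha * sqrt(n * p0 * (1 - p0))  # reject when x > cutoff
reject_counts = [x for x in range(n + 1) if x > cutoff]

# Type I error: H0 is true (p = 0.5) but the count lands in the rejection region.
type_1 = sum(binom_pmf(x, n, p0) for x in reject_counts)

# Type II error: H0 is false (assumed true p = 0.6) but we fail to reject.
p_true = 0.6
type_2 = 1 - sum(binom_pmf(x, n, p_true) for x in reject_counts)

print(f"Reject H0 when the number of girls is at least {reject_counts[0]}")
print(f"P(Type I error)  = {type_1:.4f}")
print(f"P(Type II error) = {type_2:.4f}")
```

Because the count of girls is discrete, the actual type I error probability comes out slightly below the nominal 0.05.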

For any fixed 𝛼, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in 𝛼 will cause an increase in β. Conversely, an increase in 𝛼 will cause a decrease in β.

To decrease both 𝛼 and β, increase the sample size.
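The three statements above can be checked numerically. The sketch below approximates β for a right-tailed test of H0: p = p0 using the normal approximation. The function name `beta_right_tailed` and the values p0 = 0.5 and true p = 0.6 are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def beta_right_tailed(n, alpha, p0, p_true):
    """Approximate P(Type II error) for H0: p = p0 vs H1: p > p0."""
    std_normal = NormalDist()
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Chance that p_hat stays below the cutoff when the true proportion is p_true.
    return std_normal.cdf((cutoff - p_true) / sqrt(p_true * (1 - p_true) / n))

# Illustrative values only: p0 = 0.5 and an assumed true proportion of 0.6.
for n, alpha in [(100, 0.05), (200, 0.05), (100, 0.01), (400, 0.01)]:
    beta = beta_right_tailed(n, alpha, 0.5, 0.6)
    print(f"n = {n:3d}, alpha = {alpha:.2f}, beta = {beta:.4f}")
```

Comparing the first row with each of the others shows the pattern: a larger n lowers β, a smaller 𝛼 raises β, and a large enough n lowers both at once.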

n = sample size or number of trials

p = population proportion

q = 1 – p

p̂ = x / n = sample proportion

1) The sample observations are a simple random sample.

2) The conditions for a binomial distribution are satisfied.

3) The conditions np ≥ 5 and nq ≥ 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with μ = np and σ = √(npq).

Note: p is the assumed proportion, not the sample proportion.

P-values: Use the standard normal distribution.

Critical values: Use the standard normal distribution.

Test statistic:

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$

When testing claims about a population proportion, the critical value method and the p-value method are equivalent and will yield the same result since they use the same standard deviation based on the claimed proportion p.

However, the confidence interval uses an estimated standard deviation based upon the sample proportion p̂.

Consequently, it is possible that the critical value and p-value methods may yield a different conclusion than the confidence interval method.

A good strategy is to use a confidence interval to estimate a population proportion, but use the p-value or critical value method for testing a claim about the proportion.
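As an illustration of how the two approaches can disagree, the sketch below uses made-up data: 381 successes in 400 trials, tested against a claimed proportion of 0.93 at 𝛼 = 0.05. The count 381 is a hypothetical value chosen to expose the difference.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: x = 381 is a made-up count chosen for illustration.
n, x, p0, alpha = 400, 381, 0.93, 0.05
p_hat = x / n
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothesis test: the standard deviation uses the claimed proportion p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
reject_by_test = abs(z) > z_crit

# Confidence interval: the standard deviation uses the sample proportion p_hat.
E = z_crit * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - E, p_hat + E
reject_by_interval = not (lower < p0 < upper)

print(f"z = {z:.2f}, reject by test: {reject_by_test}")
print(f"CI = ({lower:.3f}, {upper:.3f}), reject by interval: {reject_by_interval}")
```

With these numbers the test fails to reject p = 0.93, while the 95% confidence interval excludes 0.93, because p̂(1 − p̂) is smaller than pq here.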

Based on information from the National Cyber Security Alliance, 93% of computer owners believe they have antivirus programs installed on their computers.

In a random sample of 400 scanned computers, it is found that 380 of them (or 95%) actually have antivirus software programs.

Use the sample data from the scanned computers to test the claim that 93% of computers have antivirus software.

Requirement check:

1) The 400 computers are randomly selected.

2) There is a fixed number of independent trials with two categories (computer has an antivirus program or does not).

3) The requirements np ≥ 5 and nq ≥ 5 are both satisfied with n = 400, p = 0.93, and q = 0.07:

np = (400)(0.93) = 372

nq = (400)(0.07) = 28

P-Value Method:

1) The original claim that 93% of computers have antivirus software can be expressed as p = 0.93.

2) The opposite of the original claim is p ≠ 0.93.

3) The hypotheses are written as:

H0: p = 0.93

H1: p ≠ 0.93

4) For the significance level, we select 𝛼 = 0.05.

5) Because we are testing a claim about a population proportion, the sample statistic relevant to this test is p̂, approximated by a normal distribution.

6) The test statistic is calculated as:

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{\dfrac{380}{400} - 0.93}{\sqrt{\dfrac{(0.93)(0.07)}{400}}} = 1.57$$

6) Because the hypothesis test is two-tailed with a test statistic of z = 1.57, the P-value is twice the area to the right of z = 1.57.

The P-value is twice 0.0582, or 0.1164.
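The test statistic and P-value above can be reproduced with Python's standard library. This is a minimal sketch and the variable names are illustrative. The slides round z to 1.57 before looking up the tail area, so the unrounded P-value differs slightly (about 0.117 rather than 0.1164).

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 400, 380, 0.93          # sample size, successes, claimed proportion
p_hat = x / n                      # sample proportion, 0.95

# Test statistic: standard deviation is based on the claimed proportion p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed P-value: twice the area beyond |z| under the standard normal curve.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Same calculation with z rounded to two decimals, as a z table would give.
p_value_rounded_z = 2 * (1 - NormalDist().cdf(round(abs(z), 2)))

print(f"z = {z:.2f}")
print(f"P-value (unrounded z) = {p_value:.4f}")
print(f"P-value (z = 1.57)    = {p_value_rounded_z:.4f}")
```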

7) Because the P-value of 0.1164 is greater than the significance level of 𝛼 = 0.05, we fail to reject the null hypothesis.

8) We fail to reject the claim that 93% of computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.

Critical Value Method: Steps 1 – 5 are the same as for the P-value method.

6) The test statistic is computed to be z = 1.57. We now find the critical values, with the critical region having an area of α = 0.05, split equally in both tails: z = ±1.96.

7) Because the test statistic does not fall in the critical region, we fail to reject the null hypothesis.

8) We fail to reject the claim that 93% of computers have antivirus software. We conclude that there is not sufficient sample evidence to warrant rejection of the claim that 93% of computers have antivirus programs.
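The critical value decision can be sketched the same way. The code below looks up the two-tailed critical value for 𝛼 = 0.05 and checks whether the test statistic z = 1.57 falls in the critical region. Variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z = 1.57                                   # test statistic from step 6

# Two-tailed test: area alpha/2 in each tail of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# The critical region is everything beyond -z_crit or +z_crit.
in_critical_region = z < -z_crit or z > z_crit

print(f"Critical values: ±{z_crit:.2f}")
print("Reject H0" if in_critical_region else "Fail to reject H0")
```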

Confidence Interval Method:

The claim of p = 0.93 can be tested at the α = 0.05 level of significance with a 95% confidence interval.

So, we get: 0.929 < p < 0.971

This interval contains p = 0.93, so we do not have sufficient evidence to warrant the rejection of the claim that 93% of computers have antivirus programs.
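The interval above can be reproduced as follows. This sketch uses the standard one-proportion interval p̂ ± E, where the margin of error E is based on the sample proportion rather than the claimed one. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x, confidence = 400, 380, 0.95
p_hat = x / n

# Critical value for the chosen confidence level (about 1.96 for 95%).
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error uses the estimated standard deviation based on p_hat.
E = z_crit * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - E, p_hat + E

print(f"{lower:.3f} < p < {upper:.3f}")
print("Contains 0.93:", lower < 0.93 < upper)
```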

For population 1, we let:

p1 = population proportion

n1 = size of the sample

x1 = number of successes in the sample

p̂1 = x1 / n1 (the sample proportion)

q̂1 = 1 − p̂1

The corresponding notations apply to p2, n2, x2, p̂2, and q̂2, which come from population 2.

The pooled sample proportion is given by:

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$

1) The two samples must each be a simple random sample (SRS).

2) The two samples are independent.

3) For each of the two samples, the number of successes is at least 5 and the number of failures is at least 5.

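Test statistic for H0: p1 = p2:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$$

where p1 − p2 = 0 is assumed in the null hypothesis.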

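Confidence interval estimate of p1 − p2:

$$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$$

with margin of error

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$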

Do people have different spending habits depending on the type of money they have?

89 undergraduates were randomly assigned to two groups and were given a choice of keeping the money or buying gum or mints.

The claim is that “money in large denominations is less likely to be spent relative to an equivalent amount in many smaller denominations”.

Let’s test the claim at the 0.05 significance level.

Below are the sample data and summary statistics:

                     Group 1                   Group 2
                     Subjects Given $1 Bill    Subjects Given 4 Quarters
Spent the money      x1 = 12                   x2 = 27
Subjects in group    n1 = 46                   n2 = 43

$$\hat{p}_1 = \frac{12}{46}, \qquad \hat{p}_2 = \frac{27}{43}$$

$$\bar{p} = \frac{12 + 27}{46 + 43} = 0.438202$$

Requirement Check:

1) The 89 subjects were randomly assigned to two groups, so we consider these independent random samples.

2) The subjects given the $1 bill include 12 who spent it and 34 who did not. The subjects given the quarters include 27 who spent it and 16 who did not. All counts are above 5, so the requirements are all met.

Step 1: The claim that “money in large denominations is less likely to be spent” can be expressed as p1 < p2.

Step 2: If p1 < p2 is false, then p1 ≥ p2.

Step 3: The hypotheses can be written as:

H0: p1 = p2

H1: p1 < p2

The pooled sample proportion and its complement are:

$$\bar{p} = \frac{12 + 27}{46 + 43} = 0.438202, \qquad \bar{q} = 1 - \bar{p} = 1 - 0.438202 = 0.561798$$

Step 4: The significance level is 𝛼 = 0.05.

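Step 5: We will use the normal distribution to run the test with:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$$

Step 6: Calculate the value of the test statistic:

$$z = \frac{\left(\dfrac{12}{46} - \dfrac{27}{43}\right) - 0}{\sqrt{\dfrac{(0.438202)(0.561798)}{46} + \dfrac{(0.438202)(0.561798)}{43}}} = -3.49$$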

Step 6: This is a left-tailed test, so the P-value is the area to the left of the test statistic z = –3.49, or 0.0002. The critical value is z = –1.645.

Step 7: Because the P-value of 0.0002 is less than the significance level of α = 0.05, reject the null hypothesis.

There is sufficient evidence to support the claim that people with money in large denominations are less likely to spend relative to people with money in smaller denominations.

It should be noted that the subjects were all undergraduates and care should be taken before generalizing the results to the general population.
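The two-proportion test above can be sketched in a few lines of standard-library Python. The helper name `two_proportion_z` is illustrative. It pools the two samples to estimate the common proportion under H0: p1 = p2, then computes the left-tailed P-value.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2. Returns (z, pooled proportion)."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled sample proportion
    q_bar = 1 - p_bar
    std_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / std_error, p_bar

# Group 1: $1 bill (12 of 46 spent it); Group 2: four quarters (27 of 43 spent it).
z, p_bar = two_proportion_z(12, 46, 27, 43)

# Left-tailed test (H1: p1 < p2): P-value is the area to the left of z.
p_value = NormalDist().cdf(z)

alpha = 0.05
print(f"pooled proportion = {p_bar:.6f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```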

We can also construct a confidence interval to estimate the difference between the population proportions.

Caution: The confidence interval uses standard deviations based on estimated values of the population proportions, and consequently, a conclusion based on a confidence interval might be different from a conclusion based on a hypothesis test.

Construct a 90% confidence interval estimate of the difference between the two population proportions.

What does the result suggest about our claim about people spending large denominations relative to spending small denominations?

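$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = 1.645\sqrt{\frac{\left(\frac{12}{46}\right)\left(\frac{34}{46}\right)}{46} + \frac{\left(\frac{27}{43}\right)\left(\frac{16}{43}\right)}{43}} = 0.161387$$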

$$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$$

$$-0.367037 - 0.161387 < (p_1 - p_2) < -0.367037 + 0.161387$$

$$-0.528 < (p_1 - p_2) < -0.206$$

The confidence interval limits do not include 0, implying that there is a significant difference between the two proportions.

There does appear to be sufficient evidence to support the claim that “money in large denominations is less likely to be spent relative to an equivalent amount in many smaller denominations.”
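The 90% interval can be checked with the sketch below. It uses the unpooled standard error, since the confidence interval is based on the two sample proportions separately. Variable names are illustrative, and the margin of error differs from the slide value in the fifth decimal place because the code uses an unrounded critical value instead of 1.645.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 12, 46      # $1 bill group: spent, total
x2, n2 = 27, 43      # four quarters group: spent, total
confidence = 0.90

p_hat1, p_hat2 = x1 / n1, x2 / n2
q_hat1, q_hat2 = 1 - p_hat1, 1 - p_hat2

# Critical value z_{alpha/2}; about 1.645 for 90% confidence.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error from the unpooled sample proportions.
E = z_crit * sqrt(p_hat1 * q_hat1 / n1 + p_hat2 * q_hat2 / n2)

diff = p_hat1 - p_hat2
lower, upper = diff - E, diff + E

print(f"E = {E:.6f}")
print(f"{lower:.3f} < p1 - p2 < {upper:.3f}")
print("Interval excludes 0:", not (lower < 0 < upper))
```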

Complete HW 3 and Quiz 3
