
AP Statistics – Ch. 10 Notes

Comparing Two Proportions

Situations in which we perform inference about the difference between two proportions ($p_1 - p_2$):

• Comparing the proportions of individuals with a certain characteristic in two different populations: The parameters of interest are the true proportions of individuals with the characteristic in each population, $p_1$ and $p_2$. We estimate these proportions by taking separate random samples from the two populations and calculating the proportion of individuals in each sample with the characteristic ($\hat{p}_1$ and $\hat{p}_2$).

• Comparing the proportions of successful outcomes for two treatment groups in a completely randomized experiment: The parameters of interest are the true proportions of successful outcomes for each treatment, $p_1$ and $p_2$. We estimate these proportions using the proportions of successes in the two treatment groups of our randomized experiment ($\hat{p}_1$ and $\hat{p}_2$).

We compare the populations or treatments by doing inference about the difference $p_1 - p_2$ between the parameters. The statistic that we use to estimate this difference in a confidence interval or hypothesis test is the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$.

The Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

Choose an SRS of size $n_1$ from Population 1 with proportion of successes $p_1$ and an independent SRS of size $n_2$ from Population 2 with proportion of successes $p_2$.

• Shape: When $n_1 p_1$, $n_1 q_1$, $n_2 p_2$, and $n_2 q_2$ are all at least 10 (where $q_i = 1 - p_i$), the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately Normal.

• Center: $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$. That is, the difference in sample proportions is an unbiased estimator of the difference in population proportions.

• Spread: $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$, as long as the 10% condition is met.

Example: A researcher reports that 80% of high school graduates but only 40% of high school dropouts would

pass a basic literacy test. Assume the researcher’s claim is true. Suppose we give a basic literacy test to a

random sample of 60 high school graduates and a separate random sample of 75 high school dropouts. Let

$\hat{p}_{\text{graduate}}$ and $\hat{p}_{\text{dropout}}$ be the proportions of graduates and dropouts in the samples who pass the test, respectively.

a) What is the shape of the sampling distribution of $\hat{p}_{\text{graduate}} - \hat{p}_{\text{dropout}}$? How do you know?

b) Find the mean and standard deviation of the sampling distribution of $\hat{p}_{\text{graduate}} - \hat{p}_{\text{dropout}}$. Interpret these values in context.

c) Find the probability that in your samples the proportion of graduates who pass the test is no more than

0.20 higher than the proportion of dropouts who pass.

d) Suppose that the difference in the sample proportions (graduate – dropout) who pass the test is exactly

0.20. Based on your result in part (c), would this give you reason to doubt the researcher’s claim?

Explain.
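The arithmetic behind parts (a) through (c) can be sketched in Python as a check on hand work, not as a model answer. The sketch assumes the researcher's claimed proportions are true, uses the Normal approximation, and relies only on the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Claimed population proportions and sample sizes from the example
p_grad, n_grad = 0.80, 60
p_drop, n_drop = 0.40, 75

# Shape: all four expected counts should be at least 10
counts = [n_grad * p_grad, n_grad * (1 - p_grad),
          n_drop * p_drop, n_drop * (1 - p_drop)]
normal_ok = all(c >= 10 for c in counts)

# Center and spread of the sampling distribution of the difference
mean_diff = p_grad - p_drop
sd_diff = sqrt(p_grad * (1 - p_grad) / n_grad
               + p_drop * (1 - p_drop) / n_drop)

# Part (c): P(p_hat_grad - p_hat_drop <= 0.20) under the Normal approximation
prob = NormalDist(mean_diff, sd_diff).cdf(0.20)

print(normal_ok, round(mean_diff, 2), round(sd_diff, 4), round(prob, 4))
```

A very small probability here is what drives the reasoning in part (d).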

Just as we can construct confidence intervals and perform hypothesis tests for one-sample situations, we can do

the same for two-sample situations.

When we are constructing a confidence interval for $p_1 - p_2$, we don’t know the values of $p_1$ or $p_2$, so we have to use $\hat{p}_1$ and $\hat{p}_2$ to estimate these values in the formula for standard deviation.

Standard Error (or Estimated Standard Deviation) of $\hat{p}_1 - \hat{p}_2$: $SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$

Two-Sample z Interval for a Difference between Two Proportions (or Two-Proportion z Interval)

An approximate level C confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$$

where $z^*$ is the critical value for the standard Normal curve with area C between $-z^*$ and $z^*$.

Conditions:

• Random: The data come from independent random samples from two different populations

• Normal: The counts of “successes” and “failures” in each sample ($n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$, and $n_2\hat{q}_2$) are all at least 10.

• 10% Condition: Check that both populations are at least 10 times as large as their corresponding samples.

Confidence Interval for the Difference between Two Proportions on TI-83/TI-84 Calculators

1. Choose “2-PropZInt” on the STAT → TESTS menu.

2. Enter the requested information:

x1: number of successes in sample 1

n1: sample size of sample 1

x2: number of successes in sample 2

n2: sample size of sample 2

C-Level: confidence level (as a decimal)

3. Choose “Calculate”.

Example: Did the proportion of U.S. adults who would report having read a book in the past year change

between 2015 and 2018? In a random sample of 1,403 U.S. adults in April 2015, 72% reported having read a

book in the past year. In another random sample of 2,001 U.S. adults in January 2018, 76% reported having

read a book in the past year.

a) Calculate the standard error of the sampling distribution of the difference in the sample proportions

(2018 – 2015). Interpret this value.

b) Construct and interpret a 90% confidence interval for the difference in the proportions of all U.S. adults

who would report having read a book in the past year in 2018 and in 2015 (2018-2015).

c) Based on your interval, is there convincing evidence that the proportion of U.S. adults who would report

having read a book in the past year changed between 2015 and 2018? Explain.

Example: Are teens or adults more likely to be online “almost constantly”? The Pew Internet and American

Life Project asked a random sample of 1060 teens (September 2014 to March 2015) and a separate random

sample of 2001 adults (June to July 2015) how often they use the Internet. In these two surveys, 257 of the teens

and 420 of the adults said they are online “almost constantly”. Construct and interpret a 95% confidence

interval for $p_{\text{teens}} - p_{\text{adults}}$.
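The two-proportion z interval formula can be sketched as a small Python function. This is a sketch only: the function name `two_prop_z_interval` is made up for illustration, it uses the unpooled standard error from the formula above, and it does not check the Random, Normal, or 10% conditions, which still have to be verified separately.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, conf_level):
    """Return (lower, upper) for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z* leaves area conf_level between -z* and z* under the standard Normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Teens vs. adults who say they are online "almost constantly"
lower, upper = two_prop_z_interval(257, 1060, 420, 2001, 0.95)
print(f"95% CI for p_teens - p_adults: ({lower:.4f}, {upper:.4f})")
```

The inputs mirror what 2-PropZInt asks for (x1, n1, x2, n2, C-Level), so the result can be compared with the calculator's.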

Significance Tests for $p_1 - p_2$

Very often, we want to test the null hypothesis that there is no difference between two proportions, so we test $H_0: p_1 - p_2 = 0$ or, equivalently, $H_0: p_1 = p_2$. The alternative hypothesis specifies what kind of difference we expect.

Since the null hypothesis assumes that $p_1 = p_2$, we use a pooled sample proportion to calculate the standard deviation.

$$\hat{p}_C = \frac{\text{count of successes in both samples combined}}{\text{count of individuals in both samples combined}} = \frac{X_1 + X_2}{n_1 + n_2} = \text{pooled sample proportion}$$

Two-Sample z Test for the Difference between Two Proportions

To test the hypothesis $H_0: p_1 - p_2 = 0$ (or $H_0: p_1 = p_2$):

Test statistic: $$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}_C \hat{q}_C}{n_1} + \dfrac{\hat{p}_C \hat{q}_C}{n_2}}}$$

P-value: Find the probability of getting a z statistic this large or larger in the direction specified by the alternative hypothesis $H_a$. The P-value is the shaded area. For a two-tailed test, it is the total area in both tails.

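Conditions:

• Random: The data come from independent random samples from two different populations.

• Normal: The counts of “successes” and “failures” in each sample ($n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$, and $n_2\hat{q}_2$) are all at least 10.

• 10% Condition: Check that both populations are at least 10 times as large as their corresponding samples.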
Hypothesis Test for the Difference between Two Proportions on TI-83/TI-84 Calculators

1. Choose “2-PropZTest” on the STAT → TESTS menu.

2. Enter the requested information:

x1: number of successes in sample 1

n1: sample size of sample 1

x2: number of successes in sample 2

n2: sample size of sample 2

3. Specify which proportion the alternative hypothesis says is higher.

4. Choose “Calculate” to see results, or “Draw” to see a shaded Normal curve.

Example: Are teenagers going deaf? In a study of 3000 randomly selected teenagers in 1988-1994, 15%

showed some hearing loss. In a similar study of 1800 teenagers in 2005-2006, 19.5% showed some hearing loss.

(These data are reported in Arizona Daily Star, August 18, 2010.)

a) Do these data give convincing evidence that the proportion of all teens with hearing loss has increased?

b) Between the two studies, Apple introduced the iPod. If the results of the test are statistically significant,

can we blame iPods for the increased hearing loss in teenagers?

Example: The Centers for Disease Control and Prevention selected a random sample of Americans age 65 and

older. They found that 411 of the 1012 men and 535 of the 1062 women suffered from some form of arthritis.

Do these data provide convincing evidence that arthritis is more likely to afflict senior women than senior men?
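The pooled two-proportion z test can be sketched in Python as follows. The function name `two_prop_z_test` and its `alternative` argument are illustrative choices, not part of these notes. The sketch computes only the test statistic and P-value, so the conditions still need to be checked by hand.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Return (z, p_value) for H0: p1 - p2 = 0 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    z = (p1 - p2) / se
    std_normal = NormalDist()
    if alternative == "greater":    # Ha: p1 > p2
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":     # Ha: p1 < p2
        p_value = std_normal.cdf(z)
    else:                           # Ha: p1 != p2, total area in both tails
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

# Arthritis example: population 1 = senior women, population 2 = senior men
z, p_value = two_prop_z_test(535, 1062, 411, 1012, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.6f}")
```

Putting women first makes the alternative $H_a: p_{\text{women}} - p_{\text{men}} > 0$, which matches the question as worded.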

Comparing Two Means

Situations in which we perform inference about the difference between two means ($\mu_1 - \mu_2$):

• Comparing the mean of some quantitative variable for the individuals in two different populations: The parameters of interest are the population means in each population, $\mu_1$ and $\mu_2$. We estimate these means by taking separate random samples from each population and calculating the sample means $\bar{x}_1$ and $\bar{x}_2$.

• Comparing the average effectiveness of two treatments in a completely randomized experiment: The parameters of interest are the true mean responses for treatment 1 and treatment 2, $\mu_1$ and $\mu_2$. We use the mean response in the two groups, $\bar{x}_1$ and $\bar{x}_2$, to make the comparison.

We compare the populations or treatments by doing inference about the difference $\mu_1 - \mu_2$ between the parameters. The statistic that we use to estimate this difference in a confidence interval or hypothesis test is the difference between the two sample means, $\bar{x}_1 - \bar{x}_2$.

The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$

Choose an SRS of size $n_1$ from Population 1 with mean $\mu_1$ and standard deviation $\sigma_1$ and an independent SRS of size $n_2$ from Population 2 with mean $\mu_2$ and standard deviation $\sigma_2$.

• Shape: When both population distributions are Normal, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is Normal. In other cases, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately Normal if the sample sizes are large enough ($n_1 \ge 30$ and $n_2 \ge 30$).

• Center: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$. That is, the difference in sample means is an unbiased estimator of the difference in population means.

• Spread: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$, as long as the 10% condition is met.

Example: A potato chip manufacturer buys potatoes from two different suppliers, Riderwood Farms and

Camberley, Inc. The weights of potatoes from Riderwood Farms are approximately Normally distributed with a

mean of 175 grams and a standard deviation of 25 grams. The weights of potatoes from Camberley are

approximately Normally distributed with a mean of 180 grams and a standard deviation of 30 grams. When

shipments arrive at the factory, inspectors randomly select a sample of 20 potatoes from each shipment and

weigh them. They are surprised when the average weight of the potatoes in the sample from Riderwood Farms, $\bar{x}_R$, is higher than the average weight of the potatoes in the sample from Camberley, $\bar{x}_C$.

a) Describe the shape, center, and spread of the sampling distribution of $\bar{x}_C - \bar{x}_R$. Interpret the values of the mean and standard deviation in context.

b) Find the probability that the mean weight of the Riderwood sample is larger than the mean weight of the

Camberley sample. Should the inspectors have been surprised that the Riderwood sample had a higher

mean weight than the Camberley sample?

c) Review from Ch. 6: Find the probability that a single potato from Riderwood Farms weighs more than a single potato from Camberley.
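The contrast between parts (b) and (c) is easy to see numerically. The sketch below treats both weight distributions as exactly Normal with the stated means and standard deviations (an idealization of "approximately Normal") and uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_r, sigma_r = 175, 25   # Riderwood Farms (grams)
mu_c, sigma_c = 180, 30   # Camberley (grams)
n = 20                    # potatoes sampled from each shipment

# Sampling distribution of xbar_C - xbar_R
mean_diff = mu_c - mu_r
sd_diff = sqrt(sigma_c**2 / n + sigma_r**2 / n)

# Part (b): P(xbar_R > xbar_C) is the same as P(xbar_C - xbar_R < 0)
p_sample = NormalDist(mean_diff, sd_diff).cdf(0)

# Part (c): difference of two single potatoes, C - R (no division by n)
sd_single = sqrt(sigma_c**2 + sigma_r**2)
p_single = NormalDist(mean_diff, sd_single).cdf(0)

print(round(sd_diff, 2), round(p_sample, 3), round(p_single, 3))
```

Averaging over 20 potatoes shrinks the standard deviation of the difference, so the sample-mean probability is noticeably smaller than the single-potato probability.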

Standard Error (or Estimated Standard Deviation) of $\bar{x}_1 - \bar{x}_2$: $SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Two-Sample t Statistic: $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Like any other z or t statistic, this statistic tells us how many standard deviations the sample statistic $\bar{x}_1 - \bar{x}_2$ is from its mean. The two-sample t statistic has approximately a t distribution. There are two options for determining the degrees of freedom.

• Option 1 (Technology): Use the t distribution with degrees of freedom calculated from the data by the lovely formula below. With this option, the degrees of freedom may not be a whole number.

$$\text{df} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$$

• Option 2 (Conservative): Use the t distribution with degrees of freedom equal to the smaller of $n_1 - 1$ and $n_2 - 1$. This always gives a confidence interval that is wider than necessary (higher margin of error) for the desired confidence level, and a P-value that is greater than or equal to the true P-value.

Robustness of Two-Sample t Procedures

Two-sample t procedures are even more robust against non-Normality than the one-sample t procedures. This is especially true if the two populations being compared have distributions with similar shapes. The two-sample t procedures are most robust against non-Normality when the sample sizes are equal or very similar.

Two-Sample t Interval for a Difference between Two Means (or Two-Sample t Interval)

An approximate level C confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$

where $t^*$ is the critical value for confidence level C for the t distribution with degrees of freedom approximated by technology or the smaller of $n_1 - 1$ and $n_2 - 1$.

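Conditions:

• Random: The data come from independent random samples from two different populations.

• Normal/Large Sample Size: Both samples are large ($n_1 \ge 30$ and $n_2 \ge 30$) OR no strong skewness or outliers can be seen in the graph of either distribution of sample data. (You are checking to make sure it is reasonable to believe that both population distributions are approximately Normal.)

• 10% Condition: Check that both populations are at least 10 times as large as their corresponding samples.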
Confidence Interval for the Difference between Two Means on TI-83/TI-84 Calculators

1. Choose “2-SampTInt” on the STAT → TESTS menu.

2. Choose “Data” if you have a list of sample data. Choose “Stats” if you have values for $\bar{x}_1$, $\bar{x}_2$, $s_1$, and $s_2$.

3. Enter the requested information:

For “Data” option, input the sample values into two lists and indicate which lists they are in.

For “Stats” option,

$\bar{x}_1$: mean of sample 1

$\bar{x}_2$: mean of sample 2

Sx1: sample st. dev. of sample 1

Sx2: sample st. dev. of sample 2

n1: size of sample 1

n2: size of sample 2

C-level: confidence level (as a decimal)

Always choose “NO” pooling!

4. Choose “Calculate”.

Example: A team of anthropologists headed by researcher Nicole Waguespack studied the difference between

stone-tipped and wooden-tipped arrows. Stone arrow tips are tougher, but also take longer to make. Many

cultures used both types of arrow tips, including the Apache in North America and the Tiwi in Australia. The

researchers set up a compound bow with 60 lbs. of force. They shot arrows of both types into a hide-covered

ballistics gel and measured the penetration depth in mm. Arrows that penetrate more deeply into their targets

make deadlier weapons. Here are the penetration depths for seven shots with each type of arrow tip.

Wooden (mm) 216 211 192 208 203 210 203

Stone (mm) 240 208 213 225 232 214 240

a) Draw parallel dotplots of the data. Just from looking at the graphs, do you expect to find evidence of a

significant difference in penetration depth for the two types of arrow tips?

b) Construct and interpret a 95% confidence interval for the difference in mean penetration depth for the

two types of arrow tips.

c) Does your interval provide convincing evidence that there is a difference in the mean penetration depth

for the two types of arrow tips? Does the difference seem worth the extra time, effort, and cost of stone

tips?
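The pieces of the two-sample t interval for the arrow data can be sketched in Python as follows. Several things here are assumptions for illustration: the helper name `welch_df`, taking the difference as stone minus wooden, and using the conservative df of 6 with the t-table critical value $t^* = 2.447$ hard-coded, because the standard library has no t distribution. Technology such as 2-SampTInt would use the Option 1 degrees of freedom and give a somewhat narrower interval.

```python
from math import sqrt
from statistics import mean, stdev

wooden = [216, 211, 192, 208, 203, 210, 203]
stone = [240, 208, 213, 225, 232, 214, 240]

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom from the Option 1 (technology) formula."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

n1, n2 = len(stone), len(wooden)
s1, s2 = stdev(stone), stdev(wooden)   # sample standard deviations
diff = mean(stone) - mean(wooden)      # xbar_stone - xbar_wooden
se = sqrt(s1**2 / n1 + s2**2 / n2)     # standard error of the difference

df_tech = welch_df(s1, n1, s2, n2)     # usually not a whole number
df_conservative = min(n1, n2) - 1      # Option 2

# Table value of t* for 95% confidence with df = 6 (looked up, not computed)
T_STAR_DF6_95 = 2.447
lower = diff - T_STAR_DF6_95 * se
upper = diff + T_STAR_DF6_95 * se

print(round(df_tech, 2), df_conservative)
print(f"Conservative 95% CI: ({lower:.2f}, {upper:.2f}) mm")
```

Whether the interval contains 0 is what part (c) turns on.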

Two-Sample t Test for the Difference between Two Means

To test the hypothesis $H_0: \mu_1 - \mu_2 = \text{hypothesized value}$, compute the two-sample t statistic.

Test statistic: $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

P-value: Find the probability of getting a t statistic this large or larger in the direction specified by the alternative hypothesis $H_a$. Use the t distribution with degrees of freedom approximated by technology or the smaller of $n_1 - 1$ and $n_2 - 1$. The P-value is the shaded area. For a two-tailed test, it is the total area in both tails.

$H_a: \mu_1 - \mu_2 > \text{hypothesized value}$ $\qquad$ $H_a: \mu_1 - \mu_2 < \text{hypothesized value}$ $\qquad$ $H_a: \mu_1 - \mu_2 \ne \text{hypothesized value}$

Conditions:

• Random: The data come from independent random samples from two different populations.

• Normal/Large Sample Size: Both samples are large ($n_1 \ge 30$ and $n_2 \ge 30$) OR no strong skewness or outliers can be seen in the graph of either distribution of sample data. (You are checking to make sure it is reasonable to believe that both population distributions are approximately Normal.)

• 10% Condition: Check that both populations are at least 10 times as large as their corresponding samples.

Significance Tests for the Difference between Two Means on TI-83/TI-84 Calculators

1. Choose “2-SampTTest” on the STAT → TESTS menu.

2. Choose “Data” if you have a list of sample data. Choose “Stats” if you have values for $\bar{x}_1$, $\bar{x}_2$, $s_1$, and $s_2$.

3. Enter the requested information:

For “Data” option, input the sample values into two lists and indicate which lists they are in.

For “Stats” option,

$\bar{x}_1$: mean of sample 1

$\bar{x}_2$: mean of sample 2

Sx1: sample st. dev. of sample 1

Sx2: sample st. dev. of sample 2

n1: size of sample 1

n2: size of sample 2

Always choose “NO” pooling!

4. Specify which mean the alternative hypothesis says is higher.

5. Choose “Calculate” to see results, or “Draw” to see a shaded t curve.

Example: In commercials for Bounty paper towels, the manufacturer claims that they are the “quicker picker-upper.” But

are they also the stronger picker upper? Two AP Statistics students, Wesley and Maverick, decided to find out.

They selected a random sample of 30 Bounty paper towels and a random sample of 30 generic paper towels and

measured their strength when wet. To do this, they uniformly soaked each paper towel with 4 ounces of water,

held two opposite edges of the paper towel, and counted how many quarters each paper towel could hold until

ripping, alternating brands. Here are their results:

Bounty: 106 111 106 120 103 112 115 125 116 120 126 125 116 117 114

118 126 120 115 116 121 113 111 128 124 125 127 123 115 114

Generic: 77 103 89 79 88 86 100 90 81 84 84 96 87 79 90

86 88 81 91 94 90 89 85 83 89 84 90 100 94 87

a) Display these distributions using parallel boxplots and briefly compare these distributions. Based only on the boxplots, discuss whether or not you think the mean for Bounty is significantly higher than the mean for generic.

b) Use a significance test to determine whether there is convincing evidence that wet Bounty paper towels

can hold more weight, on average, than wet generic paper towels can.

c) Interpret the P-value from part (b) in the context of this question.

Independent Samples vs. Paired Samples – Which Is It?

a) To test the effect of background music on productivity, several workers are observed. For one month they had no music. For another month they had background music.

b) A random sample of 10 workers in Plant A are to be compared to a sample of 10 workers in Plant B.

c) A new weight reducing diet was tried on ten women. The weight of each woman was measured before

the diet, and again after being on the diet for ten weeks.

d) To compare the average weight gain of pigs fed two different rations, nine pairs of pigs were used. The

pigs in each pair were litter-mates.

e) To test the effects of a new fertilizer, 100 plots are treated with one fertilizer, and 100 plots are treated

with the other.

f) A sample of college teachers is taken. We wish to compare the average salaries of male and female

teachers.

g) A new fertilizer is tested on 100 plots. Each plot is divided in half. Fertilizer A is applied to one half and

B to the other.

h) Consumers Union wants to compare two types of calculators. They get 100 volunteers and ask them to

carry out a series of 50 routine calculations (such as figuring discounts, sales tax, totaling a bill, etc.).

Each volunteer does each calculation on both types of calculator, and the time required for each

calculation is recorded.

Inference for Experiments

Important Differences

Parameters:

• Proportions of individuals like those in the study who would respond a certain way to each treatment.

• Mean response sizes for individuals like those in the study.

Caution: Avoid past tense when defining your parameters and in your conclusion. If you refer to how individuals did respond rather than how they would respond, you are talking about your treatment group and are referring to values of statistics ($\hat{p}$'s or $\bar{x}$'s) that can be calculated directly rather than the unknown parameters ($p$'s or $\mu$'s) for which you are doing inference. Also, do not refer to subjects when defining your parameter or in your conclusion for the same reason: subjects are people who actually took part in your experiment, not individuals similar to them.

Conditions:

• Random: The data come from two groups in a randomized experiment.

• Normal:

  ◦ For 2-sample inference about proportions: $n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$, and $n_2\hat{q}_2$ must all be at least 10. That is, there must be at least 10 successes and 10 failures in each treatment group.

  ◦ For 2-sample inference about means: The two treatment groups are both large ($n_1 \ge 30$ and $n_2 \ge 30$) OR no strong skewness or outliers can be seen in the graph of either distribution of sample data. (You are checking to make sure it is reasonable to believe that the true distributions of responses to the two treatments are approximately Normal.)

• Independent: The outcomes for the individuals in the study must be independent of each other. (The

outcome for any individual shouldn’t give you any new information about the likely outcome for any

other individual.) In a well-designed experiment that includes controls and random assignment, this

should be true. (If you want to study the effects of the treatments, you must control any other variables

that might affect the outcome, including any influence the subjects might have on each other.)

DO NOT CHECK THE 10% CONDITION! (You didn’t take a random sample, so it isn’t correct to check the 10% condition, and YOU WILL LOSE POINTS if you do!)

Example: High levels of cholesterol in the blood are associated with a higher risk of heart attacks. Will using a

drug to lower blood cholesterol reduce heart attacks? The Helsinki Heart Study recruited middle-aged men with

high cholesterol but no history of other serious medical problems to investigate this question. The volunteer

subjects were assigned at random to one of two treatments: 2051 men took the drug gemfibrozil to reduce their

cholesterol levels, and a control group of 2030 men took a placebo. During the next five years, 56 men in the

gemfibrozil group and 84 men in the placebo group had heart attacks.

a) Do the results of this study give convincing evidence at the $\alpha = 0.01$ level that gemfibrozil is effective in preventing heart attacks?

b) Interpret the P-value you got in part a) in the context of this experiment.

The logic behind this example: There are two possible reasons why we might have observed a difference in

the proportions of subjects in our two groups who experienced heart attacks. Either a lower proportion of the

gemfibrozil group experienced heart attacks because gemfibrozil is more effective at preventing heart attacks

than a placebo, or all 140 people in the study who experienced heart attacks would have had a heart attack

regardless of which treatment they received, and the researchers just happened to put a higher proportion of them in the placebo group by chance.

Let’s assume that $H_0: p_G - p_C = 0$ is true. That is, there is no difference in the effectiveness of gemfibrozil and the placebo. All 140 people in the study who experienced heart attacks would have had a heart attack no matter which treatment they received. We can think about what would happen if we were to repeat the random assignment many times, dividing the 4081 subjects into two treatment groups, one with 2051 subjects and one with 2030 subjects. Each time, we count how many of those who end up having a heart attack end up in each group, then we calculate the difference in sample proportions, $\hat{p}_G - \hat{p}_C$, for each randomization. The result is called a randomization distribution.

The figure to the right shows the result of 3000 re-randomizations for this scenario. Notice that the distribution is approximately Normal with a mean of 0 and a standard deviation of 0.0057, which are very close to the same mean and standard deviation we get from the formulas used for situations where we have two independent random samples! Very convenient!

In the Helsinki Heart Study, the observed difference in the proportions of subjects who had a heart attack in the

gemfibrozil and placebo groups was 0.0273 – 0.0414 = – 0.0141. Only 25 of the 3000 re-randomizations

resulted in a difference in proportions this low or lower by chance, so the estimated P-value is about 0.0083,

which isn’t far off from the P-value we calculated with our formulas. Basically, the P-value is the probability

that we see at least this much of a difference in the sample proportions simply due to chance variation in

random assignment if gemfibrozil is no more effective than a placebo at preventing heart attacks.
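The re-randomization idea can be sketched with a short simulation. This is an illustration of the logic, not the software that produced the figure: the seed, the helper name `rerandomized_difference`, and the use of `random.sample` to deal subjects into the gemfibrozil group are all choices made here. Results vary a little with the seed, so expect a standard deviation and P-value near, not exactly equal to, the values quoted in these notes.

```python
import random
from statistics import mean, stdev

random.seed(1)  # arbitrary seed so the sketch is repeatable

N_GEM, N_PLACEBO, N_ATTACKS = 2051, 2030, 140
# 1 = had a heart attack, 0 = did not. Under H0 each subject's outcome
# is fixed no matter which treatment he is assigned.
outcomes = [1] * N_ATTACKS + [0] * (N_GEM + N_PLACEBO - N_ATTACKS)
observed = 56 / N_GEM - 84 / N_PLACEBO

def rerandomized_difference():
    """Randomly reassign all subjects and return p_hat_G - p_hat_C."""
    attacks_gem = sum(random.sample(outcomes, N_GEM))
    attacks_placebo = N_ATTACKS - attacks_gem
    return attacks_gem / N_GEM - attacks_placebo / N_PLACEBO

diffs = [rerandomized_difference() for _ in range(3000)]

# Estimated P-value: share of re-randomizations at or below the observed value
p_value = sum(d <= observed for d in diffs) / len(diffs)
print(round(mean(diffs), 4), round(stdev(diffs), 4), round(p_value, 4))
```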

Example: Does increasing the amount of calcium in our diet reduce blood pressure? Observational studies have

suggested a link, but researchers designed a randomized comparative experiment to investigate the question of

causation. The subjects were 21 healthy men who volunteered to take part in the experiment. They were

randomly assigned to two groups: 10 of the men received a calcium supplement for 12 weeks, while the control

group of 11 men received a placebo pill that looked identical. The experiment was double-blind. The response

variable is the decrease in systolic blood pressure for a subject after 12 weeks, in millimeters of mercury. (An

increase appears as a negative number.) Here are the data:

Calcium: 7 –4 18 17 –3 –5 1 10 11 –2

Placebo: –1 12 –1 –3 3 –5 5 2 –11 –1 –3

a) Draw parallel dotplots of the data. Based only on the dotplots, do you suspect there will be a significant

difference between the true mean decrease in systolic blood pressure for healthy men like those in the

study who take calcium and those who take a placebo?

b) Do the data provide convincing evidence that a calcium supplement reduces blood pressure more than a

placebo, on average, for subjects like the ones in this study?

The logic behind this example: There are two possible reasons why we observed a difference in mean blood

pressure reduction for the two groups as large as we did. Either the mean blood pressure reduction was higher

for the calcium group because calcium is more effective at lowering blood pressure, or the two treatments are

equally effective, and any differences we saw were simply because of chance variation due to random

assignment.

Assume that $H_0: \mu_C - \mu_P = 0$ is true. That is, assume that calcium and the placebo are equally effective at lowering blood pressure. If we reassign the 21 subjects to the two groups many times, assuming the treatment doesn’t affect each individual’s change in blood pressure, and then calculate the new difference in sample mean decrease in systolic blood pressure ($\bar{x}_C - \bar{x}_P$) for each re-randomization, we can estimate how likely it is that we’d see a difference at least as extreme as the one we actually observed by chance if the two treatments don’t differ in effectiveness.

The randomization distribution is approximately Normal with a mean of 0.014 and a standard deviation of 3.400, which agree well with the mean and standard deviation we calculated using the formulas for situations involving two independent random samples (0 and 3.29). Very convenient!

The observed difference in the mean reduction in blood pressure in the calcium and placebo groups was 5.000 – (–0.273) = 5.273. About 660 of the re-randomizations resulted in differences this high or higher by chance, so the estimated P-value is about 0.066, which isn’t far off from the P-value we calculated with our formulas. Basically, the P-value is the probability that we see at least this much of a difference in the sample means simply due to chance variation in random assignment if calcium is no more effective than a placebo at reducing blood pressure.
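The same logic for means can be sketched by shuffling the 21 observed responses and splitting them into groups of 10 and 11. The number of re-randomizations (10,000) and the seed are assumptions made for this sketch, because the notes do not state how many were run. The estimated P-value will vary slightly from run to run and from the value quoted above.

```python
import random
from statistics import mean

random.seed(1)  # arbitrary seed so the sketch is repeatable

calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]
observed = mean(calcium) - mean(placebo)   # xbar_C - xbar_P

pooled = calcium + placebo   # under H0 the group labels are interchangeable
n_calcium = len(calcium)
REPS = 10_000                # assumed number of re-randomizations

count_as_extreme = 0
for _ in range(REPS):
    random.shuffle(pooled)
    diff = mean(pooled[:n_calcium]) - mean(pooled[n_calcium:])
    if diff >= observed:     # one-sided: Ha says calcium reduces BP more
        count_as_extreme += 1

p_value = count_as_extreme / REPS
print(round(observed, 3), p_value)
```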

AP Statistics – 12.1 Notes

Inference for Linear Regression

Regression Line: A line that describes how a response variable y changes as an explanatory variable x changes.

Correlation (r): A number between –1 and 1 that measures the direction and strength of the linear relationship

between two variables.

Coefficient of Determination ($r^2$): The proportion of the variation in the values of y that is explained by the least-squares regression line.

Residuals: The differences between the observed values of the response variable and the values predicted by the regression equation. $\text{residual} = y - \hat{y}$

Residual Plot: A plot of the residuals against the explanatory variable (x).

If we have all the data for a population, we can calculate the true regression line: $\mu_y = \alpha + \beta x$.

Regression line based on the entire population (all eruptions of Old Faithful in a month)

What can we do if we have a sample? Can a regression line based on a sample tell us anything about the true

regression line of the population?

Regression lines based on samples of 20 eruptions from that month.

Notice how the slope of the regression line is different for each sample, even though they all come from the

same population.

Population Regression Equation: $\mu_y = 33.97 + 10.36x$

Conditions for Regression Inference (LINER): Suppose we have n observations on an explanatory variable x and a response variable y. Our goal is to study or predict the behavior of y for given values of x.

• Linear: The actual relationship between x and y is linear. The mean values of y for each value of x line up along the population (true) regression line $\mu_y = \alpha + \beta x$.

• Independent: Individual observations are independent of each other (or the 10% condition is met).

• Normal: For each value of x, the y-values are Normally distributed around the regression line.

• Equal Variance: The standard deviation of y, called $\sigma$, is the same for all values of x.

• Random: The data come from a well-designed random sample or randomized experiment.

How to Check Conditions:

• Linear: Look at the scatterplot to make sure the overall pattern is roughly linear. Make sure there are no

curved patterns in the residual plot. Check to see that the residuals appear randomly scattered and are

centered around the “residual = 0” line.

• Independent: Look at how the data were produced. If sampling is done without replacement, check the

10% condition. If the study is a randomized experiment, good design with proper controls and random

assignment help ensure the independence of individual observations.

• Normal: Make a stemplot, histogram, or Normal probability plot of the residuals and check to make

sure there isn’t extreme skewness, outliers, or other major departures from Normality.

• Equal Variance: Look at the scatter of the residuals above and below the “residual = 0” line in the

residual plot. The amount of scatter should be roughly the same from the smallest to the largest x-value.

Make sure you don’t see a “fan” pattern – a tight cluster at one end of the graph and a spread out pattern

at the other end.

• Random: The data come from a well-designed random sample or randomized experiment.

Example: Many people believe that students learn better if they sit closer to the front of the classroom. Does

sitting closer cause higher achievement or do better students simply choose to sit nearer to the front? To

investigate, an AP Statistics teacher randomly assigned students to seat locations in his classroom for a

particular chapter and recorded the test score for each student at the end of the chapter. The explanatory variable

in this experiment is which row the student was assigned to (Row 1 is closest to the front and Row 7 is farthest

away). Here are the results.

Row 1: 76, 77, 94, 99

Row 2: 83, 85, 74, 77

Row 3: 90, 88, 68, 78

Row 4: 94, 72, 101, 70, 79

Row 5: 76, 65, 90, 67, 96

Row 6: 88, 79, 90, 83

Row 7: 79, 76, 77, 63

Predictor   Coef      SE Coef   T       P
Constant    85.706    4.239     20.22   0.000
Row         -1.1171   0.9472    -1.18   0.248

S = 10.0673   R-Sq = 4.7%   R-Sq(adj) = 1.3%

a) What is the equation of the least-squares regression line?

b) Interpret the slope of the least-squares regression line in this context.

c) Interpret the value of $r^2$ in this context.

d) Why was it important to randomly assign the students to seats rather than letting each student choose

where to sit?

e) Check to see if the conditions for inference are met. A residual plot and a histogram of the residuals are

shown below.

[Figure: residual plot, Residual vs. Row]

[Figure: histogram of the residuals, Frequency vs. Residual]

[Figure: scatterplot, Score vs. Row]

Parameters for the True Regression Line $\mu_y = \alpha + \beta x$

• α is the true y-intercept.

• β is the true slope.

• σ is the standard deviation. It describes the variability of the response y about the population (true)

regression line. It basically says how tightly packed the observations are around the line.

Estimating α and β: In the regression line for a sample, $\hat{y} = a + bx$, the slope b is an unbiased estimator of the true slope β, and the intercept a is an unbiased estimator of the true intercept α.

Estimating σ: We are often interested in how tightly the data are clustered around the regression line. Since σ is unknown, we use s, which is the standard deviation of the residuals. Remember that s can be interpreted as the typical prediction error, or the typical distance of the observed values from the predicted values.

$$s = \sqrt{\frac{\sum \text{residuals}^2}{n-2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n-2}}$$
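As a rough illustration of the formula for s, the sketch below fits a least-squares line and computes the standard deviation of the residuals with n − 2 in the denominator. It uses the carnation data from the cut-flower example in these notes; the helper names `least_squares` and `residual_sd` are made up for this illustration. The result should agree with the S value printed in that example's computer output (S = 7.52596).

```python
import math

# Carnation data from the cut-flower example in these notes.
sugar = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]                       # tablespoons
hours = [168, 180, 192, 192, 204, 204, 204, 210, 210, 222, 228, 234]  # freshness


def least_squares(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residual_sd(x, y):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    a, b = least_squares(x, y)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return math.sqrt(sum(r ** 2 for r in residuals) / (len(x) - 2))


a, b = least_squares(sugar, hours)
s = residual_sd(sugar, hours)
print(f"y-hat = {a:.3f} + {b:.3f}x,  s = {s:.5f}")
```

The divisor is n − 2 rather than n − 1 because two quantities (a and b) are estimated from the data before the residuals are computed.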

Inference about the True Slope, β

Usually, the most important parameter in a regression problem is the true slope, β. The slope tells us how much

y is predicted to change, on average, each time x changes by 1 unit.

The standard error of the slope ($SE_b$) estimates the standard deviation of the sampling distribution of b – the standard deviation of the slopes of regression lines formed by taking repeated samples of the same size from the population. It measures how much the slopes of the sample regression lines from repeated samples typically vary from the slope of the population regression line.

The graph at the left shows the slopes of the regression lines for 1000 samples of size n = 20 from the Old Faithful data.
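The idea of a sampling distribution of b can be sketched with a small simulation. The Old Faithful data are not reproduced in these notes, so the code below assumes a hypothetical population model (the values of `ALPHA`, `BETA`, `SIGMA`, and the range of x are invented for illustration). It draws 1000 samples of size 20, computes the slope of each sample regression line, and summarizes the slopes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is repeatable

# Hypothetical population model (NOT the Old Faithful data):
# mu_y = ALPHA + BETA * x, with responses varying about the line with SD SIGMA.
ALPHA, BETA, SIGMA = 34.0, 10.0, 6.0


def slope(x, y):
    """Slope b of the least-squares line for one sample."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def one_sample_slope(n):
    """Draw one random sample of size n from the model and return its slope."""
    x = [random.uniform(1.5, 5.0) for _ in range(n)]
    y = [ALPHA + BETA * xi + random.gauss(0, SIGMA) for xi in x]
    return slope(x, y)


slopes = [one_sample_slope(20) for _ in range(1000)]
mean_slope = statistics.fmean(slopes)
sd_slope = statistics.stdev(slopes)

print(f"mean of 1000 sample slopes: {mean_slope:.3f}  (true slope = {BETA})")
print(f"SD of 1000 sample slopes:   {sd_slope:.3f}")
```

The mean of the simulated slopes should land close to the true slope (b is unbiased), and the SD of the simulated slopes is the quantity that the standard error of the slope estimates from a single sample.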

Normally, we get the value of $SE_b$ from computer output, but the formula is

$$SE_b = \frac{s}{s_x \sqrt{n-1}}.$$
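A quick check of this formula, using the carnation example from these notes: the sketch below takes s from that example's computer output, computes $s_x$ from the 12 sugar amounts, and applies the formula. The variable names are illustrative only. The result should agree with the SE Coef value printed for Sugar (1.943).

```python
import math
import statistics

# Sugar amounts (tablespoons) from the cut-flower example in these notes.
sugar = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]

s = 7.52596                      # S from that example's computer output
n = len(sugar)
s_x = statistics.stdev(sugar)    # sample standard deviation of the x-values

se_b = s / (s_x * math.sqrt(n - 1))
print(f"s_x = {s_x:.4f},  SE_b = {se_b:.3f}")
```

Note how the formula behaves: more spread in the x-values or a larger sample size makes $SE_b$ smaller, while more scatter about the line (larger s) makes it larger.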

• If β = 0, the mean of y does not change at all when x changes. In other words, there is no true linear relationship between x and y.

• When data from a random sample or a randomized experiment suggest that there is an association between two variables, there are two possible explanations for why the slope differs from 0. We do inference to decide which explanation seems more plausible.

o Explanation 1: There really is no association between the variables, and we got a nonzero slope due to sampling variability or the chance variation due to random assignment.

o Explanation 2: There really is an association between the two variables.

Inference about the true slope, β, involves a t curve with n − 2 degrees of freedom.

Confidence Interval for the True Slope, β:

statistic ± (critical value)(standard error of the statistic)

$$b \pm t^{*}\, SE_b$$

Use the t curve with n − 2 degrees of freedom.

Example: Here is the computer output from the previous example (test score vs. row #):


Predictor   Coef      SE Coef   T       P
Constant    85.706    4.239     20.22   0.000
Row         -1.1171   0.9472    -1.18   0.248

S = 10.0673   R-Sq = 4.7%   R-Sq(adj) = 1.3%

a) Identify the standard error of the slope, $SE_b$, from the computer output. Interpret this value in context.

b) Calculate a 95% confidence interval for the true slope. Show your work. Interpret the interval in context.

c) Based on your interval, is there convincing evidence that seat location affects scores?
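One way to carry out the arithmetic for part (b) is sketched below. It assumes n = 30 (the number of scores listed in the seating example), so df = 28, and it takes the critical value t* = 2.048 from a t table for 95% confidence. The function name `slope_confidence_interval` is made up for this illustration.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for b +/- t* x SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin


b, se_b = -1.1171, 0.9472   # Coef and SE Coef for Row, from the computer output
n = 30                      # assumption: 30 scores are listed in the example
df = n - 2
t_star = 2.048              # t-table critical value, 95% confidence, df = 28

lower, upper = slope_confidence_interval(b, se_b, t_star)
print(f"df = {df}, 95% CI for the true slope: ({lower:.3f}, {upper:.3f})")
print("0 is inside the interval:", lower < 0 < upper)
```

Because 0 is a plausible value for the true slope under this interval, the interval by itself does not give convincing evidence that seat location affects scores.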

Example: For their second-semester project, two AP Statistics students decided to investigate the effect of

sugar on the life of cut flowers. They went to the local grocery store and randomly selected 12 carnations. All

the carnations seemed equally healthy when they were selected. When the students got home, they prepared 12

identical vases with exactly the same amount of water in each vase. They put one tablespoon of sugar in 3

vases, two tablespoons of sugar in 3 vases, and 3 tablespoons of sugar in 3 vases. In the remaining vases, they

put no sugar. After the vases were prepared and placed in the same location, the students randomly assigned one

flower to each vase and observed how many hours each flower continued to look fresh. Here are the data along

with computer output from a least-squares regression analysis:


Predictor       Coef      SE Coef   T       P
Constant        181.200   3.635     49.84   0.000
Sugar (Tbsp.)   15.200    1.943     7.82    0.000

S = 7.52596   R-Sq = 86.0%   R-Sq(adj) = 84.5%

Sugar (Tbsp.) Freshness (hrs.)

0 168

0 180

0 192

1 192

1 204

1 204

2 204

2 210

2 210

3 222

3 228

3 234

[Figure: scatterplot, Freshness (hrs.) vs. Sugar (Tbsp.)]

[Figure: residual plot, Residual vs. Sugar (Tbsp.)]

[Figure: histogram of the residuals, Frequency vs. Residual]

a) Construct and interpret a 99% confidence interval for the slope of the true regression line.

b) Would you feel confident predicting the hours of freshness if 10 tablespoons of sugar are used? Explain.
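The arithmetic for these two parts can be sketched as follows, assuming the conditions for inference are met. With n = 12 flowers, df = 10, and the t-table critical value for 99% confidence is t* = 3.169. The second line simply plugs x = 10 into the fitted equation from the computer output.

```latex
\begin{align*}
b \pm t^{*}\,SE_b &= 15.200 \pm 3.169(1.943) \\
                  &= 15.200 \pm 6.157 \\
                  &= (9.043,\ 21.357) \\[6pt]
\hat{y} &= 181.200 + 15.200(10) = 333.2 \text{ hours}
\end{align*}
```

The interval lies entirely above 0. The prediction of 333.2 hours, however, is an extrapolation: the data only cover 0 to 3 tablespoons of sugar, so there is no evidence that the linear pattern continues out to 10 tablespoons.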

Significance Tests about the True Slope, β:

Hypotheses:

$H_0: \beta = 0$ (There is no true linear relationship between x and y.)

$H_a: \beta > 0$ (There is a positive correlation – y increases as x increases)

-or- $\beta < 0$ (There is a negative correlation – y decreases as x increases)

-or- $\beta \neq 0$ (There is a linear relationship – y changes when x changes)

Test Statistic: $t = \dfrac{b}{SE_b}$ with n − 2 degrees of freedom.

P-value: The probability of getting a t statistic this large or larger in the direction specified by the alternative

hypothesis.

• The P-value given by computer output is for a two-sided test. For a one-sided test, divide it by two, provided the sample slope b falls on the side specified by the alternative hypothesis.
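To see where the T and P columns of the computer output come from, here is a minimal sketch that computes t = b / SE_b and then finds tail areas under the t curve. In practice the P-value comes from a table or technology; the numerical integration (Simpson's rule) used here is only a stand-in, and the helper names `t_density`, `upper_tail`, and `p_value` are hypothetical. The numbers are from the seating example's output, with df = 28 assumed from the 30 listed scores.

```python
import math


def t_density(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3


def p_value(t, df, alternative):
    """P-value for H0: beta = 0 against the stated alternative."""
    tail = upper_tail(abs(t), df)
    if alternative == "two-sided":
        return 2 * tail
    if alternative == "greater":        # Ha: beta > 0
        return tail if t >= 0 else 1 - tail
    if alternative == "less":           # Ha: beta < 0
        return tail if t <= 0 else 1 - tail
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


b, se_b, df = -1.1171, 0.9472, 28       # Row line of the seating output
t = b / se_b
two_sided = p_value(t, df, "two-sided")
one_sided = p_value(t, df, "less")      # b is negative, matching Ha: beta < 0

print(f"t = {t:.2f}")
print(f"two-sided P = {two_sided:.3f}, one-sided P = {one_sided:.3f}")
```

The one-sided P-value is half the two-sided value only because the sample slope is negative, in the direction of the alternative. With the alternative "greater", the same t statistic gives a P-value above 0.5.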

Example: Do customers who stay longer at buffets give larger tips? An AP Statistics student who worked at an

Asian buffet decided to investigate this question. While doing her job as a hostess, she obtained a random

sample of receipts, which included the length of time (in minutes) the party was in the restaurant and the

amount of the tip (in dollars). Do these data provide convincing evidence that customers who stay longer give

larger tips? Here are the data and computer output.


Predictor     Coef      SE Coef   T      P
Constant      4.535     1.657     2.74   0.021
Time (min.)   0.03013   0.02448   1.23   0.247

S = 1.77931   R-Sq = 13.2%   R-Sq(adj) = 4.5%

a) Describe what the scatterplot tells you about the relationship between the two variables.

b) What is the equation of the least-squares regression line for predicting the amount of the tip from the

length of the stay? Define any variables you use.

c) Interpret the slope and the y-intercept of the least-squares regression line in context.

d) Carry out an appropriate test to determine whether the data provide convincing evidence that customers

who stay longer give larger tips. Assume the conditions for inference have been met.

Time (min.) Tip ($)

23 5.00

39 2.75

44 7.75

55 5.00

61 7.00

65 8.88

67 9.01

70 5.00

74 7.29

85 7.50

90 6.00

99 6.50

[Figure: scatterplot, Tip ($) vs. Time (min.)]

[Figure: residual plot, Residual vs. Time (min.)]

[Figure: histogram of the residuals, Frequency vs. Residual]
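The calculations behind parts (b) and (d) can be sketched directly from the 12 receipts. The code below recomputes the slope, intercept, s, $SE_b$, and t from the data table, then halves the two-sided P-value printed in the computer output (0.247). The example does not state a significance level, so α = 0.05 is an assumption here, and the variable names are illustrative.

```python
import math

# Receipt data from the table in this example.
minutes = [23, 39, 44, 55, 61, 65, 67, 70, 74, 85, 90, 99]
tips = [5.00, 2.75, 7.75, 5.00, 7.00, 8.88, 9.01, 5.00, 7.29, 7.50, 6.00, 6.50]

n = len(minutes)
x_bar = sum(minutes) / n
y_bar = sum(tips) / n
sxx = sum((x - x_bar) ** 2 for x in minutes)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(minutes, tips))

b = sxy / sxx                       # slope
a = y_bar - b * x_bar               # y-intercept
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(minutes, tips))
s = math.sqrt(sse / (n - 2))        # standard deviation of the residuals
se_b = s / math.sqrt(sxx)           # equivalent to s / (s_x * sqrt(n - 1))
t = b / se_b                        # test statistic, df = n - 2

two_sided_p = 0.247                 # P for Time (min.) from the computer output
# Ha: beta > 0, so halve the P-value only if the sample slope is positive.
one_sided_p = two_sided_p / 2 if b > 0 else 1 - two_sided_p / 2
significance_level = 0.05           # assumption: not stated in the example

print(f"tip-hat = {a:.3f} + {b:.5f}(time),  s = {s:.5f}")
print(f"SE_b = {se_b:.5f},  t = {t:.2f},  df = {n - 2}")
print(f"one-sided P = {one_sided_p:.4f}")
print("reject H0:", one_sided_p < significance_level)
```

Under these assumptions the one-sided P-value is larger than 0.05, so the data do not provide convincing evidence that customers who stay longer give larger tips.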

Example: A random sample of 11 used Honda CR-Vs from the 2002-2006 model years was selected from the

inventory at www.carmax.com. The number of miles driven and the advertised price were recorded for each

CR-V. A 95% confidence interval for the slope of the true least-squares regression line for predicting advertised

price from number of miles (in thousands) driven is (−122.3, −50.1). Based on this interval, what conclusion should we draw from a test of $H_0: \beta = 0$ versus $H_a: \beta \neq 0$ at the α = 0.05 significance level?
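The link between a 95% confidence interval and a two-sided test at α = 0.05 can be written as a one-line check: reject $H_0$ exactly when the interval does not contain the null value. The function name `reject_from_interval` below is hypothetical, and the sketch assumes the confidence level and significance level match (C = 1 − α).

```python
def reject_from_interval(lower, upper, null_value=0.0):
    """Two-sided test decision from a confidence interval with C = 1 - alpha:
    reject H0 exactly when the interval excludes the null value."""
    return not (lower <= null_value <= upper)


# 95% interval for the CR-V slope, paired with a two-sided test at alpha = 0.05.
reject_h0 = reject_from_interval(-122.3, -50.1)
print("reject H0: beta = 0 ->", reject_h0)
```

Here 0 is not in the interval, so we would reject $H_0$ and conclude there is convincing evidence of a linear relationship between miles driven and advertised price for these CR-Vs.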
