Lecture 3: Review Review of Point and Interval Estimators Statistical Significance Hypothesis Testing


Lecture 3: Review

- Review of Point and Interval Estimators
- Statistical Significance
- Hypothesis Testing


Point Estimates

An estimator (as opposed to an estimate) is a sample statistic that predicts a value of a parameter. For example, the sample mean and sample standard deviation are estimators.

A point estimate is a particular value of an estimator used to predict the population value. For example, an estimate of the population mean from the sample mean may be “12”.


Point Estimates

Two characteristics of point estimators are their efficiency and bias.

An efficient estimator is one that has the lowest standard error relative to other estimators.

An estimator is biased to the extent that its sampling distribution is not centred on the population parameter.
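
The following is a minimal simulation sketch of efficiency, not part of the original slides. It assumes a normal population with hypothetical parameters (`mu=50`, `sigma=10`) and compares two estimators of the population centre, the sample mean and the sample median. Both sampling distributions are centred close to the population value, but the sample mean should show the smaller standard error.

```python
import random
import statistics


def simulate_estimators(n=25, reps=2000, mu=50.0, sigma=10.0, seed=0):
    """Draw `reps` samples of size `n` from a normal population and return,
    for each estimator, (centre of its sampling distribution, standard error)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return {
        "sample mean": (statistics.fmean(means), statistics.stdev(means)),
        "sample median": (statistics.fmean(medians), statistics.stdev(medians)),
    }


results = simulate_estimators()
for name, (centre, std_error) in results.items():
    print(f"{name}: centre = {centre:.2f}, standard error = {std_error:.2f}")
```

The population values and the number of replications are illustrative choices only; any normal population should give the same ordering of standard errors.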


Efficiency and Bias

[Figure: sampling distributions of three estimators: a biased estimator, an unbiased but inefficient estimator, and an efficient estimator.]


Point Estimates

Population Mean: $\mu = \sum Y_i / N$

Sample Mean: $\bar{y} = \sum y_i / n$

Sample Mean as a point estimate of population mean: $\hat{\mu} = \bar{y} = \sum y_i / n$


Point Estimates

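Sample Standard Deviation: $s = \sqrt{\sum (y_i - \bar{y})^2 / n}$

Population Standard Deviation: $\sigma = \sqrt{\sum (Y_i - \mu)^2 / N}$

Sample Standard Deviation as estimate of Population Standard Deviation (dividing by $n - 1$ rather than $n$): $\hat{\sigma} = \sqrt{\sum (y_i - \bar{y})^2 / (n - 1)}$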
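
As a rough illustration of these point estimates, the sketch below computes a sample mean and a standard deviation in plain Python. The data values and the function names are hypothetical, and the `ddof` argument is an assumption introduced here to switch between dividing by n (`ddof=0`) and by n - 1 (`ddof=1`).

```python
import math


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


def sample_sd(values, ddof=1):
    """Square root of the summed squared deviations divided by (n - ddof)."""
    m = sample_mean(values)
    sum_sq = sum((v - m) ** 2 for v in values)
    return math.sqrt(sum_sq / (len(values) - ddof))


heights = [160.0, 165.0, 170.0, 175.0, 180.0]  # hypothetical data, in cm

print(f"mean = {sample_mean(heights):.1f}")
print(f"sd (divide by n - 1) = {sample_sd(heights):.3f}")
print(f"sd (divide by n)     = {sample_sd(heights, ddof=0):.3f}")
```

The two versions differ noticeably for a sample this small and converge as n grows.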


Interval Estimates

Confidence intervals for a mean

A confidence interval is an interval estimate around a mean. It is a range of values that is expected to contain the population mean (or another parameter, such as a proportion) with a stated level of confidence.


95% Confidence Interval of a Mean

[Figure: sampling distribution of means. The 95% confidence interval covers the central 95% of the area, between z = -1.96 and z = +1.96, leaving 2.5% of the area in each tail.]


Confidence Interval of a Mean

$C.I. = \bar{y} \pm Z \hat{\sigma}_{\bar{y}}$

Z is the z-score corresponding to the confidence probability;

$\hat{\sigma}_{\bar{y}} = s / \sqrt{n}$ is an estimate of the standard error of the mean (for large samples);

and s is the sample standard deviation.


Confidence Interval of a Mean

Example: you wish to estimate the average height of all Canadian men aged 18-21 from a sample of 50 with a mean height of 166 cm and a standard deviation of 30 cm. Calculate the 95% confidence interval for the mean.

$\hat{\sigma}_{\bar{y}} = s / \sqrt{n} = 30 / \sqrt{50} = 4.24$

$C.I._{(95)} = \bar{y} \pm Z \hat{\sigma}_{\bar{y}}$

$C.I._{(95)} = 166 \pm 1.96(4.24) = 166 \pm 8.31$

$157.7 \le \mu \le 174.3$

We are 95% confident that the true population mean falls in the interval from 157.7 cm to 174.3 cm.
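
The calculation in this example can be checked with a short sketch. The function name `ci_mean` is hypothetical, and the default `z=1.96` assumes a 95% confidence level and a sample large enough for the normal approximation.

```python
import math


def ci_mean(ybar, s, n, z=1.96):
    """Large-sample confidence interval for a mean: ybar +/- z * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    margin = z * standard_error
    return ybar - margin, ybar + margin


# Height example: n = 50, sample mean 166 cm, sample s.d. 30 cm
low, high = ci_mean(ybar=166, s=30, n=50)
print(f"95% C.I.: {low:.1f} cm to {high:.1f} cm")
```

Passing a different `z` (for example 2.58) gives the interval for another confidence level.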


Point Estimates for Proportions

The sample proportion of individuals in category x: $p = X / n$

The population proportion of individuals with characteristic x: $\pi = X / N$

Sample proportion is an estimator of the population proportion: $\hat{\pi} = p = X / n$


Confidence Intervals for Proportions

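The standard deviation of the sample probability distribution: $s = \sqrt{p(1 - p)}$

The standard error of the sample proportion: $\hat{\sigma}_{\hat{\pi}} = \hat{\sigma} / \sqrt{n} = \sqrt{\hat{\pi}(1 - \hat{\pi}) / n}$

Confidence interval for a proportion: $C.I. = \hat{\pi} \pm Z \hat{\sigma}_{\hat{\pi}} = \hat{\pi} \pm Z \sqrt{\hat{\pi}(1 - \hat{\pi}) / n}$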


Confidence Intervals for Proportions

Example: Consider a poll predicting election results with a sample size of 1500. Of those who responded to a question about their voting intentions in an upcoming election, 840 (56%) answered that they would vote for the current governing party.

If we use the sample proportion as an estimator of the population proportion, then

$\hat{\pi} = 840 / 1500 = 0.56$

and the proportion that would not vote for the governing party is

$1 - \hat{\pi} = 1 - 0.56 = 0.44$


Confidence Intervals for Proportions

The standard error of the estimated proportion is then:

$\hat{\sigma}_{\hat{\pi}} = \sqrt{\hat{\pi}(1 - \hat{\pi}) / n} = \sqrt{(0.56)(0.44) / 1500} = \sqrt{0.000164} = 0.0128$


Confidence Intervals for Proportions

The 99% confidence interval is:

$C.I._{99} = \hat{\pi} \pm 2.58 \hat{\sigma}_{\hat{\pi}}$

$C.I._{99} = 0.56 \pm 2.58(0.0128) = 0.56 \pm 0.033$

We are 99% confident that the true population proportion lies between 0.527 and 0.593.
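
A minimal sketch of the same interval in code, using the poll figures from the example. The function name `ci_proportion` is hypothetical, and the z-score is passed in explicitly (2.58 for 99% confidence) rather than looked up.

```python
import math


def ci_proportion(successes, n, z):
    """Large-sample confidence interval for a proportion."""
    pi_hat = successes / n                              # sample proportion
    standard_error = math.sqrt(pi_hat * (1 - pi_hat) / n)
    margin = z * standard_error
    return pi_hat - margin, pi_hat + margin


# Poll example: 840 of 1500 respondents, 99% confidence (z = 2.58)
low, high = ci_proportion(successes=840, n=1500, z=2.58)
print(f"99% C.I.: {low:.3f} to {high:.3f}")
```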


Significance Tests


Significance Tests

Elements of significance tests:

(1) Assumptions:
- Type of data
- Distributions
- Sampling
- Sample size


Significance Tests

Elements of significance tests:

(2) Hypotheses:
- H0: There is no effect, relationship, or difference
- Ha: There is an effect, difference, or relationship


Significance Tests

Elements of significance tests:

(3) Test Statistic

(4) Significance level, critical value

(5) Decision whether or not to reject H0


Large Sample Z-Test for a Mean

Example:

You sample 400 high school teachers and submit them to an IQ test, which is known to have a population mean of 100. You wish to know whether the sample mean IQ of 115 is significantly different from the population mean. The sample s.d. is 18.


Large Sample Z-Test for a Mean

Assumptions: A z-test for a mean assumes random sampling and n greater than 30.

Hypotheses:

$H_0: \mu = 100$

$H_a: \mu \neq 100$


Large Sample Z-Test for a Mean

Test Statistic:

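$z = \frac{\bar{y} - \mu_0}{\hat{\sigma}_{\bar{y}}} = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$

$z = \frac{115 - 100}{18 / \sqrt{400}} = \frac{15}{0.9}$

$z_{test} = 16.67$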


Large Sample Z-Test for a Mean

Significance Level and Critical Value:

α = .05, .01, or .001

Find the appropriate critical values (z-scores) from the table of areas under the normal curve.

For a 2-tailed z-test at α = .05, zcritical = 1.96.

Decision:

Because ztest is higher than zcritical, we reject the null hypothesis and conclude that high school teachers do have significantly higher IQ scores than the general population.
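
The test statistic and decision for this example can be sketched as follows. The function name `z_test_mean` is hypothetical; the critical value 1.96 is the 2-tailed value at α = .05 given above.

```python
import math


def z_test_mean(ybar, mu0, s, n):
    """Large-sample z statistic: (ybar - mu0) / (s / sqrt(n))."""
    return (ybar - mu0) / (s / math.sqrt(n))


# Teacher IQ example: n = 400, sample mean 115, s.d. 18, hypothesized mean 100
z_test = z_test_mean(ybar=115, mu0=100, s=18, n=400)
z_critical = 1.96                      # 2-tailed, alpha = .05
reject_h0 = abs(z_test) > z_critical   # compare |z| with the critical value

print(f"z_test = {z_test:.2f}, reject H0: {reject_h0}")
```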


Critical Values

While in 231 we always found a critical value for a test statistic given our determined α-level, and compared it to our observed value, we can instead find the probability of observing the test statistic value, and compare it to our determined α-level.

Computer output gives us the exact probability (the p-value) associated with the test statistic, so we do not need to look up a critical value.
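
One way to see the p-value approach is to compute the tail probability of a z-score directly. The sketch below uses only the standard library; the function names are hypothetical, and the one-tailed result assumes the alternative hypothesis points in the direction of the observed z.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def p_value(z, two_tailed=True):
    """Probability of a z-score at least as extreme as the one observed."""
    tail = normal_upper_tail(abs(z))
    return 2 * tail if two_tailed else tail


alpha = 0.05
p = p_value(1.96)  # a z-score at the 2-tailed .05 critical value
print(f"p = {p:.4f}; compare with alpha = {alpha}")
```

Rejecting H0 when the p-value falls below α leads to the same decision as comparing the test statistic with the critical value.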


Large-Sample Z-tests for proportions

Example: You randomly sample 500 cattle and dairy farmers in Saskatchewan, asking them whether they have enough stored feed to last through a summer of severe drought. 275 report that they do not. You know that of the total population (all Canadian farmers), 50% do not have adequate stored feed. Is the proportion of Saskatchewan farmers that is unprepared significantly higher?


Large-Sample Z-tests for proportions

Assumptions:

Z-tests assume that the sample size is large enough that the sampling distribution will be approximately normal. In practice this means 30 cases or more. Random sampling is also assumed.

Hypotheses:

$H_0: \pi = \pi_0$, that is, $H_0: \pi = 0.50$

$H_a: \pi > \pi_0$, that is, $H_a: \pi > 0.50$


Large-Sample Z-tests for proportions

Test Statistic:

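$z = \frac{\hat{\pi} - \pi_0}{\sigma_{\hat{\pi}}}$, where $\sigma_{\hat{\pi}} = \sqrt{\pi_0(1 - \pi_0) / n}$

$z = \frac{0.55 - 0.50}{\sqrt{.50(1 - .50) / 500}} = \frac{0.05}{\sqrt{.25 / 500}} = \frac{0.05}{0.02236}$

$z_{test} = 2.236$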


Large-Sample Z-tests for proportions

Significance level or critical value:

If we look at the table of normal curve probabilities for the probability of finding a ztest value of about 2.23, we find that the one-tail probability is about 0.013.

This is greater than .01, so we would fail to reject the null hypothesis at the .01 level. However, it is less than .05, so we reject the null hypothesis at the .05 level.

Conclusion:

A significantly higher proportion of Saskatchewan farmers reported that they did not have enough stored feed to withstand a drought (p ≈ .013, one-tailed).
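
A short sketch of this test in code, assuming a one-tailed (upper) alternative as in the example. The function name `z_test_proportion` is hypothetical; the standard error is computed from the hypothesized proportion, not the sample proportion.

```python
import math


def z_test_proportion(successes, n, pi0):
    """Large-sample z statistic for a proportion, with the standard error under H0."""
    pi_hat = successes / n
    se_null = math.sqrt(pi0 * (1 - pi0) / n)
    return (pi_hat - pi0) / se_null


# Farmer example: 275 of 500 report inadequate feed; hypothesized proportion 0.50
z = z_test_proportion(successes=275, n=500, pi0=0.50)
p_one_tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z), upper tail only

print(f"z = {z:.3f}, one-tail p = {p_one_tail:.4f}")
print("reject H0 at .05:", p_one_tail < 0.05)
print("reject H0 at .01:", p_one_tail < 0.01)
```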


Small-Sample Inference for Means

The t-distribution

The t-distribution is another bell-shaped symmetrical distribution centred on 0. The t-distribution differs from the normal distribution in that as sample size decreases, the tails of the t-distribution become thicker than normal. However, when n ≥ 30, the t-distribution is practically the same as the normal distribution.
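
To illustrate the thicker tails, the sketch below compares the height of the t density with the standard normal density at a point in the tail (x = 3, an arbitrary choice) for several degrees of freedom. The density formulas are the standard ones; the function names are hypothetical.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


# Ratio of t density to normal density in the tail; it shrinks toward 1 as df grows
for df in (2, 5, 30, 100):
    ratio = t_pdf(3.0, df) / normal_pdf(3.0)
    print(f"df = {df:3d}: t density / normal density at x = 3 is {ratio:.2f}")
```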


Small-Sample Inference for Means

Example: Suppose we have a sample of 25 young offenders whom we have interviewed. We are interested in the effects of being born to young parents. The average age of the mothers at the birth of the future young offenders was 20.8, with a s.d. of 3.5 years. The average age at first birth for women in Canada in 1990 was 23.2 years. Is the average age of the mothers of young offenders different than that of mothers in the general Canadian population?


Small-Sample Inference for Means

Assumptions:

It is assumed that the sample is a simple random sample (SRS) from a population that is normally distributed (but the test is fairly robust to violations of this assumption).

Hypotheses:

$H_0: \mu = \mu_0$, that is, $H_0: \mu = 23.2$

$H_a: \mu \neq \mu_0$, that is, $H_a: \mu \neq 23.2$


Small-Sample Inference for Means

Test Statistic:

$t = \frac{\bar{y} - \mu_0}{\hat{\sigma}_{\bar{y}}} = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$

$t = \frac{20.8 - 23.2}{3.5 / \sqrt{25}} = \frac{-2.4}{0.7}$

$t_{actual} = -3.43$

Small-Sample Inference for Means

Significance level or critical value: We look for the probability of finding a t-value of -3.43, with (n-1) degrees of freedom.

If our chosen α-level is .05, we look for the tcritical value associated with a single-tail probability of .025 (half of the total 2-tailed probability).

tcritical at p=.05 (2-tailed), 24 df = 2.064.


Small-Sample Inference for Means

Decision: We decide to reject the null hypothesis, because the absolute value of the actual t is greater than our critical value of t (at p=.05). This means that the difference between the observed sample mean and the hypothesized population mean is sufficiently great that we are willing to conclude that the sample comes from a population with a different mean age at first birth than the Canadian population.
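
The small-sample test can be sketched in the same way. The function name `t_test_mean` is hypothetical. The critical value is taken from a t table (2.064 for 24 df, 2-tailed, at .05) and is not computed, since the standard library has no t-distribution functions.

```python
import math


def t_test_mean(ybar, mu0, s, n):
    """One-sample t statistic and its degrees of freedom (n - 1)."""
    t = (ybar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Young offenders example: n = 25, mean 20.8, s.d. 3.5, hypothesized mean 23.2
t_actual, df = t_test_mean(ybar=20.8, mu0=23.2, s=3.5, n=25)
t_critical = 2.064                       # from a t table: 24 df, 2-tailed, .05
reject_h0 = abs(t_actual) > t_critical   # 2-tailed test compares |t|

print(f"t = {t_actual:.2f}, df = {df}, reject H0: {reject_h0}")
```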


Significance Tests

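Note that the forms of the z- and t-statistics are similar:

Large-sample test for means: $z = \frac{\bar{y} - \mu_0}{\hat{\sigma}_{\bar{y}}} = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$

Large-sample test for proportions: $z = \frac{\hat{\pi} - \pi_0}{\sigma_{\hat{\pi}}} = \frac{\hat{\pi} - \pi_0}{\sqrt{\pi_0(1 - \pi_0) / n}}$

Small-sample test for means: $t = \frac{\bar{y} - \mu_0}{\hat{\sigma}_{\bar{y}}} = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$, with $df = (n - 1)$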


Decisions and Types of Errors


“Confusion” Matrix

                     H0 is true                H0 is false
Reject H0            Type I error (α)          Correct decision (1-β), or “Power”
Fail to reject H0    Correct decision (1-α)    Type II error (β)
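
The quantities in this table can be made concrete with a small calculation. The sketch below assumes a one-tailed (upper) large-sample z-test with a known standard deviation. The alternative mean of 103 is a hypothetical value chosen only for illustration, and the function names are not from the lecture.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def power_upper_tail(mu0, mu_a, sigma, n, z_critical=1.645):
    """Probability of rejecting H0 (upper-tail z-test) when the true mean is mu_a."""
    standard_error = sigma / math.sqrt(n)
    shift = (mu_a - mu0) / standard_error  # true mean in standard-error units
    return 1 - normal_cdf(z_critical - shift)


# Hypothetical setting: H0 mean 100, true mean 103, s.d. 18, n = 400, alpha = .05
power = power_upper_tail(mu0=100, mu_a=103, sigma=18, n=400)
beta = 1 - power  # probability of a Type II error

print(f"power (1 - beta) = {power:.3f}, Type II error rate (beta) = {beta:.3f}")
```

When the true mean equals the hypothesized mean, the same function returns the Type I error rate α, which ties the two columns of the table together.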


Type I and Type II Errors

[Figure: overlapping sampling distributions under H0 and under Ha, with the α and P-level regions marked.]