Introduction to Probability and Statistics
Chapter 7
Ammar M. Sarhan
Department of Mathematics and Statistics, Dalhousie University
Fall Semester 2008


Chapter 7: Statistical Intervals Based on a Single Sample


Confidence Intervals

An alternative to reporting a single value for the parameter being estimated is to calculate and report an entire interval of plausible values – a confidence interval (CI). A confidence level is a measure of the degree of reliability of the interval.


7.1 Basic Properties of Confidence Intervals

The population is N(μ, σ²) and σ is known. If, after observing X1 = x1, …, Xn = xn, we compute the observed sample mean $\bar{x}$, then a 95% confidence interval for μ can be expressed as

$$\left(\bar{x} - 1.96\cdot\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\cdot\frac{\sigma}{\sqrt{n}}\right)$$
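The confidence level describes how often the interval procedure captures μ over repeated sampling. The following is a minimal simulation sketch of that idea, not part of the original slides; the values of μ, σ, n, the number of trials, and the function names are illustrative assumptions.

```python
import math
import random


def z_interval_95(sample, sigma):
    """95% CI for the mean of a normal population with known sigma."""
    n = len(sample)
    xbar = sum(sample) / n
    margin = 1.96 * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated samples whose 95% interval contains mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = z_interval_95(sample, sigma)
        if lo < mu < hi:
            hits += 1
    return hits / trials


# Hypothetical population: mu = 10, sigma = 2, samples of size 25.
print(coverage(mu=10.0, sigma=2.0, n=25, trials=2000))
```

The printed fraction should be close to 0.95, which is what "95% confidence" refers to: the long-run success rate of the method, not a probability statement about any single computed interval.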


Other Levels of Confidence

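A (1 − α)100% confidence interval for the mean μ of a normal population when the value of σ is known is given by

$$\left(\bar{x} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right)$$

where $z_{\alpha/2}$ is the value with area α/2 to its right under the standard normal curve.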
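For levels other than 95%, the critical value $z_{\alpha/2}$ can be looked up in a normal table or computed. A small sketch using Python's standard library follows; the function names and the sample summary (x̄ = 80, σ = 2, n = 25) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def z_alpha_over_2(alpha):
    """Value with area alpha/2 to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(xbar, sigma, n, alpha):
    """(1 - alpha)100% CI for mu when sigma is known."""
    margin = z_alpha_over_2(alpha) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary data, shown at three confidence levels.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, z_interval(xbar=80.0, sigma=2.0, n=25, alpha=alpha))
```

Raising the confidence level (smaller α) increases $z_{\alpha/2}$ and therefore widens the interval.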


Sample Size

The general formula for the sample size n necessary to ensure an interval width w is

$$n = \left(2 z_{\alpha/2}\cdot\frac{\sigma}{w}\right)^2$$
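The sample-size formula can be evaluated as in the sketch below. The function name and the inputs (σ = 25, w = 10) are assumptions for illustration. The result is rounded up because n must be an integer and rounding down would give a width slightly larger than w.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, width, alpha=0.05):
    """Smallest n for which the z interval has width at most `width`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((2 * z * sigma / width) ** 2)


# Hypothetical inputs: sigma = 25, desired width 10, 95% confidence.
print(sample_size_for_mean(sigma=25.0, width=10.0))
```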


Deriving a Confidence Interval

Let X1, …, Xn denote the sample on which the CI for the parameter θ is to be based. Suppose a random variable satisfying the following properties can be found:

1. The variable depends functionally on both X1, …, Xn and θ.
2. The probability distribution of the variable does not depend on θ or any other unknown parameters.

Let h(X1, …, Xn; θ) denote this random variable. In general, the form of h is usually suggested by examining the distribution of an appropriate estimator $\hat{\theta}$.

For any α, 0 < α < 1, constants a and b can be found to satisfy

$$P\big(a < h(X_1,\dots,X_n;\theta) < b\big) = 1-\alpha$$


Now suppose that the inequalities can be manipulated to isolate θ:

$$P\big(l(X_1,\dots,X_n) < \theta < u(X_1,\dots,X_n)\big) = 1-\alpha$$

Then $l(X_1,\dots,X_n)$ is the lower confidence limit and $u(X_1,\dots,X_n)$ is the upper confidence limit for a 100(1 − α)% CI.
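As an illustration of this recipe (a sketch for the normal case with known σ from Section 7.1, not an example taken from the slides), take $h = (\bar{X}-\mu)/(\sigma/\sqrt{n})$, which has a N(0, 1) distribution whatever the value of μ, and choose $a = -z_{\alpha/2}$, $b = z_{\alpha/2}$. Isolating μ gives:

```latex
P\left(-z_{\alpha/2} < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1-\alpha
\quad\Longleftrightarrow\quad
P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha
```

Here the lower and upper confidence limits are $l = \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $u = \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$, which is the interval of Section 7.1.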


Example 7.5 (p. 260) will be given in class.


7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion

Large-Sample Confidence Interval

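Let X1, …, Xn denote a sample from a population having mean μ and standard deviation σ. If n is sufficiently large, then approximately

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \qquad Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$$

This implies that

$$\left(\bar{X} - z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}\right)$$

is a large-sample confidence interval for μ with level approximately 100(1 − α)%.

This formula is valid regardless of the shape of the population distribution.

In practice: n > 40.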


Example 7.6 (p. 264). Suppose a random sample of size 48 from a population with unknown mean μ and unknown variance σ² gives the following information:

$$\sum x_i = 2626 \quad\text{and}\quad \sum x_i^2 = 144{,}950$$

Find the 95% confidence interval for μ.

Solution: From the information given in the problem, we have:

$$\bar{x} = 54.7 \quad\text{and}\quad s = 5.23$$

Notice that if σ is unknown, we replace it with the sample standard deviation s. That is, the interval becomes

$$\left(\bar{X} - z_{\alpha/2}\cdot\frac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\cdot\frac{s}{\sqrt{n}}\right)$$

Then, the 95% confidence interval for μ is

$$\left(54.7 - 1.96\cdot\frac{5.23}{\sqrt{48}},\ 54.7 + 1.96\cdot\frac{5.23}{\sqrt{48}}\right) = (53.2,\ 56.2)$$

That is, with a confidence level of approximately 95%, 53.2 < μ < 56.2.
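The arithmetic in this example can be checked with a short script. This is a sketch with illustrative function names. It recovers $\bar{x}$ and s from the two sums using the computational formula $s^2 = \left(\sum x_i^2 - (\sum x_i)^2/n\right)/(n-1)$, then forms the large-sample interval.

```python
import math


def summary_from_sums(n, sum_x, sum_x2):
    """Sample mean and sample standard deviation from sum(x) and sum(x^2)."""
    xbar = sum_x / n
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return xbar, math.sqrt(s2)


def large_sample_interval(xbar, s, n, z=1.96):
    """Large-sample CI for mu with sigma replaced by s (default z is for 95%)."""
    margin = z * s / math.sqrt(n)
    return xbar - margin, xbar + margin


xbar, s = summary_from_sums(48, 2626, 144950)
lo, hi = large_sample_interval(xbar, s, 48)
print(round(xbar, 1), round(s, 2), round(lo, 1), round(hi, 1))
```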


Confidence Interval for a Population Proportion

Let p denote the proportion of “successes” in a population, where success identifies an individual or object that has a specified property. A random sample of n individuals is to be selected, and X is the number of successes in the sample. A confidence interval for a population proportion p with level 100(1- α)% is:

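$$\left(\frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + z_{\alpha/2}^2/n},\ \ \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + z_{\alpha/2}^2/n}\right)$$

where $\hat{p} = X/n$ and $\hat{q} = 1 - \hat{p}$.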


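Notice that if the sample size is quite large, the approximate CI limits become

$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$$

since $z_{\alpha/2}^2/(2n)$ is negligible compared to $\hat{p}$.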

Sample Size

The general formula for the sample size n necessary to ensure an interval width w is

$$n \approx \frac{4 z_{\alpha/2}^2\,\hat{p}\hat{q}}{w^2}$$


Example 7.8 (p. 267). Suppose that in 48 trials in a particular laboratory, 16 resulted in ignition of a particular type of substrate by a lighted cigarette. Find the 95% confidence interval for the long-run proportion of all such trials that would result in ignition.

Solution: Let p denote the long-run proportion of all such trials that would result in ignition. With n = 48,

$$\hat{p} = \frac{X}{n} = \frac{16}{48} = \frac{1}{3} = 0.333, \qquad \hat{q} = 0.667$$

Then, the approximate 95% CI for p is

$$\frac{0.333 + \dfrac{(1.96)^2}{2(48)} \pm 1.96\sqrt{\dfrac{(0.333)(0.667)}{48} + \dfrac{(1.96)^2}{4(48)^2}}}{1 + (1.96)^2/48} = \frac{0.373 \pm 0.139}{1.08} = (0.217,\ 0.474)$$


Notice that the traditional 95% CI is

$$0.333 \pm 1.96\sqrt{\frac{(0.333)(0.667)}{48}} = (0.200,\ 0.466)$$
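Both intervals for the ignition data can be reproduced numerically. The sketch below uses illustrative function names and the unrounded value p̂ = 16/48, so the last digit may differ slightly from hand calculations that round p̂ to 0.333.

```python
import math


def score_interval(x, n, z=1.96):
    """CI for p that keeps the z^2/n correction terms (default z is for 95%)."""
    p = x / n
    q = 1 - p
    center = p + z ** 2 / (2 * n)
    half = z * math.sqrt(p * q / n + z ** 2 / (4 * n ** 2))
    denom = 1 + z ** 2 / n
    return (center - half) / denom, (center + half) / denom


def traditional_interval(x, n, z=1.96):
    """Large-sample CI for p: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half


print(score_interval(16, 48))
print(traditional_interval(16, 48))
```

With n = 48 the two intervals differ noticeably, because the $z_{\alpha/2}^2/n$ terms are not yet negligible at this sample size.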

Example 7.9 (p. 267). Find the sample size necessary to ensure a width of 0.10 for the 95% confidence interval for the long-run proportion of all such trials that would result in ignition.

$$n \approx \frac{4 z_{\alpha/2}^2\,\hat{p}\hat{q}}{w^2} = \frac{4(1.96)^2(0.333)(0.667)}{(0.10)^2} = 341.305$$

Rounding up so that the width does not exceed 0.10 gives n = 342.
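The sample-size calculation for a proportion can be sketched as follows. The function name is illustrative. The result is rounded up to the next integer, since rounding down would leave the interval slightly wider than the target w.

```python
import math


def sample_size_for_proportion(p_hat, width, z=1.96):
    """Approximate n so the traditional CI for p has width at most `width`."""
    return math.ceil(4 * z ** 2 * p_hat * (1 - p_hat) / width ** 2)


# Ignition example: p_hat = 0.333, target width 0.10, 95% confidence.
print(sample_size_for_proportion(0.333, 0.10))

# With no prior estimate, p_hat = 0.5 maximizes p_hat * q_hat and is the
# most conservative choice.
print(sample_size_for_proportion(0.5, 0.10))
```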


7.3 Intervals Based on a Normal Population Distribution

The population of interest is normal, so that X1, …, Xn constitutes a random sample from a normal distribution with both μ and σ² unknown.

t Distribution

Let X1, …, Xn ~ N(μ, σ²); then the rv

$$T = \frac{\bar{X}-\mu}{S/\sqrt{n}}$$

has a probability distribution called a t distribution with n − 1 degrees of freedom (df).


Properties of t Distributions

Let $t_v$ denote the density function curve for v df.

1. Each $t_v$ curve is bell-shaped and centered at 0.

2. Each $t_v$ curve is spread out more than the standard normal (z) curve.

3. As v increases, the spread of the corresponding $t_v$ curve decreases.

4. As v → ∞, the sequence of $t_v$ curves approaches the standard normal curve (the z curve is called a t curve with df = ∞).

[Figure: the z curve, the $t_{25}$ curve, and the $t_5$ curve, all centered at 0.]


t Critical Value

Let $t_{\alpha,v}$ be the number on the measurement axis for which the area under the t curve with v df to the right of $t_{\alpha,v}$ is α; $t_{\alpha,v}$ is called a t critical value.

Table A.5, p. 671, gives the t critical value for given α and v. For example,

$t_{0.025,15} = 2.131$, $t_{0.05,22} = 1.717$, $t_{0.01,22} = 2.508$.
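The definition of a t critical value as a right-tail area can be turned into a computation directly. The following is a standard-library-only sketch, not the method used to build Table A.5: it integrates the t density numerically with Simpson's rule and finds the critical value by bisection. The function names, step count, and search bounds are assumptions.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_area(t, df, steps=2000):
    """Area under the t curve to the right of t (t > 0), by Simpson's rule."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The curve is symmetric about 0, so the area right of 0 is 0.5.
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Value with right-tail area alpha (0 < alpha < 0.5), found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


for alpha, df in [(0.025, 15), (0.05, 22), (0.01, 22)]:
    print(alpha, df, round(t_critical(alpha, df), 3))
```

In practice a statistics library or the printed table would be used instead. The point of the sketch is only that $t_{\alpha,v}$ is determined by the area condition.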


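t Confidence Interval

Now, let X1, …, Xn be a random sample from a normal distribution with both μ and σ² unknown. Then the (1 − α)100% CI for μ is

$$\left(\bar{X} - t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,n-1}\cdot\frac{S}{\sqrt{n}}\right)$$

or

$$\bar{X} \pm t_{\alpha/2,n-1}\cdot\frac{s}{\sqrt{n}}$$

Here, $\bar{X}$ and S are the sample mean and sample standard deviation.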


Example 7.12 (p. 274). Consider the following sample of fat content (in percentage) of 10 randomly selected hot dogs:

25.2 21.3 22.8 17.0 29.8 21.0 25.5 16.0 20.9 19.5

Find the 95% confidence interval for the population mean fat content, assuming the population is normal.

Solution: We have n = 10, $\bar{x} = 21.90$, s = 4.134, and α/2 = 0.025.

Then, the 95% confidence interval for μ is

$$\left(21.90 - t_{0.025,10-1}\cdot\frac{4.134}{\sqrt{10}},\ 21.90 + t_{0.025,10-1}\cdot\frac{4.134}{\sqrt{10}}\right)$$

$$= \left(21.90 - 2.262\cdot\frac{4.134}{\sqrt{10}},\ 21.90 + 2.262\cdot\frac{4.134}{\sqrt{10}}\right)$$

$$= (18.94,\ 24.86)$$
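The hot dog interval can be recomputed from the raw data as a check. This sketch uses Python's `statistics` module for $\bar{x}$ and s and takes the critical value 2.262 from the example rather than computing it; the function and variable names are illustrative.

```python
import math
import statistics

fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]
T_CRIT = 2.262  # t critical value for alpha/2 = 0.025 and 9 df, as in the example


def t_interval(data, t_crit):
    """One-sample t CI for the mean, given the critical value."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


lo, hi = t_interval(fat, T_CRIT)
print(round(statistics.mean(fat), 2), round(statistics.stdev(fat), 3))
print(round(lo, 2), round(hi, 2))
```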


Summary

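| The random sample | σ | (1 − α)100% CI of μ |
|---|---|---|
| X1, …, Xn ~ N(μ, σ²) | σ is known | $\bar{X} \pm z_{\alpha/2}\cdot\sigma/\sqrt{n}$ |
| X1, …, Xn ~ N(μ, σ²) | σ is unknown | $\bar{X} \pm t_{\alpha/2,n-1}\cdot S/\sqrt{n}$ |
| X1, …, Xn ~ any population with mean μ and variance σ² | σ is known (n ≥ 30) | $\bar{X} \pm z_{\alpha/2}\cdot\sigma/\sqrt{n}$ |
| X1, …, Xn ~ any population with mean μ and variance σ² | σ is unknown (n ≥ 30) | $\bar{X} \pm z_{\alpha/2}\cdot S/\sqrt{n}$ |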