

Chapter 8 – continued

Chapter 8: Sampling distributions of estimators

Sections:
8.1 Sampling distribution of a statistic
8.2 The Chi-square distributions
8.3 Joint Distribution of the sample mean and sample variance (Skip: p. 476 - 478)
8.4 The t distributions (Skip: derivation of the pdf, p. 483 - 484)
8.5 Confidence intervals
8.6 Bayesian Analysis of Samples from a Normal Distribution
8.7 Unbiased Estimators
8.8 Fisher Information

Sampling Distributions 1 / 30


Review from Sections 8.1 - 8.4

Chi-square distribution: χ²_m, the same as the Gamma(α = m/2, β = 1/2) distribution.

The t_m distribution: If Y ∼ χ²_m and Z ∼ N(0, 1) are independent, then Z/√(Y/m) ∼ t_m.

Let X₁, . . . , X_n be a random sample from N(µ, σ²).

If µ is known but σ is not:

n σ̂₀²/σ² ∼ χ²_n where σ̂₀² = (1/n) Σ_{i=1}^n (X_i − µ)²

If both (µ, σ) are unknown:

(n/σ²) S_n ∼ χ²_{n−1} where S_n = (1/n) Σ_{i=1}^n (X_i − X̄_n)²

√n (X̄_n − µ)/σ′ ∼ t_{n−1} where σ′ = [ Σ_{i=1}^n (X_i − X̄_n)² / (n − 1) ]^{1/2}

Sampling Distributions 2 / 30


8.5 Confidence intervals

Confidence Interval – A frequentist tool

Say we want to estimate θ, or in general g(θ). We also want to know “how good” that estimate is.

Def: Confidence Interval (CI)

Let X₁, . . . , X_n be a random sample from f(x|θ), where θ is unknown (but not random). Let g(θ) be a real-valued function and let A and B be statistics such that

P (A < g(θ) < B) ≥ γ ∀θ .

The random interval (A, B) is called a 100γ% confidence interval for g(θ). If the inequality holds with equality (“=”), the CI is exact.

After the random variables X₁, . . . , X_n have been observed and the values of A = a and B = b have been computed, the interval (a, b) is called the observed confidence interval.

Sampling Distributions 3 / 30


8.5 Confidence intervals

Confidence Interval - Mean of a Normal Distribution

Last time we saw the following example. Let X₁, . . . , X_n be a random sample from N(µ, σ²).

Let

X̄_n = (1/n) Σ_{i=1}^n X_i and σ′ = [ Σ_{i=1}^n (X_i − X̄_n)² / (n − 1) ]^{1/2}

Then we know that

U = √n (X̄_n − µ) / σ′

has the t_{n−1} distribution. We can therefore calculate γ = P(−c < U < c). Turning this around, we get

γ = P( X̄_n − c σ′/√n < µ < X̄_n + c σ′/√n )

Sampling Distributions 4 / 30


8.5 Confidence intervals

Confidence Interval - Mean of a Normal Distribution

Let T_m(x) denote the cdf of the t_m distribution. Given γ we can find c so that P(−c < U < c) = γ:

γ = P(−c < U < c) = 2 T_{n−1}(c) − 1

since the t distribution is symmetric around 0. Solving for c we get

c = T⁻¹_{n−1}( (γ + 1)/2 )

where T⁻¹_{n−1} is the quantile function of the t_{n−1} distribution. So a 100γ% confidence interval for µ is

( X̄_n − T⁻¹_{n−1}((γ + 1)/2) σ′/√n , X̄_n + T⁻¹_{n−1}((γ + 1)/2) σ′/√n )

Sampling Distributions 5 / 30


8.5 Confidence intervals

Example – Hotdogs (Exercise 8.5.7 in the book)

Data on calorie content in 20 different beef hot dogs from Consumer Reports (June 1986 issue):

186,181,176,149,184,190,158,139,175,148,152,111,141,153,190,157,131,149,135,132

Assume that these numbers are observed values from a random sample of twenty independent N(µ, σ²) random variables, where µ and σ² are unknown.

Observed sample mean and σ′ are

X̄_n = 156.85 and σ′ = 22.64201

Find a 95% confidence interval for µ
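As a sketch of the computation (assuming scipy is available for the t quantile; a t table with t_{0.975, 19} ≈ 2.093 gives the same answer):

```python
# 95% t-based confidence interval for the mean calorie content.
import math
from statistics import mean, stdev
from scipy.stats import t

calories = [186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
            152, 111, 141, 153, 190, 157, 131, 149, 135, 132]

n = len(calories)
x_bar = mean(calories)         # 156.85
sigma_prime = stdev(calories)  # 22.64201; stdev divides by n - 1

gamma = 0.95
c = t.ppf((gamma + 1) / 2, n - 1)            # T^{-1}_{n-1}((gamma + 1)/2)
half_width = c * sigma_prime / math.sqrt(n)

lower, upper = x_bar - half_width, x_bar + half_width
print(round(lower, 2), round(upper, 2))      # 146.25 167.45
```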

Sampling Distributions 6 / 30


8.5 Confidence intervals

Interpretation of a confidence interval
Confidence intervals are a Frequentist tool

We know that

P( X̄_n − T⁻¹_{n−1}((γ + 1)/2) σ′/√n < µ < X̄_n + T⁻¹_{n−1}((γ + 1)/2) σ′/√n ) = γ

After observing the data we obtain the observed interval.
For example: (146.25, 167.45) is an observed 95% confidence interval for µ.
That does NOT mean that P(146.25 < µ < 167.45) = 0.95.
For this statement to make sense we need Bayesian thinking and Bayesian methods.

Sampling Distributions 7 / 30


8.5 Confidence intervals

Interpretation of a confidence interval
Confidence intervals are a Frequentist tool

One way of thinking of this: repeated samples.
Take a random sample of size n from N(µ, σ²) and calculate the 95% confidence interval.
Take another random sample (of the same size n) and do the same calculations.
Repeat. Many times.
Since each random interval covers the value of µ with probability 95%, we expect about 95% of the intervals to cover the actual value of µ.

Problem: We never take more than one sample!

Sampling Distributions 8 / 30


8.5 Confidence intervals

Properties of a confidence interval - Simulation Study

I simulated n = 20 r.v. from N(8, 2²) and calculated the 95% CI.
I repeated that 100 times.
4 of the 100 intervals do not cover µ = 8 (red intervals).
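This experiment is easy to reproduce. The script below is a hypothetical re-implementation (numpy and scipy assumed), not the code behind the original figure:

```python
# Draw 100 samples of size 20 from N(8, 2^2); for each, compute the
# 95% t-interval for mu and count how many intervals cover mu = 8.
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(0)
mu, sigma, n, reps = 8.0, 2.0, 20, 100
c = t.ppf(0.975, n - 1)

covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    half = c * x.std(ddof=1) / np.sqrt(n)   # ddof=1: divide by n - 1
    if x.mean() - half < mu < x.mean() + half:
        covered += 1

print(covered, "of", reps, "intervals cover mu")  # typically about 95
```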

Sampling Distributions 9 / 30


8.5 Confidence intervals

Non-symmetric confidence intervals
Mean of the normal distribution

More generally we want to find

P(c1 < U < c2) = γ

Symmetric confidence interval: Equal probability on either side:

P(U ≤ c₁) = P(U ≥ c₂) = (1 − γ)/2

Since the distribution of U is symmetric around 0, the shortest possible confidence interval for µ is the symmetric one.
One-sided confidence interval: All the extra probability is on one side.
That is, either c₁ = −∞ or c₂ = ∞.

Sampling Distributions 10 / 30


8.5 Confidence intervals

One-sided Confidence Interval

Def: Lower bound
Let A be a statistic so that

P(A < g(θ)) ≥ γ ∀θ

The random interval (A, ∞) is a one-sided 100γ% confidence interval for g(θ).
A is a 100γ% lower confidence limit for g(θ).

Sampling Distributions 11 / 30


8.5 Confidence intervals

One-sided Confidence Interval

Def: Upper bound
Let B be a statistic so that

P(g(θ) < B) ≥ γ ∀θ

The random interval (−∞, B) is a one-sided 100γ% confidence interval for g(θ).
B is a 100γ% upper confidence limit for g(θ).

Sampling Distributions 12 / 30


8.5 Confidence intervals

One-sided Confidence Interval - Mean of a normal

Let X₁, . . . , X_n be a random sample from N(µ, σ²), both µ and σ² unknown.
Find the one-sided 100γ% confidence intervals for µ.

Find the observed 95% upper confidence limit for µ for the hotdog example.
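A sketch of the numerical answer (scipy assumed for the t quantile):

```python
# Observed 95% upper confidence limit for mu in the hotdog example:
# the one-sided interval is (-inf, X_bar + t_{0.95, n-1} * sigma' / sqrt(n)).
import math
from statistics import mean, stdev
from scipy.stats import t

calories = [186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
            152, 111, 141, 153, 190, 157, 131, 149, 135, 132]
n = len(calories)

upper = mean(calories) + t.ppf(0.95, n - 1) * stdev(calories) / math.sqrt(n)
print(round(upper, 2))  # 165.6
```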

Sampling Distributions 13 / 30


8.5 Confidence intervals

Confidence intervals for other distributions

Def: Pivotal quantity
Let X = (X₁, . . . , X_n) be a random sample from a distribution that depends on a parameter θ. Let V(X, θ) be a random variable whose distribution is the same for all θ. Then V is called a pivotal quantity.

To use this we need to be able to invert the pivotal relationship: find a function r(v, x) so that

r(V (X, θ),X) = g(θ).

If the function r is increasing in v for every x, V has a continuous distribution with cdf F(v), and γ₂ − γ₁ = γ, then

A = r( F⁻¹(γ₁), X ) and B = r( F⁻¹(γ₂), X )

are the endpoints of an exact 100γ% confidence interval (Theorem 8.5.3).

Sampling Distributions 14 / 30


8.5 Confidence intervals

Confidence interval using Pivotal quantities

Example: The rate parameter θ of the exponential distribution
X₁, . . . , X_n i.i.d. Expo(θ)
Find the 100γ% upper confidence limit for θ
Find a symmetric 100γ% confidence interval for θ
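For the exponential example, a standard pivot is 2θ Σ X_i ∼ χ²_{2n} (since θX_i ∼ Expo(1) and twice an Expo(1) variable is χ²₂). The sketch below inverts it on simulated data (numpy and scipy assumed; the data are illustrative, not from the book):

```python
# Exact equal-tailed CI for the exponential rate theta from the pivot
# 2 * theta * sum(X_i) ~ chi-square with 2n degrees of freedom.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
theta_true, n, gamma = 2.0, 50, 0.95
x = rng.exponential(scale=1 / theta_true, size=n)  # numpy uses scale = 1/rate

s = x.sum()
lower = chi2.ppf((1 - gamma) / 2, 2 * n) / (2 * s)
upper = chi2.ppf((1 + gamma) / 2, 2 * n) / (2 * s)
print(round(lower, 3), round(upper, 3))  # interval brackets the MLE n/s
```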

Example: Variance of the normal distribution
X₁, . . . , X_n i.i.d. N(µ, σ²), both unknown.
Find a symmetric 100γ% confidence interval for σ²
Find the observed symmetric 100γ% confidence interval for σ² for the hotdog example
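For the normal-variance example, the pivot is Σ(X_i − X̄_n)²/σ² ∼ χ²_{n−1}. A sketch for the hotdog data (scipy assumed for the chi-square quantiles):

```python
# Symmetric (equal-tailed) 95% CI for sigma^2 via the chi-square pivot.
from statistics import mean
from scipy.stats import chi2

calories = [186, 181, 176, 149, 184, 190, 158, 139, 175, 148,
            152, 111, 141, 153, 190, 157, 131, 149, 135, 132]
n = len(calories)
x_bar = mean(calories)
ss = sum((x - x_bar) ** 2 for x in calories)   # sum of squared deviations

gamma = 0.95
lower = ss / chi2.ppf((1 + gamma) / 2, n - 1)  # larger quantile -> lower end
upper = ss / chi2.ppf((1 - gamma) / 2, n - 1)
print(round(lower, 1), round(upper, 1))        # roughly (296.5, 1093.6)
```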

Sampling Distributions 15 / 30


8.5 Confidence intervals

Problems with interpretation of a confidence interval

Example 8.5.11 is an interesting example.
Say X₁, X₂ are i.i.d. Uniform(θ − 0.5, θ + 0.5).
Let Y₁ = min(X₁, X₂) and Y₂ = max(X₁, X₂).
Then (Y₁, Y₂) is a 50% confidence interval for θ.
However: if we observe values of Y₁ and Y₂ that are more than 0.5 apart, that is y₂ − y₁ > 0.5, then we know for certain that (y₁, y₂) contains θ! Yet we still assign only 50% “confidence” to that interval, which ignores information we have.

Sampling Distributions 16 / 30


Chapter 8 – continued

Chapter 8: Sampling distributions of estimators

Sections:
8.1 Sampling distribution of a statistic
8.2 The Chi-square distributions
8.3 Joint Distribution of the sample mean and sample variance (Skip: p. 476 - 478)
8.4 The t distributions (Skip: derivation of the pdf, p. 483 - 484)
8.5 Confidence intervals
8.6 Bayesian Analysis of Samples from a Normal Distribution
8.7 Unbiased Estimators
8.8 Fisher Information

Sampling Distributions 17 / 30


8.7 Unbiased Estimators

Unbiased Estimators

Suppose that we use an estimator δ(X) to estimate the quantity g(θ).
Properties of an estimator (so far): consistency and sufficiency.
Another property of an estimator: unbiasedness.

Def: Unbiased Estimator / BiasAn estimator δ(X) is an unbiased estimator of g(θ) if

E(δ(X)) = g(θ) ∀θ .

Otherwise it is called a biased estimator. The bias is defined as

E(δ(X))− g(θ)

Sampling Distributions 18 / 30


8.7 Unbiased Estimators

Examples

X₁, . . . , X_n i.i.d. N(µ, σ²): X̄_n is an unbiased estimator of µ since E(X̄_n) = µ for all µ.

Unbiased estimators of the mean and variance of any distribution:
Let X₁, . . . , X_n be a random sample from f(x|θ). The mean and variance of the distribution (if they exist) are functions of θ.
X̄_n is an unbiased estimator of the mean E(X₁).

Theorem 8.7.1: If the variance is finite then σ̂₁² is an unbiased estimator of Var(X), where

σ̂₁² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄_n)²

Note: This means that the MLE of σ² in N(µ, σ²) is a biased estimator.

Sampling Distributions 19 / 30


8.7 Unbiased Estimators

Mean Square Error (MSE)

Is unbiased good enough?
An unbiased estimator is useless if it has high variance.
Look for unbiased estimators with the lowest variance.
Mean squared error: MSE = E[ (δ(X) − g(θ))² ]

Want estimators with small MSE.

Corollary 8.7.1

Let δ(X) be an estimator with finite variance. Then

MSE(δ(X)) = Var(δ(X)) + [bias(δ(X))]²

⇒ the MSE of an unbiased estimator is equal to its variance.

Searching for unbiased estimators with small variance is equivalent to searching for unbiased estimators with small MSE.

Sampling Distributions 20 / 30


8.7 Unbiased Estimators

Example

Let X₁, . . . , X_n be a random sample from N(µ, σ²) (both µ and σ² are unknown).

Consider two estimators of σ2

δ₁ = S_n (the MLE of σ²)
δ₂ = σ̂₁² (unbiased)

Find the MSE of each estimator.
Which estimator has smaller MSE?
Which estimator do you prefer?
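A Monte Carlo sanity check of the answer (numpy assumed; one can show MSE(δ₁) = (2n − 1)σ⁴/n² and MSE(δ₂) = 2σ⁴/(n − 1), so the biased MLE has the smaller MSE):

```python
# Compare the MSEs of delta1 = S_n (divide by n, the MLE) and
# delta2 = sigma_hat_1^2 (divide by n - 1, unbiased) by simulation.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, reps = 0.0, 4.0, 10, 200_000

x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

mse1 = np.mean((ss / n - sigma2) ** 2)        # MSE of the MLE
mse2 = np.mean((ss / (n - 1) - sigma2) ** 2)  # MSE of the unbiased estimator

print(mse1 < mse2)  # True: the biased MLE has the smaller MSE here
```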

Sampling Distributions 21 / 30


8.7 Unbiased Estimators

Why unbiased?

Sounds good – who wants to be “biased”?
However, variance and MSE are better measures of the quality of an estimator.
In many cases there exist biased estimators with smaller MSE.

Sampling Distributions 22 / 30


8.8 Fisher Information

Let the pdf of X be f(x|θ).
The Fisher information I(θ) in the random variable X is defined as

I(θ) = E{ [ d log f(X|θ) / dθ ]² }

Under mild conditions, we have (Theorem 8.8.1)

I(θ) = Var[ d log f(X|θ) / dθ ] = −E[ d² log f(X|θ) / dθ² ]

For a random sample X₁, . . . , X_n, the Fisher information I_n(θ) satisfies

I_n(θ) = n I(θ)

Sampling Distributions 23 / 30


8.8 Fisher Information

Cramér–Rao Inequality
Let X₁, . . . , X_n be a random sample from a distribution whose pdf is f(x|θ). For any statistic T, let m(θ) = E(T). Then under mild conditions, we have

Var(T) ≥ [m′(θ)]² / (n I(θ)).

(Corollary 8.8.1) If T is an unbiased estimator of θ, then

Var(T) ≥ 1 / (n I(θ)).

An estimator is called an efficient estimator of its expectation if it achieves the lower bound in the Cramér–Rao inequality.
Example: X₁, . . . , X_n is a random sample from Poisson(θ). Show that the MLE is an efficient estimator of θ.
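For Poisson(θ), log f(x|θ) = x log θ − θ − log x!, so d log f/dθ = x/θ − 1 and I(θ) = Var(X)/θ² = 1/θ. The Cramér–Rao bound for an unbiased estimator of θ is therefore θ/n, which is exactly Var(X̄_n), so the MLE X̄_n is efficient. A quick simulation check (numpy assumed):

```python
# Check that Var(X_bar) matches the Cramer-Rao bound theta/n for Poisson data.
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 25, 200_000

x_bar = rng.poisson(theta, size=(reps, n)).mean(axis=1)

empirical_var = x_bar.var()
cr_bound = theta / n   # = 1 / (n * I(theta)), with I(theta) = 1/theta

print(round(empirical_var, 3), cr_bound)  # both close to 0.12
```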

Sampling Distributions 24 / 30


8.8 Fisher Information

Asymptotic Distributions of MLE

Theorem 8.8.5

Let θ̂_n be the MLE of θ. Then under mild conditions, we have

[n I(θ)]^{1/2} (θ̂_n − θ) →d N(0, 1).

The MLE is asymptotically efficient.

Sampling Distributions 25 / 30


8.8 Fisher Information

Chapter 8: Sampling distributions of estimators

Sections:
8.1 Sampling distribution of a statistic
8.2 The Chi-square distributions
8.3 Joint Distribution of the sample mean and sample variance (Skip: p. 476 - 478)
8.4 The t distributions (Skip: derivation of the pdf, p. 483 - 484)
8.5 Confidence intervals
8.6 Bayesian Analysis of Samples from a Normal Distribution
8.7 Unbiased Estimators
8.8 Fisher Information

Sampling Distributions 26 / 30


8.6 Bayesian Analysis of Samples from a Normal Distribution

Bayesian alternative to confidence intervals

Bayesian inference is based on the posterior distribution.
Reporting a whole distribution may not be what you (or your client) want.
Point estimates: Bayesian estimators minimize the expected loss.
Interval estimates: simply use quantiles of the posterior distribution.
For example: We can find constants c₁ and c₂ so that

P(c₁ < θ < c₂ | X = x) ≥ γ

The interval (c₁, c₂) is called a 100γ% credible interval for θ.
Note: The interpretation is very different from the interpretation of confidence intervals.

Sampling Distributions 27 / 30


8.6 Bayesian Analysis of Samples from a Normal Distribution

Example: the normal distribution

Let X₁, . . . , X_n be a random sample from N(µ, σ²).

In Chapter 7.3 we saw:
If σ² is known, the normal distribution is a conjugate prior for µ.
Theorem 7.3.3: If the prior is µ ∼ N(µ₀, ν₀²), the posterior of µ is also normal, with mean and variance

µ₁ = (σ² µ₀ + n ν₀² x̄_n) / (σ² + n ν₀²) and ν₁² = σ² ν₀² / (σ² + n ν₀²)

We can obtain credible intervals for µ from this N(µ₁, ν₁²) posterior distribution.
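A sketch of the computation, using the hotdog sample summaries with made-up prior parameters and an illustrative “known” σ² (stdlib only; every number below other than n and x̄_n is an assumption for the example):

```python
# Posterior N(mu1, nu1^2) for mu when sigma^2 is known (Theorem 7.3.3),
# then an equal-tailed 95% credible interval for mu.
from statistics import NormalDist

n, x_bar = 20, 156.85        # hotdog sample size and mean
sigma2 = 512.66              # "known" variance: illustrative value only
mu0, nu0_sq = 150.0, 100.0   # hypothetical prior: mu ~ N(150, 10^2)

mu1 = (sigma2 * mu0 + n * nu0_sq * x_bar) / (sigma2 + n * nu0_sq)
nu1_sq = sigma2 * nu0_sq / (sigma2 + n * nu0_sq)

z = NormalDist().inv_cdf(0.975)
lo, hi = mu1 - z * nu1_sq ** 0.5, mu1 + z * nu1_sq ** 0.5
print(round(mu1, 2), (round(lo, 1), round(hi, 1)))  # 155.45 (146.6, 164.3)
```

Unlike a confidence interval, here P(146.6 < µ < 164.3 | data) = 0.95 is a legitimate posterior probability statement.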

Sampling Distributions 28 / 30


8.6 Bayesian Analysis of Samples from a Normal Distribution

Example: the normal distribution

What if both µ and σ² are unknown?
Use the joint distribution of µ and σ² as the prior.
Conjugate priors are available: the Normal-Inverse-Gamma distribution.
To give credible intervals for µ and σ² individually we need the marginal posterior distributions.

Sampling Distributions 29 / 30


8.6 Bayesian Analysis of Samples from a Normal Distribution

END OF CHAPTER 8

Sampling Distributions 30 / 30