Ch6. Sampling distribution

Dr. Deshi Ye
yedeshi@zju.edu.cn

Outline

Population and sample
The sampling distribution of the mean ($\sigma$ known)
The sampling distribution of the mean ($\sigma$ unknown)
The sampling distribution of the variance

Statistics

Descriptive statistics

Inferential statistics

Remarks: many thanks to Paul Resnick for some slides

Inferential Statistics

1. Involves: estimation and hypothesis testing
2. Purpose: make inferences about population characteristics

[Figure label: Population?]

Inference Process

[Diagram with four labeled stages: Population, Sample, Sample statistic ($\bar{X}$), Estimates & tests]

Key terms

Population: all items of interest
Sample: portion of the population
Parameter: summary measure about the population
Statistic: summary measure about the sample

6.1 Population and Sample

Population: we refer to a population in terms of its probability distribution or frequency distribution. "Population f(x)" means a population described by a frequency distribution or a probability distribution f(x).

A population might be infinite, so that it is impossible to observe all its values. Even when it is finite, it may be impractical or uneconomical to observe all of it.

Sample

Sample: a part of a population.

Random samples (why do we need them?): results from a sample are useful only if the sample is in some way "representative" of the population.

Negative examples: judging the performance of a tire when it is tested only on smooth roads; estimating family incomes from data on homeowners only.

Sampling

Representative sample: same characteristics as the population
Random sample: every subset of the population of the same size has an equal chance of being selected

Random sample

Random sample: a set of observations $X_1, X_2, \ldots, X_n$ constitutes a random sample of size n from a finite population of size N if its values are chosen so that each subset of n of the N elements of the population has the same probability of being selected.
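The definition of a random sample from a finite population can be sketched in Python. This is only an illustration: the five-item population and the seed are assumptions made for the example, not part of the slides. `itertools.combinations` lists every subset of size n, and `random.sample` draws one of them without replacement so that each subset is equally likely.

```python
import itertools
import math
import random

population = ["a", "b", "c", "d", "e"]  # hypothetical finite population, N = 5
n = 2                                   # sample size

# Every subset of size n is a possible sample; there are C(N, n) of them.
subsets = list(itertools.combinations(population, n))
prob_each = 1 / math.comb(len(population), n)

# random.sample draws n distinct elements; each subset is equally likely.
rng = random.Random(0)  # seeded only to make the sketch repeatable
sample = rng.sample(population, n)

print(len(subsets), prob_each, sample)
```

With N = 5 and n = 2 there are 10 possible subsets, so each one should be selected with probability 1/10.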

Discussion

Ways of ensuring that the selection of a sample is at least approximately random
For both finite and infinite populations

6.2 The sampling distribution of the mean ($\sigma$ known)

Suppose a random sample of n observations is taken and its mean $\bar{x}$ is computed.

Suppose another random sample of n observations is taken and its mean $\bar{x}$ is also computed.

Probably no two of these means are alike.

Suppose There's a Population ...

Population size: N = 4
Random variable x: number of errors in work
Values of x: 1, 2, 3, 4
All values equally likely
Estimate based on a sample of two

Checklist

What is the experiment corresponding to the random variable $X$?
What is the experiment corresponding to the random variable $\bar{X}$?
What is "the sampling distribution of the mean"?

Population Characteristics

Population distribution
[Figure: bar chart of P(X) against X = 1, 2, 3, 4, all values equally likely]

Summary measures

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = 2.5$$

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 1.12$$
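As a quick check of the two summary measures, here is a minimal Python sketch using the population values 1, 2, 3, 4 from the slides. The variable names are chosen for the illustration only. Note that the population standard deviation divides by N, not N - 1.

```python
import math

values = [1, 2, 3, 4]   # population values from the slides, equally likely
N = len(values)

mu = sum(values) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / N)  # divide by N, not N - 1

print(mu, round(sigma, 2))
```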

All Possible Samples of Size n = 2

16 samples (sampling with replacement)

1st Obs    2nd Observation
           1      2      3      4
1          1,1    1,2    1,3    1,4
2          2,1    2,2    2,3    2,4
3          3,1    3,2    3,3    3,4
4          4,1    4,2    4,3    4,4

All Possible Samples of Size n = 2

16 sample means, one for each of the 16 samples (sampling with replacement)

1st Obs    2nd Observation
           1      2      3      4
1          1.0    1.5    2.0    2.5
2          1.5    2.0    2.5    3.0
3          2.0    2.5    3.0    3.5
4          2.5    3.0    3.5    4.0

Sampling Distribution of All Sample Means

[Figure: bar chart of the sampling distribution, $P(\bar{X})$ against $\bar{X}$ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, built from the 16 sample means]
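The sampling distribution can be tabulated by enumerating the 16 equally likely samples and counting how often each mean occurs. The following Python sketch does this with exact fractions; the names `samples`, `means`, and `dist` are chosen for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement: 4**2 = 16 of them.
samples = list(product(values, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Each sample is equally likely, so P(mean) = count / 16.
counts = Counter(means)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in dist.items():
    print(float(m), p)
```

The most likely mean is 2.5, which occurs in 4 of the 16 samples; the extremes 1.0 and 4.0 each occur only once.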

Comparison

Population: $\mu = 2.5$, $\sigma = 1.12$
[Figure: bar chart of P(X) against X = 1, 2, 3, 4]

Sampling distribution: $\mu_{\bar{x}} = 2.5$, $\sigma_{\bar{x}} = 0.79$
[Figure: bar chart of $P(\bar{X})$ against $\bar{X}$ = 1, 1.5, 2, 2.5, 3, 3.5, 4]
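The comparison can be reproduced numerically. This Python sketch (variable names are illustrative) computes the mean and standard deviation of the 16 sample means and sets them beside the population values. For sampling with replacement, the standard deviation of the sample mean should equal $\sigma/\sqrt{n}$.

```python
import math
from itertools import product

values = [1, 2, 3, 4]
n = 2

mu = sum(values) / len(values)
sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

# Sample means over all 16 equally likely samples (with replacement).
means = [sum(s) / n for s in product(values, repeat=n)]
mu_xbar = sum(means) / len(means)
sigma_xbar = math.sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

print(mu_xbar, round(sigma_xbar, 2), round(sigma / math.sqrt(n), 2))
```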

EX

Suppose that 50 random samples of size n = 10 are to be taken from a population having the discrete uniform distribution

$$f(x) = \begin{cases} \dfrac{1}{10} & \text{for } x = 0, 1, 2, \ldots, 9 \\ 0 & \text{elsewhere} \end{cases}$$

Sampling is with replacement, so to speak, so that we are sampling from an infinite population.

Sample means

We get 50 samples whose means are

4.4  3.2  5.0  3.5  4.1  4.4  3.6  6.5  5.3  4.4
3.1  5.3  3.8  4.3  3.3  5.0  4.9  4.8  3.1  5.3
3.0  3.0  4.6  5.8  4.6  4.0  3.7  5.2  3.7  3.8
5.3  5.5  4.8  6.4  4.9  6.5  3.5  4.5  4.9  5.3
3.6  2.7  4.0  5.0  2.6  4.2  4.4  5.6  4.7  4.3
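The 50 listed means can be summarized and set beside what theory predicts for the mean of 10 draws from the discrete uniform distribution on 0, 1, ..., 9. The sketch below is an illustration only: it uses Python's `statistics` module, and the comparison is informal rather than a formal test.

```python
import statistics

means = [
    4.4, 3.2, 5.0, 3.5, 4.1, 4.4, 3.6, 6.5, 5.3, 4.4,
    3.1, 5.3, 3.8, 4.3, 3.3, 5.0, 4.9, 4.8, 3.1, 5.3,
    3.0, 3.0, 4.6, 5.8, 4.6, 4.0, 3.7, 5.2, 3.7, 3.8,
    5.3, 5.5, 4.8, 6.4, 4.9, 6.5, 3.5, 4.5, 4.9, 5.3,
    3.6, 2.7, 4.0, 5.0, 2.6, 4.2, 4.4, 5.6, 4.7, 4.3,
]
n = 10  # size of each sample

# Population: discrete uniform on 0..9.
mu = sum(range(10)) / 10
var = sum((x - mu) ** 2 for x in range(10)) / 10

observed_mean = statistics.mean(means)
observed_var = statistics.variance(means)   # sample variance of the 50 means

print(round(observed_mean, 3), round(observed_var, 3))
print(mu, var / n)   # theoretical mean and variance of the sample mean
```

The population has mean 4.5 and variance 8.25, so the sample mean has theoretical mean 4.5 and variance 8.25/10 = 0.825; the observed values should be in that neighborhood.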

Theorem

If a random sample of size n is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable whose distribution has the mean $\mu$.

For samples from infinite populations the variance of this distribution is

$$\frac{\sigma^2}{n}$$

For samples of size n taken without replacement from a finite population of size N the variance is

$$\frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$$
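The finite-population formula can be checked by brute force on a small case. The sketch below reuses the population 1, 2, 3, 4 with n = 2, this time sampling without replacement; that pairing is a choice made for this illustration, not an example worked in the slides.

```python
from itertools import combinations

values = [1, 2, 3, 4]     # finite population, N = 4
N, n = len(values), 2

mu = sum(values) / N
var = sum((x - mu) ** 2 for x in values) / N   # population variance

# All C(4, 2) = 6 equally likely samples drawn without replacement.
means = [sum(s) / n for s in combinations(values, n)]
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

formula = (var / n) * (N - n) / (N - 1)
print(mean_of_means, var_of_means, formula)
```

The enumerated variance and the formula should both come out to 5/12, smaller than the with-replacement value $\sigma^2/n = 0.625$.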

Central limit theorem

If $\bar{X}$ is the mean of a sample of size n taken from a population having the mean $\mu$ and the finite variance $\sigma^2$, then

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

is a random variable whose distribution function approaches that of the standard normal distribution as $n \to \infty$.

Central Limit Theorem

As the sample size gets large enough ($n \ge 30$), the sampling distribution of $\bar{X}$ becomes almost normal, with

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
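One way to see the central limit theorem at work without simulation is to compute the exact distribution of the sample mean by repeated convolution and compare a probability with its standard normal counterpart. The sketch below assumes the discrete uniform population on 0, 1, ..., 9 from the earlier example; the helper names `pmf_of_sum` and `prob_abs_z_at_most` are invented for this illustration.

```python
import math

def pmf_of_sum(pmf, n):
    """Exact pmf of the sum of n independent draws, by repeated convolution."""
    total = [1.0]
    for _ in range(n):
        new = [0.0] * (len(total) + len(pmf) - 1)
        for i, p in enumerate(total):
            for j, q in enumerate(pmf):
                new[i + j] += p * q
        total = new
    return total

pmf = [0.1] * 10            # discrete uniform on 0, 1, ..., 9
mu = 4.5
sigma = math.sqrt(8.25)

def prob_abs_z_at_most(n, z0):
    """P(|Z| <= z0) for Z = (Xbar - mu) / (sigma / sqrt(n)), computed exactly."""
    se = sigma / math.sqrt(n)
    return sum(p for k, p in enumerate(pmf_of_sum(pmf, n))
               if abs((k / n - mu) / se) <= z0)

normal_value = math.erf(1.96 / math.sqrt(2))   # P(|Z| <= 1.96), standard normal
for n in (1, 5, 30):
    print(n, round(prob_abs_z_at_most(n, 1.96), 4), round(normal_value, 4))
```

For n = 1 every value of the population lies within 1.96 standard deviations of the mean, so the probability is 1; as n grows the exact probability should move toward the normal value of about 0.95.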

EX

A 1-gallon can of paint covers on average 513.3 square feet, with a standard deviation of 31.5 square feet.

Question: what is the probability that the sample mean area covered by a sample of 40 of these 1-gallon cans will be anywhere from 510 to 520 square feet?

Solution

We shall have to find the normal curve area between

$$z = \frac{510 - 513.3}{31.5/\sqrt{40}} = -0.6625 \quad \text{and} \quad z = \frac{520 - 513.3}{31.5/\sqrt{40}} = 1.34$$

Check from the cumulative standard normal distribution table:

$$F(1.34) = 0.9099, \qquad F(-0.6625) = 0.2538$$

Hence, the probability is $F(1.34) - F(-0.6625) = 0.6561$.
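The same calculation can be done without a table. This Python sketch builds the standard normal cumulative distribution function from `math.erf`; the helper name `phi` is chosen for the illustration. Because it does not round z to two decimals before the lookup, its answer may differ from the table-based 0.6561 in the third decimal place.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 513.3, 31.5, 40
se = sigma / math.sqrt(n)        # standard deviation of the sample mean

z_low = (510 - mu) / se
z_high = (520 - mu) / se
prob = phi(z_high) - phi(z_low)

print(round(z_low, 4), round(z_high, 4), round(prob, 4))
```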

Another example

You're an operations analyst for AT&T. Long-distance telephone calls are normally distributed with $\mu$ = 8 min and $\sigma$ = 2 min. If you select random samples of 25 calls, what percentage of the sample means would be between 7.8 and 8.2 minutes?

Solution

Sampling distribution: $\bar{X}$ is normal with $\mu_{\bar{x}} = 8$ and $\sigma_{\bar{x}} = 0.4$.

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{2/\sqrt{25}} = -0.50, \qquad Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{8.2 - 8}{2/\sqrt{25}} = 0.50$$

Standardized normal distribution ($\sigma = 1$): the area between $Z = -0.50$ and $Z = 0.50$ is $0.1915 + 0.1915 = 0.3830$, so about 38.3% of the sample means fall between 7.8 and 8.2 minutes.
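An alternative to standardizing by hand is to work with the sampling distribution of the mean directly. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25
xbar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))   # sigma_xbar = 0.4

share = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(f"{100 * share:.2f}% of sample means fall between 7.8 and 8.2 minutes")
```

This skips the Z transformation but rests on the same fact: the mean of 25 calls is normal with mean 8 and standard deviation 0.4.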

6.3 The sampling distribution of the mean ($\sigma$ unknown)

If n is large, it doesn't matter whether $\sigma$ is known or not, as it is reasonable in that case to substitute for it the sample standard deviation s.

Question: what if n is small?

We need to make the assumption that the sample comes from a normal population.

Assumption: the population has a normal distribution

If $\bar{X}$ is the mean of a random sample of size n taken from a normal population having the mean $\mu$ and the variance $\sigma^2$, and

$$S^2 = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{n - 1},$$

then

$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

is a random variable having the t distribution with the parameter $\nu = n - 1$.

t-distribution

[Figure: density curve of the t distribution with the critical value $t_\alpha(n)$ marked, where $P(t > t_\alpha(n)) = \alpha$]

EX.

A manufacturer of fuses claims that with a 20% overload, the fuses will blow in 12.4 minutes on the average. To test this claim, a sample of 20 of the fuses was subjected to a 20% overload, and the times it took them to blow had a mean of 10.63 minutes and a standard deviation of 2.48 minutes. Assume that the data constitute a random sample from a normal population.

Question: do they tend to support or refute the manufacturer's claim?

Solution

First, we calculate

$$t = \frac{10.63 - 12.4}{2.48/\sqrt{20}} = -3.19$$

Rule to reject the claim: the t value is larger than 2.86 or less than -2.86, where $P(t > 2.86) = 0.005$ and $P(t < -2.86) = 0.005$ for $\nu = 19$.

Since $t = -3.19$ is less than -2.86, the data tend to refute the manufacturer's claim.
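The arithmetic of this test is short enough to script. In the sketch below the critical value 2.86 is copied from the slide rather than computed, since the Python standard library has no t distribution; the variable names are illustrative.

```python
import math

claimed_mean = 12.4
xbar, s, n = 10.63, 2.48, 20

t = (xbar - claimed_mean) / (s / math.sqrt(n))
critical = 2.86   # tabulated value from the slide, 19 degrees of freedom

reject = abs(t) > critical
print(round(t, 2), reject)
```

Here `reject` is true when the t value falls outside the interval from -2.86 to 2.86, which is the rejection rule stated above.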

6.4 The sampling distribution of the variance

Theorem 6.4. If $S^2$ is the variance of a random sample of size n taken from a normal population having the variance $\sigma^2$, then

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

is a random variable having the chi-square distribution with the parameter $\nu = n - 1$.
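Theorem 6.4 can be illustrated by simulation. The sketch below draws many samples from a normal population and checks two known properties of a chi-square variable with $\nu$ degrees of freedom: mean $\nu$ and variance $2\nu$. The population parameters, sample size, seed, and number of repetitions are all assumptions made for this illustration.

```python
import random
import statistics

rng = random.Random(1)        # seeded so the sketch is repeatable
mu, sigma, n = 10.0, 2.0, 5   # hypothetical normal population and sample size
reps = 20000

stats = []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)          # sample variance, divisor n - 1
    stats.append((n - 1) * s2 / sigma ** 2)

# A chi-square variable with nu degrees of freedom has mean nu and variance 2 * nu.
print(round(statistics.mean(stats), 2), round(statistics.pvariance(stats), 2))
```

With n = 5 the simulated statistic should average close to 4 with variance close to 8. Agreement of two moments is a sanity check, not a proof that the distribution is chi-square.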

Chi-square distribution

[Figure: density curve of the chi-square distribution with the critical value $\chi^2_\alpha(n)$ marked]

F distribution

Theorem. If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$, respectively, taken from two normal populations having the same variance, then

$$F = \frac{S_1^2}{S_2^2}$$

is a random variable having the F distribution with the parameters $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
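The variance-ratio theorem can also be illustrated by simulation. The sketch below uses the known fact that an F variable with $\nu_2 > 2$ has mean $\nu_2/(\nu_2 - 2)$, whatever the common variance and the two population means are. The sample sizes, standard deviation, population means, and seed are hypothetical values picked for the example.

```python
import random
import statistics

rng = random.Random(2)   # seeded so the sketch is repeatable
sigma = 3.0              # common standard deviation (hypothetical)
n1, n2 = 6, 12           # hypothetical sample sizes
reps = 20000

ratios = []
for _ in range(reps):
    # The two populations share a variance but may have different means.
    s1_sq = statistics.variance([rng.gauss(0.0, sigma) for _ in range(n1)])
    s2_sq = statistics.variance([rng.gauss(5.0, sigma) for _ in range(n2)])
    ratios.append(s1_sq / s2_sq)

nu2 = n2 - 1
# For nu2 > 2, an F variable has mean nu2 / (nu2 - 2).
print(round(statistics.mean(ratios), 3), round(nu2 / (nu2 - 2), 3))
```

The two printed numbers should be close; matching the mean is only a rough check on the distribution of the ratio.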

F distribution

[Figure: density curve of the F distribution with the critical value $F_\alpha(n_1, n_2)$ marked]

Thanks!