rodger-thornton
2/38
Outline
Population and sample
The sampling distribution of the mean (σ known)
The sampling distribution of the mean (σ unknown)
The sampling distribution of the variance
3/38
Statistics
Descriptive statistics
Inferential statistics
Remarks: many thanks to Paul Resnick for some slides
4/38
Inferential Statistics
1. Involves: estimation and hypothesis testing
2. Purpose: make inferences about population characteristics
5/38
Inference Process
Population
Sample
Sample statistic ($\bar{X}$)
Estimates & tests
6/38
Key terms
Population: all items of interest
Sample: a portion of the population
Parameter: a summary measure about the population
Statistic: a summary measure about the sample
7/38
6.1 Population and Sample
Population: we refer to a population in terms of its probability distribution or frequency distribution.
"Population f(x)" means a population described by a frequency distribution or probability distribution f(x).
A population may be infinite, in which case it is impossible to observe all its values. Even when it is finite, observing all of it may be impractical or uneconomical.
8/38
Sample
Sample: a part of a population.
Random samples (why do we need them?): results from a sample are useful only if the sample is in some way “representative” of the population.
Negative examples: judging the performance of a tire when it is tested only on smooth roads; estimating family incomes from data on home owners only.
9/38
Sampling
Representative sample: has the same characteristics as the population
Random sample: every subset of the population of a given size has an equal chance of being selected
10/38
Random sample
Random sample: a set of observations $X_1, X_2, \ldots, X_n$ constitutes a random sample of size n from a finite population of size N if its values are chosen so that each subset of n of the N elements of the population has the same probability of being selected.
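A minimal sketch of drawing a random sample in the sense of this definition, assuming Python's standard library. The population of N = 10 labelled items, the sample size, and the seed are illustrative assumptions, not values from the slides.

```python
import random

# Hypothetical finite population of N = 10 items (illustrative labels only).
population = list(range(1, 11))
n = 3

rng = random.Random(42)  # arbitrary seed so the sketch is repeatable

# random.sample draws n distinct elements without replacement, so every
# subset of size n has the same probability of being chosen.
sample = rng.sample(population, n)
print(sample)
```

Drawing without replacement matches the finite-population definition; sampling with replacement would instead correspond to the infinite-population case.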
11/38
Discussion
Ways of ensuring that the selection of a sample is at least approximately random
For both finite and infinite populations
12/38
6.2 The sampling distribution of the mean (σ known)
Take a random sample of n observations and compute its mean $\bar{x}$.
Take another random sample of n observations and compute its mean $\bar{x}$ as well.
Probably no two of these means are alike.
13/38
Suppose There’s a Population ...
Population Size, N = 4
Random Variable, x, Is # Errors in Work
Values of x: 1, 2, 3, 4
All values equally likely
Estimate based on a sample of two
14/38
Checking list
What is the experiment corresponding to the random variable $X$?
What is the experiment corresponding to the random variable $\bar{X}$?
What is “the sampling distribution of the mean”?
15/38
Population Characteristics
[Figure: Population Distribution, a bar chart of P(X) over X = 1, 2, 3, 4]
Summary Measures
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = 2.5$$
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 1.12$$
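The two summary measures can be checked numerically. This is a small sketch in Python; the function names are illustrative choices, and the only data used are the four population values from the slide.

```python
import math

# The population from the slide: N = 4 equally likely values.
population = [1, 2, 3, 4]

def population_mean(values):
    """Mean of a finite population: the sum of the values divided by N."""
    return sum(values) / len(values)

def population_std(values):
    """Population standard deviation: the divisor is N, not N - 1."""
    mu = population_mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

mu = population_mean(population)
sigma = population_std(population)
print(mu, round(sigma, 2))  # 2.5 1.12
```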
16/38
All Possible Samples of Size n = 2
16 samples (sampling with replacement)

1st Obs | 2nd Observation
        | 1     2     3     4
--------+----------------------
1       | 1,1   1,2   1,3   1,4
2       | 2,1   2,2   2,3   2,4
3       | 3,1   3,2   3,3   3,4
4       | 4,1   4,2   4,3   4,4
17/38
All Possible Samples of Size n = 2
16 samples (sampling with replacement)

1st Obs | 2nd Observation
        | 1     2     3     4
--------+----------------------
1       | 1,1   1,2   1,3   1,4
2       | 2,1   2,2   2,3   2,4
3       | 3,1   3,2   3,3   3,4
4       | 4,1   4,2   4,3   4,4

16 sample means

1st Obs | 2nd Observation
        | 1     2     3     4
--------+----------------------
1       | 1.0   1.5   2.0   2.5
2       | 1.5   2.0   2.5   3.0
3       | 2.0   2.5   3.0   3.5
4       | 2.5   3.0   3.5   4.0
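The 16 ordered samples and their means can be enumerated mechanically. A sketch in Python, assuming `itertools.product` as the way to list samples drawn with replacement; the variable names are illustrative.

```python
from itertools import product

population = [1, 2, 3, 4]
n = 2

# Ordered samples with replacement: the Cartesian product of the
# population with itself, n times.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Print the means as a grid, one row per value of the first observation.
row_length = len(population)
for i in range(0, len(sample_means), row_length):
    print(sample_means[i:i + row_length])
```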
18/38
Sampling Distribution of All Sample Means
16 sample means

1st Obs | 2nd Observation
        | 1     2     3     4
--------+----------------------
1       | 1.0   1.5   2.0   2.5
2       | 1.5   2.0   2.5   3.0
3       | 2.0   2.5   3.0   3.5
4       | 2.5   3.0   3.5   4.0

[Figure: Sampling Distribution, a bar chart of P($\bar{X}$) over $\bar{X}$ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
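Counting how often each mean occurs among the 16 equally likely samples gives the sampling distribution itself. A sketch in Python, assuming exact fractions are wanted for the probabilities; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2

# Tally each possible sample mean over all equally likely ordered samples.
counts = Counter(Fraction(sum(s), n) for s in product(population, repeat=n))
total = sum(counts.values())

# Map each mean to its exact probability, in increasing order of the mean.
distribution = {float(m): Fraction(c, total) for m, c in sorted(counts.items())}
for mean, prob in distribution.items():
    print(mean, prob)
```

The probabilities rise from 1/16 at a mean of 1.0 to 4/16 at 2.5 and fall symmetrically back to 1/16 at 4.0.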
19/38
Comparison
[Figure: two bar charts side by side. Left, "Population": P(X) over X = 1, 2, 3, 4. Right, "Sampling Distribution": P($\bar{X}$) over $\bar{X}$ = 1, 1.5, 2, 2.5, 3, 3.5, 4.]
Population: $\mu = 2.5$, $\sigma = 1.12$
Sampling distribution: $\mu_{\bar{x}} = 2.5$, $\sigma_{\bar{x}} = 0.79$
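The comparison can be reproduced by computing the mean and standard deviation of both distributions directly. A sketch in Python for the N = 4, n = 2 example; the helper names are illustrative.

```python
import math
from itertools import product

population = [1, 2, 3, 4]
n = 2

def mean(values):
    return sum(values) / len(values)

def std(values):
    """Standard deviation with divisor len(values), treating the list as a full distribution."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

# All 16 equally likely sample means (sampling with replacement).
sample_means = [mean(s) for s in product(population, repeat=n)]

mu, sigma = mean(population), std(population)
mu_xbar, sigma_xbar = mean(sample_means), std(sample_means)

print(mu, round(sigma, 2))            # population
print(mu_xbar, round(sigma_xbar, 2))  # sampling distribution
print(round(sigma / math.sqrt(n), 2)) # sigma / sqrt(n), for comparison
```

The centre is unchanged, while the spread shrinks by a factor of $\sqrt{n}$.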
20/38
EX
Suppose that 50 random samples of size n = 10 are to be taken from a population having the discrete uniform distribution
$$f(x) = \begin{cases} \dfrac{1}{10} & \text{for } x = 0, 1, 2, \ldots, 9 \\ 0 & \text{elsewhere} \end{cases}$$
Sampling is with replacement, so that, in effect, we are sampling from an infinite population.
21/38
Sample means
We get 50 samples whose means are

4.4  3.2  5.0  3.5  4.1  4.4  3.6  6.5  5.3  4.4
3.1  5.3  3.8  4.3  3.3  5.0  4.9  4.8  3.1  5.3
3.0  3.0  4.6  5.8  4.6  4.0  3.7  5.2  3.7  3.8
5.3  5.5  4.8  6.4  4.9  6.5  3.5  4.5  4.9  5.3
3.6  2.7  4.0  5.0  2.6  4.2  4.4  5.6  4.7  4.3
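An experiment of this kind can be repeated in a few lines. This is only a sketch in Python: the seed is an arbitrary assumption, so the simulated means will not match the 50 values listed here.

```python
import random

rng = random.Random(0)  # arbitrary seed (assumption); the slide's values came from a different draw

def sample_mean(n=10):
    """Mean of n draws, with replacement, from the discrete uniform distribution on 0..9."""
    return sum(rng.randrange(10) for _ in range(n)) / n

# 50 samples of size 10, as in the example.
means = [sample_mean() for _ in range(50)]
grand_mean = sum(means) / len(means)

print(min(means), max(means), round(grand_mean, 2))
```

The simulated means should cluster around the population mean of 4.5, with far less spread than the individual values 0 to 9.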
22/38
Theorem
If a random sample of size n is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable whose distribution has the mean $\mu$.
For samples from infinite populations, the variance of this distribution is
$$\frac{\sigma^2}{n}$$
For samples of size n taken without replacement from a finite population of size N, the variance is
$$\frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$$
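The finite-population formula can be checked by brute force on a small case. A sketch in Python, reusing the population {1, 2, 3, 4} with n = 2 as an assumed test case; the function name is illustrative.

```python
from itertools import combinations

def variance_of_sample_mean(sigma2, n, N=None):
    """Variance of the sample mean; pass N for sampling without replacement from a finite population."""
    if N is None:
        return sigma2 / n
    return (sigma2 / n) * (N - n) / (N - 1)

population = [1, 2, 3, 4]
N, n = len(population), 2
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N

# Enumerate every equally likely sample of size n drawn without replacement.
means = [sum(c) / n for c in combinations(population, n)]
enumerated = sum((m - mu) ** 2 for m in means) / len(means)

print(round(enumerated, 4), round(variance_of_sample_mean(sigma2, n, N), 4))
```

The factor $(N - n)/(N - 1)$ approaches 1 as N grows relative to n, which recovers the infinite-population variance.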
23/38
Central limit theorem
If $\bar{X}$ is the mean of a sample of size n taken from a population having the mean $\mu$ and the finite variance $\sigma^2$, then
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
is a random variable whose distribution function approaches that of the standard normal distribution as $n \to \infty$.
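A simulation can make the theorem tangible. The sketch below, in Python, assumes a discrete uniform population on 0..9, n = 30, 2000 replications, and an arbitrary seed; none of these choices come from the theorem itself.

```python
import math
import random

rng = random.Random(1)  # arbitrary seed (assumption)

# Discrete uniform population on 0..9: mean 4.5, variance (10**2 - 1) / 12 = 8.25.
MU = 4.5
SIGMA = math.sqrt(8.25)

def standardized_mean(n):
    """Draw one sample of size n and return Z = (xbar - mu) / (sigma / sqrt(n))."""
    xbar = sum(rng.randrange(10) for _ in range(n)) / n
    return (xbar - MU) / (SIGMA / math.sqrt(n))

n, replications = 30, 2000
z_values = [standardized_mean(n) for _ in range(replications)]

# For a standard normal variable, about 95% of values lie within +/-1.96.
fraction_within = sum(abs(z) <= 1.96 for z in z_values) / replications
print(round(fraction_within, 3))
```

If the standardized means are approximately standard normal, the printed fraction should be close to 0.95 even though the population itself is far from normal.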
24/38
Central Limit Theorem
[Figure: sampling distribution of $\bar{X}$]
As the sample size gets large enough (n ≥ 30), the sampling distribution becomes almost normal, with
$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
25/38
EX
A 1-gallon can of paint covers on average 513.3 square feet, with a standard deviation of 31.5 square feet.
Question: what is the probability that the mean area covered by a sample of 40 of these 1-gallon cans will be anywhere from 510 to 520 square feet?
26/38
Solution
We have to find the normal curve area between
$$z = \frac{510 - 513.3}{31.5/\sqrt{40}} = -0.6625 \quad \text{and} \quad z = \frac{520 - 513.3}{31.5/\sqrt{40}} = 1.34$$
From the cumulative standard normal distribution table,
$$F(1.34) = 0.9099, \qquad F(-0.6625) = 0.2538$$
Hence, the probability is $F(1.34) - F(-0.6625) = 0.6561$.
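The table lookup can be cross-checked in software. A sketch in Python using `statistics.NormalDist` from the standard library; the function name is an illustrative choice, and the result differs slightly from the table value because z is not rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo <= sample mean <= hi) under the normal approximation for the mean."""
    se = sigma / sqrt(n)                  # standard error of the mean
    z_lo, z_hi = (lo - mu) / se, (hi - mu) / se
    std_normal = NormalDist()             # mean 0, standard deviation 1
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

p = prob_mean_between(510, 520, mu=513.3, sigma=31.5, n=40)
print(round(p, 3))
```

With n = 40 the central limit theorem justifies the normal approximation, which is why no assumption about the shape of the paint-coverage distribution is needed.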
27/38
Another example
You’re an operations analyst for AT&T. The lengths of long-distance telephone calls are normally distributed with $\mu$ = 8 min and $\sigma$ = 2 min. If you select random samples of 25 calls, what percentage of the sample means would be between 7.8 and 8.2 minutes?
28/38
Solution
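$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{7.8 - 8}{2/\sqrt{25}} = -0.50$$
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{8.2 - 8}{2/\sqrt{25}} = 0.50$$
[Figure: Sampling Distribution of $\bar{X}$, centered at 8 with $\sigma_{\bar{x}} = 0.4$, shaded between 7.8 and 8.2]
[Figure: Standardized Normal Distribution, mean 0 and $\sigma = 1$, shaded between Z = -0.50 and Z = 0.50; each half has area 0.1915, for a total of 0.3830]
So about 38.3% of the sample means would be between 7.8 and 8.2 minutes.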
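An alternative to standardizing by hand is to build the sampling distribution of the mean directly. A sketch in Python, assuming `statistics.NormalDist` and the values from this example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25

# The sample mean of n normal observations is itself normal,
# with mean mu and standard deviation sigma / sqrt(n) = 0.4.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

share = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(round(100 * share, 1), "percent")
```

Because the population is stated to be normal, this result does not rely on the central limit theorem and would hold for any sample size.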
29/38
6.3 The sampling distribution of the mean (σ unknown)
If n is large, it doesn’t matter whether $\sigma$ is known or not, since in that case it is reasonable to substitute the sample standard deviation s for it.
Question: what if n is small?
Then we need to assume that the sample comes from a normal population.
30/38
Assumption: the population has a normal distribution
If $\bar{X}$ is the mean of a random sample of size n taken from a normal population having the mean $\mu$ and the variance $\sigma^2$, and
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},$$
then
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
is a random variable having the t distribution with the parameter $\nu = n - 1$.
31/38
t-distribution
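[Figure: density of the t distribution, marking the critical value $t_\alpha(n)$, where $P(t > t_\alpha(n)) = \alpha$]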
32/38
EX.
A manufacturer of fuses claims that with a 20% overload, the fuses will blow in 12.4 minutes on average. To test this claim, a sample of 20 of the fuses was subjected to a 20% overload, and the times it took them to blow had a mean of 10.63 minutes and a standard deviation of 2.48 minutes. Assume that the data constitute a random sample from a normal population.
Question: do they tend to support or refute the manufacturer’s claim?
33/38
Solution
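First, we calculate
$$t = \frac{10.63 - 12.4}{2.48/\sqrt{20}} = -3.19$$
Rule to reject the claim: the t value is larger than 2.86 or less than -2.86, where, for $\nu = n - 1 = 19$ degrees of freedom,
$$P(t > 2.86) = 0.005 \quad \text{and} \quad P(t < -2.86) = 0.005$$
Since -3.19 is less than -2.86, the data tend to refute the manufacturer’s claim.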
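The arithmetic of this test is easy to script. A sketch in Python using only the standard library; the critical value 2.86 is taken from the slide rather than computed, since the standard library has no t distribution, and the function name is illustrative.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)) for a sample mean xbar and claimed mean mu0."""
    return (xbar - mu0) / (s / sqrt(n))

t = t_statistic(xbar=10.63, mu0=12.4, s=2.48, n=20)

# Critical value quoted on the slide for 19 degrees of freedom
# and a tail probability of 0.005 on each side.
critical = 2.86
reject_claim = abs(t) > critical

print(round(t, 2), reject_claim)  # -3.19 True
```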
34/38
6.4 The sampling distribution of the variance
Theorem 6.4. If $S^2$ is the variance of a random sample of size n taken from a normal population having the variance $\sigma^2$, then
$$\chi^2 = \frac{(n-1) S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$
is a random variable having the chi-square distribution with the parameter $\nu = n - 1$.
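One way to see the theorem at work is to simulate the statistic and compare its average with $\nu = n - 1$, the mean of a chi-square variable. A sketch in Python; the normal population (mean 10, standard deviation 2), the sample size, the replication count, and the seed are all illustrative assumptions.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary seed (assumption)

def chi_square_statistic(sample, sigma2):
    """(n - 1) * S^2 / sigma^2, where S^2 is the sample variance with divisor n - 1."""
    n = len(sample)
    return (n - 1) * statistics.variance(sample) / sigma2

# Hypothetical normal population: mean 10, standard deviation 2.
mu, sigma, n = 10.0, 2.0, 5
simulated = [
    chi_square_statistic([rng.gauss(mu, sigma) for _ in range(n)], sigma ** 2)
    for _ in range(4000)
]

# A chi-square variable with nu degrees of freedom has mean nu, here n - 1 = 4.
average = sum(simulated) / len(simulated)
print(round(average, 2))
```

The printed average should land near 4, which is consistent with $n - 1$ degrees of freedom rather than n.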
35/38
Chi-square distribution
[Figure: density of the chi-square distribution, $\chi^2(n)$]
36/38
F distribution
Theorem. If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$, respectively, taken from two normal populations having the same variance, then
$$F = \frac{S_1^2}{S_2^2}$$
is a random variable having the F distribution with the parameters $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$.
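Computing the F ratio and its two parameters from data takes only a few lines. A sketch in Python; the two small samples are made-up numbers for illustration, and the function name is an assumption.

```python
import statistics

def f_ratio(sample1, sample2):
    """Return F = S1^2 / S2^2 and the degrees of freedom (n1 - 1, n2 - 1)."""
    s1_squared = statistics.variance(sample1)  # sample variance, divisor n1 - 1
    s2_squared = statistics.variance(sample2)  # sample variance, divisor n2 - 1
    return s1_squared / s2_squared, (len(sample1) - 1, len(sample2) - 1)

# Hypothetical measurements, for illustration only.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]

F, (nu1, nu2) = f_ratio(a, b)
print(round(F, 2), nu1, nu2)  # 0.25 3 3
```

Judging whether such a ratio is unusually large or small still requires F distribution tables or a statistics library, which this sketch does not cover.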
37/38
F distribution
[Figure: density of the F distribution, $F(n_1, n_2)$]
38/38
Thanks!