ISOM 2500 Topic 2: Probability & Distributions
K.H. Chen
Part III: Continuous Random Variables & Distributions
1. Key Learning Objectives
• In this lesson, we discuss continuous random variables and their probability distributions. We start with the general setup for continuous distributions and then turn to two important cases, namely the normal distribution and sampling distributions. It is very important that you understand this material, as it constitutes the theoretical background to inferential statistics.
2. Discussion
A continuous random variable is a random variable that can take any value contained in one or more intervals (i.e., an uncountable number of values). Examples: salary, time, volume of milk in a container, etc. Since a continuous random variable can assume infinitely many values, the probability of each individual value is zero. As such, we can only determine the probability of a range of values.
Figure 1a: Histogram of the gas-mileages of 49 mid-sized cars.
[Figure 1a plot: Histogram of Mileage. x-axis: Mileage (30 to 33); y-axis: Frequency (0 to 12).]
Figure 1b: Density Histogram of the gas-mileages of 49 mid-sized cars.
Figure 1c: Density Function of the gas-mileages of 49 mid-sized cars.
[Figure 1b plot: Density Histogram of Mileage. x-axis: Mileage (30 to 33); y-axis: Density (0.0 to 0.5).]
[Figure 1c plot: Density Function. x-axis: x (29 to 34); y-axis: f(x).]
Probabilities for a continuous random variable are represented by areas under a curve called the probability density function f(x).
Requirements for a Probability Density Function:
1. $f(x) \ge 0$ for all $x$ between $a$ and $b$.
2. The total area under the curve between $a$ and $b$ is 1.0, i.e., $\int_a^b f(x)\,dx = 1$.
The probability that the (continuous) random variable lies between $c$ and $d$ is
$$P(c < X < d) = P(c \le X \le d) = \int_c^d f(x)\,dx \quad \text{for } c \ge a \text{ and } d \le b.$$
Mean and Variance of a Continuous Random Variable:
From Topic 2 Part II, the mean and variance of a discrete random variable are determined as follows:
$$\mu = E[X] = \sum_{\text{all } x} x\,p(x) \quad \text{and} \quad \sigma^2 = V[X] = E\left[(X-\mu)^2\right] = \sum_{\text{all } x} (x-\mu)^2\,p(x).$$
The mean and variance of a continuous random variable that ranges between a and b are determined in a similar fashion, using the integral sign rather than the summation sign. That is,
Mean
$$\mu = E[X] = \int_a^b x\,f(x)\,dx$$
Variance
$$\sigma^2 = V[X] = E\left[(X-\mu)^2\right] = \int_a^b (x-\mu)^2 f(x)\,dx = \int_a^b x^2 f(x)\,dx - \mu^2 = E\left[X^2\right] - \mu^2, \quad \text{where } E\left[X^2\right] = \int_a^b x^2 f(x)\,dx.$$
Standard deviation
$$\sigma = \sqrt{V(X)}$$
Example 1: After playing golf for many years, a statistics professor determined the density function for the distance his drives travel in hundreds of yards (denoted by X). It is
$$f(x) = \frac{3}{19}x^2 \quad \text{for } 2 \le x \le 3.$$
a. Confirm that the above function satisfies the requirements for a probability density function.
[Plot of f(x) vs x. x-axis: x (2.0 to 3.0); y-axis: f(x) (0.6 to 1.5).]
From the plot, we can see that $f(x) > 0$ for $2 \le x \le 3$, and thus the first requirement for a probability density function is met.
The total area under $f(x)$ for $2 \le x \le 3$ is
$$\int_2^3 \frac{3}{19}x^2\,dx = \frac{3}{19}\left[\frac{x^3}{3}\right]_2^3 = \left[\frac{x^3}{19}\right]_2^3 = \frac{3^3 - 2^3}{19} = \frac{27 - 8}{19} = 1.$$
The second requirement for a probability density function is also met. Thus, $f(x) = \frac{3}{19}x^2$ for $2 \le x \le 3$ satisfies the requirements for a probability density function.
b. Find the probability that the professor’s next drive is more than 250 yards.
$$P(X > 2.5) = \int_{2.5}^3 \frac{3}{19}x^2\,dx = \left[\frac{x^3}{19}\right]_{2.5}^3 = \frac{3^3 - 2.5^3}{19} = \frac{27 - 15.625}{19} = 0.5987$$
c. Determine the mean, variance, and standard deviation of the professor’s drives.
Mean,
$$E[X] = \int_2^3 x\,f(x)\,dx = \int_2^3 \frac{3}{19}x^3\,dx = \left[\frac{3x^4}{76}\right]_2^3 = \frac{243 - 48}{76} = 2.5658 \quad (256.58 \text{ yards})$$
$$E\left[X^2\right] = \int_2^3 x^2 f(x)\,dx = \int_2^3 \frac{3}{19}x^4\,dx = \left[\frac{3x^5}{95}\right]_2^3 = \frac{729 - 96}{95} = \frac{633}{95}$$
Variance,
$$V[X] = E\left[X^2\right] - \mu^2 = \frac{633}{95} - (2.5658)^2 = 0.07988227149 \quad (798.823 \text{ yards}^2)$$
Standard deviation,
$$\sigma = \sqrt{0.07988227149} = 0.2826345193 \quad (28.26 \text{ yards})$$
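The arithmetic in Example 1 can be double-checked with a short script. The following is a minimal sketch, not part of the original notes: it uses the exact antiderivative of $x^{k+2}$ with Python's `fractions` module, and the helper name `moment_integral` is an illustrative choice.

```python
from fractions import Fraction
from math import sqrt

# Density from Example 1: f(x) = (3/19) x^2 on [2, 3], with x in hundreds of yards.
LOWER, UPPER = Fraction(2), Fraction(3)

def moment_integral(k, lo=LOWER, hi=UPPER):
    """Exact integral of x^k * (3/19) x^2 over [lo, hi] (hypothetical helper)."""
    p = k + 3  # the antiderivative of x^(k+2) is x^(k+3) / (k+3)
    return Fraction(3, 19) * (Fraction(hi) ** p - Fraction(lo) ** p) / p

total_area = moment_integral(0)                     # requirement 2: must equal 1
p_over_250 = moment_integral(0, lo=Fraction(5, 2))  # P(X > 2.5)
mean = moment_integral(1)                           # E[X]
second_moment = moment_integral(2)                  # E[X^2]
variance = second_moment - mean ** 2                # V[X] = E[X^2] - mu^2
std_dev = sqrt(variance)

print("total area :", total_area)
print("P(X > 2.5) :", round(float(p_over_250), 4))
print("mean       :", round(float(mean), 4))
print("variance   :", round(float(variance), 8))
print("std dev    :", round(std_dev, 6))
```

Working with exact fractions avoids rounding the mean before squaring it, which is why the variance agrees with the notes to many decimal places.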
Uniform/Rectangular Distribution
A continuous random variable X is a uniform random variable over an interval $a \le x \le b$ (equivalently, [a, b]) if X can take on any value in the closed interval [a, b] and if the probability density function of X is constant over this interval. That is,
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
Figure 2: Probability density function of a uniform random variable X.
[Figure 2 plot: Distribution Plot, Uniform, Lower=a, Upper=b. x-axis: X; y-axis: f(X), with height 1/(b - a) between a and b.]
Note that for a uniform random variable X,
$$E(X) = \frac{a+b}{2} \quad \text{and} \quad V(X) = \frac{(b-a)^2}{12} \;\Rightarrow\; \sigma_X = \frac{b-a}{\sqrt{12}}$$
Example 2: The weekly output of a steel mill is a uniformly distributed random variable that lies between 110 and 175 metric tons.
a. Sketch the probability density function of the weekly output.
b. Find the probability that the steel mill will produce more than 150 metric tons next week.
c. Determine the probability that the steel mill will produce between 120 and 160 metric tons next week.
d. The operations manager labels any week that is in the bottom 20% of production a “bad week.” How many metric tons should be used to define a bad week?
e. Find the expected value and standard deviation of the weekly output.
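One way to check answers to parts b to e is to code the uniform formulas directly. This is a sketch under the assumption that the output is uniform on [110, 175] as stated; the function name `uniform_cdf` and the variable names are illustrative, not from the notes.

```python
from math import sqrt

a, b = 110.0, 175.0  # weekly output bounds in metric tons (Example 2)

def uniform_cdf(x, lo=a, hi=b):
    """P(X <= x) for X uniform on [lo, hi]: area of a rectangle of height 1/(hi - lo)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

p_more_than_150 = 1.0 - uniform_cdf(150)            # part b
p_120_to_160 = uniform_cdf(160) - uniform_cdf(120)  # part c
bad_week_cutoff = a + 0.20 * (b - a)                # part d: 20th percentile
mean = (a + b) / 2                                  # part e: E(X) = (a + b) / 2
std_dev = (b - a) / sqrt(12)                        # part e: sigma = (b - a) / sqrt(12)

print(round(p_more_than_150, 4), round(p_120_to_160, 4))
print(bad_week_cutoff, mean, round(std_dev, 3))
```

Because the density is flat, every probability here is just (interval length) / (b - a), and the percentile in part d is found by solving that relation for x.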
Normal Distribution
A continuous random variable X with the following probability density function is called a normal random variable:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < +\infty,\; -\infty < \mu < +\infty,\; \sigma > 0,$$
where $e = 2.71828\ldots$ and $\pi = 3.14159\ldots$. Its distribution is called the normal distribution. In short, we write
$$X \sim N(\mu, \sigma^2)$$
Here, $E[X] = \mu$ and $V[X] = \sigma^2$.
The probability density function of a normal random variable is mound-shaped (or bell-shaped), is symmetric about its mean $\mu$, and has points of inflexion at $\mu - \sigma$ and $\mu + \sigma$.
To calculate the probability that a normal random variable falls into any interval, we need to compute the area under the curve over that interval, that is, integrate the probability density function over the interval. However, this integral has no closed-form expression, so we resort to a probability table or a statistical software package to calculate normal probabilities.
For the probability-table approach, it would appear that (countless) separate tables are needed for different combinations of means ($-\infty < \mu < +\infty$) and standard deviations ($\sigma > 0$). Fortunately, this won’t be necessary, as we can reduce the number of tables needed to one by standardizing the normal random variable. That is,
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$
Note that the above standardized normal random variable Z is called the standard normal random variable, and it has the following probability density function
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} \quad \text{for } -\infty < z < +\infty$$
and its distribution is called the standard normal distribution. In short, we write
$$Z \sim N(0, 1)$$
Here, $E(Z) = \mu_Z = 0$ and $V(Z) = \sigma_Z^2 = 1$.
Using the probability table to find standard normal probabilities
Table 3 (in Appendix B) on pages B–8 & B–9 on lmes2.ust.hk can be used to find standard normal probabilities. To use this table, first find the row that corresponds to the integer part and the first decimal of the z value, and then find the column that corresponds to the second decimal. At the intersection of the row and the column, you will find the probability that corresponds to an area similar to the one depicted in the graph that accompanies the table.
Example 3: Find $P(Z < 1.95)$. Note that this corresponds to the area under the standard normal curve between $-\infty$ and 1.95.
Figure 3: $P(Z < 1.95)$.
[Figure 3 plot: Distribution Plot, Normal, Mean=0, StDev=1. x-axis: Z; y-axis: f(Z).]
From the table, $P(Z < 1.95) = 0.9744$.
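The table lookup can be mimicked in code. The sketch below is an illustration rather than the course's method: it splits a z value into the row and column headings described above (the helper `table_position` is a hypothetical name) and uses `statistics.NormalDist` from the Python standard library in place of the printed table.

```python
from statistics import NormalDist

def table_position(z):
    """Split z into (row, column) headings of a standard normal table (hypothetical helper)."""
    hundredths = round(z * 100)    # e.g. 1.95 -> 195
    row = hundredths // 10 / 10    # integer part and first decimal: 1.9
    column = hundredths % 10 / 100 # second decimal: 0.05
    return row, column

z = 1.95
row, column = table_position(z)
p = NormalDist(mu=0.0, sigma=1.0).cdf(z)  # P(Z < 1.95), the area to the left of z

print("row:", row, "column:", column)
print("P(Z < 1.95) =", round(p, 4))
```

The cumulative distribution function plays the role of the table body: it returns the area to the left of z, which is the quantity the table tabulates.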
MINITAB Instructions: Computing standard normal probabilities
Step 1: Click on the Calc menu and select Probability Distributions, followed by Normal….
Step 2: Once the Normal Distribution window pops up,
• use the default cumulative probability
• enter the values of µ and σ next to “Mean:” and “Standard deviation:”, respectively
• select “Input constant:” and enter the value of the normal random variable next to it.
Step 3: Click on the OK button.
From the Session Window of MINITAB, $P(Z < 1.95) = 0.974412$.
Note: When you need to calculate probabilities other than of the $P(-\infty < Z < z)$ or $P(Z < z)$ type, you need to be able to express your probability in terms of the $P(Z < z)$ probability. Homework problems will give you the chance to practice doing so. Figure 4 depicts the way to do these manipulations.
Figure 4: Visualization of the simple arithmetic manipulations needed to express other types of probabilities in terms of the P(Z < z) type.
[Figure 4 panels (Distribution Plot, Normal, Mean=0, StDev=1; x-axis: Z; y-axis: f(Z)):
P(1 < Z < 2) = 0.136 equals P(Z < 2) = 0.977 minus P(Z < 1) = 0.841;
P(Z > 2) = 0.0228 equals the total area 1.000 minus P(Z < 2) = 0.977.]
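The manipulations in Figure 4 amount to adding and subtracting left-tail areas. A minimal sketch, assuming `statistics.NormalDist` as a stand-in for the table (the name `phi` is an illustrative shorthand for $P(Z < z)$):

```python
from statistics import NormalDist

phi = NormalDist().cdf  # phi(z) = P(Z < z) for the standard normal

# Area between two values: subtract the smaller left-tail area from the larger.
p_between_1_and_2 = phi(2) - phi(1)
# Right-tail area: total area 1 minus the left-tail area.
p_above_2 = 1 - phi(2)
# The same subtraction works when the lower limit is negative.
p_between_minus1_and_2 = phi(2) - phi(-1)

print(round(p_between_1_and_2, 3))
print(round(p_above_2, 4))
print(round(p_between_minus1_and_2, 3))
```

Every probability is expressed through `phi` alone, which mirrors the restriction that the table only reports areas of the $P(Z < z)$ type.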
Example 4: Let X be a normally distributed random variable with mean µ = 40 and standard deviation σ = 5. Find the probability P(X < 49).
To compute P(X < 49) using the standard normal table, we need to standardize X:
$$P(X < 49) = P\left(\frac{X - \mu}{\sigma} < \frac{49 - 40}{5}\right) = P(Z < 1.8) = 0.9641$$
Figure 5: P(X < 49)
[Figure 4, continued (Distribution Plot, Normal, Mean=0, StDev=1): P(-1 < Z < 2) = 0.819 equals P(Z < 2) = 0.977 minus P(Z < -1) = 0.159.]
[Figure 5 panels: Distribution Plot, Normal, Mean=40, StDev=5, with area 0.964 to the left of X = 49; after standardizing, Distribution Plot, Normal, Mean=0, StDev=1, with area 0.964 to the left of Z = 1.8.]
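The standardization in Example 4 can be verified numerically. This sketch assumes Python's `statistics.NormalDist` in place of the table or MINITAB and compares the standardized route with a direct evaluation on the original scale; variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 40.0, 5.0  # parameters from Example 4
x = 49.0

z = (x - mu) / sigma                     # standardize: z = (x - mu) / sigma
p_via_z = NormalDist().cdf(z)            # P(Z < z) on the standard normal scale
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < x) on the original scale

print("z =", z)
print("P(Z < z) =", round(p_via_z, 6))
print("P(X < x) =", round(p_direct, 6))
```

Both routes give the same area, which is the point of standardizing: one table (or one function) serves every combination of µ and σ.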
MINITAB Instructions: Computing normal probabilities
Steps: Same as Steps 1, 2 & 3 for computing standard normal probabilities.
From the Session Window of MINITAB, $P(X < 49) = P(Z < 1.8) = 0.964070$.
Example 5: Let X be a normally distributed random variable with mean µ = 50 and standard deviation σ = 8. Find $P(30 < X < 39)$.
Determining Z and X values when the probability is given.
• Use the table in the reverse way.
• Destandardize using X = µ + Zσ
Example 6: The life of a calculator manufactured by CASIO is normally distributed with µ = 50 months and σ = 8 months. What should the warranty period be if the company does not want to replace more than 5% of its products?
Figure 6a: $P(X < x_{0.95}) = 0.05$, where $x_{0.95}$ denotes the 5th percentile of X ~ N(50, 64).
Figure 6b: $P(Z < z_{0.95}) = 0.05$, where $z_{0.95}$ denotes the 5th percentile of Z ~ N(0, 1).
From Table 3 (in Appendix B) on pages B–8 & B–9 on lmes2.ust.hk, $P(Z < -1.645) = 0.05$, so $z_{0.95} = -1.645$. Thus,
$$x_{0.95} = \mu + z_{0.95}\,\sigma = 50 + (-1.645)(8) = 50 - 13.16 = 36.84$$
The warranty period should therefore be about 36.84 months.
[Figure 6b plot: Distribution Plot, Normal, Mean=0, StDev=1, with area 0.05 to the left of $z_{0.95}$.]
[Figure 6a plot: Distribution Plot, Normal, Mean=50, StDev=8, with area 0.05 to the left of $x_{0.95}$.]
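Reading the table "in reverse" corresponds to evaluating the inverse cumulative distribution function. The following sketch reworks Example 6 under that assumption, using `statistics.NormalDist.inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0  # calculator life in months (Example 6)
tail_area = 0.05       # at most 5% of products replaced

z_cut = NormalDist().inv_cdf(tail_area)             # 5th percentile of Z
x_cut = mu + z_cut * sigma                          # destandardize: x = mu + z * sigma
x_direct = NormalDist(mu, sigma).inv_cdf(tail_area) # same percentile computed directly

print("z cutoff:", round(z_cut, 5))
print("x cutoff:", round(x_cut, 4), "months")
```

The small gap between 36.84 (table) and the value printed here comes from rounding z to -1.645 when reading the table.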
MINITAB Instructions: Computing normal percentiles
Step 1: Click on the Calc menu and select Probability Distributions, followed by Normal….
Step 2: Once the Normal Distribution window pops up,
• select Inverse cumulative probability
• enter the values of µ and σ next to “Mean:” and “Standard deviation:”, respectively
• select “Input constant:” and enter the value of the cumulative probability next to it.
Step 3: Click on the OK button.
From the Session Window of MINITAB, $x_{0.95} = 36.8412$.
Example 7: At a certain university, the SAT scores on the verbal portion of the first-year students are normally distributed with mean 520 and standard deviation 40.
a. Find the proportion of first-year students whose SAT scores on the verbal portion are between 500 and 650.
b. How high must a verbal test score be in order to be among the highest 5% of test scores?
c. If 5 first-year students are randomly selected, what is the probability that there will be 3 students whose scores are between 500 and 650?
Check for normality
• Construct a dotplot/stem-and-leaf/histogram of the variable to see whether the data are normally distributed.
• Apply the empirical rule:
  o Compute $(\bar{x} - 1s,\ \bar{x} + 1s)$, $(\bar{x} - 2s,\ \bar{x} + 2s)$, $(\bar{x} - 3s,\ \bar{x} + 3s)$.
  o Compute the (actual) proportions of data points that fall within each of the above 3 intervals.
  o Compare the computed proportions with the theoretical proportions: 68%, 95%, and 99.7% (almost 100%), respectively.
• Construct the normal quantile (Q-Q) plot.
• Conduct normality tests.
A normal quantile (Q-Q) plot is a graph designed to show whether a normal model is a reasonable description of the variation in the data. The basic idea behind the normal quantile plot is to compare the data values with the values one would expect from a standard normal distribution. The comparison is based on the idea of quantiles.
Example 8:
0.0 –0.3 –0.1 –0.5 –0.4 2.8 2.6 –1.3 0.5 2.6
To construct a normal quantile plot, do the following:
1. Sort the data in ascending order (see Column III of the table below).
2. Determine which quantile each data value represents. In this example, the smallest of the 10 values represents the smallest 10% of the data. We will consider this data value to lie halfway between 0% and 10% (the middle of the lowest 10%). In general, the computation $\frac{i - 0.5}{n}$ gives the desired value of the position (expressed as a decimal), since that is halfway between $\frac{i-1}{n}$ and $\frac{i}{n}$ (see Column IV of the table below).
3. Compute the “theoretical” quantile of the standard normal distribution, $z^*$ or $z_{(n-i+0.5)/n}$, that corresponds to the proportion computed in Column IV (see Column V of the table below). For example, to obtain the theoretical quantile for the 1st row of the table, we need to know what value in the standard normal distribution has approximately 5% of the distribution below it. So we search for something close to 0.05 in the body of the standard normal table, and see that it lies roughly halfway between -1.64 and -1.65 (let’s fix it at -1.645).
A computer can get the value more accurately and indicates that it is -1.64485. MINITAB will give you this value if you type “invcdf 0.05” next to the “MTB >” prompt in the session window, or if you use the menu under Calc > Probability Distributions > Normal….
A normal quantile plot is then constructed by plotting the values under Column III ($x_{(i)}$) against the values under Column V ($z^*$ or $z_{(n-i+0.5)/n}$).
If the data came perfectly from a standard normal distribution, Columns III and V of the table below would be identical (the theoretical quantile and the data value would match). This means that all the points would fall along the straight line y = x. Since other normal distributions are just linear transformations of the standard normal distribution ($x = \mu + z\sigma$), perfect data from a normal distribution with mean $\mu$ and standard deviation $\sigma$ would give a line with slope $\sigma$ and intercept $\mu$.
We use normal quantile plots to assess the plausibility that a data set is a sample from a normally distributed population. If the resulting plot is approximately linear, then it is plausible that the data come from a normal distribution. Otherwise (if the plot is markedly nonlinear), it is doubtful that the data come from a normal distribution. Of course, this will work much better for large data sets than for small data sets.
I          II           III                  IV                    V
Position   Data Value   Sample Quantile      Proportion below      Theoretical Quantile
i          x_i          (Sorted Data         x_(i): (i - 0.5)/n    z* or z_((n - i + 0.5)/n)
                        Value) x_(i)
1           0.0         -1.3                 0.05                  -1.64485
2          -0.3         -0.5                 0.15                  -1.03643
3          -0.1         -0.4                 0.25                  -0.67449
4          -0.5         -0.3                 0.35                  -0.38532
5          -0.4         -0.1                 0.45                  -0.12566
6           2.8          0.0                 0.55                   0.12566
7           2.6          0.5                 0.65                   0.38532
8          -1.3          2.6                 0.75                   0.67449
9           0.5          2.6                 0.85                   1.03643
10          2.6          2.8                 0.95                   1.64485
[Figure: Normal Q-Q Plot. x-axis: Sample Quantiles (-1 to 3); y-axis: Theoretical Quantiles (-2 to 2).]
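Columns III to V of the Example 8 table can be reproduced with a few lines of code. This is a sketch that assumes `statistics.NormalDist.inv_cdf` as a substitute for MINITAB's `invcdf`; the list names are illustrative.

```python
from statistics import NormalDist

# Data from Example 8, in the original (unsorted) order.
data = [0.0, -0.3, -0.1, -0.5, -0.4, 2.8, 2.6, -1.3, 0.5, 2.6]
n = len(data)

sample_quantiles = sorted(data)                         # Column III
proportions = [(i - 0.5) / n for i in range(1, n + 1)]  # Column IV: (i - 0.5) / n
theoretical = [NormalDist().inv_cdf(p) for p in proportions]  # Column V

for i, (x, p, z) in enumerate(zip(sample_quantiles, proportions, theoretical), start=1):
    print(f"{i:2d}  {x:5.1f}  {p:4.2f}  {z:9.5f}")
```

Plotting `sample_quantiles` against `theoretical` with any plotting tool then gives the normal quantile plot; the three values near 2.6 to 2.8 sit well away from a straight line through the other points.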
Exponential Distribution
A continuous random variable X is exponentially distributed if its probability density function is given by
$$f(x) = \frac{1}{\mu}\, e^{-x/\mu} \quad \text{for } x \ge 0,$$
where e = 2.71828… and µ is the mean of the exponential random variable. It can be shown that the mean of an exponential random variable X is equal to its standard deviation, i.e., $E[X] = \sqrt{V[X]} = \sigma_X$.
Figure 7: Exponential distributions with $\mu_X$ = 0.5, 1.0, and 2.0.
[Figure 7 plot: Probability Density Function of Exponential Random Variable X, with curves for Mean = 0.5, Mean = 1, and Mean = 2. x-axis: X; y-axis: f(X).]
Probabilities associated with an exponential random variable X:
a. $P(X > x^*) = e^{-x^*/\mu}$
b. $P(X < x^*) = 1 - e^{-x^*/\mu}$
c. $P(x_1^* < X < x_2^*) = P(X < x_2^*) - P(X < x_1^*) = e^{-x_1^*/\mu} - e^{-x_2^*/\mu}$
Note that if the number of arrivals follows a Poisson distribution, the times between arrivals follow an exponential distribution.
Example 9: Toll booths on the New York State Thruway are often congested because of the large number of cars waiting to pay. A consultant working for the state concluded that if service times are measured from the time a car stops in line until it leaves, service times are exponentially distributed with a mean of 2.7 minutes.
a. What is the probability that a car will take more than 2 minutes to get through the toll booth?
b. What is the probability that a car will take less than 3 minutes to get through the toll booth?
c. What is the probability that a car will take at least 2 but no more than 4 minutes to get through the toll booth?
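The three exponential probability formulas apply directly to Example 9. The sketch below assumes the stated mean of 2.7 minutes; the helper name `survival` is an illustrative choice for $P(X > x^*)$.

```python
from math import exp

mean_service = 2.7  # mean service time in minutes (Example 9)

def survival(x, mu=mean_service):
    """P(X > x) = exp(-x / mu) for an exponential random variable with mean mu."""
    return exp(-x / mu)

p_more_than_2 = survival(2)                 # part a: P(X > 2)
p_less_than_3 = 1 - survival(3)             # part b: P(X < 3)
p_between_2_and_4 = survival(2) - survival(4)  # part c: P(2 <= X <= 4)

print(round(p_more_than_2, 4))
print(round(p_less_than_3, 4))
print(round(p_between_2_and_4, 4))
```

Since X is continuous, "at least 2 but no more than 4" and "strictly between 2 and 4" have the same probability, so formula c applies unchanged.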
Population vs. Sample
Parameter: a numerical measure (or characteristic) of the population, $\theta$: $\mu$, $\sigma^2$, $p$.
Statistic: a numerical measure (or characteristic) of a sample, $\hat{\theta}$: $\bar{X}$, $S^2$, $\hat{p}$.
Sampling error: the absolute difference between the parameter $\theta$ and its statistic $\hat{\theta}$, that is, $|\hat{\theta} - \theta|$.
Sampling distribution: the probability distribution of a statistic.
Standard error: the standard deviation of a statistic.
Sampling distribution of the sample mean $\bar{X}$
Suppose a random sample of size n is taken from an infinite (very large) population which has a mean $\mu$ (or $\mu_X$) and a standard deviation $\sigma$ (or $\sigma_X$). The mean of $\bar{X}$ (average of all possible sample means) will then be $\mu$ (or $\mu_X$) and the variance of $\bar{X}$ is $\frac{\sigma^2}{n}$ (or $\frac{\sigma_X^2}{n}$). That is,
$$E(\bar{X}) = \mu_{\bar{X}} = \mu \ (\text{or } \mu_X) \quad \text{and} \quad Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \ \left(\text{or } \frac{\sigma_X^2}{n}\right)$$
Furthermore,
• if X is a normal random variable (the population from which the samples are drawn is normally distributed), then $\bar{X}$ is also a normal random variable (the probability distribution of the sample mean $\bar{X}$ is also normal) with mean $E(\bar{X}) = \mu_{\bar{X}} = \mu$ and variance $Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$;
• if X is not a normal random variable (the population from which the samples are drawn is not normally distributed), then $\bar{X}$ is approximately a normal random variable (the probability distribution of the sample mean $\bar{X}$ is approximately normal) provided n is large, according to the Central Limit Theorem. In many practical situations, a sample size of 30 ($n \ge 30$) may be sufficiently large to allow us to use the normal approximation for the sampling distribution of $\bar{X}$. However, if the population is extremely nonnormal (for example, bimodal and highly skewed distributions), the sampling distribution will also be nonnormal even for moderately large values of n.
The Central Limit Theorem: If the sample size n is sufficiently large, then the population of all possible sample means is approximately normally distributed (with mean $\mu_{\bar{X}} = \mu$ and standard deviation (standard error of the mean) $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$), no matter what probability distribution describes the sampled population. Furthermore, the larger the sample size n is, the more nearly normally distributed is the population of all possible sample means.
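The claims about the mean and standard error of $\bar{X}$ can be explored by simulation. The sketch below is an illustration, not the procedure used to produce the MINITAB output in these notes: it draws from an exponential population with mean 50 (so σ = 50), and the function name `simulate_sample_means`, the seed, and the number of replications are all arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

def simulate_sample_means(n, reps, mu=50.0, seed=2500):
    """Return `reps` sample means, each from n exponential draws with mean mu (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [fmean(rng.expovariate(1.0 / mu) for _ in range(n)) for _ in range(reps)]

for n in (5, 30):
    means = simulate_sample_means(n, reps=20_000)
    # Compare the simulated mean and spread with mu and sigma / sqrt(n).
    print(f"n={n:3d}  mean of means={fmean(means):7.3f}  "
          f"sd of means={pstdev(means):6.3f}  sigma/sqrt(n)={50 / sqrt(n):6.3f}")
```

With far fewer replications than a million, the simulated values only approximate µ and σ/√n, but the shrinking spread as n grows should still be visible.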
Example 10:
a. X ~ N(50, 64)
Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...
Variable          N        Mean    StDev  Variance  Minimum  Q1      Median
Means (n = 5)     1000000  49.997  3.578  12.803    33.563   47.580  49.996
Means (n = 10)    1000000  49.998  2.525   6.373    36.832   48.292  49.998
Means (n = 15)    1000000  50.000  2.067   4.272    39.973   48.604  49.998
Means (n = 20)    1000000  50.000  1.791   3.206    41.550   48.790  50.001
Means (n = 25)    1000000  50.001  1.600   2.559    42.476   48.923  50.002
Means (n = 30)    1000000  50.003  1.459   2.130    42.315   49.017  50.003

Variable          Q3      Maximum  Mode  N for Mode
Means (n = 5)     52.409  67.410   *     0
Means (n = 10)    51.701  62.532   *     0
Means (n = 15)    51.393  60.710   *     0
Means (n = 20)    51.208  58.101   *     0
Means (n = 25)    51.081  57.262   *     0
Means (n = 30)    50.987  56.804   *     0
[Histograms of Means (k = 1,000,000) for n = 30, 5, 10, 15, 20, and 25. x-axis: Means; y-axis: Frequency; reference mark at 50.]
b. X ~ Unif(25, 75)
Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...
Variable          N        Mean    StDev  Variance  Minimum  Q1      Median
Means (n = 5)     1000000  49.994  6.451  41.616    26.619   45.536  49.997
Means (n = 10)    1000000  49.998  4.561  20.800    30.057   46.872  49.994
Means (n = 15)    1000000  49.993  3.731  13.921    32.819   47.457  49.995
Means (n = 20)    1000000  50.002  3.229  10.427    35.332   47.812  50.002
Means (n = 25)    1000000  50.000  2.883   8.309    37.017   48.047  50.002
Means (n = 30)    1000000  49.996  2.636   6.949    38.557   48.212  49.996

Variable          Q3      Maximum  Mode  N for Mode
Means (n = 5)     54.460  73.408   *     0
Means (n = 10)    53.117  69.436   *     0
Means (n = 15)    52.531  67.824   *     0
Means (n = 20)    52.198  64.841   *     0
Means (n = 25)    51.953  63.044   *     0
Means (n = 30)    51.782  62.576   *     0
[Histograms of Means (k = 1,000,000) for n = 5, 10, 15, 20, 25, and 30. x-axis: Means; y-axis: Frequency; reference mark at 50.]
c. X ~ exp(50)
[Histograms of Means (k = 1,000,000) for n = 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, and 80. x-axis: Means; y-axis: Frequency; reference mark at 50.]
Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...
Variable          N        Mean    StDev   Variance  Minimum  Q1      Median
Means (n = 5)     1000000  50.030  22.350  499.538    2.441   33.723  46.738
Means (n = 10)    1000000  50.016  15.834  250.717    5.632   38.622  48.375
Means (n = 15)    1000000  49.997  12.914  166.765    8.239   40.773  48.892
Means (n = 20)    1000000  50.005  11.194  125.304   13.661   42.074  49.162
Means (n = 25)    1000000  50.013  10.000  100.002   15.563   42.950  49.356
Means (n = 30)    1000000  49.989   9.130   83.355   18.710   43.573  49.428
Means (n = 35)    1000000  50.004   8.450   71.402   20.072   44.075  49.529
Means (n = 40)    1000000  49.993   7.902   62.447   21.147   44.459  49.571
Means (n = 45)    1000000  50.005   7.460   55.647   21.845   44.800  49.635
Means (n = 50)    1000000  50.009   7.071   49.998   22.887   45.074  49.683
Means (n = 55)    1000000  49.999   6.736   45.376   24.549   45.302  49.698
Means (n = 60)    1000000  49.991   6.457   41.688   23.093   45.502  49.709
Means (n = 65)    1000000  50.000   6.199   38.433   24.911   45.686  49.742
Means (n = 70)    1000000  49.992   5.980   35.762   27.437   45.835  49.745
Means (n = 75)    1000000  49.999   5.775   33.351   25.150   45.989  49.769
Means (n = 80)    1000000  50.000   5.583   31.171   27.228   46.132  49.793
Means (n = 85)    1000000  50.013   5.427   29.454   28.619   46.251  49.820
Means (n = 90)    1000000  50.006   5.268   27.749   28.847   46.355  49.826
Means (n = 95)    1000000  50.000   5.121   26.225   27.900   46.462  49.829
Means (n = 100)   1000000  49.998   5.003   25.031   27.543   46.537  49.827

Variable          Q3      Maximum  Mode  N for Mode
Means (n = 5)     62.776  222.738  *     0
Means (n = 10)    59.601  164.371  *     0
Means (n = 15)    58.009  143.829  *     0
Means (n = 20)    57.033  124.532  *     0
Means (n = 25)    56.353  120.890  *     0
Means (n = 30)    55.808  111.209  *     0
Means (n = 35)    55.417  105.860  *     0
Means (n = 40)    55.064   94.210  *     0
Means (n = 45)    54.813   96.548  *     0
Means (n = 50)    54.586   91.011  *     0
Means (n = 55)    54.363   93.361  *     0
Means (n = 60)    54.173   87.456  *     0
Means (n = 65)    54.022   84.908  *     0
Means (n = 70)    53.890   83.259  *     0
Means (n = 75)    53.764   81.664  *     0
Means (n = 80)    53.646   82.805  *     0
Means (n = 85)    53.567   84.893  *     0
Means (n = 90)    53.458   81.011  *     0
Means (n = 95)    53.349   79.137  *     0
Means (n = 100)   53.275   77.490  *     0
[Histograms of Means (k = 1,000,000) for n = 85, 90, 95, and 100. x-axis: Means; y-axis: Frequency; reference mark at 50.]
Example 11: An automatic machine in a manufacturing process is operating properly if the lengths of an important subcomponent are normally distributed with mean 117 and standard deviation 5.2 (in centimeters).
a. Find the probability that one randomly selected subcomponent is longer than 120 cm.
b. Find the sampling distribution of the sample mean from a random sample of size 4.
c. Find the probability that if four subcomponents are randomly selected, their mean length exceeds 120 cm.
d. Find the probability that if four subcomponents are randomly selected, all four have lengths that exceed 120 cm.
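Example 11 contrasts a probability about one observation with a probability about a sample mean. The following sketch works through the parts numerically under two assumptions that should be stated: `statistics.NormalDist` stands in for the table, and in part d the four lengths are treated as independent. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 117.0, 5.2, 4  # subcomponent length in cm; sample size 4

# Part a: a single subcomponent, X ~ N(mu, sigma^2).
p_one = 1 - NormalDist(mu, sigma).cdf(120)

# Part b: the sample mean is normal with mean mu and standard error sigma / sqrt(n).
std_error = sigma / sqrt(n)

# Part c: the sample mean of four subcomponents exceeds 120 cm.
p_mean = 1 - NormalDist(mu, std_error).cdf(120)

# Part d: all four individual lengths exceed 120 cm (independence assumed).
p_all_four = p_one ** n

print("standard error:", round(std_error, 2))
print(round(p_one, 4), round(p_mean, 4), round(p_all_four, 5))
```

Parts c and d answer different questions: the mean can exceed 120 cm without every length doing so, which is why the probability in part c is much larger than the one in part d.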
Example 12: The restaurant in a large commercial building provides coffee for the building’s occupants. The restaurateur has determined that the mean number of cups of coffee consumed in a day by all the occupants is 2.0 with a standard deviation of 0.6. A new tenant of the building intends to have a total of 125 new employees. What is the probability that the new employees will consume more than 240 cups per day?
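One possible approach to Example 12 is to restate the total as a sample mean and apply the Central Limit Theorem. The sketch below rests on assumptions not spelled out in the problem: the 125 new employees are treated as a random sample from the building's occupants, and n = 125 is taken as large enough for the normal approximation. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 2.0, 0.6  # cups per person per day: population mean and standard deviation
n = 125               # number of new employees
total_cups = 240

# "Total exceeds 240 cups" is the same event as "sample mean exceeds 240 / 125 cups".
mean_threshold = total_cups / n
std_error = sigma / sqrt(n)  # standard error of the sample mean

# Normal approximation for the sample mean (Central Limit Theorem, assumed to apply).
z = (mean_threshold - mu) / std_error
p_exceeds = 1 - NormalDist().cdf(z)

print("mean threshold:", mean_threshold)
print("z =", round(z, 4))
print("P(total > 240) =", round(p_exceeds, 4))
```

Because the threshold of 1.92 cups per person lies below the population mean of 2.0, the resulting probability is well above one half.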