
Topic+2+Part+III.pdf


ISOM 2500 Topic 2: Probability & Distributions
K.H. Chen

Part III: Continuous Random Variables & Distributions

1. Key Learning Objectives

• In this lesson, we discuss continuous random variables and their probability distributions. We start with the general setup for continuous distributions and then turn to two particular cases, namely the normal distribution and sampling distributions. It is very important that you understand this material, as it constitutes the theoretical background to inferential statistics.

2. Discussion

A continuous random variable is a random variable that can take any value contained in one or more intervals (i.e., an uncountable number of values). Examples: salary, time, volume of milk in a container, etc. Since a continuous random variable can assume infinitely many values, the probability of any individual value is zero. As such, we can only determine the probability of a range of values.

Figure 1a: Histogram of the gas-mileages of 49 mid-sized cars.

[Figure 1a plot: "Histogram of Mileage"; x-axis: Mileage, y-axis: Frequency.]

Figure 1b: Density Histogram of the gas-mileages of 49 mid-sized cars.

Figure 1c: Density Function of the gas-mileages of 49 mid-sized cars.

[Figure 1b plot: "Density Histogram of Mileage"; x-axis: Mileage, y-axis: Density.]

[Figure 1c plot: "Density Function"; x-axis: x, y-axis: f(x).]

Probabilities for a continuous random variable are represented by areas under a curve called the probability density function f(x).

Requirements for a Probability Density Function:

1. $f(x) \ge 0$ for all $x$ between $a$ and $b$.

2. The total area under the curve between $a$ and $b$ is 1.0, i.e., $\int_a^b f(x)\,dx = 1$.

The probability that the (continuous) random variable lies between $c$ and $d$ is

$$P(c < X < d) = P(c \le X \le d) = \int_c^d f(x)\,dx \quad \text{for } c \ge a \text{ and } d \le b.$$

Mean and Variance of a Continuous Random Variable:

From Topic 2 Part II, the mean and variance of a discrete random variable are determined as follows:

$$E[X] = \mu = \sum_{\text{all } x} x\,p(x) \quad \text{and} \quad V[X] = \sigma^2 = E\left[(X-\mu)^2\right] = \sum_{\text{all } x} (x-\mu)^2\,p(x).$$

The mean and variance of a continuous random variable that ranges between $a$ and $b$ are determined in a similar fashion, using the integral sign rather than the summation sign. That is,

Mean

$$E[X] = \mu = \int_a^b x\,f(x)\,dx$$

Variance

$$V[X] = \sigma^2 = E\left[(X-\mu)^2\right] = \int_a^b (x-\mu)^2 f(x)\,dx = \int_a^b x^2 f(x)\,dx - \mu^2 = E\left[X^2\right] - \mu^2, \quad \text{where } E\left[X^2\right] = \int_a^b x^2 f(x)\,dx.$$

Standard deviation

$$\sigma = \sqrt{V(X)}$$

Example 1: After playing golf for many years, a statistics professor determined the density function for the distance his drives travel, in hundreds of yards (denoted by X). It is

$$f(x) = \frac{3}{19}x^2 \quad \text{for } 2 \le x \le 3$$

a. Confirm that the above function satisfies the requirements for a probability density function.

[Plot of f(x) vs x; x-axis: x (2.0 to 3.0), y-axis: f(x).]

From the above plot, we can see that $f(x) > 0$ for $2 \le x \le 3$, and thus the first requirement for a probability density function is met.

The total area under $f(x)$ for $2 \le x \le 3$ is

$$\int_2^3 \frac{3}{19}x^2\,dx = \frac{3}{19}\left[\frac{x^3}{3}\right]_2^3 = \left[\frac{x^3}{19}\right]_2^3 = \frac{(3)^3 - (2)^3}{19} = \frac{27 - 8}{19} = 1$$

The second requirement for a probability density function is also met. Thus, $f(x) = \frac{3}{19}x^2$ for $2 \le x \le 3$ satisfies the requirements for a probability density function.

b. Find the probability that the professor’s next drive is more than 250 yards.

$$P(X > 2.5) = \int_{2.5}^3 \frac{3}{19}x^2\,dx = \left[\frac{x^3}{19}\right]_{2.5}^3 = \frac{(3)^3 - (2.5)^3}{19} = \frac{27 - 15.625}{19} = 0.5987$$

c. Determine the mean, variance, and standard deviation of the professor’s drives.

Mean:

$$E[X] = \int_2^3 x\,f(x)\,dx = \int_2^3 \frac{3}{19}x^3\,dx = \left[\frac{3x^4}{76}\right]_2^3 = \frac{243 - 48}{76} = 2.5658 \ (256.58 \text{ yards})$$

$$E\left[X^2\right] = \int_2^3 x^2 f(x)\,dx = \int_2^3 \frac{3}{19}x^4\,dx = \left[\frac{3x^5}{95}\right]_2^3 = \frac{729 - 96}{95} = \frac{633}{95}$$

Variance:

$$V[X] = E\left[X^2\right] - \mu^2 = \frac{633}{95} - (2.5658)^2 = 0.07988227149 \ (798.823 \text{ yards}^2)$$

Standard deviation:

$$\sigma = \sqrt{0.07988227149} = 0.2826345193 \ (28.26 \text{ yards})$$
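As a rough numerical cross-check of Example 1, the following Python sketch integrates the density with Simpson's rule. The helper name `integrate` and the choice of Simpson's rule are illustrative assumptions and are not part of the handout, which works the integrals by hand.

```python
def f(x):
    """Density from Example 1: f(x) = (3/19) x^2 on [2, 3], x in hundreds of yards."""
    return 3.0 / 19.0 * x ** 2


def integrate(g, lo, hi, steps=10_000):
    """Composite Simpson's rule on [lo, hi]; `steps` must be even (hypothetical helper)."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for k in range(1, steps):
        # Interior points alternate between weight 4 (odd k) and weight 2 (even k).
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0


area = integrate(f, 2.0, 3.0)                                # requirement 2: total area
p_over_250 = integrate(f, 2.5, 3.0)                          # part b: P(X > 2.5)
mean = integrate(lambda x: x * f(x), 2.0, 3.0)               # E[X]
second_moment = integrate(lambda x: x ** 2 * f(x), 2.0, 3.0) # E[X^2]
variance = second_moment - mean ** 2                         # V[X] = E[X^2] - mu^2
sd = variance ** 0.5

print(f"area = {area:.6f}, P(X > 2.5) = {p_over_250:.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.6f}, sd = {sd:.4f} (hundreds of yards)")
```

Multiplying the mean and standard deviation by 100 converts them to yards, matching the units used in the worked solution.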

Uniform/Rectangular Distribution

A continuous random variable X is a uniform random variable over an interval $a \le x \le b$, or equivalently $[a, b]$, if X can take on any value in the closed interval $[a, b]$ and the probability density function of X is constant over this interval. That is,

$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

Figure 2: Probability density function of a uniform random variable X.

[Figure 2 plot: "Distribution Plot, Uniform, Lower=a, Upper=b"; x-axis: X (from a to b), y-axis: f(X), with constant height 1/(b − a).]

Note that for a uniform random variable X,

$$E(X) = \frac{a+b}{2} \quad \text{and} \quad V(X) = \frac{(b-a)^2}{12} \ \Rightarrow \ \sigma_X = \frac{b-a}{\sqrt{12}}$$

Example 2: The weekly output of a steel mill is a uniformly distributed random variable that lies between 110 and 175 metric tons.

a. Sketch the probability density function of the weekly output.
b. Find the probability that the steel mill will produce more than 150 metric tons next week.
c. Determine the probability that the steel mill will produce between 120 and 160 metric tons next week.
d. The operations manager labels any week that is in the bottom 20% of production a “bad week.” How many metric tons should be used to define a bad week?
e. Find the expected value and standard deviation of the weekly output.
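The handout leaves Example 2 as an exercise. One possible way to compute parts b to e in Python is sketched below, using only the uniform formulas stated above; the function name `uniform_cdf` and the variable names are illustrative assumptions.

```python
from math import sqrt

a, b = 110.0, 175.0      # bounds of the weekly output in metric tons (Example 2)
height = 1.0 / (b - a)   # constant density 1/(b - a) on [a, b]


def uniform_cdf(x):
    """P(X <= x) for X ~ Uniform(a, b): area of the rectangle from a to x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) * height


p_more_than_150 = 1.0 - uniform_cdf(150.0)              # part b
p_120_to_160 = uniform_cdf(160.0) - uniform_cdf(120.0)  # part c
bad_week_cutoff = a + 0.20 * (b - a)                    # part d: 20th percentile
mean = (a + b) / 2.0                                    # part e: E(X)
sd = (b - a) / sqrt(12.0)                               # part e: sigma_X

print(f"P(X > 150) = {p_more_than_150:.4f}")
print(f"P(120 < X < 160) = {p_120_to_160:.4f}")
print(f"bad-week cutoff = {bad_week_cutoff:.1f} metric tons")
print(f"mean = {mean:.1f}, sd = {sd:.3f}")
```

Because the density is flat, every probability here is just a rectangle area, width times 1/(b − a), which is why no integration routine is needed.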

Normal Distribution

A continuous random variable X with the following probability density function is called a normal random variable:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < +\infty,\ -\infty < \mu < +\infty,\ \sigma > 0$$

where $e = 2.71828\ldots$ and $\pi = 3.14159\ldots$. Its distribution is called the normal distribution. In short, we write

$$X \sim N(\mu, \sigma^2)$$

Here, $E[X] = \mu$ and $V[X] = \sigma^2$.

The probability density function of a normal random variable is mound-shaped (or bell-shaped), is symmetric about its mean $\mu$, and has points of inflexion at $\mu - \sigma$ and $\mu + \sigma$.

To calculate the probability that a normal random variable falls into any interval, we need to compute the area under the curve over that interval, that is, the integral of the probability density function over the interval. However, this integral is not easy to compute, and thus we resort to a probability table or a statistical software package to calculate normal probabilities.

For the probability-table approach, it would appear that (countless) separate tables are needed for different combinations of means ($-\infty < \mu < +\infty$) and standard deviations ($\sigma > 0$). Fortunately, this is not necessary, as we can reduce the number of tables needed to one by standardizing the normal random variable. That is,

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$

The standardized normal random variable Z is called the standard normal random variable, and it has the following probability density function

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} \quad \text{for } -\infty < z < +\infty$$

and its distribution is called the standard normal distribution. In short, we write

$$Z \sim N(0, 1)$$

Here, $E(Z) = \mu_Z = 0$ and $V(Z) = \sigma_Z^2 = 1$.

Using the probability table to find standard normal probabilities

Table 3 (in Appendix B) on pages B–8 & B–9 on lmes2.ust.hk can be used to find standard normal probabilities. To use this table, first find the row that corresponds to the integer part and the first decimal of the z value, and then find the column that corresponds to the second decimal. At the intersection of the row and the column, you will find the probability that corresponds to an area similar to the one depicted in the graph that accompanies the table.

Example 3: Find $P(Z < 1.95)$. Note that this corresponds to the area under the standard normal curve between $-\infty$ and 1.95.

Figure 3: $P(Z < 1.95)$.

[Figure 3 plot: "Distribution Plot, Normal, Mean=0, StDev=1"; x-axis: Z, y-axis: f(Z).]

From the table, $P(Z < 1.95) = 0.9744$.
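For readers without the table at hand, the lookup in Example 3 can be reproduced in software. The sketch below uses Python's standard-library `statistics.NormalDist`; this is a substitute tool chosen for illustration, not something the handout uses.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

# cdf(z) returns the area under the curve to the left of z, i.e. P(Z < z),
# which is the same quantity the table reports at row 1.9, column 0.05.
p = Z.cdf(1.95)

print(f"P(Z < 1.95) = {p:.4f}")
```

Rounded to four decimals this agrees with the tabulated 0.9744.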

MINITAB Instructions: Computing standard normal probabilities

Step 1: Click on the Calc menu and select Probability Distributions, followed by Normal….

Step 2: Once the Normal Distribution window pops up,

• use the default cumulative probability
• enter the values of µ and σ next to “Mean:” and “Standard deviation:”, respectively
• select “Input constant:” and enter the value of the normal random variable next to it.

Step 3: Click on the OK button.

From the Session Window of MINITAB, $P(Z < 1.95) = 0.974412$.

Note: When you need to calculate probabilities other than of the $P(-\infty < Z < z)$ or $P(Z < z)$ type, you need to be able to express your probability in terms of the $P(Z < z)$ probability. Homework problems will give you the chance to practice doing so. Figure 4 depicts the way to do these manipulations.

Figure 4: Visualization of the simple arithmetic manipulations needed to express other types of probabilities in terms of the P(Z < z) type.

[Figure 4 plots: standard normal "Distribution Plot" panels (Normal, Mean=0, StDev=1; x-axis: Z, y-axis: f(Z)) with shaded areas labeled 0.136 between 1 and 2, 0.977 below 2, 0.841 below 1, 0.0228 above 2, and 1.000 between −3.09 and 3.09, joined by "=" and "–" signs; i.e., P(1 < Z < 2) = P(Z < 2) − P(Z < 1) and P(Z > 2) = 1 − P(Z < 2).]
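The arithmetic manipulations pictured in Figure 4 can be written out in a few lines. This is a minimal sketch using Python's standard-library `statistics.NormalDist` in place of the table; the helper name `below` is an illustrative assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def below(z):
    """P(Z < z), the only kind of probability the table gives directly."""
    return Z.cdf(z)


# Area between two values: subtract the smaller left-tail area from the larger.
p_between_1_and_2 = below(2.0) - below(1.0)

# Right-tail area: the total area is 1, so subtract the left-tail area.
p_above_2 = 1.0 - below(2.0)

# The same subtraction works when the lower limit is negative.
p_between_minus1_and_2 = below(2.0) - below(-1.0)

print(f"P(1 < Z < 2)  = {p_between_1_and_2:.3f}")
print(f"P(Z > 2)      = {p_above_2:.4f}")
print(f"P(-1 < Z < 2) = {p_between_minus1_and_2:.3f}")
```

Every case reduces to sums and differences of `below(z)` values, which is the point of the figure.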

[Figure 4, continued: standard normal panels with shaded areas labeled 0.819 between −1 and 2, 0.977 below 2, and 0.159 below −1; i.e., P(−1 < Z < 2) = P(Z < 2) − P(Z < −1).]

Example 4: Let X be a normally distributed random variable with mean µ = 40 and standard deviation σ = 5. Find the probability P(X < 49).

To compute P(X < 49) using the standard normal table, we need to standardize X:

$$P(X < 49) = P\left(\frac{X - \mu}{\sigma} < \frac{49 - 40}{5}\right) = P(Z < 1.8) = 0.9641$$

Figure 5: P(X < 49)

[Figure 5 plots: "Distribution Plot, Normal, Mean=40, StDev=5" with area 0.964 below X = 49, standardized to "Distribution Plot, Normal, Mean=0, StDev=1" with area 0.964 below Z = 1.8.]
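The standardization step in Example 4 can be checked numerically. The sketch below, which assumes Python's standard-library `statistics.NormalDist` as a stand-in for the table, computes the probability both through the z value and directly from the N(40, 5²) distribution.

```python
from statistics import NormalDist

mu, sigma = 40.0, 5.0   # parameters of X from Example 4
x = 49.0

# Standardize: z = (x - mu) / sigma, then look up P(Z < z).
z = (x - mu) / sigma
p_via_z = NormalDist().cdf(z)

# Same probability computed directly on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.1f}")
print(f"P(X < 49) via Z   = {p_via_z:.4f}")
print(f"P(X < 49) direct  = {p_direct:.4f}")
```

The two routes agree, which illustrates why a single standard normal table is enough for any mean and standard deviation.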

MINITAB Instructions: Computing normal probabilities

Steps: Same as Steps 1, 2 & 3 given above for computing standard normal probabilities.

From the Session Window of MINITAB, $P(X < 49) = P(Z < 1.8) = 0.964070$.

Example 5: Let X be a normally distributed random variable with mean µ = 50 and standard deviation σ = 8. Find $P(30 < X < 39)$.
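Example 5 is left unsolved in the handout. A minimal sketch of one way to work it, standardizing both limits and subtracting left-tail areas, is shown below; it assumes Python's standard-library `statistics.NormalDist` in place of the table, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0        # parameters of X from Example 5
lower, upper = 30.0, 39.0

# Standardize both limits.
z_low = (lower - mu) / sigma    # (30 - 50) / 8
z_high = (upper - mu) / sigma   # (39 - 50) / 8

# P(30 < X < 39) = P(Z < z_high) - P(Z < z_low)
Z = NormalDist()
p_via_z = Z.cdf(z_high) - Z.cdf(z_low)

# Cross-check on the original scale.
X = NormalDist(mu, sigma)
p_direct = X.cdf(upper) - X.cdf(lower)

print(f"z_low = {z_low}, z_high = {z_high}")
print(f"P(30 < X < 39) = {p_via_z:.4f}")
```

With a printed table, z_high = −1.375 falls between two columns, so a table-based answer may differ slightly in the last decimal.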

Determining Z and X values when the probability is given

• Use the table in the reverse way.
• Destandardize using X = µ + Zσ.

Example 6: The life of a calculator manufactured by CASIO is normally distributed with µ = 50 months and σ = 8 months. What should the warranty period be if the company does not want to replace more than 5% of its products?

Figure 6a: $P(X < x_{0.95}) = 0.05$, where $x_{0.95}$ denotes the 5th percentile of X ~ N(50, 64) (the subscript 0.95 is the area to the right of the value).

[Figure 6a plot: "Distribution Plot, Normal, Mean=50, StDev=8"; x-axis: X, y-axis: f(X); area 0.05 shaded below $x_{0.95}$.]

Figure 6b: $P(Z < z_{0.95}) = 0.05$, where $z_{0.95}$ denotes the 5th percentile of Z ~ N(0, 1).

[Figure 6b plot: "Distribution Plot, Normal, Mean=0, StDev=1"; x-axis: Z, y-axis: f(Z); area 0.05 shaded below $z_{0.95}$.]

From Table 3 (in Appendix B) on pages B–8 & B–9 on lmes2.ust.hk, $P(Z < -1.645) = 0.05 \Rightarrow z_{0.95} = -1.645$. Thus,

$$x_{0.95} = \mu + z_{0.95}\,\sigma = 50 + (-1.645)(8) = 50 - 13.16 = 36.84$$

That is, the warranty period should be at most about 36.84 months.

MINITAB Instructions: Computing normal percentiles

Step 1: Click on the Calc menu and select Probability Distributions, followed by Normal….

Step 2: Once the Normal Distribution window pops up,

• select Inverse cumulative probability
• enter the values of µ and σ next to “Mean:” and “Standard deviation:”, respectively
• select “Input constant:” and enter the value of the cumulative probability next to it.

Step 3: Click on the OK button.

From the Session Window of MINITAB, $x_{0.95} = 36.8412$.
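The reverse lookup and destandardization in Example 6 can also be sketched in Python. The code assumes the standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading the table in reverse; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0   # calculator life in months (Example 6)
left_area = 0.05        # at most 5% of products should fail within the warranty

# Reverse lookup on the standard normal: the z value with 5% of the area below it.
z_5th = NormalDist().inv_cdf(left_area)

# Destandardize: X = mu + Z * sigma.
x_5th_via_z = mu + z_5th * sigma

# Same percentile computed directly on the original scale.
x_5th_direct = NormalDist(mu, sigma).inv_cdf(left_area)

print(f"z = {z_5th:.5f}")
print(f"warranty cutoff = {x_5th_via_z:.4f} months")
```

The small gap between 36.84 from the table and the software value comes from rounding z to −1.645 in the table-based calculation.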

Example 7: At a certain university, the SAT scores on the verbal portion of the first-year students are normally distributed with mean 520 and standard deviation 40.

a. Find the proportion of first-year students whose SAT scores on the verbal portion are between 500 and 650.
b. How high must a verbal test score be in order to be among the highest 5% of test scores?
c. If 5 first-year students are randomly selected, what is the probability that there will be 3 students whose scores are between 500 and 650?
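Example 7 is stated without a solution. The sketch below shows one possible approach in Python, using the standard-library `statistics.NormalDist` and `math.comb`. It assumes that part c means exactly 3 of the 5 students and that the students' scores are independent, so the count follows a binomial distribution; both assumptions are mine, not the handout's.

```python
from math import comb
from statistics import NormalDist

sat = NormalDist(mu=520.0, sigma=40.0)   # verbal SAT scores (Example 7)

# Part a: proportion of scores between 500 and 650.
p_between = sat.cdf(650.0) - sat.cdf(500.0)

# Part b: score with 95% of the distribution below it (top 5% cutoff).
cutoff_top_5 = sat.inv_cdf(0.95)

# Part c (assumption: "3 students" means exactly 3, scores independent):
# binomial probability with n = 5 trials and success probability p_between.
n, k = 5, 3
p_exactly_3 = comb(n, k) * p_between ** k * (1.0 - p_between) ** (n - k)

print(f"a. P(500 < X < 650) = {p_between:.4f}")
print(f"b. top-5% cutoff    = {cutoff_top_5:.1f}")
print(f"c. P(exactly 3 of 5) = {p_exactly_3:.4f}")
```

Part c links this topic back to the binomial distribution: the normal model supplies the success probability, and the binomial formula counts the successes.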

Check for normality

• Construct a dotplot/stem-and-leaf/histogram of the variable to see whether the data are normally distributed.
• Apply the empirical rule:
  o Compute $(\bar{x} - 1s,\ \bar{x} + 1s)$, $(\bar{x} - 2s,\ \bar{x} + 2s)$, $(\bar{x} - 3s,\ \bar{x} + 3s)$.
  o Compute the (actual) proportions of data points that fall within each of the above 3 intervals.
  o Compare the computed proportions with the theoretical proportions: 68%, 95%, and nearly 100% (about 99.7%), respectively.
• Construct the normal quantile (Q-Q) plot.
• Conduct normality tests.

A normal quantile (Q-Q) plot is a graph designed to show whether a normal model is a reasonable description of the variation in the data. The basic idea behind the normal quantile plot is to compare the data values with the values one would expect from a standard normal distribution. The comparison is based on the idea of quantiles.

Example 8:

0.0 –0.3 –0.1 –0.5 –0.4 2.8 2.6 –1.3 0.5 2.6

To construct a normal quantile plot, do the following:

1. Sort the data in ascending order (see Column III of the table below).

2. Determine which quantile each data value represents. In this example, the smallest of the 10 values represents the smallest 10% of the data. We will consider this data value to lie halfway between 0% and 10% (the middle of the lowest 10%). In general, the computation $\frac{i - 0.5}{n}$ gives the desired value of the position (expressed as a decimal), since that is halfway between $\frac{i-1}{n}$ and $\frac{i}{n}$ (see Column IV of the table below).

3. Compute the “theoretical” quantile of the standard normal distribution, $z^*$ or $z_{(n - i + 0.5)/n}$, that corresponds to the proportion computed in Column IV (see Column V of the table below). For example, to obtain the theoretical quantile for the 1st row of the table, we need to know what value in the standard normal distribution has approximately 5% of the distribution below it. So we search for something close to 0.05 in the body of the standard normal table, and see that it lies roughly halfway between –1.64 and –1.65 (let’s fix it at –1.645).

A computer can get the value more accurately and indicates that it is –1.64485. (MINITAB will give you this value if you type “invcdf 0.05” next to the “MTB >” prompt in the session window, or if you use the menu under Calc > Probability Distributions > Normal….)

A normal quantile plot is then constructed by plotting the values under Column III ($x_{(i)}$) against the values under Column V ($z^*$ or $z_{(n - i + 0.5)/n}$).

If the data came perfectly from a standard normal distribution, Columns III and V of the table below would be identical (the theoretical quantile and the data value would match). This means that all the points would fall along the straight line y = x. Since other normal distributions are just linear transformations of the standard normal distribution ($x = \mu + z\sigma$), perfect data from a normal distribution with mean $\mu$ and standard deviation $\sigma$ would give a line with slope $\sigma$ and intercept $\mu$.

We use normal quantile plots to assess the plausibility that a data set is a sample from a normally distributed population. If the resulting plot is approximately linear, then it is plausible that the data come from a normal distribution. Otherwise (if the plot is markedly nonlinear), it is doubtful that the data come from a normal distribution. Of course, this works much better for large data sets than for small data sets.

| I: Position $i$ | II: Data Value $x_i$ | III: Sample Quantile (Sorted Data Value) $x_{(i)}$ | IV: Proportion below $x_{(i)}$: $\frac{i - 0.5}{n}$ | V: Theoretical Quantile $z^*$ or $z_{(n - i + 0.5)/n}$ |
|---|---|---|---|---|
| 1 | 0.0 | –1.3 | 0.05 | –1.64485 |
| 2 | –0.3 | –0.5 | 0.15 | –1.03643 |
| 3 | –0.1 | –0.4 | 0.25 | –0.67449 |
| 4 | –0.5 | –0.3 | 0.35 | –0.38532 |
| 5 | –0.4 | –0.1 | 0.45 | –0.12566 |
| 6 | 2.8 | 0.0 | 0.55 | 0.12566 |
| 7 | 2.6 | 0.5 | 0.65 | 0.38532 |
| 8 | –1.3 | 2.6 | 0.75 | 0.67449 |
| 9 | 0.5 | 2.6 | 0.85 | 1.03643 |
| 10 | 2.6 | 2.8 | 0.95 | 1.64485 |

[Normal Q-Q Plot; x-axis: Sample Quantiles, y-axis: Theoretical Quantiles.]
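The three construction steps for the normal quantile plot can be sketched in Python as follows. The code reproduces Columns III to V for the Example 8 data; it assumes the standard-library `statistics.NormalDist` for the theoretical quantiles (in place of MINITAB's `invcdf`), and it prints the table rather than drawing the plot.

```python
from statistics import NormalDist

# Example 8 data, in the original (unsorted) order.
data = [0.0, -0.3, -0.1, -0.5, -0.4, 2.8, 2.6, -1.3, 0.5, 2.6]
n = len(data)

# Step 1: sort the data (Column III).
sorted_data = sorted(data)

# Step 2: proportion below each sorted value, (i - 0.5) / n (Column IV).
proportions = [(i - 0.5) / n for i in range(1, n + 1)]

# Step 3: standard normal quantile for each proportion (Column V).
Z = NormalDist()
theoretical = [Z.inv_cdf(p) for p in proportions]

print(" i   x_(i)  prop   z*")
for i, (x, p, z) in enumerate(zip(sorted_data, proportions, theoretical), start=1):
    print(f"{i:2d}  {x:5.1f}  {p:.2f}  {z:9.5f}")
```

Plotting `sorted_data` against `theoretical` with any plotting tool then gives the Q-Q plot; points far from a straight line suggest non-normal data.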

Exponential Distribution

A continuous random variable X is exponentially distributed if its probability density function is given by

$$f(x) = \frac{1}{\mu}\, e^{-x/\mu} \quad \text{for } x \ge 0$$

where e = 2.71828… and µ is the mean of the exponential random variable. It can be shown that the mean of an exponential random variable X is equal to its standard deviation, i.e., $E[X] = \sqrt{V[X]} = \sigma_X$.

Figure 7: Exponential distributions with $\mu_X$ = 0.5, 1.0, and 2.0.

[Figure 7 plot: "Probability Density Function of Exponential Random Variable X"; x-axis: X, y-axis: f(X); one curve each for Mean = 0.5, Mean = 1, and Mean = 2.]

Probabilities associated with an exponential random variable X:

a. $P(X > x^*) = e^{-x^*/\mu}$

b. $P(X < x^*) = 1 - e^{-x^*/\mu}$

c. $P(x_1^* < X < x_2^*) = P(X < x_2^*) - P(X < x_1^*) = e^{-x_1^*/\mu} - e^{-x_2^*/\mu}$

Note that if the number of arrivals follows a Poisson distribution, the times between arrivals follow an exponential distribution.

Example 9: Toll booths on the New York State Thruway are often congested because of the large number of cars waiting to pay. A consultant working for the state concluded that if service times are measured from the time a car stops in line until it leaves, service times are exponentially distributed with a mean of 2.7 minutes.

a. What is the probability that a car will take more than 2 minutes to get through the toll booth?
b. What is the probability that a car will take less than 3 minutes to get through the toll booth?
c. What is the probability that a car will take at least 2 but no more than 4 minutes to get through the toll booth?
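Example 9 is left as an exercise. A minimal Python sketch applying the three exponential probability formulas (a, b, c) from the preceding section is given below; the function name `p_greater` is an illustrative assumption.

```python
from math import exp

mean_minutes = 2.7   # mean service time from Example 9


def p_greater(x, mu=mean_minutes):
    """P(X > x) = exp(-x / mu) for an exponential random variable with mean mu."""
    return exp(-x / mu)


p_more_than_2 = p_greater(2.0)               # part a
p_less_than_3 = 1.0 - p_greater(3.0)         # part b
p_2_to_4 = p_greater(2.0) - p_greater(4.0)   # part c

print(f"a. P(X > 2)     = {p_more_than_2:.4f}")
print(f"b. P(X < 3)     = {p_less_than_3:.4f}")
print(f"c. P(2 < X < 4) = {p_2_to_4:.4f}")
```

Since X is continuous, "at least 2 but no more than 4" and "between 2 and 4" give the same probability.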

Population vs. Sample

Parameter: a numerical measure (or characteristic) of the population, $\theta$: $\mu$, $\sigma^2$, $p$.

Statistic: a numerical measure (or characteristic) of a sample, $\hat{\theta}$: $\bar{X}$, $S^2$, $\hat{p}$.

Sampling error: the absolute difference between the parameter $\theta$ and its statistic $\hat{\theta}$, that is, $|\hat{\theta} - \theta|$.

Sampling distribution: the probability distribution of a statistic.

Standard error: the standard deviation of a statistic.

Sampling distribution of the sample mean $\bar{X}$

Suppose a random sample of size n is taken from an infinite (very large) population which has a mean $\mu$ (or $\mu_X$) and a standard deviation $\sigma$ (or $\sigma_X$). The mean of $\bar{X}$ (the average of all possible sample means) will then be $\mu$ (or $\mu_X$), and the variance of $\bar{X}$ is $\frac{\sigma^2}{n}$ (or $\frac{\sigma_X^2}{n}$). That is,

$$E(\bar{X}) = \mu_{\bar{X}} = \mu \ (\text{or } \mu_X) \quad \text{and} \quad Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \ \left(\text{or } \frac{\sigma_X^2}{n}\right)$$

Furthermore,

• if X is a normal random variable (the population from which the samples are drawn is normally distributed), then $\bar{X}$ is also a normal random variable (the probability distribution of the sample mean $\bar{X}$ is also normal) with mean $E(\bar{X}) = \mu_{\bar{X}} = \mu$ and variance $Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$;

• if X is not a normal random variable (the population from which the samples are drawn is not normally distributed), then $\bar{X}$ is approximately a normal random variable (the probability distribution of the sample mean $\bar{X}$ is approximately normal) provided n is large, according to the Central Limit Theorem. In many practical situations, a sample size of 30 or more ($n \ge 30$) may be sufficiently large to allow us to use the normal approximation for the sampling distribution of $\bar{X}$. However, if the population is extremely nonnormal (for example, bimodal or highly skewed), the sampling distribution may remain noticeably nonnormal even for moderately large values of n.

The Central Limit Theorem:

If the sample size n is sufficiently large, then the population of all possible sample means is approximately normally distributed (with mean $\mu_{\bar{X}} = \mu$ and standard deviation (standard error of the mean) $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$), no matter what probability distribution describes the sampled population. Furthermore, the larger the sample size n is, the more nearly normally distributed is the population of all possible sample means.

Example 10:

a. X ~ N(50, 64)

Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...

| Variable | N | Mean | StDev | Variance | Minimum | Q1 | Median | Q3 | Maximum |
|---|---|---|---|---|---|---|---|---|---|
| Means (n = 5) | 1000000 | 49.997 | 3.578 | 12.803 | 33.563 | 47.580 | 49.996 | 52.409 | 67.410 |
| Means (n = 10) | 1000000 | 49.998 | 2.525 | 6.373 | 36.832 | 48.292 | 49.998 | 51.701 | 62.532 |
| Means (n = 15) | 1000000 | 50.000 | 2.067 | 4.272 | 39.973 | 48.604 | 49.998 | 51.393 | 60.710 |
| Means (n = 20) | 1000000 | 50.000 | 1.791 | 3.206 | 41.550 | 48.790 | 50.001 | 51.208 | 58.101 |
| Means (n = 25) | 1000000 | 50.001 | 1.600 | 2.559 | 42.476 | 48.923 | 50.002 | 51.081 | 57.262 |
| Means (n = 30) | 1000000 | 50.003 | 1.459 | 2.130 | 42.315 | 49.017 | 50.003 | 50.987 | 56.804 |

For every row, Mode = * and N for Mode = 0.

[Histograms of Means (k = 1,000,000) for n = 5, 10, 15, 20, 25, and 30; x-axis: sample mean (50 marked), y-axis: Frequency.]

b. X ~ Unif(25, 75)

Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...

| Variable | N | Mean | StDev | Variance | Minimum | Q1 | Median | Q3 | Maximum |
|---|---|---|---|---|---|---|---|---|---|
| Means (n = 5) | 1000000 | 49.994 | 6.451 | 41.616 | 26.619 | 45.536 | 49.997 | 54.460 | 73.408 |
| Means (n = 10) | 1000000 | 49.998 | 4.561 | 20.800 | 30.057 | 46.872 | 49.994 | 53.117 | 69.436 |
| Means (n = 15) | 1000000 | 49.993 | 3.731 | 13.921 | 32.819 | 47.457 | 49.995 | 52.531 | 67.824 |
| Means (n = 20) | 1000000 | 50.002 | 3.229 | 10.427 | 35.332 | 47.812 | 50.002 | 52.198 | 64.841 |
| Means (n = 25) | 1000000 | 50.000 | 2.883 | 8.309 | 37.017 | 48.047 | 50.002 | 51.953 | 63.044 |
| Means (n = 30) | 1000000 | 49.996 | 2.636 | 6.949 | 38.557 | 48.212 | 49.996 | 51.782 | 62.576 |

For every row, Mode = * and N for Mode = 0.

[Histograms of Means (k = 1,000,000) for n = 5, 10, 15, 20, 25, and 30; x-axis: sample mean (50 marked), y-axis: Frequency.]

c. X ~ exp(50)

[Histograms of Means (k = 1,000,000) for n = 5, 10, 15, ..., 80 (in steps of 5); x-axis: sample mean (50 marked), y-axis: Frequency.]

Descriptive Statistics: Means (n = 5, Means (n = 1, Means (n = 1, ...

| Variable | N | Mean | StDev | Variance | Minimum | Q1 | Median | Q3 | Maximum |
|---|---|---|---|---|---|---|---|---|---|
| Means (n = 5) | 1000000 | 50.030 | 22.350 | 499.538 | 2.441 | 33.723 | 46.738 | 62.776 | 222.738 |
| Means (n = 10) | 1000000 | 50.016 | 15.834 | 250.717 | 5.632 | 38.622 | 48.375 | 59.601 | 164.371 |
| Means (n = 15) | 1000000 | 49.997 | 12.914 | 166.765 | 8.239 | 40.773 | 48.892 | 58.009 | 143.829 |
| Means (n = 20) | 1000000 | 50.005 | 11.194 | 125.304 | 13.661 | 42.074 | 49.162 | 57.033 | 124.532 |
| Means (n = 25) | 1000000 | 50.013 | 10.000 | 100.002 | 15.563 | 42.950 | 49.356 | 56.353 | 120.890 |
| Means (n = 30) | 1000000 | 49.989 | 9.130 | 83.355 | 18.710 | 43.573 | 49.428 | 55.808 | 111.209 |
| Means (n = 35) | 1000000 | 50.004 | 8.450 | 71.402 | 20.072 | 44.075 | 49.529 | 55.417 | 105.860 |
| Means (n = 40) | 1000000 | 49.993 | 7.902 | 62.447 | 21.147 | 44.459 | 49.571 | 55.064 | 94.210 |
| Means (n = 45) | 1000000 | 50.005 | 7.460 | 55.647 | 21.845 | 44.800 | 49.635 | 54.813 | 96.548 |
| Means (n = 50) | 1000000 | 50.009 | 7.071 | 49.998 | 22.887 | 45.074 | 49.683 | 54.586 | 91.011 |
| Means (n = 55) | 1000000 | 49.999 | 6.736 | 45.376 | 24.549 | 45.302 | 49.698 | 54.363 | 93.361 |
| Means (n = 60) | 1000000 | 49.991 | 6.457 | 41.688 | 23.093 | 45.502 | 49.709 | 54.173 | 87.456 |
| Means (n = 65) | 1000000 | 50.000 | 6.199 | 38.433 | 24.911 | 45.686 | 49.742 | 54.022 | 84.908 |
| Means (n = 70) | 1000000 | 49.992 | 5.980 | 35.762 | 27.437 | 45.835 | 49.745 | 53.890 | 83.259 |
| Means (n = 75) | 1000000 | 49.999 | 5.775 | 33.351 | 25.150 | 45.989 | 49.769 | 53.764 | 81.664 |
| Means (n = 80) | 1000000 | 50.000 | 5.583 | 31.171 | 27.228 | 46.132 | 49.793 | 53.646 | 82.805 |
| Means (n = 85) | 1000000 | 50.013 | 5.427 | 29.454 | 28.619 | 46.251 | 49.820 | 53.567 | 84.893 |
| Means (n = 90) | 1000000 | 50.006 | 5.268 | 27.749 | 28.847 | 46.355 | 49.826 | 53.458 | 81.011 |
| Means (n = 95) | 1000000 | 50.000 | 5.121 | 26.225 | 27.900 | 46.462 | 49.829 | 53.349 | 79.137 |
| Means (n = 100) | 1000000 | 49.998 | 5.003 | 25.031 | 27.543 | 46.537 | 49.827 | 53.275 | 77.490 |

For every row, Mode = * and N for Mode = 0.

[Histograms of Means (k = 1,000,000) for n = 85, 90, 95, and 100; x-axis: sample mean (50 marked), y-axis: Frequency.]
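A small-scale version of the Example 10 simulation can be run in Python to see the Central Limit Theorem at work. The sketch below draws sample means from an exponential population with mean 50, as in part c, and compares their standard deviation with σ/√n. It is an illustration only: it uses 10,000 samples per n instead of the handout's 1,000,000, an arbitrary seed, and Python's `random` module rather than MINITAB.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(2500)   # arbitrary seed, chosen only to make the run repeatable

POP_MEAN = 50.0     # exponential population mean; its sd is also 50
K = 10_000          # samples per sample size (the handout used 1,000,000)


def simulate_sample_means(n, k=K):
    """Return k sample means, each computed from n exponential draws."""
    rate = 1.0 / POP_MEAN   # random.expovariate expects the rate, 1 / mean
    return [fmean(random.expovariate(rate) for _ in range(n)) for _ in range(k)]


results = {}
for n in (5, 30, 100):
    means = simulate_sample_means(n)
    theory_se = POP_MEAN / sqrt(n)   # sigma / sqrt(n)
    results[n] = (fmean(means), pstdev(means), theory_se)
    print(f"n = {n:3d}: mean of means = {results[n][0]:.3f}, "
          f"sd of means = {results[n][1]:.3f}, sigma/sqrt(n) = {theory_se:.3f}")
```

The simulated standard deviations should land close to σ/√n, in line with the StDev column of the table for part c (for example, 50/√25 = 10.000 for n = 25).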

Example 11: An automatic machine in a manufacturing process is operating properly if the lengths of an important subcomponent are normally distributed with mean 117 and standard deviation 5.2 (in centimeters).

a. Find the probability that one randomly selected subcomponent is longer than 120 cm.
b. Find the sampling distribution of the sample mean from a random sample of size 4.
c. Find the probability that if four subcomponents are randomly selected, their mean length exceeds 120 cm.
d. Find the probability that if four subcomponents are randomly selected, all four have lengths that exceed 120 cm.
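Example 11 is posed without a solution. One way to approach it is sketched below in Python with the standard-library `statistics.NormalDist`; the key contrast is between the distribution of a single length and the distribution of the mean of four lengths. Part d assumes the four lengths are independent, which is my reading of "randomly selected".

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 117.0, 5.2, 4   # subcomponent lengths in cm, sample size 4

# Part a: a single subcomponent, X ~ N(117, 5.2^2).
single = NormalDist(mu, sigma)
p_one_longer = 1.0 - single.cdf(120.0)

# Part b: the sample mean of 4 is normal with the same mean and
# standard error sigma / sqrt(n) = 5.2 / 2 = 2.6.
xbar = NormalDist(mu, sigma / sqrt(n))

# Part c: probability that the mean of four lengths exceeds 120 cm.
p_mean_longer = 1.0 - xbar.cdf(120.0)

# Part d (assumes independence): all four individual lengths exceed 120 cm.
p_all_four_longer = p_one_longer ** n

print(f"a. P(X > 120)        = {p_one_longer:.4f}")
print(f"b. X-bar ~ N({xbar.mean}, {xbar.stdev:.1f}^2)")
print(f"c. P(X-bar > 120)    = {p_mean_longer:.4f}")
print(f"d. P(all four > 120) = {p_all_four_longer:.4f}")
```

Parts c and d differ because a mean above 120 cm does not require every individual length to be above 120 cm.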

Example 12: The restaurant in a large commercial building provides coffee for the building’s occupants. The restaurateur has determined that the mean number of cups of coffee consumed in a day by all the occupants is 2.0 with a standard deviation of 0.6. A new tenant of the building intends to have a total of 125 new employees. What is the probability that the new employees will consume more than 240 cups per day?
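Example 12 can be attacked with the Central Limit Theorem by converting the total into a sample mean. The Python sketch below reads 2.0 and 0.6 as the mean and standard deviation of cups per person per day and treats the 125 new employees as a random sample from the building's occupants; both readings are assumptions on my part. It uses the standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 2.0, 0.6      # cups per person per day (assumed per-person figures)
n = 125                   # number of new employees
total_threshold = 240.0   # total cups per day

# A total above 240 cups is the same event as a sample mean above 240 / 125.
mean_threshold = total_threshold / n

# By the Central Limit Theorem, X-bar is approximately N(mu, sigma^2 / n).
standard_error = sigma / sqrt(n)
z = (mean_threshold - mu) / standard_error

p_more_than_240 = 1.0 - NormalDist().cdf(z)

print(f"mean threshold = {mean_threshold:.2f} cups per person")
print(f"standard error = {standard_error:.4f}, z = {z:.3f}")
print(f"P(total > 240) = {p_more_than_240:.4f}")
```

With n = 125 the normal approximation for the sample mean is reasonable even though the distribution of individual consumption is not specified.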
