Lesson5-1 Ka-fu Wong © 2004 ECON1003: Analysis of Economic Data Lesson 5: Continuous Probability Distributions



Outline

Continuous probability distributions

Features of univariate probability distribution

Features of bivariate probability distribution

Marginal density and Conditional density

Expectation, Variance, Covariance and Correlation Coefficient

Importance of normal distribution

The normal approximation to the binomial


Types of Probability Distributions

Number of random variables    Joint distribution
1                             Univariate probability distribution
2                             Bivariate probability distribution
3                             Trivariate probability distribution
…                             …
n                             Multivariate probability distribution

Probability distributions may be classified according to the number of random variables they describe.


Continuous Probability Distributions

The curve f(x) is the continuous probability distribution (or probability curve or probability density function) of the random variable X if the probability that X will be in a specified interval of numbers is the area under the curve f(x) corresponding to the interval.

Properties of f(x):

1. f(x) ≥ 0 for all x.

2. The total area under the curve of f(x) is equal to 1.


Features of a Univariate Continuous Distribution

Let X be a random variable that takes any real value in an interval between a and b. The number of possible outcomes is by definition infinite.

The main features of a probability density function f(x) are:

f(x) ≥ 0 for all x, and f(x) may be larger than 1.

The probability that X falls into a subinterval (c,d) is

$$P(X \in (c,d)) = \int_c^d f(x)\,dx$$

and lies between 0 and 1.

P(X ∈ (a,b)) = 1.

P(X = x) = 0.


The Univariate Uniform Distribution

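If c and d are numbers on the real line, the random variable X ~ U(c,d), i.e., has a univariate uniform distribution, if

$$f(x) = \begin{cases} \dfrac{1}{d-c} & \text{for } c \le x \le d \\ 0 & \text{otherwise} \end{cases}$$

The mean and standard deviation of a uniform random variable X are

$$\mu_X = \frac{c+d}{2} \quad \text{and} \quad \sigma_X = \frac{d-c}{\sqrt{12}}$$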


The Uniform Density


The Normal Probability Distribution

The random variable X ~ N(μ, σ²), i.e., has a univariate normal distribution, if for all x on the real line (−∞, +∞)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

μ and σ are the mean and standard deviation, π = 3.14159… and e = 2.71828… is the base of natural (Napierian) logarithms.


Learning exercise 4: Part-time Work on Campus

A student has been offered part-time work in a laboratory. The professor says that the work will vary from week to week. The number of hours will be between 10 and 20 with a uniform probability density function, represented as follows:

[Figure: rectangular (uniform) density over 10 to 20 hours; horizontal axis marked from 8 to 22.]

How tall is the rectangle?

What is the probability of getting less than 15 hours in a week?

Given that the student gets at least 15 hours in a week, what is the probability that more than 17.5 hours will be available?


Learning exercise 4: Part-time Work on Campus (solution)

How tall is the rectangle? (20 − 10) × h = 1, so h = 0.1.

What is the probability of getting less than 15 hours in a week? 0.1 × (15 − 10) = 0.5.

Given that the student gets at least 15 hours in a week, what is the probability that more than 17.5 hours will be available? P(hours > 17.5) = 0.1 × (20 − 17.5) = 0.25 and P(hours > 15) = 0.1 × (20 − 15) = 0.5, so the conditional probability is P(hours > 17.5)/P(hours > 15) = 0.25/0.5 = 0.5.
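The three answers can be checked with a few lines of Python. This is only a sketch: the helper name `uniform_prob` is made up for illustration and is not part of the lesson.

```python
def uniform_prob(lo, hi, c, d):
    """P(lo <= X <= hi) for X ~ U(c, d): rectangle height times overlapping width."""
    height = 1.0 / (d - c)
    width = max(0.0, min(hi, d) - max(lo, c))
    return height * width

c, d = 10.0, 20.0                                 # range of weekly hours in the exercise
height = 1.0 / (d - c)                            # height of the rectangle
p_less_15 = uniform_prob(c, 15.0, c, d)           # P(hours < 15)
p_more_17_5 = uniform_prob(17.5, d, c, d)         # P(hours > 17.5)
p_at_least_15 = uniform_prob(15.0, d, c, d)       # P(hours >= 15)
p_conditional = p_more_17_5 / p_at_least_15       # P(hours > 17.5 | hours >= 15)

print(height, p_less_15, p_conditional)
```

Because P(X = x) = 0 for a continuous variable, "at least 15" and "more than 15" have the same probability here.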


Features of a Bivariate Continuous Distribution

Let X1 and X2 be random variables that take any real values in a rectangular region (a,b,c,d), i.e., a ≤ x1 ≤ b and c ≤ x2 ≤ d. The number of possible outcomes is by definition infinite.

The main features of a probability density function f(x1,x2) are:

f(x1,x2) ≥ 0 for all (x1,x2), and f(x1,x2) may be larger than 1.

The probability that (X1,X2) falls into a rectangular region (p,q,r,s) is

$$P((X_1,X_2) \in (p,q,r,s)) = \int_p^q \int_r^s f(x_1,x_2)\,dx_2\,dx_1$$

and lies between 0 and 1.

P((X1,X2) ∈ (a,b,c,d)) = 1.

P((X1,X2) = (x1,x2)) = 0.


The Bivariate Uniform Distribution

If a, b, c and d are numbers on the real line, the random variable (X1,X2) ~ U(a,b,c,d), i.e., has a bivariate uniform distribution, if

$$f(x_1,x_2) = \begin{cases} \dfrac{1}{(b-a)(d-c)} & \text{for } a \le x_1 \le b \text{ and } c \le x_2 \le d \\ 0 & \text{otherwise} \end{cases}$$


The Marginal Density

The marginal density functions are:

$$f(x) = \int f(x,y)\,dy$$

$$f(y) = \int f(x,y)\,dx$$


The Conditional Density

The conditional density functions are:

$$f(x \mid y) = \frac{f(x,y)}{f(y)}$$

$$f(y \mid x) = \frac{f(x,y)}{f(x)}$$


The Expectation (Mean) of Continuous Probability Distribution

For a univariate probability distribution, the expectation or mean E(X) is computed by the formula:

$$E(X) = \int x\, f(x)\,dx$$

For a bivariate probability distribution, the expectation or mean E(X) is computed by the formula:

$$E(X) = \int\!\!\int x\, f(x,y)\,dx\,dy$$


Conditional Mean of Bivariate Continuous Probability Distribution

For a bivariate probability distribution, the conditional expectation or conditional mean E(X|Y) is computed by the formula:

$$E(X \mid Y = y) = \int x\, f(x \mid y)\,dx$$

Unconditional expectation or mean of X, E(X):

$$E(X) = E[E(X \mid Y)] = \int \left[ \int x\, f(x \mid y)\,dx \right] f(y)\,dy = \mu_X$$


Expectation of a linear transformed random variable

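If a and b are constants and X is a random variable, then

E(a) = a

E(bX) = bE(X)

E(a+bX) = a+bE(X)

$$\begin{aligned}
E(a+bX) &= \int (a+bx)\, f(x)\,dx \\
&= \int a\, f(x)\,dx + \int bx\, f(x)\,dx \\
&= a \int f(x)\,dx + b \int x\, f(x)\,dx \\
&= a + bE(X)
\end{aligned}$$

The last step uses the fact that the total area under f(x) is 1.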


The Variance of a Continuous Probability Distribution

For a univariate continuous probability distribution

$$V(X) = E[(X-\mu)^2]$$

If a and b are constants and X is a random variable, then

V(a) = 0

V(bX) = b²V(X)

V(a+bX) = b²V(X)
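As a rough numerical check of the expectation and variance rules, the integrals can be approximated for a uniform density. The sketch below assumes X ~ U(10, 20) and arbitrary constants a = 3, b = 2; the `integrate` helper is a simple midpoint rule written for this illustration, not a library routine.

```python
import math

def integrate(g, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

c, d = 10.0, 20.0
f = lambda x: 1.0 / (d - c)            # uniform density on [c, d]

# E(X) and V(X) as integrals against the density
mean = integrate(lambda x: x * f(x), c, d)
var = integrate(lambda x: (x - mean) ** 2 * f(x), c, d)

# Linear transformation a + bX (a and b are illustrative values)
a, b = 3.0, 2.0
mean_lin = integrate(lambda x: (a + b * x) * f(x), c, d)
var_lin = integrate(lambda x: (a + b * x - mean_lin) ** 2 * f(x), c, d)

print(mean, (c + d) / 2)               # numerical mean vs. (c+d)/2
print(math.sqrt(var), (d - c) / math.sqrt(12))
print(mean_lin, a + b * mean)          # E(a+bX) vs. a + bE(X)
print(var_lin, b ** 2 * var)           # V(a+bX) vs. b^2 V(X)
```

The constant a shifts the mean but drops out of the variance, which is why V(a+bX) depends only on b.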


The Covariance of a Bivariate Continuous Probability Distribution

$$C(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$$

Covariance measures how two random variables co-vary.

If a and b are constants and X and Y are random variables, then

C(a,b) = 0

C(a,bX) = 0

C(a+bX,Y) = bC(X,Y)


Variance of a sum of random variables

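If a and b are constants and X and Y are random variables, then

V(X+Y) = V(X) + V(Y) + 2C(X,Y)

V(aX+bY) = a²V(X) + b²V(Y) + 2abC(X,Y)

$$\begin{aligned}
V(X+Y) &= E[(X + Y - \mu_X - \mu_Y)^2] \\
&= E[((X-\mu_X) + (Y-\mu_Y))^2] \\
&= E[(X-\mu_X)^2 + (Y-\mu_Y)^2 + 2(X-\mu_X)(Y-\mu_Y)] \\
&= E[(X-\mu_X)^2] + E[(Y-\mu_Y)^2] + 2E[(X-\mu_X)(Y-\mu_Y)] \\
&= V(X) + V(Y) + 2C(X,Y)
\end{aligned}$$

$$\begin{aligned}
V(aX+bY) &= E[(aX + bY - a\mu_X - b\mu_Y)^2] \\
&= E[((aX-a\mu_X) + (bY-b\mu_Y))^2] \\
&= E[a^2(X-\mu_X)^2 + b^2(Y-\mu_Y)^2 + 2ab(X-\mu_X)(Y-\mu_Y)] \\
&= a^2E[(X-\mu_X)^2] + b^2E[(Y-\mu_Y)^2] + 2abE[(X-\mu_X)(Y-\mu_Y)] \\
&= a^2V(X) + b^2V(Y) + 2abC(X,Y)
\end{aligned}$$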
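The identity V(aX+bY) = a²V(X) + b²V(Y) + 2abC(X,Y) holds for any joint distribution, so it can be checked on a small made-up data set treated as equally likely outcomes. The numbers and the constants a, b below are hypothetical, chosen only for illustration.

```python
def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Population covariance E[(U - mu_U)(V - mu_V)] over paired observations."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((p - mu_u) * (q - mu_v) for p, q in zip(u, v)) / len(u)

def var(v):
    """Population variance, i.e. the covariance of a variable with itself."""
    return cov(v, v)

# Hypothetical paired observations (each pair equally likely)
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 1.0, 5.0, 2.0, 8.0]
a, b = 2.0, -3.0

lhs = var([a * p + b * q for p, q in zip(x, y)])               # V(aX + bY) directly
rhs = a**2 * var(x) + b**2 * var(y) + 2 * a * b * cov(x, y)    # right-hand side of the identity

print(lhs, rhs)
```

Setting a = b = 1 reduces the check to V(X+Y) = V(X) + V(Y) + 2C(X,Y).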


Correlation coefficient

The strength of the dependence between X and Y is measured by the correlation coefficient:

$$\mathrm{Corr}(X,Y) = \frac{C(X,Y)}{\sqrt{V(X)\,V(Y)}}$$


Importance of Normal Distribution

1. Describes many random processes or continuous phenomena

2. Basis for Statistical Inference


Characteristics of a Normal Probability Distribution

1. Bell-shaped and single-peaked (unimodal) at the exact center of the distribution.


Characteristics of a Normal Probability Distribution

2. Symmetrical about its mean. The arithmetic mean, median, and mode of the distribution are equal and located at the peak. Thus half the area under the curve is above the mean and half is below it.


Characteristics of a Normal Probability Distribution

3. Asymptotic. That is, the curve gets closer and closer to the X-axis but never actually touches it.


[Figure: normal density N(0, σ²), annotated: symmetric; mean = median = mode; unimodal; bell-shaped; asymptotic.]


[Figure: N(μ, σ²) densities plotted against x in three panels, (a), (b) and (c).]


Normal Distribution Probability

Probability is the area under the curve!

[Figure: normal density f(X) plotted against X, with the area between c and d shaded.]

A table may be constructed to help us find the probability.


Infinite Number of Normal Distribution Tables

Normal distributions differ by mean & standard deviation.

Each distribution would require its own table.

[Figure: normal density curves f(X) plotted against X.]


The Standard Normal Probability Distribution -- N(0,1)

The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.

It is also called the z distribution.

A z-value is the distance between a selected value, designated X, and the population mean μ, divided by the population standard deviation, σ. The formula is:

$$z = \frac{X - \mu}{\sigma}$$


Transform to Standard Normal Distribution -- A numerical example

Any normal random variable can be transformed to a standard normal random variable

        x         x-μ       (x-μ)/σ    x/σ
        0         -2        -1.4142    0
        1         -1        -0.7071    0.7071
        2         0         0          1.4142
        3         1         0.7071     2.1213
        4         2         1.4142     2.8284
Mean    2         0         0          1.4142
std     1.4142    1.4142    1          1
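The table can be reproduced with Python's standard `statistics` module. This sketch treats the five values 0 to 4 as the whole population, which is the assumption that gives the standard deviation 1.4142 shown in the table.

```python
import statistics

xs = [0, 1, 2, 3, 4]
mu = statistics.fmean(xs)
sigma = statistics.pstdev(xs)          # population standard deviation, sqrt(2)

centered = [x - mu for x in xs]                    # x - mu: shifts the mean to 0
standardized = [(x - mu) / sigma for x in xs]      # (x - mu)/sigma: mean 0, std 1
scaled = [x / sigma for x in xs]                   # x/sigma: std 1, mean mu/sigma

for row in zip(xs, centered, standardized, scaled):
    print("{:>2} {:>5.1f} {:>8.4f} {:>8.4f}".format(*row))
print("mean", round(statistics.fmean(standardized), 4), round(statistics.fmean(scaled), 4))
print("std ", round(statistics.pstdev(standardized), 4), round(statistics.pstdev(scaled), 4))
```

Subtracting the mean changes only the location; dividing by σ changes only the spread. Both steps are needed to reach mean 0 and standard deviation 1.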


The Standard Normal Probability Distribution

Any normal random variable can be transformed to a standard normal random variable

Suppose X ~ N(µ, σ²). Then Z = (X−µ)/σ ~ N(0,1).

P(X < k) = P[(X−µ)/σ < (k−µ)/σ] = P[Z < (k−µ)/σ]


Standardize the Normal Distribution

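Because we can transform any normal random variable into a standard normal random variable, we need only one table!

$$Z = \frac{X - \mu}{\sigma}$$

[Figure: a normal distribution of X with mean μ and standard deviation σ is mapped to the standardized normal distribution of Z with μ_Z = 0 and σ_Z = 1.]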


Standardizing Example

Normal distribution: μ = 5, σ = 10, X = 6.2.

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1, Z = 0.12.


Obtaining the Probability

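Standardized Normal Probability Table (Portion)

Z      .00      .01      .02
0.0    .0000    .0040    .0080
0.1    .0398    .0438    .0478
0.2    .0793    .0832    .0871
0.3    .1179    .1217    .1255

Reading row 0.1 and column .02 gives the probability 0.0478 for Z = 0.12.

[Figure: standardized normal distribution with μ_Z = 0 and σ_Z = 1; the area between 0 and Z = 0.12 is shaded (shaded area exaggerated).]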


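Example P(3.8 ≤ X ≤ 5)

Normal distribution: μ = 5, σ = 10.

$$Z = \frac{X - \mu}{\sigma} = \frac{3.8 - 5}{10} = -0.12$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between Z = −0.12 and 0 is 0.0478 (shaded area exaggerated in the figure).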


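Example P(2.9 ≤ X ≤ 7.1)

Normal distribution: μ = 5, σ = 10.

$$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -0.21$$

$$Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = 0.21$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between Z = −0.21 and Z = 0.21 is .0832 + .0832 = .1664 (shaded area exaggerated in the figure).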



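Example P(X ≥ 8)

Normal distribution: μ = 5, σ = 10.

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = 0.30$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between 0 and Z = 0.30 is .1179, so the area to the right of 0.30 is .5000 − .1179 = .3821 (shaded area exaggerated in the figure).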


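Example P(7.1 ≤ X ≤ 8)

Normal distribution: μ = 5, σ = 10.

$$Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = 0.21$$

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = 0.30$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between Z = 0.21 and Z = 0.30 is .1179 − .0832 = .0347 (shaded area exaggerated in the figure).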
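The four table lookups for the distribution with mean 5 and standard deviation 10 can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed table, so the results differ from the table values only by rounding of z.

```python
from statistics import NormalDist

X = NormalDist(mu=5, sigma=10)

p_a = X.cdf(5) - X.cdf(3.8)      # P(3.8 <= X <= 5),   table value .0478
p_b = X.cdf(7.1) - X.cdf(2.9)    # P(2.9 <= X <= 7.1), table value .1664
p_c = 1 - X.cdf(8)               # P(X >= 8),          table value .3821
p_d = X.cdf(8) - X.cdf(7.1)      # P(7.1 <= X <= 8),   table value .0347

for name, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(name, round(p, 4))
```

`cdf(k)` returns P(X < k) directly, so no explicit standardization step is needed; the library does the equivalent of the Z transformation internally.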


Normal Distribution Thinking Challenge

You work in Quality Control for GE. Light bulb life has a normal distribution with µ = 2000 hours & σ = 200 hours. What’s the probability that a bulb will last between 2000 & 2400 hours? Less than 1470 hours?


Solution P(2000 ≤ X ≤ 2400)

Normal distribution: μ = 2000, σ = 200.

$$Z = \frac{X - \mu}{\sigma} = \frac{2400 - 2000}{200} = 2.0$$

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between 0 and Z = 2.0 is .4772 (shaded area exaggerated in the figure).

P(2000 < X < 2400) = P[(2000−µ)/σ < (X−µ)/σ < (2400−µ)/σ]
= P[(X−µ)/σ < (2400−µ)/σ] − P[(X−µ)/σ < (2000−µ)/σ]
= P[(X−µ)/σ < (2400−µ)/σ] − 0.5


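Solution P(X ≤ 1470)

Normal distribution: μ = 2000, σ = 200.

$$Z = \frac{X - \mu}{\sigma} = \frac{1470 - 2000}{200} = -2.65$$

P(X < 1470) = P[(X−µ)/σ < (1470−µ)/σ]

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between Z = −2.65 and 0 is .4960, so the area to the left of −2.65 is .5000 − .4960 = .0040 (shaded area exaggerated in the figure).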
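Both light bulb answers can be confirmed without a table. A minimal sketch, again assuming Python 3.8+ and its standard `statistics.NormalDist` class:

```python
from statistics import NormalDist

life = NormalDist(mu=2000, sigma=200)          # bulb life in hours

p_between = life.cdf(2400) - life.cdf(2000)    # P(2000 <= X <= 2400)
p_short = life.cdf(1470)                       # P(X <= 1470)

print(round(p_between, 4), round(p_short, 4))
```

Since 2000 is the mean, `life.cdf(2000)` is exactly 0.5, which is the "− 0.5" step in the solution.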


Finding Z Values for Known Probabilities

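What is Z given P(Z) = 0.1217?

Standardized Normal Probability Table (Portion)

Z      .00      .01      .02
0.0    .0000    .0040    .0080
0.1    .0398    .0438    .0478
0.2    .0793    .0832    .0871
0.3    .1179    .1217    .1255

The probability .1217 sits in row 0.3 and column .01, so Z = .31.

[Figure: standardized normal distribution with μ_Z = 0 and σ_Z = 1; the area .1217 between 0 and Z = .31 is shaded (shaded area exaggerated).]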


Finding X Values for Known Probabilities

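Normal distribution: μ = 5, σ = 10. The area between the mean and the unknown X is .1217.

Standardized normal distribution: μ_Z = 0, σ_Z = 1. The area between 0 and Z = .31 is .1217.

$$X = \mu + Z\sigma = 5 + (0.31)(10) = 8.1$$

(Shaded area exaggerated in the figure.)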
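The reverse lookup (probability to Z, then Z to X) corresponds to the inverse of the cumulative distribution function. The sketch below assumes, as in the table, that .1217 is the area between the mean and Z, so the cumulative probability is 0.5 + 0.1217.

```python
from statistics import NormalDist

area_from_mean = 0.1217                            # area between 0 and Z, as in the table
z = NormalDist().inv_cdf(0.5 + area_from_mean)     # invert the cumulative probability

mu, sigma = 5, 10
x = mu + z * sigma                                 # undo the standardization

print(round(z, 2), round(x, 1))
```

The computed z is slightly below 0.31 because the table entry .1217 is itself rounded; to two decimals the answers agree with Z = .31 and X = 8.1.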


EXAMPLE 1

The bi-monthly starting salaries of recent MBA graduates follow the normal distribution with a mean of $2,000 and a standard deviation of $200. What is the z-value for a salary of $2,400?

$$z = \frac{X - \mu}{\sigma} = \frac{\$2{,}400 - \$2{,}000}{\$200} = 2.00$$


EXAMPLE 1 continued

A z-value of 2 indicates that the value of $2,400 is two standard deviations above the mean of $2,000.

What is the z-value of $1,900?

$$z = \frac{X - \mu}{\sigma} = \frac{\$1{,}900 - \$2{,}000}{\$200} = -0.50$$

A z-value of −0.50 indicates that $1,900 is 0.5 standard deviations below the mean of $2,000.


Areas Under the Normal Curve

About 68 percent of the area under the normal curve is within one standard deviation of the mean.

μ ± σ: P(μ − σ < X < μ + σ) = 0.6826

About 95 percent is within two standard deviations of the mean.

μ ± 2σ: P(μ − 2σ < X < μ + 2σ) = 0.9544

Practically all is within three standard deviations of the mean.

μ ± 3σ: P(μ − 3σ < X < μ + 3σ) = 0.9974
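These three areas do not depend on μ or σ, so they can be computed once on the standard normal distribution. A short sketch using the standard library; the small differences from the values above come from the four-decimal table.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, N(0, 1)

# P(mu - k*sigma < X < mu + k*sigma) equals P(-k < Z < k) for any normal X
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(k, round(p, 4))
```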


EXAMPLE 2

The daily water usage per person in New Providence, New Jersey is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons.

About 68 percent of those living in New Providence will use how many gallons of water?

About 68% of the daily water usage will lie between 15 and 25 gallons.


EXAMPLE 2 continued

What is the probability that a person from New Providence selected at random will use between 20 and 24 gallons per day?

$$z = \frac{X - \mu}{\sigma} = \frac{20 - 20}{5} = 0.00$$

$$z = \frac{X - \mu}{\sigma} = \frac{24 - 20}{5} = 0.80$$

P(20 < X < 24) = P[(20−20)/5 < (X−20)/5 < (24−20)/5] = P[0 < Z < 0.8]

The area under a normal curve between a z-value of 0 and a z-value of 0.80 is 0.2881. We conclude that 28.81 percent of the residents use between 20 and 24 gallons of water per day.


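How do we find P(0 < z < 0.8)?

[Figure: two standard normal curves, one with the cumulative area P(z < c) shaded and one with the area P(0 < z < c) shaded.]

With a table of P(z < c): P(0 < z < 0.8) = P(z < 0.8) − P(z < 0) = 0.7881 − 0.5 = 0.2881

With a table of P(0 < z < c): P(0 < z < 0.8) = 0.2881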


EXAMPLE 2 continued

What percent of the population use between 18 and 26 gallons of water per day?

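$$z = \frac{X - \mu}{\sigma} = \frac{18 - 20}{5} = -0.40$$

$$z = \frac{X - \mu}{\sigma} = \frac{26 - 20}{5} = 1.20$$

Suppose X ~ N(µ, σ²). Then Z = (X−µ)/σ ~ N(0,1).

P(X < k) = P[(X−µ)/σ < (k−µ)/σ]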


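How do we find P(−0.4 < z < 1.2)?

[Figure: two standard normal curves, one with the cumulative area P(z < c) shaded and one with the area P(0 < z < c) shaded.]

With a table of P(0 < z < c):

P(−0.4 < z < 1.2) = P(−0.4 < z < 0) + P(0 < z < 1.2)
= P(0 < z < 0.4) + P(0 < z < 1.2)
= 0.1554 + 0.3849 = 0.5403

P(−0.4 < z < 0) = P(0 < z < 0.4) because of the symmetry of the z distribution.

With a table of P(z < c):

P(−0.4 < z < 1.2) = P(z < 1.2) − P(z < −0.4)
= P(z < 1.2) − P(z > 0.4)
= P(z < 1.2) − [1 − P(z < 0.4)]
= 0.8849 − [1 − 0.6554] = 0.5403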
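The water usage answers from Example 2 can be verified the same way. A minimal sketch with the standard library's `NormalDist`, using the mean of 20 gallons and standard deviation of 5 gallons given in the example:

```python
from statistics import NormalDist

usage = NormalDist(mu=20, sigma=5)             # daily gallons per person

lo, hi = usage.mean - usage.stdev, usage.mean + usage.stdev   # one-sigma band: 15 to 25
p_20_24 = usage.cdf(24) - usage.cdf(20)        # P(20 < X < 24)
p_18_26 = usage.cdf(26) - usage.cdf(18)        # P(18 < X < 26)

print(lo, hi)
print(round(p_20_24, 4), round(p_18_26, 4))
```

The second probability matches both hand calculations, 0.1554 + 0.3849 and 0.8849 − [1 − 0.6554], up to table rounding.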


EXAMPLE 3

Professor Mann has determined that the scores in his statistics course are approximately normally distributed with a mean of 72 and a standard deviation of 5. He announces to the class that the top 15 percent of the scores will earn an A.

What is the lowest score a student can earn and still receive an A?


Example 3 continued

To begin let k be the score that separates an A from a B.

If 15 percent of the students score more than k, then 35 percent must score between the mean of 72 and k.

Write down the relation between k and the probability: P(X > k) = 0.15 and P(X < k) = 1 − P(X > k) = 0.85

Transform X into z: P[(X−72)/5 < (k−72)/5] = P[z < (k−72)/5] = 0.85. Writing s = (k−72)/5, P[0 < z < s] = 0.85 − 0.5 = 0.35

Find s from the table: P[0 < z < 1.04] = 0.35

Compute k: (k−72)/5 = 1.04 implies k = 77.2. Those with a score of 77.2 or more earn an A.
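The cutoff k is the 85th percentile of the score distribution, so it can also be obtained from the inverse cumulative distribution function. A sketch assuming Python 3.8+; the result is slightly below 77.2 because the table lookup rounds z up to 1.04.

```python
from statistics import NormalDist

scores = NormalDist(mu=72, sigma=5)

k = scores.inv_cdf(0.85)          # score with 85 percent of students below it
z = (k - 72) / 5                  # corresponding z-value, about 1.04

print(round(z, 2), round(k, 1))
```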


The Normal Approximation to the Binomial

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.

The normal probability distribution is generally a good approximation to the binomial probability distribution when nπ and n(1 − π) are both greater than 5, where π is the probability of success on each trial.

Why can we approximate binomial by normal? Because of the Central Limit Theorem.


The Normal Approximation continued

Recall for the binomial experiment:

There are only two mutually exclusive outcomes (success or failure) on each trial.

A binomial distribution results from counting the number of successes.

Each trial is independent.

The probability is fixed from trial to trial, and the number of trials n is also fixed.


The Normal Approximation

[Figure: a binomial distribution with the approximating normal curve.]


Continuity Correction Factor

Because the normal distribution can take all real numbers (is continuous) but the binomial distribution can only take integer values (is discrete), a normal approximation to the binomial should identify the binomial event "8" with the normal interval "(7.5, 8.5)" (and similarly for other integer values). The figure below shows that for P(X > 7) we want the magenta region which starts at 7.5.


Continuity Correction Factor

Example: If n=20 and p=.25, what is the probability that X is greater than or equal to 8?

The normal approximation without the continuity correction factor yields z = (8 − 20 × .25)/(20 × .25 × .75)^0.5 = 1.55, so P(X ≥ 8) is approximately .0606 (from the table).

The continuity correction factor requires us to use 7.5 in order to include 8, since the inequality is weak and we want the region to the right. z = (7.5 − 20 × .25)/(20 × .25 × .75)^0.5 = 1.29, so P(X ≥ 7.5) is .0985.

The exact solution from the binomial distribution function is .1018.

The continuity correction factor is important for the accuracy of the normal approximation to the binomial.

The approximation is quite good.
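The comparison can be reproduced numerically. The sketch below computes the exact binomial tail with `math.comb` and the two normal approximations with `statistics.NormalDist` (both in the Python 3.8+ standard library); the variable names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.25
mu = n * p                          # binomial mean, 5
sigma = sqrt(n * p * (1 - p))       # binomial standard deviation, sqrt(3.75)
Z = NormalDist()

# Exact P(X >= 8): sum the binomial probabilities for k = 8, ..., 20
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8, n + 1))

# Normal approximations without and with the continuity correction
no_correction = 1 - Z.cdf((8 - mu) / sigma)
with_correction = 1 - Z.cdf((7.5 - mu) / sigma)

print(round(exact, 4), round(no_correction, 4), round(with_correction, 4))
```

Starting the normal region at 7.5 instead of 8 moves the approximation from roughly .06 to roughly .10, much closer to the exact binomial value.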


EXAMPLE 4

A recent study by a marketing research firm showed that 15% of American households owned a video camera. For a sample of 200 homes, how many of the homes would you expect to have video cameras?

What is the mean?

$$\mu = n\pi = (.15)(200) = 30$$

What is the variance?

$$\sigma^2 = n\pi(1-\pi) = (30)(1-.15) = 25.5$$

What is the standard deviation?

$$\sigma = \sqrt{25.5} = 5.0498$$


EXAMPLE 4 continued

What is the probability that less than 40 homes in the sample have video cameras?

“Less than 40” means “less than or equal to 39”. We use the correction factor, so X is 39.5.

The value of z is 1.88.

$$z = \frac{X - \mu}{\sigma} = \frac{39.5 - 30.0}{5.0498} = 1.88$$


Example 4 continued

From the standard normal table, the area between 0 and 1.88 on the z scale is .4699.

So the area to the left of 1.88 is .5000 + .4699 = .9699.

The likelihood that fewer than 40 of the 200 homes have a video camera is about 97%.
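The whole of Example 4 fits in a few lines of Python. This is a sketch of the normal approximation with the continuity correction, using only the standard library; the success probability is written `p` here.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.15                    # sample size and share of homes with a video camera

mean = n * p                        # expected number of homes, 30
variance = n * p * (1 - p)          # 25.5
sd = sqrt(variance)                 # about 5.0498

z = (39.5 - mean) / sd              # continuity-corrected z for "less than 40"
prob = NormalDist().cdf(z)          # area to the left of z

print(round(mean, 1), round(variance, 1), round(sd, 4), round(z, 2), round(prob, 4))
```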


- END -
