Business Statistics for Managerial Decision Probability Theory


Page 1: Business Statistics for Managerial Decision Probability Theory

Business Statistics for Managerial Decision

Probability Theory

Page 2: Business Statistics for Managerial Decision Probability Theory

Probability Theory

The mathematics of probability can provide models to describe

the flow of traffic through a highway system, a telephone interchange, or a computer processor,

the product preferences of consumers,

the spread of epidemics or computer viruses, and

the rate of return on risky investments.

We are interested in probability because of its usefulness in statistics.

Page 3: Business Statistics for Managerial Decision Probability Theory

General Probability Rules

Rule 1: $0 \le P(A) \le 1$ for any event A.

Rule 2: $P(S) = 1$, where S is the sample space.

Rule 3 (complement rule): for any event A, $P(A^c) = 1 - P(A)$.

Rule 4 (addition rule): if A and B are disjoint events, then $P(A \text{ or } B) = P(A) + P(B)$.

The general addition rule: for any events A and B, $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$.

Page 4: Business Statistics for Managerial Decision Probability Theory

Independence and the Multiplication Rule

Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs. If A and B are independent,

P(A and B) = P(A) × P(B)

Page 5: Business Statistics for Managerial Decision Probability Theory

Conditional Probability

The following table contains counts (in thousands) of persons aged 16-24 who are enrolled in school, classified by gender and employment status.

          Employed  Unemployed  Not in labor force  Total
Male          3927         520                4611   9058
Female        4313         446                4357   9116
Total         8240         966                8968  18174

Page 6: Business Statistics for Managerial Decision Probability Theory

Conditional Probability

Randomly choose a person aged 16 to 24 who is enrolled in school. What is the probability that the person is employed?

Now we are told that the person chosen is female. What is the probability that this person is employed? This is a conditional probability.

The conditional probability above gives the probability of one event (the person chosen is employed) under the condition that we know another event (the person is female).
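As a rough illustration, both probabilities asked for above can be read straight off the table of counts. The sketch below is not part of the original slides; the dictionary layout and variable names are illustrative assumptions.

```python
# Counts (in thousands) copied from the table of enrolled persons aged 16-24.
counts = {
    "Male":   {"Employed": 3927, "Unemployed": 520, "Not in labor force": 4611},
    "Female": {"Employed": 4313, "Unemployed": 446, "Not in labor force": 4357},
}

total = sum(sum(row.values()) for row in counts.values())
employed = sum(row["Employed"] for row in counts.values())
female_total = sum(counts["Female"].values())

# Unconditional probability that a randomly chosen person is employed.
p_employed = employed / total

# Conditional probability of being employed, given that the person is female.
p_employed_given_female = counts["Female"]["Employed"] / female_total

print(f"P(employed)          = {p_employed:.4f}")
print(f"P(employed | female) = {p_employed_given_female:.4f}")
```

The conditional probability restricts attention to the Female row, so its denominator is the row total rather than the grand total.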

Page 7: Business Statistics for Managerial Decision Probability Theory

Definition of conditional probability

When P(A) > 0, the conditional probability of B given A is

$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$$

Two events A and B are independent if

$$P(B \mid A) = P(B)$$

Page 8: Business Statistics for Managerial Decision Probability Theory

Example: prosperity and education

Call a household prosperous if its income exceeds $100,000. Call the household educated if the householder completed college. Select an American household at random, and let A be the event that the selected household is prosperous and B the event that it is educated. According to the Current Population Survey, P(A) = 0.134, P(B) = 0.254, and the probability that a household is both prosperous and educated is P(A and B) = 0.080.

Page 9: Business Statistics for Managerial Decision Probability Theory

Example: prosperity and education

Draw a Venn diagram that shows the relation between the events A and B. What is the probability P(A or B) that the household selected is either prosperous or educated?

In the diagram, shade the event that the household is educated but not prosperous. What is the probability of this event?

Find the conditional probability that a household is educated, given that it is prosperous.

Are the events A and B independent? How do you know?
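One way to check answers to these questions is to apply the general addition rule and the definition of conditional probability directly to the three given probabilities. The sketch below is an illustration rather than part of the original slides; the variable names are assumptions.

```python
# Probabilities quoted in the example.
p_a = 0.134        # P(A): household is prosperous
p_b = 0.254        # P(B): household is educated
p_a_and_b = 0.080  # P(A and B): prosperous and educated

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = p_a + p_b - p_a_and_b

# Educated but not prosperous: the part of B that lies outside A.
p_b_not_a = p_b - p_a_and_b

# Conditional probability that a household is educated, given it is prosperous.
p_b_given_a = p_a_and_b / p_a

# Independence would require P(A and B) to equal P(A) * P(B).
independent = abs(p_a_and_b - p_a * p_b) < 1e-9

print(f"P(A or B)      = {p_a_or_b:.3f}")
print(f"P(B and not A) = {p_b_not_a:.3f}")
print(f"P(B | A)       = {p_b_given_a:.3f}")
print(f"P(A) * P(B)    = {p_a * p_b:.4f}  (independent: {independent})")
```

Because P(B | A) is far from P(B) = 0.254, the two events are not independent.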

Page 10: Business Statistics for Managerial Decision Probability Theory

The Binomial Distribution

A store sells 10 computers with 1-year warranties. How many will not need repair within 1 year?

A company’s human resources manager asks 100 employees if job stress is affecting their personal lives. How many will say “yes”?

In all these situations, we want a probability model for a count of successful outcomes.

Page 11: Business Statistics for Managerial Decision Probability Theory

The Binomial Setting

There is a fixed number n of observations.

The n observations are all independent. That is, knowing the result of one observation tells you nothing about the other observations.

Each observation falls into one of just two categories, which for convenience we call “success” and “failure”.

The probability of a success, call it p, is the same for each observation.

Example: tossing a coin n = 15 times.

Page 12: Business Statistics for Managerial Decision Probability Theory

Binomial Distribution

The distribution of the count X of successes in the binomial setting is the Binomial distribution with parameters n and p. The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n.

Page 13: Business Statistics for Managerial Decision Probability Theory

Binomial Probabilities

Suppose we toss a coin 20 times. Let X be the number of heads. What is the probability that X = 8?

Page 14: Business Statistics for Managerial Decision Probability Theory

Finding Binomial Probabilities: Formula

If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, 3, …, n. If k is any one of these values,

$$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k} = \frac{n!}{k!\,(n-k)!}\, p^{k} (1-p)^{n-k}$$

Page 15: Business Statistics for Managerial Decision Probability Theory

Finding Binomial Probabilities: Formula

We tossed a coin 20 times, and X is the number of heads.

What is the probability that X = 8? In this example n = ----- and p = -----. Using the binomial formula,

$$P(X = 8) = \frac{20!}{8!\,(20-8)!}\,(0.5)^{8}(1-0.5)^{20-8} = 0.1201$$
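The binomial formula translates almost directly into code. The following is a minimal sketch, assuming a fair coin (p = 0.5) as in the calculation above; the function name `binomial_pmf` is a hypothetical helper, not something from the slides.

```python
from math import comb


def binomial_pmf(n: int, p: float, k: int) -> float:
    """Return P(X = k) for X ~ Binomial(n, p) using the binomial formula."""
    # comb(n, k) is the binomial coefficient n! / (k! (n - k)!).
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 8 heads in 20 tosses of a fair coin.
p_8_heads = binomial_pmf(20, 0.5, 8)
print(f"P(X = 8) = {p_8_heads:.4f}")
```

Summing `binomial_pmf(20, 0.5, k)` over k = 0, ..., 20 should give 1, which is a quick sanity check on any implementation of the formula.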

Page 16: Business Statistics for Managerial Decision Probability Theory

Finding Binomial Probabilities: Tables

The formula given in the previous slide is practical for hand calculations when n is small.

In practice, we use either statistical packages or Table C in your Moore, McCabe, Duckworth, Sclove textbook.

Page 17: Business Statistics for Managerial Decision Probability Theory

Example: Inspecting switches

The quality engineers inspect an SRS of 10 switches from a large shipment of which 10% fail to conform to specifications. What is the probability that no more than 1 of the 10 switches in the sample fails inspection?

The count X of nonconforming switches in the sample has approximately the binomial distribution with n = ----- and p = -----.

What is the probability that exactly 4 in the sample of 10 fail to conform to specifications?
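Both questions can be checked numerically with the binomial formula. The sketch below assumes n = 10 and p = 0.1, as implied by a sample of 10 switches from a shipment with 10% nonconforming; the helper name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(n: int, p: float, k: int) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.1  # assumed from the wording of the example

# "No more than 1" means X = 0 or X = 1.
p_at_most_1 = binomial_pmf(n, p, 0) + binomial_pmf(n, p, 1)

# Exactly 4 nonconforming switches.
p_exactly_4 = binomial_pmf(n, p, 4)

print(f"P(X <= 1) = {p_at_most_1:.4f}")
print(f"P(X = 4)  = {p_exactly_4:.4f}")
```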

Page 18: Business Statistics for Managerial Decision Probability Theory

Binomial Mean and Standard Deviation

If a count X has the binomial distribution based on n observations with probability p of success, what is the average count of successes in very many repetitions of the binomial setting?

If a count X has a Binomial distribution with number of observations n and probability of success p, the mean and the standard deviation of X are

$$\mu = np$$

$$\sigma = \sqrt{np(1-p)}$$

Page 19: Business Statistics for Managerial Decision Probability Theory

Example: Inspecting switches

The count X of bad switches is Binomial with n = 10 and p = 0.1. The mean and standard deviation of this Binomial distribution are

$$\mu = np = (10)(0.1) = 1$$

$$\sigma = \sqrt{np(1-p)} = \sqrt{10(0.1)(0.9)} = \sqrt{0.9} = 0.9487$$
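The same arithmetic can be wrapped in a small helper for reuse with other values of n and p. This is only a sketch; the function name `binomial_mean_sd` is an assumption made for illustration.

```python
from math import sqrt


def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Return (mean, standard deviation) of a Binomial(n, p) count."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return mean, sd


# Switch-inspection example: n = 10 switches, p = 0.1 nonconforming.
mean_switches, sd_switches = binomial_mean_sd(10, 0.1)
print(f"mean = {mean_switches:.4f}, sd = {sd_switches:.4f}")
```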

Page 20: Business Statistics for Managerial Decision Probability Theory

The Normal Approximation to Binomial Distribution

The Binomial probability formula and tables are practical only when the number of trials n is small.

When n is large, we can use Normal probability calculations to approximate hard-to-calculate Binomial probabilities.

Page 21: Business Statistics for Managerial Decision Probability Theory

Normal Approximation for Binomial Distribution

Suppose that a count X has the Binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal,

$$N\left(np,\ \sqrt{np(1-p)}\right)$$

As a rule of thumb, we will use the Normal approximation when n and p satisfy $np \ge 10$ and $n(1-p) \ge 10$.

Page 22: Business Statistics for Managerial Decision Probability Theory

Example: Is clothes shopping frustrating?

Sample surveys show that fewer people enjoy shopping than in the past. A recent survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying new clothes, but shopping is often frustrating and time consuming.” The population that the poll wants to draw conclusions about is all U.S. residents aged 18 and over. Suppose that 60% of all adult U.S. residents would say “agree” if asked the same question.

What is the probability that 1520 or more of the sample agree?

Page 23: Business Statistics for Managerial Decision Probability Theory

Example: Is clothes shopping frustrating?

[Figure: histogram of 1000 binomial counts (n = 2500, p = 0.6) and the Normal density curve that approximates this Binomial distribution.]

$$\mu = np = (2500)(0.6) = 1500$$

$$\sigma = \sqrt{np(1-p)} = \sqrt{2500(0.6)(0.4)} = 24.49$$

$$X \sim N(1500,\ 24.49)$$

Page 24: Business Statistics for Managerial Decision Probability Theory

Example: Is clothes shopping frustrating?

What is the probability that 1520 or more of the sample agree?

$$P(X \ge 1520) = P\left(Z \ge \frac{1520 - 1500}{24.49}\right) = P(Z \ge 0.82) = 1 - 0.7939 = 0.2061$$
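To see how good the Normal approximation is here, one can compare it with the exact Binomial tail probability. The sketch below is an illustration under stated assumptions: it uses the error function for the standard Normal CDF and sums the Binomial probabilities on the log scale (via `lgamma`) to avoid overflow for n = 2500. The helper names are hypothetical. Note that the code does not round z to 0.82, so its Normal result differs slightly from a table lookup.

```python
from math import erf, exp, lgamma, log, sqrt


def normal_cdf(z: float) -> float:
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def binomial_log_pmf(n: int, p: float, k: int) -> float:
    """log P(X = k) for X ~ Binomial(n, p), computed with log-gamma."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coeff + k * log(p) + (n - k) * log(1 - p)


n, p, threshold = 2500, 0.6, 1520

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Normal approximation to P(X >= 1520), without rounding z.
z = (threshold - mu) / sigma
p_normal = 1 - normal_cdf(z)

# Exact Binomial tail: sum P(X = k) for k = 1520, ..., 2500.
p_exact = sum(exp(binomial_log_pmf(n, p, k)) for k in range(threshold, n + 1))

print(f"z = {z:.4f}")
print(f"Normal approximation: {p_normal:.4f}")
print(f"Exact Binomial tail:  {p_exact:.4f}")
```

The two results should agree to roughly two decimal places, which is typical when np and n(1 - p) are both far above 10.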

Page 25: Business Statistics for Managerial Decision Probability Theory

The Poisson Distributions

It is common to meet counts that are open ended.

A bank counts the number of automatic teller machine (ATM) customers arriving at a particular ATM between 2:00 p.m. and 4:00 p.m.

A railyard counts the number of work injuries that happen in a month.

What are the possible outcomes for these examples?

The Poisson distribution is another distribution for count random variables.

Page 26: Business Statistics for Managerial Decision Probability Theory

The Poisson setting

The number of events (call them successes) that occur in any unit of measure is independent of the number of successes that occur in any non-overlapping unit of measure.

The probability that a success will occur in a unit of measure is the same for all units of equal size and is proportional to the size of the unit.

The probability that 2 or more successes will occur in a unit approaches 0 as the size of the unit becomes smaller.

Page 27: Business Statistics for Managerial Decision Probability Theory

Poisson Distribution

The distribution of the count X of successes in the Poisson setting is the Poisson distribution with mean $\mu$. The parameter $\mu$ is the mean number of successes per unit of measure. The possible values of X are the whole numbers 0, 1, 2, 3, …. If k is any whole number 0 or greater, then

$$P(X = k) = \frac{e^{-\mu}\mu^{k}}{k!}$$

The standard deviation of the distribution is $\sqrt{\mu}$.

Page 28: Business Statistics for Managerial Decision Probability Theory

Example: Flaws in carpets

A carpet manufacturer knows that the number of flaws per square yard in a type of carpet material varies with an average of 1.6 flaws per square yard. The count X of flaws per square yard can be modeled by the Poisson distribution with $\mu = 1.6$. The unit of measure is a square yard of carpet material.

What is the probability of no more than 2 defects in a randomly chosen square yard of this material?

Page 29: Business Statistics for Managerial Decision Probability Theory

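Example: Flaws in carpets

Using $P(X = k) = \dfrac{e^{-\mu}\mu^{k}}{k!}$ with $\mu = 1.6$ and $k \le 2$:

$$
\begin{aligned}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \frac{e^{-1.6}(1.6)^{0}}{0!} + \frac{e^{-1.6}(1.6)^{1}}{1!} + \frac{e^{-1.6}(1.6)^{2}}{2!} \\
&= 0.2019 + 0.3230 + 0.2584 \\
&\approx 0.7834
\end{aligned}
$$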
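The three Poisson terms can be verified with a few lines of code. This is a minimal sketch, assuming $\mu = 1.6$ as in the carpet example; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial


def poisson_pmf(mu: float, k: int) -> float:
    """Return P(X = k) for a Poisson count with mean mu."""
    return exp(-mu) * mu**k / factorial(k)


mu = 1.6  # mean flaws per square yard

# Individual probabilities for 0, 1 and 2 flaws, then their sum.
terms = [poisson_pmf(mu, k) for k in range(3)]
p_at_most_2 = sum(terms)

for k, term in enumerate(terms):
    print(f"P(X = {k}) = {term:.4f}")
print(f"P(X <= 2) = {p_at_most_2:.4f}")
```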

Page 30: Business Statistics for Managerial Decision Probability Theory

The Role of Probability in Statistical Inference

A statistic from a random sample will take different values if we take more samples from the same population. That is, sample statistics are random variables.

The values of a statistic in many samples (its sampling distribution) have a regular pattern.

We will use the language of probability to examine the sampling distribution of a sample mean $\bar{X}$.

Page 31: Business Statistics for Managerial Decision Probability Theory

Example: Does this wine smell bad?

Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine. DMS causes “off-odors” in wine, so winemakers want to know the odor threshold, the lowest concentration of DMS that the human nose can detect. Different people have different thresholds, so we start by asking about the mean threshold $\mu$ in the population of all adults.

The number $\mu$ is a parameter that describes this population.

Page 32: Business Statistics for Managerial Decision Probability Theory

Example: Does this wine smell bad?

To estimate $\mu$, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they can identify the spiked wine.

Here are the odor thresholds (measured in micrograms of DMS per liter of wine) for 10 randomly selected subjects:

28 40 28 33 20 31 29 27 17 21

The mean threshold for these subjects is $\bar{X} = 27.4$. This sample mean is a statistic that we use to estimate the parameter $\mu$.

This $\bar{X}$ is probably not exactly equal to $\mu$. A different 10 subjects would give us a different $\bar{X}$.

Page 33: Business Statistics for Managerial Decision Probability Theory

Statistical Estimation and the Law of Large Numbers

A parameter, such as the mean threshold $\mu$ of all adults, is in practice a fixed but unknown number.

A statistic, such as the mean threshold $\bar{X}$ of a random sample of 10 adults, is a random variable.

We use $\bar{X}$ to estimate $\mu$.

An SRS should fairly represent the population, so the mean $\bar{X}$ of the sample should be somewhere near the mean $\mu$ of the population (i.e. $\bar{X}$ is an unbiased estimate of $\mu$).

Page 34: Business Statistics for Managerial Decision Probability Theory

Statistical Estimation and the Law of Large Numbers

If $\bar{X}$ is rarely exactly right and varies from sample to sample, why is it nonetheless a reasonable estimate of the population mean $\mu$?

The answer: if we keep on taking larger and larger samples, the statistic $\bar{X}$ is guaranteed to get closer and closer to the parameter $\mu$.

That is, if we can afford to keep on measuring more subjects, eventually we will estimate the mean odor threshold of all adults very accurately.

This fact is known as the law of large numbers.

Page 35: Business Statistics for Managerial Decision Probability Theory

The Law of Large Numbers

Draw independent observations at random from any population with finite mean $\mu$. As the number of observations drawn increases, the mean $\bar{X}$ of the observed values gets closer and closer to the mean $\mu$ of the population.

Page 36: Business Statistics for Managerial Decision Probability Theory

The Law of Large Numbers

In fact, the distribution of odor thresholds among all adults has mean $\mu = 25$.

As we take more observations, the sample mean $\bar{X}$ eventually approaches the mean $\mu = 25$ of the population.
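The law of large numbers can be watched in a small simulation. The sketch below is illustrative only: it assumes the thresholds are Normal with mean 25 and standard deviation 7 (the Normal shape and the value 7 are assumptions for this block), and the seed and checkpoint sizes are arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

MU = 25.0    # population mean odor threshold
SIGMA = 7.0  # assumed population standard deviation (illustrative)

checkpoints = [10, 100, 1_000, 10_000, 100_000]
running_sum = 0.0
running_means = {}

# Draw observations one at a time and record the running mean at checkpoints.
for n in range(1, checkpoints[-1] + 1):
    running_sum += rng.gauss(MU, SIGMA)
    if n in checkpoints:
        running_means[n] = running_sum / n

for n, mean in running_means.items():
    print(f"n = {n:>7}: sample mean = {mean:.3f}")
```

Early running means can wander noticeably from 25, while the values at the largest checkpoints should sit very close to it.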

Page 37: Business Statistics for Managerial Decision Probability Theory

Sampling Distributions

The law of large numbers assures us that if we measure enough subjects, the statistic $\bar{X}$ will eventually get very close to the unknown parameter $\mu$.

In our example we had a sample of 10 subjects. What can we say about $\bar{X}$ from 10 subjects as an estimate of $\mu$?

That is, what would happen if we took many samples of 10 subjects from this population?

Page 38: Business Statistics for Managerial Decision Probability Theory

Sampling Distributions

To answer this question:

Take a large number of samples of size 10 from the same population.

Calculate the sample mean $\bar{X}$ for each sample.

Make a histogram of the values of $\bar{X}$. This histogram shows how $\bar{X}$ varies in many samples.

The histogram of values of the statistic approximates the sampling distribution that we would see if we kept on sampling forever.

One reason for studying probability is that the laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.

Page 39: Business Statistics for Managerial Decision Probability Theory

The Mean and Standard Deviation of $\bar{X}$

Suppose that $\bar{X}$ is the mean of an SRS of size n drawn from a large population with mean $\mu$ and standard deviation $\sigma$. Then the mean of the sampling distribution of $\bar{X}$ is $\mu$ and its standard deviation is $\sigma/\sqrt{n}$.

Page 40: Business Statistics for Managerial Decision Probability Theory

Sampling Distribution of a Sample Mean

If a population has the $N(\mu, \sigma)$ distribution, the sample mean $\bar{X}$ of n independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.

Page 41: Business Statistics for Managerial Decision Probability Theory

Example: Estimating Odor Threshold

Adults differ in the smallest amount of DMS they can detect in wine. Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean $\mu = 25$ µg/l and standard deviation $\sigma = 7$ µg/l.

Because the population distribution is Normal, the sampling distribution of $\bar{X}$ is also Normal.

Both distributions have the same mean.

But means ($\bar{X}$) from a sample of 10 adults vary less than do measurements on individual adults. The standard deviation of $\bar{X}$ is

$$\frac{\sigma}{\sqrt{n}} = \frac{7}{\sqrt{10}} = 2.21\ \mu\text{g/l}$$
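A simulation gives a rough check on the $\sigma/\sqrt{n}$ result: draw many samples of 10 from the Normal population with mean 25 and standard deviation 7, and look at how the sample means spread out. This is a sketch under assumptions; the number of simulated samples and the seed are arbitrary choices, not values from the slides.

```python
import random
import statistics
from math import sqrt

rng = random.Random(1)  # fixed seed for repeatability

MU, SIGMA, N = 25.0, 7.0, 10  # population mean, population sd, sample size
NUM_SAMPLES = 20_000          # hypothetical number of simulated samples

# One sample mean per simulated sample of N subjects.
sample_means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

theoretical_sd = SIGMA / sqrt(N)
simulated_sd = statistics.pstdev(sample_means)

print(f"mean of sample means:  {statistics.fmean(sample_means):.3f}")
print(f"sd of sample means:    {simulated_sd:.3f}")
print(f"sigma / sqrt(n):       {theoretical_sd:.3f}")
```

The simulated standard deviation should land close to the theoretical value, well below the standard deviation of 7 for individual measurements.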

Page 42: Business Statistics for Managerial Decision Probability Theory

Example: Estimating Odor Threshold

[Figure: the distribution of single observations compared with the distribution of the mean $\bar{X}$ of 10 observations.]

Averages are less variable than individual observations.

Page 43: Business Statistics for Managerial Decision Probability Theory

Central Limit Theorem

What happens when the population distribution is not Normal?

As the sample size increases, the distribution of $\bar{X}$ changes shape: it looks less like that of the population and more like a Normal distribution.

When the sample is large enough, the distribution of $\bar{X}$ is very close to Normal.

This important fact of probability is called the central limit theorem.

Page 44: Business Statistics for Managerial Decision Probability Theory

The Central Limit Theorem in Action

The distribution of means $\bar{X}$ from a strongly non-Normal population becomes more Normal as the sample size increases.

(a) The distribution of 1 observation.

(b) The distribution of $\bar{X}$ for 2 observations.

(c) The distribution of $\bar{X}$ for 10 observations.

(d) The distribution of $\bar{X}$ for 25 observations.

Page 45: Business Statistics for Managerial Decision Probability Theory

Central Limit Theorem

Draw an SRS of size n from any population with mean $\mu$ and finite standard deviation $\sigma$. When n is large, the sampling distribution of the sample mean $\bar{X}$ is approximately Normal:

$$\bar{X} \text{ is approximately } N\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)$$

Page 46: Business Statistics for Managerial Decision Probability Theory

Example: Flaws in carpets

The number of flaws per square yard in a type of carpet material varies with mean 1.6 flaws per square yard and standard deviation 1.2 flaws per square yard. The population distribution cannot be Normal, because a count takes only whole number values. An inspector samples 200 square yards of material, records the number of flaws found in each square yard, and calculates $\bar{X}$, the mean number of flaws per square yard inspected. Use the central limit theorem to find the approximate probability that the mean number of flaws exceeds 2 per square yard.
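One possible way to carry out this calculation is to standardize the value 2 using the central limit theorem and then evaluate the upper tail of the standard Normal distribution. The sketch below is an illustration, not the slides' worked solution; it uses the complementary error function so that a very small tail probability is not lost to rounding, and the variable names are assumptions.

```python
from math import erfc, sqrt

mu = 1.6     # mean flaws per square yard
sigma = 1.2  # standard deviation of flaws per square yard
n = 200      # square yards inspected

# By the central limit theorem, the sample mean is approximately
# N(mu, sigma / sqrt(n)).
standard_error = sigma / sqrt(n)

# Standardize the value 2 flaws per square yard.
z = (2 - mu) / standard_error

# Upper-tail probability P(Z > z) for a standard Normal variable.
p_mean_exceeds_2 = 0.5 * erfc(z / sqrt(2))

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.3f}")
print(f"P(sample mean > 2) is approximately {p_mean_exceeds_2:.2e}")
```

With z well above 4, the probability is tiny: a sample mean above 2 flaws per square yard would be very surprising if the true mean were 1.6.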