64
Chapter 3 Discrete Random Variable and Probability Distributions Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 1

Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

  • Upload
    others

  • View
    23

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Chapter 3 Discrete Random Variable and ProbabilityDistributions

Seungchul Baek

STAT 355 Introduction to Probability and Statistics for Scientists andEngineers

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 1

Page 2: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Random Variable

Definition: A random variable is a function from a sample space S intothe real numbers. We usually denote random variables with uppercaseletters, e.g., X, Y,. . .

X : S → R.

Examples:

Experiment: toss a coin 2 times, Random variable: X = number ofheads in 2 tosses

Experiment: toss 10 dice, Random variable: X = sum of the numbers

Experiment: apply different amounts of fertilizer to corn plants,Random variable: X = yield/acre

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 2

Page 3: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Sample Space of Random Variable

In defining a random variable, we have also defined a new samplespace. For example, in the experiment of tossing a coin 2 times, theoriginal sample space is

S = {HH,HT ,TH,TT}

We define a random variable X = number of heads in 2 tosses. Thenew sample space is

X = {0, 1, 2}

The new sample space X is called the range of the random variable X .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 3

Page 4: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Type of Random Variables

A discrete random variable can take one of a countable list of distinctvalues. Its sample space has finite or countable outcomes.

A continuous random variable can take any value in an interval of the realnumber line. Its sample space has uncountable outcomes.

Examples:

Time until a projectile returns to earth.The number of times a transistor in computer memory changes state inone operation.The volume of gasoline that is lost to evaporation during the filling ofa gas tank.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 4

Page 5: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example

Consider toss a fair coin 10 times. The sample space S contains total210 = 1024 elements, which is of the form

S = {TTTTTTTTTT , . . . ,HHHHHHHHHH}

Define the random variable Y as the number of tails out of 10 trials.Remember that a random variable is a map from sample space to realnumber. For instance,

Y (TTTTTTTTTT ) = 10.

The range (all possible values) of Y is

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 5

Page 6: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Mechanical Components

An assembly consists of three mechanical components. Suppose that theprobabilities that the first, second, and third components meetspecifications are 0.90, 0.95 and 0.99, respectively. Assume the componentsare independent.

We define event Si = the i th component is within specification andSi = the i th component is not within specification, where i = 1, 2, 3,

One can calculate

P(S1S2S3) =P(S1S2S3) =. . .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 6

Page 7: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Mechanical Components

Possible outcomes for one assembly is

Let Y = Number of components within specification in a randomly chosenassembly.

Remark: We usually denote the realized value of the random variable bycorresponding lowercase letters.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 7

Page 8: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Probability Mass Functions of Discrete Variables

Definition: Let X be a discrete random variable defined on some samplespace S. The probability mass function (pmf) associated with X isdefined to be

pX (x) = P(X = x).

A pmf p(x) for a discrete random variable X satisfies the following:

0 ≤ p(x) ≤ 1, for all possible values of x .The sum of the probabilities, taken over all possible values of X , mustequal 1; i.e., ∑

all xpX (x) = 1.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 8

Page 9: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Cumulative Distribution Function

Definition: Let Y be a random variable, the cumulative distributionfunction (cdf) of Y is defined as

FY (y) = P(Y ≤ y).

FY (y) = P(Y ≤ y) is read, ‘the probability that the random variableY is less than or equal to the value y .’

Properties of cumulative distribution function

limy→−∞ FY (y) = 0, limy→∞ FY (y) = 1Right continuous, i.e., limy→a+ FY (y) = FY (a).Non-decreasing, i.e., y1 ≤ y2 =⇒ FY (y1) ≤ FY (y2)

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 9

Page 10: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Mechanical Components Revisited

In the Mechanical Components example, Y = number of components withinspecification in a randomly chosen assembly. We have the following pmf:

And the cdf is?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 10

Page 11: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: pmf plot of Mechanical Components

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0.0

0.2

0.4

0.6

0.8

y

pmf

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 11

Page 12: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: cdf plot of Mechanical Components

−1 0 1 2 3 4

0.0

0.2

0.4

0.6

0.8

1.0

y

cdf

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 12

Page 13: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example 3.9

Consider whether the next person buying a computer at a certain electronicsstore buys a laptop or a desktop model. We denote X = 1 if the customerpurchases a desktop and X = 0 if the customer purchases a laptop.Suppose that 20% of all purchases during that week select a desktop.

Specify the pmf and cdf of X .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 13

Page 14: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expected Value of a Random Variable

Definition: Let Y be a discrete random variable with pmf pY (y), theexpected value or expectation of Y is defined as

µ = E (Y ) =∑all y

ypY (y).

The expected value for a discrete random variable Y is simply a weightedaverage of the possible values of Y . Each value y is weighted by itsprobability pY (y).

In statistical applications, µ = E (Y ) is commonly called thepopulation mean.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 14

Page 15: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expectation and Mean

Expectation

µ = E (Y ) =∑all y

ypY (y)

Arithmetic mean when there are n observations, x1, . . . , xn

X̄ = 1n

n∑i=1

xi

Why do we say ‘mean’ for expectation of a random variable?

The expected value can be interpreted as the long-run average.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 15

Page 16: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expected Value: Mechanical Components

The pmf of Y in the mechanical components example is

The expected value of Y is

µ = E (Y ) =∑all y

ypY (y)

=

Interpretation: On average, we would expect 2.84 components withinspecification in a randomly chosen assembly.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 16

Page 17: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expected Value of Functions of Y

Theorem: Let Y be a discrete random variable with pmf pY (y). Let g bea real-valued function defined on the range of Y . The expected value orexpectation of g(Y ) is defined as

E{g(Y )} =∑all y

g(y)pY (y).

Interpretation: The expectation E{g(Y )} could be viewed as theweighted average of the function g(y) when evaluated at the randomvariable Y .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 17

Page 18: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Properties of Expectation

Let Y be a discrete random variable with pmf pY (y). Suppose thatg1, g2, . . . , gk are real-valued functions, and let c be a real constant. Theexpectation of Y satisfies the following properties:

1 E (c) = c.2 E (cY ) = cE (Y ).3 Linearity: E

{∑kj=1 gj(Y )

}=∑k

j=1 E{gj(Y )}.

Remark: The 2nd and 3rd rules together imply that

E

k∑

j=1cjgj(Y )

=k∑

j=1cjE{gj(Y )},

for constant cj .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 18

Page 19: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Mechanical Components

In the Mechanical Components example, Y = number of componentswithin specification in a randomly chosen assembly. We foundE (Y ) = 2.84. Suppose that the cost (in dollars) to repair the assembly isgiven by the cost function g(Y ) = (3− Y )2. What is the expected cost torepair an assembly?

Interpretation:

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 19

Page 20: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Project Management

A project manager for an engineering firm submitted bids on three projects.The following table summarizes the firm’s chances of having the threeprojects accepted.

Project A B CProb. of Acceptance 0.3 0.8 0.1

Assuming the projects are independent of one another, what is theprobability that the firm will have all three projects accepted?

What is the probability of having at least one project accepted?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 20

Page 21: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Project Management

Let Y = number of projects accepted. Fill in the following table andcalculate the expected number of projects accepted.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 21

Page 22: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Variance of a Random Variable

Definition: Let Y be a discrete random variable with pmf pY (y) andexpected value E (Y ) = µ. The variance of Y is defined as

σ2 ≡ var(Y ) ≡ E[(Y − µ)2

]=∑all y

(y − µ)2pY (y).

Warning: Variance is always non-negative!

Definition: The standard deviation of Y is the positive square root ofthe variance:

σ =√σ2 =

√var(Y ).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 22

Page 23: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Remarks on Variance (σ2)

Suppose Y is a random variable with mean µ and variance σ2.

The variance is the average squared distance from the mean µ.

The variance of a random variable Y is a measure of dispersion orscatter in the possible values for Y .

The larger (smaller) σ2 is, the more (less) spread in the possible valuesof Y about the population mean E (Y ).

Computing Formula:

var(Y ) = E [(Y − µ)2] = E [Y 2]− [E (Y )]2.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 23

Page 24: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Properties on Variance (σ2)

Suppose X and Y are random variable with finite variance. Let c be aconstant.

var(c) = 0.

var(cY ) = c2var(Y ).

Suppose X and Y are independent, then

var(X + Y ) = var(X ) + var(Y ).

Question: var(X − Y ) = var(X )− var(Y ). True or false?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 24

Page 25: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example 3.25

The pmf of the number X of DVD’s checked out was given p(X = 1) = 0.3,p(X = 2) = 0.25, p(X = 3) = 0.15, p(X = 4) = 0.05, p(X = 5) = 0.1, andp(X = 6) = 0.15.

Compute E (X ) and var(X ).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 25

Page 26: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Bernoulli Trial

Definition: Many experiments can be envisioned as consisting of asequence of “trials,” a trial is called Bernoulli trial if

1 a trial results in a “success” or “failure”2 trials are independent3 the probability of a “success” in each trial, denoted as p, 0 < p < 1, is

the same on every trial, i.e., identical.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 26

Page 27: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example

When circuit boards used in the manufacture of Blue Ray players are tested,the long-run percentage of defective boards is 5 percent.

trial =

success =

p =

Ninety-eight percent of all air traffic radar signals are correctly interpretedthe first time they are transmitted.

trial =

success =

p =

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 27

Page 28: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expectation and Variance of Bernoulli Trial

Usually, a success is recorded as “1” and a failure as “0”.

The probability of success is equal to the probability to get “1”, i.e.,success.

Let p = P(X = 1) and q = 1− p = P(X = 0).

We haveE (X ) = p

andvar(X ) = p(1− p) = pq

This random variable X is called a Bernoulli random variable.

Notation: X ∼ Bernoulli(p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 28

Page 29: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Binomial Distribution

Suppose that n Bernoulli trials are performed. Define

Y = the number of successes (out of n trials performed).

The random variable Y has a binomial distribution with number of trialsn and success probability p. It is written as Y ∼ B(n, p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 29

Page 30: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Water Filters

A manufacturer of water filters for refrigerators monitors the process fordefective filters. Historically, this process averages 5% defective filters. Fivefilters are randomly selected.

Find the probability that no filters are defective.

Find the probability that exactly 1 filter is defective.

Find the probability that exactly 2 filter is defective.

Can you see the pattern?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 30

Page 31: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

The pmf of Binomial Random Variable

Suppose Y ∼ B(n, p). The pmf of Y is given by

p(y) ={(n

y)py (1− p)n−y , for y = 0, 1, 2, . . . , n,

0, otherwise.

Recall that(n

r)is the number of ways to choose r distinct unordered

objects from n distinct objects:(nr

)= n!

r !(n − r)! .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 31

Page 32: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Expectation and Variance of Binomial Random Variable

If Y ∼ B(n, p), thenE (Y ) = npvar(Y ) = np(1− p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 32

Page 33: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example 3.31 (Using Binomial Tables)

Suppose that 20% of all copies of a particular textbook fail a certainbinding strength test. Let X denote the number among 15 randomlyselected copies that fail the test.

The probability that at most 8 fail the test is

The probability that exactly 8 fail is

The probability that between 4 and 7, inclusive, fail is

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 33

Page 34: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Binomial R function

We can use R to calculate the pmf and cdf of binomial distribution directlyby using

pY (y) = P(Y = y) = dbinom(y, n, p)

andFY (y) = P(Y ≤ y) = pbinom(y, n, p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 34

Page 35: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Radon Levels

Historically, 10% of homes in Florida have radon levels higher than thatrecommended by the EPA. Assume that radon level for each home isindependent of one another. 20 homes are randomly selected.

Find the probability that exactly 3 have radon levels higher than theEPA recommendation.

What is the probability that no more than 5 homes out of the samplehaving higher radon level than recommended?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 35

Page 36: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Radon Levels

What is the probability that at least 5 homes out of the sample havinghigher radon level than recommended?

What is the probability that 2 to 8 homes out of the sample havinghigher random level than recommended?

What is the probability that less than 2 or greater than 8 homes out ofthe sample having higher random level than recommended?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 36

Page 37: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Geometric Distribution

The geometric distribution also arises in experiments involving Bernoullitrial:

each trial results in a “success” or “failure”the trials are independentthe probability of a “success” in each trial, denoted as p, 0 < p < 1, isthe same on every trial.

Definition: Suppose that Bernoulli trials are continually observed. Define

Y = the number of trials to observe the first success.

We say that Y has a geometric distribution with success probability p.Notation: Y ∼ geom(p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 37

Page 38: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

The pmf of Geometric Distribution

If Y ∼ geom(p), then the probability mass function (pmf) of Y is given by

pY (y) ={

(1− p)y−1p, y = 1, 2, 3, . . .0, otherwise.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 38

Page 39: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Mean and Variance of Geometric Distribution

Suppose Y ∼ geom(p), the expected value of Y is

E (Y ) = 1p .

The variance of Y isvar(Y ) = 1− p

p2

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 39

Page 40: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Wafer

The probability that a wafer contains a large particle of contamination is0.01. It is assumed that the wafers are independent.

What is the probability that exactly 125 wafers need to be analyzeduntil a large particle is detected?

How many wafers need to be analyzed until a large particle is detectedon average?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 40

Page 41: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, theprobability of observing white eyes is p = 0.25. We interpret

trial: to identify the eye color of a fruit fly

success: The fly has white eyes

p = P(white eyes) = 0.25

If the Bernoulli trial assumptions hold (independent and identical), then

Y = the number of flies needed to find the first white-eyed∼ geom(p = 0.25)

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 41

Page 42: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Fruit Fly

What is the probability that the first white-eyed fly is observed on thefifth trial?

What is the probability that the first white-eyed fly is observed beforethe fourth fly is examined?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 42

Page 43: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Geometric Distribution in a Different View

We consider a random variable X as the number of failures until the firstsuccess. Then, the pmf of X is

pX (x) ={

p(1− p)x , x = 0, 1, 2, . . .0, otherwise.

Mean and Variance

E (X ) = 1− pp .

var(X ) = 1− pp2

The relationship between X and Y ?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 43

Page 44: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Geometric Distribution pmf & cdf R Code

We can use R to calculate the pmf and cdf of geometric distribution directlyby using

pY (y) = P(Y = y) = dgeom(y-1, p)

andFY (y) = P(Y ≤ y) = pgeom(y-1, p)

i.e., the first argument is “the number of failures.”

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 44

Page 45: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Negative Binomial Distribution

The negative binomial distribution also arises in experiments involvingBernoulli trial:

each trial results in a “success” or “failure”;the trials are independent;the probability of a “success” in each trial, denoted as p, 0 < p < 1, isthe same on every trial.

Definition: Suppose that Bernoulli trials are continually observed. Define

Y = the number of trials to observe the r -th success.

We say that Y has a negative binomial distribution with waitingparameter r and success probability p. Notation: Y ∼ NB(r , p).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 45

Page 46: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Negative Binomial Distribution

The negative binomial distribution is a generalization of the geometricdistribution. If r = 1, then NB(1, p) = geom(p)

If Y ∼ NB(r , p), then the probability mass function of Y is given by

pY (y) ={(y−1

r−1)pr (1− p)y−r , y = r , r + 1, r + 2, . . .

0, otherwise.

Mean and Variance:E (Y ) = r

p

var(Y ) = r(1− p)p2

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 46

Page 47: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example Revisited: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, theprobability of observing white eyes is p = 0.25. What is the probability thethird white-eyed fly is observed on the tenth fly checked?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 47

Page 48: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Negative Binomial Distribution (Textbook Version)

The alternative formulation of the NB distribution is to count the numberof failures until the r -th success.

X = the number of failures to observe the r -th success.

If X ∼ nib(r , p), then the probability mass function of X is given by

pX (x) ={(x+r−1

r−1)pr (1− p)x , x = 0, 1, 2, . . .

0, otherwise.

Mean and Variance:E (X ) = r 1− p

p

var(X ) = r (1− p)p2

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 48

Page 49: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Negative Binomial Distribution (Textbook Version)

Is there any relationship between X and Y ?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 49

Page 50: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Negative Binomial Distribution pmf & cdf R Code

We can use R to calculate the pmf and cdf of negative binomial distributiondirectly by using

pY (y) = P(Y = y) = dnbinom(y-r, r, p)

andFY (y) = P(Y ≤ y) = pnbinom(y-r, r, p)

i.e. the first argument is “the number of failures.”

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 50

Page 51: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example Re-Re-visited: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, theprobability of observing white eyes is p = 0.25. What is the probability thethird white-eyed fly is observed on the tenth fly checked? We denote X tobe the number of failures to observe third success in 10 trilas.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 51

Page 52: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Hypergeometric Distribution

Suppose we have a bowl containing N balls, r of which are red and N − r ofwhich are black. We draw n(< N) balls out (without replacement). Ourr.v. of interest Y is the number of sampled balls that are red. Then Y hasa hypergeometric distribution. Define

Y = the number of success (out of the n selected).

We say that Y has a hypergeometric distribution and writeY ∼ hyper(N, n, r). The pmf of Y is given by

p(y) =

(r

y)(N−rn−y)

(Nn) , y ≤ r and n − y ≤ N − r

0, otherwise.,

where y = 0, 1, 2, . . . , n.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 52

Page 53: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Mean and Variance of Hypergeometric Distribution

If Y ∼ hyper(N, n, r), then

E (Y ) = n( r

N)

var(Y ) = n( r

N) (N−r

N

) (N−nN−1

).

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 53

Page 54: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Parts

A batch of parts contains 100 parts from a local supplier of tubing and 200parts from a supplier of tubing in the next state. If four parts are selectedrandomly and without replacement, what is the probability they are all fromthe local supplier?

What is the probability that two or more parts in the sample are from thelocal supplier?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 54

Page 55: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Generator

If a shipment of 100 generators contains 5 faulty generators, what is theprobability that we select 10 generators from the shipment and get a faultyone?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 55

Page 56: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example 3.35

Five individuals from an animal population thought to be near extinction ina certain region have been caught, tagged, and released to mix into thepopulation. After they have had an opportunity to mix, a random sample of10 of these animals is selected. Let X be the number of tagged animals inthe second sample. Suppose there are actually 25 animals of this type in theregion.

What is the probability that exactly two of the animals in the secondare tagged?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 56

Page 57: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Poisson Distribution

The Poisson distribution is commonly used to model counts, such as

the number of customers entering a post office in a given hour

the number of α-particles discharged from a radioactive substance inone second

the number of machine breakdowns per month

the number of insurance claims received per day

the number of defects on a piece of raw material.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 57

Page 58: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Poisson Process

Let the number of occurrences in a given continuous interval of time orspace be counted. A Poisson process enjoys the following properties:

The number of occurrences in non-overlapping intervals areindependent random variables.

The probability of an occurrence in a sufficiently short interval isproportional to the length of the interval.

The probability of two or more occurrences in a sufficiently shortinterval is zero.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 58

Page 59: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Poisson Distribution

Poisson distribution can be used to model the number of events occurringin a continuous time or space.

Let λ be the average number of occurrences per a unit interval. Let Y =the number of “occurrences” over in a unit interval of time (or space).Suppose Poisson distribution is adequate to describe Y . Then, the pmf ofY is given by

pY (y) ={

λy e−λ

y ! , y = 0, 1, 2, . . .0, otherwise.

The shorthand notation is Y ∼ Poisson(λ).

Mean and Variance:E (Y ) = λ

var(Y ) = λ

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 59

Page 60: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Insulated Wire

“Historically the process has averaged 2.6 breaks in the insulation per 1000meters of wire” implies that the average number of occurrences λ = 2.6,and the base units is 1000 meters. Let X= number of breaks in 1000meters of wire. So X ∼ Poisson(2.6). We want to find the probability that1000 meters of wire will have 1 or fewer breaks in insulation.

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 60

Page 61: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example: Insulated Wire

If we were inspecting 2000 meters of wire, what distribution does Yfollow?

If we were inspecting 500 meters of wire, what distribution does Yfollow?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 61

Page 62: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example

Suppose we average 5 radioactive particles passing a counter in 1millisecond. What is the probability that exactly 3 particles will pass inone millisecond?

Suppose we average 5 radioactive particles passing a counter in 1millisecond. What is the probability that exactly 10 particles will passin three milliseconds?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 62

Page 63: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Example 3.38

Let X denote the number of traps in a particular type of metal oxidesemiconductor transistor, and suppose it has a Poisson distribution withλ = 2.

What is the probability that there are exactly three traps?

What is the probability that there are at most three traps?

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 63

Page 64: Chapter 3 Discrete Random Variable and Probability ...baek.math.umbc.edu/stat355/ch3.pdfSeungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Relationship Between Binomial and Poisson

For some fixed λ, we have p = λ/n, then for large n, b(n, p) isapproximately Poisson(λ = np).

We can show that

limn→∞

(ny

)py (1− p)n−y = e−λλy

y ! .

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 64