Copyright Reserved 1
Chapter 5
Discrete Probability Distributions
Random Variables
A random variable x is a numerical description of the outcome of an experiment.
• Discrete: If the possible values change by steps or jumps.
Example: Suppose we flip a coin 5 times and count the number of tails. The number of tails
could be 0, 1, 2, 3, 4 or 5. Therefore, it can be any integer value between (and including) 0
and 5. However, it cannot be just any number between 0 and 5. We could not, for example, get
2.5 tails. Therefore, the number of tails must be discrete.
• Continuous: If the possible values can take any value within some range.
Example: The height of trees is an example of continuous data. Is it possible for a tree to be
2.105m tall? Sure. How about 2.10567m? Yes. How about 2.105679821014m? Definitely!
Discrete Random Variables
Consider the sales of cars at a car dealership over the past 300 days.
Frequency Distribution:
Number of cars sold per day    Number of days (frequency)
0                              54
1                              117
2                              72
3                              42
4                              12
5                              3
Total                          300
• Define the random variable:
Let x = the number of cars sold during a day.
• Note: We make the assumption that no more than 5 cars are sold per day.
• Sample Space: S = {0, 1, 2, 3, 4, 5}
• Notation:
P(X = 0) = f(0) = probability of 0 cars sold
P(X = 1) = f(1) = probability of 1 car sold
P(X = 2) = f(2) = probability of 2 cars sold
P(X = 3) = f(3) = probability of 3 cars sold
P(X = 4) = f(4) = probability of 4 cars sold
P(X = 5) = f(5) = probability of 5 cars sold
• Note: f(x) = probability function
• The probability function provides the probability for each value of the random variable
• Probability distribution for the number of cars sold per day at a car dealership
x        Number of days (frequency)    f(x)
0        54                            54/300 = 0.18
1        117                           117/300 = 0.39
2        72                            72/300 = 0.24
3        42                            42/300 = 0.14
4        12                            12/300 = 0.04
5        3                             3/300 = 0.01
Total    300                           1
• Question: Does the above mentioned probability function fulfill the required conditions for a
discrete probability function?
There are two requirements:
(i) $0 \le f(x) \le 1$ for all x
(ii) $\sum f(x) = 1$
Yes, both requirements are fulfilled.
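As a minimal sketch in Python, the probability function can be derived from the frequency table and checked against the two requirements. The names `frequencies`, `f` and `is_valid_probability_function` are illustrative choices, not part of the notes.

```python
# Frequencies from the table above: cars sold per day -> number of days.
frequencies = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(frequencies.values())  # 300 days observed

# The relative frequency of each value gives the probability function f(x).
f = {x: count / total_days for x, count in frequencies.items()}


def is_valid_probability_function(pf, tol=1e-9):
    """Check (i) 0 <= f(x) <= 1 for all x and (ii) the f(x) sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in pf.values())
    sums_to_one = abs(sum(pf.values()) - 1) < tol
    return each_in_range and sums_to_one


for x, p in f.items():
    print(x, round(p, 2))
print("valid:", is_valid_probability_function(f))
```

The tolerance `tol` is there only because floating-point sums are rarely exactly 1.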
[Figure: Graphical representation of the probability distribution for the number of cars sold per day. x-axis: number of cars sold per day (0 to 5); y-axis: probability. Bar heights: 0.18, 0.39, 0.24, 0.14, 0.04, 0.01.]
Questions:
a) The probability that 2 cars are sold per day?
$P(x = 2) = f(2) = 0.24$
b) The probability that, at most, 2 cars are sold per day?
$P(x \le 2)$
$= P(x = 0) + P(x = 1) + P(x = 2)$
$= f(0) + f(1) + f(2)$
$= 0.18 + 0.39 + 0.24 = 0.81$
c) The probability that more than 2 cars are sold per day?
$P(x > 2)$
$= P(x = 3) + P(x = 4) + P(x = 5)$
$= f(3) + f(4) + f(5)$
$= 0.14 + 0.04 + 0.01 = 0.19$
d) The probability that at least 2 cars are sold per day?
$P(x \ge 2)$
$= P(x = 2) + P(x = 3) + P(x = 4) + P(x = 5)$
$= f(2) + f(3) + f(4) + f(5)$
$= 0.24 + 0.14 + 0.04 + 0.01$
$= 0.43$
e) The probability that more than 1 but less than 4 cars are sold per day?
$P(1 < x < 4)$
$= P(x = 2) + P(x = 3)$
$= f(2) + f(3)$
$= 0.24 + 0.14$
$= 0.38$
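Each of the questions above sums f(x) over the values of x that satisfy a condition. A short Python sketch of that idea follows; the helper `prob` and the dictionary `answers` are hypothetical names introduced here for illustration.

```python
# Probability function for the number of cars sold per day (from the notes).
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}


def prob(event):
    """Sum f(x) over every x for which event(x) is true."""
    return sum(p for x, p in f.items() if event(x))


answers = {
    "a) P(x = 2)": prob(lambda x: x == 2),
    "b) P(x <= 2)": prob(lambda x: x <= 2),
    "c) P(x > 2)": prob(lambda x: x > 2),
    "d) P(x >= 2)": prob(lambda x: x >= 2),
    "e) P(1 < x < 4)": prob(lambda x: 1 < x < 4),
}

for label, value in answers.items():
    print(label, "=", round(value, 2))
```

Note that b) and c) are complements, so their probabilities add up to 1.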
Discrete Uniform probability function:
$f(x) = \frac{1}{n}$
where n = the number of values the random variable may assume
Example: Dice $f(x) = \frac{1}{6}$ for x = 1, 2, 3, 4, 5, 6
x    f(x)
1    1/6
2    1/6
3    1/6
4    1/6
5    1/6
6    1/6
Does the above mentioned probability function fulfill the required conditions for a discrete probability
function?
There are two requirements:
(i) $0 \le f(x) \le 1$ for all x
(ii) $\sum f(x) = 1$
Yes, both requirements are fulfilled.
Another example of a random variable x with the following discrete probability distribution
$f(x) = \frac{x}{10}$ for x = 1, 2, 3, 4
x    f(x)
1    1/10
2    2/10
3    3/10
4    4/10
Does the above mentioned probability function fulfill the required conditions for a discrete probability
function?
There are two requirements:
(i) $0 \le f(x) \le 1$ for all x
(ii) $\sum f(x) = 1$
Yes, both requirements are fulfilled.
Expected value, variance, standard deviation and median:
Expected Value
$E(x) = \mu = \sum x f(x)$
$= (0)(0.18) + (1)(0.39) + (2)(0.24) + (3)(0.14) + (4)(0.04) + (5)(0.01)$
$= 1.5$
Variance
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
$= (0 - 1.5)^2(0.18) + (1 - 1.5)^2(0.39) + (2 - 1.5)^2(0.24) + (3 - 1.5)^2(0.14) + (4 - 1.5)^2(0.04) + (5 - 1.5)^2(0.01)$
$= 1.25$
Standard deviation
$\sigma = \sqrt{\sigma^2} = \sqrt{1.25} = 1.118$
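Median
The median is the smallest value of x at which the cumulative probability reaches at least 0.5.
$P(x \le 0) = 0.18 < 0.5$
$P(x \le 1) = 0.18 + 0.39 = 0.57 > 0.5$
Therefore, the median = 1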
OR use a table to calculate the expected value, variance and standard deviation:
x    f(x)    x f(x)    $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2 f(x)$
0    0.18    0         -1.5         2.25             0.4050
1    0.39    0.39      -0.5         0.25             0.0975
2    0.24    0.48      0.5          0.25             0.0600
3    0.14    0.42      1.5          2.25             0.3150
4    0.04    0.16      2.5          6.25             0.2500
5    0.01    0.05      3.5          12.25            0.1225
             $\mu = 1.5$                             $\sigma^2 = 1.25$
$\sigma = \sqrt{1.25} = 1.118$
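The same summary measures can be computed directly from the probability function. The sketch below assumes the car-sales distribution from the notes; the function name `median` and the rule it implements (the smallest x whose cumulative probability reaches at least 0.5) follow the median calculation shown above.

```python
import math

# Probability function for the number of cars sold per day (from the notes).
f = {0: 0.18, 1: 0.39, 2: 0.24, 3: 0.14, 4: 0.04, 5: 0.01}

# Expected value: sum of x * f(x).
mu = sum(x * p for x, p in f.items())

# Variance: sum of squared deviations from mu, weighted by f(x).
variance = sum((x - mu) ** 2 * p for x, p in f.items())

# Standard deviation: square root of the variance.
sigma = math.sqrt(variance)


def median(pf):
    """Return the smallest x whose cumulative probability is at least 0.5."""
    cumulative = 0.0
    for x in sorted(pf):
        cumulative += pf[x]
        if cumulative >= 0.5:
            return x
    return None


print(round(mu, 2), round(variance, 2), round(sigma, 3), median(f))
```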
Example:
A psychologist has determined that the number of hours required to obtain the trust of a new patient is
either 1, 2 or 3 hours.
Let x be a random variable indicating the time in hours required to gain the patient’s trust.
The following probability function has been proposed:
$f(x) = \frac{x}{6}$ for x = 1, 2, 3
Questions:
a) Set up the probability function of x.
b) Is this a valid probability function? Explain.
c) Give a graphical representation of the probability function of x.
d) What is the probability that it takes exactly 2 hours to gain the patient’s trust?
e) What is the probability that it takes at least 2 hours to gain the patient’s trust?
f) Calculate the expected value, variance and standard deviation.
Answers:
a)
x        f(x)
1        f(1) = 1/6 = 0.1667
2        f(2) = 2/6 = 0.3333
3        f(3) = 3/6 = 0.5
Total    1
b) There are two requirements:
(i) $0 \le f(x) \le 1$ for all x
(ii) $\sum f(x) = 1$
Yes, both requirements are fulfilled.
c)
d) $f(2) = \frac{2}{6} = 0.3333$
e) $P(x \ge 2) = f(2) + f(3) = \frac{2}{6} + \frac{3}{6} = \frac{5}{6} = 0.8333$
f) $\mu = \sum x f(x) = 2.3333$
$\sigma^2 = \sum (x - \mu)^2 f(x) = 0.5556$
$\sigma = \sqrt{\sigma^2} = \sqrt{0.5556} = 0.745$
x    f(x)      x f(x)    $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2 f(x)$
1    0.1667    0.1667    -1.3333      1.7778           0.2963
2    0.3333    0.6667    -0.3333      0.1111           0.0370
3    0.5       1.5000    0.6667       0.4444           0.2222
               2.3333                                  0.5556
[Figure for c): Graphical representation of the probability distribution. x-axis: x (1, 2, 3); y-axis: probability (0 to 0.6).]
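The answers to the psychologist example can be reproduced with exact fractions, which avoids the rounding in the table. This is only a sketch; the variable names are illustrative and `fractions.Fraction` is used purely as a convenience.

```python
import math
from fractions import Fraction

# Proposed probability function f(x) = x/6 for x = 1, 2, 3.
f = {x: Fraction(x, 6) for x in (1, 2, 3)}

# b) Validity: every f(x) lies in [0, 1] and the values sum to exactly 1.
is_valid = all(0 <= p <= 1 for p in f.values()) and sum(f.values()) == 1

# d) Exactly 2 hours, and e) at least 2 hours.
p_exactly_2 = f[2]
p_at_least_2 = f[2] + f[3]

# f) Expected value, variance and standard deviation.
mu = sum(x * p for x, p in f.items())
variance = sum((x - mu) ** 2 * p for x, p in f.items())
sigma = math.sqrt(variance)

print(is_valid, p_exactly_2, p_at_least_2, mu, variance, round(sigma, 3))
```

In exact form the mean is 7/3 and the variance is 5/9, which round to the 2.3333 and 0.5556 given in the answers.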
Binomial distribution
1. The experiment consists of a sequence of n identical trials.
2. Two outcomes are possible on each trial. We refer to them as:
• Success
• Failure
3. The probability of a success, denoted by p does not change from trial to trial. Consequently, the
probability of a failure, denoted by 1 – p, does not change from trial to trial.
4. The trials are independent
In general:
• Let: x = number of successes
• Then x has a binomial distribution with n trials and probability of success p.
The Binomial probability function is:
$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Martin clothing store problem:
Let us consider the purchase decisions of the next 3 customers who enter the Martin clothing store. On the
basis of past experience, the store manager estimates the probability that any one customer will make a
purchase is 0.3.
Let: S = customer makes a purchase (success)
F = customer does not make a purchase (failure)
The above mentioned is a Binomial experiment, because:
1. n = 3 identical trials
2. Two possible outcomes
• customer makes a purchase (success)
• customer does not make a purchase (failure)
3. Probability of a success p = 0.3 and a failure 1 – p = 0.7
4. The trials are independent
Let x = number of customers that make a purchase
OR
x = number of successes
Tree Diagram:
[Tree diagram not reproduced: three levels (1st, 2nd and 3rd customer), each branching into S and F.]
Outcome      Value of x
(S, S, S)    3
(S, S, F)    2
(S, F, S)    2
(S, F, F)    1
(F, S, S)    2
(F, S, F)    1
(F, F, S)    1
(F, F, F)    0
Total number of experimental outcomes:
Using the tree diagram we count 8 experimental outcomes.
Using the counting rule for multiple-step experiments we get (n1)(n2)(n3) = (2)(2)(2) = 8.
Since the binomial distribution only has two possible outcomes on each step (success or failure), we can
use the formula $2^n$, which in this case equals $2^3 = 8$, where n denotes the number of trials in the binomial
experiment.
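The counting argument above can be checked by enumerating every sequence of S and F. The following Python sketch uses `itertools.product` for the enumeration; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

n = 3  # number of trials (customers)

# Every experimental outcome is a sequence of n letters, each S or F.
outcomes = list(product("SF", repeat=n))

# The value of x for an outcome is the number of successes it contains.
x_values = {outcome: outcome.count("S") for outcome in outcomes}

# How many outcomes lead to each value of x.
counts = Counter(x_values.values())

print(len(outcomes))  # total number of experimental outcomes, 2**n
for outcome, x in x_values.items():
    print(outcome, x)
print(dict(counts))
```

The enumeration order matches the tree diagram: (S, S, S) first and (F, F, F) last.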
Calculating binomial probabilities: $f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Question 1: Calculate the probability that 2 out of the 3 customers make a purchase.
Answer 1:
$f(2) = \binom{3}{2} (0.3)^2 (0.7)^1 = (3)(0.09)(0.7) = 0.189$
Question 2: Calculate the probability that 1 out of the 3 customers makes a purchase.
Answer 2:
$f(1) = \binom{3}{1} (0.3)^1 (0.7)^2 = (3)(0.3)(0.49) = 0.441$
Question 3: Calculate the probability that 3 out of the 3 customers make a purchase.
Answer 3:
$f(3) = \binom{3}{3} (0.3)^3 (0.7)^{3-3} = (1)(0.027)(1) = 0.027$
Question 4: Calculate the probability that 0 out of the 3 customers make a purchase.
Answer 4:
$f(0) = \binom{3}{0} (0.3)^0 (0.7)^{3-0} = (1)(1)(0.343) = 0.343$
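A minimal Python sketch of the binomial probability function, applied to the Martin clothing store problem. The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the binomial coefficient.

```python
from math import comb


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 3, 0.3  # 3 customers, purchase probability 0.3

# Probability distribution of x = number of customers making a purchase.
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, fx in distribution.items():
    print(x, round(fx, 3))
print("sum:", round(sum(distribution.values()), 10))
```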
The probability distribution for the number of customers making a purchase:
x f(x)
0 0.343
1 0.441
2 0.189
3 0.027
1
Does it fulfill the basic requirements for a discrete probability function?
There are two requirements:
(i) $0 \le f(x) \le 1$ for all x
(ii) $\sum f(x) = 1$
Yes, both requirements are fulfilled.
Calculate the expected value, variance and standard deviation of x:
x    f(x)     x f(x)    $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2 f(x)$
0    0.343    0         -0.9         0.81             0.27783
1    0.441    0.441     0.1          0.01             0.00441
2    0.189    0.378     1.1          1.21             0.22869
3    0.027    0.081     2.1          4.41             0.11907
              0.9                                     0.63
$\mu = \sum x f(x) = 0.9$
$\sigma^2 = \sum (x - \mu)^2 f(x) = 0.63$
$\sigma = \sqrt{\sigma^2} = \sqrt{0.63} = 0.794$
Formulas of $\mu$ and $\sigma^2$ for the Binomial Probability Distribution:
$E(x) = \mu = np$
$\mathrm{Var}(x) = \sigma^2 = np(1 - p)$
Test:
$E(x) = \mu = np = (3)(0.3) = 0.9$
$\mathrm{Var}(x) = \sigma^2 = np(1 - p) = (3)(0.3)(0.7) = 0.63$
$\mathrm{StDev}(x) = \sqrt{0.63} = 0.794$
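The shortcut formulas can be compared numerically with the general definitions of expected value and variance. The sketch below does this for the three (n, p) pairs used in these notes; the helper names are hypothetical.

```python
from math import comb, isclose


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_and_variance_by_definition(n, p):
    """Apply mu = sum x f(x) and sigma^2 = sum (x - mu)^2 f(x) directly."""
    mu = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
    var = sum((x - mu) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))
    return mu, var


for n, p in [(3, 0.3), (10, 0.3), (20, 0.2)]:
    mu, var = mean_and_variance_by_definition(n, p)
    # Shortcut formulas: mu = n p and sigma^2 = n p (1 - p).
    agrees = isclose(mu, n * p) and isclose(var, n * p * (1 - p))
    print(n, p, round(mu, 4), round(var, 4), agrees)
```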
EXCEL:
BINOMDIST(x, n, p, false): probability of exactly x successes, f(x)
BINOMDIST(x, n, p, true): cumulative probability of at most x successes
[Excel screenshots not reproduced: formula worksheet, value worksheet, and value worksheet with explanations.]
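For readers without Excel, a rough Python analogue of the two BINOMDIST calls is sketched below. The function `binomdist` is a hypothetical helper that only mirrors the argument order used in these notes; it is not an Excel or library API.

```python
from math import comb


def binomdist(x, n, p, cumulative):
    """Mimic BINOMDIST(x, n, p, cumulative) as used in the notes.

    cumulative=False returns f(x); cumulative=True returns the sum of
    f(0) through f(x).
    """
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)


# Example with n = 10 and p = 0.3.
print(round(binomdist(3, 10, 0.3, False), 4))
print(round(binomdist(3, 10, 0.3, True), 4))
```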
Example: (Extension of the Martin-experiment)
• Suppose 10 customers go into the store.
• The probability of purchasing something is 0.3
• Let x = number of customers that make a purchase
Questions:
1. What is the distribution of x?
Binomial with n = 10 and p = 0.3
2. Calculate the expected value, variance and standard deviation of x.
$E(x) = \mu = np = (10)(0.3) = 3$
$\mathrm{Var}(x) = \sigma^2 = np(1 - p) = (10)(0.3)(0.7) = 2.1$
$\mathrm{StDev}(x) = \sigma = \sqrt{2.1} = 1.449$
3. Calculate the probability distribution of x.
x     f(x)
0     $f(0) = \binom{10}{0} (0.3)^{0} (0.7)^{10-0} = (1)(1)(0.028) = 0.0282$
1     $f(1) = \binom{10}{1} (0.3)^{1} (0.7)^{10-1} = (10)(0.3)(0.040) = 0.1211$
2     $f(2) = \binom{10}{2} (0.3)^{2} (0.7)^{10-2} = (45)(0.09)(0.058) = 0.2335$
3     $f(3) = \binom{10}{3} (0.3)^{3} (0.7)^{10-3} = (120)(0.027)(0.082) = 0.2668$
4     $f(4) = \binom{10}{4} (0.3)^{4} (0.7)^{10-4} = (210)(0.0081)(0.118) = 0.2001$
5     $f(5) = \binom{10}{5} (0.3)^{5} (0.7)^{10-5} = (252)(0.00243)(0.168) = 0.1029$
6     $f(6) = \binom{10}{6} (0.3)^{6} (0.7)^{10-6} = (210)(0.0007)(0.2401) = 0.0368$
7     $f(7) = \binom{10}{7} (0.3)^{7} (0.7)^{10-7} = (120)(0.00022)(0.343) = 0.0090$
8     $f(8) = \binom{10}{8} (0.3)^{8} (0.7)^{10-8} = (45)(0.000066)(0.49) = 0.0014$
9     $f(9) = \binom{10}{9} (0.3)^{9} (0.7)^{10-9} = (10)(0.0000197)(0.7) = 0.0001$
10    $f(10) = \binom{10}{10} (0.3)^{10} (0.7)^{10-10} = (1)(0.0000059)(1) = 0.0000$
4. A graphical representation of the probability distribution of x.
[Figure: Probability distribution of 10 customers. x-axis: x (0 to 10); y-axis: f(x). Bar heights: 0.028, 0.121, 0.233, 0.267, 0.200, 0.103, 0.037, 0.009, 0.001, 0.000, 0.000.]
5. Calculate the cumulative distribution of x.
[Excel screenshots not reproduced: formula worksheet, value worksheet, and value worksheet with explanations.]
6. Calculate the probability that:
(a) At most 3 clients purchase something:
$P(x \le 3) = 0.6496$
(b) Exactly 3 clients purchase something:
$P(x = 3) = 0.2668$ or $f(3) = \binom{10}{3} (0.3)^3 (0.7)^{10-3} = (120)(0.027)(0.082) = 0.2668$
(c) More than 1 client purchases something:
$P(x > 1) = 1 - P(x \le 1) = 1 - 0.1493 = 0.8507$
since $P(x \le 1) + P(x > 1) = 1$
(d) More than 2 but less than 5 clients purchase something:
$P(2 < x < 5) = P(x = 3) + P(x = 4) = 0.2668 + 0.2001 = 0.4669$
OR
$P(2 < x < 5) = P(x = 3) + P(x = 4) = P(x \le 4) - P(x \le 2) = 0.8497 - 0.3828 = 0.4669$
(e) Less than 5 clients purchase something:
$P(x < 5) = P(x \le 4) = 0.8497$
(f) At least 4 clients purchase something:
$P(x \ge 4) = 1 - P(x \le 3) = 1 - 0.6496 = 0.3504$
(g) Exactly 6 clients do not purchase anything:
If 6 clients do not purchase anything, then 4 clients purchase something
$P(x = 4) = 0.2001$ or $P(x \le 4) - P(x \le 3) = 0.8497 - 0.6496 = 0.2001$
(h) Difficult question: Calculate the probability that only the first three clients make a purchase (and the remaining seven do not):
$ppp(1 - p)(1 - p)(1 - p)(1 - p)(1 - p)(1 - p)(1 - p) = p^3(1 - p)^7 = (0.3)^3(0.7)^7 = 0.00222$
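The answers to question 6 all reduce to f(x), the cumulative probability, and the complement rule. A Python sketch under the same assumptions (n = 10, p = 0.3) follows; `pmf`, `cdf` and `answers` are illustrative names.

```python
from math import comb

N, P = 10, 0.3  # 10 customers, purchase probability 0.3


def pmf(x, n=N, p=P):
    """Probability of exactly x purchases."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cdf(x, n=N, p=P):
    """Cumulative probability of at most x purchases."""
    return sum(pmf(k, n, p) for k in range(x + 1))


answers = {
    "a": cdf(3),               # at most 3
    "b": pmf(3),               # exactly 3
    "c": 1 - cdf(1),           # more than 1 (complement rule)
    "d": cdf(4) - cdf(2),      # more than 2 but less than 5
    "e": cdf(4),               # less than 5
    "f": 1 - cdf(3),           # at least 4
    "g": pmf(4),               # 6 do not purchase, so 4 purchase
    "h": P**3 * (1 - P) ** 7,  # one specific sequence: S S S then 7 F
}

for label, value in answers.items():
    print(label, round(value, 5))
```

Part (h) has no binomial coefficient because it asks about a single ordered sequence rather than any arrangement of three purchases.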
Homework
A university found that 20% of its students withdraw without completing the introductory statistics
course.
Assume: That 20 students have registered for the course this quarter.
Let: x = number of students withdrawing from the course.
Binomial Experiment: n = 20 and p = 0.2
x f(x) Cumulative prob
0 0.0115 0.0115
1 0.0576 0.0692
2 0.1369 0.2061
3 0.2054 0.4114
4 0.2182 0.6296
5 0.1746 0.8042
6 0.1091 0.9133
7 0.0545 0.9679
8 0.0222 0.9900
9 0.0074 0.9974
10 0.0020 0.9994
11 0.0005 0.9999
12 0.0001 1.0000
13 0.0000 1.0000
14 0.0000 1.0000
15 0.0000 1.0000
16 0.0000 1.0000
17 0.0000 1.0000
18 0.0000 1.0000
19 0.0000 1.0000
20 0.0000 1.0000
Homework (work through this on your own):
a) Calculate the probability that 2 or fewer will withdraw.
$P(x \le 2) = 0.2061$
b) Calculate the probability that exactly 4 will withdraw.
$P(x = 4) = 0.2182$ or $P(x = 4) = P(x \le 4) - P(x \le 3) = 0.6296 - 0.4114 = 0.2182$
c) Calculate the probability that more than 3 will withdraw.
$P(x > 3) = 1 - P(x \le 3) = 1 - 0.4114 = 0.5886$
d) What is the expected number of withdrawals?
$E(x) = np = (20)(0.2) = 4$
e) What is the expected number of students that will not withdraw?
Suppose y = number of students not withdrawing from the course.
Then $E(y) = n(1 - p) = (20)(0.8) = 16$
We use the probability $1 - p$, since:
• $p = 0.2$ is the probability that a student will withdraw
• $1 - p = 1 - 0.2 = 0.8$ is the probability that a student will not withdraw
f) Calculate the probability that 15 will not withdraw.
If 15 of the 20 students do not withdraw, then 5 students withdraw.
$P(x = 5) = P(x \le 5) - P(x \le 4) = 0.8042 - 0.6296 = 0.1746$
Shape of the Binomial distribution:
[Figure, three bar charts with x on the horizontal axis and probability on the vertical axis:]
Binomial: n = 10 and p < 0.5: skewed to the right
Binomial: n = 10 and p = 0.5: symmetric
Binomial: n = 10 and p > 0.5: skewed to the left
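The three shapes can be checked numerically by computing a skewness measure directly from the probability function. In the sketch below the specific values p = 0.3 and p = 0.7 are assumptions chosen for illustration (the notes only state p < 0.5 and p > 0.5), and the standardized third moment is used as the skewness measure.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def skewness(n, p):
    """Standardized third central moment, computed from the pmf."""
    xs = range(n + 1)
    mu = sum(x * binomial_pmf(x, n, p) for x in xs)
    var = sum((x - mu) ** 2 * binomial_pmf(x, n, p) for x in xs)
    third = sum((x - mu) ** 3 * binomial_pmf(x, n, p) for x in xs)
    return third / sqrt(var) ** 3


def describe(s, tol=1e-9):
    """Positive skewness means a longer right tail, negative a longer left tail."""
    if abs(s) < tol:
        return "symmetric"
    return "skewed to the right" if s > 0 else "skewed to the left"


for p in (0.3, 0.5, 0.7):  # illustrative values, see the note above
    print(p, describe(skewness(10, p)))
```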
Typical exam questions
Questions 1 to 4 are based on the following information:
30% of the tyres of airplanes at an airport are faulty. Four tyres are randomly selected.
Let: x = Number of faulty tyres found
Question 1
The standard deviation of x is:
Answer 1
x is binomially distributed with n = 4 and p = 0.3. Thus, $\mathrm{StDev}(x) = \sqrt{np(1 - p)} = \sqrt{4 \times 0.3 \times 0.7} = 0.917$.
Question 2
The probability that only the first and third tyres in the sample of 4 selected tyres are faulty is:
Answer 2
$(0.3)(0.7)(0.3)(0.7) = (0.3)^2(0.7)^2 = 0.0441$.
Question 3
The probability that fewer than 3 tyres selected are faulty, is:
Answer 3
$P(x < 3) = P(x = 0) + P(x = 1) + P(x = 2) = \binom{4}{0} (0.3)^0 (0.7)^4 + \binom{4}{1} (0.3)^1 (0.7)^3 + \binom{4}{2} (0.3)^2 (0.7)^2 = 0.2401 + 0.4116 + 0.2646 = 0.9163$.
Question 4
The probability that 1 of the 4 randomly selected tyres is not faulty, is:
Answer 4
If 1 out of 4 is not faulty, it means that 3 out of 4 are faulty.
$P(x = 3) = \binom{4}{3} (0.3)^3 (0.7)^1 = 0.0756$.
Questions 5 to 9 are based on the following information:
It is known that 60% of all South Africans will watch the opening match of the 2010 Soccer World Cup.
A random sample of 40 South Africans was asked whether they will watch the opening match.
Let x = the number of South Africans who will watch the opening match.
Consider the following results in Excel of two different binomial distributions:
[Excel screenshots not reproduced: formula worksheet and value worksheet.]
Question 5
Suppose a random sample of 7 South Africans is selected. The probability that only the first 5 will watch the
opening match is:
Answer 5
$ppppp(1 - p)(1 - p) = (0.6)^5(0.4)^2 = 0.0124$.
Question 6
The probability that exactly 23 South Africans will watch the opening match is:
Answer 6
$P(x = 23) = f(23) = \binom{40}{23} (0.6)^{23} (0.4)^{17} = 0.1204$.
Or using Excel we obtain: $P(x = 23) = P(x \le 23) - P(x \le 22) = 0.4319 - 0.3115 = 0.1204$.
Question 7
The probability that more than 18 but less than 24 South Africans will watch the opening match is:
Answer 7
$P(18 < x < 24) = P(x \le 23) - P(x \le 18) = 0.4319 - 0.0392 = 0.3927$
Question 8
The probability that more than 20 but less than 24 South Africans will not watch the opening match is:
Answer 8
Let y = the number of South Africans who will not watch the opening match.
The probability that someone will not watch the opening match is 1 − 0.6 = 0.4. These probabilities are
given in column C of the Excel spreadsheet [see cells C9, C10 and C11].
$P(20 < y < 24) = P(y = 21) + P(y = 22) + P(y = 23) = 0.0352 + 0.0203 + 0.0106 = 0.0661$.
Question 9
The standard deviation of x is:
Answer 9
$\mathrm{StDev}(x) = \sqrt{np(1 - p)} = \sqrt{40 \times 0.6 \times 0.4} = 3.10$.
Questions 10 to 13 are based on the following information:
Consider the XYZ University. Five students writing the Accounting exam were selected at random. It is
known from previous experience that the probability that any one student will pass the exam is 0.4.
Let: x = the number of students who pass the exam.
Define: An experimental outcome is a sequence of successes and failures in a 5-trial binomial
experiment. (Tree diagram)
Question 10
The total number of experimental outcomes with 3 successes out of 5 is:
By drawing a tree diagram we find the answer = 10.
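The count of 10 can also be confirmed without drawing the tree. The Python sketch below enumerates all sequences and compares the result with the binomial coefficient; the variable names are illustrative.

```python
from itertools import product
from math import comb

n, k = 5, 3  # 5 students, 3 successes (passes)

# All 2**5 = 32 experimental outcomes, as sequences of S and F.
all_outcomes = list(product("SF", repeat=n))

# Keep only the outcomes that contain exactly k successes.
with_k_successes = [o for o in all_outcomes if o.count("S") == k]

print(len(all_outcomes))
print(len(with_k_successes))
print(comb(n, k))  # the binomial coefficient counts the same outcomes
```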
[Tree diagram not reproduced: five levels, each branching into S and F.]
Outcome            Number of successes
(S, S, S, S, S)    5
(S, S, S, S, F)    4
(S, S, S, F, S)    4
(S, S, S, F, F)    3
(S, S, F, S, S)    4
(S, S, F, S, F)    3
(S, S, F, F, S)    3
(S, S, F, F, F)    2
(S, F, S, S, S)    4
(S, F, S, S, F)    3
(S, F, S, F, S)    3
(S, F, S, F, F)    2
(S, F, F, S, S)    3
(S, F, F, S, F)    2
(S, F, F, F, S)    2
(S, F, F, F, F)    1
(F, S, S, S, S)    4
(F, S, S, S, F)    3
(F, S, S, F, S)    3
(F, S, S, F, F)    2
(F, S, F, S, S)    3
(F, S, F, S, F)    2
(F, S, F, F, S)    2
(F, S, F, F, F)    1
(F, F, S, S, S)    3
(F, F, S, S, F)    2
(F, F, S, F, S)    2
(F, F, S, F, F)    1
(F, F, F, S, S)    2
(F, F, F, S, F)    1
(F, F, F, F, S)    1
(F, F, F, F, F)    0
Question 11
The probability that only the first and last student will pass the exam is:
Answer 11
$p(1 - p)(1 - p)(1 - p)p = p^2(1 - p)^3 = (0.4)^2(0.6)^3 = 0.03456$
Question 12
The probability that at most 4 of the 5 students will pass the exam is:
Answer 12
$P(x \le 4)$
$= P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4)$
$= \binom{5}{0} (0.4)^0 (0.6)^5 + \binom{5}{1} (0.4)^1 (0.6)^4 + \binom{5}{2} (0.4)^2 (0.6)^3 + \binom{5}{3} (0.4)^3 (0.6)^2 + \binom{5}{4} (0.4)^4 (0.6)^1$
$= 0.07776 + 0.2592 + 0.3456 + 0.2304 + 0.0768 = 0.98976$.
OR
$P(x \le 4) = 1 - P(x = 5) = 1 - \binom{5}{5} (0.4)^5 (0.6)^0 = 1 - 0.01024 = 0.98976$.
Question 13
The probability that 2 of the 5 students will fail is:
Answer 13
If 2 out of 5 fail the exam, then 3 out of 5 pass the exam. Therefore,
$P(x = 3) = \binom{5}{3} (0.4)^3 (0.6)^2 = 0.2304$.
OR
Let y = the number of students who fail the exam
The probability that any one student will fail the exam is 0.6.
$P(y = 2) = \binom{5}{2} (0.6)^2 (0.4)^3 = 0.2304$.
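Questions 11 to 13 contrast the probability of one specific ordered sequence with the probability of a number of successes in any order. A short Python sketch of both, assuming n = 5 and p = 0.4 as in the question; `sequence_probability` and `binomial_pmf` are hypothetical helper names.

```python
from math import comb, isclose


def sequence_probability(sequence, p):
    """Probability of one specific ordered sequence of S (pass) and F (fail)."""
    prob = 1.0
    for outcome in sequence:
        prob *= p if outcome == "S" else 1 - p
    return prob


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials, in any order."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.4

q11 = sequence_probability("SFFFS", p)  # only the first and last pass
q12 = 1 - binomial_pmf(5, n, p)         # at most 4 pass, via the complement
q13_pass = binomial_pmf(3, n, p)        # 3 pass, counted as successes with p
q13_fail = binomial_pmf(2, n, 1 - p)    # 2 fail, counted as successes with 1 - p

print(round(q11, 5), round(q12, 5), round(q13_pass, 4), round(q13_fail, 4))
```

The two versions of question 13 agree because counting 3 passes out of 5 is the same event as counting 2 failures out of 5.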