Chapter 3 - Normal Distribution
3.1 a. Original data:
1 2 2 3 3 3 4 4 4 4 5 5 5 6 6 7
[Figure: frequency histogram of the original data; x-axis: score (0 to 7), y-axis: Frequency (0 to 4)]
b. To convert the distribution to a distribution of X - µ, subtract µ = 4 from each score:
-3 -2 -2 -1 -1 -1 0 0 0 0 1 1 1 2 2 3
c. To complete the conversion to z, divide each score by σ = 1.63:
-1.84 -1.23 -1.23 -0.61 -0.61 -0.61 0 0 0 0
0.61 0.61 0.61 1.23 1.23 1.84
3.2 Converting specific scores from the distribution in Exercise 3.1 into z scores:
z = (X - µ)/σ
z = (2.5 - 4)/1.63 = -0.92
z = (6.2 - 4)/1.63 = +1.35
z = (9 - 4)/1.63 = +3.07
score (X)   z score
2.5         -0.92     18% of the distribution lies below X = 2.5
6.2         +1.35     91% of the distribution lies below X = 6.2
9.0         +3.07     99.9% of the distribution lies below X = 9.0
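The conversions in Exercise 3.2 can be checked with a short sketch. It assumes the scores are treated as normally distributed with µ = 4 and σ = 1.63; the function names are illustrative, and the standard-library NormalDist stands in for the table in Appendix z.

```python
from statistics import NormalDist

MU, SIGMA = 4, 1.63  # mean and standard deviation used in Exercises 3.1 and 3.2


def z_score(x, mu=MU, sigma=SIGMA):
    """Standardize a raw score: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def percent_below(x, mu=MU, sigma=SIGMA):
    """Percentage of a normal distribution lying below x (replaces the table lookup)."""
    return 100 * NormalDist().cdf(z_score(x, mu, sigma))


for x in (2.5, 6.2, 9.0):
    print(f"X = {x}: z = {z_score(x):+.2f}, {percent_below(x):.1f}% below")
```

Small differences from the tabled percentages can arise because the table is entered with z rounded to two decimals.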
3.3 Errors counting shoppers in a major department store:
a. z = (X - µ)/σ
z = (960 - 975)/15 = -15/15 = -1
z = (990 - 975)/15 = 15/15 = +1
Between z = -1 and µ lie .3413
Between z = +1 and µ lie .3413
Total .6826
Therefore approximately 68% of the scores are found between 960 and 990.
b. 975 = µ; therefore 50% of the scores lie below 975.
c. .5000 lie below 975
.3413 lie between 975 and 990
.8413 (or 84%) lie below 990
3.4 Using the data in Exercise 3.3:
a. From Appendix z:
z score    area between z and mean
0.67       0.2486
0.6745     0.2500   [interpolation from Appendix z]
0.68       0.2517
Therefore z = ±.6745 encompasses the middle 50%.
z = (X - µ)/σ
±.6745 = (X - 975)/15
X = 964.88 and 985.12
50% of the scores lie between counts of 965 and 985.
b. 75% of the counts would be less than 985. We just calculated the middle 50%, half of
which (25%) lies on each side of the mean. Since 50% lie below the mean, 50 + 25 =
75% lie below 985.
c. What scores would 95% of the counts lie between?
z = (X - µ)/σ
±1.96 = (X - 975)/15
X = 945.6 and 1004.4
95% of the counts would lie between 946 and 1004.
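The interval calculations in parts a and c of Exercise 3.4 follow the same pattern, which the sketch below captures. It assumes normally distributed counts with µ = 975 and σ = 15; the helper name central_interval is illustrative, and inv_cdf replaces the interpolation from Appendix z.

```python
from statistics import NormalDist


def central_interval(coverage, mu, sigma):
    """Return (low, high) enclosing the middle `coverage` of a normal(mu, sigma)."""
    tail = (1 - coverage) / 2              # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)     # e.g. about 0.6745 for 50%, 1.96 for 95%
    return mu - z * sigma, mu + z * sigma


low50, high50 = central_interval(0.50, 975, 15)
low95, high95 = central_interval(0.95, 975, 15)
print(f"middle 50%: {low50:.2f} to {high50:.2f}")
print(f"middle 95%: {low95:.1f} to {high95:.1f}")
```

The upper bound of the 50% interval is also the 75th percentile used in part b.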
3.5 The supervisor’s count of shoppers:
z = (X - µ)/σ = (950 - 975)/15 = -1.67
The area beyond z = ±1.67 is 2(.0475) = .095; therefore 9.5% of the time scores will be
at least this extreme.
3.6 a. Sketch:
b. z = (X - µ)/σ = (30 - 25)/5 = 1.00
The smaller portion for z = 1.00 is .1587. Therefore 16% of the 4th-graders score
better than the average 9th-grader.
c. z = (X - µ)/σ = (25 - 30)/10 = -0.5
The smaller portion for z = -0.5 is .3085. Therefore 31% of the 9th-graders score
worse than the average 4th-grader.
3.7 They would be equal when the two distributions have the same standard deviation.
3.8 Diagnostically meaningful cutoff:
z = (X - µ)/σ
-1.2817 = (X - 150)/30
X = 111.549, the diagnostically meaningful cutoff
z score area above z
1.2800 0.1003
1.2817 0.1000
1.2900 0.0985
3.9 Next year's salary raises:
a. z = (X - µ)/σ
1.2817 = (X - 2000)/400
X = $2512.68
10% of the faculty will have a raise equal to or greater than $2,512.68.
b. z = (X - µ)/σ
-1.645 = (X - 2000)/400
X = $1342
The 5% of the faculty who haven't done anything useful in years will receive no more
than $1,342 each, and probably don’t deserve that much.
3.10 Introductory Psychology students checking seatbelt usage:
a. Plot of distribution:
b. z = (X - µ)/σ = (62 - 44)/7 = 2.57, p = .0051
A count this high (or higher) would occur by chance only .5% of the time. The
suspicion is that he just made up a number.
3.11 Transforming scores on diagnostic test for language problems:
X1 = original scores      µ1 = 48   σ1 = 7
X2 = transformed scores   µ2 = 80   σ2 = 10
σ2 = σ1/C
10 = 7/C
C = 0.7
Therefore to transform the original standard deviation from 7 to 10, we need to divide the
original scores by .7. However, dividing the original scores by .7 also divides their mean by
.7.
X̄2 = X̄1/0.7 = 48/.7 = 68.57
We want to raise the mean to 80. 80 – 68.57 = 11.43. Therefore we need to add 11.43 to
each score.
X2 = X1/0.7 + 11.43. [This formula summarizes the whole process.]
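The transformation in Exercise 3.11 can be written as a small function. This is a sketch under the stated values (µ1 = 48, σ1 = 7, µ2 = 80, σ2 = 10); the function name rescale and the two sample scores are hypothetical and are not part of the exercise.

```python
def rescale(scores, old_mean=48, old_sd=7, new_mean=80, new_sd=10):
    """Linearly transform scores so the mean and SD move to the new values."""
    c = old_sd / new_sd                  # 7/10 = 0.7, the divisor C
    shift = new_mean - old_mean / c      # 80 - 68.57 = 11.43
    return [x / c + shift for x in scores]


# Hypothetical check: the old mean and a score one SD above it.
print(rescale([48, 55]))
```

A score at the old mean lands at 80, and a score one old standard deviation above the mean lands one new standard deviation above 80, which is the behavior the formula X2 = X1/0.7 + 11.43 is meant to produce.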
3.12 Skewed distribution of diagnostic test for language problems:
a. Diagram:
[Figure: Distribution of Language Test Scores; x-axis: Score (10 to 50), y-axis: Proportion (0.00 to 0.08)]
b. To find the cutoff for the bottom 10% if the distribution is not normal, empirically
count up from the bottom.
3.13 October 1981 GRE, all people taking exam:
z = (X - µ)/σ = (600 - 489)/126 = 0.88
p(larger portion) = 0.81
A GRE score of 600 would correspond to the 81st percentile.
3.14 For the data in Exercise 3.13:
z = (X - µ)/σ
.6745 = (X - 489)/126
X = 573.987
z score p
0.6700 0.7486
0.6745 0.7500
0.6800 0.7517
A GRE score of (.6745*126 + 489) = 574 would correspond to the 75th percentile.
3.15 October 1981 GRE, all seniors and nonenrolled college graduates:
z = (X - µ)/σ = (600 - 507)/118 = 0.79, p = .785
z = (X - µ)/σ
0.6745 = (X - 507)/118
X = 586.591
For seniors and nonenrolled college graduates, a GRE score of 600 is at the 79th
percentile, and a score of 587 would correspond to the 75th percentile.
3.16 Percentiles are dependent upon the reference group. As a group, the seniors and
nonenrolled college graduates did better on the GRE than did all people taking the exam.
A person receiving a given score, therefore, did better (scored at a higher percentile)
when compared to all people taking the exam than when compared only to the seniors
and nonenrolled college graduates taking the exam.
3.17 GPA scores:
N = 88   X̄ = 2.46   s = 0.86 [calculated from data set]
z = (X - X̄)/s
0.6745 = (X - 2.46)/0.86
X = 3.04
The 75th percentile for GPA is 3.04.
3.18 Diagnostically meaningful cutoff for Behavior Problem scores:
z = (X - µ)/σ
2.054 = (X - 50)/10
X = 70.54, the cutoff
z score p
2.050 0.9798
2.054 0.9800
2.060 0.9803
3.19 There is no meaningful discrimination to be made among those scoring below the mean,
and therefore all people who score in that range are given a T score of 50.
3.20
Notice that some of the plots don’t look as neat as you might expect. Notice also that as
the sample size increases the plots look better.
3.21 Weight gain data
None of these is very close to normal, but the post intervention weight is closest.
3.22 Hand-calculated qqplot
z cumfreq cumperct zperc
-3.0 0 0.0000 0.00135
-2.5 0 0.0000 0.00621
-2.0 0 0.0000 0.02275
-1.5 11 0.0367 0.06681
-1.0 53 0.1767 0.15866
-0.5 97 0.3233 0.30854
0.0 162 0.5400 0.50000
0.5 220 0.7333 0.69146
1.0 259 0.8633 0.84134
1.5 278 0.9267 0.93319
2.0 289 0.9633 0.97725
2.5 292 0.9733 0.99379
3.0 300 1.0000 0.99865
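The cumperct and zperc columns in Exercise 3.22 can be regenerated from the cumulative frequencies. The sketch below assumes N = 300 (the final cumulative frequency) and uses the standard-library normal CDF for the expected proportions; the variable names are illustrative.

```python
from statistics import NormalDist

N = 300  # total number of observations (last cumulative frequency in the table)
z_values = [-3.0 + 0.5 * i for i in range(13)]
cumfreq = [0, 0, 0, 11, 53, 97, 162, 220, 259, 278, 289, 292, 300]

# Each row: z, cumulative frequency, observed cumulative proportion,
# and the proportion expected below z under a normal distribution.
rows = [(z, f, f / N, NormalDist().cdf(z)) for z, f in zip(z_values, cumfreq)]

print("z     cumfreq  cumperct  zperc")
for z, f, observed, expected in rows:
    print(f"{z:4.1f}  {f:7d}  {observed:8.4f}  {expected:7.5f}")
```

Plotting cumperct against zperc gives the hand-made comparison; points near the diagonal indicate agreement with the normal distribution.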
3.23 I would first draw 16 scores from a normally distributed population with µ = 0 and σ = 1.
Call this variable z1. The sample (z1) would almost certainly have a sample mean and
standard deviation that are not 0 and 1. Then I would create a new variable z2 = z1-
mean(z1). This would have a mean of 0.00. Then I would divide z2 by sd(z1) to get a
new distribution (z3) with mean = 0 and sd = 1. Then make that variable have a st. dev.
of 4.25 by multiplying it by 4.25. Finally add 16.3 (the new mean). Now the mean is
exactly 16.3 and the standard deviation is exactly 4.25.
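The procedure in Exercise 3.23 can be sketched as follows. The seed is arbitrary and only makes the run repeatable, and stdev is the sample standard deviation (N - 1 in the denominator), which is assumed to be what sd() denotes above.

```python
import random
from statistics import mean, stdev

random.seed(1)  # arbitrary seed, only for repeatability

z1 = [random.gauss(0, 1) for _ in range(16)]   # draw 16 scores from N(0, 1)
m, s = mean(z1), stdev(z1)                     # sample values, almost never exactly 0 and 1
z2 = [z - m for z in z1]                       # mean is now 0
z3 = [z / s for z in z2]                       # mean 0, standard deviation 1
scores = [4.25 * z + 16.3 for z in z3]         # rescale to sd 4.25, then shift to mean 16.3

print(round(mean(scores), 6), round(stdev(scores), 6))
```

Apart from floating-point rounding, the final sample has exactly the requested mean and standard deviation, whatever 16 values were drawn.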
3.24 Salaries for assistant professors (1999-2000)
I expect that you would do reasonably well if you treated these as normally distributed,
especially if you calculated a trimmed mean and a Winsorized standard deviation. The
extreme salaries probably come from people who have either stayed at the rank of
Assistant Professor for many years, possibly because they don’t have the highest degree
in their field, or those who have come to the university with considerable nonacademic
experience. If you took the log of the salaries you would reduce the high end of the
distribution more than the low end, reducing the effect of very large salaries.
3.25 SAT Data
[Figure: frequency histogram of combined SAT scores (variable: combined); x-axis: 840 to 1,100, y-axis: Frequency (0 to 12). Mean = 965.92, Std. Dev. = 74.82056, N = 50]
The data are actually bimodal, with probably too few scores at the extremes.
[Figure: frequency histogram of adjusted combined scores (variable: adjcomb); x-axis: -80 to 80, y-axis: Frequency (0 to 14). Mean = 5.9674E-16, Std. Dev. = 34.53279, N = 50]
These data are much more normally distributed. As we will see in Chapter Nine, there are two
kinds of students who take the SAT, depending on where they live. In the East most students take
it. In the West, students applying to high ranking eastern schools take it. This leads to the
bimodal distribution in the unadjusted scores.
Chapter 4 – Sampling Distributions and Hypothesis Testing
4.1 Was last night's game an NHL hockey game?
a. Null hypothesis: The game was actually an NHL hockey game.
b. On the basis of that null hypothesis I expected that each team would earn somewhere
between 0 and 6 points. I then looked at the actual points and concluded that they
were way out of line with what I would expect if this were an NHL hockey game. I
therefore rejected the null hypothesis.
4.2 Am I overcharged at lunch?
a. Sketch:
b. No, $4.25 is a common observation.
c. I set up the null hypothesis that I was charged correctly. Therefore I would expect to
receive about $1.00 in change, give or take a quarter or so. The change that I
received was in line with that expectation, and therefore I have no basis for rejecting
H0.
4.3 A Type I error would be concluding that I had been shortchanged when in fact I had not.
4.4 A Type II error would be concluding that I had not been shortchanged when in fact I had.
4.5 The critical value would be that amount of change below which I would decide that I had
been shortchanged. The rejection region would be all amounts less than the critical
value—i.e., all amounts that would lead to rejection of H0.
4.6 I would adopt a one-tailed test (using the right-hand tail) if I wanted to detect being
shortchanged but was not concerned about receiving too much money. In that case I
would not reject the null hypothesis no matter how much excess change I received (i.e., I
would not care if the restaurant was being cheated). If I chose the wrong tail, however, I
would be looking out for the restaurant's interests and ignoring my own.
4.7 Was the son of the member of the Board of Trustees fairly admitted to graduate school?
z = (X - µ)/σ = (490 - 650)/50 = -3.2
z score p
3.00 0.0013
3.20 0.0007
3.25 0.0006
The probability that a student drawn at random from those properly admitted would have
a GRE score as low as 490 is .0007. I suspect that the fact that his mother was a member
of the Board of Trustees played a role in his admission.
4.8 The standard deviation is small because we have restricted our sample to the admitted
students, i.e., a high-scoring sample.
4.9 The distribution would drop away smoothly to the right for the same reason that it always
does—there are few high-scoring people. It would drop away steeply to the left because
fewer of the borderline students would be admitted (no matter how high the borderline is
set).
4.10 I would draw a very large number of samples. For each sample I would calculate the
mode, the range, and their ratio (M). I would then plot the resulting value of M.
4.11 M is called a test statistic.
4.12 Is the car at the stop sign going to stay there (H0) or dart out in front of you (H1)?
4.13 The alternative hypothesis is that this student was sampled from a population of students
whose mean is not equal to 650.
4.14 Sampling error is variability in a statistic from sample to sample that is due to chance—
i.e., due to which observations happened to be included in the sample.
4.15 The word "distribution" refers to the set of values obtained for any set of observations.
The phrase "sampling distribution" is reserved for the distribution of outcomes (either
theoretical or empirical) of a sample statistic.
4.16 If α were to decrease, β would increase and power would decrease.
4.17 a. Research hypothesis—Children who attend kindergarten adjust to 1st grade faster
than those who do not. Null hypothesis—1st-grade adjustment rates are equal for
children who did and did not attend Kindergarten.
b. Research hypothesis—Sex education in junior high school decreases the rate of
pregnancies among unmarried mothers in high school. Null hypothesis—The rate of
pregnancies among unmarried mothers in high school is the same regardless of the
presence or absence of sex education in junior high school.
4.18 Probability of a Type II error (ß) for distribution in Figure 4.4:
X = 67, µ = 80, σ = 20
z = (X - µ)/σ = (67 - 80)/20 = -0.65
Looking z = -0.65 up in the Appendix, we find that .7422 of the scores fall above a score
of 67. β is therefore 0.74.
4.19 Finger-tapping cutoff if α = .01:
z = (X - µ)/σ
-2.327 = (X - 100)/20
X = 53.46
z score p
2.3200 0.9898
2.3270 0.9900
2.3300 0.9901
For α to equal .01, z must be -2.327. The cutoff score is therefore 53. The corresponding
value for z when a cutoff score of 53 is applied to the curve for H1:
z = (X - µ)/σ = (53.46 - 80)/20 = -1.33
Looking z = -1.33 up in Appendix z, we find that .9082 of the scores fall above a score of
53.46. β is therefore 0.908.
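The β calculations in Exercises 4.18 and 4.19 can be reproduced with a short sketch. It assumes the H0 distribution has µ = 100 and σ = 20, the H1 distribution has µ = 80 and σ = 20, and H0 is rejected for scores below the cutoff; the names h0, h1, and beta_for_cutoff are illustrative.

```python
from statistics import NormalDist

h0 = NormalDist(mu=100, sigma=20)  # distribution under the null hypothesis
h1 = NormalDist(mu=80, sigma=20)   # distribution under the alternative hypothesis


def beta_for_cutoff(cutoff):
    """P(Type II error): probability under H1 of a score above the cutoff,
    i.e. of failing to reject H0 when H1 is true."""
    return 1 - h1.cdf(cutoff)


beta_418 = beta_for_cutoff(67)     # cutoff used in Exercise 4.18
cutoff_01 = h0.inv_cdf(0.01)       # score cutting off the lowest 1% under H0
beta_419 = beta_for_cutoff(cutoff_01)

print(f"4.18: beta = {beta_418:.4f}")
print(f"4.19: cutoff = {cutoff_01:.2f}, beta = {beta_419:.4f}")
```

The results differ from the tabled answers only in the third decimal, because the hand calculation rounds z before the table lookup.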
4.20 In Section 4.11 we were running a one-tailed test so we compared the obtained
probability (.017) to .05 (placing the full 5% in the single tail) and rejected H0. If we
were using a two-tailed test we would compare the obtained probability (still .017) to
.025 (placing 5%/2 = 2.5% in each tail) and would still reject H0. In this example, therefore,
the decision would have been the same with either test.
4.21 To determine whether there is a true relationship between grades and course evaluations I
would find a statistic that reflected the degree of relationship between two variables. (The
students will see such a statistic (r) in Chapter 9.) I would then calculate the sampling
distribution of that statistic in a situation in which there is no relationship between two
variables. Finally, I would calculate the statistic for a representative set of students and
classes and compare my sample value with the sampling distribution of that statistic.
4.22 I would repeat the answer to Exercise 4.21 except that here we are speaking of comparing
means rather than looking at relationships. In other words, I would obtain the sampling
distribution of the difference between two means under the condition that I am sampling
from populations with identical means. I would then calculate the difference between my
two sample means and compare it to that sampling distribution. (The students will see
such a test in Chapter 7, although there we will use the t statistic instead of the difference
between the means.)
4.23 a. You could draw a large sample of boys and a large sample of girls in the class and
calculate the mean allowance for each group. The null hypothesis would be the
hypothesis that the mean allowance, in the population, for boys is the same as for
girls.
b. I would use a two-tailed test because I want to be able to reject the null hypothesis
whether girls receive significantly more or significantly less allowance than boys.
c. I would reject the null hypothesis if the difference between the two sample means
were greater than I could expect to find due to chance. Otherwise I would not reject.
d. The most important thing to do would be to have some outside corroboration for the
amount of allowance reported by the children.
4.24 c. This is an interesting problem. On the one hand they have all of the states, so they
have the parameters and don’t have to estimate them. On the other hand, it would be
interesting to test a general hypothesis about whether there is something about private
ownership that keeps prices up (or down). I just don’t see how you test that here.
Students may struggle with this one.
4.25 In the parking lot example the traditional approach to hypothesis testing would test the
null hypothesis that the mean time to leave a space is the same whether someone is
waiting or not. If their test failed to reject the null hypothesis they would simply fail to
reject the null hypothesis, and would do so at a two-tailed level of α = .05. Jones and
Tukey, on the other hand, would not consider that the null hypothesis of equal population
means could possibly be true. They would focus on making a conclusion about which
population mean is higher. A "nonsignificant result" would only mean that they didn’t
have enough data to draw any conclusion. Jones and Tukey would also be likely to work
with a one-tailed α = .025, but be actually making a two-tailed test because they would
not have to specify a hypothesized direction of difference.
4.26 Reporting effect sizes would put the results of any study in perspective. It would give the
reader some sense of how large a difference we are speaking about, rather than leaving
him or her with the conclusion that some (possibly trivial) difference is greater than we
would expect by chance.
4.27 Distribution of proportion of those seeking help who are women.
The sampling distribution of proportion of women in the sample.
a. It is quite unlikely that we would have 61% of our sample being women if p = .50. In
my particular sampling distribution a score of 61 or higher was obtained 16/1000
= 1.6% of the time.
b. I would repeat the same procedure again except that I would draw from a binomial
distribution where p = .75.
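As a cross-check on the simulated sampling distribution in Exercise 4.27, the tail probability can also be computed exactly. The sketch assumes a sample of n = 100 people (so that 61% corresponds to a count of 61); that sample size is inferred from the wording, and this is an exact binomial calculation, not the resampling procedure described above.

```python
from math import comb


def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_ASSUMED = 100  # assumption: sample size is not stated explicitly in the answer

p_null = upper_tail(61, N_ASSUMED, 0.50)  # part a: chance of 61 or more women if p = .50
p_alt = upper_tail(61, N_ASSUMED, 0.75)   # part b: the same event if p = .75

print(f"p = .50: {p_null:.4f}")
print(f"p = .75: {p_alt:.4f}")
```

Under these assumptions the exact probability for p = .50 is close to the 1.6% obtained from 1000 simulated samples, while 61 or more women is almost certain when p = .75.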
Chapter 5 - Basic Concepts of Probability
5.1 a. Analytic: If two tennis players are exactly equally skillful so that the outcome of
their match is random, the probability is .50 that Player A will win the upcoming
match.
b. Relative frequency: If in past matches Player A has beaten Player B on 13 of the 17
occasions on which they played, then Player A has a probability of 13/17 = .76 of
winning their upcoming match, all other things held constant.
c. Subjective: Player A's coach feels that he has a probability of .90 of winning his
upcoming match with Player B.
5.2 a. p(that you will win) = 1/1000 = .001
b. p(that your brother will win) = 2/1000 = .002
c. p(that one or the other of you will win) = .001 + .002 = .003
5.3 a. p(that you will win 2nd prize given that you don't win 1st) = 1/9 = .111
b. p(that he will win 1st and you 2nd) = (2/10)(1/9) = (.20)(.111) = .022
c. p(that you will win 1st and he 2nd) = (1/10)(2/9) = (.10)(.22) = .022
d. p(that you are 1st and he 2nd [= .022]) + p(that he is 1st and you 2nd [= .022])
= p(that you and he will be 1st and 2nd) = .044
5.4 Joint probabilities were involved in Exercise 5.3b and 5.3c and when we combined those
results in 5.3d.
5.5 Conditional probabilities were involved in Exercise 5.3a.
5.6 Joint probabilities: What is the probability that I will be free to go skiing next Wednesday
and that the conditions will be good?
5.7 Conditional probabilities: What is the probability that skiing conditions will be good on
Wednesday, given that they are good today?
5.8 p(that they will look at each other at the same time) = p(that mother looks at baby) *
p(that baby looks at mother) = (2/24)(3/24) = (.083)(.125) = .01
5.9 p(that they will look at each other at the same time during waking hours) = p(that mother
looks at baby during waking hours) * p(that baby looks at mother during waking hours) =
(2/13)(3/13) = (.154)(.231) = .036
5.10 A flier that contains a message asking the person to dispose of it properly has a higher
probability of being found in the trash than we would expect if the message and disposal
were independent events.
5.11 A continuous distribution for which we care about the probability of an observation's
falling within some specified interval is exemplified by the probability that your baby
will be born on its due date.
5.12 The continuous distribution of children's learning abilities is often treated as discrete by
school systems, which divide children into those needing special education versus those
who should attend regular classes. Often schools further divide the regular classes into
different tracks.
5.13 Two examples of discrete variables: Variety of meat served at dinner tonight; Brand of
desktop computer owned.
5.14 p(that any applicant will be admitted) = the ratio of the number admitted to the number
applying = 10/300 = .03
5.15 a. 20%, or 60 applicants, will fall at or above the 80th percentile and 10 of these will be
chosen. Therefore p(that an applicant with the highest rating will be admitted) =
10/60 = .167.
b. No one below the 80th percentile will be admitted, therefore p(that an applicant with
the lowest rating will be admitted) = 0/300 = .00.
5.16 Mean ADDSC score = 52.6, s = 12.42 [Calculated from Data Set.]
a. z = (50 - 52.6)/12.42 = -0.21
Since a score of 50 is below the mean, and since we are looking for the probability of
a score greater than 50, we want to look in the tables of the normal distribution in the
column labeled "larger portion".
p(larger portion) = .5832
b. 45% of the scores actually exceed 50, while 56% are ≥ 50.
5.17 Mean ADDSC score for boys = 54.29, s = 12.90 [Calculated from Data Set]
a. z = (50 - 54.3)/12.90 = -0.33
Since a score of 50 is below the mean, and since we are looking for the probability of
a score greater than 50, we want to look in the tables of the normal distribution in the
column labeled "larger portion".
p(larger portion) = .6293
b. 29/55 = 53% > 50; 32/55 = 58% ≥ 50. (Notice that one percentage refers to the
proportion greater than 50, while the other refers to the proportion greater than or
equal to 50.)
5.18 p(that person will drop out of school, given that he/she has an ADDSC of at least 60) =
7/25 = .28
5.19 Compare the probability of dropping out of school, ignoring the ADDSC score, with the
conditional probability of dropping out given that ADDSC in elementary school
exceeded some value (e.g., 66).
5.20 p(dropout) = 10/88 = .11; p(dropout|ADDSC ≥ 60) = .28; Students are much more likely
to drop out of school if they scored at or above ADDSC = 60 in elementary school.
5.21 Plot of correct choices on trial 1 of a 5-choice task:
p(0) = .1074
p(1) = .2684
p(2) = .3020
p(3) = .2013
p(4) = .0881
p(5) = .0264
p(6) = .0055
p(7) = .0008
p(8) = .0001
p(9) = .0000
p(10) = .0000
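The probabilities listed in Exercise 5.21 match a binomial distribution with N = 10 trials and p = 1/5 = .20 of a correct choice by chance. Those two values are inferred from the listed probabilities and the 5-choice task, so treat them as assumptions in this sketch.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_TRIALS, P_CHANCE = 10, 0.20  # assumed: 10 subjects/trials, 1-in-5 chance of being correct

pmf = [binomial_pmf(k, N_TRIALS, P_CHANCE) for k in range(N_TRIALS + 1)]
for k, prob in enumerate(pmf):
    print(f"p({k}) = {prob:.4f}")
```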
5.22 p(6 or more correct) = p(6) + p(7) + p(8) + p(9) + p(10)
= .0055 + .0008 + .0001 + .0000 + .0000
= .0064
Thus if 6 of the 10 were correct I would conclude that they were not operating at chance
(there is some cheating going on!).
5.23 p(5 or more correct) = p(5) + p(6) + p(7) + p(8) + p(9) + p(10)
= .0264 + .0055 + .0008 + .0001 + .0000 + .0000
= .0328 < .05
p(4 or more correct) = p(4) + p(5) + p(6) + p(7) + p(8) + p(9) + p(10)
= .0881 + .0264 + .0055 + .0008 + .0001 + .0000 + .0000
= .1209 > .05
At α = .05, therefore, up to 4 correct choices indicate chance performance, but 5 or more
correct choices would lead me to conclude that they are no longer performing at chance
levels.
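The search for the critical number of correct choices in Exercises 5.22 and 5.23 amounts to finding the smallest count whose upper-tail probability does not exceed α. The sketch below assumes the same binomial model (N = 10, p = .20); the function names are illustrative.

```python
from math import comb


def upper_tail(k, n=10, p=0.20):
    """P(k or more correct) under the chance model Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(alpha, n=10, p=0.20):
    """Smallest number correct whose upper-tail probability is at most alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p) <= alpha:
            return k
    return None  # no count is extreme enough at this alpha


for k in (4, 5, 6):
    print(f"p({k} or more correct) = {upper_tail(k):.4f}")
print("critical count at alpha = .05:", critical_count(0.05))
```

The exact tail sums can differ from the hand totals in the fourth decimal because the hand totals add probabilities already rounded to four places.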
5.24 Probability statements about the treatment of automobile shoppers:
Simple probability: The probability that the salesperson will make a condescending
remark is .15.
Joint probability: The probability that the salesperson will make a condescending
remark and that the customer is a woman is .10.
Conditional prob: The probability that the salesperson will make a condescending
remark given that the customer is a woman is .25.
5.25 If there is no housing discrimination, then a person’s race and whether or not they are
offered a particular unit of housing are independent events. We could calculate the
probability that a particular unit (or a unit in a particular section of the city) will be
offered to anyone in a specific income group. We can also calculate the probability that
the customer is a member of an ethnic minority. We can then calculate the probability of
that person being shown the unit assuming independence and compare that answer
against the actual proportion of times a member of an ethnic minority was offered such a
unit.
5.26 Number of subjects needed in verbal learning experiment if each is to see different
classes of words in a different order:
P(4,4) = 4!/(4 - 4)! = 24
5.27 Number of subjects needed in Exercise 5.26's verbal learning experiment if each subject
can see only two of the four classes of words:
P(4,2) = 4!/(4 - 2)! = 4!/2! = 12
5.28 Chance that subject will press correctly on first trial when learning to press three out of
five buttons in a certain order:
P(5,3) = 5!/(5 - 3)! = 5!/2! = 60
There are 60 possible orders to push 3 out of 5 buttons. The probability that the subject
will choose the correct order on the first trial = 1/60 = .017
5.29 The total number of ways of making ice cream cones =
C(6,6) + C(6,5) + C(6,4) + C(6,3) + C(6,2) + C(6,1) = 1 + 6 + 15 + 20 + 15 + 6 = 63
[You can't have an ice cream cone without ice cream; exclude C(6,0).]
5.30 Different ways to record from the rat's brain:
C(6,4) = 6!/(4!(6 - 4)!) = 6!/(4!2!) = 15
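The counts in Exercises 5.26 through 5.30 can be verified with the permutation and combination functions in Python's standard library (math.perm and math.comb, available from Python 3.8). The variable names are illustrative.

```python
from math import comb, perm

orders_four_classes = perm(4, 4)    # 5.26: all four word classes in every order
orders_two_of_four = perm(4, 2)     # 5.27: ordered pairs of two classes out of four
button_orders = perm(5, 3)          # 5.28: ordered presses of three of five buttons
cones = sum(comb(6, k) for k in range(1, 7))   # 5.29: any non-empty set of six flavors
recording_choices = comb(6, 4)      # 5.30: unordered choice of 4 out of 6

print(orders_four_classes, orders_two_of_four, button_orders, cones, recording_choices)
print(f"p(correct order on first trial) = {1 / button_orders:.3f}")
```

Permutations are used where order matters (5.26 to 5.28) and combinations where it does not (5.29 and 5.30).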
5.31 Knowledge of current events:
If p = .50 of being correct on any one true-false item, and N = 20:
p(11) = C(20,11) (.5)^11 (.5)^9
C(20,11) = 20!/(11!(20 - 11)!) = 20!/(11!9!) = 167,960
p(11) = 167,960 (.00048828)(.00195313) = .16
Since the probability of 11 correct by chance is .16, the probability of 11 or more correct
must be greater than .16. Therefore we cannot reject the hypothesis that p = .50 (student
is guessing) at α = .05.
5.32 Probability of 25 blue M&M’s out of 60 draws sampling with replacement.
p(25) = C(60,25) (.24)^25 (.76)^35
= [60!/(25!35!)] (.24)^25 (.76)^35
= (5.191543797 × 10^16)(3.200965864 × 10^-16)(.000067371)
= .0011196
5.33 Driving test passed by 22 out of 30 drivers when 60% expected to pass:
z = (22 - 30(.60)) / sqrt(30(.60)(.40)) = 1.49; we cannot reject H0 at α = .05.
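The normal approximation used in Exercise 5.33 can be sketched as below. The function name binomial_z is illustrative, and the one-tailed probability is shown only for reference, since the answer does not say which tail convention was applied.

```python
from math import sqrt
from statistics import NormalDist


def binomial_z(successes, n, p):
    """Normal approximation to the binomial: z = (X - Np) / sqrt(Np(1 - p))."""
    return (successes - n * p) / sqrt(n * p * (1 - p))


z = binomial_z(22, 30, 0.60)
p_one_tailed = 1 - NormalDist().cdf(z)   # area above z in the standard normal

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

Because z falls below both 1.645 and 1.96, the conclusion is the same whether the test is one-tailed or two-tailed.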
5.34 On the theory that practice in almost anything leads to improvement, we give a sample of
first year college students, who will major in the humanities (where there is a lot of
reading assigned), a test for reading speed at the beginning of the fall semester. At the
end of the year we again measure their reading speed. We wish to test the null hypothesis
that reading speed, on average (or for most people) increased over the year.
5.35 Students should come to understand that nature does not have a responsibility to make
things come out even in the end, and that it has a terrible memory of what has happened
in the past. Any "law of averages" refers to the results of a long term series of events, and
it describes what we would expect to see. It does not have any self-correcting mechanism
built into it.
5.36 Probability of breast cancer
p(BC) = .01
p(+|BC) = .80
p(+|no BC) = .096
p(H|D) = p(D|H)p(H) / [p(D|H)p(H) + p(D|not H)p(not H)]
p(BC|+) = (.80)(.01) / [(.80)(.01) + (.096)(.99)]
= .008/(.008 + .095) = .008/.103 = .078
5.37 It is low because the probability of breast cancer is itself very low. But don’t be too
discouraged. Having collected some data (a positive mammography) the probability is
7.8 times higher than it would otherwise have been. (And if you are a woman, please
don’t stop having mammographies.)
5.38 Reducing the rate of false positives
Here we can use the same calculations, but just change .096 to .05.
p(BC) = .01
p(+|BC) = .80
p(+|no BC) = .05
p(H|D) = p(D|H)p(H) / [p(D|H)p(H) + p(D|not H)p(not H)]
p(BC|+) = (.80)(.01) / [(.80)(.01) + (.05)(.99)]
= .008/(.008 + .0495) = .008/.0575 = .139
The probability has nearly doubled when we nearly halved our false positive rate.
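The Bayes' theorem calculations in Exercises 5.36 and 5.38 differ only in the false positive rate, which the following sketch makes explicit. The function and parameter names are illustrative; the numerical inputs are the ones given in the exercises.

```python
def posterior(prior, hit_rate, false_positive_rate):
    """Bayes' theorem for a positive test result.

    prior               = p(BC)
    hit_rate            = p(+ | BC)
    false_positive_rate = p(+ | no BC)
    """
    numerator = hit_rate * prior
    return numerator / (numerator + false_positive_rate * (1 - prior))


p_536 = posterior(0.01, 0.80, 0.096)  # Exercise 5.36
p_538 = posterior(0.01, 0.80, 0.05)   # Exercise 5.38: lower false positive rate

print(f"5.36: p(BC | +) = {p_536:.3f}")
print(f"5.38: p(BC | +) = {p_538:.3f}")
```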