
Binomial distributions often arise in discrimination cases when the population in question is large. The generic question is “If the selection were made at random from the entire population, what is the probability that the number of members of a protected class hired/promoted/laid off would be as small/large as it actually was?” This assumes that all members of the qualified population have equal merit, so it's just a first step. If the population is large, we can act as if the candidates are chosen independently.

In 2004, the National Institutes of Health announced that it would give a few new Director's Pioneer Awards for research. The awards were highly valued: $500,000 per year for five years for research support. Nine awards were made, all to men. This caused an outcry.

There were 1300 nominees for the award, 80% male. Suppose that all nominees are equally qualified. If we choose 9 at random, the number of women among the winners has (to a close approximation) the binomial distribution with n=9 and p=0.2. Call the number of women X.

Find P(no awards go to women), P(at least one woman), P(no more than one woman), the mean number of women in repeated random drawing, and the standard deviation. Can we use the normal approximation to calculate these probabilities?
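One way to check the requested quantities is a short Python sketch. It assumes the binomial model stated above (n=9, p=0.2); the helper name `binom_pmf` is ours, not from the slides.

```python
from math import comb, sqrt

n, p = 9, 0.2  # nine awards; 20% of nominees are women (from the slide)

def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) count (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_none = binom_pmf(0, n, p)                  # P(no awards go to women)
p_at_least_one = 1 - p_none                  # complement rule
p_at_most_one = p_none + binom_pmf(1, n, p)  # P(X <= 1)
mean = n * p                                 # mean of a binomial count
sd = sqrt(n * p * (1 - p))                   # standard deviation

print(round(p_none, 4), round(p_at_least_one, 4), round(p_at_most_one, 4))
print(mean, sd)
```

Under this model P(X=0) is about 0.134, so nine male winners would not be especially surprising by chance alone. By the usual rule of thumb (np and n(1-p) both at least 10), the normal approximation does not look appropriate here, since np = 1.8.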

8.2 Warm Up

At an archaeological site that was an ancient swamp, the bones from 20 brontosaur skeletons have been unearthed. The bones do not show any sign of disease or malformation. It is thought that these animals wandered into a deep area of the swamp and became trapped in the swamp bottom. The 20 left femur bones (thigh bones) were located and 4 of these left femurs are to be randomly selected without replacement for DNA testing to determine gender.

a) Let X be the number out of the 4 selected left femurs that are from males. Based on how these bones were sampled, explain why the probability distribution of X is not binomial.

b) Suppose that the group of 20 brontosaurs whose remains were found in the swamp had been made up of 10 males and 10 females. What is the probability that all 4 in the sample to be tested are male?

c) The DNA testing revealed that all 4 femurs tested were from males. Based on this result and your answer from part (b), do you think that males and females were equally represented in the group of 20 brontosaurs stuck in the swamp? Explain.

d) Is it reasonable to generalize your conclusion from part (c) pertaining to the group of 20 brontosaurs to the population of all brontosaurs? Explain why or why not.
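For part (b), a minimal sketch of the counting argument, assuming the 10 male / 10 female split given in the problem. Because the femurs are drawn without replacement from only 20, the draws are not independent, so the count is done with combinations rather than a binomial formula. The variable names are ours.

```python
from math import comb

males, females, drawn = 10, 10, 4  # composition assumed in part (b)
total = males + females

# Ways to choose 4 males out of 10, divided by ways to choose any 4 of the 20
p_all_male = comb(males, drawn) / comb(total, drawn)

# For comparison only: what independent trials with p = 0.5 would give
p_if_independent = 0.5 ** drawn

print(round(p_all_male, 4), p_if_independent)
```

The two values differ (roughly 0.043 versus 0.0625), which illustrates why the binomial model is a poor fit when the sample is a large fraction of a small population.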

More discrimination in the workplace

There are several thousand workers at a particular factory, of which 30% are Hispanic. We randomly select a sample of 15 employees to serve on a committee to study and recommend changes to the employee benefits program. But only 3 Hispanic employees were selected, and the Hispanic employees have charged that the selection process was rigged to favor non-Hispanics. Is there evidence of this? Specifically, what is the probability that at most 3 Hispanics are chosen for the committee?
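A short sketch of the calculation, assuming the committee count is approximately binomial with n=15 and p=0.3 (reasonable because the workforce is several thousand). The variable names are ours.

```python
from math import comb

n, p = 15, 0.3  # committee size and proportion of Hispanic workers

# P(X <= 3): add the binomial probabilities for 0, 1, 2 and 3 Hispanic members
p_at_most_3 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4))

print(round(p_at_most_3, 4))
```

The result is close to 0.30, so a committee with 3 or fewer Hispanic members would happen fairly often under purely random selection.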

The geometric distribution is used when the goal is to obtain a FIXED number of SUCCESSES, here exactly one: the first success.

The random variable X is defined as counting the number of trials needed to obtain the first success.

Possible values of a geometric random variable: 1, 2, 3, … (infinite), since it is theoretically possible to proceed indefinitely without ever obtaining a success.

The Geometric Setting: 2PIFS

1. 2 outcomes (success/failure)

2. Probability of success is equal for each observation

3. The observations are independent

4. The variable of interest is the number of trials required to obtain the first success.

Examples

An experiment consists of rolling a single die. The event of interest is rolling a 3; this event is called a success. The random variable X is defined as X = the number of trials until a 3 occurs. Is this a geometric setting?

Suppose you repeatedly draw cards without replacement from a deck of 52 cards until you draw an ace. Is this a geometric setting?

Ex. 8.13: An experiment consists of rolling a single die. The event of interest is rolling a 3; this event is called a success. The random variable X is defined as X = the number of trials until a 3 occurs.

X=1    X=2    X=3
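The three cases can be sketched with the standard geometric formula P(X = n) = (1-p)^(n-1) p: n-1 failures followed by one success. The helper name `geom_pmf` is ours, and exact fractions are used so the pattern is visible.

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability of rolling a 3 on a fair die

def geom_pmf(n, p):
    """P(first success occurs on trial n) (hypothetical helper)."""
    return (1 - p) ** (n - 1) * p

for n in (1, 2, 3):
    print(f"P(X={n}) = {geom_pmf(n, p)}")
```

This gives 1/6, 5/36 and 25/216: each probability is the previous one multiplied by 5/6.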

Glenn likes the game at the fair where you toss a coin into a saucer. You win if the coin comes to rest in the saucer without sliding off. Glenn has played this a lot and has determined that he wins 1 out of every 12 times he plays. He believes his chances of winning are the same for each toss. He has no reason to think the tosses are not independent. Let X be the number of tosses until a win.

1) Find the probability of success on any given trial.

2) Find the expected number of tosses until a win.

3) Find the standard deviation.
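A minimal sketch for Glenn's game, using the standard results for a geometric random variable: mean 1/p and standard deviation sqrt(1-p)/p. The variable names are ours.

```python
from math import sqrt

p = 1 / 12           # Glenn wins 1 out of every 12 tosses
mean = 1 / p         # expected number of tosses until the first win
sd = sqrt(1 - p) / p # standard deviation of the number of tosses

print(round(p, 4), round(mean, 2), round(sd, 2))
```

The standard deviation (about 11.5) is nearly as large as the mean (12), which reflects how spread out a geometric distribution is.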

Roll a die until a 3 is observed. Find the probability that it takes more than 6 rolls to observe a 3.

Let Y be the number of Glenn’s coin tosses until a coin stays in the saucer. The expected number is 12. Find the probability that it takes more than 12 tosses to win a stuffed animal.
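Both "more than n trials" questions can be sketched with one observation: the first success comes after trial n exactly when the first n trials are all failures, so P(X > n) = (1-p)^n. The helper name `prob_more_than` is ours.

```python
def prob_more_than(n, p):
    """P(X > n) for a geometric variable: the first n trials all fail."""
    return (1 - p) ** n

p_die = prob_more_than(6, 1 / 6)      # more than 6 rolls to see a 3
p_glenn = prob_more_than(12, 1 / 12)  # more than 12 tosses to win

print(round(p_die, 4), round(p_glenn, 4))
```

Both values come out near one third. In particular, Glenn needs more than the expected 12 tosses about 35% of the time.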

For the offices in a large office building, there are 100 different lock-and-key combinations. You start testing locks to see if the key will fit. The number of locks X you must test to find one that the key fits has a geometric distribution with p = 1/100 = 0.01. (The necessary assumption here is that each office is equally likely to have any of the 100 combos; this permits us to say that p remains constant at 1/100 on each trial.)

1) What is the expected number of offices you will have to visit in order to find an office with a lock that the key fits?

2) What is the probability that you will have to visit at least 200 offices in order to find an office with a lock that the key fits?

3) What is the probability that you will have to visit at most 200 offices?
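A sketch of the three answers under the stated geometric model with p = 0.01. Note the off-by-one detail: "at least 200" means the first 199 offices all fail, while "at most 200" is the complement of the first 200 all failing. The variable names are ours.

```python
p = 0.01  # chance the key fits any given lock (from the slide)

expected_offices = 1 / p              # 1) mean of a geometric variable
p_at_least_200 = (1 - p) ** 199       # 2) P(X >= 200) = P(X > 199)
p_at_most_200 = 1 - (1 - p) ** 200    # 3) P(X <= 200) = 1 - P(X > 200)

print(round(expected_offices), round(p_at_least_200, 4), round(p_at_most_200, 4))
```

The two probabilities do not add to exactly 1 because both events include the case X = 200.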

There is a probability of 0.08 that a vaccine will cause a certain side effect. Suppose that a number of patients are inoculated with the vaccine. We are interested in the number of patients vaccinated until the first side effect is observed.

1. Define the random variable of interest. X = ?

2. Verify that this describes a geometric setting.

3. Find the probability that exactly 5 patients must be vaccinated in order to observe the first side effect.

4. Construct a probability distribution table for X (up through X = 5).

5. How many patients would you expect to have to vaccinate in order to observe the first side effect?

6. What is the probability that the number of patients vaccinated until the first side effect is observed is at most 5?
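The numerical parts (3 to 6) can be sketched as follows, assuming X = number of patients vaccinated until the first side effect, with p = 0.08. The names `table` and `geom_pmf` are ours.

```python
p = 0.08  # probability of the side effect for each patient (from the slide)

def geom_pmf(n, p):
    """P(X = n): n-1 patients without the side effect, then one with it."""
    return (1 - p) ** (n - 1) * p

# 4) Probability distribution table for X = 1..5
table = {n: geom_pmf(n, p) for n in range(1, 6)}
for n, prob in table.items():
    print(n, round(prob, 4))

p_exactly_5 = table[5]              # 3) P(X = 5)
expected_patients = 1 / p           # 5) mean of a geometric variable
p_at_most_5 = sum(table.values())   # 6) P(X <= 5), same as 1 - (1-p)**5

print(round(p_exactly_5, 4), expected_patients, round(p_at_most_5, 4))
```

Summing the table and using the shortcut 1 - (1-p)^5 give the same value, which is a useful check on the table.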

Case Closed! p. 554

5 groups (count off)

Each group will be randomly assigned a letter a-e

Turn in group paper at the end of the period

Exploring Geometric Distributions with the TI-83

Page 547 Technology Toolbox.

Other points:

The probability distribution histogram is strongly skewed to the right. The height of each bar after the 1st is the height of the previous bar times the probability of failure (1-p). Since you are multiplying each consecutive height by a number <1, each new bar will always be shorter than the previous. Therefore the histogram will ALWAYS be right-skewed.
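The shrinking-bar argument can be checked numerically. This is only an illustrative sketch: the value p = 0.25 is an arbitrary choice of ours, and any p between 0 and 1 shows the same pattern.

```python
p = 0.25  # arbitrary illustrative success probability (our assumption)

# Heights of the first eight histogram bars: P(X = 1), ..., P(X = 8)
heights = [(1 - p) ** (n - 1) * p for n in range(1, 9)]

# Ratio of each bar to the one before it
ratios = [b / a for a, b in zip(heights, heights[1:])]

print([round(h, 4) for h in heights])
print([round(r, 4) for r in ratios])
```

Every ratio equals 1-p, so the bars decrease steadily from the first one and the long tail runs off to the right.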