Remember

• Playing perfect black jack – the probability of winning a hand is .498

• What is the probability that you will win 8 of the next 10 games of blackjack?

Binomial Distribution

Ingredients:
N = total number of events
p = the probability of a success on any one trial
q = (1 – p) = the probability of a failure on any one trial
X = number of successful events

$p(X) = \frac{N!}{X!\,(N - X)!}\, p^{X} q^{(N - X)}$

$p(8) = \frac{10!}{8!\,(10 - 8)!}\,(.498)^{8}(.502)^{(10 - 8)}$

p = .0429
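The calculation above can be checked with a few lines of Python. This is a minimal sketch; the function name `binomial_pmf` is illustrative and not part of the original slides.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of a failure on any one trial
    # comb(n, x) is N! / (X! (N - X)!)
    return comb(n, x) * p**x * q**(n - x)

# Winning exactly 8 of the next 10 games with p = .498
p_eight_wins = binomial_pmf(8, 10, 0.498)
print(round(p_eight_wins, 4))
```

Rounded to four decimals, the result should agree with the p = .0429 shown on the slide.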

Binomial Distribution

• What if you are interested in the probability of winning at least 8 games of black jack?

• To do this you need to know the distribution of these probabilities

Probability of Winning Blackjack

• p = .498, N = 10

Number of Wins    p
0                 .001
1                 .010
2                 .045
3                 .119
4                 .207
5                 .246
6                 .203
7                 .115
8                 .044
9                 .009
10                .001
Total             1.00

Binomial Distribution

[Figure: bar chart of p (vertical axis, 0 to 0.3) by Games Won (horizontal axis, 0 to 10)]

Hypothesis Testing

• You wonder if winning at least 7 games of blackjack is significantly (.05) better than what would be expected due to chance.

• H1: Games won > 6

• H0: Games won < or equal to 6

• What is the probability of winning 7 or more games?

Probability of Winning Blackjack

• p = .498, N = 10

• p of winning 7 or more games

• .115 + .044 + .009 + .001 = .169

• p > .05

• Not better than chance
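The "7 or more" probability is an upper-tail sum of binomial terms. The sketch below, with the illustrative name `binomial_tail`, adds the unrounded terms, so its result can differ from the slide's .169 in the fourth decimal place.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k): sum of the binomial probabilities for k, k + 1, ..., n wins."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

alpha = 0.05
p_seven_or_more = binomial_tail(7, 10, 0.498)

# The result is significant only if the tail probability is below alpha
significant = p_seven_or_more < alpha
print(round(p_seven_or_more, 3), significant)
```

Because the tail probability is well above .05, the comparison comes out False, matching the "not better than chance" conclusion.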

Practice

• The probability of winning the “Statistical Slot Machine” is .08.

• Create a distribution of probabilities when N = 10

• Determine if winning at least 4 games of slots is significantly (.05) better than what would be expected due to chance.

Probability of Winning Slot

Number of Wins    p
0                 .434
1                 .378
2                 .148
3                 .034
4                 .005
5                 .001
6                 .000
7                 .000
8                 .000
9                 .000
10                .000
Total             1.00
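A distribution table like the one above can be generated by evaluating the binomial formula for every possible number of wins. This is a sketch under the slide's values (p = .08, N = 10); the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.08  # Statistical Slot Machine: 10 games, p = .08 per game

# One probability per possible number of wins, 0 through N
distribution = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

for wins, prob in enumerate(distribution):
    print(f"{wins:>2}  {prob:.3f}")

# The probabilities of all possible outcomes add up to 1
print(f"Total {sum(distribution):.2f}")
```

Printing to three decimals reproduces the rounding used in the table.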

Binomial Distribution

[Figure: bar chart of p (vertical axis, 0 to 0.5) by Games Won (horizontal axis, 0 to 10)]

Probability of Winning Slot

• p of winning at least 4 games

• .005 + .001 + .000 + . . . + .000 = .006

• p < .05

• Winning at least 4 games is significantly better than chance

Binomial Distribution

• These distributions can be described with means and SD.

• Mean = Np

• SD = $\sqrt{Npq}$

Binomial Distribution

• Black Jack; p = .498, N =10

• M = 4.98

• SD = 1.58

Binomial Distribution

• Statistical Slot Machine; p = .08, N = 10

• M = .8

• SD = .86
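The means and standard deviations for both games follow directly from Np and the square root of Npq. A minimal sketch, with the helper name `binomial_mean_sd` chosen here for illustration:

```python
from math import sqrt

def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Mean (Np) and standard deviation (square root of Npq) of a binomial distribution."""
    q = 1 - p
    return n * p, sqrt(n * p * q)

blackjack_mean, blackjack_sd = binomial_mean_sd(10, 0.498)
slot_mean, slot_sd = binomial_mean_sd(10, 0.08)

print(round(blackjack_mean, 2), round(blackjack_sd, 2))
print(round(slot_mean, 2), round(slot_sd, 2))
```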

Note: as N gets bigger, distributions will approach normal

Next Step

• You think someone is cheating at BLINGOO!

• p = .30 of winning

• You watch a person play 89 games of BLINGOO and win 39 times (i.e., 44%).

• Is .44 significantly bigger than .30, enough to assume that he is cheating?

Hypothesis

• H1= .44 > .30

• H0= .44 < or equal to .30

• Or

• H1= 39 wins > 26.7 wins

• H0= 39 wins < or equal to 26.7 wins

Distribution

• Mean = 26.7

• SD = 4.32

• X = 39

Z-score

$Z = \frac{X - M}{SD}$

Results

• (39 – 26.7) / 4.32 = 2.85

• p = .0022

• p < .05

• .44 is significantly bigger than .30. There is reason to believe the person is cheating!

• Or – 39 wins is significantly more than 26.7 wins (the number expected due to chance)
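The z-score test above can be sketched with Python's standard library. `NormalDist` supplies the normal-curve area that would otherwise come from a table; the variable names are illustrative, and the p value is one-tailed because H1 is directional.

```python
from math import sqrt
from statistics import NormalDist

n, p_chance, wins = 89, 0.30, 39

mean = n * p_chance                        # expected wins by chance: 26.7
sd = sqrt(n * p_chance * (1 - p_chance))   # about 4.32

z = (wins - mean) / sd
# One-tailed p: area of the standard normal curve above z
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_value, 4), p_value < 0.05)
```

Using the unrounded SD gives a z that rounds to the same 2.85 and a p value of roughly .002.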

BLINGOO Competition

• You and your friend enter a competition with 2,642 other players

• p = .30

• You win 57 of the 150 games and your friend wins 39.

• Afterward you wonder how many people
– A) did better than you?
– B) did worse than you?
– C) won between 39 and 57 games?

• You also wonder how many games you needed to win in order to be in the top 10%

Blingoo

• M = 45

• SD = 5.61

• A) did better than you?

• (57 – 45) / 5.61 = 2.14

• p = .0162

• 2,642 * .0162 = 42.8 or 43 people

Blingoo

• M = 45

• SD = 5.61

• B) did worse than you?

• (57 – 45) / 5.61 = 2.14

• p = .9838

• 2,642 * .9838 = 2,599.2 or 2,599 people

Blingoo

• M = 45

• SD = 5.61

• C) won between 39 and 57 games?

• (57 – 45) / 5.61 = 2.14; p = .4838

• (39 – 45) / 5.61 = -1.07; p = .3577

• .4838 + .3577 = .8415

• 2,642 * .8415 = 2,223.2 or 2,223 people

Blingoo

• M = 45

• SD = 5.61

• You also wonder how many games you needed to win in order to be in the top 10%

• Z = 1.28

• 45 + 5.61 (1.28) = 52.18 games or 52 games
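The four competition questions can be answered in one pass with a normal curve that has the slide's mean and SD. This is a sketch using the standard library's `NormalDist`; because it does not round z to two decimals, its counts can differ from the slide's by about one person.

```python
from math import sqrt
from statistics import NormalDist

players, n, p = 2642, 150, 0.30

# Normal approximation to the number of games won: M = Np, SD = sqrt(Npq)
wins = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

better = players * (1 - wins.cdf(57))             # A) did better than you
worse = players * wins.cdf(57)                    # B) did worse than you
between = players * (wins.cdf(57) - wins.cdf(39)) # C) won between 39 and 57
top_10_cutoff = wins.inv_cdf(0.90)                # games needed for the top 10%

print(round(better), round(worse), round(between), round(top_10_cutoff, 2))
```

`inv_cdf(0.90)` plays the role of looking up Z = 1.28 in the table and converting it back to games won.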

Practice

• In the past you have had a 5% success rate at getting someone to accept a date from you.

• What is the probability that at least 1 of the next 10 people you ask out will accept?

• Note: N isn’t big enough in these problems to use the Z-score formula

Bullied as a child?

Are you tall or short?

6’ 4”

5’ 10”

4’

2’ 4”

Is a person’s size related to whether they were bullied?

• You gathered data from 209 children at Springfield Elementary School.

• Assessed:

• Height (short vs. not short)

• Bullied (yes vs. no)

Results

                   Ever Bullied
Height        Yes     No      Total
Short         42      50      92
Not short     30      87      117
Total         72      137     209

Results (totals as percentages of N)

                   Ever Bullied
Height        Yes     No      Total
Short         42      50      44%
Not short     30      87      56%
Total         34%     66%     209

Results (cells as percentages of each row)

                   Ever Bullied
Height        Yes     No      Total
Short         46%     54%     92
Not short     26%     74%     117
Total         72      137     209

Is this difference in proportion due to chance?

• To test this you use a Chi-Square (χ²)

• Notice you are using nominal data

Hypothesis

• H1: There is a relationship between the two variables
– i.e., a person’s size is related to whether they were bullied

• H0: The two variables are independent of each other
– i.e., there is no relationship between a person’s size and whether they were bullied

Logic

• 1) Calculate an observed Chi-square

• 2) Find a critical value

• 3) See if the observed Chi-square falls in the critical area

Chi-Square

$\chi^2 = \sum \frac{(O - E)^2}{E}$

O = observed frequency

E = expected frequency

Observed Frequencies

                   Ever Bullied
Height        Yes     No      Total
Short         42      50      92
Not short     30      87      117
Total         72      137     209

Expected frequencies

• The number of observations you would expect in each cell if the null hypothesis were true

– i.e., there was no relationship between a person’s size and whether they were bullied

Expected frequencies

• To calculate a cell’s expected frequency:

• For each cell, apply this formula:

$E = \frac{(\text{row total})(\text{column total})}{N}$

Expected Frequencies

• For the Short / Yes cell: Row total = 92, Column total = 72, N = 209

• Short, bullied: E = (92 * 72) / 209 = 31.69

• Short, not bullied: E = (92 * 137) / 209 = 60.30

• Not short, bullied: E = (117 * 72) / 209 = 40.30

• Not short, not bullied: E = (117 * 137) / 209 = 76.69

Observed frequencies, with expected frequencies in parentheses:

                        Ever Bullied
Height        Yes           No            Total
Short         42 (31.69)    50 (60.30)    92
Not short     30 (40.30)    87 (76.69)    117
Total         72            137           209

The expected frequencies are what you would expect if there was no relationship between the two variables!
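The row-total-times-column-total rule can be applied to every cell at once. The sketch below uses the illustrative function name `expected_frequencies` and works for a table of any size; its unrounded values can differ from the slide's in the second decimal place.

```python
def expected_frequencies(observed: list[list[int]]) -> list[list[float]]:
    """Expected count for each cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]

# Rows: short, not short. Columns: bullied yes, bullied no.
observed = [[42, 50],
            [30, 87]]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts in each row and column still add up to the observed totals, which is a quick way to check the arithmetic.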

How do the expected frequencies work?

                        Ever Bullied
Height        Yes           No            Total
Short         42 (31.69)    50 (60.30)    92
Not short     30 (40.30)    87 (76.69)    117
Total         72            137           209

• If you randomly selected a person from these 209 people, what is the probability you would select a person who is short? 92 / 209 = .44

• If you randomly selected a person from these 209 people, what is the probability you would select a person who was bullied? 72 / 209 = .34

• If you randomly selected a person from these 209 people, what is the probability you would select a person who was bullied and is short? (.44) (.34) = .15

• How many people would you expect to have been bullied and short? (.15 * 209) = 31.35 (difference due to rounding)
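The probability argument and the expected-frequency formula are the same calculation. As a short worked derivation, assuming height and bullying are independent (which is what H0 states), the two probabilities multiply:

```latex
\begin{align*}
E &= N \cdot P(\text{short}) \cdot P(\text{bullied}) \\
  &= N \cdot \frac{\text{row total}}{N} \cdot \frac{\text{column total}}{N}
   = \frac{(\text{row total})(\text{column total})}{N} \\
  &= \frac{92 \cdot 72}{209} \approx 31.69
\end{align*}
```

Carrying the unrounded probabilities through gives 31.69 rather than 31.35, which is the rounding difference noted above.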

Back to Chi-Square

$\chi^2 = \sum \frac{(O - E)^2}{E}$

O = observed frequency

E = expected frequency

O     E        O - E      (O - E)²    (O - E)² / E
42    31.69    10.31      106.30      3.35
50    60.30    -10.30     106.09      1.76
30    40.30    -10.30     106.09      2.63
87    76.69    10.31      106.30      1.39

χ² = 9.13
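The table above amounts to one small calculation per cell followed by a sum. A minimal sketch using the rounded expected frequencies from the slides (variable names are illustrative):

```python
observed = [42, 50, 30, 87]
expected = [31.69, 60.30, 40.30, 76.69]  # rounded values from the slides

# One (O - E)^2 / E term per cell
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

for o, e, term in zip(observed, expected, contributions):
    print(f"{o:>3} {e:>6.2f} {o - e:>7.2f} {(o - e) ** 2:>7.2f} {term:>5.2f}")
print(f"chi-square = {chi_square:.2f}")
```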

Significance

• Is a χ² of 9.13 significant at the .05 level?

• To find out you need to know df

Degrees of Freedom

• To determine the degrees of freedom you use the number of rows (R) and the number of columns (C)

• df = (R - 1)(C - 1)

• Rows = 2

• Columns = 2

• df = (2 - 1)(2 - 1) = 1

Significance

• Look on page 736

• df = 1, α = .05

• χ² critical = 3.84
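If the table is not at hand, the df = 1 critical value can be recovered from the normal curve. This sketch relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal Z, so it applies to df = 1 only.

```python
from statistics import NormalDist

alpha = 0.05

# For df = 1 only: chi-square is Z squared, so the critical value is the
# square of the two-tailed Z cutoff (about 1.96 for alpha = .05)
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)
chi_square_critical = z_two_tailed ** 2

print(round(chi_square_critical, 2))
```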

Decision

• Thus, if χ² > χ² critical

– Reject H0, and accept H1

• If χ² < or = to χ² critical

– Fail to reject H0

Current Example

• χ² = 9.13

• χ² critical = 3.84

• Thus, reject H0, and accept H1

Current Example

• H1: There is a relationship between the two variables
– A person’s size is significantly (alpha = .05) related to whether they were bullied

Seven Steps for Doing χ²

• 1) State the hypothesis

• 2) Create data table

• 3) Find χ² critical

• 4) Calculate the expected frequencies

• 5) Calculate χ²

• 6) Decision

• 7) Put answer into words

Example

• With whom do you find it easiest to make friends?

• Subjects were either male or female.

• Possible responses were: “opposite sex”, “same sex”, or “no difference”

• Is there a significant (.05) relationship between the gender of the subject and their response?

Results

              Opposite Sex    Same Sex    No Difference
Females       58              16          63
Males         15              13          40

Step 1: State the Hypothesis

• H1: There is a relationship between gender and with whom a person finds it easiest to make friends

• H0: Gender and with whom a person finds it easiest to make friends are independent of each other

Step 2: Create the Data Table

              Opposite Sex    Same Sex    No Difference    Total
Females       58              16          63               137
Males         15              13          40               68
Total         73              29          103              205

Add “total” columns and rows

Step 3: Find χ² critical

• df = (R - 1)(C - 1)

• df = (2 - 1)(3 - 1) = 2

• α = .05

• χ² critical = 5.99
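For df = 2 the critical value has a simple closed form, which makes a handy check on the table lookup. The sketch below relies on the fact that with 2 degrees of freedom the area above a chi-square value x is exp(-x / 2); it does not apply to other df, and the function name is illustrative.

```python
from math import exp, log

alpha = 0.05

# For df = 2 only: P(chi-square > x) = exp(-x / 2),
# so the value that leaves alpha in the tail is -2 * ln(alpha)
chi_square_critical = -2 * log(alpha)

def p_value_df2(chi_square: float) -> float:
    """Upper-tail probability of a chi-square value when df = 2."""
    return exp(-chi_square / 2)

print(round(chi_square_critical, 2))
```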

Step 4: Calculate the Expected Frequencies

• Two steps:

• 4.1) Calculate values

• 4.2) Put values on your data table

• Females, opposite sex: E = (73 * 137) / 205 = 48.79

• Males, opposite sex: E = (73 * 68) / 205 = 24.21

• Females, same sex: E = (29 * 137) / 205 = 19.38

• Males, same sex: E = (29 * 68) / 205 = 9.62

• Females, no difference: E = (103 * 137) / 205 = 68.83

• Males, no difference: E = (103 * 68) / 205 = 34.17

              Opposite Sex    Same Sex      No Difference    Total
Females       58 (48.79)      16 (19.38)    63 (68.83)       137
Males         15 (24.21)      13 (9.62)     40 (34.17)       68
Total         73              29            103              205

Step 5: Calculate χ²

$\chi^2 = \sum \frac{(O - E)^2}{E}$

O = observed frequency

E = expected frequency

O     E        O - E     (O - E)²    (O - E)² / E
58    48.79    9.21      84.82       1.74
15    24.21    -9.21     84.82       3.50
16    19.38    -3.38     11.42       .59
13    9.62     3.38      11.42       1.19
63    68.83    -5.83     33.99       .49
40    34.17    5.83      33.99       .99

χ² = 8.5

Step 6: Decision

• Thus, if χ² > χ² critical

– Reject H0, and accept H1

• If χ² < or = to χ² critical

– Fail to reject H0

• χ² = 8.5

• χ² critical = 5.99

• 8.5 > 5.99, so reject H0, and accept H1

Step 7: Put the answer into words

• H1: There is a relationship between gender and with whom a person finds it easiest to make friends

• A person’s gender is significantly (.05) related to with whom they find it easiest to make friends.
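Steps 2 through 6 for this example can be sketched end to end in a few lines. The function name `chi_square_independence` is illustrative, and the critical value is taken from the table lookup in Step 3 rather than computed. Since no intermediate values are rounded, χ² comes out near 8.52 instead of 8.5.

```python
def chi_square_independence(observed: list[list[int]]) -> tuple[float, int]:
    """Return (chi-square, df) for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)

    chi_square = 0.0
    for r, row in enumerate(observed):
        for c, o in enumerate(row):
            e = row_totals[r] * column_totals[c] / n  # expected frequency
            chi_square += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df

# Rows: females, males. Columns: opposite sex, same sex, no difference.
observed = [[58, 16, 63],
            [15, 13, 40]]

chi_square, df = chi_square_independence(observed)
chi_square_critical = 5.99  # from the table for df = 2, alpha = .05
reject_h0 = chi_square > chi_square_critical

print(round(chi_square, 2), df, reject_h0)
```

The same function applied to the height and bullying table returns a χ² of about 9.13 with df = 1.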

Practice

• 6.15

• 6.16
