22
STAT 22000 Lecture Slides Random Variables Yibi Huang Department of Statistics University of Chicago

STAT 22000 Lecture Slides Random Variables

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: STAT 22000 Lecture Slides Random Variables

STAT 22000 Lecture SlidesRandom Variables

Yibi HuangDepartment of StatisticsUniversity of Chicago

Page 2: STAT 22000 Lecture Slides Random Variables

Outline

Coverage: Section 2.4 in the text.

• Random Variables

• Expected Value

• Standard Deviation

• Linear combinations of random variables

1

Page 3: STAT 22000 Lecture Slides Random Variables

Random variables

Page 4: STAT 22000 Lecture Slides Random Variables

Random Variables

A random variable is a numeric quantity whose value depends onthe outcome of a random event

• We use a capital letter, like X , to denote a random variable• The values of a random variable are denoted with a lowercase

letter, in this case x, e.g., P(X = x)

Example 1. Let X be the number of heads in 2 tosses of a coin.The possible outcomes are {HH,HT ,TH,TT } and

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0

Example 2. Let Y be the number of tosses required to get a head.The possible outcomes are S = {H,TH,TTH,TTTH,TTTTH, . . .}

Y(H) = 1, Y(TH) = 2, Y(TTH) = 3, Y(TTTH) = 4, . . .

2

Page 5: STAT 22000 Lecture Slides Random Variables

Discrete and Continuous Random Variable

There are two types of random variables:

• Discrete random variables often take only integer values• Example: Number of credit hours, Difference in number of

credit hours this term vs last

• Continuous random variables take real (decimal) values• Example: Cost of books this term, lifetime of a battery

3

Page 6: STAT 22000 Lecture Slides Random Variables

Probability Distribution

A probability distribution of a discrete random variable is a list ofits possible values and the probabilities that it takes on thosevalues.

Value of X x1 x2 x3 . . .

Probability p1 p2 p3 . . .

A probability distribution must satisfy two conditions:

• 0 ≤ pi ≤ 1 for all i

• p1 + p2 + · · · = 1

4

Page 7: STAT 22000 Lecture Slides Random Variables

Example

Let X be the number of heads obtained in 2 tosses of a fair coin.What is the distribution of X?

Value of X 0 1 2Outcomes TT HT, TH HHProbability 1/4 1/2 1/4

5

Page 8: STAT 22000 Lecture Slides Random Variables

Example: A Card Game

In a game of cards you win $1 if you draw a heart, $5 if you draw anace (including the ace of hearts), $10 if you draw the king of spadesand nothing for any other card you draw. What’s the probabilitydistribution of your earning?

Event X P(X)

Heart (not ace) 1 12/52Ace 5 4/52King of spades 10 1/52All else 0 35/52Total 1

6

Page 9: STAT 22000 Lecture Slides Random Variables

Example: A Card Game (Cont’d)

Below is a visual representation of the probability distribution ofwinnings from this game:

0 1 2 3 4 5 6 7 8 9 100.0

0.1

0.2

0.3

0.4

0.5

0.6

7

Page 10: STAT 22000 Lecture Slides Random Variables

Expected Value (Mean)

• We are often interested in the average outcome of a randomvariable.

• We call this the expected value (or the mean), and it is aweighted average of the possible outcomes

µ = E(X) =k∑

i=1

xi P(X = xi)

8

Page 11: STAT 22000 Lecture Slides Random Variables

Example: A Card Game (Cont’d)

In a game of cards you win $1 if you draw a heart, $5 if you draw anace (including the ace of hearts), $10 if you draw the king of spadesand nothing for any other card you draw. What is the expectedearning in a game?

Event X P(X) X P(X)

Heart (not ace) 1 12/52 1 ×1252

=1252

Ace 5 4/52 5 ×452

=2052

King of spades 10 1/52 10 ×152

=1052

All else 0 35/52 0 ×3552

= 0

Total E(X) = 42/52 ≈ 0.819

Page 12: STAT 22000 Lecture Slides Random Variables

Interpretation of Expected Value

If we play the card game a huge number of time, what’s the averageearning per game?

Let Xi be the earning in the ith game. Since the earning can onlybe 0, 1, 5, or 10, the average amount of earning per game is

X1 + X2 + · · ·+ Xn

n=

$0 ·(# of gamesearning $0

)n

+$1 ·

(# of gamesearning $1

)n

+$5 ·

(# of gamesearning $5

)n

+$10 ·

(# of gamesearning $10

)n

By the law of large numbers,

(# of gamesearning $0

)n

→ P(X = 0) as n gets big,

and likewise for the rest. So the long run average earning in agame is just the expected value.

0 · P(X = 0) + 1 · P(X = 1) + 5 · P(X = 5) + 10 · P(X = 10) = E(X)10

Page 13: STAT 22000 Lecture Slides Random Variables

Example: A Card Game (Cont’d)

In a game of cards you win $1 if you draw a heart, $5 if you draw anace (including the ace of hearts), $10 if you draw the king of spadesand nothing for any other card you draw. If it charges a certainamount of money each time to play the game, what is the maximumamount you would be willing to pay? Explain your reasoning.

At most E(X) = 42/52 ≈ 0.81. If it charges more than that eachtime to play the game, the gambler will lose money in the long-run.

11

Page 14: STAT 22000 Lecture Slides Random Variables

Fair game

A fair game is defined as a game that costs as much as itsexpected payout, i.e. expected profit is 0.

Do you think casino games in Vegas cost more or less than theirexpected payouts?

If those games cost less than theirexpected payouts, it would mean thatthe casinos would be losing money onaverage, and hence they wouldn’t beable to pay for all this:

Image by Moyan Brenn on Flickr http:// www.flickr.com/ photos/ aigle dore/ 5951714693.

12

Page 15: STAT 22000 Lecture Slides Random Variables

Variance & Standard Deviation of a Random Variable

We are also often interested in the variability in the values of arandom variable.

The variance of a random variable X , denoted as σ2X or V(X) is

defined as

σ2X = V(X) =

∑i=1

(xi − E(X))2P(X = xi).

The standard deviation of a random variable X , denoted as σX orSD(X) is simply the square root of the variance:

σX = SD(X) =√

V(X)

13

Page 16: STAT 22000 Lecture Slides Random Variables

For the previous card game example, how much would you expectthe winnings to vary from game to game?

X P(X) X P(X) (X − E(X))2 P(X) (X − E(X))2

1 1252 1 × 12

52 = 1252 (1 − 0.81)2 = 0.0361 12

52 × 0.0361 = 0.0083

5 452 5 × 4

52 = 2052 (5 − 0.81)2 = 17.5561 4

52 × 17.5561 = 1.3505

10 152 10 × 1

52 = 1052 (10 − 0.81)2 = 84.4561 1

52 × 84.0889 = 1.6242

0 3552 0 × 35

52 = 0 (0 − 0.81)2 = 0.6561 3552 × 0.6561 = 0.4416

Total E(X) = 0.81 V(X) = 3.4246

SD(X) =√

3.4246 = 1.85

14

Page 17: STAT 22000 Lecture Slides Random Variables

Example – American Roulette

The game of American roulette involves spinning a wheel with 38 slots:18 red, 18 black, and 2 green. A ball is spun onto the wheel and willeventually land in a slot, where each slot has an equal chance of capturingthe ball. Gamblers can place bets on red or black. If the ball lands ontheir color, they double their money. If it lands on another color, they losetheir money. Suppose you bet $1 on red. What’s the expected value andstandard deviation of your winnings?

Outcome Red Black or GreenProfit X 1 −1P(X) 18/38 20/38

E(X) = 1 ·1838

+ (−1) ·2038

= −238

= −119

SD(X) =

√(1 − (−

119

)

)2

·1838

+

(−1 − (−

119

)

)2

·2038

=

√360361

≈ 0.9986

15

Page 18: STAT 22000 Lecture Slides Random Variables

Example – American Roulette (Cont’d)

Compare the expected values and SDs of the total winnings of the follow-ing three betting strategies

(a) betting $5 on red in a single round

(b) betting $3 on red in the first round, and then $2 on red in a secondround

(c) betting $1 on red in 5 different rounds

Let Xi be one’s earning for every $1 bet on red in the ith round. The totalwinning of the three strategies are

(a) 5X1

(b) 3X1 + 2X2

(c) X1 + X2 + X3 + X4 + X5

which are all linear combinations of random variables16

Page 19: STAT 22000 Lecture Slides Random Variables

Linear Combinations of Random Variables

Let X ,Y ,Z , . . . be random variables, and a, b , c, . . . be any fixednumbers.

• E(aX + bY + cZ + · · · ) = a ·E(X)+ b ·E(Y)+ c ·E(Z)+ · · ·• This formula is always valid

• V(aX +bY +cZ + · · · ) = a2 ·V(X)+b2 ·V(Y)+c2 ·V(Z)+ · · ·

• This formula is valid only when X ,Y ,Z , . . . are independent.The formula for dependent case is beyond the scope of thiscourse.

• SD(aX + bY + cZ + · · · ) =√

V(aX + bY + cZ + · · · )

17

Page 20: STAT 22000 Lecture Slides Random Variables

Example – American Roulette (Cont’d)

Recall the Xi is one’s earning for betting $1 on red in the ith round.So Xi ’s are independent with mean E(Xi) = −$1/19 and varianceV(Xi) =

360361 .

For the three gambling strategies, the expected winnings are

(a) E(5X1) = 5E(X1) = 5 · (−$1/19) = −$5/19

(b) E(3X1 + 2X2) = 3E(X1) + 2E(X2) = 3 · (−$1/19) + 2 · (−$1/19) =−$5/19

(c) E(X1 + X2 + X3 + X4 + X5) =

E(X1) + E(X2) + E(X3) + E(X4) + E(X5) = −$5/19

So the three strategies have identical expected winnings.

18

Page 21: STAT 22000 Lecture Slides Random Variables

Example – American Roulette (Cont’d)

Recall the Xi is one’s earning for betting $1 on red in the ith round.So Xi ’s are independent with mean E(Xi) = −$1/19 and varianceV(Xi) =

360361 .

For the three gambling strategies, the variance and standarddeviation of the winnings are

(a) V(5X1) = 52V(X1) = 52 · 360361

SD(5X1) =√

V(5X1) =√

52 · 360361 ≈ $4.993

(b) V(3X1 + 2X2) = 32V(X1) + 22V(X2) = (32 + 22) · 360361

SD(3X1 + 2X2) =√(32 + 22) · 360

361 ≈ $3.601(c) V(X1 + X2 + X3 + X4 + X5) =

V(X1) + V(X2) + V(X3) + V(X4) + V(X5) = 5 · 360361

SD(X1 + X2 + X3 + X4 + X5) =√

5 · 360361 = $2.233

19

Page 22: STAT 22000 Lecture Slides Random Variables

Probability distributions of winnings of the 3 strategies

Pro

babi

lity

0.00.10.20.30.40.5

−5 −4 −3 −2 −1 0 1 2 3 4 5

Pro

babi

lity

0.00.10.20.30.40.5

−5 −4 −3 −2 −1 0 1 2 3 4 5

Winnings (dollar)

Pro

babi

lity

0.00.10.20.30.40.5

−5 −4 −3 −2 −1 0 1 2 3 4 5

20