06: Random Variables - Stanford University...The probability of the event that a random variable...

Preview:

Citation preview

06: Random VariablesLisa Yan

April 17, 2020

1

Random Variables

2

06b_random_variables

Lisa Yan, CS109, 2020

Conditional independence review

3

Next Episode Playing in 5 seconds

Lisa Yan, CS109, 2020

Random variables are like typed

variables (with uncertainty)

int a = 5;

double b = 4.2;

bit c = 1;

𝐴 is the number of Pokemon we bring to our future battle.

𝐴 ∈ 1, 2, … , 6

𝐡 is the amount of money we get after we win a battle.

𝐡 ∈ ℝ+

𝐢 is 1 if we successfully beat the Elite Four. 0 otherwise.

𝐢 ∈ {0,1}

4

Random variables are like typed variables

name

Random

variablesCS variables

Lisa Yan, CS109, 2020

Random Variable

A random variable is a real-valued function defined on a sample space.

5

Experiment 𝑋 = π‘˜

2. What is the event (set of outcomes) where 𝑋 = 2?

Outcome

Example:

3 coins are flipped.

Let 𝑋 = # of heads.

𝑋 is a random variable.

1. What is the value of 𝑋 for the outcomes:

β€’ (T,T,T)?

β€’ (H,H,T)?

3. What is 𝑃 𝑋 = 2 ? πŸ€”

Lisa Yan, CS109, 2020

Random Variable

A random variable is a real-valued function defined on a sample space.

6

Experiment 𝑋 = π‘˜

2. What is the event (set of outcomes) where 𝑋 = 2?

Outcome

1. What is the value of 𝑋 for the outcomes:

β€’ (T,T,T)?

β€’ (H,H,T)?

3. What is 𝑃 𝑋 = 2 ?

Example:

3 coins are flipped.

Let 𝑋 = # of heads.

𝑋 is a random variable.

Lisa Yan, CS109, 2020

Random variables are NOT events!

It is confusing that random variables and events use the same notation.

β€’ Random variables β‰  events.

β€’ We can define an event to be a particular assignmentof a random variable.

7

event

𝑋 = 2probability

(number b/t 0 and 1)

𝑃(𝑋 = 2)

Example:

3 coins are flipped.

Let 𝑋 = # of heads.

𝑋 is a random variable.

Lisa Yan, CS109, 2020

Random variables are NOT events!

It is confusing that random variables and events use the same notation.

β€’ Random variables β‰  events.

β€’ We can define an event to be a particular assignmentof a random variable.

8

Example:

3 coins are flipped.

Let 𝑋 = # of heads.

𝑋 is a random variable.

𝑋 = 𝟏 {(H, T, T), (T, H, T),

(T, T, H)}

3/8

𝑋 = 𝟐 {(H, H, T), (H, T, H),

(T, H, H)}

3/8

𝑋 = πŸ‘ {(H, H, H)} 1/8

𝑋 β‰₯ 4 { } 0

𝑋 = π‘₯ Set of outcomes 𝑃 𝑋 = π‘˜

𝑋 = 𝟎 {(T, T, T)} 1/8

PMF/CDF

9

06c_pmf_cdf

Lisa Yan, CS109, 2020

So far

3 coins are flipped. Let 𝑋 = # of heads. 𝑋 is a random variable.

10

Experiment 𝑋 = ___Outcome

(flip __ heads)𝑃 𝑋 = ___

Can we get a β€œshorthand” for

this last step?

Seems like it might be useful!

Lisa Yan, CS109, 2020

Probability Mass Function

3 coins are flipped. Let 𝑋 = # of heads. 𝑋 is a random variable.

11

𝑃(𝑋 = π‘˜)return value/output

number between

0 and 1

parameter/input π‘˜

A function on π‘˜with range [0,1]

What would be a useful function to define?The probability of the event that a random variable 𝑋 takes on the value π‘˜!

For discrete random variables, this is a probability mass function.

Lisa Yan, CS109, 2020

Probability Mass Function

3 coins are flipped. Let 𝑋 = # of heads. 𝑋 is a random variable.

12

𝑃(𝑋 = π‘˜)

2

0.375return value/output:

probability of the event

𝑋 = π‘˜

parameter/input π‘˜:

a value of 𝑋

N = 3P = 0.5

def prob_event_y_equals(k):n_ways = scipy.special.binom(N, k)p_way = np.power(P, k) * np.power(1 - P, N-k)return n_ways * p_way

A function on π‘˜with range [0,1]

probability mass function

Lisa Yan, CS109, 2020

A random variable 𝑋 is discrete if it can take on countably many values.

β€’ 𝑋 = π‘₯, where π‘₯ ∈ π‘₯1, π‘₯2, π‘₯3, …

The probability mass function (PMF) of a discrete random variable is

𝑃 𝑋 = π‘₯

β€’ Probabilities must sum to 1:

This last point is a

good way to verify

any PMF you create.

Discrete RVs and Probability Mass Functions

13

𝑖=1

∞

𝑝 π‘₯𝑖 = 1

shorthand notation

= 𝑝 π‘₯ = 𝑝𝑋(π‘₯)

Lisa Yan, CS109, 2020

PMF for a single 6-sided die

Let 𝑋 be a random variable that represents the result of a singledice roll.

β€’ Support of 𝑋 : 1, 2, 3, 4, 5, 6

β€’ Therefore 𝑋 is a discreterandom variable.

β€’ PMF of X:

𝑝 π‘₯ = α‰Š1/6 π‘₯ ∈ {1, … , 6}0 otherwise

14

1/6

1 2 3 4 5 60

𝑋 = π‘₯

𝑃𝑋=π‘₯

Lisa Yan, CS109, 2020

Cumulative Distribution Functions

For a random variable 𝑋, the cumulative distribution function (CDF) is defined as

𝐹 π‘Ž = 𝐹𝑋 π‘Ž = 𝑃 𝑋 ≀ π‘Ž ,where βˆ’βˆž < π‘Ž < ∞

For a discrete RV 𝑋, the CDF is:

𝐹 π‘Ž = 𝑃 𝑋 ≀ π‘Ž =

all π‘₯β‰€π‘Ž

𝑝(π‘₯)

15

Lisa Yan, CS109, 2020

Let 𝑋 be a random variable that represents the result of a single dice roll.

16

1/6

1 2 3 4 5 60

𝑋 = π‘₯

𝑃𝑋=π‘₯

CDFs as graphsCDF (cumulative

distribution function)𝐹 π‘Ž = 𝑃 𝑋 ≀ π‘Ž

πΉπ‘Ž

𝑋 = π‘₯

5/6

4/6

3/6

2/6

1/6

PMF of 𝑋

CDF of 𝑋

𝑃 𝑋 ≀ 0 = 0

𝑃 𝑋 ≀ 6 = 1

Expectation

17

06d_expectation

Lisa Yan, CS109, 2020

Discrete random variables

18

Discrete

Random

Variable, 𝑋

Experiment

outcomes

PMF

𝑃 𝑋 = π‘₯ = 𝑝(π‘₯)

Definition

Properties

Support

CDF 𝐹 π‘₯

Without performing the experiment:

β€’ The support gives us a ballpark of what values our RV will take on

β€’ Next up: How do we report the β€œaverage” value?

Lisa Yan, CS109, 2020

Expectation

The expectation of a discrete random variable 𝑋 is defined as:

𝐸 𝑋 =

π‘₯:𝑝 π‘₯ >0

𝑝 π‘₯ β‹… π‘₯

β€’ Note: sum over all values of 𝑋 = π‘₯ that have non-zero probability.

β€’ Other names: mean, expected value, weighted average,center of mass, first moment

19

Lisa Yan, CS109, 2020

Expectation of a die roll

What is the expected value of a 6-sided die roll?

20

Expectation

of 𝑋𝐸 𝑋 =

π‘₯:𝑝 π‘₯ >0

𝑝 π‘₯ β‹… π‘₯

1. Define random variables

2. Solve

𝑃 𝑋 = π‘₯ = α‰Š1/6 π‘₯ ∈ {1, … , 6}0 otherwise

𝑋 = RV for value of roll

𝐸 𝑋 = 11

6+ 2

1

6+ 3

1

6+ 4

1

6+ 5

1

6+ 6

1

6=7

2

Lisa Yan, CS109, 2020

Important properties of expectation

1. Linearity:

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

2. Expectation of a sum = sum of expectation:

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

3. Unconscious statistician:

𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

21

β€’ Let 𝑋 = 6-sided dice roll,

π‘Œ = 2𝑋 βˆ’ 1.

β€’ 𝐸 𝑋 = 3.5β€’ 𝐸 π‘Œ = 6

Sum of two dice rolls:

β€’ Let 𝑋 = roll of die 1

π‘Œ = roll of die 2

β€’ 𝐸 𝑋 + π‘Œ = 3.5 + 3.5 = 7

These properties let you avoid

defining difficult PMFs.

Proofs (OK to stop here)

22

Lisa Yan, CS109, 2020

Important properties of expectation

1. Linearity:

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

2. Expectation of a sum = sum of expectation:

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

3. Unconscious statistician:

𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

23

Review

Lisa Yan, CS109, 2020

Linearity of Expectation proof

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

Proof:

𝐸 π‘Žπ‘‹ + 𝑏 =

π‘₯

π‘Žπ‘₯ + 𝑏 𝑝 π‘₯ =

π‘₯

π‘Žπ‘₯𝑝 π‘₯ + 𝑏𝑝 π‘₯

= π‘Ž

π‘₯

π‘₯𝑝(π‘₯) + 𝑏

π‘₯

𝑝 π‘₯

= π‘Ž 𝐸 𝑋 + 𝑏 β‹… 1

24

𝐸 𝑋 =

π‘₯:𝑝 π‘₯ >0

𝑝 π‘₯ β‹… π‘₯

Lisa Yan, CS109, 2020

Expectation of Sum intuition

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

25

(we’ll prove this

in two weeks)

𝑋 π‘Œ 𝑋 + π‘Œ

3 6 9

2 4 6

6 12 18

10 20 30

-1 -2 -3

0 0 0

8 16 24

1

7(28)

Average:1

𝑛

𝑖=1

𝑛

π‘₯𝑖1

𝑛

𝑖=1

𝑛

𝑦𝑖1

𝑛

𝑖=1

𝑛

π‘₯𝑖 + 𝑦𝑖

Intuition

for now:

+ =

+ =1

7(56)

1

7(84)

𝐸 𝑋 =

π‘₯:𝑝 π‘₯ >0

𝑝 π‘₯ β‹… π‘₯

Lisa Yan, CS109, 2020

LOTUS proof

26

Let π‘Œ = 𝑔(𝑋), where 𝑔 is a real-valued function.

𝐸 𝑔 𝑋 = 𝐸 π‘Œ =

𝑗

𝑦𝑗𝑝(𝑦𝑗)

=

𝑗

𝑦𝑗

𝑖:𝑔 π‘₯𝑖 = 𝑦𝑗

𝑝(π‘₯𝑖)

=

𝑗

𝑖:𝑔 π‘₯𝑖 = 𝑦𝑗

𝑦𝑗 𝑝(π‘₯𝑖)

=

𝑗

𝑖:𝑔 π‘₯𝑖 = 𝑦𝑗

𝑔(π‘₯𝑖) 𝑝(π‘₯𝑖)

=

𝑖

𝑔(π‘₯𝑖) 𝑝(π‘₯𝑖)

Expectation

of 𝑔 𝑋𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

For you to review

so that you can

sleep at night

Variance

27

07a_variance_i

Lisa Yan, CS109, 2020

Stanford, CA

𝐸 high = 68°F

𝐸 low = 52°F

Washington, DC

𝐸 high = 67°F

𝐸 low = 51°F

28

Average annual weather

Is 𝐸 𝑋 enough?

Lisa Yan, CS109, 2020

Stanford, CA

𝐸 high = 68°F

𝐸 low = 52°F

Washington, DC

𝐸 high = 67°F

𝐸 low = 51°F

29

Average annual weather

0

0.1

0.2

0.3

0.4

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

𝑃𝑋=π‘₯

Stanford high temps Washington high temps

35 50 65 80 90 35 50 65 80 90

68Β°F 67Β°F

Normalized histograms are approximations of PMFs.

Lisa Yan, CS109, 2020

Variance = β€œspread”

Consider the following three distributions (PMFs):

β€’ Expectation: 𝐸 𝑋 = 3 for all distributions

β€’ But the β€œspread” in the distributions is different!

β€’ Variance, Var 𝑋 : a formal quantification of β€œspread”

30

Lisa Yan, CS109, 2020

Variance

The variance of a random variable 𝑋 with mean 𝐸 𝑋 = πœ‡ is

Var 𝑋 = 𝐸 𝑋 βˆ’ πœ‡ 2

β€’ Also written as: 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

β€’ Note: Var(X) β‰₯ 0

β€’ Other names: 2nd central moment, or square of the standard deviation

Var 𝑋

def standard deviation SD 𝑋 = Var 𝑋

31

Units of 𝑋2

Units of 𝑋

Lisa Yan, CS109, 2020

Variance of Stanford weather

32

Variance

of 𝑋Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

Stanford, CA

𝐸 high = 68°F

𝐸 low = 52°F

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

Stanford high temps

35 50 65 80 90

𝑋 𝑋 βˆ’ πœ‡ 2

57Β°F 124 (Β°F)2

𝐸 𝑋 = πœ‡ = 68

71Β°F 9 (Β°F)2

75Β°F 49 (Β°F)2

69Β°F 1 (Β°F)2

… …

Variance 𝐸 𝑋 βˆ’ πœ‡ 2 = 39 (Β°F)2

Standard deviation = 6.2Β°F

Lisa Yan, CS109, 2020

Stanford, CA

𝐸 high = 68°F

Washington, DC

𝐸 high = 67°F

33

Comparing variance

0

0.1

0.2

0.3

0.4

0

0.1

0.2

0.3

0.4

𝑃𝑋=π‘₯

𝑃𝑋=π‘₯

Stanford high temps Washington high temps

35 50 65 80 90 35 50 65 80 90

Var 𝑋 = 39 Β°F 2 Var 𝑋 = 248 Β°F 2

Variance

of 𝑋Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

68Β°F 67Β°F

Properties of Variance

34

07b_variance_ii

Lisa Yan, CS109, 2020

Properties of variance

Definition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋

35

Units of 𝑋2

Units of 𝑋

β€’ Property 1 is often easier to compute than the definition

β€’ Unlike expectation, variance is not linear

Lisa Yan, CS109, 2020

Properties of variance

Definition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋

36

Units of 𝑋2

Units of 𝑋

β€’ Property 1 is often easier to compute than the definition

β€’ Unlike expectation, variance is not linear

Lisa Yan, CS109, 2020

Computing variance, a proof

37

Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

= 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

= 𝐸 𝑋2 βˆ’ πœ‡2= 𝐸 𝑋2 βˆ’ 2πœ‡2 + πœ‡2= 𝐸 𝑋2 βˆ’ 2πœ‡πΈ 𝑋 + πœ‡2

= 𝐸 𝑋 βˆ’ πœ‡ 2Let 𝐸 𝑋 = πœ‡

Everyone,

please

welcome the

second

moment!

Variance

of 𝑋Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

= 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

=

π‘₯

π‘₯ βˆ’ πœ‡ 2𝑝 π‘₯

=

π‘₯

π‘₯2 βˆ’ 2πœ‡π‘₯ + πœ‡2 𝑝 π‘₯

=

π‘₯

π‘₯2𝑝 π‘₯ βˆ’ 2πœ‡

π‘₯

π‘₯𝑝 π‘₯ + πœ‡2

π‘₯

𝑝 π‘₯

β‹… 1

Lisa Yan, CS109, 2020

Let Y = outcome of a single die roll. Recall 𝐸 π‘Œ = 7/2 .

Calculate the variance of Y.

38

Variance of a 6-sided dieVariance

of 𝑋Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

= 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

1. Approach #1: Definition

Var π‘Œ =1

61 βˆ’

7

2

2

+1

62 βˆ’

7

2

2

+1

63 βˆ’

7

2

2

+1

64 βˆ’

7

2

2

+1

65 βˆ’

7

2

2

+1

66 βˆ’

7

2

2

2. Approach #2: A property

𝐸 π‘Œ2 =1

612 + 22 + 32 + 42 + 52 + 62

= 91/6

Var π‘Œ = 91/6 βˆ’ 7/2 2

= 35/12 = 35/12

Lisa Yan, CS109, 2020

Properties of variance

Definition Var 𝑋 = 𝐸 𝑋 βˆ’ 𝐸 𝑋 2

def standard deviation SD 𝑋 = Var 𝑋

Property 1 Var 𝑋 = 𝐸 𝑋2 βˆ’ 𝐸 𝑋 2

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋

39

Units of 𝑋2

Units of 𝑋

β€’ Property 1 is often easier to compute than the definition

β€’ Unlike expectation, variance is not linear

Lisa Yan, CS109, 2020

Property 2: A proof

Property 2 Var π‘Žπ‘‹ + 𝑏 = π‘Ž2Var 𝑋

40

Var π‘Žπ‘‹ + 𝑏

= 𝐸 π‘Žπ‘‹ + 𝑏 2 βˆ’ 𝐸 π‘Žπ‘‹ + 𝑏 2 Property 1

= 𝐸 π‘Ž2𝑋2 + 2π‘Žπ‘π‘‹ + 𝑏2 βˆ’ π‘ŽπΈ 𝑋 + 𝑏 2

= π‘Ž2𝐸 𝑋2 + 2π‘Žπ‘πΈ 𝑋 + 𝑏2 βˆ’ π‘Ž2 𝐸[𝑋] 2 + 2π‘Žπ‘πΈ[𝑋] + 𝑏2

= π‘Ž2𝐸 𝑋2 βˆ’ π‘Ž2 𝐸[𝑋] 2

= π‘Ž2 𝐸 𝑋2 βˆ’ 𝐸[𝑋] 2

= π‘Ž2Var 𝑋 Property 1

Proof:

Factoring/

Linearity of

Expectation

Bernoulli RV

41

07c_bernoulli

Lisa Yan, CS109, 2020

Jacob Bernoulli

42

Jacob Bernoulli (1654-1705), also known as β€œJames”, was a Swiss mathematician

One of many mathematicians in Bernoulli family

The Bernoulli Random Variable is named for him

My academic great14 grandfather

Lisa Yan, CS109, 2020

Consider an experiment with two outcomes: β€œsuccess” and β€œfailure.”

def A Bernoulli random variable 𝑋 maps β€œsuccess” to 1 and β€œfailure” to 0.

Other names: indicator random variable, boolean random variable

Examples:β€’ Coin flip

β€’ Random binary digit

β€’ Whether a disk drive crashed

Bernoulli Random Variable

43

𝑃 𝑋 = 1 = 𝑝 1 = 𝑝𝑃 𝑋 = 0 = 𝑝 0 = 1 βˆ’ 𝑝𝑋~Ber(𝑝)

Support: {0,1} Variance

Expectation

PMF

𝐸 𝑋 = 𝑝

Var 𝑋 = 𝑝(1 βˆ’ 𝑝)

Remember this nice property of

expectation. It will come back!

Lisa Yan, CS109, 2020

Defining Bernoulli RVs

44

Serve an ad.β€’ User clicks w.p. 0.2β€’ Ignores otherwise

Let 𝑋: 1 if clicked

𝑋~Ber(___)

𝑃 𝑋 = 1 = ___

𝑃 𝑋 = 0 = ___

Roll two dice.β€’ Success: roll two 6’s

β€’ Failure: anything else

Let 𝑋 : 1 if success

𝑋~Ber(___)

𝐸 𝑋 = ___

𝑋~Ber(𝑝) 𝑝𝑋 1 = 𝑝𝑝𝑋 0 = 1 βˆ’ 𝑝𝐸 𝑋 = 𝑝

Run a programβ€’ Crashes w.p. 𝑝‒ Works w.p. 1 βˆ’ 𝑝

Let 𝑋: 1 if crash

𝑋~Ber(𝑝)

𝑃 𝑋 = 1 = 𝑝

𝑃 𝑋 = 0 = 1 βˆ’ 𝑝 πŸ€”

Lisa Yan, CS109, 2020

Defining Bernoulli RVs

45

Serve an ad.β€’ User clicks w.p. 0.2β€’ Ignores otherwise

Let 𝑋: 1 if clicked

𝑋~Ber(___)

𝑃 𝑋 = 1 = ___

𝑃 𝑋 = 0 = ___

Roll two dice.β€’ Success: roll two 6’s

β€’ Failure: anything else

Let 𝑋 : 1 if success

𝑋~Ber(___)

𝐸 𝑋 = ___

𝑋~Ber(𝑝) 𝑝𝑋 1 = 𝑝𝑝𝑋 0 = 1 βˆ’ 𝑝𝐸 𝑋 = 𝑝

Run a programβ€’ Crashes w.p. 𝑝‒ Works w.p. 1 βˆ’ 𝑝

Let 𝑋: 1 if crash

𝑋~Ber(𝑝)

𝑃 𝑋 = 1 = 𝑝

𝑃 𝑋 = 0 = 1 βˆ’ 𝑝

(live)06: Random Variables ISlides by Lisa Yan

July 6, 2020

46

Today’s discussion thread: https://us.edstem.org/courses/667/discussion/84211

Lisa Yan, CS109, 2020

Discrete random variables

47

Discrete

Random

Variable, 𝑋

Experiment

outcomes𝑃 𝑋 = π‘₯ = 𝑝(π‘₯)

𝐸 𝑋

Definition

Properties

Review

Note: Random Variables

also called distributions

Support

Lisa Yan, CS109, 2020

Event-driven probability

β€’ Relate only binary eventsβ—¦ Either happens (𝐸)

β—¦ or doesn’t happen (𝐸𝐢)

β€’ Can only report probability

β€’ Lots of combinatorics

Random Variables

β€’ Link multiple similar events together (𝑋 = 1, 𝑋 = 2,… , 𝑋 = 6)

β€’ Can compute statistics: report the β€œaverage” outcome

β€’ Once we have the PMF (discrete RVs), we can do regular math

48

A Whole New World with Random Variables

Lisa Yan, CS109, 2020

PMF for the sum of two dice

Let π‘Œ be a random variable that represents the sum oftwo independent dice rolls.

Support of π‘Œ: 2, 3, … , 11, 12

𝑝 𝑦 =

π‘¦βˆ’1

36𝑦 ∈ β„€, 2 ≀ 𝑦 ≀ 6

13βˆ’π‘¦

36𝑦 ∈ β„€, 7 ≀ 𝑦 ≀ 12

0 otherwise

Sanity check:

49

6/36

0

π‘Œ = 𝑦

5/36

2 3 4 5 6 7 8 9 10 11 12

4/36

3/36

2/36

1/36

π‘ƒπ‘Œ=𝑦

𝑦=2

12

𝑝 𝑦 = 1

Think, then Breakout Rooms

Then check out the question on the next slide. Post any clarifications here!

https://us.edstem.org/courses/667/discussion/84211

Think by yourself: 1 min

Breakout rooms: 5 min.

50

πŸ€”

Lisa Yan, CS109, 2020

Example random variable

Consider 5 flips of a coin which comes up heads with probability 𝑝.Each coin flip is an independent trial. Let 𝒀 = # of heads on 5 flips.

1. What is the support of π‘Œ? In other words, what are the values that π‘Œ can take on with non-zero probability?

2. Define the event π‘Œ = 2. What is 𝑃 π‘Œ = 2 ?

3. What is the PMF of π‘Œ? In other words, whatis 𝑃 π‘Œ = π‘˜ , for π‘˜ in the support of π‘Œ?

51

πŸ€”

Lisa Yan, CS109, 2020

Example random variable

Consider 5 flips of a coin which comes up heads with probability 𝑝.Each coin flip is an independent trial. Let 𝒀 = # of heads on 5 flips.

1. What is the support of π‘Œ? In other words, what are the values that π‘Œ can take on with non-zero probability?

2. Define the event π‘Œ = 2. What is 𝑃 π‘Œ = 2 ?

3. What is the PMF of π‘Œ? In other words, whatis 𝑃 π‘Œ = π‘˜ , for π‘˜ in the support of π‘Œ?

52

0, 1, 2, 3, 4, 5

𝑃 π‘Œ = π‘˜ =52

𝑝2 1 βˆ’ 𝑝 3

𝑃 π‘Œ = π‘˜ =5π‘˜

π‘π‘˜ 1 βˆ’ 𝑝 5βˆ’π‘˜

Lisa Yan, CS109, 2020

Expectation

Remember that the expectation of a die roll is 3.5.

53

𝑋 = RV for value of roll

𝐸 𝑋 = 11

6+ 2

1

6+ 3

1

6+ 4

1

6+ 5

1

6+ 6

1

6=7

2

Review

𝐸 𝑋 =

π‘₯:𝑝 π‘₯ >0

𝑝 π‘₯ β‹… π‘₯ Expectation: The average value

of a random variable

Lisa Yan, CS109, 2020

Lying with statistics

54

β€œThere are three kinds of lies:

lies, damned lies, and statistics”

–popularized by Mark Twain, 1906

Lisa Yan, CS109, 2020

Lying with statistics

A school has 3 classes with 5, 10, and 150 students.

What is the average class size?

55

1. Interpretation #1

β€’ Randomly choose a classwith equal probability.

β€’ 𝑋 = size of chosen class

𝐸 𝑋 = 51

3+ 10

1

3+ 150

1

3

=165

3= 55

2. Interpretation #2

β€’ Randomly choose a studentwith equal probability.

β€’ π‘Œ = size of chosen class

𝐸 π‘Œ

= 55

165+ 10

10

165+ 150

150

165

=22635

165β‰ˆ 137

Average student perception of class sizeWhat universities usually report

Interlude for fun/announcements

56

Lisa Yan, CS109, 2020

The pedagogy behind concept checks

β€’ Spaced practice (vs. ”practice makes perfect”): better memory retention

β€’ Low-stakes testing: better concept retrieval, actively connect concepts

It is okay if you don’t understand material off-the-bat. In fact,

learning research suggests that you will learn more in the long run.

Announcements

57

Problem Set 2

Out: last Wed.

Due: Friday

Covers: through today(ish)

Problem Set 1

Due: last Wed.

On-time grades: this Wed.

Solutions: next Wed.

https://www.dartmouth.edu/~cogedlab/pubs/Kang(2016,PIBBS).pdf

http://learninglab.psych.purdue.edu/downloads/2006_Roediger_Karpicke_PsychSci.pdfJOKE HERE

Lisa Yan, CS109, 2020

Ethics in Probability: Misleading Averages

β€œAverages can be misleading when a distribution is heavily skewed at one end, with a small number of unrepresentative outliers pulling the average in their direction. β€œ

β€œIn 2011, for example, the average income of the 7,878 households in Steubenville, Ohio, was $46,341. But if just two people, Warren Buffett and Oprah Winfrey, relocated to that city, the average household income in Steubenville would rise 62 percent overnight, to $75,263 per household.”

58

https://www.nytimes.com/2013/05/26/opinion/sunday/wh

en-numbers-mislead.html

Lisa Yan, CS109, 2020

Important properties of expectation

1. Linearity:

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

2. Expectation of a sum = sum of expectation:

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

3. Unconscious statistician:

𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)59

Roll a die, outcome is 𝑋. You win $2𝑋 βˆ’ 1.

What are your expected winnings?

Review

Let 𝑋 = 6-sided dice roll.

𝐸 2𝑋 βˆ’ 1 = 2 3.5 βˆ’ 1 = 6

Lisa Yan, CS109, 2020

Important properties of expectation

1. Linearity:

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

2. Expectation of a sum = sum of expectation:

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

3. Unconscious statistician:

𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)60

Roll a die, outcome is 𝑋. You win $2𝑋 βˆ’ 1.

What are your expected winnings?

Review

What is the expectation of the sum of

two dice rolls?

Let 𝑋 = 6-sided dice roll.

𝐸 2𝑋 βˆ’ 1 = 2 3.5 βˆ’ 1 = 6

Let 𝑋 = roll of die 1, π‘Œ = roll of die 2.

𝐸 𝑋 + π‘Œ = 3.5 + 3.5 = 7

Lisa Yan, CS109, 2020

Important properties of expectation

1. Linearity:

𝐸 π‘Žπ‘‹ + 𝑏 = π‘ŽπΈ 𝑋 + 𝑏

2. Expectation of a sum = sum of expectation:

𝐸 𝑋 + π‘Œ = 𝐸 𝑋 + 𝐸 π‘Œ

3. Unconscious statistician:

𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)61

Roll a die, outcome is 𝑋. You win $2𝑋 βˆ’ 1.

What are your expected winnings?

Review

What is the expectation of the sum of

two dice rolls?

Let 𝑋 = 6-sided dice roll.

𝐸 2𝑋 βˆ’ 1 = 2 3.5 βˆ’ 1 = 6

(next up)

Let 𝑋 = roll of die 1, π‘Œ = roll of die 2.

𝐸 𝑋 + π‘Œ = 3.5 + 3.5 = 7

Think, then Breakout Rooms

Then check out the question on the next slide. Post any clarifications here!

https://us.edstem.org/courses/667/discussion/84211

Think by yourself: 1 min

Breakout rooms: 5 min. Introduce yourself!

62

πŸ€”

Lisa Yan, CS109, 2020

Being a statistician unconsciously

Let 𝑋 be a discrete random variable.

β€’ 𝑃 𝑋 = π‘₯ =1

3for π‘₯ ∈ {βˆ’1, 0, 1}

Let π‘Œ = |𝑋|. What is 𝐸 π‘Œ ?

63

Expectation

of 𝑔 𝑋𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

A.1

3β‹… 1 +

1

3β‹… 0 +

1

3β‹… βˆ’1 = 0

B. 𝐸 π‘Œ = 𝐸 0 = 0

C.1

3β‹… 0 +

2

3β‹… 1 =

2

3

D.1

3β‹… βˆ’1 +

1

3β‹… 0 +

1

31 =

2

3

E. C and D πŸ€”

Lisa Yan, CS109, 2020

A.1

3β‹… 1 +

1

3β‹… 0 +

1

3β‹… βˆ’1 = 0

B. 𝐸 π‘Œ = 𝐸 0 = 0

C.1

3β‹… 0 +

2

3β‹… 1 =

2

3

D.1

3β‹… βˆ’1 +

1

3β‹… 0 +

1

31 =

2

3

E. C and D

Being a statistician unconsciously

Let 𝑋 be a discrete random variable.

β€’ 𝑃 𝑋 = π‘₯ =1

3for π‘₯ ∈ {βˆ’1, 0, 1}

Let π‘Œ = |𝑋|. What is 𝐸 π‘Œ ?

64

Expectation

of 𝑔 𝑋𝐸 𝑔 𝑋 =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

𝐸 𝑋

1. Find PMF of π‘Œ: π‘π‘Œ 0 =1

3, π‘π‘Œ 1 =

2

3

2. Compute 𝐸[π‘Œ]

Use LOTUS by using PMF of X:

1. 𝑃 𝑋 = π‘₯ β‹… π‘₯2. Sum up

𝐸 𝐸 𝑋

❌

❌

Think

Then check out the question on the next slide. Post any clarifications here!

https://us.edstem.org/courses/667/discussion/84211

Think by yourself: 2 min

65

πŸ€”(by yourself)

Lisa Yan, CS109, 2020

St. Petersburg Paradox

β€’ A fair coin (comes up β€œheads” with 𝑝 = 0.5)

β€’ Define π‘Œ = number of coin flips (β€œheads”) before first β€œtails”

β€’ You win $2π‘Œ

How much would you pay to play? (How much you expect to win?)

66

Expectation

of 𝑔 𝑋

A. $10000

B. $∞C. $1

D. $0.50

E. $0 but let me play

F. I will not play

𝐸 𝑔 π‘₯ =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

πŸ€”(by yourself)

Lisa Yan, CS109, 2020

β€’ A fair coin (comes up β€œheads” with 𝑝 = 0.5)

β€’ Define π‘Œ = number of coin flips (β€œheads”) before first β€œtails”

β€’ You win $2π‘Œ

How much would you pay to play? (How much you expect to win?)

67

St. Petersburg Paradox

1. Define random variables

2. Solve

For 𝑖 β‰₯ 0:

Let π‘Š = your winnings, 2π‘Œ.

𝐸 π‘Š = 𝐸 2π‘Œ =1

2

1

20 +1

2

2

21 +1

2

3

22 +β‹―

𝑃 π‘Œ = 𝑖 =1

2

𝑖+1

=

𝑖=0

∞1

2

𝑖+1

2𝑖 =

𝑖=0

∞1

2= ∞

Expectation

of 𝑔 𝑋𝐸 𝑔 π‘₯ =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

Lisa Yan, CS109, 2020 68

St. Petersburg + Reality

1. Define random variables

2. Solve 𝐸 π‘Š =1

2

1

20 +1

2

2

21 +1

2

3

22 +β‹―

=

𝑖=0

π‘˜1

2

𝑖+1

2𝑖 =

𝑖=0

161

2= 8.5

Expectation

of 𝑔 𝑋𝐸 𝑔 π‘₯ =

π‘₯

𝑔 π‘₯ 𝑝(π‘₯)

What if dealer has only $65,536?

β€’ Same game

β€’ If you win over $65,536, dealer leaves the country

β€’ Define π‘Œ = # heads before first tails

β€’ You win π‘Š = $2π‘Œ

π‘˜ = log2 65,536

= 16

For 𝑖 β‰₯ 0:

Let π‘Š = your winnings, 2π‘Œ.

𝑃 π‘Œ = 𝑖 =1

2

𝑖+1

Breakout Rooms

Check out the questions on the next slide.Post any clarifications here!

https://us.edstem.org/courses/667/discussion/84211

Breakout rooms: 5 min. Introduce yourself!

69

πŸ€”

Lisa Yan, CS109, 2020

Statistics: Expectation and variance

1. a. Let 𝑋 = the outcome of a 4-sideddie roll. What is 𝐸 𝑋 ?

b. Let π‘Œ = the sum of three rolls of a4-sided die. What is 𝐸 π‘Œ ?

2. Compare the variances of𝐡1~Ber 0.1 and 𝐡2~Ber(0.5).

3. a. Let 𝑍 = # of tails on 10 flips of abiased coin (w.p. 0.4 of heads). What is 𝐸 𝑍 ?

b. What is Var(𝑍)?

70

πŸ€”

Lisa Yan, CS109, 2020

Statistics: Expectation and variance

1. a. Let 𝑋 = the outcome of a 4-sideddie roll. What is 𝐸 𝑋 ?

b. Let π‘Œ = the sum of three rolls of a4-sided die. What is 𝐸 π‘Œ ?

2. Compare the variances of𝐡1~Ber 0.1 and 𝐡2~Ber(0.5).

3. a. Let 𝑍 = # of tails on 10 flips of abiased coin (w.p. 0.4 of heads). What is 𝐸 𝑍 ?

b. What is Var(𝑍)?

71

If you can identify common RVs, just look up statistics instead of re-deriving from definitions.

Lisa Yan, CS109, 2020

Our first common RVs

73

𝑋 ~ Ber(𝑝)

π‘Œ ~ Bin(𝑛, 𝑝)

1. The random

variable

2. is distributed

as a3. Bernoulli 4. with parameter

Example: Heads in one coin flip,

P(heads) = 0.8 = p

Example: # heads in 40 coin flips,

P(heads) = 0.8 = p

Review

Next Time: Binomial, Poisson, and more ☺

74

Recommended