
Page 1

3/5/20

CS 237: Probability in Computing

Wayne Snyder
Computer Science Department
Boston University

Lecture 10: Expectation of Discrete Random Variables Concluded:

• Expected Value of standard distributions
• Properties of Expected Value
• Problems and paradoxes of Expected Value
• Variance and Standard Deviation of Discrete Random Variables
• Properties of Variance and Standard Deviation
• Variance and Standard Deviation of Standard Distributions

Discrete Random Variables: Expected Value

A fundamental way of characterizing a collection of real numbers is the average or mean value of the collection:

Example: The mean/average of { 2, 4, 6, 9 } = 21/4 = 5.25

The corresponding notion for a random variable X is the Expected Value:

E(X) = Σ_x x · P(X = x)

Example: For X = the roll of a fair six-sided die, E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
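This definition is easy to check numerically. Here is a minimal sketch in Python; the fair-die pmf is a hypothetical illustration, not taken from the slides:

```python
# Expected value of a discrete random variable from its pmf:
# E(X) = sum over all values x of x * P(X = x).
def expected_value(pmf):
    return sum(x * p for x, p in pmf.items())

# Hypothetical example: X = the roll of a fair six-sided die.
die = {x: 1/6 for x in range(1, 7)}
print(expected_value(die))  # approximately 3.5
```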

Page 2

Linearity of Expected Value

Theorem (Linearity of Expectation): For any random variable X and real numbers a and b,

E(aX + b) = a E(X) + b

Proof: E(aX + b) = Σ_x (ax + b) P(X = x) = a Σ_x x P(X = x) + b Σ_x P(X = x) = a E(X) + b.

Corollary: For any constant c, E(c) = c.

This will make many calculations involving expected value MUCH easier!

Linearity of Expected Value

Theorem (Linearity of Expectation of Sum of Random Variables): For any random variables X and Y,

E(X + Y) = E(X) + E(Y)

Proof:

E(X + Y) = Σ_x Σ_y (x + y) P(X = x, Y = y)
         = Σ_x x Σ_y P(X = x, Y = y) + Σ_y y Σ_x P(X = x, Y = y)
         = Σ_x x P(X = x) + Σ_y y P(Y = y)
         = E(X) + E(Y)

where in the second-to-last step, we used the Law of Total Probability:

P(X = x) = Σ_y P(X = x, Y = y)

Note: X and Y do not have to be independent!

Again, this will simplify many proofs to come.

Page 3

Expected Value: Basic Properties

Theorem (Expectation of Products of Independent Random Variables): For any independent random variables X and Y,

E(XY) = E(X) E(Y)

Proof:

E(XY) = Σ_x Σ_y x y P(X = x, Y = y)
      = Σ_x Σ_y x y P(X = x) P(Y = y)      (by independence)
      = ( Σ_x x P(X = x) ) ( Σ_y y P(Y = y) )
      = E(X) E(Y)

Note: X and Y must be independent!

Expected Value: Basic Properties

Note: This theorem is not true if X and Y are dependent. Here is a counterexample.

Suppose you flip a fair coin; let X = the number of heads showing and Y = the number of tails showing. It should be obvious that X and Y are not independent; in fact, Y = 1 – X.

Since one of X and Y is always 0, the product XY is always 0, so E(XY) = 0. This confirms your intuition that if you flip a coin and multiply the number of heads showing by the number of tails showing, you'll always get 0.

But: E(X) E(Y) = (1/2)(1/2) = 1/4 ≠ 0 = E(XY).
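The counterexample is small enough to enumerate exactly; a quick sketch in Python:

```python
# One fair coin flip: X = number of heads, Y = number of tails = 1 - X.
# The two outcomes (heads, tails) are equally likely; enumerate them exactly.
outcomes = [(1, 0), (0, 1)]  # (X, Y) for heads and for tails

E_X = sum(x for x, _ in outcomes) / 2       # 0.5
E_Y = sum(y for _, y in outcomes) / 2       # 0.5
E_XY = sum(x * y for x, y in outcomes) / 2  # 0.0: one factor is always 0

print(E_XY)       # 0.0
print(E_X * E_Y)  # 0.25
```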

Page 4

Expected Value of the Standard Distributions

Expected Value of Bernoulli

If Y ~ Bernoulli(p), then E(Y) = 1 · p + 0 · (1 – p) = p.

Expected Value of the Standard Distributions

Expected Value of Binomial

If X ~ Binomial(N, p), then X = Y1 + Y2 + … + YN, where each Yi ~ Bernoulli(p). By the linearity of expectation of sums of RVs we immediately have:

E(X) = E(Y1) + E(Y2) + … + E(YN) = N · E(Y) = N · p
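The decomposition into Bernoulli trials also makes a simulation check easy. A minimal sketch (N, p, and the trial count are arbitrary choices for illustration):

```python
import random

# Empirical check that E(X) = N * p when X ~ Binomial(N, p):
# average many samples, each built as a sum of N Bernoulli(p) trials.
random.seed(0)
N, p, trials = 10, 0.3, 200_000
samples = [sum(random.random() < p for _ in range(N)) for _ in range(trials)]
mean = sum(samples) / trials
print(mean)  # should be close to N * p = 3.0
```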

Page 5

Geometric Distribution: Expected Value

To derive the Expected Value, we can use the fact that X ~ Geometric(p) has the memoryless property and break into two cases, depending on the result of the first Bernoulli trial. Let

XS = “result of X when there is a success on the first trial”
XF = “result of X when there is a failure on the first trial”

Clearly,
o E(XS) = 1
o E(XF) = 1 + E( X for the remaining trials ) = 1 + E( X )      (by the memoryless property!)

Thus we have:

E(X) = p · E(XS) + (1 – p) · E(XF) = p + (1 – p)(1 + E(X)) = 1 + (1 – p) E(X)

Solving for E(X) gives E(X) = 1/p.

Pascal Distribution: Expected Value

Since the Pascal is simply an “iterated” version of the Geometric, we can use the theorem again!

Formally, if X ~ Pascal(m, p) and

X = “The number of Bernoulli(p) trials until m successes occur” = Y1 + … + Ym

where each Yi ~ Geometric(p) is the waiting time for the next success, then by the linearity of expectation for sums of RVs we have

E(X) = E(Y1) + … + E(Ym) = 1/p + … + 1/p = m/p      (m times)
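Both formulas can be sanity-checked by simulation. A sketch, with p = 0.25 and m = 3 as arbitrary illustrative parameters:

```python
import random

# Empirical check of E(X) = 1/p for Geometric(p), and E(X) = m/p for
# Pascal(m, p) viewed as a sum of m independent Geometric(p) waiting times.
random.seed(1)

def geometric(p):
    """Number of Bernoulli(p) trials up to and including the first success."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

p, m, trials = 0.25, 3, 100_000
geo_mean = sum(geometric(p) for _ in range(trials)) / trials
pascal_mean = sum(sum(geometric(p) for _ in range(m)) for _ in range(trials)) / trials
print(geo_mean, pascal_mean)  # close to 1/p = 4 and m/p = 12
```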

Page 6

Problems with Expected Value

Expected value is a one-number summary of a possibly very complex distribution!

How useful is this? Are there any issues or problems to keep in mind?

Issue #1: The expected value may not exist (may be ∞)!

For countably infinite discrete RVs, we calculate the expected value using an infinite sum; for example, the expected number of trials until the first success in Geometric(1/2) is

E(X) = Σ_{n≥1} n (1/2)^n = 2

The problem is that such series may not converge, and the expected value may be infinite (you’ll see what I mean when you do the current homework).

Problems with Expected Value

Expected value is a one-number summary of a possibly very complex distribution!

How useful is this? Are there any issues or problems to keep in mind?

Issue #2: (The Inspection Paradox)

For some problems, the expected value is very subtle, and depends, essentially, on your point of view, or perhaps on naively confusing “average value” with “expected value.”

Example: Suppose Wayne is teaching two classes at BU, LX 496 with 20 students, and CS 237 with 120 students.

What is the “average class size” for Wayne?

What is the “average class size” for a student in one of these classes?
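The arithmetic behind the two answers can be sketched as follows; the class sizes are the ones on the slide, and the choice of weighting is the whole point of the paradox:

```python
sizes = [20, 120]  # LX 496 and CS 237

# From the instructor's point of view: each class counts once.
teacher_avg = sum(sizes) / len(sizes)

# From a random student's point of view: a student is more likely to be in
# the large class, so each class size is weighted by the fraction of all
# students sitting in it.
total = sum(sizes)
student_avg = sum(s * (s / total) for s in sizes)

print(teacher_avg)  # 70.0
print(student_avg)  # (20*20 + 120*120) / 140, about 105.7
```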

Page 7

Issue #3: Expectation is only one point of comparison; there are many others!

Example: How useful is expected value in making the following decision?

Consider two games. Which one would you prefer to play?

Game One: For $1 per round, you can flip a coin, and I’ll give you $11 (net: $10) if heads appears, and nothing if tails appears (net: -$1). Call this the random variable X1 :

Game Two: For $1 per round, you can flip a coin 20 times, and if you get 20 heads, I’ll give you $5,767,168, else you lose the $1. Call this the random variable X2 :

Oh, and by the way, I only have time for a single round....

Any takers for Game TWO??
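Exact arithmetic shows why expectation alone cannot settle this choice. A sketch using exact fractions:

```python
from fractions import Fraction

# Exact expected net winnings of the two games.

# Game One: net +$10 with probability 1/2, net -$1 with probability 1/2.
E1 = Fraction(1, 2) * 10 + Fraction(1, 2) * (-1)

# Game Two: 20 heads in a row (probability (1/2)**20) pays $5,767,168 for a
# net of $5,767,167; otherwise you lose your $1.
p_win = Fraction(1, 2) ** 20
E2 = p_win * 5_767_167 + (1 - p_win) * (-1)

print(E1, E2)  # 9/2 9/2 -- the two games have exactly the same expected value!
```

So both games have expected net winnings of $4.50 per round; the difference between them is in how spread out the outcomes are, which is exactly what variance measures below.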

Discrete Random Variables: Expected Value

Discrete Random Variables: Variance

The question is: How much does X vary from E(X)? How spread out is the probability distribution around the expected value?

Page 8

The Variance of a random variable, Var(X), is the expected deviation from E(X).

But how to define “deviation”? Whatever we pick, it should work on simple examples.

First (doomed) attempt: deviation = distance from expected value

deviation = X – E(X) Var(X) = E[ X – E(X) ]

Example:
X1 = “Flip a coin and return the number of heads showing”
X2 = “Flip a coin and return 100 * the number of heads showing”

Discrete Random Variables: Variance

Note that E(X) is a constant, so this first attempt always gives 0:

Var(X) = E[ X – E(X) ] = E(X) – E(E(X)) = E(X) – E(X) = 0

Example:
X1 = “Flip a coin and return the number of heads showing”
X2 = “Flip a coin and return 100 * the number of heads showing”

Second attempt: deviation = absolute value of distance from E(X)

deviation = | X – E(X) | Var(X) = E[ | X – E(X) | ]

Discrete Random Variables: Variance

Page 9

Example:
X1 = “Flip a coin and return the number of heads showing”
X2 = “Flip a coin and return 100 * the number of heads showing”

Second attempt: deviation = absolute value of distance from E(X)

deviation = | X – E(X) | Var(X) = E[ | X – E(X) | ]

Looks promising! What’s wrong with that?

Ugh! It requires case analysis that blows up into an exponential number of cases, and results in piecewise functions that are not differentiable everywhere; such "piecewise" functions are very hard to work with! Anyone want to take the derivative of the following function?

Discrete Random Variables: Variance

Ok, finally, here is the best definition:

Var(X) = E[ ( X – E(X) )² ]

This is the standard definition and has several advantages:

o It is much easier to work with mathematically;
o Like the absolute value, it never gives negative values.

But it gives results which are not very intuitive!

Discrete Random Variables: Variance

Alternate notation for expected value:

μX = E(X), or just μ if X is obvious.

And what about the units? If these are dollars, then this is 2500 squared dollars...

Page 10

Discrete Random Variables: Standard Deviation

Therefore a more common measure of spread around the mean is the Standard Deviation:

σ(X) = sqrt( Var(X) )

This has all the advantages of the variance, plus three more:

o It explains simple examples;
o The units are correct; and
o It corresponds to a well-known geometric notion, the Euclidean Distance....

Discrete Random Variables: Variance and StdDev

Let's apply this idea to our games:
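A sketch of the computation for the two games, using the definition Var(X) = E[ (X – E(X))² ] directly:

```python
from fractions import Fraction
from math import sqrt

# Variance via the definition Var(X) = E[(X - E(X))^2], and standard
# deviation as its square root, for the two games.
def mean_var(dist):  # dist: list of (value, probability) pairs
    m = sum(v * p for v, p in dist)
    var = sum((v - m) ** 2 * p for v, p in dist)
    return m, var

half = Fraction(1, 2)
game1 = [(10, half), (-1, half)]
p_win = half ** 20
game2 = [(5_767_167, p_win), (-1, 1 - p_win)]

m1, v1 = mean_var(game1)
m2, v2 = mean_var(game2)
print(m1, sqrt(v1))  # 9/2 and sigma = 5.5: same mean, small spread
print(m2, sqrt(v2))  # 9/2 and a sigma in the thousands: same mean, huge spread
```

Both games have the same expected value, but wildly different standard deviations, which is exactly why expectation alone does not settle which game to play.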

Page 11

Discrete Random Variables: Variance and StdDev

Useful formulae for the Variance and Standard Deviation:

Theorem:

Var(X) = E(X²) – E(X)²

Proof:

Var(X) = E[ (X – E(X))² ] = E[ X² – 2 X E(X) + E(X)² ] = E(X²) – 2 E(X) E(X) + E(X)² = E(X²) – E(X)²

Recall that E(X) is a constant!

Discrete Random Variables: Variance and StdDev

Useful formula for the Variance and Standard Deviation, showing that the variance and the standard deviation are NOT linear functions:

Theorem:

Var(aX + b) = a² Var(X)

Proof:

Var(aX + b) = E[ (aX + b – E(aX + b))² ] = E[ (aX + b – a E(X) – b)² ] = E[ a² (X – E(X))² ] = a² Var(X)

Corollary:

σ(aX + b) = |a| σ(X)

Page 12

Discrete Random Variables: Variance and StdDev

However, independence, as usual, makes things simpler:

Theorem: (Variance of Sum of Independent Random Variables)

Let X and Y be independent random variables; then

Var(X + Y) = Var(X) + Var(Y)

Proof:

Var(X + Y) = E[ (X + Y)² ] – E(X + Y)²
           = E(X²) + 2 E(XY) + E(Y²) – E(X)² – 2 E(X) E(Y) – E(Y)²
           = Var(X) + Var(Y) + 2 ( E(XY) – E(X) E(Y) )

This last term is called the Covariance of X and Y, Cov(X,Y), and measures how much they “vary together”. For independent RVs, E(XY) = E(X) E(Y), so Cov(X,Y) = 0. This will be back in a few weeks....
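The theorem is easy to check empirically; a sketch using two independent fair dice (an arbitrary illustrative choice):

```python
import random

# Empirical check that Var(X + Y) = Var(X) + Var(Y) when X and Y are
# independent; here X and Y are two independent fair dice.
random.seed(2)

def variance(samples):
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)

n = 100_000
xs = [random.randint(1, 6) for _ in range(n)]
ys = [random.randint(1, 6) for _ in range(n)]

lhs = variance([x + y for x, y in zip(xs, ys)])
rhs = variance(xs) + variance(ys)
print(lhs, rhs)  # both close to 2 * 35/12, about 5.83
```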

Variance of Standard Distributions

Page 13


Expectation and Variance: Summary of Theorems

Variance of Standard Distributions