
Probability and statistics crash course

http://www.comp.leeds.ac.uk/hannah/mathsclub

Probability 1 (for dummies:-)

Stats 1 (averages and deviations)

Probability 2 (Trials and distributions)

Stats 2 (significance)

Stats 3 (errors)

– p. 1/19


Random variables

A random variable is an abstraction: it is how probability theorists refer to things that can take more than one state, usually with associated probabilities.

A fair 6 sided dice can be modelled as a random variable with the possible outcomes {1, 2, 3, 4, 5, 6}. The probability of each state is 1/6.

Random variables can be discrete, like dice. . .

. . . or continuous, like the height of a sunflower

Continuous random variables can take infinitely many different values
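The discrete case above can be written down directly. A minimal Python sketch (illustrative only; the slides themselves contain no code) modelling the dice as a set of states with associated probabilities:

```python
# A discrete random variable: a set of states, each with a probability.
# Here, a fair six-sided dice.
die_pmf = {outcome: 1 / 6 for outcome in range(1, 7)}

# The probabilities over all possible states must sum to 1.
total = sum(die_pmf.values())
```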

– p. 2/19


Probability functions

For discrete random variables, we can represent the probability that each possible state occurs with a probability mass function, or pmf.

For continuous random variables, we can represent the distribution of probabilities with a probability density function, or pdf.

To find the probability of a particular continuous random variable falling between two values, calculate the area under the pdf between those values.
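The "area under the pdf" idea can be checked numerically. A hedged Python sketch (not from the slides), using an exponential pdf as a stand-in and the midpoint rule for the area:

```python
import math

def exp_pdf(x, lam=0.5):
    """pdf of an exponential distribution, used here as an example pdf."""
    return lam * math.exp(-lam * x)

def area_under_pdf(pdf, a, b, steps=100_000):
    """Approximate the area under pdf between a and b (midpoint rule)."""
    h = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * h) for i in range(steps)) * h

# P(1 <= X <= 3) for an exponential with lambda = 0.5...
prob = area_under_pdf(exp_pdf, 1.0, 3.0)
# ...which has the closed form e^(-lambda*a) - e^(-lambda*b) for comparison.
exact = math.exp(-0.5 * 1.0) - math.exp(-0.5 * 3.0)
```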

– p. 3/19


Bernoulli trials

Any experiment where there’s a random outcome that can be classed as either “success” or “failure” is known as a Bernoulli trial. A discrete random variable with 2 values is all you need.

Heads or tails?

Female or male?

Throwing a 6?

The Expectation E of a Bernoulli distribution with probability p is E(X) = p, and the variance V(X) = p(1 − p).
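Both formulas can be checked by direct enumeration over the two states, since E(X) = Σ x·P(x) and V(X) = Σ (x − E(X))²·P(x). A small sketch (the value p = 0.3 is an arbitrary choice for illustration):

```python
p = 0.3  # arbitrary success probability, chosen for illustration
pmf = {0: 1 - p, 1: p}  # the two states of a Bernoulli variable

# Expectation: sum of value times probability over both states.
mean = sum(x * prob for x, prob in pmf.items())
# Variance: expected squared deviation from the mean.
variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())
# These should come out as p and p*(1 - p) respectively.
```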

– p. 4/19


Bernoulli distribution

The Bernoulli distribution is the distribution of outcomes in a Bernoulli trial. It’s really very simple. It’s indexed by a single parameter p, which is the probability of success.

Figure 1: pmf of a Bernoulli trial with p=0.2

– p. 5/19


Trials

Think about throwing a dice repeatedly. How long are you likely to have to wait to get a 6?

Figure 2: Waiting for a 6

– p. 6/19


Trials 2

Think about throwing a dice 50 times. How many sixes will you get?

Figure 3: Counting the number of 6s in 50 throws, 1000 times

– p. 7/19


Distributions

The previous two slides are examples of the kind of distribution you get when you ask particular questions about a Bernoulli variable, and then carry out the experiments to see what happens.

If you want to play with the parameters, the C++ code to generate the data is on the web at http://www.comp.leeds.ac.uk/hannah/mathsclub

Perhaps unsurprisingly, these distributions can be modelledtheoretically...
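The C++ code on the course page is not reproduced here, but the two experiments are simple enough to sketch in Python (an illustrative re-implementation, with a fixed seed so it is reproducible):

```python
import random

rng = random.Random(1)  # fixed seed for reproducibility

def wait_for_six():
    """Number of throws until the first 6 (the 'waiting for a 6' experiment)."""
    throws = 1
    while rng.randint(1, 6) != 6:
        throws += 1
    return throws

def sixes_in_50_throws():
    """Number of 6s in 50 throws (the 'counting' experiment)."""
    return sum(1 for _ in range(50) if rng.randint(1, 6) == 6)

waits = [wait_for_six() for _ in range(10_000)]
counts = [sixes_in_50_throws() for _ in range(1_000)]

mean_wait = sum(waits) / len(waits)     # theory says this is near 1/p = 6
mean_count = sum(counts) / len(counts)  # theory says this is near 50/6
```

Histograms of `waits` and `counts` reproduce the shapes of Figures 2 and 3.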

– p. 8/19


Waiting for a 6 revisited

The probability of getting a 6 on the first throw is 1/6.

The probability of NOT getting a 6 on the first throw then getting one on the second throw is (5/6) × (1/6).

The probability of NOT getting a 6 on the first or second throws then getting one on the third throw is (5/6)² × (1/6).

– p. 9/19


Geometric distribution

Waiting for success in a Bernoulli trial is governed by a Geometric distribution.

P(X = j) = (1 − p)^(j−1) p
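The pmf is easy to evaluate directly. A small sketch (not from the slides) checking that the probabilities sum to 1 and that the expected waiting time is 1/p = 6 for the dice example:

```python
p = 1 / 6  # probability of rolling a 6 on any one throw

def geometric_pmf(j, p):
    """P(X = j): j - 1 failures followed by a success."""
    return (1 - p) ** (j - 1) * p

# Summing over a generous range of j captures essentially all the probability...
total = sum(geometric_pmf(j, p) for j in range(1, 1000))
# ...and the expected waiting time E(X) = sum of j * P(X = j) should be 1/p.
mean = sum(j * geometric_pmf(j, p) for j in range(1, 1000))
```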

– p. 10/19


Probability distribution: Geometric

Figure 4: Probability of rolling a 6 in X throws; theoretical

Note: Unlike a few slides back, this pmf sums to 1, and is a bit tidier!

– p. 11/19


Binomial distribution

The pmf for a sum of n independent Bernoulli random variables with success probability p is the Binomial distribution.

p(x) = (n choose x) p^x (1 − p)^(n−x)

As you will remember from a few weeks back, “n choose x” is defined as

(n choose x) = n! / (x! (n − x)!)
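This pmf gives the theoretical counterpart of the "sixes in 50 throws" experiment. A sketch (illustrative, using Python's built-in `math.comb` for n choose x) checking that the pmf sums to 1 and has mean n·p:

```python
import math

n, p = 50, 1 / 6  # 50 dice throws, "success" = rolling a 6

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

total = sum(binomial_pmf(x, n, p) for x in range(n + 1))     # sums to 1
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))  # equals n * p
```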

– p. 12/19


pdfs for continuous random variables

Three main types of pdf are found with continuous random variables. There are more, but these are the three big ones.

Continuous uniform distribution

Exponential distribution

Normal distribution (the famed Gaussian)

The probability of something falling between two values in one of these distributions is the area under the pdf between those values.

– p. 13/19


Continuous uniform distribution

A continuous uniform distribution on the interval [a, b] has pdf given by

f(x) = 1/(b − a)

At all points the density is the same, equal to 1/(b − a).
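Because the density is constant, areas under a uniform pdf are just rectangles. A tiny sketch (values chosen for illustration):

```python
a, b = 2.0, 10.0  # interval of the uniform distribution

density = 1 / (b - a)  # constant density at every point of [a, b]

def uniform_prob(c, d, a, b):
    """P(c <= X <= d): area of a rectangle of width d - c, height 1/(b - a)."""
    return (d - c) / (b - a)

prob = uniform_prob(3.0, 5.0, a, b)  # width 2, height 1/8, so area 0.25
```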

– p. 14/19


Exponential distribution

The pdf of a continuous random variable can sometimes be modelled as an exponential distribution

Figure 5: Exponential distributions: lambda=0.2 and 0.5

– p. 15/19


Exponential distribution: The sums bit

f(x) = λe−λx

Mean of an exponential distribution = 1/λ.

Variance of an exponential distribution = 1/λ².
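These two results can be checked numerically, since the mean is the integral of x·f(x) and the variance is the integral of x²·f(x) minus the mean squared. A sketch (not from the slides) for λ = 0.5, where theory gives mean 2 and variance 4:

```python
import math

lam = 0.5

def exp_pdf(x):
    return lam * math.exp(-lam * x)

def integrate(g, a, b, steps=200_000):
    """Midpoint-rule integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

# The tail beyond x = 60 is negligible for lambda = 0.5.
mean = integrate(lambda x: x * exp_pdf(x), 0.0, 60.0)          # 1/lambda
second_moment = integrate(lambda x: x * x * exp_pdf(x), 0.0, 60.0)
variance = second_moment - mean ** 2                           # 1/lambda^2
```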

– p. 16/19


Normal distribution

Most things are normally distributed (blanket statement alert). A normal distribution (Gaussian) is defined by two parameters: mean and standard deviation.

Figure 6: Normal distributions: sd=2, 4 and 1, means at either 5 or 7

– p. 17/19


Normal distribution: The sums bit

The random variable X is normally distributed with mean µ and variance σ² if the pdf is given by...

f(x) = (1/(σ√(2π))) e^(−(1/2)((x − µ)/σ)²)

If the normal distribution has mean 0 and variance 1, it’s called the Standard Normal distribution and is referred to as Z.
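The pdf and the standardisation z = (x − µ)/σ are both one-liners. A sketch (illustrative; not from the slides):

```python
import math

def normal_pdf(x, mu, sigma):
    """pdf of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1 / (sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# The standard normal Z peaks at x = 0 with height 1/sqrt(2*pi).
peak = normal_pdf(0.0, 0.0, 1.0)

def standardise(x, mu, sigma):
    """Map a value of any normal X onto the standard normal Z."""
    return (x - mu) / sigma
```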

– p. 18/19


Central Limit Theorem

The central limit theorem is what a lot of statistics is based upon.

The distribution of an average is approximately normal (for a large enough sample), even if the distribution from which the average is drawn is totally strange

This bit is magic, and probably best addressed with an animation: http://www.statisticalengineering.com/central_limit_theorem.htm
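The animation's effect can also be seen numerically. A seeded sketch (not from the slides) averaging uniform(0, 1) draws, whose means should behave like a normal with mean 0.5 and standard deviation √(1/12)/√n:

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

# Averages of n uniform(0, 1) samples; by the CLT these averages look
# normal with mean 0.5 and sd sqrt(1/12)/sqrt(n), whatever the source shape.
n, trials = 30, 5_000
means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

grand_mean = sum(means) / trials
sd_of_means = (sum((m - grand_mean) ** 2 for m in means) / trials) ** 0.5
# Theory predicts sd_of_means near sqrt(1/12)/sqrt(30), about 0.053.
```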

– p. 19/19