Probability and statistics crash course
http://www.comp.leeds.ac.uk/hannah/mathsclub
Probability 1 (for dummies:-)
Stats 1 (averages and deviations)
Probability 2 (Trials and distributions)
Stats 2 (significance)
Stats 3 (errors)
– p. 1/19
Random variables
A random variable is an abstraction: it is how probability theorists refer to things that can take more than one state, usually with associated probabilities.
A fair 6-sided dice can be modelled as a random variable with the possible outcomes {1, 2, 3, 4, 5, 6}. The probability of each state is 1/6.
Random variables can be discrete, like dice. . .
. . . or continuous, like the height of a sunflower
Continuous random variables can take any value within a range, so there are infinitely many possible values
– p. 2/19
Probability functions
For discrete random variables, we can represent the probability that each possible state occurs with a probability mass function, or pmf.
For continuous random variables, we can represent the distribution of probabilities with a probability density function, or pdf.
To find the probability of a particular continuous random variable falling between two values, calculate the area under the pdf between those values.
– p. 3/19
Bernoulli trials
Any experiment with a random outcome that can be classed as either “success” or “failure” is known as a Bernoulli trial. A discrete random variable with 2 values is all you need.
Heads or tails?
Female or male?
Throwing a 6?
The expectation E of a Bernoulli random variable with probability p is E(X) = p, and the variance is V(X) = p(1 − p).
– p. 4/19
Bernoulli distribution
The Bernoulli distribution is the distribution of outcomes in a Bernoulli trial. It’s really very simple. It’s indexed by a single parameter p, which is the probability of success.
Figure 1: pmf of a Bernoulli trial with p=0.2
– p. 5/19
Trials
Think about throwing a dice repeatedly. How long are you likely to have to wait to get a 6?
Figure 2: Waiting for a 6
– p. 6/19
Trials 2
Think about throwing a dice 50 times. How many sixes will you get?
Figure 3: Counting the number of 6s in 50 throws, 1000 times
– p. 7/19
Distributions
The previous two slides are examples of the kind of distribution you get when you ask particular questions about a Bernoulli variable, and then carry out the experiments to see what happens.
If you want to play with the parameters, the C++ code to generate the data is on the web at http://www.comp.leeds.ac.uk/hannah/mathsclub
Perhaps unsurprisingly, these distributions can be modelled theoretically...
– p. 8/19
Waiting for a 6 revisited
The probability of getting a 6 on the first throw is...
The probability of NOT getting a 6 on the first throw then getting one on the second throw is...
The probability of NOT getting a 6 on the first or second throws then getting one on the third throw is...
– p. 9/19
Geometric distribution
Waiting for success in a Bernoulli trial is governed by a Geometric distribution, where j counts the trial on which the first success arrives:
P(X = j) = (1 − p)^(j−1) p
– p. 10/19
Probability distribution: Geometric
Figure 4: Probability of rolling a 6 in X throws; theoretical
Note: Unlike a few slides back, this pmf sums to 1, and is a bit tidier!
– p. 11/19
Binomial distribution
The pmf for a sum of n independent Bernoulli random variables with success probability p is the Binomial distribution.
p(x) = (n choose x) p^x (1 − p)^(n−x)
As you will remember from a few weeks back, “n choose x” is defined as
(n choose x) = n! / (x!(n − x)!)
– p. 12/19
pdfs for continuous random variables
Three main types of pdf are found with continuous random variables. There are more, but these are the three big ones.
Continuous uniform distribution
Exponential distribution
Normal distribution (the famed Gaussian)
The probability of something falling between two values in one of these distributions is the area under the pdf between those values.
– p. 13/19
Continuous uniform distribution
A continuous uniform distribution on the interval [a, b] has pdf given by
f(x) = 1/(b − a), for a ≤ x ≤ b
At all points in [a, b], the density is the same, and equal to 1/(b − a).
– p. 14/19
Exponential distribution
The pdf of a continuous random variable can sometimes be modelled as an exponential distribution
Figure 5: Exponential distributions: lambda=0.2 and 0.5
– p. 15/19
Exponential distribution: The sums bit
f(x) = λe^(−λx), for x ≥ 0
Mean of an exponential distribution = 1/λ.
Variance of an exponential distribution = 1/λ².
– p. 16/19
Normal distribution
Most things are normally distributed (blanket statement alert). A normal distribution (Gaussian) is defined by two parameters: mean and standard deviation.
Figure 6: Normal distributions: sd=2, 4 and 1, means at either 5 or 7
– p. 17/19
Normal distribution: The sums bit
The random variable X is normally distributed with mean µ and variance σ² if the pdf is given by...
f(x) = (1 / (σ√(2π))) e^(−(1/2)((x − µ)/σ)²)
If the normal distribution has mean 0 and variance 1, it’s called the Standard Normal distribution and is referred to as Z.
– p. 18/19
Central Limit Theorem
The central limit theorem is what a lot of statistics is based upon.
The distribution of an average is approximately normal (and more so as the sample size grows), even if the distribution from which the average is drawn is totally strange
This bit is magic, and probably best addressed with an animation:
http://www.statisticalengineering.com/central_limit_theorem.htm
– p. 19/19