06/05/2008 Jae Hyun Kim Chapter 1 Probability Theory (i) : One Random Variable Bioinformatics Tea Seminar: Statistical Methods in Bioinformatics

Discrete Random Variable
Discrete Probability Distributions
Probability Generating Functions
Continuous Random Variable
Probability Density Functions
Moment Generating Functions

Content

Discrete Random Variable: a numerical quantity that, in some experiment (sample space) involving some degree of randomness, takes one value from some discrete set of possible values (event)
Sample Space: the set of all outcomes of an experiment (or observation). For example,
  Flip a coin: { H, T }
  Toss a die: { 1, 2, 3, 4, 5, 6 }
  Sum of two dice: { 2, 3, …, 12 }
Event: any subset of the outcomes in the sample space

Discrete Random Variable

The probability distribution: the set of values that this random variable can take, together with their associated probabilities
Example: Y = total number of heads when a coin is flipped twice
Probability Distribution Function: $P_Y(y) = \mathrm{Prob}(Y = y)$
Cumulative Distribution Function: $F_Y(y) = \mathrm{Prob}(Y \le y)$

Discrete Probability Distributions
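The two-flip example can be enumerated directly. The following is a minimal sketch, assuming a fair coin and independent flips (the slide does not state the head probability); the names `heads_pmf` and `cumulative` are illustrative only.

```python
from fractions import Fraction
from itertools import product


def heads_pmf(flips=2, p_heads=Fraction(1, 2)):
    """Probability distribution of Y = number of heads in `flips` flips.

    Assumption: flips are independent with the same head probability
    (p_heads = 1/2 corresponds to a fair coin).
    """
    pmf = {}
    for outcome in product("HT", repeat=flips):  # one point of the sample space
        heads = outcome.count("H")
        prob = p_heads ** heads * (1 - p_heads) ** (flips - heads)
        pmf[heads] = pmf.get(heads, Fraction(0)) + prob
    return dict(sorted(pmf.items()))


def cumulative(pmf):
    """Cumulative distribution F(y) = Prob(Y <= y) at each possible value."""
    running, cdf = Fraction(0), {}
    for y in sorted(pmf):
        running += pmf[y]
        cdf[y] = running
    return cdf


pmf = heads_pmf()
cdf = cumulative(pmf)
for y in pmf:
    print(y, pmf[y], cdf[y])
```

Under the fair-coin assumption the probabilities are 1/4, 1/2, 1/4 for y = 0, 1, 2, and the cumulative values are 1/4, 3/4, 1.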

A Bernoulli Trial
  Single trial with two possible outcomes: “success” or “failure”
  Probability of success = p

One Bernoulli Trial

The Binomial Random Variable: the number of successes in a fixed number n of independent Bernoulli trials with the same probability of success for each trial
Requirements
  Each trial must result in one of two possible outcomes
  The various trials must be independent
  The probability of success must be the same on all trials
  The number n of trials must be fixed in advance

The Binomial Distribution
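A short sketch of the binomial probabilities, using the standard formula Prob(Y = y) = C(n, y) p^y (1 - p)^(n - y). The function names and the values n = 10, p = 0.3 are illustrative assumptions, not taken from the slides. The cumulative probability is obtained here by plain summation.

```python
from math import comb


def binomial_pmf(y, n, p):
    """Prob(Y = y) for Y ~ Binomial(n, p): C(n, y) * p^y * (1 - p)^(n - y)."""
    if not 0 <= y <= n:
        return 0.0
    return comb(n, y) * p ** y * (1 - p) ** (n - y)


def binomial_cdf(y, n, p):
    """Prob(Y <= y), obtained by summing the probabilities term by term."""
    return sum(binomial_pmf(k, n, p) for k in range(0, min(y, n) + 1))


n, p = 10, 0.3  # illustrative values, not taken from the slides
probs = [binomial_pmf(y, n, p) for y in range(n + 1)]
print(sum(probs))             # should be 1 up to rounding
print(binomial_pmf(1, 1, p))  # n = 1 reduces to a single Bernoulli trial
print(binomial_cdf(3, n, p))
```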

Comments
  A single Bernoulli trial is the special case (n = 1) of the binomial distribution
  The probability p is often an unknown parameter
  There is no simple formula for the cumulative distribution function of the binomial distribution
  There is no unique “binomial distribution,” but rather a family of distributions indexed by n and p

Bernoulli Trial and Binomial Distribution

Hypergeometric Distribution
  N objects (n red, N - n white)
  m objects are taken at random, without replacement
  Y = number of red objects taken
Biological example
  N lab mice (n male, N - n female)
  m mutations
  The number Y of mutant males follows a hypergeometric distribution

The Hypergeometric Distribution
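The red/white sampling scheme can be sketched with the standard counting formula Prob(Y = y) = C(n, y) C(N - n, m - y) / C(N, m). The numbers below (20 mice, 8 male, 5 mutations) are made up for illustration, and the sketch assumes each mutation falls on a distinct mouse chosen at random.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(y, N, n, m):
    """Prob(Y = y): y red objects among m drawn without replacement
    from N objects of which n are red and N - n are white."""
    if y < max(0, m - (N - n)) or y > min(n, m):
        return Fraction(0)
    return Fraction(comb(n, y) * comb(N - n, m - y), comb(N, m))


# Illustrative numbers (not from the slides): 20 mice, 8 male, 5 mutations.
N, n, m = 20, 8, 5
dist = {y: hypergeom_pmf(y, N, n, m) for y in range(m + 1)}
mean = sum(y * prob for y, prob in dist.items())
print(dist)
print(mean)  # equals m * n / N
```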

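The Uniform Distribution
  Same probability for every value in the range
The Geometric Distribution
  Number Y of Bernoulli trials before, but not including, the first failure
  Cumulative distribution function: $F_Y(y) = 1 - p^{y+1}$, $y = 0, 1, 2, \ldots$, where p is the probability of success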

The Uniform/Geometric Distribution
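For the geometric case, a small check that summing the probabilities reproduces a closed-form cumulative distribution. This sketch assumes p denotes the success probability, so that Prob(Y = y) = p^y (1 - p) for y successes followed by the first failure; the value p = 2/3 is illustrative.

```python
from fractions import Fraction


def geometric_pmf(y, p):
    """Prob(Y = y): y successes followed by the first failure."""
    return p ** y * (1 - p)


def geometric_cdf(y, p):
    """Closed form Prob(Y <= y) = 1 - p^(y + 1)."""
    return 1 - p ** (y + 1)


p = Fraction(2, 3)  # illustrative success probability
for y in range(6):
    summed = sum(geometric_pmf(k, p) for k in range(y + 1))
    assert summed == geometric_cdf(y, p)  # summation agrees with the closed form
    print(y, geometric_pmf(y, p), geometric_cdf(y, p))
```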

The Poisson Distribution
  Events occur randomly in time/space
  For example, the number of phone calls arriving in a fixed time interval
Approximation of Binomial Distribution
  When n is large, p is small, and np is moderate:
  Binomial(n, p, x) $\approx$ Poisson($\lambda$, x), with $\lambda = np$

The Poisson Distribution
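The quality of the Poisson approximation to the binomial can be inspected numerically. This is a rough sketch with made-up values n = 1000 and p = 0.003 (so np = 3); the Poisson probabilities use the standard formula exp(-lam) lam^x / x!.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    return exp(-lam) * lam ** x / factorial(x)


n, p = 1000, 0.003  # illustrative: n large, p small, np = 3 moderate
lam = n * p
max_diff = 0.0
for x in range(11):
    b, q = binomial_pmf(x, n, p), poisson_pmf(x, lam)
    max_diff = max(max_diff, abs(b - q))
    print(x, round(b, 5), round(q, 5))
print("largest difference:", max_diff)
```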

Mean / Expected Value: $\mu = E[Y] = \sum_y y\,P_Y(y)$
Expected Value of g(Y): $E[g(Y)] = \sum_y g(y)\,P_Y(y)$

Example

Linearity Property: $E[\alpha + \beta Y] = \alpha + \beta\,E[Y]$ for constants $\alpha$ and $\beta$

In general,

Mean

Definition: $\sigma^2 = \mathrm{Var}(Y) = E[(Y - \mu)^2] = \sum_y (y - \mu)^2\,P_Y(y)$

Variance
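Mean and variance can be computed for a small discrete distribution as weighted sums over its values. The sketch below reuses the two-flip head count and assumes a fair coin; the helper `expect` and the constants a, b are illustrative.

```python
from fractions import Fraction

# Distribution of Y = number of heads in two flips, assuming a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expect(g, pmf):
    """E[g(Y)] = sum over y of g(y) * Prob(Y = y)."""
    return sum(g(y) * prob for y, prob in pmf.items())


mean = expect(lambda y: y, pmf)
variance = expect(lambda y: (y - mean) ** 2, pmf)

# Linearity: E[a + b*Y] = a + b*E[Y] for constants a and b.
a, b = 3, 5
lhs = expect(lambda y: a + b * y, pmf)

# A nonlinear g generally gives E[g(Y)] != g(E[Y]).
mean_of_square = expect(lambda y: y ** 2, pmf)

print(mean, variance, lhs, mean_of_square)
```

Here the mean is 1 and the variance is 1/2, while E[Y^2] = 3/2 differs from (E[Y])^2 = 1.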

Summary

Moment
  r th moment of the probability distribution about zero: $E[Y^r] = \sum_y y^r\,P_Y(y)$
    Mean: first moment (r = 1)
  r th moment about the mean: $E[(Y - \mu)^r] = \sum_y (y - \mu)^r\,P_Y(y)$
    Variance: r = 2

General Moments

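PGF: $P(t) = E[t^Y] = \sum_y P_Y(y)\,t^y$
Used to derive moments
  Mean: $\mu = P'(1)$
  Variance: $\sigma^2 = P''(1) + \mu - \mu^2$
If two random variables X and Y have identical probability generating functions, they are identically distributed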

Probability-Generating Function
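As a sketch of how a probability-generating function yields moments, the code below treats the pgf of a binomial variable as a polynomial whose coefficients are the probabilities, then uses the standard relations mean = P'(1) and variance = P''(1) + mean - mean^2. The binomial choice and the values n = 12, p = 0.25 are assumptions made for illustration.

```python
from math import comb


def binomial_pgf_coeffs(n, p):
    """Coefficients c_y = Prob(Y = y) of the pgf P(t) = sum_y c_y * t^y."""
    return [comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(n + 1)]


def pgf_derivatives_at_one(coeffs):
    """Return P'(1) and P''(1) by differentiating the polynomial term by term."""
    d1 = sum(y * c for y, c in enumerate(coeffs))
    d2 = sum(y * (y - 1) * c for y, c in enumerate(coeffs))
    return d1, d2


n, p = 12, 0.25  # illustrative values
d1, d2 = pgf_derivatives_at_one(binomial_pgf_coeffs(n, p))
mean = d1
variance = d2 + mean - mean ** 2
print(mean, n * p)                # both close to the binomial mean np
print(variance, n * p * (1 - p))  # both close to the binomial variance np(1 - p)
```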

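Probability density function f(x)
Probability: $\mathrm{Prob}(a < X < b) = \int_a^b f(x)\,dx$
Cumulative Distribution Function: $F(x) = \mathrm{Prob}(X \le x) = \int_{-\infty}^{x} f(u)\,du$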

Continuous Random Variable

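Mean: $\mu = E[X] = \int x\,f(x)\,dx$
Variance: $\sigma^2 = E[(X - \mu)^2] = \int (x - \mu)^2 f(x)\,dx$
Mean value of the function g(X): $E[g(X)] = \int g(x)\,f(x)\,dx$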

Mean and Variance

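Chebyshev’s Inequality: for any $d > 0$, $\mathrm{Prob}(|X - \mu| \ge d) \le \sigma^2 / d^2$
Proof: $\sigma^2 = \int (x - \mu)^2 f(x)\,dx \ge \int_{|x - \mu| \ge d} (x - \mu)^2 f(x)\,dx \ge d^2\,\mathrm{Prob}(|X - \mu| \ge d)$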

Chebyshev’s Inequality
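Chebyshev's bound, Prob(|Y - mu| >= d) <= sigma^2 / d^2, holds for any distribution with finite variance, so it can be checked against exact tail probabilities. The sketch below uses a binomial variable with illustrative values n = 20, p = 0.5; this choice of distribution is an assumption made for the example.

```python
from math import comb


def binomial_pmf(y, n, p):
    return comb(n, y) * p ** y * (1 - p) ** (n - y)


n, p = 20, 0.5  # illustrative binomial example
mu = n * p
var = n * p * (1 - p)

rows = []
for d in (2, 3, 4, 5):
    # Exact probability that Y lies at least d away from its mean.
    tail = sum(binomial_pmf(y, n, p) for y in range(n + 1) if abs(y - mu) >= d)
    bound = var / d ** 2  # Chebyshev bound sigma^2 / d^2
    rows.append((d, tail, bound))
    print(d, round(tail, 4), round(bound, 4))
```

The bound is usually far from tight; its value lies in requiring only the mean and the variance.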

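Pdf: $f(x) = 1/(b - a)$ for $a \le x \le b$, and 0 otherwise
Mean & Variance: $\mu = (a + b)/2$, $\sigma^2 = (b - a)^2/12$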

The Uniform Distribution

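Pdf: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2 / (2\sigma^2)}$, $-\infty < x < \infty$
Mean $\mu$, Variance $\sigma^2$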

The Normal Distribution

Normal Approximation to Binomial
  Condition: n is large
  Binomial(n, p, x) $\approx$ Normal($\mu = np$, $\sigma^2 = np(1 - p)$, x)
  Continuity correction
Normal Approximation to Poisson
  Condition: $\lambda$ is large
  Poisson($\lambda$, x) $\approx$ Normal($\mu = \lambda$, $\sigma^2 = \lambda$, x)

Approximation
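The effect of the continuity correction can be seen by comparing an exact binomial cumulative probability with its normal approximation evaluated at k and at k + 0.5. This is a sketch with illustrative values n = 100, p = 0.4, k = 45; the normal cdf is computed from `math.erf`.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mu, sigma):
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def binomial_cdf(k, n, p):
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(k + 1))


n, p, k = 100, 0.4, 45  # illustrative values
mu, sigma = n * p, sqrt(n * p * (1 - p))

exact = binomial_cdf(k, n, p)
plain = normal_cdf(k, mu, sigma)            # no continuity correction
corrected = normal_cdf(k + 0.5, mu, sigma)  # with continuity correction
print(exact, plain, corrected)
```

With these values the corrected approximation sits noticeably closer to the exact probability than the uncorrected one.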

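Pdf: $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
Cdf: $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$
Mean $1/\lambda$, Variance $1/\lambda^2$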

The Exponential Distribution

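Pdf: $f(x) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}$, $x > 0$
Mean and Variance: $\mu = k/\lambda$, $\sigma^2 = k/\lambda^2$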

The Gamma Distribution

Definition: $m(t) = E[e^{tX}]$
Useful to derive moments
  $m'(0) = E[X]$, $m''(0) = E[X^2]$, $m^{(n)}(0) = E[X^n]$
  mgf $m(t)$ = pgf $P(e^t)$

The Moment-Generating Function
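The derivative relations for the moment-generating function can be checked numerically with finite differences. The sketch below assumes the two-flip head count of a fair coin as the distribution; the step size h and the function names are illustrative choices.

```python
from math import exp, isclose

# Y = number of heads in two flips of a fair coin (illustrative distribution).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}


def mgf(t):
    """m(t) = E[exp(t * Y)]."""
    return sum(prob * exp(t * y) for y, prob in pmf.items())


def pgf(s):
    """P(s) = E[s^Y]."""
    return sum(prob * s ** y for y, prob in pmf.items())


h = 1e-4
first = (mgf(h) - mgf(-h)) / (2 * h)               # approximates m'(0)
second = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h ** 2  # approximates m''(0)

print(first, second)                     # close to E[Y] = 1 and E[Y^2] = 1.5
print(isclose(mgf(0.7), pgf(exp(0.7))))  # m(t) = P(e^t)
```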

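Conditional Probability: $\mathrm{Prob}(A \mid B) = \mathrm{Prob}(A \cap B) / \mathrm{Prob}(B)$, for $\mathrm{Prob}(B) > 0$
Bayes’ Formula: $\mathrm{Prob}(B \mid A) = \mathrm{Prob}(A \mid B)\,\mathrm{Prob}(B) / \mathrm{Prob}(A)$
Independence: $\mathrm{Prob}(A \cap B) = \mathrm{Prob}(A)\,\mathrm{Prob}(B)$
Memoryless Property: $\mathrm{Prob}(X > s + t \mid X > s) = \mathrm{Prob}(X > t)$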

Conditional Probability
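The memoryless property can be illustrated with the exponential distribution, whose survival function is Prob(X > x) = exp(-lam x). This is a minimal sketch using the definition of conditional probability; the rate lam and the times s, t are made-up values.

```python
from math import exp, isclose


def exp_survival(x, lam):
    """Prob(X > x) for an exponential random variable with rate lam."""
    return exp(-lam * x)


def conditional(p_a_and_b, p_b):
    """Prob(A | B) = Prob(A and B) / Prob(B), defined when Prob(B) > 0."""
    return p_a_and_b / p_b


lam, s, t = 0.5, 2.0, 3.0  # illustrative values
# The event {X > s + t} is contained in {X > s}, so their intersection
# is {X > s + t} itself.
left = conditional(exp_survival(s + t, lam), exp_survival(s, lam))
right = exp_survival(t, lam)
print(left, right)  # Prob(X > s + t | X > s) and Prob(X > t)
```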

Definition: $H(P_Y) = -\sum_y P_Y(y)\,\log P_Y(y)$
  Can be considered as a function of $P_Y(y)$
  A measure of how close to uniform that distribution is and thus, in a sense, of the unpredictability of any observed value of a random variable having that distribution
Entropy vs Variance
  Both measure, in some sense, the uncertainty of the value of a random variable having that distribution
  Entropy: a function of the probabilities $P_Y(y)$ only
  Variance: depends also on the values y that the random variable takes

Entropy
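The contrast between entropy and variance can be sketched numerically. The code assumes the Shannon form -sum p log p with a base-2 logarithm (the base is an assumption; the slides do not state one), and the distributions are invented for illustration.

```python
from math import log2, isclose


def entropy(probs):
    """Shannon entropy in bits: -sum p * log2(p), skipping zero probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


def variance(values, probs):
    mean = sum(v * p for v, p in zip(values, probs))
    return sum((v - mean) ** 2 * p for v, p in zip(values, probs))


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
print(entropy(uniform), entropy(skewed))  # the uniform case is the larger

# Same probabilities on different values: entropy unchanged, variance not.
print(variance([0, 1, 2, 3], uniform), variance([0, 10, 20, 30], uniform))
```

Rescaling the values changes the variance but leaves the entropy untouched, because entropy sees only the probabilities.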
