Probability and Statistics Course Notes

8/12/2019 Probability and Statistics Course Notes


MEASURES of LOCATION and VARIABILITY

Measures of Location:

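Given numbers $x_1, x_2, \ldots, x_n$:

the sample mean is $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

Notes:

i) the population mean is denoted by $\mu$.

ii) trimmed means eliminate a percentage of outliers.

iii) categorical data (coded 0 or 1): the mean is the sample proportion.

the sample median $\tilde{x}$ (from the ordered values):

the middle value if $n$ is odd;

average of the two middle values if $n$ is even.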

Measures of Variability

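Given numbers $x_1, x_2, \ldots, x_n$:

the deviation from the mean: $x_i - \bar{x}$.

Note: $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.

the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$;

$s$ is the sample standard deviation.

Computation using shortcut method:

$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$.

the population variance $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$; $\sigma$ is the population standard deviation.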

Why use $n-1$ for $s^2$?

intuitive answer: do not know $\mu$, so overcompensate.

technical answer: the deviations $x_i - \bar{x}$ sum to zero, so $s^2$ is based on only $n-1$ degrees of freedom.

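Simple Properties of $s^2$ and $s$:

Given $x_1, x_2, \ldots, x_n$ and a constant $c$:

If $y_i = x_i + c$,

then $s_y^2 = s_x^2$ and $s_y = s_x$.

If $y_i = c\,x_i$,

then $s_y^2 = c^2 s_x^2$ and $s_y = |c|\,s_x$.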
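The definitions above can be checked numerically. The following is a minimal sketch on a made-up sample (the data values and the helper names `sample_variance` and `sample_variance_shortcut` are illustrative assumptions, not part of the notes). It compares the definition of the sample variance with the shortcut formula and checks the shift and scale properties.

```python
import math
import statistics

def sample_variance(xs):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Shortcut: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # made-up sample
c = 3.0                                # made-up constant

s2 = sample_variance(data)
s2_shortcut = sample_variance_shortcut(data)
s2_shift = sample_variance([x + c for x in data])  # y_i = x_i + c
s2_scale = sample_variance([c * x for x in data])  # y_i = c * x_i

print("mean:", statistics.mean(data), "median:", statistics.median(data))
print("s^2 (definition):", s2, "s^2 (shortcut):", s2_shortcut)
print("shifted s^2:", s2_shift, "scaled s^2:", s2_scale, "c^2 s^2:", c * c * s2)
```

The shortcut form is convenient by hand, but it subtracts two large numbers and can lose precision in floating point, so the definition is usually the safer choice in code.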



Boxplots:

Given numbers $x_1, x_2, \ldots, x_n$:

fourth spread $f_s$ = upper fourth - lower fourth

lower fourth: median of the $n/2$ or $(n+1)/2$ smallest $x_i$'s

upper fourth: median of the $n/2$ or $(n+1)/2$ largest $x_i$'s

boxplot:

1. draw and mark axis

2. draw a box extending from lower fourth to upper fourth

3. draw median line in box

4. extend lines from the box edges to the farthest $x_i$'s within $1.5 f_s$ from the edges

5. mild outliers: draw open circles at each $x_i$ from $1.5 f_s$ to $3 f_s$ from the box edges

6. extreme outliers: draw solid circles at each $x_i$ beyond $3 f_s$ from the box edges
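The boxplot quantities can be computed directly. This is a rough sketch under two assumptions: the fourths are medians of the smallest and largest halves (with the median included in both halves when $n$ is odd), and outliers are classified by distance beyond the box edges using the usual 1.5 and 3 fourth-spread thresholds. The function names and the data are made up for illustration.

```python
import statistics

def fourths(xs):
    """Lower and upper fourths: medians of the smallest and largest halves."""
    s = sorted(xs)
    n = len(s)
    half = (n + 1) // 2  # n/2 values if n is even, (n+1)/2 if n is odd
    lower = statistics.median(s[:half])
    upper = statistics.median(s[n - half:])
    return lower, upper

def classify_outliers(xs):
    """Return (fourth spread, mild outliers, extreme outliers)."""
    lower, upper = fourths(xs)
    fs = upper - lower
    mild, extreme = [], []
    for x in xs:
        dist = max(lower - x, x - upper)  # distance outside the box (negative if inside)
        if dist > 3 * fs:
            extreme.append(x)
        elif dist > 1.5 * fs:
            mild.append(x)
    return fs, mild, extreme

data = [1, 2, 3, 4, 5, 6, 7, 8, 17, 30]  # made-up sample
lower_fourth, upper_fourth = fourths(data)
fs, mild, extreme = classify_outliers(data)
print("fourths:", lower_fourth, upper_fourth, "fourth spread:", fs)
print("mild outliers:", mild, "extreme outliers:", extreme)
```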

SAMPLE SPACES and EVENTS

Sample Space $\mathcal{S}$:

The set of all possible outcomes of an experiment.

Event: A subset of outcomes in $\mathcal{S}$.

A simple event consists of only one outcome.

A compound event consists of more than one outcome.

Set Theory

The union of two events $A$ and $B$, $A \cup B$,

is all outcomes in either $A$ or $B$, or both.

The intersection of two events $A$ and $B$, $A \cap B$,

is all outcomes in both $A$ and $B$.

The complement of event $A$, $A'$,

is all outcomes in $\mathcal{S}$ not in $A$.

Disjoint or mutually exclusive events $A$ and $B$ have no outcomes in common.

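PROBABILITY AXIOMS and PROPERTIES

Objective:

Given $\mathcal{S}$, determine for each event $A$ a number

$P(A)$, the probability or chance that $A$ will occur.

Axioms of Probability

1. For any event $A$, the probability of $A$ satisfies $P(A) \ge 0$.

2. $P(\mathcal{S}) = 1$.

3. If $A_1, A_2, \ldots$ is a collection of mutually exclusive events, then $P(A_1 \cup A_2 \cup \cdots) = \sum_i P(A_i)$.

Interpretation: If an experiment with events from $\mathcal{S}$ is repeated many times, $P(A)$ is the limiting relative frequency for $A$.

Properties of Probability

For any event $A$, $P(A) = 1 - P(A')$.

If $A$ and $B$ are mutually exclusive, $P(A \cap B) = 0$.

For any two events $A$ and $B$,

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Note:

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$.

If $E_1, E_2, \ldots$ are the simple events in compound event $A$, then $P(A) = \sum_i P(E_i)$.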

COUNTING TECHNIQUES

Key Formula: If the size of $\mathcal{S}$ is $N$, all outcomes are equally likely, and the number of outcomes in $A$ is $N(A)$, then $P(A) = N(A)/N$.

Product Rule for Ordered Pairs

Rule: if the 1st element can be selected in $n_1$ ways and the 2nd element in $n_2$ ways, then the number of pairs is $n_1 n_2$.

Tree diagrams can be used to list the pairs.

General Product Rule

Use $k$-tuple to denote an ordered collection of $k$ objects.

If there are $n_1$ choices for the 1st element, $n_2$ choices for the 2nd element, ..., $n_k$ choices for the $k$th element, then there are $n_1 n_2 \cdots n_k$ possible $k$-tuples.

Permutations

An ordered set of $k$ objects taken from a set of $n$ distinct objects is a permutation of size $k$.

The number of permutations of size $k$ from $n$ distinct objects is $P_{k,n} = \frac{n!}{(n-k)!}$.

Combinations

An unordered set of $k$ objects taken from a set of $n$ distinct objects is a combination of size $k$.

The number of combinations of size $k$ from $n$ distinct objects

is $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ ($n$ choose $k$).
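As a quick sanity check, the standard counting formulas $n!/(n-k)!$ for permutations and $n!/(k!\,(n-k)!)$ for combinations can be compared with a brute-force enumeration. The sketch below uses made-up values $n = 5$, $k = 3$; the variable names are illustrative only.

```python
import math
from itertools import combinations, permutations

n, k = 5, 3          # made-up sizes
objects = range(n)   # n distinct objects

# Counts from the factorial formulas.
num_permutations = math.factorial(n) // math.factorial(n - k)
num_combinations = num_permutations // math.factorial(k)

# Counts from listing every ordered / unordered selection.
listed_permutations = len(list(permutations(objects, k)))
listed_combinations = len(list(combinations(objects, k)))

print("permutations:", num_permutations, listed_permutations)
print("combinations:", num_combinations, listed_combinations)
```

Each combination of size $k$ corresponds to $k!$ permutations, which is why the second count is the first divided by $k!$.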

CONDITIONAL PROBABILITY

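Notation: $P(A|B)$ is the conditional probability of $A$ given that $B$ has occurred.

Definition: If $P(B) > 0$ then

$P(A|B) = \frac{P(A \cap B)}{P(B)}$.

Multiplication Rule

$P(A \cap B) = P(A|B)\,P(B)$.

Note: $P(A \cap B) = P(B|A)\,P(A)$.

Exhaustive:

Events $A_1, \ldots, A_k$ are exhaustive if one must occur, so

that $A_1 \cup \cdots \cup A_k = \mathcal{S}$.

Law of Total Probability:

If $A_1, \ldots, A_k$ are exhaustive and mutually exclusive events, then for any other event $B$

$P(B) = \sum_{i=1}^{k} P(B|A_i)\,P(A_i)$.

Bayes' Theorem:

If $A_1, \ldots, A_k$ are mutually exclusive and exhaustive events with all $P(A_i) > 0$, then for

any other event $B$ with $P(B) > 0$

$P(A_j|B) = \frac{P(B|A_j)\,P(A_j)}{\sum_{i=1}^{k} P(B|A_i)\,P(A_i)}$, for $j = 1, \ldots, k$.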
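A small numerical illustration may help here. The sketch assumes three mutually exclusive and exhaustive events with made-up prior probabilities $P(A_i)$ and made-up conditional probabilities $P(B|A_i)$. It computes $P(B) = \sum_i P(B|A_i)P(A_i)$ and then the posterior probabilities $P(A_j|B) = P(B|A_j)P(A_j)/P(B)$. The function names are hypothetical.

```python
def total_probability(priors, likelihoods):
    """P(B) = sum over i of P(B | A_i) * P(A_i)."""
    return sum(p * l for p, l in zip(priors, likelihoods))

def bayes_posteriors(priors, likelihoods):
    """P(A_j | B) for every j, using Bayes' theorem."""
    p_b = total_probability(priors, likelihoods)
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

priors = [0.5, 0.3, 0.2]          # made-up P(A_1), P(A_2), P(A_3); they sum to 1
likelihoods = [0.01, 0.02, 0.05]  # made-up P(B | A_i)

p_b = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)
print("P(B) =", p_b)
print("P(A_j | B) =", posteriors)
```

In this made-up example the least likely event a priori, $A_3$, becomes the most likely once $B$ is observed, because $B$ is much more probable under $A_3$.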

INDEPENDENCE

Definition

Events $A$ and $B$ are independent if $P(A|B) = P(A)$.

Two Independent Events



Definition

A Bernoulli random variable is a random variable with only two outcomes, 0 and 1.

Definition

A discrete set has either a finite number of elements or elements that can be listed in sequence.

Definition

A discrete random variable has a discrete set of possible values.

DISCRETE RV PROBABILITY DISTRIBUTIONS

Probability Mass Function

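A probability mass function (pmf), $p(x)$, for a discrete rv $X$ is defined by $p(x) = P(X = x)$.

Pictorial representation: probability histogram.

Parameterized PMF's

A family of $p(x)$'s can depend on a parameter.

Example: Bernoulli rv's, with $p(1) = \alpha$, $p(0) = 1 - \alpha$, depend on parameter $\alpha$.

Cumulative Distribution Function

A cumulative distribution function (cdf), $F(x)$, for a discrete rv is defined by

$F(x) = P(X \le x) = \sum_{y \le x} p(y)$.

Graph of cdf for discrete rv is a step function.

For real $a$ and $b$ with $a \le b$,

$P(a \le X \le b) = F(b) - F(a-)$, where $a-$ is the largest possible value of $X$ strictly less than $a$.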

EXPECTED VALUES for DISCRETE RV's

Expected Values

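Expected value or mean value for rv $X$ with

values $x$ from some set $D$ is

$E(X) = \mu_X = \sum_{x \in D} x\,p(x)$.

Expected value of a function $h(X)$ is

$E[h(X)] = \sum_{x \in D} h(x)\,p(x)$.

Rule for expected values: for constants $a$ and $b$,

$E(aX + b) = a\,E(X) + b$.

Variance of X

Variance of $X$ is

$V(X) = \sigma_X^2 = \sum_{x \in D} (x - \mu_X)^2\,p(x) = E[(X - \mu_X)^2]$.

Standard deviation of $X$ is $\sigma_X = \sqrt{V(X)}$.

Shortcut formula: $V(X) = E(X^2) - [E(X)]^2$.

Rules for variance:

i) $V(aX + b) = a^2\,V(X)$, and ii) $\sigma_{aX+b} = |a|\,\sigma_X$.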
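The expectation and variance rules can be verified on a small pmf. The sketch below is only an illustration: the pmf values, the constants $a$ and $b$, and the helper `expected_value` are all made up. It uses $E[h(X)] = \sum h(x)p(x)$, the shortcut $V(X) = E(X^2) - [E(X)]^2$, and the rules $E(aX+b) = aE(X)+b$ and $V(aX+b) = a^2V(X)$.

```python
import math

pmf = {0: 0.2, 1: 0.5, 2: 0.3}  # made-up pmf p(x); probabilities sum to 1

def expected_value(pmf, h=lambda x: x):
    """E[h(X)] = sum of h(x) * p(x) over the possible values x."""
    return sum(h(x) * p for x, p in pmf.items())

mu = expected_value(pmf)
var_definition = expected_value(pmf, lambda x: (x - mu) ** 2)
var_shortcut = expected_value(pmf, lambda x: x * x) - mu ** 2
sd = math.sqrt(var_definition)

a, b = 2.0, 3.0  # made-up constants
mean_linear = expected_value(pmf, lambda x: a * x + b)
var_linear = expected_value(pmf, lambda x: (a * x + b - mean_linear) ** 2)

print("E(X) =", mu, "V(X) =", var_definition, "SD =", sd)
print("shortcut V(X) =", var_shortcut)
print("E(aX+b) =", mean_linear, "V(aX+b) =", var_linear)
```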

BINOMIAL DISTRIBUTION

Binomial Experiment

Conditions

1. Experiment has $n$ trials, with $n$ fixed in advance.

2. Trials are identical with S or F results only.

3. Trials are independent.

4. Probability of success for each trial is $p$.

Large population rule: an S or F without replacement experiment from a

population of size $N$ is approximately binomial if n



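Hypergeometric Experiment Conditions

1. Population to be sampled has $N$ objects.

2. Each object is labelled S or F, with $M$ S's.

3. A sample of size $n$ is drawn so that

each subset of size $n$ is equally likely.

Hypergeometric RV's

A hypergeometric random variable $X$ is $X$ = the number of successes for a sample of size $n$.

If $\max(0, n - N + M) \le x \le \min(n, M)$,

$P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$.

Notes:

a) Let $p = M/N$; $E(X) = np$ and $V(X) = \frac{N-n}{N-1}\,np(1-p)$.

b) For large $N$ and $M$, and with $p$ fixed,

$h(x; n, M, N) \approx b(x; n, p)$, the binomial pmf.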
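To see the binomial approximation at work, the sketch below evaluates the standard hypergeometric pmf $\binom{M}{x}\binom{N-M}{n-x}/\binom{N}{n}$ and the binomial pmf $\binom{n}{x}p^x(1-p)^{n-x}$ with $p = M/N$. The population sizes are made-up values chosen so that the sample is a small fraction of the population; the function names are illustrative.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x) successes in a sample of n from N objects, M of them successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """P(X = x) successes in n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

N, M, n = 1000, 300, 5   # made-up population size, successes, sample size
p = M / N
support = range(0, n + 1)  # here max(0, n-N+M) = 0 and min(n, M) = n

mean = sum(x * hypergeom_pmf(x, n, M, N) for x in support)
var = sum((x - mean) ** 2 * hypergeom_pmf(x, n, M, N) for x in support)
max_gap = max(abs(hypergeom_pmf(x, n, M, N) - binom_pmf(x, n, p)) for x in support)

print("E(X) =", mean, "np =", n * p)
print("V(X) =", var, "formula =", (N - n) / (N - 1) * n * p * (1 - p))
print("largest |h - b| =", max_gap)
```

The factor $(N-n)/(N-1)$ in the variance is close to 1 here, which is one way to see why the two distributions nearly agree.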

BINOMIAL RELATED DISTRIBUTIONS CONTINUED

Negative Binomial Experiment Conditions

1. Experiment consists of a sequence of independent trials.

2. Each trial result is S or F.

3. Probability $p$ of success is constant for each trial.

4. Experiment continues until $r$ S's are observed.

Negative Binomial RV's

A negative binomial random variable $X$ is

$X$ = the number of failures preceding the $r$th success.

For $x$ an integer with $x \ge 0$,

$nb(x; r, p) = \binom{x + r - 1}{r - 1}\,p^r (1-p)^x$.

$E(X) = r(1-p)/p$, $V(X) = r(1-p)/p^2$.

Note: If $r = 1$, $nb(x; 1, p) = p(1-p)^x$, the pmf for the geometric distribution.
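The following sketch checks the negative binomial moments numerically, assuming the standard pmf $nb(x; r, p) = \binom{x+r-1}{r-1}p^r(1-p)^x$ for the number of failures before the $r$th success. The parameter values are made up, and the infinite sums are truncated at a point where the remaining tail is negligible.

```python
from math import comb

def neg_binom_pmf(x, r, p):
    """P(X = x): x failures occur before the r-th success."""
    return comb(x + r - 1, r - 1) * p ** r * (1 - p) ** x

r, p = 3, 0.4          # made-up parameters
xs = range(0, 400)     # truncation point; the tail beyond it is negligible

total = sum(neg_binom_pmf(x, r, p) for x in xs)
mean = sum(x * neg_binom_pmf(x, r, p) for x in xs)
var = sum((x - mean) ** 2 * neg_binom_pmf(x, r, p) for x in xs)

# With r = 1 the pmf reduces to the geometric pmf p(1-p)^x.
geometric_matches = all(
    abs(neg_binom_pmf(x, 1, p) - p * (1 - p) ** x) < 1e-15 for x in range(20)
)

print("sum of pmf =", total)
print("E(X) =", mean, "formula =", r * (1 - p) / p)
print("V(X) =", var, "formula =", r * (1 - p) / p ** 2)
print("r = 1 matches geometric:", geometric_matches)
```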

POISSON DISTRIBUTION

Poisson Process Assumptions

1. At most one event can occur at random at any time (or at any point in space).

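2. The occurrence of an event in a given time (or space) interval is independent of that in any other nonoverlapping interval.

3. The probability of occurrence of an event in a small interval is proportional (with

some constant $\alpha$, the occurrence rate) to the width of the interval.

Poisson RV's

A Poisson random variable $X$ is

$X$ = number of occurrences of event in an interval.

$p(x; \lambda) = \frac{e^{-\lambda}\lambda^x}{x!}$ for $x = 0, 1, 2, \ldots$, and $\lambda > 0$.

$E(X) = V(X) = \lambda$.

Notes:

a) For large $n$ and small $p$,

$b(x; n, p) \approx p(x; \lambda)$, with $\lambda = np$.

b) If the rate $\alpha$ is given and the interval has length $t$, then $\lambda = \alpha t$ and

$P(X = x) = \frac{e^{-\alpha t}(\alpha t)^x}{x!}$.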
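A short numerical comparison illustrates the Poisson approximation to the binomial for large $n$ and small $p$. The sketch assumes the standard Poisson pmf $e^{-\lambda}\lambda^x/x!$ with $\lambda = np$; the values of $n$ and $p$ are made up.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf: exp(-lam) * lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

def binom_pmf(x, n, p):
    """Binomial pmf: C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 1000, 0.003   # made-up: many trials, small success probability
lam = n * p          # Poisson parameter used in the approximation

for x in range(7):
    print(x, round(binom_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))

max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
poisson_mean = sum(x * poisson_pmf(x, lam) for x in range(61))  # truncated sum
print("largest gap for x = 0..10:", max_gap)
print("Poisson mean (truncated sum):", poisson_mean)
```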

DISCRETE RV's SUMMARY

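Terms: Random variable, Bernoulli rv, discrete rv, probability mass function, cumulative distribution function, expected value (mean), variance, standard deviation, binomial experiment, hypergeometric experiment, negative binomial experiment, Poisson process.

Distributions

Binomial: $X$ = # of successes for $n$ trials;

$b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x}$;

$E(X) = np$ and $V(X) = np(1-p)$.

Hypergeometric: $X$ = # of S's for a sample of size $n$;

$h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$;

If $p = M/N$, $E(X) = np$ and $V(X) = \frac{N-n}{N-1}\,np(1-p)$;

for large $N$ and $M$, $h(x; n, M, N) \approx b(x; n, p)$.

Negative binomial: $X$ = # of F's preceding the $r$th S;

$nb(x; r, p) = \binom{x+r-1}{r-1} p^r (1-p)^x$;

$E(X) = r(1-p)/p$ and $V(X) = r(1-p)/p^2$;

$nb(x; 1, p)$ is the geometric distribution.

Poisson: $X$ = # of event occurrences in some interval;

$p(x; \lambda) = \frac{e^{-\lambda}\lambda^x}{x!}$; $E(X) = V(X) = \lambda$;

for large $n$ and small $p$, $b(x; n, p) \approx p(x; np)$.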

CONTINUOUS RANDOM VARIABLES

Continuous Random Variables

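Definition: an rv $X$ is continuous

if its set of possible values is an interval.

Density: a probability density function (pdf) is a function $f(x)$ defined for $x$ in

some $[a,b]$ with

$P(c \le X \le d) = \int_c^d f(x)\,dx$ for any $a \le c \le d \le b$.

Requirements: a) $f(x) \ge 0$, and b) $\int_a^b f(x)\,dx = 1$.

Density Graph: the limit of a sequence of histograms.

Some Properties:

i) $P(X = c) = 0$, and

ii) $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$.

Uniform distribution

A continuous rv has uniform distribution on $[A,B]$ if

$f(x; A, B) = \frac{1}{B - A}$ for $A \le x \le B$, and $f(x; A, B) = 0$ otherwise.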

CONTINUOUS CDF's and EXPECTATIONS

Cumulative Distribution Function

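The cdf for a continuous rv $X$, with pdf $f(y)$, is

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$.

Probabilities: $P(a \le X \le b) = F(b) - F(a)$.

Cdf for uniform pdf: $F(x) = \frac{x - A}{B - A}$ for $A \le x \le B$.

Pdf from cdf: $f(x) = F'(x)$.

Percentiles

The $100p$th percentile, $\eta(p)$, for $X$ is defined by

$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy$.

The median, $\tilde{\mu}$, for $X$ is defined by $F(\tilde{\mu}) = 0.5$.

Expectations

The mean of a continuous rv $X$ is

$\mu_X = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$.

The variance of a continuous rv $X$ is

$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx = E(X^2) - [E(X)]^2$.

The standard deviation (SD) for $X$ is $\sigma_X = \sqrt{V(X)}$.

A symmetric pdf has $\tilde{\mu} = \mu_X$.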

THE NORMAL DISTRIBUTION

Normal Distribution


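$100p$th percentile for $N(\mu, \sigma^2)$

$= \mu + [100p$th percentile for $N(0,1)] \cdot \sigma$.

Many applications to discrete populations.

Normal Approximation to Binomial

If $X \sim \text{Bin}(n, p)$ with $np$ and $nq$ both large (where $q = 1 - p$),

$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right)$.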
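The quality of the normal approximation can be checked against the exact binomial cdf. The sketch below assumes the usual continuity-corrected form $P(X \le x) \approx \Phi\big((x + 0.5 - np)/\sqrt{np(1-p)}\big)$, with $\Phi$ computed from the error function. The values of $n$, $p$ and $x$ are made up.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_cdf(x, n, p):
    """Exact P(X <= x) for a binomial rv, by summing the pmf."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def binom_cdf_normal_approx(x, n, p):
    """Normal approximation to P(X <= x) with continuity correction."""
    return phi((x + 0.5 - n * p) / sqrt(n * p * (1 - p)))

n, p, x = 50, 0.4, 22   # made-up example: np = 20 and n(1-p) = 30
exact = binom_cdf(x, n, p)
approx = binom_cdf_normal_approx(x, n, p)
print("exact:", exact)
print("normal approximation:", approx)
```

Dropping the 0.5 continuity correction generally makes the approximation noticeably worse for moderate $n$.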

THE GAMMA DISTRIBUTION

Gamma Function

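For real $\alpha > 0$, the gamma function is

$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx$.

Properties:

a) for any $\alpha > 1$, $\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$;

b) for an integer $n$, $\Gamma(n) = (n-1)!$;

c) $\Gamma(1/2) = \sqrt{\pi}$.

Gamma PDF

A gamma rv $X$ has pdf, for $\alpha > 0$ and $\beta > 0$,

$f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha - 1} e^{-x/\beta}$, for $x \ge 0$.

A standard gamma rv $X$ has pdf, for $\alpha > 0$,

$f(x; \alpha) = \frac{x^{\alpha - 1} e^{-x}}{\Gamma(\alpha)}$, for $x \ge 0$.

Gamma CDF

For $x > 0$, the incomplete gamma function is

$F(x; \alpha) = \int_0^x \frac{y^{\alpha - 1} e^{-y}}{\Gamma(\alpha)}\,dy$.

A standard gamma rv $X$ has cdf

$P(X \le x) = F(x; \alpha)$, with $x > 0$.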



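The cdf, using pdf $f(x)$, is

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$.

Expectations: mean $\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$;

variance $\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$, with SD $\sigma = \sqrt{V(X)}$.

$E(aX + b) = a\,E(X) + b$; $V(aX + b) = a^2\,V(X)$;

if $Z = aX + b$, $\sigma_Z = |a|\,\sigma_X$.

$100p$th percentile, $\eta(p)$, is defined by $p = F(\eta(p))$,

with median $\tilde{\mu}$ defined by $F(\tilde{\mu}) = 0.5$.

Distributions

Uniform: pdf $f(x; A, B) = \frac{1}{B - A}$, for $A \le x \le B$,

cdf $F(x) = \frac{x - A}{B - A}$, with $\mu = \frac{A + B}{2}$, $\sigma^2 = \frac{(B - A)^2}{12}$.

Standard Normal: pdf $f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$,

cdf $\Phi(z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy$.

If $X \sim N(\mu, \sigma^2)$, $Z = (X - \mu)/\sigma$ has cdf $\Phi(z)$.

If $X \sim \text{Bin}(n, p)$ for large $n$,

$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$.

Gamma function: for $\alpha > 0$, $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx$;

($\Gamma(n) = (n-1)!$ for integer $n$).

Standard gamma: pdf $f(x; \alpha) = \frac{x^{\alpha - 1} e^{-x}}{\Gamma(\alpha)}$,

with $x \ge 0$.

Exponential: pdf $f(x; \lambda) = \lambda e^{-\lambda x}$, cdf $F(x; \lambda) = 1 - e^{-\lambda x}$, for $x \ge 0$.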

JOINT DISTRIBUTIONS

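Joint PMF's:

assume $X$ and $Y$ are discrete rv's for $\mathcal{S}$.

The joint pmf is $p(x, y) = P(X = x \text{ and } Y = y)$.

If $A$ is a set of $(x,y)$'s, $P[(X, Y) \in A] = \sum\sum_{(x,y) \in A} p(x, y)$.

The marginal pmf's for $X$ and $Y$ are

$p_X(x) = \sum_y p(x, y)$ and $p_Y(y) = \sum_x p(x, y)$.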

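Joint PDF's

Assume $X$ and $Y$ are continuous rv's.

If the joint pdf for $X$ and $Y$ is $f(x,y)$,

$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$.

If $A = \{(x, y) : a \le x \le b,\ c \le y \le d\}$,

$P[(X, Y) \in A] = \int_a^b \int_c^d f(x, y)\,dy\,dx$.

The marginal pdf's for $X$ and $Y$ are

$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ and $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$.

Independence: two rv's $X$ and $Y$ are independent if

$p(x, y) = p_X(x)\,p_Y(y)$ for all $(x, y)$ (discrete rv's), or

$f(x, y) = f_X(x)\,f_Y(y)$ for all $(x, y)$ (continuous rv's).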

Conditional Distributions

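The conditional pdf of $Y$ given $X = x$, for

continuous rv's $X$ and $Y$, is $f_{Y|X}(y|x) = \frac{f(x, y)}{f_X(x)}$.

The conditional pmf of $Y$ given $X = x$, for

discrete rv's $X$ and $Y$, is $p_{Y|X}(y|x) = \frac{p(x, y)}{p_X(x)}$.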

EXPECTED VALUES

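Assume rv's $X$ and $Y$ with pmf $p(x,y)$ or pdf $f(x,y)$.

Expected Value:

the expected value of $h(X, Y)$ is

$E[h(X, Y)] = \sum_x \sum_y h(x, y)\,p(x, y)$ for discrete rv's, or $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x, y)\,f(x, y)\,dx\,dy$ for continuous rv's.

Covariance:

the covariance between $X$ and $Y$ is

$\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$.

Correlation

The correlation coefficient

$\rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$.

$-1 \le \rho_{X,Y} \le 1$.

If $X$ and $Y$ are independent, then $\rho_{X,Y} = 0$.

$\rho_{X,Y} = \pm 1$ iff $Y = aX + b$, with $a \ne 0$.

If $\rho_{X,Y} = 0$ then $X$ and $Y$ are uncorrelated.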
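As an illustration, covariance and correlation can be computed directly from a small joint pmf. The sketch below uses a made-up joint pmf on $\{0,1\} \times \{0,1\}$ and the standard formulas $\text{Cov}(X,Y) = E(XY) - E(X)E(Y)$ and $\rho = \text{Cov}(X,Y)/(\sigma_X\sigma_Y)$. The helper `expect` is a hypothetical name.

```python
from math import sqrt

# Made-up joint pmf p(x, y); the four probabilities sum to 1.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def expect(h):
    """E[h(X, Y)] = sum of h(x, y) * p(x, y) over all pairs."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mu_x * mu_y
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
rho = cov / sqrt(var_x * var_y)

# Marginal pmf of X, obtained by summing the joint pmf over y.
p_x = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

print("E(X) =", mu_x, "E(Y) =", mu_y)
print("Cov(X, Y) =", cov, "rho =", rho)
print("marginal pmf of X:", p_x)
```

Here $\rho > 0$, so $X$ and $Y$ cannot be independent; the converse direction fails in general, since uncorrelated rv's need not be independent.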

STATISTICS and DISTRIBUTIONS

Statistics

Background:

before sampling, rv's denote possible observations; after sampling, sample

values denote actual observations.

A statistic is any quantity that can be calculated from sample data, e.g. mean, variance, median.

The sampling distribution is the distribution for a statistic.

Sampling

An independent and identically distributed (iid)

random sample is an independent set of rv's that all have the same probability distribution.

E.g. samples with replacement, or samples from a very large population.

Determination of sampling distribution:

by derivation: to determine the exact sampling distribution.

by simulation: use histograms to approximate the distribution.



SAMPLE MEAN DISTRIBUTION

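Assume a random sample $X_1, X_2, \ldots, X_n$ from some

distribution with mean $\mu$ and standard deviation $\sigma$.

Sample Mean Distribution

$E(\bar{X}) = \mu$.

$V(\bar{X}) = \sigma^2/n$, so $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

If sample total $T_o = X_1 + X_2 + \cdots + X_n$, then

$E(T_o) = n\mu$, $V(T_o) = n\sigma^2$.

If $X_i \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$.

Central Limit Theorem (CLT)

If $n$ is sufficiently large, then approximately

$\bar{X} \sim N(\mu, \sigma^2/n)$ and $T_o \sim N(n\mu, n\sigma^2)$.

CLT can usually be applied if $n > 30$.

If only positive $X_i$'s are possible, then the product $X_1 X_2 \cdots X_n$

is approximately lognormal.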
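The sampling distribution of the sample mean can also be explored by simulation. The sketch below is one possible setup, not a prescribed one: it draws repeated samples from an exponential distribution with mean 1 and standard deviation 1 (a skewed, non-normal population), and compares the simulated mean and standard deviation of $\bar{X}$ with $\mu$ and $\sigma/\sqrt{n}$. The sample size, number of replications and seed are arbitrary choices.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)   # fixed seed so the run is reproducible
n, reps = 40, 5000       # sample size and number of simulated samples
mu, sigma = 1.0, 1.0     # mean and SD of the exponential(1) population

sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("mean of sample means:", mean_of_means, "vs mu =", mu)
print("SD of sample means:", sd_of_means, "vs sigma/sqrt(n) =", sigma / sqrt(n))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed, which is the content of the CLT.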

DISTRIBUTION of LINEAR COMBINATIONS

Linear Combination Distribution

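Given rv's $X_1, \ldots, X_n$ and constants $a_1, \ldots, a_n$, the rv

$Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$

is called a linear combination of the $X_i$'s.

Expected Value:

If $X_i$ has mean $\mu_i$ for $i = 1, \ldots, n$,

$E(Y) = \sum_{i=1}^{n} a_i \mu_i$.

Variance:

If $X_i$ has variance $\sigma_i^2$ for $i = 1, \ldots, n$ and

if the $X_i$'s are independent, then

$V(Y) = \sum_{i=1}^{n} a_i^2 \sigma_i^2$,

with $\sigma_Y = \sqrt{V(Y)}$.

For any $X_i$'s,

$V(Y) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j\,\text{Cov}(X_i, X_j)$.

Difference:

$E(X_1 - X_2) = E(X_1) - E(X_2)$, and

for independent $X_1$, $X_2$, $V(X_1 - X_2) = V(X_1) + V(X_2)$.

Normal Case:

If $X_i \sim N(\mu_i, \sigma_i^2)$, for $i = 1, \ldots, n$,

independently, then $Y \sim N\left(\sum a_i \mu_i, \sum a_i^2 \sigma_i^2\right)$.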
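The mean and variance rules for a linear combination reduce to short sums. The sketch below assumes the standard results $E(Y) = \sum a_i\mu_i$ and $V(Y) = \sum_i\sum_j a_i a_j \text{Cov}(X_i, X_j)$, where the covariance matrix has the variances on its diagonal. The coefficients, means and covariance values are made up, and the function names are illustrative.

```python
def lincomb_mean(a, mu):
    """E(Y) for Y = sum of a_i * X_i, given the means mu_i."""
    return sum(ai * mi for ai, mi in zip(a, mu))

def lincomb_var(a, cov):
    """V(Y) for Y = sum of a_i * X_i, given the covariance matrix cov[i][j]."""
    n = len(a)
    return sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))

a = [1.0, -1.0]    # Y = X_1 - X_2 (a difference)
mu = [5.0, 3.0]    # made-up means

cov_independent = [[4.0, 0.0], [0.0, 9.0]]   # variances 4 and 9, zero covariance
cov_dependent = [[4.0, 1.5], [1.5, 9.0]]     # same variances, Cov(X_1, X_2) = 1.5

mean_y = lincomb_mean(a, mu)
var_independent = lincomb_var(a, cov_independent)
var_dependent = lincomb_var(a, cov_dependent)

print("E(X1 - X2) =", mean_y)
print("V(X1 - X2), independent:", var_independent)
print("V(X1 - X2), Cov = 1.5:", var_dependent)
```

For the difference of independent rv's the variances add (4 + 9 = 13), while a positive covariance reduces the variance of the difference.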

JOINT DISTRIBUTIONS and RANDOM SAMPLES SUMMARY

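Terms: Joint pmf's and pdf's, marginal pmf's and pdf's, independence, conditional pmf's and pdf's, expected values, covariance, correlation coefficient, uncorrelated, statistic, sampling distribution, iid, CLT, linear combination.

Joint Distributions: assume $X$ and $Y$ are rv's.

Probabilities:

$P[(X, Y) \in A] = \sum\sum_{(x,y) \in A} p(x, y)$ for discrete rv's;

$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$ for continuous rv's.

Marginals:

$p_X(x) = \sum_y p(x, y)$, $p_Y(y) = \sum_x p(x, y)$, discrete rv's;

$f_X(x) = \int f(x, y)\,dy$, $f_Y(y) = \int f(x, y)\,dx$, continuous rv's.

Independence: $X$ and $Y$ are independent if

$p(x, y) = p_X(x)\,p_Y(y)$, for discrete rv's;

$f(x, y) = f_X(x)\,f_Y(y)$, for continuous rv's.

Conditionals: conditional pmf or pdf of $Y$ given $X = x$

$p_{Y|X}(y|x) = p(x, y)/p_X(x)$ for discrete rv's;

$f_{Y|X}(y|x) = f(x, y)/f_X(x)$ for continuous rv's.

Expected value of $h(X, Y)$ is $E[h(X, Y)]$

$= \sum_x \sum_y h(x, y)\,p(x, y)$ for discrete rv's;

$= \iint h(x, y)\,f(x, y)\,dx\,dy$ for continuous rv's.

Covariance: $\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$.

Correlation: $\rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$;

$X$, $Y$ independent $\Rightarrow \rho_{X,Y} = 0$;

$\rho_{X,Y} = 0 \Leftrightarrow$ uncorrelated.

Sample Mean Distribution:

Assume rv's $X_1, \ldots, X_n$ are iid with $X_i \sim D(\mu, \sigma^2)$, for some $D$.

$E(\bar{X}) = \mu$, and $V(\bar{X}) = \sigma^2/n$.

If $T_o = \sum X_i$, $E(T_o) = n\mu$, $V(T_o) = n\sigma^2$.

If $X_i \sim N(\mu, \sigma^2)$, then $\bar{X} \sim N(\mu, \sigma^2/n)$.

CLT: if $n$ large, approximately $\bar{X} \sim N(\mu, \sigma^2/n)$ and $T_o \sim N(n\mu, n\sigma^2)$.

Linear Combinations:

Assume rv's $X_1, \ldots, X_n$, and

$Y = \sum_{i=1}^{n} a_i X_i$, for some constants $a_1, \ldots, a_n$.

Expected Value: If $E(X_i) = \mu_i$, $E(Y) = \sum a_i \mu_i$.

Variance: If $V(X_i) = \sigma_i^2$, and $X_i$'s are independent,

$V(Y) = \sum a_i^2 \sigma_i^2$; $\sigma_Y = \sqrt{V(Y)}$.

For any $X_i$'s, $V(Y) = \sum_i \sum_j a_i a_j\,\text{Cov}(X_i, X_j)$.

Normal Case: If $X_i \sim N(\mu_i, \sigma_i^2)$, independently,

$Y \sim N\left(\sum a_i \mu_i, \sum a_i^2 \sigma_i^2\right)$.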



POINT ESTIMATION CONCEPTS

Point Estimates

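A point estimate of a parameter $\theta$ is a single number that is the most plausible value

for $\theta$. Some suitable statistic $\hat{\theta}$ for $\theta$ is called a point estimator for $\theta$.

Unbiased Estimators

$\hat{\theta}$ is an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$.

If $\hat{\theta}$ is biased, $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.

Principle: Always choose an unbiased estimator.

If $X \sim \text{Bin}(n, p)$, then the sample proportion $\hat{p} = X/n$ is an unbiased estimator of $p$.

Given rv's $X_1, \ldots, X_n$ with $X_i \sim D(\mu, \sigma^2)$ for some $D$, $\bar{X}$ is unbiased

for $\mu$ and $S^2$ is unbiased for $\sigma^2$. For $D$ symmetric, the sample median $\tilde{X}$ or

any trimmed mean are unbiased for $\mu$.

Minimum Variance Unbiased Estimators

a) MVUE Principle: MVUE estimators are preferred.

b) If $X_i \sim N(\mu, \sigma^2)$, $\bar{X}$ is an MVUE for $\mu$.

c) A robust estimator has low variance for a variety of distributions (e.g. 10-20%

trimmed mean).

The Standard Error

The standard error of an estimator $\hat{\theta}$ is $\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$.

The estimated standard error of an estimator is denoted by $\hat{\sigma}_{\hat{\theta}}$ or $s_{\hat{\theta}}$.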

POINT ESTIMATION METHODS

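Method of Moments

Assume a random sample $X_1, \ldots, X_n$ with pmf or pdf $f(x; \theta_1, \ldots, \theta_m)$.

The $k$th distribution moment is $E(X^k)$.

The $k$th sample moment is $\frac{1}{n}\sum_{i=1}^{n} X_i^k$.

Moment estimators use sample moments as

approximations to distribution moments to determine

$\hat{\theta}_1, \ldots, \hat{\theta}_m$.

Maximum Likelihood

Assume $X_1, \ldots, X_n$ have joint pmf or pdf $f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m)$.

Maximum likelihood estimators (mle's) $\hat{\theta}_1, \ldots, \hat{\theta}_m$

maximize $f$, viewed as a function of the $\theta_i$'s for the observed $x_i$'s.

Invariance principle: if $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are mle's for $\theta_1, \ldots, \theta_m$, then the mle of

any $h(\theta_1, \ldots, \theta_m)$ is $h(\hat{\theta}_1, \ldots, \hat{\theta}_m)$.

For large $n$ an mle is approximately an MVUE.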
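A concrete case may make the two methods easier to compare. The sketch below assumes a Bernoulli model with unknown success probability $p$ and a made-up sample of 20 observations with 7 successes. The moment estimator equates the first sample moment to $E(X) = p$; the maximum likelihood estimate is found here by a plain grid search over the log-likelihood, a deliberately simple stand-in for calculus or a numerical optimizer.

```python
from math import log

data = [1] * 7 + [0] * 13   # made-up Bernoulli sample: 7 successes in 20 trials

def log_likelihood(p, xs):
    """Log of the joint pmf of independent Bernoulli(p) observations."""
    return sum(x * log(p) + (1 - x) * log(1 - p) for x in xs)

# Method of moments: first sample moment = first distribution moment E(X) = p.
p_moments = sum(data) / len(data)

# Maximum likelihood by grid search over p = 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=lambda p: log_likelihood(p, data))

print("moment estimate:", p_moments)
print("maximum likelihood estimate (grid):", p_mle)
```

For this model the two estimates coincide at the sample proportion; in other models (for example the gamma family) they can differ.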

POINT ESTIMATION SUMMARY

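Terms: Point estimate, estimator, unbiased, MVUE, standard error, distribution moments, sample moments, MLE.

Unbiased Estimators

If $X \sim \text{Bin}(n, p)$, then the sample proportion $\hat{p} = X/n$ is an unbiased estimator of $p$.

Given rv's $X_1, \ldots, X_n$ with $X_i \sim D(\mu, \sigma^2)$,

$\bar{X}$ is unbiased for $\mu$, and

$S^2$ is unbiased for $\sigma^2$.

If $X_i \sim N(\mu, \sigma^2)$, $\bar{X}$ is an MVUE for $\mu$.

Estimation Methods

Moment estimators use sample moments as

approximations to distribution moments to

determine $\hat{\theta}_1, \ldots, \hat{\theta}_m$.

If $X_1, \ldots, X_n$ have pmf or pdf $f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m)$, the

MLE's maximize $f$.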

CONFIDENCE INTERVAL PROPERTIES

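Assume rv's $X_1, \ldots, X_n$ with $X_i \sim N(\mu, \sigma^2)$, $\sigma$ known,

and observed values $x_1, \ldots, x_n$.

Confidence Intervals

The 95% confidence interval (95% CI) for $\mu$:

$P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$,

so the 95% CI for $\mu$ is $\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$.

Interpretation: if the experiment is repeated many times, 95% of the CI's will

contain $\mu$.

The $100(1-\alpha)$% CI for $\mu$:

$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$,

so the $100(1-\alpha)$% CI is $\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$.

Choice of Sample Size

Suppose a $100(1-\alpha)$% CI of length $L$ is desired.

Determine $z_{\alpha/2}$ from $\Phi(z_{\alpha/2}) = 1 - \alpha/2$.

Solve $L = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$ to determine $n = \left(2 z_{\alpha/2}\,\sigma/L\right)^2$, rounded up to an integer.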
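The interval and sample-size calculations can be sketched in a few lines. The code below assumes a normal population with known $\sigma$ and uses the standard formulas $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ and $n = (2 z_{\alpha/2}\,\sigma/L)^2$, with the critical value taken from the standard library's `statistics.NormalDist`. The numbers and the function names are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(alpha):
    """z_{alpha/2}: the point with upper-tail area alpha/2 under N(0,1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_ci(xbar, sigma, n, alpha=0.05):
    """100(1-alpha)% CI for mu when sigma is known."""
    half_width = z_critical(alpha) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def sample_size(sigma, length, alpha=0.05):
    """Smallest n giving a 100(1-alpha)% CI of total length at most `length`."""
    return ceil((2 * z_critical(alpha) * sigma / length) ** 2)

# Made-up example: xbar = 80, sigma = 2, n = 25, 95% confidence.
ci_low, ci_high = mean_ci(80.0, 2.0, 25)
print("95% CI:", (round(ci_low, 3), round(ci_high, 3)))

# Made-up example: sigma = 25, desired total length L = 10, 95% confidence.
n_needed = sample_size(25.0, 10.0)
print("required n:", n_needed)
```

Rounding $n$ up rather than to the nearest integer guarantees that the interval is no longer than the requested length.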

LARGE SAMPLE CI's for $\mu$ and $p$

Assume rv's $X_1, \ldots, X_n$ with $X_i \sim D(\mu, \sigma^2)$, and

observed values $x_1, \ldots, x_n$, with $n$ large.

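A $t$ distribution with $\nu$ degrees of freedom (df) has pdf:

$f(x; \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{x^2}{\nu}\right)^{-(\nu+1)/2}$.

Properties of $t$ distributions:

1. $t_\nu$ is symmetric about $x = 0$ and bell-shaped.

2. $t_\nu$ is more spread-out than a $N(0,1)$.

3. As $\nu \to \infty$, $t_\nu \to N(0,1)$.

The $t$ critical value $t_{\alpha,\nu}$ is the point where

$P(T > t_{\alpha,\nu}) = \alpha$ for $T \sim t_\nu$.

Normal CI's using $t_\nu$'s

Theorem: $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a $t$ distribution with $n-1$ df.

$P\left(-t_{\alpha/2,n-1} \le T \le t_{\alpha/2,n-1}\right) = 1 - \alpha$.

The $100(1-\alpha)$% CI for $\mu$ is

$\left(\bar{x} - t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1}\,\frac{s}{\sqrt{n}}\right)$.

Prediction interval for $X_{n+1}$ with level $100(1-\alpha)$% is

$\bar{x} \pm t_{\alpha/2,n-1}\,s\sqrt{1 + \frac{1}{n}}$.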

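CI's for $\sigma^2$ and $\sigma$ for Normal RV's

Assume a random sample $X_1, \ldots, X_n$ from $N(\mu, \sigma^2)$.

The $\chi^2$ Distribution

A $\chi^2$ distribution with $\nu$ degrees of freedom has pdf:

$f(x; \nu) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,x^{\nu/2 - 1} e^{-x/2}$, for $x > 0$.

Note: $\chi^2_\nu$ is skewed with mean $\nu$.

The critical value $\chi^2_{\alpha,\nu}$ is the point where

$P(X > \chi^2_{\alpha,\nu}) = \alpha$ for $X \sim \chi^2_\nu$.

CI's for $\sigma^2$ and $\sigma$ using $\chi^2_\nu$'s

Theorem: The rv

$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$

has a $\chi^2$ distribution with $n-1$ df.

$P\left(\chi^2_{1-\alpha/2,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha$.

The $100(1-\alpha)$% CI for $\sigma^2$ is

$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}\right)$.

The $100(1-\alpha)$% CI for $\sigma$ is

$\left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}}},\ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}}\right)$.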

CONFIDENCE INTERVALS SUMMARY

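Terms:

$100(1-\alpha)$% confidence interval, critical value,

$t$ distribution, $\chi^2$ distribution.

$100(1-\alpha)$% CI for $\mu$

$\sigma$ known, $X_i \sim N(\mu, \sigma^2)$:

$\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$.

If a CI of length $L$ is desired, use $n = \left(2 z_{\alpha/2}\,\sigma/L\right)^2$.

$\sigma$ unknown, $n \ge 30$:

$\bar{x} \pm z_{\alpha/2}\,s/\sqrt{n}$.

If a CI of length $L$ is desired, use $n = \left(2 z_{\alpha/2}\,s/L\right)^2$.

$\sigma$ unknown, $X_i \sim N(\mu, \sigma^2)$, $n < 30$:

$\bar{x} \pm t_{\alpha/2,n-1}\,s/\sqrt{n}$.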

HYPOTHESES and TEST PROCEDURES

Hypotheses

Statistical hypothesis: a claim about the value(s) of some population characteristic(s).

Null hypothesis $H_0$: a claim initially assumed to be true.

Takes the form $H_0: \theta = \theta_0$ ($\theta_0$ is the null value).

An alternate hypothesis $H_a$ is the other claim. Takes the form

1. $H_a: \theta > \theta_0$ (implicit null hypoth. $\theta \le \theta_0$),

2. $H_a: \theta < \theta_0$ (implicit null hypoth. $\theta \ge \theta_0$), or

3. $H_a: \theta \ne \theta_0$.

Test Procedures: A test procedure is specified by

1. A test statistic, a function of the sample data on which the decision to reject or

not reject $H_0$ is based.

2. A rejection region $R$, the set of all test statistic values for which $H_0$ will be rejected.

Errors in Hypothesis Testing

Type I error: $H_0$ is rejected, when true.

Let $\alpha$ = P(type I error) = P($H_0$ rejected, when true).

Type II error: $H_0$ is not rejected, when false.

Let $\beta$ = P(type II error) = P($H_0$ not rejected, when false).

Decrease in $R$ to obtain smaller $\alpha$ results in larger $\beta$.


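$100(1-\alpha)$% CI for $p$

$X \sim \text{Bin}(n, p)$, when $n$ is large:

$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$, with $\hat{p} = x/n$.

If a CI of length $L$ is desired, use $n = 4 z_{\alpha/2}^2\,\hat{p}(1 - \hat{p})/L^2$;

use of $\hat{p} = 1/2$ always provides length $\le L$.

$100(1-\alpha)$% CI for $\sigma^2$

$X_i \sim N(\mu, \sigma^2)$:

$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}}\right)$.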

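POPULATION MEAN TESTS

Assume that the null hypothesis is: $H_0: \mu = \mu_0$.

Case I: Normal Population with Known $\sigma$

Test statistic: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.

To obtain type II error probability $\beta$ at an alternative value $\mu'$ with a level $\alpha$ test:

Sample size for one-tailed test: $n = \left[\frac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2$.

Sample size for two-tailed test: $n = \left[\frac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2$.

Case II: Large Sample Tests

- use Case I with $\sigma = s$.

Case III: Normal Population, unknown $\sigma$, small $n$

Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$.

Type II Error Probabilities: require use of graphs or complicated numerical integration.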

POPULATION PROPORTION TESTS

Assume that the null hypothesis is: $H_0: p = p_0$.

Large Sample Tests:

assume $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.

Test statistic: $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$.

One-tailed test .

Two-tailed test .


Small Sample Tests: use the Binomial distribution.

P-VALUES for HYPOTHESIS TESTING

P-Values

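Definition: The P-value is the smallest level of

significance at which $H_0$ would be rejected when a specified test procedure is used.

If P-value $\le \alpha$ then reject $H_0$ at level $\alpha$.

If P-value $> \alpha$ then do not reject $H_0$ at level $\alpha$.

Data is said to be significant if $H_0$ is rejected, otherwise data is said to be insignificant.

P-Values for a z-test

$P = 1 - \Phi(z)$ for an upper-tailed test, $P = \Phi(z)$ for a lower-tailed test, and $P = 2[1 - \Phi(|z|)]$ for a two-tailed test.

P-Values for a t-test

These require interpolation in tables for the $t$ distribution.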
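For a z-test, the P-value follows directly from the standard normal cdf. The sketch below assumes a normal population with known $\sigma$, the test statistic $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$, and the usual tail areas: $1 - \Phi(z)$ for an upper-tailed test, $\Phi(z)$ for a lower-tailed test, and $2[1 - \Phi(|z|)]$ for a two-tailed test. The data summary and the function names are made up.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(xbar, mu0, sigma, n):
    """z = (xbar - mu0) / (sigma / sqrt(n)) for a known-sigma test of H0: mu = mu0."""
    return (xbar - mu0) / (sigma / sqrt(n))

def z_test_p_value(z, tail="two"):
    """P-value of a z-test; tail is 'upper', 'lower' or 'two'."""
    cdf = NormalDist().cdf
    if tail == "upper":
        return 1 - cdf(z)
    if tail == "lower":
        return cdf(z)
    return 2 * (1 - cdf(abs(z)))

# Made-up summary: xbar = 52 from n = 25 observations, H0: mu = 50, sigma = 5.
z = z_statistic(52.0, 50.0, 5.0, 25)
p_upper = z_test_p_value(z, "upper")
p_two = z_test_p_value(z, "two")
alpha = 0.05

print("z =", z)
print("upper-tailed P-value:", round(p_upper, 4), "reject H0:", p_upper <= alpha)
print("two-tailed P-value:", round(p_two, 4), "reject H0:", p_two <= alpha)
```

The same structure would apply to a t-test with the $t_{n-1}$ cdf in place of $\Phi$; the standard library has no $t$ cdf, so that case is not shown.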

HYPOTHESIS TESTING SUMMARY

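Terms: hypothesis, null hypothesis, alternate hypothesis,

Type I, II errors, rejection region, P-value, significant data.

Population Mean Tests:

null hypothesis is $H_0: \mu = \mu_0$.

$\sigma$ known, $X_i \sim N(\mu, \sigma^2)$, using $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.

Sample size for one-tailed test: $n = \left[\frac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2$.

Sample size for two-tailed test: $n = \left[\frac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2$.

$\sigma$ unknown, $n \ge 30$: use previous with $\sigma = s$.

$\sigma$ unknown, $X_i \sim N(\mu, \sigma^2)$, $n < 30$: use

$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, and $t_{\alpha,n-1}$ instead of $z_\alpha$.

Tests for p:

null hypothesis is $H_0: p = p_0$. For large $n$,

use $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, replacing $(\bar{x}, \mu_0, \sigma)$ by $(\hat{p}, p_0, \sqrt{p_0(1 - p_0)})$.

For small $n$, use the Binomial distribution directly.

P-Values for Hypothesis Testing

P-Values for a t-test: use the $t_{n-1}$ cdf instead of $\Phi$.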
