Exam II Material

Probability Distributions and Parameter Estimation

DISCRETE DISTRIBUTIONS: Some Notes

Random Variable – takes on different values based on chance
Discrete – only certain values are possible
Continuous – any value within a range is possible

BINOMIAL DISTRIBUTION

Where only two outcomes are possible
Certain number of “trials”
Trials are independent
Probabilities are consistent

BINOMIAL DISTRIBUTION

On five MC questions with five options each, what is the probability that someone randomly guessing will get three correct?
What you’re calculating is (# ways to get 3 correct)/(all possible outcomes), accounting for the “known” probability
What about three or fewer correct?

BINOMIAL DISTRIBUTION

Need to find the number of possible ways an outcome can occur
Counting Rule for Combinations
  $C^n_x = \frac{n!}{x!\,(n-x)!}$
  Tells us the number of possible outcomes for a given situation
  Order does not matter
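
A minimal Python sketch of the counting rule, applied to the guessing question (which 3 of the 5 questions end up correct). The function name `combinations` is just an illustrative choice.

```python
import math


def combinations(n, x):
    """Counting Rule for Combinations: n! / (x! * (n - x)!)."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))


# Ways to choose which 3 of the 5 questions are the correct ones.
ways = combinations(5, 3)
print(ways)                     # 10
print(ways == math.comb(5, 3))  # True: math.comb is the built-in equivalent
```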

BINOMIAL DISTRIBUTIONS

With the Counting Rule for Combinations and probability, we can construct the Binomial Formula
  $\Pr(x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$, where $q = 1 - p$
  These probabilities are located on the Binomial Table
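
As a rough check on the guessing question (five questions, five options each, so p = 1/5 per question), the Binomial Formula can be evaluated directly. This is a sketch, and the function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Pr(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, x) * p**x * q ** (n - x)


n, p = 5, 1 / 5  # five questions, one right answer out of five options

p_exactly_3 = binomial_pmf(3, n, p)
# "Three or fewer" adds up the probabilities for x = 0, 1, 2, 3.
p_at_most_3 = sum(binomial_pmf(x, n, p) for x in range(0, 4))

print(round(p_exactly_3, 4))  # 0.0512
print(round(p_at_most_3, 4))  # 0.9933
```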

BINOMIAL DISTRIBUTIONS

Mean
  $np$
  Gives the average number of successes
Standard Deviation
  $\sqrt{npq}$
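
For the guessing setup (n = 5, p = 1/5), the mean and standard deviation work out as below. A small sketch, with `binomial_mean_sd` as an illustrative helper name.

```python
from math import sqrt


def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial: n*p and sqrt(n*p*q)."""
    q = 1.0 - p
    return n * p, sqrt(n * p * q)


mean, sd = binomial_mean_sd(5, 1 / 5)
print(mean)          # 1.0 correct answer expected from pure guessing
print(round(sd, 3))  # 0.894
```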

POISSON DISTRIBUTION

Based on a countable number of “successes”
Use Poisson when
  We know the average number of successes
  Probability of success is consistent
  Segments are independent
  We can divide segments into smaller pieces

POISSON DISTRIBUTION

Mean
  $\lambda t$
  Be careful how you use this…
Poisson Probability Distribution
  $\Pr(x) = \frac{(\lambda t)^x e^{-\lambda t}}{x!}$
Standard Deviation
  $\sqrt{\lambda t}$

POISSON DISTRIBUTION

Rick Ankiel has hit 10 HR in 58 games. What is the probability that he will hit a HR in the first three innings of tonight’s game?
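
One way to set this up, as a sketch under two assumptions not stated on the slide: a game is nine innings (so three innings is t = 3/9 of a game), and “hit a HR” means at least one. The rate is λ = 10/58 HR per game, and the segment length t must be in the same units as λ, which is the “be careful” part.

```python
from math import exp, factorial


def poisson_pmf(x, lam, t):
    """Pr(x) = (lam*t)^x * e^(-lam*t) / x!"""
    mean = lam * t
    return mean**x * exp(-mean) / factorial(x)


lam = 10 / 58  # average HR per game
t = 3 / 9      # assumed: three innings out of a nine-inning game

p_none = poisson_pmf(0, lam, t)
p_at_least_one = 1 - p_none            # "hit a HR" read as one or more
p_exactly_one = poisson_pmf(1, lam, t)  # alternative reading: exactly one

print(f"{p_at_least_one:.3f}")  # roughly 0.056
print(f"{p_exactly_one:.3f}")   # roughly 0.054
```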

NORMAL DISTRIBUTION

The most often used/desired distribution of them all
  Easiest to work with
  Many other distributions converge toward the normal
Looking for the probability of a range of possible values
  Pr(X = x) = 0, no matter what x is
  True for all continuous distributions

NORMAL DISTRIBUTION

Density Function…
Properties of Normal
  Has a single peak
  Symmetric
  Mean = Median = Mode
  Approaches 0, but never reaches it
  Variation determines the curve’s height and spread

NORMAL DISTRIBUTION

All Normals can be “Standardized”
The Z-value is the “standardized” version
This value can be used with the Z-table
But be aware of what you’re calculating and reading from the table
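
A short sketch of standardizing, assuming the usual definition Z = (x – μ)/σ. The values μ = 100 and σ = 15 are made up for illustration. Python’s `statistics.NormalDist` stands in for the Z-table here; note that it returns the area to the left of Z, while some printed tables list the area between the mean and Z.

```python
from statistics import NormalDist


def z_value(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 100.0, 15.0  # hypothetical population values
z = z_value(115.0, mu, sigma)

area_left = NormalDist().cdf(z)       # Pr(Z <= z), the cumulative area
area_mean_to_z = area_left - 0.5      # what a "mean to Z" table would list
area_right = 1 - area_left            # Pr(Z > z)

print(z)                         # 1.0
print(round(area_left, 4))       # 0.8413
print(round(area_mean_to_z, 4))  # 0.3413
print(round(area_right, 4))      # 0.1587
```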

UNIFORM DISTRIBUTION

Density Function…
Shaped as a rectangle with a and b as its “limits” on the x axis
Mean
  $\frac{a+b}{2}$
Standard Deviation
  $\sqrt{\frac{(b-a)^2}{12}}$
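
A quick numeric illustration of the two formulas, with made-up limits a = 2 and b = 8. The helper name `uniform_mean_sd` is illustrative.

```python
from math import sqrt


def uniform_mean_sd(a, b):
    """Return (mean, standard deviation) of a uniform distribution on [a, b]."""
    return (a + b) / 2, sqrt((b - a) ** 2 / 12)


mean, sd = uniform_mean_sd(2, 8)  # hypothetical limits
print(mean)          # 5.0
print(round(sd, 3))  # 1.732
```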

SAMPLING DISTRIBUTIONS

When we consider samples from a population, those samples have a distribution of their own
We’ll want to know how accurate our sample is as a representative of the population

SAMPLING ERROR

Sampling Error = (x-bar) – μ
  Size will depend on sample selection
  May be + or –
  Can be different for each sample

SAMPLING DISTRIBUTION

The distribution of all possible values of a statistic for a given sample size that has been randomly selected from a population
  The average of all possible sample averages will equal the population average
  The same is true for sample variances (computed with n – 1 in the denominator); for standard deviations it holds only approximately
  This property is called unbiasedness

SAMPLING DISTRIBUTIONS

As we increase the size of n, something else occurs
  As n increases, we should see the values of our statistics (means and standard deviations) grow closer to the population value
  This is called consistency
  Usually shown analytically, as the population is unknown

SAMPLING DISTRIBUTIONS

If population is ~ N,
  Sampling dist’n of sample mean ~ N
  Mean = μ
  Standard Deviation = σ/√n
We can then convert to Z-value
  Equation…
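
The equation itself did not survive on the slide. A sketch assuming the standard form Z = (x-bar – μ)/(σ/√n), with made-up numbers (μ = 50, σ = 10, n = 25, x-bar = 53):

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard deviation of x-bar, sigma/sqrt(n)."""
    return (x_bar - mu) / (sigma / sqrt(n))


# Hypothetical values, chosen only for illustration.
z = z_for_sample_mean(x_bar=53, mu=50, sigma=10, n=25)
p_below = NormalDist().cdf(z)  # Pr(x-bar <= 53)

print(z)                  # 1.5
print(round(p_below, 4))  # 0.9332
```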

CENTRAL LIMIT THEOREM

This is why the Normal is so wonderful
As the sample size grows, the sampling distribution of x-bar becomes approximately normal, whatever the distribution of the population
  Mean of x-bar = μ
  Standard Deviation of x-bar = σ/√n
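
A small simulation can make this concrete. The sketch below draws repeated samples from a skewed exponential population (mean 1, standard deviation 1, chosen arbitrarily) and checks that the sample means center on μ with spread near σ/√n. The seed, sample size, and number of repetitions are arbitrary choices.

```python
import random
import statistics


def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size `n` from an exponential population; return their means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50
means = simulate_sample_means(n, reps=5000, rng=rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print(mean_of_means)  # should land close to mu = 1
print(sd_of_means)    # should land close to sigma / sqrt(n) = 1 / sqrt(50), about 0.141
```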

SAMPLING DIST’N OF PROPORTIONS

Population proportion is defined as π = X/N
Sample proportion is p = x/n
Sampling error is p – π
Mean of sampling dist’n of p
  π
Standard Error
  $\sqrt{\frac{\pi(1-\pi)}{n}}$
Works as long as
  nπ ≥ 5
  n(1 – π) ≥ 5

SAMPLING DIST’N OF PROPORTION

We can also do Z-values for this
  Z = (p – π)/(std. error)
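
A sketch of the Z-value for a sample proportion, using the standard error √(π(1 – π)/n) and refusing to proceed when the nπ ≥ 5 and n(1 – π) ≥ 5 conditions fail. The numbers (π = 0.40, n = 100, p = 0.45) are made up.

```python
from math import sqrt


def z_for_proportion(p, pi, n):
    """Z = (p - pi) / sqrt(pi * (1 - pi) / n), valid when n*pi >= 5 and n*(1 - pi) >= 5."""
    if n * pi < 5 or n * (1 - pi) < 5:
        raise ValueError("normal approximation not appropriate: need n*pi >= 5 and n*(1-pi) >= 5")
    std_error = sqrt(pi * (1 - pi) / n)
    return (p - pi) / std_error


z = z_for_proportion(p=0.45, pi=0.40, n=100)  # hypothetical values
print(round(z, 2))  # 1.02
```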

ESTIMATING PARAMETERS

Point estimate
  Statistic used to estimate a parameter
  This is likely what you see reported

ESTIMATING PARAMETERS, “SIGMA” KNOWN

Recall if the sample is large enough, we can assume the sampling distribution of x-bar to be normal
  Central Limit Theorem
  n > 30, typically
Regardless, we can convert to Z-values and construct confidence intervals

ESTIMATING PARAMETERS, “SIGMA” KNOWN

Confidence Interval
  (x-bar) ± Z*(σ/√n)
  This tells you how “certain” you are that the population value is within that range
  The percentage is based on the choice of Z
Error happens, but it is measurable
  Margin of error = Z*(σ/√n)
This illustrates a tradeoff
  Lower confidence – lower error
  Higher confidence – higher error
Can also increase sample size to lower error
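
The tradeoff shows up quickly in numbers. A sketch with made-up inputs (x-bar = 50, σ = 10, n = 100), using `statistics.NormalDist().inv_cdf` in place of the Z-table to get the two-sided Z for a given confidence level:

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper, margin) for x-bar +/- Z * sigma / sqrt(n)."""
    # Two-sided interval: split the leftover probability between both tails.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Hypothetical sample summary.
low95, high95, margin95 = z_confidence_interval(50, 10, 100, 0.95)
low99, high99, margin99 = z_confidence_interval(50, 10, 100, 0.99)

print(round(margin95, 2))  # 1.96
print(round(margin99, 2))  # 2.58  (higher confidence, higher error)

# Quadrupling n halves the margin of error at the same confidence.
_, _, margin95_big_n = z_confidence_interval(50, 10, 400, 0.95)
print(round(margin95_big_n, 2))  # 0.98
```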

ESTIMATING PARAMETERS, “SIGMA” UNKNOWN

We don’t always know σ (in fact, we rarely do)
But we can estimate σ (calculating s)
This, however, changes our method slightly
  We’ll use the t-distribution
  Relying on degrees of freedom

ESTIMATING PARAMETERS, “SIGMA” UNKNOWN

t-score for mean…
t-score for confidence interval…
Now we have a method
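
The slide’s formulas are elided, so the sketch below assumes the standard textbook forms: t = (x-bar – μ)/(s/√n) and (x-bar) ± t*(s/√n), with n – 1 degrees of freedom. The data and the value μ = 13 are made up. Python’s standard library has no t-table, so the critical value is passed in by hand; 2.262 is the usual table value for 95% confidence with 9 degrees of freedom.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu):
    """t = (x-bar - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (fmean(sample) - mu) / (stdev(sample) / sqrt(n))


def t_confidence_interval(sample, t_crit):
    """x-bar +/- t_crit * s / sqrt(n); t_crit is read from a t-table with n - 1 df."""
    n = len(sample)
    margin = t_crit * stdev(sample) / sqrt(n)
    x_bar = fmean(sample)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10, so df = 9.
data = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

t_stat = t_statistic(data, mu=13)                  # mu = 13 is a made-up claim
low, high = t_confidence_interval(data, t_crit=2.262)

print(round(t_stat, 3))               # 1.0
print(round(low, 3), round(high, 3))  # 12.369 14.631
```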

ESTIMATING PARAMETERS

What else could we do to influence the margin of error?
  Change the sample size (n)

ESTIMATING PARAMETERS

Sample Size Requirement (σ known)
  $n = \frac{Z^2 \sigma^2}{e^2}$, where e is the desired margin of error
  But again, σ not always (if ever) known
Sample Size Requirement (σ unknown)
  Estimate σ using (R / 6), where R is the range
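
A sketch of both cases with made-up inputs (95% confidence, margin of error e = 2). Rounding up to the next whole observation is a common convention rather than something stated on the slide, and R is taken to be the range of the data.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(confidence, sigma, e):
    """n = Z^2 * sigma^2 / e^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * sigma**2 / e**2)


# Sigma known (hypothetical sigma = 10).
n_known = required_sample_size(0.95, sigma=10, e=2)

# Sigma unknown: estimate it as R / 6 from a hypothetical range of 60.
data_range = 60
n_estimated = required_sample_size(0.95, sigma=data_range / 6, e=2)

print(n_known)      # 97
print(n_estimated)  # 97
```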

ESTIMATING PARAMETERS

We can also do the same for proportions
Some formulas…
  Sample Proportion
  Standard Error for p
  Estimate for SE for p
  Confidence Interval for p
  Margin of Error
  Sample Size
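
The formulas themselves are missing from the slide. The sketch below assumes the standard textbook versions: p = x/n, estimated SE = √(p(1 – p)/n), interval p ± Z*SE, and sample size n = Z²π(1 – π)/e². All inputs are made up, and using π = 0.5 as the planning value is a common conservative choice, not something the slide states.

```python
from math import ceil, sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence):
    """Return (p, margin, lower, upper) using the estimated SE sqrt(p * (1 - p) / n)."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p, margin, p - margin, p + margin


def proportion_sample_size(confidence, pi_guess, e):
    """n = Z^2 * pi * (1 - pi) / e^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * pi_guess * (1 - pi_guess) / e**2)


# Hypothetical survey: 60 "yes" answers out of 200.
p, margin, low, high = proportion_interval(60, 200, 0.95)
print(p)                              # 0.3
print(round(margin, 4))               # 0.0635
print(round(low, 4), round(high, 4))  # 0.2365 0.3635

# Sample size for a margin of 0.03 at 95%, planning with pi = 0.5.
n_needed = proportion_sample_size(0.95, 0.5, 0.03)
print(n_needed)                       # 1068
```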

HYPOTHESIS TESTING

Now that you know how to calculate some statistics, it’s time to “give you the sword”
Null Hypothesis (H0) – This is what we are testing
Alternative Hypothesis (HA) – This includes everything not in the null
  One-sided or two-sided?

HYPOTHESIS TESTING

Two-sided
  H0 will have “=”
  Rejection region is on either side of the null region
One-sided
  H0 will have “≥” or “≤”
  Rejection region is only on one side
  When specifying H0, don’t set up the “straw man”
    Formally, it takes away from the power of the test
    Informally, it’s “shady”

HYPOTHESIS TESTING

What’s the point?
  Statistical method of determining validity of claims
  Powerful weapon for refuting or supporting these claims
  Must be done properly, or else you lose credibility
Note: We will never “prove” anything
  We only find evidence

HYPOTHESIS TESTING

We will either reject or fail to reject H0
WE WILL NEVER ACCEPT H0!!!
I don’t care what the book says, that is careless and inappropriate

HYPOTHESIS TESTING

This is done with error
  Type I Error – Rejecting a true H0
    Its probability is denoted by the significance level
    This is your α
    This will determine your critical value
  Type II Error – Failing to reject a false H0
    Its probability is usually denoted by β

HYPOTHESIS TESTING

To do this, you’ll need your critical value
Critical Value – cutoff point where you either reject or fail to reject H0
  Calculating the critical value…
  These will have the subscript “crit”
Critical value is compared to the test statistic
  Calculating the test statistic…
  These will have the subscript “stat”

HYPOTHESIS TESTING

Here’s what you’ll need to do
  Specify H0 and HA
  Determine if the test is one- or two-sided
  Specify Decision Rule using Zcrit
  Calculate Zstat
  Compare the two values
  Express your decision
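
The steps above can be sketched for a test about a mean with σ known. This assumes Zstat = (x-bar – μ0)/(σ/√n) and reads Zcrit from `statistics.NormalDist` instead of the Z-table. The function name, the `tail` labels, and the numbers (H0: μ = 50, σ = 10, α = 0.05) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z_stat, z_crit, reject) for a test about a mean with sigma known.

    tail: "two" (HA: mu != mu0), "upper" (HA: mu > mu0), or "lower" (HA: mu < mu0).
    """
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_stat) > z_crit
    elif tail == "upper":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z_stat > z_crit
    elif tail == "lower":
        z_crit = std_normal.inv_cdf(alpha)  # negative cutoff in the left tail
        reject = z_stat < z_crit
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return z_stat, z_crit, reject


# H0: mu = 50 vs HA: mu != 50, with a hypothetical sample mean of 53.
z_stat, z_crit, reject = z_test(x_bar=53, mu0=50, sigma=10, n=25)
print(z_stat, round(z_crit, 2))                     # 1.5 1.96
print("reject H0" if reject else "fail to reject H0")  # fail to reject H0

# Same sample mean from a larger sample gives a larger Zstat.
z_stat_big, _, reject_big = z_test(x_bar=53, mu0=50, sigma=10, n=100)
print(z_stat_big, reject_big)                       # 3.0 True
```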

HYPOTHESIS TESTING

Another approach exists
p-value – Tells you the smallest α level that would allow you to reject H0
  This does not mean you should use this α
  That depends on the problem
Calculating p-value
  Find Zstat
  Find associated value in Z-table
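
A sketch of the p-value lookup, with `statistics.NormalDist` standing in for the Z-table. The two-sided case doubles the tail area; the `tail` labels and the Zstat of 1.5 are illustrative.

```python
from statistics import NormalDist


def p_value(z_stat, tail="two"):
    """Tail area beyond z_stat under the standard normal curve."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z_stat)))
    if tail == "upper":
        return 1 - std_normal.cdf(z_stat)
    if tail == "lower":
        return std_normal.cdf(z_stat)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


p_two = p_value(1.5, "two")
p_upper = p_value(1.5, "upper")

print(round(p_two, 4))    # 0.1336
print(round(p_upper, 4))  # 0.0668
```

Read this way, a two-sided test with Zstat = 1.5 would reject H0 only at an α of about 0.134 or larger.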

HYPOTHESIS TESTING

Once again, σ is not always known
In that event, you’ll again use t-statistics
  Calculation of t-stat
  Calculation of t-crit
Other than a change in formulas, the procedure is exactly the same

HYPOTHESIS TESTING

You can also do this for proportions
Calculating…
  Z-stat for proportions
  Z-crit for proportions
