Random Thoughts 2012 (COMP 066)
Jan-Michael Frahm, Jared Heinly
source: fivethirtyeight.com
2
Election Polls
• Virginia at 8:30 pm was 58% Romney and 41% Obama with 12% of the polls in
• That is a poll of 971,000 people
• Why did Obama win it now?
3
Election Polls
• Why does Florida still not have a projected winner?
• Why does Ohio already have a projected winner with the same percentage of polls in?
4
Statistic of Support for Candidates
• USAToday: “Romney leads in states with most American cars”
  eight of the top 10 states for registration of new American-built cars are Romney supporters
  the two others are swing states (Iowa & Michigan)
  in fact the next four states are also Romney supporters
  Obama has solid support in 9 of the 10 states with the most foreign registrations
• Is this statement true?
• How do we compute whether it is true?
5
Hypothesis Testing
• What we want is to test a hypothesis H0 (the null hypothesis)
• A hypothesis is usually a claim about a number that characterizes a population
  percentage of cancer in the population
  average size of a person in the US
  ….
• In hypothesis testing we also need an alternative hypothesis Ha, which we pick to make a statement if H0 is rejected
6
Alternative Hypothesis
• Alternative hypothesis Ha is selected to support the statement made when H0 is rejected
  typical choices:
  Ha < H0: “is less” is the desired statement if H0 is rejected
  Ha ≠ H0: “is different” is the desired statement if H0 is rejected (H0 is false)
  Ha > H0: “is larger” is the desired statement if H0 is rejected
  Ha is often called the research hypothesis
• How to select what is H0 and what is Ha?
  H0 is typically the statement you want to check
  H0 is typically assumed to be true unless there is strong evidence that it is wrong (similar to a jury trial)
7
Find a Sample to Test Hypothesis
• Select a sample of size N to test the hypothesis
  all the rules of good samples that we have seen before for polls apply
  sample size still influences the certainty of your decision/estimate
• Compute the desired value (e.g. average height of males)
• What does that value tell us?
  it is only a characteristic of our sample set
  we still need to extract from it the evidence for or against the hypothesis
8
Standardizing the Sample Value
• Convert to a standard score (used to look up the probability of the result)
  1. Take out the null hypothesis H0: (value from sample – H0)
     if small, this indicates you are close to H0; if large, H0 is less likely
  2. Divide by the standard error of the statistic, s
     this normalizes the distance from step 1, so that “close” and “far” are measured relative to how much the value typically deviates
• What distance is large enough to reject H0, and what distance is not?
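The two standardizing steps can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are made up, and the standard error formula used here (sample standard deviation divided by the square root of N) assumes the statistic is a sample average.

```python
import math


def standard_error_of_mean(sample_std, n):
    """Standard error of a sample average (assumes the statistic is a mean)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sample_std / math.sqrt(n)


def standard_score(sample_value, h0_value, standard_error):
    """Distance of the sample value from H0, measured in standard errors."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    # Step 1: take out the null hypothesis; step 2: divide by the standard error.
    return (sample_value - h0_value) / standard_error


# Hypothetical example: H0 claims an average height of 175.0 cm.
# A sample of N = 100 people has average 176.5 cm and standard deviation 7.5 cm.
se = standard_error_of_mean(7.5, 100)
z = standard_score(176.5, 175.0, se)
print(se, z)
```

In this made-up example the sample average lies two standard errors above the value claimed by H0.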
9
How to reject H0
• The previous normalization brings the value into a standard distribution called the Z-distribution (standard normal distribution)
• Test whether the value is likely or unlikely given the distribution
  if within the likely region, keep H0
  if unlikely, reject H0
10
Z-Distribution (Normal distribution)
11
How to reject H0
• Note that if H0 is not rejected, that does not mean it is accepted either! It only means there is not enough evidence to reject it
12
Finding the Likelihood
• The resulting probability is called the p-value
• Can be looked up in reference tables
• Excel: vp = NORM.S.DIST(value, TRUE)
• For the alternative hypothesis being:
  less than: p = vp
  not equal: p = 2 · min(vp, 1 – vp)
  larger than: p = 1 – vp
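The same lookup can be sketched in Python without tables. The function names and the labels "less", "not_equal" and "greater" are chosen here for illustration. The standard normal cumulative probability (what NORM.S.DIST(value, TRUE) returns) is computed from the error function in Python's math module. For the "not equal" case the sketch doubles the smaller of the two tails, so it works for positive as well as negative standard scores.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value(z, alternative):
    """p-value for standard score z; alternative is 'less', 'not_equal' or 'greater'."""
    vp = normal_cdf(z)
    if alternative == "less":
        return vp
    if alternative == "greater":
        return 1.0 - vp
    if alternative == "not_equal":
        # Double the smaller tail so the sign of z does not matter.
        return 2.0 * min(vp, 1.0 - vp)
    raise ValueError("alternative must be 'less', 'not_equal' or 'greater'")


# Hypothetical standard score of 2.0 tested against each alternative.
for alt in ("less", "not_equal", "greater"):
    print(alt, round(p_value(2.0, alt), 4))
```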
13
Interpreting p-value
• Set your cutoff, called α (e.g. α = 0.05)
• If the p-value is:
  less than 0.01: result is considered highly statistically significant, reject the null hypothesis
  between 0.01 and α (not close to α): result is statistically significant, reject the null hypothesis
  close to α: result is marginally statistically significant, either rejecting or not rejecting is fine
  greater than α: don’t reject
• Always ask for the p-value and α to make up your own mind
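The reading rules above can be written as a small Python function. This is a sketch, not a standard procedure: the slides do not say how close to α counts as "close", so the `margin` parameter (0.01 by default) is an assumption made purely for illustration, as is the function name.

```python
def interpret_p_value(p, alpha=0.05, margin=0.01):
    """Label a p-value following the rules above.

    margin is a hypothetical choice for what counts as "close to alpha".
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if abs(p - alpha) <= margin:
        return "marginally significant"  # rejecting or not rejecting is fine
    if p > alpha:
        return "not significant"         # do not reject H0
    if p < 0.01:
        return "highly significant"      # reject H0
    return "significant"                 # reject H0


# Hypothetical p-values run through the rules.
for p in (0.005, 0.02, 0.045, 0.3):
    print(p, interpret_p_value(p))
```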
14
Testing for Proportion of Population
• For proportions we again need to test differently
  1. Compute the proportion of the sample that is positive (regular percentage calculation)
  2. Subtract the proportion that is claimed
  3. Calculate the standard error
  4. Divide the result of step 2 by the standard error
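The four steps can be sketched in Python as follows. The slides do not spell out the standard error for step 3. The sketch assumes the usual choice for a proportion, the square root of claimed × (1 – claimed) / N, which is computed as if the claimed proportion were true. The function name and the numbers are made up for illustration.

```python
import math


def proportion_standard_score(positives, n, claimed_proportion):
    """Standard score for a sample proportion against a claimed proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < claimed_proportion < 1.0:
        raise ValueError("claimed_proportion must be strictly between 0 and 1")
    observed = positives / n                      # step 1: sample proportion
    difference = observed - claimed_proportion    # step 2: subtract the claim
    # Step 3: standard error, assuming the claimed proportion is true.
    standard_error = math.sqrt(claimed_proportion * (1.0 - claimed_proportion) / n)
    return difference / standard_error            # step 4


# Hypothetical example: the claim is 50%, and 60 of 100 sampled people are positive.
z = proportion_standard_score(60, 100, 0.5)
print(round(z, 4))
```

The resulting score is then read against the Z-distribution in the same way as any other standard score.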
16
Small Samples
• Use t-distribution
17
T-distribution
source: Wikipedia
18
Small Samples
• Use t-distribution
• Accounts for the sample size (through the degrees of freedom)
• Value can be found in tables
• Excel: T.DIST(value, degrees of freedom, TRUE)
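For readers without Excel or tables, the cumulative t-distribution can be approximated in plain Python. This sketch integrates the t density numerically with Simpson's rule. The function names and the step count are illustrative choices, and the example assumes a test on a sample average, where the degrees of freedom are N – 1.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) by Simpson's rule on the interval from 0 to x."""
    if df <= 0:
        raise ValueError("df must be positive")
    if x == 0:
        return 0.5
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = x / steps   # negative for x < 0, which makes the integral negative
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so half of the probability lies below 0.
    return 0.5 + total * h / 3


# Hypothetical small sample: N = 6, so df = 5, with a standard score of 2.0.
vp = t_cdf(2.0, 5)
print(round(vp, 4))
```

For the same score, the t-distribution gives a smaller cumulative probability than the Z-distribution when the sample is small, which makes H0 harder to reject.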
19
Errors
• Type 1 error: wrong rejection (H0 is rejected although it is true)
• Type 2 error: missed rejection (H0 is not rejected although it is false)