autumn-hutchinson
Hypothesis Testing
Testing Statistical Significance
Statistical Decision Making
Public managers are often faced with decisions about program effectiveness, personnel productivity, and procedural changes
  Is Patty Roberts an effective supervisor?
  If we redesign form 54b, will it result in faster processing times?
  Is the Head Start program resulting in better reading scores for its participants?
When we change these questions into statements, we have made hypotheses
  The Head Start program has resulted in higher reading scores for its participants
However, in research we don’t directly test our hypotheses
Instead we test the “negative” of the hypothesis
  Our research hypothesis may be that the new speeding fines in Virginia are resulting in fewer highway fatalities
  However, what we test is the statement “the new speeding fines in Virginia have NOT reduced highway fatalities”
This is called the Null Hypothesis
  Why?
How do you “prove” something?
  Can you “prove” anything?
  Can you “fail” to prove something?
All we can do is triangulate on the truth by eliminating what, most likely, is not the truth.
  Huh?
It’s all based on the concept of disconfirming evidence
  Hypothesis testing relies on “disconfirming” evidence
  An investigator does not directly assert that his/her data support the hypothesis
  Instead, the investigator states that the evidence shows that the “null” hypothesis is probably false
Sherlock Holmes got it
  “Eliminate the impossible; whatever remains, however improbable, is the truth.”
Examples of H1 and H0
H1: Some job training programs are more successful than other programs in placing trainees in permanent employment.
H0: All job training programs are equally likely to place trainees in permanent employment.
H1: Male planners earn higher salaries than female planners.
H0: Gender is not related to planners’ salaries.
H1: Dr. Schroeder is smarter than the average Virginia Tech Research Professor
H0: Dr. Schroeder is no smarter than the average Virginia Tech Research Professor (any apparent difference in his intelligence is random error)
Disconfirming example
We cannot “prove” that Dr. Schroeder is smarter than the average Virginia Tech Research Professor
What we can do is “reject” the null hypothesis: we can show that Dr. Schroeder’s intelligence is “not” within the random error surrounding the scores of a random sample of other research professors
If H0 is not true, then there is more “evidence” that H1 might be true!
Importance of Stating the Hypothesis Correctly
The ability to state the null (H0) and research (H1) hypotheses correctly is essential
The statistical techniques used in significance testing will have little meaning if the hypotheses are not stated correctly
Let’s practice making some research and null hypotheses:
Hypotheses Example 1
Six months after the local newspaper ran a weeklong series of articles on the Northlake, Virginia, Community Pride Center, the director wants to see whether the positive media coverage improved turnout at the center’s after-school recreation programs, compared to turnout before the media coverage took place
  H1: Media coverage increased turnout at the Community Pride Center
  H0: Media coverage did not increase turnout at the Community Pride Center
Hypotheses Example 2
The head of the Alton, New York, Public Works Department has installed security cameras in the public yard in hopes of lowering the large number of illegal after-hours dumping incidents. After 90 days, officials want to assess the impact this measure has had on the number of illegal dumping incidents.
  H1: The installation of security cameras has led to a decrease in the number of illegal dumping incidents
  H0: The installation of security cameras has not led to a decrease in the number of illegal dumping incidents
Hypotheses Example 3
The principal of the Oaklawn Charter School claims that the “Oaklawn method” of mathematics instruction produces higher scores on standardized math skills tests compared to those of students in the district who are taught “the old math.”
  H1: Math scores at Oaklawn are higher than those at other schools in the district
  H0: Math scores at Oaklawn are not higher than those at other schools in the district
Testing Hypotheses
Now that we have the idea about how to state research and null hypotheses, we can start looking at the statistical techniques used to test them
Three situations you will be in when needing to test your hypotheses
1. Population Parameter vs. Population Parameter
2. Sample Statistic vs. Population Parameter
3. Sample Statistic vs. Sample Statistic
The Hypothesis Testing System
1. State the research and null hypotheses
  H1 – Research Hypothesis
  H0 – Null Hypothesis
2. Select an alpha level: the % chance you are willing to accept of incorrectly rejecting the null hypothesis
3. Select and compute a test statistic
  Chi-Squared
  T-test
4. Accept or reject the null hypothesis
5. Make a decision
Type I vs. Type II errors
For a fixed sample size, these mirror each other: when one goes up, the other goes down
  If you increase your sample size, both can go down, but the trade-off between them remains the same
Type I error: rejecting a true Null Hypothesis
  Finding that Dr. Schroeder IS more intelligent (when in fact he is not)!
Type II error: not rejecting a false Null Hypothesis
  Finding that Dr. Schroeder’s intelligence is no different than the average (when in fact he is MUCH smarter)!
Why do we call it “not rejecting” or “failing to reject” the null hypothesis? Why don’t we just “accept” the null hypothesis or find the null hypothesis to be “true”?
Which type of error is it generally worse to make?
  Type I: it means finding false evidence that your hypothesis may be true. After a Type II error you have only failed to find more evidence and can always try again (more subjects, different alpha level, etc.)
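The Type I error rate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population values (mean 100, standard deviation 15), the sample size, and the one-sided z-test are hypothetical choices, not taken from the slides. It draws many samples from a population in which the null hypothesis is true and counts how often the test rejects it anyway; the rejection rate should land close to the chosen alpha.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha, n=30, trials=10000, seed=1):
    """Estimate how often a one-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 100.0, 15.0                  # hypothetical population values
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed rejection threshold
    rejections = 0
    for _ in range(trials):
        # The sample really does come from the null population.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        if z > z_crit:
            rejections += 1                   # a Type I error
    return rejections / trials

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(f"alpha=.05 -> observed Type I rate {rate_05:.3f}")
print(f"alpha=.01 -> observed Type I rate {rate_01:.3f}")
```

Lowering alpha lowers the share of false rejections, which is the trade-off the slide describes: a stricter threshold also makes real effects harder to detect.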
Selecting an alpha level (α)
Alpha, what is it? The probability that you will make a Type I error
  .05 means a 5% chance of committing the error
  .01 means 1%, etc.
Alpha, what’s it for? It is used with the test statistic to determine the “threshold” a score must be above in order to be accepted as “non-random”
Chosen prior to beginning the analysis. Why?
  Because it depends on the practical consequences of committing a Type I or Type II error, NOT on what the collected data show. You need to think this through first!
How SURE do you need to be?
Social scientists routinely use .05 for alpha
In managerial situations, however, that may be too big
A rape crisis center may decide that the probability that one staff member cannot handle all the possible rape calls in a single day is .05
This means, however, that 1 day in 20, or about once every three weeks, the rape crisis center will fail to meet a crisis
In this situation, you might instead pick .001 (which comes out to about one failure every three years)
A police department, on the other hand, may be able to accept a .05 probability that one of its cars may be out of service
But the fire department may require a probability of .0001 that a fire hose will fail to operate (1 in 10,000 chance)
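The arithmetic behind these examples is simply the reciprocal of the chosen probability. A minimal sketch, assuming each day (or each use) is an independent occasion with the same failure probability; the function name `occasions_per_failure` is made up for illustration:

```python
def occasions_per_failure(alpha):
    """Average number of occasions (days, uses) per failure at probability alpha."""
    return 1 / alpha

examples = [
    ("rape crisis center, alpha = .05", 0.05),
    ("rape crisis center, alpha = .001", 0.001),
    ("fire hose, alpha = .0001", 0.0001),
]

for label, alpha in examples:
    occasions = occasions_per_failure(alpha)
    # Expressed in years only for readability, assuming one occasion per day.
    print(f"{label}: 1 failure in {occasions:,.0f} occasions "
          f"(about {occasions / 365:.1f} years of daily occasions)")
```

At .05 this gives one failure in 20 days (roughly three weeks), and at .001 one failure in 1,000 days (a little under three years), matching the figures on the slides.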
Selecting a Test Statistic
Most commonly used in social sciences: chi-square and t-test
Which one to use? Depends on the “level” of data being investigated.
  chi-square: for nominal level data predicting nominal level data (usually in contingency tables)
    e.g. type of training program [a nominal category] vs. working status [another nominal category] – see Table 12.3
  t-test: for nominal level data predicting interval level data
    e.g. gender [a nominal level category] determining salary [an interval level variable]
Testing Hypotheses with Population Parameters
Parameter vs. Parameter
If you have access to the population parameters, then hypothesis testing is pretty easy
It’s like deciding who should start at center if Shaquille O’Neal plays for your team
Suppose Jerry Green, the governor of a large eastern state, wants to know whether a former governor’s executive reorganization has had any impact on the state’s expenditures
After some thought, he postulates the following
  H1: State expenditures decreased after the executive reorganization, compared with the state budget’s long-run growth rate
  H0: State expenditures did not decrease after the executive reorganization, compared with the state budget’s long-run growth rate
A management review shows that the state’s expenditures grew at a rate of 10.7% per year before the reorganization and 10.4% after the reorganization
What do these figures say about the null hypothesis?
Because 10.4% is less than 10.7%, we REJECT the NULL hypothesis
We conclude that the growth rate in state expenditures declined after the reorganization
Is a 0.3 percentage point decrease in the growth rate of expenditures significant?
  Of course! These are population parameters, which means the “probability” that the difference between the two conditions is real is 100%!
Is this statistically significant difference trivial?
  Probably
We could have made the hypothesis more specific: “…expenditures decreased by more than 5%...”
Don’t really have to use the hypothesis testing system in this scenario
No need for an alpha or statistical test (we are not dealing with statistics, we are dealing only with parameters)
Only when we add unknowns into the mix via sampling do we need to resort to statistical tests
You DO, however, need to take care to state logical hypotheses
Testing Hypotheses with Samples
Statistic vs. Parameter
Referred to as “One-Sample” tests
Comparing one sample to a known population
  “What’s the likelihood that the mean I just obtained from my sample is representative of the population as a whole?”
I already know the population of M&M handfuls for the class, so how well does the mean of a specific sample of four handfuls represent the population?
Testing Hypotheses with Samples
Statistic vs. Parameter – 1 Sample Chi Square
The One Sample Chi-Square
The Chi-square test used with one sample is described as a "goodness of fit" test. It can help you decide whether a distribution of frequencies for a variable in a sample is representative of, or "fits", a specified population distribution. For example, you can use this test to decide whether your data are approximately normal or not.
Suppose the relative frequencies of marital status for the population of adult American females under 40 years of age are as follows:

Marital Status   Relative Frequency (General Population)   Observed Frequency (Sample, N=200)
Married          0.55                                      100
Single           0.21                                      44
Separated        0.09                                      16
Divorced         0.12                                      36
Widowed          0.03                                      4
Then suppose an investigator wanted to know whether a particular sample of 200 adult females under age 40 was drawn from a population that is representative of the general population
By applying the procedures of Chi Square and the steps of hypothesis testing, we can decide whether the sample distribution is close enough to the population distribution to be considered representative of it.
State the Research and Null Hypotheses
  H1: The sample does not represent the population distribution
  H0: The sample does represent the population distribution
Why is H1 stated “negatively”?
What does Chi-Square show us?
Select an “alpha” level
How willing are we to make a mistake and say that the sample is NOT representative of the population when it actually is?
Really depends on why you are getting the sample in the first place, but let’s assume 5% for now
Stated another way, we’d say there is a probability of .05 that we will mistakenly accept the research hypothesis
Select and Compute a Test Statistic
In this case we are dealing with what “level” of data?
  Nominal
  Chi Square is selected
We calculate Expected frequencies for each of the cells in our sample distribution
If in our general population 55% of such women are married, then we would expect 55% of 200 = 110 in our sample to be married
  Single: 21% of 200 = 42
  Separated: 9% of 200 = 18
  Divorced: 12% of 200 = 24
  Widowed: 3% of 200 = 6
We then get the difference between each Expected and each Observed, square this, and then divide this result by the Expected. Summing these values across the cells gives Chi-Square.
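The expected frequencies and the per-cell calculation can be sketched in a few lines. This is an illustrative sketch using the marital status figures from the slides; the variable names are made up for the example.

```python
# Population relative frequencies and observed sample counts from the slides.
population_share = {"Married": 0.55, "Single": 0.21, "Separated": 0.09,
                    "Divorced": 0.12, "Widowed": 0.03}
observed = {"Married": 100, "Single": 44, "Separated": 16,
            "Divorced": 36, "Widowed": 4}

n = sum(observed.values())  # sample size, 200

# Expected count in each cell = population share * sample size.
expected = {status: share * n for status, share in population_share.items()}

# Per-cell contribution: (Observed - Expected)^2 / Expected.
contributions = {status: (observed[status] - expected[status]) ** 2 / expected[status]
                 for status in observed}

for status in observed:
    print(f"{status:<10} observed={observed[status]:>3} "
          f"expected={expected[status]:>5.0f} "
          f"contribution={contributions[status]:.2f}")
```

The Divorced cell contributes by far the most, because the sample has 36 divorced women where 24 would be expected.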
\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(100 - 110)^2}{110} + \frac{(44 - 42)^2}{42} + \frac{(16 - 18)^2}{18} + \frac{(36 - 24)^2}{24} + \frac{(4 - 6)^2}{6}
= 0.91 + 0.10 + 0.22 + 6.00 + 0.67
= 7.90
Reject or Accept the Null Hypothesis
You then refer to your X² table under the 0.05 heading with df = C - 1 = 4 (C = 5 categories)
You find a critical value of 9.49
Is our calculated value of 7.9 significant?
  No, because 7.9 is less than 9.49. So do we accept or reject the null hypothesis?
  Accept (more precisely, fail to reject) H0. What does that mean?
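The decision step can be sketched as follows. The critical value 9.49 is the table value quoted on the slide. The p-value is an addition not in the slides: it uses the closed-form upper-tail probability of the chi-square distribution, which holds only when the degrees of freedom are even (here df = 4), so the helper below is not a general-purpose routine.

```python
import math

observed = [100, 44, 16, 36, 4]   # sample counts from the slides
expected = [110, 42, 18, 24, 6]   # expected counts (population share * 200)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1            # C - 1 = 4
critical_value = 9.49             # chi-square table, alpha = .05, df = 4

def chi_square_upper_tail_even_df(x, df):
    """P(chi-square > x) for EVEN df only: exp(-x/2) * sum_{i<df/2} (x/2)^i / i!"""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

p_value = chi_square_upper_tail_even_df(chi_square, df)
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if reject_null else "fail to reject H0")
```

Both routes agree: the statistic falls below the critical value and the p-value is above .05, so the sample gives no grounds to say it is unrepresentative of the population.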
Testing Hypotheses with Samples
Statistic vs. Parameter – 1 Sample t-test
The One-Sample t Test
A professor wants to know if her introductory statistics class has a good grasp of basic math
Six students are chosen at random from the class and given a math proficiency test
The professor wants the class to be able to score at least 70 on the test
The six students get scores of 62, 92, 75, 68, 83, and 95
Can the professor be at least 90 percent certain that the mean score for the class on the test would be at least 70?
Scores: 62, 92, 75, 68, 83, 95
  mean: 79.17
  sd: 13.17
State your research and null hypotheses
  H1: μ ≥ 70
  H0: μ < 70
Select and Compute Statistic
  Dealing with Interval/Ratio level data
  Select t-test
To test the hypothesis, the computed t-value of 1.71 will be compared to the critical value in the t-table.
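For reference, the t-value of 1.71 can be reproduced from the sample mean and standard deviation given for the six scores. This worked equation is a reconstruction using the standard one-sample t formula, with \mu_0 = 70 as the professor's target score:

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{79.17 - 70}{13.17 / \sqrt{6}}
  \approx \frac{9.17}{5.38}
  \approx 1.71
```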
Accept or Reject the Null Hypothesis
A 90 percent confidence level is equivalent to an alpha level of .10
The number of degrees of freedom for the problem is 6 – 1 = 5
The value in the t-table for t.10,5 (alpha = .10, df = 5) is 1.476
Because the computed t-value of 1.71 is larger than the critical value in the table, the null hypothesis can be rejected, and the professor can be 90 percent certain that the class mean on the math test would be at least 70
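The whole one-sample test can be sketched with the Python standard library. This is a minimal illustration, not a general routine: the critical value 1.476 is copied from the t-table as quoted on the slide rather than computed, and the test is treated as one-tailed (upper tail).

```python
from statistics import mean, stdev

scores = [62, 92, 75, 68, 83, 95]   # the six sampled students
mu0 = 70                            # score the professor wants the class to reach
n = len(scores)

sample_mean = mean(scores)
sample_sd = stdev(scores)           # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / n ** 0.5

t_value = (sample_mean - mu0) / standard_error
df = n - 1
critical_t = 1.476                  # t-table, one-tailed alpha = .10, df = 5

reject_null = t_value > critical_t

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.2f}, t = {t_value:.2f}, df = {df}")
print("reject H0" if reject_null else "fail to reject H0")
```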
t vs. z
Note that the formula for the one-sample t-test for a population mean is the same as the z-test, except that the t-test substitutes the sample standard deviation s for the population standard deviation σ and takes critical values from the t-distribution instead of the z-distribution. The t-distribution is particularly useful for tests with small samples (n < 30).
When n is about 30, you could use either distribution to test your hypothesis.
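To see why the choice matters for small samples, the sketch below compares z critical values with the t-table values quoted on these slides. It assumes Python 3.8 or later for `statistics.NormalDist`; the t values are copied from the slides, not computed.

```python
from statistics import NormalDist

# Critical values quoted on the slides (t-table lookups, not computed here)
t_one_tail_10_df5 = 1.476    # alpha = .10 in one tail, df = 5
t_two_tail_05_df4 = 2.776    # alpha = .05 split over two tails, df = 4

# Matching z critical values from the standard normal distribution
standard_normal = NormalDist()
z_one_tail_10 = standard_normal.inv_cdf(1 - 0.10)
z_two_tail_05 = standard_normal.inv_cdf(1 - 0.025)

print(f"one tail, alpha .10:  z = {z_one_tail_10:.3f}, t(df=5) = {t_one_tail_10_df5}")
print(f"two tails, alpha .05: z = {z_two_tail_05:.3f}, t(df=4) = {t_two_tail_05_df4}")
```

With only 4 or 5 degrees of freedom the t cutoffs are noticeably larger than the z cutoffs, so a small sample needs a larger test statistic before the null can be rejected.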
“Two-Tail” Example
Used when you don’t care whether something is more or less than a value, just whether it is “different”
A Little League baseball coach wants to know if his team is representative of other teams in scoring runs
Nationally, the average number of runs scored by a Little League team in a game is 5.7
He chooses five games at random in which his team scored 5, 9, 4, 11, and 8 runs. Is it likely that his team's scores could have come from the national distribution?
Scores: 5, 9, 4, 11, 8
mean: 7.4
sd: 2.88
State Research and Null Hypotheses
H1: μ ≠ 5.7
H0: μ = 5.7
Select alpha
Alpha: .05
Calculate t
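A worked version of the calculation, assuming the standard one-sample t formula with the sample standard deviation in place of σ and the numbers from the coach's five games:

```latex
t = \frac{\bar{X} - \mu}{s / \sqrt{n}}
  = \frac{7.4 - 5.7}{2.88 / \sqrt{5}}
  = \frac{1.7}{1.288}
  \approx 1.32
```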
Now, look up the critical value from the t-table
The degrees of freedom is 5 – 1 = 4. The overall alpha level is .05
But because this is a two-tailed test, the alpha level must be divided by two, which yields .025
This means .025 on either end vs .05 on one end
The tabled value for t.025,4 is 2.776
In a two-tailed hypothesis, you have to consider BOTH ends, not just one
t must be EITHER more than the positive critical value or less than the negative critical value (±2.776)
The computed t of 1.32 is not smaller than -2.776 or more than +2.776
You cannot reject the null hypothesis that the mean of this team is equal to the population mean
The coach cannot conclude that his team differs from the national distribution on runs scored.
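The whole two-tailed test can be sketched as follows. This uses only the Python standard library; the variable names are illustrative, and the critical value 2.776 is the tabled figure from the slide.

```python
import math
import statistics

runs = [5, 9, 4, 11, 8]     # runs scored in the five sampled games
national_mean = 5.7          # hypothesized population mean (H0)
t_critical = 2.776           # tabled t for .025 in each tail, df = 4 (from the slide)

n = len(runs)
std_error = statistics.stdev(runs) / math.sqrt(n)
t_value = (statistics.mean(runs) - national_mean) / std_error

# Two-tailed test: reject only if t falls beyond either critical value
reject_null = abs(t_value) > t_critical

print(f"t = {t_value:.2f}, reject null: {reject_null}")
```

Comparing `abs(t_value)` with the critical value is the same as checking both ends (less than -2.776 or more than +2.776) in a single step.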
Confidence interval for population mean using t
This is exactly the same as when dealing with z!
Remember ± 1.96 x s.e.? The 1.96 was the z-score
Now we are just using t x s.e.
Using the previous example, what is a 95 percent confidence interval for runs scored per team per game?
First, determine the t-value. A 95 percent confidence level is equivalent to an alpha level of .05
Half of .05 is .025 (Why Half?)
The t-value corresponding to an area of .025 at either end of the t-distribution for 4 degrees of freedom (t.025,4) is 2.776.
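Carrying the calculation through, here is a minimal sketch of the interval as mean ± t x s.e., using the five game scores and the tabled t-value of 2.776. The variable names are illustrative.

```python
import math
import statistics

runs = [5, 9, 4, 11, 8]     # runs scored in the five sampled games
t_value = 2.776              # t.025,4 from the t-table (quoted on the slide)

mean = statistics.mean(runs)
std_error = statistics.stdev(runs) / math.sqrt(len(runs))

margin = t_value * std_error          # t x s.e.
lower, upper = mean - margin, mean + margin

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

The interval contains the national mean of 5.7, which agrees with the two-tailed test failing to reject the null hypothesis.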
In-Class Exercise
Testing Hypotheses with Samples
Statistic vs. Statistic
Used to test hypotheses that two groups have statistically different means (two-tailed [non-directional])
H1: Men make a different salary than women
H0: Men and women make the same salary
Or, tests hypotheses that one group’s mean is higher than the other group’s mean (one-tailed [directional])
H1: Men make more than women
H0: Men make the same or less than women
Testing Hypotheses with Samples
Statistic vs. Statistic – Chi Square
You’ve already done this previously when you were calculating chi square for contingency tables!
Now, you are just adding on the proper way to hypothesize
Because chi square looks for the existence of a relationship based on the “difference” between observed and expected, your null hypothesis is always that there is “no difference”
Previous Example
Calculations for Expected Frequencies
Table Cell
Competence   Hierarchy   Observed   Expected            (O-E)²/E
Low          Low         113        .50 x 152 = 76.0    18.01
Low          Medium      31         .40 x 152 = 60.8    14.61
Low          High        8          .10 x 152 = 15.2    3.41
Medium       Low         60         .50 x 159 = 79.5    4.78
Medium       Medium      91         .40 x 159 = 63.6    11.8
Medium       High        8          .10 x 159 = 15.9    3.93
High         Low         27         .50 x 89 = 44.5     6.88
High         Medium      38         .40 x 89 = 35.6     .16
High         High        24         .10 x 89 = 8.9      25.62
Total                    400        400                 89.2 (chi-square)
Previous Example
H1: Hierarchy is related to competence
H0: Hierarchy is not related to competence
If chi square is higher than the critical value, you reject the null hypothesis and accept the research hypothesis
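The table and the decision rule can be checked with a short sketch. The observed counts come from the table; expected counts are computed as row total x column total / grand total, which gives the same .50, .40, and .10 proportions used on the slide. The slides do not state an alpha or critical value for this example, so the 9.488 below (the standard chi-square table value for alpha = .05 with 4 degrees of freedom) is an assumption.

```python
# Observed counts: rows are competence levels, columns are hierarchy (Low, Medium, High)
observed = {
    "Low":    [113, 31, 8],
    "Medium": [60, 91, 8],
    "High":   [27, 38, 24],
}

row_totals = {level: sum(counts) for level, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

chi_square = 0.0
for level, counts in observed.items():
    for col_total, obs in zip(col_totals, counts):
        expected = row_totals[level] * col_total / grand_total
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom for a contingency table: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(col_totals) - 1)

critical_value = 9.488   # ASSUMPTION: alpha = .05, df = 4, from a standard chi-square table
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.1f}, df = {df}, reject null: {reject_null}")
```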
Testing Hypotheses with Samples
Statistic vs. Statistic – t-test for two sample means
Tests whether the means of two groups are statistically different from each other
Let’s look at some graphs so we may visually understand what it is we are looking at
Statistical Analysis
[Figure: control group mean and treatment group mean]
Is there a difference?
What Does Difference Mean?
[Figure: three cases labeled medium variability, high variability, and low variability]
The mean difference is the same for all three cases.
Which one shows the greatest difference?
A statistical difference is a function of the difference between means relative to the variability.
A small difference between means with large variability could be due to chance.
What Do We Estimate?
[Figure: low variability case]
$$\frac{\text{Signal}}{\text{Noise}} = \frac{\text{Difference between group means}}{\text{Variability of groups}} = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)} = t\text{-value}$$
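To make the signal-to-noise idea concrete, the sketch below holds the difference between group means fixed and changes only the variability. All numbers here are hypothetical and chosen purely for illustration; `t_ratio` is an illustrative helper, not something from the slides.

```python
def t_ratio(mean_treatment, mean_control, se_difference):
    """Signal (difference between group means) divided by noise (its standard error)."""
    return (mean_treatment - mean_control) / se_difference

# HYPOTHETICAL values: same 10-point mean difference, three levels of variability
mean_treatment, mean_control = 60.0, 50.0
for label, se_difference in [("low", 2.0), ("medium", 5.0), ("high", 10.0)]:
    t = t_ratio(mean_treatment, mean_control, se_difference)
    print(f"{label:>6} variability: t = {t:.1f}")
```

The same mean difference produces the largest t when variability is lowest, which is why the low-variability case shows the greatest statistical difference.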
The Ware County librarian wants to increase circulation from the Ware County bookmobiles
The librarian thinks that poster ads in areas where the bookmobiles stop will attract more browsers and increase circulation
To test this idea, the librarian sets up an experiment
Ten bookmobile routes are selected at random
On those routes, poster ads are posted with bookmobile information
Ten other bookmobile routes are selected at random
On those routes, no advertising is done
Step 1 - Hypotheses
The null hypothesis is that the mean circulation of the experimental group is not higher than the mean circulation of the control group
The research hypothesis is that the mean circulation of the experimental group is higher than the mean circulation of the control group
Step 2 – Alpha
.05
The following data is obtained:
Librarian’s Data (Books)
Group                Mean   Standard Deviation
Experimental Group   526    125
Control Group        475    115
STEP 3
Calculate the s.e. for each group
Experimental Group: 125/sqrt(10) = 39.5
Control Group: 115/sqrt(10) = 36.4
STEP 4
Create a “pooled” standard error
$$\mathit{s.e.}_d = \sqrt{\mathit{s.e.}_1^2 + \mathit{s.e.}_2^2}$$
$$\mathit{s.e.}_d = \sqrt{39.5^2 + 36.4^2} = 53.7$$
STEP 5
Subtract the mean of the second group from the first
Then divide by the pooled error
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\mathit{s.e.}_d} = \frac{526 - 475}{53.7} = .95$$
STEP 6
Degrees of Freedom equals n1 + n2 – 2
10 + 10 – 2 = 18
STEP 7
Look it up and accept or reject the null hypothesis
Critical value for 18 df at the .05 level of significance is 1.734
We did not meet that value and, therefore, fail to reject the null hypothesis
We can’t say that the advertising increased book circulation
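Steps 3 through 7 can be sketched from the summary statistics alone. This follows the slides' method (a standard error per group, combined as the square root of the sum of squares); the variable names are illustrative and the critical value 1.734 is the tabled figure quoted in Step 7.

```python
import math

# Summary data from the librarian's experiment (books circulated per route)
n_exp, n_ctl = 10, 10
mean_exp, mean_ctl = 526, 475
sd_exp, sd_ctl = 125, 115
t_critical = 1.734                     # one-tailed, alpha = .05, df = 18 (from the slide)

# Step 3: standard error for each group
se_exp = sd_exp / math.sqrt(n_exp)
se_ctl = sd_ctl / math.sqrt(n_ctl)

# Step 4: "pooled" standard error of the difference
se_diff = math.sqrt(se_exp ** 2 + se_ctl ** 2)

# Step 5: difference between means divided by the pooled error
t_value = (mean_exp - mean_ctl) / se_diff

# Step 6: degrees of freedom
df = n_exp + n_ctl - 2

# Step 7: one-tailed decision (research hypothesis: experimental mean is higher)
reject_null = t_value > t_critical

print(f"s.e.d = {se_diff:.1f}, t = {t_value:.2f}, df = {df}, reject null: {reject_null}")
```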
What if we wanted to just see if advertising had any effect?
What values would we be comparing then?
t would have to be outside the range of ±2.10
How did we get that?
In-Class Exercise
Testing Hypotheses with Samples
Statistic vs. Statistic – t-test for proportions
Because we can figure out standard errors for proportions (as we did over the last 2 weeks), we can also use a t-test to compare two groups’ proportions
The formulas are the same; the only difference is the calculation of the standard deviation from the proportions
If you’re trying to see if there is a statistical difference between two groups on whether or not they support an amendment to the state constitution, it would look like this
                   Group A   Group B
n                  50        50
For                60%       40%
p                  .60       .40
s = sqrt(p(1-p))   .49       .49
Once you have the standard deviation, you do everything the same as when comparing means
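Applying the same steps used for comparing means to this example gives the sketch below. The slides stop at the standard deviations, so everything after that point is a continuation under the assumption that the two-sample procedure above is reused unchanged. No critical value is given on the slides for this case, so the code stops at t and the degrees of freedom.

```python
import math

# Group data from the slide: share of each group supporting the amendment
n_a, n_b = 50, 50
p_a, p_b = 0.60, 0.40

# Standard deviation of a proportion: sqrt(p(1 - p))
s_a = math.sqrt(p_a * (1 - p_a))
s_b = math.sqrt(p_b * (1 - p_b))

# From here on, same steps as the t-test for two sample means
se_a = s_a / math.sqrt(n_a)
se_b = s_b / math.sqrt(n_b)
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)

t_value = (p_a - p_b) / se_diff
df = n_a + n_b - 2

print(f"s_a = {s_a:.2f}, s_b = {s_b:.2f}, t = {t_value:.2f}, df = {df}")
```

The resulting t is then compared with the t-table value for the chosen alpha and 98 degrees of freedom, one-tailed or two-tailed depending on the hypothesis.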
In-Class Worksheet
Homework
Write 2 scenarios and analyses (make them PA relevant), one for a comparison of sample means, another for a comparison of sample proportions
Make up the problem descriptions and data
Emailed to me by Halloween Midnight
No class Halloween night