Hypothesis Testing

Testing Statistical Significance

Statistical Decision Making

Public managers are often faced with decisions about program effectiveness, personnel productivity, and procedural changes
  Is Patty Roberts an effective supervisor?
  If we redesign form 54b, will it result in faster processing times?
  Is the Head Start program resulting in better reading scores for its participants?

When we change these questions into statements, we have made hypotheses
  The Head Start program has resulted in higher reading scores for its participants


However, in research we don’t directly test our hypotheses

Instead we test the “negative” of the hypothesis
  Our research hypothesis may be that the new speeding fines in Virginia are resulting in fewer highway fatalities
  However, what we test is the statement “the new speeding fines in Virginia have NOT reduced highway fatalities”

This is called the Null Hypothesis
  Why?


How do you “prove” something?
  Can you “prove” anything?
  Can you “fail” to prove something?

All we can do is triangulate on the truth by eliminating what, most likely, is not the truth.

Huh?

It’s all based on the concept of disconfirming evidence

Hypothesis testing relies on “disconfirming” evidence
An investigator does not directly assert that his/her data support the hypothesis
Instead, the investigator states that evidence shows that the “null” hypothesis is probably false
Sherlock Holmes got it:
  “Eliminate the impossible, whatever remains, however improbable, is the truth.”

Examples of H1 and H0

H1: Some job training programs are more successful than other programs in placing trainees in permanent employment.
H0: All job training programs are equally likely to place trainees in permanent employment.

H1: Male planners earn higher salaries than female planners.
H0: Gender is not related to planners’ salaries.

H1: Dr. Schroeder is smarter than the average Virginia Tech Research Professor
H0: Dr. Schroeder is no smarter than the average Virginia Tech Research Professor (any difference in his intelligence is random error)

Disconfirming example

We cannot “prove” that Dr. Schroeder is smarter than the average Virginia Tech Research Professor

What we can do is “reject” the null hypothesis: we can show that Dr. Schroeder’s intelligence is “not” within the random error surrounding the scores of a random sample of other research professors. If H0 is not true, then there is more “evidence” that H1 might be true!

Importance of Stating the Hypothesis Correctly

The ability to state the null (H0) and research (H1) hypotheses correctly is essential
The statistical techniques used in significance testing will have little meaning if the hypotheses are not stated correctly
Let’s practice making some research and null hypotheses:

Hypotheses Example 1

Six months after the local newspaper ran a weeklong series of articles on the Northlake, Virginia, Community Pride Center, the director wants to see whether the positive media coverage improved turnout at the center’s after-school recreation programs, compared to turnout before the media coverage took place

H1: Media coverage increased turnout at the Community Pride Center
H0: Media coverage did not increase turnout at the Community Pride Center


The head of the Alton, New York, Public Works Department has installed security cameras in the public yard in hopes of lowering the large number of illegal after-hours dumping incidents. After 90 days, officials want to assess the impact this measure has had on the number of illegal dumping incidents.

H1: The installation of security cameras has led to a decrease in the number of illegal dumping incidents
H0: The installation of security cameras has not led to a decrease in the number of illegal dumping incidents


The principal of the Oaklawn Charter School claims that the “Oaklawn method” of mathematics instruction produces higher scores on standardized math skills tests compared to those of students in the district who are taught “the old math.”

H1: Math scores at Oaklawn are higher than those at other schools in the district
H0: Math scores at Oaklawn are not higher than those at other schools in the district

Testing Hypotheses

Now that we have an idea of how to state research and null hypotheses, we can start looking at the statistical techniques used to test them

There are three situations you will be in when you need to test your hypotheses:
1. Population Parameter vs. Population Parameter
2. Sample Statistic vs. Population Parameter
3. Sample Statistic vs. Sample Statistic

The Hypothesis Testing System

1. State the research and null hypotheses
   H1: Research Hypothesis
   H0: Null Hypothesis
2. Select an alpha level: the percentage of the time you are willing to incorrectly reject the null hypothesis
3. Select and compute a test statistic
   Chi-square
   t-test
4. Accept or reject the null hypothesis
5. Make a decision

Type I vs. Type II errors

These are the mirror of each other: for a given sample size, when one goes up, the other goes down
If you increase your sample size, you can lower both, but the trade-off between them remains the same

Type I error: rejecting a true Null Hypothesis
  Finding that Dr. Schroeder IS more intelligent (when in fact he is not)!
Type II error: not rejecting a false Null Hypothesis
  Finding that Dr. Schroeder’s intelligence is no different than the average (when in fact he is MUCH smarter)!

Why do we call it “not rejecting” or “failing to reject” the null hypothesis? Why don’t we just “accept” the null hypothesis or find the null hypothesis to be “true”?

Which type of error is it generally worse to make?
  Type I (finding false evidence that your hypothesis may be true, as opposed to failing to find more evidence; you can always try again with more subjects, a different alpha level, etc.)
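A small simulation can make the Type I error rate concrete. The sketch below is only an illustration under stated assumptions that are not part of the slides: the population is normal with a known standard deviation of 1, the null hypothesis really is true (the population mean is 0), and a one-tailed z-test at alpha = .05 is used (critical z of 1.645). The function name and its parameters are hypothetical.

```python
import math
import random


def type_one_error_rate(critical_z=1.645, n=30, trials=4000, seed=1):
    """Fraction of trials in which a TRUE null hypothesis is rejected.

    Each trial draws n values from a normal population whose mean really is 0
    (so H0 is true) and runs a one-tailed z-test with known sigma = 1.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / math.sqrt(n))  # (mean - 0) / standard error
        if z > critical_z:  # a "significant" result even though H0 is true
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")  # expected to land near 0.05
```

Because the null hypothesis is true in every trial, every rejection counted here is a Type I error, and the long-run share of such rejections should sit close to the chosen alpha.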

Selecting an alpha level (α)

Alpha, what is it? The probability that you will make a Type I error
  .05 means a 5% chance of committing the error
  .01 means 1%, etc.

Alpha, what’s it for? It is used with the test statistic to determine the “threshold” a score must exceed in order to be accepted as “non-random”

Chosen prior to beginning analysis. Why?
  Because it depends on the practical consequences of committing a Type I or Type II error, NOT on what the collected data show. You need to think this through first!

How SURE do you need to be?

Social scientists routinely use .05 for alpha

In managerial situations, however, that may be too big


A rape crisis center may decide that the probability that one staff member cannot handle all the possible rape calls in a single day is .05

This means, however, that 1 day in 20, or once every three weeks, the rape crisis center will fail to meet a crisis

In this situation, you might instead pick .001 (which comes out to about one failure every three years)
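The arithmetic behind "1 day in 20" and "about one failure every three years" can be checked with a short sketch. It assumes, as a simplification not stated in the slides, that each day fails independently with the same probability, so the expected gap between failures is 1 divided by that probability. The function name is hypothetical.

```python
def days_between_failures(daily_failure_probability):
    """Expected number of days between failures, assuming independent days."""
    return 1.0 / daily_failure_probability


for alpha in (0.05, 0.001):
    days = days_between_failures(alpha)
    print(f"alpha = {alpha}: about one failure every {days:.0f} days "
          f"({days / 7:.1f} weeks, {days / 365:.1f} years)")
```

At .05 the gap is about 20 days (roughly three weeks); at .001 it is about 1,000 days, a little under three years.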


A police department, on the other hand, may be able to accept a .05 probability that one of its cars may be out of service

But the fire department may require a probability of .0001 that a fire hose will fail to operate (1 in 10,000 chance)

Selecting a Test Statistic

Most commonly used in social sciences: chi-square and t-test

Which one to use? Depends on the “level” of data being investigated.
  chi-square: for nominal level data predicting nominal level data (usually in contingency tables)
    e.g. type of training program [a nominal category] vs. working status [another nominal category] (see Table 12.3)
  t-test: for nominal level data predicting interval level data
    e.g. gender [a nominal level category] determining salary [an interval level category]
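The selection rule above can be written as a tiny lookup. This is only a sketch of the two cases named on the slide; the function name and the level labels are hypothetical, and any other combination of levels is simply reported as not covered here.

```python
def choose_test(predictor_level, outcome_level):
    """Pick a test statistic from the level of measurement of each variable."""
    if predictor_level == "nominal" and outcome_level == "nominal":
        return "chi-square"
    if predictor_level == "nominal" and outcome_level == "interval":
        return "t-test"
    return "not covered in this section"


print(choose_test("nominal", "nominal"))   # training program vs. working status
print(choose_test("nominal", "interval"))  # gender vs. salary
```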

Testing Hypotheses with Population Parameters
Parameter vs. Parameter

If you have access to the population parameters, then hypothesis testing is pretty easy

It’s like deciding who should start at center if Shaquille O’Neal plays for your team


Suppose Jerry Green, the governor of a large eastern state, wants to know whether a former governor’s executive reorganization has had any impact on the state’s expenditures

After some thought, he postulates the following:
  H1: State expenditures decreased after the executive reorganization, compared with the state budget’s long-run growth rate
  H0: State expenditures did not decrease after the executive reorganization, compared with the state budget’s long-run growth rate


A management review shows that the state’s expenditure grew at a rate of 10.7% per year before the reorganization and 10.4% after the reorganization

What do these figures say about the null hypothesis?

Because 10.4% is less than 10.7%, we REJECT the NULL hypothesis

We conclude that the growth rate in state expenditures declined after the reorganization


Is a 0.3 percentage point decrease in the growth rate of expenditures significant?
  Of course! These are population parameters, which means the “probability” that the difference between the two conditions is real is 100%!

Is this statistically significant difference trivial?
  Probably

We could have made the hypothesis more specific: “…expenditures decreased by more than 5%...”


You don’t really have to use the hypothesis testing system in this scenario

No need for an alpha or statistical test (we are not dealing with statistics, we are dealing only with parameters)

Only when we add unknowns into the mix via sampling do we need to resort to statistical tests

You DO, however, need to take care to state logical hypotheses

Testing Hypotheses with Samples
Statistic vs. Parameter

Referred to as “One-Sample” tests
Comparing one sample to a known population
“What’s the likelihood that the mean I just obtained from my sample is representative of the population as a whole?”

I already know the population of M&M handfuls for the class, so how well does the mean of a specific sample of four handfuls represent the population?

Testing Hypotheses with Samples
Statistic vs. Parameter: 1 Sample Chi Square

The One Sample Chi-Square

The Chi-square test used with one sample is described as a "goodness of fit" test. It can help you decide whether a distribution of frequencies for a variable in a sample is representative of, or "fits", a specified population distribution. For example, you can use this test to decide whether your data are approximately normal or not.


Suppose the relative frequencies of marital status for the population of adult American females under 40 years of age are as follows:

Marital Status   Relative Frequency (General Population)   Observed Frequency (Sample, N=200)
Married          0.55                                      100
Single           0.21                                      44
Separated        0.09                                      16
Divorced         0.12                                      36
Widowed          0.03                                      4


Then suppose an investigator wanted to know whether a particular sample of 200 adult females under age 40 was drawn from a population that is representative of the general population

By applying the procedures of Chi Square and the steps of hypothesis testing, we can decide whether the sample distribution is close enough to the population distribution to be considered representative of it.











State the Research and Null Hypotheses

H1: The sample does not represent the population distribution
H0: The sample does represent the population distribution

Why is H1 stated “negatively”?
What does Chi-Square show us?











Select an “alpha” level

How willing are we to make a mistake and say that the sample is NOT representative of the population when it actually is?

Really depends on why you are getting the sample in the first place, but let’s assume 5% for now

Stated another way, we’d say there is a probability of .05 that we will mistakenly accept the research hypothesis











Select and Compute a Test Statistic

In this case we are dealing with what “level” of data?
  Nominal
  Chi Square is selected











We calculate Expected frequencies for each of the cells in our sample distribution

If in our general population 55% of such women are married, then we would expect 55% of 200 = 110 in our sample to be married
Single: 21% of 200 = 42
Separated: 9% of 200 = 18
Divorced: 12% of 200 = 24
Widowed: 3% of 200 = 6

We then get the difference between each Expected and each Observed, square this, and then divide this result by the Expected. Summing these results across all cells gives the chi-square value.




















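χ² = Σ (O − E)² / E
   = (100 − 110)²/110 + (44 − 42)²/42 + (16 − 18)²/18 + (36 − 24)²/24 + (4 − 6)²/6
   = 0.91 + 0.10 + 0.22 + 6.00 + 0.67
   = 7.90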











Reject or Accept the Null Hypothesis

You then refer to your χ² table under the 0.05 heading with df = C − 1 = 4
You find a critical value of 9.49
Is our calculated value of 7.9 significant?
No, so do we accept or reject the null hypothesis?
Accept H0. What does that mean?
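The goodness-of-fit calculation for the marital-status example can be reproduced with a few lines of Python. This is a minimal sketch using only the numbers from the slides; the critical value 9.49 is copied from the slide rather than computed, and the variable names are illustrative. Because the code does not round each term before summing, it reports about 7.89 rather than the 7.90 obtained from the rounded terms.

```python
# Population proportions and observed sample counts from the marital-status table.
population = {"Married": 0.55, "Single": 0.21, "Separated": 0.09,
              "Divorced": 0.12, "Widowed": 0.03}
observed = {"Married": 100, "Single": 44, "Separated": 16,
            "Divorced": 36, "Widowed": 4}

n = sum(observed.values())  # sample size, 200

chi_square = 0.0
for status, proportion in population.items():
    expected = proportion * n  # expected count if the sample fits the population
    contribution = (observed[status] - expected) ** 2 / expected
    chi_square += contribution
    print(f"{status:<10} O={observed[status]:>3}  E={expected:>5.1f}  "
          f"(O-E)^2/E={contribution:.2f}")

df = len(observed) - 1   # C - 1 = 4 categories minus one
critical_value = 9.49    # tabled value for alpha = .05, df = 4 (from the slide)
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```

Since the computed value is below the critical value, the null hypothesis that the sample fits the population distribution is not rejected.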

Testing Hypotheses with Samples
Statistic vs. Parameter: 1 Sample t-test

The One-Sample t Test

A professor wants to know if her introductory statistics class has a good grasp of basic math
Six students are chosen at random from the class and given a math proficiency test
The professor wants the class to be able to score at least 70 on the test
The six students get scores of 62, 92, 75, 68, 83, and 95
Can the professor be at least 90 percent certain that the mean score for the class on the test would be at least 70?

Scores: 62, 92, 75, 68, 83, 95
mean: 79.17
sd: 13.17


State your research and null hypotheses

H1: μ > 70
H0: μ ≤ 70



Select and Compute Statistic

Dealing with Interval/Ratio level data
Select t-test



To test the hypothesis, the computed t-value of 1.71 will be compared to the critical value in the t-table.
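For reference, the computed t-value quoted above follows from the standard one-sample t formula applied to the sample mean, standard deviation, and sample size given for the six scores. The symbol μ₀ for the benchmark value of 70 is notation introduced here for illustration.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{79.17 - 70}{13.17 / \sqrt{6}}
  \approx \frac{9.17}{5.377}
  \approx 1.71
```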



Accept or Reject the Null Hypothesis

A 90 percent confidence level is equivalent to an alpha level of .10
The number of degrees of freedom for the problem is 6 − 1 = 5
The value in the t-table for t.10,5 is 1.476
Because the computed t-value of 1.71 is larger than the critical value in the table, the null hypothesis can be rejected, and the professor can be 90 percent certain that the class mean on the math test would be at least 70
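The whole one-sample, one-tailed test for the professor's example can be sketched in Python with the standard library. The scores, the benchmark of 70, and the critical value 1.476 come from the slides; the variable names are illustrative, and the critical value is hard-coded from the t-table rather than computed.

```python
import math
import statistics

scores = [62, 92, 75, 68, 83, 95]
target = 70                      # benchmark score from the example

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)    # sample standard deviation (n - 1 denominator)
standard_error = sd / math.sqrt(n)
t_value = (mean - target) / standard_error

df = n - 1
critical_value = 1.476           # tabled t for alpha = .10, df = 5 (from the slide)
reject_null = t_value > critical_value  # one-tailed test: only large t counts

print(f"mean = {mean:.2f}, sd = {sd:.2f}, t = {t_value:.2f}, df = {df}")
print(f"reject H0: {reject_null}")
```

The test is one-tailed because the professor only cares whether the class mean is above the benchmark, so only a large positive t leads to rejection.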


t vs. z

Note that the formula for the one-sample t-test for a population mean is the same as the z-test, except that the t-test substitutes the sample standard deviation s for the population standard deviation σ and takes critical values from the t-distribution instead of the z-distribution. The t-distribution is particularly useful for tests with small samples (n < 30)

Once n is about 30, you could use either distribution to test your hypothesis
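Written side by side, the two statistics differ only in which standard deviation appears in the denominator. The notation below (x̄ for the sample mean, μ₀ for the hypothesized population mean) is the conventional one and is assumed here rather than taken from the slides.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\qquad\text{vs.}\qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \quad df = n - 1
```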


“Two-Tail” Example

Used when you don’t care if something is more or less than, just “different” than
A Little League baseball coach wants to know if his team is representative of other teams in scoring runs
Nationally, the average number of runs scored by a Little League team in a game is 5.7
He chooses five games at random in which his team scored 5, 9, 4, 11, and 8 runs. Is it likely that his team's scores could have come from the national distribution?

Scores: 5, 9, 4, 11, 8
mean: 7.4
sd: 2.88


State Research and Null Hypotheses
  H1: μ ≠ 5.7
  H0: μ = 5.7

Select alpha
  Alpha: .05



Calculate t

t = (x̄ − μ) / (s / √n) = (7.4 − 5.7) / (2.88 / √5) = 1.32


Now, look up the critical value from the t-table

The degrees of freedom is 5 − 1 = 4. The overall alpha level is .05
But because this is a two-tailed test, the alpha level must be divided by two, which yields .025
  This means .025 on either end vs. .05 on one end
The tabled value for t.025,4 is 2.776



In a two-tailed hypothesis, you have to consider BOTH ends, not just one

To reject the null hypothesis, t must be EITHER more than the positive critical value or less than the negative critical value (±2.776)
The computed t of 1.32 is not smaller than −2.776 or more than +2.776
You cannot reject the null hypothesis that the mean of this team is equal to the population mean
The coach can conclude that his team's scores are consistent with the national distribution on runs scored.
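The two-tailed version of the test can be sketched the same way. The runs, the national mean of 5.7, and the critical value 2.776 come from the slides; the variable names are illustrative, and the critical value is again hard-coded from the t-table. The only structural change from a one-tailed test is that the absolute value of t is compared with the critical value.

```python
import math
import statistics

runs = [5, 9, 4, 11, 8]
national_mean = 5.7

n = len(runs)
mean = statistics.mean(runs)
sd = statistics.stdev(runs)      # sample standard deviation (n - 1 denominator)
t_value = (mean - national_mean) / (sd / math.sqrt(n))

df = n - 1
critical_value = 2.776           # tabled t for .025 in each tail, df = 4 (from the slide)
reject_null = abs(t_value) > critical_value  # two-tailed: either extreme rejects

print(f"mean = {mean:.1f}, sd = {sd:.2f}, t = {t_value:.2f}, df = {df}")
print(f"reject H0: {reject_null}")
```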


Confidence interval for population mean using t

This is exactly the same as when dealing with z!

Remember ± 1.96 x s.e.? The 1.96 was the z-score

Now we are just using t x s.e.

Confidence interval for population mean using t

Using the previous example, what is a 95 percent confidence interval for runs scored per team per game?

First, determine the t-value. A 95 percent confidence level is equivalent to an alpha level of .05

Half of .05 is .025 (Why Half?)

The t-value corresponding to an area of .025 at either end of the t-distribution for 4 degrees of freedom (t.025,4) is 2.776.
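With the t-value in hand, the interval itself is the sample mean plus or minus t times the standard error. The sketch below applies that to the five Little League scores from the earlier example; the variable names are illustrative, and the t-value of 2.776 is taken from the slide rather than computed.

```python
import math
import statistics

runs = [5, 9, 4, 11, 8]
t_critical = 2.776               # t.025,4 from the slide

mean = statistics.mean(runs)
standard_error = statistics.stdev(runs) / math.sqrt(len(runs))
margin = t_critical * standard_error

lower = mean - margin
upper = mean + margin

print(f"standard error = {standard_error:.3f}, margin = {margin:.2f}")
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The interval comes out to roughly 3.82 to 10.98 runs per game. It contains the national mean of 5.7, which agrees with the two-tailed test's failure to reject the null hypothesis.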


In-Class Exercise

Testing Hypotheses with Samples
Statistic vs. Statistic

Used to test hypotheses that two groups have statistically different means (two-tailed [non-directional])
  H1: Men make a different salary than women
  H0: Men and women make the same

Or, tests hypotheses that one group’s mean is higher than the other group’s mean (one-tailed [directional])
  H1: Men make more than women
  H0: Men make the same or less than women

Testing Hypotheses with Samples: Statistic vs. Statistic – Chi Square

You’ve already done this when you were calculating chi square for contingency tables!

Now, you are just adding on the proper way to hypothesize

Because chi square looks for the existence of a relationship based on the “difference” between observed and expected, your null hypothesis is always that there is “no difference”

Previous Example: Calculations for Expected Frequencies

Competence  Hierarchy  Observed  Expected           (O-E)²/E
Low         Low        113       .50 x 152 = 76.0   18.01
Low         Medium     31        .40 x 152 = 60.8   14.61
Low         High       8         .10 x 152 = 15.2   3.41
Medium      Low        60        .50 x 159 = 79.5   4.78
Medium      Medium     91        .40 x 159 = 63.6   11.8
Medium      High       8         .10 x 159 = 15.9   3.93
High        Low        27        .50 x 89 = 44.5    6.88
High        Medium     38        .40 x 89 = 35.6    .16
High        High       24        .10 x 89 = 8.9     25.62
Total                  400       400                89.2 (CHI-SQUARE!)

Previous Example

H1: Hierarchy is related to competence
H0: Hierarchy is not related to competence

If chi square is higher than the critical value, you reject the null hypothesis and accept the research hypothesis
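The whole calculation can be sketched in a few lines. This is only an illustration: the function name `chi_square` is made up, and the critical value 9.488 (alpha = .05, 4 degrees of freedom) is an assumption taken from a standard chi-square table, since the slides do not state which alpha or critical value was used.

```python
# Observed counts from the slide: rows = competence, columns = hierarchy,
# both ordered Low, Medium, High.
observed = [
    [113, 31, 8],
    [60, 91, 8],
    [27, 38, 24],
]

# Assumption: alpha = .05 with (3 - 1) x (3 - 1) = 4 degrees of freedom,
# critical value read from a standard chi-square table.
CHI_CRITICAL_05_DF4 = 9.488


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count = row total x column share (e.g. .50 x 152).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


chi_sq, df = chi_square(observed)
reject_null = chi_sq > CHI_CRITICAL_05_DF4
print(f"chi-square = {chi_sq:.1f}, df = {df}, reject H0: {reject_null}")
```

The statistic comes out at about 89.2, matching the table total, so under the assumed critical value the null hypothesis of "no difference" is rejected.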

Testing Hypotheses with Samples: Statistic vs. Statistic – t-test for two sample means

Tests whether the means of two groups are statistically different from each other

Let’s look at some graphs so we may visually understand what it is we are looking at

Statistical Analysis

[Figure: distributions for the control group and the treatment group, each with its group mean marked]

Is there a difference?

What Does Difference Mean?

[Figure: three pairs of distributions labeled medium variability, high variability, and low variability]

The mean difference is the same for all three cases.

Which one shows the greatest difference?

What Does Difference Mean?

A statistical difference is a function of the difference between means relative to the variability.

A small difference between means with large variability could be due to chance.

[Figure: the low variability case highlighted]

What Do We Estimate?

[Figure: the low variability case, with the difference between the group means marked as signal and the spread of the groups marked as noise]

$$\frac{\text{Signal}}{\text{Noise}} = \frac{\text{Difference between group means}}{\text{Variability of groups}} = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)} = t\text{-value}$$


The Ware County librarian wants to increase circulation from the Ware County bookmobiles

The librarian thinks that poster ads in areas where the bookmobiles stop will attract more browsers and increase circulation

To test this idea, the librarian sets up an experiment

Ten bookmobile routes are selected at random

On those routes, poster ads are posted with bookmobile information

Ten other bookmobile routes are selected at random

On those routes, no advertising is done


Step 1 - Hypotheses
The null hypothesis is that the mean circulation of the experimental group is not higher than the mean circulation of the control group

The research hypothesis is that the mean circulation of the experimental group is higher than the mean circulation of the control group

Step 2 – Alpha
.05


The following data is obtained:

Librarian’s Data

Books               Experimental Group  Control Group
Mean                526                 475
Standard Deviation  125                 115


STEP 3
Calculate the s.e. for each group

Experimental group: 125/sqrt(10) = 39.5
Control group: 115/sqrt(10) = 36.4




STEP 4
Create a “pooled” standard error

$$\text{s.e.}_d = \sqrt{\text{s.e.}_1^2 + \text{s.e.}_2^2} = \sqrt{39.5^2 + 36.4^2} = 53.7$$


STEP 5
Subtract the mean of the second group from the first

Then divide by the pooled error

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\text{s.e.}_d} = \frac{526 - 475}{53.7} = .95$$


STEP 6
Degrees of Freedom equals n1 + n2 – 2

10 + 10 – 2 = 18



STEP 7
Look it up and reject or fail to reject the null hypothesis

Critical value for 18 df at the .05 level of significance is 1.734

The computed t does not reach that value, so we fail to reject the null hypothesis

We can’t say that the advertising increased book circulation
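Steps 3 through 7 can be retraced in code as a rough sketch. It follows the slides' own method (an s.e. for each group, combined as the square root of the sum of squares), uses the one-tailed critical value 1.734 copied from the slide, and the function name `two_sample_t` is made up for illustration.

```python
import math

# Summary statistics from the librarian's data on the slides.
N_EXPERIMENTAL, MEAN_EXPERIMENTAL, SD_EXPERIMENTAL = 10, 526, 125
N_CONTROL, MEAN_CONTROL, SD_CONTROL = 10, 475, 115

# One-tailed critical t for 18 df at alpha = .05, copied from the slide.
T_CRITICAL_ONE_TAILED = 1.734


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees of freedom, pooled standard error)."""
    se1 = sd1 / math.sqrt(n1)               # Step 3: s.e. for each group
    se2 = sd2 / math.sqrt(n2)
    se_d = math.sqrt(se1 ** 2 + se2 ** 2)   # Step 4: "pooled" standard error
    t = (mean1 - mean2) / se_d              # Step 5: difference / pooled error
    df = n1 + n2 - 2                        # Step 6: degrees of freedom
    return t, df, se_d


t_value, df, se_d = two_sample_t(
    MEAN_EXPERIMENTAL, SD_EXPERIMENTAL, N_EXPERIMENTAL,
    MEAN_CONTROL, SD_CONTROL, N_CONTROL,
)

# Step 7: one-tailed test, so only a large positive t rejects the null.
reject_null = t_value > T_CRITICAL_ONE_TAILED
print(f"s.e.d = {se_d:.1f}, t = {t_value:.2f}, df = {df}, reject H0: {reject_null}")
```

The sketch gives a pooled standard error of about 53.7 and a t of about .95 with 18 df, short of 1.734, so the null hypothesis is not rejected.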




What if we wanted to just see if advertising had any effect?

What values would we be comparing then?

t would have to be outside the range of ±2.10

How did we get that?




In-Class Exercise

Testing Hypotheses with Samples: Statistic vs. Statistic – t-test for proportions

Because we can figure out standard errors for proportions (like we did last 2 weeks), we can use a t-test to also compare two groups’ proportions

The formulas are the same; the only difference is the calculation of the standard deviation from the proportions

If you’re trying to see if there is a statistical difference between two groups on whether or not they support an amendment to the state constitution, it would look like this

Once you have the standard deviation, you do everything the same as when comparing means

                    Group A  Group B
n                   50       50
For                 60%      40%
p                   .60      .40
s = sqrt(p(1-p))    .49      .49
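Carrying the example through the same steps used for means might look like the sketch below. It assumes the slide's instruction to "do everything the same as when comparing means" is applied literally (an s.e. for each group, then the square root of the sum of squares); the function names are made up for illustration.

```python
import math


def proportion_sd(p):
    """Standard deviation of a yes/no variable with proportion p in favor."""
    return math.sqrt(p * (1 - p))


def two_proportion_t(p1, n1, p2, n2):
    """Return (t, degrees of freedom, sd of group 1, sd of group 2)."""
    s1, s2 = proportion_sd(p1), proportion_sd(p2)
    se1 = s1 / math.sqrt(n1)                # s.e. for each group
    se2 = s2 / math.sqrt(n2)
    se_d = math.sqrt(se1 ** 2 + se2 ** 2)   # "pooled" standard error
    t = (p1 - p2) / se_d                    # difference / pooled error
    df = n1 + n2 - 2                        # degrees of freedom
    return t, df, s1, s2


# Group A: n = 50, 60% for. Group B: n = 50, 40% for.
t_value, df, s_a, s_b = two_proportion_t(0.60, 50, 0.40, 50)
print(f"s_A = {s_a:.2f}, s_B = {s_b:.2f}, t = {t_value:.2f}, df = {df}")
```

The resulting t is about 2.04 with 98 df; the last step is to compare it with the critical value from the t-table for the chosen alpha and number of tails.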


In-Class Worksheet

Homework

Write 2 scenarios and analyses (make them PA relevant), one for a comparison of sample means, another for a comparison of sample proportions

Make up the problem descriptions and data
Emailed to me by Halloween Midnight
No class Halloween night
