WMI 606: Research Methods Course

Hypothesis setting and testing: session 4

Thumbi Mwangi

1 Paul G. Allen School for Global Animal Health, Washington State University
2 KEMRI - Center for Global Health Research
3 Wangari Maathai Institute, University of Nairobi

February 11th, 2015

Introduction to hypothesis setting and testing

http://www.sagepub.com/upm-data/40007 Chapter8.pdf

Definitions:

Hypothesis: an “educated guess”, based on prior knowledge or observation, that explains a phenomenon

Statement or prediction of the relationship between two or more factors

Provisional explanation of something - needs to be tested!

.......the jury analogy........

A person is assumed innocent until proven guilty

The jury decides if the defendant is guilty or not guilty

.......the jury analogy........

The decision is NOT on whether the defendant is guilty or innocent

The prosecutor must present evidence in a trial that shows the defendant is guilty

The evidence either shows guilt (decision: guilty) or does not (decision: not guilty).

.....hypothesis setting........

review available evidence before setting a hypothesis

what do you think is true based on the available evidence?

show logic in your hypothesis

include the “why” in your hypothesis

Hypothesis testing: method for testing a claim/hypothesis about a parameter in a population, using data measured in the sample

the purpose is to rule out chance (sampling error) as a plausible explanation for the study results

.....determine whether a claim is true.........

The average height of Kenyan men is 5 feet 3 inches

the population mean

You select a sample of Kenyan men and calculate their average height = 5 feet 8 inches

the sample mean

.....2 plausible explanations?.....

There are two possible explanations for the observed difference:

1 the difference between the sample mean and the population mean is due to sampling error

2 the difference between the sample mean and the population mean is too large to be explained by sampling error

the claim on average height of Kenyan men is not supported by your data
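How large sampling error can plausibly be is easy to explore by simulation. The sketch below is illustrative only: the population standard deviation (3 inches), the sample size (25) and the number of repeated samples (2000) are assumptions, not values from the slides. Heights are expressed in inches (5 feet 3 inches = 63).

```python
import random
from statistics import mean, stdev

# Assumed values for illustration (not from the slides), in inches
POP_MEAN = 63.0   # claimed population mean: 5 feet 3 inches
POP_SD = 3.0      # hypothetical population standard deviation
N = 25            # hypothetical sample size

rng = random.Random(0)  # fixed seed so the run is repeatable

def draw_sample_mean():
    """Mean height of one random sample of N men from the claimed population."""
    return mean(rng.gauss(POP_MEAN, POP_SD) for _ in range(N))

# Repeat the sampling many times to see how much sample means vary by chance
sample_means = [draw_sample_mean() for _ in range(2000)]

print(f"average of the sample means: {mean(sample_means):.2f}")
print(f"spread (SD) of the sample means: {stdev(sample_means):.2f}")
print(f"largest sample mean seen: {max(sample_means):.2f}")
```

Under these assumptions the sample means scatter by roughly POP_SD / sqrt(N) = 0.6 inches around 63, so a sample mean of 68 (5 feet 8 inches) would be very hard to attribute to sampling error alone.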

......which of the 2 plausible explanations do we go with...

The hypothesis to be tested is given the symbol H0, and is commonly referred to as the Null Hypothesis

it is assumed to be true, unless there is strong enough evidence against it

is the average height of Kenyan men 5 feet 3 inches?

.....is there a treatment/intervention effect?....

In experiments - we are interested in knowing:

Does the treatment under investigation have an effect on the population mean?

Take an example:

To determine the impact that community focus group trainings on environmental conservation have on environmental governance indices

.....is there a treatment/intervention effect?....

The treatment:

Training of communities on matters of environmental conservation (through focus groups)

The outcome:

Increase or decrease in the environmental governance indices

Comparing:

Governance index score in communities not trained (the population mean) and those trained (the sample mean)

.....is there a treatment effect....

Does the training significantly affect the mean of the population's governance index?

or are the observed differences (between population and sample means) the result of sampling error?

.....is there a treatment effect ....

The NULL hypothesis?

Focus group trainings on environmental conservation do not have an effect on governance indices

.....hypothesis testing .......

a) Step 1: Select the “cut-off” point - the level of significance; the criterion for making a decision about the null hypothesis

α = 0.05,

α = 0.01,

α = 0.001.

.....hypothesis testing step 1: select the “cut-off” point....


b) Step 2: Identify the “critical region”

outcomes that are very unlikely to occur if the null hypothesis is true

Governance indices (of the sample) that are very unlikely to occur/to be observed if focus group trainings on environmental conservation have no effect
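For a chosen level of significance, the boundary of the critical region can be sketched as follows. This assumes a test statistic that follows the standard normal distribution (a z-test) and, by default, a two-tailed test; neither choice is stated in the slides, and the function name `critical_z` is only illustrative.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Return the z cut-off beyond which outcomes fall in the critical region.

    Assumes the test statistic is standard normal when H0 is true.
    """
    # A two-tailed test splits alpha equally between the two tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: reject H0 if |z| > {critical_z(alpha):.2f}")
```

The smaller the level of significance, the further out the cut-off lies, and the more extreme a sample outcome has to be before the null hypothesis is rejected.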


.....critical region for α level 0.05....

....computing the test statistic.......

c) Step 3: Compute the test statistic to determine how far - how many standard deviations - a sample mean is from the population mean

The standard deviation: a measure of the dispersion of a set of data around the mean
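One common form of such a test statistic is the one-sample z statistic, sketched below. It assumes the population standard deviation is known, and it measures distance in units of the standard deviation of the sample mean (the standard error). The numbers plugged in (standard deviation 3 inches, sample size 36) are hypothetical and not from the slides.

```python
import math

def z_statistic(sample_mean, population_mean, population_sd, n):
    """How many standard errors the sample mean lies from the population mean."""
    # The standard error is the standard deviation of the sample mean
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Height example in inches: 5 ft 3 in = 63, 5 ft 8 in = 68
# population_sd and n are assumed values for illustration only
z = z_statistic(sample_mean=68, population_mean=63, population_sd=3, n=36)
print(f"z = {z:.1f}")
```

With these assumed inputs the standard error is 0.5 inches, so the sample mean lies 10 standard errors above the claimed population mean.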


For a normal distribution:

68% of the distribution is within 1 standard deviation of the mean.

95.4% is within 2 standard deviations

over 99% is within 3 standard deviations.

Most scientists use a cut-off of about 2 standard deviations (1.96 for a two-tailed test at α = 0.05) to decide whether to reject or retain the null hypothesis
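These percentages can be checked directly from the standard normal distribution. A minimal sketch using Python's standard library (the helper name `coverage` is illustrative):

```python
from statistics import NormalDist

def coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k):.1%}")
```

The results are roughly 68.3%, 95.4% and 99.7%, which is why a cut-off near 2 standard deviations corresponds to the 0.05 level of significance.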


....reject or retain the null hypothesis?.......

d) Step 4: Decide whether to reject or retain the null hypothesis

A large value of the test statistic shows the mean difference is more than would be expected if there was no treatment effect

If the value falls within the critical region, the conclusion is that the difference is significant

i.e. the focus group trainings have an effect on governance indices

Null hypothesis is rejected

....reject or retain the null hypothesis?.......

The p-value is the probability of obtaining a sample mean at least as extreme as the one observed, given that the value stated in the null hypothesis is true.

The p-value for the sample outcome is compared to the level of significance - set in Step 1

When the p-value is < 0.05 - reject the NULL hypothesis

When the p-value is > 0.05 - retain the NULL hypothesis
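The decision rule can be sketched in a few lines, assuming a two-tailed z-test (the test statistic is standard normal when the null hypothesis is true). The function names and the example z values are illustrative, not from the slides.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a z statistic at least this extreme, in either direction, if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Compare the p-value with the level of significance chosen in Step 1."""
    return "reject H0" if p_value < alpha else "retain H0"

for z in (1.0, 2.5):  # hypothetical test statistics
    p = two_tailed_p_value(z)
    print(f"z = {z}: p = {p:.3f} -> {decide(p)}")
```

A test statistic of 1.0 gives a p-value of about 0.32, so the null hypothesis is retained; a test statistic of 2.5 gives a p-value of about 0.012, so it is rejected at the 0.05 level.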

.....Errors with hypothesis testing....

Differences between the sample mean and the population mean may not always be due to a treatment effect

Sampling error may cause such differences

There is the risk that misleading data may cause the hypothesis test to reach a wrong conclusion

Two possible types of errors:

Type I error
Type II error

.....Type 1 error....

Type I error

Sample data shows a treatment effect, where in fact, there is none!
You reject the NULL hypothesis - a false conclusion

Causes:

Researcher selected an extreme sample - one that already falls in the critical region
E.g. a sample of basketball players in order to test the hypothesis that the average height of Kenyans is 5 feet 3 inches
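Even with properly random samples, an extreme sample turns up by chance now and then. The simulation below is a sketch of that idea: it draws many random samples from a population in which the null hypothesis is true and counts how often a two-tailed z-test at α = 0.05 rejects it anyway. The population standard deviation, sample size and number of trials are assumed values, not from the slides.

```python
import math
import random
from statistics import NormalDist, mean

def rejects_h0(sample, mu0, sigma, alpha=0.05):
    """Two-tailed z-test: True if the sample leads to rejecting H0 (mean == mu0)."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_mean, mu0=63.0, sigma=3.0, n=30, trials=5000, seed=1):
    """Share of random samples that reject H0; all defaults are illustrative."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if rejects_h0(sample, mu0, sigma):
            rejections += 1
    return rejections / trials

# H0 is true here (true mean equals mu0), so every rejection is a Type I error
type_1_rate = rejection_rate(true_mean=63.0)
print(f"Type I error rate: {type_1_rate:.3f}")
```

The rate comes out close to 0.05: the level of significance is the Type I error risk the researcher accepts when setting the cut-off.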



Type II error

Sample data does not show a treatment effect, where in fact, the treatment does have an effect!
You FAIL to reject the NULL hypothesis - a false conclusion

Causes:

Very small treatment effects
Although the treatment has an effect, it is not large enough to be detected by the study
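How likely a small effect is to be missed can be worked out for a simple case. The sketch below assumes a two-tailed z-test with a known population standard deviation; the effect size (0.5), standard deviation (3) and sample sizes are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def type_2_error_rate(effect, sigma, n, alpha=0.05):
    """Probability of retaining H0 when the true mean differs from it by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # With a real effect, the z statistic is centred on `shift` instead of 0
    shift = effect / (sigma / math.sqrt(n))
    # H0 is retained whenever z lands between -z_crit and +z_crit
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

for n in (30, 300):
    beta = type_2_error_rate(effect=0.5, sigma=3.0, n=n)
    print(f"n = {n}: Type II error rate = {beta:.2f}")
```

Under these assumptions the small effect is missed about 85% of the time with 30 observations and about 18% of the time with 300, so a larger sample is one way to make a small treatment effect detectable.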


.....reading on hypothesis testing....

http://www.sagepub.com/upm-data/40007 Chapter8.pdf
