100 Fundamentals of Hypothesis Testing 1

1/31

Hypothesis testing

Behavioural Science II

Week 1, Semester 2, 2002


2/31

Hypothesis testing

The null hypothesis is that there is no systematic relationship between independent variables (IVs) and dependent variables (DVs).

The research hypothesis is that any relationship observed in the data is real.


3/31


Hypothesis testing

Whereas the research hypothesis tends to be imprecise about numerical differences between groups (e.g., difference in reaction times), the null hypothesis states very specifically that the difference should be zero.


4/31


Null hypothesis versus alternative hypothesis

The null hypothesis assumes that scores for different levels of the IV are random samples from the same population.

The alternative hypothesis is that the samples come from different populations.


5/31


Null hypothesis versus alternative hypothesis

For any single experiment, we are bound to see a difference, just as we see a difference between the means of two random samples in a distribution of sample means.

If the null hypothesis is true, then the mean scores differ only because they come from two random samples drawn from the same population.
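A small simulation can make this concrete. The sketch below draws two random samples from a single normal population and records the difference between their means. The population values (mean 100, SD 15), the sample size (9), and the function name are illustrative assumptions, not part of the lecture.

```python
import random
import statistics


def null_mean_difference(n=9, mu=100.0, sigma=15.0, seed=0):
    """Difference between the means of two samples drawn from ONE population."""
    rng = random.Random(seed)
    group_a = [rng.gauss(mu, sigma) for _ in range(n)]
    group_b = [rng.gauss(mu, sigma) for _ in range(n)]
    return statistics.mean(group_a) - statistics.mean(group_b)


# Repeat the "experiment" many times; the null hypothesis is true in every run.
differences = [null_mean_difference(seed=s) for s in range(1000)]
print("smallest difference:", round(min(differences), 2))
print("largest difference:", round(max(differences), 2))
print("average difference:", round(statistics.mean(differences), 2))
```

Individual runs show sizeable differences in either direction even though no real effect exists, while the average difference across runs stays close to zero.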


6/31


Testing the null hypothesis

A statistical test assesses the probability of obtaining a given sample or samples of scores, assuming the null hypothesis is correct.


7/31


Testing the null hypothesis

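If the probability is low enough (e.g., p < .05), then the null hypothesis is rejected.

If the probability is not low enough (e.g., p > .05), then the null hypothesis is not rejected but retained, and the IV is deemed to have no effect (i.e., the observed changes are attributed to chance).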


8/31


Statistical significance

Statistical significance refers to the probability of the data obtained, given that the null hypothesis is true.

A statistically significant result does not mean that the null hypothesis is improbable.

There is an ongoing gap between statistical significance and substantive significance.


9/31


Hypothesis testing and sampling distributions

The decision to reject or not reject the null hypothesis usually is made with reference to the sampling distribution of a statistic of some kind (e.g., z-distribution, t-distribution).


10/31


Example of hypothesis testing using z-distribution

Null hypothesis population parameters:

$\mu = 100$, $\sigma = 15$

Random sample statistics:

Mean = 110

N = 9


11/31


Applying formulae

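$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} = \frac{15}{\sqrt{9}} = \frac{15}{3} = 5$$

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{110 - 100}{5} = \frac{10}{5} = 2$$

Given that a z-score of 1.96 corresponds to p < .05 (two-tailed), we would reject the null hypothesis.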
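The same arithmetic can be checked in a few lines of Python. This is a minimal sketch using the values from the z example (sample mean 110, N = 9, null population mean 100 and standard deviation 15); the function name and the two-tailed p-value calculation are additions for illustration.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n):
    """Return the standard error, the z-score, and the two-tailed p-value."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu) / standard_error
    # Two-tailed p: area beyond |z| in both tails of the standard normal curve.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return standard_error, z, p_two_tailed


se, z, p = z_test(sample_mean=110, mu=100, sigma=15, n=9)
print("standard error:", se)
print("z:", z)
print("two-tailed p:", round(p, 4))
print("reject null at .05:", abs(z) > 1.96)
```

The obtained z of 2 lies just beyond the 1.96 cut-off, which is why the null hypothesis is rejected.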


12/31


Example of hypothesis testing using t-distribution

Null hypothesis population parameters:

$\mu = 100$

Random sample statistics:

Mean = 110

N = 9

$\sum x^2 = 960$


13/31


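Applying formulae

$$s = \sqrt{\frac{\sum x^2}{N - 1}} = \sqrt{\frac{960}{9 - 1}} = \sqrt{\frac{960}{8}} = 10.95$$

$$s_{\bar{X}} = \frac{s}{\sqrt{N}} = \frac{10.95}{\sqrt{9}} = \frac{10.95}{3} = 3.65$$

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{110 - 100}{3.65} = \frac{10}{3.65} = 2.74$$

Given that a t-score of 2.306 (df = 8) corresponds to p < .05 (two-tailed), we would reject the null hypothesis.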
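The t calculation can be reproduced as follows. This is a sketch under the assumption that the value 960 is the sum of squared deviations from the sample mean; the critical value 2.306 for df = 8 is taken from the slide rather than computed, and the function name is hypothetical.

```python
import math

T_CRITICAL_DF8 = 2.306  # two-tailed .05 cut-off for df = 8, as quoted on the slide


def one_sample_t(sample_mean, mu, sum_sq_dev, n):
    """Return the estimated SD, the estimated standard error, and t."""
    s = math.sqrt(sum_sq_dev / (n - 1))    # estimated population SD
    standard_error = s / math.sqrt(n)      # estimated SE of the mean
    t = (sample_mean - mu) / standard_error
    return s, standard_error, t


s, se, t = one_sample_t(sample_mean=110, mu=100, sum_sq_dev=960, n=9)
print("s:", round(s, 2))
print("standard error:", round(se, 2))
print("t:", round(t, 2))
print("reject null at .05:", abs(t) > T_CRITICAL_DF8)
```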


14/31


Hypothesis testing using confidence intervals

We reject the null hypothesis when the null population mean lies outside the confidence interval.

We infer that the alternative population mean is higher than the null population mean if the lower limit of the confidence interval is to the right of the null population mean, and lower if the upper limit of the confidence interval is to the left of the null population mean.
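The decision rule can be sketched in code. The example reuses the figures from the t-distribution example (sample mean 110, standard error 3.65, critical t of 2.306, null mean 100); the function name and the wording of the returned decisions are illustrative assumptions.

```python
def ci_decision(sample_mean, standard_error, critical_value, null_mean):
    """Build a confidence interval and compare it with the null population mean."""
    margin = critical_value * standard_error
    lower, upper = sample_mean - margin, sample_mean + margin
    if lower > null_mean:
        decision = "reject null: alternative mean is higher"
    elif upper < null_mean:
        decision = "reject null: alternative mean is lower"
    else:
        decision = "retain null"
    return lower, upper, decision


lower, upper, decision = ci_decision(110, 3.65, 2.306, 100)
print("95% CI:", round(lower, 2), "to", round(upper, 2))
print(decision)
```

Here the whole interval sits to the right of 100, so the null hypothesis is rejected and the alternative population mean is inferred to be higher.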


15/31


Errors in hypothesis testing

Given the gap between statistical and substantive significance, a decision based on probability to retain or reject the null hypothesis can be wrong.


16/31


When null hypothesis is true (Type I error)

When the null hypothesis is true and it is rejected, this decision is called a Type I error.

The probability of making such an error is designated alpha ($\alpha$) and is equivalent to the significance level (e.g., p < .05).


17/31



When null hypothesis is true (Type I error)

If the null hypothesis is true and the alpha level is set at .05, then the null hypothesis will be rejected 5% of the time even though it is true.

One way to safeguard against a Type I error is to set a more stringent alpha level (e.g., a cut-off below .05).
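The 5% figure can be illustrated by simulation. The sketch below repeatedly samples from a population in which the null hypothesis is true and counts how often a two-tailed z-test rejects it. The population values (mean 100, SD 15), sample size (9), number of trials, and function name are assumptions chosen for illustration.

```python
import math
import random


def type_i_error_rate(z_crit=1.96, trials=20000, n=9, mu=100.0, sigma=15.0, seed=1):
    """Proportion of true-null experiments in which the null is rejected."""
    rng = random.Random(seed)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        z = (sample_mean - mu) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate_05 = type_i_error_rate(z_crit=1.96)    # cut-off for alpha = .05
rate_01 = type_i_error_rate(z_crit=2.576)   # cut-off for alpha = .01
print("rejection rate at alpha = .05:", rate_05)
print("rejection rate at alpha = .01:", rate_01)
```

With the stricter cut-off, far fewer true null hypotheses are rejected.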


18/31



When null hypothesis is false (Type II or III errors)

When the alternative hypothesis is true, and the statistic (mean) from the alternative distribution falls within the cut-off points (i.e., p > .05), then the null hypothesis would be retained.


19/31


Type II error

Retaining the null hypothesis when the alternative hypothesis is true is called a Type II error.

The probability of making a Type II error usually is symbolized as beta ($\beta$).

The size of beta depends on how much the alternative hypothesis sampling distribution overlaps the retention region of the null hypothesis sampling distribution.
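The overlap can be computed directly when both sampling distributions are assumed normal. In this sketch the null mean (100), standard deviation (15), and sample size (9) are illustrative, and the alternative population means (110 and 105) are hypothetical values chosen only to show how beta changes.

```python
import math
from statistics import NormalDist


def type_ii_error(mu_null, mu_alt, sigma, n, alpha=0.05):
    """Area of the alternative sampling distribution inside the retention region."""
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu_null - z_crit * standard_error   # lower cut-off of retention region
    upper = mu_null + z_crit * standard_error   # upper cut-off of retention region
    alternative = NormalDist(mu_alt, standard_error)
    return alternative.cdf(upper) - alternative.cdf(lower)


beta_far = type_ii_error(mu_null=100, mu_alt=110, sigma=15, n=9)
beta_near = type_ii_error(mu_null=100, mu_alt=105, sigma=15, n=9)
print("beta when alternative mean is 110:", round(beta_far, 3))
print("beta when alternative mean is 105:", round(beta_near, 3))
```

The closer the alternative mean sits to the null mean, the larger the overlap and the larger beta becomes.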


20/31


Type III error

It is also possible to make a Type III error, by rejecting a null hypothesis but inferring the incorrect alternative hypothesis.

The probability of making a Type III error usually is symbolized as gamma ($\gamma$) and is equivalent to whatever percentage of scores in the alternative distribution falls in the far end of the null hypothesis distribution. The probability of making a Type III error is usually quite small.
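To see how small gamma typically is, the sketch below computes the share of a normal alternative sampling distribution that falls in the rejection tail on the wrong side of the null mean. All numbers (null mean 100, hypothetical alternative mean 110, SD 15, N = 9) and the function name are assumptions for illustration.

```python
import math
from statistics import NormalDist


def type_iii_error(mu_null, mu_alt, sigma, n, alpha=0.05):
    """Probability of rejecting the null in the tail opposite to the true effect.

    Assumes mu_alt differs from mu_null; otherwise the null is true and no
    Type III error is defined.
    """
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu_null - z_crit * standard_error
    upper = mu_null + z_crit * standard_error
    alternative = NormalDist(mu_alt, standard_error)
    if mu_alt > mu_null:
        return alternative.cdf(lower)       # true mean is higher, rejected as lower
    return 1 - alternative.cdf(upper)       # true mean is lower, rejected as higher


gamma = type_iii_error(mu_null=100, mu_alt=110, sigma=15, n=9)
print("gamma:", gamma)
```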


21/31


The power of a test

Power is the probability of rejecting a false null hypothesis and correctly inferring the position or direction of the alternative hypothesis with respect to the null hypothesis.

Factors affecting power and error rates


22/31


Power is affected by significance (alpha) level

Setting a less stringent significance level increases the discriminatory power of the statistical test, as long as the alternative hypothesis is true.
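The effect of the alpha level on power can be sketched for a two-tailed z-test. The null mean (100), hypothetical alternative mean (110), standard deviation (15), and sample size (9) are illustrative assumptions, and power is counted only for rejections in the correct direction.

```python
import math
from statistics import NormalDist


def power(mu_null, mu_alt, sigma, n, alpha):
    """Probability of rejecting the null in the direction of the true effect."""
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    alternative = NormalDist(mu_alt, standard_error)
    if mu_alt > mu_null:
        return 1 - alternative.cdf(mu_null + z_crit * standard_error)
    return alternative.cdf(mu_null - z_crit * standard_error)


powers = {alpha: power(100, 110, 15, 9, alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, value in powers.items():
    print("alpha =", alpha, "power =", round(value, 3))
```

Moving from .01 to .10 raises power, at the cost of a higher Type I error rate when the null hypothesis is in fact true.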


23/31


Power is affected by magnitude of difference between sample means

Increasing the difference in the size of the mean at differing levels of the IV increases the power of the test.


24/31


Power is affected by sample size

An increase in sample size increases the power of the test, if the alternative hypothesis is true.

This is because as sample size increases, the standard error of the mean decreases, thus reducing the overlap between the null and alternative sampling distributions.
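The link between sample size, standard error, and power can be tabulated. This sketch assumes a two-tailed z-test at alpha = .05 with a null mean of 100, a hypothetical alternative mean of 110, and a standard deviation of 15; the sample sizes are arbitrary examples.

```python
import math
from statistics import NormalDist


def se_and_power(n, mu_null=100.0, mu_alt=110.0, sigma=15.0, alpha=0.05):
    """Standard error of the mean and power (upper tail) for sample size n."""
    standard_error = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    upper_cutoff = mu_null + z_crit * standard_error
    # Power: share of the alternative sampling distribution beyond the cut-off.
    test_power = 1 - NormalDist(mu_alt, standard_error).cdf(upper_cutoff)
    return standard_error, test_power


results = {n: se_and_power(n) for n in (9, 16, 36)}
for n, (standard_error, test_power) in results.items():
    print("N =", n, "SE =", standard_error, "power =", round(test_power, 3))
```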


25/31


Effect size

In order to gauge the effect of the IV, it makes sense to contrast the difference between the population mean for the null hypothesis and the population mean for the alternative hypothesis.


26/31


Effect size formula

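$$\text{Effect size} = \frac{\mu_0 - \mu_1}{\sigma}$$

where $\mu_0$ and $\mu_1$ are the population means for the null and alternative hypotheses, and $\sigma$ is the standard deviation of the population of dependent measure scores.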
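As a worked illustration, the sketch below assumes the effect size is the difference between the null and alternative population means divided by the population standard deviation. The means (100 and a hypothetical 110) and the standard deviation (15) are example values, and the function name is an assumption.

```python
def effect_size(mu_null, mu_alt, sigma):
    """Standardized difference between the null and alternative population means."""
    return (mu_null - mu_alt) / sigma


d = effect_size(mu_null=100, mu_alt=110, sigma=15)
print("effect size:", round(d, 2))
print("magnitude:", round(abs(d), 2))
```

The sign only reflects which population mean is larger; the magnitude expresses the difference in standard deviation units.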


27/31


Judging effect sizes

According to Cohen (1988)

.20 = small effect size

.50 = medium effect size

.80 = large effect size
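These benchmarks can be turned into a simple lookup. The sketch treats each benchmark as a lower bound and compares magnitudes only; how to label values that fall between or below the benchmarks is an assumption made here, not something the guidelines specify.

```python
def judge_effect_size(d):
    """Label an effect size using the .20 / .50 / .80 benchmarks as lower bounds."""
    magnitude = abs(d)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "medium"
    if magnitude >= 0.20:
        return "small"
    return "below small"


for value in (0.10, 0.20, 0.50, -0.67, 0.80):
    print(value, "->", judge_effect_size(value))
```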


28/31


Do we really need the null hypothesis?

A significant test of the null hypothesis does not mean the data are not a product of chance.

The significant result may simply be a Type I error (falsely rejecting the null hypothesis).


29/31


Do we really need the null hypothesis?

Better to test the research hypothesis, if we know the size and direction of the effect.

Even better, report a combination of outcome values (e.g., effect sizes, confidence intervals, strength of relationship).


30/31


One-tailed versus two-tailed tests

Conventionally, we reject the null hypothesis if the obtained z-score or t-score falls beyond certain values in either tail of the relevant sampling distribution (i.e., a two-tailed test).

In specific contexts, a one-tailed test might seem appropriate (e.g., reject the null hypothesis only if the test statistic falls in the 5% left-hand tail of the distribution).
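The difference between the two approaches shows up in the p-values. This sketch computes one-tailed and two-tailed p-values for a z-score under the standard normal distribution; the z-scores of -1.7 and 1.7 are hypothetical values chosen to fall between the one-tailed and two-tailed cut-offs.

```python
from statistics import NormalDist


def p_values(z):
    """Two-tailed, lower-tail, and upper-tail p-values for a z-score."""
    area_below = NormalDist().cdf(z)
    return {
        "two_tailed": 2 * min(area_below, 1 - area_below),
        "lower_tail": area_below,
        "upper_tail": 1 - area_below,
    }


negative = p_values(-1.7)
positive = p_values(1.7)
print("z = -1.7:", {k: round(v, 4) for k, v in negative.items()})
print("z = +1.7:", {k: round(v, 4) for k, v in positive.items()})
```

A z of -1.7 is significant at .05 on a left-tailed test but not on a two-tailed test, while a z of +1.7 in the unpredicted direction is not significant on the left-tailed test at all.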


31/31


One-tailed versus two-tailed tests

Generally, two-tailed tests are preferred to one-tailed tests.

The IV may have an effect in the opposite direction to the one predicted.
