Evaluation Methods and Statistics Professor Andrew Howes School of Computer Science University of Birmingham

Evaluation Methods and Statistics - University of Birminghamaxj/pub/teaching/2016-7/stats/EMS2.pdf · Evaluation Methods and Statistics ... Stroop (1935) experimental ... • The


Evaluation Methods and Statistics

Professor Andrew Howes

School of Computer Science, University of Birmingham


What did the crocodile swallow in Peter Pan?


• Your mind has now primed “Google” (Sparrow et al., 2011).

• The neurons that represent “Google” have changed as a consequence of being asked “hard” questions.

• For a short while they are more “active”, meaning that their biochemistry has prepared them for action.

• This is true even though I did not mention Google.

• Sparrow, B., Liu, J. & Wegner, D.M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333, 776-778.


why is this interesting?

• It is one piece of evidence that suggests that people do not just use the internet to find information that they do not know

• but in addition

• they remember less and instead remember how to find it.

• this is called ‘transactive’ memory.

• similarly couples implicitly divide up everyday memory tasks.


claims and evidence

• the main question for us is how we support these sorts of claims with evidence.

• to answer this question we need to take a step back into the history of experimental psychology…


In what follows, quickly name the colour of the ink.

Do not read the word.


GREEN


GREEN


The Stroop Effect

• named after J. Ridley Stroop.

• the task is to report the colour of the ink as quickly as possible without reading the words.

• Stroop claimed that, on average, it takes longer to report colours of incongruent stimuli than those that are congruent.


Stroop (1935) experimental stimuli

• congruent = the word names the colour of the ink.

• incongruent = the word names a colour different from the colour of the ink.

GREEN

GREEN


theory

• why is the effect thought to occur?

• top-down control over information processing is limited.

• humans appear incapable of entirely switching off word reading when words are presented in the visual field.

• words are sometimes read more quickly than colours can be reported.

• people are more experienced at reading words than at saying the colour of the ink.


experimental design

• The hypothesis concerns the relative effect of congruent and incongruent colour words on Reaction Time (RT).

• The hypothesis concerns a population. Sometimes, this is all humans.

• From the population we take a sample of size N participants.

• Stroop experiments typically use a within-participant design.

• In a within-participant design all participants take part in all conditions.

• The Stroop experiment has two conditions: One with congruent stimuli and the other with incongruent stimuli.


how is the experiment conducted?

• typically...

• with sequentially presented stimuli.

• multiple participants

• each participant receives both congruent and incongruent stimuli (a within-subject design).
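A minimal sketch of how a within-subject stimulus sequence might be built, so that every participant receives both congruent and incongruent stimuli. The colour set, the function names (`make_trials`, `make_experiment`) and the shuffled presentation order are assumptions for illustration; the slides do not say how the stimuli were ordered.

```python
import random

# Hypothetical colour set; the slides only show the word GREEN.
COLOURS = ["red", "green", "blue"]

def make_trials(n_per_condition, rng):
    """Build one participant's shuffled list of congruent and incongruent trials."""
    trials = []
    for _ in range(n_per_condition):
        # Congruent: the word names the ink colour.
        ink = rng.choice(COLOURS)
        trials.append({"condition": "congruent", "word": ink.upper(), "ink": ink})
        # Incongruent: the word names any colour other than the ink colour.
        ink = rng.choice(COLOURS)
        word = rng.choice([c for c in COLOURS if c != ink])
        trials.append({"condition": "incongruent", "word": word.upper(), "ink": ink})
    rng.shuffle(trials)  # assumed: stimuli are presented one at a time in random order
    return trials

def make_experiment(n_participants, n_per_condition, seed=0):
    """Within-subject design: every participant gets trials from both conditions."""
    rng = random.Random(seed)
    return {p: make_trials(n_per_condition, rng) for p in range(1, n_participants + 1)}

schedule = make_experiment(n_participants=3, n_per_condition=4)
```

Each entry of `schedule` holds the same number of trials per condition for that participant, which is what distinguishes this design from one where different people are assigned to different conditions.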


what do the data look like?


factors

• each response is represented on a separate row along with its factor levels.
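The one-row-per-response layout can be sketched as follows. The field names (`participant`, `condition`, `word`, `ink`, `correct`, `rt_ms`) and all values are hypothetical, since the data table on the original slide is not available here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Response:
    """One row: a single response together with its factor levels."""
    participant: int   # factor: which participant responded
    condition: str     # factor: "congruent" or "incongruent"
    word: str          # the word shown
    ink: str           # the ink colour of the word
    correct: bool      # whether the reported colour was right
    rt_ms: int         # reaction time in milliseconds

# Illustrative values only, not data from the lecture.
rows = [
    Response(1, "congruent", "GREEN", "green", True, 612),
    Response(1, "incongruent", "GREEN", "red", True, 845),
    Response(1, "incongruent", "BLUE", "green", False, 1130),
    Response(2, "congruent", "RED", "red", True, 590),
]

def levels(rows, factor):
    """Return the distinct levels of a factor, sorted."""
    return sorted({getattr(r, factor) for r in rows})

condition_levels = levels(rows, "condition")
participant_levels = levels(rows, "participant")
```

Keeping every response on its own row means that any factor can later be used to group or filter the data without reshaping it.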


how do we make sense of the data?

• this is a fraction of the raw data from just one participant!

• reaction times in 1000ths of a second (milliseconds).

• the participant has made some errors.

• multiple stimuli in multiple experimental conditions.

• is there evidence that people take longer to process incongruent words?
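One simple first step towards answering that question is to compare the mean RT in each condition. The sketch below uses made-up trials and drops error trials before averaging; excluding errors is a common choice assumed here, not something the slide prescribes.

```python
from statistics import mean

# (condition, correct, rt_ms) triples; illustrative values only.
trials = [
    ("congruent", True, 600),
    ("congruent", True, 700),
    ("incongruent", True, 800),
    ("incongruent", False, 1500),  # an error trial, dropped below
    ("incongruent", True, 900),
]

def mean_rt_by_condition(trials):
    """Mean RT (ms) of correct responses, keyed by condition."""
    by_condition = {}
    for condition, correct, rt_ms in trials:
        if correct:
            by_condition.setdefault(condition, []).append(rt_ms)
    return {c: mean(rts) for c, rts in by_condition.items()}

means = mean_rt_by_condition(trials)
# A positive difference is in the direction Stroop claimed.
difference = means["incongruent"] - means["congruent"]
```

A difference in means from one participant is only a description of the sample; whether it supports a claim about the population is a separate question.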


a frequency plot (a histogram)

• a plot of the frequency of each Reaction Time (RT) at 100ms intervals.

• red bars are congruent trials; white bars are incongruent trials.

• what can we say about the performance of participant 2?
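The binning behind such a plot can be sketched as below, using the 100 ms interval width stated above. The function names and the RT values are assumptions for illustration; the text rendering stands in for a real plotting library.

```python
from collections import Counter

BIN_WIDTH_MS = 100  # interval width stated on the slide

def histogram(rts_ms, bin_width=BIN_WIDTH_MS):
    """Count RTs per bin; keys are the lower edge of each bin in ms."""
    counts = Counter((rt // bin_width) * bin_width for rt in rts_ms)
    return dict(sorted(counts.items()))

def render(counts):
    """Return one text line per non-empty bin, with one '#' per response."""
    return [f"{edge:5d} | {'#' * n}" for edge, n in counts.items()]

rts = [612, 655, 698, 701, 745, 790, 1320]  # illustrative values only
counts = histogram(rts)
print("\n".join(render(counts)))
```

Running `histogram` separately on the congruent and incongruent RTs of one participant gives the two distributions that the figure overlays.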

[Figure: histograms of Frequency against Reaction Time (ms), 0 to 5000 ms, for participants 1 to 3. Each participant has a congruent panel, an incongruent panel, and a panel overlaying both conditions.]


[Figure: histograms of Frequency against Reaction Time (ms), 0 to 5000 ms, for participants 1 to 6. Each participant has a congruent panel, an incongruent panel, and a panel overlaying both conditions.]


outliers

• the frequency plot reveals outliers -- data points that are separated from the main distribution.

• as we will see, outliers are sometimes excluded from analyses.
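One way outlier exclusion is sometimes done is to drop responses that lie more than a chosen number of standard deviations from the mean. The sketch below assumes that rule with a cutoff of 2.5 SD; both the rule and the cutoff are illustrative choices, not the method used in this course, and the values are made up.

```python
from statistics import mean, stdev

def split_outliers(rts_ms, z_cutoff=2.5):
    """Split RTs into (kept, excluded) by distance from the mean in SD units."""
    m = mean(rts_ms)
    s = stdev(rts_ms)  # sample standard deviation
    kept = [rt for rt in rts_ms if abs(rt - m) <= z_cutoff * s]
    excluded = [rt for rt in rts_ms if abs(rt - m) > z_cutoff * s]
    return kept, excluded

# Illustrative values only: nine typical RTs and one very slow response.
rts = [600, 620, 640, 660, 680, 700, 720, 740, 760, 4800]
kept, excluded = split_outliers(rts)
```

Because an extreme value also inflates the mean and the standard deviation, a rule like this can miss outliers in small samples, which is one reason the choice of rule should be reported alongside the analysis.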

[Figure: histograms of Frequency against Reaction Time (ms), 0 to 5000 ms, for participants 4 to 6. Each participant has a congruent panel, an incongruent panel, and a panel overlaying both conditions.]


why do response times vary?

• the cognitive neural system is subject to noise - random disturbances of signal.

• smell, for example, is affected by thermodynamic noise because molecules arrive at the receptors at random rates. Similarly for vision and photons.

• perceptual amplification processes can add further noise.

• noise in neuron firing is also relevant.

• to generate movement, neuronal signals are relayed and converted into mechanical forces in muscle fibres. All of these processes are noisy.

• Together these various systems, and others, lead to trial-to-trial variation.
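A toy simulation can illustrate how several independent noise sources add up to trial-to-trial variation. The model below (a fixed base time plus Gaussian noise from three sources) and all of its parameter values are assumptions for illustration, not a model of the actual neural processes.

```python
import random
from statistics import mean, stdev

def simulate_rts(n_trials, base_ms=500.0, noise_sds_ms=(20.0, 30.0, 40.0), seed=0):
    """Simulate RTs as a fixed base time plus independent Gaussian noise sources.

    Each entry of noise_sds_ms is the standard deviation (ms) of one
    hypothetical source, e.g. sensory, neural, and motor noise.
    """
    rng = random.Random(seed)
    return [
        base_ms + sum(rng.gauss(0.0, sd) for sd in noise_sds_ms)
        for _ in range(n_trials)
    ]

rts = simulate_rts(1000)
print(f"mean = {mean(rts):.0f} ms, sd = {stdev(rts):.0f} ms")
```

With independent sources the variances add, so the overall spread is larger than that of any single source; with no noise sources every simulated trial would take exactly `base_ms`.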


Exercise

• Working in pairs:

• go to http://www.humanbenchmark.com/tests/reactiontime

• one person performs the test while the other writes down the durations.

• write down 60 durations using the same hand.

• repeat with your other hand.

• think about what else you should record.
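Once the durations are written down, they could be summarised per hand along these lines. The lists here are short placeholders with made-up values (the exercise asks for 60 durations per hand), and the names `first_hand`, `other_hand` and `summarise` are hypothetical.

```python
from statistics import mean, median, stdev

def summarise(durations_ms):
    """Return the count, mean, median and sample SD of a list of durations (ms)."""
    return {
        "n": len(durations_ms),
        "mean": mean(durations_ms),
        "median": median(durations_ms),
        "sd": stdev(durations_ms),
    }

# Placeholder values only; replace with your own recorded durations.
first_hand = [250, 270, 260, 240]
other_hand = [280, 300, 290, 270]

first_summary = summarise(first_hand)
other_summary = summarise(other_hand)
mean_difference = other_summary["mean"] - first_summary["mean"]
```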


Reading

• Statistical Methods for Psychology, David C. Howell. Duxbury, 2006.