Human Reward / Stimulus / Response Signal Experiment: Data and Analysis


Page 1: Human Reward / Stimulus / Response Signal Experiment: Data and Analysis

Draws on:

• Alan and Bill’s experiment
• Usher & McClelland model and experiments
• Patrick Simen’s model
• Sam and Phil’s analysis
• Juan’s further analysis

Page 2

Human experiment examining the reward bias effect with a response signal given at different times after target onset

• Target stimuli are rectangles shifted 1, 3, or 5 pixels L or R of fixation.

• Reward cue occurs 750 msec before stimulus.

- Small arrowhead pointing L or R, visible for 250 msec.
- Only biased reward conditions (2 vs 1 and 1 vs 2) are used.

• Response signal occurs at different times (msec) after target onset:

0, 75, 150, 225, 300, 450, 600, 900, 1200, 2000

- Participant receives reward only if the response is correct and occurs within 250 msec of the response signal.

- Participants were run for 15-25 sessions to provide stable data.

- Data shown are from later sessions in which effects were all stable.
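The design above can be written down as a small condition grid plus a reward rule. This is only a sketch: it assumes the factors are fully crossed, that the arrowhead points toward the side worth 2 points, and that "within 250 msec" means 0 to 250 msec after the response signal. The names `conditions` and `points_earned` are hypothetical.

```python
from itertools import product

SHIFTS_PX = (1, 3, 5)            # stimulus shift magnitudes
SIDES = ("L", "R")               # stimulus side and reward-cue side
LAGS_MS = (0, 75, 150, 225, 300, 450, 600, 900, 1200, 2000)
WINDOW_MS = 250                  # allowed delay after the response signal


def conditions():
    """Enumerate every (shift, stimulus side, cue side, lag) cell.

    Assumption: all factors are fully crossed.
    """
    for shift, stim, cue, lag in product(SHIFTS_PX, SIDES, SIDES, LAGS_MS):
        yield {
            "shift_px": shift,
            "stim_side": stim,
            "cue_side": cue,          # assumed to mark the 2-point side
            "lag_ms": lag,
            "compatible": stim == cue,  # reward-stimulus compatibility
        }


def points_earned(cond, response_side, ms_after_signal):
    """Reward only for correct responses inside the response window."""
    correct = response_side == cond["stim_side"]
    on_time = 0 <= ms_after_signal <= WINDOW_MS
    if not (correct and on_time):
        return 0
    return 2 if response_side == cond["cue_side"] else 1


cells = list(conditions())
```

Under these assumptions there are 3 x 2 x 2 x 10 = 120 cells, half of them reward-stimulus compatible.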

Page 3

A participant with very little reward bias

• Top panel shows the probability of the response giving the larger reward as a function of actual response time, for combinations of:

- Stimulus shift (1, 3, 5 pixels)
- Reward-stimulus compatibility

• Lower panel shows the data transformed to z scores, and corresponds to the theoretical construct:

[mean(x1(t)-x2(t)) + bias(t)] / sd(x1(t)-x2(t))

where x1 represents the state of the accumulator associated with the greater reward, x2 the same for the lesser reward, and S is thought to choose the larger reward if x1(t)-x2(t)+bias(t) > 0.
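A minimal sketch of the z-score transform, assuming x1(t)-x2(t) is normally distributed so that the choice probability is a normal CDF of the construct above. The clamping of extreme proportions to 1/(2n) is a common convention added here as an assumption, not something the slides state; `p_to_z` and `p_high` are hypothetical names.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and sd 1


def p_to_z(p, n_trials=None):
    """Inverse-normal (probit) transform of a choice proportion.

    If n_trials is given, p is clamped to [1/(2n), 1 - 1/(2n)] so that
    proportions of exactly 0 or 1 still map to finite z (assumed convention).
    Without clamping, inv_cdf raises an error for p <= 0 or p >= 1.
    """
    if n_trials is not None:
        eps = 1.0 / (2 * n_trials)
        p = min(max(p, eps), 1.0 - eps)
    return _STD.inv_cdf(p)


def p_high(mean_diff, bias, sd_diff):
    """P(x1 - x2 + bias > 0) when x1 - x2 ~ Normal(mean_diff, sd_diff)."""
    return _STD.cdf((mean_diff + bias) / sd_diff)
```

For example, `p_to_z(p_high(0.8, 0.2, 1.0))` recovers the construct value (0.8 + 0.2) / 1.0 = 1.0.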

Page 4

Participants Showing Reward Bias

Page 5

Page 6

Analysis Assumptions

• Decision variable x varies as a function of t.
• Choice is made at some time t = signal lag + rt.
• At the time the choice is made:

- For a single difficulty level, two distributions with means +μ and -μ, and equal sd σ set to 1. Choose high reward if decision variable x > -Xc.
- For three difficulty levels, fixed σ = 1, means μi (i = 1, 2, 3); assume the same Xc for all difficulty levels.
- Xc can be regarded as a positive increment to the state of the decision variable; high reward is chosen if x > 0 in this case.

[Figure: the two distributions with means -μ and +μ, with the criterion marked at -Xc.]

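Page 7

Only one diff level (σ = 1):

$$Z_1 = \mathrm{invNorm}\big(P(H \mid H)\big) = \mu + X_c, \qquad Z_2 = \mathrm{invNorm}\big(P(H \mid L)\big) = -\mu + X_c$$

$$\mu = \frac{Z_1 - Z_2}{2}, \qquad X_c = \frac{Z_1 + Z_2}{2}$$

Three diff levels:

$$Z_{1i} = \mathrm{invNorm}\big(P(H \mid H_i)\big) = \mu_i + X_c, \qquad Z_{2i} = \mathrm{invNorm}\big(P(H \mid L_i)\big) = -\mu_i + X_c$$

$$\mu_i = \frac{Z_{1i} - Z_{2i}}{2}, \qquad X_c = \sum_i \frac{Z_{1i} + Z_{2i}}{2 \cdot 3}$$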

Subject’s sensitivity, as defined in the theory of signal detectability:

$$d'_i = \frac{\mu_i - (-\mu_i)}{\sigma}$$

When response signal delay varies: $d'_i(t)$

For each subject, fit with the function from UM’01:

$$d'^{\,fit}_i(t) = d'^{\,asym}_i \left(1 - e^{-(t - t_0)/\tau}\right)$$
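A sketch of fitting that shifted exponential, under two assumptions: d′ is taken to be 0 for t at or before t0, and the fit is a plain least-squares grid search over (t0, τ) with the asymptote solved in closed form for each pair. The actual fitting procedure used for the slides is not stated; the data below are synthetic and the function names are hypothetical.

```python
import math


def d_prime_curve(t, d_asym, t0, tau):
    """Shifted exponential approach to asymptote; 0 before t0 (assumed)."""
    if t <= t0:
        return 0.0
    return d_asym * (1.0 - math.exp(-(t - t0) / tau))


def fit_shifted_exponential(times, d_obs, t0_grid, tau_grid):
    """Least-squares grid search; returns (sse, d_asym, t0, tau)."""
    best = None
    for t0 in t0_grid:
        for tau in tau_grid:
            g = [d_prime_curve(t, 1.0, t0, tau) for t in times]
            gg = sum(v * v for v in g)
            if gg == 0.0:          # every time point precedes t0
                continue
            # For fixed (t0, tau) the model is linear in d_asym.
            d_asym = sum(d * v for d, v in zip(d_obs, g)) / gg
            sse = sum((d - d_asym * v) ** 2 for d, v in zip(d_obs, g))
            if best is None or sse < best[0]:
                best = (sse, d_asym, t0, tau)
    return best


# Synthetic example (seconds): noiseless data from known parameters.
times = [0.30, 0.375, 0.45, 0.525, 0.60, 0.75, 0.90, 1.20, 1.50, 2.30]
d_obs = [d_prime_curve(t, 2.0, 0.25, 0.20) for t in times]
sse, d_asym, t0, tau = fit_shifted_exponential(
    times, d_obs, t0_grid=[0.20, 0.25, 0.30], tau_grid=[0.10, 0.20, 0.40]
)
```

With real data a finer grid or a proper optimizer would be needed; the grid here only illustrates the structure of the fit.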

Page 8

Subject Sensitivity

[Figure: three panels, one per participant (cm, ja, sl), plotting d′ against RT + response cue delay. Legend: data and fit for diff = 5, 3, 1.]

Page 9

[Figure: three panels plotted against stimulus (diff) level for participants cm, ja, and sl. Recoverable y-axis labels: "RT 0" and "d′ asym".]

Page 10

Optimal “bias” Xc/σ based on observed sensitivity data.

Observed “bias”, treated as a positive offset favoring the response associated with high reward:

$$X_c = \sum_i \frac{Z_{1i} + Z_{2i}}{2 \cdot 3}$$

[Figure: distributions with the criterion marked at -Xc/σ.]
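One way to compute an optimal bias from sensitivity values is to maximize expected reward over Xc. The sketch below assumes σ = 1, equally likely stimuli, rewards of 2 and 1 for correct high- and low-reward responses, and a single shared Xc across difficulty levels; it finds the zero of the expected-reward gradient by bisection, which presumes a single sign change. Whether this matches the computation behind the slides is not known. For one level it reduces to Xc = log(2) / (2μ).

```python
import math
from statistics import NormalDist

_pdf = NormalDist().pdf


def reward_gradient(xc, mus, r_high=2.0, r_low=1.0):
    """d/dXc of expected reward (up to a positive constant).

    Expected reward per level is proportional to
    r_high * Phi(mu + Xc) + r_low * (1 - Phi(Xc - mu)).
    """
    return sum(r_high * _pdf(xc + mu) - r_low * _pdf(xc - mu) for mu in mus)


def optimal_xc(mus, r_high=2.0, r_low=1.0, tol=1e-10):
    """Bisection on the gradient; requires every mu > 0 and r_high > r_low."""
    lo = 0.0                                   # gradient is positive here
    hi = math.log(r_high / r_low) / (2 * min(mus)) + 1.0  # negative here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reward_gradient(mid, mus, r_high, r_low) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xc_one_level = optimal_xc([1.0])               # close to log(2) / 2
xc_three_levels = optimal_xc([0.5, 1.0, 1.5])  # made-up sensitivities
```

The shared criterion for several levels always lies between the single-level optima of the easiest and hardest levels.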

Page 11

[Figure: three panels, one per participant (cm, ja, sl), plotting the normalized threshold Xc/σ against RT + response cue delay. Legend: real, optimal.]

Page 12

Some possible models

• OU process (λ < 0, σ0 = 0) following F&H, with reward bias effect implemented as:

1. An alteration in initial condition, subject to decay
2. Optimal time-varying decision boundary outside of the OU process
3. An input ‘current’ starting at presentation of reward signal
   1. Noise from reward onset
   2. Noise from stimulus onset
4. A constant offset or criterion shift unaffected by time
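For orientation, here is a minimal simulation of an OU process, assuming the form dx = (λx + drift) dt + σ dW with a fixed starting point, compared with the closed-form mean and variance of that process. The parameter values are arbitrary, and the Euler-Maruyama scheme and function names are choices made for this sketch.

```python
import math
import random
from statistics import fmean, pvariance


def ou_moments(t, lam, drift, sigma, x0=0.0):
    """Closed-form mean and variance of x(t) for a fixed start x0."""
    e = math.exp(lam * t)
    mean = x0 * e + drift / lam * (e - 1.0)
    var = sigma ** 2 / (2.0 * lam) * (e * e - 1.0)
    return mean, var


def simulate_ou(t_end, lam, drift, sigma, x0=0.0, dt=0.005,
                n_paths=2000, seed=0):
    """Euler-Maruyama simulation; returns the final state of each path."""
    rng = random.Random(seed)
    n_steps = round(t_end / dt)
    noise_scale = sigma * math.sqrt(dt)
    finals = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            x += (lam * x + drift) * dt + noise_scale * rng.gauss(0.0, 1.0)
        finals.append(x)
    return finals


analytic_mean, analytic_var = ou_moments(1.0, lam=-2.0, drift=1.0, sigma=1.0)
samples = simulate_ou(1.0, lam=-2.0, drift=1.0, sigma=1.0)
sample_mean, sample_var = fmean(samples), pvariance(samples)
```

The four reward-bias variants listed above differ only in how the starting point, the drift, or the decision criterion is set, so the same two moments determine the choice probability in each case.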

Page 13

1. Reward as a change in initial condition, subject to decay

Note:
1. Effect of the bias decays away for λ < 0.
2. There is a dip at
$$t = \frac{1}{\lambda}\log\frac{aC + \lambda\mu_0}{aC}$$
3. At t = 0, p = 1.

[Figure: P of choice toward larger reward against Time (s), 0 to 2.5 s. Legend: RSC 1 and RSC 0 for diff 5, 3, 1.]

Feng & Holmes notes:

$$\mu(C,t) = \mu_0 e^{\lambda t} + \frac{aC}{\lambda}\left(e^{\lambda t} - 1\right); \qquad v(t) = \frac{\sigma^2}{2\lambda}\left(e^{2\lambda t} - 1\right)$$
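A quick numerical check of the dip location for model 1, assuming the choice probability is the normal CDF of μ(C,t) / sqrt(v(t)) and using arbitrary parameter values (a = 1, λ = -2, σ = 1, μ0 = 0.3, C = 1 for a compatible stimulus). The function names are hypothetical.

```python
import math
from statistics import NormalDist

_cdf = NormalDist().cdf


def p_high_model1(t, C, a=1.0, lam=-2.0, sigma=1.0, mu0=0.3):
    """P(choose larger reward) at time t > 0 with a decaying initial offset."""
    e = math.exp(lam * t)
    mean = mu0 * e + a * C / lam * (e - 1.0)
    var = sigma ** 2 / (2.0 * lam) * (e * e - 1.0)
    return _cdf(mean / math.sqrt(var))


def dip_time_model1(C, a=1.0, lam=-2.0, mu0=0.3):
    """Time of the minimum; needs a*C + lam*mu0 > 0 for a compatible C > 0."""
    return math.log((a * C + lam * mu0) / (a * C)) / lam


grid = [k / 1000 for k in range(1, 2501)]          # 1 ms steps up to 2.5 s
t_dip = dip_time_model1(C=1.0)
t_min = min(grid, key=lambda t: p_high_model1(t, C=1.0))
```

With these values the analytic dip and the grid minimum agree to within the grid spacing, and the probability is essentially 1 just after t = 0 because the variance starts at zero.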

Page 14

2. Time-varying optimal bias (outside of OU process)

Note:
1. Effect of the bias persists.
2. There is a dip at
$$t = \frac{1}{\lambda}\log\frac{4a^2C^2 + \lambda\sigma^2\log 2}{4a^2C^2 - \lambda\sigma^2\log 2}$$
3. At t = 0, p = 1.
4. The smaller the stimulus effect, the larger the bias.
5. The harder the stimulus condition, the later the dip.

$$b(t) = \frac{\sigma^2\log 2}{4aC}\left(1 + e^{\lambda t}\right)$$

$$\mu(C,t) = b(t) + \frac{aC}{\lambda}\left(e^{\lambda t} - 1\right); \qquad v(t) = \frac{\sigma^2}{2\lambda}\left(e^{2\lambda t} - 1\right)$$

[Figure: P of choice toward larger reward against Time (s), 0 to 2.5 s. Legend: RSC 1 and RSC 0 for diff 5, 3, 1.]
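The bias in this model can be checked against the usual optimal criterion for a 2:1 reward ratio, b = v log 2 / (2m), where m(t) is the stimulus-driven mean (aC/λ)(e^{λt} - 1). The sketch below assumes that reading, uses arbitrary parameter values, and covers only a compatible stimulus (C > 0); the function names are hypothetical.

```python
import math
from statistics import NormalDist

_cdf = NormalDist().cdf
LOG2 = math.log(2.0)


def stimulus_mean(t, C, a=1.0, lam=-2.0):
    return a * C / lam * (math.exp(lam * t) - 1.0)


def variance(t, lam=-2.0, sigma=1.0):
    return sigma ** 2 / (2.0 * lam) * (math.exp(2.0 * lam * t) - 1.0)


def bias(t, C, a=1.0, lam=-2.0, sigma=1.0):
    """Time-varying bias b(t); larger when the stimulus effect a*C is small."""
    return sigma ** 2 * LOG2 / (4.0 * a * C) * (1.0 + math.exp(lam * t))


def p_high_model2(t, C, a=1.0, lam=-2.0, sigma=1.0):
    mean = bias(t, C, a, lam, sigma) + stimulus_mean(t, C, a, lam)
    return _cdf(mean / math.sqrt(variance(t, lam, sigma)))


def dip_time_model2(C, a=1.0, lam=-2.0, sigma=1.0):
    """Needs 4 a^2 C^2 > |lam| sigma^2 log 2; otherwise no dip exists."""
    k = lam * sigma ** 2 * LOG2
    return math.log((4 * a * a * C * C + k) / (4 * a * a * C * C - k)) / lam


grid = [k / 1000 for k in range(1, 2501)]
t_dip = dip_time_model2(C=1.0)
t_min = min(grid, key=lambda t: p_high_model2(t, C=1.0))
```

Lowering C from 1.0 to 0.6 in `dip_time_model2` moves the dip later, in line with note 5.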

Page 15

3.1. Reward acts as input “current”, stays on from reward signal to end of trial; noise starts at reward onset

Reward signal comes T seconds before stimulus (750 msec in the experiment)

Note:
1. Effect of the bias persists.
2. There is no dip.
3. At t = 0, p < 1.

Feng & Holmes notes

[Figure: P of choice toward larger reward against Time (s), 0 to 2.5 s. Legend: RSC 1 and RSC 0 for diff 5, 3, 1.]
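The three notes for model 3.1 can be illustrated with a short sketch. It assumes an OU process that receives a constant reward input b from reward onset (T seconds before the stimulus), a stimulus input aC from t = 0, and noise of strength σ from reward onset; the symbols b and T and all parameter values are choices made for this sketch, since the slide's own expressions did not survive.

```python
import math
from statistics import NormalDist

_cdf = NormalDist().cdf


def p_high_model31(t, C, a=1.0, b=0.5, lam=-2.0, sigma=1.0, T=0.75):
    """P(choose larger reward) at time t >= 0 after stimulus onset."""
    elapsed = t + T                     # time since reward onset
    mean = (b / lam * (math.exp(lam * elapsed) - 1.0)      # reward input
            + a * C / lam * (math.exp(lam * t) - 1.0))     # stimulus input
    # Noise has been accumulating since reward onset, so var > 0 at t = 0.
    var = sigma ** 2 / (2.0 * lam) * (math.exp(2.0 * lam * elapsed) - 1.0)
    return _cdf(mean / math.sqrt(var))


p_at_onset = p_high_model31(0.0, C=1.0)
curve = [p_high_model31(k / 100, C=1.0) for k in range(0, 251)]
steps = [later - earlier for earlier, later in zip(curve, curve[1:])]
```

Because the variance is already nonzero at stimulus onset, the probability starts below 1, and for a compatible stimulus the curve rises without a dip.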

Page 16

3.2. Same as 3.1, but variability is introduced only at stimulus onset

Note:
1. Effect of the bias persists.
2. There is a dip at
$$t = \frac{1}{\lambda}\log\frac{aC + b\,e^{\lambda T}}{aC + b}$$
where b is the reward input and T is the interval between reward signal and stimulus onset.
3. At t = 0, p = 1, since all accumulators have no variance.

[Figure: P of choice toward larger reward against Time (s), 0 to 2.5 s. Legend: RSC 1 and RSC 0 for diff 5, 3, 1.]
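A sketch of model 3.2 under the same assumed setup as 3.1 (constant reward input b from T seconds before the stimulus, stimulus input aC from t = 0), except that the state evolves without noise until stimulus onset. Parameter values and names are hypothetical; the grid search is only a numerical check on the dip expression for a compatible stimulus.

```python
import math
from statistics import NormalDist

_cdf = NormalDist().cdf


def p_high_model32(t, C, a=1.0, b=0.5, lam=-2.0, sigma=1.0, T=0.75):
    """P(choose larger reward) at time t > 0 after stimulus onset."""
    start = b / lam * (math.exp(lam * T) - 1.0)   # noise-free state at onset
    e = math.exp(lam * t)
    mean = start * e + (a * C + b) / lam * (e - 1.0)
    var = sigma ** 2 / (2.0 * lam) * (e * e - 1.0)  # noise only from t = 0
    return _cdf(mean / math.sqrt(var))


def dip_time_model32(C, a=1.0, b=0.5, lam=-2.0, T=0.75):
    return math.log((a * C + b * math.exp(lam * T)) / (a * C + b)) / lam


grid = [k / 1000 for k in range(1, 2501)]
t_dip = dip_time_model32(C=1.0)
t_min = min(grid, key=lambda t: p_high_model32(t, C=1.0))
```

Since the variance is zero at onset while the mean is already positive, the probability starts at 1 before dropping to the dip.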

Page 17

4. Reward as a constant offset

Note:
1. Equivalent to 3.2 for large T (the interval between reward signal and stimulus onset).
2. There is a dip at
$$t = \frac{1}{\lambda}\log\frac{aC}{aC - \lambda\mu_0}$$
3. At t = 0, p = 1.

[Figure: P of choice toward larger reward against Time (s), 0 to 2.5 s. Legend: RSC 1 and RSC 0 for diff 5, 3, 1.]

$$\mu(C,t) = \mu_0 + \frac{aC}{\lambda}\left(e^{\lambda t} - 1\right); \qquad v(t) = \frac{\sigma^2}{2\lambda}\left(e^{2\lambda t} - 1\right)$$
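The equivalence with model 3.2 can be seen numerically: if a constant reward input b has been on long enough before the stimulus, its contribution settles at -b/λ, which then acts like the constant offset μ0. The sketch below assumes that reading of model 3.2 (input b starting T seconds before the stimulus, no noise until onset) and uses arbitrary parameter values; both models share the same v(t), so only the means are compared.

```python
import math


def mean_model4(t, C, mu0, a=1.0, lam=-2.0):
    """Mean with a constant offset mu0 that does not decay."""
    return mu0 + a * C / lam * (math.exp(lam * t) - 1.0)


def mean_model32(t, C, b, T, a=1.0, lam=-2.0):
    """Mean with a reward input b switched on T seconds before the stimulus."""
    start = b / lam * (math.exp(lam * T) - 1.0)
    e = math.exp(lam * t)
    return start * e + (a * C + b) / lam * (e - 1.0)


def dip_time_model4(C, mu0, a=1.0, lam=-2.0):
    return math.log(a * C / (a * C - lam * mu0)) / lam


b, lam = 0.5, -2.0
mu0 = -b / lam                        # steady-state level of the reward input
check_times = (0.0, 0.1, 0.5, 1.0, 2.0)
gaps = {
    T: max(abs(mean_model32(t, 1.0, b, T) - mean_model4(t, 1.0, mu0))
           for t in check_times)
    for T in (0.75, 5.0)
}
t_dip = dip_time_model4(1.0, mu0)
```

The gap between the two means shrinks like e^{λT}, so it is visible at T = 0.75 s and negligible at T = 5 s.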

Page 18

Some possible models

• OU models (λ < 0, σ0 = 0) following F&H, with reward bias effect implemented as:

1. An alteration in initial condition, subject to decay
2. Optimal time-varying decision boundary outside of the OU process
3. An input ‘current’ starting at presentation of reward signal
   1. Noise from reward onset
   2. Noise from stimulus onset
4. A constant offset or criterion shift unaffected by time

• While none fit perfectly, starting point variability (σ0 > 0) would potentially improve 3.2 and 4.

Page 19

Jay’s favorite mechanistic story (draws from Simen’s model)

• Participant learns to inject waves of activation that prime response accumulators; waves peak just after stimulus onset and have a residual.
  - Wave is higher for the high-reward response.
• Stimulus activation accumulates as in LCAM.
• Response signal initiates added drive to both accumulators equally.
• First accumulator to reach a fixed threshold initiates the response.