More Instrumental (Operant) Conditioning

B.F. Skinner

Coined the term ‘operant conditioning’
The animal operates on the environment
Pioneered the use of free operants
Theory-free

The Skinner Box

Automatic
Easy measurements that can be compared across species

Operant Strengthened

[Figure: candidate behaviours (bite, groom, lick, rear, push lever) and the reinforcer]

Techniques

Shaping: successive approximations. Require closer and closer approximations to the target behaviour.
Secondary reinforcers: stimuli accompanying reinforcer delivery.
Marking: feedback that a response has occurred.

Key concepts and terms

Three-term contingency: discriminative stimulus, operant, consequence
Acquisition
Extinction
Spontaneous recovery
Generalization
Conditioned reinforcement
Response chains

Other Similarities

[Figure: bar presses and food deliveries under two conditions]
Perfect contingency: strong responding
Degraded contingency: weak responding
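The slide contrasts a perfect and a degraded contingency without giving a formula. One commonly used index, assumed here rather than taken from the slides, is the difference between the probability of food given a bar press and the probability of food given no press. The function name `delta_p` and the probabilities below are illustrative only.

```python
def delta_p(p_food_given_press: float, p_food_given_no_press: float) -> float:
    """Contingency index: P(food | press) - P(food | no press).

    Assumption: this index is a common way to quantify contingency;
    the slides themselves do not specify a measure.
    """
    for p in (p_food_given_press, p_food_given_no_press):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_food_given_press - p_food_given_no_press


# Hypothetical values: food only ever follows a bar press.
perfect = delta_p(1.0, 0.0)

# Hypothetical values: food also arrives often without a press.
degraded = delta_p(1.0, 0.8)
```

Under this sketch the perfect case gives an index of 1 and the degraded case an index near 0.2, which lines up with the strong versus weak responding shown on the slide.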

Limits of Operant Conditioning

Relevance
Yawning to get food
Scratching a body part to get food
Biting to get access to a female

Breland and “Misbehavior”

How to train a chicken

Schedules of Reinforcement

Reinforcement can be given after every response
This is called continuous reinforcement (CRF)
CRF does not maintain high rates of behavior

Schedules of Reinforcement

Fixed Interval (FI)
The first response after a given interval is rewarded
FI scallop

Variable Interval (VI)
Like FI, but the interval varies around a given average
Scallop disappears
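The two interval rules above can be sketched in a few lines. This is a minimal illustration, not a lab protocol: the function name `interval_schedule`, the response times, and the VI interval list are all hypothetical, and the sketch assumes each interval is timed from the previous reinforcer.

```python
from itertools import cycle, repeat
from typing import Iterable, Iterator, List


def interval_schedule(response_times: Iterable[float],
                      intervals: Iterator[float]) -> List[float]:
    """Return the times of the responses that earn a reinforcer.

    A reinforcer becomes available once the current interval has elapsed
    since the previous reinforcer (or since time 0). The first response at
    or after that moment collects it and starts the next interval.
    """
    reinforced = []
    available_at = next(intervals)
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next(intervals)
    return reinforced


# Hypothetical response times in seconds.
responses = [5, 12, 31, 33, 58, 64, 70, 95, 130]

# FI 30 s: the same interval every time.
fi_30 = interval_schedule(responses, repeat(30))

# VI 30 s: intervals vary (10, 50, 30, ...) but average 30 s.
vi_30 = interval_schedule(responses, cycle([10, 50, 30]))
```

Passing the intervals as an iterator keeps one function for both schedules: FI is a constant stream, VI is a varying stream with the stated mean.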

Schedules of Reinforcement

Fixed Ratio (FR)
Reinforcement is given after a given number of responses
Short pauses

Variable Ratio (VR)
Reinforcement is given after a varying number of responses
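A ratio schedule counts responses rather than time. The sketch below is illustrative only; the function name `ratio_schedule` and the VR requirement list are assumptions, chosen so that the varying requirements average 10.

```python
from itertools import cycle, repeat
from typing import Iterator, List


def ratio_schedule(n_responses: int, requirements: Iterator[int]) -> List[int]:
    """Return the 1-based indices of the responses that earn a reinforcer.

    Each reinforcer requires the next number of responses drawn from
    `requirements`; the count restarts after every reinforcer.
    """
    reinforced = []
    needed = next(requirements)
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= needed:
            reinforced.append(i)
            count = 0
            needed = next(requirements)
    return reinforced


# FR-10: every 10th response is reinforced.
fr_10 = ratio_schedule(40, repeat(10))

# VR-10: requirements vary (4, 16, 10, ...) but average 10 responses.
vr_10 = ratio_schedule(40, cycle([4, 16, 10]))
```

In a real experiment the VR requirements would usually be drawn at random; a fixed cycle is used here only to keep the example reproducible.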

Some Other Schedules

DRL: differential reinforcement of low rates of responding
DRH: differential reinforcement of high rates of responding
DRO: differential reinforcement of other behavior, that is, of anything but the target behavior

Compound Schedules

Different schedules are presented one by one, each either signaled by its own discriminative stimulus (Multiple) or not signaled (Mixed).

Reinforcement occurs after two or more schedules have been completed in succession, either with discriminative stimuli (Chained) or without (Tandem).

Two schedules are simultaneously in force (Concurrent), usually for different responses, and reinforcement on each schedule is independent of the other.

FR-10 FR-20

Schedule this…

Concurrent: choice between two alternative schedules
Change-over delay (no “channel surfing”)

[Figure: alternative A on VI-30, alternative B on VI-60]

Matching Law

B1/(B1+B2) = R1/(R1+R2)
B stands for the number of responses of a given behavior
R stands for the number of reinforcers earned
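As a worked instance of the equation, the sketch below computes the predicted share of behavior on one alternative from the reinforcers earned on each. The function name `matching_share` and the reinforcer counts are hypothetical; the counts assume a concurrent VI-30 VI-60 arrangement in which the first alternative earns about twice as many reinforcers as the second.

```python
def matching_share(r1: float, r2: float) -> float:
    """Predicted proportion of responses on alternative 1: R1 / (R1 + R2)."""
    if r1 < 0 or r2 < 0 or r1 + r2 == 0:
        raise ValueError("reinforcer counts must be non-negative and not both zero")
    return r1 / (r1 + r2)


# Hypothetical reinforcers earned in one session.
r_a = 120  # alternative A (richer schedule)
r_b = 60   # alternative B (leaner schedule)

share_a = matching_share(r_a, r_b)

# With a hypothetical total of 900 responses, matching predicts this split.
total_responses = 900
predicted_a = round(share_a * total_responses)
predicted_b = total_responses - predicted_a
```

Here matching predicts that two thirds of the responses go to A, so 600 of the 900 responses, with the remaining 300 on B.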

Schedule this…

Concurrent: choice between an immediate small reward and a larger delayed reward

[Figure: alternatives A and B; example choices: $5 today vs. $50 after a wait, and $5 today vs. $6 after a wait]

Self-Control…

Concurrent choice
Humans and nonhumans often choose an immediate small reward over a larger delayed reward (delayed rewards are “discounted”)

Example of Impulsivity

“Free” reinforcers are given every 20 s.

A lever press advances delivery of the first pellet and deletes the second pellet.

So, if you press at 2 seconds, you get a pellet immediately, but you get no other pellets until the 60-second pellet is available.

[Timeline: pellets at 20 s, 40 s, 60 s]
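The cost of pressing in this procedure can be made concrete with a small sketch. The function name `pellet_times` is hypothetical, and the sketch assumes a single cycle with free pellets at 20, 40 and 60 s and at most one press, made before the first free pellet. A later press is simply ignored, which is a simplification not stated on the slide.

```python
from typing import List, Optional, Sequence


def pellet_times(press_at: Optional[float],
                 free_times: Sequence[float] = (20.0, 40.0, 60.0)) -> List[float]:
    """Delivery times of the pellets within one cycle.

    Without a press, the free pellets arrive on schedule. A press before
    the first free pellet moves that pellet to the moment of the press
    and cancels the second pellet.
    """
    times = list(free_times)
    if press_at is not None and press_at < times[0]:
        times[0] = press_at  # first pellet delivered immediately
        del times[1]         # second pellet is deleted
    return times


waiting = pellet_times(None)   # no press: all three pellets arrive
impulsive = pellet_times(2.0)  # press at 2 s: pellets at 2 s and 60 s only
```

Pressing gains 18 s on the first pellet but costs one of the three pellets in the cycle, which is why pressing here counts as impulsive.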

Delay of Reinforcement

Delayed reinforcers are steeply discounted
Loss of self-control and impulsivity

[Figure: reinforcer potency (0 to 100) plotted against delay (-9 to 0) for a small immediate and a large delayed reinforcer]
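The slides say that delayed reinforcers are steeply discounted but do not give an equation. One widely used description, assumed here, is the hyperbolic form V = A / (1 + kD), where A is the reinforcer amount, D its delay and k a discounting rate. The amounts, the 3-unit gap between the two rewards, and k = 1 are illustrative values, not data from the slides.

```python
def discounted_value(amount: float, delay: float, k: float = 1.0) -> float:
    """Hyperbolic discounting (assumed model): V = A / (1 + k * D)."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return amount / (1.0 + k * delay)


def preferred(time_until_small: float, small: float = 40.0,
              large: float = 100.0, gap: float = 3.0, k: float = 1.0) -> str:
    """Which reward has the higher discounted value right now?

    The small reward arrives after `time_until_small`; the large reward
    arrives `gap` time units later than the small one.
    """
    v_small = discounted_value(small, time_until_small, k)
    v_large = discounted_value(large, time_until_small + gap, k)
    return "small" if v_small > v_large else "large"


far = preferred(9.0)   # both rewards still far away
near = preferred(0.0)  # small reward available immediately
```

With these values the large reward is preferred when both are distant (4.0 versus about 7.7), but the small reward wins once it is immediately available (40 versus 25). That reversal is the loss of self-control the slide refers to.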

Increasing Self-Control

[Figure, two panels. Direct Choice (Concurrent): options A and B, with small and LARGE rewards. Concurrent Chain (Precommitment): an initial A/B choice, followed by small (A) and LARGE (B) rewards.]

Self Control

Behavioural
Precommitment
Self-exclusion contracts
Distraction
Modeling
Shaping waiting
Reduce delay for small
Increase delay for large

Cognitive
Public declaration
Abstinence pledge
Cold vs hot thoughts
Increase internal resources for self-control

Counterfactual Learning

Dopamine error signals
Experiential (actual)
Fictive (could have)