
PSY402

Theories of Learning

Chapter 8, Theories of Appetitive and Aversive

Conditioning

Operant Conditioning

The nature of reinforcement:

Premack’s probability differential theory

Response deprivation theory

Behavioral economics:

Behavioral allocation – blisspoint

Choice behavior – Herrnstein’s matching law.

Momentary maximization theory

Delay-reduction theory

Probability-Differential Theory

Premack – a reinforcer can be any activity that is more likely to occur than the reinforced behavior.

Manipulators vs eaters

High probability behaviors can be used as reinforcers of low probability behaviors.

Frequency of the reinforcer decreases when it is made contingent on another response.

Activities can be Reinforcers

Playing with toys reinforces

working math problems

correctly
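Premack's rule can be sketched as a simple comparison of baseline probabilities. This is a minimal illustration, not part of the original slides: the function name `can_reinforce` and the baseline proportions are hypothetical values chosen to mirror the toys and math example.

```python
def can_reinforce(baseline, reinforcer, target):
    """Premack's rule: an activity can reinforce a target behavior
    only if it is more probable than the target at baseline."""
    return baseline[reinforcer] > baseline[target]


# Hypothetical proportions of free time a child spends on each activity.
baseline = {"playing_with_toys": 0.60, "math_problems": 0.10}

print(can_reinforce(baseline, "playing_with_toys", "math_problems"))  # True
print(can_reinforce(baseline, "math_problems", "playing_with_toys"))  # False
```

Under these assumed baselines, access to toys can reinforce math work, but math work cannot reinforce playing with toys.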

Response Deprivation Theory

Timberlake & Allison – deprivation occurs

when an activity is used as a reinforcer and is

not freely emitted.

The activity is reinforcing because it satisfies the

deprivation created.

The animal tries to return to its pre-deprivation

level of responding.

Activities can be reinforcing even if their

initial baselines were not higher.
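The response deprivation condition is usually written as I / C > O_i / O_c, where a schedule requires I units of the instrumental response for C units of the contingent activity, and O_i and O_c are the baseline levels of each. The sketch below checks that condition. The function name and the running and drinking numbers are hypothetical, chosen to show a lower-baseline activity acting as a reinforcer.

```python
def creates_deprivation(i_required, c_allowed, o_instrumental, o_contingent):
    """Return True if the schedule holds the contingent activity below
    its baseline level, i.e. I / C > O_i / O_c."""
    return i_required / c_allowed > o_instrumental / o_contingent


# Hypothetical baseline: 30 units of running, 10 units of drinking.
# Drinking is the LESS probable activity here.
print(creates_deprivation(9, 1, 30, 10))  # True: 9 runs per drink restricts drinking
print(creates_deprivation(2, 1, 30, 10))  # False: drinking is not restricted
```

In the first schedule, drinking is deprived relative to baseline, so it should reinforce running even though its baseline was lower.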

Behavioral Allocation

Blisspoint (paired basepoint) – the free operant level of two responses.

Unrestricted responding with two choices of behaviors.

Blisspoint is used to figure out how much behavior an animal will engage in to obtain a reward.

Animals try to get as close to the blisspoint as possible.

Finding the Blisspoint

Contingency Lines for Rewards

Problems with Contingencies

Blisspoint is established by looking at

behavior before a contingency is established.

The established contingency must take

blisspoint into account or it may not increase

desired behavior.
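One way to make "as close to the blisspoint as possible" concrete is to treat a ratio contingency as a straight line and find the point on it nearest the blisspoint. The sketch below assumes straight-line (Euclidean) distance and a schedule of the form reward = ratio × instrumental response. Both are modeling assumptions, not claims from the slides, and the study and TV numbers are hypothetical.

```python
def closest_point_on_schedule(bliss_x, bliss_y, ratio):
    """Project the blisspoint (bliss_x, bliss_y) onto the contingency
    line y = ratio * x and return the nearest point on that line.

    x is the amount of the instrumental response, y the amount of the
    rewarding activity.
    """
    t = (bliss_x + ratio * bliss_y) / (1 + ratio ** 2)
    return t, ratio * t


# Hypothetical blisspoint: 10 units of studying, 40 units of TV.
print(closest_point_on_schedule(10, 40, ratio=1))  # (25.0, 25.0)
print(closest_point_on_schedule(10, 40, ratio=4))  # (10.0, 40.0)
```

With a 1:1 schedule, studying rises from 10 to 25 units. With a 4:1 schedule the line passes through the blisspoint itself, so the contingency produces no increase in studying, which is why the contingency has to be set with the blisspoint in mind.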

Choice Behavior

Herrnstein’s matching law – describes how

animals act when they have two or more

choices.

Different responses have different schedules of

reinforcement.

Responding to each choice is proportionate to the

reinforcement for each choice – after learning.

This can be expressed mathematically.

Mathematical Expression

The formula for the matching law is:

$$\frac{R_1}{R_1 + R_2} = \frac{r_1}{r_1 + r_2}$$

where R1 and R2 are the rates of response for two

alternative responses

And r1 and r2 are rates of reinforcement for those

responses
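A short worked example may help. The sketch below computes the proportion of responses the matching law predicts for the first alternative, R1 / (R1 + R2) = r1 / (r1 + r2). The function name and the reinforcement rates are hypothetical.

```python
def matching_proportion(r1, r2):
    """Predicted share of responses on alternative 1, given the
    reinforcement rates r1 and r2 for the two alternatives."""
    return r1 / (r1 + r2)


# Hypothetical rates: 40 vs. 20 reinforcers per hour.
share = matching_proportion(40, 20)
print(round(share, 2))  # 0.67
```

With these assumed rates, about two thirds of responses should go to the richer alternative once the schedules have been learned.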

Law Predicts Pecking Behavior

Delayed Gratification

Why does anyone choose a smaller reward part of the time?

Animals and people typically choose a small immediate reward over a larger delayed reward.

Large rewards are selected when:

The choice is made in advance of reward.

Reinforcers are not visible or reward is already present (pleasurable activity).
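The shift toward the larger reward when the choice is made in advance can be illustrated with hyperbolic discounting, value = amount / (1 + k × delay). This model is not named in the slides and is offered only as one common account. The function names, the discount rate k, and the reward amounts and delays are all hypothetical.

```python
def discounted_value(amount, delay, k=1.0):
    """Hyperbolic discounting: value falls off as delay grows."""
    return amount / (1 + k * delay)


def preferred(small, large, lead_time, k=1.0):
    """Each reward is (amount, delay). lead_time is how far in advance
    the choice is made, so it is added to both delays."""
    v_small = discounted_value(small[0], small[1] + lead_time, k)
    v_large = discounted_value(large[0], large[1] + lead_time, k)
    return "small" if v_small > v_large else "large"


small, large = (2, 0), (5, 10)  # (amount, delay)
print(preferred(small, large, lead_time=0))   # small
print(preferred(small, large, lead_time=20))  # large
```

Under these assumptions the small reward wins when it is immediately available, but the large reward wins when both are still far off.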

Complexities of the Matching Law

Maximizing law – sometimes the aim is to

obtain as many rewards as possible.

Explains FR-10 vs FR-40 schedules.

Doesn’t work for VI vs VR schedules.

Momentary maximization theory – choose

best alternative at the time.

Delay reduction theory – choose what will get

the reward the fastest.
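The FR-10 vs FR-40 case can be sketched by comparing rewards earned per response. This is a minimal illustration of maximizing between two fixed-ratio schedules only, and the function names are hypothetical.

```python
def rewards_per_response(fr_requirement):
    """On a fixed-ratio schedule, one reward is earned every
    fr_requirement responses."""
    return 1 / fr_requirement


def maximizing_choice(schedules):
    """Return the schedule name with the highest payoff per response."""
    return max(schedules, key=lambda name: rewards_per_response(schedules[name]))


schedules = {"FR-10": 10, "FR-40": 40}
print(maximizing_choice(schedules))  # FR-10
```

A maximizing animal should respond exclusively on FR-10, since every response there earns four times as much as a response on FR-40.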

Aversive Theories: Explaining

Avoidance

The existence of avoidance behavior implies a

cognitive process:

Behaving in order to prevent an aversive event.

Behaviorists like Hull needed to explain this

without cognition.

Mowrer’s two-factor theory was developed to

explain this – but it has problems needing

explanation.

Mowrer’s Two-Factor Theory

Mowrer proposed a drive-based two-factor theory to avoid explaining avoidance using cognitive (mentalistic) concepts.

Avoidance involves two stages:

Fear is classically conditioned to the environmental conditions preceding an aversive event.

Cues evoke fear -- an instrumental response occurs to terminate the fear.

Mowrer’s View (Cont.)

We are not actually avoiding an event but

escaping from a feared object (environmental

cue).

Miller’s white/black chamber – rats escaped

the feared white chamber rather than avoiding an

anticipated shock.

Fear reduction rewards the escape behavior.

Criticisms of Two-Factor Theory

Avoidance behavior is extremely resistant to

extinction.

Should extinguish with exposure to CS without

UCS, but does not.

Levis & Boyd found that animals do not get

sufficient exposure duration because their

behavior prevents it.

Avoidance persists if long latency cues exist

closer to the aversive event.

Is Fear Really Present?

When avoidance behavior is well learned, the animals don’t seem to be afraid.

An avoidance CS does not suppress operant responding (no fear).

However, this could mean that the animal’s hunger is stronger than the fear.

Strong fear (drive strength) is not needed if habit strength is large.

Avoidance without a CS

Sidman avoidance task – an avoidance

response delays an aversive event for a period

of time.

There is no external cue to when the aversive

event will occur – just duration. Temporal

conditioning.

How do animals learn to avoid shock without

any external cues for the classical

conditioning of fear?

Kamin’s Findings

Avoidance of the UCS, not just termination of the CS (and the fear) matters in avoidance learning.

Four conditions:

Response ends CS and prevents UCS.

Response ends CS but doesn’t stop UCS.

Response prevents UCS but CS stays.

CS and UCS, response does nothing (control condition).

Both Factors are Important

Termination of the CS and avoidance of the UCS both produce greater learning.

D’Amato’s Acquired Motive View

D’Amato proposed that both pain and relief motivate avoidance.

Anticipatory pain & relief responses.

Shock elicits an unconditioned pain response (R_P), whose stimulus consequences (S_P) motivate escape.

Classically conditioned cues elicit an anticipatory pain response (r_P), whose stimulus consequences (s_P) motivate escape from the CS.

Anticipatory Relief Response

Termination of the UCS produces an unconditioned relief response (R_R) with stimulus consequences (S_R).

Conditioned cues elicit an anticipatory relief response (r_R) with stimulus consequences (s_R).

Example: a dog bite elicits a pain response, the sight of the dog elicits anticipatory pain, and the house elicits relief.

A Discriminative Cue is Needed

During trace conditioning no cue is present when UCS occurs and no avoidance learning occurs.

A second cue presented during avoidance behavior slowly acquires r_R-s_R conditioning.

Similarly, in a Sidman task, cues predict relief -- associated with avoidance behavior, not the UCS.

A Second Cue Helps Trace Learning

Group TS saw a second cue associated with termination of shock.

Thorndike’s Negative Law of Effect

Thorndike suggested that punishment

weakens an S-R bond.

Skinner’s finding that suppression of behavior is

temporary contradicts this.

The effect of punishment must be something

different than weakening of the S-R bond.

Guthrie’s View of Punishment

When punishment occurs, the response to it is

conditioned to the environment during the

event.

Freezing, jumping, flinching.

The effect on behavior depends on the UCR

elicited by the shock.

Shock to forepaws inhibits running but a shock to

hindpaws facilitates it.

Monkeys struggle more when shocked.

Guthrie’s Competing Response Theory

Guthrie suggested that punishment works

only if the response elicited by the

punishment is incompatible with the punished

behavior.

Gerbils punished for standing upright do it more,

not less.

Problems with Guthrie’s Theory

Response competition alone is insufficient to

make punishment effective.

When punishment is contingent instead of just

co-occurring, it is more effective.

Contingent means the punishment happens only

when the behavior occurs, not randomly or

independently of it.

Estes’s Motivational View

When a behavior is rewarded, the

motivational system becomes associated with

the behavior.

The response occurs the next time the

motivational system is activated.

Punishment works by changing the motives.

Stimuli associated with punishment inhibit the

motivational state.

Support for Estes

Thirsty rats were trained to lever press for

water and “dry lick” for air on alternate days.

Punishment of both behaviors had a greater effect

on dry licking (a thirst-related behavior) than

lever pressing.

If the behavior rather than the motive were being

suppressed no such difference should occur.

Results differed with hungry rats.