Copyright © 2008 Psychology Press http://www.psypress.com/animal-learning-and-cognition
Animal Learning & Cognition
Animal Learning & Cognition: An Introduction
Third Edition
John M. Pearce
Cardiff University
Published in 2008
by Psychology Press
27 Church Road, Hove, East Sussex, BN3 2FA

Simultaneously published in the USA and Canada
by Psychology Press
270 Madison Ave, New York, NY 10016
www.psypress.com
Psychology Press is an imprint of the Taylor & Francis Group, an informa business
© 2008 Psychology Press
All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
Pearce, John M.
Animal learning and cognition: an introduction / John M. Pearce.
p. cm.
Includes bibliographical references and index.
ISBN 978–1–84169–655–3—ISBN 978–1–84169–656–0
1. Animal intelligence. I. Title.
QL785.P32 2008
591.5'13—dc22
2007034019

ISBN: 978–1–84169–655–3 (hbk)
ISBN: 978–1–84169–656–0 (pbk)

Typeset by Newgen Imaging Systems (P) Ltd, Chennai, India
Printed and bound in Slovenia
For Victoria
Contents
Preface ix
1 The study of animal intelligence 2
The distribution of intelligence 4
Defining animal intelligence 12
Why study animal intelligence? 16
Methods for studying animal intelligence 20
Historical background 22

2 Associative learning 34
Conditioning techniques 36
The nature of associative learning 42
Stimulus–stimulus learning 49
The nature of US representations 52
The conditioned response 55
Concluding comment: the reflexive nature of the conditioned response 60

3 The conditions for learning: Surprise and attention 62
Part 1: Surprise and conditioning 64
Conditioning with a single CS 64
Conditioning with a compound CS 68
Evaluation of the Rescorla–Wagner model 72
Part 2: Attention and conditioning 74
Wagner's theory 76
Stimulus significance 80
The Pearce–Hall theory 86
Concluding comments 91

4 Instrumental conditioning 92
The nature of instrumental learning 93
The conditions of learning 97
The performance of instrumental behavior 106
The Law of Effect and problem solving 111

5 Extinction 122
Extinction as generalization decrement 123
The conditions for extinction 125
Associative changes during extinction 134
Are trials important for Pavlovian extinction? 142

6 Discrimination learning 148
Theories of discrimination learning 149
Connectionist models of discrimination learning 161
Metacognition and discrimination learning 166

7 Category formation 170
Examples of categorization 171
Theories of categorization 173
Abstract categories 179
Relationships as categories 180
The representation of knowledge 188

8 Short-term retention 190
Methods of study 191
Forgetting 199
Theoretical interpretation 202
Serial position effects 206
Metamemory 207

9 Long-term retention 212
Capacity 214
Durability 215
Theoretical interpretation 218
Episodic memory 225

10 Time, number, and serial order 232
Time 233
Number 243
Serial order 253
Transitive inference 259
Concluding comments 262

11 Navigation 264
Part 1: Short-distance travel 265
Methods of navigation 265
Part 2: Long-distance travel 283
Navigational cues 284
Homing 286
Migration 289
Concluding comments 293

12 Social learning 296
Diet selection and foraging 298
Choosing a mate 301
Fear of predators 301
Copying behavior: mimicry 302
Copying behavior: imitation 304
Theory of mind 312
Self-recognition 319
Concluding comments 324

13 Animal communication and language 326
Animal communication 327
Communication and language 336
Can an ape create a sentence? 339
Language training with other species 350
The requirements for learning a language 356

14 The distribution of intelligence 360
Intelligence and brain size 361
The null hypothesis 364
Intelligence and evolution 369

References 373
Author index 403
Subject index 411
Preface
In preparing the third edition of this book, my aim, as it was for the previous editions,
has been to provide an overview of what has been learned by pursuing one particular
approach to the study of animal intelligence. It is my belief that the intelligence of
animals is the product of a number of mental processes. I think the best way of
understanding these processes is by studying the behavior of animals in an
experimental setting. This book, therefore, presents what is known about animal
intelligence by considering experimental findings from the laboratory and from more
naturalistic settings.
I do not attach any great importance to the distinction between animal learning
and animal cognition. Research in both areas has the common goal of elucidating the
mechanisms of animal intelligence and, very often, this research is conducted using
similar procedures. If there is any significance to the distinction, then it is that the fields of animal learning and animal cognition are concerned with different aspects
of intelligence. Chapters 2 to 6 are concerned predominantly with issues that fall
under the traditional heading of animal learning theory. My main concern in these
chapters is to show how it is possible with a few simple principles of associative
learning to explain a surprisingly wide range of experimental findings. Readers
familiar with the previous edition will notice that apart from a new chapter devoted
to extinction, there are relatively few changes to this part of the book. This lack of
change does not mean that researchers are no longer actively investigating the basic
learning processes in animals. Rather, it means that the fundamental principles of
learning are now reasonably well established and that current research is directed
towards issues that are too advanced to be considered in an introductory textbook.
The second half of the book covers material that is generally treated under the
heading of animal cognition. My overall aim in these chapters is to examine what has
been learned from studying animal behavior about such topics as memory, the
representation of knowledge, navigation, social learning, communication, and
language. I also hope to show that the principles developed in the earlier chapters are of relevance to understanding research that is reviewed in the later chapters. It is in
this part of the book that the most changes have been made. Research on animal
cognition during the last 10 years has headed in many new directions. I have tried to
present a clear summary of this research, as well as a balanced evaluation of its
theoretical implications.
Those who wish to study the intelligence of animals face a daunting task. Not
only are there numerous different species to study, but there is also an array of
intellectual skills to be explored, each posing a unique set of challenging theoretical
problems. As a result, many of the topics that I discuss are still in their infancy. Some
readers may therefore be disappointed to discover that we are still trying to answer
many of the interesting questions that can be asked about the intelligence of animals. On the other hand, it is just this lack of knowledge that makes the study of animal
learning and cognition so exciting. Many fascinating discoveries remain to be made
once the appropriate experiments have been conducted.
One of the rewards for writing a book is the opportunity it provides to thank the
many friends and colleagues who have been so generous with the help they have
given me. The way in which this book is organized and much of the material it
contains have been greatly influenced by numerous discussions with A. Dickinson,
G. Hall, N. J. Mackintosh, and E. M. Macphail. Different chapters have benefited
greatly from the critical comments on earlier versions by A. Aydin, N. Clayton,
M. Haselgrove, C. Heyes, V. LoLordo, A. McGregor, E. Redhead, and P. Wilson.
A special word of thanks is due to Dave Lieberman, whose thoughtful comments on an earlier draft of the present edition identified numerous errors and helped to clarify
the manner in which much of the material is presented. The present edition has
also greatly benefited from the detailed comments on the two previous editions by
N. J. Mackintosh.
I should also like to express my gratitude to the staff at Psychology Press.
Without the cajoling and encouragement of the Assistant Editor, Tara Stebnicky, it is
unlikely that I would have embarked on this revision. I am particularly grateful to the
Production Editor, Veronica Lyons, who, with generous amounts of enthusiasm and
imagination, has done a wonderful job in trying to transform a sow’s ear into a silk
purse. Thanks are also due to the colleagues who were kind enough to send me
photographs of their subjects while they were being tested. Finally, there is the
pleasure of expressing gratitude to Victoria, my wife, who once again patiently
tolerated the demands made on her while this edition was being prepared. In previous
editions I offered similar thanks to my children, but there is no need on this occasion
now that they have left home. Even so, Jess, Alex, and Tim would never forgive me
if I neglected to mention their names.
While preparing for this revision I read a little about Darwin’s visit to the
Galapagos Islands. I was so intrigued by the influence they had on him that I felt
compelled to visit the islands myself. During the final stages of preparing this
edition, Veronica and Tania, somewhat reluctantly, allowed me a two-week break to
travel to the Galapagos Islands. The holiday was one of the highlights of my life.
The sheer number of animals, and their absolute indifference to the presence of
humans, was overwhelming. The picture on the previous page shows me trying
unsuccessfully to engage a giant tortoise in conversation. This, and many other
photographs, were taken without any elaborate equipment and thus reveal how the
animals allowed me to approach as close as I wished in order to photograph them.
I came away from the islands having discovered little that is new about the
intelligence of animals, but with a deeper appreciation of how the environment shapes not only their form, but also their behavior.
John M. Pearce
October, 2007
CHAPTER 4

CONTENTS
The nature of instrumental learning 93
The conditions of learning 97
The performance of instrumental behavior 106
The Law of Effect and problem solving 111
4 Instrumental conditioning
Behavior is affected by its consequences. Responses that lead to reward are repeated,
whereas those that lead to punishment are withheld. Instrumental conditioning refers
to the method of using reward and punishment in order to modify an animal’s
behavior. The first laboratory demonstration of instrumental conditioning was
provided by Thorndike (1898) who, as we saw in Chapter 1, trained cats to make a
response in order to escape from a puzzle box and earn a small amount of fish. Since
this pioneering work, there have been many thousands of successful demonstrations
of instrumental conditioning, employing a wide range of species, and a variety of
experimental designs. Skinner, for example, taught two pigeons, by means of
instrumental conditioning, to play ping-pong with each other.

From the point of view of understanding the mechanisms of animal intelligence,
three important issues are raised by a successful demonstration of instrumental
conditioning. We need to know what information an animal acquires as a result of its
training. Pavlovian conditioning was shown to promote the growth of
stimulus–stimulus associations, but what sort of associations develop when a response
is followed by a reward or punishment? Once the nature of the associations formed
during instrumental conditioning has been identified, we then need to specify the
conditions that promote their growth. Surprise, for example, is important for successful
Pavlovian conditioning, but what are the necessary ingredients to ensure the success of
instrumental conditioning? Finally, we need to understand the factors that determine
when, and how vigorously, an instrumental response will be performed.

Before turning to a detailed discussion of these issues, we must be clear what is meant by the term reinforcer. This term refers to the events that result in the strengthening of an instrumental response. These events are classified as either positive reinforcers, when they consist of the delivery of a stimulus, or negative reinforcers, when they involve the removal of a stimulus.
THE NATURE OF INSTRUMENTAL LEARNING
Historical background
Thorndike (1898) was the first to propose that instrumental conditioning is based on
learning about responses. According to his Law of Effect, when a response is
followed by a reinforcer, then a stimulus–response (S–R) connection is strengthened.
In the case of a rat that must press a lever for food, the stimulus might be the lever
itself and the response would be the action of pressing the lever. Each successful
lever press would thus serve to strengthen a connection between the sight of the lever
and the response of pressing it. As a result, whenever the rat came across the lever in
the future, it would be likely to press it and thus gain reward. This analysis of
instrumental conditioning has formed the basis of a number of extremely influential
theories of learning (e.g. Hull, 1943).
KEY TERM
Reinforcer
An event that increases the probability of a response when presented after it. If the event is the occurrence of a stimulus, such as food, it is referred to as a positive reinforcer; but if the event is the removal of a stimulus, such as shock, it is referred to as a negative reinforcer.
A feature of the Law of Effect that has proved unacceptable to the intuitions of
many psychologists is that it fails to allow the animal to anticipate the goal for which
it is responding. The only knowledge that an S–R connection permits an animal to
possess is the knowledge that it must make a particular response in the presence of a
given stimulus. The delivery of food after the response will, according to the Law of
Effect, effectively come as a complete surprise to the animal. In addition to sounding
implausible, this proposal has for many years conflicted with a variety of experimental findings.
One early finding is reported by Tinklepaugh (1928), who required monkeys to
select one of two food wells to obtain reward. On some trials the reward was a
banana, which was greatly preferred to the other reward, a lettuce leaf. Once the
animals had been trained they were occasionally presented with a lettuce leaf when
they should have received a banana. The following quote, which is cited in
Mackintosh (1974), provides a clear indication that the monkey expected a more
attractive reward for making the correct response (Tinklepaugh, 1928, p. 224):
She extends her hand to seize the food. But her hand drops to the floor without
touching it. She looks at the lettuce but (unless very hungry) does not touch it.
She looks around the cup and behind the board. She stands up and looks under
and around her. She picks the cup up and examines it thoroughly inside and out.
She had on occasion turned toward the observers present in the room and
shrieked at them in apparent anger.
A rather different type of finding that shows
animals anticipate the rewards for which they are
responding can be found in experiments in which rats
ran down an alley, or through a maze, for food. If a
rat is trained first with one reward which is then
changed in attractiveness, there is a remarkably rapid
change in its performance on subsequent trials. Elliott
(1928) found that the number of errors in a multiple-
unit maze increased dramatically when the quality of
reward in the goal box was reduced. Indeed, the
animals were so dejected by this change that they
made more errors than a control group that had been
trained throughout with the less attractive reward
(Figure 4.1). According to S–R theory, the change in
performance by the experimental group should have
taken place more slowly, and should not have resulted
in less accurate responding than that shown by the control group. These findings instead imply that the animals had some expectancy of the reward they would receive in the goal box, which allowed them to detect when it was made less attractive.
Tolman (1932) argued that findings such as these indicate that rats form
R–unconditioned stimulus (US) associations as a result of instrumental conditioning.
They are assumed to learn that a response will be followed by a particular outcome.
There is no doubt that the results are consistent with this proposal, but they do not
force us to accept it. Several S–R theorists have pointed out that the anticipation of
reward could have been based on conditioned stimulus (CS)–US, rather than R–US
FIGURE 4.1 The mean number of errors made by two groups of rats in a multiple-unit maze. For the first nine trials the reward for the control group was more attractive than for the experimental group, but for the remaining trials both groups received the same reward (adapted from Elliott, 1928). [Line graph: percent errors plotted against trials for the experimental and control groups.]
associations. In Elliott’s (1928) experiment, for example, the animal consumed the
reward in the goal box. It is possible that the stimuli created by this part of the
apparatus served as a CS that became associated with food. After a number of
training trials, therefore, the sight of the goal box would activate a representation of
the reward and thereby permit the animal to detect when its value was changed. Both
Hull (1943) and Spence (1956) seized on this possibility and proposed that the
strength of instrumental responding is influenced by the Pavlovian properties of the context in which the response is performed.
The debate between S–R theorists and what might be called the expectancy
(R–US) theorists continued until the 1970s (see for example, Bolles, 1972). In the
last 20 years or so, however, experiments have provided new insights into the
nature of the associations that are formed during instrumental conditioning. To
anticipate the following discussion, these experiments show that both the S–R and
the expectancy theorists were correct. The experiments also show that these theorists
underestimated the complexity of the information that animals can acquire in even
quite simple instrumental conditioning tasks.
Evidence for R–US associations
To demonstrate support for an expectancy theory of instrumental conditioning,
Colwill and Rescorla (1985) adopted a reinforcer devaluation design (see also
Adams & Dickinson, 1981). A single group of rats was trained in the manner
summarized in Table 4.1. In the first (training) stage of the experiment subjects were
able to make one response (R1) to earn one reinforcer (US1) and another response
(R2) to earn a different reinforcer (US2). The two responses were lever pressing or
pulling a small chain that was suspended from the ceiling, and the two reinforcers
were food pellets or sucrose solution. After a number of sessions of this training, an
aversion was formed to US1 by allowing subjects free access to it and then injecting
them with a mild poison (lithium chloride; LiCl). This treatment was so effective that subjects completely rejected US1 when it was subsequently presented to them. For
the test trials subjects were again allowed to make either of the two responses, but
this time neither response led to the delivery of a reinforcer. The results from the
experiment are shown in Figure 4.2, which indicates that R2 was performed more
vigorously than R1. The figure also shows a gradual decline in the strength of R2,
which reflects the fact that neither response was followed by reward. This pattern of
results can be most readily explained by assuming that during their training rats
formed R1–US1 and R2–US2 associations. They would then be reluctant to perform
R1 in the test phase because of their knowledge that this response produced a
reinforcer that was no longer attractive.
TABLE 4.1 Summary of the training given to a single group of rats in an experiment by Colwill and Rescorla (1985)

Training    Devaluation   Test
R1 → US1    US1 → LiCl    R1 versus R2
R2 → US2

LiCl, lithium chloride; R, response; US, unconditioned stimulus.
KEY TERM
Reinforcer devaluation
A technique in which the positive reinforcer for an instrumental response is subsequently devalued, normally by pairing its consumption with illness.
Evidence for S–R associations
The evidence that instrumental conditioning results in
the development of S–R associations is perhaps less
convincing than that concerning the development of
R–US associations. A re-examination of Figure 4.2
reveals that after the devaluation treatment there remained a tendency to perform R1. This tendency
was sustained even though the response never
resulted in the delivery of a reinforcer and, more
importantly, it was sustained even though the
devaluation training resulted in a complete rejection
of US1. The fact that an animal is willing to make a
response, even though it will reject the reinforcer that
normally follows the response, is just what would be
expected if the original training resulted in the growth
of an S–R connection. In other words, because an S–R connection does not allow an
animal to anticipate the reward it will receive for its responses, once such a
connection has formed the animal will respond for the reward even if it is no longer
attractive. Thus the results of the experiment by Colwill and Rescorla (1985) indicate
that during the course of their training rats acquired both R–US and S–R
associations.
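This contrast between the two kinds of knowledge can be made concrete with a toy simulation. The sketch below is purely illustrative: the agent classes, outcome values, and training loop are hypothetical constructions, implementing only the theoretical logic described above rather than the actual procedure of Colwill and Rescorla (1985).

```python
# Toy contrast between S-R and R-US accounts of instrumental learning.
# An S-R agent stores only a habit strength for each response; an R-US
# agent stores which outcome each response produces and consults that
# outcome's current value before responding.

US_VALUE = {"pellets": 1.0, "sucrose": 1.0}   # current value of each outcome

class SRAgent:
    """Knows only 'make this response'; blind to the outcome's value."""
    def __init__(self):
        self.habit = {}                        # response -> connection strength
    def train(self, response, outcome):
        self.habit[response] = self.habit.get(response, 0.0) + 1.0
    def tendency(self, response):
        return self.habit.get(response, 0.0)

class RUSAgent:
    """Knows which outcome follows each response; checks its value."""
    def __init__(self):
        self.expects = {}                      # response -> expected outcome
    def train(self, response, outcome):
        self.expects[response] = outcome
    def tendency(self, response):
        return US_VALUE[self.expects[response]]

sr, rus = SRAgent(), RUSAgent()
for _ in range(10):                            # stage 1: R1 -> pellets, R2 -> sucrose
    sr.train("R1", "pellets"); sr.train("R2", "sucrose")
    rus.train("R1", "pellets"); rus.train("R2", "sucrose")

US_VALUE["pellets"] = 0.0                      # stage 2: devalue US1 (pellets)

print(sr.tendency("R1"), sr.tendency("R2"))    # habit strengths unaffected
print(rus.tendency("R1"), rus.tendency("R2"))  # R1 now suppressed, R2 intact
```

The residual responding on R1 in Figure 4.2 is what the S–R component alone would predict, while the preference for R2 is what the R–US component alone would predict; the observed pattern reflects both.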
Readers who are struck by the rather low rate at which R1 was performed might
conclude that the S–R connection is normally of little importance in determining
responding. Note, however, that for the test trials there was the opportunity of
performing either R1 or R2. Even a slight preference for R2 would then have a
suppressive effect on the performance of R1. On the basis of the present results,
therefore, it is difficult to draw precise conclusions concerning the relative
contribution of S–R and R–US associations to instrumental responding.
To complicate matters even further, it seems that the relative contribution of S–R
and R–US associations to instrumental behavior is influenced by the training given.
Adams and Dickinson (1981) conducted a series of experiments in which rats had to
press a lever for food. An aversion to the food was then conditioned using a technique
similar to that adopted by Colwill and Rescorla (1985). If a small amount of
instrumental training had been given initially, then subjects showed a marked
reluctance to press the lever in a final test session. But if extensive instrumental training
had been given initially, there was little evidence of any effect at all of the devaluation
treatment. Adams and Dickinson (1981) were thus led to conclude that R–US
associations underlie the acquisition and early stages of instrumental training, but with
extended practice this learning is transformed into an S–R habit. There is some debate
about the reasons for this change in influence of the two associations, or whether it always takes place (see Dickinson & Balleine, 1994).
Evidence for S–(R–US) associations
Animals can thus learn to perform a particular response in the presence of a given stimulus (S–R learning), and they can also learn that a certain reinforcer will follow a response (R–US learning). The next question to ask is whether this information can
be integrated to provide the knowledge that in the presence of a certain stimulus a
certain response will be followed by a certain outcome. Table 4.2 summarizes
FIGURE 4.2 The mean rates at which a single group of rats performed two responses, R1 and R2, that had previously been associated with two different rewards. Before the test sessions, the reward for R1, but not R2, had been devalued. No rewards were presented in the test session (adapted from Rescorla, 1991). [Line graph: mean responses per minute for R1 and R2 across blocks of four minutes.]
the design of an experiment by Rescorla (1991) that was conducted to test this
possibility.
A group of rats first received discrimination training in which a light or a noise
(S1 or S2) was presented for 30 seconds at a time. During each stimulus the rats were
trained to perform two responses (pulling a chain or pressing a lever), which each
resulted in a different reinforcer (food pellets or sucrose solution). The design of
conditioning experiments is rarely simple and, in this case, it was made more difficult by reversing the response–reinforcer relationships for the two stimuli. Thus
in S1, R1 led to US1 and R2 led to US2; but in S2, R1 led to US2 and R2 led to US1.
For the second stage of the experiment, the reinforcer devaluation technique was
used to condition an aversion to US2. Finally, test trials were conducted in extinction
in which subjects were provided with the opportunity of performing the two
responses in the presence of each stimulus. The result from these test trials was quite
clear. There was a marked preference to perform R1, rather than R2, in the presence
of S1; but in the presence of S2 there was a preference to perform R2 rather than R1.
These findings cannot be explained by assuming that the only associations acquired
during the first stage were S–R, otherwise the devaluation technique would have
been ineffective. Nor can the results be explained by assuming that only R–US associations developed, otherwise the devaluation treatment should have weakened R1
and R2 to the same extent in both stimuli. Instead, the results can be most readily
explained by assuming that the subjects were sensitive to the fact that the devalued
reinforcer followed R2 in S1, and followed R1 in S2. Rescorla (1991) has argued that
this conclusion indicates the development of a hierarchical associative structure that
he characterizes as S–(R–US). Animals are first believed to acquire an R–US
association, and this association in its entirety is then assumed to enter into a new
association with S. Whether it is useful to propose that an association can itself enter
into an association remains to be seen. There are certainly problems with this type of
suggestion (see, for example, Holland, 1992). In addition, as Dickinson (1994) points
out, there are alternative ways of explaining the findings of Rescorla (1991). Despite
these words of caution, the experiment demonstrates clearly that animals are able to
anticipate the reward they will receive for making a certain response in the presence
of a given stimulus.
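The hierarchical S–(R–US) structure proposed by Rescorla can be sketched as a nested mapping in which each stimulus gates its own set of response–outcome relations. The sketch below is illustrative only; the dictionary layout and the value-comparison rule are assumptions, not a model taken from the text.

```python
# Hierarchical S-(R-US) knowledge from the design in Table 4.2:
# each stimulus gates its own response -> outcome relations.
knowledge = {
    "S1": {"R1": "US1", "R2": "US2"},
    "S2": {"R1": "US2", "R2": "US1"},
}
us_value = {"US1": 1.0, "US2": 1.0}

def preferred_response(stimulus):
    # Pick the response whose expected outcome is currently worth more.
    relations = knowledge[stimulus]
    return max(relations, key=lambda r: us_value[relations[r]])

us_value["US2"] = 0.0          # devaluation stage: US2 paired with LiCl

print(preferred_response("S1"))   # R1 (its outcome in S1, US1, is still valued)
print(preferred_response("S2"))   # R2 (in S2 it is R2 that yields US1)
```

Note that flat S–R knowledge would leave both responses equally attractive after devaluation, and context-free R–US knowledge would suppress the same response in both stimuli; only the nested structure reproduces the observed reversal of preference between S1 and S2.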
THE CONDITIONS OF LEARNING
There is, therefore, abundant evidence to show that animals are capable of learning
about the consequences of their actions. We turn now to consider the conditions that
enable this learning to take place.
TABLE 4.2 Summary of the training given to a single group of rats in an experiment by Rescorla (1991)

Discrimination training        Devaluation   Test
S1: R1 → US1 and R2 → US2      US2 → LiCl    S1: R1 versus R2
S2: R1 → US2 and R2 → US1                    S2: R2 versus R1

LiCl, lithium chloride; R, response; S, stimulus; US, unconditioned stimulus.
Contiguity
A fundamental principle of the early theories of
learning was that instrumental conditioning is most
effective when the response is contiguous with or, in
other words, followed immediately by the reinforcer.
An early demonstration of this influence of contiguity on instrumental conditioning was made by
Logan (1960), who trained rats to run down an alley
for food. He found that the speed of running was
substantially faster if the rats received food as soon as
they reached the goal box, as opposed to waiting in
the goal box before food was made available. This
disruptive effect of waiting was found with delays
from as little as 3 seconds. Moreover, the speed of
running down the alley was directly related to the
duration of the delay in the goal box. This
effect, which is referred to as the gradient of
delay, has been reported on numerous occasions
(e.g. Dickinson, Watt, & Griffiths, 1992).
It is apparent from Logan’s (1960) study that
even relatively short delays between a response
and a reinforcer disrupt instrumental conditioning.
With this finding established, it becomes pertinent to consider by how much the reinforcer can be delayed before instrumental conditioning is no longer possible. The precise answer remains unknown, but a
study by Lattal and Gleeson (1990) indicates that it
may be greater than 30 seconds. Rats were
required to press a lever for food, which was
delivered 30 seconds after the response. If another
response was made before food was delivered then
the timer was reset and the rat had to wait another
30 seconds before receiving food. This schedule
ensured that the delay between any response and
food was at least 30 seconds. Despite being
exposed to such a demanding method of training,
each of the three rats in the experiment showed an increase in the rate of lever
pressing as training progressed. The results from one rat are shown in Figure 4.3.
The remarkable finding from this experiment is that rats with no prior experience of lever pressing can increase the rate of performing this response when the only
response-produced stimulus change occurs 30 seconds after a response has
been made.
It should be emphasized that the rate of lever pressing by the three rats was
relatively slow, and would have been considerably faster if food had been
presented immediately after the response. Temporal contiguity is thus important
for instrumental conditioning, but such conditioning is still effective, albeit to a
lesser extent, when there is a gap between the response and the delivery
of reward.
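The gradient of delay is often modeled as a smooth decline in the value of a reinforcer with the time separating it from the response. The hyperbolic form and the discount parameter in the sketch below are assumptions for illustration; the chapter itself establishes only that longer delays weaken responding.

```python
# Illustrative sketch of a delay-of-reinforcement gradient as hyperbolic
# decay, V = A / (1 + k * D). The hyperbolic form and k = 0.5 are
# assumptions for illustration, not values from Logan (1960).

def reinforcer_value(amount, delay, k=0.5):
    """Discounted value of a reinforcer delivered `delay` seconds late."""
    return amount / (1.0 + k * delay)

# Value declines monotonically as the delay in the goal box grows.
values = [round(reinforcer_value(10.0, d), 2) for d in (0, 3, 10, 30)]
print(values)
```

Any monotonically decreasing function would capture the qualitative gradient; the hyperbolic form is merely one common choice in the discounting literature.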
FIGURE 4.3 The mean rate of pressing a lever (responses per minute) across sessions by a single rat when food was presented 30 seconds after a response (adapted from Lattal & Gleeson, 1990).
Temporal contiguity is an important factor in the effectiveness of instrumental conditioning. This golden retriever's obedience training will be much more effective if the owner rewards his dog with a treat straight after the desired response.
KEY TERM
Gradient of delay: The progressive weakening of an instrumental response as a result of increasing the delay between the completion of the response and the delivery of the reinforcer.
Response–reinforcer contingency
We saw in Chapter 3 that the CS–US contingency is
important for Pavlovian conditioning because learning
is more effective when the US occurs only in the
presence of the CS than when the US also occurs both
in the presence and absence of the CS. An experiment by Hammond (1980) makes a similar point for
instrumental behavior, by demonstrating the
importance of the response–reinforcer contingency
for effective conditioning. The training schedule was
quite complex and required that the experimental
session was divided into 1-second intervals. If a
response occurred in any interval then, for three groups
of thirsty rats, water was delivered at the end of the
interval with a probability of 0.12. The results from a
group that received only this training, and no water in
the absence of lever pressing (Group 0), are shown in
the left-hand histogram of Figure 4.4. By the end of
training this group was responding at more than 50
responses a minute. For the remaining two groups,
water was delivered after some of the 1-second
intervals in which a response did not occur. For Group
0.08, the probability of one of these intervals being followed by water was 0.08, whereas
for Group 0.12 this probability was 0.12. The remaining two histograms show the final
response rates for these two groups. Both groups responded more slowly than Group 0,
but responding was weakest in the group for which water was just as likely to be
delivered whether or not a response had been made. The contingency between response
and reinforcer thus influences the rate at which the response will be performed. We now
need to ask why this should be the case. In fact, there are two answers to this question.
One answer is based on a quite different view of instrumental conditioning to that
considered thus far. According to this account, instrumental conditioning will be
effective whenever a response results in an increase in the rate of reinforcement
(e.g. Baum, 1973). Thus there is no need for a response to be followed closely by
reward for successful conditioning, all that is necessary is for the overall probability of
reward being delivered to increase. In other words, the contingency between a response
and reward is regarded as the critical determinant for the outcome of instrumental
conditioning. This position is referred to as a molar theory of reinforcement because
animals are assumed to compute the rate at which they make a response over a
substantial period of time and, at the same time, compute the rate at which reward is
delivered over the same period. If they should detect that an increase in the rate of responding is correlated with an increase in the rate of reward delivery, then the
response will be performed more vigorously in the future. Moreover, the closer the
correlation between the two rates, the more rapidly will the response be performed.
Group 0 of Hammond’s (1980) experiment demonstrated a high correlation between
the rate at which the lever was pressed and the rate at which reward was delivered, and
this molar analysis correctly predicts that rats will learn to respond rapidly on the lever.
In the case of Group 0.12, however, the rate of lever pressing had some influence on
the rate at which reward was delivered, but this influence was slight because the reward
would be delivered even if a rat refused to press the lever. In these circumstances,
FIGURE 4.4 The mean rates of lever pressing for water by three groups of thirsty rats in their final session of training. The groups differed in the probability with which free water was delivered during the intervals between responses. Group 0 received no water during these intervals; Group 0.08 and Group 0.12 received water with a probability of 0.08 and 0.12, respectively, at the end of each period of 1 second in which a response did not occur (adapted from Hammond, 1980).
KEY TERMS
Response–reinforcer contingency: The degree to which the occurrence of the reinforcer depends on the instrumental response. Positive contingency: the frequency of the reinforcer is increased by making the response. Negative contingency: the frequency of the reinforcer is reduced by making the response. Zero contingency: the frequency of the reinforcer is unaffected by making the response.
Molar theory of reinforcement: The assumption that the rate of instrumental responding is determined by the response–reinforcer contingency.
responding is predicted to be slow and again the theory
is supported by the findings.
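One convenient way to summarize Hammond's manipulation is as the difference between the probability of water given a response and the probability of water given no response. Treating this difference (often written ΔP) as the index of contingency is an illustrative assumption here, not a computation Hammond reports.

```python
# Sketch of the contingency in Hammond's (1980) three groups, summarized
# as delta-P = P(water | response interval) - P(water | no-response interval).
# Using delta-P as the summary statistic is our assumption for illustration.

def delta_p(p_given_response, p_given_no_response):
    return p_given_response - p_given_no_response

groups = {"Group 0": 0.0, "Group 0.08": 0.08, "Group 0.12": 0.12}
for name, p_free in groups.items():
    print(name, round(delta_p(0.12, p_free), 2))
# Group 0 has the strongest positive contingency; Group 0.12 has a zero
# contingency, matching its weakest responding in Figure 4.4.
```

The ordering of the three ΔP values mirrors the ordering of the final response rates in Hammond's experiment.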
The molar analysis of instrumental behavior has
received a considerable amount of attention and
generated a considerable body of experimental
research, but there are good reasons for believing that it
may be incorrect. In an experiment by Thomas (1981), rats in a test chamber containing a lever were given a
free pellet of food once every 20 seconds, even if they
did nothing. At the same time, if they pressed the lever
during any 20-second interval then the pellet was
delivered immediately, and the pellet at the end of the
interval was cancelled. Subsequent responses during
the remainder of the interval were without effect. This
treatment ensured that rats received three pellets of
food a minute whether or not they pressed the lever. Thus lever pressing in this
experiment did not result in an increase in the rate of food delivery and, according to the
molar point of view, the rate of making this response should not increase. The mean rate of responding during successive sessions is shown for one rat in Figure 4.5. Although
the rat took some time to press the lever, it eventually pressed at a reasonably high rate.
A similar pattern of results was observed with the other rats in the experiment, which
clearly contradicts the prediction drawn from a molar analysis of instrumental behavior.
Thomas (1981) reports a second experiment, the design of which was much the
same as for the first experiment, except that lever pressing not only resulted in the
occasional, immediate delivery of food but also in an overall reduction of food by
postponing the start of the next 20-second interval by 20 seconds. On this occasion,
the effect of lever pressing was to reduce the rate at which food was delivered, yet
each of six new rats demonstrated an increase in the rate of lever pressing as training
progressed. The result is the opposite of that predicted by a molar analysis of instrumental conditioning.
Although molar theories of instrumental behavior (e.g. Baum, 1973) are ideally
suited to explaining results such as those reported by Hammond (1980), it is difficult
to see how they can overcome the problem posed by the findings of Thomas (1981). It
is therefore appropriate to seek an alternative explanation for the influence of the
response–reinforcer contingency on instrumental conditioning. One alternative, which
by now should be familiar, is that instrumental conditioning depends on the formation
of associations. This position is referred to as a molecular theory of reinforcement
because it assumes that the effectiveness of instrumental conditioning depends on
specific episodes of the response being paired with a reinforcer. The results from Thomas's (1981) experiments can be readily explained by a molecular analysis of instrumental
conditioning, because contiguity between a response and a reinforcer is regarded as the
important condition for successful conditioning. Each lever press that resulted in food
would allow an association involving the response to gain in strength, which would
then encourage more vigorous responding as training progressed.
At first glance, Hammond’s (1980) results appear to contradict a molecular
analysis because the response was paired with reward in all three groups and they
would therefore be expected to respond at a similar rate, which was not the case. It
is, however, possible to reconcile these results with a molecular analysis of
instrumental conditioning by appealing to the effects of associative competition, as
the following section shows.
FIGURE 4.5 The total number of lever presses recorded in each session for a rat in the experiment by Thomas (1981).
KEY TERM
Molecular theory of reinforcement: The assumption that the rate of instrumental responding is determined by response–reinforcer contiguity.
Associative competition
If two stimuli are presented together for Pavlovian conditioning, the strength of the
conditioned response (CR) that each elicits when tested individually is often weaker
than if they are presented for conditioning separately. This overshadowing effect is
explained by assuming the two stimuli are in competition for associative strength so
that the more strength acquired by one, the less is available for the other (Rescorla & Wagner, 1972). Overshadowing is normally assumed to take place between stimuli
but, if it is accepted that overshadowing can also occur between stimuli and
responses that signal the same reinforcer, then it is possible for molecular theories of
instrumental behavior to explain the contingency effects reported by Hammond
(1980). In Group 0 of his study, each delivery of water would strengthen a
lever-press–water association and result eventually in rapid lever pressing. In the
other groups, however, the delivery of free water would allow the context to enter
into an association with this reinforcer. The delivery of water after a response will
then mean that it is signaled by both the context and the response, and theories of
associative learning predict that the context–water association will restrict, through
overshadowing, the growth of the response–water association. As the strength of the
response–water association determines the rate at which the response is performed,
responding will be slower when some free reinforcers accompany the instrumental
training than when all the reinforcers are earned. Furthermore, the more often that
water is delivered free, the stronger will be the
context–water association and the weaker will be the
response–water association. Thus the pattern of
results shown in Figure 4.4 can be explained by a
molecular analysis of instrumental conditioning,
providing it is assumed that responses and stimuli
compete with each other for their associative
strength. The results from two different experiments
lend support to this assumption.
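The overshadowing account can be illustrated with a minimal Rescorla–Wagner simulation in which the response and the context compete for associative strength. The learning-rate, asymptote, and trial-mix values below are illustrative assumptions; only the qualitative prediction, that free reinforcers weaken the response–water association, comes from the text.

```python
# Minimal Rescorla-Wagner sketch of response-context competition for the
# water reinforcer. alpha, lam, and the trial mix are illustrative
# assumptions; only the qualitative ordering is taken from the text.

def train(n_free_per_earned, trials=2000, alpha=0.1, lam=1.0):
    v_response, v_context = 0.0, 0.0
    for t in range(trials):
        if n_free_per_earned and t % (n_free_per_earned + 1) != 0:
            # Free water: the context alone is paired with the reinforcer.
            v_context += alpha * (lam - v_context)
        else:
            # Earned water: response and context signal it together and
            # share the available associative strength.
            error = lam - (v_response + v_context)
            v_response += alpha * error
            v_context += alpha * error
    return v_response

no_free = train(0)    # all reinforcers earned, as in Group 0
some_free = train(1)  # free and earned reinforcers alternate
assert no_free > some_free  # context overshadows the response-water link
```

With no free water the response–water association grows steadily, but when free deliveries strengthen the context the prediction error on earned trials shrinks, leaving the response with little associative strength.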
The first experiment directly supports the claim
that overshadowing is possible between stimuli and
responses. Pearce and Hall (1978) required rats to
press a lever for food on a variable interval schedule,
in which only a few responses were followed by
reward. For an experimental group, each rewarded
response was followed by a brief burst of white noise
before the food was delivered. The noise, which
accompanied only rewarded responses, resulted in a
substantially lower rate of lever pressing by the
experimental than by control groups that received either similar exposure to the noise (but after nonrewarded responses) or no exposure to the noise at all (Figure 4.6).
Geoffrey Hall and I argued that the most plausible explanation for these findings is
that instrumental learning involves the formation of R–US associations and that these
were weakened through overshadowing by a noise–food association that developed
in the experimental group.
The second source of support for a molecular analysis of the effect of
contingency on instrumental responding can be found in contingency experiments in
which a brief stimulus signals the delivery of each free reinforcer. The brief stimulus
should itself enter into an association with the reinforcer and thus overshadow the
FIGURE 4.6 The mean rates of lever pressing by three groups of rats that received a burst of noise after each rewarded response (Corr), after some nonrewarded responses (Uncorr), or no noise at all (Food alone) (adapted from Pearce & Hall, 1978).
development of an association between the context and the reinforcer. Whenever a
response is followed by the reinforcer it will now be able to enter a normal R–US
association, because of the lack of competition from the context. Responding in these
conditions should thus be more vigorous than if the free US is not signaled. In
support of this argument, both Hammond and Weinberg (1984) and Dickinson and
Charnock (1985) have shown that free reinforcers disrupt instrumental responding to
a greater extent when they are unsignaled than when they are signaled. These findings make a particularly convincing case for the belief that competition for
associative strength is an important influence on the strength of an instrumental
response. They also indicate that this competition is responsible for the influence of
the response–reinforcer contingency on the rate of instrumental responding.
The nature of the reinforcer
Perhaps the most important requirement for successful instrumental conditioning is
that the response is followed by a reinforcer. But what makes a reinforcer? In nearly
all the experiments that have been described thus far, the reinforcer has been food for a hungry animal, or water for a thirsty animal. As these stimuli are of obvious
biological importance, it is hardly surprising to discover that animals are prepared to
engage in an activity such as lever pressing in order to earn them. However, this does
not mean that a reinforcer is necessarily a stimulus that is of biological significance
to the animal. As Schwartz (1989) notes, animals will press a lever to turn on a light,
and it is difficult to imagine the biological need that is satisfied on these occasions.
Thorndike (1911) was the first to appreciate the need to identify the defining
characteristics of a reinforcer, and his solution was contained within the Law of
Effect. He maintained that a reinforcer was a stimulus that resulted in a satisfying
state of affairs. A satisfying state of affairs was then defined as “ . . . one which the
animal does nothing to avoid, often doing things which maintain or renew it”
(Thorndike, 1913, p. 2). In other words, Thorndike effectively proposed that a
stimulus would serve as a reinforcer (increase the likelihood of a response) if animals
were willing to respond in order to receive that stimulus. The circularity in this
definition should be obvious and has served as a valid source of criticism of the Law
of Effect on more than one occasion (e.g. Meehl, 1950). Thorndike was not alone in
providing a circular definition of a reinforcer. Skinner has been perhaps the most
blatant in this respect, as the following quotation reveals (Skinner, 1953, pp. 72–73):
The only way to tell whether or not a given event is reinforcing to a given
organism under given conditions is to make a direct test. We observe the
frequency of a selected response, then make an event contingent upon it and
observe any change in frequency. If there is a change, we classify the event as
reinforcing.
To be fair, for practical purposes this definition is quite adequate. It provides a
useful and unambiguous terminology. At the same time, once we have decided that
a stimulus, such as food, is a positive reinforcer, then we can turn to a study of a
number of issues that are important to the analysis of instrumental learning. For
instance, we have been able to study the role of the reinforcer in the associations that
are formed during instrumental learning, without worrying unduly about what it is
that makes a stimulus a reinforcer. But the definitions offered by Thorndike and
Skinner are not very helpful if a general statement is
being sought about the characteristics of a stimulus
that dictate whether or not it will function as a
reinforcer. And the absence of such a general
statement makes our understanding of the conditions
that promote instrumental learning incomplete.
A particularly elegant solution to the problem of deciding whether a stimulus will function as a
reinforcer is provided by the work of Premack (1959,
1962, 1965), who put forward what is now called the
Premack principle. He proposed that reinforcers
were not stimuli but opportunities to engage in
behavior. Thus the activity of eating, not the stimulus
of food, should be regarded as the reinforcer when an
animal has been trained to lever press for food. To
determine if one activity will serve as the reinforcer
for another activity, Premack proposed that the
animal should be allowed to engage freely in both activities. For example, a rat might be placed into a
chamber containing a lever and some food pellets. If
it shows a greater willingness to eat the food than to press the lever, then we can
conclude that the opportunity to eat will reinforce lever pressing, but the opportunity
to lever press will not reinforce eating.
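The decision rule of the Premack principle can be sketched directly; the baseline durations below are hypothetical numbers for illustration, not data from an actual preference test.

```python
# Sketch of the Premack principle: activity A will reinforce activity B
# when A is the more probable activity in a free-baseline test.
# The baseline times are hypothetical values for illustration.

def will_reinforce(baseline_a, baseline_b):
    """True if access to activity A should reinforce activity B."""
    return baseline_a > baseline_b

baseline = {"eating": 300, "lever pressing": 20}  # seconds in free test
assert will_reinforce(baseline["eating"], baseline["lever pressing"])
assert not will_reinforce(baseline["lever pressing"], baseline["eating"])
```

The rule is deliberately relative: swapping the baseline values reverses which activity can act as the reinforcer, which is exactly the point Premack makes with the running-wheel experiment described next.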
It is perhaps natural to think of the properties of a reinforcer as being absolute.
That is, if eating is an effective reinforcer for one response, such as lever pressing,
then it might be expected to serve as a reinforcer for any response. But Premack
(1965) has argued this assumption is unjustified. An activity will only be reinforcing
if subjects would rather engage in it than in the activity that is to be reinforced. To
demonstrate this relative property of a reinforcer, Premack (1971a) placed rats into a running wheel, similar to the one sketched in Figure 4.7, for 15 minutes a day.
When the rats were thirsty, they preferred to drink rather than to run in the wheel,
but when they were not thirsty, they preferred to run rather than to drink. For the test
phase of the experiment, the wheel was locked and the rats had to lick the drinking
tube to free it and so gain the opportunity to run for 5 seconds. Running is not
normally regarded as a reinforcing activity but because rats that are not thirsty prefer
to run rather than drink, it follows from Premack’s (1965) argument that they should
increase the amount they drink in the wheel in order to earn the opportunity to run.
Conversely, running would not be expected to reinforce drinking for thirsty rats,
because in this state of deprivation they prefer drinking to running. In clear support
of this analysis, Premack (1971a) found that running could serve as a reinforcer for
drinking, but only with rats that were not thirsty.
As Allison (1989) has pointed out, Premack’s proposals can be expressed
succinctly by paraphrasing Thorndike’s Law of Effect. For instrumental conditioning
to be effective it is necessary for a response to be followed not by a satisfying state
of affairs, but by a preferred response. Despite the improvement this change affords
with respect to the problem of defining a reinforcer, experiments have shown that it
does not account adequately for all the circumstances where one activity will serve
as a reinforcer for another.
Consider an experiment by Allison and Timberlake (1974) in which rats were
first allowed to drink from two spouts that provided different concentrations of
FIGURE 4.7 A sketch of the apparatus used by Premack (1971a) to determine if being given the opportunity to run could serve as a reinforcer for drinking in rats that were not thirsty; the drinking tube and pneumatic cylinders are labeled (adapted from Premack, 1971a).
KEY TERM
Premack principle: The proposal that activity A will reinforce activity B, if activity A is more probable than activity B.
saccharin solution. This baseline test session revealed a preference for the sweeter
solution. According to Premack’s proposals, therefore, rats should be willing to
increase their consumption of the weaker solution if drinking it is the only means
by which they can gain access to the sweeter solution. By contrast, rats should not
be willing to increase their consumption of the sweeter solution to gain access to the
weaker one. To test this second prediction, rats were allowed to drink from the spout
supplying the sweeter solution and, after every 10 licks, they were permitted one lick at the spout offering the less-sweet solution. This 10:1 ratio meant that,
relative to the amount of sweet solution consumed, the rats received less of the
weaker solution than they chose to consume in the baseline test session. As a
consequence of this constraint imposed by the experiment, Allison and Timberlake
(1974) found that rats increased their consumption of the stronger solution. It is
important to emphasize that this increase occurred in order to allow the rats to gain
access to the less preferred solution, which, according to Premack’s theory, should
not have taken place.
Timberlake and Allison (1974) explained their results in terms of an equilibrium
theory of behavior. They argued that when an animal is able to engage in a variety of
activities, it will have a natural tendency to allocate more time to some than others. The ideal amount of time that would be devoted to an activity is referred to as its bliss
point, and each activity is assumed to have its own bliss point. By preventing an
animal from engaging in even its least preferred activity, it will be displaced from the
bliss point and do its best to restore responding to this point.
In the experiment by Allison and Timberlake (1974), therefore, forcing the
subjects to drink much more of the strong than the weak solution meant that they
were effectively deprived of the weak solution. As the only way to overcome this
deficit was to drink more of the sweet solution, this is what they did. Of course, as
the rats approached their bliss point for the consumption of the weak solution, they
would go beyond their bliss point for the consumption of the sweet solution. To cope
with this type of conflict, animals are believed to seek a compromise, or state of equilibrium, in which the amount of each activity they perform will lead them as
close as possible to the bliss points for all activities. Thus the rats completed the
experiment by drinking rather more than they would prefer of the strong solution,
and rather less than they would prefer of the weak solution.
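The equilibrium idea can be sketched as a constrained minimization: the animal chooses the allocation, subject to the 10:1 schedule, that minimizes its distance from the bliss point. The baseline amounts and the squared-distance cost below are hypothetical choices for illustration; only the 10:1 constraint comes from the experiment.

```python
# Sketch of the bliss-point idea in Timberlake and Allison's equilibrium
# theory. The bliss amounts and the squared-distance cost are illustrative
# assumptions; the 10:1 schedule constraint is from the text.

def cost(sweet, weak, bliss_sweet, bliss_weak):
    """Squared distance of an allocation from the bliss point."""
    return (sweet - bliss_sweet) ** 2 + (weak - bliss_weak) ** 2

bliss_sweet, bliss_weak = 100.0, 40.0  # hypothetical baseline licks
# Under the schedule, every 10 licks of the sweet solution buy 1 lick of
# the weak one, so weak = sweet / 10.
best = min(range(0, 500), key=lambda s: cost(s, s / 10, bliss_sweet, bliss_weak))
print(best, best / 10)
```

The compromise allocation lies above the bliss point for the sweet solution and below the bliss point for the weak one, which is the pattern Allison and Timberlake observed.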
By referring to bliss points, we can thus predict when the opportunity to engage
in one activity will serve as a reinforcer for another activity. But this does not mean
that we have now identified completely the circumstances in which the delivery of a
particular event will function as a reinforcer. Some reinforcers do not elicit responses
that can be analyzed usefully by equilibrium theory. Rats will press a lever to receive
stimulation to certain regions of the brain, or to turn on a light, or to turn off an
electric shock to the feet. I find it difficult to envisage how any measure of baseline
activity in the presence of these events would reveal that they will serve as
reinforcers for lever pressing. In the next section we will find that a stimulus that has
been paired with food can reinforce lever pressing in hungry rats. Again, simply by
observing an animal’s behavior in the presence of the stimulus, it is hard to imagine
how one could predict that the stimulus will function as a reinforcer. Our
understanding of the nature of a reinforcer has advanced considerably since
Thorndike proposed the Law of Effect. However, if we wish to determine with
confidence if a certain event will act as a reinforcer for a particular response, at times
there will be no better alternative than to adopt Skinner’s suggestion of testing for
this property directly.
Conditioned reinforcement
The discussion has been concerned thus far with primary reinforcers, that is, with
stimuli that do not need to be paired with another stimulus to function as reinforcers
for instrumental conditioning. There are, in addition, numerous studies that have
shown that even a neutral stimulus may serve as an instrumental reinforcer by virtue
of being paired with a primary reinforcer. An experiment by Hyde (1976) provides a good example of a stimulus acting in this capacity as a conditioned reinforcer. In
the first stage of the experiment, an experimental group of hungry rats had a number
of sessions in which the occasional delivery of food was signaled by a brief tone.
A control group was treated in much the same way except that the tone and food
were presented randomly in respect to each other. Both groups were then given the
opportunity to press the lever to present the tone. The results from the eight sessions
of this testing are displayed in Figure 4.8. Even though no food was presented in this
test phase, the experimental group initially showed a considerable willingness to
press the lever. The superior rate of pressing by the experimental compared to the
control group strongly suggests that pairing the tone with food resulted in it
becoming a conditioned reinforcer.
In the previous experiment, the effect of the conditioned reinforcer was
relatively short lasting, which should not be surprising because it will lose its
properties by virtue of being presented in the absence of food. The effects of
conditioned reinforcers can be considerably more robust if their relationship with
the primary reinforcer is maintained, albeit intermittently. Experiments using token
reinforcers provide a particularly forceful demonstration of how the influence of a
conditioned reinforcer may be sustained in this way. Token reinforcers are typically
small plastic discs that are earned by performing some response, and once earned
they can be exchanged for food. In an experiment by Kelleher (1958), chimpanzees
had to press a key 125 times to receive a single token. When they had collected 50
tokens they were allowed to push them all into a slot to receive food. In this
experiment, therefore, the effect of the token reinforcers was sufficiently strong that
they were able to reinforce a sequence of more than 6000 responses.
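That figure follows directly from the schedule values reported above, as a quick check shows.

```python
# Quick check of the token schedule in Kelleher (1958): 125 key presses
# per token, and 50 tokens per exchange for food.
presses_per_token = 125
tokens_per_exchange = 50
total = presses_per_token * tokens_per_exchange
assert total == 6250  # "more than 6000 responses" per food delivery
```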
FIGURE 4.8 The mean rates of lever pressing for a brief tone by two groups of rats. For the experimental group the tone had previously been paired with food, whereas for the control group the tone and food had been presented randomly in respect to each other (adapted from Hyde, 1976).
KEY TERMS
Conditioned reinforcer: An originally neutral stimulus that serves as a reinforcer through training, usually by being paired with a positive reinforcer.
Token reinforcer: A conditioned reinforcer in the form of a plastic chip that can be held by the subject.
A straightforward explanation for the results of the experiment by Hyde (1976)
is that the tone became an appetitive Pavlovian CS and thus effectively served as a
substitute for food. The results from experiments such as that by Kelleher (1958)
have led Schwartz (1989) to argue that there are additional ways in which
conditioned reinforcers can be effective (see also Golub, 1977):
• They provide feedback that the correct response has been made. Delivering a token after the completion of 125 responses would provide a useful signal that the subject
is engaged in the correct activity.
• Conditioned reinforcers might act as a cue for the next response to be performed.
Kelleher (1958) observed that his chimpanzees often waited for several hours
before making their first response in a session. This delay was virtually eliminated
by giving the subject some tokens at the start of the session, thus indicating that the
tokens acted as a cue for key pressing.
• Conditioned reinforcers may be effective because they help to counteract the
disruptive effects of imposing a long delay between a response and the delivery of
a primary reinforcer. Interestingly, as far as tokens are concerned, this property of the token is seen only when the chimpanzee is allowed to hold it during the delay.
Taken together, these proposals imply that the properties of a conditioned
reinforcer are considerably more complex than would be expected if they were based
solely on its Pavlovian properties.
THE PERFORMANCE OF INSTRUMENTAL BEHAVIOR
The experiments considered so far have been concerned with revealing the knowledge that is acquired during the course of instrumental conditioning. They
have also indicated some of the factors that influence the acquisition of this
knowledge. We turn our attention now to examining the factors that determine the
vigor with which an animal will perform an instrumental response. We have already
seen that certain devaluation treatments can influence instrumental responding, and
so too can manipulations designed to modify the strength of the instrumental
association. But there remain a number of other factors that influence instrumental
behavior. In the discussion that follows we shall consider two of these influences in
some detail: deprivation state and the presence of Pavlovian CSs.
Deprivation
The level of food deprivation has been shown, up to a point, to be directly related to
the vigor with which an animal responds for food. This is true when the response is
running down an alley (Cotton, 1953) or pressing a lever (Clark, 1958). To explain this
relationship, Hull (1943) suggested that motivational effects are mediated by activity
in a drive center. Drive is a central state that is excited by needs and energizes
behavior. It was proposed that the greater the level of drive, the more vigorous will be
the response that the animal is currently performing. Thus, if a rat is pressing a lever
for food, then hunger will excite drive, which, in turn, will invigorate this activity.
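Hull's proposal is often summarized multiplicatively: reaction potential is the product of habit strength and drive. The following sketch is only an illustration of that logic — the function name and the numerical values are ours, chosen for demonstration, not taken from Hull:

```python
# Illustrative sketch of Hull's (1943) multiplicative rule: reaction
# potential E is the product of habit strength H and drive D.
# All numerical values below are arbitrary.

def reaction_potential(habit_strength: float, drive: float) -> float:
    """E = H x D: vigor grows with either habit strength or drive."""
    return habit_strength * drive

# A well-practiced response (H = 0.8) under low vs. high food deprivation:
low_drive = reaction_potential(0.8, 0.2)   # mildly hungry rat
high_drive = reaction_potential(0.8, 0.9)  # very hungry rat
assert high_drive > low_drive  # greater deprivation, more vigorous responding

# Note that drive here is a single nonspecific quantity: any need
# (hunger, thirst, pain) would simply add to D.
```

Because drive is a single nonspecific quantity in this scheme, any need contributes to it — which, as described next, leads the theory into some curious predictions.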
A serious shortcoming of Hull’s (1943) account is the claim that drive is
nonspecific, so that it can be enhanced by an increase in any need of the animal.
A number of curious predictions follow from this basic aspect of his theorizing. For
example, the pain produced by electric shock is assumed to increase drive, so that if
animals are given shocks while lever pressing for food, they should respond more
rapidly than in the absence of shock. By far the most frequent finding is that this
manipulation has the opposite effect of decreasing appetitive instrumental responding (e.g. Boe & Church, 1967). Conversely, the theory predicts that
enhancing drive by making animals hungrier should facilitate the rate at which they
press a lever to escape or avoid shock. Again, it should not be surprising to discover
that generally this prediction is not confirmed. Increases in deprivation have been
found, in this respect, to be either without effect (Misanin & Campbell, 1969) or to
reduce the rate of such behavior (Meyer, Adams, & Worthen, 1969; Leander, 1973).
In response to this problem, more recent theorists have proposed that animals
possess two drive centers: One is concerned with energizing behavior that leads to
reward, the other is responsible for invigorating activity that minimizes contact
with aversive stimuli. These can be referred to, respectively, as the positive and
negative motivational systems. A number of such dual-system theories of motivation have been proposed (Konorski, 1967; Rescorla & Solomon, 1967;
Estes, 1969).
The assumption that there are two motivational systems rather than a single drive
center allows these theories to overcome many of the problems encountered by
Hull’s (1943) theory. For example, it is believed that deprivation states like hunger
and thirst will increase activity only in the positive system, so that a change in
deprivation should not influence the vigor of behavior that minimizes contact with
aversive stimuli such as shock. Conversely, electric shock should not invigorate
responding for food as it will excite only the negative system.
But even this characterization of the way in which deprivation states influence
behavior may be too simple. Suppose that an animal that has been trained to lever press for food when it is hungry is satiated by being granted unrestricted access to
food before it is returned to the conditioning chamber. The account that has just been
developed predicts that satiating the animal will reduce the motivational support for
lever pressing by lowering the activity in the positive system. The animal would thus
be expected to respond less vigorously than one that was still hungry. There is some
evidence to support this prediction (e.g. Balleine, Garner, Gonzalez & Dickinson,
1995), but additional findings by Balleine (1992) demonstrate that dual-system
theories of motivation are in need of elaboration if they are to provide a complete
account of the way in which deprivation states influence responding.
In one experiment by Balleine (1992), two groups of rats were trained to press a
bar for food while they were hungry (H). For reasons that will be made evident
shortly, it is important to note that the food pellets used as the instrumental reinforcer
were different to the food that was presented at all other times in this experiment.
Group H–S was then satiated (S) by being allowed unrestricted access to their normal
food for 24 hours, whereas Group H–H remained on the deprivation schedule.
Finally, both groups were again given the opportunity to press the bar, but responding
never resulted in the delivery of the reinforcer. Because of their different deprivation
states, dual-system theories of motivation, as well as our intuitions, predict that
Group H–H should respond more vigorously than Group H–S in this test session. But
it seems that our intuitions are wrong on this occasion. The mean numbers of responses made by each group in the test session are shown in the two gray histograms on the left-hand side of Figure 4.9, which reveal that both groups responded quite vigorously, and at a similar rate.

KEY TERM
Dual-system theories of motivation: Theories that assume that behavior is motivated by activity in a positive system, which energizes approach to an object, and a negative system, which energizes withdrawal from an object.
The equivalent histograms on the right-hand side of Figure 4.9 show the results
of two further groups from this study, which were trained to lever press for food
while they were satiated by being fed unrestricted food in their home cages. Rats will
learn to respond for food in these conditions, provided that the pellets are of a
different flavor to that of the unrestricted food presented in the home cages. Group S–S was then tested while satiated, whereas Group S–H was tested while hungry.
Once again, and contrary to our intuitions, both groups performed similarly in the
test session despite their different levels of deprivation. When the results of the four
groups are compared, it is evident that the groups that were trained hungry responded
somewhat more on the test trials than those that were trained while they were
satiated. But to labor the point, there is no indication that changing deprivation level
for the test session had any influence on responding.
Balleine’s (1992) explanation for these findings is that the incentive value, or
attractiveness, of the reinforcer is an important determinant of how willing animals
will be to press for it. If an animal consumes a reinforcer while it is hungry, then that
reinforcer may well be more attractive than if it is consumed while the animal is satiated. Thus Group H–H and Group H–S may have responded rapidly in the test
session because they anticipated a food that in the past had proved attractive, because
they had only eaten it while they were hungry. By way of contrast, the slower
responding by Groups S–S and S–H can be attributed to them anticipating food that
in the past had not been particularly attractive, because they had eaten it only while
they were not hungry.
FIGURE 4.9 The mean number of responses made by six groups of rats in an extinction test session. The left-hand letter of each pair indicates the level of deprivation when subjects were trained to lever press for reward—either satiated (S) or hungry (H)—the right-hand letter indicates the deprivation level during test trials. Two of the groups were allowed to consume the reward either satiated, Pre(S), or hungry, Pre(H), prior to instrumental conditioning (adapted from Balleine, 1992). [Histograms plot mean total responses, 0–200, for Groups H–H, H–S, and Pre(S) H–S on the left and Groups S–S, S–H, and Pre(H) S–H on the right.]

This explanation was tested with two additional groups. Prior to the experiment, animals in Group Pre(S) H–S were given reward pellets while they were satiated to
demonstrate that the pellets are not particularly attractive in this deprivation state.
The group was then trained to lever press while hungry and received test trials while
satiated. On the test trials, the subjects should know that, because of their low level
of deprivation, the reward pellets are no longer attractive and they should be reluctant
to press the lever. The results, which are shown in the blue histogram in the left-hand
side of Figure 4.9, confirmed this prediction. The final group to be considered, Group
Pre(H) S–H, was first allowed to eat reward pellets in the home cage while hungry; instrumental conditioning was then conducted while the group was satiated, and the
test trials were conducted while the group was hungry. In contrast to Group S–H and
Group S–S, this group should appreciate that the reward pellets are attractive while
hungry and respond more rapidly than the other two groups during the test trials.
Once again, the results confirmed this prediction—see the blue histogram on the
right-hand side of Figure 4.9.
By now it should be evident that no simple conclusion can be drawn concerning
the way in which deprivation states influence the vigor of instrumental responding.
On some occasions a change in deprivation state is able to modify directly the rate
of responding, as dual-systems theories of motivation predict. On other occasions,
this influence is more indirect, operating by modifying the attractiveness of the reinforcer. An informative account of the way in which these findings may be integrated can be
found in Balleine et al. (1995).
Pavlovian–instrumental interactions
For a long time, theorists have been interested in the way in which Pavlovian CSs
influence the strength of instrumental responses that are performed in their
presence. One reason for this interest is that Pavlovian and instrumental
conditioning are regarded as two fundamental learning processes, and it is
important to appreciate the way in which they work together to determine how an
animal behaves. A second reason was mentioned at the end of Chapter 2, where we saw that Pavlovian CSs tend to elicit reflexive responses that may not always be in
the best interests of the animal. If a Pavlovian CS was also able to modulate the
vigor of instrumental responding, then this would allow it to have a more general,
and more flexible, influence on behavior than has so far been implied. For example,
if a CS for food were to invigorate instrumental responses that normally lead to
food, then such responses would be strongest at a time when they are most needed,
that is, in a context where food is likely to occur. The experiments described in this
section show that Pavlovian stimuli can modulate the strength of instrumental
responding. They also show that there are at least two ways in which this influence
takes place.
Motivational influences
Konorski (1967), it should be recalled from Chapter 2, believed that a CS can excite
an affective representation of the US that was responsible for arousing a preparatory
CR. He further believed that a component of this CR consists of a change in the level
of activity in a motivational system. A CS for food, say, was said to increase activity
in the positive motivational system, whereas a CS for shock should excite the
negative system. If these proposals are correct, then it should be possible to alter the
strength of instrumental responding by presenting the appropriate Pavlovian CS
(see also Rescorla & Solomon, 1967).
An experiment by Lovibond (1983), using a Pavlovian–instrumental transfer
design, provides good support for this prediction. Hungry rabbits were first trained
to operate a lever with their snouts to receive a squirt of sucrose into the mouth. The
levers were then withdrawn for a number of sessions of Pavlovian conditioning in
which a clicker that lasted for 10 seconds signaled the delivery of sucrose. In a final
test stage, subjects were again able to press the lever and, as they were doing so, the
clicker was occasionally operated. The effect of this appetitive CS was to increase the rate of lever pressing both during its presence and for a short while after it was
turned off. A similar effect has also been reported in a study using an aversive US.
Rescorla and LoLordo (1965) found that the presentation of a CS previously paired
with shock enhanced the rate at which dogs responded to avoid shock.
In addition to explaining the findings that have just been described, a further
advantage of dual-system theories of motivation is that they are able to account for
many of the effects of exposing animals simultaneously to both appetitive and aversive
stimuli. For example, an animal may be exposed to one stimulus that signals reward
and another indicating danger. In these circumstances, instead of the two systems
working independently, they are assumed to be connected by mutually inhibitory links,
so that activity in one will inhibit the other (Dickinson & Pearce, 1977).

To understand this relationship, consider the effect of presenting a signal for
shock to a rat while it is lever pressing for food. Prior to the signal, the level of
activity in the positive system will be solely responsible for the rate of pressing.
When the aversive CS is presented, it will arouse the negative system. The existence
of the inhibitory link will then allow the negative system to suppress activity in the
positive system and weaken instrumental responding. As soon as the aversive CS is
turned off, the inhibition will be removed and the original response rate restored. By
assuming the existence of inhibitory links, dual-system theories can provide a very
simple explanation for conditioned suppression. It occurs because the aversive CS
reduces the positive motivational support for the instrumental response.
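The logic of this inhibitory-link account can be captured in a few lines. The sketch below is an informal illustration of our own, with arbitrary activation values; it is not a model taken from Dickinson and Pearce (1977):

```python
# Illustrative sketch of conditioned suppression under a dual-system
# account: the positive system motivates the food-rewarded response,
# and an aversive CS excites the negative system, which inhibits the
# positive system via the mutually inhibitory link. Values are arbitrary.

def response_strength(appetitive_input: float, aversive_input: float,
                      inhibition: float = 1.0) -> float:
    """Net positive-system activity supporting the instrumental response."""
    # The negative system subtracts from positive-system activity
    # in proportion to its own level of excitation.
    return max(0.0, appetitive_input - inhibition * aversive_input)

baseline = response_strength(appetitive_input=0.8, aversive_input=0.0)
during_aversive_cs = response_strength(appetitive_input=0.8, aversive_input=0.5)
after_cs_offset = response_strength(appetitive_input=0.8, aversive_input=0.0)

assert during_aversive_cs < baseline  # conditioned suppression during the CS
assert after_cs_offset == baseline    # responding recovers at CS offset
```

The key design feature is that suppression arises without any new learning about the instrumental response itself: removing the aversive input restores the original response strength, just as turning off the aversive CS restores lever pressing.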
Response-cueing properties of Pavlovian CRs
In addition to modulating activity in motivational systems, Pavlovian stimuli can
influence instrumental responding through a response-cueing process (Trapold &
Overmier, 1972). To demonstrate this point we shall consider an experiment by
Colwill and Rescorla (1988), which is very similar in design to an earlier study by
Kruse, Overmier, Konz, and Rokke (1983).
In the first stage of the experiment, hungry rats received Pavlovian conditioning
in which US1 was occasionally delivered during a 30-second CS. Training was then
given, in separate sessions, in which R1 produced US1 and R2 produced US2. The
two responses were chain pulling and lever pressing, and the two reinforcers were
food pellets and sucrose solution. For the test stage, animals had the opportunity for the first time to perform R1 and R2 in the presence of the CS, but neither response led
to a reinforcer. As Figure 4.10 shows, R1 was performed more vigorously than R2.
The first point to note is that it is not possible to explain these findings by
appealing to the motivational properties of the CS. The CS should, of course,
enhance the level of activity in the positive system. But this increase in activity
should then invigorate R1 to exactly the same extent as R2 because the motivational
support for both responses will be provided by the same, positive, system.
KEY TERM
Pavlovian–instrumental transfer: Training in which a CS is paired with a US and then the CS is presented while the subject is performing an instrumental response.

In developing an alternative explanation for the findings by Colwill and Rescorla (1988), note that instrumental conditioning with the two responses was conducted in separate sessions. Thus R1 was acquired against a
background of presentations of US1 and, likewise,
R2 was acquired against a background of US2
presentations. If we now accept that the training
resulted in the development of S-R associations, it is
conceivable that certain properties of the two rewards
contributed towards the S component of these associations. For example, a memory of US1 might
contribute to the set of stimuli that are responsible for
eliciting R1. When the CS was presented for testing,
it should activate a memory of US1, which in turn
should elicit R1 rather than R2. In other words, the
Pavlovian CS was able to invigorate the instrumental
response by providing cues that had previously
become associated with the instrumental response.
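This response-cueing account amounts to a pair of associative mappings: the CS retrieves a memory of the US it signaled, and that memory in turn cues whichever response was trained in the presence of the same US. The toy illustration below is ours, for exposition only — the dictionaries stand in for learned associations, not for any mechanism proposed by Colwill and Rescorla:

```python
# Toy illustration of response cueing (after Colwill & Rescorla, 1988).
# The CS activates a memory of the US it signaled; that US memory then
# selectively cues the response that was reinforced by the same US.

pavlovian = {"CS": "US1"}        # Pavlovian stage: CS paired with US1
instrumental = {"US1": "R1",     # training stage: R1 reinforced by US1,
                "US2": "R2"}     # R2 reinforced by US2

def cued_response(cs: str) -> str:
    us_memory = pavlovian[cs]        # CS retrieves a memory of US1
    return instrumental[us_memory]   # the US1 memory cues R1, not R2

assert cued_response("CS") == "R1"   # R1 is invigorated at test
```

Notice that a purely motivational account has no way to produce this asymmetry, because both responses draw on the same positive system; the selectivity comes entirely from the stimulus side of the S-R association.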
Concluding comments

The research reviewed so far in this chapter shows that we have discovered a
considerable amount about the associations that are formed during instrumental
conditioning. We have also discovered a great deal about the factors that influence
the strength of instrumental responding. In Chapter 2 a simple memory model was
developed to show how the associations formed during Pavlovian conditioning
influence responding. It would be helpful if a similar model could be developed for
instrumental conditioning, but this may not be an easy task. We would need to take
account of three different associations that have been shown to be involved in
instrumental behavior: S–R, R–US, and S–(R–US). We would also need to take account
of the motivational and response-cueing properties of any Pavlovian CS–US
associations that may develop. Finally, the model would need to explain how changes in deprivation can influence responding. It hardly needs to be said that any model
that is able to take account of all these factors satisfactorily will be complex and
would not fit comfortably into an introductory text. The interested reader is, however,
referred to Dickinson (1994) who shows how much of our knowledge about
instrumental behavior can be explained by what he calls an associative-cybernetic
model. In essence, this model is a more complex version of the dual-system theories
of motivation that we have considered. The reader might also wish to consult
Balleine (2001) for a more recent account of the influence of motivational processes
on instrumental behavior.
Our discussion of the basic processes of instrumental conditioning is now
complete, but there is one final topic to consider in this chapter: whether the principles we have considered can provide a satisfactory account of the problem-solving abilities of animals.
FIGURE 4.10 The mean rates of performing two responses, R1 and R2, in the presence of an established Pavlovian conditioned stimulus (CS). Prior to testing, instrumental conditioning had been given in which the reinforcer for R1 was the same as the Pavlovian unconditioned stimulus (US), and the reinforcer for R2 was different to the Pavlovian US. Testing was conducted in the absence of any reinforcers in a single session (adapted from Colwill & Rescorla, 1988). [Both responses are plotted in mean responses per minute, 0–10, across five blocks of two trials, with R1 performed at the higher rate.]

THE LAW OF EFFECT AND PROBLEM SOLVING

Animals can be said to have solved a problem whenever they overcome an obstacle to attain a goal. The problem may be artificial, such as having to press a lever for
reward, or it might be one that occurs naturally, such as having to locate a new source
of food. Early studies of problem solving in animals were conducted by means of
collecting anecdotes, but this unsatisfactory method was soon replaced by
experimental tests in the laboratory (see Chapter 1). As a result of his experiments,
Thorndike (1911) argued that despite the range of potential problems that can
confront an animal, they are all solved in the same manner. Animals are assumed to
behave randomly until, by trial and error, the correct response is made and reward is forthcoming. To capture this idea, Thorndike (1911) proposed the Law of Effect,
which stipulates that one effect of reward is to strengthen the accidentally occurring
response and to make its occurrence more likely in the future. This account may
explain adequately the way cats learn to escape from puzzle boxes, but is it suitable
for all aspects of problem solving? A number of researchers have argued that animals
are more sophisticated at solving problems than is implied by the Law of Effect. It has
been suggested that they are able to solve problems through insight. It has also been
suggested that animals can solve problems because they have an understanding of the
causal properties of the objects in their environment or, as it is sometimes described,
an understanding of folk physics. We shall consider each of these possibilities.
Insight
An early objector to Thorndike’s (1911) account of problem solving was Kohler
(1925). Thorndike’s experiments were so restrictive, he argued, that they prevented
animals from revealing their capacity to solve problems by any means other than the
most simple. Kohler spent the First World War on the
Canary Islands, where he conducted a number of
studies that were meant to reveal sophisticated
intellectual processes in animals. He is best known
for experiments that, he claimed, demonstrate the
importance of insight in problem solving. Many of
his findings are described in his book The mentality
of apes, which documents some remarkable feats of
problem solving by chimpanzees and other animals.
Two examples should be sufficient to give an
indication of his methodology. These examples
involve Sultan (Figure 4.11), whom Kohler (1925)
regarded as the brightest of his chimpanzees. On one
occasion Sultan was in a cage in which there was
also a small stick. Outside the cage was a longer
stick, which was beyond Sultan’s reach, and even
further away was a reward of fruit (p. 151):
Sultan tries to reach the fruit with the smaller of
the sticks. Not succeeding, he tries a piece of
wire that projects from the netting in his cage,
but that, too, is in vain. Then he gazes about him
(there are always in the course of these tests
some long pauses, during which the animal
scrutinizes the whole visible area). He suddenly
picks up the little stick once more,
goes to the bars directly opposite to the long stick, scratches it towards him with the auxiliary, seizes it and goes with it to the point opposite the objective which he secures. From the moment that his eyes fell upon the long stick, his procedure forms one consecutive whole.

KEY TERM
Insight: An abrupt change in behavior that leads to a problem being solved. The change in behavior is sometimes attributed to a period of thought followed by a flash of inspiration.

FIGURE 4.11 Sultan stacking boxes in an attempt to reach a banana (drawing based on Kohler, 1956).
In the other study, Kohler (1925) hung a piece of fruit from the ceiling of a cage
housing six apes, including Sultan. There was a wooden box in the cage (p. 41):
All six apes vainly endeavored to reach the fruit by leaping up from the ground.
Sultan soon relinquished this attempt, paced restlessly up and down, suddenly
stood still in front of the box, seized it, tipped it hastily straight towards the
objective, but began to climb upon it at a (horizontal) distance of ½ meter and
springing upwards with all his force, tore down the banana.
In both examples there is a period when the animal responds incorrectly; this is
then followed by activity that, as it is reported, suggests that the solution to the
problem has suddenly occurred to the subject. There is certainly no hint in these reports that the problem was solved by trial and error. Does this mean, then, that
Kohler (1925) was correct in his criticism of Thorndike’s (1911) theorizing?
A problem with interpreting Kohler’s (1925) findings is that all of the apes had
played with boxes and sticks prior to the studies just described. The absence of
trial-and-error responding may thus have been due to the previous experience of the
animals. Sultan may, by accident, have learned about the consequences of jumping
from boxes in earlier sessions, and he was perhaps doing no more than acting on the
basis of his previous trial-and-error learning. This criticism of Kohler’s (1925) work
is by no means original. Birch (1945) and Schiller (1952) have both suggested that
without prior experience with sticks and so forth, there is very little reason for
believing that apes can solve Kohler’s problems in the manner just described.
The absence of trial-and-error responses in Kohler's (1925) findings might have been due to the fact that most apes would have had prior experience of playing with sticks.
An amusing experiment by Epstein, Kirshnit, Lanza, and Rubin (1984) also
shows the importance of past experience in problem solving and, at the same time,
raises some important issues concerning the intellectual abilities of animals. Pigeons
were given two different types of training. They were rewarded with food for pushing
a box towards a spot randomly located at the base of a wall of the test chamber.
Pushing in the absence of the spot was never rewarded. They were also trained to
stand on the box when it was fixed to the floor and peck for food at a plastic banana suspended from the ceiling. Attempts to peck the banana when not standing on the
box were never rewarded. Finally, on a test session they were confronted with a novel
situation in which the banana was suspended from the ceiling and the box was placed
some distance from beneath it. Epstein et al. (1984) report that (p. 61):
At first each pigeon appeared to be “confused”; it stretched and turned beneath
the banana, looked back and forth from banana to box, and so on. Then each
subject began rather suddenly to push the box in what was clearly the direction
of the banana. Each subject sighted the banana as it pushed and readjusted the
box as necessary to move it towards the banana. Each subject stopped pushing
it in the appropriate place, climbed and pecked the banana. This quite
remarkable performance was achieved by one bird in 49 sec, which compares
very favorably with the 5 min it took Sultan to solve his similar problem.
There can be no doubt from this study that the prior training of the pigeon played
an important role in helping it solve the problem. Even so, the study clearly reveals
that the pigeons performed on the test session in a manner that extends beyond
trial-and-error responding. The act of pecking the banana might have been acquired
by trial-and-error learning, and so, too, might the act of moving the box around. But
the way in which the box was moved to below the banana does not seem to be
compatible with this analysis.
The description by Epstein et al. (1984) of the pigeons' behavior bears a striking
similarity to Kohler’s (1925) account of Sultan’s reaction to the similar problem. It
might be thought, therefore, that it would be appropriate to account for the pigeons’
success in terms of insight. In truth, this would not be a particularly useful approach
as it really does not offer an account of the way in which the problem was solved.
Other than indicating that the problem was solved suddenly, and not by trial and
error, the term “insight” adds little else to our understanding of these results.
I regret that I find it impossible to offer, with confidence, any explanation for the
findings by Epstein et al. (1984). But one possibility is that during their training with the
blue spot, pigeons learned that certain responses moved the box towards the spot, and
that the box by the spot was a signal for food. The combination of these associations
would then result in them pushing the box towards the spot. During their training with
the banana, one of the things the pigeons may have learned is that the banana is
associated with food. Then, for the test session, although they would be unable to push
the box towards the blue spot, generalization from their previous training might result in
them pushing the box in the direction of another signal for food, the banana.
Causal inference and folk physics
The term “insight” is now rarely used in discussions of problem solving by animals.
As an alternative, it has been proposed that animals have some understanding of
causal relationships and that they can draw inferences based on this understanding to
solve problems. When a problem is encountered, therefore, animals are believed to
solve it through reasoning based on their understanding of the physical and causal
properties of the objects at their disposal. To take the example of Sultan joining two
sticks together to reach food, if he understood that this action would create a longer
stick that would allow him to reach further, he would then be able to solve the
problem in a manner that is considerably more sophisticated than relying on trial anderror. Of course, we have just seen that the studies by Kohler (1925) do not provide
evidence that animals can solve problems in this way, but the results from other
experiments have been taken as evidence that animals are capable of making causal
inferences. The following discussion will focus separately on this work with
primates and birds.
Primates
Premack (1976, pp. 249–261) describes an experiment with chimpanzees in which
a single subject would be shown an array of objects similar to the one in Figure 4.12.
To gain reward, the chimpanzee was required to replace the strange shape in
the upper row with the knife from the lower row. The choice of the knife was
intended to reveal that the ape understood this object causes an apple to be cut in
half. Two of the four subjects that were tested performed consistently well on this
task. They received a novel problem on each trial; thus, their success could not
depend on them solving the problem by associating a given choice with a particular
array of objects. An alternative explanation for the problem shown in Figure 4.12 is
that the apes had repeatedly seen an apple being cut with a knife and they may have
learned to select the object from the lower row that was most strongly associated
with the one from the upper row. Although this explanation will work for many of
the test trials, Premack (1976) argues there were certain occasions where it provides
an implausible explanation for the successful choice. For instance, one trial was
similar to that shown in Figure 4.12 except the apple was replaced by a whole ball
and a ball cut into pieces. Even though the subjects had rarely seen a knife and a
ball together, they still made the correct choice (see also Premack & Premack, 1994,
pp. 354–357).
FIGURE 4.12 Sketch of an array of objects used by Premack (1976) to test for causal inference in chimpanzees (adapted from Premack, 1976).
116 ANIMAL LEARNING & COGNITION
The findings from the experiment are encouraging, but there is some doubt about
how they should be interpreted. The two apes who performed well on the task had
received extensive training on other tasks, and one of them, Sarah, had even been taught
an artificial language (see Chapter 13). Perhaps the extensive training given to the
chimpanzees resulted in them acquiring a rich array of associations that helped them
perform correctly on the tests, without the need to understand the relevant causal
relationships. It is also possible that because of the similarity between the shapes of anapple and a ball, stimulus generalization rather than a causal inference was responsible
for the selection of the knife during the test with the ball. To properly evaluate the results
from the experiment it would be necessary to have a complete account of the training
that was given before it started. It would also be important to have a full description of
the method and results from the experiment itself. Unfortunately, the information that is
available is rather brief and the reader is left in some doubt as to how Premack’s findings
should be interpreted. Another possibility is that because of their extensive training, the
two chimpanzees were able to appreciate causal relationships and to draw inferences
from them in a way that is not open to relatively naïve chimpanzees. Once again, there
is insufficient information available for this possibility to be assessed. There is no
denying that the experiment by Premack has revealed some intriguing findings but,
before its full significance can be appreciated, additional experiments are needed in
order to pursue the issues that have just been raised.
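The stimulus-generalization account mentioned above can be made concrete with a toy simulation. Everything in the sketch is hypothetical: the objects, the learned associations, and the similarity values are invented for illustration and are not taken from Premack's procedure. The point is only that a knife repeatedly paired with a cut apple will also be chosen for a cut ball, with no causal inference involved.

```python
# Illustrative model: choices driven by generalized associative strength.
# All associations and similarity values below are invented for the sketch.

# Associations acquired from past experience: (tool, object) -> strength
learned = {("knife", "apple"): 1.0, ("pencil", "paper"): 1.0}

# Assumed perceptual similarity between objects (0-1, hypothetical)
similarity = {("ball", "apple"): 0.8, ("ball", "paper"): 0.1,
              ("apple", "apple"): 1.0, ("paper", "paper"): 1.0}

def choice_score(tool, target):
    """Associative support for picking `tool` given a `target` object,
    summing learned strengths weighted by stimulus similarity."""
    return sum(strength * similarity.get((target, obj), 0.0)
               for (t, obj), strength in learned.items() if t == tool)

# A cut ball has never been paired with a knife, yet generalization
# from the apple makes the knife the strongest choice.
scores = {tool: choice_score(tool, "ball") for tool in ("knife", "pencil")}
best = max(scores, key=scores.get)
```

On this account the "correct" choice with the novel ball requires nothing more than similarity-weighted transfer of an old association, which is why the test described above does not by itself rule out an associative explanation.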
Rather than study causal inference, some researchers have investigated what they
call “folk physics”, which refers to a common-sense appreciation of the causal
properties of objects in the environment. Problems could then be solved by drawing
inferences from an understanding of these properties. Povinelli (2000) has
conducted a thorough series of experiments to explore whether chimpanzees make
use of folk physics in problem solving, and his results point to rather different
conclusions from those drawn by Premack (1976).
In one of Povinelli’s experiments, chimpanzees were confronted with a clear
tube that contained a peanut. To retrieve the food, they were required to push it out
of the tube with a stick. Once they had mastered this skill they were given the same
task but this time there was a trap in the tube. Pushing the stick in one direction
caused the peanut to fall out of the tube; pushing it in the other direction caused the
peanut to fall into the trap, where it was inaccessible. A sketch of this simple apparatus
can be seen in Figure 4.13. Three of the four chimpanzees that were given this task
never mastered it, and the fourth chimpanzee came to terms with it only after many
practice trials. The conclusion to be drawn from this study is that chimpanzees did
not have any appreciation of the problem created by the presence of the trap, that is,
they lacked an understanding provided by folk physics concerning the properties of
traps. Instead, the eventual success of the single chimpanzee can be explained by
assuming that she learned through trial and error how to avoid pushing the food into
the trap by inserting the stick into the end of the tube that was furthest from the
peanut. A similar failure to find evidence of successful performance on the same
problem has been found with capuchin monkeys (Visalberghi & Limongelli, 1994).
Povinelli (2000) cites a total of 27 experiments, using a variety of tests, all of which
show that chimpanzees have a complete lack of understanding of the physical
properties of the problems that confront them. The interested reader might also refer
to an article by Nissani (2006), which reports a failure by elephants to display causal
reasoning in a tool-use task.
The negative results that have just been cited make it all the more important to
determine whether Premack (1976) was correct in claiming that chimpanzees are
capable of causal inference. For the present, it is perhaps wisest to keep an open mind
about the capacity of primates to refer to folk physics when solving problems, but
what about other species? Clayton and Dickinson (2006) have suggested that the
most compelling evidence that animals have an appreciation of folk physics can be
found in certain species of birds.
Birds
In one study, Seed, Tebbich, Emery, and Clayton (2006) presented a group of naïve
rooks with a version of the trap problem used by Povinelli (2000; described above). When
first confronted with this problem, the direction in which the birds pushed the food
was determined by chance but, as training progressed, they showed a marked
improvement in avoiding the trap. To test whether this improvement reflected
anything more than learning through trial and error, the birds were given a new
problem where performance was not expected to be influenced by the effects of any
prior trial-and-error learning. Six out of seven birds performed poorly on the
new problem, but one bird performed extremely well from the outset. As the authors
point out, it is hard to know what conclusions to draw when one bird passes a test
that six others have failed, but the performance of the one successful bird encourages
the view that future research with rooks might reveal promising results.
Another species of bird that might possess an understanding of folk physics is
the raven. Heinrich (2000; see also Heinrich & Bugnyar, 2005) describes a series
of experiments with hand-reared ravens in which the birds were presented with a
piece of meat hanging from a perch (see the left-hand side of Figure 4.14). The meat
could not be obtained by flying towards it and clasping it in the beak. Instead, to
reach the meat, some birds settled on the perch to which the string was attached,
grasped the string below the perch with their beak, and pulled it upwards. To stop the
meat falling back, they placed a foot on the string and then let it drop from their beak,
whereupon they bent down to grasp the string below the perch again. This operation
was repeated until the meat was near enough to be grasped directly with the beak. In
another test, ravens were confronted with the arrangement shown in the right-hand
side of Figure 4.14. On this occasion, meat could be retrieved by standing on the
FIGURE 4.13 Right: Diagram of the apparatus used by Povinelli (2000) and by Visalberghi and Limongelli (1994) to test whether an animal will push a peanut in the direction that ensures it does not fall into a trap. From Visalberghi and Limongelli, 1994. Copyright © 1994 American Psychological Association. Reproduced with permission. Left: A monkey about to attempt to retrieve a nut from the apparatus. Photograph by Elisabetta Visalberghi.
perch and pulling the string downwards. Birds who had mastered the original task
were also adept at mastering this new task, but birds without prior experience of
pulling string never mastered this second task.
Heinrich and Bugnyar (2005) believe these results show that ravens have “some
kind of understanding of means–end relationships, i.e. an apprehension of a cause-effect relation between string, food, and certain body parts” (p. 973). In other words,
they have an appreciation of folk physics. This conclusion is based on the finding
that birds spontaneously solved the first problem but not the second one. It was
assumed that an understanding of cause–effect relations would allow the birds to
appreciate that pulling the string in one direction would result in the food moving in the
same direction (Figure 4.15). Such knowledge would be beneficial when the birds
had to pull the string upwards to make the meat rise upwards, but it would be a
hindrance in the second problem, in which the birds were required to pull the string
downwards to make the meat rise upwards.
Although the performance of the ravens is impressive, it does not necessarily
demonstrate that the birds relied on folk physics to solve the problem that initially
confronted them. As Heinrich and Bugnyar (2005) acknowledge, the solution to the
first problem might have been a product of trial-and-error learning in which the sight
of food being drawn ever closer served as the reward for the sequence of stepping
and pulling that the birds engaged in. According to this analysis, the initial contact
with the string would have to occur by chance, which seems plausible because the bird’s
FIGURE 4.14 Diagram of the apparatus used by Heinrich and Bugnyar (2005). A raven stood on the perch and was expected to retrieve food by pulling the string upwards (left-hand side) or downwards (right-hand side).
They use long thin strips of leaf with barbs running down one edge to draw prey from
cracks and holes in trees. It is not known if this ability is learned or inherited, but if
the conclusions drawn from the study by Tebbich et al. (2001) have any generality,
then it will be a mixture of the two.
In the experiment by Weir et al. (2002), a male and a female crow were expected
to retrieve a piece of food from a bucket with a handle that was placed in a clear,
vertical tube (Figure 4.16). The tube was so deep that it was impossible for the birds
to reach the handle of the bucket with their beaks. A piece of straight wire and a piece
of wire with a hook at one end were placed near the tube and the birds were expected
to use the hooked wire to lift the bucket out of the tube by its handle. On one
occasion, the male crow selected the hooked wire, which left the female with the
straight wire. She picked up one end in her beak, inserted the other end in a small
opening, and then bent the wire to create a hook, which was of a suitable size to enable
her to lift the bucket from the tube. A video clip of this sequence can be seen by
going to the following web address http://www.sciencemag.org/cgi/content/full/
297/5583/981/DC1. As one watches the female crow bend the wire it is hard not to
agree with the authors that she was deliberately modifying the wire to create a tool,
and that this modification relied on an understanding of “folk physics” and causality.
However, appearances can be deceptive, and it would be a mistake to ignore the
possibility that the bird’s behavior was a consequence of less sophisticated processes.
As with the woodpecker finches, it is possible that the skill displayed by the female
crow was a consequence of the interaction between inherited tendencies and learning
based on prior experience with sticks, twigs, and so on. Before this explanation can
be rejected with complete confidence, more needs to be known about the
development of tool use in New Caledonian crows in their natural habitat. It is also
a pity that rather little is known about the prior experiences of the bird in question,
which was captured in the wild.
The results from these tests for an understanding of folk physics in birds can
perhaps most fairly be described as ambiguous in their theoretical significance.
On the one hand, it is possible to explain most, if not all, of them in terms of
the trial-and-error principles advocated by Thorndike (1911) almost a century ago.
FIGURE 4.16 A New Caledonian crow lifting a bucket out of a tube in order to retrieve food in an experiment by Weir et al. (2002). Photograph by Alex Weir © Behavioural Ecology Research Group, University of Oxford.
However, a critic of this type of explanation would argue that it is so versatile that it
can explain almost any result that is obtained. Moreover, although the trial-and-error
explanations we have considered may be plausible, there is no evidence to confirm
that they are necessarily correct. On the other hand, some of the behavior that has
been discovered with birds is so impressive to watch that many researchers find it
hard to believe that the birds lack any understanding of the problem that confronts them.
For myself, my sympathies rest with an analysis of problem solving in terms of trial-and-error learning. The great advantage of this explanation is that it is based on
firmly established principles of associative learning. Problem solving relies on the
capacity for rewards to strengthen associations involving responses, and the transfer
of the solution from one problem to another is explained through stimulus
generalization. By contrast, much less is known about the mental processes that
would permit animals to make causal inferences, or to reason using folk physics.
Seed et al. (2006) note briefly that these processes may involve the capacity for
acquiring abstract rules about simple, physical properties of the environment. Given
such a proposal two questions arise: first, how is such knowledge about the
environment acquired, and second, are animals capable of abstract thought? As far as
I am aware, no-one has offered an answer to the first question and, as for the second
question, we shall see in later chapters that whether or not animals are capable of
abstract thought is a contentious issue that has yet to be fully resolved.
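The associative account just outlined — rewards strengthening stimulus-response associations, with transfer between problems through stimulus generalization — can be sketched as a minimal simulation. All names and numbers here (the two "tube" problems, the 0.7 similarity, the learning rate) are hypothetical choices made for the sketch, not values from any experiment described above.

```python
# Minimal sketch of a trial-and-error (law-of-effect) account of problem
# solving, with transfer by stimulus generalization. Names and numbers
# are hypothetical, chosen only to illustrate the idea.

strength = {}  # (stimulus, response) -> associative strength

# Assumed perceptual similarity between the two problems (hypothetical).
similarity = {("tube_A", "tube_A"): 1.0, ("tube_B", "tube_B"): 1.0,
              ("tube_A", "tube_B"): 0.7, ("tube_B", "tube_A"): 0.7}

RESPONSES = ["push_left", "push_right"]

def response_value(stimulus, response):
    # Generalized strength: each learned association contributes in
    # proportion to the similarity of its stimulus to the current one.
    return sum(s * similarity.get((stimulus, stim), 0.0)
               for (stim, resp), s in strength.items() if resp == response)

def trial(stimulus, correct_response, alpha=0.2):
    # Pick the response with the highest generalized strength (ties go to
    # the first response; a fuller model would explore at random).
    values = [response_value(stimulus, r) for r in RESPONSES]
    chosen = RESPONSES[values.index(max(values))]
    # Law of effect: reward strengthens the stimulus-response association.
    reward = 1.0 if chosen == correct_response else 0.0
    key = (stimulus, chosen)
    strength[key] = strength.get(key, 0.0) + alpha * (reward - strength.get(key, 0.0))
    return chosen

# Training on problem A gradually strengthens the rewarded response...
for _ in range(30):
    trial("tube_A", "push_left")

# ...and on the very first encounter with the similar problem B,
# generalization alone favors the previously rewarded response.
first_choice_B = trial("tube_B", "push_left")
```

Nothing in the sketch represents the physical properties of the problem; successful transfer falls out of similarity-weighted associative strength alone, which is the sense in which the trial-and-error account rests on firmly established principles rather than on causal inference or folk physics.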
Subject index
abstract categories 179–80, 189, 358
abstract mental codes 325, 359, 368, 371
abstract thought 121
acquired distinctiveness 157–8, 157, 158
acquired equivalence 157–8, 157, 158
active memory 220, 221
adaptability 12–13
addition 249–50, 250
African claw-toed frog 216, 216
African grey parrot
  and category formation 172, 182, 182
  and communication 351–3
  and mimicry 304
  and mirror-use 322, 323
  number skills of 248–50, 248–9, 251, 252
aggression, displays of 337–8, 337
AI see artificial intelligence
air pressure 285
Akeakamai (dolphin) 353–5
alarm calls 331, 332–5, 336, 337
albatross 286
Alex (parrot) 248–50, 248–9, 251, 252, 351–3, 368
alley running 94, 98, 106
  and extinction 131–3, 132
  and navigation 267–8
  and number skills 245–6, 246
alpha conditioning 44
American goldfinch 331
American sign language 341–2, 349–50, 349
amnesia, drug-induced 224–5
amodal representation 240–1, 241
amygdala 21, 224
analogical reasoning 187, 188, 368, 371
anecdotes 23, 112
animal intelligence 2–33
  and brain size 10–12, 10, 361–4
  definition 12–16, 362–3
  distribution 4–12, 4, 360–71
  ecological view 228, 229, 231
  and evolution 369–71
  general process view 230–1
  historical background 22–33
  null hypothesis of 261, 364–9
  reasons to study 16–20
  research methods 20–2
animal kingdom 4–5
animal welfare 19–20
anisomycin 224–5
anthropomorphism 23–4, 168–9, 262–3
ants 265–8, 267
apes 16
  and category formation 182
  and communication 329–50, 335–6, 340, 356–7
  and language 116, 329–50, 356–7
  and self-recognition 369
  vocalizations 339–40, 340
  see also chimpanzee; gorilla; orang-utan
Aplysia californica (marine snail) 43, 43–5, 46, 47, 149, 263
Arctic tern 289
artificial intelligence (AI) 19
associability/conditionability 67–8, 74–5, 91
associative competition 101–2, 101
associative learning 35–61, 123, 143
  and attention 365–6
  conditioning techniques 26–42
  CRs 55–61
  and deception studies 313, 315
  definition 35
  and evolution 370–1
  memory model of 46–9, 46–9, 59
  nature of 42–9
  and the null hypothesis 364–6
  and problem solving 121
  and the reflexive nature of the CR 60–1
  and stimulus-stimulus learning 49–52
  and surprise 365–6
  and time 233
  and US representations 52–5
  see also instrumental (operant) conditioning; Pavlovian (classical) conditioning
associative strength 64–71, 66, 68, 83, 88–9, 91, 101, 102
  and category formation 175–7, 179
  and discrimination learning 152–3
  equations 65–70, 83, 87, 88, 89
  and extinction 125–31, 134, 142
  negative 127–8, 130, 176
  and stimulus significance 83–4
associative-cybernetic model 111
asymptote 36
attention 365–6, 371
  automatic 86
  and conditioning 63, 74–91, 86
  controlled/deliberate 86
  and the CS 63, 76, 77, 78–80
  dual 86
  multiple modes of 86
audience effects 335–6
auditory templates 332
Austin (chimpanzee) 345, 346, 357
autoshaping 37, 37, 38, 60, 126, 127
  omission schedules 60
baboon
  and deception 314, 314, 315
  predatory nature 333, 336
Bach, J.S. 172–3
back-propagation networks 164–5
  two-layer 164
bees
  light detection 285
  navigation 269–71, 271, 280
  see also bumble-bee; honey-bee
behaviorists 28
bicoordinate navigation 288
bidirectional control 308–9, 308–9, 310
biological significance 80–1, 84–5, 102
birds
  evolution 6
  and imitation 304–5, 305
  and navigation 275
  and problem solving 117–21, 118–20
  short-term memory of 228, 229–30
  song birds 331–2, 363
  see also specific species
black and white images 173–4
black-capped chickadee 305, 363, 363
blackbird 302
bliss points 104
blocking 53, 63–4, 63, 69–70, 78, 365–6
  LePelley on 91
  and Mackintosh’s theory 83–4
  and navigation 293, 294
  and the Pearce–Hall model 88–9
Note: Page numbers in italic refer to information contained in tables and diagrams.
blocking (continued)
  and the Rescorla–Wagner model 72–3, 73
blue tit 230, 304–5
body concept 323–4
bombykol 265
bonobo, and communication 343–5, 344, 346, 347–8, 348, 349, 357
bottle-nosed dolphin
  and communication 338, 353–6, 354–5
  and learning set studies 362
brain 20–1
  human 19
  size 10–12, 10, 11, 361–4
brain size-body ratio 10–11, 11, 361
brown bear 266
budgerigar 309–10, 367
bumble-bee 298
Burmese jungle fowl 298
cache 226
canary 363
capuchin monkey
  and category formation 184
  and diet selection 300
  and problem solving 116, 117
  see also cebus monkey
carp 173
cat
  and communication 328
  and the puzzle box escape task 26–7, 26–7, 28
category formation 170–89, 358
  abstract categories 179–80, 189, 358
  categories as concepts 179–80
  examples of 171–3
  exemplar theories of 177–8, 178
  feature theory of 173–9
  and knowledge representation 188–9
  prototype theory of 178–9
  relationships as categories 180–7
  theories of 173–9
causal inference 114–21, 115
cebus monkey
  and imitation 312, 312
  and serial order 256–7, 256–7
  see also capuchin monkey
cephalization index (K) 11–12, 12, 13, 361
chaffinch 331
chain pulling tasks 95, 97, 110, 141, 253
cheremes 341
chick, and discrimination learning 159–61, 159–61
chicken
  and communication 331
  and discrimination learning 150
chimpanzee 4, 368, 371
  and abstract representation 189
  and category formation 179–81, 180, 186–7, 188, 189
  and communication 24, 116, 327, 336, 339–50, 342–4, 357, 359
  dominant 317–18
  and emulation learning 310
  and imitation 306, 307, 310, 311, 367
  and instrumental conditioning 105, 106
  lexigram use 343–5, 343–4, 346
  and mimicry 303
  and number skills 245, 245, 248, 250, 250, 252
  and plastic token use 342–3, 343
  and problem solving 112–17, 112–13, 115, 117
  and self-recognition 319–24, 320, 322
  short-term memory of 228
  and social learning 367
  subordinate 317–18
  and theory of mind 312–13, 314–18, 316, 317, 324
  and transitive inference 259–60, 262
  vocal tracts 340, 340
  welfare issues 19
  see also bonobo
chinchilla 172, 172
chunking 255–6
circadian rhythms 233–4
circannual rhythms (internal calendars) 291
Clark’s nutcracker 214, 228, 230, 272–3, 294, 370
Clever Hans 16, 243, 243
clock-shift experiment 288
coal tit 230
cockroach 233–4, 262–3
cognition, and language 358–9
cognitive maps 276–80, 330, 369–70
color perception 173–4
communication 326–59, 369
  definition 327
  development of 330–2
  honey-bees and 328–30, 328–30, 331, 338–9
  as innate process 330–3, 336, 356
  and intention 335–6
  and interpretation of the signal 330–1
  as learnt process 330–3, 336, 337
  and representation 334–5
  and the significance of signals 334–5
  vervet monkeys and 332–7, 333–4
  see also language
compass bearings 269–70, 271, 272
compound stimuli 68–73, 83–4, 88–9, 126–9, 151–3, 160, 176–7
concepts, categories as 179–80
concrete codes 188, 371
conditioned emotional response (CER) see conditioned suppression
conditioned inhibition see inhibitory conditioning
conditioned reinforcement 105–6, 105
conditioned response (CR) 29, 35, 39–40, 44, 55–60, 285
  and autoshaping 37
  and blocking 63
  compensatory 56–8, 58
  consummatory 55–6
  and discrimination learning 149, 155, 156, 164, 165
  and eye-blink conditioning 36–7
  and fear of predators 302
  influence of the CS on 58–9
  and inhibitory conditioning 42
  and the memory model of conditioning 47
  and preparatory CRs 56
  reflexive nature of 60–1
  and representation of the US 109
  response-cueing properties 110–11, 111
  and stimulus-stimulus conditioning 50–2
  strength of 64, 65, 70, 71, 73
conditioned stimulus (CS) 29, 30, 32–3, 123
  associability/conditionability of 67–8, 75
  associative strength 125–31, 134, 142
  and attention 63, 76, 77, 78–80
  aversive 110
  and compensatory CRs 57, 58
  compound 68–73, 83–4, 88–9, 126–9, 151–3, 160
  and conditioned suppression 38
  and discrimination learning 162
  and excitatory conditioning 35, 36–7, 38, 39, 40
  and extinction 124, 125–30, 134–5, 136–46
  and eye-blink conditioning 36–7
  and fear of predators 302
  and imitation 305
  influence on the CR 58–9
  and inhibitory conditioning 40, 41–2, 217
  and instrumental conditioning 106, 109–11
  intensity 67–8, 68
  and long-term memory studies 216, 217
  memory model of 46–8, 59
  and the nature of US representations 52–4
  neural mechanisms of 43–5
  protection from extinction 126–8
  and the reflexive nature of the CR 60–1
  and the renewal effect 136–8
  single 64–8
  spontaneous recovery 134–6
  and stimulus-stimulus conditioning 50–2
  and taste aversion conditioning 39
  and timing 242
  and trace conditioning 194
  see also CS–no–US association; CS–R association; CS–US association/contingency
conditioned suppression (conditioned emotional response) 37–8, 38, 41, 42, 63, 67, 67–8, 71–2, 72, 85, 85, 89–90, 89
conditioning
  alpha 44
  and attention 63, 74–91, 86
  compound 68–73, 83–4, 88–9, 126–9, 151–3, 160
  excitatory 35–40, 42–5, 44–5, 52–4, 72–3
  eye-blink 36–7, 36–7, 53–4, 64, 64, 68–9, 69, 124, 125, 135, 135
  information-processing model of 46, 46
  instrumental 92–121
  observational 302, 308, 309
  second-order (higher-order) 50–2, 51, 185–7, 186
  serial 49–50
  single CS 64–8
  and surprise 62–74
  trace 194–6, 195–6
  see also inhibitory conditioning; instrumental (operant) conditioning; Pavlovian (classical) conditioning; taste aversion conditioning
conditioning techniques 26–42
  autoshaping 37, 37, 38
  conditioned suppression 37–8, 38
  control groups 39–40, 40
  excitatory conditioning 35, 36–40
  inhibitory conditioning 35, 40–2
configural cues 153
configural theory of discrimination learning 155–7, 156
connectionist models
  of discrimination learning 161–6, 162–6
  exemplar-based networks 165–6, 165
  hidden layers 165
  multi-layer networks 164–5, 165
  single-layer networks 162–4, 162–4
  of time 241
conscious thought 210–11
consolidation theory (rehearsal theory) of long-term memory 218–20, 218, 225
conspecifics 300, 324, 335
context 72, 138, 229
context-stimulus associations 78–80, 90–1
contextual variables 15
contiguity 31, 63, 98, 98
continuous reinforcement 130–1, 131–2, 134, 134, 145–6, 145
control groups 39–40, 40
  truly random control 40
cooperation, and language 338–9
copying behavior 297, 298, 301, 302–12, 324
  mimicry 303–4, 312, 324
  see also imitation
coriolis force 288
corvids 181, 182, 368
counting 233, 243, 245–9, 251–2
  cardinal principle 252
  one–one principle 252
  stable-ordering principle 252
cow 229
coyote 58–9
CS–no–US association 135–6
CS–R (response) association 364–5
  inhibitory 140, 142
CS–US association/contingency 71–2, 88–90, 91, 94–5, 99, 111, 346, 364
  associative strength 64–71, 83, 88
  and discrimination learning 162
  and extinction 123, 136–8, 138–40
  and inhibitory connections 136–8, 142
  and learned irrelevance 85–6, 90
  negative 71–2
  positive 71
  and stimulus significance 83–4, 85–7
  and the strength of the CR 64
  zero 71
dark-eyed juncos 363
dead reckoning/path integration 266–9, 269–70, 278, 280, 293, 367
decay theory 202–3
deception 313–15, 314
delay, gradient of 98, 98
delay conditioning 194
delayed matching to sample (DMTS) 197–9, 228
  and decay theory 202–3
  and forgetting 200, 201–3, 204, 205
density discrimination 167–9, 167, 168
deprivation 106–109, 108, 111
detours 276–8, 277
diana monkeys 335
diet selection 298–301, 299
  see also food
discrimination learning 148–69, 171, 178, 368
  configural theory of 155–7, 156
  connectionist models of 161–6, 162–6
  elemental theories of 155, 156
  and learning sets 361–2, 362
  and metacognition 166–9, 167–8
  and relational learning 149–50
  Rescorla–Wagner theory of 152–5, 153–4
  Spence on 150–2, 155, 156
  stimulus preexposure 157–61
  theories of 149–61
  and theory of mind studies 317, 318
  and transposition 149–51
dishabituation 79, 193–4
displacement (language) 337, 345, 354–6
distance effect 257, 259
distractors 193–4, 201–2, 203–4
  surprising 203
DMTS see delayed matching to sample
“Do as I do” test 307–8, 310, 311, 367
dog 22–3
  and classical conditioning 29–30, 29
  and communication 337, 337, 350–1
  and trial and error learning 25–6
dolphin 16, 16, 124
  and category formation 181
  and communication 338–9, 353–6, 354–5
  and learning set studies 362
  and self-recognition 322, 324, 369
  short-term retention of 198, 201, 206, 206, 207, 228
  see also bottle-nosed dolphin
drive centers 106, 107
drives 30–1, 106–7
drug tolerance 56–8, 58
eagle 80, 333, 334, 335
ecological niches 369–70
ecological view of animal intelligence 228, 229, 231
EDS see extradimensional shift
electric shocks 14, 31–2, 43–4, 50, 53, 63–4, 67, 69, 71–3, 84–6, 89–90, 107, 124–5, 129–30, 135
  and conditioned suppression 38
  and eye-blink conditioning 36
  and inhibitory conditioning 41, 42
  and long-term memory studies 216, 216, 217, 219–21, 219
  and navigation experiments 285
  and preparatory CRs 56
  and stimulus-stimulus learning 51–2
electroconvulsive shock (ECS) 218, 218–19, 220, 224–5, 228
elemental theories of discrimination learning 155, 156
elephant 36
  long-term memory of 213, 213
  and problem solving 116
  and self-recognition 322, 324, 369
  and theory of mind 317
emulation learning 310
endogenous control, of migration 290–2, 290
English language, spoken 339–40, 344–5
environment 7–8
episodic memory 225–31
episodic-like memories 226–8
equilibrium theories of behavior 104
European robin 291–2
evolutionary development 3, 4–10, 22, 23
  and intelligence 369–71
  tree of evolution 6, 7
excitatory conditioning 35–40, 72–3
  and the nature of US representations 52–4
  of neural processes 42–5, 44–5
exemplar effect 177
exemplars 175, 177–9, 178
expectancy (R–US) theorists 94–5
  see also R–US associations
experience 113–14
extinction 36, 51–2, 57–8, 65–6, 66, 108, 109, 122–47, 224
  associative changes during 134–42
  of conditioned inhibition 129–30
  conditions for 125–42
  enhanced 128–9, 128–9
  as generalization decrement 123–5, 125, 128, 132–4, 144, 146
  and an inhibitory S–R connection 140–2, 141
  not affecting CS–US associations 138–40, 139
  partial reinforcement 130–4, 131, 132, 134, 145–7, 145
  and Pavlovian conditioning 142–7, 143–6
  protection from 126–8, 126–7
  and the renewal effect 136–8, 136, 137
  and surprise 125–30
  trial-by-trial basis 142–7, 143–6
  as unlearning 123
  with a US 129
extradimensional shift (EDS) 81–3, 82
eye-blink conditioning 36–7, 36–7, 53–4, 64, 64, 68–9, 69
  extinction 124, 125, 135, 135
face recognition 173–4, 176, 176, 178
family trees 6, 8–9
fear, of predators 301–2
fear conditioning see electric shocks; electroconvulsive shock
feature theory of category formation 173–9
feature-positive discrimination 151–2, 155–6
Fellow (dog) 350–1
fish
  and category formation 173
  and learning 13, 15
  and navigation 274
  short-term memory of 228
  see also specific species
folk physics 116–21
food 8, 9
  food-pulling tasks 117–19, 118–19
  hiding/storing behavior 191, 214, 226–7, 229–30, 363, 370
  rewards 13–14, 14, 15, 27–8
  see also diet selection
foraging behavior 300–1, 301
forgetting 216–25, 228
  decay theory of 202–3
  deliberate 204–5
  and proactive interference 200–1, 201, 203, 204
  and retroactive interference 163, 200, 201–2, 203
  and short-term retention 199–205
fossil record 6
fruit-fly 327
frustration theory 132–3, 135
Galapagos Islands 5, 6, 119
geese 268–9, 268–9, 270
general process view of animal intelligence 230–1
generalization
  and category formation 174, 177, 178
  gradients 150, 150, 156
  mediated 180
  numerosity 244, 244
  temporal 236–8, 237, 240–1
  see also stimulus generalization
generalization decrement 37
  as extinction 123–5, 125, 128, 132–4, 144, 146
geometric modules 274–6
geometric relations 272–4, 273–4, 367
geraniol 234
gerbils 268, 269–72, 270, 272
ghost control 310
goal directed learning 32
goldfish 364–5
gorilla
  and communication 339
  and self-recognition 319, 321, 322
gradient of delay 98, 98
grain selection 80–1, 81
grammar 337–8, 346–50, 353–6, 355, 358
  of American sign language 341
  and lexigrams 343
  using plastic tokens 342
gravity 288
great spotted woodpecker 191
great tit 304–5, 305, 363
green turtle 283, 284, 291
grief 341
guppy 301
habituation 75, 77, 79, 80, 91
  definition 192
  and retroactive interference 201, 203
  and short-term retention 192–4, 193, 201, 203
  see also dishabituation
hamster
  and navigation 268, 277–8, 277
  social behavior of 59–60
harbor seal 303
heading vectors 272
hedgehog 229
higher vocal centre 363
hippocampal place cells 280–3
hippocampus 21, 363–4
homing 284, 286–9
  and bicoordinate navigation 288
  and landmark use 286
  map and compass hypothesis of 287–8
  and olfaction 289
  by retracing the outward route 287, 287
Homo sapiens 5
  see also human beings
honey-bee
  communication (waggle dance) of 328–30, 328–30, 331, 338–9
  and navigation 268
  short-term memory of 228
  sounds made by 329–30
and time perception 234, 235, 236,262
Hooker, John Lee 173 Horizon (TV programme) 338horse 16, 243, 243human beings 371
brain 19
and category formation 177, 178episodic memory of 225and the exemplar effect 177and language 327, 338, 339–40,
344–5, 356and learning 74, 86long-term memory of 213metamemory of 207–8navigation skills 266, 275and number 248and serial position effects 207see also Homo sapiens
hunger 106–9, 110“hunting by search image”
80–1, 80–1hyena 370
IDS see intradimensional shift
iguana 5, 6
imaginal codes 188
imitation 303, 304–12, 324, 367–8
  and bidirectional control 308–9, 308–9, 310
  chimpanzees and 306, 307, 310, 311, 367
  “Do as I do” test 307–8, 310, 311, 367
  and emulation learning 310
  laboratory studies of 306–10
  mechanisms of 311–12
  monkeys and 311–12, 312, 367, 368
  naturalistic evidence 304–6
  and two-action control 309–10, 309
inactive memory 220
indigo bunting 292
information-processing 15–16, 21
information-processing model
  of conditioning 46, 46
  of time 237–8, 237, 239, 239, 240, 241
infrasound 285, 288
inheritance, of problem solving abilities 119–20
inhibitory conditioning 35, 40–2, 41–3, 70–1, 73, 127–8
  detection 41–2
  extinction 129–30
  and latent inhibition 76
  and long-term memory 217
  and the nature of US representations 54–5, 55
innate processes
  communication 330–3, 336, 356
  fear 302
insight 112–14
instrumental (operant) conditioning 28, 30, 32–3, 92–121
  conditions of 97–106
  and discriminatory learning 153
  extinction 123–8, 131–4, 141
  historical background 93–5
  and long-term memory studies 216–17
  and the memory model 111
  and mimicry 303
  nature of 93–7
  and the null hypothesis 364, 365–6
  Pavlovian interactions 106, 109–11
  and problem solving 111–21
  and serial order 253
  and vigor of performance 106–11
intellectual curiosity 16
intelligence see animal intelligence
intention, and communication 335–6
internal clocks 233–6, 262, 287–8, 292
  alternatives to 241–2
interneurons 45
intertrial interval 197
interval scales 250
interval timing 233, 236–41, 237–41, 242, 263
  midpoints 238, 239–40
  pacemaker 241, 263
intradimensional shift (IDS) 81–3, 82
Japanese macaque monkey 300, 306
Japanese quail
  and imitation 309, 309
  and mate choice 301
jaw movement 39–40, 40
Kanzi (bonobo) 344–5, 344, 346, 347–8, 348, 349, 357
kinesthetic self concept 323–4
knowledge attribution 315–19
knowledge representation 188–9, 358–9, 368–9
  abstract mental code 325, 359, 368, 371
  concrete mental code 188, 371
  self-recognition 319–24, 320–2, 368–9
  see also number; serial order; time
Koko (gorilla) 319, 321
Lana (chimpanzee) 343, 346, 352
landmarks 293–4
  and cognitive maps 278–80
  and compass bearings 269–70, 271, 272
  in distinctively shaped environments 274–6, 274–5
  and geometric relations 272–4, 273–4, 367
  and heading vectors 272
  and hippocampal place cells 281–3
  and homing 286
  piloting with multiple 271–2, 271–2, 367
  piloting with single 269–71, 367
  retinal snapshots of 270, 271
language 327, 336–59, 371
  American sign language 341–2, 349–50, 349
  apes and 116, 329–50, 356–7
  arbitrariness of units 337, 338
  and cognition 358–9
  and cooperation 338–9
  definition 336–8
  discreteness of 337, 338, 345
  and displacement 337, 345, 354–6
  grammar 337–8, 341–3, 346–50, 353–6, 355, 358
  human 327, 338, 339–40, 344–5, 356
  as innate process 356
  and lexigrams 343–5, 343–4, 346
  and motivation 357
  and the null hypothesis 369
  and plastic tokens 342–3
  and problem solving 19
  productivity of 337–8, 346–7
  requirements for learning 356–79
  and semanticity 337, 346
  sentences 342–3, 346–50, 352–4, 355, 356, 358
  spoken English 339–40, 344–5
  spontaneous 357
  training assessment (apes) 345–50
  training methods (apes) 339–40
  training methods (non-primates) 350–6
language acquisition device 356–7
latent inhibition 76, 79–80, 88–91, 365–6
Law of Effect 27–8, 30–1, 93–4, 102–4
  and problem solving 111–21
learned irrelevance 84, 85–6, 85, 90
learning 8–9, 13–15
  communication and 330–3, 336, 337
  definition 13
  emulation learning 310
  goal directed 32
  human beings and 74, 86
  and migration 292–3
  perceptual 158–61, 159–61
  relational 149–50
learning (Continued)
  speed of 13–15, 14
  stimulus-stimulus 49–52
  and surprise 62–74
  trial and error 25–7, 112–14, 116–17, 119–21, 169, 211, 297, 323, 336
  see also associative learning; discrimination learning; social learning
learning curve 64, 64, 65
learning sets 361–2, 362
leopard 335
lever pressing 93, 95–6, 98–102, 105–8, 110
  and conditioned suppression 37–8
  and CS–US contingency 71
  and discrimination learning 151–2, 153
  and extinction 124, 141
  and imitation studies 306–7, 307
  and interval timing 236–8, 237, 239–41, 240–1
  and number skills 245, 246, 251
  and preparatory CRs 56
  and serial order 253
lexigrams 343–5, 343–4, 346
light
  polarized 285
  ultraviolet 285
light-dark cycle 287–8, 292
  artificial 288, 291
limited capacity theory 202, 203–4
loggerhead turtle 290–1, 290
long-distance travel 265, 283–93
  and air pressure 285
  homing 284, 286–9
  and magnetic fields 284, 288, 291
  migration 284, 289–93
  and navigational cues 284–5, 293–5
  and polarized light 285
  and ultraviolet light 285
long-term memory 191, 212–31
  capacity 214–15, 214, 215
  comparative studies of 228–31
  consolidation theory of 218–20, 218, 225
  durability 215–17, 216, 217
  episodic 225–31
  neural circuitry 218, 224–5
  retrieval theory of 218, 220–5, 222, 223
magnetic fields 269–70, 284, 288, 291
magnitude effect 256, 259
mammalian evolution 6
Manx shearwater 286
map and compass hypothesis 287–8
maps, cognitive 276–80, 330, 369–70
marmoset monkey 83
  and imitation 310, 367
  short-term memory of 366–7, 366
marsh tit 363
marsh warbler 332
Matata (bonobo) 344
matching to sample 180–3, 181, 184, 186–7, 186, 189, 215, 368
  see also delayed matching to sample
mating
  and deception 314–15
  mate choice 301
maze running 94, 94
  and foraging behaviour 300–1, 301
  and long-term memory 222–4, 223
  and navigation skills 276–7, 277, 281–3, 281–2, 294
  and trace conditioning 195–6, 195, 196
  see also radial maze; T-mazes
mean length of utterances (mlu) 349, 349
meaning 337, 346
Melton (baboon) 314, 315
memory 188–9
  active 220, 221
  episodic 225–31
  episodic-like 226–8
  and evolution 370
  human 371
  inactive 220
  and the null hypothesis 366–7, 366
  reference 237–8
  retrieval 21
  spatial 363–4, 366, 366–7
  and standard operating procedures 76–9
  and stimulus significance 81
  visual 304
  see also long-term memory; short-term retention
memory model of associative learning 46–9, 46–9, 59, 111
memory traces 202, 211, 241–2
mental states 28, 312, 324–5
metacognition
  definition 166
  and discrimination learning 166–9, 167–8
metamemory
  definition 208
  and short-term retention 207–11, 209
methamphetamine 241
Mexican jay 230
migration 284, 289–93
  definition 284, 289
  endogenous control of 290–2, 290
  and learning 292–3
  and the null hypothesis 367
mimicry 303–4, 312, 324
  vocal 303–4
mind 12
mirror neurons 311–12, 324
mirrors 319–24, 320, 321, 368–9
Missouri cave bat 286
mlu see mean length of utterances
molar theory of reinforcement 99–100
molecular theory of reinforcement 100, 101–2
monkeys 368
  and category formation 179, 187
  and discrimination learning 166–9, 167, 168
  episodic-like memory of 227
  fear of predators 301–2
  and imitation 311–12, 312, 367, 368
  and instrumental conditioning 93
  and metacognition 166–9, 167, 168
  mirror-use 322, 323
  number skills of 246–8, 247, 251, 252
  and self-recognition 322
  and serial order 253, 256–9, 256–8, 358
  short-term retention of 198, 201–2, 204, 228
  and trace conditioning 195
  and transitive inference 260–1, 260, 262
  see also specific species
morphine 56–8, 58
motivation
  dual-system theories of 107–8, 109–10
  and language 357
motor neurons 44–5, 45, 46
mouse 233
music 172–3
mynah bird 303
natural selection 5
navigation 21, 264–95
  bicoordinate 288
  and communication in honey-bees 328–30, 328–30, 331
  long-distance travel 265, 283–93
  methods of 265–83
  and the null hypothesis 367
  and the Rescorla–Wagner theory 293
  short-distance travel 265–83, 293
  and the sun 267
navigational cues 284–5, 293–5
need 30, 106–7
negative patterning 153, 156, 162
neocortex 364
neophobia 298–9, 300
neural circuitry
  of associative learning 42–5, 44, 45
  of long-term memory 218, 224–5
neural net theories see connectionist models
New Caledonian crow 119–20, 120
Nim (Neam Chimpsky) (chimpanzee) 341, 346, 349–50, 357
no–US center 134–6, 135
no–US representation 55
nominal scales 247
nous (mind) 12
null hypothesis of intelligence 12, 261, 364–9
  and associative learning 364–6
  as impossible to refute 369
  and language 369
  and memory 366–7, 366
  and navigation 367
  and the representation of knowledge 368–9
  and social learning 367–8
number 233, 243–52, 263, 368
  absolute number 245–7, 245–7
  addition 249–50, 250
  interval scales 250
  nominal scales 247
  numerical symbols 248–50, 248–9, 252
  ordinal scales 247–8
  and perceptual matching 251–2
  relative numerosity 243–4, 244
  representation 247–8
  subitizing 251
number-identification task 86
numerons 252
numerosity generalizations 244, 244
oak leaf silhouette experiment 171, 172, 173
observational conditioning 302, 308, 309
octopus 298
odour cues 309
olfaction 289
omission schedules 60
operant conditioning see instrumental (operant) conditioning
orang-utan
  and communication 339–40, 340
  and mimicry 303
  and self-recognition 323–4
ordinal scales 247–8
orienting response (OR) 74–6, 75, 87–8, 87
oscillators 236, 263
overshadowing 70, 101–2, 293
Pacific sea anemone 192
paramecium 149, 149, 192, 193
parrot
  and communication 351–3
  and mimicry 303
  see also African grey parrot
partial reinforcement 130–4, 131, 132, 134, 145–7, 145
partial reinforcement effect (PRE) 130, 132, 133, 145–7
path integration see dead reckoning/path integration
Pavlovian (classical) conditioning 20, 28–30, 29, 32–3, 35–61, 93, 99
  using air pressure 285
  and autoshaping 37
  and the conditioned response 55–61
  and conditioned suppression 38
  and diet selection 299
  and discrimination learning 149–50, 162
  and dual attention 86
  and evolution 370–1
  and extinction 123–6, 128–31, 133–4, 138–47
  eye-blink conditioning 36–7
  and fear of predators 302
  and imitation 305
  using infrasound 285
  instrumental interactions 106, 109–11
  and long-term memory studies 217
  using magnetic fields 284
  memory model of 46–9, 46
  neural mechanisms of 43–5, 44, 45
  and the null hypothesis 364–6
  and overshadowing 101
  and stimulus-stimulus learning 49–52
  taste aversion conditioning 39
  and timing 242
  using ultraviolet and polarized light 285
Pavlovian-instrumental transfer design 110
peak procedure 239, 239
peak shift 151, 153, 178
Pearce–Hall theory 86–91
pedometers 267
penguin 24, 24–5
pentobarbital 224
perception, colour 173–4
perceptual learning 158–61, 159–61
perceptual processing 21
periodic timing 233–6, 263
  oscillators 236, 263
pheromone trails 265–6
Phoenix (dolphin) 353
phonemes 173, 341
phylogenetic scale 5
physiological techniques 20–1
pigeon 16, 17–18, 55–6, 365–6, 368
  and autoshaping 37, 37, 38, 60
  and category formation 171–2, 172, 173–5, 174–5, 176, 177, 182–5, 182, 183–5, 358
  and communication 339
  and discrimination learning 151, 151, 154
  episodic-like memory of 227–8
  and extinction 124, 126–8
  and homing 286–9, 293
  and imitation 310, 311
  and inhibitory conditioning 40
  and long-term memory 214–17, 214, 215, 227–8
  metamemory of 211
  navigation skills of 264, 272, 284–9, 293
  number skills of 243–4, 244, 251
  and problem solving 114
  and selective association 84
  and self-recognition 322
  and serial order 253–6, 254–5, 259
  short-term retention of 197–8, 200, 201–2, 204, 205, 206, 211, 228
  and stimulus significance 80–3, 84
  and stimulus-stimulus conditioning 50
  and time 241–2
  and transitive inference 261–2, 261
piloting
  with multiple landmarks 271–2, 271–2
  with single landmarks 269–71
pinyon jay 363
place cells 280–3
plastic tokens 342–3, 343
population-specific behavioral traditions 305–6
PRE see partial reinforcement effect
predators
  and alarm calls 331, 332–5, 334, 337
  fear of 301–2
prefrontal lobe 21
Premack principle 103–4
primacy effect 206–7
primates 6, 8–9
  and imitation 305–6
  and navigation 275
  and social learning 367
  see also specific species
proactive interference 200–1, 201, 203, 204
problem solving 26–7
  and causal inference 114–21, 115
  and folk physics 116–21
  and insight 112–14
  and language 19
  and the Law of Effect 111–21
prospective code 198
“protection from extinction” 126–8, 126–7
prototype theory 178–9
protozoan 327
punishment 14, 31–2, 93
  see also electric shocks; electroconvulsive shock
puzzle boxes 26–7, 26–7, 28
quail 367
R–US associations 94–5, 95, 97, 101–2, 110–11, 311, 365
rabbit 64, 64, 68–9, 69, 77–80, 110
  and extinction 124–5, 135
  and habituation 192–4, 193
  and Pavlovian conditioning 36–7, 39–40, 53–4
raccoon 60–1, 191
radial maze 228–9, 366
  and forgetting 200–1, 201, 202, 203, 204–5
  and navigation 294
  and serial position effects 206–7, 208
  and short-term retention 196–7, 196–7, 198, 199
rat 14, 31–2, 32, 59
  and attention and conditioning 75–6, 75, 78, 83–8, 86, 89–90
  and compensatory CRs 56–8
  and conditioned suppression 37–8
  and CS–US contingency 71
  and diet selection 298–301, 299, 301
  and discrimination learning 151–2, 153, 153, 157–8, 163
  episodic-like memory of 227, 228
  and extinction 128, 128, 130–4, 139, 140–2, 146
  and imitation 306–7, 307, 308–9, 308, 367–8
  and inhibitory conditioning 42
  and instrumental conditioning 93–101, 94–101, 103–8, 103, 105, 108, 110, 365
  and knowledge representation 188
  and learned irrelevance 85–6, 86
  long-term memory of 216–25, 218–19, 222, 227–8
  and the memory model of conditioning 47–9, 47
  and method of learning 14–15
  navigation skills of 268, 273–9, 273–5, 277–9, 281–3, 281, 293, 294
  number skills of 245–6, 246, 251–2
  and orienting responses 87–8
  and preparatory CRs 56
  and selective association 84, 85
  and serial order 253
  short-term retention of 194–7, 195–7, 199–207, 201, 204, 208, 228–9
  and stimulus significance 83, 84, 85
  and stimulus-response theorists 31–2, 32
  and stimulus-stimulus learning 49–50
  and time 235–8, 237, 239–41, 240, 241
  and trace conditioning 194–6, 195–6
raven 117–19, 118–19
reactivation effects 221–5, 222, 223
reasoning, analogical 187, 188, 368, 371
recency effect 206–7
reference memory 237–8
reinforcement
  conditioned 105–6, 105
  continuous 130–1, 131–2, 134, 134, 145–6, 145
  molar theory of 99–100
  molecular theory of 100, 101–2
  token 105–6
relational learning 149–50
relationships between objects 180–7, 188–9
  second-order 185–7, 186
renewal effect 136–8, 136, 137
representation 21–2
  A1 76–8, 76, 81
  A2 76–80, 76, 81
  amodal 240–1, 241
  and communication 334–5
  inactive 76, 76
  of number 247–8
  retrieval-generated A2 representations 78–9
  self-generated A2 representations 77–8, 79
  spatial 258–9, 262
  standard operating procedures model 76, 76
  US 109
  see also knowledge representation
reptiles 6
Rescorla–Wagner model 64–5, 68–71, 72–4, 83, 84, 91
  and category formation 175, 176
  and discrimination learning 152–5, 153–4, 156, 162, 163, 164–5
  and extinction 123, 125–6, 127–31, 142
  and navigation 293
research methods 20–2
response-chaining 253–7, 254–7
response-cueing 110–11, 111
response-reinforcer contingency 99–100, 99, 100, 101–2
retardation test 41–2
retention interval 197–8, 216
retinal snapshots 270, 271
retrieval theory 218, 220–5, 222, 223
retroactive interference 163, 200, 201–2, 203
retrospective code 198
reward 93
  anticipation of 94–5, 96, 108
  and category formation 175–6
  and gradient of delay 98, 98
  and the Law of Effect 112
  and response-reinforcer contingency 99–100
  and short-term memory studies 199
  and stimulus-response theorists 31, 32
rhesus monkey
  and category formation 181, 184
  and metamemory 208–11, 210
  and serial order 257–8, 257–8
  and short-term retention 206, 208–11
Rico (dog) 351
rook 117
running wheels 103, 103
S–R associations 365
  and imitation 311
  and inhibition in extinction 140–2, 141
  and instrumental conditioning 93–4, 96
  and serial order 253
  and theory of mind studies 318
S–R theorists 46, 48–9, 52, 94–5, 96, 111
S–(R–US) associations 96–7, 111
sage grouse 301
sameness 180–5, 181–5, 188–9
Sarah (chimpanzee) 116, 187, 188, 260, 342–3, 346, 347, 349, 359, 368, 371
satiety 107, 108–9
satisfaction 28, 30
scala naturae 4–5
scrub jay 318–19, 324
sea lion 181, 181, 215, 216
Second World War 266
second-order (higher-order) conditioning 50–2, 51, 185–7, 186
selective association 84–6
self-awareness 323
self-recognition 319–24, 320–2, 368–9
semanticity 337, 346
sensory neurons 44, 45, 46
sensory preconditioning 50
sentences 342–3, 346–50, 352–4, 355, 356, 358
  comprehension 347–8, 352, 353–4, 355
  production 348–50, 352–3, 356, 358
sequential stimuli 245–6, 245
sequential theory 132, 133
serial conditioning 49–50
serial delayed alternation 203
serial order 233, 253–9, 254–8, 263, 358, 368
  and chunking 255–6
  and the distance effect 257, 259
  and the magnitude effect 256, 259
  spatial representation of 258–9
  and transitive inference 259–61
serial position effects 206–7, 208
serial recognition tasks 253–7, 254–7
serotonin 45
Sheba (chimpanzee) 248, 250, 252, 368
Sherman (chimpanzee) 345, 346, 357
short-distance travel 265–83, 293
  and cognitive maps 276–80
  and dead reckoning/path integration 266–9, 269, 270, 278, 280, 293, 367
  and detours 276–8, 277
  and distinctively shaped environments 274–6, 274–5
  and geometric modules 274–6
  and geometric relations 272–4, 273–4
  and hippocampal place cells 280–3
  methods of navigation 265–83
  and novel release sites 278–80, 279
  and pheromone trails 265–6
  piloting with multiple landmarks 271–2, 271–2
  piloting with single landmarks 269–71
short-term retention 190–211, 366, 366
  comparative studies of 228–31
  and decay theory 202–3
  forgetting 199–205
  and habituation 192–4, 193
  limited capacity theory of 202
  and metamemory 207–11, 209
  methods of study 191–9
  and serial position effects 206–7
  and trace conditioning 194–6
shuttle-boxes 220–1
Siamese fighting fish 229
sign language 24, 341–2, 349–50, 349
signals 334–5
silkworm moth 265
simultaneous chaining 253–7, 254–7
simultaneous stimuli 246–7, 247
Slocum (sailor) 266
snakes 301–2, 302, 333, 334
social behavior 59–60
social enhancement 305
social groups 367
social learning 296–325
  and copying behavior 302–12
  and diet selection 298–301, 299
  and foraging behavior 300–1, 301
  and mate choice 301
  and the null hypothesis 367–8
  and predator fear 301–2
  and self-recognition 319–24, 320–2
  and theory of mind 312–19, 324–5
song birds 331–2, 363
SOP model see standard operating procedures (SOP) model
spatial memory 363–4, 366, 366–7
spatial representation 258–9, 262
spontaneous recovery 134–6
squirrel monkey 300
  short-term retention of 206
  and transitive inference 260, 260
standard operating procedures (SOP) model 76–80, 76, 81, 83, 192
starling 292–3
stars 292
Steller’s jay 328, 328, 337
stickleback 192, 228, 328, 334
stimulus 20
  context-stimulus associations 78–80, 90–1
  and habituation 192–4
  preexposure 157–61
  significance 80–6
  see also conditioned stimulus; S–R associations; S–(R–US) associations
stimulus enhancement 298, 306, 307, 308, 309
stimulus generalization
  definition 37
  and discrimination learning 150, 153, 155–6, 157–8
  neural basis of 45
  and problem solving 116, 121
stimulus-response theorists 30–3, 32
  see also S–R associations
stimulus-stimulus learning 49–52
Stravinsky, Igor 172–3
strawberry finch 332
subitizing 251
Sultan (chimpanzee) 112–13, 112, 114, 115
summation test 42, 43
sun 267, 287–8
sun compass 287–8
surprise
  and distractors 203
  and extinction 125–30
  and learning 62–74, 84, 86–7, 90
symbolic distance effect 260–1
T-mazes 195, 203–4, 204
tamarin monkey 229, 300
  and self-recognition 322
  and short-term memory 366–7, 366
taste aversion conditioning 14, 39, 47–9, 58–9, 78, 78, 95–6, 300
  and discrimination learning 158, 159–61
  and trace conditioning 194–5, 195
temporal generalization 236–8, 237, 240–1
theory of mind 312–19, 324–5, 371
  and communication 336
  and deception 313–15, 314
  and knowledge attribution 315–19
three-spined stickleback 192
tiger 297
time 233–42, 262–3, 368
  circadian rhythms 233–4
  connectionist model of 241
  information-processing model of 237–8, 237, 239, 239, 240, 241
  interval timing 233, 236–41, 237–41, 242, 263
  and memory traces 241–2
  periodic timing 233–6, 263
  scalar timing theory 238, 238, 239–40, 240
tool use 116–17, 117, 119–20
trace conditioning 194–6, 195–6
trace intervals 194
transitive inference 259–62, 260–1
  and spatial representation 262
  and the symbolic distance effect 260–1
  and value transfer 261–2
transposition 149–51
transposition test 150
trial and error learning 25–7, 112–14, 116–17, 119–21, 169, 211, 297, 323, 336
trial-based theories 142
truly random control 40
two-action control 309–10, 309
ultraviolet (UV) light 285
unconditioned stimulus (US) 29, 30, 32–3, 35, 40, 123
  and attention 78–9, 80
  aversive 110
  and compensatory CRs 58
  and diet selection 299
  and discrimination learning 155, 156, 162, 165
  and extinction 125–6, 128–9, 135, 139–43, 145–6
  and eye-blink conditioning 36
  and fear of predators 302
  and imitation 305
  and inhibitory conditioning 40, 41, 42
  intensity 66–7
  and long-term memory studies 217
  memory model of 46–8, 59
  and the nature of representations 52–5, 54
  and the reflexive nature of the CR 60
  representation 109
  and stimulus-stimulus conditioning 49–50
  surprising 63–4, 69, 365–6
  and trace conditioning 194
  see also CS–US association/contingency; no–US center; no–US representation; R–US associations; S–(R–US) associations
unobservable processes 21–2
urine 266
utterances, mean length of (mlu) 349, 349
value transfer 261–2
vasoconstriction 77–8, 192–3, 193
vasopressin 241
vervet monkey 332–7, 333–4
vestibular system 268
Vicki (chimpanzee) 307, 340
visual category formation 171–2, 173–6
visual memory 304
vocal mimicry 303–4
vocal tract 340, 340
waggle dance (bees) 328–30, 328–30, 331, 338–9
Washoe (chimpanzee) 24, 341–2, 342, 344, 346, 348–9, 350, 357
Western scrub jay 226–7, 226
white-crowned sparrow 331–2, 370
wind direction 289
wolf 58–9, 59
Woodpecker finch 119, 120
words 337
working memory 237–8
wrasse 228
Yerkish 343
Zugunruhe 291, 292