A model for serial verbal learning







    A model for analyzing the learning process with a special emphasis on serial-position effect is proposed. This model consists of two analyses, one being an analysis of the learning process of each item in a list by a stochastic method, and the other being an analysis of serial-position effect in terms of pro- and retroactive inhibitions, and of forgetting. The model is experimentally verified, and moreover, it is found that the model permits prediction of the results of many experiments with lists of various lengths and varying difficulty.

    1. Introduction

    1.1 Historical Review

    In serial verbal learning of nonsense material which is sequentially presented, the items near the beginning and the end of the series are, in general, easier to learn than those in the middle. This phenomenon has been well known since the 19th century and is called the serial-position effect, or primacy effect at the beginning and recency effect at the end, respectively.

    More than thirty papers on experimental studies have appeared on this subject. The problems discussed in the earlier papers mainly concern the predominance of the primacy effect or the recency effect. Among these, and among other papers written for other experimental purposes, some present definite results on the predominance of either effect in the serial-position effect. After categorizing these papers, it is found that the predominance of the primacy effect is supported mainly by the experiments on the serial-anticipation method, i.e., those by Ebbinghaus [6], Robinson and Brown [19], Warden [25], Lepley [13], Ward [24], Hovland [9], Malmo and Amsel [14], and McCrary and Hunter [15]; the predominance of the recency effect is supported by the experiments on the free-recall method or the paired-association method, i.e., those by Calkins [5] and Raffel [20]; and the equal effectiveness of the two effects is supported mainly by the experiments on the free-recall method, i.e., those by Bigham and Munsterberg [2], Smith [23], Finkenbinder [7], Foucault [8], and Shipley [22].

    On the other hand, very few theoretical analyses of the serial-position effect have been carried out. Foucault [8] and Ono [18] explained this effect by pro- and retroactive inhibition; Lepley [12, 13], Hull et al. [10], and Bugelski [3] did so by forward and backward associations; and Atkinson [1] by the probabilistic model.

    *The author wishes to acknowledge help received during discussion with Prof. T. Indow.

    Foucault's model is rather crude, and the other models are built on overly complicated assumptions or use too many unmeasurable intervening variables. Furthermore, each of these models deals with only one kind of experiment: Foucault's model deals only with experiments using the reproduction (free-recall) procedure, and all the other models deal only with experiments using the serial-anticipation method. However, both are serial verbal learning experiments with human subjects, and the data from both contain the primacy and the recency effects to some degree. Therefore, the process underlying both kinds of experiment seems to be the same, and it is undesirable for each model to cover only one kind of experiment.

    Atkinson's model seems to be clear but it has no conceptual relation to the usual psychological terms. The purpose of the present paper is to construct a generalized model analyzing serial-position effect in any kind of serial learning experiment in terms of well-known concepts.

    Recently, stochastic [4] and information-theoretical [21] learning models have been proposed. These models deal only with the average properties of many words in a list which are assumed to be homogeneous, namely, all the words are supposed to be on the same level of difficulty of learning. Hence, these models are applicable only to a special experimental situation where the positions of the words in a list are randomized at each trial.

    1.2 The Problem

    Since the items in a list are presented in a fixed order in an ordinal learning experiment and consequently the difficulty of learning each item in a list is not on the same level, it is desirable to construct a model for ordinal serial learning processes which permits analysis of both the process of learning each item in a list and the serial-position effect of the items contained in that list, in terms of well-known concepts, irrespective of the method of experiment.

    When such a model has been constructed, its validity should be verified by examining the following three points.

    (i) How well do the theoretical curves derived from the model fit the experimental data?

    (ii) To what extent can the model predict other learning processes?

    (iii) Do the curves derived from the model fit the data better than the curves derived from other models?

  • ULARA KUNO 325

    The plan set forth in this paper is to begin with a general description of the construction of the model. Following this, estimation procedures of the parameters contained in the model will be explained, and then the model will be applied to experiments on free-recall procedure and the serial-anticipation method. We can then use the model to predict the results of learning experiments yet unperformed. Finally, goodness of fit of the model will be statistically tested against experimental data.

    2. The Model

    This model consists of two analyses, one being an analysis of the learning process of each item in a list (Part I), and the other being an analysis of the serial-position effect in the list (Part II).

    2.1 The Model for the Process of Learning Each Item (Part I)

    This section will give in detail a statistical model which is related to the Miller-McGill learning model [16]. All notation in this section is analogous to that of the Miller-McGill two-parameter case [16].

    In Part I of this model, we assume that $r_k$, the probability of recalling an item after $k$ previous recalls, is given by

    $$r_{k+1} - r_k = a(1 - r_k),$$

    with the initial condition $r_0 = p_0$, where $a$, the parameter of fixation, and $p_0$, the parameter of memorization, are constants and $1 \ge a,\ p_0 \ge 0$. This equation means that the increase of $r_k$ for each recall is proportional to the possibility of ascension of $r_k$, namely $(1 - r_k)$. The above equation is rewritten in the form

    $$r_{k+1} - (1 - a)\,r_k = a,$$

    which is a nonhomogeneous difference equation with the initial condition $r_0 = p_0$. The solution of this equation is given by

    $$r_k = 1 - (1 - p_0)(1 - a)^k. \tag{1}$$

    This expression is the same as that of Bush-Mosteller's fixed point form [4] and Miller-McGill's two-parameter case [16]. The two parameters contained in this expression, $a$ and $p_0$, will be estimated in the next section by the method of maximum likelihood.
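
    As a check on equation (1), the following minimal Python sketch iterates the difference equation directly and compares the result with the closed form. The function names and the values a = 0.3 and p0 = 0.2 are illustrative assumptions, not quantities taken from the experiments.

```python
def recall_prob_closed_form(k, a, p0):
    """Equation (1): r_k = 1 - (1 - p0) * (1 - a)**k."""
    return 1.0 - (1.0 - p0) * (1.0 - a) ** k


def recall_prob_by_recurrence(k, a, p0):
    """Iterate r_{j+1} = r_j + a * (1 - r_j) starting from r_0 = p0."""
    r = p0
    for _ in range(k):
        r = r + a * (1.0 - r)
    return r


# Illustrative values only; they are not estimates from the paper.
a, p0 = 0.3, 0.2
for k in range(6):
    closed = recall_prob_closed_form(k, a, p0)
    iterated = recall_prob_by_recurrence(k, a, p0)
    print(k, round(closed, 4), round(iterated, 4))
```

    The two columns agree at every k, and both rise toward 1 as k grows, which is the behavior the fixation parameter a is meant to describe.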

    Now, in our experiments, all the figures plotting $p_0$ against the position of items in a list showed a U-shaped tendency, as illustrated by ○ in Fig. 1, and no apparent tendency was observed in any of the figures plotting $a$, as illustrated by ● in Fig. 1. From these tendencies, which will be statistically verified in Section 4.1.2, we conclude that the serial-position effect appears primarily in the parameter $p_0$ and not in $a$.

    2.2 The Model for Serial-Position Effect (Part II)

    The purpose of Part II is, therefore, to construct a theory to analyze the serial-position effect appearing in $p_0$ in terms of pro- and retroactive inhibition, and of forgetting.

    FIGURE 1

    Empirical Values of $a_i$ and $p_{i0}$ Obtained from Free-Recall Experiment of Typical Subject. Abscissa: serial position. (The ordinate of $a_i$ (●) is shifted vertically from that of $p_{i0}$ (○).)

    We shall introduce the following six parameters.

    $\gamma$ is the theoretical amount of recall immediately after presentation of an item, defined apart from any of the inhibitions mentioned below. In the usual learning experiments, the empirical amount of learning is limited to 1.00, so that the value of $\gamma$ cannot be measured directly in an experiment unless the value is equal to 1.00 (see Experiment III in Section 5.4.1).

    $\delta$ is the amount of forgetting during the period of time from the presentation of one item to the presentation of the next. Therefore, if we denote by $F_i$ the amount of forgetting of the $i$th item immediately after presentation of the last item on the list with $l$ items, we have

    $$F_i = \delta(l - i). \tag{2}$$

    $\alpha$ is the amount of inhibition arising from an item affecting the memorization of the succeeding item, namely, the parameter of the proactive inhibition.

    $\beta$ is the amount of inhibition arising from an item affecting the retention of the preceding item, namely, the parameter of the retroactive inhibition.

    $\lambda$ is the decreasing rate of the proactive inhibition $\alpha$. Consequently, $\alpha\lambda$ is the amount of inhibition that affects the memorization of the item following the succeeding one; $\alpha\lambda^{j-2}$ is the amount of inhibition affecting the memorization of the $j$th item counted from the one in question. Therefore, the amount of the proactive inhibition which the $k$th item of the list exerts on the $i$th item of the list, $I_{ki}$, is

    $$I_{ki} = \alpha\lambda^{i-k-1}, \tag{3}$$

    where $k < i$. On the other hand, the $i$th item of the list is affected by all the items previously presented. Since the total amount of the proactive inhibition which affects the $i$th item, $I_i$, is the sum of $I_{ki}$ over $k$ $(< i)$, we have

    $$I_i = \sum_{k=1}^{i-1} I_{ki} = \sum_{k=1}^{i-1} \alpha\lambda^{i-k-1} = \alpha \sum_{k=0}^{i-2} \lambda^{k} = \frac{\alpha(1 - \lambda^{i-1})}{1 - \lambda}. \tag{4}$$

    $\mu$ is the decreasing rate of the retroactive inhibition $\beta$. As in the case of $\lambda$, the total amount of the retroactive inhibition which affects the $i$th item of the list, $I'_i$, is given by

    $$I'_i = \sum_{k=0}^{l-i-1} \beta\mu^{k} = \frac{\beta(1 - \mu^{l-i})}{1 - \mu}. \tag{5}$$

    If $l$ denotes the number of items comprising the list, then for the free-recall procedure experiment, $p_{i0}$ (the $p_0$ of the $i$th item of the list) is the amount of memorization of the $i$th item immediately after the presentation of the last item on the list, and is written

    $$p_{i0} = \gamma - F_i - I_i - I'_i. \tag{6}$$

    Substituting (2), (4), and (5) into (6), we obtain

    $$p_{i0} = \gamma - \delta(l - i) - \frac{\alpha(1 - \lambda^{i-1})}{1 - \lambda} - \frac{\beta(1 - \mu^{l-i})}{1 - \mu}. \tag{7}$$

    We shall estimate the values of parameters contained in (1) and (7) in the next section.
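
    To illustrate how equation (7) produces a U-shaped curve of $p_{i0}$ against serial position, the following Python sketch evaluates it for a 20-item list. The function name and every parameter value are hypothetical choices made only for illustration; they are not estimates reported in this paper.

```python
def initial_recall(i, l, gamma, delta, alpha, beta, lam, mu):
    """Equation (7): p_{i0} for the ith item of an l-item list (free recall)."""
    forgetting = delta * (l - i)                              # equation (2)
    proactive = alpha * (1.0 - lam ** (i - 1)) / (1.0 - lam)  # equation (4)
    retroactive = beta * (1.0 - mu ** (l - i)) / (1.0 - mu)   # equation (5)
    return gamma - forgetting - proactive - retroactive


# Hypothetical parameter values, not estimates from the paper.
params = dict(gamma=1.0, delta=0.01, alpha=0.1, beta=0.1, lam=0.5, mu=0.6)
l = 20
curve = [initial_recall(i, l, **params) for i in range(1, l + 1)]

for i, p in enumerate(curve, start=1):
    print(i, round(p, 3))
```

    With these assumed values the first item carries no proactive inhibition and the last item carries neither forgetting nor retroactive inhibition, so both ends of the list lie above the middle positions.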

    3. Estimation of Parameters

    3.1 The Maximum-Likelihood Estimation of the a and p0 in Equation (1)

    The maximum-likelihood method is used for the estimation of the parameters $a_i$ and $p_{i0}$ in order to avoid introducing artificiality into the estimates.

    The estimates are obtainable from the frequency distribution of $n_{ik}$, where $n_{ik}$ is the number of unrecalled trials of item $i$ between its $k$th recall and its $(k+1)$st recall. Let the state that item $i$ has not been recalled on $n_{ik}$ successive trials after its $k$th recall be $A_{ik}$; then $P_{ik}$, the probability of occurrence of state $A_{ik}$, is given by

    $$P_{ik} = r_{ik}(1 - r_{ik})^{n_{ik}}.$$

    If the $P_{ik}$'s for all $k$ are assumed to be independent, then the likelihood for item $i$ is given by

    $$L_i = \prod_{k=0}^{\kappa} r_{ik}(1 - r_{ik})^{n_{ik}},$$

    where $\kappa$ is a number which is arbitrarily fixed as common to all items. Substituting $r_{ik}$ of (1) into the above equation, the likelihood $L_i$ is expressed as a function of $a_i$ and $p_{i0}$. In order to obtain the values of $a_i$ and $p_{i0}$ which maximize $L_i$, the partial derivatives of $\log L_i$ with respect to $a_i$ and $p_{i0}$ are made equal to zero, and the following equations are obtained.

    $$\sum_{k=0}^{\kappa} \frac{k\,(1 - p_{i0})(1 - a_i)^{k}}{1 - (1 - p_{i0})(1 - a_i)^{k}} = \sum_{k=0}^{\kappa} k\,n_{ik},$$

    $$\sum_{k=0}^{\kappa} \frac{(1 - p_{i0})(1 - a_i)^{k}}{1 - (1 - p_{i0})(1 - a_i)^{k}} = \sum_{k=0}^{\kappa} n_{ik}.$$

    The right-hand members of these equations are determined from experimental data and the left-hand members are functions of $a_i$ and $p_{i0}$ only.



    TABLE 1

    Table for Calculating Values of a and p0 from Two Values of X and Y

    Value of a

        1     2     3     4     5     6     7     8     9

     1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
      .37   .53   .58   .59   .61   .61   .62   .63   .63
      .03   .33   .41   .45   .46   .47   .48   .49   .50
            .16   .30   .35   .38   .39   .40   .41   .42
            .03   .20   .27   .30   .33   .34   .35   .35
                  .10   .20   .25   .27   .29   .30   .31
                  .01   .13   .19   .23   .25   .26   .27
                        .07   .14   .18   .22   .23   .24
                        .02   .10   .14   .18   .20   .21
                              .06   .11   .14   .17   .19

    Value of p0

        1     2     3     4     5     6     7     8     9

      .50   .33   .25   .20   .17   .14   .13   .11   .10
      .68   .42   .30   .23   .19   .16   .14   .12   .11
      .82   .51   .35   .26   .21   .18   .15   .13   .11
            .62   .40   .30   .23   .19   .16   .14   .12
            .69   .47   .34   .26   .21   .17   .15   .13
                  .55   .39   .29   .23   .19   .16   .14
                  .61   .44   .33   .25   .20   .17   .15
                        .49   .37   .29   .22   .18   .16
                        .53   .40   .32   .24   .20   .17
                              .44   .34   .26   .22   .18

    Let them be $X$ and $Y$, respectively. In case $\kappa = 4$, when the values of the right-hand members are given, we can find the values of $a_i$ and $p_{i0}$ from Table 1.
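
    Where a table such as Table 1 is not at hand, the two likelihood equations can also be solved numerically. The Python sketch below is one possible approach, not the procedure used in this paper: it evaluates the left-hand members (the sums over k of $k\,q_k/(1-q_k)$ and of $q_k/(1-q_k)$, with $q_k = (1-p_{i0})(1-a_i)^k$) on a grid of candidate values and keeps the pair that best matches the two data sums. The function names, the argument names `sum_kn` and `sum_n`, the grid resolution, and the assumption that k runs from 0 to `kappa` inclusive are all choices made here for illustration.

```python
def likelihood_equation_sides(a, p0, kappa):
    """Left-hand members of the two likelihood equations for one item.

    Returns (sum of k*q_k/(1-q_k), sum of q_k/(1-q_k)) for k = 0..kappa,
    where q_k = (1 - p0) * (1 - a)**k. The inclusive range is an assumption.
    """
    lhs_a = 0.0
    lhs_p = 0.0
    for k in range(kappa + 1):
        q = (1.0 - p0) * (1.0 - a) ** k
        odds = q / (1.0 - q)
        lhs_a += k * odds
        lhs_p += odds
    return lhs_a, lhs_p


def solve_by_grid(sum_kn, sum_n, kappa, step=100):
    """Grid search for the (a, p0) whose left-hand members best match the data."""
    best = None
    for ia in range(1, step):
        for ip in range(1, step):
            a, p0 = ia / step, ip / step
            lhs_a, lhs_p = likelihood_equation_sides(a, p0, kappa)
            err = (lhs_a - sum_kn) ** 2 + (lhs_p - sum_n) ** 2
            if best is None or err < best[0]:
                best = (err, a, p0)
    return best[1], best[2]


# Round trip with illustrative values (not data from the paper).
target = likelihood_equation_sides(0.3, 0.2, kappa=4)
print(solve_by_grid(target[0], target[1], kappa=4))
```

    As a rough consistency check, with `kappa = 4` the pair a = .37, p0 = .68 that opens the second row of each half of Table 1 gives left-hand members of about 1.02 and 1.01.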

    3.2 A Tentative Method for Estimating the Six Parameters in Equation (7)

    The procedure for estimating the parameters contained in (7) is rather cumbersome. It appears impossible to estimate the values either by the least-squares or the maximum-likelihood method. If one wishes to estimate these parameters by trial and error, the calculation becomes too tedious and there is some risk of subjective prejudice. In this paper, a tentative method is adopted by which the estimation may be considered objective within a certain range of error.

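    We shall explain this method in the case of 20 items ($l = 20$). First, we plot the values of $p_{i0}$ against $i$, and assume the most probable values of $p_{i0}$ at $i$ = 1, 5, 10, 15, and 20. Let these values be $p_{1,0}$, $p_{5,0}$, $p_{10,0}$, $p_{15,0}$, and $p_{20,0}$, respectively. Substituting these $p_{i0}$ values into (7), we have the following five equations.

    $$\begin{aligned}
    p_{1,0} &= \gamma - 19\delta - \frac{\beta(1 - \mu^{19})}{1 - \mu}, \\
    p_{5,0} &= \gamma - 15\delta - \frac{\alpha(1 - \lambda^{4})}{1 - \lambda} - \frac{\beta(1 - \mu^{15})}{1 - \mu}, \\
    p_{10,0} &= \gamma - 10\delta - \frac{\alpha(1 - \lambda^{9})}{1 - \lambda} - \frac{\beta(1 - \mu^{10})}{1 - \mu}, \\
    p_{15,0} &= \gamma - 5\delta - \frac{\alpha(1 - \lambda^{14})}{1 - \lambda} - \frac{\beta(1 - \mu^{5})}{1 - \mu}, \\
    p_{20,0} &= \gamma - \frac{\alpha(1 - \lambda^{19})}{1 - \lambda}.
    \end{aligned} \tag{8}$$

    Since $1 > \lambda,\ \mu > 0$, neglecting those terms higher than the eighth power of $\lambda$ and $\mu$, the above equations can be reduced to

    $$\begin{aligned}
    p_{1,0} &\approx \gamma - 19\delta - \frac{\beta}{1 - \mu}, \\
    p_{5,0} &\approx \gamma - 15\delta - \frac{\alpha(1 - \lambda^{4})}{1 - \lambda} - \frac{\beta}{1 - \mu}, \\
    p_{10,0} &\approx \gamma - 10\delta - \frac{\alpha}{1 - \lambda} - \frac{\beta}{1 - \mu}, \\
    p_{15,0} &\approx \gamma - 5\delta - \frac{\alpha}{1 - \lambda} - \frac{\beta(1 - \mu^{5})}{1 - \mu}, \\
    p_{20,0} &\approx \gamma - \frac{\alpha}{1 - \lambda}.
    \end{aligned} \tag{9}$$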


    In fact, the values of $\lambda$ and $\mu$ which were used in our experiment are less than 0.75, and hence their eighth-power terms are less than 0.07.
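
    The size of the error introduced by dropping the high powers can be checked numerically. The Python sketch below evaluates (7) exactly and again with every power of $\lambda$ and $\mu$ above the eighth set to zero, at $i$ = 1, 5, 10, 15, 20 for $l$ = 20. The function names and all parameter values are hypothetical and serve only to illustrate the approximation.

```python
def exact_p(i, l, gamma, delta, alpha, beta, lam, mu):
    """Equation (7) evaluated at position i of an l-item list."""
    return (gamma - delta * (l - i)
            - alpha * (1.0 - lam ** (i - 1)) / (1.0 - lam)
            - beta * (1.0 - mu ** (l - i)) / (1.0 - mu))


def reduced_p(i, l, gamma, delta, alpha, beta, lam, mu, cutoff=8):
    """Same expression with powers of lam and mu above `cutoff` set to zero."""
    lam_pow = lam ** (i - 1) if (i - 1) <= cutoff else 0.0
    mu_pow = mu ** (l - i) if (l - i) <= cutoff else 0.0
    return (gamma - delta * (l - i)
            - alpha * (1.0 - lam_pow) / (1.0 - lam)
            - beta * (1.0 - mu_pow) / (1.0 - mu))


# Hypothetical parameter values, not estimates from the paper.
params = dict(gamma=1.0, delta=0.01, alpha=0.05, beta=0.05, lam=0.7, mu=0.7)
positions = [1, 5, 10, 15, 20]
errors = [abs(exact_p(i, 20, **params) - reduced_p(i, 20, **params))
          for i in positions]

for i, e in zip(positions, errors):
    print(i, round(e, 5))
```

    Under these assumed values the largest discrepancy occurs at $i$ = 10, where both a ninth power of $\lambda$ and a tenth power of $\mu$ are dropped.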

    Now we have five equations with six unknown parameters. The process of estimating these parameters is as follows.

    (i) Since the values of $\gamma$ show little variation between subjects, we estimate it by trial and error.

    (ii) The value of $\delta$ is decided from the following relation, which is easily obtained from (9).

    $$p_{10,0} = p_{1,0} + p_{20,0} + 9\delta - \gamma.$$

    (iii) The value of $\alpha/(1 - \lambda)$ is calculated by substituting $\gamma$ into the equation for $p_{20,0}$.

    (iv) The value of $\beta/(1 - \mu)$ is obtained from the equation for $p_{1,0}$.

    (v) By using these values of $\gamma$, $\delta$, α/(1 − ...

