

Towards a unified process model for graphemic buffer disorder and deep dysgraphia

David W. Glasspool
University College London, UK, and Cancer Research UK, London, UK

Tim Shallice
University College London, UK, and SISSA, Trieste, Italy

Lisa Cipolotti
University College London, UK, and National Hospital for Neurology and Neurosurgery, London, UK

Models based on the competitive queuing (CQ) approach can explain many of the effects on dysgraphic patients’ spelling attributed to disruption of the “graphemic output buffer”. Situating such a model in the wider spelling system, however, raises the question of what happens when input to the buffer (e.g., from a semantic system) is degraded while the buffer remains intact. We present a preliminary exploration of predictions following from the CQ approach. We show that the CQ account of the graphemic buffer predicts and explains the finding that deep dysgraphic patients generally show features of graphemic buffer disorder, as disrupted input from a damaged semantic system has an inevitable effect upon the functioning of the buffer. The approach also explains the most salient differences between the two syndromes, which are seen as consequences of the difference between an intact sequence generation system operating on degraded input versus a damaged sequencing system operating on intact input.

INTRODUCTION

There have been two implicit approaches to connectionist computational models among cognitive neuroscientists. Some have viewed connectionist models as challenging older information-processing accounts. Others have viewed them as unpacking information-processing accounts at a deeper theoretical level.

Traditionally, neuropsychological reasoning has tended to concentrate on determining what distinct processing systems, or modules, exist and their functional organization. Shallice (1988) emphasizes the use of dissociation data in such inferences. But modellers taking the first approach to connectionism have argued that some dissociation data may imply different types of damage to a single system rather than distinct processing systems (e.g., Plaut, 1995), undercutting a major theoretical tool in neuropsychology.

Correspondence should be addressed to D. Glasspool, Advanced Computation Laboratory, Cancer Research UK, 44 Lincoln’s Inn Fields, London WC2A 3PX, UK (Email: [email protected]).

This work was supported in part by a grant from the McDonnell–Pew programme in cognitive neuroscience. We are grateful to George Houghton, Argye Hillis, Jon Machtynger, Richard Cooper, Gordon Brown, Andrew Ellis, and an anonymous referee for cogent comments on the modelling work and on earlier versions of this report.

© 2005 Psychology Press Ltd
http://www.psypress.com/cogneuropsychology DOI:10.1080/02643290500265109

COGNITIVE NEUROPSYCHOLOGY, 2006, 23 (3), 479–512

The second approach to connectionism has been less threatening to information-processing accounts. On this view connectionist modelling is a tool for developing concrete theories of the internal operation of the modules inferred by more traditional techniques (e.g., Burgess & Hitch, 1992). Computational modelling allows explanations for effects of damage to cognitive systems to operate at a finer level of detail, encompassing the effects of damage on the internal operation of a module as well as on the disrupted functioning of the system as a whole.

Shallice (1988) took a cautious approach in arguing that dissociation data are the most relevant data for neuropsychological reasoning, and that the two other types of information that have most often been taken as neuropsychological evidence, association of deficits and error data, both provide only weak constraints for theory. Caramazza and McCloskey (1991) and McCloskey and Caramazza (1991) defend the use of these types of evidence, as long as both the data and the putative theory are characterized at a sufficiently detailed level. Computational modelling enables concrete theorizing at this fine level of detail and in particular provides a framework for the appropriate use of error data in cognitive neuropsychology.

In this paper we develop a connectionist computational model of the second type, which draws on both these types of data: it is inspired by detailed consideration of patient errors, and it is used to account for an association of symptoms across two disorders.

Dysgraphic patients with “graphemic buffer disorder” (GBD) produce characteristic errors in their spelling, which indicate problems with the serial output of a sequence of letters in the final output stage of the spelling process. A second group of patients with so-called “deep dysgraphia” produce spelling errors indicative of a higher level locus of damage, with semantic content affecting errors. The two syndromes have generally been seen as quite distinct. However, as Cipolotti, Bird, Glasspool, and Shallice (2004) point out, they in fact share many features.

In this paper we argue that this association of features is predicted on a straightforward assumption about the internal operation of one component of the spelling system, the graphemic output buffer (GOB).

In previous work we have modelled the operation of the GOB and have simulated spelling problems thought to be due to disruption to this component (GBD). In the work described here we take a first step towards situating our GOB model in a wider model of the spelling system by adding input from a semantic spelling route. Such a model allows us to compare the effect of degraded input to the GOB (following damage to the semantic components of the model) with the effect of a damaged GOB operating on intact input. A number of gross features are shared by the error patterns produced by these two manipulations. Several aspects of the mechanisms leading to errors are, however, modified by the change in locus of damage, leading to both major and minor differences in the error patterns generated. There is a correspondence with the association of features and the salient differences between deep dysgraphia and GBD that allows us to propose a unified theoretical account for the two disorders.

We start by discussing the theoretical and empirical background to the work, and we propose a theoretical account for the similarities and differences between two key groups of GOB patients. We describe a computational model based on this account and report the results of a number of simulations comparing the model with a range of patients. Finally we discuss both successes and limitations of the modelling enterprise and relate them to aspects of the underlying theoretical approach.

THEORETICAL AND EMPIRICAL BACKGROUND

Two spelling syndromes

The standard information-processing model of spelling that was developed in the 1980s involved a number of routes (two or three in different versions) by which information was transmitted from a semantic system and/or auditory input (or output) lexicon system to the GOB (Ellis, 1984; Margolin, 1984; Shallice, 1988). Information was then thought to be held in the buffer before being translated letter by letter into allographs and then motor stroke representations (Ellis, 1982). In the context of a model of this type, Caramazza, Miceli, and their colleagues, in a series of papers, described two patients with an acquired dysgraphia whose deficit was held to be at the level of the GOB (FV: Miceli, Silveri, & Caramazza, 1985; and LB: Caramazza, Miceli, Villa, & Romani, 1987; Caramazza & Miceli, 1990). Since the initial descriptions of these patients other patients with a similar type of difficulty have been described (e.g., SE: Posteraro, Zinelli, & Mazzucchi, 1988; ML, DH: Hillis & Caramazza, 1989; JES: Aliminosa, McCloskey, Goodman-Schulman, & Sokol, 1993; HE: McCloskey, Badecker, Goodman-Schulman, & Aliminosa, 1994; JH: Kay & Hanley, 1994; AM: De Partz, 1995; AS: Jonsdottir, Shallice, & Wise, 1996; SFI: Miceli, Benvegnu, Capasso, & Caramazza, 1995).

The spelling disorders of these patients have four main types of property (see Caramazza et al., 1987, and Shallice, Glasspool, & Houghton, 1995, for review):

1. There is a tendency for spelling errors to increase with word length.

2. Spelling accuracy is not affected by semantic or syntactic variables such as concreteness or part of speech, or variables relating to the mapping from phonology to orthography, and is generally unaffected by word frequency.

3. Nonwords are spelled in a roughly similar fashion to words but less accurately length for length.

4. Errors typically correspond to one or two operations of the following type: substitution, deletion, transposition, and insertion of single letters.

Characteristics 2 and 3 indicate that the disorder occurs no higher than the stage at which the outputs from the semantic and phonological systems to the graphemic output system converge. The other properties correspond to what one would expect from an impairment to a buffer in which individual letter representations are held as in the standard model.

More recently a second set of patients have been described whose errors also show effects of word length and include substitutions, deletions, transpositions, and insertions (Properties 1 and 4 above). However, Properties 2 and 3 do not apply; unlike in the first type of patient, semantic variables such as concreteness affect spelling performance, as does word frequency, and nonwords are typically very poorly spelled (Cipolotti et al., 2004). This second set of patients seem to combine failure of the GOB with a deep dysgraphia (Bub & Kertesz, 1982), the dysgraphia syndrome with analogous properties to those of deep dyslexia.

Additionally, there are a number of differences in detailed aspects of the error pattern compared with the “classic” GOB disorder described above. Patients in this second set show spelling error rates that increase monotonically from start to end of word, whereas patients in the first set show a bowed serial curve with lower error rates at the start and end of words. The second set of patients show a dominance of letter deletion errors whereas the first set often produce a high proportion of letter substitutions. Finally, this second set of patients produce large numbers of a new type of error not reported for the first set: the fragment (Ward & Romani, 1998), a response two or more letters shorter than the target word, often preserving earlier portions of the word. This set of patients include HR (Katz, 1991), BA (Ward & Romani, 1998), TH and PB (Schiller, Greenhall, Shelton, & Caramazza, 2001), and DA (Cipolotti et al., 2004).

For convenience we label these two groups of patients as displaying “GOB disorder Type A” and “GOB disorder Type B”, respectively. There are sufficient differences between the groups that one might be tempted to propose that they correspond to distinct functional syndromes. However, two factors suggest that the picture may not be so straightforward.


First, while most patients fall into one group or the other, a number of patients exist who appear to be somewhat intermediate between the two groups. The most common intermediate patients are those apparently of Type A who show a significant effect of word frequency (AS, JH, DH, JES, HE). A few patients apparently of Type B produce no semantic errors (HR, and AZO: Miceli, Capasso, Ivella, & Caramazza, 1997), and a few patients have anomalous serial position curves (FM, Tainturier & Caramazza, 1996, and AM are apparently Type B but have bowed serial position curves; FV is Type A but has a flat serial position curve; GSI, Miceli, Capasso, Benvegnu, & Caramazza, 2004, is apparently of Type A but has a monotonic serial error curve). One patient appears to combine both the last two types of discrepancy: BH has both Type A characteristics (bowed serial position curve, no semantic errors) and Type B characteristics (frequency and imageability effects and fragment errors).

Second, it is not clear that the two groups are functionally distinct. In many respects “prototypic” Type B disorder is similar to Type A disorder with the addition of some extra features: “deep dysgraphic” type semantic errors, sensitivity to word frequency and semantic content, and fragment errors. According to the standard information-processing model of spelling, GOB disorder was held to be essentially unrelated to deep dysgraphia, which was usually seen as involving complete damage to the phonological route or routes and also partial damage to the semantic system or to its direct connections to the GOB. However, Cipolotti et al. (2004) review a number of accounts of the second type of GOB syndrome in the literature. Where data were given, Cipolotti et al. conclude that the relevant patients show GOB Type B errors as well as semantic and frequency effects. The suggestion that deep dysgraphia includes aspects of GOB disorder implies a much more intimate link between the two syndromes.

In this paper we propose a theoretical account for GOB disorder Types A and B and their relationship to deep dysgraphia. We treat “prototypic” Type A and B disorders as relating to orthogonal types of damage, so that the two disorders represent end-points of a potential continuum of intermediate cases where both types of damage are present. We propose that Type A disorder results from disrupted operation of the GOB caused by damage to the GOB itself, upstream systems (including semantic systems) remaining intact. Type B disorder also results from disrupted operation of the GOB, but in this case the disruption is caused by damage to upstream semantic systems, which then feed degraded input to an intact GOB.

However, while Type B disorder in most respects simply adds features to Type A, a few features are altered. The mix of different error types is changed, deletion errors becoming more prominent, and the serial position curve for errors, which is bowed in Type A disorder, becomes a monotonic rising curve in Type B. If an account of the type that we suggest is to be successful it must explain these differences in terms of differences in the operation of the GOB when presented with degraded input compared with its operation when it is itself damaged. This is a situation where “classical” neuropsychological reasoning, based on identifying processing modules, which are then treated as “black boxes”, cannot help us. In order to provide an account of this type it is necessary to theorize about the internal operation of a module, the GOB, and the effects of different types of disruption.

The general theoretical position outlined above is compatible with our process model of the GOB (Glasspool, 1998; Glasspool & Houghton, 2005; Houghton, Glasspool, & Shallice, 1994; Shallice et al., 1995) and with our approach to modelling the semantic system (Plaut & Shallice, 1993). It is also compatible with the suggestion of Ward and Romani (1998) that the Type A features of spelling errors in their Type B patient BA might result from damage to a GOB model of the sort that we have proposed.

Our purpose in this paper, then, is to begin to explore the consequences of placing a process model of the GOB in the context of other parts of the spelling system and to test the theoretical account proposed above by exploring the effect of degraded input on the operation of the GOB, on both gross and detailed error patterns. Our aim is to lay the foundations for a parsimonious theoretical explanation of the two types of GOB disorder, A and B, including the similarities and differences between them and their relation to deep dysgraphia.

Competitive queuing accounts of the graphemic output buffer

Spelling, like other forms of language production, is an inherently serial process. While generation of serially ordered sequences of responses is basic to human and animal behaviour, the processes underlying serial behaviour are not well understood (Houghton & Hartley, 1995; Lashley, 1951). One effect of the recent rise of connectionist models has been to force modellers to address the problem of serial behaviour from first principles, rather than relying on the serial constructs available in traditional computing paradigms to side-step the issue. The variety and complexity of the solutions that connectionists have found to the problem of serial behaviour is testament to the difficulty of achieving this.

One class of models that has been successful in explaining a number of common features of human automatic serial behaviour, in the fields of language and of short-term memory for example, is that of competitive queuing (CQ) models (Glasspool, 1998, 2005; Houghton, 1990). A number of successful models in a range of psychological paradigms have been based on this approach, including verbal short-term memory (e.g., Burgess & Hitch, 1992, 1996), typing (Rumelhart & Norman, 1982), spelling (Houghton et al., 1994; Shallice et al., 1995), speech production (e.g., Hartley & Houghton, 1996), and action plans (e.g., Cooper & Shallice, 2000).

Although there are important differences between implemented CQ models, the following three features are typically present:

1. A set of response representations, which can become refractory: These constitute a finite pool of representations of distinct responses or actions from which individual sequences are generated. The responses are potentially refractory in that when an item is produced as part of a sequence, it becomes temporarily unavailable for further use.

2. Parallel activation of response representations and activation gradients: The representations of responses in a target sequence are activated in parallel at the beginning of sequence generation, but with a gradient of activation over them such that the sooner a response is to be produced the more active it is. The set of active response representations forms the “competitive queue”, as the representations compete for output on the basis of their activation level. The relative activation levels may be static throughout production of a sequence or may change over time.

3. A competitive output mechanism: This mechanism has to resolve the response competition in the queue by selecting the currently most active response representation. This process of selection triggers the subsequent inhibition of the chosen representation.

The process of generating a sequence of responses in such a model involves an activating mechanism generating a gradient of activations over some subset of item representations. The competitive output mechanism then repeatedly selects for output the most active item. As each item is output it becomes refractory and hence temporarily unavailable. In this way the activated items are output in order of their relative activation values, from the most to the least active.
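The selection cycle described above can be sketched in a few lines of Python. This is a minimal illustration rather than any of the implemented CQ models: the function names `make_gradient` and `cq_generate`, the linear gradient, and the numerical values are assumptions made for the example, and refractoriness is simplified to a fixed subtraction from the winner's activation.

```python
def make_gradient(word, step=0.1):
    """Activate all letters in parallel, earlier letters more strongly.

    Assumption: a linear gradient, and a word with no repeated letters
    (repeats need extra machinery, which is not modelled here).
    """
    return {letter: 1.0 - step * i for i, letter in enumerate(word)}


def cq_generate(activations, inhibition=2.0):
    """Repeatedly output the most active item, then suppress it."""
    levels = dict(activations)  # work on a copy of the queue
    output = []
    for _ in range(len(levels)):
        winner = max(levels, key=levels.get)  # competitive selection
        output.append(winner)
        levels[winner] -= inhibition  # winner becomes refractory
    return "".join(output)


print(cq_generate(make_gradient("table")))  # prints "table"
```

Because items are emitted strictly in order of activation, anything that disturbs the gradient or the selection step in this sketch changes the order or content of the output.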

CQ models rely on the transient inhibition of item representations after they appear in a sequence to prevent perseveration of responses. Sequences with items occurring more than once can nonetheless be generated, but this requires that an item representation used earlier in a sequence is sufficiently strongly excited at a later point that the transient inhibition is overcome. Generally this entails allowing the pattern of excitation applied to sequence items to change during the course of sequence production (Glasspool, 1998; Houghton, 1994). While sequences with repeated items separated by other items can be represented, immediate doubling of an item presents a particular difficulty and requires a special-purpose mechanism. We have argued (Houghton et al., 1994; Shallice et al., 1995) that the apparent marked status of double items in sequences, and in particular the types of error made on doubled items (geminates), constitute strong evidence for special treatment of doubled items in serial behaviour; in spelling (Caramazza & Miceli, 1990) and typing (Rumelhart & Norman, 1982), for example, the “doubled” status appears to be represented separately from the identity of the item being doubled. The CQ model provides a principled explanation of their special status. Moreover, errors involving doubles are strongly suggestive of the errors found in implemented CQ systems with special-purpose doubling mechanisms added to them.

Our previous work on modelling neurological disorders of spelling using the CQ modelling paradigm has been concerned with GOB disorder Type A. Within a symbolic-stages model framework (e.g., Caramazza & Miceli, 1990) this disorder can be characterized as resulting from an impairment to a buffer that holds graphemic representations prior to their production in writing or spelling aloud. However, current symbolic-stages accounts of the Type A disorder (e.g., Caramazza & Miceli, 1990) make claims about the form of information represented within the spelling system without providing an account of the mechanism by which spelling is achieved from these representations. While it is important to have representational theories, certain features of an error pattern may be due to the mechanism that is damaged rather than the information that it is processing. Indeed we believe this to be the case for errors arising from damage to the “graphemic buffer”. It is important, therefore, to consider both representation and mechanism.

Our approach has been to view the GOB as a CQ system for generating sequences of letter identities when receiving input representing the identity of the word to be spelled. It is assumed that these letter identities are processed by downstream systems to produce written or spoken output. The GOB is therefore viewed primarily in the role of a sequence generator. On this approach the Type A disorder was explained as resulting from an increased uncertainty in the selection of the “winning” letter from the set of letter activations available at each moment during spelling production. This might, for example, be due to disruption of the selection system itself, disruption to or noise in the system generating the activation gradient, or a reduction in the general level of activation of letter representations in the face of a fixed amount of background noise. Our theoretical approach has been to explain what we consider to be the most salient features of Type A GBD as resulting directly from the characteristic breakdown pattern of the CQ sequencing process under random perturbation of “winning item” selection. This pattern includes a clear effect of word length on error rate, bowed serial error curves, and a characteristic set of error types including ordering errors such as the exchange of two letters within a word, letter substitution, insertion and deletion errors, and errors involving movement of a gemination operator. Within this framework other regularities in the GBD error pattern are most naturally interpreted as reflecting features of the activating system that establishes the activation levels of queued letters (Glasspool & Houghton, 1997). Regularities within this system lead to a characteristic pattern of vulnerability, upon which the detailed CQ error pattern is superimposed. This pattern of vulnerability would include, for example, the propensity for consonants to exchange with consonants and vowels with vowels in some patients’ errors.
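To illustrate how random perturbation of “winning item” selection could give rise to a word-length effect, the following sketch adds Gaussian noise to each letter's activation at every selection step. It is a toy under stated assumptions, not the published model: `noisy_spell`, `error_rate`, the linear gradient, and the noise level are all hypothetical, and already-produced letters are simply excluded from competition. The sketch yields only ordering errors (no substitutions, insertions, deletions, or geminate errors) and is not meant to reproduce serial position curves.

```python
import random


def noisy_spell(word, noise_sd, rng, step=0.1):
    """Spell a word (no repeated letters) with noisy winner selection."""
    levels = {ch: 1.0 - step * i for i, ch in enumerate(word)}
    output = []
    while levels:
        # Each remaining letter competes with its activation plus fresh noise.
        noisy = {ch: a + rng.gauss(0.0, noise_sd) for ch, a in levels.items()}
        winner = max(noisy, key=noisy.get)
        output.append(winner)
        del levels[winner]  # refractory: excluded for the rest of the word
    return "".join(output)


def error_rate(word, noise_sd, trials=2000, seed=0):
    """Proportion of trials on which the output differs from the target."""
    rng = random.Random(seed)
    errors = sum(noisy_spell(word, noise_sd, rng) != word for _ in range(trials))
    return errors / trials


for word in ("cat", "table", "complain"):
    print(word, error_rate(word, noise_sd=0.1))
```

Under these assumptions a longer word offers more selection steps, and more competitors at each step, at which a neighbouring letter can overtake the target, so the proportion of misspelled words would be expected to rise with length.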

In common with most models of the CQ type, this work has been based on fully localist single-layer connectionist networks (Glasspool, 1998; Glasspool & Houghton, 2005; Glasspool et al., 1995; Houghton et al., 1994; Shallice et al., 1995). These papers describe a number of different versions of the basic localist model. Our approach has been to start with the basic CQ sequence generation mechanism and then to elaborate the activating mechanism in order to attempt to simulate Type A GBD in more detail. For instance, the earlier models do not include explicit consonant/vowel representations, while the later models do.

However, from a computational perspective the form of localist representations and learning algorithm used in this class of CQ models has some disadvantages. A separate representation for the temporal context of each letter must be created for each word; the generation of timing information that drives sequential recall is integral to the representation of the sequence, and there are no opportunities for generalization across words based on temporal position. Additionally the lack of shared representations at the letter level of the model means that there are no cross-word generalizations in the spelling properties of the model; thus frequency of letter combinations and letter sequential dependencies are not represented in any way. No efficiencies of generalization are possible in the storage of several very similar words. Each must be stored separately. Moreover the problem of combining information sources, as when phonological and lexical information are combined in a two-route spelling model, is difficult (Glasspool et al., 1995).

In Glasspool (1998) and Glasspool, Shallice, and Cipolotti (1999) we describe an extension to the standard fully localist CQ architecture to allow CQ sequence production in a multilayer connectionist network trained with a backpropagation-type rule. This type of model does not have the theoretical inadequacies of the earlier localist versions. However, replacing localist context representations by distributed representations and word-specific sets of connections by a shared set of connections used by all words required hundreds of times more training epochs; it was computationally much more demanding. The model therefore had to be more restricted in its scope with omission, for instance, of consideration of geminates.

The overall philosophy of this approach is not to view these distributed models as inherently more adequate than the localist models, but only more adequate in particular respects. The modelling philosophy is to produce a set of models, each of which captures salient parts of an overall ideal model. If the models each work well with respect to the particular set of empirical phenomena most related to their key assumptions then an attempt would be made to combine them into a macromodel. We view these models as attempts to explore the possibilities of elaborating in various ways upon this central theme and also to test our hypothesis that a core set of features of GBD are due to the dynamic behaviour of the CQ sequencing mechanism rather than to the details of any particular implementation.

Predictions of the CQ GOB account

It is reasonable to assume that output from a damaged semantic pathway impinging on the GOB will be degraded relative to its usual form. What, though, will be the effect of such degraded input on the operation of the buffer? On the standard information-processing model of spelling, it is not possible to make any prediction.

Our view of the GOB as a CQ mechanism, however, predicts that degraded input will affect the generation of letter sequences in a number of ways. If degraded input leads to suboptimal activation of “queued” letter representations, then in the face of a constant level of background noise we would expect the likelihood of errors in sequence generation to be higher and the types of error to be broadly similar to those seen in GOB disorder. We would, however, expect differences in the detailed form of the errors produced. For example, our recent spelling models have, in common with many other CQ models, reacted to subthreshold activation of letters by omitting a response and have interpreted very low overall activation across letter representations as indicating that the end of a word has been reached. We would therefore expect omission of letters and premature ending of words to be more likely when disruption is due to low input activation level rather than increased uncertainty in letter selection.
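The omission and end-of-word behaviour described above can be made concrete with a small noise-free sketch. Everything in it is an assumption introduced for illustration: the function name `spell_with_degraded_input`, the `gain` parameter standing in for the quality of the input, the linear gradient, and the two thresholds `omit_below` and `stop_below`. It is not the model reported in this paper.

```python
def spell_with_degraded_input(word, gain, step=0.1,
                              omit_below=0.5, stop_below=0.3):
    """Spell a word (no repeated letters) from input scaled by `gain`.

    gain = 1.0 stands for intact input; smaller values stand for
    degraded input from a damaged upstream system (an assumption).
    """
    levels = {ch: gain * (1.0 - step * i) for i, ch in enumerate(word)}
    output = []
    while levels:
        if sum(levels.values()) < stop_below:
            break  # very low overall activation: treat as end of word
        winner = max(levels, key=levels.get)
        if levels[winner] >= omit_below:
            output.append(winner)
        # A subthreshold winner is omitted, but its slot is still used up.
        del levels[winner]
    return "".join(output)


print(spell_with_degraded_input("table", gain=1.0))  # prints "table"
print(spell_with_degraded_input("table", gain=0.7))  # prints "tab"
```

Because the gradient in this sketch falls towards the end of the word, uniformly weakened input pushes the later letters below threshold first, so the output is shortened while its beginning is preserved.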

When the semantic pathway is damaged, one would expect the degree of degradation in the representation of particular words to vary depending on semantic factors, such as semantic category, and on the robustness of the semantic representation, which can be expected to vary with concreteness or imageability. The degree of GOB disruption would be expected to reflect these variations, superimposing these effects on the pattern of errors produced by the disrupted GOB. The error pattern following from damage to the GOB in the presence of an optimal input from an intact semantic system, however, is dependent only on the degree of disruption internal to the CQ system and would not be expected to vary with the semantic category or the lexical status of the target word.

Both the localist and distributed versions of the CQ GOB model make an interesting prediction with respect to word frequency. The CQ approach to sequence generation requires only that the correct item is most active at a particular point in a sequence; the exact activation level, and the state of other items, is not important. CQ systems are thus less demanding than usual in their learning algorithms, and our models consequently have the property that once a word has been learned sufficiently that it can be correctly recalled (or once a specified level of robustness has been achieved in its recall), further exposure to the word will not lead to increased robustness of representations. Hence, providing all words in a training corpus have been learned at least to the minimum level required for correct recall, we would predict that errors caused by disruption to the GOB will not be sensitive to word frequency. The semantic-to-orthographic mapping may be modelled by a more conventional connectionist network, which we would expect to be sensitive to word frequency.¹

Our general prediction is thus that damage to systems or pathways impinging on the GOB will lead to error patterns with the broad characteristics of GOB disorder: increase of error rate with word length, presence of ordering errors, separation of geminate features from letter identity in errors, and possibly preservation of consonant/vowel status. However, there would be a greater rate of omission errors, particularly at the end of words due to the early ending of the production of letters, than is seen when sequence generation is disrupted by uncertainty in winning-letter selection. We would expect the somewhat different balance of error mechanisms at work to give rise to altered serial position curves for errors. Moreover, these effects will be superimposed on a pattern related to the vulnerability of the damaged system or pathway; in the case of the semantic system, we would expect that these will include effects related to semantic class and concreteness and word frequency. This is suggestive of the characteristics of Type B disorder. Disruption to the GOB, on the other hand, while the semantic orthographic system is undamaged, should lead to classic Type A error patterns with no effect of semantic class or word frequency. These predictions suggest that a rather simple model might provide a parsimonious unified explanation for the features of Type A and B dysgraphias and for the relationship between them.

In order to test these predictions, and to determine the detailed form of the error pattern resulting from suboptimal input to a CQ sequence generation system, we describe here a series of simulations with an outline computational model combining the GOB with input from lexical semantic representations, based on an extension to our multilayer network model of the GOB (Glasspool, 1998; Glasspool et al., 1999). In the current model we only consider inputs to the CQ system from semantic representations. This is done for two reasons. First, the overall modelling philosophy, as discussed earlier, is to increase the complexity of the basic CQ model in a step-wise fashion. Thus we increase the complexity by adding to existing models a simulation of one “route” to the graphemic buffer rather than two or more, and we do this by incorporating a route that carries representations that have no inherent temporal structure in them, so that two different temporal structures do not need to be combined theoretically. Secondly, the phonological route or routes, which are not simulated, are generally held to be grossly damaged in deep dysgraphia, so the semantic route, which is simulated, is the route that should be critical for an account of the determining properties of deep dysgraphia.

¹ In the model presented here a standard backpropagation network is used to model the semantic pathway. A more elaborate model of this pathway might include attractor dynamics (Plaut & Shallice, 1993), and it is not clear what effect this might have on the sensitivity of the semantic route to word presentation frequency. The incremental way in which such attractors emerge during training suggests that word frequency would affect robustness of representation in such a network, however, so we do not expect that the present simplified model would behave qualitatively differently in this respect.

However, we make one exception to this concentration on semantic aspects. The initial creation of the output graphemic representations will be primarily driven from sublexical phonology (Lennox & Siegel, 1994). Thus certain phonological characteristics may be expected to be implicitly represented in the structure of output graphemic representations. We therefore make the most limited assumption of this type, as proposed by Caramazza and Miceli (1990; see also Cubelli, 1991), that consonant and vowel status are represented in the output.

THE MODEL

The architecture of the model is shown in Figure 1. The following informal description presumes a familiarity with the principles of connectionist models; a more formal treatment is given in the Appendix. Full technical details of the GOB model on which it is based are given by Glasspool (1998).

The model comprises two subsystems representing the semantic activating system and the GOB (dotted outlines in Figure 1).

The semantic representation field represents the input to the semantic spelling pathway and specifies the word to be spelled in terms of a pattern of activation distributed over nodes representing semantic features. The pattern is held steady during the spelling of a word.

The immediate input to the GOB system cannot be the semantic representation itself since in the full system its input can also be based on translation from the phonological system. In fact, during normal development of the spelling system spelling is likely to derive initially from phonology (Lennox & Siegel, 1994). One would therefore expect to find representations intermediate between semantic features and letters even if these simply represent the hidden layer of a phonology-to-spelling system. Moreover, without a joint input from the two routes there would be no transfer of spelling knowledge between them. For these reasons the immediate input to the GOB on the present model is what we call the word identity field. It would correspond to the graphemic output lexicon on a symbolic approach and is held to represent the identity of the to-be-spelled word independent of semantic or phonological content. The semantic representation field projects via Hidden Layer 1 to the word identity field. The semantic activating subsystem is thus a straightforward multilayer connectionist network mapping from semantic representation to word identity.

The GOB subsystem has two input fields of nodes. One is the word identity field discussed above. Activation of the other input field, serial position, changes in a stereotyped way through a series of patterns representing successive serial letter positions in the word. This is equivalent to similar fields in other CQ sequencing models (e.g., Burgess & Hitch, 1992; Houghton, 1990). The two input fields project via Hidden Layer 2 onto an output field representing individual letter identities.

The parallel activation pattern over letter nodes is resolved into a sequence of letter identities by a competitive queuing output stage. The most active letter node is determined at each step during production, and this constitutes the output of the model. Following output of a letter the corresponding letter identity node becomes refractory and is briefly unavailable for further output. In our previous models this function has been performed by a “competitive filter”, a field with strong inhibitory connections between nodes creating a winner-takes-all competition (Houghton, 1990). In the present model a competitive filter is simulated by a peak-picking algorithm. The GOB network learns to associate a particular serial position in a particular word with a pattern of excitation on the output field, which, in conjunction with the “select-inhibit” dynamics of the CQ approach, results in the corresponding letter node being most active for that serial position during spelling.

Two threshold levels are free parameters of the model. If on a particular time step no letter activation exceeds the response threshold Tr, no response is made at that time step. If no letter exceeds the lower stopping threshold Ts, then the letter sequence for the current word is ended.
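The select-inhibit cycle and the two thresholds described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which is specified in the Appendix and in Glasspool, 1998): the function name `cq_recall`, the per-step activation table, the threshold values, and the choice to keep a letter refractory for exactly one following step are all hypothetical.

```python
def cq_recall(activations_per_step, t_response=0.5, t_stop=0.2):
    """Resolve per-step letter activations into a letter sequence.

    activations_per_step: list of dicts mapping letter -> activation,
    one dict per serial position (a hypothetical input format).
    t_response and t_stop play the roles of Tr and Ts; values are arbitrary.
    """
    output = []
    refractory = None  # letter just produced (assumed unavailable for one step)
    for activations in activations_per_step:
        # Peak picking stands in for the competitive filter.
        candidates = {l: a for l, a in activations.items() if l != refractory}
        if not candidates:
            break
        winner = max(candidates, key=candidates.get)
        peak = candidates[winner]
        if peak <= t_stop:
            break               # nothing exceeds the stopping threshold: end of word
        if peak <= t_response:
            refractory = None
            continue            # nothing exceeds the response threshold: omission
        output.append(winner)
        refractory = winner
    return "".join(output)


# Hypothetical activation gradients for the word "cat", plus one extra step.
steps = [
    {"c": 0.9, "a": 0.6, "t": 0.4},
    {"c": 0.8, "a": 0.9, "t": 0.5},
    {"c": 0.1, "a": 0.3, "t": 0.8},
    {"c": 0.05, "a": 0.1, "t": 0.1},
]
weakened = [{l: a * 0.6 for l, a in step.items()} for step in steps]

print(cq_recall(steps))     # cat
print(cq_recall(weakened))  # ca
```

In this toy case, uniformly weakening the input leaves the order of the letters intact but pushes the final letter below the response threshold, which is the kind of omission the account expects from degraded input.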

As in previous work (Glasspool, 1998; Glasspool et al., 1999) we add a pair of nodes at the output layer to indicate the consonant/vowel (CV) status of each letter. During training the network is required to activate the C or V node in parallel with the appropriate letter node. The CV nodes are ignored during testing (this has some similarities with the approach of Christiansen, 1997). This is to encourage the network to represent C/V status of letters; we make no theoretical claims for the way this is achieved in the current model. We interpret this as an example of a common phenomenon in serial behaviour: the imposition of domain-specific serial constraints or biases. Such constraints on the class of elements participating in an error also operate in models of speech output and verbal short-term memory, for example (see Glasspool & Houghton, 1997, for discussion). The C and V nodes are not included in the select/inhibit output process mediated by the competitive filter.

Figure 1. Architecture of the model. See text for details.

For reasons given above, we have not extended the multilayer version of our spelling model to include a mechanism for doubled letters, although we do not anticipate that this would in principle be more difficult than for the single-layer localist models in which such mechanisms have been realized (Houghton et al., 1994; Rumelhart & Norman, 1982; Shallice et al., 1995). We do not aim to account for doubled letters in the present model but assume that the general account provided in previous models will hold.

Representations and training

The representations used for input and output to the various elements of the model, the training procedures used, and the details of the training corpus are as follows.

Input representations

The input to the semantic route model (semantic representation in Figure 1) is a semantic representation of the word to be spelled. For the purposes of the current model we have not attempted to encode the semantics of each word individually. Each word is instead assigned a randomly generated vector of 56 semantic features, each of which may be “on” or “off”. A total of 28 of the semantic features are labelled “concrete” features, and high-concreteness words each have 14 of these features active, such that every possible pair of words differs on at least 5 features. The other 28 features are labelled “abstract”, and low-concreteness words each have 5 of these features active such that each pair differs on at least 3. High- and low-concreteness words thus differ in the density of their semantic representations. This method of representing abstraction is similar, for example, to that used by Plaut and Shallice (1993) in their explicit representation of semantic information.
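One way to generate vectors with these properties is rejection sampling, sketched below. The source does not say how its vectors were generated, so the procedure, the function name `make_semantic_vectors`, the seed, the count of 200 words per class, and the assumption that each word activates features only within its own block of 28 are illustrative choices.

```python
import random


def make_semantic_vectors(n_words, n_active, min_diff, offset, rng,
                          block=28, total=56):
    """Rejection-sample binary feature vectors with a minimum pairwise distance.

    Active features are drawn from one block of `block` features starting at
    `offset` (0 for "concrete", 28 for "abstract" in this sketch).
    """
    accepted = []
    while len(accepted) < n_words:
        active = set(rng.sample(range(offset, offset + block), n_active))
        # The number of differing features between two vectors is the size
        # of the symmetric difference of their active sets.
        if all(len(active ^ other) >= min_diff for other in accepted):
            accepted.append(active)
    return [[1 if i in v else 0 for i in range(total)] for v in accepted]


rng = random.Random(0)  # fixed seed, chosen arbitrarily
concrete = make_semantic_vectors(200, 14, 5, 0, rng)   # dense codes
abstract = make_semantic_vectors(200, 5, 3, 28, rng)   # sparse codes
```

Because any two vectors with the same number of active features differ on an even number of features, the stated minima of 5 and 3 are met in practice by differences of at least 6 and 4.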

The field of nodes forming the output of the semantic system and the input to the GOB (the word identity field in Figure 1) may be interpreted as corresponding to the output graphemic lexicon in the standard model. In a fuller model this field would represent the place where phonological lexical and semantic lexical information interact prior to the GOB. Representations here are assumed to be independent of semantic content, but some degree of overlap is nonetheless assumed between orthographically similar words to allow for some generalization across such words in spelling knowledge. We have based the chosen representation on the assumption that this field might represent orthographic words as the blend of a small number of idealized quasi-morphemes. A sparse distributed representation was therefore selected for this layer with 400 vectors defined over 400 nodes such that each vector has 4 active nodes. The scheme used ensures that each vector overlaps (shares active nodes) with six others, of which two each overlap by one active node, two overlap by two nodes, and two overlap by three. Vectors are randomly assigned to words.
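The source does not spell out the scheme that yields this overlap structure. One construction that happens to satisfy the stated counts is a window of four adjacent active nodes stepped around a ring of 400 nodes; the sketch below checks that property. The ring construction and the names `ring_window_vectors` and `overlap_profile` are assumptions for illustration only.

```python
def ring_window_vectors(n_nodes=400, width=4):
    """Vector i activates nodes i, i+1, ..., i+width-1 (wrapping around)."""
    return [frozenset((i + k) % n_nodes for k in range(width))
            for i in range(n_nodes)]


def overlap_profile(vectors, index):
    """Sorted non-zero overlaps between one vector and all the others."""
    target = vectors[index]
    overlaps = [len(target & v) for j, v in enumerate(vectors) if j != index]
    return sorted(o for o in overlaps if o > 0)


vectors = ring_window_vectors()
print(overlap_profile(vectors, 0))  # [1, 1, 2, 2, 3, 3]
```

Under this construction every vector has the same profile, so shuffling the list before pairing vectors with words would give the random assignment the text describes.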

CQ models require a means of roughly representing item position within a larger unit. This is the job of the serial position units. Using binary valued inputs, a suitable representation for serial position is that introduced by Burgess and Hitch (1992, 1996; see inset, Figure 1). A set of patterns is generated by shifting a “window” of active units across a field of inactive units. Each position is uniquely represented, but there is some overlap with other positions, the overlap being greater the closer positions are together. In the model a window of 8 active units is shifted across a field of 16 units to produce eight vectors that can be used to identify serial positions in the word to be spelled.
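The moving-window code can be written out directly. In the sketch below the window advances by one unit per serial position, which is an assumption (the source gives only the window size, the field size, and the number of vectors); the function names are hypothetical.

```python
def serial_position_codes(field=16, window=8, n_positions=8):
    """Build one binary vector per serial position by shifting a window."""
    codes = []
    for pos in range(n_positions):
        codes.append([1 if pos <= unit < pos + window else 0
                      for unit in range(field)])
    return codes


def shared_active(a, b):
    """Number of units active in both vectors."""
    return sum(x & y for x, y in zip(a, b))


codes = serial_position_codes()
print(shared_active(codes[0], codes[1]))  # 7: adjacent positions overlap heavily
print(shared_active(codes[0], codes[7]))  # 1: distant positions barely overlap
```

With a one-unit shift, two positions that are d steps apart share 8 - d active units, which gives the graded similarity between neighbouring positions that the text describes.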

Output representations

Since the CQ approach depends on the localized inhibition of a single output item at each step in sequence production, a local representation of letter identity is used at the output layer (letter output nodes on Figure 1) as in previous CQ models. A set of 26 nodes represents letter identities. An additional pair of output nodes represents the consonant/vowel status of letters during training.

Training corpus

A corpus of 400 words was used, 100 each of length 4, 5, 6, and 7 letters. Words were selected from the MRC psycholinguistic database (Coltheart, 1981; Quinlan, 1993). While word frequency and concreteness are simulated within the model (as detailed elsewhere in this section), the orthographic structure of words may vary systematically with these factors. We therefore used actual words that were either high or low on frequency or concreteness or both in the corpus. Half the words of each length were low frequency (mean Kucera–Francis frequency 3.8), the other half high frequency (mean frequency 190). In each of the resulting eight sets of words half were high concreteness (mean concreteness value in the MRC psycholinguistic database 536) and half low concreteness (mean MRC concreteness value 313). Frequency values were balanced across high- and low-concreteness conditions and concreteness values across high- and low-frequency conditions, and both were equalized across different word lengths, as far as possible.

Training procedure

Other than the one-to-one connections between output nodes and the competitive filter, all connections in the network are modifiable. For connections in the semantic route model a standard cross-entropy backpropagation training procedure was used (see Appendix). This section of the network was trained in isolation until the maximum error on any word identity node was below 1%, which occurred after 885 epochs. In each training epoch every high-frequency word was presented to the network, but each low-frequency word was presented with a probability of .3. After training, weights on this section of the network were frozen, and the entire network from the input layers to the letter identity field was trained in the following fashion.

Conventionally, backpropagation algorithms assume that the activation level of every output unit is equally important for the generation of a correct output. What is critical in the present situation is that at each time step the correct letter is the most active one. The CQ mechanism itself ensures that activation levels of units other than the most active one have no effect on immediate behaviour. Moreover, subsequent behaviour needs to be influenced by the current activation levels of the competitors. For the remainder of the model, which implements a CQ output stage, a more minimal procedure than standard backpropagation is therefore used. A “lazy” learning rule is used in an approach that is similar in some respects to that of Jordan (1986). This learning procedure, proposed by Glasspool (1998), is “lazy” in that learning only occurs for the individual letters explicitly involved when an error occurs in sequence generation. The error signal used to drive learning in the network is calculated for each letter node at each time step essentially according to the following rules (further details are given in the Appendix):

• If the letter that should be most active in the current position within the current word (the target letter) is not activated above the preset response threshold Tr, the error signal is such that this letter is reinforced, making it likely to be more active in this position in subsequent trials. If it is activated above threshold, no error signal is generated for the target letter.

• If the most active letter is not the target letter, then the error signal for the most active letter is such that this letter is punished, making it likely to be less active in this position in subsequent trials.

• No error signal is generated for any other letters, regardless of their activation level.
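The three rules above can be sketched as a function that assigns an error signal to each letter node at one time step. The actual magnitudes and the learning margin are defined in the Appendix and are not reproduced here; the unit values, the threshold of 0.5, and the name `lazy_error_signals` are placeholders for illustration.

```python
def lazy_error_signals(activations, target, t_response=0.5):
    """Return a per-letter error signal following the three rules.

    +1.0 means "reinforce", -1.0 means "punish", 0.0 means "leave alone".
    These unit magnitudes are placeholders, not the values used in the model.
    """
    errors = {letter: 0.0 for letter in activations}
    if activations[target] <= t_response:
        errors[target] = 1.0           # rule 1: target not above threshold
    most_active = max(activations, key=activations.get)
    if most_active != target:
        errors[most_active] = -1.0     # rule 2: a wrong letter is winning
    return errors                      # rule 3: every other letter stays at 0


acts = {"c": 0.9, "a": 0.6, "t": 0.4}
print(lazy_error_signals(acts, "c"))  # {'c': 0.0, 'a': 0.0, 't': 0.0}
print(lazy_error_signals(acts, "t"))  # {'c': -1.0, 'a': 0.0, 't': 1.0}
```

When the target is already the most active letter and above threshold, every signal is zero, so no weights change however often the word is presented. This is the property behind the prediction that learning in the CQ stage is insensitive to word frequency.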

Training is continued for two letter positions beyond the end of the word, in which the model is trained to reduce all letter activation below the stopping threshold Ts. A small margin is incorporated in comparisons with thresholds in these rules (see Appendix), meaning that the model is trained with slightly more stringent thresholds than those that are used during recall. This is similar to the “learning margin” used in our previous CQ models and prevents undue fragility of spelling performance simply due to operation of the model too close to its thresholds. The error values thus calculated are used in a cross-entropy backpropagation learning algorithm (see the Appendix for details). In each training epoch every word is presented to the network once, in random order.

While our preferred explanation for the lack of word frequency effects in GBD is as outlined above (the clearly defined limits on learning in CQ systems resulting in no overlearning for frequent items), there is a practical difficulty for this explanation in the present model. Pilot studies indicated the pragmatic computational need in a network of this complexity for a “momentum” term in the learning rule (see Appendix) in order to learn a substantial number of words in a tractable time on our equipment. The use of a momentum term is a standard technique in networks with backpropagation learning rules and relates purely to speeding up learning. However, the momentum term also entails that weight changes in the network during training are artificially affected by presentation frequency. As this is a pragmatic feature of the implementation intended to accelerate learning we do not place any theoretical weight on it. However, for practical purposes in the majority of simulations reported here the momentum term is present. To avoid the artifactual effect of presentation frequency in the GOB section of the network we therefore only expose the semantic pathway section of the model to varying word frequency; high- and low-frequency words are presented equally often to the GOB section. Because this is an unrealistic assumption a version of the model was also tested in which no momentum term is used, and in which both subsystems are exposed to varying word frequency. Of necessity this is a smaller scale model; however, it is sufficient to test our predictions regarding word frequency effects. It is discussed in the section on frequency effects below.

Simulation procedure

Type A disorder

On the model, Type A disorder is the consequence of disruption to the sequence generation mechanism of the GOB itself. It has been argued elsewhere (e.g., Houghton et al., 1994) that one obvious manipulation exists whereby nonspecific damage to the operation of a CQ system may be simulated: the addition of random noise to the activation levels of competing items. This has the effect of rendering the competitive process nondeterministic and corresponds to a loss of positional specificity in the sequencing process. Accordingly, random disruption of the output competition is how we simulate damage to the GOB component of our model. However, simply adding noise directly to item activation levels can lead to instability when recall is halted using a threshold on activation levels. While other mechanisms are possible for halting sequence generation at the end of a word, none is so intuitively satisfying as stopping when no letter remains with superthreshold activation (Glasspool, 1998). We follow Glasspool (1998) and Glasspool and Houghton (2005) in adding noise at the level of the competitive filter rather than at the letter output field (see Appendix). This may be thought of as targeting disruption directly at the process of response selection.
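A rough sketch of noise applied at the selection stage rather than to the stored activations is given below. The uniform noise distribution, the function name `noisy_winner`, and the example activations are assumptions; the Appendix defines the actual noise process.

```python
import random


def noisy_winner(activations, noise, rng):
    """Pick the most active letter after adding selection-stage noise.

    Noise is drawn uniformly from [-noise, +noise] (an assumed distribution)
    and applied to a copy, so the activations that would be compared with
    the response and stopping thresholds are left untouched.
    """
    perturbed = {l: a + rng.uniform(-noise, noise)
                 for l, a in activations.items()}
    return max(perturbed, key=perturbed.get)


rng = random.Random(1)  # fixed seed, chosen arbitrarily
close_race = {"a": 0.52, "b": 0.50, "c": 0.10}  # hypothetical activations

clean = [noisy_winner(close_race, 0.0, rng) for _ in range(2000)]
noisy = [noisy_winner(close_race, 0.04, rng) for _ in range(2000)]
wrong = sum(1 for w in noisy if w != "a")
print(clean.count("a"), wrong)
```

Letters whose activations lie close together are the ones that exchange places under noise, while a letter far below the leaders (here "c") is never selected, which is consistent with disruption producing ordering errors among near neighbours.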

Type B disorder

On the model, Type B disorder results from degraded input to the GOB. Because we are interested in the relationship between this disorder and deep dysgraphia we have included parts of the lexical semantic system in the model, and we simulate Type B disorder by lesioning connections in the projections from the semantic representation field via Hidden Layer 1 to the word identity field. Bullinaria and Chater (1995) show that, unless a very large number of hidden units can be used, a lesion to such a projection in a multilayer feedforward network should always be simulated by uniform reduction in strength of all connections in the projection rather than random cropping of connections. As the number of connections in a projection increases towards the high numbers found in biological neural systems, the results obtained by random cropping asymptotically approach those of uniform weight reduction. Conversely, in the limited simulations that are tractable on current equipment input–output mappings are distributed unevenly across connections and hidden units, and cropping and weight reduction produce divergent results. Accordingly we simulate damage to the semantic pathway by reducing connection weights by a uniform proportion.
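The two lesioning methods contrasted above can be sketched on a small weight matrix. The matrix size, the random weights, and the function names are hypothetical; only the idea of scaling every weight by one factor comes from the text.

```python
import random


def scale_weights(weights, factor):
    """Uniform lesion: multiply every connection weight by the same factor."""
    return [[w * factor for w in row] for row in weights]


def crop_weights(weights, p_remove, rng):
    """Random cropping: delete each connection with probability p_remove."""
    return [[0.0 if rng.random() < p_remove else w for w in row]
            for row in weights]


def net_input(weights, inputs):
    """Summed weighted input to each receiving unit (one row per unit)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


rng = random.Random(2)  # fixed seed, chosen arbitrarily
weights = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(3)]
inputs = [1, 0, 1, 1, 0, 1]

intact = net_input(weights, inputs)
scaled = net_input(scale_weights(weights, 0.58), inputs)
cropped = net_input(crop_weights(weights, 0.42, rng), inputs)
```

Uniform scaling multiplies every unit's net input by exactly the same factor, whereas cropping a small projection changes each unit by a different and chance-dependent amount.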

SIMULATIONS

The model was trained as described above until 20 epochs were achieved with all words spelled correctly. This was after 708 epochs. The intact model, when fully trained, produces correct spellings for all 400 words in its corpus (the present model is not intended to model the spelling of novel words; see the Discussion section for comments on this process). From this single intact model four different damaged models were produced, which were used to provide the results reported in this section.

A Type A model was produced by adding random noise to activation levels of all nodes in the competitive filter of the intact model. The noise level was adjusted to give a performance of approximately 50% correct spelling for six-letter words, which is comparable to our previous simulation work and with the performance of several well-known Type A patients (e.g., AS and LB). This was achieved with a noise magnitude of ±0.04.

A Type B model was produced by setting a much lower background level of noise in the competitive filter nodes (±0.005, sufficient to cause less than 0.05% errors with intact weights) to simulate an intact GOB. Damage to the semantic system was then simulated by degrading weights. All weights in the projections from the semantic representation field via Hidden Layer 1 to the word representation field were scaled by a uniform factor of 0.58, which was found to give similar overall spelling performance to DA (4% correct spelling on seven-letter words, average 15% correct over four- to six-letter words).

One concern is that the Type B model is operating with a considerably more severe deficit than the Type A model, which raises the possibility that differences in error patterns between the two models could be due simply to the difference in severity of damage rather than locus of damage. We therefore produced two control models: the “Type B control” model is a Type B model with a lower level of impairment (weights scaled by 0.67) to match its performance approximately to the Type A primary model, and the “Type A control” model is a Type A model with a higher noise magnitude (±0.09) to match its performance to the Type B primary model. In the results reported below the primary models are given solid lines in graphs, and the control models broken lines. All results were averaged over 500 runs of the models over the test corpus (i.e., 200,000 attempted spellings for each model). We illustrate the performance of the models by comparing with LB (Caramazza et al., 1987) and AS (Jonsdottir et al., 1996), Type A patients, and with DA (Cipolotti et al., 2004) and BA (Ward & Romani, 1998), Type B patients. Of these, AS, DA, and BA are English speakers, and LB is an Italian speaker.

Word length effect

Figure 2 shows overall spelling performance of each model plotted against word length. The damage introduced to the models produces spelling errors, and these show a clear effect of word length with fewer errors on shorter words. Strong sequence length effects are a standard finding with CQ models and occur for at least three reasons. First, more items in a sequence present more opportunities for error. Second, there is a tendency for recall to be better (due to relatively less confusability in item position) near the start and end of a sequence, and these end effects have a relatively larger impact on shorter sequences. Third, in some models the gradient of activation levels of the sequenced items is steeper in shorter sequences (this makes the difference in activation between adjacent items larger, and so successive items become more distinctive). Given the type of representation that we use at the sequence position layer we assume that the first two of these factors are primarily responsible for the effect in the present model.
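The first of these reasons can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that every letter is produced correctly with the same independent probability; this is not the model's mechanism and it ignores the second and third factors. The per-letter probability is chosen so that six-letter words come out 50% correct, matching the calibration point used for the Type A model.

```python
def word_accuracy(p_letter, length):
    """Probability that all letters are correct, assuming independent letters."""
    return p_letter ** length


# Per-letter accuracy that gives 50% correct on six-letter words.
p = 0.5 ** (1 / 6)

for n in (4, 5, 6, 7):
    print(n, round(word_accuracy(p, n), 3))
```

Even under this crude assumption, accuracy falls steadily from four- to seven-letter words, so some length effect arises from the number of opportunities for error alone.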

The primary models produce curves with similar gradients to their controls. The curves appear more related to severity than to locus of damage in the models. Figure 2 also plots comparable data for Type A patients AS and LB and Type B patients DA and BA. The Type A modelled curves are similar in shape and gradient to the Type A patients. The Type B curves are similar to patient DA but patient BA clearly has a less severe deficit.

Concreteness and frequency

Figure 3 shows overall performance of each of the damaged models on (a) the high- and low-concreteness words in the test corpus and (b) the high- and low-frequency words (200 words in each category).

The Type A models show no effect of either concreteness or frequency. This is expected as the intact semantic spelling route in these models produces essentially perfect activation patterns at the word representation field regardless of word frequency or concreteness. The GOB subsystem thus operates with normal input regardless of either variable, and errors occur only as a result of sequencing inaccuracies introduced by noise in the competitive queuing process. By contrast, the Type B models show effects of both frequency and concreteness. Here errors result from suboptimal input to an intact GOB, and as expected the effects of semantic route damage are greater for low-concreteness and low-frequency words, which produce more abnormal word representation patterns and hence higher error rates.

Figure 2. Overall spelling performance of the models for each word length, compared with Type A patient AS and Type B patient DA.

Figure 3. The performance of the models on high- and low-concreteness (a) and high- and low-frequency (b) words. The “mini” models vary word frequency during training of the GOB component and do not include a momentum term in the backpropagation learning procedures. They are trained on only 100 words.

The effect of concreteness on the Type B models is clear in Figure 3(a), but the magnitude of the effect is less than that for patient DA (Cipolotti et al., 2004, Table 5). However, no attempt has been made in the present model to accurately model the difference in semantic representation between high- and low-concreteness words. Rather, our representational scheme for word semantics aims to capture the qualitative difference between the two groups. Again for Figure 3(b), while the qualitative effect of word frequency is clear and accords with the effect in patient DA (Cipolotti et al., 2004, Table 7), the magnitude is rather low. However, we have represented word frequency extremely crudely in the model with the aim of capturing only the qualitative effect.

As discussed above, it was not practical to explore our preferred theoretical explanation for the lack of a frequency effect in Type A disorder in the full-scale model due to the need to have a momentum term in the learning rule in order to make training tractable. We have therefore not exposed the GOB model to word frequency differences during training. This is a highly unrealistic assumption. To test our preferred explanation we produced a small-scale version of the model, which is identical to the standard model except for the following features. A corpus of 100 words was used, with 25 words of each of four, five, six, and seven letters. A total of 12 words at each length were high frequency, 13 were low frequency, 12 words were of high concreteness, and 13 were of low concreteness, arranged so that all four permutations were covered at each word length as before. The words used were selected from the larger corpus. Hidden Layer 2 was reduced to 50 nodes. The momentum term in the learning rules for both semantic and GOB sections of the model (see Appendix) was set to 0, and word frequency was represented in GOB training as well as semantic route training, using the same technique. The semantic network was trained to criterion on the full set of 400 input representations after 9,545 epochs. However, to reduce training time for the second and more computationally demanding simulation, only the first 12 low-concreteness and first 13 high-concreteness word identity representations at each word length were fed to the graphemic buffer network. The GOB network was trained until 200 consecutive correct epochs were achieved (the tighter criterion being due to the probabilistic occurrence of low-frequency words in each epoch). Without the momentum term a much larger number of training epochs were required (76,743). Type A and Type B primary models were produced (with a noise level of ±0.0025 and with a scaling factor of 0.65 alongside background noise at ±0.0005, respectively). The results are shown in Figure 3 as “mini-model” (dashed lines).

Despite the fact that the GOB network has been exposed to high-frequency words three times as often as low-frequency words, there is no difference in the Type A error rate between high- and low-frequency words. By contrast the Type B model shows a clear effect of frequency. This is just what would be expected on our explanation in terms of the particular requirements of learning in CQ systems.
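The exposure ratio described for the mini-model can be illustrated with a minimal sketch. It assumes, as one possible reading of the "probabilistic occurrence of low-frequency words in each epoch", that every high-frequency word is presented in every training epoch while each low-frequency word is presented with probability 1/3. The function name `build_epoch`, the parameter `low_prob`, and the example words are hypothetical and are not taken from the model's implementation.

```python
import random


def build_epoch(high_freq, low_freq, rng, low_prob=1 / 3):
    """Return the list of words presented in one training epoch.

    Assumption (not from the paper): high-frequency words always appear,
    low-frequency words appear independently with probability `low_prob`.
    """
    epoch = list(high_freq)
    epoch += [word for word in low_freq if rng.random() < low_prob]
    rng.shuffle(epoch)  # presentation order within an epoch is randomized
    return epoch


if __name__ == "__main__":
    rng = random.Random(1)
    high, low = ["POWER", "STORY"], ["SCORN", "ADORN"]
    seen = {word: 0 for word in high + low}
    for _ in range(3000):
        for word in build_epoch(high, low, rng):
            seen[word] += 1
    # High-frequency words end up presented roughly three times as often.
    print(seen)
```

Under this assumption a low-frequency word is absent from most epochs, which is one way to see why a run of consecutive correct epochs is a stricter training criterion than a single correct epoch.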

Serial position

Figure 4 shows the number of single-error responses produced at each serial position for each of the full-scale models. These figures are normalized across word lengths using the scheme of Wing and Baddeley (1980), which distributes errors for all word lengths to five notional serial positions.
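The idea of pooling errors from words of different lengths into five notional positions can be sketched as follows. This is a simple proportional binning chosen for illustration and is not Wing and Baddeley's (1980) published allocation scheme; the function names `notional_position` and `serial_error_counts` are hypothetical.

```python
from collections import Counter


def notional_position(index, length, bins=5):
    """Map a 0-based letter index in a word of `length` letters to one of
    `bins` notional positions (0-based), using the centre of the letter slot.

    Illustrative proportional rule, not the published allocation table.
    """
    return ((2 * index + 1) * bins) // (2 * length)


def serial_error_counts(errors, bins=5):
    """Pool errors over word lengths.

    `errors` is an iterable of (word_length, error_index) pairs.
    Returns a list with the number of errors in each notional position.
    """
    counts = Counter(notional_position(i, n, bins) for n, i in errors)
    return [counts.get(b, 0) for b in range(bins)]


if __name__ == "__main__":
    # Each six-letter position mapped to a notional position.
    print([notional_position(i, 6) for i in range(6)])
    print(serial_error_counts([(4, 0), (6, 2), (6, 3), (7, 6)]))
```

Because some notional positions receive more letter positions than others for a given word length, a fuller treatment would also normalize by the number of opportunities for error in each bin.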

The Type A models (Figure 4a) produce curves which are bowed, as typically are those for Type A patients (see Caramazza et al., 1987; Jonsdottir et al., 1996). Serial curves showing primacy and recency effects (relatively preserved performance on items near the start and end of sequences) are typical of damaged CQ models. A number of factors contribute:

1. As sequencing progresses earlier sequence items are inhibited and thus removed (to some extent, depending on the degree of inhibition) from the pool of available competitors. This tends to reduce errors late in a sequence.

2. In some models sequence items have higher absolute activation nearer the start of a sequence (this is not the case in the present model). This can lead to better separation of consecutive items and hence fewer errors.

3. In models with dynamic cueing of serial position (as in the present model) “end effects” arise where serial positions near the start and end of the sequence overlap in their representations with fewer adjacent positions than those in the middle of the sequence.

In the present Type A model Factors 1 and 3 are likely to contribute to the observed serial position effect.

The Type B models (Figure 4b), however, produce a monotonically increasing serial error curve. We interpret this as resulting from the different mix of error mechanisms that predominate when letter activations are reduced in strength. While the profiles of all error types change to some extent, errors involving omission of letters or the early stopping of sequencing due to subthreshold letter activation predominate in this situation, and the large number of late deletion errors is the main contributor to the altered serial error curve.
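The contrast between noise inside the sequencing mechanism and weakened input to an intact mechanism can be illustrated with a deliberately simplified toy, which is not the model reported here. The sketch assumes localist letter units, a graded activation pattern that is strongest for the first letter (a feature the present model does not have), complete inhibition of each letter once produced, and a single output threshold. The function `cq_spell` and all parameter values are hypothetical.

```python
import random


def cq_spell(word, noise=0.0, scale=1.0, threshold=0.45, seed=0):
    """Toy competitive-queuing run; returns the string of letters produced.

    noise > 0  : disruption of the competition itself (Type A analogue).
    scale < 1  : uniformly weakened input to the buffer (Type B analogue).
    All values are illustrative assumptions, not the paper's parameters.
    """
    rng = random.Random(seed)
    n = len(word)
    # Assumed graded input: first letter strongest, last letter just above 0.5.
    activation = [scale * (1.0 - 0.5 * i / n) for i in range(n)]
    produced = [False] * n
    output = []
    for _ in range(n):
        # Letters already produced are fully inhibited and cannot win again.
        noisy = [
            float("-inf") if produced[i]
            else activation[i] + rng.uniform(-noise, noise)
            for i in range(n)
        ]
        winner = max(range(n), key=noisy.__getitem__)
        if noisy[winner] < threshold:
            break  # no letter is active enough: spelling stops early
        output.append(word[winner])
        produced[winner] = True
    return "".join(output)


if __name__ == "__main__":
    print(cq_spell("SPIRIT"))               # intact toy model
    print(cq_spell("SPIRIT", scale=0.65))   # weakened input: truncated response
    print([cq_spell("PERIOD", noise=0.1, seed=s) for s in range(5)])
```

With these settings, weakened input leaves the order of the early letters intact but pushes the later letters below threshold, so the response ends early. Noise of this size leaves every letter above threshold in a six-letter word but can change which letter wins at each step, so the toy produces ordering errors rather than truncations.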

Figure 5 repeats this analysis for all errors, plotting the total percentage of incorrect letters at each serial position, again normalized across word lengths. This is the way that the serial position curves of errors made by Type B patients have typically been represented. Again bowed curves are produced for Type A, but monotonically increasing curves for Type B. The Type B patients who have been described show serial position curves of this type (see Cipolotti et al., 2004). The bowing of the Type A curves in Figure 5 is very shallow, however. We assume that the difference in degree of bowing between Figures 4 and 5 reflects differences in the distribution of complex errors (i.e., errors involving more than a single transposition, insertion, deletion, or substitution).

Figure 4. The incidence of errors occurring in each serial position in responses containing a single identifiable error, (a) for Type A models and patients and (b) for Type B models and patients. Position of error is determined according to the scoring scheme of Caramazza and Miceli (1990) and is normalized across word lengths by the method of Wing and Baddeley (1980).


Error types

Table 1 gives examples of the errors produced by the Type A and Type B models, and Figure 6 shows the proportions of different error types produced for six-letter words.

The model data in Figure 6 are restricted to errors on six-letter words and to responses including only a single apparent error to reduce the possible ambiguity in classification. The closest equivalent data for Type A patients LB and AS and Type B patients DA and BA are also provided. Some caution must be exercised in comparing these results, for two reasons. First, different studies have used slightly different criteria to classify error types (for example, they have included or excluded multiple-error responses). Second, the distribution of error types can be affected by the distribution of word lengths included in the study, as proportions of different error types can vary with word length. However, we can make some general observations.

Type A dysgraphic patients vary somewhat in the relative proportions of different error types. ML and DH (Hillis & Caramazza, 1989) and SE (Posteraro et al., 1988) show a predominance of deletions, while FV (Miceli et al., 1985), for example, produces relatively few deletions (10%). For LB (Caramazza & Miceli, 1990; Caramazza et al., 1987), CW (Cubelli, 1991), and JH (Kay & Hanley, 1994) substitutions are the most common error (64% of FV’s errors are substitutions). HE (McCloskey et al., 1994) and AS (Jonsdottir et al., 1996) produce similar numbers of substitutions and deletions. Most patients also produce a fair number of insertion and transposition errors. For example, LB’s errors include 6% insertions and 17% transpositions, while for AS the proportions are 22% and 14%, respectively. For those patients where the distinction has been made (LB, JH, and AS), transposition errors are predominantly exchanges rather than shifts, and indeed shift errors were very infrequent for these patients. Mixed errors, containing more than one of the error types listed above, also occur, with different incidences in different patients, although such errors are generally at least as frequent as any of the individual error types.

The Type A models produce all four error types at this word length (deletion, substitution, insertion, and transposition errors). The Type B models produce deletion and substitution errors at this word length, though they also produce relatively low rates of insertion and transposition errors at other word lengths. Examining the error patterns of Figure 6, it is immediately clear that the two model types have different qualitative error profiles.

Figure 5. The percentage of letters produced in each serial position that were incorrect, normalized across word lengths by the method of Wing and Baddeley (1980).

Table 1. Sample errors from the primary Type A and B models

Error type            Examples from models
Insertion             “KEPT” → KEPDT (A)
                      “ENVY” → ENVNY (A)
Deletion              “POWER” → POER (A)
                      “VANITY” → VANTY (B)
Exchange              “MAIN” → MIAN (A)
                      “PLAYING” → PYALING (A)
Substitution          “PERIOD” → PEDIOD (A)
                      “STORY” → SCORY (B)
Fragment–correct      “SPIRIT” → SPIR (B)
                      “CORD” → C (B)
Fragment–similar      “SCORN” → SO (B)
                      “RADIO” → AI (B)
Fragment–unrelated    “ADORN” → I (B)
                      “FLOAT” → SO (B)

Note: The model that produced the error is indicated as (A) or (B).
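The four single-error categories illustrated in Table 1 can be made concrete with a minimal classifier. This is a sketch for responses containing exactly one error; it is not the scoring scheme of Caramazza and Miceli (1990), it does not handle shift errors or mixed errors, and the function name `classify_single_error` is hypothetical.

```python
def classify_single_error(target, response):
    """Classify a response that differs from the target by one simple error.

    Returns "insertion", "deletion", "substitution", "exchange",
    or None when the response is correct or is not a single simple error.
    """
    t, r = target, response
    if len(r) == len(t) + 1:
        # One extra letter: removing it must recover the target.
        if any(r[:i] + r[i + 1:] == t for i in range(len(r))):
            return "insertion"
    elif len(r) == len(t) - 1:
        # One missing letter: removing it from the target gives the response.
        if any(t[:i] + t[i + 1:] == r for i in range(len(t))):
            return "deletion"
    elif len(r) == len(t):
        diff = [i for i in range(len(t)) if t[i] != r[i]]
        if len(diff) == 1:
            return "substitution"
        if len(diff) == 2:
            i, j = diff
            # Two letters have swapped places (adjacent or not).
            if t[i] == r[j] and t[j] == r[i]:
                return "exchange"
    return None


if __name__ == "__main__":
    for target, response in [("KEPT", "KEPDT"), ("POWER", "POER"),
                             ("PLAYING", "PYALING"), ("PERIOD", "PEDIOD")]:
        print(target, response, classify_single_error(target, response))
```

Restricting the analysis to responses of this kind, as is done for the model data in Figure 6, avoids the ambiguity that arises when a response could be derived from the target by several different combinations of errors.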


The Type A models produce a pattern with insertions as the least common error type, transpositions the next common, deletions next, and substitutions most common. The same qualitative pattern is shown by Type A patient LB. Type A patient AS has the qualitative ranking of insertion and transposition errors reversed and has a more similar incidence of deletions and substitutions. The proportion of substitutions produced by the Type A models is somewhat higher than is usual among Type A patients, although it is comparable with that of FV, for example.

The Type B patients show a much higher incidence of deletion errors and few transpositions. The Type B models both also show a very high incidence of deletion errors and low rates of transpositions. For both models the second-ranking error type is substitutions, although the incidence is considerably lower than that for deletions. This also accords with patients DA and BA.

Although shift and exchange errors have been combined as “transpositions” in the figure, all of the models produce very low numbers of letter shift errors. This is an apparently robust feature of CQ models, and it is a feature of all Type A patients for whom the distinction has been made.

Ward and Romani (1998) introduced a new type of error, the fragment, in their analysis of the Type B dysgraphic patient BA. Fragments are errors that are two or more letters shorter than the target word. Type B patients BA and DA produce large numbers of fragment errors, whereas they are not a prominent feature of Type A dysgraphics. Table 2 shows the errors produced by each model that were classified as fragments using the same criteria as those for the analysis of DA’s errors. Fragments are classed as “correct” if there are only letters missing from the end of the word. Otherwise fragments are “structurally similar” if more than 50% of the letters in the response are present in the target word. All other fragments are classed as “unrelated”. (Examples of each fragment type from the model are given in Table 1.) Table 2 also includes comparable figures for Type B patients BA and DA.
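The fragment criteria just described can be written out as a short sketch. It assumes that "letters present in the target word" is counted with repeated letters matched one-for-one, which is our reading rather than a stated rule; the function name `classify_fragment` is hypothetical.

```python
from collections import Counter


def classify_fragment(target, response):
    """Classify a response as a fragment subtype, following the criteria in the text.

    Returns "correct", "similar", "unrelated", or None if the response
    is not a fragment (empty, or fewer than two letters shorter than the target).
    """
    if not response or len(target) - len(response) < 2:
        return None
    if target.startswith(response):
        return "correct"  # only letters from the end of the word are missing
    # Assumption: count shared letters as a multiset intersection.
    shared = sum((Counter(response) & Counter(target)).values())
    if shared / len(response) > 0.5:
        return "similar"
    return "unrelated"


if __name__ == "__main__":
    for target, response in [("SPIRIT", "SPIR"), ("SCORN", "SO"),
                             ("FLOAT", "SO"), ("VANITY", "VANTY")]:
        print(target, response, classify_fragment(target, response))
```

Applied to the fragment examples in Table 1, this reproduces the subtype given for each. Note that “FLOAT” → SO shares exactly half of its letters with the target and so falls just short of the "more than 50%" criterion.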

There is a striking effect of locus of damage on the overall proportion of fragments. The Type A models produce few fragments; the Type B models produce many. Virtually all of the fragments produced by the Type A models are of the “similar” subtype, whereas the Type B models also produce many “correct” fragments.

Figure 6. Distribution of different error types produced by the models (combining results for single-error responses and fragments over all word lengths) compared with similar analyses for Type A patients LB and AS and Type B patient DA.


The patients themselves do not appear to show a consistent pattern in the relative proportions of different fragment subtypes. The Type B models produce a low rate of “unrelated” fragments, which is similar to patient BA but is lower than DA. The rate of “correct” fragments is lower than that of either patient, though it is more comparable to DA than BA. Finally the model produces considerably more of the “similar” subtype than either patient. However, the model does concur with the high rate of preservation of the first letter in fragment errors, and in fact the primary Type B model closely matches this figure for patient DA.

The explanation for fragment errors in the model is very simple. A pair of letter activation thresholds must be exceeded for spelling of a word to continue and for a particular letter to be articulated. Degraded input to the GOB leads to subnormal activation levels within the model, which in turn leads to more frequent omission of letters and the early ending of the spelling attempt. The much larger number of deletion and fragment errors in the Type B models than in the Type A models follows straightforwardly. The large number of Type B model fragment errors that preserve the first letter of the word bears this explanation out, suggesting that stopping spelling too early is a common error. To check this we have confirmed in the Type B models that the number of fragments in which sequence generation appears to have stopped early falls off monotonically, as one would expect, with decreasing fragment length (i.e., fragments in which the final letter only is deleted are more frequent than those in which the final two letters are deleted, which in turn are more common than those with the final three letters deleted).

Table 2. Errors produced by each model that were classified as fragments

                      Type A                          Type B                          Patient
                      Primary         Control         Primary         Control         BA              DA
Fragment type         No.       %     No.       %     No.       %     No.       %     No.       %     No.       %
Total                 1,602     2     12,555    7     129,644   73    30,221    39    194       13    242       20
Correct               19        1     604       5     32,668    25    8,542     28    126       65    89        37
Similar               1,581     99    11,504    92    90,695    70    21,236    70    58        30    78        32
Unrelated             2         0     447       4     6,281     5     443       1     10        5     75        31
First letter correct  1,288     80    6,520     52    73,501    57    21,344    71    (Not reported)  148       61

Note: For comparison, fragment errors are also listed from Type B patients BA (Ward & Romani, 1998, Table 2) and DA (Cipolotti et al., 2004, Table 11). Single and multiple letter fragments are combined. % = percentage of all errors.

Consonant/vowel status

The GOB subsystem of the model is forced to represent the consonant or vowel status of letters during training. This was done to investigate the effect on the error pattern after damage of shared features in the internal representations created in such a model. Preservation of domain-dependent constraints in sequencing errors is common in many types of serial behaviour, and we see the relative preservation of consonant/vowel (C/V) status in the errors of some Type A patients as a manifestation of this general principle. In any CQ system where some feature of the items being sequenced is shared across items within particular categories, errors will tend to preserve these categories. In the case of movement errors this is equivalent to a higher probability of confusion errors occurring between items of the same category (Glasspool & Houghton, 1997). It makes most sense to speak of C/V status being preserved in substitution and exchange errors, and Table 3 shows the proportion of these types of error that preserve consonant/vowel status in each model. The sharing of C/V status information in the internal representations of letters in each model leads to a relative preservation of C/V status in errors, as expected.

This feature is included only to confirm that shared features can lead to preservation of domain-specific serial constraints in sequencing errors. We make no theoretical claim for the particular method that we use, and we have not attempted a quantitative fit to a particular patient.
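What it means for a substitution or exchange error to preserve C/V status can be shown with a minimal sketch that compares the consonant/vowel skeleton of target and response. It assumes that only A, E, I, O, and U count as vowels (so Y is treated as a consonant), which is a simplification of ours; the function names are hypothetical.

```python
VOWELS = set("AEIOU")  # assumption: Y is treated as a consonant


def cv_skeleton(word):
    """Return the consonant/vowel pattern of a word, e.g. 'MAIN' -> 'CVVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.upper())


def preserves_cv(target, response):
    """True if a same-length response has the same C/V pattern as the target."""
    return len(target) == len(response) and cv_skeleton(target) == cv_skeleton(response)


def cv_preservation_rate(pairs):
    """Percentage of (target, response) pairs that preserve C/V status."""
    pairs = list(pairs)
    kept = sum(preserves_cv(t, r) for t, r in pairs)
    return 100.0 * kept / len(pairs)


if __name__ == "__main__":
    errors = [("PERIOD", "PEDIOD"), ("STORY", "SCORY"),
              ("MAIN", "MIAN"), ("MAIN", "MNIA")]  # last pair is a made-up example
    print([preserves_cv(t, r) for t, r in errors])
    print(cv_preservation_rate(errors))
```

The substitution and exchange examples from Table 1 all keep the skeleton of the target, whereas the invented response MNIA does not, because a consonant has moved into a vowel position.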

GENERAL DISCUSSION

The disorder that we have termed “Type B graphemic buffer disorder” (see Cipolotti et al., 2004; Ward & Romani, 1998) has presented theorists with a paradox. Typical patients of this type (for example, BA and DA) have a pattern of errors that is, in most respects, qualitatively equivalent to that found in what was assumed to be the sole form of graphemic buffer disorder (Type A), of which prototypic patients are LB (Caramazza & Miceli, 1990; Caramazza et al., 1987) and AS (Jonsdottir et al., 1996). These patients are widely assumed to have a disorder at the level at which representations of letters are held in a buffer before being produced in writing or spelling aloud (Caramazza & Miceli, 1990; Caramazza et al., 1987) or at the roughly equivalent level in a connectionist architecture, that of letter form units (Houghton et al., 1994; Shallice et al., 1995). However, at the same time graphemic buffer Type B patients generally exhibit characteristics not normally associated with the GOB, such as the presence of semantic errors (as in deep dysgraphia) or more commonly an effect of word concreteness or part of speech on writing, or both. Conversely, Cipolotti et al. (2004) argue that typical deep dysgraphic patients show GOB Type B characteristics. Moreover, the Type B GOB patients described in the literature also show as a frequent error type the fragment error (Ward & Romani, 1998), in which the patient produces a response two or more letters shorter than the word itself. Often these fragments begin with the same first letter as that of the target word.

In this paper we have begun the process of situating the CQ model of the GOB within the context of the wider spelling system. Modelling the relationship between the semantic system and the GOB in this way results in a model which predicts the existence of the two forms of GOB disorder, A and B. Moreover, it also explains their most salient characteristics and the similarities and differences between them.

The model also has a number of limitations, however, in particular relating to more fine-grained details of patient error characteristics. We begin, therefore, by discussing the major successes and limitations of the model and considering what the implications of the limitations might be for our theoretical position. Despite the complexity of the model we consider this work to be essentially preliminary, laying the foundations for more comprehensive modelling of the spelling system. It is especially important, therefore, to identify the theoretically significant features of the model with respect to the data against which it has been compared. We finish this section by identifying the main features that are critical to our account of GOB disorders Types A and B.

Successes of the model

The GOB element of the model has a form of competitive queuing architecture (Houghton, 1990) initially applied to spelling by Houghton et al. (1994) and Shallice et al. (1995). The major development in the model is that it uses distributed rather than localist internal representations. When noise is added to the activation levels of the competitive queuing mechanism the model reproduces qualitatively similar characteristics to the localist form of the CQ model when noise is added at a similar point (Houghton et al., 1994). In particular the model reproduces the four characteristic error types of the empirical disorder Type A (GOB disorder), it shows a word length effect, and it shows preservation of consonant/vowel status in the errors.

Table 3. Percentages of substitution, exchange, and shift errors produced by the models in which consonant/vowel status is preserved

Type A                Type B
Primary   Control     Primary   Control
93        91          100       91

A primary aim of the simulations that we have described was to test our prediction that degraded input to an intact CQ sequence generation mechanism would result in errors with the same general character as those produced by a damaged CQ mechanism. This is supported by the production of deletion, substitution, and exchange errors (and insertion errors in some situations, see below) by the Type B models. The effects of word length on error rate and preservation of C/V status are also present in the Type B models.

On this model not only can the GOB Type A pattern be explained by a damaged CQ network but also those features that it shares with the GOB Type B error pattern (and hence that it shares with the typical deep dysgraphic pattern) can be explained in terms of degraded input to the GOB. This is compatible with our hypothesis that deep dysgraphia generally leads to weak output from central semantic representations so that only a weak input reaches the graphemic buffer in writing. A further aim was to test the assumption that the fundamental differences between GOB Type A and Type B disorders can also be explained by the same basic account.

At a gross level these include the effect on error rate of word frequency and of semantic factors such as concreteness in Type B but not Type A disorder. On the model Type B disorder is the consequence of damage to the semantic-to-orthographic system, whereas Type A disorder does not involve damage to systems carrying semantic representations. This difference is thus naturally explained as the result of the semantic pathway locus of damage in Type B but not Type A. Indeed the simulation shows strong effects of concreteness and frequency when the input to graphemic output systems from semantic representations is degraded, but not when the graphemic buffer itself is damaged.

At a finer level three major differences were noted in the pattern of errors produced by Type A and Type B patients. Type B patients produce monotonically increasing rather than bowed serial error incidence curves, they produce a majority of deletion errors, and they produce a new type of error in addition to those characteristic of GOB Type A disorder, the fragment error. All three of these differences have been identified in the simulations presented above, and they have been explained as straightforward consequences of the operation of a disrupted CQ sequencing system versus an intact CQ system operating on degraded input. Most critically the effect of reduced input to the CQ mechanism leads to many fragment errors, which occur only rarely in the Type A models. On the model, fragments occur as a direct consequence of the reduced activation of letter level units. It should be noted that the assumption of Schiller et al. (2001) that in such patients fragment errors occur due to an excessively rapid decay in the graphemic buffer could be incorporated in a different form in the model. However, it would be redundant. The qualitative similarity between mild and severe versions of each type of simulated disorder in the model confirms that the observed differences are due to damage locus rather than severity.

The model therefore produces the following advances:

1. It allows us to explain why the pattern of GOB disorder errors (substitutions, deletions, insertions, and transpositions) occurs in the context of at least two qualitatively different patterns of overall impairment.

2. It explains why one of these patterns should contain as a critical subcomponent the error pattern associated with “deep dysgraphia” and normally thought of as resulting from a more central locus of impairment than classical GOB disorder.

3. It explains the observed differences in the pattern of errors between the two types of disorder: specifically, the difference in serial error incidence, the difference in incidence of deletion errors, and the occurrence within the GOB Type B pattern of an additional type of error, the fragment error.

In all but the most advanced and narrow domains of science a model is valuable if it captures in a succinct way important aspects of the empirical data set. Particularly when it is produced fairly early in the theoretical investigation of a domain it is unrealistic to expect a model to explain the details of all findings that are potentially relevant. However, to be useful it must be potentially extendable to deal with aspects of the data domain where there is a possible discrepancy. The current model has the strong advantage of succinctly capturing the areas of overlap and difference between GOB disorder and deep dysgraphia, which no previous model has done in such detail. There are, however, a number of limitations in the model’s account of patient characteristics. How serious are its failures to capture empirical features, and where are the areas where its account of the data needs to be extended?

Limitations of the model

We can identify four main areas where the model displays significant limitations in its account of patient characteristics:

1. An inaccurate quantitative, and in some cases qualitative, match to the error patterns of patients.

2. An incomplete account of semantic errors.

3. No account of nonword spelling.

4. No account of errors involving double letters.

The first issue listed is the most important from the point of view of the theoretical adequacy of our approach, as it is the issue for which it is least clear whether modelling simplifications or problems with the underlying theory are to blame.

1. An inaccurate match to patient error patterns

As our aims for this model, outlined above, were rather broad we have not at this stage paid great attention to anything beyond the most general fit to empirical data. The model does in fact show a number of quantitative and in some cases qualitative deviations from the data. We have in most cases made no attempt to quantitatively model patient performance, so it is the qualitative mismatches that are of most concern. There are four main problems with the simulation results in this regard. First, the models aim to capture “prototypic” Type A and Type B patients, but as noted earlier a number of patients exist that appear to combine features of both types, or are otherwise nontypical. The models do not capture the error patterns of these patients. Second, the qualitative fit of the proportions of errors of different types in both typical A and typical B patients is imperfect. Third, the qualitative fit of the proportions of different types of fragment errors to those of Type B patients is imperfect. Finally, the Type A models produce only weakly bowed serial error curves when all incorrect letters are taken into account.

A number of patients reported in the literature do not fall cleanly into either Type A or Type B as we have characterized them. We have proposed that, since we relate Type A and B disorders to orthogonal types of damage, they represent end points of a continuum of possibilities corresponding to cases in which both types of damage may be present to some degree. We have not yet attempted to simulate such cases so we cannot categorically say what the predictions of the model would be. However, it is important to consider the ways in which these atypical patients diverge from the model’s predictions. We begin by summarizing the model’s predictions for the “prototypical” end-point A and B cases. With internal disruption to the competitive queuing process in the GOB the model predicts effects of word length, bowed serial error curves, letter-sequencing errors with substitutions, deletions, transpositions, and insertions, and a tendency to preserve consonant–vowel status in errors. Spelling accuracy is not sensitive to word frequency or concreteness. This we consider the model’s prototypic Type A profile. With degraded input to an intact GOB the model again predicts word length effects, consonant–vowel preservation, and letter-sequencing errors of the same general types as the Type A case. However, the proportion of letter deletion errors is much greater, and a new error, the fragment, occurs in substantial numbers. Additionally the serial error plot changes to a monotonically increasing curve, and spelling errors are sensitive to word frequency and semantic factors such as concreteness. This is the model’s predicted Type B profile.

Turning to patients apparently intermediate between these profiles, the majority of these fall into three classes. The most common are those essentially of Type A but which show a significant effect of word frequency or other lexical features (e.g., patients AS, JH, DH, JES, and HE). There appear to be two ways in which one might account for this within the present model. Most straightforwardly one could propose minor disruption to the semantic representation reaching the GOB (mild Type B disorder) combined with more serious disruption to the serial output stage (strong Type A disorder), in line with our “continuum” proposal.

Alternatively, two of our assumptions may not hold completely for all patients: that representations at the level of “word identity” are entirely free from semantic content, and that the GOB portion of the model is trained entirely to ceiling so that word frequency does not affect the strength of representations in this subsystem. If they do not, semantic content or word frequency would affect the error rate of the pure Type A model. In the present model we arrange for consonant–vowel status to be represented in the connection weights activating letter words, and this leads to a bias in the competition for output resulting in relative preservation of CV status. Features such as word frequency, age of acquisition, and concreteness, if they are represented at all in the system that activates letter representations in the GOB, may in the same way lead to biases in the output competition and effects on error patterns. In the present model we have assumed a “classical” interpretation of Type A GBD, with no lexical influences on activations at this level of the model. Sage and Ellis (2004) argue that such features, as well as the number of near “orthographic neighbours” that a word has, do often influence spelling in Type A GBD, and the present model has the advantage that such effects could be readily accounted for by relaxing the assumption of freedom from lexical influence in GOB representations. We would expect that relaxing the requirement that the GOB model be trained to ceiling would allow these influences to be felt in the Type A simulations.

Less common are patients who show anomalous serial position curves, either patients apparently of Type B with bowed serial error curves (for example FM and AM) or apparently of Type A but with a flat or monotonic serial position curve (e.g., FV and GSI, Miceli et al., 2004). The most obvious way in which the model might account for these patterns is through a combination of Type A and B damage. As this has not been simulated we cannot easily predict its effect on the model.

A third set of anomalous patients are apparently of Type B but produce no semantic errors (HR, AZO, and GSI). The present model simulates Type B disorder through degraded input to GOB. However, this degraded input might have a number of causes, some of which (for example, damage to projections from semantics to GOB) would not lead to semantic errors. Such patients might then correspond to damaged transfer of representations from an intact semantic system to an intact GOB system. In the absence of a full simulation of the semantic system (see below) the present model does not distinguish this case from a damaged semantic system.

None of these more complex situations has yet been simulated, so intermediate patients of these types remain an important issue for the model.

Two further types of patient are relevant to the theoretical position that we have taken. First, we claim that degraded input to the GOB from the semantic system necessarily results in Type B GOB symptoms. However, many types of semantic disorder involve no such dysgraphic symptoms. This would be critical to our account if it were assumed that the presence of semantic processing problems necessarily led to degraded input to the GOB. However, other types of input to the GOB may compensate (the phonological spelling route may be operating correctly, for example), and additionally there may be ways in which semantic processing could be disrupted while presenting a superficially normal output to the GOB (a fully activated pattern of activation representing an incorrect word, for example). A failure mode of this type is in fact a prediction of the Plaut and Shallice (1993) model of lexical semantic representation. A central aspect of that model is the emergence of “attractors”, which, under some types of damage, clean up errors in semantic to word-identity mapping so that a known word is always produced rather than, for example, a blend of words. Damage to the model can result in degradation of the output signal by comparison with the intact model; however, some types of damage can allow the attractor system to “clean up” a disrupted internal representation to produce a normal output, albeit possibly representing an incorrect word. This type of output from the semantic processing system would allow the GOB as we have modelled it to operate normally, resulting in deep dysgraphic symptoms in the absence of GOB Type B errors. While the review of Cipolotti et al. (2004) suggests that this is not the norm in deep dysgraphia, patient RCM (Hillis, Rapp, & Caramazza, 1999) appears to be close to this pattern. This patient produced many (56%) whole-word substitution errors, but relatively few (21%) spelling errors that resulted in nonwords. We would expect this type of error pattern under some types of damage to a model of the present type combined with a Plaut and Shallice type of lexical semantic system. However, a considerably more complex simulation will be necessary to test this aspect of the model’s predictions. Similarly, our position that the Type B error profile is associated with degraded input to the GOB does not necessarily imply a locus for the corresponding damage. It would seem that a lexical–semantic locus is common in such cases, but some patients with aspects of the Type B profile show no indication of lexical–semantic damage (e.g., GSI). Any damage affecting the transmission of information into the GOB would affect performance in this way, for example.

Second, we claim that certain key features of the error pattern in Types A and B GBD follow directly from the compromise of the GOB, whether through damage or degraded input. These include the production of letter sequence errors including insertions, omissions, transpositions, and substitutions, and the effect of word length on accuracy of spelling. Patients in which the former pattern appeared in the absence of the latter would challenge this position.

In fact a few patients have been reported who apparently show letter-sequencing errors in the absence of word length effects. However, we are aware of no patient of this type where there is no indication that a substantial proportion of errors arise in systems other than GOB (such as phonological spelling or allographic conversion). Thus Hillis, Chang, Breese, and Heidler (2004) report three dysgraphic patients who produce spelling errors resulting in nonwords in the absence of significant effects of word length. However, these patients all show a large proportion of phonologically plausible errors (62–90%). This suggests that they have preserved phonological spelling systems, which can be assumed to support input to the GOB. The two that were tested also showed many stroke-related errors, evidence of damage to allographic conversion processes subsequent to the GOB. It appears possible that these patients’ spelling errors do not involve abnormal operation of the GOB, which would be consistent with the absence of word length effects. RCM (Hillis et al., 1999) also shows no effect of word length on spelling accuracy. However, as discussed above this patient appears to fit the pattern of disrupted lexical semantic representations that are largely “cleaned up” prior to the GOB, and again the majority of the patient’s spelling errors may not involve abnormal operation of the GOB.

Turning now to the distribution of different error types (substitution, insertion, deletion, and transposition), this is the least stable feature of the model and is easily affected by changes in free parameters. Viewed positively this accords with the relative variability of the error distribution across different patients. The variability of patients makes it less easy to directly compare the models with data. However, certain common features appear to be present across a majority of patients. One of these, the universally low rate of shift errors, is robustly shown by the model (and by previous CQ spelling models). The most obvious quantitative discrepancy is the low rate of insertion errors in the simulations that we have reported (Figure 6). However, other types of manipulation to the existing model affect this feature. For example, reducing the value of the response threshold Tr from 0.8 to 0.7, during recall only, leads to higher rates of insertion errors on shorter words. It is, however, possible that an additional error mechanism may be at work in patients, which is not present in the model.

As yet the different mechanisms by which errors can occur within a simple CQ sequence generator are not well characterized (see Glasspool, 1998, 2005, for a preliminary discussion). This makes it difficult to predict in detail the effects of small changes to a model on the proportions of different error types produced. A number of factors influence the balance between error types. For example, the relative rate of insertions, transpositions, and substitutions that is produced on the model depends on specific details of the implementation of the impairment as well as the core assumptions of the model. It is likely to depend on:

1. How noise is added: whether it changes by the letter or by the word.

2. The distribution of noise: exponential, normal, or rectangular distributions, for example.

3. The number of thresholds used. Two activation thresholds are used in the simulation. There are two mechanisms for deletion errors in the model, one involving failure of any letter nodes to exceed one or both of the activation thresholds (for letter articulation or continuation of spelling) and another that is more basic to the CQ approach in which initially an anticipation error occurs, and then spelling continues without the letter that was omitted in the initial error ever being produced. Using only a single activation threshold would affect the first of these mechanisms, producing a reduction in the rate of deletion errors (the manipulation of the response threshold mentioned above has a similar effect and increases the rate of insertions relative to deletions). Various factors such as the slope of the letter activation gradient can affect the incidence of the second type of deletion (Glasspool, 1998).
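The three implementation choices listed above can be made concrete in a toy recall loop. The following is only a minimal sketch under stated assumptions, not the simulation reported here: it treats each letter position as a separate token with a fixed activation (the model itself uses a time-varying gradient over letter nodes), and the function names, the example gradient, and the stopping threshold value of 0.2 are hypothetical. Only the response threshold values 0.8 and 0.7 come from the text.

```python
import random

def sample_noise(rng, dist, scale):
    """Draw one zero-mean noise value from one of the three distributions named above."""
    if scale == 0:
        return 0.0
    if dist == "normal":
        return rng.gauss(0.0, scale)
    if dist == "rectangular":
        return rng.uniform(-scale, scale)
    if dist == "exponential":
        return rng.expovariate(1.0 / scale) - scale  # shifted so the mean is zero
    raise ValueError(f"unknown distribution: {dist}")

def cq_recall(word, gradient, t_response=0.8, t_stop=0.2,
              noise_scale=0.0, noise_dist="normal", noise_per="letter", seed=0):
    """Toy competitive-queuing recall with two thresholds (assumed simplification).

    gradient[i] is the activation of the letter token at position i of `word`.
    """
    rng = random.Random(seed)
    n = len(gradient)
    # Noise that "changes by the word" is drawn once and reused at every step.
    word_noise = [sample_noise(rng, noise_dist, noise_scale) for _ in range(n)]
    available = set(range(n))
    output = []
    while available:
        if noise_per == "letter":  # noise that "changes by the letter" is redrawn each step
            noise = [sample_noise(rng, noise_dist, noise_scale) for _ in range(n)]
        else:
            noise = word_noise
        winner = max(available, key=lambda i: gradient[i] + noise[i])
        strength = gradient[winner] + noise[winner]
        if strength < t_stop:
            break                        # nothing strong enough to continue: spelling stops
        if strength >= t_response:
            output.append(word[winner])  # letter is produced
        # otherwise the letter is omitted but spelling continues
        available.discard(winner)        # suppress the winner after selection
    return "".join(output)

GRADIENT = [1.0, 0.95, 0.9, 0.85, 0.75]  # hypothetical activations for a five-letter word
print(cq_recall("table", GRADIENT))                       # tabl  (final letter below 0.8)
print(cq_recall("table", GRADIENT, t_response=0.7))       # table
print(cq_recall("table", [1.0, 0.95, 0.9, 0.1, 0.05]))    # tab   (stops below 0.2)
```

In this sketch the response threshold produces omissions while the stopping threshold truncates the response, which is why changing the number or value of thresholds shifts the balance of error types.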

The discrepancies between model and patients in the proportions of different types of error are a concern for the CQ approach. However, we do not believe that it would be appropriate to argue from this alone that the model is flawed, for two reasons. First, the model does not tightly constrain performance on this feature. Performance is affected by changes to free parameters of the model, whereas other features of the approach are much more stable (see Glasspool, 1998, and Glasspool & Houghton, 2005, for discussion). Second, factors outside the model can be expected to have a relatively large impact on the finer details of error proportions. For example, this aspect of the data is likely to be affected most by strategic factors such as the patients’ tendency to guess.

A relatively clear general account exists on the model for fragment errors. However, the relative incidence of different subtypes of fragment errors is not predicted by that account. Indeed the rate of correct and similar fragments is reversed between the model and patient BA. Again, though, patients are variable on this measure, and we would expect the same range of factors as that mentioned above to influence this feature. However, the mechanism for fragment errors in the model does provide a straightforward explanation for two effects seen in the patients: the qualitative difference in incidence of fragment errors between Type A and Type B disorder, and the tendency for letters nearer the start of the word to be preserved in these errors (observed in both DA and BA). These appear to be somewhat more gross features than the relative proportions of correct and similar fragments, which is what we would expect to see if the discrepancy were due to the crude nature of our simulation of this type of error.

Finally, the Type A models produce only weakly bowed serial error curves when all incorrect letters are taken into account. Previous CQ models have demonstrated strongly bowed serial error curves, both in spelling and other domains (including models of Type A GOB, e.g., Houghton et al., 1994), so this does not appear to be a limitation of the CQ approach per se. It is possible that the move to a multilayer network architecture has in some way affected this aspect of the model’s operation. However, we do not yet understand the cause of this difference.

2. Incomplete account for semantic errors

From the perspective of modelling Type B dysgraphia the most serious omission in the present model is the lack of a full model for the semantic spelling route. Our general assumption has been that a system of the type advanced by Plaut and Shallice (1993) would provide this function.

While it is beyond the scope of this paper to combine two complex models, we would expect that full simulation of the semantic system would produce similar results to the present model with the exception that some errors would be incorrect but intact words semantically related to the target. However, the manner in which output from a damaged Plaut/Shallice type network is degraded may differ in detail from that assumed in the present model, and further simulations will be required to confirm that this does not materially affect our overall account.

3. No account for nonword spelling

The present model does not attempt to cover the entire spelling system. The main omission is the lack of a route via phonology, which could be used to spell nonwords. However, we do not believe that incorporating a phonological spelling route would impact on the general account provided by the present model.

The extension of the CQ approach to a multilayer network architecture allows us to propose a simpler account for the integration of semantic and phonological routes at the level of the GOB than was possible in our previous localist spelling models (Shallice et al., 1995). The multilayer framework has the potential to learn a generalized mapping from a phonemic representation to a spelling output, allowing the model to incorporate a phoneme-to-grapheme conversion system as well as the graphemic output lexicon. Two options are possible: the phonological system might project to Hidden Layer 1, in which case the word identity field would represent a combined phonological and semantic orthographic lexicon, or the word representation field could be split into semantic and phonological sections, each receiving input via their own hidden layers from semantic or phonological representations. The phonological representation might represent the entire word and be held constant throughout spelling, or it could be a dynamic pattern such as the shifting window of models such as NETspell (Olson & Caramazza, 1994). We leave exploration of these possibilities for future work.

4. No account for spelling with geminate (double) letters

Our previous modelling work (e.g., Glasspool, 1998; Houghton et al., 1994; Shallice et al., 1995) as well as other CQ modelling work (e.g., Rumelhart & Norman, 1982) has demonstrated the general effectiveness of a separate geminate representation in modelling the phenomena associated with doublings in serial behaviour, including spelling. The arguments supporting this approach hold good for the current model, and we do not believe it will be problematic in principle to add a separate geminate representation to the current GOB model.
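One way to picture a separate geminate representation is as a doubling marker stored alongside letter identity, so that the sequencing mechanism only ever has to order distinct letter tokens. The sketch below is an illustrative data-structure example only; the function names and the (letter, doubled) encoding are assumptions and do not describe how any of the cited models implement the idea.

```python
def encode_geminates(word):
    """Represent a word as (letter, doubled) tokens, one token per letter or doubled pair."""
    tokens, i = [], 0
    while i < len(word):
        doubled = i + 1 < len(word) and word[i + 1] == word[i]
        tokens.append((word[i], doubled))
        i += 2 if doubled else 1
    return tokens

def decode_geminates(tokens):
    """Expand (letter, doubled) tokens back into a letter string."""
    return "".join(letter * (2 if doubled else 1) for letter, doubled in tokens)

tokens = encode_geminates("letter")
print(tokens)                    # [('l', False), ('e', False), ('t', True), ('e', False), ('r', False)]
print(decode_geminates(tokens))  # letter

# If the doubling marker attaches to a neighbouring letter, the doubling moves
# while letter identities and order are preserved.
shifted = [("l", False), ("e", True), ("t", False), ("e", False), ("r", False)]
print(decode_geminates(shifted)) # leeter
```

Under this encoding an immediate repeat never has to win the output competition twice in succession, which is the difficulty that doubled letters pose for an unaided CQ mechanism.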

The issue of repeated letters

Ward and Romani (1998) consider a CQ sequencing explanation for the GOB-type errors of the Type B patient BA. They conclude that the approach provides a promising model for this aspect of the deficit, with one reservation concerning spelling of words with repeated letters. A prima facie prediction of the CQ approach is that repeated items in sequences will cause difficulties, as one might expect that the temporary inhibition of items following their production should make the second occurrence of a repeated item more error prone than a nonrepeated letter in the same sequence position. Ward and Romani conclude that words with repeated letters (e.g., fence) should be more error prone than words with no repeats (e.g., lance) on any model of the CQ type. Patient BA, though, shows no difference in spelling performance on this variable.

However, the prediction from any moderately complex CQ model is unlikely to be so straightforward. The most basic form of CQ mechanism with a static activation gradient is unable to repeat items within a sequence, so most implemented models allow the gradient of activations over the “queue” of responses to vary over time. When an item is to be repeated its representation may then receive additional excitation later in the sequence in order to overcome its temporary inhibition (e.g., Houghton, 1990). This is the basic form of the model presented here. When a letter is repeated temporary inhibition is balanced by an increase in excitation, and it is not clear that repeated letters will necessarily be disadvantaged (although doubled letters, that is, immediate repeats, do seem to be difficult for such models to represent unaided). The effect of repeated letters depends on the balance between these two influences.
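The balance between residual inhibition and extra excitation can be illustrated with a toy time-varying gradient. This is a sketch under assumed parameter values, not the trained network described in this paper: the function name, the decay rate, the look-ahead excitation, and the `repeat_boost` parameter are all hypothetical, and only non-adjacent repeats are considered.

```python
def recall_with_inhibition(word, repeat_boost=0.0, inhibit=1.0, decay=0.5, lookahead=0.5):
    """Toy CQ recall over letter-type nodes with a time-varying excitation pattern.

    At each step the target letter receives excitation 1.0 (plus `repeat_boost` if
    it has already occurred) and the upcoming letter receives `lookahead`. The
    winning node is then inhibited, and all inhibition decays on later steps.
    Returns a list of (winner, net activation of winner).
    """
    letters = sorted(set(word))
    inhibition = dict.fromkeys(letters, 0.0)
    seen, trace = set(), []
    for pos, target in enumerate(word):
        excitation = dict.fromkeys(letters, 0.0)
        excitation[target] = 1.0 + (repeat_boost if target in seen else 0.0)
        if pos + 1 < len(word):
            nxt = word[pos + 1]
            excitation[nxt] = max(excitation[nxt], lookahead)
        net = {c: excitation[c] - inhibition[c] for c in letters}
        winner = max(letters, key=net.get)       # competitive selection
        trace.append((winner, net[winner]))
        seen.add(target)
        for c in letters:                        # earlier inhibition fades
            inhibition[c] *= decay
        inhibition[winner] = inhibit             # fresh inhibition of the winner
    return trace

print(recall_with_inhibition("lance")[-1])                     # ('e', 1.0)
print(recall_with_inhibition("fence")[-1])                     # ('e', 0.75)
print(recall_with_inhibition("fence", repeat_boost=0.25)[-1])  # ('e', 1.0)
```

With no compensation the second "e" of "fence" wins with a weaker activation than the "e" of "lance"; a modest extra excitation removes the difference, so whether repeats are disadvantaged depends on parameters rather than on the CQ principle itself.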

The present model has the additional feature that it learns to spell during repeated attempts. It thus has the opportunity to learn to compensate for any additional fragility in repeated letters during the training process. Based on these considerations we did not expect a strong effect of repeated letters on the present model. To test this expectation the models’ responses to high-frequency, high-concreteness words were sorted according to the presence or absence of repeated letters in the target word. The percentage of correct spellings at each word length was averaged to correct for word length effects. The mean results were 60% correct (with repeated letters) versus 55% correct (no repeated letters) for the Type A primary model and 21% correct (repeats) versus 24% correct (no repeats) for the Type B primary model. We conclude that similar performance on both word types in a patient does not present a difficulty for the model.
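The length correction described above can be sketched as follows. This is one reading of the procedure, assuming an unweighted mean of the per-length percentages within each group; the function names are hypothetical, the trial data are invented purely to show the arithmetic, and the sketch counts doubled letters as repeats, which may differ from the analysis actually performed.

```python
from collections import defaultdict

def has_repeat(word):
    """True if any letter occurs more than once in the word."""
    return len(set(word)) < len(word)

def length_corrected_accuracy(trials):
    """trials: iterable of (target, response) pairs.

    Returns percent correct for words with and without repeated letters,
    computed per word length and then averaged across lengths.
    """
    counts = defaultdict(lambda: [0, 0])          # (has_repeat, length) -> [correct, total]
    for target, response in trials:
        cell = counts[(has_repeat(target), len(target))]
        cell[0] += target == response
        cell[1] += 1
    by_group = defaultdict(list)
    for (repeat, _length), (correct, total) in counts.items():
        by_group[repeat].append(100.0 * correct / total)
    return {repeat: sum(pcts) / len(pcts) for repeat, pcts in by_group.items()}

# Invented toy responses, for illustration only.
trials = [("fence", "fence"), ("fence", "fenc"), ("lance", "lance"),
          ("banana", "banana"), ("tulips", "tulip")]
print(length_corrected_accuracy(trials))
```

Averaging within each length first prevents a group that happens to contain more long words from appearing less accurate simply because long words are harder.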

Critical features of the model

The computational model that we have advanced has several components and is moderately complex. Some complexity is necessary in order to produce a self-contained process model in a nontrivial domain. However, the essence of our approach is straightforward, so it is useful to consider which features of the model are critical to our theoretical position and which are present simply in order to provide a concrete realization of that position in a workable model.

To summarize our theoretical position, we consider Type A disorder to be due to disruption of the operation of a sequence generation component working according to the principles of competitive queuing (the disruption being equivalent to increased stochastic variability in the selection of the “winning” letter at each serial position), so that intact word identity information reaches a damaged serial output system. Type B disorder involves damage to an “upstream” activating system so that the word identity information is degraded, while the serial output mechanism itself is undamaged. The similarities between the two disorders arise because in both cases the operation of a CQ output system is disrupted. Some of the differences between the two disorders are due to the different locus of damage to the system (the differential effects of frequency and semantic class arise because damaged areas in Type B, but not Type A, are sensitive to these features), while others are due to differences in the type of errors typical of a CQ system under noise disruption compared with degraded input (the different serial error incidence curves, the occurrence of fragment errors, and the different profile of error types).
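The contrast between the two loci of damage can be caricatured in a few lines. The sketch below is a deliberately crude illustration under stated assumptions: degraded input is modelled as a uniform scaling of letter activations, selection noise is bounded and rectangular, each letter position is a separate token, and all names and parameter values (`INTACT`, `t_stop=0.45`, `strength=0.6`) are hypothetical rather than taken from the simulations.

```python
import random

INTACT = [1.0, 0.9, 0.8, 0.7, 0.6]  # hypothetical activations for a five-letter word

def cq_output(word, activations, t_stop=0.45, selection_noise=0.0, seed=0):
    """Pick the most active unproduced letter repeatedly; stop when none reaches t_stop."""
    rng = random.Random(seed)
    remaining = set(range(len(word)))
    produced = []
    while remaining:
        noisy = {i: activations[i] + rng.uniform(-selection_noise, selection_noise)
                 for i in remaining}
        winner = max(noisy, key=noisy.get)
        if noisy[winner] < t_stop:
            break                      # stopping threshold: the response ends as a fragment
        produced.append(word[winner])
        remaining.discard(winner)      # suppress the letter just produced
    return "".join(produced)

def type_a(word, seed=0):
    """Intact input, noisy letter selection (damage within the sequencing system)."""
    return cq_output(word, INTACT, selection_noise=0.1, seed=seed)

def type_b(word, strength=0.6):
    """Degraded input (activations scaled down), intact selection."""
    return cq_output(word, [a * strength for a in INTACT])

print(type_b("table"))   # tab
print(sorted(type_a("table", seed=s)) == sorted("table") for s in range(3))
```

In this toy version Type A damage can only reorder letters, because zero-mean noise of this size never pushes an intact activation below the stopping threshold, whereas Type B damage truncates the response and preserves its initial letters, in line with the account of fragment errors given here.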

We can identify seven features of the model that are critical with respect to this account:

1. The output of the semantic spelling route is a representation of word identity that is free from semantic content. This is implied by the fact that Type A errors are insensitive to semantic class. Presumably then a level exists in the spelling system where such information is not present, and the proposed location for this representation is at the last point that it could reasonably be expected: at the input to the final sequence generation stage in the shared output path of the system. The model’s explanation for the lack of semantic class sensitivity in Type A disorder depends on this feature.

2. The semantic spelling route, when damaged, produces degraded output sensitive to word frequency and to semantic class. This is necessary to the explanation for the frequency and semantic class sensitivity of Type B errors.

3. The final shared output from the spelling system is a sequence generation system operating according to the principles of competitive queuing.

4. Damage to the CQ output system has the effect of introducing uncertainty in letter selection. These two assumptions are critical to our explanation for the types of error common to the two disorders, for the effects of word length and serial position, and for the differences in these features between the two disorders.

5. The CQ output system is not sensitive to word frequency when damaged. This feature is required to explain lack of frequency sensitivity typical of Type A disorder. It is implied empirically by the existence of patients producing GBD type errors but with no sensitivity to frequency. This feature follows from the type of learning required in a CQ system.

6. The CQ system represents consonant/vowel status of letters (and any other linguistic information subject to observed regularities in the errors of both Type A and Type B dysgraphics, see Sage & Ellis, 2004). The model implies that any regularities in the errors of both types of dysgraphic must arise from information represented in the gradient of activations of the CQ system. The effect of such regularities has been demonstrated in the model by forcing the representation of consonant/vowel information, but any information leading to systematic biasing of letter activations in this part of the model will have similar effects (notably the tendency for letters of the same “class” to participate jointly in errors). Any such information must be present in the word ID field or in representations or connections downstream of that point.

7. The CQ output system stops sequence generation when no letter is activated above a threshold. This mechanism for stopping sequence production has been proposed on independent grounds as the most elegant means of stopping sequence production in a CQ model (e.g., Glasspool, 1998; Houghton, 1990). This is critical to the explanation for fragment errors in Type B but not Type A disorder. The present model also incorporates a response threshold for letter production, which leads to omission rather than stopping on subthreshold letters. This assumption appears less critical to the approach but contributes to the finer level performance, in particular the proportion of deletion errors.

The extension of CQ to a multilayer network offers a number of methodological advantages, although it also suffers from disadvantages (e.g., the long training time required for backpropagation learning algorithms). It also has a number of effects on the fine level performance of the model. However, we do not consider that it is critical to the theoretical position advanced here. There is no fundamental reason why a localist model incorporating the critical features outlined above could not offer essentially the same explanation for Type A and B disorders. The multilayer approach allows the separation of the timing signal from the word identity representation. This is interesting from the point of view of processing requirements and may have implications for physiological plausibility. The approach may also offer a more tractable solution to the problem of integrating semantic and assembled spelling routes at the level of the GOB. However, these features are not critical to the explanation of Type A and B syndromes.

CONCLUSION

Previously deep dysgraphia and graphemic buffer disorder have been seen as two distinct syndromes. The present model supports the conclusion that there is an overlap: Certain critical systems are involved in both. This is consistent with the observation (Cipolotti et al., 2004) of similarities in the error patterns in the two syndromes. It should be noted, however, that the idea of examining in that paper whether deep dysgraphic patients also make graphemic buffer type errors only followed the analysis of the properties of the model.

On the model, what differs between the two types of patient is whether an intact output mechanism operates on degraded input, or a damaged output mechanism operates on intact input. On this view the Type A-like properties that Cipolotti et al. (2004) find in Type B disorder are a consequence of the presence of a CQ sequencing mechanism, which receives input from a damaged system.

There is perhaps a wider implication for computer simulation in neuropsychology. The internal operation of a particular element in a cognitive model may have a profound effect on the predictions following from damage to the model. We would argue that process models of the operation of modules and their interactions are required in addition to abstract reasoning about high-level “box-and-arrow” models of cognitive architecture, or about the type and structure of information that is represented within them.

We have identified a number of limitations of the model in its present form. In particular it does not provide a close quantitative fit to patient data in all areas, and it does not give a full qualitative account for some patients described in the literature. Further simulation work is required to confirm that these cases can be accommodated within the general theoretical framework that we propose. This will involve the development of a more accurate model of the spelling system, which will in turn require a careful investigation of the influence of lexical parameters such as word concreteness and frequency at different points in the system, and it will no doubt also require a better understanding of the mechanisms of error in CQ models. If these issues can be addressed then we believe that the benefits of widening the modelling enterprise to encompass multiple subsystems, and their interaction, will be substantial.

Manuscript received 20 November 2002

Revised manuscript received 19 July 2005

Revised manuscript accepted 20 July 2005

PrEview proof published online December 2005

REFERENCES

Aliminosa, D., McCloskey, M., Goodman-Schulman, R., & Sokol, S. M. (1993). Remediation of acquired dysgraphia as a technique for testing interpretations of deficits. Aphasiology, 7, 55–69.

Bub, D., & Kertesz, A. (1982). Deep agraphia. Brain and Language, 17, 146–165.

Bullinaria, J. A., & Chater, N. (1995). Connectionist modelling: Implications for cognitive neuropsychology. Language and Cognitive Processes, 10, 227–264.

Burgess, N., & Hitch, G. J. (1992). Towards a network model of the articulatory loop. Journal of Memory and Language, 31, 429–460.

Burgess, N., & Hitch, G. J. (1996). A connectionist model of STM for serial order. In S. E. Gathercole (Ed.), Models of short-term memory (pp. 51–72). Hove, UK: Psychology Press.

Caramazza, A., & McCloskey, M. (1991). The poverty of methodology. Behavioural and Brain Sciences, 14, 444–445.

Caramazza, A., & Miceli, G. (1990). The structure of graphemic representations. Cognition, 37, 243–297.

Caramazza, A., Miceli, G., Villa, G., & Romani, C. (1987). The role of the graphemic buffer in spelling: Evidence from a case of acquired dysgraphia. Cognition, 26, 59–85.

Christiansen, M. H. (1997). Improving learning and generalisation in neural networks through the acquisition of multiple related functions. In J. Bullinaria, D. Glasspool, & G. Houghton (Eds.), Connectionist representations. Proceedings of the 4th Neural Computation and Psychology Workshop (pp. 58–70). London: Springer-Verlag.

Cipolotti, L., Bird, C., Glasspool, D., & Shallice, T. S. (2004). The impact of deep dysgraphia on graphemic output buffer disorders. Neurocase, 10(6), 405–419.

Coltheart, M. (1981). The MRC Psycholinguistic Database. Quarterly Journal of Experimental Psychology, 33A, 497–505.


Cooper, R., & Shallice, T. (2000). Contention scheduling and the control of routine activities. Cognitive Neuropsychology, 17, 297–338.

Cubelli, R. (1991). A selective deficit for writing vowels in acquired dysgraphia. Nature, 353, 258–260.

De Partz, M.-P. (1995). Deficit of the graphemic buffer: Effects of a written lexical segmentation strategy. Neuropsychological Rehabilitation, 5, 129–147.

Ellis, A. W. (1982). Spelling and writing (and reading and speaking). In A. W. Ellis (Ed.), Normality and pathology in cognitive function (pp. 251–275). London: Academic Press.

Ellis, A. W. (1984). Reading writing and dyslexia: A cognitive analysis. Hove, UK: Lawrence Erlbaum Associates Ltd.

Glasspool, D. W. (1998). Modelling serial order in behaviour: Studies of spelling. Unpublished Ph.D. thesis, University College London, London.

Glasspool, D. W. (2005). Modelling serial order in behaviour: Evidence from performance slips. In G. Houghton (Ed.), Connectionist models in psychology (pp. 241–270). Hove, UK: Psychology Press.

Glasspool, D. W., & Houghton, G. (1997). Dynamic representation of structural constraints in models of serial behaviour. In J. Bullinaria, D. Glasspool, & G. Houghton (Eds.), Connectionist representations. Proceedings of the 4th Neural Computation and Psychology Workshop (pp. 269–282). London: Springer-Verlag.

Glasspool, D. W., & Houghton, G. (in press). Serial order and consonant-vowel structure in a model of disordered spelling. Brain and Language.

Glasspool, D. W., Houghton, G., & Shallice, T. (1995). Interactions between knowledge sources in a dual-route connectionist model of spelling. In L. S. Smith & P. J. B. Hancock (Eds.), Neural computation and psychology (pp. 209–226). London: Springer-Verlag.

Glasspool, D. W., Shallice, T., & Cipolotti, L. (1999). Neuropsychologically plausible sequence generation. In D. Heinke, G. W. Humphreys, & A. Olson (Eds.), Connectionist models in cognitive neuroscience (pp. 40–51). London: Springer-Verlag.

Hartley, T., & Houghton, G. (1996). A linguistically constrained model of short-term memory for nonwords. Journal of Memory and Language, 35, 1–31.

Hillis, A. E., & Caramazza, A. (1989). The graphemic buffer and attentional mechanisms. Brain and Language, 36, 208–235.

Hillis, A. E., Chang, S., Breese, E., & Heidler, J. (2004). The crucial role of posterior frontal regions in modality specific components of the spelling process. Neurocase, 10, 175–187.

Hillis, A. E., Rapp, B. C., & Caramazza, A. (1999). When a rose is a rose in speech but a tulip in writing. Cortex, 35, 337–356.

Hinton, G. E. (1989). Connectionist learning procedures. Artificial Intelligence, 40, 185–234.

Houghton, G. (1990). The problem of serial order: A neural network model of sequence learning and recall. In R. Dale, C. Mellish, & M. Zock (Eds.), Current research in natural language generation (pp. 287–319). London: Academic Press.

Houghton, G. (1994). Some formal variations on the theme of competitive queueing (Internal Technical Report, UCL-PSY-CQ1). University College London, Department of Psychology.

Houghton, G., Glasspool, D., & Shallice, T. (1994). Spelling and serial recall: Insights from a competitive queueing model. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 365–404). Chichester, UK: John Wiley and Sons.

Houghton, G., & Hartley, T. (1996). Parallel models of serial behaviour: Lashley revisited. PSYCHE, 2(25). Retrieved from http://psyche.cs.monash.edu.au/v2/psyche-2-25-houghton.html

Jonsdottir, M., Shallice, T., & Wise, R. (1996). Language-specific differences in graphemic buffer disorder. Cognition, 59, 169–197.

Jordan, M. (1986). Attractor dynamics and parallelism in a connectionist sequential machine. Proceedings of the 8th Annual Conference of the Cognitive Science Society (pp. 10–17). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Katz, R. (1991). Limited retention of information in the graphemic buffer. Cortex, 27, 111–119.

Kay, J., & Hanley, R. (1994). Peripheral disorders of spelling: The role of the graphemic buffer. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 295–315). Chichester, UK: John Wiley and Sons.

Lashley, K. S. (1951). The problem of serial order in behaviour. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior (pp. 341–392). New York: Wiley.

Lennox, C., & Siegel, L. S. (1994). The role of phonological and orthographic processes in learning to spell. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 93–109). Chichester, UK: John Wiley and Sons.

Margolin, D. I. (1984). The neuropsychology of writing and spelling: Semantic, phonological, motor, and perceptual processes. Quarterly Journal of Experimental Psychology, 36A, 459–489.


McCloskey, M., Badecker, W., Goodman-Shulman, R., & Aliminosa, D. (1994). The structure of graphemic representations in spelling: Evidence from a case of acquired dysgraphia. Cognitive Neuropsychology, 11, 341–392.

McCloskey, M., & Caramazza, A. (1991). On crude data and impoverished theory. Behavioural and Brain Sciences, 14, 453–454.

Miceli, G., Benvegnu, B., Capasso, R., & Caramazza, A. (1995). Selective deficit in processing double letters. Cortex, 31, 161–171.

Miceli, G., Capasso, R., Benvegnu, B., & Caramazza, A. (2004). The categorical distinction of vowel and consonant representations: Evidence from dysgraphia. Neurocase, 10, 109–121.

Miceli, G., Capasso, R., Ivella, A., & Caramazza, A. (1997). Acquired dysgraphia in alphabetic and stenographic handwriting. Cortex, 33, 355–367.

Miceli, G., Silveri, M. C., & Caramazza, A. (1985). Cognitive analysis of a case of pure dysgraphia. Brain and Language, 25, 187–196.

Olson, A., & Caramazza, A. (1994). Representation and connectionist models: The NETspell experience. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 337–363). Chichester, UK: John Wiley and Sons.

Plaut, D. (1995). Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17, 291–321.

Plaut, D. C., & Shallice, T. (1993). Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology, 10, 377–500.

Posteraro, L., Zinelli, P., & Mazzucchi, A. (1988). Selective impairment of the graphemic buffer in acquired dysgraphia: A case study. Brain and Language, 35, 274–286.

Quinlan, P. T. (1993). The Oxford Psycholinguistic Database. Oxford, UK: Oxford University Press.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by back-propagating errors. Nature, 323, 533–536.

Rumelhart, D. E., & Norman, D. A. (1982). Simulating a skilled typist: A study of skilled cognitive-motor performance. Cognitive Science, 6, 1–36.

Sage, K., & Ellis, A. W. (2004). Lexical influences in graphemic buffer disorder. Cognitive Neuropsychology, 21, 381–400.

Schiller, N. O., Greenhall, J. A., Shelton, J. R., & Caramazza, A. (2001). Serial order effects in spelling errors: Evidence from two dysgraphic patients. Neurocase, 7, 1–14.

Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, UK: Cambridge University Press.

Shallice, T., Glasspool, D. W., & Houghton, G. (1995). Can neuropsychological evidence inform connectionist modelling? Analyses of spelling. Language and Cognitive Processes, 10, 195–225.

Tainturier, M.-J., & Caramazza, A. (1996). The status of double letters in graphemic representations. Journal of Memory and Language, 35, 53–73.

Ward, J., & Romani, C. (1998). Serial position effects and lexical activation in spelling: Evidence from a single case study. Neurocase, 4, 189–206.

Wing, A. M., & Baddeley, A. D. (1980). Spelling errors in handwriting: A corpus and a distributional analysis. In U. Frith (Ed.), Cognitive processes in spelling (pp. 251–285). London: Academic Press.

APPENDIX

Formal description of the model

The operation of the network can be separated into two passes. During the forward pass, an activation pattern is applied to the input layer, and activation propagates forward to the output layer. The backward pass operates only during training, when an error signal is propagated back from the output layer to adjust weights in proportion to their contribution to the error. The two subnetworks of the model are trained separately. During training of the semantic network, the semantic input layer is treated as the input layer and the word identity layer is treated as the output. During training of the GOB network, the word identity field and the position field are treated as the input layer and the letter layer is treated as the output. During recall the model is treated as a single network, with the word identity field operating as a hidden layer. The following applies equally to both subnetworks except where otherwise indicated.

Forward pass

Nodes in input fields have their activation levels set to 1.0 or 0.0 according to the current input pattern for that field. The net input $\mathrm{net}_i(t)$ to node $i$ in a hidden or output layer at time step $t$ is given by:

$$\mathrm{net}_i(t) = \sum_{j=1}^{n} A_j(t)\, W_{ji} \tag{1}$$

where $A_j$ is the activation of node $j$ in the previous layer, and $W_{ji}$ is the weight from node $j$ to node $i$. The function $f$ is the logistic function standardly used in backpropagation networks (Rumelhart, Hinton, & Williams, 1986):

$$f(x) = \frac{1}{e^{-x} + 1} \tag{2}$$

As well as receiving input from nodes in the previous layer, each node in hidden and output layers also receives a bias input, which may be thought of as an additional weight from a unit that is permanently set to an activation of 1.0.

The activity $A_i(t)$ of node $i$ in a hidden layer at time step $t$ is given by:

$$A_i(t) = f(\mathrm{net}_i(t)) \tag{3}$$

The C and V nodes also obey Equation 3.
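Equations 1 to 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the list-of-lists weight layout, and the example numbers are assumptions made purely for illustration, and the bias is added to the net input as the text describes.

```python
import math


def logistic(x):
    """Logistic function of Equation 2: f(x) = 1 / (exp(-x) + 1)."""
    return 1.0 / (math.exp(-x) + 1.0)


def net_input(prev_activations, weights_to_node, bias=0.0):
    """Equation 1 plus the bias input described in the text.

    weights_to_node[j] plays the role of W_ji, the weight from node j in the
    previous layer to the receiving node i. The bias acts like a weight from
    a unit whose activation is fixed at 1.0.
    """
    return sum(a * w for a, w in zip(prev_activations, weights_to_node)) + bias


def layer_activations(prev_activations, weight_rows, biases):
    """Equation 3 applied to every node of a hidden layer."""
    return [
        logistic(net_input(prev_activations, row, b))
        for row, b in zip(weight_rows, biases)
    ]


# Hypothetical two-input, two-node layer; the values are illustrative only.
inputs = [1.0, 0.0]                    # input nodes are set to 1.0 or 0.0
weights = [[0.5, -0.3], [-1.0, 2.0]]   # one row of incoming weights per node
biases = [0.0, 1.0]
hidden = layer_activations(inputs, weights, biases)
```

Here the second node receives a net input of 0 (the weight of -1.0 is cancelled by the bias of 1.0), so its activation is exactly 0.5, the midpoint of the logistic function.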

Letter nodes recover slowly from inhibition. The activation $A_i(t)$ of letter node $i$ at time $t$ is given by:

$$A_i(t) = \begin{cases} \mathrm{net}_i(t) + \eta & \text{if } A_i(t-1) \ge 0 \\ \mathrm{net}_i(t) + r A_i(t-1) + \eta & \text{otherwise} \end{cases} \tag{4}$$

where $\eta$ is a noise value drawn from a uniform distribution around 0 (actual values used in simulations are given in the text) and $r$ is a parameter that governs the rate of recovery from inhibition.

During recall the most active letter node is determined by a simulated competitive filter, and the activation level of this letter node is then set to a standard negative (inhibited) activation level, Inh. During learning the letter that should be produced in the current position is inhibited, regardless of which letter actually has the highest activation level (this immediate correction of errors during learning prevents errors from disrupting subsequent correctly learned letters: see Houghton, Glasspool, & Shallice, 1994).
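The recall-time behaviour of the letter nodes can be sketched as follows. This is a simplified illustration under stated assumptions, not the model's code: the competitive filter is reduced to picking the maximum, the response and stopping thresholds are omitted, and `noise_width` is a hypothetical parameter standing in for the noise range given in the main text. The values of `R` and `INH` are taken from Table A1.

```python
import random

INH = -1.0  # standard inhibited activation level, Inh (Table A1)
R = 0.8     # recovery rate r (Table A1)


def letter_activation(net, prev, noise_width=0.0, rng=random):
    """Equation 4: a node that was inhibited on the previous step carries
    over a fraction r of its (negative) activation; otherwise it follows
    its net input. noise_width is an assumed name for the noise range."""
    eta = rng.uniform(-noise_width, noise_width)  # uniform noise around 0
    if prev >= 0.0:
        return net + eta
    return net + R * prev + eta


def recall_step(nets, prev_acts, noise_width=0.0, rng=random):
    """One recall step: update all letter nodes, select the most active one
    (a stand-in for the competitive filter), then inhibit the winner."""
    acts = [
        letter_activation(n, p, noise_width, rng)
        for n, p in zip(nets, prev_acts)
    ]
    winner = max(range(len(acts)), key=acts.__getitem__)
    acts[winner] = INH
    return winner, acts


# Illustrative run with constant net inputs and no noise.
nets = [0.9, 0.7, 0.1]
first, acts = recall_step(nets, [0.0, 0.0, 0.0])
second, acts = recall_step(nets, acts)
```

With these illustrative inputs the first node wins the first step and is inhibited. On the second step its activation is held down by the carried-over inhibition (0.9 + 0.8 × −1.0), so the second node wins. This is the mechanism that stops a just-produced letter from being selected again immediately.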

Backward pass

Hinton (1989) shows that in a network where binary-valued output vectors are desired, and real-valued output vectors may be interpreted as probability distributions over binary vectors (the CQ noisy selection procedure is straightforwardly interpretable in this way), the appropriate error measure to use in a backpropagation training procedure is the cross-entropy, $C$, between the desired and actual probability distributions. The cross-entropy between an actual probability vector $A$, with elements $a_i$, and a desired probability vector $D$, with elements $d_i$, is given by:

$$C = -\sum_i \left[ d_i \log_2(a_i) + (1 - d_i) \log_2(1 - a_i) \right] \tag{5}$$
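As a worked step for logistic output units, the gradient of Equation 5 with respect to a unit's net input can be written out explicitly. The derivation below is a standard calculus exercise and is not quoted from the source. It assumes $a_i = f(\mathrm{net}_i)$ with $f$ as in Equation 2. Because Equation 5 uses base-2 logarithms, a constant factor $1/\ln 2$ appears, which can presumably be absorbed into the learning rate.

```latex
\begin{align*}
\frac{\partial C}{\partial a_i}
  &= -\frac{1}{\ln 2}\left(\frac{d_i}{a_i} - \frac{1 - d_i}{1 - a_i}\right)
   = -\frac{1}{\ln 2}\,\frac{d_i - a_i}{a_i(1 - a_i)}, \\
\frac{\partial a_i}{\partial \mathrm{net}_i}
  &= f'(\mathrm{net}_i) = a_i(1 - a_i), \\
-\frac{\partial C}{\partial \mathrm{net}_i}
  &= \frac{1}{\ln 2}\,(d_i - a_i).
\end{align*}
```

The factor $a_i(1 - a_i)$ from the logistic derivative cancels the denominator of the cross-entropy derivative, leaving a term proportional to the difference between desired and actual output.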

When used in a backpropagation algorithm the derivative of $C$ is multiplied by the derivative of the logistic function, and this product reduces to the difference between the desired and actual outputs (Hinton, 1989). An error value $\delta_u$ is thus generated for each output layer unit:

$$\delta_u = (d_u - a_u) \tag{6}$$

where $d_u$ is the desired activation value for output unit $u$ and $a_u$ is the actual value. The desired values $d_u$ are generated according to a “lazy” learning rule as follows. In determining the desired value for a letter node during spelling we distinguish five cases:

1. The most active letter node corresponds to the target letter for the current position within the word, and its activation exceeds the response threshold $T_r$.

2. The most active letter node corresponds to the target letter, but its activation does not exceed $T_r$.

3. The most active letter node is not the target letter for the current position within the word.

4. The current letter position is past the end of the word, and the most active letter node does not exceed the stopping threshold $T_s$.

5. The current letter position is past the end of the word but the most active letter node exceeds $T_s$.

In Cases 1 and 4 we score the letter position as “correct” for the purposes of monitoring learning. However, weight changes are only made in the “error” cases, Cases 2, 3, and 5. In Cases 2 and 3 the target letter is insufficiently active and must be reinforced: In these cases the desired activation for the target node is set to 1.0. In Cases 3 and 5 the incorrect winning letter is too active; in these cases the desired activation for the winning node is set to 0.0. For all other letter nodes the desired activation is set to the actual activation (i.e., no weight changes are made). A small training margin $m_T$ is added to threshold $T_r$ and subtracted from $T_s$ during training. This means that the model is trained against slightly more stringent thresholds than those that are used in recall, which prevents undue fragility of recall due simply to the model operating very close to its thresholds. An equivalent effect could be achieved over a longer training period if a small amount of random noise were present in activation levels or thresholds during training.
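The five cases of the lazy learning rule can be sketched as a single function that returns the desired activations for the letter nodes at one position. This is an illustrative reading of the rule, not the authors' code: the function name, the use of `None` to mark a position past the end of the word, and the strict `>` used for "exceeds" are assumptions. The default thresholds and margin are the Table A1 values.

```python
def lazy_targets(acts, target, t_r=0.8, t_s=0.6, margin=0.025):
    """Return (desired, correct) for the letter nodes at one position.

    acts   : current activations of the letter nodes
    target : index of the target letter, or None when the position is past
             the end of the word (an assumed convention)
    The training margin is added to T_r and subtracted from T_s.
    """
    desired = list(acts)  # default: desired == actual, so the error is zero
    winner = max(range(len(acts)), key=acts.__getitem__)

    if target is None:
        if acts[winner] > t_s - margin:
            desired[winner] = 0.0      # Case 5: winner too active after end
            return desired, False
        return desired, True           # Case 4: nothing exceeds threshold

    if winner == target:
        if acts[winner] > t_r + margin:
            return desired, True       # Case 1: correct and strong enough
        desired[target] = 1.0          # Case 2: correct but too weak
        return desired, False

    desired[target] = 1.0              # Case 3: reinforce the target ...
    desired[winner] = 0.0              # ... and suppress the wrong winner
    return desired, False


def output_errors(desired, acts):
    """Equation 6: error value for each output unit."""
    return [d - a for d, a in zip(desired, acts)]


# Illustrative Case 3: the second letter wins but the first is the target.
acts = [0.3, 0.5, 0.1]
desired, correct = lazy_targets(acts, target=0)
errors = output_errors(desired, acts)
```

In this example only the target node and the wrongly winning node receive non-zero errors. The third node's desired value equals its actual value, so it contributes no weight change, which is what makes the rule "lazy".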

The C and V nodes are treated more conventionally since they do not participate in the CQ process: The desired activation for these nodes is simply set to 1.0 if the current target letter is a consonant or vowel, respectively, and 0.0 otherwise.

A weight change is calculated for the weights from a hidden layer to an output layer according to:

$$\Delta W_{uh} = \varepsilon\, \delta_u A_h \tag{7}$$

where $\Delta W_{uh}$ is the required change in the weight from hidden unit $h$ to output unit $u$, $A_h$ is the activation of hidden unit $h$, and $\varepsilon$ is a small constant, the “learning rate”.

The error value $\delta_h$ for each hidden unit $h$ is given by:

$$\delta_h = A_h (1 - A_h) \sum_u \delta_u W_{uh} \tag{8}$$

The weight change for the input-to-hidden layer weights is then:

$$\Delta W_{hi} = \varepsilon\, \delta_h A_i \tag{9}$$

where $\Delta W_{hi}$ is the required change in the weight from input unit $i$ to hidden unit $h$, and $A_i$ is the activation of input unit $i$. For each weight $W$ in the network, the weight changes are now applied using:

$$W(t) = W(t-1) + \Delta W(t) + m\, \Delta W(t-1) \tag{10}$$

where $W(t)$ is the new weight value for time step $t$, $W(t-1)$ is the value of the weight at the previous time step, $\Delta W(t)$ is the required weight change for the current time step, $\Delta W(t-1)$ is the required weight change calculated on the previous time step, and $m$ ($0 < m < 1$) is the momentum value. At the start of learning, all weights $W(t)$ are set to random values between ±0.5 and all $\Delta W(t-1)$ are assumed to be 0.
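Equations 7 to 10 can be put together in a short Python sketch. It is a hedged illustration rather than the original implementation: the function names and the nested-list weight layout (`w_out[u][h]` for the weight from hidden unit h to output unit u) are assumptions, and the momentum term uses the previous step's required change exactly as Equation 10 is written. Default learning rate and momentum are the Table A1 values.

```python
import random


def init_weights(n_rows, n_cols, rng=random):
    """Initial weights drawn uniformly from the range -0.5 to +0.5."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)]
            for _ in range(n_rows)]


def hidden_deltas(hidden_acts, out_deltas, w_out):
    """Equation 8: back-propagate output errors through w_out[u][h]."""
    return [
        a_h * (1.0 - a_h)
        * sum(out_deltas[u] * w_out[u][h] for u in range(len(out_deltas)))
        for h, a_h in enumerate(hidden_acts)
    ]


def weight_changes(deltas, source_acts, lr=0.001):
    """Equations 7 and 9: change = learning rate * delta * source activation."""
    return [[lr * d * a for a in source_acts] for d in deltas]


def apply_changes(weights, changes, prev_changes, momentum=0.9):
    """Equation 10, applied in place: add the current change plus momentum
    times the change that was calculated on the previous time step."""
    for row, dw_row, prev_row in zip(weights, changes, prev_changes):
        for k in range(len(row)):
            row[k] += dw_row[k] + momentum * prev_row[k]


# Illustrative single step for a 2-input, 1-hidden, 1-output network.
w_hid = [[0.2, -0.1]]          # w_hid[h][i]
w_out = [[1.0]]                # w_out[u][h]
inputs, hidden_acts = [1.0, 0.0], [0.5]
out_deltas = [0.2]             # d_u - a_u from Equation 6
hid_deltas = hidden_deltas(hidden_acts, out_deltas, w_out)
dw_out = weight_changes(out_deltas, hidden_acts)
dw_hid = weight_changes(hid_deltas, inputs)
apply_changes(w_out, dw_out, [[0.0]])        # previous changes start at 0
apply_changes(w_hid, dw_hid, [[0.0, 0.0]])
```

Note that the hidden-layer deltas are computed before the output weights are updated, so Equation 8 uses the weights that produced the current error.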

Table A1 gives the parameter values used in the simulations.

Table A1. Parameter values used in simulations unless otherwise stated

Parameter                      Symbol          Value
Learning rate                  $\varepsilon$   0.001
Momentum                       $m$             0.9
Letter node recovery rate      $r$             0.8
Letter node inhibition level   Inh             −1.0
Response threshold             $T_r$           0.8
Stopping threshold             $T_s$           0.6
Training margin                $m_T$           0.025
