
Word Recognition Models


Overview of Coltheart's Dual-Route Model and Seidenberg & McClelland's neural network models of word recognition. Course presentation for PSYC365*, Fall 2004, Dr. Butler, Queen's University. Images used without permission.




Lucas Rizoli
Thursday, September 30
PSYC 365*, Fall 2004
Queen's University, Kingston


Human Word Recognition

● Text interpreted as it is perceived
  – Stroop test (Red, Green, Yellow)
  – Aware of results, not of processes
● Likely involves many brain areas
  – Visual
  – Semantic
  – Phonological
  – Articulatory
● How can we model this?


Creating a Word Recognition Model

● Assumptions
  – Working in English
  – Only monosyllabic words
    ● FOX, CAVE, FEIGN...
  – Concerned only with simple word recognition
● Symbols → sounds
● Visual, articulatory systems function independently
● Context of word is irrelevant


Creating a Word Recognition Model

● Rules by which to recognize CAVE
  – C → /k/
  – A → /A/
  – VE → /v/
● Describe grapheme-phoneme correspondences (GPC)
  – Grapheme → phoneme
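To make this concrete, here is a minimal sketch of greedy GPC application in Python. The rule list and function name are illustrative toys, not DR93's actual rule set:

```python
# Minimal, illustrative GPC (grapheme → phoneme) rules.
# Rules are tried in list order, multi-letter graphemes first.
# A real rule set is far larger and includes context-sensitive rules.
GPC_RULES = [
    ("VE", "v"),
    ("C", "k"),
    ("A", "A"),
    ("F", "f"),
    ("X", "ks"),
]

def pronounce_by_rule(word: str) -> str:
    """Apply GPC rules left to right."""
    phonemes = []
    i = 0
    while i < len(word):
        for grapheme, phoneme in GPC_RULES:
            if word.startswith(grapheme, i):
                phonemes.append(phoneme)
                i += len(grapheme)
                break
        else:
            i += 1  # no rule matched; skip this letter
    return "/" + "".join(phonemes) + "/"

print(pronounce_by_rule("CAVE"))  # /kAv/
```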


Creating a Word Recognition Model

● Recognize HAVE
  – H → /h/
  – A → /A/
  – VE → /v/
  – So HAVE → /hAv/?
● Rules result in incorrect pronunciation


Creating a Word Recognition Model

● English is quasi-regular
  – Can be described as systematic, but with exceptions
  – English has a deep orthography
● Grapheme → phoneme rules are inconsistent
  – GAVE, CAVE, SHAVE end with /Av/
  – HAVE ends with /@v/


Creating a Word Recognition Model

● Model needs to recognize irregular words
● Check for irregular words before applying GPCs
  – List irregular words and their pronunciations
    ● HAVE → /h@v/, GONE → /gon/, ...
  – Have a separate look-up process
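A minimal sketch of the resulting two-route arrangement, reusing pronounce_by_rule from the earlier sketch (the irregular-word list here is a toy):

```python
# Toy dual-route pronunciation: check the irregular-word list first,
# otherwise fall back to the GPC rules.
IRREGULAR_WORDS = {
    "HAVE": "/h@v/",
    "GONE": "/gon/",
}

def pronounce(word: str) -> str:
    if word in IRREGULAR_WORDS:        # look-up route
        return IRREGULAR_WORDS[word]
    return pronounce_by_rule(word)     # GPC route (defined earlier)

print(pronounce("CAVE"))  # /kAv/, via GPC rules
print(pronounce("HAVE"))  # /h@v/, via look-up
```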


Our Word Recognition Model

From Visual System
  → Orthographic Input
  → Irregular Words / GPCs (two routes)
  → Phonological Output
  → To Articulatory System


The Dual-Route Model

● Proposed by Max Coltheart in 1978
  – Supported by Pinker, Besner
  – Revised throughout the '80s, '90s, and '00s
● Context-sensitive rules
● Rule-frequency checks
● Lots of other complex stuff
● We'll follow his 1993 model (DR93)


DR93 Examples

[Figure: examples of DR93's context-sensitive GPC rules; in the original image, /a/ should read /@/]


What's Good About DR93

● Regular word pronunciation
  – Goes well with rule-based theories
    ● Berko's Wug test (This is a wug, these are two wug_)
    ● Childhood over-regularization
● Nonword pronunciation
  – NUST, FAIJE, NARF are alright


What's Not Good About DR93

● Irregular word pronunciation
  – GONE → /dOn/, ARE → /Ar/
● GPCs miss subregularities
  – OW → /aW/, from HOW, COW, PLOW
  – SHOW, ROW, KNOW are exceptions
● Biological plausibility
  – Do humans need explicit rules in order to read?


The SM89 Model

● Implemented by Seidenberg and McClelland in 1989
  – Response to the dual-route model
  – Neural network/PDP model
  – "As little as possible of the solution built in"
  – "As much as possible is left to the mechanisms of learning"
● We'll call it SM89


The SM89 Model

From Visual System
  → Orthographic Units (400 units)
  → Hidden Units (200 units)
  → Phonological Units (460 units)
  → To Articulatory System


The SM89 Model

● Orthographic units (400 units) are triples
  – Three characters
  – Letters or word-border
  – CAVE
    ● _CA, CAV, AVE, VE_
  – Context-sensitive
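A small sketch of how such triples can be enumerated (the function is illustrative; in SM89 each of the 400 units responded to many triples rather than one unit per triple):

```python
def letter_triples(word: str) -> list[str]:
    """Overlapping three-character windows of a word, padded with
    '_' as the word-border symbol."""
    padded = "_" + word + "_"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(letter_triples("CAVE"))  # ['_CA', 'CAV', 'AVE', 'VE_']
```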


The SM89 Model

● Hidden units (200 units) needed for a complete neural network
● Encode information in a non-specified way
● Learning occurs by changing weights on connections to and from hidden units
  – Process of back-propagation


The SM89 Model

● Phonological units (460 units) are also triples
  – /kAv/
    ● _kA, kAv, Av_
● Triples are generalized
  – [stop, vowel, fricative]
● Number of units is sufficient for English monosyllables


How SM89 Learns

● Orthographic units artificially stimulated
● Activation spreads to hidden, phonological units
  – Feedforward from ortho. to phono. units
● Model response is the pattern of activation in phonological units


How SM89 Learns

● Difference in activation between the response and the correct activation
● Error computed as the sum of the squared differences for each unit
● Weights of all connections between units adjusted
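In other words, the loss is a sum-squared error, E = Σ (target − output)², and each weight is nudged against its gradient. A toy single-layer sketch (sizes and names are illustrative, not SM89's actual three-layer network):

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random(8)                 # orthographic activation
weights = rng.normal(0.0, 0.1, (8, 4))
targets = rng.random(4)                # correct phonological activation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

outputs = sigmoid(inputs @ weights)
error = np.sum((targets - outputs) ** 2)   # sum of squared differences

# Chain rule: gradient of the error w.r.t. the weights, then one step.
delta = -2.0 * (targets - outputs) * outputs * (1.0 - outputs)
weights -= 0.1 * np.outer(inputs, delta)
```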


How SM89 Learns

● Simply, it learns to pronounce words properly
  – Don't worry about the equations


How SM89 Learns

● Trained using a list of ~3000 English monosyllabic words
  – Includes homographs (WIND, READ) and irregulars
● Each training session is called an epoch
● Words appeared roughly in proportion to their frequency in written language


Practical Limits on SM89's Training

● Activation calculated in a single step
  – Impossible to record how long it took to respond
  – Error scores correlated with latency
    ● Error → time
● Frequency of words was compressed
  – Would've required ~34 times more epochs otherwise
  – Saved computer time
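A common way to compress such a distribution is a logarithmic transform; a sketch under that assumption (the numbers are made up, and SM89's exact scaling isn't reproduced here):

```python
import numpy as np

# Raw corpus frequencies span several orders of magnitude, so raw
# proportional presentation would need vastly more epochs.
freqs = np.array([69971.0, 9816.0, 1580.0, 104.0, 1.0])
compressed = np.log10(freqs + 1.0)
probs = compressed / compressed.sum()   # presentation probabilities
print(probs.round(3))
```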


How SM89 Performed

[Figure: SM89's performance]

[Figure: naming results, human vs. SM89]


What's Good About SM89

● Regular word pronunciation
● Irregular word pronunciation
● Similar results to human studies
  – Word-naming latencies
  – Priming effects
● Behaviour is the result of learning
  – Ability increases in human fashion


What's Not Good About SM89

● Nonword pronunciation
  – Significantly worse than skilled readers
  – JINJE, FAIJE, TUNCE pronounced strangely
● Design was awkward
  – Triples
  – Feedforward network
  – Compressed word frequencies
  – Single-step computation


The SM94 Model

● Seidenberg, Plaut, and McClelland revise SM89 in 1994
  – Response to criticism of SM89's poor nonword performance
● We'll call this model SM94
● Compared humans' nonword responses with the model's responses


The SM94 Model

From Visual System
  → Graphemic Units (108 units)
  → Hidden Units (100 units)
  → Phonological Units (50 units)
  → To Articulatory System


How SM94 Differs From SM89

● Feedback loops for hidden and phonemic units
● Weights adjusted using the cross-entropy method
  – Complicated math, results in better learning
● Not computed in a single step
● No more triples
  – Graphemes for word input
  – Phonemes for word output
  – Input based on syllable structure
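A sketch contrasting the two losses (the function names are ours); cross-entropy punishes confidently wrong units far more than squared error does, which helps learning:

```python
import numpy as np

def sum_squared_error(targets, outputs):
    return np.sum((targets - outputs) ** 2)

def cross_entropy(targets, outputs):
    # For unit activations strictly between 0 and 1.
    return -np.sum(targets * np.log(outputs)
                   + (1.0 - targets) * np.log(1.0 - outputs))

t = np.array([1.0, 0.0])
o = np.array([0.6, 0.4])
print(sum_squared_error(t, o))  # 0.32
print(cross_entropy(t, o))      # ~1.02
```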


Examples of SM94's Units

[Figure: examples of SM94's units]


Nonwords

● May be similar to regular words
  – SMURF ← TURF
● In many cases there are several plausible responses
  – BREAT
    ● ← EAT?
    ● ← GREAT?
    ● ← YEAH?


Nonwords

[Figure: human nonword pronunciations]


How SM94 and DR93 Performed

[Figure: nonword performance; "PDP" is SM94, "Rules" is DR93]


Comparing SM94 and DR93

● Both perform well with the list of ~3000 words
  – SM94 responds 99.7% correctly, DR93 78%
● Both do well with nonwords
  – SM89's weakness caused by design issues
● SM94 avoids such issues
  – Neural networks equally capable with nonwords


Comparing SM94 and DR93

● SM94 is a good performer
  – Regular, irregular words
  – Behaviour similar to humans
    ● Latency effects
    ● Nonword pronunciation
● DR93 still has problems
  – Trouble with irregular words
  – More likely to regularize words


Models and Dyslexia

● Consider specific types of dyslexia
  – Phonological dyslexia
    ● Trouble pronouncing nonwords
  – Surface dyslexia
    ● Trouble with irregular words
  – Developmental dyslexia
    ● Inability to read at an age-appropriate level
● How can word recognition models account for dyslexic behaviour?


DR93 and Dyslexia

● Phonological dyslexia as damage to the GPC route
  – Cannot compile sounds from graphemes
  – Relies on look-up
● Surface dyslexia as damage to the look-up route
  – Cannot remember irregular words
  – Relies on GPCs
● Developmental dyslexia
  – Problems somewhere along either route
    ● Cannot form GPCs, slow look-up, for example


SM89 and Dyslexia

● Developmental dyslexia as damaged or missing hidden units

[Figure: SM89's performance with 200 hidden units vs. 100 hidden units]
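A toy way to simulate missing hidden units is to silence part of the hidden layer; this masking approach is our illustration, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
w_in = rng.normal(0.0, 0.1, (8, 200))    # input → hidden weights
w_out = rng.normal(0.0, 0.1, (200, 4))   # hidden → output weights

def forward(x, n_hidden=200):
    """Run the toy network using only the first n_hidden hidden
    units, simulating damaged or missing units."""
    hidden = np.tanh(x @ w_in)
    hidden[n_hidden:] = 0.0               # lesion the rest
    return np.tanh(hidden @ w_out)

x = rng.random(8)
print(forward(x, 200))  # intact network
print(forward(x, 100))  # half the hidden units removed
```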


The 1996 Models and Dyslexia

● Plaut, McClelland, Seidenberg, and Patterson study networks and dyslexia (1996)
  – Variations of the SM89/SM94 models
    ● Feedforward
    ● Feedforward with actual word frequencies
    ● Feedback with attractors
    ● Feedback with attractors and semantic processes
  – Compare each to case studies of dyslexics


Feedforward and Dyslexia Case Studies

[Figure: damaged feedforward models compared with dyslexia case studies]

Feedback, with Attractors and Semantics, and Dyslexia Case Studies

[Figure: damaged attractor/semantic models compared with dyslexia case studies]


The 1996 Models and Dyslexia

● The most complex damage produced the closest match to the case studies
  – Not as simple as removing hidden units
    ● Severing semantics
    ● Distorting attractors
● Results are encouraging


Questions or Comments