
Word Recognition Models


Overview of Coltheart's Dual-Route Model and Seidenberg & McClelland's neural network models of word recognition. Course presentation for PSYC365*, Fall 2004, Dr. Butler, Queen's University. Images used without permission.




Lucas Rizoli
Thursday, September 30
PSYC 365*, Fall 2004
Queen's University, Kingston


Human Word Recognition

● Text interpreted as it is perceived
  – Stroop test (Red, Green, Yellow)
  – Aware of results, not of processes
● Likely involves many brain areas
  – Visual
  – Semantic
  – Phonological
  – Articulatory
● How can we model this?


Creating a Word Recognition Model

● Assumptions
  – Working in English
  – Only monosyllabic words
    ● FOX, CAVE, FEIGN...
  – Concerned only with simple word recognition
● Symbols → sounds
● Visual, articulatory systems function independently
● Context of word is irrelevant


Creating a Word Recognition Model

● Rules by which to recognize CAVE
  – C → /k/
  – A → /A/
  – VE → /v/
● Describe grapheme-phoneme correspondences (GPC)
  – Grapheme → phoneme
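To make this concrete, here is a minimal sketch of greedy GPC application in Python. The rule list and function name are illustrative toys, not DR93's actual rule set:

```python
# Minimal, illustrative GPC (grapheme → phoneme) rules.
# Rules are tried in list order, multi-letter graphemes first.
# A real rule set is far larger and includes context-sensitive rules.
GPC_RULES = [
    ("VE", "v"),
    ("C", "k"),
    ("A", "A"),
    ("F", "f"),
    ("X", "ks"),
]

def pronounce_by_rule(word: str) -> str:
    """Apply GPC rules left to right."""
    phonemes = []
    i = 0
    while i < len(word):
        for grapheme, phoneme in GPC_RULES:
            if word.startswith(grapheme, i):
                phonemes.append(phoneme)
                i += len(grapheme)
                break
        else:
            i += 1  # no rule matched; skip this letter
    return "/" + "".join(phonemes) + "/"

print(pronounce_by_rule("CAVE"))  # /kAv/
```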


Creating a Word Recognition Model

● Recognize HAVE
  – H → /h/
  – A → /A/
  – VE → /v/
  – So HAVE → /hAv/?
● Rules result in incorrect pronunciation


Creating a Word Recognition Model

● English is quasi-regular
  – Can be described as systematic, but with exceptions
  – English has a deep orthography
● Grapheme → phoneme rules are inconsistent
  – GAVE, CAVE, SHAVE end with /Av/
  – HAVE ends with /@v/


Creating a Word Recognition Model

● Model needs to recognize irregular words
● Check for irregular words before applying GPCs
  – List irregular words and their pronunciations
    ● HAVE → /h@v/, GONE → /gon/, ...
  – Have a separate look-up process
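A minimal sketch of the resulting two-route arrangement, reusing pronounce_by_rule from the earlier sketch (the irregular-word list here is a toy):

```python
# Toy dual-route pronunciation: check the irregular-word list first,
# otherwise fall back to the GPC rules.
IRREGULAR_WORDS = {
    "HAVE": "/h@v/",
    "GONE": "/gon/",
}

def pronounce(word: str) -> str:
    if word in IRREGULAR_WORDS:        # look-up route
        return IRREGULAR_WORDS[word]
    return pronounce_by_rule(word)     # GPC route (defined earlier)

print(pronounce("CAVE"))  # /kAv/, via GPC rules
print(pronounce("HAVE"))  # /h@v/, via look-up
```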


Our Word Recognition Model

From Visual System
  → Orthographic Input
  → Irregular Words / GPCs (two routes)
  → Phonological Output
  → To Articulatory System


The Dual-Route Model

● Proposed by Max Coltheart in 1978
  – Supported by Pinker, Besner
  – Revised throughout the '80s, '90s, and '00s
● Context-sensitive rules
● Rule-frequency checks
● Lots of other complex stuff
● We'll follow his 1993 model (DR93)


DR93 Examples

[Figure: examples of DR93's context-sensitive GPC rules; in the original image, /a/ should read /@/]


What's Good About DR93

● Regular word pronunciation
  – Goes well with rule-based theories
    ● Berko's Wug test (This is a wug, these are two wug_)
    ● Childhood over-regularization
● Nonword pronunciation
  – NUST, FAIJE, NARF are alright


What's Not Good About DR93

● Irregular word pronunciation
  – GONE → /dOn/, ARE → /Ar/
● GPCs miss subregularities
  – OW → /aW/, from HOW, COW, PLOW
  – SHOW, ROW, KNOW are exceptions
● Biological plausibility
  – Do humans need explicit rules in order to read?


The SM89 Model

● Implemented by Seidenberg and McClelland in 1989
  – Response to the dual-route model
  – Neural network/PDP model
  – "As little as possible of the solution built in"
  – "As much as possible is left to the mechanisms of learning"
● We'll call it SM89


The SM89 Model

From Visual System
  → Orthographic Units (400 units)
  → Hidden Units (200 units)
  → Phonological Units (460 units)
  → To Articulatory System


The SM89 Model

● Orthographic units (400 units) are triples
  – Three characters
  – Letters or word-border
  – CAVE
    ● _CA, CAV, AVE, VE_
  – Context-sensitive
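A small sketch of how such triples can be enumerated (the function is illustrative; in SM89 each of the 400 units responded to many triples rather than one unit per triple):

```python
def letter_triples(word: str) -> list[str]:
    """Overlapping three-character windows of a word, padded with
    '_' as the word-border symbol."""
    padded = "_" + word + "_"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

print(letter_triples("CAVE"))  # ['_CA', 'CAV', 'AVE', 'VE_']
```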


The SM89 Model

● Hidden units (200 units) needed for a complete neural network
● Encode information in a non-specified way
● Learning occurs by changing weights on connections to and from hidden units
  – Process of back-propagation


The SM89 Model

● Phonological units (460 units) are also triples
  – /kAv/
    ● _kA, kAv, Av_
● Triples are generalized
  – [stop, vowel, fricative]
● Number of units is sufficient for English monosyllables


How SM89 Learns

● Orthographic units artificially stimulated
● Activation spreads to hidden, phonological units
  – Feedforward from ortho. to phono. units
● Model response is the pattern of activation in phonological units


How SM89 Learns

● Difference in activation between the response and the correct activation
● Error computed as the sum of the squared differences for each unit
● Weights of all connections between units adjusted
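In other words, the loss is a sum-squared error, E = Σ (target − output)², and each weight is nudged against its gradient. A toy single-layer sketch (sizes and names are illustrative, not SM89's actual three-layer network):

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random(8)                 # orthographic activation
weights = rng.normal(0.0, 0.1, (8, 4))
targets = rng.random(4)                # correct phonological activation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

outputs = sigmoid(inputs @ weights)
error = np.sum((targets - outputs) ** 2)   # sum of squared differences

# Chain rule: gradient of the error w.r.t. the weights, then one step.
delta = -2.0 * (targets - outputs) * outputs * (1.0 - outputs)
weights -= 0.1 * np.outer(inputs, delta)
```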


How SM89 Learns

● Simply, it learns to pronounce words properly
  – Don't worry about the equations


How SM89 Learns

● Trained using a list of ~3000 English monosyllabic words
  – Includes homographs (WIND, READ) and irregulars
● Each training session is called an epoch
● Words appeared roughly in proportion to their frequency in written language


Practical Limits on SM89's Training

● Activation calculated in a single step
  – Impossible to record how long it took to respond
  – Error scores correlated with latency
    ● Error → time
● Frequency of words was compressed
  – Would've required ~34 times more epochs otherwise
  – Saved computer time
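A common way to compress such a distribution is a logarithmic transform; a sketch under that assumption (the numbers are made up, and SM89's exact scaling isn't reproduced here):

```python
import numpy as np

# Raw corpus frequencies span several orders of magnitude, so raw
# proportional presentation would need vastly more epochs.
freqs = np.array([69971.0, 9816.0, 1580.0, 104.0, 1.0])
compressed = np.log10(freqs + 1.0)
probs = compressed / compressed.sum()   # presentation probabilities
print(probs.round(3))
```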


How SM89 Performed

[Figure: SM89's performance]

[Figure: naming results, human vs. SM89]


What's Good About SM89

● Regular word pronunciation
● Irregular word pronunciation
● Similar results to human studies
  – Word-naming latencies
  – Priming effects
● Behaviour is the result of learning
  – Ability increases in human fashion


What's Not Good About SM89

● Nonword pronunciation
  – Significantly worse than skilled readers
  – JINJE, FAIJE, TUNCE pronounced strangely
● Design was awkward
  – Triples
  – Feedforward network
  – Compressed word frequencies
  – Single-step computation


The SM94 Model

● Seidenberg, Plaut, and McClelland revise SM89 in 1994
  – Response to criticism of SM89's poor nonword performance
● We'll call this model SM94
● Compared humans' nonword responses with the model's responses


The SM94 Model

From Visual System
  → Graphemic Units (108 units)
  → Hidden Units (100 units)
  → Phonological Units (50 units)
  → To Articulatory System


How SM94 Differs From SM89

● Feedback loops for hidden and phonemic units
● Weights adjusted using the cross-entropy method
  – Complicated math, results in better learning
● Not computed in a single step
● No more triples
  – Graphemes for word input
  – Phonemes for word output
  – Input based on syllable structure
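A sketch contrasting the two losses (the function names are ours); cross-entropy punishes confidently wrong units far more than squared error does, which helps learning:

```python
import numpy as np

def sum_squared_error(targets, outputs):
    return np.sum((targets - outputs) ** 2)

def cross_entropy(targets, outputs):
    # For unit activations strictly between 0 and 1.
    return -np.sum(targets * np.log(outputs)
                   + (1.0 - targets) * np.log(1.0 - outputs))

t = np.array([1.0, 0.0])
o = np.array([0.6, 0.4])
print(sum_squared_error(t, o))  # 0.32
print(cross_entropy(t, o))      # ~1.02
```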


Examples of SM94's Units

[Figure: examples of SM94's units]


Nonwords

● May be similar to regular words
  – SMURF ← TURF
● In many cases there are several plausible responses
  – BREAT
    ● ← EAT?
    ● ← GREAT?
    ● ← YEAH?


Nonwords

[Figure: human nonword pronunciations]


How SM94 and DR93 Performed

[Figure: nonword performance; "PDP" is SM94, "Rules" is DR93]


Comparing SM94 and DR93

● Both perform well with the list of ~3000 words
  – SM94 responds 99.7% correctly, DR93 78%
● Both do well with nonwords
  – SM89's weakness caused by design issues
● SM94 avoids such issues
  – Neural networks equally capable with nonwords


Comparing SM94 and DR93

● SM94 is a good performer
  – Regular, irregular words
  – Behaviour similar to humans
    ● Latency effects
    ● Nonword pronunciation
● DR93 still has problems
  – Trouble with irregular words
  – More likely to regularize words


Models and Dyslexia

● Consider specific types of dyslexia
  – Phonological dyslexia
    ● Trouble pronouncing nonwords
  – Surface dyslexia
    ● Trouble with irregular words
  – Developmental dyslexia
    ● Inability to read at an age-appropriate level
● How can word recognition models account for dyslexic behaviour?


DR93 and Dyslexia

● Phonological dyslexia as damage to the GPC route
  – Cannot compile sounds from graphemes
  – Relies on look-up
● Surface dyslexia as damage to the look-up route
  – Cannot remember irregular words
  – Relies on GPCs
● Developmental dyslexia
  – Problems somewhere along either route
    ● Cannot form GPCs, slow look-up, for example


SM89 and Dyslexia

● Developmental dyslexia as damaged or missing hidden units

[Figure: SM89's performance with 200 hidden units vs. 100 hidden units]
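A toy way to simulate missing hidden units is to silence part of the hidden layer; this masking approach is our illustration, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
w_in = rng.normal(0.0, 0.1, (8, 200))    # input → hidden weights
w_out = rng.normal(0.0, 0.1, (200, 4))   # hidden → output weights

def forward(x, n_hidden=200):
    """Run the toy network using only the first n_hidden hidden
    units, simulating damaged or missing units."""
    hidden = np.tanh(x @ w_in)
    hidden[n_hidden:] = 0.0               # lesion the rest
    return np.tanh(hidden @ w_out)

x = rng.random(8)
print(forward(x, 200))  # intact network
print(forward(x, 100))  # half the hidden units removed
```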


The 1996 Models and Dyslexia

● Plaut, McClelland, Seidenberg, and Patterson study networks and dyslexia (1996)
  – Variations of the SM89/SM94 models
    ● Feedforward
    ● Feedforward with actual word frequencies
    ● Feedback with attractors
    ● Feedback with attractors and semantic processes
  – Compare each to case studies of dyslexics


Feedforward and Dyslexia Case Studies

[Figure: damaged feedforward models compared with dyslexia case studies]

Feedback, with Attractors and Semantics, and Dyslexia Case Studies

[Figure: damaged attractor/semantic models compared with dyslexia case studies]


The 1996 Models and Dyslexia

● The most complex damage produced the closest match to the case studies
  – Not as simple as removing hidden units
    ● Severing semantics
    ● Distorting attractors
● Results are encouraging


Questions or Comments