1st Joint Conference of the EPS and SEPEX, Granada 2010
Monitoring the penalization/advantage of lexical ambiguity in vector model representations
Guillermo Jorge-Botana, Ricardo Olmos & José A. León
Universidad Autónoma de Madrid
Extensional definition of lexical ambiguity
Two criteria for categorizing the phenomenon
Representation of meanings: separate vs. single representation of senses (homonymy, polysemy, metonymy, metaphors…)
Contextual distribution of word occurrences: occurrences across few or many contexts
www.elsemantico.com
Representation of senses
Separate senses (Klein & Murphy, 2001).
A common core with specific parts (Rodd, Gaskell & Marslen-Wilson, 2002; Klepousniotou & Baum, 2006).
Unique representation as in LSA models (Kintsch, 2001; Kintsch & Mangalath, in press).
Contextual distribution of word occurrences
Measured by Contextual Diversity (Adelman, Gordon, Brown & Quesada, 2006; McDonald & Shillcock, 2001)
Polysemy: polysemic words have greater Contextual Diversity than monosemic words.
Since Abstract words have greater Contextual Diversity than Concrete words …
Can abstractness be a kind of lexical ambiguity?
Concept of Context Diversity in Vector-Models
The vectors that represent words in a vector space (as in LSA) can be used to measure indices such as Context Diversity
[Figure: two example word vectors, each shown over contexts C1 to C11]
Measuring the thematic focus, for example with an entropy formula
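As a rough illustration of measuring thematic focus with entropy, the sketch below computes a normalized Shannon entropy over a word's weights across contexts. The slides do not give the exact formula, so the normalization by the log of the number of contexts, the function name `context_entropy` and the two example vectors are assumptions made for illustration only.

```python
import math

def context_entropy(weights):
    """Normalized Shannon entropy (0 to 1) of a word's weights across contexts.

    Assumption: absolute weights are turned into a probability distribution.
    Values near 0 mean a focused word, values near 1 a widely spread word.
    """
    magnitudes = [abs(w) for w in weights]
    total = sum(magnitudes)
    if total == 0 or len(magnitudes) < 2:
        return 0.0
    probs = [m / total for m in magnitudes if m > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(magnitudes))

# Hypothetical occurrence counts over contexts C1..C11 (not data from the study).
focused = [0, 0, 9, 1, 0, 0, 0, 0, 0, 0, 0]
spread = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

print(context_entropy(focused), context_entropy(spread))
```

Under this reading, a word concentrated in few contexts scores low and a word spread evenly over all contexts scores close to 1.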
Current Study
So what can a model such as LSA tell us about some of the empirical effects related to lexical ambiguity?
A model of a unique lexical representation
An exclusively linguistic model.
Current Study
Effects related to lexical ambiguity
1. Associative difficulty
2. Lexical Decision Task
1. Associative difficulty
For polysemic words: difficulty generating meaning without a context (Duffy, Morris, & Rayner, 1988)
Abstract words rarely have semantic relations (they predominantly have associative relations) (e.g. Crutch, 2006; Crutch, Ridha, & Warrington, 2006; Crutch & Warrington, 2005; Warrington & Crutch, 2007; Duñabeitia et al., 2009)
Associative difficulty for LSA
[Diagram: neighbors A to G split between "Main sense" and "The other sense"; panels labelled Polysemy, Monosemy and Monosemy. Annotation: "Penalization: the similarity decreases".]
2. Data with LDT
Polysemy advantage (Hino & Lupker, 1996; Pexman & Lupker, 1999)
Concrete advantage
The response to LDT is task dependent
It is favored by unspecific activation mediated by semantic representations (Piercey & Joordens, 2000)
LDT with LSA
Ambiguous vectors have difficulty with main relations but an advantage with other relations.
This is due to the distributional properties of the vectors.
This advantage with other relations can produce non-specific activation, improving LDT response times.
LDT with LSA
[Diagram: neighbors A to F; panels labelled Polysemy, Polysemy and Monosemy. Annotations: "Advantage with other relations generates non-specific activation" and "Penalization with other words with different senses".]
Hypotheses
A vector model like LSA can simulate the associative difficulty:
In polysemic words
In abstract words
A vector model can simulate the response to LDT in polysemic words.
Since responses to LDT for abstract words are the opposite of those for polysemic words, a linguistic model like LSA cannot simulate the response of abstract words to LDT without introducing another source of activation (for example, mental imagery).
Procedure
Three simulations
1. Introducing an invented word into the semantic space at different points along the polysemy-monosemy continuum.
2. With real polysemic-monosemic words
3. With real abstract-concrete words
The analysis
Extract semantic neighbors and analyze the similarities (cosines) between them and the reference word.
Analyze Context Diversity within the vector with ranges and entropy measures.
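The neighbor extraction step can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the dictionary layout of the space and the three-dimensional toy vectors are all hypothetical, and a real LSA space would have a few hundred dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(space, word, k=30):
    """Return the k words most similar to `word` as (word, cosine) pairs, best first."""
    target = space[word]
    scored = [(other, cosine(target, vec))
              for other, vec in space.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy space, for illustration only.
toy_space = {
    "noray": [1.0, 1.0, 0.0],
    "sport": [1.0, 0.0, 0.0],
    "neighborhood": [0.0, 1.0, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}

neighbors = nearest_neighbors(toy_space, "noray", k=2)
mean_cosine = sum(c for _, c in neighbors) / len(neighbors)
print(neighbors, mean_cosine)
```

With k=30 the mean of the returned cosines corresponds to a "mean of the first 30 semantic neighbors" style of measure, and the full sorted list gives the cosine at each neighbor rank.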
First simulation
LSA is trained with the LEXESP corpus (Sebastián, Cuetos, Carreiras & Martí, 2000)
An artificial nonsense word “noray” is introduced into the space in documents with two possible senses represented in different proportions: the sense of “traditional sport” and the sense of “old neighborhood”
The proportions of the two senses were
30-0 (monosemy)
25-5 (dominant polysemy)
15-15 (equally probable polysemy)
5-25 (dominant polysemy)
0-30 (monosemy)
Results (I)
[Figure: mean by condition, y-axis from 0.3 to 0.6]
Condition   0-30     5-25     10-20    15-15    20-10    25-5     30-0
Mean        0.5918   0.52358  0.44215  0.4206   0.42179  0.47611  0.47638
Mean of the first 30 semantic neighbors
The penalization exists for polysemic conditions
Results (I)
[Figure: cosine (y-axis, 0 to 0.4) by neighbor rank (x-axis, 1 to about 4000), one line per condition: 0-30, 5-25, 10-20, 15-15, 20-10, 25-5, 30-0]
Ranges of the first 4000 semantic neighbors
The pattern changed around neighbor 350 (cosine 0.2).
For the polysemic conditions, the penalization became advantageous from this point.
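One simple way to locate such a change of pattern is to compare two rank-ordered cosine curves and report the first rank at which the polysemic curve lies above the monosemic one. The sketch below assumes this reading. The function name and the two short curves are made up for illustration and are not the study's data.

```python
def crossover_rank(monosemic, polysemic):
    """Return the first 1-based rank where the polysemic cosine exceeds the
    monosemic cosine, or None if that never happens."""
    for rank, (mono, poly) in enumerate(zip(monosemic, polysemic), start=1):
        if poly > mono:
            return rank
    return None

# Hypothetical cosines of the 1st..5th neighbors in each condition.
mono_curve = [0.60, 0.45, 0.30, 0.18, 0.10]
poly_curve = [0.50, 0.40, 0.28, 0.20, 0.15]

print(crossover_rank(mono_curve, poly_curve))
```

On real, noisy curves a smoothed comparison would probably be more robust than the first single crossing.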
Results (I)
[Figure: score (y-axis, -0.1 to 0.1) on each vector dimension (x-axis ticks from 1 to 239) for the conditions 0-30, 15-15 and 30-0]
Scores on the dimensions of the vectors in the three main conditions
The effect of the penalization and the change of pattern may be due to the distributional properties of the vectors.
The 15-15 condition was spread along all dimensions.
Results (I)
1st, 2nd and 3rd position in the saturations of dimensions
The polysemic condition (15-15) often took the middle position (scoring low on all the dimensions).
Second simulation
Same as the first simulation but with real polysemic and monosemic words extracted from other studies (Estévez, 1991; Jorge-Botana, León & Olmos, 2010).
Controlled for frequency, concreteness and imageability.
Results (II)
[Figure: cosine (y-axis, 0 to 0.6) by neighbor rank (x-axis, 1 to about 5000) for monosemic and polysemic words]
Ranges of the first 5000 semantic neighbors
Monosemy had an advantage for the first relations.
The pattern changed around neighbor 550 (cosine 0.2) in favor of polysemic words.
Results (II)
[Figure: entropy (y-axis, 0.30 to 0.80) by group (polysemy, monosemy); panel label: Global weight]
The polysemy condition has less global weight and thus more entropy.
The spread among different contexts for polysemic words is a substantive fact.
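The inverse relation between global weight and entropy can be made concrete with the log-entropy weighting that is commonly used when building LSA spaces. The slides do not say how global weight was computed, so treating it as one minus the normalized entropy of a word over documents is an assumption here, as are the function name and the example counts.

```python
import math

def global_weight(counts):
    """Log-entropy style global weight of a word (assumed formula):
    1 minus the entropy of its distribution over documents, normalized by log(n_docs)."""
    n_docs = len(counts)
    total = sum(counts)
    if n_docs < 2 or total == 0:
        return 1.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_docs)

# Hypothetical occurrence counts of two words over four documents.
focused_word = [4, 0, 0, 0]   # all occurrences in one document
spread_word = [1, 1, 1, 1]    # occurrences spread evenly

print(global_weight(focused_word), global_weight(spread_word))
```

Under this assumption a word confined to one document keeps a weight of 1, while a word spread evenly over all documents drops to about 0, which is the sense in which less global weight means more entropy.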
Third simulation
Same as second simulation but with real abstract and concrete words extracted from other studies (Duñabeitia et al., 2009).
Results (III)
[Figure: cosine (y-axis, 0 to 0.6) by neighbor rank (x-axis, 1 to about 5000) for abstract and concrete words]
Ranges of the first 5000 semantic neighbors
Concrete words obtained an advantage for the first 400 relations.
The pattern changed around neighbor 400 (cosine 0.2) in favor of abstract words.
Results (III)
[Figure: entropy (y-axis, 0.30 to 0.80) by group (concrete, abstract); panel label: Global weight]
The abstract condition had less global weight and thus more entropy.
The spread among different contexts for abstract words is a substantive fact.
Preliminary conclusions
So, what can a model like LSA tell us about the empirical effects in question?
A model of a unique representation
An exclusively linguistic model.
Preliminary conclusions (I)
In polysemy and in abstractness, associative difficulty could be explained by the linguistic distributional properties of a unique representation.
Preliminary conclusions (II)
In polysemy, LDT responses could be explained by the unspecific activation produced by the linguistic distributional properties of a unique representation.
Preliminary conclusions (III)
In the Abstract word condition, LDT responses couldn’t be explained by the linguistic distributional properties. Another source of potential activation is required.
(perceptual representations?)
Preliminary conclusions (IV)
In the abstract word condition:
No additional source other than linguistic is needed to account for associative difficulty.
No differences in cerebral activation between concrete and abstract words in semantic tasks (Pexman et al., 2007).
Another source of activation is needed for LDT responses.
Differences in cerebral activation between concrete and abstract words in LDT (Binder, Westbury, et al., 2005).
Thank you