The Complexity Sources and their Compensation in Language Processing

Philippe Blache
Laboratoire Parole et Langage, CNRS & Aix-Marseille Université
Situation: different kinds of complexity

- System vs. Structural Complexity (Dahl, 2004)

  System complexity:
  - Number of categories in each domain
  - Number of features for each category
  - Grammar size for each domain
  - Lexicon size
  - Average local complexity

  Structural complexity:
  - Number of categories
  - Depth
  - Number of rules used to build the structure
  - Number of words

- Absolute vs. Relative Complexity (Miestamo, 2008)
  - Absolute complexity: theory-oriented
  - Relative complexity: user-dependent

- Our perspective: structural relative complexity

Dahl Ö. (2004) The Growth and Maintenance of Linguistic Complexity, John Benjamins.
Miestamo M. (2008) "Grammatical complexity in a cross-linguistic perspective", in Language Complexity, John Benjamins.
Difficulty Models in Psycholinguistics

- Existing models:
  - Incomplete Dependency Hypothesis
  - Dependency Locality Theory
  - Early Immediate Constituents Principle
  - Activation
- However, they fail at:
  - Describing language in its natural environment
  - Explaining the interaction between sources of information

Gibson E. (1998) "Linguistic complexity: Locality of syntactic dependencies", Cognition, 68.
Gibson E. (2000) "The dependency locality theory: A distance-based theory of linguistic complexity", in Image, Language, Brain, MIT Press.
Hawkins J. (2001) "Why are categories adjacent?", Journal of Linguistics, 37.
Vasishth S. (2003) "Quantifying processing difficulty in human sentence parsing: The role of decay, activation, and similarity-based interference", in Proceedings of the European Cognitive Science Conference 2003.
Language processing in the real world

- New challenges for Linguistics, Psycholinguistics and NLP:
  - Dealing with natural data
  - Language in its context: spoken language, natural interaction
- Issues:
  - Units are not always possible to determine and are difficult to categorize (gradience)
  - Information can be parsimonious
  - Language processing relies on the interaction between domains
Overview

- Hypothesis: difficulty depends on the size of the search space.
  The larger the search space, the greater the difficulty.
- Question: how can the size of the search space be controlled?
- A framework: the Maximize Online Principle (Hawkins, 2004).
  The more properties, the smaller the search space.
- Questions:
  - How can properties be represented?
  - How do properties implement structural complexity?
  - How does structural complexity predict processing difficulty?
Outline

- Basis: a constraint-based representation for syntax
- Cohesion: a model for syntactic complexity
  - How to measure cohesion
  - Experimenting with cohesion in human language processing: an interplay between difficulty and facilitation
Part I
Representing syntactic information
Classical Generative Syntax
Language and grammar
- Derivation
- Language = set of derived strings
- Recursively enumerable

Parsing
- Finding a derivation
- Building a tree
- Consequences:
  - There exists a complete grammar of the language
  - The initial system is a complete grammar (acquisition)
Our Proposal: Property Grammars
- Describing the characteristics of an input (not building a structure)
- Linguistic statements as constraints
- Declarative approach: no specific mechanism, only constraint evaluation
- Basics:
  - Constraints are independent
    - No ranking (contra OT)
    - Separate evaluation (contra GEN)
  - No hierarchical structure (contra PSG)
  - Constraints are at the same level (contra DEP)
What kind of syntactic information?
- Linear precedence
- Mandatory cooccurrence between two categories
- Impossible cooccurrence between two categories
- No repetition of the same category within a construction
- Dependency between two categories
Linearity
Prec(A,B) : (∀x, y)[A(x) ∧ B(y) → y ⊀ x]

- Example: nominal construction

  Det ≺ Adj    Det ≺ N     Det ≺ ProR
  Adj ≺ N      N ≺ ProR    N ≺ Prep

  [Diagram: linearity edges among the words of "the very famous reporter who the senator attacked ..."]
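As a rough illustration of how such a precedence constraint could be checked mechanically, here is a small sketch in Python. The function name and the (word, category) encoding are assumptions for illustration, not part of any Property Grammars implementation.

```python
# Hypothetical sketch: evaluating a linearity constraint Prec(A, B) over a
# sequence of (word, category) pairs.

def prec_satisfied(tagged, a, b):
    """True iff no word of category b precedes a word of category a."""
    positions = {}
    for i, (_, cat) in enumerate(tagged):
        positions.setdefault(cat, []).append(i)
    return all(i < j
               for i in positions.get(a, [])
               for j in positions.get(b, []))

sentence = [("the", "Det"), ("very", "Adv"),
            ("famous", "Adj"), ("reporter", "N")]
print(prec_satisfied(sentence, "Det", "Adj"))  # True: Det precedes Adj
print(prec_satisfied(sentence, "Adj", "Det"))  # False: constraint violated
```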
Requirement (cooccurrence)
Req(A,B) : (∀x, y)[A(x) → B(y)]

- Example:

  V[trans] ⇒ N[obj]    Det ⇒ N[com]         Adj ⇒ N
  ProR ⇒ N             V[ditrans] ⇒ Prep    Prep ⇒ N

- Relations without government:

  (1) a. The most interesting book of the library
      b. *A most interesting book of the library

  Sup ⇒ Det[def]

  [Diagram: requirement edges among the words of "The most interesting book of the library"]
Exclusion (cooccurrence restriction)
Excl(A,B) : (∀x)(¬∃y)[A(x) ∧ B(y)]

- Examples:
  - Nominal construction:
    Pro ⊗ N    N[prop] ⊗ N[com]    N[prop] ⊗ Prep[inf]
  - Relative construction:
    ProR[subj] ⊗ N[subj]
Uniqueness
Uniq(A) : (∀x, y)[A(x) ∧ A(y) → x ≈ y]

- Example (nominal construction):

  Uniq = {Det, ProR, Prep[inf], Adv}

  [Diagram: uniqueness edges in "The book that I read"]
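For a flat list of categories, the three cooccurrence constraints just introduced (Req, Excl, Uniq) reduce to simple membership and counting tests. A minimal sketch, with illustrative names only:

```python
# Hypothetical sketch of the cooccurrence constraints over a list of categories.

def req_satisfied(cats, a, b):
    """Req(A, B): if A occurs, then B must occur."""
    return b in cats or a not in cats

def excl_satisfied(cats, a, b):
    """Excl(A, B): A and B must not cooccur."""
    return not (a in cats and b in cats)

def uniq_satisfied(cats, a):
    """Uniq(A): at most one occurrence of A."""
    return cats.count(a) <= 1

cats = ["Det", "Adj", "N"]                # e.g. "the old book"
print(req_satisfied(cats, "Det", "N"))    # True: the Det has its common noun
print(excl_satisfied(cats, "Pro", "N"))   # True: no pronoun cooccurs with N
print(uniq_satisfied(cats, "Det"))        # True: a single determiner
```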
Dependency: the type hierarchy
  dep
  ├── mod
  ├── comp
  │   ├── subj
  │   ├── obj
  │   ├── iobj
  │   └── xcomp
  ├── aux
  └── conj

  dep: generic dependency relation
  mod: modification (typically adjunction)
  spec: specification (typically Det-N)
  comp: head-complement
  subj: subject
  obj: direct object
  iobj: indirect object
  xcomp: other complements (e.g. N-Prep)
  aux: auxiliary-verb
  conj: conjunction
Dependency
- Example:

  Det ⇝spec N[com]    Adj ⇝mod N    ProR ⇝mod N

  [Diagram: dependency edges (spec, mod, comp) among the words of "The most interesting book of the library"]

- Syntactic role: N[subj] ⇝subj V
- Agreement: Det[agr_i] ⇝spec N[agr_i]; Adj[agr_i] ⇝mod N[agr_i]
Example: the nominal construction
  Det ≺ {Det, Adj, ProR, Prep, N}    Det ⇝spec N
  N ≺ {Prep, ProR}                   Adj ⇝mod N
  Pro ⊗ {Det, Adj, ProR, Prep, N}    ProR ⇝mod N
  N[prop] ⊗ Det                      Prep ⇝mod N
  Uniq = {Pro, Det, N, ProR, Prep}   Det ⇒ N[com]
                                     {Adj, ProR, Prep} ⇒ N
Syntactic Representation: Constraint Graph
[Diagram: constraint graph for "The most interesting book of the library", with linearity (l), requirement (r), uniqueness (u) and dependency (spec, mod, comp) edges]
Constraint violation
Constraint graph → Characterization

- "The very old book":
  [Diagram: constraint graph with linearity (l), dependency (d) and requirement (r) edges]
  P+ = {Det ≺ Adj, Det ≺ N, Adv ≺ Adj, Adj ≺ N, Det ⇝ N, Adj ⇝ N, Adv ⇝ Adj, Det ⇒ N, Adv ⇒ Adj, Adj ⇒ N}
  P− = ∅

- "Very old the book":
  [Diagram: constraint graph with the same edge types]
  P+ = {Det ≺ N, Adv ≺ Adj, Adj ≺ N, Det ⇝ N, Adj ⇝ N, Adv ⇝ Adj, Det ⇒ N, Adv ⇒ Adj, Adj ⇒ N}
  P− = {Det ≺ Adj}
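A characterization of this kind can be sketched as a function splitting the evaluable precedence constraints of a grammar into P+ and P−. The encoding below (category pairs, first-occurrence positions) is an assumption for illustration, not the actual Property Grammars machinery.

```python
# Sketch: characterize an input against a set of linearity constraints.

def characterize(tagged, prec_constraints):
    first = {}
    for i, (_, cat) in enumerate(tagged):
        first.setdefault(cat, i)            # first occurrence of each category
    p_plus, p_minus = [], []
    for a, b in prec_constraints:
        if a in first and b in first:       # the constraint is evaluable
            (p_plus if first[a] < first[b] else p_minus).append((a, b))
    return p_plus, p_minus

grammar = [("Det", "Adj"), ("Det", "N"), ("Adv", "Adj"), ("Adj", "N")]
scrambled = [("very", "Adv"), ("old", "Adj"), ("the", "Det"), ("book", "N")]
p_plus, p_minus = characterize(scrambled, grammar)
print(p_minus)  # [('Det', 'Adj')], as in the "Very old the book" example
```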
Part II
Measuring cohesion
Graph-based measures
- Definition: the degree of a vertex is the number of edges incident to it

- Degree of a category in the grammar:

  [Diagram: grammar constraint graph linking Det, Adj, ProR, Prep, Pro and N with linearity (l), requirement (r) and dependency (d) edges]

  deg[gram](N) = 9
  deg[gram](ProR) = 2
  deg[gram](Adj) = 1

- Degree of a category in the sentence:

  [Diagram: sentence constraint graph for "The old book"]

  deg[sent](N) = 5
  deg[sent](Adj) = 1
  deg[sent](Det) = 1
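Treating a constraint graph as a plain edge list, the degree is just an incidence count. A sketch over a hand-made toy edge list (the edges below are illustrative, not the grammar of the slides):

```python
# Sketch: degree of each category in a constraint graph given as an edge list.
from collections import Counter

def degrees(edges):
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Toy graph for "the old book": one edge per evaluated constraint.
sent_edges = [("Det", "Adj"), ("Det", "N"), ("Adj", "N"),
              ("Det", "N"), ("Adj", "N")]
print(degrees(sent_edges)["N"])  # 4: four edges touch N in this toy graph
```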
Category Completeness
- The completeness level of a category depends on the number of relations in its description
- This measure also depends on the number of relations the category has in the grammar
- Completeness ratio: the more of the grammar's relations for a category are verified in the sentence, the higher its completeness value

  completeness(cat) = deg[sent](cat) / deg[gram](cat)
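Using the degree values reported in the evaluation tables later in the talk (e.g. deg[sent] = 5 and deg[gram] = 34 for "côté"), the ratio can be computed directly. The dictionary encoding and the zero-degree guard are assumptions.

```python
# Sketch: completeness ratio of a category.

def completeness(cat, deg_sent, deg_gram):
    if deg_gram.get(cat, 0) == 0:
        return 0.0                        # no relation for cat in the grammar
    return deg_sent.get(cat, 0) / deg_gram[cat]

deg_gram = {"N": 34, "Adj": 11, "Det": 1}
deg_sent = {"N": 5, "Adj": 3, "Det": 0}
print(round(completeness("N", deg_sent, deg_gram), 2))  # 0.15
```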
Sentence Density
- A measure based on four types of properties: uniqueness, requirement, dependency, linearity
- Density of a construction: the ratio between the evaluated properties and the possible properties:

  density(sent) = |properties(sent)| / |words(sent)|
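With the counts from the evaluation slide (38 evaluated properties over the 21 words of the second example), the density comes out close to the 1.80 reported there. The function is a one-line transcription of the formula.

```python
# Sketch: density = number of evaluated properties per word.

def density(n_properties, n_words):
    return n_properties / n_words

print(round(density(38, 21), 2))  # 1.81
```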
Satisfaction ratio
- All constraints can be violated
- A characterization contains both satisfied and violated constraints
- The "quality" of a construction depends on the ratio of satisfied to violated constraints
- All constraints can be weighted. We note W+ (resp. W−) the sum of the weights of the satisfied (resp. violated) constraints:

  satisfaction(sent) = (W+ − W−) / (W+ + W−)
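A direct transcription of the satisfaction ratio, with a guard for the degenerate case where no constraint is evaluated (an assumption; the slides do not specify that case):

```python
# Sketch: satisfaction ratio from summed constraint weights.

def satisfaction(w_plus, w_minus):
    total = w_plus + w_minus
    return (w_plus - w_minus) / total if total else 0.0

print(satisfaction(10, 0))  # 1.0: all constraints satisfied
print(satisfaction(9, 1))   # 0.8: one violation out of ten equal weights
```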
Cohesion function
Given a sentence S with words w_1, ..., w_|S|:

  cohesion(S) = Σ_{i=1}^{|S|} completeness(w_i) × density(S) × satisfaction(S)

Hypothesis: cohesion is correlated with difficulty.
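Since density(S) and satisfaction(S) do not depend on i, the sum factors out and the whole measure fits in a few lines. The per-word completeness values below are illustrative, not taken from the experiments.

```python
# Sketch: cohesion = (sum of word completeness) * density * satisfaction.

def cohesion(word_completeness, density, satisfaction):
    return sum(word_completeness) * density * satisfaction

comp = [0.0, 0.15, 0.27, 0.06, 0.25]        # illustrative per-word completeness
print(round(cohesion(comp, 1.18, 1.0), 2))  # 0.86 for a fully satisfied sentence
```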
Example 1
  le côté hystérique un peu de enfin c'est normal tu vois elle souffre et machin
  the hysterical side rather of well it's normal you see she suffers and that's it

  [Diagram: constraint graph for the utterance, with linearity (l), requirement (r) and dependency (d) edges]
Example 2

  le chien apparemment connaissait parfaitement – the dog apparently knew perfectly –
  – le coin – – the area –
  et euh quand on est partis – and hmm when we left
  le chien a décidé de nous suivre – the dog decided to follow us

  [Diagram: constraint graphs for the utterance segments, with linearity (l), requirement (r) and dependency (d) edges]
Evaluation
  Cat    Degree-gram
  Det    1
  N      34
  Adj    11
  Adv    17
  Prep   31
  Pro    4
  Conj   0
  Aux    8
  V      7
  Conj   21

  Word        Deg-sent   Deg-gram   Completeness
  le          0          1          0
  côté        5          34         0.15
  hystérique  3          11         0.27
  un          0          1          0
  peu         0          17         0
  de          2          31         0.06
  enfin       0          17         0
  c'          1          4          0.25
  est         3          7          0.43
  normal      2          11         0.18
  tu vois     0          0          0
  elle        1          4          0.25
  souffre     2          7          0.29
  et          2          0          0
  machin      0          17         0
Evaluation
  [Diagrams: constraint graphs for the two example utterances]

  Sentence   Words   Constraints   Completeness   Density   Cohesion
  Sent. 1    15      19            0.13           1.18      0.15
  Sent. 2    21      38            0.17           1.80      0.32
Part III
Difficulty and Compensation: the Case of Idiom Processing
The situation
Idiom: a multiword expression with a figurative meaning distinct from its literal meaning.

Examples
- Decomposable idioms (allow variables): "let the cat out of the bag"
- Non-decomposable idioms (opaque semantics, no variability): "spill the beans", "kick the bucket"

Experimental perspective
- Idioms are read faster
- Idioms are associated with specific brain activities
- Two different models, according to the way idioms are processed
Compositional models (nonlexical models)
Main ideas
- Idiom comprehension uses normal language processing
- Idioms are represented as configurations of lexical items, with no separate representation in the lexicon

The Configurational Hypothesis
Cacciari C. & Tabossi P. (1988) "The comprehension of idioms", Journal of Memory and Language.
- A sufficient portion of an idiomatic expression must be processed literally before the idiom can be identified
- After the "Recognition Point", the rest of the string is not processed literally
The Cohesion Model: interplay between difficulty and facilitation

- Our hypothesis: difficulty can be compensated by cohesion
- Experiment:
  - Idioms have high cohesion values
  - Introducing a difficulty into an idiom (a syntactic violation) is compensated by its cohesion
  - We compare idiomatic vs. non-idiomatic sentences, with and without violation
Experimental design
Material
- Idiom (IDNV): Paul a une idée derrière la tête depuis ce matin
- Idiom with violation (IDV): Paul a une idée derrière le tête depuis ce matin
- Control (CTRNV): Paul a une douleur derrière la nuque depuis ce matin
- Control with violation (CTRV): Paul a une douleur derrière le nuque depuis ce matin

(The idiomatic sentences build on "avoir une idée derrière la tête", roughly "to have something in the back of one's mind"; the controls literally mean "Paul has had a pain behind the neck since this morning". The violation replaces the feminine article "la" with the masculine "le".)
Experimental design
Specific positions to study
- Recognition point (RP): Paul a une idée derrière la tête depuis ce matin
- Modified word, where the violation is introduced (MM): Paul a une idée derrière le tête depuis ce matin
- Detection word, where the violation is detected (MD): Paul a une idée derrière le tête depuis ce matin
Results: recognition point
- Different processing for idioms vs. controls
- More positive amplitude in the P300 and N400 windows for idioms: facilitation
Results: modified word
- More negative N400 for the violated idiom (IDV) than for the non-violated one (IDNV): surprisal at the unexpected (modified) word in idioms
- No significant P600: no repair
Results: detection word
- Small N400 and small P600 for the violated control
- Positive deflection for IDV (relative to IDNV) in the N400+P600 windows: repair
Results
- Idioms are processed differently: more positive amplitude after the RP
- Modifying a word after the RP in an idiom generates an N400: surprisal
- At the position where the syntactic violation is detected:
  - High negativity (difficulty) for control sentences
  - Earlier positivity (P300) for idioms: expectancy confirmation
Violation Compensation
[Diagrams: constraint graphs for the control "Paul a une douleur derrière la nuque depuis ce matin", for the idiom "Paul a une idée derrière la tête depuis ce matin", and for the violated idiom "Paul a une idée derrière le tête depuis ce matin"]
Conclusion
What can be done with constraints
- Describing any input, including ill-formed input
- Measuring structural complexity

Complexity Model: a Cognitive Perspective
- An interplay between difficulty and facilitation
- An interaction between different sources of information
- Complexity depends on the quantity of information available to reduce the search space
- It is necessary to take into account the cognitive matrix as well as the context