The Complexity Sources and their Compensation in Language Processing

Philippe Blache
Laboratoire Parole et Langage, CNRS & Aix-Marseille Université
Situation: different kinds of complexity

- System vs. Structural Complexity (Dahl, 2004)

  System complexity:
  - Number of categories in each domain
  - Number of features for each category
  - Grammar size for each domain
  - Lexicon size
  - Average local complexity

  Structural complexity:
  - Number of categories
  - Depth
  - Number of rules used to build the structure
  - Number of words

- Absolute vs. Relative Complexity (Miestamo, 2008)
  - Absolute complexity: theory-oriented
  - Relative complexity: user-dependent

- Our perspective: structural relative complexity

Dahl Ö. (2004) The Growth and Maintenance of Linguistic Complexity, John Benjamins.
Miestamo M. (2008) "Grammatical complexity in a cross-linguistic perspective", in Language Complexity, John Benjamins.
Difficulty Models in Psycholinguistics

- Existing models:
  - Incomplete Dependency Hypothesis
  - Dependency Locality Theory
  - Early Immediate Constituents Principle
  - Activation
- However, they fail at:
  - Describing language in its natural environment
  - Explaining the interaction between sources of information

Gibson E. (1998) "Linguistic complexity: Locality of syntactic dependencies", Cognition, 68.
Gibson E. (2000) "The dependency locality theory: A distance-based theory of linguistic complexity", in Image, Language, Brain, MIT Press.
Hawkins J. (2001) "Why are categories adjacent?", Journal of Linguistics, 37.
Vasishth S. (2003) "Quantifying processing difficulty in human sentence parsing: The role of decay, activation, and similarity-based interference", in Proceedings of the European Cognitive Science Conference 2003.
Language processing in the real world

- New challenges for Linguistics, Psycholinguistics and NLP:
  - Dealing with natural data
  - Language in its context: spoken language, natural interaction
- Issues:
  - Units are not always possible to determine and are difficult to categorize (gradience)
  - Information can be parsimonious
  - Language processing relies on the interaction between domains
Overview

- Hypothesis: difficulty depends on the size of the search space.
  The larger the search space, the greater the difficulty.
- Question: how can the size of the search space be controlled?
- A framework: the Maximize Online Principle (Hawkins, 2004).
  The more properties, the smaller the search space.
- Questions:
  - How can properties be represented?
  - How do properties implement structural complexity?
  - How does structural complexity predict processing difficulty?
Outline

- Basis: a constraint-based representation for syntax
- Cohesion: a model for syntactic complexity
  - How to measure cohesion
  - Experimenting with cohesion in human language processing: an interplay between difficulty and facilitation
Part I
Representing syntactic information
Classical Generative Syntax
Language and grammar
- Derivation
- Language = set of derived strings
- Recursively enumerable

Parsing
- Finding a derivation
- Building a tree
- Consequences:
  - There exists a complete grammar of the language
  - The initial system is a complete grammar (acquisition)
Our Proposal: Property Grammars
- Describing the characteristics of an input (not building a structure)
- Linguistic statements as constraints
- Declarative approach: no specific mechanism, only constraint evaluation
- Basics:
  - Constraints are independent
    - No ranking (contra OT)
    - Separate evaluation (contra GEN)
  - No hierarchical structure (contra PSG)
  - Constraints are at the same level (contra DEP)
What kind of syntactic information?
- Linear precedence
- Mandatory cooccurrence between two categories
- Impossible cooccurrence between two categories
- No repetition of the same category within a construction
- Dependency between two categories
Linearity
Prec(A,B) : (∀x, y)[A(x) ∧ B(y) → y ⊀ x]

- Example: nominal construction

  Det ≺ Adj    Det ≺ N     Det ≺ ProR
  Adj ≺ N      N ≺ ProR    N ≺ Prep

  [Diagram: linearity edges among the words of "the very famous reporter who the senator attacked ..."]
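As a rough illustration of how such a precedence constraint could be checked mechanically, here is a small sketch in Python. The function name and the (word, category) encoding are assumptions for illustration, not part of any Property Grammars implementation.

```python
# Hypothetical sketch: evaluating a linearity constraint Prec(A, B) over a
# sequence of (word, category) pairs.

def prec_satisfied(tagged, a, b):
    """True iff no word of category b precedes a word of category a."""
    positions = {}
    for i, (_, cat) in enumerate(tagged):
        positions.setdefault(cat, []).append(i)
    return all(i < j
               for i in positions.get(a, [])
               for j in positions.get(b, []))

sentence = [("the", "Det"), ("very", "Adv"),
            ("famous", "Adj"), ("reporter", "N")]
print(prec_satisfied(sentence, "Det", "Adj"))  # True: Det precedes Adj
print(prec_satisfied(sentence, "Adj", "Det"))  # False: constraint violated
```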
Requirement (cooccurrence)
Req(A,B) : (∀x, y)[A(x) → B(y)]

- Example:

  V[trans] ⇒ N[obj]    Det ⇒ N[com]         Adj ⇒ N
  ProR ⇒ N             V[ditrans] ⇒ Prep    Prep ⇒ N

- Relations without government:

  (1) a. The most interesting book of the library
      b. *A most interesting book of the library

  Sup ⇒ Det[def]

  [Diagram: requirement edges among the words of "The most interesting book of the library"]
Exclusion (cooccurrence restriction)
Excl(A,B) : (∀x)(¬∃y)[A(x) ∧ B(y)]

- Examples:
  - Nominal construction:
    Pro ⊗ N    N[prop] ⊗ N[com]    N[prop] ⊗ Prep[inf]
  - Relative construction:
    ProR[subj] ⊗ N[subj]
Uniqueness
Uniq(A) : (∀x, y)[A(x) ∧ A(y) → x ≈ y]

- Example (nominal construction):

  Uniq = {Det, ProR, Prep[inf], Adv}

  [Diagram: uniqueness edges in "The book that I read"]
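For a flat list of categories, the three cooccurrence constraints just introduced (Req, Excl, Uniq) reduce to simple membership and counting tests. A minimal sketch, with illustrative names only:

```python
# Hypothetical sketch of the cooccurrence constraints over a list of categories.

def req_satisfied(cats, a, b):
    """Req(A, B): if A occurs, then B must occur."""
    return b in cats or a not in cats

def excl_satisfied(cats, a, b):
    """Excl(A, B): A and B must not cooccur."""
    return not (a in cats and b in cats)

def uniq_satisfied(cats, a):
    """Uniq(A): at most one occurrence of A."""
    return cats.count(a) <= 1

cats = ["Det", "Adj", "N"]                # e.g. "the old book"
print(req_satisfied(cats, "Det", "N"))    # True: the Det has its common noun
print(excl_satisfied(cats, "Pro", "N"))   # True: no pronoun cooccurs with N
print(uniq_satisfied(cats, "Det"))        # True: a single determiner
```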
Dependency: the type hierarchy
  dep
  ├── mod
  ├── comp
  │   ├── subj
  │   ├── obj
  │   ├── iobj
  │   └── xcomp
  ├── aux
  └── conj

  dep: generic dependency relation
  mod: modification (typically adjunction)
  spec: specification (typically Det-N)
  comp: head-complement
  subj: subject
  obj: direct object
  iobj: indirect object
  xcomp: other complements (e.g. N-Prep)
  aux: auxiliary-verb
  conj: conjunction
Dependency
- Example:

  Det ⇝spec N[com]    Adj ⇝mod N    ProR ⇝mod N

  [Diagram: dependency edges (spec, mod, comp) among the words of "The most interesting book of the library"]

- Syntactic role: N[subj] ⇝subj V
- Agreement: Det[agr_i] ⇝spec N[agr_i]; Adj[agr_i] ⇝mod N[agr_i]
Example: the nominal construction
  Det ≺ {Det, Adj, ProR, Prep, N}    Det ⇝spec N
  N ≺ {Prep, ProR}                   Adj ⇝mod N
  Pro ⊗ {Det, Adj, ProR, Prep, N}    ProR ⇝mod N
  N[prop] ⊗ Det                      Prep ⇝mod N
  Uniq = {Pro, Det, N, ProR, Prep}   Det ⇒ N[com]
                                     {Adj, ProR, Prep} ⇒ N
Syntactic Representation: Constraint Graph
[Diagram: constraint graph for "The most interesting book of the library", with linearity (l), requirement (r), uniqueness (u) and dependency (spec, mod, comp) edges]
Constraint violation
Constraint graph → Characterization

- "The very old book":
  [Diagram: constraint graph with linearity (l), dependency (d) and requirement (r) edges]
  P+ = {Det ≺ Adj, Det ≺ N, Adv ≺ Adj, Adj ≺ N, Det ⇝ N, Adj ⇝ N, Adv ⇝ Adj, Det ⇒ N, Adv ⇒ Adj, Adj ⇒ N}
  P− = ∅

- "Very old the book":
  [Diagram: constraint graph with the same edge types]
  P+ = {Det ≺ N, Adv ≺ Adj, Adj ≺ N, Det ⇝ N, Adj ⇝ N, Adv ⇝ Adj, Det ⇒ N, Adv ⇒ Adj, Adj ⇒ N}
  P− = {Det ≺ Adj}
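A characterization of this kind can be sketched as a function splitting the evaluable precedence constraints of a grammar into P+ and P−. The encoding below (category pairs, first-occurrence positions) is an assumption for illustration, not the actual Property Grammars machinery.

```python
# Sketch: characterize an input against a set of linearity constraints.

def characterize(tagged, prec_constraints):
    first = {}
    for i, (_, cat) in enumerate(tagged):
        first.setdefault(cat, i)            # first occurrence of each category
    p_plus, p_minus = [], []
    for a, b in prec_constraints:
        if a in first and b in first:       # the constraint is evaluable
            (p_plus if first[a] < first[b] else p_minus).append((a, b))
    return p_plus, p_minus

grammar = [("Det", "Adj"), ("Det", "N"), ("Adv", "Adj"), ("Adj", "N")]
scrambled = [("very", "Adv"), ("old", "Adj"), ("the", "Det"), ("book", "N")]
p_plus, p_minus = characterize(scrambled, grammar)
print(p_minus)  # [('Det', 'Adj')], as in the "Very old the book" example
```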
Part II
Measuring cohesion
Graph-based measures
- Definition: the degree of a vertex is the number of edges incident to it

- Degree of a category in the grammar:

  [Diagram: grammar constraint graph linking Det, Adj, ProR, Prep, Pro and N with linearity (l), requirement (r) and dependency (d) edges]

  deg[gram](N) = 9
  deg[gram](ProR) = 2
  deg[gram](Adj) = 1

- Degree of a category in the sentence:

  [Diagram: sentence constraint graph for "The old book"]

  deg[sent](N) = 5
  deg[sent](Adj) = 1
  deg[sent](Det) = 1
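Treating a constraint graph as a plain edge list, the degree is just an incidence count. A sketch over a hand-made toy edge list (the edges below are illustrative, not the grammar of the slides):

```python
# Sketch: degree of each category in a constraint graph given as an edge list.
from collections import Counter

def degrees(edges):
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Toy graph for "the old book": one edge per evaluated constraint.
sent_edges = [("Det", "Adj"), ("Det", "N"), ("Adj", "N"),
              ("Det", "N"), ("Adj", "N")]
print(degrees(sent_edges)["N"])  # 4: four edges touch N in this toy graph
```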
Category Completeness
- The completeness level of a category depends on the number of relations in its description
- This measure also depends on the number of relations the category has in the grammar
- Completeness ratio: the more of the grammar's relations for a category are verified in the sentence, the higher its completeness value

  completeness(cat) = deg[sent](cat) / deg[gram](cat)
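Using the degree values reported in the evaluation tables later in the talk (e.g. deg[sent] = 5 and deg[gram] = 34 for "côté"), the ratio can be computed directly. The dictionary encoding and the zero-degree guard are assumptions.

```python
# Sketch: completeness ratio of a category.

def completeness(cat, deg_sent, deg_gram):
    if deg_gram.get(cat, 0) == 0:
        return 0.0                        # no relation for cat in the grammar
    return deg_sent.get(cat, 0) / deg_gram[cat]

deg_gram = {"N": 34, "Adj": 11, "Det": 1}
deg_sent = {"N": 5, "Adj": 3, "Det": 0}
print(round(completeness("N", deg_sent, deg_gram), 2))  # 0.15
```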
Sentence Density
- A measure based on four types of properties: uniqueness, requirement, dependency, linearity
- Density of a construction: the ratio between the evaluated properties and the possible properties:

  density(sent) = |properties(sent)| / |words(sent)|
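With the counts from the evaluation slide (38 evaluated properties over the 21 words of the second example), the density comes out close to the 1.80 reported there. The function is a one-line transcription of the formula.

```python
# Sketch: density = number of evaluated properties per word.

def density(n_properties, n_words):
    return n_properties / n_words

print(round(density(38, 21), 2))  # 1.81
```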
Satisfaction ratio
- All constraints can be violated
- A characterization contains both satisfied and violated constraints
- The "quality" of a construction depends on the ratio of satisfied to violated constraints
- All constraints can be weighted. We note W+ (resp. W−) the sum of the weights of the satisfied (resp. violated) constraints:

  satisfaction(sent) = (W+ − W−) / (W+ + W−)
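A direct transcription of the satisfaction ratio, with a guard for the degenerate case where no constraint is evaluated (an assumption; the slides do not specify that case):

```python
# Sketch: satisfaction ratio from summed constraint weights.

def satisfaction(w_plus, w_minus):
    total = w_plus + w_minus
    return (w_plus - w_minus) / total if total else 0.0

print(satisfaction(10, 0))  # 1.0: all constraints satisfied
print(satisfaction(9, 1))   # 0.8: one violation out of ten equal weights
```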
Cohesion function
Given a sentence S with words w_1, ..., w_|S|:

  cohesion(S) = Σ_{i=1}^{|S|} completeness(w_i) × density(S) × satisfaction(S)

Hypothesis: cohesion is correlated with difficulty.
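Since density(S) and satisfaction(S) do not depend on i, the sum factors out and the whole measure fits in a few lines. The per-word completeness values below are illustrative, not taken from the experiments.

```python
# Sketch: cohesion = (sum of word completeness) * density * satisfaction.

def cohesion(word_completeness, density, satisfaction):
    return sum(word_completeness) * density * satisfaction

comp = [0.0, 0.15, 0.27, 0.06, 0.25]        # illustrative per-word completeness
print(round(cohesion(comp, 1.18, 1.0), 2))  # 0.86 for a fully satisfied sentence
```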
Example 1
  le côté hystérique un peu de enfin c'est normal tu vois elle souffre et machin
  the hysterical side rather of well it's normal you see she suffers and that's it

  [Diagram: constraint graph for the utterance, with linearity (l), requirement (r) and dependency (d) edges]
Example 2

  le chien apparemment connaissait parfaitement – the dog apparently knew perfectly –
  – le coin – – the area –
  et euh quand on est partis – and hmm when we left
  le chien a décidé de nous suivre – the dog decided to follow us

  [Diagram: constraint graphs for the utterance segments, with linearity (l), requirement (r) and dependency (d) edges]
Evaluation
  Cat    Degree-gram
  Det    1
  N      34
  Adj    11
  Adv    17
  Prep   31
  Pro    4
  Conj   0
  Aux    8
  V      7
  Conj   21

  Word        Deg-sent   Deg-gram   Completeness
  le          0          1          0
  côté        5          34         0.15
  hystérique  3          11         0.27
  un          0          1          0
  peu         0          17         0
  de          2          31         0.06
  enfin       0          17         0
  c'          1          4          0.25
  est         3          7          0.43
  normal      2          11         0.18
  tu vois     0          0          0
  elle        1          4          0.25
  souffre     2          7          0.29
  et          2          0          0
  machin      0          17         0
Evaluation
  [Diagrams: constraint graphs for the two example utterances]

  Sentence   Words   Constraints   Completeness   Density   Cohesion
  Sent. 1    15      19            0.13           1.18      0.15
  Sent. 2    21      38            0.17           1.80      0.32
Part III
Difficulty and Compensation: the Case of Idiom Processing
The situation
Idiom: a multiword expression with a figurative meaning distinct from its literal meaning.

Examples
- Decomposable idioms (allow variables): "let the cat out of the bag"
- Non-decomposable idioms (opaque semantics, no variability): "spill the beans", "kick the bucket"

Experimental perspective
- Idioms are read faster
- Idioms are associated with specific brain activities
- Two different models, according to the way idioms are processed
Compositional models (nonlexical models)
Main ideas
- Idiom comprehension uses normal language processing
- Idioms are represented as configurations of lexical items, with no separate representation in the lexicon

The Configurational Hypothesis
Cacciari C. & Tabossi P. (1988) "The comprehension of idioms", Journal of Memory and Language.
- A sufficient portion of an idiomatic expression must be processed literally before the idiom can be identified
- After the "Recognition Point", the rest of the string is not processed literally
The Cohesion Model: interplay between difficulty and facilitation

- Our hypothesis: difficulty can be compensated by cohesion
- Experiment:
  - Idioms have high cohesion values
  - Introducing a difficulty into an idiom (a syntactic violation) is compensated by its cohesion
  - We compare idiomatic vs. non-idiomatic sentences, with and without violation
Experimental design
Material
- Idiom (IDNV): Paul a une idée derrière la tête depuis ce matin
- Idiom with violation (IDV): Paul a une idée derrière le tête depuis ce matin
- Control (CTRNV): Paul a une douleur derrière la nuque depuis ce matin
- Control with violation (CTRV): Paul a une douleur derrière le nuque depuis ce matin

(The idiomatic sentences build on "avoir une idée derrière la tête", roughly "to have something in the back of one's mind"; the controls literally mean "Paul has had a pain behind the neck since this morning". The violation replaces the feminine article "la" with the masculine "le".)
Experimental design
Specific positions to study
- Recognition point (RP): Paul a une idée derrière la tête depuis ce matin
- Modified word, where the violation is introduced (MM): Paul a une idée derrière le tête depuis ce matin
- Detection word, where the violation is detected (MD): Paul a une idée derrière le tête depuis ce matin
Results: recognition point
- Different processing for idioms vs. controls
- More positive amplitude in the P300 and N400 windows for idioms: facilitation
Results: modified word
- More negative N400 for the violated idiom (IDV) than for the non-violated one (IDNV): surprisal at the unexpected (modified) word in idioms
- No significant P600: no repair
Results: detection word
- Small N400 and small P600 for the violated control
- Positive deflection for IDV (relative to IDNV) in the N400+P600 windows: repair
Results
- Idioms are processed differently: more positive amplitude after the RP
- Modifying a word after the RP in an idiom generates an N400: surprisal
- At the position where the syntactic violation is detected:
  - High negativity (difficulty) for control sentences
  - Earlier positivity (P300) for idioms: expectancy confirmation
Violation Compensation
[Diagrams: constraint graphs for the control "Paul a une douleur derrière la nuque depuis ce matin", for the idiom "Paul a une idée derrière la tête depuis ce matin", and for the violated idiom "Paul a une idée derrière le tête depuis ce matin"]
Conclusion
What can be done with constraints
- Describing any input, including ill-formed input
- Measuring structural complexity

Complexity Model: a Cognitive Perspective
- An interplay between difficulty and facilitation
- An interaction between different sources of information
- Complexity depends on the quantity of information available to reduce the search space
- It is necessary to take into account the cognitive matrix as well as the context