
Representing Anaphora with Dependent Types

Daisuke Bekki 1,2,3 *

1 Ochanomizu University, Graduate School of Humanities and Sciences **

2 National Institute of Informatics ***

3 CREST, Japan Science and Technology Agency †

Abstract. Discourse semantics based on dependent type theory, such as Ranta’s Type Theoretical Grammar, is expected to serve as a proof-theoretic alternative to standard, model-theoretic discourse semantics such as DRT and DPL. Its compositionality, however, with respect to anaphora and presupposition, has been left as an open problem, toward which several different approaches have been proposed. In this paper, I will point out that four problems still remain to be solved in the previous approaches, and present a compositional discourse theory that remedies this enterprise, by the combination of the following settings: 1) the context-passing mechanism, 2) @-operators for representing anaphora/presupposition triggers, 3) (bottom-up) semantic composition with raw terms, and 4) (top-down) anaphora resolution as type checking.

1 Introduction

One of the difficulties that has motivated and driven the school of dynamic, model-theoretic semantics over the past 30 years (DRT in Kamp [13], DPL in Groenendijk and Stokhof [10], and their successors) lies in the tension between dynamics and compositionality. In other words, we have sought for a semantic theory in which sentences with inter- and intra-sentential anaphoric/presuppositional links are given well-formed representations that keep their structures parallel to their syntactic derivations.

Since Sundholm [24] discovered that this is feasible with dependent type theories, such as constructive/Martin-Löf Type Theory in Martin-Löf [17] (henceforth MLTT), much of the subsequent research investigated the horizon of Sundholmian semantics and established a school of constructive, proof-theoretic semantics. Among others, Ranta [21]’s Type Theoretical Grammar (TTG) is notable for its broad empirical coverage. TTG also provoked many insightful discussions, including the proof-theoretic explanation of anaphora (in)accessibility, the data of which had been accumulated during the 1970s (as found in Karttunen [14]).

* My sincere thanks to Koji Mineshima, Pascual Martínez Gómez, Ribeka Tanaka, Shunsuke Yatabe, Eric McCready, Kohei Kishida, and Stefan Kaufmann for many helpful discussions. I also thank the anonymous reviewers for their insightful comments. This research is partially supported by JST, CREST.

** 2-1-1 Ohtsuka, Bunkyo-ku, Tokyo 112-8610, Japan.

*** 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, 101-8430, Japan.

† 4-1-8 Honcho, Kawaguchi, Saitama, 332-0012, Japan

(T2)
  A            (X)(Y)Σ(X, (x)Y(x)) : (X : set)((X)prop)prop
  man          man : set
  ----------------------------------------------------------------
  A man        (Z)((X)(Y)Σ(X, (x)Y(x)))(man, (z)app(man, Z, z)) : ((man)prop)prop
               =β (Z)Σ(man, (x)app(man, Z, x))
               =β (Z)Σ(man, (x)Z(x))

  where app(X, Y, V) = app(A, Y, π1(V))   if X has the form Σ(A, B)
                     = Y(V) : prop         otherwise

(T4)
  A man        (Z)Σ(man, (x)Z(x)) : ((man)prop)prop
  entered      (x)entered(x) : (X)prop [X : set]
  ----------------------------------------------------------------
  A man entered    ((Z)Σ(man, (x)Z(x)))((x)entered(x)) : prop [X = man]
                   =β Σ(man, (x)entered(x))

(T4)
  He           (Y)Y(r0) : ((X : set)prop)prop [r0 : X]
  whistled     (x)whistled(x) : (X)prop [X : set]
  ----------------------------------------------------------------
  He whistled  ((Y)Y(r0))((x)whistled(x)) : prop [X = X, X : set, r0 : X]
               =β whistled(r0)

(T19)
  A man entered    Σ(man, (x)entered(x)) : prop
  He whistled      whistled(r0) : prop [X : set, r0 : X]
  ----------------------------------------------------------------
  whistled(r0) : prop [x0 : Σ(man, (x)entered(x)), X : set, r0 : X]
  −→ whistled(π1(x0)) : prop [x0 : Σ(man, (x)entered(x))]
  (by a substitution [X = man, r0 = π1(x0)])

Fig. 1. A semantic composition of (1a) in Davila-Perez [4]

However, TTG is a sugaring-oriented theory (i.e., a theory for sentence generation). In terms of a semantic theory, this is one-sided: a proof-theoretic semantics theory ought to tell, for a given sentence, what should be known to utter it and what can be deduced from its utterance (cf. Prawitz [20]). TTG offers only the former. The latter, that is, a parsing-oriented theory, has been in demand, toward the establishment of which two different approaches have developed.

Davila-Perez [4] is one of few attempts to provide a parsing-oriented semantic theory⁴ in MLTT. Davila-Perez’s theory consists of semantic composition rules, defined in the style of Montague Grammar (namely, one rule for each syntactic configuration).

Krahmer and Piwek [15] and Piwek and Krahmer [19] (henceforth, K&P) presented an algorithm for anaphora resolution, presupposition binding, and accommodation. For the syntactic component, K&P adopts a set of translation rules from DRS to Sundholmian representations, which are proposed in Ahn and Kolb [1].

4 Ranta [21] also presented a parsing-oriented theory based on MLTT in chapters 8 and 9, but I focus only on Davila-Perez’s theory for the sake of space.


2 Previous Approaches

2.1 Davila-Perez [4]

In Davila-Perez’s theory, an anaphora (or a presupposition trigger) is represented as a variable in context, and anaphora resolution/presupposition binding is a process to substitute the variables with proof terms. In Davila-Perez’s theory, a sentence with E-type anaphora as (1a) (Evans [5], Groenendijk and Stokhof [10]) is derived as shown in Fig. 1, and the resulting representation is (1b).⁵

(1) a. A man entered. He whistled.
    b. whistled(r0) : prop [x0 : Σ(man, (x)entered(x)), X : set, r0 : X]

where variables x0, X, r0 remain in the context. Then, anaphora resolution takes place by a consistent substitution of variables; [X = man, r0 = π1(x0)] is one such substitution, yielding the following representation with no unresolved anaphora.

(2) whistled(π1(x0)) : prop [x0 : Σ(man, (x)entered(x))]

“Lack of Asymmetry” Problem. However, Davila-Perez’s theory suffers from two problems. The first problem arises from the inability to theoretically distinguish between unresolved anaphora (or presupposition triggers) and their antecedents. Both are uniformly represented as variables in contexts in MLTT, although the former are called reference markers and distinguished from the latter at the implementation level.

Therefore, Davila-Perez’s theory does not prevent us from substituting a variable in a given context that is not supposed to be a reference marker, as the following simple discourse exemplifies.

(3) A man was playing jazz. Then someone made a noise, and he made a furious exit.

The anaphora “he” in the second sentence is to be resolved under the following context:

[x0 : Σ(man, (x)play(x, j)), x1 : Σ(man, (x)make(x, noise)), X : set, r0 : X].

The intended anaphoric link is established with a substitution [X = man, r0 = π1(x0)]. However, we may instead substitute x1 with (π1(x0), k(π1(x0))(π2(x0))), assuming that we share a world knowledge k that for any man, playing jazz makes a noise (which is quite sensible):

k : Π(man, (x)(play(x, j) ⊃ make(x, noise))).

This is a legitimate operation in Davila-Perez’s theory but empirically inaccurate: it interprets an existential quantifier as anaphoric.

In short, the setting of Davila-Perez [4] lacks asymmetry between context variables and reference markers, and therefore lacks asymmetry between anaphora and non-anaphora; this accidentally affords non-anaphoric expressions an anaphoric interpretation.

5 For details, see chapters 2 and 3 in Davila-Perez [4], where T2 is a Determiner-Noun rule (p. 43), T4 is a Subject-Predicate Rule (p. 45), and T19 is a Discourse Linking Rule (p. 62).
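The missing asymmetry can be made concrete with a toy model. The Python below is only an illustrative sketch, not Davila-Perez's implementation: the dictionary `context` and the helper `substitute` are hypothetical names, and types are kept as plain strings. It shows that a substitution-based resolver has no way to refuse a substitution for the antecedent x1.

```python
# Toy model of a context in which reference markers and antecedents are
# both plain variables. All names here are assumptions for illustration.
context = {
    "x0": "Σ(man, (x)play(x, j))",       # antecedent: a man was playing jazz
    "x1": "Σ(man, (x)make(x, noise))",   # antecedent: someone made a noise
    "X": "set",                           # type of the reference marker
    "r0": "X",                            # reference marker for "he"
}


def substitute(ctx, assignment):
    """Discharge the assigned variables from the context.

    Nothing here checks whether a variable was meant to be a reference
    marker, which is exactly the gap described in the text.
    """
    unknown = set(assignment) - set(ctx)
    if unknown:
        raise KeyError(f"not in context: {sorted(unknown)}")
    return {var: ty for var, ty in ctx.items() if var not in assignment}


# Intended resolution: only the reference marker and its type are replaced.
intended = substitute(context, {"X": "man", "r0": "π1(x0)"})

# Unintended but equally admissible: the antecedent x1 is "resolved" too.
unintended = substitute(context, {"x1": "(π1(x0), k(π1(x0))(π2(x0)))"})
```

Both calls succeed, so the model cannot tell the anaphoric link from the spurious one; any fix has to mark reference markers as a separate syntactic class.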

Ill-formedness Problem. The second problem is that some semantic representations in Davila-Perez [4] are not well-formed in the sense of MLTT. For example, let us consider the following lexical item for the definite article “the” (Davila-Perez [4], p. 58).

(Y )Y (rn) : ((X : set)prop)prop [rn : X]

This representation is ill-formed because its well-formedness requires a proof as shown in (4), in which the application of the (ΠI) rule in the last line is prohibited because the variable r0 that remains in the context still depends on the variable X, which blocks the abstraction of X.⁶

(4)
  Y : (X)prop [X : set, Y : (X)prop]      r0 : X [X : set, r0 : X]
  ------------------------------------------------------------------ (ΠE)
  Y(r0) : prop [X : set, Y : (X)prop, r0 : X]
  ------------------------------------------------------------------ (ΠI)
  (Y)Y(r0) : ((X)prop)prop [X : set, r0 : X]
  ------------------------------------------------------------------ ∗(ΠI)
  (X)(Y)Y(r0) : (X : set)((X)prop)prop [r0 : X]

An even more serious problem occurs when “the” is composed with a common noun. Suppose that “the” is to be composed with a common noun “dog”, represented as dog : set. The beta reduction of the result substitutes the variable X in (X)(Y)Y(rn) : (X : set)((X)prop)prop and yields (Y)Y(rn) : ((dog)prop)prop, as expected. However, X also occurs as free in the context [rn : X], which should not be substituted with dog, assuming the standard definition of beta reduction. But this means we lose the presuppositional content altogether.

This is not a technical problem exclusive to the definite article. It occurs whenever a presupposed semantic content contains a variable that is to be saturated by its argument. This includes cases like factive presuppositions and possessive presuppositions.
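Written out as a single step, the loss looks as follows. This is only a restatement of the reduction described above in LaTeX notation, with the context shown in square brackets on both sides.

```latex
\[
\bigl((X)(Y)\,Y(r_n)\bigr)(\mathit{dog})
  \;:\; ((\mathit{dog})\,\mathit{prop})\,\mathit{prop}
  \quad [\,r_n : X\,]
\qquad =_\beta \qquad
(Y)\,Y(r_n)
  \;:\; ((\mathit{dog})\,\mathit{prop})\,\mathit{prop}
  \quad [\,r_n : X\,]
\]
```

The type has become specific to dog, but the context still reads r_n : X rather than r_n : dog, so nothing records that the referent must be a dog.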

6 This is a general assumption in lambda calculi (cf. Martin-Löf [17], p. 27 has the following note on the Π-introduction rule for introducing a term of the form (λx ∈ A)M: “we assume that the usual variable restriction is met, i.e. that x does not appear free in any assumption except (those of the form) x ∈ A.”).

2.2 Krahmer and Piwek [15], Piwek and Krahmer [19]

resolve(Φ, C, Φ) :- atomic(Φ).
resolve(ΠV : Φ.Ψ, C, ΠV : Φ′.Ψ′) :- resolve(Φ, C, Φ′),
                                     C′ := C ⊗ (Γ − C),
                                     resolve(Ψ, C′ ⊗ V : Φ′, Ψ′).
resolve(Φχ, C, Φ′) :- binder(χ, C, S),
                       resolve(Φ[S], C, Φ′).
resolve(Φχ, C, Φ′) :- adequate(χ[S′], C),
                       add(χ[S′], Γ),
                       resolve(Φ[S′], C ⊗ χ[S′], Φ′).
(and three more clauses for intermediate and local accommodation).⁷

Fig. 2. The Anaphora Resolution Algorithm of Krahmer and Piwek [15]

K&P defined an extended language of MLTT with annotated variables as an extra construction that is intended to represent unresolved anaphora and presupposition triggers. For example, K&P’s representation for (1a) is as follows.

(5) (Σx0 : (Σu : (Σx : entity)man(x))enter(π1(u))) whistle(X[X:entity, m:man(X)])

In this representation, X[X:entity, m:man(X)] is an annotated variable corresponding to the anaphora “He” in the second sentence, where [X : entity, m : man(X)] is called an annotation (syntactically, an annotation is a context). In K&P’s theory, anaphora resolution is understood as a substitution of annotated variables, which, in this case, fulfils the condition that each of the following judgments is simultaneously proven under the substitution.

(6) a. x0 : (Σu : (Σx : entity)man(x))enter(π1(u)) ⊢ X : entity
    b. x0 : (Σu : (Σx : entity)man(x))enter(π1(u)) ⊢ m : man(X)

A substitution X = π1π1(x0), m = π2π1(x0) satisfies the condition, and the anaphora is thus resolved. Let us call the context responsible for a given anaphora resolution a local context. K&P’s algorithm for deriving a resolution for a given representation under a given local context is partly shown in Fig. 2. Readers wanting details are referred to Appendix A in Krahmer and Piwek [15].
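To see why this substitution satisfies (6a) and (6b), it may help to spell out the types of the projections of x0 step by step. The following lines only apply the standard Σ-elimination (projection) rules to the type of x0 given in (6).

```latex
\begin{align*}
x_0 &: (\Sigma u : (\Sigma x : \mathit{entity})\,\mathit{man}(x))\,\mathit{enter}(\pi_1(u)) \\
\pi_1(x_0) &: (\Sigma x : \mathit{entity})\,\mathit{man}(x) \\
\pi_1\pi_1(x_0) &: \mathit{entity} \\
\pi_2\pi_1(x_0) &: \mathit{man}(\pi_1\pi_1(x_0))
\end{align*}
```

With X = π1π1(x0), the third line is (6a) and the fourth line is (6b).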

In K&P, anaphora resolution reduces to a search for a proof of a judgment from a given local context to the types in the annotations. Thus, K&P denominated this view behind the algorithm anaphora resolution as proof construction. This paradigm inherits the presupposition as anaphora paradigm by van der Sandt [22] and Geurts [9] in the sense that both regard anaphora resolution and presupposition binding as the same operation. However, K&P states a stronger claim with regard to the nature of anaphora and presupposition: namely, any anaphoric link is a proof. Moreover, K&P’s theory is empirically superior to van der Sandt [22] and Geurts [9]’s DRT-based approach in cases such as bridging (Clark [3]) where an inference is required to establish an anaphoric link.⁸

(7) If John_i buys a car, he_i checks the motor first.

7 ⊗ denotes the append operation between two contexts, and Γ is a given context in which the presuppositions of Φ are to be resolved. The readers may be puzzled by the status of Γ as a global variable despite its Prolog-style notation. For details, refer to Krahmer and Piwek [15], the appendix A.

8 Mineshima [18] pursued this analysis in the setting of Ranta [21].

Partiality Problem of Resolution Algorithm. Yet K&P’s theory has left at least two problems unsolved. The first problem is that K&P’s algorithm is not defined for all constructions of MLTT, but only for those listed in Fig. 2: atomic formulas, constructions of the form ΠV : Φ, and annotated variables. This means that the algorithm resolves anaphora that appears in certain forms only.

Note that this is not problematic for K&P’s theory itself, which adopts Ahn and Kolb [1]’s translation from DRS so that it only yields permissible forms. However, this requirement is too strict if we try to extend the empirical coverage of K&P’s theory or use K&P’s theory in a compositional setting, where it is naturally expected that an annotated variable may occur within a functional application, projection, lambda abstraction, or other construction. For example, the following are semantic representations of “John” and “loves his father”, respectively.

(8) a. (λp)p(j)
    b. (λx)love(x, X[X:entity, f:fatherOf(x,j)])

The representation (8b) contains a variable with annotation [X : entity, f : fatherOf(x, j)] within the lambda abstraction construction, on which K&P’s anaphora resolution algorithm does not work. This may cause trouble when a construction such as generalized quantifiers uses a lambda-abstracted representation as its subformula, as in (9).

(9) Most((λx)boy(x))((λx)love(x, X[X:entity, f:fatherOf(x,j)]))

Therefore, it is a natural urge to extend the algorithm to arbitrary forms as well so as to secure the empirical coverage of the theory.⁹

Copying Problem of Annotated Variable. The second problem occurs when a variable with annotation gets copied. This happens when operations such as the conjunction reduction take place due to the generalized conjunction (in the sense of Gazdar [7]). This occurs in the following sentence.

(10) Each of John and Bill loves his father.

The reading for (10) is either (each of John and Bill loves his own father) or (both John and Bill love somebody’s father (including John’s and Bill’s)). However, a mixed reading in which John only loves his own father and Bill only loves Sam’s father, for example, does not seem available.

K&P’s algorithm gives rise to the mixed reading as well as other legitimate readings, and so it overgenerates anaphoric links. Assume that the semantic representation of (10) is as follows, where the annotated variable Y is copied and appears twice.

(11) love(j, Y[X:entity, Y:entity, f:fatherOf(Y,X)]) ∧ love(b, Y[X:entity, Y:entity, f:fatherOf(Y,X)])

In K&P’s theory, anaphora is resolved locally; namely, each occurrence of a variable with annotation gets substituted under each local context. Thus, the resolution allows the first X to be substituted with John and the second X with some person other than John and Bill. Therefore, the semantic language that K&P suggests for unresolved representations empirically overgenerates.

9 However, there seems to exist an inherent difficulty in defining resolution rules for all constructions: in a semantic representation of the form (λP)P(...XA...), where XA is an annotated variable, the local context for a resolution of XA depends on the content of P. Thus, the resolution algorithm has to delay the evaluation to the latter stage, where P is instantiated.
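The overgeneration can be counted in a small sketch. The Python below is a hypothetical enumeration, not K&P's algorithm: it assumes three candidate antecedents for X (with "sam" standing for some third person) and resolves the two copies of the annotated variable independently, as local resolution would.

```python
from itertools import product

# Hypothetical candidate antecedents for X in the annotation
# [X : entity, Y : entity, f : fatherOf(Y, X)].
CANDIDATES = ["john", "bill", "sam"]
SUBJECTS = ["john", "bill"]  # one copy of the annotated variable per conjunct


def local_resolutions():
    """Resolve each copy independently, as local resolution permits."""
    return [dict(zip(SUBJECTS, choice))
            for choice in product(CANDIDATES, repeat=len(SUBJECTS))]


def is_bound(reading):
    """'His own father': every subject is paired with himself."""
    return all(owner == subject for subject, owner in reading.items())


def is_uniform(reading):
    """'Somebody's father': both conjuncts share a single owner."""
    return len(set(reading.values())) == 1


def is_mixed(reading):
    """Neither bound nor uniform, e.g. John -> John, Bill -> Sam."""
    return not is_bound(reading) and not is_uniform(reading)


mixed = [r for r in local_resolutions() if is_mixed(r)]
```

Under these assumptions, `mixed` is non-empty and contains the reading where John loves his own father and Bill loves Sam's, which is the reading the text reports as unavailable. A single shared resolution for both copies would exclude it.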

3 Dependent Type Semantics

For the solution of various problems discussed in the last section, I present a framework that I will call Dependent Type Semantics (henceforth DTS). DTS is based on dependent type theory with Π, Σ constructors, equation and natural number types (like MLTT), extended with an @-operator (unlike MLTT), whose syntax for raw terms is specified as follows:

Λ := x | c | () | (@i : Λ)
   | (Πx:Λ)Λ | (λx)Λ | ΛΛ | (Σx:Λ)Λ | (Λ,Λ) | π1(Λ) | π2(Λ)
   | eq_Λ(Λ,Λ) | r_Λ(Λ,Λ) | s(Λ) | R(Λ,Λ,Λ)

where x is a variable, c is a constant symbol, () is a unit, (@i : Λ) is an @-operator (i is a natural number), r_A(M,N) is a proof term of the equation between M and N of type A, s is the successor operator in N-introduction rule, and R is the mathematical induction operator in N-elimination rule, respectively.¹⁰

3.1 Concepts

DTS is characterized by the following four features, though the first one is shared by most of the other Sundholmian semantics.

1. From the Curry-Howard correspondence, where a proposition is a type and a proof is a term, a static proposition in DTS is a type. Under a given context, it is true if and only if it is inhabited (i.e., there exists a proof term of that type).

2. A dynamic proposition in DTS is a function from a proof of a static proposition that corresponds to its preceding discourse (a context in DTS) to a static proposition. A context is transferred either externally by the context-passing mechanism built into dynamic conjunction and disjunction (which are akin to continuation semantics (de Groote [11])), or internally in the lexical representations.

3. All the lexical representations are raw terms of dependent type theory, and composed as raw terms. This allows a context to be polymorphic.

4. When such raw terms are composed together to yield a representation for a syntactic category S (that is, a sentence), it must be of sort γ → type (for some new variable γ) in order to be uttered felicitously. This is called the felicity condition in DTS.

10 Although the whole system of DTS is described by means of Martin-Löf type theory, our notation in this paper is inherited from Pure Type System (Barendregt [2]), which is more explicit about the status of contexts, substitutions, discharging, and other formal notions.

3.2 Static and Dynamic Propositions

In a dynamic proposition, a given context is discarded when the dynamic proposition does not contain any anaphora or presupposition triggers. For example, a semantic representation in DTS for “A man entered” is as (12),¹¹ where c is a variable for a proof of its left context, which is not used in this representation.

(12) (λc)(Σu:(Σx:e)M(x))E(π1(u))

3.3 The @-operator and Anaphora/Presupposition triggers

Anaphora and presupposition triggers are represented by @-operators that take the left context that is passed to a dynamic proposition that contains them. The representation for “He whistled” is (13), where the @0-operator (@0 : γ0 → e) is fed its left context c (whose type is underspecified as γ0) and returns an entity (of type e) whom the pronoun refers to.

(13) (λc)W((@0 : γ0 → e)(c))

11 The constant symbol e is short for entity, M is short for Man, E is short for Enter, and so forth. With respect to the representations of noun phrases, two different approaches have been adopted in previous works. For example, Sundholm [24] and Ranta [21] (and also Davila-Perez [4] in MLTT setting) encode the semantic representation of “A man laughs” as in (1), while Krahmer and Piwek [15] adopts the style in (2).

(1) (Σx:M)L(x)    (2) (Σu:(Σx:e)M(x))L(π1(u))

DTS adopts the style in (2). One reason for this choice is to avoid the verbose use of the app operator of Davila-Perez [4] (see Fig. 1), which arises from polymorphism of noun phrases: the representation of “man” is man while that of “old man” is Σ(man, (x)old(x)). This not only forces nominal modifiers to be polymorphic, but also requires a wrapper, such as an app operator, between noun phrases and verbs to absorb the polymorphism. Even worse, this polymorphism makes the definition of generalized quantifiers in Sundholm [25] notoriously polymorphic, as criticized in Fox and Lappin [6].

D1 ≡
                   ⋮
  γ : type      V : type
  ----------------------
  γ × V : type               e : type
  ------------------------------------ (→F)
  γ × V → e : type                        γ × V → e true
  -------------------------------------------------------- (@F)
  (@0 : γ0 → e) : γ × V → e

  where V = (Σu:(Σx:e)M(x))E(π1(u)).

                                        c : γ (2)     v : V (1)
                                        ------------------------ (×I)
                                 D1     (c, v) : γ × V
                                 ------------------------------- (→E)
               W : e → type      (@0 : γ0 → e)(c, v) : e
               ------------------------------------------ (→E)
      ⋮
  V : type     W((@0 : γ0 → e)(c, v)) : type
  --------------------------------------------------------------- (ΣF), (1)
  (Σv:(Σu:(Σx:e)M(x))E(π1(u)))W((@0 : γ0 → e)(c, v)) : type
  --------------------------------------------------------------- (→I), (2)
  (λc)(Σv:(Σu:(Σx:e)M(x))E(π1(u)))W((@0 : γ0 → e)(c, v)) : γ → type

Fig. 3. Type Checking for (1a)

@-operators have the following formation rule (@F).

  A : type      A true
  --------------------- (@F)
  (@i : A) : A

The (@F) rule requires that a certain type (γ0 → e, in the case of (13)) is inhabited, which is the presupposition triggered by the @-operator.¹² There is no introduction or elimination rule for @-operators. Free variables and substitution for @-operators are defined as follows.

fv((@i : A)) ≡def fv(A)

(@i : A)[M/x] ≡def (@i : (A[M/x]))

A natural number i in (@i : A) is assigned for each occurrence of an anaphoric expression or presupposition trigger.
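The two clauses for free variables and substitution can be sketched over a small term datatype. The Python below is a minimal illustration under stated assumptions: it covers only a fragment of the raw-term grammar (variables, constants, abstraction, application, the non-dependent arrow, and @), the class names are hypothetical, and substitution assumes that bound names do not clash with the free variables of the substituted term, so no renaming is performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Lam:            # (λx)M
    var: str
    body: object


@dataclass(frozen=True)
class App:            # M N
    fun: object
    arg: object


@dataclass(frozen=True)
class Arrow:          # A → B, the non-dependent case of Π
    dom: object
    cod: object


@dataclass(frozen=True)
class At:             # (@i : A)
    index: int
    type: object


def fv(t):
    """Free variables of a raw term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Const):
        return set()
    if isinstance(t, Lam):
        return fv(t.body) - {t.var}
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    if isinstance(t, Arrow):
        return fv(t.dom) | fv(t.cod)
    if isinstance(t, At):
        return fv(t.type)                      # fv((@i : A)) = fv(A)
    raise TypeError(f"unknown term: {t!r}")


def subst(t, m, x):
    """t[m/x], assuming bound names in t do not occur free in m."""
    if isinstance(t, Var):
        return m if t.name == x else t
    if isinstance(t, Const):
        return t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, m, x))
    if isinstance(t, App):
        return App(subst(t.fun, m, x), subst(t.arg, m, x))
    if isinstance(t, Arrow):
        return Arrow(subst(t.dom, m, x), subst(t.cod, m, x))
    if isinstance(t, At):
        return At(t.index, subst(t.type, m, x))  # (@i : A)[M/x] = (@i : A[M/x])
    raise TypeError(f"unknown term: {t!r}")


# (13): (λc)W((@0 : γ0 → e)(c)), with γ0 modelled as a variable.
he_whistled = Lam("c", App(Const("W"),
                           App(At(0, Arrow(Var("γ0"), Const("e"))), Var("c"))))
```

In this encoding the only free variable of (13) is γ0, which sits inside the @-operator; substituting a concrete type for γ0 reaches it through the `At` clause, and the index 0 is left untouched.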

12 This idea is in line with the use of the epsilon operator in Mineshima [18].

3.4 Context-Passing Mechanism

Then, two dynamic propositions are merged into one by either dynamic conjunction or dynamic disjunction (the latter being a dynamic version of the equivalence in propositional logic: A ∨ B ≡ ¬A → B).

Definition 1 (Dynamic conjunction and disjunction).

M ; N ≡def (λc)(Σu:Mc)N(c, u)

M | N ≡def (λc)(Πu:¬Mc)N(c, u)


A left context c for M ; N (or M | N) is first passed to M (or ¬M), and then the pair (c, u) is passed to N; here, u is a proof of Mc (or ¬Mc). This means that the anaphora in N can refer to both antecedents in the left context and those introduced in M, but the anaphora in M can only refer to antecedents in the left context.

The types of the context c and the pair of contexts (c, u) are different, thus the two dynamic propositions M and N should be assigned different types. But this does not require a polymorphic setting at the object language level since M and N are raw terms and polymorphism is handled at the metalanguage level when type checking takes place.

The dynamic conjunction operator, if applied to the sequence of (12) and (13), yields a complex as follows.

((λc)(Σu:(Σx:e)M(x))E(π1(u))) ; ((λc)W((@0 : γ0 → e)(c)))
= (λc)(Σv:(Σu:(Σx:e)M(x))E(π1(u)))W((@0 : γ0 → e)(c, v))
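The intermediate steps of this computation can be written out as follows. The derivation only unfolds Definition 1 and applies beta reduction; V abbreviates (Σu:(Σx:e)M(x))E(π1(u)), and the bound variable u of Definition 1 is renamed to v to avoid a clash with the u inside V.

```latex
\begin{align*}
M ; N
  &\equiv (\lambda c)(\Sigma v : M c)\, N(c, v) \\
  &= (\lambda c)\bigl(\Sigma v : ((\lambda c)V)\,c\bigr)\,
     \bigl((\lambda c)\,W((@_0 : \gamma_0 \to e)(c))\bigr)(c, v) \\
  &=_\beta (\lambda c)(\Sigma v : V)\, W((@_0 : \gamma_0 \to e)(c, v))
\end{align*}
```

The first conjunct discards its context, so Mc reduces to V, while the second conjunct receives the pair (c, v), which is what lets the @-operator see the proof v of “A man entered”.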

3.5 The Felicity Condition and Anaphora Resolution

Since the last representation in the previous derivation is one for syntactic category S, it must obey the felicity condition of DTS. Checking the felicity condition for each sentential representation evokes its type checking algorithm (cf. the algorithm W in Hindley [12]) of dependent type theory, as shown in Fig. 3, which requires that the following judgment holds at the topmost node.

(14) ((Σc:γ)(Σu:(Σx:e)M(x))E(π1(u))) → e true

This is true, since this type is inhabited by the proof term (λc)π1π1π2(c). In words, there is an entity in a given context to which the pronoun “He” can refer. This exactly corresponds to the presupposition that the pronoun triggers.

Now, I declare the following statement for anaphora resolution and presupposition binding: anaphora resolution involves proof search, in line with K&P’s theory.

Definition 2 (Anaphora Resolution / Presupposition Binding in DTS). Suppose that Γ ⊢ (@i : A) : A and Γ ⊢ a : A. Then a resolution of @i by a under the context Γ is an equation (@i : A) = a : A.

A proof such as a : A always exists when the felicity condition is met, in which case there is a solution to anaphora resolution. In the above example, the following equation is a resolution:

(@0 : γ0 → e) = (λc)π1π1π2(c) : γ0 → e

where γ0 = (Σc:γ)(Σu:(Σx:e)M(x))E(π1(u)).

Note that this equation is not a logical consequence deduced from the given representation; anaphora resolution (and presupposition binding) in DTS is understood as an abduction that infers the speaker’s knowledge behind the utterance that contains anaphora or presupposition triggers, which makes it true (cf. Krause [16]).
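For contexts built only from pairs, the proof search behind such a resolution amounts to enumerating projection paths. The Python below is a deliberately simplified sketch and not the type checker of DTS: types are modelled structurally (atoms as strings, Σ-types and products as `("pair", left, right)` tuples), dependencies between components are ignored, and function-typed knowledge such as the k of Section 2.1 is not used. The names `pair`, `projection_paths`, and `resolve` are hypothetical.

```python
def pair(left, right):
    """A pair type: covers Σ-types, products, and conjunctions alike."""
    return ("pair", left, right)


def projection_paths(ty, target, prefix=""):
    """Yield projection prefixes p such that p(c) has type `target`
    whenever c has type `ty`."""
    if ty == target:
        yield prefix
    if isinstance(ty, tuple) and ty[0] == "pair":
        yield from projection_paths(ty[1], target, "π1" + prefix)
        yield from projection_paths(ty[2], target, "π2" + prefix)


def resolve(ty, target="e"):
    """Candidate proof terms of type ty → target, as strings."""
    return [f"(λc){path}(c)" for path in projection_paths(ty, target)]


# γ × (Σu:(Σx:e)M(x))E(π1(u)): the left context of "He whistled" in (14).
V = pair(pair("e", "M(x)"), "E(π1(u))")
context_14 = pair("γ", V)
candidates = resolve(context_14)
```

For the context of (14) the search finds a single candidate, the term (λc)π1π1π2(c) given in the text. When a context contains several entities the list has several members, and choosing among them is the abductive step described above.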


4 Example Derivations: Donkey Sentences

As a compositional semantics, DTS is more transparent than Davila-Perez’s theory where each composition rule is tightly bound to a particular syntactic configuration.

In order to provide a more detailed exposition of how the machinery of DTS works in general, this section will demonstrate a derivation of the donkey sentence (Geach [8]) shown in (15), as one of the canonical benchmark tests that we expect any new discourse theory to cover.

(15) Every farmer who owns [a donkey]_1 beats it_1.

The lexical items required to derive the sentence (15) are listed in Definition 3. Throughout this paper, DTS is presented as a semantic component of combinatory categorial grammar (Steedman [23]), but it naturally serves for other lexical grammars as well.¹³

Definition 3 (Lexical items in DTS).

PF          CCG categories    Semantic representations in DTS
if          S/S/S             (λp)(λq)(λc)(Πu:pc)(q(c, u))
every_nom   T/(T\NP)/N        (λn)(λp)(λc)(Πu:(Σx:e)nxc)(p(π1(u))(c, u))
every_acc   T\(T/NP)/N        (λn)(λp)(λx)(λc)(Πv:(Σy:e)nyc)(p(π1(v))x(c, v))
a_nom       T/(T\NP)/N        (λn)(λp)(λc)(Σu:(Σx:e)nxc)p(π1(u))(c, u)
a_acc       T\(T/NP)/N        (λn)(λp)(λx)(λc)(Σv:(Σy:e)nyc)p(π1(v))x(c, v)
farmer      N                 (λx)(λc)F(x)
donkey      N                 (λx)(λc)D(x)
who         N\N/(S\NP)        (λp)(λn)(λx)(λc)(nxc ∧ pxc)
whom        N\N/(S/NP)        (λp)(λn)(λx)(λc)(nxc ∧ pxc)
owns        S\NP/NP           (λy)(λx)(λc)O(x, y)
beats       S\NP/NP           (λy)(λx)(λc)B(x, y)
he^i_nom    T/(T\NP)          (λp)(λc)((λx)px(c, x))((@i : γi → e)(c))
it^j_acc    T\(T/NP)          (λp)(λx)(λc)((λy)pyx(c, y))((@j : γj → e)(c))

The conditional “if” and the universal quantifier “every” are constructed from Π-operator, while the indefinite article “a” is from Σ-operator, following Sundholm [24].

The relativizer “who” takes a subjectless sentence and a common noun, and statically conjoins them. Pronouns are represented by means of @-operators, where each γi is a new variable.

13 Given that the semantic representations in DTS are raw terms, the result of semantic composition may not be typable. This problem can be avoided by adopting a categorial grammar for a syntactic component of DTS and by ensuring that each semantic representation in the lexicon is typable. The proof is routine (we only have to show that each rule of the adopted categorial grammar preserves typability). We also need Subject Reduction Theorem and Normalization Theorem of DTS with respect to the CCG categories in order to execute reductions during derivations.

The derivation of the relative donkey sentence (15) is as follows.

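a                   T\(T/NP)/N : (λn)(λp)(λx)(λc)(Σv:(Σy:e)nyc)(p(π1(v))x(c, v))
donkey              N : (λx)(λc)D(x)
------------------------------------------------------------------ (>)
a donkey            T\(T/NP) : (λp)(λx)(λc)(Σv:(Σy:e)D(y))(p(π1(v))x(c, v))

owns                S\NP/NP : (λy)(λx)(λc)O(x, y)
a donkey            T\(T/NP) : (λp)(λx)(λc)(Σv:(Σy:e)D(y))(p(π1(v))x(c, v))
------------------------------------------------------------------ (<)
owns a donkey       S\NP : (λx)(λc)(Σv:(Σy:e)D(y))O(x, π1(v))

who                 N\N/(S\NP) : (λp)(λn)(λx)(λc)(nxc ∧ pxc)
owns a donkey       S\NP : (λx)(λc)(Σv:(Σy:e)D(y))O(x, π1(v))
------------------------------------------------------------------ (>)
who owns a donkey   N\N : (λn)(λx)(λc)(nxc ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))

farmer              N : (λx)(λc)F(x)
who owns a donkey   N\N : (λn)(λx)(λc)(nxc ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))
------------------------------------------------------------------ (<)
farmer who owns a donkey
                    N : (λx)(λc)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))

every               S/(S\NP)/N : (λn)(λp)(λc)(Πu:(Σx:e)nxc)(p(π1(u))(c, u))
farmer who owns a donkey
                    N : (λx)(λc)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))
------------------------------------------------------------------ (>)
every farmer who owns [a donkey]_1
                    S/(S\NP) : (λp)(λc)(Πu:(Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))))(p(π1(u))(c, u))

beats               S\NP/NP : (λy)(λx)(λc)B(x, y)
it_1                (S\NP)\(S\NP/NP) : (λp)(λx)(λc)p((@1 : γ1 → e)(c))xc
------------------------------------------------------------------ (<)
beats it_1          S\NP : (λx)(λc)B(x, (@1 : γ1 → e)(c))

every farmer who owns [a donkey]_1
                    S/(S\NP) : (λp)(λc)(Πu:(Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))))(p(π1(u))(c, u))
beats it_1          S\NP : (λx)(λc)B(x, (@1 : γ1 → e)(c))
------------------------------------------------------------------ (>)
                    S : (λc)(Πu:(Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))))B(π1(u), (@1 : γ1 → e)(c, u))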

Thus we obtain (16) as a semantic representation of (15), from the last line of the above derivation.

(16) (λc)(Πu:(Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))))B(π1(u), (@1 : γ1 → e)(c, u))
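The composition leading to (16) can be replayed mechanically. The Python below is a rough sketch under several assumptions: each lexical item is encoded as a curried Python function that directly builds the string of its beta-normal form, the bound-variable names x, y, u, v are fixed per entry (safe for this one derivation, not in general), and the pronoun entry follows the simplified form used in the derivation rather than the entry in Definition 3. The helper names `pair` and `apply_to_context` are hypothetical.

```python
def pair(c, u):
    """The extended context (c, u)."""
    return f"({c}, {u})"


def apply_to_context(f, c):
    # A context is either a variable name or a pair already in parentheses.
    return f"{f}{c}" if c.startswith("(") else f"{f}({c})"


farmer = lambda x: lambda c: f"F({x})"
donkey = lambda x: lambda c: f"D({x})"
owns = lambda y: lambda x: lambda c: f"O({x}, {y})"
beats = lambda y: lambda x: lambda c: f"B({x}, {y})"
who = lambda p: lambda n: lambda x: lambda c: f"({n(x)(c)} ∧ {p(x)(c)})"


def every_nom(n):
    # (λn)(λp)(λc)(Πu:(Σx:e)nxc)(p(π1(u))(c, u))
    return lambda p: lambda c: (
        f"(Πu:(Σx:e){n('x')(c)})" + p("π1(u)")(pair(c, "u")))


def a_acc(n):
    # (λn)(λp)(λx)(λc)(Σv:(Σy:e)nyc)(p(π1(v))x(c, v))
    return lambda p: lambda x: lambda c: (
        f"(Σv:(Σy:e){n('y')(c)})" + p("π1(v)")(x)(pair(c, "v")))


def it_acc(j):
    # (λp)(λx)(λc)p((@j : γj → e)(c))xc, the form used in the derivation
    at = f"(@{j} : γ{j} → e)"
    return lambda p: lambda x: lambda c: p(apply_to_context(at, c))(x)(c)


owns_a_donkey = a_acc(donkey)(owns)            # S\NP
restricted_noun = who(owns_a_donkey)(farmer)   # N
subject = every_nom(restricted_noun)           # S/(S\NP)
beats_it = it_acc(1)(beats)                    # S\NP
sentence = "(λc)" + subject(beats_it)("c")     # S
```

Under these assumptions `sentence` spells out the representation in (16). The point to notice is that the pronoun only ever sees the context it is handed, so its argument becomes (c, u) once “every” extends the context with u.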

Then the felicity condition requires (16) to be of type γ0 → type (where γ0 is a new variable), which is to be checked by the type checking algorithm.

We assume that the constant symbols are annotated as F : e → type, D : e → type, O : e × e → type, and B : e × e → type. Under this setting, the type checking algorithm requires, via the (@F) rule, that the following type is true, namely, that it has a proof term:¹⁴

(17) γ0 × (Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))) → e

As the following proof diagram shows, (λc)π1π1π2π2π2(c) is a proof term that satisfies the felicity condition, which corresponds to the reading where “it” refers to the donkey.

  c : γ0 × (Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))   (1)
  ------------------------------------------------------------ (×E)
  π2(c) : (Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v)))
  ------------------------------------------------------------ (ΣE)
  π2π2(c) : F(π1π2(c)) ∧ (Σv:(Σy:e)D(y))O(π1π2(c), π1(v))
  ------------------------------------------------------------ (ΣE)
  π2π2π2(c) : (Σv:(Σy:e)D(y))O(π1π2(c), π1(v))
  ------------------------------------------------------------ (ΣE)
  π1π2π2π2(c) : (Σy:e)D(y)
  ------------------------------------------------------------ (ΣE)
  π1π1π2π2π2(c) : e
  ------------------------------------------------------------ (→I), (1)
  (λc)π1π1π2π2π2(c) : γ0 × (Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))) → e

14 We need a proof search to check this step, which gives rise to the undecidability of type checking in DTS. This is one potential problem of the current version of DTS, though the inhabitance of (17) is exactly what one has to check to decide the antecedent of the donkey anaphora in (15), which has to be calculated at some point or other.

Thus, the equation (@1 : γ1 → e) = (λc)π1π1π2π2π2(c) : γ1 → e is a resolution of @1 under the given context in the sense of Definition 2. Assuming this equation in succeeding inferences allows us to substitute (@1 : γ1 → e) in the semantic representation of (15) with (λc)π1π1π2π2π2(c), yielding the resolved semantic representation (18).

(18) (λc)(Πu:(Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))))B(π1(u), π1π1π2π2(u))
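The resolution step can be made concrete with a small sketch. It is only an illustration under loud assumptions: proof terms are modelled as nested Python pairs, the names `farmer1`, `donkey1`, and the proof strings are hypothetical placeholders, and nothing here performs actual type checking.

```python
def pi1(p):
    """First projection of a pair."""
    return p[0]

def pi2(p):
    """Second projection of a pair."""
    return p[1]

# Hypothetical entities and proof objects (placeholders, not from the paper).
farmer, donkey = "farmer1", "donkey1"
proof_F = "proof of F(farmer1)"
proof_D = "proof of D(donkey1)"
proof_O = "proof of O(farmer1, donkey1)"

# u : (Σx:e)(F(x) ∧ (Σv:(Σy:e)D(y))O(x, π1(v))), encoded as nested pairs.
u = (farmer, (proof_F, ((donkey, proof_D), proof_O)))
c = "left context"  # stands in for an inhabitant of γ0

def resolve_at1(ctx):
    """The proof term (λc)π1π1π2π2π2(c) from the derivation above."""
    return pi1(pi1(pi2(pi2(pi2(ctx)))))

# Applied to (c, u), the chain reaches the donkey, as does π1π1π2π2(u).
picked = resolve_at1((c, u))
```

In this encoding `picked` is the donkey entity, mirroring the reading in which “it” refers to the donkey.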

5 Solutions to the Puzzles

5.1 “Lack of Asymmetry” Problem Solved

The first problem of Davila-Perez [4] does not occur in DTS in an obvious sense, since DTS distinguishes anaphora and non-anaphora by representing the former, but not the latter, by means of @-operators. For example, the representation for (3) is as (19), where an @-operator is used only for the pronoun “he”; thus, there is no danger that other terms are interpreted as anaphoric.

(19) (λc)(Σx:e)(M(x) ∧ P(x, j)); (λc)(Σx:e)(M(x) ∧ MK(x, n)); (λc)MFE((@0 : γ0 → e)(c))

5.2 Ill-Formedness Problem Solved

The second problem of Davila-Perez [4] is also avoided in DTS. The semantic representation for “the” in DTS is as (20), where the lambda abstraction with respect to the variable n is legitimate, since the occurrence of n in the @-operator is just that of a free variable and no other variables depend on n.

(20) (λn)(λp)(λc)p(π1((@i : (Πc:γi)(Σx:e)nxc)(c)))c

The semantic representation for the sentence in (21a) in DTS is (21b), as derived in Fig. 4.


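  the     S/(S\NP)/N : (λn)(λp)(λc)p(π1((@0 : γ0 → (Σx:e)nxc)(c)))c
  dog     N : (λx)(λc)D(x)
  ──────────────────────────────────────────────────── (>)
          S/(S\NP) : (λp)(λc)p(π1((@0 : γ0 → (Σx:e)D(x))(c)))c
  barks   S\NP : (λx)(λc)B(x)
  ──────────────────────────────────────────────────── (>)
          S : (λc)B(π1((@0 : γ0 → (Σx:e)D(x))(c)))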

Fig. 4. A Derivation of (21a)

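  c : γ0   (1)
      ⋮
  γ0 → (Σx:e)D(x) : type      γ0 → (Σx:e)D(x) true
  ──────────────────────────────────────────────────── (@F)
  (@1 : γ1 → (Σx:e)D(x)) : γ0 → (Σx:e)D(x)      c : γ0   (1)
  ──────────────────────────────────────────────────── (→E)
  (@1 : γ1 → (Σx:e)D(x))(c) : (Σx:e)D(x)
  ──────────────────────────────────────────────────── (ΣE)
  B : e → type      π1((@1 : γ1 → (Σx:e)D(x))(c)) : e
  ──────────────────────────────────────────────────── (→E)
  B(π1((@1 : γ1 → (Σx:e)D(x))(c))) : type
  ──────────────────────────────────────────────────── (→I), (1)
  (λc)B(π1((@1 : γ1 → (Σx:e)D(x))(c))) : γ0 → type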

Fig. 5. Type Checking for (21a)

(21) a. The dog barks.

b. (λc)B(π1((@1 : γ1 → (Σx:e)D(x))(c)))

Then, its felicity condition is checked as shown in Fig. 5, assuming the type of the given context as c : γ0. Thus, all we need for the felicity condition to be satisfied is that the type γ0 → (Σx:e)D(x) is inhabited (i.e., that there is a proof from a given left context of the existence of a dog), which is just as expected.
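The inhabitation requirement can be pictured as a search over a concrete context. The sketch below is a deliberately crude stand-in for proof search, under stated assumptions: contexts are nested pairs, the `Proof` class and the entities `john` and `rex` are hypothetical, and only chains of projections are enumerated (a real proof search in dependent type theory would also use other rules).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proof:
    """Atomic proof object for a predicate applied to entities (toy encoding)."""
    pred: str
    args: tuple

def is_dog_witness(term):
    """True if term is a pair (x, p) where p is a proof of D(x)."""
    return (isinstance(term, tuple) and len(term) == 2
            and term[1] == Proof("D", (term[0],)))

def search(term, path=""):
    """Yield projection chains from term to an inhabitant of (Σx:e)D(x).

    A chain such as "π1π2" is read as π1(π2(c)).
    """
    if is_dog_witness(term):
        yield path or "(identity)"
    if isinstance(term, tuple) and len(term) == 2:
        yield from search(term[0], "π1" + path)
        yield from search(term[1], "π2" + path)

# Hypothetical left context: a man named john and a dog named rex.
context = (("john", Proof("M", ("john",))), ("rex", Proof("D", ("rex",))))
paths = list(search(context))

# A context without any dog yields no chain, so the presupposition is not met.
no_dog = list(search(("john", Proof("M", ("john",)))))
```

Each chain found corresponds to one candidate term of type γ0 → (Σx:e)D(x) in this toy setting; an empty result models a context from which the existence of a dog cannot be proven by projections alone.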

5.3 Partiality Problem Solved

The first problem of K&P is solved by the context-passing mechanism. DTS does not need to define a rule for each configuration in dependent type theory about how local contexts are passed, since local contexts are explicitly passed around as arguments for dynamic propositions. Consider a semantic representation (22) for (8b) in DTS.

(22) (λx)(λc)L(x, π1((@1 : γ1 → (Σy:e)fatherOf(y, (@0 : γ0 → e)(c, x)))(c, x)))

In (22), the representation explicitly takes a context c as an argument and passes it (with x) to the @-operator, which presupposes that the existence of x’s father is proven from c.

Thus we do not have to care about what kind of construction of dependent type theory we are dealing with, because the relations between a local context and @-operators are always explicitly specified.


5.4 Copying Problem Solved

The second problem of K&P is circumvented by indexing @-operators. For example, assume that the representation of (10) in DTS is as (23).

(23) (λc)(L(j, π1((@1 : γ1 → (Σx:e)fatherOf(x, (@0 : γ0 → e)(c, j)))(c, j)))
       ∧ L(b, π1((@1 : γ1 → (Σx:e)fatherOf(x, (@0 : γ0 → e)(c, b)))(c, b))))

In (23), the @0 operator presupposes the existence of somebody that the pronoun “his” denotes, and the @1 operator presupposes that of his father. Since the two occurrences of the @0 operator in (23) share the same index, their resolution must substitute both occurrences at the same time. A possible substitution is that with π2, which picks up John and Bill as the respective antecedents of “his”.

The same argument applies to the @1 operators. As a result, they are to be substituted by the same proof term, which in turn picks up the same person as the father in question from a given context c. This way, the mixture of antecedent selection that troubled K&P’s theory does not arise in DTS.
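The effect of sharing an index can be sketched as a uniform substitution. The encoding is an assumption made for illustration only: terms are nested tuples, a node `("@", i, arg)` stands for an @i operator applied to `arg`, the @1 layer of (23) is elided, and `"fatherOf"` is merely a placeholder head symbol.

```python
def pi2(p):
    """Second projection of a pair."""
    return p[1]

def resolve(term, assignment):
    """Replace every ("@", i, arg) node by assignment[i] applied to arg.

    All occurrences with the same index i receive the same function,
    which is the point of indexing the operators.
    """
    if isinstance(term, tuple) and term and term[0] == "@":
        _, index, arg = term
        return assignment[index](resolve(arg, assignment))
    if isinstance(term, tuple):
        return tuple(resolve(t, assignment) for t in term)
    return term

# Simplified shape of (23): both conjuncts contain the same @0 operator.
term = ("and",
        ("L", "j", ("fatherOf", ("@", 0, ("c", "j")))),
        ("L", "b", ("fatherOf", ("@", 0, ("c", "b")))))

# One assignment for index 0 (here π2) resolves both occurrences at once.
resolved = resolve(term, {0: pi2})
```

With π2 assigned to index 0, the first occurrence resolves to `"j"` and the second to `"b"`; since there is a single entry per index, there is no way to resolve the two occurrences by different proof terms.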

6 Conclusion

We presented the framework of DTS, a new compositional semantic theory based on dependent type theory (in line with Sundholm [24] and Ranta [21]) extended with @-operators. While inheriting the parsing-orientedness from Davila-Perez [4] and the “anaphora resolution as proof construction” paradigm from K&P, DTS differs from the previous constructive type theoretic approaches in four points: 1) DTS has a context-passing mechanism, 2) DTS uses an @-operator for anaphora and presupposition triggers, 3) semantic representations in DTS are raw terms, and 4) the felicity condition invokes a type checking of the semantic representation, which works as anaphora resolution and presupposition binding. With this setting, DTS gives a unified solution to the problems that the previous approaches left unsolved.

References

1. Ahn, R., Kolb, H.P.: Discourse representation meets constructive mathematics. In: Kalman, L., Polos, L. (eds.) Papers from the Second Symposium on Logic and Language. Akademiai Kiado (1990)

2. Barendregt, H.P.: Lambda calculi with types. In: Abramsky, S., Gabbay, D.M., Maibaum, T. (eds.) Handbook of Logic in Computer Science, vol. 2, pp. 117–309. Oxford Science Publications (1992)

3. Clark, H.H.: Bridging. In: Roger, S., L., N.W.B. (eds.) TINLAP ’75: Proceedings of the 1975 workshop on Theoretical issues in natural language processing. pp. 169–174. Association for Computational Linguistics, Stroudsburg, PA, USA, Cambridge, Massachusetts (1975)


4. Davila-Perez, R.: Semantics and Parsing in Intuitionistic Categorial Grammar. Ph.D. thesis, University of Essex (1995)

5. Evans, G.: Pronouns. Linguistic Inquiry 11, 337–362 (1980)

6. Fox, C., Lappin, S.: Type-theoretic approach to anaphora and ellipsis. In: Recent Advances in Natural Language Processing (RANLP 2003). Borovets, Bulgaria (2003)

7. Gazdar, G.: A cross-categorial semantics for conjunction. Linguistics and Philosophy 3, 407–409 (1980)

8. Geach, P.: Reference and Generality: An Examination of Some Medieval and Modern Theories. Cornell University Press, Ithaca, New York (1962)

9. Geurts, B.: Presuppositions and pronouns. Oxford, Elsevier (1999)

10. Groenendijk, J., Stokhof, M.: Dynamic predicate logic. Linguistics and Philosophy 14, 39–100 (1991)

11. de Groote, P.: Towards a montagovian account of dynamics. In: Gibson, M., Howell, J. (eds.) 16th Semantics and Linguistic Theory Conference (SALT16). pp. 148–155. CLC Publications, University of Tokyo (2006)

12. Hindley, J.R.: The principal type-scheme of an object in combinatory logic. Transactions of the American Mathematical Society 146, 29–60 (1969)

13. Kamp, H.: A theory of truth and semantic representation. In: Groenendijk, J., Janssen, T.M., Stokhof, M. (eds.) Formal Methods in the Study of Language. Mathematical Centre Tract 135, Amsterdam (1981)

14. Karttunen, L.: Discourse referents. In: McCawley, J.D. (ed.) Syntax and Semantics 7: Notes from the Linguistic Underground, vol. 7, pp. 363–85. Academic Press, New York (1976)

15. Krahmer, E., Piwek, P.: Presupposition projection as proof construction. In: Bunt, H., Muskens, R. (eds.) Computing Meanings: Current Issues in Computational Semantics. Studies in Linguistics Philosophy Series, Kluwer Academic Publishers, Dordrecht (1999)

16. Krause, P.: Presupposition and abduction in type theory. In: E., K., Manandhar, S., Nutt, W., Siekman, J. (eds.) Edinburgh Conference on Computational Logic and Natural Language Processing. Edinburgh: HCRC (1995)

17. Martin-Lof, P.: Intuitionistic Type Theory, vol. 17. Italy: Bibliopolis, Naples (1984), Sambin, Giovanni (ed.)

18. Mineshima, K.: A presuppositional analysis of definite descriptions in proof theory. In: Satoh, K., Inokuchi, A., Nagao, K., Kawamura, T. (eds.) JSAI 2007 LNAI, vol. 4914, pp. 214–227. Springer, Heidelberg (2008)

19. Piwek, P., Krahmer, E.: Presuppositions in context: Constructing bridges. In: Bonzon, P., Cavalcanti, M., Nossum, R. (eds.) Formal Aspects of Context. Applied Logic Series, Kluwer Academic Publishers, Dordrecht (2000)

20. Prawitz, D.: Intuitionistic logic: A philosophical challenge. In: von Wright, G. (ed.) Logics and Philosophy. Martinus Nijhoff, The Hague (1980)

21. Ranta, A.: Type-Theoretical Grammar. Oxford University Press (1994)

22. van der Sandt, R.: Presupposition projection as anaphora resolution. Journal of Semantics 9, 333–377 (1992)

23. Steedman, M.J.: The Syntactic Process (Language, Speech, and Communication). The MIT Press, Cambridge (2000)

24. Sundholm, G.: Proof theory and meaning. In: Gabbay, D., Guenthner, F. (eds.) Handbook of Philosophical Logic, vol. III, pp. 471–506. Kluwer, Reidel (1986)

25. Sundholm, G.: Constructive generalized quantifiers. Synthese 79, 1–12 (1989)