Upload
truonglien
View
224
Download
0
Embed Size (px)
Citation preview
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
The XDG Development Kit
Ralph Debusmann
Programming Systems Lab, Saarbrücken
IGK Colloquium, December 16th, 2004
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Overview
1 Introduction
2 Extensible Dependency Grammar (XDG)
3 XDG Development Kit (XDK)
4 Conclusions
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Overview
1 Introduction
2 Extensible Dependency Grammar (XDG)
3 XDG Development Kit (XDK)
4 Conclusions
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Declarative Grammar Formalisms
Declarative Grammar Formalisms
specify linguistic knowledge independently from processingparsers/generators: can be generically created for allgrammars
ExamplesLFG (Bresnan 2001)HPSG (Pollard/Sag 1994)TAG (Joshi 1987)CCG (Steedman 2000)
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Declarative Grammar Formalisms
Declarative Grammar Formalisms
specify linguistic knowledge independently from processingparsers/generators: can be generically created for allgrammars
ExamplesLFG (Bresnan 2001)HPSG (Pollard/Sag 1994)TAG (Joshi 1987)CCG (Steedman 2000)
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Grammar Development Systems
Grammar Development Systems
tools for grammar creationparsersgenerators
ExamplesXLE (Kaplan/Maxwell 1996) for LFGLKB (Copestake 2002) for HSPGXTAG (XTAG Group 2001) for TAGOpenCCG (White 2004) for CCG
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Grammar Development Systems
Grammar Development Systems
tools for grammar creationparsersgenerators
ExamplesXLE (Kaplan/Maxwell 1996) for LFGLKB (Copestake 2002) for HSPGXTAG (XTAG Group 2001) for TAGOpenCCG (White 2004) for CCG
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Existing Formalisms and Word Order
Existing Formalisms and Word Order
languages with freer word order than English (e.g.German, Czech, Hindi etc.) pose problemsSmolka (Smolka/Uszkoreit 1996): Could more advancedconstraint programming techniques improve linguisticprocessing?
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Axiomatization of Dependency Trees
Axiomatization of Dependency Trees
(Duchier 1999, 2003): axiomatization of valid dependencytrees using finite set constraintsparsing: reduced to constraint programminggrammar formalism: Topological Dependency Grammar(TDG) (Duchier/Debusmann 2001)elegant new treatment of free word order
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Extensible Dependency Grammar (XDG)
Extensible Dependency Grammar (XDG)
generalization of TDG (Debusmann et al. 2004)graph description language for modeling arbitrary manylevels of linguistic structuresame parsing methods by constraint programming(Duchier 1999, 2003)goes beyond syntax:
semantics(Debusmann/Duchier/Koller/Kuhlmann/Smolka/Thater2004)information structure (Debusmann/Postolache/Traat 2005),out of an IGK-project also with Ciprian Gerstenberger andStefan Thater
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
XDG Grammar Development Kit (XDK)
XDG Grammar Development Kit (XDK)
first grammar development system for XDGimplemented in Mozart/Oz, published in the Mozart GlobalUser Library (MOGUL)
Facilitiesnew abstract lexicon languagegrammar file compilergraphical interfacessolver for parsing and generation
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
XDG Grammar Development Kit (XDK)
XDG Grammar Development Kit (XDK)
first grammar development system for XDGimplemented in Mozart/Oz, published in the Mozart GlobalUser Library (MOGUL)
Facilitiesnew abstract lexicon languagegrammar file compilergraphical interfacessolver for parsing and generation
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Overview
1 Introduction
2 Extensible Dependency Grammar (XDG)
3 XDG Development Kit (XDK)
4 Conclusions
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphs
Graphs
XDG describes labeled graphsuses the linguistic notion of dependency grammar
Example
every programmer should like Mozart
det obj
vcompsubj
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphs
Graphs
XDG describes labeled graphsuses the linguistic notion of dependency grammar
Example
every programmer should like Mozart
det obj
vcompsubj
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphs
Multiple Graphs
XDG typically describes an arbitrary number of graphscalled dimensionssame set of nodes, different edges
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphs
Example
Syntax tree
every programmer should like Mozart
det obj
vcompsubj
Semantic dag
every programmer should like Mozart
patag
propag
det
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphs
Example
Syntax tree
every programmer should like Mozart
det obj
vcompsubj
Semantic dag
every programmer should like Mozart
patag
propag
det
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Well-Formedness Conditions
Well-Formedness Conditions
interaction of principles and the lexiconprinciples: restrictions on one or more dimensionssubset of an extensible principle librarylexicon: feature structures controlling the principles
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Well-Formedness Conditions
Example Principles
tree: dimension i must be a treedag: dimension i must be a dagvalency: for each node on dimension i, the incoming edgesmust be licensed by the in specification, and the outgoingedges by the out specificationorder: constrains the order of words on dimension i, e.g.subjects precede objectslinking: constrains how arguments on dimension i
(semantics) must be realized on dimension j (syntax), e.g.agents are realized as subjects
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Well-Formedness Conditions
Example Lexical Entry
lexical entry for like, controls valency and linking principles:
like =
syn :
[
in : {vcomp?}out : {obj!}
]
sem :
in : {prop?}out : {ag!, pat!}link : {ag : {subj}
pat : {obj}
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Overview
1 Introduction
2 Extensible Dependency Grammar (XDG)
3 XDG Development Kit (XDK)
4 Conclusions
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
XDK Features
XDK Features
new abstract lexicon languagegrammar file compilergraphical interfacessolver for parsing and generation
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Lexicon Language
XDG: linguistic information mostly specified in the lexiconbut: lexicon grows huge even for medium-sized grammarsneed facilities for adequate modularization andfactorizationtypes: specify feature structures, define combinationoperationmetagrammar (Crabbe/Duchier 2004): abstract descriptionlanguage for lexicon construction
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Types
each type: set L and partial function u : L × L → L
(combination operation of L)u: typically greatest lower bound
Supported typesdomains, records, valencies, sets, tuples, strings
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Types
each type: set L and partial function u : L × L → L
(combination operation of L)u: typically greatest lower bound
Supported typesdomains, records, valencies, sets, tuples, strings
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Domain Types
e.g. set of edge labels:
syn.label = {det, subj, obj, vcomp}
combination operation: a u a = a, a u b undefined for a 6= b
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Record Types
given set of features (fi)i=1...n and types Ti = (Li,ui):
{f1 : v1, . . . , fn : vn}
where vi ∈ Li.combination operation defined feature-wise:
{f1 : v1, . . . , fn : vn} u {f1 : v′
1, . . . , fn : v′
n} ={f1 : v1 u1 v′
1, . . . , fn : vn un v′
n}
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Valency Types
e.g. in and out specifications:
syn.valency = valency(syn.label)
defines syn.valency to be the record type:
{det : mode, subj : mode, obj : mode, vcomp : mode}
mode = {0, ?, !, ∗}
mode combination operation (specialization):
0 u x = x ∗ u! =! ∗ u? =? ?u! =!
convenient notation:
{det : 0, subj :!, obj :?, vcomp : 0} = {subj!, obj?}
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Metagrammar
abstract description language for lexicon constructionlexical classes:
Class ::= ClassName → ClassBody
class body:
ClassBody ::= ClassBody1&ClassBody
2
| ClassBody1| ClassBody
2
| ClassName| partialLexicalEntry
&: combination operation|: non-deterministic choiceClassName: class reference
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Example (1)
finite verbs can be roots or the head of a relative clause:
root →[
syn :[
in : {}] ]
rel →[
syn :[
in : {relcl?}] ]
finite → root | rel
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Example (2)
finite verbs may be either intransitive, transitive orditransitive:
intr →[
syn :[
out : {subj!}] ]
tr → intr &[
syn :[
out : {obj!}] ]
ditr → tr &[
syn :[
out : {iobj!}] ]
verb → intr | tr | ditr
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Lexicon Language
Example (3)
finite verb:
finiteVerb → finite & verb= (root | rel) & (intr | tr | ditr)
all possibilities of
(root & intr) (root & tr) (root & ditr)(rel & intr) (rel & tr) (rel & ditr)
e.g.:
rel & ditr →
[
syn :
[
in : {relcl?}out : {subj!, obj!, iobj!}
] ]
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Grammar File Compiler
Grammar File Compiler
fast static grammar checkerfast grammar file compilationprepared for very large grammars (GNU GDBM support)three concrete syntaxes for different purposes:
XML language: automated grammar developmentUser Language (UL): handcrafted grammarsIntermediate Language (IL): record-based language,tailored for Mozart/Oz and further processing within theXDK
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphical Interfaces
Graphical Interfaces
comprehensive graphical user interface (GUI)solver search tree visualization: Oz Explorer (Schulte1997), IOzSeF (Tack 2003)visualization of partial/full analyses: output library:
Tcl/Tk dag displayLaTeX dag output (using Denys Duchier’s dtree.sty)internal solver language output using the Oz Inspector(Brunklaus 2000)
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphical Interfaces
GUI
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphical Interfaces
Oz Explorer
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Graphical Interfaces
Dag display
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Solver
Solver
based on axiomatization of dependency parsing in(Duchier 1999, 2003)factorized into modular, extensible principle libraryprinciples: sets of constraint functorse.g. valency principle: in constraint and out constraintstarting sequence regulated by global constraint prioritiesto increase efficiency
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Solver
Preferences and Search
idea: guide the search for solutions by external knowledgesources: oraclesidea by Thorsten Brants and Denys Duchier, extended in(Dienes et al. 2003)oracles interact with solver using socketsXDK: supports new standard oracle architecture createdby Marco Kuhlmann
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Overview
1 Introduction
2 Extensible Dependency Grammar (XDG)
3 XDG Development Kit (XDK)
4 Conclusions
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Summary
Summary
introduced XDG Development Kit (XDK)new lexicon specification languagegrammar file compilergraphical interfacessolver for parsing and generationextensive documentation (200+ pages), PDF, PS, HTML,GNU info
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
Future Work
Future Work
solver: fairly efficient for handcrafted grammars, but not forautomatically generated oneswhy? grammar encoding or solver or both?theoretical investigation of fragments of XDGintegration of the new faster GECODE constraint library(Christian Schulte, Guido Tack, Gabor Szokoli)super-tagging (lexicon disambiguation beforeparsing/generation)
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Joan Bresnan.Lexical Functional Syntax.Blackwell, 2001.
Thorsten Brunklaus.Der Oz Inspector — Browsen: Interaktiver, einfacher,effizienter.Diploma thesis, Saarland University, 2000.
Ann Copestake and Dan Flickinger.An Open-Source Grammar Development Environment andBroad-Coverage English Grammar Using HPSG.In Conference on Language Resources and Evaluation,Athens/GRE, 2000.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Benoit Crabbé and Denys Duchier.Metagrammar Redux.In Proceedings of the International Workshop on ConstraintSolving and Language Processing, Roskilde/DK, 2004.
Ralph Debusmann, Denys Duchier, Alexander Koller,Marco Kuhlmann, Gert Smolka, and Stefan Thater.A Relational Syntax-Semantics Interface Based onDependency Grammar.In Proceedings of COLING 2004, Geneva/SUI, 2004.
Ralph Debusmann, Oana Postolache, and Maarika Traat.A Modular Account of Information Structure in ExtensibleDependency Grammar.In Proceedings of the CICLING 2005 Conference, MexicoCity/MEX, 2005. Springer.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Peter Dienes, Alexander Koller, and Marco Kuhlmann.Statistical A* Dependency Parsing.In Prospects and Advances in the Syntax/SemanticsInterface, Nancy/FRA, 2003.
Denys Duchier.Axiomatizing Dependency Parsing Using Set Constraints.In Proceedings of MOL 6, Orlando/USA, 1999.
Denys Duchier.Configuration of Labeled Trees under LexicalizedConstraints and Principles.Research on Language and Computation, 1(3–4):307–336,2003.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Denys Duchier and Ralph Debusmann.Topological Dependency Trees: A Constraint-BasedAccount of Linear Precedence.In Proceedings of ACL 2001, Toulouse/FRA, 2001.
Aravind K. Joshi.An Introduction to Tree-Adjoining Grammars.In Alexis Manaster-Ramer, editor, Mathematics ofLanguage, pages 87–115. John Benjamins,Amsterdam/NL, 1987.
Ronald M. Kaplan and John T. Maxwell.LFG Grammar Writer’s Workbench.Technical report, Xerox PARC, 1996.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Carl Pollard and Ivan A. Sag.Head-Driven Phrase Structure Grammar.University of Chicago Press, Chicago/USA, 1994.
Christian Schulte.Oz Explorer: A Visual Constraint Programming Tool.In Lee Naish, editor, Proceedings of the FourteenthInternational Conference on Logic Programming, pages286–300, Leuven/BEL, jul 1997. MIT Press.
Gert Smolka and Hans Uszkoreit.NEGRA Project of the Collaborative Research Centre(SFB) 378, 1996–2001.Saarland University.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
Mark Steedman.The Syntactic Process.MIT Press, Cambridge/USA, 2000.
Guido Tack.Linearisation, Minimisation and Transformation of DataGraphs with Transients.Diploma thesis, Saarland University, 2003.
Michael White.Reining in CCG Chart Realization.In Proceedings of the 3rd International Conference onNatural Language Generation, 2004.
Introduction Extensible Dependency Grammar (XDG) XDG Development Kit (XDK) Conclusions
References
References
XTAG Research Group.A Lexicalized Tree Adjoining Grammar for English.Technical Report IRCS-01-03, IRCS, University ofPennsylvania, 2001.