Broad-Coverage Transition-Based UCCA Parsing

Daniel Hershcovich 1,2 & Omri Abend 2 & Ari Rappoport 2

1 Edmond and Lily Safra Center for Brain Sciences
2 School of Computer Science and Engineering
Hebrew University of Jerusalem

Learning Club, November 17, 2016



Outline

1 Semantic Annotation Schemes

2 Transition-based UCCA Parsing

3 Experiments

4 Future Work

5 Conclusions


Semantic Annotation Schemes

Represent semantic structure of text as a graph.
Used by NLP applications for features and structure, providing information such as who did what to whom?

Examples:
  Semantic Role Labeling
  Semantic Dependencies
  Abstract Meaning Representation
  Universal Conceptual Cognitive Annotation


Semantic Role Labeling (SRL)

Annotate predicates and their arguments as a flat structure. Examples:

            After graduation ,   John    moved     to Paris
PropBank    AM-TMP               A1      move.01   A2
FrameNet    Time                 Theme   Motion    Goal


Semantic Dependency Parsing (SDP)

Graph on the text tokens, including internal structure of arguments. Examples:

DELPH-IN MRS-derived bi-lexical dependencies (DM)
[Figure: DM graph over "After graduation , John moved to Paris", with a top node and edges labeled ARG1 and ARG2.]

Prague Dependency Treebank tectogrammatical layer (PSD)
[Figure: PSD graph over the same sentence, with a top node and edges labeled TWHEN, ACT-arg and DIR3-arg.]


Abstract Meaning Representation (AMR)

Graph on knowledge resource entries inferred from the tokens.

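[Figure: AMR graph for "After graduation , John moved to Paris", with concept nodes move-01, person, city, name, after and graduate-01, constants John and Paris, and edges labeled ARG0, ARG2, time, name and op1.]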


Universal Conceptual Cognitive Annotation (UCCA)

Cross-linguistically applicable semantic representation scheme.
Builds on typological [Dixon, 2012] and Cognitive Linguistics literature [Croft and Cruse, 2004].
Demonstrated applicability to English, French, German & Czech.
Support for rapid annotation.
Semantic stability in translation [Sulem et al., 2015].
Proven useful for machine translation evaluation [Birch et al., 2016].
Applicability has been so far limited by the absence of a parser.


Structural Properties

(1) non-terminal nodes, (2) reentrancy, (3) discontinuity

[Figure: three UCCA graphs illustrating these properties, for "John and Mary went home", "John gave everything up" and "John decided to take a quick shower", with edges labeled A, P, C, N, F and D.]


UCCA Corpora

                 Wiki                             20K Leagues
                 Train      Dev       Test
# passages       300        34        34          154
# sentences      4267       453       518         506
# nodes          298,665    33,263    37,262      29,315
% terminal       42.95      43.62     42.89       42.09
% non-term.      58.30      57.46     58.31       60.01
% discont.       0.53       0.51      0.47        0.81
% reentrant      2.31       1.76      2.18        2.03
# edges          287,381    32,015    35,846      27,749
% primary        98.29      98.81     98.75       97.73
% remote         1.71       1.19      1.25        2.27
Average per non-terminal node
# children       1.67       1.68      1.66        1.61

Excluding root node, implicit nodes, and linkage nodes and edges.
Corpora are available at http://www.cs.huji.ac.il/~oabend/ucca.html

P  process
S  state
A  participant
L  linker
H  linked scene
C  center
E  elaborator
D  adverbial
R  relator
N  connector
U  punctuation
F  function unit
G  ground

Table: Edge labels.


Transition-based UCCA Parsing

Transition-Based Parsing

Parse a sentence $w_1, \ldots, w_n$ into a graph $G = (V, E, \ell)$ incrementally, using a buffer $B$ and a stack $S$.
Transition-based parsers apply a transition at each step to the parser state, which consists of a buffer $B$ of tokens and nodes to be processed, a stack $S$ of nodes currently being processed, and a graph $G = (V, E, \ell)$ of constructed nodes and edges. A classifier selects the next transition based on features of the current state. It is trained using an oracle based on gold-standard annotations.

[Figure: parser state for "After graduation John moved to Paris", showing the stack S, the buffer B, and the partial graph G with edges labeled L, P and H.]

Transitions for UCCA parsing:
Shift, Reduce, Node_X, Left-Edge_X, Right-Edge_X, Left-Remote_X, Right-Remote_X, Swap, Finish


S: John
B: moved to Paris
G: an L edge to "After", and an H edge to a new node whose P child is "graduation"

Transitions: Shift, Right-Edge_L, Reduce, Shift, Node_P, Reduce, Shift, Right-Edge_H, Shift

Figure: Example for intermediate state during transition-based parsing.


TUPA (Transition-Based UCCA Parser)

Our parser supports the structural properties of UCCA (parser code available at https://github.com/danielhers/ucca).

Before Transition                        Transition        After Transition
Stack       Buffer    Nodes   Edges                        Stack       Buffer    Nodes      Edges
S           x | B     V       E          Shift             S | x       B         V          E
S | x       B         V       E          Reduce            S           B         V          E
S | x       B         V       E          Node_X            S | x       y | B     V ∪ {y}    E ∪ {(y, x)_X}
S | y, x    B         V       E          Left-Edge_X       S | y, x    B         V          E ∪ {(x, y)_X}
S | x, y    B         V       E          Right-Edge_X      S | x, y    B         V          E ∪ {(x, y)_X}
S | y, x    B         V       E          Left-Remote_X     S | y, x    B         V          E ∪ {(x, y)*_X}
S | x, y    B         V       E          Right-Remote_X    S | x, y    B         V          E ∪ {(x, y)*_X}
S | x, y    B         V       E          Swap              S | y       x | B     V          E
[root]      ∅         V       E          Finish            ∅           ∅         V          E

Table: TUPA transitions. (·, ·)_X denotes a primary X-labeled edge, and (·, ·)*_X a remote X-labeled edge.
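The transition table can be read as a small state machine. The following is a minimal sketch in Python, not the TUPA implementation: the `State` class, the `apply` function, the `n1`-style names for new nodes, and the string `"root"` are all assumptions made for illustration. It replays the nine transitions from the intermediate-state example on "After graduation John moved to Paris" (tokens taken as shown there, without the comma).

```python
from dataclasses import dataclass, field


@dataclass
class State:
    stack: list
    buffer: list
    nodes: list
    edges: list = field(default_factory=list)  # (parent, child, label, is_remote)
    fresh: int = 0  # counter used to name new non-terminal nodes


def apply(state, name, label=None):
    """Apply one transition in place, following the rows of the transition table."""
    s, b = state.stack, state.buffer
    if name == "Shift":            # move the first buffer item onto the stack
        s.append(b.pop(0))
    elif name == "Reduce":         # drop the stack top
        s.pop()
    elif name == "Node":           # new parent y of the stack top x, placed on the buffer
        state.fresh += 1
        y = f"n{state.fresh}"      # hypothetical naming scheme for new nodes
        state.nodes.append(y)
        state.edges.append((y, s[-1], label, False))
        b.insert(0, y)
    elif name in ("Left-Edge", "Left-Remote"):    # stack top is the parent
        state.edges.append((s[-1], s[-2], label, name == "Left-Remote"))
    elif name in ("Right-Edge", "Right-Remote"):  # stack top is the child
        state.edges.append((s[-2], s[-1], label, name == "Right-Remote"))
    elif name == "Swap":           # move the second stack item back to the buffer
        b.insert(0, s.pop(-2))
    elif name == "Finish":         # requires [root] on the stack and an empty buffer
        assert s == ["root"] and not b
        s.clear()
    else:
        raise ValueError(f"unknown transition: {name}")
    return state


tokens = ["After", "graduation", "John", "moved", "to", "Paris"]
state = State(stack=["root"], buffer=list(tokens), nodes=["root"] + tokens)
sequence = [("Shift", None), ("Right-Edge", "L"), ("Reduce", None), ("Shift", None),
            ("Node", "P"), ("Reduce", None), ("Shift", None), ("Right-Edge", "H"),
            ("Shift", None)]
for name, label in sequence:
    apply(state, name, label)

print(state.stack)
print(state.buffer)
print(state.edges)
```

After the replay, "John" is on top of the stack (above the root and the new node), "moved to Paris" remains in the buffer, and the graph holds the L, P and H edges, which is consistent with the intermediate state in the example figure.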


TUPA Classifiers

We experiment with three classifiers:

TUPA_sparse: Perceptron, sparse features: words, POS tags & edge label combinations.
TUPA_dense: Perceptron, dense embedding features: word2vec [Mikolov et al., 2013] for words, else random.
TUPA_NN: 2-layer MLP, learned embedding features, logistic activation + dropout.

For all classifiers, inference is performed greedily, i.e., without beam search.
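Greedy inference means that at every step the single highest-scoring legal transition is applied, with no alternatives kept. The sketch below illustrates this with a perceptron-style sparse linear scorer over a toy Shift/Reduce system. Every name in it (`greedy_parse`, `legal`, `features`, `step`, `is_final`, and the weights) is hypothetical and chosen for illustration; it does not reproduce TUPA's features or transition set.

```python
def greedy_parse(state, legal, features, weights, step, is_final):
    """Repeatedly apply the best-scoring legal transition (no beam search)."""
    history = []
    while not is_final(state):
        feats = features(state)

        def score(transition):
            # Sparse linear model: sum the weights of (feature, transition) pairs.
            return sum(weights.get((f, transition), 0.0) for f in feats)

        best = max(legal(state), key=score)
        state = step(state, best)
        history.append(best)
    return state, history


# Toy system: a state is a (stack, buffer) pair of tuples.
def legal(state):
    stack, buffer = state
    return (["Shift"] if buffer else []) + (["Reduce"] if stack else [])


def features(state):
    stack, buffer = state
    return ["top=" + (stack[-1] if stack else "<none>"),
            "next=" + (buffer[0] if buffer else "<none>")]


def step(state, transition):
    stack, buffer = state
    if transition == "Shift":
        return stack + buffer[:1], buffer[1:]
    return stack[:-1], buffer  # Reduce


def is_final(state):
    return not state[0] and not state[1]


weights = {("top=a", "Reduce"): 1.0}  # made-up weight: reduce when "a" is on top
final_state, history = greedy_parse(((), ("a", "b")), legal, features, weights, step, is_final)
print(history)
```

A beam-search variant would keep several candidate states per step instead of committing to `best`; the greedy loop trades that robustness for a linear number of classifier calls.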


Experiments

Experimental Setup

We conduct our main experiment on the UCCA Wikipedia corpus, and use the English part of the UCCA Twenty Thousand Leagues Under the Sea English-French parallel corpus as out-of-domain data.


Evaluation

We report two variants of labeled precision, recall and F-score: one where we consider only primary edges, and another for remote edges. Given graphs $G_p = (V_p, E_p, \ell_p)$ and $G_g = (V_g, E_g, \ell_g)$ over terminals $W = \{w_1, \ldots, w_n\}$, the yield $y(e) \subseteq W$ of an edge $e = (u, v)$ in either graph is the set of terminals in $W$ that are descendants of $v$. The mutual edges between the graphs are:

$$M(G_p, G_g) = \{(e_1, e_2) \in E_p \times E_g \mid y(e_1) = y(e_2) \wedge \ell_p(e_1) = \ell_g(e_2)\}$$

and we define

$$LP = \frac{|M(G_p, G_g)|}{|E_p|}, \qquad LR = \frac{|M(G_p, G_g)|}{|E_g|}, \qquad LF = \frac{2 \cdot LP \cdot LR}{LP + LR}.$$
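As a worked illustration of these definitions, here is a minimal Python sketch that computes LP, LR and LF for two small graphs. The edge-list representation, the function names, the node names, and the convention that a terminal belongs to its own yield are assumptions made for the example. The predicted graph omits the word "home", so its H edge has a different yield from the gold H edge and does not count as mutual.

```python
from collections import defaultdict


def edge_yields(edges, terminals):
    """Return a (yield, label) pair for every edge (parent, child, label)."""
    children = defaultdict(list)
    for parent, child, _ in edges:
        children[parent].append(child)

    def terminals_under(node, seen):
        # Assumption: a terminal counts as part of its own yield.
        if node in seen:
            return set()
        seen.add(node)
        found = {node} if node in terminals else set()
        for child in children.get(node, ()):
            found |= terminals_under(child, seen)
        return found

    return [(frozenset(terminals_under(child, set())), label)
            for _, child, label in edges]


def labeled_scores(pred_edges, gold_edges, terminals):
    """LP, LR and LF, counting pairs of edges with equal yield and equal label."""
    pred = edge_yields(pred_edges, terminals)
    gold = edge_yields(gold_edges, terminals)
    mutual = sum(1 for p in pred for g in gold if p == g)  # |M(G_p, G_g)|
    lp = mutual / len(pred) if pred else 0.0
    lr = mutual / len(gold) if gold else 0.0
    lf = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, lf


terminals = {"John", "went", "home"}
gold = [("root", "h", "H"), ("h", "John", "A"), ("h", "went", "P"), ("h", "home", "A")]
pred = [("root", "u", "H"), ("u", "John", "A"), ("u", "went", "P")]

lp, lr, lf = labeled_scores(pred, gold, terminals)
print(lp, lr, lf)
```

Here two of the three predicted edges match gold edges, so LP = 2/3, LR = 2/4 and LF = 4/7. Matching is by yield and label only, so the internal node names (`h` and `u`) do not need to agree between the two graphs.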


Baselines

[Figure: bilexical dependency graphs over the tokens of "After graduation , John moved to Paris", "John gave everything up" and "John and Mary went home", with edges labeled L, U, A, H, R, C and N.]

Figure: Bilexical approximation for UCCA graphs.

Since there are no existing UCCA parsers, we use bilexical DAG parsers:

1 Convert UCCA into bilexical dependencies.
2 Train parsers on the resulting training set.
3 Apply trained parsers to the test set.
4 Reconstruct UCCA graphs.
5 Compare with gold standard.

Upper bounds are computed by applying the conversion both ways to the gold standard graphs and comparing to the original.


Results

TUPA_NN obtains the highest scores in nearly all metrics:

                                          Wiki (in-domain)                        20K Leagues (out-of-domain)
                                          Primary            Remote               Primary            Remote
                                          LP    LR    LF     LP    LR    LF       LP    LR    LF     LP    LR    LF
Bilexical Approximation
Upper Bound                               93.4  83.7  88.3   73.9  49.5  59.3     93.5  83.5  88.2   66.7  31.6  42.9
DAGParser [Ribeyre et al., 2014]          63.7  56.1  59.5   0.8   9.5   1.4      58    49.8  53.4   –     0     0
TurboParser [Almeida and Martins, 2015]   60.2  47.4  52.9   2.2   7.8   3.4      52.6  39    44.7   100   0.3   0.6
Direct Approach
TUPA_sparse                               64    55.6  59.5   16    11.6  13.4     60.6  53.9  57.1   20.2  10.3  13.6
TUPA_dense                                55    54.8  54.9   15.2  16.9  16       54.8  55.2  55     6     3     4
TUPA_NN                                   65    62.5  63.7   20.7  11.3  14.6     58.3  56.4  57.3   15.2  3.8   6


Tree Approximation

For completeness, we also explore lossily converting UCCA into trees, which simplifies the task for the underlying parser and lets us build on mature tree-based parsers.
Although remote edges are of pivotal importance, exploring tree approximation methods can inform the future development of DAG parsers in general and of UCCA parsers in particular.

                                   LP     LR     LF
Constituency Tree Approximation
Upper Bound                        100    100    100
uparse [Maier and Lichte, 2016]    63     64.7   63.7
Dependency Tree Approximation
Upper Bound                        93.7   83.6   88.4
MaltParser [Nivre et al., 2007]    64.9   57.9   61
LSTM Parser [Dyer et al., 2015]    74.9   66.4   70.2
Direct Tree Parsing
TUPA_sparse − Remote               65.5   57.5   61.3
TUPA_dense − Remote                57.2   57.3   57.2
TUPA_NN − Remote                   66.3   64.4   65.3

The performance is encouraging in light of UCCA's inter-annotator agreement of 80–85% F-score on primary edges [Abend and Rappoport, 2013].


Future Work

UCCA-Based Distributed Representation

Vector representation for sentences and documents, based on recursive composition on the UCCA graph.

Impact:
  General automatic semantic feature extractor for text.
  Accurate measure for text similarity.
  Understand the semantic contribution of different elements.

[Figure: UCCA graph for "After graduation , John moved to Paris", with edges labeled L, P, H, U, A, R and C.]


Conclusions

We present TUPA, the first parser for UCCA, and evaluate it in both in-domain and out-of-domain settings, showing it surpasses bilexical DAG parsers on the task of UCCA parsing.
Future work will incorporate LSTMs into TUPA, and apply the parser to more languages such as German, demonstrating the importance of broad-coverage parsing. We will also improve the conversion-based methods and explore different target representations. A UCCA parser will enable using the scheme for representation in NLP tasks.

UCCA exhibits formal properties important for semantic representation.
We present the first parser for UCCA and the first to support these properties.


References

Abend, O. and Rappoport, A. (2013). Universal Conceptual Cognitive Annotation (UCCA). In Proc. of ACL, pages 228–238.

Almeida, M. S. C. and Martins, A. F. T. (2015). Lisbon: Evaluating TurboSemanticParser on multiple languages and out-of-domain data. In Proc. of SemEval, pages 970–973, Denver, Colorado. Association for Computational Linguistics.

Birch, A., Haddow, B., Bojar, O., and Abend, O. (2016). Hume: Human UCCA-based evaluation of machine translation. arXiv preprint arXiv:1607.00030.

Croft, W. and Cruse, D. A. (2004). Cognitive linguistics. Cambridge University Press.

Dixon, R. M. W. (2010/2012). Basic linguistic theory.

Dyer, C., Ballesteros, M., Ling, W., Matthews, A., and Smith, N. A. (2015). Transition-based dependency parsing with stack long short-term memory. In Proc. of ACL, pages 334–343.

Maier, W. and Lichte, T. (2016). Discontinuous parsing with continuous trees. In Proc. of Workshop on Discontinuous Structures in NLP, pages 47–57, San Diego, California. Association for Computational Linguistics.


Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kubler, S., Marinov, S., and Marsi, E. (2007). MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(02):95–135.

Ribeyre, C., Villemonte de la Clergerie, E., and Seddah, D. (2014). Alpage: Transition-based semantic graph parsing with syntactic features. In Proc. of SemEval, pages 97–103, Dublin, Ireland.

Sulem, E., Abend, O., and Rappoport, A. (2015). Conceptual annotations preserve structure across translations: A French-English case study. In Proc. of S2MT, pages 11–22.
