A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction



The 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012)

July 11th, 2012, Jeju

Seokhwan Kim (Institute for Infocomm Research)

Gary Geunbae Lee (POSTECH)

Contents

• Introduction

• Methods

Cross-lingual Annotation Projection for Relation Extraction

Graph-based Projection Approach

• Evaluation

• Conclusions


Problem Definition

• Relation Extraction

To identify semantic relations between a pair of entities

Considered as a classification problem


Example: Barack Obama [PER] was born in Honolulu [LOC] , Hawaii [LOC] .

Relation: Birthplace (Barack Obama, Honolulu)

Related Work (1)

• Supervised Learning

Many supervised machine learning approaches have been successfully applied

• (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta and Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al., 2006)

• Semi-supervised Learning

To obtain the annotations of unlabeled instances from the seed information

• (Brin, 1999; Riloff and Jones, 1999; Agichtein and Gravano, 2000; Sudo et al., 2003; Yangarber, 2003; Stevenson and Greenwood, 2006; Zhang, 2004; Chen et al., 2006; Zhou et al., 2009)


Motivation

• Resources for Relation Extraction

Supervised/Semi-supervised Approaches

• Labeled corpora for supervised learning

• Seed instances for semi-supervised learning

• Available for only a few languages

ACE 2003 Multilingual Training Dataset

• English (252 articles)

• Chinese (221 articles)

• Arabic (206 articles)

• No resources for other languages

Korean


Related Work (2)

• Self-supervised Learning

To obtain the annotated dataset without any human effort

Using the information obtained from external resources

• Heuristic-based Method (Banko et al., 2007; Banko et al., 2008)

• Wikipedia-based Methods (Wu and Weld, 2010)

• Cross-lingual Annotation Projection

To leverage parallel corpora to project the relation annotations on the resource-rich source language to the resource-poor target language (Kim et al., 2010; Kim et al., 2011)


Overall Architecture

[Figure: overall architecture, operating on a parallel corpus]

• Annotation (source language): Sentences in Ls → Preprocessing (POS Tagging, Parsing) → NER → Relation Extraction → Annotated Sentences in Ls

• Projection (target language): Sentences in Lt → Preprocessing (POS Tagging, Parsing) → Word Alignment → Projection → Annotated Sentences in Lt

Direct Projection

• Annotation

• Projection


fE (<Barack Obama, Honolulu>) = 1

Barack Obama was born in Honolulu , Hawaii .

버락 오바마 (beo-rak-o-ba-ma) 는 (neun) 하와이 (ha-wa-i) 의 (ui) 호놀룰루 (ho-nol-rul-ru) 에서 (e-seo) 태어났다 (tae-eo-nat-da)

fK (<버락 오바마, 호놀룰루>) = 1

(Kim et al., 2010)
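The direct projection step in the example above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the span representation, the function names `project_span` and `project_relation`, and the toy word alignment are all assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # token span [start, end); a hypothetical representation


def project_span(span: Span, alignment: Dict[int, List[int]]) -> Optional[Span]:
    """Map a source span to the smallest target span covering its aligned tokens."""
    targets = [t for s in range(span[0], span[1]) for t in alignment.get(s, [])]
    if not targets:
        return None  # no aligned target token: the entity cannot be projected
    return (min(targets), max(targets) + 1)


def project_relation(arg1: Span, arg2: Span, label: int,
                     alignment: Dict[int, List[int]]):
    """Copy the source label onto the aligned target entity pair, if both align."""
    t1 = project_span(arg1, alignment)
    t2 = project_span(arg2, alignment)
    if t1 is None or t2 is None:
        return None
    return (t1, t2, label)


# English: Barack(0) Obama(1) was(2) born(3) in(4) Honolulu(5) ,(6) Hawaii(7) .(8)
# Korean:  버락 오바마(0) 는(1) 하와이(2) 의(3) 호놀룰루(4) 에서(5) 태어났다(6)
# Toy alignment (assumed for illustration; a real one would come from GIZA++).
alignment = {0: [0], 1: [0], 2: [6], 3: [6], 4: [5], 5: [4], 7: [2]}

projected = project_relation((0, 2), (5, 6), 1, alignment)
```

Here `projected` holds the Korean spans for 버락 오바마 and 호놀룰루 with the label 1 copied from the English side. Because the label is copied in one pass, any alignment or NER error propagates directly to the target annotation.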

Limitations of Direct Projection

• The direct projection approach is still vulnerable to erroneous inputs generated by preprocessors

• Main causes of this limitation

Considers alignment between entity candidates only, without any contextual information

Performed in a single pass


Graph-based Learning

• Semi-supervised learning algorithm

• Defining a graph

The nodes represent labeled and unlabeled examples in a dataset

The edges reflect the similarity of examples

• Learning a labeling function in an iterative manner

It should be close to the given labels on the similar labeled nodes

It should be smooth on the whole graph

• Related Work

Graph-based Learning for Relation Extraction (Chen et al., 2006)

Bilingual projection of POS tagging (Das and Petrov, 2011)


Graph Construction

• Graph Nodes

Instance Nodes

• Defined for all pairs of entity candidates in both languages

• Each instance node has a soft label vector Y = [y+ y-]

Context Nodes

• For identifying the relation descriptors of the positive instances

• Defined for each trigram located between a semantically related entity pair

• Each context node has a soft label vector Y = [y+ y-]


Example: <ARG1> was born in <ARG2>

Context nodes (trigrams): "<ARG1> was born", "was born in", "born in <ARG2>"
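The trigram contexts in this example can be generated with a short sliding-window function. This is a sketch under the assumption, taken from the example, that the `<ARG1>` and `<ARG2>` placeholders are included in the windowed sequence; the function name `context_trigrams` is hypothetical.

```python
from typing import List, Sequence, Tuple


def context_trigrams(between_tokens: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return all trigrams over <ARG1> + tokens between the entities + <ARG2>."""
    seq = ["<ARG1>"] + list(between_tokens) + ["<ARG2>"]
    # Slide a window of width 3 over the sequence.
    return [tuple(seq[i:i + 3]) for i in range(len(seq) - 2)]


trigrams = context_trigrams(["was", "born", "in"])
```

For the tokens "was born in", `trigrams` contains the three context nodes listed in the example.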

Graph Construction

• Edge Weights

Between instance node and context node in the same language

$$w(v_{i,j}, u_k) = \begin{cases} 1 & \text{if } v_{i,j} \text{ has } u_k \text{ as a contextual subsequence} \\ 0 & \text{otherwise} \end{cases}$$

Between context nodes in a language

$$w(u_k, u_l) = J(u_k, u_l) = \frac{|u_k \cap u_l|}{|u_k \cup u_l|}$$

Between context nodes in source and target languages

$$w(u_s^k, u_t^l) = \frac{\mathrm{count}(u_s^k, u_t^l)}{\sum_{u_t^m} \mathrm{count}(u_s^k, u_t^m)}$$
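The two non-trivial edge weights above, the Jaccard coefficient between context nodes in one language and the normalized co-occurrence count between source and target context nodes, can be sketched as follows. Treating each trigram as a set of tokens for the Jaccard computation is an assumption of this sketch, as are the function names and the input format (a list of observed source/target context pairs).

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def jaccard(u_k: Sequence[str], u_l: Sequence[str]) -> float:
    """Jaccard coefficient |u_k ∩ u_l| / |u_k ∪ u_l| over token sets (assumed)."""
    a, b = set(u_k), set(u_l)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_lingual_weights(
    aligned_pairs: Iterable[Tuple[Hashable, Hashable]]
) -> Dict[Tuple[Hashable, Hashable], float]:
    """w(u_s, u_t) = count(u_s, u_t) / sum over u_t' of count(u_s, u_t')."""
    counts = Counter(aligned_pairs)
    totals: Counter = Counter()
    for (u_s, _), c in counts.items():
        totals[u_s] += c  # total count of each source context
    return {(u_s, u_t): c / totals[u_s] for (u_s, u_t), c in counts.items()}


same_lang = jaccard(("<ARG1>", "was", "born"), ("was", "born", "in"))
cross = cross_lingual_weights([("a", "x"), ("a", "x"), ("a", "y")])
```

With this normalization, the weights leaving any one source context node sum to 1, so they can be read as a distribution over target context nodes.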

Graph Construction

• Example

15

Label Propagation

• Algorithm

Input

• A transition matrix T

• An initial label matrix Y0

Output

• The updated label matrix Yt


Initialize T

Normalize T

Initialize Y

Update Y
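The four steps listed here (initialize T, normalize T, initialize Y, update Y) can be sketched in plain Python. This is not the Junto implementation used in the work: row normalization of T, re-clamping the seed labels after every update, and a fixed iteration count are assumptions of this sketch, and all names are hypothetical.

```python
from typing import List, Set

Matrix = List[List[float]]


def row_normalize(W: Matrix) -> Matrix:
    """Turn edge weights into a transition matrix whose rows sum to 1."""
    T = []
    for row in W:
        s = sum(row)
        T.append([w / s if s > 0 else 0.0 for w in row])
    return T


def propagate(W: Matrix, Y0: Matrix, seeds: Set[int], iterations: int = 50) -> Matrix:
    """Repeat Y <- T Y, resetting seed rows to their initial labels each round."""
    T = row_normalize(W)
    Y = [list(y) for y in Y0]
    n, n_labels = len(Y), len(Y[0])
    for _ in range(iterations):
        new_Y = [
            [sum(T[i][j] * Y[j][c] for j in range(n)) for c in range(n_labels)]
            for i in range(n)
        ]
        for i in seeds:
            new_Y[i] = list(Y0[i])  # clamp labeled nodes (assumption)
        Y = new_Y
    return Y


# Chain graph 0 - 1 - 2: node 0 is a positive seed, node 2 a negative seed.
W = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
Y0 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # soft label vectors [y+, y-]
Y = propagate(W, Y0, seeds={0, 2})
```

In this toy graph the unlabeled middle node ends up with equal positive and negative mass, since it is equally connected to both seeds.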

Label Propagation

• Executed in three phases


1st phase

2nd phase

3rd phase


Implementation

• Dataset

English-Korean parallel corpus

• 266,982 bi-sentence pairs in English and Korean

• Aligned by GIZA++

• Annotation

ReVerb (Fader et al., 2011)

• English Open IE system

• Label Propagation

Junto Label Propagation Toolkit

• Learning

Tree kernel-based SVM classifier

• Shortest path dependency kernel (Bunescu and Mooney, 2005)

• SVM-Light (Joachims, 1998)


Evaluation

• Dataset

Manually annotated Korean dataset

• Obtained from the Web following Bunescu and Mooney (2007)

• 500 sentences with manual annotations for four relation types

Acquisition

Birthplace

Inventor Of

Won Prize

• Evaluation Metrics

Precision/Recall/F-measure
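For reference, the three metrics can be computed as below. The function names are hypothetical, and the sketch assumes the balanced F-measure (harmonic mean of precision and recall), which is consistent with the precision, recall, and F values reported for the projection-based approach in this deck.

```python
from typing import Tuple


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and balanced F-measure from raw counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def f_measure(p: float, r: float) -> float:
    """Balanced F-measure from precision and recall (same scale as the inputs)."""
    return 2 * p * r / (p + r) if p + r else 0.0


# Reported projection-based P = 67.69 and R = 87.41 give F = 76.30.
f_projection = round(f_measure(67.69, 87.41), 2)
```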


Experimental Results

• Direct Projection vs. Graph-based Projection

Type         | Direct Projection   | Graph-based Projection
             | P     R     F       | P     R     F
Acquisition  | 51.6  87.7  64.9    | 55.3  91.2  68.9
Birthplace   | 69.8  84.5  76.4    | 73.8  87.3  80.0
Inventor Of  | 62.4  85.3  72.1    | 66.3  89.7  76.3
Won Prize    | 73.3  80.5  76.7    | 76.4  82.9  79.5
Total        | 63.9  84.2  72.7    | 67.7  87.4  76.3

Experimental Results

• Comparisons to other self-supervised approaches

Heuristic-based Approach (Banko et al., 2007; Banko et al., 2008)

• Korean Treebank and Syntactic Heuristics

Wikipedia-based Approach (Wu and Weld, 2010)

• Korean Wikipedia articles and Infoboxes


Approach          | P      R      F
Heuristic-based   | 92.31  17.27  29.09
Wikipedia-based   | 66.67  66.91  66.79
Projection-based  | 67.69  87.41  76.30


Conclusion

• Summary

A graph-based projection approach for relation extraction

• Label propagation algorithm

• On a graph that represents the instance and context features of both the source and target languages

Experimental results show that our approach improves relation extraction performance compared to direct projection and other self-supervised approaches

• Future work

To reduce the high complexity of the approach

To handle a more expanded graph structure to improve extraction performance


Q&A
