Updating a Name Tagger Using Contemporary Unlabeled Data


Slides presented at ACL-IJCNLP 2009 for Cristina Mota & Ralph Grishman (2009a), “Updating a name tagger using contemporary unlabeled data,” Proc. of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, August 2009, Singapore.

Updating a Name Tagger Using Contemporary Unlabeled Data

ACL-IJCNLP 2009, Singapore, August 3rd - 5th

Cristina Mota (1,2) and Ralph Grishman (2)

(1) IST & L2F INESC-ID (Portugal)

(2) New York University (USA)

(Advisors: Ralph Grishman & Nuno Mamede)

This research was funded by Fundação para a Ciência e a Tecnologia (doctoral scholarship SFRH/BD/3237/2000)

Motivation

[Figure: F-measure (0.79 to 0.85) against time gap in years (0 to 7), with linear fit y = −0.00391x + 0.82479, R² = 0.3647]

The performance of a co-trained named entity tagger decreases as the time gap between training and test sets increases (Mota & Grishman, 2008)

Do we need to update the seeds or the unlabeled data?

Does adding more older data help?

Related Work

“More data are better data” (Church & Mercer, 1993)

Enlarge labeled data as a way of improving performance

Contemporary (labeled) data reduces out-of-vocabulary rates

Time-adaptive language model (Auzanne et al., 2000)

Generation of offline name lists (Palmer & Ostendorf, 2005)

Daily adaptation of the language model of a broadcast news transcription system (Martins et al., 2006)

Data Sets

Data sets were drawn from the Politics section of the CETEMPúblico corpus (Santos & Rocha, 2001)

Language: Portuguese

Time span: 8 years (1991-1998)

Time gap: 1 unit = 6 months

For each six-month period:

Seeds (S): names collected from the first 192 extracts*

Test data (T): next 208 extracts

Unlabeled data (U): next 7856 extracts

*1 extract = approx. 2 paragraphs
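The per-period split described above can be sketched as follows. This is a minimal illustration, assuming each six-month period is available as an ordered list of extracts; the function names `period_labels` and `split_period` are hypothetical and not from the original work. Only the three sizes (192, 208, 7856) come from the slide.

```python
from typing import Dict, List, Sequence

# Sizes taken from the slide; everything else here is illustrative.
N_SEEDS, N_TEST, N_UNLABELED = 192, 208, 7856


def period_labels(first_year: int = 1991, last_year: int = 1998) -> List[str]:
    """Return semester labels such as '91a', '91b', ..., '98b'."""
    return [f"{year % 100}{half}"
            for year in range(first_year, last_year + 1)
            for half in "ab"]


def split_period(extracts: Sequence[str]) -> Dict[str, List[str]]:
    """Split one semester's extracts, in corpus order, into S, T and U sources."""
    needed = N_SEEDS + N_TEST + N_UNLABELED
    if len(extracts) < needed:
        raise ValueError(f"need at least {needed} extracts, got {len(extracts)}")
    return {
        "S": list(extracts[:N_SEEDS]),                  # seed names are collected from these
        "T": list(extracts[N_SEEDS:N_SEEDS + N_TEST]),  # test data
        "U": list(extracts[N_SEEDS + N_TEST:needed]),   # unlabeled data
    }
```

Calling `split_period` once per label returned by `period_labels()` would yield the 16 (S, U, T) building blocks that the experiments combine.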

Named Entity Tagger

[Diagram, training: Unlabeled text → Identification → Pairs (spelling features, contextual features); Pairs + Seeds → Co-training → Spelling + contextual rules]

[Diagram, testing: Test text → Identification → Pairs (spelling features, contextual features) → Classification → Labeled pairs → Propagation → Text with classified NE]

Based on a co-training classifier (Collins & Singer, 1999)

Includes propagation step

Needs few seeds and performance is high (above 80%)

Performance is parametrized by combination of seeds, unlabeled set and test set: (S,U,T)

Tagger is evaluated after propagation with HAREM scoring programs

Update seeds or unlabeled data?

[Diagram: timeline from 91a to 98b showing seeds Si, unlabeled examples Ui, and test set Tn]

Experiment 1: Baseline (vary seeds and unlabeled data synchronously, as in Mota & Grishman (2008))

Update seeds or unlabeled data?

[Figure: F-measure (0.74 to 0.84) by training epoch (91a to 98b) for configurations (i,i,98b), (98b,i,98b) and (i,98b,98b)]

Performance decays as the time gap increases (Mota & Grishman, 2008)

Update seeds or unlabeled data?

[Diagram: timeline from 91a to 98b showing seeds Sn, unlabeled examples Ui, and test set Tn]

Experiment 2: Update seeds (vary unlabeled data but use contemporary seeds)

Update seeds or unlabeled data?

[Figure: F-measure (0.74 to 0.84) by training epoch (91a to 98b) for configurations (i,i,98b), (98b,i,98b) and (i,98b,98b)]

Contemporary seeds slightly attenuate the decrease

Update seeds or unlabeled data?

[Diagram: timeline from 91a to 98b showing seeds Si, unlabeled examples Un, and test set Tn]

Experiment 3: Update unlabeled data (vary seeds but use contemporary unlabeled data)

Updating the unlabeled data is better than updating the seeds

[Figure: F-measure (0.74 to 0.84) by training epoch (91a to 98b) for configurations (i,i,98b), (98b,i,98b) and (i,98b,98b)]

Contemporary unlabeled data maintain the performance
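The three configuration families compared in Experiments 1 to 3 can be enumerated with a short sketch. It assumes the (seeds, unlabeled, test) triple notation from the plot legend and a fixed test set 98b; the function names are hypothetical.

```python
from typing import List, Tuple

# Semester labels 91a ... 98b, as on the plot axis.
PERIODS = [f"{year}{half}" for year in range(91, 99) for half in "ab"]
TEST = "98b"  # the test set is fixed in these experiments

Config = Tuple[str, str, str]  # (seeds period, unlabeled period, test period)


def baseline() -> List[Config]:
    """Experiment 1, (i,i,98b): seeds and unlabeled data vary together."""
    return [(p, p, TEST) for p in PERIODS]


def update_seeds() -> List[Config]:
    """Experiment 2, (98b,i,98b): contemporary seeds, varying unlabeled data."""
    return [(TEST, p, TEST) for p in PERIODS]


def update_unlabeled() -> List[Config]:
    """Experiment 3, (i,98b,98b): varying seeds, contemporary unlabeled data."""
    return [(p, TEST, TEST) for p in PERIODS]
```

All three families end in the same configuration, (98b, 98b, 98b), which is why the three curves should meet at the right-hand edge of the plot.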

Augment unlabeled data?

[Diagram: timeline from 91a to 98b showing seeds Sn, test set Tn, and unlabeled sets Ui accumulated over successive periods]

Experiment 4: Enlarge unlabeled data with older data and use contemporary seeds

Augment unlabeled data?

[Figure: F-measure (0.74 to 0.84) by time frame (semester, 91a to 98b) for configurations (i,98b,98b), (i,u[i,...,98a],98b) and (98b,u[i,...,98a],98b)]

Green line: Same seeds for all taggers (98b); unlabeled data is enlarging backwards

Blue line: Different seeds for each tagger; same unlabeled data for all taggers (98b)

Larger amounts of older unlabeled data do not always result in better performance

Augment unlabeled data?

[Diagram: timeline from 91a to 98b showing seeds Si, test set Tn, and unlabeled sets Ui accumulated over successive periods]

Experiment 5: Enlarge the size of unlabeled data and vary seeds

Updating the unlabeled data is better than accumulating older unlabeled data

[Figure: F-measure (0.74 to 0.84) by time frame (semester, 91a to 98b) for configurations (i,98b,98b), (i,u[i,...,98a],98b) and (98b,u[i,...,98a],98b)]

Violet line: Seeds in the same time frame as unlabeled set being added; unlabeled data is enlarging backwards

Blue line: Seeds are the same as in the violet line; same unlabeled data for all taggers (98b)

Green line: Same seeds for all taggers (98b); unlabeled data is enlarging backwards

Larger amounts of unlabeled data are worse than training with contemporary unlabeled data

Larger amounts of unlabeled data do not outperform the tagger trained with contemporary seeds and unlabeled data
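"Enlarging backwards" can be read as pooling the unlabeled sets of every period from some start period i up to a newest period. The sketch below assumes that reading; `accumulate_backwards` and `pooled_unlabeled` are hypothetical names, and the newest period is left as a required argument because the legend's u[i,...,98a] notation is the only specification available.

```python
from typing import Dict, List

# Semester labels 91a ... 98b.
PERIODS = [f"{year}{half}" for year in range(91, 99) for half in "ab"]


def accumulate_backwards(start: str, newest: str) -> List[str]:
    """Return the periods from `start` through `newest`, inclusive, oldest first."""
    i, j = PERIODS.index(start), PERIODS.index(newest)
    if i > j:
        raise ValueError(f"{start} is later than {newest}")
    return PERIODS[i:j + 1]


def pooled_unlabeled(corpus: Dict[str, List[str]], start: str, newest: str) -> List[str]:
    """Concatenate the unlabeled extracts of every period in the range."""
    return [extract
            for period in accumulate_backwards(start, newest)
            for extract in corpus[period]]
```

Moving `start` one semester earlier adds one more block of older data to the pool, which mirrors the growing row of Ui blocks in the timeline diagrams.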

Final remarks

Contemporary unlabeled data are better data

But...

Why doesn’t the labeled data impact the performance more?

Are other semi-supervised approaches also sensitive?

Acknowledgments

This research work was funded by Fundação para a Ciência e a Tecnologia (doctoral scholarship SFRH/BD/3237/2000)

Example of (mis)classification

Test set 98b includes two instances of “Tizi Ouzou”:

Tizi Ouzou tem (en: Tizi Ouzou has)

manifestações em Tizi Ouzou (en: demonstrations in Tizi Ouzou)

Does not occur in u 91a, so classification depends on contexts:

("n v" "tem") ORGANIZATION 0.52

("type" "nprop v") PERSON 0.43

("len" 2) PERSON 0.62

But occurs in u 98b:

noite em Tizi (en: night in Tizi)

ruas de Tizi Ouzou (en: streets of Tizi Ouzou)

ir a Tizi-Ouzou (en: go to Tizi Ouzou)
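The slide does not say how the three matching rules are combined. One common choice, used in decision lists such as those of Collins & Singer (1999), is to let the highest-confidence matching rule decide. The sketch below assumes that policy; the rule values are copied from the slide, while the tuple layout and `strongest_rule` are illustrative.

```python
from typing import List, Tuple

# (feature, label, confidence); values copied from the slide.
Rule = Tuple[Tuple[str, object], str, float]

RULES: List[Rule] = [
    (("n v", "tem"), "ORGANIZATION", 0.52),
    (("type", "nprop v"), "PERSON", 0.43),
    (("len", 2), "PERSON", 0.62),
]


def strongest_rule(rules: List[Rule]) -> Rule:
    """Pick the matching rule with the highest confidence (assumed policy)."""
    if not rules:
        raise ValueError("no matching rules")
    return max(rules, key=lambda rule: rule[2])


feature, label, confidence = strongest_rule(RULES)
print(label, confidence)
```

Under this assumed policy the name would be labeled PERSON. Either available label is wrong for a place name, which is consistent with the slide's point that rules learned from u 91a never saw the locative contexts found in u 98b.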

NE tagger: Identification

[Diagram: Raw text → Lexical analysis → Chunking → NE + context identification → Pairs (NE, context) and Text with unclassified NE. Resources shown: Portuguese dictionary, priority dictionaries, morphological grammars, chunking grammars, NE + context grammars]

Identification designed with NooJ (Silberztein, 2004)

1 Elisa Ferreira começou por criticar Cavaco Silva

2 [Elisa Ferreira]SEQM [começou por criticar]V+Complexo+Pred=criticar [Cavaco Silva]SEQM

3 [Elisa Ferreira]nprop v+criticar começou por criticar [Cavaco Silva]v nprop+criticar

4 [Elisa Ferreira]nprop v+criticar [Cavaco Silva]v nprop+criticar

NE tagger: Classification

[Diagram: co-training loop over the list of examples. Seeds → Label with name rules → Labeled examples → Infer context rules → Context rules → Label with context rules → Labeled examples → Infer name rules → Name rules → repeat; then Label with name + context rules → Labeled examples → Infer name + context rules → Name + context rules]

Spelling features ← SEEDS: (Elisa Ferreira, PESSOA, 0.9999)

1 LABEL: Elisa Ferreira, criticar ← PESSOA

2 INFER: (criticar, PESSOA, 0.98)

3 LABEL: Cavaco Silva, criticar ← PESSOA

4 INFER: (Silva, PESSOA, 0.97)

5 REPEAT
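The LABEL / INFER alternation above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it uses one feature per view (the full name and the context word), scores rules by plain relative frequency rather than the smoothed strength of Collins & Singer (1999), and omits the final name + context step. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[str, str]                # (name feature, context feature)
Rules = Dict[str, Tuple[str, float]]  # feature -> (label, confidence)


def label_with(pairs: List[Pair], rules: Rules, view: int) -> List[Tuple[Pair, str]]:
    """LABEL step: tag every pair whose feature in `view` has a rule."""
    return [(p, rules[p[view]][0]) for p in pairs if p[view] in rules]


def infer_rules(labeled: List[Tuple[Pair, str]], view: int, min_conf: float = 0.9) -> Rules:
    """INFER step: keep features in `view` that mostly co-occur with one label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for pair, label in labeled:
        counts[pair[view]][label] += 1
    rules: Rules = {}
    for feature, by_label in counts.items():
        label, hits = by_label.most_common(1)[0]
        confidence = hits / sum(by_label.values())
        if confidence >= min_conf:
            rules[feature] = (label, confidence)
    return rules


def cotrain(pairs: List[Pair], seeds: Rules, rounds: int = 5) -> Tuple[Rules, Rules]:
    """Alternate between name rules (view 0) and context rules (view 1)."""
    name_rules: Rules = dict(seeds)
    context_rules: Rules = {}
    for _ in range(rounds):
        context_rules = infer_rules(label_with(pairs, name_rules, 0), 1)
        new_names = infer_rules(label_with(pairs, context_rules, 1), 0)
        name_rules = {**new_names, **seeds}  # seed rules are never overwritten
    return name_rules, context_rules


pairs = [("Elisa Ferreira", "criticar"), ("Cavaco Silva", "criticar")]
seeds = {"Elisa Ferreira": ("PESSOA", 0.9999)}
name_rules, context_rules = cotrain(pairs, seeds)
```

On the slide's two pairs, the seed labels "Elisa Ferreira", the context "criticar" becomes a PESSOA rule, and that rule in turn labels "Cavaco Silva". The slide's step 4 infers a rule for the token "Silva", which would need per-token spelling features that this sketch leaves out.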

NE tagger performance decreases over time (Mota & Grishman, 2008)

Detailed analysis using six-month periods (instead of periods of 1 year)

Linear fits y = a + bx for configurations (Si, Ui, Tj):

Measure  a      b        R²
P        0.827  -0.0024  0.24824
R        0.773  -0.0022  0.19393
F        0.799  -0.0023  0.23765

[Figure: F-measure (0.74 to 0.82) against time gap (0 to 15; 1 = 6 months), with linear fit y = −0.00232x + 0.79906, R² = 0.2376]

The performance decreases at an estimated rate of:

0.00232 in F-measure every 6 months (0.0348 after 8 years)

The low R-squared values show that not all variation is attributable to increasing the time gap
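The quoted decay figure follows directly from the fitted line. The short check below uses only the slope and intercept from the slide; the choice of 15 as the largest gap is an assumption, based on 16 six-month periods between 91a and 98b.

```python
INTERCEPT = 0.79906  # fitted F-measure at time gap 0
SLOPE = -0.00232     # change in F-measure per six-month step

MAX_GAP = 15  # assumed: 16 semesters (91a ... 98b) give gaps 0 ... 15


def predicted_f(gap: int) -> float:
    """F-measure predicted by the linear fit at a given time gap."""
    return INTERCEPT + SLOPE * gap


drop = predicted_f(0) - predicted_f(MAX_GAP)
print(f"predicted drop over {MAX_GAP} steps: {drop:.4f}")
```

The slide's 0.0348 corresponds to 15 steps of 0.00232, that is, the largest gap available within the eight-year span.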

Updating the unlabeled data is better than updating the seeds (Complete training-test configurations)

[Figures: F-measure against time gap (0 to 15; 1 = 6 months), one plot per strategy, with linear fits]

No update: y = −0.00232x + 0.79906, R² = 0.2376

Updated seeds: y = −0.00189x + 0.80025, R² = 0.1917

Updated unlabeled data: y = −0.00051x + 0.80769, R² = 0.0189

Update?    a      b        R²
No         0.799  -0.0023  0.238
Seeds      0.800  -0.0019  0.192
Unlabeled  0.807  -0.0005  0.019
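One way to read the table is to ask how much of the baseline slope each update strategy removes. The sketch below computes that ratio from the rounded table values; the `slope_reduction` helper is illustrative, and the percentages are only as precise as the rounded slopes.

```python
# (intercept a, slope b) per update strategy, copied from the table.
FITS = {
    "No": (0.799, -0.0023),
    "Seeds": (0.800, -0.0019),
    "Unlabeled": (0.807, -0.0005),
}


def slope_reduction(strategy: str, baseline: str = "No") -> float:
    """Fraction of the baseline decay rate removed by a strategy."""
    base_slope = FITS[baseline][1]
    return (base_slope - FITS[strategy][1]) / base_slope


for name in ("Seeds", "Unlabeled"):
    print(f"{name}: {slope_reduction(name):.1%} of the decay rate removed")
```

With these rounded values, updating the seeds removes roughly a sixth of the decay rate, while updating the unlabeled data removes more than three quarters of it.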

Confusion matrices (three 3×3 matrices per period, side by side)

91a   335  12  22 | 330  16  20 | 393  12  22
       52 453  79 |  52 456  69 |  12 463  38
       23  21 330 |  28  14 342 |   5  11 371

92b   368  19  42 | 368  16  40 | 391  11  22
       19 435  55 |  23 445  39 |  14 463  29
       23  32 334 |  19  25 352 |   5  12 380

95b   375  14  34 | 387  14  30 | 394  12  26
       22 465  78 |  13 461  73 |  12 463  43
       13   7 319 |  10  11 328 |   4  11 362

98a   390  16  31 | 386  16  28 | 395  11  28
       11 458  58 |  13 460  48 |  11 464  39
        9  12 342 |  11  10 355 |   4  11 364

98b   394   9  20 | 394   9  20 | 394   9  20
        8 467  29 |   8 467  29 |   8 467  29
        8  10 382 |   8  10 382 |   8  10 382
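A quick way to compare the matrices is the share of counts on the diagonal. The sketch below applies this to the first and third 91a matrices. It assumes the diagonal holds agreeing cases; the slide gives neither the class order nor which axis is gold, but the diagonal share does not depend on either. This is not the HAREM F-measure shown in the plots.

```python
from typing import List

Matrix = List[List[int]]

# First and third 91a matrices, copied from the slide.
M_91A_FIRST: Matrix = [[335, 12, 22], [52, 453, 79], [23, 21, 330]]
M_91A_THIRD: Matrix = [[393, 12, 22], [12, 463, 38], [5, 11, 371]]


def total(matrix: Matrix) -> int:
    """Number of classified names in the matrix."""
    return sum(sum(row) for row in matrix)


def diagonal_share(matrix: Matrix) -> float:
    """Fraction of names whose two labels agree (diagonal cells)."""
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return agree / total(matrix)


for name, matrix in (("first", M_91A_FIRST), ("third", M_91A_THIRD)):
    print(f"91a {name}: {diagonal_share(matrix):.3f} of {total(matrix)} names")
```

Both matrices sum to the same number of names, as expected when every configuration is evaluated on the same test set.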