Unsupervised Constraint Driven Learning for Transliteration Discovery M. Chang, D. Goldwasser, D. Roth, and Y. Tu

Page 1: Unsupervised Constraint Driven Learning for Transliteration Discovery

Unsupervised Constraint Driven Learning for Transliteration Discovery

M. Chang, D. Goldwasser, D. Roth, and Y. Tu

Page 2: Unsupervised Constraint Driven Learning for Transliteration Discovery

What I am going to do today

Goal 1: Present the transliteration work.
  • Get feedback!

Goal 2: Think about this work in the context of the CCM tutorial.
  • I will try to present this work in a slightly different way.
  • Some of the points are my personal comments.
  • This differs from our discussion yesterday.
  • Please give us comments on this.
  • Make this work more general (not only transliteration).

Page 3: Unsupervised Constraint Driven Learning for Transliteration Discovery

Wait a sec! What is CCM?

$$\arg\max_{y} \; w^{T} f(x, y)$$

$$\arg\max_{y} \; w^{T} f(x, y) - \sum_{c \in C} \mathrm{violation}(x, y, c)$$
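As a rough illustration of the two objectives on this slide, the sketch below scores each candidate output with a linear model and subtracts a penalty for every violated constraint. The function name `ccm_argmax`, the toy tagging features, and the "at most one B tag" constraint are all hypothetical and are not taken from the talk.

```python
def ccm_argmax(x, candidates, weights, features, violations):
    """Return the candidate y maximizing w^T f(x, y) minus the summed violations."""
    def objective(y):
        linear = sum(w * f for w, f in zip(weights, features(x, y)))
        penalty = sum(v(x, y) for v in violations)
        return linear - penalty
    return max(candidates, key=objective)


# Toy features (assumed): counts of "B" and "O" tags in the output.
def features(x, y):
    return [y.count("B"), y.count("O")]


# Toy hard constraint (assumed): at most one "B" tag; a violation costs infinity.
def at_most_one_b(x, y):
    return float("inf") if y.count("B") > 1 else 0.0


x = ("New", "York")
candidates = [("B", "B"), ("B", "O"), ("O", "B"), ("O", "O")]
weights = [2.0, 0.5]

unconstrained = ccm_argmax(x, candidates, weights, features, [])
constrained = ccm_argmax(x, candidates, weights, features, [at_most_one_b])
```

With an empty constraint list the call reduces to the plain linear argmax; adding the constraint removes the output that the unconstrained model would otherwise prefer.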

Page 4: Unsupervised Constraint Driven Learning for Transliteration Discovery

Constraints Driven Learning

Why constraints?
  • The goal: building a good system easily.
  • We have prior knowledge at hand. Why not inject that knowledge directly?

How useful are constraints?
  • Useful for supervised learning [Yih and Roth 04] [many others].
  • Useful for semi-supervised learning [Chang et al. ACL 2007].
  • Sometimes more efficient than labeling data directly.

Page 5: Unsupervised Constraint Driven Learning for Transliteration Discovery

Unsupervised Constraint Driven Learning

In this work
  • We do not use any labeled instances.
  • We achieve good performance, competitive with several supervised models.

Compared to [Chang et al. ACL 2007]
  • The ACL 07 work used a small labeled dataset (5-20).
  • Reason: bad models cannot benefit from constraints!
  • For some applications we have very good resources, so we do not need labeled instances at all!

Page 6: Unsupervised Constraint Driven Learning for Transliteration Discovery

In a nutshell: traditional semi-supervised learning
  • The model can drift from the correct one.

[Diagram: Model → Prediction on Unlabeled Data (label unlabeled data) → Feedback (learn from labeled data) → Model. Labels in the figure: Unsupervised Learning, Resource.]

Page 7: Unsupervised Constraint Driven Learning for Transliteration Discovery

In a nutshell: CoDL
  • Use constraints to generate better training samples in unsupervised learning.
  • CoDL improves a "simple" model using expressive constraints.

[Diagram: Model → Prediction + Constraints on Unlabeled Data (more accurate labeling) → Feedback (better model) → Model.]

Page 8: Unsupervised Constraint Driven Learning for Transliteration Discovery

Outline

Constraint Driven Learning (CoDL)

Transliteration Discovery

Algorithm

Experimental Results

Page 9: Unsupervised Constraint Driven Learning for Transliteration Discovery

Transliteration Generation (not our focus)

Given a source word, what is the target transliteration?
  • Bush → 布希
  • Sushi → 壽司

Issues
  • Ambiguity: the same source word can have many different transliterations. Think about Chinese.
  • What we want: find the most widely used transliteration.

Page 10: Unsupervised Constraint Driven Learning for Transliteration Discovery

Transliteration Discovery (our focus)

Problem setting
  • Given two lists of words, map them!

Advantages
  • A relatively easy problem.
  • Can find the most widely used transliteration.

Assumptions
  • The source language is English.
  • Each source entity has a transliteration among the target candidates.
  • Target candidates might not be named entities.

Page 11: Unsupervised Constraint Driven Learning for Transliteration Discovery

Outline

Constraint Driven Learning (CoDL)

Transliteration Discovery

Algorithm

Experimental Results

Page 12: Unsupervised Constraint Driven Learning for Transliteration Discovery

Algorithm Outline

Prediction Model

How to use existing resource to construct the Model?

Constraints?

Learning Algorithm

Page 13: Unsupervised Constraint Driven Learning for Transliteration Discovery

The Prediction Model

How do we make a prediction?
  • Given a source word, how do we predict the best target?

Model 1: (v_s, v_t) → Yes or No
  • Issue: not many obvious constraints can be added.
  • Not a structured prediction problem.

Model 2: (v_s, v_t) → hidden variables F → Yes or No
  • Predicting F is a structured prediction problem.
  • We can add constraints more easily.

Page 14: Unsupervised Constraint Driven Learning for Transliteration Discovery

The Prediction Model

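Score for a pair

$$\mathrm{score}(F \mid W, v_s, v_t) = \sum_{f \in F} [f = 1]\, W(f \mid v_s, v_t)$$

A CCM formulation, where F denotes the hidden variables and the subtracted sum is the violation term:

$$g(F, v_s, v_t) = \mathrm{score}(F \mid W, v_s, v_t) - \sum_{c \in C} \mathrm{violation}(F \mid v_s, v_t, c)$$

More on this point in the next few slides.

A slightly different scoring function

$$\mathrm{score}(v_s, v_t) = \frac{\max_{F} g(F, v_s, v_t)}{|v_t|}$$

$$v_t^{*} = \arg\max_{v_t} \mathrm{score}(v_s, v_t)$$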
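To make the pair score concrete, here is a minimal sketch of how score(v_s, v_t) and the final argmax over target candidates could be computed. It rests on several assumptions that are not stated on the slide: F is restricted to monotone alignments that map every character at least once (so the constraints are built into the search rather than subtracted), targets are given in romanized form, pairs missing from the weight table W get a default weight of -1, and words are non-empty. The names `best_alignment_score`, `score`, and `discover` are hypothetical.

```python
from functools import lru_cache


def best_alignment_score(vs, vt, W, default=-1.0):
    """Max over monotone alignments (every character mapped) of the summed pair weights."""
    @lru_cache(maxsize=None)
    def best(i, j):
        # Best score of an alignment path that ends by mapping vs[i] to vt[j].
        here = W.get((vs[i], vt[j]), default)
        if i == 0 and j == 0:
            return here
        previous = []
        if i > 0:
            previous.append(best(i - 1, j))
        if j > 0:
            previous.append(best(i, j - 1))
        if i > 0 and j > 0:
            previous.append(best(i - 1, j - 1))
        return here + max(previous)

    return best(len(vs) - 1, len(vt) - 1)


def score(vs, vt, W):
    """Alignment score normalized by the target length |v_t|."""
    return best_alignment_score(vs, vt, W) / len(vt)


def discover(vs, targets, W):
    """Pick the target candidate with the highest normalized score."""
    return max(targets, key=lambda vt: score(vs, vt, W))


# Toy weight table (assumed): identical characters are allowed mappings.
W = {(ch, ch): 0.0 for ch in "sushi"}
best_target = discover("sushi", ["bush", "sashimi", "sushi"], W)
```

The dynamic program is a stand-in for whatever inference procedure the authors used; it is only meant to show how the maximization over F and the normalization by |v_t| fit together.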

Page 15: Unsupervised Constraint Driven Learning for Transliteration Discovery

Prediction Model: Another View

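The scoring function looks like weights times features!

$$\mathrm{score}(F \mid W, v_s, v_t) = \sum_{f \in F} [f = 1]\, W(f \mid v_s, v_t) = W^{T} F(v_s, v_t)$$

$$g(F, v_s, v_t) = W^{T} F(v_s, v_t) - \sum_{c \in C} \mathrm{violation}(F \mid v_s, v_t, c)$$

If there is a bad feature, the score is $-\infty$.

Our hidden variable (feature vector) $F(v_s, v_t)$: the character mapping.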

Page 16: Unsupervised Constraint Driven Learning for Transliteration Discovery

Everything

(a,a), (o,O), (w,_),……

Page 17: Unsupervised Constraint Driven Learning for Transliteration Discovery

Algorithm Outline

Prediction Model

How to use existing resource to construct the Model?

Constraints?

Learning Algorithm

Page 18: Unsupervised Constraint Driven Learning for Transliteration Discovery

Resource: Romanization Table

Hebrew, Russian
  • How can you type Hebrew or Russian?
  • Use an English keyboard: "C" maps to a similar character ("C" or "S") in Hebrew or Russian.
  • Very easy to get.
  • Ambiguous.

Special case: Chinese (Pinyin)
  • 壽司 → shòu sī (low ambiguity)
  • Map Pinyin to English (sushi).
  • Romanization table? a → a

Page 19: Unsupervised Constraint Driven Learning for Transliteration Discovery

Initialize the Table

  • Every character pair in the Romanization Table: weight = 0.
  • Everything else: weight = -1.
  • There could be better ways to do the initialization.

Note: without constraints, every pair (v_s, v_t) gets a score of zero.

$$g(F, v_s, v_t) = W^{T} F(v_s, v_t) - \sum_{c \in C} \mathrm{violation}(F \mid v_s, v_t, c)$$
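The initialization and the note about zero scores can be sketched as follows. This is one reading of the slide, under the assumption that without constraints F may be any subset of character mappings, including the empty one: since no initial weight is positive, the best unconstrained score is then 0 for every word pair. The helper names `init_weights` and `unconstrained_best` and the toy alphabets are hypothetical.

```python
from itertools import product


def init_weights(romanization_table, source_alphabet, target_alphabet):
    """Weight 0 for every character pair in the table, -1 for every other pair."""
    return {
        (s, t): 0.0 if (s, t) in romanization_table else -1.0
        for s, t in product(source_alphabet, target_alphabet)
    }


def unconstrained_best(vs, vt, W):
    """Max over all subsets F of W^T F: keep a mapping only if it does not lower the score."""
    return sum(max(0.0, W[(s, t)]) for s, t in product(set(vs), set(vt)))


# Toy table (assumed): only "a" -> "a" and "b" -> "b" are listed.
table = {("a", "a"), ("b", "b")}
W = init_weights(table, "abc", "abc")
zero_score = unconstrained_best("abc", "cab", W)
```

Under this reading the table alone cannot separate good pairs from bad ones, which is why constraints such as coverage are needed before the initial weights become informative.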

Page 20: Unsupervised Constraint Driven Learning for Transliteration Discovery

Algorithm Outline

Prediction Model

How to use existing resource to construct the Model?

Constraints?

Learning Algorithm

Page 21: Unsupervised Constraint Driven Learning for Transliteration Discovery

Constraints

General constraints
  • Coverage: every character needs to be mapped at least once.
  • No crossing: character mappings cannot cross each other.

Language-specific constraints
  • General restricted mapping
  • Initial restricted mapping
  • Length restriction
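The two general constraints can be checked on an explicit character mapping along these lines. The sketch assumes F is represented as a set of (source index, target index) pairs and that coverage applies to both words; the slide does not specify either detail, and the function names are hypothetical.

```python
def satisfies_coverage(F, vs, vt):
    """True if every character position in vs and in vt is mapped at least once."""
    mapped_source = {i for i, _ in F}
    mapped_target = {j for _, j in F}
    return (mapped_source == set(range(len(vs)))
            and mapped_target == set(range(len(vt))))


def satisfies_no_crossing(F):
    """True if no two mappings (i1, j1), (i2, j2) have i1 < i2 and j1 > j2."""
    pairs = sorted(F)
    # After sorting by source index, target indices must never decrease.
    return all(j1 <= j2 for (_, j1), (_, j2) in zip(pairs, pairs[1:]))


monotone = {(0, 0), (1, 1), (2, 1)}   # "abc" -> "xy", last two characters share "y"
crossing = {(0, 1), (1, 0)}           # "ab" -> "xy", mappings cross
partial = {(0, 0)}                    # "ab" -> "xy", second characters unmapped
```

The language-specific constraints would need the actual restriction tables, which are not given on the slide, so they are left out here.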

Page 22: Unsupervised Constraint Driven Learning for Transliteration Discovery

Constraints

Pinyin to English

Many other works use similar information as well!

Page 23: Unsupervised Constraint Driven Learning for Transliteration Discovery

Algorithm Outline

Prediction Model

How to use existing resource to construct the Model?

Constraints?

Learning Algorithm

Page 24: Unsupervised Constraint Driven Learning for Transliteration Discovery

High-Level Overview

Model ← Resource
While not converged:
    Use Model + Constraints to get labels (for both F and y)
    Update Model with the newly labeled F and y (without Constraints) (details in the next slide)

Similar to ACL 07
  • Update the model without constraints.

Difference from ACL 07
  • We get feedback from the labels of both the hidden variables and the output.

Page 25: Unsupervised Constraint Driven Learning for Transliteration Discovery

Training

Predict hidden variables and the labels

Update

Algorithm

Page 26: Unsupervised Constraint Driven Learning for Transliteration Discovery

Outline

Constraint Driven Learning (CoDL)

Transliteration Discovery

Algorithm

Experimental Results

Page 27: Unsupervised Constraint Driven Learning for Transliteration Discovery

Experimental Setting

Evaluation
  • ACC: the top candidate is (one of) the right answers.

Learning algorithm
  • Linear SVM with C = 0.5

Dataset (source:target)
  • English-Hebrew: 300:300
  • English-Chinese: 581:681
  • English-Russian: 727:50648 (target includes all words)
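The ACC measure can be written down in a few lines. The sketch assumes predictions are stored as a mapping from each source word to its top-ranked candidate and gold answers as a mapping from each source word to a set of acceptable transliterations; both data layouts and the example entries are hypothetical.

```python
def accuracy(predictions, gold):
    """Fraction of source words whose top candidate is one of the right answers."""
    correct = sum(
        1 for source, top in predictions.items() if top in gold.get(source, ())
    )
    return correct / len(predictions)


# Toy data (assumed): the first prediction matches a gold answer, the second does not.
predictions = {"bush": "布希", "sushi": "寿司"}
gold = {"bush": {"布希", "布什"}, "sushi": {"壽司"}}
acc = accuracy(predictions, gold)
```

Allowing a set of gold answers per source word reflects the "(one of)" wording on the slide.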

Page 28: Unsupervised Constraint Driven Learning for Transliteration Discovery

Results - Hebrew

Page 29: Unsupervised Constraint Driven Learning for Transliteration Discovery

Results - Russian

Page 30: Unsupervised Constraint Driven Learning for Transliteration Discovery

Analysis

A small Russian subset was used here.

1) Without constraints (on features), the Romanization Table is useless!

2) General constraints are more important!

3) Learning has a great impact here! But constraints are very important, too!

4) Better constraints lead to better final results.

Page 31: Unsupervised Constraint Driven Learning for Transliteration Discovery

Related Works (Need more work here)

Learning the score for Edit Distance

Previous transliteration works

Machine translation?

Page 32: Unsupervised Constraint Driven Learning for Transliteration Discovery

Conclusion

ML: an unsupervised constraint-driven algorithm
  • Use hidden variables to find more constraints (e.g. co-ref).
  • Use constraints to find a "cleaner" feature representation.

Transliteration: use the Romanization Table as the starting point
  • We can get good results without training data.
  • The right constraints (modeling) are the key.

Future work
  • Transliteration model: better model, quicker inference.
  • CoDL: other applications of unsupervised CoDL.

Page 33: Unsupervised Constraint Driven Learning for Transliteration Discovery

Constraint-Driven Learning (CoDL)

λ = learn(Tr)
For N iterations do
    T = ∅
    For each x in unlabeled dataset
        y ← Inference(x, C, λ)
        T = T ∪ {(x, y)}
    λ = γλ + (1 - γ) learn(T)

Annotations:
  • learn(Tr): any supervised learning algorithm, parametrized by λ.
  • Inference(x, C, λ): any inference algorithm (with constraints).
  • T = T ∪ {(x, y)}: augmenting the training set (feedback).
  • λ = γλ + (1 - γ) learn(T): learn from the new training data; weight the supervised and unsupervised models (Nigam2000*).

Page 34: Unsupervised Constraint Driven Learning for Transliteration Discovery

Unsupervised Constraint-Driven Learning

λ = Construct(Resource)
For N iterations do
    T = ∅
    For each x in unlabeled dataset
        y ← Inference(x, C, λ)
        T = T ∪ {(x, y)}
    λ = γλ + (1 - γ) learn(T)

Annotations:
  • Construct(Resource): construct the model with resources.
  • Inference(x, C, λ): any inference algorithm (with constraints).
  • T = T ∪ {(x, y)}: augmenting the training set (feedback).
  • learn(T): learn from the new training data; γ = 0 in this work.
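The loop on this slide could be sketched in Python roughly as below. The model is assumed to be a dictionary of named weights so that the interpolation step can be written explicitly; `construct`, `inference`, and `learn` are caller-supplied placeholders, and the toy callables in the usage lines exist only to exercise the loop. None of these names come from the talk.

```python
def unsupervised_codl(resource, unlabeled, constraints,
                      construct, inference, learn,
                      n_iterations=5, gamma=0.0):
    """Build a model from a resource, then alternate constrained labeling and learning."""
    model = construct(resource)
    for _ in range(n_iterations):
        # Label every unlabeled example with the current model plus constraints.
        labeled = [(x, inference(x, constraints, model)) for x in unlabeled]
        # Learn a fresh model from the newly labeled data (without constraints).
        new_model = learn(labeled)
        # Interpolate old and new weights; gamma = 0 keeps only the new model.
        keys = set(model) | set(new_model)
        model = {
            k: gamma * model.get(k, 0.0) + (1 - gamma) * new_model.get(k, 0.0)
            for k in keys
        }
    return model


# Toy callables (assumed) that make the data flow visible.
def construct(resource):
    return {"w": resource}


def inference(x, constraints, model):
    return x


def learn(labeled):
    return {"w": float(len(labeled))}


final = unsupervised_codl(1.0, [1, 2, 3], [], construct, inference, learn,
                          n_iterations=2, gamma=0.0)
mixed = unsupervised_codl(1.0, [1, 2, 3], [], construct, inference, learn,
                          n_iterations=1, gamma=0.5)
```

Passing the components as functions mirrors the slide's remark that any learning algorithm and any constrained inference procedure can be plugged in.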