Zaidan et. al – Annotator Rationales

• This is a 20-25 minute talk (without interruption) best viewed on a PC running Windows (it does not look right on a Mac).

• If you use the slides to present our ideas, please credit us for our hard work.

• If you use the slides for an unrelated topic (e.g. the slides on SVM) please credit the first author, who prepared the slides.

• This is the slideshow used by the first author for his NAACL HLT 2007 talk on 4/24/2007. It is a slightly updated version of the one given to CLSP on 12/8/2006 (which is also available).

• Of course, we would love to hear your ideas!

– Omar, Jason, and Christine

Zaidan et. al – Annotator Rationales

FYI: BibTeX Entry

@InProceedings{zaidan-eisner-piatko:2007:MainConf,
  author    = {Zaidan, Omar and Eisner, Jason and Piatko, Christine},
  title     = {Using ``Annotator Rationales'' to Improve Machine Learning for Text Categorization},
  booktitle = {Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference},
  month     = {April},
  year      = {2007},
  address   = {Rochester, New York},
  publisher = {Association for Computational Linguistics},
  pages     = {260--267},
  url       = {http://www.aclweb.org/anthology/N/N07/N07-0133}
}

Zaidan et. al – Annotator Rationales

Using “Annotator Rationales” to Improve Machine Learning for Text Categorization

Omar F. Zaidan

Jason Eisner

Christine D. Piatko

Johns Hopkins University

NAACL HLT 2007 – Rochester, NY

Tuesday, April 24th, 2007

{ozaidan, jason, cpiatko}@cs.jhu.edu

Zaidan et. al – Annotator Rationales

Supervised Learning

[Figure: performance (y-axis) plotted against the number of annotated training docs (x-axis).]

Richer Annotation

[Figure: the same performance vs. training docs plot, with a second series of points added for richer annotation.]

Zaidan et. al – Annotator Rationales

Richer Annotation?

• Usually, an annotator indicates what the correct answer is.

• We propose the annotator also indicate why.

Each training example provides data about its class and why.

Richer annotation provides more data.

• Idea #1: richer annotation can aid ML.

• Idea #2: richer annotation may be a better use of our time than more annotation.

Zaidan et. al – Annotator Rationales

Rationales in Text Categorization

The following segments were taken from movie reviews. Did the reviewer have a positive or negative opinion of the movie?

• Trust me, you will enjoy the hell out of American Pie.

• He continues to be one of the most exciting artists on the big screen, performing his own stunts and dazzling audiences.

• …and the romance was enchanting.

• The movie is so badly put together that even the most casual viewer may notice the miserable pacing and stray plot threads.

• …and it even makes watching Eddie Murphy a tedious experience.

• A woman in peril. A confrontation. An explosion. The end. Yawn. Yawn. Yawn.

Zaidan et. al – Annotator Rationales

. . .. . .

Non-annotated documents

Zaidan et. al – Annotator Rationales

. . .. . .

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

Zaidan et. al – Annotator Rationales

. . .. . .

Annotated documents

[Figure: performance vs. number of training docs, comparing two series: “Class and also ‘rationales’” and “Only class; no ‘rationales’”.]

Zaidan et. al – Annotator Rationales

Kundun

Martin Scorsese's Kundun has been criticized for its lack of narrative structure. Personally, I don't think it needs one: it works perfectly well as a study of Tibetan Buddhist culture and communist China. Scorsese views the Dalai Lama the way many Tibetans probably do, as a larger-than-life symbol of Buddhist spirituality and political leadership: the only glimpses into his head come from several dream sequences, but his portrayal is appropriate for a film that concentrates on the political and spiritual. The set design and cinematography were absolutely outstanding. Scorsese gets carried away with the spectacle, and it helps to augment the cultural contrast when the Dalai Lama travels to China to meet Chairman Mao. Kundun does not succumb to the temptation to start shouting slogans, delivering its message in an artistically interesting way without being overly manipulative.

. . .. . .

Zaidan et. al – Annotator Rationales

Volcano

Volcano starts with Tommy Lee Jones worrying about a small earthquake enough to leave his daughter at home with a baby sitter. There is one small quake then another quake. Then a geologist points out to Tommy that its takes a geologic event to heat millions of gallons of water in 12 hours. A few hours later large amount of ash start to fall. Then...it starts: the volcanic eruption!

I liked this movie...but it was not as great as I hoped. It was still good none the less. It had excellent special effects. The best view was that of the helecopters flying over the streets of volcanos. Also, there were interesting side stories that made the plot more interesting. So...it was good!!

. . .. . .

Zaidan et. al – Annotator Rationales

The Postman

Question: after the disaster that was Waterworld, what the fuck were the execs who gave Costner the money to make another movie thinking??

In this 3 hour advertisement for his new hair weave, Costner plays a nameless drifter who dons a long dead postal employee's uniform (I shit you not) and gradually turns a nuked-out USA into an idealized hippy-dippy society. (The main accomplishment of this brave new world is in re-inventing polyester.) When he's not pointing the camera directly at himself, director Costner does have a nice visual sense, but by the time the second hour rolled around, I was reduced to sitting on my hands to keep from clawing out my own eyes. Mark this one "return to sender".

. . .

. . .

Zaidan et. al – Annotator Rationales

Batman & Robin

I once wrote that Speed 2 was the worst film I've ever reviewed. I didn't know that I'd soon encounter Batman & Robin, the picture least worthy of your attention this summer. As directed by Joel Schumacher, B&R is one long excuse for a Taco Bell promotion.

The plot, which has Mr. Freeze and Poison Ivy planning to take over the world, is weighted down by repetitive asides about the nature of trust, partnership, blah, blah, blah. But morals are not the point of this film – topping each bloated, confusing action scene with the next one is. Only George Clooney comes out on top; he underplays nicely and pretends like he's in a real movie… . . .

. . .

Zaidan et. al – Annotator Rationales

Armageddon

This disaster flick is a disaster alright. Directed by Tony Scott (Top Gun), it's the story of an asteroid the size of Texas caught on a collision course with Earth. After a great opening, in which an American spaceship, plus NYC, are completely destroyed by a comet shower, NASA detects said asteroid and go into a frenzy. They hire the world's best oil driller (Bruce Willis), and send him and his crew up into space to fix our global problem.

The action scenes are over the top and too ludicrous for words. So much so, I had to sigh and hit my head with my notebook a couple of times. Also, to see a wonderful actor like Billy Bob Thornton in a film like this is a waste of his talents. The only real reason for making this film was to somehow out-perform Deep Impact. Bottom line is, Armageddon is a failure.

. . .

. . .

Zaidan et. al – Annotator Rationales

. . .. . .

Annotated documents

Class and also “rationales”

OK … now what??

Zaidan et. al – Annotator Rationales

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

How sure was the annotator that this is a positive review?

If a rationale is masked out, the annotator would not be as sure that this is a positive review.

Intuition: a good model should also be less sure.

Zaidan et. al – Annotator Rationales

“Contrast” Examples

Original Example

Contrast Examples

Obtain a contrast by masking out a rationale.

Intuition: a good model should be less sure of a positive classification on contrasts than on the original.

Our work: modified SVM that takes this intuition into account.
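The masking step can be sketched in a few lines. This is only an illustration under stated assumptions: rationales are taken to be half-open token spans `(start, end)`, and the names `make_contrasts` and `MASK` are hypothetical, not part of the released dataset or tools. Masking is approximated here by replacing the rationale's tokens with a placeholder that is assumed to map to no feature.

```python
from typing import List, Tuple

MASK = "<MASK>"  # hypothetical placeholder token; assumed to produce no feature


def make_contrasts(tokens: List[str],
                   rationales: List[Tuple[int, int]]) -> List[List[str]]:
    """Return one contrast document per rationale span [start, end)."""
    contrasts = []
    for start, end in rationales:
        # Copy the document, hiding only this rationale's tokens.
        masked = tokens[:start] + [MASK] * (end - start) + tokens[end:]
        contrasts.append(masked)
    return contrasts


doc = "the set design and cinematography were absolutely outstanding".split()
spans = [(6, 8)]  # covers the tokens "absolutely outstanding"
contrasts = make_contrasts(doc, spans)
```

A document with n rationales yields n contrast documents, each differing from the original in exactly one rationale.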

Zaidan et. al – Annotator Rationales

Standard SVM

[Figure: Class +1 examples and Class -1 examples, with the support vectors marked.]

Zaidan et. al – Annotator Rationales

Standard SVM

[Figure: training examples $\vec{x}_1, \vec{x}_2, \ldots$ and the weight vector $\vec{w}$.]

Minimize: $\frac{1}{2}\|\vec{w}\|^2$

subject to: $\vec{w} \cdot \vec{x}_i \ge 1$

Zaidan et. al – Annotator Rationales

Incorporating Contrasts

[Figure: original examples $\vec{x}_1, \vec{x}_2, \ldots$ (legend: Original (x)) and their contrast examples $\vec{v}_{11}, \vec{v}_{12}, \ldots, \vec{v}_{1j}$ and $\vec{v}_{21}, \vec{v}_{22}, \ldots, \vec{v}_{2j}$ (legend: Contrast (v)), with the weight vector $\vec{w}$.]

Minimize: $\frac{1}{2}\|\vec{w}\|^2$

subject to: $\vec{w} \cdot \vec{x}_i \ge 1$

ALSO subject to: $\vec{w} \cdot \vec{x}_i - \vec{w} \cdot \vec{v}_{ij} \ge \mu$

Equivalently: $\vec{w} \cdot \frac{\vec{x}_i - \vec{v}_{ij}}{\mu} \ge 1$

That is: $\vec{w} \cdot \vec{x}_{ij} \ge 1$, where $\vec{x}_{ij} \stackrel{\text{def}}{=} \frac{\vec{x}_i - \vec{v}_{ij}}{\mu}$

Zaidan et. al – Annotator Rationales

Slack Variables

[Figure: original examples (x) and contrast examples (v), with the weight vector $\vec{w}$.]

Minimize: $\frac{1}{2}\|\vec{w}\|^2 + C \sum_i \xi_i + C_{contrast} \sum_{i,j} \xi_{ij}$

subject to: $\vec{w} \cdot \vec{x}_i \ge 1 - \xi_i$

ALSO subject to: $\vec{w} \cdot \vec{x}_i - \vec{w} \cdot \vec{v}_{ij} \ge \mu (1 - \xi_{ij})$

Equivalently: $\vec{w} \cdot \vec{x}_{ij} \ge 1 - \xi_{ij}$, where $\vec{x}_{ij} = \frac{\vec{x}_i - \vec{v}_{ij}}{\mu}$

Zaidan et. al – Annotator Rationales

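((Include Negative Examples))

[Figure: original examples (x) and contrast examples (v), with the weight vector $\vec{w}$.]

Minimize: $\frac{1}{2}\|\vec{w}\|^2 + C \sum_i \xi_i + C_{contrast} \sum_{i,j} \xi_{ij}$

subject to: $y_i (\vec{w} \cdot \vec{x}_i) \ge 1 - \xi_i$

ALSO subject to: $y_i (\vec{w} \cdot \vec{x}_i - \vec{w} \cdot \vec{v}_{ij}) \ge \mu (1 - \xi_{ij})$

Equivalently: $y_i (\vec{w} \cdot \vec{x}_{ij}) \ge 1 - \xi_{ij}$, where $\vec{x}_{ij} = \frac{\vec{x}_i - \vec{v}_{ij}}{\mu}$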

Zaidan et. al – Annotator Rationales

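The Modified SVM

[Figure: original examples (x) and contrast examples (v), with the weight vector $\vec{w}$.]

Minimize: $\frac{1}{2}\|\vec{w}\|^2 + C \sum_i \xi_i + C_{contrast} \sum_{i,j} \xi_{ij}$

subject to: $y_i (\vec{w} \cdot \vec{x}_i) \ge 1 - \xi_i$

ALSO subject to: $y_i (\vec{w} \cdot \vec{x}_{ij}) \ge 1 - \xi_{ij}$, where $\vec{x}_{ij} = \frac{\vec{x}_i - \vec{v}_{ij}}{\mu}$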

Zaidan et. al – Annotator Rationales

What this Means in Practice

Standard SVM cares about this margin

Modified SVM cares about both margins

i.e. a hyperplane that might reduce standard margin (to help the other margin)

Zaidan et. al – Annotator Rationales

Recap

• Training examples: $(\vec{x}_1, y_1), (\vec{x}_2, y_2), \ldots$

• $y_i$ has $n_i$ rationales: $r_{i1}, r_{i2}, \ldots, r_{i n_i}$

• $\vec{x}_i$ gives $n_i$ contrast examples: $\vec{v}_{i1}, \vec{v}_{i2}, \ldots, \vec{v}_{i n_i}$ (obtain the jth contrast by masking out the jth rationale).

• We extend the SVM to determine the best hyperplane subject to:
– Constraints for the standard margin, and also
– Constraints for the original/contrast separating margin.
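The recap above can be turned into a small training loop. The following is a minimal sketch, not the solver used for the reported experiments: it rewrites the slack-variable objective as a hinge loss, follows the slides in using no bias term, and minimizes it with plain batch subgradient descent. The function names, the default values for `mu`, `C`, `C_contrast`, `lr`, and `epochs`, and the toy data are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(p * q for p, q in zip(a, b))


def pseudo_examples(x: Vector, contrasts: List[Vector], mu: float) -> List[Vector]:
    """Build x_ij = (x_i - v_ij) / mu for every contrast v_ij of x_i."""
    return [[(xk - vk) / mu for xk, vk in zip(x, v)] for v in contrasts]


def train_modified_svm(xs: List[Vector], ys: List[int],
                       contrasts: List[List[Vector]],
                       mu: float = 1.0, C: float = 1.0, C_contrast: float = 1.0,
                       lr: float = 0.01, epochs: int = 200) -> Vector:
    """Minimize 0.5*||w||^2 + C*sum(hinge_i) + C_contrast*sum(hinge_ij)."""
    dim = len(xs[0])
    # Each row is (features, label, cost); contrasts inherit the label y_i.
    rows = []
    for x, y, vs in zip(xs, ys, contrasts):
        rows.append((x, y, C))
        for x_ij in pseudo_examples(x, vs, mu):
            rows.append((x_ij, y, C_contrast))
    w = [0.0] * dim
    for _ in range(epochs):
        grad = list(w)  # gradient of the 0.5*||w||^2 term
        for feats, y, cost in rows:
            if y * dot(w, feats) < 1.0:  # margin violated: hinge is active
                for k in range(dim):
                    grad[k] -= cost * y * feats[k]
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data with three hypothetical features: "good", "bad", "movie".
xs = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ys = [+1, -1]
# One contrast per document: the sentiment word is masked out.
contrasts = [[[0.0, 0.0, 1.0]], [[0.0, 0.0, 1.0]]]
w = train_modified_svm(xs, ys, contrasts, mu=1.0, lr=0.1, epochs=100)
```

On this toy data the learned weights score the original positive document higher than its contrast, which is the behavior the original/contrast margin is meant to encourage.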

Zaidan et. al – Annotator Rationales

What this is not

• In tasks like digit recognition, one can “generate” more training data from the existing examples via class-preserving transformations.

Transformed examples:
– Class-preserving.
– Information from new examples is similar to that from real examples.
– Can get the benefit by automatic preprocessing (rescale, deslant, etc.).

Contrast examples:
– Not necessarily class-preserving.
– Information from contrast examples is of a different kind.
– Actually provide new information via human insight.

Zaidan et. al – Annotator Rationales

The Dataset

• The movie review dataset (Pang & Lee)
– 1000 positive reviews
– 1000 negative reviews

• For each document, given the class annotation, we added the rationale annotation.
– Annotation process: in an HTML editor, rationale segments are boldfaced.

Zaidan et. al – Annotator Rationales

Annotation Time

• How big is the overhead for annotating rationales?

• Ought to establish that richer annotation is a good use of an annotator’s time.
– vs. just annotating more documents

• One can imagine three annotation tasks:
– T1: given document, annotate the class.
– T2: given document and gold standard class, annotate the rationales.
– T3: given document, annotate both the class and the rationales.

• 50 docs/task given to four annotators

Zaidan et. al – Annotator Rationales

Annotation Time

• We found that Time(T3) ≈ 2 × Time(T1)
– Even though T3 involves, on average, 8.3 rationales per document plus the class!

• The annotator already needs to find rationales to determine the class. The extra work is only to make them explicit:
– Time(T3) < Time(T1) + Time(T2) by about 20%

Zaidan et. al – Annotator Rationales

Annotation Time

• Synergy: Time(T3) < Time(T1) + Time(T2)

• The extra time could be reduced with a better annotation setup (e.g. automatic boldfacing of highlighted text, a stylus, etc.) or smart use of eye tracking.

• Note: the task of classifying full docs is a worst-case scenario for rationales.
– Other tasks would have simpler/fewer rationales and more complex classes.

Zaidan et. al – Annotator Rationales

Feature Set

• Binary unigram features
– A document is reduced to a 0-1 vector with 17,744 dimensions.

Zaidan et. al – Annotator Rationales

Feature Vector

Example: no a-list actor would star in a movie like zoolander because that would be a mistake !

Feature   Word        val
1         !           1
2         "           0
...       ...         ...
333       a           1
334       a+          0
335       a-list      1
336       aaron       0
...       ...         ...
17741     zoolander   1
17742     zorro       0
17743     zucker      0
17744     zwick       0

Zaidan et. al – Annotator Rationales

Feature Set

• Feature set too simple?
– Goal is not to build the best classifier.
– Goal is to improve an existing classifier regardless of its feature set.
– We use this feature set to mirror previous work. (Pang & Lee actually tried other features and found it did not matter much.)
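A binary unigram representation is simple to reproduce. The sketch below assumes whitespace-tokenized text and a vocabulary sorted alphabetically, which matches the ordering shown in the feature vector example; `build_vocab` and `binary_unigram_vector` are hypothetical helper names, and the second document is made up purely to give the vocabulary some absent words.

```python
from typing import Dict, List


def build_vocab(docs: List[List[str]]) -> Dict[str, int]:
    """Map every distinct token to an index, in sorted order."""
    words = sorted({tok for doc in docs for tok in doc})
    return {word: i for i, word in enumerate(words)}


def binary_unigram_vector(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Return a 0-1 vector: 1 if the word occurs at least once, else 0."""
    vec = [0] * len(vocab)
    for tok in tokens:
        idx = vocab.get(tok)
        if idx is not None:  # ignore words outside the vocabulary
            vec[idx] = 1
    return vec


review = ("no a-list actor would star in a movie like zoolander "
          "because that would be a mistake !").split()
other = "zorro was a mistake".split()  # made-up second document
vocab = build_vocab([review, other])
vec = binary_unigram_vector(review, vocab)
```

Repeated words such as "would" and "a" still contribute a single 1, so the vector records presence rather than counts.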

Zaidan et. al – Annotator Rationales

Let’s see some experimental results…

Zaidan et. al – Annotator Rationales

Standard vs. Modified SVM

[Figure: accuracy (%, axis from 79 to 93) vs. training set size (documents, 0 to 1600), built up over several slides. Curves:
– Baseline (SVM)
– Contrasts Introduced (modified SVM, rationales used as contrasts)
– Rationales Removed (SVM)
– Rationales Only, as concatenations (SVM)
– Rationales Only, as individual docs (SVM)]

• Baseline vs. Contrasts Introduced: significantly different.

• Removing rationales hurts performance.

• Keeping only rationales hurts performance.

• Pieces to solving the classification puzzle cannot be found solely in the rationales.

Zaidan et. al – Annotator Rationales

Using Rationales from some (and not all) Documents

• We showed what happens if you use all the rationales in all the training documents.

• What if you use the rationales from only some of the training documents instead of all of them?

[Figure: the region between the Baseline and Contrasts Introduced curves, labeled “Explore this space”.]

Zaidan et. al – Annotator Rationales

Use Rationales from R Documents

[Figure series: accuracy (%, axis from 79 to 93) vs. training set size T (documents, 0 to 1600), starting from the Baseline and Contrasts Introduced curves and built up over several slides.]

• Class annotation from T documents.

• Rationales from R documents only.

• Example: T = 800 and R = 200 means class annotation from 800 documents and rationales from 200 documents only.

Zaidan et. al – Annotator Rationales

Using Rationales from some (and not all) Documents

• Observation #1: much of the benefit can be obtained without annotating 100% of the documents.
– e.g. (0%, 50%, 100%) for T = 800 and T = 1600

• Observation #2: if you have a lot of training documents, adding more may not help much (curves flatten out).
– BUT adding more rationales provides a fresh benefit: there is a benefit from R even if T “reaches its potential”.

Zaidan et. al – Annotator Rationales

Simulating a “Lazy Annotator”

• In the last few experiments, we kept all rationales from some training documents.
– R=200 and T=800 means 600 documents contributed no contrast examples. Each of the 200 R documents contributes all its rationales.

• What if we keep some rationales from all documents?
– Instead of using all the rationales in 200 documents, use the same number of rationales spread out over all 800 documents.
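The two selection schemes can be sketched side by side. This is an illustrative reconstruction, not the experiment code: rationales are assumed to be stored per document as a list of spans, the lazy selection samples uniformly at random across all documents, and the names `diligent_subset` and `lazy_subset` are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Span = Tuple[int, int]


def diligent_subset(rationales: Dict[int, List[Span]],
                    r_docs: List[int]) -> Dict[int, List[Span]]:
    """Keep all rationales from the R chosen documents, none from the rest."""
    chosen = set(r_docs)
    return {d: (list(spans) if d in chosen else [])
            for d, spans in rationales.items()}


def lazy_subset(rationales: Dict[int, List[Span]], budget: int,
                rng: random.Random) -> Dict[int, List[Span]]:
    """Keep `budget` rationales drawn at random from across all documents."""
    pool = [(d, s) for d, spans in rationales.items() for s in spans]
    kept = rng.sample(pool, budget)  # requires budget <= len(pool)
    out: Dict[int, List[Span]] = {d: [] for d in rationales}
    for d, s in kept:
        out[d].append(s)
    return out


# Four toy documents with made-up rationale spans.
rationales = {
    0: [(0, 2), (5, 7), (9, 12)],
    1: [(1, 3)],
    2: [(4, 6), (8, 9)],
    3: [(2, 5), (6, 8)],
}
diligent = diligent_subset(rationales, r_docs=[0])
budget = sum(len(spans) for spans in diligent.values())
lazy = lazy_subset(rationales, budget, random.Random(0))
```

Both selections use the same number of rationales, so any difference in downstream accuracy would come from how the rationales are distributed over documents.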

Zaidan et. al – Annotator Rationales

Any Differences? Not Really.

[Figure: two panels, “Diligent” and “Lazy”, each plotting accuracy (%, axis from 79 to 93) against training set size T (documents, 0 to 1600).]

The two (T=800, R=200) points are comparable: same number of rationales. The difference is in distribution only.

Zaidan et. al – Annotator Rationales

Simulating a “Lazy Annotator”

• Experiment simulates a not-so-diligent annotator.
– This might be more common in reality.
– Likely to pick ‘obvious’ rationales, yielding faster rationale annotation.
– Also, obvious rationales may prove to be better. (Though the experiment doesn't test for that; rationales were picked at random.)

Zaidan et. al – Annotator Rationales

Big Picture

• Idea #1: richer annotation can aid ML.

• Idea #2: richer annotation may be a better use of our time than more annotation.

• Example of richer annotation: rationales.

• Developed and tested one method to use rationales (our extended SVM).

• Simulated degree of annotator laziness.

• Bonus: annotator knows nothing about the ML method (or even feature set).

Zaidan et. al – Annotator Rationales

Future Work

• More datasets:
– Different tasks may require different kinds of rationales.
– Might also require a different annotation tool.

• More experiments:
– Examination of the annotation process.
– Real experiments to see the effect of a lazy annotator.

• More models:
– Generative models: model annotation of rationales as a noisy process (annotators are not perfect).
– Potentially other discriminative methods.

Zaidan et. al – Annotator Rationales

On The Internets

• The enriched dataset (and slides) are here: http://cs.jhu.edu/~ozaidan/rationales

Thank you!

Zaidan et. al – Annotator Rationales

Using “Annotator Rationales” to Improve Machine Learning for Text Categorization

This talk presented an interesting idea; it does seem to me that rationales would help. I think the authors are on the right track. However the talk was quite long. The animation was OK, I suppose, though I think some people might be put off by it. At any rate, I’d be interested to know more about this research. It should be noted that other ML methods may benefit from this approach even more than SVM’s, if such a learning method emulates the human decision process more than an SVM (e.g. a decision tree). An interesting idea overall though more experimental results are needed to convince me completely.

Thank you!

Zaidan et. al – Annotator Rationales

• The following two slides were prepared in anticipation of any related questions…

Zaidan et. al – Annotator Rationales

Paired-Permutation Significance Test

A: 9 6 6 6 ... 7 8    total 1415
B: 9 7 3 7 ... 5 9    total 1350    Diff = 65

A: 9 7 6 6 ... 7 9    total 1465
B: 9 6 3 7 ... 5 8    total 1300    Diff = 165

A: 9 7 6 7 ... 5 8    total 1475
B: 9 6 3 6 ... 7 9    total 1290    Diff = 185

If entries are independently switched with chance ½, how often would the difference reach the difference observed originally?
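The question on this slide can be answered by simulation. The sketch below is a generic Monte Carlo version of a paired permutation test, written under two assumptions that the slide does not state: the difference is measured between the sums of the two paired score lists, and the test is two-sided (absolute difference). The function name, the `trials` and `seed` parameters, and the toy scores are hypothetical.

```python
import random
from typing import Sequence


def paired_permutation_test(a: Sequence[float], b: Sequence[float],
                            trials: int = 10000, seed: int = 0) -> float:
    """Estimate how often random pairwise swaps reach the observed difference."""
    rng = random.Random(seed)
    observed = abs(sum(a) - sum(b))
    hits = 0
    for _ in range(trials):
        diff = 0.0
        for x, y in zip(a, b):
            if rng.random() < 0.5:  # switch this pair with chance 1/2
                x, y = y, x
            diff += x - y
        if abs(diff) >= observed:
            hits += 1
    return hits / trials


# Toy paired scores: system A beats system B on every item.
scores_a = [10.0] * 8
scores_b = [0.0] * 8
p_value = paired_permutation_test(scores_a, scores_b)
p_same = paired_permutation_test(scores_a, scores_a)
```

A small returned value means that random relabeling rarely reproduces a gap as large as the observed one, which is the sense in which two curves are called significantly different.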

Zaidan et. al – Annotator Rationales

Note on Masking vs. Deleting

• It is not always clear how you would “mask out” features.

Annotator: “this is a one because there is a lot of empty space here and here”