Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch Courtney Napoles UPenn UPenn JHU

Problems in Current Text Simplification Research

TACL paper @ EMNLP Sep-20-2015

Wei Xu (UPenn), Chris Callison-Burch (UPenn), Courtney Napoles (JHU)


What is Text Simplification

Wei Xu, Chris Callison-Burch, Courtney Napoles. “Problems in Current Text Simplification Research: New Data Can Help”. TACL (2015)


• for children, disabled, non-native speakers …
• for other NLP tasks (MT, summarization …)

INPUT: Applesauce is a puree made of apples.
OUT-1: Applesauce is a soft paste.
OUT-2: Applesauce is a paste. It is made of apples.

Operations: paraphrasing, deletion, splitting


Goal of Text Simplification


Criteria: grammaticality, meaning preservation, simplicity

Human Evaluation (no reliable automatic evaluation yet)

INPUT: Applesauce is a puree made of apples.
OUT-1: Applesauce is a soft paste.
OUT-2: Applesauce is a paste. It is made of apples.

Example ratings on the three criteria: 5 5 4 and 5 4 5


Brief History of Sentence Simplification

• rule-based
• machine translation (Parallel Wikipedia Corpus)

1997 Chandrasekar & Srinivas
1999 Dras (PhD thesis)
2000 Carroll, Minnen, Pearce, Canning, Devlin
2002 Canning (PhD thesis)
2004 Siddharthan (PhD thesis)
2010 Zhu, Bernhard, Gurevych
2011 Woodsend & Lapata
2011 Coster & Kauchak
2012 Wubben, van den Bosch, Krahmer
2014 Narayan & Gardent
2014 Siddharthan (Survey)
2014 Angrosh, Nomoto, Siddharthan
2014 Narayan (PhD thesis)
Now Xu, Callison-Burch, Napoles (Opinion)


Problems in Simplification Research

• State-of-the-art evaluation is suboptimal, but we have been relying on it for the past 5 years*.

• Simple Wikipedia data has dominated for the past 5 years, but its quality was taken for granted. It limits the scope of research.

* Angrosh et al. (2014) tried a comprehension quiz.


Breakthrough on Sea → Simplification

Why is this important?

• better understanding

• better review

• more diverse research

• better data and evaluation

• better model

[Figure: sailing against the wind direction, a straight path upwind vs. a zigzag path upwind]


“Recently, there have been several attempts at addressing the text simplification task as a monolingual translation problem … However, they did not try to seek reasons for the success or the failure of their systems.”
(Štajner, Béchara, Saggion, 2015)

WHY DID THIS HAPPEN?

1. “state-of-the-art” competition
2. not easy to do


Opinion #1

Current evaluation doesn’t tell us what’s going on.


System Comparability

Sub-systems: paraphrasing, deletion, splitting
Evaluation criteria: grammaticality, meaning preservation, simplicity

not easy to measure

We need more controlled evaluation:
- evaluate sub-tasks separately
- target a specific audience (e.g. 10-12 year olds)


Opinion #2

Simple Wikipedia is not that simple


“Specific questions that need addressing are:

… we need to better understand the quality of Simple English Wikipedia, a resource that has been used to train many SMT based simplification systems …”
(Advaith Siddharthan, 2014 Survey)

WHAT’S NEW?

We quantitatively and systematically answer this question.


Quality of Parallel Wikipedia Corpus*

[Pie chart] 17% alignment error, 33% not simpler, 50% real simplification


Inaccuracy in Parallel Wikipedia Corpus*

17% alignment error (two sentences have different meaning)

Best automatic sentence alignment gets about 0.7 F1 score (Hwang et al. 2015)
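
To make the 0.7 figure concrete, the following is a minimal sketch of how an F1 score for sentence alignment can be computed against manually aligned pairs. The function name `alignment_f1` and the toy index pairs are illustrative assumptions; they are not taken from Hwang et al. (2015) or from the paper.

```python
def alignment_f1(predicted, gold):
    """F1 of predicted (normal_idx, simple_idx) pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted)  # share of predicted pairs that are correct
    recall = true_positives / len(gold)          # share of gold pairs that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: indices of aligned sentences in a normal/simple article pair.
gold_pairs = [(0, 0), (1, 1), (2, 3), (4, 4)]
predicted_pairs = [(0, 0), (1, 2), (2, 3), (4, 4), (5, 5)]
score = alignment_f1(predicted_pairs, gold_pairs)
print(f"alignment F1 = {score:.2f}")
```

An aligner at about 0.7 F1 still produces many wrong or missing pairs, which is one way alignment errors end up in an automatically built parallel corpus.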


Inadequacy in Parallel Wikipedia Corpus*

33% not simpler: sentences can have similar meaning without one being a simplification of the other

50% real simplification (aligned and simpler)? Some sentences are simpler by only one word while the rest of the sentence is still complex.


Issues with Parallel Wikipedia Corpus


• suboptimal for estimating “translation” probabilities

• suboptimal for developing automatic metrics

• suboptimal for tuning MT system

• unsuitable for document-level simplification


Opinion #3

New data can help

Newsela Dataset

Every article comes at 5 levels of simplification, written by trained editors, with comprehension quizzes.

Manual inspection of aligned sentence pairs

Corpus       alignment error   not simpler   real simplification
Wikipedia*   17%               33%           50%
Newsela      2%                6%            92%

degree of paraphrasing

Wikipedia*: 24% deletion + paraphrase, 34% paraphrase only, 42% deletion only
Newsela: 74% paraphrase only, 20% deletion only, 7% deletion + paraphrase

Good simplification needs more paraphrasing.

sentence length (#words), see syntax analysis in the paper

[Bar charts: sentence length in words, Normal vs. Simple, for Wikipedia* and Newsela; axis from 0 to 30]

Good simplification could be much shorter.
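
As a rough illustration of the sentence-length comparison, the sketch below computes the mean number of words per sentence on each side. It assumes whitespace tokenization, which may differ from the tokenization used in the paper, and the helper name `mean_sentence_length` is hypothetical. The sentences are the applesauce example from earlier in the talk.

```python
def mean_sentence_length(sentences):
    """Average number of whitespace-separated tokens per sentence."""
    if not sentences:
        return 0.0
    return sum(len(sentence.split()) for sentence in sentences) / len(sentences)


# Example sentences from the talk (INPUT and OUT-2).
normal = ["Applesauce is a puree made of apples."]
simple = ["Applesauce is a paste.", "It is made of apples."]

normal_len = mean_sentence_length(normal)
simple_len = mean_sentence_length(simple)
print(f"normal: {normal_len:.1f} words/sentence, simple: {simple_len:.1f} words/sentence")
```

Splitting shortens the average sentence even when the total number of words grows, so this measure is best read alongside the other statistics.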

vocabulary size (#unique words)

[Figure: vocabulary of Normal vs. Simple text for Wikipedia* and Newsela. Counts shown: 71,340 / 23,771 / 6,669 and 19,849 / 19,197 / 583. Reductions shown: 48% and 18%. Corpus sizes shown: 2.6 million and 1.3 million tokens in total. Example: “chimpanzee” vs. “chimp”.]

Good simplification uses a much smaller vocabulary.
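The reduction percentages on this slide can be checked with a little arithmetic. The sketch below assumes that each panel's three numbers are Venn-diagram regions read as (shared, normal-only, simple-only); that reading is not stated on the slide, but it reproduces the 18% and 48% figures. Attributing the 48% panel to Newsela follows the slide's conclusion and is likewise an assumption. The function name `vocab_reduction` is hypothetical.

```python
def vocab_reduction(shared, normal_only, simple_only):
    """Return (normal_vocab, simple_vocab, relative_reduction).

    Assumes the three inputs are Venn-diagram regions of the two vocabularies.
    """
    normal_vocab = shared + normal_only
    simple_vocab = shared + simple_only
    reduction = (normal_vocab - simple_vocab) / normal_vocab
    return normal_vocab, simple_vocab, reduction


# Region sizes copied from the slide; the (shared, normal-only, simple-only)
# reading order and the dataset labels are assumptions.
panels = {
    "Wikipedia*": (71_340, 23_771, 6_669),
    "Newsela": (19_849, 19_197, 583),
}

for name, regions in panels.items():
    normal_vocab, simple_vocab, reduction = vocab_reduction(*regions)
    print(f"{name}: {normal_vocab:,} -> {simple_vocab:,} ({reduction:.0%} reduction)")
```

Under this reading the simple side of the second panel adds only 583 words that never occur on the normal side, which is consistent with the slide's point about a much smaller vocabulary.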

Page 61: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

most significantly reduced words, Wikipedia* vs. Newsela (weighted log-odds-ratio analysis w/ informative Dirichlet prior)

Tokens shown across the two word clouds: commune, as, and, northern, northwestern, film, southwestern, footballer, of, which, percent, including, director, plus punctuation (comma, semicolon, double quote, dash)

Good simplification reduces certain function word usage.
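For readers unfamiliar with the analysis named on this slide, here is a minimal sketch of a weighted log-odds-ratio with an informative Dirichlet prior, in the commonly used z-score form. It is an illustration under stated assumptions, not the authors' code: the function name `weighted_log_odds`, the toy counts, and the choice of prior (the pooled counts of both corpora) are all hypothetical.

```python
import math
from collections import Counter


def weighted_log_odds(counts_a, counts_b, prior):
    """Return a z-score per word: positive means over-represented in corpus A.

    `prior` maps each word to a pseudo-count alpha_w > 0, which plays the
    role of the informative Dirichlet prior.
    """
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    alpha_0 = sum(prior.values())
    z_scores = {}
    for word, alpha_w in prior.items():
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        # Log-odds of the word in each corpus, smoothed by the prior.
        log_odds_a = math.log((y_a + alpha_w) / (n_a + alpha_0 - y_a - alpha_w))
        log_odds_b = math.log((y_b + alpha_w) / (n_b + alpha_0 - y_b - alpha_w))
        delta = log_odds_a - log_odds_b
        # Approximate variance of the difference; rare words get a larger variance.
        variance = 1.0 / (y_a + alpha_w) + 1.0 / (y_b + alpha_w)
        z_scores[word] = delta / math.sqrt(variance)
    return z_scores


# Toy counts, invented for illustration only.
normal = Counter({"which": 30, "that": 10, "the": 100})
simple = Counter({"which": 5, "that": 30, "the": 100})
prior = normal + simple  # assumption: pooled counts serve as the prior

z = weighted_log_odds(normal, simple, prior)
for word in sorted(z, key=z.get, reverse=True):
    print(word, round(z[word], 2))
```

Words with a large positive score are the ones most strongly reduced on the simple side. Dividing by the standard deviation keeps rare words from dominating the ranking.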

Page 62: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

most significantly reduced words, Wikipedia* vs. Newsela (bar charts of Normal vs. Simple counts for “which”, “where” and “approximately”; see syntax analysis in the paper)

Normal: Postal officials recently tried to … … , which could … .…

Simple: Postal officials recently tried to … … . That could … .…

Page 65: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

document compression ratio (simple/normal), Wikipedia* vs. Newsela (histograms, axis 0.00 to 1.25; see discourse analysis in the paper)

Values shown: 3.19%, 57.28%

Wikipedia is not suitable for full-document simplification.
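The compression ratio on this slide is defined as simple/normal. A minimal sketch follows, assuming length is measured in whitespace-separated tokens (the slide does not state the unit). The function name `compression_ratio` is hypothetical, and the two strings are the Original and Simple-4 sentences quoted on the back-up "Newsela Dataset" slide, standing in for full documents.

```python
def compression_ratio(normal_doc, simple_doc):
    """Length of the simple text divided by the length of the normal text.

    Length is counted in whitespace-separated tokens (an assumption).
    """
    normal_len = len(normal_doc.split())
    simple_len = len(simple_doc.split())
    if normal_len == 0:
        raise ValueError("normal document is empty")
    return simple_len / normal_len


# Stand-in "documents": one sentence each, quoted from the back-up slide.
normal_text = (
    "Slightly more fourth-graders nationwide are reading proficiently "
    "compared with a decade ago, but only a third of them are now reading "
    "well, according to a new report."
)
simple_text = (
    "Fourth-graders are better readers than 10 years ago. "
    "But few of them read well."
)

print(f"compression ratio: {compression_ratio(normal_text, simple_text):.2f}")
```

A ratio above 1.0 means the simple version is longer than the normal one, and a ratio near 0 means most of the content was dropped.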

Page 66: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Opinion #1: Current evaluation doesn’t tell us what’s going on.

Opinion #2: Simple Wikipedia is not that simple.

Opinion #3: New data can help.

Page 71: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

My Suggestions

• to reviewers:

- be open-minded to papers that may not follow the previous evaluation setup or may not outperform the “state-of-the-art” on Wikipedia

- be sympathetic towards papers specifically on data construction*, data analysis* and automatic evaluation metrics

- read our paper

* (Pellow & Maxine, 2014 HCOMP; Marcelo & Specia, 2014 PITR)

Page 77: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

My Suggestions

• to researchers:

- consider working on text simplification (“pre-BLEU age”)

- improve evaluation

- make your system replicable

- read our paper

Page 78: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Thank you

Sponsor:  NSF

Newsela data are available at https://newsela.com/data/

Questions? Opinions?

Page 79: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Back Up

Page 80: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

change of discourse connectives (odds-ratio), Wikipedia* vs. Newsela

simple cue words

complex conjunctions

Page 81: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Reasons for Quality Issues in the Parallel Wikipedia Corpus

• Simple Wikipedia was created by volunteers with no specific objective;

• Articles in Simple Wikipedia do not necessarily map to articles in Normal Wikipedia;

• As an encyclopedia, Wikipedia contains extremely difficult words and sentences.

Page 82: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Newsela Dataset

Original: Slightly more fourth-graders nationwide are reading proficiently compared with a decade ago, but only a third of them are now reading well, according to a new report.

Simple-1: Fourth-graders in most states are better readers than they were a decade ago. But only a third of them actually are able to read well, according to a new report.

Simple-2: Fourth-graders in most states are better readers than they were a decade ago. But only a third of them actually are able to read well, according to a new report.

Simple-3: Most fourth-graders are better readers than they were 10 years ago. But few of them can actually read well.

Simple-4: Fourth-graders are better readers than 10 years ago. But few of them read well.

Page 83: Problems in Current Text Simplification Research · Problems in Current Text Simplification Research TACL paper @ EMNLP Sep-20-2015 Wei Xu Chris Callison-Burch ... Human Evaluation

Newsela Dataset

1,130 news articles

Time: 2013 January ~ 2015 March

Source: Chicago Tribune, Seattle Times, LA Times, The Baltimore Sun

Original: 56k sentences — Simple: 64k sentences