Computer-Aided Language Processing Ruslan Mitkov University of Wolverhampton


Page 1: Computer-Aided Language Processing Ruslan Mitkov University of Wolverhampton

Computer-Aided Language Processing

Ruslan MitkovUniversity of Wolverhampton

Page 2

The rise and fall of Natural Language Processing (NLP)?

- Automatic NLP: have expectations been fulfilled? Many practical applications, such as IR, shy away from NLP techniques.
- Is performance accurate enough? In many applications, such as word alignment, anaphora resolution and term extraction, accuracy can be well below 100%.
- Are dramatic improvements feasible in the foreseeable future?

Page 3

Context

There are promising NLP projects and results, but in the vast majority of real-world applications, fully automatic NLP is still far from delivering reliable results.

Page 4

Alternative: computer-aided language processing (CALP)

In the computer-aided scenario:
- Processing is not done entirely by computers
- Human intervention improves, post-edits or validates the output of the computer program

Page 5

Historical perspective

Martin Kay’s (1980) paper on machine-aided translation, “The Proper Place of Men and Machines in Language Translation”

Page 6

Machine-Aided Translation

The translator sends the simple sentences to the computer for translation and translates the more difficult, complex ones themselves.

Page 7

CALP examples

- Machine-aided translation
- Summarisation (Orasan, Mitkov and Hasler 2003)
- Generation of multiple-choice tests (Mitkov and Ha 2003; Mitkov, Ha and Karamanis 2006)
- Information extraction (Cunningham et al. 2002)
- Acquisition of semantic lexicons (Riloff and Schmelzenbach 1998)
- Annotation of corpora (Orasan 2005)
- Translation memory

Page 8

Translation Memory

A Translation Memory is a database that collects all your translations and their target language equivalents as you translate.

Example of a fuzzy match:
- New segment: “A Translation Memory is a linguistic database that collects all your translations and their target language equivalents as you translate.”
- Stored segment: “A Translation Memory is a database that collects all your translations and their target language equivalents as you translate.”
- Match: 87% (terminology hit: linguistic → linguistische)
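The match percentage comes from the TM tool itself; as a rough, hypothetical illustration of how such a score can be computed, here is a character-based similarity ratio using Python's difflib. This is not the algorithm of any particular TM product, and it yields a different score from the one shown on the slide:

```python
# Hypothetical sketch of TM fuzzy matching, not any real product's algorithm:
# difflib's ratio() gives a character-based similarity between a new segment
# and each stored segment; the best-scoring stored segment is the fuzzy match.
from difflib import SequenceMatcher

def fuzzy_match(new_segment, memory):
    """Return the most similar stored segment and its similarity score."""
    def sim(seg):
        return SequenceMatcher(None, new_segment, seg).ratio()
    best = max(memory, key=sim)
    return best, sim(best)

memory = [
    "A Translation Memory is a database that collects all your translations "
    "and their target language equivalents as you translate.",
]
new = ("A Translation Memory is a linguistic database that collects all your "
       "translations and their target language equivalents as you translate.")
segment, score = fuzzy_match(new, memory)
print(f"best match scores {score:.0%}")
```

Commercial TM systems typically match at the word or token level with various normalisations, which is why a tool can report a figure such as 87% while a raw character-based ratio comes out differently.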

Page 9

CALP applications in focus

- Machine-aided translation
- Translation memory
- Annotation tools
- Computer-aided summarisation
- Computer-aided generation of multiple-choice tests

Page 10

MAT: the Penang experiment
- Books/manuals averaging about 250 pages were translated manually by a translation bureau and by a machine-aided translation program (SISKEP)
- Manual translation took 360 hours on average
- Translation with the machine-aided program needed 200 hours on average
- Efficiency rate: 1.8

Page 11

Translation Memory

A case study (Webb 1998):
- The client saves 40% money, 70% time
- The translator / translation agency saves 69% money, 70% time
- Efficiency rate: 3.3

Page 12

PALinkA: multi-task annotation tool

Employed in a number of corpus annotation tasks:
- (Semi-automatic) mark-up of coreference
- (Semi-automatic) mark-up of centering
- (Semi-automatic) mark-up of events

Page 13

The noun phrase “the storm” is marked as coreferential with the noun phrase “the cyclone”; WordNet is consulted to find out the relation between them.

The user can override the information retrieved from WordNet.

Page 14

PALinkA: multi-task annotation tool (II)

Webpage: http://clg.wlv.ac.uk/projects/PALinkA

- Old version: over 500 downloads; used in several projects
- New version: supports plugins (not available for download yet)

Page 15

Further CALP experiments (evaluations) at the University of Wolverhampton

- Computer-aided summarisation
- Computer-aided generation of multiple-choice tests

Efficiency and quality were evaluated in both cases.

Page 16

Computer-aided summarisation

CAST: a computer-aided summarisation tool (Orasan, Mitkov and Hasler 2003)
- Combines automatic methods with human input
- Relies on automatic methods to identify the important information
- Humans can decide to include this information and/or additional information
- Humans post-edit the information to produce a coherent summary
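The slides do not spell out CAST's scoring method (a term-based summariser is mentioned later), but the "identify the important information" step can be sketched with a simple term-frequency sentence scorer. The scorer below is an illustrative stand-in, not CAST's implementation:

```python
# Toy term-frequency sentence scorer (a stand-in for CAST's term-based
# summariser, not its actual implementation): rank sentences by the average
# document frequency of their content words, so a human can post-edit the
# highest-scoring sentences into a summary.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "that"}

def score_sentences(text):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)
    scored = []
    for s in sentences:
        tokens = [w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOPWORDS]
        score = sum(freq[w] for w in tokens) / max(len(tokens), 1)
        scored.append((score, s))
    return sorted(scored, reverse=True)  # most important first

text = ("Summarisation tools rank sentences. Humans post-edit the ranked "
        "sentences. Ranked sentences form a draft summary.")
results = score_sentences(text)
for score, sentence in results:
    print(f"{score:.2f}  {sentence}")
```

In the computer-aided workflow the ranked list is only a starting point: the human includes, drops or rewrites sentences to produce the final coherent summary.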

Page 17

Evaluation (Orasan and Hasler 2007)

- The time for producing summaries with and without CAST was measured
- To cancel out any familiarity effect, the same texts were summarised manually and with the help of the program at intervals of one year
- Humans had to choose the better summary when presented with a pair of summaries

Page 18

Experiment 1
- One professional summariser was used
- 69 texts from the CAST corpus were used
- Summaries were produced with and without the tool at a one-year distance

                      Without CAST   With CAST   Reduction
Newswire texts        498 secs       382 secs    23.29%
New Scientist texts   771 secs       623 secs    19.19%

Efficiency rate: 1.25

Page 19

Experiment 1

- The term-based summariser used in the process was also evaluated
- There was a correlation between the success of the automatic summariser and the time reduction

Page 20

Experiment 2

- A Turing-like experiment in which humans were asked to pick the better summary in a pair
- Each pair contained one summary produced with CAST and one produced without CAST
- 17 judges were shown 4 randomly selected pairs each

Page 21

Experiment 2
- In 41 pairs the summary produced with CAST was preferred
- In 27 pairs the summary produced without CAST was preferred
- A chi-square test shows no statistically significant difference at the 0.05 level
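The significance check can be reproduced with a chi-square goodness-of-fit test of the 41/27 preference counts against an even 34/34 split (assuming one degree of freedom, which matches the two-outcome design):

```python
# Chi-square goodness-of-fit for two observed counts against an even split.
# With 1 degree of freedom the p-value is erfc(sqrt(stat / 2)).
import math

def chi_square_even_split(observed):
    expected = sum(observed) / 2
    stat = sum((o - expected) ** 2 / expected for o in observed)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p = chi_square_even_split([41, 27])
print(f"chi2 = {stat:.2f}, p = {p:.3f}")  # p > 0.05, so not significant at the 0.05 level
```

With 17 judges each seeing 4 pairs, 41 + 27 = 68 judgements; the statistic falls below the 3.84 critical value for one degree of freedom, matching the slide's conclusion.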

Page 22

Discussion
- Computer-aided summarisation works for professional summarisers
- It reduces the time necessary to produce summaries by about 20%
- The quality of the summaries is not compromised

Page 23

Computer-aided generation of multiple-choice tests (Mitkov and Ha 2003)

- Multiple-choice tests: an effective way to measure student achievement
- Fact: the development of multiple-choice tests is a time-consuming and labour-intensive task
- Alternative: computer-aided multiple-choice test generation based on a novel NLP methodology

How does it work?

Page 24

Methodology

- The system identifies the important concepts in the text
- It generates questions focusing on these concepts
- It chooses the semantically closest distractors

Page 25

NLP-based methodology

Pipeline: narrative texts → term extraction → terms (key concepts) → question generation (using transformational rules) → distractor selection (using WordNet) → distractors → test items
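The system's distractor selection relies on WordNet; as a self-contained illustration, the sketch below uses a toy hand-built hypernym tree standing in for WordNet, with "semantically closest" approximated by the shortest path through a common ancestor. All concepts and distances here are invented for the example:

```python
# Toy stand-in for WordNet-based distractor selection: a tiny hand-built
# hypernym tree (child -> parent) replaces WordNet, and semantic closeness
# is approximated by shortest path length through a shared ancestor.
HYPERNYMS = {
    "noun": "word_class", "verb": "word_class", "adjective": "word_class",
    "word_class": "concept", "parser": "tool", "tool": "concept",
}

def path_to_root(term):
    path = [term]
    while path[-1] in HYPERNYMS:
        path.append(HYPERNYMS[path[-1]])
    return path

def distance(a, b):
    pa, pb = path_to_root(a), path_to_root(b)
    shared = set(pa) & set(pb)
    if not shared:
        return len(pa) + len(pb)  # no common ancestor: maximally far apart
    return min(pa.index(s) + pb.index(s) for s in shared)

def pick_distractors(answer, candidates, n=3):
    """Return the n candidate terms semantically closest to the answer."""
    return sorted(candidates, key=lambda c: distance(answer, c))[:n]

distractors = pick_distractors("noun", ["verb", "adjective", "parser", "tool"])
print(distractors)  # closest siblings come first
```

Picking close rather than random distractors is what makes the items plausible: a distractor from the same semantic neighbourhood as the correct answer cannot be eliminated on superficial grounds.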

Page 26

Test developer’s post-editing environment

- First version of the system: 3 distractors were generated, to be post-edited
- Current version of the system: a long list of distractors is generated, with the user choosing 3 from them

Page 27

Test developer’s post-editing environment (2)

Page 28

Post-editing the automatic generation
- Test items were classed as “worthy” (57%) or “unworthy” (43%)
- About 9% of the automatically generated items did not need any revision
- Of the revisions needed: minor (17%), fair (36%) and major (47%)

Page 29

In-class experiments
- A controlled set of test items was administered
- First experiment: 24 items constructed with the help of the first version of the system
- Second experiment: another 18 items constructed with the help of the current version of the system
- A further 12 manually produced items were included
- 113 undergraduate students took the test: 45 in the first experiment, 78 in the second; a subset of the second group (30) answered the manually produced test

Page 30

Evaluation

- Efficiency of the procedure
- Quality of the test items

Page 31

Evaluation: efficiency of the procedure

                  items produced   time   average time per item
manual            65               450'   6'55''
computer-aided    300              540'   1'48''
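The per-item figures above (65 items in 450 minutes manually; 300 items in 540 minutes computer-aided) can be re-derived directly, along with the efficiency rate of 3.8 reported later, which is the ratio of the two per-item averages:

```python
# Re-deriving the slide's average time per item and the efficiency rate
# (manual time per item divided by computer-aided time per item).
def per_item_seconds(items, total_minutes):
    return total_minutes * 60 / items

manual = per_item_seconds(65, 450)    # about 415 s, i.e. 6'55''
aided = per_item_seconds(300, 540)    # 108 s, i.e. 1'48''
print(f"efficiency rate: {manual / aided:.1f}")  # prints 3.8
```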

Page 32

Evaluation (A): quality of the test items (item analysis)
- Item difficulty (= C/T)
- Discriminating power (= (CU - CL) / (T/2))
- Usefulness of the distractors (comparing the number of students in the upper and lower groups who selected each incorrect alternative)
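The two numeric measures can be written out directly (notation from the slide: C = students answering the item correctly, T = students tested, CU and CL = correct answers in the upper and lower halves of the score ranking); the example numbers below are hypothetical:

```python
# Classical item-analysis measures as defined on the slide.
def item_difficulty(correct, tested):
    """C / T: the proportion of students who answered the item correctly."""
    return correct / tested

def discriminating_power(correct_upper, correct_lower, tested):
    """(CU - CL) / (T/2): how well the item separates strong from weak students."""
    return (correct_upper - correct_lower) / (tested / 2)

# Hypothetical item: 30 students tested, 18 correct, 12 of them in the upper half
print(item_difficulty(18, 30))           # 0.6
print(discriminating_power(12, 6, 30))   # 0.4
```

Difficulty near 0 or 1 means an item is too hard or too easy; discriminating power near or below 0 flags an item that fails to separate stronger from weaker students.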

Page 33

Evaluation: results

TEST                   #items   #students   avg. item difficulty   avg. discriminating power
manual                 12       30          0.56                   0.26
computer-aided (old)   24       45          0.75                   0.40
computer-aided (new)   18       78          0.58                   0.36

The table also reported, for each test, the number of items that were too easy or too difficult, the number of items with negative discriminating power, the usefulness of the distractors (poor / not useful / total) and the average difference.

Page 34

Discussion
- Computer-aided construction of multiple-choice test items is much more efficient than purely manual construction (efficiency rate 3.8)
- The quality of test items produced with the help of the program is not compromised in exchange for the time and labour savings

Page 35

Efficiency rates summary

- CALP: summarisation 1.25
- CALP: MAT 1.8
- CALP: TM 3.3
- CALP: generation of multiple-choice tests 3.8

Page 36

Conclusions

CALP: an attractive alternative to automatic NLP

CALP: significant efficiency gains (time and labour)

CALP: no compromise on quality

Page 37

Further information

My web page: www.wlv.ac.uk/~le1825

The Research Group in Computational Linguistics: clg.wlv.ac.uk