WP4-22. Final Evaluation of Subtitle Generator Vincent Vandeghinste, Pan Yi CCL – KULeuven


Page 1

WP4-22. Final Evaluation of Subtitle Generator

Vincent Vandeghinste, Pan Yi

CCL – KULeuven

Page 2

Example

Transcript:

Het meest spectaculaire aan de daadwerkelijke start van de euro is dat er eigenlijk niets spectaculairs te melden valt.
("The most spectacular thing about the actual start of the euro is that there is really nothing spectacular to report.")

Ondertitel (subtitle):

Het meest spectaculaire aan de start van de euro was dat er niets spectaculairs te melden valt.
("The most spectacular thing about the start of the euro was that there is nothing spectacular to report.")

Page 3

Flow

Page 4

Availability Calculator

• Pronunciation time of the input sentence => estimate of the number of characters available in the subtitle

• If unknown, estimate it by:
  – counting the number of syllables
  – using the average speaking rate for Dutch

Page 5

Syllable Counter

• Rule-based

• Evaluated on the CGN lexicon combined with FREQ lists

• The estimated number of syllables is compared with the number of syllables in the phonetic transcripts

• 99.63% of all words in CGN are correctly estimated
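A rule-based syllable counter of the kind described above can be sketched as counting vowel groups. This is a minimal illustration, not the CCL rule set: the vowel inventory and the treatment of diphthongs are assumptions.

```python
import re

def count_syllables(word):
    """Estimate the syllable count of a Dutch word as the number of
    maximal vowel groups (so diphthongs like 'eu', 'ai', 'oe' count
    once). A deliberately simple sketch; the actual rule-based counter
    evaluated on CGN handles many more cases."""
    groups = re.findall(r"[aeiouyàáèéëïóöü]+", word.lower())
    return max(1, len(groups))
```

For example, `count_syllables("spectaculair")` finds the vowel groups e, a, u, ai and returns 4 (spec-ta-cu-lair).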

Page 6

Average Syllable Duration

ASD            No pauses   Pauses included
Literature     177 ms      –
All CGN files  186 ms      237 ms
One speaker    185 ms      239 ms
Read-aloud     188 ms      256 ms

Page 7

Availability Calculator

• When the pronunciation time is not given: estimate it

• Subtitles: 70 chars / 6 sec = 11.67 chars/sec

• If the number of characters in the sentence > the number of available characters => compress the sentence
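The availability calculation above can be sketched as follows. Function names and the fallback logic are illustrative, not the deck's actual code; the ASD default of 185 ms comes from the CGN measurements on the previous slide.

```python
def available_chars(pron_time_s, n_syllables, asd_s=0.185, rate_cps=70 / 6):
    """Estimate how many subtitle characters a sentence may use.

    pron_time_s: measured pronunciation time in seconds, or None.
    When None, the time is estimated as the syllable count times the
    average syllable duration (ASD, default 185 ms).
    rate_cps is the subtitling norm: 70 chars / 6 sec = 11.67 chars/sec.
    """
    if pron_time_s is None:
        pron_time_s = n_syllables * asd_s
    return int(pron_time_s * rate_cps)

def needs_compression(sentence, pron_time_s=None, n_syllables=0):
    """True when the sentence exceeds the available character budget."""
    return len(sentence) > available_chars(pron_time_s, n_syllables)
```

A sentence spoken in 6 seconds gets the full 70-character budget; when only a syllable count is known, the time is estimated first.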

Page 8

Sentence Compressor

• Parallel Corpus

• Sentence Analysis

• Sentence Compression

• Evaluation

Page 9

Parallel Corpus

• Sentence aligned

• Source & target corpus:
  – Tagging
  – Chunking
  – SSUB detection

• Chunk alignment

Page 10

Chunk Alignment

Every 4-gram from the source chunk is compared with every 4-gram from the target chunk.

A = (m / (m + n)) · (L1 + L2) / 2
If A > 0.315, then align the chunks.

The F-value for NP/PP alignment is 95%.
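One possible reading of the score, sketched under loudly flagged assumptions (the slide does not define its symbols): m = character 4-grams shared by the two chunks, n = 4-grams occurring in only one of them, and L1/L2 = short-to-long length ratios in characters and tokens, so A lies in [0, 1] and can be compared with the 0.315 threshold. This is an illustration, not the CCL implementation.

```python
def char_ngrams(text, n=4):
    """Set of character n-grams of a chunk (lowercased)."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def alignment_score(src_chunk, tgt_chunk):
    """A = (m / (m + n)) * (L1 + L2) / 2, with the assumed readings:
    m = shared 4-grams, n = unshared 4-grams, L1 = character-length
    ratio short/long, L2 = token-length ratio short/long."""
    src, tgt = char_ngrams(src_chunk), char_ngrams(tgt_chunk)
    m = len(src & tgt)
    n = len(src ^ tgt)
    if m + n == 0:
        return 0.0
    l1 = min(len(src_chunk), len(tgt_chunk)) / max(len(src_chunk), len(tgt_chunk))
    s_tok, t_tok = len(src_chunk.split()), len(tgt_chunk.split())
    l2 = min(s_tok, t_tok) / max(s_tok, t_tok)
    return (m / (m + n)) * (l1 + l2) / 2

def chunks_align(src_chunk, tgt_chunk, threshold=0.315):
    return alignment_score(src_chunk, tgt_chunk) > threshold
```

Identical chunks score 1.0; chunks sharing no 4-grams score 0.0 and are never aligned.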

Page 11

Sentence Analysis

• Tagging (TnT): accuracy = 96.2% (Oostdijk et al., 2002)

• Chunking

Chunk type   Prec.    Recall   F-value
NP           94.36%   93.91%   94.13%
PP           94.84%   95.22%   95.03%

Page 12

Sentence Analysis (2)

• SSUB detection

Type of S   Prec.    Recall   F-value
OTI         71.43%   65.22%   68.18%
RELP        69.66%   68.89%   69.27%
SSUB        56.83%   60.77%   58.74%

Page 13

Sentence Compression

• Use of statistics

• Use of rules

• Word reduction

• Selection of the Compressed Sentence

Page 14

Use of statistics

Page 15

Use of rules

• To avoid generating ungrammatical sentences

• Rules of the type: "For every NP, never remove the head noun"

• Rules are applied recursively
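The rule filtering can be sketched as follows. The flat chunk representation and the candidate generator are assumptions for illustration; the real system applies its rules recursively over nested chunks, which this sketch omits.

```python
from itertools import combinations

# Hypothetical chunk representation: (label, words, head_index)

def never_remove_np_head(chunk, removed):
    """Rule of the type shown on the slide: for every NP,
    the head noun may not be removed."""
    label, words, head = chunk
    return not (label == "NP" and head in removed)

RULES = [never_remove_np_head]

def legal_reductions(chunk):
    """Yield every reduced word sequence of a chunk that all rules
    allow (the unreduced chunk included, full deletion excluded)."""
    label, words, head = chunk
    for k in range(len(words)):
        for removed in combinations(range(len(words)), k):
            if all(rule(chunk, set(removed)) for rule in RULES):
                yield [w for i, w in enumerate(words) if i not in removed]
```

For the NP "de daadwerkelijke start" with head "start", every legal reduction keeps "start", down to the bare head.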

Page 16

Word Reduction

• Example: replace gevangenisstraf ("prison sentence") by straf ("sentence")

• Counterexample: replace voetbal ("football") by bal ("ball")

• Makes use of the Wordbuilding module (WP2)

• Introduces a lot of errors: does it add accuracy?

• Better integration with the rest of the system should be possible
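The idea of reducing a compound to its head can be sketched as below. This toy stand-in for the WP2 Wordbuilding module assumes a known-word lexicon and a minimum part length; the length guard is one conceivable way to block degenerate reductions like voetbal -> bal, not the module's actual safeguard.

```python
def reduce_compound(word, lexicon, min_len=4):
    """Replace a compound by its head (final part), e.g.
    'gevangenisstraf' -> 'straf', but only when both parts are known
    words of at least min_len characters; otherwise return None."""
    for split in range(min_len, len(word) - min_len + 1):
        modifier, head = word[:split], word[split:]
        if modifier in lexicon and head in lexicon:
            return head
    return None
```

With a lexicon containing "gevangenis" and "straf", the example reduction succeeds, while "voetbal" is left alone because "bal" is shorter than the minimum part length.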

Page 17

Selection of the Compressed Sentence

• All previous steps result in an ordered list of sentence alternatives:
  – supposedly grammatically correct
  – ordered by probability
  – the first (most probable) sentence with a length smaller than the available number of characters is chosen
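The selection step above amounts to a first-fit scan over the probability-ordered candidates; the function name is illustrative.

```python
def select_subtitle(candidates, max_chars):
    """Candidates are ordered by decreasing probability; return the
    first (most probable) one that fits in the available characters,
    or None when nothing fits (a 'No Output' case)."""
    for sentence in candidates:
        if len(sentence) <= max_chars:
            return sentence
    return None
```

When even the shortest alternative exceeds the budget, the system produces no output, which is exactly the "No Output" condition counted in the evaluation.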

Page 18

Evaluation

Condition             A            B            C
ASD                   185 ms/syl   192 ms/syl   256 ms/syl
No output             44.33%       41.67%       15.67%
Reduction rate        39.93%       37.65%       16.93%
Interrater agreement  86.2%        86.9%        91.7%
Accurate              4.8%         8.0%         28.9%
± accurate            28.1%        26.3%        22.1%
Reasonable            32.9%        34.3%        51%

Page 19

Subtitle Layout Generator

Actieve of gewezen voetballers
zoals Ruud Gullit of Dennis
Bergkamp moeten het stellen met
nauwelijks anderhalf miljard .

becomes (wordt)

Actieve of gewezen voetballers
zoals Ruud Gullit of
Dennis Bergkamp moeten het stellen
met nauwelijks anderhalf miljard .

("Active or former football players such as Ruud Gullit or Dennis Bergkamp have to make do with barely one and a half billion.")
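The improvement in the example, keeping "Dennis Bergkamp" on one line, suggests greedy line breaking over chunks rather than single words. The sketch below assumes a chunking of the sentence and a line width; neither parameter comes from the deck.

```python
def layout_lines(chunks, width=36):
    """Greedily fill subtitle lines with whole chunks, so that units
    like names are never split across a line break. An oversize chunk
    still gets its own line rather than being broken."""
    lines, current = [], ""
    for chunk in chunks:
        candidate = (current + " " + chunk).strip()
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = chunk
    if current:
        lines.append(current)
    return lines
```

Treating "Ruud Gullit" and "Dennis Bergkamp" as chunks reproduces the improved layout: each name stays intact on its line.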

Page 20

Conclusion

• The system approach works very well:
  – if the sentence analysis is correct
  – if there are possible reductions (according to the rule set)

• Many No Output cases, where the system cannot reduce the sentence:
  – the sentence cannot be reduced (even by humans)
  – the rule set is too strict / wrong sentence analysis
  – the statistical information is not fine-grained enough

• Bad output:
  – wrong sentence analysis (CONJ)
  – wrong word reductions

Page 21

Future

• Near future (within Atranos):
  – better integration of word reduction
  – combine the advantages of the CNTS approach and the CCL approach into one approach

• Far future (outside Atranos):
  – better sentence analysis: a full parse is needed
  – more fine-grained analysis of the parallel corpus