PrepTalk a Preprocessor for Talking book production Ted van der Togt, Dedicon, Amsterdam


PrepTalk

a Preprocessor for Talking book production

Ted van der Togt, Dedicon, Amsterdam

Situation

• Growing demand for audio products from a growing and more diverse user group

• Quality of ‘standard’ TTS is not good enough

• Limited budget; hiring more narrators is not an option

• Need for speed (newspapers, higher education)

Objectives

• Build (components for) TTS production

• Integrate with Daisy Pipeline

• Focus on:

A) Quality improvement

B) Workflow efficiency

C) Resource optimization

Project partners and funding

• Dutchear (speech synthesis)

• Polderland & van den Heuvel HLT (language & speech technology)

Co-financed by:

• VOB (Dutch public library organization)

• several funds

A) TTS Quality improvement

• TTS voices are not perfect

• Documents containing ‘foreign’ text, names

• Ambiguity inherent to language

• TTS software sometimes ‘too smart’

B) Workflow efficiency

• Collaborative, web-portal-based approach showing status, priority, etc.

• Central lexicon accumulates knowledge about exceptions

• Prioritization of issues within one document when time is limited

C) Resource optimization

• Time needed for quality check

• Time needed to fix incorrect pronunciation

• Licenses on TTS software

• Licenses on lexica

• Licenses on other tools

PrepTalk process

Workflow with 3 steps:

1. Automatic document analysis (Daisy XML)

2. Human (interactive) evaluation and correction

3. Daisy Pipeline Narrator adapted for processing pronunciation information (in SSML)

1. Analysis

• Sentence detection (+named entity recognition)

• Check against lexicon (Spoken Dutch Corpus)

• Pattern recognition (numbers etc.)

• Suggestions (ambiguity, spelling mistake)
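
The analysis step can be illustrated with a small sketch. It is not PrepTalk's implementation: the tiny lexicon, the regular expressions, and the function names are assumptions made for illustration, and the "possible name" rule (a capitalised unknown word that is not sentence-initial) is only a crude stand-in for real named entity recognition.

```python
import re

# Hypothetical mini-lexicon; a production system would check a far larger one.
LEXICON = {"de", "trein", "vertrekt", "om", "uur", "naar"}

# Split after sentence-final punctuation that is followed by whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# A token is either a number (optionally with . , : groups) or a run of letters.
TOKEN = re.compile(r"\d+(?:[.,:]\d+)*|[^\W\d_]+")


def split_sentences(text):
    """Naive sentence detection based on punctuation."""
    return [s for s in SENTENCE_END.split(text.strip()) if s]


def analyse(text, lexicon=LEXICON):
    """Return a list of potential pronunciation issues found in the text."""
    issues = []
    for idx, sentence in enumerate(split_sentences(text)):
        for pos, token in enumerate(TOKEN.findall(sentence)):
            if token[0].isdigit():
                # Pattern recognition: numbers, times, and similar patterns.
                kind = "number"
            elif token.lower() not in lexicon:
                # Unknown word; capitalised mid-sentence hints at a name.
                kind = "possible_name" if token[0].isupper() and pos > 0 else "unknown"
            else:
                continue
            issues.append({"sentence": idx, "token": token, "kind": kind})
    return issues


issues = analyse("De trein vertrekt om 10:45 uur. De trein naar Grave vertrekt om 11 uur.")
```

Each issue records the sentence it belongs to, so an editor can later listen to a correction in context.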

2. Editing environment

• Evaluate issues based on ‘importance’.

• Improve pronunciation with either alternative text or phonemes.

• Listen to corrected text within context.

• Add solutions to central lexicon.
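
How issues might be ranked and how solutions might accumulate in a central lexicon can be sketched as follows. The slides do not define ‘importance’; this sketch assumes frequency within the document as a simple proxy. The class and function names, and the example alias, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    kind: str   # "alias" (alternative text) or "phoneme" (phonetic transcription)
    value: str


class CentralLexicon:
    """In-memory stand-in for a shared lexicon of known exceptions."""

    def __init__(self):
        self._entries = {}

    def add(self, word, correction):
        self._entries[word.lower()] = correction

    def lookup(self, word):
        return self._entries.get(word.lower())


def prioritise(flagged_tokens, lexicon):
    """Rank unresolved tokens: most frequent first, then alphabetically.

    Frequency is an assumed proxy for 'importance'.
    """
    counts = Counter(t.lower() for t in flagged_tokens)
    unresolved = [(w, n) for w, n in counts.items() if lexicon.lookup(w) is None]
    return sorted(unresolved, key=lambda item: (-item[1], item[0]))


lexicon = CentralLexicon()
lexicon.add("IFLA", Correction("alias", "I F L A"))  # illustrative alias only
todo = prioritise(["IFLA", "Dedicon", "Grave", "dedicon", "ifla"], lexicon)
```

Words already solved in the lexicon drop out of the list, so the time an editor spends on one document reduces the work needed on later ones.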

3. Daisy Pipeline Narrator adapted

• SSML to describe pronunciation information.

• X-SAMPA as phonetic alphabet.

• TTS engine independent.

• Connection to TTS server using SOAP
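
As an illustration of how pronunciation information can be carried in SSML, the sketch below builds a `speak` document using the standard SSML `phoneme` and `sub` elements. It is not the adapted Narrator's code: the function name, the `x-sampa` value for the `alphabet` attribute (what a given engine accepts may differ), and the example transcription are assumptions.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(tokens, corrections, lang="nl"):
    """Wrap tokens in one SSML sentence, applying corrections where known.

    corrections maps a lower-cased token to ("phoneme", x_sampa_string)
    or ("sub", alternative_text).
    """
    speak = ET.Element("speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang})
    sentence = ET.SubElement(speak, "s")
    last = None  # last child element; plain text after it goes into its tail
    for token in tokens:
        fix = corrections.get(token.lower())
        if fix is None:
            if last is None:
                sentence.text = (sentence.text or "") + token + " "
            else:
                last.tail = (last.tail or "") + token + " "
            continue
        kind, value = fix
        if kind == "phoneme":
            # "x-sampa" is an assumed, vendor-style alphabet name.
            last = ET.SubElement(sentence, "phoneme", {"alphabet": "x-sampa", "ph": value})
        else:
            last = ET.SubElement(sentence, "sub", {"alias": value})
        last.text = token
        last.tail = " "
    return ET.tostring(speak, encoding="unicode")


# "Gra:v@" is an illustrative transcription, not a verified one.
ssml = build_ssml(["Welkom", "in", "Grave"], {"grave": ("phoneme", "Gra:v@")})
```

Because the corrections live in the markup rather than in engine-specific settings, the same document can in principle be sent to different TTS engines.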

Demo

Example: IFLA Newsletter ...

Demonstration

System Architecture

Thank you for your attention.

For questions: [email protected]

Postbus 24, 5360 AA Grave, The Netherlands

T +31 (0) 486 486 486

F +31 (0) 486 476 535