19.05.2009 1 Emotional Speech Synthesis State of the art 2009 Felix Burkhardt

Source: emosamples.syntheticspeech.de/EmotionalTTS.pdf (Expressive Synthetic Speech), 2009-05-19




outline

how to model and why simulate emotions?
emotions in speech
introduction to speech synthesis approaches
examples, examples, examples
conclusion and outlook


emotion models

…everyone except a psychologist knows what an emotion is (Young 1973)

categories, e.g. anger, joy, …

dimensions, e.g. activation, dominance, valence

appraisals, e.g. novelty, intrinsic pleasantness, relevance, coping potential, …

emotion cube (figure): axes arousal, valence, dominance; emotions placed in the cube: anger, joy, sadness, content, neutral, despair, boredom

source: Burkhardt 2001


why model emotional behaviour?

aspects of emotion modeling in human-machine interaction:

source: Batliner et al 2006


applications of emotional tts

applications, arranged along a time axis:
fun, e.g. emotional greetings
prosthesis
emotional chat avatars
gaming, believable characters
adapted dialog design
adapted persona design
target-group specific advertising
… believable agents
… artificial humans


aspects of emotional tts


speech features

source: Reynolds et al 2003

descriptive layers of speech


emotion in speech

source: TUB emotional database

spectrograms from acted emotional speech (figure panels: frightened, sad, happy, bored, neutral, angry)


emotional data?

actors vs. reality
Berlin EmoDB: 10 actors x 7 emotions x 10 sentences
alternatives:
  induced data, e.g. Aibo
  television, radio data

EmoDB: Burkhardt et al 2005
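
The nominal size of such a recording design follows directly from the grid of actors, emotions and sentences. The following is a minimal sketch of that arithmetic. The emotion labels are filled in from the EmoDB documentation rather than from this slide, and the actor and sentence identifiers are placeholders, not the corpus's real codes.

```python
from itertools import product

# Placeholder identifiers (assumption): the real corpus uses its own codes.
ACTORS = [f"actor{i:02d}" for i in range(1, 11)]
SENTENCES = [f"sentence{i:02d}" for i in range(1, 11)]
# Emotion labels taken from the EmoDB documentation, not from the slide.
EMOTIONS = ["anger", "boredom", "disgust", "fear",
            "happiness", "sadness", "neutral"]


def recording_grid():
    """Return every (actor, emotion, sentence) combination of the design."""
    return list(product(ACTORS, EMOTIONS, SENTENCES))


grid = recording_grid()
print(len(grid))  # 700
```

This count is only the nominal design size; how many recordings are kept in a released corpus is a separate question.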


how to describe emotion?

EmotionML, incubator group at W3C
Example, embedded in SSML:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">
      Hi, I am sad now but start getting angry...
    </prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>

http://www.w3.org/2005/Incubator/emotion/
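
To show how a synthesizer front end might read such an annotation, here is a minimal sketch that extracts the emotion categories and their timing from the markup using Python's standard XML parser. It assumes the element and attribute names of the 2009 incubator draft shown on this slide (`emotion`, `category`, `timing`), which are not necessarily those of later EmotionML versions. The function name `emotion_spans` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The default namespace of <speak> also applies to the child elements.
SSML_NS = {"s": "http://www.w3.org/2001/10/synthesis"}

DOC = """<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female">
    <prosody contour="(0%,+20Hz)(10%,+30%)(40%,+10Hz)">
      Hi, I am sad now but start getting angry...
    </prosody>
  </voice>
  <emotion>
    <category name="sadness" set="basic" intensity="0.6"/>
    <timing start="10%" end="50%"/>
  </emotion>
  <emotion>
    <category name="anger" set="basic" intensity="0.4"/>
    <timing start="50%" end="100%"/>
  </emotion>
</speak>"""


def emotion_spans(xml_text):
    """Collect name, intensity and timing of every <emotion> element."""
    root = ET.fromstring(xml_text)
    spans = []
    for emo in root.findall("s:emotion", SSML_NS):
        category = emo.find("s:category", SSML_NS)
        timing = emo.find("s:timing", SSML_NS)
        spans.append({
            "name": category.get("name"),
            "intensity": float(category.get("intensity")),
            "start": timing.get("start"),
            "end": timing.get("end"),
        })
    return spans


for span in emotion_spans(DOC):
    print(span)
```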


loquendo tts director

source: Loquendo


speech synthesis taxonomy

speech synthesis systems
  voice response systems
  arbitrary speech synthesizers
    text-to-speech (unknown input)
    concept-to-speech (input from text-generation system)
  re-(copy-)synthesis, voice transformation
    voice conversion


tts process chain

NLP (natural language processing): preprocessing, morpho-syntactic analysis, transcription, prosody modeling
interface between NLP and DSP: phonetic transcription, prosody track
DSP (digital speech processing): unit concatenation / search, prosody fitting, edge smoothing


synthesis approaches

system modeling vs. signal modeling; rule based vs. data based

system modeling:
  expert systems, formant synthesis
  articulatory synthesis, vocal tract shape synthesis
  pseudo articulatory
signal modeling:
  concatenative synthesis
    type of units: syllables, diphones, allophones, subsegments, non-uniform unit selection
    coding of units:
      parametric coded: LPC (linear predictive coding), MFCC (mel frequency cepstral coefficients), MBR (multi band resynthesis), formants
      waveform coded: PCM, LDM (linear delta modulation)
      hybrid approaches: MBR-PSOLA, RELP
  statistical model generated: HMM (hidden Markov models), ANN (neural nets)


historic development

time axis: 1780 … 1980, 1990, 2000

articulatory: von Kempelen
formant synthesis, e.g. DECtalk
PSOLA based synthesis, e.g. Elan
non-uniform unit selection, e.g. RealSpeak

historic: flexible, artificial sounding, domain independent
modern: not flexible, natural sounding, domain dependent


system modeling


source filter model

source: Klatt80 formant synthesizer (Klatt 1980)
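
The source filter idea can be sketched in a few lines: a periodic source (here a plain impulse train) is passed through a cascade of second-order digital resonators, one per formant. The resonator difference equation and its coefficients follow the form described in Klatt (1980). Everything else is an assumption for illustration: the impulse-train source stands in for a real glottal model, and the formant frequencies and bandwidths in `A_FORMANTS` are rough example values for an /a/-like vowel, not taken from the slides.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c


def resonate(signal, freq_hz, bw_hz, fs):
    """Filter a signal with one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, fs):
    """Simplified voiced source: one unit impulse per pitch period."""
    period = int(round(fs / f0_hz))
    n = int(round(duration_s * fs))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesize_vowel(f0_hz, formants, duration_s=0.3, fs=16000):
    """Source signal passed through a cascade of formant resonators."""
    signal = impulse_train(f0_hz, duration_s, fs)
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    return signal


# Illustrative (frequency, bandwidth) pairs in Hz; assumed values.
A_FORMANTS = [(700.0, 130.0), (1220.0, 70.0), (2600.0, 160.0)]
samples = synthesize_vowel(120.0, A_FORMANTS)
```

Changing `f0_hz`, or the formant bandwidths, is the kind of parameter-level control that makes this family of synthesizers attractive for emotional speech.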


examples: emofilt

open source Java program based on the MBROLA synthesis engine
NOT a complete text-to-speech system
prosody filter between the natural language and digital speech signal processing modules
as multilingual as MBROLA, which currently supports 35 languages
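
What a prosody filter in this position does can be illustrated on MBROLA's input format, where each line holds a phoneme, its duration in milliseconds, and optional (position %, pitch Hz) targets. The sketch below rescales durations and pitch targets and passes everything else through. It is written in Python for brevity and is not emofilt's actual rule set: the function names and the `rate` and `pitch` factors are assumptions chosen for illustration.

```python
def parse_pho_line(line):
    """Split one MBROLA .pho line into phoneme, duration and pitch targets."""
    parts = line.split()
    phoneme, duration_ms = parts[0], float(parts[1])
    numbers = [float(x) for x in parts[2:]]
    # Targets come in (position in percent, pitch in Hz) pairs.
    targets = list(zip(numbers[0::2], numbers[1::2]))
    return phoneme, duration_ms, targets


def apply_prosody(pho_text, rate=1.0, pitch=1.0):
    """Scale speech rate and pitch level of a .pho description.

    rate > 1 speaks faster (shorter durations), pitch > 1 raises F0.
    """
    out = []
    for line in pho_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            out.append(line)  # keep blank lines and comments untouched
            continue
        phoneme, duration_ms, targets = parse_pho_line(stripped)
        fields = [phoneme, str(round(duration_ms / rate))]
        for position, f0 in targets:
            fields += [str(round(position)), str(round(f0 * pitch))]
        out.append(" ".join(fields))
    return "\n".join(out)


NEUTRAL = "_ 100\na 80 0 120 100 100\n_ 100"
# A slower, lower rendering as a crude stand-in for a "sad" setting.
print(apply_prosody(NEUTRAL, rate=0.8, pitch=0.9))
```

Because the filter only rewrites the phoneme and prosody description, it stays independent of the language and voice used by the synthesis engine behind it.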


examples: emoSpeak

emoSpeak is integrated into the MARY text-to-speech framework by DFKI.
Marc Schröder investigated in his Ph.D. thesis how to assign rule-based modifications of speech to emotional dimensions.
the system can be freely downloaded

source: Schröder 2004
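
One simple way to assign rule-based modifications to emotional dimensions is a linear mapping from a point in the dimensional space to relative prosody settings. The sketch below shows the idea only. The coefficients in `RULES`, the parameter names and the value ranges are hypothetical and are not the rules from Schröder's thesis or from MARY.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionPoint:
    """A point in the dimensional space, each axis in -1.0 .. 1.0 (assumed range)."""
    activation: float
    valence: float
    dominance: float


# HYPOTHETICAL coefficients: percent change per unit of
# (activation, valence, dominance). Illustration only.
RULES = {
    "pitch_percent": (30.0, 10.0, -10.0),
    "rate_percent": (50.0, 20.0, 0.0),
    "volume_percent": (33.0, 0.0, 0.0),
}


def prosody_settings(point):
    """Map an emotion point to relative prosody changes via linear rules."""
    dims = (point.activation, point.valence, point.dominance)
    return {
        name: sum(coeff * dim for coeff, dim in zip(coeffs, dims))
        for name, coeffs in RULES.items()
    }


print(prosody_settings(EmotionPoint(activation=1.0, valence=0.0, dominance=0.0)))
```

A dimensional input of this kind allows gradual transitions between emotional states, which a fixed set of category-specific rule sets does not offer directly.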


examples voice conversion

neutral, sad: phase vocoder (Greg Beller, IRCAM)
neutral, angry: PSOLA - LPC conversion (Murtaza Bulut et al, USC)


examples voice transformation

laughter synthesis by LPC synthesis and mass-spring model (Shiva Sundaram, USC 2007)
woman, as boy, as man; man, breathy, whispery, tense: mixed LF + harmonic model (Olivier Rosec, France Telecom 2009)


examples formant synthesis

neutral, sad, angry, crying, content: prosody rules + phonation model (EmoSyn, Burkhardt, 2000)
sad, angry: DECtalk prosody rules (Affect Editor, J. Cahn, MIT 1998)


examples diphone synthesis

neutral, joy: prosody rules (EmoFilt, Burkhardt, 1999)
joy, angry: prosody rules for dimensions, three inventories for soft, normal and tense speech (MARY, M. Schröder, DFKI)


examples statistical based

neutral, joy: HMM models spectral and prosodic features (Tokyo Institute of Technology, Kobayashi Lab)


examples unit selection

Katrin: extralinguistic units
product research: CTTS with expressive units
Damian Shouty: fun personality voices


examples non human

anger, fear: formant synthesis (MIT Kismet robot)
happy, sad: concatenative (Oudeyer: Sony pet robots)


examples singing

bicycle, 1961: articulatory, first song ever (Bell Labs, Gerstman & Mathews)
aria, 1993: articulatory (Pavarobotti, Ingo Titze)
donna nobis, 2007: articulatory (vocal tract lab, Peter Birkholz)


more examples: http://emosamples.syntheticspeech.de


conclusion

emotions are part of natural speech
simulation is possible by either
  modeling the process
  including emotional data
text-to-speech still struggles with producing intelligible, neutral speech
first steps: speaking styles, extralinguistics
first apps: fun, gaming


outlook

discrepancy between natural but inflexible vs. artificial sounding but flexible
solutions, short to medium term:
  very large databases
  hybrid parametric / non-uniform unit selection
  voice transformation techniques
  high quality source filter model based synthesis
solutions in the long run:
  physical modeling


references