Towards Synthesis of Focus in Mandarin Text-to-speech System

Dr. Dezhi HUANG

dezhi.huang@francetelecom.com.cn

SNLP Unit, FTRD Beijing

2005/11/2 V1.1


Table of Contents

1 Synthesis of focus

2 Proposal for SSML

3 Examples with <focus>


Humans have a strong ability to reconstruct information

Evidence from music perception

The “Butterfly Lovers” violin concerto


Humans have a strong ability to reconstruct information (Cont.)

Evidence from human vision


Application model of Mandarin Text-to-speech (Cont.)

Spoken dialog system

[Diagram: PSTN/Wireless; Mandarin Voice-enabled Service Gateway; Mandarin TTS Engine]

Information query by the roadside

[Figure labels: Angry Environment Noise; Environment Noise]


Why do we fail?

The important content is not as prominent as we expect

Weaken the background noise (noise reduction)

Improve the prominence of the information that we need

Utilize the human ability to reconstruct information


What do we need in speech communication?

The key information is always contained in a phrase/word in a sentence

Do you often see Prof. Zhao? No, I have seen him only once.

The container of key information is called the focus.

The semantic centre of a sentence


The value of synthesis of focus

It is helpful for

Analyzing the syntax of a sentence

Understanding the meaning of an utterance

Capturing turn-taking

Comprehending the intent and emotion of the speaker

Improving the acceptance of TTS


Key challenges in synthesis of focus

Difficult to locate a focus in a sentence

Some focuses can be found from the syntactic structure

明天你准备去买什么?我要去买红色的帽子。 (What are you going to buy tomorrow? I am going to buy a red hat.)

Other focuses are determined by the context of the sentence

老王去年退休了。 老王去年退休了。 老王去年退休了。 (Lao Wang retired last year.)

Lack of an appropriate acoustic model to realize a focus

Pitch accent, Duration, Energy, Pause, Weakness

Markup Language for Focus


Table of Contents

1 Synthesis of focus

2 Proposal for SSML

3 Examples with <focus>

Make the synthesized speech clear

Improve the validity of speech communication with TTS


What is SSML?

It is designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in Web and other applications

[Diagram: Natural Language Processing and Understanding; SSML; Speech Synthesis]


<EMPHASIS> in SSML

The emphasis element requests that the contained text be spoken with emphasis (also referred to as prominence or stress)

Level: strong, moderate, none and reduced

For a synthesizer, it is easy to know which word carries sentence stress

老王买了车。 老王买了车。 (Lao Wang bought a car.)
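A minimal sketch of how the same sentence could be marked up with stress on different words, using Python's standard `xml.etree.ElementTree`. The helper name `emphasize` and the choice of which word to stress are assumptions made for illustration; only the `emphasis` element and its `level` attribute come from SSML itself.

```python
import xml.etree.ElementTree as ET


def emphasize(before, target, after, level="strong", lang="zh-CN"):
    """Return an SSML string in which `target` is wrapped in <emphasis>.

    `before` and `after` are the plain text on either side of the stressed word.
    """
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": lang})
    speak.text = before
    em = ET.SubElement(speak, "emphasis", {"level": level})
    em.text = target
    em.tail = after
    return ET.tostring(speak, encoding="unicode")


# Same sentence, stress on the subject versus on the object (illustrative choice).
print(emphasize("", "老王", "买了车。"))
print(emphasize("老王买了", "车", "。"))
```

The markup only tells the synthesizer where the stress goes; it does not say why that word matters in the sentence.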


The proposed <focus> element

The focus element indicates that the contained text is the semantic centre of a sentence and the carrier of its important information

From the perspective of pragmatics

Contrastive focus (also referred to as identificational focus)

Informational focus (also referred to as the presentational focus, natural focus)


Samples of focus

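(1) 你经常见赵教授吗? 我见过他一次。 (Do you often see Prof. Zhao? I have seen him once.)

(2) 昨天老张干什么了? 昨天老张去看病。 (What did Lao Zhang do yesterday? Yesterday Lao Zhang went to see a doctor.)

(3) 是老张帮我修了车。 (It was Lao Zhang who helped me fix the car.)

(4) 他连我也不相信。 (He does not trust even me.)

(5) 他经常和我打球。 (He often plays ball with me.)

(6) 他居然卖了房子。 (To my surprise, he sold the house.)

(7) 我们去钓鱼吧。 (Let's go fishing.)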


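A focus in Mandarin does not correspond one-to-one with an emphasis

Most focuses are realized by stress

是老张退休了。 (It is Lao Zhang who retired.)

明天最高气温多少度?明天最高气温 30度。 (What is tomorrow's highest temperature? Tomorrow's highest temperature is 30 degrees.)

Some are realized by a pause or by intonation

你常常见赵老师吗?我见过他一次。 (Do you often see Teacher Zhao? I have seen him once.)

我们下象棋吧。 (Let's play Chinese chess.)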


Differences between focus and emphasis

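Focus is a concept of semantics and pragmatics

We can mark up the focus without the speech signal

国家工商总局昨天发出紧急通知强调,全国大中城市、边境地区、发生过疫情的地区、养殖大省四类区域必须建立健全禽类产品“挂牌经营”制度,市场内禽类产品要标明禽类生产地、动物检验检疫证明及销售承诺。

(The State Administration for Industry and Commerce issued an urgent notice yesterday stressing that four types of areas, namely large and medium-sized cities, border areas, areas where outbreaks have occurred, and major poultry-farming provinces, must establish and improve a "signposted trading" system for poultry products; poultry products in markets must be labeled with their place of origin, animal inspection and quarantine certificate, and sales commitment.)

Emphasis is a concept of psychoacoustics

Consistent emphasis labels are relatively difficult to achieve without the speech signal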


Differences between focus and emphasis (Cont.)

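Focus always carries the purpose of the utterance

We can know exactly what the sentence means

Emphasis is not directly linked to the purpose of the utterance

The emphasized word may be trivial

黄菊强调,认真学习贯彻五中全会精神,继续推进国有商业银行改革。 (Huang Ju stressed the need to earnestly study and implement the spirit of the Fifth Plenary Session and to continue advancing the reform of state-owned commercial banks.)

他经常和我打球。 (He often plays ball with me.)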


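How can we benefit from focus labeling?

Improve the intelligibility of synthesized speech, especially in noisy communication environments

Q: 明天最晚一班到北京的飞机是几点? (What time is the last flight to Beijing tomorrow?)

A: 在晚上 9 点钟有一班 CZ8071 的飞机飞往北京。 (There is a flight, CZ8071, to Beijing at 9 p.m.)

Q: 几点钟? (What time?)

A: 是 9 点。 (At 9 o'clock.)

Q: 哪一班? (Which flight?)

A: 是 CZ8071 。 (CZ8071.)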
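The dialog above suggests that the focus of an answer depends on what the user asked. A minimal sketch, assuming the dialog system already knows which slot the question targets: the slot names `time` and `flight`, the helper `mark_focus`, and the choice of `StrongStress` are hypothetical and are not part of the proposal.

```python
from xml.sax.saxutils import escape


def mark_focus(parts, asked):
    """Wrap the segment whose slot matches `asked` in a <focus> element.

    parts: list of (slot_name_or_None, text) pairs making up the answer.
    asked: name of the slot the user's question is about.
    """
    out = []
    for slot, text in parts:
        text = escape(text)  # keep the result well-formed XML
        if slot is not None and slot == asked:
            # Assumed attribute values; an answer to a wh-question is
            # treated here as an informational focus.
            out.append(
                '<focus type="informational" method="StrongStress">'
                + text + "</focus>"
            )
        else:
            out.append(text)
    return "".join(out)


# The answer from the dialog, split into hypothetical slots.
answer = [
    (None, "在"),
    ("time", "晚上 9 点钟"),
    (None, "有一班 "),
    ("flight", "CZ8071"),
    (None, " 的飞机飞往北京。"),
]

print(mark_focus(answer, "time"))    # user asked "What time?"
print(mark_focus(answer, "flight"))  # user asked "Which flight?"
```

With this kind of marking, the same answer text could be re-synthesized with the queried item made prominent instead of being repeated in a shortened form.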


How can we benefit from focus labeling? (Cont.)

Focus labeling can be directly applied to text information processing

The next generation of search engines will need to know

which is the topic of a paragraph

which are the focuses of a sentence

Text highlighting is an important step in information retrieval

Keywords in an automatic digest are always focuses
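As a rough illustration of the text-processing use, focus-marked text can be scanned for its focuses without any speech signal. This is only a sketch: the helper name `focus_keywords` is hypothetical, the input is assumed to be well-formed XML with straight quotes, and the wrapper element `s` is added just to give the parser a single root.

```python
import xml.etree.ElementTree as ET


def focus_keywords(marked_up):
    """Return (type, text) pairs for every <focus> element in the sentence."""
    root = ET.fromstring("<s>" + marked_up + "</s>")
    return [
        (el.get("type"), "".join(el.itertext()).strip())
        for el in root.iter("focus")
    ]


sentence = '是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。'
print(focus_keywords(sentence))
```

The extracted strings could then serve as candidate keywords or highlight spans.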


Table of Contents

1 Synthesis of focus

2 Proposal for SSML

3 Examples with <focus>

<focus> indicates what the semantic centre is

<focus> solves the problem of focus location


Attributes of <focus>

Type

informational

contrastive

Method

StrongStress, ModerateStress, None, Pause, Intonation
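One way to picture the two attributes is as closed value sets that a tool can validate. The sketch below assumes exactly the values listed above; the class names (`FocusType`, `FocusMethod`, `Focus`) and the helpers are hypothetical, not part of the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from xml.sax.saxutils import escape


class FocusType(Enum):
    INFORMATIONAL = "informational"
    CONTRASTIVE = "contrastive"


class FocusMethod(Enum):
    STRONG_STRESS = "StrongStress"
    MODERATE_STRESS = "ModerateStress"
    NONE = "None"
    PAUSE = "Pause"
    INTONATION = "Intonation"


@dataclass(frozen=True)
class Focus:
    text: str
    kind: FocusType
    method: FocusMethod

    @classmethod
    def from_attrs(cls, text, type_value, method_value):
        # Enum lookup by value raises ValueError for anything not listed.
        return cls(text, FocusType(type_value), FocusMethod(method_value))

    def to_markup(self):
        return '<focus type="{}" method="{}">{}</focus>'.format(
            self.kind.value, self.method.value, escape(self.text)
        )


print(Focus.from_attrs("老张", "contrastive", "StrongStress").to_markup())
```

Rejecting unknown values early keeps hand-labeled text consistent before it reaches the synthesizer.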


Samples of <focus>

(1) 你经常见 <focus type="informational" method="StrongStress">赵教授</focus> 吗?

我见过他 <focus type="informational" method="Pause">一次</focus> 。

(2) 昨天老张干什么了?

昨天老张 <focus type="informational" method="ModerateStress">去看病</focus> 。

(3) 是 <focus type="contrastive" method="StrongStress">老张</focus> 帮我修了车。


Samples of <focus> (Cont.)

(4) 他连 <focus type="contrastive" method="StrongStress">我</focus> 也不相信。

(5) 他经常 <focus type="informational" method="Pause">和我打球</focus> 。

(6) 他居然 <focus type="informational" method="ModerateStress">卖了房子</focus> 。

(7) 我们 <focus type="informational" method="Intonation">去钓鱼</focus> 吧。
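For an engine that only understands standard SSML, focus-marked text could be lowered to existing elements. The sketch below is one possible fallback, not the proposed realization: the mapping table is entirely an assumption (in particular, placing a `break` before a `Pause` focus and using `prosody pitch="high"` for `Intonation`), and the input is assumed to be well-formed with `<focus>` elements directly inside the sentence.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed mapping from the proposed method values to SSML 1.0 elements.
TEMPLATES = {
    "StrongStress": '<emphasis level="strong">{}</emphasis>',
    "ModerateStress": '<emphasis level="moderate">{}</emphasis>',
    "None": '<emphasis level="none">{}</emphasis>',
    "Pause": '<break strength="medium"/>{}',
    "Intonation": '<prosody pitch="high">{}</prosody>',
}


def to_ssml(sentence):
    """Rewrite <focus> elements in one sentence as standard SSML elements."""
    src = ET.fromstring("<s>" + sentence + "</s>")
    parts = [escape(src.text or "")]
    for el in src:
        body = escape("".join(el.itertext()).strip())
        if el.tag == "focus":
            method = el.get("method", "").strip()
            # An unknown method raises KeyError rather than being guessed.
            parts.append(TEMPLATES[method].format(body))
        else:
            parts.append(body)  # other child elements are flattened to text
        parts.append(escape(el.tail or ""))
    return "<s>" + "".join(parts) + "</s>"


print(to_ssml('我见过他<focus type="informational" method="Pause">一次</focus>。'))
print(to_ssml('是<focus type="contrastive" method="StrongStress">老张</focus>帮我修了车。'))
```

Note that the `type` attribute is dropped in this lowering, which is exactly the semantic information that `emphasis` cannot carry.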


Thank you!
