Speech Processing 11-492/18-492 · tts.speech.cs.cmu.edu/courses/11492/slides/sds_treedialogs.pdf

Speech Processing 11-492/18-492

Spoken Dialog Systems

Tree based dialogs

VoiceXML

State-based Dialogs

Simple state-based dialog systems

Get Name

Get Account number

Get PIN

Present balance

Go back to start or exit
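
The sequence of states above can be sketched as a small table-driven state machine. This is only an illustrative sketch: the state names, the `TRANSITIONS` table, and the `run_dialog` helper are assumptions made for this example, not part of the course material.

```python
# Hypothetical state names for the banking dialog on the slide.
TRANSITIONS = {
    "get_name": "get_account",
    "get_account": "get_pin",
    "get_pin": "present_balance",
    "present_balance": "restart_or_exit",
}


def run_dialog(wants_restart):
    """Walk the states in order and return the list of states visited.

    wants_restart is a list of booleans: one answer for each time the
    dialog reaches the final "go back to start or exit" state.
    """
    answers = iter(wants_restart)
    visited = []
    state = "get_name"
    while state != "exit":
        visited.append(state)
        if state == "restart_or_exit":
            # Loop back to the first state, or leave the dialog.
            state = "get_name" if next(answers, False) else "exit"
        else:
            state = TRANSITIONS[state]
    return visited
```

Calling `run_dialog([])` visits each state once and exits; `run_dialog([True])` goes around the loop twice.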

State-based Dialogs

Get Name:

What is your name?

ASR Name

May be correct (in the database)

May be unknown (not in database)

May not be name (What do I say?/Help/Repeat)

Should you echo the recognized name?

Confirmation (or not)
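
The three possible outcomes of the "Get Name" state, and the choice to echo the name back, might be handled roughly as follows. The name database, the command list, and the prompt wording are all hypothetical and chosen only for illustration.

```python
KNOWN_NAMES = {"alice smith", "bob jones"}      # hypothetical database
COMMANDS = {"help", "repeat", "what do i say"}  # utterances that are not names


def classify_name_result(asr_text):
    """Classify what the recognizer returned for "What is your name?"."""
    text = asr_text.strip().lower().rstrip("?")
    if text in COMMANDS:
        return "command"   # not a name: give help or repeat the prompt
    if text in KNOWN_NAMES:
        return "known"     # in the database: continue (maybe confirm)
    return "unknown"       # not in the database: ask again


def next_prompt(asr_text):
    """Pick the system's next prompt, echoing a known name as confirmation."""
    kind = classify_name_result(asr_text)
    if kind == "command":
        return "Please say your first and last name."
    if kind == "known":
        return f"I heard {asr_text.strip()}. Is that correct?"
    return "Sorry, I could not find that name. What is your name?"
```

Whether to echo the recognized name is a design decision; this sketch always confirms, which costs an extra turn.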

State-based dialog

Get name

Check in database

Ask again if not

Deal with help

Get account number

Check in database (with name)

Confirm account number and name

For security
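
Checking the account number against the name already collected could look something like this. The `ACCOUNTS` table and both helper functions are assumptions for the sake of the example.

```python
# Hypothetical account database: account number -> owner's name.
ACCOUNTS = {"1234": "alice smith", "5678": "bob jones"}


def check_account(name, account_number):
    """Return True only if the account exists and belongs to this name."""
    return ACCOUNTS.get(account_number) == name.lower()


def confirmation_prompt(name, account_number):
    """Read the digits back one at a time so the caller can confirm them."""
    spoken_digits = " ".join(account_number)
    return f"{name}, account {spoken_digits}. Is that correct?"
```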

State-based Interaction

Trees can get very large

User can get lost easily

You want to minimize the number of turns

Faster throughput means more calls

Faster throughput means happier customer

The level of help

First time users *need* a successful call

Otherwise, they won't call back

Having very helpful prompts is good at the start

But they get annoying quickly

Designing prompts is a craft

What is spoken should be understood

How much should you tailor it to the user?

VoiceXML

A W3C standard for voice browsing

XML-based “programming” language for speech

Output synthesized (and recorded) speech

Recognition of speech

Recording of spoken input

Telephony features

VoiceXML

From grammars (JSGF: Java Speech Grammar Format)

From tri-grams

From “Domain Managers”

Credit card numbers

City, State

VoiceXML

<ssml> markup

Choice of voice

Choice of language

Choice of how to pronounce things

Specify breaks, timing, emphasis
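
As a rough illustration of these markup features, the sketch below builds a small SSML fragment with Python's standard `xml.etree.ElementTree`. The voice name and the sentence are made up, and the SSML namespace declaration is left out for brevity. The `voice`, `sub`, `break`, and `emphasis` elements are standard SSML.

```python
import xml.etree.ElementTree as ET


def build_ssml():
    """Build a small SSML fragment showing the features listed above."""
    # Choice of language: xml:lang on the root element.
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "en-US"})
    # Choice of voice: the name here is a placeholder, not a real voice.
    voice = ET.SubElement(speak, "voice", {"name": "example-voice"})
    voice.text = "The next bus is the "
    # <sub> tells the synthesizer how to pronounce a token.
    sub = ET.SubElement(voice, "sub", {"alias": "sixty one C"})
    sub.text = "61C"
    sub.tail = ". "
    # <break> inserts a pause; .tail is the text that follows the element.
    pause = ET.SubElement(voice, "break", {"time": "500ms"})
    pause.tail = "It leaves in "
    strong = ET.SubElement(voice, "emphasis", {"level": "strong"})
    strong.text = "two"
    strong.tail = " minutes."
    return ET.tostring(speak, encoding="unicode")
```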

Structure

<vxml version="2.0">
  <form>
    <block>
    </block>
  </form>
  <form>
    <block>
      Goodbye!
    </block>
  </form>
</vxml>

Basic Tags

<field> gathers information from the user through speech

<record> records audio from the user

<subdialog> runs a sub-dialog

<field> tag

<form>
  <field>
    <prompt>Which bus line do you want?</prompt>
    <help>Please say your desired bus number, e.g. 61C</help>
  </field>
</form>
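
A form like the one above can also be generated by a program. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `build_field_form` and the field name `busline` are hypothetical, and the recognition grammar that a real field would need is omitted.

```python
import xml.etree.ElementTree as ET


def build_field_form(form_id, field_name, prompt_text, help_text):
    """Return a VoiceXML document with one form holding one field."""
    vxml = ET.Element("vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, "form", {"id": form_id})
    field = ET.SubElement(form, "field", {"name": field_name})
    ET.SubElement(field, "prompt").text = prompt_text
    ET.SubElement(field, "help").text = help_text
    return ET.tostring(vxml, encoding="unicode")


document = build_field_form(
    "GetBusNumber",  # form id, the target of the <goto> in these slides
    "busline",       # hypothetical field name
    "Which bus line do you want?",
    "Please say your desired bus number, e.g. 61C",
)
```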

Flow of Control

Goto: <goto next="#GetBusNumber"/>

<prompt> Sorry that bus no longer runs</prompt>

<prompt> Sorry it’ll be a long wait </prompt>

<prompt> One will be along shortly </prompt>

Variables

<prompt> I just wanted to say <value expr="var1"/> </prompt>

Recognition Grammars

Speech Recognition Grammar Specification

(SRGS)

Augmented BNF

$order = I would like a $drink

$drink = coke | pepsi | mountain_dew
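
To see which sentences these two rules accept, the grammar can be expanded by hand or with a few lines of code. The sketch below is an illustration only, not an SRGS parser: it hard-codes the two rules from the slide in a dictionary, and works only for grammars without recursion.

```python
import itertools

# The two rules from the slide, as rule name -> list of alternatives.
# Each alternative is a list of tokens; "$name" refers to another rule.
GRAMMAR = {
    "$order": [["I", "would", "like", "a", "$drink"]],
    "$drink": [["coke"], ["pepsi"], ["mountain_dew"]],
}


def expand(symbol):
    """Return every word sequence that `symbol` can produce."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a plain word expands to itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each token, then combine the choices in order.
        choices = [expand(token) for token in alternative]
        for combo in itertools.product(*choices):
            results.append([word for part in combo for word in part])
    return results


def accepts(utterance):
    """True if the utterance is one of the sentences in $order."""
    return utterance.split() in expand("$order")
```

With these rules, `$order` produces exactly three sentences, one per drink; anything else is outside the grammar and would be rejected by the recognizer.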

VoiceXML Browsers

Compatibility

Not as compatible as one would like

<object> elements can differ between browsers (but are useful)

City, State recognizers

ECMAScript (JavaScript)

Beyond VoiceXML (in VoiceXML)

Mixing HTML/CGI scripts in VoiceXML

Use PHP to generate VoiceXML files

Use URLs (with ?...) to calculate/get data

http://weather.com?zip=“15213”

Use URLs to get waveforms

http://tts.com?text=“Hello World”
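
When such URLs are built by a program, the query values need to be URL-encoded rather than quoted. A minimal sketch with Python's standard `urllib.parse` follows; `build_url` is a hypothetical helper, the two hosts are just the placeholders from the slide, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_url(base, **params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base}?{urlencode(params)}"


# Data lookup and waveform lookup, using the slide's example values.
weather_url = build_url("http://weather.com/", zip="15213")
audio_url = build_url("http://tts.com/", text="Hello World")
```

Here the space in "Hello World" is encoded as `+`, so the text survives as a single query value.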

VoiceXML future

N-gram grammar Markup Language

Many browsers have own extensions

Pronunciation Lexicon Markup Language

A way to add new items to the lexicon

Hard to find good standards

Call Control Markup Language

For management and logging of calls

Microsoft SALT

SALT tags

listen, dtmf, prompt, bind, grammar (plus SSML)

Designed for desktop not just phone

Designed for shared documents

Viewing (HTML) and Speech (SALT)

Available Systems

Nuance

BeVocal

Tellme

Tellme Studio

OpenVXI/publicvoicexml.org

Many others

SDS Architecture
