Outline
MS Technologies
VoiceXml
– Demo
Speech .NET
– Demo
Future
Questions (throughout)
~25 slides

Tools
MS Agents
SAPI / Speech SDK 5.1 (.NET wrappable)
Office
AutoPC
???
ASP .NET (VoiceXml)
(beta) Speech .NET / IE Speech Add-In
… SALT Telephony gateway (early 2003)
… Pocket IE Speech Add-In (mid 2003)

Devices
Phone
– billions of devices; people are comfortable speaking to them
Desktop PC
– large market; speech input is slower and uncomfortable
Pocket PC
– small market; opportunities for speech (device limitations)
Tablet PC
– new market; speech friendly (slate models don't have keyboards)

Phone
ASP .NET w/ VoiceXml 2.0
– Production quality now
– Multiple vendor support
Speech .NET VoiceOnly
– Currently no way to deploy and test over a phone
– Speech .NET Beta 2 has telephony simulation
– MS target market for Speech .NET

Desktop PC
Web
– Speech .NET MultiModal
  – Beta 2 IE Speech Add-In
– Embedded control w/ SAPI
– MS Agents
Fat
– SAPI
– MS Agents

Pocket PC
Web
– SALT Pocket IE Speech Add-Ins (mid 2003)
Fat
– 3rd parties only
– MS Reader does not support TTS

Tablet PC - TODAY!
Web
– … same as desktop PC
– Beta 2 has added support for Tablet PC
– Virtual keyboard has speech control
Fat
– … same as desktop PC
– Virtual keyboard has speech control
– MS Reader should be able to support TTS
– Digital Ink is currently more compelling to MS

VoiceXml
XML-based language
– Declarative – XML tags, grammars
– Procedural – JavaScript
Telephony Gateway is the client
– Event driven – Bargein, Goodbye
– Object oriented – Properties

Usage
Input
– Speech Recognition (Command and Control)
– DTMF
– Voice recording and posting to a server
Output
– Text-To-Speech
– Prerecorded audio files
Telephony control
– Hang-up, Transfers, …
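
The input, output, and telephony-control items above map onto a handful of VoiceXML 2.0 elements. The following is a minimal sketch rather than a production dialog: it uses Python's standard library to emit a VoiceXML document, and the function name, form id, field names, URL, and audio file name are all hypothetical. It also assumes the gateway supports the optional built-in `digits` field type, which accepts both spoken digits and DTMF.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_message_dialog(submit_url: str, greeting_wav: str) -> str:
    """Return a small VoiceXML 2.0 document as a string (illustrative only)."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "leave_message"})

    # Input: speech recognition or DTMF, via the built-in "digits" type
    # (assumed to be supported by the gateway).
    field = ET.SubElement(form, "field", {"name": "pin", "type": "digits"})
    prompt = ET.SubElement(field, "prompt")
    # Output: prerecorded audio; the element text is the TTS fallback
    # spoken if the file cannot be played.
    audio = ET.SubElement(prompt, "audio", {"src": greeting_wav})
    audio.text = "Welcome."
    # Output: plain prompt text is rendered with Text-To-Speech.
    audio.tail = " Please say or key in your PIN."

    # Telephony control: hang up after the caller is silent three times.
    noinput = ET.SubElement(field, "noinput", {"count": "3"})
    ET.SubElement(noinput, "prompt").text = "Goodbye."
    ET.SubElement(noinput, "disconnect")

    # Input: record the caller's voice.
    record = ET.SubElement(
        form, "record", {"name": "message", "beep": "true", "maxtime": "30s"}
    )
    ET.SubElement(record, "prompt").text = "Leave your message after the beep."

    # Post the PIN and the recording to a server.
    block = ET.SubElement(form, "block")
    ET.SubElement(
        block,
        "submit",
        {
            "next": submit_url,
            "method": "post",
            "enctype": "multipart/form-data",
            "namelist": "pin message",
        },
    )
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_message_dialog("http://example.com/messages", "greeting.wav"))
```

A transfer would follow the same pattern with a `transfer` element in place of `disconnect`; how a given gateway handles either is vendor specific.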

VoiceXml - SALT
VoiceXml : ??? :: SALT : Speech .NET
– Nuance has some WYSIWYG
SALT is considered lightweight compared to VoiceXml
SALT was submitted to W3C in August 2002
VoiceXml is v2.0 in W3C
– Mandatory W3C grammar spec
Beta 2 Speech .NET has moved to W3C SRGS
VoiceXml has complementary specs (ccXml)
VoiceXml is moving to MultiModal as well

VoiceXml - SALT
VoiceXml = AT&T, Motorola, TellMe, (IBM)
SALT = MS, SpeechWorks, Intel, (BeVocal)
VoiceXml has multiple vendor support with venture capital from before the bubble burst
Most vendors will support both specs
VoiceXml has ~15,000 developers
SALT has potentially millions

SALT
I have not read the new spec
Remember doing an in-head mapping to VoiceXml when reading an early spec
Why
– Common spec for MultiModal operation
– Multiple modes of interaction with the same syntax
– Speech enabling existing sites
Why not VoiceXml
– MultiModal retrofit harder than redo

Speech .NET
MS implementation of SALT (VoiceWebSolutions + DreamWeaver MX)
Some Beta 1 Speech .NET apps still work, because SALT has not changed much, but Speech .NET Beta 2 controls have
VoiceXml is not as portable between vendors as it should be; the Speech .NET controls could help mitigate this for SALT
– i.e. layer of abstraction for voice browser wars

Code
Creating static grammars and prompts
Very little server-side code
– Only dynamic grammars / prompts
– Server-side code mods to better support speech
Mainly setting properties on Speech controls and tying to client-side JavaScript
Tie JavaScript to mouse-click events to avoid redundant code
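
Dynamic grammars are the case where server-side code is still needed: the list of things a caller may say comes from data rather than from a static file. The sketch below is not the Speech .NET API. It is a generic Python illustration, under the assumption that the grammar is served in the W3C SRGS XML form, of turning a list of phrases into a grammar at request time. The function name and the `city` rule id are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_choice_grammar(rule_id: str, phrases: Iterable[str], lang: str = "en-US") -> str:
    """Return an SRGS XML grammar that accepts exactly one of the given phrases."""
    cleaned = [p.strip() for p in phrases if p.strip()]
    if not cleaned:
        # An empty list of alternatives would not be a usable grammar.
        raise ValueError("at least one phrase is required")

    grammar = ET.Element(
        "grammar",
        {
            "xmlns": SRGS_NS,
            "version": "1.0",
            "{%s}lang" % XML_NS: lang,  # serialized as xml:lang
            "mode": "voice",
            "root": rule_id,
        },
    )
    rule = ET.SubElement(grammar, "rule", {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in cleaned:
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(one_of, "item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # In a real application the phrases would come from a database query.
    print(build_choice_grammar("city", ["Seattle", "Portland", "Boise"]))
```

Static grammars need none of this; they can be written once with the grammar tools and referenced by URL.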

Impression
Separate app layers to reduce complexity
– Voice UI will be less functional, design is key
Learning low level SALT might be easier than high level Speech .NET controls
Application controls change this in Beta 2
Speech .NET has a great debugger (now server side too), grammar, and prompt tools
Speech Control Editor was needed for dev
IE Audio meter was needed for MultiModal
MultiModal has some time to grow

Industry
Wrote 1st VoiceXml article a year ago
– Received 1st proposal request last month
– 1 other proposal request since then
Wrote 1st Speech .NET article 5 months ago
– Request for an article from MSDN magazine

Voice Recognition
PSTN is less secure than Internet!
– More accessible and easier to automate a hack
Traditionally spoken password OR DTMF pin, also #
Clients always confuse it with speech recognition
Not a part of VoiceXml or SALT specs
– Telephony gateways have proprietary implementations
Not useful for identifying somebody
Useful for confirming somebody is who they say they are
Prints have to change when device changes

Future (MS Speech)
SALT Telephony gateways
Speech .NET (VoiceOnly then MultiModal)
Pocket IE Speech Add-In
.NET Fat-client Speech APIs
– Desktop / Tablet / PPC
MS or 3rd party VS .NET VoiceXml controls
Possibility for Speech .NET controls to render both SALT and VoiceXml

Future
Lots of W3C Voice specs …
VoiceXml MultiModal browser
Auto (hands-free, navigation, radio)
3G (bridge voice and wireless web)
– offload Speech processing
– VOIP or PSTN
– Pocket PC Phone Edition / SmartPhones
IBM recently announced a chip for Speech on mobile devices