Speaking to Computers

Alex Acero

Manager, Speech Research Group

Microsoft Research

alexac@microsoft.com

Feb 14th 2003

Talk Outline

Role of speech technology in devices

Telephony

Smartphones and PDAs

Multimodality in User Interface

The Promise of Speech Technology

Role of Speech in Different Devices

[Figure: devices plotted by ease of text input (keyboard/pen), low to high, against ease of GUI (screen/pointer), low to high. Devices shown: Phone, Screen Phone, Car, Internet TV, PDA, Tablet PC, PC.]

A Roadmap for Speech

[Figure: Phone, PC, Screen Phone, PDA, Tablet PC, Car, and Internet TV plotted by ease of text input (keyboard/pen) against ease of GUI (screen/pointer), each running low to high, with three regions labeled Speech-Only Telephony, Dictation, and Multimodal Command/Control.]

Speech Technology Market Opportunity

[Figure: application areas positioned using the labels Technology Readiness, Customer Need, and Poor Alternative. Areas shown: Meeting / Voicemail Transcription, Mobile Devices / Cars, Telephony / Call Center, Accessibility, Desktop Dictation, Desktop Command & Control.]

The Business Value of Speech for Call Centers

Customer Focus

Less Time/Call

Efficient Agents

Less Time in Queue

Increased System Usage

Customer Retention

$5/call to $0.20/call

Reduced Call Time

Fewer Agents

New Revenue Opportunities

Up-Sell/Cross-Sell
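
The per-call figure on this slide can be turned into a rough savings estimate. The sketch below assumes that $5 is the cost of an agent-handled call and $0.20 the cost of an automated one, which the slide implies but does not state. The function names, the call volume, and the automation rate are hypothetical inputs for illustration, not figures from the talk.

```python
def savings_per_call(agent_cost: float, automated_cost: float) -> float:
    """Absolute saving when one call moves from an agent to automation."""
    return agent_cost - automated_cost


def relative_reduction(agent_cost: float, automated_cost: float) -> float:
    """Fractional cost reduction for each automated call."""
    return savings_per_call(agent_cost, automated_cost) / agent_cost


def annual_savings(calls_per_year: int, automation_rate: float,
                   agent_cost: float = 5.00,
                   automated_cost: float = 0.20) -> float:
    """Yearly saving if a given fraction of calls is fully automated.

    The defaults are the slide's two per-call costs; calls_per_year and
    automation_rate are assumptions the caller must supply.
    """
    automated_calls = calls_per_year * automation_rate
    return automated_calls * savings_per_call(agent_cost, automated_cost)


if __name__ == "__main__":
    # Hypothetical call center: 1,000,000 calls a year, half automated.
    print(f"{relative_reduction(5.00, 0.20):.0%} cheaper per automated call")
    print(f"${annual_savings(1_000_000, 0.5):,.0f} saved per year")
```

Under these assumptions each automated call costs 96% less, so the total saving scales directly with the share of calls that never reach an agent.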

Call Center Examples

Amtrak
• 61% Increase in Satisfaction
• 75% Increase in Automation Rate
• 90% Increase in Ticket Sales

Thrifty Car Rental
• 40% increase in CSR productivity
• $1 million first year savings

Merrill Lynch
• Automation rates from 82% to 90%
• First Year Savings $6.3M

The Business Value of Speech for Operators

[Figure: chart of Data Revenue and Voice Revenue, in US$M, for 2000 through 2007; vertical axis from 0 to 35,000.]

The mobile operators need to make money from value-added services!

If you still doubt speech is good for the call center…

Why Speech at Microsoft?

Natural UI, or the combination of speech recognition, natural language understanding, automatic learning... Those are the key technologies that will have the most impact over the next 15 years.

Bill Gates, Microsoft Chairman

Microsoft Speech Server & SDK

Visual Studio + ASP.NET + SALT

Multiple Devices

Call center + multimodal solution

Unifies web & call center

Reduces TCO

Speech in Mobile Devices

Microsoft Smartphone & PocketPC Phones
• Rich Client
• 3% (2004) to 16% (2007) of WW mobile phone market

Smartphones
• Thin Client
• 11% (2004) to 25% (2007) of WW mobile phone market

Cellular Phones
• No Client
• 86% (2004) to 59% (2007) of WW mobile phone market

SOURCE: Gartner, IDC, Microsoft

Thin Client Devices Over Voice Channel

[Diagram: Web Server, MS Speech Server, PSTN, SMS Messages, Voice Only Apps.]

Rich Client Devices Over Data Channel

[Diagram: Web Server, MS Speech Server, Grammars, Prompts, ASP.NET Dialogs, Speech Engine Services, Telephony App Services, SMS Push for Browser Launch.]

Microsoft Voice Command

• Pocket PC voice-enabled applications: Voice Dialer, Contacts, Calendar, Media Player
• No connectivity necessary (100% embedded)
• No training needed (speaker-independent)
• Continuous speech recognition

“Call John at home”
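
A command such as “Call John at home” fits a small command-and-control grammar: a verb, a contact name, and an optional location. The sketch below is not how Voice Command is implemented, which the slides do not describe. It assumes the recognizer has already produced text, and it uses a hypothetical `call <contact> [at home|work|mobile]` pattern. The names `CallCommand` and `parse_call_command` are invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class CallCommand(NamedTuple):
    """Structured result of a recognized dialing command (hypothetical)."""
    contact: str
    location: Optional[str]


# Hypothetical grammar: "call <contact> [at <home|work|mobile>]".
# The contact is matched lazily so a trailing "at home" is not absorbed
# into the name.
_CALL_PATTERN = re.compile(
    r"^call\s+(?P<contact>[a-z][a-z' -]*?)"
    r"(?:\s+at\s+(?P<location>home|work|mobile))?$",
    re.IGNORECASE,
)


def parse_call_command(text: str) -> Optional[CallCommand]:
    """Return a CallCommand for matching text, or None if it does not fit."""
    match = _CALL_PATTERN.match(text.strip())
    if match is None:
        return None
    location = match.group("location")
    return CallCommand(
        contact=match.group("contact").strip().title(),
        location=location.lower() if location else None,
    )


if __name__ == "__main__":
    print(parse_call_command("Call John at home"))
    print(parse_call_command("open calendar"))
```

Keeping the grammar this narrow is what makes speaker-independent, fully embedded recognition practical on a handheld: the recognizer only has to distinguish a few verbs, the names in the contact list, and a short set of locations.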

Multimodal Interactive Pad (MIPAD)

Multimodal Map

Current Speech User Interfaces

Need improved speech user interfaces
• Even no errors and fast processing are not sufficient
• But errors occur: better error correction needed

Social issues:
• Microphones can’t tether user
• Users more comfortable talking to phones, cars
• Talking to computers not likely in meetings or cubicles

The Future of Natural User Interfaces

Bridging The Gap

[Diagram: End User Needs; Technology, Research; Software Scenarios.]

Thank You!

http://research.microsoft.com/srg