Automatic Switchboard Operator Luboš Šmídl, Tomáš Valenta Department of Cybernetics Faculty of Applied Sciences University of West Bohemia in Pilsen

Contents

- A dialogue system
- The dialogue
- Automatic speech recognition and speech grammar
- The data
- Advanced features
- Maintenance-free running
- Experiences and future
- Interesting facts
- Other applications

Purpose

Automatic Switchboard Operator is a voice application that answers phone calls and transfers callers to the person they request. Callers give their input preferably by voice, and the system responds by voice as well.

- A voice dialogue system
- Serves the whole University of West Bohemia (UWB) in Pilsen
- The first application of its kind on this scale in the Czech Republic

A dialogue system

[Architecture diagram; components shown: Data, Document server, Dialogue controller, Scripting engine, ASR, TTS, Telephone back-end, Telephone network, Caller]

- Data: MySQL, Oracle, …
- Document server: PHP
- Dialogue controller: VoiceXML Interpreter
- Speech engines: ERIS by SpeechTech and the Department of Cybernetics
- Telephony: SIP or ISDN

The dialogue

- Experienced users vs. newcomers
  - Shortcuts
  - Call the n-th number
- Specifying the called person
  - First name and surname
  - Titles and degrees
  - Department and function
- Voice or DTMF input
  - Smith → 76484#
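The DTMF example above (Smith → 76484#) matches the standard letter layout of a telephone keypad. The following is a minimal sketch of that mapping, not the system's actual code; the function name `name_to_dtmf` and the step that strips Czech diacritics before the lookup are assumptions made for illustration.

```python
import unicodedata

# Standard telephone keypad letter groups.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def name_to_dtmf(name: str, terminator: str = "#") -> str:
    """Spell a surname as keypad digits, followed by a terminator key.

    Assumption: diacritics are dropped first (e.g. "Šmídl" is keyed as "smidl").
    """
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii").lower()
    digits = "".join(LETTER_TO_DIGIT[ch] for ch in ascii_name if ch in LETTER_TO_DIGIT)
    return digits + terminator


print(name_to_dtmf("Smith"))  # 76484#
```

Several surnames can share one digit string, so a real system would still need to disambiguate the matches, for example by reading the candidates back to the caller.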

Automatic Speech Recognition

- Methods
  - LVCSR
  - Isolated words
- Grammars

person = (
    [(salut function) | (function salut) | salut | function]
    [degrees] (([firstname] surname) | (surname firstname))
    [degrees]
    [function | department]
) | function;
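To make the rule above concrete, the sketch below expands it into every utterance it accepts for a single person. It reads square brackets as optional parts and `|` as alternatives. The helper names and all word lists in the example call are invented for illustration and are not taken from the real phonebook.

```python
from itertools import product


def words(variants):
    """Turn spoken variants ("head of department") into word tuples."""
    return [tuple(v.split()) for v in variants]


def optional(alternatives):
    """[x] in the grammar: either nothing or one of the alternatives."""
    return [()] + list(alternatives)


def concat(*slots):
    """Every sequence built by taking one alternative from each slot, in order."""
    return [sum(choice, ()) for choice in product(*slots)]


def expand_person(salut, function, degrees, firstname, surname, department):
    s, f, t = words(salut), words(function), words(degrees)
    j, p, d = words(firstname), words(surname), words(department)
    # [(salut function) | (function salut) | salut | function]
    prefix = optional(concat(s, f) + concat(f, s) + s + f)
    # ([firstname] surname) | (surname firstname)
    name = concat(optional(j), p) + concat(p, j)
    # [function | department]
    suffix = optional(f + d)
    full = concat(prefix, optional(t), name, optional(t), suffix)
    # "... ) | function": the function alone is also accepted.
    return {" ".join(u) for u in full + f}


utterances = expand_person(
    salut=["mister"],
    function=["head of department"],      # assumed wording
    degrees=["professor", "engineer"],
    firstname=["josef"],
    surname=["psutka"],
    department=["department of cybernetics"],
)
print(len(utterances))
```

Even with one or two variants per field the set grows quickly, because the sizes of the optional slots multiply. This toy word list already yields 406 distinct utterances.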

Speech grammar complexity

Example: Prof. Ing. Josef Psutka, CSc., head of the Department of Cybernetics (DCy)

1. Josef Psutka
2. Engineer Psutka
3. Boss of the Department of Cybernetics
4. Mister Psutka, professor
5. Professor Psutka, the Department of Cybernetics
6. Psutka Josef
7. etc.

26,042 acceptable utterances

The Data

- Visual data vs. aural data
  - Prof. Ing. Psutka → professor engineer psutka
- Generating pronunciations
  - Rule-based, for TTS vs. for ASR
  - Tomáš → Tomáš, Thomas, Tom
- Field tagging
  - Better grammar matching, faster DB search
  - J (first name), P (surname), D (department), F (function), T (degree)
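The conversion from written to spoken form and the field tagging can be sketched as follows. This is an illustration only: the two lookup tables hold just the examples from the slide, and the function names and the record layout are assumptions, not the system's actual data model.

```python
# Written abbreviation -> spoken word (only the examples shown above).
TITLE_WORDS = {"prof.": "professor", "ing.": "engineer"}

# First name -> variants the recognizer should accept (example from above).
FIRSTNAME_VARIANTS = {"Tomáš": ["tomáš", "thomas", "tom"]}


def to_spoken(visual: str) -> str:
    """Convert the written form into the lower-case word sequence used for speech."""
    tokens = visual.replace(",", " ").split()
    return " ".join(TITLE_WORDS.get(tok.lower(), tok.lower()) for tok in tokens)


def tag_record(degrees: str, firstname: str, surname: str):
    """Return (tag, spoken form) pairs: T = degree, J = first name, P = surname."""
    tagged = [("T", to_spoken(degree)) for degree in degrees.split()]
    variants = FIRSTNAME_VARIANTS.get(firstname, [firstname.lower()])
    tagged += [("J", variant) for variant in variants]
    tagged.append(("P", surname.lower()))
    return tagged


print(to_spoken("Prof. Ing. Psutka"))      # professor engineer psutka
print(tag_record("Ing.", "Tomáš", "Novák"))  # hypothetical phonebook entry
```

Because each recognized word carries its field tag, the database lookup can query the matching column directly instead of searching every field.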

Advanced features

- Web presentation
- Administration
  - Rules for pronunciations
  - Shortcuts or direct numbers
  - Callers' rights
- Phonebook searching
- Monitoring
- Statistics

Maintenance-free running

- Windows services, daemons
- Task scheduler
  1. Import data
  2. Generate pronunciations
  3. Generate and compile grammar
  4. Optional preventive restart
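The scheduled update can be thought of as a small ordered pipeline. The sketch below is an assumption about how such a job might be organized, not the deployed scheduler task: the step functions are placeholders, and stopping at the first failure (so that the previously compiled grammar stays in service) is a design choice added here for illustration.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_nightly_update(
    steps: List[Step],
    restart: Optional[Callable[[], None]] = None,
) -> List[str]:
    """Run the steps in order and return the names of those that completed.

    If a step raises, the run stops there and the optional restart is skipped.
    """
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as error:  # a broad catch is acceptable for a sketch
            print(f"step '{name}' failed: {error}")
            return completed
        completed.append(name)
    if restart is not None:
        restart()
        completed.append("restart")
    return completed


# Placeholder steps standing in for the real import/generation code.
example_steps: List[Step] = [
    ("import data", lambda: None),
    ("generate pronunciations", lambda: None),
    ("generate and compile grammar", lambda: None),
]
print(run_nightly_update(example_steps, restart=lambda: None))
```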

Experiences

- Running since 2008
- Extended grammar accepting
  - "Hello", "Please", "Thank you"
  - "I would like to talk to"
- Optimizing prompts
- Application made general

Future

- Using statistics for person/number selection
- More info about employees
- More features and speed for experienced users
- New technologies: better TTS and ASR

Interesting numbers

- 2,095 persons
- 2,322 telephones
- 35,566,194 utterances
- Grammar compilation time: 2.5 hours

Other dialogue applications

- Entrance exams
  - Since June 2000
  - 3,000–5,000 calls a year
- Exams
  - Web access alternative
- Recent news reading
  - RSS from www.idnes.cz
  - Categories: general, sport, economics, …
- ASR demo
  - Users can test ASR capabilities
  - Web interface, users log in, own set of utterances
- and others…

Thank you for your attention

Do you have any questions?

VoiceXML

- Mark-up language based on XML
- Main advantages
  - Minimizes client/server communication (more interactions in a document)
  - Hides low-level implementation details from the programmer
  - Enables better portability
  - Designed for content providers and dialogue designers
  - Separates user interface (VoiceXML) from program logic
  - Easy for both simple and complex applications
- VoiceXML Interpreter (like a web browser)
  - Document getter
  - Document interpreter (dialogue controller)
  - I/O interface to the speech engine: telephony, ASR and TTS units
- Two kinds of dialogue: forms and menus
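As an illustration of the two dialogue kinds, the sketch below builds a tiny VoiceXML 2.0 document containing one menu and one form. It uses Python's standard library as a stand-in for the document server; the prompt texts, the dialogue ids and the grammar file name `person.grxml` are all hypothetical.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_vxml(grammar_url: str, prompt_text: str) -> str:
    """Return a minimal VoiceXML document with one menu and one form."""
    root = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Menu: the caller picks one of a few fixed choices.
    menu = ET.SubElement(root, "menu", {"id": "main"})
    ET.SubElement(menu, "prompt").text = "Say operator to reach a person."
    ET.SubElement(menu, "choice", {"next": "#operator"}).text = "operator"

    # Form: collects a value (here the requested person) using a speech grammar.
    form = ET.SubElement(root, "form", {"id": "operator"})
    field = ET.SubElement(form, "field", {"name": "person"})
    ET.SubElement(field, "prompt").text = prompt_text
    ET.SubElement(field, "grammar", {"src": grammar_url, "type": "application/srgs+xml"})

    return ET.tostring(root, encoding="unicode")


print(build_vxml("person.grxml", "Who would you like to speak to?"))
```

The interpreter fetches such a document, plays the prompts through TTS and runs ASR against the referenced grammar. That is how several caller interactions can be handled within a single fetched document.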