James
A Personal Mobile Universal Speech Interface for Electronic Devices
Current Speech Application Concept
Phone Client, PDA Client, Computer Client
Speech Application
Backend
Current Electronic Devices
???, ???, ???
Speech Application
Backend
???Questions???
History: Why is there a conceptual gap?
Motivation: Is speech a useful modality for “other” electronic devices?
Hardware: How would one get speech in “other” devices?
Architecture: What should the system look like?
Dialog: What should/will these conversations be like?
History
Why is there a conceptual gap?
Speech is still hard.
That will change.
Motivation
Is speech a useful modality for “other” electronic devices?
It seems probable.
There has been some positive research (see Microsoft).
Ideas?
Hardware
How would one get speech in “other” devices?
There is no need to, as long as the devices can be remote controlled via a known interface.
Refer to the system architecture.
Architecture
HAVi adapter
X10 adapter
Mobile Speech Client
Dialog
USI Model
Artificial subset language
Tree-structured functions
Universal primitives
User-directed
Great for recognition
Entirely declarative (automatic)
[Figure: function tree rooted at James, with branches for the stereo, the digital camera, and other devices.]
Stereo nodes: (mode) <turns stereo on>: off, tuner, auxiliary, CD; x-bass: on, off; volume: volume up, volume down; tuner: (radioband) AM, FM, each with frequency (#) and station (WXXX); seek: forward, backward; CD: (status) play, stop, pause; repeat: off, single track, single disc, all discs; disc (#); track (#); next track; last track; random: on, off.
Digital camera nodes: control, info; play mode: play, stop, fast fwd, rewind, record, pause; device mode: camera, VCR; media type: digital video, unknown, VHS, none; (mode); step: forward, backward.
Other devices…
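To make "tree-structured functions" and "entirely declarative" concrete, here is a minimal sketch of how such a function tree might be written down as plain data and walked by a spoken path. The tree fragment, the `resolve` and `options` helpers, and their names are illustrative assumptions, not the structures James actually uses.

```python
# Hypothetical, simplified fragment of a device function tree.
# Each key is a spoken word; each value holds that node's children.
TREE = {
    "stereo": {
        "off": {},
        "tuner": {"am": {}, "fm": {}},
        "auxiliary": {},
        "cd": {"play": {}, "stop": {}, "pause": {}},
        "volume": {"up": {}, "down": {}},
    },
    "digital camera": {
        "play mode": {"play": {}, "stop": {}, "record": {}},
    },
}


def resolve(tree, words):
    """Follow `words` down the tree; return the path reached, or None
    if some word is not a child of the node reached so far."""
    node, path = tree, []
    for word in words:
        if word not in node:
            return None
        node = node[word]
        path.append(word)
    return path


def options(tree, words):
    """List the children available at the node addressed by `words`."""
    node = tree
    for word in words:
        node = node[word]
    return sorted(node)


print(resolve(TREE, ["stereo", "cd", "play"]))
print(options(TREE, ["stereo", "cd"]))
```

Because the tree is only data, adding a device in this sketch means adding an entry, with no new dialog code.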
Keywords
hello-james
options
where-am-i, where-was-i
go-ahead, ok
status
goodbye
what-is, what-is-the
how-do-i
more
Session Management
hello-james/goodbye
User: blah blah blah...
System: (ignoring user)
User: hello-james
System: stereo, digital camera
User: stereo
System: stereo here
User: goodbye
System: goodbye
User: blah blah blah...
System: (ignoring user)
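The session behaviour in this exchange can be read as a two-state machine: speech is ignored until hello-james opens a session, and goodbye closes it. The sketch below is one possible reading; the `Session` class and its `hear` method are hypothetical names, and exact string matching stands in for real speech recognition.

```python
class Session:
    """Two-state session gate: idle until 'hello-james', active until 'goodbye'."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.active = False

    def hear(self, utterance):
        """Return the system's reply, or None when the utterance is ignored."""
        text = utterance.strip().lower()
        if not self.active:
            if text == "hello-james":
                self.active = True
                # On opening a session, list the available devices.
                return ", ".join(self.devices)
            return None  # outside a session everything else is ignored
        if text == "goodbye":
            self.active = False
            return "goodbye"
        if text in self.devices:
            return f"{text} here"
        return None  # other in-session commands are out of scope for this sketch


session = Session(["stereo", "digital camera"])
for said in ["blah blah blah", "hello-james", "stereo", "goodbye", "blah blah blah"]:
    print(repr(said), "->", session.hear(said))
```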
Query
what-is path/status
User: what-is-the am frequency
System: the am frequency is five hundred thirty
User: what-is random
System: random is off
User: what-is-the stereo
System: the stereo is tuner
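A query of this form can be sketched as a lookup from a spoken path to its current status. In the sketch below the `STATUS` table and the `answer` function are assumptions for illustration; the values are taken from the exchange above, and the reply echoes "the" only when the user said what-is-the.

```python
# Hypothetical status table keyed by spoken path; values copied from the dialog.
STATUS = {
    "am frequency": "five hundred thirty",
    "random": "off",
    "stereo": "tuner",
}


def answer(utterance, status=STATUS):
    """Answer a 'what-is' or 'what-is-the' query, or return None if unrecognized."""
    for keyword, article in (("what-is-the ", "the "), ("what-is ", "")):
        if utterance.startswith(keyword):
            path = utterance[len(keyword):].strip()
            if path in status:
                return f"{article}{path} is {status[path]}"
            return None
    return None


print(answer("what-is-the am frequency"))
print(answer("what-is random"))
```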
help/exploration/implicit navigation
how-do-i.../options/path options
User: control alarm clock radio options
System: alarm, clock, radio, sleep...
User: more
System: x10, stereo
User: stereo options
System: (while turning stereo on) off, am, fm, auxiliary, cd...
invocation/specification/implicit exploration/navigation
Path
User: stereo auxiliary
System: (while turning the stereo on and switching to auxiliary) auxiliary
User: cd
System: (while switching to cd mode) cd
User: play
System: (while playing a cd) play
list navigation
More
User: radio band am options
System: (bracketed list [frequency, kabc, k001, k002, k003, k004, k005][fm][off, volume][alarm, clock, sleep][x10, stereo] rendered as) frequency, kabc, k001, k002...
User: more
System: k003, k004, k005...
User: more
System: fm, off, volume...
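The slide does not state how the bracketed list is cut into spoken pages. One rule that happens to reproduce the three responses above is sketched here as a guess: a page holds at most four items, small bracket groups are packed together whole, and a group larger than a page is split on its own. The `paginate` and `render` functions and the page size of four are assumptions, not the system's documented behaviour.

```python
def paginate(groups, page_size=4):
    """Pack bracket groups into pages of at most `page_size` items.

    Groups that fit are kept whole; an oversized group is split into
    consecutive chunks, each chunk forming its own page.
    """
    pages, current = [], []
    for group in groups:
        if len(group) > page_size:
            if current:
                pages.append(current)
                current = []
            for start in range(0, len(group), page_size):
                pages.append(list(group[start:start + page_size]))
        elif len(current) + len(group) > page_size:
            pages.append(current)
            current = list(group)
        else:
            current = current + list(group)
    if current:
        pages.append(current)
    return pages


def render(pages, index):
    """Speak one page; a trailing '...' signals that 'more' is available."""
    suffix = "..." if index < len(pages) - 1 else ""
    return ", ".join(pages[index]) + suffix


groups = [
    ["frequency", "kabc", "k001", "k002", "k003", "k004", "k005"],
    ["fm"],
    ["off", "volume"],
    ["alarm", "clock", "sleep"],
    ["x10", "stereo"],
]
pages = paginate(groups)
for i in range(len(pages)):
    print(render(pages, i))
```

Each "more" from the user would advance `index` by one in this sketch.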
orientation
where-am-i
User: what is the disc
System: the disc is three
User: where am i
System: stereo cd disc
Research Questions
Is the subset language learnable?
Once learned, is it efficient?
Are user mistakes infrequent enough?
Are system mistakes infrequent enough?
Can one generalize from one device to another?
Is the subset language well retained?