Efficient Computer Interfaces Using Continuous Gestures,
Language Models, and Speech
Keith Vertanen
Inference Group
August 4th, 2004
The problem
- Speech recognizers make mistakes
- Correcting mistakes is inefficient:
    140 WPM  uncorrected dictation
     14 WPM  corrected dictation, mouse/keyboard
     32 WPM  corrected typing, mouse/keyboard
- Voice-only correction is even slower and more frustrating
Research overview
Make correction of dictation:
- More efficient
- More fun
- More accessible
Approach:
- Build a word lattice from a recognizer's n-best list
- Expand the lattice to cover likely recognition errors
- Make a language model from the expanded lattice
- Use the model in a continuous gesture interface to perform confirmation and correction
Building lattice
Example n-best list:
1: jack studied very hard
2: jack studied hard
3: jill studied hard
4: jill studied very hard
5: jill studied little
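One simple way to picture this first step: merge the n-best hypotheses into a shared-prefix graph. The sketch below is only illustrative (a real lattice also merges common suffixes and carries recognizer scores); `build_lattice` and the nested-dict representation are inventions for this example.

```python
def build_lattice(nbest):
    """Merge an n-best list into a prefix-tree 'lattice' in which
    hypotheses share common prefixes."""
    root = {}
    for hyp in nbest:
        node = root
        for word in hyp.split():
            node = node.setdefault(word, {})
    return root

nbest = [
    "jack studied very hard",
    "jack studied hard",
    "jill studied hard",
    "jill studied very hard",
    "jill studied little",
]
lattice = build_lattice(nbest)
# Both "jack" hypotheses share the prefix "jack studied":
print(sorted(lattice["jack"]["studied"]))   # ['hard', 'very']
```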
Acoustic confusions
Given a word, find words that sound similar.
Look the pronunciation up in a dictionary:
  studied -> s t ah d iy d
Use observed phone confusions to generate alternative pronunciations:
  s t ah d iy d -> s ao d iy
                -> s t ah d iy
                -> ...
Map the pronunciations back to words:
  s t ah d iy d -> studied
  s ao d iy     -> saudi
  s t ah d iy   -> study
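A hypothetical sketch of this step: the mini-dictionary, reverse lookup, and phone-confusion table below are invented for illustration; a real system would use a full pronunciation dictionary and confusions observed on held-out data.

```python
# Invented pronunciation dictionary (ARPAbet-style phones).
DICT = {
    "studied": ("s", "t", "ah", "d", "iy", "d"),
    "saudi":   ("s", "ao", "d", "iy"),
    "study":   ("s", "t", "ah", "d", "iy"),
}
REVERSE = {phones: word for word, phones in DICT.items()}
CONFUSIONS = {"ah": ["ao"]}          # phones observed to be confused

def variants(phones):
    """All pronunciations one substitution or one deletion away."""
    out = set()
    for i in range(len(phones)):
        for alt in CONFUSIONS.get(phones[i], []):
            out.add(phones[:i] + (alt,) + phones[i + 1:])
        out.add(phones[:i] + phones[i + 1:])
    return out

def acoustic_confusions(word, edits=3):
    """Dictionary words reachable within `edits` phone edits."""
    frontier = {DICT[word]}
    seen = set(frontier)
    for _ in range(edits):
        frontier = {v for p in frontier for v in variants(p)} - seen
        seen |= frontier
    return sorted({REVERSE[p] for p in seen if p in REVERSE} - {word})

print(acoustic_confusions("studied"))   # ['saudi', 'study']
```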
Morphology confusions
Given a word, find words that share the same "root", using the Porter stemmer:
  jack, jacks, jacking, jacked      -> jack
  study, studies, studying, studied -> studi
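A sketch of this grouping step. The toy `stem()` below strips only a few suffixes, just enough to reproduce the two groups on this slide; the talk uses the real Porter stemmer (which likewise maps "study" to "studi").

```python
from collections import defaultdict

def stem(word):
    """Toy suffix-stripper standing in for the Porter stemmer."""
    for suffix in ("ying", "ies", "ied", "ing", "ed", "y", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            stripped = word[: -len(suffix)]
            # The "-y/-i" family collapses to a common "...i" stem.
            return stripped + ("i" if suffix in ("ying", "ies", "ied", "y") else "")
    return word

def morphology_confusions(vocab):
    """Group vocabulary words by shared stem."""
    groups = defaultdict(set)
    for word in vocab:
        groups[stem(word)].add(word)
    return groups

vocab = ["jack", "jacks", "jacking", "jacked",
         "study", "studies", "studying", "studied"]
groups = morphology_confusions(vocab)
print(sorted(groups["jack"]))    # ['jack', 'jacked', 'jacking', 'jacks']
print(sorted(groups["studi"]))   # ['studied', 'studies', 'study', 'studying']
```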
Language model confusions
Example: "Jack studied hard"
Look at the words before or after a node, and add likely alternate words based on an n-gram language model.
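A minimal sketch of this idea (the bigram counts are invented): propose extra words for a lattice node by keeping the words most likely to follow the node's left context, here the word before the "studied" node.

```python
# Invented bigram counts standing in for a trained n-gram model.
BIGRAM_COUNTS = {
    "jack": {"studied": 5, "worked": 3, "slept": 1},
}

def lm_confusions(prev_word, top_n=2):
    """Most likely successors of prev_word under the bigram counts."""
    counts = BIGRAM_COUNTS.get(prev_word, {})
    return sorted(counts, key=counts.get, reverse=True)[:top_n]

print(lm_confusions("jack"))   # ['studied', 'worked']
```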
Expansion results (on WSJ1)
[Chart: oracle word accuracy (84% to 98%) after each expansion step (Baseline, Insertion, Acoustic, Morphology, Bigram, Trigram, Backward bigram, Backward trigram), with three series: Observed, Fully additive, Upper bound]
Probability model
Our confirmation and correction interface requires the probability of a letter given the prior letters:
  P(letter_i | letter_1 ... letter_{i-1})
Probability model
- Keep track of possible paths in the lattice
- Predict based on the next letter on each path
- Interpolate with a default language model
Example: user has entered "the_cat":
[Diagram: lattice paths with arc probabilities of 1.00]
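The three bullets above can be sketched as follows. This is an illustrative stand-in, not the actual model: a flat list of complete sentence strings plays the role of the lattice paths, and a uniform letter distribution plays the role of the default language model.

```python
from collections import Counter
import string

ALPHABET = string.ascii_lowercase + "_"   # "_" stands for a space
LAMBDA = 0.9                              # weight given to lattice paths

def letter_probs(paths, prefix):
    """P(next letter | prefix), interpolating path predictions with a
    uniform default model."""
    nexts = Counter(p[len(prefix)] for p in paths
                    if p.startswith(prefix) and len(p) > len(prefix))
    total = sum(nexts.values())
    uniform = 1.0 / len(ALPHABET)
    return {c: LAMBDA * (nexts[c] / total if total else 0.0)
               + (1 - LAMBDA) * uniform
            for c in ALPHABET}

paths = ["the_cat_sat", "the_cap_fits", "the_car_moves"]
probs = letter_probs(paths, "the_ca")
# "t", "p" and "r" each continue one of the three paths, so each gets
# about 0.30; every other letter keeps only the small default share.
```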
Handling word errors
- Use the default language model during entry of an erroneous word
- Rebuild paths allowing for an additional deletion or substitution error
Example: user has entered "the_cattle_":
[Diagram: path probabilities 0.25, 0.25, 0.25, 0.0625, 0.0625]
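A word-level sketch of the "rebuild paths" idea (assumptions: both the paths and the user's entry are word lists, and the function name is invented). If the entered prefix matches no path exactly, matching is retried allowing one extra (deletable) or one substituted word in the entry.

```python
def matching_paths(paths, words):
    """Paths whose prefix matches the entered words, tolerating at most
    one deleted or one substituted word in the entry."""
    def matches(path, ws):
        return len(path) >= len(ws) and all(
            w is None or w == p for w, p in zip(ws, path))

    exact = [p for p in paths if matches(p, words)]
    if exact:                                     # no error: keep exact
        return exact
    relaxed = []                                  # allow one entry error
    for p in paths:
        for i in range(len(words)):
            if (matches(p, words[:i] + words[i + 1:])              # deletion
                    or matches(p, words[:i] + [None] + words[i + 1:])):  # subst
                relaxed.append(p)
                break
    return relaxed

paths = [["the", "cat", "sat"], ["the", "cap", "fits"]]
# "cattle" matches no path word, but dropping or substituting the
# erroneous word lets both paths continue:
print(matching_paths(paths, ["the", "cattle"]))
```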
Using expanded lattice
Paths using arcs added during lattice expansion are penalized.
Example: user has entered "jack_":
[Diagram: arc probabilities 0.04, 0.04, 1.00; expansion arcs carry the penalized 0.04]
Evaluating expansion
Assume a good model requires as little information from the user as possible:
  Cross entropy(T) = -(1/t) * sum_{i=0}^{t-1} log2 P(s_i | s_{i-1} ... s_1 s_0)
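The measure can be computed directly; a minimal sketch (the uniform toy model is an assumption, used only to give a checkable number):

```python
import math

def cross_entropy(text, prob):
    """Average bits per letter; prob(prefix, c) = P(next letter c | prefix)."""
    bits = sum(-math.log2(prob(text[:i], c)) for i, c in enumerate(text))
    return bits / len(text)

# A uniform model over a 27-letter alphabet needs log2(27) ~ 4.75 bits
# per letter; a good lattice-based model should need far fewer.
uniform = lambda prefix, c: 1.0 / 27
print(round(cross_entropy("the_cat", uniform), 2))   # 4.75
```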
[Chart: cross entropy (bits), 0.4 to 0.9, after each expansion step: Baseline, Insertion, Acoustic, Morphology, Bigram, Trigram, Backward bigram, Backward trigram]
Results on test set
Model evaluated on a held-out test set (Hub1):
- Default language model: 2.4 bits/letter (user decides between 5.3 letters)
- Best speech-based model: 0.61 bits/letter (user decides between 1.5 letters)
“The hibernating skunk curled up in his deep den uncurls himself and ventures forth to prowl the world”
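The "user decides between N letters" figures are the perplexity of the letter model, 2 raised to the bits-per-letter figure:

```python
# Perplexity: 2 ** (bits per letter) gives the effective number of
# equally likely next letters the user must choose among.
print(round(2 ** 2.4, 1))    # default LM:      5.3 letters
print(round(2 ** 0.61, 1))   # speech-based LM: 1.5 letters
```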
Conclusions
- One-third of recognition errors are covered by expanding the lattice.
- Only insertion-error expansion improves efficiency.
- The speech-based model significantly improves efficiency (2.4 bits -> 0.61 bits).
- A good correction interface is possible using Dasher and an off-the-shelf recognizer.
Future work
- Update Speech Dasher to use the lattice-based probability model.
- Incorporate hypothesis probabilities into the lattice (or, even better, get at the recognizer's lattice).
- Improve efficiency on sentences with few or no errors.
- User trials to validate the numeric results.