Audiovisual Attentive User Interfaces Attending to the needs and actions of the user Paulina Modlitba T-121.900 Seminar on User Interfaces and Usability

Audiovisual Attentive User Interfaces

Attending to the needs and actions of the user

Paulina Modlitba

T-121.900 Seminar on User Interfaces and Usability

What is an Attentive User Interface? (1/2)

• Negotiate the timing and volume of communication with the user

• Use specific input, output and turn-taking techniques to determine what task, device or person a user is attending to

• Detect the user’s presence, orientation, speech activity and gaze, and statistically model attention and interaction

What is an Attentive User Interface? (2/2)

• Four characteristic components

– visual attention

– turn-taking techniques

– modeling techniques for the attention

– focus and context displays and visualisation

• Dürsteler (2003)

Why are they needed?

• Roel Vertegaal (2003)

• Multiple ubiquitous computing devices lead to growing demands on users’ attention

• Metaphor: modern traffic light system

– Sensors

– Statistical models of traffic volume

– Peripheral displays (traffic lights)

• Disruptive effect of interruptions can be avoided

Evolution of human-machine interaction

1960s-1980s: many-one

1980s-1990s: one-one

1990s-2000s: one-many

2000s-2010s: many-many

Visual attention

• Eye-gaze tracking: detecting the user’s visual focus of attention

• Eye trackers operate by directing infrared light toward the user’s eye

• Provides information about the context

• Central I/O channel in communication

• Limitations in existing hardware/software

• Biological limitations

Reasons for implementing gaze tracking

• Kaur et al. (2003)

• The gaze location is the only reliable predictor of the locus of visual attention

• Gaze can be used as a “natural” mode of input that avoids the need for learned hand-eye coordination

• Gaze selection of screen objects is expected to be significantly faster than selection that relies on traditional hand-eye coordination

• Gaze allows for hands-free interaction

Current issues

• Limited size of fovea (1-3°)

• Subconscious eye movements

• Eyes are not control organs (Zhai et al., 2003)

• No natural analogy to current input devices, e.g. mouse

• Gaze is always active (Kaur et al., 2003)

Current state

• Eye-gaze control used as an additional input channel

• Provides context to the action

• Combined with manual input, gaze tracking can improve the robustness and reliability of a system

EASE Chinese Input (1/2)

• Zhai et al. (2002)

• Supports pinyin typing

– Pinyin is the official Chinese phonetic alphabet, based on Roman characters

– Chinese has many homophones: each syllable corresponds to several Chinese characters

– When the user types the pinyin of a character, a number of possible characters with the same pronunciation are displayed

EASE Chinese Input (2/2)

• Normally, the user chooses a character by pressing a number on the keyboard

• With EASE, the user only has to press the spacebar as soon as he or she sees the desired character in the list

• The system selects the character closest to the user’s current gaze location
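
The gaze-based selection step of EASE can be sketched roughly as follows. This is only an illustration, not the implementation of Zhai et al.: the `Candidate` type, the `select_by_gaze` function, the screen coordinates and the example candidate list are all hypothetical, and a real system would also have to deal with gaze-tracker noise.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Candidate:
    char: str   # candidate Chinese character shown in the list
    x: float    # assumed screen position of the candidate, in pixels
    y: float


def select_by_gaze(candidates, gaze_x, gaze_y):
    """Return the candidate whose position is closest to the gaze point."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    return min(candidates, key=lambda c: hypot(c.x - gaze_x, c.y - gaze_y))


# Hypothetical candidate list for one pinyin syllable, laid out in a row.
candidates = [
    Candidate("妈", 100, 400),
    Candidate("马", 140, 400),
    Candidate("吗", 180, 400),
]

# Would be called when the spacebar is pressed; the gaze point is made up.
chosen = select_by_gaze(candidates, gaze_x=143.0, gaze_y=395.0)
print(chosen.char)
```

The spacebar acts as the explicit confirmation, so gaze alone never triggers a selection.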

Speech recognition (1/2)

• Limited technology, despite extensive research and progress

• Crucial issues

– error rate of speech recognition engines and how these errors can be reduced

– the effort required to port speech technology applications between different application domains or languages (Deng & Huang, 2004)

Speech recognition (2/2)

• Three directions for enhancing the technique

– improve the microphone ergonomics to enhance the signal-to-noise ratio

– equip speech recognizers with the ability to learn and to correct errors

– add semantic (meaning) and pragmatic (application context) knowledge (Deng & Huang, 2004)

Multimodal interfaces

• Can provide more natural human-machine interaction

• Improves the robustness of the interaction by using redundant or complementary information

• Today: usually gaze/speech + manual control (e.g. mouse)

• Future: gaze + speech, gaze alone, or speech alone

Main issue

• Shumin Zhai (2003)

• “We need to design unobtrusive, transparent and subtle turn-taking processes that coordinate attentive input with the user’s explicit input in order to contribute to the user’s goal without the burden of explicit dialogues.”

Manual and Gaze Input Cascaded (MAGIC) Pointing

• An interaction technique that utilizes eye movement to assist the control task

• Zhai et al. have constructed two MAGIC pointing techniques, one liberal and one conservative (Zhai et al., 1999)

Liberal approach (1/2)

• The cursor is warped to every new object that the user looks at

• The user can then manually take control of the cursor near (or on) the target, or ignore it and search for the next target

• New target defined by distance (e.g. 120 pixels) from the current cursor position

• Trade-off: pro-active (the cursor waits readily near each potential target) but can be overactive (merely looking at an object is enough to move the cursor)
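
A minimal sketch of the liberal warping rule, assuming the gaze tracker already delivers a gaze position in screen pixels. The function name `liberal_warp` and the tuple representation of points are hypothetical; the 120-pixel threshold is the example value given above, and this is not the code of Zhai et al.

```python
from math import hypot

NEW_TARGET_DISTANCE = 120  # pixels; the example threshold mentioned on the slide


def liberal_warp(cursor, gaze):
    """Return the new cursor position under the liberal approach.

    The cursor is warped to the gaze position whenever the gaze lands
    farther than NEW_TARGET_DISTANCE from the current cursor position.
    Otherwise the cursor stays put so the user can fine-position it manually.
    """
    dx = gaze[0] - cursor[0]
    dy = gaze[1] - cursor[1]
    if hypot(dx, dy) > NEW_TARGET_DISTANCE:
        return gaze
    return cursor


print(liberal_warp((0, 0), (300, 400)))  # far away: the cursor is warped
print(liberal_warp((0, 0), (30, 40)))    # close by: the cursor is left alone
```

Because the warp needs no manual trigger, every sufficiently distant glance moves the cursor, which is exactly where the overactive behaviour comes from.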

Liberal approach (2/2)

Conservative approach (1/2)

• Warps the cursor to a target only once the manual input device has been actuated

• Once warped, the cursor appears in motion towards the target

• Hence, the cursor never jumps directly to a target that the user does not intend to obtain

• May be slower than the liberal approach
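
The conservative rule can be sketched as a small state machine in which gaze samples are only recorded and the warp happens on manual input. The class `ConservativePointer`, its method names and the 120-pixel threshold (borrowed from the liberal example) are assumptions made for illustration, not details confirmed by the source.

```python
from math import hypot


class ConservativePointer:
    """Hypothetical model of conservative MAGIC pointing."""

    def __init__(self, cursor, threshold=120):
        self.cursor = cursor        # current cursor position (x, y)
        self.gaze = cursor          # most recent gaze position (x, y)
        self.threshold = threshold  # assumed "new target" distance in pixels

    def on_gaze(self, point):
        """Record the gaze position; gaze alone never moves the cursor."""
        self.gaze = point

    def on_manual_input(self, dx, dy):
        """Warp to the gaze position if it is far away, then apply the manual movement."""
        cx, cy = self.cursor
        gx, gy = self.gaze
        if hypot(gx - cx, gy - cy) > self.threshold:
            cx, cy = gx, gy
        self.cursor = (cx + dx, cy + dy)
        return self.cursor


pointer = ConservativePointer(cursor=(0, 0))
pointer.on_gaze((500, 500))
print(pointer.cursor)                   # still at the start: no manual input yet
print(pointer.on_manual_input(2, -1))   # warped to the gaze point, then nudged
```

Deferring the warp until the device is actuated costs a little time but avoids moving the cursor to objects the user merely glanced at.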

Conservative approach (2/2)

EyeCOOK

• Bradbury et al. (2003)

• Multimodal attentive cookbook that helps users who are unaccustomed to computers cook a meal

• The user interacts with the eyeCOOK system by using eye-gaze and speech commands

• The system responds visually and verbally

• The system substitutes the object of the user’s gaze for the word “this” in spoken commands

• If the user’s gaze cannot be tracked by the eyeCOOK system, the user has to specify the target verbally
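
The way gaze and speech complement each other in eyeCOOK can be sketched as a simple substitution step. The function `resolve_command`, the command strings and the example vocabulary are hypothetical; the actual eyeCOOK grammar and API are not described in the source.

```python
def resolve_command(spoken, gaze_target=None):
    """Substitute the gazed-at object for the word "this" in a spoken command.

    Returns the resolved command, or None when the command uses "this" but no
    gaze target is available, in which case the user has to name the target
    verbally instead.
    """
    words = spoken.split()
    if "this" not in words:
        return spoken           # target was already specified verbally
    if gaze_target is None:
        return None             # gaze could not be tracked
    return " ".join(gaze_target if word == "this" else word for word in words)


print(resolve_command("define this", gaze_target="saute"))
print(resolve_command("define this"))
print(resolve_command("define basil"))
```

Returning None rather than guessing keeps the fallback explicit: without a tracked gaze the system asks for a fully verbal command.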

EyeCOOK in Page Display Mode

GAZE-2

• Vertegaal et al. (2003)

• A new group video conferencing system that uses gaze-controlled cameras to convey eye contact

• Consists of a video tunnel that makes it possible to place cameras behind the participant images on the screen

• The system automatically directs the video cameras in this tunnel using a gaze tracker by selecting the camera closest to the user’s current focus of attention (gaze location)

GAZE-2 system structure

3D rendering

• The 2D video images of the participants are displayed in a 3D virtual meeting room and are automatically rotated to face the participant each user is looking at.

• In the picture below, everyone is looking at the person on the left, whose image is broadcast at a higher resolution.
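
The rotation of a participant's video image toward the person he or she is looking at can be illustrated with a small geometric sketch. The coordinate convention (positions as (x, z) on the floor plane of the virtual room, 0 degrees facing +z) and the function `yaw_to_face` are assumptions for this example only; the source does not describe how GAZE-2 computes the rotation.

```python
from math import atan2, degrees


def yaw_to_face(own_pos, target_pos):
    """Yaw angle in degrees that turns a video plane at own_pos toward target_pos.

    Positions are (x, z) pairs on the floor plane of the virtual room.
    0 degrees faces the +z direction; positive angles turn toward +x.
    """
    dx = target_pos[0] - own_pos[0]
    dz = target_pos[1] - own_pos[1]
    return degrees(atan2(dx, dz))


# Hypothetical layout: a participant at the origin looks at someone
# standing diagonally to the left.
print(yaw_to_face((0.0, 0.0), (-1.0, 1.0)))
```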

Turn-taking in video conferencing

• Misunderstandings cause interruptions

• Eye contact plays an important role in turn-taking (Vertegaal et al., 2003)

References

• Vertegaal et al. (2003)

• Bradbury et al. (2003)

• Zhai et al. (1999)

• Dürsteler (2003)

• Vertegaal (2003)

• Kaur et al. (2003)

• Zhai (2003)

• Zhai et al. (2002)

• Deng & Huang (2004)

Things missing

• Are attentive user interfaces better at following the user in order to "capture his/her context" and take proactive actions for him/her, or are they better used as input devices (the approach you take)?

• The distinction between explicit and implicit input, as presented by Horvitz (you can find a link on the seminar homepage), is thus important here and could benefit you.

• Please bring some real-world examples of prototypes and real situations to your presentation. This makes the idea easier to grasp and the argument more concrete. You might consider presenting other application ideas as well as the ones already in the paper.

• I think you would benefit from considering in more detail, for each particular application, why attention and preferences are tracked and how they might be combined effectively to minimize disruption and make interaction more fluent. Binding the presentation more tightly to the "let's make interruptions go away" theme of the seminar is important here.

• Consequently, in the presentation, it would be nice to see your analysis of "how things were" and "how things are" (now with AUIs).

Oulasvirta

• Attention

• Working memory

• Long-term memory

• Task resumptions

• Control

• Trust

• Stress

• Social interaction