

• Dialogue systems in cars face two major challenges:
  • Speech recognition errors
  • Increased cognitive load on the user
• Statistical dialogue modelling deals with speech recognition errors
• Substantial research concerns safety while talking to a dialogue system in a car
• We examine how humans speak when under cognitive load
• We find disfluencies in communication and a preference towards certain system questions

The Effect of Cognitive Load on a Statistical Dialogue System

Dialogue as a Secondary Task

Acknowledgements

We would like to thank Prof. Peter Robinson and Ian Davies for their help with the simulated car experiments.

Experimental Set-up

• The Bayesian Update of Dialogue State dialogue manager provides robustness to speech recognition errors:
  • It models dialogue via a Bayesian network with hidden concepts
  • It maintains a distribution over the hidden concepts
• Domain: TopTable restaurant domain for Cambridge (150 venues, 8 slots)
• Car simulator: seat, steering wheel, pedals and large projector
• 30 subjects drove along a motorway in three scenarios:
  • Driving for 10 minutes (without talking)
  • Talking to the system for 7 dialogues
  • Talking&Driving at the same time (7 dialogues)
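To make "maintains a distribution over the hidden concepts" concrete, here is a minimal single-slot sketch in Python. It is not the dialogue manager's actual inference over its Bayesian network; it only applies Bayes' rule to one slot. The slot values, the N-best confidences, and the function name `update_belief` are assumptions made for illustration.

```python
def update_belief(belief, nbest):
    """One Bayes-rule update of a belief over the values of a single slot.

    belief: dict mapping each slot value to its current probability.
    nbest:  dict mapping recognised values to confidence scores (sum <= 1).
            Confidence mass not assigned to any hypothesis is spread evenly
            over all values (an assumed, simplistic observation model).
    """
    values = list(belief)
    leftover = max(0.0, 1.0 - sum(nbest.values()))
    unnormalised = {
        v: belief[v] * (nbest.get(v, 0.0) + leftover / len(values))
        for v in values
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        return dict(belief)  # observation carries no usable evidence
    return {v: p / total for v, p in unnormalised.items()}


# Hypothetical "food" slot with a uniform prior over three values.
belief = {"chinese": 1 / 3, "indian": 1 / 3, "italian": 1 / 3}

# Two noisy turns in which the recogniser mostly hears "indian".
for nbest in [{"indian": 0.6, "italian": 0.2}, {"indian": 0.6, "italian": 0.2}]:
    belief = update_belief(belief, nbest)

print({v: round(p, 3) for v, p in belief.items()})
```

Because the belief keeps probability on every value, one misrecognised turn lowers confidence in the correct value without discarding it, which is the kind of robustness the bullet above describes.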

Milica Gašić, Pirros Tsiakoulis, Matthew Henderson, Blaise Thomson, Kai Yu, Eli Tzirkel* and Steve Young

Cambridge University Engineering Department, *General Motors

Results

• We measured differences in speed and related statistics per subject
• We examined which is larger for Talking&Driving:

Conclusions

• Dialogues with cognitively loaded users tend to be less successful
• Cognitively loaded users tend to answer some system questions more than others
• Users tend to use barge-ins and fillers significantly more often when cognitively loaded
• Incremental dialogue and adaptation techniques are needed to better model dialogue as a secondary task

• When talking, subjects were given specific dialogue tasks to complete
• We measured both the objective task completion and the perceived (subjective) task completion
• Although the difference is not statistically significant, performance is worse when driving at the same time

Cognitive Load

Driving Performance

Dialogue Performance

Conversational Patterns

• Subjects were able to notice differences in cognitive load (ratings from 1, low, to 5, high):

Question                                      Driving  Talking  Talking&Driving
How mentally demanding was the scenario?      1.61     2.21     2.89
How hurried was the pace of the scenario?     1.21     1.71     1.89
How hard did you have to work?                1.5      2.32     2.96
How frustrated did you feel during the task?  1.29     2.61     2.61
How stressed did you feel during the task?    1.29     2.0      2.32

• Driving is more erratic when the subjects talk to the system at the same time

Measure    % of Subjects  Conf. int.
Speed      8%             [1%,25%]
Std. dev.  77%            [56%,91%]
Entropy    85%            [65%,95%]
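The "% of Subjects" column can be read as the share of subjects for whom a statistic is larger in Talking&Driving than in Driving alone. The Python sketch below illustrates that reading. The speed traces are invented, and the histogram-based entropy with a 5-unit bin width is an assumed definition; the poster does not say how entropy was computed.

```python
import math
from statistics import mean, pstdev


def speed_entropy(speeds, bin_width=5.0):
    """Shannon entropy (bits) of a speed histogram; the bin width is an assumption."""
    counts = {}
    for s in speeds:
        b = math.floor(s / bin_width)
        counts[b] = counts.get(b, 0) + 1
    n = len(speeds)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def fraction_larger(driving, talking_driving, stat):
    """Share of subjects whose statistic is larger when talking and driving."""
    larger = sum(
        1 for sid in driving if stat(talking_driving[sid]) > stat(driving[sid])
    )
    return larger / len(driving)


# Invented speed traces for two hypothetical subjects.
driving = {
    "s1": [70, 70, 71, 70, 69, 70],
    "s2": [65, 66, 65, 64, 65, 65],
}
talking_driving = {
    "s1": [70, 62, 75, 58, 72, 66],  # more erratic while talking
    "s2": [65, 65, 65, 65, 65, 65],  # steadier while talking
}

for name, stat in [("Speed", mean), ("Std. dev.", pstdev), ("Entropy", speed_entropy)]:
    share = fraction_larger(driving, talking_driving, stat)
    print(f"{name}: larger for Talking&Driving in {share:.0%} of subjects")
```

Under this reading, a low share for mean speed combined with high shares for standard deviation and entropy means that subjects do not drive faster while talking, but their speed varies more.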

Task completion  Talking  Talking&Driving
Subjective       78.6%    74%
Objective        68.4%    64.8%

User obedience to system’s questions:

1. System requests

                 Samples  Obedience
Talking          392      67.6%
Talking&Driving  390      63.9%

2. System confirms

                 Samples  Obedience
Talking          91       73.6%
Talking&Driving  92       81.5%
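The obedience figures above can be turned back into counts and compared directly. The following Python sketch does this and also computes a two-proportion z statistic. This is an illustrative calculation, not an analysis reported on the poster; the helper names are hypothetical and the counts are reconstructed by rounding the published percentages.

```python
from math import sqrt


def successes(samples, percent):
    """Reconstruct the number of obeyed questions from a rounded percentage."""
    return round(samples * percent / 100)


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z statistic for the change from p1 to p2."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se


# (samples, obedience %) for Talking and Talking&Driving, taken from the tables above.
table = {
    "requests": ((392, 67.6), (390, 63.9)),
    "confirms": ((91, 73.6), (92, 81.5)),
}

z_scores = {}
for question, ((n1, pct1), (n2, pct2)) in table.items():
    k1, k2 = successes(n1, pct1), successes(n2, pct2)
    z_scores[question] = two_proportion_z(k1, n1, k2, n2)
    print(f"{question}: {k1}/{n1} -> {k2}/{n2}, "
          f"change {pct2 - pct1:+.1f} points, z = {z_scores[question]:.2f}")
```

The sign of each change shows the direction of the effect: obedience to requests falls and obedience to confirmations rises when subjects are also driving.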

Analysis of measures related to speaking which increase for Talking&Driving compared to Talking:

Measure    % of Subjects  Conf. int.
Barge-ins  87%            [69%,96%]
Fillers    73%            [54%,88%]
Intensity  67%            [47%,83%]
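The poster does not state how the confidence intervals were obtained. One common choice for a proportion of subjects is the exact binomial (Clopper-Pearson) interval, sketched below in Python using only the standard library. Both the method and the count of 26 out of 30 subjects (which rounds to 87%) are assumptions made for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: P(X >= k | p) = alpha / 2, i.e. cdf(k - 1) = 1 - alpha / 2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Assumed count: 26 of the 30 subjects used more barge-ins when driving.
lower, upper = clopper_pearson(26, 30)
print(f"{26 / 30:.0%} [{lower:.0%},{upper:.0%}]")  # compare with the Barge-ins row
```

An interval whose lower bound stays above 50% indicates that the increase holds for a majority of subjects, which is how the rows in the table can be read.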

• Users prefer confirmations to requests when they are driving
• Cognitively loaded user speech is more disfluent and louder
