Evaluation of a multimodal Virtual Personal Assistant

Glória Branco

Sophia-Antipolis, March 23, 2006

20th International Symposium 2006

1st Follow-up Meeting - BRIDGE 1




Agenda

• Introduction
  – FASiL project and consortium
  – The Virtual Personal Assistant (VPA)
    • Architecture
    • Functionalities
    • Interface
  – Global Evaluation Methodology
    • Heuristic Evaluation
    • User Trials

• The Portuguese trials
  – Method
  – Results
  – Users’ comments

• Conclusions


FASiL Project

• FASiL: “Flexible and Adaptive Multi-Modal Spoken Interface Language”
  – EU-IST funded, multimodal, multilingual, conversational application for e-mail management.

• Objectives
  – “...to pilot a full multi-modal voice portal application that is 3G mobile network ready, along with tools for rapid development of new applications. FASiL targets the languages of UK English, Portuguese and Swedish… [with] intelligent, friendly adaptive multi-modal interaction.”


FASiL Consortium

[Figure: logos of the consortium partners, including PT Inovação]

FASiL: “Flexible and Adaptive Multi-Modal Spoken Interface Language”


VPA Architecture

[Figure: VPA architecture diagram. Component labels: ASR Multilingual, TTS, Vox Generator Services, Fission, Middleware, PIM, Administration, Multi-Modal Gateway, Fusion, Dialogue Manager, GUI Gateway.]


VPA Functionalities

• Hear a summary of the Inbox.
• Navigation: next, previous.
• Select specific e-mails: search by State (new, old), Sender, Date, Priority and Category.
• Read, compose, reply, forward and delete e-mails.
• Recipient list management.
• Summarisation.
• Categorisation.


VPA Interface

• Output
  – Voice
  – Avatar
  – Screen
  – PDA

• Input
  – Voice
  – Keyboard
  – Mouse
  – Touch
  – Stylus

[Figure: diagram labelled Multimodal, VUI, GUI]

Available in English, Swedish and Portuguese


Global Evaluation

• Set-up of the test environment
  – Task design, to cover the VPA functionalities.
  – Test mailbox populated with a restricted set of contacts and e-mails.

• Heuristic Evaluation
  – 5 expert assessments per language.
  – Experts in accessibility, usability and voice interaction.

• User Tests
  – 20 users for accessibility testing, English version only (RNIB and RNID).
  – 20 Swedish and English users and 12 Portuguese users.
  – Experts in e-mail usage.

“to iteratively gather information about the usability and accessibility of the system”


The Portuguese Trial

• Laboratory environment
  – The graphical interface was a web page simulating a mobile phone. Users interacted with the GUI on a desktop PC with Internet access and used a fixed phone to convey voice to the system.

• 12 native Portuguese speakers
  – 8 males and 4 females
  – aged 19 to 46 (mean 30.6 years)
  – 75% of the participants had high-level education and 16.7% had mid-level education
  – ICT professionals and experienced e-mail users

• 5 typical e-mail tasks
  – log in and browse the mailbox
  – search for and reply to an e-mail
  – search for and forward an e-mail
  – administer and manage the recipient list
  – find, reply to and delete an e-mail
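
The education percentages reported for the 12 participants can be turned back into head counts. The following is a minimal Python sketch; the helper name `count_from_percent` is illustrative, and rounding to the nearest whole participant is an assumption rather than something the slide states.

```python
def count_from_percent(percent: float, total: int) -> int:
    """Return the nearest whole number of participants for a reported percentage."""
    return round(percent / 100 * total)


PARTICIPANTS = 12  # sample size reported for the Portuguese trial

high_level = count_from_percent(75.0, PARTICIPANTS)  # 75% reported
mid_level = count_from_percent(16.7, PARTICIPANTS)   # 16.7% reported
unreported = PARTICIPANTS - high_level - mid_level   # not broken down on the slide

print(high_level, mid_level, unreported)
```

Under that rounding assumption the figures correspond to 9 and 2 participants, which leaves one participant whose education level the slide does not state.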


Task results summary

                           Interactions               Spoken interaction (%)
Task  Time        Comp. %  VUI           GUI          Correct resp.  No resp.  Misund.  Incor. resp.
T1    10 (7-18)   83.3     14.0 (9-49)   6.5 (0-22)   54.3           13.7      6.8      25.2
T2    7 (5-12)    75       11.5 (5-33)   4.5 (0-16)   60.2           11.4      8.5      19.9
T3    5 (3-16)    100      14.5 (1-26)   1.5 (0-20)   67             9.8       9.8      13.3
T4    6 (2-10)    50       16.5 (4-41)   2 (0-6)      55.84          7.4       18.6     18.2
T5    10 (4-18)   75       32 (9-66)     4 (0-11)     59.4           9.4       9.4      21.8
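
As a quick consistency check on the task results, the four spoken-interaction outcome percentages for each task should add up to roughly 100. The following Python sketch copies those figures from the slide (with decimal commas written as points); the variable names and the 0.2-point rounding tolerance are assumptions made for illustration.

```python
# Spoken-interaction outcomes per task, in percent, in the order:
# (correct response, no response, misunderstanding, incorrect response)
SPOKEN_OUTCOMES = {
    "T1": (54.3, 13.7, 6.8, 25.2),
    "T2": (60.2, 11.4, 8.5, 19.9),
    "T3": (67.0, 9.8, 9.8, 13.3),
    "T4": (55.84, 7.4, 18.6, 18.2),
    "T5": (59.4, 9.4, 9.4, 21.8),
}

TOLERANCE = 0.2  # assumed allowance for rounding in the reported figures


def row_total(task: str) -> float:
    """Sum the four outcome percentages reported for one task."""
    return sum(SPOKEN_OUTCOMES[task])


def rows_are_consistent() -> bool:
    """Check that every task's outcomes add up to 100% within the tolerance."""
    return all(abs(row_total(task) - 100.0) <= TOLERANCE for task in SPOKEN_OUTCOMES)


for task in SPOKEN_OUTCOMES:
    print(task, round(row_total(task), 2))
print("consistent:", rows_are_consistent())
```

A row that strayed far from 100 would suggest a transcription error in the table rather than a property of the trial itself.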


Post-test satisfaction questionnaire

[Figure: bar chart of response frequency (axis 0 to 8) per questionnaire item. Items: Intuitiveness, Easy, Confidence, Satisfaction, where&abouts, error recog., prompts, e-mails, Conversations. Legend: Very Satisfied, Satisfied, Neutral, Unsatisfied, Very Unsatisfied.]


Statistical analysis

• Significant correlation (Spearman’s correlation coefficient, ρ) between overall satisfaction and:
  – Quality of dialog: ρ = 0.87
  – Confidence: ρ = 0.79
  – Ease of use: ρ = 0.74
  – Interaction control: ρ = 0.73
  – Interaction quality (error recognition): ρ = 0.69

• Significant correlation (Spearman’s correlation coefficient) between overall satisfaction (subjective) and concept accuracy (objective value of correct responses): ρ = 0.85.

• No differences between females and males (Mann-Whitney test), or between experienced and naïve users.
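
For readers unfamiliar with Spearman’s coefficient, it is the Pearson correlation computed on the ranks of the two variables, with tied values sharing their average rank. The following is a minimal Python sketch of that definition. The function names are illustrative, and the twelve score pairs are made-up placeholder values, not data from the trial.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical questionnaire scores for 12 imaginary participants.
satisfaction = [5, 4, 2, 3, 5, 1, 4, 3, 2, 4, 5, 3]
dialog_quality = [4, 4, 2, 3, 5, 2, 5, 3, 1, 3, 5, 2]

print(round(spearman_rho(satisfaction, dialog_quality), 2))
```

With only 12 participants and coarse rating scales, ties are frequent, which is why the sketch uses average ranks rather than the shortcut formula based on squared rank differences.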


Users’ approach

• The preferred modality was speech.
• Natural language, using short phrases but with complex commands.
• Speech input to convey the commands, and the graphical interface to read the messages and to scroll quickly through the contact list.
• More intensive use of the GUI to overcome recognition problems and the slowness of the system response.
• Mixed-initiative dialog.


Interaction Example 1

U: I want replace [recipient name] by carbon copy.
S: Who would you like to send copy to?
U: (barge-in) [recipient name]
S: Send copy to [recipient name]
U: I want change the recipient list.


Interaction Example 2

• U: mailbox
• S: You have 4 e-mails
• U: New search. Find high priority messages from [recipient name]
• S: You have 1 new priority e-mail from [recipient name]
• U: Read it


Users’ appreciation

• The conversational and multimodal VPA concept was attractive to all users and was seen as a key enabler supporting users’ growing mobility.
• The VPA was seen as easy to use and intuitive. The Help part of the system was hardly used.
• Users did not like excessive confirmations.
• The Portuguese TTS voice was well accepted by the users.
• Users liked voice input combined with VUI and GUI output in a small-screen environment.
• The multimodality was seen as a very good capability to overcome recognition problems encountered in the VUI.


Future Use

But when asked about future use:

• 58% of the users said that they would not use the system in its current form.

• Main reasons:
  – slow response time
  – recognition/understanding problems


Failure?

Tell me “when it’s time” to stop!


NO!

Lessons learned
– Speed of feedback is very important. Users dislike latency or long periods of silence.
– Improvements are needed to increase the recognition accuracy of the spoken components.
– Natural language is working ... with limitations.

Multimodal interfaces can overcome the weaknesses of each modality and exploit the full strengths of combined modes.


Glória Branco: [email protected]

www.ptinovacao.pt

Thank you!