
ADAPTING TV BASED APPLICATIONS’ USER INTERFACE

David Costa and Carlos Duarte
Faculdade de Ciências da Universidade de Lisboa

Campo Grande, 1749-016 Lisboa - Portugal

ABSTRACT

In this paper we describe the design and implementation of an adaptive multimodal fission component integrated in the multimodal GUIDE framework. This component is able to adapt any TV based application’s UI to a specific user’s characteristics, making it possible for elderly and impaired users to interact by offering several output modalities that try to overcome their interaction difficulties.

KEYWORDS

Multimodal fission, Adaptation, Accessibility, GUIDE, Elderly

1. INTRODUCTION

Interacting with a TV is no longer a passive viewing experience in which interaction was limited to choosing a channel to watch or changing the volume. Today, with services like Google TV, Samsung Smart TV or Apple TV, viewers take active roles in consuming and producing content: browsing the web, uploading media such as images or videos, using the different TV applications available, or simply recording TV programs.

Since everyone can and should be able to access these new technologies, regardless of age, knowledge and physical, sensory or cognitive abilities, it makes sense to employ adaptation techniques and offer a wide range of possible interaction devices for both input and output.

This paper describes a component, part of a multimodal system named GUIDE, which aims to solve the aforementioned issues on the output side.

2. CONTEXT

Our work is being developed in the scope of the European project GUIDE (“Gentle user interfaces for elderly people”) which has the goal of developing a framework for developers to efficiently integrate accessibility features into their applications (Jung and Hahn, 2011).

GUIDE puts a dedicated focus on the emerging hybrid TV platforms and services (connected TVs, set-top-boxes, etc.), including application platforms such as Hybrid broadcast broadband TV (HbbTV) as well as the proprietary middleware solutions of TV manufacturers. These platforms have the potential to become the main media terminals in users’ homes, due to their convenience and wide acceptance. GUIDE intends to simplify users’ lives, help them stay connected to their social networks and enhance their understanding of the world.

This project’s main target population is the elderly, due to their likelihood of having some kind of mild disability such as a visual, hearing or motor impairment, though the benefits of such an adaptable system extend to the whole population.

One of GUIDE’s goals is to support more human-like ways of communication, employing voice and gestures as means of interacting with the system, without requiring the long learning curve that is one of the reasons keeping these users away.

GUIDE must adapt the operation of these modalities to benefit the user experience. GUIDE must be able to render the application content in the most appropriate modality for a given user and therefore provide alternative ways of displaying that content. GUIDE must be able to adapt its features and settings automatically, demanding no skills or technological knowledge from users to configure the system, thus providing the best possible interaction and performance.

Our work consists mainly of developing a component, to be integrated in the GUIDE framework, capable of, first, selecting the best available modalities to present the content, suited to the user’s profile and the content’s features; second, distributing that content through the selected modalities (using redundancy and/or complementarity strategies); and finally, adjusting that content for each chosen modality.

3. RELATED WORK

3.1 Adaptive Multimodal Systems

Dumas et al. (2009) define multimodal systems as “computer systems endowed with multimodal capabilities for human-computer interaction and able to interpret information from various sensory and communication channels”. These systems offer users a set of modalities to interact with machines and “are expected to be easier to learn and use, and are preferred by users for many applications” (Oviatt, 2003).

Adaptive multimodal systems enable a more effective interaction by adapting to different situations and different users according to their skills, physical or cognitive abilities.

The flexibility of these systems allows them to adapt not only to users but also to the environment (context awareness). For example, the system can use speech to issue a warning or present information in an eyes-busy situation.

GUIDE’s architecture follows the “integration committee” approach introduced by Dumas et al. (2009), which is composed of a fusion engine (Feiteira and Duarte, 2011), a fission engine (David Costa and Duarte, 2011), a dialogue manager (Daniel Costa and Duarte, 2011), and user (Biswas et al., 2012) and context models.

3.2 Multimodal Fission

This component is responsible for choosing the output to be presented to the user and for how that output is channeled and coordinated through the different available output channels (based on the user’s perceptual abilities and preferences). Dumas et al. (2009) state that a multimodal fission engine should perform three main tasks: message construction, modality selection and output coordination.

GUIDE’s fission component bases its process on the What-Which-How-Then (WWHT) conceptual model of Rousseau et al. (2005). The component must know what information to present, which modalities to choose to present that information, and how to present it using those modalities, and must then coordinate the flow of the presentation.

In (David Costa and Duarte, 2011) we present a more extensive review of research done in the past years on multimodal systems, and on multimodal fission engines in particular.

3.3 Learning with Users

GUIDE follows a user-centered methodology to offer the best adaptive experience for end users, so we conducted user trials to elicit the users’ requirements, behaviors and specificities, studying and analyzing their interaction with a multimodal application (Coelho et al., 2012).

These studies’ conclusions had clear implications for the development of an adaptive system such as GUIDE (Duarte et al., 2011; Coelho et al., 2011). Concerning fission adaptation, we learned that users, regardless of their characteristics, prefer applications with a small number of interactive elements per screen, focusing on big buttons. If developers build complex UIs, GUIDE has to be capable of dividing one screen into multiple screens or presenting options to the user in alternative modalities.

Applications should make sure both text size and audio volume are configurable by the user at the beginning as well as in the middle of an interaction. If the application itself doesn’t offer this option, GUIDE’s UI adaptation should provide it.

There is a strong relation between arm and item locations on the screen (e.g. users tended to use the left arm to point to menus located on the left side of the screen and the right arm for the right side). This will influence the way developers design the layout of their applications, and it also affects the configuration and parameterization of the GUIDE presentation manager (fission module), as both have to account for this user-UI relation. For example, a user with impairments in the left arm should have interactive elements on the right side of the screen if he wishes to interact with free-air hand gestures.

Users also prefer visual and audio modalities to provide redundant information in feedback interactions, but this may differ across applications and contexts.

4. MULTIMODAL ADAPTIVE FISSION

A multimodal adaptive system should be able to flexibly generate various presentations for the same information content in order to meet the individual user’s requirements, the environmental context, the type of task and hardware limitations. Adapting the system to combine all these ever-changing elements is a delicate task.

The fission module is crucial to making that possible, as it takes advantage of multiple modalities to overcome sensory impairments that users may have.

4.1 Architecture

Figure 1. Multimodal Adaptive Fission Architecture

Figure 1 shows the architecture of the fission component, its subcomponents, the GUIDE components it communicates with, and the output devices.

The UIML, CM, UM, IA and DM parsers are the classes responsible for parsing the information received from the corresponding components. The UIML parser gets the UI representation, i.e., all the visual elements (e.g. buttons, images, text, videos) and their properties (e.g. text size, color, width, height, location, volume) of the current application screen. The UM parser gets data from the user model, such as the user’s level of impairment in each modality and the range of recommended values for visual and audio attributes. The CM parser parses the information about the available devices and their configurations. The IA parser gets information about the item a user intends to select, so that a selection action can be performed (e.g. highlighting the item and rendering a selection sound). The DM parser gets data about problems with user input recognition; for example, when the fusion component cannot understand a user’s speech command (low confidence in the words spoken), the fission component is responsible for rendering a message asking the user to repeat the command.
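As an illustration of the data these parsers hand to the rest of the module, the following TypeScript sketch shows a minimal UI element model and a UIML parser. The element and property names, and the use of DOMParser, are assumptions made for the example, not the actual GUIDE implementation.

```typescript
// A minimal model of a parsed UI element. Property names (textSize,
// color, ...) are illustrative assumptions, not the GUIDE schema.
interface UIElement {
  id: string;
  type: string; // e.g. "button", "image", "text", "video"
  properties: Record<string, string>;
}

// Sketch of a UIML parser: collects each <part> together with the
// <property> entries of the <style> section that reference it.
function parseUIML(uimlSource: string): UIElement[] {
  const doc = new DOMParser().parseFromString(uimlSource, "application/xml");
  return Array.from(doc.querySelectorAll("part")).map((part) => {
    const id = part.getAttribute("id") ?? "";
    const properties: Record<string, string> = {};
    for (const p of Array.from(doc.querySelectorAll(`property[part-name="${id}"]`))) {
      properties[p.getAttribute("name") ?? ""] = p.textContent ?? "";
    }
    return { id, type: part.getAttribute("class") ?? "", properties };
  });
}
```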

The Modality Selector class is responsible for selecting the modalities to use for a specific user. First, the level of adaptation is chosen, inferred from the compatibility between the current user interface and the user’s characteristics. Then, given the user’s impairment levels, this component decides which modalities to use.

The UI Adapter adapts the presentation based on the user and context model values while the Presentation Evaluator assesses the validity of the presentation.

The Coordinator component is responsible for laying out the output events in the right order and sending them to the respective output devices.

4.2 Output Features

In order to fulfill GUIDE’s objectives, the fission module has at its disposal different types of equipment for rendering the presentation.

The main medium used for visual rendering is the TV, which displays the channels, the GUIDE interface and TV based applications such as tele-learning, video conferencing or home automation. A tablet may also be used to clone or complement information displayed on the TV screen (e.g. context menus). It is a powerful tool for user adaptation purposes and is essentially used as a secondary display.

An avatar (Ribeiro et al., 2010) is also available for these devices, with the goal of illustrating, answering, advising and supporting the user. This avatar plays a major role in elderly acceptance and adoption of the GUIDE system.

Audio feedback is available from the TV, tablet or remote control through audio speakers. Audio outputs range from non-speech sounds, such as rhythmic sequences combining different timbre, intensity, pitch and rhythm parameters, to speech synthesizers that produce artificial human speech.

Haptic feedback is performed by vibration features from the remote control or tablet and is mainly used to complement other modalities (e.g. an alert message appears on the screen and the remote control vibrates to warn the user).

These are the devices available to the fission module for performing adaptation actions on TV applications based on the user’s characteristics: the visual modality can be complemented by the auditory modality and/or haptic feedback. All elements of the presentation (e.g. text, images, buttons, videos, avatar, sounds) should be highly configurable and scalable, i.e., attributes such as size, font, location, color, volume and intensity should be settable by the fission module. Depending on the user, the depth of the adaptation will change.

4.3 Adaptation Levels

In order to select the most appropriate modalities to use, it is necessary to define how deep the adaptation will be. We decided to divide it into three levels of adaptation of the interaction and presentation interface: augmentation, adjustment and replacement. These three levels represent an increasing change to the visual presentation defined by the application, from no change to the visual rendering up to a, possibly, complete overhaul. Given that GUIDE aims to support legacy applications, we must consider that these applications have been developed without accessibility concerns towards impaired users.

Ideally, GUIDE would be able to evaluate whether the application’s presentation is close to the recommended presentation parameters for the current user and context specifications (e.g. text or button sizes are within the values perceivable by the user’s vision), and based on that analysis select which adaptation level to apply. In practice, this represents a loss of control for application developers and publishers to which they might not agree. As such, the level of adaptation of an application might be limited by the application publisher.
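The following sketch illustrates one way such a selection could work: count how many visual properties fall outside the user’s recommended ranges, and cap the resulting level at what the publisher allows. The thresholds, names and property encoding are illustrative assumptions, not GUIDE’s actual logic.

```typescript
// Illustrative adaptation-level selection. A property map per UI
// element is checked against per-property recommended ranges.
type AdaptationLevel = "augmentation" | "adjustment" | "replacement";

interface RecommendedRange { min: number; max: number; }

function selectAdaptationLevel(
  elements: Record<string, string>[],            // one property map per UI element
  recommended: Record<string, RecommendedRange>, // e.g. { textSize: { min: 24, max: 48 } }
  publisherLimit: AdaptationLevel
): AdaptationLevel {
  let violations = 0;
  for (const props of elements) {
    for (const [name, range] of Object.entries(recommended)) {
      const value = Number(props[name]);
      if (!Number.isNaN(value) && (value < range.min || value > range.max)) violations++;
    }
  }
  // No violations: augmenting is enough; a few: adjust in place;
  // many: the screen likely needs restructuring (replacement).
  const wanted: AdaptationLevel =
    violations === 0 ? "augmentation" : violations <= 3 ? "adjustment" : "replacement";
  const order: AdaptationLevel[] = ["augmentation", "adjustment", "replacement"];
  return order[Math.min(order.indexOf(wanted), order.indexOf(publisherLimit))];
}
```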

Augmentation is the first and lightest form of adapting the interface implemented by developers. In this case visual elements are not subject to adaptation. Instead, the fission module complements the user interface with other modalities. The UI (its HTML/JavaScript-based embodiment, in the case of web applications) should be enriched with UI mark-up (e.g. WAI-ARIA), so that GUIDE can extract semantic information about UI elements and render a user-specific multimodal augmentation of it (fission would be able to perceive which elements are part of a menu, which are informational text, etc., and then build intelligent speech for the avatar).

Adjustment is the level of adaptation where the interface rendering is adjusted to the abilities of the user, and it can also be combined with augmentation. Considering that applications are primarily developed using visual presentation mechanisms, this corresponds to being able to adjust several parameters such as font size, color, contrast, etc. If other modalities are employed, their parameters can also be adapted (e.g. adjusting the audio volume of the speakers, the vibration intensity of the remote control, etc.). Fission obtains the recommended properties for a specific user for the different elements, compares them with the application’s settings, and then sets the right values for those elements, possibly with some small adjustments for presentation arrangement purposes.
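Under the same illustrative assumptions as the sketch above, the adjustment step amounts to clamping each numeric property into the user’s recommended range, leaving compliant values untouched:

```typescript
// Sketch of the adjustment step: clamp each numeric visual property
// into the user's recommended range. Property names are assumptions.
interface RecommendedRange { min: number; max: number; }

function adjustProperties(
  properties: Record<string, string>,
  recommended: Record<string, RecommendedRange>
): Record<string, string> {
  const adjusted = { ...properties };
  for (const [name, range] of Object.entries(recommended)) {
    const value = Number(adjusted[name]);
    if (adjusted[name] !== undefined && !Number.isNaN(value)) {
      adjusted[name] = String(Math.min(Math.max(value, range.min), range.max));
    }
  }
  return adjusted;
}
```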

Replacement is the most complex adaptation scheme, as it implies full control over the interface created by the developer. Besides augmentation and adjustment, it can also replace interactive elements with others (e.g. menus vs. buttons) or distribute the screen content over different modalities and/or additional screens. Users with cognitive impairments could experience better navigation through the application by getting a simpler screen, without feeling lost in a tangle of menus and buttons.

4.4 Modality Selection and Evaluation

After selecting the adaptation level best suited to the situation, the modalities used to render the content are chosen through weights selected in accordance with availability or resource limitations (context model) and with the user specificities described in the user model (Figure 2).

Figure 2. Information for Modality Selection during Multimodal Fission

In this phase we need to know the user’s characteristics and decide what modalities to use. Impairment levels are assessed (none or mild) and modalities are chosen to complement and/or be adapted, following the rule system described in Table 1, which results from the analysis of the user studies (Coelho et al., 2011).

Table 1. Modality Selection Rules

Modality                     Impairment level   Complement            Adapt
                                                Visual   Auditive     Visual   Auditive
Visual                       None               0        0            0        0
Visual                       Mild               0        1            1        0
Auditive                     None               0        0            0        0
Auditive                     Mild               0        0            0        1
Cognitive                    None               0        0            0        0
Cognitive                    Mild               0        1            1        0
Motor                        None               0        0            0        0
Motor                        Mild               0        0            1        0
Visual & Auditive            Mild               0        1            1        1
Visual & Motor               Mild               0        1            1        0
Auditive & Motor             Mild               0        0            1        1
Auditive & Visual & Motor    Mild               0        1            1        1

“Complement” modalities are the modalities used to give redundant information to the user. “Adapt” modalities are the modalities that need to be adapted (e.g. a user with mild visual impairments needs visual elements to be adapted to his characteristics, while audio complements the visual impairment).
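A minimal sketch of this binary rule method follows: the user’s mild impairments index a table entry saying which modalities complement the presentation and which must be adapted. The encoding (string keys, boolean flags) is an assumption for illustration; only the 0/1 values come from Table 1.

```typescript
// Binary modality-selection rules, transcribed from Table 1.
interface ModalityDecision {
  complement: { visual: boolean; auditive: boolean };
  adapt: { visual: boolean; auditive: boolean };
}

const NONE: ModalityDecision = {
  complement: { visual: false, auditive: false },
  adapt: { visual: false, auditive: false },
};

// Keys list the user's mild impairments, sorted alphabetically.
const rules: Record<string, ModalityDecision> = {
  "visual":    { complement: { visual: false, auditive: true },  adapt: { visual: true,  auditive: false } },
  "auditive":  { complement: { visual: false, auditive: false }, adapt: { visual: false, auditive: true  } },
  "cognitive": { complement: { visual: false, auditive: true },  adapt: { visual: true,  auditive: false } },
  "motor":     { complement: { visual: false, auditive: false }, adapt: { visual: true,  auditive: false } },
  "auditive+visual":       { complement: { visual: false, auditive: true },  adapt: { visual: true, auditive: true  } },
  "motor+visual":          { complement: { visual: false, auditive: true },  adapt: { visual: true, auditive: false } },
  "auditive+motor":        { complement: { visual: false, auditive: false }, adapt: { visual: true, auditive: true  } },
  "auditive+motor+visual": { complement: { visual: false, auditive: true },  adapt: { visual: true, auditive: true  } },
};

// Users with no mild impairment get no complement and no adaptation.
function selectModalities(mildImpairments: string[]): ModalityDecision {
  return rules[[...mildImpairments].sort().join("+")] ?? NONE;
}
```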

Using the information provided by the user and context models, the fission module is capable of calculating the best values for visual elements, within the recommended ones, by evaluating the presentation’s coherency (e.g. ensuring that bigger buttons will not overlap each other or cross the screen boundaries).
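Such a coherency evaluation can be sketched as a pair of geometric checks: every element stays within the screen, and no two elements overlap. The bounding-box representation is an assumption for the example.

```typescript
// Coherency check sketch: all boxes in bounds, no pairwise overlap.
interface Box { x: number; y: number; width: number; height: number; }

function isPresentationCoherent(
  boxes: Box[],
  screen: { width: number; height: number }
): boolean {
  const inBounds = (b: Box) =>
    b.x >= 0 && b.y >= 0 &&
    b.x + b.width <= screen.width && b.y + b.height <= screen.height;
  const overlaps = (a: Box, b: Box) =>
    a.x < b.x + b.width && b.x < a.x + a.width &&
    a.y < b.y + b.height && b.y < a.y + a.height;
  if (!boxes.every(inBounds)) return false;
  for (let i = 0; i < boxes.length; i++)
    for (let j = i + 1; j < boxes.length; j++)
      if (overlaps(boxes[i], boxes[j])) return false;
  return true;
}
```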

Once the presentation is ready to be rendered, the necessary messages to the output devices need to be sent in a coordinated way.

4.5 Coordination

To synchronize the presentation flow, coordination events are sent to the bus in order to start or stop rendering, or to notify when a rendering is completed. Figure 3 shows a simple example of render synchronization in which the audio synthesis only starts after the visual rendering is completed.

These rendering instructions are handled by a buffer in the fission module, which sends one instruction for each device at a time. The device will then respond with a notification of completion or failure. By controlling the flow of events sent and notifications received, instructions that do not get a chance to be rendered because a new state needs to be loaded due to user intervention are not even sent to the rendering devices, saving bandwidth.

Figure 3. Example of Rendering Synchronization

Coordination can be guaranteed by the fission component based on notifications of completion and failure. If all instructions were sent to rendering at once, coordination would be much more complex because output devices cannot be expected to be aware of the state of other devices.
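The buffering behavior described above can be sketched as follows: one instruction in flight per device, the next sent only after a completion (or failure) notification, and pending instructions dropped when a new application state arrives. The send/notification API is an assumption, not the actual GUIDE bus interface.

```typescript
// Coordination buffer sketch: per-device queues, one instruction in
// flight at a time, invalidated on a UI state change.
interface RenderInstruction { device: string; payload: unknown; }

class Coordinator {
  private queues = new Map<string, RenderInstruction[]>();
  private busy = new Set<string>();

  constructor(private send: (i: RenderInstruction) => void) {}

  enqueue(instruction: RenderInstruction): void {
    const q = this.queues.get(instruction.device) ?? [];
    q.push(instruction);
    this.queues.set(instruction.device, q);
    this.dispatch(instruction.device);
  }

  // Called when a device reports completion or failure of a render.
  onNotification(device: string): void {
    this.busy.delete(device);
    this.dispatch(device);
  }

  // A new application state invalidates everything not yet rendered,
  // so stale instructions are never sent, saving bandwidth.
  onStateChange(): void {
    this.queues.clear();
  }

  private dispatch(device: string): void {
    if (this.busy.has(device)) return; // one instruction at a time per device
    const next = this.queues.get(device)?.shift();
    if (next) {
      this.busy.add(device);
      this.send(next);
    }
  }
}
```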

4.6 Communication Events

The GUIDE framework uses a publisher/subscriber communication system in which events flow through different buses. The fission component receives events from, and sends events to, the other components in the framework and the output devices. For example, the Web browser interface sends fission new application states to be rendered; upon receiving this type of event, fission requests information from the user and context models by publishing events to those components. After performing the adaptation, the fission component publishes output events to the devices. A summary of the events produced and received is presented in Table 2.

Table 2. Communication Events for Multimodal Fission

Name                       In/Out   Bus       Description
IntendedTargetList         In       Context   Published by Input Adaptation
CurrentUIML                In       Output    Published by Web Browser Interface
UIParameters               In       Context   Published by User Model
DeviceContextInformation   In       Context   Published by Context Model
OutputControl              In       Context   Published by Dialogue Manager
Status                     In       Service   Subscribed by Multimodal Fission, published by all output renderers
GuiAdaptationRequired      Out      Output    Published by Multimodal Fission, subscribed by visual renderers
Behavior                   Out      Service   Subscribed by Avatar
Window                     Out      Service   Subscribed by Avatar
AvatarProfile              Out      Service   Subscribed by Avatar
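A minimal sketch of the publisher/subscriber pattern behind these buses is shown below. The event names follow Table 2, but the bus API itself is an illustrative assumption.

```typescript
// Minimal publish/subscribe bus sketch.
type Handler = (payload: unknown) => void;

class Bus {
  private subscribers = new Map<string, Handler[]>();

  subscribe(event: string, handler: Handler): void {
    const list = this.subscribers.get(event) ?? [];
    list.push(handler);
    this.subscribers.set(event, list);
  }

  publish(event: string, payload: unknown): void {
    for (const handler of this.subscribers.get(event) ?? []) handler(payload);
  }
}

// Example wiring: fission subscribes to new UI states on the output
// bus and to renderer status reports on the service bus, and publishes
// adaptation results back to the visual renderers.
const outputBus = new Bus();
const serviceBus = new Bus();
outputBus.subscribe("CurrentUIML", (uiml) => { /* parse, adapt, render */ });
serviceBus.subscribe("Status", (status) => { /* completion or failure */ });
outputBus.publish("GuiAdaptationRequired", { /* adapted UI goes here */ });
```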

The fission module needs knowledge of the application’s UI, and that information must be structured and contain all elements and their properties, so that the content can be adapted to the user. Several abstract and concrete user interface mark-up languages were considered as the UI standard for GUIDE. We conducted a study of several languages, and UIML proved best suited for our purposes (Daniel Costa, 2011, pp. 59-61).

The UIML specification does not define property names. This is a powerful concept because it makes the language extensible: we can define any property appropriate for a particular element of the UI. Additionally, properties can be used to represent the information developers provide through WAI-ARIA markup tags.
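The fragment below illustrates this extensibility; the concrete property names, including the ARIA-like one, are hypothetical examples, not a GUIDE schema. A parser such as the earlier sketch would map them into a generic name-value record.

```typescript
// Illustrative UIML fragment: property names are freely chosen, so
// arbitrary attributes (including WAI-ARIA-derived ones) can be carried.
const sampleUIML = `
<uiml>
  <interface>
    <structure>
      <part id="playButton" class="button"/>
    </structure>
    <style>
      <property part-name="playButton" name="text">Play</property>
      <property part-name="playButton" name="textSize">32</property>
      <property part-name="playButton" name="aria-role">button</property>
    </style>
  </interface>
</uiml>`;
```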

5. CONCLUSION AND FUTURE WORK

GUIDE tries to deliver accessibility features to TV based applications without any effort from the developers, as the adaptation process is the framework’s responsibility. With the set of multimodal devices provided by the system, users with some kind of functional limitation or impairment can still enjoy interacting with TV based applications.

Thanks to the knowledge gained from the user studies conducted with end users in the early phases of GUIDE’s implementation, the fission component is capable of choosing the best modalities to perform adaptation, based on the user model, using a simple binary rule method. It evaluates the content in order to change visual properties to compatible values. It also coordinates the presentation flow, determining which devices should start rendering and when.

Future work will address the development of rules and algorithms for complementing and adapting to motor and cognitive impairments, and also the third level of adaptation, the replacement level. This level has been left out of development in the project so far, given the reluctance shown by application developers to have a “foreign” framework take over the rendering of their applications. As such, we have not yet committed resources to this goal. However, it is envisioned that new algorithms will have to be devised to that end, since the variables at play are substantially different from those of the first two adaptation levels.

ACKNOWLEDGEMENT

This work was funded by the GUIDE project - the European Union’s Seventh Framework Programme (FP7/2007-2013) under grant agreement nº 24889.

REFERENCES

Biswas, P., Robinson, P. & Langdon, P., 2012. Designing Inclusive Interfaces Through User Modeling and Simulation. International Journal of Human-Computer Interaction, 28(1), pp. 1-33.

Coelho, J., Duarte, C., Feiteira, P., Costa, David, Costa, Daniel, 2012. "Building Bridges Between Elderly and TV Application Developers", in Proceedings of the 5th International Conference on Advances in Computer-Human Interactions (ACHI 2012), Valencia, Spain.

Coelho, J., Duarte, C., Langdon, P., Biswas, P., 2011. "Developing Accessible TV Applications", in Proceedings of the 13th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2011), Dundee, Scotland.

Costa, Daniel, 2011. Self-Adaptation of Multimodal Systems. MSc thesis, Faculty of Sciences, University of Lisbon.

Costa, Daniel & Duarte, C., 2011. Self-adapting TV Based Applications. In Proceedings of the 14th International Conference on Human-Computer Interaction (HCII), pp. 357-364.

Costa, David & Duarte, C., 2011. Adapting Multimodal Fission to User’s Abilities. In Proceedings of the 14th International Conference on Human-Computer Interaction (HCII), pp. 347-356.

Duarte, C. et al., 2011. Eliciting Interaction Requirements for Adaptive Multimodal TV based Applications. In Proceedings of the 14th International Conference on Human-Computer Interaction (HCII), pp. 42-50.

Dumas, B., Lalanne, D. & Oviatt, S., 2009. Multimodal Interfaces: A Survey of Principles, Models and Frameworks. In D. Lalanne & J. Kohlas, eds., Human Machine Interaction, LNCS 5440, pp. 3-26.

Feiteira, P. & Duarte, C., 2011. Adaptive Multimodal Fusion. In C. Stephanidis, ed., Lecture Notes in Computer Science, 6765, pp. 373-380.

Jung, C. and Hahn,V., 2011. GUIDE – Adaptive User Interfaces for Accessible Hybrid TV Applications. Second W3C Workshop Web & TV. Berlin, Germany.

Oviatt, S., 2003. Multimodal interfaces. In The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, pp. 286-304.

Ribeiro, P. et al., 2010. “Combining Explicit and Implicit Interaction Modes with virtual characters in public spaces”. ICIDS'10 Proceedings of the Third joint conference on Interactive digital storytelling. Edinburgh, UK. pp. 244-247.
