
Assignment Cover Sheet

MSc in User Experience Design

Student Name: Stephen Norman

Student Number: N00147768

Programme: MSc UX Design

Year of Programme: 2015/2016

Module Name: User Research and Usability

Assignment: Comparison of UX Evaluation Techniques

Assignment Deadline: 14/02/2015

I declare that this submission is my own work. Where I have read, consulted and used the work of others I have acknowledged this in the text.

Signature: Stephen Norman Date: 14/02/2016


Table of Contents

1. Introduction
2. Evaluation Methods
2.1. Usability Testing
2.1.1. Case Study: “Find it if you can: Usability Case Study of Search Engines for Young Users”
2.1.2. Case Study Review
2.2. UX Curve
2.2.1. Case Study: “Comparing the Effectiveness of Electronic Diary and UX Curve Methods in Multi-Component Product Study”
2.2.2. Case Study Review
2.3. Web Surveys
2.3.1. Case Study: “Approaches to Cross-Cultural Design: Two Case Studies with UX Web-Surveys”
2.3.2. Case Study Review
3. Comparison of Evaluation Methods
4. Conclusion
5. References
6. Bibliography


1. Introduction

There are numerous user experience evaluation methods currently in use. Some 84¹ evaluation methods are listed today, a reduction since the publication by Vermeeren et al. (2010), which referenced 96 methods. This paper will examine three methods: Usability Testing, UX Curve and Web Surveys. These will be discussed and their effectiveness demonstrated through various case studies. Furthermore, their performance will be examined and critiqued, and examples of their multi-functional roles introduced and discussed. This is followed by a comparison of each method's real-world feasibility, with any future improvements outlined in the conclusion.

2. Evaluation Methods

In this chapter three evaluation methods will be introduced: Usability Testing, UX Curve and Web Surveys. Each method will be described and analysed through a case study review.

2.1. Usability Testing

Usability testing is a behavioural study in which as few as five users are enough to maximise the benefit-to-cost ratio (Nielsen, 2012). Participants are set tasks while observers watch, listen and take notes (Usability.gov, 2016). It is an effective method for gathering both quantitative and qualitative information, and is also well suited to examining attitudinal and behavioural dimensions (Rohrer, 2014). These tests are cost effective, requiring no formal laboratory (Usability.gov, 2016): any room with portable recording devices will be sufficient, or the testing can be performed remotely. Controlled settings also eliminate factors that can alter human behaviour, such as location, time of day, season, or temperature (Trivedi, 2012). Remote testing is conducted in one of two ways: moderated or unmoderated (Schade, 2013). Moderated testing involves two-way communication between the participant and the facilitator, allowing additional information to be gathered. Unmoderated testing is done solely by the user, who completes predefined tasks without a facilitator; such studies lack real-time questioning and support (Schade, 2013). Because usability testing is mostly done in controlled environments, Monahan et al. (2008) argue that these studies lack context, a disadvantage which nevertheless depends on the type of application being tested.

¹ http://www.allaboutux.org/all-methods
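Nielsen's five-user recommendation rests on a simple discovery model: the proportion of distinct usability problems found with n test users is roughly 1 − (1 − L)ⁿ, where L is the share of problems a single user reveals (Nielsen's published average is L ≈ 31%; that figure comes from his article, not from this paper's case studies). A quick sketch of the model:

```python
# Nielsen's problem-discovery model: share of usability problems
# found by n test users, assuming each user reveals L = 31% of them.
L = 0.31

def problems_found(n, discovery_rate=L):
    """Expected proportion of usability problems uncovered by n users."""
    return 1 - (1 - discovery_rate) ** n

for n in (1, 3, 5, 10):
    print(f"{n:2d} users -> {problems_found(n):.0%} of problems")
```

Five users already uncover roughly 84% of problems under this model, which is why additional participants yield diminishing returns.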


2.1.1. Case Study: “Find it if you can: Usability Case Study of Search Engines for Young Users”

This study set out to assess 7 English and 5 German search engines on how successfully their interfaces match the abilities and skills of children. Interestingly, the study was conducted without involving any children, which deviates from standard practice (Nielsen, 2012). Three main points were addressed: motor skills, cognitive skills and presentation of results. The motor skills research considered artefacts such as the mouse and keyboard, assessing these devices from their handling to their accuracy on the interface. This included button sizes, clickable regions such as imagery, and alternative methods of providing results, such as the tangible figurines used in applications like TeddIR², proposed by Jansen et al. (2010). Cognitive abilities were studied both in children's understanding of general search and in how they interact with these interfaces, drawing on previous research. Children aged six to thirteen were in scope, as were two types of interface: browsing versus keyword orientated. Browsing interfaces allow users to navigate and explore a set of predefined categories, as used in KidsClick³, whereas keyword-orientated interfaces, e.g. Google, require the user to type each query. The final assessment criteria focused on font size, number of results per page, use of imagery, and whether the search catered for semantics and spell checking.

2.1.2. Case Study Review

This was an untraditional usability evaluation in its use of existing research on children's web use. The authors acknowledged that further studies should be conducted with children to verify the research and enrich the results. With sufficient prior research, a good user model was created that allowed the researchers to conduct their own study of these interfaces, saving time and money.

Furthermore, the chosen method was appropriate for producing the desired results, although further studies such as contextual usability inquiries or EmoCards⁴ could be performed to gather richer qualitative data.

Moreover, credit should be given to the paper's authors for their organisational skills. Exemplary effort went into the categorisation (Figure 1), which was applied to all criteria throughout the paper. Without these efforts it would have been difficult to assess the search engines properly.

² An interface designed to help children retrieve books by placing tangible figurines on a screen to represent search terms, in the hope of reducing errors from spelling and from finding the correct query (Jansen et al., 2010).
³ http://www.kidsclick.org/
⁴ http://www.allaboutux.org/emocards


Figure 1- Categorisation of search results by button size and page length.

2.2. UX Curve

UX Curve is a method in which participants are asked to sketch their retrospective experiences of product use over time (Figure 2). It has been designed to better understand user emotions and experiences chronologically (Kujala et al., 2011a). Sketching is done on a template divided into two axes: the x-axis represents time, while the y-axis can represent any desired evaluation factor, e.g. satisfaction or dissatisfaction (Sahar, Varsaluoma & Kujala, 2014).

Figure 2- (Left) Showing a deteriorating and stable curve. (Right) Improving ease of use curve.
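To illustrate how a sketched curve translates into analysable data, the sketch below classifies sampled (time, rating) points into the curve shapes shown in Figure 2. This is a hypothetical digitisation step, not part of the UX Curve method itself; the point values and the 0.5 tolerance are invented for demonstration:

```python
# Hypothetical digitisation of a UX Curve: once a participant's hand-drawn
# curve is sampled into (week, rating) points, the trend a researcher reads
# off the sketch can be classified programmatically.

def classify_curve(points, tolerance=0.5):
    """Classify a sampled UX Curve as 'improving', 'deteriorating' or 'stable'.

    points: list of (time, rating) tuples, e.g. rating on a -5..+5
    satisfaction scale; tolerance absorbs small wobbles in the sketch.
    """
    delta = points[-1][1] - points[0][1]
    if delta > tolerance:
        return "improving"
    if delta < -tolerance:
        return "deteriorating"
    return "stable"

# Example curves echoing Figure 2: one improving ease-of-use curve,
# one that deteriorates early and then stays flat.
improving = [(0, -2), (2, 0), (4, 1), (8, 3)]
deteriorating = [(0, 2), (2, -1), (4, -1), (8, -1)]

print(classify_curve(improving))      # improving
print(classify_curve(deteriorating))  # deteriorating
```

A real analysis would also examine the shape between the endpoints, which is precisely the qualitative detail researchers probe for in the face-to-face setting.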

When compared with a questionnaire, UX Curve has proven more effective at capturing hedonic aspects of use such as fun and pleasure (Kujala et al., 2011b). However, a later study concluded that long-term diary studies collect more detailed information than UX Curve (Sahar et al., 2014). Given the longevity of that study, results favoured the long-term diary study (LTDS) because it recorded data more accurately, whereas the UX Curve evaluation relied on recollection, as it was administered only after the diary study had concluded. Having to recall such information can lead to biases, argues Norman (2009): “Retrospective evaluations of long-term user experiences are based on memories of the user and they can be vulnerable to biases” (Kujala et al., 2011b). According to Vermeeren et al. (2010), it is one of the lesser-used methods because it is not cost effective and is impractical in product development contexts (Kujala et al., 2011b).

2.2.1. Case Study: “Comparing the Effectiveness of Electronic Diary and UX Curve

Methods in Multi-Component Product Study”

This case study assessed the performance of both UX Curve and an LTDS at collecting qualitative data as remote research methods (Sahar, Varsaluoma & Kujala, 2014). Twenty-five customers were recruited who had recently purchased a sports watch and were using it at least five times per week. The multi-component study, covering connected accessories such as a heart rate monitor, a speed sensor and a website, was conducted remotely over an eight-week period. Participants were asked to complete the electronic diary online up to twice a week; upon completion of the eight-week period they were sent four UX Curve templates, one for each component. The templates addressed the “Attractiveness” of the product: “We chose ‘attractiveness’ UX dimension because it represents overall appeal and non-instrumental qualities (aesthetics, symbolic and motivational aspects), although these were not specific to the users” (Sahar et al., 2014). The dimension was also chosen based on a previous study by Kujala et al. (2011a).

2.2.2. Case Study Review

The results clearly showed that the LTDS was more effective at collecting in-depth information about each component than UX Curve, although obtaining good user response rates over the study duration was a challenge (Sahar et al., 2014).

UX Curve took less time overall, from implementation to deployment and analysis, as it was conducted in a single session with participants. However, this study did not make the most of UX Curve's intended use: “UX Curve is intended to be used in a face to face setting where the researcher is better able to inquire into the participants’ reasoning and thoughts” (Kujala et al., 2011a). Instead, it was mailed to participants during the Sahar et al. (2014) study, eliminating this potential for qualitative data gathering. Although limiting the full use of UX Curve, the study did build on the existing research of Kujala et al. (2011a), who tested six UX Curve types (Figure 3) and identified “attractiveness” as the best-performing template.


Figure 3- Different curve types used while testing a product.

2.3. Web Surveys

Web surveys are a commonly used method in the researcher's toolkit. They offer access to a larger audience thanks to the accessibility of the internet. Both Walsh (2012) and Vermeeren et al. (2010) agree that they are desirable due to their lightweight nature: speed of implementation and ease of use.

These highly versatile studies can be used at any stage of the design process. In a recent project (Norman, 2016) a web survey was used during the exploratory phase to gauge people's attitudes towards, and use of, the An Post website before prototype conceptualisation. At the opposite end of the scale, web surveys were used as an LTDS in the study by Sahar et al. (2014).

A challenge raised by both Walsh (2012) and Sahar et al. (2014) was keeping participants engaged for the duration, as users tended to drop out or not complete the survey. This should be considered when surveys are used in LTDS contexts. A further issue identified by Walsh (2012) is that researchers who formulate questions and hypotheses should consider their own cultural background, as it may affect the research questions, their performance, and participant interpretation when testing occurs across different regions and cultures.

2.3.1. Case Study: “Approaches to Cross-Cultural Design: Two Case Studies with UX

Web-Surveys”

This study assesses the use of web surveys in two different cases: one covers an online gaming site, the other an online sports diary. The objectives for the online gaming site were to gain insights into how to design a good UX for new markets in the future (Walsh, 2012); the online sports diary evaluated customer usage over a period of three months. The sample sizes differed greatly: 11,238 participants on the gaming site were sent an invitation email, with only 632 responding, whereas 17 were recruited for the online sports diary, of whom 7 dropped out during the evaluation. A more effective response was noted for the online sports diary, which screened for willing volunteers prior to the evaluation. Both surveys were sent internationally; however, translations had to be considered prior to survey deployment (Walsh, 2012).


Therefore, the survey was created in both Swedish and Spanish, requiring the researchers to translate responses into English for collection. For the sports diary, an invitation questionnaire was sent first, allowing the researchers to screen for English-speaking participants and to collect internet and device usage information.
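The contrast between the two recruitment strategies can be quantified directly from the figures above (a quick illustrative calculation; the percentages themselves are not reported in the source):

```python
# Response and completion rates for the two web surveys described above.
invited, responded = 11238, 632   # gaming site: cold invitation email
recruited, dropped_out = 17, 7    # sports diary: pre-screened volunteers

gaming_response_rate = responded / invited
diary_completion_rate = (recruited - dropped_out) / recruited

print(f"Gaming site response rate: {gaming_response_rate:.1%}")
print(f"Sports diary completion rate: {diary_completion_rate:.1%}")
```

The cold email yields a response rate of roughly 5.6%, whereas the pre-screened diary panel retains about 59% of its participants, illustrating why screening for willing volunteers produced the more effective response.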

2.3.2. Case Study Review

It is believed the researchers followed best-practice approaches in both studies. However, in the diary study the survey could have been used in conjunction with UX Curve, which reflected more positive results with regard to customer satisfaction in the study by Sahar et al. (2014). Walsh (2012) again experienced the same difficulties as Sahar et al. (2014), with participation levels dropping in long-term studies. However, the benefits of the long-term diary format allowed for the collection of rich qualitative data together with context, which is an important cultural factor according to Gillham (2005).

The gaming-site study drew on the research of Soley and Smith (2008), which appeared to show that a “sentence completion survey” method was the most effective across cultures. This research could also be improved by introducing an invitation questionnaire to recruit willing participants initially, opening the research to additional forms of questioning.

3. Comparison of Evaluation Methods

Short-term studies such as Usability Testing and the web survey used for the gaming website (Section 2.3.1) are more cost effective, requiring less time to implement, run and analyse. However, short-term studies lack visibility of the emotions users experience over the long term, whereas an LTDS provides rich qualitative data during the evaluation because users provide feedback, usually on the same day of use, while the information is fresh.

It seems UX Curve is open to debate, as some researchers would argue that evaluating retrospectively during a long-term study can be open to biases (Norman, 2009). Moreover, during their study Sahar et al. (2014) found that although UX Curve requires less implementation effort, it requires additional time for analysing and converting user sketches to digital formats.

Although not in scope, iScale is worth studying further, as its application can certainly be improved; the original publication was issued in 2012, and technology has since advanced, so there is potential for this product to be every bit as intuitive as sketching on paper. If both projects were combined, perhaps a better UX evaluation method could emerge.


4. Conclusion

All evaluations present definite challenges, from participation levels to the time required to implement and coordinate the study and to evaluate the data. Given these facts, it is the author's opinion that, with current trends and technology, web surveys of any form are the most effective at acquiring rich data. Although their set-up may take longer than other methods, the ability to access a database of users quickly and easily makes them a strong candidate for addressing the majority of business objectives, especially from a costing perspective.

Interestingly, during the analysis of UX Curve the author questioned the feasibility of a digital platform addressing the same issue, since it would eliminate the time needed to convert sketches to a spreadsheet. Surprisingly enough, such an application has already been conceived. A more in-depth study is required to analyse the potential of merging UX Curve and iScale with current technology.

5. References

Desmet, P., Overbeeke, K., & Tax, S. (2001). Designing products with added emotional value: Development and application of an approach for research through design. The Design Journal, 4(1), 32-47.

Gossen, T., Hempel, J., & Nürnberger, A. (2013). Find it if you can: usability case

study of search engines for young users. Personal and Ubiquitous

Computing, 17(8), 1593-1603.

Gillham, R. (2005). Diary Studies as a Tool for Efficient Cross-Cultural Design. In

IWIPS (pp. 57-65).

Kujala, S., Roto, V., Väänänen-Vainio-Mattila, K., Karapanos, E., & Sinnelä, A.

(2011). UX Curve: A method for evaluating long-term user experience.

Interacting with Computers, 23(5), 473-483.

Kujala, S., Roto, V., Väänänen-Vainio-Mattila, K., & Sinnelä, A. (2011, June).

Identifying hedonic factors in long-term user experience. In Proceedings of the

2011 Conference on Designing Pleasurable Products and Interfaces (p. 17).

ACM.

Nielsen, J. (2012). How Many Test Users in a Usability Study? Nngroup.com.

Retrieved 10 February 2016, from https://www.nngroup.com/articles/how-

many-test-users/

Norman, S. (2016). Interaction Design Project - Anpost.ie (1st ed., pp. 3-4).

Monahan, K., Lahteenmaki, M., McDonald, S., & Cockton, G. (2008, September). An

investigation into the use of field methods in the design and evaluation of

interactive systems. In Proceedings of the 22nd British HCI Group Annual

Conference on People and Computers: Culture, Creativity, Interaction – Volume 1 (pp. 99-108). British Computer Society.

Reijneveld, K., de Looze, M., Krause, F., & Desmet, P. (2003, June). Measuring the

emotions elicited by office chairs. In Proceedings of the 2003 international

conference on Designing pleasurable products and interfaces (pp. 6-10). ACM.

Rohrer, C. (2014). When to Use Which User-Experience Research Methods.

Nngroup.com. Retrieved 7 January 2016, from

https://www.nngroup.com/articles/which-ux-research-methods/

Sahar, F., Varsaluoma, J., & Kujala, S. (2014, November). Comparing the

effectiveness of electronic diary and UX curve methods in multi-component

product study. In Proceedings of the 18th International Academic MindTrek

Conference: Media Business, Management, Content & Services (pp. 93-100).

ACM.

Schade, A. (2013). Remote Usability Tests: Moderated and Unmoderated.

Nngroup.com. Retrieved 10 February 2016, from

https://www.nngroup.com/articles/remote-usability-tests/

Soley, L., & Smith, A. (2008). Projective techniques for social science and business

research.

Usability.gov. (2016). Usability Testing. Retrieved 11 February 2016, from

http://www.usability.gov/how-to-and-tools/methods/usability-testing.html

Vermeeren, A. P., Law, E. L. C., Roto, V., Obrist, M., Hoonhout, J., & Väänänen-Vainio-Mattila, K. (2010, October). User experience evaluation methods: current state and development needs. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries (pp. 521-530). ACM.

Walsh, T., & Nurkka, P. (2012, November). Approaches to cross-cultural design: two

case studies with UX web-surveys. In Proceedings of the 24th Australian

Computer-Human Interaction Conference (pp. 633-642). ACM.

6. Bibliography

Allaboutux.org. (2016). All UX evaluation methods « All About UX. Retrieved 11 February 2016, from http://www.allaboutux.org/all-methods

Karapanos, E., Martens, J. B., & Hassenzahl, M. (2012). Reconstructing experiences

with iScale. International Journal of Human-Computer Studies,70(11), 849-865.

Jansen, M., Bos, W., van der Vet, P., Huibers, T., & Hiemstra, D. (2010, June).

TeddIR: tangible information retrieval for children. In Proceedings of the 9th

international conference on interaction design and children (pp. 282-285). ACM.

Norman, D. A. (2009). THE WAY I SEE IT Memory is more important than

actuality. Interactions, 16(2), 24-26.
