Towards Developing Computational Models to Predict

Perceived Visual Aesthetics of Website Interface Design

A Dissertation presented

by

Ahamed A. O. Altaboli

to

Department of Mechanical and Industrial Engineering

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the field of

Industrial Engineering

Northeastern University

Boston, Massachusetts

April 2012

ABSTRACT

This study comes within a framework of scientific research arguing that users’
perception of the visual aesthetics of computer interface design is related to visual screen
design features and layout elements. This framework is concerned with determining which
features trigger users’ perception of aesthetics and tries to express such features
numerically. The goal is to combine their collective effects using mathematical formulas
and computational models that would objectively predict perceived visual aesthetics.
The general purpose of this study is to continue the research efforts towards this goal.

In this study, rigorous experimental methods were used to verify and further improve
currently available measures and models. This included extending the application of these
measures and models to the case of website interface design. The viability of this extension
was assessed using standard questionnaires designed to measure perceived visual
aesthetics of website design.

Results of this study confirmed the findings of previous studies regarding the use of visual
features to predict perceived visual aesthetics. They further support the concept of
expressing such features using mathematical formulas and using these formulas in turn as
a basis for developing computational models to predict aesthetics. Results also showed that
such screen layout-based measures can work with website interfaces. Moreover, results
showed that objective screen layout-based measures relate to subjective questionnaire-based
measures. The relationship is particularly strong for questionnaire elements related to
screen layout. This further supports the suggestion that objective layout-based measures
could be used to assess the overall visual aesthetics of websites in general and, in
particular, aesthetic aspects related to the classical and simplicity dimensions of website
aesthetics.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION
  1.1 Acceptability of computer systems
  1.2 Aesthetics: definition and historical note
  1.3 Purpose of the study
    1.3.1 Study framework
    1.3.2 Objectives of the study and related research questions
    1.3.3 Organization of the study

CHAPTER 2 LITERATURE REVIEW AND BACKGROUND RESEARCH
  2.1 Aesthetics and Usability in Interface Design
    2.1.1 Aesthetics and perceived usability
    2.1.2 Role of context of use
    2.1.3 Aesthetics and performance
  2.2 Quantitative Measures and Models of Interface Aesthetics
    2.2.1 Objective screen layout-based measures
    2.2.2 Subjective questionnaire-based measures

CHAPTER 3 VERIFYING NGO AND BYRNE’S FINDINGS AND DEVELOPING A PRELIMINARY MODEL
  3.1 Introduction
  3.2 Method
    3.2.1 Design of the experiment
    3.2.2 Screen designs
    3.2.3 Participants and apparatus
    3.2.4 Procedure
  3.3 Results
    3.3.1 Participants’ ratings
    3.3.2 Analysis of variance
  3.4 Constructing and Validating the Regression Model
    3.4.1 Constructing the model
    3.4.2 Compare with Ngo and Byrne’s model
    3.4.3 Validating the model using standard questionnaire scores of real webpages
    3.4.4 Checking for correlations with simple counts measures

CHAPTER 4 FURTHER TESTING OF VISUAL LAYOUT ELEMENTS AND VALIDATING OF THE MODEL
  4.1 Introduction
  4.2 Method
    4.2.1 Experimental Design
    4.2.2 Procedure
  4.3 Results Analysis and Discussion
    4.3.1 One-question with mock-up screens trial
    4.3.2 One-question with webpages trial
    4.3.3 The Classical/Expressive questionnaire trials
    4.3.4 The VisAWI questionnaire trials
    4.3.5 Overall discussion
  4.4 Comparing Objective Measures with Subjective Measures
    4.4.1 Correlation analysis
    4.4.2 Proposed modification to the unity of form formula
    4.4.3 Incorporating the modified unity of form formula into the computational model

CHAPTER 5 CONCLUSIONS AND FUTURE WORK
  5.1 Summary of Experimental Work and Results
  5.2 Conclusions and Contributions
  5.3 Recommendations for Future Work

REFERENCES

APPENDIX A THE USED FORMULAS WITH EXAMPLES OF CALCULATIONS
APPENDIX B QUESTIONNAIRE SCORES AND MEASURES AND MODELS VALUES FOR THE 42 WEBPAGES
APPENDIX C QUESTIONNAIRE SCORES FOR EXPERIMENTAL TRIALS OF CHAPTER 4

LIST OF FIGURES

Fig. 3.1 The eight screen models associated with the experimental conditions
Fig. 3.2 Average effects and interactions plots
Fig. 3.3 Scatter diagram of actual and predicted aesthetic values for the eight screens
Fig. 3.4 Scatter diagram of actual and predicted (Ngo and Byrne model) aesthetic values for the eight screens
Fig. 3.5 Screenshots of webpages with the highest and lowest average questionnaire scores
Fig. 3.6 An example of how a webpage is divided into visual objects
Fig. 4.1 The eight abstract mock-up screens
Fig. 4.2 The eight webpage designs
Fig. 4.3 Average scores for the one-question, mock-up screens trial
Fig. 4.4 Average scores for the one-question, webpages trial
Fig. 4.5 Average scores for the Classical/Expressive questionnaire balanced trial
Fig. 4.6 Average scores for the Classical/Expressive unbalanced trial
Fig. 4.7 Average scores for the VisAWI balanced trial
Fig. 4.8 Average scores for the VisAWI unbalanced trial

LIST OF TABLES

Table 2.1 Scales and items in the Classical/Expressive and the VisAWI questionnaires
Table 3.1 The eight experimental conditions and the associated factor levels and values
Table 3.2 Calculated aesthetic values and participants' average aesthetic ratings
Table 3.3 Analysis of variance results
Table 3.4 Actual and predicted aesthetic values of the eight screens
Table 3.5 Calculated values of the five terms (elements) included in Ngo and Byrne (2001) model
Table 3.6 Actual and predicted (Ngo and Byrne model) aesthetic values of the eight screens
Table 3.7 Actual and predicted (current model) aesthetic values of the 57 screens of Ngo and Byrne (2001) study
Table 3.8 Summary of total errors of each type per group. Descriptive statistics for questionnaire scores for the 42 webpages
Table 3.9 Descriptive statistics for the measures and the models for the 42 webpages
Table 3.10 Correlations between the measures and questionnaire scores
Table 3.11 Descriptive statistics for the selected count-based measures for the 42 webpages
Table 3.12 Correlations between objective simple count-based measures and subjective questionnaire-based measures
Table 4.1 The eight designs and the associated factor levels and values
Table 4.2 Experimental trials and participant information
Table 4.3 Descriptive statistics for average scores for the one-question, mock-up screens trial
Table 4.4 ANOVA for average scores for the one-question, mock-up screens trial
Table 4.5 Descriptive statistics for average scores for the one-question with webpages trial
Table 4.6 ANOVA for average scores for the one-question webpages trial
Table 4.7 Descriptive statistics for average scores for the Classical/Expressive balanced trial
Table 4.8 ANOVA for average scores for the Classical/Expressive balanced trial
Table 4.9 Descriptive statistics for average scores for the Classical/Expressive unbalanced trial
Table 4.10 ANOVA for average scores for the Classical/Expressive unbalanced trial
Table 4.11 ANOVA for balance and scales for the Classical/Expressive trial
Table 4.12 Descriptive statistics for average scores for the VisAWI balanced trial
Table 4.13 ANOVA for average scores for the VisAWI balanced trial
Table 4.14 Descriptive statistics for average scores for the VisAWI unbalanced trial
Table 4.15 ANOVA for average scores for the VisAWI unbalanced trial
Table 4.16 ANOVA for balance and scales for the VisAWI trial
Table 4.17 Summary of results for all experimental trials
Table 4.18 Summary of average scores for all experimental trials
Table 4.19 Correlation coefficients of average scores for all eight designs (balanced and unbalanced)
Table 4.20 Correlation coefficients of average scores for the balanced condition
Table 4.21 Values of measures and models for the eight designs
Table 4.22 Correlation coefficients of measures and models for all eight designs (balanced and unbalanced)
Table 4.23 Correlation coefficients of measures and models for the balanced condition
Table 4.24 Differences between average total scores for each design pair
Table 4.25 Correlation coefficients for values of unity of form computed by the original formula
Table 4.26 Comparing differences with values of unity of form of both original and modified formula
Table 4.27 Correlation coefficients for values of unity of form computed by the modified formula
Table 4.28 Values of measures and actual and predicted aesthetic values of the eight screens for the original and the modified model
Table 4.29 Values of the three measures and the model for the eight webpage designs
Table 4.30 Correlation coefficients of the models for all eight designs (balanced and unbalanced)
Table 4.31 Correlation coefficients of the models for the balanced condition
Table 4.30 Correlation coefficients of the models for all the 42 webpages of chapter 3

CHAPTER 1

INTRODUCTION

1.1 Acceptability of computer systems

In developing a framework for usability, Shackel (1991) introduced a paradigm explaining
what affects users’ decisions to accept (purchase) a system. In this paradigm, the
acceptability of a system depends on a balance between its cost and three design factors:
utility (functionality), usability, and likeability. Utility is related to two questions: will
the system be useful, and will it perform the expected function? Usability is related to the
question of whether the system can be used successfully. Likeability refers to whether the
users like the system and feel it is suitable. The acceptability (purchase) decision is made
by balancing these three factors in a trade-off with the cost of the system.

Aesthetics is related to the likeability factor. The visual appeal of a system can affect the
user’s first impression of and opinion about the system, and how suitable and easy to use it
appears (discussed further in the following sections).

When interactive computer systems emerged in the second half of the twentieth century,
the early systems were set up in specialized research centers and were used mostly by
professionals and scientists, who regarded them as a means of achieving their research
goals. These users were willing to tolerate whatever usability issues they faced and to spend
whatever time was needed to master these tools (Bailey, 1982). Consequently, designers at
that time were more concerned with the functionality of their systems; usability and
likeability were not a major concern.

In the 1970s and 1980s, following advances in the related technologies, computers became
cheaper and more powerful. Commercial types of computers (e.g. personal computers) were
introduced and their use started spreading among the general public. For these systems to be
accepted, ease of use and usability issues could no longer be ignored. Usability became the
dominant acceptability factor and an important element in design.

In research and academia, these developments were reflected in the establishment of the
Human Computer Interaction field in the early 1980s. One of the main concerns of this field
upon its establishment was the usability of computer systems. Numerous studies have been
conducted and volumes of design standards and guidelines have been published, leading to
sufficient understanding of many aspects of usability. However, most of these guidelines
neglected the aesthetics of the user interface, and many insisted that aesthetics should only
be used in the design to support usability (Nielsen, 1993; Dix et al., 2004). Some even argued
that introducing aesthetics in the design would have a negative effect on usability (Ngo et
al., 2002).

This orientation began to change in the late 1990s, largely motivated by the widening use of
the internet and the web. In today’s societies, with the wide spread of the web and its social
networks, computer systems are no longer considered just tools for carrying out daily tasks;
many also consider them an important aspect of social communication. Aesthetic aspects
became more recognized in human computer interaction and in engineering and product
design in general. A considerable number of studies and publications concerned with
aesthetics have appeared in recent years (e.g. Jordon, 1998; Liu, 2003a & 2003b; Norman,
2004). Many of these studies showed that visual aesthetics in interface design can affect
users’ perception of the ease of use of the interface (Kurosu & Kashimura, 1995; Tractinsky,
2000), and some even argue that visually appealing interfaces might have positive effects on
performance (Altaboli et al., 2010; Moshagen et al., 2009; Sonderegger & Sauer, 2010).

1.2 Aesthetics: definition and historical note

As a word, the Oxford online dictionary (2012) defines aesthetics as “a set of principles
concerned with the nature and appreciation of beauty, especially in art”. The Merriam-Webster
online dictionary (2012) defines aesthetics as “a particular theory or conception of beauty or
art; a particular taste for or approach to what is pleasing to the senses and especially sight”.
As a term, aesthetics is related to the study of beauty or the perception of beauty, as the
meaning of the original Greek term implies. The term “Aesthetics” was first used by Alexander
Baumgarten in 1735 in his book “Reflections on Poetry” (Reich, 1993). The term was later
used to represent a discipline within philosophy, established in 1790 by Kant in the book
“Critique of Judgment” (Liu, 2003a). This discipline deals with topics such as the analysis of
the beautiful and the sublime, the logic of aesthetic judgments, and the moral function of the
aesthetic.

In experimental psychology, the first attempts to investigate quantitative relationships
between psychological responses and physical stimuli were made by Fechner (1876, as cited in
Liu, 2003a). His approach involved manipulating the dimensions of visual objects (such as
polygons) in order to find relationships between aesthetic response and the manipulated
dimensions (Liu, 2003a). This bottom-up approach influenced many later researchers.
Birkhoff (1933) used it to develop a universal aesthetic measure, presented as a
mathematical formula. The formula was applied to measure the aesthetics of geometrical
forms (polygons and vases), melody and harmony in music, musical quality in poetry, and
the arts.
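
For orientation, Birkhoff's measure is commonly stated as a ratio of order to complexity. The symbols below follow that common statement and are not taken from this chapter; how order and complexity are quantified depends on the class of objects being measured.

```latex
% M: aesthetic measure, O: order, C: complexity (notation assumed here)
M = \frac{O}{C}
```

Under this form, for a fixed complexity the measure rises with the amount of order perceived in the object, and for a fixed order it falls as complexity grows.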

In human factors and ergonomics, until recently, aesthetics was completely ignored as a
topic of systematic scientific research (Hoffmann & Krauss, 2004; Liu, 2003a & 2003b). Two
recent methodologies have been widely accepted as a basis for incorporating users’ feelings
and aesthetic aspects of design into human factors methodology: Kansei
engineering/ergonomics and the dual-process engineering aesthetics research methodology.

Kansei engineering was introduced in the late 1980s in Japan by Mitsuo Nagamachi
(Nagamachi, 1995). It was developed as an ergonomic and consumer-oriented technology for
producing new products (Nagamachi, 2002). Kansei is a Japanese term that refers to a
consumer’s psychological feeling and image regarding a new product. Kansei engineering
aims to translate the customer’s feelings and demands (Kansei) into product function and
design (Nagamachi, 1995). The procedure uses various techniques to capture consumers’
feelings about a new product and translate them into design characteristics of the product.

Yili Liu (Liu, 2003a) proposed the establishment of a new scientific and engineering
discipline, which he named “Engineering Aesthetics”, to systematically incorporate
engineering and scientific methods into the aesthetic design and evaluation process. He
developed the dual-process methodology as a comprehensive research methodology for
“Engineering Aesthetics”. The methodology consists of two parallel but closely related lines
of research. The first process is called “multidimensional construct analysis or multivariate
psychometric analysis”; its goal is to establish a “global”, “top-down”, and quantitative view
of the critical dimensions involved in a specific aesthetic response process. The second
process is called “psychophysical analysis”; its objective is to establish a “local”,
“bottom-up”, and quantitative view of the individual’s perceptual abilities in making fine
aesthetic distinctions along selected dimensions. It identifies how keen the perceivers’
senses are in detecting variations along critical aesthetic dimensions and how their
preference levels change as a function of specific design parameters or aesthetic variables.

As an example, in judging the visual aesthetics of screen design, the first process would be
concerned with finding which overall attributes of screen design affect users’ perception of
aesthetics, e.g. the symmetry or balance of the screen, or the number of visual objects on
the screen. The second process would be concerned with finding how manipulating these
attributes specifically affects aesthetics, e.g. how changing the number of objects on the
screen affects aesthetics.

In comparison with Kansei engineering, Liu claimed that the dual-process methodology is a

more comprehensive methodology that includes Kansei engineering as a special case (Liu,

2003).

In addition to the above two methodologies, Norman’s cleverly written book “Emotional
Design” (Norman, 2004) was another milestone in revealing the importance of considering
the aesthetic aspects of product design in the field of human factors and ergonomics.

1.3 Purpose of the study

1.3.1 Study framework

Two approaches to measuring interface aesthetics can be distinguished in the literature. The
first is an objective quantitative approach relating screen design layout elements to the
user’s perception of visual aesthetics (e.g. Ngo et al., 2003; Bauerly & Liu, 2006). It is
concerned with determining which features in the interface design trigger users’ perception
of the aesthetics of the interface. It also explores the possibility of expressing changes in
such features as numerical values and using these values to assess users’ perception of
interface aesthetics. Methods in this approach are motivated by earlier aesthetic measures
developed by Birkhoff (1933), Tullis’ quantitative techniques for evaluating screen design
(Tullis, 1983 & 1988), and Gestalt theory for visual design (Chand et al., 2002; Ngo et al.,
2002).

The second approach is a subjective questionnaire-based approach. Supporters of this
approach argue that the complexity of the screen design elements and the interrelationships
among them make it difficult to use these elements to measure aesthetics quantitatively
(Lavie & Tractinsky, 2004). In their view, it is more convenient to use questionnaire-based
instruments to measure users’ subjective perception of aesthetics. Two of the most widely
accepted such instruments are the Classical/Expressive Aesthetics questionnaire developed
by Lavie and Tractinsky (2004) and the Visual Aesthetics of Website Inventory (VisAWI)
developed by Moshagen and Thielsch (2010). Both were designed to measure perceived
visual aesthetics of websites.

The objective methods represent a bottom-up approach. This approach has its roots in the
rationalistic philosophical view of aesthetics (Reich, 1993). It embodies the concept of
“beauty in the observed object”; i.e. human perception of beauty is based on the order and
organization of the various components that make up the object. The subjective methods, on
the other hand, reflect a top-down approach. This approach is based on the concept of
“beauty in the mind of the observer”, the main principle of the romanticist philosophical
view of aesthetics (Reich, 1993). It states that human perception of beauty is based on the
whole form of an object (influenced by cultural beliefs) and that aesthetics cannot be
evaluated by looking separately at the various components constituting an object.

This study can be categorized mainly within the framework of the objective quantitative
approach. Its general purpose is to continue the research efforts towards the major goal of
developing overall measures of visual aesthetics of interface design. Its main goal is to
verify some of the latest findings in this line of research and to try to further improve
currently available measures and models.

The focus will be on developing computational models based on visual features and screen
layout elements of the interface. More rigorous experimental methods will be used to verify
previous findings and validate currently available measures and models. Exploring possible
further developments of and improvements to these measures and models will be part of the
verification and validation procedures of this study.

Although this study falls mainly within the framework of the first, objective approach,
subjective questionnaire-based measures of the second approach will be used in the
validation process to assess and evaluate the tested objective layout-based measures and
models.

The rationale for studies in the objective approach (including the current study) is that the
majority of the available interface and screen design guidelines are qualitative (e.g. Galitz,
2007; Shneiderman, 2010). They consist mostly of qualitative descriptions and summaries
(Bauerly & Liu, 2006) that leave designers with no quantitative tools to evaluate and compare
their design alternatives, and leave many of the design decisions to the subjective views of
the designers. This study and previous studies (Ngo et al., 2002; Bauerly & Liu, 2006) argue
that quantitative measures that provide numerical values for different designs based on
interface and screen design characteristics can be very helpful in many design situations.
These numerical tools can be especially helpful in the early stages of design. They can assist
in preparing design alternatives and can reduce the number of prototypes that undergo tests
with human users in later stages of design. These tools are not meant to replace human
designers; they are intended to serve as numerical tools that help designers and researchers
evaluate different design alternatives without the need for human participants and
understand the extent to which their designs will affect usability. Moreover, these measures
can provide researchers with quantitative tools that help in the systematic study of different
design aspects and give a numerical basis for direct comparison between different design
proposals. These measures can also be useful where on-the-fly designs are needed by
non-professional designers, as in online tools for designing websites (Lai et al., 2010).

1.3.2 Objectives of the study and related research questions

Specific objectives of the study include:

1. Verify the findings of previous studies and validate the available objective measures and
models of visual aesthetics of computer interfaces using a more rigorous experimental
approach based on statistical testing and design of experiments techniques.

2. Explore the possibility of improving the available measures and models and/or of
developing new, more compact and efficient ones.

3. Extend the research domain to the case of website interface design. The goal is to see
whether these measures and models, which have proved to work in many situations involving
traditional graphical user interfaces, are applicable to website interfaces.

4. Compare objective layout-based measures of visual aesthetics with subjective
questionnaire-based measures. Earlier observations suggest that the objective layout-based
measures would correlate with the questionnaire scales related to screen layout in the
subjective questionnaire-based measures; this study will examine this further. Moreover,
these comparisons should help in assessing the tested objective measures and models
using the subjective questionnaire-based measures.
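
The comparison described in objective 4 amounts to correlating two sets of values obtained for the same designs. The following is a minimal sketch of such a comparison, assuming Pearson's r as the correlation coefficient (this section does not name the statistic). The function name `pearson_r` and the numbers are hypothetical placeholders for illustration, not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from this study:
# one objective layout-based value and one mean questionnaire score per design.
layout_measure = [0.42, 0.55, 0.61, 0.70, 0.83]
questionnaire_score = [3.1, 3.4, 4.0, 4.2, 5.0]

r = pearson_r(layout_measure, questionnaire_score)
print(f"r = {r:.3f}")
```

A value of r close to 1 would indicate that designs scoring higher on the objective measure also tend to receive higher questionnaire scores; repeating the computation per questionnaire scale is one way to see which scales relate most closely to layout.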

Part of the potential contribution of this study to the knowledge base in the field of human
factors and ergonomics in general, and in human computer interaction and interface design
in particular, is an attempt to find clearer answers to the following open research
questions:

1. Can visual aesthetics be measured quantitatively and objectively using mathematical
formulas and computational models? More specifically, do certain visual layout elements
in the interface relate to perception of visual aesthetics? If yes, can these elements be
used as a basis for developing objective measures and models of visual aesthetics?

2. Do objective layout-based measures of visual aesthetics relate to subjective
questionnaire-based measures? If yes, what is the extent of this relationship? Do the
quantitative values produced by both types of measures correlate for all dimensions of
visual aesthetics, or is the correlation limited to certain dimensions? Do specific
objective measures relate only to specific subjective measures?

1.3.3 Organization of the study

The rest of this report is organized as follows. Chapter 2 provides a literature review and
related background research. It covers the latest findings concerning the relationship
between visual aesthetics and usability in interface design, with consideration of the effects
of aesthetics on perceived usability and performance. Next, a classification of quantitative
measures and models of interface aesthetics is given, with coverage of the most recently
developed measures and models of visual aesthetics.

Chapters 3 and 4 cover the experimental work of this study. Chapter 3 covers an experiment
conducted to investigate the effects of selected visual elements and to incorporate these
elements into a regression model to predict perceived visual aesthetics. Experimental work
to validate the model and compare it with earlier models is also covered in this chapter.

Chapter 4 reports experimental work carried out to further investigate the effects of certain
visual elements on perceived visual aesthetics of website interface design. Comparisons of
the objective and subjective measures used in the experimental trials, and a proposed
modification of the model based on these comparisons, are also presented in this chapter.

The final chapter contains a summary of the results, conclusions, and recommendations for
future work.

CHAPTER 2

LITERATURE REVIEW AND BACKGROUND RESEARCH


2.1 Aesthetics and Usability in Interface Design

Attention to the importance of aesthetics in interface design began with the findings of Kurosu

and Kashimura (1995). Using different designs of an automated teller machine interface,

they found a high correlation between users’ prior perception of usability (they called

it apparent usability) and users’ perception of visual aesthetics of the interface. Participants

perceived the visually appealing interface designs as easier to use.

More studies followed, aiming at understanding the nature of the relationship between

visual aesthetics of an interface and its usability. Earlier studies concentrated on verifying

Kurosu and Kashimura's results using different interface designs and test setups. Later

studies examined the role of context of use in the aesthetics and usability relationship, and

the most recent studies looked for a possible positive effect of interface aesthetics on

users’ performance.

2.1.1 Aesthetics and perceived usability

In an attempt to demonstrate that Kurosu and Kashimura's (1995) findings were culturally

dependent, Tractinsky (1997) replicated the study in a different cultural setting using a more

rigorous methodology. Kurosu and Kashimura's study was conducted in Japan. Tractinsky

claimed that Japan’s culture is known for its aesthetic traditions and that Japanese users would have

a more positive attitude towards aesthetics in computer interfaces. This attitude might have led

to Kurosu and Kashimura's results. Tractinsky conducted his study in Israel, a culture known

for its action orientation and supposed to have a less positive attitude towards aesthetics.


Unexpectedly, a higher correlation between interface aesthetics and perceived usability was

found. This result further supported Kurosu and Kashimura's findings and suggested a strong

relationship between interface aesthetics and perceived usability. Furthermore, this strong

relationship between user perception of interface aesthetics and perceived usability remains

intact even after actual use of the system (Tractinsky et al., 2000). Tractinsky et al. (2000)

conducted a study to see if actual use of the system would change users’ perception of

visually aesthetic interfaces as easy to use. Results showed the same high correlation

between visual aesthetics and post-use perceived usability.

Lavie and Tractinsky (2004) found that users’ perception of interface aesthetics consists of

two main dimensions, which they termed “classical aesthetics” and “expressive aesthetics”. The

classical aesthetics emphasize orderly and clear design and are closely related to many of the

usability and interface design rules and guidelines. The expressive aesthetics dimension is

linked to the designers’ creativity and originality and to the ability to break design

conventions. They also developed a questionnaire-based instrument to measure each of the

two dimensions.

Lindgaard et al. (2006) performed a number of experiments to determine how quickly users'

first impressions of the visual appeal of websites are formed. Their results indicated that

users' immediate aesthetic impressions formed very quickly, within 50 milliseconds. Tractinsky

et al. (2006) replicated and extended Lindgaard et al.'s (2006) study to test if these immediate

impressions would remain stable over time. Tractinsky et al. (2006) confirmed Lindgaard et al.'s

(2006) findings and showed that users' first impressions, formed after a short exposure to the

webpages, remained stable even after a considerably longer exposure.


Phillips and Chapparro (2009) examined users’ impression of usability in case of users

performing search and exploratory tasks on websites which varied in visual appeal and

usability. Their results indicate that first impressions are most influenced by the visual appeal

of the site. Users rated sites with high visual appeal and low usability as easier to use, and

gave lower ratings to sites with low visual appeal and high usability.

Tractinsky et al. (2000) suggested that the positive effect interface aesthetics has on

perceived usability resembles the known phenomenon of "beautiful is good" in the field of

social psychology relating physical attractiveness to socially desirable characteristics. Dion et

al. (1972, as cited in Tractinsky et al., 2000) found that people who are physically attractive are

assumed (by other people) to possess more socially desirable personality traits than persons

who are unattractive. This phenomenon was also reported in the marketing and consumer

behavior literature (Tractinsky et al., 2000) as responsible for a carryover of first impressions of

products or shopping environments to consumers' evaluations of other attributes of these

products or environments.

Two terms are used to refer to this phenomenon, "halo effect" in the consumer behavior

literature and "confirmation bias" in the human-decision making and judgment literature

(Lindgaard et al., 2006). Confirmation bias states that people tend to seek confirming evidence

of their first impressions and ignore disconfirming evidence (Phillips and Chapparro, 2009).

These first impressions of buyers of a product or users of a website are strongly influenced by

physical appearance and visual appeal of the product or the site (Lindgaard et al., 2006;

Phillips & Chapparro, 2009). Users, who like the appearance of a website when they first see

it, may continue to like it regardless of how successful they are in using the site.


2.1.2 Role of context of use

The role of context of use was raised by conclusions of a study conducted by Hassenzahl

(2004) contradicting Kurosu and Kashimura (1995), Tractinsky (1997), and Tractinsky et al.

(2000) findings of high correlation between interface aesthetics and perceived usability.

Hassenzahl used MP3 player skins to investigate the relationship among users' perceptions

of beauty (visual aesthetics), goodness (satisfaction), and usability before using the system and

after actually using the system. Based on the study results, Hassenzahl argued that if users are

faced with actual usability problems while using a system, then users’ perception of usability

will no longer relate to visual aesthetics. In such a case, actual usability (experienced during

use of the system) will be the main determinant of the post-use perceived usability.

Hassenzahl's findings pointed to a possible effect of context of use on the relationship

between visual aesthetics and perceived usability.

Ben-Bassat et al. (2006) argued that aesthetics appreciation would be more dominant in less

serious contexts where users’ judgments have no later consequences. In a laboratory setting

where users are not going to actually buy a system, their judgment of systems’ acceptability

may be more related to aesthetic aspects rather than actual performance issues. Ben-Bassat et

al. (2006) conducted a study in which users evaluated aesthetics and usability of on-screen

simulations of computerized phone book systems using both subjective and economic

measures. Auction bids were used as an economic measure of the system in the sense that they

would require users to give market values (prices) to the systems that they are willing to pay to

acquire the systems. With the subjective questionnaire-based measures, results showed the

same strong relationship between perceived aesthetics and perceived usability before and after


using the system. However, with the auction bids, this relationship was not evident. Users’

bids in auctions were more related to performance and usability of the systems, i.e., if users

are held responsible for their judgments, then actual usability aspects will influence their decisions

more than visual appeal of the systems.

De Angeli (2006) and later Hartmann et al. (2008) studied the effect of context of use

represented in two interaction styles of a website on perception of aesthetics and usability.

One interaction style of the website is more traditional and menu-based; the other is more

interactive, exploiting metaphor and humor effects. Both designs have the same content. Using

Lavie and Tractinsky's scales to evaluate classical and expressive aesthetics, they found that

the metaphor based design was perceived as having better expressive aesthetics although it

had worse perceived usability. The more serious menu-based design rated high on the classical

aesthetics scale and was perceived as having better usability. They concluded that when the

context is less serious and implies more fun and engagement, as in the metaphor-based design,

aesthetics can have a strong halo effect even on information content and users are willing to

tolerate usability problems for more engaging interfaces. On the other hand, with the more

serious usage contexts, usability appears to have a positive halo effect on content. In such

context, users prefer an easier-to-use interface with information content presented in a clear

and orderly fashion that simplifies information access.

Similar conclusions were reached by Van Schaik and Ling (2009) regarding the halo effect of

usability on perception of aesthetics in the case of an information-oriented context. Their findings

indicated that after participants were briefly exposed to the stimuli webpages used in the

study, classically aesthetic webpages that are information oriented were rated as more


attractive than expressively aesthetic pages. Another interesting finding of this study is the

effect of context on the stability of aesthetic perception. Providing a context and a goal of use

increases the stability of users’ aesthetic judgments from perceptions after brief exposure to those

after self-paced exposure, and from perceptions after self-paced exposure to those after site

use.

2.1.3 Aesthetics and performance

Although the positive effect of interface aesthetics on subjective perception of usability has become

clearer, it is still unclear what effect aesthetics has on performance. In almost all of the

previous studies dealing with interface aesthetics and usability, usability was evaluated

using subjective questionnaires. Whether this positive relation between interface

aesthetics and usability holds when objective performance measures of usability are used is

yet to be established. So far, results of recent studies addressing the effect of aesthetics on

performance show inconsistent findings. Some reported no effect or a negative effect of aesthetics on

performance. For example, Schmidt et al. (2003) found no significant effect of webpages

with different graphics and font sizes on participants’ interaction time in a reading

comprehension task. Van Schaik and Ling (2009) found that the number of completed tasks was

significantly lower with more appealing webpages in an information retrieval task.

On the other hand, many of the findings of recent studies indicated positive effects of

aesthetics on performance. Cawthon and Vande Moere (2007) found a high positive

correlation between data visualization techniques rated high in aesthetics and objective

usability measures of efficiency and effectiveness.


Using a website providing health-related information as stimulus, Moshagen et al. (2009)

found a significant effect of aesthetics on completion time in a low usability condition when

participants completed search tasks. They concluded that high aesthetics could enhance

performance under conditions of poor usability.

Sondergger and Sauer (2010) examined the effect of visual aesthetics on perceived usability

and performance. They employed two designs of cell phones (highly appealing vs. not

appealing) simulated on a computer screen. Participants were asked to complete a number of

typical tasks of cell phone users. Results showed that the visual appearance of the phone had a

positive effect on performance, leading to reduced completion time and number of errors for

the visually appealing design. The same positive relation between aesthetics and perceived

usability was also reported.

Sondergger and Sauer (2010) argue that the inconsistency in findings regarding the effect of

aesthetics on performance measures could arise because aesthetics may have positive or

negative effects depending on the context of use. A positive “increased motivation” effect may

be more likely to occur in a serious work context. Aesthetically pleasing designs might put the

user at ease or in flow, which may improve performance. On the contrary, a negative

“prolongation of joyful experience” effect might be prevailing in a leisure context. The user

taken by the beauty of the product may concentrate less on the task at hand and try to extend

the enjoyment time, which may reduce performance.


2.2 Quantitative Measures and Models of Interface Aesthetics

In general, two approaches to measure interface aesthetics can be distinguished in the

literature. The first is an objective approach relating screen design features and layout

elements to the users' perception of visual aesthetics (e.g. Bauerly & Liu, 2006; Ngo et al.

2003). The second one is a subjective approach, utilizing questionnaire-based instruments to

measure users' perception of visual aesthetics (e.g. Lavie & Tractinsky, 2004).

2.2.1 Objective screen layout-based measures

This approach represents a bottom-up procedure. It has its roots in the rationalistic

philosophical view of aesthetics (Reich, 1993). This approach comprises the concept of

“beauty in the observed object”; i.e. human perception of beauty is based on the order and

organization of the various components constructing the object. It is concerned with

determining what features in the interface design trigger users’ perception of aesthetics of the

interface. It also tries to explore the possibility of expressing changes in such features using

numerical values and using these numerical values to assess users' perception of interface

aesthetics.

The techniques used by methods in this approach can be traced back to the work of Tullis

(1983 & 1988). Tullis' approach involves the establishment of objective quantitative measures

based on display characteristics. These characteristics should reflect how usable the design of

the display is and should be used to evaluate the display design without the need for collecting

performance data. Tullis applied his approach to alphanumeric displays; he proposed four


measures that can be used to evaluate usability of alphanumeric displays: overall density, local

density, grouping, and layout complexity. They were successfully applied to two case studies

and gave similar results when compared to human performance data (Tullis, 1983). More

studies were conducted based on Tullis' concepts; some used the same four measures

developed by Tullis (e.g. Comber & Maltby, 1995; Miyoshi & Murata, 2001) and others tried

to come up with more measures based on screen layout (e.g. Streveler & Wasserman, 1984;

Sears, 1993).

Methods in this approach can be divided into two categories. The first simply uses numerical

counts of visual features on the screen (such as number of objects and number of images) and

relates them to users’ perception of aesthetics. The second uses mathematical formulas to

express more sophisticated visual design features and concepts (such as symmetry and balance)

and relates them to users’ perception of aesthetics.

a. Simple counts measures.

Visual features used in this category include number of constructing elements or blocks and

chunks of information on the screen (Bauerly & Liu, 2006 & 2008; Michailidou et al., 2008),

number of images (Bauerly & Liu, 2006 & 2008; Djamasbia, 2010; Michailidou et al., 2008),

image size and font size (Djamasbia, 2010; Schmidt et al., 2003), and JPEG file size of screenshots

of websites (Tuch et al., 2010). All the features mentioned above have been tested in the

studies cited next to them, all with results indicating some sort of relationship between these

measures and users’ perception of visual aesthetics.


b. Formularized measures.

Methods in this category argue that physical layout of visual objects on the screen may play a

role in users’ perception of aesthetics. The procedure involves expressing visual design

features (such as symmetry, balance, and unity) using mathematical formulas and combining

calculated values for all features to build an overall measure that would reflect the aesthetic level

of the interface design.

Methods in this approach are motivated by Tullis’ quantitative techniques for evaluating

screen design (Tullis, 1983), earlier aesthetic measures developed by Birkhoff (1933), and

Gestalt theory for visual design (Chand, 2002; Ngo et al., 2002).

One such measure is the model developed by Ngo et al. (2003). The model consists of

fourteen proposed measures of screen aesthetics: balance, symmetry, equilibrium, unity,

sequence, density, proportions, cohesion, simplicity, regularity, economy, homogeneity,

rhythm, and order. The value of each measure can be calculated using formulas based on the

layout of visual objects on the screen. The average of all these measures represents the overall

aesthetic value of the screen. When testing these measures using real computer screens, high

correlation was found between the model's computed aesthetic value and users' perceived

aesthetics of the interface.

In one study in which the model was applied to data entry screens (Ngo & Byrne, 2001), a

total of 57 screens with different aesthetic values were tested and multiple regression was used

to fit subjective ratings of the screens (obtained from seven participants)

to the measures (calculated by the model). Results showed that the regression model was

statistically significant and that the measures of balance, unity, and sequence contributed

most to the model. This model could be considered one of the most successful

attempts to develop aesthetic interface measures based on interface layout. However, the

relatively large number of measures (14) and the associated formulas needed to calculate each

of them make practical application of the model somewhat difficult.

In a practical application of the model, Zain et al. (2008) designed a computer application to

incorporate only five of the fourteen measures proposed by Ngo et al. (2003). The five

selected measures were: balance, equilibrium, symmetry, sequence, and rhythm. The software

was applied to language learning webpages. Findings of the study showed some accordance

with users' ratings, but no statistical test was used to reach conclusive results. These

inconclusive results could be due to the fact that not all the significant measures, as

detected in Ngo and Byrne's (2001) study, were included in their software and that the

possibility of interactions among the measures wasn't considered.

Bauerly and Liu (2006 & 2008) tested the effects of symmetry and number of compositional

elements on interface aesthetics. Basically, their findings were similar to those of Ngo et al. (2003).

However, it was difficult to practically compare their findings with Ngo et al.'s (2003)

study, because they used a different approach and different formulas to calculate the values of

the two tested measures in their experiments.

Lai et al. (2010) utilized the quantitative measures of symmetry and balance used by Bauerly

and Liu (2006 & 2008) to quantitatively analyze the aesthetics of a text-overlaid image such

that a best position for overlaying the texts on a background image can be obtained

automatically. The two measures were evaluated against participants’ subjective rating of

visual aesthetic appeal in cases of color and monochrome images. A strong relationship


between balance and overall aesthetic appeal was shown in both cases. No consistent

proportional relationship between symmetry and subjective ratings of aesthetic appeal was

shown.

Bi et al. (2011) repeated Bauerly and Liu's (2008) study to investigate effects of symmetry

and number of compositional elements on Chinese users. The goal was to compare with

Bauerly and Liu's study, which was conducted with American users. Similar results were found

regarding the positive effect of symmetry on participants' ratings of perceived visual aesthetics.

Different results were found in one case with the number of compositional elements. The

study also reported the development of a computational model to predict aesthetic ratings

based on symmetry and number of compositional elements. The model showed an acceptable

level of performance when evaluated using participants' ratings from the same study.

However, validity of the model was not thoroughly tested using different settings and other

groups of participants.

2.2.2 Subjective questionnaire-based measures

Supporters of this approach claim that the complexity and interrelated relationships among the

screen design elements make it difficult to use them to quantitatively measure aesthetics

(Lavie and Tractinsky, 2004). It would be more convenient to use questionnaire-based

instruments to measure users’ subjective perception of aesthetics. Two widely accepted

instruments of this kind are the classical and expressive instrument developed by Lavie and

Tractinsky (2004) and the Visual Aesthetics of Website Inventory (VisAWI) tool developed

by Moshagen and Thielsch (2010). Both were designed to measure perceived visual aesthetics


of websites. Scales and items of both questionnaires are shown in Table 2.1.

Lavie and Tractinsky (2004) found two dimensions of the perceived website aesthetics,

termed “classical aesthetics” and “expressive aesthetics”. The classical aesthetics dimension

emphasizes orderly and clear design and is closely related to many of the usability and

interface design rules and guidelines. The expressive aesthetics dimension is linked to the

designers’ creativity and originality and to the ability to break design conventions. These two

dimensions were the basis for developing a quantitative questionnaire-based instrument to

measure website interface aesthetics. The classical dimension includes the items “aesthetic”,

“pleasant”, “symmetric”, “clear”, and “clean”, while the expressive aesthetics includes the

items “creative”, “fascinating”, “original”, “sophisticated”, and “uses special effects”.

VisAWI was constructed to serve as a new tool to measure perceived website aesthetics. It

was designed to provide a tool that would cover broader aspects of perceived website

aesthetics that weren't adequately represented in earlier instruments. The instrument is based on

four interrelated facets of perceived visual aesthetics of websites: simplicity, diversity,

colorfulness, and craftsmanship. Simplicity comprises visual aesthetics aspects such as

balance, unity, and clarity. It is closely related to the classical aesthetics dimension. The

diversity facet comprises visual complexity, dynamics, novelty, and creativity. It is closely

related to the expressive aesthetics dimension. The colorfulness facet represents aesthetic

impressions perceived from the selection, placement, and combination of colors.

Craftsmanship comprises the skillful and coherent integration of all relevant design

dimensions. Each of the first two facets is represented by five items in the questionnaire, while

each of the last two facets has four items.


Table 2.1 Scales and items in the Classical/Expressive and the VisAWI questionnaires

Questionnaire         Scale          Item
Classical/Expressive  Classical      aesthetic
                                     pleasant
                                     clear
                                     clean
                                     symmetric
                      Expressive     creative
                                     fascinating
                                     original
                                     sophisticated
                                     special effect
VisAWI                Simplicity     The layout appears too dense.
                                     The layout is easy to grasp.
                                     Everything goes together on this webpage.
                                     The webpage appears patchy.
                                     The layout appears well structured.
                      Diversity      The layout is pleasantly varied.
                                     The layout is inventive.
                                     The design appears uninspired.
                                     The layout appears dynamic.
                                     The design is uninteresting.
                      Colorfulness   The color composition is attractive.
                                     The colors do not match.
                                     The choice of color is messed up.
                                     The colors are appealing.
                      Craftsmanship  The layout appears professionally designed.
                                     The layout is not up-to-date.
                                     The webpage is designed with care.
                                     The design of the webpage lacks a concept.

CHAPTER 3

VERIFYING NGO AND BYRNE’S FINDINGS AND

DEVELOPING A PRELIMINARY MODEL


3.1 Introduction

The main purpose of experimental work covered in this chapter is to verify Ngo and Byrne

(2001) and Ngo et al. (2003) findings (summarized in section 2.2.1) using a more rigorous

experimental approach in a different setting and context with a fresh group of participants.

A controlled experiment was designed and conducted to further examine and verify Ngo

and Byrne's (2001) findings. The experiment has two goals. The first is to test effects of

the layout elements of balance, unity, and sequence on interface aesthetics, including the

possibility of interactions among these measures. The second is to use these elements to

build and validate a regression model representing users' perceived visual aesthetics.

The validation procedure includes a cross validation of

the results by comparing the regression model to be developed in this experiment with Ngo

and Byrne’s model. The model will also be validated using subjective standard

questionnaire scores of real webpages.

To accomplish these goals, the experimental procedure employed simple abstract

black and white screens to systematically assess effects of these three elements on

perceived visual aesthetics. The reason for using abstract screens is to be able to easily

manipulate and study the related elements in a controlled environment that would ensure

obtaining statistically valid results. This procedure was also used in similar previous

studies (Bauerly & Liu, 2006; Lai et al., 2010; Bi et al., 2011).

The three elements (balance, unity, and sequence) were chosen based on findings of Ngo

and Byrne’s study (2001). According to their findings, these three elements contributed

most to the developed computational model.


The balance element in screen design can be achieved by maintaining equal weights of

visual objects on the screen: top and bottom, left and right (Ngo et al., 2003). Unity is the

extent to which visual objects on the screen seem to belong together as one object (Ngo et

al., 2003). Sequence corresponds to the arrangement of visual objects on a screen in a way

that facilitates eye movement. Eye movements usually follow the pattern associated

with reading. In cultures that read from left to right, the eyes will start from the upper left

and move back and forth across the screen to the lower right (Ngo et al., 2003). Moreover,

bigger objects on the screen have more visual weight and the eyes move from bigger to

smaller objects on the screen.

Ngo et al. (2003) have developed formulas to calculate numerical values for each of these

elements. The formulas were developed so that each element (measure) takes a value

ranging from zero (for the lowest screen aesthetics level) to one (for the highest screen

aesthetics level). These formulas are used to calculate the required values for

the three elements. The formulas for the three elements with hypothetical examples

showing their uses are given in Appendix A.

3.2 Method

3.2.1 Design of the experiment

An experiment was designed and conducted to test effects of the three screen layout

elements of balance, unity, and sequence on participants' perceived aesthetic value of

interface design.


A factorial design was utilized with the three screen elements as the main factors. Each

of the three factors was tested at two levels (high and low) that were supposed to cover the

whole range of each factor. The design used is a 2^3 within-participants factorial design

with repeated measures. This design produces eight experimental conditions representing

the factorial combinations of the three factors each at two levels (2^3 = 8 conditions).
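
As an illustration, the eight conditions of a 2^3 design can be enumerated mechanically. The following is a minimal Python sketch; the names FACTORS, LEVELS, and factorial_conditions are hypothetical and are not taken from the study, and the order in which the conditions are produced is not the screen numbering used in Table 3.1.

```python
from itertools import product

# Hypothetical names for illustration; "+" denotes the high level, "-" the low level.
FACTORS = ("balance", "unity", "sequence")
LEVELS = ("+", "-")


def factorial_conditions(factors=FACTORS, levels=LEVELS):
    """Return every combination of factor levels as a list of {factor: level} dicts."""
    return [
        dict(zip(factors, combination))
        for combination in product(levels, repeat=len(factors))
    ]


conditions = factorial_conditions()

# Three factors at two levels each give 2**3 = 8 conditions.
for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

Because the design is within-participants, every participant rates all eight of these conditions.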

The three factors: balance, unity, and sequence represent the independent variables and

the dependent variable is participants' ratings of interface aesthetics.

This type of factorial design was used because it is relatively easy to apply and

because it can give reliable results with a relatively small number of participants.

3.2.2 Screen designs

Eight black and white screen models representing the eight experimental combinations (3

factors each at 2 levels) were prepared. Each screen has an on-screen size of 1024

pixels by 1024 pixels. Four squares were used as the screen objects to be manipulated to

produce the required experimental conditions. A relatively small number of only four

objects was used in each screen to simplify the object manipulation required to produce the

experimental conditions.

The required numerical value of each factor was calculated using the formulas

developed by Ngo et al. (2003); examples of how the calculations were carried out are

given in Appendix A. Although, theoretically, the two levels of each factor are supposed

to represent the extreme values (0 for low and 1 for high), it was practically difficult to do

that. To overcome this difficulty, a range was used to represent each level, with the low

level below 0.25 and the high level above 0.75.


Table 3.1 shows the different factor levels (+ for high and – for low) and values

associated with the eight screen designs. It also shows the overall aesthetic measure value

of each screen, obtained by calculating the average of the values of the three factors. Fig.

3.1 represents the eight screen models associated with the eight experimental conditions.

They are presented in the same order as in Table 3.1; for example, screen 1 represents the

condition of all the factors at the "high" level (+++) and screen 2 represents the condition

of all factors at the "low" level (---). The remaining screens represent the different

combinations of "high" and "low" levels for the three factors (as explained in Table 3.1).

Table 3.1 The eight experimental conditions and the associated factors levels and values.

Screen (Condition)  Levels  Balance  Unity  Sequence  Aesthetic Measure
1                   + + +   1.00     0.99   1.00      0.997
2                   - - -   0.10     0.18   0.00      0.092
3                   + - -   0.98     0.24   0.00      0.406
4                   + + -   0.91     0.80   0.25      0.650
5                   - + +   0.09     0.82   1.00      0.637
6                   - - +   0.04     0.15   1.00      0.396
7                   + - +   1.00     0.15   1.00      0.716
8                   - + -   0.25     0.78   0.25      0.427
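
The relationship between the factor values, the level signs, and the overall aesthetic measure in Table 3.1 can be checked with a short script. This is a sketch only: the function names are hypothetical, and treating values of exactly 0.25 as "low" (and 0.75 as "high") is an assumption made here because the table assigns 0.25 to the low level. The recomputed averages agree with the tabulated measures to within about 0.004, presumably because the tabulated factor values are rounded.

```python
# Values copied from Table 3.1.
# screen: (levels, (balance, unity, sequence), tabulated aesthetic measure)
TABLE_3_1 = {
    1: ("+++", (1.00, 0.99, 1.00), 0.997),
    2: ("---", (0.10, 0.18, 0.00), 0.092),
    3: ("+--", (0.98, 0.24, 0.00), 0.406),
    4: ("++-", (0.91, 0.80, 0.25), 0.650),
    5: ("-++", (0.09, 0.82, 1.00), 0.637),
    6: ("--+", (0.04, 0.15, 1.00), 0.396),
    7: ("+-+", (1.00, 0.15, 1.00), 0.716),
    8: ("-+-", (0.25, 0.78, 0.25), 0.427),
}


def aesthetic_measure(values):
    """Overall measure: the plain average of the factor values."""
    return sum(values) / len(values)


def level_of(value, low_max=0.25, high_min=0.75):
    """Map a factor value to '-' or '+'. Inclusive boundaries are an assumption."""
    if value <= low_max:
        return "-"
    if value >= high_min:
        return "+"
    return "?"  # value falls between the two ranges


for screen, (levels, values, tabulated) in TABLE_3_1.items():
    signs = "".join(level_of(v) for v in values)
    print(screen, signs, round(aesthetic_measure(values), 3), tabulated)
```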


3.2.3 Participants and apparatus

Thirteen graduate students of engineering (10 males and 3 females) volunteered to

participate in the experiment, with a mean age of 29.3 years and standard deviation of 6.1

years. The participants came from widely diverse cultural backgrounds. They included

students from the US, Asia, Europe, Africa, and the Middle East.

An IBM compatible PC with a 17" LCD display with a screen resolution of 1280×1024 pixels and

a color depth of 32 bits (true color) was used in the experiment. The operating system was

Microsoft Windows XP. Microsoft Office PowerPoint 2003 was used to display the screens.

Figure 3.1 The eight screen models associated with the experimental conditions

3.2.4 Procedure

The eight screens were presented in random order on a computer display to each participant using

a PowerPoint presentation, with the participant controlling the progress of the presentation.

The participants were instructed to rate each screen based on their personal preferences


using a 10 point scale, with 10 representing "most beautiful" and 1 representing "least

beautiful". Each experimental trial started with the experimenter explaining the purpose of

the experiment and reading short written instructions explaining the nature of the

experiment and the task to be performed. Next, all the eight screens were quickly presented

to the participant. After that, each screen was presented separately and the participant had

to view the screen and write his/her rating on a paper form. Participants were encouraged

to rate each screen as fast as possible based on their intuitions and first impressions.

3.3 Results

3.3.1 Participants' ratings

Participants' average aesthetic ratings of each screen are presented in Table 3.2 next to the

corresponding calculated aesthetic values. Participants' ratings were divided by 10 to make

them compatible with the computed values of aesthetic measure. Comparing these ratings

to the calculated aesthetic measures, some accordance between both can be noticed, except

for screen 2; a relatively high average rating was given to this screen, which was a bit

surprising, since this screen is supposed to represent the lowest level of interface

aesthetics.

A relatively high correlation coefficient of 0.84 (p-value = 0.008) was found between

participants' ratings and the measured values of aesthetics. This agrees with the findings of
previous studies.


Table 3.2 Calculated aesthetic values and participants' average aesthetic ratings.

Screen        Aesthetic Measure   Average Aesthetic Ratings
(Condition)
1             0.997               0.908
2             0.092               0.438
3             0.406               0.485
4             0.650               0.654
5             0.637               0.546
6             0.396               0.415
7             0.716               0.515
8             0.427               0.354
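As an illustration of the correlation analysis, the sketch below computes a Pearson coefficient from the eight pairs listed in Table 3.2. It is a minimal example and not the original analysis: the table holds rounded averages and the exact computation behind the reported coefficient is not described, so the value obtained here may differ somewhat from the reported 0.84. The function name `pearson_r` is illustrative.

```python
import math

# Values copied from Table 3.2 (screens 1 to 8).
aesthetic_measure = [0.997, 0.092, 0.406, 0.650, 0.637, 0.396, 0.716, 0.427]
average_rating = [0.908, 0.438, 0.485, 0.654, 0.546, 0.415, 0.515, 0.354]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    print(round(pearson_r(aesthetic_measure, average_rating), 3))
```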

3.3.2 Analysis of variance

Analysis of variance results are shown in Table 3.3. All three elements (balance, unity, and
sequence) have significant effects on the perceived interface aesthetics (P-values < 0.001).
Only the two-way interactions involving the unity element were found significant (P-values
< 0.001). No significant interaction between balance and sequence was found (P-value =
0.215). The three-way interaction was not significant (P-value = 0.933). A test power of
0.994 (at α = 0.05) was calculated using an average estimated effect value of 1.224,
indicating that the sample size of 13 participants was sufficient for obtaining
statistically valid results.


Table 3.3 Analysis of variance results

Element                                           F       P-value
Balance (B)                                       76.56   < 0.001
Unity (U)                                         43.34   < 0.001
Sequence (S)                                      24.17   < 0.001
Balance - Unity interaction (B*U)                 31.17   < 0.001
Balance - Sequence interaction (B*S)              1.56    0.215
Unity - Sequence interaction (U*S)                22.56   < 0.001
Balance - Unity - Sequence interaction (B*U*S)    0.01    0.933
Participants                                      7.97    < 0.001

The implications of the significant effects of the three elements can be better explained by
interpreting the main-effects and interaction plots presented in Fig 3.2. Average effects
of the main factors are plotted in Fig 3.2 (a). For all three factors, participants' average
ratings of interface aesthetics increase as the factor moves from the low level to the high
level. Balance has the largest effect, closely followed by unity, and lastly sequence with a
relatively smaller effect.

Plots of the two-way interaction effects among the factors are shown in Fig 3.2 (b) and
(c). These plots indicate that, for each pair of factors, the effect of one factor is larger at
the high level of the other factor; at the low level the effect is very small. For example,
in Fig 3.2 (b), at the high level of balance, the average rating rises from 5.0 at the low
level of unity to 7.81 at the high level. At the low level of balance, the plot shows a very
small change in the average rating (from 4.3 to 4.5).
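The cell means quoted for the balance and unity interaction can be recovered from the factor levels in Table 3.1 and the average ratings in Table 3.2 (multiplied by 10 to return to the original 10-point scale). The sketch below is only an illustration of that arithmetic; the helper names `cell_mean` and `main_effect` are hypothetical, and it works on the per-screen averages, not on the individual participants' ratings used in the analysis of variance.

```python
# screen -> (balance level, unity level, sequence level, average rating on the 10-point scale)
SCREENS = {
    1: ("+", "+", "+", 9.08),
    2: ("-", "-", "-", 4.38),
    3: ("+", "-", "-", 4.85),
    4: ("+", "+", "-", 6.54),
    5: ("-", "+", "+", 5.46),
    6: ("-", "-", "+", 4.15),
    7: ("+", "-", "+", 5.15),
    8: ("-", "+", "-", 3.54),
}


def cell_mean(balance_level, unity_level):
    """Mean rating of the screens sharing the given balance and unity levels."""
    ratings = [r for b, u, _s, r in SCREENS.values()
               if b == balance_level and u == unity_level]
    return sum(ratings) / len(ratings)


def main_effect(factor_index):
    """Mean rating at the high level minus mean rating at the low level.

    factor_index: 0 = balance, 1 = unity, 2 = sequence.
    """
    high = [row[3] for row in SCREENS.values() if row[factor_index] == "+"]
    low = [row[3] for row in SCREENS.values() if row[factor_index] == "-"]
    return sum(high) / len(high) - sum(low) / len(low)


if __name__ == "__main__":
    for b in ("+", "-"):
        for u in ("-", "+"):
            print("balance", b, "unity", u, round(cell_mean(b, u), 2))
    print([round(main_effect(i), 2) for i in range(3)])
```

With these inputs the unity effect is large when balance is high and nearly absent when balance is low, which is the pattern described for Fig 3.2 (b).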

Figure 3.2 Average effects and interactions plots. Each panel plots the average aesthetic
rating (scale 0 to 8) against factor level (low, high).
(a) Average effects of the three factors: balance, unity, and sequence
(b) Interaction between balance and unity
(c) Interaction between unity and sequence

3.4 Constructing and Validating the Regression Model

3.4.1 Constructing the model

Based on results of analysis of variance, a regression model relating the significant

elements and interactions to the perceived aesthetic values was constructed. The model is

shown below (Equation 3.1):-

Aesthetic Value = 0.497 - 0.0077 B - 0.286 U - 0.0717 S + 0.419 B*U + 0.375 U*S (3.1)

Where:-

B : Balance

U : Unity

S : Sequence

The model has only five terms and only values of the three elements need to be substituted

in the model to get the equivalent value of perceived aesthetics. The model was used to

calculate values of the eight screens of the experiment and compare the results with actual

values of participants' ratings. The comparison is shown in Fig 3.3 and Table 3.4. One can

see that the predicted values calculated by the model and the actual values of participants'

ratings are very close. High correlation (r = 0.99, p-value < 0.001) was found between

actual and predicted values.
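As a quick check of how Equation 3.1 is applied, the sketch below substitutes the three-decimal element values listed in Table 3.5 into the model and compares the results with the predicted values in Table 3.4. It is a minimal illustration; the function name `interaction_model` is hypothetical, and small differences in the third decimal are to be expected because the published coefficients are rounded.

```python
def interaction_model(b, u, s):
    """Equation 3.1: predicted aesthetic value from balance, unity, and sequence."""
    return 0.497 - 0.0077 * b - 0.286 * u - 0.0717 * s + 0.419 * b * u + 0.375 * u * s


# screen -> (balance, unity, sequence) from Table 3.5
ELEMENTS = {
    1: (1.000, 0.990, 1.000),
    2: (0.098, 0.178, 0.000),
    3: (0.981, 0.238, 0.000),
    4: (0.905, 0.796, 0.250),
    5: (0.093, 0.818, 1.000),
    6: (0.038, 0.150, 1.000),
    7: (0.997, 0.150, 1.000),
    8: (0.251, 0.780, 0.250),
}

# screen -> predicted value as listed in Table 3.4
TABLE_3_4_PREDICTED = {1: 0.920, 2: 0.453, 3: 0.519, 4: 0.621,
                       5: 0.528, 6: 0.441, 7: 0.493, 8: 0.409}


def largest_deviation():
    """Largest absolute difference between recomputed and tabulated predictions."""
    return max(abs(interaction_model(*ELEMENTS[k]) - TABLE_3_4_PREDICTED[k])
               for k in ELEMENTS)


if __name__ == "__main__":
    for screen, (b, u, s) in ELEMENTS.items():
        print(screen, round(interaction_model(b, u, s), 3), TABLE_3_4_PREDICTED[screen])
```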


Figure 3.3 Scatter diagram of actual and predicted aesthetic values for the eight screens.

Table 3.4 Actual and predicted aesthetic values of the eight screens.

Screen no   Aesthetic Value
            Actual   Predicted
1           0.908    0.920
2           0.438    0.453
3           0.485    0.519
4           0.654    0.621
5           0.546    0.528
6           0.415    0.441
7           0.515    0.493
8           0.354    0.409

Figure 3.3 axes: aesthetic value (0.3 to 1.0) against screen no; series: actual and predicted.

3.4.2 Comparison with Ngo and Byrne’s model

For further validation of results, the model was compared with the model of Ngo and

Byrne (2001). This comparison was carried out first by using the model developed by Ngo

and Byrne (2001) to predict aesthetic values for the eight screens of the current study. A

short version of the model with only six terms (from the original 14) was used. These

terms are the ones found significant in Ngo and Byrne (2001) study. The model with the

six terms is shown in equation (3.2) below:-

Aesthetic Value = 0.038 + 0.11 B + 0.126 U + 0.0771 S + 0.061 D + 0.186 P + 0.0486 H (3.2)

Where:-

B : Balance

U : Unity

S : Sequence

D: Density

P: Proportion

H: Homogeneity

Table 3.5 presents the calculated values of the six terms (elements) for the eight screens;
these were substituted into the model to produce the predicted aesthetic value for each
screen. Predicted aesthetic values calculated by the model are plotted against actual values
of participants' ratings in Fig 3.4, and Table 3.6 lists these values. Fig 3.4 and Table 3.6
indicate that for almost all screens the aesthetic values produced by the model are lower
than the actual perceived values. This was expected, since the original model has fourteen
terms and only six are used here. However, a high correlation (r = 0.90, P-value = 0.002)
was found between predicted and actual values.

Table 3.5 Calculated values of the six terms (elements) included in Ngo and Byrne (2001) model.

Screen no   Balance   Unity   Sequence   Density   Proportion   Homogeneity
1           1.000     0.990   1.000      0.500     1.000        1.000
2           0.098     0.178   0.000      0.468     1.000        1.000
3           0.981     0.238   0.000      0.112     1.000        1.000
4           0.905     0.796   0.250      0.406     1.000        1.000
5           0.093     0.818   1.000      0.043     1.000        1.000
6           0.038     0.150   1.000      0.200     1.000        1.000
7           0.997     0.150   1.000      0.023     1.000        1.000
8           0.251     0.780   0.250      0.050     1.000        1.000
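The six-term model of Equation 3.2 can be applied to the rows of Table 3.5 in the same way. The sketch below is an illustration of that substitution, with the hypothetical function name `ngo_byrne_short`; its results can be compared with the predicted values listed in Table 3.6.

```python
def ngo_byrne_short(b, u, s, d, p, h):
    """Equation 3.2: six-term version of the Ngo and Byrne (2001) model."""
    return (0.038 + 0.11 * b + 0.126 * u + 0.0771 * s
            + 0.061 * d + 0.186 * p + 0.0486 * h)


# screen -> (balance, unity, sequence, density, proportion, homogeneity) from Table 3.5
TABLE_3_5 = {
    1: (1.000, 0.990, 1.000, 0.500, 1.000, 1.000),
    2: (0.098, 0.178, 0.000, 0.468, 1.000, 1.000),
    3: (0.981, 0.238, 0.000, 0.112, 1.000, 1.000),
    4: (0.905, 0.796, 0.250, 0.406, 1.000, 1.000),
    5: (0.093, 0.818, 1.000, 0.043, 1.000, 1.000),
    6: (0.038, 0.150, 1.000, 0.200, 1.000, 1.000),
    7: (0.997, 0.150, 1.000, 0.023, 1.000, 1.000),
    8: (0.251, 0.780, 0.250, 0.050, 1.000, 1.000),
}

# screen -> predicted value as listed in Table 3.6
TABLE_3_6_PREDICTED = {1: 0.615, 2: 0.334, 3: 0.417, 4: 0.516,
                       5: 0.466, 6: 0.385, 7: 0.480, 8: 0.421}


def largest_deviation():
    """Largest absolute difference between recomputed and tabulated predictions."""
    return max(abs(ngo_byrne_short(*TABLE_3_5[k]) - TABLE_3_6_PREDICTED[k])
               for k in TABLE_3_5)


if __name__ == "__main__":
    for screen, row in TABLE_3_5.items():
        print(screen, round(ngo_byrne_short(*row), 3), TABLE_3_6_PREDICTED[screen])
```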

To finish the comparison and to further verify the current model (Eq. (3.1)), the current
model was used to estimate the values of the 57 screens used in Ngo and Byrne's (2001)
study. Values of the three terms for the 57 screens, required to calculate the predicted
values, were obtained from that study. After calculating the predicted aesthetic values for
the 57 screens using the current model, the coefficient of correlation was calculated
between the predicted values and the actual values of participant ratings given to these
screens (obtained from Ngo and Byrne, 2001). A relatively high correlation (r = 0.81,
p-value < 0.001) was found. The original Ngo and Byrne model with all 14 terms gave a
correlation coefficient of r = 0.94 (p-value < 0.001).


Figure 3.4 Scatter diagram of actual and predicted (Ngo and Byrne model) aesthetic values for

the eight screens.

Table 3.6 Actual and predicted (Ngo and Byrne model) aesthetic values of the eight screens.

Screen no   Aesthetic Value
            Actual   Predicted
1           0.908    0.615
2           0.438    0.334
3           0.485    0.417
4           0.654    0.516
5           0.546    0.466
6           0.415    0.385
7           0.515    0.480
8           0.354    0.421

Figure 3.4 axes: aesthetic value (0.3 to 1.0) against screen no; series: actual and predicted.

Table 3.7 lists actual and predicted values for the 57 screens. Looking at these values, one
can see that, in general, the current model tends to overestimate the actual values. In only
thirteen of the 57 screens shown in Table 3.7 did the model give lower values than the
actual ones. A possible reason for this could be that the participants' ratings reported in
Ngo and Byrne's study were rounded to one decimal place. This possibility was checked by
rounding the predicted values; the number of overestimated values was reduced, but the
coefficient of correlation did not increase. Nevertheless, with only three measures (from
the original 14 proposed by Ngo and Byrne, 2001), the current model was able to estimate
the values of the 57 screens of Ngo and Byrne's (2001) study at the same significance level
(p-value < 0.001) as the original model, although with a lower correlation (r = 0.81
compared with r = 0.94).

The computational formulas and models suggested by Ngo and Byrne (2001) were

originally developed for data entry screens. It would be interesting to see how they would

work with website interfaces. In the next section the three selected objective layout-based

measures, with the associated models (the regression model and Ngo and Byrne's original

model) will be applied to real webpages and evaluated using subjective questionnaire-based
measures.


Table 3.7 Actual and predicted (current model) aesthetic values of the 57 screens of Ngo and Byrne (2001)

study.

Screen   Aesthetic Value      Screen   Aesthetic Value      Screen   Aesthetic Value
no       Actual  Predicted    no       Actual  Predicted    no       Actual  Predicted
1        0.500   0.558        21       0.500   0.514        41       0.600   0.613
2        0.600   0.604        22       0.500   0.545        42       0.600   0.656
3        0.700   0.633        23       0.400   0.464        43       0.600   0.640
4        0.700   0.879        24       0.600   0.556        44       0.500   0.510
5        0.500   0.557        25       0.700   0.740        45       0.500   0.508
6        0.500   0.501        26       0.500   0.522        46       0.500   0.505
7        0.500   0.524        27       0.600   0.672        47       0.600   0.607
8        0.600   0.593        28       0.600   0.676        48       0.600   0.615
9        0.500   0.607        29       0.600   0.535        49       0.500   0.483
10       0.600   0.584        30       0.500   0.516        50       0.500   0.491
11       0.600   0.668        31       0.500   0.485        51       0.500   0.509
12       0.500   0.510        32       0.500   0.499        52       0.500   0.528
13       0.500   0.671        33       0.500   0.514        53       0.500   0.524
14       0.500   0.512        34       0.500   0.510        54       0.500   0.527
15       0.400   0.472        35       0.500   0.532        55       0.600   0.603
16       0.500   0.568        36       0.400   0.497        56       0.500   0.575
17       0.500   0.566        37       0.600   0.582        57       0.500   0.553
18       0.400   0.357        38       0.600   0.611
19       0.500   0.462        39       0.600   0.610
20       0.500   0.481        40       0.500   0.555
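The count of underestimated screens can be read directly off Table 3.7. The sketch below is only an illustration using the values as printed in the table; the list and function names are hypothetical.

```python
# Actual ratings for screens 1 to 57, as listed in Table 3.7.
ACTUAL = [
    0.5, 0.6, 0.7, 0.7, 0.5, 0.5, 0.5, 0.6, 0.5, 0.6,   # screens 1-10
    0.6, 0.5, 0.5, 0.5, 0.4, 0.5, 0.5, 0.4, 0.5, 0.5,   # screens 11-20
    0.5, 0.5, 0.4, 0.6, 0.7, 0.5, 0.6, 0.6, 0.6, 0.5,   # screens 21-30
    0.5, 0.5, 0.5, 0.5, 0.5, 0.4, 0.6, 0.6, 0.6, 0.5,   # screens 31-40
    0.6, 0.6, 0.6, 0.5, 0.5, 0.5, 0.6, 0.6, 0.5, 0.5,   # screens 41-50
    0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5,                  # screens 51-57
]

# Values predicted by the current model (Eq. 3.1), as listed in Table 3.7.
PREDICTED = [
    0.558, 0.604, 0.633, 0.879, 0.557, 0.501, 0.524, 0.593, 0.607, 0.584,
    0.668, 0.510, 0.671, 0.512, 0.472, 0.568, 0.566, 0.357, 0.462, 0.481,
    0.514, 0.545, 0.464, 0.556, 0.740, 0.522, 0.672, 0.676, 0.535, 0.516,
    0.485, 0.499, 0.514, 0.510, 0.532, 0.497, 0.582, 0.611, 0.610, 0.555,
    0.613, 0.656, 0.640, 0.510, 0.508, 0.505, 0.607, 0.615, 0.483, 0.491,
    0.509, 0.528, 0.524, 0.527, 0.603, 0.575, 0.553,
]


def count_underestimates(actual, predicted):
    """Number of screens where the predicted value is below the actual value."""
    return sum(1 for a, p in zip(actual, predicted) if p < a)


def count_overestimates(actual, predicted):
    """Number of screens where the predicted value is above the actual value."""
    return sum(1 for a, p in zip(actual, predicted) if p > a)


if __name__ == "__main__":
    print(count_underestimates(ACTUAL, PREDICTED), count_overestimates(ACTUAL, PREDICTED))
```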


3.4.3 Validating the model using standard questionnaire scores of real webpages

The regression model with the interaction terms was used to calculate visual aesthetics of
forty-two webpages already used in a previous study (Moshagen & Thielsch, 2010) to
develop the VisAWI questionnaire-based measure of visual aesthetics of websites. These
42 webpages were used by Moshagen & Thielsch (2010) to validate the VisAWI
questionnaire and compare it with the classical/expressive aesthetics questionnaire.
Aesthetic values calculated for the 42 webpages by the regression model were compared to
the scores of the VisAWI and classical/expressive questionnaires already available in
Moshagen & Thielsch (2010). Correlation analysis was conducted to see how the three
objective layout-based elements and the associated model tested in this study relate to
standard questionnaire-based measures of visual aesthetics.

These 42 webpages were chosen for this study because they cover a wide variety of websites
with different levels of visual aesthetics. In addition, questionnaire scores from a large
sample are already available for these pages; scores from a total of 512 participants were
used to validate the questionnaire. Of the participants, 347 (67.8%) were female. Age
ranged from 15 to 82 years (M = 30.50; SD = 10.61). A 7-point Likert scale was used in the
questionnaires of Moshagen & Thielsch (2010). A list of the average scores for each
webpage is given in Appendix B. Table 3.8 summarizes descriptive statistics for the two
questionnaire scales. Screenshots of four of the 42 webpages are shown in Fig 3.5: two with
the highest scores (highest perceived visual aesthetics) and two with the lowest scores.


Table 3.8 Descriptive statistics for questionnaire scores for the 42 webpages.

Questionnaire          Scale           Min    Max    Average   Standard deviation
Classical/expressive   Classic         2.40   5.21   3.98      0.75
                       Expressive      1.68   4.10   2.87      0.60
                       Average         2.27   4.46   3.42      0.54
VisAWI                 Simplicity      2.35   5.10   3.96      0.68
                       Diversity       2.00   4.60   3.51      0.61
                       Colorfulness    2.98   5.33   4.32      0.61
                       Craftsmanship   2.90   5.42   4.37      0.61
                       Average         2.72   4.90   4.00      0.53

The procedure used to compute the values of the three elements (balance, unity, and
sequence) for the 42 webpages is the same as the one used to calculate their values for the
eight abstract screens. Visual information on each page was divided into hypothetical
visual objects. Layout data obtained from these objects (area, distance from central axis,
etc.) were input to the computational formulas for computing the three elements (see
Appendix A for the formulas and examples of calculations). Fig 3.6 shows an example of
how a webpage was divided into visual objects. Table 3.9 gives a summary of descriptive
statistics for the three elements, their average, and the values calculated by the
interaction (regression) model and the Ngo and Byrne model. Complete lists of all values
calculated for each page are given in Appendix B.


Figure 3.5 Screenshots of webpages with the highest and lowest average questionnaire scores: (a) and (b)
with the highest scores, (c) and (d) with the lowest scores.

(a) Webpage no. 4. Average scores: Classic/Expressive = 4.46, VisAWI = 4.90

(b) Webpage no. 11. VisAWI = 4.86

(c) Webpage no. 2. VisAWI = 2.96

(d) Webpage no. 17. VisAWI = 2.72

Figure 3.6 An Example of how a webpage is divided into visual objects (top image shows the original web

page, bottom image shows the page divided into visual objects).


Table 3.9 Descriptive statistics for the measures and the models for the 42 webpages.

Measure Min Max Average Standard deviation

Balance 0.516 0.950 0.792 0.105

Unity 0.163 0.684 0.417 0.145

Sequence 0.750 1.000 0.970 0.082

Average 0.528 0.835 0.726 0.072

Interaction Model 0.486 0.712 0.591 0.060

Ngo Model 0.191 0.291 0.252 0.024

Table 3.10 shows correlation coefficients between the measures and the models on the one
side, and questionnaire scores for the 42 webpages on the other. From the table, one
can see that all significant correlations are with the questionnaire items related to screen
layout. The measure of unity and the models are significantly correlated with the classical
and the simplicity measures, both of which include items related to visual layout and
clarity of the design.

Table 3.10 Correlations between the measures and questionnaire scores.

                    Classical/expressive                VisAWI
Measure             Classic   Expressive   Average     Simplicity   Diversity   Colorfulness   Craftsmanship   Average
Balance             0.064     0.064        0.08        0.136        -0.001      0.1            -0.111          0.044
Unity               0.562*    0.133        0.466*      0.658*       0.140       0.255          0.463*          0.457*
Sequence            0.279*    0.062        0.229       0.313**      0.131       0.297          0.167           0.269
Average             0.511*    0.143        0.436*      0.623*       0.142       0.331*         0.318**         0.428*
Interaction model   0.600*    0.189        0.524*      0.712*       0.163       0.316**        0.434*          0.491*
Ngo model           0.539*    0.151        0.460*      0.657*       0.143       0.325**        0.347**         0.446*

* Significant at 0.01, ** significant at 0.05


Of the three layout measures (balance, unity, and sequence), only unity has high
correlations with the questionnaire measures. No significant correlations were found
between either balance or sequence and the questionnaire measures. This might be explained
by looking at the interaction plots in Fig 3.2 and the descriptive statistics in Table 3.9.
High values for both balance and sequence were calculated for the 42 webpages; values of
balance range from 0.516 to 0.950 with an average of 0.792, and values of sequence are all
at least 0.75 with an average of 0.970. On the other hand, unity has lower values, from
0.163 to 0.684 with an average of 0.417. Interpretation of the interaction plots (section
3.3.2) suggests that the effect of one factor is larger at the high levels of the other
factors. For the 42 webpages, both balance and sequence have higher values than unity.
Hence, unity will have a larger impact on perceived aesthetics. This was reflected in the
high correlations unity has with the related questionnaire measures. Nevertheless, the other
case of lower values of balance and sequence should also be investigated to confirm this
explanation. Also, can the high levels of balance witnessed here be considered a typical
characteristic of all website designs? Or is it just a coincidence with the 42 webpages used
in the study?

3.4.4 Checking for correlations with simple counts measures

In this section, selected simple count-based measures for the 42 webpages will be

compared with the questionnaire scores. This could give further explanation for some of

the above observations seen with the formularized measures and might lead to better

understanding of the relationships and interactions among the measures associated with the

models. It may also help in finding other simpler measures for visual aesthetics of website

interface design.


Five measures were selected, namely: number of visual objects on the screen, number of
different sizes of visual objects, number of images, number of different font types used in
the webpage, and JPEG file size of a screenshot of the webpage. Number of objects,
number of images, and JPEG file size have already been tested in previous studies (Bauerly
& Liu, 2006; Bi et al., 2011; Djamasbi et al., 2010; Schmidt et al., 2003; Tuch et al., 2010),
all with results indicating some sort of relationship between these measures and users’
perception of visual aesthetics. Number of different sizes of visual objects is one of the
input parameters in Ngo et al.'s formulas for unity. Number of different font types has been
selected based on earlier observations.

The procedure will be the same as the one used in validating the model; the selected

measures will be calculated for the 42 webpages and compared to questionnaire scores

using correlations analysis.

Descriptive statistics for the calculated values for the five selected measures for the 42

web pages are given in Table 3.11. The complete list for all 42 webpages is given in

Appendix B.

Table 3.12 shows correlation coefficients between the selected measures and
questionnaire scores for the 42 webpages. Significant correlations were found between
number of objects and number of different sizes on the one hand and both the classical and
the simplicity measures on the other. This was not surprising, since these two features
(number of objects and number of different sizes) are the main input parameters in the unity
formula. These significant correlations point to clear negative effects of increasing the
number of objects and the number of different sizes on perceived visual aesthetics of
websites.


Table 3.11 Descriptive statistics for the selected count-based measures for the 42 web pages.

Measure                            Min   Max   Average   Standard deviation
No of objects                      6     21    10.5      3.9
No of different sizes of objects   3     20    9.2       3.6
JPEG file size (Kbytes)            50    251   170.8     44.4
No of different font types         1     6     2.8       1.3
No of images                       0     12    4.3       3.1

Significant correlations were also found between the classical aesthetics measure and both
JPEG file size and number of different font types. No strong correlations were found between
number of images and either the classical or the simplicity measures. However, an
interesting result is the noticeably high and significant correlations found between number
of different fonts and the expressive and the diversity measures.

Further investigation involving some of the measures tested in this section will be carried
out in the next chapter.


Table 3.12 Correlations between objective simple count-based measures and subjective questionnaire based measures.

                                   Classical/expressive                VisAWI
Measure                            Classic    Expressive   Average     Simplicity   Diversity   Colorfulness   Craftsmanship   Average
No of objects                      -0.355**   -0.086       -0.296      -0.397*      -0.087      -0.055         -0.203          -0.413*
No of different sizes of objects   -0.561*    -0.170       -0.487*     -0.602*      -0.248      -0.189         -0.371**        -0.542*
JPEG file size (Kbytes)            -0.338**   0.023        -0.223**    -0.333       -0.011      0.038          -0.123          -0.142
No of different font types         -0.333**   0.600*       0.103       -0.257       0.399*      -0.172         -0.047          -0.019
No of images                       -0.251     0.195        -0.066      -0.224       0.203       0.082          -0.016          0.002


CHAPTER 4

FURTHER TESTING OF VISUAL LAYOUT ELEMENTS

AND VALIDATING OF THE MODEL


4.1 Introduction

In this chapter, further experimental investigations of the effects of balance and unity of
form on perceived visual aesthetics are carried out using controlled experiments. The
motivation for these investigations is the questions raised by the results of the correlation
analyses presented in the previous chapter. Results of these analyses showed significant
correlations between unity, number of objects, and number of different sizes and the
subjective questionnaire-based measures. Interpretation of these results suggested that the
high correlations between these objective measures and perceived aesthetics only occur at
high levels of balance. The purpose of the experimental work presented in this chapter is
to confirm these findings. The main goal is to test the hypothesis that unity of form has
significant effects on perceived aesthetics of website design in the case of designs with
high levels of vertical balance. Specifically, this part of the study aims to systematically
study the effects of number of objects and number of different sizes of objects on perceived
visual aesthetics of website design under the conditions of balanced and unbalanced
designs.

Number of objects and number of different sizes of objects are the two input parameters
used to calculate unity of form in the formula developed by Ngo et al. (2003). The measure
of unity consists of two sub-measures, unity of form and unity of layout; the value of unity
equals the average value of both sub-measures. Unity of form represents the extent to
which visual objects on the screen are related in size. High levels of unity of form can be
achieved by using objects with similar sizes on the screen and/or by reducing the number of
objects on the screen. The formula for unity of form with an example of its application is
given in Appendix A.
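The formula itself is given in Appendix A and is not reproduced in this section. As a hedged illustration, the sketch below assumes the commonly cited form of the unity-of-form measure of Ngo et al. (2003), namely 1 - (n_size - 1) / n, where n is the number of objects and n_size the number of different sizes. This assumed form reproduces the unity-of-form values listed in Table 4.1 to two decimals; the function name `unity_of_form` is illustrative.

```python
def unity_of_form(n_objects, n_sizes):
    """Assumed form of the unity-of-form measure: 1 - (n_sizes - 1) / n_objects."""
    if n_objects <= 0 or not 1 <= n_sizes <= n_objects:
        raise ValueError("expected 1 <= n_sizes <= n_objects")
    return 1.0 - (n_sizes - 1) / n_objects


# (number of objects, number of different sizes) -> value listed in Table 4.1
TABLE_4_1_UNITY = {
    (6, 3): 0.67,
    (6, 5): 0.33,
    (16, 3): 0.88,
    (16, 11): 0.38,
}

if __name__ == "__main__":
    for (n, n_size), listed in TABLE_4_1_UNITY.items():
        print(n, n_size, round(unity_of_form(n, n_size), 3), listed)
```

Under this assumed form, a screen whose objects all share one size has a unity of form of 1, whatever the number of objects.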


4.2 Method

4.2.1 Experimental Design

A three-factor mixed (within and between) participants design was utilized. The three
factors are vertical balance, number of objects, and number of different sizes of objects.
Each of the three factors was tested at two levels (high and low). This experimental
design with three factors, each with two levels, produces eight experimental conditions.
Eight different designs of a webpage were prepared to represent the eight experimental
conditions. All eight designs have identical styles (colors, fonts, etc.); only visual
elements related to the three factors were manipulated. Values of the levels of the balance
factor were determined using the Ngo et al. (2003) formula for vertical balance, with the
higher level value equal to one (1.0) and the lower level values no greater than 0.28.
Values for the levels of the two other factors (number of objects and number of sizes) were
chosen based on observations from the experimental work presented in the previous chapter.
Table 4.1 shows factor values and levels associated with the eight experimental
conditions. The first four designs (designs 1 to 4) represent the higher level of balance.
The last four designs (designs 5 to 8) represent the lower level.

Fig 4.1 shows abstract mock-up screens representing the eight experimental conditions.
These mock-up screens were used in the first experimental trial and were also used as
templates to prepare the real webpage designs. Fig 4.2 shows screenshots of the eight
designs of the webpage. The webpage represents a homepage of a hypothetical website
about the ancient history of a certain region of North Africa. It uses the local
language of that region (Arabic). However, the content of the website was irrelevant for
the purpose of the study and should not have any effect on participants’ responses.

Two methods were used to measure participants’ perception of visual aesthetics. The
first uses a single question to measure the overall aesthetics of the design (the same as in
the experiment in chapter 3). The second uses standard questionnaires. Both the
classical/expressive and VisAWI questionnaires were used.

Table 4.1 The eight designs and the associated factor levels and values.

Design        Levels   Vertical   No of     No of             Unity of
(Condition)            balance    objects   different sizes   form
1             + + +    1          6         3                 0.67
2             + + -    1          6         5                 0.33
3             + - +    1          16        3                 0.88
4             + - -    1          16        11                0.38
5             - + +    0.25       6         3                 0.67
6             - + -    0.25       6         5                 0.33
7             - - +    0.27       16        3                 0.88
8             - - -    0.28       16        11                0.38

4.2.2 Procedure

Experimental trials were carried out online using an online survey design and distribution
service. Participants were recruited through this service. Email invitations were
sent randomly to potential participants, with the option of entering a lottery to win 100 US
dollars. All invitations were sent to potential participants in the United States.


Figure 4.1 The eight abstract mock-up screens, shown as panels (1) to (8).

Figure 4.2 The eight webpage designs, shown as panels (1) to (8).

The experimental work consisted of six experimental trials, each using one of the
response instruments. Each of the tests with the standard questionnaires was carried out
over two trials: one for the balanced condition and the other for the unbalanced condition.
This was merely dictated by how long each test trial should be and how that might affect
the response rate of the participants and their commitment to complete the test trial. The
goal was to allow participants to finish the test within an acceptable time limit. Table 4.2
summarizes the experimental trials and gives demographic information for participants in
each trial.

Table 4.2 Experimental trials and participants information.

Trial                                              No of invitations   No of responses   No of valid responses   Age (years)
                                                   emailed             delivered         (male, female)          Average   Standard deviation
One-question, mock-ups                             251                 31                29 (16, 13)             39.9      11.7
One-question, webpages                             201                 31                28 (11, 17)             38.7      10.2
Questionnaire, Classic/Expressive, balanced        201                 24                17 (8, 9)               35.1      10.1
Questionnaire, Classic/Expressive, unbalanced      201                 26                21 (11, 10)             40.3      9.7
Questionnaire, VisAWI, balanced                    251                 40                25 (13, 12)             41.5      16.4
Questionnaire, VisAWI, unbalanced                  201                 26                23 (8, 15)              42.6      9


In each trial the images associated with the eight designs were presented randomly to
each participant, one at a time, with an on-screen size of 800×600 pixels. The question or
questionnaire was placed under each image. In the one-question trials, participants had to
rate each screen based on their personal preferences using a 10-point scale, with 10
representing "most beautiful" and 1 representing "least beautiful". In the questionnaire
trials, questionnaire items were presented in random order to each participant. A seven-point
Likert scale was used to collect responses to each item.

4.3 Results Analysis and Discussion

4.3.1 One-question with mock-up screens trial

Table 4.3 summarizes descriptive statistics for average scores (a complete list of all scores
is given in Appendix C). Averages are computed per design (1 to 8), per number of objects
(6 and 16), and per condition (balanced and unbalanced). The information in Table 4.3 is
depicted in Fig 4.3. It can be seen that designs with a smaller number of objects and a
smaller number of different sizes are given relatively higher average scores. This
pattern was clear in both conditions (balanced and unbalanced). In the balanced
condition, the highest average score (5.24) was recorded for the design with the
smallest number of objects and the smallest number of sizes (design 1). The lowest
average score (3.59) was recorded for the design with the largest number of
objects and the largest number of sizes (design 4). This pattern appears in the unbalanced
condition as well: design 5, with the smallest number of objects and sizes, was given the
highest average score (5.00), and design 8, with the largest number of objects and sizes,
was given the lowest average score (3.21).

Also with average scores per number of objects, higher average scores were recorded
for designs with the smaller number of objects in both conditions. For example, in the
balanced condition, the average score for designs 1 and 2 (4.88), associated with the
smaller number of objects (6), is relatively higher than the average score for designs 3
and 4 (4.16), associated with the larger number of objects (16).

In addition, designs in the balanced condition were given relatively higher average
scores than their counterpart designs in the unbalanced condition. This was reflected in
the higher average score (4.52) given to the balanced condition compared with the lower
average score (4.09) given to the unbalanced condition.
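The grouped averages quoted above can be reproduced from the per-design averages in Table 4.3. The sketch below is an illustration only; it assumes that every participant rated all eight designs, so that the mean of design means equals the group mean, and the name `group_mean` is hypothetical.

```python
# design -> (condition, number of objects, average score) from Table 4.3
DESIGN_MEANS = {
    1: ("balanced", 6, 5.24),
    2: ("balanced", 6, 4.52),
    3: ("balanced", 16, 4.72),
    4: ("balanced", 16, 3.59),
    5: ("unbalanced", 6, 5.00),
    6: ("unbalanced", 6, 3.90),
    7: ("unbalanced", 16, 4.24),
    8: ("unbalanced", 16, 3.21),
}


def group_mean(condition, n_objects=None):
    """Mean of the design averages for a condition, optionally for one object count."""
    scores = [score for cond, n, score in DESIGN_MEANS.values()
              if cond == condition and (n_objects is None or n == n_objects)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for condition in ("balanced", "unbalanced"):
        print(condition,
              round(group_mean(condition, 6), 3),
              round(group_mean(condition, 16), 3),
              round(group_mean(condition), 3))
```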

Table 4.3 Descriptive statistics for average scores for the one-question, mock-up screens trial.

Condition    Design   No of     No of   Average score                                     Standard deviation
             no       objects   sizes   Per design   Per no of objects   Per condition   (per screen)
Balanced     1        6         3       5.24         4.88                4.52            2.60
Balanced     2        6         5       4.52         4.88                4.52            2.28
Balanced     3        16        3       4.72         4.16                4.52            2.88
Balanced     4        16        11      3.59         4.16                4.52            2.10
Unbalanced   5        6         3       5.00         4.45                4.09            2.41
Unbalanced   6        6         5       3.90         4.45                4.09            2.24
Unbalanced   7        16        3       4.24         3.73                4.09            2.40
Unbalanced   8        16        11      3.21         3.73                4.09            1.57

Figure 4.3 Average scores for the one-question, mock-up screens trial: (a) per design, (b) per
no of objects, (c) per condition. The plotted values are those listed in Table 4.3, shown
separately for the balanced and unbalanced conditions.

Analysis of variance results are shown in Table 4.4. A nested factorial and repeated

measures analysis technique was used to complete the analysis; number of different sizes

of objects was tested as a nested factor within the number of objects factor. Results show

that both factors have statistically significant effects on participants' scores in both the

balanced and the unbalanced conditions. Participants have given designs with lower

number of objects and sizes significantly higher average scores, i.e. they perceived

designs with lower number of objects and sizes as having higher level of visual

aesthetics.

Table 4.4 ANOVA for average scores for the one-question, mock-up screens trial.

Case         Element                F      P-value
Balanced     Objects                4.62   0.034
Balanced     Sizes within objects   4.01   0.021
Unbalanced   Objects                5.78   0.018



Pair-wise comparisons between levels of each factor showed that differences were
significant for all pairs except the levels of number of different sizes within the higher
level of the number of objects (designs 1 and 2). Increasing the number of different sizes
from 3 in design 1 to 5 in design 2, with the number of objects equal to 6, did not result
in a significant change in participants' average score.

The difference between average scores of the balanced and unbalanced conditions was also significant (p-value = 0.013). Balanced designs were perceived as having a significantly higher level of visual aesthetics than the unbalanced designs.

Differences among participants were also significant in both conditions. This justifies the use of the repeated-measures approach in the analysis.

4.3.2 One-question with webpages trial

Average scores for this trial are given in Table 4.5 and Fig 4.4. Both show that in many cases designs with a smaller number of objects and a smaller number of sizes were given higher average scores. However, this pattern was not consistent over all designs and factors. With number of objects the pattern was consistent in both conditions; higher average scores were recorded with designs associated with the smaller number of objects. This was more evident in the balanced condition, with an average score of 4.16 given to designs 1 and 2 and an average score of 3.70 given to designs 3 and 4. The difference was smaller in the unbalanced condition, with an average of 3.98 for designs 5 and 6 and an average of 3.75 for designs 7 and 8.

With number of different sizes, no obvious pattern could be recognized. In the balanced condition within the smaller number of objects, the design with the larger number of sizes (design 2) was given a higher average score than the design with the smaller number of sizes (design 1). This was also the case in the unbalanced condition with designs 7 and 8.

65

A slightly higher average score (3.93) was given to the balanced designs than to the unbalanced designs (3.87).
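
The per-level averages quoted above can be reproduced from the per-design averages in Table 4.5. The following minimal Python sketch assumes that every design was rated by the same number of participants, so that a group average equals the plain mean of its design averages; the dictionary and function names are illustrative only.

```python
from statistics import mean

# Per-design average scores from Table 4.5 (one-question, webpages trial).
DESIGN_AVERAGES = {1: 4.07, 2: 4.25, 3: 3.89, 4: 3.50,
                   5: 4.04, 6: 3.93, 7: 3.64, 8: 3.86}

# (condition, number of objects) for each design, as listed in Table 4.5.
DESIGN_LEVELS = {1: ("Balanced", 6), 2: ("Balanced", 6),
                 3: ("Balanced", 16), 4: ("Balanced", 16),
                 5: ("Unbalanced", 6), 6: ("Unbalanced", 6),
                 7: ("Unbalanced", 16), 8: ("Unbalanced", 16)}


def average_by(key_func):
    """Group the per-design averages by key_func(level) and average each group."""
    groups = {}
    for design, score in DESIGN_AVERAGES.items():
        groups.setdefault(key_func(DESIGN_LEVELS[design]), []).append(score)
    return {key: mean(scores) for key, scores in groups.items()}


per_objects = average_by(lambda level: level)        # keyed by (condition, objects)
per_condition = average_by(lambda level: level[0])   # keyed by condition
```

Up to rounding of the published two-decimal values, the results agree with the "per no of objects" and "per case" columns of Table 4.5.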

Table 4.5 Descriptive statistics for average scores for the one-question with webpages trial.

Case        Screen no  No of objects  No of sizes  Average per screen  Average per no of objects  Average per case  Standard deviation (per screen)
Balanced    1          6              3            4.07                4.16                       3.93              1.74
Balanced    2          6              5            4.25                4.16                       3.93              1.73
Balanced    3          16             3            3.89                3.70                       3.93              1.87
Balanced    4          16             11           3.50                3.70                       3.93              1.97
Unbalanced  5          6              3            4.04                3.98                       3.87              2.32
Unbalanced  6          6              5            3.93                3.98                       3.87              2.23
Unbalanced  7          16             3            3.64                3.75                       3.87              2.08
Unbalanced  8          16             11           3.86                3.75                       3.87              2.17

Analysis of variance in Table 4.6 shows that only the effect of the number of objects factor in the balanced condition was significant (p-value = 0.016). Effects of number of different sizes were not significant in either the balanced or the unbalanced condition. Differences among participants were significant. The difference between average scores of the balanced and unbalanced designs was not significant (p-value = 0.71). Participants in this trial did not perceive the balanced designs as having a higher level of visual aesthetics than the unbalanced designs.

66

Figure 4.4 Average scores for the one-question, webpages trial: (a) per design, (b) per no of objects, (c) per condition.

67

Table 4.6 ANOVA for average scores for the one-question webpages trial.

Case        Element  F     P-value
Balanced    Objects  6.10  0.016
Unbalanced  Objects  1.56  0.21



4.3.3 The Classic/Expressive questionnaire trials

a. The balanced condition

Average scores for this trial are summarized in Table 4.7 and Fig 4.5. With the classical aesthetics part, slightly higher average scores were recorded with designs associated with the smaller number of objects and smaller number of sizes. Design 1, with the smallest number of objects and sizes, was given the highest average score (4.47) and design 4, with the largest number of objects and sizes, was given the lowest average score (3.87). The largest difference was recorded between the two designs associated with the larger number of objects (design 3 and design 4). Design 3, with the smaller number of sizes, was given an average score of 4.38, noticeably higher than the average score of 3.87 given to design 4, which has the larger number of sizes. Designs 1 and 2, with the smaller number of objects, were given a higher average score (4.43) than designs 3 and 4 (4.13), associated with the larger number of objects.

With the expressive aesthetics part, no clear pattern could be distinguished. The same average score (3.12) was given to both design 1, associated with the smallest numbers of objects and sizes, and design 4, associated with the largest numbers of objects and sizes. Almost the same average scores were given to both levels of number of objects (3.03 and 3.08).

With the total average scores, the same pattern as in the classical scale can be seen; higher average scores were given to designs with the smaller number of objects and smaller number of sizes. The highest average score (3.79) was recorded with design 1 and the lowest average score (3.49) was recorded with design 4.

A noticeably higher overall average score (4.28) was recorded with the classical scale than with the expressive scale (3.06).

Cronbach’s α was used to measure the reliability of the questionnaire. All calculated values were within the range 0.67-0.74 for the different scales of the questionnaire, indicating an acceptable level of reliability.
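
For reference, Cronbach's α for a scale with k items is k/(k-1) multiplied by (1 minus the ratio of the summed item variances to the variance of the participants' total scores). The Python sketch below applies this standard definition; the response matrix is hypothetical and is not data from this study, and the function name is an assumption.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item ratings.

    responses[i][j] is the rating given by participant i to questionnaire item j.
    Uses sample variances (n - 1 in the denominator) throughout.
    """
    n_items = len(responses[0])
    item_variances = [variance([row[j] for row in responses])
                      for j in range(n_items)]
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 4 participants (rows) x 3 items (columns).
hypothetical_responses = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [3, 2, 3]]
alpha = cronbach_alpha(hypothetical_responses)
```

Items that move together across participants inflate the variance of the total score relative to the item variances, which pushes α towards 1.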

69

Table 4.7 Descriptive statistics for average scores for the Classical/Expressive balanced trial.

Scale       Design no  No of objects  No of sizes  Average per design  Average per no of objects  Average per scale  Standard deviation (per screen)
Classical   1          6              3            4.47                4.43                       4.28               1.56
Classical   2          6              5            4.40                4.43                       4.28               1.31
Classical   3          16             3            4.38                4.13                       4.28               1.26
Classical   4          16             11           3.87                4.13                       4.28               1.34
Expressive  1          6              3            3.12                3.03                       3.06               1.54
Expressive  2          6              5            2.94                3.03                       3.06               1.36
Expressive  3          16             3            3.05                3.08                       3.06               1.41
Expressive  4          16             11           3.12                3.08                       3.06               1.51
Total       1          6              3            3.79                3.73                       3.67               1.08
Total       2          6              5            3.67                3.73                       3.67               0.86
Total       3          16             3            3.71                3.60                       3.67               0.97
Total       4          16             11           3.49                3.60                       3.67               0.95

Cronbach’s α (all scales): 0.67-0.74

70

Figure 4.5 Average scores for the Classical/Expressive questionnaire balanced trial: (a) per design, (b) per no of objects, (c) per scale.

71

Results of analysis of variance in Table 4.8 show that, in the balanced condition, effects of number of objects and number of sizes were significant only with the classical aesthetics scale. Pair-wise comparisons showed that the effect of number of sizes was significant only at the higher level of number of objects (16 objects, between designs 3 and 4). No significant effects of the two factors were found with the expressive scale or the total average scores.

Differences between average scores of the classical and expressive scales were significant, as the analysis of variance in Table 4.11 (on page 76) shows. This difference was significant in both the balanced and the unbalanced conditions. Participants perceived the designs as having higher levels of classical aesthetics than of expressive aesthetics.

Table 4.8 ANOVA for average scores for the Classical/Expressive balanced trial.

Scale       Element  F     P-value
Classical   Objects  4.66  0.036
Expressive  Objects  0.29  0.59
Total       Objects  2.77  0.10



72

b. The unbalanced condition

Average scores for this trial are presented in Table 4.9 and Fig 4.6. No clear pattern like that of the balanced condition could be recognized here. The only noticeable observation is that the lowest average score was recorded with design 8 in both the classical scale and the total; design 8 is the design associated with the largest numbers of objects and sizes. However, design 5, associated with the smallest numbers of objects and sizes, was not given the highest average score in any of the cases.

Also, with the classical scale, designs associated with the smaller number of objects (designs 5 and 6) were given a slightly higher average score (3.62) than designs with the larger number of objects (designs 7 and 8), which averaged 3.51. The pattern was reversed in both the expressive scale and the total average scores.

As in the balanced condition, classical aesthetics was given a higher average score (3.56) than expressive aesthetics (2.99).

The calculated Cronbach’s α values for all scales are between 0.64 and 0.75, all within the acceptable reliability limits.

No significant effects were found in any of the scales, as the results of analysis of variance in Table 4.10 indicate. Only differences among participants were significant in all scales, as was the case in the previous trials.

The balance factor has significant effects on both classical and total scores, as the results of analysis of variance in Table 4.11 (on page 76) indicate. The balanced designs were perceived as having higher levels of classical aesthetics than the unbalanced designs.

73

Table 4.9 Descriptive statistics for average scores for the Classical/Expressive unbalanced trial.

Scale       Design no  No of objects  No of sizes  Average per design  Average per no of objects  Average per scale  Standard deviation (per screen)
Classical   5          6              3            3.53                3.62                       3.56               1.27
Classical   6          6              5            3.71                3.62                       3.56               1.37
Classical   7          16             3            3.62                3.51                       3.56               1.31
Classical   8          16             11           3.40                3.51                       3.56               1.17
Expressive  5          6              3            2.93                2.91                       2.99               1.43
Expressive  6          6              5            2.89                2.91                       2.99               1.39
Expressive  7          16             3            3.11                3.07                       2.99               1.47
Expressive  8          16             11           3.02                3.07                       2.99               1.37
Total       5          6              3            3.23                3.26                       3.28               0.85
Total       6          6              5            3.30                3.26                       3.28               0.84
Total       7          16             3            3.36                3.29                       3.28               0.75
Total       8          16             11           3.21                3.29                       3.28               0.84

Cronbach’s α (all scales): 0.64-0.75

74

Figure 4.6 Average scores for the Classical/Expressive unbalanced trial: (a) per design, (b) per no of objects, (c) per scale.

75

Table 4.10 ANOVA for average scores for the Classical/Expressive unbalanced trial.

Scale       Element  F     P-value
Classical   Objects  0.54  0.47
Expressive  Objects  3.23  0.08
Total       Objects  0.12  0.73



Table 4.11 ANOVA for balance and scales for the Classical/Expressive trials.

Balance (per scale)
Scale       F     P-value
Classical   4.67  0.003
Expressive  1.07  0.33
Total       5.41  0.002

Scales (per condition)
Condition   F     P-value
Balanced    7.63  0.005
Unbalanced  6.22  0.008

76

4.3.4 The VisAWI questionnaire trials

a. The balanced condition

Table 4.12 and Fig 4.7 summarize average scores for the questionnaire scales of this trial. The most obvious observation is that with all scales and the total, designs with the smaller number of objects (designs 1 and 2) were given higher average scores than designs with the larger number of objects (designs 3 and 4). The largest differences between average scores for number of objects were recorded in the simplicity scale (4.34 vs. 4.01) and the total (3.81 vs. 3.65).

With number of different sizes, in all scales (except colorfulness) and the total, design 4, associated with the largest number of objects and sizes, was given the lowest average score. Design 1, associated with the smallest numbers of objects and sizes, was not given the highest average score in all scales; in the simplicity scale the highest score was given to design 2, and in the craftsmanship scale the highest score was given to design 3.

When comparing average scores per scale, the highest average score (4.18) was recorded with the simplicity scale, followed by colorfulness with an average score of 3.95, then craftsmanship and diversity with average scores of 3.54 and 3.26 respectively.

The calculated Cronbach’s α values for all scales are between 0.60 and 0.99, all within the acceptable reliability limits.

Analysis of variance results for this trial are shown in Table 4.13. These results show that number of objects was significant only for the simplicity scale and the total (p-values = 0.008 and 0.028 respectively). Number of sizes was significant only for the craftsmanship scale (p-value = 0.025). Pair-wise comparisons showed that this effect is significant for the larger number of objects (16), associated with designs 3 and 4.

77

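Table 4.12 Descriptive statistics for average scores for the VisAWI balanced trial.

Scale          Design no  No of objects  No of sizes  Average per design  Average per no of objects  Average per scale  Standard deviation (per screen)
Simplicity     1          6              3            4.31                4.34                       4.18               1.08
Simplicity     2          6              5            4.38                4.34                       4.18               1.20
Simplicity     3          16             3            4.08                4.01                       4.18               1.15
Simplicity     4          16             11           3.94                4.01                       4.18               1.17
Diversity      1          6              3            3.37                3.33                       3.26               1.30
Diversity      2          6              5            3.30                3.33                       3.26               1.33
Diversity      3          16             3            3.28                3.20                       3.26               1.34
Diversity      4          16             11           3.11                3.20                       3.26               1.13
Colorfulness   1          6              3            4.03                3.99                       3.95               1.47
Colorfulness   2          6              5            3.95                3.99                       3.95               1.60
Colorfulness   3          16             3            3.88                3.91                       3.95               1.54
Colorfulness   4          16             11           3.94                3.91                       3.95               1.45
Craftsmanship  1          6              3            3.70                3.59                       3.54               1.19
Craftsmanship  2          6              5            3.48                3.59                       3.54               1.28
Craftsmanship  3          16             3            3.71                3.50                       3.54               1.20
Craftsmanship  4          16             11           3.28                3.50                       3.54               1.07
Total          1          6              3            3.85                3.81                       3.73               1.02
Total          2          6              5            3.78                3.81                       3.73               1.13
Total          3          16             3            3.74                3.65                       3.73               1.16
Total          4          16             11           3.57                3.65                       3.73               1.05

Cronbach’s α (all scales): 0.60-0.99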

78

Figure 4.7 Average scores for the VisAWI balanced trial: (a) per design, (b) per no of objects, (c) per scale.

79

Differences among participants were also significant in all cases (p-values < 0.001). No significant effects of either factor were found on diversity or colorfulness.

Highly significant differences were found among the different scales (p-value < 0.001), as the analysis of variance for scales in Table 4.16 (on page 85) shows. All pair-wise comparisons were significant, with simplicity given the highest average score (4.18), followed by colorfulness with an average score of 3.95, then craftsmanship with an average of 3.54, and last diversity with an average score of 3.26.

Table 4.13 ANOVA for average scores for the VisAWI balanced trial.

Scale          Element  F     P-value
Simplicity     Objects  7.51  0.008
Diversity      Objects  1.38  0.24
Colorfulness   Objects  0.66  0.42
Craftsmanship  Objects  0.60  0.44
Total          Objects  5.01  0.028



80

b. The unbalanced condition

Table 4.14 and Fig 4.8 summarize average scores for this trial. Compared to the previous trial, a completely reversed pattern was observed. Higher average scores were given to designs with larger numbers of objects and sizes. This was very clear with number of objects in all scales and the total; designs associated with the larger number of objects were given higher average scores.

With number of different sizes no clear pattern could be distinguished; however, in both the simplicity and the craftsmanship scales, design 8, associated with the largest numbers of objects and sizes, was given the highest average score.

Differences among scales are in the same order as in the previous trial. Simplicity was given the highest average score of 3.82 (Table 4.14), followed by colorfulness, craftsmanship, and diversity with average scores of 2.95, 2.83, and 2.53 respectively.

The calculated Cronbach’s α values for all scales are between 0.46 and 0.66, indicating that not all scales have reliabilities within the acceptable limits. Specifically, values for the craftsmanship scale were all below 0.50. Thus, results associated with this scale should be interpreted with extra caution.

Analysis of variance in Table 4.15 shows that effects of number of objects on average scores were significant in both the craftsmanship scale (p-value = 0.026) and the total (p-value = 0.043). However, this effect was reversed in this trial; designs with the larger number of objects were perceived as having a better aesthetics level than designs with the smaller number of objects. No significant effects of number of objects were found with the other scales.

No significant effects of number of sizes were found in any scale or in the total.

81

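Table 4.14 Descriptive statistics for average scores for the VisAWI unbalanced trial.

Scale          Design no  No of objects  No of sizes  Average per design  Average per no of objects  Average per scale  Standard deviation (per screen)
Simplicity     5          6              3            3.68                3.72                       3.82               1.79
Simplicity     6          6              5            3.77                3.72                       3.82               1.83
Simplicity     7          16             3            3.90                3.91                       3.82               1.76
Simplicity     8          16             11           3.93                3.91                       3.82               1.81
Diversity      5          6              3            2.43                2.42                       2.53               1.53
Diversity      6          6              5            2.41                2.42                       2.53               1.35
Diversity      7          16             3            2.77                2.64                       2.53               1.55
Diversity      8          16             11           2.51                2.64                       2.53               1.60
Colorfulness   5          6              3            3.02                2.88                       2.95               1.83
Colorfulness   6          6              5            2.74                2.88                       2.95               1.72
Colorfulness   7          16             3            3.15                3.02                       2.95               1.71
Colorfulness   8          16             11           2.89                3.02                       2.95               1.74
Craftsmanship  5          6              3            2.67                2.70                       2.83               1.71
Craftsmanship  6          6              5            2.72                2.70                       2.83               1.78
Craftsmanship  7          16             3            2.89                2.97                       2.83               1.91
Craftsmanship  8          16             11           3.04                2.97                       2.83               1.85
Total          5          6              3            2.95                2.93                       3.03               1.52
Total          6          6              5            2.91                2.93                       3.03               1.38
Total          7          16             3            3.18                3.13                       3.03               1.51
Total          8          16             11           3.09                3.13                       3.03               1.56

Cronbach’s α (all scales): 0.47-0.66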

82

Figure 4.8 Average scores for the VisAWI unbalanced trial: (a) per design, (b) per no of objects, (c) per scale.

83

Table 4.15 ANOVA for average scores for the VisAWI unbalanced trial.

Scale          Element  F     P-value
Simplicity     Objects  1.80  0.18
Diversity      Objects  3.36  0.071
Colorfulness   Objects  1.20  0.28
Craftsmanship  Objects  5.17  0.026
Total          Objects  4.28  0.043



84

Differences among the different scales were highly significant in this trial too (p-value < 0.001), as shown by the analysis of variance for scales in Table 4.16. All pair-wise comparisons were significant, with simplicity given the highest average score (3.82), followed by colorfulness with an average score of 2.95, then craftsmanship with an average of 2.83, and last diversity with an average score of 2.53.

Differences between the balanced and unbalanced designs were significant for all scales (Table 4.16). With all facets of visual aesthetics measured by the VisAWI questionnaire, the balanced designs were perceived as having significantly higher levels of aesthetics than the unbalanced designs. This was the case even with the colorfulness scale, despite the fact that all the designs in both conditions have identical colors. This could be evidence of how dominant the effect of balance on participants' perception of visual aesthetics is.

Table 4.16 ANOVA for balance and scales for the VisAWI trials.

Balance (per scale)
Scale          F      P-value
Simplicity     3.11   0.021
Diversity      7.45   < 0.001
Colorfulness   10.66  < 0.001
Craftsmanship  5.35   0.002
Total          8.11   < 0.001

Scales (per condition)
Condition      F      P-value
Balanced       49.40  < 0.001
Unbalanced     95.49  < 0.001

85

4.3.5 Overall discussion

The purpose of the experimental work presented in this chapter is to systematically study the effects of number of objects and number of different sizes of objects on perceived visual aesthetics of website design under balanced and unbalanced conditions. Several experimental trials were conducted with different groups of participants, two presentation methods, and two methods of collecting participants' responses.

The two presentation methods were abstract mock-ups of webpage layout designs and real webpage designs. The abstract mock-ups served as pilot tests for the webpage designs. They were used to help discover any required modifications to the experimental setup or the proposed designs before testing with the real webpage designs began. They also helped in obtaining an overall measure of participants' perception of visual aesthetics of the designs in an abstract setting with a minimum number of uncontrolled effects.

The two methods used to collect participants' perception of visual aesthetics were a single overall question and two standard questionnaires. The one-question was used with both the abstract mock-up designs and the real webpage designs. The standard questionnaires were used only with the real webpage designs.

This section gives a summary of the results of all experimental trials, with an overall discussion of these results. Table 4.17 summarizes the results of all trials for the two factors (number of objects and number of sizes) in the balanced and unbalanced conditions. The results for the two factors can be summarized as follows:

86

Table 4.17 Summary of results for all experimental trials.

Method         Trial               Condition   No of objects                                             No of sizes                                                       Balance
One-question   Mock-ups            Balanced    significant (smaller is better)                           significant (no of objects 16, smaller is better)                 significant
One-question   Mock-ups            Unbalanced  significant (smaller is better)                           significant (both levels of no of objects, smaller is better)     significant
One-question   Webpages            Balanced    significant (smaller is better)                           not significant                                                   not significant
One-question   Webpages            Unbalanced  not significant                                           not significant                                                   not significant
Questionnaire  Classic/Expressive  Balanced    significant (only with Classical, smaller is better)      significant (Classical, no of objects 16, smaller is better)      significant
Questionnaire  Classic/Expressive  Unbalanced  not significant                                           not significant                                                   significant
Questionnaire  VisAWI              Balanced    significant (Simplicity and total, smaller is better)     significant (Craftsmanship, no of objects 16, smaller is better)  significant
Questionnaire  VisAWI              Unbalanced  significant (Craftsmanship and total, larger is better)   not significant                                                   significant

87

Number of objects:

- In the balanced condition: Significant effects of number of objects on visual aesthetics were found in all trials, both with the one-question and with the questionnaire scales related to layout elements (classical in the classical/expressive questionnaire, and simplicity and total in the VisAWI questionnaire). All these effects indicate that decreasing the number of objects increases perceived visual aesthetics.

- In the unbalanced condition: Significant effects were found in only two trials: the mock-up designs trial and the VisAWI trial (craftsmanship scale and total). The direction of the effects observed with VisAWI was opposite to that in all other trials; designs with the larger (not smaller) number of objects were perceived as having better visual aesthetics.

Number of different sizes of objects:

- In the balanced condition: Significant effects were found in all trials except the one-question with webpages trial. With the two questionnaires, the significant effects were found with scales related to layout elements (classical in classical/expressive and craftsmanship in VisAWI). In these trials, the significant effects were observed only for designs with the larger number of objects.

- In the unbalanced condition: Significant effects were found only in the mock-up designs trial.

These results confirm the earlier findings of the correlation analyses presented in chapter 3: at high levels of balance, unity of form, represented by the two parameters (number of objects and number of different sizes), has significant effects on perceived visual aesthetics of website interface design. Increasing unity of form (by decreasing the number of objects and the number of sizes) in a website interface increases the level of perceived visual aesthetics. As in the correlation analyses, these effects are more evident on visual aesthetics dimensions related to interface layout design (classical, simplicity, and craftsmanship).

Although some significant effects were found in the unbalanced condition too, they were not consistent and could not be considered a general case. Also, with the craftsmanship dimension (measured by the VisAWI questionnaire) the significant effect was opposite in direction to the significant effects found in the balanced condition. It is not clear whether this “opposite” effect indicates a general phenomenon or is just a special case associated with the experimental setup and webpage designs of this experiment. However, the lower reliability of the questionnaire scale used to measure visual aesthetics in this case casts more doubt on these results.

The highly significant differences between the balanced and unbalanced designs found in almost all trials confirm the findings of the experiment presented in chapter 3 and support the findings of previous studies: vertical balance has a positive effect on perceived visual aesthetics of website interface design. These results also indicate that the manipulation of the designs in this experiment was successful in creating the balanced and unbalanced conditions.

For further insight, correlation coefficients were calculated between the average scores of all the subjective measures used in all trials. Table 4.18 summarizes average scores for all trials. Correlations between these scores are shown in Table 4.19 and Table 4.20. In Table 4.19 correlations were calculated using average scores for both the balanced and the unbalanced conditions (all eight designs). In Table 4.20 correlations were calculated only for the balanced condition (first four designs).
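
As a rough check on how such coefficients are obtained, the following Python sketch computes the Pearson correlation between the mock-ups scores and the Classical/Expressive total scores listed in Table 4.18, once for all eight designs and once for the four balanced designs. It assumes that the reported coefficients are Pearson correlations; because it works from the rounded two-decimal averages, the results can differ slightly in the third decimal from values computed on unrounded data. The function and variable names are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Average scores per design (designs 1 to 8) taken from Table 4.18.
mockups = [5.24, 4.52, 4.72, 3.59, 5.00, 3.90, 4.24, 3.21]
ce_total = [3.79, 3.67, 3.71, 3.49, 3.23, 3.30, 3.36, 3.21]

r_all_designs = pearson_r(mockups, ce_total)           # all eight designs
r_balanced = pearson_r(mockups[:4], ce_total[:4])      # designs 1 to 4 only
```

With only four data points in the balanced case, a coefficient has to be very close to 1 in absolute value to reach statistical significance, which is worth keeping in mind when comparing the two tables.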

Table 4.19 shows highly significant correlations between all scales of the two questionnaires (classic/expressive and VisAWI). The only exception is the expressive scale, which did not correlate with any other scale. No significant correlations were found between average scores of the one-question and the average scores of any of the scales of the two questionnaires.

Correlation coefficients calculated for only the balanced designs in Table 4.20 show very high and significant correlations between average scores of the one-question in the mock-ups trial and total average scores of the two questionnaires. These significant correlations did not show up when all designs were considered in Table 4.19. This might be because results in the unbalanced condition were not consistent between the mock-ups trial and the two questionnaire trials, while in the balanced condition results were more consistent between trials.

Overall, Table 4.20 (balanced designs only) shows lower correlations and fewer significant cases than Table 4.19. This suggests that, with the experimental setup used, the effect of vertical balance is more dominant than the effects of unity of form. Thus, when correlation coefficients are calculated for all the balanced and unbalanced designs together, this dominant effect of balance might have masked the effects of the other two factors and prevented them from being clearly visible in the calculated correlation coefficients.

The next step is to examine correlations between the objective layout-based measures and the subjective questionnaire-based measures. This is part of the content of the next section.

90

Table 4.18 Summary of average scores for all experimental trials.

                      Classical/expressive            VisAWI
Design no  Mock-ups   Classic  Expressive  Average    Simplicity  Diversity  Colorfulness  Craftsmanship  Average
1          5.24       4.47     3.12        3.79       4.31        3.37       4.03          3.70           3.85
2          4.52       4.40     2.94        3.67       4.38        3.30       3.95          3.48           3.78
3          4.72       4.38     3.05        3.71       4.08        3.28       3.88          3.71           3.74
4          3.59       3.87     3.12        3.49       3.94        3.11       3.94          3.28           3.57
5          5.00       3.53     2.93        3.23       3.68        2.43       3.02          2.67           2.95
6          3.90       3.71     2.89        3.30       3.77        2.41       2.74          2.72           2.91
7          4.24       3.62     3.11        3.36       3.90        2.77       3.15          2.89           3.18
8          3.21       3.40     3.02        3.21       3.93        2.51       2.89          3.04           3.09

91

Table 4.19 Correlation coefficients of average scores for all eight designs (balanced and unbalanced).

                      One-question          Classic/Expressive               VisAWI
                      Mock-ups   Webpages   Classic   Expressive  Total      Simplicity  Diversity  Colorfulness  Craftsmanship
One-question
  Mock-ups            -
  Webpages            0.565
Classic/Expressive
  Classic             0.589      0.443
  Expressive          -0.011     -0.583     0.212
  Total               0.551      0.300      0.981*    0.398
VisAWI
  Simplicity          0.335      0.455      0.855*    0.301       0.863*
  Diversity           0.415      0.152      0.913*    0.523       0.961*     0.873*
  Colorfulness        0.408      0.106      0.863*    0.511       0.912*     0.797**     0.975*
  Craftsmanship       0.343      0.213      0.888*    0.490       0.931*     0.866*      0.943*     0.896*
  Total               0.396      0.205      0.914*    0.495       0.957*     0.897*      0.993*     0.971*        0.965*

* Significant at 0.01, ** Significant at 0.05

92

Table 4.20 Correlation coefficients of average scores for the balanced condition.

                      One-question          Classic/Expressive               VisAWI
                      Mock-ups   Webpages   Classic   Expressive  Total      Simplicity  Diversity  Colorfulness  Craftsmanship
One-question
  Mock-ups            -
  Webpages            0.744
Classic/Expressive
  Classic             0.945**    0.892
  Expressive          -0.084     -0.675     -0.404
  Total               0.999*     0.746      0.953**   -0.108
VisAWI
  Simplicity          0.706      0.963**    0.808     -0.557      0.694
  Diversity           0.980**    0.861      0.976**   -0.250      0.978**    0.830
  Colorfulness        0.413      0.307      0.238     0.320       0.365      0.535       0.435
  Craftsmanship       0.926      0.553      0.870     -0.016      0.940      0.435       0.856      0.092
  Total               0.967**    0.875      0.963**   -0.256      0.961**    0.863       0.997*     0.492         0.816


93

4.4 Comparing Objective Measures with Subjective Measures

In this section, correlation analysis is conducted between the objective screen layout-based measures and the subjective questionnaire-based measures. The purpose of this analysis is to further validate the measures and models presented in chapter 3 and to further investigate the relationship between the objective and subjective measures in general. This further validation would help in obtaining a better understanding of the relationship between objective and subjective measures, and would help improve the computational formulas and models presented in this study.

4.4.1 Correlation analysis

The values of the objective measures for the eight designs are summarized in Table 4.21. The objective measures included in the table are the same as the ones presented and tested in chapter 3. The same formulas and procedures as in chapter 3 were used to calculate the values of these measures for the eight designs.

Correlation analysis is shown in Table 4.22. High and significant correlations were found between the measures of balance, unity of layout, sequence, and the Ngo model, and the average scores of most of the subjective questionnaire measures. Surprisingly, no high or significant correlations were found with number of objects, number of sizes, unity of form, or the interaction model. This contradicts the significant effects that unity of form showed in the results of the experimental work in the previous sections.

The dominant effect of vertical balance is most likely the reason for this lack of correlation. This is supported by the negative correlations observed with unity of layout.

94

Looking at the values of unity of layout in Table 4.21, it can be seen that these values are higher in the unbalanced condition (designs 5 to 8). These higher values should have increased levels of visual aesthetics (measured by questionnaire scores), not decreased them as the negative correlations indicate. This suggests that the low levels of visual aesthetics recorded in the unbalanced condition are largely due to the dominant effect of balance. This dominant effect of balance could have marginalized the effects of unity.

Table 4.21 Values of measures and models for the eight designs.

Design no  No of objects  No of different sizes  Balance  Unity (form)  Unity (layout)  Unity (average)  Sequence  Interaction model  Ngo model
1          6              3                      1        0.67          0.17            0.42             1         0.63               0.26
2          6              5                      0.98     0.33          0.16            0.25             1         0.54               0.25
3          16             3                      1        0.88          0.16            0.52             1         0.68               0.26
4          16             11                     0.99     0.38          0.16            0.27             1         0.55               0.24
5          6              3                      0.62     0.67          0.29            0.48             0.75      0.56               0.21
6          6              5                      0.61     0.33          0.27            0.30             0.75      0.51               0.20
7          16             3                      0.63     0.88          0.27            0.57             0.75      0.59               0.21
8          16             11                     0.62     0.38          0.27            0.32             0.75      0.52               0.19
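
The unity of form values in Table 4.21 are consistent with a formula of the form 1 - (number of sizes - 1) / (number of objects), and the unity average is consistent with the mean of the form and layout components. The Python sketch below is written under that assumption; the formula is inferred here from the tabulated values rather than quoted from chapter 3, and the function names are illustrative only.

```python
def unity_of_form(n_objects, n_sizes):
    """Assumed unity of form: 1 - (n_sizes - 1) / n_objects.

    Inferred from the values in Table 4.21, not quoted from chapter 3.
    """
    return 1 - (n_sizes - 1) / n_objects


def unity_average(form, layout):
    """Assumed unity average: mean of the form and layout components."""
    return (form + layout) / 2


# (number of objects, number of sizes) for the four distinct layouts in Table 4.21.
layouts = [(6, 3), (6, 5), (16, 3), (16, 11)]
form_values = [unity_of_form(n, s) for n, s in layouts]

# Design 1: unity of layout is 0.17 in Table 4.21.
design_1_average = unity_average(unity_of_form(6, 3), 0.17)
```

Under this formula a design with more objects but the same number of sizes receives a higher unity of form score (0.88 for 16 objects versus 0.67 for 6 objects with 3 sizes), which runs against the experimental finding that fewer objects were preferred in the balanced condition.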

95

To explore this further, correlations for only the balanced condition were calculated and are presented in Table 4.23. With only the balanced designs considered (designs 1 to 4), high and significant correlations were found with number of objects and number of sizes. All of the significant correlations are with average scores of questionnaire scales related to layout design. With unity of form, higher correlations were found but none was statistically significant.

No noticeable changes in correlations with the interaction model were shown. However, contrary to the case of all designs, in this case the Ngo model shows lower and non-significant correlations.

4.4.2 Proposed modification to the unity of form formula

Since effects of number of objects and number of sizes were more evident in the balanced

case (indicated by the high correlations) and since low correlations were found with the

unity of form and with the two computational models, it is possible that these effects are

not represented adequately by the unity of form formula and consequently the two

computational models that incorporated this formula.

To investigate this possibility, first, differences between average total scores for both

questionnaires were calculated for each pair of counterpart balanced and unbalanced

designs. Calculations are shown in Table 4.24. Second, correlations of these differences

with associated values of unity of form and the two parameters were computed (shown in

Table 4.25).

Subtracting the average scores of the designs in the unbalanced condition from the average

scores of the designs in the balanced condition should remove the effect of balance and

produce values that represent effects of number of objects and sizes more clearly.

Table 4.22 Correlation coefficients of measures and models for all eight designs (balanced and unbalanced).

Measure Mock-ups



No of objects -0.549 -0.259 0.608 -0.122 -0.157 0.053 0.030 0.113 0.030

No of sizes -0.860* -0.386 0.161 -0.330 -0.118 -0.155 -0.073 -0.074 -0.105

Balance 0.345 0.887* 0.427 0.918* 0.783* 0.958* 0.980* 0.916* 0.963*

Unity of form 0.571 0.159 0.426 0.234 -0.040 0.196 0.130 0.175 0.138

Unity of layout -0.236 -0.870* -0.411* -0.898* -0.792* -0.946* -0.958* -0.912* -0.952*

Unity average 0.488 -0.065 0.306 0.000 -0.237 -0.049 -0.115 -0.061 -0.107

Sequence 0.327 0.886* 0.399 0.911* 0.784* 0.950* 0.975* 0.909* 0.957*

Interaction model 0.637 0.619 0.510 0.682 0.359 0.626 0.569 0.663 0.598

Ngo model 0.570 0.848* 0.561 0.908* 0.663 0.923* 0.915* 0.877* 0.902*


Table 4.23 Correlation coefficients of measures and models for the balanced condition.

Measure Mock-ups



No of objects -0.605 -0.643 0.367 -0.578 -0.952** -0.725 -0.749 -0.268 -0.775

No of sizes -0.949** -0.972** 0.279 -0.963** -0.646 -0.936 -0.114 -0.957** -0.907

Unity of form 0.593 0.481 0.287 0.618 -0.118 0.445 -0.247 0.843 0.380

Unity average 0.604 0.488 0.294 0.628 -0.107 0.456 -0.229 0.848 0.391

Interaction model 0.604 0.487 0.298 0.628 -0.109 0.455 -0.226 0.848 0.391

Ngo model 0.602 0.479 0.316 0.625 -0.115 0.451 -0.214 0.844 0.387


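Table 4.24 Differences between average total scores for each design pair.

| Design pair | Difference between average total scores (Classical/Expressive) | Difference between average total scores (VisAWI) | Number of objects | Number of sizes | Unity of form |
|---|---|---|---|---|---|
| 1-5 | 0.90 | 0.57 | 6 | 3 | 0.67 |
| 2-6 | 0.87 | 0.37 | 6 | 5 | 0.33 |
| 3-7 | 0.56 | 0.35 | 16 | 3 | 0.88 |
| 4-8 | 0.48 | 0.28 | 16 | 11 | 0.38 |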

Table 4.25 Correlation coefficients for values of unity of form computed by the original formula.

| Measure | Scores (Classical/Expressive) | Scores (VisAWI) |
|---|---|---|
| Number of objects | -0.985 | -0.713 |
| Number of sizes | -0.601 | -0.671 |
| Unity of form | -0.119 | 0.338 |

High correlations are shown in Table 4.25 with number of objects and number of sizes

but not with unity of form. This further supports the possibility that the formula does not

adequately represent the combined effects of number of objects and number of sizes.

The formula of unity of form with examples of calculation is shown in Appendix A.

The formula is reprinted below (equation 4.1).

$$UM_{form} = 1 - \frac{n_{sizes} - 1}{n} \qquad (4.1)$$

where $n_{sizes}$ stands for the number of sizes used, and $n$ is the number of objects on the frame.
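As a quick check of the original formula, the sketch below recomputes the unity of form values for the four object/size combinations used in the designs. It assumes the formula has the form UM_form = 1 - (n_sizes - 1) / n, which is consistent with the values listed in Table 4.21 and with the worked example in Appendix A; the function name `unity_of_form` is only illustrative.

```python
def unity_of_form(n_objects: int, n_sizes: int) -> float:
    """Original unity of form, assuming UM_form = 1 - (n_sizes - 1) / n."""
    if n_objects <= 0 or not 1 <= n_sizes <= n_objects:
        raise ValueError("need n_objects >= 1 and 1 <= n_sizes <= n_objects")
    return 1.0 - (n_sizes - 1) / n_objects


# (number of objects, number of sizes) for designs 1 to 4;
# designs 5 to 8 repeat the same combinations in the unbalanced condition.
DESIGNS = {1: (6, 3), 2: (6, 5), 3: (16, 3), 4: (16, 11)}

for design, (n_objects, n_sizes) in DESIGNS.items():
    value = unity_of_form(n_objects, n_sizes)
    print(f"design {design}: unity of form = {value:.3f}")
```

Rounded to two decimals, these values should agree with the unity of form column in Table 4.21.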

After careful examination of the formula and comparison of the values calculated by the

formula for the designs with the differences in Table 4.24, a modified formula was

proposed. The modified formula is presented below (equation 4.2).

(4.2)

This modified formula should better represent the combined effects of both number of

objects and number of different sizes, the two input parameters in the formula.

To validate this formula, the unity of form values were recalculated using the modified

formula for the designs and correlation analysis was conducted using these values. These

values are presented in Table 4.26 with the original information from Table 4.24.

Correlations are shown in Table 4.27. It is clear that values computed by the modified

formula have produced much higher correlations than values computed by the original

formula. Next, the modified formula will be incorporated into the two computational

models and validated using subjective measures. This is the subject of the next section.

Table 4.26 Comparing differences with values of unity of form of both original and modified formula

| Design pair | Difference between average total scores (Classical/Expressive) | Difference between average total scores (VisAWI) | Number of objects | Number of sizes | Unity of form (original) | Unity of form (modified) |
|---|---|---|---|---|---|---|
| 1-5 | 0.90 | 0.57 | 6 | 3 | 0.67 | 0.44 |
| 2-6 | 0.87 | 0.37 | 6 | 5 | 0.33 | 0.33 |
| 3-7 | 0.56 | 0.35 | 16 | 3 | 0.88 | 0.38 |
| 4-8 | 0.48 | 0.28 | 16 | 11 | 0.38 | 0.15 |

Table 4.27 Correlation coefficients for values of unity of form computed by the modified formula.

| Measure | Scores (Classical/Expressive) | Scores (VisAWI) |
|---|---|---|
| Number of objects | -0.985 | -0.713 |
| Number of sizes | -0.601 | -0.671 |
| Unity of form (original) | -0.119 | 0.338 |
| Unity of form (modified) | 0.710 | 0.822 |
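The correlation analysis above can be retraced with a few lines of code. The sketch below computes Pearson correlation coefficients between the score differences and each measure, using the rounded values printed in Table 4.26 (Classical/Expressive and VisAWI differences, number of objects, number of sizes, and both versions of unity of form). Because the inputs are rounded to two decimals, the results should only be expected to approximate the coefficients in Table 4.27. The helper `pearson` is written out here for illustration and is not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Differences between balanced and unbalanced designs (pairs 1-5, 2-6, 3-7, 4-8).
diff_classical_expressive = [0.90, 0.87, 0.56, 0.48]
diff_visawi = [0.57, 0.37, 0.35, 0.28]

measures = {
    "Number of objects": [6, 6, 16, 16],
    "Number of sizes": [3, 5, 3, 11],
    "Unity of form (original)": [0.67, 0.33, 0.88, 0.38],
    "Unity of form (modified)": [0.44, 0.33, 0.38, 0.15],
}

for name, values in measures.items():
    r_ce = pearson(values, diff_classical_expressive)
    r_vis = pearson(values, diff_visawi)
    print(f"{name:26s} {r_ce:+.3f} {r_vis:+.3f}")
```

With only four design pairs, each coefficient rests on four data points, so small rounding differences in the inputs shift the results in the second decimal place.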

4.4.3 Incorporating the modified unity of form formula into the computational model

The modified formula of unity of form was used to recalculate the values of unity for the

eight black and white screens of the experiment presented in chapter 3. The regression

(interaction) model was refit using these new values. The new model is given below

(Equation 4.3).

Aesthetic Value = 0.609 - 0.086 B - 0.645 U - 0.139 S + 0.743 B*U + 0.648 U*S (4.3)

Where:-

B : Balance

U : Unity

S : Sequence

Table 4.28 shows values of the three measures (balance, unity, and sequence) for the

eight screens. Values of unity are shown for both the original and the modified formula.

Both actual and predicted aesthetic values for each of the eight screens are shown in the

table. Predicted values for both the original model (Equation 3.1) and the modified model

(Equation 4.3) are listed in the table. A correlation coefficient of 0.97 (p-value < 0.001)

was found between actual and predicted values calculated by the modified model. It is

slightly lower than the value calculated for the original model (0.99); however, both are

statistically significant at the same level.

Table 4.28 Values of measures and actual and predicted aesthetic values of the eight screens for the original and the modified model.

| Screen no | Balance | Unity (original) | Unity (modified) | Sequence | Aesthetic value (actual) | Predicted aesthetic value (original) | Predicted aesthetic value (modified) |
|---|---|---|---|---|---|---|---|
| 1 | 1.00 | 0.99 | 0.99 | 1.00 | 0.908 | 0.920 | 1.120 |
| 2 | 0.10 | 0.18 | 0.27 | 0.00 | 0.438 | 0.453 | 0.445 |
| 3 | 0.98 | 0.24 | 0.33 | 0.00 | 0.485 | 0.519 | 0.552 |
| 4 | 0.91 | 0.80 | 0.73 | 0.25 | 0.654 | 0.621 | 0.635 |
| 5 | 0.09 | 0.82 | 0.76 | 1.00 | 0.546 | 0.528 | 0.516 |
| 6 | 0.04 | 0.15 | 0.24 | 1.00 | 0.415 | 0.441 | 0.474 |
| 7 | 1.00 | 0.15 | 0.24 | 1.00 | 0.515 | 0.493 | 0.565 |
| 8 | 0.25 | 0.78 | 0.72 | 0.25 | 0.354 | 0.409 | 0.340 |
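The predicted values of the modified model can be reproduced directly from Equation 4.3. The sketch below evaluates the equation for the eight screens using the balance, modified unity, and sequence values printed in Table 4.28, and then correlates the predictions with the actual aesthetic values. The function and variable names are illustrative only, and because the table inputs are rounded to two decimals, predictions may differ from the tabulated ones in the third decimal place.

```python
from math import sqrt


def aesthetic_value(balance, unity, sequence):
    """Modified interaction model; coefficients copied from Equation 4.3."""
    return (0.609 - 0.086 * balance - 0.645 * unity - 0.139 * sequence
            + 0.743 * balance * unity + 0.648 * unity * sequence)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# (balance, modified unity, sequence, actual aesthetic value) from Table 4.28
SCREENS = [
    (1.00, 0.99, 1.00, 0.908),
    (0.10, 0.27, 0.00, 0.438),
    (0.98, 0.33, 0.00, 0.485),
    (0.91, 0.73, 0.25, 0.654),
    (0.09, 0.76, 1.00, 0.546),
    (0.04, 0.24, 1.00, 0.415),
    (1.00, 0.24, 1.00, 0.515),
    (0.25, 0.72, 0.25, 0.354),
]

predicted = [aesthetic_value(b, u, s) for b, u, s, _ in SCREENS]
actual = [row[3] for row in SCREENS]
r = pearson(actual, predicted)

for number, value in enumerate(predicted, start=1):
    print(f"screen {number}: predicted {value:.3f}")
print(f"correlation between actual and predicted: {r:.2f}")
```

Note that the prediction for screen 1 exceeds 1, as it does in the table, so the model output is not bounded to the range of the actual scores.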

Now the correlation analysis in section 4.4.1 will be repeated for the modified formula

and model. Table 4.29 gives the values of the three measures and the associated values

calculated by the modified interaction model in equation (4.3) for the eight webpage

designs of this chapter.

Table 4.30 and Table 4.31 show correlation results with the questionnaire scores for all

designs and balanced designs respectively. The two tables are reprints of Table 4.22 and

Table 4.23 with correlation values for the modified interaction model added to them.

Comparing correlations for the modified model to the original, it is clear that the

modified model produced higher and more significant correlations for both all designs

(Table 4.30) and the balanced designs (Table 4.31).

Table 4.29 Values of the three measures and the model for the eight webpage designs.

| Design no | No of objects | No of different sizes | Balance | Unity (form) | Unity (layout) | Unity (average) | Sequence | Interaction model (modified) |
|---|---|---|---|---|---|---|---|---|
| 1 | 6 | 3 | 1 | 0.44 | 0.17 | 0.31 | 1 | 0.74 |
| 2 | 6 | 5 | 0.98 | 0.33 | 0.16 | 0.25 | 1 | 0.70 |
| 3 | 16 | 3 | 1 | 0.38 | 0.16 | 0.27 | 1 | 0.72 |
| 4 | 16 | 11 | 0.99 | 0.15 | 0.16 | 0.15 | 1 | 0.63 |
| 5 | 6 | 3 | 0.62 | 0.44 | 0.29 | 0.37 | 0.75 | 0.62 |
| 6 | 6 | 5 | 0.61 | 0.33 | 0.27 | 0.30 | 0.75 | 0.60 |
| 7 | 16 | 3 | 0.63 | 0.38 | 0.27 | 0.32 | 0.75 | 0.61 |
| 8 | 16 | 11 | 0.62 | 0.15 | 0.27 | 0.21 | 0.75 | 0.58 |

Table 4.30 Correlation coefficients of the models for all eight designs (balanced and unbalanced).

Measure Mock-ups



Unity of form 0.918* 0.367 -0.144 0.315 0.107 0.135 0.080 0.060 0.097

Unity Average 0.703** -0.052 -0.331 -0.114 -0.236 -0.295 -0.344 -0.337 -0.325

Interaction model (original) 0.637 0.619 0.510 0.682 0.359 0.626 0.569 0.663 0.598

Interaction model (modified) 0.721** 0.956* 0.278 0.953* 0.786** 0.870* 0.832* 0.855* 0.872*

Ngo model 0.570 0.848* 0.561 0.908* 0.663 0.923* 0.915* 0.877* 0.902*


Table 4.31 Correlation coefficients of the models for the balanced condition

Measure Mock-ups



Unity of form 0.993* 0.963** -0.149 0.998* 0.688 0.975** 0.307 0.950** 0.955*

Unity Average 0.996* 0.965** -0.153 1.00* 0.717 0.984** 0.350 0.934** 0.968**

Interaction model (original) 0.604 0.487 0.298 0.628 -0.109 0.455 -0.226 0.848 0.391

Interaction model (modified) 0.997* 0.950** -0.101 0.999* 0.682 0.974** 0.350 0.947** 0.956**

Ngo model 0.602 0.479 0.316 0.625 -0.115 0.451 -0.214 0.844 0.387


The final step to complete the validation is to use the modified model to calculate

aesthetic values for the 42 webpages used to validate the original model in chapter 3.

Table 4.32 shows correlations calculated between values of the modified model and

average questionnaire scores for the webpages. Correlations of the original model and

Ngo and Byrne model are reprinted here for comparison. Values in the table show that

the modified model gave lower correlations than the original model. Nevertheless, in

most cases, levels of statistical significance were close between the two models.

Table 4.32 Correlation coefficients of the models for all the 42 webpages of chapter 3.

Measure Classical/expressive VisAWI

Interaction model (original) 0.600* 0.189 0.524* 0.712* 0.163 0.316** 0.434* 0.491*

Interaction model (modified) 0.493* 0.172 0.440* 0.614* 0.080 0.228 0.333** 0.383**

Ngo model 0.539* 0.151 0.460* 0.657* 0.143 0.325** 0.347** 0.446*


CHAPTER 5

CONCLUSIONS AND FUTURE WORK

5.1 Summary of Experimental Work and Results

The first two objectives of this study (stated in section 1.3.2) included verifying findings

of previous studies, and validating and exploring the possibility of improving the

available measures and models of visual aesthetics of computer interfaces. To accomplish

these objectives, several experiments were designed and conducted using rigorous

statistical testing and design of experiment techniques. The first part of the experimental

work was presented in chapter 3. An experiment was designed and conducted to

investigate effects of three elements of screen layout (balance, unity, and sequence) on

the perceived interface aesthetics. Results showed that the three elements have significant

effects on perceived interface aesthetics. Significant effects of interactions among the

three elements were also found. These results confirmed findings of previous studies

(Ngo and Byrne, 2001; Ngo et al., 2003).

A regression model relating perceived visual aesthetics to the three elements was

constructed. The model represents a compact version of the original model developed by

Ngo and Byrne (2001). The model was validated using two methods. First, it was

cross-validated with the original model developed by Ngo and Byrne (2001). Second, it was

validated using subjective standard questionnaire scores of real webpages.

The comparison with Ngo and Byrne's model indicates that, although the model has fewer

terms, it was still capable of producing aesthetic values within the same level

of statistical significance as the original model. This also further confirms findings of

the Ngo and Byrne (2001) and Ngo et al. (2003) studies.

When validating the model using standard questionnaire scores of real webpages, high

correlations were found between the values computed by the model and scores of

questionnaire items related to visual layout of the webpages. This indicates that although

the formulas used in this study were originally developed for data entry screens, they can

also be applied to websites. It also indicates that the layout-based measures tested in this

study can adequately predict aesthetic aspects related to the classical and the simplicity

dimensions of website aesthetics.

The validation of the regression model using subjective questionnaire scores of real

webpages helped achieve the other two objectives of this study (section 1.3.2): objective 3,

to see if the formulas and associated measures and models of interface design would work

with website interface design, and objective 4, to compare objective layout-based

measures of visual aesthetics with subjective questionnaire-based measures.

To further confirm findings of chapter 3 regarding application of the objective measures

and models to website interface design, more experiments were designed and conducted.

This was covered in chapter 4. The purpose of the experimental work presented in that

chapter was to systematically study effects of number of objects and number of different

sizes of objects (as parameters of unity of form) on perceived visual aesthetics of website

interface design under the conditions of balanced and unbalanced designs.

Several experimental trials were conducted with various settings and with different

groups of participants. The experimental settings included the use of abstract mock-ups of

layout designs of webpages and the use of real webpage designs. Two methods were used

to collect participants' perception of visual aesthetics: a single overall question and two

standard questionnaires. The standard questionnaires were used only with the real

webpage designs. The two questionnaires were the Classical/Expressive questionnaire

and the VisAWI questionnaire.

Results of these experiments confirmed the earlier findings in chapter 3: at high levels

of balance, unity of form (represented by the two parameters, number of objects and

number of different sizes) has significant effects on perceived visual aesthetics of website

interface design. These results also indicate that these effects are more evident on visual

aesthetics dimensions related to interface layout design. Furthermore, results also

confirmed that vertical balance has a positive effect on perceived visual aesthetics of

website interface design.

Part of the experimental work in chapter 4 involved performing correlation analysis to

compare objective layout-based measures with subjective questionnaire-based measures.

As in chapter 3, results of the comparison showed high correlations between the measures

and the models, and the questionnaire scales related to screen layout. This further

confirms findings of chapter 3 regarding this point.

Observations from this comparison were the basis of a proposed modification to the

unity of form formula and consequently the regression model developed in chapter 3.

This modification was proposed in order to improve the unity of form formula so it would

express the combined effects of number of objects and number of sizes more adequately.

Compared to the original model, the modified model incorporating this formula showed

better performance with the webpage designs used in chapter 4 and an acceptable

performance with almost the same level of statistical significance as the original model in

the case of screens and webpages of chapter 3.

Preliminary results of this study have already been reported and published in several

articles (Altaboli and Lin, 2010, 2011a, 2011b, & 2012).

5.2 Conclusions and Contributions

Results of this study confirmed findings of previous studies regarding the possibility of

using visual features of the interface to predict perceived visual aesthetics and further

support the concept of expressing such features using mathematical formulas and using them

in turn as a basis to develop computational models to predict visual aesthetics of interface

design. Results of the study also proved that screen layout-based measures can work

with website interfaces. Moreover, results of the study showed that objective screen

layout-based measures relate to subjective questionnaire-based measures. The

relationship is particularly strong with questionnaire elements related to screen layout

elements. This suggests that objective layout-based measures could be used to generally

assess the overall visual aesthetics of websites and particularly aesthetic aspects related to

classical and simplicity dimensions of website aesthetics.

The following points give more specific statements of the main conclusions drawn from

results of this study and contributions added to the knowledge base in the field:-

• The three layout-based elements of balance, unity, and sequence have significant effects on perceived visual aesthetics. These three elements were measured using the mathematical formulas developed by Ngo et al. (2003). They were utilized to develop a compact computational model to predict visual aesthetics. This model performed within good levels of accuracy in all the validation procedures in this study.

• The three elements and the model proved to work as well when their application was extended to the case of website interface design.

• Balance and unity of form (represented by number of objects and number of different sizes on the screen) have significant effects on perceived visual aesthetics of website interface design. Higher levels of balance with fewer objects and sizes will significantly increase levels of perceived visual aesthetics.

• With website interface design, effects of balance are more dominant than effects of the other tested elements. Effects of unity of form are more evident at high levels of balance. At low levels of balance, effects of unity of form are not significant.

• The objective layout-based measures strongly correlate with the subjective questionnaire-based measures related to screen layout, indicating that objective layout-based measures could be used to measure perceived visual aesthetics dimensions related to screen layout elements.

5.3 Recommendations for Future Work

• Website interface was the main type of interface tested in this study. Hence, further testing with other types of interfaces is needed for the findings of this study to be generalized to all types of graphical user interfaces. It would be particularly interesting to see how these findings would work with today's widespread types of interfaces and screens (e.g. smart phones).

• The procedure used to divide the webpages into visual objects was somewhat arbitrary, based on a personal perception of the pages. Standard criteria and systematic methods should be established to make it easy to apply the formulas to any webpage. Establishing such standards and procedures would simplify automating the process using computer software.

• The formulas used to calculate the three elements don't include the effect of color. Ngo et al. (2003) suggested adding the effect of colors as part of the balance element, with darker colors given more weight; however, it was not clear how to apply this in practice. The challenge is still open to develop practical methods to express effects of colors on visual aesthetics using numerical values.

• As quantitative measures of visual aesthetics are becoming more reliable, the next step should be to study potential effects of visual aesthetics on performance. Possible positive effects of visual aesthetics on performance have been reported in the literature; however, some studies also reported possible negative effects and argue that context of use may play a role in this regard. The possible influence of context of use on producing positive or negative effects of aesthetics on performance should be further examined.

• One limitation that this study encountered was the practical difficulty in manipulating the values of the tested measures to completely match the theoretically designed experimental levels; changing the position of one visual object on the screen would change the values of more than one measure at the same time. This forced the experiments to be designed and conducted with limited numbers of measures and imposed many limitations on the associated number of factors and levels.

REFERENCES

Altaboli, A. and Lin, Y. 2010, Experimental investigation of effects of balance, unity, and

sequence on interface and screen design aesthetics, in: Blashki, K. (Ed.), Proceedings of

The IADIS International Conference Interface and Human Computer Interaction 2010,

Freiburg, Germany, IADIS Press, pp. 243-250.

Altaboli, A., Lin, Y., Ali, M., Alterhony, Y., 2010, Using performance measures to assess the

effect of visual aesthetics on usability, in: Khalid, H., Hedge, A., Ahram, T. (Eds.),

Advances in Ergonomics Modeling and Usability Evaluation, CRC Press, Taylor & Francis

Group, pp. 107-116.

Altaboli, A., and Lin, Y., 2011a, Investigating effects of screen layout elements on interface and

screen design aesthetics. Advances in Human-Computer Interaction, vol. 2011, Article ID

659758, 10 pages, 2011. doi:10.1155/2011/659758.

Altaboli, A. and Lin, Y. 2011b, Objective and subjective measures of visual aesthetics of website

interface design: the two sides of the coin. In Proceedings of the 14th international

Conference on Human-Computer interaction: Design and Development Approaches -

Volume Part I (Orlando, FL, July 09 - 14, 2011). J. A. Jacko, Ed. Springer-Verlag, Berlin,

Heidelberg, 35-44.

Altaboli, A. and Lin, Y. 2012, Effects of unity of form on visual aesthetics of website design, to

be presented at the 4th international Conference on Applied Human Factors and Ergonomics

(AHFE 2012), July 21-25, 2012, San Francisco, CA U.S.A.

Bailey, R., 1982, Human Performance Engineering. First Edition, Prentice-Hall, Englewood

Cliffs, New Jersey.

Bauerly, M. and Liu, Y. 2006, Computational modeling and experimental investigation of effects

of compositional elements on interface and design aesthetics. Int. J. Human-Computer

Studies, 64, 670–682

Bauerly, M. and Liu, Y. 2008, Effects of symmetry and number of compositional elements on

interface and design aesthetics. International Journal of Human-Computer Interaction, 24: 3,

275 — 287

Ben-Bassat, T., Meyer, J., Tractinsky, N., 2006. Economic and subjective measures of the

perceived value of aesthetics and usability, ACM Transactions on Computer-Human

Interaction 13 (2), 210–234.

Bi, L., Fan, X., Liu, Y., 2011, Effects of symmetry and number of compositional elements on

chinese users' aesthetic ratings of interfaces: experimental and modeling investigations,

International Journal of Human-Computer Interaction, 27:3, 245-259

Birkhoff, G., 1933. Aesthetic Measure. Harvard University Press, Cambridge, MA.

Cawthon, N., and Vande Moere, A., 2007, The Effect of aesthetic on the usability of data

visualization, 11th International Conference Information Visualization (IV'07).

Chand, D., Dooley, L., and Tuovinen, E., 2002. Gestalt theory in visual screen design – a new

look at an old subject. Australian Computer Society, Inc. presented at the Seventh World

Conference on Computer in Education, Copenhagen, Denmark, 2001.

Comber, T and Maltby, JR. 1995. Evaluating usability of screen designs with layout complexity.

in H Hasan & C Nicastri (eds) , Proceedings of HCI, a light into the future : OZCHI '95 ,

CHISIG Australia, Downer, ACT.

De Angeli, A., Sutcliffe, A., Hartmann, J., 2006. Interaction, usability and aesthetics: what

influences users’ preferences?. In: Proceedings of the Sixth ACM Conference on Designing

Interactive Systems, PA, June 2006.

Djamasbi, S., Siegel, M., Tullis, T., 2010, Generation Y, web design, and eye tracking, Int. J.

Human-Computer Studies, 68, 307–323

Dix, A., Finlay, J., Abowd, G., and Beale, R., 2004. Human-Computer Interaction. Third edition,

Pearson Education Limited.

Galitz W. O., The Essential Guide to User Interface Design: An Introduction to GUI Design

Principles and Techniques, John Wiley & Sons, Inc., New York, 2007.

Hartmann, J., Sutcliffe, A., De Angeli, A., 2008. Towards a theory of user judgment of aesthetics

and user interface quality. ACM Transactions on Computer-Human Interaction 15(4), 1-30.

Hassenzahl, M., 2004, The interplay of beauty, goodness and usability in interactive products.

Human Computer Interaction, 19, 4, 319–349.

Hoffmann, R. and Krauss, K. 2004. A critical evaluation of literature on visual aesthetics for the

web. SAICSIT '04: Proceedings of the 2004 annual research conference of the South African

institute of computer scientists and information technologists on IT research in developing

countries, Western Cape, South Africa, 205-209.

Jordan, P.W., 1998. Human factors for pleasure in product use. Applied Ergonomics 29 (1), 25–

33.

Kurosu, M. and Kashimura, K. 1995. Apparent usability vs. inherent usability: experimental

analysis on the determinants of the apparent usability. CHI '95: Conference companion on

Human factors in computing systems, Denver, Colorado, United States, 292-293.

Kurosu, M. and Kashimura, K., 1995. Determinants of the Apparent Usability. Proceedings of

IEEE SMC. pp 1509-1513.

Lai, C., Chen, P., Shih, S., Liu, Y., Hong, J., 2010, Computational models and experimental

investigations of effects of balance and symmetry on the aesthetics of text-overlaid images,

Int. J. Human-Computer Studies 68, 41–56

Lavie, T., and Tractinsky, N., 2004, Assessing dimensions of perceived visual aesthetics of web

sites, Int. J. Human-Computer Studies 60 269–298

Lindgaard, G., Fernandez, G., Dudek, C. and Brown, J., 2006, Attention web designers: You have

50 milliseconds to make a good impression. Behavior & Information Technology 25, 2

(2006), 115-126.

Liu, Y., 2003a. Engineering aesthetics and aesthetic ergonomics: theoretical foundations and a

dual-process research methodology. Ergonomics, 46, 1273–1292.

Liu, Y., 2003b. The aesthetic and the ethic dimensions of human factors and design. Ergonomics,

46, 1293–1305.

Merriam-Webster Online Dictionary 2012. Merriam-Webster, http://www.merriam-

webster.com/dictionary/aesthetics [Accessed March 5th, 2012].

Moshagen, M., Musch, J., Göritz, A.S., 2009. A blessing, not a curse: Experimental evidence for

beneficial effects of visual aesthetics on performance. Ergonomics 52, 1311-1320.

Mirdehghani, M. and Monadjemi, A. 2009. Web pages aesthetic evaluation using low-level visual

features. World Academy of Science, Engineering and Technology, 49, 2009

Miyoshi, T. and Murata, A. 2001. A Method to evaluate properness of gui design based on

complexity indexes of size, local density, alignment, and grouping. IEEE International

Conference on Systems, Man and Cybernetics, Tucson, AZ.

Montgomery, D., 2001. Design and Analysis of Experiments, fifth edition, John Wiley & Sons,

Inc., New York, USA.

Nagamachi, M., 1995, Kansei Engineering: A new ergonomic consumer-oriented technology for

product development, International Journal of Industrial Ergonomics, 15, 3-11.

Nagamachi, M., 2002, Kansei engineering as a powerful consumer-oriented technology for product

development, Applied Ergonomics 33, 289–294.

Ngo, D. and Byrne, J., 2001. Application of an aesthetic evaluation model to data entry screens.

Computers in Human Behavior, 17 (2001) 149-185.

Ngo, D., Samsudin, A., and Abdullah, R., 2000. Aesthetic measures for assessing graphic screens.

Journal of Information Science and Engineering, 16, 97-116.

Ngo, D. C. L., Teo, L. S., & Byrne, J. G., 2002. Evaluating Interface Esthetics. Knowledge and

Information Systems (4), 46-79.

Ngo, D. C. L., Teo, L. S., & Byrne, J. G., 2003. Modelling interface aesthetics. Information

Sciences, 152(1), 25-46.

Nielsen. J., 1993, Usability Engineering. AP Professional.

Norman, D., 2004. Emotional design: Why we love (or hate) everyday things. Basic Books, New

York, NY, USA.

Oxford online dictionary 2012, Oxford University Press, http://oxforddictionaries.com

/definition/aesthetics?q=aesthetics, [Accessed March 5th, 2012].

Phillips, C., and Chapparro, C., 2009, Visual appeal vs. usability: which one influences user

perceptions of a website more?, Usability News, Vol 11(2).

Reich, Y., 1993, A model of aesthetic judgment in design. Artificial Intelligence in Engineering,

Vol. 8, No. 2, pp. 141-153.

Schmidt, K.E., Bauerly, M., Liu, Y., And Sridharan, S., 2003, Web Page aesthetics and

performance: a survey and an experimental study. In Proceedings of the 8th Annual

International Conference on Industrial Engineering – Theory, Applications and Practice, Las

Vegas, Nevada, USA.

Sears, A., 1993. Layout appropriateness: a metric for evaluating user interface widget layout.

IEEE Transactions on Software Engineering, 19 (7), 707–719.

Shackel, B., 1991. Usability-Context, framework, definition. design and evaluation. In Shackel,

B. and Richardson, S. (eds. ) Human Factors for Informatics Usability. Cambridge University

Press,

Shneiderman, B, Plaisant, C., Cohen, M, Jacobs, S., 2010, Designing the User Interface:

Strategies for Effective Human-Computer Interaction, 5th edition, Addison Wesley.

Sonderegger, A., Sauer, J., 2010. The influence of design aesthetics in usability testing: Effects on

user performance and perceived usability. Applied Ergonomics 41, 403-410.

Streveler, D.J. and Wasserman, A.I. 1984. Quantitative measures of the spatial properties of

screen designs. In: INTERACT ’84 Conference Proceedings. North-Holland, Amsterdam.

Tractinsky, N. 1997. Aesthetics and apparent usability: empirically assessing cultural and

methodological issues. In S. Pemberton, Proceedings of the 1997 Conference on Human

Factors in Computing Systems (CHI '97). New York: ACM Press.

Tractinsky, N., Cokhavi, A., Kirschenbaum, M., Sharfi, T., 2006. Evaluating the consistency of

immediate aesthetic perceptions of web pages. International Journal of Human-Computer

Studies, 64 (11), 1071–1083.

Tractinsky, N., Shoval-Katz, A., Ikar, D., 2000. What is beautiful is usable. Interacting with

Computers, 13, 127–145.

Tuch, A.N., Bargas-Avila, J.A., Opwis, K., Wilhelm, F.H., 2009. Visual complexity of websites:

Effects on users’ experience, physiology, performance, and memory. International Journal of

Human-Computer Studies 67, 703-715.

Tullis, T.S., 1983. The formatting of alphanumeric displays: a review and analysis. Human

Factors. 25 (6), 657–682.

Tullis, T.S., 1988. Screen design. In: Helander, M. (Ed.), Handbook of Human-Computer

Interaction. Elsevier Science Publishers B.V., North-Holland, Amsterdam, pp. 377–411.

Van Schaik, P. and Ling, J., 2009, The role of context in perceptions of the aesthetics of web

pages over time, Int. J. Human-Computer Studies 67, 79–89.

Zain, J., Tey, M., and Goh, Y. 2008 Probing a self-developed aesthetics measurement application

(sda) in measuring aesthetics of mandarin learning web page interfaces, IJCSNS International

Journal of Computer Science and Network Security, Vol. 8 No. 1, January 2008.

APPENDIX A

THE USED FORMULAS WITH EXAMPLES OF CALCULATIONS

This section lists the formulas developed by Ngo et al. (2003) to calculate screen balance,

unity, and sequence. A hypothetical abstract screen, similar to the screens used in the

study, is used to give examples of how the formulas were used to calculate values of each

of the three elements.

a. Balance

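The balance is computed as the difference between the total weighting of components on

each side of the horizontal and vertical axis and is given by

$$BM = 1 - \frac{|BM_{vertical}| + |BM_{horizontal}|}{2} \qquad (A.1)$$

where BM stands for Balance Measure, and $BM_{vertical}$ and $BM_{horizontal}$ are the vertical and

horizontal balances with

$$BM_{vertical} = \frac{w_L - w_R}{\max(|w_L|, |w_R|)} \qquad (A.2)$$

$$BM_{horizontal} = \frac{w_T - w_B}{\max(|w_T|, |w_B|)} \qquad (A.3)$$

where

$$w_j = \sum_{i=1}^{n_j} a_{ij} d_{ij}, \quad j = L, R, T, B \qquad (A.4)$$

L, R, T, and B stand for left, right, top, and bottom, respectively; $a_{ij}$ is the area of object i

on side j; $d_{ij}$ is the distance between the central lines of the object and the frame; and $n_j$ is the

total number of objects on the side.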

Example

This example shows how the balance of the hypothetical screen shown below is computed

using the above formulas.

Figure A.1. The hypothetical example screen showing inputs required to compute the balance element.

wL = 9 * 2.5 + 4 * 4 = 22.5 + 16 = 38.5

wR = 4 * 3 + 12.25 * 2 = 12 + 24.5 = 36.5

wT = 9 * 2.5 + 4 * 3.5 = 22.5 + 14 = 36.5

wB = 4 * 4 + 12.25 * 3 = 16 + 36.75 = 52.75

BMhorizontal = (36.5 - 52.75) / 52.75 = -0.308

BMvertical = (38.5 - 36.5) / 38.5 = 0.052

BM = 1 - ((|-0.308| + |0.052|) / 2) = 0.82
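The balance calculation above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the study. It assumes the form the worked example implies: each side weight is the sum of area times distance, each directional balance is the difference of the two opposing weights divided by the larger of them, and BM is one minus the mean of the two absolute balances. The function names `side_weight` and `balance_measure` and the list-of-pairs input layout are hypothetical.

```python
def side_weight(objects):
    """Total weighting of one side: sum of area * distance per object."""
    return sum(area * distance for area, distance in objects)


def balance_measure(left, right, top, bottom):
    """Balance measure BM for four lists of (area, distance) pairs.

    Assumes at least one non-zero weight in each opposing pair of
    sides; otherwise the division below would fail.
    """
    w_l, w_r, w_t, w_b = (side_weight(s) for s in (left, right, top, bottom))
    bm_vertical = (w_l - w_r) / max(abs(w_l), abs(w_r))
    bm_horizontal = (w_t - w_b) / max(abs(w_t), abs(w_b))
    return 1 - (abs(bm_vertical) + abs(bm_horizontal)) / 2


# (area, distance) pairs taken from the worked example
left = [(9, 2.5), (4, 4)]
right = [(4, 3), (12.25, 2)]
top = [(9, 2.5), (4, 3.5)]
bottom = [(4, 4), (12.25, 3)]

bm = balance_measure(left, right, top, bottom)
print(round(bm, 2))  # 0.82
```

Taking absolute values only in the last step keeps the sign of each directional balance available, which shows the side on which the screen is heavier.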


b. Unity

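The formula for unity is

\[ UM = \frac{|UM_{form}| + |UM_{space}|}{2} \tag{A.7} \]

where UM stands for Unity Measure, and UM_form is the extent to which the objects are related in size, with

\[ UM_{form} = 1 - \frac{n_{sizes} - 1}{n} \tag{A.8} \]

and UM_space is a relative measure of the space between groups and that of margins, with

\[ UM_{space} = 1 - \frac{a_{layout} - \sum_{i}^{n} a_i}{a_{frame} - \sum_{i}^{n} a_i} \tag{A.9} \]

where a_i, a_layout, and a_frame are the areas of object i, the layout, and the frame, respectively; n_sizes stands for the number of sizes used; and n is the number of objects on the frame.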

Example

This example shows how unity of the hypothetical screen is computed using the above

formulas


Figure A.2. The hypothetical example screen showing inputs required to compute the unity element.

nsizes = 3

n = 4

UMform = 1 - (3 - 1)/4 = 0.5

Sum of areas = 9 + 4 + 4 + 12.25 = 29.25 cm²

alayout = 62.5 cm² (area outlined by the solid lines)

aframe = 144 cm² (total area of the screen = 12 cm * 12 cm)

UMspace = 1 - (62.5 - 29.25)/(144 - 29.25) = 0.71

UM = (0.5 + 0.71)/2 = 0.605
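The unity calculation can be sketched the same way. This is an illustrative sketch only. It assumes the form the worked example implies: UMform is one minus (number of sizes minus one) divided by the number of objects, UMspace is one minus the ratio of unused layout area to unused frame area, and UM is the mean of their absolute values. The function name `unity_measure` and its parameter names are hypothetical.

```python
def unity_measure(areas, n_sizes, a_layout, a_frame):
    """Return (UM, UM_form, UM_space) for one screen.

    areas    -- list of object areas on the frame
    n_sizes  -- number of distinct object sizes used
    a_layout -- area of the layout enclosing all objects
    a_frame  -- total area of the frame (screen)
    """
    n = len(areas)
    total_area = sum(areas)
    um_form = 1 - (n_sizes - 1) / n
    um_space = 1 - (a_layout - total_area) / (a_frame - total_area)
    um = (abs(um_form) + abs(um_space)) / 2
    return um, um_form, um_space


# Inputs taken from the worked example (areas in cm²)
um, um_form, um_space = unity_measure(
    areas=[9, 4, 4, 12.25], n_sizes=3, a_layout=62.5, a_frame=144
)
print(um_form, round(um_space, 2), round(um, 3))  # 0.5 0.71 0.605
```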


c. Sequence

The formula for calculating sequence is

(A.10)

with

(A.11)

(A.12)

with

(A.13)

(A.14)

Where UL, UR, LL, and LR stand for upper-left, upper-right, lower-left, and lower-right,

respectively; and aij is the area of object i on quadrant j. Each quadrant is given a

weighting in q.

Example

This example shows how sequence of the hypothetical screen is computed using the

above formulas

Figure A.3 The hypothetical example screen showing inputs required to compute the sequence element.

wUL = qUL * aUL = 4 * 9 = 36; vUL = 4; (qUL - vUL) = 4 - 4 = 0

wUR = qUR * aUR = 3 * 4 = 12; vUR = 3; (qUR - vUR) = 3 - 3 = 0

wLL = qLL * aLL = 2 * 4 = 8; vLL = 1; (qLL - vLL) = 2 - 1 = 1

wLR = qLR * aLR = 1 * 12.25 = 12.25; vLR = 2; (qLR - vLR) = 1 - 2 = -1

Sum of |qj - vj| = 0 + 0 + 1 + 1 = 2

SQM = 1 - (2/8) = 0.75

APPENDIX B

QUESTIONNAIRE SCORES AND MEASURE AND MODEL VALUES FOR THE 42 WEBPAGES

Table B.1 Questionnaire scores for the 42 webpages (obtained from Moshagen & Thielsch, 2010)

Webpage no.   Classical   Expressive   Classical/expressive average   Simplicity   Diversity   Colorfulness   Craftsmanship   VisAWI average

1 5.21 3.19 4.20 5.06 3.68 4.89 4.68 4.55

2 2.53 2.01 2.27 2.65 2.59 3.63 3.15 2.96

3 4.45 1.80 3.13 4.65 2.43 3.61 3.61 3.57

4 5.18 3.73 4.46 5.09 4.60 4.86 5.08 4.90

5 2.90 2.15 2.53 3.15 2.67 4.00 3.48 3.28

6 4.17 2.88 3.53 3.89 3.49 4.27 4.42 3.98

7 3.54 4.01 3.78 3.86 4.36 4.39 4.46 4.25

8 4.84 3.02 3.93 4.96 4.13 5.09 5.09 4.78

9 3.98 1.86 2.92 3.68 2.58 5.05 3.78 3.70

10 4.85 2.51 3.68 4.51 3.22 4.23 4.52 4.09

11 5.20 3.36 4.28 4.89 4.23 5.16 5.30 4.86

12 4.42 3.55 3.99 4.33 3.99 3.91 4.81 4.25

13 4.48 2.57 3.53 4.42 3.02 4.96 4.75 4.22

14 3.70 2.60 3.15 4.20 3.13 3.59 4.31 3.79

15 4.28 3.28 3.78 4.14 3.66 4.13 5.00 4.19

16 2.91 2.44 2.67 3.18 3.24 3.57 3.80 3.42

17 2.47 2.38 2.43 2.35 2.57 3.21 2.90 2.72

18 2.40 2.48 2.44 2.68 2.86 3.25 3.54 3.05

19 5.08 2.18 3.63 4.83 3.15 5.06 5.25 4.51

20 5.02 3.06 4.04 4.52 3.96 4.38 4.83 4.40

21 3.20 2.28 2.74 5.06 3.68 4.89 4.68 4.55


22 3.30 3.68 3.49 3.08 3.22 3.83 3.77 3.44

23 4.98 2.48 3.73 3.85 4.00 3.85 4.79 4.10

24 4.88 3.38 4.13 4.90 3.82 5.00 5.43 4.74

25 4.49 2.65 3.57 4.42 4.30 5.10 4.83 4.63

26 4.30 3.06 3.68 4.41 3.49 5.33 4.98 4.49

27 4.24 3.04 3.64 4.44 4.10 4.68 4.43 4.39

28 4.22 3.16 3.69 4.36 3.96 4.67 4.42 4.33

29 4.06 4.10 4.08 4.08 3.76 4.33 4.53 4.14

30 3.30 3.21 3.26 4.12 4.56 4.60 4.75 4.49

31 3.45 1.68 2.57 3.46 3.44 4.36 4.38 3.86

32 3.51 2.64 3.07 3.63 2.00 2.98 2.90 2.87

33 3.60 3.80 3.70 3.52 3.76 4.45 4.30 3.96

34 3.96 3.57 3.76 3.36 4.16 4.30 4.23 3.98

35 3.93 2.73 3.33 3.79 4.21 4.09 4.30 4.09

36 3.84 3.10 3.47 3.56 3.66 5.09 4.78 4.20

37 3.83 3.07 3.45 4.10 3.54 4.65 4.40 4.13

38 3.82 2.76 3.29 3.72 3.55 4.92 4.37 4.08

39 3.78 2.48 3.13 3.29 3.49 4.50 4.50 3.88

40 3.74 3.56 3.65 4.08 3.34 3.73 4.33 3.85

41 3.69 2.77 3.23 4.17 3.32 3.70 4.03 3.79

42 3.62 2.32 2.97 3.54 3.08 4.13 4.25 3.70

Table B.2 Measures and model values for the 42 webpages

Webpage no.   Balance   Unity   Sequence   Average   Interaction model   Ngo model

1 0.833 0.626 1.00 0.82 0.693 0.286

2 0.875 0.306 1.00 0.73 0.558 0.250

3 0.909 0.585 1.00 0.83 0.692 0.289

4 0.898 0.519 1.00 0.81 0.659 0.279

5 0.755 0.234 1.00 0.66 0.514 0.228

6 0.643 0.532 1.00 0.72 0.610 0.253

7 0.812 0.352 1.00 0.72 0.570 0.249

8 0.838 0.668 1.00 0.84 0.712 0.291

9 0.915 0.338 1.00 0.75 0.578 0.258

10 0.823 0.547 1.00 0.79 0.656 0.275

11 0.731 0.607 1.00 0.78 0.659 0.272

12 0.748 0.310 1.00 0.69 0.544 0.236

13 0.744 0.496 1.00 0.75 0.618 0.259

14 0.938 0.398 1.00 0.78 0.610 0.268

15 0.796 0.599 1.00 0.80 0.672 0.278

16 0.857 0.228 1.00 0.69 0.520 0.238

17 0.804 0.208 1.00 0.67 0.508 0.230

18 0.516 0.348 0.75 0.54 0.513 0.196

19 0.726 0.455 1.00 0.73 0.598 0.252

20 0.752 0.366 1.00 0.71 0.567 0.244

21 0.726 0.203 1.00 0.64 0.499 0.221


22 0.807 0.460 1.00 0.76 0.615 0.262

23 0.597 0.571 1.00 0.72 0.614 0.253

24 0.847 0.270 1.00 0.71 0.539 0.242

25 0.848 0.451 1.00 0.77 0.619 0.265

26 0.856 0.311 1.00 0.72 0.558 0.248

27 0.864 0.452 1.00 0.77 0.622 0.267

28 0.641 0.679 1.00 0.77 0.662 0.271

29 0.837 0.335 1.00 0.72 0.566 0.249

30 0.898 0.386 1.00 0.76 0.597 0.263

31 0.900 0.321 1.00 0.74 0.568 0.254

32 0.764 0.163 1.00 0.64 0.486 0.220

33 0.634 0.200 0.75 0.53 0.490 0.191

34 0.950 0.381 1.00 0.78 0.603 0.267

35 0.846 0.352 1.00 0.73 0.575 0.253

36 0.894 0.408 1.00 0.77 0.607 0.265

37 0.796 0.521 1.00 0.77 0.639 0.268

38 0.852 0.286 1.00 0.71 0.546 0.245

39 0.574 0.684 0.75 0.67 0.600 0.245

40 0.860 0.617 1.00 0.83 0.695 0.287

41 0.757 0.393 0.75 0.63 0.560 0.229

42 0.598 0.351 0.75 0.57 0.525 0.206

Table B.3 Simple measure values for the 42 webpages

Webpage no.   No. of objects   No. of sizes   JPEG file size (KB)   No. of font types   No. of images

1 6 6 52 2 0

2 18 12 209 4 12

3 7 5 69 1 0

4 18 5 161 3 12

5 11 11 192 2 3

6 6 6 149 2 4

7 8 7 194 6 5

8 8 6 162 1 2

9 11 11 214 1 6

10 8 8 50 2 0

11 4 3 179 2 4

12 5 5 108 2 2

13 7 6 174 2 4

14 6 5 160 3 2

15 11 8 113 4 3

16 10 10 140 4 1

17 10 10 169 2 3

18 20 20 170 2 9

19 11 8 175 1 2

20 7 6 191 3 2

21 13 13 174 3 5


22 10 10 10 10 10

23 12 11 12 11 12

24 12 10 12 10 12

25 11 10 11 10 11

26 8 7 8 7 8

27 9 8 9 8 9

28 14 10 14 10 14

29 8 8 8 8 8

30 13 13 13 13 13

31 11 11 11 11 11

32 9 9 9 9 9

33 14 14 14 14 14

34 14 13 14 13 14

35 21 17 21 17 21

36 10 9 10 9 10

37 13 13 13 13 13

38 7 4 7 4 7

39 11 8 11 8 11

40 9 8 9 8 9

41 15 15 15 15 15

42 7 7 7 7 7

APPENDIX C

QUESTIONNAIRE SCORES FOR EXPERIMENTAL TRIALS OF CHAPTER 4

Table C.1 Scores for the one-question mock-up trial.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 1 2 1 5 5 8 6 2

2 5 6 2 3 4 1 4 3

3 3 2 1 1 3 4 2 3

4 10 5 10 5 10 4 10 5

5 6 6 7 7 5 3 5 4

6 7 7 6 6 9 8 9 5

7 6 3 1 1 3 4 1 3

8 4 3 6 1 7 7 8 1

9 2 3 4 2 3 3 4 3

10 7 1 2 1 8 8 2 2

11 6 8 8 5 4 5 3 5

12 8 8 8 2 8 5 5 2

13 10 9 10 2 2 1 1 1

14 4 5 5 6 6 5 5 6

15 8 6 8 6 6 5 7 6

16 7 6 2 8 7 1 4 5

17 6 1 8 4 3 2 7 6

18 6 6 3 4 7 6 6 3

19 5 4 5 6 3 3 3 3

20 5 3 3 2 4 2 2 2


21 2 3 3 2 5 3 3 2

22 4 5 4 2 6 5 4 4

23 6 7 4 4 8 7 4 2

24 2 3 8 6 1 1 1 3

25 6 5 5 5 6 4 5 4

26 2 2 2 2 2 2 2 2

27 10 7 8 3 6 2 5 1

28 3 4 2 2 3 3 4 4

29 1 1 1 1 1 1 1 1

Table C.2 Scores for the one-question trial with webpages.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 7 7 8 8 7 7 8 8

2 1 1 1 1 1 1 1 1

3 2 2 1 1 2 2 1 1

4 6 6 6 7 7 8 8 8

5 2 3 2 1 2 2 3 2

6 2 2 4 4 2 2 2 4

7 5 5 4 5 5 5 5 4

8 5 5 6 2 1 2 3 3

9 6 6 3 1 1 1 1 1

10 4 4 3 1 5 5 2 3

11 5 5 4 2 5 5 3 2

12 4 4 3 5 4 4 4 4

13 4 6 6 6 9 5 7 8

14 3 3 5 4 5 4 5 5

15 5 4 5 5 2 2 6 6

16 4 4 4 4 4 3 3 3

17 8 7 5 3 9 10 4 6

18 6 5 6 6 6 6 6 6

19 3 3 3 3 3 3 3 3

20 3 3 3 3 3 3 3 3


21 3 3 1 2 2 2 1 1

22 6 8 7 5 6 6 5 6

23 4 6 4 4 5 5 4 5

24 5 5 5 5 5 5 5 5

25 3 3 4 4 5 5 4 4

26 1 2 1 1 1 1 1 2

27 3 3 3 3 2 2 2 2

28 4 4 2 2 4 4 2 2

Table C.3 Average scores for the Classical scale of the Classical/Expressive questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 5.5 5.3 5.3 5.0 3.5 3.7 3.6 3.4

2 3.0 3.0 3.0 3.0 4.0 4.5 4.5 4.0

3 5.5 5.0 5.5 3.8 3.0 4.0 3.0 3.0

4 5.5 4.8 4.5 3.0 3.5 4.0 3.0 2.5

5 4.3 4.8 5.3 5.3 2.8 3.5 2.0 3.0

6 6.0 6.0 6.0 6.0 4.5 3.5 4.5 4.0

7 5.8 5.5 4.3 3.8 5.0 5.0 5.0 5.0

8 3.5 3.5 3.5 3.5 5.5 6.0 6.0 4.0

9 6.0 6.0 5.5 5.8 2.5 2.5 2.5 2.5

10 4.0 4.3 5.0 4.8 5.8 6.0 5.3 5.0

11 1.5 1.0 1.0 1.0 4.8 4.5 4.3 5.0

12 4.3 3.8 4.5 3.5 1.0 1.0 1.5 1.0

13 4.5 4.5 4.5 4.3 3.5 3.0 4.8 3.8

14 5.3 5.3 5.0 3.8 3.3 3.0 3.5 2.8

15 2.0 4.5 4.3 2.0 3.0 3.0 3.0 3.0

16 7.0 5.3 5.0 5.0 2.0 2.0 4.3 2.0

17 2.5 2.5 2.5 2.5 3.5 5.0 2.0 4.8

18 - - - - 2.5 2.5 2.5 2.5

19 - - - - 3.5 3.7 3.6 3.4

20 - - - - 3.5 3.7 3.6 3.4

21 - - - - 3.5 3.7 3.6 3.4

Table C.4 Average scores for the Expressive scale of the Classical/Expressive questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.5 3.3 3.3 4.3 3.0 3.0 2.8 3.0

2 1.8 1.8 2.0 1.8 3.5 1.5 3.3 3.0

3 3.8 3.5 4.5 4.0 1.8 1.8 1.8 3.0

4 2.0 2.0 3.0 2.0 3.5 3.5 4.5 4.3

5 3.5 2.8 3.0 3.0 2.0 2.5 3.3 2.3

6 2.3 2.0 2.0 2.5 2.5 2.8 2.3 2.3

7 1.8 2.0 2.0 2.0 2.0 2.0 2.0 2.0

8 2.0 3.0 2.0 2.0 2.0 2.0 2.0 2.0

9 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0

10 3.8 2.5 4.0 4.0 1.0 1.0 1.0 1.0

11 3.3 3.3 2.0 4.3 3.5 3.5 4.0 4.0

12 1.8 2.0 2.8 1.8 2.0 2.3 2.5 2.0

13 4.0 4.3 2.5 2.5 2.0 2.3 3.5 2.3

14 6.0 5.0 5.0 5.0 4.3 4.3 3.0 4.0

15 3.0 2.0 3.0 3.0 4.0 5.0 5.0 5.0

16 6.0 5.5 6.0 6.5 3.0 3.0 3.0 3.0

17 4.3 4.3 4.3 4.3 6.3 5.5 6.5 5.5

18 - - - - 4.3 4.3 4.3 4.3

19 - - - - 6.0 5.8 5.8 5.8

20 - - - - 1.0 1.0 1.0 1.0

21 - - - - 2.0 2.0 2.0 2.0

Table C.5 Total average scores for the Classical/Expressive questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 4.5 4.3 4.3 4.6 3.3 3.4 3.2 3.2

2 2.4 2.4 2.5 2.4 3.8 3.0 3.9 3.5

3 4.6 4.3 5.0 3.9 2.4 2.9 2.4 3.0

4 3.8 3.4 3.8 2.5 3.5 3.8 3.8 3.4

5 3.9 3.8 4.1 4.1 2.4 3.0 2.6 2.6

6 4.1 4.0 4.0 4.3 3.5 3.1 3.4 3.1

7 3.8 3.8 3.1 2.9 3.5 3.5 3.5 3.5

8 2.8 3.3 2.8 2.8 3.8 4.0 4.0 3.0

9 3.5 3.5 3.3 3.4 2.3 2.3 2.3 2.3

10 3.9 3.4 4.5 4.4 3.4 3.5 3.1 3.0

11 2.4 2.1 1.5 2.6 4.1 4.0 4.1 4.5

12 3.0 2.9 3.6 2.6 1.5 1.6 2.0 1.5

13 4.3 4.4 3.5 3.4 2.8 2.6 4.1 3.0

14 5.6 5.1 5.0 4.4 3.8 3.6 3.3 3.4

15 2.5 3.3 3.6 2.5 3.5 4.0 4.0 4.0

16 6.5 5.4 5.5 5.8 2.5 2.5 3.6 2.5

17 3.4 3.4 3.4 3.4 4.9 5.3 4.3 5.1

18 - - - - 3.4 3.4 3.4 3.4

19 - - - - 4.8 4.7 4.7 4.6

20 - - - - 2.3 2.4 2.3 2.2

21 - - - - 2.8 2.9 2.8 2.7

Table C.6 Average scores for the Simplicity scale of the VisAWI questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.0 2.8 4.4 4.2 2.3 1.7 1.0 1.0

2 4.6 4.4 4.6 4.2 4.0 5.0 5.3 5.7

3 5.6 5.6 6.0 3.8 4.0 5.0 5.0 6.0

4 2.4 4.0 3.6 2.8 7.0 6.7 7.0 6.3

5 5.0 5.0 2.8 2.6 1.0 1.0 2.7 2.3

6 4.0 3.8 4.0 4.0 1.7 1.7 4.7 4.7

7 5.2 3.8 3.0 4.0 1.0 1.0 2.3 1.0

8 5.4 5.4 4.4 3.8 1.0 1.0 1.3 1.0

9 4.4 4.6 3.8 3.8 4.7 5.3 4.7 4.7

10 4.0 4.0 5.4 5.4 4.0 4.3 4.0 4.3

11 2.8 2.0 2.0 2.2 3.3 4.7 4.0 3.3

12 4.4 4.6 4.4 4.6 3.0 4.0 4.7 4.3

13 5.4 5.6 4.4 4.8 5.0 5.7 6.0 5.0

14 5.4 4.6 5.0 5.0 6.7 6.7 6.7 6.7

15 4.8 4.8 3.6 3.6 3.0 3.3 1.7 2.7

16 4.8 5.8 4.0 6.0 1.0 1.0 1.0 1.0

17 5.8 6.0 5.6 5.4 5.3 3.3 4.3 5.3

18 2.8 2.8 2.0 2.0 5.0 5.0 5.0 5.3

19 4.0 3.6 4.0 4.0 2.7 2.7 2.0 2.7

20 4.2 4.2 3.4 3.6 5.3 5.7 5.7 5.3

21 5.2 5.6 5.2 4.6 4.3 3.7 4.0 2.7

22 4.2 5.2 4.4 3.6 5.3 4.3 3.3 5.0

23 5.0 4.4 5.4 5.0 4.0 4.0 3.3 4.0

24 3.8 4.8 4.6 3.6 - - - -

25 1.6 2.0 2.0 2.0 - - - -

Table C.7 Average scores for the Diversity scale of the VisAWI questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.0 3.4 4.0 4.6 1.0 1.0 1.0 1.0

2 3.6 2.4 2.2 2.2 1.3 1.3 1.3 1.3

3 4.6 4.8 5.6 4.6 2.7 4.0 4.7 5.0

4 4.2 4.0 5.0 4.4 6.3 3.7 6.3 5.3

5 4.2 3.0 2.6 1.8 1.0 1.0 2.7 1.0

6 4.0 4.4 4.0 4.0 2.0 2.0 4.0 4.3

7 4.2 4.6 4.6 4.2 1.0 1.0 2.0 1.0

8 4.0 4.0 3.0 3.2 1.0 1.0 2.0 1.0

9 2.4 2.0 1.4 1.6 4.3 4.0 3.0 3.3

10 4.8 4.2 2.6 3.4 1.3 1.3 1.3 1.0

11 2.8 2.0 2.0 2.8 2.0 2.7 2.0 2.0

12 2.8 2.8 3.2 3.6 3.7 3.7 4.7 2.7

13 2.0 2.2 2.0 2.4 2.3 2.7 2.0 2.3

14 3.8 2.8 4.6 2.6 6.0 6.0 6.0 6.3

15 1.6 1.6 1.2 1.6 2.3 3.0 2.0 2.0

16 2.6 3.0 3.4 3.8 1.0 1.0 1.0 1.0

17 4.0 5.0 4.2 4.2 4.3 3.7 4.3 4.3

18 2.0 2.0 2.0 2.0 2.0 2.0 2.3 2.0

19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 1.3

20 1.8 1.2 1.2 1.0 2.7 3.7 3.7 3.7

21 4.8 5.4 5.4 4.0 2.0 1.3 2.3 1.7

22 1.0 3.2 2.2 1.2 1.7 1.3 1.3 2.0

23 4.8 5.6 5.4 4.6 2.0 2.0 1.7 2.0

24 5.2 3.0 3.8 4.0 - - - -

25 2.0 1.8 2.4 2.0 - - - -

Table C.8 Average scores for the Colorfulness scale of the VisAWI questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.0 4.3 3.8 4.3 1.0 1.0 1.0 1.0

2 4.5 4.0 4.0 4.0 2.5 4.0 3.5 4.0

3 4.0 4.8 5.5 4.3 3.0 4.5 3.0 5.0

4 4.8 3.5 3.8 4.0 6.0 2.0 6.0 3.0

5 4.3 4.5 2.8 3.8 5.0 5.0 5.0 4.0

6 4.0 3.3 4.0 4.0 2.0 2.0 3.5 3.0

7 3.5 4.8 3.5 4.5 1.0 1.0 2.0 1.0

8 6.0 5.8 5.0 5.5 1.0 1.0 2.0 1.0

9 3.8 2.8 3.0 3.0 4.5 4.5 4.0 4.5

10 4.5 3.8 4.0 4.8 3.0 2.0 3.0 3.0

11 2.0 2.0 2.0 3.0 2.0 2.0 2.0 2.0

12 3.5 3.8 3.8 3.8 4.0 3.0 3.5 2.5

13 4.0 4.3 4.0 4.0 2.0 1.0 1.0 1.5

14 5.3 5.0 5.5 5.5 7.0 6.5 6.5 6.5

15 1.0 1.0 1.3 1.0 2.0 2.5 2.0 2.0

16 6.3 6.0 6.0 6.0 1.0 1.0 1.0 1.0

17 5.8 6.0 5.5 5.5 5.5 5.0 6.0 6.0

18 3.0 2.0 2.0 2.0 3.0 2.0 2.5 2.0

19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 1.5

20 2.8 4.0 3.5 3.8 3.0 2.0 3.0 2.5

21 6.0 5.5 5.8 5.3 1.0 1.0 1.5 1.0

22 2.5 2.5 2.0 2.5 6.0 6.0 6.0 6.0

23 4.8 5.5 5.5 4.8 2.0 2.0 2.5 2.5

24 3.8 4.5 5.0 3.8 - - - -

25 4.0 1.5 2.0 1.8 - - - -

Table C.9 Average scores for the Craftsmanship scale of the VisAWI questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.5 2.8 4.8 4.0 1.0 1.0 1.0 1.0

2 3.3 2.5 2.8 2.3 2.0 2.5 4.0 2.5

3 5.8 5.0 5.5 4.3 3.5 4.0 4.0 6.0

4 2.0 4.3 3.3 1.3 7.0 7.0 7.0 6.5

5 4.5 2.8 3.0 2.8 1.0 1.0 1.0 1.0

6 4.0 3.3 4.0 4.0 3.5 2.0 5.0 5.0

7 4.3 4.5 4.0 3.5 1.0 1.0 1.5 1.0

8 3.0 4.0 3.3 2.8 1.0 1.0 1.0 1.0

9 3.0 2.0 2.0 3.8 4.0 4.0 4.0 4.5

10 5.5 5.0 3.8 4.5 1.5 2.0 2.0 2.5

11 3.3 2.5 3.5 2.3 2.0 2.0 1.5 2.0

12 3.0 2.8 2.5 2.8 3.5 4.5 5.5 3.5

13 5.0 3.0 4.8 4.3 3.0 2.5 2.5 2.5

14 4.8 5.0 4.8 4.5 7.0 7.0 7.0 7.0

15 2.8 2.3 2.0 2.3 3.5 3.5 2.0 3.5

16 4.5 4.3 4.3 4.3 1.0 1.0 1.0 1.0

17 4.5 4.0 5.5 4.5 3.0 3.0 4.0 4.0

18 2.3 2.0 2.0 2.0 4.0 4.5 4.0 4.5

19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 2.0

20 2.8 1.8 2.0 1.8 1.5 2.5 2.5 3.5

21 5.3 5.3 5.0 3.5 1.5 1.0 1.0 1.0

22 1.3 1.5 3.3 1.5 2.0 1.0 2.0 2.0

23 4.8 4.8 5.5 4.8 2.0 2.5 1.0 2.5

24 3.3 5.0 4.0 3.8 - - - -

25 2.5 3.0 3.5 3.0 - - - -

Table C.10 Total average scores for the VisAWI questionnaire.

Participant no.   Design 1   Design 2   Design 3   Design 4   Design 5   Design 6   Design 7   Design 8

1 3.1 3.3 4.2 4.3 1.3 1.2 1.0 1.0

2 4.0 3.3 3.4 3.2 2.5 3.2 3.5 3.4

3 5.0 5.0 5.7 4.2 3.3 4.4 4.2 5.5

4 3.3 3.9 3.9 3.1 6.6 4.8 6.6 5.3

5 4.5 3.8 2.8 2.7 2.0 2.0 2.8 2.1

6 4.0 3.7 4.0 4.0 2.3 1.9 4.3 4.3

7 4.3 4.4 3.8 4.1 1.0 1.0 2.0 1.0

8 4.6 4.8 3.9 3.8 1.0 1.0 1.6 1.0

9 3.4 2.8 2.6 3.0 4.4 4.5 3.9 4.3

10 4.7 4.2 3.9 4.5 2.5 2.4 2.6 2.7

11 2.7 2.1 2.4 2.6 2.3 2.8 2.4 2.3

12 3.4 3.5 3.5 3.7 3.5 3.8 4.6 3.3

13 4.1 3.8 3.8 3.9 3.1 3.0 2.9 2.8

14 4.8 4.4 5.0 4.4 6.7 6.5 6.5 6.6

15 2.5 2.4 2.0 2.1 2.7 3.1 1.9 2.5

16 4.5 4.8 4.4 5.0 1.0 1.0 1.0 1.0

17 5.0 5.3 5.2 4.9 4.5 3.8 4.7 4.9

18 2.5 2.2 2.0 2.0 3.5 3.4 3.5 3.5

19 4.0 3.9 4.0 4.0 2.2 2.2 2.0 1.9

20 2.9 2.8 2.5 2.5 3.1 3.5 3.7 3.8

21 5.3 5.4 5.3 4.3 2.2 1.8 2.2 1.6

22 2.2 3.1 3.0 2.2 3.8 3.2 3.2 3.8

23 4.8 5.1 5.5 4.8 2.5 2.6 2.1 2.8

24 4.0 4.3 4.4 3.8 - - - -

25 2.5 2.1 2.5 2.2 - - - -
