Towards Developing Computational Models to Predict
Perceived Visual Aesthetics of Website Interface Design
A Dissertation presented
by
Ahamed A. O. Altaboli
to
Department of Mechanical and Industrial Engineering
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the field of
Industrial Engineering
Northeastern University
Boston, Massachusetts
April 2012
ABSTRACT
This study comes within a framework of scientific research arguing that users'
perception of visual aesthetics of computer interface design is related to visual screen
design features and layout elements. This framework is concerned with determining what
features trigger users’ perception of aesthetics and tries to express such features
numerically. The goal is to combine their collective effects using mathematical formulas
and computational models that would objectively predict perceived visual aesthetics.
The general purpose of this study is to continue the research efforts towards this goal.
In this study, rigorous experimental methods were utilized to verify and further improve
currently available measures and models. This included extending the application of these
measures and models to the case of website interface design. Viability of this extension
was assessed using standard questionnaires designed to measure perceived visual
aesthetics of website design.
Results of this study confirmed findings of previous studies regarding the use of visual
features to predict perceived visual aesthetics and further support the concept of
expressing such features using mathematical formulas and using them in turn as a basis to
develop computational models to predict aesthetics. Results also showed that such screen
layout-based measures can work with website interfaces. Moreover, results showed that
objective screen layout-based measures relate to subjective questionnaire-based
measures. The relationship is particularly strong for questionnaire elements related to
screen layout. This further supports the suggestion that objective layout-based measures
could be used to assess the overall visual aesthetics of websites in general and, in particular,
aesthetic aspects related to the classical and simplicity dimensions of website aesthetics.
TABLE OF CONTENTS
______________________________________________________________________
ABSTRACT .................................................................................................................. I
TABLE OF CONTENTS ............................................................................................ II
LIST OF FIGURES ................................................................................................. IV
LIST OF TABLES ....................................................................................................... VI
_______________________________________________________________________
1 CHAPTER 1 INTRODUCTION ............................................................................ 1
1-1 Acceptability of computer systems ................................................. 2
1-2 Aesthetics: definition and historical note ....................................... 4
1-3 Purpose of the study ...................................................................... 6
1-3-1 Study framework .............................................................................. 6
1-3-2 Objectives of the study and related research questions ................... 9
1-3-3 Organization of the study ................................................................. 11
_______________________________________________________________________
2 CHAPTER 2 LITERATURE REVIEW AND BACKGROUND RESEARCH . 12
2-1 Aesthetics and Usability in Interface Design .................................. 13
2-1-1 Aesthetics and perceived usability .................................................. 13
2-1-2 Role of context of use ..................................................................... 16
2-1-3 Aesthetics and performance ............................................................ 18
2-2 Quantitative Measures and Models of Interface Aesthetics .......... 20
2-2-1 Objective screen layout-based measures ........................................ 20
2-2-2 Subjective questionnaire-based measures ........................................ 24
_______________________________________________________________________
CHAPTER 3 VERIFYING NGO AND BYRNE’S FINDINGS AND
DEVELOPING A PRELIMINARY MODEL................................... 28
3-1 Introduction ......................................................................................... 29
3-2 Method .................................................................................................. 30
3-2-1 Design of the experiment ................................................................. 30
3-2-2 Screen designs .................................................................................. 31
3-2-3 Participants and apparatus ................................................................ 33
3-2-4 Procedure ......................................................................................... 33
3-3 Results ............................................................................................................ 34
3-3-1 Participants ratings ........................................................................... 34
3-3-2 Analysis of variance ......................................................................... 35
3-4 Constructing and Validating the Regression Model ............................. 37
3-4-1 Constructing the model .................................................................... 38
3-4-2 Compare with Ngo and Byrne’s model ........................................... 40
3-4-3 Validating the model using standard questionnaire scores of real
webpages ......................................................................................... 45
3-4-4 Checking for correlations with simple counts measures .................. 50
_______________________________________________________________________
CHAPTER 4 FURTHER TESTING OF VISUAL LAYOUT ELEMENTS AND
VALIDATING OF THE MODEL .................................................... 54
4-1 Introduction ........................................................................................ 21
4-2 Method ................................................................................................. 56
4-2-1 Experimental Design ........................................................................ 56
4-2-2 Procedure ......................................................................................... 57
4-3 Results Analysis and Discussion .................................................. 61
4-3-1 One-question with mock-up screens trial ........................................ 61
4-3-2 One-question with webpages trial .................................................... 65
4-3-3 The Classic/Expressive questionnaire trials ................................... 68
4-3-4 The VisAWI questionnaire trials .................................................... 77
4-3-5 Overall discussion ........................................................................... 86
4-4 Comparing Objective Measures with Subjective Measures ......... 94
4-4-1 Correlation analysis ........................................................................ 94
4-4-2 Proposed modification to the unity of form formula ...................... 96
4-4-3 Incorporating the modified unity of form formula into the
computational model .....................................................................101
_______________________________________________________________________
CHAPTER 5 CONCLUSIONS AND FUTURE WORK ................................107
5-1 Summary of Experimental Work and Results ..................................108
5-2 Conclusions and Contributions ...........................................................111
5-3 Recommendations for Future Work ..................................................112
_______________________________________________________________________
REFERENCES ...........................................................................................................115
APPENDIX A THE USED FORMULAS WITH EXAMPLES OF
CALCULATIONS...................................................................121
APPENDIX B QUESTIONNAIRE SCORES AND MEASURES AND
MODELS VALUES FOR THE 42 WEBPAGES .................129
APPENDIX C QUESTIONNAIRE SCORES FOR EXPERIMENTAL
TRIALS OF CHAPTER 4 .....................................................136
_______________________________________________________________________
LIST OF FIGURES
Fig. 3.1 The eight screen models associated with the experimental conditions 33
Fig. 3.2 Average effects and interactions plots 37
Fig. 3.3 Scatter diagram of actual and predicted aesthetic values for the eight screens 39
Fig. 3.4 Scatter diagram of actual and predicted (Ngo and Byrne model) aesthetic values for the eight screens 42
Fig. 3.5 Screenshots of webpages with the highest and lowest average questionnaire scores 47
Fig. 3.6 An example of how a webpage is divided into visual objects 48
Fig. 4.1 The eight abstract mock-up screens 52
Fig. 4.2 The eight webpage designs 53
Fig. 4.3 Average scores for the one-question, mock-up screens trial 63
Fig. 4.4 Average scores for the one-question, webpages trial 67
Fig. 4.5 Average scores for the Classical/Expressive questionnaire balanced trial 71
Fig. 4.6 Average scores for the Classical/Expressive unbalanced trial 75
Fig. 4.7 Average scores for the VisAWI balanced trial 79
Fig. 4.8 Average scores for the VisAWI unbalanced trial 83
LIST OF TABLES
Table 2.1 Scales and items in the Classical/Expressive and the VisAWI questionnaires 27
Table 3.1 The eight experimental conditions and the associated factor levels and values 32
Table 3.2 Calculated aesthetic values and participants' average aesthetic ratings 35
Table 3.3 Analysis of variance results 36
Table 3.4 Actual and predicted aesthetic values of the eight screens 39
Table 3.5 Calculated values of the five terms (elements) included in Ngo and Byrne (2001) model 41
Table 3.6 Actual and predicted (Ngo and Byrne model) aesthetic values of the eight screens 42
Table 3.7 Actual and predicted (current model) aesthetic values of the 57 screens of Ngo and Byrne (2001) study 44
Table 3.8 Summary of total errors of each type per group Descriptive statistics for questionnaire scores for the 42 webpages 46
Table 3.9 Descriptive statistics for the measures and the models for the 42 webpages 49
Table 3.10 Correlations between the measures and questionnaire scores 49
Table 3.11 Descriptive statistics for the selected count-based measures for the 42 webpages 52
Table 3.12 Correlations between objective simple count-based measures and subjective questionnaire-based measures 53
Table 4.1 The eight designs and the associated factor levels and values 57
Table 4.2 Experimental trials and participants information 60
Table 4.3 Descriptive statistics for average scores for the one-question, mock-up screens trial 62
Table 4.4 ANOVA for average scores for the one-question, mock-up screens trial 64
Table 4.5 Descriptive statistics for average scores for the one-question with webpages trial 66
Table 4.6 ANOVA for average scores for the one-question webpages trial 68
Table 4.7 Descriptive statistics for average scores for the Classical/Expressive balanced trial 70
Table 4.8 ANOVA for average scores for the Classical/Expressive balanced trial 72
Table 4.9 Descriptive statistics for average scores for the Classical/Expressive unbalanced trial 74
Table 4.10 ANOVA for average scores for the Classical/Expressive unbalanced trial 76
Table 4.11 ANOVA for balance and scales for the Classical/Expressive trial 76
Table 4.12 Descriptive statistics for average scores for the VisAWI balanced trial 78
Table 4.13 ANOVA for average scores for the VisAWI balanced trial 80
Table 4.14 Descriptive statistics for average scores for the VisAWI unbalanced trial 82
Table 4.15 ANOVA for average scores for the VisAWI unbalanced trial 84
Table 4.16 ANOVA for balance and scales for the VisAWI trial 85
Table 4.17 Summary of results for all experimental trials 87
Table 4.18 Summary of average scores for all experimental trials 91
Table 4.19 Correlation coefficients of average scores for all eight designs (balanced and unbalanced) 92
Table 4.20 Correlation coefficients of average scores for the balanced condition 93
Table 4.21 Values of measures and models for the eight designs 95
Table 4.22 Correlation coefficients of measures and models for all eight designs (balanced and unbalanced) 97
Table 4.23 Correlation coefficients of measures and models for the balanced condition 98
Table 4.24 Differences between average total scores for each design pair 99
Table 4.25 Correlation coefficients for values of unity of form computed by the original formula 99
Table 4.26 Comparing differences with values of unity of form of both original and modified formula 100
Table 4.27 Correlation coefficients for values of unity of form computed by the modified formula 101
Table 4.28 Values of measures and actual and predicted aesthetic values of the eight screens for the original and the modified model 102
Table 4.29 Values of the three measures and the model for the eight webpage designs 103
Table 4.30 Correlation coefficients of the models for all eight designs (balanced and unbalanced) 104
Table 4.31 Correlation coefficients of the models for the balanced condition 105
Table 4.30 Correlation coefficients of the models for all the 42 webpages of chapter 3 106
CHAPTER 1
INTRODUCTION
1.1 Acceptability of computer systems
In developing a framework for usability, Shackel (1991) introduced a paradigm explaining
what affects users’ decisions to accept (purchase) a system. In this paradigm, acceptability of a
system depends on a balance between its cost and three design factors: utility (functionality),
usability, and likeability. Utility/functionality relates to the questions: will the system be
useful? Will it perform the expected function? Usability relates to the question: can the system
be used successfully? Likeability refers to whether users like the system and feel it is
suitable. The acceptability (purchase) decision is made by balancing these three factors in
a trade-off with the cost of the system.
Aesthetics is related to the likeability factor. The visual appeal of a system can affect the
user’s first impression and opinion of the system and how suitable and easy to use it
appears (discussed further in the following sections).
When interactive computer systems emerged in the second half of the twentieth
century, early systems were set up in specialized research centers and were used
mostly by professionals and scientists who regarded them as a means of
achieving their research goals. They were willing to tolerate whatever usability issues
they faced and to spend whatever time was needed to master these tools (Bailey, 1982). Consequently,
designers at that time were more concerned with the functionality of their systems; usability and
likeability were not major concerns.
In the 1970s and 1980s, following advances in the related technologies, computers became
cheaper and more powerful. Commercial types of computers (e.g. personal computers) were
introduced and their use started spreading among the general public. For these systems to be
accepted, ease of use and usability issues could no longer be ignored. Usability became the
dominant acceptability factor and an important element in design.
In research and academia these developments were reflected in the establishment of the
Human Computer Interaction field in the early 1980s. One of the main concerns of this field
upon its establishment was the usability of computer systems. Numerous studies have been
conducted and volumes of design standards and guidelines have been published, leading to
a sufficient understanding of many aspects of usability. However, most of these guidelines
neglected the aesthetics of the user interface, and many insisted that aesthetics should only be used
in the design to support usability (Nielsen, 1993; Dix et al., 2004). Some even argued that
introducing aesthetics in the design would have a negative effect on usability (Ngo et al., 2002).
This orientation began to change in the late 1990s, largely motivated by the widening use of the
internet and the web. In today's societies, with the wide spread of the web and its social
networks, computer systems are no longer considered just tools for carrying out daily tasks;
many also consider them an important aspect of social communication. Aesthetic
aspects have become more widely recognized in human computer interaction and in engineering and
product design in general. A considerable number of studies and publications concerned with
aesthetics have appeared in recent years (e.g. Jordon, 1998; Liu, 2003a & 2003b; Norman, 2004).
Many of these studies showed that visual aesthetics in interface design can affect users’
perception of the ease of use of the interface (Kurosu & Kashimura, 1995; Tractinsky, 2000);
some even argue that visually appealing interfaces might have positive effects on performance
(Altaboli et al., 2010; Moshagen et al., 2009; Sonderegger & Sauer, 2010).
1.2 Aesthetics: definition and historical note
The Oxford online dictionary (2012) defines aesthetics as “a set of principles
concerned with the nature and appreciation of beauty, especially in art”. The Merriam-Webster
online dictionary (2012) defines aesthetics as “a particular theory or conception of beauty or
art; a particular taste for or approach to what is pleasing to the senses and especially sight”.
As a term, aesthetics relates to the study of beauty or the perception of beauty, as the meaning
of the original Greek term implies. The term “Aesthetics” was first used by Alexander
Baumgarten in 1735 in his book “Reflections on Poetry” (Reich, 1993). The term
was later used to denote a discipline within philosophy, established in 1790 by
Kant in the book “Critique of Judgment” (Liu, 2003a). This discipline deals with topics such
as the analysis of the beautiful and the sublime, the logic of aesthetic judgments, and the moral
function of the aesthetic.
In experimental psychology, the first attempts to investigate quantitative relationships between
psychological responses and physical stimuli were made by Fechner (1876, as cited in
Liu, 2003a). His approach involved manipulating the dimensions of visual objects (such as
polygons) in order to find relationships between the aesthetic response and the manipulated
dimensions (Liu, 2003a). This bottom-up approach influenced much later research.
Birkhoff (1933) used this approach to develop a universal aesthetic measure, which he
presented as a mathematical formula. The formula was applied to measure the
aesthetics of geometrical forms (polygons and vases), melody and harmony in music,
musical quality in poetry, and the arts.
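For reference, Birkhoff's measure is commonly stated as a ratio of order to complexity. The symbols below follow the usual presentation of Birkhoff (1933) and are not notation defined in this chapter: $M$ is the aesthetic measure, $O$ the order perceived in the object, and $C$ its complexity.

```latex
\[
  M = \frac{O}{C}
\]
```

Under this formulation, an object is judged more aesthetic when it shows more order for a given level of complexity.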
In human factors and ergonomics, until recently, aesthetics was largely ignored as a
topic of systematic scientific research (Hoffmann & Krauss, 2004; Liu, 2003a & 2003b). Two
recent methodologies have been widely accepted as a basis for incorporating users' feelings and aesthetic
aspects of design into human factors methodology: Kansei engineering/ergonomics and the dual-process
engineering aesthetics research methodology.
Kansei engineering was introduced in the late 1980s in Japan by Mitsuo Nagamachi
(Nagamachi, 1995). It was developed as an ergonomics and consumer-oriented technology for
producing a new product (Nagamachi, 2002). Kansei is a Japanese term which means a
consumer's psychological feeling and image regarding a new product. It aims at the
implementation of the customer’s feeling and demands (Kansei) into product function and
design (Nagamachi, 1995). The procedure utilizes various techniques to capture consumer’s
feelings about a new product and translate them into design characteristics of the product.
Yili Liu (Liu, 2003a) has proposed the establishment of a new scientific and engineering
discipline that he named “Engineering Aesthetics” to systematically incorporate engineering
and scientific methods in the aesthetic design and evaluation process. He developed the dual-
process methodology as a comprehensive research methodology for “Engineering Aesthetics”.
The methodology consists of two parallel but closely related lines of research. The first
process is called “multidimensional construct analysis or multivariate psychometric analysis”,
whose goal is to establish a “global”, “top-down”, and quantitative view of the critical
dimensions involved in a specific aesthetic response process. The second process is called
“psychophysical analysis”, whose objective is to establish a “local”, “bottom-up”, and
quantitative view of the individual’s perceptual abilities in making fine aesthetic distinctions
along selected dimensions. It identifies how keen the perceivers’ senses are in detecting
variations along critical aesthetic dimensions and how their preference levels change as a
function of specific design parameters or aesthetic variables.
As an example in judgment of visual aesthetics of screen design, the first process will be
concerned with finding what overall attributes of screen design would affect users’ perception
of aesthetics, e.g. symmetry or balance of the screen, or number of visual objects on the
screen. The second process will be concerned with finding how manipulation of these
attributes would specifically affect aesthetics, e.g. how changing the number of objects in the
screen would affect aesthetics.
In comparison with Kansei engineering, Liu claimed that the dual-process methodology is a
more comprehensive methodology that includes Kansei engineering as a special case (Liu,
2003).
In addition to the above two methodologies, Norman’s book “Emotional
Design” (Norman, 2004) was another milestone in revealing the importance of considering
aesthetic aspects of product design in the field of human factors and ergonomics.
1.3 Purpose of the study
1.3.1 Study framework
Two approaches to measuring interface aesthetics can be distinguished in the literature. The first
is an objective quantitative approach relating screen design layout elements to the user’s
perception of visual aesthetics (e.g. Ngo et al., 2003; Bauerly & Liu, 2006). It is concerned
with determining what features in the interface design trigger users’ perception of the aesthetics of
the interface. It also explores the possibility of expressing changes in such features
as numerical values and using these values to assess users' perception of interface
aesthetics. Methods in this approach are motivated by the earlier aesthetic measure developed by
Birkhoff (1933), Tullis’ quantitative techniques for evaluating screen design (Tullis, 1983 &
1988), and Gestalt theory for visual design (Chand et al., 2002; Ngo et al., 2002).
The second approach is a subjective questionnaire-based approach. Supporters of this
approach argue that the complexity of, and interrelationships among, the screen design
elements make it difficult to use them to quantitatively measure aesthetics (Lavie &
Tractinsky, 2004). In their view, it is more practical to use questionnaire-based instruments to
measure users’ subjective perception of aesthetics. Two of the most widely accepted
instruments of this kind are the Classical/Expressive Aesthetics questionnaire developed by Lavie and
Tractinsky (2004), and the Visual Aesthetics of Website Inventory (VisAWI) tool developed
by Moshagen and Thielsch (2010). Both were designed to measure perceived visual aesthetics
of websites.
The objective methods represent a bottom-up approach. This approach has its roots in the
rationalistic philosophical view of aesthetics (Reich, 1993). It embodies the
concept of “beauty in the observed object”; i.e. human perception of beauty is based on the
order and organization of the various components constructing the object. On the other hand,
the subjective methods reflect a top-down approach. It is based on the concept of “beauty in
the mind of the observer”, the main principle of the romanticist philosophical view of
aesthetics (Reich, 1993). It states that human perception of beauty is based on the whole form
of an object (influenced by cultural beliefs) and that aesthetics cannot be evaluated by looking
separately at the various components constituting an object.
This study can be categorized mainly within the framework of the objective quantitative
approach. The general purpose of this study is to continue the research efforts towards the
major goal of developing overall measures of visual aesthetics of interface design.
The main goal of this study is to verify some of the latest findings in this line of research and
try to further improve currently available measures and models.
The focus will be on developing computational models based on visual features
and screen layout elements of the interface. More rigorous experimental methods will be used
to verify previous findings and validate currently available measures and models. Exploring
the possibilities of further development and improvement of these measures and models will
be part of the verification and validation procedures of this study.
Although this study is categorized mainly within the framework of the first, objective
approach, subjective questionnaire-based measures of the second approach will be
used in the validation process to assess and evaluate the tested objective layout-based
measures and models.
The rationale for studies in the objective approach (including the current study) is
that the majority of the available interface and screen design guidelines are qualitative (e.g.
Galitz, 2007; Shneiderman, 2010). They consist mostly of qualitative descriptions and
summaries (Bauerly & Liu, 2006) that leave designers with no quantitative tools to evaluate
and compare their design alternatives and leave many of the design decisions to the subjective
views of the designers. This study and previous studies (Ngo et al., 2002; Bauerly & Liu,
2006) argue that developing quantitative measures that can provide numerical values for
different designs based on interface and screen design characteristics can be very helpful in
many design situations. These numerical tools can be especially helpful in the early stages of
design. They can assist in preparing design alternatives and can reduce the number of
prototypes that will undergo tests with human users in later stages of design. However, these
tools are not meant to be a replacement for human designers, but are intended to serve as
numerical tools to help designers and researchers evaluate different design alternatives without
the need to use human participants and to understand the extent to which their designs will
affect usability. Moreover, these measures can provide researchers with quantitative tools that
can help in the systematic study of different design aspects and give a numerical basis for direct
comparison of different design proposals. These measures can also be useful in cases
where on-the-fly designs are needed for non-professional designers, as in online tools for
designing websites (Lai et al., 2010).
1.3.2 Objectives of the study and related research questions
Specific objectives of the study include:
1. Verify findings of previous studies and validate the available objective measures and
models of visual aesthetics of computer interfaces using a more rigorous experimental
approach utilizing statistical testing and design of experiment techniques.
2. Explore the possibility of improving the available measures and models and/or the
possibility of developing new, more compact and efficient ones.
3. Extend the research domain to the case of website interface design. The goal is to see if
these measures and models, which have proved to work in many situations with
traditional graphical user interfaces, are applicable to the website interface.
4. Compare objective layout-based measures of visual aesthetics with subjective
questionnaire-based measures. Earlier observations suggest that the objective layout-based
measures would correlate with questionnaire scales related to screen layout in the
subjective questionnaire-based measures; this study examines this further. Moreover,
these comparisons should help in assessing the tested objective measures and models
using the subjective questionnaire-based measures.
One potential contribution of this study to the knowledge base in the field of
human factors and ergonomics in general, and the field of human computer interaction and
interface design in particular, is to seek clearer answers to the following open
research questions:
1. Can visual aesthetics be measured quantitatively and objectively using mathematical
formulas and computational models? More specifically, do certain visual layout elements
in the interface relate to perception of visual aesthetics? If yes, can these elements be
used as a basis for developing objective measures and models of visual aesthetics?
2. Do objective layout-based measures of visual aesthetics relate to subjective
questionnaire-based measures? If yes, what is the extent of this relationship? Do the
quantitative values produced by both types of measures correlate for all
dimensions of visual aesthetics, or is the correlation limited to certain dimensions? Do specific
objective measures relate only to specific subjective measures?
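One simple way to examine a relationship of the kind raised in the second question is a Pearson correlation between an objective measure and a questionnaire score computed over the same set of designs. The sketch below is illustrative only: the helper name `pearson_r` and all numeric values are hypothetical placeholders, not data or procedures taken from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values (NOT data from this study):
# one objective layout-based value and one average questionnaire score per design.
layout_measure = [0.42, 0.55, 0.61, 0.70, 0.78]
questionnaire_score = [3.1, 3.4, 3.9, 4.2, 4.6]

r = pearson_r(layout_measure, questionnaire_score)
print(f"r = {r:.3f}")
```

Repeating the calculation separately for each questionnaire scale would indicate whether the relationship holds across all dimensions or only for some of them.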
1.3.3 Organization of the study
The rest of this report is organized as follows. Chapter 2 provides a literature review and
related background research. It covers the latest findings concerning the relationship between
visual aesthetics and usability in interface design, with consideration to effects of aesthetics on
perceived usability and performance. Next, a classification of quantitative measures and
models of interface aesthetics is given, with coverage of the latest developed measures and
models of visual aesthetics.
Chapter 3 and chapter 4 cover experimental work of this study. Chapter 3 includes coverage
of an experiment conducted to investigate effects of selected visual elements and incorporate
these elements to construct a regression model to predict perceived visual aesthetics.
Experimental work to validate the model and compare it with earlier developed models is also
covered in this chapter.
Chapter 4 reports experimental work carried out to further investigate the effects
of certain visual elements on perceived visual aesthetics of website interface design.
Comparisons of objective with subjective measures used in the experimental trials and a
proposed modification of the model based on these comparisons are also presented in this
chapter.
The final chapter contains a summary of results, conclusions, and recommendations for future
work.
CHAPTER 2
LITERATURE REVIEW AND BACKGROUND RESEARCH
2.1 Aesthetics and Usability in Interface Design
Attention to the importance of aesthetics in interface design began with the findings of Kurosu
and Kashimura (1995). Using different designs of an automated teller machine interface,
they found a high correlation between users’ prior perception of usability (which they called
apparent usability) and users’ perception of the visual aesthetics of the interface. Participants
perceived the visually appealing interface designs as easier to use.
More research followed, aimed at understanding the nature of the relationship between
the visual aesthetics of an interface and its usability. Earlier studies concentrated on verifying
Kurosu and Kashimura's results using different interface designs and test setups. Later
studies examined the role of context of use in the relationship between aesthetics and usability, and
the most recent studies looked for possible positive effects of interface aesthetics on
users’ performance.
2.1.1 Aesthetics and perceived usability
In an attempt to demonstrate that Kurosu and Kashimura’s (1995) findings were culturally
dependent, Tractinsky (1997) replicated the study in a different cultural setting using a more
rigorous methodology. Kurosu and Kashimura’s study was conducted in Japan; Tractinsky
argued that Japan’s culture is known for its aesthetic traditions and that Japanese users would
have a more positive attitude towards aesthetics in computer interfaces. This attitude might
have led to Kurosu and Kashimura’s results. Tractinsky conducted his study in Israel, a culture
known for its action orientation and supposed to have a less positive attitude towards aesthetics.
Unexpectedly, a higher correlation between interface aesthetics and perceived usability was
found. This result further supported Kurosu and Kashimura’s findings and suggested a strong
relationship between interface aesthetics and perceived usability. Furthermore, this strong
relationship remains intact even after actual use of the system (Tractinsky et al., 2000).
Tractinsky et al. (2000) conducted a study to see whether actual use of the system would
change users’ perception that visually aesthetic interfaces are easy to use. Results showed the
same high correlation between visual aesthetics and post-use perceived usability.
Lavie and Tractinsky (2004) found that users’ perception of interface aesthetics consists of
two main dimensions, which they termed “classical aesthetics” and “expressive aesthetics”. The
classical aesthetics emphasize orderly and clear design and are closely related to many of the
usability and interface design rules and guidelines. The expressive aesthetics dimension is
linked to the designers’ creativity and originality and to the ability to break design
conventions. They also developed a questionnaire-based instrument to measure each of the
two dimensions.
Lindgaard et al. (2006) performed a number of experiments to determine how quickly users' first
impressions of the visual appeal of websites are formed. Their results indicated that users'
immediate aesthetic impressions formed very quickly, within 50 milliseconds. Tractinsky
et al. (2006) replicated and extended the study of Lindgaard et al. (2006) to test whether these
immediate impressions would remain stable over time. Tractinsky et al. (2006) confirmed the
findings of Lindgaard et al. (2006) and showed that users' first impressions, formed after a
short exposure to the webpages, remained stable even after a considerably longer exposure.
Phillips and Chapparro (2009) examined users’ impressions of usability when users performed
search and exploratory tasks on websites that varied in visual appeal and usability. Their
results indicate that first impressions are most influenced by the visual appeal of the site.
Users rated sites with high visual appeal and low usability as easier to use, and gave lower
ratings to sites with low visual appeal and high usability.
Tractinsky et al. (2000) suggested that the positive effect interface aesthetics has on
perceived usability resembles the known phenomenon of "beautiful is good" in the field of
social psychology, which relates physical attractiveness to socially desirable characteristics.
Dion et al. (1972, as cited in Tractinsky et al., 2000) found that people who are physically
attractive are assumed (by other people) to possess more socially desirable personality traits
than persons who are unattractive. This phenomenon was also reported in the marketing and
consumer behavior literature (Tractinsky et al., 2000), where it is held responsible for a
carryover of first impressions of products or shopping environments to consumers' evaluations
of other attributes of these products or environments.
Two terms are used to refer to this phenomenon: "halo effect" in the consumer behavior
literature and "confirmation bias" in the human decision-making and judgment literature
(Lindgaard et al., 2006). Confirmation bias states that people tend to seek confirming evidence
of their first impressions and ignore disconfirming evidence (Phillips and Chapparro, 2009).
These first impressions of buyers of a product or users of a website are strongly influenced by
the physical appearance and visual appeal of the product or the site (Lindgaard et al., 2006;
Phillips & Chapparro, 2009). Users who like the appearance of a website when they first see
it may continue to like it regardless of how successful they are in using the site.
2.1.2 Role of context of use
The role of context of use was raised by the conclusions of a study conducted by Hassenzahl
(2004), which contradicted the findings of Kurosu and Kashimura (1995), Tractinsky (1997), and
Tractinsky et al. (2000) of a high correlation between interface aesthetics and perceived
usability. Hassenzahl used MP3 player skins to investigate the relationship among users’
perceptions of beauty (visual aesthetics), goodness (satisfaction), and usability before and
after actually using the system. Based on the study results, Hassenzahl argued that if users are
faced with actual usability problems while using a system, then their perception of usability
will no longer relate to visual aesthetics. In such a case, actual usability (experienced during
use of the system) will be the main determinant of the post-use perceived usability.
Hassenzahl’s findings pointed to a possible effect of context of use on the relationship
between visual aesthetics and perceived usability.
Ben-Bassat et al. (2006) argued that aesthetic appreciation would be more dominant in less
serious contexts where users’ judgments have no later consequences. In a laboratory setting
where users are not going to actually buy a system, their judgment of the system’s acceptability
may be more related to aesthetic aspects than to actual performance issues. Ben-Bassat et
al. (2006) conducted a study in which users evaluated the aesthetics and usability of on-screen
simulations of computerized phone book systems using both subjective and economic
measures. Auction bids were used as the economic measure, in the sense that they required
users to state the market values (prices) they were willing to pay to acquire the systems.
With the subjective questionnaire-based measures, results showed the same strong
relationship between perceived aesthetics and perceived usability before and after
using the system. However, with the auction bids, this relationship was not evident. Users’
bids in auctions were more related to the performance and usability of the systems; i.e., if
users are held responsible for their judgments, then actual usability aspects will influence
their decisions more than the visual appeal of the systems.
De Angeli (2006) and later Hartmann et al. (2008) studied the effect of context of use,
represented by two interaction styles of a website, on the perception of aesthetics and usability.
One interaction style was more traditional and menu-based; the other was more interactive,
exploiting metaphor and humor. Both designs had the same content. Using Lavie and
Tractinsky's scales to evaluate classical and expressive aesthetics, they found that
the metaphor-based design was perceived as having better expressive aesthetics although it
had worse perceived usability. The more serious menu-based design rated high on the classical
aesthetics scale and was perceived as having better usability. They concluded that when the
context is less serious and implies more fun and engagement, as in the metaphor-based design,
aesthetics can have a strong halo effect even on information content, and users are willing to
tolerate usability problems for more engaging interfaces. On the other hand, in more serious
usage contexts, usability appears to have a positive halo effect on content. In such a
context, users prefer an easier-to-use interface with information content presented in a clear
and orderly fashion that simplifies information access.
Similar conclusions were reached by Van Schaik and Ling (2009) regarding the halo effect of
usability on the perception of aesthetics in an information-oriented context. Their findings
indicated that after participants were briefly exposed to the stimulus webpages used in the
study, classically aesthetic webpages that are information oriented were rated as more
attractive than expressively aesthetic pages. Another interesting finding of this study is the
effect of context on the stability of aesthetic perception. Providing a context and a goal of use
increased the stability of users’ aesthetic judgments, both from brief exposure to self-paced
exposure and from self-paced exposure to after site use.
2.1.3 Aesthetics and performance
Although the positive effect of interface aesthetics on subjective perception of usability has
become clearer, it is still unclear what effect aesthetics has on performance. In almost all of
the previous research dealing with interface aesthetics and usability, usability was evaluated
using subjective questionnaires. Whether this positive relation between interface aesthetics
and usability holds when objective performance measures of usability are used is yet to be
established. So far, recent studies addressing the effect of aesthetics on performance show
inconsistent findings. Some reported no effect or a negative effect of aesthetics on
performance. For example, Schmidt et al. (2003) found no significant effect of webpages
with different graphics and font sizes on participants’ interaction time in a reading
comprehension task. Van Schaik and Ling (2009) found that the number of completed tasks was
significantly lower with more appealing webpages in an information retrieval task.
On the other hand, many of the findings of recent studies indicated positive effects of
aesthetics on performance. Cawthon and Vande Moere (2007) found a high positive
correlation between data visualization techniques rated high in aesthetics and objective
usability measures of efficiency and effectiveness.
Using a website providing health-related information as a stimulus, Moshagen et al. (2009)
found a significant effect of aesthetics on completion time in a low usability condition when
participants completed search tasks. They concluded that high aesthetics could enhance
performance under conditions of poor usability.
Sonderegger and Sauer (2010) examined the effect of visual aesthetics on perceived usability
and performance. They employed two designs of cell phones (highly appealing vs. not
appealing) simulated on a computer screen. Participants were asked to complete a number of
typical tasks of cell phone users. Results showed that the visual appearance of the phone had a
positive effect on performance, leading to reduced completion time and number of errors for
the visually appealing design. The same positive relation between aesthetics and perceived
usability was also reported.
Sonderegger and Sauer (2010) argue that the conflicting findings regarding the effect of
aesthetics on performance measures could arise because aesthetics may have positive or
negative effects depending on the context of use. A positive “increased motivation” effect may
be more likely to occur in a serious work context. Aesthetically pleasing designs might put the
user at ease or in flow, which may improve performance. By contrast, a negative
“prolongation of joyful experience” effect might prevail in a leisure context. The user,
taken by the beauty of the product, may concentrate less on the task at hand and try to extend
the enjoyment time, which may reduce performance.
2.2 Quantitative Measures and Models of Interface Aesthetics
In general, two approaches to measure interface aesthetics can be distinguished in the
literature. The first is an objective approach relating screen design features and layout
elements to the users' perception of visual aesthetics (e.g. Bauerly & Liu, 2006; Ngo et al.,
2003). The second is a subjective approach, utilizing questionnaire-based instruments to
measure users' perception of visual aesthetics (e.g. Lavie & Tractinsky, 2004).
2.2.1 Objective Screen layout-based measures.
This approach represents a bottom-up procedure. It has its roots in the rationalistic
philosophical view of aesthetics (Reich, 1993). This approach comprises the concept of
“beauty in the observed object”; i.e. human perception of beauty is based on the order and
organization of the various components constructing the object. It is concerned with
determining which features of the interface design trigger users’ perception of the aesthetics
of the interface. It also explores the possibility of expressing changes in such features as
numerical values and using these values to assess users' perception of interface
aesthetics.
The techniques used by methods in this approach can be traced back to the work of Tullis
(1983 & 1988). Tullis' approach involves the establishment of objective quantitative measures
based on display characteristics. These characteristics should reflect how usable the design of
the display is and should be used to evaluate the display design without the need for collecting
performance data. Tullis applied his approach to alphanumeric displays; he proposed four
measures that can be used to evaluate usability of alphanumeric displays: overall density, local
density, grouping, and layout complexity. They were successfully applied to two case studies
and gave similar results when compared to human performance data (Tullis, 1983). More
studies were conducted based on Tullis' concepts; some used the same four measures
developed by Tullis (e.g. Comber & Maltby, 1995; Miyoshi & Murata, 2001) and others tried
to come up with more measures based on screen layout (e.g. Streveler & Wasserman, 1984;
Sears, 1993).
Methods in this approach can be divided into two categories. The first simply uses numerical
counts of visual features on the screen (such as the number of objects or the number of images)
and relates them to users’ perception of aesthetics. The second uses mathematical formulas to
express more sophisticated visual design features and concepts (such as symmetry and balance)
and relates them to users’ perception of aesthetics.
a. Simple counts measures.
Visual features used in this category include the number of constructing elements or blocks and
chunks of information on the screen (Bauerly & Liu, 2006 & 2008; Michailidou et al., 2008),
number of images (Bauerly & Liu, 2006 & 2008; Djamasbi, 2010; Michailidou et al., 2008),
image size and font size (Djamasbi, 2010; Schmidt et al., 2003), and JPEG file size of
screenshots of websites (Tuch et al., 2010). All the features mentioned above have been tested
in the studies cited next to them, all with results indicating some sort of relationship between
these measures and users’ perception of visual aesthetics.
b. Formularized measures.
Methods in this category argue that the physical layout of visual objects on the screen may play
a role in users’ perception of aesthetics. The procedure involves expressing visual design
features (such as symmetry, balance, and unity) using mathematical formulas and combining the
calculated values for all features to build an overall measure that reflects the aesthetic level
of the interface design.
Methods in this approach are motivated by Tullis’ quantitative techniques for evaluating
screen design (Tullis, 1983), earlier aesthetic measures developed by Birkhoff (1933), and
Gestalt theory for visual design (Chand, 2002; Ngo et al., 2002).
One of such measures is the model developed by Ngo et al. (2003). The model consists of
fourteen proposed measures of screen aesthetics: balance, symmetry, equilibrium, unity,
sequence, density, proportions, cohesion, simplicity, regularity, economy, homogeneity,
rhythm, and order. The value of each measure can be calculated using formulas based on the
layout of visual objects on the screen. The average of all these measures represents the overall
aesthetic value of the screen. When testing these measures using real computer screens, high
correlation was found between the model's computed aesthetic value and users' perceived
aesthetics of the interface.
In one study in which the model was applied to data entry screens (Ngo & Byrne, 2001), a
total of 57 screens with different aesthetic values were tested and multiple regression was used
to fit subjective ratings of the screens (obtained from seven participants) to the measures
(calculated by the model). Results showed that the regression model was
statistically significant and that the measures of balance, unity, and sequence are the most
contributing terms in the model. This model could be considered one of the most successful
attempts to develop aesthetic interface measures based on interface layout. However, the
relatively large number of measures (14) and the associated formulas needed to calculate each
of them make practical application of the model somewhat difficult.
In a practical application of the model, Zain et al. (2008) designed a computer application that
incorporated only five of the fourteen measures proposed by Ngo et al. (2003). The five
selected measures were balance, equilibrium, symmetry, sequence, and rhythm. The software
was applied to language learning webpages. Findings of the study showed some agreement
with users’ ratings, but no statistical test was used to reach conclusive results. These
inconclusive results may be due to the fact that not all the significant measures detected in
Ngo and Byrne’s (2001) study were included in the software (unity, for example, was omitted)
and that possible interactions among the measures were not considered.
Bauerly and Liu (2006 & 2008) tested the effects of symmetry and number of compositional
elements on interface aesthetics. Their findings were broadly similar to those of Ngo et al.
(2003). However, a direct comparison with the study of Ngo et al. (2003) was difficult,
because they used a different approach and different formulas to calculate the values of
the two tested measures in their experiments.
Lai et al. (2010) utilized the quantitative measures of symmetry and balance used by Bauerly
and Liu (2006 & 2008) to quantitatively analyze the aesthetics of a text-overlaid image such
that a best position for overlaying the texts on a background image can be obtained
automatically. The two measures were evaluated against participants’ subjective rating of
visual aesthetic appeal in cases of color and monochrome images. A strong relationship
between balance and overall aesthetic appeal was shown in both cases. No consistent
proportional relationship between symmetry and subjective ratings of aesthetic appeal was
shown.
Bi et al. (2011) repeated the study of Bauerly and Liu (2008) to investigate the effects of
symmetry and number of compositional elements on Chinese users. The goal was to compare with
Bauerly and Liu's study, which was conducted with American users. Similar results were found
regarding the positive effect of symmetry on participants' ratings of perceived visual
aesthetics. Different results were found in one case with the number of compositional elements.
The study also reported the development of a computational model to predict aesthetic ratings
based on symmetry and number of compositional elements. The model showed an acceptable
level of performance when evaluated using participants' ratings from the same study.
However, the validity of the model was not thoroughly tested using different settings and other
groups of participants.
2.2.2 Subjective Questionnaire-based measures.
Supporters of this approach claim that the complexity of, and interrelationships among, the
screen design elements make it difficult to use them to quantitatively measure aesthetics
(Lavie and Tractinsky, 2004). It would be more convenient to use questionnaire-based
instruments to measure users’ subjective perception of aesthetics. Two widely accepted
instruments of this kind are the classical and expressive instrument developed by Lavie and
Tractinsky (2004) and the Visual Aesthetics of Website Inventory (VisAWI) tool developed
by Moshagen and Thielsch (2010). Both were designed to measure perceived visual aesthetics
of websites. Scales and items of both questionnaires are shown in Table 2.1.
Lavie and Tractinsky (2004) found two dimensions of the perceived website aesthetics,
termed “classical aesthetics” and “expressive aesthetics”. The classical aesthetics dimension
emphasizes orderly and clear design and is closely related to many of the usability and
interface design rules and guidelines. The expressive aesthetics dimension is linked to the
designers’ creativity and originality and to the ability to break design conventions. These two
dimensions were the basis for developing a quantitative questionnaire-based instrument to
measure website interface aesthetics. The classical dimension includes the items “aesthetic”,
“pleasant”, “symmetric”, “clear”, and “clean”, while the expressive aesthetics includes the
items “creative”, “fascinating”, “original”, “sophisticated”, and “uses special effects”.
VisAWI was constructed to serve as a new tool to measure perceived website aesthetics. It
was designed to cover broader aspects of perceived website aesthetics that were not
adequately represented in earlier instruments. The instrument is based on
four interrelated facets of perceived visual aesthetics of websites: simplicity, diversity,
colorfulness, and craftsmanship. Simplicity comprises visual aesthetics aspects such as
balance, unity, and clarity. It is closely related to the classical aesthetics dimension. The
diversity facet comprises visual complexity, dynamics, novelty, and creativity. It is closely
related to the expressive aesthetics dimension. The colorfulness facet represents aesthetic
impressions perceived from the selection, placement, and combination of colors.
Craftsmanship comprises the skillful and coherent integration of all relevant design
dimensions. Each of the first two facets is represented by five items in the questionnaire,
while each of the last two facets has four items.
Table 2.1 Scales and items in the Classical/Expressive and the VisAWI questionnaires
Classical/Expressive         VisAWI
Scale        Item            Scale           Item
Classical    aesthetic       Simplicity      The layout appears too dense.
             pleasant                        The layout is easy to grasp.
             clear                           Everything goes together on this webpage.
             clean                           The webpage appears patchy.
             symmetric                       The layout appears well structured.
Expressive   creative        Diversity       The layout is pleasantly varied.
             fascinating                     The layout is inventive.
             original                        The design appears uninspired.
             sophisticated                   The layout appears dynamic.
             special effect                  The design is uninteresting.
                             Colorfulness    The color composition is attractive.
                                             The colors do not match.
                                             The choice of color is messed up.
                                             The colors are appealing.
                             Craftsmanship   The layout appears professionally designed.
                                             The layout is not up-to-date.
                                             The webpage is designed with care.
                                             The design of the webpage lacks a concept.
CHAPTER 3
VERIFYING NGO AND BYRNE’S FINDINGS AND
DEVELOPING A PRELIMINARY MODEL
3.1 Introduction
The main purpose of the experimental work covered in this chapter is to verify the findings of
Ngo and Byrne (2001) and Ngo et al. (2003) (summarized in section 2.2.1) using a more
rigorous experimental approach under a different setting and context with a fresh group of
participants.
A controlled experiment was designed and conducted to further examine and verify Ngo
and Byrne's (2001) findings. The first goal of the experiment is to test the effects of the
layout elements of balance, unity, and sequence on interface aesthetics, including the
possibility of interactions among these measures. The second goal is to use these elements
to build and validate a regression model representing users' perceived visual aesthetics.
The validation procedure includes a cross validation of the results by comparing the
regression model developed in this experiment with Ngo and Byrne’s model. The model will
also be validated using subjective standard questionnaire scores of real webpages.
To accomplish these goals, the experimental procedure employed simple abstract
black and white screens to systematically assess the effects of these three elements on
perceived visual aesthetics. Abstract screens make it easy to manipulate and study the
related elements in a controlled environment, which helps ensure statistically valid
results. This procedure was also used in similar previous
studies (Bauerly & Liu, 2006; Lai et al., 2010; Bi et al., 2011).
The three elements (balance, unity, and sequence) were chosen based on the findings of Ngo
and Byrne’s study (2001). According to their findings, these three elements were the terms
that contributed most to the developed computational model.
The balance element in screen design can be achieved by maintaining equal weights of
visual objects on the screen: top and bottom, left and right (Ngo et al., 2003). Unity is the
extent to which visual objects on the screen seem to belong together as one object (Ngo et
al., 2003). Sequence corresponds to the arrangement of visual objects on a screen in a way
that facilitates eye movement. Eye movements usually follow the pattern associated
with reading. In cultures that read from left to right, the eyes will start from the upper left
and move back and forth across the screen to the lower right (Ngo et al., 2003). Moreover,
bigger objects on the screen have more visual weight, and the eyes move from the bigger to
the smaller objects on the screen.
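The idea of balance as "equal visual weight" on opposite sides of the screen can be illustrated with a small sketch. This is not the formula of Ngo et al. (2003), which is given in Appendix A and is not reproduced here; it is a simplified, hypothetical score that assumes the weight of an object is its area multiplied by the distance of its center from the screen's center axis. The names `Rect` and `balance_score` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width
    h: float  # height


def balance_score(rects, frame_w, frame_h):
    """Simplified balance-like score in [0, 1]; 1 means equal weight on opposite sides.

    Assumption (not from the source): weight = area * distance of the object's
    center from the vertical or horizontal center axis of the frame.
    """
    cx, cy = frame_w / 2, frame_h / 2
    left = right = top = bottom = 0.0
    for r in rects:
        area = r.w * r.h
        dx = r.x + r.w / 2 - cx  # signed horizontal offset of the object's center
        dy = r.y + r.h / 2 - cy  # signed vertical offset of the object's center
        if dx < 0:
            left += area * -dx
        else:
            right += area * dx
        if dy < 0:
            top += area * -dy
        else:
            bottom += area * dy

    def imbalance(a, b):
        # Normalized difference between two opposing weights (0 = equal).
        largest = max(a, b)
        return 0.0 if largest == 0 else abs(a - b) / largest

    return 1.0 - (imbalance(left, right) + imbalance(top, bottom)) / 2


# Four equal squares placed symmetrically on a 1024 x 1024 frame.
symmetric = [Rect(100, 100, 100, 100), Rect(824, 100, 100, 100),
             Rect(100, 824, 100, 100), Rect(824, 824, 100, 100)]
# The same four squares crowded into the upper-left quadrant.
crowded = [Rect(50, 50, 100, 100), Rect(200, 50, 100, 100),
           Rect(50, 200, 100, 100), Rect(200, 200, 100, 100)]

print(balance_score(symmetric, 1024, 1024))
print(balance_score(crowded, 1024, 1024))
```

Under these assumptions the symmetric layout scores at the top of the range and the crowded layout at the bottom, which mirrors the intent of the "high" and "low" balance levels described in this chapter.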
Ngo et al. (2003) developed formulas to calculate numerical values for each of these
elements. The formulas were developed so that each element (measure) takes a value
ranging from zero (for the lowest screen aesthetics level) to one (for the highest screen
aesthetics level). These formulas are used here to calculate the required values for
the three elements. The formulas for the three elements, with hypothetical examples
showing their use, are given in Appendix A.
3.2 Method
3.2.1 Design of the experiment
An experiment was designed and conducted to test effects of the three screen layout
elements of balance, unity, and sequence on participants' perceived aesthetic value of
interface design.
A factorial design was utilized with the three screen elements as the main factors. Each
of the three factors was tested at two levels (high and low) that were intended to cover the
whole range of each factor. The design used is a 2^3 within-participants factorial design
with repeated measures. This design produces eight experimental conditions representing
the factorial combinations of the three factors, each at two levels (2^3 = 8 conditions).
The three factors: balance, unity, and sequence represent the independent variables and
the dependent variable is participants' ratings of interface aesthetics.
This type of factorial design was used because it is relatively easier to apply and
because it can give reliable results with relatively small number of participants.
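The eight conditions of a two-level, three-factor design can be enumerated mechanically. The sketch below is illustrative only; the names `FACTORS`, `LEVELS`, and `factorial_conditions` are assumptions, and the order in which the conditions are generated is not the screen numbering used in this study.

```python
from itertools import product

FACTORS = ("balance", "unity", "sequence")
LEVELS = ("+", "-")  # "+" = high level, "-" = low level


def factorial_conditions(factors=FACTORS, levels=LEVELS):
    """Return every combination of levels as a list of {factor: level} dicts."""
    return [dict(zip(factors, combo))
            for combo in product(levels, repeat=len(factors))]


conditions = factorial_conditions()
for number, condition in enumerate(conditions, start=1):
    print(number, condition)
```

With three factors at two levels each, the list contains 2 * 2 * 2 = 8 entries, one per experimental condition.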
3.2.2 Screen designs
Eight black and white screen models representing the eight experimental combinations (3
factors, each at 2 levels) were prepared. Each screen has an on-screen size of 1024
by 1024 pixels. Four squares were used as the screen objects to be manipulated to
produce the required experimental conditions. A relatively small number of only four
objects was used in each screen to simplify the object manipulation required to produce the
experimental conditions.
The required numerical value of each factor was calculated using the formulas
developed by Ngo et al. (2003); examples of how the calculations were carried out are
given in Appendix A. Although, theoretically, the two levels of each factor are supposed
to represent the extreme values (0 for low and 1 for high), this was difficult to achieve in
practice. To overcome this difficulty, a range was used to represent each level, with the low
level below 0.25 and the high level above 0.75.
Table 3.1 shows the different factors levels (+ for high and – for low) and values
associated with the eight screen designs. It also shows the overall aesthetic measure value
of each screen, obtained by calculating the average of the values of the three factors. Fig
3.1 represents the eight screen models associated with the eight experimental conditions.
They are presented in the same order as in Table 3.1; for example, screen 1 represents the
condition of all the factors at the "high" level (+++) and screen 2 represents the condition
of all factors at the "low" level (---). The remaining screens represent the different
combinations of "high" and "low" levels for the three factors (as explained in Table 3.1).
Table 3.1 The eight experimental conditions and the associated factors levels and values.
Screen        Levels   Balance   Unity   Sequence   Aesthetic
(Condition)                                         Measure
1             + + +    1.00      0.99    1.00       0.997
2             - - -    0.10      0.18    0.00       0.092
3             + - -    0.98      0.24    0.00       0.406
4             + + -    0.91      0.80    0.25       0.650
5             - + +    0.09      0.82    1.00       0.637
6             - - +    0.04      0.15    1.00       0.396
7             + - +    1.00      0.15    1.00       0.716
8             - + -    0.25      0.78    0.25       0.427
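The overall aesthetic measure in Table 3.1 is described as the average of the three factor values. The sketch below recomputes that average from the tabulated values; the names `SCREENS`, `REPORTED`, and `aesthetic_measure` are illustrative. Because the factor values in the table are rounded to two decimals, the recomputed averages can differ from the reported ones in the third decimal place.

```python
# (balance, unity, sequence) per screen, copied from Table 3.1
SCREENS = {
    1: (1.00, 0.99, 1.00),
    2: (0.10, 0.18, 0.00),
    3: (0.98, 0.24, 0.00),
    4: (0.91, 0.80, 0.25),
    5: (0.09, 0.82, 1.00),
    6: (0.04, 0.15, 1.00),
    7: (1.00, 0.15, 1.00),
    8: (0.25, 0.78, 0.25),
}

# Aesthetic measure column as reported in Table 3.1
REPORTED = {1: 0.997, 2: 0.092, 3: 0.406, 4: 0.650,
            5: 0.637, 6: 0.396, 7: 0.716, 8: 0.427}


def aesthetic_measure(values):
    """Overall aesthetic measure: the plain average of the factor values."""
    return sum(values) / len(values)


for screen, values in SCREENS.items():
    computed = aesthetic_measure(values)
    print(f"screen {screen}: computed {computed:.3f}, reported {REPORTED[screen]:.3f}")
```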
3.2.3 Participants and apparatus
Thirteen graduate engineering students (10 males and 3 females) volunteered to
participate in the experiment, with a mean age of 29.3 years and a standard deviation of 6.1
years. The participants came from widely diverse cultural backgrounds. They included
students from the US, Asia, Europe, Africa, and the Middle East.
An IBM-compatible PC with a 17" LCD display, a screen resolution of 1280×1024 pixels, and
32-bit true color depth was used in the experiment. The operating system was
Microsoft Windows XP. Microsoft Office PowerPoint 2003 was used to display the screens.
Figure 3.1 The eight screen models associated with the experimental conditions
3.2.4 Procedure
The eight screens were presented randomly on a computer display to each participant using
a PowerPoint presentation, with the participant controlling the progress of the presentation.
The participants were instructed to rate each screen based on their personal preferences
using a 10-point scale, with 10 representing "most beautiful" and 1 representing "least
beautiful". Each experimental trial started with the experimenter explaining the purpose of
the experiment and reading short written instructions explaining the nature of the
experiment and the task to be performed. Next, all eight screens were quickly presented
to the participant. After that, each screen was presented separately and the participant had
to view the screen and write his/her rating on a paper form. Participants were encouraged
to rate each screen as fast as possible based on their intuitions and first impressions.
3.3 Results
3.3.1 Participants' ratings
Participants' average aesthetic ratings of each screen are presented in Table 3.2 next to the
corresponding calculated aesthetic values. Participants' ratings were divided by 10 to make
them compatible with the computed values of aesthetic measure. Comparing these ratings
to the calculated aesthetic measures, some accordance between both can be noticed, except
for screen 2; a relatively high average rating was given to this screen, which was a bit
surprising, since this screen is supposed to represent the lowest level of interface
aesthetics.
A relatively high correlation coefficient of 0.84 (p-value = 0.008) was found between
participants' ratings and the measured values of aesthetics. This agrees with the findings of
previous studies.
Table 3.2 Calculated aesthetic values and participants' average aesthetic ratings.
Screen (Condition)   Aesthetic Measure   Average Aesthetic Ratings
1                    0.997               0.908
2                    0.092               0.438
3                    0.406               0.485
4                    0.650               0.654
5                    0.637               0.546
6                    0.396               0.415
7                    0.716               0.515
8                    0.427               0.354
3.3.2 Analysis of variance
Analysis of variance results are shown in Table 3.3. All three elements (balance, unity, and
sequence) have significant effects on the perceived interface aesthetics (P-values < 0.001).
Only the two-way interactions involving the unity element were found significant
(P-values < 0.001). No significant interaction between balance and sequence was found
(P-value = 0.215). The three-way interaction was not significant (P-value = 0.933). A test
power of 0.994 (at α = 0.05) was calculated using an average estimated effect value of 1.224,
indicating that the sample size of 13 participants was sufficient for obtaining
statistically valid results.
Table 3.3 Analysis of variance results
Element F P-value
Balance (B) 76.56 < 0.001
Unity (U) 43.34 < 0.001
Sequence (S) 24.17 < 0.001
Balance – Unity interaction (B*U) 31.17 < 0.001
Balance – Sequence interaction (B*S) 1.56 0.215
Unity – Sequence interaction (U*S) 22.56 < 0.001
Balance - Unity – Sequence interaction (B*U*S) 0.01 0.933
Participants 7.97 < 0.001
The implications of the significant effects of the three elements are easier to see in the
main effects and interaction plots presented in Fig 3.2. Average effects of the main factors
are plotted in Fig 3.2 (a). For all three factors, participants' average ratings of interface
aesthetics increase as the factor moves from its low level to its high level. Balance has the
largest effect, closely followed by unity, and lastly sequence with a relatively smaller effect.
Plots of the two-way interaction effects among the factors are shown in Fig 3.2 (b) and
(c). These plots indicate that, for each pair of factors, the effect of one factor is larger at
the high level of the other factor; at the low level the effect is very small. For example,
in Fig 3.2 (b), at the high level of balance, the average rating rises from 5 at the low level
of unity to 7.81 at its high level. At the low level of balance, the plot shows only a very
small change with unity (from 4.3 to 4.5).
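The cell means quoted in this example can be rechecked from the average ratings in Table 3.2. The sketch below is only illustrative: it assumes that each screen's high or low level of balance, unity, and sequence can be read off the computed element values reported for the eight screens (that level assignment is an inference, not a design table given in the text), and the helper names are hypothetical.

```python
# Average ratings on the 1-10 scale: Table 3.2 ratings multiplied by 10.
ratings = {1: 9.08, 2: 4.38, 3: 4.85, 4: 6.54, 5: 5.46, 6: 4.15, 7: 5.15, 8: 3.54}

# ASSUMPTION: high (+1) / low (-1) level per screen, inferred from the computed
# element values of the eight screens; order is (balance, unity, sequence).
levels = {
    1: (+1, +1, +1), 2: (-1, -1, -1), 3: (+1, -1, -1), 4: (+1, +1, -1),
    5: (-1, +1, +1), 6: (-1, -1, +1), 7: (+1, -1, +1), 8: (-1, +1, -1),
}
FACTORS = ("balance", "unity", "sequence")


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def main_effect(factor):
    """Mean rating at the high level minus mean rating at the low level."""
    i = FACTORS.index(factor)
    high = mean(r for s, r in ratings.items() if levels[s][i] > 0)
    low = mean(r for s, r in ratings.items() if levels[s][i] < 0)
    return high - low


def cell_mean(balance_level, unity_level):
    """Mean rating for one balance/unity combination, averaged over sequence."""
    return mean(r for s, r in ratings.items()
                if levels[s][0] == balance_level and levels[s][1] == unity_level)


for factor in FACTORS:
    print(f"{factor:9s} main effect: {main_effect(factor):.2f}")
for b in (+1, -1):
    print("balance", "high" if b > 0 else "low ",
          "| unity low:", round(cell_mean(b, -1), 2),
          "| unity high:", round(cell_mean(b, +1), 2))
```

Under these assumptions, the gap between the two unity levels is much wider in the high-balance row than in the low-balance row, which is the interaction pattern described for Fig 3.2 (b).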
Figure 3.2 Average effects and interactions plots (vertical axis: average aesthetic rating; horizontal axis: factor level, low to high).
(a). Average effects of the three factors: balance, unity, and sequence
(b). Interaction between balance and unity (unity level on the horizontal axis; one line each for low and high balance)
(c). Interaction between unity and sequence (sequence level on the horizontal axis; one line each for low and high unity)
3.4 Constructing and Validating the Regression Model
3.4.1 Constructing the model
Based on the results of the analysis of variance, a regression model relating the significant
elements and interactions to the perceived aesthetic values was constructed. The model is
shown below (Equation 3.1):
Aesthetic Value = 0.497 - 0.0077 B - 0.286 U - 0.0717 S + 0.419 B*U + 0.375 U*S (3.1)
Where:
B : Balance
U : Unity
S : Sequence
The model has only five terms, and only the values of the three elements need to be substituted
into it to obtain the equivalent value of perceived aesthetics. The model was used to
calculate values for the eight screens of the experiment, and the results were compared with
the actual values of participants' ratings. The comparison is shown in Fig 3.3 and Table 3.4.
The predicted values calculated by the model and the actual values of participants'
ratings are very close. A high correlation (r = 0.99, p-value < 0.001) was found between
actual and predicted values.
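As a minimal sketch of how Equation 3.1 is applied, the snippet below evaluates the model for the eight screens. The balance, unity, and sequence inputs are the values listed for these screens in Table 3.5, and the actual ratings are those of Table 3.2; the function and variable names are hypothetical.

```python
def aesthetic_value(balance, unity, sequence):
    """Equation 3.1: regression model with B*U and U*S interaction terms."""
    return (0.497
            - 0.0077 * balance
            - 0.286 * unity
            - 0.0717 * sequence
            + 0.419 * balance * unity
            + 0.375 * unity * sequence)


# (balance, unity, sequence) per screen, as listed in Table 3.5.
elements = {
    1: (1.000, 0.990, 1.000), 2: (0.098, 0.178, 0.000),
    3: (0.981, 0.238, 0.000), 4: (0.905, 0.796, 0.250),
    5: (0.093, 0.818, 1.000), 6: (0.038, 0.150, 1.000),
    7: (0.997, 0.150, 1.000), 8: (0.251, 0.780, 0.250),
}
# Average participant ratings divided by 10 (Table 3.2).
actual = {1: 0.908, 2: 0.438, 3: 0.485, 4: 0.654,
          5: 0.546, 6: 0.415, 7: 0.515, 8: 0.354}

predicted = {screen: aesthetic_value(*e) for screen, e in elements.items()}
for screen in sorted(predicted):
    print(f"screen {screen}: actual {actual[screen]:.3f}  "
          f"predicted {predicted[screen]:.3f}")
```

Because the tabulated inputs are rounded to three decimals, the recomputed predictions can differ from the Table 3.4 entries in the last digit.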
Figure 3.3 Scatter diagram of actual and predicted aesthetic values for the eight screens.
Table 3.4 Actual and predicted aesthetic values of the eight screens.
Screen no   Actual aesthetic value   Predicted aesthetic value
1           0.908                    0.920
2           0.438                    0.453
3           0.485                    0.519
4           0.654                    0.621
5           0.546                    0.528
6           0.415                    0.441
7           0.515                    0.493
8           0.354                    0.409
3.4.2 Comparison with Ngo and Byrne’s model
To further validate the results, the model was compared with the model of Ngo and
Byrne (2001). This comparison was carried out first by using the model developed by Ngo
and Byrne (2001) to predict aesthetic values for the eight screens of the current study. A
short version of the model with only six terms (from the original 14) was used. These
terms are the ones found significant in the Ngo and Byrne (2001) study. The model with the
six terms is shown in Equation (3.2) below:
Aesthetic Value = 0.038 + 0.11 B + 0.126 U + 0.0771 S + 0.061 D + 0.186 P + 0.0486 H (3.2)
Where:
B : Balance
U : Unity
S : Sequence
D : Density
P : Proportion
H : Homogeneity
Table 3.5 presents the calculated values of the six terms (elements) for the eight screens that
were substituted into the model to produce the predicted aesthetic value for each screen.
Predicted aesthetic values calculated by the model are plotted against actual values of
participants' ratings in Fig 3.4. Table 3.6 lists these values. Fig 3.4 and Table 3.6 indicate
that for almost all screens the aesthetic values produced by the model are lower than the
actual perceived values. This was expected, since the original model has fourteen terms
and only six are used here. However, a high correlation (r = 0.90, P-value = 0.002) was found
between predicted and actual values.
Table 3.5 Calculated values of the six terms (elements) included in the Ngo and Byrne (2001) model.
Screen no   Balance   Unity   Sequence   Density   Proportion   Homogeneity
1           1.000     0.990   1.000      0.500     1.000        1.000
2           0.098     0.178   0.000      0.468     1.000        1.000
3           0.981     0.238   0.000      0.112     1.000        1.000
4           0.905     0.796   0.250      0.406     1.000        1.000
5           0.093     0.818   1.000      0.043     1.000        1.000
6           0.038     0.150   1.000      0.200     1.000        1.000
7           0.997     0.150   1.000      0.023     1.000        1.000
8           0.251     0.780   0.250      0.050     1.000        1.000
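The substitution described above can be sketched as follows: Equation 3.2 is written as a function and applied to each row of Table 3.5. The function and variable names are hypothetical, and the coefficients are simply those printed in Equation 3.2.

```python
def ngo_byrne_short(b, u, s, d, p, h):
    """Equation 3.2: six-term short version of the Ngo and Byrne (2001) model."""
    return (0.038 + 0.11 * b + 0.126 * u + 0.0771 * s
            + 0.061 * d + 0.186 * p + 0.0486 * h)


# Rows of Table 3.5: balance, unity, sequence, density, proportion, homogeneity.
table_3_5 = {
    1: (1.000, 0.990, 1.000, 0.500, 1.000, 1.000),
    2: (0.098, 0.178, 0.000, 0.468, 1.000, 1.000),
    3: (0.981, 0.238, 0.000, 0.112, 1.000, 1.000),
    4: (0.905, 0.796, 0.250, 0.406, 1.000, 1.000),
    5: (0.093, 0.818, 1.000, 0.043, 1.000, 1.000),
    6: (0.038, 0.150, 1.000, 0.200, 1.000, 1.000),
    7: (0.997, 0.150, 1.000, 0.023, 1.000, 1.000),
    8: (0.251, 0.780, 0.250, 0.050, 1.000, 1.000),
}

ngo_predicted = {screen: ngo_byrne_short(*row) for screen, row in table_3_5.items()}
for screen in sorted(ngo_predicted):
    print(f"screen {screen}: predicted {ngo_predicted[screen]:.3f}")
```

Since proportion and homogeneity equal 1.000 for every screen, those two terms add the same constant to all eight predictions; the differences between screens come from balance, unity, sequence, and density.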
To complete the comparison and further verify the current model (Eq. (3.1)), the current
model was used to estimate the values of the 57 screens used in the Ngo and Byrne (2001)
study. Values of the three terms for the 57 screens, required to calculate the predicted
values, were obtained from the Ngo and Byrne (2001) study. After calculating the predicted
aesthetic values for the 57 screens using the current model, the coefficient of correlation was
calculated between the predicted values and the actual values of participant ratings given to
these screens (obtained from Ngo and Byrne, 2001). A relatively high correlation (r = 0.81,
p-value < 0.001) was found. The original Ngo and Byrne model with all 14 terms gave
a correlation coefficient of r = 0.94 (p-value < 0.001).
Figure 3.4 Scatter diagram of actual and predicted (Ngo and Byrne model) aesthetic values for
the eight screens.
Table 3.6 Actual and predicted (Ngo and Byrne model) aesthetic values of the eight screens.
Screen no   Actual aesthetic value   Predicted aesthetic value
1           0.908                    0.615
2           0.438                    0.334
3           0.485                    0.417
4           0.654                    0.516
5           0.546                    0.466
6           0.415                    0.385
7           0.515                    0.480
8           0.354                    0.421
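The two correlations reported for the eight screens can be rechecked from the tabulated values. The sketch below computes a plain Pearson coefficient from the rounded entries of Tables 3.4 and 3.6, so small differences from the reported figures are possible; the helper name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Screens 1..8: actual ratings and the predictions of the two models.
actual = [0.908, 0.438, 0.485, 0.654, 0.546, 0.415, 0.515, 0.354]
pred_current = [0.920, 0.453, 0.519, 0.621, 0.528, 0.441, 0.493, 0.409]  # Table 3.4
pred_ngo = [0.615, 0.334, 0.417, 0.516, 0.466, 0.385, 0.480, 0.421]      # Table 3.6

r_current = pearson_r(actual, pred_current)
r_ngo = pearson_r(actual, pred_ngo)
print(f"current model:       r = {r_current:.2f}")
print(f"Ngo and Byrne model: r = {r_ngo:.2f}")

# Mean signed error shows the direction of any systematic offset.
bias_ngo = sum(p - a for a, p in zip(actual, pred_ngo)) / len(actual)
print(f"Ngo and Byrne mean error: {bias_ngo:+.3f}")
```

A negative mean error for the six-term model corresponds to the observation that its predictions are lower than the actual ratings for almost all screens, even though the correlation stays high.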
Table 3.7 lists actual and predicted values for the 57 screens. In general, the current
model tends to overestimate the actual values. In only thirteen of the 57 screens shown in
Table 3.7 did the model give lower values than the actual ones. A possible reason is that the
participants' ratings reported in the Ngo and Byrne study were rounded to one decimal place.
This possibility was checked by rounding the predicted values; the number of overestimated
values was reduced but the coefficient of correlation did not increase. Nevertheless, with only
three measures (from the original 14 proposed by Ngo and Byrne, 2001), the current model was
able to estimate the values of the 57 screens of the Ngo and Byrne (2001) study with a
statistically significant, though lower, correlation (r = 0.81 versus r = 0.94 for the original
model).
The computational formulas and models suggested by Ngo and Byrne (2001) were
originally developed for data entry screens. It would be interesting to see how they
work with website interfaces. In the next section, the three selected objective layout-based
measures, with the associated models (the regression model and Ngo and Byrne's original
model), are applied to real webpages and evaluated using subjective questionnaire-based
measures.
Table 3.7 Actual and predicted (current model) aesthetic values of the 57 screens of the Ngo and Byrne (2001) study.
Screen no  Actual  Predicted   Screen no  Actual  Predicted   Screen no  Actual  Predicted
1 0.500 0.558 21 0.500 0.514 41 0.600 0.613
2 0.600 0.604 22 0.500 0.545 42 0.600 0.656
3 0.700 0.633 23 0.400 0.464 43 0.600 0.640
4 0.700 0.879 24 0.600 0.556 44 0.500 0.510
5 0.500 0.557 25 0.700 0.740 45 0.500 0.508
6 0.500 0.501 26 0.500 0.522 46 0.500 0.505
7 0.500 0.524 27 0.600 0.672 47 0.600 0.607
8 0.600 0.593 28 0.600 0.676 48 0.600 0.615
9 0.500 0.607 29 0.600 0.535 49 0.500 0.483
10 0.600 0.584 30 0.500 0.516 50 0.500 0.491
11 0.600 0.668 31 0.500 0.485 51 0.500 0.509
12 0.500 0.510 32 0.500 0.499 52 0.500 0.528
13 0.500 0.671 33 0.500 0.514 53 0.500 0.524
14 0.500 0.512 34 0.500 0.510 54 0.500 0.527
15 0.400 0.472 35 0.500 0.532 55 0.600 0.603
16 0.500 0.568 36 0.400 0.497 56 0.500 0.575
17 0.500 0.566 37 0.600 0.582 57 0.500 0.553
18 0.400 0.357 38 0.600 0.611
19 0.500 0.462 39 0.600 0.610
20 0.500 0.481 40 0.500 0.555
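The over- and underestimation counts discussed for the 57 screens can be tallied directly from Table 3.7. The sketch below transcribes the table as (actual, predicted) pairs in screen order; the variable names are hypothetical.

```python
# (actual, predicted) pairs for screens 1..57, transcribed from Table 3.7.
pairs = [
    (0.5, 0.558), (0.6, 0.604), (0.7, 0.633), (0.7, 0.879), (0.5, 0.557),
    (0.5, 0.501), (0.5, 0.524), (0.6, 0.593), (0.5, 0.607), (0.6, 0.584),
    (0.6, 0.668), (0.5, 0.510), (0.5, 0.671), (0.5, 0.512), (0.4, 0.472),
    (0.5, 0.568), (0.5, 0.566), (0.4, 0.357), (0.5, 0.462), (0.5, 0.481),
    (0.5, 0.514), (0.5, 0.545), (0.4, 0.464), (0.6, 0.556), (0.7, 0.740),
    (0.5, 0.522), (0.6, 0.672), (0.6, 0.676), (0.6, 0.535), (0.5, 0.516),
    (0.5, 0.485), (0.5, 0.499), (0.5, 0.514), (0.5, 0.510), (0.5, 0.532),
    (0.4, 0.497), (0.6, 0.582), (0.6, 0.611), (0.6, 0.610), (0.5, 0.555),
    (0.6, 0.613), (0.6, 0.656), (0.6, 0.640), (0.5, 0.510), (0.5, 0.508),
    (0.5, 0.505), (0.6, 0.607), (0.6, 0.615), (0.5, 0.483), (0.5, 0.491),
    (0.5, 0.509), (0.5, 0.528), (0.5, 0.524), (0.5, 0.527), (0.6, 0.603),
    (0.5, 0.575), (0.5, 0.553),
]

# Screen numbers where the model falls below / above the actual rating.
underestimated = [i + 1 for i, (a, p) in enumerate(pairs) if p < a]
overestimated = [i + 1 for i, (a, p) in enumerate(pairs) if p > a]
mean_error = sum(p - a for a, p in pairs) / len(pairs)

print("underestimated screens:", underestimated)
print("number overestimated:", len(overestimated))
print(f"mean signed error: {mean_error:+.3f}")
```

A positive mean signed error is another way of expressing the tendency of the current model to overestimate the ratings reported for these screens.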
3.4.3 Validating the model using standard questionnaire scores of real webpages
The regression model with the interaction terms was used to calculate the visual aesthetics of
forty-two webpages already used in a previous study (Moshagen & Thielsch, 2010) to
develop the VisAWI questionnaire-based measure of visual aesthetics of websites. These
42 webpages were used by Moshagen & Thielsch (2010) to validate the VisAWI
questionnaire and compare it with the classical and expressive aesthetics questionnaire.
Aesthetic values calculated for the 42 webpages by the regression model were compared to
the scores of the VisAWI and classical/expressive questionnaires already available in Moshagen
& Thielsch (2010). Correlation analysis was conducted to see how the three objective
layout-based elements and the associated model tested in this study relate to standard
questionnaire-based measures of visual aesthetics.
These 42 webpages were chosen for this study because they cover a wide
variety of websites with different levels of visual aesthetics. In addition, questionnaire
scores for a large sample are already available for these pages; scores of a total of 512
participants were used to validate the questionnaire. Of the participants, 347 (67.8%) were
female. Age ranged from 15 to 82 years (M = 30.50; SD = 10.61). A 7-point Likert scale
was used in the questionnaires of the Moshagen & Thielsch (2010) study. A list of all average
scores for each webpage is given in Appendix B. Table 3.8 summarizes descriptive statistics for
the two questionnaire scales. Screenshots of four of the 42 webpages are shown in Fig
3.5: two with the highest scores (highest perceived visual aesthetics) and two with the
lowest scores.
Table 3.8 Descriptive statistics for questionnaire scores for the 42 webpages.
Questionnaire scale      Min    Max    Average   Standard deviation
Classical/expressive
  Classic                2.40   5.21   3.98      0.75
  Expressive             1.68   4.10   2.87      0.60
  Average                2.27   4.46   3.42      0.54
VisAWI
  Simplicity             2.35   5.10   3.96      0.68
  Diversity              2.00   4.60   3.51      0.61
  Colorfulness           2.98   5.33   4.32      0.61
  Craftsmanship          2.90   5.42   4.37      0.61
  Average                2.72   4.90   4.00      0.53
The procedure used to compute the values of the three elements (balance, unity, and
sequence) for the 42 webpages is the same as the one used to calculate their values for the
eight abstract screens. Visual information on each page was divided into hypothetical
visual objects. Layout data obtained from these objects (area, distance from central axis,
etc.) were input to the computational formulas for computing the three elements (see
Appendix A for the formulas and examples of calculations). Fig 3.6 shows an example of
how a webpage was divided into visual objects. Table 3.9 summarizes descriptive
statistics for the three elements, their average, and the values calculated by the interaction
(regression) model and the Ngo and Byrne model. Complete lists of all values calculated for
each page are given in Appendix B.
Figure 3.5 Screenshots of webpages with the highest and lowest average questionnaire scores. (a) and (b)
with the highest scores, (c) and (d) with the lowest scores.
(a) Webpage no. 4
Average scores: Classic/Expressive = 4.46
VisAWI = 4.90
(b) Webpage no. 11
Average scores: Classic/Expressive = 4.28
VisAWI = 4.86
(c) Webpage no. 2
Average scores: Classic/Expressive = 2.27
VisAWI = 2.96
(d) Webpage no. 17
Average scores: Classic/Expressive = 2.43
VisAWI = 2.72
Figure 3.6 An example of how a webpage is divided into visual objects (top image shows the original web
page, bottom image shows the page divided into visual objects).
Table 3.9 Descriptive statistics for the measures and the models for the 42 webpages.
Measure Min Max Average Standard deviation
Balance 0.516 0.950 0.792 0.105
Unity 0.163 0.684 0.417 0.145
Sequence 0.750 1.000 0.970 0.082
Average 0.528 0.835 0.726 0.072
Interaction Model 0.486 0.712 0.591 0.060
Ngo Model 0.191 0.291 0.252 0.024
Table 3.10 shows correlation coefficients between the measures and the models on one
side, and questionnaire scores for the 42 webpages on the other. From the table, one
can see that all significant correlations are with the questionnaire items related to screen
layout. The measure of unity and the models are significantly correlated with the classical
and the simplicity measures, both of which include items related to visual layout and clarity
of the design.
Table 3.10 Correlations between the measures and questionnaire scores.
                    Classical/expressive              VisAWI
Measure             Classic   Expressive   Average    Simplicity   Diversity   Colorfulness   Craftsmanship   Average
Balance             0.064     0.064        0.08       0.136        -0.001      0.1            -0.111          0.044
Unity               0.562*    0.133        0.466*     0.658*       0.140       0.255          0.463*          0.457*
Sequence            0.279*    0.062        0.229      0.313**      0.131       0.297          0.167           0.269
Average             0.511*    0.143        0.436*     0.623*       0.142       0.331*         0.318**         0.428*
Interaction model   0.600*    0.189        0.524*     0.712*       0.163       0.316**        0.434*          0.491*
Ngo model           0.539*    0.151        0.460*     0.657*       0.143       0.325**        0.347**         0.446*
* Significant at 0.01, ** significant at 0.05
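As a rough guide to reading the significance markers, the sketch below converts a correlation coefficient into a t statistic and compares it with critical values. It assumes a two-tailed test with n = 42 webpages (40 degrees of freedom) and uses critical t values taken from a standard t table; the original analysis may have used different settings, so the labels produced here need not match every marker in the table. All names are hypothetical.

```python
import math

# ASSUMPTION: two-tailed test, n = 42 pages, so 40 degrees of freedom.
# Critical t values for df = 40 from a standard t table.
T_CRIT_05 = 2.021
T_CRIT_01 = 2.704


def t_statistic(r, n):
    """t statistic for testing that a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| whose t statistic reaches the given critical t value."""
    return t_crit / math.sqrt(n - 2 + t_crit ** 2)


def significance(r, n=42):
    t = abs(t_statistic(r, n))
    if t >= T_CRIT_01:
        return "p < 0.01"
    if t >= T_CRIT_05:
        return "p < 0.05"
    return "n.s."


print(f"critical |r| at 0.05: {critical_r(T_CRIT_05, 42):.3f}")
print(f"critical |r| at 0.01: {critical_r(T_CRIT_01, 42):.3f}")
for label, r in [("Unity vs Simplicity", 0.658), ("Balance vs Classic", 0.064)]:
    print(f"{label}: r = {r:.3f}, {significance(r)}")
```

Under these assumptions, a correlation needs an absolute value of roughly 0.30 to reach the 0.05 level and roughly 0.39 to reach the 0.01 level.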
Of the three layout measures (balance, unity, and sequence), only unity has high
correlations with the questionnaire measures. No strong correlations were found
between balance or sequence and the questionnaire measures. This might be explained
by looking at the interaction plots in Fig 3.2 and the descriptive statistics in Table 3.9. High
values for both balance and sequence were calculated for the 42 webpages; values of
balance range from 0.516 to 0.950 with an average of 0.792, and values of sequence
are all at or above 0.75 with an average of 0.970. On the other hand, unity has lower values,
from 0.163 to 0.684 with an average of 0.417. Interpretation of the interaction plots (section 3.3.2)
suggests that the effect of one factor is larger at the high levels of the other factors. For the
42 webpages, both balance and sequence have higher values than unity. Hence, unity will
have a larger impact on perceived aesthetics. This was reflected in the high correlations unity
has with the related questionnaire measures. Nevertheless, the other case of lower values of
balance and sequence should also be investigated to confirm this explanation. Also, can
the high levels of balance witnessed here be considered a typical characteristic of all
website designs? Or is it just a coincidence with the 42 webpages used in the study?
3.4.4 Checking for correlations with simple count-based measures
In this section, selected simple count-based measures for the 42 webpages will be
compared with the questionnaire scores. This could give further explanation for some of
the above observations seen with the formularized measures and might lead to better
understanding of the relationships and interactions among the measures associated with the
models. It may also help in finding other simpler measures for visual aesthetics of website
interface design.
Five measures were selected, namely: number of visual objects on the screen, number of
different sizes of visual objects, number of images, number of different font types used in
the webpage, and JPEG file size of a screenshot of the webpage. Number of objects,
number of images, and JPEG file size have already been tested in previous studies (Bauerly
& Liu, 2006; Bi et al., 2011; Djamasbi et al., 2010; Schmidt et al., 2003; Tuch et al., 2010),
all with results indicating some sort of relationship between these measures and users’
perception of visual aesthetics. Number of different sizes of visual objects is one of the
input parameters in the Ngo et al. formula for unity. Number of different font types was
selected based on earlier observations.
The procedure is the same as the one used in validating the model: the selected
measures are calculated for the 42 webpages and compared to questionnaire scores
using correlation analysis.
Descriptive statistics for the calculated values for the five selected measures for the 42
web pages are given in Table 3.11. The complete list for all 42 webpages is given in
Appendix B.
Table 3.12 shows correlation coefficients between the selected measures and
questionnaire scores for the 42 webpages. Significant correlations were found between
number of objects and number of different sizes on the one hand and both the classical and
the simplicity measures on the other. This was not surprising, since these two features (number
of objects and number of different sizes) are the main input parameters in the unity formula.
These significant correlations point to clear negative effects of increasing the number of
objects and the number of different sizes on perceived visual aesthetics of websites.
Table 3.11 Descriptive statistics for the selected count-based measures for the 42 web pages.
Measure Min Max Average Standard deviation
No of objects 6 21 10.5 3.9
No of different sizes of objects 3 20 9.2 3.6
JPEG file size (Kbytes) 50 251 170.8 44.4
No of different font types 1 6 2.8 1.3
No of images 0 12 4.3 3.1
Significant correlations were also found between the classical aesthetics measure and both
JPEG file size and number of different font types. No strong correlations were found between
number of images and either the classical or the simplicity measure. However, an
interesting result is the noticeably high and significant correlations found between number
of different fonts and the expressive and the diversity measures.
Further investigation involving some of the measures tested in this section is carried
out in the next chapter.
Table 3.12 Correlations between objective simple count-based measures and subjective questionnaire-based measures.
                                   Classical/expressive              VisAWI
Measure                            Classic    Expressive   Average   Simplicity   Diversity   Colorfulness   Craftsmanship   Average
No of objects                      -0.355**   -0.086       -0.296    -0.397*      -0.087      -0.055         -0.203          -0.413*
No of different sizes of objects   -0.561*    -0.170       -0.487*   -0.602*      -0.248      -0.189         -0.371**        -0.542*
JPEG file size (Kbytes)            -0.338**   0.023        -0.223**  -0.333       -0.011      0.038          -0.123          -0.142
No of different font types         -0.333**   0.600*       0.103     -0.257       0.399*      -0.172         -0.047          -0.019
No of images                       -0.251     0.195        -0.066    -0.224       0.203       0.082          -0.016          0.002
* Significant at 0.01, ** significant at 0.05
CHAPTER 4
FURTHER TESTING OF VISUAL LAYOUT ELEMENTS
AND VALIDATING OF THE MODEL
4.1 Introduction
In this chapter, further experimental investigations of the effects of balance and unity of form
on perceived visual aesthetics are carried out using controlled experiments. The
motivation for these investigations is the questions raised by the results of the correlation
analyses presented in the previous chapter. Results of these analyses showed significant
correlations between unity, number of objects, and number of different sizes and the
subjective questionnaire-based measures. Interpretation of these results suggested that the
high correlations between these objective measures and perceived aesthetics only occur at
high levels of balance. The purpose of the experimental work presented in this chapter is
to confirm these findings. The main goal is to test the hypothesis that unity of form has
significant effects on perceived aesthetics of website design for designs with
high levels of vertical balance. Specifically, this part of the study aims to systematically
study the effects of number of objects and number of different sizes of objects on perceived
visual aesthetics of website design under the conditions of balanced and unbalanced
designs.
Number of objects and number of different sizes of objects are the two input parameters
used to calculate unity of form in the formula developed by Ngo et al. (2003). The measure of
unity consists of two sub-measures, unity of form and unity of layout; the value of unity
equals the average of the two sub-measures. Unity of form represents the extent to
which visual objects on the screen are related in size. High levels of unity of form can be
achieved by using objects with similar sizes on the screen and/or by reducing the number of
objects on the screen. The formula for unity of form with an example of its application is
given in Appendix A.
4.2 Method
4.2.1 Experimental Design
A three-factor mixed (within and between) participants design was utilized. The three
factors are vertical balance, number of objects, and number of different sizes of objects.
Each of the three factors was tested at two levels (high and low). This experimental
design, with three factors each at two levels, produces eight experimental conditions.
Eight different designs of a webpage were prepared to represent the eight experimental
conditions. All eight designs have identical styles (colors, fonts, etc.); only visual
elements related to the three factors were manipulated. Values of the levels of the balance
factor were determined using the Ngo et al. (2003) formula for vertical balance, with the
higher level equal to one (1.0) and the lower level taking values of 0.28 or less. Values
for the levels of the two other factors (number of objects and number of sizes) were
chosen based on observations from the experimental work presented in the previous chapter.
Table 4.1 shows the factor values and levels associated with the eight experimental
conditions. The first four designs (designs 1 to 4) represent the higher level of balance.
The last four designs (designs 5 to 8) represent the lower level.
Fig 4.1 shows abstract mock-up screens representing the eight experimental conditions.
These mock-up screens were used in the first experimental trial and were also used as
templates to prepare the real webpage designs. Fig 4.2 shows screenshots of the eight
designs of the webpage. The webpage represents a homepage of a hypothetical website
about the ancient history of a certain region of North Africa. It uses the local
language of that region (Arabic). However, the content of the website was irrelevant for
the purpose of the study and should not have any effect on participants’ responses.
Two methods were used to measure participants’ perception of visual aesthetics. The
first uses a single question to measure the overall aesthetics of the design (the same as in
the experiment in chapter 3). The second uses standard questionnaires. Both the
classical/expressive and VisAWI questionnaires were used.
Table 4.1 The eight designs and the associated factor levels and values.
Design (Condition)   Levels (balance, objects, sizes)   Vertical balance   No of objects   No of different sizes   Unity of form
1                    + + +                              1                  6               3                       0.67
2                    + + -                              1                  6               5                       0.33
3                    + - +                              1                  16              3                       0.88
4                    + - -                              1                  16              11                      0.38
5                    - + +                              0.25               6               3                       0.67
6                    - + -                              0.25               6               5                       0.33
7                    - - +                              0.27               16              3                       0.88
8                    - - -                              0.28               16              11                      0.38
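The unity of form values in Table 4.1 are consistent with a simple expression in the two inputs. The sketch below uses that expression purely as an assumption inferred from the tabulated values; the actual formula is the one given in Appendix A (Ngo et al., 2003) and may contain further terms that do not vary across these designs. The function name is hypothetical.

```python
def unity_of_form(n_objects, n_sizes):
    """ASSUMED expression, inferred from Table 4.1 rather than quoted from
    Appendix A: 1 - (n_sizes - 1) / n_objects."""
    return 1.0 - (n_sizes - 1) / n_objects


# (no of objects, no of different sizes) -> unity of form reported in Table 4.1
table_4_1 = {(6, 3): 0.67, (6, 5): 0.33, (16, 3): 0.88, (16, 11): 0.38}

for (n_objects, n_sizes), reported in table_4_1.items():
    computed = unity_of_form(n_objects, n_sizes)
    print(f"objects={n_objects:2d} sizes={n_sizes:2d} "
          f"computed={computed:.3f} reported={reported:.2f}")
```

Under this assumed expression, unity of form falls as the number of different sizes grows for a fixed number of objects, which is the direction of the manipulation between designs 1 and 2 and between designs 3 and 4.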
4.2.2 Procedure
Experimental trials were carried out online. An online survey design and distribution
service was used. Participants were recruited through this service. Email invitations were
sent randomly to potential participants, with the option of entering a lottery to win 100 US
dollars. All invitations were sent to potential participants in the United States.
Figure 4.1 The eight abstract mock-up screens (panels 1 to 8).
Figure 4.2 The eight webpage designs (panels 1 to 8).
The experimental work consisted of six experimental trials, each using one of the
response instruments. Each of the tests with the standard questionnaires was carried out
over two trials: one for the balanced condition and the other for the unbalanced condition.
This was dictated by how long each test trial should be and how that might affect
the response rate of the participants and their commitment to complete the test trial. The
goal was to allow participants to finish the test within an acceptable time limit. Table 4.2
summarizes the experimental trials and gives demographic information for participants in
each trial.
Table 4.2 Experimental trials and participants information.
Trial                              No of invitations   No of responses   No of valid responses   Age (years)
                                   emailed             delivered         (male, female)          Average   Standard deviation
One-question
  Mock-ups                         251                 31                29 (16, 13)             39.9      11.7
  Webpages                         201                 31                28 (11, 17)             38.7      10.2
Questionnaire
  Classic/Expressive, balanced     201                 24                17 (8, 9)               35.1      10.1
  Classic/Expressive, unbalanced   201                 26                21 (11, 10)             40.3      9.7
  VisAWI, balanced                 251                 40                25 (13, 12)             41.5      16.4
  VisAWI, unbalanced               201                 26                23 (8, 15)              42.6      9
In each trial, the images associated with the eight designs were presented in random order to
each participant, one at a time, with an on-screen size of 800 x 600 pixels. The question or
questionnaire was placed under each image. In the one-question trials, participants had to
rate each screen based on their personal preferences using a 10-point scale, with 10
representing "most beautiful" and 1 representing "least beautiful". In the questionnaire
trials, questionnaire items were presented in random order to each participant. A seven-point
Likert scale was used to collect responses to each item.
4.3 Results Analysis and Discussion
4.3.1 One-question with mock-up screens trial
Table 4.3 summarizes descriptive statistics for average scores (a complete list of all scores
is given in Appendix C). Averages are computed per design (1 to 8), per number of objects
(6 and 16), and per condition (balanced and unbalanced). The information in Table 4.3 is
depicted in Fig 4.3. It can be seen that designs with a smaller number of objects and a
smaller number of different sizes are given relatively higher average scores. This
pattern was clear in both conditions (balanced and unbalanced). In the balanced
condition, the highest average score (5.24) was recorded for the design with
the smallest number of objects and the smallest number of sizes (design 1). The lowest
average score (3.59) was recorded for the design with the largest number of
objects and the largest number of sizes (design 4). This pattern appears in the unbalanced
condition as well: design 5, with the smallest number of objects and sizes, was given the
highest average score (5.00), and design 8, with the largest number of objects and sizes, was
given the lowest average score (3.21).
The same holds for average scores per number of objects: higher average scores were recorded
for designs with the smaller number of objects in both conditions. For example, in the
balanced condition, the average score for designs 1 and 2 (4.88), associated with the smaller
number of objects (6), is higher than the average score for designs 3 and 4 (4.16),
associated with the larger number of objects (16).
In addition, designs in the balanced condition were given relatively higher average
scores than their counterpart designs in the unbalanced condition. This was reflected in
the higher average score (4.52) given to the balanced condition compared with the lower
average score (4.09) given to the unbalanced condition.
Table 4.3 Descriptive statistics for average scores for the one-question, mock-up screens trial.
Condition    Design no   No of objects   No of sizes   Average score   Average score        Average score     Standard deviation
                                                       per design      per no of objects    per condition     (per screen)
Balanced     1           6               3             5.24            4.88                 4.52              2.60
Balanced     2           6               5             4.52            4.88                 4.52              2.28
Balanced     3           16              3             4.72            4.16                 4.52              2.88
Balanced     4           16              11            3.59            4.16                 4.52              2.10
Unbalanced   5           6               3             5.00            4.45                 4.09              2.41
Unbalanced   6           6               5             3.90            4.45                 4.09              2.24
Unbalanced   7           16              3             4.24            3.73                 4.09              2.40
Unbalanced   8           16              11            3.21            3.73                 4.09              1.57
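The grouped averages in Table 4.3 follow from the per-design averages. The sketch below recomputes them, assuming every design was rated by the same number of participants so that a plain mean of the design averages is appropriate; the names are hypothetical.

```python
# Per-design average scores from Table 4.3, keyed by design number.
per_design = {1: 5.24, 2: 4.52, 3: 4.72, 4: 3.59,
              5: 5.00, 6: 3.90, 7: 4.24, 8: 3.21}

# Grouping of designs: condition -> number of objects -> design numbers.
groups = {
    "balanced": {6: (1, 2), 16: (3, 4)},
    "unbalanced": {6: (5, 6), 16: (7, 8)},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Average per (condition, number of objects).
per_objects = {
    (condition, n_objects): mean(per_design[d] for d in designs)
    for condition, by_objects in groups.items()
    for n_objects, designs in by_objects.items()
}
# Average per condition, over all four designs of that condition.
per_condition = {
    condition: mean(per_design[d]
                    for designs in by_objects.values() for d in designs)
    for condition, by_objects in groups.items()
}

for key, value in per_objects.items():
    print(key, f"{value:.3f}")
for key, value in per_condition.items():
    print(key, f"{value:.3f}")
```

The same grouping can be reused for other trials by swapping in their per-design averages.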
Figure 4.3 Average scores for the one-question, mock-up screens trial.
(a) Per design
(b) Per no of objects
(c) Per condition
Analysis of variance results are shown in Table 4.4. A nested factorial, repeated
measures analysis technique was used; number of different sizes
of objects was tested as a nested factor within the number of objects factor. Results show
that both factors have statistically significant effects on participants' scores in both the
balanced and the unbalanced conditions. Participants gave designs with a lower
number of objects and sizes significantly higher average scores, i.e. they perceived
designs with a lower number of objects and sizes as having a higher level of visual
aesthetics.
Table 4.4 ANOVA for average scores for the one-question, mock-up screens trial.
Case Element F P-value
Balanced
Objects 4.62 0.034
Sizes within objects 4.01 0.021
Participants 4.48 < 0.001
Unbalanced
Objects 5.78 0.018
Sizes within objects 6.30 0.003
Participants 4.23 < 0.001
Pair-wise comparisons between levels of each factor showed that differences were
significant for all pairs except the levels of number of different sizes within the higher
level of the number of objects factor (designs 1 and 2). Increasing the number of different sizes
from 3 in design 1 to 5 in design 2, with number of objects = 6, did not result in a
significant change in participants' average score.
The difference between average scores of the balanced and unbalanced conditions was
also significant (p-value = 0.013). Balanced designs were perceived as having a
significantly higher level of visual aesthetics than the unbalanced designs.
Differences among participants were also found significant in both conditions. This
justifies the use of the repeated measures approach in the analysis.
4.3.2 One-question with webpages trial
Average scores for this trial are given in Table 4.5 and Fig 4.4. From both it can be seen
that in many cases designs with a smaller number of objects and a smaller number of sizes
were given higher average scores. However, this pattern was not consistent over all
designs and factors. With number of objects, this pattern was consistent in both
conditions; higher average scores were recorded for designs associated with the smaller
number of objects. This was more evident in the balanced condition, with an average
score of 4.16 given to designs 1 and 2 and an average score of 3.70 given to designs 3
and 4. The difference was smaller in the unbalanced condition, with an average of 3.98 for
designs 5 and 6 and an average of 3.75 for designs 7 and 8.
With number of different sizes, no obvious pattern could be recognized. In the balanced
condition, within the smaller number of objects, the design with the larger number of
sizes (design 2) was given a higher average score than the design with the smaller
number of sizes (design 1). This was also the case in the unbalanced condition with
designs 7 and 8.
A slightly higher average score (3.93) was given to the balanced designs compared with
a lower average score (3.87) for the unbalanced designs.
Table 4.5 Descriptive statistics for average scores for the one-question with webpages trial.
Case        Screen  No of    No of  Average score                             Standard deviation
            no      objects  sizes  Per screen  Per no of objects  Per case   (per screen)
Balanced    1       6        3      4.07        4.16               3.93       1.74
            2       6        5      4.25                                      1.73
            3       16       3      3.89        3.70                          1.87
            4       16       11     3.50                                      1.97
Unbalanced  5       6        3      4.04        3.98               3.87       2.32
            6       6        5      3.93                                      2.23
            7       16       3      3.64        3.75                          2.08
            8       16       11     3.86                                      2.17
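The grouped averages in Table 4.5 follow directly from the per-screen averages. The short Python sketch below recomputes them as a check; the names `SCREENS` and `group_means` are illustrative assumptions rather than part of the original analysis, and because the inputs are the rounded per-screen values from the table, results can differ from the printed ones in the last digit.

```python
from statistics import mean

# Per-screen average scores copied from Table 4.5:
# (condition, screen no, no of objects, no of sizes, average score)
SCREENS = [
    ("Balanced", 1, 6, 3, 4.07),
    ("Balanced", 2, 6, 5, 4.25),
    ("Balanced", 3, 16, 3, 3.89),
    ("Balanced", 4, 16, 11, 3.50),
    ("Unbalanced", 5, 6, 3, 4.04),
    ("Unbalanced", 6, 6, 5, 3.93),
    ("Unbalanced", 7, 16, 3, 3.64),
    ("Unbalanced", 8, 16, 11, 3.86),
]


def group_means(rows, key):
    """Average the per-screen scores within each group defined by key(row)."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row[4])
    return {group: mean(values) for group, values in groups.items()}


# "Per no of objects" column: group by (condition, no of objects).
by_objects = group_means(SCREENS, key=lambda r: (r[0], r[2]))
# "Per case" column: group by condition only.
by_condition = group_means(SCREENS, key=lambda r: r[0])

for group, value in {**by_objects, **by_condition}.items():
    print(group, round(value, 2))
```

Averaging the per-screen means is equivalent to averaging the raw scores only because every screen was rated by the same number of participants in a given trial.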
Analysis of variance in Table 4.6 shows that only the effect of the number of objects
factor in the balanced condition was significant (p-value = 0.016). Effects of number of
different sizes in both balanced and unbalanced conditions were not significant.
Differences among participants were significant. The difference between average scores of
the balanced and unbalanced designs was not significant (p-value = 0.71). Participants in
this trial did not perceive balanced designs as having a higher level of visual aesthetics
than the unbalanced designs.
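The ANOVA tables in this section report F ratios for "Objects", "Sizes within objects", and "Participants". The sketch below shows one way such ratios can be obtained, under the assumption (not confirmed by the text) that participants act as blocks and that number of sizes is nested within number of objects. The function name `nested_block_anova` and the ratings are hypothetical; the data are synthetic, not the study data. P-values are omitted because they require the F distribution, which is not in the Python standard library.

```python
from statistics import mean


def nested_block_anova(scores, objects_of):
    """F ratios for a block design with sizes nested within objects.

    scores[p][d]  -- rating given by participant p to design d
    objects_of[d] -- number-of-objects level of design d
    """
    n_p, n_d = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    design_mean = [mean(row[d] for row in scores) for d in range(n_d)]
    part_mean = [mean(row) for row in scores]
    levels = sorted(set(objects_of))
    level_mean = {
        lv: mean(design_mean[d] for d in range(n_d) if objects_of[d] == lv)
        for lv in levels
    }

    # Sums of squares: total, participants (blocks), objects, sizes within objects.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_part = n_d * sum((m - grand) ** 2 for m in part_mean)
    ss_obj = n_p * sum((level_mean[objects_of[d]] - grand) ** 2 for d in range(n_d))
    ss_size = n_p * sum(
        (design_mean[d] - level_mean[objects_of[d]]) ** 2 for d in range(n_d)
    )
    ss_err = ss_total - ss_part - ss_obj - ss_size

    df = {
        "objects": len(levels) - 1,
        "sizes within objects": n_d - len(levels),
        "participants": n_p - 1,
    }
    df_err = (n_p - 1) * (n_d - 1)
    ms_err = ss_err / df_err
    ss = {"objects": ss_obj, "sizes within objects": ss_size, "participants": ss_part}
    f_ratios = {name: (ss[name] / df[name]) / ms_err for name in ss}
    return f_ratios, df, df_err


# Synthetic ratings (NOT the study data): 3 participants x 4 designs.
ratings = [[5, 4, 3, 2], [6, 5, 3, 3], [4, 4, 2, 1]]
f_ratios, df, df_err = nested_block_anova(ratings, objects_of=[6, 6, 16, 16])
for name, f in f_ratios.items():
    print(f"{name}: F({df[name]}, {df_err}) = {f:.2f}")
```

Treating participants as blocks removes between-participant variation from the error term, which is why significant participant differences support the repeated-measures approach.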
Figure 4.4 Average scores for the one-question, webpages trial: (a) per design, (b) per no of objects, (c) per condition.
Table 4.6 ANOVA for average scores for the one-question webpages trial.
Case        Element               F      P-value
Balanced    Objects               6.10   0.016
            Sizes within objects  1.31   0.27
            Participants          10.54  < 0.001
Unbalanced  Objects               1.56   0.21
            Sizes within objects  0.41   0.66
            Participants          16.99  < 0.001
4.3.3 The Classic/Expressive questionnaire trials
a. The balanced condition
Average scores for this trial are summarized in Table 4.7 and Fig 4.5. With the classical
aesthetics part, slightly higher average scores were recorded with designs associated with
the smaller number of objects and smaller number of sizes. Design 1, with the smallest
number of objects and sizes, was given the highest average score (4.47), and design 4, with
the largest number of objects and sizes, was given the lowest average score (3.87). The
largest difference was recorded between the two designs associated with the larger number
of objects (design 3 and design 4). Design 3, with the smaller number of sizes, was given an
average score of 4.38, relatively higher than the average score of 3.87 given to design 4,
which has a larger number of sizes. Designs 1 and 2, with the smaller number of objects,
were given a higher average score (4.43) than the average score (4.13) for designs 3 and 4,
associated with the larger number of objects.
With the expressive aesthetics part, no clear pattern could be distinguished. The same
average score (3.12) was given to both design 1, associated with the smallest numbers of
objects and sizes, and design 4, associated with the largest numbers of objects and sizes.
Almost the same average scores were given to both cases of number of objects (3.03 and
3.08).
With the total average scores, the same pattern as in the classical scale can be seen; higher
average scores were given to designs with a smaller number of objects and a smaller number
of sizes. The highest average score (3.79) was recorded with design 1 and the lowest
average score (3.49) was recorded with design 4.
A noticeably higher overall average score (4.28) was recorded with the classical scale
compared to a lower average score (3.06) for the expressive scale.
Cronbach’s α was used to measure reliability of the questionnaire. All calculated values
were within the range of 0.67-0.74 for the different scales of the questionnaire, indicating
an acceptable level of reliability.
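For reference, Cronbach’s α for a scale with k items is commonly computed as α = k/(k-1) × (1 - Σ item variances / variance of the summed score). The sketch below is a minimal Python illustration of that standard formula; the function name `cronbach_alpha` and the responses are assumptions made for illustration, and the data are synthetic, not the study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores[r][i] -- response of respondent r to questionnaire item i
    """
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Synthetic 7-point responses (NOT the study data): 5 respondents x 3 items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [6, 6, 7],
    [2, 3, 3],
    [5, 4, 5],
]
print(round(cronbach_alpha(responses), 3))
```

Values approach 1 when items vary together across respondents and fall toward 0 when items are unrelated.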
Table 4.7 Descriptive statistics for average scores for the Classical/Expressive balanced trial.
Scale       Design  No of    No of  Average score                             Standard deviation
            no      objects  sizes  Per design  Per no of objects  Per scale  (per screen)
Classical   1       6        3      4.47        4.43               4.28       1.56
            2       6        5      4.40                                      1.31
            3       16       3      4.38        4.13                          1.26
            4       16       11     3.87                                      1.34
Expressive  1       6        3      3.12        3.03               3.06       1.54
            2       6        5      2.94                                      1.36
            3       16       3      3.05        3.08                          1.41
            4       16       11     3.12                                      1.51
Total       1       6        3      3.79        3.73               3.67       1.08
            2       6        5      3.67                                      0.86
            3       16       3      3.71        3.60                          0.97
            4       16       11     3.49                                      0.95
Cronbach’s α (all scales): 0.67-0.74
Figure 4.5 Average scores for the Classical/Expressive questionnaire balanced trial: (a) per design, (b) per no of objects, (c) per scale.
Results of analysis of variance in Table 4.8 show that effects of number of objects and
number of sizes were significant only with the classical aesthetics scale in the balanced
condition. Pair-wise comparisons showed that the effect of number of sizes was significant
only at the higher level of number of objects (between designs 3 and 4). No
significant effects of the two factors were found with the expressive scale and total
average scores.
Differences between average scores of the classical and expressive scales were
significant, as analysis of variance in Table 4.11 (on page 76) shows. This difference was
significant in both the balanced and the unbalanced conditions. Participants perceived the
designs as having higher levels of classical aesthetics than of expressive aesthetics.
Table 4.8 ANOVA for average scores for the Classical/Expressive balanced trial.
Scale       Element               F      P-value
Classical   Objects               4.66   0.036
            Sizes within objects  3.30   0.045
            Participants          18.66  < 0.001
Expressive  Objects               0.29   0.59
            Sizes within objects  0.95   0.39
            Participants          38.37  < 0.001
Total       Objects               2.77   0.10
            Sizes within objects  2.77   0.07
            Participants          27.40  < 0.001
b. The unbalanced condition
Average scores for this trial are presented in Table 4.9 and Fig 4.6. No clear pattern like
that in the balanced condition could be recognized here. The only noticeable observation is
that the lowest average score was recorded with design 8 in both the classical scale and the
total; design 8 is the design associated with the largest numbers of objects and sizes.
However, design 5, associated with the smallest numbers of objects and sizes, was not given
the highest average score in any of the cases.
Also, with the classical scale, designs associated with the smaller number of objects
(designs 5 and 6) were given a slightly higher average score (3.62) than designs with the
larger number of objects (designs 7 and 8), which averaged 3.51. The case was reversed in
both the expressive scale and the total average scores.
As in the balanced condition, classical aesthetics was given a higher average score
(3.56) than the average score for expressive aesthetics (2.99).
The calculated Cronbach’s α values for all scales are between 0.64 and 0.75, all within
the acceptable reliability limits.
No significant effects were found in any of the scales, as results of analysis of variance
in Table 4.10 indicate. Only differences among participants were significant in all scales,
as was the case in the previous trials.
The balance factor has significant effects in both classical and total scores, as results of
analysis of variance in Table 4.11 (on page 76) indicate. The balanced designs were
perceived as having higher levels of classical aesthetics than the unbalanced designs.
Table 4.9 Descriptive statistics for average scores for the Classical/Expressive unbalanced trial.
Scale       Design  No of    No of  Average score                             Standard deviation
            no      objects  sizes  Per design  Per no of objects  Per scale  (per screen)
Classical   5       6        3      3.53        3.62               3.56       1.27
            6       6        5      3.71                                      1.37
            7       16       3      3.62        3.51                          1.31
            8       16       11     3.40                                      1.17
Expressive  5       6        3      2.93        2.91               2.99       1.43
            6       6        5      2.89                                      1.39
            7       16       3      3.11        3.07                          1.47
            8       16       11     3.02                                      1.37
Total       5       6        3      3.23        3.26               3.28       0.85
            6       6        5      3.30                                      0.84
            7       16       3      3.36        3.29                          0.75
            8       16       11     3.21                                      0.84
Cronbach’s α (all scales): 0.64-0.75
Figure 4.6 Average scores for the Classical/Expressive unbalanced trial: (a) per design, (b) per no of objects, (c) per scale.
Table 4.10 ANOVA for average scores for the Classical/Expressive unbalanced trial.
Scale       Element               F      P-value
Classical   Objects               0.54   0.47
            Sizes within objects  0.88   0.42
            Participants          14.04  < 0.001
Expressive  Objects               3.23   0.08
            Sizes within objects  0.28   0.76
            Participants          48.47  < 0.001
Total       Objects               0.12   0.73
            Sizes within objects  1.64   0.20
            Participants          27.10  < 0.001
Table 4.11 ANOVA for balance and scales for the Classical/Expressive trial.
Balance (per scale)
Scale       F     P-value
Classical   4.67  0.003
Expressive  1.07  0.33
Total       5.41  0.002
Scales (per condition)
Condition   F     P-value
Balanced    7.63  0.005
Unbalanced  6.22  0.008
76
4.3.4 The VisAWI questionnaire trials
a. The balanced condition
Table 4.12 and Fig 4.7 summarize average scores for the questionnaire scales of this trial.
The most obvious observation is that with all scales and the total, designs with the
smaller number of objects (designs 1 and 2) were given higher average scores compared
to designs with the larger number of objects (designs 3 and 4). The largest differences
between average scores for number of objects were recorded in the simplicity scale (4.34
vs. 4.01) and the total (3.81 vs. 3.65).
With number of different sizes, in all scales (except colorfulness) and the total, design 4,
associated with the largest number of objects and sizes, was given the lowest average
score. Design 1, associated with the smallest numbers of objects and sizes, was not
given the highest average score in all scales; in the simplicity scale, the highest score
was given to design 2, and in the craftsmanship scale, the highest score was given to
design 3.
When comparing average scores per scale, the highest average score (4.18) was
recorded with the simplicity scale, followed by colorfulness with an average score of
3.95, then craftsmanship and diversity with average scores of 3.54 and 3.26 respectively.
The calculated Cronbach’s α values for all scales are between 0.60 and 0.99, all within
the acceptable reliability limits.
Analysis of variance results for this trial are shown in Table 4.13. These results show
that number of objects was significant only in the case of the simplicity scale and the total
(p-values = 0.008 and 0.028 respectively). Number of sizes was significant only in the case
of the craftsmanship scale (p-value = 0.025). Pair-wise comparisons showed that this effect
is significant in the case of the larger number of objects (16), associated with designs 3
and 4.
Table 4.12 Descriptive statistics for average scores for the VisAWI balanced trial.
Scale          Design  No of    No of  Average score                             Standard deviation
               no      objects  sizes  Per design  Per no of objects  Per scale  (per screen)
Simplicity     1       6        3      4.31        4.34               4.18       1.08
               2       6        5      4.38                                      1.20
               3       16       3      4.08        4.01                          1.15
               4       16       11     3.94                                      1.17
Diversity      1       6        3      3.37        3.33               3.26       1.30
               2       6        5      3.30                                      1.33
               3       16       3      3.28        3.20                          1.34
               4       16       11     3.11                                      1.13
Colorfulness   1       6        3      4.03        3.99               3.95       1.47
               2       6        5      3.95                                      1.60
               3       16       3      3.88        3.91                          1.54
               4       16       11     3.94                                      1.45
Craftsmanship  1       6        3      3.70        3.59               3.54       1.19
               2       6        5      3.48                                      1.28
               3       16       3      3.71        3.50                          1.20
               4       16       11     3.28                                      1.07
Total          1       6        3      3.85        3.81               3.73       1.02
               2       6        5      3.78                                      1.13
               3       16       3      3.74        3.65                          1.16
               4       16       11     3.57                                      1.05
Cronbach’s α (all scales): 0.60-0.99
Figure 4.7 Average scores for the VisAWI balanced trial: (a) per design, (b) per no of objects, (c) per scale.
Differences among participants were also significant in all cases (p-values < 0.001). No
significant effects of either factor were found on diversity and colorfulness.
Highly significant differences were found among the different scales (p-values < 0.001),
as results of analysis of variance for scales in Table 4.16 (on page 85) show. All pair-wise
comparisons were significant, with simplicity given the highest average score (4.18),
followed by colorfulness with an average score of 3.95, then craftsmanship with an
average of 3.54, and last diversity with an average score of 3.26.
Table 4.13 ANOVA for average scores for the VisAWI balanced trial.
Case           Element               F      P-value
Simplicity     Objects               7.51   0.008
               Sizes within objects  0.38   0.682
               Participants          9.96   < 0.001
Diversity      Objects               1.38   0.24
               Sizes within objects  0.63   0.54
               Participants          15.65  < 0.001
Colorfulness   Objects               0.66   0.42
               Sizes within objects  0.26   0.77
               Participants          25.59  < 0.001
Craftsmanship  Objects               0.60   0.44
               Sizes within objects  3.87   0.025
               Participants          10.69  < 0.001
Total          Objects               5.01   0.028
               Sizes within objects  1.66   0.196
               Participants          27.35  < 0.001
b. The unbalanced condition
Table 4.14 and Fig 4.8 summarize average scores for this trial. Compared to the previous
trial, a completely reversed pattern was observed. Higher average scores were
given to designs with larger numbers of objects and sizes. This was very clear with
number of objects in all scales and the total; designs associated with the larger number of
objects were given higher average scores.
With number of different sizes no clear pattern could be distinguished; however, in
both the simplicity and the craftsmanship scales, design 8, associated with the largest
numbers of objects and sizes, was given the highest average score.
Differences among scales are in the same order as in the previous trial. Simplicity was
given the highest average score of 3.82, followed by colorfulness, craftsmanship, and
diversity with average scores of 2.95, 2.83, and 2.53 respectively.
The calculated Cronbach’s α values for all scales are between 0.46 and 0.66, indicating
that not all scales have reliabilities within the acceptable limits. Specifically, values for
the craftsmanship scale were all below 0.50. Thus, results associated with this scale should
be interpreted with extra caution.
Analysis of variance in Table 4.15 shows that effects of number of objects were
significant on average scores in both the craftsmanship scale (p-value = 0.026) and the
total (p-value = 0.043). However, the direction of this effect was reversed in this trial;
designs with the larger number of objects were perceived as having a better aesthetics level
than designs with the smaller number of objects. No significant effects of number of
objects were found with the other scales.
No significant effects of number of sizes were found in any scale or in the total.
Table 4.14 Descriptive statistics for average scores for the VisAWI unbalanced trial.
Scale          Design  No of    No of  Average score                             Standard deviation
               no      objects  sizes  Per design  Per no of objects  Per scale  (per screen)
Simplicity     5       6        3      3.68        3.72               3.82       1.79
               6       6        5      3.77                                      1.83
               7       16       3      3.90        3.91                          1.76
               8       16       11     3.93                                      1.81
Diversity      5       6        3      2.43        2.42               2.53       1.53
               6       6        5      2.41                                      1.35
               7       16       3      2.77        2.64                          1.55
               8       16       11     2.51                                      1.60
Colorfulness   5       6        3      3.02        2.88               2.95       1.83
               6       6        5      2.74                                      1.72
               7       16       3      3.15        3.02                          1.71
               8       16       11     2.89                                      1.74
Craftsmanship  5       6        3      2.67        2.70               2.83       1.71
               6       6        5      2.72                                      1.78
               7       16       3      2.89        2.97                          1.91
               8       16       11     3.04                                      1.85
Total          5       6        3      2.95        2.93               3.03       1.52
               6       6        5      2.91                                      1.38
               7       16       3      3.18        3.13                          1.51
               8       16       11     3.09                                      1.56
Cronbach’s α (all scales): 0.47-0.66
Figure 4.8 Average scores for the VisAWI unbalanced trial: (a) per design, (b) per no of objects, (c) per scale.
Table 4.15 ANOVA for average scores for the VisAWI unbalanced trial.
Case           Element               F      P-value
Simplicity     Objects               1.80   0.18
               Sizes within objects  0.11   0.90
               Participants          25.46  < 0.001
Diversity      Objects               3.36   0.071
               Sizes within objects  1.22   0.30
               Participants          25.10  < 0.001
Colorfulness   Objects               1.20   0.28
               Sizes within objects  2.23   0.12
               Participants          29.19  < 0.001
Craftsmanship  Objects               5.17   0.026
               Sizes within objects  0.44   0.65
               Participants          37.04  < 0.001
Total          Objects               4.28   0.043
               Sizes within objects  0.24   0.79
               Participants          36.57  < 0.001
Differences among the different scales were highly significant in this trial too (p-values
< 0.001). This is shown in the results of analysis of variance for scales in Table 4.16. All
pair-wise comparisons were significant, with simplicity given the highest average score
(3.82), followed by colorfulness with an average score of 2.95, then craftsmanship with an
average of 2.83, and last diversity with an average score of 2.53.
Differences between the balanced and unbalanced designs were found significant for all
scales (Table 4.16). With all facets of visual aesthetics measured by the VisAWI
questionnaire, the balanced designs were perceived as having significantly higher levels
of aesthetics than the unbalanced designs. This was the case even with the colorfulness
scale, despite the fact that all the designs in both conditions have identical colors. This
could be evidence of how dominant the effect of balance on participants' perception of
visual aesthetics is.
Table 4.16 ANOVA for balance and scales for the VisAWI trial.
Balance (per scale)
Scale          F      P-value
Simplicity     3.11   0.021
Diversity      7.45   < 0.001
Colorfulness   10.66  < 0.001
Craftsmanship  5.35   0.002
Total          8.11   < 0.001
Scales (per condition)
Condition      F      P-value
Balanced       49.40  < 0.001
Unbalanced     95.49  < 0.001
85
4.3.5 Overall discussion
The purpose of the experimental work presented in this chapter is to systematically study
effects of number of objects and number of different sizes of objects on perceived visual
aesthetics of website design under the conditions of balanced and unbalanced designs.
Several experimental trials were conducted with different groups of participants, two
presentation methods, and two methods to collect participants' responses.
The two presentation methods were the use of abstract mock-ups of layout designs of
webpages and the use of real webpage designs. The abstract mock-ups served as pilot
tests for the webpage designs. They were used to help discover any required
modifications to the experimental setup or the proposed designs before testing began with
the real webpage designs. They also helped in obtaining an overall measure of
participants' perception of visual aesthetics of the used designs in an abstract setting with
a minimum number of uncontrolled effects.
The two methods used to collect participants' perception of visual aesthetics were one
overall question and two standard questionnaires. The one-question method was used with
both the abstract mock-up designs and the real webpage designs. The standard
questionnaires were used only with the real webpage designs.
In this section a summary of results of all experimental trials is given with an overall
discussion of these results. Table 4.17 gives a summary of results of all trials for the two
factors (number of objects and number of sizes) for the balanced and unbalanced
conditions. The results of effects of the two factors can be summarized as follows:-
Table 4.17 Summary of results for all experimental trials.
Trial                              Condition   No of objects                                          No of sizes                                                          Balance
One-question, mock-ups             Balanced    significant (smaller is better)                        significant (no of objects 16, smaller is better)                    significant
                                   Unbalanced  significant (smaller is better)                        significant (both levels of no of objects, smaller is better)
One-question, webpages             Balanced    significant (smaller is better)                        not significant                                                      not significant
                                   Unbalanced  not significant                                        not significant
Questionnaire, Classic/Expressive  Balanced    significant (only with Classical, smaller is better)   significant (Classical, no of objects 16, smaller is better)         significant
                                   Unbalanced  not significant                                        not significant
Questionnaire, VisAWI              Balanced    significant (Simplicity and total, smaller is better)  significant (Craftsmanship, no of objects 16, smaller is better)     significant
                                   Unbalanced  significant (Craftsmanship and total, larger is better) not significant
Number of objects:-
- In the balanced condition: Significant effects of number of objects on visual
aesthetics were found in all trials: in both cases of the one-question and with
questionnaire scales related to layout elements (classical in the
classical/expressive questionnaire, and simplicity and total in the VisAWI
questionnaire). All these effects indicate that decreasing the number of objects will
increase perceived visual aesthetics.
- In the unbalanced condition: Significant effects were found in only two trials: the
mock-up designs trial and VisAWI with the craftsmanship scale and total.
Directions of effects observed in the VisAWI were opposite to the direction of effects
in all other trials; designs with the larger (not smaller) number of objects were
perceived as having better visual aesthetics.
Number of different sizes of objects:-
- In the balanced condition: Significant effects were found in all trials except the
one-question with webpages trial. With the two questionnaires, the significant
effects were found with scales related to layout elements (classical in classical/
expressive and craftsmanship in VisAWI). In these trials, the significant effects
were observed only in the case of designs with the larger number of objects.
- In the unbalanced condition: Significant effects were found only in the mock-up
designs trial.
These results confirm the earlier findings of the correlation analyses presented in chapter
3: at high levels of balance, unity of form, represented by the two parameters (number of
objects and number of different sizes), has significant effects on perceived visual
aesthetics of website interface design. Increasing unity of form levels (by decreasing
number of objects and number of sizes) in a website interface will increase levels of the
perceived visual aesthetics. As in the correlation analyses, these effects are more evident
on visual aesthetics dimensions related to interface layout design (classical, simplicity,
and craftsmanship).
Although some significant effects were found in the unbalanced condition too, they
were not consistent and could not be considered a general case. Also, with the
craftsmanship dimension (measured by the VisAWI questionnaire) the significant effect was
found to be opposite in direction to the significant effects found in the balanced
condition. It is not clear whether this “opposite” effect indicates a general phenomenon
or just a special case associated with the experimental setup and webpage designs of this
experiment. However, the lower reliability level of the questionnaire scale used to
measure visual aesthetics in this case casts further doubt on these results.
The highly significant differences between the balanced and unbalanced designs found in
almost all trials confirm findings of the experiment presented in chapter 3 and support
findings of previous studies; vertical balance has a positive effect on perceived visual
aesthetics of website interface design. These results also indicate that the manipulation of
the designs in this experiment was successful in creating the balanced and
unbalanced conditions.
For further insight, correlation coefficients were calculated between all average scores
for all the subjective measures used in all trials. Table 4.18 summarizes average scores
for all trials. Correlations between these scores are shown in Table 4.19 and Table 4.20.
In Table 4.19 correlations were calculated using average scores for both the balanced
and the unbalanced conditions (all eight designs). In Table 4.20 correlations were
calculated only for the balanced condition (first four designs).
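The correlation coefficients in Tables 4.19 and 4.20 can be approximated from the per-design averages in Table 4.18. The sketch below assumes they are Pearson coefficients (the text does not name the coefficient) and uses a subset of the columns; the names `SCORES`, `pearson_r`, and `correlation` are illustrative. Since the inputs are rounded to two decimals, the results agree with the printed tables only approximately.

```python
from math import sqrt

# Average scores per design (designs 1 to 8), copied from Table 4.18.
SCORES = {
    "Mock-ups": [5.24, 4.52, 4.72, 3.59, 5.00, 3.90, 4.24, 3.21],
    "Classic": [4.47, 4.40, 4.38, 3.87, 3.53, 3.71, 3.62, 3.40],
    "Expressive": [3.12, 2.94, 3.05, 3.12, 2.93, 2.89, 3.11, 3.02],
    "Diversity": [3.37, 3.30, 3.28, 3.11, 2.43, 2.41, 2.77, 2.51],
    "VisAWI total": [3.85, 3.78, 3.74, 3.57, 2.95, 2.91, 3.18, 3.09],
}

ALL_DESIGNS = slice(0, 8)  # balanced and unbalanced designs (as in Table 4.19)
BALANCED = slice(0, 4)     # designs 1 to 4 only (as in Table 4.20)


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation(a, b, designs=ALL_DESIGNS):
    """Correlation between two score columns over the selected designs."""
    return pearson_r(SCORES[a][designs], SCORES[b][designs])


print(round(correlation("Diversity", "VisAWI total"), 3))
print(round(correlation("Mock-ups", "Classic", BALANCED), 3))
```

Restricting the slice to the balanced designs leaves only four data points per coefficient, so individual values in Table 4.20 are sensitive to small changes in any one design's average.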
Table 4.19 shows highly significant correlations between all scales of the two
questionnaires (classic/expressive and VisAWI). The only exception is the expressive
scale, which did not correlate with any other scale. No significant correlations were found
between average scores of the one-question and the average scores of any of the scales of
the two questionnaires.
Correlation coefficients calculated for only the balanced designs in Table 4.20 show
very high and significant correlations between average scores of the one-question in the
mock-ups trial and total average scores of the two questionnaires. These significant
correlations did not show up when all designs were considered in Table 4.19. This might
be because results in the unbalanced condition were not consistent between the mock-ups
trial and the two questionnaire trials, whereas in the balanced condition results were more
consistent between trials.
Lower correlations, with fewer significant cases than in Table 4.19, were found for
the balanced designs in Table 4.20. This suggests that, with the experimental
setup used, the effect of vertical balance is more dominant than the effects of unity of
form. Thus, when calculating correlation coefficients for all the balanced and unbalanced
designs, this dominant effect of balance might have masked the effects of the other two
factors and prevented them from being clearly visible in the calculated correlation
coefficients.
The next step is to examine correlations between the objective layout-based measures
and the subjective questionnaire-based measures. This will be part of the content of the
next section.
Table 4.18 Summary of average scores for all experimental trials.
Design  Mock-ups  Classical/expressive          VisAWI
no                Classic  Expressive  Average  Simplicity  Diversity  Colorfulness  Craftsmanship  Average
1       5.24      4.47     3.12        3.79     4.31        3.37       4.03          3.70           3.85
2       4.52      4.40     2.94        3.67     4.38        3.30       3.95          3.48           3.78
3       4.72      4.38     3.05        3.71     4.08        3.28       3.88          3.71           3.74
4       3.59      3.87     3.12        3.49     3.94        3.11       3.94          3.28           3.57
5       5.00      3.53     2.93        3.23     3.68        2.43       3.02          2.67           2.95
6       3.90      3.71     2.89        3.30     3.77        2.41       2.74          2.72           2.91
7       4.24      3.62     3.11        3.36     3.90        2.77       3.15          2.89           3.18
8       3.21      3.40     3.02        3.21     3.93        2.51       2.89          3.04           3.09
Table 4.19 Correlation coefficients of average scores for all eight designs (balanced and unbalanced).
                                   One-question        Classic/Expressive           VisAWI
                                   Mock-ups  Webpages  Classic  Expressive  Total   Simplicity  Diversity  Colorfulness  Craftsmanship
One-question        Mock-ups       -
                    Webpages       0.565
Classic/Expressive  Classic        0.589     0.443
                    Expressive     -0.011    -0.583    0.212
                    Total          0.551     0.300     0.981*   0.398
VisAWI              Simplicity     0.335     0.455     0.855*   0.301       0.863*
                    Diversity      0.415     0.152     0.913*   0.523       0.961*  0.873*
                    Colorfulness   0.408     0.106     0.863*   0.511       0.912*  0.797**     0.975*
                    Craftsmanship  0.343     0.213     0.888*   0.490       0.931*  0.866*      0.943*     0.896*
                    Total          0.396     0.205     0.914*   0.495       0.957*  0.897*      0.993*     0.971*        0.965*
* Significant at 0.01, ** Significant at 0.05
Table 4.20 Correlation coefficients of average scores for the balanced condition.
                                   One-question        Classic/Expressive            VisAWI
                                   Mock-ups  Webpages  Classic  Expressive  Total    Simplicity  Diversity  Colorfulness  Craftsmanship
One-question        Mock-ups       -
                    Webpages       0.744
Classic/Expressive  Classic        0.945**   0.892
                    Expressive     -0.084    -0.675    -0.404
                    Total          0.999*    0.746     0.953**  -0.108
VisAWI              Simplicity     0.706     0.963**   0.808    -0.557      0.694
                    Diversity      0.980**   0.861     0.976**  -0.250      0.978**  0.830
                    Colorfulness   0.413     0.307     0.238    0.320       0.365    0.535       0.435
                    Craftsmanship  0.926     0.553     0.870    -0.016      0.940    0.435       0.856      0.092
                    Total          0.967**   0.875     0.963**  -0.256      0.961**  0.863       0.997*     0.492         0.816
* Significant at 0.01, ** Significant at 0.05
4.4 Comparing Objective Measures with Subjective Measures
In this section correlation analysis will be conducted between the objective screen layout-
based measures and the subjective questionnaire-based measures. The reason for this
analysis is to further validate the measures and models presented in chapter 3 and to further
investigate the relationship between the objective and subjective measures in general.
This further validation would help in obtaining a better understanding of the relationship
between objective and subjective measures, and would help improve the computational
formulas and models presented in this study.
4.4.1 Correlation analysis
The values of the objective measures for the eight designs are summarized in Table 4.21.
The objective measures included in the table are the same as the ones presented and
tested in chapter 3. The same formulas and procedures as in chapter 3 were used to
calculate values of these measures for the eight designs.
Correlation analysis is shown in Table 4.22. High and significant correlations were
found between the measures of balance, unity of layout, sequence, and Ngo model, and
the average scores of most of the subjective questionnaire measures. Surprisingly, no
high or significant correlations were found with number of objects, number of sizes, unity
of form, and the interaction model. This contradicts the significant effects unity of form
showed in results of experimental work in the previous sections.
The dominant effect of vertical balance is most likely the reason for this lack of
correlation. This is supported by the negative correlations observed with unity of layout.
Looking at values of unity of layout in Table 4.21, it can be seen that these values are
higher in the unbalanced condition (designs 5 to 8). These higher values should
have increased levels of visual aesthetics (measured by questionnaire scores), not
decreased them as the negative correlations indicate. This suggests that the low levels of
visual aesthetics recorded in the unbalanced condition are largely due to the dominant
effect of balance. This dominant effect of balance could have marginalized the effects of
unity.
Table 4.21 Values of measures and models for the eight designs.
Design  No of    No of            Balance  Unity                   Sequence  Interaction  Ngo
no      objects  different sizes           form   layout  average            model        model
1       6        3                1        0.67   0.17    0.42     1         0.63         0.26
2       6        5                0.98     0.33   0.16    0.25     1         0.54         0.25
3       16       3                1        0.88   0.16    0.52     1         0.68         0.26
4       16       11               0.99     0.38   0.16    0.27     1         0.55         0.24
5       6        3                0.62     0.67   0.29    0.48     0.75      0.56         0.21
6       6        5                0.61     0.33   0.27    0.30     0.75      0.51         0.20
7       16       3                0.63     0.88   0.27    0.57     0.75      0.59         0.21
8       16       11               0.62     0.38   0.27    0.32     0.75      0.52         0.19
To explore this further, correlations for only the balanced condition were calculated and
are presented in Table 4.23. With only the balanced designs considered (designs 1 to 4),
high and significant correlations were found with number of objects and number of sizes.
All of the significant correlations are with average scores of questionnaire scales related
to layout design. With unity of form, higher correlations were found but none was
statistically significant.
No noticeable changes in correlations with the interaction model were shown. However,
contrary to the case of all designs, in this case the Ngo model shows lower and
non-significant correlations.
4.4.2 Proposed modification to the unity of form formula
Since effects of number of objects and number of sizes were more evident in the balanced
case (indicated by the high correlations) and since low correlations were found with
unity of form and with the two computational models, it is possible that these effects are
not represented adequately by the unity of form formula and consequently by the two
computational models that incorporate this formula.
To investigate this possibility, first, differences between average total scores for both
questionnaires were calculated for each pair of counterpart balanced and unbalanced
designs. Calculations are shown in Table 4.24. Second, correlations of these differences
with associated values of unity of form and the two parameters were computed (shown in
Table 4.25).
Subtracting the average scores of the designs in the unbalanced condition from the average
scores of the designs in the balanced condition should remove the effect of balance and
produce values that represent effects of number of objects and sizes more clearly.
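The subtraction step can be sketched as follows, using the rounded total averages from Table 4.18 for each counterpart pair (1-5, 2-6, 3-7, 4-8). The names `pair_differences` and `pearson_r` are illustrative, and the use of the Pearson coefficient is an assumption, since the text does not name the coefficient.

```python
from math import sqrt

NO_OF_OBJECTS = [6, 6, 16, 16]  # per design pair 1-5, 2-6, 3-7, 4-8

# Total average scores copied from Table 4.18 (rounded to two decimals).
CE_TOTAL = {"balanced": [3.79, 3.67, 3.71, 3.49], "unbalanced": [3.23, 3.30, 3.36, 3.21]}
VISAWI_TOTAL = {"balanced": [3.85, 3.78, 3.74, 3.57], "unbalanced": [2.95, 2.91, 3.18, 3.09]}


def pair_differences(totals):
    """Balanced minus unbalanced total score for each counterpart pair."""
    return [round(b - u, 2) for b, u in zip(totals["balanced"], totals["unbalanced"])]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


for name, totals in (("Classical/Expressive", CE_TOTAL), ("VisAWI", VISAWI_TOTAL)):
    diffs = pair_differences(totals)
    r = pearson_r(NO_OF_OBJECTS, diffs)
    print(name, diffs, "r with no of objects:", round(r, 3))
```

With these rounded inputs, the VisAWI differences come out as 0.90, 0.87, 0.56, and 0.48, and the Classical/Expressive differences as 0.56, 0.37, 0.35, and 0.28; Table 4.24 lists the same two sets of values up to rounding.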
Table 4.22 Correlation coefficients of measures and models for all eight designs (balanced and unbalanced).
                             Classical/expressive            VisAWI
Measure            Mock-ups  Classic   Expressive  Average   Simplicity  Diversity  Colorfulness  Craftsmanship  Average
No of objects      -0.549    -0.259    0.608       -0.122    -0.157      0.053      0.030         0.113          0.030
No of sizes        -0.860*   -0.386    0.161       -0.330    -0.118      -0.155     -0.073        -0.074         -0.105
Balance            0.345     0.887*    0.427       0.918*    0.783*      0.958*     0.980*        0.916*         0.963*
Unity of form      0.571     0.159     0.426       0.234     -0.040      0.196      0.130         0.175          0.138
Unity of layout    -0.236    -0.870*   -0.411*     -0.898*   -0.792*     -0.946*    -0.958*       -0.912*        -0.952*
Unity average      0.488     -0.065    0.306       0.000     -0.237      -0.049     -0.115        -0.061         -0.107
Sequence           0.327     0.886*    0.399       0.911*    0.784*      0.950*     0.975*        0.909*         0.957*
Interaction model  0.637     0.619     0.510       0.682     0.359       0.626      0.569         0.663          0.598
Ngo model          0.570     0.848*    0.561       0.908*    0.663       0.923*     0.915*        0.877*         0.902*
* Significant at 0.01, ** Significant at 0.05
Table 4.23 Correlation coefficients of measures and models for the balanced condition.
                             Classical/expressive            VisAWI
Measure            Mock-ups  Classic   Expressive  Average   Simplicity  Diversity  Colorfulness  Craftsmanship  Average
No of objects      -0.605    -0.643    0.367       -0.578    -0.952**    -0.725     -0.749        -0.268         -0.775
No of sizes        -0.949**  -0.972**  0.279       -0.963**  -0.646      -0.936     -0.114        -0.957**       -0.907
Unity of form      0.593     0.481     0.287       0.618     -0.118      0.445      -0.247        0.843          0.380
Unity average      0.604     0.488     0.294       0.628     -0.107      0.456      -0.229        0.848          0.391
Interaction model  0.604     0.487     0.298       0.628     -0.109      0.455      -0.226        0.848          0.391
Ngo model          0.602     0.479     0.316       0.625     -0.115      0.451      -0.214        0.844          0.387
* Significant at 0.01, ** Significant at 0.05
Table 4.24 Differences between average total scores for each design pair.
Design  Difference between average total scores  Number      Number    Unity of
pair    Classical/Expressive  VisAWI             of objects  of sizes  form
1-5     0.90                  0.57               6           3         0.67
2-6     0.87                  0.37               6           5         0.33
3-7     0.56                  0.35               16          3         0.88
4-8     0.48                  0.28               16          11        0.38
Table 4.25 Correlation coefficients for values of unity of form computed by the original formula.
Measure            Scores
                   Classical/Expressive  VisAWI
Number of objects  -0.985                -0.713
Number of sizes    -0.601                -0.671
Unity of form      -0.119                0.338
Table 4.25 shows high correlations with number of objects and number of sizes
but not with unity of form. This further supports the possibility that the formula does not
adequately represent the combined effects of number of objects and number of sizes.
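The correlation step can be re-traced from the rounded values printed in Table 4.24. The following is a minimal Python sketch, assuming a plain Pearson correlation coefficient (the text does not state which coefficient was used); the function name `pearson` and all variable names are illustrative only. Because the inputs are rounded to two decimals, the results agree with Table 4.25 only approximately.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Values copied from Table 4.24 (design pairs 1-5, 2-6, 3-7, 4-8).
diff_ce = [0.90, 0.87, 0.56, 0.48]      # Classical/Expressive differences
diff_visawi = [0.57, 0.37, 0.35, 0.28]  # VisAWI differences
measures = {
    "Number of objects": [6, 6, 16, 16],
    "Number of sizes": [3, 5, 3, 11],
    "Unity of form": [0.67, 0.33, 0.88, 0.38],
}

for name, values in measures.items():
    r_ce = pearson(values, diff_ce)
    r_visawi = pearson(values, diff_visawi)
    print(f"{name:18s} {r_ce:7.3f} {r_visawi:7.3f}")
```

With these inputs, number of objects correlates strongly and negatively with the Classical/Expressive differences, while unity of form stays close to zero, which is the pattern the paragraph above describes.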
The formula of unity of form, with examples of calculation, is shown in Appendix A.
The formula is reprinted below (equation 4.1).
\[ UM_{form} = 1 - \frac{n_{sizes} - 1}{n} \qquad (4.1) \]
n_sizes stands for the number of sizes used, and n is the number of objects on the frame.
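As a worked check, the sketch below evaluates the original unity of form measure for the four design pairs. It assumes that equation 4.1 is the Ngo et al. (2003) form UM_form = 1 - (n_sizes - 1)/n, which reproduces the "Unity of form" column of Table 4.24 and the Appendix A example (n_sizes = 3, n = 4, UM_form = 0.5). The function name and the input validation are illustrative additions.

```python
def unity_of_form(n_sizes: int, n_objects: int) -> float:
    """Original unity of form: 1 - (n_sizes - 1) / n (assumed form of eq. 4.1)."""
    if n_objects < 1 or not 1 <= n_sizes <= n_objects:
        raise ValueError("expected 1 <= n_sizes <= n_objects")
    return 1 - (n_sizes - 1) / n_objects


# (number of objects, number of sizes) for each design pair in Table 4.24.
design_pairs = {"1-5": (6, 3), "2-6": (6, 5), "3-7": (16, 3), "4-8": (16, 11)}

for pair, (n_objects, n_sizes) in design_pairs.items():
    print(pair, f"{unity_of_form(n_sizes, n_objects):.3f}")
```

Pairs 1-5 and 3-7 have the same number of sizes but very different numbers of objects, and the measure is higher for the design with more objects. This illustrates why the formula may not track the observed differences well.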
After careful examination of the formula, and after comparing the values it produces
for the designs with the differences in Table 4.24, a modified formula was
proposed. The modified formula is presented below (equation 4.2).
(4.2)
This modified formula should better represent the combined effects of its two input
parameters: the number of objects and the number of different sizes.
To validate this formula, the unity of form values were recalculated for the designs using
the modified formula, and correlation analysis was conducted using these values. These
values are presented in Table 4.26 together with the original information from Table 4.24.
Correlations are shown in Table 4.27. It is clear that values computed by the modified
formula produced much higher correlations than values computed by the original
formula. Next, the modified formula is incorporated into the two computational
models and validated using subjective measures. This is the subject of the next section.
Table 4.26 Comparing differences with values of unity of form of both original and modified formula
Design  Difference between average total scores  Number      Number    Unity of form
pair    Classical/Expressive  VisAWI             of objects  of sizes  Original  Modified
1-5     0.90                  0.57               6           3         0.67      0.44
2-6     0.87                  0.37               6           5         0.33      0.33
3-7     0.56                  0.35               16          3         0.88      0.38
4-8     0.48                  0.28               16          11        0.38      0.15
Table 4.27 Correlation coefficients for values of unity of form computed by the modified formula.
Measure                   Scores
                          Classical/Expressive  VisAWI
Number of objects         -0.985                -0.713
Number of sizes           -0.601                -0.671
Unity of form (original)  -0.119                0.338
Unity of form (modified)  0.710                 0.822
4.4.3 Incorporating the modified unity of form formula into the computational model
The modified formula of unity of form was used to recalculate the values of unity for the
eight black and white screens of the experiment presented in chapter 3. The regression
(interaction) model was refit using these new values. The new model is given below
(Equation 4.3).
Aesthetic Value = 0.609 - 0.086 B - 0.645 U - 0.139 S + 0.743 B*U + 0.648 U*S (4.3)
Where:-
B : Balance
U : Unity
S : Sequence
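Equation 4.3 can be evaluated directly. The sketch below is an illustrative Python transcription of the refit interaction model; the function name `aesthetic_value` is not from the study. The sample inputs are the balance, modified unity, and sequence values of screens 1, 2, and 4 as listed in Table 4.28.

```python
def aesthetic_value(balance: float, unity: float, sequence: float) -> float:
    """Modified interaction model (Equation 4.3)."""
    b, u, s = balance, unity, sequence
    return (0.609 - 0.086 * b - 0.645 * u - 0.139 * s
            + 0.743 * b * u + 0.648 * u * s)


# (balance, modified unity, sequence) for screens 1, 2 and 4 of Table 4.28.
screens = {1: (1.00, 0.99, 1.00), 2: (0.10, 0.27, 0.00), 4: (0.91, 0.73, 0.25)}

for number, (b, u, s) in screens.items():
    print(number, f"{aesthetic_value(b, u, s):.3f}")
```

The printed values fall within a few thousandths of the "Predicted (modified)" column of Table 4.28; the small gaps are expected because the table lists the measures rounded to two decimals.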
Table 4.28 shows the values of the three measures (balance, unity, and sequence) for the
eight screens. Values of unity are shown for both the original and the modified formula.
Both actual and predicted aesthetic values for each of the eight screens are shown in the
table. Predicted values for both the original model (Equation 3.1) and the modified model
(Equation 4.3) are listed. A correlation coefficient of 0.97 (p-value < 0.001)
was found between the actual values and the values predicted by the modified model. This is
slightly lower than the value calculated for the original model (0.99); however, both are
statistically significant at the same level.
Table 4.28 Values of measures and actual and predicted aesthetic values of the eight screens for the
original and the modified model.
Screen  Balance  Unity               Sequence  Aesthetic Value
no               Original  Modified            Actual  Predicted (original)  Predicted (modified)
1       1.00     0.99      0.99      1.00      0.908   0.920                 1.120
2       0.10     0.18      0.27      0.00      0.438   0.453                 0.445
3       0.98     0.24      0.33      0.00      0.485   0.519                 0.552
4       0.91     0.80      0.73      0.25      0.654   0.621                 0.635
5       0.09     0.82      0.76      1.00      0.546   0.528                 0.516
6       0.04     0.15      0.24      1.00      0.415   0.441                 0.474
7       1.00     0.15      0.24      1.00      0.515   0.493                 0.565
8       0.25     0.78      0.72      0.25      0.354   0.409                 0.340
Now the correlation analysis in section 4.4.1 will be repeated for the modified formula
and model. Table 4.29 gives the values of the three measures and the associated values
calculated by the modified interaction model in equation (4.3) for the eight webpage
designs of this chapter.
Table 4.30 and Table 4.31 show correlation results with the questionnaire scores for all
designs and for the balanced designs, respectively. The two tables are reprints of Table 4.22 and
Table 4.23 with correlation values for the modified interaction model added to them.
Comparing correlations for the modified model with those for the original, it is clear that the
modified model produced higher and more significant correlations both for all designs
(Table 4.30) and for the balanced designs (Table 4.31).
Table 4.29 Values of the three measures and the model for the eight webpage designs.
Design  No of    No of            Balance  Unity                  Sequence  Interaction model
no      objects  different sizes           Form  Layout  Average            (modified)
1       6        3                1        0.44  0.17    0.31     1         0.74
2       6        5                0.98     0.33  0.16    0.25     1         0.70
3       16       3                1        0.38  0.16    0.27     1         0.72
4       16       11               0.99     0.15  0.16    0.15     1         0.63
5       6        3                0.62     0.44  0.29    0.37     0.75      0.62
6       6        5                0.61     0.33  0.27    0.30     0.75      0.60
7       16       3                0.63     0.38  0.27    0.32     0.75      0.61
8       16       11               0.62     0.15  0.27    0.21     0.75      0.58
Table 4.30 Correlation coefficients of the models for all eight designs (balanced and unbalanced).
                                        Classical/expressive            VisAWI
Measure                       Mock-ups  Classic   Expressive  Average   Simplicity  Diversity  Colorfulness  Craftsmanship  Average
Unity of form                 0.918*    0.367     -0.144      0.315     0.107       0.135      0.080         0.060          0.097
Unity Average                 0.703**   -0.052    -0.331      -0.114    -0.236      -0.295     -0.344        -0.337         -0.325
Interaction model (original)  0.637     0.619     0.510       0.682     0.359       0.626      0.569         0.663          0.598
Interaction model (modified)  0.721**   0.956*    0.278       0.953*    0.786**     0.870*     0.832*        0.855*         0.872*
Ngo model                     0.570     0.848*    0.561       0.908*    0.663       0.923*     0.915*        0.877*         0.902*
* Significant at 0.01, ** Significant at 0.05
Table 4.31 Correlation coefficients of the models for the balanced condition
                                        Classical/expressive            VisAWI
Measure                       Mock-ups  Classic   Expressive  Average   Simplicity  Diversity  Colorfulness  Craftsmanship  Average
Unity of form                 0.993*    0.963**   -0.149      0.998*    0.688       0.975**    0.307         0.950**        0.955*
Unity Average                 0.996*    0.965**   -0.153      1.00*     0.717       0.984**    0.350         0.934**        0.968**
Interaction model (original)  0.604     0.487     0.298       0.628     -0.109      0.455      -0.226        0.848          0.391
Interaction model (modified)  0.997*    0.950**   -0.101      0.999*    0.682       0.974**    0.350         0.947**        0.956**
Ngo model                     0.602     0.479     0.316       0.625     -0.115      0.451      -0.214        0.844          0.387
* Significant at 0.01, ** Significant at 0.05
The final step to complete the validation is to use the modified model to calculate
aesthetic values for the 42 webpages used to validate the original model in chapter 3.
Table 4.32 shows correlations calculated between values of the modified model and
average questionnaire scores for the webpages. Correlations of the original model and
the Ngo and Byrne model are reprinted here for comparison. Values in the table show that
the modified model gave lower correlations than the original model. Nevertheless, in
most cases, levels of statistical significance were close between the two models.
Table 4.32 Correlation coefficients of the models for all the 42 webpages of chapter 3.
                              Classical/expressive          VisAWI
Measure                       Classic  Expressive  Average  Simplicity  Diversity  Colorfulness  Craftsmanship  Average
Interaction model (original)  0.600*   0.189       0.524*   0.712*      0.163      0.316**       0.434*         0.491*
Interaction model (modified)  0.493*   0.172       0.440*   0.614*      0.080      0.228         0.333**        0.383**
Ngo model                     0.539*   0.151       0.460*   0.657*      0.143      0.325**       0.347**        0.446*
* Significant at 0.01, ** significant at 0.05
CHAPTER 5
CONCLUSIONS AND FUTURE WORK
5.1 Summary of Experimental Work and Results
The first two objectives of this study (stated in section 1.3.2) were to verify findings
of previous studies and to validate, and explore the possibility of improving, the
available measures and models of visual aesthetics of computer interfaces. To accomplish
these objectives, several experiments were designed and conducted using rigorous
statistical testing and design of experiments techniques. The first part of the experimental
work was presented in chapter 3. An experiment was designed and conducted to
investigate the effects of three elements of screen layout (balance, unity, and sequence) on
perceived interface aesthetics. Results showed that the three elements have significant
effects on perceived interface aesthetics. Significant effects of interactions among the
three elements were also found. These results confirmed findings of previous studies
(Ngo and Byrne, 2001; Ngo et al., 2003).
A regression model relating perceived visual aesthetics to the three elements was
constructed. The model represents a compact version of the original model developed by
Ngo and Byrne (2001). The model was validated using two methods: first, it was
cross-validated with the original model developed by Ngo and Byrne (2001); second, it was
validated using subjective standard questionnaire scores of real webpages.
The comparison with Ngo and Byrne's model indicates that, although the model has
fewer terms, it was still capable of producing aesthetic values within the same level
of statistical significance as the original model. This further confirms the findings of
Ngo and Byrne (2001) and Ngo et al. (2003).
When validating the model using standard questionnaire scores of real webpages, high
correlations were found between the values computed by the model and scores of
questionnaire items related to the visual layout of the webpages. This indicates that, although
the formulas used in this study were originally developed for data entry screens, they can
also be applied to websites. It also indicates that the layout-based measures tested in this
study can adequately predict aesthetic aspects related to the classical and simplicity
dimensions of website aesthetics.
The validation of the regression model using subjective questionnaire scores of real
webpages helped achieve the other two objectives of this study (section 1.3.2): objective 3,
to see whether the formulas and associated measures and models of
interface design would work with website interface design, and objective 4, to compare
objective layout-based measures of visual aesthetics with subjective questionnaire-based
measures.
To further confirm the findings of chapter 3 regarding the application of the objective measures
and models to website interface design, more experiments were designed and conducted.
These were covered in chapter 4. The purpose of the experimental work presented in that
chapter was to systematically study the effects of number of objects and number of different
sizes of objects (as parameters of unity of form) on perceived visual aesthetics of website
interface design under the conditions of balanced and unbalanced designs.
Several experimental trials were conducted with various settings and with different
groups of participants. The experimental settings included the use of abstract mock-ups of
layout designs of webpages and the use of real webpage designs. Two methods were used
to collect participants' perception of visual aesthetics: a single overall question and two
standard questionnaires. The standard questionnaires were used only with the real
webpage designs. The two questionnaires were the Classical/Expressive questionnaire
and the VisAWI questionnaire.
Results of these experiments confirmed the earlier findings in chapter 3: at high levels
of balance, unity of form, represented by its two parameters (number of objects and
number of different sizes), has significant effects on perceived visual aesthetics of website
interface design. These results also indicate that these effects are more evident on visual
aesthetics dimensions related to interface layout design. Furthermore, results
confirmed that vertical balance has a positive effect on perceived visual aesthetics of
website interface design.
Part of the experimental work in chapter 4 involved performing correlation analysis to
compare objective layout-based measures with subjective questionnaire-based measures.
As in chapter 3, results of the comparison showed high correlations between the measures
and the models, and the questionnaire scales related to screen layout. This further
confirms the findings of chapter 3 regarding this point.
Observations from this comparison were the basis of a proposed modification to the
unity of form formula and, consequently, to the regression model developed in chapter 3.
This modification was proposed in order to improve the unity of form formula so that it would
express the combined effects of number of objects and number of sizes more adequately.
Compared to the original model, the modified model incorporating this formula showed
better performance with the webpage designs used in chapter 4 and an acceptable
performance, with almost the same level of statistical significance as the original model, in
the case of the screens and webpages of chapter 3.
Preliminary results of this study have already been reported and published in several
articles (Altaboli and Lin, 2010, 2011a, 2011b, & 2012).
5.2 Conclusions and Contributions
Results of this study confirmed findings of previous studies regarding the possibility of
using visual features of the interface to predict perceived visual aesthetics, and further
support the concept of expressing such features using mathematical formulas and using them
in turn as a basis for developing computational models to predict visual aesthetics of interface
design. Results of the study also proved that screen layout-based measures can work
with website interfaces. Moreover, results of the study showed that objective screen
layout-based measures relate to subjective questionnaire-based measures. The
relationship is particularly strong with questionnaire elements related to screen layout
elements. This suggests that objective layout-based measures could be used to generally
assess the overall visual aesthetics of websites, and particularly aesthetic aspects related to
the classical and simplicity dimensions of website aesthetics.
The following points give more specific statements of the main conclusions drawn from
the results of this study and of its contributions to the knowledge base in the field:
• The three layout-based elements of balance, unity, and sequence have significant
effects on perceived visual aesthetics. These three elements were measured using the
mathematical formulas developed by Ngo et al. (2003). They were utilized to develop
a compact computational model to predict visual aesthetics. This model performed
within good levels of accuracy in all the validation procedures in this study.
• The three elements and the model also proved to work when their application was
extended to the case of website interface design.
• Balance and unity of form (represented by number of objects and number of different
sizes on the screen) have significant effects on perceived visual aesthetics of website
interface design. Higher levels of balance with fewer objects and sizes
significantly increase levels of perceived visual aesthetics.
• With website interface design, effects of balance are more dominant than effects of
the other tested elements. Effects of unity of form are more evident at high levels of
balance. At low levels of balance, effects of unity of form are not significant.
• The objective layout-based measures strongly correlate with the subjective
questionnaire-based measures related to screen layout, indicating that objective
layout-based measures could be used to measure perceived visual aesthetics
dimensions related to screen layout elements.
5.3 Recommendations for Future Work
• Website interface was the main type of interface tested in this study. Hence, further
testing with other types of interfaces is needed for the findings of this study to be
generalized to all types of graphical user interfaces. It would be particularly
interesting to see how these findings hold for today's widespread types
of interfaces and screens (e.g., smartphones).
• The procedure used to divide the webpages into visual objects was somewhat arbitrary,
based on a personal perception of the pages. Standard criteria and systematic methods
should be established to make it easy to apply the formulas to any webpage.
Establishing such standards and procedures would simplify automating the process
using computer software.
• The formulas used to calculate the three elements do not include the effect of color.
Ngo et al. (2003) suggested adding the effect of color as part of the balance
element, with darker colors given more weight; however, it was not clear how to
apply this in practice. The challenge is still open to develop practical methods to express
the effects of color on visual aesthetics using numerical values.
• As quantitative measures of visual aesthetics become more reliable, the next
step should be to study potential effects of visual aesthetics on performance. Possible
positive effects of visual aesthetics on performance have been reported in the
literature; however, some studies have also reported possible negative effects and argue that
context of use may play a role in this regard. The assumption that context of use
influences whether aesthetics has positive or negative effects on
performance should be further examined.
• One limitation that this study encountered was the practical difficulty of
manipulating the values of the tested measures to completely match the
levels specified by the experimental design; changing the position of one visual object on the
screen would change the values of more than one measure at the same time. This
forced the experiments to be designed and conducted with limited numbers of
measures and imposed many limitations on the associated number of factors and
levels.
REFERENCES
Altaboli, A. and Lin, Y. 2010, Experimental investigation of effects of balance, unity, and
sequence on interface and screen design aesthetics, in: Blashki, K. (Ed.), Proceedings of
The IADIS International Conference Interface and Human Computer Interaction 2010,
Freiburg, Germany, IADIS Press, pp. 243-250.
Altaboli, A., Lin, Y., Ali, M., Alterhony, Y., 2010, Using performance measures to assess the
effect of visual aesthetics on usability, in: Khalid, H., Hedge, A., Ahram, T. (Eds.),
Advances in Ergonomics Modeling and Usability Evaluation, CRC Press, Taylor & Francis
Group, pp. 107-116.
Altaboli, A., and Lin, Y., 2011a, Investigating effects of screen layout elements on interface and
screen design aesthetics. Advances in Human-Computer Interaction, vol. 2011, Article ID
659758, 10 pages, 2011. doi:10.1155/2011/659758.
Altaboli, A. and Lin, Y. 2011b, Objective and subjective measures of visual aesthetics of website
interface design: the two sides of the coin. In Proceedings of the 14th international
Conference on Human-Computer interaction: Design and Development Approaches -
Volume Part I (Orlando, FL, July 09 - 14, 2011). J. A. Jacko, Ed. Springer-Verlag, Berlin,
Heidelberg, 35-44.
Altaboli, A. and Lin, Y. 2012, Effects of unity of form on visual aesthetics of website design, to
be presented at the 4th international Conference on Applied Human Factors and Ergonomics
(AHFE 2012), July 21-25, 2012, San Francisco, CA U.S.A.
Bailey, R., 1982, Human Performance Engineering. First Edition, Prentice-Hall, Englewood
Cliffs, New Jersey.
Bauerly, M. and Liu, Y. 2006, Computational modeling and experimental investigation of effects
of compositional elements on interface and design aesthetics. Int. J. Human-Computer
Studies, 64, 670–682
Bauerly, M. and Liu, Y. 2008, Effects of symmetry and number of compositional elements on
interface and design aesthetics. International Journal of Human-Computer Interaction, 24: 3,
275 — 287
Ben-Bassat, T., Meyer, J., Tractinsky, N., 2006. Economic and subjective measures of the
perceived value of aesthetics and usability, ACM Transactions on Computer-Human
Interaction 13 (2), 210–234.
Bi, L., Fan, X., Liu, Y., 2011, Effects of symmetry and number of compositional elements on
chinese users' aesthetic ratings of interfaces: experimental and modeling investigations,
International Journal of Human-Computer Interaction, 27:3, 245-259
Birkhoff, G., 1933. Aesthetic Measure. Harvard University Press, Cambridge, MA.
Cawthon, N., and Vande Moere, A., 2007, The Effect of aesthetic on the usability of data
visualization, 11th International Conference Information Visualization (IV'07).
Chand, D., Dooley, L., and Tuovinen, E., 2002. Gestalt theory in visual screen design – a new
look at an old subject. Australian Computer Society, Inc. presented at the Seventh World
Conference on Computer in Education, Copenhagen, Denmark, 2001.
Comber, T and Maltby, JR. 1995. Evaluating usability of screen designs with layout complexity.
in H Hasan & C Nicastri (eds) , Proceedings of HCI, a light into the future : OZCHI '95 ,
CHISIG Australia, Downer, ACT.
De Angeli, A., Sutcliffe, A., Hartmann, J., 2006. Interaction, usability and aesthetics: what
influences users’ preferences?. In: Proceedings of the Sixth ACM Conference on Designing
Interactive Systems, PA, June 2006.
Djamasbi, S., Siegel, M., Tullis, T., 2010, Generation Y, web design, and eye tracking, Int. J.
Human-Computer Studies, 68, 307–323
Dix, A., Finlay, J., Abowd, G., and Beale, R., 2004. Human-Computer Interaction. Third edition,
Pearson Education Limited.
Galitz W. O., The Essential Guide to User Interface Design: An Introduction to GUI Design
Principles and Techniques, John Wiley & Sons, Inc., New York, 2007.
Hartmann, J., Sutcliffe, A., De Angeli, A., 2008. Towards a theory of user judgment of aesthetics
and user interface quality. ACM Transactions on Computer-Human Interaction 15(4), 1-30.
Hassenzahl, M., 2004, The interplay of beauty, goodness and usability in interactive products.
Human Computer Interaction, 19, 4, 319–349.
Hoffmann, R. and Krauss, K. 2004. A critical evaluation of literature on visual aesthetics for the
web. SAICSIT '04: Proceedings of the 2004 annual research conference of the South African
institute of computer scientists and information technologists on IT research in developing
countries, Western Cape, South Africa, 205-209.
Jordan, P.W., 1998. Human factors for pleasure in product use. Applied Ergonomics 29 (1), 25–33.
Kurosu, M. and Kashimura, K. 1995. Apparent usability vs. inherent usability: experimental
analysis on the determinants of the apparent usability. CHI '95: Conference companion on
Human factors in computing systems, Denver, Colorado, United States, 292-293.
Kurosu, M. and Kashimura, K., 1995. Determinants of the Apparent Usability. Proceedings of
IEEE SMC, pp. 1509-1513.
Lai, C., Chen, P., Shih, S., Liu, Y., Hong, J., 2010, Computational models and experimental
investigations of effects of balance and symmetry on the aesthetics of text-overlaid images,
Int. J. Human-Computer Studies 68, 41–56
Lavie, T., and Tractinsky, N., 2004, Assessing dimensions of perceived visual aesthetics of web
sites, Int. J. Human-Computer Studies 60, 269–298
Lindgaard, G., Fernandez, G., Dudek, C. and Brown, J., 2006, Attention web designers: You have
50 milliseconds to make a good impression. Behavior & Information Technology 25, 2
(2006), 115-126.
Liu, Y., 2003a. Engineering aesthetics and aesthetic ergonomics: theoretical foundations and a
dual-process research methodology. Ergonomics, 46, 1273–1292.
Liu, Y., 2003b. The aesthetic and the ethic dimensions of human factors and design. Ergonomics,
46, 1293–1305.
Merriam-Webster Online Dictionary 2012. Merriam-Webster,
http://www.merriam-webster.com/dictionary/aesthetics [Accessed March 5th, 2012].
Moshagen, M., Musch, J., Göritz, A.S., 2009. A blessing, not a curse: Experimental evidence for
beneficial effects of visual aesthetics on performance. Ergonomics 52, 1311-1320.
Mirdehghani, M. and Monadjemi, A. 2009. Web pages aesthetic evaluation using low-level visual
features. World Academy of Science, Engineering and Technology, 49, 2009
Miyoshi, T. and Murata, A. 2001. A Method to evaluate properness of GUI design based on
complexity indexes of size, local density, alignment, and grouping. IEEE International
Conference on Systems, Man and Cybernetics, Tucson, AZ.
Montgomery, D., 2001. Design and Analysis of Experiments, fifth edition, John Wiley & Sons,
Inc., New York, USA.
Nagamachi, M., 1995, Kansei Engineering: A new ergonomic consumer-oriented technology for
product development, International Journal of Industrial Ergonomics, 15, 3-11.
Nagamachi, M., 2002, Kansei engineering as a powerful consumer-oriented technology for product
development, Applied Ergonomics 33, 289–294.
Ngo, D. and Byrne, J., 2001. Application of an aesthetic evaluation model to data entry screens.
Computers in Human Behavior, 17 (2001) 149-185.
Ngo, D., Samsudin, A., and Abdullah, R., 2000. Aesthetic measures for assessing graphic screens.
Journal of Information Science and Engineering, 16, 97-116.
Ngo, D. C. L., Teo, L. S., & Byrne, J. G., 2002. Evaluating Interface Esthetics. Knowledge and
Information Systems (4), 46-79.
Ngo, D. C. L., Teo, L. S., & Byrne, J. G., 2003. Modelling interface aesthetics. Information
Sciences, 152(1), 25-46.
Nielsen. J., 1993, Usability Engineering. AP Professional.
Norman, D., 2004. Emotional design: Why we love (or hate) everyday things. Basic Books, New
York, NY, USA.
Oxford online dictionary 2012, Oxford University Press,
http://oxforddictionaries.com/definition/aesthetics?q=aesthetics [Accessed March 5th, 2012].
Phillips, C., and Chapparro, C., 2009, Visual appeal vs. usability: which one influences user
perceptions of a website more?, Usability News, Vol 11(2).
Reich, Y., 1993, A model of aesthetic judgment in design. Artificial Intelligence in Engineering,
Vol. 8, No. 2, pp. 141-153.
Schmidt, K.E., Bauerly, M., Liu, Y., And Sridharan, S., 2003, Web Page aesthetics and
performance: a survey and an experimental study. In Proceedings of the 8th Annual
International Conference on Industrial Engineering – Theory, Applications and Practice, Las
Vegas, Nevada, USA.
Sears, A., 1993. Layout appropriateness: a metric for evaluating user interface widget layout.
IEEE Transactions on Software Engineering, 19 (7), 707–719.
Shackel, B., 1991. Usability: context, framework, definition, design and evaluation. In Shackel,
B. and Richardson, S. (eds.) Human Factors for Informatics Usability. Cambridge University
Press.
Shneiderman, B, Plaisant, C., Cohen, M, Jacobs, S., 2010, Designing the User Interface:
Strategies for Effective Human-Computer Interaction, 5th edition, Addison Wesley.
Sonderegger, A., Sauer, J., 2010. The influence of design aesthetics in usability testing: Effects on
user performance and perceived usability. Applied Ergonomics 41, 403-410.
Streveler, D.J. and Wasserman, A.I. 1984. Quantitative measures of the spatial properties of
screen designs. In: INTERACT ’84 Conference Proceedings. North-Holland, Amsterdam.
Tractinsky, N. 1997. Aesthetics and apparent usability: empirically assessing cultural and
methodological issues. In S. Pemberton, Proceedings of the 1997 Conference on Human
Factors in Computing Systems (CHI '97). New York: ACM Press.
Tractinsky, N., Cokhavi, A., Kirschenbaum, M., Sharfi, T., 2006. Evaluating the consistency of
immediate aesthetic perceptions of web pages. International Journal of Human-Computer
Studies, 64 (11), 1071–1083.
Tractinsky, N., Shoval-Katz, A., Ikar, D., 2000. What is beautiful is usable. Interacting with
Computers, 13, 127–145.
Tuch, A.N., Bargas-Avila, J.A., Opwis, K., Wilhelm, F.H., 2009. Visual complexity of websites:
Effects on users’ experience, physiology, performance, and memory. International Journal of
Human-Computer Studies 67, 703-715.
Tullis, T.S., 1983. The formatting of alphanumeric displays: a review and analysis. Human
Factors. 25 (6), 657–682.
Tullis, T.S., 1988. Screen design. In: Helander, M. (Ed.), Handbook of Human-Computer
Interaction. Elsevier Science Publishers B.V., North-Holland, Amsterdam, pp. 377–411.
Van Schaik, P. and Ling, J., 2009, The role of context in perceptions of the aesthetics of web
pages over time, Int. J. Human-Computer Studies 67, 79–89.
Zain, J., Tey, M., and Goh, Y. 2008 Probing a self-developed aesthetics measurement application
(sda) in measuring aesthetics of mandarin learning web page interfaces, IJCSNS International
Journal of Computer Science and Network Security, Vol. 8 No. 1, January 2008.
APPENDIX A
THE USED FORMULAS WITH EXAMPLES OF
CALCULATIONS
This section lists the formulas developed by Ngo et al. (2003) to calculate screen balance,
unity, and sequence. A hypothetical abstract screen, similar to the screens used in the
study, is used to give examples of how the formulas were used to calculate values of each
of the three elements.
a. Balance
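The balance is computed as the difference between the total weighting of components on
each side of the horizontal and vertical axis and is given by
\[ BM = 1 - \frac{|BM_{vertical}| + |BM_{horizontal}|}{2} \qquad (A.1) \]
where BM stands for Balance Measure, and BM_vertical and BM_horizontal are the vertical and
horizontal balances with
\[ BM_{vertical} = \frac{W_L - W_R}{\max(|W_L|, |W_R|)} \qquad (A.2) \]
\[ BM_{horizontal} = \frac{W_T - W_B}{\max(|W_T|, |W_B|)} \qquad (A.3) \]
where
\[ W_j = \sum_{i=1}^{n_j} a_{ij} d_{ij}, \quad j = L, R, T, B \qquad (A.4) \]
L, R, T, and B stand for left, right, top, and bottom, respectively; a_ij is the area of object i
on side j; d_ij is the distance between the central lines of the object and the frame; and n_j is the
total number of objects on the side.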
Example
This example shows how balance of a hypothetical screen shown below is computed
using the above formulas
Figure A.1. The hypothetical example screen showing inputs required to compute the balance element.
WL = 9 * 2.5 + 4 * 4 = 22.5 + 16 = 38.5
WR = 4 * 3 + 12.25 * 2 = 12 + 24.5 = 36.5
WT = 9 * 2.5 + 4 *3.5 = 22.5 + 14 = 36.5
WB = 4 * 4 + 12.25 * 3 = 16 + 36.75 = 52.75
BMhorizontal = (36.5 – 52.75)/52.75 = -0.308
BMvertical = (38.5 – 36.5)/38.5 = 0.052
BM = 1 – ((|-0.308| + |0.052|)/2) = 0.82
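The balance calculation above can be sketched in Python as follows. This is a minimal illustration, not the software used in the study: it assumes the Ngo et al. (2003) form of the balance measure, in which each side weight is a sum of area times distance and the two normalised differences enter as absolute values. The function names, the (area, distance) input layout, and the handling of an empty screen are assumptions made for the example.

```python
def side_weight(objects):
    """Total weight of one side: sum of area * distance to the frame's central line."""
    return sum(area * distance for area, distance in objects)


def balance_component(weight_a, weight_b):
    """Difference between two opposite sides, normalised by the larger weight."""
    largest = max(abs(weight_a), abs(weight_b))
    # Assumption: an empty screen is treated as perfectly balanced.
    return 0.0 if largest == 0 else (weight_a - weight_b) / largest


def balance_measure(left, right, top, bottom):
    """Balance measure BM in [0, 1]; 1 means perfectly balanced."""
    bm_vertical = balance_component(side_weight(left), side_weight(right))
    bm_horizontal = balance_component(side_weight(top), side_weight(bottom))
    return 1 - (abs(bm_vertical) + abs(bm_horizontal)) / 2


# (area, distance) pairs read off the worked example above.
left = [(9, 2.5), (4, 4)]
right = [(4, 3), (12.25, 2)]
top = [(9, 2.5), (4, 3.5)]
bottom = [(4, 4), (12.25, 3)]

print(round(balance_measure(left, right, top, bottom), 2))  # prints 0.82
```

Passing per-side (area, distance) pairs keeps the sketch independent of how objects are extracted from a page, which the study performed manually.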
b. Unity
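The formula for unity is
\[ UM = \frac{|UM_{form}| + |UM_{space}|}{2} \qquad (A.7) \]
where UM stands for Unity Measure, and UM_form is the extent to which the objects are related
in size with
\[ UM_{form} = 1 - \frac{n_{sizes} - 1}{n} \qquad (A.8) \]
and UM_space is a relative measure of the space between groups and that of margins with
\[ UM_{space} = 1 - \frac{a_{layout} - \sum_{i=1}^{n} a_i}{a_{frame} - \sum_{i=1}^{n} a_i} \qquad (A.9) \]
where a_i, a_layout, and a_frame are the areas of object i, the layout, and the frame, respectively;
n_sizes stands for the number of sizes used; and n is the number of objects on the frame.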
Example
This example shows how the unity of the hypothetical screen is computed using the above
formulas.
Figure A.2. The hypothetical example screen showing inputs required to compute the unity element.
nsizes = 3
n = 4
UMform = 1 - (3 - 1)/4 = 0.5
Sum of areas = Σ ai = 9 + 4 + 4 + 12.25 = 29.25 cm²
alayout = 62.5 cm² (area outlined by the solid lines)
aframe = 144 cm² (total area of the screen = 12 cm × 12 cm)
UMspace = 1 - (62.5 - 29.25)/(144 - 29.25) = 0.71
UM = (0.5 + 0.71)/2 = 0.605
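The unity calculation can be checked in the same way. The sketch below is an illustration only and assumes the Ngo et al. (2003) definitions: the form term is one minus (number of sizes minus one) divided by the number of objects, and the space term is one minus the ratio of unused layout area to unused frame area. The function name `unity_measure` is hypothetical; the inputs are those of the worked example.

```python
def unity_measure(areas, n_sizes, a_layout, a_frame):
    """Return (UM_form, UM_space, UM) for a list of object areas."""
    n = len(areas)
    total_area = sum(areas)
    # Form: fewer distinct sizes relative to the object count gives higher unity.
    um_form = 1 - (n_sizes - 1) / n
    # Space: compares free space inside the layout with free space in the frame.
    um_space = 1 - (a_layout - total_area) / (a_frame - total_area)
    return um_form, um_space, (abs(um_form) + abs(um_space)) / 2


um_form, um_space, um = unity_measure(
    [9, 4, 4, 12.25], n_sizes=3, a_layout=62.5, a_frame=144
)
print(round(um_form, 2), round(um_space, 2), round(um, 3))  # 0.5 0.71 0.605
```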
c. Sequence
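The formula for calculating sequence is
$$SQM = 1 - \frac{\sum_{j = UL, UR, LL, LR} |q_j - v_j|}{8} \tag{A.10}$$
with
$$\{q_{UL}, q_{UR}, q_{LL}, q_{LR}\} = \{4, 3, 2, 1\} \tag{A.11}$$
$$v_j = \begin{cases} 4 & \text{if } w_j \text{ is the biggest in } w \\ 3 & \text{if } w_j \text{ is the 2nd biggest in } w \\ 2 & \text{if } w_j \text{ is the 3rd biggest in } w \\ 1 & \text{if } w_j \text{ is the smallest in } w \end{cases} \qquad j = UL, UR, LL, LR \tag{A.12}$$
with
$$w_j = q_j \sum_{i=1}^{n_j} a_{ij}, \quad j = UL, UR, LL, LR \tag{A.13}$$
$$w = \{w_{UL}, w_{UR}, w_{LL}, w_{LR}\} \tag{A.14}$$
where UL, UR, LL, and LR stand for upper-left, upper-right, lower-left, and lower-right,
respectively, and aij is the area of object i on quadrant j. Each quadrant is given a
weighting in q.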
Example
This example shows how the sequence of the hypothetical screen is computed using the
above formulas.
Figure A.3. The hypothetical example screen showing inputs required to compute the sequence element.
wUL = qUL × aUL = 4 × 9 = 36; vUL = 4; |qUL - vUL| = |4 - 4| = 0
wUR = qUR × aUR = 3 × 4 = 12; vUR = 2; |qUR - vUR| = |3 - 2| = 1
wLL = qLL × aLL = 2 × 4 = 8; vLL = 1; |qLL - vLL| = |2 - 1| = 1
wLR = qLR × aLR = 1 × 12.25 = 12.25; vLR = 3; |qLR - vLR| = |1 - 3| = 2
Σ |qj - vj| = 0 + 1 + 1 + 2 = 4
SQM = 1 - (4/8) = 0.5
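The ranking step of the sequence measure is easy to get wrong by hand, so a small script may help. The following Python sketch is illustrative only and assumes the Ngo et al. (2003) definitions: each quadrant weight is its fixed weighting q (4, 3, 2, 1 for UL, UR, LL, LR) times the total object area in that quadrant, and v is the rank of that weight (4 for the largest, 1 for the smallest). The function name, the tie-breaking rule (ties keep the UL, UR, LL, LR order), and the quadrant areas in the usage lines are assumptions made for illustration, not data from the study.

```python
# Fixed quadrant weightings q: upper-left, upper-right, lower-left, lower-right.
Q = {"UL": 4, "UR": 3, "LL": 2, "LR": 1}


def sequence_measure(quadrant_areas):
    """Sequence measure from a dict mapping quadrant name to a list of areas."""
    # Quadrant weight: weighting q times total object area in the quadrant.
    w = {j: Q[j] * sum(quadrant_areas.get(j, [])) for j in Q}
    # Rank quadrants by weight; the stable sort keeps UL, UR, LL, LR order on ties.
    ordered = sorted(Q, key=lambda j: w[j], reverse=True)
    v = {j: 4 - rank for rank, j in enumerate(ordered)}
    return 1 - sum(abs(Q[j] - v[j]) for j in Q) / 8


# Hypothetical layouts: one following the reading order, one reversing it.
print(sequence_measure({"UL": [9], "UR": [4], "LL": [4], "LR": [2]}))   # 1.0
print(sequence_measure({"UL": [1], "UR": [4], "LL": [4], "LR": [20]}))  # 0.25
```

In the first layout the quadrant weights already decrease from upper-left to lower-right, so every rank matches its weighting; in the second, a very large lower-right object takes the top rank and the measure drops.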
APPENDIX B
QUESTIONNAIRE SCORES AND MEASURES AND
MODELS VALUES FOR THE 42 WEBPAGES
Table B.1. Questionnaire scores for the 42 webpages (obtained from Moshagen & Thielsch, 2010).
Columns 2 to 4: Classical/expressive questionnaire. Columns 5 to 9: VisAWI questionnaire.
Webpage no. | Classic | Expressive | Average | Simplicity | Diversity | Colorfulness | Craftsmanship | Average
1 5.21 3.19 4.20 5.06 3.68 4.89 4.68 4.55
2 2.53 2.01 2.27 2.65 2.59 3.63 3.15 2.96
3 4.45 1.80 3.13 4.65 2.43 3.61 3.61 3.57
4 5.18 3.73 4.46 5.09 4.60 4.86 5.08 4.90
5 2.90 2.15 2.53 3.15 2.67 4.00 3.48 3.28
6 4.17 2.88 3.53 3.89 3.49 4.27 4.42 3.98
7 3.54 4.01 3.78 3.86 4.36 4.39 4.46 4.25
8 4.84 3.02 3.93 4.96 4.13 5.09 5.09 4.78
9 3.98 1.86 2.92 3.68 2.58 5.05 3.78 3.70
10 4.85 2.51 3.68 4.51 3.22 4.23 4.52 4.09
11 5.20 3.36 4.28 4.89 4.23 5.16 5.30 4.86
12 4.42 3.55 3.99 4.33 3.99 3.91 4.81 4.25
13 4.48 2.57 3.53 4.42 3.02 4.96 4.75 4.22
14 3.70 2.60 3.15 4.20 3.13 3.59 4.31 3.79
15 4.28 3.28 3.78 4.14 3.66 4.13 5.00 4.19
16 2.91 2.44 2.67 3.18 3.24 3.57 3.80 3.42
17 2.47 2.38 2.43 2.35 2.57 3.21 2.90 2.72
18 2.40 2.48 2.44 2.68 2.86 3.25 3.54 3.05
19 5.08 2.18 3.63 4.83 3.15 5.06 5.25 4.51
20 5.02 3.06 4.04 4.52 3.96 4.38 4.83 4.40
21 3.20 2.28 2.74 5.06 3.68 4.89 4.68 4.55
22 3.30 3.68 3.49 3.08 3.22 3.83 3.77 3.44
23 4.98 2.48 3.73 3.85 4.00 3.85 4.79 4.10
24 4.88 3.38 4.13 4.90 3.82 5.00 5.43 4.74
25 4.49 2.65 3.57 4.42 4.30 5.10 4.83 4.63
26 4.30 3.06 3.68 4.41 3.49 5.33 4.98 4.49
27 4.24 3.04 3.64 4.44 4.10 4.68 4.43 4.39
28 4.22 3.16 3.69 4.36 3.96 4.67 4.42 4.33
29 4.06 4.10 4.08 4.08 3.76 4.33 4.53 4.14
30 3.30 3.21 3.26 4.12 4.56 4.60 4.75 4.49
31 3.45 1.68 2.57 3.46 3.44 4.36 4.38 3.86
32 3.51 2.64 3.07 3.63 2.00 2.98 2.90 2.87
33 3.60 3.80 3.70 3.52 3.76 4.45 4.30 3.96
34 3.96 3.57 3.76 3.36 4.16 4.30 4.23 3.98
35 3.93 2.73 3.33 3.79 4.21 4.09 4.30 4.09
36 3.84 3.10 3.47 3.56 3.66 5.09 4.78 4.20
37 3.83 3.07 3.45 4.10 3.54 4.65 4.40 4.13
38 3.82 2.76 3.29 3.72 3.55 4.92 4.37 4.08
39 3.78 2.48 3.13 3.29 3.49 4.50 4.50 3.88
40 3.74 3.56 3.65 4.08 3.34 3.73 4.33 3.85
41 3.69 2.77 3.23 4.17 3.32 3.70 4.03 3.79
42 3.62 2.32 2.97 3.54 3.08 4.13 4.25 3.70
Table B.2. Measures and model values for the 42 webpages.
Webpage no. | Balance | Unity | Sequence | Average | Interaction model | Ngo model
1 0.833 0.626 1.00 0.82 0.693 0.286
2 0.875 0.306 1.00 0.73 0.558 0.250
3 0.909 0.585 1.00 0.83 0.692 0.289
4 0.898 0.519 1.00 0.81 0.659 0.279
5 0.755 0.234 1.00 0.66 0.514 0.228
6 0.643 0.532 1.00 0.72 0.610 0.253
7 0.812 0.352 1.00 0.72 0.570 0.249
8 0.838 0.668 1.00 0.84 0.712 0.291
9 0.915 0.338 1.00 0.75 0.578 0.258
10 0.823 0.547 1.00 0.79 0.656 0.275
11 0.731 0.607 1.00 0.78 0.659 0.272
12 0.748 0.310 1.00 0.69 0.544 0.236
13 0.744 0.496 1.00 0.75 0.618 0.259
14 0.938 0.398 1.00 0.78 0.610 0.268
15 0.796 0.599 1.00 0.80 0.672 0.278
16 0.857 0.228 1.00 0.69 0.520 0.238
17 0.804 0.208 1.00 0.67 0.508 0.230
18 0.516 0.348 0.75 0.54 0.513 0.196
19 0.726 0.455 1.00 0.73 0.598 0.252
20 0.752 0.366 1.00 0.71 0.567 0.244
21 0.726 0.203 1.00 0.64 0.499 0.221
22 0.807 0.460 1.00 0.76 0.615 0.262
23 0.597 0.571 1.00 0.72 0.614 0.253
24 0.847 0.270 1.00 0.71 0.539 0.242
25 0.848 0.451 1.00 0.77 0.619 0.265
26 0.856 0.311 1.00 0.72 0.558 0.248
27 0.864 0.452 1.00 0.77 0.622 0.267
28 0.641 0.679 1.00 0.77 0.662 0.271
29 0.837 0.335 1.00 0.72 0.566 0.249
30 0.898 0.386 1.00 0.76 0.597 0.263
31 0.900 0.321 1.00 0.74 0.568 0.254
32 0.764 0.163 1.00 0.64 0.486 0.220
33 0.634 0.200 0.75 0.53 0.490 0.191
34 0.950 0.381 1.00 0.78 0.603 0.267
35 0.846 0.352 1.00 0.73 0.575 0.253
36 0.894 0.408 1.00 0.77 0.607 0.265
37 0.796 0.521 1.00 0.77 0.639 0.268
38 0.852 0.286 1.00 0.71 0.546 0.245
39 0.574 0.684 0.75 0.67 0.600 0.245
40 0.860 0.617 1.00 0.83 0.695 0.287
41 0.757 0.393 0.75 0.63 0.560 0.229
42 0.598 0.351 0.75 0.57 0.525 0.206
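The Average column of Table B.2 appears to be the arithmetic mean of the balance, unity, and sequence values. This reading is inferred from the tabulated numbers rather than stated in the table, so the short Python sketch below checks it for a few rows copied from the table. The function name is hypothetical, and a tolerance of 0.005 allows for the two-decimal rounding of the reported averages.

```python
# (balance, unity, sequence, reported average) for selected webpages in Table B.2.
rows = {
    1: (0.833, 0.626, 1.00, 0.82),
    18: (0.516, 0.348, 0.75, 0.54),
    33: (0.634, 0.200, 0.75, 0.53),
    42: (0.598, 0.351, 0.75, 0.57),
}


def mean_of_measures(balance, unity, sequence):
    """Arithmetic mean of the three layout measures."""
    return (balance + unity + sequence) / 3


for page, (balance, unity, sequence, reported) in rows.items():
    computed = mean_of_measures(balance, unity, sequence)
    # Reported averages are rounded to two decimals, hence the tolerance.
    matches = abs(computed - reported) <= 0.005
    print(page, round(computed, 3), reported, matches)
```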
Table B.3. Values of the simple measures for the 42 webpages.
Webpage no. | No. of objects | No. of sizes | JPEG file size (KB) | No. of font types | No. of images
1 6 6 52 2 0
2 18 12 209 4 12
3 7 5 69 1 0
4 18 5 161 3 12
5 11 11 192 2 3
6 6 6 149 2 4
7 8 7 194 6 5
8 8 6 162 1 2
9 11 11 214 1 6
10 8 8 50 2 0
11 4 3 179 2 4
12 5 5 108 2 2
13 7 6 174 2 4
14 6 5 160 3 2
15 11 8 113 4 3
16 10 10 140 4 1
17 10 10 169 2 3
18 20 20 170 2 9
19 11 8 175 1 2
20 7 6 191 3 2
21 13 13 174 3 5
22 10 10 10 10 10
23 12 11 12 11 12
24 12 10 12 10 12
25 11 10 11 10 11
26 8 7 8 7 8
27 9 8 9 8 9
28 14 10 14 10 14
29 8 8 8 8 8
30 13 13 13 13 13
31 11 11 11 11 11
32 9 9 9 9 9
33 14 14 14 14 14
34 14 13 14 13 14
35 21 17 21 17 21
36 10 9 10 9 10
37 13 13 13 13 13
38 7 4 7 4 7
39 11 8 11 8 11
40 9 8 9 8 9
41 15 15 15 15 15
42 7 7 7 7 7
APPENDIX C
QUESTIONNAIRE SCORES FOR EXPERIMENTAL
TRIALS OF CHAPTER 4
Table C.1. Scores for the one-question mock-up trial.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 1 2 1 5 5 8 6 2
2 5 6 2 3 4 1 4 3
3 3 2 1 1 3 4 2 3
4 10 5 10 5 10 4 10 5
5 6 6 7 7 5 3 5 4
6 7 7 6 6 9 8 9 5
7 6 3 1 1 3 4 1 3
8 4 3 6 1 7 7 8 1
9 2 3 4 2 3 3 4 3
10 7 1 2 1 8 8 2 2
11 6 8 8 5 4 5 3 5
12 8 8 8 2 8 5 5 2
13 10 9 10 2 2 1 1 1
14 4 5 5 6 6 5 5 6
15 8 6 8 6 6 5 7 6
16 7 6 2 8 7 1 4 5
17 6 1 8 4 3 2 7 6
18 6 6 3 4 7 6 6 3
19 5 4 5 6 3 3 3 3
20 5 3 3 2 4 2 2 2
21 2 3 3 2 5 3 3 2
22 4 5 4 2 6 5 4 4
23 6 7 4 4 8 7 4 2
24 2 3 8 6 1 1 1 3
25 6 5 5 5 6 4 5 4
26 2 2 2 2 2 2 2 2
27 10 7 8 3 6 2 5 1
28 3 4 2 2 3 3 4 4
29 1 1 1 1 1 1 1 1
Table C.2. Scores for the one-question trial with webpages.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 7 7 8 8 7 7 8 8
2 1 1 1 1 1 1 1 1
3 2 2 1 1 2 2 1 1
4 6 6 6 7 7 8 8 8
5 2 3 2 1 2 2 3 2
6 2 2 4 4 2 2 2 4
7 5 5 4 5 5 5 5 4
8 5 5 6 2 1 2 3 3
9 6 6 3 1 1 1 1 1
10 4 4 3 1 5 5 2 3
11 5 5 4 2 5 5 3 2
12 4 4 3 5 4 4 4 4
13 4 6 6 6 9 5 7 8
14 3 3 5 4 5 4 5 5
15 5 4 5 5 2 2 6 6
16 4 4 4 4 4 3 3 3
17 8 7 5 3 9 10 4 6
18 6 5 6 6 6 6 6 6
19 3 3 3 3 3 3 3 3
20 3 3 3 3 3 3 3 3
21 3 3 1 2 2 2 1 1
22 6 8 7 5 6 6 5 6
23 4 6 4 4 5 5 4 5
24 5 5 5 5 5 5 5 5
25 3 3 4 4 5 5 4 4
26 1 2 1 1 1 1 1 2
27 3 3 3 3 2 2 2 2
28 4 4 2 2 4 4 2 2
Table C.3. Average scores for the Classical scale of the Classical/Expressive questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 5.5 5.3 5.3 5.0 3.5 3.7 3.6 3.4
2 3.0 3.0 3.0 3.0 4.0 4.5 4.5 4.0
3 5.5 5.0 5.5 3.8 3.0 4.0 3.0 3.0
4 5.5 4.8 4.5 3.0 3.5 4.0 3.0 2.5
5 4.3 4.8 5.3 5.3 2.8 3.5 2.0 3.0
6 6.0 6.0 6.0 6.0 4.5 3.5 4.5 4.0
7 5.8 5.5 4.3 3.8 5.0 5.0 5.0 5.0
8 3.5 3.5 3.5 3.5 5.5 6.0 6.0 4.0
9 6.0 6.0 5.5 5.8 2.5 2.5 2.5 2.5
10 4.0 4.3 5.0 4.8 5.8 6.0 5.3 5.0
11 1.5 1.0 1.0 1.0 4.8 4.5 4.3 5.0
12 4.3 3.8 4.5 3.5 1.0 1.0 1.5 1.0
13 4.5 4.5 4.5 4.3 3.5 3.0 4.8 3.8
14 5.3 5.3 5.0 3.8 3.3 3.0 3.5 2.8
15 2.0 4.5 4.3 2.0 3.0 3.0 3.0 3.0
16 7.0 5.3 5.0 5.0 2.0 2.0 4.3 2.0
17 2.5 2.5 2.5 2.5 3.5 5.0 2.0 4.8
18 - - - - 2.5 2.5 2.5 2.5
19 - - - - 3.5 3.7 3.6 3.4
20 - - - - 3.5 3.7 3.6 3.4
21 - - - - 3.5 3.7 3.6 3.4
Table C.4. Average scores for the Expressive scale of the Classical/Expressive questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.5 3.3 3.3 4.3 3.0 3.0 2.8 3.0
2 1.8 1.8 2.0 1.8 3.5 1.5 3.3 3.0
3 3.8 3.5 4.5 4.0 1.8 1.8 1.8 3.0
4 2.0 2.0 3.0 2.0 3.5 3.5 4.5 4.3
5 3.5 2.8 3.0 3.0 2.0 2.5 3.3 2.3
6 2.3 2.0 2.0 2.5 2.5 2.8 2.3 2.3
7 1.8 2.0 2.0 2.0 2.0 2.0 2.0 2.0
8 2.0 3.0 2.0 2.0 2.0 2.0 2.0 2.0
9 1.0 1.0 1.0 1.0 2.0 2.0 2.0 2.0
10 3.8 2.5 4.0 4.0 1.0 1.0 1.0 1.0
11 3.3 3.3 2.0 4.3 3.5 3.5 4.0 4.0
12 1.8 2.0 2.8 1.8 2.0 2.3 2.5 2.0
13 4.0 4.3 2.5 2.5 2.0 2.3 3.5 2.3
14 6.0 5.0 5.0 5.0 4.3 4.3 3.0 4.0
15 3.0 2.0 3.0 3.0 4.0 5.0 5.0 5.0
16 6.0 5.5 6.0 6.5 3.0 3.0 3.0 3.0
17 4.3 4.3 4.3 4.3 6.3 5.5 6.5 5.5
18 - - - - 4.3 4.3 4.3 4.3
19 - - - - 6.0 5.8 5.8 5.8
20 - - - - 1.0 1.0 1.0 1.0
21 - - - - 2.0 2.0 2.0 2.0
Table C.5. Total average scores for the Classical/Expressive questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 4.5 4.3 4.3 4.6 3.3 3.4 3.2 3.2
2 2.4 2.4 2.5 2.4 3.8 3.0 3.9 3.5
3 4.6 4.3 5.0 3.9 2.4 2.9 2.4 3.0
4 3.8 3.4 3.8 2.5 3.5 3.8 3.8 3.4
5 3.9 3.8 4.1 4.1 2.4 3.0 2.6 2.6
6 4.1 4.0 4.0 4.3 3.5 3.1 3.4 3.1
7 3.8 3.8 3.1 2.9 3.5 3.5 3.5 3.5
8 2.8 3.3 2.8 2.8 3.8 4.0 4.0 3.0
9 3.5 3.5 3.3 3.4 2.3 2.3 2.3 2.3
10 3.9 3.4 4.5 4.4 3.4 3.5 3.1 3.0
11 2.4 2.1 1.5 2.6 4.1 4.0 4.1 4.5
12 3.0 2.9 3.6 2.6 1.5 1.6 2.0 1.5
13 4.3 4.4 3.5 3.4 2.8 2.6 4.1 3.0
14 5.6 5.1 5.0 4.4 3.8 3.6 3.3 3.4
15 2.5 3.3 3.6 2.5 3.5 4.0 4.0 4.0
16 6.5 5.4 5.5 5.8 2.5 2.5 3.6 2.5
17 3.4 3.4 3.4 3.4 4.9 5.3 4.3 5.1
18 - - - - 3.4 3.4 3.4 3.4
19 - - - - 4.8 4.7 4.7 4.6
20 - - - - 2.3 2.4 2.3 2.2
21 - - - - 2.8 2.9 2.8 2.7
Table C.6. Average scores for the Simplicity scale of the VisAWI questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.0 2.8 4.4 4.2 2.3 1.7 1.0 1.0
2 4.6 4.4 4.6 4.2 4.0 5.0 5.3 5.7
3 5.6 5.6 6.0 3.8 4.0 5.0 5.0 6.0
4 2.4 4.0 3.6 2.8 7.0 6.7 7.0 6.3
5 5.0 5.0 2.8 2.6 1.0 1.0 2.7 2.3
6 4.0 3.8 4.0 4.0 1.7 1.7 4.7 4.7
7 5.2 3.8 3.0 4.0 1.0 1.0 2.3 1.0
8 5.4 5.4 4.4 3.8 1.0 1.0 1.3 1.0
9 4.4 4.6 3.8 3.8 4.7 5.3 4.7 4.7
10 4.0 4.0 5.4 5.4 4.0 4.3 4.0 4.3
11 2.8 2.0 2.0 2.2 3.3 4.7 4.0 3.3
12 4.4 4.6 4.4 4.6 3.0 4.0 4.7 4.3
13 5.4 5.6 4.4 4.8 5.0 5.7 6.0 5.0
14 5.4 4.6 5.0 5.0 6.7 6.7 6.7 6.7
15 4.8 4.8 3.6 3.6 3.0 3.3 1.7 2.7
16 4.8 5.8 4.0 6.0 1.0 1.0 1.0 1.0
17 5.8 6.0 5.6 5.4 5.3 3.3 4.3 5.3
18 2.8 2.8 2.0 2.0 5.0 5.0 5.0 5.3
19 4.0 3.6 4.0 4.0 2.7 2.7 2.0 2.7
20 4.2 4.2 3.4 3.6 5.3 5.7 5.7 5.3
21 5.2 5.6 5.2 4.6 4.3 3.7 4.0 2.7
22 4.2 5.2 4.4 3.6 5.3 4.3 3.3 5.0
23 5.0 4.4 5.4 5.0 4.0 4.0 3.3 4.0
24 3.8 4.8 4.6 3.6 - - - -
25 1.6 2.0 2.0 2.0 - - - -
Table C.7. Average scores for the Diversity scale of the VisAWI questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.0 3.4 4.0 4.6 1.0 1.0 1.0 1.0
2 3.6 2.4 2.2 2.2 1.3 1.3 1.3 1.3
3 4.6 4.8 5.6 4.6 2.7 4.0 4.7 5.0
4 4.2 4.0 5.0 4.4 6.3 3.7 6.3 5.3
5 4.2 3.0 2.6 1.8 1.0 1.0 2.7 1.0
6 4.0 4.4 4.0 4.0 2.0 2.0 4.0 4.3
7 4.2 4.6 4.6 4.2 1.0 1.0 2.0 1.0
8 4.0 4.0 3.0 3.2 1.0 1.0 2.0 1.0
9 2.4 2.0 1.4 1.6 4.3 4.0 3.0 3.3
10 4.8 4.2 2.6 3.4 1.3 1.3 1.3 1.0
11 2.8 2.0 2.0 2.8 2.0 2.7 2.0 2.0
12 2.8 2.8 3.2 3.6 3.7 3.7 4.7 2.7
13 2.0 2.2 2.0 2.4 2.3 2.7 2.0 2.3
14 3.8 2.8 4.6 2.6 6.0 6.0 6.0 6.3
15 1.6 1.6 1.2 1.6 2.3 3.0 2.0 2.0
16 2.6 3.0 3.4 3.8 1.0 1.0 1.0 1.0
17 4.0 5.0 4.2 4.2 4.3 3.7 4.3 4.3
18 2.0 2.0 2.0 2.0 2.0 2.0 2.3 2.0
19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 1.3
20 1.8 1.2 1.2 1.0 2.7 3.7 3.7 3.7
21 4.8 5.4 5.4 4.0 2.0 1.3 2.3 1.7
22 1.0 3.2 2.2 1.2 1.7 1.3 1.3 2.0
23 4.8 5.6 5.4 4.6 2.0 2.0 1.7 2.0
24 5.2 3.0 3.8 4.0 - - - -
25 2.0 1.8 2.4 2.0 - - - -
Table C.8. Average scores for the Colorfulness scale of the VisAWI questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.0 4.3 3.8 4.3 1.0 1.0 1.0 1.0
2 4.5 4.0 4.0 4.0 2.5 4.0 3.5 4.0
3 4.0 4.8 5.5 4.3 3.0 4.5 3.0 5.0
4 4.8 3.5 3.8 4.0 6.0 2.0 6.0 3.0
5 4.3 4.5 2.8 3.8 5.0 5.0 5.0 4.0
6 4.0 3.3 4.0 4.0 2.0 2.0 3.5 3.0
7 3.5 4.8 3.5 4.5 1.0 1.0 2.0 1.0
8 6.0 5.8 5.0 5.5 1.0 1.0 2.0 1.0
9 3.8 2.8 3.0 3.0 4.5 4.5 4.0 4.5
10 4.5 3.8 4.0 4.8 3.0 2.0 3.0 3.0
11 2.0 2.0 2.0 3.0 2.0 2.0 2.0 2.0
12 3.5 3.8 3.8 3.8 4.0 3.0 3.5 2.5
13 4.0 4.3 4.0 4.0 2.0 1.0 1.0 1.5
14 5.3 5.0 5.5 5.5 7.0 6.5 6.5 6.5
15 1.0 1.0 1.3 1.0 2.0 2.5 2.0 2.0
16 6.3 6.0 6.0 6.0 1.0 1.0 1.0 1.0
17 5.8 6.0 5.5 5.5 5.5 5.0 6.0 6.0
18 3.0 2.0 2.0 2.0 3.0 2.0 2.5 2.0
19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 1.5
20 2.8 4.0 3.5 3.8 3.0 2.0 3.0 2.5
21 6.0 5.5 5.8 5.3 1.0 1.0 1.5 1.0
22 2.5 2.5 2.0 2.5 6.0 6.0 6.0 6.0
23 4.8 5.5 5.5 4.8 2.0 2.0 2.5 2.5
24 3.8 4.5 5.0 3.8 - - - -
25 4.0 1.5 2.0 1.8 - - - -
Table C.9. Average scores for the Craftsmanship scale of the VisAWI questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.5 2.8 4.8 4.0 1.0 1.0 1.0 1.0
2 3.3 2.5 2.8 2.3 2.0 2.5 4.0 2.5
3 5.8 5.0 5.5 4.3 3.5 4.0 4.0 6.0
4 2.0 4.3 3.3 1.3 7.0 7.0 7.0 6.5
5 4.5 2.8 3.0 2.8 1.0 1.0 1.0 1.0
6 4.0 3.3 4.0 4.0 3.5 2.0 5.0 5.0
7 4.3 4.5 4.0 3.5 1.0 1.0 1.5 1.0
8 3.0 4.0 3.3 2.8 1.0 1.0 1.0 1.0
9 3.0 2.0 2.0 3.8 4.0 4.0 4.0 4.5
10 5.5 5.0 3.8 4.5 1.5 2.0 2.0 2.5
11 3.3 2.5 3.5 2.3 2.0 2.0 1.5 2.0
12 3.0 2.8 2.5 2.8 3.5 4.5 5.5 3.5
13 5.0 3.0 4.8 4.3 3.0 2.5 2.5 2.5
14 4.8 5.0 4.8 4.5 7.0 7.0 7.0 7.0
15 2.8 2.3 2.0 2.3 3.5 3.5 2.0 3.5
16 4.5 4.3 4.3 4.3 1.0 1.0 1.0 1.0
17 4.5 4.0 5.5 4.5 3.0 3.0 4.0 4.0
18 2.3 2.0 2.0 2.0 4.0 4.5 4.0 4.5
19 4.0 4.0 4.0 4.0 2.0 2.0 2.0 2.0
20 2.8 1.8 2.0 1.8 1.5 2.5 2.5 3.5
21 5.3 5.3 5.0 3.5 1.5 1.0 1.0 1.0
22 1.3 1.5 3.3 1.5 2.0 1.0 2.0 2.0
23 4.8 4.8 5.5 4.8 2.0 2.5 1.0 2.5
24 3.3 5.0 4.0 3.8 - - - -
25 2.5 3.0 3.5 3.0 - - - -
Table C.10. Total average scores for the VisAWI questionnaire.
Participant no. | Design 1 | Design 2 | Design 3 | Design 4 | Design 5 | Design 6 | Design 7 | Design 8
1 3.1 3.3 4.2 4.3 1.3 1.2 1.0 1.0
2 4.0 3.3 3.4 3.2 2.5 3.2 3.5 3.4
3 5.0 5.0 5.7 4.2 3.3 4.4 4.2 5.5
4 3.3 3.9 3.9 3.1 6.6 4.8 6.6 5.3
5 4.5 3.8 2.8 2.7 2.0 2.0 2.8 2.1
6 4.0 3.7 4.0 4.0 2.3 1.9 4.3 4.3
7 4.3 4.4 3.8 4.1 1.0 1.0 2.0 1.0
8 4.6 4.8 3.9 3.8 1.0 1.0 1.6 1.0
9 3.4 2.8 2.6 3.0 4.4 4.5 3.9 4.3
10 4.7 4.2 3.9 4.5 2.5 2.4 2.6 2.7
11 2.7 2.1 2.4 2.6 2.3 2.8 2.4 2.3
12 3.4 3.5 3.5 3.7 3.5 3.8 4.6 3.3
13 4.1 3.8 3.8 3.9 3.1 3.0 2.9 2.8
14 4.8 4.4 5.0 4.4 6.7 6.5 6.5 6.6
15 2.5 2.4 2.0 2.1 2.7 3.1 1.9 2.5
16 4.5 4.8 4.4 5.0 1.0 1.0 1.0 1.0
17 5.0 5.3 5.2 4.9 4.5 3.8 4.7 4.9
18 2.5 2.2 2.0 2.0 3.5 3.4 3.5 3.5
19 4.0 3.9 4.0 4.0 2.2 2.2 2.0 1.9
20 2.9 2.8 2.5 2.5 3.1 3.5 3.7 3.8
21 5.3 5.4 5.3 4.3 2.2 1.8 2.2 1.6
22 2.2 3.1 3.0 2.2 3.8 3.2 3.2 3.8
23 4.8 5.1 5.5 4.8 2.5 2.6 2.1 2.8
24 4.0 4.3 4.4 3.8 - - - -
25 2.5 2.1 2.5 2.2 - - - -