Linköping University | Department of Computer and Information Science Bachelor thesis, 16 ECTS | Datateknik
Spring 2020| LIU-IDA/LITH-EX-G--20/023--SE
Visualization of machine learning data for radio networks
A case study at Ericsson
Bingyu Niu
Supervisors: Daniel Karlsson (Ericsson) Magnus Johansson (Ericsson) Zeinab Ganjei (Linköping University) Examiner: Mikael Asplund (Linköping University)
Upphovsrätt
This document is made available on the Internet – or its possible future replacement – for a period of 25 years from the date of publication barring exceptional circumstances.

Access to the document implies permission for anyone to read, download, print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. Subsequent transfers of copyright cannot revoke this permission. All other use of the document requires the author's consent. To guarantee authenticity, security and accessibility, there are solutions of a technical and administrative nature.

The author's moral rights include the right to be mentioned as the author, to the extent required by good practice, when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or character.

For additional information about Linköping University Electronic Press, see the publisher's home page http://www.ep.liu.se/.
Copyright
The publishers will keep this document online on the Internet – or its possible replacement – for a period of 25 years starting from the date of publication barring exceptional circumstances.
The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/hers own use and to use it unchanged for non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.
According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.
For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Bingyu Niu
Abstract
This thesis presents a method for developing visualization software for time-varying, geographically based data. The machine learning team at Ericsson has collected data from their machine learning algorithms. The data set contains timestamped and geographic information. To better understand the results produced by the machine learning algorithms, it is important to understand the patterns in the data. These patterns are hard to see by looking only at the raw data set, and data visualization software gives users a more intuitive view of the data. To choose a suitable GUI library, three common GUI libraries were compared. The Qt framework was chosen as the GUI library and development framework because of its wide-ranging support for user interface design. Animation is the main method used to visualize the data set. The performance evaluation of the software shows that it handles the back-end data efficiently, renders quickly in the front-end, and has low memory and CPU usage. The usability testing indicates that the software is easy to use. Finally, the thesis compares its method to a previous method developed in R. The comparison shows that although the old method was easier to develop, it performs worse.
Acknowledgement
I would like to thank my supervisors at Ericsson, Daniel Karlsson and Magnus Johansson, for all their help and support on both practical and theoretical matters. I also want to thank my examiner Mikael Asplund and my supervisor Zeinab Ganjei from the university for guiding the whole thesis process and providing much advice and many suggestions on academic research.
Table of Contents
Upphovsrätt ...................................................................................................................... ii
Copyright .......................................................................................................................... ii
1. Introduction ..................................................................................................................... 1
1.1 Background .................................................................................................................. 1
1.2 Motivation ................................................................................................................... 2
1.3 Aim .............................................................................................................................. 3
1.4 Approach ..................................................................................................................... 4
1.5 Delimitation ................................................................................................................. 4
2. Background ...................................................................................................................... 5
2.1 Explainable artificial intelligence ................................................................................ 5
2.2 Time-varying data visualization .................................................................................. 5
2.3 C++ .............................................................................................................................. 6
2.4 R language ................................................................................................................... 6
2.5 Qt software development framework .......................................................................... 6
2.6 Other GUI libraries ...................................................................................................... 7
2.6.1 GIMP Tool Kit: GTK ........................................................................................... 7
2.6.2 wxWidgets ............................................................................................................ 7
2.7 Usability testing ........................................................................................................... 8
2.8 Related work ................................................................................................................ 9
3. Method ............................................................................................................................ 10
3.1 Development language and tool ................................................................................ 10
3.2 Implementation .......................................................................................................... 11
3.2.1 Structure of the software .................................................................................... 11
3.2.2 User interface ..................................................................................................... 11
3.2.3 Front-end ............................................................................................................ 12
3.2.4 Back-end ............................................................................................................ 13
3.2.5 Two-way communication ................................................................................... 16
3.3 Evaluation .................................................................................................................. 16
3.3.1 User Feedback .................................................................................................... 16
3.3.2 Software Performance ........................................................................................ 17
3.3.3 Comparison ........................................................................................................ 18
4. Result .............................................................................................................................. 19
4.1 User Interface ............................................................................................................. 19
4.2 Performance of the software ...................................................................................... 20
4.2.1 Data preparation ................................................................................................. 20
4.2.2 Response time for the selected time ................................................................... 20
4.2.3 CPU and memory usage ..................................................................................... 20
4.2.4 Rendering time ................................................................................................... 21
4.3 Comparison ................................................................................................................ 21
4.3.1 Data preparation ................................................................................................. 21
4.3.2 Response and rendering time ............................................................................. 22
4.3.3 CPU and memory usage ..................................................................................... 22
4.3.4 Implementation efficiency .................................................................................. 23
4.4 Usability testing result ............................................................................................... 23
5. Discussion ....................................................................................................................... 25
5.1 Method discussion ..................................................................................................... 25
5.2 Result discussion ....................................................................................................... 26
5.3 Ethical and societal consideration ............................................................................. 27
6. Conclusion ..................................................................................................................... 28
7. References ...................................................................................................................... 29
List of figures
Figure 1. 1: One Node with three cells ........................................................................... 1
Figure 1. 2: Relationship of the machine learning algorithms and visualization software ............. 2
Figure 2. 1: Signals and Slots mechanism in Qt ............................................................ 7
Figure 3. 1: Structure of the software ........................................................................... 11
Figure 3. 2: Layout of the user interface ....................................................................... 12
Figure 3. 3: Structure of a QML file ............................................................................. 12
Figure 3. 4: Back-end modules ..................................................................................... 13
Figure 3. 5: Workflow of the back-end ......................................................................... 15
Figure 3. 6: Data containers .......................................................................................... 16
Figure 3. 7: Output from top command ........................................................................ 18
Figure 4. 1: User Interface ............................................................................................ 19
Figure 4. 2: Time taken for data preparation of new software ...................................... 20
Figure 4. 3: CPU usage for the new software ............................................................... 21
Figure 4. 4: Data preparation comparison ..................................................................... 22
Figure 4. 5: CPU usage for the old software ................................................................. 22
Figure 4. 6: CPU and memory usage comparison in percentage .................................. 23
List of tables
Table 1. 1: Input data set ................................................................................................. 2
Table 3. 1: Comparison of GTK, Qt and wxWidgets ................................................... 10
Table 4. 1: CPU and memory usage of the old software .............................................. 23
Table 4. 2: Result of usability testing with SUS questions ........................................... 24
1. Introduction

This is a bachelor thesis on data visualization, written as a case study at Ericsson. In this chapter, the background, motivation, aim, approach, and delimitation are described.
1.1 Background
Ericsson is one of the biggest networking and telecommunications companies providing Information and Communication Technology (ICT) solutions. It is a Swedish company with business in many countries and regions around the world [1].
A massive amount of data is collected from Ericsson's 2G, 3G, 4G and 5G radio networks, and the machine learning servers/models use this data to provide better services and solutions. The abstract nature of machine learning makes it hard to understand the data provided by the machine learning models.
This thesis was written with the machine learning team at Ericsson AB Linköping. The team consists of twelve software developers, two of whom were the supervisors for this thesis.
User equipment (UE) is the device used for communication by end-users; the most common UE is a mobile phone. When the connection between a UE and the core network is weak or lost, the UE needs to find another connection with better performance. In cellular telecommunications, a cell is one network coverage area, and handover is the process of switching a connection from one cell to another. The team develops machine learning algorithms to predict handovers more accurately.
Ericsson has radio stations at many sites, and the machine learning algorithms are connected to these stations. In this thesis, each station is called a "node". As shown in Figure 1.1, each node has several cells with different coverage areas. The cells collect UE events, such as signal strength, from their coverage areas, and those events are then used to train machine learning models. Different machine learning models are used for training, and the number of models in one node depends on the number of coverage areas. Each model has a training status with several possible states: not trained, not valid, valid, not outdated, outdated, in lobby, and on hold.
Figure 1. 1: One Node with three cells
1.2 Motivation
Artificial intelligence and machine learning technology have developed rapidly during the past decade. Machine learning models have been applied in different industries to provide better predictions. Users sometimes have difficulty understanding why a model produces a certain result. Explainable artificial intelligence is a concept intended to make the results from such machine learning models more understandable to humans [2]. To visualize the performance of different machine learning models, a Graphical User Interface (GUI) is needed that presents the data in a more human-readable way.
The relationship between the machine learning algorithms and the visualization software is shown in Figure 1.2. The changes of the models' training states are recorded as the output of the machine learning algorithms. This data is the input to the visualization software and includes the following attributes: node name, cell id, frequency, machine learning model id, timestamp, training state, latitude and longitude. Table 1.1 shows what the raw data set looks like. It is hard to see the state changes for different nodes in one area just by looking at the data set. Therefore, it is desirable to have visualization software that can show the changes of each model state for each node in a continuous time flow.
Figure 1. 2: Relationship of the machine learning algorithms and visualization software
Table 1. 1: Input data set
Node Cell Frequency Model ID Time State Latitude Longitude
N001 1 347 0x105454 5/15/2019 6:59 VALID 48.5646 1.5864
N001 1 347 0x205454 5/13/2019 20:24 NOT_OUTDATED 48.5646 1.5864
N001 3 1288 0x168402 5/11/2019 20:35 NOT_TRAINED 48.5646 1.5864
N002 7 6300 0x242956 5/12/2019 2:42 ON_HOLD 48.5874 1.5789
N002 2 347 0x245688 5/12/2019 2:42 NOT_VALID 48.5874 1.5789
N003 5 2850 0x229898 5/12/2019 2:42 VALID 48.5836 1.5584
N004 4 1288 0x158912 5/12/2019 17:48 ON_HOLD 48.5744 1.5784
... ... ... ... ... ... ... ...
Here are some example scenarios expected from the visualization software.
Scenario one:
From the visualization, users see that the majority state of the models in one node becomes ready in a short time, then the state of this node changes very quickly during the following three days, and after that the state of the node becomes stable. This is interesting for the users to see, so that they can consider why the model state changes in this way.
Scenario two:
There is one node that is never ready, so its colour is always red. The user clicks on the node and sees that several models are applied to this node. Some of them work well and their states are ready, but some of them perform worse. In this way, it is easy to see which model performs better.
Scenario three:
The visualization software helps to show that each area has its own pattern of model performance. The state changes and the majority state can differ between geographical areas, which may depend on factors such as city size, the number of user equipment, etc.
Previous thesis students developed visualization software in the form of a web application written in the R language [3]. However, this software has a long response time when users interact with the user interface, and because of this it is no longer in use. Thus, new visualization software with better performance is needed.

It is important and interesting to find suitable methods to develop such software and to evaluate its performance. This thesis develops new visualization software and then compares it to the previous software to find the differences between the two approaches from various aspects.
1.3 Aim
The aim of this thesis is to find an approach for developing visualization software that can display machine learning data from radio networks in an understandable and efficient way. The research questions, based on the motivation, are:

1. How can visualization software for time-varying data with geographic information be implemented, so that the data can be understood better and the software has good performance?

2. How does the performance of the new software compare to that of the old software, and what are the advantages and disadvantages of the two approaches?
To answer these research questions, several issues need to be examined:
• Find a programming language that executes code fast and is easy to use for software development.
• Find a GUI framework that fulfils the visualization requirements and has good execution performance.
• Find suitable data structures and containers to store the data so that it is easy to access and use.
• Find a way to control the data import process so that the data input is correct, transparent, and controllable.
• Find a method to evaluate the software's performance.
• Compare the performance of the new and old software.
• Find a method to evaluate the users' experience.
1.4 Approach
The study began with reading theory about data visualization, explainable artificial intelligence, development approaches, common GUI libraries and software evaluation. A suitable development language and GUI library were chosen based on the theory study, the requirements and the development environment. The implementation idea was then presented, covering both the back-end and the front-end. The front-end description focused on how to show the information in a human-readable way and which kinds of tools were needed. The back-end focused on data input, data storage and data access. The two-way communication between the front-end and the back-end was also described. After the implementation was done, the software was evaluated from two aspects: user experience and software performance. Then the old software's performance was evaluated, and the performance of the two programs was compared and analysed.
1.5 Delimitation
Due to time limitations, the study chooses suitable tools based on a limited theory study and comparison. There are other ways to develop the software, but this thesis focuses on one possible way. The comparison of the new and old software does not consider code quality, choice of algorithms, or other detailed issues. Since the data set was not collected by the author, the ethical issues of data collection are not discussed in this thesis.
2. Background

This chapter presents relevant theory about data visualization, programming languages, GUI libraries and evaluation methods. These theories offer a foundation for answering the research questions. Related work is then presented.
2.1 Explainable artificial intelligence
Machine learning algorithms have been applied in different industries to provide better predictions. Most of these algorithms have non-linear structures, and users have difficulty understanding why the algorithms produce a certain result [4]. People can control the input data and will get an output from the machine learning algorithms, but how the algorithms make the decision is unknown to the users. This is the so-called "black box" in machine learning, which is not transparent to users [4]. If people do not understand how a decision was made by artificial intelligence, it is hard to trust and explain the result. Explainable artificial intelligence, the opposite of the "black box", is an approach that makes the results from machine learning models more understandable to humans [5, 4]. It aims to provide a better explanation of, and more transparency about, why a result was reached by machine learning algorithms or artificial intelligence [4, 5].
Explainable artificial intelligence has several goals, including trustworthiness, causality, transferability, informativeness, accessibility, and interactivity [2]. Trustworthiness suggests that the model should be trustworthy and able to act as expected. Causality means that the model can show the relationships between various variables. Transferability requires that an existing model or solution can be applied to other problems. Informativeness means that the models need to provide enough clear information for the users to understand the decision. Accessibility requires that users with different knowledge levels can rapidly get the main clues and facts of the model. Interactivity means that the users can interact with the model.
2.2 Time-varying data visualization
The input data of the visualization software is time-based because all the collected data have a timestamp. Dynamic data that varies over time is called time-varying or time-based data [6]. The characteristics of the data updates sort the data into different categories, including continuous and discontinuous, regular and irregular, noisy and significant [6]. Moreover, according to its behaviour, the data can be separated into three types: regular, periodic and turbulent [7]. Regular data means that the values change with a stable tendency over time, such as rising, stable, or decreasing. Periodic data is, for instance, temperature, which changes between day and night. Turbulent data varies a lot in both its spatial and temporal aspects [7].
Moere [6] summarized several methods to visualize time-varying data, including static state replacement, time-series plots, static state morphing, and control application. Static state replacement refers to updating a value by replacing the current data value with a new one. The time-series plots method is usually shown as a chart with curves and a timeline. Static state morphing presents data that has been filtered by a selected time interval. Control application requires that the visualization and result can be produced at any time during the execution.
From another point of view, time-varying data can be presented in two different ways: by space and by animation [8]. By space, the length of time or a time interval can be shown as a line in space. By animation, the visualization view changes as time changes.

It is popular to combine spatial information with time animation when visualizing time-dependent data. Interactivity is important for animation, and the user should be able to choose the time point or filter the data [8]. The speed of the animation should be clear and slow, so that the user can see the development and changes clearly. Both stationary data presentation and animation are useful for data visualization, and the choice between them depends on the task requirements and the data type [8, 9]. 2D is suitable for data visualization when the data set and task are not very complicated [9].
2.3 C++
In this thesis work, C++ was chosen as the development language for the back-end, and this section introduces the features of C++. Seed [10] describes C++ as a general-purpose and cross-platform language developed as an extension of the C language. It is an object-oriented language, but since it extends C, it can also be used for structured programming. The object-oriented features offer clear structure for development and code reuse. C++ allows developers to control the system's resource and memory usage, and direct access to memory is available, which improves speed and efficiency. C++ is a compiled language: the compiler translates the code to machine code that can be executed directly by the machine, without going through a virtual machine, which contributes to its speed.
2.4 R language
R is the development language of the previous software, and this section presents its features. R is a general-purpose programming language for statistical computing that can be used on different platforms. It is an implementation of, and an environment for, the S language, whose name stands for statistics [11]. R has been widely used for data analysis and data mining. R is an interpreted language, which means that an interpreter first translates the source code into a sequence of instructions, which are then translated into machine code [12].
2.5 Qt software development framework
Koranne [13] introduces Qt as an application development framework that provides a GUI library for visualization. It is written in C++ and supports many other languages, including Java, Python, Go, C#, and Ruby. As a cross-platform framework, the source code compiles on many platforms, including UNIX, GNU/Linux, and embedded Linux. Qt has many features that fulfil various developer needs. Besides the core module, Qt 4 and later versions have many other independent modules, each of which can be used on its own. Some of the common modules are QtGUI, QtNetwork, QtOpenGL, QtSql, QtXML, and QtSVG. This means that Qt supports a wide range of applications and demands. Qt provides not only a GUI solution but also a wide range of application programming interfaces (APIs), including memory sharing, databases, multi-threading, and network programming. Qt has a commercial version and an open-source version; the commercial version is under a commercial license and the open-source version is under the LGPL license.
In other toolkits, communication among objects is usually handled by callback functions, but Qt introduces a special mechanism for this called signals and slots [14]. Each object can send signals to others and receive signals through slots. As shown in Figure 2.1, when some event occurs for an object, the object emits a signal to another object's slot.
Figure 2. 1: Signals and Slots mechanism in Qt.
Qt provides C++ extensions, and these extensions are compiled by the Meta-Object Compiler (MOC) [14]. MOC parses the extensions and produces standard C++ sources that can be compiled by a standard C++ compiler. The QObject class is a Qt C++ extension that supports object communication through signals and slots [14].
Qt Creator is a cross-platform integrated development environment (IDE) for developing Qt applications. It supports desktop, mobile and embedded platforms. Qt Creator provides tools to analyse code performance, including CPU and memory usage [14].
QML is a programming language for developing user interfaces. The syntax of QML is similar to JSON, and it also supports JavaScript expressions. QML modules provide the engines and substructures of QML [14]. One module, called Qt Quick, offers several visualization components and an animation framework.
2.6 Other GUI libraries
There are other GUI libraries for visualization; in this section, two of them are introduced.
2.6.1 GIMP Tool Kit: GTK
GTK is a cross-platform, object-oriented toolkit for graphical user interfaces released under the LGPL license [13]. The toolkit was originally developed for the GNU Image Manipulation Program (GIMP), which is why it is called the GIMP Tool Kit. It is written in C and supports several programming languages, including Python, C/C++, Perl, and Java. The toolkit is part of the GNU project and is also free [15]. The user interface of GTK contains many widgets, including windows, displays, buttons, menus, toolbars, etc. [15].
2.6.2 wxWidgets
The documentation of wxWidgets [16] introduces wxWidgets as a cross-platform GUI library written in C++ that can be compiled by a C++ compiler. It supports several other languages, such as C#, Perl and Python. With the growth of its features, it can support many toolkits and platforms, including GTK and Qt. wxWidgets uses the functions of the native platforms and provides an API on top of them for coding GUI applications. Because wxWidgets uses the native APIs, applications have a native look. There are many GUI components that support different types of application development. The licence of wxWidgets is the "wxWindows Library Licence", which is similar to the LGPL but has some exceptions.
2.7 Usability testing
Usability is one of the quality requirements for products that interact with users [17]. It means that a product satisfies the users' demands and that the interface of the product is easy to use. Usability testing is an evaluation tool to test whether a product interface is easy for users to use. The purpose of the test is to get feedback directly from the users and then improve the product based on that feedback. The test is usually used in user-centred design, and it helps determine whether the product meets the users' expectations [17]. The testing environment should be realistic, which means that the product needs to be accessible to the real users who are going to test it. There are various methods for usability testing, including A/B testing, hallway testing, and expert review [17]. A/B testing means evaluating a variable or element from opposite sides: the user usually gets two questions about the same issue, but the two questions A and B are phrased against each other.
Hartson and Pyla [18] describe the System Usability Scale (SUS) as a method to measure the
usability of a piece of software or a system. SUS provides a questionnaire with ten questions and a scoring
system. The questions were designed based on the A/B testing idea: for each aspect, there
are two questions phrased against each other. With SUS, it is easier to tell whether a system is usable or not.
It is widely used today to measure the usability of websites, but it is also suitable
for a wide range of digital products, including software applications [19].
The SUS questionnaire contains ten standard questions. The first standard question is "I think
that I would like to use this system frequently". Since this software will be used by few people
on special occasions, the word "frequently" was changed to "when needed". The ten questions in
the SUS questionnaire are as follows [19]:
1. I think that I would like to use this system when needed.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.
Each question is scored from 1 to 5 points, standing for strongly disagree,
disagree, neutral, agree, and strongly agree. The calculation of the total score follows
these rules [19]:
1. The score of an odd-numbered question = points - 1.
2. The score of an even-numbered question = 5 - points.
3. The scores of all odd- and even-numbered questions are summed together.
4. SUS score = summed score * 2.5.
The SUS score lies on a 0-100 scale and indicates the usability performance.
The score intervals are interpreted as follows [19]:
• SUS score < 50: The performance is not acceptable.
• SUS score between 50-70: The performance is marginal.
• SUS score > 70: The performance is acceptable.
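As an illustration, the scoring rules above can be written down directly in code. The following is a minimal sketch; the function name and the use of std::array are my own choices and not part of the thesis software:

```cpp
#include <array>

// Computes a SUS score from ten answers, each 1-5 points.
double susScore(const std::array<int, 10>& points) {
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        if (i % 2 == 0)
            sum += points[i] - 1;  // odd-numbered question: points - 1
        else
            sum += 5 - points[i];  // even-numbered question: 5 - points
    }
    return sum * 2.5;  // scale the summed score to 0-100
}
```

For example, a respondent answering 5 on every odd-numbered question and 1 on every even-numbered question gets the maximum score of 100.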
2.8 Related work
This thesis was written for a case study. Hence it is difficult to find previous studies which are
closely related to this thesis. Due to the time limitation, an elaborate literature study was not
performed. However, three student theses were found which were written in related areas.
Håkansson [20] evaluated Qt's abilities to create customized graphical components and how
easy it is to reuse those components in different projects. Håkansson built a control system
for CAN-bus signals with the Qt framework in an embedded system. A prototype of the system
architecture was provided.
Anderson [21] researched tools that can quickly build data visualizations for IoT systems.
Apache Zeppelin was chosen as a suitable tool to visualize IoT data. The study also examined
the limitations of this tool for IoT data visualization. The performance of Apache
Zeppelin was evaluated by a usability scale, a summed usability metric, and interviews.
Karlsson [22] created a visualization software to visualize log data. The purpose of the software
development was to create a tool that helps the company, OptoNova, improve its
troubleshooting. The visualization software was developed in C++ and Qt. A database was
designed and created to store the log data.
3. Method

This chapter demonstrates how the research questions will be answered. First, it presents the chosen
development language and tool. After that, it presents the implementation. The last
part covers evaluation and testing.
3.1 Development language and tool
Due to the efficiency and execution speed of C++, it will be the development language for the
back-end. The current platform is Linux, so the GUI library must support the Linux OS.
Considering potential use on other platforms in the future, a cross-platform GUI is a better
choice because it leaves wider possibilities for future development and usage. The Qt framework
was chosen as the development framework because of its wide range of support for interface
development.
There are many GUI libraries that support C++, but the libraries mentioned in chapters 2.5 and 2.6
appear more often than others in literature and online sources. Table 3.1 shows a simple
comparison of GTK, Qt and wxWidgets from different aspects.
Table 3.1: Comparison of GTK, Qt and wxWidgets

                        GTK                     Qt                      wxWidgets
  Development language  C++                     C++                     C++
  Cross-platform        Yes                     Yes                     Yes
  Licence               LGPL                    LGPL                    wxWindows Library Licence
  Compiler              Standard C++ compiler   Standard C++ compiler   Standard C++ compiler
  Libraries/Modules     Limited                 Wide range              Limited
  IDE                   None dedicated          Qt Creator              None dedicated
  Map plugins           Not bundled             Bundled                 Not bundled
The comparison of the different cross-platform C++ GUI libraries shows that Qt is the better choice
for this software development because of its powerful functionality. Since the data visualization should be
very easy for users to understand, a powerful toolkit is needed.
The goal is to locate all the nodes on a map, so map support is important for this
project. Qt supports many map plugins, such as OpenStreetMap, Mapbox GL, HERE, and Esri.
Since Qt 5.5, these common map plugins are bundled with Qt. Using a map is very simple:
developers only need to supply the plugin key. For example, the plugin key for OpenStreetMap
is "osm", so by setting "osm" as the plugin name, the map will be loaded.
For GTK, some plugins are available, but the developer needs to obtain and install those
packages before using them. For wxWidgets, there is no clear indication of support for loading
a map.
Compared with the other GUI libraries, Qt provides a wider range of modules to support different demands,
such as QML and Qt Widgets. To show data in a way that is more intuitive for humans, Qt, with its wide
range of modules and advanced features for data visualization, is a proper choice for this thesis. It should
be easy to add more features and functions to the software with Qt modules. Moreover, Qt
Creator makes it easy to develop the application, and it provides performance analysis tools.
3.2 Implementation
This chapter describes the structure of the software and then presents how the
different parts of the software are implemented.
3.2.1 Structure of the software
The software consists of three parts: the user interface, the front-end, and the back-end. The user interface
shows the data visualization and allows users to interact with it. The back-end handles data
storage, data sorting, value calculation, and data updating. The front-end is responsible for
creating the user interface and rendering items on it. The back-end will be
implemented in C++ and the front-end will be written in QML.
Figure 3.1 shows the structure of the software. The front-end receives instructions from users and
then sends update instructions to the back-end. An update instruction can start or stop the
updating or terminate the software. Based on the instructions, the back-end updates the data. The
updated information is then sent to the front-end. The data update information contains
the new changes to the data at the back-end. When the front-end receives the data update
information, it updates the user interface with the received data.
Figure 3.1: Structure of the software
3.2.2 User interface
The user interface visualizes the data and allows users to interact with it. To visualize the data,
animation is a proper method for simulating events happening over time. When an event
occurs, the software can calculate the new majority node state of a node and then
update the information in the user interface. Different states can be presented in different
colours. Since there is no special need for 3D visualization, 2D visualization is an appropriate
choice. Because each event has a timestamp and a location, it is reasonable to place the nodes
on a map with a timeline. The user interface will consist of three main parts: an
information window, a map window, and a time slider view. Figure 3.2 illustrates the layout of
the user interface.
All the nodes can be placed inside the map window with different colours standing for
different states. When the state changes, the node colour changes simultaneously.
A timeline can be placed inside the time slider view to show the passing of time, and users can select a
certain time on the timeline. The animation control buttons can be placed in the same area. In the
information window, the detailed data of a node can be presented as text. If the node
information is updated by a new event, the text should be updated at the same time.
Figure 3.2: Layout of the user interface
3.2.3 Front-end
The Qt Quick module and the QML language will be used to implement the front-end. Based on the
design of the user interface, the front-end will also contain three main modules: the map
window, the time slider view, and the information window. The map window is responsible for
rendering the map and the items on it, and for handling click events from user interaction. The time slider
view shows the passing of time and handles time-selection events; this area is also responsible for
animation control. The responsibility of the information window is to get information from the back-end
and show detailed information for a selected node. A QML file has a hierarchical
structure of objects and functions. A QML file needs to have one, and only one, root object, and all
other objects and functions must be placed inside it. An example structure of a QML
file is presented in Figure 3.3. The root object is the application window, and under it there
are three sub-objects: the map window, the time slider view, and the information window. Under those
objects there are further sub-objects and functions.
Figure 3.3: Structure of a QML file
To implement the components of the front-end, many QML elements will be used. Here are
some important QML elements for the implementation:
• QML has a Map type that allows developers to draw different map elements on a map. The MapCircle QML type with a coordinate property is a proper choice for rendering the nodes.
• The time flow is an important element of the visualization, so there will be a timeline that shows the time of the animation and allows users to drag the time point on the timeline to decide which moment they want to see. The Slider QML type provides many features for implementing the timeline.
• The TextArea QML type can display the information of a selected node, and the ScrollView QML type allows the text area to become scrollable.
• To trigger different events and functions at the front-end, the Timer QML type can be applied.
3.2.4 Back-end
The back-end needs to import data from CSV files, store the data, and handle data updating. It also
needs to send update information to the front-end. Figure 3.4 shows the main modules of the
back-end.

The data preparation module is responsible for importing data from the CSV files and creating data
containers for data storage. It creates one container to store event information and
another to store node information. The data update module is responsible for
receiving update instructions from the front-end, updating the node data in the back-end, and sending
update information to the front-end. When the data update module receives an update
instruction from the front-end, it reads new events from the event data container. It then
updates the node information in the node data container and sends the update information to the
front-end. The information of a selected node can be read from the node data container.
Figure 3.4: Back-end modules
The workflow of the back-end is shown in Figure 3.5, and the function of each step is as follows:
1. The user starts the software from the command line, providing the paths and names of the CSV files as arguments.
2. The software reads the CSV files provided by the user.
3. The software checks whether the files are correct. If they are incorrect, the software exits with instructions on how to provide correct files. If they are correct, the process continues with data preparation.
4. All the data is imported from the provided files and stored in data containers. Two containers are created for data storage: one for event data and one for node information.
5. If the back-end gets a signal from the front-end, the process goes to the next step; otherwise nothing happens.
6. This step checks the signal from the front-end and updates a Boolean value for the animation. If the signal is to run the animation, the Boolean value is set to true; if the signal is to stop the animation, it is set to false.
7. If the animation Boolean value is true, the process goes to step 8. Otherwise, the process goes back to step 5.
8. The current node information in the node data storage is updated with new events.
9. The updates are sent to the front-end.
10. If no terminate instruction has been received, the process goes back to step 7 to check the animation Boolean value. If a terminate signal has been received, the process goes to step 11.
11. The software exits.
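To make the control logic concrete, the loop around the animation Boolean value can be sketched as a small state machine. This is a simplified illustration: the string-valued signal names are hypothetical stand-ins, while the real software reacts to Qt signals from the front-end:

```cpp
#include <string>

// Simplified sketch of the back-end control loop.
struct BackendLoop {
    bool animate = false;   // the animation Boolean value
    int updatesSent = 0;    // counts updates sent to the front-end

    // Handles one signal; returns false when a terminate signal arrives.
    bool handleSignal(const std::string& signal) {
        if (signal == "start")
            animate = true;         // run the animation
        else if (signal == "stop")
            animate = false;        // stop the animation
        else if (signal == "terminate")
            return false;           // exit the software
        if (animate)                // only update while the flag is true
            ++updatesSent;          // update nodes and notify the front-end
        return true;
    }
};
```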
Figure 3.5: Workflow of the back-end
The information of one event or one node can first be stored in an object, and the object
can then be stored in a data container. An object is like a package that contains several data fields
and functions. For example, an event contains information about the node name, cell ID, frequency,
model ID, state, time, and location. That information can be gathered inside one event object.
All the objects should be sorted and stored in a reasonable way so that data can be found and
read correctly at high speed. The events need to be sorted by time, so the data sequence is
important. For the node information, the data sequence is not significant. There are several data
containers that can be used: a vector or a list for sequential data and an unordered map for non-
sequential data. A list is a sequential container that supports fast insertion and deletion of
data [23]. An unordered map is a container that stores elements as key-value pairs [23]. Event
and node objects can be stored in these containers as shown in Figure 3.6.
Figure 3.6: Data containers
All the event objects are stored in a list, sorted by timestamp. The C Time
Library will be used to handle the timestamps, for instance to store a timestamp in a time type
and to calculate time differences. The current information of each node can be stored in a map
container with the node name as the key and the node object as the value. Through the node name, the
information of a node can be accessed quickly.
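The two containers can be sketched as follows. The field set is reduced for illustration, and the type and function names are my own rather than taken from the actual implementation; the real objects carry more fields (cell ID, frequency, model ID, location, and so on):

```cpp
#include <ctime>
#include <list>
#include <string>
#include <unordered_map>

// Reduced event object: one row of the event data container.
struct Event {
    std::string nodeName;
    std::time_t timestamp;  // handled with the C Time Library
    std::string state;
};

// Reduced node object: the current information of one node.
struct Node {
    std::string name;
    std::string majorityState;
};

// Events are kept in a list sorted by timestamp, since they must be
// processed in time order.
void sortEvents(std::list<Event>& events) {
    events.sort([](const Event& a, const Event& b) {
        return a.timestamp < b.timestamp;
    });
}

// Node information is kept in an unordered map keyed by node name,
// so a node can be looked up quickly.
using NodeMap = std::unordered_map<std::string, Node>;
```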
3.2.5 Two-way communication
Two-way communication between the front-end and the back-end can be implemented based
on QML and C++ integration. The following integration mechanisms are needed for two-way
communication between QML and C++:
• The QObject class can expose C++ class attributes to QML.
• C++ objects can be embedded in QML as context properties.
• Automatic data type conversion is performed by the QML engine.
A developer defines a QObject subclass in C++ and then sets an instance of this class as the context
of the QML items. The QObject instance on the C++ side can then be accessed from the QML side. The
functions in the QObject class can be invoked from JavaScript expressions or signal handlers in QML.
Both QObjects and QML items can emit signals and receive signals via slots.
3.3 Evaluation
It is important that the software is useful for its users and satisfies their expectations. Therefore, it
is important to test and evaluate the software. The software will be evaluated
from two different aspects: the users' feedback and the software performance. The users'
feedback on their experience will help improve the satisfaction of using this software. The
software performance evaluation measures from a technical perspective whether the software is
well implemented.
3.3.1 User Feedback
Usability testing is a method for measuring whether the software is easy to use. Usability is a
significant benchmark for user experience. SUS will be used for the usability testing, and the results
will be analysed. Besides the ten questions from SUS, the questionnaire will also contain an
open question asking for suggestions and feedback. The questionnaire will be sent to 4-6
people in the machine learning team at Ericsson.
3.3.2 Software Performance
There are many ways to evaluate software performance, and some aspects are significant for this
visualization software. Since the software allows users to interact with it, the response
time and rendering time should be short. Besides using this software, the users may need to do
other things on the computer, so the resource usage of this software should be measured.
Hence, the software performance will be evaluated from the following aspects:
1. The preparation time for reading and sorting the data set.
2. The response time after choosing a time point on the timeline.
3. The time taken to render one item on the map.
4. Memory usage.
5. CPU usage.
The time durations will be measured by recording a start time before a function call
and an end time after the function has returned. The duration is the delta between
the start and the end time.
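In C++, this start/end measurement can be done with std::chrono. A minimal sketch, where the helper name is my own:

```cpp
#include <chrono>

// Measures how long f() takes, as the delta between a start time taken
// before the call and an end time taken after it returns.
template <typename Func>
long long measureMs(Func f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
        .count();
}
```

A monotonic clock such as steady_clock is preferable here, since the wall clock can jump during a measurement.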
The top command in Linux allows the user to see the system resource usage of all Linux
processes. Two outputs from this command are shown in Figure 3.7. The first output was
produced when only the new software was running, and the second when only the
old software was running. The "COMMAND" column of the output displays the process name.
"RES" is the physical memory usage of the process in kB. The "%CPU" column
shows what percentage of the CPU time the process has used since the previous output.
"%MEM" shows the percentage of physical memory used by the task, that is, the
value of RES divided by the total physical memory. Solaris mode was used for the top command,
which means that the CPU usage percentage of each task is divided by the total number of
CPUs; therefore, the total CPU usage percentage is 100%. The top command updates
the output every three seconds. The "RES", "%CPU" and "%MEM" values were recorded 60
times during the running time, and their average values were used for the comparison.

The new software has the name "demo", so it is clear that the demo process is the software
process. When running the old web application, three processes are relevant: R, Web
Content, and Firefox. The total resource usage of the old software is the resource usage
summation of R, Web Content and Firefox.
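The totals for the old software are simply the per-process values added up. A small sketch of this aggregation, where the struct and function names are my own:

```cpp
#include <initializer_list>

// One row of top output, reduced to the three recorded columns.
struct Usage {
    double resKb;   // RES: physical memory in kB
    double memPct;  // %MEM
    double cpuPct;  // %CPU
};

// Total resource usage is the sum over the relevant processes
// (for the old software: R, Web Content, and Firefox).
Usage totalUsage(std::initializer_list<Usage> processes) {
    Usage sum{0.0, 0.0, 0.0};
    for (const Usage& p : processes) {
        sum.resKb += p.resKb;
        sum.memPct += p.memPct;
        sum.cpuPct += p.cpuPct;
    }
    return sum;
}
```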
Figure 3.7: Output from the top command
3.3.3 Comparison
The performance of the old software, written in R, and the new software can be compared. The
comparison will be made with the same data set and environment. Since the implementation
idea and method differ between the two applications, it is hard to compare details of the code or
functions. However, it is possible to compare the aspects presented in section 3.3.2.

Besides the performance comparison, other aspects can be compared, for instance how easy it
is to develop the software or a function with these two different approaches.
4. Result

The following chapter presents the results of the software implementation, the performance
evaluation, and the usability testing.
4.1 User Interface
As Figure 4.1 illustrates, the user interface allows users to run and stop the animation. The users
can also drag the time slider and choose a time to start from. The animation speed is controllable,
so the users can increase the speed. The node colour is updated according to the current majority state,
and the information of a selected node is displayed in the information area.
Figure 4.1: User Interface
With the user interface, users can gain a better understanding of the data patterns. This will help
them explain the prediction decisions made by their machine learning algorithms. The user
interface of the old software has a layout similar to this one. Compared to the old user
interface, this one allows the users to change the animation speed, which
improves the usability of the software. Moreover, it provides information about a selected node
and all the cells under that node. The detailed information about each node and cell will help the users
observe the performance of each node and cell.
4.2 Performance of the software
This chapter presents the performance evaluation of the new software from technical aspects.
4.2.1 Data preparation
Data preparation is the first step of the back-end process. Figure 4.2 shows how long it takes
to read, store, and sort the data. The X-axis of the chart shows the number of events and the
Y-axis indicates the time in milliseconds. The data preparation consists of several steps: reading data
from the CSV file, creating objects to store the data, pushing the objects into containers, and sorting
all the objects. The blue line indicates the time in milliseconds for the complete data preparation,
including sorting. The orange line shows the time used only for sorting. The result shows that the total
time to prepare the data is short enough, because the data set will usually not contain more than ten
thousand events. One real data set has 5466 events, and the average over 10 measurements of its
data preparation time is 25 milliseconds. All time values in the chart are averages over 10 runs.
Figure 4.2: Time taken for data preparation of the new software
The comparison between the blue and orange lines shows that reading the data and storing it in
objects takes more time as the data set size grows. The sorting time also increases with the data
amount, but its growth rate is lower than that of the data reading.
4.2.2 Response time for the selected time
The user can drag the time point anywhere on the timeline slider, and the software
updates all the information up to the selected time. The duration between when the time
was selected and when the update up to that time was done was measured. The time point
was dragged to 20 different positions on the timeline, and the durations were recorded.
The time taken to update the data up to the selected time was between 132 and 245
milliseconds. The average of these 20 records is 168.9 milliseconds.
4.2.3 CPU and memory usage
The CPU usage percentage of the software was recorded 60 times, and the records are displayed in
Figure 4.3. The CPU clock speed is 2.6 GHz and it has 6 cores. The data comes from the output of the top
command, which updates every three seconds. The X-axis of the figure is the record
number and the Y-axis is the CPU usage percentage.
Figure 4. 3: CPU usage for the new software
From Figure 4.3, we can see that the CPU usage is between 0.5% and 2.4%. When the software has
started and the animation is not running, the CPU usage is around 0.5%. When users drag and drop
the time slider, the usage reaches a peak. The average of these 60 records is 0.96%.
In general, the CPU usage is low, and running this software will not affect the performance of
other processes in the system.

The memory usage of the software is very stable; it is always 0.4%, and the average physical
memory usage (RES) is 138794.93 kB.
4.2.4 Rendering time
The main task of QML is to render items on the user interface. The time taken to render one
circle and to change the colour of one circle was measured. To measure the time spent,
console.time and console.timeEnd were used to output the elapsed time in milliseconds. Each event
was tested 10 times, and the time taken was always 0 or 1 millisecond. The result shows that
creating or modifying one circle takes no longer than 1 millisecond.
4.3 Comparison
Some performance aspects of the new and the old software were compared. This chapter
presents the results of the comparison.
4.3.1 Data preparation
Both the new and the old software need to do data preparation. The data set is the same for both
and contains 5466 events. The data preparation in the new software is
written in C++, and its task is to read the data, create objects, store the objects in a container, and sort
the objects. The preparation in the old software includes reading the data, reorganizing it,
and writing the result to a CSV file. The processes are not the same, so it is hard to
compare the data preparation performance directly. However, data preparation clearly takes a
much longer time in the old software. Its preparation time was measured 10
times, and the average is 16 seconds. The C++ code needs 25
milliseconds, which is far lower than the old software (Figure 4.4). Even though the
preparation processes differ, the large difference still indicates that R handles data with
worse performance.
Figure 4.4: Data preparation comparison
4.3.2 Response and rendering time
After users click the pause button, the new software stops updating immediately. The
response time of the old software, however, is slow: it takes 3-6 seconds to stop updating. This is
because the logic is not efficient and the time for rendering a circle is long. Rendering one circle
takes an average of 8.3 milliseconds, which is long compared with QML, where rendering one
circle takes at most 1 millisecond.
4.3.3 CPU and memory usage
The CPU clock speed is 2.6 GHz and there are 6 cores. For the old software, the CPU usage
percentage was recorded 60 times, as shown in Figure 4.5. The X-axis of this figure is the
record number and the Y-axis is the CPU usage percentage. The old software is a web
application, and three processes were relevant: R, Firefox, and Web Content.
Figure 4.5: CPU usage for the old software
The average CPU usage of R was 16.52%. Web Content used 1.44% of the CPU on average, and
Firefox used 1.42%. Together, the average CPU usage of the old
application is 19.38%. The average CPU usage of the new software is 0.96%, which is much
lower than that of the old software. The memory usage was also recorded. The memory usage
percentage is stable, and the physical memory usage (RES) varies within a very
small range. As shown in Table 4.1, the R process always uses 0.6% of the memory and the average
physical memory usage (RES) is 186510.93 kB. Firefox takes 0.92%, which is 311254.67
kB. Web Content takes 0.8% of the memory, which is 255976.53 kB on average. Together, the
memory usage of the old software is 2.32% and the average physical memory usage is
753742.13 kB.
Table 4.1: CPU and memory usage of the old software

               RES (kB)     %MEM    %CPU
  R            186510.93    0.6     16.52
  Firefox      311254.67    0.92    1.42
  Web Content  255976.53    0.8     1.44
  Sum          753742.13    2.32    19.38
In Figure 4.6, the CPU and memory usage percentages of the two software versions are compared.
It shows that the old software uses more CPU and memory than the new software.
Figure 4.6: CPU and memory usage comparison in percentage
4.3.4 Implementation efficiency
Looking through the code, R is simple to implement in, and the code is much shorter than the
C++ code for the same task. For example, R recognizes the columns in a CSV file and can read
the needed data directly by column name. C++ cannot recognize the columns directly, and it takes
more code to handle the data reading.
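One way to get column-name access in C++ is to read the header row once and build a name-to-index map. This is a hedged sketch of that extra work, not the thesis implementation: the helper names are my own, and quoted CSV fields are not handled:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Splits one CSV line on commas (no support for quoted fields).
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ','))
        fields.push_back(field);
    return fields;
}

// Maps each header name to its column index, so later rows can be
// accessed by column name, similar to what R offers out of the box.
std::unordered_map<std::string, std::size_t>
columnIndex(const std::string& headerLine) {
    std::unordered_map<std::string, std::size_t> index;
    auto names = splitCsvLine(headerLine);
    for (std::size_t i = 0; i < names.size(); ++i)
        index[names[i]] = i;
    return index;
}
```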
4.4 Usability testing result
Five responses to the SUS questionnaire were received, and the results are shown in Table 4.2. The
highest score is 95 and the lowest is 70. The average score is 85, which is above 70 and
means that the usability of the software is acceptable.

Besides the ten questions of the questionnaire, the users could give feedback in an open
question. The feedback showed that there were some bugs: for instance, the software sometimes
crashed and the map had a flickering problem. The problems were investigated, and the bugs
have been fixed in an improved version.
Table 4.2: Result of usability testing with SUS questions

  SUS Questions                                            P1   P2   P3   P4  P5  AVG
  1. I think that I would like to use this system
     when needed.                                           5    4    5    3   4  4.2
  2. I found the system unnecessarily complex.              1    1    1    1   2  1.2
  3. I thought the system was easy to use.                  4    5    5    4   4  4.4
  4. I think that I would need the support of a
     technical person to be able to use this system.        1    1    2    1   2  1.4
  5. I found the various functions in this system
     were well integrated.                                  5    5    4    4   4  4.4
  6. I thought there was too much inconsistency in
     this system.                                           2    2    1    3   2  2.0
  7. I would imagine that most people would learn
     to use this system very quickly.                       5    5    5    3   4  4.4
  8. I found the system very cumbersome to use.             1    2    1    2   2  1.6
  9. I felt very confident using the system.                5    5    4    3   4  4.2
  10. I needed to learn a lot of things before I
      could get going with this system.                     1    1    1    2   2  1.4
  Percentile SUS Score                                     95 92.5 92.5   70  75   85
5. Discussion

In this chapter, the method and the results are discussed. Ethical and societal considerations will
also be presented.
5.1 Method discussion
In this thesis, the data is visualized as an animation with changing values. There are other
methods for showing data changes over time, such as a chart with time on the X-axis and the value
on the Y-axis. Different visualization methods could be implemented and tested by the users,
and then compared and evaluated to find out which one is better.
The development language and framework were chosen based on a simple motivation and comparison.
The purpose of this thesis is not to find the best language or framework; instead, the aim is
to find an efficient method to develop the software and then evaluate its performance.
There are many other approaches for developing this software, and there is limited value in
deeply comparing languages or frameworks in this case.
One reason why the Qt framework was chosen is that it has many modules and plugins.
However, the frequently updated plugins and versions caused some problems. Most of the
plugins, modules, and even some data types have version dependencies and requirements. A
data type that works in an older module version may become completely unrecognized in a newer
version. Therefore, developers should be cautious about version requirements when working with the
Qt framework.
The data set is not very large, and the goal of this software is not to handle large data
sets. The data preparation time increases linearly with the data set size, so a very large data set
could make the data preparation take a long time. In this case study, the data set was
limited and there was no requirement to handle large data sizes. The evaluation therefore
focused on visualization, response speed, resource usage, and user experience. It is difficult to
find a standard way to evaluate different software; the evaluation method always depends
on the purpose and features of the software.
The front end of this software is not very complex, so it is hard to say whether QML performs
well when rendering more complex items. The time to render one circle was measured, but this
function is simple, so the full capability of QML was not examined. To learn more about the
performance and capability of QML, a more complex front end should be used for evaluation.
The time to render one circle was measured as the execution time of one piece of
rendering/drawing code. This may not be the actual time needed to render an item on the
interface, but the time difference for the same function code is still meaningful for
comparison purposes.
To evaluate CPU and memory usage, the top command was used. Data was recorded for the
relevant processes: the demo process for the new software and the R process for the old one.
However, other processes may also be relevant to the software.
Comparing the old and new software is difficult because they were developed with different
languages, frameworks, and logic.
5.2 Result discussion
The performance evaluation shows that the visualization software built with the Qt framework
performs well. The back end is written in C++, which ensures fast data handling and code
execution. QML, the front-end language, also performs well when handling signals and rendering
items. CPU and memory usage are low, so running this software will not affect other tasks.
The comparison of the old and new software indicates that it is better to write CPU-intensive
code in C++ than in R. R takes a long time to handle data; in this case, it takes 16 seconds
to prepare the data for 5466 events. When the waiting time is long, it is better to provide a
progress bar so the user can see how much time remains. Both the CPU and memory usage of the
old software are higher than those of the new software, which means it requires more resources
to run.
Figure 4.3 shows that user interaction leads to increased CPU usage. If a user interacts with
the user interface frequently, the average value will increase. The highest CPU usage observed
is 2.4%, which is still much lower than the old software's average of 19.38%.
The time durations and resource usage were recorded many times, and the average values were
used for comparison and performance evaluation. Using averages increases the reliability and
validity of the results.
The performance difference has several causes, but the main one is likely the performance
difference between C++ and R. A compiled language usually performs better than an interpreted
language. Moreover, R was developed for statistical purposes and is commonly used for data
analysis; visualization software is not the same as statistical software. R also uses more
CPU, probably because all the code must be interpreted at run time.
From the development perspective, the Qt framework with QML is quite easy to learn and use.
The wide range of QML plugins and modules provides strong support for front-end development.
Since the QML code structure is similar to JSON and JavaScript expressions are available in
QML, QML code is easy to write. Compared to the R code, however, the C++ back-end code is
usually longer and takes more time to develop. The Qt framework also has some inconsistencies
in modules and data types between versions, which may cause problems.
Considering all the aspects above, the Qt framework is a usable and efficient approach for
developing visualization software. It is very simple to handle statistical data with R, but it
is less efficient to develop an entire visualization application with R alone.
The usability testing indicates that the software is user friendly, so users do not need to
learn much before using it. None of the bugs reported in the questionnaire could be
reproduced; some may occur due to differences in machine environments. The map flickering
problem is caused by the map plugin version. A map plugin version that works with all Qt
versions was found, but this version has a flickering problem. Other map plugins with better
performance have Qt version and module dependencies. If users can control the Qt version,
version and module dependencies will not be a problem.
5.3 Ethical and societal considerations
The author has considered ethical issues throughout the research process. All data and results
are described honestly. The questionnaire was anonymous, and the participants' information was
protected. The author informed the questionnaire participants about the purpose of this
research and how their feedback would be used. All questions in the questionnaire are
objective and unbiased. The author cooperated with other participants respectfully and tried
to avoid all kinds of discrimination. Confidential data was also protected according to the
agreement between the author and the company.
Visualization software for machine learning data can help users gain a better understanding of
machine learning algorithms and the results they produce. Visualizing data in a more human way
is part of explainable artificial intelligence and contributes to the trustworthiness,
causality, and informativeness of machine learning algorithms. With a better understanding of
how a result is produced, users can make sure that the result will not harm humans or society.
Moreover, the visualization software will also help users provide better products and services
to customers.
All data should be collected and used in accordance with relevant agreements, laws, moral
standards, and policies. The input data used for the software was collected by the company;
the author did not participate in this process.
6. Conclusion

This thesis aims to find an efficient way to develop visualization software. Even though it is
based on a case study, it still provides a possible approach for other similar cases. In
conclusion, the objective of this thesis is met and the research questions can be answered.
Question 1): How can visualization software for time-varying data with geographic information
be implemented, so that the data can be better understood and the software performs well?
This question is answered in Sections 3.1, 3.2, 4.1, 4.2, and 4.4.
The software was developed with C++ and the Qt framework. C++ is the programming language for
the back end. As a compiled language, C++ code executes fast with low CPU and memory usage.
Its object-oriented features make it easy to store information, access stored data, and reuse
code. The data was packaged in objects and then stored in a list and a map. On input, the
correctness of the data set is checked to make sure that the data is controllable and
transparent. The front end was developed with QML, which is a module of the Qt framework.
Animation is used to show data changes in the user interface, and the map and timeline help
visualize the time and geographic information.
In general, the Qt framework is powerful, easy to use, and performs well, but developers need
to be careful about version conflicts and dependencies among Qt, its modules, and its plugins.
The performance evaluation shows that the software performs well in data handling and item
rendering. CPU and memory usage is low, so it will not affect the execution of other
processes. From the user experience perspective, the software is easy to use and requires
little prior knowledge.
Question 2): Compare the performance of the new and old software and identify the advantages
and disadvantages of the two approaches.
This question is answered in Section 4.3. The comparison of the old and new programs indicates
that the software developed in R is less efficient and consumes more resources. Although R
does not perform well in terms of execution speed and resource usage, it is simpler to develop
software with R than with C++. R also provides many data types and functions that support
statistical purposes.