
Linköping University | Department of Computer and Information Science
Bachelor thesis, 16 ECTS | Datateknik
Spring 2020 | LIU-IDA/LITH-EX-G--20/023--SE

Visualization of machine learning data for radio networks

A case study at Ericsson

Bingyu Niu

Supervisors: Daniel Karlsson (Ericsson), Magnus Johansson (Ericsson), Zeinab Ganjei (Linköping University)
Examiner: Mikael Asplund (Linköping University)

Upphovsrätt

This document is held available on the Internet – or its possible replacement – for a period of 25 years from the date of publication, provided that no exceptional circumstances arise.

Access to the document implies permission for anyone to read, download, print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. Transfer of the copyright at a later date cannot revoke this permission. All other use of the document requires the consent of the author. To guarantee authenticity, security and accessibility, there are solutions of a technical and administrative nature.

The author's moral rights include the right to be named as the author, to the extent required by good practice, when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or character.

For additional information about Linköping University Electronic Press, see the publisher's home page http://www.ep.liu.se/.

Copyright

The publishers will keep this document online on the Internet – or its possible replacement – for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Bingyu Niu

Abstract

This thesis presents a method to develop visualization software for time-varying and geographic-based data. The machine learning team at Ericsson has collected data from their machine learning algorithms. The data set contains timestamped and geographic information. To have a better understanding of the results produced by the machine learning algorithms, it is important to understand the pattern of the data. It is hard to see the pattern of the data by only looking at the raw data set, and data visualization software will help the users to get a more intuitive view of the data. To choose a suitable GUI library, three common GUI libraries were compared. The Qt framework was chosen as the GUI library and development framework because of its wide-ranging support for user interface design. Animation is the main method used to visualize the data set. The performance evaluation of the software shows that it handles the back-end data efficiently, renders fast in the front-end and has low memory and CPU usage. The usability testing indicates that the software is easy to use. In the end, the thesis compares its method to a previous method, developed in R. The comparison shows that even though the old method is easier to develop, it has worse performance.

Acknowledgement

I would like to thank my supervisors at Ericsson, Daniel Karlsson and Magnus Johansson, for all their help and support on both practical and theoretical aspects. I also want to thank the examiner Mikael Asplund and the supervisor Zeinab Ganjei from the university, who guided the whole process of the thesis work and provided much advice and many suggestions on academic research.

Table of Contents

Upphovsrätt
Copyright
1. Introduction
   1.1 Background
   1.2 Motivation
   1.3 Aim
   1.4 Approach
   1.5 Delimitation
2. Background
   2.1 Explainable artificial intelligence
   2.2 Time-varying data visualization
   2.3 C++
   2.4 R language
   2.5 Qt software development framework
   2.6 Other GUI libraries
      2.6.1 GIMP Tool Kit: GTK
      2.6.2 wxWidgets
   2.7 Usability testing
   2.8 Related work
3. Method
   3.1 Development language and tool
   3.2 Implementation
      3.2.1 Structure of the software
      3.2.2 User interface
      3.2.3 Front-end
      3.2.4 Back-end
      3.2.5 Two-way communication
   3.3 Evaluation
      3.3.1 User Feedback
      3.3.2 Software Performance
      3.3.3 Comparison
4. Result
   4.1 User Interface
   4.2 Performance of the software
      4.2.1 Data preparation
      4.2.2 Response time for the selected time
      4.2.3 CPU and memory usage
      4.2.4 Rendering time
   4.3 Comparison
      4.3.1 Data preparation
      4.3.2 Response and rendering time
      4.3.3 CPU and memory usage
      4.3.4 Implementation efficiency
   4.4 Usability testing result
5. Discussion
   5.1 Method discussion
   5.2 Result discussion
   5.3 Ethical and societal consideration
6. Conclusion
7. References

List of figures

Figure 1.1: One Node with three cells
Figure 1.2: Relationship of the machine learning algorithms and visualization software
Figure 2.1: Signals and Slots mechanism in Qt
Figure 3.1: Structure of the software
Figure 3.2: Layout of the user interface
Figure 3.3: Structure of a QML file
Figure 3.4: Back-end modules
Figure 3.5: Workflow of the back-end
Figure 3.6: Data containers
Figure 3.7: Output from top command
Figure 4.1: User Interface
Figure 4.2: Time taken for data preparation of new software
Figure 4.3: CPU usage for the new software
Figure 4.4: Data preparation comparison
Figure 4.5: CPU usage for the old software
Figure 4.6: CPU and memory usage comparison in percentage

List of tables

Table 1.1: Input data set
Table 3.1: Comparison of GTK, Qt and wxWidgets
Table 4.1: CPU and memory usage of the old software
Table 4.2: Result of usability testing with SUS questions

1. Introduction

This is a bachelor's thesis on data visualization, written as a case study at Ericsson. In this chapter, the background, motivation, aim, approach, and delimitation will be described.

1.1 Background

Ericsson is one of the biggest networking and telecommunication companies providing Information and Communication Technology (ICT) solutions. Ericsson is a Swedish company that does business in many countries and regions around the world [1].

There is a massive amount of data collected from Ericsson's 2G, 3G, 4G and 5G radio networks, and the machine learning servers/models use this data to provide better services and solutions. The abstract nature of machine learning makes it hard to understand the data provided by machine learning devices.

This thesis was written at the machine learning team at Ericsson AB Linköping. The team consists of twelve software developers, two of whom were the supervisors for this thesis.

User equipment (UE) is the device that is used for communication by end-users. The most common UE is a mobile phone. When the connection between a UE and the core network is weak or lost, the UE needs to find another connection with better performance. In cellular telecommunications, one cell means one network coverage area. Handover is the switching of a connection from one cell to another. The team develops machine learning algorithms to get a better prediction of handover.

Ericsson has radio stations at many sites and the machine learning algorithms are connected to the stations. In this thesis each station will be called a "node". As shown in Figure 1.1, each node has several cells that have different coverage areas. Cells collect UE events such as signal strength from the coverage areas, and those events are then used to train machine learning models. There are different machine learning models for data training. The number of models in one node depends on the number of coverage areas. Each model has several training states, including not trained, not valid, valid, not outdated, outdated, in lobby, and on hold.

Figure 1.1: One Node with three cells

1.2 Motivation

The technology of artificial intelligence and machine learning has developed fast during the past decade. Machine learning models have been applied in different industries to provide better predictions. The users sometimes have difficulty understanding why the models produce a certain result. Explainable artificial intelligence is a concept intended to make the results from those machine learning models more understandable by humans [2]. To visualize the performance of different machine learning models, a Graphical User Interface (GUI) is needed to present the data in a more human-friendly way.

The relationship between the machine learning algorithms and the visualization software is shown in Figure 1.2. The changes of the models' training state are recorded as the result of the machine learning algorithms. These data will be the input data of the visualization software and include the following attributes: node name, cell id, frequency, machine learning model id, timestamp, state of data training, latitude and longitude. Table 1.1 shows what the raw data set looks like. It is hard to see the state changes for different nodes in one area by just looking at the data set. Therefore, it is desirable to have visualization software that can show the changes of each model state for each node in a continuous time flow.

Figure 1.2: Relationship of the machine learning algorithms and visualization software

Table 1.1: Input data set

Node  Cell  Frequency  Model ID   Time             State         Latitude  Longitude
N001  1     347        0x105454   5/15/2019 6:59   VALID         48.5646   1.5864
N001  1     347        0x205454   5/13/2019 20:24  NOT_OUTDATED  48.5646   1.5864
N001  3     1288       0x168402   5/11/2019 20:35  NOT_TRAINED   48.5646   1.5864
N002  7     6300       0x242956   5/12/2019 2:42   ON_HOLD       48.5874   1.5789
N002  2     347        0x245688   5/12/2019 2:42   NOT_VALID     48.5874   1.5789
N003  5     2850       0x229898   5/12/2019 2:42   VALID         48.5836   1.5584
N004  4     1288       0x158912   5/12/2019 17:48  ON_HOLD       48.5744   1.5784
...   ...   ...        ...        ...              ...           ...       ...

Here are some scenario examples that are expected from the visualization software.

Scenario one:
From the visualization, users see that the majority state of the models in one node becomes ready in a short time, then the state of this node changes very fast during the following three days, and after that, the state of this node becomes stable. It is interesting for the users to see this, so they can consider why the model state changes like this.

Scenario two:
There is one node that is never ready, and its colour is always red. The user clicks on the node and sees that there are several models applied to this node. Some of them work well and their states are ready, but some of them perform worse. In this way, it is easy to see which model has better performance.

Scenario three:
The visualization software helps to show that different areas have their own pattern of model performance. The state changes and the majority state can differ between geographical areas, and this may depend on factors such as city size, number of user equipment, etc.

Previous thesis students developed visualization software in the form of a web application written in the R language [3]. However, this software has a long response time when users interact with the user interface. Because of the long response time, this software is not in use. Thus, new visualization software with better performance is needed.

It is important and interesting to find proper methods to develop the software and evaluate its performance. This thesis develops new visualization software and then compares it to the previous software to find out the differences between the two approaches from various aspects.

1.3 Aim

The aim of this thesis is to find an approach to develop visualization software that can display the machine learning data of a radio network in an understandable and efficient way. Here are the research questions based on the motivation:

1. How can visualization software for time-varying data with geographic information be implemented, so that the data can be understood better and the software has good performance?

2. How does the performance of the new software compare to the old software, and what are the advantages and disadvantages of the two approaches?

To answer these research questions, several issues need to be examined:

• Find a programming language that executes code fast and is easy to use for software development.
• Find a GUI framework that can fulfil the visualization requirements and has good execution performance.
• Find suitable data structures and containers to store data so that it is easy to reach and use the data.
• Find a way to control the data import process so that the data input is correct, transparent, and controllable.
• Find a method to evaluate the software's performance.
• Compare the performance of the new and old software.
• Find a method to evaluate the users' experience.

1.4 Approach

The study began with reading theory about data visualization, explainable artificial intelligence, development approaches, common GUI libraries and software evaluation. A suitable development language and GUI library were chosen based on the theory study, the requirements and the development environment. The implementation idea was presented, including back-end and front-end. The front-end description focused on how to show the information in a way that is easy for humans to grasp and what kinds of tools were needed. The back-end focused on the data input, data storage and data access. The two-way communication between front-end and back-end was described. After the implementation was done, the software was evaluated from two aspects: the user experience and the software performance. Then, the old software's performance was evaluated. The performance of the two programs was compared and analysed.

1.5 Delimitation

Because of the limited time, the study will choose suitable tools based on a limited theory study and comparison. There are other ways to develop the software, but this thesis will focus on one possible way. The comparison of the new and old software will not consider code quality, algorithm choices, or other detailed issues. Since the data set was not collected by the author, the ethical issues of data collection will not be discussed in this thesis.

2. Background

This chapter presents relevant theory about data visualization, programming languages, GUI libraries and evaluation methods. This theory offers a foundation for answering the research questions. Then, related work is presented.

2.1 Explainable artificial intelligence

Machine learning algorithms have been applied in different industries to provide better predictions. Most of these algorithm structures are non-linear and users have difficulty understanding why the machine learning algorithms produce a certain result [4]. People can control the input data and will get an output from the machine learning algorithms, but how the algorithms make the decision is unknown to the users. This is the so-called "black box" in machine learning, which is not transparent to users [4]. If people do not understand how a decision was made by artificial intelligence, it will be hard to trust and explain the result. Explainable artificial intelligence, which is the opposite of the "black box", is a method that makes the results from machine learning models more understandable to humans [5, 4]. Explainable artificial intelligence aims to provide better explanation and transparency of why a result was reached by machine learning algorithms or artificial intelligence [4, 5].

There are some goals for explainable artificial intelligence, and they include trustworthiness, causality, transferability, informativeness, accessibility, and interactivity [2]. Trustworthiness suggests that the model should be trustworthy and able to act as expected. Causality means that the model can show the relationship between various variables. Transferability requires that the existing model or solution can be applied to other problems. Informativeness means that the models need to provide enough clear information so the users can understand the decision. Accessibility requires that users with different knowledge levels can get the main clues and facts of the model rapidly. Interactivity means that the users can interact with the model.

2.2 Time-varying data visualization

The input data of the visualization software is time-based because all the collected data have a timestamp. Dynamic data that changes over time is called time-varying or time-based data [6]. The characteristics of data updates can sort data into different categories, such as continuous and discontinuous, regular and irregular, noisy and significant [6]. Moreover, according to the behaviour of the data, it can be separated into three different types: regular, periodic and turbulent [7]. Regular data means that the value change has a stable tendency over time, such as rising, stable, or decreasing. Periodic data is for instance temperature, which changes between day and night. Turbulent data means that the data varies a lot in both spatial and temporal aspects [7].

Moere [6] summarized several methods to visualize time-varying data, including static state replacement, time-series plots, static state morphing, and control application. Static state replacement refers to updating a value by replacing the current data value with a new value. The time-series plot method is usually shown as a chart with curves and a timeline. Static state morphing presents data that has been filtered by a selected time interval. Control application requires that the visualization and result can be produced at any time during the execution.

From another point of view, time-varying data can be presented in two different ways: space and animation [8]. With space, the length of time or a time interval can be shown as a line in space. With animation, the visualization view changes based on the change of time.

It is popular to combine spatial information with time animation when visualizing time-dependent data. Interactivity is important for animation and the user should be able to choose the time point or filter the data [8]. The speed of the animation should be clear and slow so the user can see the development and changes clearly. Both stationary data presentation and animation are useful for data visualization, and the choice of type depends on the task requirements and data type [8, 9]. 2D is suitable for data visualization when the data set and task are not very complicated [9].

2.3 C++

In this thesis work, C++ was chosen as the development language for the back-end. This section introduces the features of C++. Seed [10] describes C++ as a general-purpose and cross-platform language which was developed based on the C language. It is an object-oriented language, but since it is an extension of C, it can also be used for structured programming. The object-oriented features offer clear structures for development and code reuse. C++ allows developers to control the resource and memory usage of the system. Direct memory access is available in C++, which improves speed and efficiency. C++ is a compiled language and the compiler translates the code to machine code which can be executed directly by the machine. Code can be compiled directly without going through a virtual machine, which contributes to the fast speed.

2.4 R language

R is the development language of the previous software and this section presents its features. R is a general-purpose programming language for statistical computing that can be used on different platforms. R is an implementation and environment of the S language, where S stands for statistics [11]. R has been widely used for data analysis and data mining. R is an interpreted language, which means that the interpreter first needs to translate the source code to a sequence of instructions and then those instructions can be translated to machine code [12].

2.5 Qt software development framework

Koranne [13] introduces Qt as an application development framework that provides a GUI library for visualization. It is written in C++ and supports many other languages including C++, Java, Python, Go, C#, Ruby, etc. As a cross-platform framework, the source code compiles on many platforms including UNIX, GNU/Linux, and embedded Linux. Qt has many features that can fulfil various needs of developers. Besides the core module, Qt 4 and later versions have many other independent modules and each module can be used independently. Some of the common modules are QtGUI, QtNetwork, QtOpenGL, QtSql, QtXML, and QtSVG. This means that Qt supports a wide range of applications and demands. Qt not only provides a GUI solution but also a wide range of application programming interfaces (APIs) including memory sharing, databases, multi-threading, and network programming. Qt has a commercial version and an open-source version. The commercial version is under a commercial license and the open-source version is under the LGPL license.

In other toolkits, the communication among objects is usually implemented with callback functions, but Qt introduces a special mechanism for this called signals and slots [14]. Each object can send signals to others and receive signals through slots. As shown in Figure 2.1, when some event occurs for an object, the object emits a signal to another object's slot.

Figure 2.1: Signals and Slots mechanism in Qt

Qt provides C++ extensions and those extensions are processed by the Meta-Object Compiler (MOC) [14]. MOC parses those extensions and produces standard C++ sources which can be compiled by a standard C++ compiler. The QObject class is a Qt C++ extension that supports object communication by signals and slots [14].

Qt Creator is a cross-platform integrated development environment (IDE) for developing Qt applications. It supports desktop, mobile and embedded platforms. Qt Creator provides tools to analyse the code performance, including CPU and memory usage [14].

QML is a programming language for developing user interfaces. The syntax of QML is similar to JSON and it also supports JavaScript expressions. QML modules provide the engines and substructures of QML [14]. One module, Qt Quick, offers several visualization components and an animation framework.

2.6 Other GUI libraries

There are other GUI libraries for visualization; in this section, two of them are introduced.

2.6.1 GIMP Tool Kit: GTK

GTK is a cross-platform, object-oriented toolkit for graphical user interfaces released under the LGPL license [13]. The toolkit was originally developed for the GNU Image Manipulation Program (GIMP), which is why it is called the GIMP Tool Kit. It is written in C and supports several programming languages including Python, C/C++, Perl, and Java. The toolkit is part of the GNU project and it is also free [15]. The user interface of GTK contains many widgets including windows, displays, buttons, menus, toolbars, etc. [15].

2.6.2 wxWidgets

The documentation of wxWidgets [16] introduces wxWidgets as a cross-platform GUI library written in C++ that can be compiled by a C++ compiler. It supports several other languages such as C#, Perl and Python. With the growth of features, it can support many toolkits and platforms including GTK and Qt. wxWidgets uses the functions from the native platforms and provides an API for coding GUI applications. Because wxWidgets uses the native API, applications get a native look. There are many GUI components that can support different types of application development. The licence of wxWidgets is the "wxWindows Library Licence", which is similar to the LGPL but has some exceptions.

2.7 Usability testing

Usability is one of the quality requirements for products that interact with users [17]. It means that a product satisfies users' demands and that the interface of the product is easy to use. Usability testing is an evaluation tool to test whether a product interface is easy for users to use or not. The purpose of this testing is to get feedback directly from the users and then improve the product based on the feedback. The testing is usually used in user-centred design and it helps to determine whether the product meets the users' expectations [17]. The testing environment should be realistic, which means that the product needs to be accessible to the real users who are going to test it. There are various methods for usability testing including A/B testing, hallway testing, and expert review [17]. A/B testing means evaluating a variable or element from opposite sides. Usually, the user gets two questions for the same issue, but these two questions A and B are against each other.

Hartson and Pyla [18] describe the System Usability Scale (SUS) as a method to measure the usability of a software or system. SUS provides a questionnaire with ten questions and a scoring system. The questions were designed based on the A/B testing idea: for each aspect, there are two questions against each other. With SUS, it is easier to know whether a system is usable or not. It is widely used nowadays to measure the usability of websites, but it is also suitable for a wide range of digital products including software applications [19].

The SUS questionnaire contains ten standard questions. The first standard question is "I think that I would like to use this system frequently". Since this software will be used by few people on special occasions, the word "frequently" was changed to "when needed". The ten questions in the SUS questionnaire are as follows [19]:

1. I think that I would like to use this system when needed.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.

The score scale for each question is from 1 to 5 points, which stand for strongly disagree, disagree, neutral, agree, and strongly agree. The calculation of the total score follows these rules [19] (a short calculation sketch is given after the score intervals below):

1. The score of an odd-numbered question = points − 1.
2. The score of an even-numbered question = 5 − points.
3. All the scores of both odd and even questions are summed together.
4. SUS score = summed score × 2.5.

The SUS score lies on a scale from 0 to 100 and indicates the usability performance. The interpretation of the score intervals is as follows [19]:

• SUS score < 50: The performance is not acceptable.
• SUS score is between 50–70: The performance is marginal.
• SUS score > 70: The performance is acceptable.
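
As a small sketch of the calculation, the scoring rules can be expressed in C++ as follows; the answers array is a made-up example response, not data from the thesis.

#include <array>
#include <iostream>

// Minimal sketch of the SUS scoring rules described above.
// `answers` holds the raw 1-5 responses to the ten questions, in order.
double susScore(const std::array<int, 10> &answers) {
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        // Questions are 1-indexed: odd-numbered questions score (points - 1),
        // even-numbered questions score (5 - points).
        if ((i + 1) % 2 == 1)
            sum += answers[i] - 1;
        else
            sum += 5 - answers[i];
    }
    return sum * 2.5;  // scale the summed score to 0-100
}

int main() {
    // Example response: "agree" (4) on the odd questions, "disagree" (2) on the even ones.
    std::array<int, 10> answers{4, 2, 4, 2, 4, 2, 4, 2, 4, 2};
    std::cout << "SUS score: " << susScore(answers) << '\n';  // prints 75
    return 0;
}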

2.8 Related work

This thesis was written as a case study; hence it is difficult to find previous studies which are closely related to this thesis. Due to the time limitation, an elaborate literature study was not performed. However, three student theses were found which were written in related areas.

Håkansson [20] evaluated Qt's abilities to create customized graphical components and how easy it is to reuse those components in different projects. Håkansson built a control system for CAN-bus signals with the Qt framework in an embedded system. A prototype of the system architecture was provided.

Anderson [21] researched tools that can quickly build data visualizations for IoT systems. Apache Zeppelin was chosen as a proper tool to visualize IoT data. The study also examined the limitations of this tool for data visualization in IoT systems. The performance of Apache Zeppelin was evaluated by a usability scale, a summed usability metric and interviews.

Karlsson [22] created visualization software to visualize log data. The purpose of the software development was to create a tool that can help the company, OptoNova, to improve their troubleshooting. The visualization software was developed in C++ with Qt. A database was designed and created to store the log data.

3. Method

This part demonstrates how the research questions are answered. First, it presents the chosen development language and tool. After that, it presents the implementation. The last part is about evaluation and testing.

3.1 Development language and tool

Due to the efficiency and execution speed of C++, it will be the development language for the back-end. The current platform is Linux, thus the GUI library should support the Linux OS. Considering the potential usage of other platforms in the future, a cross-platform GUI is a better choice because it provides wider possibilities for future development or usage. The Qt framework was chosen as the development framework because of its wide-ranging support for interface development.

There are many GUI libraries that support C++, but the libraries mentioned in sections 2.5 and 2.6 are mentioned more often than others in literature and online sources. Table 3.1 shows a simple comparison of GTK, Qt and wxWidgets from different aspects.

Table 3.1: Comparison of GTK, Qt and wxWidgets

                      GTK                     Qt                      wxWidgets
Development language  C++                     C++                     C++
Cross-platform        Yes                     Yes                     Yes
Licence               LGPL                    LGPL                    wxWindows Library Licence
Compiler              Standard C++ compiler   Standard C++ compiler   Standard C++ compiler
Libraries/Modules     Limited                 Wide range              Limited
IDE                   No special IDE          Qt Creator              No special IDE
Map plugin            Not bundled             Bundled                 Not bundled

The comparison of the different cross-platform C++ GUI libraries shows that Qt is the better choice for the software development because of its powerful functionality. Since the data visualization should be very easy for users to understand, a powerful toolkit is needed.

The goal is to locate all the nodes on a map, hence the support for a map is important for this project. Qt supports many map plugins such as OpenStreetMap, Mapbox GL, HERE, and Esri. Since Qt 5.5 all these common map plugins are already bundled with Qt. It is very simple to use the map: developers just need to use the plugin key. For example, the plugin key for OpenStreetMap is "osm", so by adding "osm" as the name in the plugin, the map will be loaded.

For GTK, there are some plugins available and the developer needs to get and install those packages before using them. For wxWidgets, there is no clear indication and support for loading a map.

Compared with the other GUI libraries, Qt provides a wider range of modules to support different demands, such as QML and Qt Widgets. To show data in a more intuitive way for humans, Qt, with its wide range of modules and advanced features for data visualization, is a proper choice for this thesis. It should be easy to add more features and functions to the software with Qt modules. Moreover, Qt Creator makes it easy to develop the application and it provides performance analysis tools.

3.2 Implementation

This section describes the structure of the software and then presents how the different parts of the software are implemented.

3.2.1 Structure of the software

The software consists of three parts: the user interface, the front-end, and the back-end. The user interface shows the data visualization and allows users to interact with it. The back-end handles data storage, data sorting, value calculation, and data updating. The front-end is responsible for creating the user interface and rendering items on the user interface. The back-end is implemented in C++ and the front-end is written in QML.

Figure 3.1 shows the structure of the software. The front-end receives instructions from users and then sends update instructions to the back-end. An update instruction can be to start/stop updating or to terminate the software. Based on the instructions, the back-end updates data. Then, the updated information is sent to the front-end. The data update information contains the new changes of the data at the back-end. When the front-end receives the data update information, it updates the user interface with the received data.

Figure 3.1: Structure of the software

3.2.2 User interface

The user interface visualizes the data and allows users to interact with it. To visualize the data, animation is a proper method to simulate events happening over time. When one event occurs, the software can calculate the new value of the majority state of a node and then update the information in the user interface. Different states can be presented in different colours. Since there is no special need for 3D visualization, 2D visualization is an appropriate choice. Because each event has a timestamp and a location, it is reasonable to have the nodes located on a map with a timeline. The user interface will consist of three main parts: an information window, a map window, and a time slider view. Figure 3.2 illustrates the layout of the user interface.

All the nodes can be located inside the map window with different colours which stand for the different states. When the state changes, the node colour changes simultaneously. A timeline can be located inside the time slider view to show time changes. Users can select a certain time on the timeline. Animation control buttons can be located in the same area. In the information window, the detailed data of a node can be presented as text. If the node information has been updated by a new event, the text should be updated at the same time.

Figure 3.2: Layout of the user interface

3.2.3 Front-end

The Qt Quick module and the QML language will be used to implement the front-end. Based on the design of the user interface, the front-end also contains three main modules: the map window, the time slider view and the information window. The map window is responsible for rendering the map and the items on the map, and for handling click events from user interaction. The time slider view shows the changes of time and handles time-selected events. This area is also responsible for animation control. The responsibility of the information window is to get information from the back-end and show detailed information for a selected node. The structure of a QML file is a hierarchy of objects and functions. A QML file needs to have one, and only one, root object. All other objects or functions stay inside the root object. An example structure of a QML file is presented in Figure 3.3. The root object is the application window and under it there are three sub-objects: the map window, the time slider view, and the information window. Under those objects, there are other sub-objects and functions.

Figure 3.3: Structure of a QML file

To implement the components of the front-end, many QML elements will be used, and here are some important QML elements for the implementation:


• QML has a Map type that allows developers to draw different map elements on the map. The MapCircle QML type with a coordinate property is a proper choice for rendering nodes.

• The time flow is an important element of the visualization, so there will be a timeline that shows the time of the animation and allows users to drag the time point on the timeline to decide which moment they want to see. The Slider QML type provides many features to implement the timeline.

• The TextArea QML type can display the information of a selected node. The ScrollView QML type allows the text area to become scrollable.

• To trigger different events and functions at the front-end, the Timer QML type can be applied.

3.2.4 Back-end

The back-end needs to import data from CSV files, store the data, and handle data updating. It also needs to send update information to the front-end. Figure 3.4 shows the main modules of the back-end.

The data preparation module is responsible for importing data from CSV files and creating data containers for data storage. It creates one data container to store the events' information and another container to store the nodes' information. The data update module is responsible for receiving update instructions from the front-end, updating node data in the back-end and sending update information to the front-end. When the data update module receives an update instruction from the front-end, it reads new events from the event data container. Then, it updates the node information in the node data container and sends the update information to the front-end. Information of a selected node can be read from the node data container.

Figure 3.4: Back-end modules
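
A minimal C++ sketch of such a CSV import step is given below. The simple comma-splitting and the column order follow Table 1.1; the assumption of plain, unquoted fields is an illustration, not a description of the actual thesis code.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line into fields; assumes simple comma-separated values
// without quoted commas, matching the columns shown in Table 1.1.
std::vector<std::string> splitCsvLine(const std::string &line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ','))
        fields.push_back(field);
    return fields;
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        std::cerr << "usage: demo <events.csv>\n";   // file path given on the command line
        return 1;
    }
    std::ifstream in(argv[1]);
    if (!in) {
        std::cerr << "could not open " << argv[1] << '\n';
        return 1;
    }
    std::string line;
    std::getline(in, line);                // skip the header row
    while (std::getline(in, line)) {
        auto f = splitCsvLine(line);       // node, cell, frequency, model id, time, state, lat, long
        if (f.size() == 8)
            std::cout << "event for node " << f[0] << " in state " << f[5] << '\n';
    }
    return 0;
}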

The workflow of the back-end is shown in Figure 3.5, and the function of each step is as follows (a compressed code sketch of the update loop follows the list):

1. The user runs the software from the command line. The CSV files' paths and names should be provided on the command line. The software starts.

2. The software reads the CSV files that are provided by the user.

3. The software checks whether the files are correct or not. If the files are incorrect, the software exits with instructions about how to provide correct files. If the files are correct, the process goes on to data preparation.

4. In this step, all the data is imported from the provided files and stored in data containers. Two containers are created for data storage: one to store event data and another for node information.

5. If the back-end gets a signal from the front-end, the process goes to the next step, otherwise nothing happens.

6. This step checks the signal from the front-end and updates a Boolean value for the animation. If the signal is to run the animation, the Boolean value is set to true. If the signal is to stop the animation, the Boolean value is set to false.

7. If the animation Boolean value is true, the process goes to step 8. Otherwise, the process goes back to step 5.

8. The current node information in the node data storage is updated by new events.

9. The new updates are sent to the front-end.

10. If there is no terminate instruction, the process goes back to step 7 to check the Boolean value of the animation. If a terminate signal is received, the process goes to step 11.

11. The software exits.
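
The sketch below compresses steps 5-10 into plain C++. The Event and Node types, the flag names and the notifyFrontend placeholder are all illustrative; the real software drives this loop through Qt signals and slots rather than a plain while loop.

#include <atomic>
#include <iostream>
#include <list>
#include <string>
#include <unordered_map>

struct Event { std::string node; std::string state; long long time; };
struct Node  { std::string majorityState; };

// Placeholder for sending an update to the front-end (step 9).
void notifyFrontend(const std::string &node, const Node &info) {
    std::cout << node << " -> " << info.majorityState << '\n';
}

// Flags toggled by instructions from the front-end (steps 5-6).
std::atomic<bool> animationRunning{false};
std::atomic<bool> terminateRequested{false};

void updateLoop(std::list<Event> &events, std::unordered_map<std::string, Node> &nodes) {
    auto next = events.begin();                          // events are sorted by timestamp
    while (!terminateRequested && next != events.end()) {// step 10: exit on terminate signal
        if (!animationRunning)                           // step 7: only advance while running
            continue;                                    // (simplified busy-wait)
        nodes[next->node].majorityState = next->state;   // step 8: apply the next event
        notifyFrontend(next->node, nodes[next->node]);   // step 9: push the change to the UI
        ++next;
    }
}

int main() {
    std::list<Event> events{{"N001", "VALID", 1}, {"N002", "ON_HOLD", 2}};
    std::unordered_map<std::string, Node> nodes;
    animationRunning = true;    // step 6: a "run animation" instruction arrived
    updateLoop(events, nodes);  // returns when all events have been applied
    return 0;
}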

Figure 3.5: Workflow of the back-end

The information of one event or one node can first be stored in an object and then the object can be stored in a data container. An object is like a package that contains several data fields and functions. For example, an event contains information about node name, cell ID, frequency, model ID, state, time, and location. That information can be gathered inside one event object. All the objects should be sorted and stored in a reasonable way so that data can be found and read correctly and quickly. The events need to be sorted by time, so the data sequence is important. For the node information, the data sequence is not significant. There are several data containers that can be used: a vector or a list for sequential data and an unordered map for non-sequential data. A list is a sequential container that supports fast insertion and deletion of data [23]. An unordered map is a container that stores elements as key-value pairs [23]. Event and node objects can be stored in these containers as shown in Figure 3.6.

Figure 3.6: Data containers

All the event objects are stored in a list and they are sorted by the timestamp. The C time library will be used to handle the timestamps, for instance to store the timestamp in a time type and to calculate time differences. The current information of each node can be stored in a map container with the node name as the key and the node object as the value. Through the node name, the information of the node can be accessed quickly.
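
The following C++ sketch illustrates this storage idea with a std::list of events sorted by timestamp and a std::unordered_map keyed by node name. The field names of the Event and Node structs are illustrative, chosen to match the attributes listed above, and are not copied from the thesis software.

#include <ctime>
#include <iostream>
#include <list>
#include <string>
#include <unordered_map>

struct Event {
    std::string node;      // node name, e.g. "N001"
    int         cell;
    int         frequency;
    std::string modelId;
    std::time_t time;      // timestamp handled with the C time library
    std::string state;     // e.g. "VALID", "ON_HOLD"
    double      latitude;
    double      longitude;
};

struct Node {
    std::string name;
    std::string majorityState;
    double      latitude;
    double      longitude;
};

int main() {
    // Events are kept in a sequential container and sorted by timestamp.
    std::list<Event> events;
    // ... events are filled in during data preparation ...
    events.sort([](const Event &a, const Event &b) { return a.time < b.time; });

    // Current node information is keyed by node name for fast lookup.
    std::unordered_map<std::string, Node> nodes;
    nodes["N001"] = Node{"N001", "VALID", 48.5646, 1.5864};

    // Accessing a node by its name:
    const Node &n = nodes.at("N001");
    std::cout << n.name << " majority state: " << n.majorityState << '\n';
    return 0;
}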

3.2.5 Two-way communication

Two-way communication between the front-end and the back-end can be implemented based on the QML and C++ integration. The following integration mechanisms are used for two-way communication between QML and C++:

• The QObject class can expose C++ class attributes to QML.
• C++ objects can be embedded in QML as context properties.
• The QML engine performs automatic data type conversion.

A developer needs to define a QObject class in C++ and then set an instance of this class as a context property of the QML items. The QObject class on the C++ side can then be accessed from the QML side. The functions in the QObject class can be invoked by JavaScript expressions in QML or by signal handlers. Both QObject and QML items can emit signals and receive signals via slots.
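
A minimal C++ sketch of this integration is shown below. The Backend class, its method names and the "backend" context property name are hypothetical examples, and the sketch assumes a Qt Quick project with a main.qml file and moc enabled.

#include <QGuiApplication>
#include <QObject>
#include <QQmlApplicationEngine>
#include <QQmlContext>
#include <QString>

// Hypothetical back-end class; the names are illustrative, not the thesis code.
class Backend : public QObject {
    Q_OBJECT
public:
    // Callable from QML via JavaScript expressions or signal handlers.
    Q_INVOKABLE void setAnimationRunning(bool running) {
        m_running = running;
        if (running)
            emit nodeUpdated("N001", "VALID");   // notify the front-end of a change
    }
signals:
    void nodeUpdated(const QString &node, const QString &state);
private:
    bool m_running = false;
};

int main(int argc, char *argv[]) {
    QGuiApplication app(argc, argv);
    QQmlApplicationEngine engine;

    Backend backend;
    // Embed the C++ object in QML as a context property named "backend".
    engine.rootContext()->setContextProperty("backend", &backend);

    engine.load(QUrl(QStringLiteral("qrc:/main.qml")));  // assumed QML entry file
    return app.exec();
}

#include "main.moc"  // the QObject subclass is defined in this .cpp file

On the QML side, backend.setAnimationRunning(true) could then be called from a signal handler, and a Connections element (or an onNodeUpdated handler) could react to the nodeUpdated signal.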

3.3 Evaluation

It is important that the software is useful for the users and satisfies their expectations. Therefore, the software needs to be tested and evaluated. The performance of the software will be evaluated from two different aspects: the users' feedback and the software performance. The users' feedback on their experience will help to improve the satisfaction of using this software. The software performance evaluation measures, from a technical point of view, whether the software is well implemented or not.

3.3.1 User Feedback

Usability testing is a method for measuring whether the software is easy to use or not. Usability is a significant benchmark for user experience. SUS will be used for the usability testing and the result will be analysed. In addition to the ten questions from SUS, the questionnaire will also contain an open question which asks for suggestions and feedback. The questionnaire will be sent to 4-6 people in the machine learning team at Ericsson.

3.3.2 Software Performance

There are many ways to evaluate software performance and some aspects are significant for this visualization software. Since it is software that users interact with, the response time and rendering time should be short. Besides using this software, the users may need to do other things on the computer, thus the resource usage of this software should be measured. Hence, the software performance will be evaluated from the following aspects:

1. The preparation time for reading and sorting the data set.
2. The response time after choosing a time point on the timeline.
3. The time used for rendering one item on the map.
4. Memory usage.
5. CPU usage.

Time duration measurements are made by recording a start time before a function call and an end time after the function has returned. The duration is the delta between the start and end times.
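
Such a measurement can be done, for example, with std::chrono; the sketch below is a generic illustration of the approach, with prepareData standing in as a placeholder for whichever function is being measured.

#include <chrono>
#include <iostream>

// Placeholder for the function being measured, e.g. the data preparation step.
void prepareData() { /* read CSV, fill containers, sort events ... */ }

int main() {
    // Record a start time before the call and an end time after it returns;
    // the duration is the difference between the two.
    const auto start = std::chrono::steady_clock::now();
    prepareData();
    const auto end = std::chrono::steady_clock::now();

    const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "Data preparation took " << ms.count() << " ms\n";
    return 0;
}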

The top command in Linux allows the user to see the system resource usage of all the Linux processes. Two outputs from this command are shown in Figure 3.7. The first output was produced when only the new software was running, and the second output when only the old software was running. The "COMMAND" column of the output displays the process name. "RES" is the physical memory usage of the process in KiB. The "%CPU" column presents what percentage of the CPU time the process has used since the previous output. "%MEM" shows the percentage of physical memory used by the task, i.e. the value of RES divided by the total physical memory. Solaris mode was used for the top command, which means the CPU usage percentage of each task is divided by the total number of CPUs; therefore, the total CPU usage percentage is 100%. The top command updates the output every three seconds. The "RES", "%CPU" and "%MEM" values were recorded 60 times during the running time and their average values were used for the comparison.

The new software has the name "demo", so it is very clear that the process demo is the software process. When running the old web application, three processes are relevant: R, Web Content, and Firefox. The total resource usage of the old software is the sum of the resource usage of R, Web Content and Firefox.

Figure 3.7: Output from top command

3.3.3 Comparison

The performance of the old software written in R and the new software can be compared. The comparison is made on the same data set and in the same environment. Since the implementation ideas and methods differ between the two applications, it is hard to compare details of the code or functions, but it is possible to compare the aspects presented in section 3.3.2.

Besides the performance comparison, other aspects can be compared, for instance how easy it is to develop the software or to develop a function with each of the two approaches.

4. Result

The following chapter presents the results of the software implementation, the performance evaluation and the usability testing.

4.1 User Interface

As Figure 4.1 illustrates, the user interface allows users to run and stop the animation. The users can also drag the time slider and choose a time to start from. The animation speed is also controllable, so the users can increase the speed. The node colour is updated according to the current majority state, and the information of a selected node is displayed in the information area.

Figure 4.1: User Interface

With the user interface, users can get a better understanding of the data patterns. This will help them to explain the prediction decisions made by their machine learning algorithms. The user interface of the old software has a similar layout to this user interface. Compared to the old user interface, this user interface allows the users to change the speed of the animation, which improves the usability of the software. Moreover, it also provides information about a selected node and all the cells under this node. The detailed information about each node and cell helps the users to observe the performance of each node and cell.

  • 20

    4.2 Performance of the software

    This chapter presents the new software performance evaluation from the technical aspects.

    4.2.1 Data preparation

Data preparation is the first step of the process at the back-end. Figure 4.2 shows how long it takes to read, store, and sort the data. The X-axis of the chart shows the number of events and the Y-axis indicates the time in milliseconds. The data preparation consists of several steps: read data from the CSV file, create objects to store the data, push the objects into containers, and sort all the objects. The blue line indicates the time in milliseconds for the whole data preparation, including sorting, while the orange line shows the time used only for sorting. The result shows that the total time to prepare the data is short enough, since a data set will usually not contain more than ten thousand events. One real data set has 5466 events, and the average data preparation time over 10 runs is 25 milliseconds. All time values in the chart are averages over 10 runs.

Figure 4.2: Time taken for data preparation of new software

The comparison between the blue and orange lines shows that reading the data and storing it in objects takes more time as the data set size grows. The sorting time also increases with the amount of data, but its growth rate is lower than that of the data reading.
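As an illustration of what this preparation step can look like, the following is a minimal C++ sketch that reads a CSV file, stores each row in an object, and sorts the objects by time. The Event fields, the column order, and the file format are illustrative assumptions and not the exact implementation used in the software.

    #include <algorithm>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Illustrative event record; the real software stores more fields per event.
    struct Event {
        long long timestamp;   // assumed time key used for sorting
        std::string nodeId;    // assumed node identifier
        std::string state;     // assumed state value
    };

    // Read events from a CSV file, store them in a container and sort them by time.
    std::vector<Event> prepareData(const std::string& csvPath) {
        std::vector<Event> events;
        std::ifstream file(csvPath);
        std::string line;
        std::getline(file, line);                 // skip the header row
        while (std::getline(file, line)) {
            std::stringstream row(line);
            std::string time, node, state;
            if (std::getline(row, time, ',') &&
                std::getline(row, node, ',') &&
                std::getline(row, state, ',')) {
                events.push_back({std::stoll(time), node, state});
            }
        }
        std::sort(events.begin(), events.end(),
                  [](const Event& a, const Event& b) { return a.timestamp < b.timestamp; });
        return events;
    }

Timing such a function before and after the sorting step, for example with std::chrono, gives the kind of measurements that are plotted in Figure 4.2.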

    4.2.2 Response time for the selected time

The user can drag the time point anywhere on the timeline slider, and the software will update all information up to the selected time. The duration between selecting a time and finishing the update up to that time was measured. The time point was dragged to 20 different positions on the timeline and the durations were recorded. The time taken to update the data up to the selected time was between 132 and 245 milliseconds, and the average of these 20 records is 168.9 milliseconds.
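Since the events are kept sorted by timestamp (Section 4.2.1), one possible way to implement such an update is to find the end of the relevant range with a binary search and then replay the events up to it. The sketch below only illustrates this idea; the minimal Event type and the apply callback are placeholders, not the software's actual update logic.

    #include <algorithm>
    #include <functional>
    #include <vector>

    // Minimal stand-in for the back-end event type (illustrative field only).
    struct Event {
        long long timestamp;
    };

    // Replay every event whose timestamp is <= selectedTime by calling apply on it.
    // Because the events are sorted by timestamp, the end of the range can be found
    // with a binary search instead of scanning the whole container.
    void updateToSelectedTime(const std::vector<Event>& events,
                              long long selectedTime,
                              const std::function<void(const Event&)>& apply) {
        auto end = std::upper_bound(
            events.begin(), events.end(), selectedTime,
            [](long long t, const Event& e) { return t < e.timestamp; });
        for (auto it = events.begin(); it != end; ++it) {
            apply(*it);
        }
    }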

    4.2.3 CPU and memory usage

The CPU usage percentage of the software was recorded 60 times and the records are displayed in Figure 4.3. The CPU clock speed is 2.6 GHz and it has 6 cores. The data is the output of the top command, which updates every three seconds. The X-axis of the figure is the record number and the Y-axis is the CPU usage in percent.


Figure 4.3: CPU usage for the new software

From Figure 4.3, we can see that the CPU usage is between 0.5% and 2.4%. When the software has started but the animation is not running, the CPU usage is around 0.5%. When the user drags and drops the time slider, the usage reaches a peak. The average value of these 60 records is 0.96%. In general, the CPU usage is low, and running this software will not affect the performance of other processes in the system. The memory usage of the software is very stable: it is always 0.4%, and the average physical memory usage (RES) is 138794.93 kB.

    4.2.4 Rendering time

The main task of QML is to render items on the user interface. The time taken to render one circle and to change the colour of one circle was measured. To measure the time spent, console.time and console.timeEnd were used to output the elapsed time in milliseconds. Each operation was tested 10 times, and the measured time was always 0 or 1 millisecond. The result shows that creating or modifying one circle takes no longer than 1 millisecond.

    4.3 Comparison

Some performance aspects of the new and old software were compared. This section presents the results of that comparison.

    4.3.1 Data preparation

Both the new and the old software need to do data preparation. The data set is the same for both and contains 5466 events. The data preparation in the new software is written in C++, and its task is to read the data, create objects, store the objects in a container, and sort them. The preparation in the old software includes reading the data, reorganizing it, and writing it to a CSV file. The processes are not the same, so it is hard to compare the data preparation performance directly. However, the data preparation clearly takes much longer in the old software: measured 10 times, its average preparation time is 16 seconds, while the C++ preparation takes 25 milliseconds (Figure 4.4). Even though the preparation processes differ, the large difference still indicates that R has worse data-handling performance.


Figure 4.4: Data preparation comparison

    4.3.2 Response and rendering time

After the user clicks the pause button, the new software stops updating immediately. The old software responds slowly and takes 3-6 seconds to stop updating. This is because its logic is less efficient and rendering a circle takes long: the old software renders one circle in 8.3 milliseconds on average, whereas QML takes at most 1 millisecond to render one circle.

    4.3.3 CPU and memory usage

The CPU clock speed is 2.6 GHz and there are 6 cores. For the old software, the CPU usage in percent was recorded 60 times, as shown in Figure 4.5. The X-axis of this figure is the record number and the Y-axis is the CPU usage in percent. The old software is a web application, and there are three relevant processes: R, Firefox, and Web Content.

Figure 4.5: CPU usage for the old software

The average CPU usage of R was 16.52%, Web Content used 1.44% of the CPU on average, and Firefox used 1.42%. Together, the average CPU usage of the old application is 19.38%. The average CPU usage of the new software is 0.96%, which is much lower than that of the old software. The memory usage was also recorded. The memory usage percentage is stable, and the physical memory usage (RES) varies only within a very small range. As shown in Table 4.1, the R process always uses 0.6% of the memory and its average


physical memory usage (RES) is 186510.93 kB. Firefox takes 0.92%, which is 311254.67 kB, and Web Content takes 0.8% of the memory, which is 255976.53 kB on average. Together, the memory usage of the old software is 2.32% and the average physical memory usage is 753742.13 kB.

Table 4.1: CPU and memory usage of the old software

Process        RES (kB)      %MEM    %CPU
R              186510.93     0.6     16.52
Firefox        311254.67     0.92    1.42
Web Content    255976.53     0.8     1.44
SUM            753742.13     2.32    19.38

In Figure 4.6, the CPU and memory usage percentages of the two software versions are compared. It shows that the old software uses more CPU and memory than the new software.

Figure 4.6: CPU and memory usage comparison in percentage

    4.3.4 Implementation efficiency

Looking through the code, R is simpler to work with and the code is much shorter than the C++ code for the same task. For example, R recognizes the columns in a CSV file and can read the needed data directly by column name, whereas C++ has no built-in notion of CSV columns, so more code is needed to handle the data reading.
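In R this typically amounts to a read.csv call followed by access such as data$column, while a C++ program first has to locate the column index from the header row itself. The sketch below shows roughly what that looks like in C++; the helper names are made up for illustration and the parser does not handle quoted fields.

    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    // Split one CSV line into its fields (no support for quoted fields).
    std::vector<std::string> splitCsvLine(const std::string& line) {
        std::vector<std::string> fields;
        std::stringstream ss(line);
        std::string field;
        while (std::getline(ss, field, ',')) fields.push_back(field);
        return fields;
    }

    // Read all values of the column named columnName from a CSV file.
    std::vector<std::string> readColumn(const std::string& csvPath,
                                        const std::string& columnName) {
        std::ifstream file(csvPath);
        std::string line;
        std::getline(file, line);                        // header row
        std::vector<std::string> header = splitCsvLine(line);
        std::size_t index = 0;
        while (index < header.size() && header[index] != columnName) ++index;

        std::vector<std::string> values;
        if (index == header.size()) return values;       // column not found
        while (std::getline(file, line)) {
            std::vector<std::string> fields = splitCsvLine(line);
            if (index < fields.size()) values.push_back(fields[index]);
        }
        return values;
    }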

    4.4 Usability testing result

Five responses to the SUS questionnaire were received, and the results are shown in Table 4.2. The highest score is 95 and the lowest is 70. The average score is 85, which is above 70, meaning that the usability of the software is acceptable.
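The scores in the last row of Table 4.2 are consistent with the standard SUS scoring rule: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5. The following small C++ sketch shows the calculation, using participant P1's responses from Table 4.2 as an example.

    #include <array>
    #include <iostream>

    // Compute a SUS score from ten item responses (each 1-5) using the standard
    // rule: odd items contribute (response - 1), even items contribute
    // (5 - response), and the sum is scaled by 2.5 to a 0-100 range.
    double susScore(const std::array<int, 10>& responses) {
        double sum = 0.0;
        for (int i = 0; i < 10; ++i) {
            sum += (i % 2 == 0) ? responses[i] - 1     // items 1, 3, 5, 7, 9
                                : 5 - responses[i];    // items 2, 4, 6, 8, 10
        }
        return sum * 2.5;
    }

    int main() {
        // Responses of participant P1 taken from Table 4.2.
        std::array<int, 10> p1 = {5, 1, 4, 1, 5, 2, 5, 1, 5, 1};
        std::cout << susScore(p1) << '\n';             // prints 95
    }

For P1 the item contributions sum to 38, and 38 x 2.5 = 95, which matches the first score in the table.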

Besides the ten questions of the questionnaire, the users could give feedback in an open question. The feedback showed that there were some bugs; for instance, the software sometimes crashed and the map had a flickering problem. These problems were investigated again, and the bugs have been fixed in an improved version.


Table 4.2: Result of usability testing with SUS questions

SUS Questions                                                                                    P1   P2   P3   P4   P5   AVG
1. I think that I would like to use this system when needed.                                     5    4    5    3    4    4.2
2. I found the system unnecessarily complex.                                                     1    1    1    1    2    1.2
3. I thought the system was easy to use.                                                         4    5    5    4    4    4.4
4. I think that I would need the support of a technical person to be able to use this system.    1    1    2    1    2    1.4
5. I found the various functions in this system were well integrated.                            5    5    4    4    4    4.4
6. I thought there was too much inconsistency in this system.                                    2    2    1    3    2    2
7. I would imagine that most people would learn to use this system very quickly.                 5    5    5    3    4    4.4
8. I found the system very cumbersome to use.                                                    1    2    1    2    2    1.6
9. I felt very confident using the system.                                                       5    5    4    3    4    4.2
10. I needed to learn a lot of things before I could get going with this system.                 1    1    1    2    2    1.4
SUS score                                                                                        95   92.5 92.5 70   75   85


5. Discussion
In this chapter, the method and the results are discussed. Ethical and societal considerations are also presented.

    5.1 Method discussion

In this thesis, the data is visualized as an animation in which values change over time. There are other methods to show how data changes over time, such as a chart with time on the X-axis and value on the Y-axis. Different visualization methods could be implemented and tested by the users, and then compared and evaluated to find out which one works best.

The development language and framework were chosen based on a brief motivation and comparison. The purpose of this thesis is not to find the best language or framework; instead, the aim is to find an efficient way to develop the software and then evaluate its performance. There are many other possible approaches for developing this software, and a deep comparison of languages or frameworks is of limited value in this case.

One reason why the Qt framework was chosen is that it offers many modules and plugins. However, the frequently updated plugins and versions caused some problems. Most plugins, modules, and even some data types have version dependencies and requirements. A data type that works in an older module version may become completely unrecognized in a newer one. Therefore, developers should be cautious about version requirements when using the Qt framework.

The size of the data set is not very large, and the goal of this software is not to handle large data sets. The data preparation time increases linearly with the data set size, so a very large data set may take a long time to prepare. In this case study, the data set was limited and there was no requirement to handle large data sizes. The evaluation therefore focused on visualization, response speed, resource usage, and user experience. It is difficult to find a standard way to evaluate different software; the evaluation method always depends on the purpose and features of the software.

In this software, the front-end is not very complex, so it is hard to say whether QML performs well when rendering more complex items. The time to render one circle was measured, but this operation is simple, so the full capability of QML was not examined. To learn more about the performance and capability of QML, a more complex front-end should be used for the evaluation.

The time taken to render one circle was measured as the execution time of one piece of rendering/drawing code. This may not be the true time needed to render an item on the interface, but the difference between equivalent code in the two applications is still meaningful for comparison purposes.

To evaluate the CPU and memory usage, the top command was used. The data of the relevant processes was recorded, such as the demo process for the new software and the R process for the old one. However, other processes may also be relevant for the software.

The comparison of the old and new software is difficult because they were developed with different languages, frameworks, and logic.


    5.2 Result discussion

The results of the performance evaluation of the new software show that the visualization software built with the Qt framework performs well. The back-end is written in C++, which ensures fast data handling and code execution. QML, as the front-end language, also performs well when handling signals and rendering items. The CPU and memory usage are low, so running this software will not affect other tasks.

The comparison of the old and new software indicates that it is better to write CPU-intensive code in C++ than in R. R takes a long time to handle data; in this case it takes 16 seconds to prepare the data of 5466 events. If the waiting time is long, it is better to provide a progress bar so the user can see how much time remains. Both the CPU and memory usage of the old software are higher than those of the new software, which means it takes more resources to run.

Figure 4.3 shows that user interaction leads to increased CPU usage. If a user interacts with the user interface very often, the average value will increase. The highest CPU usage value is 2.4%, which is still much lower than the old software's average of 19.38%.

The duration and resource usage values were recorded many times, and the average values were used for the comparison and performance evaluation. Using averages increases the reliability and validity of the results.

The performance difference has several causes, but the main one is likely the performance difference between C++ and R. A compiled language usually performs better than an interpreted language. Moreover, R was developed for statistical purposes and is commonly used for data analysis; visualization software is not the same as statistical software. R also uses more CPU, possibly because all the code needs to be interpreted at run time.

From the development perspective, the Qt framework with QML is quite easy to learn and use. The wide range of plugins and modules in QML provides strong support for front-end development. Since the QML code structure is similar to JSON and JavaScript expressions are available in QML, it is easy to write QML code. However, compared to the R code, the C++ code in the back-end is usually longer and takes more time to develop. The Qt framework also has some inconsistencies between modules and data types in different versions, which may cause trouble. Considering all the aspects above, the Qt framework is a usable and efficient approach for developing visualization software. It is very simple to handle statistical data with R, but it is less efficient to develop a complete visualization software with R alone.

The result of the usability testing indicates that the software is user friendly, so the users do not need to learn much before using it. The bugs reported in the questionnaire could not be reproduced; some of them may occur due to differences between machine environments. The map flickering problem is caused by the map plugin version. A map plugin version that works with all Qt versions was found, but this version has the flickering problem. Other map plugins with better performance have Qt version and module dependencies. If the users can control the Qt version, these version and module dependencies will not be a problem.


    5.3 Ethical and societal consideration

The author has considered ethical issues during the whole research process. All data and results are described honestly. The questionnaire was anonymous, and the participants' information was protected. The author informed the questionnaire participants about the purpose of this research and how their feedback would be used. All questions in the questionnaire are objective and without bias. The author cooperated with the other participants with respect and tried to avoid all kinds of discrimination. Confidential data was also protected according to the agreement between the author and the company.

Visualization software for machine learning data can help the users get a better understanding of the machine learning algorithms and the results produced by those algorithms. Visualizing data in a more human-readable way is part of explainable artificial intelligence and contributes to the trustworthiness, causality, and informativeness of machine learning algorithms. With a better understanding of how a result is produced by the algorithms, the users can make sure that the result will not harm humans or society. Moreover, the visualization software will also help the users provide better products and services to customers.

All data should be collected and used in accordance with relevant agreements, laws, moral standards, and policies. The input data used for the software was collected by the company, and the author did not participate in that process.


6. Conclusion
This thesis aims to find an efficient way to develop visualization software. Even though the thesis is based on a case study, it still provides a possible approach for other similar cases. In conclusion, the objective of this thesis is met, and the research questions can be answered.

Question 1): How can a visualization software for time-varying data with geographic information be implemented, so that the data can be understood better and the software has good performance?

This question is answered in Sections 3.1, 3.2, 4.1, 4.2, and 4.4.

The software was developed with C++ and the Qt framework. C++ is the programming language of the back-end. As a compiled language, C++ code executes fast with low CPU and memory usage, and its object-oriented features make it easy to store information, access stored data, and reuse code. The data is packaged in objects and then stored in a list and a map. For the data input, the correctness of the data set needs to be checked to make sure that the data is controllable and transparent. The front-end was developed with QML, which is part of the Qt framework. Animation is used to show the data changes in the user interface, and the map and timeline help to visualize the geographic and time information.

In general, the Qt framework is powerful, easy to use, and performs well, but the developer needs to be careful with version conflicts and dependencies among Qt, its modules, and plugins. The performance evaluation shows that the software handles data and renders items well. The CPU and memory usage is low, so it will not affect the execution of other processes. From the user experience perspective, the software is easy to use and does not require much prior knowledge.

Question 2): Compare the performance of the new and old software and find the advantages and disadvantages of these two approaches.

This question is answered in Section 4.3. The comparison of the old and the new program indicates that the software developed in R is less efficient and costs more resources to run. Although R does not perform well in terms of execution speed and resource usage, it is simpler to develop software with R than with C++. R also provides many data types and functions that support statistical purposes.


7. References
[1] Ericsson, "Ericsson", 2020. [Online]. Available: https://www.ericsson.com/en/about-us.
[2] A. B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina, R. Benjamins, R. Chatila and F. Herrera, "Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI", Information Fusion, vol. 58, pp. 82-115, 2020. DOI: 10.1016/j.inffus.2019.12.012.
[3] V. Antonov and A. Sterner, "Methods for developing visualizations in a non-designer environment: A case study", Linköping University Electronic Press, 2019.
[4] T. Miller, "Explanation in artificial intelligence: Insights from the social sciences", Artificial Intelligence, vol. 267, pp. 1-38, 2019. DOI: 10.1016/j.artint.2018.07.007.
[5] T. Miller, "But why? Understanding explainable artificial intelligence", XRDS: Crossroads, The ACM Magazine for Students, vol. 25, no. 3, pp. 20-25, 2019. DOI: 10.1145/3313107.
[6] A. V. Moere, "Time-Varying Data Visualization Using Information Flocking Boids", IEEE Symposium on Information Visualization, Austin, 2004. DOI: 10.1109/INFVIS.2004.65.
[7] C. Wang, H. Yu and K.-L. Ma, "Importance-Driven Time-Varying Data Visualization", IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 6, pp. 1547-1554, 2008. DOI: 10.1109/TVCG.2008.140.
[8] W. Huang, Handbook of human centric visualization, Springer, 2014. DOI: 10.1007/978-1-4614-7485-2.
[9] J. Bertin, Semiology of graphics: diagrams, networks, maps, Esri Press, Redlands, 2011. ISBN: 9781589482616.
[10] G. M. Seed, An introduction to object-oriented programming in C++: with applications in computer graphics, Springer, 2001. ISBN: 1852334509.
[11] G. Sawitzki, Computational Statistics: An Introduction to R, CRC Press, Boca Raton, 2009. ISBN: 9781420086812.
[12] N. Matloff, The art of R programming: a tour of statistical software design, No Starch Press, San Francisco, 2011. ISBN: 9781593274108.
[13] S. Koranne, Handbook of Open Source Tools, Springer, 2011. ISBN: 9781441977182.
[14] M. Piccolino, Qt 5 projects: develop cross-platform applications with modern UIs using the powerful Qt framework, Packt Publishing, 2018. ISBN: 9781788295512.
[15] GTK, "What is GTK, and how can I use it?", 2020. [Online]. Available: https://www.gtk.org/.
[16] wxWidgets, "About", 2020. [Online]. Available: https://www.wxwidgets.org/about/.
[17] J. Rubin and D. Chisnell, Handbook of usability testing: how to plan, design, and conduct effective tests, Wiley, 2008. ISBN: 9780470185483.
[18] R. Hartson and P. S. Pyla, The UX book: process and guidelines for ensuring a quality user experience, Morgan Kaufmann, 2012. ISBN: 9780123852410.
[19] W. Albert and T. Tullis, Measuring the user experience: collecting, analyzing, and presenting usability metrics, Elsevier, Amsterdam, 2013. ISBN: 9780124157927.
[20] F. Håkansson, "Platform for Development of Component Based Graphical User Interfaces", Uppsala universitet, Institutionen för informationsteknologi, 2010.
[21] J. Anderson, "Visualisation of data from IoT systems: A case study of a prototyping tool for data visualisations", Linköpings universitet, Programvara och system, 2017.
[22] K. Hanna, "Visualization of Log Data from Industrial Inspection Systems", Linköpings universitet, Tekniska högskolan, 2007.
[23] S. B. Lippman, J. Lajoie and B. E. Moo, C++ primer (5th ed.), Addison-Wesley, 2013. ISBN: 9780321714114.
