
Bayesian belief networks for guided remote diagnostics and troubleshooting of heavy vehicles

MAHBUB HUSSAIN KAMALY

Master of Science Thesis

Stockholm, Sweden 2013


Bayesian belief networks for guided remote diagnostics and troubleshooting of heavy vehicles

by

Mahbub Hussain Kamaly
Master of Science Thesis MMK 2013:58 MDA 45

KTH Industrial Engineering and Management

Machine Design

SE-100 44 STOCKHOLM

“Seek knowledge and learn (for science) peace and honour, and be humble to the person who taught you.”

Prophet Muhammad (saw)

Summary

Cost reduction and more efficient repairs (e.g. in the automotive industry) have been the goal of research on guided diagnostics for nearly two decades [1], with the aim of intuitive troubleshooting and repair without prior expert knowledge. This means that automation of diagnostics has become a necessity, making it possible to understand complex systems while giving the operator enough support and expert knowledge to provide competent assistance. This thesis, carried out at Scania CV AB, investigates how such a system would be designed and perform, and lays the groundwork for further development of guided remote diagnostics at Scania.

The results cover three areas of analysis. First, the observations from the vehicle that indicate a faulty system. Second, the use of a Bayesian network to diagnose the system, and an examination of whether the approach supports intuitive understanding. Third, a study and implementation of an efficient troubleshooting algorithm that minimizes the repair cost based on the given diagnosis, the cost of repairing components and the repair time. The thesis first presents an in-depth theory part, followed by the implementation of the theory in a functional prototype.

Master of Science Thesis MMK 2013:58 MDA 45

Bayesian belief networks for guided remote diagnostics and troubleshooting of heavy vehicles

Examiner: Jan Wikander
Supervisor: Bengt Eriksson
Commissioner: Scania CV AB
Contact person: Jonas Biteus

Abstract.

Intuitive troubleshooting and fault repair without the need for prior expert knowledge of automobiles has become essential for minimizing cost and improving the effectiveness of repairs, and it has been a focus of troubleshooting research for the past decade or two [1]. This calls for an automated diagnosis system that is simple to understand and operate while providing the operator with the expert knowledge required for competent assistance. This master’s thesis, conducted at Scania CV AB, investigates how such a system would function and perform, providing groundwork for further development.

The results cover three aspects of analysis. First, the observations from the vehicle indicating that something is wrong or faulty. Second, the use of a Bayesian network, a model structure that describes probabilistic relationships and dependencies among system variables, for diagnostic purposes, and an examination of its effect on the intuitive understanding of system faults. Third, an implementation and study of a troubleshooting algorithm that minimizes the cost of repair based on an easily calculated metric that takes into consideration the probability of fault, the cost of observation and the cost of repair (and indirectly also the mean repair time). Given a particular diagnosis, an optimized action plan and repair sequence is produced. A thorough review of the underlying theory is provided in the first part of the report, where a slight deviation is made to investigate the use of Bayesian filters and their effect on the a priori probabilities used in the Bayesian model. In the final part the reader is guided through the implementation of the given theory and the emergence of a working prototype.

Acknowledgements

Jonas Biteus, for your extremely valuable contribution to the Bayesian model and for your guidance and feedback.

Hakan Warnqvist, for your help and input.

Mattias Nyberg, for your refreshing input.

Anton Einarson, for the tremendous help and support in programming.

Bengt Eriksson, for your input.


Contents

Declaration of Authorship
Acknowledgements
List of Figures
List of Tables
Abbreviations

1 Introduction
  1.1 Short introduction to DTC
  1.2 Objectives
    1.2.1 User requirements
    1.2.2 Research at Scania
  1.3 Delimitations
    1.3.1 Division of subsystems
  1.4 Previous work
    1.4.1 Products on the market
  1.5 Thesis outline

2 Learning Models
  2.1 Fuzzy logic and Dempster-Shafer evidence theory
  2.2 A brief comparative analysis of learning models
    2.2.1 Neural networks
      2.2.1.1 A single element
      2.2.1.2 Architecture of neural networks
        Feed-forward networks
        Feedback networks
    2.2.2 Knowledge-based methods
    2.2.3 Conclusion of analysis
  2.3 Bayesian Networks
  2.4 Bayesian probability theory
    2.4.1 Joint probability distribution
    2.4.2 Conditional independence
  2.5 Definitions
    2.5.1 Inference in Bayesian networks
      2.5.1.1 Query-Based Inference
  2.6 Parameter learning via Kalman filters
    2.6.1 Problem statement
    2.6.2 Conclusions
  2.7 Summary

3 Previous research on Bayesian methods for diagnostics
  3.1 Bayesian network for vehicle systems
  3.2 Bayesian network for diagnostics
  3.3 Summary

4 Decision-Theoretic Troubleshooting
  4.1 The troubleshooting process
  4.2 The optimal decision tree
  4.3 General assumptions of decision-theoretic troubleshooting
    4.3.1 Stochastic dynamical system
  4.4 Troubleshooting strategies
  4.5 Expected cost of repair
    4.5.1 Efficiency index
    4.5.2 The cost distribution
  4.6 Estimated cost of repair after tests
  4.7 Summary

5 Method
  5.1 Requirements and research
  5.2 Modelling and development
  5.3 Development framework
    5.3.1 Google Web Toolkit
  5.4 Unit and integration testing
    5.4.1 Equivalent partitioning
    5.4.2 Comparison testing
    5.4.3 Fuzz testing (or negative testing)

6 Design and implementation
  6.1 Bayesian network system modeling
    6.1.1 Model assumptions and delimitations
    6.1.2 Model component variables
    6.1.3 Observation variables
    6.1.4 A priori probabilities of components
  6.2 Cost of observation and repair
    6.2.1 Cost of observation
    6.2.2 Cost of repair
    6.2.3 ICL component costs
  6.3 Troubleshooting methods
    6.3.1 Testing methods
    6.3.2 Repair methods
  6.4 The final Bayesian network model
  6.5 Troubleshooting algorithm

7 Results and verification
  7.1 System integration
  7.2 Verification
    7.2.1 Equivalent partitioning
    7.2.2 Comparison testing
    7.2.3 Fuzz testing

8 Conclusion
  8.1 Discussion
    8.1.1 Bayesian models
  8.2 Future work
    8.2.1 Linking to previous research

A Bayesian network model of the ICL

Bibliography

List of Figures

1.1 The user scenario of remote diagnostics
1.2 The overall architecture of the DiaGuide project.
1.3 The three subsystems of guided remote diagnostics.
1.4 An off-board diagnostic solution by Volvo - Remote Diagnostics [2]
2.1 A Threshold Logic Unit (TLU) [3]
2.2 A combined feedforward and layered network [4]
2.3 A feedback network [4]
2.4 A directed acyclic graph (DAG)
2.5 Given its parents (A), a node (B) is conditionally independent of its non-descendants (E).
2.6 The Bayesian-network structure, consisting of an interdependent set of issues [5]
2.7 The ongoing discrete Kalman filter cycle. [6]
3.1 Bayesian network structure for vehicle stability control system [7]
4.1 A decision tree representing all the possible solutions for troubleshooting a device with two components [8]
5.1 The overall process in the V-model [9]
5.2 The workflow from Java source code to a JavaScript web application via GWT.
5.3 General Classification of Test Techniques
6.1 The Bayesian network model of the ICL
6.2 One step horizon troubleshooting flowchart
7.1 The graphical user interface (GUI) of the diagnostics demonstrator
7.2 The diagnosis, given fault code (DTC) 0234. The less likely components are unordered and, as given and expected in Table 7.1, the Red CAN bus is indicated to be faulty.
A.1 The Bayesian network model of the ICL

List of Tables

1.1 A summary of DTC specifications
2.1 A simple coin toss test to determine the true probability
2.2 The joint probability table for Figure 2.4
4.1 An example of different TS-sequences
4.2 Cost of observation, repair and probability of fault for all components
4.3 Estimated cost of repair for different TS-strategies
4.4 Efficiency index for all three components
6.1 The different types of nodes defined in the network
6.2 The component variables in the network
6.3 The observation nodes in the network
6.4 The a priori probabilities for the components
6.5 The observation and repair cost for the components in the Bayesian model
6.6 The tests for the instrument cluster
6.7 The repair methods for the instrument cluster
7.1 Input conditions for equivalent partitioning with one valid and two invalid outputs. The test case was conducted on the integrated system.
7.2 A comparison between Assistans+ and GeNIe outputs for a given set of inputs.

Abbreviations

BN      Bayesian Network
JPD     Joint Probability Distribution
GM      Graphical Model
DAG     Directed Acyclic Graph
ICL     Instrument Cluster
EMS     Engine Management System
ECU     Electronic Control Unit
DTC     Diagnostic Trouble Code
ANN     Artificial Neural Network
ECR     Estimated Cost of Repair
ECRT    Estimated Cost of Repair after Tests
TS      Troubleshooting Strategy
SDK     Software Development Kit
GWT     Google Web Toolkit
IDE     Integrated Development Environment
API     Application Programming Interface
AJAX    Asynchronous JavaScript and XML
XML     Extensible Markup Language
OEM     Original Equipment Manufacturer

Dedicated to my parents Anwar & Fathema H. Kamaly, without whose love, support and perseverance I would not be where I am today

Chapter 1

Introduction

“If anything can go wrong, it will” - Edward A. Murphy

As system complexity and interdependencies in trucks increase rapidly, the requirements on troubleshooting and maintenance are becoming more time-critical. The complexity of the system stems from the many and complicated relationships among components and subsystems. A fault on a truck needs to be found, isolated and repaired as quickly as possible to minimize vehicle downtime and hence reduce the cost of troubleshooting and repair. Today, the time from fault detection to completed repair depends very much on the skill and experience of the mechanic and on his or her ability to diagnose the fault correctly and find an appropriate action.

Hence, it is crucial to find methods where diagnosis and troubleshooting are done more effectively to achieve minimal lead time. One way is to automate processes or sub-processes, which not only increases effectiveness and productivity but also increases the reliability of the diagnosis [10]. Of course, the reliability is only as good as the model used, which calls for extensive work on development as well as on verification and validation of the models.

The concept of troubleshooting describes the process of locating the root of a fault or defect and the appropriate action to resolve that issue. Statistical models that aim to mimic the behaviour of a physical system play an essential role in an effective troubleshooting process. A fault can be described both by a diagnostic trouble code (DTC), which is the systematic error flag that describes an occurrence of an abnormality in the electronic control unit (ECU) and is stored in the ECU it originates from, and through visual observations from the driver. An example of such an observation could be when the driver detects smoke from the engine or when the temperature gauge on the instrument cluster (ICL) is stuck. A short introduction to DTCs can be found in Section 1.1.


Whereas DTCs are tied to a specific component or system and are traceable, and hence mappable, through the information given by the corresponding ECU, visual observations can have multiple causes, which is why they need to be translated into a form that the computer can interpret.

1.1 Short introduction to DTC

The terms DTC and ECU will be used in this thesis and are abbreviations for Diagnostic Trouble Code and Electronic Control Unit. DTCs are defined as follows (according to the Scania Lexicon):

Definition 1.1. (DTC)

A code indicating a discrepancy in an electronic control system (ECU) and providing information

about why the code has been created.

DTCs are created when there is a risk of function degradation. They should not indicate an unknown fault and should not be false, i.e. a DTC should only exist when there is a known fault in a system ECU. Each DTC contains information on the details of the error. These are summarized in Table 1.1.

Attribute          Description
Heading            DTC number and diagnostic area.
Detection          Description of how the diagnostic test works and when it reacts. This description should contain threshold values if applicable.
Cause              Description of possible causes of the fault indication.
System reactions   The consequences of the fault, how the fault is recognized.
Actions            How the faulty component is reached and some brief information about repair methods.

Table 1.1: A summary of DTC specifications

DTCs carry additional information regarding the status (whether the DTC is active, validated or passive), the number of times the DTC has been active, a time stamp (date and time when it was created in the ECU) and a freeze frame (a snapshot of the operational data at the time of DTC creation).¹

¹ Information about DTCs was taken from the Scania Lexicon, which is not available outside Scania.
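As an illustration only, the DTC attributes from Table 1.1 and the additional information listed above could be collected in a single record. The following Python sketch assumes hypothetical names and example values (the class DiagnosticTroubleCode, the enumeration DtcStatus and the freeze-frame signals are not taken from Scania's systems); it merely shows one possible way of structuring the data.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Dict


class DtcStatus(Enum):
    """Status values named in the text; the enum itself is an assumption."""
    ACTIVE = "active"
    VALIDATED = "validated"
    PASSIVE = "passive"


@dataclass
class DiagnosticTroubleCode:
    # Attributes corresponding to the rows of Table 1.1.
    heading: str
    detection: str
    cause: str
    system_reactions: str
    actions: str
    # Additional information carried by a DTC.
    status: DtcStatus
    activation_count: int
    created_at: datetime
    # Snapshot of operational data; signal names are hypothetical.
    freeze_frame: Dict[str, float] = field(default_factory=dict)

    def is_active(self) -> bool:
        return self.status is DtcStatus.ACTIVE


# Hypothetical example record, for illustration only.
example = DiagnosticTroubleCode(
    heading="0234 (hypothetical diagnostic area)",
    detection="Hypothetical: no message received within a time threshold.",
    cause="Hypothetical: open circuit or faulty connector.",
    system_reactions="Hypothetical: warning lamp is lit.",
    actions="Hypothetical: inspect connector and cable.",
    status=DtcStatus.ACTIVE,
    activation_count=3,
    created_at=datetime(2013, 5, 1, 12, 0, 0),
    freeze_frame={"vehicle_speed_kmh": 72.0, "battery_voltage_v": 27.6},
)
```

Keeping the descriptive attributes and the run-time information in one record makes it straightforward to pass a DTC as a single observation to a diagnostic model.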



1.2 Objectives

The aim of this thesis is to construct a prototype for guided troubleshooting of the instrument cluster in a Scania truck. The overall goal is to minimize the cost of repair and the downtime for heavy vehicles. This is achieved by adapting algorithms from research on troubleshooting and from several investigations into the use of Bayesian networks (explained in Chapter 3) for diagnostics and troubleshooting.

The main objectives of this thesis are:

1. Investigate the use of knowledge-based methods for diagnostics

2. Evaluate models and algorithms for effective troubleshooting

3. Perform a study on parameter learning of the Bayesian network via filters

4. Perform a full system analysis regarding performance and reliability

Further, this thesis aims to investigate how such an automated and model-based troubleshooting

system would behave and perform. As the entire chain from fault detection to repair actions

will be investigated, the end result will depend greatly on the separate subsystems and their

performance.



1.2.1 User requirements

The general user scenario for remote diagnostics and troubleshooting is shown in Figure 1.1. Any driver, at any moment, should be able to connect to a central operator who checks the health status of the vehicle and guides the driver through a troubleshooting process of either finding a fault or finding the means to repair it. During this time, the workshop receives a detailed plan of the vehicle and can arrange mechanics and spare parts before the driver arrives, which makes the repair process considerably more cost- and time-effective.

Figure 1.1: The user scenario of remote diagnostics

Some requirements on the system, given by Scania, are listed below:

• The possibility to extract fault codes (DTCs) from the vehicle.

• The storage of vehicle information in a central database, a server named Backoffice. This information will be used by several clients (e.g. assistance or workshops).

• The ability to extract expert knowledge (e.g. from experienced mechanics) and use it in the diagnostic process.

• From the given DTCs and other observations perceived by the driver, the ability to make a diagnosis of which component is causing an error or fault.

• The ability to derive the minimum cost of repair from the given diagnosis and extract an action plan on how to troubleshoot in the most cost-effective way.

• Soft deadlines on events in the diagnostics process; however, the user should experience it as real-time (perceived real-time).

• The ability to establish telecommunication between the driver and the assistance operator during diagnostics.

• The ability to send and initiate tests on the vehicle remotely.

1.2.2 Research at Scania

Scania has been conducting research in this area since 2008 in a project named DiaGuide, which is partly funded by VINNOVA, Sweden’s Innovation Agency. The project is expected to end in 2014 with a working prototype. This thesis aims to develop a demonstrator, i.e. an initial prototype that demonstrates the functionality and possibilities of the DiaGuide concept. The overall architecture is shown in Figure 1.2. The work in this thesis focuses on the parts outlined in red in the figure, namely the backoffice and the user interface for assistance.

Figure 1.2: The overall architecture of the DiaGuide project.

1.3 Delimitations

Although the overall aim of this thesis is to investigate how a belief-network diagnostics system will perform in heavy vehicles, it is not possible to consider the whole vehicle. The work is therefore delimited to one part of the vehicle.

The work in this thesis focuses only on the instrument cluster (ICL) unit. Hence, the designed model will only consider this ECU. However, it is important to note that an ECU is very rarely isolated completely from the rest of the vehicle. Several ECUs commonly share sensors and/or communication protocols, which is why these dependencies must be included in the model for it to be correct.

Furthermore, the data for the model will be restricted to expert knowledge and some statistical data showing the fault probability of certain components given an error log (however, this dataset will be very limited).

The work will be divided into and focused on three subsystems, which are detailed in Section 1.3.1 below.

1.3.1 Division of subsystems

In order to meet the requirements given in Section 1.2.1, the system needs to be divided into several subsystems. This simplifies the task and also makes testing and verification easier, since each subsystem can be unit tested. The requirements can be divided into three subsystems as given below:

1. Extraction and insertion of the observations via the Backoffice. The system that handles these is considered a subsystem.

2. The diagnostic system, where the underlying model provides the calculation of fault probabilities.

3. A planner, which implements an algorithm that gives the action plan minimizing the cost of repair, given the diagnosis from the subsystem above.

This is visualised in Figure 1.3 below:

Figure 1.3: The three subsystems of guided remote diagnostics.
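The division into three subsystems can be sketched as a simple chain of functions. The Python sketch below is a minimal illustration under stated assumptions: the function names, the lookup-table scoring used in place of the Bayesian model, and the ordering of components by fault probability per unit repair cost are all hypothetical placeholders and not the implementation developed in this thesis.

```python
from typing import Dict, List


def collect_observations(dtcs: List[str], driver_reports: List[str]) -> List[str]:
    """Subsystem 1 (sketch): merge DTCs and driver observations into one evidence list."""
    return sorted(set(dtcs) | set(driver_reports))


def diagnose(observations: List[str],
             fault_table: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Subsystem 2 (sketch): placeholder for the Bayesian model.

    Each component gets a score from a hypothetical lookup table, and the
    scores are normalised so that they sum to one.
    """
    scores = {
        component: sum(weights.get(obs, 0.0) for obs in observations)
        for component, weights in fault_table.items()
    }
    total = sum(scores.values())
    if total == 0:
        return scores
    return {component: score / total for component, score in scores.items()}


def plan(fault_probabilities: Dict[str, float],
         repair_costs: Dict[str, float]) -> List[str]:
    """Subsystem 3 (sketch): order components by probability per unit cost, highest first."""
    return sorted(
        fault_probabilities,
        key=lambda c: fault_probabilities[c] / repair_costs[c],
        reverse=True,
    )


# Hypothetical example data, for illustration only.
fault_table = {
    "red_can_bus": {"DTC_0234": 0.8},
    "display": {"DTC_0234": 0.1, "gauge_stuck": 0.1},
}
repair_costs = {"red_can_bus": 400.0, "display": 50.0}

observations = collect_observations(["DTC_0234"], ["gauge_stuck"])
probabilities = diagnose(observations, fault_table)
action_plan = plan(probabilities, repair_costs)
```

In this made-up example the display is placed first in the action plan even though it is the less probable fault, because it is much cheaper to address. Keeping the three steps as separate functions also mirrors the idea that each subsystem can be tested on its own.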



1.4 Previous work

In the past, extensive research has been conducted on the troubleshooting problem with belief-network diagnosis as the point of focus. As Heckerman et al. [1] argue, the primary objective is not only to detect and determine what is wrong, but also to find a strategy for fixing that fault. This can be achieved through a set of observations, tests or repairs.

For this thesis, a Bayesian network model of the XPI system was provided in the initial stage, in order to become familiar with the use of Bayesian models for diagnostic purposes. The given model was the result of a previous Master’s thesis at Scania [11]. This initial model is never used in this thesis, but inspiration was taken from its structure and from the written thesis (which is cited here a couple of times).

1.4.1 Products on the market

There are a few products on the market today which implement an off-board diagnostic solution, at least to a certain extent. Volvo Trucks has a product which conducts remote diagnostics on their trucks while the driver is on the road. The aims are similar to those of this thesis, namely uptime management and downtime protection. The remote diagnostic system provides a detailed analysis of critical fault codes and connects the vehicle to the dealer, workshop, original equipment manufacturer (OEM) and decision-makers [2]. The diagnostic chain is shown in Figure 1.4.

Figure 1.4: An off-board diagnostic solution by Volvo - Remote Diagnostics [2]



1.5 Thesis outline

A thorough review of the underlying theory is provided in the first part of the thesis (Chapters 2 and 4), with a slight deviation to investigate the use of Bayesian networks in research today (Chapter 3). The reader will also be familiarized with some basic troubleshooting algorithms and cost-minimizing methods. The method used in this thesis is explained in Chapter 5. In the final part (Chapters 6 and 7) the reader is guided through the implementation of the given theory and the emergence of the application Assistans+.

In the concluding part (Chapter 8) the results are discussed and an outlook on future work and development is given.


Chapter 2

Learning Models

When handling uncertainty in failure diagnostic systems, three theories are mainly prevalent [12]. Fuzzy logic, Dempster-Shafer evidence theory and Bayesian probability theory all deal well with uncertainty and allow for the design of dynamic probabilistic systems and the calculation of probabilities according to acquired data.

Interest in Bayesian networks, which are based on Bayesian probability theory, has grown since the 1990s. The approach has a solid mathematical foundation and is used in diverse fields of science, such as data mining, medical diagnosis and software analysis. To motivate the use of a Bayesian network, it must be investigated whether it is applicable to the targeted field.

2.1 Fuzzy logic and Dempster-Shafer evidence theory

As mentioned, two of the theories that deal with uncertainty are fuzzy logic and Dempster-Shafer theory. It is worth pointing out why those theories have been ruled out in this thesis and why a Bayesian approach is used instead. First of all, the premises of this thesis work paved the way for an investigation of Bayesian methods and models. This was the approach preferred by Scania, and its advantages had been studied in earlier Master’s theses [13][11] and PhD dissertations [14].

Even though this was the premise for this thesis work, the logic behind the decision is rather clear. Fuzzy set theory was first proposed by UC Berkeley professor Lotfi Zadeh in 1965 and laid the groundwork for fuzzy logic, which he put forward in 1973 [15]. Fuzzy logic handles mathematical sets, or groups of items, somewhat differently from classical set theory. Usually, an element either belongs to a set or it does not. In fuzzy logic, elements can belong to different sets to varying degrees, so the traits that the element shares with the set can be considered while the rest are omitted. In other words, fuzzy logic is applied to make machines see the world in a more human way. Just as human reasoning has degrees of truth, so does fuzzy logic. Instead of seeing the world as either 1 or 0, things might be slightly or almost so, and hence the machine can act upon varying degrees of states. Paradoxically, fuzzy logic actually has a high power of precisiation of what is imprecise [16].
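The idea of partial set membership can be made concrete with a small example. The Python sketch below uses a triangular membership function, which is one common choice in fuzzy logic; the temperature ranges and the set names "warm" and "hot" are hypothetical values chosen only for illustration.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Degree of membership (0..1) of x in a fuzzy set with a triangular shape."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets for an engine temperature in degrees Celsius.
temperature = 85.0
warm = triangular(temperature, 60.0, 80.0, 100.0)   # (100 - 85) / 20 = 0.75
hot = triangular(temperature, 80.0, 100.0, 120.0)   # (85 - 80) / 20 = 0.25

# A classical (crisp) set only allows full membership or none.
crisp_hot = 1.0 if temperature >= 90.0 else 0.0
```

Here the same temperature belongs to "warm" with degree 0.75 and to "hot" with degree 0.25, whereas the crisp set classifies it as simply not hot.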

In the target system of this thesis, a fault either exists or it does not; there is nothing in between. The notion of almost faulty cannot be applied because, as defined in Section 4.3.1, a variable (or component) is found to be in exactly one finite state. Furthermore, a single-fault system is considered in this thesis (defined in Section 4.3), which means that the faulty state is caused by one component which is in one definite fault state.
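One consequence of the single-fault assumption is that the fault hypotheses are mutually exclusive, so Bayes' rule over the components yields probabilities that sum to one. The Python sketch below illustrates this with hypothetical component names, prior probabilities and likelihoods; it is a generic application of Bayes' rule and not the model developed later in the thesis.

```python
from typing import Dict


def single_fault_posterior(priors: Dict[str, float],
                           likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Posterior P(component faulty | observation) under a single-fault assumption.

    priors[c]      : P(c is the faulty component), summing to one over c
    likelihoods[c] : P(observation | c is the faulty component)
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("The observation is impossible under every hypothesis.")
    return {c: value / total for c, value in joint.items()}


# Hypothetical numbers, for illustration only.
priors = {"display": 0.5, "red_can_bus": 0.25, "sensor": 0.25}
likelihoods = {"display": 0.2, "red_can_bus": 0.8, "sensor": 0.0}

posterior = single_fault_posterior(priors, likelihoods)
```

With these made-up numbers the observation shifts most of the probability to the CAN bus (about two thirds), while the sensor is ruled out entirely because it cannot produce the observation.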

Dempster-Shafer theory, on the other hand, was developed as an approach to generalize probability theory by attempting to combine distinct bodies of evidence [17]. More precisely, Dempster-Shafer theory is a generalization of the Bayesian theory of subjective probability. The difference is that while the latter commits probabilities to each subject or element at hand, the degree of belief in Dempster-Shafer theory is given by a belief function rather than a Bayesian probability distribution. Hence, probabilities are assigned to sets of possibilities rather than to single events [18]. This is an unwanted attribute for the target system in this thesis, as the point of interest is to find specific probabilities, and a generalization would add another, undesirable level of abstraction.

In conclusion, this section has made clear why this thesis considers Bayesian probability theory only and why the others have been omitted. However, another approach which appears to be similar to Bayesian methods is neural networks, and it’s useful to investigate this further and draw a conclusion on which approach is best for this thesis.



2.2 A brief comparative analysis of learning models

An analysis of various network models and a comparison between them is needed to motivate the choice and use of Bayesian networks and Bayesian methods. In this section, a brief comparative analysis with regard to neural networks is made, concluding with the reasons why a Bayesian approach is taken. The motivation for comparing neural networks and Bayesian networks is that neural networks might seem like an appropriate candidate for the application field at first glance.

2.2.1 Neural networks

Neural networks [19] are described by a network of non-linear elements, interconnected through variable parameters. The input to each element is the weighted sum of the outputs of other elements, thus mimicking biological neurons. Artificial neural networks, ANNs (to distinguish them from biological neural networks), are used in machine learning and in the construction of artificially intelligent systems [3]. An important note on the subject is that Bayesian methods can be, and are, implemented on neural networks; they are not only a property of Bayesian networks [20].

ANNs learn through examples, just like their biological counterpart. ANNs acquire and process data, and the interconnected elements and their bonds are adjusted to fit the new data. ANNs are used in different applications, such as speech recognition or classification of data, and are constantly adaptable to new process data. The underlying motivation for using ANNs is their ability to derive useful information from complex or incomplete data. The structure of ANNs allows for recognition of patterns that escape humans or other methods. If constructed correctly and taught well, ANNs can use the given information and answer ”what-if” questions with great reliability.

2.2.1.1 A single element

The single element (also known as neuron or unit) has several inputs but a single output. There are two modes that can be applied to an element: training mode and operational mode. In training mode, the element can be set to trigger (or fire) for a certain set of input patterns. In operational mode, when a taught input pattern is recognized, the element gives the associated output state. If the input pattern has not been taught to the element, it uses a trigger rule to decide whether to fire or not. The trigger rule can be seen as a threshold value $\theta$: the output is 1 when the weighted sum of the inputs (or input pattern), $\sum_i w_i x_i$, exceeds or equals the threshold value. Otherwise the output is 0 and the element is not triggered. Each input $x_i$ is multiplied by its weight $w_i$ to give the weighted inputs. This process is presented in Figure 2.1 below.

Figure 2.1: A Threshold Logic Unit (TLU) [3]
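The trigger rule of a single element can be sketched in a few lines of Python. This is only an illustration of the threshold rule described above; the function name `tlu_output` and the example weights and threshold (chosen so that the unit behaves like a logical AND) are assumptions made here and are not taken from the thesis or from [3].

```python
from typing import Sequence


def tlu_output(inputs: Sequence[float], weights: Sequence[float], theta: float) -> int:
    """Return 1 if the weighted sum of the inputs reaches the threshold theta, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= theta else 0


# Hypothetical two-input unit: with these weights and threshold it only
# fires when both inputs are 1, i.e. it behaves like a logical AND.
AND_WEIGHTS = (1.0, 1.0)
AND_THETA = 2.0

and_table = {
    (x1, x2): tlu_output((x1, x2), AND_WEIGHTS, AND_THETA)
    for x1 in (0, 1)
    for x2 in (0, 1)
}
```

Lowering the threshold to 1.0 with the same weights would make the unit fire as soon as one input is active, which shows how the threshold and the weights together determine what the element represents.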

In mathematical terms, the input-to-output mapping of the non-linear multivariate elements can hence be described by [20]:

$$y_t = f_t(x_t) + n_t \tag{2.1}$$

where $x_t \in \mathbb{R}$ corresponds to a set of input variables, $y_t \in \mathbb{R}$ to a set of output variables and $n_t \in \mathbb{R}$ to the system noise, with $t \in \{1, 2, \ldots, n\}$ representing the time.

2.2.1.2 Architecture of neural networks

Feed-forward networks The feed-forward neural networks only allow information to pass either top-down or bottom-up, i.e. in one direction only. These are widely used together with the common layer representation of neural networks. Such a network consists of an ”input” layer connected to a ”hidden” layer, whose behaviour is governed by the inputs together with the weights on the connections between the input and hidden layers.

The hidden layer is then connected to the output layer, whose behaviour is governed by the weights on the connections between the hidden and output layers. This means that by modifying the weights, each unit or element (seen as a node in Figure 2.2) can choose what it represents and performs. The output of a layer can never affect that same layer, and the output is an association of the respective inputs. An example of a layered feed-forward network is shown in Figure 2.2 below.



Figure 2.2: A combined feedforward and layered network [4]
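A forward pass through a layered feed-forward network can be sketched as follows. This is a minimal illustration under assumptions made here: a smooth sigmoid activation is used instead of the hard threshold of Section 2.2.1.1, bias terms are added, and the 2-3-1 layout and all weight values are hypothetical rather than taken from [4].

```python
import math
from typing import List, Sequence


def sigmoid(value: float) -> float:
    """Smooth activation used here in place of a hard threshold (an assumption)."""
    return 1.0 / (1.0 + math.exp(-value))


def layer_output(inputs: Sequence[float],
                 weights: Sequence[Sequence[float]],
                 biases: Sequence[float]) -> List[float]:
    """Each unit outputs the activation of the weighted sum of the previous layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


def feed_forward(inputs: Sequence[float],
                 hidden_weights: Sequence[Sequence[float]],
                 hidden_biases: Sequence[float],
                 output_weights: Sequence[Sequence[float]],
                 output_biases: Sequence[float]) -> List[float]:
    """Information flows in one direction only: input -> hidden -> output."""
    hidden = layer_output(inputs, hidden_weights, hidden_biases)
    return layer_output(hidden, output_weights, output_biases)


# Hypothetical 2-3-1 network; the weights are illustrative, not trained values.
HIDDEN_WEIGHTS = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
HIDDEN_BIASES = [0.0, 0.1, -0.1]
OUTPUT_WEIGHTS = [[0.7, -0.2, 0.5]]
OUTPUT_BIASES = [0.05]

prediction = feed_forward([1.0, 0.0], HIDDEN_WEIGHTS, HIDDEN_BIASES,
                          OUTPUT_WEIGHTS, OUTPUT_BIASES)
```

Because no layer ever reads its own output, a single pass from input to output is sufficient; this is what distinguishes the sketch from the feedback architecture described next.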

Feedback networks This architecture is characterized by the signals travelling both ways in

the network. Hence, every input calls for an adjustment in each element until the network has

come to an equilibrium. For every new input the network gets, a new equilibrium point must

be found. This type of architecture can make the network extremely powerful and complex. A

feedback network is shown in Figure 2.3 below.

Figure 2.3: A feedback network [4]



2.2.2 Knowledge-based methods

Turning to Bayesian networks and Bayesian methods, there are four main reasons that give them their strength [1].

First of all, the Bayesian network model can handle data that is incomplete. This means that data can be used for learning, or parameter adjustment, depending on the behaviour of the real world (see Section 2.6). As an example, consider two input variables to the network that are strongly anti-correlated. As long as all inputs are measured in every case, this correlation is not a problem for standard supervised learning techniques. When one of the inputs is not observed, however, many models produce inaccurate predictions because they do not encode the correlation between the two variables. Bayesian networks encode such dependencies naturally.

Secondly, as mentioned already, Bayesian networks can be used to learn causal relationships. This becomes powerful in two cases. One, where the aim is to gain an understanding of the problem domain, for example during exploratory data analysis. Two, knowledge of causal relationships allows us to make predictions in the presence of interventions.

Thirdly, Bayesian networks allow for the use of both statistical data and expert domain knowledge. This means that the experience and knowledge of real-world cases is given a forum for interaction, which becomes extremely beneficial and useful when the statistical data is scarce or expensive to collect.

Heckerman [1] gives the fourth reason when stating:

”The fact that some commercial systems (i.e., expert systems) can be built from prior knowledge

alone is a testament to the power of prior knowledge. Bayesian networks have a causal semantics

that makes the encoding of causal prior knowledge particularly straightforward. In addition,

Bayesian networks encode the strength of causal relationships with probabilities. Consequently,

prior knowledge and data can be combined with well-studied techniques from Bayesian statistics.

Four, Bayesian methods in conjunction with Bayesian networks and other types of models offers

an efficient and principled approach for avoiding the over-fitting of data. As we shall see, there

is no need to hold out some of the available data for testing. Using the Bayesian approach,

models can be ”smoothed” in such a way that all available data can be used for training.”



2.2.3 Conclusion of analysis

With knowledge about both neural networks and Bayesian networks, it’s easy to see that there are similarities between them. So in order to draw a conclusion on which method is better suited for this thesis, a review of the requirements (given in Section 1.2.1) is needed. The chosen method must not only fulfil the requirements but also be efficient and easy to use. Also, the less information that is needed for creating a correct model, the better. This is partly because of the limited access to fault data in this thesis work, but also because complete data isn’t always at hand to build the models from.

With reference to the requirements in Section 1.2.1, a mapping to the characteristics of the two methods is conducted. The most significant feature desired in the chosen method is the ability to make use of expert knowledge about the system when creating the model. Hence, it’s not only important to be able to use statistical datasets but also, for example, to be able to account for deviations from the general case (i.e. the expert knowledge).

In this regard, Bayesian networks provide a strong tool for utilizing that knowledge in the models. The a priori probabilities (see Section 2.4) can be set according to the degree of belief of the expert, e.g. the mechanic.

Furthermore, for the construction of a neural network, a substantial set of data needs to be present in order to teach the network and adjust the weights as explained in Section 2.2.1. Access to such data, in sufficient amount, is very limited in this thesis. This means that using a neural network would yield either a poorly accurate model or a dysfunctional model altogether. Bayesian networks solve this with their ability to set probabilities from experience.

Lastly, Bayesian networks have been proposed as the preferred choice of modeling technique by

many authors on decision analysis and uncertainty reasoning [21] [22] [23]. This gives a valid

reason and support to take the approach of Bayesian networks, as it’s shown in research to be

the preferred technique.



2.3 Bayesian Networks

Bayesian networks (BN), also known as belief networks, fall under the category of probabilistic graphical models (GM). The aim of these models is to describe and represent knowledge about an uncertain domain by the use of expert knowledge, statistical data and/or computational methods. The Bayesian network is composed of random variable nodes, with edges between nodes to describe the probabilistic dependencies between the connected variables [24]. A clear distinction from ”classical” probability is that Bayesian probability is also a property of the expert who assigns the probability; in other words, the expert’s belief in a certain outcome also has an impact on the outcome of the model.

Bayesian networks are structured as directed acyclic graphs (DAG) [25] and provide an effective representation and computation of the joint probability distribution (JPD) over a set of random variables [25] (explained further below). The Bayesian network provides an intuitive and comprehensible way of understanding complex systems and their interdependencies, while at the same time being mathematically rigorous. The causal relationships between the random variables are exposed and expressed in the form of probabilities. The directed graph helps to visualize the probabilities and the causal relationships of nodes [26]. A simple DAG is shown in Figure 2.4 below.

Figure 2.4: A directed acyclic graph (DAG)



As mentioned, a Bayesian network consists of a set of random variables. Each variable is represented as a single node in the network. Each node has at least one parent, at least one child, or both. A node X is a parent of another node Y if there is an arrow from node X to node Y. In Figure 2.4, A is the parent of B.

Also, an arrow from node X to node Y means that X has a direct influence on Y. However, the nodes in the network are to some degree dependent on each other, and a change in a set of nodes in one part of the network will affect another part, even if they are not directly connected. This is the power of the Bayesian network and its representation of causal relationships among random variables.

Two important properties that come with a Bayesian network are:

1. The network is a compact representation of the joint probability distribution (see Section 2.4.1) over the variables.

2. The network encodes the conditional independence relationships (see Section 2.4.2) between the variables in the graph structure.



2.4 Bayesian probability theory

The concept of probability theory that most are familiar with and associate with the term

”probability” is what is known as classical, true or physical probability [1]. It denotes the

actual physical probability of an event, for example the 50/50 probability that a coin will land

on its head (if we disregard the possibility that it might land on the edge). When determining

physical probability, a series of tests needs to be conducted to find the appropriate value of the

probability for a certain event. We might for example toss the coin a hundred times and analyse

the results to determine the probability for the coin to land on its head.

Outcome   Amount   Probability
Head      45       0.45
Tail      55       0.55

Table 2.1: A simple coin toss test to determine the true probability

Bayesian probability differs from this definition as it describes a person’s degree of belief in the

occurrence of a certain event. This means that expert knowledge is utilized when asserting the

probabilities. Where classical probability only considers the test results, Bayesian probability

or personal probability considers other factors that could influence the outcome as well.

Also, an important note on Bayesian probabilities is that the probability value does not have to be static, which is also true for physical probabilities. The probability given to the occurrence of a certain event prior to a trial is known as the a priori probability of that event. This is the known probability of the event without any prior knowledge of the outcome. How the a priori probability has been obtained affects the reliability of the outcome. The a priori probability can be obtained through expert knowledge, statistical analysis or a combination of the two [1]. By extension, the a priori probability can be adjusted based on a set of trials or tests. This is known as parameter learning. An extensive explanation of this subject is given in Section 2.6.

2.4.1 Joint probability distribution

In the previous section, the Bayesian approach to probability assertion was discussed. To apply this to a Bayesian network, in which a set of random variables is structured in a DAG, the notion of the joint probability distribution (JPD) is introduced. The Bayesian network efficiently encodes this JPD (which can consist of both physical and Bayesian probabilities) for a large set of random variables. Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the set of random variables in the network S, and let $P = \{p_1, p_2, \ldots, p_n\}$ be the associated probability distributions, one for each variable. Further, $Pa_i$ denotes the parents of the variable $x_i$ (seen as a node in the network) as well as the variables corresponding to those parents, and $pa_i$ denotes a configuration of those parents. For a specific network S, the joint probability distribution for the set X is given by [1]

$$p(x) = \prod_{i=1}^{n} p(x_i \mid pa_i) \tag{2.2}$$

The term $p(x_i \mid pa_i)$ describes the probability that the variable $x_i$ takes a given value given the state of its parents $pa_i$. This is also known as the conditional probability of the variable $x_i$ based on the knowledge of $pa_i$.

When two variables, Xi and Yi, are completely independent of each other, the joint probability

distribution is simplified to P(Xi=x,Yi=y) = P (Xi = x) ∗ P (Yi = y). We know that two

variables are independent if they fulfil

P (xi|pai) = P (xi) (2.3)

and

P (pai|xi) = P (pai) (2.4)

Example 2.1. Given the DAG in Figure 2.4, an example of what the (conditional) probability table of each node can look like is given in the tables below:

A      P(A)
false  0.6
true   0.4

A      B      P(B | A)
false  false  0.01
false  true   0.99
true   false  0.7
true   true   0.3

B      C      P(C | B)
false  false  0.4
false  true   0.6
true   false  0.9
true   true   0.1

B      D      P(D | B)
false  false  0.02
false  true   0.98
true   false  0.05
true   true   0.95

Table 2.2: The joint probability table for Figure 2.4
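To illustrate how the tables in Table 2.2 combine into a joint probability, the following sketch multiplies one entry from each table, following the factorization over parents in Equation 2.2. The network structure used here (A is the parent of B, and B is the parent of both C and D) is an assumption read off the table headers, and the names `joint`, `example` and `total` are chosen for this illustration only.

```python
from itertools import product

# Probability tables copied from Table 2.2.
# Keys of the conditional tables are (parent value, child value).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}


def joint(a: bool, b: bool, c: bool, d: bool) -> float:
    """Joint probability as the product of each node's probability given its parents."""
    return P_A[a] * P_B_GIVEN_A[(a, b)] * P_C_GIVEN_B[(b, c)] * P_D_GIVEN_B[(b, d)]


# P(A=true, B=true, C=false, D=true) = 0.4 * 0.3 * 0.9 * 0.95
example = joint(True, True, False, True)

# Summing the joint over all 16 configurations must give 1.
total = sum(joint(*values) for values in product((False, True), repeat=4))
```

The four tables hold 2 + 4 + 4 + 4 = 14 numbers, whereas a full joint table over four binary variables needs 16 entries; the saving grows quickly with the number of variables, which is the sense in which the network is a compact representation of the JPD.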



2.4.2 Conditional independence

A property from probability theory that is used heavily in Bayesian networks is the notion of conditional independence. It is said that, given its parents, a node is conditionally independent of its non-descendants. In other words, two random variables X and Y are said to be conditionally independent given a random variable Z if and only if, for any value of Z, the probability distribution of X is the same for all values of Y, and vice versa [27].

Figure 2.5: Given its parents (A), a node (B) is conditionally independent of its non-descendants (E).

Consider Figure 2.5 above. We know by Equation 2.3 that node B is independent of node E if

P (B|E) = P (B). A definition for conditional independence can be described as below.

Definition 2.1. (Conditional independence)

Consider Figure 2.5. Then, a random variable B is independent of E, given D, if and only if

P (B ∩ E|D) = P (B|D)P (E|D) (2.5)

or equivalently

P (B|E ∩D) = P (B|D) (2.6)
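The factorization in Equation 2.5 can be checked numerically. Since Figure 2.5 does not come with probability tables, the sketch below instead uses the tables of Example 2.1 and assumes the structure A → B, B → C, B → D, in which C and D should be conditionally independent given their common parent B. All function and variable names are hypothetical.

```python
from itertools import product

BOOL = (False, True)

# Probability tables copied from Table 2.2; keys are (parent value, child value).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}


def joint(a: bool, b: bool, c: bool, d: bool) -> float:
    return P_A[a] * P_B_GIVEN_A[(a, b)] * P_C_GIVEN_B[(b, c)] * P_D_GIVEN_B[(b, d)]


def prob(**fixed: bool) -> float:
    """Sum the joint over all assignments consistent with the fixed values."""
    total = 0.0
    for a, b, c, d in product(BOOL, repeat=4):
        assignment = {"a": a, "b": b, "c": c, "d": d}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(a, b, c, d)
    return total


def max_violation() -> float:
    """Largest gap between P(C, D | B) and P(C | B) P(D | B) over all values."""
    worst = 0.0
    for b, c, d in product(BOOL, repeat=3):
        p_b = prob(b=b)
        lhs = prob(b=b, c=c, d=d) / p_b
        rhs = (prob(b=b, c=c) / p_b) * (prob(b=b, d=d) / p_b)
        worst = max(worst, abs(lhs - rhs))
    return worst


violation = max_violation()

# Without conditioning on B, C and D are not independent in this example.
marginal_gap = abs(prob(c=True, d=True) - prob(c=True) * prob(d=True))
```

In this example `violation` is zero up to rounding, while `marginal_gap` is clearly non-zero: C and D share information through B, and that dependence disappears once B is known.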



2.5 Definitions

As already stated in the previous section, the upper-case X denotes a set of stochastic variables in the Bayesian network and the lower-case x represents a single variable (or parameter) in the set X. The variable set X is thus said to be in configuration x. Furthermore, given the information θ, the probability that X = x is given by the probability function p(X = x | θ) (or p(x | θ) for short). The same notation is used to denote the probability distribution for X. The probability function tells us that ”given information/event θ, the probability of configuration x in set X is given by p(x | θ)” [1].

The probability distribution p given by the probability function takes values in the real interval from 0 to 1. In other words, $p: \mathcal{P}(X) \to [0,1]$, where $\mathcal{P}(X)$ is the power set of X. Therefore, the distribution p is non-negative. Since the function is defined on sets of variables, the notation $p(x \cap \theta)$ can also be used to describe the probability of two events, x and θ, occurring together. This leads to the notion of conditional probability, which is expressed by the well-known Bayes’ formula [28]:

$$P(X \mid \theta) = \frac{P(X)\,P(\theta \mid X)}{P(\theta)} \tag{2.7}$$

In an earlier section the a priori probability was explained as the probability of an event without any prior information. In the formula above, the posterior probability is given as P(X|θ), which is the probability of event X given some information θ. The probability P(θ|X) is the likelihood of θ given X. The last term, P(θ), acts to normalize the quotient and is expanded to:

$$P(\theta) = \sum_i P(\theta \cap H_i) = \sum_i P(\theta \mid H_i)\,P(H_i) \tag{2.8}$$

where the sum runs over a set of mutually exclusive and exhaustive hypotheses $H_i$. The expansion is motivated by the fact that $P(\theta \mid H_i)$ and $P(H_i)$ are easier to determine or find than P(θ). The formal definition of a Bayesian network is given in Definition 2.2.
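As a small worked example of Equations 2.7 and 2.8, the sketch below computes the posterior distribution of A after observing B = true, using the values of P(A) and P(B | A) from Table 2.2. The two values of A play the role of the hypotheses H_i. The function name `posterior` and the variable names are assumptions made for this illustration.

```python
from typing import Dict, Hashable


def posterior(prior: Dict[Hashable, float],
              likelihood: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Apply Bayes' formula to every hypothesis in the prior."""
    # Equation 2.8: the evidence is the sum of likelihood * prior over all hypotheses.
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    if evidence == 0.0:
        raise ValueError("the observed information has zero probability")
    # Equation 2.7: posterior = prior * likelihood / evidence.
    return {h: prior[h] * likelihood[h] / evidence for h in prior}


prior_a = {False: 0.6, True: 0.4}              # P(A) from Table 2.2
likelihood_b_true = {False: 0.99, True: 0.3}   # P(B = true | A) from Table 2.2

posterior_a = posterior(prior_a, likelihood_b_true)
```

Here P(B = true) = 0.99 · 0.6 + 0.3 · 0.4 = 0.714, so P(A = true | B = true) = 0.12 / 0.714, roughly 0.17. Observing B = true therefore lowers the belief in A = true from the a priori value of 0.4.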

Definition 2.2 (Bayesian Network). [29]

A discrete Bayesian network described as N = (X, G, P) consists of:

• A DAG $G = \{V, E\}$, where $V = \{v_1, v_2, \ldots, v_n\}$ are the nodes connected via directed arcs E.

• A random variable set $X = \{x_1, x_2, \ldots, x_n\}$, represented in the network by the node set V in graph G, so that $x_i$ corresponds to node $v_i$.

• A set of conditional probability distributions P with one distribution $P(x_v \mid x_{pa(v)})$ for each variable in the set X.



The definition above is well visualised in Figure 2.4. However, the nodes in the network can be classified into the subclasses causes, issues and symptoms [5]. This helps to understand the causal relationships even better and makes it easier to set evidence on the correct node. The three subclasses can be explained as:

1. Cause: The root cause of a fault and in most cases the component or element to repair.

For example – ”Broken heat sensor”

2. Issue: A conflict among a set of causes. An issue occurs when its associated causes occur.

3. Symptom: An indication of a faulty system is given as a symptom. For example –

”Temperature gauge is stuck on low”.

This structure is shown in Figure 2.6 below.

Figure 2.6: The Bayesian-network structure, consisting of an interdependent set of issues [5]

2.5.1 Inference in Bayesian networks

When the Bayesian network is constructed properly and assigned a priori probabilities, the posterior probabilities can be determined via inference in the model by providing the network with an information set θ = {θ1, θ2, ..., θn}. For every unique set of information θ, the posterior probabilities will change. These probabilities are not stored in the model and are therefore computed upon every request. This is known as probabilistic inference [1]. In general, probabilistic inference is an NP-hard task [30].

For a vehicle, the information set would consist of either error codes generated by the ECUs whenever an internal fault arises, or error observations, where the input is a symptom of a fault on the vehicle as seen by, for example, the mechanic.
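Probabilistic inference can be illustrated with a brute-force sketch that enumerates every configuration of the network. This is not the inference algorithm used in the thesis; it is the simplest exact method, shown here on the tables of Example 2.1 with the assumed structure A → B, B → C, B → D. Treating C as an observed symptom is a hypothetical choice, and all names are made up for this example.

```python
from itertools import product
from typing import Dict

VARIABLES = ("A", "B", "C", "D")

# Probability tables copied from Table 2.2; keys are (parent value, child value).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}


def joint(x: Dict[str, bool]) -> float:
    return (P_A[x["A"]] * P_B_GIVEN_A[(x["A"], x["B"])]
            * P_C_GIVEN_B[(x["B"], x["C"])] * P_D_GIVEN_B[(x["B"], x["D"])])


def posteriors(evidence: Dict[str, bool]) -> Dict[str, float]:
    """Return P(variable = true | evidence) for every variable by full enumeration."""
    weight_true = {name: 0.0 for name in VARIABLES}
    evidence_probability = 0.0
    for values in product((False, True), repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        # Skip configurations that contradict the information set.
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        p = joint(assignment)
        evidence_probability += p
        for name in VARIABLES:
            if assignment[name]:
                weight_true[name] += p
    if evidence_probability == 0.0:
        raise ValueError("the evidence has zero probability under the model")
    return {name: weight_true[name] / evidence_probability for name in VARIABLES}


# Hypothetical observation: symptom C is seen to be true.
result = posteriors({"C": True})
```

The loop visits 2^n configurations for n binary variables, which is consistent with the remark above that exact inference is NP-hard in general and explains why practical tools rely on more efficient algorithms.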



2.5.1.1 Query-Based Inference

Another approach to Bayesian inference is to compute the posterior probabilities only over a set of target variables. This is known as query-based inference [30]. In other words, given the Bayesian network model N = (X, G, P), a query is defined as Q = (N, T, e), where T ⊆ X is the target set and e is the evidence set. The result of a query-based inference is the posterior distribution over the target set of variables, without having to compute it for all variables in the network. The motivation for using query-based inference might relate to time requirements or performance issues (only a target set of variables is of interest, making it unnecessary to compute over all variables in the network). This approach will, however, not be used in this thesis. In the diagnostic process, all variables and their posterior probabilities are of interest, and there is no benefit in targeting only a certain set of variables. Query-based inference is therefore only introduced briefly.

2.6 Parameter learning via Kalman filters

The Kalman filter is a set of recursive mathematical equations that estimates the current state of a process and is widely used in the area of autonomous or assisted navigation. The filter supports estimation of past, present and even future states, and can do so even when the precise nature of the modelled system is unknown [6]. The Kalman filter is one of the most important and common data fusion algorithms in use today, and typical uses of the filter include smoothing of noisy data and providing estimates of parameters of interest [31]. The latter is relevant to this thesis.

However, it’s important to note that Kalman filters will not be implemented in this thesis due to time constraints. The purpose of this section is to aid future discussions on improvements to the model and to demonstrate what can be done.

2.6.1 Problem statement

If the state of a system at time t evolved from the prior state at time t-1, the state is assumed to evolve according to

$$x_t = A_t x_{t-1} + B_t u_t + w_t \tag{2.9}$$

where $x_t$ is the state vector containing the terms of interest, $u_t$ is the vector containing control inputs, $A_t$ is the state transition matrix, which applies the effect of the state at time t-1 on the state at time t (i.e. the previous state affects the current state), $B_t$ is the control input matrix, which applies the effect of the control input parameters in vector $u_t$ on the state vector, and $w_t$ is the process noise term.



System measurements can be performed according to the equation

$$z_t = H_t x_t + v_t \tag{2.10}$$

where $z_t$ is the measurement vector, $H_t$ is the transformation matrix and $v_t$ is the measurement noise.

The noises $v_t$ and $w_t$ are assumed to be independent of each other, white, and with normal probability distributions

$$p(w) \sim N(0, Q) \tag{2.11}$$

$$p(v) \sim N(0, R) \tag{2.12}$$

The process noise covariance matrix Q and the measurement noise covariance matrix R are assumed constant, although they might change over time in practice.

The Kalman filter can be written as a single equation, but it’s usually conceptualized as two distinct phases: ”Predict” and ”Correct”. The predict phase uses the state estimate from the previous time t-1 to produce an estimate of the state at the current time t. This estimate is also known as the a priori state estimate. In the correct phase, the current a priori prediction is combined with current observation information to improve the state estimate. The improved estimate is also known as the a posteriori state estimate [32]. This cycle is shown in Figure 2.7 below.

Figure 2.7: The ongoing discrete Kalman filter cycle. [6]

The notation $x_{n|m}$ represents the estimate of x at time n given observations up to and including time m. The state of the filter is represented by two variables:

• $x_{t|t}$, the a posteriori state estimate at time t given all observations up to and including time t.

• $P_{t|t}$, the a posteriori error covariance matrix.



The two phases described above can be summarized in the following equations, which state the implementation of the Kalman filter [6].

Predicted (a priori) state estimate

$$x_{t|t-1} = A_t x_{t-1|t-1} + B_t u_t \tag{2.13}$$

Predicted (a priori) estimate covariance

$$P_{t|t-1} = A_t P_{t-1|t-1} A_t^T + Q_t \tag{2.14}$$

Corrected (a posteriori) state estimate

$$x_{t|t} = x_{t|t-1} + K_t y_t \tag{2.15}$$

Corrected (a posteriori) estimate covariance

$$P_{t|t} = (I - K_t H_t) P_{t|t-1} \tag{2.16}$$

where the optimal Kalman gain $K_t$ and the innovation residual $y_t$ are given by Equations 2.17 and 2.18 respectively:

$$K_t = P_{t|t-1} H_t^T S_t^{-1} \tag{2.17}$$

$$y_t = z_t - H_t x_{t|t-1} \tag{2.18}$$

where the innovation covariance $S_t$ is given by Equation 2.19:

$$S_t = H_t P_{t|t-1} H_t^T + R_t \tag{2.19}$$
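The predict and correct phases can be sketched for the scalar case, where every matrix in Equations 2.13 to 2.19 reduces to a single number. This is an illustration only, since the thesis does not implement the filter: the parameter defaults, the initial values and the measurement series (noisy readings scattered around a constant value of 0.3) are all hypothetical.

```python
from typing import Iterable, List, Tuple


def kalman_step(x_prev: float, p_prev: float, z: float,
                a: float = 1.0, b: float = 0.0, u: float = 0.0,
                h: float = 1.0, q: float = 1e-5, r: float = 0.01) -> Tuple[float, float]:
    """One predict/correct cycle of a scalar Kalman filter."""
    # Predict phase (Equations 2.13 and 2.14).
    x_pred = a * x_prev + b * u
    p_pred = a * p_prev * a + q
    # Correct phase (Equations 2.15 to 2.19).
    s = h * p_pred * h + r          # innovation covariance
    k = p_pred * h / s              # Kalman gain
    y = z - h * x_pred              # innovation residual
    x_new = x_pred + k * y
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new


def run_filter(measurements: Iterable[float],
               x0: float, p0: float) -> List[Tuple[float, float]]:
    """Apply kalman_step to each measurement, returning all (estimate, covariance) pairs."""
    history = []
    x, p = x0, p0
    for z in measurements:
        x, p = kalman_step(x, p, z)
        history.append((x, p))
    return history


# Hypothetical noisy observations of a constant parameter with true value 0.3.
observations = [0.35, 0.25] * 10
history = run_filter(observations, x0=0.5, p0=1.0)
final_estimate, final_covariance = history[-1]
```

With a large initial covariance the first measurements dominate the estimate, and as the covariance shrinks the gain decreases so that later measurements only cause small corrections. This is the behaviour that would let an a priori probability settle as more fault records are processed.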

2.6.2 Conclusions

The theory given in this section provides a mathematical basis for further discussions on the use of Kalman filters for parameter estimation. The data that can be used are fault records for the specified vehicle, and over time, as the filter is applied to a large set of data, it will drive the probability parameters in the Bayesian network towards an equilibrium state. This equilibrium state represents an estimate of the true a priori probability for any given variable in the network.



The author proposes the use of Kalman filters to acquire a correct model of the system. It’s important to note that for parameter estimation to be successful, a large set of training data will be needed for the network and filter. Once the correctness of the model is within a given threshold value, the training data can consist of operational data from real diagnostic cases.

2.7 Summary

To summarize this chapter on learning models, the most significant finding is the Bayesian approach taken and the motivation behind it. The chapter has also discussed the pros and cons of the candidate methods, in particular Bayesian methods, neural networks, fuzzy logic and Dempster-Shafer evidence theory, and the conclusion was drawn that Bayesian methods are best suited for the work in this thesis.

Following this choice, an introduction to Bayesian networks was given, with examples of Bayesian probability theory and its application. A slight deviation was made in Section 2.6, where parameter learning via Kalman filters was explained and a proposal for future work following this thesis was given.


Chapter 3

Previous research on Bayesian methods for diagnostics

Over the years, Bayesian methods have been used for diagnostic purposes in many applied research projects. Many of these are similar to the work of this thesis, which is why it’s beneficial to look further into them and investigate the possibilities for Bayesian methods in this field. In this chapter, the reader is introduced to applications and research that use Bayesian methods for diagnostics, which serve as a framework to build upon and draw inspiration from for the work in this thesis. The focus is on automobiles and other vehicles, but also on other applications where Bayesian networks have proven to be a useful and efficient tool.

3.1 Bayesian network for vehicle systems

Don Thompson (Pepperdine University) and Wojtek Przytula (HRL Laboratories, LLC) [33] have proposed a process for the construction of Bayesian networks specifically for diagnostic purposes. They conclude that Bayesian networks provide a powerful tool for diagnostics and that the main problem lies in the construction of the models for the targeted system. They propose a computational method of deriving probabilities, showing that they’re easily obtained from domain experts and statistical repair data. Their approach will provide support in this thesis when creating the model.

Matthew Schwall and Christian Gerdes at Stanford University [7] have presented a method for processing and analysing residuals for the purpose of fault detection using probabilistic methods. Their analysis focuses on a car’s handling system. The derived Bayesian network is shown in Figure 3.1 below.


Figure 3.1: Bayesian network structure for vehicle stability control system [7]

In Figure 3.1, the identified faults {F1, ..., F10} and the defined residuals {R1, ..., R5} (shaded boxes) are modelled in the Bayesian network. The ”hidden” nodes shown in dashed boxes are the assumptions made while deriving the model in order to make the graph easier to construct and more understandable.

They conclude that the direct use of residual values as evidence in the Bayesian network calls for continuous probability distributions. As they implemented exact inference, this was a limiting factor for the applicability of the method. They therefore propose further research on approximate solutions. Their research shows how Bayesian networks can be used in a small subsystem (in this case the vehicle stability control system) in order to analyse faults.

3.2 Bayesian network for diagnostics

Bayesian networks are used in a diverse range of applications. Their use is not limited to automobile diagnostics; they have been used successfully for diagnostic purposes in a wide range of fields. Steam turbine diagnostics [34], female urinary incontinence [35], penetrating injury assessment [36] and troubleshooting of satellite communication ground equipment [37] are all examples of how Bayesian networks can be (and are) used in various fields. This shows the power of Bayesian networks and further supports the method as a solid choice for uncertainty reasoning.


3.3 Summary

This chapter has briefly presented some research on Bayesian networks for diagnostic purposes and how they can be applied to various fields. The chapter also touched upon some of the challenges when dealing with Bayesian networks, namely performance and time issues in the implementation of exact inference and the challenge of constructing the Bayesian network.


Chapter 4

Decision-Theoretic Troubleshooting

In the previous chapter the underlying theory behind the model was presented. The derived model will be used for troubleshooting a fault and determining the cause. In this chapter, the theory of decision-theoretic troubleshooting is presented. Decision theory deals with identifying values, uncertainties and other issues relevant to a given decision, and results in an optimal decision [38]. This will be used when deriving the action plan for the given diagnosis and the calculated error probability distribution of the components.

4.1 The troubleshooting process

In addition to diagnosing the fault or computing the error probability distribution in the Bayesian network model, the complete troubleshooting process involves correcting or repairing the detected issue or cause. A course of action for this is given by Heckerman et al. [5], where the troubleshooting cycle is demand driven and consists of five explicit steps:

1. Database access: A symptom is given of some fault and this is regarded as an input to

the system.

2. Construction of a Bayesian network: The relevant information is extracted for the

construction of a Bayesian network.

3. A BN solution: Using the constructed network, a set of recommendations is given for components to repair or observe.

4. Execution: The user performs the recommended actions and new inputs are given. Steps 1-4 are repeated until a satisfactory result is reached (i.e. the problem is solved).

5. Case recording: Upon termination of a session, the distributions (a priori probabilities in this case) are updated according to the new information.


The steps above describe the entire troubleshooting process as recommended by Heckerman. In this thesis, the Bayesian model is given in advance and no case recording is performed, since parameter learning is not implemented. Hence, the recorded information would not be used (see "Parameter learning" in Section 2.6), which narrows the process to steps 1, 3 and 4 only.

4.2 The optimal decision tree

As mentioned above, the troubleshooting process does not only entail that the cause of the

failure in the system is exposed, but also that there is a plan on how to repair or fix the issue.

There are many possible observations, tests and repairs that can be applied to a certain problem

in order to fix the malfunction in the most effective way. However, these operations differ in the

amount of time needed and/or cost. This is why it’s essential to find the most efficient way to

solve the problem. In this thesis, the term efficient is used to describe the plan which minimizes

the ECR, short for expected cost of repair. The ECR is further explained in Section 4.5.

The traditional way of computing this expected cost is through a decision tree, as described by Raiffa [39]. The decision tree helps to understand the chain of events that leads up to solving the problem. If done correctly, a decision tree gives information on all the routes which lead to a solution, hence allowing a comprehensive evaluation of which plan is best in order to minimize the ECR.

Definition 4.1 (Decision tree). A decision tree consists of decision nodes that form a rooted tree, drawn with multiple branches. The root has no incoming edges. A decision node represents a decision: an irrevocable allocation of resources. The branches of a decision node denote mutually exclusive options. A node may hold a probability vector indicating the probability of each outcome of that decision [40].

Example 4.1. Figure 4.1 shows the decision tree for a two-component device, together with the possible outcomes of each action. In the initial state, at the far-left decision node, there are three routes to take. Either an observation is made, which leads to a higher certainty about the cause of a symptom, or a repair action is taken without any further observation, for either c1 or c2. To calculate the total cost of each route, the tree is rolled back from the far right. The cost of each repair action is given at the final branches. The expected cost is calculated as the cost of each possible chain of actions times the probability of that event. For example, the expected cost of first performing an observation o and then repairing c1 is 17:

$$(5 + 10) \cdot 0.9 + (5 + 10 + 20) \cdot 0.1 = 17$$

The total expected cost for the observation route is calculated in the same manner as:

$$\min(17, 34.9) \cdot 0.8 + \min(28, 30) \cdot 0.2 = 19.2$$

Figure 4.1: A decision tree representing all the possible solutions for troubleshooting a device with two components [8]

We find that in the example shown, the most efficient action would be to first make an obser-

vation o and then decide on which component to repair depending on the path that minimized

the expected cost. In the case o = true the component c1 should be repaired first. If o = false

then component c2 should be repaired first.
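The roll-back arithmetic of Example 4.1 can be sketched in a few lines of Python. This is only an illustration: the numbers are those printed in the example, while the reading of the individual terms (5 as the observation cost, 10 and 20 as repair costs) and the helper name `expected_cost` are assumptions, since the figure itself is not reproduced here.

```python
def expected_cost(outcomes):
    """Expected cost of a chance node: sum of cost * probability."""
    return sum(cost * prob for cost, prob in outcomes)

# Assumed reading of the example: observe o (cost 5), repair c1 (cost 10);
# with probability 0.1 this does not help and c2 is repaired too (cost 20).
observe_then_c1 = expected_cost([(5 + 10, 0.9), (5 + 10 + 20, 0.1)])

# Roll back the observation route: at each decision node take the cheaper
# branch, then weight it by the probability of the observation outcome.
observation_route = expected_cost([
    (min(observe_then_c1, 34.9), 0.8),  # o = true: repairing c1 first is cheaper
    (min(28, 30), 0.2),                 # o = false: repairing c2 first is cheaper
])

print(round(observe_then_c1, 2), round(observation_route, 2))
```

The same pattern (expectation at chance nodes, minimum at decision nodes) extends to larger trees, although the number of branches grows quickly with the number of components.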

An important note is that the decision tree grows exponentially with the number of components. Hence, it is important to find a strategy for determining whether to repair or observe at each stage.



4.3 General assumptions of decision-theoretic troubleshooting

As mentioned, troubleshooting can become immensely complex, and the complexity grows exponentially with the number of components, tests and observation routes. This means that in order to identify an optimal troubleshooting algorithm, some assumptions about the system need to be made.

Definition 4.2 (Troubleshooting assumptions). Consider a device with n components represented by the variables c1, c2, ..., cn, where each component is in exactly one of a finite set of states. Then the assumptions are as follows [5]:

1. At the initial stage of troubleshooting, the device is faulty. In other words, there is a component in the device that is in a state other than "normal".

2. Only one component is considered to be faulty, hence the approach rests on a single fault assumption. If pi denotes the probability that component ci is faulty given the current information on the state of the system, then the single fault assumption gives $\sum_{i=1}^{n} p_i = 1$.

3. The state of the device, whether fixed or still faulty, can be observed with the cost Cp

after a repair.

4. Each component is either observable or unobservable. An observable component can be

tested or inspected to find if the component is causing the failure of the device. If the

component is observed to be faulty, it must be repaired or replaced immediately. An

unobservable component is such that it can’t be observed but can be repaired or replaced.

Hence the cost for ”observing” an unobservable component is equal to the sum of the cost

of repair and Cp.

5. The observation and repair costs for a component are independent of any previous observations and repairs performed on the device. Hence the costs are considered fixed and independent.

Furthermore, the class of problems being modelled are those whose dynamics can be explained as stochastic processes, where the actions (translated as inputs to the system model) have a direct influence on the system's behaviour. In the troubleshooting process, the probability distribution over the possible next states of the system depends on the system's current state and the choice of action by the decision maker (i.e. the mechanic, driver or assistance operator) [41].



4.3.1 Stochastic dynamical system

The system to be modelled is considered to be a stochastic dynamical system: a system which can be in one of a number of distinct states at any given point in time and whose state only changes with the initiation of an action, also called an event. The purpose of the event is to change the system's state in order to reach the goal state, which is when the system is in a repaired, or fixed, state.

The system to be modelled consists of random variables, which are either directly dependent or conditionally independent (see Section 2.4.2). Given a finite state space S = {s1, s2, ..., sn}, a variable is in exactly one of the states, but in most cases there is not complete information about the current state of the variable or the system. Hence, this uncertainty or incomplete information is represented as a probability distribution over the states in S [41]. This distribution plays a vital role in the troubleshooting process, as the decision maker acts on an action plan which is based on the current state (or on the distribution over the most likely states when dealing with uncertainty) in order to reach the non-faulty state.



4.4 Troubleshooting strategies

Troubleshooting actions can be classified into three main groups. They can either be to observe

a component in order to find if it’s faulty or not, or they can be to perform a test that will

provide more information about the fault. Finally, a repair can be performed upon finding the

faulty component.

As mentioned, the primary goal of efficient troubleshooting is to find the sequence of actions which minimizes the expected cost of repair. This optimal sequence is denoted the troubleshooting strategy (TS-strategy). Although it is possible to find a very efficient TS-strategy, it is very difficult to find a truly optimal one, and the difficulty increases with the size of the system on which the query is executed. It is therefore often satisfactory to settle for a suboptimal TS-strategy, i.e. a strategy that is optimal under certain conditions.

However, the fault will usually be detected before the last action in the TS-strategy, and the actions performed up to that point then form the TS-sequence for the issue. An example of a TS-strategy and its TS-sequences is given in Example 4.2.

Example 4.2. Given a three-component device with components A, B and C, it is found that the optimal strategy that minimizes the cost of repair is given by $S_{TS} = (B_o, A_o, C_o)$. The subscript o denotes an observation. For simplicity, no tests are available. The possible TS-sequences are then given below.

| TS-strategy | Faulty component | TS-sequence |
|---|---|---|
| B, A, C | A | B, A |
| B, A, C | B | B |
| B, A, C | C | B, A, C |

Table 4.1: An example of different TS-sequences
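The relation between a TS-strategy and the resulting TS-sequence in Table 4.1 can be illustrated with a minimal Python sketch. The function name `ts_sequence` is hypothetical and the sketch assumes, as in the example, that only observations are available and exactly one component is faulty.

```python
def ts_sequence(strategy, faulty):
    """Return the observations actually performed before the fault is found."""
    sequence = []
    for component in strategy:
        sequence.append(component)
        if component == faulty:
            break  # fault located, the remaining strategy is not executed
    return sequence

strategy = ["B", "A", "C"]
for faulty in ["A", "B", "C"]:
    print(faulty, ts_sequence(strategy, faulty))
```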



4.5 Expected cost of repair

The expected cost of repair (ECR) is a measure of how cost-effective a TS-strategy is. From the cost optimization perspective, the goal of successful troubleshooting is to find the TS-strategy that gives the minimum cost.

Definition 4.3 (ECR). Let $C^o_i$ and $C^r_i$ denote the cost of observation and the cost of repair for component $c_i$. Given that we observe and consequently repair (upon detection of a faulty component) components in an order $(c_1, c_2, ..., c_n)$, the expected cost of repair is described as below [8]:

$$
\begin{aligned}
ECR(c_1, \ldots, c_n) &= \left(C^o_1 + p_1 (C^r_1 + C_p)\right) + (1 - p_1)\left(C^o_2 + \frac{p_2}{1 - p_1}(C^r_2 + C_p)\right) \\
&\quad + (1 - p_1 - p_2)\left(C^o_3 + \frac{p_3}{1 - p_1 - p_2}(C^r_3 + C_p)\right) + \ldots \\
&= \sum_{i=1}^{n} \left[\left(1 - \sum_{j=1}^{i-1} p_j\right) C^o_i + p_i (C^r_i + C_p)\right]
\end{aligned}
\tag{4.1}
$$

This means that component $c_1$ is observed at the cost $C^o_1$. With probability $p_1$ the component is found to be faulty and is repaired at the cost $C^r_1 + C_p$. With probability $1 - p_1$ the component is not faulty and the next component $c_2$ in the order is observed at the cost $C^o_2$. With probability $p_2 / (1 - p_1)$ it is found that $c_2$ is faulty and it is repaired at the cost $C^r_2 + C_p$; and so on.
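As a minimal sketch, Equation 4.1 can be written as a short Python function. The function name `ecr`, the dictionary-based inputs and the two-component values below are hypothetical and chosen only for illustration; the single fault assumption (probabilities summing to 1) is taken for granted.

```python
def ecr(order, p, c_obs, c_rep, c_p):
    """Expected cost of repair (Equation 4.1) for a given observation order."""
    total = 0.0
    not_found_yet = 1.0  # 1 - sum of p_j over components already observed
    for c in order:
        total += not_found_yet * c_obs[c] + p[c] * (c_rep[c] + c_p)
        not_found_yet -= p[c]
    return total

# Hypothetical two-component device (values are not from the thesis).
p = {"x": 0.6, "y": 0.4}
c_obs = {"x": 2, "y": 3}
c_rep = {"x": 10, "y": 20}

cost_xy = ecr(["x", "y"], p, c_obs, c_rep, c_p=1)
cost_yx = ecr(["y", "x"], p, c_obs, c_rep, c_p=1)
print(round(cost_xy, 2), round(cost_yx, 2))
```

Only the observation terms depend on the order. The repair terms $p_i (C^r_i + C_p)$ contribute the same amount whichever component is observed first.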

4.5.1 Efficiency index

The primary task when finding the optimal troubleshooting sequence is to find the TS-strategy that satisfies the expression min(ECR). This is done by ranking the components in the order in which they should be observed, using the efficiency index:

$$ef = \frac{p_i}{C^o_i} \tag{4.2}$$

The efficiency index ef ranks the components in the order they should be observed based on the

cost of observation and the probability of the component causing the fault. So the component

with the highest ratio ef is observed first and then the components are observed in a descending

order depending on the efficiency index until the faulty component is found (and the problem

solved).

As the order according to the efficiency index gives the optimal TS-strategy, it also minimizes the total expected cost of repair (and troubleshooting). In other words, if the troubleshooting sequence S is constructed by ordering the components by descending efficiency index, an "optimal" sequence is guaranteed. However, this only holds for a sequence consisting solely of observations and repairs. If tests are included, the argument is no longer valid [5]. The reason for utilizing the efficiency index is to combine the knowledge given by the ECR (with components ordered by descending ef) with the calculation of the expected cost of repair after tests (ECRT). This will be discussed further in Section 4.6.

To calculate the efficiency index with Equation 4.2, the posterior probability pi must be computed and the cost of observation for component ci must be known. By ordering the components by descending efficiency index, the expression min(ECR) is fulfilled, with the ECR given by Equation 4.1. The plan for the optimal observation-repair sequence can hence be summarized as follows:

1. Calculate the posterior probabilities for all components in the device given the information

of a malfunction.

2. Compute the efficiency index for each of the components and list them in a descending

order.

3. Observe the component with the highest efficiency index among those not yet observed.

4. If found to be faulty, the component is repaired or replaced. Otherwise, go to step 3.

5. If a component is repaired, check if the problem is solved and the device is working.
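The five steps above can be sketched as follows. This is a simplified illustration, not the implementation used in the thesis: the posterior probabilities are passed in as a plain dictionary instead of being queried from a Bayesian network, the ranking is computed once rather than after every new observation, and `is_faulty` is a hypothetical callback standing in for the mechanic's observation. All component names and values are made up.

```python
def rank_by_efficiency(p, c_obs):
    """Order components by descending efficiency index p_i / C^o_i."""
    return sorted(p, key=lambda c: p[c] / c_obs[c], reverse=True)

def troubleshoot(p, c_obs, is_faulty):
    """Observe components in ranked order until a faulty one is found."""
    performed = []
    for component in rank_by_efficiency(p, c_obs):
        performed.append(("observe", component))
        if is_faulty(component):
            performed.append(("repair", component))
            performed.append(("check device", None))  # step 5
            break
    return performed

# Hypothetical posteriors and observation costs.
posterior = {"pump": 0.5, "sensor": 0.3, "cable": 0.2}
obs_cost = {"pump": 50, "sensor": 10, "cable": 40}

ranking = rank_by_efficiency(posterior, obs_cost)
actions = troubleshoot(posterior, obs_cost, lambda c: c == "pump")
print(ranking)
print(actions)
```

Note that the cheap-to-observe sensor is ranked before the more probable pump, which is exactly the trade-off the efficiency index encodes.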

4.5.2 The cost distribution

As mentioned earlier, the troubleshooting strategy is used to declare the sequence in which the

observations (and consequently the repair) will follow. However, it’s the efficiency index that

shows which TS-strategy minimizes the expected cost of repair. Hence, a combination of them

both will render the most cost-efficient strategy.

While it’s stated to be cost-effective, the term also denotes the time and resource costs involved

as these are included in the cost calculation of each component.



Example 4.3. In Table 4.2 below, the costs for observing and repairing different components

(given a device with three components: A, B and C) are presented. For simplicity, no tests are

available.

| Component | Prob. | Obs. cost | Rep. cost |
|---|---|---|---|
| A | 0.2 | 5 | 50 |
| B | 0.7 | 11 | 47 |
| C | 0.1 | 15 | 32 |

Table 4.2: Cost of observation, repair and probability of fault for all components

Given a particular TS-strategy of S = {C, A, B}, the cost distribution for this strategy can be

calculated as given in Table 4.3 below. When a component is found to be faulty, it’s repaired

and then the device is observed again at the cost of 10, in order to make sure the repair was

successful.

| Faulty component | TS-sequence cost |
|---|---|
| A | 80 |
| B | 88 |
| C | 57 |

Table 4.3: Estimated cost of repair for different TS-strategies

Table 4.2 and Table 4.3 reveal that there is a 20%, 70% and 10% chance of ending up with a total cost of 80, 88 and 57 respectively (70, 78 and 47 when the final check at the cost Cp = 10 is excluded). The expected cost, on the other hand, is an estimate of the final cost of repair made without knowledge of which component is faulty. Thus, the ECR is calculated from Equation 4.1 as:

$$ECR = \left(C^o_1 + p_1(C^r_1 + C_p)\right) + (1 - p_1)\left(C^o_2 + \frac{p_2}{1 - p_1}(C^r_2 + C_p)\right) + (1 - p_1 - p_2)\left(C^o_3 + \frac{p_3}{1 - p_1 - p_2}(C^r_3 + C_p)\right)$$

with corresponding values and $C_p = 10$:

$$ECR = (15 + 0.1(32 + 10)) + (1 - 0.1)\left(5 + \frac{0.2}{1 - 0.1}(50 + 10)\right) + (1 - 0.1 - 0.2)\left(11 + \frac{0.7}{1 - 0.1 - 0.2}(47 + 10)\right)$$

which gives an ECR = 83.3, in agreement with the probability-weighted costs of Table 4.3: 0.2 · 80 + 0.7 · 88 + 0.1 · 57 = 83.3.

If the efficiency index ef is used instead of "guessing" a TS-strategy and hoping for the best, the expected cost of repair is minimized every time. Since the efficiency index uses the posterior probability, its values change with every change of knowledge in the Bayesian network (i.e. for every new action or observation). An example of how the efficiency index is used is given below:

Example 4.4. Using Equation 4.2 in combination with the cost information in Table 4.2, the efficiency index ef can be calculated. Given the same three-component device as in the previous example, the efficiency index is shown in Table 4.4 below.

| Component | Prob. | Obs. cost | Efficiency index |
|---|---|---|---|
| A | 0.2 | 5 | 0.04 |
| B | 0.7 | 11 | 0.063 |
| C | 0.1 | 15 | 0.0066 |

Table 4.4: Efficiency index for all three components

If the strategy proposed by Table 4.4 is used, i.e. the components are taken in descending order of efficiency index, giving a TS-strategy of S = {B, A, C}, the expected cost of repair is calculated using Equation 4.1.

$$ECR = (11 + 0.7(47 + 10)) + (1 - 0.7)\left(5 + \frac{0.2}{1 - 0.7}(50 + 10)\right) + (1 - 0.7 - 0.2)\left(15 + \frac{0.1}{1 - 0.7 - 0.2}(32 + 10)\right)$$

which equates to an ECR = 70.1. Thus, the expected cost has been reduced by 13.2 compared with the strategy S = {C, A, B} of Example 4.3 by a simple ordering with respect to the efficiency index.
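As a cross-check, the ECR of Equation 4.1 can be evaluated for every possible ordering of the three components, using the probabilities and costs of Table 4.2 and Cp = 10. The sketch below is an illustrative brute-force enumeration (the variable and function names are assumptions); it is only feasible for a handful of components, since the number of orderings grows factorially.

```python
from itertools import permutations

# Values from Table 4.2; C_P is the cost of checking the device after a repair.
p = {"A": 0.2, "B": 0.7, "C": 0.1}
c_obs = {"A": 5, "B": 11, "C": 15}
c_rep = {"A": 50, "B": 47, "C": 32}
C_P = 10

def ecr(order):
    """Expected cost of repair (Equation 4.1) for one observation order."""
    total, remaining = 0.0, 1.0
    for c in order:
        total += remaining * c_obs[c] + p[c] * (c_rep[c] + C_P)
        remaining -= p[c]
    return total

costs = {order: ecr(order) for order in permutations(p)}
best = min(costs, key=costs.get)
by_ef = tuple(sorted(p, key=lambda c: p[c] / c_obs[c], reverse=True))

for order, cost in sorted(costs.items(), key=lambda item: item[1]):
    print("".join(order), round(cost, 1))
```

With these inputs the cheapest ordering found by enumeration coincides with the ordering by descending efficiency index, which is the property the efficiency index is meant to guarantee for observation-repair sequences.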



4.6 Estimated cost of repair after tests

In order to make full use of the power of decision-theoretic troubleshooting as proposed by Heckerman et al. [8], more information is needed in the process. As tests are introduced (defined in Section 6.3.1 for this thesis), the troubleshooting algorithm can look further ahead and assess the best route by evaluating the probability of certain test outcomes. This improves the efficiency of the TS-algorithms and the troubleshooting process.

A value for the tests must be defined. It is referred to as the expected cost of repair after test (ECRT) and is specified for each test. The ECRT is calculated as the probability-weighted ECR over the possible test outcomes plus the cost of performing the test. The ECRT is given by Equation 4.3.

$$ECRT_j = \sum_{i=1}^{Q} p(T_j = q_i \mid \varepsilon) \times ECR(T_j = q_i \mid \varepsilon) + C_{T_j} \tag{4.3}$$

The sum is over $q_1, q_2, ..., q_Q$, where $Q$ is the number of possible outcomes for test $T_j$ and $C_{T_j}$ is the cost of test $T_j$.

Since Equation 4.3 depends on the probability $p(T_j = q_i \mid \varepsilon)$, it relies on current information about the system. For each evaluation of the ECRT, this information must therefore be extracted from the Bayesian network.

The outcomes of the test are considered the possible states of $T_j$. For example, a test T: Did the temperature gauge move? has the states $q_1$ = Yes and $q_2$ = No.
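Equation 4.3 can be sketched in Python as below. The function name `ecrt` and all numbers are hypothetical. In the thesis the outcome probabilities and the ECR per outcome would be obtained from the Bayesian network given the evidence, whereas here they are simply passed in as dictionaries.

```python
def ecrt(outcome_probs, outcome_ecrs, test_cost):
    """Expected cost of repair after a test (Equation 4.3)."""
    if abs(sum(outcome_probs.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    expected = sum(outcome_probs[q] * outcome_ecrs[q] for q in outcome_probs)
    return expected + test_cost

# Hypothetical numbers for the test "Did the temperature gauge move?"
value = ecrt(
    outcome_probs={"Yes": 0.3, "No": 0.7},  # p(T = q | evidence)
    outcome_ecrs={"Yes": 40.0, "No": 90.0},  # ECR given each outcome
    test_cost=5.0,
)
print(round(value, 2))
```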

4.7 Summary

In this chapter an introduction to decision-theoretic troubleshooting was presented. The theory in this chapter provides an understanding of how to proceed with the troubleshooting of heavy vehicles, primarily via calculation of a troubleshooting strategy and calculation of the expected cost of repair.


Chapter 5

Method

The general method used in this thesis is a traditional system engineering process method, the

V-model. The reason for using the V-model is that verification is a major part of the process,

which is essential in this thesis in order to analyse the use and performance of the designed

system. The method is presented briefly in Figure 5.1 below.

Figure 5.1: The overall process in the V-model [9]

Although the work in this thesis has followed the V-model to a large extent, some alterations have been made to suit the thesis. These include omitting some of the stages in the process, such as "Validation planning", "Installation qualification" and "Validation reporting", because they serve no purpose for the aim of this thesis. Some of the pros and cons of this method are listed below [42].

Pros


• Testing is involved early, as early as in the requirement phase.

• Requirement change is possible at any stage

• Has all the advantages as compared to e.g. the Waterfall model

Cons

• With any change in the requirements, the testing and design documents need to be updated to meet the new requirements.

• Requires reviews at each stage, which is not efficient for small projects. However, this part can be adapted to the appropriate scope, as done in this thesis.

5.1 Requirements and research

Based on the requirements of the system, as stated in Section 1.2.1, a background study was conducted in which possible and alternative solutions were investigated. Also, since many automobile manufacturers have a product or service comparable to the diagnostics system that is the aim of this thesis, it was beneficial to investigate those solutions and the architecture of those systems. Hence, inspiration and lessons could be drawn from previous implementations. This research can be found in Chapter 3.

5.2 Modelling and development

Once the requirements of the system were consolidated into a system specification document, a general design of the architecture (see Section 1.3.1) and workflow of the solution could be constructed. As the diagnostic system consists of three main parts, earlier mentioned as observation, diagnostics and troubleshooting, these three parts are considered the three main subsystems of the design (see Section 1.3.1).

When prototyping and deploying subsystem tests, these three will be the systems in focus. Although they are seen as separate subsystems, the interdependencies are clear and a thorough subsystem integration test must be performed at each stage of the design and prototyping. This is explained further in Chapter 7.

In the actual development phase, the concept of rapid prototyping is applied. As the prototype is subject to weekly evaluation at Scania together with the stakeholder of the project and thesis, a new set of functionality is presented at each evaluation meeting. This means that rapid prototyping is an efficient method, as the iterative process converges the prototype to a working system. The unit testing is also conducted in this phase, where each subsystem (see Section 1.3.1) is tested separately with test data derived from the evaluation meeting and matched against the user requirements (validation assessment).

5.3 Development framework

The development is conducted in the Java language in the Eclipse integrated development envi-

ronment (IDE). The reason for using Java is that the target environment for the prototype is

a common web browser. This means that the prototype will be a web application, hence it has

to be developed in a language that is translatable and understandable for a web browser. Java

allows for that translation via Google Web Toolkit (GWT). The tool compiles the Java source

code and outputs it as JavaScript, fully understandable and presentable for a web browser.

5.3.1 Google Web Toolkit

GWT allows for building and optimizing complex web-based applications, such as the prototype in this thesis. It is open source, completely free and used by a large community worldwide [43]. The GWT software development kit (SDK) provides a set of core Java application programming interfaces (APIs) and so-called Widgets. With these, the developer can write asynchronous JavaScript and XML (AJAX) applications in Java and then compile the source code into JavaScript, which is platform independent as it can run on all browsers, including mobile browsers on Android and iOS. The workflow from Java source code to compilation into JavaScript is shown in Figure 5.2.

One challenge when working with GWT and JavaScript is that the latter does not support multiple threads. This means that in order to handle multiple tasks, asynchronous calls are made, which affects performance and deadlines. As a consequence, hard deadlines cannot be achieved in a web application making asynchronous calls [44].

Although the system requirements do not call for any hard deadlines (where the timing of events is safety critical [45]), this is useful to keep in mind during the development phase as the performance will be affected heavily. Nevertheless, any request made server-side should never take more time than what the user experiences as real time. This is verified throughout the development phase, during the unit testing and at the weekly evaluation meetings.


Figure 5.2: The workflow from Java source code to a JavaScript web application via GWT.

5.4 Unit and integration testing

The unit and integration testing of the system is a major part of the development phase, as

the solution needs to be functional and realistic in its results in order to satisfy the goal of the

thesis. The testing can be done in either a black box or a white box setting [46]. For this thesis,

only a black box test setting is chosen as the relevant result is how the system behaves given

some inputs, not necessarily the path it takes. Black box testing is visualized in Figure 5.3

below.


Figure 5.3: General Classification of Test Techniques

For the black box testing, only three test configurations are considered. These are equivalent

partitioning, comparison testing and fuzz testing.

5.4.1 Equivalent partitioning

This configuration divides the input domain of an application into equivalence classes. This

means that it separates a set of valid and invalid states for given input conditions. In other

words:

• Given a range of the input condition - define one valid and two invalid states

• Given a value of the input condition - define one valid and two invalid states

Equivalent partitioning will be conducted in the unit testing (Bayesian model and troubleshoot-

ing algorithm).
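As an illustration of the range case above, consider a probability input that must lie in the interval [0, 1]. The sketch below is hypothetical (the function `is_valid_probability` is not taken from the thesis prototype) and picks one representative value from the valid class and one from each of the two invalid classes.

```python
def is_valid_probability(value):
    """Hypothetical input check: a probability must lie in [0, 1]."""
    return 0.0 <= value <= 1.0

# Range condition 0 <= p <= 1: one valid class and two invalid classes
# (below the range and above the range), with one representative each.
cases = {"valid": 0.5, "invalid_below": -0.1, "invalid_above": 1.1}
results = {name: is_valid_probability(v) for name, v in cases.items()}
print(results)
```

Testing one representative per class keeps the number of test cases small while still covering each kind of input the unit is expected to accept or reject.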

5.4.2 Comparison testing

Generally, comparison testing is conducted in software development where the reliability of the software is critical. The software engineering team develops several independent versions of the same application, conducts tests with the same data and verifies that the outcome is the same. In this thesis, comparison testing will be used on the Bayesian network, and the different versions will be the target application of this thesis and GeNIe, a development environment for graphical decision-theoretic models (i.e. Bayesian networks) [47].
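The principle of comparison testing can be sketched with two independent computations of the same posterior probability for a minimal cause-symptom network. Both functions below are hypothetical stand-ins: in the thesis the two "versions" are the prototype and GeNIe, neither of which is called here, and the input values are made up.

```python
import math

def posterior_bayes(prior, p_sym_given_fault, p_sym_given_ok):
    """P(fault | symptom) computed directly with Bayes' rule."""
    evidence = p_sym_given_fault * prior + p_sym_given_ok * (1 - prior)
    return p_sym_given_fault * prior / evidence

def posterior_enumeration(prior, p_sym_given_fault, p_sym_given_ok):
    """P(fault | symptom) by enumerating the joint distribution."""
    joint = {
        (True, True): prior * p_sym_given_fault,
        (True, False): prior * (1 - p_sym_given_fault),
        (False, True): (1 - prior) * p_sym_given_ok,
        (False, False): (1 - prior) * (1 - p_sym_given_ok),
    }
    symptom_mass = sum(v for (fault, sym), v in joint.items() if sym)
    return joint[(True, True)] / symptom_mass

def outputs_agree(case, tol=1e-9):
    """Comparison test: both versions must give the same result."""
    a, b = posterior_bayes(*case), posterior_enumeration(*case)
    return math.isclose(a, b, rel_tol=tol, abs_tol=tol)

cases = [(0.01, 0.9, 0.05), (0.2, 0.7, 0.1), (0.5, 0.5, 0.5)]
all_agree = all(outputs_agree(c) for c in cases)
print(all_agree)
```

A tolerance is used rather than exact equality, since two correct implementations may differ in the last digits of a floating-point result.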

5.4.3 Fuzz testing (or negative testing)

Fuzz testing, also known as negative testing, takes random values as inputs. This is done to test

the robustness of a system and will be used for both the unit testing and integration testing.

The main characteristics of fuzz testing, or fuzzing, are given below [48]:

• The input is random

• A criterion of reliability: if the application crashes or hangs during the fuzz testing, the test case has failed

• Fuzz testing can be automated to a large extent
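A minimal fuzzing loop along these lines is sketched below. The unit under test, `efficiency_index`, is a hypothetical function written for this illustration (it is not the thesis code), and the sketch assumes that cleanly rejecting bad input with a `ValueError` counts as acceptable behaviour while any other exception counts as a crash. Detecting hangs would additionally require a timeout, which is left out here.

```python
import random

def efficiency_index(p, c_obs):
    """Hypothetical unit under test: p / C^o with input validation."""
    if not (0.0 <= p <= 1.0) or c_obs <= 0:
        raise ValueError("invalid probability or cost")
    return p / c_obs

def fuzz(runs=1000, seed=0):
    """Feed random inputs and collect every uncontrolled failure."""
    rng = random.Random(seed)  # fixed seed keeps failing cases reproducible
    failures = []
    for _ in range(runs):
        p, c = rng.uniform(-2, 2), rng.uniform(-100, 100)
        try:
            efficiency_index(p, c)
        except ValueError:
            pass  # rejected cleanly: counts as a pass
        except Exception as exc:  # any other crash fails the test case
            failures.append((p, c, exc))
    return failures

failures = fuzz()
print(len(failures))
```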


Chapter 6

Design and implementation

In this chapter, the implementation of the thesis work is presented. Most of the work has focused on the design of the different subsystems (see Section 1.3.1) and their integration. The implementation phase was divided into several stages, each covering a certain subsystem. The derivation of the Bayesian network model was aided by the industrial supervisor Jonas Biteus. Much work has also been put into investigating how to utilize the model for the intended purpose.

6.1 Bayesian network system modeling

As mentioned in the introduction (see Section 1.3), a delimitation made for this thesis is that the system should only consider the instrument cluster (ICL) unit, and hence the model should be of the ICL system. Although the work is only intended to demonstrate the technology and what can be done, the step towards a full vehicle system model is not far away. The application in this thesis is fully scalable to any number of ECUs in one vehicle and also to any number of vehicles.

6.1.1 Model assumptions and delimitations

As the complexity of a Bayesian model can grow rapidly, there is a need to simplify the model so that it fits the scope of this thesis. The model covers, as mentioned, only the ICL unit of the vehicle. It consists of a number of different node categories (the categories are defined internally; within the model all nodes behave the same). Some nodes act only as connectors between model nodes, the purpose being to reduce the complexity of the model and hence improve the performance of inference in the Bayesian network. The defined nodes are summarized in Table 6.1.


| Category | Type | Defined | Possible fault states |
|---|---|---|---|
| Observation | Visual | Externally | {Stuck high, Stuck low, Stuck other, Normal, Dirty, Cold, Hot, No sweep, Still} |
| Observation | DTC | Externally | {Active, Passed} |
| Breakable | Component | Externally | {High, Excessive high, Low, Obstructed, Circuit short to ground, Circuit short to battery, Circuit open, Electrical fault, No fault, Faulty, Stuck open, Stuck closed, Stuck high, Stuck low, Stuck} |
| Breakable | Other | Externally | {Same as above} |
| Connectors | - | Internally | - |

Table 6.1: The different type of nodes defined in the network

6.1.2 Model component variables

The component variables in the model are extracted during the diagnosis (diagnosis subsystem described in Section 1.3.1) and examined. In the model, these are the cause nodes (see Table 6.1). For each evidence query on the model, all nodes update their probability distributions accordingly. However, in the implementation, only the nodes defined (internally) as components are extracted to find the component which causes the fault or error. The component variables in the ICL model are given in Table 6.2 below. The component variables are defined as breakables in the model and are divided into two subgroups, component and other. Components are those which are replaceable with spare parts, i.e. actual physical components. Others are the rest, i.e. CAN buses, cables and hoses.

Each component has a cost of repair, which covers the cost of the spare part and the time to repair the component, and a cost of observation, which covers the cost of inspecting whether the component is faulty or not (i.e. the time to observe the component).



| Identifier | Code | Fault states | Type |
|---|---|---|---|
| Coolant thermostat | A11 | {Stuck open, Stuck closed} | Component |
| EMS | E44 | {Electrical fault} | Component |
| ICL | O1 | {Electrical fault} | Component |
| Radiator fan | A1 | {Faulty} | Component |
| Radiator and hoses | A2 | {Faulty} | Component |
| Temperature sensors except coolant | A4 | {Faulty} | Component |
| Coolant temperature circuit | T33 | {Faulty} | Component |
| Engine coolant temperature sensor | 30 | {Circuit open, Circuit short to ground, Stuck high, Stuck low, Stuck} | Other |
| Engine coolant temperature sensor cables | 31 | {Circuit short to ground, Circuit short to battery, Circuit open} | Other |
| Yellow CAN bus | 71 | {Electrical fault} | Other |
| Red CAN bus | 50 | {Electrical fault} | Other |

Table 6.2: The component variables in the network



6.1.3 Observation variables

The observations (observation subsystem described in Section 1.3.1), derived from the DTCs and other perceived observations, are set as the symptom variables in the model. The observation variables act as the evidence set X in Definition 2.2. The observations are also categorized into two subgroups, DTCs and visuals. See Table 6.3 below.

| Identifier | Code | ECU | Type |
|---|---|---|---|
| ICL temperature gauge sweep area | 1 | ICL | Visual |
| ICL temperature gauge movement | 2 | ICL | Visual |
| Fan rotation | 3 | EMS | Visual |
| ICL temperature gauge active movement | 5 | ICL | Visual |
| Engine temperature | 7 | EMS | Visual |
| Communication with the coordinator (COO) | 0303 | ICL | DTC |
| CAN communication, yellow CAN bus | 0300 | ICL | DTC |
| Unknown | D100 | ICL | DTC |
| Communication with the engine control unit (EMS) | 0305 | ICL | DTC |
| Internal fault | 0971 | ICL | DTC |
| Unknown | 0256 | ICL | DTC |
| Internal software fault | 9999 | EMS | DTC |
| Coolant temperature sensor | 0115 | EMS | DTC |
| Coolant temperature sensor | 1135 | EMS | DTC |
| Coolant temperature sensor 1 | 0118 | EMS | DTC |
| Coolant temperature sensor 1 | 0117 | EMS | DTC |
| Coolant temperature | 1132 | EMS | DTC |
| Communication with the EMS control unit | 0234 | COO | DTC |
| Yellow CAN bus | 02A0 | All | DTC |
| Red CAN bus | 029F | All | DTC |
| Red CAN bus | 0255 | All | DTC |

Table 6.3: The observation nodes in the network



6.1.4 A priori probabilities of components

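| Identifier | Code | Fault state | Probability |
|---|---|---|---|
| Coolant thermostat | A11 | Stuck open | $1 \times 10^{-8}$ |
| Coolant thermostat | A11 | Stuck closed | $1 \times 10^{-8}$ |
| EMS | E44 | Electrical fault | $1 \times 10^{-8}$ |
| ICL | O1 | Electrical fault | $1 \times 10^{-7}$ |
| Radiator fan | A1 | Faulty | $1 \times 10^{-7}$ |
| Radiator and hoses | A2 | Faulty | $1 \times 10^{-7}$ |
| Temperature sensors except coolant | A4 | Faulty | $1 \times 10^{-8}$ |
| Coolant temperature circuit | T33 | Faulty | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor | 30 | Circuit open | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor | 30 | Circuit short to ground | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor | 30 | Stuck high | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor | 30 | Stuck low | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor | 30 | Stuck | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor cables | 31 | Circuit short to ground | $1 \times 10^{-7}$ |
| Engine coolant temperature sensor cables | 31 | Circuit short to battery | $1 \times 10^{-8}$ |
| Engine coolant temperature sensor cables | 31 | Circuit open | $1 \times 10^{-8}$ |
| Yellow CAN bus | 71 | Electrical fault | $1 \times 10^{-7}$ |
| Red CAN bus | 50 | Electrical fault | $1 \times 10^{-7}$ |

Table 6.4: The a priori probabilities for the components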

The reason for such low a priori probabilities on the components is the lack of statistical data that could serve to adjust the values to fit the fault history of the ECUs. The power of the Bayesian network is still utilized through the structure of the network and the causal relationships between components. This structure describes how different components affect each other, as explained in Section 2.2.2.



6.2 Cost of observation and repair

As given by Equation 4.1, each component has two costs associated with it: the cost of observation $C_i^o$ and the cost of repair $C_i^r$. These are static values, independent of the state of the system. To achieve a successful troubleshooting result and a cost-efficient action plan, each component needs to be associated with both $C_i^o$ and $C_i^r$. Also, when a faulty component is discovered and repaired, the system is observed once more, at the cost $C^p$, to determine whether the repair was successful. This is also considered a static value.

6.2.1 Cost of observation

The cost of observation is the cost of observing a component. It is taken as the standard time to investigate the component (including dismounting or disassembling if necessary) multiplied by the hourly rate of a mechanic, which gives the total cost for a mechanic to observe the component. The hourly rate is approximated as ∼150 SEK.
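As a minimal sketch of this calculation (not taken from the thesis implementation), the cost can be computed as standard time multiplied by hourly rate. The function names and the half-hour example are illustrative assumptions; only the ∼150 SEK hourly rate comes from the text.

```python
HOURLY_RATE_SEK = 150.0  # approximate mechanic rate stated in the text


def observation_cost(standard_time_h, hourly_rate=HOURLY_RATE_SEK):
    """Cost of observing a component: standard time (hours) x hourly rate."""
    return standard_time_h * hourly_rate


def implied_standard_time_min(cost_sek, hourly_rate=HOURLY_RATE_SEK):
    """Standard time in minutes implied by a cost, assuming pure labour cost."""
    return 60.0 * cost_sek / hourly_rate


# Hypothetical example: a component that takes half an hour to inspect.
print(observation_cost(0.5))
# Reverse direction: the time implied by an observation cost of 105 SEK.
print(implied_standard_time_min(105.0))
```

Under the assumption that an observation cost consists of labour only, a cost of 105 SEK would correspond to 42 minutes of standard time.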

6.2.2 Cost of repair

The cost of repair is the cost of fully repairing or replacing the component (the lowest cost is considered here), including the cost of spare parts and the cost of mechanics and other personnel (calculated in the same way as the cost of observation). Note that the standard time for repair or replacement differs from that for observation.



6.2.3 ICL component costs

Table 6.5 below presents the costs of observation and repair.

| Component | Obs. cost (SEK) | Rep. cost (SEK) |
|---|---|---|
| Coolant thermostat | 105 | 365 |
| EMS | 205 | 2650 |
| ICL | 140 | 1950 |
| Radiator fan | 95 | 240 |
| Radiator and hoses | 125 | 230 |
| Temperature sensors except coolant | 110 | 315 |
| Coolant temperature circuit | 120 | 400 |
| Engine coolant temperature sensor | 120 | 375 |
| Engine coolant temperature sensor cables | 125 | 280 |
| Yellow CAN bus | 165 | 1300 |
| Red CAN bus | 165 | 1300 |

Table 6.5: The observation and repair cost for the components in the Bayesian model

6.3 Troubleshooting methods

The information from Section 6.2 above determines the order in which the troubleshooting proceeds. But as mentioned in Section 4.6, the tests must also be taken into consideration in order to minimize the cost and achieve an efficient troubleshooting process. The tests are given for the ICL, but as mentioned earlier it is difficult to separate the ICL entirely from other ECUs, which is why some of the tests also affect other ECUs.

6.3.1 Testing methods

The tests (also called methods) are presented below in Table 6.6.



| Action | Description | Test cost (SEK) |
|---|---|---|
| Swipe instrumental cluster | Initiates a swipe of all gauges in the ICL | 100 |
| Check coolant temperature sensor | Check the function of the temperature sensor with SDP3 and a multimeter | 150 |
| Check thermostat | Check the function of the thermostat in a vessel of boiling water | 105 |
| Set temperature in EMS | Manipulate the sensor value for the temperature in the EMS and check for response | 50 |

Table 6.6: The tests for the instrumental cluster

6.3.2 Repair methods

When the action plan calls for a component to be repaired or renewed, a set of repair methods is used. Again, it is difficult to separate the ICL from the rest of the system, as it is interdependent with other ECUs. Hence, the repair methods might not be directly connected to the ICL.

The repair methods are given in Table 6.7 below.



| Action | Type | Description |
|---|---|---|
| Renew instrumental cluster | Renew | Schedules renewal of the instrument cluster for the next workshop visit. |
| Renewal - Glass, instrumental cluster | Renew | Schedules renewal of the glass panel of the instrument cluster for the next workshop visit. |
| Renew coolant temperature sensor | Renew | Schedules renewal of the coolant temperature sensor for the next workshop visit. |
| Renew thermostat | Renew | Schedules replacement of the thermostat for the next workshop visit. |
| Renew EMS | Renew | Schedules renewal and programming of the EMS control unit for the next workshop visit. |
| Renew Coordinator Unit (COO) | Renew | Schedules renewal and programming of the COO control unit for the next workshop visit. |
| Troubleshoot CAN system | Repair | Schedules troubleshooting of the CAN system with SDP3 for the next workshop visit. |

Table 6.7: The repair methods for the instrumental cluster



6.4 The final Bayesian network model

The final model is given in Figure 6.1 below and is also reproduced in Appendix A. The model consists of component and observation nodes. The internal nodes (as described in Table 6.1.1) are colored purple in the model. They are included to simplify the model and make it easier to construct.

Figure 6.1: The Bayesian network model of the ICL

6.5 Troubleshooting algorithm

The troubleshooting algorithm used in this thesis is a so-called one-step look-ahead algorithm. It incorporates the possibility of evaluating tests in the troubleshooting process using Equation 4.3. For each action, the ECR is compared to the ECRT of every available test to determine which approach gives the lowest estimated cost.

The comparison determines which action is best as the next step; hence the name one-step look-ahead, as the algorithm only sees one step ahead in time. A more complex troubleshooting algorithm could be used, but a one-step algorithm is sufficient for the purposes of this thesis. Tests provide information that can isolate a fault more quickly and cheaply than relying on the ECR and the basic troubleshooting strategy alone. The evaluation is made for every action, and the ECR and ECRT are updated for every new set of evidence in the Bayesian network.
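The comparison can be illustrated with a small sketch. Equations 4.1 and 4.3 are not repeated in this chapter, so the sketch assumes the common form from the decision-theoretic troubleshooting literature [8]: a single fault, components inspected in descending order of probability per observation cost, and a binary test whose expected cost is its own cost plus the outcome-weighted ECR. The thesis's own equations may differ in detail. The observation, repair and test costs are taken from Tables 6.5 and 6.6; the fault probabilities, the test behaviour, the default value of 0 for $C^p$ and all function names are hypothetical.

```python
def ecr(probs, obs_cost, rep_cost, c_p=0.0):
    """Expected cost of an observe-then-repair sequence (single-fault assumption).

    Components are inspected in descending order of probability / observation cost.
    """
    order = sorted(probs, key=lambda c: probs[c] / obs_cost[c], reverse=True)
    total, remaining = 0.0, 1.0  # remaining = P(fault not yet found)
    for c in order:
        total += remaining * obs_cost[c] + probs[c] * (rep_cost[c] + c_p)
        remaining -= probs[c]
    return total


def ecrt(probs, obs_cost, rep_cost, test_cost, p_fail, c_p=0.0):
    """Expected cost when a binary test is performed first.

    p_fail[c] is P(test fails | component c is the faulty one).
    """
    p_outcome_fail = sum(probs[c] * p_fail[c] for c in probs)
    total = test_cost
    for p_outcome, lik in (
        (p_outcome_fail, p_fail),
        (1.0 - p_outcome_fail, {c: 1.0 - p_fail[c] for c in probs}),
    ):
        if p_outcome > 0.0:
            # Bayes' rule: fault probabilities given this test outcome.
            posterior = {c: probs[c] * lik[c] / p_outcome for c in probs}
            total += p_outcome * ecr(posterior, obs_cost, rep_cost, c_p)
    return total


def next_step(probs, obs_cost, rep_cost, tests, c_p=0.0):
    """One-step look-ahead: pick the cheapest of 'no test' and each single test."""
    best = ("observe", ecr(probs, obs_cost, rep_cost, c_p))
    for name, (cost, p_fail) in tests.items():
        candidate = ecrt(probs, obs_cost, rep_cost, cost, p_fail, c_p)
        if candidate < best[1]:
            best = (name, candidate)
    return best


# Costs in SEK from Tables 6.5 and 6.6; probabilities and test behaviour are made up.
obs_cost = {"ICL": 140, "EMS": 205, "Coolant thermostat": 105}
rep_cost = {"ICL": 1950, "EMS": 2650, "Coolant thermostat": 365}
probs = {"ICL": 0.5, "EMS": 0.3, "Coolant thermostat": 0.2}  # hypothetical posteriors
tests = {
    "Swipe instrumental cluster": (100, {"ICL": 1.0, "EMS": 0.0, "Coolant thermostat": 0.0}),
    "Set temperature in EMS": (50, {"ICL": 0.0, "EMS": 1.0, "Coolant thermostat": 0.0}),
}

print(next_step(probs, obs_cost, rep_cost, tests))
```

With these made-up numbers, the sketch selects the test "Set temperature in EMS": its expected cost is about 2073.5 SEK, against 2097 SEK for proceeding without a test and 2127 SEK for the gauge sweep test. The sweep test is rejected because its cost exceeds the observation cost it saves, which is the kind of trade-off the comparison is meant to capture.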



Figure 6.2: One step horizon troubleshooting flowchart

A noteworthy property of the algorithm is that it is biased in how it treats tests. After each evaluation, it decides whether a test is good or bad, and it only considers performing the test now or never. It does not include the option of performing a test later in the process, which weakens its efficiency to some degree [49].


Chapter 7

Results and verification

In this chapter, the results of the implementation in Chapter 6 are presented. Furthermore, the modelled system is verified to investigate how (or if) the requirements in Section 1.2.1 were fulfilled.

7.1 System integration

In this chapter, “integration testing” refers to testing of the complete system, in which all three subsystems (see Section 1.3.1) are integrated. The user interface of the integrated system is displayed in Figure 7.1. The integrated system is named Assistans+, as it is considered an advancement of the assistance service provided today.

For the diagnosis, three likelihood levels, separated by two threshold values, are considered. For components with a posterior probability of 0.7 or greater, the likelihood is set to most likely (red color). For a posterior probability greater than 0.1 but less than 0.7, the likelihood is set to less likely (orange color). For a posterior probability of 0.1 or less, the likelihood is set to not likely (green color).
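The threshold rule can be written down as a short function. This is an illustrative sketch rather than the code of the thesis application; the function name and the returned tuple format are assumptions, while the threshold values and colors are those stated above.

```python
def likelihood_label(posterior):
    """Map a posterior fault probability to the label and color used in the GUI."""
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must lie in [0, 1]")
    if posterior >= 0.7:
        return ("most likely", "red")
    if posterior > 0.1:
        return ("less likely", "orange")
    return ("not likely", "green")


# The boundary values 0.7 and 0.1 fall into the upper and lower class respectively.
for p in (0.85, 0.7, 0.12, 0.1):
    print(p, likelihood_label(p))
```

Note that the boundaries are asymmetric: a posterior of exactly 0.7 is classified as most likely, whereas a posterior of exactly 0.1 is classified as not likely.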

7.2 Verification

For unit testing (each subsystem is considered a “unit”), integration testing and verification, a number of simulations were made of different user scenarios. These are described in the sections below. The difference between verification and validation is sometimes difficult to distinguish. Below is a description proposed by Wallace et al. [50]:

Validation: Are we building the right system?

Verification: Are we building the system right?


Figure 7.1: The graphical user interface (GUI) of the diagnostics demonstrator

7.2.1 Equivalence partitioning

In Table 7.1 below are the chosen input conditions and the corresponding output states (valid

and invalid). The chosen inputs are based on an expected user scenario (given from weekly

meetings at Scania). The valid output (component) is expected to be faulty (either most likely

or less likely) given the input, and the invalid outputs are expected to be non-faulty given the

input. In this test case, three DTC inputs and one visual input are considered.



| Input (Code) | Valid output (Code) | Invalid outputs (Code) | Classes covered |
|---|---|---|---|
| DTC (0234) | Red CAN bus (50) | ICL (O1); Radiator fan (A1) | DTC (0303, 0305, 0255, 029F, D100) |
| DTC (0973) | ICL (O1) | Yellow CAN bus (71); Radiator fan (A1) | DTC (0972, 0971) |
| DTC (0128) | Coolant temp. circuit (T33) | ICL (O1); Radiator and hoses (A2) | DTC (1133, 1132, 1135, 0115) |
| Visual: Temp. gauge doesn’t sweep (1) | ICL (O1) | EMS (E44); Temperature sensors (A4) | Random |

Table 7.1: Input conditions for equivalence partitioning with one valid and two invalid outputs. The test case was conducted on the integrated system.

The tests were conducted on the integrated system (user interface given in Figure 7.1) and an

example can be seen in Figure 7.2.

7.2.2 Comparison testing

For the comparison testing, the same inputs were used as in Section 7.2.1. Here, however, the output was compared with the outcome from GeNIe, to see whether the model and the thesis application gave the same result as the GeNIe application. The threshold values given in Section 7.1 are used here as well.

| Input (Code) | Target comp. (Code) | Assistans+ output | GeNIe probability |
|---|---|---|---|
| DTC (0234) | Red CAN bus (50) | Less likely | 0.12 |
| DTC (0973) | ICL (O1) | Most likely | 1 |
| DTC (0128) | Coolant temp. circuit (T33) | Most likely | 0.7 |
| Visual: Temp. gauge doesn’t sweep (1) | ICL (O1) | Most likely | 0.85 |

Table 7.2: A comparison between Assistans+ and GeNIe outputs for a given set of inputs.



Figure 7.2: The diagnosis, given fault code (DTC) 0234. The less likely components are unordered and, as expected from Table 7.1, the Red CAN bus is indicated as faulty.

The troubleshooting system in Assistans+ behaves in accordance with the expected values. However, this cannot be taken to mean that the model itself is a valid representation of the system. The comparison test only compares the two outputs and says nothing about the validity of the output itself. To establish validity, the comparison must be made against a physical vehicle, with the assistance of a skilled mechanic or domain expert.

7.2.3 Fuzz testing

The goal of the fuzz testing simulation was to see how the application and model would react to random inputs. The thesis application was tested randomly, changing variables and inputs to see how the application behaved. If it crashed or hung, the test was considered a failure. The outcome of the negative testing was only examined holistically, since the inputs were random and no expected output could therefore be defined for them.
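The idea of fuzz testing can be sketched as a small harness that feeds random sets of observations to a diagnosis routine and records every input that makes it fail. This is not the procedure used in the thesis, where the testing was carried out on the application itself; the function names and the stand-in `fragile_diagnose` are hypothetical, and only the DTC codes are taken from Table 6.3.

```python
import random

# DTC codes of the observation nodes listed in Table 6.3.
DTC_CODES = ["0303", "0300", "D100", "0305", "0971", "0256", "9999", "0115",
             "1135", "0118", "0117", "1132", "0234", "02A0", "029F", "0255"]


def fuzz(diagnose, runs=200, seed=0):
    """Feed random sets of DTCs to `diagnose` and collect the inputs that crash it."""
    rng = random.Random(seed)  # fixed seed so that failures can be reproduced
    failures = []
    for _ in range(runs):
        evidence = rng.sample(DTC_CODES, rng.randint(0, len(DTC_CODES)))
        try:
            diagnose(evidence)
        except Exception as exc:
            failures.append((evidence, repr(exc)))
    return failures


def fragile_diagnose(evidence):
    """Hypothetical stand-in that fails when several inputs arrive at once."""
    if len(evidence) > 1:
        raise RuntimeError("cannot process multiple inputs at once")
    return {}


failures = fuzz(fragile_diagnose)
print(f"{len(failures)} of 200 random inputs caused a failure")
```

A harness of this kind only detects crashes that surface as exceptions. Detecting a hang would additionally require running each call with a timeout, which is omitted here for brevity.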



The conclusion from the negative testing is that the application froze when the diagnosis was updated with new inputs without the application being allowed to update once for each input. In other words, when multiple inputs (observations) were forced at the same time, the application froze and had to be restarted. However, this is only a weakness in the framework and environment the application runs in (mainly issues with JavaScript not being able to handle multiple threads), and not a property of the model or the troubleshooting algorithm. In principle, this should not pose a problem for the application. The model itself handles multiple inputs (evidence) very well, and when the same type of negative testing is performed in GeNIe, the failure is not reproduced.


Chapter 8

Conclusion

In conclusion, this thesis has shown that troubleshooting of heavy vehicles using Bayesian network models and decision-theoretic algorithms has large potential and should be investigated further in future research. The results were satisfactory, as shown by the test cases in Chapter 7. The limitations of the thesis implementation have already been discussed, namely the lack of a completely correct Bayesian model and the use of a sub-optimal troubleshooting algorithm (one-step look-ahead). Improving these is recommended for future work.

8.1 Discussion

In the implementation of the troubleshooting application in this thesis, the main areas for improvement are the correctness of the Bayesian model and the complexity of the troubleshooting algorithm. However, since the purpose of this thesis was to demonstrate how an integrated troubleshooting system can use Bayesian network models to prepare an action plan, it is natural that these were not as optimal as they could have been. The troubleshooting algorithm, as mentioned earlier, only looked one step ahead in time and never considered the possibility of conducting a test later. This limited the efficiency of the algorithm heavily, but the efficiency was sufficient for the implementation in this thesis.

Since the troubleshooting algorithm depends heavily on the output of the Bayesian network model and its probability distribution, it is concluded that successful troubleshooting (in the sense of minimizing both the repair cost and the downtime of the vehicle) requires both the model and the algorithm to be as close to optimal as possible. Hence, both are areas of focus for future work.


8.1.1 Bayesian models

One of the main challenges for successful model-based diagnostics lies in the method for creating the model. In this thesis the model was created manually, for the most part only from expert knowledge of the modelled system. As Scania is known for the modular structure and design of its trucks [51], each modification of a truck requires a new, modified model. It is not feasible to create the models manually if this were to be implemented on all trucks and versions. Hence, automated methods for creating Bayesian networks are called for, and future work should focus on this issue. Also, for a more accurate model, statistical fault data should be used to a larger extent than it has been in this thesis.

Moreover, the use of maximum-likelihood estimation [52] makes it possible to create (or learn) a network from sets of data and prior information. Hence, it is possible to automate the creation of the Bayesian networks. This can be applied in the following cases:

• Network structure is specified & Data does not contain missing values

• Network structure is not specified & Data does not contain missing values

• Network structure is specified & Data does contain missing values

• Network structure is not specified & Data does contain missing values
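For the first case in the list (specified structure, no missing values), maximum-likelihood estimation of a conditional probability table reduces to counting relative frequencies. The sketch below illustrates this for a single parent-child pair. The fault log is invented for the purpose of the example, the node names are hypothetical, and the other three cases need more elaborate methods that are not shown.

```python
from collections import Counter


def mle_cpt(records, parent, child):
    """Maximum-likelihood estimate of P(child | parent) from complete records."""
    pair_counts = Counter((r[parent], r[child]) for r in records)
    parent_counts = Counter(r[parent] for r in records)
    return {
        (p, c): n / parent_counts[p]
        for (p, c), n in pair_counts.items()
    }


# Made-up fault log: was the sensor faulty, and was DTC 0115 present?
records = (
    [{"sensor_faulty": True, "dtc_0115": True}] * 3
    + [{"sensor_faulty": True, "dtc_0115": False}] * 1
    + [{"sensor_faulty": False, "dtc_0115": True}] * 1
    + [{"sensor_faulty": False, "dtc_0115": False}] * 5
)

cpt = mle_cpt(records, parent="sensor_faulty", child="dtc_0115")
# Keys are (parent value, child value); the value is the estimated probability.
print(cpt[(True, True)])
```

In this invented log the DTC is present in three of the four records with a faulty sensor, so the estimate of P(DTC present | sensor faulty) is 0.75. With few records such estimates are unreliable, which is one reason for combining them with expert knowledge.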

8.2 Future work

For future work the author recommends focusing on the creation of accurate models. The troubleshooting process will only be as accurate as the underlying model. Here, statistical data from the error logs can be used more widely as a “teacher” of the models.

Another important aspect is the automation of model creation. As mentioned, Scania's modular design calls for a new model for each alteration of a truck, which is why it is not feasible to create the models by hand.

As for the question of using alternative learning models, such as neural networks, instead of Bayesian networks, there are both pros and cons with such an approach. As already investigated at Scania [13][11] (among others), Bayesian methods are suitable for this task, as the systems modelled are (mostly) non-deterministic. Hence, Bayesian networks provide well-grounded support because of their statistical nature. Given a sufficient amount of learning data, however, neural networks might prove to be the better choice, but this depends entirely on the amount and quality of the learning data provided to the network. Bayesian networks reduce this dependency by incorporating domain expert knowledge.


8.2.1 Linking to previous research

During the last couple of years, a lot of research has been conducted on the use of decision-theoretic methods and Bayesian networks for troubleshooting vehicles. Even in the pre-study, several theses (the majority from KTH) were found on the matter [13][11][53]. What this thesis brings to the table is that it covers all the different fields and merges them into a single system to truly demonstrate the power of guided troubleshooting and diagnostics.


Appendix A

Bayesian network model of the ICL

Figure A.1: The Bayesian network model of the ICL

Bibliography

[1] D. Heckerman. A tutorial on learning with Bayesian Networks. 1995.

[2] Volvo Trucks US. Remote diagnostics - real diagnostics. real people. real time., 2013. URL http://www.volvotrucks.com/trucks/na/en-us/business_tools/connectedvehicleservices/Pages/default.aspx.

[3] Nils J. Nilsson. Introduction to Machine Learning. 1998.

[4] Wikipedia. Artificial neural networks, March 2013. URL http://en.wikibooks.org/wiki/Artificial_Neural_Networks.

[5] D. Heckerman and J. Breese. Decision-theoretic case-based reasoning. page 838, 1996.

[6] G. Welch and G. Bishop. An introduction to the kalman filter. Technical report, University

of North Carolina at Chapel Hill, 2006.

[7] M. L. Schwall and J. C. Gerdes. A probabilistic approach to residual processing for vehicle

fault detection. American Control Conference, 3:2552–2557, 2002.

[8] J. Breese D. Heckerman and K. Rommelse. Decision-theoretic troubleshooting. pages 1–15,

1996.

[9] Amir. V-model, June 2013. URL http://www.testingexcellence.com/v-model/.

[10] B. Selic. The pragmatics of model-driven development. IEEE, 20:19–25, Sept.-Oct. 2003.

[11] Alexander Cyon. Modeling of fuel injection system for troubleshooting. Master’s thesis,

Royal Institute of Technology, 2012.

[12] Julio C. Ramirez and Antonio S. Piqueras. Learning bayesian networks for systems diagnosis. Robotics and Automotive Mechanics Conference, 2006.

[13] Thomas Gustavsson. Troubleshooting using cost effective algorithms and bayesian networks. Master’s thesis, Royal Institute of Technology, 2006.

[14] Håkan Warnquist. Computer-Assisted Troubleshooting for Efficient Off-board Diagnosis. PhD thesis, Linköpings universitet, 2011.


[15] Chien C. Lee. Fuzzy logic in control systems: Fuzzy logic controller - part 1. IEEE

Transactions on systems, Man and Cybernetics, 20:404, 1990.

[16] Lotfi A. Zadeh. Is there a need for fuzzy logic? Information Sciences - An international journal, (178):2751–2779, February 2008.

[17] Q. Yang J. Li and B. Yang. Dempster-Shafer Theory Is A Special Case of Vague Sets

Theory, 2004.

[18] Wikipedia. Dempster-shafer theory, 2013. URL http://en.wikipedia.org/wiki/Dempster%E2%80%93Shafer_theory.

[19] C. Stergiou and D. Siganos. Neural networks, May 2013. URL http://bit.ly/2AeP0i.

[20] Joao F. G. de Freitas. Bayesian methods for neural networks.

[21] J. Breese M. Henrion and E. Horvitz. Decision analysis and expert systems. AI Magazine, 1991.

[22] B. Middleton M. Pradhan, G. Provan and M. Henrion. Knowledge engineering for large

belief networks. Uncertainty in Artificial Intelligence: Proceedings of the Tenth Conference,

1994.

[23] K. Laskey and S. Mahoney. Network fragments: Representing knowledge for constructing probabilistic models. Uncertainty in Artificial Intelligence: Proceedings of the Thirteenth Conference, 1997.

[24] I. Nachman N. Friedman, M. Linial and D. Pe’er. Using bayesian networks to analyze expression data. Journal of Computational Biology, 77:601–620, 2000.

[25] F. Faltin F. Ruggeri and R. Kenett. Bayesian networks. Encyclopedia of Statistics in Quality and Reliability, 2007.

[26] H. Li and Y. Zhou. Fault diagnosis by bayesian network in system of poor braking efficiency. 4th International Conference on Intelligent Human-Machine Systems and Cybernetics, page 45, 2012.

[27] Jun Shao. Lectures from: STAT709 Mathematical Statistics. University of Wisconsin-

Madison, 2011.

[28] Q. Shen R. Daly and S. Aitken. Learning bayesian networks: approaches and issues. The

Knowledge Engineering Review, 26:99–157, 2011.

[29] T. D. Nielsen and F. V. Jensen. Bayesian Networks and Decision Graphs. Springer, second

edition, February 2007.

[30] U. Kjaerulff and A. Madsen. Probabilistic Networks An Introduction to Bayesian Networks

and Influence Diagrams. 2005.


[31] Ramsey Faragher. Understanding the basis of the kalman filter via a simple and intuitive

derivation. IEEE Signal Processing Magazine, 2012.

[32] M. I. Ribeiro and P. Lima. Introduction to kalman filtering. Technical report, Instituto Superior Técnico, 2008.

[33] K. W. Przytula and D. Thompson. Construction of bayesian networks for diagnostics. Aerospace Conference Proceedings, IEEE, 5:193–200, March 2000.

[34] Z. Qi and L. Yi. Improved bayesian network in steam turbine fault diagnosis. Industrial

Mechatronics and Automation (ICIMA), pages 465–468, May 2010.

[35] S. Venkatesh M. Hunt, B. von Konsky and P. Petros. Bayesian networks and decision

trees in the diagnosis of female urinary incontinence. Engineering in Medicine and Biology

Society, 1:551–554, June 2000.

[36] J.R. Clarke O. Ogunyemi and B. Webber. Using bayesian networks for diagnostic reasoning

in penetrating injury assessment. Computer-Based Medical Systems, pages 115–120, June

2000.

[37] R. Barco S. Munagala, L. Moltsen and P. Lazaro. Automated troubleshooting of satellite

communication ground equipment. Aerospace Conference, pages 1–10, March 2008.

[38] Wikipedia. Decision theory, 2013. URL http://en.wikipedia.org/wiki/Decision_theory.

[39] H. Raiffa. Decision Analysis: Introductory Lectures on Choice Under Uncertainty. Addison-

Wesley, 1968.

[40] L. Rokach and O. Maimon. Data mining and knowledge discovery handbook. 2010.

[41] T. Dean C. Boutilier and S. Hanks. Decision-theoretic planning: Structural assumptions

and computational leverage. Journal of Artificial Intelligence Research, 11:1–94, 1999.

[42] S. Balaji and Dr. M. Sundarajan Murugaiyan. Waterfall vs v-model vs agile: A comparative

study on sdlc. International Journal of Information Technology and Business Management,

2:28, 2012.

[43] Google Web Toolkit. Gwt project, 2013. URL http://www.gwtproject.org/overview.html.

[44] Antonio G. Hernández and María N. M. García. A javascript rdf store and application library for linked data client application. Technical report, Universidad de Salamanca, 2011.

[45] J. Lee and N. Jindal. Asymptotically optimal policies for hard-deadline scheduling over

fading channels. IEEE Transactions on Information Theory, 2009.


[46] I. Jovanovic. Software testing methods and techniques. pages 30–41, 2008.

[47] University of Pittsburgh. Genie and smile, 1998. URL http://genie.sis.pitt.edu/.

[48] University of Wisconsin. Fuzz testing of application reliability, 2008. URL http://pages.cs.wisc.edu/~bart/fuzz/fuzz.html.

[49] B. Kristiansen F. V. Jensen, U. Kjaerulff et al. The sacso methodology for troubleshooting complex systems. Hewlett-Packard Laboratory for Normative Systems, 2000.

[50] D. R. Wallace and R. U. Fujii. Software verification and validation: Its role in computer

assurance and its relationship with software project management standards. NIST Special

Publication, 1990.

[51] Scania. Easier customisation with scania’s modular system, 2013. URL http://www.scania.com/media/feature-stories/general/module-system.aspx.

[52] Stanford University. Lectures on statistical inference. 2013.

[53] Joakim Fenn. Fault code positioning. Master’s thesis, Uppsala Universitet, 2011.
