
Master Research Internship

Master Thesis

Cyber-Physical Systems and Mixed Simulations

Author: Tran Van Hoang

Supervisor: Professor Bernard Pottier

Abstract

Climate change has received much attention in recent years, and the need to predict and validate the behaviour of real systems and natural phenomena has become critical. Simulation is a good candidate for this mission.

However, the major problem is that modeling and simulating large, complicated physical systems is time-consuming. Although much commercial software now exists for such systems (water and forest modeling, for example), it requires considerable knowledge of the specific physical processes and of the study areas. As a first step, we therefore propose a practical way of modeling physical systems simply, especially natural systems, by using Cellular Automata (CA). The PickCell tool developed at the Lab-STICC laboratory facilitates that process. GPU computation and parallelism are proposed as an important part of this methodology, with the purpose of accelerating large physical simulations.

In addition, we propose the use of distributed simulations to deal with the lack of interoperability between simulations. To do that, we use the IEEE standard High Level Architecture (HLA) to design a system that supports mixed simulations based on synchronous systems. This also opens the way to simulating Cyber-Physical Systems.

Acknowledgments

I would like to give special thanks to Professor Bernard Pottier and to all of my colleagues at Lab-STICC, UBO. I appreciate their support, not only in research activities but also in my daily life.

Tran Van Hoang, Brest, France, 12/06/2015

Contents

1 Introduction
  1.1 Motivations and Objectives
  1.2 Cyber-Physical Systems (CPS)
  1.3 Cellular Automata (CA)

2 Physical simulations based on cell networks
  2.1 PickCell tool and cell networks
  2.2 Physical simulations based on cell networks
  2.3 Case study and applications
  2.4 Routing algorithm
  2.5 Remarks

3 Simulations with Cuda programming model
  3.1 GPU and Cuda programming model
  3.2 Accelerating simulations by using Cuda
  3.3 Details of GPU implementation of simulations
  3.4 Performance measurement principles

4 Distributed simulation with HLA
  4.1 Overview of The High Level Architecture (HLA)
  4.2 Time management in HLA
  4.3 Distributed physical simulation

5 Conclusion
  5.1 Contributions
  5.2 Future works

Bibliography

1 Introduction

1.1 Motivations and Objectives

Developing countries have suffered in recent years from natural disasters such as typhoons, tsunamis, fires, and floods. For example, in the Mekong Delta of Vietnam, the sea level is rising under the impact of climate change, which could flood the delta every year. Environmental surveillance and prediction of such phenomena have therefore become necessary. Simulation is a good approach for that purpose: it helps people make better decisions to prevent or mitigate the impacts.

In recent years, wireless sensor networks (WSN) have emerged as a good candidate for monitoring the environment. Several inspiring projects have been launched with the common aim of sensing the environment [10]. Sensors collect the status of physical systems and send status data to computer systems for processing and analysis. Some reactions are then sent back to the physical systems. Such an integration between physical systems and computer systems belongs to the field of Cyber-Physical Systems (CPS), presented in Section 1.2.

It is therefore necessary to consider the sensing process itself. The objective is to support and validate the operation of the WSN, especially when it is responsible for detecting dangerous accidents, for example when monitoring a chemical store located in a residential area. A composite model combining parallel simulations of the two sides of the CPS is thus developed.

However, modeling and simulating physical systems raises many issues. These systems are often huge and have complex behaviour, so designing the models requires a lot of effort. Moreover, the lack of interoperability is also a major challenge, because in the real world these systems always affect each other. For instance, the spread of a fire is influenced by several other factors, namely weather conditions, wind direction and speed, response capabilities, and the sensing performance of the wireless sensor network (WSN). In such a system, the model consists of many components (fire spreading, weather conditions, firefighters, and WSN). These components and their relations result in large-scale models, which are very difficult to maintain and adapt. These circumstances lead to:

• long run times for simulations.

• long development and testing times for such models.

• a huge effort for maintaining the models and adapting them to other perspectives.

• low flexibility and reusability.

Traditionally, there are two common approaches to handling these problems. One is to employ powerful hardware. The other is to break the model up into a set of submodels, which are distributed over different computer systems. However, these approaches are usually applied separately. In this project, we therefore use a hybrid approach that combines distributed models with parallel computation. It aims to handle physical systems of huge size and complex behavior.

This approach can be viewed under two main aspects. For the problem of computing performance, the use of parallel simulations based on the GPU is suggested. The GPU has been used in several studies over recent years to speed up large simulations. To deal with the lack of interoperability between simulations, we use the IEEE standard High Level Architecture (HLA), which gives independent simulations the ability to communicate with each other in the context of a synchronous system.

The thesis is divided into five chapters:

Chapter 1: Introduces the motivations and objectives of the study, and gives an overview of related concepts such as Cyber-Physical Systems (CPS) and Cellular Automata (CA). A description of the PickCell tool and its applications ends the chapter.

Chapter 2: Presents a new approach that simplifies the process of modeling physical systems. The approach is facilitated by the PickCell tool in accordance with CA.

Chapter 3: Describes the use of the Cuda programming model to simulate physical models. Some experiments are conducted to evaluate the feasibility of the solution.

Chapter 4: Uses the HLA standard to deal with the lack of interoperability between simulations. It enables parallel simulations to communicate with each other in the context of distributed systems.

Chapter 5: Summarises the contributions and presents future work.

1.2 Cyber-Physical Systems (CPS)

Cyber-Physical Systems (CPS) are integrations of computation and physical processes [9], [1], in which embedded computers and networks monitor and control the physical processes. They include feedback loops where physical processes affect computations and vice versa.


Figure 1.1: An example of Cyber-Physical System.

Figure 1.1 illustrates an example of a CPS that monitors accidents in a river (pollution, floods, landslides, or chemical spreading, for example). A WSN can be used to observe the status of the river via sensors. The sensors forward status data to computer systems, which carry out computations. An analysis of the computed results can trigger emergency operations, such as raising alarms or closing the basin, in the case of an accident. For the implementation of this type of system, one of the critical challenges is system integration. Therefore, to obtain interoperability between simulations, an integration solution is required.

In [22], the authors presented a co-simulation framework based on the HLA standard. That work focuses on integrating heterogeneous systems, designed with different tools and languages, as CPSs. However, the given prototype does not address natural phenomena or computation performance. The considerations in this project are thus expected to provide another perspective on phenomena simulations.

1.3 Cellular Automata (CA)

A Cellular Automaton (CA) is one of the techniques used to simulate complex physical systems, such as self-reproduction in biology or diffusion models in chemistry. The famous "Game of Life" illustrates that cellular automata are capable of producing dynamic patterns and structures [2], [3].

A major effort is made in [4] to show the advantages of using CA for modeling systems, especially natural phenomena. According to that work, CA models of phenomena are clearer, more accurate, and more complete than conventional mathematical systems. Moreover, the transition rules of CA models are often simpler than mathematical equations, yet the results produced are more comprehensive, and a CA can mimic the behaviour of any possible physical system.

A CA typically consists of two main components.

The first component is a cellular space: a lattice of cells, each with an identical pattern of local connections to other cells for input and output. The state of each cell is chosen from a finite set of states; in the simplest case, each cell has the binary state 1 or 0. A set of cells called the neighbourhood is defined relative to a given cell (the center). The states of the neighbours are used to calculate the next state of the center according to the defined rule. The number of neighbours depends on the pattern chosen in the modeling process.

The second component is a transition rule (CA rule), which gives the new state of each cell (at time t+1) according to its current state and the states of its neighbourhood (at time t). Typically, the rule for updating states is the same for all cells and does not change over time.

CA exist in various forms. The simplest CA is a one-dimensional lattice, in which all the cells are arranged in a line; the neighbours of a cell are then just the cells on its left and on its right. For the two-dimensional lattice, the most common types of neighbourhood are the Moore neighbourhood and the Von Neumann neighbourhood (see Figure 1.2).

Figure 1.2: Von Neumann and Moore neighbourhood (distance = 1).

In the Von Neumann neighbourhood, each cell has four neighbours: north (N), south (S), east (E), and west (W). Counting the cell itself, five binary cells are involved, which gives 2^5 = 32 possible patterns. In the Moore neighbourhood, nine cells are involved in total, so 2^9 = 512 patterns are possible. In both cases, the distance is one and the transition function is expected to give the next state for each pattern.
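The pattern count for the Von Neumann neighbourhood can be illustrated with a short C sketch. It packs the binary states of a cell and its four neighbours into a 5-bit index, so that a transition rule can be stored as a table of 32 entries. The function names, the row-major grid layout, and the choice of treating cells outside the grid as 0 are assumptions made for this example only; they are not taken from PickCell.

```c
#include <assert.h>
#include <stddef.h>

/* State of cell (r, c) in a row-major grid. Cells outside the grid count
   as 0, which is an assumed boundary condition for this sketch. */
static int cell_at(const unsigned char *grid, int rows, int cols, int r, int c)
{
    if (r < 0 || r >= rows || c < 0 || c >= cols)
        return 0;
    return grid[(size_t)r * (size_t)cols + (size_t)c] & 1;
}

/* Pack centre, N, S, E, W into a 5-bit pattern index in the range 0..31. */
int vn_pattern(const unsigned char *grid, int rows, int cols, int r, int c)
{
    int centre = cell_at(grid, rows, cols, r, c);
    int north  = cell_at(grid, rows, cols, r - 1, c);
    int south  = cell_at(grid, rows, cols, r + 1, c);
    int east   = cell_at(grid, rows, cols, r, c + 1);
    int west   = cell_at(grid, rows, cols, r, c - 1);
    return (centre << 4) | (north << 3) | (south << 2) | (east << 1) | west;
}

/* One synchronous step: every cell reads the old grid and writes the new
   one, using a rule table with one entry per pattern. */
void vn_step(const unsigned char *old_grid, unsigned char *new_grid,
             int rows, int cols, const unsigned char rule[32])
{
    assert(rows > 0 && cols > 0 && old_grid != new_grid);
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            new_grid[(size_t)r * (size_t)cols + (size_t)c] =
                rule[vn_pattern(old_grid, rows, cols, r, c)];
}
```

The same idea extends to the Moore neighbourhood with a 9-bit index and a table of 512 entries.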

Therefore, in order to model systems with this approach, two components need to be provided: the cellular space and the transition rules (the behavior). The next chapter briefly presents an approach proposed at the Lab-STICC laboratory to automatically generate cellular spaces (cell networks) from geographic data. The input data and the behavior of each cell are determined later, according to the aspects of interest of a given physical system.

2 Physical simulations based on cell networks

This chapter presents a brief description of PickCell, a tool that generates cell networks of physical systems, together with the structure of these networks. We then propose a methodology for developing physical simulations in terms of cell networks. Lastly, some cases are examined to demonstrate the use of the proposed methodology.

2.1 PickCell tool and cell networks

2.1.1 PickCell tool

PickCell is a modeling tool that has been developed at Lab-STICC in recent years (see [8] for more details). It accesses geographic data from various public sources as input, namely Google Maps, OpenStreetMap, or even picture files. The tool analyzes and processes these data to generate cell network structures of physical processes.

The main feature of the tool is the extraction of visible properties (potential physical systems) from geographic data, such as rivers, forests, or road systems. The process starts from the input data. The final result is a set of separate physical systems, each represented by a cell network, as presented in Section 2.1.2. Generally, this process is performed in three main steps:

• Preprocessing data: Geographic data are usually not well presented, especially in the case of satellite and aerial images. At this step, the tool increases the contrast of the data to serve the following steps.

• Segmenting data into cells: In order to isolate regions of interest in the data, such as rivers or roads, the data are divided into small cells. Their size (x, y) depends on the objective of the desired model, where x and y represent the width and the height of a cell, respectively. For the same size of input data, small x and y values give a large number of cells, and vice versa.

• Recognizing similar cells and grouping into layers: Typically, the tool uses the three standard color components (Red-Green-Blue) to classify the cells into defined layers. Each layer contains a set of cells with similar colors. Next, the relations between the cells of the same layer are defined according to a chosen CA pattern. As a result, for each layer, we have a set of cells organized as a network through their relations (or links). These sets are called cell networks. The details of cell networks are presented in the next section.
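The segmentation and grouping steps above can be sketched in C as follows. This is only an illustration under stated assumptions: partial cells at the image edges are counted as full cells, and a cell is assigned to the layer whose reference colour is nearest to its mean RGB colour. Both choices, and all names (cell_count, nearest_layer, Rgb), are hypothetical and do not describe the actual PickCell implementation.

```c
#include <assert.h>

/* Number of cells obtained when an image of img_w by img_h pixels is cut
   into cells of x by y pixels. Partial cells at the right and bottom edges
   are counted as full cells (an assumption of this sketch). */
long cell_count(int img_w, int img_h, int x, int y)
{
    assert(img_w > 0 && img_h > 0 && x > 0 && y > 0);
    long cols = (img_w + x - 1) / x;
    long rows = (img_h + y - 1) / y;
    return cols * rows;
}

typedef struct { int r, g, b; } Rgb;

/* Hypothetical layer classification: return the index of the layer whose
   reference colour has the smallest squared RGB distance to the cell. */
int nearest_layer(Rgb cell, const Rgb *layers, int n_layers)
{
    assert(n_layers > 0);
    int best = 0;
    long best_d = -1;
    for (int i = 0; i < n_layers; i++) {
        long dr = cell.r - layers[i].r;
        long dg = cell.g - layers[i].g;
        long db = cell.b - layers[i].b;
        long d = dr * dr + dg * dg + db * db;
        if (best_d < 0 || d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

For a 100 by 50 pixel image, halving the cell size from 10 by 10 to 5 by 5 multiplies the number of cells by four (from 50 to 200), which is the trade-off described in the segmentation step.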

2.1.2 Cell network

As mentioned previously, a cell network is a group of cells and the relations between them. Each cell typically holds data consisting of four elements: an identity, a local state (such as pollution density, insect population, or geographic position), links to other cells (its neighbours), and the relative positions of its neighbours. The last element means that a cell is capable of determining the direction of each neighbour, which can be located to the east, west, north, or south. This property can be useful in various situations, such as simulating the weather or the flow of a fluid. For the sake of simplicity, directions can be encoded as pairs of numbers, as shown in Table 2.1.

Direction   Value
East        (1,0)
West        (-1,0)
North       (0,-1)
South       (0,1)

Table 2.1: A proposed organization of directions in a cell network.

Table 2.2 shows an example of a cell network generated by the PickCell tool; only the column named "Pollution Density" holds data that are not generated by the tool. These data can be loaded at the beginning of a simulation or at runtime.


No.   Cell Id   Pollution Density   Neighbour Id      Directions
1     0         100                 590, 25, 1, 600   (-1,0), (1,0), (0,-1), (0,1)
2     1         50                  589, 0            (-1,0), (0,1)
...   ...       ...                 ...               ...
26    25        10                  0, 26             (-1,0), (0,1)
...   ...       ...                 ...               ...
591   590       50                  2, 0, 589         (-1,0), (1,0), (0,-1)
...   ...       ...                 ...               ...
601   600       78                  0                 (1,0)

Table 2.2: A cell network structure of 601 cells generated by the PickCell tool (Von Neumann 1 CA).
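One possible in-memory layout for a row of Table 2.2 is sketched below in C. The type and field names (Cell, Direction, neighbour_towards) are assumptions made for illustration; the structures actually generated by PickCell may differ.

```c
#include <assert.h>

#define MAX_NEIGHBOURS 4   /* Von Neumann pattern with distance 1 */

typedef struct { int dx, dy; } Direction;   /* e.g. East = (1, 0) */

typedef struct {
    int id;                              /* cell identity */
    int pollution;                       /* local state */
    int n_neighbours;                    /* number of links actually used */
    int neighbour_id[MAX_NEIGHBOURS];    /* links to other cells */
    Direction dir[MAX_NEIGHBOURS];       /* relative position of each neighbour */
} Cell;

/* Return the id of the neighbour lying in direction (dx, dy),
   or -1 if the cell has no neighbour in that direction. */
int neighbour_towards(const Cell *cell, int dx, int dy)
{
    assert(cell->n_neighbours >= 0 && cell->n_neighbours <= MAX_NEIGHBOURS);
    for (int i = 0; i < cell->n_neighbours; i++)
        if (cell->dir[i].dx == dx && cell->dir[i].dy == dy)
            return cell->neighbour_id[i];
    return -1;
}
```

With the first row of Table 2.2 loaded into such a structure, asking cell 0 for its eastern neighbour (1,0) returns cell 25.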

The use of cell networks brings several advantages in developing physical simulations. Firstly, each cell network is a clear and consistent structure. All its cells come from the same physical system; they hold the same type of local data and have the same behaviour. This structure resembles a class in OOP (Object-Oriented Programming), with the cells as objects instantiated from that class. From a software engineering point of view, it is thus especially useful for maintaining systems: it is simple to add necessary properties to the states or transitions of the models.

Secondly, cell networks generated by the PickCell tool help to tackle a limitation of the usual input data. Many phenomena simulations use raster data as the input for their models, and it is often difficult to distinguish the regions of interest in this type of data. This limitation causes useless computations outside those regions. For example, in [23], data cells that do not belong to the real area of interest (rivers) are marked "NoData" in the preprocessing step. Models built from cell networks avoid this useless processing by default.

In addition to the cell network structure, the PickCell tool can also extract visual data, which is useful for displaying and analyzing simulation results. Figure 2.1 shows how a river system is displayed from extracted visual data. In its current version, the tool generates two-dimensional data in the format of two concurrent programming languages, Cuda and Occam [6], [7]. A third dimension, for elevation, is planned for the next version.


Figure 2.1: A cell network of a river system generated from PickCell tool with Von Neumann 1.

In short, cell networks generated by the PickCell tool serve as skeletons for simulation models. To obtain a complete model with this approach, two other components need to be considered: input data and transition rules. These are presented in the next section.

2.2 Physical simulations based on cell networks

The cell network structure presented earlier is one of the main components of this methodology. Each model has at least three components: a cell network, input data, and a transition rule. The first is generated from geographic data with the help of the PickCell tool, whereas the other two are defined according to the characteristics of the physical system.

Figure 2.2 summarizes the methodology. The process has three main steps. It begins with geographic data, which are processed by the PickCell tool to generate a cell network. The cell network is then associated with input data and a transition rule to make up a complete model. Lastly, this model is executed by a simulator.

Currently, cell networks are generated in two versions, as Cuda and Occam code. Cuda was chosen in this work because its programming model is well suited to the cell network structure.


Figure 2.2: A summary of the proposed process which is used to conduct physical simulations.

2.3 Case study and applications

This section describes a case study applied to a study region: a small area located in the Mekong Delta of Vietnam, as shown in Figure 2.3. It contains three physical systems: river, forest, and road. The first two, the river system and the forest system, were considered in this project.

As applications of the proposed approach, two models are built from the study region. One is a model of forest fire spread; the other is a model of river pollution diffusion. In addition, we assume that a wireless sensor network (WSN) is used to monitor the status of the forest, so a model of the WSN is also developed. Details of the three models are described later in this section.

Another assumption is that these three systems interact. One interaction happens when the fire spreads close to the river: ashes from the fire then pollute the river. Meanwhile, when the sensors of the WSN detect fire near them, they raise emergency signals. This scenario is clarified and used as an application of the solution presented in Chapter 4.


Figure 2.3: The study region: a small area in the Mekong Delta, in the south of Vietnam (data source: OpenStreetMap [16]).

In reality, models use many elements of input data and transition rules are often very complicated, since the goal is to make simulations as realistic as possible. In our case, however, only some basic characteristics are selected to demonstrate the feasibility of the proposed methodology. The input data and the transition rule of each model are presented as follows.

2.3.1 The diffusion of pollution in the river

This model simulates the diffusion of pollution in a river. Pollution can arise in various situations, such as chemical, oil, or other contaminant spills. In all of them, the diffusion depends strongly on the density, so the pollution density is kept as the input data for this model. Each cell contains an amount of pollution density, which represents the cell state. The states change according to the transition rule.

• Input data: Pollution density.

• Transition rule: At every time step, to reach its new state at time t+1, each cell performs the following tasks in sequence:

– If its local density is larger than zero, a randomly chosen amount is subtracted from it. That amount is transported in equal shares to its neighbours.

– Next, it receives the shares sent by its neighbours.

– Finally, the additions and the subtraction are applied to the state to prepare for the next step (time t+1).
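The transition rule above can be sketched as a plain C function. This is a minimal illustration, not the implementation used in the thesis: to keep the example deterministic, the randomly chosen amount is replaced by a fixed fraction (the hypothetical parameter rate), links are assumed to be symmetric, and the names Links and diffuse_step are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEIGHBOURS 4

typedef struct {
    int n_neighbours;
    int neighbour[MAX_NEIGHBOURS];   /* indices of linked cells */
} Links;

/* One synchronous diffusion step over n cells.
   now[i]  : density of cell i at time t
   next[i] : density of cell i at time t+1 (must not alias now)
   rate    : fraction of the local density sent out per step, in [0, 1].
             It is fixed here; the model in the text draws the amount at
             random. Links are assumed symmetric (if j is a neighbour of i,
             then i is a neighbour of j). */
void diffuse_step(const double *now, double *next, const Links *links,
                  size_t n, double rate)
{
    assert(rate >= 0.0 && rate <= 1.0 && now != next);
    for (size_t i = 0; i < n; i++) {
        /* Subtraction: what cell i gives away (nothing without neighbours). */
        double out = 0.0;
        if (now[i] > 0.0 && links[i].n_neighbours > 0)
            out = now[i] * rate;

        /* Addition: equal shares sent by each neighbour of cell i. */
        double in = 0.0;
        for (int k = 0; k < links[i].n_neighbours; k++) {
            int j = links[i].neighbour[k];
            if (now[j] > 0.0)
                in += now[j] * rate / links[j].n_neighbours;
        }

        /* Update: apply both to obtain the state at time t+1. */
        next[i] = now[i] - out + in;
    }
}
```

Because every amount that leaves a cell arrives at one of its neighbours, the total density of the network is conserved from one step to the next under these assumptions.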



2.3.2 The fire spread in the forest

This model simulates the spread of fire in a forest. It is reproduced from a sample in CORMAS [15]. Each cell has four possible states: tree, fire, ash, and empty. At the beginning, some cells are initialized with the state fire, while the others are tree.

• Input data: Tree, fire, ash, and empty.

• Transition rule:

– If a cell is tree at time t, it will become fire at time t+1 if at least one of its neighbours is fire.

– If a cell is fire at time t, it will become ash at time t+1.

– If a cell is ash at time t, it will become empty at time t+1.
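The three rules above translate directly into a per-cell update. The following C sketch applies them synchronously over a cell network; the type and function names are assumptions for illustration, and an empty cell is assumed to stay empty, which the rules imply but do not state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEIGHBOURS 4

typedef enum { TREE, FIRE, ASH, EMPTY } ForestState;

typedef struct {
    int n_neighbours;
    int neighbour[MAX_NEIGHBOURS];   /* indices of linked cells */
} ForestLinks;

/* One synchronous step: all cells read the states at time t (now)
   and write their state at time t+1 (next). */
void fire_step(const ForestState *now, ForestState *next,
               const ForestLinks *links, size_t n)
{
    assert(now != next);
    for (size_t i = 0; i < n; i++) {
        switch (now[i]) {
        case TREE: {
            /* A tree catches fire if at least one neighbour is burning. */
            int burning = 0;
            for (int k = 0; k < links[i].n_neighbours; k++)
                if (now[links[i].neighbour[k]] == FIRE)
                    burning = 1;
            next[i] = burning ? FIRE : TREE;
            break;
        }
        case FIRE:
            next[i] = ASH;
            break;
        case ASH:
            next[i] = EMPTY;
            break;
        default:
            next[i] = EMPTY;   /* assumed: empty cells stay empty */
            break;
        }
    }
}
```

On a line of four cells with the first one burning, the fire front moves one cell per step and leaves ash, then empty cells, behind it.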

2.3.3 Wireless sensor network (WSN)

In this study, the WSN plays the role of a sensing component. It regularly collects raw data from the environment, processes those data, and raises an emergency alert when fire is detected. The WSN monitors the status of the forest. To do that, a set of sensors is deployed along the forest border, because our concern is the spread of the fire to other systems. We use a simple distributed algorithm for the deployment of sensors, described in Section 2.4. The resulting WSN is shown in Figure 2.4.

Figure 2.4: Deploying sensors along the forest border extracted from the study region with the 4-neighbour pattern. The communication range and the sensing range are 25 and 5 cell units, respectively.

Typically, sensors have two types of range. The sensing range indicates the sensing capacity of the sensor and can be small. The communication range can be longer, thanks to radio link technology. When deploying sensors, it is thus necessary to make sure that they remain connected to each other, given the value of the communication range.

• Input data: Sensing data.

• Transition rule: At every step, the nodes check the data received from the forest fire simulation. If fire is detected at some points, signals are raised.
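The two ranges and the detection rule can be sketched in C as follows. The sketch assumes that positions are expressed in cell units and that both ranges are Euclidean distances; neither assumption is stated in the text, and the names CellPos, sensor_detects_fire, and sensors_connected are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } CellPos;   /* position in cell units */

/* Returns 1 if at least one burning cell lies within sensing_range
   (Euclidean distance, assumed) of the sensor, and 0 otherwise. */
int sensor_detects_fire(CellPos sensor, int sensing_range,
                        const CellPos *burning, size_t n_burning)
{
    assert(sensing_range >= 0);
    long r2 = (long)sensing_range * sensing_range;
    for (size_t i = 0; i < n_burning; i++) {
        long dx = burning[i].x - sensor.x;
        long dy = burning[i].y - sensor.y;
        if (dx * dx + dy * dy <= r2)
            return 1;
    }
    return 0;
}

/* Two sensors can communicate directly when the distance between them
   does not exceed the communication range. */
int sensors_connected(CellPos a, CellPos b, int comm_range)
{
    assert(comm_range >= 0);
    long dx = a.x - b.x;
    long dy = a.y - b.y;
    return dx * dx + dy * dy <= (long)comm_range * comm_range;
}
```

With the values of Figure 2.4 (sensing range 5, communication range 25), a sensor reports a fire only when it is a few cells away, while it can still relay signals to sensors placed much further along the border.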

2.4 Routing algorithm

This section presents a routing algorithm implemented in parallel. To take advantage of GPU computation, a new version of this algorithm was implemented in Cuda, starting from an Occam program. The resulting routing tables can be used for deploying sensors, as described previously.

We assume that the network has the same shape and structure as the cell network introduced in Section 2.1.2. It consists of n nodes, numbered 0 to n-1; these numbers serve as the node identities, as shown in Figure 2.5. Two elements are associated with each node: a route table and a temperate table. The route table stores the identity of the node itself and of the other nodes it has reached after t steps. The structure of this table is presented in Table 2.3. The temperate table contains only the identities of the new nodes reached at the current step. After each step, the values held by the temperate table are therefore completely replaced by new ones, while the route table either receives new records or remains unchanged.

At each step, each node performs two main tasks: sending its local temperate table to its neighbours and receiving their temperate tables. These tasks are performed n-1 times, which guarantees that nodes at the maximum possible distance are reached. The algorithm is as follows:

Algorithm in parallel:

• Initializing

– Add the node's id to the local temperate table and to the route table, with distance zero and link index -1.

• For i = 1 to n-1

– For each neighbour

∗ Send the local temperate table to the neighbour.

∗ Receive a temperate table from the neighbour.

∗ Empty the local temperate table.

∗ For each id in the received temperate table

· If id does not exist in the route table:

· Add id, with i as distance and the link index, to the route table.

· Add id to the local temperate table.
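The algorithm can be emulated sequentially to check its behaviour on a small network. The C sketch below is an illustration only, not the Cuda or Occam version mentioned in the text: the synchronous exchange is modelled by taking a snapshot of all temperate tables at the start of each round, each temperate table is emptied once per round, and the data layout (Node, RouteTable, build_routes, MAX_NODES) is assumed for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16
#define MAX_LINKS 4

typedef struct {
    int n_links;
    int link[MAX_LINKS];        /* link[k] = id of the neighbour on link k */
} Node;

typedef struct {
    int known[MAX_NODES];       /* 1 if the destination is in the route table */
    int distance[MAX_NODES];    /* number of steps to reach the destination */
    int via[MAX_NODES];         /* link index used to reach it, -1 for itself */
} RouteTable;

/* Sequential emulation of the synchronous rounds. temp[v][id] is 1 when id
   is in the temperate table of node v at the start of the round. */
void build_routes(const Node *net, int n, RouteTable *rt)
{
    static int temp[MAX_NODES][MAX_NODES], fresh[MAX_NODES][MAX_NODES];
    assert(n > 0 && n <= MAX_NODES);

    /* Initializing: each node knows itself at distance 0, link index -1. */
    memset(temp, 0, sizeof temp);
    for (int v = 0; v < n; v++) {
        memset(&rt[v], 0, sizeof rt[v]);
        rt[v].known[v] = 1;
        rt[v].distance[v] = 0;
        rt[v].via[v] = -1;
        temp[v][v] = 1;
    }

    for (int step = 1; step <= n - 1; step++) {
        memset(fresh, 0, sizeof fresh);          /* emptied temperate tables */
        for (int v = 0; v < n; v++)
            for (int k = 0; k < net[v].n_links; k++) {
                int u = net[v].link[k];          /* table received on link k */
                for (int id = 0; id < n; id++)
                    if (temp[u][id] && !rt[v].known[id]) {
                        rt[v].known[id] = 1;
                        rt[v].distance[id] = step;
                        rt[v].via[id] = k;
                        fresh[v][id] = 1;
                    }
            }
        memcpy(temp, fresh, sizeof temp);        /* new tables for next round */
    }
}
```

On a hypothetical four-node network with links 0-1, 0-3, 1-3, and 2-3, node 0 learns nodes 1 and 3 in the first round and node 2 in the second round, through its link to node 3.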

Figure 2.5: A simple network.

Node 0                          Node 1
Known Id   Distance   Links     Known Id   Distance   Links
0          0          -1        1          0          -1
1          1          0         0          1          0
3          1          1         3          1          1
2          2          0         2          2          0

Node 2                          Node 3
Known Id   Distance   Links     Known Id   Distance   Links
2          0          -1        3          0          -1
3          1          0         1          1          0
0          2          0         0          1          1
1          2          1         2          1          2

Table 2.3: An example of route table at node 0 after 3 steps.

These tables show the information held by the nodes in the network. Each node knows which nodes it can reach and the distance to each destination.

2.5 Remarks

This chapter presented a variety of subjects. The most notable is the concept of the cell network, which plays an important role in developing physical models. In the next chapter, parallel computation is employed to simulate these models.

3 Simulations with Cuda programming model

This chapter describes the Cuda programming model and its applications. One goal is to show how well the cell network structure maps onto the GPU architecture. This mapping helps to handle both large cell networks and complicated behavior. Performance tests are then conducted in different scenarios to assess the effectiveness of this approach.

3.1 GPU and Cuda programming model

3.1.1 Introduction to GPU

The Graphics Processing Unit (GPU) [5] is a massively multithreaded many-core chip, composed of hundreds of cores running thousands of threads. This provides the capacity to process large data in parallel, so it is widely used in parallel computations.

A simplified motherboard architecture is depicted in Figure 3.1. There are two parts: the left part for the CPU (host) and the right part for the GPU (device), connected by a PCI bus. On the CPU side, only the host memory is considered in this model. The GPU chip comes with a set of streaming multiprocessors (SMs). Each SM consists of several scalar processors (SPs), a set of registers, and an on-chip shared memory, which is visible to all threads executed on that SM. A global memory is shared by all SMs.

Figure 3.1: A simplified motherboard architecture.

3.1.2 Cuda programming model

Cuda (Compute Unified Device Architecture) was created by NVIDIA. It provides a platform and a programming model for parallel computing, and increases computing performance by harnessing the power of the GPU. Cuda provides a set of extensions to the C/C++ languages for expressing parallel programs.

The GPU runs thousands of threads handling multiple tasks, while a CPU has a few threads for sequential processing. A Cuda program thus typically consists of CPU code (host code) and one or more kernels (device code) executed in parallel on the GPU. As shown in Figure 3.2, the compute-intensive portions of the application are sent to the GPU, while the remainder of the code still runs on the CPU.

Kernels are executed by many threads, each with private local variables and access to shared memory. Thread blocks execute independently of each other, while the threads within a block can cooperate through shared memory and synchronize their execution. In addition, the CPU and the GPU each have their own separate memory and cannot directly access each other's memory. Data must therefore be transferred explicitly between the two memories via the PCI bus.


Figure 3.2: Anatomy of a CUDA program.

3.2 Accelerating simulations by using Cuda

Programming with Cuda means programming a large number of threads that have their own shared memory and concurrently execute the same task. It is therefore a convenient model whenever a large number of identical, repeated tasks must be handled.

In our case, each model owns a cell network, input data for each cell, and a common transition rule for all cells. In other words, each cell has local data and a global behavior. Every cell must perform the same computation on its own data at each step to produce the new state of the system. It is thus simple to map each cell to one thread that is responsible for processing that cell, as illustrated in Figure 3.3.


Figure 3.3: The mapping between the cell network structure and the GPU architecture.

According to this model, data need to be placed in global memory to be shared between threads. Figure 3.4 shows the data flow of physical simulations in terms of CUDA programming. It can be summarized in the following main steps:

• Initializing the states (input data) of all network cells.

• Transferring data (cell states and network structure) to the GPU for computation. In each cycle, the new states of all nodes are computed concurrently on the GPU. The states are then updated with the new values to prepare for the next cycle.

• Sending data back to the CPU memory to display and analyze the results. This step is optional: if the result of each step does not need to be displayed or analyzed at run-time, these operations can be omitted.
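The compute-and-update cycle described above can be emulated on the host with two state buffers. In the C sketch below, the inner loop over idx stands in for the GPU threads (one per cell), and swapping the buffers plays the role of updating the states for the next cycle. This is a CPU-side illustration under these assumptions, not Cuda code; the names CellRule, run_cycles, and spread_max are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Per-cell transition function: it plays the role of the work done by one
   GPU thread, reading the old states and returning the new state of idx. */
typedef int (*CellRule)(const int *now, size_t n, size_t idx);

/* Run `cycles` synchronous steps over n cells using two buffers.
   Returns the buffer that holds the final states. */
int *run_cycles(int *buf_a, int *buf_b, size_t n, int cycles, CellRule rule)
{
    assert(buf_a != buf_b && cycles >= 0);
    int *now = buf_a, *next = buf_b;
    for (int c = 0; c < cycles; c++) {
        for (size_t idx = 0; idx < n; idx++)   /* on a GPU: one thread per idx */
            next[idx] = rule(now, n, idx);
        int *tmp = now;                        /* new states become current */
        now = next;
        next = tmp;
    }
    return now;
}

/* Example rule (hypothetical): on a line of cells, each cell takes the
   maximum of its own state and the states of its left and right neighbours. */
int spread_max(const int *now, size_t n, size_t idx)
{
    int best = now[idx];
    if (idx > 0 && now[idx - 1] > best)
        best = now[idx - 1];
    if (idx + 1 < n && now[idx + 1] > best)
        best = now[idx + 1];
    return best;
}
```

Keeping the old and new states in separate buffers ensures that every cell reads only states from time t, which is what makes the per-cell computations independent and therefore suitable for one thread per cell.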

Figure 3.4: Data flow in the system.

If the display and analysis phase is skipped, the simulation runs mostly on the device. The performance benefit in this case is therefore expected to be proportional to the size of the cell network. It becomes more worthwhile when simulating natural phenomena, which often involve large sizes and very complicated transition functions.

Moreover, this proposal provides an opportunity to obtain computations and statistics in real time. This becomes increasingly important as the need to predict emergencies grows, for example clouds of insects, flooding, traffic congestion, tsunamis, and fires. In those situations, the systems can directly access data from the natural environment via observation systems. The simulations use these input data to derive useful information (the direction of clouds of insects or the flood level at a certain time in the future, for example).

3.3 Details of GPU implementation of simulations

In this section, the GPU implementations of three main simulations are presented in detail:
pollution diffusion, forest fire and wireless sensor network. All of them are
developed in the C programming language in accordance with the CUDA model. These
implementations result from the analysis in the previous section. Their general structure
is as follows.

Host program implemented on the CPU

(1) Initializing the values of all cells.

(2) Copying the cell network structure and data from the CPU host memory to the
GPU device memory and launching the kernel.

Kernel program implemented on the GPU

(3) Looping over the cycles.

(4) Computing the new state of each cell.

(5) Updating each cell with its new state.

(6) Reading the results back to the CPU and outputting them (once for each time step
or more).

The execution clearly runs mostly on the GPU (steps (3) to (5)). The other steps do not much
affect the overall performance if step (6) is not considered: step (1) is executed once and
step (2) is run twice. Thus, for the purpose of comparison, the execution time on the CPU
can be neglected.
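Steps (3) to (5) amount to a double-buffered update: every new state is computed from the current states before any cell is overwritten. The following CPU-only C sketch illustrates that loop on a small ring of cells. It is not the thesis code: the names `step_ring` and `run_cycles`, the ring topology and the placeholder rule (keep half, receive a quarter from each neighbour) are assumptions chosen only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* One synchronous cycle on a ring of n cells (hypothetical helper).
 * Each cell keeps half of its density and receives a quarter of the
 * density of each of its two neighbours. Reads only `now`, writes only
 * `next`, which is what step (4) requires. */
void step_ring(const float *now, float *next, size_t n)
{
    assert(n >= 3);
    for (size_t i = 0; i < n; i++) {
        size_t left  = (i + n - 1) % n;
        size_t right = (i + 1) % n;
        next[i] = now[i] / 2.0f + now[left] / 4.0f + now[right] / 4.0f;
    }
}

/* Run `cycles` cycles, swapping the two buffers after each one
 * (step (5): the new states become the current states).
 * Returns the buffer that holds the final states. */
float *run_cycles(float *a, float *b, size_t n, unsigned cycles)
{
    float *now = a;
    float *next = b;
    for (unsigned c = 0; c < cycles; c++) {
        step_ring(now, next, n);
        float *tmp = now;
        now = next;
        next = tmp;
    }
    return now;
}
```

On the GPU, the inner loop of `step_ring` would be replaced by one thread per cell, while the buffer swap stays the same. With this placeholder rule the total density on the ring is unchanged from one cycle to the next.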

In the next section, some initial measurements are performed to evaluate the effectiveness
of using the massively parallel GPU architecture to accelerate the simulation of phenomena.

3.4 Performance measurement principles

In order to validate the performance of the proposed methodology, a few measurement tests
were performed. The simulation of pollution diffusion in a river was chosen as the test case.
The description of the pollution diffusion model follows Section 2.3.1. The implementation of
the transition function is presented in Listing 3.1. Two data structures are used: the NodeState
structure contains the states of cells, and the Canaux structure holds the links to neighbours.

Listing 3.1: Transition function

    __device__ NodeState computeState(NodeState *nowState, int nodeIndex,
                                      Canaux *channels)
    {
        NodeState myState;
        int nbIn, nodeIn;
        float receive;

        /// Getting pollution density of the cell
        myState = nowState[nodeIndex];
        /// Getting number of neighbours of the cell
        nbIn = channels[nodeIndex].nbIn;
        receive = 0;
        for (int i = 0; i < nbIn; i++)
        {
            /// Getting id of the neighbours
            nodeIn = channels[nodeIndex].read[i].node;
            receive = receive +
                ((nowState[nodeIn].density / 2.0) / (float)channels[nodeIn].nbIn);
        }
        /// Computing the new state
        myState.density = (myState.density / 2.0) + receive;
        return myState;
    }
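The rule in Listing 3.1 lets each cell keep half of its density and share the other half equally among its neighbours, so the total density is conserved when links are symmetric. A minimal CPU sketch in plain C can be used to check this on a small network. The field names `density`, `nbIn`, `read` and `node` come from the listing, but the struct layouts, `MAX_NEIGHBOURS` and the helper `step_all` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEIGHBOURS 8   /* assumption: Moore 1 is the largest pattern */

typedef struct { float density; } NodeState;              /* assumed layout */
typedef struct { int node; } Link;                        /* assumed layout */
typedef struct { int nbIn; Link read[MAX_NEIGHBOURS]; } Canaux;

/* CPU counterpart of the rule in Listing 3.1: keep half of the own density
 * and receive an equal share of the other half of each neighbour. */
NodeState compute_state(const NodeState *now, int index, const Canaux *channels)
{
    NodeState my = now[index];
    float receive = 0.0f;
    for (int i = 0; i < channels[index].nbIn; i++) {
        int in = channels[index].read[i].node;
        assert(channels[in].nbIn > 0);
        receive += (now[in].density / 2.0f) / (float)channels[in].nbIn;
    }
    my.density = my.density / 2.0f + receive;
    return my;
}

/* One synchronous cycle over n cells (hypothetical helper).
 * Returns the total density after the cycle. */
float step_all(const NodeState *now, NodeState *next,
               const Canaux *channels, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        next[i] = compute_state(now, (int)i, channels);
        total += next[i].density;
    }
    return total;
}
```

For a path of three cells 0, 1, 2 with an initial density of 4.0 in cell 0, one cycle gives the densities 2.0, 2.0 and 0.0, and the total stays at 4.0.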

We tested and evaluated the computational efficiency in several studies. The focus of these
tests is to show how much the GPU speeds up the simulations compared to the CPU. Therefore,
the time for transferring data between the CPU and the GPU is omitted in most cases. The
execution time of the simulation on the host is also ignored, since most of the computation
is moved to the device.

As mentioned earlier, the simulation execution cost depends on two main components: the cell
network (its size and the type of CA pattern chosen) and the complexity of the transition
rules. Thus, several aspects related to these components are examined.

All tests were run on a PC with the hardware configuration shown in Table 3.1. Information
about the graphics device is presented in Table 3.2 (for more details, see [11]). We used the
profiling tool nvprof [17] to measure the GPU computation time and the standard library time.h
to measure the CPU computation time.

Intel(R) Xeon(R) CPU E3-1240 V2 @ 3.40GHz
Num. CPUs         8
Num. Cores/CPU    4
Architecture      i686
RAM               16 GB

Table 3.1: Technical data of PC used.

GeForce GTX 680
Num. cores                             1536
Maximum number of threads per block    1024
Global memory                          4 GB

Table 3.2: Technical data of NVidia graphics card used.

The first scenario: The computation time on the CPU was compared with that on the GPU.
All tests follow the model of river pollution diffusion (Section 2.3.1) with the
8-neighbour pattern and 1,000 cycles per test. The data transfer time was taken into
account in this case study.

The computation on both the CPU and the GPU is influenced by the size of the cell network
(number of cells), but not by the size of the cells. Since the cell is the basic element of a cell
network, the computation does not depend on the number of pixels in a cell. For the same
studied region, a smaller cell size yields a larger cell network, and a bigger cell size yields
a smaller one. Thus, the cell size itself was disregarded in the performance tests.

Table 3.3 shows the execution times of the pollution diffusion model on the CPU and the GPU
for 1,000 cycles. The network sizes range from 1,220 to 83,661 cells.

Regarding the network size, the number of cells influences the performance of both the CPU
and the GPU. On the CPU, the upward trend is very noticeable: from 10,703 to 83,661 cells,
the execution time increases at a rate of about 0.26 s per 1,000 cells. This trend is expected
to continue for bigger sizes. The increase on the GPU, in contrast, is not dramatic: the time
rises gradually between 1,220 and 83,661 cells at a rate of about 0.01 s per 1,000 cells.

Table 3.3 shows that the GPU is much faster than the CPU, and that the gap widens as the
number of cells rises. This is shown graphically in Figure 3.5. For a cell network of 83,661
cells, the GPU is approximately 22 times faster than the CPU. This suggests that using the
GPU is vital for very large systems.
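The rates and the speed-up quoted above follow directly from the measurements in Table 3.3. As a check, the small C sketch below recomputes them. The helper names `speedup` and `rate_per_1000_cells` are introduced here for illustration only.

```c
#include <assert.h>

/* Ratio of CPU time to GPU time for the same run. */
double speedup(double cpu_seconds, double gpu_seconds)
{
    assert(gpu_seconds > 0.0);
    return cpu_seconds / gpu_seconds;
}

/* Average growth of the execution time, in seconds per 1,000 additional
 * cells, between two measurements (cells0, t0) and (cells1, t1). */
double rate_per_1000_cells(double cells0, double t0, double cells1, double t1)
{
    assert(cells1 > cells0);
    return (t1 - t0) / ((cells1 - cells0) / 1000.0);
}
```

With the values of Table 3.3, `speedup(19.910, 0.894)` is about 22.3, `rate_per_1000_cells(10703, 0.590, 83661, 19.910)` is about 0.26 for the CPU, and `rate_per_1000_cells(1220, 0.040, 83661, 0.894)` is about 0.010 for the GPU.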

                                 Time (seconds) / 1,000 cycles
Num. cells   Cell size (Pixel)   CPU       GPU
1,220        10x10               0.060     0.040
10,703       5x5                 0.590     0.103
48,425       2x2                 10.880    0.527
83,661       2x2                 19.910    0.894

Table 3.3: The computation comparison between the CPU and the GPU in the case of pollution diffusion model.

Figure 3.5: Demonstrating the accelerating time of using the GPU for physical simulation.

Figure 3.6 shows an example of a physical simulation on the GPU. The cell network of a river
is generated by the PickCell tool using the four-neighbour pattern, and the pollution
diffusion model is the one described in Section 2.2. Initially, two polluted points are
randomly created in the river. These points hold an amount of pollution density as their state.
At every step, the system states change according to the transition function.



Figure 3.6: Illustrating a simulation of diffusing pollution in a river following the model described in Section 2.2. It is initialized with two polluted points (black points).

The second scenario: Different sizes of cell networks are still taken into account. The
two popular CA patterns (Von Neumann 1 and Moore 1) and different numbers of
cycles are considered as well. The model is the same as in the previous case. The results
are presented in Table 3.4. One of these runs is shown in Figure 3.6.

The values in Table 3.4 indicate that the number of cycles does not much affect the
execution time per cycle: the total time grows roughly in proportion to the number of cycles.
This can be explained by the transition functions, which are too simple to generate
major differences.



Num. cells   Cell size   CA Pattern   Time (seconds) / Num. cycles
             (Pixel)                  100     1,000   10,000   100,000   1,000,000
1,220        10x10       VN 1         0.002   0.035   0.356    3.564     35.643
1,220        10x10       Moore 1      0.002   0.035   0.355    3.569     35.642
10,703       5x5         VN 1         0.010   0.103   1.036    10.369    103.544
10,703       5x5         Moore 1      0.017   0.170   1.704    17.058    170.544
48,425       2x2         VN 1         0.052   0.527   5.268    52.704    527.008
48,425       2x2         Moore 1      0.087   0.880   8.801    88.008    880.386
83,661       2x2         VN 1         0.145   0.894   8.948    89.477    895.427
83,661       2x2         Moore 1      0.219   1.454   14.566   145.661   1,002.105

Table 3.4: Measurements results.

Regarding CA patterns, for small networks the differences between Von Neumann 1 and
Moore 1 are not remarkable. For larger networks, however, Von Neumann 1 is
significantly faster than Moore 1. For example, with 10,000 cycles and a network of
83,661 cells, Moore 1 takes 14.566 s while Von Neumann 1 takes only 8.948 s.
The former is about 1.6 times slower than the latter, as shown in Figure 3.7.

Figure 3.7: The graph displays the increase of the gap between two CA patterns with 10,000 cycles.

The third scenario: It aims to show that the execution time also depends on the transition
function. To do that, we slightly modified the previous version: at every step,
each cell loses a random amount of its pollution density. The implementation is shown
below.



Listing 3.2: Transition function (version 2)

    __device__ NodeState computeState(NodeState *nowState, int nodeIndex,
                                      Canaux *channels, curandState *devStates)
    {
        NodeState myState;
        float lossPercentage, receive, loss;
        int nbIn, nodeIn;

        myState = nowState[nodeIndex];
        /// Generating a random value in [0.0 - 1.0] by generateNumber function.
        lossPercentage = generateNumber(devStates, nodeIndex);
        /// Calculating an amount of loss.
        loss = lossPercentage * myState.density;
        /// Getting number of neighbour
        nbIn = channels[nodeIndex].nbIn;
        receive = 0;
        for (int i = 0; i < nbIn; i++)
        {
            /// Getting id of the neighbour
            nodeIn = channels[nodeIndex].read[i].node;
            receive = receive + ((nowState[nodeIn].density / 2.0) /
                                 (float)channels[nodeIn].nbIn);
        }
        /// Computing the new state
        myState.density = (myState.density / 2.0) + receive - loss;
        if (myState.density < 0.0)
        {
            myState.density = 0.0;
        }
        return myState;
    }

Figure 3.8 shows the influence of the transition rules on the execution time in this approach.
Version 2 is slower than version 1 because of its more complex behaviour. The additional time
grows steadily with the size of the network.



Figure 3.8: Comparing the execution time between previous transition function (version 1) and the new one (version 2).


4 Distributed simulation with HLA

Simulations of large systems often face performance issues. The CUDA programming model
can deal with those. However, the lack of interoperability between simulations poses
another major challenge. The High Level Architecture (HLA) standard [12], [13], [20]
is therefore proposed as a solution to that demand. With this standard, many
sub-simulations can be distributed instead of developing one vast
simulation. The integration of the CUDA model and the HLA leads to a hybrid solution in which
several parallel simulations can be distributed on different computer systems. This chapter gives
a brief description of the application of HLA to parallel simulations.

4.1 Overview of The High Level Architecture (HLA)

The High Level Architecture (HLA) [12], [13], [20] is a standard for distributed simulations
whose main goal is to support the interoperability and reusability of simulations. The HLA was
developed by the United States Department of Defense (DoD) to facilitate the integration of
distributed simulation models within an HLA environment. It allows a large-scale model to be
divided into a number of manageable components, while maintaining the interaction between
them. Over the last years, the HLA has been deployed in a wide range of simulation application
areas, including transportation and the manufacturing industry. However, it hardly appears in
simulations of natural phenomena, especially in the climate change area. The HLA is thus
suggested as a potential approach for composing parallel simulations in this project.


Figure 4.1: HLA Federation.

In HLA terminology, the entire system is represented by a federation. Each simulator
participating in the federation is called a federate. A set of federates is connected via the
Run Time Infrastructure (RTI). These federates can run on different platforms and be connected
together by a network. In that case, the RTI can be viewed as a distributed operating system
that interconnects the cooperating federates. Figure 4.1 describes the global architecture of an
HLA simulation. Generally, the HLA specification defines:

• A set of rules: These describe the responsibilities of federates and their relationship with
the RTI. There are ten rules. One of them is that all exchange of data among federates should
occur via the RTI during a federation execution.

• An interface specification: The interface specification prescribes the interface between
each federate and the Run Time Infrastructure (RTI), which provides communication
services to the federates. The interface specification is divided into several management
areas:

– Federation management: This includes main tasks such as creating
federations, joining federates to federations, resigning federates from federations,
and destroying federations.

– Declaration management: This allows federates to publish and subscribe to class
attributes and interactions through the RTI. Other federates can only subscribe to an
attribute or an interaction once it has been published by the federate owning it.

– Object management: This includes the tasks of creating objects and sending their
updates to other federates.

– Ownership management: The RTI allows federates to distribute the responsibility
for updating and deleting object instances, with a few restrictions.

– Time management: This covers the implementation of time management policies
and the negotiation of time advances. This mechanism makes it possible to run several
simulations concurrently.



• An Object Model Template (based on the OMT standard [14]): This component
defines how information is communicated between federates, and how the federates and
the federation have to be documented (using the Federation Object Model, FOM). The FOM
defines the shared objects, attributes, and interactions for the whole federation.

Two kinds of elements can be exchanged between federates:

An object: an entity that represents an "actor" playing in the simulation. It contains
shared data that are created by a federate during the federation execution and persist
until the object is destroyed. The FOM defines all object classes; an example is presented in
Table 4.2. When a federate wants to publish or subscribe to an object, it must define that
object compatibly in its FOM. Objects store their data in attributes.

An interaction: a broadcast message that any federate can send or receive. A
publishing federate sends out an interaction to the federates that have subscribed to
it. If no subscribing federate receives the interaction, the data it carries are lost.
The FOM also defines all interaction classes. When a federate wants to publish or subscribe
to an interaction, it must define that interaction compatibly in its FOM. Interactions carry
data in parameters.

Figure 4.2: Illustrating a high level of the interplay between a federate and a federation.

Figure 4.2 depicts the interplay between a federate and a federation. Initially, a federate
tries to create a federation, or to connect to an existing one on the RTI. It then specifies
which data will be shared with other federates by using the publishing services. These
published objects or interactions become available to all federates that are connected to the
same federation.

When a federate wants to send data to other federates, it has to register objects and call an
update service. The data are then automatically reflected to the subscribers by the RTI.
Allocated resources must always be released at the end.

4.2 Time management in HLA

The RTI provides a variety of optional time management services. It is important to
understand time management in order to control how events are exchanged between federates.
Each federate manages its own logical time and communicates this time to the RTI. The RTI
ensures correct coordination of federates by advancing time coherently. In the discrete event
simulation literature, logical time is equivalent to "simulation time". It is used to make sure
that federates observe events in the same order [19]. It helps to avoid problems such as
causality violations, or different results from repeated executions of the simulation with the
same input data. Logical time is not mapped to real time.

4.2.1 Time policies

According to the HLA time policies, each federate is involved in the progress of time. In some
cases, it is necessary to map the progress of one federate to the progress of another. A federate
needs to request the regulating policy to participate in the decision on the progress of time. A
constrained federate follows the time progress imposed by other federates. In our approach, the
logical times of the different federates must be synchronized. Thus, both the regulating and
constrained policies are enabled, as shown in Table 4.1. This allows the participating federates
to exchange data with each other.

4.2.2 Time progress

The second part of the time management component provides a mechanism to advance
simulation time within each federate. There are two services that federates can
invoke to request a time advance from the RTI: timeAdvanceRequest is used to implement
time-stepped federates, and nextEventRequest is used to implement event-based federates. The
granted time is delivered by the timeAdvanceGrant service.

Generally, a time management cycle consists of three steps. First, a federate sends a request
for time advancement. Next, the federate can receive ReflectAttributeValues callbacks. Finally,
the RTI completes the cycle by invoking a federate-defined procedure called timeAdvanceGrant to
indicate that the federate's logical time has been advanced.
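The request/grant cycle can be illustrated with a toy model of conservative time advance, written in plain C. This is not the CERTI or HLA API: the `Federate` struct and the functions `time_advance_request`, `guarantee` and `grant_ready` are hypothetical, lookahead is ignored, and callbacks are left out. The sketch only shows the principle that a constrained federate is granted a time once no regulating federate can still produce an earlier event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy view of a federate as seen by a time manager (hypothetical). */
typedef struct {
    double logical_time;    /* last granted time */
    double requested_time;  /* pending time advance request, if any */
    bool   pending;         /* true while a request waits for its grant */
    bool   regulating;      /* this federate constrains the others */
    bool   constrained;     /* this federate is constrained by the others */
} Federate;

/* Record a time advance request to time t for federate f. */
void time_advance_request(Federate *f, double t)
{
    assert(!f->pending && t >= f->logical_time);
    f->requested_time = t;
    f->pending = true;
}

/* Earliest time at which federate f might still produce an event. */
double guarantee(const Federate *f)
{
    return f->pending ? f->requested_time : f->logical_time;
}

/* Grant every pending request that no other regulating federate can still
 * invalidate. Returns the number of grants issued. */
size_t grant_ready(Federate *feds, size_t n)
{
    size_t granted = 0;
    for (size_t i = 0; i < n; i++) {
        if (!feds[i].pending)
            continue;
        bool safe = true;
        if (feds[i].constrained) {
            for (size_t j = 0; j < n; j++) {
                if (j != i && feds[j].regulating &&
                    guarantee(&feds[j]) < feds[i].requested_time)
                    safe = false;
            }
        }
        if (safe) {
            feds[i].logical_time = feds[i].requested_time;
            feds[i].pending = false;
            granted++;
        }
    }
    return granted;
}
```

In this model, two regulating and constrained federates advance in lockstep: the first request to time 1 is granted only after the second federate has also requested time 1. A federate that is constrained but not regulating never delays the others.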



Figure 4.3: A model of time advancement request is used in this project.

4.2.3 Time synchronization

As presented in the previous sections, all simulations have to synchronize their local logical
times to ensure causality. The constrained and regulating parameters are enabled for
all simulations. The regulating parameter allows a federate to send updates and interactions
in causal (timestamp) order, while the constrained parameter allows it to receive those updates
and interactions from the RTI in that order. Since a passive visualization federate does not send
any updates or interactions, it has no impact on the time advance of the federation. Therefore,
only the constrained parameter needs to be enabled for the visualization federate, and
regulating can be switched off. Table 4.1 shows the time policies proposed for the case study
(Section 2.3).

Federate        Time constrained   Time regulating   Time advance
Forest          Yes                Yes               Time stepped
River           Yes                Yes               Time stepped
WSN             Yes                Yes               Time stepped
Visualization   Yes                Yes/No            Time stepped

Table 4.1: Time management of the federation.

To synchronize activities between several federates participating in a federation, the RTI
provides a mechanism for exchanging data between them. In this case, times are associated
with the exchanged data to coordinate federate activities. The RTI also allows federates to
communicate explicit synchronization points. Figure 4.4 illustrates the synchronization
between two federates, the river federate and the forest federate.



Figure 4.4: Federate synchronization

First of all, one of the available federates, here the river federate, sends a synchronization
request to the RTI. The RTI then sends a response to the river federate and afterwards
announces the synchronization point to the other federates. Each federate uses a service to
confirm that it has reached the synchronization point. In the next section, some issues related
to exchanging data are considered in the context of distributed simulations.

4.2.4 Exchanging data

Exchanging data between simulation federates is an important part of distributed systems.
Two questions arise in this case: what kind of data must be shared, and where the
communication takes place.

The type of exchanged data is determined by the characteristics of the real systems and
by the interoperability between them. In our case, there is communication between four
federates: forest, river, WSN, and visualization. The forest federate transmits its status
to the river federate and the WSN federate. Meanwhile, these three federates need to provide
their data to the visualization federate for the analysis of the results.

To achieve this, the forest federate publishes its data (forest status and position) as an object
class (ForestNode). The river federate and the WSN federate need to subscribe to it. In the
same way, the river federate and the WSN federate publish the object classes RiverNode
and WSNNode, respectively. These published classes have to be declared in the FOM of the
federation, whose structure is indicated in Table 4.2.



Object Class   Attributes                    Published by   Subscribed by
ForestNode     State, Position               Forest         River, WSN, Visualization
RiverNode      Pollution density, Position   River          Visualization
WSNNode        State, Position               WSN            Visualization

Table 4.2: Objects and their attributes, publishers and subscribers.

In some cases, it is also important to specify where data are exchanged between two
federates, especially for physical systems of very large size. Indeed, it is often
inefficient to send the entire data via the RTI because of network performance and
local computation costs. Thus, we propose a solution for the general case of exchanging data
between two adjacent systems.

Two physical systems are adjacent if they have a common frontier or some places in
common. Sending unrelated information to other systems is useless. For example, as shown in
Figure 2.3, new polluted points in the river caused by the ashes of a forest fire only
appear at the frontier of the two systems. The forest fire federate regularly sends its states to
the river federate. The latter only needs the states of points close to it, not the entire forest
state. Sending the entire state not only takes time for transporting data between federates,
but also makes the computation less efficient on the receiver side.

A solution based on morphology theory [21] can be used to address that issue. It makes it
possible to smooth the boundary of physical systems by applying basic operations such as
erosion and dilation. To summarize, only the status data at the boundary of the forest are sent
to the RTI.
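The thesis does not spell out which chain of morphological operations is applied. One standard way to obtain the boundary of a region is to subtract its erosion from the region itself. The following C sketch does this for a binary mask with the 4-neighbour structuring element. The function names `cell_at`, `erode4` and `boundary4`, the row-major `unsigned char` mask and the choice of treating cells outside the grid as empty are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Value of the mask at (x, y); everything outside the grid counts as empty. */
int cell_at(const unsigned char *mask, int w, int h, int x, int y)
{
    if (x < 0 || y < 0 || x >= w || y >= h)
        return 0;
    return mask[(size_t)y * (size_t)w + (size_t)x] != 0;
}

/* Erosion with the 4-neighbour (Von Neumann) structuring element: a cell
 * survives only if it and its four neighbours all belong to the region. */
void erode4(const unsigned char *mask, unsigned char *out, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            out[(size_t)y * (size_t)w + (size_t)x] = (unsigned char)(
                cell_at(mask, w, h, x, y) &&
                cell_at(mask, w, h, x - 1, y) &&
                cell_at(mask, w, h, x + 1, y) &&
                cell_at(mask, w, h, x, y - 1) &&
                cell_at(mask, w, h, x, y + 1));
        }
    }
}

/* Boundary = region minus its erosion, written to `out` (must not alias
 * `mask`). Returns the number of boundary cells. */
size_t boundary4(const unsigned char *mask, unsigned char *out, int w, int h)
{
    assert(w > 0 && h > 0);
    erode4(mask, out, w, h);
    size_t count = 0;
    for (size_t i = 0; i < (size_t)w * (size_t)h; i++) {
        out[i] = (unsigned char)(mask[i] != 0 && out[i] == 0);
        count += out[i];
    }
    return count;
}
```

For a 3x3 block of forest cells inside a 5x5 grid, only the centre cell survives the erosion, so the boundary consists of the 8 outer cells of the block. Only the states of such boundary cells would then need to be sent through the RTI.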

4.3 Distributed physical simulation

This section presents an application of the HLA standard to unify several parallel
simulations into what is called a mixed simulation. The study region is the one shown in
Figure 2.3.

The whole model was split into three simulation federates: forest fire spread, river pollution
diffusion, and WSN. The simulation federates were all implemented in accordance with both the
CUDA programming model and the HLA standard. These parallel simulations are
executed concurrently as three different simulators. Their models were presented in Section 2.2.
In addition to the three federates, a last one, visualization, is designed as a supporting federate.
An overview of the federation is given in Figure 4.5.



Figure 4.5: A structure for a proposed federation.

To recall the communication proposed in Section 2.3, forest fire spread produces
ashes, which result in new polluted points and dust for river pollution diffusion
at time t. The latter includes these new data in its model at time t+1. The communication
depends on a specific condition.

There is also communication between WSN and forest fire spread: the sensors
regularly collect the forest status, as this is the goal of sensing. New information is sent to
observers. When a fire is detected, the observers raise emergency signals.
To do that, the synchronization needs to be achieved as indicated in Table 4.1
and the shared data have to be declared as shown in Table 4.2.

The FOM of the federation is given in the cyber.fed file shown in Listing 4.1.

Listing 4.1: cyber.fed file

    ;; Cyber physical simulation
    (Fed
      (Federation Cyber)
      (FedVersion v1.0)
      (Federate "river" "Public")
      (Federate "forest" "Public")
      (Federate "wsn" "Public")
      (Federate "visualization" "Public")
      (Objects
        (Class ObjectRoot
          (Attribute privilegeToDelete reliable timestamp)
          (Class RTIprivate)
          (Class ForestNode
            (Attribute PositionX RELIABLE TIMESTAMP)
            (Attribute PositionY RELIABLE TIMESTAMP)
            (Attribute State RELIABLE TIMESTAMP)
          )
          (Class RiverNode
            (Attribute PositionX RELIABLE TIMESTAMP)
            (Attribute PositionY RELIABLE TIMESTAMP)
            (Attribute Density RELIABLE TIMESTAMP)
          )
          (Class SensorNode
            (Attribute PositionX RELIABLE TIMESTAMP)
            (Attribute PositionY RELIABLE TIMESTAMP)
            (Attribute State RELIABLE TIMESTAMP)
          )
        )
      )
    )

4.3.1 Forest fire spread federate

The model of this simulation federate was presented in Chapter 2. Some fires (red points)
are randomly initialized in the forest. These fires spread
according to the transition function and the CA pattern of the model. An example of the
spreading is shown in Figure 4.6. The green, red, grey, and white points represent the tree,
fire, ash, and empty states, respectively.

Figure 4.6: An example of simulating of fire spread in the forest. The pattern of 4 neighbour is used. The red color represents fire trees and the gray color implies ashes formed by the fire.

Ashes form after some steps. These ashes can pollute the river, as
shown in Figure 4.7.

4.3.2 River pollution di↵usion federate

The model of pollution diffusion in the river was also presented in Chapter 2. Initially,
some polluted points are randomly generated in the river. During the diffusion process, the
river federate continuously receives status data about the fire from the forest federate via the
RTI. It checks these data to determine whether the ashes pollute some river cells. This depends
on a specific condition: for each river cell, if the distance to an ash cell is less than or equal
to a specified threshold, the pollution density of the river cell is changed by an amount
inversely proportional to that distance.

The RTI sends these data to the river federate only once it receives an update call from the
forest federate. The update call is issued only when ashes are present within the scope of the
forest boundary.

Figure 4.7: A result is got from visualization federate. This demonstrates the exchanging data between the two simulations via the RTI. Two regions marked with the red circles representing the new pollution created by the ashes, which are formed from the forest fire after 4 steps.

4.3.3 WSN federate

The model of WSN was also introduced in Chapter 2. At every time step, the nodes receive
the data from the forest federate via the RTI and only consider the cells within their sensing
range. If a node detects a fire, it forwards that information to an observer for decision
making. Signals are raised as soon as the fire is recognized. As depicted in Figure 4.8, the red
rings indicate that fires have been detected at those sensors.
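The detection step of a sensor node can be sketched in C as a scan over the received forest cells. The thesis does not give this code: the `ForestState` enum (mirroring the tree, fire, ash and empty states of Section 4.3.1), the `ForestCell` struct, the function `sensor_detects_fire` and the use of a Euclidean distance in cell units are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cell states of the forest model (names are hypothetical). */
typedef enum { CELL_EMPTY, CELL_TREE, CELL_FIRE, CELL_ASH } ForestState;

/* One forest cell as received from the forest federate (assumed layout). */
typedef struct {
    int x;
    int y;
    ForestState state;
} ForestCell;

/* Returns true if at least one burning cell lies within `range` cell units
 * of the sensor located at (sx, sy). */
bool sensor_detects_fire(int sx, int sy, int range,
                         const ForestCell *cells, size_t n)
{
    assert(range >= 0);
    long r2 = (long)range * (long)range;
    for (size_t i = 0; i < n; i++) {
        if (cells[i].state != CELL_FIRE)
            continue;
        long dx = (long)cells[i].x - sx;
        long dy = (long)cells[i].y - sy;
        if (dx * dx + dy * dy <= r2)
            return true;   /* fire inside the sensing range */
    }
    return false;
}
```

A node for which this function returns true would then forward the information to its observer. Comparing squared distances avoids a square root for every cell.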

4.3.4 Visualization federate

The viewer federate is based on 2D visualization with the X Window System. As mentioned
above, it first subscribes to all necessary data published by the other federates. The
aim is to provide an overview of the results, as shown in Figure 4.8. Initially, the background
of the viewer is drawn from the visible data extracted from the PickCell tool. During the
federation execution, this federate receives data from the others and updates the view at every
step.



4.3.5 A case study

This section describes one run of the federation. Initially, one federate creates a federation
on the RTI and waits for the other federates to join. Each further federate connects to that
federation and also waits until the last one has joined. The first federate then sends a request
to the others to achieve a synchronization point. Once the other federates have responded, the
synchronization point is achieved, and all federates advance with the same time progress. At
each time step, these federates exchange data with each other via the RTI. Figure 4.8 presents
the results captured from the visualization federate.

Figure 4.8: Illustrating an interoperability between the four federates via the RTI.

4.3.6 Simulation tools

Along with the PickCell tool, which is developed at the LabSTICC laboratory, the open source
software CERTI [18] was used in this project. The CERTI RTI supports the HLA 1.3 specification
(C++ and Java). The X Window System was used to display the results of the
simulation federates.

5 Conclusion

Our work focuses on modeling and simulating large and complex physical systems,
especially natural phenomena, which have recently emerged as a critical topic. We mainly
consider two problems. The first one is the long simulation time of vast physical systems. The
other is the lack of interoperability between physical models that are in fact related to each
other in reality. In addition, the complexity of modeling was also taken into account.

5.1 Contributions

To address these problems, we first proposed a methodology to develop distributed physical
models based on the PickCell tool. With this tool, physical simulations can be conducted
in terms of cell network systems. Next, we suggested a hybrid approach to
tackle the problems of large size and complicated behaviour of physical processes. It makes it
possible to distribute several parallel simulations that run and communicate
simultaneously.

The results of the experiments with parallel simulations were very encouraging. Parallel
computation on the GPU helps to dramatically reduce the simulation time of large models.
The interoperability between several physical simulations in accordance with the HLA standard
was also implemented.

5.2 Future works

We are going to apply our approach to real cases. A proposition for simulating phenomena
in the Mekong Delta region (Vietnam) will be considered, where floods, pollution, and clouds of
insects are recurring problems.

A new version of the PickCell tool is currently under development. It brings new features,
for example the retrieval of elevation data, which makes it possible to generate 3D data.


List of Figures

1.1 An example of Cyber-Physical System.

1.2 Von Neumann and Moore neighbourhood (distance = 1).

2.1 A cell network of a river system generated from PickCell tool with Von Neumann 1.

2.2 A summary of the proposed process which is used to conduct physical simulations.

2.3 The study region: A small area in Mekong Delta, the South of Vietnam. (data source: OpenStreetMap [16])

2.4 Deploying sensors along the forest border extracted from the study region with the 4 neighbour pattern. The communication range and the sensing range are 25 and 5 cells units, respectively.

2.5 A simple network.

3.1 A simplified motherboard architecture.

3.2 Anatomy of a CUDA program.

3.3 The mapping between the cell network structure and the GPU architecture.

3.4 Data flow in the system.

3.5 Demonstrating the accelerating time of using the GPU for physical simulation.

3.6 Illustrating a simulation of diffusing pollution in a river following the model described in Section 2.2. It is initialized with two polluted points (black points).

3.7 The graph displays the increase of the gap between two CA patterns with 10,000 cycles.

3.8 Comparing the execution time between previous transition function (version 1) and the new one (version 2).

4.1 HLA Federation.

4.2 Illustrating a high level of the interplay between a federate and a federation.

4.3 A model of time advancement request is used in this project.

4.4 Federate synchronization.

4.5 A structure for a proposed federation.


4.6 An example of simulating of fire spread in the forest. The pattern of 4 neighbour is used. The red color represents fire trees and the gray color implies ashes formed by the fire.

4.7 A result is got from visualization federate. This demonstrates the exchanging data between the two simulations via the RTI. Two regions marked with the red circles representing the new pollution created by the ashes, which are formed from the forest fire after 4 steps.

4.8 Illustrating an interoperability between the four federates via the RTI.


List of Tables

2.1 A proposed organization of directions in a cell network.

2.2 The table presents a cell network structure of 601 cells generated by PickCell tool (Von Neumann 1 CA).

2.3 An example of route table at node 0 after 3 steps.

3.1 Technical data of PC used.

3.2 Technical data of NVidia graphics card used.

3.3 The computation comparison between the CPU and the GPU in the case of pollution diffusion model.

3.4 Measurements results.

4.1 Time management of the federation.

4.2 Objects and their attributes, publishers and subscribers.


Bibliography

[1] Teodora Sanislav, Liviu Miclea. "Cyber-Physical Systems - Concept, Challenges and Research Areas," CEAI, Vol. 14, No. 2, pp. 28-33, 2012.

[2] Francesco Berto and Jacopo Tagliabue. "Cellular Automata," 26 March 2012. http://plato.stanford.edu/entries/cellular-automata/.

[3] Stephen Wolfram. "Cellular Automata as Simple Self-Organizing Systems," Caltech preprint CALT-68-938, July 1982.

[4] Robert M. Itami. "Simulating spatial dynamics: cellular automata theory," Landscape and Urban Planning 30 (1994) 27-47.

[5] NVIDIA, "Graphics Processing Unit (GPU)." [online]. Available: http://www.nvidia.com/object/gpu.html.

[6] CUDA Home Page. [online]. Available: http://developer.nvidia.com/object/cuda.html.

[7] Daniel C. Hyde. "Introduction to the programming language Occam," Bucknell University, 1995.

[8] Bernard Pottier, Pierre-Yves Lucas. "Dynamic networks – NetGen: objectives, installation, use, and programming," Université de Bretagne Occidentale. August 26, 2014.

[9] Edward A. Lee. "CPS Foundations," Proc. of the 47th Design Automation Conference (DAC). ACM, 737-742, June 2010.

[10] Luís M. L. Oliveira, Joel J. P. C. Rodrigues. "Wireless Sensor Networks: a Survey on Environmental Monitoring," Journal of Communications. Vol. 6, No. 2, April 2011.

[11] GeForce GTX 680. [online]. Available: http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-680/specifications.

[12] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Framework and Rules," IEEE Std 1516™-2010, pp. 1-38, 2010.

[13] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Federate Interface Specification," IEEE Std 1516.1™-2010, pp. 1-378, 2010.

[14] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Object Model Template (OMT) Specification," IEEE Std 1516.2™-2010, pp. 1-110, 2010.

[15] Common-pool Resources and Multi-Agent Simulations (CORMAS) CIRAD research center. [online]. Available: http://cormas.cirad.fr/


[16] The study region, Mekong Delta Region, South of Vietnam. [online]. Available: https://www.openstreetmap.org/search?query=U%20Minh%2C%20ca%20mau%2C%20viet%20nam#map=12/9.5440/105.0918

[17] Profiling User's Guide. [online]. Available: http://docs.nvidia.com/cuda/profiler-users-guide/#axzz3aPF6qywC.

[18] E. Noulard, J.-Y. Rousselot, and P. Siron, "CERTI, an open source RTI, why and how?" Spring Simulation Interoperability Workshop, 2009.

[19] R. M. Fujimoto, "HLA time management: Design document," Georgia Tech College of Computing, Tech. Rep., Aug. 1996.

[20] F. Kuhl, R. Weatherly, J. Dahmann, "Creating Computer Simulation Systems: An Introduction to the High Level Architecture," Prentice Hall, 1999.

[21] C. Glasbey and G. Horgan, "Image analysis for the biological sciences," Edinburgh University, February 1999. Chapter 5, pp. 1-13.

[22] G. Lasnier, J. Cardoso, P. Siron, C. Pagetti, and P. Derler, "Distributed simulation of heterogeneous and real-time systems," in Distributed Simulation and Real Time Applications (DS-RT), 2013 IEEE/ACM 17th International Symposium on, Oct 2013, pp. 55–62.

[23] Juraj Cirbus, Michal Podhoranyi, "Cellular Automata for the Flow Simulations on the Earth Surface, Optimization Computation Process," Applied Mathematics & Information Sciences 7, No. 6, 2149-2158, 2013.
