Master Research Internship
Master Thesis
Cyber-Physical Systems and Mixed Simulations
Author:
Tran Van Hoang
Supervisor:
Professor Bernard Pottier
Abstract
Climate change has received much attention in recent years. The need to predict and validate
the behaviour of real systems and natural phenomena is critical, and simulation is a good
candidate for this task.
However, the major problem is that modeling and simulating large, complicated physical
systems is time-consuming. Although many commercial software packages now exist for such
systems (water and forest modeling, for example), they require considerable knowledge of
specific physical processes and of the study areas. Thus, as a first step, we propose a
practical way of modeling physical systems simply, especially natural systems, by using
Cellular Automata (CA). The PickCell tool developed at the Lab-STICC laboratory facilitates
that process. GPU computation and parallelism are proposed as an important part of this
methodology, with the purpose of accelerating large physical simulations.
In addition, we propose the use of distributed simulations to deal with the lack of
interoperability between simulations. To do that, we use the IEEE standard High Level
Architecture (HLA) to design a system that supports mixed simulations based on synchronous
systems. This also provides an opportunity to conduct simulations of Cyber-Physical Systems.
Acknowledgments
I would like to give special thanks to Professor Bernard Pottier and to all of my colleagues at
Lab-STICC, UBO. I appreciate their support, not only in research activities but also in my daily
life.
Tran Van Hoang, Brest, France, 12/06/2015
Contents
1 Introduction
1.1 Motivations and Objectives
1.2 Cyber-Physical Systems (CPS)
1.3 Cellular Automata (CA)
2 Physical simulations based on cell networks
2.1 PickCell tool and cell networks
2.2 Physical simulations based on cell networks
2.3 Case study and applications
2.4 Routing algorithm
2.5 Remarks
3 Simulations with Cuda programming model
3.1 GPU and Cuda programming model
3.2 Accelerating simulations by using Cuda
3.3 Details of GPU implementation of simulations
3.4 Performance measurement principles
4 Distributed simulation with HLA
4.1 Overview of The High Level Architecture (HLA)
4.2 Time management in HLA
4.3 Distributed physical simulation
5 Conclusion
5.1 Contributions
5.2 Future works
Bibliography
1 Introduction
1.1 Motivations and Objectives
Nowadays, developing countries suffer from natural disasters such as typhoons, tsunamis,
fires, and floods. For example, in the Mekong Delta of Vietnam, the sea level is rising under
the impact of climate change. This could cause the Mekong Delta to flood every year. Thus,
environmental surveillance and prediction of such phenomena become necessary. Simulation is a
good approach for that purpose: it helps people make better decisions to prevent or mitigate
the impacts.
In recent years, wireless sensor networks (WSN) have emerged as a good candidate for monitoring
the environment. Several inspiring projects have been launched with the common aim of sensing
the environment [10]. Sensors collect the status of physical systems and send status data
to computer systems for processing and analysis. Some reactions are then sent back to the
physical systems. Such an integration of physical systems and computer systems constitutes a
Cyber-Physical System (CPS), as presented in Section 1.2.
Therefore, it is necessary to consider sensing processes. The objective is to support and to
validate the operations of the WSN, especially when it is responsible for detecting dangerous
accidents, for example when monitoring a chemical store located in a residential area. A
composite model of the parallel simulations of the two sides of the CPS will thus be built.
However, modeling and simulating physical systems raises many issues. These systems are often
huge and have complex behaviour, so designing the models takes a lot of effort. Moreover, the
lack of interoperability is also a major challenge, because in the real world these systems
always affect each other. For instance, the spread of a fire is influenced by several
other factors, namely weather conditions, wind directions and speeds, response capabilities,
and the sensing performance of the wireless sensor network (WSN). In such systems, the model
consists of many components (fire spreading, weather conditions, firefighters, and WSN). These
components and their relations result in large-scale models, which are very difficult to
maintain and adapt. These circumstances bring about:
• long run times for simulation runs.
• long development and testing times for such models.
• huge effort for maintaining the models and adapting them to other perspectives.
• low flexibility and reusability.
Traditionally, there are two common approaches to handling these problems. One is the use of
powerful hardware. The other is breaking the model up into a set of submodels, which are
distributed over different computer systems. However, these approaches have been developed
separately. Thus, in this project, we use a hybrid approach that combines distributed models
and parallel computations. It aims to handle physical systems of huge size and complex
behaviour.
This approach can be viewed under two main aspects. For the problem of computing performance,
parallel simulations based on the GPU are suggested. Powerful GPUs have been considered in
several studies over the last years to speed up large simulations. To deal with the lack of
interoperability between simulations, we use the IEEE standard High Level Architecture
(HLA), which gives independent simulations the ability to communicate with each other in the
context of a synchronous system.
The thesis is divided into five chapters:
Chapter 1: Introduces the motivations and the objectives of the study, and gives an overview
of related concepts such as Cyber-Physical Systems (CPS) and Cellular Automata (CA). The
chapter ends by introducing the PickCell tool and its applications.
Chapter 2: Proposes a new approach that simplifies the process of modeling physical systems.
The approach is facilitated by the PickCell tool in accordance with the CA.
Chapter 3: Describes the use of the Cuda programming model to simulate physical models.
Some experiments are conducted to evaluate the feasibility of the solution.
Chapter 4: Uses the HLA standard to deal with the lack of interoperability between
simulations. It enables parallel simulations to communicate with each other in the context of
distributed systems.
Chapter 5: Summarises the contributions and presents future work.
1.2 Cyber-Physical Systems (CPS)
Cyber-Physical Systems (CPS) are integrations of computation and physical processes [9], [1],
in which embedded computers and networks monitor and control the physical processes. They
include feedback loops where physical processes affect computations and vice versa.
Figure 1.1: An example of Cyber-Physical System.
Figure 1.1 illustrates an example of a CPS that monitors accidents in a river (pollution,
floods, landslides, or chemical spreading, for example). A WSN can be used to observe the
status of the river via sensors. The sensors forward status data to computer systems,
which carry out computations. An analysis of the computed results can trigger emergency
operations, such as emitting signals or closing the basin, in the case of an accident. For the
implementation of this type of system, one of the critical challenges is system integration.
Therefore, to obtain interoperability between simulations, an integration solution is required.
In [22], the authors presented a co-simulation framework based on the HLA standard.
That work focuses on integrating heterogeneous systems, designed with different tools and
languages, as CPSs. However, the given prototype does not address natural phenomena or
computing performance. Thus, the considerations in this project are expected to provide another
perspective on phenomena simulations.
1.3 Cellular Automata (CA)
Cellular Automata (CA) are one of the techniques used for simulating complex physical
systems, such as self-reproduction in biology or diffusion models in chemistry. The famous
"Game of Life" illustrates that cellular automata are capable of producing dynamic patterns and
structures [2], [3].
In [4], a major effort is made to show the advantages of using CA for modeling systems,
especially natural phenomena. According to that work, modeling phenomena with CA is clearer,
more accurate, and more complete than with conventional mathematical systems. Moreover, the
transition rules of CA models are often simpler than mathematical equations, yet the results
they produce are more comprehensive, and a CA can in principle mimic the behaviour of a very
wide range of physical systems. A CA typically consists of two main components.
The first component is a cellular space, that is, a lattice of cells, each with an identical
pattern of local connections to other cells for input and output. Each cell takes its state
from a finite set of states. In the simplest case, each cell has the binary state 1 or 0. A
set of cells called the neighbourhood is defined relative to a specified cell (the centre). The
states of the neighbours are used to calculate the next state of the centre according to the
defined rule. The number of neighbours depends on the pattern chosen in the modeling process.
The second component is a transition rule (CA rule) giving the update of the state (at time
t+1) of each cell according to its current state and the states of its neighbourhood (at time t).
Typically, the rules for updating states of all cells are the same and do not change over time.
Generally, CA exist in various forms. The simplest CA is the one-dimensional lattice, in which
all the cells are arranged in a line. The neighbourhood of a cell then consists of the cells to
its left and its right. For the two-dimensional lattice, the most common types of
neighbourhood are the Moore neighbourhood and the Von Neumann neighbourhood (see Figure 1.2).
Figure 1.2: Von Neumann and Moore neighbourhood (distance = 1).
In the Von Neumann neighbourhood, each cell has four neighbours: north (N), south (S),
east (E), and west (W). Counting the cell itself, five cells are involved, so with binary
states there are 2^5 = 32 possible patterns. In the Moore neighbourhood, nine cells are
involved in total, so 2^9 = 512 possible patterns can be produced. In both cases, the distance
is one, and the transition function is supposed to give a next state for each pattern.
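As an illustration of the two components, the following is a minimal sketch in C of one
synchronous update of a binary CA with the Von Neumann neighbourhood. The names (ca_step,
ca_pattern), the 4 x 4 lattice size, the zero-valued boundary, and the parity rule are
assumptions chosen for the example; they do not come from the PickCell tool.

```c
#include <assert.h>

#define CA_W 4   /* lattice width (assumed for the example) */
#define CA_H 4   /* lattice height (assumed for the example) */

/* Read a cell; cells outside the lattice are treated as 0 (assumed boundary). */
static int ca_get(const unsigned char *grid, int x, int y)
{
    if (x < 0 || x >= CA_W || y < 0 || y >= CA_H)
        return 0;
    return grid[y * CA_W + x];
}

/* Pack centre, N, S, E, W into a 5-bit pattern in the range 0..31. */
static unsigned ca_pattern(const unsigned char *grid, int x, int y)
{
    return (unsigned)ca_get(grid, x, y)
         | (unsigned)ca_get(grid, x, y - 1) << 1    /* north */
         | (unsigned)ca_get(grid, x, y + 1) << 2    /* south */
         | (unsigned)ca_get(grid, x + 1, y) << 3    /* east  */
         | (unsigned)ca_get(grid, x - 1, y) << 4;   /* west  */
}

/* One synchronous step: every state at t+1 is computed from the states at t. */
static void ca_step(const unsigned char *now, unsigned char *next,
                    const unsigned char rule[32])
{
    assert(now != next);   /* two buffers are needed for a synchronous update */
    for (int y = 0; y < CA_H; y++)
        for (int x = 0; x < CA_W; x++)
            next[y * CA_W + x] = rule[ca_pattern(now, x, y)];
}

/* Hypothetical rule: a cell becomes 1 when an odd number of the five cells is 1. */
static void ca_make_parity_rule(unsigned char rule[32])
{
    for (unsigned p = 0; p < 32; p++) {
        unsigned ones = 0;
        for (unsigned b = 0; b < 5; b++)
            ones += (p >> b) & 1u;
        rule[p] = (unsigned char)(ones & 1u);
    }
}
```

The rule table has exactly 32 entries, one per pattern of the five-cell neighbourhood; a Moore
neighbourhood would need a 512-entry table built in the same way.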
Therefore, in order to model systems with this approach, two components need to be provided:
the cellular space and the transition rules (behaviour). In the next chapter, an approach
proposed in the Lab-STICC laboratory to automatically generate the cellular spaces (cell
networks) from geographic data is briefly presented. The input data and the behaviour of each
cell are determined later, according to what is of interest in a given physical system.
2 Physical simulations based on cell networks
This chapter gives a brief description of PickCell, a tool that generates cell networks of
physical systems, and of the structure of these networks. We then propose a methodology for
developing physical simulations in terms of cell networks. Lastly, some cases are examined to
demonstrate the use of the proposed methodology.
2.1 PickCell tool and cell networks
2.1.1 PickCell tool
PickCell is a modeling tool that has been developed at Lab-STICC in recent years (see [8] for
more details). It can access geographic data from various public sources as input data, namely
GoogleMap, OpenStreetMap, or even picture files. The tool analyzes and processes these data to
generate cell network structures of physical processes.
The main feature of the tool is the extraction of visible features (potential physical systems)
from geographic data, such as rivers, forests, or road systems. The process starts from the
input data. The final result is a set of separate physical systems, each represented by a cell
network, as presented in Section 2.1.2. Generally, this process is performed in three main steps:
• Preprocessing data: Geographic data are usually not yet well presented, especially in the
case of satellite and aerial images. At this step, the tool increases the contrast of the data
to serve the following steps.
• Segmenting data into cells: In order to isolate regions of interest in the data, such as
rivers or roads, the data are divided into small cells. Their size (x, y) depends on the
objective of the desired model, where x and y are the width and the height of a cell,
respectively. For the same size of input data, the smaller the x and y values are, the larger
the number of cells will be, and vice versa.
• Recognizing similar cells and grouping into layers: Typically, the tool uses the 3
standard colour components (Red-Green-Blue) to classify the cells into defined layers.
Each layer contains a set of cells with similar colours. Next, the relations between the cells
of the same layer are defined according to a chosen CA pattern. As a result, for each
layer, we have a set of cells organized as a network by their relations (or links). These
sets are called cell networks. The details of cell networks are presented in the
next section.
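The segmentation and grouping steps can be sketched in C as follows. This is only an
illustration under stated assumptions: the function names, the Rgb type, and the
nearest-reference-colour classification are hypothetical and are not taken from PickCell,
whose actual classification method is not detailed here.

```c
#include <assert.h>

typedef struct { int r, g, b; } Rgb;

/* Number of cells of size (x, y) needed to cover a width-by-height image. */
static int cell_count(int width, int height, int x, int y)
{
    assert(x > 0 && y > 0);
    int cols = (width + x - 1) / x;    /* round up so border pixels are covered */
    int rows = (height + y - 1) / y;
    return cols * rows;
}

/* Return the index of the layer whose reference colour is closest to c
 * (assumed criterion: smallest squared distance in RGB space). */
static int classify_layer(Rgb c, const Rgb *layers, int n_layers)
{
    assert(n_layers > 0);
    int best = 0;
    long best_d = -1;
    for (int i = 0; i < n_layers; i++) {
        long dr = c.r - layers[i].r;
        long dg = c.g - layers[i].g;
        long db = c.b - layers[i].b;
        long d = dr * dr + dg * dg + db * db;
        if (best_d < 0 || d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

For a 100 x 100 image, cells of size (10, 10) give 100 cells while cells of size (5, 5) give
400, which matches the remark that smaller x and y values produce more cells.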
2.1.2 Cell network
As mentioned previously, a cell network is a group of cells and the relations between them.
Each cell typically holds data consisting of four elements: identity, local state (such as
pollution density, insect population, geographic position), links to other cells (its
neighbours), and relative positions of its neighbours. The last one means that a cell is
capable of determining the directions of its neighbours, which can be located to the east, the
west, the north, or the south. This property can be useful in various situations such as
simulating the weather or the flow of a fluid. For the sake of simplicity, directions can be
encoded as pairs of numbers, as shown in Table 2.1.
Direction Value
East (1,0)
West (-1,0)
North (0,-1)
South (0,1)
Table 2.1: A proposed organization of directions in a cell network.
Table 2.2 shows an example of a cell network, which is generated by the PickCell tool except
for its data, represented by the column named "Pollution Density". The data can be
loaded at the beginning of a simulation or at runtime.
Row   Cell Id   Pollution Density   Neighbour Id       Directions
1     0         100                 590, 25, 1, 600    (-1,0), (1,0), (0,-1), (0,1)
2     1         50                  589, 0             (-1,0), (0,1)
...   ...       ...                 ...                ...
26    25        10                  0, 26              (-1,0), (0,1)
...   ...       ...                 ...                ...
591   590       50                  2, 0, 589          (-1,0), (1,0), (0,-1)
...   ...       ...                 ...                ...
601   600       78                  0                  (1,0)
Table 2.2: The table presents a cell network structure of 601 cells generated by PickCell tool (Von Neumann 1 CA).
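The four elements of a cell (identity, local state, links, and relative positions) can be
pictured as a C structure. The sketch below is an assumption made for illustration: the type
and field names (Cell, Direction, neighbour_at) are hypothetical and do not reproduce the code
emitted by PickCell. The direction values follow the encoding of Table 2.1.

```c
#include <assert.h>

#define MAX_NEIGHBOURS 4   /* Von Neumann 1: at most four links */

typedef struct { int dx, dy; } Direction;   /* pair encoding of Table 2.1 */

typedef struct {
    int id;                           /* identity */
    int state;                        /* local state, e.g. pollution density */
    int n_links;                      /* number of neighbours actually present */
    int links[MAX_NEIGHBOURS];        /* ids of the neighbouring cells */
    Direction dirs[MAX_NEIGHBOURS];   /* relative position of each neighbour */
} Cell;

static const Direction EAST = {1, 0}, WEST = {-1, 0}, NORTH = {0, -1}, SOUTH = {0, 1};

/* Return the id of the neighbour lying in direction d, or -1 if there is none. */
static int neighbour_at(const Cell *c, Direction d)
{
    assert(c->n_links >= 0 && c->n_links <= MAX_NEIGHBOURS);
    for (int i = 0; i < c->n_links; i++)
        if (c->dirs[i].dx == d.dx && c->dirs[i].dy == d.dy)
            return c->links[i];
    return -1;
}
```

With the first row of Table 2.2 stored as
`Cell c0 = {0, 100, 4, {590, 25, 1, 600}, {{-1, 0}, {1, 0}, {0, -1}, {0, 1}}};`,
the call `neighbour_at(&c0, EAST)` returns 25.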
The use of the cell network brings some advantages in developing physical simulations.
Firstly, each cell network is a clear and consistent structure. All its cells come from the
same physical system, hold the same type of local data, and have the same behaviour. This
structure resembles a class in OOP (Object Oriented Programming), with the cells as objects
instantiated from that class. From a software engineering point of view, it is thus especially
useful for maintaining the systems: it is simple to add necessary properties to the states or
transitions of the models.
Secondly, cell networks generated by the PickCell tool help to overcome a limitation of the
input data. Many phenomena simulations use raster data as the input for their models. With this
type of data, it is often difficult to distinguish the regions of interest, which causes
useless computations outside those regions. For example, in [23], data cells
that do not belong to the real area of interest (rivers) are marked "NoData" in the
preprocessing step. Models built from cell networks avoid this useless processing by default.
In addition to the cell network structure, the PickCell tool can also extract visual data.
This is useful for displaying and analyzing simulation results. Figure 2.1 demonstrates how a
river system is displayed from extracted visual data. In the current version, the tool can
generate two-dimensional data in the format of two concurrent programming languages, Cuda
and Occam [6], [7]. Third-dimension data for elevation is planned for the next version.
Figure 2.1: A cell network of a river system generated from PickCell tool with Von Neumann 1.
In short, cell networks generated from PickCell tool are presented as skeletons for simulation
models. In order to obtain a complete model by this approach, two other components need to
be considered: input data and transition rules. These will be presented in the next section.
2.2 Physical simulations based on cell networks
The cell network structure presented earlier is one of the main components of this methodology.
Each model has at least three components: cell network, input data, and transition
rule. The first one is generated from geographic data with the help of the PickCell tool,
whereas the two others are defined according to the characteristics of the physical system.
A summary of the methodology is depicted in Figure 2.2. The process has three main steps.
It begins with geographic data, which are processed by the PickCell tool to generate a cell
network. The cell network is then combined with input data and a transition
rule to make up a complete model. Lastly, this model is executed by a simulator.
Currently, the cell networks are generated in two versions, as Cuda and as Occam code. Cuda was
chosen in this work because its programming model fits the cell network structure well.
Figure 2.2: A summary of the proposed process which is used to conduct physical simulations.
2.3 Case study and applications
This section describes a case study applied to a study region, a small area located in the
Mekong Delta of Vietnam, as shown in Figure 2.3. It contains three physical systems: river,
forest, and road. The first two, the river system and the forest system, were considered in
this project.
As applications of the proposed approach, two models are built from the study region. One is a
model of forest fire spread. The other is a model of river pollution diffusion.
In addition, we assume that a wireless sensor network (WSN) is used to monitor the status of
the forest. Thus, a model of the WSN is also developed. Details of the three models are
described later in this section.
Another assumption is that these three systems communicate with each other. One interaction
happens when the fire spreads close to the river: ashes from the fire then pollute the river.
Meanwhile, when the sensors of the WSN detect fire near them, they raise emergency signals.
This scenario will be clarified and used as an application of the solution
presented in Chapter 4.
Figure 2.3: The study region: A small area in Mekong Delta, the South of Vietnam. (data source: OpenStreetMap [16])
In reality, models use many elements of input data, and transition rules are often very
complicated, since the goal is to create simulations that are as realistic as possible. In our
case, however, only some basic characteristics are picked to demonstrate the feasibility of the
proposed methodology. The input data and the transition rule of each model are presented
as follows:
2.3.1 The diffusion of pollution in the river
This model is used to simulate the diffusion of pollution in a river. Pollution can take
various forms, such as chemicals, oil, or other contaminants. In all cases, the diffusion
depends strongly on the density. Thus, the pollution density is kept as the input
data for this model. Each cell holds a pollution density, which represents the
cell state. The states change according to the transition rule.
• Input data: Pollution density.
• Transition rule: At every time step, to obtain its new state at time t+1, each cell
performs the following tasks in sequence:
– If the local density value is larger than zero, a randomly chosen amount is subtracted
from its density. That amount is transported in equal shares to its neighbours.
– Next, it receives the shares sent by its neighbours.
– Finally, the additions and the subtraction are applied to its state to prepare for the next
step (t+1).
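The transition rule above can be sketched in C as a sequential reference version. Several
points are assumptions made for the example: the names (DiffCell, diffusion_step), the use of
integer densities, and the fact that the amount given away is supplied by the caller as a
percentage per cell instead of being drawn randomly inside the function, which keeps the sketch
deterministic.

```c
#include <assert.h>

#define DIFF_MAX_LINKS 4

typedef struct {
    int n_links;
    int links[DIFF_MAX_LINKS];   /* indices of the neighbouring cells */
} DiffCell;

/*
 * One synchronous diffusion step over n cells.
 * now[i]     : density of cell i at time t
 * next[i]    : density of cell i at time t+1 (output, must not alias now)
 * percent[i] : share (0..100) of its density that cell i gives away in this step;
 *              in the model this amount is random, here the caller chooses it.
 */
static void diffusion_step(const DiffCell *net, const int *now, int *next,
                           const int *percent, int n)
{
    assert(now != next);
    for (int i = 0; i < n; i++)
        next[i] = now[i];
    for (int i = 0; i < n; i++) {
        if (now[i] <= 0 || net[i].n_links == 0)
            continue;
        /* equal share sent to each neighbour (integer division) */
        int share = (now[i] * percent[i] / 100) / net[i].n_links;
        next[i] -= share * net[i].n_links;     /* subtraction at the sender */
        for (int k = 0; k < net[i].n_links; k++)
            next[net[i].links[k]] += share;    /* addition at each neighbour */
    }
}
```

Because every amount subtracted from one cell is added to its neighbours, the total density of
the network is unchanged by a step. This sequential version lets each cell write into its
neighbours; a one-thread-per-cell version would more likely let each cell read what its
neighbours send, to avoid concurrent writes.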
2.3.2 The fire spread in the forest
This model simulates the spread of fire in the forest. It is reproduced from a sample in
CORMAS [15]. Each cell has four possible states: tree, fire, ash, and empty. At the beginning,
some cells are initialized with the state fire, while the others are tree.
• Input data: Tree, fire, ash, and empty.
• Transition rule:
– If a cell is tree at time t, it becomes fire at time t+1 if at
least one of its neighbours is fire.
– If a cell is fire at time t, it becomes ash at time t+1.
– If a cell is ash at time t, it becomes empty at time t+1.
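The three cases of the rule translate directly into a small C function. This is a minimal
sketch; the names (ForestState, ForestCell, fire_next_state, fire_step) are hypothetical and
the structure is simplified to the links only.

```c
#include <assert.h>

typedef enum { TREE, FIRE, ASH, EMPTY } ForestState;

#define FOREST_MAX_LINKS 4

typedef struct {
    int n_links;
    int links[FOREST_MAX_LINKS];   /* indices of the neighbouring cells */
} ForestCell;

/* Next state of cell i, computed only from the states at time t. */
static ForestState fire_next_state(const ForestCell *net, const ForestState *now, int i)
{
    switch (now[i]) {
    case TREE:
        for (int k = 0; k < net[i].n_links; k++)
            if (now[net[i].links[k]] == FIRE)
                return FIRE;       /* at least one neighbour is burning */
        return TREE;
    case FIRE:
        return ASH;
    case ASH:
        return EMPTY;
    default:
        return EMPTY;              /* an empty cell stays empty */
    }
}

/* One synchronous step over the whole cell network. */
static void fire_step(const ForestCell *net, const ForestState *now,
                      ForestState *next, int n)
{
    assert(now != next);
    for (int i = 0; i < n; i++)
        next[i] = fire_next_state(net, now, i);
}
```

On a chain of three cells starting as fire, tree, tree, one step gives ash, fire, tree and a
second step gives empty, ash, fire.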
2.3.3 Wireless sensor network (WSN)
In this study, the WSN plays the role of a sensing component. It regularly collects raw data
from the environment, processes that data, and raises an emergency alert when a fire is
detected. A WSN monitors the status of the forest. To do that, a set of sensors is deployed
along the forest border, because our concern is the spread of the fire to other systems. In
this case, we use a simple distributed algorithm for the deployment of sensors. The algorithm
is described in Section 2.4. A simple WSN is obtained as shown in Figure 2.4.
Figure 2.4: Deploying sensors along the forest border extracted from the study region with the 4 neighbour pattern. The communication range and the sensing range are 25 and 5 cell units, respectively.
Typically, sensors have two types of ranges. One is the sensing range, which indicates the
sensing capacity of the sensor and can be small. The other, the communication range, can be
longer thanks to radio link technology. Thus, when deploying sensors, it is necessary to make
sure that the sensors remain connected to each other given the value of the communication range.
• Input data: Sensing data.
• Transition rule: At every step, the nodes check the data received from the forest fire
simulation. If fire is detected at some points, signals are raised.
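A possible reading of this rule is sketched below in C. It assumes that sensors and burning
cells are described by positions in cell units and that a sensor detects a fire when the
Euclidean distance is within its sensing range; both the distance criterion and the names
(Sensor, CellPos, wsn_step) are assumptions for illustration.

```c
#include <assert.h>

typedef struct { int x, y; } CellPos;

typedef struct {
    CellPos pos;
    int sensing_range;   /* in cell units */
    int alarm;           /* set to 1 when a fire is detected */
} Sensor;

/* Update every sensor from the positions of the cells currently on fire.
 * Returns the number of sensors raising a signal in this step. */
static int wsn_step(Sensor *sensors, int n_sensors, const CellPos *fires, int n_fires)
{
    assert(n_sensors >= 0 && n_fires >= 0);
    int raised = 0;
    for (int s = 0; s < n_sensors; s++) {
        long r = sensors[s].sensing_range;
        sensors[s].alarm = 0;
        for (int f = 0; f < n_fires; f++) {
            long dx = fires[f].x - sensors[s].pos.x;
            long dy = fires[f].y - sensors[s].pos.y;
            if (dx * dx + dy * dy <= r * r) {   /* within sensing range */
                sensors[s].alarm = 1;
                break;
            }
        }
        raised += sensors[s].alarm;
    }
    return raised;
}
```

With a sensing range of 5 cell units, as in Figure 2.4, a sensor at (0, 0) detects a burning
cell at (3, 4) but a sensor at (20, 0) does not.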
2.4 Routing algorithm
This section presents a routing algorithm implemented in parallel. To take advantage of
GPU computation, a new version of this algorithm was implemented in Cuda, starting from an
Occam program. The resulting routing table can be used for deploying sensors as described in
the previous section.
We assume that the network has the shape and structure of the cell network introduced
in Section 2.1.2. Generally, it consists of n nodes, numbered 0 to n-1; these numbers serve as
their identities, as shown in Figure 2.5. Two elements are associated with each node: a route
table and a temperate table (a temporary buffer). The route table stores the identities of the
node itself and of the other nodes that it has reached after t steps. The structure of this
table is presented in Table 2.3. Meanwhile, the temperate table only contains the identities of
the new nodes reached at the current step. This means that after each step, the values held by
the temperate table are completely replaced by new ones, while the route table either gains new
records or stays unchanged.
At each step, each node performs two main tasks: sending its local temperate table to its
neighbours and receiving their temperate tables. These tasks are performed n-1 times, so that
even the maximum possible distance is covered. The algorithm is as follows:
Algorithm in parallel:
• Initializing
– Add the node's id to the local temperate table and to the route table, with distance zero
and link index -1.
• For i = 1 to n-1
– For each neighbour
∗ Send the local temperate table to the neighbour.
∗ Receive a temperate table from the neighbour.
∗ Empty the local temperate table.
∗ For each id in the received temperate table
· If id does not exist in the route table:
· Add id, with i as distance and the link index, to the route table.
· Add id to the local temperate table.
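The following C sketch emulates the parallel algorithm sequentially. It is written under
stated assumptions: all nodes are taken to send their temperate tables at the same time in
each step (modelled by a snapshot), and the names (RtNode, rt_build) and fixed array sizes are
hypothetical. It is not the Occam or Cuda program mentioned above.

```c
#include <assert.h>
#include <string.h>

#define RT_MAX_NODES 8
#define RT_MAX_LINKS 4

typedef struct {
    int n_links;
    int links[RT_MAX_LINKS];   /* neighbour ids, indexed by link index */
    int dist[RT_MAX_NODES];    /* route table: distance to each id, -1 if unknown */
    int via[RT_MAX_NODES];     /* route table: link index used to reach each id */
    int temp[RT_MAX_NODES];    /* temperate table: ids learnt in the last step */
    int n_temp;
} RtNode;

static void rt_build(RtNode *nodes, int n)
{
    assert(n > 0 && n <= RT_MAX_NODES);
    /* Initialisation: each node knows only itself (distance 0, link index -1). */
    for (int v = 0; v < n; v++) {
        for (int id = 0; id < n; id++) {
            nodes[v].dist[id] = -1;
            nodes[v].via[id] = -1;
        }
        nodes[v].dist[v] = 0;
        nodes[v].temp[0] = v;
        nodes[v].n_temp = 1;
    }
    for (int step = 1; step <= n - 1; step++) {
        /* Snapshot of what every node sends in this step. */
        int sent[RT_MAX_NODES][RT_MAX_NODES];
        int n_sent[RT_MAX_NODES];
        for (int v = 0; v < n; v++) {
            n_sent[v] = nodes[v].n_temp;
            memcpy(sent[v], nodes[v].temp, sizeof sent[v]);
            nodes[v].n_temp = 0;   /* empty the local temperate table */
        }
        /* Each node receives the temperate table of each of its neighbours. */
        for (int v = 0; v < n; v++)
            for (int l = 0; l < nodes[v].n_links; l++) {
                int u = nodes[v].links[l];
                for (int k = 0; k < n_sent[u]; k++) {
                    int id = sent[u][k];
                    if (nodes[v].dist[id] < 0) {          /* id not yet in the route table */
                        nodes[v].dist[id] = step;
                        nodes[v].via[id] = l;
                        nodes[v].temp[nodes[v].n_temp++] = id;
                    }
                }
            }
    }
}
```

For a hypothetical four-node network with edges 0-1, 0-3, 1-3 and 2-3, node 0 ends with
distance 1 to nodes 1 and 3 and distance 2 to node 2, reached through its link to node 3.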
Figure 2.5: A simple network.
Node 0                          Node 1
Known Id   Distance   Links     Known Id   Distance   Links
0          0          -1        1          0          -1
1          1          0         0          1          0
3          1          1         3          1          1
2          2          0         2          2          0

Node 2                          Node 3
Known Id   Distance   Links     Known Id   Distance   Links
2          0          -1        3          0          -1
3          1          0         1          1          0
0          2          0         0          1          1
1          2          1         2          1          2
Table 2.3: An example of route table at node 0 after 3 steps.
These tables show the information held by the nodes in the network. Each node knows "who" it
can reach and the distance to each destination.
2.5 Remarks
This chapter presented a variety of subjects. The most notable is the concept of the cell
network, which plays an important role in developing physical models. In the next chapter,
parallel computations are employed to simulate these models.
3 Simulations with Cuda programming model
This chapter describes the Cuda programming model and its applications. One goal is to show
that the cell network structure maps well onto the GPU architecture. In addition, this mapping
makes it possible to handle both large cell networks and complicated behaviour. Next,
performance tests are conducted in different scenarios to assess the effectiveness of this
approach.
3.1 GPU and Cuda programming model
3.1.1 Introduction to GPU
The Graphics Processing Unit (GPU) [5] is a massively multithreaded many-core chip composed
of hundreds of cores that run thousands of threads. This provides the capacity for processing
large data in parallel. Thus, it is widely used in parallel computations.
A simplified motherboard architecture is depicted in Figure 3.1. There are two parts, the left
part for the CPU (host) and the right one for the GPU (device). They are connected
by a PCI bus. On the CPU side, only the host memory is considered in this model. Meanwhile, the
GPU chip comes with a set of streaming multiprocessors (SM). Each consists of several scalar
processors (SP), a set of registers, and a shared memory. This on-chip shared memory is visible
to all threads executing on the same SM. A global memory is shared by all SMs.
Figure 3.1: A simplified motherboard architecture.
3.1.2 Cuda programming model
Cuda (Compute Unified Device Architecture) was created by NVIDIA. It provides a parallel
computing platform and programming model that increases computing performance
by harnessing the power of the GPU. Cuda provides a set of extensions to the C/C++ language
to express parallel programs.
The GPU runs thousands of threads handling multiple tasks, while a CPU runs a few threads
for sequential processing. Thus, a Cuda program typically consists of CPU code (host
code) and one or more kernels (device code) running concurrently on the GPU. As shown in
Figure 3.2, the compute-intensive portions of the application are sent to the GPU, while the
remainder of the code still runs on the CPU.
Kernels are executed by many threads, each with private local variables; threads of the same
block also share a block-level shared memory. Thread blocks execute independently of each
other, while the threads within a block can be synchronized. In addition, the CPU and the GPU
each have their own separate memory and cannot directly access each other's memory. Thus, data
must be transferred explicitly between the two memories via the PCI bus.
Figure 3.2: Anatomy of a CUDA program.
3.2 Accelerating simulations by using Cuda
Programming with Cuda means programming a large number of threads, each with its own local
data and access to shared memory, that concurrently execute the same task. Therefore, when a
large amount of identical, repeated work must be processed, it is convenient to apply this model.
In our case, each model has a cell network, input data for each cell, and a common transition
rule for all cells. In other words, each cell has local data and a global behaviour.
Every cell must perform the same computation on its own data at each step in order to produce
the new state of the system. It is thus simple to map each cell to one thread, which is
responsible for the processing of that cell, as illustrated in Figure 3.3.
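The cell-to-thread mapping amounts to simple index arithmetic, sketched here in plain C so
that it can be checked without a GPU. The function names and the block size of 256 threads are
assumptions for the example; in a Cuda kernel the same index would be written as
blockIdx.x * blockDim.x + threadIdx.x.

```c
#include <assert.h>

/* Number of blocks needed so that blocks * threads_per_block >= n_cells. */
static int blocks_for(int n_cells, int threads_per_block)
{
    assert(n_cells >= 0 && threads_per_block > 0);
    return (n_cells + threads_per_block - 1) / threads_per_block;
}

/* Index of the cell handled by one thread of one block. */
static int cell_index(int block_idx, int block_dim, int thread_idx)
{
    return block_idx * block_dim + thread_idx;
}

/* Sequential emulation of a launch: count the threads that own a valid cell. */
static int count_active_threads(int n_cells, int threads_per_block)
{
    int active = 0;
    int blocks = blocks_for(n_cells, threads_per_block);
    for (int b = 0; b < blocks; b++)
        for (int t = 0; t < threads_per_block; t++)
            if (cell_index(b, threads_per_block, t) < n_cells)   /* guard for the last block */
                active++;
    return active;
}
```

For the 601-cell network of Table 2.2 and 256 threads per block, three blocks are launched and
the guard leaves the surplus threads of the last block idle.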
Figure 3.3: The mapping between the cell network structure and the GPU architecture.
According to this model, data need to be moved to the global memory to be shared between
threads. Figure 3.4 shows the data flow of physical simulations in terms of CUDA programming.
It can be summarized in the following main steps:
• Initializing the states (input data) of all network cells.
• Transferring data (cells' states and network structure) to the GPU for computation. For
each cycle, the new states of all nodes are computed concurrently on the GPU. These
states are then updated with the new values to prepare for the next cycle.
• Sending data back to the CPU memory, possibly to display and analyze the results.
This step is optional: if the result of each step is not needed for display or analysis at
run-time, these operations can be omitted.
Figure 3.4: Data flow in the system.
Obviously, if the display and analysis phase is skipped, the simulation runs mostly on the
device. Hence, the performance benefit in this case is expected to grow with the size of the
cell network. It becomes more worthwhile when simulating phenomena, which often involve large
sizes and very complicated transition functions.
Moreover, this proposition provides an opportunity to perform computations and statistics in
real time. This becomes increasingly important as the need to predict emergencies grows, for
example swarms of insects, flooding, traffic congestion, tsunamis, or fires. In those
situations, the systems can directly access data available from the natural environment via
observation systems. The simulations use these input data to derive useful information (for
example, the direction of a swarm of insects or the flood level at a certain time in the future).
3.3 Details of GPU implementation of simulations
In this section, the details of the GPU implementations of three main simulations are
presented: pollution diffusion, forest fire and wireless sensor network. All of them are
written in the C programming language in accordance with the CUDA model. These
implementations result from the analysis in the previous section. Their general form is
as follows.
Host program implemented on the CPU
(1) Initializing the values of all cells.
(2) Copying the cell network structure and data from the CPU host memory to the
GPU device memory and launching the kernel.
Kernel program implemented on the GPU
(3) Looping over the cycles.
(4) Computing the new state of each cell.
(5) Updating each cell with its new state.
(6) Reading the results back to the CPU and outputting them (once per time step
or once every several steps).
The execution runs mostly on the GPU (steps (3) to (5)). The other steps have little effect
on overall performance if step (6) is left out: step (1) is executed once and step (2)
is run twice. Thus, for the purpose of comparison, the execution time on the CPU can be neglected.
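Steps (3) to (5) can be illustrated by a sequential reference version in plain C. This is a sketch under stated assumptions, not the thesis implementation: the Links structure is a simplified stand-in for Canaux, the names compute_state and run_cycles are hypothetical, and the update rule is the diffusion rule of Listing 3.1. On the GPU, the loop over cells disappears because each cell is handled by its own thread.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_NB 8

typedef struct { float density; } NodeState;
/* Simplified stand-in for the Canaux structure: incoming neighbour ids. */
typedef struct { int nbIn; int read[MAX_NB]; } Links;

/* Step (4): new state of one cell, computed from the previous cycle only. */
NodeState compute_state(const NodeState *now, int node, const Links *links)
{
    NodeState s = now[node];
    float receive = 0.0f;
    for (int i = 0; i < links[node].nbIn; i++) {
        int in = links[node].read[i];
        receive += (now[in].density / 2.0f) / (float)links[in].nbIn;
    }
    s.density = s.density / 2.0f + receive;
    return s;
}

/* Steps (3) to (5): all new states go to a second buffer before any cell
   is overwritten, so every cell reads a consistent previous cycle. */
void run_cycles(NodeState *state, const Links *links, int n, int cycles)
{
    NodeState *next = malloc((size_t)n * sizeof *next);
    if (next == NULL)
        return;
    for (int c = 0; c < cycles; c++) {
        for (int node = 0; node < n; node++)
            next[node] = compute_state(state, node, links);
        memcpy(state, next, (size_t)n * sizeof *next);   /* step (5) */
    }
    free(next);
}
```

The separate buffer is the design point here: writing new states in place would let some cells read neighbours that already belong to the next cycle.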
In the next section, some initial measurements are performed to evaluate the effectiveness
of using the massively parallel GPU architecture to accelerate the computation of phenomena
simulations.
3.4 Performance measurement principles
In order to validate the performance of the proposed methodology, a few measurement tests
were performed. The simulation of pollution diffusion in a river was chosen as the test case.
The pollution diffusion model is described in Section 2.3.1. The implementation of the
transition function is presented in Listing 3.1. Two data structures are used: the NodeState
structure contains the states of cells, and the Canaux structure holds the links to neighbours.
Listing 3.1: Transition function
__device__ NodeState computeState(NodeState *nowState, int nodeIndex,
                                  Canaux *channels)
{
    NodeState myState;
    int nbIn, nodeIn;
    float receive;

    /// Getting pollution density of the cell
    myState = nowState[nodeIndex];
    /// Getting number of neighbours of the cell
    nbIn = channels[nodeIndex].nbIn;
    receive = 0;
    for (int i = 0; i < nbIn; i++)
    {
        /// Getting id of the neighbour
        nodeIn = channels[nodeIndex].read[i].node;
        receive = receive +
            ((nowState[nodeIn].density / 2.0) / (float) channels[nodeIn].nbIn);
    }
    /// Computing the new state
    myState.density = (myState.density / 2.0) + receive;
    return myState;
}
We tested and evaluated the computational efficiency in several studies. The focus of these
tests is to show how much the GPU speeds up the simulations compared to the CPU.
Therefore, the time for transferring data between the CPU and the GPU is omitted in
most cases. The execution time of the simulation on the host is also ignored, since most of
the computation is moved to the device.
As mentioned earlier, the simulation execution costs depend on two main components: the cell
network (size and type of CA pattern chosen) and the complexity of the transition rules. Thus,
several aspects related to these components are examined.
All tests were run on a PC with the hardware configuration shown in Table 3.1. Information
about the graphics device is presented in Table 3.2 (for more details, see [11]). We used the
profiling tool nvprof [17] to measure the GPU computation time and the standard library time.h
to measure the CPU computation time.
Intel(R) Xeon(R) CPU E3-1240 V2 @ 3.40GHz
Num. CPUs 8
Num. Cores/CPU 4
Architecture i686
RAM 16 GB
Table 3.1: Technical data of PC used.
GeForce GTX 680
Num. cores 1536
Maximum number of threads per block 1024
Global memory 4 GB
Table 3.2: Technical data of NVidia graphics card used.
The first scenario: The computation times of the CPU and the GPU were compared.
All tests follow the river pollution diffusion model (Section 2.3.1) with
the 8-neighbour pattern and 1,000 cycles per test. The transport time was
considered in this case study.
The computation on both the CPU and the GPU is influenced by the size of the cell network
(number of cells), but not by the size of the cells. Since the cell is the basic element of a cell
network, the computation does not depend on how many pixels a cell covers. For a given study
region, a smaller cell size yields a larger cell network, and a bigger cell size yields a smaller
one. Thus, the cell size itself was disregarded in the performance tests.
Table 3.3 shows the execution times of the pollution diffusion model on the CPU and the GPU
for 1,000 cycles. The network sizes range from 1,220 to 83,661 cells.
The number of cells influences the performance of both the CPU and the GPU. On the CPU,
the upward trend is very noticeable: from 10,703 to 83,661 cells, the time increases at a
rate of about 0.26 s per 1,000 cells. This trend is expected to continue for bigger sizes.
On the GPU, by contrast, the increase is not dramatic: the time rises gradually between
1,220 and 83,661 cells, at a rate of about 0.01 s per 1,000 cells.
Table 3.3 shows that the GPU is much faster than the CPU, and the gap widens as the number
of cells rises. This is shown graphically in Figure 3.5. For a cell network of 83,661 cells,
the GPU is approximately 22 times faster than the CPU. The use of the GPU is therefore
especially valuable for very large systems.
                                Time (seconds) / 1,000 cycles
Num. cells  Cell size (pixels)  CPU     GPU
1,220       10x10               0.060   0.040
10,703      5x5                 0.590   0.103
48,425      2x2                 10.880  0.527
83,661      2x2                 19.910  0.894
Table 3.3: The computation comparison between the CPU and the GPU in the case of the pollution diffusion model.
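The speed-up and the growth rates quoted in the text can be recomputed from Table 3.3. The sketch below only restates the table values; the function names speedup and rate_per_1000_cells are illustrative.

```c
/* CPU and GPU times from Table 3.3, in seconds per 1,000 cycles. */
static const int    cells[] = { 1220, 10703, 48425, 83661 };
static const double cpu_s[] = { 0.060, 0.590, 10.880, 19.910 };
static const double gpu_s[] = { 0.040, 0.103, 0.527, 0.894 };

/* How many times faster the GPU is for one row of the table. */
double speedup(int row)
{
    return cpu_s[row] / gpu_s[row];
}

/* Average growth in seconds per 1,000 additional cells between two rows. */
double rate_per_1000_cells(const double *t, int from, int to)
{
    return (t[to] - t[from]) / ((cells[to] - cells[from]) / 1000.0);
}
```

With these values the speed-up is about 1.5, 5.7, 20.6 and 22.3 for the four network sizes. The CPU rate between 10,703 and 83,661 cells is about 0.26 s per 1,000 cells, and the GPU rate between 1,220 and 83,661 cells is about 0.01 s per 1,000 cells.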
Figure 3.5: Demonstrating the accelerating time of using the GPU for physical simulation.
Figure 3.6 shows an example of a physical simulation on the GPU. The cell network of a river
is generated by the PickCell tool using the four-neighbour pattern, and the pollution diffusion
model is the one from Section 2.2. Initially, two polluted points are created at random
in the river. These points hold an amount of pollution density as their state data.
At every step, the system states change according to the transition function.
Figure 3.6: Illustrating a simulation of diffusing pollution in a river following the model described in Section 2.2. It is initialized with two polluted points (black points).
The second scenario: Different sizes of cell networks are still taken into account. The
two popular CA patterns (Von Neumann 1 and Moore 1) and different numbers of
cycles are considered as well. The model is the same as in the previous case. The results
are presented in Table 3.4. One of these runs is shown in Figure 3.6.
The values in Table 3.4 indicate that the number of cycles has little effect on the execution
time per cycle: the total time grows almost linearly with the number of cycles. This can be
explained by the transition function being too simple to generate major differences.
                                            Time (seconds) / Num. cycles
Num. cells  Cell size (pixels)  CA pattern  100    1,000  10,000  100,000  1,000,000
1,220       10x10               VN 1        0.002  0.035  0.356   3.564    35.643
1,220       10x10               Moore 1     0.002  0.035  0.355   3.569    35.642
10,703      5x5                 VN 1        0.010  0.103  1.036   10.369   103.544
10,703      5x5                 Moore 1     0.017  0.170  1.704   17.058   170.544
48,425      2x2                 VN 1        0.052  0.527  5.268   52.704   527.008
48,425      2x2                 Moore 1     0.087  0.880  8.801   88.008   880.386
83,661      2x2                 VN 1        0.145  0.894  8.948   89.477   895.427
83,661      2x2                 Moore 1     0.219  1.454  14.566  145.661  1,002.105
Table 3.4: Measurement results.
Regarding CA patterns, for small networks the differences between Von Neumann 1 and
Moore 1 are negligible. For larger ones, however, Von Neumann 1 is significantly faster.
For example, with 10,000 cycles and a network of 83,661 cells, Moore 1 takes 14.566 s
while Von Neumann 1 takes only 8.948 s. The former is about 1.6 times slower than the
latter, as shown in Figure 3.7.
Figure 3.7: The graph displays the increase of the gap between two CA patterns with 10,000 cycles.
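The difference between the two patterns comes down to how many neighbours each cell reads per cycle. As a small sketch (the function name neighbour_offsets is hypothetical and a regular grid is assumed), the radius-1 offsets can be enumerated as follows.

```c
#include <stdlib.h>

/* Fills dx/dy with the offsets of a radius-1 neighbourhood and returns
   how many there are: 4 for Von Neumann 1, 8 for Moore 1. */
int neighbour_offsets(int moore, int dx[8], int dy[8])
{
    int count = 0;
    for (int y = -1; y <= 1; y++) {
        for (int x = -1; x <= 1; x++) {
            if (x == 0 && y == 0)
                continue;                      /* the cell itself */
            if (!moore && abs(x) + abs(y) != 1)
                continue;                      /* Von Neumann skips diagonals */
            dx[count] = x;
            dy[count] = y;
            count++;
        }
    }
    return count;
}
```

With Moore 1, the loop over neighbours in the transition function runs up to eight times per cell instead of four. This is consistent with the measured slowdown of about 1.6 on the largest network, although it is not a full explanation of it.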
The third scenario: It aims to show that the execution time also depends on the transition
function. To do that, we slightly modified the previous version: at every step,
each cell loses a random amount of its pollution density. The implementation is shown
below.
Listing 3.2: Transition function (version 2)
__device__ NodeState computeState(NodeState *nowState, int nodeIndex,
                                  Canaux *channels, curandState *devStates)
{
    NodeState myState;
    float lossPercentage, receive, loss;
    int nbIn, nodeIn;

    myState = nowState[nodeIndex];
    /// Generating a random value in [0.0 - 1.0] by generateNumber function.
    lossPercentage = generateNumber(devStates, nodeIndex);
    /// Calculating an amount of loss.
    loss = lossPercentage * myState.density;
    /// Getting number of neighbour
    nbIn = channels[nodeIndex].nbIn;
    receive = 0;
    for (int i = 0; i < nbIn; i++)
    {
        /// Getting id of the neighbour
        nodeIn = channels[nodeIndex].read[i].node;
        receive = receive + ((nowState[nodeIn].density / 2.0) /
                             (float) channels[nodeIn].nbIn);
    }
    /// Computing the new state
    myState.density = (myState.density / 2.0) + receive - loss;
    if (myState.density < 0.0)
    {
        myState.density = 0.0;
    }
    return myState;
}
Figure 3.8 demonstrates the influence of the transition rules on the execution time.
Version 2 is slower than version 1 because of its more complex behaviour. The increase in
time is stable across network sizes.
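The extra work of version 2 is the random draw and the clamping of the result. The body of generateNumber is not shown in Listing 3.2; since it receives a curandState, it presumably wraps a cuRAND call. The plain C sketch below therefore uses a hypothetical stand-in, a small linear congruential generator with one state word per cell, only to make the arithmetic of the update visible.

```c
/* Hypothetical stand-in for generateNumber: a linear congruential
   generator with one state word per cell, returning a value in [0, 1). */
float generate_number(unsigned int *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (float)(*state >> 8) / 16777216.0f;   /* 24 random bits / 2^24 */
}

/* Density of a cell after one step of version 2: half of the old density
   stays, the neighbours' contribution is added, a fraction of the old
   density is lost, and the result is clamped at zero. */
float lossy_density(float density, float receive, float loss_fraction)
{
    float loss = loss_fraction * density;
    float next = density / 2.0f + receive - loss;
    return next < 0.0f ? 0.0f : next;
}
```

The clamp matters because the loss can exceed what remains: whenever the loss fraction is above one half and little arrives from the neighbours, the unclamped result would be negative.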
Figure 3.8: Comparing the execution time between previous transition function (version 1) and the new one (version 2).
4
Distributed simulation with HLA
Simulations of large systems often face performance issues, which the CUDA programming
model can address. However, the lack of interoperability between simulations poses a major
challenge. The High Level Architecture (HLA) [12], [13], [20] standard is therefore proposed
as a solution to that new demand. With this standard, many sub-simulations can be distributed
instead of developing one vast simulation. Integrating the CUDA model with the HLA leads to
a hybrid solution in which several parallel simulations can be distributed over different
computer systems. This chapter gives a brief description of the application of HLA to
parallel simulations.
4.1 Overview of The High Level Architecture (HLA)
The High Level Architecture (HLA) [12], [13], [20] is a standard for distributed simulations
whose main goal is to support the interoperability and reusability of simulations. The HLA was
developed by the United States Department of Defense (DoD) to facilitate the integration of
distributed simulation models within an HLA environment. It allows a large-scale model to be
divided into a number of manageable components while maintaining the interaction between
them. In recent years, the HLA has been deployed in a wide range of simulation application
areas, including transportation and the manufacturing industry. However, it hardly appears in
the simulation of natural phenomena, especially in the area of climate change. The HLA is thus
suggested in this project as a potential approach for composing parallel simulations.
Figure 4.1: HLA Federation.
In HLA terminology, the entire system is represented by a federation. Each simulator
belonging to the federation is called a federate. A set of federates is connected via the Run
Time Infrastructure (RTI). These federates can run on different platforms and be connected
by a network. In that case, the RTI can be viewed as a distributed operating system
that interconnects the cooperating federates. Figure 4.1 describes the global architecture of
an HLA simulation. Generally, the HLA specification defines:
• A set of rules: These describe the responsibilities of federates and their relationship with
the RTI. There are ten rules. One of them is that all exchange of data among federates should
occur via the RTI during a federation execution.
• An interface specification: The interface specification prescribes the interface between
each federate and the Run Time Infrastructure (RTI), which provides communication services
to the federates. It is divided into several main management areas:
– Federation management: This includes the main tasks of creating federations, joining
federates to federations, resigning federates from federations, and destroying
federations.
– Declaration management: This allows federates to publish and subscribe to class
attributes and interactions through the RTI. Other federates can only subscribe to an
attribute or an interaction once it has been published by the federate owning it.
– Object management: This includes the tasks of creating objects and sending their
updates to other federates.
– Ownership management: The RTI allows federates to distribute the responsibility
for updating and deleting object instances, with a few restrictions.
– Time management: This covers the implementation of time management policies
and the negotiation of time advances. This mechanism allows several simulations
to run concurrently.
• An Object Model Template (based on the OMT standard [14]): This component
defines how information is communicated between federates, and how the federates and
the federation have to be documented (using the Federation Object Model, FOM). The FOM
defines the shared objects, attributes, and interactions for the whole federation.
Two kinds of elements can be exchanged between federates:
An object: an entity that represents an “actor” playing in the simulation. It contains
shared data that are created by a federate during the federation execution and persist
until the object is destroyed. The FOM defines all object classes; an example is given in
Table 4.2. When a federate wants to publish or subscribe to an object, it must define that
object compatibly in its FOM. Objects store their data in attributes.
An interaction: a broadcast message that any federate can send or receive. A publishing
federate sends out an interaction to the federates that have subscribed to it.
If no subscribing federate receives the interaction, the data it carries are lost.
The FOM also defines all interaction classes. When a federate wants to publish or subscribe
to an interaction, it must define that interaction compatibly in its FOM. Interactions carry
data in parameters.
Figure 4.2: Illustrating a high level of the interplay between a federate and a federation.
Figure 4.2 depicts the interplay between a federate and a federation. Initially, a federate
tries to create a federation, or to join an existing one, on the RTI. It then specifies which
data will be shared with other federates by using the publishing services. These published
objects or interactions become available to all federates that are connected to the same
federation.
When a federate wants to send data to other federates, it has to register objects and call an
update service. The data are then automatically reflected to the subscribers by the RTI. At
the end, the allocated resources must always be released.
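The publish, subscribe and update sequence can be pictured with a toy model of the bookkeeping for a single object class. This sketch is deliberately not the HLA or CERTI API: all names (ToyClass, toy_publish, toy_subscribe, toy_update) are invented for illustration, and federates are reduced to integer ids.

```c
#define MAX_FED 8

/* Toy stand-in for the RTI bookkeeping of ONE object class. */
typedef struct {
    int publisher;             /* federate id that published the class, -1 if none */
    int subscribed[MAX_FED];   /* 1 if that federate subscribed to the class */
    int reflected[MAX_FED];    /* number of updates delivered to each federate */
} ToyClass;

void toy_init(ToyClass *c)
{
    c->publisher = -1;
    for (int f = 0; f < MAX_FED; f++) {
        c->subscribed[f] = 0;
        c->reflected[f] = 0;
    }
}

void toy_publish(ToyClass *c, int fed)   { c->publisher = fed; }
void toy_subscribe(ToyClass *c, int fed) { c->subscribed[fed] = 1; }

/* An update is reflected to every subscriber except the sender.
   Returns the number of deliveries, or -1 if the sender never published. */
int toy_update(ToyClass *c, int fed)
{
    if (c->publisher != fed)
        return -1;
    int delivered = 0;
    for (int f = 0; f < MAX_FED; f++) {
        if (c->subscribed[f] && f != fed) {
            c->reflected[f]++;
            delivered++;
        }
    }
    return delivered;
}
```

In this toy model, an update from a federate that did not publish the class is rejected, and the sender never addresses its subscribers directly: the delivery is entirely the job of the intermediary, which is the role the RTI plays.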
4.2 Time management in HLA
The RTI provides a variety of optional time management services. Understanding time
management is important for controlling how events are exchanged between federates.
Each federate manages its own logical time and communicates this time to the RTI. The RTI
ensures the correct coordination of federates by advancing time coherently. In the discrete
event simulation literature, logical time is equivalent to "simulation time". It is used to make
sure that federates observe events in the same order [19]. It helps to avoid problems such as
causality violations, or different results from repeated executions of the simulation with the
same input data. Logical time is not mapped to real time.
4.2.1 Time policies
According to the HLA time policies, each federate is involved in the progress of time. In some
cases, it is necessary to tie the progress of one federate to the progress of another. A federate
needs to request the regulating policy to participate in decisions on the progress of time. A
constrained federate follows the time progress imposed by other federates. In our approach, the
logical times of the different federates must be synchronized. Thus, federates are both
regulating and constrained, as shown in Table 4.1. This enables the participating federates
to exchange data with each other.
4.2.2 Time progress
The second part of the time management component provides a mechanism to advance
simulation time within each federate. There are two services that federates can invoke
to request a time advance from the RTI: timeAdvanceRequest is used to implement
time-stepped federates, and nextEventRequest is used to implement event-based federates. The
granted time is delivered by the timeAdvanceGrant service.
Generally, a time management cycle consists of three steps. First, a federate sends a request
for time advancement. Next, the federate may receive ReflectAttributeValues callbacks. Finally,
the RTI completes the cycle by invoking a federate-defined procedure called timeAdvanceGrant
to indicate that the federate's logical time has been advanced.
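The decision behind a grant can be sketched with a simplified conservative rule: a request for time T is granted once no regulating federate can still send a message stamped T or earlier. The model below is an assumption made for illustration, not the RTI algorithm. ToyFederate, toy_lbts and toy_can_grant are hypothetical names, and each regulating federate is assumed to promise nothing earlier than its requested (or current) time plus a lookahead.

```c
typedef struct {
    double time;        /* current logical time */
    double requested;   /* pending advance request, or a negative value if none */
    double lookahead;   /* minimum delay before a sent message takes effect */
    int regulating;     /* 1 if the federate constrains the others */
} ToyFederate;

/* Earliest timestamp that federate `self` may still receive from others. */
double toy_lbts(const ToyFederate *f, int n, int self)
{
    double lbts = 1e300;
    for (int i = 0; i < n; i++) {
        if (i == self || !f[i].regulating)
            continue;
        double base = f[i].requested >= 0.0 ? f[i].requested : f[i].time;
        if (base + f[i].lookahead < lbts)
            lbts = base + f[i].lookahead;
    }
    return lbts;
}

/* 1 if the pending request of `self` can be granted now. */
int toy_can_grant(const ToyFederate *f, int n, int self)
{
    return f[self].requested >= 0.0 && f[self].requested < toy_lbts(f, n, self);
}
```

With two time-stepped federates at time 0 and a lookahead of 1, the first request for time 1 has to wait until the second federate has also asked to advance, which gives the lock-step behaviour expected from time-stepped simulations. A federate that is not regulating never delays the others in this model.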
Figure 4.3: A model of time advancement request is used in this project.
4.2.3 Time synchronization
As presented in previous sections, all simulations have to synchronize their local logical time
to ensure causality. The constrained parameter and the regulating parameter are enabled for
all simulations. The regulating parameter allows a federate to send updates and interactions
in causal order, while the constrained parameter allows it to receive those updates and
interactions from the RTI. Since a passive visualization federate does not send any updates or
interactions, it has no impact on the time advance of the federation. Therefore, in the case of
visualization, only the constrained parameter needs to be enabled and regulating can be
switched off. Table 4.1 shows the time policies proposed for the case study (Section 2.3).
Federate       Time constrained  Time regulating  Time advance
Forest         Yes               Yes              Time stepped
River          Yes               Yes              Time stepped
WSN            Yes               Yes              Time stepped
Visualization  Yes               Yes/No           Time stepped
Table 4.1: Time management of the federation.
To synchronize activities between several federates participating in a federation, the RTI
provides mechanisms for exchanging data between them. Times are associated with the
exchanged data in order to coordinate federate activities. The RTI also allows federates to
communicate explicit synchronization points. Figure 4.4 illustrates the synchronization
process between two federates, the river federate and the forest federate.
Figure 4.4: Federate synchronization
First, one of the available federates, here the river federate, sends a synchronization request
to the RTI. The RTI then sends a response to the river federate and announces the
synchronization point to the other federates. Each federate uses a service to confirm that it
has reached the synchronization point. The next part considers some issues related to
exchanging data in the context of distributed simulations.
4.2.4 Exchanging data
Exchanging data between simulation federates is an important part of distributed systems.
The question that arises is what kind of data must be shared and where the
communication will happen.
The type of data exchanged is determined by the characteristics of the real systems and by
the interoperability between them. In our case, four federates communicate: forest, river,
WSN, and visualization. The forest federate sends its status to the river federate and the
WSN federate. Meanwhile, these three federates provide their data to the visualization
federate for analyzing the results.
To achieve this, the forest federate publishes its data (forest status and position) as an
object class (ForestNode), to which the river federate and the WSN federate subscribe. In the
same way, the river federate and the WSN federate publish the object classes RiverNode
and WSNNode, respectively. These published classes have to be declared in the FOM of the
federation, whose structure is indicated in Table 4.2.
Object class  Attributes                   Published by  Subscribed by
ForestNode    State, Position              Forest        River, WSN, Visualization
RiverNode     Pollution density, Position  River         Visualization
WSNNode       State, Position              WSN           Visualization
Table 4.2: Objects and their attributes, publishers and subscribers.
In some cases, it is also important to specify where data are exchanged between two
federates, especially for very large physical systems. Indeed, it is often inefficient to send
the entire data set via the RTI, because of network performance and the cost of local
computation. Thus, we propose a solution for the general case of exchanging data
between two adjacent systems.
Two physical systems are adjacent when they have a common frontier or some places in
common. Sending unrelated information to other systems is useless. For example, as shown in
Figure 2.3, new polluted points in the river caused by the ashes of a forest fire only
appear at the frontier of the two systems. The forest fire federate regularly sends its states
to the river federate, but the latter only needs the states of points close to it rather than
the entire forest state. Sending everything would not only take time for transporting data
between federates, but also make the computation on the receiver side less efficient.
A solution based on morphology theory [21] can be used to address this issue. It makes it
possible to smooth the boundary of physical systems by applying basic operations such as
erosion and dilation. To summarize, only the status data at the boundary of the forest are
sent to the RTI.
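One common way to obtain a boundary with morphological operations is to subtract the erosion of a region from the region itself. The sketch below assumes a small binary grid and a 4-neighbour structuring element. The names eroded and boundary and the grid size are illustrative, and the thesis may combine erosion and dilation differently.

```c
#define W 5
#define H 5

/* Erosion with the 4-neighbour cross: a cell survives only if it and its
   four neighbours all belong to the region. Cells outside the grid are
   treated as not belonging to the region. */
int eroded(int g[H][W], int x, int y)
{
    if (x <= 0 || y <= 0 || x >= W - 1 || y >= H - 1)
        return 0;
    return g[y][x] && g[y - 1][x] && g[y + 1][x] && g[y][x - 1] && g[y][x + 1];
}

/* Boundary = region minus its erosion. Marks boundary cells in `out`
   and returns how many there are. */
int boundary(int g[H][W], int out[H][W])
{
    int count = 0;
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            out[y][x] = g[y][x] && !eroded(g, x, y);
            count += out[y][x];
        }
    }
    return count;
}
```

For a 3x3 forest block, only the centre cell survives the erosion, so eight of the nine cells are boundary cells. On large regions the boundary grows roughly with the perimeter rather than the area, which is where the saving in transferred data comes from.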
4.3 Distributed physical simulation
This section presents an application of the HLA standard for unifying several parallel
simulations into what is called a mixed simulation. The study region is the one shown in
Figure 2.3.
The whole model was split into three simulation federates: forest fire spread, river pollution
diffusion, and WSN. The simulation federates were all implemented in accordance with both
the CUDA programming model and the HLA standard. These parallel simulations are
executed concurrently as three different simulators. Their models were presented in Section 2.2.
In addition to these three federates, a fourth one, visualization, is designed as a supporting
federate. An overview of the federation can be seen in Figure 4.5.
Figure 4.5: A structure for a proposed federation.
Recalling the communication proposed in Section 2.3, forest fire spread produces ashes,
which result in new polluted points and dust for river pollution diffusion
at time t. The latter includes these new data in its model at time t+1. The communication
depends on a specific condition.
There is also communication between WSN and forest fire spread: the sensors regularly
collect the forest status, since that is the goal of sensing. New information is sent to
the observers. When a fire is detected, the observers raise emergency signals.
To do that, the synchronization needs to be achieved as indicated in Table 4.1
and the shared data have to be declared as shown in Table 4.2.
The FOM of the federation is given in the cyber.fed file shown in Listing 4.1.
Listing 4.1: cyber.fed file
;; Cyber physical simulation
(Fed
  (Federation Cyber)
  (FedVersion v1.0)
  (Federate "river" "Public")
  (Federate "forest" "Public")
  (Federate "wsn" "Public")
  (Federate "visualization" "Public")
  (Objects
    (Class ObjectRoot
      (Attribute privilegeToDelete reliable timestamp)
      (Class RTIprivate)
      (Class ForestNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute State RELIABLE TIMESTAMP)
      )
      (Class RiverNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute Density RELIABLE TIMESTAMP)
      )
      (Class SensorNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute State RELIABLE TIMESTAMP)
      )
    )
  )
)
4.3.1 Forest fire spread federate
The model of this simulation federate was presented in Chapter 2. Some fires (red points)
are initialized at random in the forest. These fires spread according to the transition
function and the CA pattern of the model. An example of the spreading is shown in
Figure 4.6. The green, red, grey, and white points represent trees, fires, ashes, and empty
cells, respectively.
Figure 4.6: An example of simulating fire spread in the forest. The 4-neighbour pattern is used. The red color represents burning trees and the gray color indicates ashes formed by the fire.
The ashes can be formed after some steps. These ashes are able to pollute the river as
shown in Figure 4.7.
4.3.2 River pollution diffusion federate
The model of pollution diffusion in the river was also presented in Chapter 2. Initially,
some polluted points are generated at random in the river. During the diffusion, the river
federate continually receives status data about the fire from the forest federate via the RTI.
It checks the data to determine whether the ashes will pollute some river cells. This depends
on a specific condition: for each river cell, if the distance to an ash cell is less than or
equal to a specified threshold, the pollution density of the river cell is updated in inverse
proportion to that distance.
The RTI sends these data to the river federate only when it receives an update call from the
forest federate. The update call is only issued when ashes are present within the scope of the
forest boundary.
Figure 4.7: A result obtained from the visualization federate. This demonstrates the exchange of data between the two simulations via the RTI. The two regions marked with red circles represent the new pollution created by the ashes, which are formed from the forest fire after 4 steps.
4.3.3 WSN federate
The model of the WSN was also introduced in Chapter 2. At every time step, the nodes receive
the data from the forest federate via the RTI and only consider cells within their sensing
range. If a node detects a fire, it forwards that information to an observer for decision
making. Signals are raised as soon as the fire is recognized. As depicted in Figure 4.8, the red
rings indicate that fires have been detected at those sensors.
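The range test performed by each sensor node can be sketched as follows. The distance metric is an assumption (Euclidean distance in cell units, compared in squared form), and the names in_sensing_range and detects_fire are hypothetical. The sensing range of 5 cells used in the example is the value given for the deployment in Figure 2.4.

```c
/* 1 if a burning cell at (fx, fy) lies within `range` cells of the sensor
   at (sx, sy). Assumption: Euclidean distance, compared squared so that
   no square root is needed. */
int in_sensing_range(int sx, int sy, int fx, int fy, int range)
{
    int dx = fx - sx;
    int dy = fy - sy;
    return dx * dx + dy * dy <= range * range;
}

/* 1 if at least one of the n reported fire cells is within range. */
int detects_fire(int sx, int sy, const int *fire_x, const int *fire_y,
                 int n, int range)
{
    for (int i = 0; i < n; i++) {
        if (in_sensing_range(sx, sy, fire_x[i], fire_y[i], range))
            return 1;
    }
    return 0;
}
```

A sensor at (0, 0) with a range of 5 cells would, for instance, detect a fire at (3, 4) but ignore one at (10, 10).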
4.3.4 Visualization federate
The viewer federate is based on the X Window System for 2D visualization. As mentioned
above, it first subscribes to all necessary data published by the other federates. The
aim is to provide an overview of the results, as shown in Figure 4.8. Initially, the background
of the viewer is drawn from the visible data extracted from the PickCell tool. During the
federation execution, this federate receives data from the others and updates the view at
every step.
4.3.5 A case study
This section describes a run of the federation. Initially, one federate creates a federation
on the RTI and waits for the other federates to join. Each further federate connects to that
federation and also waits until the last one has arrived. The first federate then sends a
request to the others to achieve a synchronization point. Once the other federates have
responded, the synchronization point is achieved and all federates follow the same time
progress. At each time step, these federates exchange data via the RTI. Figure 4.8 presents
the results captured from the visualization federate.
Figure 4.8: Illustrating an interoperability between the four federates via the RTI.
4.3.6 Simulation tools
Along with the PickCell tool, which is developed at the LabSTICC laboratory, the open source
software CERTI [18] was used in this project. The CERTI RTI supports the HLA 1.3 specification
(C++ and Java). The X Window System was used to display the results of the
simulation federates.
5
Conclusion
Our work focuses on modeling and simulating large and complex physical systems,
especially natural phenomena, which have recently emerged as a critical topic. We mainly
consider two problems. The first is the long running time of simulations of vast physical
systems. The other is the lack of interoperability between physical models, which actually
have several relations in reality. In addition, the complexity of modeling was also taken
into account.
5.1 Contributions
To address these problems, we first proposed a methodology for developing distributed
physical models based on the PickCell tool. This methodology makes it possible to conduct
physical simulations in terms of cell network systems. Next, we suggested a hybrid approach to
tackle the large size and complicated behaviour of physical processes. It enables several
parallel simulations to be distributed, running simultaneously and communicating with each
other.
The results of some experiments with parallel simulations were very encouraging. The parallel
computations on the GPU help to dramatically reduce the simulation time of large models.
The interoperability between several physical simulations in accordance with the HLA standard
was also implemented.
5.2 Future works
We are going to apply our approach to real cases. A proposal for simulating phenomena
in the Mekong Delta region (Vietnam) will be considered, where floods, pollution, and clouds
of insects are recurring problems.
Currently, a new version of the PickCell tool is under development. It has new features, such
as reading elevation data, which enables the generation of 3D data.
List of Figures
1.1 An example of Cyber-Physical System. . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Von Neumann and Moore neighbourhood (distance = 1). . . . . . . . . . . . . . 4
2.1 A cell network of a river system generated from PickCell tool with Von Neumann
1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 A summary of the proposed process which is used to conduct physical simulations. 9
2.3 The study region: A small area in Mekong Delta, the South of Vietnam. (data
source: OpenStreetMap [16]) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Deploying sensors along the forest border extracted from the study region with
the 4 neighbour pattern. The communication range and the sensing range are 25
and 5 cells units, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.5 A simple network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 A simplified motherboard architecture. . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Anatomy of a CUDA program. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 The mapping between the cell network structure and the GPU architecture. . . . 17
3.4 Data flow in the system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.5 Demonstrating the accelerating time of using the GPU for physical simulation. . 21
3.6 Illustrating a simulation of diffusing pollution in a river following the model
described in Section 2.2. It is initialized with two polluted points (black points). 22
3.7 The graph displays the increase of the gap between two CA patterns with 10,000
cycles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.8 Comparing the execution time between previous transition function (version 1)
and the new one (version 2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1 HLA Federation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Illustrating a high level of the interplay between a federate and a federation. . . . 28
4.3 A model of time advancement request is used in this project. . . . . . . . . . . . 30
4.4 Federate synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.5 A structure for a proposed federation. . . . . . . . . . . . . . . . . . . . . . . . . 33
4.6 An example of simulating fire spread in the forest. The 4-neighbour pattern
is used. The red color represents burning trees and the gray color represents ashes formed
by the fire. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.7 A result obtained from the visualization federate. It demonstrates the data
exchange between the two simulations via the RTI. The two regions marked with red
circles represent the new pollution created by the ashes, which are formed
from the forest fire after 4 steps. . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.8 Illustrating an interoperability between the four federates via the RTI. . . . . . . 36
List of Tables
2.1 A proposed organization of directions in a cell network. . . . . . . . . . . . . . . 6
2.2 The table presents a cell network structure of 601 cells generated by PickCell tool
(Von Neumann 1 CA). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 An example of route table at node 0 after 3 steps. . . . . . . . . . . . . . . . . . 13
3.1 Technical data of PC used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Technical data of NVidia graphics card used. . . . . . . . . . . . . . . . . . . . . 20
3.3 The computation comparison between the CPU and the GPU in the case of
pollution diffusion model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4 Measurements results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.1 Time management of the federation. . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.2 Objects and their attributes, publishers and subscribers. . . . . . . . . . . . . . . 32
Bibliography
[1] Teodora Sanislav, Liviu Miclea. "Cyber-Physical Systems - Concept, Challenges and Research
Areas," CEAI, Vol. 14, No. 2, pp. 28-33, 2012.
[2] Francesco Berto and Jacopo Tagliabue. "Cellular Automata," 26 March 2012.
http://plato.stanford.edu/entries/cellular-automata/.
[3] Stephen Wolfram. "Cellular Automata as Simple Self-Organizing Systems," Caltech preprint
CALT-68-938, July 1982.
[4] Robert M. Itami. "Simulating spatial dynamics: cellular automata theory," Landscape and Urban
Planning 30 (1994) 27-47.
[5] NVIDIA. "Graphics Processing Unit (GPU)." [online]. Available:
http://www.nvidia.com/object/gpu.html.
[6] CUDA Home Page. [online]. Available: http://developer.nvidia.com/object/cuda.html.
[7] Daniel C. Hyde. "Introduction to the programming language Occam," Bucknell University, 1995.
[8] Bernard Pottier, Pierre-Yves Lucas. "Dynamic networks – NetGen: objectives, installation, use, and
programming," Université de Bretagne Occidentale. August 26, 2014.
[9] Edward A. Lee. "CPS Foundations," Proc. of the 47th Design Automation Conference (DAC). ACM,
737-742, June 2010.
[10] Luís M. L. Oliveira, Joel J. P. C. Rodrigues. "Wireless Sensor Networks: a Survey on Environmental
Monitoring," Journal of Communications. Vol. 6, No. 2, April 2011.
[11] GeForce GTX 680. [online]. Available:
http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-680/specifications.
[12] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Framework
and Rules," IEEE Std 1516-2010, pp. 1-38, 2010.
[13] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Federate
Interface Specification," IEEE Std 1516.1-2010, pp. 1-378, 2010.
[14] "IEEE Standard for Modeling and Simulation (M&S) High Level Architecture (HLA) – Object Model
Template (OMT) Specification," IEEE Std 1516.2-2010, pp. 1-110, 2010.
[15] Common-pool Resources and Multi-Agent Simulations (CORMAS), CIRAD research center. [online].
Available: http://cormas.cirad.fr/
[16] The study region, Mekong Delta Region, South of Vietnam. [online]. Available:
https://www.openstreetmap.org/search?query=U%20Minh%2C%20ca%20mau%2C%20viet%20nam#map=12/9.5440/105.0918
[17] Profiling User's Guide. [online]. Available:
http://docs.nvidia.com/cuda/profiler-users-guide/#axzz3aPF6qywC.
[18] E. Noulard, J.-Y. Rousselot, and P. Siron. "CERTI, an open source RTI, why and how?" Spring
Simulation Interoperability Workshop, 2009.
[19] R. M. Fujimoto. "HLA time management: Design document," Georgia Tech College of Computing,
Tech. Rep., Aug. 1996.
[20] F. Kuhl, R. Weatherly, J. Dahmann. "Creating Computer Simulation Systems: An Introduction to
the High Level Architecture," Prentice Hall, 1999.
[21] C. Alasbey and G. Horgan. "Image analysis for the biological sciences," Edinburgh University,
February 1999. Chapter 5, pp. 1-13.
[22] G. Lasnier, J. Cardoso, P. Siron, C. Pagetti, and P. Derler. "Distributed simulation of
heterogeneous and real-time systems," in Distributed Simulation and Real Time Applications (DS-RT),
2013 IEEE/ACM 17th International Symposium on, Oct 2013, pp. 55–62.
[23] Juraj Cirbus, Michal Podhoranyi. "Cellular Automata for the Flow Simulations on the Earth
Surface, Optimization Computation Process," Applied Mathematics & Information Sciences 7, No.
6, 2149-2158, 2013.