These notes represent the lecture contents in Computational Methods in Nuclear Technology. Errors and inaccuracies are possible. I. Christoskov, January 2015
COMPUTATIONAL METHODS IN NUCLEAR TECHNOLOGY, 2014/15
1. Sets of homogeneous linear ordinary differential equations
   Example: Xenon poisoning
2. Fourier transform
   Definitions
   Properties
   Fourier transform of discretely sampled data
   Fast Fourier transform (FFT)
      FFT of real data
      FFT of functions of two or more variables
   Application: computed tomography
3. Eigenvalues and eigenvectors of a matrix
   Characteristic equation for the eigenvalues
   Search for isolated eigenvalues and eigenvectors
      Power iteration
      Inverse power iteration (Wielandt’s method)
   Jacobi transformations for the diagonalisation of symmetric matrices
   Householder reduction
      Eigenvalue problem for the reduced matrix
      The QR algorithm
   Application: Schrödinger equation
      Formulation
      Solving
      Example
4. Singular value decomposition
   Application: the least squares problem
      Linear model
      Non-linear model
      Solving the problem
      Example: Analysis of a gamma spectrum
5. Orthogonal polynomials. Approximation of functions. Gaussian quadrature
   Approximation of functions
      Linear least squares
   Example: Polynomial approximation of the Runge function
   Data smoothing
   Gaussian quadrature
6. Monte Carlo methods
   Generation of random deviates with a chosen probability distribution
      Uniform distribution
      Normal distribution
      Central limit theorem
      The transformation method for generating deviates with a specified probability distribution
         Sampling from the normal distribution (Box-Muller method)
      The rejection method for generating deviates with a specified probability distribution
   Assessing the quality of the sample
   Monte Carlo integration
      An example
   Monte Carlo for particle transport problems
      Variance reduction methods
   Application: integral form of the neutron transport equation
7. Partial differential equations
   von Neumann stability analysis
      Lax scheme
   Diffusion initial value problem
      Explicit scheme
      Implicit scheme
      Crank-Nicholson scheme
      Multidimensional case
   An example: The one-dimensional heat equation
   Application: the diffusion equation in nuclear reactor physics
      Example: one-dimensional two-group problem
Further reading
These notes represent the lecture contents in Computational Methods in Nuclear Technology. Errors and inaccuracies are possible. I. Christoskov, January 2015
4/150
1. Sets of homogeneous linear ordinary differential equations
Consider the set:
dy_1(x)/dx = a_11 y_1(x) + ... + a_1n y_n(x)
...
dy_n(x)/dx = a_n1 y_1(x) + ... + a_nn y_n(x) ,
or:
dy/dx = A y .    (1.1)
with an initial condition y(0) = y_0 and constant coefficients a_ij, i = 1,...,n, j = 1,...,n.
An example of large sets of the form (1.1) is given by the equations describing the evolution of the nuclide composition in materials subject to neutron irradiation, including the nuclide evolution of nuclear fuel.
The balance equation for the concentration N_i of the i-th nuclide (i = 1,...,N) in such a material is:

dN_i(t)/dt = Φ Σ_{j≠i} γ_{j→i} σ_j N_j(t) + Σ_{j≠i} f_{j→i} λ_j N_j(t) − (Φ σ_i + λ_i) N_i(t) ,    (1.2)
where Φ is the one-group scalar neutron flux, σ_j is a one-group neutron absorption cross-section, λ_j is a decay constant, γ_{j→i} is the yield of the i-th nuclide as a result of neutron absorption by the j-th nuclide (including the process of neutron-induced fission), and f_{j→i} is the yield of the i-th nuclide from spontaneous decay of the j-th nuclide.
Note:
Inhomogeneous equations and equations of higher order can be reduced to the form of (1.1). Let, for example, the equation be:

d²y(x)/dx² = a y(x) + b y'(x) + c x + d

with an initial condition:

y(x_0) = y_0 ;  y'(x_0) = y_0' .
The following dependent variables are introduced:

y_1(x) ≡ y(x) ;  y_2(x) ≡ y'(x) ;  y_3(x) ≡ x ;  y_4(x) ≡ 1

Thus, the considered equation is reformulated as a set:
dy_1(x)/dx = y_2(x)
dy_2(x)/dx = a y_1(x) + b y_2(x) + c y_3(x) + d y_4(x)
dy_3(x)/dx = y_4(x)
dy_4(x)/dx = 0
with initial conditions:
y_1(x_0) = y_0
y_2(x_0) = y_0'
y_3(x_0) = x_0
y_4(x_0) = 1
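As an illustration of this reduction, the four-component first-order set can be propagated by any standard ODE routine. In the sketch below the coefficients a, b, c, d and the initial data are arbitrary illustrative values, chosen so that the equation reduces to y'' = −y with solution cos x, and a classical 4th-order Runge-Kutta step is used:

```python
import numpy as np

# Illustrative coefficients for y'' = a*y + b*y' + c*x + d (arbitrary values;
# this choice reduces the equation to y'' = -y, with solution y = cos(x))
a, b, c, d = -1.0, 0.0, 0.0, 0.0

def rhs(x, y):
    """Right-hand side of the equivalent first-order set:
    y1 = y, y2 = y', y3 = x, y4 = 1."""
    return np.array([y[1],
                     a*y[0] + b*y[1] + c*y[2] + d*y[3],
                     y[3],
                     0.0])

# Classical 4th-order Runge-Kutta propagation from x0 = 0 to x = 1
y = np.array([1.0, 0.0, 0.0, 1.0])   # y(0)=1, y'(0)=0, y3=x0=0, y4=1
x, h = 0.0, 0.01
while x < 1.0 - 1e-12:
    k1 = rhs(x, y)
    k2 = rhs(x + h/2, y + h/2*k1)
    k3 = rhs(x + h/2, y + h/2*k2)
    k4 = rhs(x + h, y + h*k3)
    y = y + h/6*(k1 + 2*k2 + 2*k3 + k4)
    x += h

print(y[0], np.cos(1.0))   # y1(1) vs the exact value cos(1)
```

Note how the auxiliary components y_3 and y_4 carry the inhomogeneous terms c x + d without changing the homogeneous form of the set.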
Through direct substitution it can be verified that
d^n y/dx^n = A^n y .    (2)
Indeed:
d²y_i(x)/dx² = d/dx (dy_i(x)/dx) = d/dx Σ_j a_ij y_j = Σ_j a_ij dy_j(x)/dx = Σ_j a_ij Σ_k a_jk y_k = Σ_k [Σ_j a_ij a_jk] y_k = Σ_k (A²)_ik y_k ,

or:

d²y/dx² = A(A y) = A² y .
By setting B ≡ A², in an analogous fashion it is demonstrated that:

d/dx (d²y_i(x)/dx²) = d/dx Σ_k b_ik y_k = Σ_k b_ik Σ_j a_kj y_j = Σ_j [Σ_k b_ik a_kj] y_j = Σ_j (A³)_ij y_j ,

i.e. d³y/dx³ = A³ y , etc.
On the other hand, for each dependent variable the following representation can be employed (Taylor series expansion):

y_i(x) = y_i(0) + (dy_i/dx)(0) x + (1/2!) (d²y_i/dx²)(0) x² + ... + (1/k!) (d^k y_i/dx^k)(0) x^k + ... = Σ_{n=0..∞} (y_i^(n)(0)/n!) x^n    (3)
And indeed, if the function y_i = f(x) has n derivatives at the point x_0, then there exists a polynomial P_n(x) for which:

a) P_n(x_0) = f(x_0), P_n'(x_0) = f'(x_0), ..., P_n^(n)(x_0) = f^(n)(x_0), and    (4.1)

b) f(x) = P_n(x) + o((x − x_0)^n), x → x_0 .    (4.2)
Proof:
a) Let the polynomial be sought in the form:

P_n(x) = A_0 + A_1 (x − x_0) + ... + A_n (x − x_0)^n    (5)

Since from (5) it follows that P_n(x_0) = A_0, the first requirement in (4.1) gives A_0 = f(x_0). Further, since P_n'(x) = A_1 + 2 A_2 (x − x_0) + ... + n A_n (x − x_0)^(n−1), the second requirement in (4.1) gives A_1 = f'(x_0).

In the same fashion, since P_n''(x) = 2·1 A_2 + ... + n(n−1) A_n (x − x_0)^(n−2), then A_2 = f''(x_0)/2!. In the general case the result is A_k = f^(k)(x_0)/k!. Thus, in fulfilment of the conditions (4.1), the polynomial in (4.2) will have the form:

P_n(x) = f(x_0) + f'(x_0)(x − x_0) + ... + (f^(k)(x_0)/k!)(x − x_0)^k + ... + (f^(n)(x_0)/n!)(x − x_0)^n .
b) The next step is to confirm that this polynomial satisfies the relation (4.2). Let r_n(x) ≡ f(x) − P_n(x). From (4.1) it follows that r_n(x_0) = r_n'(x_0) = ... = r_n^(n)(x_0) = 0. Then, by applying L’Hôpital’s rule for resolving the indeterminate form r_n(x)/(x − x_0)^n at x → x_0, one obtains:

lim_{x→x_0} r_n(x)/(x − x_0)^n = lim_{x→x_0} r_n'(x)/(n (x − x_0)^(n−1)) = ... = lim_{x→x_0} r_n^(n−1)(x)/(n! (x − x_0)) = r_n^(n)(x_0)/n! = 0 ,

i.e. in reality r_n(x) = o((x − x_0)^n), x → x_0 .
L’Hôpital’s rule:
If lim_{x→a} f(x) = lim_{x→a} g(x) = 0 or lim_{x→a} f(x) = lim_{x→a} g(x) = ±∞, then

lim_{x→a} f(x)/g(x) = lim_{x→a} f'(x)/g'(x) .
Proof for the “0/0” case (a heuristic argument):

lim_{x→a} f'(x)/g'(x) = lim_{x→a} [lim_{h→0} (f(x+h) − f(x))/h] / [lim_{h→0} (g(x+h) − g(x))/h]
= lim_{x→a} lim_{h→0} (f(x+h) − f(x)) / (g(x+h) − g(x))
= lim_{h→0} (f(a+h) − f(a)) / (g(a+h) − g(a)) = lim_{h→0} f(a+h)/g(a+h) = lim_{x→a} f(x)/g(x) ,

where f(a) = g(a) = 0 has been used.
Through combining (2) with the joint formulation of (3) for all dependent variables, the following representation of the solution of the set (1.1) is obtained:

y(x) = y(0) + x (dy/dx)(0) + (1/2!) x² (d²y/dx²)(0) + ...
= [1 + A x + (1/2!) A² x² + ... + (1/k!) A^k x^k + ...] y(0)
= Σ_{n=0..∞} (A^n x^n / n!) y(0)    (6)
On the other hand, for the scalar case dy(x)/dx = a y it can be easily verified that:

y(x) = y(0) Σ_{n=0..∞} (a^n x^n / n!) = y(0) exp(a x)    (7)
Thus, by analogy to (7), the solution of the set (1.1) is concisely denoted as (matrix exponential):

y(x) = [exp(A x)] y(0) , where exp(A x) ≡ Σ_{n=0..∞} A^n x^n / n! .    (8)
In the particular case of decoupled equations:

dy_i(x)/dx = a_ii y_i(x) , i = 1,...,n    (9)

the coefficient matrix is diagonal, A = diag(a_ii), and the equation solutions are y_i(x) = y_i(0) exp(a_ii x), i = 1,...,n. Or, in short notation:

y(x) = diag(exp(a_ii x)) y(0)    (10)

Since for this diagonal matrix A^k = diag(a_ii^k), through comparison between (3), (7), (6) and (8) it can be directly seen that:
exp(diag(a_ii) x) = diag(exp(a_ii x)) ,    (11)

with the same definition of the matrix exponential exp(A x) as in (8).

Based on that, the following procedure for evaluating exp(A x) with a general matrix A can be applied:

a) diagonalisation of A (cf. Topic 3), i.e. finding a matrix Z and a diagonal matrix D, such that Z⁻¹ A Z = D, and respectively A = Z D Z⁻¹;

b) evaluation of exp(A x) as:

exp(A x) = Z exp(D x) Z⁻¹ = Z diag(exp(d_ii x)) Z⁻¹ .    (12)
The last equality can be proved by accounting that the application of the defining expression (8) for exp(A x) requires computing the powers A^k x^k, and

A² = Z D Z⁻¹ Z D Z⁻¹ = Z D² Z⁻¹ , ... , A^k = Z D^k Z⁻¹ .    (13)

The above approach is, however, restricted to problems, i.e. matrices A, of comparatively small dimensions.
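A minimal sketch of this diagonalisation route, using numpy’s general eigensolver (the test matrices below are illustrative, and A is assumed diagonalisable):

```python
import numpy as np

def expm_eig(A, x):
    """exp(A x) via diagonalisation A = Z D Z^-1, as in eq. (12).
    Assumes A is diagonalisable."""
    d, Z = np.linalg.eig(A)                    # columns of Z are eigenvectors
    return (Z * np.exp(d * x)) @ np.linalg.inv(Z)   # Z diag(exp(d_ii x)) Z^-1

# Check 1: a decoupled (diagonal) case, eq. (11)
A = np.diag([-1.0, -2.0])
E = expm_eig(A, 0.5)        # should be diag(exp(-0.5), exp(-1.0))

# Check 2: a non-diagonal case; exp(B*pi) with B a rotation generator is -1
B = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.real(expm_eig(B, np.pi))
print(np.real(E), R, sep="\n")
```

The broadcast `Z * np.exp(d * x)` multiplies the j-th column of Z by exp(d_j x), i.e. it forms Z diag(exp(d_ii x)) without building the diagonal matrix explicitly.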
In the general case a method which directly follows from the defining expression (8) is
preferred. The practical procedure will be outlined as follows.
Thus, after returning to the series (6) and introducing the vector c_0 ≡ y(0), it can be directly confirmed that the successive terms in this series will have the form:

c_1 = (x/1) A c_0 ;  c_2 = (x/2) A c_1 ;  c_3 = (x/3) A c_2 ;  ... ;  c_{k+1} = (x/(k+1)) A c_k    (14)

The solution of the set (1.1) is accumulated as a sum:

y = Σ_{k=0..∞} c_k    (15)
The computational algorithm is as follows:
• initialise c_i^(0) = y_i(0) and y_i^(0)(x) = c_i^(0), i = 1,...,n

• for k = 1, 2, ...: evaluate c_i^(k) = (x/k) Σ_{j=1..n} a_ij c_j^(k−1) and update the partial sum y_i^(k)(x) = y_i^(k−1)(x) + c_i^(k). If ‖c^(k)‖ / ‖y^(k)(x)‖ ≤ δ, terminate the iteration on k.
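The algorithm above can be sketched as follows (a straightforward implementation; the step x, the threshold δ and the two-nuclide test matrix are illustrative):

```python
import numpy as np

def propagate(A, y0, x, delta=1e-10, kmax=200):
    """Advance y' = A y by one step x by summing the Taylor series (6)
    with the recursion (14) and the criterion ||c_k|| / ||y|| <= delta."""
    c = y0.astype(float).copy()
    y = c.copy()
    for k in range(1, kmax):
        c = (x / k) * (A @ c)          # c_k = (x/k) A c_{k-1}
        y += c
        if np.abs(c).sum() <= delta * np.abs(y).sum():
            break
    return y

# Illustrative decay chain A -> B with lambda_A = 1, lambda_B = 10
A = np.array([[-1.0,   0.0],
              [ 1.0, -10.0]])
y = propagate(A, np.array([1.0, 0.0]), 0.1)
print(y)   # N_A = exp(-0.1); N_B follows the Bateman solution
```

For this chain the exact solution is N_A(t) = e^(−t) and N_B(t) = (e^(−t) − e^(−10t))/9, which the series sum reproduces to within the chosen threshold.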
The convergence of iterations is in principle guaranteed by property (4.2) of the Taylor
series. Nevertheless, in order to reduce the computational effort and avoid the accumulation of
excessive roundoff errors in the process of summing the series, it is strongly desirable to
minimise the number of iteration steps.
The chosen termination criterion is without a practical alternative, and its fulfilment is guaranteed by property (4.2), provided that the solution propagation step x is sufficiently small. Actually, however, this criterion imposes on the series the more general condition that ‖A^k x^k y_0 / k!‖ decreases monotonically with increasing k. The latter, in its turn, can be ensured if the matrix norm ‖A x‖ = ‖A‖ x is reduced below a given limit. While an obvious way to achieve this aim is to choose a sufficiently small propagation step x, with a large norm ‖A‖ such an approach to solving the set (1.1) would be rendered impracticable.

A more realistic strategy is to reformulate the problem so that the norm ‖A‖ is reduced to a level which is appropriate for the desired propagation step x.
A suitable matrix norm, corresponding to the vector norm ‖c^(k)‖ ≡ Σ_{i=1..n} |c_i^(k)|, is ‖A‖ = max_j Σ_{i=1..n} |a_ij|. For the matrix generated by problem (1.2) this is the quantity 2 max_j (Φ σ_j + λ_j), and the physical explanation of this result is the fact that the rate of depletion of a given nuclide is equal to the sum of the rates of production of all its daughter nuclides. Therefore, a way to reduce the norm ‖A‖ in the considered case is to exclude the nuclides with the highest depletion rates (i.e. the highest effective decay constants) from the set (1.2). The exclusion is effected e.g. through replacing a chain of the form A → B → C by a chain A → C, where B is the short-lived nuclide subject to exclusion. After solving the reduced set of equations and finding the concentrations of the precursors of these excluded nuclides, the balance equations for the latter can easily be solved analytically.
For example, if nuclide B is much shorter-lived than nuclide A, i.e. λ_B >> λ_A, then after a relatively short period of time (several mean lifetimes of nuclide B) its concentration reaches a so-called secular equilibrium with equated rates of production and depletion, i.e.

λ*_{A→B} N_A(t) = λ*_B N_B(t) ,

where λ* are the above-mentioned effective decay constants. Thus, with a known concentration N_A(t), the sought concentration of nuclide B is

N_B(t) = (λ*_{A→B} / λ*_B) N_A(t) .
In the more general case the balance equation for the nuclide B is:

dN_B(t)/dt = λ*_{A→B} N_A(t) − λ*_B N_B(t)

with a solution

N_B(t) = N_B(0) exp(−λ*_B t) + λ*_{A→B} ∫_0^t exp(−λ*_B (t − t')) N_A(t') dt' .

With a known initial condition and a known concentration N_A(t), the sought concentration N_B(t) can be obtained e.g. through numerical integration.
In particular, if in the expression for N_B(t) one can assume that N_A(t) ≅ const, it will take the form

N_B(t) = (λ*_{A→B} / λ*_B) (1 − exp(−λ*_B t)) N_A

and after several mean lifetimes of the nuclide B its concentration will approach the above-mentioned secular equilibrium level.
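A small numerical check of this expression (the effective decay constants and N_A below are arbitrary illustrative values, with N_A held constant):

```python
import numpy as np

# Illustrative constants: B much shorter-lived than A (arbitrary units)
lam_AB, lam_B = 0.02, 5.0        # lambda*_{A->B} and lambda*_B
NA = 1.0e3                       # N_A assumed ~constant on this time scale

def NB(t):
    """N_B(t) for constant N_A and zero initial N_B."""
    return lam_AB / lam_B * (1.0 - np.exp(-lam_B * t)) * NA

# After several mean lifetimes of B the secular level lam_AB/lam_B * NA
# is reached; after one mean lifetime about 63% of it
print(NB(1.0 / lam_B), NB(10.0 / lam_B), lam_AB / lam_B * NA)
```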
The exclusion criterion can be chosen empirically. For example, it can turn out that with a chosen propagation step x the series (6) will converge within a reasonably small number of iteration steps if x does not exceed 5 effective half-lives of the shortest-lived nuclide in the system.
Example: Xenon poisoning
The 135Xe nuclide is distinguished by an exceptionally large thermal neutron capture cross-section. Its accumulation in nuclear reactor fuel leads to a significant deterioration of the multiplying properties of the reactor medium. For this reason the problem of modelling the evolution of the 135Xe concentration is of special importance in reactor physics.
The production and depletion chain of 135Xe is:

fission →(γ = 0.061) 135Te →(β−, T1/2 = 19 s) 135I →(β−, T1/2 = 6.57 h) 135Xe →(β−, T1/2 = 9.14 h) 135Cs →(β−, T1/2 ≈ 2.3×10⁶ y) 135Ba (stable)

In addition, 135Xe has a direct fission yield of about 0.003 (partly via isomeric transition), and it is removed by neutron capture, 135Xe + n → 136Xe, with σ ≈ 2.7×10⁶ b.
In a simplified form (accounting only for the fission of 235U and omitting 135Te from the chain of nuclear transitions) the balance equations for the concentrations of 235U, 135I and 135Xe are:
dU(t)/dt = −σ_a5 Φ U(t)
dI(t)/dt = γ_I σ_f5 Φ U(t) − λ_I I(t)
dXe(t)/dt = γ_Xe σ_f5 Φ U(t) + λ_I I(t) − [σ_a^Xe Φ + λ_Xe] Xe(t)    (1)
A set of exemplary values of the coefficients in (1) for a WWER-1000 at rated power can be produced as follows.

From the relation

σ_f5 [cm²] × Φ [cm⁻²·s⁻¹] × E_f [J] × N_5 [#] = P [W]    (2)

with known

σ_f5 = 337.73 b = 3.3773×10⁻²² cm²
E_f = 200 MeV = 3.204×10⁻¹¹ J    (3)
N_5 = m_5 N_A / A = 35300 g/tHM × 6.022×10²³ / 235 = 9.046×10²⁵ per tHM
P = 50 MW/tHM
for the one-group scalar flux one can obtain:

Φ = 5.11×10¹³ cm⁻²·s⁻¹    (4)
With

σ_a5 = 416.1 b ;  γ_{Te+I} = 0.0615 ;  γ_Xe = 7.04×10⁻⁴ ;
σ_a^Xe = 1.516×10⁶ b ;  λ_Xe = 2.1068×10⁻⁵ s⁻¹ ;  λ_I = 2.9309×10⁻⁵ s⁻¹    (5)
the final form of the balance equations is:

dU(t)/dt = −2.13×10⁻⁸ U(t)
dI(t)/dt = 1.06×10⁻⁹ U(t) − 2.93×10⁻⁵ I(t)
dXe(t)/dt = 1.22×10⁻¹¹ U(t) + 2.93×10⁻⁵ I(t) − [7.74×10⁻⁵ + 2.11×10⁻⁵] Xe(t)    (6)
Or:

dN_1(t)/dt = −2.13×10⁻⁸ N_1(t)
dN_2(t)/dt = 1.06×10⁻⁹ N_1(t) − 2.93×10⁻⁵ N_2(t)
dN_3(t)/dt = 1.22×10⁻¹¹ N_1(t) + 2.93×10⁻⁵ N_2(t) − 9.85×10⁻⁵ N_3(t)    (7)

N_1(0) = 9.046×10²⁵ ;  N_2(0) = 0 ;  N_3(0) = 0 .
The analytical solution of these equations is:

N_1(t) = N_1(0) exp(a_11 t)

N_2(t) = N_2(0) exp(a_22 t) + N_1(0) (a_21/(a_11 − a_22)) (exp(a_11 t) − exp(a_22 t))    (8)

N_3(t) = N_3(0) exp(a_33 t)
 + N_1(0) (a_31/(a_11 − a_33)) (exp(a_11 t) − exp(a_33 t))
 + N_2(0) (a_32/(a_22 − a_33)) (exp(a_22 t) − exp(a_33 t))
 + N_1(0) (a_21 a_32/(a_11 − a_22)) [ (exp(a_11 t) − exp(a_33 t))/(a_11 − a_33) − (exp(a_22 t) − exp(a_33 t))/(a_22 − a_33) ]
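A sketch evaluating formulas (8) with the coefficient values of (7) and zero initial 135I and 135Xe (only the 135Xe concentration is computed; the coefficients are transcribed from (7) above):

```python
import numpy as np

# Coefficients of set (7); N1 = U-235, N2 = I-135, N3 = Xe-135 (per tHM)
a11, a21, a22 = -2.13e-8, 1.06e-9, -2.93e-5
a31, a32, a33 = 1.22e-11, 2.93e-5, -9.85e-5
N10 = 9.046e25          # N1(0); N2(0) = N3(0) = 0

def N3(t):
    """Analytical 135Xe concentration from (8), with zero initial I and Xe."""
    e1, e2, e3 = np.exp(a11*t), np.exp(a22*t), np.exp(a33*t)
    term_dir = N10 * a31 * (e1 - e3) / (a11 - a33)        # direct fission yield
    term_via = N10 * a21 * a32 / (a11 - a22) * (          # production via I-135
        (e1 - e3) / (a11 - a33) - (e2 - e3) / (a22 - a33))
    return term_dir + term_via

for hour in (1, 10, 30):
    print(hour, N3(3600.0 * hour))   # quasi-equilibrium is approached by ~30 h
```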
Let, with these constants, the problem of tabulating the 135Xe concentration with a time step of one hour be solved up to 30 hours after starting the reactor at full power (zero initial conditions for all nuclides except 235U). The matrix norm is ‖A‖ = 1.497×10⁻⁴ s⁻¹, and the application of the above-mentioned empirical relation for the limiting time step gives a value of x_max = 35185 s ≈ 9.8 h, i.e. much higher than the desired tabulation step. With a relative error threshold δ = 1×10⁻⁴, the maximum number of Taylor expansion terms is 6 (in the beginning of the transient process), whereas the typical number is 3-4.
After including 135Te in the set of balance equations, the matrix norm becomes ‖A‖ = 7.23×10⁻² s⁻¹, so that the limiting time step decreases to x_max ≈ 95 s. With the same relative error threshold, the maximum number of Taylor expansion terms is 15 (in the beginning of the transient process), and the typical number is 6. Since a single tabulation step is completed via a considerable number of intermediate steps, and each intermediate step requires a separate Taylor expansion, the number of matrix-vector multiplications per tabulation step is typically about 220, as compared with approximately 3 in the previous case. The accuracy of the solution for the 135Xe concentration in both cases is equally good, although in the second case the risk of excessive accumulation of roundoff errors is in principle higher.
[Figure: q_Xe (0.000-0.040) versus t, h (0-30); two curves: without Te-135 and with Te-135]

Figure 1. Xenon poisoning (relative neutron absorption rate in Xe-135) without and with the inclusion of Te-135 in the set of balance equations.
2. Fourier transform
Definitions
Let h(t) be a function of the independent variable t ∈ (−∞, ∞) (e.g. time).

A Fourier image of this function is the following function H(f) of the independent variable f ∈ (−∞, ∞) (with the meaning of frequency, if t has the meaning of time):

H(f) = ∫_{−∞}^{∞} h(t) exp(2πift) dt .    (1)

With a known image H(f), the original h(t) can be restored through the inverse transform:

h(t) = ∫_{−∞}^{∞} H(f) exp(−2πift) df .    (2)

(Actually, statement (2) needs substantiation, and this will be done below.)

Or, with ω ≡ 2πf (ω is angular frequency, if f is (linear) frequency):

H(ω) = ∫_{−∞}^{∞} h(t) exp(iωt) dt and h(t) = (1/2π) ∫_{−∞}^{∞} H(ω) exp(−iωt) dω .
The original can be either a real or a complex function of a real independent variable. As seen from (1), the image of a real function is in the general case complex.

Let, for example, h(t) = C exp(−2πif_0 t), i.e. the original is a single-frequency harmonic oscillation.

The application of (1) leads to:

H(f) = ∫_{−∞}^{∞} h(t) exp(2πift) dt = C ∫_{−∞}^{∞} exp(2πi(f − f_0)t) dt = { C × 0, f ≠ f_0 ; C × ∞, f = f_0 }

Let the integral on the right be denoted by

δ(f − f_0) ≡ ∫_{−∞}^{∞} exp(2πi(f − f_0)t) dt ,    (i)

which is a function of f with the following property:
δ(f − f_0) = { 0, f ≠ f_0 ; ∞, f = f_0 }    (ii)

The inverse transform (2) for this image H(f) = C δ(f − f_0) must restore the original, i.e. must satisfy the equality:

C exp(−2πif_0 t) = ∫_{−∞}^{∞} C δ(f − f_0) exp(−2πift) df .    (iii)

In particular, if t = 0, then:

C = C ∫_{−∞}^{∞} δ(f − f_0) df .    (iv)
A function with the property (ii) and the additional properties required by (iii) and (iv), i.e. more generally:

• δ(x − x_0) = { 0, x ≠ x_0 ; ∞, x = x_0 }

• lim_{a→∞} ∫_{x_0−a}^{x_0+a} δ(x − x_0) dx = 1

• lim_{a→∞} ∫_{x_0−a}^{x_0+a} f(x) δ(x − x_0) dx = f(x_0) ,

is known as the Dirac delta function.
It should be explicitly noted that the above considerations do not prove that (i) has all the properties of the Dirac delta function (only property (ii) is shown). It is instead only demonstrated that if (i) has these properties, then the Fourier transform will be invertible, i.e. statement (2) will be true.

Actually, expression (i) is one of the valid representations of the Dirac δ-function. Thus, based on (i), the general statement (2) for the invertibility of the Fourier transform can be corroborated:
∫_{−∞}^{∞} H(f) exp(−2πift) df = ∫_{−∞}^{∞} df exp(−2πift) ∫_{−∞}^{∞} dt' h(t') exp(2πift')
= ∫_{−∞}^{∞} dt' h(t') ∫_{−∞}^{∞} df exp(2πif(t' − t))
= ∫_{−∞}^{∞} dt' h(t') δ(t' − t) = h(t) .    (v)
Properties
a) symmetries (“*” denotes complex conjugate)

original h(t) → image H(f):
• real → H(−f) = H(f)*
• imaginary → H(−f) = −H(f)*
• even → H(−f) = H(f), i.e. even
• odd → H(−f) = −H(f), i.e. odd
• real and even → real and even
• real and odd → imaginary and odd
• imaginary and even → imaginary and even
• imaginary and odd → real and odd

b) scaling and shifting (“⇔” denotes the bi-unique correspondence between original and image)

h(at) ⇔ (1/|a|) H(f/a)
(1/|b|) h(t/b) ⇔ H(bf)
h(t − t_0) ⇔ H(f) exp(2πift_0)
h(t) exp(−2πif_0 t) ⇔ H(f − f_0)

c) convolution: g*h ≡ ∫_{−∞}^{∞} g(τ) h(t − τ) dτ

g*h ⇔ G(f) H(f)
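The convolution theorem can be checked numerically on sampled data. (numpy.fft uses the opposite sign convention to (1), but the theorem holds for either convention; for finite sequences the convolution is circular.)

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal(64)
h = rng.standard_normal(64)

# Circular convolution computed directly from the definition...
direct = np.array([sum(g[m] * h[(n - m) % 64] for m in range(64))
                   for n in range(64)])

# ...and via the product of the Fourier images
via_fft = np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(h)))

print(np.max(np.abs(direct - via_fft)))   # round-off level: the theorem holds
```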
d) correlation: corr(g, h) ≡ ∫_{−∞}^{∞} g(t + τ) h(τ) dτ

corr(g, h) ⇔ G(f) H(f)*, if g and h are real functions. For the particular case of autocorrelation: corr(g, g) ⇔ |G(f)|²

e) total power in a signal, P_t:

P_t ≡ ∫_{−∞}^{∞} |h(t)|² dt = ∫_{−∞}^{∞} |H(f)|² df

f) differentiation (with the sign convention of (1)):

dh(t)/dt ⇔ −2πif H(f)

g) Dirac delta function:

1 ⇔ δ(f) , δ(t) ⇔ 1
Fourier transform of discretely sampled data
Let the original be represented through a sequence of function values at equidistant values of the independent variable, t_n = nΔ, n = ..., −3, −2, −1, 0, 1, 2, 3, ..., where Δ ≡ Δt is the independent-variable tabulation step, i.e. let there exist the sequence:

h_n ≡ h(nΔ) , n = ..., −3, −2, −1, 0, 1, 2, 3, ...    (3)

Nyquist-Shannon sampling theorem

“Let

f_c ≡ 1/(2Δ) .    (4)

(this is the so-called Nyquist critical frequency)

If the continuous function h(t), the values of which are sampled at an interval Δ, is bandwidth limited to frequencies smaller in magnitude than the critical frequency f_c, i.e. H(f) = 0 for |f| ≥ f_c, then this function is completely determined by its sampled values h_n:

h(t) = Δ Σ_{n=−∞..∞} h_n sin(2πf_c(t − nΔ)) / (π(t − nΔ)) ”    (5)
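Formula (5) can be tried numerically on a band-limited original. (The infinite sum must be truncated, so the agreement is limited by the truncation; the sampling step, signal frequency and test point below are illustrative.)

```python
import numpy as np

dt = 0.1                      # sampling interval Delta
fc = 1.0 / (2.0 * dt)         # Nyquist critical frequency, eq. (4)
f0 = 2.0                      # signal frequency, safely below fc = 5

def h(t):
    return np.cos(2.0 * np.pi * f0 * t)

# Truncated version of the infinite sum (5). Since 2*fc*dt = 1, each term
# Delta*sin(2 pi fc (t - n dt)) / (pi (t - n dt)) equals sinc((t - n dt)/dt),
# with numpy's sinc(u) = sin(pi u)/(pi u).
n = np.arange(-2000, 2001)
t = 0.537                     # arbitrary test point well inside the range
rec = np.sum(h(n * dt) * np.sinc((t - n * dt) / dt))
print(rec, h(t))              # reconstruction close to the original value
```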
Let h(t) be represented by the finite sequence of sampled values:

h_k = h(t_k) , t_k = kΔ , k = 0,...,N−1 .    (6)

If the sampling interval Δ satisfies the requirements of the sampling theorem, and if h(t) = 0 for t < t_0 and t > t_{N−1}, or h(t) is a periodic function and the interval [t_0, t_{N−1}] contains one of its periods, then the sequence (6) will carry the entire information content of h(t). Let, for further simplicity, N be even.
The sequence of original function values (6) can be used for finding a sequence of frequency amplitudes H(f_n). In principle their number can be arbitrary, but since the Fourier transform is a linear operation, only N of them can be mutually independent. For a representative description of the spectrum they must span the frequency range [−f_c, +f_c], in general uniformly. These requirements are met by the set

H(f_n) , f_n = n/(NΔ) , n = −N/2, ..., +N/2 ,    (7)

representing the image H(f) in the frequency range [−f_c, +f_c].
The set (7) is computed through approximating the integral (1) by the sum:

H(f_n) ≡ ∫_{−∞}^{∞} h(t) exp(2πif_n t) dt ≈ Σ_{k=0..N−1} h_k exp(2πif_n t_k) Δ = Δ Σ_{k=0..N−1} h_k exp(2πikn/N) ≡ Δ × H_n , n = −N/2, ..., +N/2    (8)
The sequence H_n is periodic in n with a period N: H_{−n} = H_{N−n}, n = 1, 2, ....

Because of that, H_n is usually indexed in n from 0 to N−1. Thus, n = 0 corresponds to zero frequency, 1 ≤ n ≤ N/2 − 1 corresponds to 0 < f < f_c, N/2 + 1 ≤ n ≤ N − 1 to −f_c < f < 0, and n = N/2 to f = ±f_c.
The inverse discrete transform is computed in an analogous way:

h_k = (1/N) Σ_{n=0..N−1} H_n exp(−2πikn/N)    (9)
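The pair (8)-(9) can be written down directly as an O(N²) matrix-vector product (a sketch; numpy.fft implements the same sums with the opposite sign in the exponent):

```python
import numpy as np

def dft(h):
    """Direct O(N^2) evaluation of (8): H_n = sum_k h_k exp(+2 pi i k n / N)."""
    N = len(h)
    k = np.arange(N)
    return np.exp(2j * np.pi * np.outer(k, k) / N) @ h

def idft(H):
    """Inverse transform (9): h_k = (1/N) sum_n H_n exp(-2 pi i k n / N)."""
    N = len(H)
    k = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(k, k) / N) @ H / N

h = np.random.default_rng(1).standard_normal(16)
print(np.max(np.abs(idft(dft(h)) - h)))   # round-trip error at round-off level
```

As stated above, although (8) only approximates the integral (1), the pair (8)-(9) recovers the sampled sequence exactly (to round-off).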
Here it is important to note that although expression (8) for the frequency amplitudes is approximate, their employment in the inverse transform (9) leads to an exact recovery of the sequence of original function values.

Indeed, let

W ≡ exp(2πi/N) .    (10)

Then (8) takes the form:

H_n = Σ_{k=0..N−1} W^{nk} h_k , or H = W h ,    (11)

where (W)_{nk} ≡ W^{nk}, and (9) translates into:

h = (1/N) W⁺ H    (11a)
From (11a) it becomes evident that in order to fulfil the transform invertibility requirement, the matrix W must be unitary up to a scaling factor N: W⁺W = N·1. By making use of the definition (10), this can be directly verified. Thus the statement about the exact restoration of the original, made in conjunction with (9), is also corroborated. Moreover, through the limit approach Δ → 0, and therefore N → ∞, the transform invertibility is proved in the continuous case (2) as well, without resorting to the representation (i) of the Dirac δ-function.
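The property W⁺W = N·1 can be verified directly from definition (10), e.g. for N = 8:

```python
import numpy as np

# The matrix (W)_{nk} = W^{nk} of eq. (11), here for N = 8
N = 8
k = np.arange(N)
W = np.exp(2j * np.pi * np.outer(k, k) / N)
print(np.max(np.abs(W.conj().T @ W - N * np.eye(N))))   # ~0, i.e. W+ W = N 1
```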
Fast Fourier transform (FFT)
Expression (11) would imply that N² linear operations on complex numbers are required for computing the discrete Fourier transform.

There exist, however, algorithms like the FFT (Fast Fourier Transform; Danielson and Lanczos, 1942) through which the required number of operations is reduced to O(N log₂ N). The approach is described below.
The discrete transform of a sequence of N function values can be represented as a sum of two transforms, computed separately for the even- and odd-indexed data subsets, each of length N/2:
H_n = Σ_{k=0..N−1} h_k exp(2πikn/N)
 = Σ_{k=0..N/2−1} h_{2k} exp(2πi(2k)n/N) + Σ_{k=0..N/2−1} h_{2k+1} exp(2πi(2k+1)n/N)
 = Σ_{k=0..N/2−1} h_{2k} exp(2πikn/(N/2)) + W^n Σ_{k=0..N/2−1} h_{2k+1} exp(2πikn/(N/2))
 = H_n^0 + W^n H_n^1 , n = 0, ..., N−1    (12)
Here it is important to note that the Fourier images $H_n^0$ and $H_n^1$ are periodic in $n$ with a period $N/2$, and also that the subdivision (12) can be applied recursively. Thus, for example, $H_n^0$ can be subdivided into an even and an odd component, $H_n^{00}$ and $H_n^{01}$, each with a period $N/4$:
$$
\begin{aligned}
H_n^{0} &= \sum_{k=0}^{N/2-1} \exp\left(\frac{2\pi i k n}{N/2}\right) h_{2k} \\
&= \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i (2k) n}{N/2}\right) h_{2(2k)} + \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i (2k+1) n}{N/2}\right) h_{2(2k+1)} \\
&= \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i k n}{N/4}\right) h_{4k} + W^{2n} \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i k n}{N/4}\right) h_{4k+2} \\
&= H_n^{00} + W^{2n} H_n^{01}, \qquad n = 0, \ldots, N-1, \qquad (12.a)
\end{aligned}
$$
and $H_n^1$ can also be subdivided into an even and an odd component, $H_n^{10}$ and $H_n^{11}$, each with a period $N/4$:
$$
\begin{aligned}
W^{n} H_n^{1} &= W^{n} \sum_{k=0}^{N/2-1} \exp\left(\frac{2\pi i k n}{N/2}\right) h_{2k+1} \\
&= W^{n} \left[ \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i k n}{N/4}\right) h_{4k+1} + W^{2n} \sum_{k=0}^{N/4-1} \exp\left(\frac{2\pi i k n}{N/4}\right) h_{4k+3} \right] \\
&= W^{n} H_n^{10} + W^{3n} H_n^{11}, \qquad n = 0, \ldots, N-1. \qquad (12.b)
\end{aligned}
$$
In this fashion, if $N$ is an integer power of 2, the recursion can proceed down to components with a period $N/N = 1$, i.e.:
$$H_n^{0100101\ldots} = h_k \quad \text{for some value of } k. \qquad (13)$$
(And indeed, if $H_n^{0100101\ldots}$ is periodic in $n$ with a period 1, it does not depend on $n$.)
The correspondence between $k$ and the Fourier image component index, i.e. the superscript sequence '0100101...', can be established through the following observation.
If the leftmost superscript of $H_n$ is '0', i.e. $H^{0\ldots}$, then $k$ is even, i.e. the least significant (rightmost) bit in the binary representation of $k$ is 0: $k = [\ldots 0]$. Conversely, if the leftmost superscript is '1', i.e. $H^{1\ldots}$, then $k$ is odd, i.e. the least significant (rightmost) bit in the binary representation of $k$ is 1: $k = [\ldots 1]$.
Further, if the second symbol of the superscript of $H_n$ is '0', i.e. $H^{x0\ldots}$, then the position of $h_k$ in the new (half-length) list is even, i.e. the second least significant bit of $k$ is 0: $k = [\ldots 0x]$. Similarly, if the second symbol is '1', i.e. $H^{x1\ldots}$, then the second least significant bit of $k$ is 1: $k = [\ldots 1x]$.
In other words, the successive subdivision of the data into even- and odd-numbered subsets is equivalent to successive testing, from right to left, of the bits in the binary representation of $k$. In particular, the binary record of the index $k$ in expression (13) will be ...1010010, i.e. the superscript sequence read in reverse.
Based on (12) and (13), the fast Fourier transform algorithm (the Cooley-Tukey algorithm) is as follows (the example is for N = 8):
− Rearrangement of the array of $h_k$ according to the bit-reversal rule: $h_0\,(h_{000}) \leftrightarrow h_0\,(h_{000})$, $h_1\,(h_{001}) \leftrightarrow h_4\,(h_{100})$, $h_2\,(h_{010}) \leftrightarrow h_2\,(h_{010})$, $h_3\,(h_{011}) \leftrightarrow h_6\,(h_{110})$, .... The rearranged array will contain in successive positions pairs of single-component Fourier images $(H^{xx0}, H^{xx1})$.
− Combining each pair of single-component Fourier images $(H^{xx0}, H^{xx1})$ according to (12) and writing the two values of the resultant two-component image in the same pair of adjacent array locations (the two-component images $H^{x0}$ and $H^{x1}$ have a period $N/4 = 2$ in $n$). The expressions are $H_n^{00} = H_n^{000} + W^{4n} H_n^{001}$, ..., $H_n^{11} = H_n^{110} + W^{4n} H_n^{111}$, $n = 0, 1$. The values of $W^{4n}$ also have a period $N/4 = 2$: $W^0 = 1$, $W^4 = -1$.
− Combining each pair of two-component images $(H^{x0}, H^{x1})$ according to (12) and writing the four values of the resultant four-component image in the four adjacent array locations previously occupied by the two-component images ($H^0$ and $H^1$ have a period $N/2 = 4$ in $n$). The expressions are $H_n^{0} = H_n^{00} + W^{2n} H_n^{01}$ and $H_n^{1} = H_n^{10} + W^{2n} H_n^{11}$, $n = 0, 1, 2, 3$. The values of $W^{2n}$ also have a period $N/2 = 4$: $W^0 = 1$, $W^2 = i$, $W^4 = -1$, $W^6 = -i$, and they alternate in sign with half that period, i.e. 2.
These notes represent the lecture contents in Computational Methods in Nuclear Technology. Errors and inaccuracies are possible. I. Christoskov, January 2015
22/150
The in-place array update proceeds as follows: a) read the values of $H_0^{00}$ and $H_0^{01}$ from array positions 1 and 3; b) compute $H_0^{0} = H_0^{00} + H_0^{01}$ and $H_2^{0} = H_0^{00} - H_0^{01}$, and write these in array positions 1 and 3; then proceed similarly with $H_1^{00}$ and $H_1^{01}$ from positions 2 and 4 in order to produce $H_1^{0}$ and $H_3^{0}$ in the same positions, i.e. 2 and 4, etc.
− Finally, combining the two four-component images $(H^0, H^1)$ according to (12) and writing the result $H_n$, $n = 0, \ldots, N-1$, in the eight adjacent array locations previously occupied by the four-component images. The expression is $H_n = H_n^{0} + W^{n} H_n^{1}$, $n = 0, \ldots, 7$. The values of $W^n$ are correspondingly: $W^0 = 1$, $W^1 = \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}}$, $W^2 = i$, $W^3 = -\frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}}$, $W^4 = -1$, $W^5 = -\frac{1}{\sqrt{2}} - \frac{i}{\sqrt{2}}$, $W^6 = -i$, $W^7 = \frac{1}{\sqrt{2}} - \frac{i}{\sqrt{2}}$. These alternate in sign with half the current period, i.e. 4, and the updating of the data array values is done in place as illustrated above.
Each application of (12) requires $N$ operations (one multiplication and one addition per output element), and the number of recursion levels is $\log_2 N$; therefore the total number of operations for implementing the FFT algorithm is $O(N \log_2 N)$.
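The splitting (12), applied recursively, can be sketched as follows. This is a minimal, non-in-place illustration in Python; it uses the $\exp(+2\pi i/N)$ sign convention of these notes and assumes the length is a power of two:

```python
import cmath

def fft(h):
    """Recursive FFT following the even/odd splitting of (12).

    Uses the convention H_n = sum_k exp(+2*pi*i*k*n/N) h_k of these
    notes; len(h) must be a power of two.
    """
    N = len(h)
    if N == 1:
        return list(h)
    He = fft(h[0::2])           # H^0: transform of the even-indexed data
    Ho = fft(h[1::2])           # H^1: transform of the odd-indexed data
    H = [0j] * N
    for n in range(N // 2):
        w = cmath.exp(2j * cmath.pi * n / N)   # W^n
        H[n] = He[n] + w * Ho[n]               # H_n = H^0_n + W^n H^1_n
        H[n + N // 2] = He[n] - w * Ho[n]      # uses W^(n + N/2) = -W^n
    return H
```

For example, `fft([1, 2, 3, 4])` returns values matching the direct evaluation of the sum in (12).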
FFT of real data
Since the real array $f_k$, $k = 0, \ldots, N-1$, is half as long (in number of machine words) as a complex array $h_k$, $k = 0, \ldots, N-1$, while the image $F_n$, $n = 0, \ldots, N-1$, is nevertheless complex, the question arises whether the transform can be done "in place" and, more generally, whether the operation count can be roughly half that of a full-length complex transform. The answer is affirmative, as expected, because $F_{N-n} = F_n^*$, i.e. the frequency spectrum contains only half as much independent information.
The algorithm for real data is implemented as follows.
The data are subdivided into two sets, with even and odd sequential numbers. The first set is interpreted as the real part, and the second as the imaginary part, of a half-length set of complex numbers: $h_j = f_{2j} + i f_{2j+1}$, $j = 0, \ldots, N/2-1$. This synthetic complex array is subjected to the standard Fourier transform routine. The output is a complex array $H_n = F_n^0 + i F_n^1$, $n = 0, \ldots, N/2-1$, with the following components (both complex):
$$F_n^0 = \sum_{k=0}^{N/2-1} \exp\left(\frac{2\pi i k n}{N/2}\right) f_{2k} \quad \text{and} \quad F_n^1 = \sum_{k=0}^{N/2-1} \exp\left(\frac{2\pi i k n}{N/2}\right) f_{2k+1}. \qquad (14)$$
By virtue of (12), the final result is obtained as:
$$F_n = F_n^0 + \exp\left(\frac{2\pi i n}{N}\right) F_n^1, \qquad n = 0, \ldots, N-1. \qquad (15)$$
The task of extracting $F_n^0$ and $F_n^1$ from $H_n$ and simultaneously producing $F_n$ is solved in the following way:
$$F_n = \frac{1}{2}\left(H_n + H_{N/2-n}^*\right) - \frac{i}{2}\left(H_n - H_{N/2-n}^*\right)\exp\left(\frac{2\pi i n}{N}\right), \qquad n = 0, \ldots, N-1. \qquad (15a)$$
The array $F$ is complex and twice as long as the real array $f$. Since $F_{-n} = F_{N-n} = F_n^*$, only the amplitudes at non-negative frequencies are needed, and they can be written in place of the original data. Because the values $H_n$, $n = 0, \ldots, N/2$, are needed in (15a), while the Fourier image of the synthetic complex array provides $H_n$, $n = 0, \ldots, N/2-1$, one employs the periodicity $H_{N/2} = H_0$.
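The packing step and the unpacking formula (15a) can be sketched in plain Python. A direct $O(N^2)$ transform stands in for the FFT routine here, to keep the sign convention of these notes explicit; the function names are ours:

```python
import cmath

def dft(h):
    """Direct O(N^2) transform with the exp(+2*pi*i) sign convention
    of these notes (a stand-in for the FFT routine)."""
    N = len(h)
    return [sum(h[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

def real_dft(f):
    """Transform of real data f (even length N) through one length-N/2
    complex transform, following (14), (15) and (15a)."""
    N = len(f)
    M = N // 2
    # pack even samples as real parts, odd samples as imaginary parts
    h = [complex(f[2 * j], f[2 * j + 1]) for j in range(M)]
    H = dft(h)
    F = []
    for n in range(N):
        Hn = H[n % M]                      # H is periodic with period N/2
        Hc = H[(M - n) % M].conjugate()    # H*_{N/2 - n}
        F.append(0.5 * (Hn + Hc)
                 - 0.5j * (Hn - Hc) * cmath.exp(2j * cmath.pi * n / N))
    return F

print(real_dft([1.0, 2.0, 3.0, 4.0]))  # matches the full-length transform of f
```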
The inverse transform is organised as follows.
• Construct
$$F_n^{(1)} = \frac{1}{2}\left(F_n + F_{N/2-n}^*\right), \qquad F_n^{(2)} = \frac{1}{2}\exp\left(-\frac{2\pi i n}{N}\right)\left(F_n - F_{N/2-n}^*\right), \qquad n = 0, \ldots, N/2-1. \qquad (16)$$
• Find the inverse transform of $H_n = F_n^{(1)} + i F_n^{(2)}$.
FFT of functions of two or more variables
Let, similarly to the one-dimensional case:
$$h(k_1, k_2) \equiv h(k_1 \Delta_x, k_2 \Delta_y), \qquad k_1 = 0, \ldots, N_1-1, \quad k_2 = 0, \ldots, N_2-1. \qquad (17)$$
The two-dimensional Fourier image is defined by the complex function:
$$
\begin{aligned}
H(n_1, n_2) &\equiv \sum_{k_1=0}^{N_1-1} \sum_{k_2=0}^{N_2-1} \exp\left(\frac{2\pi i k_1 n_1}{N_1}\right) \exp\left(\frac{2\pi i k_2 n_2}{N_2}\right) h(k_1, k_2) \\
&= \sum_{k_2=0}^{N_2-1} \exp\left(\frac{2\pi i k_2 n_2}{N_2}\right) \left[ \sum_{k_1=0}^{N_1-1} \exp\left(\frac{2\pi i k_1 n_1}{N_1}\right) h(k_1, k_2) \right] \\
&= \sum_{k_1=0}^{N_1-1} \exp\left(\frac{2\pi i k_1 n_1}{N_1}\right) \left[ \sum_{k_2=0}^{N_2-1} \exp\left(\frac{2\pi i k_2 n_2}{N_2}\right) h(k_1, k_2) \right], \qquad (18)
\end{aligned}
$$
or:
$H(n_1, n_2)$ = FFT by the second index of [FFT by the first index of $h(k_1, k_2)$]
= FFT by the first index of [FFT by the second index of $h(k_1, k_2)$].
The inverse transform amounts to inverting the signs of the exponentials and multiplying the final result by $\frac{1}{N_1 N_2}$.
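The factorisation (18) can be checked numerically. Note that numpy's forward FFT uses the $\exp(-2\pi i \cdot)$ sign convention, i.e. the conjugate of the one used in these notes, but the row-column factorisation property is exactly the same:

```python
import numpy as np

# A 2-D transform factorises into 1-D transforms along each axis, as in (18).
rng = np.random.default_rng(0)
h = rng.standard_normal((8, 16)) + 1j * rng.standard_normal((8, 16))

by_rows_then_cols = np.fft.fft(np.fft.fft(h, axis=1), axis=0)
by_cols_then_rows = np.fft.fft(np.fft.fft(h, axis=0), axis=1)
direct_2d = np.fft.fft2(h)

assert np.allclose(by_rows_then_cols, direct_2d)
assert np.allclose(by_cols_then_rows, direct_2d)
```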
Application: computed tomography
Computed tomography is a technique of studying the internal structure of objects by
means of penetrating radiation (X-rays, but generally also light, gamma rays, etc.). In one of
the possible geometries of measurement the object is placed between a line source and a line
detector which are parallel to each other in the plane of the examined slice of the object. Then,
if the source of length $L_x$ is located along the x axis (e.g. between $x = 0$ and $x = L_x$ at $y = 0$), and the detector at a distance $L_y$ from the source (i.e. between $x = 0$ and $x = L_x$ at $y = L_y$), then the registered intensity of the transmitted parallel beam will be:
$$I(x, L_y) = I_0 \exp\left(-\int_0^{L_y} \mu(x, y)\, dy\right), \qquad (19)$$
where $I_0$ is the constant linear density of the source intensity, and the linear attenuation coefficient $\mu(x, y)$ of the penetrating radiation completely characterises the internal structure of the slice. Insofar as it can be assumed that $\mu(x, y) = 0$ outside the object, and that because of the parallel beam the distance between the source and the detector does not affect the registered intensity, then:
$$I(x, L_y, \theta = 0) = I_0 \exp\left(-\int_{-\infty}^{+\infty} \mu(x, y)\, dy\right), \qquad -\infty < x < +\infty. \qquad (20)$$
Here the parameter θ represents the angle of rotation of the source-detector frame (or of
the object) around the z axis which is transverse to the plane where the source, the examined
slice and the detector lie (cf. Fig. 1). This parameter is introduced because, as it will be seen
below, the sought distribution ( )yx,µ is reconstructed from transmitted intensity measure-
ments at a series of rotation angles θ.
Figure 1. Mutual arrangement of the source, the object and the detector in a computed
tomography measurement
Further it is convenient to assume that the measured quantity is actually:
$$p(x', \theta) \equiv -\ln\frac{I(x', \theta)}{I_0} = \int_{-\infty}^{+\infty} \mu(x', y')\, dy', \qquad (21)$$
where $x'$, $y'$ are coordinates in a system rotated at an angle θ with respect to that in expression (20).
The normalisation and the taking of the logarithm in (21) are trivial operations which, without any restriction, can be assumed to be performed by the detector system. The $x'$ and $y'$ coordinates are in a system rotated at an angle θ, so that in a particular measurement the source is located along the $x'$ axis, e.g. between $x' = 0$ and $x' = L_x$ at $y' = 0$, and the detector between $x' = 0$ and $x' = L_x$ at $y' = L_y$. The relation between $x'$, $y'$ and $x$, $y$ (corresponding to θ = 0) is:
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \cdot \begin{pmatrix} x' \\ y' \end{pmatrix}. \qquad (22)$$
For the Fourier image of $p(x', \theta)$, using (21) and (22), one obtains:
$$
\begin{aligned}
P(\kappa', \theta) &\equiv \int_{-\infty}^{+\infty} p(x', \theta) \exp(2\pi i \kappa' x')\, dx' \\
&= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \mu(x', y') \exp(2\pi i \kappa' x')\, dx'\, dy' \\
&= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \mu(x(x', y'), y(x', y')) \exp(2\pi i \kappa' (x\cos\theta + y\sin\theta))\, dx'\, dy' \\
&= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \mu(x, y) \exp(2\pi i (\kappa x + \lambda y))\, dx\, dy \\
&= \mathrm{M}(\kappa, \lambda), \qquad (23)
\end{aligned}
$$
where:
$$\kappa = \kappa' \cos\theta, \qquad \lambda = \kappa' \sin\theta, \qquad (24)$$
and $\mathrm{M}(\kappa, \lambda) \equiv \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \mu(x, y) \exp(2\pi i (\kappa x + \lambda y))\, dx\, dy$ is the two-dimensional Fourier image of $\mu(x, y)$. The conversions in (23) employ the fact that coordinate system rotations like (22) conserve the volume element and the integration limits: $dx'\, dy' = dx\, dy$.
The structure $\mu(x, y)$ of the examined object can be reconstructed via an inverse Fourier transform of (23):
$$\mu(x, y) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \mathrm{M}(\kappa, \lambda) \exp(-2\pi i (\kappa x + \lambda y))\, d\kappa\, d\lambda. \qquad (25)$$
However, it is important to note that in (23) the variables κ and λ are interrelated through (24), i.e. they are not mutually independent, and at a given θ they span only a small portion of the value range needed for performing the inverse transform (25). Therefore, the inverse transform is possible at all only if a sufficiently large number of projections $p(x', \theta)$ exist at different values of θ.
With this in mind, expression (25) can provide a basis for the following algorithm:
• Collect an array of measurement results:
$$p(x'_n, \theta_m), \qquad n = 1, \ldots, N, \quad m = 1, \ldots, M.$$
• Compute M discrete one-dimensional Fourier transforms (FFT) in order to produce the quantities
$$P(\kappa'_l, \theta_m) = \int_{-\infty}^{+\infty} p(x', \theta_m) \exp(2\pi i \kappa' x')\, dx', \qquad l = 1, \ldots, L, \quad m = 1, \ldots, M.$$
• Build a correspondence map $P(\kappa'_l, \theta_m) \Rightarrow \mathrm{M}(\kappa_i, \lambda_j)$ using the relations (24).
• Compute $\mu(x, y)$ through an inverse discrete two-dimensional Fourier transform (FFT) of $\mathrm{M}(\kappa, \lambda)$.
A major drawback of this algorithm is the mapping $P(\kappa'_l, \theta_m) \Rightarrow \mathrm{M}(\kappa_i, \lambda_j)$. Because of (24), the points $\mathrm{M}(\kappa_i, \lambda_j)$ will be arranged along radial lines in the $(\kappa, \lambda)$ plane at angles $\theta_m$ with respect to the κ axis, instead of forming an equidistant Cartesian grid, as would be needed in order to ensure the proper invertibility of the Fourier transform. Although the available data can in principle be interpolated to the desired Cartesian grid, the process would introduce significant noise into the recovered two-dimensional distribution $\mu(x, y)$, especially due to the sparsely scattered data points far from the origin of the $(\kappa, \lambda)$ coordinate system. For this reason, the following approach (based on the Radon transform) is always preferred.
Since the Fourier image $P(\kappa', \theta)$ of the measured quantity $p(x', \theta)$ is given in $(\kappa', \theta)$ coordinates, which according to (24) relate to $(\kappa, \lambda)$ as polar to Cartesian coordinates, the integration in (25) can be done after a corresponding change of variables and of the integration limits: the volume element $d\kappa\, d\lambda$ is replaced by $\kappa'\, d\kappa'\, d\theta$, the limits in κ' by $(0, \infty)$, and the limits in θ by $(0, 2\pi)$. Thus:
$$
\begin{aligned}
\mu(x, y) &= \int_0^{2\pi}\int_0^{\infty} P(\kappa', \theta) \exp(-2\pi i \kappa' (x\cos\theta + y\sin\theta))\, \kappa'\, d\kappa'\, d\theta \\
&= \int_0^{2\pi}\left[\int_0^{\infty} \kappa' P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa'\right] d\theta \equiv \int_0^{2\pi} C(x', \theta)\, d\theta. \qquad (26)
\end{aligned}
$$
$C(x', \theta)$ differs in form from the inverse Fourier transform of the function $\kappa' P(\kappa', \theta)$ only by the integration limits: $(0, +\infty)$ instead of $(-\infty, +\infty)$.
Bringing this integral to the form of an inverse Fourier transform is based on the circumstance that exchanging the places of the source and the detector (or rotating the source-detector frame by 180°) does not change the measurement conditions, i.e. the detector response. Namely:
$$p(x', \theta) = p(-x', \theta + \pi), \qquad (27)$$
and consequently $P(\kappa', \theta) = P(-\kappa', \theta + \pi)$.
Therefore:
$$
\begin{aligned}
\mu(x, y) &= \int_0^{2\pi}\int_0^{\infty} \kappa' P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa'\, d\theta \\
&= \int_0^{\pi}\int_0^{\infty} \kappa' P(\kappa', \theta) \exp(-2\pi i \kappa' (x\cos\theta + y\sin\theta))\, d\kappa'\, d\theta \\
&\quad + \int_0^{\pi}\int_0^{\infty} \kappa' P(\kappa', \theta + \pi) \exp(-2\pi i \kappa' (x\cos(\theta + \pi) + y\sin(\theta + \pi)))\, d\kappa'\, d\theta \\
&= \int_0^{\pi}\left[\int_0^{\infty} \kappa' P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa' + \int_0^{\infty} \kappa' P(-\kappa', \theta) \exp(2\pi i \kappa' x')\, d\kappa'\right] d\theta \\
&= \int_0^{\pi}\left[\int_0^{\infty} \kappa' P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa' + \int_{-\infty}^{0} (-\kappa') P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa'\right] d\theta \\
&= \int_0^{\pi}\int_{-\infty}^{+\infty} |\kappa'|\, P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa'\, d\theta \\
&= \int_0^{\pi} C(x', \theta)\, d\theta, \qquad (28)
\end{aligned}
$$
where
$$C(x', \theta) \equiv \int_{-\infty}^{+\infty} |\kappa'|\, P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa' = \int_{-\infty}^{+\infty} B(\kappa')\, P(\kappa', \theta) \exp(-2\pi i \kappa' x')\, d\kappa' \qquad (29)$$
is an inverse Fourier transform of the function $B(\kappa) P(\kappa, \theta)$, and $B(\kappa) \equiv |\kappa|$ plays the role of a „frequency filter”.
In practice, the measurements are performed at a sequence of discrete rotation angles $\theta_m$, $m = 0, \ldots, M-1$, and the image $\mu(x, y)$ is reconstructed in a discrete grid of cells (pixels) bounded by $(x_i, y_j)$, $i = 1, \ldots, I$, $j = 1, \ldots, J$. The algorithm based on (23), (28) and (29) takes the form:
For each $\theta_m$, $m = 0, \ldots, M-1$:
1. Discretisation of $p(x', \theta)$ in $x'$ with a step Δ:
$$p(x'_k, \theta_m), \qquad k = 0, \ldots, N-1, \quad x'_k = k\Delta.$$
2. Finding the Fourier image (FFT) of $p(x', \theta_m)$:
$$P(\kappa'_n, \theta_m) = \mathrm{FFT}\left(p(x'_k, \theta_m)\right), \qquad n = 0, \ldots, N-1, \quad \kappa'_n = \frac{n}{N\Delta}.$$
3. Inverse Fourier transform (IFFT) of the product $\kappa' P(\kappa', \theta_m)$:
$$C(x'_k, \theta_m) = \mathrm{IFFT}\left(\kappa'_n P(\kappa'_n, \theta_m)\right), \qquad k = 0, \ldots, N-1.$$
4. Mapping of $C(x'_k, \theta_m)$ into the pixel grid in which $\mu(x, y)$ is to be reconstructed. This is effected as follows. First, for each pixel $(x_i, y_j)$, the coordinate $x' = x_i\cos\theta_m + y_j\sin\theta_m$ is computed. Then the index $k$ is found for which $x'_k \leq x' \leq x'_{k+1}$. Finally, the quantity $C_{ij}(\theta_m)$ is evaluated through interpolation between $C(x'_k, \theta_m)$ and $C(x'_{k+1}, \theta_m)$.
5. Adding of $C_{ij}(\theta_m)\,\Delta\theta$ to the current value of $\mu_{ij}$ (with a starting value $\mu_{ij} = 0$), i.e. integration over θ according to (28).
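The five steps above can be sketched as follows. This is an illustrative numpy implementation: the centered detector coordinate, the uniform-disc test object and all names are ours, and no additional smoothing filter is applied:

```python
import numpy as np

def filtered_back_projection(p, thetas, xs):
    """Minimal filtered back-projection following steps 1-5 above.

    p[m, k] -- projection p(x'_k, theta_m); thetas -- equidistant angles
    on [0, pi); xs -- detector coordinates x'_k (uniform step).
    Returns mu on the Cartesian grid xs x xs. A sketch, not an
    optimised or carefully windowed implementation.
    """
    M, N = p.shape
    dtheta = np.pi / M
    # |kappa'| filter on the FFT frequency grid (steps 2-3)
    kappa = np.abs(np.fft.fftfreq(N, d=xs[1] - xs[0]))
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    mu = np.zeros((N, N))
    for m, theta in enumerate(thetas):
        C = np.real(np.fft.ifft(kappa * np.fft.fft(p[m])))  # C(x', theta_m)
        xp = X * np.cos(theta) + Y * np.sin(theta)          # step 4: x' per pixel
        mu += np.interp(xp, xs, C, left=0.0, right=0.0) * dtheta  # step 5
    return mu

# Demo: a uniform disc of radius 0.5 (mu = 1 inside), whose parallel
# projection 2*sqrt(r^2 - x'^2) is the same at every angle.
N, M = 64, 90
xs = np.linspace(-1.0, 1.0, N)
proj = 2.0 * np.sqrt(np.clip(0.25 - xs ** 2, 0.0, None))
thetas = np.arange(M) * np.pi / M
mu = filtered_back_projection(np.tile(proj, (M, 1)), thetas, xs)
```

The reconstructed values approach 1 inside the disc and 0 well outside it, up to discretisation noise.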
Example
Let the original image be the so-called Shepp-Logan tomographic phantom, composed of ellipses with different optical density (Fig. 2). Then, for example, the application of the above algorithm with a 512 × 512 pixel grid and 256 equidistant discrete angles between 0° and 180°, i.e. 256 linear projections $p(x'_k, \theta_m)$, $k = 0, \ldots, 511$, $m = 0, \ldots, 255$, gives the reconstructed image in Fig. 3. The observed noise effects can additionally be filtered by applying a suitable frequency filter in (29) of the general form $B(\kappa) \equiv |\kappa|\,\varphi(\kappa)$.
Figure 2. The Shepp-Logan phantom in A. C. Kak, M. Slaney, Principles of Computer-
ized Tomographic Imaging, IEEE Press, 1988.
Figure 3. A reconstruction of the image from Fig. 2 in a 512 × 512 pixel grid, based on
256 linear projections – above. Linear projections at 0° and 90° – below.
3. Eigenvalues and eigenvectors of a matrix
The scalar λ and the non-zero vector x are an eigenvalue and an eigenvector of the
square matrix A, if:
$$\mathbf{A}\mathbf{x} = \lambda\mathbf{x}. \qquad (1)$$
It is seen that an eigenvector may have an arbitrary scaling. It is also seen from (1) that non-zero eigenvectors can exist only if the following equality is fulfilled:
$$\det(\mathbf{A} - \lambda\mathbf{1}) = 0. \qquad (2)$$
The left side of (2) is a polynomial of degree n (where n is the number of rows and columns of A) with respect to λ, and its roots are the eigenvalues of A. These eigenvalues are not necessarily distinct or real-valued. From (1) and (2) it also follows that between the eigenvalues and the eigenvectors there exists a unique correspondence $(\lambda_i, \mathbf{x}_i)$, $i = 1, \ldots, n$, which does not imply that the eigenvectors are always distinct. The addition, e.g., of $\tau\mathbf{x}$ to both sides of (1) shifts the eigenvalues by an additive constant τ without changing the eigenvectors. Therefore the occurrence of a zero eigenvalue is not a special case, because through such shifting any eigenvalue can be brought to zero, or conversely, made non-zero.
It can easily be verified that:
− If the matrix B is a polynomial of degree m of the matrix A, i.e. $\mathbf{B} = P_m(\mathbf{A})$, then the eigenvalues of B are the same polynomial of the eigenvalues of A, i.e. $\mu_i = P_m(\lambda_i)$, $i = 1, \ldots, n$, where $\lambda_i$ and $\mu_i$ are correspondingly the eigenvalues of A and B, and the eigenvectors of A and B coincide.
− If $\lambda_i$, $i = 1, \ldots, n$, are the eigenvalues of A, then $\mu_i = \frac{1}{\lambda_i}$, $i = 1, \ldots, n$, are the eigenvalues of $\mathbf{B} = \mathbf{A}^{-1}$, and the eigenvectors of A and B coincide.
Here it is appropriate to remind some definitions and statements.
A matrix is symmetric if it coincides with its transpose:
$$\mathbf{A} = \mathbf{A}^T, \quad \text{or} \quad a_{ij} = a_{ji}. \qquad (3)$$
A matrix is Hermitian if it coincides with the complex conjugate of its transpose (also called its Hermitian conjugate):
$$\mathbf{A} = \mathbf{A}^{+}, \quad \text{or} \quad a_{ij} = a_{ji}^*. \qquad (4)$$
A matrix is orthogonal if its transpose is equal to its inverse:
$$\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A}^T = \mathbf{1}, \qquad (5)$$
and unitary if its Hermitian conjugate equals its inverse. From (5) it is seen that the columns of an orthogonal (unitary) matrix, interpreted as vectors, are mutually orthonormal.
A matrix is normal if it commutes with its Hermitian conjugate:
$$\mathbf{A}^{+}\mathbf{A} = \mathbf{A}\mathbf{A}^{+}. \qquad (6)$$
It is evident that symmetric (Hermitian) and orthogonal (unitary) matrices are normal.
All eigenvalues of a symmetric/Hermitian matrix are real.
The eigenvectors of a normal matrix with distinct eigenvalues are mutually orthogonal and form a basis in the n-dimensional vector space. (If a normal matrix has coinciding (i.e. degenerate) eigenvalues, then the eigenvectors corresponding to the set of degenerate eigenvalues can be replaced by linear combinations of themselves (these combinations will also be eigenvectors corresponding to this eigenvalue set), so that a Gram-Schmidt orthogonalisation can be performed in order to produce a complete and orthogonal set of eigenvectors, the same as in the case of distinct eigenvalues.)
If a matrix is not normal (e.g. some random real matrix), then its eigenvectors in general cannot be brought to an orthonormal basis, but they usually, although not always, form a complete set, i.e. an arbitrary non-zero n-element vector, where n is the matrix dimension, can be represented as a linear combination of the matrix eigenvectors.
A matrix which is columnwise assembled from a set of orthonormal eigenvectors is obviously unitary. Also, a matrix which is columnwise assembled from the orthonormalised eigenvectors of a real symmetric matrix is orthogonal, since the eigenvectors of a real symmetric matrix are all real.
λ and x satisfying the equality
$$\mathbf{x}^T\mathbf{A} = \lambda\mathbf{x}^T \qquad (7)$$
are correspondingly a left eigenvalue and a left eigenvector. It is seen that the transposed left eigenvectors of A are right eigenvectors of $\mathbf{A}^T$. Therefore, the left and right eigenvectors of a symmetric/Hermitian matrix are mutually transpose/Hermitian conjugate. The left and right eigenvalues of A coincide, because expression (7) is equivalent to $(\mathbf{A}^T - \lambda\mathbf{1})\mathbf{x} = \mathbf{0}$, and $\det\left((\mathbf{A} - \lambda\mathbf{1})^T\right) = \det(\mathbf{A} - \lambda\mathbf{1})$ (a basic property of determinants), i.e. the characteristic polynomial of (7) coincides with that of (1).
In general, a left and a right eigenvector corresponding to distinct eigenvalues are mutually orthogonal, and with degenerate eigenvalues they can be made such. This will be shown below.
Let $\mathbf{X}_R$ be a matrix columnwise assembled from the right eigenvectors of A, and $\mathbf{X}_L$ a matrix rowwise assembled from the left eigenvectors of A. Then (1) and (7) can be written as:
$$\mathbf{A}\mathbf{X}_R = \mathbf{X}_R\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n), \qquad \mathbf{X}_L\mathbf{A} = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)\,\mathbf{X}_L. \qquad (8)$$
If the first equality is multiplied on the left by $\mathbf{X}_L$, the second on the right by $\mathbf{X}_R$, and the resulting expressions are subtracted, one obtains:
$$(\mathbf{X}_L\mathbf{X}_R)\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)\,(\mathbf{X}_L\mathbf{X}_R). \qquad (9)$$
Since the only matrices which commute with a diagonal matrix with distinct elements are themselves diagonal, if the eigenvalues of A are distinct the left and right eigenvectors of A will be mutually orthogonal. They can always be normalised (eigenvectors are unique only up to a multiplier) in order to obtain:
$$\mathbf{X}_L\mathbf{X}_R = \mathbf{1}, \quad \text{i.e.} \quad \mathbf{X}_L = \mathbf{X}_R^{-1}. \qquad (10)$$
(In the case of degenerate eigenvalues the corresponding left or right eigenvectors can be linearly combined in a procedure similar to Gram-Schmidt, so that relation (10) will also hold. An exception is the case of an incomplete set of eigenvectors, when the matrix $\mathbf{X}_L\mathbf{X}_R$ will have zero elements on the diagonal.)
By multiplying the first equality in (8) on the left by $\mathbf{X}_L$ and using the result (10), the following is obtained:
$$\mathbf{X}_R^{-1}\mathbf{A}\mathbf{X}_R = \mathrm{diag}(\lambda_1, \ldots, \lambda_n). \qquad (11)$$
This expression is a particular case of a similarity transformation on matrix A:
$$\mathbf{A} \to \mathbf{Z}^{-1}\mathbf{A}\mathbf{Z}, \qquad (12)$$
where in the general case the transformation matrix Z can be arbitrary.
Similarity transformations conserve the eigenvalues. Indeed:
$$\det(\mathbf{Z}^{-1}\mathbf{A}\mathbf{Z} - \lambda\mathbf{1}) = \det\left(\mathbf{Z}^{-1}(\mathbf{A} - \lambda\mathbf{1})\mathbf{Z}\right) = \det(\mathbf{Z}^{-1})\det(\mathbf{A} - \lambda\mathbf{1})\det(\mathbf{Z}) = \det(\mathbf{A} - \lambda\mathbf{1}). \qquad (13)$$
Since the last expression matches the left side of (2), the eigenvalues of $\mathbf{Z}^{-1}\mathbf{A}\mathbf{Z}$ coincide with those of A.
Taking (11)-(13) into account, it is seen that every matrix A with a complete set of eigenvectors (i.e. every normal matrix and most random matrices) can be diagonalised through similarity transformations. In this case, the columns of the applied transformation matrix will be right eigenvectors of A, and the rows of the inverse of this matrix will be left eigenvectors of A. (Of course, the elements of the resultant diagonal matrix will be the corresponding eigenvalues of A.)
The eigenvectors of a real symmetric matrix are real and orthonormal. This follows from the fact that since $\mathbf{A}^T = \mathbf{A}$, if $\mathbf{A}\mathbf{X}_R = \mathbf{X}_R\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then $\mathbf{X}_R^T\mathbf{A} = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)\,\mathbf{X}_R^T$, i.e. $\mathbf{X}_L = \mathbf{X}_R^T$ (compare with the second expression in (8)). Then, according to (10), $\mathbf{X}_R^{-1} = \mathbf{X}_R^T$.
In this case the diagonalising matrix of the similarity transformation will be orthogonal.
A similarity transformation of this kind is known as orthogonal:
$$\mathbf{A} \to \mathbf{Z}^T\mathbf{A}\mathbf{Z}, \qquad (14)$$
where Z is an orthogonal matrix.
Based on the above, the general strategy of the most commonly applied methods for finding eigenvalues and eigenvectors consists in bringing the matrix A to a diagonal form through a sequence of similarity transformations:
$$\mathbf{A} \to \mathbf{P}_1^{-1}\mathbf{A}\mathbf{P}_1 \to \mathbf{P}_2^{-1}\mathbf{P}_1^{-1}\mathbf{A}\mathbf{P}_1\mathbf{P}_2 \to \mathbf{P}_3^{-1}\mathbf{P}_2^{-1}\mathbf{P}_1^{-1}\mathbf{A}\mathbf{P}_1\mathbf{P}_2\mathbf{P}_3 \to \ldots \to \mathrm{diag}(\lambda_1, \ldots, \lambda_n). \qquad (15)$$
When this diagonal (or practically diagonal) form is achieved, the eigenvectors will be found in the columns of the cumulative transformation matrix:
$$\mathbf{X}_R = \mathbf{P}_1\mathbf{P}_2\mathbf{P}_3\ldots \qquad (16)$$
Characteristic equation for the eigenvalues
One of the possible approaches for finding the eigenvalues of A is to search for the roots of the polynomial equation for λ, i.e. $P_n(\lambda) = \det(\mathbf{A} - \lambda\mathbf{1}) = 0$, commonly known as the characteristic equation. Since the size of the matrix is usually large, it is desirable that the characteristic polynomial be represented in a form suitable for evaluation and for applying standard root-finding algorithms for polynomials, namely through the coefficients of the powers of λ: $P_n(\lambda) = \sum_{i=0}^{n} a_i \lambda^i$.
The polynomial coefficients can be determined via the method of Krylov, based on the following considerations.
If the eigenvectors $\mathbf{x}_i$, $i = 1, \ldots, n$, of matrix A form a complete set, then an arbitrary non-zero vector y can be represented as their linear combination: $\mathbf{y} = \sum_{i=1}^{n} \alpha_i\mathbf{x}_i$.
Hence, if the polynomial $Q_n$ with a non-zero set of coefficients satisfies the equality $Q_n(\mathbf{A}) = \mathbf{0}$, then the equality $Q_n(\mathbf{A})\mathbf{y} = \sum_{i=1}^{n} \alpha_i Q_n(\lambda_i)\mathbf{x}_i = \mathbf{0}$ will also hold, where $\lambda_i$ are the eigenvalues of A. Since the vectors $\mathbf{x}_i$, $i = 1, \ldots, n$, are presumed to be mutually linearly independent and the set of coefficients $\alpha_i$, $i = 1, \ldots, n$, is non-zero, from the last equality it follows that $Q_n(\lambda_i) = 0$, $i = 1, \ldots, n$.
On the other hand, $\lambda_i$ are zeroes of the characteristic polynomial $P_n$, which is also of degree n, and two polynomials of the same degree with coinciding zeroes can differ only by a common multiplier. Thus, in the context of the problem being solved, the polynomial $Q_n$ coincides with the characteristic polynomial of matrix A.
Since the characteristic polynomial coefficients are sought up to a common multiplier, the characteristic equation can be written in the form
$$P_n(\lambda) = \lambda^n + \sum_{i=1}^{n} b_i \lambda^{n-i} = 0.$$
Then, with a sufficiently arbitrarily chosen fixed non-zero vector y, the equality
$$P_n(\mathbf{A})\mathbf{y} = \mathbf{A}^n\mathbf{y} + \sum_{i=1}^{n} b_i\,\mathbf{A}^{n-i}\mathbf{y} = \mathbf{0} \qquad (17)$$
can be regarded as a set of n linear algebraic equations for the n unknowns $b_1, \ldots, b_n$. The coefficients $c_{ki} = \left(\mathbf{A}^{n-i}\mathbf{y}\right)_k$, $k = 1, \ldots, n$, multiplying the unknowns $b_i$, $i = 1, \ldots, n$, as well as the free terms $d_k = \left(\mathbf{A}^n\mathbf{y}\right)_k$, $k = 1, \ldots, n$, can be evaluated through the recursive build-up of the sequence
$$\mathbf{v}^{(0)} \equiv \mathbf{y}, \quad \mathbf{v}^{(1)} = \mathbf{A}\mathbf{v}^{(0)}, \quad \ldots, \quad \mathbf{v}^{(n)} = \mathbf{A}\mathbf{v}^{(n-1)}. \qquad (18)$$
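Krylov's method can be sketched as follows. This is an illustrative numpy implementation; the choice of starting vector y is ours and must keep the Krylov vectors linearly independent:

```python
import numpy as np

def krylov_char_poly(A, y=None):
    """Coefficients [1, b_1, ..., b_n] of the characteristic polynomial
    lambda^n + b_1 lambda^(n-1) + ... + b_n via Krylov's method (17)-(18).

    A sketch: assumes the Krylov vectors y, Ay, ..., A^(n-1)y are
    linearly independent (otherwise the system is singular).
    """
    n = A.shape[0]
    if y is None:
        y = np.arange(1, n + 1, dtype=float)
    # recursive build-up (18): v^(0) = y, v^(k) = A v^(k-1)
    v = [y]
    for _ in range(n):
        v.append(A @ v[-1])
    # columns A^(n-1)y, ..., Ay, y multiply the unknowns b_1, ..., b_n in (17)
    C = np.column_stack(v[n - 1::-1])
    b = np.linalg.solve(C, -v[n])
    return np.concatenate(([1.0], b))

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
print(krylov_char_poly(A))               # lambda^2 - 4 lambda + 3, up to rounding
```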
For finding the zeroes of the characteristic polynomial there exist standard methods, similar to those for finding the zeroes of a function. A necessary condition for the successful and efficient finding of these zeroes is their initial localisation (bracketing). The following relations can be helpful in solving this task:
$$|\lambda| \leq \min(P, Q), \qquad (19)$$
where $P = \max_i \sum_j |a_{ij}|$ and $Q = \max_i \sum_j |a_{ji}|$;
$$\min_i\left(\mathrm{Re}(a_{ii}) - P_i\right) \leq \mathrm{Re}\,\lambda \leq \max_i\left(\mathrm{Re}(a_{ii}) + P_i\right),$$
$$\min_i\left(\mathrm{Im}(a_{ii}) - P_i\right) \leq \mathrm{Im}\,\lambda \leq \max_i\left(\mathrm{Im}(a_{ii}) + P_i\right),$$
or \qquad (20)
$$\min_i\left(\mathrm{Re}(a_{ii}) - Q_i\right) \leq \mathrm{Re}\,\lambda \leq \max_i\left(\mathrm{Re}(a_{ii}) + Q_i\right),$$
$$\min_i\left(\mathrm{Im}(a_{ii}) - Q_i\right) \leq \mathrm{Im}\,\lambda \leq \max_i\left(\mathrm{Im}(a_{ii}) + Q_i\right),$$
where $P_i = \sum_{j \neq i} |a_{ij}|$ and $Q_i = \sum_{j \neq i} |a_{ji}|$.
If the matrix is diagonally dominant, i.e. $|a_{ii}| > P_i$ or $|a_{ii}| > Q_i$, then $|\lambda| \geq \min_i\left(|a_{ii}| - P_i\right)$ or $|\lambda| \geq \min_i\left(|a_{ii}| - Q_i\right)$, respectively.
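The row-sum variant of (20) can be sketched as follows (numpy; the function name is ours):

```python
import numpy as np

def gershgorin_bounds(A):
    """Row-sum localisation rectangles of (20): bounds on Re(lambda)
    and Im(lambda) covering all eigenvalues of A."""
    P = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))  # P_i = sum_{j != i} |a_ij|
    d = np.diag(A)
    re = (float(np.min(d.real - P)), float(np.max(d.real + P)))
    im = (float(np.min(d.imag - P)), float(np.max(d.imag + P)))
    return re, im

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
re, im = gershgorin_bounds(A)
print(re)  # (1.0, 5.0): all eigenvalues lie in [1, 5] on the real axis
```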
Search for isolated eigenvalues and eigenvectors
Power iteration
Let the eigenvectors $\mathbf{x}_i$, $i = 1, \ldots, n$, of A be a complete set. Then any vector $\mathbf{v}_0$ can be represented as $\mathbf{v}_0 = \sum_i \alpha_i\mathbf{x}_i$. Thus, the sequence of vectors $\mathbf{v}_k \equiv \mathbf{A}^k\mathbf{v}_0$, $k = 1, 2, \ldots$, will have the following representation:
$$\mathbf{v}_1 = \sum_i \alpha_i\lambda_i\mathbf{x}_i, \quad \ldots, \quad \mathbf{v}_m = \sum_i \alpha_i\lambda_i^m\mathbf{x}_i, \quad \ldots, \qquad (21)$$
where $\lambda_i$, $i = 1, \ldots, n$, are the eigenvalues of A.
Let $|\lambda_1| \geq |\lambda_2| \geq \ldots \geq |\lambda_n|$. Then, if $\alpha_1 \neq 0$,
$$\lim_{m \to \infty} \frac{1}{\lambda_1^m}\mathbf{A}^m\mathbf{v}_0 = \alpha_1\mathbf{x}_1, \quad \text{or} \quad \lim_{m \to \infty} \frac{\left(\mathbf{v}_{m+1}\right)_i}{\left(\mathbf{v}_m\right)_i} = \lambda_1, \quad \text{or} \quad \lim_{m \to \infty} \frac{\mathbf{y}\cdot\mathbf{v}_{m+1}}{\mathbf{y}\cdot\mathbf{v}_m} = \lambda_1, \qquad (22)$$
where y is an arbitrary vector which is not orthogonal to $\mathbf{x}_1$. In practice it is convenient to choose y with 1 in the position corresponding to the element of $\mathbf{v}_m$ with the largest absolute value and 0's everywhere else. This is so because at large m this element of the iterate vector will tend to stabilise and will become representative of the magnitude of the eigenvector $\mathbf{x}_1$.
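A minimal sketch of the power iteration (21)-(22) follows (numpy; the rescaling strategy is one common choice, not prescribed by the notes):

```python
import numpy as np

def power_iteration(A, iters=200):
    """Power iteration (21)-(22): estimate the dominant eigenvalue and
    its eigenvector. A minimal sketch; assumes |lambda_1| > |lambda_2|
    and a starting vector with alpha_1 != 0."""
    v = np.arange(1, A.shape[0] + 1, dtype=float)
    lam = 0.0
    for _ in range(iters):
        w = A @ v
        k = np.argmax(np.abs(w))   # y picks out the largest-magnitude element
        lam = w[k] / v[k]          # the ratio (y.v_{m+1}) / (y.v_m) of (22)
        v = w / np.abs(w[k])       # rescale to avoid overflow
    return lam, v

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues 3 and 1
lam, v = power_iteration(A)
print(lam)  # -> 3.0 (to machine precision)
```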
Accelerating the convergence
Since $\mathbf{v}_m = \lambda_1^m\left[\alpha_1\mathbf{x}_1 + \sum_{i=2}^{n} \alpha_i\left(\frac{\lambda_i}{\lambda_1}\right)^m\mathbf{x}_i\right]$, the convergence rate of the power method will depend on the ratio $\left|\frac{\lambda_2}{\lambda_1}\right|$ (known as the dominance ratio). If this ratio is too close to 1, one of the following acceleration methods can be applied.
1) Let $\mathbf{e}_k$ be the k-th column of the identity matrix. Then
$$R_m \equiv \frac{\mathbf{e}_k\cdot\mathbf{v}_{m+1}}{\mathbf{e}_k\cdot\mathbf{v}_m} = \frac{\alpha_1\lambda_1^{m+1} + \sum_{i=2}^{n}\alpha_i\lambda_i^{m+1}}{\alpha_1\lambda_1^{m} + \sum_{i=2}^{n}\alpha_i\lambda_i^{m}} \approx \lambda_1\left(1 + \beta\left(\frac{\lambda_2}{\lambda_1}\right)^{m}\right) \quad \text{for large } m, \qquad (23)$$
or $\lambda_1 - R_{m+1} = r \times (\lambda_1 - R_m)$, where $r = \frac{\lambda_2}{\lambda_1}$. This allows applying Aitken's δ²-process:
$$\frac{\lambda_1 - R_{m+2}}{\lambda_1 - R_{m+1}} = \frac{\lambda_1 - R_{m+1}}{\lambda_1 - R_m} \quad \Rightarrow \quad \lambda_1 = \frac{R_m R_{m+2} - R_{m+1}^2}{R_{m+2} - 2R_{m+1} + R_m} = R_{m+2} - \frac{\left(R_{m+2} - R_{m+1}\right)^2}{R_{m+2} - 2R_{m+1} + R_m} = R_{m+2} - \frac{\left(\Delta R_{m+1}\right)^2}{\Delta^2 R_m}. \qquad (24)$$
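Applied to a model sequence with a purely geometric error term, as assumed after (23), Aitken's δ²-process (24) recovers the limit from just three terms (a minimal sketch, names ours):

```python
def aitken(R):
    """Aitken delta^2 extrapolation (24) applied to the last three terms
    of a sequence of ratios R_m."""
    r0, r1, r2 = R[-3], R[-2], R[-1]
    return r2 - (r2 - r1) ** 2 / (r2 - 2 * r1 + r0)

# Model approach to lambda_1 = 3 with geometric error ratio r = 1/3:
lam1, r = 3.0, 1.0 / 3.0
R = [lam1 - r ** m for m in range(1, 6)]   # R_m = lambda_1 - r^m
print(aitken(R))  # recovers lambda_1 = 3.0 up to rounding
```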
2) The power iteration can be performed with $\mathbf{A} - \tau\mathbf{1}$ instead of A. The corresponding eigenvalues will be $\mu_i = \lambda_i - \tau$, and through a suitable choice of τ the dominance ratio $\left|\frac{\mu_2}{\mu_1}\right|$ can be reduced, thereby increasing the convergence rate.
For example, if the eigenvalues are real and positive and $\lambda_n$ is the smallest of them, then with $\tau = \lambda_n$ the convergence will accelerate, because $\left|\frac{\lambda_2 - \tau}{\lambda_1 - \tau}\right| < \left|\frac{\lambda_2}{\lambda_1}\right|$. Actually the optimum value of τ is $\frac{\lambda_2 + \lambda_n}{2}$. If the eigenvalues can be either positive or negative, τ can be chosen so that the second and third largest in magnitude among the numbers $\lambda_i - \tau$ have approximately equal magnitudes and opposite signs.
3) If A is symmetric, then its eigenvectors will be mutually orthogonal and
$$\mathbf{v}_m\cdot\mathbf{v}_{m+1} = \mathbf{v}_m\cdot\mathbf{A}\mathbf{v}_m = \sum_{i=1}^{n}\alpha_i^2\lambda_i^{2m+1}, \quad \text{and} \quad \mathbf{v}_m\cdot\mathbf{v}_m = \sum_{i=1}^{n}\alpha_i^2\lambda_i^{2m}, \qquad (25)$$
so that
$$\frac{\mathbf{v}_m\cdot\mathbf{v}_{m+1}}{\mathbf{v}_m\cdot\mathbf{v}_m} = \lambda_1\left(1 + O\left(\left(\frac{\lambda_2}{\lambda_1}\right)^{2m}\right)\right). \qquad (26)$$
Inverse power iteration (Wielandt’s method)
Let there exist a good estimate τ for a given eigenvalue, e.g. $\lambda_j$, i.e. $\left|\lambda_j - \tau\right| \ll \left|\lambda_k - \tau\right|,\ k \neq j$, and let the eigenvectors of A, $\mathbf{x}_i,\ i = 1,\ldots,n$, form a complete set. Then, beginning with an arbitrary non-zero vector $\mathbf{v}_0$, the following iteration process can be organised:

$$\left(\mathbf{A} - \tau\mathbf{1}\right)\mathbf{v}_k = \mathbf{v}_{k-1}, \qquad k = 1,2,\ldots \quad (27)$$

If $\tau \neq \lambda_i,\ i = 1,\ldots,n$, then $\left(\mathbf{A} - \tau\mathbf{1}\right)^{-1}$ exists and the iteration process has the form

$$\mathbf{v}_k = \left(\mathbf{A} - \tau\mathbf{1}\right)^{-1}\mathbf{v}_{k-1}, \qquad k = 1,2,\ldots \quad (28)$$

of a power iteration with the matrix $\left(\mathbf{A} - \tau\mathbf{1}\right)^{-1}$, whose eigenvalues are

$$\mu_i \equiv \frac{1}{\lambda_i - \tau}, \qquad i = 1,\ldots,n, \quad (29)$$

with $\left|\mu_j\right| \gg \left|\mu_i\right|,\ i \neq j$. This will clearly result in a very fast convergence to $\mu_j$.
At that, the vector sequence $\mathbf{v}_k,\ k = 1,2,\ldots$ will converge to the eigenvector $\mathbf{x}_j$ which corresponds to $\mu_j$. Indeed, if $\mathbf{v}_0 = \sum_{i=1}^{n}\alpha_i\mathbf{x}_i$, then $\mathbf{v}_m = \left(\mathbf{A} - \tau\mathbf{1}\right)^{-m}\mathbf{v}_0 = \sum_{i=1}^{n}\alpha_i\left(\lambda_i - \tau\right)^{-m}\mathbf{x}_i$, i.e.

$$\left(\lambda_j - \tau\right)^{m}\mathbf{v}_m = \alpha_j\mathbf{x}_j + \sum_{i\neq j}\alpha_i\left(\frac{\lambda_j - \tau}{\lambda_i - \tau}\right)^{m}\mathbf{x}_i \xrightarrow[m\to\infty]{} \alpha_j\mathbf{x}_j. \quad (30)$$
The practical procedure is as follows. Let at a given stage the estimates of $\mathbf{x}_j$ and $\lambda_j$ be $\mathbf{v}_m$ and $\tau_m$, and let the equation $\left(\mathbf{A} - \tau_m\mathbf{1}\right)\mathbf{v}_{m+1} = \mathbf{v}_m$ be solved, where $\mathbf{v}_m$ is normalised so that $\mathbf{v}_m\cdot\mathbf{v}_m = 1$. Since $\mathbf{v}_{m+1}$ is an improved estimate of $\mathbf{x}_j$, it can be assumed that $\mathbf{v}_{m+1} \approx \mathbf{x}_j$, i.e. $\left(\mathbf{A} - \tau_m\mathbf{1}\right)\mathbf{v}_{m+1} \approx \left(\lambda_j - \tau_m\right)\mathbf{v}_{m+1}$. (This is so because $\mathbf{A}\mathbf{x}_j = \lambda_j\mathbf{x}_j$ and correspondingly $\left(\mathbf{A} - \tau_m\mathbf{1}\right)\mathbf{x}_j = \left(\lambda_j - \tau_m\right)\mathbf{x}_j$.) Then $1 = \mathbf{v}_m\cdot\mathbf{v}_m \approx \left(\lambda_j - \tau_m\right)\mathbf{v}_{m+1}\cdot\mathbf{v}_m$ and therefore:

$$\lambda_j \approx \tau_m + \frac{1}{\mathbf{v}_{m+1}\cdot\mathbf{v}_m}. \quad (31)$$

This evaluation of $\lambda_j$ is taken as a new estimate $\tau_{m+1}$; prior to the next iteration step $\mathbf{v}_{m+1}$ is normalised to unit length, and the algorithm is repeated.
The described procedure extends directly to complex eigenvalues and eigenvectors. Regardless of whether a given eigenvalue is real or complex, the numerical stability of the implementation is better if the initial vector $\mathbf{v}_0$ is chosen to be real. Expression (31) then has the form $\lambda_j \approx \tau_m + \dfrac{\mathbf{v}_m^{*}\cdot\mathbf{v}_m}{\mathbf{v}_m^{*}\cdot\mathbf{v}_{m+1}}$, and here it is also expedient to normalise $\mathbf{v}_m$ so that $\mathbf{v}_m^{*}\cdot\mathbf{v}_m = 1$.
Jacobi transformations for the diagonalisation of symmetric matrices
An implementation of the previously outlined general approach of similarity transformations (15) is the following sequence of orthogonal transformations, leading to asymptotic diagonalisation of a real symmetric matrix.
The orthogonal matrices for the consecutive transformations are chosen to be of the form:

$$\mathbf{P}_{pq} = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & c & \cdots & s & \\ & & \vdots & \ddots & \vdots & \\ & & -s & \cdots & c & \\ & & & & & 1 \end{pmatrix}, \quad (32)$$
where the numbers c and s are the cosine and sine of some angle φ, so that $c^2 + s^2 = 1$; all diagonal elements are 1, except $P_{pp} = P_{qq} = c$; all off-diagonal elements are 0, except $P_{pq} = s$ and $P_{qp} = -s$.
Each of the orthogonal transformations will be:

$$\mathbf{A}^{(k)} = \left(\mathbf{P}_{pq}^{(k)}\right)^{T}\mathbf{A}^{(k-1)}\,\mathbf{P}_{pq}^{(k)}. \quad (33)$$

The index $k = 1,2,\ldots$ denotes the sequential number of the transformation, and $\mathbf{A}^{(0)}$ is the original matrix A.
It is seen from (32) and (33) that the transformation will affect only the matrix elements
in the p-th and q-th rows and columns (generally different for each application of (33)).
From (33) it directly follows that these transformations will preserve the symmetry of the matrix:

$$\left(\mathbf{A}^{(k)}\right)^{T} = \left(\left(\mathbf{P}_{pq}^{(k)}\right)^{T}\mathbf{A}^{(k-1)}\,\mathbf{P}_{pq}^{(k)}\right)^{T} = \left(\mathbf{P}_{pq}^{(k)}\right)^{T}\left(\mathbf{A}^{(k-1)}\right)^{T}\mathbf{P}_{pq}^{(k)} = \left(\mathbf{P}_{pq}^{(k)}\right)^{T}\mathbf{A}^{(k-1)}\,\mathbf{P}_{pq}^{(k)} = \mathbf{A}^{(k)}.$$
The expressions for the altered matrix elements are as follows:

$$\begin{aligned}
\tilde a_{ip} &= c\,a_{ip} - s\,a_{iq}; \qquad \tilde a_{iq} = c\,a_{iq} + s\,a_{ip}, \qquad i \neq p,\ i \neq q; \\
\tilde a_{pp} &= c^2 a_{pp} + s^2 a_{qq} - 2sc\,a_{pq}; \qquad \tilde a_{qq} = s^2 a_{pp} + c^2 a_{qq} + 2sc\,a_{pq}; \\
\tilde a_{pq} &= \tilde a_{qp} = \left(c^2 - s^2\right)a_{pq} + sc\left(a_{pp} - a_{qq}\right).
\end{aligned} \quad (34)$$
The technique of Jacobi's method consists in zeroing the off-diagonal matrix elements through a sequence of transformations of the type of (33)/(34). In particular, as follows from the last expression in (34), zeroing $\tilde a_{pq}$ requires that the angle φ be the solution of the equation:

$$\operatorname{ctg} 2\varphi \equiv \frac{c^2 - s^2}{2sc} = \frac{a_{qq} - a_{pp}}{2a_{pq}}. \quad (35)$$
Unfortunately, any subsequent transformation which affects some of the already altered rows or columns will in general make the previously zeroed off-diagonal elements non-zero again. In spite of this, Jacobi's method does converge, as the following considerations show.
Let after the (k−1)-th step the current sums of the squares of the diagonal and the off-diagonal elements be, correspondingly,

$$S_d^{(k-1)} = \sum_i a_{ii}^2 \quad\text{and}\quad S_n^{(k-1)} = \sum_{i\neq j} a_{ij}^2. \quad (36)$$
Then, from relations (34) and the condition $\tilde a_{pq} = 0$ it can be demonstrated that:

$$S_n^{(k)} = S_n^{(k-1)} - 2a_{pq}^2 \quad\text{and}\quad S_d^{(k)} = S_d^{(k-1)} + 2a_{pq}^2. \quad (37)$$

And indeed,

$$S_n^{(k)} = \tilde S_n^{(k-1)} + 2\sum_{i\neq p,q}\left(\tilde a_{ip}^2 + \tilde a_{iq}^2\right) + 2\,\tilde a_{pq}^2 = \tilde S_n^{(k-1)} + 2\left(c^2 + s^2\right)\sum_{i\neq p,q}\left(a_{ip}^2 + a_{iq}^2\right) + 0 = S_n^{(k-1)} - 2a_{pq}^2, \quad (38)$$

where $\tilde S_n^{(k-1)}$ is the sum of squares of those off-diagonal elements which do not belong to rows/columns p and q (so that $S_n^{(k-1)} = \tilde S_n^{(k-1)} + 2\sum_{i\neq p,q}\left(a_{ip}^2 + a_{iq}^2\right) + 2a_{pq}^2$). The second statement in (37) can also be verified directly, but it also follows from the fact that orthogonal transformations preserve the sum of squares of all matrix elements.
In the particular case of the matrix (32) the last assertion can be proved simply. Let x be an arbitrary non-zero vector. For the elements of the vector $\mathbf{y} = \mathbf{P}_{pq}\mathbf{x}$ it can be immediately checked that $y_i = x_i,\ i \neq p,q$; $y_p = c\,x_p + s\,x_q$; $y_q = -s\,x_p + c\,x_q$. Like in (38), it can be shown that $\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} x_i^2$. Similarly it can be demonstrated that the same result holds for $\mathbf{y} = \mathbf{P}_{pq}^{T}\mathbf{x}$. Then, let $\mathbf{B} = \mathbf{P}_{pq}^{T}\mathbf{A}$. From the above it follows that the sum of squares of the matrix elements of B will be equal to the sum of squares of the matrix elements of A. Let now $\mathbf{C} = \mathbf{B}\mathbf{P}_{pq}$, i.e. $\mathbf{C}^{T} = \mathbf{P}_{pq}^{T}\mathbf{B}^{T}$. In the same way it follows that the sum of squares of the matrix elements of $\mathbf{C}^{T}$ will be equal to the sum of squares of the matrix elements of $\mathbf{B}^{T}$. The same, of course, is true for C and B. Looking back at (33), it is seen that C relates to A as $\mathbf{A}^{(k)}$ to $\mathbf{A}^{(k-1)}$ in (33).
Therefore, in the process of orthogonal transformations the ratio of the off-diagonal norm to the diagonal norm will decrease monotonically, and thus the sequence of such ratios will converge to zero.
In practice, after some number of iterations $\mathbf{A}^{(k)}$ will become diagonal within machine precision, and its diagonal elements will hold the eigenvalues of A. The columns of the matrix $\mathbf{V} = \mathbf{P}_{pq}^{(1)}\mathbf{P}_{pq}^{(2)}\ldots\mathbf{P}_{pq}^{(k)}$ will contain the eigenvectors of A, because from $\mathbf{V}^{T}\mathbf{A}\mathbf{V} = \mathbf{D}$, where V is orthogonal and D is diagonal with the same eigenvalues as A, it follows that $\mathbf{A}\mathbf{V} = \mathbf{V}\mathbf{D}$, which for V coincides with the definition (8) of right eigenvectors. This matrix is built up in the process of transformations, $\mathbf{V} \leftarrow \mathbf{V}\mathbf{P}_{pq}^{(i)}$, beginning with $\mathbf{V} = \mathbf{1}$. Or, more specifically:
$$\tilde v_{ip} = c\,v_{ip} - s\,v_{iq}; \qquad \tilde v_{iq} = c\,v_{iq} + s\,v_{ip}, \qquad i = 1,\ldots,n. \quad (39)$$
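A compact numpy sketch of cyclic Jacobi sweeps; for clarity it applies the full rotation matrices of (32) rather than the element-wise updates (34), which is wasteful but easy to check. The test matrix is an arbitrary example:

```python
import numpy as np

def jacobi_eig(A, sweeps=10):
    """Cyclic Jacobi method: rotations (33) with the angle chosen by (35),
    and the eigenvector matrix accumulated as in (39)."""
    Ak = A.astype(float).copy()
    n = Ak.shape[0]
    V = np.eye(n)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(Ak[p, q]) < 1e-14:
                    continue
                # from ctg(2 phi) = (a_qq - a_pp)/(2 a_pq), eq. (35)
                phi = 0.5 * np.arctan2(2.0 * Ak[p, q], Ak[q, q] - Ak[p, p])
                c, s = np.cos(phi), np.sin(phi)
                P = np.eye(n)
                P[p, p] = P[q, q] = c
                P[p, q] = s
                P[q, p] = -s
                Ak = P.T @ Ak @ P        # eq. (33)
                V = V @ P                # eq. (39)
    return np.diag(Ak), V

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.5],
              [2.0, 0.5, 5.0]])
w, V = jacobi_eig(A)
```

A production implementation would update only the affected rows and columns via (34) and store c, s instead of forming P.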
Householder reduction
An alternative to diagonalisation through similarity transformations (which is possible
only for symmetric matrices) is the following two-stage strategy.
− First apply an economic algorithm to reduce the matrix to a simpler form through a finite number of similarity transformations. For symmetric matrices this form is usually tri-diagonal, and for non-symmetric matrices it is the so-called Hessenberg form. (A Hessenberg matrix contains only zeroes below its first subdiagonal or above its first superdiagonal.)
− Then find the eigenvalues and eigenvectors of the reduced matrix. One possible approach is to find (or localise) the zeroes of the characteristic polynomial, followed by refining the values of these zeroes (the eigenvalues of the matrix) and finding their corresponding eigenvectors, e.g. through inverse power iteration (Wielandt's method). Another approach is the so-called QR or QL methods, which will be examined below and which for a symmetric matrix also lead to finding the eigenvectors.
A standard technique for implementing the first stage of this strategy is Householder's method.

The method leads to tri-diagonalisation of a symmetric matrix, or reduction of a non-symmetric matrix to Hessenberg form, after n−2 orthogonal transformations effected by means of the so-called Householder matrices. For a symmetric matrix each orthogonal transformation zeroes a corresponding portion of a column and a row; for a non-symmetric matrix, only of a column.
The Householder matrix has the form $\mathbf{P} = \mathbf{1} - 2\,\mathbf{w}\otimes\mathbf{w}$, where w is a real vector for which $\left|\mathbf{w}\right|^2 = 1$. The symbol ⊗ denotes the so-called outer (or matrix) product: $\left[\mathbf{w}\otimes\mathbf{w}\right]_{ij} = w_i w_j$. This matrix is symmetric and orthogonal, because

$$\mathbf{P}^2 = \left(\mathbf{1} - 2\,\mathbf{w}\otimes\mathbf{w}\right)\left(\mathbf{1} - 2\,\mathbf{w}\otimes\mathbf{w}\right) = \mathbf{1} - 4\,\mathbf{w}\otimes\mathbf{w} + 4\left(\mathbf{w}\otimes\mathbf{w}\right)\left(\mathbf{w}\otimes\mathbf{w}\right) = \mathbf{1} - 4\,\mathbf{w}\otimes\mathbf{w} + 4\,\mathbf{w}\otimes\mathbf{w} = \mathbf{1}, \quad (40)$$

where $\left[\left(\mathbf{w}\otimes\mathbf{w}\right)\left(\mathbf{w}\otimes\mathbf{w}\right)\right]_{ij} = \sum_k w_i w_k w_k w_j = w_i w_j$ because $\left|\mathbf{w}\right|^2 = 1$.
Therefore $\mathbf{P} = \mathbf{P}^{-1}$; but since $\mathbf{P} = \mathbf{P}^{T}$, then $\mathbf{P}^{-1} = \mathbf{P}^{T}$, i.e. P is orthogonal.
Further, P will be used in the form $\mathbf{P} = \mathbf{1} - \dfrac{\mathbf{u}\otimes\mathbf{u}}{H}$, where u is an arbitrary real non-zero vector and $H = \dfrac{1}{2}\left|\mathbf{u}\right|^2$.
Let $\mathbf{u} = \mathbf{x} - \left|\mathbf{x}\right|\mathbf{e}_1$, where $\mathbf{e}_1$ is the first column of the identity matrix and x is some vector. Then:

$$\mathbf{P}\mathbf{x} = \mathbf{x} - \frac{\mathbf{u}\otimes\mathbf{u}}{H}\,\mathbf{x} = \mathbf{x} - \frac{\mathbf{u}\left[\left(\mathbf{x} - \left|\mathbf{x}\right|\mathbf{e}_1\right)\cdot\mathbf{x}\right]}{H} = \mathbf{x} - \frac{\left(\left|\mathbf{x}\right|^2 - \left|\mathbf{x}\right| x_1\right)\mathbf{u}}{H}. \quad (40)$$

Since

$$H = \frac{1}{2}\left|\mathbf{u}\right|^2 = \frac{1}{2}\left(\mathbf{x} - \left|\mathbf{x}\right|\mathbf{e}_1\right)\cdot\left(\mathbf{x} - \left|\mathbf{x}\right|\mathbf{e}_1\right) = \left|\mathbf{x}\right|^2 - \left|\mathbf{x}\right| x_1, \quad (41)$$

then

$$\frac{\left(\left|\mathbf{x}\right|^2 - \left|\mathbf{x}\right| x_1\right)\mathbf{u}}{H} = \mathbf{u} \quad\text{and}\quad \mathbf{P}\mathbf{x} = \mathbf{x} - \mathbf{u} = \left|\mathbf{x}\right|\mathbf{e}_1. \quad (42)$$

This means that the considered matrix P zeroes all elements of the vector x except the first one, which becomes equal to the norm of x.
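A numpy sketch of eqs (40)–(42); the vector is an arbitrary example. (Production codes pick the sign of the $|\mathbf{x}|\mathbf{e}_1$ term to avoid cancellation when x is nearly along $\mathbf{e}_1$; that refinement is ignored here.)

```python
import numpy as np

def householder(x):
    """P = 1 - (u⊗u)/H with u = x - |x| e1 and H = |u|^2 / 2, so that
    P @ x = |x| e1 (eqs 40-42)."""
    x = np.asarray(x, dtype=float)
    u = x.copy()
    u[0] -= np.linalg.norm(x)            # u = x - |x| e1
    H = 0.5 * (u @ u)
    if H == 0.0:                         # x is already a multiple of e1
        return np.eye(len(x))
    return np.eye(len(x)) - np.outer(u, u) / H

x = np.array([3.0, 4.0, 12.0])           # |x| = 13
P = householder(x)
```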
In the context of consecutive similarity transformations, let the first transformation matrix $\mathbf{P}_1$ contain in its first row/column the first row/column of the identity matrix, and below and to the right a Householder matrix $^{(n-1)}\mathbf{P}_1$ of dimension $(n-1)\times(n-1)$, for which the vector x consists of the lower n−1 elements of the first column of A.
It is seen that $\mathbf{P}_1$ will zero the lower n−2 elements in the first column of A; here k denotes the norm of the vector $\left(a_{21},\ldots,a_{n1}\right)$.
It is clear that with a symmetric matrix A the similarity (orthogonal) transformation $\mathbf{A}' = \mathbf{P}_1\mathbf{A}\mathbf{P}_1 = \mathbf{P}_1^{T}\mathbf{A}\mathbf{P}_1$ will zero the corresponding elements of the first row as well.
The matrix $\mathbf{P}_2$ for the next step is composed similarly to $\mathbf{P}_1$, but from a Householder matrix $^{(n-2)}\mathbf{P}_2$ of dimension $(n-2)\times(n-2)$, based on the lower n−2 elements of the second column of A′, supplemented to the left and above with the first two columns/rows of the identity matrix.
The identity block in the upper left corner ensures preservation of the already achieved tri-diagonal form, and the (n−2)-sized Householder matrix $^{(n-2)}\mathbf{P}_2$ tri-diagonalises the second column/row. It is evident that n−2 orthogonal transformations of this type will bring the symmetric matrix A to a tri-diagonal form.
As can easily be concluded, with non-symmetric matrices the elements above the diagonal will not be zeroed, and the transformed matrix A will be of Hessenberg form.
In both cases, if at the next stage the eigenvectors of the reduced matrix are sought, the eigenvectors of A can be obtained from them by applying the resultant transformation matrix $\mathbf{Q} = \mathbf{P}_1\mathbf{P}_2\ldots\mathbf{P}_{n-2}$.
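The n−2 transformations can be sketched with numpy as follows (dense and unoptimised; a real implementation works on subblocks and never forms P explicitly — the matrix here is an arbitrary symmetric example):

```python
import numpy as np

def householder_tridiagonalise(A):
    """n-2 similarity transformations P_k A P_k with embedded Householder
    blocks, bringing a symmetric A to tri-diagonal form."""
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = A[k+1:, k]
        u = x.copy()
        u[0] -= np.linalg.norm(x)           # u = x - |x| e1
        H = 0.5 * (u @ u)
        if H == 0.0:                        # column already in the right form
            continue
        P = np.eye(n)
        P[k+1:, k+1:] -= np.outer(u, u) / H
        A = P @ A @ P                       # P is symmetric and orthogonal
        Q = Q @ P
    return A, Q

A = np.array([[4.0, 1.0, 2.0, 0.5],
              [1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 2.0, 1.0],
              [0.5, 1.0, 1.0, 5.0]])
T, Q = householder_tridiagonalise(A)
```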
Eigenvalue problem for the reduced matrix
The eigenvalues and eigenvectors of the simplified matrix (after a Householder reduction) can be determined by finding the roots of the characteristic polynomial (approximate estimates suffice) and applying Wielandt's method for refining the eigenvalues and/or finding their corresponding eigenvectors.

Thus, for example, for a tri-diagonal symmetric matrix J (resulting from a Householder reduction of a symmetric matrix) the polynomial $p_n(\mu) = \det\left(\mathbf{J} - \mu\mathbf{1}\right)$ can be obtained through expanding the determinant in minors, using the sequence of recurrence relations:

$$p_0(\mu) = 1,\qquad p_1(\mu) = J_{11} - \mu,\qquad p_i(\mu) = \left(J_{ii} - \mu\right)p_{i-1}(\mu) - J_{i,i-1}^2\,p_{i-2}(\mu),\qquad i = 2,3,\ldots,n. \quad (43)$$

In the process of building the polynomial sequence, information is gathered about the intervals which bracket the zeroes of $p_n(\mu)$, so that these zeroes can easily be localised, e.g. by the bisection method.
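A numpy sketch of the recurrence (43) together with a simple bisection on a hand-picked bracket; the matrix and bracket are arbitrary examples (a full implementation would use the sign counts of the polynomial sequence to bracket every zero automatically):

```python
import numpy as np

def p_n(J, mu):
    """det(J - mu*1) for a symmetric tri-diagonal J via recurrence (43)."""
    n = J.shape[0]
    p_prev, p = 1.0, J[0, 0] - mu              # p_0 and p_1
    for i in range(1, n):
        p_prev, p = p, (J[i, i] - mu) * p - J[i, i - 1]**2 * p_prev
    return p

J = (np.diag([2.0, 3.0, 4.0])
     + np.diag([1.0, 1.0], 1) + np.diag([1.0, 1.0], -1))

lo, hi = 0.0, 2.0                              # brackets the smallest zero
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if p_n(J, lo) * p_n(J, mid) <= 0.0:
        hi = mid
    else:
        lo = mid
mu_min = 0.5 * (lo + hi)
```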
The QR algorithm
The QR algorithm is an infinite (like Jacobi's method) sequence of similarity transformations, used to diagonalise a symmetric tri-diagonal matrix or to reduce a non-symmetric matrix to an upper triangular form. In both cases the eigenvalues are the diagonal elements of the transformed matrix, and in the first case the eigenvectors are found in the columns of the resultant transformation matrix. For non-symmetric matrices the eigenvectors are found through inverse power iteration (Wielandt's method); since accurate estimates of the eigenvalues are already available, the convergence of this iteration is rapid. Also, because of the simple form of the transformed matrix, its inversion is significantly facilitated.

The QR algorithm is based on the possibility to represent each matrix as the product $\mathbf{A} = \mathbf{Q}\mathbf{R}$, where Q is an orthogonal matrix and R is an upper triangular matrix.
This possibility is due to the fact that through a sequence of Householder reductions the result $\mathbf{P}_{n-1}\mathbf{P}_{n-2}\ldots\mathbf{P}_1\mathbf{A} = \mathbf{T}\mathbf{A} = \mathbf{R}$ can be obtained, where the $\mathbf{P}_i$ are Householder matrices as defined above. Since they are orthogonal and symmetric, i.e. $\mathbf{P}_i = \mathbf{P}_i^{T} = \mathbf{P}_i^{-1}$, the matrix $\mathbf{Q} = \mathbf{T}^{-1} = \left(\mathbf{P}_{n-1}\ldots\mathbf{P}_1\right)^{-1} = \mathbf{P}_1^{-1}\ldots\mathbf{P}_{n-1}^{-1} = \mathbf{P}_1\ldots\mathbf{P}_{n-1}$ will also be orthogonal. On the other hand, since $\mathbf{T}\mathbf{A} = \mathbf{R}$, then $\mathbf{A} = \mathbf{Q}\mathbf{R}$.

Let now, after finding Q and R, the matrix $\mathbf{A}' = \mathbf{R}\mathbf{Q}$ be formed. Since $\mathbf{R} = \mathbf{Q}^{T}\mathbf{A}$, then $\mathbf{A}' = \mathbf{Q}^{T}\mathbf{A}\mathbf{Q}$ is similar to A (i.e. it has the same eigenvalues, with eigenvectors related through Q). An important circumstance is that this similarity transformation conserves the symmetric, tri-diagonal or Hessenberg form of the matrix.
Thus, the QR algorithm consists of the following infinite iterative procedure:

$$\mathbf{A}_k = \mathbf{Q}_k\mathbf{R}_k; \qquad \mathbf{A}_{k+1} = \mathbf{R}_k\mathbf{Q}_k = \mathbf{Q}_k^{T}\mathbf{A}_k\mathbf{Q}_k; \qquad k = 1,2,\ldots \quad (44)$$

As is seen, this is a direct analogue of the Householder reduction, although with the following two important differences:
− the applied Householder matrices zero all subdiagonal elements (the Householder reduction does not zero the first subdiagonal);
− the orthogonal transformations are infinite in number (with the Householder reduction their number is n−2).
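A numpy sketch of the iteration (44), relying on numpy's own QR factorisation; the symmetric test matrix and iteration count are arbitrary choices:

```python
import numpy as np

def qr_algorithm(A, iters=200):
    """QR iteration (44); accumulates the Q_k so that for a symmetric A
    the eigenvectors can be read off the columns of V."""
    Ak = A.astype(float).copy()
    V = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q                       # A_{k+1} = R_k Q_k = Q_k^T A_k Q_k
        V = V @ Q
    return np.diag(Ak), V

A = np.array([[6.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 1.0]])
w, V = qr_algorithm(A)
```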
The following statement holds, the proof of which is outside the scope of this text:

If A has eigenvalues of different absolute values, i.e. $\left|\lambda_1\right| > \left|\lambda_2\right| > \ldots > \left|\lambda_n\right|$, then as $k\to\infty$, $\mathbf{A}_k$ approaches an upper triangular form (in the symmetric case, a diagonal form), and at that its diagonal elements approach its eigenvalues (the eigenvalues of a triangular matrix always coincide with its diagonal elements). If A has an eigenvalue $\lambda_i$ of multiplicity p, then as $k\to\infty$, $\mathbf{A}_k$ will also converge to an upper triangular matrix with diagonal elements converging to its eigenvalues, except for a diagonal block matrix of order p, the eigenvalues of which converge to $\lambda_i$. Also, the subdiagonal elements of $\mathbf{A}_k$ (in the symmetric case, the superdiagonal elements as well) approach zero like

$$a_{ij}^{(k)} \sim \left(\frac{\lambda_i}{\lambda_j}\right)^{k} \qquad \left(i > j,\ \left|\lambda_i\right| < \left|\lambda_j\right|\right).$$
In the context of the above, convergence can be accelerated through shifting of the eigenvalues, i.e. through applying the QR step to $\mathbf{A}_k - k_k\mathbf{1}$, where $k_k$ is a suitably chosen constant. The convergence rate will then be determined by the ratios $\left|\dfrac{\lambda_i - k_k}{\lambda_j - k_k}\right| < \left|\dfrac{\lambda_i}{\lambda_j}\right|$. The choice of $k_k$ is made on the basis of the current eigenvalue estimates.
Application: Schrödinger equation
Formulation
The Schrödinger equation is a quantum mechanical analogue of the equation of motion
in classical mechanics. Its underpinning can be explained as follows.
The general expression for the amplitude of a plane wave is $\psi(\mathbf{r},t) = A\exp\left(i\,\mathbf{k}\cdot\mathbf{r} - i\omega t\right)$, where k is a wave vector.

De Broglie's wave associated with a free particle is $\psi(\mathbf{r},t) = \exp\left(\dfrac{i}{\hbar}\,\mathbf{p}\cdot\mathbf{r} - i\omega t\right)$, where p is the momentum of the particle, and its energy is $E = \hbar\omega$.
Schrödinger's wave function for a non-relativistic free particle, which is a direct counterpart of de Broglie's wave function, is $\psi(\mathbf{r},t) = \exp\left(\dfrac{i}{\hbar}\,\mathbf{p}\cdot\mathbf{r} - \dfrac{i}{\hbar}\,\dfrac{p^2}{2m}\,t\right)$. (The second term in the exponent follows from $E = \hbar\omega$ and $E = \dfrac{p^2}{2m}$.)
The differentiation of this wave function in time and coordinates yields

$$i\hbar\,\frac{\partial}{\partial t}\psi(\mathbf{r},t) = \frac{p^2}{2m}\,\psi(\mathbf{r},t) \quad\text{and}\quad -\hbar^2\,\frac{\partial^2}{\partial x_i^2}\psi(\mathbf{r},t) = p_i^2\,\psi(\mathbf{r},t); \qquad -\hbar^2\nabla^2\psi(\mathbf{r},t) = p^2\,\psi(\mathbf{r},t). \quad (45)$$

Therefore,

$$i\hbar\,\frac{\partial}{\partial t}\psi(\mathbf{r},t) = -\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r},t) \quad (46)$$
is a differential equation satisfied by this wave function.
Let the particle be in a potential field, i.e. the potential energy of the particle is $V(\mathbf{r})$. Then the kinetic energy of the particle will be $E_k = \dfrac{p^2}{2m} = E - V(\mathbf{r})$, where $E = \hbar\omega$ is the full energy. Let the full energy be everywhere higher than the potential energy.
It is also assumed that the wave function of such a particle will be a superposition of plane waves of the considered form, with a time dependence $\exp\left(-\dfrac{iE}{\hbar}\,t\right)$ and a coordinate dependence $\psi(\mathbf{r}) = \exp\left(\dfrac{i}{\hbar}\,\mathbf{p}\cdot\mathbf{r}\right)$. Then, for each of these waves:

$$i\hbar\,\frac{\partial}{\partial t}\psi(\mathbf{r},t) = E\,\psi(\mathbf{r},t) \quad\text{and}\quad \nabla^2\psi(\mathbf{r},t) = -\frac{p^2}{\hbar^2}\,\psi(\mathbf{r},t) = -\frac{2m\left(E - V(\mathbf{r})\right)}{\hbar^2}\,\psi(\mathbf{r},t). \quad (47)$$
With constant full energy E, the coordinate and time dependences are separable and the second expression in (47) can be applied independently:

$$-\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r},t) = \left(E - V(\mathbf{r})\right)\psi(\mathbf{r},t), \quad\text{or}\quad -\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r},t) + V(\mathbf{r})\,\psi(\mathbf{r},t) = E\,\psi(\mathbf{r},t). \quad (48)$$
This is the time-independent Schrödinger equation for the wave function of a particle
with constant full energy E.
After substituting $E\,\psi(\mathbf{r},t)$ by $i\hbar\,\dfrac{\partial}{\partial t}\psi(\mathbf{r},t)$, the above equation takes the form:

$$i\hbar\,\frac{\partial}{\partial t}\psi(\mathbf{r},t) = -\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r},t) + V(\mathbf{r})\,\psi(\mathbf{r},t). \quad (49)$$

This is the time-dependent Schrödinger equation, in which the full energy E is not explicitly included. It is valid for any time dependence of the full energy, and therefore for any wave function.
It is clear that the time-independent equation will be satisfied by the spatial part of the wave function $f(\mathbf{r})$ alone, $\psi(\mathbf{r},t) = f(\mathbf{r})\exp\left(-\dfrac{iE}{\hbar}\,t\right)$, so that

$$-\frac{\hbar^2}{2m}\,\nabla^2 f(\mathbf{r}) + V(\mathbf{r})\,f(\mathbf{r}) = E\,f(\mathbf{r}), \quad\text{or, after nondimensionalisation:} \quad (50)$$
$$-\nabla^2 f(\mathbf{r}) + V(\mathbf{r})\,f(\mathbf{r}) = E\,f(\mathbf{r}). \quad (51)$$
Since $P(\mathbf{r}) = \left|\psi(\mathbf{r})\right|^2$ is the probability of finding the particle in the vicinity of r, a natural boundary condition will be $\psi(\mathbf{r}) = 0$ far from the region where the potential energy is non-zero.
Solving
1) If the derivatives are approximated by finite differences, the one-dimensional problem will take the form:

$$-\frac{1}{h^2}\,f_{j-1} + \left(\frac{2}{h^2} + V_j\right)f_j - \frac{1}{h^2}\,f_{j+1} = E\,f_j, \quad (52)$$

where $f_i \equiv f(x_i)$, $h \equiv x_{i+1} - x_i = \mathrm{const}$, $V_i \equiv V(x_i)$. Or

$$\mathbf{H}\mathbf{f} = E\,\mathbf{f}, \quad (53)$$

which is a standard eigenvalue problem for the eigenvalues $E_i$ and the eigenvectors $\mathbf{f}_i$; the matrix H is symmetric and tri-diagonal, so that the QR method can be applied directly. In the two- or three-dimensional case the matrix is symmetric and has a band structure.
2) An alternative approach is to expand the solution in basis functions $\varphi_k(\mathbf{r})$, selected on the grounds of an analysis of the physical problem: $f(\mathbf{r}) = \sum_k a_k\,\varphi_k(\mathbf{r})$. Thus:

$$\sum_k a_k\left(-\nabla^2 + V(\mathbf{r})\right)\varphi_k(\mathbf{r}) = E\sum_k a_k\,\varphi_k(\mathbf{r}), \quad (54)$$

or, after multiplying by $\varphi_l^{*}(\mathbf{r})$ and integrating over the problem space (to the boundaries where the boundary conditions are imposed):

$$\sum_k\left[\int d^3r\,\varphi_l^{*}(\mathbf{r})\left(-\nabla^2 + V(\mathbf{r})\right)\varphi_k(\mathbf{r})\right]a_k = E\sum_k\left[\int d^3r\,\varphi_l^{*}(\mathbf{r})\,\varphi_k(\mathbf{r})\right]a_k, \quad\text{i.e.}\quad \mathbf{H}\mathbf{a} = E\,\mathbf{S}\mathbf{a}. \quad (55)$$
If the basis functions are orthonormal, then $S_{ij} = \delta_{ij}$ and the problem is $\mathbf{H}\mathbf{a} = E\,\mathbf{a}$.

The problem $\mathbf{H}\mathbf{a} = E\,\mathbf{S}\mathbf{a}$ is known as a generalised eigenvalue problem. It can be represented in the standard form as $\mathbf{S}^{-1}\mathbf{H}\mathbf{a} = E\,\mathbf{a}$. However, a serious computational difficulty will arise from the fact that even if H and S are symmetric, $\mathbf{S}^{-1}\mathbf{H}$ will not preserve this property.
In the presently considered particular case of a symmetric matrix S the following remedy can be applied. If S is also diagonally dominant, then it will be positive-definite. For symmetric positive-definite matrices there exists the decomposition $\mathbf{S} = \mathbf{L}\mathbf{L}^{+}$, where L is a lower triangular matrix (this decomposition is performed economically by the so-called Cholesky method). Then the matrix G, similar to $\mathbf{S}^{-1}\mathbf{H}$, will be:

$$\mathbf{G} = \mathbf{L}^{+}\left(\mathbf{S}^{-1}\mathbf{H}\right)\left(\mathbf{L}^{+}\right)^{-1} = \mathbf{L}^{+}\left(\mathbf{L}\mathbf{L}^{+}\right)^{-1}\mathbf{H}\left(\mathbf{L}^{+}\right)^{-1} = \mathbf{L}^{-1}\mathbf{H}\left(\mathbf{L}^{+}\right)^{-1} = \mathbf{L}^{-1}\mathbf{H}\left(\mathbf{L}^{-1}\right)^{+}. \quad (56)$$

The equality $\left(\mathbf{L}^{+}\right)^{-1} = \left(\mathbf{L}^{-1}\right)^{+}$, on which the final result in (56) is based, follows from $\mathbf{L}^{-1}\mathbf{L} = \mathbf{1}$: taking the conjugate transpose, $\mathbf{L}^{+}\left(\mathbf{L}^{-1}\right)^{+} = \mathbf{1}$, i.e. $\left(\mathbf{L}^{-1}\right)^{+} = \left(\mathbf{L}^{+}\right)^{-1}$. \quad (57)

Thus, by construction the matrix G is similar to $\mathbf{S}^{-1}\mathbf{H}$, and because of the last equality in the chain (56) it will inherit the symmetry of H. Therefore it is advantageous to replace the problem $\mathbf{S}^{-1}\mathbf{H}\mathbf{a} = E\,\mathbf{a}$ by the problem $\mathbf{G}\mathbf{a} = E\,\mathbf{a}$.
Example
The one-dimensional time-independent Schrödinger equation has the form:

$$-\frac{d^2 f(x)}{dx^2} + V(x)\,f(x) = E\,f(x), \quad (58)$$

and with $V(x) = x^2$:

$$\frac{d^2 f(x)}{dx^2} + \left(E - x^2\right)f(x) = 0. \quad (59)$$
Solutions of this equation with $E_n = 2n + 1,\ n = 0,1,\ldots$ are the so-called Hermite functions $\psi_n(x)$:

$$\frac{d^2\psi_n(x)}{dx^2} + \left(2n + 1 - x^2\right)\psi_n(x) = 0, \quad (60)$$

i.e. they are eigenfunctions for the Schrödinger equation in the considered case, and $E_n = 2n + 1,\ n = 0,1,\ldots$ are their corresponding eigenvalues.

The Hermite functions are:

$$\psi_n(x) = \left(2^n\,n!\,\sqrt{\pi}\right)^{-1/2}\exp\left(-x^2/2\right)H_n(x), \quad (61)$$
where $H_n(x)$ are the Hermite polynomials.
Hermite functions form an orthonormal set:

$$\int_{-\infty}^{\infty}\psi_n(x)\,\psi_m(x)\,dx = \delta_{nm}. \quad (62)$$

Hermite polynomials can be evaluated through the recurrence relation:

$$H_0(x) = 1,\qquad H_1(x) = 2x,\qquad H_{n+1}(x) = 2x\,H_n(x) - 2n\,H_{n-1}(x),\qquad n = 1,2,\ldots \quad (63)$$
Therefore, if method 1), i.e. (52), is implemented with a given spatial discretisation $x_i,\ i = 1,\ldots,N$, then the eigenvalues are expected to converge to $E_n = 2n + 1,\ n = 0,1,\ldots$, and the eigenvectors to $\psi_n(x_i),\ i = 1,\ldots,N$.

In particular, e.g. with $x_1 = -10$, $x_N = 10$ and $N = 400$, the discretisation approach in x (finite differencing) gives the following first 10 eigenvalues:

n     0     1     2     3     4     5     6     7     8     9
E_n   1.00  3.00  5.00  7.00  8.99  11.0  13.0  15.0  17.0  19.0
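The table above can be reproduced with a short numpy script implementing (52) for V(x) = x² (here via a dense symmetric eigensolver rather than QR on the tri-diagonal matrix):

```python
import numpy as np

N = 400
x = np.linspace(-10.0, 10.0, N)               # x_1 = -10, x_N = 10
h = x[1] - x[0]

# tri-diagonal H of eq. (52): 2/h^2 + V_j on the diagonal, -1/h^2 beside it
H = (np.diag(2.0 / h**2 + x**2)
     - np.diag(np.ones(N - 1) / h**2, 1)
     - np.diag(np.ones(N - 1) / h**2, -1))

E = np.linalg.eigvalsh(H)                     # ascending; expect E_n ≈ 2n + 1
```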
The figures below illustrate some of the eigenvectors and their corresponding Hermite
functions.
[figure] Figure 1. The eigenvector and its corresponding Hermite function at n = 0 (the graphic representations fully match), together with $P(x) = \psi(x)^2$.
[figure] Figure 2. The eigenvector and its corresponding Hermite function at n = 3 (the graphic representations fully match), together with $P(x) = \psi(x)^2$.
4. Singular value decomposition
Let $\mathbf{A}_{m\times n}$ be an arbitrary matrix. Since for any vector x the following inequality is valid:

$$\mathbf{x}^{+}\left(\mathbf{A}^{+}\mathbf{A}\right)\mathbf{x} = \left(\mathbf{A}\mathbf{x}\right)^{+}\left(\mathbf{A}\mathbf{x}\right) = \left|\mathbf{A}\mathbf{x}\right|^2 \geq 0, \quad (1)$$

the matrix $\mathbf{A}^{+}\mathbf{A}$ is positive-semidefinite.

(A symmetric (Hermitian) matrix A is said to be positive-definite if for any non-zero vector x the quadratic form $\mathbf{x}^{+}\mathbf{A}\mathbf{x} > 0$. Such a matrix is negative-definite if $\mathbf{x}^{+}\mathbf{A}\mathbf{x} < 0$, and respectively positive- or negative-semidefinite if $\mathbf{x}^{+}\mathbf{A}\mathbf{x} \geq 0$ or $\mathbf{x}^{+}\mathbf{A}\mathbf{x} \leq 0$.)
On the other hand, the maximum and minimum eigenvalues of a Hermitian (symmetric) matrix H, like $\mathbf{A}^{+}\mathbf{A}$, are:

$$\lambda_{\max} = \max_{\mathbf{x}\neq 0}\frac{\mathbf{x}^{+}\mathbf{H}\mathbf{x}}{\mathbf{x}^{+}\mathbf{x}} \quad\text{and}\quad \lambda_{\min} = \min_{\mathbf{x}\neq 0}\frac{\mathbf{x}^{+}\mathbf{H}\mathbf{x}}{\mathbf{x}^{+}\mathbf{x}}. \quad (2)$$

Therefore, the eigenvalues of the matrix $\left(\mathbf{A}^{+}\mathbf{A}\right)_{n\times n}$ are non-negative and can be written in the form $\lambda_i = \sigma_i^2,\ i = 1,\ldots,n$.
The quantities $\sigma_i \geq 0$ are known as the singular values of the matrix A.

It can be shown that for an arbitrary matrix $\mathbf{A}_{m\times n}$ of rank r there exist orthogonal (unitary) matrices $\mathbf{U}_{m\times n}$ and $\mathbf{V}_{n\times n}$ such that $\mathbf{U}^{+}_{n\times m}\mathbf{A}_{m\times n}\mathbf{V}_{n\times n} = \mathbf{\Sigma}_{n\times n}$, where $\mathbf{\Sigma}_{n\times n}$ is a diagonal matrix, the first $r \leq n$ diagonal elements of which, i.e. $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$, are the non-zero singular values of A, and its remaining n−r diagonal elements are zeroes.

From the relation $\mathbf{U}^{+}\mathbf{A}\mathbf{V} = \mathbf{\Sigma}$ it follows that $\mathbf{A}_{m\times n} = \mathbf{U}_{m\times n}\mathbf{\Sigma}_{n\times n}\mathbf{V}^{+}_{n\times n}$.
Singular value decomposition (SVD) consists in finding the matrices Σ, U and V for a given matrix A. The process has two stages.

At the first stage, through a sequence of Householder reductions the matrix A is brought to upper bidiagonal form: $\mathbf{J}_{n\times n} = \mathbf{P}_{n\times m}\mathbf{A}_{m\times n}\mathbf{Q}_{n\times n}$, where $\mathbf{P} = \mathbf{P}_n\mathbf{P}_{n-1}\ldots\mathbf{P}_1$ and $\mathbf{Q} = \mathbf{Q}_1\mathbf{Q}_2\ldots\mathbf{Q}_{n-2}$ are products of Householder matrices. Since P and Q are orthogonal (unitary), then $\mathbf{J}^{+}\mathbf{J} = \mathbf{Q}^{+}\mathbf{A}^{+}\mathbf{P}^{+}\mathbf{P}\mathbf{A}\mathbf{Q} = \mathbf{Q}^{+}\left(\mathbf{A}^{+}\mathbf{A}\right)\mathbf{Q}$. Therefore the matrix $\mathbf{J}^{+}\mathbf{J}$ is similar to $\mathbf{A}^{+}\mathbf{A}$, i.e. it has the same eigenvalues. Thus the singular values of the upper bidiagonal matrix J coincide with those of the matrix A.
At the second stage the superdiagonal elements of J are brought to vanishing magnitude through an infinite iterative procedure analogous to the QR method. The general form of the transformation is $\mathbf{\Sigma}_{n\times n} = \mathbf{S}_{n\times n}\mathbf{J}_{n\times n}\mathbf{T}_{n\times n}$, where Σ is practically diagonal (with negligible off-diagonal elements), and S and T are orthogonal (unitary). In a way analogous to the previous proof, it is shown that the singular values of J and Σ coincide. The resultant decomposition $\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{+}$ is effected through the orthogonal (unitary) matrices $\mathbf{U}_{m\times n} = \mathbf{P}^{+}_{n\times m}\mathbf{S}^{+}_{n\times n}$ and $\mathbf{V}_{n\times n} = \mathbf{Q}_{n\times n}\mathbf{T}^{+}_{n\times n}$.
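The decomposition, and the relation $\lambda_i = \sigma_i^2$ between the eigenvalues of $\mathbf{A}^{+}\mathbf{A}$ and the singular values, can be checked directly with numpy's built-in SVD on a random example matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

U, sigma, Vh = np.linalg.svd(A, full_matrices=False)   # A = U Sigma V^+

evals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]     # eigenvalues of A^+ A
A_rec = U @ np.diag(sigma) @ Vh                        # reconstruction
```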
Application: the least squares problem
Let $y_i,\ i = 1,\ldots,m$ be a set of observations, each with a variance $s_i^2,\ i = 1,\ldots,m$.

Linear model

Let these observations be compared against a model

$$\tilde y_i \equiv \sum_{j=1}^{n} a_j\,X_j(x_i), \quad (3)$$

where $x_i$ are values of the independent variable x, which in some general way describes the measurement conditions, and $X_j(x),\ j = 1,\ldots,n$ are the so-called basis functions of the model. The number of basis functions n is not larger than the number of observations m, and usually $n \ll m$.
The model parameters $a_j,\ j = 1,\ldots,n$ are so chosen as to minimise the deviation between the observations and the respective values predicted by the model. A measure of this deviation is the quantity

$$R^2(\mathbf{a}) \equiv \sum_{i=1}^{m}\left(\frac{y_i - \tilde y_i}{s_i}\right)^2 = \sum_{i=1}^{m}\left(\frac{y_i}{s_i} - \sum_{j=1}^{n}a_j\,\frac{X_j(x_i)}{s_i}\right)^2 = \sum_{i=1}^{m}\left(b_i - \sum_{j=1}^{n}C_{ij}\,a_j\right)^2, \quad (4)$$

where $C_{ij} \equiv \dfrac{X_j(x_i)}{s_i}$ are the elements of a matrix $\mathbf{C}_{m\times n}$, and b is a vector composed of $b_i \equiv \dfrac{y_i}{s_i},\ i = 1,\ldots,m$.
These notes represent the lecture contents in Computational Methods in Nuclear Technology. Errors and inaccuracies are possible. I. Christoskov, January 2015
56/150
The form of $R^2(\mathbf{a})$ follows from the assumption that the observations are mutually independent random variables with Gaussian distributions, with expected values $\tilde y_i$ and variances $s_i^2$. This assumption is in turn based on the central limit theorem in probability theory.

$R^2(\mathbf{a})$ is the squared norm of the residual $\mathbf{R} = \mathbf{C}\mathbf{a} - \mathbf{b}$ of the set of linear equations $\mathbf{C}\mathbf{a} = \mathbf{b}$. This set is usually overdetermined and its residual is expectedly non-zero. The data modelling problem consists in minimising this residual.
A frequently employed method of minimising $R^2(\mathbf{a})$ is through solving the set

$$\frac{\partial R^2(\mathbf{a})}{\partial a_j} = 0, \qquad j = 1,\ldots,n,$$

which leads to the following inhomogeneous linear system for the parameters a:

$$\mathbf{D}\mathbf{a} = \mathbf{b}',$$

where:

$$D_{kj} = \sum_{i=1}^{m}\frac{X_k(x_i)\,X_j(x_i)}{s_i^2}, \qquad b'_j = \sum_{i=1}^{m}\frac{y_i\,X_j(x_i)}{s_i^2}, \qquad k,j = 1,\ldots,n.$$

This is the so-called „method of normal equations”, as discussed e.g. in the lecture notes on Programming and Computational Physics.
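A numpy sketch of the normal equations for a straight-line model $\tilde y = a_1 + a_2 x$; the data points and uncertainties are made up for illustration:

```python
import numpy as np

# weighted linear least squares through the normal equations D a = b'
# model: y ≈ a1 * 1 + a2 * x   (basis functions X1(x) = 1, X2(x) = x)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.0, 8.8])
s = np.array([0.1, 0.1, 0.2, 0.1, 0.3])       # standard deviations s_i

X = np.column_stack([np.ones_like(x), x])     # X_j(x_i), an m x n matrix
W = 1.0 / s**2
D = X.T @ (W[:, None] * X)                    # D_kj = sum_i X_k X_j / s_i^2
bp = X.T @ (W * y)                            # b'_j = sum_i y_i X_j / s_i^2
a = np.linalg.solve(D, bp)
```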
Non-linear model
Let the model compared against the available observations be

$$\tilde y_i \equiv f\left(x_i;\,a_1,\ldots,a_n\right). \quad (3a)$$

The number n of model parameters is not larger than the number of observations m, and usually $n \ll m$.

The quantity $R^2(\mathbf{a})$ of (4) can be represented through linearisation of the model with respect to its parameters, using the Taylor expansion

$$f\left(x_i;\,\mathbf{a}\right) \approx f\left(x_i;\,\mathbf{a}^{(k)}\right) + \sum_{j=1}^{n}\frac{\partial f}{\partial a_j}\left(x_i;\,\mathbf{a}^{(k)}\right)\times\left(a_j - a_j^{(k)}\right),$$

where $\mathbf{a}^{(k)}$ is some known estimate of the sought parameters.
Following the considerations in the linear model case, the task of minimising $R^2(\mathbf{a})$ is solved iteratively, starting with an initial estimate of the sought parameters, and reduces to minimising the squared residuals of sets of the form $\mathbf{C}\,\delta\mathbf{a} = \mathbf{b}$, where:

$$\delta\mathbf{a} \equiv \mathbf{a}^{(k+1)} - \mathbf{a}^{(k)}, \qquad C_{ij} \equiv \frac{1}{s_i}\,\frac{\partial f}{\partial a_j}\left(x_i;\,\mathbf{a}^{(k)}\right), \qquad b_i \equiv \frac{y_i - f\left(x_i;\,\mathbf{a}^{(k)}\right)}{s_i}, \qquad i = 1,\ldots,m,\ j = 1,\ldots,n.$$

The normal equations method is also iterative and consists in solving equation sets of the form

$$\mathbf{D}\,\delta\mathbf{a} = \mathbf{b}',$$

where:

$$D_{lj} = \sum_{i=1}^{m}\frac{1}{s_i^2}\,\frac{\partial f}{\partial a_l}\left(x_i;\,\mathbf{a}^{(k)}\right)\frac{\partial f}{\partial a_j}\left(x_i;\,\mathbf{a}^{(k)}\right), \qquad b'_l = \sum_{i=1}^{m}\frac{y_i - f\left(x_i;\,\mathbf{a}^{(k)}\right)}{s_i^2}\,\frac{\partial f}{\partial a_l}\left(x_i;\,\mathbf{a}^{(k)}\right), \qquad l,j = 1,\ldots,n.$$
Solving the problem
And so, the data modelling problem is expressed in minimising the square $R^2(\mathbf{a})$ of the residual $\mathbf{R}_m = \mathbf{C}_{m\times n}\,\mathbf{a}_n - \mathbf{b}_m$ of the equation set $\mathbf{C}\mathbf{a} = \mathbf{b}$. (It is assumed that this set is generated by a linear model or by a linearised non-linear model; in the latter case the role of the basis functions is played by $\partial f/\partial a_j\left(x;\,\mathbf{a}\right),\ j = 1,\ldots,n$.)

This set is usually overdetermined and has no exact solution. The vector a which minimises the square (or the norm) of the residual is interpreted as an approximate solution of this set.
If the matrix C is subjected to SVD, i.e. the matrices U, Σ and V are found for which $\mathbf{C} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{+}$, then the system can be transformed to a formally determined system with a diagonal matrix in the following way:

$$\mathbf{C}\mathbf{a} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{+}\mathbf{a} = \mathbf{b} \;\rightarrow\; \mathbf{\Sigma}\left(\mathbf{V}^{+}\mathbf{a}\right) = \mathbf{U}^{+}\mathbf{b} \;\rightarrow\; \mathbf{\Sigma}\mathbf{z} = \mathbf{d}, \quad\text{or:} \quad (5)$$

$$\sigma_i z_i = d_i,\quad i = 1,\ldots,r; \qquad 0\cdot z_i = d_i,\quad i = r+1,\ldots,n, \quad (6)$$
where r is the rank of C.
The solution of (6) is $z_j = d_j/\sigma_j,\ j = 1,\ldots,r$, and, formally, $z_j = \infty,\ j = r+1,\ldots,n$. The reverse transition to the original model parameters is obvious: $\mathbf{a} = \mathbf{V}\mathbf{z}$.
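A numpy sketch of this prescription, with the reciprocals of the vanishing singular values replaced by zero (anticipating the $\tilde{\Sigma}^{-1}$ discussed below); the rank-deficient design matrix (third column equal to the sum of the first two) is an artificial example:

```python
import numpy as np

def svd_solve(C, b, eps=1e-12):
    """Minimal-norm least-squares solution a0 = V Sigma~^{-1} U^+ b."""
    U, sigma, Vh = np.linalg.svd(C, full_matrices=False)
    omega = np.zeros_like(sigma)
    keep = sigma > eps * sigma[0]
    omega[keep] = 1.0 / sigma[keep]          # 1/sigma_i, but 0 for sigma_i ~ 0
    return Vh.T @ (omega * (U.T @ b))

C = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0]])
b = np.array([1.0, 2.0, 3.0, 5.0])
a0 = svd_solve(C, b)
```

numpy's own `lstsq` returns the same minimal-norm solution, which makes a convenient cross-check.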
It turns out that all $\sigma_j,\ j = 1,\ldots,n$ are non-zero only when the basis functions are linearly independent at $x_i,\ i = 1,\ldots,m$, i.e. when the vectors $\mathbf{X}_j \equiv \left(X_j(x_i),\ i = 1,\ldots,m\right),\ j = 1,\ldots,n$ are mutually linearly independent.

The potential linear dependence between some of the vectors $\mathbf{X}_j$ can be due either to a deficiency of the model, i.e. „redundant” basis functions (in turn either by design, or because their unique contributions are masked by numerical errors), or to the fact that with the available observations, accounting for their uncertainties, the contributions of some of the basis functions cannot be distinguished.
If the rank of C is smaller than the number of model parameters (because of the discussed essential or effective linear dependence between some basis functions), the choice of the respective $z_j$, $j = r+1,\dots,n$ in (6) is not unique, and hence the solution $a$ of the least squares problem, i.e. the solution which minimises the squared residual of the set $Ca = b$, is not unique.
In this case it is desirable to choose, among all possibilities, the model parameter set $a$ with the smallest norm $\|a\| \equiv \left(\sum_{i=1}^{n} a_i^2\right)^{1/2}$. This is so because otherwise some of the basis functions would enter the model with excessively large contributions $a_i$ which would ultimately cancel each other. A result of this kind is always less stable numerically and, which is worse, would mislead about the contribution of certain factors through which the observations are attempted to be explained.
It will be chosen below that $z_j = 0$, $j = r+1,\dots,n$ is substituted in (6), i.e. the matrix $\Sigma^{-1}$ is replaced by $\tilde{\Sigma}^{-1} = \mathrm{diag}\left(\omega_i,\; i = 1,\dots,r;\; 0,\; i = r+1,\dots,n\right)$ where $\omega_i \equiv 1/\sigma_i$, and the solution of $Ca = b$ is evaluated as $a_0 = V\tilde{z} = V\tilde{\Sigma}^{-1}d = V\tilde{\Sigma}^{-1}U^{+}b$. Then:
a) The solution $a_0$ has the smallest norm among those which minimise the norm of the residual of $Ca = b$;
b) The solution $a_0$ ensures the smallest norm of the residual of $Ca = b$.
The first statement is proved as follows.
1. Any solution of $Ca = b$ can be represented in the form:

$a = Vz = V\Sigma^{-1}d = V\Sigma^{-1}U^{+}b = \sum_{i=1}^{n}\omega_i\left(U_{(i)}\cdot b\right)V_{(i)}$, (7)

where $\omega_i \equiv 1/\sigma_i$, and $U_{(i)}$ and $V_{(i)}$ are correspondingly the i-th columns of U and V.
That is, $a$ is a linear combination of the columns of V with coefficients $\omega_i\left(U_{(i)}\cdot b\right)$.
This result is verified as follows. First, the matrix-vector product $y = Ax$ can be represented as $y = \sum_{i=1}^{n}x_i A_{(i)}$, where $A_{(i)}$, $i = 1,\dots,n$ are the columns of A. Thus $a = Vz = \sum_{i=1}^{n}z_i V_{(i)}$, where $z_i$ are the elements of the vector $z = \Sigma^{-1}U^{+}b$. Second, the product of a diagonal matrix and a vector, $\mathrm{diag}(x_i)\,y$, is the vector with elements $x_i y_i$. That is, $\Sigma^{-1}U^{+}b$ is a vector with elements $z_i = \omega_i x_i$, where $x_i$ are the elements of the vector $U^{+}b$:

$x_i = \sum_{j=1}^{m}U^{+}_{ij}b_j = \sum_{j=1}^{m}U_{ji}b_j = U_{(i)}\cdot b$.

Thus, finally: $a = V\Sigma^{-1}U^{+}b = \sum_{i=1}^{n}\omega_i\left(U_{(i)}\cdot b\right)V_{(i)}$.
2. Important properties of SVD, the proof of which is outside the scope of this text, are:
− The columns of V with sequential numbers coinciding with those of the zero singular values form a complete basis in the subspace of vectors $x \neq 0$ for which $Cx = 0$ (this subspace is known as the kernel (or nullspace) of the matrix);
− The columns of U with sequential numbers coinciding with those of the non-zero singular values form a complete basis in the subspace of vectors $y$ for which $y = Cx \neq 0$ with $x \neq 0$ (this subspace is known as the image (or range) of the matrix).
3. Therefore, if C has zero singular values, then $a$ can be written in the form:

$a = \sum_{i=1}^{r}\alpha_i V_{(i)} + \sum_{i=r+1}^{n}\beta_i V_{(i)} = a_0 + \delta$, (8)
where the form of $\alpha_i$ and $\beta_i$ is according to (7). At that, for the (generally non-zero) vector $\delta$ it is always true that $C\delta = 0$ (this follows from the above statements about the corresponding columns of V), and thus $Ca = Ca_0$ with an arbitrary choice of $\delta$ through the $\beta_i$. That is, both $a_0$ and any other vector $a$ of the form (8) are solutions to the set $Ca = b$.
From (5) through (7) it follows that, formally, $\beta_i = \infty$, $i = r+1,\dots,n$. By virtue of the freedom of choice for $\delta$ through the $\beta_i$, let $\delta = 0$ be selected by setting $\beta_i = 0$, $i = r+1,\dots,n$, i.e. $\omega_i \equiv 1/\sigma_i = 0$ when $\sigma_i = 0$ (!). As will be shown immediately below, this choice ensures the minimum norm of $a$.
4. And so, let for the purpose of this proof $\delta$ be some non-zero vector for which $C\delta = 0$, and which consequently can be represented as $\delta = \sum_{i=r+1}^{n}\gamma_i V_{(i)}$. Let also $\tilde{\Sigma}^{-1} \equiv \mathrm{diag}(\omega_i)$ be with zero elements where the singular values are zero, i.e. where $\sigma_i = 0$. Then the norm of any of the possible vectors $a$ (model parameter sets in the context of the examined problem) will be:

$\|a\| = \|a_0 + \delta\| = \left\|V\tilde{\Sigma}^{-1}U^{+}b + \delta\right\| = \left\|V\left(\tilde{\Sigma}^{-1}U^{+}b + V^{+}\delta\right)\right\| = \left\|\tilde{\Sigma}^{-1}U^{+}b + V^{+}\delta\right\|$, (9)
The second equality from the right is based on the orthogonality of V, and the last equality on the fact that orthogonal matrices preserve the vector norm: $\|Vz\| = \|z\|$. Also, from the choice of $\tilde{\Sigma}^{-1}$ it immediately follows that $V\tilde{\Sigma}^{-1}U^{+}b = a_0$, where $a_0$ is according to (8).
Because of the mentioned choice of $\delta$, $V^{+}\delta$ can be represented as:

$\mathrm{col}_j\left(V^{+}\delta\right) = \mathrm{col}_j\left(\sum_{i=r+1}^{n}\gamma_i V^{+}V_{(i)}\right) = \sum_{i=r+1}^{n}\gamma_i\left(V_{(j)}\cdot V_{(i)}\right) = 0$ for $j = 1,\dots,r$, and $= \gamma_j$ for $j = r+1,\dots,n$,

i.e. the vector $V^{+}\delta$ has zero elements if the respective singular values are non-zero and potentially non-zero elements if the respective singular values are zero.
With accounting of the form of $\tilde{\Sigma}^{-1}$ it is seen that the vector $\tilde{z} = \tilde{\Sigma}^{-1}U^{+}b$ has the following structure:

$\mathrm{col}_j(\tilde{z}) = \mathrm{col}_j\left(\tilde{\Sigma}^{-1}U^{+}b\right) = \omega_j\left(U^{+}b\right)_j$ for $j = 1,\dots,r$, and $= 0$ for $j = r+1,\dots,n$,

i.e. the vector $\tilde{z}$ has zero elements if the respective singular values are zero and potentially non-zero elements if the respective singular values are non-zero.
By the way, the vector $\tilde{z}$ is actually the solution $z$ of the set (6) for which the choice $z_i = 0$, $i = r+1,\dots,n$ is made. Here this is achieved automatically through the substitution $\omega_i \equiv 1/\sigma_i = 0$ when $\sigma_i = 0$.
5. From the above analysis of (9) it can be concluded that the norm of $a = a_0 + \delta$ is formed by the sum of the squares of the non-zero elements of $\tilde{z}$ and of $V^{+}\delta$, taken separately (i.e. there is no instance of summing the corresponding vector elements before the squaring operation, since their non-zero positions do not overlap). The conclusion from this result, in its turn, is that the smallest norm of $a = a_0 + \delta$ is achieved with $\gamma_i = 0$, $i = r+1,\dots,n$, i.e. with $\delta = 0$.
Thus it is finally proved that the substitution $\omega_i \equiv 1/\sigma_i = 0$ when $\sigma_i = 0$ ensures the smallest norm of the model parameter set $a$, and that this solution with the smallest norm is $a_0$, which is obtained from the solution of the set (6) by setting $z_i = 0$, $i = r+1,\dots,n$.
In the context of the discussion about the reasons for the emergence of zero singular values (including almost-zero values because of an effective linear dependence between the basis functions over the grid of points in which the model is evaluated), a good practice in applying SVD is to choose a threshold $\tau$ for the singular values, depending on the accuracy of the observations and of the arithmetic operations needed for evaluating the model. Thus, if $\sigma_j > \tau$, then $z_j = d_j/\sigma_j$, and if $\sigma_j \leq \tau$, then $z_j = 0$.
From the general logic of the problem it follows that the emergence of zero or practi-
cally zero singular values demands a reformulation of the model (removal or redefinition of
certain basis functions).
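A minimal numeric sketch of this thresholded solution (the function name `svd_solve` is illustrative, not from the notes):

```python
import numpy as np

def svd_solve(C, b, tau):
    """Minimal-norm least-squares solution of C a = b: singular values
    not exceeding the threshold tau are treated as zero, i.e. the
    corresponding omega_i = 1/sigma_i are replaced by 0."""
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    omega = np.zeros_like(sigma)
    keep = sigma > tau
    omega[keep] = 1.0 / sigma[keep]
    d = U.T @ b                # d = U+ b
    return Vt.T @ (omega * d)  # a0 = V Sigma~^-1 U+ b

# Rank-deficient example: the third column equals the sum of the first
# two, so one singular value is numerically zero.
C = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 1.0, 2.0, 3.0])
a0 = svd_solve(C, b, tau=1e-8)
```

The result agrees with numpy's pseudo-inverse-based minimal-norm solution; any $a_0 + \delta$ with $\delta$ in the nullspace of C gives the same residual but a larger norm.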
The second statement, namely that SVD ensures the smallest norm (square) of the residual $R_0 \equiv Ca_0 - b$ of the set $Ca = b$, is proved as follows.
1. Let now $\delta$ be an arbitrary non-zero vector for which $C\delta = g \neq 0$. Therefore, $g$ can be represented as $g = \sum_{i=1}^{r}\gamma_i U_{(i)}$. Let $a = a_0 + \delta$, where $a_0$ is the above-commented solution with smallest norm. The norm of the residual of $Ca = b$ in this case will be:

$\|R\| = \|Ca - b\| = \left\|U\Sigma V^{+}\,V\tilde{\Sigma}^{-1}U^{+}b - b + g\right\| = \left\|U\left(\Sigma\tilde{\Sigma}^{-1}U^{+}b - U^{+}b + U^{+}g\right)\right\| = \left\|\left(\Sigma\tilde{\Sigma}^{-1} - 1\right)U^{+}b + U^{+}g\right\|$ (10)
2. Since $\mathrm{col}_j\left(\Sigma\tilde{\Sigma}^{-1}\right) = 1$ for $j = 1,\dots,r$ and $= 0$ for $j = r+1,\dots,n$, the matrix $\Sigma\tilde{\Sigma}^{-1} - 1$ will have non-zero elements only where the corresponding singular values are zero. Therefore, the vector $\left(\Sigma\tilde{\Sigma}^{-1} - 1\right)U^{+}b$ will have potentially non-zero elements only in the positions of the zero singular values.
On the other hand, because of the already mentioned choice of $g$, $U^{+}g$ can be represented as follows:

$\mathrm{col}_j\left(U^{+}g\right) = \mathrm{col}_j\left(\sum_{i=1}^{r}\gamma_i U^{+}U_{(i)}\right) = \sum_{i=1}^{r}\gamma_i\left(U_{(j)}\cdot U_{(i)}\right) = \gamma_j$ for $j = 1,\dots,r$, and $= 0$ for $j = r+1,\dots,n$,

i.e. the vector $U^{+}g$ has potentially non-zero elements only if the corresponding singular values are non-zero.
The conclusion from this analysis of (10) is that the norm of the residual of the set is formed from the sum of the squares of the non-zero elements of $\left(\Sigma\tilde{\Sigma}^{-1} - 1\right)U^{+}b$ and of $U^{+}g$, taken separately (i.e. there is no instance of summing the corresponding vector elements before the squaring operation). The conclusion from this result, in its turn, is that the smallest residual norm is achieved with $\gamma_i = 0$, $i = 1,\dots,r$, i.e. with $g = 0$ and hence $\delta = 0$. This smallest norm will be $\left\|R(a_0)\right\|^2 = \sum_{j=r+1}^{n}d_j^2$, where $d = U^{+}b$.
Thus it was proved that the solution $a_0$ selected above, which has the smallest norm, also ensures the smallest norm of the residual of the set $Ca = b$, i.e. this set is solved in the least-squares sense.
In summary, the usefulness of SVD and the introduction of a suitably chosen positive
threshold τ is as follows:
• The occurrence of zero or effectively zero singular values is an indicator which can
provide guidelines for improving or simplifying the model;
• The accuracy of finding the parameters $a_j$ is improved. This is so because the ratio $\sigma_{\max}/\sigma_{\min}$ can be regarded as a condition number $\mathrm{cond}(C)$ of the matrix C. Since

$\frac{\|\delta a\|}{\|a\|} \leq \mathrm{cond}(C)\,\frac{\|\delta b\|}{\|b\|}$,

where $\delta a$ is the vector of errors of the model parameters and $\delta b$ is the vector of errors of the observations, the condition number of the coefficient matrix is a measure of the sensitivity of the solution to errors in the vector of constant terms, i.e. typically to errors in the input data. (It can be shown that the condition number is also a measure of the sensitivity of the solution to errors in the matrix elements.) Thus, the zeroing of those $\sigma_j$ which are smaller than $\tau$ reduces the condition number to $\sigma_{\max}/\tau$. Since the condition number is essentially the error amplification factor, this reduction improves the reliability of determining the sought parameters $a_j$.
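The effect can be illustrated numerically (a constructed example, not from the notes): with a data error aimed along the direction of a tiny singular value, the full pseudo-inverse amplifies it by $1/\sigma_{\min}$, while the truncated one does not.

```python
import numpy as np

rng = np.random.default_rng(1)
Q1, _ = np.linalg.qr(rng.normal(size=(8, 4)))   # 8x4 with orthonormal columns (plays U)
Q2, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # 4x4 orthogonal (plays V)
sigma = np.array([5.0, 1.0, 1e-2, 1e-10])
C = Q1 @ np.diag(sigma) @ Q2.T

a_true = np.array([1.0, 2.0, -1.0, 0.5])
b = C @ a_true
db = 1e-8 * Q1[:, 3]                            # data error along the worst direction

pinv_full = Q2 @ np.diag(1.0 / sigma) @ Q1.T    # keeps omega_4 = 1/1e-10
mask = sigma > 1e-6                             # threshold tau = 1e-6
pinv_trunc = Q2 @ np.diag(np.where(mask, 1.0 / sigma, 0.0)) @ Q1.T

err_full = np.linalg.norm(pinv_full @ (b + db) - a_true)    # amplified: ~1e-8/1e-10
err_trunc = np.linalg.norm(pinv_trunc @ (b + db) - a_true)  # bounded by |a_true|
```

The truncated solution trades a bounded bias (the dropped component of a_true) for the removal of a huge error amplification.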
It can additionally be shown that the variances of the model parameters are given by the expression:

$\sigma^2(a_j) = \sum_{i=1}^{n}\left(\frac{V_{ji}}{\sigma_i}\right)^2, \quad j = 1,\dots,n$ (12)

(Here, same as above, $\sigma_i$ are the diagonal elements of the matrix Σ.)
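Expression (12) can be checked against the standard covariance formula for a well-conditioned problem (`parameter_variances` is an illustrative helper name, and unit observation weights are assumed):

```python
import numpy as np

def parameter_variances(C):
    """Parameter variances from the SVD factors:
    var(a_j) = sum_i (V_ji / sigma_i)^2."""
    U, sigma, Vt = np.linalg.svd(C, full_matrices=False)
    V = Vt.T
    return ((V / sigma) ** 2).sum(axis=1)   # divides column i of V by sigma_i

# For a well-conditioned design matrix this must agree with the
# diagonal of the usual covariance matrix (C^T C)^-1.
C = np.vander(np.linspace(0.0, 1.0, 12), 3)   # quadratic model, 12 nodes
var_svd = parameter_variances(C)
var_ref = np.diag(np.linalg.inv(C.T @ C))
```

The agreement follows from $(C^{T}C)^{-1} = V\,\mathrm{diag}(\sigma_i^{-2})\,V^{+}$.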
Example: Analysis of a gamma spectrum
The model is
$\tilde{y}(x;a) \equiv b_1 x + b_2 + \sum_{k=1}^{N}A_k\exp\left(-\frac{(x - c_k)^2}{2\sigma_k^2}\right)$,

where:
− x is a measure of the energy of gamma-quanta;
− $b_1 x + b_2$ is the background;
− N is the number of instrument lines;
− $A_k$, $c_k$ and $\sigma_k$ are respectively the amplitudes, positions and standard deviations of those lines.
The line full width at half maximum (FWHM) is related to the standard deviation as follows: $\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma$. The relation between amplitude A and area S is: $S = \sqrt{2\pi}\,A\sigma$.
The model is obviously non-linear and the equations for its parameters are of the form explained in the respective section above. This model is simplified and in particular relies on the assumption that the non-uniform energy sensitivity of the detector is accounted for in advance.
The application of SVD to this problem is illustrated below. The choice is m = 100, $x_i = i$, $i = 1,\dots,m$, N = 2, and the observations are synthesised with the parameter values from Table 1 in the following way:

$y_i = \tilde{y}(x_i;a) + \sqrt{\tilde{y}(x_i;a)}\;\xi_i, \quad i = 1,\dots,m$,

where $\xi_i$ is a random number sampled from the standard Gaussian distribution N(0, 1), i.e. $y_i \in N\left(\tilde{y}(x_i;a),\,\tilde{y}(x_i;a)\right)$.
This way of synthesising the observations corresponds to the meaning of $y_i$ as a number of registered gamma-quanta with an energy corresponding to the i-th channel; this number has a random value sampled from the Poisson distribution, which at a sufficiently large mathematical expectation converges to the normal (Gaussian) distribution with a variance equal to the expectation.
Table 1. Model parameters

k    z_k    b_k       A_k     c_k    σ_k
1    300    -2.02     8000    41     5
2    100    302.02    9000    55     5

$z_k$, $k = 1, 2$ are the background levels at $x_1$ and $x_m$, and the coefficients $b_k$, $k = 1, 2$ are obtained through linear interpolation between these two values.
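A sketch of this synthesis with the Table 1 values (function and variable names are illustrative, not from the notes):

```python
import numpy as np

def y_model(x, b1, b2, A, c, s):
    """Linear background plus N Gaussian lines, as in the model above."""
    y = b1 * x + b2
    for Ak, ck, sk in zip(A, c, s):
        y = y + Ak * np.exp(-((x - ck) ** 2) / (2.0 * sk ** 2))
    return y

rng = np.random.default_rng(42)
x = np.arange(1, 101, dtype=float)    # m = 100 channels, x_i = i
mu = y_model(x, -2.02, 302.02, [8000.0, 9000.0], [41.0, 55.0], [5.0, 5.0])
# Noise with variance equal to the expectation (Gaussian limit of Poisson).
y_obs = mu + np.sqrt(mu) * rng.standard_normal(x.size)
```

At channel 1 the expectation is essentially the background level 300; the first line peaks near 8000 counts above the background at channel 41.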
The initial parameter estimates with which the iterative minimisation of $R^2$ is started, as well as the parameter estimates during the minimisation process, are shown in Table 2. The course of minimisation of $R^2$ is illustrated in Figures 1 and 2.
Table 2. Minimisation steps. k is the step sequential number. k = 0 refers to the initial parameter estimates.

k    R²       z1      z2      A1       c1     σ1     A2        c2     σ2
0    202145   0       0       10000    35     1      10000     50     1
1    150419   459.4   145.5   2913.5   35.3   1.50   5024.3    50.1   1.98
2    53074    412.2   137.5   3314.0   37.2   4.10   4885.1    51.0   6.19
3    33696    285.7   88.9    6199.1   42.7   6.20   8185.6    56.0   6.62
4    26726    320.7   101.6   3882.0   34.9   2.74   11508.6   50.4   7.34
5    10924    280.8   74.1    2780.4   37.3   4.22   8884.3    50.6   7.65
6    39229    305.9   102.7   9307.4   45.8   8.28   7809.6    56.3   4.75
7    11855    306.9   121.2   6975.9   42.4   6.25   9116.5    57.0   4.88
8    744.6    308.6   96.5    7776.6   40.5   4.73   8950.9    55.0   5.25
9    96.6     301.8   98.4    7961.4   41.0   5.00   9045.6    55.0   5.02
10   93.3     301.4   98.2    8008.1   41.0   5.02   9021.9    55.1   4.99
11   93.3     301.4   98.2    8008.3   41.0   5.02   9022.1    55.1   4.99
Figure 1. R²(a) in the course of minimisation (R² plotted against the step number k; plot omitted).
Figure 2. Consecutive steps of the minimisation of R²(a).
The advantages of SVD over the normal equations method are better exhibited when the model is in large discrepancy with the observations.
Let, in a modification of the previous example, the observations consist only of the second line, whereas the model remains with two lines and its initial parameter estimates are the same as in the previous example.
In this case the course of iterations through SVD is shown in Table 3, and the initial and final steps are illustrated in Figure 3. It is seen that the implausible first line can easily be identified and excluded from the model, and even without doing this the remaining parameters are found accurately enough. Also, in this case three of the singular values effectively approach zero in the course of iterations, whereas in the previous example all singular values remain significantly non-zero throughout the iteration process.
Table 3. Minimisation steps through SVD. k is the sequential step number. k = 0 refers to the initial parameter estimates.

k    R²       z1      z2      A1       c1       σ1       A2       c2     σ2
0    820110   0.0     0.0     10000    35       1.0      10000    50     1.0
1    82627    376.3   144.0   -44.5    35.0     0.99     4143.8   50.3   1.70
...  ...      ...     ...     ...      ...      ...      ...      ...    ...
8    94.4     374.7   147.9   -280.3   -751.8   -459.2   9027.7   55.0   5.01
Figure 3. Initial and final step of the minimisation of R²(a) through SVD.
The course of iterations with the “normal equations” method is shown in Table 4, and
the initial and final steps are illustrated in Figure 4. The iterations are stopped at step 7 when
the matrix of the linear system becomes singular.
Table 4. Minimisation steps through solving the "normal equations". k is the sequential number of the iteration step. k = 0 refers to the initial parameter estimates.

k    R²         z1       z2       A1       c1        σ1       A2       c2     σ2
0    820110     0.0      0.0      10000    35        1.0      10000    50     1.0
1    82622      376.2    144.1    -43.1    35.0      0.99     4145.1   50.3   1.70
...  ...        ...      ...      ...      ...       ...      ...      ...    ...
6    1.44E+07   6553.1   5821.1   259613   70475.4   3840.1   9011.2   55.1   4.99
Figure 4. Initial and final step of the minimisation of R²(a) through solving the "normal equations".
5. Orthogonal polynomials. Approximation of functions.
Gaussian quadrature
Approximation of functions
A typical problem in computational modelling is the approximation of a known function $f(x)$ by a combination (most often linear) of functions belonging to a suitably chosen class. A frequently preferred class is $\{p_n(x)\}$, where $p_n(x)$ are polynomials of degree $n = 0, 1, 2,\dots$ An especially common instance of polynomial approximation is the Taylor series expansion. Another standard class of basis functions, evidently related to the Fourier transform, are $\sin nx,\ \cos nx$, $n = 0, 1,\dots$ Still another popular choice are the exponential functions. Many other sets of linearly independent functions with suitable properties are also used for the purpose of function approximation.
Linear least squares
A customary measure of the proximity between the approximating and the approximated functions is the weighted sum of the squared differences between the respective functional values. More specifically, let $f(x)$ be the approximated function and $x_i$, $i = 1,\dots,n$ be a sequence of chosen values of the independent variable (nodes) at which the values of $f(x)$ are known (e.g. observed), usually with some uncertainty. Let $f(x_i)$ be the exact values of $f(x)$ at $x_i$, and let the corresponding observed values be $f_i$. Let $\Phi_j(x)$, $j = 0, 1,\dots$ be a set of basis functions defined at each $x_i$; the approximating function at the nodes is $\tilde{f}_i \equiv \sum_{j=0}^{m}a_j\Phi_j(x_i)$. Then the coefficients $a_j$ must be determined so as to minimise the quantity

$S^2(a) = \sum_{i=1}^{n}w(x_i)\left(f_i - \sum_{j=0}^{m}a_j\Phi_j(x_i)\right)^2 \equiv \sum_{i=1}^{n}w(x_i)\,R_i^2$, (1)

where $w_i \equiv w(x_i)$ are suitable weights associated with the corresponding observations and/or nodes.
The least squares measure, although most frequently applied, is not an exclusive option in solving function approximation problems. Another standard possibility is to minimise the quantity $\max_x R(x) \equiv \max_x\left|f(x) - \tilde{f}(x)\right|$ over the interval where the approximation is constructed.
Despite the existence of different ways of defining the function approximation problem, here only the case of polynomial approximation in the least squares sense will be considered.
A standard analytical approach to the minimisation of $S^2(a)$ is to find a solution of the set of equations

$\frac{\partial S^2}{\partial a_k}(a) \equiv -2\sum_{i=1}^{n}w_i\left(f_i - \sum_{j=0}^{m}a_j\Phi_j(x_i)\right)\Phi_k(x_i) = 0, \quad k = 0,\dots,m$. (2)

Or:

$\sum_j g_{kj}a_j = \rho_k$, where (3)

$g_{kj} \equiv \sum_{i=1}^{n}w_i\,\Phi_k(x_i)\,\Phi_j(x_i)$ and $\rho_k \equiv \sum_{i=1}^{n}w_i f_i\,\Phi_k(x_i)$. (4)
The following remarks can be made about this system of m + 1 linear equations:
• If m + 1 = n, its solution $a$ leads to $S^2(a) = 0$, i.e. the function $\tilde{f}(x) = \sum_j a_j\Phi_j(x)$ interpolates between the data points $x_i, f_i$, $i = 1,\dots,n$. If m + 1 > n the problem is ill-posed and no unique solution $a$ can be found.
• The matrix $G \equiv (g_{kj})$ is symmetric. If the basis functions $\Phi_j(x)$, $j = 0, 1,\dots$ are orthogonal over the set of data points $x_i$, $i = 1,\dots,n$ with a weight function $w(x)$, i.e.

$\sum_{i=1}^{n}w(x_i)\,\Phi_k(x_i)\,\Phi_j(x_i) = 0,\; k \neq j; \quad \neq 0,\; k = j$,

then the matrix G will be diagonal.
• If the matrix G is not diagonal and the basis functions are polynomial, then G may tend to be ill-conditioned and consequently the roundoff errors in finding the solution $a$ may become inadmissibly high. Indeed, let for simplicity $\Phi_j(x) = x^j$, $w(x) = 1$ and all nodes $x_i$ be equidistant in the interval [0, 1]. Then, with large n,

$g_{kj} \equiv \sum_{i=1}^{n}x_i^{k+j} \approx n\int_0^1 x^{k+j}\,dx = \frac{n}{k+j+1}$, i.e. $G = nH$ where:
$H = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \dots & \frac{1}{m+1} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \dots & \frac{1}{m+2} \\ \dots & \dots & \dots & \dots & \dots \\ \frac{1}{m+1} & \dots & \dots & \dots & \frac{1}{2m+1} \end{pmatrix}$.
(the factor n arises from the quadrature expression $\int_0^1 x^{k+j}\,dx \approx \sum_{i=1}^{n}x_i^{k+j}\,\Delta x$ where $\Delta x = 1/n$)
It is seen that with the increase of m the determinant of H will decrease, i.e. the matrix will approach singularity, and the elements of its inverse will grow in absolute value. Thus, for example, with m = 9, $H^{-1}$ will have elements of the order of $3\times 10^{12}$. Accounting for the fact that $a = G^{-1}\rho = n^{-1}H^{-1}\rho$ and that the elements of $\rho$ will contain inevitable roundoff errors, it is evident that the errors in $a$ will become unacceptably high.
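The growth of cond(H) is easy to check numerically (an illustration, not part of the notes; `hilbert` is a helper defined here):

```python
import numpy as np

def hilbert(m):
    """(m+1) x (m+1) Hilbert matrix, H_kj = 1/(k + j + 1), k, j = 0..m."""
    k = np.arange(m + 1)
    return 1.0 / (k[:, None] + k[None, :] + 1)

for m in (3, 6, 9):
    print(m, np.linalg.cond(hilbert(m)))   # grows by orders of magnitude with m
```

With m = 9 the condition number is of the order of 10^13, consistent with inverse elements of the order 3·10^12 quoted above.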
Orthogonal polynomials
The above considerations show that it is especially desirable that the polynomial basis functions be constructed as orthogonal over the set of nodes $x_i$, $i = 1,\dots,n$, i.e.

$\sum_{i=1}^{n}w_i\,p_j(x_i)\,p_k(x_i) = 0, \quad j \neq k$. (5)

Then the system of equations for the coefficients $a$ will take the form:

$d_k a_k = \omega_k, \quad k = 0,\dots,m$, where (6)

$d_k = \sum_{i=1}^{n}w_i\,p_k^2(x_i)$ and $\omega_k \equiv \sum_{i=1}^{n}w_i f_i\,p_k(x_i)$. (7)

The solution of this system, $a_k = \omega_k/d_k$, $k = 0,\dots,m$, is found immediately, and all difficulties arising from the ill-conditioned matrix G as per the above discussion are avoided. (Essentially this approach can in part be regarded as an analogue of the SVD technique in the particular case of linear data modelling with polynomial basis functions.)
Also, if m is replaced by m + 1, it will be sufficient to complement the already existing solution with $a_{m+1} = \omega_{m+1}/d_{m+1}$. (With a non-diagonal matrix G, the matrix would have to be completely re-evaluated and a new linear system for the entire set $a$ would have to be solved.) This
circumstance simplifies the choice of a degree M for the approximation. Indeed, if the approximated function $f(x)$ is exactly (or almost exactly) a polynomial of degree M and the approximation is of degree m > M, then the values of $a_j$, $j = M+1,\dots,m$ should statistically approach zero. The above assertion is verified directly. First, the orthogonal polynomials are mutually linearly independent (this is straightforwardly demonstrated through the way of their construction as described further below), so that any polynomial can be represented as a linear combination of them. Thus, if the approximated function $f(x)$ is exactly a polynomial of degree M, it can be represented as $f(x) = \sum_{k=0}^{M}\alpha_k p_k(x)$. Let also the observed function values be exact, i.e. $f_i = f(x_i)$. Then, according to (7):
$\omega_m \equiv \sum_{i=1}^{n}w_i f_i\,p_m(x_i) = \sum_{i=1}^{n}w_i\left(\sum_{k=0}^{M}\alpha_k p_k(x_i)\right)p_m(x_i) = \sum_{k=0}^{M}\alpha_k\sum_{i=1}^{n}w_i\,p_k(x_i)\,p_m(x_i) = \alpha_m d_m$ for $0 \leq m \leq M$, and $= 0$ for $m > M$.
With this the assertion is proved for exact observations. If these contain uncertainties, then by virtue of the central limit theorem in probability theory (insofar as the prerequisites for its validity are fulfilled) the observations will be normally distributed around the exact function values: $f_i \in N\left(f(x_i), \sigma_i\right)$, and from the above expression it becomes clear that the random variables $\omega_m$ will have a zero expectation at m > M. This is so because

$E(\omega_m) = \sum_{i=1}^{n}w_i\,E(f_i)\,p_m(x_i) = \sum_{i=1}^{n}w_i\,f(x_i)\,p_m(x_i) = \sum_{i=1}^{n}w_i\left(\sum_{k=0}^{M}\alpha_k p_k(x_i)\right)p_m(x_i) = \dots = 0$,

where $E(\cdot)$ is the mathematical expectation operator, which is linear.
It can also be demonstrated that in the examined case the quantity $S^2(a)/(n - m - 1)$ will be independent of m for $m \geq M$.
Thus, for finding M (which in general is not known in advance) it is expedient to solve the equations $Ga = \rho$ (or $Da = \omega$) successively for $m = 0, 1, 2,\dots$ until the corresponding values of $S^2(a)/(n - m - 1)$ stop decreasing significantly with the increase of m.
The general procedure of constructing a set of orthogonal polynomials (the Gram-Schmidt process) consists in setting $p_0 = 1$ and finding the higher-degree polynomials through the recurrence relation
$p_{n+1}(x) = x\,p_n(x) + \sum_{i=0}^{n}\alpha_i\,p_i(x)$, (8)

at which the coefficients $\alpha_i$ (a separate set for each n) are determined from the conditions

$\left(p_{n+1}, p_i\right) = 0, \quad i = 0,\dots,n$, where $\left(p_i, p_j\right) \equiv \int_a^b w(x)\,p_i(x)\,p_j(x)\,dx$. (9)

More specifically, since already $\left(p_i, p_j\right) = \delta_{ij}$ for i and j ≤ n, the orthogonality condition is

$\left(p_{n+1}, p_k\right) = \left(x\,p_n, p_k\right) + \alpha_k\left(p_k, p_k\right) = 0$. Therefore (10)

$\alpha_k = -\frac{\left(x\,p_n, p_k\right)}{\left(p_k, p_k\right)}, \quad k = 0,\dots,n$. (11)
Considering again the mutual orthogonality of the already constructed polynomials and using the defined recurrence relation, the numerators of $\alpha_k$ become:

$\left(x\,p_n, p_k\right) = \left(p_n, x\,p_k\right) = \left(p_n,\; p_{k+1} - \sum_{i=0}^{k}\alpha_i p_i\right) = 0, \quad k = 0,\dots,n-2$. (12)

Therefore, the recurrence relation is simplified to

$p_{n+1}(x) = x\,p_n(x) + \alpha_n p_n(x) + \alpha_{n-1}p_{n-1}(x) = \left(x - \beta_n\right)p_n(x) - \gamma_n p_{n-1}(x)$, (13)

where $\beta_n = -\alpha_n = \frac{\left(x\,p_n, p_n\right)}{\left(p_n, p_n\right)}$ and $\gamma_n = -\alpha_{n-1} = \frac{\left(x\,p_n, p_{n-1}\right)}{\left(p_{n-1}, p_{n-1}\right)}$. (14)
The last expression is simplified further:

$\left(x\,p_n, p_{n-1}\right) = \left(p_n, x\,p_{n-1}\right) = \left(p_n,\; p_n - \alpha_{n-1}p_{n-1} - \alpha_{n-2}p_{n-2} - \dots\right) = \left(p_n, p_n\right)$, (15)

so that $\gamma_n = \frac{\left(p_n, p_n\right)}{\left(p_{n-1}, p_{n-1}\right)}$. (16)
The orthogonalised polynomials can be normalised: $p_i \leftarrow \frac{p_i}{\sqrt{\left(p_i, p_i\right)}}$.
The mutual linear independence between the constructed set of polynomials follows
immediately from (8).
After defining $\left(p_i, p_j\right) \equiv \sum_{k=1}^{n}w_k\,p_i(x_k)\,p_j(x_k)$, the discrete counterpart of this process becomes completely analogous to the one described above.
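A compact sketch of the discrete construction via the three-term recurrence (13)-(16), combined with the diagonal solve $a_k = \omega_k/d_k$ (the function name `orthogonal_fit` is illustrative, not from the notes):

```python
import numpy as np

def orthogonal_fit(x, f, w, m):
    """Values at the nodes of the degree-m least-squares approximation,
    built in the basis of polynomials orthogonal over (x_i, w_i)."""
    dot = lambda u, v: np.sum(w * u * v)
    p_prev = np.zeros_like(x)               # p_{-1} = 0
    p = np.ones_like(x)                     # p_0 = 1
    gamma = 0.0
    approx = np.zeros_like(x)
    for k in range(m + 1):
        d = dot(p, p)                       # d_k = (p_k, p_k)
        approx = approx + (dot(f, p) / d) * p   # add a_k p_k with a_k = omega_k / d_k
        beta = dot(x * p, p) / d            # beta_k = (x p_k, p_k) / (p_k, p_k)
        p_next = (x - beta) * p - gamma * p_prev
        gamma = dot(p_next, p_next) / d     # gamma_{k+1} = (p_{k+1}, p_{k+1}) / (p_k, p_k)
        p_prev, p = p, p_next
    return approx

x = np.linspace(-1.0, 1.0, 100)
runge = 1.0 / (1.0 + 25.0 * x ** 2)
approx = orthogonal_fit(x, runge, np.ones_like(x), 20)
```

A degree-m fit of this kind reproduces any polynomial of degree ≤ m at the nodes exactly, and raising the degree only adds the new term $a_{m+1}p_{m+1}$.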
Example: Polynomial approximation of the Runge function
The Runge function, $f(x) = \frac{1}{1 + 25x^2}$, is noteworthy for the fact that its interpolation by a polynomial of a relatively high degree at equidistant nodes $x_i$ within the interval [−1, +1] tends to oscillate near the ends of this interval. A similar behaviour is exhibited by the polynomial approximation of this function.
Let the number of nodes be n = 100 and the maximum degree of orthogonal polynomial approximation with a weight function $w(x) = 1$ be M = 30. The dependence of $S^2(m)$, $m = 0,\dots,M$ on the degree m is shown below.
Figure 1. Behaviour of $S^2(m)$ in the course of approximating the Runge function by orthogonal polynomials ($S^2$ versus m, logarithmic ordinate; plot omitted).
It is seen that $S^2(m)$ decreases exponentially (the ordinate scale is logarithmic). The stepwise behaviour is due to the fact that Runge's function is even, and the orthogonal polynomials over the chosen interval with the chosen weight function are Legendre's polynomials. These polynomials are even functions of x when of an even degree, and odd when of an odd degree. Therefore, the contributions of the odd-degree polynomials must be zero. This is indeed so: the coefficients of the odd-degree polynomials take on negligible values as compared with those of even degree. In this context it is appropriate to build the approximation with only the even-degree polynomials belonging to this orthogonal set.
The approximations at some selected degrees are shown below.
Figure 2. Approximations of the Runge function with orthogonal polynomials over 100 equi-
distant nodes in the interval [-1,+1].
The examined example shows that a polynomial approximation of a relatively low degree can eliminate the deficiencies of an attempted global polynomial interpolation (the interpolating polynomial in this case would deviate significantly from the true function between the nodes, especially near the ends of the interval).
Below a general polynomial approximation following the standard linear least squares approach is illustrated. The basis functions are $\Phi_k(x) = x^k$.
Figure 3. Behaviour of $S^2(m)$ in the course of a general polynomial approximation of Runge's function ($S^2$ versus m, logarithmic ordinate; plot omitted).
Figure 4. General polynomial approximations of Runge’s function over 100 equidistant nodes
in the interval [-1,+1].
It is seen that up to relatively low degrees the two approaches are equivalent and fail to produce a consistent approximation. At higher degrees the general polynomial approximation with basis functions $\Phi_k(x) = x^k$ is again unsuccessful because of numerical instability, whereas the approximation with orthogonal polynomials leads to a satisfactory solution of the approximation (and effective interpolation) problem.
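The stable branch of this comparison can be reproduced with numpy's built-in Legendre least-squares fit (an illustration; `Legendre.fit` maps the data interval to [-1, 1] internally):

```python
import numpy as np
from numpy.polynomial import Legendre

x = np.linspace(-1.0, 1.0, 100)
f = 1.0 / (1.0 + 25.0 * x ** 2)   # Runge's function

# Degree-30 least-squares fit expressed in the orthogonal Legendre basis.
leg = Legendre.fit(x, f, 30)
err = np.max(np.abs(leg(x) - f))
```

At this degree the fit tracks Runge's function closely at the nodes, with no sign of the instability that plagues the monomial basis.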
Data smoothing
The least squares approximation by polynomials for the purpose of data modelling is normally performed over the entire set of observations $f_i$, $i = 1,\dots,n$, and the weights $w_i$ are assumed to be reciprocal to the observation variances: $w_i = 1/\sigma_i^2$. However, the technique of least squares approximation by orthogonal polynomials is also applied for the purpose of so-called data "smoothing", which aims at filtering out the potential noise in order to reveal (at least visually) the trends in the examined functional dependence. Smoothing usually encompasses a restricted number n of adjacent nodes, the approximating polynomial degree m is low, and the weight function is $w(x) = 1$. If the nodes are equidistant and the number n of the included nodes is chosen to be odd, i.e. $n = 2L + 1$, then it is convenient to introduce a new independent variable

$s = \frac{x - x_{L+1}}{x_n - x_{L+1}}$,

which varies from −1 to 1 (the node numbering is changed for simplicity). Under these restrictions and with this choice of independent variable, the set of orthogonal polynomials $p_j(s)$, $j = 0,\dots,m$ will be determined solely by the values of n and m. Thus the expressions for evaluating the smoothed data $\tilde{f}_i$ will have a fixed form. For example, with n = 3 (L = 1) and m = 1 they will be:

$\tilde{f}_1 = \frac{1}{6}\left(5f_1 + 2f_2 - f_3\right)$, $\quad\tilde{f}_2 = \frac{1}{3}\left(f_1 + f_2 + f_3\right)$, $\quad\tilde{f}_3 = \frac{1}{6}\left(-f_1 + 2f_2 + 5f_3\right)$. (17)

Since at each step the n-point data set which is subjected to the smoothing procedure is shifted forward by one position, the expressions for $\tilde{f}_1$ and $\tilde{f}_3$ are applied to the first and the last node respectively, and the expression for $\tilde{f}_2$ to all interior nodes.
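The fixed expressions (17) in code (a sketch; `smooth3` is an illustrative name). Because the underlying local fit is a straight line, exactly linear data pass through the filter unchanged:

```python
import numpy as np

def smooth3(f):
    """3-point, degree-1 least-squares smoothing, expressions (17):
    sliding mean for interior nodes, end formulas at the boundaries."""
    f = np.asarray(f, dtype=float)
    g = np.empty_like(f)
    g[0] = (5.0 * f[0] + 2.0 * f[1] - f[2]) / 6.0        # first node
    g[1:-1] = (f[:-2] + f[1:-1] + f[2:]) / 3.0           # interior nodes
    g[-1] = (-f[-3] + 2.0 * f[-2] + 5.0 * f[-1]) / 6.0   # last node
    return g

x = np.arange(10, dtype=float)
line = 2.0 + 0.5 * x
smoothed = smooth3(line)   # identical to line, since the data are linear
```

For noisy data the interior formula is simply a running 3-point mean, while the end formulas extrapolate the fitted line to the boundary nodes.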
Gaussian quadrature
The general approach in numerical integration, i.e. the evaluation of $I_{ab} \equiv \int_a^b f(x)\,dx$, ultimately reduces to interpolation of the integrand and subsequent approximation of $I_{ab}$ by an analytical expression for the integral of the interpolating function. The final result can always be represented in the form $I_{ab} \cong \sum_{i=1}^{n}a_i f(x_i)$, where the nodes $x_i$, $i = 1,\dots,n$ are chosen in a certain way, and the coefficients (weights) $a_i$, $i = 1,\dots,n$ depend on this choice. It is evident that in the case of polynomial interpolation and an arbitrary choice of distinct nodes, the quadrature formula will be exact for integrands which are polynomials of a degree not higher than n − 1.
On the other hand, if some special conditions (n in number) are imposed on the selection of nodes, it can be expected that the total number of 2n conditions (n for the nodes + n for the interpolation requirement) will be sufficient to construct a quadrature formula which will be exact for integrands which are polynomials of a degree not higher than 2n − 1. This is
These notes represent the lecture contents in Computational Methods in Nuclear Technology. Errors and inaccuracies are possible. I. Christoskov, January 2015
81/150
namely the idea behind the so-called Gaussian quadrature. This method is aimed at finding a
quadrature formula of the form:
$$\int_a^b w(x)\,f(x)\,dx \cong \sum_{i=1}^{n} a_i f(x_i); \qquad w(x) \ge 0, \quad \int_a^b w(x)\,dx > 0. \qquad (18)$$
The rationale for introducing a weight function $w(x)$ is that it allows the integrand $g(x)$ to be decomposed into a product of a function $w(x)$, which may be rather "complex" and, in particular, not even nearly reducible to a polynomial of a comparatively low degree, and a "simple" function $f(x)$, which may be exactly or approximately represented as a polynomial of a low degree: $g(x) = w(x)\,f(x)$. In this sense a quadrature as defined in (18) can provide a powerful tool for economical and precise integration of a wide range of functions.
Let $1 \equiv Q_0, Q_1, \dots, Q_n, \dots$ be a set of orthogonal polynomials with respect to $w(x)$ on $[a, b]$. For such polynomials:

$$\int_a^b w(x)\,Q_i(x)\,Q_j(x)\,dx = 0,\ i \ne j; \qquad \int_a^b w(x)\,Q_n(x)\,q(x)\,dx = 0, \qquad (19)$$

where $q(x)$ is any polynomial of degree lower than n.

(The latter is due to the fact that orthogonal polynomials form a basis and allow the expansion $q(x) = \sum_{i=0}^{n-1} \alpha_i Q_i(x)$.)
Let $x_i,\ i = 1, \dots, n$ be the zeros (roots) of the orthogonal polynomial $Q_n(x)$. For certain appropriate weight functions it can be shown that these roots are real, distinct and lie within the interval $(a, b)$.

Let:

a) the nodes in (18) be the roots of $Q_n(x)$;

b) the weights $a_i,\ i = 1, \dots, n$ be chosen so that the quadrature (18) is interpolational, i.e. exact for all polynomials of degree $n-1$.
For example, the Lagrange polynomials

$$L_j(x) = \frac{\prod_{i \ne j} (x - x_i)}{\prod_{i \ne j} (x_j - x_i)}$$

allow the representation $f(x) = \sum_{i=1}^{n} f(x_i)\,L_i(x) + R(f; x)$. If $f(x)$ is precisely a polynomial of degree $n-1$, then the remainder $R(f; x) = 0$ and

$$\int_a^b w(x)\,f(x)\,dx = \sum_{i=1}^{n} f(x_i) \int_a^b w(x)\,L_i(x)\,dx = \sum_{i=1}^{n} a_i f(x_i),$$

i.e. the interpolational weights are $a_i = \int_a^b w(x)\,L_i(x)\,dx$.
Let $\Phi_{2n-1}(x)$ be an arbitrary polynomial of degree $2n-1$. It can always be represented in the form:

$$\Phi_{2n-1}(x) = Q_n(x)\,q_{n-1}(x) + r_{n-1}(x), \qquad (20)$$

where the quotient $q_{n-1}(x)$ and the remainder $r_{n-1}(x)$ are polynomials of degree not higher than $n-1$.
Then:

$$\int_a^b w(x)\,\Phi_{2n-1}(x)\,dx = \int_a^b w(x)\,r_{n-1}(x)\,dx = \sum_{i=1}^{n} a_i\,r_{n-1}(x_i) = \sum_{i=1}^{n} a_i \left[ Q_n(x_i)\,q_{n-1}(x_i) + r_{n-1}(x_i) \right] = \sum_{i=1}^{n} a_i\,\Phi_{2n-1}(x_i). \qquad (21)$$

The second expression follows from (20) and (19) (the term with $Q_n q_{n-1}$ integrates to zero), the third is equivalent to the second because of the choice of weights, the fourth is equivalent to the third because of the choice of nodes ($Q_n(x_i) = 0$), and the fifth is an alternative representation of the fourth (because of (20)).
Thus it was proved that with the above choice of n weights and nodes the quadrature formula (18) is exact for polynomials of degree $2n-1$.

The sets of orthogonal polynomials needed for evaluating the nodes (and weights) can be explicitly constructed for each particular weight function $w(x)$ and integration limits. In most cases, however, it is expedient to employ standard quadrature sets for a selection of weight functions and to convert the integration limits to the standard ones through a linear change of variables. Some of the orthogonal polynomial sets corresponding to frequently chosen weight functions and integration limits are:
− Legendre polynomials: $[a, b] = [-1, 1]$ and $w(x) = 1$.
− Chebyshev polynomials of the first kind: $(a, b) = (-1, 1)$ and $w(x) = \dfrac{1}{\sqrt{1 - x^2}}$.

− Chebyshev polynomials of the second kind: $[a, b] = [-1, 1]$ and $w(x) = \sqrt{1 - x^2}$.

− Laguerre polynomials: $[a, b) = [0, \infty)$ and $w(x) = x^{\alpha} \exp(-x),\ \alpha > -1$.

− Hermite polynomials: $(a, b) = (-\infty, \infty)$ and $w(x) = \exp(-x^2)$.
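The exactness property for degree $2n-1$ is easy to verify numerically; standard Gauss–Legendre node and weight sets are available, for instance, in NumPy (a sketch added here for illustration; `leggauss` returns the nodes and weights for $w(x) = 1$ on $[-1, 1]$):

```python
import numpy as np

# n = 5 Gauss-Legendre nodes x_i and weights a_i: the quadrature
# sum(a_i * f(x_i)) is exact for polynomials up to degree 2n-1 = 9.
x, a = np.polynomial.legendre.leggauss(5)

# Integrate f(x) = x^8 over [-1, 1]; the exact value is 2/9.
approx = np.sum(a * x**8)
print(approx)
```

With only five evaluations of the integrand the degree-8 polynomial is integrated to machine precision, in contrast with an interpolational rule on five arbitrary nodes, which is only guaranteed exact up to degree 4.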
6. Monte Carlo methods
Monte Carlo (MC) methods are a widely applied class of computational algorithms for simulating the behaviour of various physical and mathematical systems, as well as for solving some general computational problems. The principal characteristic feature of MC methods is their stochasticity, expressed in the employment of random (or, most often, pseudorandom) numbers for performing a simulation or solving a numerical problem.
A typical application of MC is the modelling of transport phenomena – neutron transport for the purposes of nuclear reactor analysis, photon transport for radiation shielding analyses and for determining the detection efficiency of detector systems, etc. In these problems the MC approach essentially consists in finding certain quantitative characteristics of a given macroscopic system by simulating the microscopic interactions therein (these interactions are most often of an intrinsically stochastic nature).

Based on a more general mathematical formulation of this class of problems, it can be said that Monte Carlo methods represent a technique for solving systems of integro-differential equations for the sought macroscopic characteristics (e.g. radiation fields).
Another self-standing application of MC methods is numerical integration. Deterministic methods of numerical integration require evaluating the integrand at a set of selected points (most often equidistant) in the space of the function arguments. With a large number of independent variables this approach may be impracticable – for example, a case with 100 independent variables and 10 grid points in each dimension would require $10^{100}$ evaluations of the integrand. This example is not purely artificial, because in many physical problems there exists a unique correspondence between their dimensionality and the number of degrees of freedom. With Monte Carlo methods, on the other hand, the information about the integrand is collected through random sampling of points in the space of independent variables, and the integral is evaluated as an appropriately defined average of the function values. By the law of large numbers such a method will converge as $1/\sqrt{N}$, i.e. a quadruple increase of the sample volume will result in halving the uncertainty of the estimated value of the integral, independently of the problem dimensionality. The efficiency of the method can be further improved if the points in the arguments space are sampled from a probability distribution with a density function which resembles in shape the magnitude of the integrand. Thus the evaluated functional values will be predominantly among those with a larger contribution to the integral.
MC methods are also a powerful and widely used technique for numerical optimisation (finding the global minimum of a function, often of a large number of independent variables). These methods are based on the so-called random walk. The itinerary through the space of function arguments follows a principal downhill trend, but with a possibility of random excursions in the direction of function growth, thus reducing the chance of getting stuck in a local minimum instead of the global one.
From the above considerations it follows that a prerequisite for the employment of MC methods is the capability to generate samples from various and practically arbitrary probability distributions. An obvious example in support of this statement is numerical integration. Another clear example is presented by the task of solving transport problems. For instance, if the energy of an incident particle is $E'$, then its energy $E$ after a scattering event is a random quantity sampled from the distribution

$$f(E', E) = \frac{\Sigma_s(\mathbf{r}, E' \to E)}{\Sigma_s(\mathbf{r}, E')},$$

where $\mathbf{r}$ is the point of scattering. This point in its turn is also random, and it is sampled from the distribution

$$f(\mathbf{r}, E') = \Sigma_t(\mathbf{r}, E') \exp\left[ -T(\mathbf{r}, \mathbf{\Omega}, R, E') \right], \qquad T(\mathbf{r}, \mathbf{\Omega}, R, E') \equiv \int_0^R \Sigma_t(\mathbf{r} - \mathbf{\Omega} R'', E')\,dR'',$$

where $T$ is the so-called optical thickness and $R$ is the free path length for a particle travelling in direction $\mathbf{\Omega}$ from its point of emergence (previous collision) $\mathbf{r}_0$ to the current collision point, i.e. $\mathbf{r} = \mathbf{r}_0 + R\,\mathbf{\Omega}$.
Generation of random deviates with a chosen probability distribution
Here it is appropriate to recall some of the properties of random variables and of their probability distributions.

Let $x$ be a random variable and $P(a \le x \le b)$ the probability that $x$ takes a value between $a$ and $b$. The probability density function $f(x)$ of this random variable is defined by the property:

$$f(x_0)\,\Delta x = P(x_0 \le x \le x_0 + \Delta x) \quad \text{at } \Delta x \to 0. \qquad (1)$$

Therefore:

$$f(x) \ge 0 \quad \text{and} \quad \int_{x_-}^{x_+} f(x)\,dx = 1,$$

where $x_-$ and $x_+$ are the bounds of the possible values of $x$, which in particular can be $-\infty$ and $+\infty$, and

$$P(a \le x \le b) = \int_a^b f(x)\,dx. \qquad (2)$$

The cumulative distribution function $F(x)$ is defined as:

$$F(x_0) = P(x \le x_0). \qquad (3)$$

Therefore:

$$F(x) = \int_{x_-}^{x} f(x')\,dx'; \qquad F(x) \to 1 \text{ as } x \to x_+; \qquad F(x) \to 0 \text{ as } x \to x_-;$$

$$P(a \le x \le b) = F(b) - F(a); \qquad \frac{dF(x)}{dx} = f(x). \qquad (4)$$
Uniform distribution

The uniform distribution $U(a, b)$ has the following probability density function:

$$f(x)\,dx = \begin{cases} \dfrac{dx}{b - a}, & a \le x \le b \\ 0, & x < a \text{ or } x > b \end{cases} \qquad (5)$$

If $x \in U(0, 1)$, then the expectation of this random variable is $\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx = \frac{1}{2}$ and its variance is

$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \frac{1}{12}. \qquad (6)$$
Normal distribution

The normal (Gaussian) distribution $N(\mu, \sigma)$ has the following probability density function:

$$f(x)\,dx = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right] dx, \qquad (7)$$

where $\mu$ and $\sigma^2$ are respectively the expectation and the variance of this distribution.

If the random variable $x$ has a distribution $N(\mu, \sigma)$, then the random variable $\xi \equiv \frac{x - \mu}{\sigma}$ will have the distribution $N(0, 1)$. And conversely, if the random variable $x$ has a distribution $N(0, 1)$, then the random variable $\xi \equiv \mu + \sigma x$ will have a distribution $N(\mu, \sigma)$. These statements follow from the definitions of mathematical expectation and of variance and are valid for any probability density function. More specifically, for the expectation:

$$E(\mu + \sigma x) \equiv \int (\mu + \sigma x)\,f(x)\,dx = \mu + \sigma\,E(x),$$

and for the variance:

$$D(\mu + \sigma x) \equiv \int \left[ \mu + \sigma x - E(\mu + \sigma x) \right]^2 f(x)\,dx = \sigma^2 D(x).$$
Central limit theorem

If $x_1, x_2, \dots$ are mutually independent random variables, all belonging to the same distribution with an expectation $\mu$ and variance $\sigma^2$, then at $n \to \infty$ the random variable $s_n \equiv \sum_{i=1}^{n} x_i$ will belong to the distribution $N(n\mu, \sqrt{n}\,\sigma)$.

In particular, if $x_i,\ i = 1, \dots$ are sampled from the distribution $U(0, 1)$, then, for instance, the random variable $\xi \equiv \sum_{i=1}^{12} x_i - 6$ will have a distribution approximating $N(0, 1)$.
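The 12-uniform recipe above is easy to try in code (a minimal sketch; the function name and sample sizes are illustrative choices):

```python
import random

def gauss12(rng=random):
    """Approximate N(0,1) deviate: the sum of 12 U(0,1) deviates has
    expectation 12*(1/2) = 6 and variance 12*(1/12) = 1, so subtracting
    6 gives an approximately standard normal random variable."""
    return sum(rng.random() for _ in range(12)) - 6.0

random.seed(0)
sample = [gauss12() for _ in range(100_000)]
mean = sum(sample) / len(sample)
var = sum((s - mean)**2 for s in sample) / len(sample)
print(mean, var)   # sample mean near 0, sample variance near 1
```

The approximation is adequate for rough work but its tails are truncated at ±6; the Box-Muller transform discussed below produces exact normal deviates.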
The transformation method for generating deviates with a specified probability distribution
Let the random variable $y = \varphi(x)$, with a probability density function $g(y)$, be some prescribed (fully deterministic) function of the random variable $x$ with a probability density function $f(x)$. Then the following relation holds (fundamental transformation law of probabilities):

$$g(y)\,dy = f(x)\,dx, \qquad (8)$$

where $dx$ is an arbitrary small increment, whereas $dy = \varphi(x + dx) - \varphi(x)$. Or,

$$g(y) = f(x)\,\frac{dx}{dy} = f\!\left( \varphi^{-1}(y) \right) \frac{dx}{dy}. \qquad (9)$$

In particular, if $y = F(x)$, where $F$ is the cumulative distribution function (3) corresponding to $f(x)$, then from (4), (9) and the fact that $F$ is a monotonically increasing function of $x$, it follows that

$$g(y) = g(F) = f(x)\,\frac{dx}{dy} = \frac{dF}{dx}\,\frac{dx}{dF} = 1.$$

Since at that $0 \le F \le 1$, the random variable $y = F(x)$ will have a uniform distribution between 0 and 1.

Thus if $\xi$ is sampled from the uniform distribution between 0 and 1, then

$$x = F^{-1}(\xi) \qquad (10)$$

will be a random deviate from the specified distribution $f(x)$.

Let, for example, $f(x) = e^{-x}$, $0 < x < \infty$. Therefore $F(x) = 1 - e^{-x}$, so that the application of (10) results in the prescription to evaluate the random deviate $x$ belonging to the specified distribution $f(x)$ according to the expression $x = -\ln(1 - \xi)$, where $\xi \in U(0, 1)$. Since, on the other hand, $\zeta \equiv 1 - \xi$ is also from the distribution $U(0, 1)$, finally:

$$x = -\ln \zeta, \quad \text{where } \zeta \in U(0, 1). \qquad (11)$$
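Prescription (11) translates directly into code. A minimal sketch (the expression `1.0 - random.random()` is used so that the argument of the logarithm stays in (0, 1]):

```python
import math
import random

random.seed(1)

def exp_deviate():
    """Transformation method, eq. (11): x = -ln(zeta), zeta ~ U(0,1),
    gives a deviate from f(x) = exp(-x), x > 0."""
    zeta = 1.0 - random.random()   # value in (0, 1], avoids log(0)
    return -math.log(zeta)

sample = [exp_deviate() for _ in range(200_000)]
mean = sum(sample) / len(sample)   # the expectation of f(x) = exp(-x) is 1
print(mean)
```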
Sampling from the normal distribution (Box-Muller method)
Let the random variables $x_1, x_2, \dots, x_n$ have the joint probability density function $f(x_1, \dots, x_n)$.

Let there also exist the prescribed, i.e. completely deterministic, functional dependences:

$$y_1 = \varphi_1(x_1, \dots, x_n), \quad \dots, \quad y_n = \varphi_n(x_1, \dots, x_n). \qquad (12)$$
Then, analogously to (8), the following relation holds:
$$p(y_1, \dots, y_n)\,dy_1 \dots dy_n = f(x_1, \dots, x_n) \left| \det \frac{\partial (x_1, \dots, x_n)}{\partial (y_1, \dots, y_n)} \right| dy_1 \dots dy_n, \qquad (13)$$

where $\left| \det \frac{\partial (x_1, \dots, x_n)}{\partial (y_1, \dots, y_n)} \right|$ is the absolute value of the determinant of the Jacobian matrix,

$$\frac{\partial (x_1, \dots, x_n)}{\partial (y_1, \dots, y_n)} \equiv \begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \dots & \dfrac{\partial x_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_n}{\partial y_1} & \dots & \dfrac{\partial x_n}{\partial y_n} \end{pmatrix}.$$

Now let in particular:

$$y_1 = \sqrt{-2 \ln x_1}\,\cos 2\pi x_2, \qquad y_2 = \sqrt{-2 \ln x_1}\,\sin 2\pi x_2. \qquad (14)$$
Then:

$$x_1 = \exp\left[ -\frac{1}{2} \left( y_1^2 + y_2^2 \right) \right], \qquad x_2 = \frac{1}{2\pi}\,\mathrm{arctg}\,\frac{y_2}{y_1}. \qquad (15)$$

The determinant of the Jacobian matrix will be:

$$\det \frac{\partial (x_1, x_2)}{\partial (y_1, y_2)} = -\left[ \frac{1}{\sqrt{2\pi}}\,e^{-y_1^2/2} \right] \left[ \frac{1}{\sqrt{2\pi}}\,e^{-y_2^2/2} \right].$$

From the above and from (13) it follows that $y_1$ and $y_2$ are mutually independent random variables, each with a probability density function $N(0, 1)$.
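A direct implementation of (14) can be sketched as follows (function name and sample sizes are illustrative):

```python
import math
import random

def box_muller(rng=random):
    """Box-Muller transform, eq. (14): two independent U(0,1)
    deviates x1, x2 -> two independent N(0,1) deviates y1, y2."""
    x1 = 1.0 - rng.random()          # value in (0, 1], avoids log(0)
    x2 = rng.random()
    r = math.sqrt(-2.0 * math.log(x1))
    return r * math.cos(2.0 * math.pi * x2), r * math.sin(2.0 * math.pi * x2)

random.seed(2)
ys = [y for _ in range(50_000) for y in box_muller()]
mean = sum(ys) / len(ys)
var = sum((y - mean)**2 for y in ys) / len(ys)
print(mean, var)   # sample mean near 0, sample variance near 1
```

Each call consumes two uniform deviates and produces two exact (not approximate) standard normal deviates, unlike the 12-uniform recipe mentioned earlier.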
The rejection method for generating deviates with a specified probability distribution
Let the desired probability density function be $p(x)$, and let $f(x)$ be a suitably chosen enveloping function (cf. Fig. 1), such that $f(x) \ge p(x)$ and $\int_{-\infty}^{\infty} f(x)\,dx$ has a finite value.

If the above described transformation method is applied for sampling from the distribution $p(x)$, and the generated sample is plotted on the x axis, then the number of random deviates in a given interval of this axis will be statistically proportional to the area under the curve of $p(x)$ in this interval. If these deviates $x_i,\ i = 1, \dots$ are regarded as the abscissae of points in the $(x, p(x))$ plane, and their corresponding ordinates are chosen as random deviates from the uniform distributions $U(0, p(x_i))$, then the density (number per unit area) of these points under the curve of $p(x)$ will be a statistically constant quantity.
An analogous procedure applied to the function $f(x)$ will differ from the above described only in the necessity to account for the fact that $\int_{-\infty}^{+\infty} f(x)\,dx = A > 1$. Thus, in the algorithm for this method the random deviate $\xi$ must be sampled from the distribution $U(0, A)$ instead of the distribution $U(0, 1)$. Of course, the ordinates of the points in the plane of the plot of $f(x)$ must be sampled from the uniform distributions $U(0, f(x_i))$. In this case the abscissae of those points with ordinates not larger than $p(x_i)$ will form a sample from the distribution $p(x)$. This is so because they will meet all the requirements of the transformation method as if it were applied directly to $p(x)$.
And so, let the definition interval of $f(x)$ and $p(x)$ be $[x_-, x_+]$ (this interval can as well be infinite), and let the analogue of the cumulative distribution function for $f(x)$ be $F(x) \equiv \int_{x_-}^{x} f(x')\,dx'$. Let $A \equiv F(x_+) = \int_{x_-}^{x_+} f(x')\,dx'$. In this case the algorithm of sampling from $p(x)$ by the so-called rejection method is as follows:

1. Sample $\xi \in U(0, A)$. Compute $x_0 = F^{-1}(\xi)$. Sample $y_0 \in U(0, f(x_0))$.

2. Compare $y_0$ with $p(x_0)$. If $y_0 \le p(x_0)$, the abscissa $x_0$ is accepted as a random deviate $x$ from the distribution $p(x)$. Otherwise steps 1 and 2 are repeated.

It is clear that the relative share of "successful" hits will be equal to the ratio of the area under $p(x)$ to the area under $f(x)$. Therefore, for a better efficiency of the algorithm it is desirable that $f(x)$ compactly envelopes $p(x)$ and has an easily computable inverse of its primitive function (antiderivative). Thus the rejection method can be regarded as a substitute for the transformation method when the evaluation of the inverse of the antiderivative of the desired probability density function $p(x)$ is difficult or computationally expensive.
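The two-step algorithm can be sketched for a hypothetical target density $p(x) = 2x$ on $[0, 1]$ with the constant envelope $f(x) = 2$, so that $A = 2$, $F(x) = 2x$ and $F^{-1}(\xi) = \xi/2$; these particular choices are made here only for illustration:

```python
import random

random.seed(3)

def p(x):
    """Hypothetical target density on [0, 1]; integrates to 1."""
    return 2.0 * x

def sample_p():
    """Rejection method with the constant envelope f(x) = 2 on [0, 1]."""
    while True:
        xi = random.uniform(0.0, 2.0)     # step 1: xi ~ U(0, A), A = 2
        x0 = xi / 2.0                     #         x0 = F^{-1}(xi)
        y0 = random.uniform(0.0, 2.0)     #         y0 ~ U(0, f(x0)), f(x0) = 2
        if y0 <= p(x0):                   # step 2: accept, else repeat
            return x0

sample = [sample_p() for _ in range(100_000)]
mean = sum(sample) / len(sample)          # exact expectation of p is 2/3
print(mean)
```

Here the acceptance rate is the ratio of the areas, $1/2$; a more compact envelope would waste fewer trials.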
Figure 1. The rejection method. The illustration is from NUMERICAL RECIPES IN FORTRAN 77: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43064-X) Copyright (C) 1986-1992 by Cambridge University Press.
Assessing the quality of the sample
The quality of a sample from a given probability distribution can be assessed by means
of the so-called Pearson’s 2χ test.
Let the continuous random variable x has a probability density function ( )xf and let
Lixi ,...,1, = (1)
is a sample from this distribution.
Let the range of possible values of x is subdivided into M intervals with boundaries
1,...,1, += Mjbj . (2)
Let further
MjN j ,...,1, = (3)
are the numbers of observations in the corresponding intervals, i.e. kN is the number of
elements of the sample (1) with values within the interval [ ]1, +kk bb . It is clear that LNM
jj =∑
=1
.
On the other hand, let MjF j ,...,1, = are the expected numbers of observations:
( )∫+
=1k
k
b
b
k dxxfLF . (4)
If the hypothesis that $x_i,\ i = 1, \dots, L$ is a sample from the distribution $f(x)$ is true, then the observation numbers $N_j,\ j = 1, \dots, M$ will have a Poisson distribution with a probability density function:

$$p(N_j) = \exp(-F_j)\,\frac{F_j^{N_j}}{N_j!}. \qquad (5)$$

$p(N_j)$ is the probability of occurrence of $N_j$ successful outcomes if their expected number is $F_j$, if this expected number is strictly proportional to the total number of trials (here the total number of trials is the sample volume $L$), and if the occurrence of a successful outcome does not depend on the number of trials after the occurrence of the previous successful outcome. It is seen that the process of sampling from $f(x)$ fully corresponds to these conditions.

The Poisson distribution (5) has an expectation $\mu_j = F_j$ and a variance $\sigma_j^2 = F_j$. When the expectation is large, e.g. $F_j > 10$, the Poisson distribution converges to the Gaussian (normal) one: $N(F_j, \sqrt{F_j})$.

By virtue of the last two statements (if the prerequisite for the second of them is fulfilled), the quantity

$$z_j = \frac{N_j - F_j}{\sqrt{F_j}} \qquad (6)$$

will have a standard Gaussian distribution $N(0, 1)$, and therefore the sum

$$S_0^2 \equiv \sum_{j=1}^{M} z_j^2 = \sum_{j=1}^{M} \frac{(N_j - F_j)^2}{F_j} \qquad (7)$$

will have a $\chi_M^2$ distribution with $M$ degrees of freedom.

Then, if the probability $P(\chi_M^2 > S_0^2)$ is lower than e.g. 0.05, the above mentioned hypothesis is rejected, i.e. the quality of the tested sample is not satisfactory.
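A sketch of the test for the simplest case of a hypothetical $U(0, 1)$ sample with $M$ equal bins (so $F_j = L/M$). The 0.05 critical value 18.31 for $\chi^2$ with $M = 10$ degrees of freedom is taken from standard tables, following the degrees-of-freedom convention used in the text:

```python
import random

random.seed(4)

L, M = 100_000, 10
sample = [random.random() for _ in range(L)]      # hypothesis: U(0, 1)

# Observed counts N_j in M equal bins; expected counts F_j = L/M.
N = [0] * M
for x in sample:
    N[min(int(x * M), M - 1)] += 1
F = L / M

S0_sq = sum((Nj - F)**2 / F for Nj in N)          # eq. (7)

# Reject the hypothesis at the 5 % level if S0_sq exceeds the upper
# 0.05 quantile of chi^2 with M = 10 degrees of freedom (about 18.31).
print(S0_sq, S0_sq < 18.31)
```

With the tabulated quantile replaced by the appropriate value for a different $M$, the same few lines test a sample against any hypothesised $f(x)$ once the $F_j$ are computed from (4).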
Monte Carlo integration
The simplest approach can be explained through an analogy with the elementary algorithms with equidistant abscissae (e.g. the rectangle rule). Let the task be to evaluate $I \equiv \int_0^1 f(x)\,dx$. According to the rectangle rule, if the interval $[0, 1]$ is subdivided into $N$ equal subintervals, the estimate of the integral will be

$$I \cong R \equiv \frac{1}{N} \sum_{i=1}^{N} f(x_i),$$

where $x_i$ are the centres of these subintervals, uniformly distributed in the interval $[0, 1]$. Thus $R$ can be interpreted as an estimate of the average function value in this interval: $R \cong \bar{f}$.

It is clear that with a sufficiently large number of points $N$ this estimate will not depend on the specific choice of these points, provided that they are uniformly distributed in the interval $[0, 1]$. Thus, if $\xi_i,\ i = 1, \dots, N$ is a random sample with a volume $N$ from the uniform distribution $U(0, 1)$, then the sample average

$$\bar{f} \cong \frac{1}{N} \sum_{i=1}^{N} f(\xi_i)$$

will also be an estimate of the integral $I \equiv \int_0^1 f(x)\,dx$.
These considerations are directly generalised for arbitrary integration limits, as well as
for a higher dimensionality of the integral.
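The plain sample-average estimator can be sketched as follows (the example integrand $x^2$ and its exact integral $1/3$ are chosen here only for illustration):

```python
import random

random.seed(5)

def mc_integral(f, N):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]:
    the sample average of f evaluated at N deviates from U(0, 1)."""
    return sum(f(random.random()) for _ in range(N)) / N

est = mc_integral(lambda x: x * x, 1_000_000)   # exact value is 1/3
print(est)
```

For integration limits $[a, b]$ it suffices to sample $x = a + (b - a)\,\xi$ and multiply the average by $(b - a)$; the generalisation to several variables is equally direct.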
In the present context $f(x)$ is a random variable because it is a function of a random argument. Let this random variable have a probability density function $p(f)$. Let

$$\mu_p = E(f) \equiv \int_{f_-}^{f_+} f\,p(f)\,df \qquad (1)$$

and

$$\sigma_p^2 = D(f) \equiv \int_{f_-}^{f_+} (f - \mu_p)^2\,p(f)\,df \qquad (2)$$

be respectively the expectation and the variance of $f$.

In this sense $f_i \equiv f(\xi_i),\ i = 1, \dots, N$ is a random sample from the distribution $p(f)$, and its sample average $\bar{f} \cong \frac{1}{N} \sum_{i=1}^{N} f_i$ is obviously also a random quantity. The sample elements $f_i$ are mutually independent values of the random variable $f$, i.e. they are mutually independent random variables, each with a distribution $p(f)$. Therefore, for each of them the following equalities are valid: $E(f_i) = \mu_p$ and $D(f_i) = \sigma_p^2$.
Thus, the expectation of the sample average is:

$$E(\bar{f}) = E\!\left( \frac{1}{N} \sum_{i=1}^{N} f_i \right) = \frac{N \mu_p}{N} = \mu_p, \qquad (4)$$

and the variance of this average:

$$D(\bar{f}) = D\!\left( \frac{1}{N} \sum_{i=1}^{N} f_i \right) = \frac{1}{N^2} \sum_{i=1}^{N} D(f_i) = \frac{N \sigma_p^2}{N^2} = \frac{\sigma_p^2}{N}. \qquad (5)$$
The first equality in the above chain is due to the mutual independence of the sample elements:

$$D(\bar{f}) = E\!\left[ \left( \frac{1}{N} \sum_{i=1}^{N} f_i - \mu_p \right)^2 \right] = \frac{1}{N^2}\,E\!\left[ \left( \sum_{i=1}^{N} (f_i - \mu_p) \right)^2 \right] = \frac{1}{N^2} \left\{ \sum_{i=1}^{N} E\!\left[ (f_i - \mu_p)^2 \right] + \sum_{i \ne j} E\!\left[ (f_i - \mu_p)(f_j - \mu_p) \right] \right\} = \frac{1}{N^2} \left\{ N \sigma_p^2 + \sum_{i \ne j} E\!\left[ (f_i - \mu_p)(f_j - \mu_p) \right] \right\}, \qquad (6)$$

where the cross terms vanish:

$$E\!\left[ (f_i - \mu_p)(f_j - \mu_p) \right] = \int_{f_-}^{f_+} \! \int_{f_-}^{f_+} (f_i - \mu_p)(f_j - \mu_p)\,p(f_i, f_j)\,df_i\,df_j = \left[ \int_{f_-}^{f_+} (f_i - \mu_p)\,p(f_i)\,df_i \right] \left[ \int_{f_-}^{f_+} (f_j - \mu_p)\,p(f_j)\,df_j \right] = 0. \qquad (7)$$

(The last chain of equalities relies on the definition of the expectation of a function of a multidimensional (in particular two-dimensional) random variable with a probability density function $p(x_1, x_2)$:

$$E\!\left( \varphi(x_1, x_2) \right) = \int_{x_{1-}}^{x_{1+}} \! \int_{x_{2-}}^{x_{2+}} \varphi(x_1, x_2)\,p(x_1, x_2)\,dx_1\,dx_2,$$

and on the fact that for the mutually independent random variables $f_i$ and $f_j$: $p(f_i, f_j) = p(f_i)\,p(f_j)$.)
Expression (5) deserves a special emphasis because it actually pertains to all estimates obtained through any Monte Carlo method.
The possibility to choose a different probability distribution (besides the uniform) for the random arguments $x$ in the integration problem can be justified and commented on in conjunction with the already discussed transformation method for sampling from a specified distribution. There the algorithm reduced to populating the area under the curve of a non-negative function $f(x)$ with points, the density (number per unit area) of which is statistically constant everywhere under this curve. Thus the number $N_{ab}$ of points with abscissae between two fixed boundaries, e.g. $a$ and $b$, would be proportional to the integral of the function within these limits, $F_{ab} \equiv \int_a^b f(x)\,dx$. In this sense, if there exists a function $g(x) \le f(x)$ and the number of points with ordinates $y_i \le g(x_i)$ ($x_i$ are the abscissae of these points) is $M_{ab}$, then the integral $G_{ab} \equiv \int_a^b g(x)\,dx$ can be estimated through the expression

$$G_{ab} = \frac{M_{ab}}{N_{ab}}\,F_{ab},$$

provided that the value of $F_{ab}$ is known.

This approach has the following two extreme variants.

The first one is $f(x) = A \ge \max_{[a,b]} g(x)$, at which $F_{ab} = A \times (b - a)$ and the abscissae and ordinates of the respective points are sampled from the uniform distributions $U(a, b)$ and $U(0, A)$. A big disadvantage in this case is that if $g(x) \ll A$ in most of the interval $[a, b]$ (a typical situation), then with a reasonably large number of points $N_{ab}$ the number of hits $M_{ab}$ under the curve of $g(x)$ will be quite small and correspondingly will have a very large relative uncertainty (this follows from the discussion about the Poisson distribution made above in relation to the assessment of the quality of samples). This large relative uncertainty is directly inherited by the estimate of $G_{ab}$.
The second extreme is to choose $f(x) \approx g(x)$ in the interval $[a, b]$ (observing the requirement that $f(x) \ge g(x)$ in this interval). In this case, of course, the value of $F_{ab}$ must be known in advance, the abscissae $x_i$ must be evaluated by the transformation method for $f(x)$, and the ordinates must be sampled from the uniform distributions $U(0, f(x_i))$. This ensures $M_{ab} \approx N_{ab}$, and the statistical uncertainty of the estimate $G_{ab}$ is minimised. In the absolutely extreme case of $f(x) = g(x)$, and correspondingly $M_{ab} = N_{ab}$, the estimate $G_{ab}$ will coincide with the actual integral, but the employment of a Monte Carlo procedure will be completely meaningless.
The above considerations can be generalised and further specified as follows.
The problem of evaluating $I_{ab} = \int_a^b f(x)\,dx$ is equivalent to the problem of evaluating

$$\int_a^b g(x)\,\frac{f(x)}{g(x)}\,dx = \int_a^b g(x)\,\varphi(x)\,dx.$$

Let here $g(x)$ play the role of a probability distribution from which the arguments $x_i$ of the function values $\varphi(x_i)$ are sampled. In full agreement with the considerations so far and with the initial particular example, an estimate of the integral $I_{ab}$ will be the sample average $\bar{\varphi} = \frac{1}{N} \sum_{i=1}^{N} \varphi(x_i)$. (In the initial example the integration limits were $[0, 1]$, $g(x) = 1 = U(0, 1)$, and obviously $\varphi(x) = f(x)$.)

In order that $g(x)$ can play the role of a probability distribution it is necessary that $g(x) \ge 0,\ x \in [a, b]$, and $\int_a^b g(x)\,dx = 1$. Let, in particular, $[a, b]$ be such that in it $f(x) \ge 0$, too. This can always be ensured by subdividing the original integration interval into regions where the integrand does not change its sign and, where the integrand is negative, by assigning a negative sign to the final result.

In this way the distribution $g(x)$ can be chosen to resemble in shape the actual integrand, i.e. $C g(x) \cong f(x)$, where $C$ is a multiplier through which the requirement for a unit integral value of $g(x)$ is fulfilled. Thus, in contrast with the actual integrand (which in particular may consist of one or several peaks, so that only the functional values calculated in their close vicinities will have a significant contribution to the evaluated integral), the effective integrand $\varphi(x) = \frac{f(x)}{g(x)} \cong C$ will have a practically uniform contribution to the integral everywhere within the integration limits $[a, b]$.
As was already shown, an estimate of the integral $I_{ab}$ is the sample average $\bar{\varphi} = \frac{1}{N} \sum_{i=1}^{N} \varphi(x_i)$, where $x_i,\ i = 1, \dots, N$ is a sample from the distribution $g(x)$. It was also shown that the variance of $\bar{\varphi}$ is $\sigma^2(\bar{\varphi}) = \frac{\sigma^2(p)}{N}$, where $\sigma^2(p)$ is the variance of the probability distribution $p(\varphi)$ of $\varphi(x)$ (with a random argument $x$ the functional value $\varphi(x)$ is also random). In the considered case $\varphi(x) \cong C$ and $p(\varphi) \cong \delta(\varphi - C)$, where $\delta$ is the Dirac delta function, so that $\sigma^2(p) \cong 0$.

This is namely the desired effect of reducing the uncertainty of the estimated value of the integral. Of course, the extreme case of $\varphi(x) = C$ would lead to $\sigma^2(p) = 0$ and hence to zero uncertainty, but then it would be necessary that the normalising constant $C$ be precisely equal to the sought integral $I_{ab}$ (!) (this follows from $C g(x) = f(x)$ and the requirement for a unit integral of the probability distribution $g(x)$), and the entire stochastic procedure is rendered meaningless.
The general conclusion is that the application of variance reduction methods requires
the preliminary knowledge of an approximate solution to the integration problem, whereas the
role of the Monte Carlo procedure is reduced to a potentially small refinement of this solution.
This conclusion is also valid for the application considered below.
An example

Let the task be to evaluate the integral $I \equiv \int_0^{\pi/2} \cos x\,dx$. The exact result with which the numerical estimate can be compared is 1.0.
1. Let the probability distribution $g(x)$ be uniform between 0 and $\frac{\pi}{2}$, i.e. $g(x) = \frac{2}{\pi}$. The cumulative distribution function is $G(x) = \frac{2}{\pi} x$, and its inverse is $x = G^{-1}(\xi) = \frac{\pi}{2}\,\xi$. The integral estimate is $I \cong \bar{\varphi}$, where $\bar{\varphi} = \frac{1}{N} \sum_{i=1}^{N} \varphi(x_i)$ is the sample average and

$$\varphi(x) = \frac{f(x)}{g(x)} = \frac{\pi}{2} \cos x.$$

The respective estimate of the standard deviation of the result is

$$\sigma(\bar{\varphi}) = \sqrt{\frac{\overline{\varphi^2} - \bar{\varphi}^2}{N}}, \quad \text{where} \quad \overline{\varphi^2} = \frac{1}{N} \sum_{i=1}^{N} \varphi^2(x_i).$$

The arguments $x_i$ are sampled from the distribution $g(x)$ by the transformation method, i.e. $x_i = G^{-1}(\xi_i)$, $\xi_i \in U(0, 1)$.
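Case 1 can be sketched as follows (a minimal illustration of the formulas above; the value of N and the seed are arbitrary choices):

```python
import math
import random

random.seed(6)

N = 100_000
# Case 1: g(x) = 2/pi on [0, pi/2]; the transformation method gives
# x = G^{-1}(xi) = (pi/2)*xi, and phi(x) = f(x)/g(x) = (pi/2)*cos(x).
phi = [(math.pi / 2) * math.cos((math.pi / 2) * random.random())
       for _ in range(N)]

I_est = sum(phi) / N
sigma = math.sqrt((sum(p * p for p in phi) / N - I_est**2) / N)
print(I_est, sigma)
```

The estimate approaches 1.0 and the uncertainty estimate can be compared with the uniform row of Table 1 below.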
2. A better choice would be a distribution $g(x)$ which resembles in shape the integrand $f(x)$. Since the implementation of the transformation method for sampling from $g(x)$ is often a complicated task, $g(x)$ is chosen as a trade-off between the shape requirement and the ease of implementation of the transformation method. An example of such a trade-off might be

$$g(x) = \frac{1}{c} \exp(-x), \quad x \in \left[ 0, \frac{\pi}{2} \right],$$

where the denominator $c = 1 - \exp\left( -\frac{\pi}{2} \right)$ ensures normalisation to a unit integral of $g(x)$. This choice is by no means optimal and is justified only by the fact that $f(x)$ is a decreasing function within the integration limits. The integral is evaluated as $I \cong \bar{\varphi}$, where

$$\varphi(x) = \frac{f(x)}{g(x)} = c\,\exp(x)\,\cos x,$$

and the standard deviation of this result is estimated analogously to the previous case. The rule of sampling from $g(x)$, as per the transformation method, is $x_i = -\ln(1 - c\,\xi_i)$, where $\xi_i \in U(0, 1)$.
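Case 2 differs from Case 1 only in the sampling rule and in $\varphi(x)$; the resulting uncertainty is visibly smaller (again a sketch with arbitrary N and seed):

```python
import math
import random

random.seed(7)

N = 100_000
c = 1.0 - math.exp(-math.pi / 2)
# Case 2: g(x) = exp(-x)/c on [0, pi/2]; the transformation method
# gives x = -ln(1 - c*xi), and phi(x) = f(x)/g(x) = c*exp(x)*cos(x).
phi = []
for _ in range(N):
    x = -math.log(1.0 - c * random.random())
    phi.append(c * math.exp(x) * math.cos(x))

I_est = sum(phi) / N
sigma = math.sqrt((sum(p * p for p in phi) / N - I_est**2) / N)
print(I_est, sigma)
```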
Another choice of a probability distribution $g(x)$ can be e.g. the linear function

$$g(x) = \frac{1}{c} \left( a x + b \right), \quad x \in \left[ 0, \frac{\pi}{2} \right].$$

The constants $a$ and $b$ are chosen so that $g(x) > 0,\ x \in \left[ 0, \frac{\pi}{2} \right]$, with e.g. $g(0) = \frac{1}{c}$ and $g\!\left( \frac{\pi}{2} \right) = \frac{0.01}{c}$. The denominator

$$c = \frac{a}{2} \left( \frac{\pi}{2} \right)^2 + b\,\frac{\pi}{2}$$

ensures normalisation to a unit integral of $g(x)$. The cumulative distribution function is

$$G(x) = \frac{1}{c} \left( \frac{a}{2}\,x^2 + b x \right).$$

Its inversion is already not quite trivial, since it requires finding the roots of a second-degree polynomial and selecting the root which lies in the interval $\left[ 0, \frac{\pi}{2} \right]$. The result is

$$x = G^{-1}(\xi) = \frac{-b + \sqrt{b^2 + 2 a c\,\xi}}{a}.$$

The integral is evaluated as $I \cong \bar{\varphi}$, where

$$\varphi(x) = \frac{f(x)}{g(x)} = \frac{c \cos x}{a x + b},$$

and the standard deviation of this result is estimated analogously to the previous case. The rule of sampling from $g(x)$, as per the transformation method, is $x_i = G^{-1}(\xi_i)$, where $\xi_i \in U(0, 1)$.

This choice of $g(x)$ is only slightly more efficient than the previous one. It was made solely for the purpose of illustrating the potential difficulties of implementing the transformation method.
3. The above illustrated difficulties can be overcome through interpolating between a set of stored values G(x_i), i = 1, ..., M, in order to provide a quick and simple (although approximate) solution of the equation G(x) = ξ for x with a known ξ. Here it is convenient and appropriate to evaluate g(x), which is needed for the formation of φ(x), by means of an analogous interpolation procedure. This approach allows a great degree of freedom in the choice of g(x). In the considered case it was chosen to approximate f(x) = cos(x) at x ∈ [0, π/2] by g(x) = (1/c)(a + b exp(x)) with the additional condition that g(x) ≥ 0, x ∈ [0, π/2]. With a = 1.4 and b = −0.28457 the similarity between f(x) and g(x) is quite good. The constant c, as previously, ensures a unit integral of g(x) from 0 to π/2. The results below are obtained through piecewise linear interpolation.
With an arbitrary but fixed random sequence ξ_i ∈ U(0,1), i = 1, ..., N, the results are as follows.

Table 1. Monte Carlo evaluation of ∫₀^{π/2} cos(x) dx

                                                      N=10                N=100 000
  g(x)                                              I       σ           I       σ
  (1/c)               (uniform)                   0.8886  0.1527      0.9984  0.0015
  (1/c) exp(−x)       (exponential)               1.0449  0.0490      0.9996  0.0007
  (1/c)(ax + b)       (linear)                    1.0239  0.0431      1.0004  0.0004
  (1/c)(a + b exp(x)) (approximation,
                       interpolation)             0.9948  0.0132      1.0035  0.0002
N is the size of the sample from the uniform distribution, which in the first case is used directly and in all other cases is employed for generating a sample from g(x). For comparability, all results with a given N are produced using the same sample. As should be expected, with large N all results converge to the same correct estimate. The relative advantage of choosing non-uniform distributions g(x) which in a certain sense resemble the integrand is also seen. It must be noted that with a very good approximation (e.g. similar to the attempt in
the last case) the numerical integration (by Monte Carlo, or any other method) can successfully be replaced by an analytical integration of the approximating function.
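As a cross-check, the uniform and exponential rows of Table 1 can be reproduced with a short sketch (a minimal illustration with our own function names and seed, not the original course code; the exact numbers depend on the random sequence used):

```python
import math
import random

def mc_uniform(n, rng):
    # g(x) = 1/c, uniform on [0, pi/2]; c = pi/2
    c = math.pi / 2
    total = total_sq = 0.0
    for _ in range(n):
        x = c * rng.random()
        phi = c * math.cos(x)                   # phi(x) = f(x)/g(x)
        total += phi
        total_sq += phi * phi
    mean = total / n
    sigma = math.sqrt((total_sq / n - mean * mean) / n)
    return mean, sigma

def mc_exponential(n, rng):
    # g(x) = exp(-x)/c on [0, pi/2]; c = 1 - exp(-pi/2)
    c = 1.0 - math.exp(-math.pi / 2)
    total = total_sq = 0.0
    for _ in range(n):
        x = -math.log(1.0 - c * rng.random())   # transformation method
        phi = c * math.exp(x) * math.cos(x)     # phi(x) = f(x)/g(x)
        total += phi
        total_sq += phi * phi
    mean = total / n
    sigma = math.sqrt((total_sq / n - mean * mean) / n)
    return mean, sigma

rng = random.Random(1)
print(mc_uniform(100_000, rng))
print(mc_exponential(100_000, rng))
```

The shape-matched exponential g(x) yields a visibly smaller σ for the same sample size, as in Table 1.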
Figure 2. Monte Carlo evaluation of ∫₀^{π/2} cos(x) dx: the integrand and the probability distributions g(x) (curves: 1: cos(x); 2: const.; 3: exp(−x); 4: 1 − ax; 5: a + b·exp(x)).
Monte Carlo for particle transport problems
In its simplest form this MC method consists in tracking the histories of a finite number of particles (N) while accounting for the stochastic nature of the events during a history. This implies sampling from various probability distributions, e.g. of the scattering angle, the free path to a collision, etc.
The procedure in the particular case of a stationary problem with an external source in a non-multiplying medium would be as follows. A history is started through sampling a random set of coordinates, an initial energy and a direction from some known probability distributions which characterise the source. After determining the free path to the first collision, again as a random number with a known probability distribution, the collision location is found and the material zone to which this location belongs is identified. In a random way, based on data about the total cross-sections interpreted as a probability distribution, the target nuclide and the reaction type (absorption or scattering) are determined. In the case of absorption the particle history is terminated. In the opposite case (scattering), the direction of the scattered particle is sampled from the probability distribution of the scattering angle, and its energy is determined uniquely from the requirement to conserve the energy and momentum of the system (this would be so with elastic scattering, whereas with inelastic scattering the energy is also sampled from a respective probability distribution). The history of the scattered particle is tracked analogously for the subsequent collisions and is terminated in the case of absorption or of leaving the problem boundaries.
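The history loop just described can be sketched for the simplest possible configuration, a monoenergetic particle in a one-dimensional slab with isotropic scattering (all names and parameter values here are illustrative choices, not taken from the notes):

```python
import math
import random

def slab_history(rng, sigma_t, sigma_a, thickness):
    """Track one particle born at x = 0 travelling into a 1-D slab.
    Returns True if it leaks through the far face (x > thickness)."""
    x, mu = 0.0, 1.0                                # position, direction cosine
    while True:
        path = -math.log(1.0 - rng.random()) / sigma_t  # free path to collision
        x += mu * path
        if x < 0.0:
            return False                            # leaked back out of the slab
        if x > thickness:
            return True                             # transmitted
        if rng.random() < sigma_a / sigma_t:
            return False                            # absorbed: history terminated
        mu = 2.0 * rng.random() - 1.0               # isotropic scattering direction

rng = random.Random(7)
n = 50_000
transmitted = sum(slab_history(rng, 1.0, 0.5, 2.0) for _ in range(n))
print(transmitted / n)                              # transmission probability estimate
```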
Let the purpose of the described procedure be to estimate the mathematical expectation of a given quantity, e.g. a flux, current, reaction rate, etc. This quantity usually matches none of the random microscopic parameters of the particle histories (scattering angles, locations, free paths between successive collisions, etc.).
Thus the question arises of what statistic of which of those parameters shall be used as an estimator for the desired quantity. Let, for example, the desired quantity be the scalar flux Φ.
Further, if c = V Σ_t Φ is the average number of collisions in some volume V, then the mean value (the expectation) of the scalar flux in this volume will be Φ = c/(V Σ_t), where Σ_t is the macroscopic total cross-section. A sample estimate of this quantity will be Φ ≈ c̄/(V Σ_t), where

c̄ = (1/N) Σ_{n=1}^{N} c_n    (1)

is the sample average of the number of collisions in volume V per unit time, and c_n is the respective number of collisions contributed by the n-th history. Since usually the flux of particles with a certain energy (i.e. with energy within the interval [E, E + dE]) is sought, the contribution of an individual history will comprise only the events at which the particle energy has been in the desired interval. Similarly, if an estimate of the directional flux is sought, then only the events caused by particles with directions of travel within a given solid angle are accounted.
On the other hand, the scalar flux is defined as Φ ≡ v N_p, where v is the particle velocity and N_p is their number in a given phase volume. Based on this, the sample estimate of the flux will be Φ ≈ l̄/V, where

l̄ = (1/N) Σ_{n=1}^{N} l_n    (2)

is the average track length of the particles in volume V. The track length contributed by the n-th history is l_n = Σ_{i=1}^{I} v_i, where I is the number of traversals of volume V per unit time having occurred while following the n-th history, and v_i is the particle velocity during each traversal. Here also the contribution l_n typically comprises only the events at which the energy (and/or the direction of travel) of the particle has been within the desired interval. This scalar flux estimator is known as "track type" because the respective events resemble those in a track detector.
Since the number of collisions in a given volume is expectedly smaller than the number of crossings of this volume, the statistic (2) will be more stable than (1) and is usually preferred.
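The two estimators can be compared in a deliberately stripped-down setting: monoenergetic particles start at x = 0 and travel in the +x direction through a purely absorbing medium, so each history has exactly one collision (a toy sketch with our own parameter choices; the "volume" is the segment [a, b] of unit cross-sectional area):

```python
import math
import random

def estimators(n, a, b, sigma_t, rng):
    """Collision estimator (1) and track-length estimator (2) of the
    mean scalar flux in the segment [a, b], pure absorption, Sigma_t."""
    coll_sum = track_sum = 0.0
    for _ in range(n):
        d = -math.log(1.0 - rng.random()) / sigma_t  # site of the only collision
        if a < d < b:
            coll_sum += 1.0                          # collision inside the volume
        if d > a:
            track_sum += min(d, b) - a               # track length inside the volume
    vol = b - a
    flux_coll = coll_sum / n / (vol * sigma_t)       # c / (V Sigma_t)
    flux_track = track_sum / n / vol                 # l / V
    return flux_coll, flux_track

rng = random.Random(3)
print(estimators(200_000, 0.5, 1.5, 1.0, rng))
# both approach the exact mean flux (exp(-0.5) - exp(-1.5)) / (b - a)
```

Both estimators are unbiased here, while the track-length statistic fluctuates less, in line with the remark above.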
From the above examples it can be seen that the useful result from Monte Carlo simulations of transport phenomena is estimated as a sample average

x̄ = (1/N) Σ_{n=1}^{N} x_n,    (3)

where x_n is the contribution of the n-th history to the estimated quantity.
From those examples it is also seen that there are two extreme types of estimators, the statistics of which are used for evaluating the desired physical quantity: parameters with a binomial distribution, with which the contributions of the individual particles to the estimated result are either 0 or 1 (e.g. the number of collisions of a particle with a given energy while crossing a given volume), and parameters with all histories having non-zero contributions to the sought estimate (e.g. a "track type" statistic in a large volume for a wide energy range).
Estimators with a binomial distribution are almost never used in practice, but it is also impossible to find an estimator with which all histories would have a non-zero contribution to the estimated result. Most often the probability distribution of the contribution of the chosen estimator x to the evaluated result is of the form:

f(x) = c δ(x − 0) + g(x),    (4)

where δ is the Dirac delta function.
The delta-function term in (4) is due to the histories with no contribution to the estimated result, whereas g(x) describes the distribution of the non-zero contributions of particle histories. A similar distribution is observed e.g. in shielding problems, where a large relative share c of particles cannot penetrate the shield and correspondingly cannot contribute to the flux on its outer surface.
The expectation x̄ = ∫ x f(x) dx = ∫ x g(x) dx is not affected by the contribution of the δ-function. This is so because of the general property ∫ h(x) δ(x − 0) dx = h(0). With a correct normalisation, the sample average is also not biased by the zero-contribution histories, i.e. by the presence of c δ(x − 0) in the distribution (4).
This is unfortunately not the case with the variance:

σ_f² = ∫ (x − x̄)² f(x) dx = c ∫ (x − x̄)² δ(x − 0) dx + ∫ (x − x̄)² g(x) dx = σ_δ² + σ_g²,    (5)

where σ_δ² = c x̄² is attributable to the inefficiency of the MC procedure, i.e. to the fact that not all particles (histories) have a contribution to the estimated result, and σ_g² is the variance inherent to the distribution of the estimators with a non-zero contribution to this result.
With the above illustrated physically realistic approach to modelling the particle histories, usually referred to as "analogue MC", the relative share c of non-productive histories can be very high and may lead to an unacceptable increase of the variance of the estimated quantity. This is so because the variance of the sample average (through which the useful result is evaluated) is σ_x̄² = σ_f²/N.
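The decomposition (5) can be checked numerically. In the sketch below the non-zero contributions are taken exponential with unit mean (an arbitrary illustrative choice), for which x̄ = 1 − c and, combining the two terms of (5), σ_f² = (1 − c)(1 + c) exactly:

```python
import random

def moments(n, c, rng):
    """Sample mean and variance of contributions drawn from
    f(x) = c*delta(x) + g(x), g = (1 - c) * Exp(1) density."""
    total = total_sq = 0.0
    for _ in range(n):
        x = 0.0 if rng.random() < c else rng.expovariate(1.0)
        total += x
        total_sq += x * x
    mean = total / n
    return mean, total_sq / n - mean * mean

rng = random.Random(11)
c = 0.6
mean, var = moments(400_000, c, rng)
print(mean, var)   # close to 1 - c = 0.4 and (1 - c)(1 + c) = 0.64
```

The empty histories leave the mean unbiased but visibly inflate the variance, exactly as (5) states.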
Variance reduction methods
Different methods, formally expressed in some departure from physical reality in the course of modelling particle histories, are employed for reducing the variance while observing the obligatory requirement to preserve an unbiased estimate of the sought result. The common designation of any procedures which involve such variance reduction methods is "non-analogue MC".
Similarly to the conclusions made in relation to function integration, a fruitful approach to the present task would be to sample the initial phase coordinates of the particles predominantly from those regions where the contribution of particle histories to the estimated quantity is larger. That is, to sample the initial phase coordinates from a probability density function which is proportional to the contribution of particles with such initial coordinates to the estimated quantity. The desired distribution can be constructed as described below.
The directional flux is a solution of the following equation (in the particular case of a stationary system with a fixed source):

Lφ + Q = 0,    (6)

The boundary condition for (6) is

φ(r_s, Ω, E) = 0,  Ω·ds < 0,    (6a)

at all positions on the convex outer surface of the problem.
In (6) Q = Q(r, Ω, E), and the operator L is such that:

Lφ = −Ω·∇φ(r, Ω, E) − Σ_t(r, E) φ(r, Ω, E) + ∫∫ Σ_n(r, Ω′→Ω, E′→E) φ(r, Ω′, E′) dΩ′ dE′,    (7)

where the cross-section Σ_n characterises jointly the scattering and the fission sources (the latter in the case of neutron transport in a multiplying medium).
This operator has its adjoint counterpart L⁺, defined as:

L⁺φ⁺ = +Ω·∇φ⁺(r, Ω, E) − Σ_t(r, E) φ⁺(r, Ω, E) + ∫∫ Σ_n(r, Ω→Ω′, E→E′) φ⁺(r, Ω′, E′) dΩ′ dE′    (8)

Let the virtual detector in the MC simulation be described through its response cross-section Σ_d(r, Ω, E). (It is clear that outside the detector this cross-section is zero.)
Let the so-called adjoint flux φ⁺ be a solution of the equation

L⁺φ⁺ + Σ_d = 0.    (9)
The boundary condition for (9) is

φ⁺(r_s, Ω, E) = 0,  Ω·ds > 0,    (9a)

at all positions on the convex outer surface of the problem.
The adjoint operator L⁺ has the following property:

∫∫∫ d³r ∫ dΩ ∫ dE φ⁺ (Lφ) = ∫∫∫ d³r ∫ dΩ ∫ dE φ (L⁺φ⁺)    (10)

This property can be proved as follows.
a) For the second terms in (7) and (8):

∫∫∫ d³r ∫ dΩ ∫ dE φ⁺ [−Σ_t(r, E) φ(r, Ω, E)] = ∫∫∫ d³r ∫ dΩ ∫ dE φ [−Σ_t(r, E) φ⁺(r, Ω, E)]    (10a)
b) For the third terms in (7) and (8):

∫∫∫ d³r ∫ dΩ ∫ dE φ⁺(r, Ω, E) ∫∫ Σ_n(r, Ω′→Ω, E′→E) φ(r, Ω′, E′) dΩ′ dE′
= ∫∫∫ d³r ∫ dΩ′ ∫ dE′ φ(r, Ω′, E′) ∫∫ Σ_n(r, Ω′→Ω, E′→E) φ⁺(r, Ω, E) dΩ dE
= ∫∫∫ d³r ∫ dΩ ∫ dE φ(r, Ω, E) ∫∫ Σ_n(r, Ω→Ω′, E→E′) φ⁺(r, Ω′, E′) dΩ′ dE′,    (10b)

where the order of integration has been exchanged and the primed and unprimed integration variables renamed.
c) For the first terms in (7) and (8):
Let

Δ ≡ ∫∫∫ d³r ∫ dΩ ∫ dE [φ Ω·∇φ⁺(r, Ω, E) + φ⁺ Ω·∇φ(r, Ω, E)]    (10c)

be the difference between the first term in (8) and the first term in (7).
The integrand in (10c) can be represented as:

φ Ω·∇φ⁺ + φ⁺ Ω·∇φ
= φ (Ω_x ∂φ⁺/∂x + Ω_y ∂φ⁺/∂y + Ω_z ∂φ⁺/∂z) + φ⁺ (Ω_x ∂φ/∂x + Ω_y ∂φ/∂y + Ω_z ∂φ/∂z)
= Ω_x ∂(φφ⁺)/∂x + Ω_y ∂(φφ⁺)/∂y + Ω_z ∂(φφ⁺)/∂z
= ∇·(Ω φφ⁺)    (10d)
After applying the divergence theorem (Gauss's theorem, or Ostrogradsky's theorem):

Δ = ∫ dE ∫ dΩ ∫∫∫ ∇·(Ω φφ⁺) d³r = ∫ dE ∫ dΩ ∮ (Ω·ds) φφ⁺(r_s, Ω, E)    (10e)

From the last results and from the boundary conditions (6a) for φ and (9a) for φ⁺ it follows that Δ = 0.
After multiplying the forward transport equation by φ⁺, the adjoint one by φ, and after integrating over all phase variables, the following equality is obtained:

R ≡ ∫∫∫ d³r ∫ dΩ ∫ dE [Σ_d(r, Ω, E) φ(r, Ω, E)] = ∫∫∫ d³r ∫ dΩ ∫ dE [φ⁺(r, Ω, E) Q(r, Ω, E)],    (11)

where R is the detector response.
The choice of a source Q is in principle arbitrary. Let, in particular:

Q(r, Ω, E) = δ(r − r₀) δ(Ω − Ω₀) δ(E − E₀)    (12)

Then:

φ⁺(r₀, Ω₀, E₀) = R(r₀, Ω₀, E₀),    (13)

where R(r₀, Ω₀, E₀) = ∫∫∫ d³r ∫ dΩ ∫ dE [Σ_d(r, Ω, E) φ(r, Ω, E)] is the detector response caused by the flux arising from the unit source (12).
Therefore the adjoint flux, which is a solution of (9), has the meaning of a distribution of the importance of the source particles with respect to the invoked detector response.
Of course, the task of solving (9) and convolving the source with the obtained solution is tantamount to the effort of achieving an exact solution of the transport problem, which renders the MC simulation exercise worthless.
The practical usefulness of the described approach consists in employing an approximate estimate of the importance function, evaluated through solving a simplified form of equation (9), e.g. after reducing the dimensionality, homogenisation, simplifying the energy dependence, etc. If such an approximate estimate of the importance function is supplied, then the phase coordinates of a tracked particle (both the initial ones and those after a collision or fission event, the latter in a multiplying medium) can be sampled from the distribution φ⁺(r, Ω, E) (obtained from the original importance function after proper normalisation), at which the contribution of this particle to the useful statistic (i.e. the detector response) must be multiplied by the weight factor w_{n,i} = 1/φ⁺_{n,i}, referring to the n-th history after the i-th event in
the process of its tracking. The effective result is a replacement of the distribution f(x) in (4) by a distribution f̃(x) for which σ_f̃² < σ_f².
Often the described systematic approach, known as importance sampling, is substituted or augmented by heuristic techniques with the same general property, namely:
1) Instead of sampling from the "natural" distribution f, a sample from the modified distribution f̃ is drawn.
2) A weight function w(x) = f(x)/f̃(x) is introduced, through which an unbiased estimate of the sought mean value is obtained:

x̄ = ∫ x w(x) f̃(x) dx.

Thus, instead of accumulating the statistic x̄ (3) from a sample from the distribution f(x), a statistic

x̄_w = (1/N) Σ_n w(x_n) x_n

is formed from a sample w₁x₁, ..., w_n x_n, ... from the distribution f̃(x). Both statistics have the mathematical expectation x̄.
Of course, the purpose of the described substitution is that the modified estimator w(x)x will have a smaller variance than the original estimator x.
Some of the standard variance reduction techniques will be discussed below. Here it should only be mentioned that, in the context of (4) and (5), the purpose of non-analogue MC consists in increasing the share of particles (histories) with a non-zero contribution to the desired statistic and in reducing the variance of the distribution of these contributions, while the accumulated statistic remains unbiased with respect to the analogue case.
Implicit absorption
With analogue MC, after choosing the collision location, a random number ξ ∈ U(0,1) is generated and is compared to the ratio Σ_a/Σ_t. If ξ < Σ_a/Σ_t, then absorption is assumed and the particle history is terminated. In the opposite case it is assumed that the collision event has resulted in scattering. A new energy and travel direction are sampled for the scattered particle and the tracking of its history is continued. According to the currently considered approach, however, the particle history is not terminated because of absorption. Instead, with non-analogue MC the weight associated with the particle after the i-th collision is reduced so as to correspond to the probability of escaping absorption:

w_{n,i+1} = w_{n,i} × (1 − Σ_a/Σ_t)

This procedure raises the computational expense, since the total number of collisions increases, but it can achieve a significant reduction of the variance by decreasing the share of histories with a zero contribution to the estimated quantity. The total efficiency of the method can increase, provided that a suitable criterion for terminating a history is introduced.
Russian roulette for the termination of histories
It is clear that with implicit absorption and a large system size (i.e. relatively low leakage) the efficiency of accumulating the useful statistics will be low, because after a sufficient number of collisions the particle weight will drop to an insignificant level. One of the possible solutions to this problem is to terminate the history when the weight passes a certain low threshold. This, however, will result in underestimating the statistics and may bias the sought quantities. (In shielding problems, for example, a physically justified measure of the importance of a history is the energy of the slowing-down particle. Insofar as only particles with an energy above a certain threshold will have a contribution to the evaluated quantity (e.g. the dose rate), the history can be terminated after a scattering event to an energy below this threshold without biasing the dose rate estimate.)
A universal and reliable method of terminating a particle history, which does not bias the estimates of the sought quantities, is the so-called Russian roulette. Following this approach, first the particle weight is checked for dropping below a given threshold. If so, a random number ξ ∈ U(0,1) is generated and compared with 1/Ξ, where Ξ is a constant, typically between 2 and 10. If ξ > 1/Ξ, then the history is terminated. In the opposite case its tracking is continued, but for preserving an unbiased estimate the particle weight is multiplied by Ξ. This method increases the variance but decreases the time for tracking a history. With an appropriate choice of Ξ an overall increase of the efficiency of the MC procedure can be achieved.
Russian roulette for particle splitting
This method is applied to particles with a weight which has grown above a certain threshold. An obvious cause for a weight increase could be the use of an importance function φ⁺ for the formation of weights.
The implementation is analogous to the previous case, except that here Ξ functions as a multiplication constant. If for a given particle such multiplication occurs, each of the Ξ daughter particles is assigned a weight equal to 1/Ξ of the weight of the cloned parent particle.
Additional comments on the Russian roulette technique
The Russian roulette for terminating particles with low weights increases the share of the delta function at x = 0 in the distribution f(w(x)x) (4) (because of the terminated histories), as well as the values of the continuous distribution g(w(x)x) at large w(x)x, far above its mathematical expectation (due to the relative increase of the weights of the non-terminated histories). Thus the overall variance of f(w(x)x) increases. Conversely, the Russian roulette for splitting of particles with high weights increases the values of g(w(x)x) at smaller w(x)x, nearer its mathematical expectation. Thus the overall variance of w(x)x decreases, however at the expense of an overhead computational effort for tracking the additional histories.
The importance sampling approach is often implemented in conjunction with Russian roulette through the introduction of so-called weight windows. A weight window is typically centred at the importance value φ⁺(r, Ω, E) (in this context referred to as a target weight) and is assigned some chosen width. The current particle weight is compared with the weight window boundaries. If this current weight is below the lower weight window boundary, a Russian roulette procedure for terminating the history is invoked. If it is above the upper boundary, a Russian roulette for particle splitting is applied. This replaces the need for sampling from modified (biased) probability distributions for the new particle's parameters after a collision event, with the task of sampling from a biased source distribution still retained.
Path stretching
This technique is used as an alternative to particle splitting for deep-penetration problems and consists in increasing the free path between successive collisions for particles travelling in a direction which is in some way preferred (i.e. important for the accumulation of the useful statistics). Let, for example, this be the positive direction of the Ox axis. Then the total cross-section Σ_t is replaced with:

Σ_ex = Σ_t (1 − p Ω·e_x),

where e_x is the unit vector of Ox, and p, 0 ≤ p < 1, is some chosen path stretching parameter.
It is clear that this non-physical free path elongation must be compensated through multiplying the particle weight by a suitable factor w_ex in order to conserve the actual reaction rates. Since the collision probability over a distance between u and u + du along the particle's travel trajectory is Σ_t exp(−Σ_t u) du, for preserving the actual value of this probability it is necessary that:

Σ_t exp(−Σ_t u) du = w_ex(u) Σ_ex exp(−Σ_ex u) du

Therefore:

w_ex(u) = [Σ_t exp(−Σ_t u)] / [Σ_ex exp(−Σ_ex u)]
        = [Σ_t exp(−Σ_t u)] / [Σ_t (1 − p Ω·e_x) exp(−Σ_t (1 − p Ω·e_x) u)]
        = exp(−p Σ_t (Ω·e_x) u) / (1 − p Ω·e_x)
Forced collisions
This correction is especially useful if e.g. a reaction rate in some small volume has to be estimated. It is clear that the accumulated statistic will improve if the number of collisions in this volume can be artificially increased, however without biasing the sought estimate. The forced collisions technique consists in splitting a particle into two particles with lower weights, then letting the first one traverse the volume while enforcing a collision for the other one. Let the weight of the initial particle be w and its forthcoming path through the volume be u. The collision escape probability for this initial particle will be exp(−Σ_t u). When this particle is split in two, the one which is let to traverse the volume is assigned a weight w_e = w exp(−Σ_t u), whereas the one for which a collision is enforced gets a weight w_c = w (1 − exp(−Σ_t u)). The history of the particle with weight w_e is continued in the standard way, i.e. as it would be for the initial particle, while for the particle with weight w_c a collision location within the volume is sought as follows. The probability distribution of the free path 0 ≤ x ≤ u before a collision in the volume is:

f(x) = Σ_t exp(−Σ_t x) / ∫₀^u Σ_t exp(−Σ_t x′) dx′ = Σ_t exp(−Σ_t x) / (1 − exp(−Σ_t u)).

According to the transformation method, the random number with the desired distribution will be:

x = −(1/Σ_t) ln[1 − ξ (1 − exp(−Σ_t u))],

where ξ ∈ U(0,1). The history of this particle after the forced collision is followed in the standard way.
Although this method may cause the emergence of particles with low weights, no Russian roulette for terminating the history within the volume is applied.
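Sampling the forced-collision site with the inverse-CDF formula above can be checked against the analytic mean of the truncated exponential distribution (a sketch with arbitrary parameter values):

```python
import math
import random

def forced_collision_site(u, sigma_t, rng):
    """Free path of the forced-collision particle, 0 <= x <= u."""
    xi = rng.random()
    return -math.log(1.0 - xi * (1.0 - math.exp(-sigma_t * u))) / sigma_t

sigma_t, u, n = 1.0, 1.0, 200_000
rng = random.Random(4)
xs = [forced_collision_site(u, sigma_t, rng) for _ in range(n)]
assert all(0.0 <= x <= u for x in xs)   # every site lies inside the volume
# analytic mean of the exponential distribution truncated to [0, u]
exact = 1.0 / sigma_t - u * math.exp(-sigma_t * u) / (1.0 - math.exp(-sigma_t * u))
print(sum(xs) / n, exact)
```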
Application: integral form of the neutron transport equation
The purpose of this example is to show that the general Monte Carlo approach of sampling from chosen probability distributions can be regarded as a method of solving integral, and hence differential, equations.
The neutron transport equation has the form:

(1/v) ∂φ(r, Ω, E, t)/∂t + Ω·∇φ(r, Ω, E, t) + Σ_t(r, E, t) φ(r, Ω, E, t)
= S(r, Ω, E, t) + ∫_{E′} ∫_{Ω′} Σ_s(r, Ω′→Ω, E′→E, t) φ(r, Ω′, E′, t) dΩ′ dE′    (14)

The total source, which combines the external, scattering and fission sources, is:

q(r, Ω, E, t) = S(r, Ω, E, t) + ∫_{E′} ∫_{Ω′} Σ_s(r, Ω′→Ω, E′→E, t) φ(r, Ω′, E′, t) dΩ′ dE′    (15)
With the assumption of an isotropic transport medium, neglecting the time dependence of the cross-sections and introducing energy groups, the neutron transport equation becomes:

(1/v_g) ∂φ_g(r, Ω, t)/∂t + Ω·∇φ_g(r, Ω, t) + Σ_{t,g}(r, t) φ_g(r, Ω, t) = q_g(r, Ω, t),    (16)
where

Σ_{t,g}(r, t) = [∫_{ΔE_g} Σ_t(r, E) φ(r, Ω, E, t) dE] / [∫_{ΔE_g} φ(r, Ω, E, t) dE]    (17)
The representation

r = r′ + uΩ,    (18)

is introduced, where r is an arbitrary but fixed observation point, r′ is some chosen starting point along the linear trajectory of free flight of neutrons with a travel direction Ω (e.g. the point of the last collision, after which a neutron emerges with energy in group g and travel direction Ω), and u is the free path length from the starting point r′ to the observation point r.
Figure 3. A geometric representation for constructing the integral form of the neutron transport equation.
Then, after leaving out the time dependence and accounting that Ω·∇φ_g(r, Ω) is a derivative along Ω, (16) can be rewritten as follows (here and below the group index g will be omitted for simplicity):

dφ(r′ + uΩ, Ω)/du + Σ_t(r′ + uΩ) φ(r′ + uΩ, Ω) = q(r′ + uΩ, Ω),    (19)
Further it will be more convenient to represent the spatial dependence in terms of the observation point r. Formally, based on (19), this reduces to a change of the independent variable, R ≡ −u, from which it follows that d/du = −d/dR, and to substituting the notation r′ by r. These changes are equivalent to following the particle trajectories from the point of observation back to the point of emergence:

−dφ(r − RΩ, Ω)/dR + Σ_t(r − RΩ) φ(r − RΩ, Ω) = q(r − RΩ, Ω),    (20)

Or, if the arbitrary but fixed parameters r and Ω are omitted:

−dφ(R)/dR + Σ_t(R) φ(R) = q(R)    (21)
Such an ordinary differential equation can be converted to an integral form by applying the integrating factor:

μ(R) = exp[−T(R; r, Ω)],    (22)

where:

T(R; r, Ω) = ∫₀^R Σ_t(r − R′Ω) dR′    (23)

Namely:

−d[μ(R)φ(R)]/dR = μ(R) [−dφ(R)/dR + Σ_t(r − RΩ) φ(R)] = μ(R) q(R)    (24)

(The derivative of the integrating factor is obtained according to the differentiation rule for a definite integral:

d/dx ∫_{a(x)}^{b(x)} F(x, x′) dx′ = F(x, b(x)) db/dx − F(x, a(x)) da/dx + ∫_{a(x)}^{b(x)} ∂F(x, x′)/∂x dx′.)
The integration of (24) in limits from 0 to R leads to:

−∫₀^R {d[μ(R′)φ(R′)]/dR′} dR′ = ∫₀^R μ(R′) q(R′) dR′

μ(0)φ(0) − μ(R)φ(R) = ∫₀^R μ(R′) q(R′) dR′

φ(0) = μ(R)φ(R) + ∫₀^R μ(R′) q(R′) dR′, i.e.

φ(r, Ω) = φ(r − RΩ, Ω) exp[−∫₀^R Σ_t(r − R′Ω) dR′] + ∫₀^R q(r − R′Ω, Ω) exp[−∫₀^{R′} Σ_t(r − R″Ω) dR″] dR′    (25)

(Here the fact that μ(0) = 1 is used.)
If on physical grounds it is assumed that for all r and Ω

lim_{R→∞} φ(r − RΩ, Ω) exp[−∫₀^R Σ_t(r − R′Ω) dR′] = 0,

then the final form of the integral formulation of the neutron transport equation is:

φ(r, Ω) = ∫₀^∞ q(r − RΩ, Ω) exp[−T(R; r, Ω)] dR    (26)
The neglected time dependence supposes a conditional-criticality formulation of the fission source. Also, in correspondence with the most frequently solved conditional criticality problems, it is assumed that there is no external source. Under these conditions and with restoring of the group index g, the neutron source has the form:

q_g(r, Ω) = Σ_{g′} ∫_{4π} Σ_{g′→g}(r, Ω′·Ω) φ_{g′}(r, Ω′) dΩ′ + (χ_g/4π) (1/k) Σ_{g′} νΣ_{f,g′}(r) ∫_{4π} φ_{g′}(r, Ω′) dΩ′
Thus the integral equation of neutron transport (26) takes the form:

φ_g(r, Ω) = ∫₀^∞ dR exp[−T_g(R; r, Ω)] { (χ_g/4π) (1/k) Σ_{g′} νΣ_{f,g′}(r − RΩ) ∫_{4π} φ_{g′}(r − RΩ, Ω′) dΩ′ + Σ_{g′} ∫_{4π} Σ_{g′→g}(r − RΩ, Ω′·Ω) φ_{g′}(r − RΩ, Ω′) dΩ′ }    (27)
A simplified and generalised formulation of such an integral equation (a homogeneous Fredholm equation of the second kind) is:

f(t) = (1/λ) ∫_a^b K(t, s) f(s) ds + ∫_a^b L(t, s) f(s) ds    (28)
The numerical approach to solving such an equation is based on approximating the integrals with quadrature formulae, which are always of the form ∫_a^b φ(x) dx ≅ Σ_{j=1}^N w_j φ_j, where φ_j ≡ φ(x_j). Or:

f(t) = (1/λ) Σ_{j=1}^N w_j K(t, s_j) f(s_j) + Σ_{j=1}^N w_j L(t, s_j) f(s_j)    (29)

If the right-hand side is evaluated at the quadrature nodes, then for the function values f_i ≡ f(s_i), i = 1, ..., N, the following linear system is obtained:

f_i = (1/λ) Σ_{j=1}^N K_ij f_j + Σ_{j=1}^N L_ij f_j,  i = 1, ..., N,  where K_ij ≡ w_j K(s_i, s_j) and L_ij ≡ w_j L(s_i, s_j)    (30)

Or, in matrix-vector notation:

f = (1/λ) K f + L f  ⇒  A f = λ f,  where A = (1 − L)⁻¹ K    (31)

This is a standard eigenproblem, and well-developed methods exist for its solution. In order to evaluate the integral equation solution f(t) at arguments different from the quadrature nodes, expression (29) is directly applied.
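The discretisation (29)-(31) combined with power iteration for the maximum eigenvalue can be sketched compactly; the test kernel below is a hypothetical choice (separable, with L = 0), for which the only non-zero eigenvalue is known in closed form, ∫₀^{π/2} cos²s ds = π/4:

```python
import numpy as np

def fredholm_max_eig(K, L, a, b, n):
    """Discretise f = (1/lambda) K f + L f on [a, b] with the
    trapezoidal rule and find the maximum eigenvalue of
    A = (I - L)^(-1) K by power iteration, as in (29)-(31)."""
    s, h = np.linspace(a, b, n, retstep=True)
    w = np.full(n, h)
    w[0] = w[-1] = h / 2                              # trapezoidal weights
    Kmat = w[None, :] * K(s[:, None], s[None, :])     # K_ij = w_j K(s_i, s_j)
    Lmat = w[None, :] * L(s[:, None], s[None, :])     # L_ij = w_j L(s_i, s_j)
    A = np.linalg.solve(np.eye(n) - Lmat, Kmat)
    f = np.ones(n)
    lam = 0.0
    for _ in range(200):                              # power iteration
        g = A @ f
        lam = np.linalg.norm(g)
        f = g / lam
    return lam, f

# hypothetical test kernel: L = 0 and K(t, s) = cos(t) cos(s) on [0, pi/2]
lam, f = fredholm_max_eig(lambda t, s: np.cos(t) * np.cos(s),
                          lambda t, s: 0.0 * (t * s),
                          0.0, np.pi / 2, 201)
print(lam)   # close to pi/4
```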
The representation (28) can be viewed as an invitation to apply an accurate and economical method of numerical integration. In this context an especially good choice would be the above described Monte Carlo approach, according to which the quadrature nodes (the argument values of the integrand) are chosen predominantly in the regions where the contribution of the integrand to the evaluated integral is higher.
Applied to the considered physical problem, this would require sampling the phase space with a density distribution which is proportional to the probability of populating with neutrons the point of observation (i.e. energy group g, travel direction Ω and position r). Insofar as the observation points also span the entire phase space (or a large portion thereof), the above condition is equivalent to the requirement to follow the histories of individual particles through sampling of new phase coordinates after each collision in correspondence with the natural probability distribution of these coordinates. This distribution is expressed through the respective macroscopic cross-sections.
An alternative formulation of this conclusion is that if the purpose is to obtain a solution of the transport equation everywhere in the phase space, an analogue Monte Carlo simulation is required. As will be demonstrated below, this is the case when searching for the effective multiplication factor – except for the possible introduction of implicit absorption and of an importance function proportional to the probability of causing fission. Of course, if a fixed-source problem is solved for the reaction rates (the flux) in restricted regions of the phase space (virtual detectors), then the correct choice is non-analogue MC with suitable variance reduction techniques.
The analysis of the conditionally critical formulation of the homogeneous neutron transport equation for a multiplying medium shows that the effective multiplication factor k is the maximum eigenvalue of the integral equation (27). With the conventional introduction of "neutron generations", which corresponds to the power method of finding the maximum eigenvalue, the effective multiplication factor can be regarded as a ratio of the number of neutrons in the (n+1)-th generation to the number of neutrons in the n-th generation. This way of estimating k in practice reduces to iterations on the fission neutron source.
Thus, after introducing an indexing of the generations (i.e. of the iterations), multiplying and dividing some terms by $\Sigma_{tg'}$ and multiplying both sides of equation (27) by $\Sigma_{tg}(\mathbf{r})$, it can be written in the form:

$$ \psi_g^{(n)}(\mathbf{r},\mathbf{\Omega}) = \int_0^{\infty} dR\, \exp\left[-T(\mathbf{r},R;\mathbf{\Omega})\right] \Sigma_{tg}(\mathbf{r}) \left\{ \frac{\chi_g}{4\pi}\,\frac{1}{k} \sum_{g'} \int_{4\pi} d\mathbf{\Omega}'\, \frac{\nu\Sigma_{g'}(\mathbf{r}-R\mathbf{\Omega})}{\Sigma_{tg'}(\mathbf{r}-R\mathbf{\Omega})}\, \psi_{g'}^{(n-1)}(\mathbf{r}-R\mathbf{\Omega},\mathbf{\Omega}') + \sum_{g'} \int_{4\pi} d\mathbf{\Omega}'\, \frac{\Sigma_{g'\to g}(\mathbf{r}-R\mathbf{\Omega},\mathbf{\Omega}'\cdot\mathbf{\Omega})}{\Sigma_{tg'}(\mathbf{r}-R\mathbf{\Omega})}\, \psi_{g'}^{(n)}(\mathbf{r}-R\mathbf{\Omega},\mathbf{\Omega}') \right\}, \qquad (32) $$

where $\psi_g^{(n)}(\mathbf{r},\mathbf{\Omega}) \equiv \Sigma_{tg}(\mathbf{r})\, \varphi_g^{(n)}(\mathbf{r},\mathbf{\Omega})$ is the so-called collision density.
The expression on the left is the rate of emergence of fission neutrons in volume d³r around point r, owing to neutrons from generation (n) with energy in group g and travel direction Ω, which have reached this volume without intermediate collisions and have emerged at any point r′ along the trajectory of free flight because of:
a) fission caused by neutrons from the (n-1)-th generation with incident energy in
any group g’ and any travel direction Ω’, or
b) scattering of neutrons from the n-th generation with incident energy in any
group g’ and any travel direction Ω’.
For the sake of compactness it is convenient to introduce the following operator notation:

– integral operator of particle transport:

$$ \mathbf{L}_g(\mathbf{r}' \to \mathbf{r}) \equiv \int_0^{\infty} dR\, \Sigma_{tg}(\mathbf{r})\, P(\mathbf{r}' \to \mathbf{r}) \qquad (33) $$

– integral operator of particle collisions:

$$ \mathbf{C}_{g' \to g}(\mathbf{r}, \mathbf{\Omega}' \to \mathbf{\Omega}) \equiv \sum_{g'=1}^{G} \int_{4\pi} d\mathbf{\Omega}'\, \frac{\Sigma_{sg'}(\mathbf{r})}{\Sigma_{tg'}(\mathbf{r})}\, \frac{\Sigma_{g' \to g}(\mathbf{r}, \mathbf{\Omega}' \cdot \mathbf{\Omega})}{\Sigma_{sg'}(\mathbf{r})}, \qquad (34) $$
where:

− $P(\mathbf{r}' \to \mathbf{r}) \equiv \exp\left[-T(\mathbf{r},R;\mathbf{\Omega})\right] dR$ is the probability of reaching the volume d³r around point r through free flight in direction Ω from the point of emergence r′, and $R = |\mathbf{r} - \mathbf{r}'|$ is the travelled path;

− $\dfrac{\Sigma_{g' \to g}(\mathbf{r}, \mathbf{\Omega}' \cdot \mathbf{\Omega})}{\Sigma_{sg'}(\mathbf{r})}$ is a normalised probability distribution for sampling a new energy group and a new travel direction of the scattered neutron, and $\dfrac{\Sigma_{sg'}(\mathbf{r})}{\Sigma_{tg'}(\mathbf{r})}$ is the probability to escape absorption.
With these, analogously to (32), the following can be written:

$$ \psi_g^{(n)}(\mathbf{r},\mathbf{\Omega}) = \mathbf{L}_g(\mathbf{r}' \to \mathbf{r}) \left[ S_g^{f,n-1}(\mathbf{r}',\mathbf{\Omega}) + \mathbf{C}_{g' \to g}(\mathbf{r}', \mathbf{\Omega}' \to \mathbf{\Omega})\, \psi_{g'}^{(n)}(\mathbf{r}',\mathbf{\Omega}') \right], \qquad (35) $$

where

$$ S_g^{f,n-1}(\mathbf{r},\mathbf{\Omega}) \equiv \frac{\chi_g}{4\pi}\, \frac{1}{k^{(n-1)}} \sum_{g'} \int_{4\pi} d\mathbf{\Omega}'\, \frac{\nu\Sigma_{g'}(\mathbf{r})}{\Sigma_{tg'}(\mathbf{r})}\, \psi_{g'}^{(n-1)}(\mathbf{r},\mathbf{\Omega}') $$

is the source of neutrons in group g at point r, normalised to the multiplication factor of the (n−1)-th generation,
due to fission caused by neutrons from the previous (n−1)-th generation, and $\dfrac{\nu\Sigma_{g}(\mathbf{r})}{\Sigma_{tg}(\mathbf{r})}$ is the average number of fission neutrons born after a collision of a neutron with energy in group g.
The term $S_g^{c,n}(\mathbf{r},\mathbf{\Omega}) = \mathbf{L}_g(\mathbf{r}' \to \mathbf{r})\, S_g^{f,n-1}(\mathbf{r}',\mathbf{\Omega})$ is the contribution to $\psi_g^{(n)}(\mathbf{r},\mathbf{\Omega})$ from the first collision of fission neutrons.
After isolating this term, (35) is written in the form:

$$ \psi_g^{(n)}(\mathbf{r},\mathbf{\Omega}) = S_g^{c,n}(\mathbf{r},\mathbf{\Omega}) + \mathbf{L}_g(\mathbf{r}' \to \mathbf{r})\, \mathbf{C}_{g' \to g}(\mathbf{r}', \mathbf{\Omega}' \to \mathbf{\Omega})\, \psi_{g'}^{(n)}(\mathbf{r}',\mathbf{\Omega}') \qquad (36) $$
Expression (36) is a practical basis for implementing the following MC procedure:

− The modelling of an individual history starts with the introduction of a particle with phase coordinates sampled from the source distribution $S_g^{f,n-1}(\mathbf{r},\mathbf{\Omega})$.

− The location of its first collision is determined through sampling the kernel of the transport operator L.

− The particle weight is multiplied by the absorption escape probability, and the kernel of the collision operator C is used for sampling its new energy and direction of travel.

− Further, the kernels of the transport operator L and of the collision operator C are successively sampled to determine the locations of the second, third, etc. collision and the weight, energy and travel direction of the particle after the respective collision.

− The history is followed until the particle weight drops below a certain threshold or until the particle leaves the system.
− The integration required by the operators (33) and (34) corresponds to the following summation (the generation index (n) is omitted):

$$ \psi_g(\mathbf{r},\mathbf{\Omega}) = \sum_{j=0}^{\infty} \psi_{g,j}(\mathbf{r},\mathbf{\Omega}), \qquad (37) $$

where $\psi_{g,j}(\mathbf{r},\mathbf{\Omega})$ is the contribution of particles which after j collisions have emerged with an energy in group g and a travel direction Ω, and undergo their next collision at point r.
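A minimal sketch of such a weighted history in Python, assuming a homogeneous one-group slab with illustrative cross-sections; the multigroup kernels L and C of the text reduce here to an exponential free-path distribution and an isotropic scattering law:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical one-group data for a homogeneous slab 0 <= x <= H; the values
# are illustrative assumptions, not taken from the text.
Sig_t, Sig_s, H = 1.0, 0.6, 10.0
p_scatter = Sig_s / Sig_t            # absorption-escape probability
W_MIN = 1e-3                         # weight threshold terminating the history

def history(x0, mu0):
    """Follow one weighted history; return (position, weight) per collision."""
    x, mu, w = x0, mu0, 1.0
    collisions = []
    while w > W_MIN:
        R = -np.log(rng.random()) / Sig_t   # free path from Sig_t * exp(-Sig_t R)
        x = x + R * mu
        if not (0.0 <= x <= H):             # leakage terminates the history
            break
        collisions.append((x, w))           # score the collision density, cf. (37)
        w *= p_scatter                      # weight times absorption-escape probability
        mu = 2.0 * rng.random() - 1.0       # new direction cosine (isotropic scattering)
    return collisions

hist = history(x0=H / 2, mu0=1.0)
```

The j-th entry of a history carries weight $p_s^{\,j}$, i.e. the successive terms of the collision expansion (37) are accumulated with geometrically decreasing weights.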
Namely:

$$ \psi_{g,0}(\mathbf{r},\mathbf{\Omega}) = S_g^{c,n}(\mathbf{r},\mathbf{\Omega}) = \mathbf{L}_g(\mathbf{r}' \to \mathbf{r})\, S_g^{f,n-1}(\mathbf{r}',\mathbf{\Omega}); $$

…

$$ \psi_{g,j}(\mathbf{r},\mathbf{\Omega}) = \mathbf{L}_g(\mathbf{r}' \to \mathbf{r})\, \mathbf{C}_{g' \to g}(\mathbf{r}', \mathbf{\Omega}' \to \mathbf{\Omega})\, \psi_{g',j-1}(\mathbf{r}',\mathbf{\Omega}') $$
The specific steps for solving the full problem (which includes finding the effective
multiplication factor) are as follows:
− The initial phase coordinates $(g_0, \mathbf{r}_0, \mathbf{\Omega}_0)$ are sampled from the distribution $S_g^{f,n-1}(\mathbf{r},\mathbf{\Omega})$.
− For finding the point of first collision $\mathbf{r}_1 = \mathbf{r}_0 + R\mathbf{\Omega}_0$, a free path R is sampled from the distribution $\Sigma_{tg_0}(\mathbf{r}_0 + R\mathbf{\Omega}_0) \exp\left[ -\int_0^R \Sigma_{tg_0}(\mathbf{r}_0 + R'\mathbf{\Omega}_0)\, dR' \right]$. The probability of scattering (escaping absorption) there is $\dfrac{\Sigma_{sg_0}(\mathbf{r}_1)}{\Sigma_{tg_0}(\mathbf{r}_1)}$. Scattering is enforced, and the particle weight after this event is multiplied by the scattering probability.
− The new energy group $g_1$ is sampled from the distribution $\dfrac{\Sigma_{g_0 \to g_1}(\mathbf{r}_1)}{\Sigma_{sg_0}(\mathbf{r}_1)}$, where $\Sigma_{g_0 \to g_1}(\mathbf{r}_1) \equiv \int_{4\pi} d\mathbf{\Omega}\, \Sigma_{g_0 \to g_1}(\mathbf{r}_1, \mathbf{\Omega}_0 \cdot \mathbf{\Omega})$, after which the new travel direction $\mathbf{\Omega}_1$ is sampled from the distribution $\dfrac{\Sigma_{g_0 \to g_1}(\mathbf{r}_1, \mathbf{\Omega}_0 \cdot \mathbf{\Omega}_1)}{\Sigma_{g_0 \to g_1}(\mathbf{r}_1)}$. The subsequent collisions are modelled in an analogous way until the particle history is terminated.
− The accumulated statistic (37) is used to form a new estimate of the effective multiplication factor:

$$ k^{(n)} = \frac{\displaystyle \sum_g \int d^3r \int_{4\pi} d\mathbf{\Omega}\; S_g^{f,n}(\mathbf{r},\mathbf{\Omega})}{\displaystyle \sum_g \int d^3r \int_{4\pi} d\mathbf{\Omega}\; S_g^{f,n-1}(\mathbf{r},\mathbf{\Omega})}, \qquad (38) $$
where the new fission source is:
$$ S_g^{f,n}(\mathbf{r},\mathbf{\Omega}) = \frac{\chi_g}{4\pi} \sum_{g'} \int_{4\pi} d\mathbf{\Omega}'\, \frac{\nu\Sigma_{g'}(\mathbf{r})}{\Sigma_{tg'}(\mathbf{r})}\, \psi_{g'}^{(n)}(\mathbf{r},\mathbf{\Omega}') \qquad (39) $$
Before being used in (35), this source is normalised:

$$ S_g^{f,n}(\mathbf{r},\mathbf{\Omega}) \leftarrow \frac{1}{k^{(n)}}\, S_g^{f,n}(\mathbf{r},\mathbf{\Omega}) \qquad (40) $$
− The preceding steps are repeated until the estimate (38) is stabilised.
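The structure of steps (38)–(40) is that of a power iteration on the fission source. The sketch below illustrates it with a small matrix standing in for the transport-and-collision kernels; the matrix values are an assumption for illustration only:

```python
import numpy as np

# The source iteration (38)-(40), sketched with a 2x2 "generation" matrix M
# that maps one generation's normalised fission source to the next (M is an
# illustrative stand-in for the transport and collision kernels of the text).
M = np.array([[0.9, 0.3],
              [0.2, 0.8]])

S = np.array([1.0, 1.0])        # initial fission source guess
k = 1.0
for n in range(200):
    S_new = M @ S                # source produced by one generation, cf. (39)
    k = S_new.sum() / S.sum()    # generation-ratio estimate of k, cf. (38)
    S = S_new / k                # normalisation, cf. (40)

# k converges to the dominant eigenvalue of M, and S to its eigenvector.
```

This is exactly the power method of Chapter 3 applied to the fission-source operator; the MC simulation only replaces the matrix-vector product by a statistical estimate.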
In the context of (37) it is seen that, from the viewpoint of solving the integral equation (32)/(37), the above described choice of values for the phase variables g, r, Ω as samples from their probability distributions has the effect of an economical procedure of numerical integration, in which the density of sampling of elementary volumes in the phase space is proportional to the probability of populating these volumes.
7. Partial differential equations
According to their characteristics, i.e. curves along which a partial differential equation
transforms into an ordinary differential equation, partial differential equations are subdivided
into the following three categories:
− hyperbolic, a prototypical example being the one-dimensional wave equation:

$$ \frac{\partial^2 u(x,t)}{\partial t^2} = v^2\, \frac{\partial^2 u(x,t)}{\partial x^2} $$

− parabolic, a prototypical example being the one-dimensional diffusion equation:

$$ \frac{\partial u(x,t)}{\partial t} = \frac{\partial}{\partial x}\left( D\, \frac{\partial u(x,t)}{\partial x} \right) $$

− elliptic, a prototypical example being the Poisson equation:

$$ \frac{\partial^2 u(x,y)}{\partial x^2} + \frac{\partial^2 u(x,y)}{\partial y^2} = \rho(x,y) $$
From a computational point of view, however, a different classification is more important – whether the respective equation defines only a boundary value problem, or both a boundary and an initial value problem.

Although in either case the general approach to solving the equation can be reduced to representing the derivatives as finite differences, i.e. to the construction of a differencing scheme, and thus to transforming the equation into a system of linear algebraic equations for the values of the dependent variable at a set of fixed argument values (or for the average values of the dependent variable over the grid cells), in the case of solving an initial value problem there exist special additional restrictions with respect to the proportions between the discretisation steps in space and in time.
An example of the latter is the following equation:

$$ \frac{\partial u}{\partial t} = -v\, \frac{\partial u}{\partial x}, \qquad (1) $$

the solution of which has the form:

$$ u(x,t) = f(x - vt) \qquad (2) $$

(f is an arbitrary function and v is a constant)
The direct approach to constructing a difference scheme for (1) involves the choice of grid points $x_j = x_0 + j\Delta x,\ j = 0, 1, \dots, J$ and $t_n = t_0 + n\Delta t,\ n = 0, 1, \dots, N$.
Let $u_j^n \equiv u(x_j, t_n)$ and let the time derivative be represented by the forward difference:

$$ \left. \frac{\partial u}{\partial t} \right|_{x_j,\, t_n} = \frac{u_j^{n+1} - u_j^n}{\Delta t} + O(\Delta t) \qquad (3) $$

The difference expression (3) refers to the beginning $t_n$ of the current time step, so that it can be directly related to the already known solution of equation (1) at this moment, $u(x, t_n)$.
The spatial derivative can be represented by the more accurate central difference, again based only on quantities which are known at time $t_n$:

$$ \left. \frac{\partial u}{\partial x} \right|_{x_j,\, t_n} = \frac{u_{j+1}^n - u_{j-1}^n}{2\Delta x} + O(\Delta x^2) \qquad (4) $$
...................................
The truncation error of finite differencing is estimated through the Taylor expansion of the function:

$$ f(x+h) = f(x) + h f'(x) + \frac{1}{2} h^2 f''(x) + \frac{1}{6} h^3 f'''(x) + \dots $$
$$ f(x-h) = f(x) - h f'(x) + \frac{1}{2} h^2 f''(x) - \frac{1}{6} h^3 f'''(x) + \dots $$

Therefore:

for the forward difference: $\dfrac{f(x+h) - f(x)}{h} = f'(x) + \dfrac{1}{2} h f''(x) + \dots$

for the central difference: $\dfrac{f(x+h) - f(x-h)}{2h} = f'(x) + \dfrac{1}{6} h^2 f'''(x) + \dots$
....................................
Thus, the difference approximation of (1) leads to the following explicit relation for evaluating $u_j^{n+1}$:

$$ \frac{u_j^{n+1} - u_j^n}{\Delta t} = -v\, \frac{u_{j+1}^n - u_{j-1}^n}{2\Delta x}, \quad \text{or} \quad u_j^{n+1} = u_j^n - \frac{v\,\Delta t}{2\Delta x} \left( u_{j+1}^n - u_{j-1}^n \right) \qquad (5) $$
Unfortunately, however, a difference scheme of this kind, known as FTCS (Forward in
Time, Centred in Space), turns out to be unstable. This can be demonstrated as follows.
von Neumann stability analysis
A differencing scheme for solving an initial value problem is stable if the roundoff error accumulated with time remains bounded (neutral stability) or decreases (full stability). If this error grows without restriction with time, the differencing scheme is unstable.
Let the roundoff error of the solution of the difference equation (e.g. (5)) be:

$$ \varepsilon_j^n \equiv U_j^n - u_j^n, \qquad (6) $$

where $U_j^n$ is the exact, and $u_j^n$ is the numerical solution of the difference equation.
The exact solution must satisfy the difference equation, i.e. the equality

$$ U_j^{n+1} = U_j^n + r \left( U_{j+1}^n - U_{j-1}^n \right); \qquad r \equiv -\frac{v\,\Delta t}{2\Delta x} $$

is satisfied exactly.
If equation (5) is subtracted from this, the result will be:

$$ \varepsilon_j^{n+1} = \varepsilon_j^n + r \left( \varepsilon_{j+1}^n - \varepsilon_{j-1}^n \right) \qquad (7) $$
The full coincidence between (7) and (5) shows that the time behaviour of the roundoff error and of the numerical solution will be the same.
Under sufficiently general conditions the spatial behaviour of the roundoff error can be expanded into a Fourier series (similar to the discrete Fourier transform discussed in Chapter 2):

$$ \varepsilon(x,t) = \sum_m A_m(t) \exp(i k_m x) \qquad (8) $$
Under equally general conditions the typical time behaviour of the roundoff error can be assumed to be exponential (i.e. the relative change of this error after each time integration step remains practically constant), so that finally:

$$ \varepsilon(x,t) \cong \sum_m C_m \exp(\alpha_m t) \exp(i k_m x) \qquad (9) $$
For the purpose of a stability analysis the behaviour of the series (9) can be inferred from the behaviour of any of its terms. Thus it can further be assumed that:

$$ \varepsilon_j^n = \exp(\alpha t_n) \exp\left( i k (j\Delta x) \right) = \xi^n \exp\left( i k (j\Delta x) \right), \qquad (10) $$

where the amplification factor $\xi \equiv \exp(\alpha \Delta t)$ will depend on the wave number k through the index m in (9).

Thus, if $|\xi(k)| > 1$ for some k, the differencing scheme will be unstable.
Through substituting (10) in (7), the amplification factor $\xi(k)$ is found from:

$$ \xi^{n+1} \exp(i k j \Delta x) = \xi^n \exp(i k j \Delta x) \times \left[ 1 + r \left( \exp(i k \Delta x) - \exp(-i k \Delta x) \right) \right], $$

or:

$$ \xi(k) = 1 + 2 i r \sin(k \Delta x) \qquad (11) $$

Therefore, for an arbitrary k, $|\xi(k)| = \sqrt{1 + \left( \dfrac{v\,\Delta t}{\Delta x} \sin(k \Delta x) \right)^2} > 1$ and the FTCS scheme is unconditionally unstable.
The instability of FTCS can be remedied through the adoption of the so-called
Lax scheme
This scheme consists in the following substitution of $u_j^n$ on the right-hand side of (5):

$$ u_j^n \to \frac{1}{2} \left( u_{j+1}^n + u_{j-1}^n \right), \qquad (12) $$

which is equivalent to a linear interpolation in space and leads to:

$$ u_j^{n+1} = \frac{1}{2} \left( u_{j+1}^n + u_{j-1}^n \right) - \frac{v\,\Delta t}{2\Delta x} \left( u_{j+1}^n - u_{j-1}^n \right) \qquad (13) $$
Thus, analogously to (11):

$$ \xi(k) = \cos(k \Delta x) + i r \sin(k \Delta x) \quad \text{and} \quad |\xi(k)|^2 = \cos^2(k \Delta x) + r^2 \sin^2(k \Delta x) \qquad (14) $$

Therefore the stability condition $|\xi(k)| \le 1$ will be:

$$ r \equiv \frac{v\,\Delta t}{\Delta x} \le 1. \qquad (15) $$

This restriction is known as the Courant condition.
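The amplification factors (11) and (14) can be checked empirically. The sketch below advects a single sine mode with periodic boundaries; the grid size, mode number and Courant number are my own illustrative choices:

```python
import numpy as np

# Empirical comparison of FTCS (5) and Lax (13) at Courant number v*dt/dx = 0.5.
J, steps = 50, 200
x = np.arange(J)
u0 = np.sin(2 * np.pi * 5 * x / J)      # single mode with k*dx = 2*pi*5/J
C = 0.5                                  # v*dt/dx

u_ftcs = u0.copy()
u_lax = u0.copy()
for n in range(steps):
    up, um = np.roll(u_ftcs, -1), np.roll(u_ftcs, 1)   # u_{j+1}, u_{j-1}
    u_ftcs = u_ftcs - 0.5 * C * (up - um)              # FTCS, eq. (5)
    up, um = np.roll(u_lax, -1), np.roll(u_lax, 1)
    u_lax = 0.5 * (up + um) - 0.5 * C * (up - um)      # Lax, eq. (13)

# FTCS amplifies the mode by |xi(k)|^steps; Lax keeps it bounded by 1.
```

After 200 steps the FTCS amplitude has grown by several orders of magnitude even though the Courant condition is satisfied, while the Lax amplitude never exceeds its initial value.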
Diffusion initial value problem
The simplest example is the one-dimensional problem with constant coefficients:

$$ \frac{\partial u}{\partial t} = D\, \frac{\partial^2 u}{\partial x^2} \qquad (16) $$
Explicit scheme
The counterpart of the above examined FTCS scheme will be:

$$ \frac{u_j^{n+1} - u_j^n}{\Delta t} = D\, \frac{u_{j+1}^n - 2 u_j^n + u_{j-1}^n}{\Delta x^2}, \quad \text{or} \quad u_j^{n+1} = u_j^n + \frac{D\,\Delta t}{\Delta x^2} \left( u_{j+1}^n - 2 u_j^n + u_{j-1}^n \right) \qquad (17) $$
(The difference expression for the second derivative is a central difference)
The von Neumann analysis leads to the following relation for the amplification factor:

$$ \xi = 1 - 2r \left( 1 - \cos(k \Delta x) \right) = 1 - 4r \sin^2\!\left( \frac{k \Delta x}{2} \right), \qquad (18) $$

where $r \equiv \dfrac{D\,\Delta t}{\Delta x^2}$.
Thus the stability condition $|\xi(k)| \le 1$ becomes:

$$ r \le \frac{1}{2}, \quad \text{or} \quad \Delta t \le \frac{\Delta x^2}{2D} \qquad (19) $$
This condition can in principle be met, but it often imposes an excessively severe re-
striction on the time integration step and may render the problem practically impossible to
solve.
Implicit scheme
If the expected final state of the modelled quantity (the solution of the differential equation) is stationary, i.e. $\dfrac{\partial u(x,t)}{\partial t} \xrightarrow[t \to \infty]{} 0$ and hence also $\dfrac{\partial^2 u(x,t)}{\partial x^2} \xrightarrow[t \to \infty]{} 0$, unconditional stability can be ensured through the following differencing scheme, implicit with respect to time (BTCS):

$$ \frac{u_j^{n+1} - u_j^n}{\Delta t} = D\, \frac{u_{j+1}^{n+1} - 2 u_j^{n+1} + u_{j-1}^{n+1}}{\Delta x^2} \qquad (20) $$
In contrast with the explicit scheme (17), here the finite difference representation of the time derivative, $\dfrac{u_j^{n+1} - u_j^n}{\Delta t} = \left. \dfrac{\partial u}{\partial t} \right|_{x_j,\, t_{n+1}} + O(\Delta t)$, is referred to the end $t_{n+1}$ of the time step, and this imposes the same requirement on the right-hand side of equation (16).
The adoption of this implicit scheme involves the task of solving the following inhomogeneous linear system with a tridiagonal matrix:

$$ -r u_{j-1}^{n+1} + (1 + 2r)\, u_j^{n+1} - r u_{j+1}^{n+1} = u_j^n, \qquad j = 2, \dots, J-1, \qquad (21) $$

where $r \equiv \dfrac{D\,\Delta t}{\Delta x^2}$.
It is seen that at $\Delta t \to \infty$ ($r \to \infty$) expression (20) will be a finite difference counterpart of $\dfrac{\partial^2 u(x,t)}{\partial x^2} = 0$, which correctly reflects the expected final state of the solution of the differential equation.
The von Neumann analysis produces the following equation for the amplification factor:

$$ -\xi r \exp(-i k \Delta x) + \xi (1 + 2r) - \xi r \exp(i k \Delta x) = 1, $$

or

$$ \xi(k) = \frac{1}{1 + 2r \left( 1 - \cos(k \Delta x) \right)} = \frac{1}{1 + 4r \sin^2\!\left( \frac{k \Delta x}{2} \right)} \qquad (22) $$

Therefore $|\xi(k)| < 1$ with any step Δt and the scheme (20) is unconditionally stable.
Crank-Nicolson scheme
Through combining the FTCS and BTCS schemes, a difference scheme can be obtained which is both unconditionally stable and of second order of accuracy (i.e. with a truncation error $O(\Delta t^2)$) with respect to the finite difference representation of the time derivative:

$$ \frac{u_j^{n+1} - u_j^n}{\Delta t} = \frac{D}{2}\, \frac{\left( u_{j+1}^{n+1} - 2 u_j^{n+1} + u_{j-1}^{n+1} \right) + \left( u_{j+1}^n - 2 u_j^n + u_{j-1}^n \right)}{\Delta x^2} \qquad (23) $$
The latter will be true if the right-hand side can be interpreted as an estimate of $\dfrac{\partial^2 u}{\partial x^2}$ at time $t_n + \frac{1}{2}\Delta t$, so that the finite difference representation of the time derivative can acquire the meaning of a central difference:

$$ \frac{u_j^{n+1} - u_j^n}{\Delta t} = \left. \frac{\partial u}{\partial t} \right|_{x_j,\, t_n + \frac{1}{2}\Delta t} + O(\Delta t^2). $$
The requirement that the right-hand side of equation (16) is referred to the centre of the time step is fulfilled through linear interpolation:

$$ \frac{\partial^2 u}{\partial x^2}\!\left( x_j,\, t_n + \tfrac{1}{2}\Delta t \right) = \frac{1}{2} \left[ \frac{\partial^2 u}{\partial x^2}\!\left( x_j, t_n \right) + \frac{\partial^2 u}{\partial x^2}\!\left( x_j, t_{n+1} \right) \right] $$
Similarly to the above, it can be shown that with this differencing scheme the amplification factor will be:

$$ \xi(k) = \frac{1 - 2r \sin^2\!\left( \frac{k \Delta x}{2} \right)}{1 + 2r \sin^2\!\left( \frac{k \Delta x}{2} \right)}, \qquad (24) $$

so that the method is unconditionally stable with any step Δt.
All of the discussed differencing schemes can be directly generalised for the case $D(x)$ on the basis of the following representation of $\left. \dfrac{\partial}{\partial x}\left( D(x)\, \dfrac{\partial u(x,t)}{\partial x} \right) \right|_{x = x_j}$:

$$ \frac{D_{j+1/2} \left( u_{j+1} - u_j \right) - D_{j-1/2} \left( u_j - u_{j-1} \right)}{\Delta x^2} \qquad (25) $$
Multidimensional case
The treatment below will be made on the example of the two-dimensional problem:

$$ \frac{\partial u}{\partial t} = D \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) \qquad (26) $$

The generalisation of explicit methods like FTCS is straightforward, and the computational procedure is analogous to the one-dimensional case.
The application of an implicit method, e.g. of the Crank-Nicolson scheme, leads to the following relation:

$$ u_{j,l}^{n+1} = u_{j,l}^n + \frac{r}{2} \left( \delta_x^2 u_{j,l}^{n+1} + \delta_x^2 u_{j,l}^n + \delta_y^2 u_{j,l}^{n+1} + \delta_y^2 u_{j,l}^n \right), \qquad (27) $$

where $r \equiv \dfrac{D\,\Delta t}{\Delta^2}$, $\Delta \equiv \Delta x = \Delta y$, and $\delta_x^2 u_{j,l}^n \equiv u_{j+1,l}^n - 2 u_{j,l}^n + u_{j-1,l}^n$, and similarly for $\delta_y^2 u_{j,l}^n$.
After the introduction of one-dimensional indexing in space, the relations (27) take the form of a large inhomogeneous linear system of equations with a sparse matrix for the solution values at the grid nodes. Unlike the one-dimensional case of a tridiagonal system, for the solution of which there exist direct and economical methods, here usually specialised iterative methods are required.
A possible alternative, which makes it possible to circumvent or simplify the iterative solving of the algebraic problem, is the so-called alternating-direction implicit method (ADI). The method can be regarded as an equivalent of the Crank-Nicolson scheme and has the same second order of accuracy with respect to time and space, as well as the same unconditional stability. With this method each time step is subdivided into two steps of size Δt/2 (in the three-dimensional Cartesian case – three steps of size Δt/3). At each step a one-dimensional implicit scheme along the respective direction is applied:

$$ u_{j,l}^{n+1/2} = u_{j,l}^n + \frac{r}{2} \left( \delta_x^2 u_{j,l}^{n+1/2} + \delta_y^2 u_{j,l}^n \right) $$
$$ u_{j,l}^{n+1} = u_{j,l}^{n+1/2} + \frac{r}{2} \left( \delta_x^2 u_{j,l}^{n+1/2} + \delta_y^2 u_{j,l}^{n+1} \right) \qquad (28) $$
It is clear that each substep will involve the solving of a linear system with a tridiagonal
matrix.
An example: The one-dimensional heat equation
The equation is parabolic and has the same form as the diffusion equation:

$$ \frac{\partial T}{\partial t}(x,t) = \alpha\, \frac{\partial^2 T}{\partial x^2}(x,t), $$
where T is temperature and α is the so-called thermal diffusivity.
The selected example is in slab geometry, $x \in \left[ 0, \frac{\pi}{2} \right]$, and the boundary conditions correspond to an adiabatic process:

$$ \frac{\partial T}{\partial x}(0,t) = \frac{\partial T}{\partial x}\!\left( \frac{\pi}{2}, t \right) = 0 $$
The selected initial condition is:

$$ T(x,0) = \cos(x) $$
The assumed value of α is 0.061644.
The asymptotic solution is:

$$ T(x, t \to \infty) = \frac{2}{\pi} \int_0^{\pi/2} \cos(x)\, dx = \frac{2}{\pi} $$
With a spatial step of $\frac{1}{100} \times \frac{\pi}{2}$ and a time step of 1 s, the solution by the implicit time-differencing scheme (BTCS) is illustrated in the figures below. The asymptotic distribution of temperature is reached after 28 s (maximum absolute deviation from the asymptotic value below 0.001). For comparison, the maximum allowed time step for the explicit scheme (FTCS) is 2×10⁻³ s.
Solution of the one-dimensional heat equation at different moments of time
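A possible BTCS implementation of this example in Python: the tridiagonal system (21) is solved with the Thomas algorithm, and the reflective (ghost-point) treatment of the adiabatic boundaries is my own choice of discretisation:

```python
import numpy as np

# Grid and physical data from the example: alpha = 0.061644, slab [0, pi/2],
# 100 spatial steps, time step 1 s.
J = 101
x = np.linspace(0.0, np.pi / 2, J)
dx = x[1] - x[0]
alpha = 0.061644
dt = 1.0
r = alpha * dt / dx**2

def thomas(a, b, c, d):
    """Solve a tridiagonal system: a = sub-, b = main, c = super-diagonal."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    u = np.empty(n)
    u[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        u[i] = dp[i] - cp[i] * u[i + 1]
    return u

# BTCS matrix (21) with reflective rows for the adiabatic boundary conditions:
a = np.full(J, -r); b = np.full(J, 1 + 2 * r); c = np.full(J, -r)
a[0] = 0.0;  c[0] = -2 * r       # ghost point T_{-1} = T_1 (zero gradient at x = 0)
a[-1] = -2 * r; c[-1] = 0.0      # ghost point on the right boundary

T = np.cos(x)                    # initial condition T(x, 0) = cos(x)
for n in range(60):
    T = thomas(a, b, c, T)       # one implicit time step of 1 s
```

After a few tens of steps the temperature profile flattens to the asymptotic value 2/π, in agreement with the figure described above.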
Application: the diffusion equation in nuclear reactor physics
With known group constants and boundary conditions, the conditionally critical multi-
group stationary diffusion equation is:
$$ -\nabla \cdot D_g(\mathbf{r}) \nabla \Phi_g(\mathbf{r}) + \Sigma_{rg}(\mathbf{r}) \Phi_g(\mathbf{r}) = \frac{\chi_g}{\lambda} \sum_{g'=1}^{G} \nu\Sigma_{g'}(\mathbf{r}) \Phi_{g'}(\mathbf{r}) + \sum_{g'=1}^{G} \Sigma_{g' \to g}(\mathbf{r}) \Phi_{g'}(\mathbf{r}) \qquad (g = 1, \dots, G), $$

where $\Sigma_{rg} \equiv \Sigma_{tg} - \Sigma_{g \to g}$ is the removal cross-section from group g.
Since epithermal neutrons cannot gain energy from scattering, if all thermal neutrons are aggregated in a single group (as a rule with an upper boundary of 0.625 eV), then the scattering source will be:

$$ \sum_{g'=1}^{g-1} \Sigma_{g' \to g}(\mathbf{r}) \Phi_{g'}(\mathbf{r}). $$

(Here it should be recalled that the groups are numbered by decreasing energy.)
All further treatment will refer to this case, although the final results are directly generalised for any energy group structure. If we also assume that the fission neutron source

$$ F_g(\mathbf{r}) \equiv \frac{\chi_g}{\lambda} \sum_{g'=1}^{G} \nu\Sigma_{g'}(\mathbf{r}) \Phi_{g'}(\mathbf{r}) $$

is estimated in advance (e.g. from a previous approximation of the group fluxes), then each equation in the above system can be solved separately, provided that this is done in the order g = 1, g = 2, ..., g = G.
And so, the diffusion equation in group g will have the form:

$$ -\nabla \cdot D_g(\mathbf{r}) \nabla \Phi_g(\mathbf{r}) + \Sigma_{rg}(\mathbf{r}) \Phi_g(\mathbf{r}) = S_g(\mathbf{r}), $$

where the source

$$ S_g(\mathbf{r}) \equiv \sum_{g'=1}^{g-1} \Sigma_{g' \to g}(\mathbf{r}) \Phi_{g'}(\mathbf{r}) + F_g(\mathbf{r}) $$

is assumed to be known.
After introducing a discretisation in space, such that all boundaries between conditionally homogenised regions of the reactor medium coincide with boundaries between nodes, a one-dimensional numbering of the nodes (e.g. from top to bottom, from inside to outside and from left to right), and after integrating over the node volumes and dividing by these volumes $V_k$, the diffusion equation in group g is represented as a set of balance equations:
$$ \frac{1}{V_k} \sum_{k'} J_{kk'} + \Sigma_r^k \Phi_k = S_k, \qquad k = 1, \dots, K, $$

where the group index g is omitted and the leakage term is transformed using the divergence theorem (Gauss's theorem or Ostrogradsky's theorem). The summation over k′ involves the nodes with which the node k has a common interface, and the net currents across these common interfaces are:

$$ J_{kk'} \equiv -\int_{S_{kk'}} D(\mathbf{r}_s)\, \nabla\Phi \cdot d\mathbf{s}. $$
The source

$$ S_k = \frac{1}{V_k} \int_{V_k} d^3r\, S_g(\mathbf{r}) $$

is, of course, known, and the nodal group fluxes

$$ \Phi_k \equiv \frac{1}{V_k} \int_{V_k} d^3r\, \Phi_g(\mathbf{r}) $$

are subject to determination.
It is evident that the resulting set of balance equations can be solved for the nodal fluxes $\Phi_k$ only if a way of expressing the currents $J_{kk'}$ through the fluxes is found. There exist various ways of introducing the sought relations, commonly referred to as nodal models.

Below the simplest nodal method will be considered, and only in one dimension, i.e. with a spatial dependence only on the x coordinate.
Let $x_{k-1}$, $x_k$ and $x_{k+1}$ be the coordinates of the centres of three neighbouring nodes (here with the shape of infinite slabs), and $x_{k-1/2}$ and $x_{k+1/2}$ be the left and right boundary coordinates of node k (common with nodes k−1 and k+1, respectively). Let also all nodes be of an equal width h.
In this case the normal projections of the current across the left and right interfaces of node k will be, correspondingly,

$$ J_k^{k-1} = D_k\, \frac{d\Phi}{dx}\!\left( x_{k-1/2} \right) \quad \text{and} \quad J_k^{k+1} = -D_k\, \frac{d\Phi}{dx}\!\left( x_{k+1/2} \right). $$
With approximating the derivatives by central differences, e.g.

$$ \frac{d\Phi}{dx}\!\left( x_{k+1/2} \right) \cong \frac{\Phi(x_{k+1}) - \Phi(x_k)}{h}, $$

and with the assumption that the flux at the centre of the node coincides with the node-averaged value, i.e.

$$ \Phi(x_k) = \Phi_k, \qquad k = 1, \dots, K, $$
the result would be:

$$ J_k^{k-1} = -\frac{D_k}{h} \left( \Phi_{k-1} - \Phi_k \right) \quad \text{and} \quad J_k^{k+1} = -\frac{D_k}{h} \left( \Phi_{k+1} - \Phi_k \right). $$
It is seen that these relations do not ensure a continuity of the current at the interfaces between neighbouring nodes, e.g.

$$ J_k^{k+1} = -\frac{D_k}{h} \left( \Phi_{k+1} - \Phi_k \right) \;\ne\; J_{k+1}^{k} = -\frac{D_{k+1}}{h} \left( \Phi_k - \Phi_{k+1} \right). $$
This makes them unfit for representing the neutron balance in the system.

The correct approach is to express the currents through the flux values at the node interfaces. For example, for the k-th node:

$$ J_k^{k-1} = -\frac{2 D_k}{h} \left( \Phi_k^{k-1} - \Phi_k \right) \quad \text{and} \quad J_k^{k+1} = -\frac{2 D_k}{h} \left( \Phi_k^{k+1} - \Phi_k \right), $$

where $\Phi_k^{k-1}$ and $\Phi_k^{k+1}$ are the flux values at the left and the right interface of node k. The introduction of such interface values automatically ensures continuity of the flux at the interfaces between neighbouring nodes, which corresponds to physical reality.
The freedom of choosing the interface flux values makes it possible to ensure the continuity of the currents across the interfaces:

$$ -\frac{2 D_k}{h} \left( \Phi_k^{k-1} - \Phi_k \right) = -\frac{2 D_{k-1}}{h} \left( \Phi_{k-1} - \Phi_k^{k-1} \right) \quad \text{and} \quad -\frac{2 D_k}{h} \left( \Phi_k^{k+1} - \Phi_k \right) = -\frac{2 D_{k+1}}{h} \left( \Phi_{k+1} - \Phi_k^{k+1} \right). $$
The above relations are equations for the interface fluxes. Their solutions are:

$$ \Phi_k^{k-1} = \frac{D_k \Phi_k + D_{k-1} \Phi_{k-1}}{D_k + D_{k-1}} \quad \text{and} \quad \Phi_k^{k+1} = \frac{D_k \Phi_k + D_{k+1} \Phi_{k+1}}{D_k + D_{k+1}}. $$
After substituting these solutions in the expressions for the interface currents, the sought relation between those currents and the nodal fluxes is obtained:

$$ J_k^{k-1} = -\frac{2}{h}\, \frac{D_{k-1} D_k}{D_{k-1} + D_k} \left( \Phi_{k-1} - \Phi_k \right) \quad \text{and} \quad J_k^{k+1} = -\frac{2}{h}\, \frac{D_k D_{k+1}}{D_k + D_{k+1}} \left( \Phi_{k+1} - \Phi_k \right). $$
A disadvantage of this simplest nodal scheme is that the two underlying assumptions (namely, the replacement of the derivatives by finite differences and of the central fluxes by the node-averaged ones) are too crude for nodes as large as the transverse dimension of a fuel assembly. For this reason actual coarse-mesh diffusion calculations are founded on more accurate but much more sophisticated nodal schemes. On the other hand, this nodal scheme is fully adequate and commonly applied for the so-called fine-mesh diffusion calculations, where the transverse dimension of the nodes does not exceed the pitch of the fuel pin grid.
With the adopted nodal scheme the set of balance equations obtains the standard form of a system of inhomogeneous linear equations:

$$ a_{k,k-1} \Phi_{k-1} + a_{k,k} \Phi_k + a_{k,k+1} \Phi_{k+1} = S_k, $$

where

$$ a_{k,k-1} = -\frac{2}{h^2}\, \frac{D_{k-1} D_k}{D_{k-1} + D_k}, \qquad a_{k,k+1} = -\frac{2}{h^2}\, \frac{D_k D_{k+1}}{D_k + D_{k+1}} \qquad \text{and} \qquad a_{k,k} = \Sigma_r^k - \sum_{k'} a_{k,k'}. $$
The first and the last equations, which refer to the two external problem boundaries, are completed using the boundary conditions. The technique is analogous to the one already discussed, and the result is certain special expressions for the coefficients $a_{1,1}$ and $a_{N,N}$, where N is the number of nodes.
In particular, let the boundary condition be of the logarithmic type and refer to the right boundary of the considered problem: $J_K^R = \alpha \Phi_K^R$, i.e.

$$ -\frac{2 D_K}{h} \left( \Phi_K^R - \Phi_K \right) = \alpha \Phi_K^R $$

This condition, taken as an equation for the boundary flux, has the solution:

$$ \Phi_K^R = \frac{\Phi_K}{1 + \dfrac{\alpha h}{2 D_K}} $$

After substituting in the expression for the boundary current:

$$ J_K^R = \frac{\alpha D_K}{D_K + \dfrac{\alpha h}{2}}\, \Phi_K. $$
If the nodes are of unequal size, $h_k,\ k = 1, \dots, N$, the matrix elements for the set of balance equations will have the form:

$$ a_{1,2} = -\frac{1}{h_1}\, \frac{\tilde D_1 \tilde D_2}{\tilde D_1 + \tilde D_2}, \qquad a_{1,1} = \Sigma_r^1 + \frac{1}{h_1}\, \frac{\alpha \tilde D_1}{\tilde D_1 + \alpha} - a_{1,2} $$

$$ a_{k,k-1} = -\frac{1}{h_k}\, \frac{\tilde D_{k-1} \tilde D_k}{\tilde D_{k-1} + \tilde D_k}, \qquad a_{k,k+1} = -\frac{1}{h_k}\, \frac{\tilde D_k \tilde D_{k+1}}{\tilde D_k + \tilde D_{k+1}} \qquad \text{and} \qquad a_{k,k} = \Sigma_r^k - \sum_{k'} a_{k,k'} $$

$$ a_{N,N-1} = -\frac{1}{h_N}\, \frac{\tilde D_{N-1} \tilde D_N}{\tilde D_{N-1} + \tilde D_N}, \qquad a_{N,N} = \Sigma_r^N + \frac{1}{h_N}\, \frac{\alpha \tilde D_N}{\tilde D_N + \alpha} - a_{N,N-1}, $$

where $\tilde D_k = \dfrac{2 D_k}{h_k}$, and α is the ratio $J/\Phi$ on the external boundary.
In matrix-vector notation the resulting inhomogeneous linear system has the form $\mathbf{A}\boldsymbol{\Phi} = \mathbf{S}$. If the nodes are of equal size, then, as follows from the definitions of the matrix elements, the matrix A will be symmetric. In all cases it will also be diagonally dominant, which means that each of its diagonal elements is larger than the sum of the magnitudes of the off-diagonal elements in the respective row. It can be shown that real symmetric diagonally dominant matrices are positive definite, i.e. all their eigenvalues are real and positive, and also that the eigenvectors of real symmetric matrices are mutually linearly independent.
Two- or three-dimensional spatial discretisation leads to analogous expressions for the
leakage term and to a matrix with the same properties as in the discussed one-dimensional
case.
From the viewpoint of solving the resultant inhomogeneous linear system, the following
additional characteristics of the matrix A are important:
− It is especially large. Thus, for example, with a coarse-mesh discretisation of the core of WWER-1000 the number of equations is typically about 5000, whereas with a fine-mesh grid it is about two million.

− The matrix is sparse, i.e. it has a very small number of non-zero elements. As can be seen from the above one-dimensional example, each of its rows contains only two non-zero off-diagonal elements, and these are adjacent to the non-zero diagonal element.
The number and the locations of non-zero off-diagonal elements in the case of two- or
three-dimensional problems will depend on the node shape. For reactors with a square
grid of assemblies (or fuel pins) the single-index numbering of nodes will lead in the
two-dimensional case to the emergence of four non-zero off-diagonal elements, and in
the three-dimensional case – to six such elements. In both cases the matrix has a band structure, i.e. all non-zero elements are confined to a strip around its diagonal. For solving systems with such matrices there exist efficient direct methods which require on the order of N arithmetic operations. For reactors with a triangular grid of assemblies (or fuel pins), such as those of the WWER type, the single-index numbering of nodes will lead in the two-dimensional case to the emergence of six non-zero off-diagonal elements, and in the three-dimensional case – to eight such elements. Unfortunately, however, with such grids the matrix of the system can never be brought to a band shape, so the efficient direct methods for solving the respective linear system are not applicable. In all cases, for solving systems with large sparse matrices it is practical to employ methods which require storing only the non-zero matrix elements (usually they are not even stored at all, and are instead evaluated in the course of the computational process). It is also desirable that the number and the locations of the non-zero elements do not change during the computational process.
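The idea of storing only the non-zero elements can be illustrated with a minimal Python sketch of a matrix-vector product for the one-dimensional tridiagonal case (the function name and array layout are our own choices, not from an actual reactor code):

```python
import numpy as np

def matvec_tridiag(main, lower, upper, phi):
    """y = A @ phi for a tridiagonal A stored as three diagonals only."""
    y = main * phi
    y[1:] += lower * phi[:-1]    # sub-diagonal contribution
    y[:-1] += upper * phi[1:]    # super-diagonal contribution
    return y

n = 6
main = np.full(n, 2.0)           # illustrative diffusion-like values
lower = np.full(n - 1, -1.0)
upper = np.full(n - 1, -1.0)
phi = np.arange(1.0, n + 1.0)

# Compare with the dense product to confirm the compact storage is equivalent.
A = np.diag(main) + np.diag(lower, -1) + np.diag(upper, 1)
print(np.allclose(matvec_tridiag(main, lower, upper, phi), A @ phi))  # True
```

The compact form stores 3N numbers instead of N² and performs O(N) operations per product, which is what makes iterative methods practical for such systems.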
In the terminology of reactor core calculations it is conventional to refer to the stage of solving the fixed-source one-group boundary value problem (in diffusion or in another approximation) as inner iterations.
A distinctive characteristic of all methods for solving linear systems with large sparse matrices is that they are iterative, i.e. the solution is obtained through a succession of approximations.
______________________________________________________________________
Another, more sophisticated nodal scheme for the diffusion problem can be constructed
as follows.
In energy-group representation the diffusion equation for node n has the form:

$-\hat{D}_n\nabla^2\Phi_n + \hat{\Sigma}_{r,n}\Phi_n = \hat{\Sigma}_{s,n}\Phi_n + \frac{1}{k_{\mathrm{eff}}}\hat{F}_n\Phi_n$, (1)

where:

$\Phi_n(\mathbf{r}) \equiv \mathrm{col}\left(\Phi_n^1(\mathbf{r}),\ldots,\Phi_n^G(\mathbf{r})\right)$ is the sought multigroup flux;

$\hat{D}_n \equiv \mathrm{diag}\left(D_n^1,\ldots,D_n^G\right)$ is the matrix of diffusion coefficients;
$\hat{\Sigma}_{r,n} \equiv \mathrm{diag}\left(\Sigma_{r,n}^1,\ldots,\Sigma_{r,n}^G\right)$ is the matrix of removal cross-sections;

$\hat{\Sigma}_{s,n}$ is the matrix of in-scatter cross-sections, with elements $\Sigma_{s,gg'}^n = \Sigma_{s,g'\to g}^n$ (with one thermal group this matrix is upper triangular);

$\hat{F}_n$ is the fission-source cross-sections matrix, with elements $F_{gg'}^n = \chi_g\,\nu\Sigma_{g'}^n$.
Equation (1) has the following more general formulation:

$\nabla^2\Phi_n - \hat{A}_n\Phi_n = 0$, (2)

where:

$\hat{A}_n \equiv (\hat{D}_n)^{-1}\left(\hat{\Sigma}_{r,n} - \hat{\Sigma}_{s,n} - \frac{1}{k_{\mathrm{eff}}}\hat{F}_n\right)$
Usually, the matrix $\hat{A}_n$ can be diagonalised through similarity transformations (cf. Chapter 3). This means that there exists a matrix $\hat{Z}_n$ such that $(\hat{Z}_n)^{-1}\hat{A}_n\hat{Z}_n = \hat{\Lambda}_n$, where $\hat{\Lambda}_n = \mathrm{diag}\left(\lambda_n^1,\ldots,\lambda_n^G\right)$ contains the eigenvalues of $\hat{A}_n$, and the columns of $\hat{Z}_n$ are the eigenvectors of $\hat{A}_n$. (Here it must be noted that although the matrix $\hat{A}_n$ is real, in the general case the elements of $\hat{\Lambda}_n$ and $\hat{Z}_n$ are complex.)

Therefore, $\hat{A}_n = \hat{Z}_n\hat{\Lambda}_n(\hat{Z}_n)^{-1}$.
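The diagonalisation can be illustrated with a short numerical sketch (the 2×2 matrix stands in for a two-group $\hat{A}_n$; its entries are purely illustrative):

```python
import numpy as np

# A 2x2 stand-in for a two-group nodal matrix A_n (purely illustrative values).
A = np.array([[ 0.04, -0.02 ],
              [-0.015, 0.09 ]])

eigenvalues, Z = np.linalg.eig(A)     # columns of Z are the eigenvectors
Lambda = np.diag(eigenvalues)

# Verify the similarity relation A = Z Lambda Z^{-1}.
print(np.allclose(A, Z @ Lambda @ np.linalg.inv(Z)))  # True
```

For a matrix with complex eigenvalues `np.linalg.eig` returns complex arrays, which is consistent with the remark above that $\hat{\Lambda}_n$ and $\hat{Z}_n$ are in general complex.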
The substitution of the latter in (2) leads to $\nabla^2\Phi_n - \hat{Z}_n\hat{\Lambda}_n(\hat{Z}_n)^{-1}\Phi_n = 0$, and after multiplying on the left by $(\hat{Z}_n)^{-1}$ the result is:

$\nabla^2\Psi_n - \hat{\Lambda}_n\Psi_n = 0$, (3)

where:

$\Psi_n \equiv (\hat{Z}_n)^{-1}\Phi_n$
The form of the leakage term in (3) is due to the fact that the elements of the matrix $\hat{A}_n$ are spatially constant within node n, because the problem (1) is formulated after an appropriate homogenisation of the group constants.
The system (3) can be solved for $\Psi_n$, and afterwards the solution of (1) can be obtained through the reverse transformation:

$\Phi_n(\mathbf{r}) = \hat{Z}_n\Psi_n(\mathbf{r})$ (4)
The equations in (3) are separated, i.e.

$\nabla^2\Psi_g^n(\mathbf{r}) = \lambda_g^n\Psi_g^n(\mathbf{r}),\quad g = 1,\ldots,G$, (5)

and each of them has particular solutions of the form:

$\Psi_g^n(\mathbf{r}) = \psi_g^n\exp\left(\boldsymbol{\kappa}_g^n\cdot\mathbf{r}\right)$, (6)

where:

$(\kappa_g^n)^2 = \lambda_g^n$
It is evident that with a fixed eigenvalue $\lambda_g^n$ there is an infinite number of possible combinations of the three components of the vector $\boldsymbol{\kappa}_g^n$, whereas any practically constructible solution of (3) can include only a part of them.
If the problem is one-dimensional, the particular solutions of (5) are two:

$\Psi_g^n(x) = \psi_g^n\exp\left(\pm\kappa_g^n x\right)$, (7)

where the quantity $\kappa_g^n = \sqrt{\lambda_g^n}$ is also generally complex.
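A quick symbolic check that the form (7) indeed solves the separated equation (5), also for a generally complex eigenvalue (a sketch using sympy):

```python
import sympy as sp

x = sp.symbols('x')
lam = sp.symbols('lam')            # a (generally complex) eigenvalue
Psi = sp.exp(sp.sqrt(lam) * x)     # candidate particular solution of type (7)

# Residual of the separated equation (5): Psi'' - lam * Psi.
residual = sp.simplify(sp.diff(Psi, x, 2) - lam * Psi)
print(residual)  # 0
```

For negative real $\lambda$ the square root is imaginary and the exponentials become oscillatory, which is the familiar trigonometric flux shape.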
The form of the solutions (6) and (7) is for Cartesian coordinates. In other coordinate systems these solutions have a different form but analogous general properties.

The functions $\Psi_g^n(\mathbf{r})$ are conventionally known as modes of the scalar flux, and correspondingly the methods for solving the diffusion equation based on (2)–(4) are known as modal methods.
Returning to the diffusion equation (1), the following can be noted:
1) From (4) it follows that the solution of (1) will also be a linear combination of the
particular solutions of (3).
2) Any isolated equation from the system (1) can be formally written as an inhomogeneous one-group equation, i.e.
$-D_g^n\nabla^2\Phi_g^n(\mathbf{r}) + \Sigma_{r,g}^n\Phi_g^n(\mathbf{r}) = Q_g^n(\mathbf{r})$ (8)
(The source on the right-hand side is estimated iteratively within the iterative process for evaluating $k_{\mathrm{eff}}$, which is generally inherent to the eigenproblem (1)/(3).)
In this case the homogeneous part of (8) has a form identical to (5):

$\nabla^2\Phi_g^n(\mathbf{r}) = (B_g^n)^2\,\Phi_g^n(\mathbf{r})$, (9)

where $(B_g^n)^2 \equiv \Sigma_{r,g}^n / D_g^n$, and correspondingly it has particular solutions identical in form and properties to those of (6)/(7). The only, albeit important, difference is that the quantity $(B_g^n)^2$, which is the counterpart of $\lambda_g^n$ from (2), is guaranteed to be real and positive.
The general solution of the inhomogeneous equation (8) can be obtained as a linear combination of its particular solutions and of particular solutions of its homogeneous part (9). The inclusion of particular solutions of the homogeneous part is expedient by virtue of statement 1) above. As regards the particular solutions of the inhomogeneous equation, their inclusion is, of course, mandatory, and their functional form will in principle be determined by the functional form of $Q_g^n(\mathbf{r})$. On the other hand, the original equation (1) is actually homogeneous, and $Q_g^n(\mathbf{r})$ is essentially a linear combination of the sought general solutions $\Phi_g^n(\mathbf{r}),\ g = 1,\ldots,G$. Therefore, no formal restrictions are imposed on the choice of a functional form for the particular solutions of the inhomogeneous equation (8). This freedom can be used for reducing the number of particular solutions of the homogeneous problem needed in constructing the general solution of (8).
Based on the above considerations, the following comparative assessment of the direct
(8) and the modal (3) formulations of the multigroup diffusion problem can be made:
Except for the one-dimensional problem and for simplified cases with nodes whose boundaries coincide with coordinate surfaces, none of the practically constructible solutions can be exact in either of the two formulations (cf. the comment on (6), which is also valid for (9)).
The selection of a finite (and not too large) number of wave vectors for (6)/(9), and the determination of the coefficients in the respective linear combinations, is made so as to satisfy important balances between reaction rates, to preserve certain symmetries of the flux and current, and to provide a sufficient number of definite values of representative particular solutions or of their moments (weighted integrals) at the node interfaces. This selection also depends on the number and the boundaries of the energy groups. One of the criteria for representativeness and sufficiency is the physical requirement for continuity of the directional fluxes and currents at the node interfaces.
The modal formulation (3) is conceptually simpler and does not involve an arbitrary choice of a solution to the inhomogeneous equation (8). This advantage is somewhat neutralised by the need to include a larger number of solutions of the form (6). Because of the complicated relation between modes and group fluxes (currents), the implementation of the continuity requirements for the flux and the current is rather sophisticated. These drawbacks, along with the conceptual inconsistency of dimensionality reduction through transverse integration, make the modal formulations more cumbersome to implement and computationally more expensive.
The direct formulation (1)/(8), for its part, requires a formally arbitrary construction of a solution to the inhomogeneous equation. The process is heuristic, and its impact on the accuracy of the general diffusion problem is difficult to assess in advance. On the other hand, the freedom in constructing such a solution can advantageously be used for reducing the number of components of the general solution, and hence the number of equations for determining the free parameters on which this solution depends. Also, the inhomogeneous form of the one-group equations harmonises well with the technique of transverse integration for reducing the problem dimensionality, and thence the mathematical and computational complexity of the resultant implementation. An additional advantage is the simplicity of formulating the inner and outer boundary conditions.
With the traditional two energy groups both approaches ensure practically equal attainable accuracy, while the direct approach has the advantage of a large number of successful and, as a rule, computationally simple implementations.
The transverse-integration technique, which is normally characteristic of the direct approach, has the advantages and disadvantages inherent to it – a simpler mathematical and computational apparatus, but also an ambiguity and a potential inaccuracy arising from the representation of the transverse leakage.
Below the so-called direct approach will be illustrated with the example of a one-dimensional formulation for two energy groups.
The two-group diffusion equations are:

$\nabla\cdot\mathbf{J}_1^n(\mathbf{r}) + \Sigma_{r,1}^n\Phi_1^n(\mathbf{r}) = \frac{1}{k_{\mathrm{eff}}}\sum_{g=1}^{2}\nu\Sigma_g^n\Phi_g^n(\mathbf{r})$

$\nabla\cdot\mathbf{J}_2^n(\mathbf{r}) + \Sigma_{a,2}^n\Phi_2^n(\mathbf{r}) = \Sigma_{s,1\to 2}^n\Phi_1^n(\mathbf{r})$ (10)
Let the hexagonal prismatic nodes have a transverse dimension $H_r$ (between the centres of adjacent nodes) and height $H_z$. The node volume V and the area of its base $F_{hex}$ are then

$V = F_{hex}H_z,\qquad F_{hex} = \frac{\sqrt{3}}{2}H_r^2$.
The transverse averaging of (10) leads to the following one-dimensional formulation:

$\frac{1}{F_{hex}}\iint_{F_{hex}}dx\,dy\;(10) \;\Rightarrow\; -D_g\frac{d^2}{dz^2}\Phi_g(z) + \Sigma_g\Phi_g(z) = Q_g(z)$, (11)

where:

$\Phi_g(z) = \frac{1}{F_{hex}}\iint_{F_{hex}}\Phi_g(\mathbf{r})\,dx\,dy$ and $Q_g(z) = S_g(z) - L_g(z)$, and

$L_g(z) = -\frac{D_g}{F_{hex}}\iint_{F_{hex}}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\Phi_g(\mathbf{r})\,dx\,dy$

is the so-called transverse leakage.
The form of the group source $S_g(z)$ follows directly from (10).
According to the results obtained above, the particular solutions of the homogeneous form of (11) are $\exp(\pm Bz)$ (here and below the group indices will be omitted). Let the general solution be sought in the form:

$\Phi(z) \approx \sum_{i=0}^{2}c_i p_i(z) + a_1\exp(Bz) + a_2\exp(-Bz)$, (12)

where the coefficients $c_i$ and $a_i$ are subject to determination. $P_2(z) = \sum_{i=0}^{2}c_i p_i(z)$ is a second-degree polynomial which represents the particular solution of the inhomogeneous equation (11), and $p_i(z)$ are polynomials of degree i.
The polynomial representation of the inhomogeneous-equation solution is of course not the only possibility; it is preferred, however, due to its simple form. The substitution of (12) in (11) leads to the following result:

$-C + B^2 P_2(z) = \frac{1}{D}Q(z)$, (13)

where the constant C is the second derivative of $P_2(z)$.
It is seen that the second degree chosen for the polynomial particular solution of the inhomogeneous equation is the highest which allows the source $Q(z)$ to be expanded in the same polynomial terms $p_i(z)$: $Q(z) = \sum_{i=0}^{2}q_i p_i(z)$. The additive constant which, according to (13), distinguishes the representations of $P_2(z)$ and $Q(z)$ is accommodated in a natural way by the coefficient before $p_0(z)$. (Although it is possible to choose a lower degree of the polynomial expansion, this would unnecessarily reduce the capability of the model (12) to reproduce sufficiently accurately the shape of the actual solution of the diffusion problem.)
The process of finding the coefficients in the polynomial expansion of the source (which is always assumed known and is updated iteratively) is strongly facilitated if the polynomial terms $p_i(z)$ are constructed as orthonormal: $\int_{-H/2}^{+H/2}p_i(z)p_j(z)\,dz = \delta_{ij}$. Then:

$q_i = \int_{-H/2}^{+H/2}p_i(z)Q(z)\,dz$.
Since the exponential terms in the flux model satisfy the homogeneous diffusion equation, with the chosen polynomial expansions the inhomogeneous equation (11) takes the form:

$-D\sum_{i=0}^{2}c_i\frac{d^2}{dz^2}p_i(z) + \Sigma\sum_{i=0}^{2}c_i p_i(z) - \sum_{i=0}^{2}q_i p_i(z) = 0$ (14)
In addition, because of the orthonormality of the polynomial terms:

$\int_{-H/2}^{+H/2}p_k(z)\,(14)\,dz \;\Rightarrow\; -D\sum_{i=0}^{2}c_i\int_{-H/2}^{+H/2}p_k(z)\frac{d^2}{dz^2}p_i(z)\,dz + \Sigma c_k - q_k = 0$ (15)
The second derivative $\frac{d^2}{dz^2}p_i(z)$ is zero for i = 0, 1 and a constant for i = 2; in the latter case this makes it different from $p_0(z)$ only by a constant multiplier.

Therefore, for k = 1, 2 the form of (15) is $\Sigma c_k - q_k = 0$ and

$c_k = \frac{q_k}{\Sigma}$ (16)
For k = 0 the form of (15) is

$-Dc_2\int_{-H/2}^{+H/2}p_0(z)\frac{d^2}{dz^2}p_2(z)\,dz + \Sigma c_0 - q_0 = 0$

and

$c_0 = \frac{q_0}{\Sigma} + \frac{\alpha}{B^2}c_2$, (17)

where $\alpha = \int_{-H/2}^{+H/2}p_0(z)\frac{d^2}{dz^2}p_2(z)\,dz$.
Expressions (16) and (17) fully determine the coefficients before the polynomial terms
in the flux representation (12).
The coefficients before the exponential terms in (12) are found from the inner and outer
boundary conditions.
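Relations (16) and (17) can be verified numerically: for a quadratic source the polynomial part $P_2(z)$ must satisfy the inhomogeneous equation exactly. A sketch with illustrative one-group constants (the values are made up):

```python
import numpy as np
from numpy.polynomial import legendre

H, D, Sigma = 10.0, 1.2, 0.05            # illustrative one-group constants
B2 = Sigma / D

def p_coeffs(i):
    """Legendre-series coefficients of the orthonormal p_i on [-H/2, +H/2]."""
    c = np.zeros(i + 1)
    c[i] = np.sqrt((2 * i + 1) / H)
    return c

# Gauss-Legendre quadrature mapped to [-H/2, +H/2].
t, w = legendre.leggauss(8)
z, wz = t * H / 2, w * H / 2
P = np.array([legendre.legval(2 * z / H, p_coeffs(i)) for i in range(3)])

# Second derivative of p_2 at the nodes (chain rule: d/dz = (2/H) d/dt).
P2dd = (2 / H) ** 2 * legendre.legval(t, legendre.legder(p_coeffs(2), 2))

Q = 1.0 + 0.3 * z - 0.05 * z ** 2        # a sample quadratic source
q = P @ (wz * Q)                         # projections q_i

alpha = np.sum(wz * P[0] * P2dd)         # alpha from (17)
c = np.array([0.0, q[1] / Sigma, q[2] / Sigma])   # relation (16)
c[0] = q[0] / Sigma + alpha * c[2] / B2           # relation (17)

# The residual of  -D P2'' + Sigma P2 - Q  must vanish for a quadratic Q.
P2 = c @ P
print(np.allclose(-D * c[2] * P2dd + Sigma * P2, Q))  # True
```

The exactness holds only because Q is itself quadratic; in the actual scheme Q additionally contains the transverse leakage and is updated iteratively.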
Let $J_0^\pm$ and $J_1^\pm$ be the average outgoing and incoming partial currents on the bottom and top interfaces of the node. With account of Fick's law, the applicable expressions are:

$J_0^\pm = \frac{1}{4}\Phi\left(-\frac{H}{2}\right) \pm \frac{D}{2}\frac{d\Phi}{dz}\left(-\frac{H}{2}\right);\qquad J_1^\pm = \frac{1}{4}\Phi\left(+\frac{H}{2}\right) \mp \frac{D}{2}\frac{d\Phi}{dz}\left(+\frac{H}{2}\right)$ (18)
After comparing these relations with the flux model (12) it is seen that the partial currents will be linear combinations of the coefficients $c_i,\ i = 0,\ldots,2$ and $a_k,\ k = 1,2$. The coupling of these linear combinations through the inner boundary conditions between adjacent nodes, supplemented with the outer boundary conditions, will define a linear algebraic system for the nodal coefficients $a_k,\ k = 1,2$. Solving this system is equivalent to solving the inhomogeneous diffusion equation (11) with a known source. The practical procedure for obtaining a solution of the described problem is as follows.
Let $\mathbf{C} \equiv \mathrm{col}(c_0,\ldots,c_2)$, $\mathbf{A} \equiv \mathrm{col}(a_1,a_2)$, $\mathbf{J}^\pm \equiv \mathrm{col}(J_0^\pm, J_1^\pm)$. With this notation the above-mentioned linear combinations are written as:

$\mathbf{J}^\pm = \hat{P}^\pm\mathbf{C} + \hat{Q}^\pm\mathbf{A}$ (19)

a) Let at first the incoming partial currents for the node, $\mathbf{J}^-$, be assumed known. This allows (19) to be solved for $\mathbf{A}$:

$\mathbf{A} = (\hat{Q}^-)^{-1}\left(\mathbf{J}^- - \hat{P}^-\mathbf{C}\right)$ (20)

b) These incoming partial currents are actually also subject to determination. The equations for them are expressed as continuity conditions for the current across the node interfaces:

$J_{n,0}^- = J_{m_0,1}^+,\qquad J_{n,1}^- = J_{m_1,0}^+$, (21)

where the indices $m_0$ and $m_1$ denote the bottom and, respectively, the top neighbour of node n.

Through (19), equations (21) – two per node in total – are essentially formulated for the coefficients before the exponential terms, which are also two per node.

The implementation of (21) requires a relation between the incoming and the outgoing partial currents. Based on (19) and (20), this relation is:

$\mathbf{J}^+ = \left(\hat{P}^+ - \hat{Q}^+(\hat{Q}^-)^{-1}\hat{P}^-\right)\mathbf{C} + \hat{Q}^+(\hat{Q}^-)^{-1}\mathbf{J}^-$ (22)
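The consistency of (22) with (19) and (20) is easy to confirm numerically; in the sketch below the response matrices are random stand-ins (in the actual scheme they follow from (12) and (18)):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random, invertible stand-ins for the response matrices of (19);
# in the actual scheme they follow from (12) and (18).
Pp, Pm = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
Qp = rng.normal(size=(2, 2))
Qm = rng.normal(size=(2, 2)) + 3 * np.eye(2)   # keep Q^- well-conditioned
C = rng.normal(size=3)                         # polynomial coefficients
Jm = rng.normal(size=2)                        # incoming partial currents

# Relation (20): solve (19) for the exponential coefficients A.
A = np.linalg.solve(Qm, Jm - Pm @ C)

# Outgoing currents via (19) and via the combined relation (22) must agree.
Jp_19 = Pp @ C + Qp @ A
T = Qp @ np.linalg.inv(Qm)
Jp_22 = (Pp - T @ Pm) @ C + T @ Jm
print(np.allclose(Jp_19, Jp_22))  # True
```

The agreement is a pure algebraic identity, so it holds for any invertible $\hat{Q}^-$, regardless of the particular matrix entries.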
The overall diffusion problem is therefore solved through the following two-tier iterative procedure.

a) Inner iterations for updating the coefficients before the exponential terms in the flux expansion (12), based on relations (21) and (22).

The inner-iteration cycle begins with finding the coefficients before the polynomial components in (12) through (16) and (17).

The basis of the inner iterations is the evaluation of the outgoing partial currents through the incoming ones and the polynomial coefficients, using (22).

In preparation for the next inner iteration, the incoming currents are expressed through the outgoing ones using (21) for the inner interfaces, or using the outer boundary conditions for the external problem surfaces.
The cycle is completed by calculating new polynomial coefficients for the scattering source in the next group, or for the fission source in the next outer iteration. Since this source is evidently proportional to the flux, the calculation reduces to finding the so-called flux moments

$f_i = \int_{-H/2}^{+H/2}p_i(z)\Phi(z)\,dz$.

For this purpose the coefficients before the exponential terms in the expansion (12) are expressed through the partial currents using (20).
b) The outer iterations (source iterations) involve updating the transverse leakage (if the problem is actually three-dimensional) and the fission source, and a calculation of the effective multiplication factor.
______________________________________________________________________
The joint solving of the group equations with a dependent source, i.e. the search for the fission neutron source and the effective multiplication factor, has the following general characteristics.
The generalised matrix-vector representation of the spatially discretised multigroup diffusion criticality problem is:

$\hat{A}\Phi = \frac{1}{\lambda}\hat{F}\Phi$, (1)

where the vector $\Phi \equiv \mathrm{col}(\Phi_1,\ldots,\Phi_g,\ldots,\Phi_G)$ is a concatenation of the vectors $\Phi_g$ introduced above as solutions of the one-group inhomogeneous equations; the matrix $\hat{A}$ has a block structure assembled from the matrices $\hat{A}_g$ of these equations; the matrix $\hat{F}$ is an analogous result from the spatial and energy discretisation of the fission neutron source

$\chi(E)\int_0^\infty \nu\Sigma(\mathbf{r},E')\,\Phi(\mathbf{r},E')\,dE'$.
The form of this generalised representation does not depend on the applied nodal scheme. The one-group inhomogeneous equations are usually solved through inner iterations with a fixed fission source estimated from a previous flux approximation, and the solving of the inhomogeneous equations is tantamount to inverting the matrix $\hat{A}$. The factor λ has the meaning of an effective multiplication factor and is actually included in the normalisation of the fixed fission source, most often to

$\frac{1}{\lambda}\sum_g \chi_g \sum_{g'}\sum_n \nu\Sigma_{g'}^n\Phi_{g'}^n = 1$.

It is evident that this normalisation is effectively imposed on the previous flux approximation. With this normalisation
of the right-hand side of the multigroup equation, the result from the j-th cycle of inner iterations is:

$\Phi^{(j)} = \hat{A}^{-1}\hat{F}\Phi^{(j-1)}$ (2)
It is seen that the succession of those cycles, which is essentially the outer-iteration process, is a power iteration with the matrix $\hat{B} \equiv \hat{A}^{-1}\hat{F}$. Therefore, with a sufficiently large j the relation (2) is equivalent to:

$\lambda_1\Phi = \hat{B}\Phi$, (3)

where $\lambda_1$ is the largest eigenvalue of $\hat{B}$ and Φ is its corresponding eigenvector. It can be shown that only this eigenvector has real and non-negative values of all of its components, which is actually the physical requirement on the neutron flux. Therefore, only the largest eigenvalue of $\hat{B}$ has the physical meaning of an effective multiplication factor. And of course, (3) is an alternative formulation of (1).
Within the framework of the power-iteration process (Chapter 3), the new estimate of $\lambda_1$ is obtained e.g. through the expression $\lambda_1^{(j)} = \frac{\mathbf{y}\cdot\Phi^{(j)}}{\mathbf{y}\cdot\Phi^{(j-1)}}$, where y can in principle be any non-zero vector. The choice $\mathbf{y} = \Phi^{(j)}$, i.e.

$\lambda_1^{(j)} = \frac{\Phi^{(j)}\cdot\Phi^{(j)}}{\Phi^{(j)}\cdot\Phi^{(j-1)}}$,

will accelerate convergence if the eigenvectors of $\hat{B}$ are mutually orthogonal (Chapter 3). Although such orthogonality is not guaranteed, this choice is standard for solving the described problem. Moreover, for evaluating $\lambda_1^{(j)}$ it is sufficient that the numerator and the denominator be the same linear combinations of the elements of $\Phi^{(j)}$ and $\Phi^{(j-1)}$ respectively. Based on this, and in accordance with the commonly adopted definition of the effective multiplication factor, the latter is most often evaluated through the ratio:
$k_{\mathrm{eff}}^{(j)} = \frac{(\hat{F}\Phi)^{(j)}\cdot(\hat{F}\Phi)^{(j)}}{(\hat{F}\Phi)^{(j)}\cdot(\hat{F}\Phi)^{(j-1)}} = \frac{\sum_n V_n\left(\sum_{g=1}^{2}\nu\Sigma_g^n\Phi_g^{n,(j)}\right)^2}{\sum_n V_n\left(\sum_{g=1}^{2}\nu\Sigma_g^n\Phi_g^{n,(j)}\right)\left(\sum_{g=1}^{2}\nu\Sigma_g^n\Phi_g^{n,(j-1)}\right)}$, (4)

where $V_n$ is the volume of node n.
The rightmost expression in (4) is for the particular (however commonplace) case of two energy groups, with account of the fact that $\chi_1 = 1$ and $\chi_2 = 0$.

Here it must be recalled that before the next cycle of inner iterations the fission neutron source is normalised: $\hat{F}\Phi^{(j)} \leftarrow \frac{1}{k_{\mathrm{eff}}^{(j)}}\hat{F}\Phi^{(j)}$. It is also worth mentioning that the term inner iterations is only conventional and actually refers to the effective inversion of the matrix $\hat{A}$, which in some cases does not involve an iterative procedure.

The outer iterations can be accelerated using certain specialised methods, which however will not be discussed here.
The outer iterations can be embedded in criticality-search iterations, which consist in varying the material composition of the reactor medium (e.g. the boron concentration and/or the position of chosen control rods) in order to achieve $k_{\mathrm{eff}} = 1$, and the latter in their own turn can be embedded in a cycle of so-called burnup iterations. Burnup iterations are a means of accounting for the effect of the evolution of the nuclide composition of the fuel during reactor operation. The process is iterative because the material properties, and hence the matrices $\hat{A}$ and $\hat{F}$, depend on the sought nuclide composition (often aggregately characterised by a quantity known as fuel burnup). These iterations usually account separately for the so-called poisoning (generation, decay and neutron-induced depletion of fission products with large neutron-absorption cross-sections).
Example: one-dimensional two-group problem
The example consists in solving the one-dimensional two-group stationary diffusion equation in plane geometry with diffusion constants representative of the fuel and the reflector regions of WWER-1000. The boundary conditions are void (α = 0.5). The nodal scheme is based on finite differencing and the power iteration is not accelerated. The convergence criteria are $\varepsilon_k = 10^{-7}$ for the multiplication factor and $\varepsilon_f = 10^{-6}$ for the fission-source shape.

The first example is for a problem with a single fuel material (with relatively high multiplying properties) without a reflector. The effect of the discretisation step h (constant for the entire problem) is studied. The thickness of the fuel layer is 50 cm and is close to the critical value.
The effect of the discretisation step h (with a step of 16.7 cm the problem is subdivided into only 3 nodes) is demonstrated to be quite significant from a physical point of view. It is also seen that the fission-rate shape is very close to a cosine, as should be expected for a bare homogeneous slab.
Bare homogeneous slab. Effect of the discretisation step

h, cm   keff       δkeff, pcm   # of source iterations
16.7    1.011556   1520         33
10      1.00215     580         23
5       0.99791     155         18
1       0.996355      0         23

(1 pcm = 10^-5)
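The trend in the table – $k_{\mathrm{eff}}$ decreasing towards its fine-mesh value as h shrinks – can be reproduced qualitatively with a much simpler sketch: a one-group bare slab with zero-flux edges and made-up constants (the two-group WWER data used for the actual table are not reproduced here):

```python
import numpy as np

# One-group bare slab, zero-flux edges, made-up constants (cm, cm^-1).
D, Sa, nuSf, a = 1.0, 0.06, 0.07, 50.0

def keff(n):
    """Finite-difference k_eff via source (power) iteration; n interior nodes."""
    h = a / (n + 1)
    A = (np.diag(np.full(n, 2 * D / h**2 + Sa))
         + np.diag(np.full(n - 1, -D / h**2), 1)
         + np.diag(np.full(n - 1, -D / h**2), -1))
    phi, k = np.ones(n), 1.0
    for _ in range(500):
        src = nuSf * phi                      # fission source F phi
        phi = np.linalg.solve(A, src / k)     # "inner" solve, fixed source
        k *= np.sum(nuSf * phi) / np.sum(src)
    return k

for n in (3, 9, 49):
    print(n, keff(n))   # k_eff decreases towards nuSf/(Sa + D*(pi/a)**2)
```

The converged flux is the discrete sine (cosine about the midplane) mode, in line with the figure below, and the fine-mesh $k_{\mathrm{eff}}$ approaches the analytic value $\nu\Sigma_f/(\Sigma_a + DB^2)$ with $B = \pi/a$.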
[Figure: Bare homogeneous slab. Fission rate – normalised fission rate (rel. units) vs. position (cm)]
[Figure: Bare homogeneous slab. Two-group flux – thermal and epithermal flux (rel. units) vs. position (cm)]
The second example differs from the first by the addition of a 20 cm thick reflector. The observed effect is a large increase of the multiplication factor and a higher sensitivity to the discretisation step h. The latter is due to the complicated flux shape in the vicinity of the border between the fuel region and the reflector. Another important effect is the flattened fission-density shape. The characteristic peaking of the thermal neutron flux in the reflector close to the fuel region is observed as well.
Reflected homogeneous slab. Effect of the discretisation step

h, cm                    keff       δkeff, pcm   # of source iterations
16.7 (20 in reflector)   1.092594   4605         12
10                       1.063492   1695         14
5                        1.050144    360         16
1                        1.046545      0         21
[Figure: Reflected homogeneous slab. Fission rate – normalised fission rate (rel. units) vs. position (cm)]
[Figure: Reflected homogeneous slab. Two-group flux – thermal and epithermal flux (rel. units) vs. position (cm)]
The third example differs from the second in that the central 10 cm contain fuel with the weakest multiplying properties, surrounded by a 10 cm intermediate layer with better multiplying properties and a 10 cm outer layer with the strongest multiplying properties (identical to that of the preceding examples). It is seen that the fission-rate shape is flattened further, although the choice of fuel regions is far from optimal.
Reflected heterogeneous slab. Effect of the discretisation step

h, cm   keff       δkeff, pcm   # of source iterations
10      0.98103    1852         18
5       0.966429    392         17
1       0.962509      0         18
[Figure: Reflected heterogeneous slab. Fission rate – normalised fission rate (rel. units) vs. position (cm)]
[Figure: Reflected heterogeneous slab. Two-group flux – thermal and epithermal flux (rel. units) vs. position (cm)]
Further reading

1. W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes in FORTRAN, 2nd ed., Cambridge University Press, 1992.
2. A. C. Kak, M. Slaney, Principles of Computerized Tomographic Imaging, IEEE Press, 1988.
3. E. E. Lewis, W. F. Miller, Computational Methods of Neutron Transport, John Wiley & Sons, 1984.
4. R. Stammler, M. J. Abbate, Methods of Steady-State Reactor Physics in Nuclear Design, Academic Press, 1983.
5. M. Hjorth-Jensen, Computational Physics, University of Oslo, 2013 (http://www.physics.ohio-state.edu/~ntg/6810/readings/Hjorth-Jensen_lectures2013.pdf).