CHAPTER 3
SOLVING PROBLEMS OF DIFFERENT RESEARCH AREA USING TLBO
This chapter presents the performance of the TLBO algorithm in solving problems from different research areas, viz. computer networks, power systems, data clustering and neural networks. The performance of the TLBO algorithm is compared with a number of heuristic optimization techniques.
3.6 Quality of Service (QoS) Multicast Routing
The QoS multicast routing problem is to find a multicast routing tree with minimal cost that satisfies constraints such as bandwidth and delay. This problem is NP-complete; hence it is usually solved by heuristic or intelligent optimization techniques. Here the Teaching Learning Based Optimization (TLBO) method is used to optimize the multicast tree, with a fitness function implementing the constraints specified by the QoS conditions. The experimental results examine the relation between the number of nodes and edges in the input graph and the convergence time, and compare the quality of the obtained solutions with other evolutionary techniques. The results reveal that the TLBO algorithm performs better than the compared existing algorithms.
3.6.1 Introduction
The rapid development of network multimedia technology enables more and more real-time multimedia services, such as video conferencing, on-line games and distance education, to become mainstream internet activities. These services often require the network to provide multicast capabilities. Multicast refers to the delivery of packets from a single source to multiple destinations. The central problem of QoS routing is to set up a multicast tree that can satisfy certain QoS parameters. However, the problem of constructing a multicast tree under multiple constraints is NP-complete (Wang, Z. et al.)24. Hence, the problem is usually solved by heuristic or intelligent optimization techniques.
In QoS multicast routing, each node or link has some parameters associated with it. These parameters are used to determine the most efficient path from the source to the destinations. Thus, these network resources must be handled and shared in such a way that an optimal solution with minimal cost can be obtained for the QoS multicast routing problem. This cost is determined by the parameter values associated with each link that may be present in a chosen path from a source to a destination. Genetic Algorithms (GA) and Ant Colony Optimization (ACO) (Lhotska, L. et al.)25 have been used to solve this problem. The Particle Swarm Optimization (PSO) (Kennedy, J. et al.)19, (Xing, J. et al.)26 technique has also been applied to QoS multicast routing. Besides this, an algorithm based on quantum mechanics, named Quantum-Behaved Particle Swarm Optimization (QPSO), has been proposed (Sun, J. et al.)27. Later, PSO was combined with the Genetic Algorithm (GA) to give the hybrid genetic algorithm and particle swarm optimization (HGAPSO) (Li, C. et al.)28 for QoS multicast routing. A tree-based PSO is proposed in (Wang, H. et al.)29 for optimizing the multicast tree directly. However, its performance depends on the number of particles generated, and a further drawback is that merging multicast trees and eliminating directed and nested directed cycles are very complex operations. To remove these disadvantages, the Teaching Learning Based Optimization technique is used here to optimize QoS multicast routing.
3.6.2 Multicast Routing Problem Formulation
A QoS multicast routing problem usually involves several constraints. Here the QoS constraints are simplified, and the focus is mainly on the bandwidth-delay constrained least-cost multicast routing problem.
A communication network can be modeled as an undirected graph $G = \langle V, E \rangle$, where $V$ is the set of all nodes representing routers or switches and $E$ is the set of all edges representing physical or logical connections between nodes. Each link $(x, y) \in E$ in $G$ has three weights $(B(x,y), D(x,y), C(x,y))$ associated with it, in which the positive real values $B(x,y)$, $D(x,y)$ and $C(x,y)$ denote the available bandwidth, the delay and the cost of the link respectively. Given a path $P(x,y)$ connecting any two nodes $x, y$ in $G$, it is presumed that:

The delay of a path is the sum of the delays of its links $(a, b)$:

$$Delay(P(x,y)) = \sum_{(a,b) \in P(x,y)} D(a,b) \qquad (3.1)$$

The available bandwidth of $P(x,y)$ is taken as the bottleneck bandwidth of $P(x,y)$:

$$Width(P(x,y)) = \min_{(a,b) \in P(x,y)} B(a,b) \qquad (3.2)$$
In QoS transmission of real-time multimedia services, the optimal-cost routing problem with delay and bandwidth constraints can be described as follows. Given $G = \langle V, E \rangle$, a source node $s$ and a multicast member set $M \subseteq V - \{s\}$, the problem is to find the multicast tree $T = (V_T, E_T)$ from source $s$ to all destinations $v \in M$, where $T \subseteq G$ and $T$ must satisfy the following conditions:

$$Cost(T) = \min \sum_{(x,y) \in E_T} C(x,y) \qquad (3.3)$$

$$\sum_{(x,y) \in P_T(s,v)} D(x,y) \le D_{max}, \quad \forall v \in M \qquad (3.4)$$

$$Width(P_T(s,v)) \ge W_{min}, \quad \forall v \in M \qquad (3.5)$$

where $P_T(s,v)$ is the set of links in the path from source node $s$ to destination node $v$ in the multicast tree. Relation (3.3) states that the cost of the multicast routing tree should be minimal. Relation (3.4) expresses the delay requirement of QoS, in which $D_{max}$ is the permitted maximum delay of real-time services, and relation (3.5) guarantees the bandwidth of the communication traffic, in which $W_{min}$ is the minimum bandwidth required by all applications.
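As a concrete illustration, the two path metrics of (3.1) and (3.2) can be computed directly from per-link tables. The sketch below is illustrative only; the dictionary-based graph representation and the edge names are assumptions, not taken from the text.

```python
# Path metrics of Eqs. (3.1)-(3.2). A path is a list of edges; D and B map
# each edge to its delay and available bandwidth.

def path_delay(path, D):
    """Eq. (3.1): sum of the link delays along the path."""
    return sum(D[e] for e in path)

def path_width(path, B):
    """Eq. (3.2): bottleneck (minimum) bandwidth along the path."""
    return min(B[e] for e in path)

# Hypothetical two-link path from source "s" via "u" to destination "v".
D = {("s", "u"): 5.0, ("u", "v"): 7.0}
B = {("s", "u"): 30.0, ("u", "v"): 20.0}
p = [("s", "u"), ("u", "v")]
print(path_delay(p, D), path_width(p, B))  # 12.0 20.0
```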
3.6.3 The TLBO Algorithm for Multicast Routing
The TLBO-based algorithm for solving multicast routing is outlined as follows. The process is initialized with a group of random learners (solutions). The learner with the best fitness value is then designated as the teacher. Subsequently, in each iteration, learners improve based on the teacher and on interactions among themselves.
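For reference, the teacher and learner phases mentioned above follow the standard TLBO scheme. The following is a minimal sketch of one TLBO iteration for a minimization problem; the variable names, bounds handling and greedy acceptance are illustrative assumptions, not details taken from the text.

```python
import numpy as np

def tlbo_step(pop, f, lb, ub, rng):
    """One TLBO iteration: teacher phase followed by learner phase."""
    n, dim = pop.shape
    fit = np.apply_along_axis(f, 1, pop)
    teacher = pop[fit.argmin()]                 # best learner acts as the teacher
    mean = pop.mean(axis=0)
    tf = rng.integers(1, 3)                     # teaching factor, 1 or 2
    # Teacher phase: shift learners toward the teacher, away from the mean.
    cand = np.clip(pop + rng.random((n, dim)) * (teacher - tf * mean), lb, ub)
    cfit = np.apply_along_axis(f, 1, cand)
    better = cfit < fit
    pop[better], fit[better] = cand[better], cfit[better]
    # Learner phase: each learner moves relative to a random partner.
    for i in range(n):
        j = int(rng.integers(n))
        if j == i:
            continue
        step = (pop[i] - pop[j]) if fit[i] < fit[j] else (pop[j] - pop[i])
        new = np.clip(pop[i] + rng.random(dim) * step, lb, ub)
        nf = f(new)
        if nf < fit[i]:                         # keep the move only if it improves
            pop[i], fit[i] = new, nf
    return pop, fit
```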
Initialization of individuals
To apply the TLBO method to multicast routing problems, the term "individual" is used in place of "learner". In the initialization process, a set of individuals is created at random. Individual $i$'s position at iteration 0 can be represented as the vector

$$x_i(0) = \{x_{i1}(0), x_{i2}(0), \ldots, x_{in}(0)\} \qquad (3.6)$$

where $n$ is the number of network nodes; the position of individual $i$ thus corresponds to a generation update quantity covering all network nodes.
Before starting the TLBO algorithm, all links whose bandwidth is less than the minimum required threshold $W_{min}$ can be removed. If, in the refined graph, the source node and all the destination nodes do not lie in one connected sub-graph, the topology does not meet the bandwidth constraint; in this case, the source should negotiate with the related application to relax the bandwidth bound. On the other hand, if the source node and all the destination nodes are in a connected sub-graph, then this sub-graph is used as the network topology in the TLBO algorithm.
Evaluation function definition
The evaluation function $f$ given in Equation (3.7) is the evaluation value of each individual in the population. It is essentially a reciprocal of the performance criterion $Cost(T)$ in Equation (3.3): the smaller $Cost(T)$ is for individual $T$, the higher its evaluation value.

$$f(T) = \frac{\prod_{m \in M} \phi(s) \times \prod_{m \in M} \varphi(s)}{\sum_{e \in T} cost(e)} \qquad (3.7)$$

$$\phi(s) = Delay(P(s,m)) - D_{max} \qquad (3.8)$$

$$\varphi(s) = \begin{cases} 1, & W_{min} \le Width(P(s,m)) \\ \gamma, & W_{min} > Width(P(s,m)) \end{cases} \qquad (3.9)$$

where $\varphi(s)$ is the penalty function; the value $\gamma = 0.5$ is used for this problem.
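A small sketch of this evaluation is given below. Following (3.9), the bandwidth penalty is 1 when the constraint holds and γ otherwise; applying the same penalty form to the delay bound is an interpretation made here for the sketch, since (3.8) only records the delay margin.

```python
GAMMA = 0.5  # penalty value gamma of Eq. (3.9)

def evaluate_tree(tree_edges, paths, C, D, B, d_max, w_min):
    """Eq. (3.7)-style fitness; `paths` maps each destination m in M to its
    list of edges from the source s. Higher values are better."""
    total_cost = sum(C[e] for e in tree_edges)
    value = 1.0
    for path in paths.values():
        delay_ok = sum(D[e] for e in path) <= d_max   # delay bound of Eq. (3.4)
        width_ok = min(B[e] for e in path) >= w_min   # bandwidth bound of Eq. (3.5)
        value *= 1.0 if delay_ok else GAMMA           # delay penalty (assumed form)
        value *= 1.0 if width_ok else GAMMA           # bandwidth penalty, Eq. (3.9)
    return value / total_cost
```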
3.6.4 Experimental Results
Simulations are performed to investigate the performance of the multicast routing algorithm based on TLBO. The source and the destinations are randomly generated. The bandwidth and delay of each link are uniformly distributed in the ranges [10, 50] and [0, 50 ms] respectively, and the cost of each link is uniformly distributed in the range [0, 200].

To analyze the performance of the TLBO algorithm on the defined problem, different sets of inputs were generated and the algorithm was tested for varying numbers of nodes and edges. Table 3.1.1 compares the running times of the genetic algorithm (GA), immune algorithm (IA), ant colony optimization (ACO), particle swarm optimization (PSO) and teaching learning based optimization (TLBO) algorithms for different combinations of nodes and edges.
Table 3.1.1 Comparison of running time (in seconds)
Nodes Edges TLBO PSO ACO IA GA
20 32 0.08 0.21 0.27 0.35 0.48
40 89 0.19 0.38 0.45 0.51 0.63
80 172 0.80 1.36 1.42 1.48 1.59
120 239 1.32 1.97 2.68 3.12 3.16
160 336 3.46 4.18 5.37 5.82 7.72
180 371 5.12 6.46 8.19 8.93 9.85
200 427 6.89 8.36 9.73 10.26 11.25
Table 3.1.2 The optimal solution quality comparison

Algorithm  Optimal  Sub-optimal  Invalid
TLBO       89.4%    9.7%         0.9%
PSO        81.2%    17.5%        1.3%
GA         78.4%    19.4%        2.2%
IA         78.9%    19.6%        1.5%
ACO        79.9%    18.2%        1.9%
The results in Table 3.1.1 clearly show that the running time of the TLBO algorithm grows very slowly with the size of the network and that the running time of TLBO is smaller than those of GA, IA, ACO and PSO. This behavior is shown in Fig. 3.1.1. The TLBO algorithm is therefore very effective. Furthermore, for the same multicast routing, 300 simulations are performed with the TLBO algorithm against GA (Wang, Z. et al.)30, the immune algorithm (IA) (Liu, F. et al.)31, the ACO algorithm (Carrillo, L. et al.)32 and the PSO algorithm. The computational results are shown in Table 3.1.2. It is found that the TLBO algorithm performs better than GA, IA, ACO and PSO. Thus the TLBO algorithm shows good performance in comparison with the other algorithms.
Fig. 3.1.1 Convergence behavior of the GA, PSO, IA, ACO and TLBO algorithms
3.6.5 Conclusion
The multicast routing problem arises in many multimedia communication applications, and computing the bandwidth-delay constrained least-cost multicast routing tree is an NP-complete problem. In the above sections a novel multicast routing algorithm based on the TLBO algorithm is developed. The experimental results show that this algorithm has better performance and efficiency.
3.7 0-1 Integer Programming for Generation Maintenance Scheduling in Power Systems
This section presents the optimal solution of the unit maintenance scheduling problem, in which cost reduction is as important as reliability. The objective function used to address this problem considers the effect of economy as well as reliability. Various constraints, such as spinning reserve, maintenance duration and crew availability, are taken into account while dealing with such problems. In this work the Teaching Learning Based Optimization (TLBO) algorithm is applied to a power system with six generating units. Numerical results reveal that the TLBO algorithm can find better and faster solutions than other heuristic or deterministic methods.
3.7.1 Introduction
The major responsibility of a power station maintenance department is to maximize plant reliability, availability and efficiency by determining both short- and long-term maintenance requirements. It must also comply with statutory and mandatory requirements and investigate plant problems. The aim of the department is to make the most economic use of its available resources; this is achieved, in part, by having a level of staff (engineering, supervisory, craft) to deal with the general day-to-day steady workload and by making alternative arrangements to cater for workload peaks (Tabari, N. M. et al.)33. To achieve this goal, periodic servicing must take place and normally falls under the following items: planned maintenance (overhaul and preventive maintenance) and unplanned maintenance (emergency maintenance).
Preventive maintenance requires shop facilities, skilled labor, record keeping and the stocking of replacement parts, and is hence expensive. The cost of downtime resulting from avoidable outages may amount to ten or more times the actual cost of repair. The high cost of downtime makes it imperative for economic operation that maintenance be scheduled into the operating schedule (Tabari, N. M. et al.)33. Thus the maintenance scheduling problem is to determine the periods in which the generating units of an electric power utility should be taken off line for planned preventive maintenance over the course of a one- or two-year planning horizon, in order to minimize the total operating cost while system energy, reliability requirements and a number of other constraints are satisfied (Marwali, M. K. C. et al.)34.
Many efforts have been made by researchers in the past to solve this maintenance scheduling problem. These approaches are based on many heuristic/stochastic and deterministic algorithms; a few examples are Lagrangian relaxation (Geetha, T. et al.)35, linear programming (Chattopadhyay, D.)36, mixed integer programming (Dailva, E.L. et al.)37, decomposition methods (Marwali, M. K. C. et al.)38, fuzzy logic (El-Sharkh, M.Y. et al.)39, ant colony optimization (Foong, W.K.)40 and Harmony Search (Fetanat, A. et al.)41. In this section, one more recently developed stochastic approach, known as Teaching Learning Based Optimization (TLBO) (Rao, R.V. et al.)4, is demonstrated for the maintenance scheduling field. The attractive feature of TLBO is that, unlike other techniques such as GA, PSO and Harmony Search, it has no parameters to tune and hence can find the optimal solution in less computational time. The simulation results in this work reveal that the TLBO approach not only outperforms all the other approaches in terms of producing good results but is also able to do so in less time.
3.7.2 The Integer Programming Problem
Many power system areas, such as short-term hydro scheduling, optimal reconfiguration and capacitor allocation, reactive power market clearing and transmission network expansion planning, require the variables to be integers. These problems are called integer programming problems. Optimization methods developed for real search spaces can be used to solve integer programming problems by rounding off the real optimal values to the nearest integers.

The maintenance scheduling problem is a kind of 0-1 integer programming problem. The strength of TLBO lies in solving nonlinear optimization problems in the space of real numbers, and it has applications in engineering optimization problems (Rao, R.V. et al.)4, (Rao, R.V. et al.)42, (Rao, R.V. et al.)43. To extend the method to 0-1 integer programming, the variables of the problem are treated as real numbers, chosen randomly in the interval [0, 1] at the beginning of the TLBO algorithm. When evaluating a solution, values equal to or higher than 0.5 are rounded to 1 and values less than 0.5 are rounded to 0, as sketched below.
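A minimal sketch of this 0-1 mapping (the variable count and random seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
position = rng.random(15)            # real-valued learner position in [0, 1]
x = (position >= 0.5).astype(int)    # values >= 0.5 round to 1, the rest to 0
print(x)
```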
3.7.3 Maintenance Scheduling Model
Here, Leou's model with two objective functions (Tabari, N. M. et al.)33, (Leou, R.C. et al.)44 is used. Due to the binary nature of the maintenance scheduling problem, an integer programming optimization method is well suited; it remains computationally acceptable even when the problem is scaled up to a large number of variables and constraints. Here, the beginning time of maintenance is adopted as the state variable. The maintenance scheduling problem can then be set up as a 0-1 integer program whose general form is: find the n-vector $x^*$ which minimizes the cost function

$$Z = c^T x \qquad (3.10)$$

subject to

$$Ax \le b \qquad (3.11)$$

where $x_i = 0$ or $1$, $i = 1, 2, \ldots, n$.
A feasible solution satisfies the constraints. A feasible n-vector $x^*$ is optimal if and only if $c^T x^* \le c^T x$ for all feasible $x$. Each $x_i$ is associated with beginning maintenance on some unit $j$ during some week $k$; $x_i = 1$ means that maintenance of unit $j$ starts in week $k$, and for each problem tables relating $i$, $j$ and $k$ are developed.

As an example, a group of six units that must be maintained during a ten-week period is considered here. The sample machine input data are shown in Table 3.2.1.
Table 3.2.1 Machine input data for the six-unit system

Unit No.  Allowed period  Capacity (MW)  Maintenance crew  Outage duration
1         1-4             200            10                3
2         3-6             300            15                4
3         5-7             300            15                4
4         6-9             300            15                4
5         12-14           500            20                4
6         14-16           500            20                4
It can be seen that only four units (units 1 to 4) need to be maintained during the ten-week period. Table 3.2.2 gives the variables associated with each unit to be maintained; in this table, i, j and k respectively denote the associated unknown, the unit number, and the week in which maintenance starts. For instance, if x5 = 1, maintenance on unit 2 begins in the third week, and if x6 = 1, it begins in the fourth week.

Table 3.2.2 State variables for the six-unit system during the 10-week period

x_i    x1     x2     x3     x4     x5     x6     x7     x8     x9     x10    x11    x12    x13    x14    x15
(j,k)  (1,1)  (1,2)  (1,3)  (1,4)  (2,3)  (2,4)  (2,5)  (2,6)  (3,5)  (3,6)  (3,7)  (4,6)  (4,7)  (4,8)  (4,9)
Objective function
Deferring the maintenance of units may cause damage to the machines. In order to improve the reliability of the power system and save the maintenance expense of damaged machines, the objective adopted is to maintain the units as early as possible. In the above example, only four units need to be maintained during the following ten-week period. The objective functions can be expressed in the following forms:

$$c_1 = [\,1\ 2\ 3\ 4\ 1\ 2\ 3\ 4\ 1\ 2\ 3\ 1\ 2\ 3\ 4\,] \qquad (3.12)$$

$$c_2 = [\,0\ 1\ 2\ 3\ 0\ 1\ 2\ 3\ 0\ 1\ 2\ 0\ 1\ 2\ 3\,] \qquad (3.13)$$

According to (3.10), the values in the $c_1$ vector (Tabari, N. M. et al.)33 and the $c_2$ vector (Leou, R.C. et al.)44, (Leou, R.C.)45 are the coefficients of the objective functions and express the maintenance cost of each generating unit.

For each unit there is a cost of 1 associated with beginning maintenance during the first allowed week, a cost of 2 for beginning maintenance in the second week, and so on. The schedule that minimizes this cost function is the "earliest possible" maintenance schedule.
Constraints
Spinning reserve
In order to maintain the electric power supply, there must be sufficient spinning reserve to compensate for the outage of generating units. The spinning reserve constraint can be expressed as

capacity under maintenance + load capacity + spinning reserve ≤ generating capacity    (3.14)
Maintenance crew
For each period, the number of people performing maintenance cannot exceed the available crew. Assume the number of people available for maintenance is P; then for each period the crew constraint is

number of people performing maintenance ≤ P    (3.15)
Duration of maintenance
In order to keep the units operating in good condition, each unit should be maintained after a certain period of operation.
CASE STUDY
Input data
In this section, test results for the six-unit test system mentioned previously are reported.

As indicated in Table 3.2.1, the six-unit system can generate 2100 MW, and the number of people available for maintenance is 50. During the maintenance period (a ten-week interval), only four units (unit 1 to unit 4) need to undergo maintenance. The machine input data are shown in Table 3.2.1. Figure 3.2.1 shows the load curve of the system. The spinning reserve is 400 MW.
The 1st to 4th constraints of (3.16) indicate the beginning-maintenance constraints for units 1 to 4, and the 5th to 14th constraints represent the spinning reserve constraints. From the machine input data of Table 3.2.1, in periods 1 and 2 only unit 1 can undergo maintenance. The 6th constraint contains two terms, 200x1 and 200x2: 200x2 describes the possibility that unit 1 starts maintenance in period 2, where 200 is the capacity of unit 1, and since the outage of unit 1 lasts 3 periods, 200x1 is also included in the 6th constraint. For the same reason, the spinning reserve constraint of period 3 is included in the 7th constraint. In addition to the spinning reserve constraints, the people available to perform maintenance are also important. The 15th to 24th constraints of (3.16) describe the crew constraints. In this case, 50 people are available to perform maintenance. In period 1, from the machine data in Table 3.2.1, only unit 1 can be in maintenance; the crew constraint of period 1 is given by the 15th constraint, and the number of people needed to perform maintenance on unit 1 is 10. In period 2, again only unit 1 can be in maintenance, but unit 1 may start maintenance between period 1 and period 4; therefore, in period 2, the crew constraint is given by the 16th constraint. Following the same rule, the crew constraints are built for all periods.
Integrating these input data, the model is as follows:

Min Z = x1 + 2x2 + 3x3 + 4x4 + x5 + 2x6 + 3x7 + 4x8 + x9 + 2x10 + 3x11 + x12 + 2x13 + 3x14 + 4x15

s.t.

x1 + x2 + x3 + x4 = 1
x5 + x6 + x7 + x8 = 1
x9 + x10 + x11 = 1
x12 + x13 + x14 + x15 = 1
200x1 + 800 + 400 ≤ 2100
200x1 + 200x2 + 700 + 400 ≤ 2100
200x1 + 200x2 + 200x3 + 300x5 + 600 + 400 ≤ 2100
200x2 + 200x3 + 200x4 + 300x5 + 300x6 + 500 + 400 ≤ 2100
200x3 + 200x4 + 300x5 + 300x6 + 300x7 + 300x9 + 700 + 400 ≤ 2100
200x4 + 300x5 + 300x6 + 300x7 + 300x8 + 300x9 + 300x10 + 300x12 + 800 + 400 ≤ 2100
300x6 + 300x7 + 300x8 + 300x9 + 300x10 + 300x11 + 300x12 + 300x13 + 1000 + 400 ≤ 2100
300x7 + 300x8 + 300x9 + 300x10 + 300x11 + 300x12 + 300x13 + 300x14 + 1400 + 400 ≤ 2100    (3.16)
300x8 + 300x10 + 300x11 + 300x12 + 300x13 + 300x14 + 300x15 + 1200 + 400 ≤ 2100
300x11 + 300x13 + 300x14 + 300x15 + 1100 + 400 ≤ 2100
10x1 ≤ 50
10x1 + 10x2 ≤ 50
10x1 + 10x2 + 10x3 + 15x5 ≤ 50
10x2 + 10x3 + 10x4 + 15x5 + 15x6 ≤ 50
10x3 + 10x4 + 15x5 + 15x6 + 15x7 + 15x9 ≤ 50
10x4 + 15x5 + 15x6 + 15x7 + 15x8 + 15x9 + 15x10 + 15x12 ≤ 50
15x6 + 15x7 + 15x8 + 15x9 + 15x10 + 15x11 + 15x12 + 15x13 ≤ 50
15x7 + 15x8 + 15x9 + 15x10 + 15x11 + 15x12 + 15x13 + 15x14 ≤ 50
15x8 + 15x10 + 15x11 + 15x12 + 15x13 + 15x14 + 15x15 ≤ 50
15x11 + 15x13 + 15x14 + 15x15 ≤ 50
Fig. 3.2.1 Load curve of the six-unit system
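To make the encoding concrete, the sketch below shows how the model (3.16) can be evaluated for TLBO with a penalty for violated constraints. Only the first spinning-reserve and crew rows of (3.16) are transcribed; the remaining rows would be added in the same way. The penalty weight is an assumption for illustration.

```python
import numpy as np

c = np.array([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4])  # c1 vector, Eq. (3.12)

# Start-time equalities of (3.16): exactly one start week per unit.
groups = [range(0, 4), range(4, 8), range(8, 11), range(11, 15)]

# First two spinning-reserve rows and first two crew rows of (3.16),
# written as A x <= b; the remaining rows are added analogously.
A = np.zeros((4, 15))
b = np.zeros(4)
A[0, 0] = 200;        b[0] = 2100 - 800 - 400   # period 1 reserve
A[1, [0, 1]] = 200;   b[1] = 2100 - 700 - 400   # period 2 reserve
A[2, 0] = 10;         b[2] = 50                 # period 1 crew
A[3, [0, 1]] = 10;    b[3] = 50                 # period 2 crew

def penalized_cost(x, penalty=1e4):
    """Objective (3.10) plus a penalty for equality/inequality violations."""
    viol = sum(abs(x[g].sum() - 1) for g in groups)
    viol += np.maximum(A @ x - b, 0).sum()
    return c @ x + penalty * viol

x = np.zeros(15, dtype=int)
x[[0, 4, 8, 14]] = 1          # schedule x1 = x5 = x9 = x15 = 1 (Table 3.2.3)
print(penalized_cost(x))      # prints 7.0, matching Z in Table 3.2.3
```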
3.7.4 Output Result
Simulation Strategy
When comparing the performance of the algorithms, the focus is on the computational time required to find the solution, and this calls for a fair time measurement. The number of iterations or generations cannot be accepted as a time measure, since the algorithms perform different amounts of work in their inner loops. Hence, the number of fitness function evaluations (FEs) is chosen as the measure of computation time instead of generations or iterations. Since the algorithms are stochastic in nature, the results of two successive runs usually do not match; hence, 20 independent runs (with different seeds of the random number generator) are taken for each algorithm. The results are stated in terms of mean values and standard deviations over the 20 runs for the HS and TLBO cases.

Finally, it is pointed out that all the experiment codes are implemented in MATLAB. The experiments are conducted on a Pentium 4 desktop with 1 GB of memory in a Windows XP environment.
Experimental Results
For comparing results on the scheduling problem, fuzzy 0-1 integer programming, implicit enumeration, the Harmony Search algorithm and Teaching Learning Based Optimization are considered. To judge the accuracy of the optimal solutions obtained by the HS and TLBO algorithms, each of them is run for a very long time and the results are noted. The values reported for these algorithms are averages over 20 simulations, with standard deviations to indicate the range of values to which the algorithms converge. The results for fuzzy 0-1 integer programming and implicit enumeration are taken from (Tabari, N. M. et al.)33, (Huang, C.J. et al.)46. Detailed results are given in Table 3.2.3 for the 1st objective function and in Table 3.2.4 for the 2nd objective function.
Table 3.2.3 Objective function with the c1 vector

x      Fuzzy 0-1 Integer Programming  Implicit Enumeration  Harmony Search Algorithm  TLBO Algorithm
x1     0                              1                     1                         1
x2     0                              0                     0                         0
x3     1                              0                     0                         0
x4     0                              0                     0                         0
x5     1                              1                     1                         1
x6     0                              0                     0                         0
x7     0                              0                     0                         0
x8     0                              0                     0                         0
x9     1                              1                     1                         1
x10    0                              0                     0                         0
x11    0                              0                     0                         0
x12    0                              0                     0                         0
x13    0                              0                     0                         0
x14    0                              0                     0                         0
x15    1                              1                     1                         1
Z      9                              7                     7                         7
No. of fitness evaluations (mean±std):  -    -    more than 30,000    1483±187.76
Table 3.2.4 Objective function with the c2 vector

x      Fuzzy 0-1 Integer Programming  Implicit Enumeration  Harmony Search Algorithm  TLBO Algorithm
x1     0                              1                     1                         1
x2     0                              0                     0                         0
x3     1                              0                     0                         0
x4     0                              0                     0                         0
x5     1                              1                     1                         1
x6     0                              0                     0                         0
x7     0                              0                     0                         0
x8     0                              0                     0                         0
x9     1                              1                     1                         1
x10    0                              0                     0                         0
x11    0                              0                     0                         0
x12    0                              0                     0                         0
x13    0                              0                     0                         0
x14    0                              0                     0                         0
x15    1                              1                     1                         1
Z      5                              3                     3                         3
No. of fitness evaluations (mean±std):  -    -    more than 70,000    2941±307.962
From Table 3.2.3 and Table 3.2.4 it is clear that, compared with Leou's method (fuzzy 0-1 integer programming), which gives Z values of 9 and 5 (Tabari, N.M. et al.)33, (Huang, C.J. et al.)46, HS and TLBO are more accurate and simpler. Also, in comparison with implicit enumeration, which gives Z values of 7 and 3 (Tabari, N.M. et al.)33, HS and TLBO are faster. Compared with HS, TLBO is faster still, as it converges in fewer fitness evaluations.
3.7.5 Conclusion
In the above section, 0-1 integer programming based on Teaching Learning Based Optimization (TLBO) for finding an optimal generation maintenance schedule is presented. The purpose of the objective function is to have the units maintained as early as possible under constraints such as spinning reserve, crew availability and maintenance duration. The comparison of the optimal schedule with other optimization methods indicates that TLBO performs better than the others.
3.8 Improvement of initial cluster centers for c-means algorithms
While clustering data using fuzzy c-means (FCM) and hard c-means (HCM), sensitivity to the initial cluster centers has captured the attention of the clustering community for quite a long time. In this study, Teaching Learning Based Optimization (TLBO) is used as a method to address this problem. The approach consists of two stages. In the first stage, TLBO explores the search space of the given dataset to find near-optimal cluster centers; the cluster centers found by TLBO are evaluated using a reformulated c-means objective function. In the second stage, the best cluster centers found are used as the initial cluster centers for the c-means algorithms. Experiments show that TLBO can reduce the difficulty of choosing an initialization for the c-means clustering algorithms. For evaluation, standard benchmark data and artificial data have been used.
3.8.1 Introduction
Clustering is a typical unsupervised learning technique for grouping similar data points according to some measure of similarity. The main goal of such a technique is to minimize inter-cluster similarity and maximize intra-cluster similarity (Jain, A.K. et al.)47. One of the most popular clustering algorithms is the c-means algorithm, with its two types: fuzzy c-means (FCM) and hard c-means (HCM). However, selecting the initial cluster centers in these clustering algorithms is considered one of the main challenges. Generally, these clustering algorithms seek to minimize the objective function, though unfortunately they are guaranteed only to yield a local minimum (Hathaway, R.J. et al.)48. Incorrect selection of the initial cluster centers will generally lead to an undesirable clustering result that is biased by these initial values. The main cause of the local optimum problem in these algorithms is that c-means algorithms actually work like hill-climbing algorithms (Kanade, P.M. et al.)49. Such local-search-based algorithms move in one direction without performing a wider scan of the search space; thus the same initial cluster centers in a dataset will always generate the same clustering results, whereas better results might be obtained if the algorithm were run with different initial cluster centers.
To address this problem, several population-based or local-search-based meta-heuristic algorithms have been proposed over the last few decades, including Simulated Annealing (Selim, S.Z. et al.)50, Tabu Search (Al-Sultan, K.S.)51, the Genetic Algorithm (Hall, L.O. et al.)52, Particle Swarm Optimization (Lili, L. et al.)53, the Ant Colony Algorithm (Kanade, P.M. et al.)49 and Differential Evolution (Maulik, U. et al.)54. The main advantages of these meta-heuristic algorithms are their ability to cope with local optima and to explore large solution spaces effectively by maintaining, recombining and comparing several solutions simultaneously (Paterlini, S. et al.)55. Teaching Learning Based Optimization (TLBO) (Rao, R.V. et al.)4 is a relatively new population-based meta-heuristic optimization algorithm that models the influence of a teacher on learners. The key advantage of TLBO is that the algorithm is parameter-free: it can exploit newly suggested solutions while exploring the search space without parameter tuning, whereas for other algorithms a change in the algorithm parameters changes their effectiveness.

In this section, a new variation of TLBO for solving the initial center selection problem for both HCM and FCM is introduced. The approach consists of two stages. In the first stage, TLBO explores the search space of the given dataset to find near-optimal cluster centers. In the second stage, the best cluster centers found are used as the initial cluster centers for the c-means algorithms to perform the clustering.
3.8.2 Hard C Mean and Fuzzy C Mean Clustering Algorithm
In this section, the Hard C Mean and Fuzzy C Mean algorithm have been described.
Hard C Mean clustering algorithm
In non-fuzzy or hard clustering, data are divided into crisp clusters, where each data point belongs to exactly one cluster. The main properties of HCM are:

- it is used to classify data in a crisp sense;
- each data point is assigned to exactly one cluster;
- the clusters are also known as partitions;
- U is a partition matrix with c rows and n columns;
- the cardinality gives the number of unique c-partitions of n data points.

In this clustering technique partial membership is not allowed: each data element can be a member of one and only one cluster at a time, so the membership grade of a specific data point is one in a specific cluster and zero in all the remaining clusters, and the sum of the membership grades of each data point over all clusters equals one. The number of clusters c must be greater than one and less than the number of data elements: if c equals one, all data elements lie in the same cluster, and if c equals the number of data elements, each data element lies in its own separate cluster, each cluster having only one data point in this special case. The steps of the HCM algorithm are given below.

1. Fix c ($2 \le c < n$) and initialize the partition matrix $U^{(0)} \in M_c$; then for $r = 0, 1, 2, \ldots$:
2. Calculate the center vectors $\{v_i^{(r)}\}$ with $U^{(r)}$.
3. Update $U^{(r)}$ by calculating the updated characteristic function (for all $i$, $k$):
$$\chi_{ik}^{(r+1)} = \begin{cases} 1, & d_{ik}^{(r)} = \min_{j \in c} d_{jk}^{(r)} \\ 0, & \text{otherwise} \end{cases}$$
4. If $\|U^{(r+1)} - U^{(r)}\| \le \delta$ (tolerance level), STOP; otherwise set $r = r + 1$ and return to step 2. In step 4 the notation $\|\cdot\|$ is any matrix norm, such as the Euclidean norm.
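A compact sketch of this loop is given below; the data layout (one sample per row) and the random-center initialization are assumptions made for illustration.

```python
import numpy as np

def hcm(X, c, tol=1e-6, max_iter=100, seed=0):
    """Hard c-means: crisp assignment (step 3), center update (step 2),
    and the convergence test of step 4."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]       # initial centers
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        d = np.linalg.norm(X[:, None, :] - V[None], axis=2)
        labels = d.argmin(axis=1)                          # crisp memberships
        V_new = np.array([X[labels == i].mean(axis=0) if (labels == i).any()
                          else V[i] for i in range(c)])
        if np.linalg.norm(V_new - V) <= tol:               # tolerance delta
            return V_new, labels
        V = V_new
    return V, labels
```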
Fuzzy C Mean clustering algorithm
Fuzzy c-means (FCM) is a data clustering technique (Yanp, M. S. et al.)56, (Almeida, R. J. et al.)57 in which a data set is grouped into c clusters, with every data point in the dataset belonging to every cluster to some degree: a data point close to the center of a cluster has a high degree of membership to that cluster, while a data point far away from the center of a cluster has a low degree of membership to it. The steps of the FCM algorithm are given below.

1. Fix c ($2 \le c < n$) and select a value for the fuzzifier parameter $m'$. Initialize the partition matrix $U^{(0)}$. Each step in this algorithm is labeled $r$, where $r = 0, 1, 2, \ldots$
2. Calculate the c center vectors $\{v_{ij}\}$ for each step:
$$v_{ij} = \frac{\sum_{k=1}^{n} u_{ik}^{m} x_{kj}}{\sum_{k=1}^{n} u_{ik}^{m}} \qquad (3.17)$$
3. Calculate the distance matrix $D_{[c,n]}$:
$$D_{ik} = \left[ \sum_{j=1}^{m} (x_{kj} - v_{ij})^2 \right]^{1/2} \qquad (3.18)$$
4. Update the partition matrix for the r-th step, $U^{(r)}$, as follows:
$$u_{ik}^{(r+1)} = \frac{1}{\sum_{j=1}^{c} \left( d_{ik}^{(r)} / d_{jk}^{(r)} \right)^{2/(m-1)}} \qquad (3.19)$$
If $\|U^{(r+1)} - U^{(r)}\| < \delta$ then STOP; otherwise return to step 2, iteratively updating the cluster centers and the membership grades of the data points. FCM iteratively moves the cluster centers to the "right" location within the dataset.
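A sketch of one FCM sweep implementing Eqs. (3.17)-(3.19) follows; U is the c-by-n partition matrix, X holds one sample per row, and the small eps guard against division by zero is an implementation detail assumed here.

```python
import numpy as np

def fcm_step(X, U, m=2.0, eps=1e-9):
    """One FCM update: centers (3.17), distances (3.18), memberships (3.19)."""
    W = U ** m                                               # u_ik^m
    V = (W @ X) / W.sum(axis=1, keepdims=True)               # Eq. (3.17)
    d = np.linalg.norm(X[None] - V[:, None], axis=2) + eps   # D[c, n], Eq. (3.18)
    ratio = (d[:, None, :] / d[None]) ** (2.0 / (m - 1.0))   # (d_ik / d_jk)^(2/(m-1))
    U_new = 1.0 / ratio.sum(axis=1)                          # Eq. (3.19)
    return V, U_new
```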
3.8.3 The Proposed Approach
The proposed approach consists of two stages. In the first stage, the TLBO algorithm explores the search space of the given dataset to find near-optimal cluster center values. In the second stage, the cluster centers with the best (i.e. minimum) objective function values are used by FCM/HCM as initial cluster centers and the final clustering is performed. A description of these two stages follows.
Stage 1: Finding Near-Optimal Cluster Centers Using TLBO
This stage uses a standard TLBO algorithm for clustering data into a given number of clusters.

TLBO algorithm for clustering

Using TLBO, data vectors can be clustered as follows:

1. Initialize each learner to contain $N_c$ randomly selected cluster centroids. Each learner $x_i$ is constructed as
$$x_i = (m_{i,1}, m_{i,2}, \ldots, m_{i,j}, \ldots, m_{i,N_c}) \qquad (3.20)$$
where $m_{i,j}$ refers to the j-th cluster centroid vector of the i-th learner, for cluster $C_{ij}$.
2. For $t = 1$ to $t_{max}$ do:
(a) for each learner $i$ do
(b) for each data vector $z_p$:
i. calculate the Euclidean distance $d(z_p, m_{ij})$ to all cluster centroids $C_{ij}$;
ii. assign $z_p$ to the cluster $C_{ij}$ such that $d(z_p, m_{ij}) = \min_{c = 1, \ldots, N_c} d(z_p, m_{ic})$;
iii. calculate the fitness of the learner, measured as the quantization error
$$J_e = \frac{1}{N_c} \sum_{j=1}^{N_c} \left[ \frac{\sum_{\forall z_p \in C_{ij}} d(z_p, m_j)}{|C_{ij}|} \right] \qquad (3.21)$$
where $|C_{ij}|$ is the number of data vectors belonging to cluster $C_{ij}$ (i.e. the frequency of that cluster) and $z_p$ is a data vector;
(c) update the learners;
(d) update the cluster centroids using the teacher phase and learner phase of the TLBO algorithm, where $t_{max}$ is the maximum number of iterations.

Stopping criterion: this process is repeated until the maximum number of iterations ($t_{max}$) is reached.
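A sketch of the learner fitness of Eq. (3.21) is shown below: each data vector is assigned to its nearest centroid and the mean intra-cluster distances are averaged over the clusters.

```python
import numpy as np

def quantization_error(centroids, X):
    """Eq. (3.21): average over clusters of the mean distance to the centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None], axis=2)
    labels = d.argmin(axis=1)                      # step 2(b)ii: nearest centroid
    per_cluster = [d[labels == j, j].mean()
                   for j in range(len(centroids)) if (labels == j).any()]
    return float(np.mean(per_cluster))
```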
Stage 2: C-Means Clustering
Once TLBO has met the stopping criterion, the solution vector with the best (i.e. minimum) objective function value in the group of learners is selected and used as the initial centers for FCM/HCM. The c-means algorithms then perform the data clustering in their usual iterative manner until their stopping criterion is met. The validity index value is calculated using the reformulated versions of the standard c-means objective functions proposed in (Hall, L.O. et al.)58, shown in (3.22) for HCM and (3.23) for FCM:

$$R_1 = \sum_{i=1}^{n} \min\{D_{1i}, D_{2i}, \ldots, D_{ci}\} \qquad (3.22)$$

$$R_m = \sum_{i=1}^{n} \left( \sum_{j=1}^{c} D_{ji}^{1/(1-m)} \right)^{1-m} \qquad (3.23)$$

where $D_{ji} = \|x_i - v_j\|$ is the Euclidean distance from data point $x_i$ to the j-th cluster center.
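Both reformulated criteria can be computed from the c-by-n distance matrix D alone, as the following sketch shows (the eps guard is an assumption for numerical safety):

```python
import numpy as np

def r1(D):
    """Eq. (3.22): HCM criterion, sum of each point's nearest-center distance."""
    return D.min(axis=0).sum()

def rm(D, m=2.0, eps=1e-12):
    """Eq. (3.23): FCM criterion with fuzzifier m."""
    return (((D + eps) ** (1.0 / (1.0 - m))).sum(axis=0) ** (1.0 - m)).sum()
```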
3.8.4 Experimental Results and Discussion
This section compares the results obtained with TLBO initialization against those obtained with random initialization on clustering of several data sets. In experiment 1, results from TLBO initialization are marked (TLBO/FCM) and results from random initialization are marked (RAN/FCM); similarly, in experiment 2, results are marked (TLBO/HCM) and (RAN/HCM).

For all the results reported, averages over 30 simulations are given. In the first stage the TLBO algorithm runs for 1000 function evaluations using 10 learners, and in the second stage the FCM/HCM algorithm runs for a maximum of 100 iterations.
A. Experimental Setup
The TLBO clustering algorithm uses 10 learners and is run with the number of clusters K given for each dataset below.
B. Datasets used
The real-life and synthetic data sets used in this section are described in APPENDIX II.
C. Population Initialization
For the TLBO clustering algorithm, the cluster centroids are randomly initialized between $X_{min}$ and $X_{max}$, which denote the minimum and maximum numerical values of any feature of the data set under test, respectively.
D. Simulation Strategy
The quality of the solution constructed is measured in terms of the objective function value and the number of iterations needed to reach an optimal solution. The experiments are designed to test the performance of TLBO in finding appropriate initial cluster centers for the c-means algorithms, compared with the standard random initialization technique used to choose cluster centers.

When comparing the performance of the algorithms, the main focus is the computational effort required to find the solution; the number of generations or iterations is used to compare the speed of the algorithms. Since the algorithms are stochastic in nature, the results of two successive runs usually do not match; hence, 30 independent runs (with different seeds of the random number generator) of each algorithm are taken. The results are stated in terms of mean values and standard deviations over the 30 runs in each case.

All the experiment codes are implemented in MATLAB. The experiments are conducted on a Pentium 4 desktop with 1 GB of memory in a Windows 7 environment.
E. Experimental Results
Experiment 1: FCM Experiment
The results from TLBO initialization are marked (TLBO/FCM), while the results from random initialization are marked (RAN/FCM). Table 3.3.1 summarizes these results, where the average results from 30 trials are recorded along with standard deviations.
Table 3.3.1 Results from TLBO/FCM and RAN/FCM

Dataset             TLBO/FCM (obj. value)  RAN/FCM (obj. value)  TLBO/FCM (# iter.)  RAN/FCM (# iter.)
Iris          Mean  61.7685                61.7751               7                   14
              std   0                      0.0073                0                   1.2867
Glass         Mean  72.6045                72.6219               39                  42
              std   0                      0.0253                1                   5.5560
W B C data    Mean  2.1756e+03             2.1758e+03            5                   9
              std   0                      0.0002e+03            0                   2
Wine          Mean  1.0538e+04             1.0575e+04            8                   15
              std   0                      0.0001e+04            0                   5.2500
Vowel         Mean  5.9963e+04             5.9989e+04            13                  28
              std   0                      0.0002e+04            0                   3.9225
H S Data Set  Mean  1.7190e+03             1.7199e+03            6                   10
              std   0                      0.0001e+03            0                   1.0127
PID Data set  Mean  3.5712e+04             3.5749e+04            8                   11.2096
              std   0                      0.0010e+04            0                   2.4519
Artificial_1  Mean  3.7603                 3.7603                13                  21
              std   0                      0.5263                0                   0.1532
Artificial_2  Mean  0.0728                 0.1076                36                  82
              std   0                      0.0125                0                   5.1299
From Table 3.3.1 it is clear that, except for the Artificial_2 data set, all runs converge to nearly the same extremum for each dataset. This happens for all initializations tried in these experiments, but the speed of reaching the extremum depends on the initial centers used, which is reflected in the number of iterations FCM needs to reach it. The Artificial_2 dataset converges to different extrema depending on the initial centers used, leading to different clustering results. Table 3.3.1 shows that TLBO/FCM gives results equal to or better than those of RAN/FCM for all datasets with single or multiple extrema. It is also noticeable that a large improvement in the objective function is obtained by TLBO/FCM over RAN/FCM when the datasets have multiple extrema.
Experiment 2: HCM Experiments
The results from TLBO initialization are marked (TLBO/HCM), while the results from random initialization are marked (RAN/HCM). Table 3.3.2 summarizes these results, where the average results from 30 trials are recorded along with standard deviations.
Table 3.3.2 Results from TLBO/HCM and RAN/HCM

Dataset             TLBO/HCM (obj. value)  RAN/HCM (obj. value)  TLBO/HCM (# iter.)  RAN/HCM (# iter.)
Iris          Mean  97.3259                97.4038               5                   10.2000
              std   0                      0.1384                0                   3.8239
Glass         Mean  211.7820               219.2513              5                   10
              std   0                      7.7919                0                   2.5000
W B C data    Mean  3.0395e+03             3.0395e+03            11                  11
              std   0                      0                     0                   0
Wine          Mean  1.6557e+04             1.8437e+04            14                  16
              std   0                      0                     0                   2
Vowel         Mean  1.5008e+05             1.5170e+05            7                   14
              std   0                      0.1251e+05            0                   1.0125
H S Data Set  Mean  2.6264e+03             2.6264e+03            7                   12
              std   0                      0                     0                   2
PID Data set  Mean  5.2072e+04             5.2072e+04            16                  20
              std   0                      0                     0                   0
Artificial_1  Mean  6.3341                 6.7858                4                   10
              std   0                      0.0002                0                   2.1195
Artificial_2  Mean  0.0949                 0.5976                28                  56
              std   0                      0.0029                0                   3.9127
From Table 3.3.2 it is clear that TLBO/HCM always gives results better than or equal to those of randomly initialized HCM. The improvement in the objective function obtained by TLBO/HCM over RAN/HCM is again noticeable when the datasets have multiple extrema. Table 3.3.2 also shows that the number of iterations required by TLBO/HCM to reach the near-optimal solution is less than or equal to that required by RAN/HCM.
3.8.5 Conclusion
In the above sections, the Teaching Learning Based Optimization algorithm for overcoming the cluster center initialization problem of the clustering algorithms (FCM/HCM) is discussed. This step is important in data clustering, since an incorrect initialization of the cluster centers leads to a faulty clustering process. The TLBO algorithm works both globally and locally in the search space to find appropriate cluster centers. The experimental evaluation shows that the algorithm can tackle this problem effectively.
3.9 Neural networks learning enhancement
Evolutionary computation is a collection of algorithms based on the evolution of a population toward a solution of a certain problem. These algorithms can be used successfully in many applications requiring optimization, and they have been widely used to optimize the learning mechanisms of classifiers, particularly the Artificial Neural Network (ANN) classifier. Major disadvantages of the ANN classifier are its slow convergence and its tendency to become trapped in local minima. To overcome these problems, TLBO (Teaching Learning Based Optimization) is used here to determine optimal values for the learning mechanism. In this study, TLBO is applied to a feed-forward neural network to enhance the learning process. Two further programs have been developed, Differential Evolution with Neural Network (DENN) and Particle Swarm Optimization with Neural Network (PSONN), to compare these methods against Teaching Learning Based Optimization with Neural Network (TLBONN) learning on various datasets. The results reveal that TLBONN gives quite promising results in terms of smaller errors compared with PSONN and DENN.
3.9.1 Introduction
The Teaching Learning Based Optimization (TLBO) algorithm is an evolutionary algorithm proposed by R.V. Rao (2011)4. It is a small and simple mathematical model of the large and naturally complex process of evolution, and is therefore easy to use and efficient. In this section, TLBO is applied to a feed-forward neural network to enhance the learning process and is compared with Differential Evolution with Neural Network (DENN) and Particle Swarm Optimization with Neural Network (PSONN) on various datasets.
3.9.2 ANN, PSO, DE, TLBO
The main topics of this section are the Artificial Neural Network (ANN), Particle Swarm Optimization (PSO), Differential Evolution (DE) and Teaching Learning Based Optimization (TLBO).

Artificial neural network

The Artificial Neural Network (ANN) is the most popular supervised learning technique. In an ANN, many elements must be considered, such as the numbers of input, hidden and output nodes, the learning rate, the momentum rate, the bias, the minimum error and the activation/transfer function. These elements affect the convergence of BP (back-propagation) learning, which consists of the following steps:

i. An input vector is presented at the input layer.
ii. A set of desired outputs is presented at the output layer.
iii. After a forward pass is done, the errors between the desired and actual outputs are compared.
iv. The comparison results are used to determine weight changes (backwards) according to the learning rules.
3.9.3 Neural Network Structure for DENN, PSONN and TLBONN
Here the weight modification of the neural network is determined by PSO, DE and TLBO, which are considered alternatives to BP methods because of their convenience. These algorithms are applied with a feed-forward neural network. For all algorithms, a 3-layer ANN is used for classification in this study on all datasets. The network architecture consists of an input layer, a hidden layer and an output layer. The total number of nodes in each layer depends on the classification problem: the numbers of input and output nodes usually follow from the numbers of attributes and class attributes. However, there is no appropriate standard rule or theory to determine the optimal number of hidden nodes (Kim, G.H. et al.)59. There are many suggestions by researchers for determining a suitable number of hidden nodes; some suggested techniques are summarized as follows:

1. The number of hidden nodes should be in the range between the size of the input layer and the size of the output layer.
2. The number of hidden nodes should be 2/3 of the input layer size, plus the size of the output layer.
3. The number of hidden nodes should be less than twice the input layer size.
4. Pyramidal shape topology (Mariyam, S. et al.)60.
5. One hidden layer with 2N+1 hidden neurons is sufficient for N inputs (Mariyam, S. et al.)60.
6. The number of hidden nodes is selected either arbitrarily or based on trial and error (Charytoniuk, W. et al.)61.
7. The number of hidden nodes should be n·m, where m is the number of input nodes and n is the number of output nodes (Charytoniuk, W. et al.)61.

In this study, points 5 and 7 have been used to determine the number of hidden nodes. The dimension of a vector (in the case of DE), particle (in the case of PSO) or learner (in the case of TLBO) can be calculated using the formula

Dimension = (input + 1) × hidden + (hidden + 1) × output    (3.24)
An activation function is used to calculate the output of each neuron except the input neurons. A sinusoidal function is used instead of the sigmoid activation function because it leads to what has been termed a "generalized Fourier analysis". The sine function takes the trigonometric sine of the input, and the cosine function takes the trigonometric cosine of the input. Considering a back-propagation network with just one output, the learning procedure can be thought of as synthesizing a continuous function y = g(x) by showing the network a discrete set of (x, y) pairs. The network configures itself to output a correct (desired) value for each example input. When a previously unseen input pattern is presented, the network in effect performs a non-linear interpolation and produces an output which is a reasonable function value. When a sine or cosine function is used instead of a sigmoid, the learning procedure seems to perform a mode decomposition, discovering the most important frequency components of the function described by the discrete set of input-output examples. This activation function is expressed as

f(net) = h(net)    (3.25)

where h is the sine or cosine function and net is the net input of a neuron.
A particle is a complete set of weights; the architecture of the FNN is constant for each particle. A particle's fitness is calculated in the following way. An artificial neural network is set up using the particle's weights in the FNN architecture. For each sample of the supervised training data, the input is propagated forward through the neural network to obtain the output, and the error between this output and the desired output is calculated. The squared error values of all training samples are accumulated and used as the particle's fitness: the higher the error, the lower the fitness of the particle. A forward propagation through the network is a computationally expensive task, so the aim is to find the best possible solution using a limited number of forward propagations through the network.

For all problems the initial weights are randomly assigned within the range [0, 1]. The training accuracy of each FNN is measured in terms of the root mean squared error (RMSE) according to the following equation:

$$RMSE = \sqrt{\frac{\sum_{i=1}^{p} \sum_{j=1}^{m} (T_{i,j} - F_{i,j})^2}{p \cdot m}} \qquad (3.26)$$

where p denotes the number of training patterns, m the number of FNN outputs, $T_{i,j}$ the target (desired) value and $F_{i,j}$ the actual value of the output; the index for patterns is i and that for outputs is j. For all problems the neural network has one input layer, one hidden layer and one output layer.
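The following sketch shows how a learner/particle, stored as a flat weight vector of the dimension given by Eq. (3.24), can be decoded into the two weight matrices and scored with the RMSE of Eq. (3.26) using the sine activation of Eq. (3.25). The layer sizes and data shapes are illustrative assumptions.

```python
import numpy as np

def decode(w, n_in, n_hid, n_out):
    """Split a flat vector of length (n_in+1)*n_hid + (n_hid+1)*n_out, Eq. (3.24)."""
    k = (n_in + 1) * n_hid
    W1 = w[:k].reshape(n_hid, n_in + 1)            # hidden weights (+ bias column)
    W2 = w[k:].reshape(n_out, n_hid + 1)           # output weights (+ bias column)
    return W1, W2

def rmse_fitness(w, X, T, n_in, n_hid, n_out):
    """Forward pass with f(net) = sin(net), Eq. (3.25), scored by Eq. (3.26)."""
    W1, W2 = decode(w, n_in, n_hid, n_out)
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append bias input
    H = np.sin(Xb @ W1.T)                          # hidden activations
    Hb = np.hstack([H, np.ones((len(H), 1))])
    F = np.sin(Hb @ W2.T)                          # network outputs
    return np.sqrt(((T - F) ** 2).sum() / T.size)  # RMSE over p*m entries
```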
3.9.4 Experiment and Result
A. Experimental Setup
In all the experiments in this section, the values of the common parameters used in each algorithm, such as population size and total number of evaluations, were chosen to be the same. The population size was 50 and the maximum number of iterations was 100 for all functions. The other algorithm-specific parameters are given below.

PSO Settings

The cognitive and social components $c_1$, $c_2$ are constants that can be used to change the weighting between personal and population experience, respectively. In this experiment the cognitive and social components were both set to 2. The inertia weight, which determines how the previous velocity of the particle influences the velocity in the next iteration, was 0.5 (Kennedy, J.)62.

DE Settings

In DE, F is a real constant which affects the differential variation between two solutions; in these experiments it is set to F = 0.5·(1 + rand(0, 1)), where rand(0, 1) is a uniformly distributed random number within the range [0, 1]. The crossover rate, which controls the change of diversity in the population, was chosen as $R = (R_{max} - R_{min}) \cdot (MAXIT - iter)/MAXIT$, where $R_{max} = 1$ and $R_{min} = 0.5$ are the maximum and minimum values of the scale factor R, iter is the current iteration number and MAXIT is the maximum number of allowable iterations (Storn, R. et al.)17.
TLBO Settings
For TLBO there are no such constants to set.
B. Datasets used
The neural network is tested on five different datasets. These are described in detail in APPENDIX II.
C. Simulation Strategy
Here, when comparing the performance of the algorithms, training accuracy is measured in terms of the root mean squared error (RMSE) and convergence characteristics in terms of the number of fitness evaluations. Since the algorithms are stochastic in nature, the results of two successive runs usually do not match; hence, 30 independent runs (with different seeds of the random number generator) of each algorithm are taken. The results are stated in terms of mean values and standard deviations over the 30 runs in each case.

Finally, it is pointed out that all the experiment codes are implemented in MATLAB. The experiments are conducted on a Pentium 4 desktop with 1 GB of memory in a Windows XP environment.
D. Experimental Results
To judge the accuracy of PSONN, DENN and TLBONN, each of them is run for a long time on every benchmark data set, until the number of FEs exceeds 5000; the results are noted in Table 3.4.1 for the sine activation function and in Table 3.4.2 for the cosine activation function. The RMSE values at different FE counts are also recorded, in Tables 3.4.3 and 3.4.4 for the sine activation function and in Tables 3.4.5 and 3.4.6 for the cosine activation function.
Table 3.4.1 Results of the experiments for NN training (averaged over 30 runs) using the SINE function

Dataset        PSONN                 DENN                   TLBONN
               Mean       Std        Mean       Std         Mean       Std
2 bit parity   0.0053     0.0055     1.4923e-4  2.1642e-4   1.0063e-7  1.4024e-7
4 bit parity   0.0012     0.0017     1.4083e-4  2.3486e-4   1.0857e-7  1.4963e-7
iris data      8.3371e-5  2.1940e-6  1.9400e-4  1.5993e-4   8.6331e-6  8.5031e-6
lenses data    5.2319e-4  2.2008e-6  0.0013     0.0012      3.1168e-5  1.9736e-5
Survival data  2.1759e-5  1.5728e-5  1.2067e-5  2.9850e-5   5.4443e-8  6.5821e-8

Table 3.4.2 Results of the experiments for NN training (averaged over 30 runs) using the COSINE function

Dataset        PSONN                 DENN                   TLBONN
               Mean       Std        Mean       Std         Mean       Std
2 bit parity   0.0104     0.0110     1.6254e-4  3.2051e-4   2.0925e-7  3.9495e-7
4 bit parity   8.8731e-4  0.0012     6.4093e-5  8.4232e-5   1.2081e-7  1.3946e-7
iris data      6.6081e-4  5.0269e-4  1.3935e-5  1.0141e-5   1.2506e-6  1.0034e-6
lenses data    0.0044     0.0032     0.0010     0.0010      3.3741e-5  2.9101e-5
Survival data  5.8250e-5  5.0471e-5  7.5927e-6  9.8357e-6   3.5133e-8  4.4424e-8
Table 3.4.3 RMSE value at the corresponding FEs using the SINE activation function

No. of  2-bit parity dataset           4-bit parity dataset           Iris dataset
FEs     PSO     DE         TLBO        PSO     DE         TLBO        PSO        DE         TLBO
1       0.0021  0.0016     1.9985e-4   0.0047  0.0013     2.7405e-5   5.4161e-4  4.3411e-4  5.1125e-4
250     0.0021  0.0016     1.3000e-4   0.0047  5.3945e-4  1.3521e-5   5.4161e-4  4.3411e-4  3.5112e-4
500     0.0021  0.0016     2.7668e-5   0.0047  5.3945e-4  3.8312e-6   5.4161e-4  3.8558e-4  1.1126e-4
1000    0.0021  0.0016     2.7668e-5   0.0047  3.6125e-4  2.6865e-7   5.4161e-4  3.8558e-4  7.6619e-5
1500    0.0021  1.0232e-4  5.9518e-6   0.0047  2.2348e-4  2.6865e-7   5.4161e-4  2.5650e-4  5.1103e-5
2000    0.0021  4.7050e-5  3.3312e-6   0.0047  2.2348e-4  2.6865e-7   5.4161e-4  2.5650e-4  5.1103e-5
2500    0.0021  4.7050e-5  2.0012e-7   0.0047  2.2348e-4  1.4360e-7   5.4161e-4  2.5650e-4  5.1123e-6
3000    0.0021  4.7050e-5  1.2312e-7   0.0047  2.2348e-4  1.4360e-7   5.4161e-4  2.5650e-4  5.1123e-6
3500    0.0021  4.7050e-5  1.2312e-7   0.0047  2.2348e-4  1.4360e-7   5.4161e-4  2.5650e-4  2.1126e-6
4000    0.0021  4.7050e-5  1.0001e-7   0.0047  2.2348e-4  1.0092e-7   5.4161e-4  2.5650e-4  2.1126e-6
4500    0.0021  4.7050e-5  1.0001e-7   0.0047  2.2348e-4  1.0092e-7   5.4161e-4  2.5650e-4  1.0019e-6
5000    0.0021  4.7050e-5  1.0001e-7   0.0047  2.2348e-4  1.0092e-7   5.4161e-4  2.5650e-4  1.0019e-6

Table 3.4.4 RMSE value at the corresponding FEs using the SINE activation function

No. of  Haberman's survival               Lenses
FEs     PSO        DE         TLBO        PSO        DE         TLBO
1       3.3374e-5  1.2763e-6  2.0612e-4   5.2797e-4  5.3316e-4  0.0017
250     3.3374e-5  3.0981e-5  1.6625e-4   5.2414e-4  5.3075e-4  8.6446e-4
500     3.3374e-5  3.0981e-5  5.6624e-5   5.2414e-4  5.2506e-4  5.6979e-5
1000    3.3374e-5  9.7380e-6  2.3312e-5   5.2414e-4  5.2506e-4  5.6979e-5
1500    3.3374e-5  9.7380e-6  2.1991e-7   5.2363e-4  5.2364e-4  2.1630e-5
2000    3.3374e-5  9.7380e-6  5.1126e-8   5.2357e-5  5.2358e-4  2.1630e-5
2500    3.3374e-5  9.7380e-6  1.5107e-8   5.2357e-5  5.2357e-4  1.6651e-5
3000    3.3374e-5  9.7380e-6  1.5907e-9   5.2357e-5  5.2357e-4  6.5612e-6
3500    3.3374e-5  9.7380e-6  1.5907e-9   5.2357e-5  5.2357e-4  2.1162e-6
4000    3.3374e-5  9.7380e-6  1.5907e-9   5.2357e-5  5.2357e-4  2.1162e-6
4500    3.3374e-5  9.7380e-6  1.5907e-9   5.2357e-5  5.2357e-4  2.1162e-6
5000    3.3374e-5  9.7380e-6  1.5907e-9   5.2357e-5  5.2357e-4  2.1162e-6

Table 3.4.5 RMSE value at the corresponding FEs using the COSINE activation function

No. of  2-bit parity dataset            4-bit parity dataset            Iris dataset
FEs     PSO     DE         TLBO         PSO        DE         TLBO      PSO        DE         TLBO
1       0.0038  5.3822e-4  2.3257e-5    1.4520e-4  2.3121e-5  1.9195e-4 4.4795e-4  1.1165e-4  2.1124e-4
250     0.0038  5.3822e-4  2.3257e-5    1.4520e-4  2.3121e-5  5.1588e-5 4.4795e-4  1.1165e-4  2.1124e-4
500     0.0038  5.3822e-4  2.3257e-5    1.4520e-4  2.3121e-5  7.0456e-6 4.4795e-4  1.1165e-4  2.1124e-4
1000    0.0038  1.9971e-4  1.7198e-5    1.4520e-4  2.3121e-5  4.1669e-6 4.4795e-4  3.4132e-5  1.0560e-5
1500    0.0038  6.6203e-5  8.4330e-6    1.4520e-4  2.3121e-5  4.1669e-6 4.4795e-4  3.4132e-5  1.0560e-5
2000    0.0038  6.6203e-5  3.1771e-6    1.4520e-4  2.3121e-5  4.1669e-6 4.4795e-4  3.4132e-5  1.0560e-5
2500    0.0038  6.6203e-5  1.0184e-7    1.4520e-4  2.3121e-5  1.2204e-7 4.4795e-4  3.4132e-5  7.8669e-6
3000    0.0038  6.6203e-5  1.0441e-8    1.4520e-4  2.3121e-5  1.2204e-7 4.4795e-4  3.4132e-5  1.8769e-6
3500    0.0038  6.6203e-5  6.4876e-9    1.4520e-4  2.3121e-5  1.2204e-7 4.4795e-4  3.4132e-5  1.2506e-6
4000    0.0038  6.6203e-5  6.4876e-9    1.4520e-4  2.3121e-5  6.9832e-8 4.4795e-4  3.4132e-5  1.2506e-6
4500    0.0038  6.6203e-5  6.4876e-9    1.4520e-4  2.3121e-5  6.9832e-8 4.4795e-4  3.4132e-5  1.2506e-6
5000    0.0038  6.6203e-5  6.4876e-9    1.4520e-4  2.3121e-5  6.9832e-8 4.4795e-4  3.4132e-5  1.2506e-6
Table 3.4.6 RMSE value at the corresponding FEs using the COSINE activation function

No. of  Haberman's survival               Lenses
FEs     PSO        DE         TLBO        PSO     DE         TLBO
1       2.6675e-5  8.2352e-5  8.0296e-6   0.0088  0.0021     2.8358e-4
250     2.6675e-5  1.7840e-5  3.6534e-6   0.0088  0.0021     1.3000e-4
500     2.6675e-5  1.5680e-5  3.6534e-6   0.0088  0.0021     8.7104e-5
1000    2.6675e-5  1.5680e-5  4.1600e-7   0.0088  0.0021     8.7104e-5
1500    2.6675e-5  1.5680e-5  2.6901e-7   0.0088  2.8519e-4  8.7104e-5
2000    2.6675e-5  8.1262e-5  9.0791e-8   0.0088  0.0014     8.7104e-5
2500    2.6675e-5  8.1262e-5  7.4727e-8   0.0088  0.0014     8.7104e-5
3000    2.6675e-5  8.1262e-5  7.4727e-8   0.0088  0.0014     1.4521e-5
3500    2.6675e-5  8.1262e-5  7.4724e-8   0.0088  0.0014     1.4521e-5
4000    2.6675e-5  8.1262e-5  7.4724e-8   0.0088  0.0014     1.4521e-5
4500    2.6675e-5  8.1262e-5  7.4724e-8   0.0088  0.0014     1.4521e-5
5000    2.6675e-5  8.1262e-5  7.4724e-8   0.0088  0.0014     1.4521e-5
From Tables 3.4.1 and 3.4.2 it can be seen that, on all datasets, the RMSE error of TLBONN is less than the RMSE errors of both PSONN and DENN. From Tables 3.4.3, 3.4.4, 3.4.5 and 3.4.6 it is clear that, although the convergence rate of TLBONN is slower than those of PSONN and DENN, TLBONN gives a smaller error than PSONN and DENN on all datasets. It is also observed that, at the number of fitness evaluations at which the other algorithms converge, TLBONN already gives a better result in almost all cases.

Due to limited space, only two figures are shown to illustrate the convergence behavior of all the algorithms: Fig. 3.4.1 for the iris dataset with the cosine activation function and Fig. 3.4.2 for the iris dataset with the sine activation function.

Fig. 3.4.1 Convergence behavior of the algorithms using the cosine activation function for the IRIS dataset
Fig. 3.4.2 Convergence behavior of the algorithms using the sine activation function for the IRIS dataset
3.9.5 Conclusion
In the above section, the application of Teaching Learning Based Optimization (TLBO) to enhance learning in neural networks is investigated. The TLBONN approach is compared against PSONN and DENN. From the simulation results it is observed that TLBONN has better training accuracy than the other two algorithms.
3.10 Overall Conclusion
This chapter has discussed the application of the TLBO algorithm to problems from different research areas: computer networks, by solving the QoS multicast routing problem; power systems, by solving the 0-1 integer programming problem for generation maintenance scheduling; data clustering in data mining, by improving the initial cluster centers for c-means algorithms on a variety of datasets; and neural networks, by enhancing the learning mechanism. In all these problems the TLBO algorithm showed better performance than the other studied algorithms, at lower computational cost.