A neural network approach to identify forest stands susceptible to wind damage
Marc Hanewinkel a,*, Wenchao Zhou b, Christian Schill a
a Institute of Forestry Economics (IFE), University of Freiburg, Tennenbacherstr. 4, D-79106 Freiburg, Germany
b Institute of Forest Economics, Swedish Agricultural University, 90183 Umea, Sweden
Received 23 September 2002; received in revised form 18 November 2003; accepted 20 February 2004
Abstract
The artificial neural network technique to model wind damage to forests was examined. The network used in the investigation
was a three-layered feed-forward neural network with a backpropagation training-algorithm using a momentum term and flat
spot elimination. To yield insights into the performance of the network, a logistic regression model was fitted as a baseline. Two
different types of models were set up and analyzed for both approaches: a dichotomous model that predicted the categories
‘‘damaged’’ versus ‘‘undamaged’’ for two different damage thresholds, and a multinomial model that predicted the damage in
four damage classes. The performance of the network and the logistic regression model was measured using the mean squared
sensitivity error. The results of the dichotomous model demonstrate that a feed-forward network is able to better classify forests
susceptible to wind damage than a logistic regression model, especially when the frequency of the undamaged and damaged
forest stands differs significantly. This study also shows that the network has a higher capacity to identify damaged forest stands,
compared to the logistic regression model applied in this investigation. With the specific dataset used in the present study, the
proportion of damaged forest stands predicted by the network was between the observed proportion and the proportion predicted
by the logistic regression model. The results of the multinomial models showed that both the statistical model and the neural
network were unable to classify all four damage classes but showed a dichotomous behavior in predicting the damage only in the
two extreme damage classes. Possibilities to optimize the network performance by using different training algorithms or
topologies and principal differences between the two models referring to their specific properties are discussed.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Logistic regression model; Backpropagation; Dichotomous model; Multinomial model; Risk management
1. Introduction
Climatic hazards that damage forests, mainly storms
and snow breakage, have reached a level that
constantly threatens regular forest management.
The storm of February 1990 caused more than
100 million m³ of damage to the forests of Europe.
After the catastrophic gale of 26 December 1999 in
France and Germany, in which more than 30 million m³
of timber were blown down in Southwest Germany
(Baden-Wurttemberg) alone, the necessity of proper
risk management is obvious.
Besides these catastrophic events, we have to consider
a constantly high level of so-called ‘‘incidental
exploitations’’. An investigation of Hanewinkel (2001)
Forest Ecology and Management 196 (2004) 227–243
* Corresponding author. Tel.: +49-761-203-3686;
fax: +49-761-203-3690.
E-mail address: [email protected]
(M. Hanewinkel).
0378-1127/$ – see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.foreco.2004.02.056
showed that the percentage of exploitations that were
not the result of planned silvicultural interventions
but were due to the impact mainly of storm and
snow reached 44% for the period of 1980–1994 in
the state-forest of the northern Black Forest.
Several distinctly different ways for risk assessment
have been developed. The first one is sometimes only
partly a scientific approach. Based on an extensive
literature review or even only on local experience, an
expert system assigns forest stands and/or site units to
risk classes using more or less simple expert rules. An
example for such an expert system is a scheme to
estimate the risk for storm damage in forest stands in
South Germany by Rottmann (1986, p. 96), or a
similar approach for snow damage by the same author
(Rottmann, 1985, p. 111). Mitchell (1998) has devel-
oped a diagnostic framework for windthrow risk esti-
mation in Canada that is very similar to an expert
system.
In addition to expert systems mechanistic models or
empirical mechanistic models such as HWIND (Pel-
tola et al., 1999) or GALES (Gardiner and Quine,
2000) have been developed as generic tools for risk
assessment and tested against each other (Gardiner
et al., 2000). Both of these tools require a very high
quality of input data and are primarily meant to be
used to evaluate the risk linked to a particular regime
of management. Component models integrate the risk
assessment on different levels, from single trees to
stands and whole regions (Talkkari et al., 2000).
Meteorological components of the models such as
windspeed or airflow modeling (Konig, 1995; Lekes
and Dandul, 2000) seek to improve the performance of
the risk assessment. An overview of different model
approaches for risk assessment that is accessible via
the World Wide Web is given by Miller et al. (2000).
However, the most common way to assess risk on a
scientific base is still the use of statistical models.
These models use data of historical damage occur-
rences to predict future risk events or to classify forests
according to their vulnerability towards risks. A clas-
sic deterministic approach is thereby to derive transi-
tion probabilities for age classes of stand types on
defined site units. The theory behind these models has
mainly been developed by Suzuki (1971) based on
Markov-chains. This approach has been widely
applied in eastern Germany for forests dominated
by Norway spruce (Kurth et al., 1987). The standard
tool to predict risk for forests or forest stands is usually
a variant of a regression model. Thereby, the logistic
regression model has been utilized as the most com-
mon statistical approach to examine wind damage to
forests (Hinrichs, 1994; Konig, 1995; Fridman and
Valinger, 1998; Valinger and Fridman, 1997, 1999;
Jalkanen and Mattila, 2000; Mitchell et al., 2001).
This technique was mainly successful when it was
applied for numerically analyzing influential factors
causing wind damages. The different factors that were
analyzed in the different studies vary widely. Hinrichs
(1994) uses the same variables to model wind damage
as in the present study (see Section 3). Konig (1995)
basically adds wind speed as an explaining variable.
Fridman and Valinger (1998) use stem volume, dbh, h/
dbh, taper, mean diameter, mean height, N/ha, basal
area, volume index and site index as independent
variables. According to the basic logistic regression
model developed by Jalkanen and Mattila (2000), the
susceptibility of a stand to wind damage was increased
by large mean diameter, high stand age, seed-tree
cutting (felling), special cutting (tree felling for
ditches, roads or power lines, or sanitation cutting
after damage), and decreasing temperature sum. Key
variables in the models built by Mitchell et al. (2001)
included site quality, stocking boundary orientation,
time since harvest and topographic exposure. As a
classifier for wind damage to forests, however, the
logistic regression model did not always perform as
well as one might hope. Its ability to predict damages
to forest stands decreases, especially when the number
of undamaged and damaged stands in the sample
dataset to which the logistic regression model is fitted,
differs significantly. The study of Fridman and Valin-
ger (1998), for example, showed that with the specific
dataset used in that investigation, the predicted pro-
portion of damaged plots was highly over-estimated.
The low performance of the logistic regression models
necessitates efforts to find new approaches to classify
wind damage to forests.
2. Goal of the investigation
The goal of the present study was to predict the
probability and intensity of wind damage for a given stand
and its vulnerability towards this kind of damage using
rather simple and easy to assess inventory and booking
data. In order to improve the weak performance of the
commonly used logistic regression models that is
reported in the literature in these cases, a new meth-
odology for damage prediction was tested.
In this study, the possibility of using an artificial
neural network as an alternative approach to identify
forest stands susceptible to wind damage was there-
fore investigated. A second goal of the investigation
was to compare the performance of the classical
statistical approach with that of the neural network.
3. Materials and methods
3.1. Neural networks—a brief introduction
An artificial neural network (ANN) is a technology
in the field of artificial intelligence that is especially
designed to deal with complex and ill defined problems,
for example, pattern recognition (Patterson, 1996).
ANNs are able to learn from incomplete, disturbed
and ‘‘noisy’’ datasets. Therefore, they should be espe-
cially suited to deal with data concerning risk like in the
present investigation. Applications of ANNs in forestry
mainly deal with mortality estimation (Guan and Gert-
ner, 1991a,b, 1995), uncertainty assessment of forest
growth models (Guan et al., 1997) or multi resource
forest land use planning (Nogami, 1991).
The first step to start a risk analysis using artificial
neural networks is to define the decisive input vari-
ables that will be used for the input layers of the
network. In our case this input is defined by the
parameters of the inventory (stand description,
Fig. 1). In a next step the topology (the architecture)
of the neural network to be used has to be defined. As
there is no general rule or recipe how to design the
network (Nauck et al., 1994) this is a very complex
trial and error process.
The different nodes (perceptrons) of the neural
network are connected to each other with weights.
Each perceptron evaluates the sum of the weighted
inputs by a special activation function and ‘‘fires’’ this
result to each perceptron to which its output is con-
nected. In the present investigation the processing
units in the networks used a logistic activation func-
tion of the form:
\[ f(x) = \left[ 1 + \exp(-x) \right]^{-1} \qquad (1) \]
where x is the activation of the processing unit exclud-
ing all the units in the input layer.
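As an illustration of Eq. (1), the following minimal Python sketch evaluates the logistic activation for a single processing unit. The function names and the optional bias argument are assumptions made for this example; the text does not state whether bias terms were used.

```python
import math


def logistic(x: float) -> float:
    """Logistic activation of Eq. (1): f(x) = 1 / (1 + exp(-x))."""
    # Branch on the sign of x so that exp() never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def unit_output(inputs, weights, bias=0.0):
    """Output of one processing unit: logistic of the weighted sum of its inputs.

    The bias term is an assumption of this sketch, not taken from the paper.
    """
    activation = sum(w * v for w, v in zip(weights, inputs)) + bias
    return logistic(activation)
```

Because the output always lies strictly between 0 and 1, target values have to be mapped into that range before training, as described in the text that follows.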
The final output value that leaves the output layer of
the neural network is compared to a target value. The
difference between these two values which represents
the error of the network is minimized by backpropa-
gating this error through the net and adjusting the
weights between the different perceptrons. The target
values (the output for the different damage types) were
standardized using a linear standardization method to
force them into the range of the activation function.
The error function that calculates the mean quad-
ratic error for each input pattern has the following
form:
\[ E_p = \frac{1}{2} \sum_{k=1}^{m} (t_{pk} - z_{pk})^2 \qquad (2) \]
with: $1, \ldots, p$ the number of training patterns; $t_{pk}$ the
known output of the kth variable of training pattern $p$; $z_{pk}$
the calculated value for variable $k$ in training pattern $p$;
$k = 1, \ldots, m$ the number of variables describing one input
pattern.
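Eq. (2) can be checked on a small example. The sketch below is illustrative only: `pattern_error` is a hypothetical helper name, and the target and output values used with it are made up.

```python
def pattern_error(targets, outputs):
    """Error of one training pattern following Eq. (2).

    targets: known values t_pk for the pattern
    outputs: values z_pk calculated by the network for the same pattern
    """
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    return 0.5 * sum((t - z) ** 2 for t, z in zip(targets, outputs))


# Hypothetical pattern with two output variables.
example_error = pattern_error([1.0, 0.0], [0.5, 0.5])
```

For the two differences of 0.5 in this example, the error is 0.5 × (0.25 + 0.25) = 0.25.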
This type of ‘‘supervised learning’’ uses the back-
propagation algorithm and the generalized delta rule
(Rumelhart and McClelland, 1986) to adjust the
weights of the different connections between the
nodes. The magnitude of the weight adjustment is
determined by the learning coefficient $\eta$. The
convergence rate of the network can be improved by adding a
momentum term $\alpha$ to the gradient expression, that
is, by adding a fraction of the previous weight change to the
current weight change (formula (3); Patterson, 1996,
p. 187 f):
\[ \Delta w_j(t+1) = -\eta \frac{\partial E}{\partial w_j(t)} + \alpha \, \Delta w_j(t) \qquad (3) \]
where $\Delta w_j(t)$ is the previous weight change,
$\Delta w_j(t+1)$ the current weight change, $E$ the error,
$\eta$ the learning coefficient, and $\alpha$ the momentum term.
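The update rule of formula (3) can be sketched in a few lines of Python. This is a minimal illustration, not the SNNS implementation: the function name, the toy error function E(w) = 0.5 w² and the parameter values are assumptions chosen for the example.

```python
def momentum_step(weights, gradients, previous_deltas, eta, alpha):
    """Apply one weight update following formula (3).

    delta_w(t+1) = -eta * dE/dw(t) + alpha * delta_w(t)
    Returns the updated weights and the weight changes just applied.
    """
    deltas = [-eta * g + alpha * d for g, d in zip(gradients, previous_deltas)]
    new_weights = [w + d for w, d in zip(weights, deltas)]
    return new_weights, deltas


# Toy problem (hypothetical): minimise E(w) = 0.5 * w**2, whose gradient is w.
weights, deltas = [1.0], [0.0]
for _ in range(50):
    gradients = list(weights)  # dE/dw = w for this toy error function
    weights, deltas = momentum_step(weights, gradients, deltas, eta=0.1, alpha=0.5)
```

The second return value is fed back as `previous_deltas` in the next step, which is what carries a fraction of the previous weight change forward.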
To adjust the weights within the described learning
process it is necessary to confront the network with a
training set. The performance of the net must then be
evaluated using a dataset that has not been part of the
training set. The type of architecture for an artificial
neural network and the learning procedure briefly
described here is only one possibility to design and
train an artificial neural network (Nauck et al., 1994).
[Fig. 1. The relationships between the occurrence of damage and the variables. Panels: (a) age index, (b) height index, (c) tree species (other, spruce), (d) aspect, (e) site stability, (f) elevation (m), (g) slope, (h) Topex. Each panel shows the frequency of damaged stands and of all stands (undamaged + damaged) together with the percentage of damage.]

3.2. The topology of the neural network
The network used in this study was a general three-layered
feed-forward neural network trained with a
backpropagation algorithm with momentum term and
flat spot elimination called BackpropMomentum,
implemented in the neural network simulator SNNS
that was used in the present investigation (Zell et al.,
2000). Beside the usual learning parameter $\eta$ of the
standard backpropagation that specifies the step width
of the gradient descent, this enhanced backpropagation
learning algorithm uses a momentum term $\mu$,
which defines a relative amount of the old weight
change that is added to the current change of the
weights, and a flat spot elimination value $c$, a constant
value which is added to the derivative of the activation
function to enable the network to pass flat spots of the
error surface. To prevent over-training of the network,
an additional parameter, $d_{\max}$, defined as the maximum
difference $d_j = t_j - o_j$ between a teaching value
$t_j$ and an output $o_j$ of an output unit which was
tolerated and propagated back as $d_j = 0$, was also used
(Zell et al., 2000, 67ff).
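How the flat spot elimination value c and the tolerance d_max act on the error signal of a single output unit can be sketched as follows. This is one reading of the description above, not code taken from SNNS. It assumes a logistic activation, whose derivative is o(1 − o), and it assumes that a difference exactly equal to d_max still counts as tolerated.

```python
def output_error_signal(target, output, c=0.1, d_max=0.1):
    """Error signal of one output unit with flat spot elimination and tolerance.

    target: teaching value t_j
    output: unit output o_j (logistic activation assumed)
    c:      constant added to the derivative of the activation function
    d_max:  largest difference t_j - o_j that is treated as zero error
    """
    difference = target - output
    if abs(difference) <= d_max:  # inclusive comparison is an assumption
        difference = 0.0
    derivative = output * (1.0 - output)  # derivative of the logistic function
    return (derivative + c) * difference
```

For a saturated unit (output close to 0 or 1) the derivative is close to zero, so without the constant c almost no error would be propagated back even when the unit is wrong.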
Two different approaches to model damage to forest
stands due to storms were applied:
(i) A dichotomous model, which predicted damage
due to storm for forest stands as two categories
‘‘damaged’’ and ‘‘undamaged’’ according to a
predefined damage threshold. For this model the
network was composed of one input layer, one
hidden layer and one output layer. Dependent
upon the features of the variables in the dataset
used, the resulting size of the network was as
follows: 16 units (nodes) in the input layer
receiving inputs describing the stand state and
the site; 1 unit (node) in the output layer,
representing the damage due to storm of the
stand. For the hidden layer, five processing units
(nodes) were considered. Other training algo-
rithms and network topologies and different
settings of the above-mentioned parameters of
the network were examined in a sensitivity
analysis.
Thus, the output from the proposed network can
be explained as the estimated conditional prob-
ability of a forest stand being damaged given the
input vector, since the response from the output
unit is always between 0 and 1 (Hinton, 1989;
Guan and Gertner, 1991b). When the trained net-
work is applied to an unknown case, the output
unit receives activation, corresponding to each
vector of inputs. Based on the achieved activation
level the forest stand was classified as ‘‘unda-
maged’’ or ‘‘damaged’’. Damaged stands
exceeded the threshold of 0.5 for the activation
level. All other forests were classified as ‘‘unda-
maged’’.
(ii) A multinomial model that tried to predict damage
to forest stands due to storms in four different
damage classes. The same network type (three-
layered, feed-forward), learning algorithm and
activation function was applied in this model, but
the network topology was changed into a 16–5–4
structure with 16 nodes in the input, 5 in the
hidden and 4 units in the output layer. The four
output units encoded the four damage classes
using a 0/1-encoding scheme. Thus, damage
class 0 (very low damage) was represented as 0 0
0 1 in the output units, whilst damage class 3
(high damage) was encoded by 1 0 0 0, with the
damage classes 1 (0 0 1 0) and 2 (0 1 0 0) in
between. The output unit with the highest
activation level in the learned output was looked
upon as the ‘‘winning’’ unit that decided on the
resulting damage class (for example, see Table 8).
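The two topologies described above (16–5–1 and 16–5–4) and the decision rules can be sketched in plain Python. This is a structural illustration only: the weights are random rather than trained, bias terms are omitted, and all function names are assumptions of this example.

```python
import math
import random


def logistic(x):
    """Logistic activation, Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weight_rows):
    """One fully connected layer: each row holds the weights of one unit."""
    return [logistic(sum(w * v for w, v in zip(row, inputs))) for row in weight_rows]


def forward(inputs, hidden_weights, output_weights):
    """Three-layered feed-forward pass: input -> hidden -> output."""
    return layer(layer(inputs, hidden_weights), output_weights)


def classify_dichotomous(activation, threshold=0.5):
    """A stand is 'damaged' only if the activation exceeds the threshold."""
    return "damaged" if activation > threshold else "undamaged"


def decode_damage_class(outputs):
    """Winner-takes-all decoding of the 0/1 scheme.

    Class 0 is encoded as 0 0 0 1 and class 3 as 1 0 0 0, so the unit at
    position i (counted from the left, starting at 0) stands for class 3 - i.
    """
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    return len(outputs) - 1 - winner


def random_weights(n_units, n_inputs, rng):
    """Untrained placeholder weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_units)]


rng = random.Random(0)
stand = [0.5] * 16                      # hypothetical scaled input vector
hidden = random_weights(5, 16, rng)     # 16 inputs -> 5 hidden units
out_binary = random_weights(1, 5, rng)  # 5 hidden units -> 1 output unit
out_multi = random_weights(4, 5, rng)   # 5 hidden units -> 4 output units

label = classify_dichotomous(forward(stand, hidden, out_binary)[0])
damage_class = decode_damage_class(forward(stand, hidden, out_multi))
```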
3.3. The logistic regression model
In order to obtain insights into the performance of
the different neural network models the following two
different statistical models were also fitted to the
dataset using a logistic regression (formula (4)):
\[ \ln\left( \frac{p}{1 - p} \right) = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n \qquad (4) \]
where $p$ is the probability of an arbitrary stand being
damaged, $x_1, x_2, \ldots, x_n$ the independent variables, and
$b_1, b_2, \ldots, b_n$ the parameters.
(i) For the dichotomous model wind damage to a
forest stand was treated as a binary event,
‘‘undamaged’’ (encoded as 0) or ‘‘damaged’’
(encoded as 1).
With the estimated parameters from (4) by
using logistic regression, the probability of
damage (p) for a forest stand was calculated as
\[ p = \frac{\exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n)}{1 + \exp(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n)} \qquad (5) \]
The threshold between the undamaged and
damaged stand was set at 0.5. If the estimated
probability was higher than 0.5, then the stand
was classified as ‘‘damaged’’, or ‘‘undamaged’’
otherwise.
(ii) For the multinomial model a multinomial logistic
regression model (MLR) was fitted using the
identical damage classes as for the multinomial
neural network model. The MLR model can be
described as
\[ p_j = \frac{\exp(b_{0,j} + x_1 b_{1,j} + x_2 b_{2,j} + \cdots + x_n b_{n,j})}{1 + \sum_{j=0}^{2} \exp(b_{0,j} + x_1 b_{1,j} + x_2 b_{2,j} + \cdots + x_n b_{n,j})}, \quad j = 0, 1, 2 \qquad (6) \]
where $p_j$ is the probability of class $j$ at an arbitrary
stand (the last class 3 is assumed to be the reference
class); $x_i$ the observed value of independent variable
$i$, $i = 1, \ldots, n$; $b_{i,j}$ the parameter associated
with variable $i$ for class $j$.
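The probability formulas (5) and (6) can be evaluated directly once coefficients are available. The sketch below is illustrative: the function names are assumptions, and no coefficient values from the fitted models are used.

```python
import math


def damage_probability(b, x):
    """Probability of damage following Eq. (5).

    b: coefficients [b0, b1, ..., bn]; x: stand variables [x1, ..., xn].
    """
    linear = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
    # 1 / (1 + exp(-linear)) is algebraically equal to exp(linear) / (1 + exp(linear)).
    return 1.0 / (1.0 + math.exp(-linear))


def class_probabilities(coefficients, x):
    """Class probabilities following Eq. (6).

    coefficients: one list [b0j, b1j, ..., bnj] per non-reference class (0, 1, 2).
    The probability of the reference class 3 is appended as the last element.
    """
    scores = [
        math.exp(b[0] + sum(bi * xi for bi, xi in zip(b[1:], x)))
        for b in coefficients
    ]
    denominator = 1.0 + sum(scores)
    probabilities = [s / denominator for s in scores]
    probabilities.append(1.0 / denominator)  # reference class
    return probabilities


def classify(p, threshold=0.5):
    """Dichotomous rule: 1 ('damaged') if p is higher than the threshold, else 0."""
    return 1 if p > threshold else 0
```

With all coefficients set to zero, every class in `class_probabilities` receives the probability 0.25, which is a quick way to check that the four values sum to one.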
3.4. Performance measures
The performance of both the neural network and the
logistic regression model can be measured by the
sensitivity, which is defined as the proportion of events
labeled as a given class that are correctly identified
(Lawrence et al., 1998, 303). Table 1 shows a 2 × 2
classification table illustrated for our classification
problem.
From Table 1, the sensitivity for each class can be
calculated. Specifically, the sensitivity of the class
‘‘Undamaged Stand’’, S0, which measures the propor-
tion of the correctly classified undamaged stands, can
be calculated as
\[ S_0 = \frac{n_{00}}{n_{00} + n_{01}} \qquad (7) \]
The sensitivity of the class ‘‘Damaged Stand’’, S1,
which measures the proportion of the correctly clas-
sified damaged stands, is calculated as
\[ S_1 = \frac{n_{11}}{n_{10} + n_{11}} \qquad (8) \]
The overall sensitivity, denoted by SS, can be calcu-
lated as
\[ SS = \frac{n_{00} + n_{11}}{n_{00} + n_{01} + n_{10} + n_{11}} \qquad (9) \]
with $n_{00}$ being the number of correctly classified undamaged
stands, $n_{11}$ the number of correctly classified damaged stands,
$n_{01}$ the number of undamaged stands classified as damaged and
$n_{10}$ the number of damaged stands classified as undamaged.
In the present study we used the mean squared
sensitivity error (MSSE), a performance measure
introduced by Lawrence et al. (1998), to evaluate
the performance of the neural network and the logistic
regression model. The MSSE was calculated by
\[ \mathrm{MSSE} = \frac{1}{2} \left[ (1 - S_0)^2 + (1 - S_1)^2 \right] \qquad (10) \]
Since the sensitivities take values between 0 and 1, a
lower MSSE indicates a higher performance. Com-
pared to SS, MSSE as an overall performance measure
has the advantage of giving equal importance to each
class, instead of depending mostly on the most com-
mon class.
For the multinomial model the sensitivity and
MSSE were calculated analogously, but instead of
using only two classes (S0 and S1), four classes (S0, S1,
S2, S3) were included in the calculation.
By directly comparing the MSSE for the neural
network and the MSSE for the logistic regression
model, it was decided which of the two models
performed better for our classification problem. In
this paper, the sensitivity was expressed as a percen-
tage, while the MSSE was represented as a decimal
value.
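The performance measures of Eqs. (7)–(10) follow directly from the four cells of the classification table. The sketch below uses hypothetical counts, not results from this study. For more than two classes it averages over all classes, which is our reading of how the measure extends to the multinomial case.

```python
def sensitivities(n00, n01, n10, n11):
    """Return (S0, S1, SS) following Eqs. (7)-(9).

    n00: undamaged stands classified as undamaged
    n01: undamaged stands classified as damaged
    n10: damaged stands classified as undamaged
    n11: damaged stands classified as damaged
    """
    s0 = n00 / (n00 + n01)
    s1 = n11 / (n10 + n11)
    ss = (n00 + n11) / (n00 + n01 + n10 + n11)
    return s0, s1, ss


def msse(class_sensitivities):
    """Mean squared sensitivity error; Eq. (10) for two classes."""
    squared = [(1.0 - s) ** 2 for s in class_sensitivities]
    return sum(squared) / len(squared)


# Hypothetical classification table with an unbalanced class distribution.
s0, s1, ss = sensitivities(n00=90, n01=10, n10=30, n11=20)
error = msse([s0, s1])
```

In this made-up example the overall sensitivity SS is about 0.73, driven by the common undamaged class, while S1 is only 0.4. The MSSE of 0.185 reflects the weak result for the rarer class, which is the property the text attributes to the measure.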
3.5. Database of the investigation
The database of the present investigation was book-
ing records from the 4000 ha state forest unit ‘‘Beben-
hausen’’ in south-west Germany (Hinrichs, 1994). The
original dataset contained historical records of wind
damage to more than 2800 forest stands in the years
1967–1991. The complete investigated period was
divided into three sub-periods. Damage was expressed
as the average of the sub-period. In addition, regular
inventory data of the periodical forest management
that were assessed every 10 years (1967, 1977, 1987)
and a site classification characterized the forest stands
in terms of species composition, growing stock, age,
height and site unit. The input described the status of
the stand at the beginning of the sub-period for which
the damage was assessed. A GIS was used to deter-
mine the position of each stand and assign the site
units to the forest stands. Filtering out all stands
Table 1
Classification table for undamaged (0) and damaged stands (1)

Observed          Predicted
                  Undamaged (0)    Damaged (1)
Undamaged (0)     n00              n01
Damaged (1)       n10              n11

Rows are the observed values and columns are the predicted values.
younger than 30 years reduced this dataset. Stands
with no recorded growing stock were also removed.
As a result, a total of 1600 stand records were
obtained. The reduced dataset was then used to gen-
erate the training and test sets for the neural network.
A test set (TTS) was generated by randomly selecting
400 stands (25%) from the reduced dataset. The rest of
the 1200 stands were used as a training set (TGS). Five
pairs of training- and test sets were obtained by
repeating this procedure. In the following they will
be referred to as: TGS-1/TTS-1; TGS-2/TTS-2; TGS-
3/TTS-3; TGS-4/TTS-4; TGS-5/TTS-5, respectively.
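The generation of the five training and test pairs can be sketched as follows. The seed, the function name and the use of integer stand identifiers are assumptions of this example. It also assumes that each pair is drawn independently from the full reduced dataset, so test sets of different pairs may overlap.

```python
import random


def make_splits(stand_ids, n_pairs=5, test_fraction=0.25, seed=0):
    """Return a list of (training_set, test_set) pairs of stand identifiers."""
    rng = random.Random(seed)
    n_test = round(len(stand_ids) * test_fraction)
    pairs = []
    for _ in range(n_pairs):
        test = set(rng.sample(stand_ids, n_test))            # random test stands
        train = [s for s in stand_ids if s not in test]      # all remaining stands
        pairs.append((train, sorted(test)))
    return pairs


# 1600 hypothetical stand identifiers -> five pairs of 1200 / 400 stands.
pairs = make_splits(list(range(1600)))
```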
Defining whether a forest stand is damaged or not is
always somewhat arbitrary. In this study, the degree of
damage (damage rate) was calculated as a percentage
by dividing the recorded damage due to wind by the
standing volume of the forest stand. For the dichot-
omous model a predefined value of damage rate was
set as the cut line between ‘‘undamaged’’ and
‘‘damaged’’. Specifically, a forest stand was classified
as ‘‘undamaged’’ (encoded as 0) if the observed
damage rate was below 2%, otherwise as ‘‘damaged’’
(encoded as 1). Different values of the cut line will
result in different distributions of the frequency of the
undamaged and damaged stands; consequently, the
performance of the model may change more or less. In
order to determine the effectiveness of the cut value, a
damage rate of 5% was also used. For the multinomial
model the four damage classes were fixed at 0–2 (class
0), 2–5 (class 1), 5–10 (class 2) and >10% (class 3).
This classification follows a scheme proposed by
Hinrichs (1994), where a 2%-damage is considered
to be a low damage that already leads to higher costs
for salvage cuttings. A damage of 10% is considered to
be a ‘‘high damage’’ that severely influences the
stability of a stand and leads to larger gaps or open
areas. A 5%-damage is already visible in aerial photo-
graphs (Hinrichs, 1994, 51).
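The damage rate and the resulting labels can be written down as a short sketch. The handling of values that fall exactly on a class boundary is only partly specified in the text: "below 2%" is undamaged and class 3 is ">10%", while assigning exactly 5% to class 2 is an assumption of this example.

```python
def damage_rate(damaged_volume, standing_volume):
    """Damage rate in percent: recorded wind damage divided by standing volume."""
    return 100.0 * damaged_volume / standing_volume


def dichotomous_label(rate, cut=2.0):
    """0 ('undamaged') if the rate is below the cut value, otherwise 1 ('damaged')."""
    return 0 if rate < cut else 1


def damage_class(rate):
    """Damage class 0-3 for the multinomial model."""
    if rate < 2.0:
        return 0
    if rate < 5.0:   # boundary handling at 5% is an assumption
        return 1
    if rate <= 10.0:
        return 2
    return 3
```

Calling `dichotomous_label(rate, cut=5.0)` gives the labels for the second cut value examined in the study.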
For each of the generated datasets, the frequency of
the undamaged and damaged stands for the dichot-
omous model is depicted in Table 2. It is obvious that
the frequency of the undamaged and damaged stands
was significantly unbalanced. The proportion of
damaged stands amounted to around 30% in all the
sets when 2% of damage rate was used as the cut value,
and to around 20% when 5% was used. Table 3 shows
how the percentage of the damaged stands in the test
sets is reduced when four different damage classes are
formed in the multinomial model. Risk class 2 (5–
10%) reaches a level of less than 9% of damaged
stands in all the test sets, while classes 1 and 3 are
between 9 and 13%. For the training sets the shares of
the different risk classes were within the same range as
for the test sets.
Table 2
Frequency distribution of undamaged and damaged stands in training sets (TGS) and test sets (TTS) for the dichotomous model

         Damage rate 2% as cut value                     Damage rate 5% as cut value
         Undamaged (0)   Damaged (1)   Percentage (a)    Undamaged (0)   Damaged (1)   Percentage
TGS-1    858             342           28.5              972             228           19.0
TTS-1    274             126           31.5              323             77            19.2
TGS-2    854             346           28.8              969             231           19.3
TTS-2    277             123           30.8              326             74            18.5
TGS-3    855             345           28.8              974             226           18.8
TTS-3    277             123           30.8              321             79            19.8
TGS-4    854             346           28.8              976             224           18.7
TTS-4    278             122           30.5              319             81            20.3
TGS-5    844             356           29.7              969             231           19.3
TTS-5    288             112           28.0              329             71            17.75

Of a total of 1600 stands, 400 are randomly selected, forming a test set. The rest, 1200 stands, are used for training. Five pairs of training and
test sets are generated. They are referred to as TGS-1/TTS-1, TGS-2/TTS-2, TGS-3/TTS-3, TGS-4/TTS-4, and TGS-5/TTS-5, respectively.
(a) The percentage of the observed damaged stands.
The stand age, the dominant height of the stand, the
tree species, the site stability for Norway spruce, the
aspect, elevation, slope, and Topex (see footnote 1) of the site were
selected as preliminary input variables. Fig. 1 shows
the relationships between the occurrence of damage
and these variables using the reduced dataset, where a
damage rate of 2% is used as the cut line between
‘‘undamaged’’ and ‘‘damaged’’. The proportion of
damaged stands changes significantly as the stand
age index increases (Fig. 1a), and when the site aspect
varies (Fig. 1d). It increases as the height index
increases (Fig. 1b), and when the stand changes from
non-spruce to spruce-dominated stands (Fig. 1c).
On the other hand, no clear tendencies are observed
between the occurrence of damage and the variables
site stability, elevation, slope or Topex. Fig. 1 reveals
that the database implies some difficulties for a mod-
eling approach.
In the reduced dataset, ‘‘site stability’’ as fixed by
the site classification was formulated as an ordinal
variable with four values between 1 (stable) and 4
(unstable). For the logistic regression model, the
stability was directly presented to the model as a
categorical variable. For the neural network, it was
encoded as an ordinal variable and presented to the net
as a so-called thermometer (Master 1993, 260–262),
thus requiring only three input units. The variable
‘‘tree species’’ was presented as a categorical variable
to both the logistic regression model and the network.
It was encoded by giving a value of 0 for spruce
while all other species (Not spruce) were given the
number 1.
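A thermometer code represents an ordinal value by switching on one additional unit per level. The sketch below shows one common layout that needs three units for the four stability levels; the exact bit layout used in the study is not given in the text, so this layout and the function names are assumptions.

```python
def thermometer(level, n_levels=4):
    """Thermometer code for an ordinal level in 1..n_levels using n_levels - 1 units.

    Level 1 -> [0, 0, 0], level 2 -> [1, 0, 0], level 3 -> [1, 1, 0], level 4 -> [1, 1, 1]
    """
    if not 1 <= level <= n_levels:
        raise ValueError("level out of range")
    return [1 if level > i else 0 for i in range(1, n_levels)]


def encode_species(is_spruce):
    """Tree species as described in the text: 0 for spruce, 1 for all other species."""
    return 0 if is_spruce else 1
```

Unlike a one-of-n code, neighbouring stability levels differ in exactly one unit here, which preserves the ordering of the variable.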
The variable ‘‘aspect’’ was measured by degree in
the interval of 0–360 using a digital terrain model
within the GIS as the source of information. For the
logistic regression model, aspect was classified into
eight directions (north, northeast, east, southeast,
south, southwest, west, and northwest) and presented
as a categorical variable. The same classification
system was applied to the neural network, for which
the eight directions of aspect were encoded into seven
8-tuples using the n − 1 encoding scheme recommended
for the type of learning algorithm that was
used in the present investigation.
For the logistic regression model, the variables
elevation, slope, and Topex were directly presented
to the model as continuous variables in the measured
values. For the network model, they were scaled to be
in the interval of [0, 1] using the formula $z_i = (y_i - y_{\min}) / (y_{\max} - y_{\min})$ in order to make training faster and
reduce the chances of getting stuck in local optima.
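The preprocessing of aspect and of the continuous variables can be sketched as below. Several details are assumptions of this example: the 45° sectors centred on the compass directions, the particular n − 1 code in which the last direction is represented by all zeros across seven units, and all function names.

```python
DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def aspect_direction(degrees):
    """Map an aspect in degrees (0-360) to one of eight compass directions."""
    sector = int(((degrees % 360.0) + 22.5) // 45.0) % 8
    return DIRECTIONS[sector]


def encode_n_minus_1(category, categories):
    """Encode one of n categories with n - 1 units; the last category is all zeros."""
    index = categories.index(category)
    units = [0] * (len(categories) - 1)
    if index < len(units):
        units[index] = 1
    return units


def min_max_scale(values):
    """Scale values to [0, 1] with z_i = (y_i - y_min) / (y_max - y_min)."""
    y_min, y_max = min(values), max(values)
    if y_max == y_min:
        raise ValueError("cannot scale a constant variable")
    return [(y - y_min) / (y_max - y_min) for y in values]
```

With three stability units, seven aspect units, one species unit and five scaled numeric inputs (age, height, elevation, slope, Topex), this kind of encoding would give the 16 input units stated for the network, although the text does not list that breakdown explicitly.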
3.6. Training procedure
After trying several levels and combinations of the
learning parameters for the BackpropMomentum
algorithm the learning rate $\eta$ was set at 0.02 with a
momentum $\mu$ of 0.02, a flat spot elimination value $c$ of
0.1 and $d_{\max}$ of 0.1. This configuration of the net was
used for all the training- and test runs with the datasets
TGS1/TTS1–TGS5/TTS5. Alterations of these para-
meter settings were subject to a sensitivity analysis
that was applied only to selected training- and test-
sets. The training procedure was run iteratively to
minimize the mean squared error (MSE) that was
Table 3
Percentage of the different risk classes (observed damaged stands) in the test sets for the multinomial model

         Class 0 (0–2%) (a)   Class 1 (2–5%) (a)   Class 2 (5–10%) (a)   Class 3 (>10%) (a)
TTS-1    68.5                 12.3                 7.3                   12.0
TTS-2    69.5                 12.0                 5.8                   12.8
TTS-3    69.3                 11.0                 8.8                   11.0
TTS-4    69.5                 10.3                 8.3                   12.0
TTS-5    72.0                 10.3                 8.5                   9.3

(a) Damage rate.
1 The Topex score is assessed by measuring the angle of
elevation in degrees from a fixed point to the horizon for a
predetermined number of compass directions (Pyatt, 1969). The
sum of all the angles taken at each sample point is the Topex score.
In the present investigation, the Topex was calculated based on the
GIS and the digital terrain model, where eight directions (north,
northeast, east, southeast, south, southwest, west, and northwest)
are considered.
controlled on a logfile-panel and an error-graph of the
network simulator. To avoid over-fitting (over-train-
ing) of the network the training process was stopped
after 2000 epochs. The epoch size was set to the
number of samples of the whole training set. Specific
to our dataset, the epoch size was 1200. The obtained
network was then applied to the test set, and the
performance measures were calculated. In order to
alleviate the influence of the initial value of the
weights and the danger of dropping into a local
minimum, 10 trials were performed for each training
and test set, the median of which was taken as
the final performance measure.
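The repeated-trial procedure can be expressed as a small wrapper. This is a sketch under stated assumptions: `train_and_evaluate` is a hypothetical callable that initialises the weights from the supplied random generator, trains the network, and returns one performance value (for example the MSSE on the test set).

```python
import random
import statistics


def median_performance(train_and_evaluate, n_trials=10, base_seed=0):
    """Run several trials with different weight initialisations and return the median."""
    scores = []
    for trial in range(n_trials):
        rng = random.Random(base_seed + trial)  # a different initialisation per trial
        scores.append(train_and_evaluate(rng))
    return statistics.median(scores)
```

Taking the median rather than the best run keeps a single lucky or unlucky initialisation from determining the reported result.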
We fitted the logistic regression model for the
dichotomous approach by using the binary logistic
regression procedure implemented in SPSS for
Windows (SPSS for Windows 10.0, 2000), which derives
the parameters through maximum likelihood estima-
tion. For the multinomial model the multinomial
logistic regression procedure was applied that uses
the same estimation method (SPSS for Windows 10.0,
2000). We started with a model including all the eight
variables as stated in the previous section. Then, a
backward stepwise procedure using the Bayesian
Information Criterion (BIC) (Schwarz, 1978) as vari-
able removal criterion was used to eliminate the non-
significant variables and to select the most appropriate
models. The BIC can be calculated by
\[ \mathrm{BIC} = -2 \ln(L) + (1 + k) \ln(n) \qquad (11) \]
where L is the model’s likelihood, k the number of
explanatory variables, and n the total number of
samples.
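Formula (11) and a backward elimination guided by it can be sketched as follows. This is not the SPSS procedure: `fit_log_likelihood` is a hypothetical callable that fits a model on the given variables and returns ln(L), and the stopping rule (stop when no removal lowers the BIC) is an assumption of this example.

```python
import math


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion following formula (11)."""
    return -2.0 * log_likelihood + (1 + k) * math.log(n)


def backward_select(variables, fit_log_likelihood, n):
    """Drop one variable at a time while doing so lowers the BIC."""
    current = list(variables)
    best = bic(fit_log_likelihood(current), len(current), n)
    while len(current) > 1:
        candidates = []
        for v in current:
            reduced = [u for u in current if u != v]
            score = bic(fit_log_likelihood(reduced), len(reduced), n)
            candidates.append((score, reduced))
        score, reduced = min(candidates, key=lambda c: c[0])
        if score >= best:  # no removal improves the criterion
            break
        best, current = score, reduced
    return current, best


def toy_log_likelihood(variables):
    """Made-up stand-in for a model fit: only 'age' and 'height' carry information."""
    missing = len({"age", "height"} - set(variables))
    return -100.0 - 50.0 * missing


selected, selected_bic = backward_select(["age", "height", "slope"], toy_log_likelihood, n=1200)
```

In the toy example, removing "slope" leaves the likelihood unchanged and saves one parameter, so it is dropped; removing "age" or "height" costs far more likelihood than the penalty term saves.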
The fitted model was then applied to the test data.
The probability of the occurrence of a stand being
damaged (dichotomous model) or belonging to one of
the four damage classes (multinomial model) was
predicted for each stand contained in the set. The
performance measures were finally calculated for the
logistic regression model.
4. Results
4.1. Comparison of the dichotomous models
The parameter estimates for variables included in
the fitted logistic regression model are listed in Table 4
with a damage rate of 2% and in Table 5 with a damage
rate of 5% as cut value between undamaged and
damaged. Tables 4 and 5 show that four variables
enter all the fitted models: stand age, tree
species, dominant height and aspect. In contrast, the
variables ‘‘site-stability for Norway spruce’’, elevation,
slope and Topex are not selected in any of the cases.
No attempt is made here to explain this,
since the purpose of this study is to compare the
performance of the trained neural network with that of
the fitted logistic regression model when both are
applied to identical test sets.
Fig. 2 shows the performances of the trained net-
work and the fitted logistic regression model measured
by MSSE when they are applied to the test sets. Fig. 2a
demonstrates for each of the five test sets that the
median of the MSSE for the neural net is lower than
the MSSE for the logistic regression model when the
cut value between the undamaged and damaged stand
is set at a damage rate of 2%. This indicates that the
neural network might perform better than the logistic
regression model. However, the differences between
the two models at the lower damage rate, especially
for the test sets TTS-1 and TTS-4, are rather small (see
also the sensitivity in Fig. 3.a.2). Further, Fig. 2b
shows that the advantage of the network over the
logistic regression model is more pronounced
if 5% is used as cut value. In addition, it can be
observed that both models tend to perform worse when
the cut value changes from 2 to 5%. As shown in
Table 1, an increase of the cut value from 2 to 5%
actually changes the distribution of the frequency of
both the undamaged and damaged stands in the train-
ing and test set. More specifically, a change of the cut
value from 2 to 5% results in a reduction of the
proportion of the damaged stands from around 30
to 20%. Consequently, both the trained network and
the fitted logistic regression model show a lower
performance, since the number of the damaged stands
is further reduced in favor of the undamaged stands.
Thus, the results of the present study indicate that the
neural network may be preferable as a classifier for a
dichotomous approach compared to a logistic regres-
sion model, when the frequencies of the classes vary
significantly.
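For orientation, the MSSE values reported in Tables 4 and 5 can be reproduced from the class sensitivities if the MSSE is computed as the mean over classes of the squared difference between 1 and the class sensitivity. The sketch below rests on that assumption and uses the TGS-1 sensitivities from Tables 4 and 5 as input.

```python
def mean_squared_sensitivity_error(sensitivities):
    """Mean over classes of (1 - sensitivity)^2.

    Sensitivities are fractions in [0, 1], one per class. This form is an
    assumption that happens to reproduce the tabulated MSSE values.
    """
    return sum((1.0 - s) ** 2 for s in sensitivities) / len(sensitivities)


# TGS-1, 2% cut value (Table 4): undamaged 90.2%, damaged 35.7%
msse_2_percent = mean_squared_sensitivity_error([0.902, 0.357])
# TGS-1, 5% cut value (Table 5): undamaged 96.5%, damaged 12.7%
msse_5_percent = mean_squared_sensitivity_error([0.965, 0.127])

print(round(msse_2_percent, 4))  # 0.2115, as listed in Table 4
print(round(msse_5_percent, 4))  # 0.3817, as listed in Table 5
```

Because each class contributes equally regardless of its frequency, a model that neglects the rarer class is penalized heavily, which is why the measure is sensitive to the shift in class frequencies discussed above.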
Fig. 3 shows the sensitivity of the models when they
are applied to the test sets under different cut values
between undamaged and damaged. Both Fig. 3.a.1 and
3.b.1 demonstrate that the sensitivity for the category
‘‘undamaged stand’’ is in most cases somewhat lower
for the neural network than for the logistic regression
model. This means that the ability of the neural network
to correctly classify undamaged stands may
sometimes be lower than that of the logistic regression
model. However, looking at Fig. 3.a.2 and 3.b.2 it
should be noted that the neural network performs
much better than the logistic regression model in
identifying damaged stands, especially when the cut
value between undamaged and damaged is set at 5%.
Fig. 3.a.3 and 3.b.3 give the overall sensitivity for both
Table 4
Parameter estimates for variables included in the fitted logistic regression model
Variable     Code            TGS-1       TGS-2       TGS-3       TGS-4       TGS-5
Constant                     -5.565      -5.863      -5.794      -8.896      -5.625
Age                          -0.278      -0.302      -0.283      -0.276      -0.303
                             (55.082)^a  (65.916)    (52.157)    (52.345)    (60.808)
Species      0 Not spruce    -0.602      -0.486      -0.668      -0.702      -0.663
                             (7.277)     (2.035)     (10.422)    (12.484)    (10.437)
Height                       1.063       1.151       1.136       1.096       1.123
                             (107.836)   (127.125)   (119.023)   (114.233)   (116.635)
Aspect^b                     (14.327)    (10.728)    (10.559)    (14.932)    (15.654)
             1 N             1.561       1.263       1.348       4.392       1.475
             2 NE            1.996       1.967       1.897       5.251       1.894
             3 E             2.136       2.167       2.087       5.370       2.372
             4 SE            2.175       2.088       1.925       5.179       1.882
             5 S             1.556       1.652       1.496       4.835       1.695
             6 SW            1.394       1.545       1.365       4.733       1.532
             7 W             1.971       1.918       1.652       5.189       2.088
BIC^a                        1254.542    1248.796    1238.025    1238.463    1256.142
Percentage of correct classification
  Undamaged                  90.2        89.2        89.2        88.6        89.7
  Damaged                    35.7        37.9        39.1        43.6        40.2
  Total                      74.7        74.4        74.8        75.7        75.0
MSSE^c                       0.2115      0.1987      0.1913      0.1655      0.1841
Damage rate (defined as the rate of recorded storm damage of a stand in percent of its standing volume) 2% as the cut value between the undamaged (encoded as 0) and damaged stand (encoded as 1).
^a The Bayesian Information Criterion (Schwarz, 1978). The values in parentheses are the changes in BIC if the term were removed from the fitted model.
^b North west (code: 8) was used as reference.
^c Mean squared sensitivity error.
[Figure 2: two bar-chart panels, (a) and (b). Vertical axis: MSSE (0.0 to 1.0); horizontal axis: test sets TTS-1 to TTS-5; series: Logistic Model and Neural Network.]
Fig. 2. Performance of the fitted logistic regression model and the trained neural network when they are applied to test sets, including: TTS-1,
TTS-2, TTS-3, TTS-4 and TTS-5. The performance is measured by mean squared sensitivity error. The MSSEs for the neural network are the
median value of the MSSE of 10 trials. (a) Damage rate 2% as cut line between undamaged and damaged stand, (b) damage rate 5% as cut line
between undamaged and damaged stand.
models. Under this overall evaluation criterion, the
performance of the neural network is only slightly
different from that of the logistic regression model. As a
general observation it can be noted that for the dichot-
omous model the network has a distinctly stronger
ability to identify damaged stands, compared to the
logistic regression model. This significant advantage
may be accompanied by a slight loss in precision on
the side of the undamaged stands.
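The class-wise and overall sensitivities compared here can be computed from observed and predicted class labels as sketched below. The labels are invented toy data, and reading "overall sensitivity" as the share of all stands classified correctly is an assumption based on the "Total" rows of Tables 4 and 5.

```python
def class_sensitivities(observed, predicted, classes):
    """Share of correctly classified stands within each observed class."""
    result = {}
    for c in classes:
        n_class = sum(1 for o in observed if o == c)
        n_hit = sum(1 for o, p in zip(observed, predicted) if o == c and p == c)
        result[c] = n_hit / n_class if n_class else float("nan")
    return result


def overall_sensitivity(observed, predicted):
    """Share of all stands that are classified correctly (assumed meaning)."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return hits / len(observed)


# Toy data: 0 = undamaged, 1 = damaged (invented for illustration)
observed = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

per_class = class_sensitivities(observed, predicted, classes=[0, 1])
overall = overall_sensitivity(observed, predicted)
print(per_class, overall)
```

In the toy data the overall value (0.7) is dominated by the larger undamaged class, while the sensitivity for damaged stands is only one third. This is the pattern that makes the overall criterion a weak discriminator between the two models.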
Table 6 lists the observed proportion of damaged
stands and the proportions predicted by the logistic
regression model and the neural network for the test
sets. In general, it can be noted that both the logistic
regression model and the network tend to under-pre-
dict the proportion of damaged stands. However, using
a neural network instead of a logistic regression model
can increase the number of correctly predicted
damaged stands. Specific to the database used in this
investigation is that, with a damage rate of 2% as the
cut value between undamaged and damaged, the
observed proportion of damaged stands is about
30%. The predicted proportion of damaged stands
is about 20% when the logistic regression model is
applied. This proportion increases to an average of
23% when the neural network model is used. When a
damage rate of 5% is used as the cut value, the
observed proportion of damaged stands amounts to
about 20%. The predicted proportion of damaged
stands is then only about 6% for the logistic regression
model, but 14% for the network.
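The rounded proportions quoted in this paragraph can be checked against the per-test-set values in Table 6. The short calculation below averages the "Damaged stands" columns of Table 6 over the five test sets; the dictionary layout is just one convenient way to hold the numbers.

```python
from statistics import mean

# Proportion (%) of damaged stands per test set (TTS-1 to TTS-5), Table 6
damaged_stands = {
    ("2%", "observed"): [31.5, 30.5, 30.8, 30.5, 28.0],
    ("2%", "network"): [23.8, 21.6, 25.3, 22.9, 22.1],
    ("2%", "logistic"): [19.3, 16.5, 16.3, 21.3, 19.2],
    ("5%", "observed"): [19.25, 18.50, 19.75, 20.25, 17.75],
    ("5%", "network"): [12.5, 14.7, 16.3, 15.2, 10.5],
    ("5%", "logistic"): [5.0, 6.0, 6.0, 7.0, 7.0],
}

averages = {key: round(mean(values), 1) for key, values in damaged_stands.items()}
for key, value in averages.items():
    print(key, value)
```

The averages come out at about 30.3, 23.1 and 18.5% for the 2% cut value and about 19.1, 13.8 and 6.2% for the 5% cut value (observed, network, logistic model), in line with the rounded figures given in the text.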
4.2. Comparison of the multinomial models
Table 7 shows the parameter estimates for the
variables that were included in the fitted multinomial
logistic regression using the Bayesian Information
Table 5
Parameter estimates for variables included in the fitted logistic regression model
Variable     Code            TGS-1       TGS-2       TGS-3       TGS-4       TGS-5
Constant                     -5.104      -5.616      -5.196      -9.651      -5.424
Age                          -0.215      -0.285      -0.250      -0.293      -0.258
                             (18.592)^a  (34.623)    (21.744)    (31.473)    (25.463)
Species      0 Not spruce    -0.903      -0.782      -0.994      -0.942      -0.934
                             (17.585)    (11.119)    (22.384)    (19.305)    (19.298)
Height                       0.908       1.104       1.005       1.099       1.040
                             (55.852)    (82.002)    (62.989)    (75.762)    (70.210)
Aspect^b                     (12.400)    (9.638)     (5.512)     (13.660)    (9.182)
             1 N             1.332       1.243       1.123       5.206       1.295
             2 NE            1.301       1.389       1.276       5.652       1.299
             3 E             1.517       1.472       1.313       5.600       1.639
             4 SE            1.495       1.495       0.228       5.582       1.148
             5 S             0.669       0.526       0.567       4.734       0.772
             6 SW            0.669       1.065       0.749       4.994       0.839
             7 W             1.138       1.289       0.883       5.551       1.383
BIC^a                        1035.946    1012.733    1008.362    981.617     1025.719
Percentage of correct classification
  Undamaged                  96.5        95.8        95.8        95.7        95.5
  Damaged                    12.7        21.6        16.8        20.1        17.5
  Total                      80.6        81.5        80.9        81.6        80.3
MSSE^c                       0.3817      0.3082      0.3470      0.3201      0.3413
Damage rate (defined as the rate of recorded storm damage of a stand in percent of its standing volume) 5% as the cut value between the undamaged (encoded as 0) and damaged stand (encoded as 1).
^a The Bayesian Information Criterion (Schwarz, 1978). The values in parentheses are the changes in BIC if the term were removed from the fitted model.
^b North west (encoded as 8) was used as reference.
^c Mean squared sensitivity error.
Criterion. The fitted models for all training sets
use only four variables: age, species, height and slope. The
variable aspect, which entered the binary models
(Tables 4 and 5), was not selected. It can also
be observed that the model is not able to predict the
damage classes 1 and 2, but shows instead a dichotomous
behavior in assigning the stands in the training
(and test) sets either to damage class 0 or 3. The
sensitivity for damaged stands in the training sets is
distinctly lower than for the 2% binary model (Table 4)
and lower than for the 5% damage rate (Table 5)
especially for the training sets 1, 2 and 4, while the
training sets 3 and 5 show a slightly higher sensitivity
than the 5% binary model. Overall sensitivity
decreases by about 10 percentage points compared to the dichotomous
model with 5% damage rate, owing to the zero sensitivity in
damage classes 1 and 2.
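The way a multinomial logistic model with a reference category produces class probabilities can be sketched as follows. The linear predictors are hypothetical numbers, class 3 is taken as the reference category as in Table 7, and assigning each stand to the class with the highest probability is an assumption about the classification rule.

```python
import math


def multinomial_probabilities(linear_predictors, reference_class):
    """Class probabilities for a multinomial logit with a reference class.

    linear_predictors maps each non-reference class to its linear
    predictor; the reference class has linear predictor 0 by construction.
    """
    exp_terms = {c: math.exp(eta) for c, eta in linear_predictors.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {c: value / denominator for c, value in exp_terms.items()}
    probabilities[reference_class] = 1.0 / denominator
    return probabilities


def assign_class(probabilities):
    """Assumed rule: pick the class with the largest probability."""
    return max(probabilities, key=probabilities.get)


# Hypothetical linear predictors for damage classes 0, 1 and 2
etas = {0: 1.5, 1: -0.2, 2: -0.5}
probs = multinomial_probabilities(etas, reference_class=3)
predicted = assign_class(probs)
print(probs, predicted)
```

Under such a rule a class is predicted only where its probability is the largest of the four, so intermediate classes that are rarely the most probable would receive no stands at all. Whether this is the mechanism behind the zero sensitivities for classes 1 and 2 is not established by the results shown here.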
Table 8, in which the performance of both multi-
nomial models when applied to the test sets is
depicted, reveals that the neural network shows the
same behavior as the logistic multinomial regression.
Damage classes 1 and 2 are not identified by the
network with multiple outputs; a damaged stand is
assigned either to damage class 0 or 3. Except for test
set 1, the sensitivity for damage class 3 of the neural
network with multiple outputs is higher and the MSSE
is lower than for the multinomial regression model, which
may again indicate a greater ability of the neural
network to detect damaged stands. The sensitivity for
undamaged stands is again slightly lower for the
[Figure 3: six bar-chart panels. a.1 and b.1: Sensitivity for Undamaged Stand; a.2 and b.2: Sensitivity for Damaged Stand; a.3 and b.3: Overall Sensitivity. Vertical axis: Sensitivity [%] (0 to 100); horizontal axis: test sets TTS-1 to TTS-5; series: Logistic Model and Neural Network.]
Fig. 3. Sensitivities of the fitted logistic regression model and the trained neural network when they are applied to test sets, including: TTS-1,
TTS-2, TTS-3, TTS-4 and TTS-5. The sensitivities for the neural network are the median value of the sensitivity of 10 trials. (a) Damage rate
2% as cut line between undamaged and damaged stand, (b) damage rate 5% as cut line between undamaged and damaged stand.
Table 6
Proportion (%) of observed damaged and predicted damaged stands for test sets (dichotomous model)
           Undamaged stands                       Damaged stands
Test set   Observed   Network   Logistic model    Observed   Network   Logistic model
Damage rate 2% as cut line between undamaged and damaged stand
TTS-1      68.5       76.2      80.8              31.5       23.8      19.3
TTS-2      69.5       78.5      83.5              30.5       21.6      16.5
TTS-3      69.3       74.7      83.8              30.8       25.3      16.3
TTS-4      69.5       77.1      78.8              30.5       22.9      21.3
TTS-5      72.0       77.9      80.8              28.0       22.1      19.2
Damage rate 5% as cut line between undamaged and damaged stand
TTS-1      80.75      87.6      95.0              19.25      12.5      5.0
TTS-2      81.50      85.4      94.0              18.50      14.7      6.0
TTS-3      80.25      83.8      94.0              19.75      16.3      6.0
TTS-4      79.75      84.8      93.0              20.25      15.2      7.0
TTS-5      82.25      89.5      93.0              17.75      10.5      7.0
Table 7
Parameter estimates for variables included in the fitted multinomial logistic regression model
Training set  Category  Constant  Age     Species, 0 (not spruce)  Height   Slope    Sensitivity S (%)  MSSE^a
TGS-1         0         4.200     0.301   1.173                    -1.142   0.091    97.4
              1         -0.248    0.002   1.117                    -0.129   0.074    0                  0.6988
              2         -0.333    0.106   0.689                    -0.241   0.045    0
              3^b                                                                    10.9
              Overall                                                                70.8
TGS-2         0         4.816     0.360   1.280                    -1.318   0.060    97.1
              1         0.273     0.099   1.356                    -0.373   0.056    0                  0.6805
              2         -0.310    0.078   1.091                    -0.162   -0.001   0
              3                                                                      15.1
              Overall                                                                70.7
TGS-3         0         4.434     0.330   1.305                    -1.255   0.108    96.4               0.6682
              1         -0.317    0.069   1.267                    -0.223   0.074    0
              2         -0.446    0.086   0.752                    -0.233   0.076    0
              3                                                                      18.0
              Overall                                                                70.7
TGS-4         0         4.503     0.379   1.232                    -1.307   0.072    97.3
              1         0.254     0.164   1.015                    -0.454   0.068    0                  0.6820
              2         -0.785    0.102   0.622                    -0.144   0.037    0
              3                                                                      14.7
              Overall                                                                70.8
TGS-5         0         4.595     0.332   1.173                    -1.260   0.072    96.4
              1         0.454     0.047   1.026                    -0.327   0.074    0                  0.6632
              2         -0.243    0.077   0.638                    -0.229   0.045    0
              3                                                                      19.3
              Overall                                                                70.1
Damage rate is classified into four categories: 0–2% (encoded as 0); 2–5% (encoded as 1); 5–10% (encoded as 2); 10–100% (encoded as 3).
The Bayesian Information Criterion (Schwarz, 1978) was used to determine which variable enters the model.
^a Mean squared sensitivity error.
^b Reference category.
neural network, with an almost identical overall sensitivity
for both models. However, it is obvious that
neither model qualifies as a classifier for four
classes of storm damage with the present database.
5. Discussion
5.1. Overall analysis of the performance of the two
models
This study indicates that the artificial neural net-
work technology may be a promising approach to
classify forests susceptible to wind damage in a dichot-
omous classification mode. A feed-forward network
tends to have a stronger ability to identify damaged
stands than a classic logistic regression model. Its
ability will be reduced with a decrease in the frequency
of damaged stands. Nevertheless, the network per-
forms better than the logistic regression model, espe-
cially when the cut value between ‘‘undamaged’’ and
‘‘damaged’’ is increased. A slight loss in precision
when predicting undamaged stands compared to the
logistic regression model, as observed in some of the
datasets of this investigation, seems to be acceptable
since a significant gain in precision on the side of
predicting damaged stands is available. Hasenauer
et al. (2000) found that neural networks were suitable
for the prediction of the number of juvenile trees/unit
area. Prediction results were more accurate than results
from the conventional statistical approach based on
regression analyses. In a study from a completely
different field, Arana et al. (1999) compared the
performance of a neural network and a logistic regres-
sion as a predictive radiological model. They found
that the neural network had a higher performance as a
classifier than the logistic regression when the models
were built with 3-fold cross-validation (similar to the
approach that was chosen in the present paper that can
be looked upon as a 5-fold cross-validation), while two
other validation methods (leave-one-out and bootstrap
algorithm) did not detect this difference (Arana et al.,
1999, 636). Both of these studies seem to support the
results of the present paper, but they cannot be directly
applied to the problem of our study. Furthermore, the
two models differ fundamentally in how they
process the input and in their application, as
discussed in the following.
5.2. Differences of the models in processing
input information
In order to keep both of the models comparable
we started with the same set of input (independent)
Table 8
Performance measures for the fitted multinomial logistic regression models and the trained neural network when they are applied to the test
datasets
                            Sensitivity (%) for damage classes^a
Test set  Model             0      1     2     3      All     MSSE
TTS-1     Logistic model    98.5   0     0     20.8   70.0    0.6567
          Neural Network    95.6   0     0     19.8   67.9    0.6613
TTS-2     Logistic model    96.8   0     0     15.7   69.2    0.6780
          Neural Network    95.7   0     0     21.6   69.3    0.6542
TTS-3     Logistic model    96.8   0     0     15.9   68.8    0.6770
          Neural Network    93.7   0     0     40.9   69.4    0.5882
TTS-4     Logistic model    97.1   0     0     16.7   69.5    0.6738
          Neural Network    93.7   0     0     24.0   68.0    0.6455
TTS-5     Logistic model    96.9   0     0     18.9   71.5    0.6646
          Neural Network    93.9   0     0     28.4   70.3    0.6291
MSSE: mean squared sensitivity error.
^a Damage classes: 0 = 0–2% damage; 1 = 2–5% damage; 2 = 5–10% damage; 3 = 10–100% damage.
variables. However, the logistic regression model
using Schwarz’ Bayesian Information Criterion
eliminated some of the independent variables as
non-significant for both model types, which led to a
different final model layout of the logistic regression
compared to the neural network. In order to analyze
whether a neural network would produce similar
results to a logistic regression if the input nodes
were reduced to the significant independent variables
detected by the logistic regression, a neural
network with four nodes in the input layer, represent-
ing the variables age, species, height and slope was
set up and was tested for the multinomial approach
with a randomly selected pair of training- and test
sets. The result was that this 4–5–4 network (4 nodes
in the input, 5 in the hidden and 4 in the output layer)
was not able at all to correctly assign damaged stands
to a damage class. All the stands were classified into
damage class 0. The same happened with a 4–3–4
network. The network obviously needed additional
information to be able to distinguish between damaged
and undamaged stands.
In an attempt to optimize the topology of the
neural network the number of input units for the
variable ‘‘aspect’’ was reduced applying an encod-
ing scheme proposed by Master (1993, 270ff).
To this end, a pair of new variables, sin(aspect) and
cos(aspect), which change smoothly with aspect,
was introduced. Consequently, only two units
were required in the network to represent one value
of aspect. For example, a forest stand with an aspect of
90° was represented as (1, 0) in the training and test
sets. The result of that 11–5–1 network was almost
the same as for the 16–5–1 network in the dichot-
omous model for the 2%-damage threshold,
although a less efficient training algorithm without
flatspot-elimination was used. This indicates that
such a modification that enhances the speed of the
learning process and normally reduces the danger of
over-training does not negatively affect the perfor-
mance of the network. However, applying this sin–
cos coding scheme for the variable ‘‘aspect’’ when
fitting the logistic regression models led to an exclu-
sion of the variable ‘‘aspect’’ in all the fitted logistic
regression models and reduced the performance of
these regression models, which again is an indication
of a different processing of input information of the
two approaches.
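The sin–cos coding of aspect can be illustrated with a short sketch. The function name is illustrative, and aspect is assumed to be given in degrees.

```python
import math


def encode_aspect(aspect_degrees):
    """Encode a compass aspect as the pair (sin, cos)."""
    angle = math.radians(aspect_degrees)
    return (math.sin(angle), math.cos(angle))


east = encode_aspect(90)              # approximately (1, 0), as in the text
just_west_of_north = encode_aspect(350)
just_east_of_north = encode_aspect(10)

# Aspects that are close on the compass stay close after encoding,
# even across the 360/0 degree boundary.
gap_across_north = math.dist(just_west_of_north, just_east_of_north)
gap_to_east = math.dist(just_east_of_north, east)
print(east, gap_across_north, gap_to_east)
```

The two coordinates vary smoothly with aspect and have no artificial jump between 359° and 0°, which is the property the encoding is chosen for.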
5.3. Further optimization of the neural network—a
sensitivity analysis
It is well known that the design of a neural network
is a ‘‘trial-and-error’’-process. Taking into account
that in the present study the final measures for the
networks’ performance are the median of only 10
trials, and that the training procedure is stopped after
2000 epochs, one may argue that the topology of the
present network in comparison to the logistic regres-
sion model is not optimized, and furthermore, that it
may be over-trained. It is indeed most likely that it is
not optimized. Therefore different trials to optimize
the network and reduce the effect of over-training
were tested. As already stated, a change in the network
structure of the multinomial model did not lead to
positive results. Early stopping of the learning process
with fewer than 100 epochs or a general decrease of the
number of epochs below 2000 did not improve the
performance, nor did changing the learning
parameters (η, μ, c or d_max). Changing the network
topology for the dichotomous model from a 16–5–1
into a 16–7–1 structure by increasing the number of
nodes in the hidden layer led to a slightly higher
performance of around 5% for damaged stands with
the damage threshold 2%, and only for the training/test set
1. Using a different learning algorithm called Quickprop
(Zell et al., 2000, 148), a method to speed up the
learning process by using information about the cur-
vature of the error surface (Fahlman, 1988) improved
the performance of the dichotomous model with the
2%-damage rate for the training- and test set 4, but did
not show a stable behavior and led in one case to the
network getting trapped in a local minimum. Pruning
the network, using a magnitude based pruning algo-
rithm (Zell et al., 2000:127f) did not lead to an
improvement of the performance of the net but
revealed at least some similarities between the proces-
sing of input information of the neural network and the
logistic regression model. Pruning, a way to make
networks smaller by excluding unnecessary units or
links, e.g. by removing the links with the smallest
weights, showed that mainly the links to the units that
were also removed in the logistic regression (such as
aspect or stability) were subject to pruning, while the
input variables that were identified as significant by
the logistic model (age, species, height) remained
untouched.
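The idea of magnitude-based pruning, removing the links with the smallest absolute weights, can be sketched as follows. This is a simplified illustration and not the pruning routine of the network simulator used in the study. The link names and weights are invented so that the example mirrors the pattern described above.

```python
def prune_smallest_links(weights, n_remove):
    """Return a copy of the weights without the n_remove weakest links.

    weights maps (source unit, target unit) to a link weight; links are
    ranked by the absolute value of their weight.
    """
    ranked = sorted(weights, key=lambda link: abs(weights[link]))
    to_remove = set(ranked[:n_remove])
    return {link: w for link, w in weights.items() if link not in to_remove}


# Invented weights from input units to one hidden unit "h1"
weights = {
    ("aspect", "h1"): 0.03,
    ("age", "h1"): -1.20,
    ("height", "h1"): 2.10,
    ("stability", "h1"): -0.05,
    ("species", "h1"): 0.80,
}

pruned = prune_smallest_links(weights, n_remove=2)
print(sorted(link[0] for link in pruned))
```

In practice pruning is usually alternated with retraining, so that the remaining weights can compensate for the removed links. That step is omitted here.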
6. Conclusions
The results of the present study indicate that a
neural network may perform better than a classical
logistic regression as a predictive dichotomous model
for storm damage. Yet, we have to accept that using an
artificial neural network instead of a statistical model
like the logistic regression may also limit our analy-
tical capacities. Due to the ‘‘black-box character’’ of
the network we are not able to identify the significance
level of the different input variables as we are with a
statistical approach. This might, however, be useful
information, as a decision maker might not only be
interested in the result of a risk-classification but also
in knowing which of the relevant factors are the most
influential. Comparing the two approaches exclusively
by calculating a performance measure like
the MSSE might therefore not be completely satisfactory,
as the different character of the two models is
not adequately taken into account. In our case, a forest
manager who is mainly interested in the output of a
risk-classification system may choose a neural net-
work whilst a decision maker who wants to know more
about the reasons behind the classification might
prefer a logistic regression despite a possibly lower
performance. Using the pruning technique for neural
networks or training algorithms including weight
decay to detect unnecessary units and links may go
in this direction but does not deliver the same quality
of information. Numerical results further demonstrate
that, specific to the dataset used in this study, both the
network and the logistic regression model under-pre-
dict the proportion of damaged stands, but the esti-
mated proportion of damaged stands by the network
lies between the observed proportion and the proportion
estimated by the logistic regression model.
In this study, the training procedure for the neural
network iteratively minimizes the MSE until a predefined
number of epochs is reached, and the performance
measures are then calculated for the trained
net. We recommend that the network be directly
trained through minimizing the MSSE of the model.
Thus, a further improvement of the performance of the
neural network can be expected.
The experiences in the present investigation clearly
showed the limits of both technologies for risk classi-
fication when applied to the present database, although
it might be one of the most detailed assessments of risk
due to storm available for the southern part of Ger-
many, as this parameter is usually not recorded at the
stand level. Relying on a similar database, we would
therefore not recommend applying a neural network or
a logistic regression as a risk-classifier in a multi-
nomial or even a continuous variable approach. Pre-
vious investigations in which a neural network was
used to predict the risk of storm damage as a con-
tinuous variable (Hanewinkel and Zhou, 2000)
resulted in a non-satisfactory performance of the net-
work, especially for higher damage rates. We would
instead stick to a dichotomous approach and try to find
the most useful damage threshold that matches the
information needs of forest managers. To go beyond
this, meaning to try to predict risk in several risk
classes or to directly estimate the actual damage rate,
obviously needs additional parameters that better
characterize the stands and their vulnerability towards
storm damage.
Acknowledgements
We want to thank Alex Hinrichs for the initial
preparation of the data in 1994. Furthermore we would
like to thank Kai Fischer (M.Sc.) of IFE for calcula-
tions done within our Geographic Information System
(GIS). The research has been supported by the German
Ministry of Formation, Research and Technology
(BMBF) under the grant number 0339732/5. The
authors would also like to thank Greg Biging,
University of California in Berkeley for his useful
comments on the text.
References
Arana, E., Delicado, P., Martí-Bonmatí, L., 1999. Validation
procedures in radiological diagnostic models. Neural network
and logistic regression. Invest. Radiol. 34, 636–642.
Fahlman, S.E., 1988. Faster-learning variations on back-propaga-
tion: an empirical study. In: Sejnowski, T.J, Hinton, G.E.,
Touretzky, D.S. (Eds.), Connectionist Models Summer School,
San Mateo, CA, Morgan Kaufmann, pp. 105–110.
Fridman, J., Valinger, E., 1998. Modeling probability of snow
and wind damage using tree, stand, and site characteristics
from Pinus sylvestris sample plots. Scand. J. For. Res. 13 (3),
348–356.
Gardiner, B.A., Quine, C.P., 2000. Management of forests to reduce
the risk of abiotic damage—a review with particular reference
to the effects of strong winds. For. Ecol. Manage. 135,
261–277.
Gardiner, B.A., Peltola, H., Kellomäki, S., 2000. Comparison of
two models for predicting the critical wind speeds required to
damage coniferous trees. Ecol. Model. 29, 1–23.
Guan, B.T., Gertner, G., 1991a. Using a parallel distributed
processing system to model mortality. For. Sci. 37 (3), 871–
885.
Guan, B.T., Gertner, G., 1991b. Modeling red pine tree survival
with an artificial network. For. Sci. 37 (5), 1429–1440.
Guan, B.T., Gertner, G., 1995. Modeling individual tree survival
probability with a random optimization procedure: an artificial
neural network approach. AI Appl. 9 (2), 39–52.
Guan, B.T., Gertner, G., Parysow, P., 1997. A framework for
uncertainty assessment of mechanistic forest growth models: a
neural network example. Ecol. Model. 98, 47–58.
Hanewinkel, M., 2001. Financial results of selection forest
enterprises with high proportions of valuable timber—results
of an empirical study and their application. Swiss For. J. 152
(8), 343–349.
Hanewinkel, M., Zhou, W., 2000. A new approach for risk
assessment in secondary coniferous forests based on fuzzy sets
and artificial neural networks. In: Hasenauer, H. (Ed.),
Proceedings of the International IUFRO Conference on Forest
Ecosystem Restoration—Ecological and Economical Impacts
of Restoration Processes in Secondary Coniferous Forests,
Vienna, April 10–12, 2000, pp. 112–117.
Hasenauer, H., Kindermann, G., Merkl, D., 2000. Zur Schätzung
der Verjüngungssituation in Mischbeständen mit Hilfe Neuraler
Netze (Assessment of regeneration in uneven-aged mixed
stands using neural networks). German J. For. Sci. 119 (6),
350–366.
Hinrichs, A., 1994. Geographische Informationssysteme als
Hilfsmittel der forstlichen Betriebsführung (Geographical
Information Systems as Tool for Forest Management), vol. 3.
Ph.D. Thesis. Schriften aus dem Institut für Forstökonomie,
Albert-Ludwigs-University of Freiburg, Germany, 128 pp.
Hinton, G.E., 1989. Connectionist learning procedure. Artif. Intell.
4, 185–234.
Jalkanen, A., Mattila, U., 2000. Logistic regression models
for wind and snow damage in northern Finland based on
the national forest inventory data. For. Ecol. Manage. 135,
315–330.
König, A., 1995. Sturmgefährdung von Beständen im Altersklassenwald.
J.D. Sauerländer’s, 194 pp.
Kurth, H., Gerold, D., Dittrich, K., 1987. Reale Waldentwicklung
und Zielwald—Grundlagen nachhaltiger Systemregelung
des Waldes. Wissenschaftliche Zeitschrift der TU Dresden 36,
121–137.
Lawrence, S., Burns, I., Back, A., Tsoi, A., Giles, C., 1998. Neural
networks classification and prior class probabilities. In: Tricks
of the Trade, Lecture Notes in Computer Science State-of-the-
Art Surveys, Springer, Berlin, Germany, pp. 299–314.
Lekes, V., Dandul, I., 2000. Using airflow modelling and
spatial analysis for defining wind damage risk classification
(WINDARC). For. Ecol. Manage. 135, 331–344.
Master, T., 1993. Practical Neural Network Recipes in C++.
Academic Press, 493 pp.
Miller, D.R., Dunham, R., Broadgate, M.L., Aspinall, R.J., Law,
A.N.R., 2000. A demonstrator of models for assessing wind,
snow and fire damage to forests using the WWW. For. Ecol.
Manage. 135, 355–363.
Mitchell, S., 1998. A diagnostic framework for windthrow risk
estimation. For. Chron. 74, 100–105.
Mitchell, S.J., Hailemariam, T., Kulis, Y., 2001. Empirical
modeling of cutblock edge windthrow risk on Vancouver
Island, Canada, using stand level information. For. Ecol.
Manage. 154, 117–130.
Nauck, D., Klawonn, F., Kruse, R., 1994. Neuronale Netze und
Fuzzy-Systeme. Vieweg, 407 pp.
Nogami, K., 1991. Applying neural network and fuzzy sets theory
for multi-resource forest land use planning. In: Nogami, K.,
et al. (Eds.), Proceedings of the Symposium of Integrated
Forest Management Information Systems, Tsukuba, Japan,
pp. 203–212.
Patterson, D., 1996. Artificial Neural Networks. Theory and
Applications. Prentice-Hall, 506 pp.
Peltola, H., Kellomäki, S., Väisänen, H., Ikonen, V.-P., 1999. A
mechanistic model for assessing the risk of wind and snow
damage to single trees and stands of Scots pine, Norway spruce,
and birch. Can. J. For. Res. 29, 647–661.
Pyatt, D.G., 1969. Guide to the site types of north and mid Wales.
Forestry Commission Forest Record No. 69. Forestry Commis-
sion, United Kingdom, 45 pp.
Rottmann, M., 1985. Schneebruchschäden in Nadelholzbeständen.
J.D. Sauerländer’s, Frankfurt a.M. 159 S.
Rottmann, M., 1986. Wind- und Sturmschäden im Wald. J.D.
Sauerländer’s, Frankfurt a.M. 128 S.
Rumelhart, D.E., McClelland, J.L., 1986. Parallel Distributed
Processing. Explorations in the Microstructure of Cognition,
vol. 1, Foundations. MIT Press, 547 pp.
Schwarz, G., 1978. Estimating the dimension of a model. Ann.
Stat. 6, 461–464.
SPSS for Windows, Release 10.0, 2000. SPSS Inc., Chicago, IL.
Suzuki, T., 1971. Forest transition as a stochastic process.
Mitteilungen der Forstlichen Bundesversuchsanstalt (FBVA)
Wien 91, 137–150.
Talkkari, A., Peltola, H., Kellomäki, S., Strandmann, H., 2000.
Integration of component models from the tree, stand and
regional levels to assess the risk of wind damage at forest
margins. For. Ecol. Manage. 135, 303–313.
Valinger, E., Fridman, J., 1997. Modeling probability of snow and
wind damage in Scots pine stands using tree characteristics.
For. Ecol. Manage. 97 (3), 215–222.
Valinger, E., Fridman, J., 1999. Models to assess the risk of snow
and wind damage in pine, spruce, and birch forests in Sweden.
Environ. Manage. 24 (2), 209–217.
Zell, A., Mamier, G., Vogt, M., et al., 2000. SNNS (Stuttgart Neural
Network Simulator), User Manual, Version 4.2. Institute for
Parallel and Distributed High Performance Systems, University
of Stuttgart, Wilhelm Schickard Institute for Computer
Sciences, University of Tübingen, Germany. 338 pp.