
AN ARTIFICIAL NEURAL NETWORK METHOD FOR MINERAL PROSPECTIVITY MAPPING: A COMPARISON WITH FUZZY LOGIC AND BAYESIAN PROBABILITY METHODS

WARICK M. BROWN†, DAVID I. GROVES† and TAMÁS D. GEDEON‡

† Centre for Global Metallogeny, Department of Geology and Geophysics, University of Western Australia, Perth, WA 6907, AUSTRALIA E-mail: [email protected] E-mail: [email protected]

‡ School of Information Technology, Murdoch University, Perth, WA 6150, AUSTRALIA E-mail: [email protected]

Abstract

A multilayer perceptron (MLP) neural network is used to combine multi-source exploration data in a Geographic Information System (GIS) database and produce a mineral prospectivity map for gold deposits in the Tenterfield 1:100,000 sheet area, NSW, Australia. Statistical and probability measures of map quality indicate that the neural network method performs better than the empirical weights-of-evidence (Bayesian probability) and conceptual fuzzy logic methods in the generation of the mineral prospectivity map. The neural network has several important advantages over existing methods, including the ability to respond to critical combinations of parameters, rather than automatically increasing the prospectivity estimate in response to every favourable parameter.

1. Introduction

The volume and complexity of regional data available to the exploration geologist are increasing rapidly, drawn from a variety of sources such as remote sensing, airborne geophysics and large commercially available geochemical databases compiled from open-file exploration reports. As the amount of data increases, so too will the need for an effective means of integration and analysis (Cucuzza and Goode, 1998).

W. Sandham and M. Leggett (eds.), Geophysical Applications of Artificial Neural Networks and Fuzzy Logic, 179-196. © 2003 Kluwer Academic Publishers.

A mineral prospectivity map, showing areas in different colours according to their potential to host mineral deposits of a particular type, represents one method of combining and summarizing regional data sets within a GIS in a way that can help geologists make decisions about ground acquisition and target selection. The two most commonly used methods of producing prospectivity maps are the empirical weights-of-evidence method (Bonham-Carter et al., 1988) and the fuzzy logic method, although both these methods have important limitations.

The weights-of-evidence method is based on Bayesian probability and requires a statistically significant number of mineral occurrences to estimate the relative importance of geological evidence associated with the deposits. Consequently, the method cannot be used in poorly explored areas containing few known deposits. The conceptual fuzzy logic method overcomes this problem, but relies on a deposit expert to make subjective assessments. Conceptual methods also depend on the application of a deposit model to make decisions about which geological parameters might be significant. However, new deposit types are still being discovered (e.g. Olympic Dam at the time of discovery: Woodall, 1994), and models for well-investigated deposit styles are continually being updated in the light of new discoveries (e.g. Century: Legge, 1995).

Artificial neural networks offer a new approach to the problem of mineral prospectivity mapping. Neural networks are particularly suitable for pattern recognition and classification tasks. The ability of neural networks to function without pre-existing knowledge (e.g. a deposit model), and to operate at acceptable accuracy when the data are noisy or contain outliers, suggests that neural networks could be well suited to the integration of mineral exploration data. If digital geological, geophysical and geochemical maps can be superimposed in a GIS to produce an image from which the signatures of ore deposits can be extracted, then neural networks could be trained to recognise these signatures in order to predict the location of new deposits. To test this concept, a neural network was used to integrate regional exploration data from a small GIS database into a single prospectivity map, and the performance of the neural network approach was compared with that of the Bayesian probability and fuzzy logic methods.

2. Previous Work

The potential of combining neural network and GIS technologies was recognised by Jeffrey and Rossner (1986) and Ritter et al. (1988), and much of the subsequent research focussed on automated pattern recognition and the classification of remotely-sensed imagery (Decateur, 1989; Ritter and Hepner, 1990). Many of these studies showed that neural networks could be used to combine large numbers of data sets to perform classification (Benediktsson et al., 1993; Bruzzone et al., 1997). Although neural networks have been applied to a wide range of remote-sensing and geophysical problems, few studies describe their application to mineral exploration or mineral prospectivity mapping.


Bonham-Carter (1994) has described previous work in mineral prospectivity mapping as either empirical (data-driven) or conceptual (knowledge-driven). An empirical approach generally involves a statistical analysis of the spatial relationships of map features (e.g. faults and lithological units) to mineral deposit locations. Empirical techniques include multiple regression (Sinclair and Woodsmith, 1970), logistic regression (Agterberg, 1974; Reddy et al., 1991), canonical correlation (Pan and Harris, 1992), a Bayesian-spatial method (Singer and Kouda, 1988) and weights-of-evidence (Bonham-Carter et al., 1988).

Conceptual methods include evidential-belief theory (Moon, 1990), fuzzy logic (An et al., 1991), expert systems (Katz, 1991), Boolean algebra and index-overlay method (Bonham-Carter, 1994), and a mineral-systems method (Wyborn et al., 1995).

Singer and Kouda (1996) applied a multilayer feed-forward network to Kuroko-type deposits to produce a map depicting the distance to the nearest deposit, and the use of a proprietary neural network system for the classification of anomalies in multi-source data has been described by Clare et al. (1997). The use of a multilayer perceptron (MLP) to produce a prospectivity map was compared with fuzzy logic and weights-of-evidence methods by Brown et al. (1998, 1999). Harris and Pan (1999) have compared probabilistic neural network, logistic regression and discriminant analysis methods of mineral prospectivity mapping. Although the approach has been an empirical one in these studies, the neural network approach could equally well be applied to the processing of input map layers based on geological knowledge and subjective judgment.

3. Artificial Neural Networks

Although many different neural-network architectures exist, an MLP neural network was chosen for this study, because it performs very well for a large variety of problem types and has powerful function-approximation capabilities (Hornik, 1991; Masters, 1993). Learning in MLP neural networks is achieved by modifying the connection weights while a set of training examples is repeatedly processed. This training method is referred to as supervised because each example specifies all the input parameters together with the desired outputs. The set of inputs to the network constitutes a real-valued vector, [x_1, ..., x_n]. As the neural network used in this study contains only a single output unit, the training data consist of a list of input vectors together with the corresponding scalar output value. The difference between the desired output and the output produced by the network represents the error.

Learning proceeds by applying the error back-propagation algorithm (Werbos, 1974; Rumelhart et al., 1986) to iteratively adjust the weight values in order to minimize the network error. After training is completed, the network should be able to generalize: that is, produce appropriate responses for inputs that have not been used during training.

4. GIS Database

The GIS database for the trial study corresponds to the area of the Tenterfield 1:100,000 sheet (map 9339), and was compiled by the Geological Survey of New South Wales (Barnes, 1993; Barnes et al., 1995). The database comprises the following thematic layers: solid geology, regional-scale faults, airborne total magnetic-intensity, airborne gamma-ray spectrometry (U, Th, K and total count) and deposit locations. Magnetic and radiometric data were supplied as gridded, grey-scale images. The mineral deposit layer includes mineral occurrences, and small and medium sized mines. Each cell in the grid data represents a 200 metre square on the ground. There are a total of 248 cells in the x-direction and 280 cells in the y-direction, giving a total of 69,440 cells.

5. Geology

The Tenterfield 1:100,000 sheet is located in the southern portion of the New England Orogen (Figure 1). The study area consists of three main lithostratigraphic units: 1) a Silurian to Carboniferous metasedimentary and volcanic basement, 2) the Mid to Late Permian Wandsworth Volcanic Group, comprising intermediate to felsic volcanic rocks and comagmatic felsic intrusions, and 3) Late Permian to Early Triassic leucogranitoids. The main granitoids in the subject area are those of the Clarence River and Moonbi Suites and an undifferentiated group of leucogranitoids.

Figure 1. Location and simplified geology of the Tenterfield 1:100,000 sheet area adapted from an unpublished regional digital map supplied by the Geological Survey of New South Wales.

The highly fractionated leucogranites are the most important hosts to gold mineralization in this area. The mid-Triassic Stanthorpe Adamellite, appearing as a semi-circular body in the north of the study area, east of the Demon Fault, hosts gold mineralization in the form of large low-grade primary disseminations in poorly-defined irregular lenses near the upper surface of the granite body (e.g. the Timbarra-Poverty Point deposits) and placer deposits. Numerous alluvial gold occurrences are scattered across the Bungulla Porphyritic Adamellite, which forms an arcuate body on the eastern margin of the Stanthorpe Adamellite (Barnes et al., 1995). Forty-seven secondary, thirteen primary and three spatially-associated granitoid-hosted gold deposits of unknown form were selected as the basis for the prospectivity analysis in this study.

Placer gold deposits are included in the analysis because they are likely to be favourable indicators for the presence of primary gold deposits: most of the placer deposits in this area are derived from the primary deposits and have not been transported far. This conclusion is supported by: 1) the faceted morphology of the gold grains, which reflects the shape of the crystals on which they were deposited (Bampton, 1988), 2) the fineness of both the alluvial and primary gold (Barnes et al., 1995), 3) the flat and poorly-drained top of the plateau formed by the Stanthorpe Adamellite, and 4) the fact that much of the historical production from the Timbarra-Poverty Point goldfield came from eluvial deposits.

6. Neural Network Method

6.1. PRE-PROCESSING

All thematic layers in the GIS database were converted to raster format prior to further processing. The original solid-geology layer contained 41 rock units, many of which contain zero or very few deposits. The reliability of probability estimates used in the weights-of-evidence method depends on the existence of a statistically significant number of known deposits. Therefore, units belonging to the same rock suite or stratigraphic group were combined, resulting in a simplified geology layer consisting of 12 rock units. Values in the magnetic and radiometric grey-scale images were subdivided into eight classes. The GIS database layer containing regional faults was converted to a grid in which the cell values indicate the distance to the nearest fault. The thematic layers in the Tenterfield GIS database were combined into a set of input feature vectors. At each cell location in the Tenterfield map grid, a value from each thematic layer was combined to form a vector (Figure 2). These vectors, known as input feature vectors, formed the input to the neural network.
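The construction of input feature vectors from co-registered raster layers can be sketched as follows. This is a minimal illustration, not the utility programs used in the study: the function name `build_feature_vectors`, the layer names and the 2 x 2 grid values are all hypothetical, with the first cell chosen to reproduce the example vector shown in Figure 2.

```python
def build_feature_vectors(layers):
    """Combine co-registered raster layers into one feature vector per grid cell.

    `layers` maps a (hypothetical) layer name to a 2-D list of cell values.
    All layers are assumed to share the same grid dimensions.
    """
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    vectors = {}
    for row in range(n_rows):
        for col in range(n_cols):
            # One value from each thematic layer at this cell location.
            vectors[(row, col)] = [layers[name][row][col] for name in names]
    return names, vectors


# Hypothetical 2 x 2 grid; the real Tenterfield grid has 248 x 280 cells.
layers = {
    "geology_unit": [[5, 5], [3, 12]],
    "magnetic_class": [[8, 7], [2, 1]],
    "gamma_ray_value": [[33, 30], [12, 9]],
    "fault_distance_m": [[700, 500], [1900, 2400]],
}
names, vectors = build_feature_vectors(layers)
print(vectors[(0, 0)])  # [5, 8, 33, 700]
```

Keeping the grid coordinates as dictionary keys makes it straightforward to write the network output for each cell back to a map after processing.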

Linear scaling, based on the maximum and minimum values contained in each thematic layer, was applied to all input feature vectors. Target values used in the training data were scaled to the range [0.1,0.9] by applying a simple linear mapping of the binary output values, zero and one, to the practical limits of the network output. For a feedforward network with an asymptotic logistic activation function, the output activation values are typically limited to 0.1 and 0.9 (Masters, 1993).

[Figure 2 diagram: the GIS thematic layers (solid geology, magnetic anomaly, gamma-ray survey, distance to faults, known deposits) feed a training pattern made up of an input feature vector, e.g. [5, 8, 33, 700], and a target output.]

Figure 2. Relationship between GIS thematic layers and feature vectors used as input to the neural network. At each location on the map grid, the cell values for each thematic layer are combined to form an input feature vector. Patterns used to train the network consist of an input feature vector paired together with the desired output, represented by the value of the binary layer showing the locations of known deposits.
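The two scaling steps described above can be illustrated with a short sketch. The text specifies min-max scaling of the inputs and a target range of [0.1,0.9]; the input range [0,1] used here and the function names are assumptions made for illustration.

```python
def minmax_scale(values):
    """Linearly scale one layer's values using its own minimum and maximum.

    The output range [0, 1] is an assumption; the text only states that
    scaling is based on the layer minimum and maximum.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant layer carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_target(binary_target, low=0.1, high=0.9):
    """Map a binary target (0 = no deposit, 1 = deposit) onto [low, high]."""
    return low + (high - low) * binary_target


# Hypothetical distance-to-fault values in metres.
fault_distance_m = [0, 500, 1000, 2000]
print(minmax_scale(fault_distance_m))  # [0.0, 0.25, 0.5, 1.0]
print(scale_target(0), scale_target(1))
```

Keeping the targets away from zero and one avoids driving the logistic output unit towards its asymptotes, which would require very large weights.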

Apart from the preparation of the map layers, all processing was performed outside the GIS using a series of utility programs written in C++ and a neural network backpropagation simulation program (McClelland and Rumelhart, 1988). Data processing in the neural network method consisted of three main stages: 1) pre-processing of data to a format suitable as inputs to the neural network, 2) neural network training and processing the entire data set after training, and 3) post-processing required to convert the output values to a map. Processing in the weights-of-evidence and fuzzy logic methods was performed using the GEODIPS GIS prospectivity-analysis package (Jusmady and Taylor, 1997; Taylor, 1997).

6.2. INPUT CODING

A single input unit was assigned to each of the magnetic, four radiometric and fault-proximity layers. A one-of-n coding scheme was applied to the solid geology layer, so that each of the 12 rock types in the simplified geology layer was assigned a separate input layer unit in the neural network. For a particular input pattern, only one of these 12 geology input units is set to one and all the rest are set to zero. The network topology and inputs used are shown in Figure 3. Binary coding was used for the single output unit, with one and zero representing the presence and absence of deposits, respectively.

[Figure 3 diagram: 18 input units feed two hidden units and a single output unit. The inputs are 12 geology units (Quaternary alluvium; Tertiary volcanic rocks; Clarence-Moreton Basin rocks; Leucogranitoids; Moonbi Plutonic Suite; Uralla Plutonic Suite; Clarence River Plutonic Suite; Permian granitoids; Permian porphyritic granitoids; Wandsworth Volcanic Group; Permian sedimentary rocks; Sil.-Carb. sedimentary rocks), airborne magnetic data, four gamma-ray channels (uranium, thorium, potassium, total count) and distance to nearest fault.]

Figure 3. Architecture of the MLP neural network used in this study. The network has an 18-2-1 topology (where the numbers refer to input, hidden and output units, respectively). Each map layer in the GIS database was assigned an input unit except the geology layer, for which 1-of-n encoding was used.
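The input coding can be sketched as below: 12 one-of-n geology units followed by six single-valued units give the 18 inputs of the 18-2-1 network. The rock unit names follow Figure 3; the function name `encode_input`, the ordering of the inputs and the example values are assumptions for illustration.

```python
# Rock units of the simplified geology layer, as listed in Figure 3.
ROCK_UNITS = [
    "Quaternary alluvium",
    "Tertiary volcanic rocks",
    "Clarence-Moreton Basin rocks",
    "Leucogranitoids",
    "Moonbi Plutonic Suite",
    "Uralla Plutonic Suite",
    "Clarence River Plutonic Suite",
    "Permian granitoids",
    "Permian porphyritic granitoids",
    "Wandsworth Volcanic Group",
    "Permian sedimentary rocks",
    "Sil.-Carb. sedimentary rocks",
]


def encode_input(rock_unit, magnetic, uranium, thorium, potassium,
                 total_count, fault_distance):
    """Build an 18-element input vector: 12 one-of-n geology units + 6 scalars.

    The scalar arguments are assumed to be already scaled.
    """
    if rock_unit not in ROCK_UNITS:
        raise ValueError(f"unknown rock unit: {rock_unit!r}")
    # Exactly one geology unit is set to one; the rest are zero.
    one_of_n = [1.0 if unit == rock_unit else 0.0 for unit in ROCK_UNITS]
    return one_of_n + [magnetic, uranium, thorium, potassium,
                       total_count, fault_distance]


# Hypothetical scaled values for a cell on a leucogranitoid.
x = encode_input("Leucogranitoids", 0.4, 0.9, 0.8, 0.7, 0.85, 0.3)
print(len(x))  # 18
```

One-of-n coding avoids implying an ordering between rock units, which a single integer-coded geology input would do.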

6.3. TRAINING DATA

In each training run, the total sum of squares (TSS) error follows a decreasing trend and asymptotically approaches zero as training proceeds. However, excessive learning can lead to what is known as overfitting, associated with poor generalization performance. This is due to the presence of noise in the data set. If the number of modifiable weights in an MLP network is high in relation to the number of patterns in the training data set, and training is continued beyond the stage at which the general trends in the data are learnt, the network begins to learn the noise in the data set. This is harmful to the ability of the network to generalize with new data sets.

To avoid overtraining, a procedure known as early stopping is applied (Wang et al., 1994). The data available for training are divided into three parts referred to as training, test and validation sets, respectively. The concept is to improve generalization by stopping learning before the global minimum of the training data set TSS error; i.e. before the idiosyncrasies of the data set are learnt. The training data set is used to adapt the weights in the network and the test data set is used to determine when to stop training. The validation data set plays no role in training and is used to assess the generalization performance of the trained network.

Although the ratio of deposit to non-deposit cells in the Tenterfield grid is approximately 1:1000, the ratio of deposit to non-deposit vectors used in the training set was close to 1:1. If deposit patterns were represented in the training set in the same proportion as they appear in the total data population, the network would not learn to recognize the rare deposit patterns at all, or would perform very poorly for this class of patterns (Masters, 1993; Zaknich, 1999). Consequently, the training, test and validation data sets were randomly assigned one third of the available deposit vectors (21 each), and two non-deposit vectors were randomly selected for each of the 12 main rock units, giving a total of 24 non-deposit vectors in each data set.

Since there are only 63 gold deposits of the required type in the Tenterfield area, compared with 69,377 non-deposit cells, the size of the data sets used for training was limited by the number of known deposits.
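The balanced sampling scheme described above can be sketched as follows. The counts (63 deposits, 12 rock units, two non-deposit vectors per unit per set) come from the text; the function name `split_patterns`, the placeholder cell identifiers and the choice to keep the non-deposit samples disjoint between the three sets are assumptions.

```python
import random


def split_patterns(deposits, non_deposits_by_unit, per_unit=2, seed=0):
    """Split patterns into training, test and validation sets.

    Each set receives one third of the deposit patterns and `per_unit`
    randomly chosen non-deposit patterns from every rock unit.
    """
    rng = random.Random(seed)
    shuffled = deposits[:]
    rng.shuffle(shuffled)
    third = len(shuffled) // 3
    # Draw enough non-deposit cells per unit for all three sets at once,
    # so that no cell is reused (an assumption, not stated in the text).
    picks = {unit: rng.sample(cells, 3 * per_unit)
             for unit, cells in non_deposits_by_unit.items()}
    subsets = []
    for k in range(3):
        dep = shuffled[k * third:(k + 1) * third]
        non = [cell for unit in picks
               for cell in picks[unit][k * per_unit:(k + 1) * per_unit]]
        subsets.append({"deposit": dep, "non_deposit": non})
    return subsets


# Placeholder identifiers: 63 deposits and 50 candidate cells per rock unit.
deposits = [f"deposit_{i}" for i in range(63)]
non_deposits_by_unit = {u: [f"unit{u}_cell{c}" for c in range(50)]
                        for u in range(12)}
train, test, validation = split_patterns(deposits, non_deposits_by_unit)
print(len(train["deposit"]), len(train["non_deposit"]))  # 21 24
```

Sampling non-deposit cells per rock unit, rather than uniformly over the grid, ensures that every rock unit is represented among the negative examples.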

6.4. NETWORK TOPOLOGY AND NETWORK PARAMETERS

The number of units in the hidden layer of the network was determined experimentally, resulting in an 18-2-1 network topology. The number of hidden units was kept to a minimum, in order to limit the number of modifiable weights in the network and thereby reduce the ability of the network to learn the noise in the training data.

A learning rate of 0.5 was used for all networks. Weights were randomly initialized in the range [-0.5,+0.5]. The value of the momentum parameter (used to add a fraction of the previous weight change to the current weight change) was set to zero. Training patterns were presented to the network in a randomly permuted order, and the weights were updated after the presentation of each training pattern (incremental update).

6.5. NETWORK TRAINING

A series of networks were trained using ten different random sets of initial weights. In each case, the initial weights were saved. The total sum of squares (TSS) error was monitored for both the training and test data sets. Although the training error decreases during training, the test error decreases to a minimum and then begins to increase as training continues. Training is stopped at the point at which the test TSS error reaches a minimum value, and the weights corresponding to the optimum number of cycles are used for all subsequent processing. From the ten neural networks, the one giving the best classification performance for the independent validation data set was used to process the input vectors for the entire Tenterfield grid. The output values resulting from this step were re-scaled and classified into integer values to produce a nine-class prospectivity map.
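The training procedure can be sketched in plain Python as below. The settings follow the text (two hidden units, logistic activations, learning rate 0.5, no momentum, initial weights in [-0.5,+0.5], incremental updates in random order, ten restarts, early stopping on the test TSS error). Everything else is an assumption for illustration: the toy two-input data set, the fixed epoch budget, and the use of validation TSS as a stand-in for "best classification performance". It is not the simulation program used in the study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def snapshot(weights):
    hidden, output = weights
    return [row[:] for row in hidden], output[:]


def forward(weights, x):
    hidden, output = weights
    # The last element of each weight list is the bias.
    h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h)) + output[-1])
    return h, y


def tss(weights, patterns):
    """Total sum of squares error over a set of (input, target) patterns."""
    return sum((t - forward(weights, x)[1]) ** 2 for x, t in patterns)


def train_early_stopping(train, test, n_hidden=2, rate=0.5, epochs=200, seed=0):
    rng = random.Random(seed)
    n_in = len(train[0][0])
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    weights = (hidden, output)
    best_weights, best_epoch = snapshot(weights), 0
    best_error = tss(weights, test)
    order = list(range(len(train)))
    for epoch in range(1, epochs + 1):
        rng.shuffle(order)  # random presentation order
        for idx in order:   # incremental (per-pattern) weight update
            x, t = train[idx]
            h, y = forward(weights, x)
            d_out = (t - y) * y * (1.0 - y)
            d_hid = [d_out * output[j] * h[j] * (1.0 - h[j])
                     for j in range(n_hidden)]
            for j in range(n_hidden):
                output[j] += rate * d_out * h[j]
                for i, v in enumerate(x):
                    hidden[j][i] += rate * d_hid[j] * v
                hidden[j][-1] += rate * d_hid[j]
            output[-1] += rate * d_out
        error = tss(weights, test)
        if error < best_error:  # keep the weights at the test-error minimum
            best_error, best_epoch = error, epoch
            best_weights = snapshot(weights)
    return best_weights, best_epoch, best_error


def make_patterns(n, rng):
    """Hypothetical toy data: target 0.9 if x0 + x1 > 1, else 0.1."""
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        data.append((x, 0.9 if x[0] + x[1] > 1.0 else 0.1))
    return data


data_rng = random.Random(1)
train = make_patterns(40, data_rng)
test = make_patterns(40, data_rng)
validation = make_patterns(40, data_rng)

# Ten restarts from different random initial weights.
runs = [train_early_stopping(train, test, seed=s) for s in range(10)]
best_weights, best_epoch, best_error = min(
    runs, key=lambda run: tss(run[0], validation))
print(best_epoch, round(best_error, 4))
```

In this sketch the weights are simply recorded at the epoch with the lowest test error rather than training being halted at that point, which gives the same final weights at the cost of some extra computation.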

7. Fuzzy Logic Method

Fuzzy systems represent a conceptual or knowledge-based approach to mineral prospectivity mapping. In contrast to neural networks, which can learn from data, fuzzy systems require explicit statements about uncertain knowledge (Kasabov, 1996). The method applied in this study has been described in detail by Bonham-Carter (1994) and only a short description is given here. Each cell value in the input map layers is assigned a fuzzy membership value in the range [0,1], which expresses the degree to which the value belongs to the set "favourable for mineralization". Fuzzy membership values are assigned according to subjective judgement. For the Tenterfield area, parameter values and the corresponding fuzzy membership values are shown in Table 1. Using this table, each cell value in GIS map layers was replaced by a fuzzy membership value. The fuzzy membership values from the different GIS layers were then combined using fuzzy operators to produce a single output membership value for each grid cell location representing the overall favourability for mineralization.

TABLE 1. Map class values and fuzzy membership values (μ) for maps used in the fuzzy logic method.

Lithostratigraphic units          μ      Distance to faults   μ      U count          μ      Th count         μ
                                         (metres)                    (classed data)          (classed data)
Quaternary Alluvium               0.1    <200                 0.1    1                0.1    1                0.2
Tertiary Volcanic Rocks           0.0    200-400              0.1    2                0.2    2                0.2
Clarence-Moreton Basin            0.0    400-600              0.1    3                0.2    3                0.25
Leucogranitoids                   0.95   600-800              0.1    4                0.4    4                0.4
Moonbi Plutonic Suite             0.95   800-1000             0.1    5                0.7    5                0.6
Uralla Plutonic Suite             0.0    1000-1200            0.1    6                0.8    6                0.8
Clarence River Plutonic Suite     0.1    1200-1400            0.1    7                0.9    7                0.8
Permian Granitoids                0.4    1400-1600            0.1    8                0.9    8                0.9
Permian Porphyritic Granitoids    0.0    1600-1800            0.1
Wandsworth Volcanic Group         0.1    1800-2000            0.1
Permian Sedimentary Rocks         0.1    >2000                0.75
Sil.-Carb. Sedimentary Rocks      0.1

A variety of fuzzy logic inference nets using different fuzzy operators (and, in the case of the fuzzy gamma operator, different values of the gamma parameter) were tested in order to combine the membership values. The prospectivity map that best accounts for the known deposit points, while minimizing the area of the highest prospectivity classes, was chosen as the final fuzzy logic map. The relationship between deposit points and magnetic intensity was ambiguous, and therefore magnetic data were not used as an input in the fuzzy logic and weights-of-evidence methods. The fuzzy gamma operator with γ = 0.95 was used to combine membership values after the U and Th values were summed. The fuzzy gamma operator is given by eq.(1):

$$\mu_{comb} = \left[ 1 - \prod_{i=1}^{N} (1 - \mu_i) \right]^{\gamma} \left( \prod_{i=1}^{N} \mu_i \right)^{1-\gamma} \tag{1}$$

where $\gamma$ is a number in the range [0,1], $\mu_{comb}$ is the fuzzy membership value for the combined data sets, and $\mu_i$ is the fuzzy membership value for the $i$-th data set. Descriptions of fuzzy operators are given by Zimmermann (1985), An et al. (1991), and Bonham-Carter (1994).
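Eq.(1) translates directly into code. The sketch below covers only the gamma operator itself; the function name `fuzzy_gamma` and the example membership values are illustrative, and the prior summation of U and Th values mentioned in the text is not reproduced.

```python
import math


def fuzzy_gamma(memberships, gamma=0.95):
    """Combine fuzzy membership values with the fuzzy gamma operator, eq.(1)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    # Fuzzy algebraic sum: 1 - prod(1 - mu_i)
    algebraic_sum = 1.0 - math.prod(1.0 - m for m in memberships)
    # Fuzzy algebraic product: prod(mu_i)
    algebraic_product = math.prod(memberships)
    return (algebraic_sum ** gamma) * (algebraic_product ** (1.0 - gamma))


# Illustrative memberships for one cell (values chosen for the example).
cell = [0.95, 0.75, 0.9]
print(fuzzy_gamma(cell, gamma=0.95))

# The two limiting cases of the operator:
print(fuzzy_gamma([0.5, 0.5], gamma=0.0))  # 0.25 (algebraic product)
print(fuzzy_gamma([0.5, 0.5], gamma=1.0))  # 0.75 (algebraic sum)
```

With γ close to one, as in this study, the result is dominated by the algebraic sum, so several moderately favourable layers reinforce each other, while a membership of zero in any layer still forces the combined value to zero.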


8. Weights-of-Evidence Method

Originally developed for medical diagnosis (Lusted, 1968), the weights-of-evidence method was adapted to spatial data and mineral exploration by Bonham-Carter et al. (1988). A full description of the weights-of-evidence method is given by Bonham-Carter (1994) and only a brief description is given here. The method is based on the concept of prior and posterior probabilities. If each deposit is assumed to occupy a single grid cell (where the map layer is in raster format), then the prior probability of any cell chosen at random containing a deposit is the ratio of the number of deposit cells, n(D), to the total number of cells, n(T), that is,

$$P(D) = \frac{n(D)}{n(T)} \tag{2}$$

This prior probability can be updated using binary maps that show whether a parameter favourable for mineralization is present or absent. The updated probability estimate is the conditional probability of a deposit occurring in a grid cell, given the presence of a binary map pattern, B. Bayes' Rule and the Theorem of Total Probability can be used to write:

$$P(D \mid B) = \frac{P(D)\, P(B \mid D)}{P(B)} \tag{3}$$

Here, the prior probability P(D) is updated with the factor P(B|D)/P(B) to give the posterior conditional probability, P(D|B). The log of odds formulation of eq.(3) allows the effect of each map layer, which depicts parameters important for mineralization, to be calculated independently and then combined (Bonham-Carter et al., 1989):

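$$\ln\left[ \mathrm{Odds}(D \mid B_1 \cap B_2 \cap \dots \cap B_N) \right] = \ln\left[ \mathrm{Odds}(D) \right] + \sum_{i=1}^{N} w_i \tag{4}$$

where

$$w_i = \begin{cases} W_i^{+} & \text{if pattern } B_i \text{ is present} \\ W_i^{-} & \text{if pattern } B_i \text{ is absent} \\ 0 & \text{if data are missing} \end{cases}$$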

and $B_i$ means that a favourable evidence pattern is present in the $i$-th binary predictor map and D means a deposit is present. A bar over the letter indicates that the favourable feature is absent. The positive weight ($W_i^{+}$) is used in eq.(4) when the favourable pattern is present, and increases the probability estimate relative to the prior; the negative weight ($W_i^{-}$) is used when the pattern is absent, and decreases the probability estimate relative to the prior.
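A short numerical sketch may help to connect eqs.(2) to (4). The grid and deposit counts (69,440 cells, 63 deposits) are taken from this study; the binary pattern (10,000 cells containing 40 deposits) is hypothetical. The weight formulas are not given in the text above; the sketch uses the standard weights-of-evidence definitions, W+ = ln[P(B|D)/P(B|D̄)] and W- = ln[P(B̄|D)/P(B̄|D̄)], and does not handle patterns that contain none or all of the deposits.

```python
import math


def weights_of_evidence(n_total, n_deposits, n_pattern, n_deposits_in_pattern):
    """Return (W+, W-) for one binary predictor map, from grid-cell counts."""
    n_non_deposits = n_total - n_deposits
    p_b_given_d = n_deposits_in_pattern / n_deposits
    p_b_given_not_d = (n_pattern - n_deposits_in_pattern) / n_non_deposits
    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log((1.0 - p_b_given_d) / (1.0 - p_b_given_not_d))
    return w_plus, w_minus


def posterior_probability(prior, weight_terms):
    """Apply eq.(4): add the weights to the prior log-odds, then convert back."""
    log_odds = math.log(prior / (1.0 - prior)) + sum(weight_terms)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Prior probability from eq.(2), using the counts for the Tenterfield grid.
prior = 63 / 69440

# Hypothetical binary pattern: 10,000 cells, 40 of which contain deposits.
w_plus, w_minus = weights_of_evidence(69440, 63, 10000, 40)

inside = posterior_probability(prior, [w_plus])    # pattern present
outside = posterior_probability(prior, [w_minus])  # pattern absent
missing = posterior_probability(prior, [0.0])      # data missing
print(round(prior, 6), round(inside, 6), round(outside, 6))
```

For a single predictor map the posterior inside the pattern reduces to the simple proportion of pattern cells that contain deposits (here 40/10,000), which is a useful check on the implementation.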


A binary predictor map is produced by reclassifying multi-class maps, so that the spatial association between deposit points and the pattern is maximized. A pair of weights is then calculated for each binary predictor map, based on the conditional probability ratios estimated from the numbers of grid cells. The final prospectivity map is presented in terms of probability values calculated from the posterior log-odds (shown on the left-hand side of eq.(4)).

In order to apply eq.(4), an assumption that parameters in the binary evidence maps are conditionally independent with respect to deposits must be made; i.e., the effects of the interactions between parameters, used as evidence to modify the prior probability, must be ignored. This assumption is rarely satisfied completely in geological data sets (Bonham-Carter, 1994). Where a significant correlation between binary map parameters exists, one of the dependent maps must be omitted or dependent maps must be combined using Boolean operators (AND or OR), multiple regression or principal component analysis (Bonham-Carter, 1994). Weights-of-evidence maps were prepared from input map layers exported from a GIS using the GEODIPS GIS prospectivity analysis package (Jusmady and Taylor, 1997).

9. Measuring Prospectivity Map Quality

The quality of maps produced using different methods was compared quantitatively by measuring the extent to which high prospectivity areas of the maps include the locations of known mineral deposits and occurrences. A variety of statistical and probability indices of map quality were used.

The Chi-square statistic measures the extent to which observed and expected numbers of deposits differ. Although statistical tests of significance are not valid in the case of spatial associations, the Chi-square statistic can provide explorative and descriptive measures of spatial correlation (Bonham-Carter, 1994). The observed number of deposits in the map classes with the highest favourability should be much higher than the number that would be expected if the map class areas were distributed randomly with respect to deposits. A problem with the Chi-square statistic is that it is only sensitive to the total difference between observed and expected numbers of deposits for all the map classes, and not to whether it is the higher prospectivity-map classes that contain higher than expected numbers of the known deposits.
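The Chi-square calculation and its limitation can be shown with a small sketch. The usual Pearson form is assumed, with the expected number of deposits in each class proportional to class area; the function name and the two-class counts are hypothetical.

```python
def chi_square(observed_deposits, class_areas):
    """Pearson Chi-square for deposits per map class.

    Expected counts assume deposits are distributed in proportion to
    class area, i.e. randomly with respect to the map classes.
    """
    total_deposits = sum(observed_deposits)
    total_area = sum(class_areas)
    statistic = 0.0
    for observed, area in zip(observed_deposits, class_areas):
        expected = total_deposits * area / total_area
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical two-class map with equal class areas (in grid cells);
# the first class is the high-prospectivity class.
areas = [1000, 1000]
print(chi_square([3, 1], areas))  # 1.0
print(chi_square([1, 3], areas))  # 1.0
```

The two calls return the same value even though only the first map concentrates deposits in the high-prospectivity class, which is the insensitivity noted above and the reason rank-correlation and probability-ratio measures are used alongside it.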

Spearman's and Kendall's rank correlation coefficients were used to assess the degree to which the probability of occurrence of known mineral deposits increases with prospectivity map class. For both these statistics, a value of zero indicates that no correlation exists and a value of one indicates a perfect correlation.
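Both coefficients can be computed with a few lines of code. The sketch below assumes no tied values (ties need the corrected formulas) and uses hypothetical per-class deposit probabilities, with classes ordered from lowest to highest favourability.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation (no-ties formula)."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical example: four classes, favourability increasing from 1 to 4.
favourability = [1, 2, 3, 4]
deposit_probability = [0.1, 0.3, 0.2, 0.9]
print(round(spearman_rho(favourability, deposit_probability), 2))  # 0.8
print(round(kendall_tau(favourability, deposit_probability), 2))   # 0.67
```

Both values fall below one here because the second and third classes are out of order, illustrating how the coefficients penalize a map whose deposit probability does not rise monotonically with favourability.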

Conditional probabilities of the occurrence of a known mineral deposit, given the grid cell is located within a particular prospectivity class, were calculated for each prospectivity map class using the numbers of grid cells. Each deposit was assumed to occupy only a single cell. A disadvantage of conditional probabilities is that they are very small numbers and depend on the total number of known deposits in a region.

The ratio of the conditional probability of a deposit given a particular prospectivity class area, P(D|A), to the probability of a deposit for the map area as a whole, P(D), results in numbers that are easier to compare. This ratio is equivalent to an intuitive notion of a predictive prospectivity map; i.e., a high prospectivity map class should define an area in which a high proportion of the known deposits are predicted in a small proportion of the total map area. This can be expressed as the following ratio:

$$\frac{n(D_A)/n(D_{total})}{n(A)/n(T)} = \frac{P(D \mid A)}{P(D)} \tag{5}$$

where $n(D_A)$ represents the number of deposits in map class A, $n(D_{total})$ is the total number of deposits, n(A) is the area of map class A, and n(T) is the total area.

If a prospectivity map is to be useful, then the probability of discovering a deposit in the highest group of prospectivity map classes should be significantly upgraded compared to selecting a grid cell at random. Consequently, the ratio in eq.(5) should be greater than one. Conversely, map classes representing the lowest prospectivity classes should have ratios significantly less than one, indicating that the probability of finding a deposit is much lower than average. In the middle group of map classes, the ratio should be approximately equal to one, indicating that cells in these areas are no more likely to contain a deposit than cells chosen at random.
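The ratio in eq.(5) is simple to compute from cell counts. In the sketch below the function name and the example class (30 deposits in 7,000 cells) are hypothetical, while the final check uses the prior probability for this study (63 deposits in 69,440 cells) together with two conditional probabilities for the neural network map reported in Table 3.

```python
def prospectivity_ratio(n_deposits_in_class, n_deposits_total,
                        class_area, total_area):
    """Eq.(5): proportion of deposits captured over proportion of area."""
    deposit_fraction = n_deposits_in_class / n_deposits_total
    area_fraction = class_area / total_area
    return deposit_fraction / area_fraction


# Hypothetical class: 30 of the 63 deposits fall within 7,000 of 69,440 cells.
ratio = prospectivity_ratio(30, 63, 7000, 69440)
print(ratio > 1.0)  # True: the class upgrades the chance of finding a deposit

# Consistency check against Table 3 (neural network map).
prior = 63 / 69440
print(round(26.74e-4 / prior, 2))  # 2.95 (class 1)
print(round(2.60e-4 / prior, 2))   # 0.29 (class 4)
```

The same ratio can be written as P(D|A)/P(D), so either cell counts or the tabulated conditional probabilities can be used, whichever is at hand.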

The fuzzy logic and neural network maps were reclassified from nine to four classes of favourability prior to calculating the probability-based statistics, to allow a more direct comparison with the weights-of-evidence map, which, due to problems of conditional independence, only contains four prospectivity classes.

10. Results

Mineral prospectivity maps produced using the weights-of-evidence, fuzzy logic and neural network methods are shown, together with known gold occurrence and deposit points, in Figure 4. Both the neural network and fuzzy logic favourability results are classified into nine prospectivity classes. Only three colours are displayed in the weights-of-evidence map (Figure 4a). Due to the necessity of combining several input maps to avoid problems of conditional dependence, the weights-of-evidence method results in only three discrete probability levels. Classifying these values results in a four-class prospectivity map, in which one of the classes (class 3 in Figure 4a and Table 3) does not correspond to any cells on the map. Pairwise tests of all possible binary evidence maps from the weights-of-evidence method show that the assumption of conditional independence is violated for all map pairs except those involving the map showing distance to the nearest fault. To overcome this, dependent maps (geology, U, Th, K, and total count gamma-ray channels) are combined using the Boolean AND operator.

Figure 4. Prospectivity maps for the Tenterfield 1:100,000 area. a) Weights-of-evidence method. b) Fuzzy logic method. c) Neural network method. Crosses, dots and open squares represent known primary, alluvial and undifferentiated gold deposits and occurrences, respectively.

Values of the Chi-square statistic and Spearman's and Kendall's rank correlation coefficients are shown for each of the three prospectivity mapping methods in Table 2. Table 3 shows probability-based measures of map quality in which favourability values for the fuzzy logic and neural network methods are reclassified from nine to four classes. Although no cells in the weights-of-evidence map correspond to class 3, it was assumed that just one cell value corresponds to this class in order to calculate the statistics shown in Tables 2 and 3. The upper part of Table 3 shows the conditional probability that a cell contains a known deposit, given that the area of interest is restricted to a particular prospectivity class. The lower part of Table 3 shows the ratio of conditional probability for a given map-class area to prior probability for the study area as a whole.

TABLE 2. Statistical measures of map quality for prospectivity maps produced using different methods.

Quality statistic    Weights-of-evidence    Fuzzy logic    Neural network
Chi-square           66                     63             83
Spearman's ρ         0.80                   0.50           0.93
Kendall's τ          0.67                   0.53           0.83

11. Discussion

The neural network method produces geologically meaningful results. The similarity of the neural network and conceptually-based fuzzy logic maps indicates that the neural network result conforms to current understanding of geological factors important for gold mineralization in the Tenterfield area. The broad similarity between the neural network and the empirically-derived weights-of-evidence maps indicates that the neural network result accounts for spatial relationships between known mineral deposits and the parameters of the GIS database.

The output of the trained neural network, i.e., a value in the range [0.1, 0.9], can be interpreted as the degree of similarity of a particular output to a composite of all deposit vectors presented to the network during training. Such an interpretation fits with the intuitive notion that a prospective area has characteristics that closely resemble areas known to contain mineral deposits. Therefore, the network output is also a measure of how favourable the area is for mineral deposits of the type included in the training data set.

TABLE 3. Probability measures of map quality for prospectivity maps produced using different methods. The fuzzy logic and neural network maps are reclassified from nine to four classes to enable a direct comparison with the weights-of-evidence map. The upper part of the table shows the conditional probability of a gold deposit given that a cell occurs in the particular map class. The ratio of the posterior to prior probability is shown in the lower part. In both cases, prospectivity is low for class 4 and high for class 1. The weights-of-evidence classes are the same as those shown in Figure 4a.

Measure          Map class    Weights-of-evidence    Fuzzy logic    Neural network
P(D|A) × 10^4    4            2.52                   2.85           2.60
                 3            0.00                   1.65           2.15
                 2            8.67                   6.89           11.66
                 1            23.45                  18.91          26.74
P(D|A)/P(D)      4            0.28                   0.31           0.29
                 3            0.00                   0.18           0.24
                 2            0.96                   0.76           1.29
                 1            2.58                   2.08           2.95

The main difference between the prospectivity maps is that the neural network map contains much smaller high prospectivity areas (classes 1 and 2 in Figure 4). This is interpreted to be due to the ability of the neural network to respond in a highly nonlinear way to combinations of input parameters, so that a high favourability value is only assigned to areas where there is a combination of the critical favourable parameters. This contrasts with the weights-of-evidence and fuzzy logic methods, where, for a given grid location, each favourable input-parameter value automatically results in an increase in the final prospectivity estimate. The broad band of high prospectivity (coloured red and orange) in the northeastern corner of the fuzzy logic map appears to be an example of this effect (Figure 4b). The high favourability estimate is the result of high radiometric responses in all channels of the gamma-ray survey data. This area is not highlighted in the neural network map, where both a favourable leucogranite host and a high radiometric response are required to produce a high favourability estimate. The high prospectivity band in the fuzzy logic map is absent in the weights-of-evidence map because gamma-ray survey map layers, which contain very similar information, violate the assumption of conditional independence and were therefore converted to a single input layer.
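The additive behaviour attributed to the weights-of-evidence method can be made concrete with a short sketch. It uses the standard log-odds form of the method, in which the posterior logit equals the prior logit plus the sum of the layer weights. The prior probability, layer names and weight values are hypothetical and are not taken from this study.

```python
import math

def posterior_probability(prior, weights):
    """Weights-of-evidence posterior: logit(prior) plus the sum of layer weights."""
    logit = math.log(prior / (1.0 - prior)) + sum(weights)
    odds = math.exp(logit)
    return odds / (1.0 + odds)

# Hypothetical prior probability and positive weights (W+), for illustration only.
PRIOR = 0.001
W_PLUS = [("leucogranite host", 1.2), ("high K response", 0.9), ("high Th response", 0.8)]

# Add favourable layers one at a time and record the posterior after each step.
posteriors = [posterior_probability(PRIOR, [])]
for i in range(1, len(W_PLUS) + 1):
    weights = [w for _, w in W_PLUS[:i]]
    posteriors.append(posterior_probability(PRIOR, weights))

for label, p in zip(["prior only"] + [name for name, _ in W_PLUS], posteriors):
    print(f"{label:>20}: {p:.5f}")
```

In this sketch every favourable layer raises the estimate regardless of which other layers are present, which is the behaviour contrasted with the neural network in the preceding paragraph.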

Both the fuzzy logic and weights-of-evidence methods are quite sensitive to data layers that are strongly correlated; i.e., contain similar information. This characteristic is common in geological exploration data sets, particularly those involving geochemistry. As explained above, the result of incorporating data sets containing similar information in the fuzzy logic method is a map containing spuriously high favourability estimates. With the weights-of-evidence method, correlated data sets violate the assumption of conditional independence. The solution requires the dependent data sets to be either discarded or combined in some way, but Boolean methods of combining data sets reduce multiple parameters to a single parameter and therefore result in a loss of information. The results of this study, in which four layers of gamma-ray survey data were used as inputs to the neural network, suggest that the neural network method is very robust with respect to data sets containing similar information.
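The loss of information caused by Boolean combination can be illustrated by counting input patterns. The sketch below assumes each of the five dependent maps (geology, U, Th, K and total count) has already been reduced to a binary evidence layer; it enumerates all possible cell patterns rather than using real map data.

```python
from itertools import product

# The five dependent maps named in the text, assumed here to be binary layers.
LAYERS = ("geology", "U", "Th", "K", "total_count")

def boolean_and(cell_values):
    """Return 1 only when every binary layer is favourable in this cell."""
    return int(all(cell_values))

# Enumerate every possible combination of five binary layer values.
patterns = list(product((0, 1), repeat=len(LAYERS)))
combined = {pattern: boolean_and(pattern) for pattern in patterns}

n_patterns = len(patterns)             # distinct input patterns before combining
n_favourable = sum(combined.values())  # patterns that map to 1 after combining
print(n_patterns, "input patterns ->", len(set(combined.values())), "output values")
```

Thirty-two distinguishable patterns collapse to two output values, and a cell that is favourable in four of the five layers becomes indistinguishable from one that is favourable in none.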

The neural network method allows data sets to be combined without the loss of information inherent in the weights-of-evidence method and some implementations of the fuzzy logic method. The weights-of-evidence method requires that multi-class data be converted to a binary format prior to integration. Values below the threshold used to reclassify multi-class to binary maps play no role in determining prospectivity, and values above the threshold (included in the binary evidence map) are treated as though they are equal because they are assigned the same weight. The fuzzy logic method overcomes the problem of applying a fixed cut-off value. Instead of converting input map layers to binary values, the raw cell values are replaced with fuzzy membership values in the range [0, 1]. However, loss of information also occurs where single fuzzy membership values are applied to a whole class or range of values, as in the case of studies by Eddy et al. (1995) and An et al. (1991). This simple method was also applied here. In contrast, the neural network method does not require ratio data to be reclassified prior to combining data sets.
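A small sketch may help compare the three treatments of a ratio-scale layer described in this paragraph. The readings, cut-off, class edges and membership values are hypothetical, and the linear rescaling is only an assumed stand-in for passing unclassified values to a network; the preprocessing actually used is not described in this section.

```python
from bisect import bisect_right

def binary_evidence(value, threshold):
    """Weights-of-evidence style: 1 at or above the cut-off, 0 below it."""
    return 1 if value >= threshold else 0

def class_membership(value, upper_edges, memberships):
    """Class-based fuzzy membership: one value shared by a whole class range."""
    return memberships[bisect_right(upper_edges, value)]

def rescale(value, lowest, highest):
    """Continuous rescaling to [0, 1]; distinct inputs stay distinct."""
    return (value - lowest) / (highest - lowest)

# Hypothetical cell values for one ratio-scale layer.
readings = [12.0, 55.0, 60.0, 95.0]
THRESHOLD = 50.0                   # hypothetical binary cut-off
UPPER_EDGES = [25.0, 50.0, 75.0]   # hypothetical class boundaries
MEMBERSHIPS = [0.1, 0.3, 0.7, 0.9] # hypothetical membership per class

binary = [binary_evidence(v, THRESHOLD) for v in readings]
fuzzy = [class_membership(v, UPPER_EDGES, MEMBERSHIPS) for v in readings]
scaled = [rescale(v, 0.0, 100.0) for v in readings]

print(binary)  # 55, 60 and 95 all receive the same binary value
print(fuzzy)   # 55 and 60 share one membership value
print(scaled)  # all four readings remain distinguishable
```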

Inconsistent or misleading data are unlikely to represent as serious a problem for neural networks as for statistical methods such as the weights-of-evidence method and combinations of statistical and fuzzy logic methods. The GIS thematic layer containing regional faults and lineaments used in this study is an example of a possibly spurious data set. An examination of the digital elevation model (DEM) for the Tenterfield area reveals many major lineaments not present in the structural layer. Thus, the structural map represents incomplete information. The weak inverse relationship between deposit points and faults that emerged from the statistical analysis performed in the weights-of-evidence method is likely to be misleading. The effect of this input on the estimated prospectivity is clearly illustrated in the weights-of-evidence prospectivity map in the form of elongate zones of moderate prospectivity throughout the map (Figure 4a). In contrast, the neural network prospectivity map does not show any evidence that major faults have played an important role in determining prospectivity. A property of neural networks is the ability to respond to spurious input data by adjusting the weights connected to the relevant input to very low values, thereby reducing or eliminating the contribution made by that data to the output.
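The down-weighting of an uninformative input can be illustrated with a deliberately small sketch. It trains a single sigmoid unit by gradient descent, not the MLP used in this study, on a hypothetical data set in which the first input is related to the target and the second is unrelated to it. The weight on the second input is started at a non-zero value so that its decay is visible.

```python
import math

def train_single_unit(samples, lr=0.5, epochs=500):
    """Gradient descent on mean cross-entropy for one sigmoid unit with two inputs."""
    w_informative, w_spurious, bias = 0.0, 1.0, 0.0  # spurious weight starts non-zero
    n = len(samples)
    for _ in range(epochs):
        g_inf = g_spu = g_bias = 0.0
        for x1, x2, y in samples:
            z = w_informative * x1 + w_spurious * x2 + bias
            err = 1.0 / (1.0 + math.exp(-z)) - y  # prediction minus target
            g_inf += err * x1
            g_spu += err * x2
            g_bias += err
        w_informative -= lr * g_inf / n
        w_spurious -= lr * g_spu / n
        bias -= lr * g_bias / n
    return w_informative, w_spurious, bias

# Hypothetical data: the target is 1 in three of four cases when x1 = +1 and in
# one of four cases when x1 = -1; x2 is balanced and unrelated to the target.
samples = []
for x1 in (-1, 1):
    for x2 in (-1, 1):
        labels = (1, 1, 1, 0) if x1 == 1 else (1, 0, 0, 0)
        samples += [(x1, x2, y) for y in labels]

w_informative, w_spurious, bias = train_single_unit(samples)
print(f"informative weight {w_informative:.4f}, spurious weight {w_spurious:.6f}")
```

In this idealised case the weight on the unrelated input is driven towards zero while the informative weight settles near ln 3. Real exploration layers are rarely so cleanly unrelated to the target, so the sketch shows the mechanism rather than a guarantee.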

The statistical measures of prospectivity map quality (Table 2) are all highest for the neural network map. The Chi-square statistic indicates greater differences between the number of deposits observed in the class areas of the neural network map and those that would be expected if the class areas were located at random, compared with maps produced by the fuzzy logic and weights-of-evidence methods. The larger values of both Spearman's and Kendall's rank correlation coefficients show that there is a stronger correlation between map class and probability of a cell containing a known deposit than for the other two methods.
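The three statistics can be sketched in a few lines of plain Python. The class-level P(D|A) values for the weights-of-evidence map are taken from Table 3; treating the four map classes as the ranked items is an assumption about how the coefficients were computed, and the deposit counts and area fractions passed to the Chi-square function are hypothetical. The rank functions assume there are no tied values.

```python
def chi_square(observed, area_fraction):
    """Sum of (O - E)^2 / E, with E taken as total deposits times class area fraction."""
    total = sum(observed)
    expected = [total * f for f in area_fraction]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation for untied data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs, untied data."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# Weights-of-evidence map classes 4, 3, 2, 1 ordered from low to high prospectivity,
# with the P(D|A) values listed for them in Table 3.
prospectivity_order = [1, 2, 3, 4]
p_deposit = [2.52, 0.00, 8.67, 23.45]
rho = spearman_rho(prospectivity_order, p_deposit)
tau = kendall_tau(prospectivity_order, p_deposit)

# Hypothetical deposit counts and class area fractions (not from this study).
chi2 = chi_square([2, 3, 10, 25], [0.4, 0.3, 0.2, 0.1])
print(round(rho, 2), round(tau, 2), round(chi2, 2))
```

Under that assumption the weights-of-evidence column gives ρ = 0.80 and τ ≈ 0.67, which agree with the corresponding entries in Table 2.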

The probability and probability ratio measures of map quality (Table 3) indicate that the highest prospectivity map classes in the neural network map are stronger predictors of deposits than the highest prospectivity map classes in the fuzzy logic map and slightly stronger than those in the weights-of-evidence map. As noted above, in an ideal prospectivity map, the ratio P(D|A)/P(D) should be greater than one in the high prospectivity class, approximately equal to one for the middle prospectivity class, and considerably less than one for the low prospectivity class.
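As a consistency check on how the two halves of Table 3 relate, the sketch below divides each conditional probability by the corresponding ratio to recover the prior P(D) that the table implies. The numbers are transcribed from Table 3 in its scaled units; the function and variable names are illustrative only.

```python
# Values transcribed from Table 3 (class 4 = low prospectivity, class 1 = high).
# Tuple order: weights-of-evidence, fuzzy logic, neural network.
cond_prob = {  # P(D|A), in the scaled units used by the table
    4: (2.52, 2.85, 2.60),
    3: (0.00, 1.65, 2.15),
    2: (8.67, 6.89, 11.66),
    1: (23.45, 18.91, 26.74),
}
ratio = {  # P(D|A) / P(D)
    4: (0.28, 0.31, 0.29),
    3: (0.00, 0.18, 0.24),
    2: (0.96, 0.76, 1.29),
    1: (2.58, 2.08, 2.95),
}

def implied_priors(cond_prob, ratio):
    """Back out P(D) = P(D|A) / ratio for every non-zero table entry."""
    priors = []
    for map_class, probs in cond_prob.items():
        for p, r in zip(probs, ratio[map_class]):
            if r > 0:
                priors.append(p / r)
    return priors

priors = implied_priors(cond_prob, ratio)
mean_prior = sum(priors) / len(priors)
print(f"implied prior: {min(priors):.2f} to {max(priors):.2f}, mean {mean_prior:.2f}")
```

All eleven non-zero entries imply a prior of roughly 9.0 to 9.2 in the table's scaled units, so the two halves of the table are mutually consistent to within rounding. The ratio exceeds one for class 1 in every method and for class 2 only in the neural network map.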

The prospectivity map produced using the neural network method is based on about two thirds of the data that are used in the weights-of-evidence method. Approximately one third of the available data was reserved as a validation data set in the neural network method and used to check the generalization performance of the network after it had been trained. Only one third of the available data, the training data set, was used directly in training the network and a further third, the test data set, was used indirectly. The validation data set played no role in training. In contrast, all of the available deposit points were used in the weights-of-evidence method. There was an insufficient number of gold deposits and occurrences available to allow a third of the data to be reserved as a validation data set in the weights-of-evidence method.
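The three-way division of the data can be sketched as follows. The deposit identifiers are hypothetical, and the seeded random shuffle is an assumption, since the procedure used to assign points to each set is not described in this section.

```python
import random

def split_thirds(samples, seed=0):
    """Shuffle a copy of the samples and cut it into three near-equal parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    first_cut, second_cut = n // 3, 2 * n // 3
    training = shuffled[:first_cut]              # used directly to adjust weights
    test = shuffled[first_cut:second_cut]        # used indirectly during training
    validation = shuffled[second_cut:]           # held out until training is finished
    return training, test, validation

# Hypothetical deposit identifiers standing in for the known gold occurrences.
deposit_ids = list(range(90))
training, test, validation = split_thirds(deposit_ids)
print(len(training), len(test), len(validation))
```

Only the training and test sets influence the trained network in this arrangement, which corresponds to the two thirds of the data referred to in the paragraph above.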

12. Conclusions

The neural network method, using an MLP network trained with the error backpropagation algorithm, produces a geologically-plausible mineral prospectivity map similar, but superior, to the fuzzy logic and weights-of-evidence maps. The use of neural networks for the integration of large multi-source data sets used in regional mineral exploration, and for the prediction of mineral prospectivity, offers several advantages over existing methods. These include the ability to: 1) respond to critical combinations of parameters, rather than automatically increasing the prospectivity due to all favourable parameters, 2) combine data sets without the loss of information inherent in existing methods, and 3) produce results that are relatively unaffected by redundant data, spurious data and data containing multiple populations. In addition, the neural network method in this study used approximately a third less data than the weights-of-evidence method.

Acknowledgments

The authors gratefully acknowledge Robert Barnes of the Geological Survey of New South Wales, Department of Mineral Resources, for providing the GIS data used in this study. The use of the GEODIPS GIS prospectivity analysis package, written by Jusmady and Geoff Taylor of the Department of Applied Geology, UNSW, in the preparation of the weights-of-evidence and fuzzy logic prospectivity maps, is also gratefully acknowledged. WMB thanks the AusIMM for supporting this study in the form of a grant from the Bicentennial Gold 88 Endowment Fund. Josef Holzschu kindly helped in preparing the diagrams. Joe Leach is thanked for contributing the initial idea for this study.

References

Agterberg, F. P., 1974, Automatic contouring of geological maps to detect target areas for mineral exploration: J. Math. Geol., 6, 373-395.

An, P., Moon, W. M., and Rencz, A., 1991, Application of fuzzy set theory to integrated mineral exploration: Can. J. Expl. Geophys., 27, 1-11.

Bampton, M. D., 1988, Alteration and mineralisation of the southern part of the Stanthorpe Adamellite, near Tenterfield, New South Wales: B.Sc. thesis, Univ. of Sydney (unpubl.).

Barnes, R. G., 1993, Use of GIS and image processing for enhancing geological investigations. Case study: Tenterfield 1:100,000 sheet area, northern NSW: Geological Survey Report No. GS1993/290. Geol. Surv. N.S.W.

Barnes, R. G., Henley, H. F., and Henley, J. E., 1995, Exploration data package for the Tenterfield and Coaldale 1:100,000 sheet areas: Geological Survey Report No. GS1995/004. Geol. Surv. N.S.W.

Benediktsson, J. A., Swain, P. H., and Ersoy, O. K., 1993, Conjugate-gradient neural networks in classification of multisource and very-high dimensional remote sensing data: Internat. J. Remote Sensing, 14, 2883-2903.

Bonham-Carter, G. F., 1994, Geographic information systems for geoscientists: modelling with GIS, Computer Methods in the Geosciences, 13, Pergamon Press, Inc.

Bonham-Carter, G. F., Agterberg, F. P., and Wright, D. F., 1988, Integration of geological datasets for gold exploration in Nova Scotia: Photo. Eng. & Rem. Sens., 54, 1585-1592.

Bonham-Carter, G. F., Agterberg, F. P., and Wright, D. F., 1989, Weights of evidence: a new approach to mapping mineral potential, in Agterberg, F. P., and Bonham-Carter, G. F., Eds., Statistical Applications in the Earth Sciences: Geol. Surv. Can. Paper 89-9, Geol. Surv. Can., 171-183.

Brown, W. M., Taylor, G. R., Jusmady, Groves, D. I., and Knox-Robinson, C. M., 1998, Application of artificial neural networks to prospectivity analysis in a GIS environment: a comparison with statistical and fuzzy logic methods for Au and Sn deposits of the Tenterfield area, NSW, in 14th Australian Geological Convention, Geological Society of Australia, Abstracts, 49, 57.

Brown, W. M., Gedeon, T. D., and Barnes, R. G., 1999, The use of a multilayer feedforward neural network for mineral prospectivity mapping, in Gedeon, T., Wong, P., Halgamuge, S., Kasabov, N., Nauck, D., and Fukushima, K., Eds., ICONIP'99, Proceedings of the 6th International Conference of Neural Information Processing: 1, 160-165, IEEE, Piscataway.

Bruzzone, L., Conese, C., Maselli, F., and Roli, F., 1997, Multisource classification of complex rural areas by statistical and neural-network approaches: Photo. Eng. & Rem. Sens., 63, 523-533.

Clare, A., Bolt, G., and Daneel, G., 1997, The use of neural analysis as an exploration tool, in Proceedings of the 12th International Conference on Applied Geologic Sensing, 2, Environmental Research Institute of Michigan (ERIM), 199-205.

Cucuzza, J., and Goode, A. D. T., 1998, Australian mineral exploration research, in Berkman, D. A. and MacKenzie, D. H. Eds., Geology of Australian and Papua New Guinean mineral deposits: Australasian Institute of Mining and Metallurgy Mono. 22, 53-60.

Decateur, S. E., 1989, Application of neural networks to terrain classification: IEEE INNS International Joint Conference on Neural Networks, 1, 283-288.

Eddy, B. G., Bonham-Carter, G. F., and Jefferson, C. W., 1995, Mineral resource assessment of the Parry Islands, high Arctic, Canada: a GIS-based fuzzy logic model, in Proceedings Canadian Conference on GIS, CD ROM Session C3, Paper 4.

Harris, D., and Pan, G., 1999, Mineral favourability mapping: a comparison of artificial neural networks, logistic regression, and discriminant analysis: Natural Resources Research, 8, 93-109.

Hornik, K., 1991, Approximation capabilities of multilayer feedforward networks: Neural Networks, 4, 251-257.

Jeffrey, W., and Rossner, R., 1986, Optimization algorithms: simulated annealing and neural network processing: Astrophysical Journal, 310, 473-481.

Jusmady, and Taylor, G. R., 1997, Geological Data Integration and Processing System (GEODIPS) software package Version 1.03: Department of Applied Geology, Univ. of New South Wales.

Kasabov, N. K., 1996, Foundations of neural networks, fuzzy logic, and knowledge engineering: M.I.T. Press.

Katz, S. S., 1991, Emulating the Prospector expert system with a raster GIS: Comput. and Geosci., 17, 1033-1050.

Legge, P. J., 1995, Geoscience 1994 and beyond: thoughts on geology and exploration for world-class deposits: Austral. J. Earth Sciences, 42, 1-10.

Lusted, L. B., 1968, Introduction to medical decision making: Charles Thomas.

Masters, T., 1993, Practical neural network recipes in C++: Academic Press, Inc.

McClelland, J. L., and Rumelhart, D. E., 1988, Explorations in parallel distributed processing: M.I.T. Press.

Moon, W. M., 1990, Integration of geophysical and geological data using evidential belief function: IEEE Trans. Geosci. Remote Sensing, 28, 711-720.

Pan, G., and Harris, D. P., 1992, Estimating a favorability equation for the integration of geodata and selection of mineral exploration targets: Mathematical Geology, 24, 177-201.

Reddy, R. K. T., Agterberg, F. P., and Bonham-Carter, G. F., 1991, Application of GIS-based logistic models to base-metal potential mapping in Snow Lake area, in Proceedings of Canadian Conference on GIS, 607-618.

Ritter, N. D., and Hepner, G. F., 1990, Application of an artificial neural network to land-cover classification of thematic mapper imagery: Comput. and Geosci., 16, 873-880.

Ritter, N. D., Logan, T. L., and Bryant, N. A., 1988, Integration of neural network technologies with geographic information systems, in Thomas, H. F., Ed., GIS; integrating technology and geoscience applications (Denver): National Academy of Science, 102-103.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J., 1986, Learning representations by back-propagating errors: Nature, 323, 533-536.

Sinclair, A. J., and Woodsmith, G. I., 1970, Multiple regression as a method of estimating exploration potential in an area near Terrace, British Columbia: Econ. Geol., 65, 998-1003.

Singer, D. A., and Kouda, R., 1988, Integrating spatial and frequency information in the search for Kuroko deposits of the Hokuroku District, Japan: Econ. Geol., 83, 18-29.

Singer, D. A., and Kouda, R., 1996, Application of a feedforward neural network in the search for Kuroko deposits in the Hokuroku District, Japan: Mathematical Geology, 28, 1017-1023.

Taylor, G. R., 1997, Geoscience GIS: Desktop and Prospectivity Mapping: Short Course at the Key Centre for Mines, July 7-11, 1997, Univ. of New South Wales.

Wang, C., Venkatesh, S. S., and Judd, J. S., 1994, Optimal stopping and effective machine complexity in learning, in Cowan, J. D., Tesauro, G., and Alspector, J., Eds., Advances in neural information processing systems (Denver, 1993): Morgan Kaufmann Publ., 303-310.

Werbos, P., 1974, Beyond regression; new tools for prediction and analysis in the behavioural sciences: Ph.D. thesis, Harvard Univ.

Woodall, R., 1994, Empiricism and concept in successful mineral exploration: Austral. J. Earth Sciences, 41, 1-10.

Wyborn, L. A. I., Gallagher, R., and Raymond, O., 1995, Using GIS for mineral potential evaluation in areas with few known mineral occurrences, in Second National Forum on GIS in the Geosciences - Forum Proceedings: Australian Geological Survey Organisation, AGSO Record 1995/46, 199-211.

Zaknich, A., 1999, Artificial Neural Networks: a research dominated engineering discipline: Centre for Intelligent Information Processing Systems (CIIPS), Department of Electrical and Electronic Engineering, Univ. of Western Australia, (unpublished).

Zimmermann, H. J., 1985, Fuzzy set theory and its applications: Kluwer Academic Publ.
