

Finding all optimal solutions to the reserve site selection problem: formulation and computational analysis

JEFFREY L. ARTHUR¹, MARK HACHEY¹, KEVIN SAHR², MANUELA HUSO² and A.R. KIESTER²

¹ Department of Statistics, Oregon State University, Corvallis, OR 97331-4606, USA
² Biodiversity Research Consortium, Department of Geosciences, Oregon State University, Corvallis, OR 97331-5506, USA

Received January 1996. Revised January 1997

The problem of selecting nature reserves has received increased attention in the literature during the past decade, and a variety of approaches have been promoted for selecting those sites to include in a reserve network. One set of techniques employs heuristic algorithms and thus provides possibly sub-optimal solutions. Another set of models and accompanying algorithms uses an integer programming formulation of the problem, resulting in an optimization problem known as the Maximal Covering Problem, or MCP. Solution of the MCP provides an optimal solution to the reserve site selection problem, and while various algorithms can be employed for solving the MCP they all suffer from the disadvantage of providing a single optimal solution dictating the selection of areas for conservation. In order to provide complete information to decision makers, the determination of all alternate optimal solutions is necessary. This paper explores two procedures for finding all such solutions. We describe the formulation and motivation of each method. A computational analysis on a data set describing native terrestrial vertebrates in the state of Oregon illustrates the effectiveness of each approach.

Keywords: biodiversity, integer programming, maximal covering problem, species preservation, zonal constraints.

1. Introduction

Ever-growing pressure from a number of (mostly human) sources has resulted in an increased need for preserving species biodiversity through the selection and maintenance of reserve networks. Simultaneously, the resource pool for the development of nature reserves has become more limited. As a consequence, the efficient utilization of conservation resources is more critical than ever before, and the reserve site selection problem (RSSP) has drawn considerable interest from conservation managers trying to balance these divergent forces.

In response, the scientific community has proposed a number of approaches to assist in this difficult decision-making process of establishing nature reserves. Most of these models look at selecting from a set of N geographic units some subset of size n which provides a maximum coverage of the known species within the area under consideration (i.e. the N units). An excellent bibliography on the RSSP and related biodiversity issues can be found in Polasky et al. (1995).

1352-8505 © 1997 Chapman & Hall

Environmental and Ecological Statistics 4, 153–165 (1997)


Several algorithms have been proposed for finding a solution to the RSSP. The most straightforward approach uses an exhaustive technique to enumerate all possible combinations of n out of the N possible units, and selects those combinations which provide maximum species representation. Other algorithms have employed heuristics which attempt to find a satisfactory (although not necessarily optimal) solution to the selection problem through the systematic application of logical decision rules. As described in Csuti et al. (in press), these typically fall into one of two classes. The 'greedy' algorithms begin by selecting the unit containing the greatest number of species, then continue to add units by selecting those which contribute the largest number of previously unrepresented species. Examples of this approach and its variants can be found in Margules et al. (1988), Pressey and Nicholls (1989), Vane-Wright et al. (1991), Pressey et al. (1993) and Nicholls and Margules (1993). The other type of heuristic uses a 'rarity-based' approach to select sites according to the rarity of the species they contain (Williams et al. 1993; Kershaw et al. 1994). Another approach recognizes that there are often spatial patterns in the data describing the distribution of species among units, and uses this characteristic to explore alternate configurations, often starting from one or more greedy solutions, using a method based on simulated annealing (Csuti et al., in press). A third set of algorithms employs an integer programming formulation of the RSSP, resulting in an optimization problem called the Maximal Covering Problem (MCP). Standard integer programming algorithms such as branch-and-bound schemes (see, for example, Land and Doig (1960) and Garfinkel and Nemhauser (1972)) can often be used to find an optimal solution to the MCP. A common criticism of the use of branch-and-bound strategies is that their solution times, as measured by computer CPU time, are often prohibitive for large problems; however, a recently developed algorithm by Downs and Camm (1996) appears promising for large instances of the MCP. A computational comparison of the efficacy and efficiency of many of these algorithms can be found in Csuti et al. (in press). While many of these heuristic methods can be easily modified to identify alternate 'best' solutions, their value in determining nature reserves is currently under debate (see Williams et al. (in press), Underhill (1994), Pressey et al. (1996) and Camm et al. (in press)).

Our primary concern is that most of these approaches provide a single solution to the RSSP. An exception to this is the exhaustive enumeration method, but its applicability is limited to almost trivial cases. There are several primary reasons to be concerned with the identification of all optimal solutions in selecting a nature reserve network. The first comes from the recognition that any sort of model and its resultant solution act merely in providing information to the actual decision makers (in this case, appropriate government officials) and not in dictating what the final decision will be. Thus, specifying a single optimal solution would be an instance of providing incomplete information. Secondly, providing decision makers with a complete set of solutions with maximal species representation allows those decision makers to use additional criteria in reducing the list of allowable network designs. Examples of the types of criteria that could be applied include network topology (i.e. creating a small number of geographically large reserves); financial considerations (e.g. expected maintenance costs); and additional information on species viability such as population counts within selected units.

Our interest is in the use of standard branch-and-bound algorithms for solving the MCP resulting from the RSSP. While these algorithms certainly have the advantage of providing a necessarily optimal solution, they are usually available only as black-box computer programs, and consequently their internal decision rules which guide the search to the optimum are hidden. Thus, attempting to search for alternate optimal solutions by providing different starting points can be a hit-or-miss strategy.


The purpose of this paper is to develop and explore the effectiveness of two related strategies for finding alternate optimal solutions to the RSSP using the MCP formulation. Section 2 introduces methods for finding an optimal solution to the RSSP, with emphasis on the mathematical formulation of the MCP and the branch-and-bound package used for finding the initial optimal solution. In Section 3 we present two ways to modify the formulation so as to guide the search to a different optimal solution (as long as one exists) or to conclude that all optimal solutions have been enumerated. We motivate each method's development and provide some details of the mathematical formulations. Section 4 presents the results of an exhaustive computational analysis of the proposed methods on an actual data set describing 457 species of native terrestrial vertebrates in the state of Oregon. Using these results, we highlight several trends that arise in computer times and in species coverage. The paper concludes in Section 5 with a summary of our findings and some discussion of current and future work on related problems.

2. Solving the reserve site selection problem

The problem considered in this paper is the Reserve Site Selection Problem (RSSP), and is one of selecting from among N known geographic units, which we call cells, some subset of these cells in order to maximize the number of species represented in the selected units. In particular, we assume there are M species of interest known to reside in the N cells, and at most n cells (n < N) can be selected for designation as reserve sites. These cells can be thought of as counties, parcels of land, watersheds, or any arbitrary geographic region. In Section 4, our results are based on a specific gridding scheme.

Optimal solutions to the RSSP can be found using exhaustive search methods only when a few cells can be chosen for the set, i.e. for small values of n. For example, in the case where only n = 2 cells can be selected, the number of species covered by each combination of two cells from the N total can be calculated, and the combinations giving the greatest coverage (largest number of species represented) are, by definition, optimal. However, because the number of such combinations is $\binom{N}{n}$, for typical values of N the use of such exhaustive techniques becomes impractical for all but trivial values of n.

Fortunately, more efficient methods exist for finding optimal solutions to the RSSP which come from the field of integer programming. We assume the data representing species presence within each cell is in a matrix, A, of size M × N, where the individual elements of A are defined as

\[
a_{ij} = \begin{cases} 1 & \text{if species } i \text{ is located in cell } j \\ 0 & \text{otherwise} \end{cases}
\]

for i = 1, ..., M and j = 1, ..., N. We then define two types of decision variables. For each cell j = 1, ..., N, we let

\[
x_j = \begin{cases} 1 & \text{if cell } j \text{ is selected for reserve designation} \\ 0 & \text{otherwise} \end{cases}
\]

Finally, for each species i = 1, ..., M we define

\[
Y_i = \begin{cases} 1 & \text{if species } i \text{ is in at least one selected cell} \\ 0 & \text{otherwise} \end{cases}
\]


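The problem of selecting n cells in order to maximize species coverage can then be represented as the following optimization problem:

\begin{align}
\text{Maximize:} \quad & \sum_{i=1}^{M} Y_i \tag{1} \\
\text{Subject to:} \quad & \sum_{j=1}^{N} a_{ij} x_j \ge Y_i, \quad i = 1, \ldots, M \tag{2} \\
& \sum_{j=1}^{N} x_j \le n \tag{3} \\
& 0 \le Y_i \le 1, \quad i = 1, \ldots, M \tag{4} \\
& x_j = 0 \text{ or } 1, \quad j = 1, \ldots, N \tag{5}
\end{align}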

Since each variable $Y_i$ equals 1 only when species i is represented, Equation (1) measures the total number of species covered by the selected cells, the appropriate measure of effectiveness. Equation (2) guarantees that a $Y_i$ variable can be set to one only if that species is present in at least one of the selected cells; this holds since, for any fixed i, the quantity $\sum_j a_{ij} x_j$ is the integral number of times species i is represented in the selected cells. Therefore, if species i is not covered, $Y_i$ is forced to 0, whereas $Y_i$ will be set to 1, its upper bound, if the species is covered at least once. Equation (3) restricts the number of cells that can be selected. Equation (4) provides the natural bounds on $Y_i$, and indicates that this variable can be treated as continuous rather than discrete, since (1) and (2) will force it to 1 if that species is covered by the selection, but leave it at 0 otherwise. Equation (5) sets the mandatory binary restriction on each cell variable $x_j$.
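The role of Equations (1) and (2) can be illustrated with a minimal Python sketch that evaluates a given selection. The function name `coverage` and the toy incidence matrix are hypothetical and serve only to show how the $Y_i$ values follow from a choice of $x_j$; this is not the solver used in the paper.

```python
def coverage(A, x):
    """Return (Y, total) for an incidence matrix A (M rows, N columns of 0/1)
    and a 0/1 selection vector x of length N."""
    Y = []
    for row in A:
        # Left-hand side of constraint (2): times species i is represented.
        times_covered = sum(a * xj for a, xj in zip(row, x))
        # Largest Y_i permitted by (2) and (4), which is what maximizing (1) picks.
        Y.append(1 if times_covered >= 1 else 0)
    return Y, sum(Y)  # sum(Y) is the objective value (1)


# Hypothetical toy data: 4 species (rows) by 3 cells (columns).
A = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
print(coverage(A, [1, 0, 1]))  # ([1, 1, 1, 1], 4)
print(coverage(A, [0, 1, 0]))  # ([0, 1, 1, 0], 2)
```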

The problem formulated in Equations (1)–(5) is a form of integer program known as the maximal covering problem (MCP). This formulation of the RSSP as an MCP has been fully developed in Church et al. (1996) and Camm et al. (in press). Our method of solving the problem first involves converting the data set into the MCP. This conversion is accomplished by scanning the species-hexagon incidence matrix A with a C or FORTRAN program, and transforming the information into the Mathematical Programming System (MPS) format. The MPS format indicates the contribution of each species to Equation (1), the necessary constraints on the decision variables as described by Equations (2) and (3), and the individual variable definitions and bounds described by Equations (4) and (5). The MPS file could then be used with any mixed-integer programming solver to find an optimal solution to the MCP. We used as our solver the Optimization Subroutine Library (OSL) system developed by IBM (1992).
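The conversion step might look roughly like the following Python sketch, which writes Equations (1)–(5) in a free-format variant of MPS. Several choices here are assumptions rather than details from the paper: the row and column names (`OBJ`, `COV1`, `BUDGET`, `X1`, `Y1`), the reading of (3) as "at most n", the use of objective coefficients of -1 so that a minimizing solver maximizes coverage, and the whitespace-separated layout (strict fixed-column MPS readers expect fields at specific character positions).

```python
def mcp_to_mps(A, n, name="MCP"):
    """Return the MCP for incidence matrix A and budget n as free-format MPS text."""
    M, N = len(A), len(A[0])
    out = [f"NAME {name}", "ROWS", " N OBJ"]
    out += [f" G COV{i + 1}" for i in range(M)]   # (2): sum_j a_ij x_j - Y_i >= 0
    out.append(" L BUDGET")                       # (3): sum_j x_j <= n
    out.append("COLUMNS")
    out.append(" MARKER 'MARKER' 'INTORG'")       # start of integer columns
    for j in range(N):
        for i in range(M):
            if A[i][j]:
                out.append(f" X{j + 1} COV{i + 1} 1")
        out.append(f" X{j + 1} BUDGET 1")
    out.append(" MARKER 'MARKER' 'INTEND'")       # end of integer columns
    for i in range(M):
        out.append(f" Y{i + 1} OBJ -1")           # minimize -sum Y_i
        out.append(f" Y{i + 1} COV{i + 1} -1")
    out.append("RHS")
    out.append(f" RHS BUDGET {n}")                # COV rows keep the default RHS of 0
    out.append("BOUNDS")
    out += [f" UP BND X{j + 1} 1" for j in range(N)]  # (5) together with the markers
    out += [f" UP BND Y{i + 1} 1" for i in range(M)]  # (4)
    out.append("ENDATA")
    return "\n".join(out) + "\n"


# Hypothetical 2-species, 2-cell example.
print(mcp_to_mps([[1, 0], [1, 1]], n=1))
```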

As a technical note, we are actually solving an analogous version of the problem known as the Minimum Uncovered Problem, as described in Church et al. (1996), Downs and Camm (1996) and Camm et al. (in press). The algorithm searches for the minimum number of uncovered species if one must select n cells. The mathematical formulation is similar. A variable $W_i$ can be defined as $1 - Y_i$, and row (1) is now to minimize, as opposed to maximize, the sum of the $W_i$'s. Constraint (2) becomes $\sum_{j=1}^{N} a_{ij} x_j + W_i \ge 1$, and (4) then forces the $W_i$'s to satisfy $0 \le W_i \le 1$. In fact, as shown in Church et al. (1996), the upper bounds of 1 on the $W_i$ variables can be ignored, thus avoiding the need to use an upper-bounded variant of the simplex method. With this formulation, the minimization of each $W_i$ allows it to stay at 0 if $\sum_{j=1}^{N} a_{ij} x_j$ is at least 1, i.e. if species i is covered.
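For reference, the Minimum Uncovered form described above can be written out in full. The block below is our own restatement assembled from the substitution $W_i = 1 - Y_i$. The primed equation labels are not in the original, the cell-count constraint is read as "at most n" following the wording of Section 2, and the upper bound on $W_i$ is omitted as the text indicates it may be.

```latex
\begin{align}
\text{Minimize:} \quad & \sum_{i=1}^{M} W_i \tag{1$'$} \\
\text{Subject to:} \quad & \sum_{j=1}^{N} a_{ij} x_j + W_i \ge 1, \quad i = 1, \ldots, M \tag{2$'$} \\
& \sum_{j=1}^{N} x_j \le n \tag{3$'$} \\
& W_i \ge 0, \quad i = 1, \ldots, M \tag{4$'$} \\
& x_j = 0 \text{ or } 1, \quad j = 1, \ldots, N \tag{5$'$}
\end{align}
```

Since $\sum_i W_i = M - \sum_i Y_i$, any selection that minimizes (1$'$) also maximizes (1).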

3. Finding all optimal solutions

While solving the MCP is not, in theory, a difficult problem, new complications arise when all optimal solutions must be found. Since most species appear in more than one cell, and since most cells contain multiple species, there are likely to be different selections of n cells that produce optimal coverage. The exhaustive search, by its nature, is able to find all such optimal solutions, but as mentioned previously its applicability is limited to trivial cases.
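To make the exhaustive search concrete, the following Python sketch enumerates every selection of exactly n cells and keeps all that tie for the best coverage. The function name and the toy matrix are hypothetical. The loop visits $\binom{N}{n}$ subsets, which is why the approach is restricted to trivial cases.

```python
from itertools import combinations


def all_optimal_by_enumeration(A, n):
    """Return (best coverage, list of all n-cell selections attaining it)."""
    N = len(A[0])
    best, winners = -1, []
    for cells in combinations(range(N), n):
        # Count species present in at least one chosen cell.
        covered = sum(1 for row in A if any(row[j] for j in cells))
        if covered > best:
            best, winners = covered, [cells]
        elif covered == best:
            winners.append(cells)
    return best, winners


# Hypothetical toy data: 4 species (rows) by 4 cells (columns).
A = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
print(all_optimal_by_enumeration(A, 2))  # (4, [(0, 2), (1, 3)])
```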

An alternate method for finding all optimal solutions has been used by Gerrard and Church (1995) to find alternate and near-optimal solutions to the p-median problem, of which the MCP is a special case. Camm et al. (in press) adapt this approach specifically to the RSSP and the MCP formulation. The method, which we call Explicit Exclusion (EE), involves repeated solution of the MCP, each time with an additional constraint (a type of zonal constraint) that excludes the previously enumerated optimal solution. This is accomplished as follows: suppose the solution of the MCP results in a set of selected cells represented by $x_j^*$ (j = 1, ..., N) where

\[
x_j^* = \begin{cases} 1 & \text{if cell } j \text{ is in the selected set} \\ 0 & \text{otherwise} \end{cases}
\]

We then define a set of new constants $b_j = x_j^*$ and add the following constraint to the MCP:

\begin{equation}
\sum_{j=1}^{N} b_j x_j \le n - 1 \tag{6}
\end{equation}

This constraint ensures that the same set of hexagons cannot be selected, for then the left-hand side of (6) would take on the value n. This constraint is likely to be similar to the one used by Csuti et al. (in press) in deriving their preliminary results on alternate optimal solutions.

After finding a new optimal solution with this constraint added, another constraint would be formed and added to the problem that would disallow the second optimal solution found. The problem would be augmented and re-solved in this manner until the number of species represented drops below the optimal value. When the number of species covered in a solution of the augmented problem falls below this optimal value, all optimal solutions have been found. Our approach has been to implement this procedure in a loop which adds a row corresponding to Equation (6) prior to each subsequent call to OSL. The general solution strategy is depicted in Fig. 1.
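The loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `solve_mcp` is a hypothetical brute-force stand-in for the black-box solver (OSL in the paper), selections are assumed to use exactly n cells so that each cut of the form (6) removes exactly one selection, and `max_cuts` is an illustrative cap on the number of added constraints.

```python
from itertools import combinations


def solve_mcp(A, n, excluded):
    """Stand-in solver: best n-cell selection that violates no exclusion cut."""
    N = len(A[0])
    best, best_cells = -1, None
    for cells in combinations(range(N), n):
        if frozenset(cells) in excluded:  # would make the left side of (6) equal n
            continue
        covered = sum(1 for row in A if any(row[j] for j in cells))
        if covered > best:
            best, best_cells = covered, cells
    return best, best_cells


def explicit_exclusion(A, n, max_cuts=5000):
    """Re-solve with one more exclusion cut each time until coverage drops."""
    excluded, solutions, optimum = set(), [], None
    while len(solutions) < max_cuts:
        value, cells = solve_mcp(A, n, excluded)
        if cells is None:
            break                       # every selection has been excluded
        if optimum is None:
            optimum = value             # the first solve fixes the optimal value
        if value < optimum:
            break                       # sub-optimal: all optima enumerated
        solutions.append(cells)
        excluded.add(frozenset(cells))  # add cut (6) for this solution
    return optimum, solutions


# Hypothetical toy data: 4 species (rows) by 4 cells (columns).
A = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
print(explicit_exclusion(A, 2))  # (4, [(0, 2), (1, 3)])
```

Note that the loop always performs one more solve than there are optimal solutions, since termination requires observing a sub-optimal value.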

A less brute-force method, called Explicit Subsets (ES), is based on exploiting the similarities that exist between alternate solutions. Because certain groupings of populous cells contain many species when combined, particular subsets tend to appear in many solutions. For example, when n = 10 in the data set referenced in Section 4, five cells appear in every solution while different sets of five occupy the remaining spots. The Explicit Subsets method will identify those first five cells (the explicit subset), force them to remain in each solution, and then search for five more that would provide optimal coverage. In this case, the search essentially becomes an n = 5 problem solved with EE.

ES performs its search by first finding an optimal solution with the row constraints (2)–(5) as described by the original model. Then, subsets of the original set are considered for follow-up solutions. The subsets are forced to be of size i for all values of i between 0 and n − 1. If a subset of size i cannot yield an optimal solution, i is increased. If a subset of size i does give an optimal solution, that subset is forced to remain. Then all sets of size n − i which, when combined with those i fixed in the subset, give an optimal solution, are found using EE to join the subset. In the example mentioned above, no solutions would be found until i is set to five, thereby reducing the scope of the overall search. When a specific group of five is completely exhausted, another group of five becomes a candidate. If no other subset of five can produce an optimal solution, i is increased. The data flow diagram for this algorithm can be seen in Fig. 2.
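The inner step of ES, in which a subset is forced into the solution and the remaining n − i cells are enumerated, can be sketched as below. This covers only that one step as we read it. The outer logic that chooses which subsets to force, and in what order, is not reproduced, and a brute-force loop stands in for the restricted EE search. The function name and toy data are hypothetical.

```python
from itertools import combinations


def completions(A, fixed, n, optimum):
    """All n-cell selections that contain the forced cells `fixed` (a tuple)
    and reach the known optimal coverage `optimum`."""
    N = len(A[0])
    free = [j for j in range(N) if j not in fixed]
    found = []
    for extra in combinations(free, n - len(fixed)):
        cells = tuple(sorted(fixed + extra))
        covered = sum(1 for row in A if any(row[j] for j in cells))
        if covered == optimum:
            found.append(cells)
    return found


# Hypothetical toy data: species 0 occurs only in cell 0, so cell 0
# belongs to every optimal selection of n = 3 cells (optimum = 4).
A = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
print(completions(A, (0,), 3, 4))  # [(0, 1, 2), (0, 1, 3), (0, 2, 3)]
print(completions(A, (1,), 3, 4))  # [(0, 1, 2), (0, 1, 3)]
```

Forcing cell 0 recovers all three optimal selections with a search over pairs rather than triples, which is the saving ES aims for when a large common subset exists.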

4. Numerical results

In order to measure the effectiveness of the EE and ES strategies, we tested them on a data set describing all native terrestrial vertebrates in the state of Oregon. In the notation of Section 2, this set contains N = 441 cells and lists M = 457 species. The number of species within a single cell ranges from one to 275, and the species-cell incidence matrix has a density (proportion of non-zeros) of 0.4757.

This particular set was created by tessellating the state of Oregon with a grid of hexagons each approximately 640 km² (see White et al., 1992). For each hexagon, experts determined the set of species likely to be present in that hexagon. This data set corresponds to the $a_{ij}$ components described in the MCP formulation, with the hexagon taking the place of the unspecified geographic unit. A diagram of this tessellation can be seen in Fig. 3 with the most populated hexagon highlighted.


Figure 1. Data flow diagram for explicit exclusion.


Figure 2. Data flow diagram for explicit subsets.


The results of our computational experiments are presented in Table 1. For each value of n, we indicate (i) the maximum number of species that can be covered, and (ii) for each method, the number of optimal solutions found, the total time to find all solutions, and the time per solution. Note that the total times reported include the time necessary to create and input the initial MPS file, which is independent of the method and the specific value of n. Our tests were performed on an IBM RS/6000 running under the AIX operating system. The execution times reported in Table 1 are for a single run of the procedure. A sampling of various values of n and replication of the method for those values of n indicated there was negligible variability, and thus we did not perform multiple replications for all values.

Several results in Table 1 merit elaboration. First, we see that the species coverage increases rapidly for small values of n, with 415 (over 90%) of the 457 species represented when only four hexagons can be selected. Species coverage continues to increase, but at a decreasing rate, as n increases from 4 to 12. At this point, each increase in n of one unit provides a gain of only one additional species, a trend which continues until all species are represented for the value n = 20. This characteristic of decreasing marginal returns in the number of species represented has been observed in the analysis of other data sets.
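The marginal returns can be read directly off the species-covered column of Table 1. The short Python sketch below computes the gain from each additional hexagon and the share of species covered at n = 4. The variable names are ours, and the data are copied from Table 1.

```python
# Maximum species covered for n = 1, ..., 20 (Table 1).
species_covered = [275, 360, 396, 415, 424, 430, 435, 440, 442, 445,
                   447, 449, 450, 451, 452, 453, 454, 455, 456, 457]
TOTAL_SPECIES = 457

# gains[k] is the extra coverage obtained when n goes from k + 1 to k + 2.
gains = [b - a for a, b in zip(species_covered, species_covered[1:])]
share_at_4 = species_covered[3] / TOTAL_SPECIES

print(gains)
# [85, 36, 19, 9, 6, 5, 5, 2, 3, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
print(f"{share_at_4:.1%}")  # 90.8%
```

From n = 12 onward every additional hexagon adds exactly one species.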

The number of optimal solutions found as a function of n is not so straightforward. While generally increasing as n increases from 1 to 10, the rapid rise in the number of optimal solutions beyond n = 10 is unexpected. Note that the implementation of the explicit exclusion method stopped the looping process after the addition of 5000 constraints of the form in Equation (6). Thus, the actual numbers of optimal selections for values of n ≥ 13 are unknown, but could be easily determined by increasing this self-imposed termination criterion.

Table 1. Numerical results

                   Explicit exclusion               Explicit subsets
 n   Species   Solutions     Total  Time per   Solutions     Total  Time per
     covered              time (s)  solution              time (s)  solution
 1       275           1        61        61           1        59        59
 2       360           2        60        30           2        98        49
 3       396           2        71      35.5           2        85      42.5
 4       415           4        79      19.3           4       122      30.5
 5       424           6        87      14.5           6        96        16
 6       430           4        69      16.8           4        80        20
 7       435          33       578      17.2          33       262       7.9
 8       440           7        63         9           7        78      11.1
 9       442         672     21973      32.7         672      2995       4.5
10       445          72       447       6.2          72       244       3.4
11       447         292      2400       8.2         292       789       2.7
12       449         144       853       5.9         144       416       2.9
13       450       >5000     92977      18.6       >8459     66716       7.9
14       451       >5000     48617       9.7       >5000     56434      11.3
15       452       >5000     50095      10.0       >5000     42215       8.4
16       453       >5000     49411       9.9       >5000     52562      10.5
17       454       >5000     49091       9.8       >5000     40268       8.1
18       455       >5000     44838       9.0       >5000     92684      18.5
19       456       >5000     45990       9.2       >5000    340074      68.0
20       457       >5000     43063       8.6       >5000     44514       8.9

Figure 3. Relative frequencies of occurrence in optimal solutions (n = 1, ..., 8).

Some anomalous solution times appear for values of n = 7, 11, 13, and 19. The EE per solution times for these values of n are more than twice those for neighbouring values of n. The nature of the branch-and-bound algorithm leaves open the possibility of having to consider many possible solutions before the determination of an optimal one. This unpredictability in the search is inherent in integer programming and is one of the factors that continues to generate research interest and specialized algorithms in the field. The major increase in the per solution time for n = 19 is hypothesized to be due to the fact that all 5000 solutions stem from the same subset of size four. In this regard, ES seems restrictive in that it does not allow the branch-and-bound algorithm to move to an optimal solution without the explicit subset.

A comparison of the solution efforts required by EE and ES for the various values of n results in no overall superior approach. EE is generally more efficient than ES for small values of n (less than or equal to eight). This is likely to be due to the need for ES to find a sub-optimal solution for each subset size i in order to conclude its search for that subset size (thus effectively having to complete the search n times), while EE needs only reach a single sub-optimal solution to terminate. The additional bookkeeping effort required by ES is not recovered when finding a small number of optimal solutions. For values of n where the search was terminated after the discovery of 5000 optimal solutions, the frequent similarity of solution times suggests that the ES approach degenerated into a reduced EE search. This, in fact, was observed to occur; for example, when n = 19, a single subset of size four produced all 5000 optimal solutions. Consequently, the solution time reflects the amount of work in a restricted EE search for the remaining n − i = 15 cells to be selected.

Another item of interest from both a practical as well as algorithmic viewpoint is the identification of a 'core' set of hexagons, that is, a subset of specific hexagons that have a high likelihood of appearing in any optimal solution. The practical ramifications of knowledge of such a core set are immediate. From the algorithmic perspective, the ability to identify a core with relatively little effort could help steer the search for optimal solutions in more complicated situations; in particular, for larger values of n. Furthermore, one of the common questions raised in the analysis of the RSSP is: given knowledge about the optimal reserve layouts for a specific value of n, how do such proposed reserve systems change if n is increased or decreased by one? Information on a core set of hexagons (the composition of which one suspects might change marginally for neighbouring values of n) would greatly facilitate such analyses. To this end, a preliminary effort to identify such a core set of hexagons was undertaken for our data. All 59 optimal solutions identified for values of n from one through eight were examined and the frequency of occurrence for each hexagon was noted. If we denote $O_i$ as the number of optimal solutions found for n = i, then it is a straightforward calculation to see that there are a total of $1 \cdot O_1 + 2 \cdot O_2 + \ldots + 8 \cdot O_8 = 368$ possible hexagons that could appear in these solutions. However, only 24 unique hexagons were present in our results. The relative frequency of occurrence for each of these 24 hexagons is provided in Figure 3. The results indicate that three hexagons constitute a core set, with relative frequencies of appearance greater than 80%. Two additional hexagons could be labelled as marginally core (relative frequencies of occurrence greater than 60% but less than 80%), while none of the other 19 hexagons that showed up in at least one optimal solution appeared in more than 30% of all solutions. Of particular note is the fact that the hexagon containing the greatest number of species (highlighted in Fig. 3), and thus constituting the single optimal solution for the case when n = 1, is not one of the three hexagons in the core set but one of the two we have labelled marginally core. The consequences of this are readily evident when one realizes that many greedy type heuristics begin with the selection of just such a unit.
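The counts quoted above can be checked from the solution counts in Table 1, and the relative-frequency calculation is simple to express. In the Python sketch below, the dictionary `O` copies the number of optimal solutions for n = 1, ..., 8 from Table 1. The function `relative_frequencies` and the lettered toy solutions are hypothetical, since the actual solution sets are not listed in the paper.

```python
from collections import Counter


def relative_frequencies(solutions):
    """Fraction of the given solutions in which each cell appears."""
    counts = Counter(cell for sol in solutions for cell in sol)
    return {cell: c / len(solutions) for cell, c in counts.items()}


# Number of optimal solutions O_i found for n = i (Table 1, n = 1, ..., 8).
O = {1: 1, 2: 2, 3: 2, 4: 4, 5: 6, 6: 4, 7: 33, 8: 7}
total_solutions = sum(O.values())
total_slots = sum(n * count for n, count in O.items())
print(total_solutions, total_slots)  # 59 368

# Hypothetical toy solutions, with cells labelled by letters.
toy = [("a", "b"), ("a", "c"), ("a", "b", "d")]
freq = relative_frequencies(toy)
print(freq["a"])  # 1.0
```

In the toy data cell `a` appears in every solution and would be a core cell under the 80% threshold used above.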

5. Conclusions and future work

We have described the formulation and implementation of two methods for finding all optimal solutions to the maximal covering problem resulting from the reserve site selection problem. An illustrative computational analysis using the methods on a 441-hexagon, 457-species data set describing native terrestrial vertebrates in the state of Oregon has demonstrated their efficacy. These methods have been successfully applied to data sets ranging in size from (M, N) = (10, 114) to (615, 13922).

With any sort of iterative method for finding all optimal solutions to the MCP, a technique that helps reduce the overall effort is to initially scan the data for certain exploitable characteristics (see Camm and Sweeney (1994) and Pressey et al. (1996)). For example, if two hexagons contain exactly the same species, then one hexagon can be removed from consideration and at the conclusion of the solution process any solution using the hexagon that was not removed can be rewritten using that which was removed. Similarly, two species always appearing in the same hexagons can be treated in like fashion. These pre-solution ideas are currently being implemented. Additional work in progress is attempting to address the observed behaviour of the ES method and its tendency to degenerate into a reduced EE approach by implementation of an embedded form of ES. For a fixed subset of size i, the methodology being developed looks at searching for the remaining portion of the solution (i.e. the n − i cells not in the explicit subset needed to fill out the solution) using ES rather than EE.

The information obtained from a preliminary analysis of the frequency of hexagon occurrence in optimal solutions suggests several areas of additional investigation. On one front, the question of whether a core set of units exists for all values of n, and the relative stability of that core as n changes, deserves further analysis on this and other data sets. In addition, the manner in which knowledge of such a core (which, in some respect, forms the basis behind the ES approach we have described) can be further exploited to guide the search for optimal solutions is an open question.
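The first of these pre-solution reductions, merging hexagons with identical species lists, can be sketched as follows. This is an illustration of the idea only. The function name and toy matrix are hypothetical, and the returned mapping is one possible way of recording which removed hexagons can be substituted back into a solution afterwards.

```python
def merge_duplicate_cells(A):
    """Keep one cell per distinct species list.

    Returns the reduced incidence matrix and a dict mapping each kept cell
    (original column index) to all original cells with the same species list."""
    N = len(A[0])
    groups = {}
    for j in range(N):
        column = tuple(row[j] for row in A)  # species list of cell j
        groups.setdefault(column, []).append(j)
    kept = [members[0] for members in groups.values()]
    reduced = [[row[j] for j in kept] for row in A]
    interchangeable = {members[0]: members for members in groups.values()}
    return reduced, interchangeable


# Hypothetical toy data: cells 0 and 1 contain exactly the same species.
A = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
reduced, interchangeable = merge_duplicate_cells(A)
print(reduced)          # [[1, 0], [0, 1], [1, 1]]
print(interchangeable)  # {0: [0, 1], 2: [2]}
```

Any solution of the reduced problem that uses cell 0 then corresponds to one alternate solution per member of `interchangeable[0]` in the original problem.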

Acknowledgements

The authors wish to express their appreciation to the anonymous referees of an earlier draft ofthis paper. The incorporation of their constructive comments has hopefully improved the clarityand content of this version. We are also grateful to the Editor and Guest Editor for the patiencethey have shown us throughout this process.

This research was supported by the Biodiversity Research Consortium and by the Department of Defense Strategic Research and Development Program, the Environmental Protection Agency via Interagency Agreement DW12935631, and the USDA Forest Service, Pacific Northwest Research Station.

References

Camm, J.D. and Sweeney, D.J. (1994) Row reduction in the maximal set covering problem. Working paper QA-1994–005, Department of Quantitative Analysis and Operations Management, University of Cincinnati, Cincinnati, OH.

Camm, J.D., Polasky, S., Solow, A. and Csuti, B. (in press) A note on optimization models for reserve site selection. Biological Conservation.

Church, R.L., Stoms, D.M. and Davis, F.W. (1996) Reserve selection as a maximal covering location problem. Biological Conservation, 76, 105–12.

Csuti, B., Polasky, S., Williams, P.H., Pressey, R.L., Camm, J.D., Kershaw, M., Kiester, A.R., Downs, B., Hamilton, R., Huso, M. and Sahr, K. (in press) A comparison of reserve selection algorithms using data on terrestrial vertebrates in Oregon. Biological Conservation.

Downs, B.T. and Camm, J.D. (1996) An exact algorithm for the maximal covering problem. Naval Research Logistics, 34, 435–61.

Garfinkel, R.S. and Nemhauser, G.L. (1972) Integer programming. Wiley, New York.

Gerrard, R.A. and Church, R.L. (1995) A general construct for the zonally constrained p-median problem. Environment and Planning B: Planning and Design, 22, 213–36.

IBM (1992) Optimization Subroutine Library (OSL), Guide and Reference, Release 2. Fourth edition. IBM Corporation, Kingston, New York.

Kershaw, M., Williams, P.H. and Mace, G.C. (1994) Conservation of Afrotropical antelopes: consequences and efficiency of using different site selection methods and diversity criteria. Biodiversity and Conservation, 3, 354–72.

Land, A.H. and Doig, A.G. (1960) An automatic method of solving discrete programming problems. Econometrica, 28, 497–520.

Margules, C.R., Nicholls, A.O. and Pressey, R.L. (1988) Selecting networks of reserves to maximize biological diversity. Biological Conservation, 43, 63–76.

Nicholls, A.O. and Margules, C.R. (1993) An upgraded reserve selection algorithm. Biological Conservation, 41, 11–37.

Polasky, S., Jaspin, M., Szenfandrasi, S., Bergeron, N. and Berrens, R. (1995) Bibliography on the Conservation of Biological Diversity: Biological/Ecological, Economic, and Policy Issues. Unpublished report. Department of Agricultural and Resource Economics, Oregon State University, Corvallis, OR.

Pressey, R.L. and Nicholls, A.O. (1989) Application of a numerical algorithm to the selection of reserves in semi-arid New South Wales. Biological Conservation, 50, 263–78.

Pressey, R.L., Possingham, H.P. and Margules, C.R. (1996) Optimality in reserve selection algorithms: when does it matter and how much? Biological Conservation, 76, 259–67.

Pressey, R.L., Humphries, C.J., Margules, C.R., Vane-Wright, R.I. and Williams, P.H. (1993) Beyond opportunism: key principles for systematic reserve selection. Trends in Ecology and Evolution, 8, 124–8.

Underhill, L.G. (1994) Optimal and suboptimal reserve selection algorithms. Biological Conservation, 70, 85–7.

Vane-Wright, R.I., Humphries, C.J. and Williams, P.H. (1991) What to protect? Systematics and the agony of choice. Biological Conservation, 55, 235–54.

White, D., Kimerling, A.J. and Overton, W.S. (1992) Cartographic and geometric components of a global sampling design for environmental monitoring. Cartography and Geographic Information Systems, 19, 5–22.

Williams, P.H., Vane-Wright, R.I. and Humphries, C.J. (1993) Measuring biodiversity for choosing conservation areas. In Hymenoptera and Biodiversity, J. LaSalle and I. Gauld (eds). CAB International: Wallingford, UK.

Williams, P., Gibbons, C., Margules, C., Rebelo, A., Humphries, C. and Pressey, R. (in press) A comparison of richness hotspots, rarity hotspots, and complementary areas of conserving diversity using British birds. Biological Conservation.

Biographical sketches

Jeffrey L. Arthur is an Associate Professor of Operations Research in the Department of Statistics at Oregon State University. His recent research has focused on the use of mathematical programming models for problems in biological conservation, biodiversity, and related environmental issues. He is currently involved in a project to determine an appropriate model for reserve selection based on physical habitat characteristics in the Klamath/Siskiyou region of northwestern California and southwestern Oregon.

Mark Hachey received his Bachelor's Degree in Mathematics from Ithaca College in 1994, and his Master of Science in Operations Research from Oregon State University in 1996. He is currently enrolled as a PhD student in the Mathematical Sciences Department of the Johns Hopkins University.

Kevin Sahr is a research assistant and programmer working on a variety of biogeographic problems in the Department of Geosciences at Oregon State University.

Manuela Huso is a consulting statistician with the Department of Forest Sciences at Oregon State University. She has MSc degrees in ecology and in statistics, and has worked for eight years in the applications of statistics to ecological research.

A. Ross Kiester is a supervisory statistician with the US Forest Service, USDA, located at the Forest Sciences Lab at Oregon State University, Corvallis, Oregon.
