14
A software tool for elicitation of expert knowledge about species richness or similar counts Rebecca Fisher a, * , Rebecca A. OLeary a , Samantha Low-Choy b, c , Kerrie Mengersen c , M. Julian Caley d a Australian Institute of Marine Science, UWA Oceans Institute (M096), 35 Stirling Hwy, Crawley, WA 6009, Australia b Cooperative Research Centre for National Plant Biosecurity, LPO Box 5012, Bruce ACT 2617, Australia c School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane, Qld 4001, Australia d Australian Institute of Marine Science, PMB 3, Townsville, QLD 4810, Australia article info Article history: Received 30 June 2011 Received in revised form 15 November 2011 Accepted 17 November 2011 Available online 16 December 2011 Keywords: Elicitation software Estimating counts Species richness Expert elicitation Expert opinion Estimating counts abstract Elicitation of expert knowledge has proven to be useful in a variety of disciplines including ecology, conservation management and policy. Here we report the development of a protocol and software tool that aids elicitation of expert knowledge of complex systems of count data, focusing on a case study of elicitation of species richness estimates for coral reefs. The software uses newly developed elicitation procedures to elicit probability distributions of counts in a structured and ordered protocol. We present a novel tool that has considerable advantage over more classical surveytype methods for canvassing expert opinion, including the ability to produce rapid feedback based on tted statistical distributions (thereby ensuring that experts opinions are captured with accuracy) and a means of estimating credible bounds for estimates (essential when few experts are available). It is user friendly, based on open source software (R and tcl/tk), cross platform (Windows and Mac OSX) and the code can be easily modied to tailor the software to a range of applications. Crown Copyright Ó 2011 Published by Elsevier Ltd. All rights reserved. Software availability Title: ElicitN Developers: Rebecca Fisher (GUI, data, visualization), Rebecca OLeary (statistical computation) Contact Address: Australian Institute of Marine Science, UWA Oceans Institute (M096), 35 Stirling Hwy, Crawley, WA 6009, Australia E-mail: r.[email protected]; rebecca_[email protected] Software availability: contact authors, http://sourceforge.net/ projects/elicitn/les/ Software requirements: R version 2.12 with packages gWidg- etstcltk, digest and gWidgets installed. 1. Introduction Estimation and subsequent modelling of counts is a very common task in science. A count refers to the number of times an event occurs, whether this is the number of animals in a population, airline accidents or earthquakes (Cameron and Trivedi, 1998). In the environmental and ecological sciences counts are key variables of interest as much research in these elds involves the study of populations (counts of individuals) or communities (counts of species), and how these are inuenced by a myriad of co-varying factors related to their environment. The statistical analysis and modelling of counts has a long and rich history, dating back to the nineteenth century and has developed over this time within a broad range of disciplines: actuarial science, biostatistics, demography, economics, political science and soci- ology (Cameron and Trivedi, 1998). While techniques for dealing with specic issues related to counts are now well developed, such methods obviously require that appropriate count estimates can be made. There are numerous situations, however, where such estimates cannot be made using traditional data collection methods. For example, rigorous empir- ical estimates of the total number of species on the planet (Erwin, 1982) or within a particular ecosystem (Reaka-Kudla, 1997; Small et al., 1998) are unobtainable because the repeated and indepen- dent counts required (Gotelli and Colwell, 2001) can not realisti- cally be performed at the necessary scales. Similarly, sound population estimates for rare species may be difcult to obtain because the sampling required is very large (Green and Young, 1993) and may become too resource intensive (e.g. Austin, 2002). * Corresponding author. Tel.: þ61 422953312. E-mail addresses: r.[email protected], rebecca_[email protected] (R. Fisher). Contents lists available at SciVerse ScienceDirect Environmental Modelling & Software journal homepage: www.elsevier.com/locate/envsoft 1364-8152/$ e see front matter Crown Copyright Ó 2011 Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.envsoft.2011.11.011 Environmental Modelling & Software 30 (2012) 1e14

A software tool for elicitation of expert knowledge about species richness or similar counts

Embed Size (px)

Citation preview

Page 1: A software tool for elicitation of expert knowledge about species richness or similar counts

at SciVerse ScienceDirect

Environmental Modelling & Software 30 (2012) 1e14

Contents lists available

Environmental Modelling & Software

journal homepage: www.elsevier .com/locate/envsoft

A software tool for elicitation of expert knowledge about species richnessor similar counts

Rebecca Fisher a,*, Rebecca A. O’Leary a, Samantha Low-Choy b,c, Kerrie Mengersen c, M. Julian Caley d

aAustralian Institute of Marine Science, UWA Oceans Institute (M096), 35 Stirling Hwy, Crawley, WA 6009, AustraliabCooperative Research Centre for National Plant Biosecurity, LPO Box 5012, Bruce ACT 2617, Australiac School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane, Qld 4001, AustraliadAustralian Institute of Marine Science, PMB 3, Townsville, QLD 4810, Australia

a r t i c l e i n f o

Article history:Received 30 June 2011Received in revised form15 November 2011Accepted 17 November 2011Available online 16 December 2011

Keywords:Elicitation softwareEstimating countsSpecies richnessExpert elicitationExpert opinionEstimating counts

* Corresponding author. Tel.: þ61 422953312.E-mail addresses: [email protected], reb

(R. Fisher).

1364-8152/$ e see front matter Crown Copyright � 2doi:10.1016/j.envsoft.2011.11.011

a b s t r a c t

Elicitation of expert knowledge has proven to be useful in a variety of disciplines including ecology,conservation management and policy. Here we report the development of a protocol and software toolthat aids elicitation of expert knowledge of complex systems of count data, focusing on a case study ofelicitation of species richness estimates for coral reefs. The software uses newly developed elicitationprocedures to elicit probability distributions of counts in a structured and ordered protocol. We presenta novel tool that has considerable advantage over more classical “survey” type methods for canvassingexpert opinion, including the ability to produce rapid feedback based on fitted statistical distributions(thereby ensuring that expert’s opinions are captured with accuracy) and a means of estimating crediblebounds for estimates (essential when few experts are available). It is user friendly, based on open sourcesoftware (R and tcl/tk), cross platform (Windows and Mac OSX) and the code can be easily modified totailor the software to a range of applications.

Crown Copyright � 2011 Published by Elsevier Ltd. All rights reserved.

Software availability

Title: ElicitNDevelopers: Rebecca Fisher (GUI, data, visualization), Rebecca

O’Leary (statistical computation)Contact Address: Australian Institute of Marine Science, UWA

Oceans Institute (M096), 35 Stirling Hwy, Crawley, WA6009, Australia

E-mail: [email protected]; [email protected] availability: contact authors, http://sourceforge.net/

projects/elicitn/files/Software requirements: R version 2.12 with packages gWidg-

etstcltk, digest and gWidgets installed.

1. Introduction

Estimation and subsequent modelling of counts is a verycommon task in science. A count refers to the number of times anevent occurs, whether this is the number of animals in

[email protected]

011 Published by Elsevier Ltd. All

a population, airline accidents or earthquakes (Cameron andTrivedi, 1998). In the environmental and ecological sciencescounts are key variables of interest as much research in these fieldsinvolves the study of populations (counts of individuals) orcommunities (counts of species), and how these are influenced bya myriad of co-varying factors related to their environment. Thestatistical analysis and modelling of counts has a long and richhistory, dating back to the nineteenth century and has developedover this timewithin a broad range of disciplines: actuarial science,biostatistics, demography, economics, political science and soci-ology (Cameron and Trivedi, 1998).

While techniques for dealing with specific issues related tocounts are now well developed, such methods obviously requirethat appropriate count estimates can be made. There are numeroussituations, however, where such estimates cannot be made usingtraditional data collection methods. For example, rigorous empir-ical estimates of the total number of species on the planet (Erwin,1982) or within a particular ecosystem (Reaka-Kudla, 1997; Smallet al., 1998) are unobtainable because the repeated and indepen-dent counts required (Gotelli and Colwell, 2001) can not realisti-cally be performed at the necessary scales. Similarly, soundpopulation estimates for rare species may be difficult to obtainbecause the sampling required is very large (Green and Young,1993) and may become too resource intensive (e.g. Austin, 2002).

rights reserved.

Page 2: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e142

There are situations where the inability to achieve good countestimates can have considerable negative consequences. Forexample, inability to estimate population size may cause a speciesto be overlooked in conservation efforts, such as those speciesdefined as “Data Deficient” by the International Union for Conser-vation of Nature (IUCN) Redlist. Furthermore, there are situationswhere relevant datamight exist, but are only recorded and/ormadeavailable for a subset of biased cases (e.g. medical studies based ondeath certificates (Lauer et al., 1999) or insurance claims data (Jolliset al., 1993)), or contain inherent biases (e.g. medical and socialstudies relying on self reporting (Sherry et al., 2007) or ecologicalstudies where sampling is opportunistic (Grand et al., 2007)).Lastly, considerable data remains unrecorded existing only in theform of personal knowledge (e.g. local ecological knowledge inmarine conservation e Drew, 2005).

In many situations experts exist that have considerable knowl-edge that can be useful for estimating counts where data areotherwise unreliable or unavailable (e.g. Yamada et al., 2003). Therehas been a recent increase in the use of such expert knowledge inecology, motivated by a lack of data for addressing many issues, andleading to a concomitant development of methodologies for elic-iting expert knowledge (O’Hara, 2005; Perera et al., 2011). Appli-cations for the use of expert knowledge within EnvironmentalModelling and Software (EMS) are highly varied, ranging fromforming the basis for model formulation, variable selection andvariable weighting (Ferraro, 2009; Lynam et al., 2010; Oliver et al.,2010; Page et al., in press) to the incorporation of expert and localknowledge in environmental monitoring and decision supportsystems (Giordano and Liersch, in press; Lagabrielle et al., 2010;Zerger et al., 2011).

When expert knowledge is important it is usual to adopta formal elicitation procedure managed by a facilitator withexpertise in probabilities and the process of elicitation (O’Hagan, inpress). A structured and designed approach to eliciting knowledgefrom experts greatly increases the value of the informationobtained (Low-Choy et al., 2009b). A well-designed approach canimprove accuracy by reducing cognitive biases (Spetzler and Stäelvon Holstein, 1975; O’Hagan et al., 2006; Kynn, 2008) andensures a transparent, repeatable and therefore more scientificallyjustifiable process (O’Leary et al., 2008). In particular, we considerelicitation within the Bayesian statistical paradigm, which admitssubjective as well as long-run frequency interpretations of proba-bility. In our case study this is crucial since it permits elicitation ofsubjective uncertainty about species richness. There are nowseveral examples of formal approaches for eliciting knowledgefrom experts under this Bayesian paradigm (Kuhnert et al., 2010;Low-Choy et al., 2009b; Oakley and O’Hagan, 2010).

With the development of these formal approaches, it isbecoming increasingly apparent that their implementation benefitssubstantially from the use of specialised software to aid the elici-tation process. Interactive software to support elicitation providesa wide range of advantages; see Low-Choy et al. (2009a) for recentreviews. For example, software can be used to provide instantgraphical feedback, ensuring that elicited information adequatelyreflects the expert’s opinions and help the expert maintain selfconsistency. Software also helps to: automate complex numericalcalculations (with recent example Oakley and O’Hagan, 2010); andto streamline the elicitation process by reducing the length ofelicitation sessions, and thereby, keeping the expert engaged asmuch as possible throughout the process. Furthermore, softwarecan help ensure consistency of the approach used throughout theelicitation procedure and improves repeatability (with recentexample in James et al., 2010).

Here we report on software developed for the elicitation ofexpert knowledge of complex systems of count data. While the

potential applications of this software are broad, we focus ona template developed for eliciting expert knowledge of speciesrichness on coral reefs as a case study. Below, in Section 2 wedescribe the motivation and justification behind our case studyselection. In Section 3, we describe the specifications of thestatistical model and the elicitation method implemented by thesoftware. Section 4 outlines the organisation of the software,justifies the elements included and describes its use. Section 5discusses some issues that arose during the development of thesoftware and initial observations from its use.

2. Case study justification

Cataloguing the species that inhabit the earth is one of thefundamental quests of biology, yet this task is still far fromcomplete for any of the world’s biomes (Agnarsson and Kuntner,2007). Total global diversity consists of those species that havebeen assigned valid taxonomic names, those that have beendiscovered but not been formally named (i.e. morphospecies oroperational taxonomic units e OUT’s) and those that remaincompletely undiscovered. Even existing databases of valid namedspecies are still incomplete, despite the recognised necessity forrobust species inventories over the past two decades (May, 1988,1992, 1994). Globally it is estimated there are approximately1.5e1.8 million described species (May, 1988; Reaka-Kudla, 1997).In the marine environment, the number is thought to be between250,000 and 274,000 (Bouchet, 2006; Reaka-Kudla, 1997), withonly 207,730 valid species currently listed in the World Registry ofMarine Species (Appeltans et al., 2010; http://www.marinespecies.org/). Approximately 93,000 of these described species are thoughtto be associated with coral reefs (Reaka-Kudla, 1997). While there isconsiderable uncertainty in our estimates of known and describedspecies, our uncertainty about the number of undescribed species iseven higher with estimates of total coral reef diversity ranging from618,000 to 9 million (Reaka-Kudla, 1997; Small et al., 1998). Varyingby more than one order of magnitude these values represent, atbest, what are informally referred to as rough “guestimates”(Knowlton et al., 2010). In the ocean particularly, the magnitude ofthe unknown diversity is immense. Even in the most well studiedsea area in the world, as much as 25% of species are thought toremain to be discovered (Wilson and Costello, 2005).

This large gap in currently described species is exacerbated bythe lack of taxonomic expertise to effectively describe theremaining biodiversity on earth (Evenhuis, 2007), a deficit that isfrequently referred to as the “taxonomic impediment” (de Carvalhoet al., 2007; Flowers, 2007;Wheeler et al., 2004). At the present rateof new species descriptions, it is estimated that it will take250e1000 years to complete the inventory of marine biodiversity(Bouchet, 2006). In contrast, the rate of global species extinctions israpidly increasing (Ehrlich and Wilson, 1991; Loreau et al., 2006).Major threats to biodiversity include habitat destruction, intro-duction of invasive species, overexploitation of biological resources,pollution, and climate change (Loreau et al., 2006). Therefore, it isimperative now that we try to better understand the magnitude ofextant biodiversity in order to be able to better protect it and toprovide a baseline against which future loses can be measured.

Given the above challenges, ecologists need alternative ways torapidly estimate the magnitude and patterns of species richnessand global biodiversity, across a range of scales (both taxonomicand spatial) and ecological biomes. A variety of empiricalapproaches have been used to predict unknown species richnessbut many of these approaches have serious limitations. Rigorous,data-driven species richness estimators (e.g. Gotelli and Colwell,2001) are of limited value on broad taxonomic and spatial scalesas the data requirements are difficult, if not impossible, to meet.

Page 3: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 3

Predictions of species richness from discovery curves are compro-mised by incomplete records of described species and variablelevels of effort across taxa and through time (Gaston andMay,1992;Gaston et al., 1995; May, 1988). Further, accurate predictions fromdiscovery curves are only possible where the taxonomic group isalmost completely known (Bebber et al., 2007). Predictions ofspecies richness based on extrapolations from ratios (e.g. fromcanopy beetles to all terrestrial arthropods e Erwin, 1982) requiremany, as yet, untested assumptions (Hammond, 1994).

Elicitation of expert knowledge of taxonomists for estimatingspecies richness is a promising alternative to these more traditionalapproaches (Hammond, 1994; O’Leary et al., in preparation). Experttaxonomists are perhaps best placed to predict total species rich-ness based on their knowledge of the diversity of their particulartaxa of interest and their broad knowledge of what is known andunknown about these groups (Agnarsson and Kuntner, 2007).There have been several attempts to estimate species richnessusing expert knowledge of taxonomists (Andersen, 1992; Bouchet,2006; Hammond, 1992; Joppa et al., 2010). To date, however, noauthors in the taxonomic literature have elicited this expertknowledge within a statistical framework for structured expertelicitation where a statistical distribution is encoded, supple-menting the usual single fixed value, with a richer description ofuncertainty. Formal statistical frameworks for the elicitation ofexpert knowledge can ensure that the ‘right’ type of expertknowledge is targeted, among the subtle variations available(Low-Choy et al., 2010, Appendix 2). It may also reduce thepotential for bias in estimates and estimate uncertainty aroundthem (O’Hagan et al., 2006). The software presented here imple-ments elicitation of expert knowledge in a statistical framework inwhich uncertainty can be estimated. This software is designed tosupport elicitation of knowledge from a single expert (for indi-vidual taxa) and does not attempt to combine estimates frommultiple experts or provide any additional analysis of thisinformation.

3. Software specifications

We developed the elicitation support tool, and embeddedquestions, in several stages: starting with taxonomists well-knownto the elicitation group and expanding to a pilot on other taxono-mists. This approach helped ensure the language and terminologieswere appropriate to the user group and also helped encourageparticipation and “buy-in” (Oliver et al., in press).

3.1. Overview

Several elements are fundamental when eliciting informationfrom experts. These include: 1. initial familiarization and training ofthe experts to adequately prepare them for the elicitation, 2. elici-tation of the parameters of interest that are then used to fit appro-priate statistical distributions, and 3. review and feedback of theelicited informationwith the subsequent opportunity for correction(Kynn, 2008; Low-Choy et al., 2009a). In order to streamline elici-tation sessions and ensure that any knowledge that is quantifiedaccurately reflects the opinions of the expert, it is essential that anysoftware used to aid the elicitation incorporates these elements.Exactly how this is implemented depends on the intended appli-cation of the software and the underlying conceptual model. Ourapproach is described in detail below for our specific case study.

For the elicitation of species richness for coral reefs, our protocolconsisted of three main stages (Fig. 1): 1) initial familiarization,preparation and training of the expert; 2) elicitation of values forthe taxon (taxa) onwhich the expert is expert; and 3) the elicitationof values for additional taxa for which the expert is less familiar but

still sufficiently expert to provide estimates that would be betterthan a non-expert (Fig. 1). Review and feedback are provided for allsteps in stages 2 and 3, and the opportunity for review andcorrection is available at any time.

For our tool, these three main stages are supported by sevensections of the software (AeG, Fig. 1). The second stage (Section E ofthe software, Fig. 1) is where the primary data of interest are eli-cited, and is divided into components and sub-components basedon our underlying conceptual model for total species richness (N)(outlined below, Section 3.2). Within each stage of the software inwhich parameters are elicited, the same underlying parameterelicitation method, feedback protocol, and units are used(described in detail in Sections 3.3e3.5 below).

ElicitN is designed to be used in face-to-face elicitation inter-views with graphical feedback to ensure repeatability in theexpert’s responses (O’Hagan et al., 2006). A face-to-face styleinterview provides considerable advantage in that it allows for theuse of multiple modes of communication which can be importantbecause experts differ in how they best think about a problem(Low-Choy et al., 2009a; Low-Choy et al., 2009b; Oakley andO’Hagan, 2010). It is particularly important to have face-to-faceinterviews so that the facilitator/elicitor can phrase the elicitationquestions to suit the individual expert’s understanding andknowledge of statistics and probabilities (Hogarth, 1975; Kadaneand Wolfson, 1998; O’Leary et al., 2009).

3.2. Conceptual model

One of the early steps in elicitation design is to formulatea statistical model representing the underlying conceptual model(Low-Choy et al., 2009b). It is often useful to break down the overallproblem into a series of small well-defined questions (Spetzler andStäel von Holstein, 1975; Kynn, 2008). Model decomposition in thisway helps identify components that the expert may find easier torelate to, but which may also be independently elicited (Spetzlerand Stäel von Holstein, 1975; O’Hagan, in press). Doing so gener-ally leads to hierarchical model representations and facilitatesincorporation of expert knowledge as informative priors(Low-Choy et al., 2009b; Discussion issue 8). Given that taxonomy,the science of naming and describing species, is incomplete, totalspecies richness of any taxon (N) will be the sum of the speciesalready assigned valid taxonomic names (V), those that have beendiscovered but remain unnamed (or currently are assigned thewrong name) (D), and those that remain undiscovered (U):

N ¼ V þ Dþ U (1)

We consider that each of these components is independent,which avoids challenges of modelling dependence (O’Hagan, inpress). These components also form a finite partition of N, in thesense that the components do not overlap and describe all thecontributions to N. Moreover, in the field of taxonomy, the countingblocks (here species) forming each component are sufficientlywell-defined so that there is no overlap among them. Statistically,this defines a sum of three independent random variables, so thatoverall uncertainty is also additive, since

VarðNÞ ¼ VarðVÞ þ VarðDÞ þ VarðUÞ (2)

A protocol for eliciting information so that we can statisticallyencode the distribution of each random variable is described indetail below.

3.2.1. Valid named species (V)These are the species that have been formally described and

named using standardized methods and nomenclatural rules

Page 4: A software tool for elicitation of expert knowledge about species richness or similar counts

Fig. 1. Flowchart depicting the elicitation process implemented in ElicitN. The coloured boxes show the three stages of the software (1. Initial familiarization, preparation andtraining of the expert; 2. Elicitation of values for the primary taxon (taxa); and 3. The elicitation of values for additional taxa). The seven sections (AeG) of the survey supportingthese three stages are shown, with arrows indicating the order of questions. Each square box represents an interactive widget. Red boxes are those widgets for the individualcomponents where elicitation is achieved using the specific questions of the base protocol (and associated control functions) outlined below (i.e., Fig. 2) and Section 3.3. Thesewidgets also contain an enforced internal feedback cycle in the form of a box and whisker plot of the fitted statistical distribution. (For interpretation of the references to colour inthis figure legend, the reader is referred to the web version of this article.)

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e144

(International Commission on Zoological Nomenclature, 2011;McNeill et al., 2006). While these species are the best known totaxonomists, some uncertainty around the number of namedspecies remains for most taxonomic groups. This uncertaintyincreases when considering specific habitats or ecosystems becausemost taxonomic databases (e.g. WoRMS Appeltans et al., 2010) donot systematically include habitat information and, where habitataffiliations are available, they tend to be presence only data. Becausethe number of named species across all marine environments mustbe greater than the number of species in any one habitat orecosystem, we first elicited values for the global marine richness ofnamed species, thereby anchoring the total number of namedspecies for that taxon. Next, we elicited the total number of namedspecies that inhabit coral reef ecosystems worldwide. In scientificnomenclature, synonyms are different scientific names that are (orwere) used for a single taxon of organisms, for example two namesfor the same species. For some taxonomic groups there can be

a large number of synonyms, some of which have been identifiedbut not yet formally addressed and others that are unknown. Thus,here we attempted to elicit the number of valid named species,rather than the total number of taxonomic names, encouraging theexpert to incorporate their knowledge of likely synonymy withintheir group. While accounting for synonymy may generate lesscertainty around estimates for some taxa (and correspondinglylarger credible bounds), this is essential formaintaining the additivenature of our model.

3.2.2. Discovered-but-unnamed species (D)Discovered species are those for which specimens reside in the

collections of institutions (e.g. museums) worldwide, but for whichno name, or an incorrect name, has been applied. In this context, werefer to these species as “discovered but unnamed” even thougha name in some cases has been applied to them. In order to obtainthe best possible estimate of D, we directly elicited opinions on

Page 5: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 5

three sub-components of discovered-but-unnamed species whichwere assumed to be additive.

First, we elicited the number of cryptic species (C). Those speciesoccurring in sympatry and either have insufficient morphologicalvariation to identify them as different species (but would berevealed as genetically distinct upon the application of appropriatemethods) or where morphological variation has not been properlyexploited so that such species are assigned already existing, butincorrect names.

Next we elicited the number of morphospecies (M) currently incollections worldwide. Morphospecies (also known as operationaltaxonomic units) are those species that in the opinion of a taxono-mist are sufficiently differentiated morphologically from all othercurrently named valid species to be a new species, but which hasnot yet been formally described and assigned a valid name.

Lastly, we elicited the number of genetically distinct species (G),which are one of two types. In some taxa, insufficient morpho-logical variation exists and genetic methods such as DNA barcodesare required to distinguish species. In others, species with largegeographic distributions can be morphologically indistinct acrossthe range, but can be divided into a series of allopatrically distrib-uted, but genetically distinct, species revealed by barcoding(Plaisance et al., 2009; Ratnasingham andHebert, 2007), but not yetdescribed, or named.

The total number of discovered-but-unnamed species (D) issimply the sum of these three sub-components:

D ¼ C þM þ G (3)

Depending on the taxon being considered and the taxonomist,all three categories may not be relevant in all cases. Because thesoftware can accommodate zero values, taxonomists were free toaddress only a subset or combine categories, whichever made themost sense in their opinions, and would ensure that only uncor-related, independent terms were elicited. The intent here was toobtain the best estimate for these discovered species (i.e. D), notunderstand the distribution of relative richness within it. Whileflexibility here is possible, it is important to guard against missingspecies if these categories are combined, or double, or triplecounting of species if they are not.

3.2.3. Undiscovered species (U)Undiscovered species are those for which no specimens yet

reside in collections. Species remain undiscovered for threereasons. They are missed during collecting in a particular locationbecause insufficient effort was applied to enumerate all the speciesat that location. Species may also have been missed becauseparticular geographic locations, or habitats within locations, havenot been sampled. Unlike discovered species, these components ofhow species remain undiscovered are difficult to separate, as theyare all manifestations of insufficient sampling, and may not beindependent. Thus instead of eliciting the numbers of undiscoveredspecies remaining due to each form of under-sampling separately,we elicited this as a single parameter. Prior to eliciting thisparameter, however, a series of questions were asked to assist thetaxonomist in thinking through the major reasons why speciesremain undiscovered in their particular expert taxon, and to helpthem give us their most informed opinion.

3.3. Parameter elicitation

Regardless of the exact statistical model and problem decompo-sition, elicitation of the individual components and sub-componentsentails eliciting a subjective probability distribution from the expertthat characterise the expert’s uncertainty accurately, and in

particular does not understate it (O’Hagan, in press). In thisendeavour it is important not only to elicit the expert’s uncertainty,but also to elicit additional information in order to select the bestunderlying model to describe the elicited information (e.g.Rinderknecht et al., in press). An important decisionwhen designingan elicitation iswhich summary statistics are to be elicited, includingtheir order of elicitation (Garthwaite and Dickey, 1985; Phillips andWisbey, 1993). Identifying quantities that are meaningful to theexpert is important, particularly where experts have limitedknowledge of statistics and probability theory (Kynn, 2008). Thetechniquewe used involved eliciting a subjective probability intervalto describe a plausible range of counts (Teigne and JØrgensen, 2005;Goldstein, 2006)which subtly differs from elicitation of a confidenceinterval to describe accuracy of the best estimate (as elicited by, forexample, Soll and Klayman, 2004; Speirs-Bridge et al., 2010). Whilesome elicitationmethods allow the expert to providemore than twoquantiles (e.g. Oakley and O’Hagan, 2010) we chose to elicit twoquantiles to define one interval, with the expert also stipulating thecorresponding level of plausibility, followed by elicitation of themode to reflect their “best estimate”. In contrast, eliciting the bestestimate first can lead the expert to mis-interpret the interval asa measure of accuracy on their best estimate and may also lead tooverly narrow and symmetric intervals (Low-Choy et al., 2009b). It isalso relatively intuitive for experts with limited understanding ofprobabilities (Kynn, 2005). This “outside-in” approach (Low-Choyet al., 2010, 2011) for eliciting subjective probability intervals hassimilar sequencing to an approach for eliciting confidence intervalsthat reduces overconfidence by experts (Speirs-Bridge et al., 2010).

The elicitation questions used are shown in Fig. 2(a,b). Theywere designed to be asked in the order in which they appeared inthe software. First the expert was asked about the smallest possiblevalue (the number theywould be really surprised if it was less than)and then the largest possible value (the number they would bereally surprised if it was greater than). The expert should thus benearly 100% sure that the real value lies between these bounds.Eliciting such end-points can improve accuracy (Hora et al., 1992),circumvent anchoring and adjustment bias (Morgan et al., 2001), aswell as avoid mis-interpretation as confidence interval on the bestestimate, as discussed above. Although not used directly to encodethe statistical distribution, during feedback (see 3.4) a usefulcoherency check is that these bounds fall outside the 95% intervalfrom the encoded statistical distribution (O’Leary et al., inpreparation).

After providing these bounds the expert is asked to “bring themin a little” to offer “more realistic” lower and upper bounds,together with an estimate of how sure they are that the real valuelies between these bounds; here this is termed sureness and isexpressed as a percentage. Sureness intuitively reflects the range ofparameter values considered plausible by the expert. It can beinterpreted as a subjective probability interval in the Bayesiansense, and reflects an expert’s belief that there is, for example, a 95%chance that the real value of the parameter falls within the elicitedinterval. This contrasts with a Frequentist confidence interval (CI),which is constrained to be interpreted within the context of manyidentical elicitation experiments. In many contexts where expertknowledge is desired because of the lack of empirical data, thisfrequentist interpretation can be problematic and impractical (seee.g. Reckhow, 1990), for a single expert or for multiple experts.What is desired is an interval that is interpretable based on the fewelicitations that are feasible on the topic. Investing more time inmore accurately obtaining a few elicitations is more feasible. Inaddition, it is often difficult to identify a large number of expertswith suitable expertise (Murray et al., 2009). To avoid confusion,the software checks that experts define intervals with surenessgreater than 50% to capture uncertainty rather than focussing on

Page 6: A software tool for elicitation of expert knowledge about species richness or similar counts

Fig. 2. Base protocol survey widget of the ElicitN software tool, showing the specific questions asked and their order (a & b) and the graphs produced for providing feedback to theexperts (c & d).

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e146

Page 7: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 7

the median (which is the limit of central credible intervals ofdiminishing size).

Finally, we asked the expert for their “best guess”, as the valuethe expert considered “most likely”. This specific wording wasadopted to ensure the parameter being elicited would correspondto a mode, rather than a median or mean. Pilot studies confirmedthat taxonomists conceptualized their best estimate as the mostlikely value. In this way, we are eliciting more than two summarystatistics, which can lead to more accurate representation of theexpert’s probability distribution (e.g. Kadane and Wolfson, 1998).

For species richness, it was deemed important to capture uncer-tainty as well as the best estimate by fitting a distribution with twoparameters (rather than a single parameter) to the realistic lowerand upper bounds, sureness and best guess. A simple least squaresapproach was used to ensure adequate fit between this distributionand all elicited summary statistics (for details see, for example, Low-Choy et al., 2008). We considered that an expert’s assessment maylegitimately be represented by a continuous distribution, permittinguse of the normal, log-normal and mirror log-normal, or [0,1] Betadistributions (used only when elicitationwas in percentages. See 3.5below) (O’Leary et al., in preparation). This approximation isadequate given that analysis targets facets of the cumulative distri-bution rather than the probability density itself. All of these distri-butions utilize two parameters, and are therefore able to capture“over-dispersion” (e.g. with respect to the single parameter Poisson)in different ways. The normal captures dispersion but assumes noskewness and independent location and scale. The log-normalretains the latter on the log scale, and also permits skewness. Thebeta distribution can capture skewness but sacrifices location-scaleindependence. For counts, the standard distributional choice isa Negative Binomial, but sensitivity analysis showed this distributionwas unable to capture the heavy tails implied by some of the experts’assessments. Distributions were truncated to non-negative values,with no upper limit on values. The expert could choose the unitsunderlying elicited quantities for a particular taxon (see 3.5).

3.4. Feedback

Critical to the success of this approach is the ability to providefeedback during the elicitation process to maximise the accuracyof the elicited information, and this is best supported by software(e.g. Low-Choy et al., 2009a; Spetzler and Stäel von Holstein,1975). Feedback was used to check whether the distribution thatwas fitted to an expert’s opinions adequately represented theiropinion (Kuhnert et al., 2010; Low-Choy, in press; O’Hagan et al.,2006).

Once the statistical distribution that fit the expert’s opinionsbest was determined, a fitted mode and credible bounds corre-sponding to a new level of sureness (the default value is 95% butthis can be changed by the elicitor) are calculated from thisdistribution. These fitted numbers were used to generate graphicaloutput in the form of a box and whisker plot that is shown to theexpert for feedback (see Fig. 2c & d). Feedback encourages theexpert to reflect and carefully review their assessments and canhighlight apparent errors, thus providing an opportunity for theexpert to correct them and maintain self consistency (Spetzler andStäel von Holstein, 1975; Kynn, 2008). A box and whisker plot waschosen as previous work providing several sources of feedbackfound this was the most commonly preferred choice amongsta pool of biological experts (Low-Choy, in press). During use of thesoftware, we found that it was often useful to draw the expert’sparticular attention to the 95% credible bounds calculated from thefitted statistical distribution. These fitted bounds were occasionallyoutside the extreme bounds (smallest and largest value) in whichthe expert claimed to be close to 100% confident, thus indicating

they were likely more certain about their estimate of the morerealistic values than their level of sureness indicated, i.e. the expertwas under-confident. The fitted bounds may also have been muchnarrower than expected, indicating they were overconfident (orlack of fit to the embedded distribution). Following examination ofthe feedback, the software allows the expert to go back and changetheir estimates until the fitted distribution adequately matchestheir opinions. When they are happy with the distribution andassociated output, the elicitation sessionwould proceed to the nextcomponent.

3.5. Units used in elicitation

Another key factor that must be considered when designing anelicitation protocol is the units that should be used for the elicita-tion. In particular, questions should refer to units and scales familiarto the expert (Spetzler and Stäel von Holstein, 1975). Our intentionwas to use this software to elicit information from taxonomistsacross a broad range of taxonomic groups, and at a range of levels inthe taxonomic hierarchy. Therefore, the software had to accom-modate taxa with species richness values ranging from single unitsto hundreds of thousands. The units used in these elicitationstherefore were made flexible so that each taxonomist could answerin units with which theyweremost comfortable. As our goal was toelicit actual numbers of species, the initial anchoring parameter(number of taxonomically named species across all marine habi-tats) needed to be elicited as a number (Fig. 2a). However, all theremaining parameters (named species on coral reefs, the threecomponents of discovered-but-unnamed species, and undiscov-ered species) could be elicited as either a number (N), a percentage(%), or a multiplicative factor (X) (Fig. 2b). For named species oncoral reefs, answers in % and X represent the percentage or multi-plicative factor of the number of named species in all marinehabitats. Alternatively, for discovered-but-unnamed species, or thecompletely undiscovered species, the % and X denotes the percentor multiplicative factor of the number of named coral reef species. Ifeither a % or X were chosen by the expert as the units of elicitation,then the graphical feedback included both the results presented intheir original units (% or X), as well as the value in numbers ofspecies (N) (Fig. 2d). This step proved essential for experts to gaugethe accuracy of their answers, and proved to be one of the mostcommon means of improving on original estimates based on thefeedback provided.

4. Software architecture, description and use

4.1. Architecture

The softwarewaswrittenusingR version 2.12.0 (R-Development-Core-Team, 2010). At present the software has been tested onwindowsXPandMacOSX10.6.6 “SnowLeopard”operating systems.In addition to the basic R package, our implementation requires theinstallation of the package gWidgetstcltk (Verzani, 2011b), which inturn requires two additional packages: digest (Eddelbuettel et al.,2011) and gWidgets (Verzani, 2011a).

Tcl/tk (Dalgaard, 2001; Grosjean, 2010) was used within the Rframework to create a Graphical User Interface (GUI) for imple-menting the elicitation protocol in a rigorous and consistentmanner. The GUI allows data to be entered directly by the elicitorand has buttons that run code to generate graphical output forfeedback to the expert during elicitation. The software is interac-tive, allowing responses to be adjusted and new feedback gener-ated so that the consequences of any changes can be carefullyreviewed. Navigation buttons are also provided allowing the elic-itor and expert to go back and review previous estimates at any

Page 8: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e148

stage. The ability to revisit and adjust estimates greatly increasesflexibility and maximises the ability to obtain estimates thataccurately reflect the opinions of the experts (Low-Choy et al.,2009a). The software stores entered data as comma delimitedfiles (.csv) that can be imported into a relational database. For eachexpert elicited, a file listing the main results is also generated.

The software consists of 25 individual scripts containingstatistical functions and calculations that also generate graphicaloutput and a further 32 scripts that run the tcl/tk code for the GUI.Despite the large number of component scripts, the entire softwarecan be run simply by sourcing a single control script “wElicitN.r”from within R. As the format of the elicitation method is identicalfor all parameters elicited, only a limited number of functions areactually needed to run the software. A simpler and more genericversion of the software is available at http://sourceforge.net/projects/elicitn/files/.

4.2. Software description and use

When wElicitN.r is sourced from within R, the user is promptedvia a navigation widget to select the folder “survey tool” containingthe software scripts from within their directory structure. Once theappropriate directory is selected, the software opens a dialogue boxlabelled “CReefs expert elicitation survey”. This box lists any existingelicitation sessions contained within the “results” folder of the“survey tool” and is the locationwhere all data are stored by default.At this point, theelicitorcan type in thenameofanewexpertor selectprevious survey results by clicking on the name of one of the listedexperts. If the elicitation session involves a new expert, the userselects the option to “Start survey from beginning” and the “Intro-duction” widget opens, and the elicitation proceeds from stages Athrough G described below (Fig. 1). If the elicitor wishes to view theresults from a previous elicitation, or elicit the same expert ondifferent taxa during separate sessions, they can select the option to“Gostraight to elicitationof expert taxon”. If there are results formorethan one elicited taxon for this expert, the user is prompted to selecta taxon, and the software proceeds directly to Section 4.2.4, “Experttaxon”. The sequence of steps using the software for eliciting infor-mation from a new expert, that is, the option to “Start survey frombeginning” is described below. Screen shots of a complete exampleelicitation are provided in the online supplementary material.

4.2.1. IntroductionBefore an elicitation, it is recommended to explain to the experts

why their judgements are required and how they can be used(Kynn, 2008; O’Hagan et al., 2006). Therefore, days or weeks priorto eliciting an individual expert, a summary of the goals of theproject and some background informationwas supplied. This initialstage provides the basis of a common definition of a coral reef anda coral reef species across all experts, confirmation that we requiretheir expert opinions and their uncertainty around those opinions,reinforcement that there are no right or wrong answers, andassurance that uncertainty is natural. At the beginning of an elici-tation session this information is presented on screen as an“Introduction” widget (Table 1) and was discussed with the expertas a reminder.

4.2.2. Personal informationFor many applications of this software it will be necessary to

elicit opinions from taxonomists with different levels of expertise.Information on an expert’s professional experience can be used toprovide some basis by which to objectively weight their responses(O’Leary et al., 2011). Refreshing the expert’s memory with respectto their personal sources of expertise (types of habitats, geographicscope, experience) provides some initial preparation that can also

aid in reducing availability bias, which is the tendency to recall onlyrecent information, or information that is more noteworthy(Tversky and Kahneman, 1973, 1974). For example, encouragingexperts to recall all the places in theworld they have researched (orotherwise processed samples from) will help them to rememberinformation from less recent expeditions, as well as places that maybe less memorable because they yielded few new species. Inaddition, it helps the expert to place their knowledge into a globalcontext. Consequently, information about their level of expertise iscollected via a series of questions including their formal qualifica-tions in taxonomy, experience with post-graduate supervision, andthe time frame over which they developed their expertise (Table 1).We also determined both the geographic and habitat breadth oftheir expertise. Each expert was also asked to rate their level ofstatistical knowledge and their past involvement with geneticstudies of their taxonomic groups; the latter can help to calibratetheir knowledge of how the application of molecular genetictechniques has in the past, and is likely in the future, to affect thenumber of species within their taxonomic group(s). While notdirectly used in further stages of the software, this information canbe used to investigate any apparent discrepancies in the data and/or examine potential biases once elicited data has been collected.

4.2.3. Training questionsSubstantive expertise in a specialist area does not necessarily

relate to expertise in providing accurate and coherent probabilityassessments (O’Hagan et al., 2006). For our case study, this may beparticularly true because for some taxonomists there has been littleneed for formal statistical training in many branches of the disci-pline. Therefore, before the formal elicitation process began, weprovided training in the form of the questioning they wouldencounter in the survey and the nature of the interactive graphicaland statistical feedback that they would receive. This trainingensured that the experts understood the questions, how theirresponses would be interpreted, and to the greatest extent possible,provide consistency among the experts. The training questions(Table 1) focused on eliciting the numbers of people that live intheir city, or some other city, with which they were familiar. Thisexample was chosen because our goal was to elicit the numbers ofspecies, which are distributed in the same manner as the numbersof people in cities, as frequencies. While training questions are onlyuseful for calibration when directly related to test questions, theydo provide considerable benefit by familiarizing the expert with theelicitation process (Kynn, 2008).

4.2.4. Expert taxonEach expert was asked to choose an expert taxon for the survey

for which they had particular professional expertise (Table 1) (butsee below for discussion on “less-expert” taxa under section “4.2.6.Opportunity for further elicitation”). In order to gather as muchinformation as possible, experts were encouraged to choosea taxonomic group as high as possible within the taxonomic hier-archy for which they felt they could act as an expert. Informationwas then elicited on the breadth of the taxonomist’s researchexperience with respect to different marine ecosystems (e.g.temperate reefs, mangroves, seagrass beds), habitats within coralreef ecosystems for their particular expert group, and thegeographic scope of their current and past research efforts.

4.2.5. Parameter elicitation e components of species richnessHere expert opinions on the components and sub-components of

species richness (as described in 3.2 above) and their associateduncertainty are elicited. Altogether, there are six parameters elicitedfrom the expert (as described in 3.2): 1) the number of namedspecies across all marine environments; 2) the number of named

Page 9: A software tool for elicitation of expert knowledge about species richness or similar counts

Table 1The information provided and questions asked during the initial preparation (stage 1, Sections 4.2.1e4.2.4) of the software for our case study of the elicitation of speciesnumbers on coral reefs.

Section Text

A. Introduction Overall goal: Estimate of the total number of species on coral reefs globally.Reasons why we do not have a good estimate of species on coral reefs: "The total number of coral reef associated species includesthose that are known and formally named, those that are known but not yet formally named, and those that have not yet beendiscovered. Of those that remain to be discovered, some are undiscovered, either because their habitats have never been sampled,because the geographic scope of past/current sampling has been limited, or simply because the group is so diverse that even wellstudied habitats/locations are still yielding new species.What we would like from you: We are interested in your expert opinion, with respect to all three components (above) of the globalspecies count for coral reefs, in whatever taxonomic groups you feel comfortable offering an expert opinion on. The current estimatefor species richness on coral reefs is very uncertain, and based on a large range of assumptions and very poor data. Therefore, we areattempting to provide a new and robust estimate of the total species diversity on coral reefs. Throughout this questionnaire there isno right or wrong answer, we just want your best guess, based on consideration of the factors you think are important. We do notintend that the information we collect on expert opinion will replace any relevant empirical data. Instead, we intend to use it toimprove our estimate and refine other analytical models for obtaining this estimate.Our definition of a coral reef: Shallow, tropical coral reefs include a wide range of habitats that house myriad other biological organisms,but defining what constitutes a coral reef is not always straight forward. For the purposes of this survey, we define a coral reef as a coralcreated/covered structure, occurring within tropical regions (e.g. excluding deep water reefs). We are interested in including species fromall coral reef associated habitats which are likely to have a strong ecological link with the coral reef habitat itself and house at least 1 of thelife phases. We include the coral reef structure itself, associated seagrass/macro algal habitats, rubble and sand within the immediatevicinity of coral reef structure and the pelagic environment immediately surrounding reefs. Corals reefs can also consist of distinct coralbommies or patch reefs, a reef flat, back reef area, crest, slope and base.If you would like, we are happy to retain the confidentiality of your answers. Please also be assured that there are no wrong answers,we are simply interested in your best judgement. Your contributions to this project will be gratefully acknowledged in any publicationsarising from this research.

B. Personal details Please enter the following details: Name, Date, Location, Elicitor.Are you formally qualified as a taxonomist (e.g., B.Sc., M.Sc., Ph.D?).Was your taxonomic expertise gained by: Solo work, Group work, Supervisor, Supervised, Other (state) - choose as many as areapplicable.Over what time frame was your expertise gained? <12 Months, 1e5 Years, 5e10 Years, >10 YearsHow do you rate your statistical knowledge? None or little (no formal training or experience), Some (undergraduate level/beginnercourses or equivalent experience), Lots (post-graduate level/advanced courses or equivalent experience), Advanced (researchexperience in statistics at a postdoctoral level) e choose one only.Have you led and/or collaborated in genetic studies of this/these taxonomic group/s? (describe).

C. Training questions The following questions will help us develop a baseline and familiarise you with the format of how we are going to ask the questions.We will focus on estimates of human populations. We do not expect you to be an expert in this topic.What city do you live in?.We’re interested in what you think the population of this city is. We’ll start by thinking how big or how small you think this numbercould be (elicit values using widget in Fig. 2).Can you think of any factor that might substantially alter the realised human population for your city?.We are interested in how this factor might change your estimate of the total number of people. As before we will start by finding outfirst how big or small this change might be. Consider how this factor will change the number/percentage of additional peoplethat might occur in your city.You may answer either with an actual number of additional people you think this will add to the total number, as a percentage of thenumber that you told us initially (ie 2%, 40% etc), or as a multiplicative factor (i.e. twice as many, 5 times as many, 100 times asmany, etc) (elicit values using widget in Fig. 2).Are you willing to estimate the population for another city that you have not lived in? (yes/no).

D. Expert taxon We are attempting to estimate the total number of species on coral reefs globally by asking experts for their best estimates ofdiversity in their groups of expertise. What is your specific taxonomic group of expertise?We would like to get an idea of the range of locations and habitats from which you have experience in this experttaxonomic group.Referring to the map shown, which geographic region/s have you researched/collected/identified your main expert taxonomicgroup (any/all habitats)?.For this/these taxonomic group/s, in which major marine ecosystems (e.g. rocky reef, coral reef, seagrass beds etc) have youresearched/collected/identified them from?.For this/these taxonomic group/s, in which types of coral reef habitats (e.g. coral reef structure itself, associated seagrass/macroalgal habitats, rubble and sand within the immediate vicinity of coral reef etc) have you researched/collected/identified them from?.

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 9

species occurring in coral reef ecosystems; 3) the number of crypticspecies; 4) the number of unique type/museum specimens notformally named; 5) genetically distinct species not yet described ornamed; 6) number of undiscovered species. Information on each ofthese is elicited in turn, guided by the software (Table 2), using themethod (and specific questions) outlined in Section 2. Graphicalfeedback is provided by the software at each step, and the oppor-tunity to revise estimates at any time is provided.

Upon completion of all the individual component questionsand any subsequent revisions incorporated (done automatically bythe software), a “review estimate” button appears. This buttoninvokes a summary of the statistical output from the survey intoa final estimate of the total number of species (see mathematical

details in O’Leary et al., in preparation)). The software then auto-matically generates a box and whisker plot of this final calculatedtotal number of species within coral reefs (N ¼ Vþ D þ U, Equation(1)), the component that is named (within coral reefs, V) and thecomponent that represents the inflation applied to this number(D þ U) along with the uncertainty around these (Fig. 3). Uncer-tainty in this total estimate was determined by summing the fitted95% credibility limits for each component. Standard samplingtheory provides methods for propagating uncertainty on esti-mated means and ratios; see summary in Refsgaard et al. (2007).Here the use of the log-normal and Beta distributions permituncertainty to be propagated asymmetrically. The expert isencouraged to consider these estimates and to decide if these

Page 10: A software tool for elicitation of expert knowledge about species richness or similar counts

Table 2Information provided and questions asked for each component during the parameter elicitation stage (stage 2, Section 4.2.5) of the software for our case study of the elicitationof species numbers on coral reefs.

Component Text

(i) Named Three components that make up the total number of species on coral reefs are: a) the number ofspecies that are already discovered, identified and named, b) the number that have been discovered(either genetically or as a unique morphotype) but have not been formally named and c) those thathave not been discovered yet.We would like to ask you some questions relating to all three of these components.We are interested in how many named species for this taxonomic group are found amongst allmarine habitats.We’ll start by thinking how big or how small you think this number could be (elicit values usingwidget in Fig. 2).We are interested in how many named species for this taxonomic group are found on coral reefs.You may answer either with an actual number of additional species you think this will add to the totalnumber of named species on coral reefs, as a percentage of the number of named species (i.e. 2%, 40% etc),or as a multiplicative factor (ie twice as many, 5 times as many, 100 times as many, etc),We’ll start by thinking how big or how small you think this number could be (elicit values using widget in Fig. 2).

(ii) Discovered (unnamed) Now let’s consider how accounting for cryptic species could increase or decrease this number. By crypticspecies we mean those species that are already named, but which are likely to constitute more than onespecies that would be revealed by genetic analysis. Again we start with bounds on how many morespecies would be found if cryptic species were accounted for properly. Consider the percentage/numberof cryptic species that occur in this group on coral reefs worldwide (elicit values using widget in Fig. 2).Now let’s consider how accounting for the discovered, but as yet unnamed species could increase thetotal number of coral reef species globally. Here we mean all species occurring in collections globallythat have been identified as probably being unique (do not appear to match already existing species)but have not yet been officially described and named. Consider the percentage/number of discovered butunnamed species (elicit values using widget in Fig. 2).Now let’s consider how accounting for the known genetically distinct species (i.e., genetically barcoded butnot named) could increase this number. Here we mean the number of genetically distinct species that areknown, but not yet described or named. An example might be some recent sponge studies that are usinggenetic techniques to distinguish species, but have not yet formally identified or named these species.Consider the percentage/number of unnamed genetically distinct species that occur in this group on coralreefs worldwide (elicit values using widget in Fig. 2).

(iii) Undiscovered The main reasons for missing species (rather than just not cataloguing those already discovered, but not named)include: insufficient coverage of coral reefs habitats, insufficient geographic sampling (worldwide), and notsampling ’well enough’ in habitats and geographic locations that have been sampled (i.e., under sampling).To help get you thinking about these issues, we’ll start with a few questions about how well this grouphas been sampled on coral reefs.Think about the range of habitats that are associated with coral reefs. (e.g. coral reef structure itself,associated seagrass/macro algal habitats, rubble and sand within the immediate vicinity of coral reef, orany other habitat important for your group). In general, within a typical location, what proportion of therange of habitats likely to be inhabited by this group has been well sampled?.On a global scale how well studied is this group? (good/poor).Referring to the map, if global coverage is poor, can you list the regions (numbered) in which this grouphas been studied? If global coverage is good, can you list any regions (numbered) in which this groupmay have been neglected?.Even in the most well studied coral reef habitat and in the most well studied location are there still be morespecies to be discovered if we keep looking? (no, we should have found them all; yes, might be some,but we are getting close to finding them all; yes there are still many to discover and we are no whereclose to finding them all).Based on the information we have just discussed, we would like to know how many species you thinkwe’ve missed due to insufficient sampling of habitats, geographic locations, or through under sampling.We will start with how big or small this could be. Consider the percentage or number of undiscovered ormissed species that occur in this group on coral reefs worldwide (elicit values using widget in Fig. 2).

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e1410

adequately represent their expert opinion of the species richnessof this taxon. If not satisfied that the results adequately repre-sented their opinions, they are encouraged to reconsider thevarious sections of the survey and modify their responses ifappropriate.

4.2.6. Opportunity for further elicitationThe taxonomists are then asked if they feel equally expert to

answer similar questions for any additional taxa (Table 3). If so, theycan repeat Sections 4.2.4 and 4.2.5 of the protocol, or provideestimates for other groups based on a shorter set of componentquestions (Additional non-expert taxon, Fig. 1). The latter elicits thespecies numbers of named species on coral reefs for the additionaltaxon (which can be answered as an actual number, a percentage,or a multiplicative factor of the named coral reef species of theiroriginal expert group) and then an inflation that incorporates bothdiscovered-but-unnamed and undiscovered as a single value

(either as an actual number, percentage, or multiplicative factor ofthe estimated named coral reef species).

4.2.7. Final thankyouTypically the survey takes between 1.5 and 2 h to complete;

a considerable investment in time by the experts. Their contribu-tion is acknowledged in a short “thankyou” widget that remindsthe elicitor to thank the expert for their contribution and offera digital copy of their answers for their records. These answers areautomatically saved as a “.csv” file. This record also provides theexpert the opportunity to adjust their responses sometime in thefuture if they feel it necessary.

5. Discussion

There are numerous advantages in using this software toquantify expert knowledge of counts, in our case, the estimates of

Page 11: A software tool for elicitation of expert knowledge about species richness or similar counts

Fig. 3. Screen shot of the “Review Estimate” widget and the associated graphical output used in the feedback process.

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 11

species numbers by taxonomists. It provides a mechanism forcapturing not only an expert’s best guess of a value, but also theuncertainty surrounding it, allowing the estimation of credibleintervals. Eliciting uncertainty is particularly important when onlya single expert is available (Kuhnert et al., 2010; Low-Choy et al.,2009b). This is often the case for situations like our case study,where there are very few experts specialist on the taxonomy ofa particular group within coral reef ecosystems. By definition, therewill likely be only a few experts with suitable expertise for themajority of situations where elicitation of expert knowledge isdeemed useful. Our software supports elicitation of expertknowledge by facilitating a robust and repeatable process. It alsostreamlines the elicitation process by providing rapid feedback,automating complex calculations, and providing data management(Low-Choy, in press). Furthermore, interactive software is essential

if some questions draw on earlier answers (O’Hagan et al., 2006).Our elicitation tool is highly flexible and tailored to suit differentways of thinking, allowing elicitation in a variety of numerical unitsand statistical distributions, and different levels of sureness.

The quality and value of elicited knowledge is highly dependenton the specific experience and expertise of the expert(s) involved. Inour case, expert opinions of species richness will only be as good asthe depth and breath of the specialist’s experience in relation tonatural assemblages and samples drawn from them (Hammond,1994). We assumed that the taxonomists surveyed had the neces-sary knowledge andexperience to provide reasonable answers to thequestions posed. While the method is capable of incorporatingexpert’s uncertainty, and this is implicitly modelled, there is noguarantee their answers are not biased. Indeed, evenwith the aid ofsoftware, expert opinions may contain considerable bias (Kuhnert

Page 12: A software tool for elicitation of expert knowledge about species richness or similar counts

Table 3Information provided and questions asked for elicitation of additional taxa (stage 3, Section 4.2.6) for our case study of the elicitation of species numbers on coral reefs.

Component Text

Additional expert taxon Are there any other groups in which you feel equally expert and for which you would be willing to answer the previous questions?If so, please list these (separated by commas).We would like to re-do this survey for each of these taxa that you are expert in if you are willing, but first we would like to asksome questions about closely related taxa to the one we have just been talking about. We will return to these additional expertgroups later.

Additional less expert taxa Are there any taxa that are closely related to the taxon for which you have just provided answers, that you would be willing toanswer a brief set of more general questions about the diversity of these groups compared to the one you just addressed? (yes/no).We are interested in how many named species for this new taxonomic group are found on coral reefs. You may answer eitherwith an actual number, as a percentage of the number of your expert group (the one you were telling us about initially), or as amultiplicative factor of your expert group (elicit values using widget in Fig. 2).As for your expert group, we are interested in how many more species you think there might be in this new group still to bediscovered. Instead of going through all the sources of missing species separately, we will simplify this here and ask for yourbest guess of how many undiscovered species you think there might be in this new group, including all those species we knowabout but have not yet named as well as those that have not been discovered due to lack of effort in different coral reefs habitats,around the world or general under sampling. We expect your confidence in this answer to be lower than that for your expert taxon.You may answer either with an actual number of additional species you think this will add to the total number of named specieson coral reefs, as a percentage of the total number of named species (i.e. 2%, 40% etc), or as a multiplicative factor of the number ofnamed species (i.e. twice as many, 5 times as many, 100 times as many, etc) (elicit values using widget in Fig. 2).

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e1412

et al., 2010; O’Hagan et al., 2006). Our protocol is specificallydesigned to reduce common biases, including: availability bias(expert’s tendency to recall recent or important events) e by incor-porating initial preparation reminding the expert of their relevantexperience (as described above in Section 4.2, item B); anchoring(where the expert adjusts all of their values according to their bestguess) e by eliciting the most extreme values first (Low-Choy et al.,2010); and overconfidence e by eliciting the level of surenesstogether with the interval (Kynn, 2005). Nonetheless furtherresearch is required to systematically assess the effectiveness offunctionality provided by this software, for example, to ascertainwhether overconfidence noted in eliciting confidence intervals alsooccurs when eliciting subjective intervals (Soll and Klayman, 2004;Teigen and JØrgensen, 2005; Speirs-Bridge et al., 2010).

Other biases are less well controlled. For example, experts maybe influenced by previous estimates that are incorrect, or repetitionof a single estimate in the literature that creates a false sense ofsecurity that a parameter is well known (Bouchet, 2006). Regard-less of these potential issues, expert knowledge can represent, inmany cases, the only possible way of obtaining these estimates. Inthe case of estimates of coral reef species richness the structuredelicitation approach presented here is likely to greatly improve onwhat has been parochially termed the rough “guestimates” ofprevious efforts (Knowlton et al., 2010), although this can only beknown with certainty if (and when) the inventory of coral reefspecies is completed. At the very least, expert estimates provide aninteresting alternate source of information that can be compared toother empirically derived estimates as they become available.Alternatively these initial estimates, which represent the currentstate of knowledge, can be viewed as “prior” distributions on N, andcan be updated using new data within a Bayesian statisticalmodelling framework (Spiegelhalter et al., 2004; Kuhnert et al.,2010). Moreover the structured design and implementation ofelicitation, where software helps to streamline and manage theprocess, also imparts transparency and repeatability to the process,enhancing its robustness.

As currently implemented, our tool includes a section on elici-tation of additional non-expert taxa. This section was included toencourage experts to provide more information beyond theirspecialist group, as their opinion and knowledge of similar orrelated taxa are likely to be a considerable improvement on existinginformation. Doing so, however, proved to be of limited use as mosttaxonomists preferred only to provide opinions on groups forwhich they felt they were truly “expert”. It seems likely that similar

attempts to elicit “less” or “non-expert” information in otherapplications, especially in a similar voluntary context, may alsoprove futile. During implementation, we noted that some expertswere able to quickly assess the accuracy of another estimate despitebeing reluctant to provide their own estimate for a taxon theyconsidered beyond their expertise. An alternative approach foreliciting “less-expert” knowledge (if this is even necessary) may beto provide several estimates and instead use the expert’s knowl-edge to refute these, or to get the expert to provide a relative rankusing methods such as the analytical hierarchy process (AHP, Saaty,1987). As noted by Page et al. (in press), experts usually find it easierto advise on how to adjust empirical estimates upwards or down-wards, so utilizing existing information such as the information inthe World Registry of Marine Species may also prove useful.

Furthermore, it was often difficult to encourage taxonomists toprovide estimates at higher taxonomic levels. There is a trade-offbetween collecting precise (relatively well known) data ona limited number of taxonomic groups for which there are clear“experts” and collecting less precise (with associated large uncer-tainty) data on a larger set of taxa from the taxonomic tree.Depending on the specific motivation behind the elicitation study,researchers may prefer to lean towards either end of theseextremes. Ultimately, however, the extent and range of data thatcan be collected using this methodology lie in thewillingness of theexperts to offer their opinion and the experts should not be pushedbeyond what they fell comfortable providing.

Future adaptations of the software could be based on moresophisticated GUI programming languages such as Java (e.g. Jameset al., 2010), which has greater flexibility by allowing more visuallyappealing widgets to be created along with interactive feedbackgraphs. Similarly, the code could be further developed to save datadirectly to a database. Doing so, however, would necessitate theselection of a particular platformwhich may not be preferred by allusers or available on their operating system. The simplicity of thecurrent programming platform and output in the form of “.csv” hasthe advantage that the underlying code remains relatively straightforward and can be easily modified by researchers who only havea basic knowledge of programming in R.

While this software was written for the elicitation of globalspecies richness on coral reefs, it could be easily adapted bysomeone with only limited knowledge in programming in R andtcltk for the elicitation of species richness in any ecosystem, at anyspatial scale. Furthermore, in the same way many estimators ofpopulation size (numbers of individuals) are applicable to

Page 13: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e14 13

estimators of species richness (numbers of species; e.g. Burnhamand Overton, 1979; Cam et al., 2002), the same protocol could beused to elicit population sizes. A simplified, generic version of thesoftware is now available at http://sourceforge.net/projects/elicitn/files/ and can be used as is, or easily modified to suit any applicationwhere the elicitation is desired for a frequency or count which issubject to several sources of uncertainty.

Acknowledgments

Funded by BHP Billiton through CReefs Australia (CReefs, Censusof Marine Life). We are grateful to C Birrell, C Shoenberg and PSutcliffe for initial discussions and testing of early versions of thesoftware, and to the many taxonomists who agreed to be elicited.We thank the Cooperative Research Centre for National PlantBiosecurity for supporting contribution by the third author.

Appendix. Supplementary material

Supplementary data associated with this article can be found, inthe online version, at doi:10.1016/j.envsoft.2011.11.011.

References

Agnarsson, I., Kuntner, M., 2007. Taxonomy in a changing world: seeking solutionsfor a science in crisis. Systematic Biology 56 (3), 531e539.

Andersen, R.A., 1992. Diversity of eukaryotic algae. Biodiversity and Conservation 1(4), 267.

Appeltans, W., Bouchet, P., Boxshall, G.A., Fauchald, K., Gordon, D.P.,Hoeksema, B.W., Poore, G.C.B., van Soest, R.W.M., Stöhr, S., Walter, T.C.,Costello, M.J., 2010. World Register of Marine Species URL. http://www.marinespecies.org Accessed 1 April 2010.

Austin, M.P., 2002. Spatial prediction of species distribution: an interface betweenecological theory and statistical modelling. Ecological Modelling 157, 101e118.

Bebber, D.P., Marriott, F.H.C., Gaston, K.J., Harris, S.A., Scotland, R.W., 2007.Predicting unknown species numbers using discovery curves. Proceedings ofthe Royal Society B: Biological Sciences 274, 1651e1658.

Bouchet, P., 2006. The magnitude of marine biodiversity. In: Duarte, C.M. (Ed.), TheExploration of Marine Biodiversity. Fundación BBVA, Madrid, Spain, pp. 32e64.

Burnham, K.P., Overton, W.S., 1979. Robust estimation of population size whencapture probabilities vary among animals. Ecology 60 (5), 927e936.

Cam, E., Nichols, J.D., Sauer, J.R., Hines, J.E., 2002. On the estimation of speciesrichness based on the accumulation of previously unrecorded species.Ecography 25 (1), 102e108.

Cameron, A.C., Trivedi, P.K., 1998. Regression Analysis of Count Data. CambridgeUniversity Press, Cambridge UK.

Dalgaard, P., 2001. A Primer on the R-Tcl/Tk package. R News 1 (3), 27e31.de Carvalho, M., Bockmann, F., Amorim, D., Brandão, C., de Vivo, M., de Figueiredo, J.,

Britski, H., de Pinna, M., Menezes, N., Marques, F., Papavero, N., Cancello, E.,Crisci, J., McEachran, J., Schelly, R., Lundberg, J., Gill, A., Britz, R., Wheeler, Q.,Stiassny, M., Parenti, L., Page, L., Wheeler, W., Faivovich, J., Vari, R., Grande, L.,Humphries, C., DeSalle, R., Ebach, M., Nelson, G., 2007. Taxonomic impedimentor impediment to taxonomy? A commentary on systematics and thecybertaxonomic-automation paradigm. Evolutionary Biology 34 (3), 140e143.

Drew, J.A., 2005. Use of traditional ecological knowledge in marine conservation.Conservation Biology 19 (4), 1286e1293.

Eddelbuettel, D., Lucas, A., Tuszynski, J., Bengtsson, H., Urbanek, S., Frasca, M., 2011.Create Cryptographic Hash Digests of R Objects CRAN. http://dirk.eddelbuettel.com/code/digest.html.

Ehrlich, P.R., Wilson, E.O., 1991. Biodiversity studies: science and policy. Science 253,758e762.

Erwin, T.L., 1982. Tropical forests: their richness in Coleoptera and other arthropodapecies. The Coleopterists Bulletin 36 (1), 74e75.

Evenhuis, N.L., 2007. Helping solve the “other” taxonomic impediment: completingthe eight steps to total enlightenment and taxonomic nirvana. Zootaxa 1407,3e12.

Ferraro, D.O., 2009. Fuzzy knowledge-based model for soil condition assessment inArgentinean cropping systems. EnvironmentalModelling and Software 24 (3), 359.

Flowers, R.W., 2007. Comments on “Helping solve the ‘other’ taxonomic impedi-ment: completing the eight steps to total enlightenment and taxonomicnirvana” by Evenhuis (2007). Zootaxa 1494, 67e68.

Garthwaite, P.H., Dickey, J.M., 1985. Double and single bisection methods forsubjective probability assessments in a location-scale family. Journal ofEconometrics 29, 149e163.

Gaston, K.J., May, R.M., 1992. Taxonomy of taxonomists. Nature 356, 281e282.

Gaston, K.J., Scoble, M.J., Crook, A., 1995. Patterns in species description: a casestudy using the Geometridae (Lepidoptera). Biological Journal of the LinneanSociety 55, 225e237.

Giordano, R., Liersch, S. A fuzzy GIS-based system to integrate local and technicalknowledge in soil salinity monitoring. Environmental Modelling and SoftwareCorrected Proof, Accepted 9 September 2011, in press.

Goldstein, M., 2006. Subjective Bayesian analysis: principles and practice. BayesianAnalysis 1 (3), 403e420.

Gotelli, N.J., Colwell, R.K., 2001. Quantifying biodiversity: procedures and pitfalls in themeasurement and comparison of species richness. Ecology Letters 4, 379e391.

Grand, J., Cummings, M.P., Rebelo, T.G., Ricketts, T.H., Neel, M.C., 2007. Biased datareduce efficiency and effectiveness of conservation reserve networks. EcologyLetters 10, 364e374.

Green, R.H., Young, R.C., 1993. Sampling to detect rare species. EcologicalApplications 3 (2), 351e356.

Grosjean, P., 2010. SciViews-R: A GUI API for R. UMONS, Mons, Belgium.Hammond, P., 1992. Species inventory. In: Groombridge, B. (Ed.), Global Biodiversity:

Status of the Earth’s Living Resources. Chapman and Hall, London, pp. 17e39.Hammond, P.M., 1994. Practical approaches to the estimation of the extent of

biodiversity in Speciose groups. Philosophical Transactions of the Royal SocietyB: Biological Sciences 345 (1311), 119e136.

Hogarth, R.M., 1975. Cognitive Processes and the assessment of subjectiveprobability distributions. Journal of the American Statistical Association 70(350), 271e289.

Hora, S.C., Hora, J.A., Dodd, N.G., 1992. Assessment of probability distributions forcontinuous random variables: a comparison of the bisection and fixed valuemethods. Organizational Behavior and Human Decision Processes 51, 133e155.

International Commission on Zoological Nomenclature, 2011. International Code ofZoological Nomenclature (ICZN). http://iczn.org/.

James, A., Low Choy, S., Mengersen, K., 2010. Elicitator: an expert elicitation tool forregression in ecology. Environmental Modelling & Software 25 (1), 129e145.

Jollis, J.G., Ancukiewicz, M., DeLong, E.R., Pryor, D.B., Muhlbaier, L.H., Mark, D.B.,1993. Discordance of databases designed for claims payment versus clinicalinformation systems: implications for outcomes research. Annals of InternalMedicine 119 (8), 844e850.

Joppa, L.N., Roberts, D.L., Pimm, S.L., 2010. Howmany species of flowering plants arethere? Proceedings of the Royal Society B: Biological Sciences. doi:10.1098/rspb.2010.1004.

Kadane, J.B., Wolfson, L.J., 1998. Experiences in elicitation. The Statistician 47, 3e19.Knowlton, N., Brainard, R.E., Fisher, R., Moews, M., Plaisance, L., Caley, M.J., 2010.

Chapter 4: coral reef biodiversity. In: McIntyre, A.D. (Ed.), Life in the World’sOceans. Wiley-Blackwell.

Kuhnert, P.M., Martin, T.G., Griffiths, S.P., 2010. A guide to eliciting and using expertknowledge in Bayesian ecological models. Ecology Letters 13, 900e914.

Kynn, M., 2005. Eliciting expert knowledge for Bayesian logistic regression inspecies habitat modelling in natural resources. PhD thesis, QueenslandUniversity of Technology: Brisbane.

Kynn, M., 2008. The ‘heuristics and biases’ bias in expert elicitation. Journal of theRoyal Statistical Society: Series A (Statistics in Society) 171 (1), 239.

Lagabrielle, E., Botta, A., Dare, W., David, D., Aubert, S., Fabricius, C., 2010. Modellingwith stakeholders to integrate biodiversity into land-use planning e Lessonslearned in Reunion Island (Western Indian Ocean). Environmental Modellingand Software 25 (11), 1413.

Lauer, M.S., Blackstone, E.H., Young, J.B., Topol, E.J., 1999. Cause of death in clinicalresearch; time for a reassessment? Journal of the American College ofCardiology 34 (3), 618e620.

Loreau, M., Oteng-Yeboah, A., Arroyo, M.T.K., Babin, D., Barbault, R., Donoghue, M.,Gadgil, M., Häuser, C., Heip, C., Larigauderie, A., 2006. Diversity withoutrepresentation. Nature 442 (7100), 245e246.

Low-Choy, S., Mengersen, K., Rousseau, J., 2008. Encoding expert opinion on skewednonnegative distributions. Journal of Applied Probability & Statistics 3, 1e21.

Low-Choy, S., James, A., Mengersen, K., 2009a. Expert Elicitation and its Interfacewith Technology: A Review with a View to Designing Elicitator, 18th WorldIMACS/MODSIM09 International Congress on Modelling and Simulation:Cairns, Australia, pp. 4269e4275.

Low-Choy, S., O’Leary, R., Mengersen, K., 2009b. Elicitation by design in ecology: usingexpert opinion to informpriors for Bayesian statisticalmodels. Ecology 90, 265e277.

Low-Choy, S., Murray, J., James, A., Mengersen, K., 2010. Indirect elicitation fromecological experts: from methods and software to habitat modelling androck-wallabies. In: O’Hagan, A., West, M. (Eds.), Oxford Handbook of AppliedBayesian Analysis. Oxford University Press, Oxford, UK.

Low-Choy, S. Chapter 4: Elicitator: a user-friendly, interactive tool to support elic-itation of expert knowledge, in press.

Lynam, T., Drewry, J., Higham, W., Mitchell, C., 2010. Adaptive modelling for adap-tive water quality management in the Great Barrier Reef region, Australia.Environmental Modelling and Software 25 (11), 1291e1301.

May, R.M., 1988. How many species on earth? Science 241, 1441e1449.May, R.M., 1992. How many species inhabit the earth? Scientific American 267 (4),

18e24.May, R.M., 1994. Conceptual aspects of the quantification of the extent of bio-

logical diversity. Proceedings of the Royal Society B: Biological Sciences 345,13e20.

McNeill, J., Barrie, F.R., Burdet, H.M., Demoulin, V., Hawksworth, D.L., Marhold, K.,Nicolson, D.H., Prado, J., Silva, P.C., Skog, J.E., Wiersema, J.H., Turland, N.J., 2006.International Code of Botanical Nomenclature (Vienna Code) Adopted by the

Page 14: A software tool for elicitation of expert knowledge about species richness or similar counts

R. Fisher et al. / Environmental Modelling & Software 30 (2012) 1e1414

Seventeenth International Botanical Congress Vienna, Austria, July 2005,Regnum Vegetabile. A.R.G. Gantner Verlag KG, Ruggell, Liechtenstein.

Morgan, M.G., Pitelka, L.F., Shevliakova, E., 2001. Elicitation of expert judgmentsof climate change impacts on forest ecosystems. Climatic Change 49,279e307.

Murray, J.V., Goldizen, A.W., O’Leary, R.A., McAlpine, C.A., Possingham, H.P., Choy, S.L.,2009. How useful is expert opinion for predicting the distribution of a specieswithin and beyond the region of expertise? A case study using brush-tailed rock-wallabies Petrogale penicillata. Journal of Applied Ecology 46 (4), 842.

Oakley, J.E., O’Hagan, A., 2010. SHELF: The Sheffield Elicitation Framework (Version2.0). School of Mathematics and Statistics, University of Sheffield. http://tonyohagan.co.uk/shelf.

O’Hagan, A., Buck, C.E., Daneshkhah, A., Ebooks, C., 2006. Uncertain Judgements:Eliciting Experts’ Probabilities. Wiley Chichester.

O’Hagan, A., Probabilistic uncertainty specification: Overview, elaboration tech-niques and their application to a mechanistic model of carbon flux. Environ-mental Modelling & Software Corrected Proof, Available online 31 March 2011,in press.

O’Hara, R.B., 2005. Species richness estimators: how many species can dance on thehead of a pin? Journal of Animal Ecology 74 (2), 375e386.

O’Leary, R.A., Murray, J.V., Low-Choy, S.J., Mengersen, K.L., 2008. Expert elicitation forBayesianclassificationtrees. JournalofAppliedProbability&Statistics3 (1),95e106.

O’Leary, R., Low Choy, S., Murray, J., Kynn, M., Denham, R., Martin, T.,K.,M., 2009.Comparison of three elicitation methods for logistic regression on predictingthe presence of the threatened brush-tailed rock-wallaby. Environmetrics 20,379e398.

O’Leary, R.A., Fisher, R., Low-Choy, S., Mengersen, K., Caley, M.J., 2011. What is anexpert? In: Chan, F., Marinova, D. (Eds.), MODSIM 2011 International Congresson Modelling and Simulation Modelling and Simulation Society of Australia andNew Zealand, December 2011.

O’Leary, R.A., Low-Choy, S., Fisher, R., Mengersen, K., Caley, M.J., Characterisinguncertainty in expert knowledge: encoding heavily skewed count data. Annalsof Applied Statistics, in preparation.

Oliver, D.M., Page, T., Hodgson, C.J., Heathwaite, A.L., Chadwick, D.R., Fish, R.D.,Winter, M., 2010. Development and testing of a risk indexing framework todetermine field-scale critical source areas of faecal bacteria on grassland.Environmental Modelling and Software 25, 503e512.

Oliver, D.M., Fish, R.D., Winter, M., Hodgson, C.J., Heathwaite, A.L., Chadwick, D.R.,Valuing local knowledge as a source of expert data: Farmer engagement and thedesign of decision support systems. Environmental Modelling & SoftwareCorrected Proof, Available online 6 November, in press.

Page, T., Heathwaite, A.L., Thompson, L.J., Pope, L., Willows, R., Eliciting fuzzydistributions from experts for ranking conceptual risk model components.Environmental Modelling & Software Corrected Proof, Available online 30March 2011, in press.

Perera, A.H., Drew, C.A., Johnson, C.J., 2011. Expert Knowledge and Its Applicationsin Landscape Ecology. Springer, NY.

Phillips, L.D., Wisbey, S.J., 1993. The Elicitation of Judgmental Probability Distribu-tions from Groups of Experts: A Description of the Methodology and Records ofSeven Formal Elicitation Sessions Held in 1991 and 1992 Report nss/r282, NirexUK, Didcot, UK.

Plaisance, L., Knowlton, N., Paulay, G., Meyer, C., 2009. Reef-associated crustaceanfauna: biodiversity estimates using semi-quantitative sampling and DNAbarcoding. Coral Reefs 28, 977e986.

Ratnasingham, S., Hebert, P.D.N., 2007. Bold: the barcode of Life data system.Molecular Ecology Notes 7 (3), 355e364. http://www.barcodinglife.org.

R-Development-Core-Team, 2010. R: A Language and Environment for StatisticalComputing. R Foundation for Statistical Computing, Vienna, Austria.

Reaka-Kudla, M.L., 1997. The global biodiversity of coral reefs: a comparison withrain forests. In: Reaka-Kudla, M.L., Wilson, D.E., Wilson, E.O. (Eds.), BiodiversityII: Understanding and Protecting Our Biological Resources. Natl Academy Pr,Washington, pp. 83e108.

Reckhow, K.M., 1990. Bayesian inference in non-replicated ecological studies.Ecology 71, 2053e2059.

Refsgaard, J.C., Sluijs, J.P.v.d., Højberg, A.L., Vanrolleghem, P.A., 2007. Uncertainty inthe environmental modelling process e A framework and guidance. Environ-mental Modelling & Software 22, 1543e1556.

Rinderknecht, S.L., Borsuk, M.E., Reichert, P., Bridging uncertain and ambiguousknowledge with imprecise probabilities. Environmental Modelling & SoftwareCorrected Proof, Available online 13 September 2011, in press.

Saaty, T.L., 1987. Risk-its priority and probability: the analytic hierarchy process.Risk Analysis 7, 159e172.

Sherry, B., Jefferds, M.E., Grummer-Strawn, L., 2007. Accuracy of adolescent self-report of height and weight in assessing overweight status: a literaturereview. Archives of Pediatrics & Adolescent Medicine 161 (12), 1154e1161.

Small, A.M., Adey, W.H., Spoon, D., 1998. Are current estimates of coral reef biodi-versity too low? The view through the window of a microcosm. Atoll ResearchBulletin 458, 1e20.

Soll, J.B., Klayman, J., 2004. Overconfidence in interval estimates. Journal ofExperimental Psychology: Learning, Memory, and Cognition 30, 299e314.

Speirs-Bridge, A., Fidler, F., McBride, M., Flander, L., Cumming, G., Burgman, M.,2010. Reducing overconfidence in the interval judgments of experts. RiskAnalysis 30 (3), 512.

Spetzler, C.S., Stäel von Holstein, C.A.S., 1975. Probability encoding in decisionanalysis. Management Science 22 (3), 340e358.

Spiegelhalter, D.J., Adams, K.R., Myles, J.P., 2004. Bayesian Ap-proaches to ClinicalTrials and Health-Care Evaluation. Statistics in Practice. John Wiley & Sons Ltd,Chichester, U.K.

Teigen, K.H., JØrgensen, M., 2005. When 90% confidence intervals are 50% certain:on the credibility of credible intervals. Applied Cognitive Psychology 19 (4), 455.

Tversky, A., Kahneman, D., 1973. Availability: a heuristic for judging frequency andprobability. Cognitive Psychology 5, 207e232.

Tversky, A., Kahneman, D., 1974. Judgement under uncertainty: heuristics andbiases. Science 185, 1124e1131.

Verzani, J., 2011a. gWidgets API for Building Toolkit-independent, Interactive GUIsCRAN. https://r-forge.r-project.org/R/?group_id¼761.

Verzani, J., 2011b. Toolkit Implementation of gWidgets for tcltk Package CRAN.http://gwidgets.r-forge.r-project.org/.

Wheeler, Q.D., Raven, P.H., Wilson, E.O., 2004. Taxonomy: impediment or expe-dient? Science 303 (5656), 285.

Wilson, S.P., Costello, M.J., 2005. Predicting future discoveries of European marinespecies by using a non homogeneous renewal process. The Journal of the RoyalStatistical Society, Series C (Applied Statistics) 54, 897e918.

Yamada, K., Elith, J., McCarthy, M., Zerger, A., 2003. Eliciting and integrating expertknowledge for wildlife habitat modelling. Ecological Modelling 165, 251e264.

Zerger, A., Warren, G., Hill, P., Robertson, D., Weidemann, A., Lawton, K., 2011.Multi-criteria assessment for linking regional conservation planning andfarm-scale actions. Environmental Modelling and Software 26 (1), 103e110.