A unified framework to combine disperate data types in species distribution modelling
A unified framework to combine disperate datatypes in species distribution modelling
Slides on Slideshare:http://www.slideshare.net/oharar/gf-o2014talk
Bob O’Hara1 Petr Keil 2 Walter Jetz2
1BiK-F, Biodiversity and Climate Change Research CentreFrankfurt am Main
GermanyTwitter: @bobohara
2Department of Ecology and Evolutionary BiologyYale University
New Haven, CT, USA
A unified framework to combine disperate data types in species distribution modelling
A ”Real” Curve
0 20 40 60 80 100
020
4060
80
Curve
A unified framework to combine disperate data types in species distribution modelling
Approximated with a Discretised Curve
0 20 40 60 80 100
020
4060
80
CurveDiscrete
A unified framework to combine disperate data types in species distribution modelling
Better: linear interpolation
0 20 40 60 80 100
020
4060
80
CurveDiscreteInterpolated
A unified framework to combine disperate data types in species distribution modelling
With more points, the approximations improve
0 20 40 60 80 100
020
4060
80
CurveDiscreteInterpolated
A unified framework to combine disperate data types in species distribution modelling
What does this have to do with distribution models?
A unified framework to combine disperate data types in species distribution modelling
What does this have to do with distribution models?
This is how SDMs see the world:
source: http://bit.ly/1l8sG7M
Map produced by Peter Blancher, Science and Technology Branch, Environment Canada, based on data from the
North American Breeding Bird Survey
A unified framework to combine disperate data types in species distribution modelling
Problems: scale, within-grid heterogeneity
A unified framework to combine disperate data types in species distribution modelling
Let’s sidestep the whole problem
Work in continuous space insteadThe maths will let us work on different scales
I e.g. Renner & Warton (2013) doi:10.1111/j.1541-0420.2012.01824.x
Lets us deal with points & irregular shapesMakes it straightforward to include different sorts of data
A unified framework to combine disperate data types in species distribution modelling
Motivation
Map Of Life
www.mol.org/
Different data sources
I GBIF
I expert range maps
I eBird and similarcitizen science efforts
I organised surveys(BBS, BMSs)
I Regional checklists
A unified framework to combine disperate data types in species distribution modelling
A Unified Model
There is a single state - density of the species
Actual State
PresenceAbsence
PresenceOnly
ExpertRangeMaps
���� ?
HHHj
A unified framework to combine disperate data types in species distribution modelling
Point Processes: Model
Each point in space, ξ, has anintensity, ρ(ξ)
log(ρ(ξ)) = η(ξ) =∑
βX (ξ)+ν(ξ)
The number of individuals in anarea A follows a Poissondistibution with mean
λ(A) =
∫Aρ(ξ)ds
A unified framework to combine disperate data types in species distribution modelling
Point Processes: Reality
Approximate λ(ξ) numerically:select some integration points,and sum over those
λ(A) ≈N∑
s=1
|A(s)|eη(s)
A unified framework to combine disperate data types in species distribution modelling
Observation Models
Presence only points: thinned point processAbundance: Poisson Presence/Absence: binomial, cloglogwith µA(A, t) = η(A) + log(|A|) + log(t) + log(p)(large) areas:
Pr(n(A) > 0) = 1− e∫A eρ(ξ)dξ
Expert range: use distance to range as a covariate
A unified framework to combine disperate data types in species distribution modelling
Put these together
Data likelihoods: P(Xi |λ) for data Xi . Total likelihood is
P(X) =∏i
P(Xi |λ)P(λ)
Where P(λ) is the actual distribution model, and will depend onenvironmental and other covariates
A unified framework to combine disperate data types in species distribution modelling
In practice
Be Bayesian. Could use MCMC, but this is quicker in INLA
SolTim.res <- inla(SolTim.formula,
family=c('poisson','binomial'),
data=inla.stack.data(stk.all),
control.family = list(list(link = "log"),
list(link = "cloglog")),
control.predictor=list(A=inla.stack.A(stk.all)),
Ntrials=1, E=inla.stack.data(stk.all)$e, verbose=FALSE)
A unified framework to combine disperate data types in species distribution modelling
The Solitary Tinamou
Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)
A unified framework to combine disperate data types in species distribution modelling
Data
Whole RegionExpert rangePark, absentPark, presenteBirdGBIF
I expert range
I 2 pointprocesses (49points)
I 28 parks
A unified framework to combine disperate data types in species distribution modelling
A Fitted Model
mean sd
Intercept -0.03 0.02b.eBird 1.54 0.39b.GBIF 1.54 0.24Forest 0.00 0.01
NPP -0.01 0.01Altitude -0.01 0.01
DistToRange -0.01 0.00
A unified framework to combine disperate data types in species distribution modelling
Predicted Distribution
Posterior Mean
−0.10
−0.09
−0.08
−0.07
−0.06
−0.05
−0.04
−0.03
−0.02
Posterior Standard Deviation
0.01
0.02
0.03
0.04
0.05
0.06
A unified framework to combine disperate data types in species distribution modelling
Individual Data Types
eBird GBIF Parks Expert Range
A unified framework to combine disperate data types in species distribution modelling
Join the bandwagon!
Using continuous space - makes lifeeasierIn practice, use INLA (but I need totidy up the code)
A unified framework to combine disperate data types in species distribution modelling
Not the final answer...
http://www.gocomics.com/nonsequitur/2014/06/24