A unified framework to combine disparate data types in species distribution modelling

Slides on Slideshare: http://www.slideshare.net/oharar/gf-o2014talk

Bob O’Hara¹, Petr Keil², Walter Jetz²

¹ BiK-F, Biodiversity and Climate Change Research Centre, Frankfurt am Main, Germany

Twitter: @bobohara

² Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA


A "Real" Curve

[Figure: a smooth curve; x-axis ticks 0 to 100, y-axis ticks 0 to 80. Legend: Curve]


Approximated with a Discretised Curve

[Figure: the same curve with a discretised approximation; x-axis ticks 0 to 100, y-axis ticks 0 to 80. Legend: Curve, Discrete]


Better: linear interpolation

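[Figure: the curve with its discretised and linearly interpolated approximations; x-axis ticks 0 to 100, y-axis ticks 0 to 80. Legend: Curve, Discrete, Interpolated]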


With more points, the approximations improve

[Figure: the same comparison using more points; x-axis ticks 0 to 100, y-axis ticks 0 to 80. Legend: Curve, Discrete, Interpolated]
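The point of these four slides can be checked numerically. The sketch below is only an illustration: `curve_fn` is a made-up smooth function standing in for the "real" curve (the talk does not say what was plotted), and the knot counts are arbitrary. It compares the worst-case error of a step (discretised) approximation with that of linear interpolation, for few and for many points.

```r
# Hypothetical smooth curve standing in for the "real" curve on the slides
curve_fn <- function(x) 40 + 30 * sin(x / 15)

# Maximum absolute error of two approximations built from n knots on [0, 100]
approx_error <- function(n) {
  knots <- seq(0, 100, length.out = n)
  fine  <- seq(0, 100, length.out = 2001)   # dense grid for measuring error
  truth <- curve_fn(fine)
  # Discretised: piecewise-constant steps holding the value at the left knot
  step   <- approx(knots, curve_fn(knots), xout = fine, method = "constant")$y
  # Interpolated: straight lines between neighbouring knots
  linear <- approx(knots, curve_fn(knots), xout = fine, method = "linear")$y
  c(discrete = max(abs(step - truth)), interpolated = max(abs(linear - truth)))
}

err_few  <- approx_error(6)
err_many <- approx_error(21)
print(rbind("6 knots" = err_few, "21 knots" = err_many))
```

For this particular curve, interpolation beats the step function at both resolutions, and both improve as knots are added.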


What does this have to do with distribution models?


This is how SDMs see the world:

source: http://bit.ly/1l8sG7M

Map produced by Peter Blancher, Science and Technology Branch, Environment Canada, based on data from the North American Breeding Bird Survey


Problems: scale, within-grid heterogeneity


Let’s sidestep the whole problem

Work in continuous space instead

The maths will let us work on different scales

- e.g. Renner & Warton (2013) doi:10.1111/j.1541-0420.2012.01824.x

Lets us deal with points & irregular shapes

Makes it straightforward to include different sorts of data


Motivation

Map Of Life

www.mol.org/

Different data sources

- GBIF

- expert range maps

- eBird and similar citizen science efforts

- organised surveys (BBS, BMSs)

- Regional checklists


A Unified Model

There is a single state - density of the species

[Diagram: "Actual State" at the top, with arrows down to three observed data types: Presence/Absence, Presence Only, Expert Range Maps]


Point Processes: Model

Each point in space, ξ, has an intensity, ρ(ξ)

$$\log(\rho(\xi)) = \eta(\xi) = \sum \beta X(\xi) + \nu(\xi)$$

The number of individuals in an area A follows a Poisson distribution with mean

$$\lambda(A) = \int_A \rho(\xi)\, d\xi$$


Point Processes: Reality

Approximate λ(A) numerically: select some integration points, and sum over those

$$\lambda(A) \approx \sum_{s=1}^{N} |A(s)|\, e^{\eta(s)}$$
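A minimal sketch of this approximation, under stated assumptions: the region A is the unit square, the linear predictor is a made-up `eta` with illustrative coefficients `beta0` and `beta1` and no random field, and the integration points are the centres of a regular grid. None of this is the mesh used in the talk. The example was chosen because the integral has a closed form to compare against.

```r
# Hypothetical linear predictor on the unit square A = [0,1] x [0,1];
# beta0 and beta1 are illustrative values, and nu(xi) is set to zero
beta0 <- 1
beta1 <- 2
eta <- function(x, y) beta0 + beta1 * x

# Integration points: centres of an n x n grid of cells,
# each with area |A(s)| = 1 / n^2
lambda_approx <- function(n) {
  centres <- (seq_len(n) - 0.5) / n
  pts <- expand.grid(x = centres, y = centres)
  cell_area <- 1 / n^2
  sum(cell_area * exp(eta(pts$x, pts$y)))
}

# Closed-form integral of exp(beta0 + beta1 * x) over the unit square
lambda_exact <- exp(beta0) * (exp(beta1) - 1) / beta1

for (n in c(2, 10, 50)) {
  cat(sprintf("N = %4d points: lambda approx = %.4f (exact %.4f)\n",
              n^2, lambda_approx(n), lambda_exact))
}
```

As with the curve at the start of the talk, more integration points give a better approximation.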


Observation Models

Presence only points: thinned point process

Abundance: Poisson

Presence/Absence: binomial, cloglog, with

$$\mu_A(A, t) = \eta(A) + \log(|A|) + \log(t) + \log(p)$$

(large) areas:

$$\Pr(n(A) > 0) = 1 - \exp\left(-\int_A e^{\eta(\xi)}\, d\xi\right)$$

Expert range: use distance to range as a covariate
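Why cloglog? If the count in A is Poisson with mean λ(A), the probability of at least one individual is 1 − exp(−λ(A)), and the cloglog link turns that probability back into log λ(A). The sketch below checks this for the simplest case of a constant intensity across A; the values of `eta` and `area` are illustrative assumptions, not numbers from the talk, and the log(t) and log(p) terms are left out.

```r
# Illustrative values (assumptions, not from the talk)
eta  <- -1.2          # linear predictor: log intensity per unit area
area <- 3.5           # |A|, area of the surveyed unit

# Expected count in A when the intensity is constant across A
lambda <- area * exp(eta)

# Poisson probability of at least one individual in A
p_present <- 1 - exp(-lambda)

# The cloglog link maps that probability back to log(lambda) = eta + log|A|
cloglog <- binomial(link = "cloglog")$linkfun
print(c(cloglog = cloglog(p_present), eta_plus_log_area = eta + log(area)))
```

This is why log(|A|) enters the presence/absence model as an offset on the cloglog scale.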


Put these together

Data likelihoods: $P(X_i \mid \lambda)$ for data $X_i$. Total likelihood is

$$P(X) = \prod_i P(X_i \mid \lambda)\, P(\lambda)$$

where $P(\lambda)$ is the actual distribution model, and will depend on environmental and other covariates
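On the log scale the product over data sets becomes a sum, which is what makes combining data types straightforward. The following is a rough sketch, not the model from the talk: the data are simulated, there is a single made-up covariate, no spatial field, and the prior is dropped so the fit is plain maximum likelihood via `optim`. All names (`beta_true`, `loglik_count`, `loglik_pa`, and so on) are hypothetical.

```r
set.seed(1)

# Simulated stand-in data (hypothetical): one covariate, two data types
beta_true <- c(-0.5, 0.8)
x_count <- runif(100); area_count <- rep(2, 100)    # abundance survey units
x_pa    <- runif(150); area_pa    <- rep(1, 150)    # presence/absence units
counts  <- rpois(100, area_count * exp(beta_true[1] + beta_true[2] * x_count))
present <- rbinom(150, 1,
                  1 - exp(-area_pa * exp(beta_true[1] + beta_true[2] * x_pa)))

# Log-likelihood of the abundance data: Poisson with mean |A| * exp(eta)
loglik_count <- function(beta) {
  sum(dpois(counts, area_count * exp(beta[1] + beta[2] * x_count), log = TRUE))
}

# Log-likelihood of the presence/absence data: Bernoulli with cloglog link
loglik_pa <- function(beta) {
  p <- 1 - exp(-area_pa * exp(beta[1] + beta[2] * x_pa))
  sum(dbinom(present, 1, p, log = TRUE))
}

# The product over data sets becomes a sum on the log scale;
# both terms share the same coefficients
loglik_joint <- function(beta) loglik_count(beta) + loglik_pa(beta)

fit <- optim(c(0, 0), function(b) -loglik_joint(b), method = "BFGS")
print(round(fit$par, 2))
```

Both data types inform the same two coefficients, which is the sense in which there is a single underlying state.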


In practice

Be Bayesian. Could use MCMC, but this is quicker in INLA

library(INLA)

# SolTim.formula and stk.all are built elsewhere (not shown on the slide)
SolTim.res <- inla(SolTim.formula,
                   family = c('poisson', 'binomial'),   # one likelihood per data type
                   data = inla.stack.data(stk.all),
                   control.family = list(list(link = "log"),
                                         list(link = "cloglog")),
                   control.predictor = list(A = inla.stack.A(stk.all)),
                   Ntrials = 1, E = inla.stack.data(stk.all)$e, verbose = FALSE)
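With two families, INLA expects the response to have one column per likelihood, with `NA` wherever an observation belongs to the other likelihood. The slide does not show how `stk.all` was assembled, so the sketch below is an assumption about the general layout, in base R only, with made-up values and hypothetical names `y_pois`, `y_binom` and `resp`.

```r
# Hypothetical toy responses for the two likelihoods
y_pois  <- c(0, 2, 1)      # counts at integration points or survey units
y_binom <- c(1, 0)         # presence (1) / absence (0) in parks

# One column per family, in the same order as family = c('poisson', 'binomial');
# each row is observed under exactly one likelihood and is NA under the other
resp <- rbind(cbind(y_pois, NA),
              cbind(NA, y_binom))
colnames(resp) <- c("poisson", "binomial")
print(resp)
```

In the call above, `E` would then supply the Poisson exposures and `Ntrials = 1` makes each binomial row a single presence/absence trial.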


The Solitary Tinamou

Photo credit: Francesco Veronesi on Flickr (https://www.flickr.com/photos/francesco veronesi/12797666343)


Data

[Map legend: Whole Region; Expert range; Park, absent; Park, present; eBird; GBIF]

- expert range

- 2 point processes (49 points)

- 28 parks


A Fitted Model

              mean    sd
Intercept    -0.03  0.02
b.eBird       1.54  0.39
b.GBIF        1.54  0.24
Forest        0.00  0.01
NPP          -0.01  0.01
Altitude     -0.01  0.01
DistToRange  -0.01  0.00


Predicted Distribution

[Figure: two maps. Posterior Mean (colour scale from −0.10 to −0.02) and Posterior Standard Deviation (colour scale from 0.01 to 0.06)]


Individual Data Types

[Figure: four map panels: eBird, GBIF, Parks, Expert Range]


Join the bandwagon!

Using continuous space - makes life easier

In practice, use INLA (but I need to tidy up the code)


Not the final answer...

http://www.gocomics.com/nonsequitur/2014/06/24