Islands and Integrals

Processes of Diversification in an Island Archipelago and Bayesian Methods of Comparative Phylogeographical Model Choice

Jamie R. Oaks

Department of Ecology and Evolutionary Biology, University of Kansas

October 16, 2013

Islands and Integrals J. Oaks, University of Kansas 1/53

Southeast Asia


Philippine Archipelago


Climate-driven diversification model

- Repeated coalescence and fragmentation of island complexes
- Prominent paradigm for explaining Philippine biodiversity
- Proposed as model of diversification

Testing climate-driven diversification

Did repeated fragmentation of islands during inter-glacial rises in sea level promote diversification?

Model has testable prediction:

- Temporally clustered divergences among taxa co-distributed across fragmented islands

Climate-driven model: Prediction

[Figure: sea level (m; 0 to -100) over the past 500 thousand years (kya), annotated with the divergence times T1 to T5 of five taxon pairs and two divergence events, τ1 and τ2.]

Divergence model choice

[Figure: the sea-level curve (m) over the past 500 kya, annotated with divergence times T1 to T5 and divergence events τ1, τ2 and, in the later examples, τ3.]

With five taxon pairs and two divergence events:

T = (T1, T2, T3, T4, T5)
τ = {τ1, τ2}
|τ| = 2

Examples:

T = (330, 330, 125, 125, 125), τ = {125, 330}, |τ| = 2
T = (330, 330, 125, 330, 125), τ = {125, 330}, |τ| = 2
T = (375, 330, 125, 330, 125), τ = {125, 330, 375}, |τ| = 3

With three divergence events:

T = (T1, T2, T3, T4, T5)
τ = {τ1, τ2, τ3}
|τ| = 3
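The mapping from a vector of divergence times T to the set of unique divergence events τ, as in the numerical examples above, can be sketched in a few lines of Python. The function name `divergence_model` is hypothetical and used only for illustration; it is not part of msBayes.

```python
def divergence_model(times):
    """Return (tau, n_events) for a vector of divergence times T.

    tau is the sorted list of unique divergence times and n_events is |tau|.
    """
    tau = sorted(set(times))
    return tau, len(tau)


# The three numerical examples from the slides (times in kya).
for T in [(330, 330, 125, 125, 125),
          (330, 330, 125, 330, 125),
          (375, 330, 125, 330, 125)]:
    tau, n_events = divergence_model(T)
    print(T, tau, n_events)
```

Note that the first two examples share the same τ and |τ| but assign taxon pairs to the events differently, so τ alone does not identify the divergence model.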


In general:

T = (T1, T2, ..., TY)
τ = {τ1, ..., τ|τ|}
|τ| = number of divergence events

- We want to infer T given DNA sequence alignments X:

\[ p(T \mid X) = \frac{p(X \mid T)\, p(T)}{p(X)} \]

- This approach is implemented in msBayes
- Not that simple

The msBayes model

[Figure: the sea-level curve (m) over the past 500 kya with divergence times T1 to T5 and events τ1 and τ2, then reduced to two taxon pairs (T1, T2) and one divergence event (τ2).]

X: Sequence alignments
G: Gene trees
T: Divergence times
Θ: Demographic parameters

Full Model:

\[ p(G, T, \Theta \mid X) = \frac{p(X \mid G, T, \Theta)\, p(G, T, \Theta)}{p(X)} \]

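Written out in full:

\[
\begin{aligned}
& p(G, T, \theta_A, \theta_{D1}, \theta_{D2}, \tau_B, \zeta_{D1}, \zeta_{D2}, m, \alpha, \upsilon \mid X, \phi, \rho, \nu) \\
&\quad = \frac{1}{p(X)}\, p(T)\, f(\alpha)
\left[ \prod_{i=1}^{Y} p(\theta_{A,i})\, p(\theta_{D1,i}, \theta_{D2,i})\, p(\tau_{B,i})\, p(\zeta_{D1,i})\, f(\zeta_{D2,i})\, p(m_i) \right. \\
&\qquad \left. \prod_{j=1}^{k_i} p(X_{i,j} \mid G_{i,j}, \phi_{i,j})\, p(G_{i,j} \mid T_i, \theta_{A,i}, \theta_{D1,i}, \theta_{D2,i}, \rho_{i,j}, \nu_{i,j}, \upsilon_j, \tau_{B,i}, \zeta_{D1,i}, \zeta_{D2,i}, m_i) \right]
\left[ \prod_{j=1}^{K} f(\upsilon_j \mid \alpha) \right]
\end{aligned}
\]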

Approximate Bayesian computation (ABC)

X → S∗ → Bε(S∗)

Approximate Model:

\[ p(G, T, \Theta \mid B_\epsilon(S^*)) = \frac{p(B_\epsilon(S^*) \mid G, T, \Theta)\, p(G, T, \Theta)}{p(B_\epsilon(S^*))} \]
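The general idea of ABC, replacing the data X with summary statistics S∗ and keeping only parameter draws whose simulated summaries fall within a distance ε of S∗, can be illustrated with a toy rejection sampler. This is a minimal sketch and not the msBayes implementation: the model (a normal mean with a uniform prior), the summary statistic (the sample mean), and every name in the code are assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary, epsilon, n_draws, rng):
    """Keep prior draws whose simulated summary lies within epsilon of the observed one."""
    s_obs = summary(observed)          # S*: summary of the observed data
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)     # draw parameter from the prior
        s_sim = summary(simulator(theta, rng))
        if abs(s_sim - s_obs) <= epsilon:   # inside the ball B_eps(S*)
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]   # toy "data", true mean 2.0

posterior_sample = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.0, 10.0),     # broad uniform prior (assumed)
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    summary=statistics.fmean,
    epsilon=0.1,
    n_draws=10000,
    rng=rng,
)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The accepted draws approximate the posterior given Bε(S∗) rather than given X; shrinking ε tightens the approximation at the cost of fewer accepted samples.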

T: Vector of divergence times across pairs of populations
|τ|: Number of divergence parameters
D_T: The variance of T

Species                         n1   n2

Mammals
Crocidura beatus                12   11
Crocidura negrina-panayensis    12    6
Hipposideros obscurus           19    9
Hipposideros pygmaeus            3   12
Cynopterus brachyotis           20    8
Cynopterus brachyotis            8   14
Haplonycteris fischeri          29    8
Haplonycteris fischeri           9   21
Macroglossus minimus            19    4
Macroglossus minimus             8   10
Ptenochirus jagori               4    7
Ptenochirus jagori               8    8
Ptenochirus minor               30    9

Squamates
Cyrtodactylus gubaot-sumuroi    29    6
Cyrtodactylus annulatus         14    3
Cyrtodactylus philippinicus      6   14
Gekko mindorensis                8   11
Insulasaurus arborens           22   10
Pinoyscincus jagori              8    8
Dendrelaphis marenae             6    6

Anurans
Limnonectes leytensis            4    2
Limnonectes magnus               2    3

Empirical results

Strong support for simultaneous divergence of all 22 taxon pairs

Posterior probability (pp) > 0.96

∼100,000–250,000 years ago

Simulation-based power analyses

What is “simultaneous”?

- Simulate datasets in which all 22 divergence times are random
  - τ ∼ U(0, 0.5 MGA)
  - τ ∼ U(0, 1.5 MGA)
  - τ ∼ U(0, 2.5 MGA)
  - τ ∼ U(0, 5.0 MGA)
  - MGA = Millions of Generations Ago
- Simulate 1000 datasets for each τ distribution
- Analyze all 4000 datasets as we did the empirical data
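The first step of this design, drawing 22 independent divergence times for each replicate under each of the four uniform distributions, might look like the sketch below. It covers only the divergence-time draws; simulating sequence data from those times and analyzing them is not shown. The constant and function names are hypothetical.

```python
import random

N_PAIRS = 22                              # taxon pairs per dataset
N_REPLICATES = 1000                       # datasets per tau distribution
UPPER_BOUNDS_MGA = (0.5, 1.5, 2.5, 5.0)   # upper limits of U(0, upper), in MGA


def simulate_divergence_times(upper, n_pairs, rng):
    """Draw one independent divergence time per taxon pair from U(0, upper)."""
    return [rng.uniform(0.0, upper) for _ in range(n_pairs)]


rng = random.Random(2013)
simulated = {
    upper: [simulate_divergence_times(upper, N_PAIRS, rng) for _ in range(N_REPLICATES)]
    for upper in UPPER_BOUNDS_MGA
}

total = sum(len(replicates) for replicates in simulated.values())
print(total)  # 4 distributions x 1000 replicates
```

Because every pair gets its own draw, each simulated dataset has 22 distinct divergence times, so a method that reports a single shared event for such data is failing to detect the variation.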

Simulation-based power analyses: Results

[Figure: histograms of the estimated number of divergence events (posterior mode; x-axis 1 to 21, y-axis density) for datasets simulated under τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA). In all four panels, p(|τ̂| = 1) = 1.0.]

[Figure: histograms of the posterior probability of one divergence (y-axis density) for the same four sets of simulated datasets.]

Strong support for highly clustered divergences when divergence times are random over 5 million generations

Our empirical results are likely spurious

Why the bias?

Potential causes of the bias:

1. The prior on divergence models

2. Broad uniform priors on many of the model’s parameters, including divergence times

Causes of bias: Prior on divergence models

Example: T = (375, 330, 125, 330, 125), τ = {125, 330, 375}, |τ| = 3

[Figure: the sea-level curve (m) over the past 500 kya, annotated with divergence times T1 to T5 and divergence events τ1, τ2, and τ3.]

- msBayes uses a discrete uniform prior on the number of divergence events, |τ|

[Figure: (A) the number of divergence models for each number of divergence events |τ| (x-axis 1 to 21); (B) the prior probability of each individual model, p(M_{|τ|,i}), for each |τ| (y-axis 0.00 to 0.04).]
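The shape of this prior can be reproduced with a short calculation. The sketch below assumes that a divergence model is an unordered assignment of the 22 taxon pairs to divergence events, so that the number of models with k events is the number of integer partitions of 22 into exactly k parts; that counting rule is an assumption made for this illustration. Under a discrete uniform prior on |τ|, each model with k events then receives prior probability 1 / (22 × number of models with k events).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_into_parts(n, k):
    """Number of ways to write n as a sum of exactly k positive integers, ignoring order."""
    if k == 0:
        return 1 if n == 0 else 0
    if n < k:
        return 0
    # Either the smallest part is 1 (remove it), or every part is >= 2 (subtract 1 from each).
    return partitions_into_parts(n - 1, k - 1) + partitions_into_parts(n - k, k)


def prior_per_model(n_pairs):
    """Rows of (|tau|, number of models, prior probability of each model)."""
    rows = []
    for k in range(1, n_pairs + 1):
        n_models = partitions_into_parts(n_pairs, k)
        rows.append((k, n_models, 1.0 / (n_pairs * n_models)))
    return rows


rows = prior_per_model(22)
for k, n_models, prob in rows:
    print(k, n_models, round(prob, 5))
```

Models with very few or very many events are rare, so each of them gets a large share of the prior mass, while models with intermediate |τ| split their share among many alternatives. This is the U-shaped prior over divergence models referred to later in the talk.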

Causes of bias: Broad priors

- msBayes uses uniform priors on most model parameters, including divergence times
- This requires the use of broad priors
- Models with more divergence-time parameters have much greater parameter space, much of it with low likelihood
- This vast space can cause problems with Bayesian model choice
  - Reduced marginal likelihoods

Causes of bias: Marginal likelihoods

\[ p(X) = \int_{\theta} p(X \mid \theta)\, p(\theta)\, d\theta \]

[Figure: the likelihood p(X | θ) and the prior p(θ) plotted as densities over θ from 0 to 1.]
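A numerical toy example makes the effect of a broad prior on the marginal likelihood concrete. The sketch below assumes a single parameter θ, a normal likelihood for one observation, and a uniform prior U(0, upper); none of these choices come from msBayes, and the function names are hypothetical. The integral is approximated with a midpoint rule.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_likelihood(x_obs, sd, prior_upper, n_grid=100_000):
    """Midpoint-rule approximation of p(X) = integral of p(X | theta) p(theta) d(theta),
    with theta ~ U(0, prior_upper)."""
    step = prior_upper / n_grid
    prior_density = 1.0 / prior_upper
    total = 0.0
    for i in range(n_grid):
        theta = (i + 0.5) * step
        total += normal_pdf(x_obs, theta, sd) * prior_density
    return total * step


# Same data and likelihood, increasingly broad uniform priors.
for upper in (1.0, 10.0, 100.0):
    print(upper, marginal_likelihood(0.5, 0.05, upper))
```

The likelihood is concentrated near θ = 0.5 in all three cases, so widening the prior only adds regions of negligible likelihood, and the marginal likelihood falls roughly in proportion to 1/upper.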

\[ p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)} \]

\[ p(X) = \int_{\theta} p(X \mid \theta)\, p(\theta)\, d\theta \]

For a particular model M1:

\[ p(\theta_1 \mid X, M_1) = \frac{p(X \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)}{p(X \mid M_1)} \]

\[ p(X \mid M_1) = \int_{\theta_1} p(X \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)\, d\theta_1 \]

\[ p(M_1 \mid X) = \frac{p(X \mid M_1)\, p(M_1)}{p(X \mid M_1)\, p(M_1) + p(X \mid M_2)\, p(M_2)} \]
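These formulas can be turned into a small worked example of how broad priors shift posterior model probabilities toward the model with fewer parameters. The sketch assumes a toy setting that is not the msBayes model: two observations with normal error, a model M1 in which both share one parameter, a model M2 in which each has its own parameter, the same U(0, upper) prior on every parameter, and equal prior model probabilities. All names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def integrate(f, lower, upper, n_grid=40_000):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    step = (upper - lower) / n_grid
    return sum(f(lower + (i + 0.5) * step) for i in range(n_grid)) * step


def posterior_prob_shared(x1, x2, sd, upper, prior_shared=0.5):
    """p(M1 | X), where M1 has one shared parameter and M2 has two separate ones."""
    prior = 1.0 / upper  # density of U(0, upper)
    # p(X | M1): integrate over the single shared parameter.
    m_shared = integrate(lambda t: normal_pdf(x1, t, sd) * normal_pdf(x2, t, sd) * prior, 0.0, upper)
    # p(X | M2): the two parameters are independent, so the double integral factorizes.
    m_separate = (integrate(lambda t: normal_pdf(x1, t, sd) * prior, 0.0, upper)
                  * integrate(lambda t: normal_pdf(x2, t, sd) * prior, 0.0, upper))
    numerator = m_shared * prior_shared
    return numerator / (numerator + m_separate * (1.0 - prior_shared))


# Observations three standard deviations apart, analyzed under ever broader priors.
for upper in (2.0, 20.0, 200.0):
    print(upper, posterior_prob_shared(1.0, 1.3, 0.1, upper))
```

With the narrow prior the two-parameter model is preferred, but as the prior widens, the extra parameter of M2 costs an additional factor of roughly 1/upper in its marginal likelihood, and the posterior probability of the shared-parameter model approaches one even though the data are unchanged.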

Predictions:

- Posterior estimates should be sensitive to priors
- As prior converges to distribution underlying the data, the bias should disappear

Testing prior sensitivity:

1. Analyze empirical data under several different prior settings
   - Results are very sensitive
2. Use simulations to assess behavior when priors are correct

Simulation results: Performance when priors are correct

[Figure: true probability of one divergence (y-axis, 0 to 1) plotted against the posterior probability of one divergence (x-axis, 0 to 1).]

msBayes performs well when all assumptions are met

Testing prior sensitivity (continued):

3. Use simulations to assess behavior under “ideal” real-world priors

Simulation results: Power with informed priors

[Figure: two rows of histograms of the estimated number of divergence events (posterior mode; x-axis 1 to 21, y-axis density) for datasets simulated under τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA). Top row: p(|τ̂| = 1) = 1.0 in all four panels. Bottom row, in panel order: p(|τ̂| = 1) = 1.0, 1.0, 0.997, and 0.473.]

[Figure: two rows of histograms of the posterior probability of one divergence (y-axis density) for the same four sets of simulated datasets.]

Causes of bias: Simulation results

Broad uniform priors are reducing marginal likelihoods of models with more divergence events

Even when uniform priors are informed by the data the bias remains

Potential solution:

More flexible priors

Mitigating the bias

Potential solution:

More flexible priors

[Figure: the likelihood p(X | θ) and the prior p(θ) plotted as densities over θ from 0 to 1.]

Potential solution:

Alternative prior over divergence models (e.g., uniform or Dirichlet process)

[Figure: (A) the number of divergence models for each number of divergence events |τ| (x-axis 1 to 21); (B) the prior probability of each individual model, p(M_{|τ|,i}), for each |τ|.]

New method: dpp-msbayes

- Reparameterized the model implemented in msBayes
- Replaced uniform priors on continuous parameters with gamma and beta distributions
- Dirichlet process prior (DPP) over all possible discrete divergence models
- Uniform prior over divergence models
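One common way to picture a Dirichlet process prior over groupings is the Chinese restaurant process: taxon pairs are assigned one at a time, joining an existing divergence event with probability proportional to the number of pairs already in it, or starting a new event with probability proportional to a concentration parameter α. The sketch below samples divergence models this way. It is an illustration of the general technique under assumed settings (the value of α, the function names), not the dpp-msbayes code.

```python
import random


def sample_crp_partition(n_pairs, alpha, rng):
    """Assign each taxon pair to a divergence event via the Chinese restaurant process."""
    assignments = []   # event index for each taxon pair
    counts = []        # number of pairs currently assigned to each event
    for _ in range(n_pairs):
        weights = counts + [alpha]          # existing events, then a new event
        r = rng.random() * sum(weights)
        cumulative = 0.0
        for k, w in enumerate(weights):
            cumulative += w
            if r < cumulative:
                break
        if k == len(counts):
            counts.append(1)                # open a new divergence event
        else:
            counts[k] += 1                  # join an existing event
        assignments.append(k)
    return assignments


def expected_num_events(n_pairs, alpha):
    """Prior expectation of the number of events under the process."""
    return sum(alpha / (alpha + i) for i in range(n_pairs))


rng = random.Random(42)
alpha = 1.5  # assumed concentration parameter, for illustration only
draws = [sample_crp_partition(22, alpha, rng) for _ in range(2000)]
mean_events = sum(len(set(d)) for d in draws) / len(draws)
print(mean_events, expected_num_events(22, alpha))
```

Smaller values of α put more prior mass on models with few shared events and larger values on models with many, so α (or a prior on it) controls how the prior is spread over divergence models.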

dpp-msbayes: Simulation-based assessment

Simulate 50,000 datasets under four models:

M_msBayes
- U-shaped prior on divergence models
- Uniform priors on continuous parameters

M_Ushaped
- U-shaped prior on divergence models
- Gamma priors on continuous parameters

M_Uniform
- Uniform prior on divergence models
- Gamma priors on continuous parameters

M_DPP
- DPP prior on divergence models
- Gamma priors on continuous parameters

Analyze all datasets under each of the models

Assess power

- Simulate datasets in which all 22 divergence times are random
  - τ ∼ U(0, 0.5 MGA)
  - τ ∼ U(0, 1.5 MGA)
  - τ ∼ U(0, 2.5 MGA)
  - τ ∼ U(0, 5.0 MGA)
  - MGA = Millions of Generations Ago
- Simulate 1000 datasets for each τ distribution
- Analyze all 4000 datasets as we did the empirical data

dpp-msbayes: Simulation results

[Figure: grid of plots of the true probability of one divergence (y-axis, 0 to 1) against the posterior probability of one divergence (x-axis, 0 to 1). Panels are arranged by analysis model and data model, with panel labels M_msBayes, M_DPP, M_Uniform, and M_Ushaped.]

[Figure: histograms of the estimated number of divergence events (posterior mode; x-axis 1 to 21, y-axis density) under M_msBayes and M_DPP for each simulated τ distribution. The values of p(|τ̂| = 1) shown in the panels are:]

Model        U(0, 0.5 MGA)   U(0, 1.5 MGA)   U(0, 2.5 MGA)   U(0, 5.0 MGA)
M_msBayes    1.0             1.0             0.999           0.83
M_DPP        0.926           0.605           0.187           0.003

[Figure: histograms of the posterior probability of one divergence (y-axis density) under M_msBayes and M_DPP for datasets simulated under τ ∼ U(0, 0.5 MGA), U(0, 1.5 MGA), U(0, 2.5 MGA), and U(0, 5.0 MGA).]

dpp-msbayes: Simulation results

0.0 0.02 0.04 0.06 0.08 0.1 0.120

50

100

150

200p(D̂T <0.01) =1.0

τ∼U(0, 0.5 MGA)

0.0 0.02 0.04 0.06 0.08 0.1 0.120

50

100

150

200p(D̂T <0.01) =0.999

τ∼U(0, 1.5 MGA)

0.0 0.02 0.04 0.06 0.08 0.1 0.120

50

100

150

200p(D̂T <0.01) =0.996

τ∼U(0, 2.5 MGA)

0.0 0.02 0.04 0.06 0.08 0.1 0.12020406080

100120140160180

p(D̂T <0.01) =0.637

τ∼U(0, 5.0 MGA)

MmsBayes

Estimated variance in divergence times (median)

Dens

ity

0.0 0.1 0.2 0.3 0.4 0.50

2

4

6

8

10p(D̂T <0.01) =0.002

0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.40.00.51.01.52.02.53.03.54.04.5

p(D̂T <0.01) =0.0

0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.40.0

0.5

1.0

1.5

2.0

2.5p(D̂T <0.01) =0.0

0.0 0.4 0.8 1.2 1.60.0

0.5

1.0

1.5

2.0

2.5

3.0p(D̂T <0.01) =0.0

MDPP

Estimated variance in divergence times (median)

Dens

ity

Islands and Integrals J. Oaks, University of Kansas 39/53

dpp-msbayes: Simulation results

[Figure: density of the estimated variance in divergence times (median), D̂_T, in one row of panels per model and one column per τ distribution. Each panel reports p(D̂_T < 0.01), listed here in panel order:]

Model        τ ∼ U(0, 0.5 MGA)   τ ∼ U(0, 1.5 MGA)   τ ∼ U(0, 2.5 MGA)   τ ∼ U(0, 5.0 MGA)
M_msBayes    1.0                 0.999               0.996               0.637
M_Ushaped    0.914               0.626               0.235               0.004
M_DPP        0.002               0.0                 0.0                 0.0
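The p(D̂T < 0.01) annotations in the panels above can be read as the share of simulation replicates whose estimated variance in divergence times falls below 0.01. The sketch below shows that calculation under this reading; the function name and the example estimates are hypothetical and are not taken from the simulations.

```python
def proportion_below(estimates, threshold=0.01):
    """Return the fraction of estimates strictly below the threshold."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return sum(1 for x in estimates if x < threshold) / len(estimates)


if __name__ == "__main__":
    # Hypothetical per-replicate medians of the variance in divergence times.
    example_estimates = [0.001, 0.02, 0.005, 0.5]
    print(proportion_below(example_estimates))
```

A value near 1 means that almost every replicate was estimated to have essentially no variance in divergence times, that is, a single shared divergence event.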

dpp-msbayes: Simulation results

• Results confirm the bias of msBayes was caused by
    1. Broad uniform priors
    2. U-shaped prior on divergence models
• The new model shows improved model-choice accuracy, power, and robustness

Testing climate-driven diversification

Did repeated fragmentation of islands during inter-glacial rises in sea level promote diversification?

Species                         n1   n2

Mammals
Crocidura beatus                12   11
Crocidura negrina-panayensis    12    6
Hipposideros obscurus           19    9
Hipposideros pygmaeus            3   12
Cynopterus brachyotis           20    8
Cynopterus brachyotis            8   14
Haplonycteris fischeri          29    8
Haplonycteris fischeri           9   21
Macroglossus minimus            19    4
Macroglossus minimus             8   10
Ptenochirus jagori               4    7
Ptenochirus jagori               8    8
Ptenochirus minor               30    9

Squamates
Cyrtodactylus gubaot-sumuroi    29    6
Cyrtodactylus annulatus         14    3
Cyrtodactylus philippinicus      6   14
Gekko mindorensis                8   11
Insulasaurus arborens           22   10
Pinoyscincus jagori              8    8
Dendrelaphis marenae             6    6

Anurans
Limnonectes leytensis            4    2
Limnonectes magnus               2    3

dpp-msbayes: Philippine diversification

[Figure: posterior probability (y-axis, 0.0 to 0.5) of the number of divergence events (x-axis), in two panels: msBayes and dpp-msbayes.]

dpp-msbayes: Philippine diversification

[Figure: prior and posterior probability of the number of divergence events, in two panels (Prior and Posterior), above a plot of sea level (m, 0 to -100) against time (0 to 500 kya).]
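The shape of a prior on the number of divergence events, like the one in the Prior panel above, can be worked out directly for a Dirichlet process. A minimal sketch, assuming a fixed concentration parameter: for n taxon pairs, P(K = k) = c(n, k) α^k / [α(α + 1)...(α + n − 1)], where c(n, k) is an unsigned Stirling number of the first kind. The value α = 1.5 and the function name are illustrative assumptions, not settings from the analysis; only the count of 22 taxon pairs comes from the species table.

```python
from fractions import Fraction


def dpp_prior_on_num_events(n_pairs, alpha):
    """Return [P(K = 1), ..., P(K = n_pairs)] for a Dirichlet process prior."""
    # c[k] holds the unsigned Stirling number of the first kind c(n, k).
    c = [0, 1]  # n = 1
    for n in range(1, n_pairs):
        nxt = [0] * (n + 2)
        for k in range(1, n + 2):
            same_k = c[k] if k <= n else 0
            nxt[k] = n * same_k + c[k - 1]  # c(n+1, k) = n*c(n, k) + c(n, k-1)
        c = nxt
    a = Fraction(alpha)
    # Normalizing constant: alpha * (alpha + 1) * ... * (alpha + n_pairs - 1).
    norm = Fraction(1)
    for i in range(n_pairs):
        norm *= a + i
    return [float(c[k] * a**k / norm) for k in range(1, n_pairs + 1)]


if __name__ == "__main__":
    # alpha = 1.5 is an arbitrary illustrative choice.
    prior = dpp_prior_on_num_events(n_pairs=22, alpha=1.5)
    for k, p in enumerate(prior, start=1):
        print(k, round(p, 4))
```

Smaller values of α shift prior mass toward few shared divergence events, and larger values shift it toward many independent ones.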

Conclusions

• Our new approximate-Bayesian method of phylogeographical model choice shows improved behavior
    • Improved accuracy, robustness, and power
    • More “honest” estimates regarding uncertainty
• Philippine climate-driven diversification model?
    • Results consistent with prediction of clustered divergences
    • Results suggest multiple co-divergences
    • However, there is a lot of uncertainty


Future directions: Full-Bayesian phylogenetic framework

[Figure: tree diagram labeled T1 to T5 and τ1 to τ3, above a plot of sea level (m, 0 to -100) against time (0 to 500 kya).]


Software

Everything is on GitHub...

• dpp-msbayes: https://github.com/joaks1/dpp-msbayes
• PyMsBayes: https://github.com/joaks1/PyMsBayes
• ABACUS: Approximate BAyesian C UtilitieS. https://github.com/joaks1/abacus

Open Notebook Science

Everything is on GitHub...

• msbayes-experiments: https://github.com/joaks1/msbayes-experiments
• [email protected]

Acknowledgments

Ideas and feedback:
• KU Herpetology
• Holder Lab
• Melissa Callahan
• Mike Hickerson
• Laura Kubatko
• My committee

Computation:
• KU ITTC
• KU Computing Center
• iPlant

Funding:
• NSF
• KU Grad Studies, EEB & BI
• SSB
• Sigma Xi

Photo credits:
• Rafe Brown, Cam Siler, & Jake Esselstyn
• FMNH Philippine Mammal Website:
    • D.S. Balete, M.R.M. Duya, & J. Holden

Acknowledgments

Friends & Family


Questions?

Gene tree divergences

[Figure: gene-tree divergence age (x-axis: Age, mybp) for each split (y-axis: Split, given as Taxon: Island 1−Island 2). Splits shown:]

Crocidura beatus: Leyte−Samar

Crocidura negrina−panayensis: Negros−Panay

Cynopterus brachyotis: Biliran−Mindanao

Cynopterus brachyotis: Negros−Panay

Cyrtodactylus annulatus: Bohol−Mindanao

Cyrtodactylus gubaot−sumuroi: Leyte−Samar

Cyrtodactylus philippinicus: Negros−Panay

Dendrelaphis marenae: Negros−Panay

Gekko mindorensis: Negros−Panay

Haplonycteris fischeri: Biliran−Mindanao

Haplonycteris fischeri: Negros−Panay

Hipposideros obscurus: Leyte−Mindanao

Hipposideros pygmaeus: Bohol−Mindanao

Limnonectes leytensis: Bohol−Mindanao

Limnonectes magnus: Bohol−Mindanao

Macroglossus minimus: Biliran−Mindanao

Macroglossus minimus: Negros−Panay

Ptenochirus jagori: Leyte−Mindanao

Ptenochirus jagori: Negros−Panay

Ptenochirus minor: Biliran−Mindanao

Insulasaurus arborens: Negros−Panay

Pinoyscincus jagori: Mindanao−Samar


Causes of bias: Insufficient sampling

• Models with more parameter space are less densely sampled
    • Could explain bias toward small models in extreme cases
    • Predicts large variance in posterior estimates
• We explored empirical and simulation-based analyses with 2, 5, and 10 million prior samples, and estimates were very similar

[Figure: 95% HPD of D_T (y-axis) against the number of prior samples (x-axis, scaled by 1e8). Panel A: unadjusted; panel B: GLM-adjusted.]
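One way to check whether prior sampling is dense enough is to repeat a rejection step with increasing numbers of prior samples and see whether the estimate stabilizes. The toy sketch below does this for a one-parameter model. It is a generic illustration of rejection-based approximate Bayesian computation, not the msBayes or dpp-msbayes sampler; the uniform prior, the normal simulator, the sample sizes, and every name are assumptions made for the example.

```python
import random
import statistics


def abc_rejection(observed, n_prior_samples, n_accept, seed=0):
    """Keep the n_accept prior draws whose simulated statistic is closest to observed."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_prior_samples):
        theta = rng.uniform(0.0, 10.0)  # draw the parameter from a uniform prior
        stat = rng.gauss(theta, 1.0)    # simulate a summary statistic given theta
        draws.append((abs(stat - observed), theta))
    draws.sort(key=lambda pair: pair[0])  # smallest distance first
    return [theta for _, theta in draws[:n_accept]]


if __name__ == "__main__":
    # Fixed number of retained samples; growing number of prior samples.
    for n in (2_000, 5_000, 10_000):
        posterior = abc_rejection(observed=3.0, n_prior_samples=n, n_accept=200)
        print(n, round(statistics.mean(posterior), 3))
```

If the printed means change little as the number of prior samples grows, sparse sampling is unlikely to be driving the estimate, which is the logic of the comparison across 2, 5, and 10 million prior samples.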

Geological history
