Introduction Projected tree Posterior inference Real data
Projected Pólya Tree
Luis E. Nieto Barajas
(joint with G. Núñez-Antonio)
Department of Statistics, ITAM
BNP Conference – June 27, 2019
Luis E. Nieto Barajas Projected Pólya Tree BNP Conference – June 27, 2019 1 / 19
Contents
1 Introduction
2 Projected tree
3 Posterior inference
4 Real data
Setting
Directional data (Mardia, 1972): unit vectors in k-dimensional space, i.e. k − 1
angles.
Examples: wind directions, orientation data, directions of bird migration,
mammalian activity patterns in ecological reserves, etc.
The most common case is k = 2, producing circular data.
One of the simplest ways to generate distributions on $S^k$ is to radially project
distributions originally defined on $\mathbb{R}^k$,
e.g. the projected normal distribution (Núñez-Antonio & Gutiérrez-Peña, 2005).
Aim: project a bivariate Pólya tree to the unit circle and study its properties.
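Radial projection in the circular case (k = 2) can be illustrated with a minimal sketch. It assumes a bivariate normal with identity covariance as the distribution on the plane; the function names and the default mean are hypothetical and chosen only for illustration.

```python
import math
import random

def project_to_circle(x1, x2):
    """Radially project a point of the plane (not the origin) to an angle in [0, 2*pi)."""
    return math.atan2(x2, x1) % (2.0 * math.pi)

def sample_projected_normal(n, mu=(0.0, 1.0), seed=0):
    """Draw n angles by projecting N2(mu, I) samples onto the unit circle."""
    rng = random.Random(seed)
    return [
        project_to_circle(rng.gauss(mu[0], 1.0), rng.gauss(mu[1], 1.0))
        for _ in range(n)
    ]

angles = sample_projected_normal(5)
```

Only the direction of each sampled point is kept; its length is discarded by the projection.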
Notation
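Bivariate PT: $F$ on $(\mathbb{R}^2, \mathcal{B}^2)$ has a bivariate PT prior with parameters $(\Pi, \mathcal{A})$, where
$\Pi = \{B_{m,j,k}\}$, $B_{m,j,k} = B_{m,j} \times B_{m,k}$ and $\mathcal{A} = \{\alpha_{m,j,k}\}$, $j, k = 1, \ldots, 2^m$, $m = 1, 2, \ldots$,
if there exist random vectors $Y_{m,j,k} = (Y_{m+1,2j-1,2k-1}, Y_{m+1,2j-1,2k}, Y_{m+1,2j,2k-1}, Y_{m+1,2j,2k})$ such that
1 $Y_{m,j,k}$ are independent,
2 $Y_{m,j,k} \sim \mathrm{Dir}(\alpha_{m,j,k})$,
3 for every $m = 1, 2, \ldots$ and every $j, k = 1, \ldots, 2^m$,
\[
F(B_{m,j,k}) = \prod_{l=1}^{m} Y_{m-l+1,\; j^{(m,j,k)}_{m-l+1},\; k^{(m,j,k)}_{m-l+1}},
\]
where $j^{(m,j,k)}_{l-1} = \left\lceil j^{(m,j,k)}_{l} / 2 \right\rceil$ and $k^{(m,j,k)}_{l-1} = \left\lceil k^{(m,j,k)}_{l} / 2 \right\rceil$ are recursive decreasing
formulae, whose initial values are $j^{(m,j,k)}_{m} = j$ and $k^{(m,j,k)}_{m} = k$.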
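The recursive index formulae can be traced with a short sketch. The function name `ancestor_path` is hypothetical; it lists, for a set at level m with indices (j, k), the indices of the nested sets at levels 1 to m whose branch probabilities enter the product.

```python
import math

def ancestor_path(m, j, k):
    """Return [(level, j_level, k_level)] for levels 1..m.

    Starts from (j, k) at level m and applies j_{l-1} = ceil(j_l / 2),
    k_{l-1} = ceil(k_l / 2) while moving down to level 1.
    """
    path = []
    jl, kl = j, k
    for level in range(m, 0, -1):
        path.append((level, jl, kl))
        jl = math.ceil(jl / 2)
        kl = math.ceil(kl / 2)
    return path[::-1]  # order from level 1 up to level m

print(ancestor_path(3, 5, 8))  # [(1, 2, 2), (2, 3, 4), (3, 5, 8)]
```

For example, the set with (m, j, k) = (3, 5, 8) sits inside the level-2 set (3, 4), which sits inside the level-1 set (2, 2).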
Notation
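Centring: $E\{F(B_{m,j,k})\} = F_0(B_{m,j,k}) = 1/4^m$ if we proceed as follows.
Since $B_{m,j,k} = B_{m,j} \times B_{m,k}$, we take $F_0(x_1, x_2) = F_{10}(x_1)\,F_{20}(x_2)$.
Match the partition with the dyadic quantiles of the marginals:
\[
B_{m,j} = \left( F_{10}^{-1}\!\left(\frac{j-1}{2^m}\right),\; F_{10}^{-1}\!\left(\frac{j}{2^m}\right) \right]
\quad\text{and}\quad
B_{m,k} = \left( F_{20}^{-1}\!\left(\frac{k-1}{2^m}\right),\; F_{20}^{-1}\!\left(\frac{k}{2^m}\right) \right].
\]
Define $\alpha_{m,j,k} = (\alpha\rho(m+1), \ldots, \alpha\rho(m+1))$, where $\alpha > 0$ is the precision parameter and $\rho(m) = m^{\delta}$ with $\delta > 1$ defines an absolutely continuous tree.
Finite tree: stop partitioning the space at a finite level $M$.
At the lowest level $M$, we spread probability according to $f_0$.
Bivariate density at $x = (x_1, x_2) \in \mathbb{R}^2$:
\[
f(x) = \left\{ \prod_{m=1}^{M} Y_{m,\; j^{(x_1)}_{m},\; k^{(x_2)}_{m}} \right\} 4^{M} f_0(x).
\]
Denote a finite bivariate PT as $PT_M(\alpha, \rho, F_0)$.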
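A minimal sketch of drawing a finite bivariate Pólya tree and evaluating its density is given here. It rests on assumptions not stated on the slide: the centring marginals are taken as unit-variance normals, the four branch probabilities entering level m all get the Dirichlet parameter α m^δ, and every function name is hypothetical.

```python
import math
import random

def norm_cdf(x, mu=0.0):
    """CDF of N(mu, 1)."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def norm_pdf(x, mu=0.0):
    """Density of N(mu, 1)."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def dyadic_index(u, m):
    """Index j in 1..2^m of the dyadic interval ((j-1)/2^m, j/2^m] containing u."""
    return min(max(math.ceil(u * 2 ** m), 1), 2 ** m)

def sample_tree(M, alpha=1.0, delta=1.1, rng=None):
    """Draw branch probabilities Y[(m, j, k)] for levels m = 1..M."""
    rng = rng or random.Random(0)
    Y = {}
    for m in range(1, M + 1):
        a = alpha * m ** delta  # assumed Dirichlet parameter at level m
        for pj in range(1, 2 ** (m - 1) + 1):
            for pk in range(1, 2 ** (m - 1) + 1):
                # A Dirichlet draw is a vector of normalised gamma variates.
                g = [rng.gammavariate(a, 1.0) for _ in range(4)]
                s = sum(g)
                children = [(2 * pj - 1, 2 * pk - 1), (2 * pj - 1, 2 * pk),
                            (2 * pj, 2 * pk - 1), (2 * pj, 2 * pk)]
                for (j, k), gi in zip(children, g):
                    Y[(m, j, k)] = gi / s
    return Y

def pt_density(x1, x2, Y, M, mu=(0.0, 0.0)):
    """Evaluate f(x) = {prod_m Y_{m, j_m(x1), k_m(x2)}} * 4^M * f0(x)."""
    u1, u2 = norm_cdf(x1, mu[0]), norm_cdf(x2, mu[1])
    prod = 1.0
    for m in range(1, M + 1):
        prod *= Y[(m, dyadic_index(u1, m), dyadic_index(u2, m))]
    return prod * 4 ** M * norm_pdf(x1, mu[0]) * norm_pdf(x2, mu[1])

Y = sample_tree(M=3)
value = pt_density(0.3, -0.7, Y, M=3)
```

When every branch probability equals 1/4, the product cancels the factor 4^M and the density reduces to f0, which is the centring property.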
Definition
Let $X = (X_1, X_2)$ s.t. $X \mid f \sim f$ and $f \sim PT_M(\alpha, \rho, F_0)$.
We project $X$ to the unit circle by using polar coordinates $(X_1, X_2) \to (\Theta, R)$,
where $\Theta$ is the angle and $R = \|X\|$ is the resultant.
The inverse transformation becomes $X_1 = R\cos\Theta$ and $X_2 = R\sin\Theta$. Thus, the Jacobian
is $J = R$.
The projected Pólya tree, denoted by $PPT_M(\alpha, \rho, f_0)$, has density
\[
f(\theta) = \int_0^{\infty} \left\{ \prod_{m=1}^{M} Y_{m,\; j^{(r\cos\theta)}_{m},\; k^{(r\sin\theta)}_{m}} \right\} 4^{M} f_0(r\cos\theta, r\sin\theta)\, r \, dr .
\]
Properties
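Smooth: at the boundaries of the partitions. The marginalisation over $r$ can also be seen
as a mixture
\[
f(\theta) = \int f(\theta \mid r)\, f(r)\, dr .
\]
Easily centred: say, on the projected normal, by taking $f_0(x) = N_2(x \mid \mu, I)$, a
bivariate normal density with mean $\mu = (\mu_1, \mu_2)$ and variance-covariance matrix $I$. The projected Pólya tree
becomes
\[
f(\theta) = \int_0^{\infty} \left\{ \prod_{m=1}^{M} Y_{m,\; j^{(r\cos\theta)}_{m},\; k^{(r\sin\theta)}_{m}} \right\} 4^{M} (2\pi)^{-1} e^{-\frac{1}{2}\mu'\mu}\, r
\exp\left[ -\frac{1}{2}\left\{ r^2 - 2r(\mu_1\cos\theta + \mu_2\sin\theta) \right\} \right] I_{(0,2\pi]}(\theta)\, dr .
\]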
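The exponent in the centred density follows from writing the bivariate normal kernel in polar coordinates. A short check, using only the symbols already defined:

```latex
\begin{aligned}
(x-\mu)'(x-\mu)
  &= (r\cos\theta-\mu_1)^2 + (r\sin\theta-\mu_2)^2 \\
  &= r^2 - 2r(\mu_1\cos\theta+\mu_2\sin\theta) + \mu'\mu, \\
N_2(x \mid \mu, I)
  &= (2\pi)^{-1}\, e^{-\frac{1}{2}\mu'\mu}
     \exp\left[-\tfrac{1}{2}\left\{r^2 - 2r(\mu_1\cos\theta+\mu_2\sin\theta)\right\}\right].
\end{aligned}
```

Multiplying by the Jacobian $r$ and by the tree factor $\{\prod_m Y\}\,4^M$ gives the integrand.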
Properties
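Numerically approximated: there is no analytical expression. Use Monte Carlo or
quadrature. If $0 = r^{(0)} < r^{(1)} < \cdots < r^{(L)} < \infty$ is a partition, then
\[
f(\theta) \approx \sum_{l=1}^{L} f\!\left(r^{(l)}\cos\theta,\; r^{(l)}\sin\theta\right) |J| \left( r^{(l)} - r^{(l-1)} \right),
\]
with $|J| = r^{(l)}$.
Moments: circular densities are periodic, $f(\theta + 2\pi) = f(\theta)$, and so require trigonometric
moments $\varphi_p = E(e^{ip\Theta}) = a_p + i b_p$, where $a_p = E(\cos p\Theta)$ and $b_p = E(\sin p\Theta)$.
The mean direction and the mean resultant length are
\[
\nu_\theta = \arctan(b_1 / a_1), \qquad \varrho_\theta = \sqrt{a_1^2 + b_1^2},
\]
where $\varrho_\theta \in [0, 1]$.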
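The quadrature rule and the first trigonometric moment can be sketched as follows. To stay short, the sketch evaluates only the centring density (a projected normal, with the tree product left out), uses an equally spaced partition truncated at a hypothetical `r_max`, and replaces arctan(b1/a1) by `atan2` so that the quadrant is resolved; all names and defaults are assumptions.

```python
import math

def projected_normal_density(theta, mu=(0.0, 1.0), r_max=10.0, L=400):
    """Riemann-sum approximation of the integral over r of N2(x | mu, I) * r."""
    h = r_max / L
    total = 0.0
    for l in range(1, L + 1):
        r = l * h  # partition point r^(l); the Jacobian |J| equals r
        x1, x2 = r * math.cos(theta), r * math.sin(theta)
        f = math.exp(-0.5 * ((x1 - mu[0]) ** 2 + (x2 - mu[1]) ** 2)) / (2.0 * math.pi)
        total += f * r * h
    return total

def first_trig_moment(density, n_grid=720):
    """Return (a1, b1, nu, rho) using a midpoint rule over (0, 2*pi]."""
    h = 2.0 * math.pi / n_grid
    a1 = b1 = 0.0
    for i in range(n_grid):
        t = (i + 0.5) * h
        w = density(t) * h
        a1 += w * math.cos(t)
        b1 += w * math.sin(t)
    nu = math.atan2(b1, a1) % (2.0 * math.pi)  # mean direction
    rho = math.hypot(a1, b1)                   # mean resultant length
    return a1, b1, nu, rho

a1, b1, nu, rho = first_trig_moment(projected_normal_density)
```

With µ = (0, 1) the density is symmetric about π/2, so the computed mean direction should be close to π/2.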
Properties
Posterior consistency: let $f \sim PPT(\alpha, \rho, f_0)$ and let $f_*(\theta)$ be an arbitrary density s.t. $KL(f_*, f_0) < \infty$.
Then, if $\sum_{m=1}^{\infty} \rho(m)^{-1/2} < \infty$, as $n \to \infty$ $f$ achieves weak
posterior consistency. In particular, if $\rho(m) = m^{\delta}$, we need $\delta > 2$.
Examples: $M = 4$, $\alpha = 1$, $\rho(m) = m^{\delta}$ with $\delta = 1.1$, and different values of $\mu$.
Used $L = 100$ points.
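The requirement on δ is the usual p-series condition applied to the summability criterion:

```latex
\sum_{m=1}^{\infty} \rho(m)^{-1/2}
  = \sum_{m=1}^{\infty} m^{-\delta/2} < \infty
  \iff \frac{\delta}{2} > 1
  \iff \delta > 2 .
```

Read this way, the value δ = 1.1 used in the examples satisfies the condition δ > 1 for an absolutely continuous tree, but not the sufficient condition δ > 2 for weak posterior consistency.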
Examples
[Figure: four panels showing f(theta) against theta.]
FIGURE – Ten simulated densities for µ = (0, 1) (top left), (1, 0) (top right), (0, −1) (bottom left) and (−1, 0) (bottom right).
Examples
[Figure: four panels showing f(theta) against theta.]
FIGURE – Ten simulated densities for µ = (0, 0) (top left), (1, 1) (top right), (2, 2) (bottom left) and (5, 5) (bottom right).
Examples
[Figure: four boxplot panels. Top row: µ = (0, 1), (1, 0), (0, −1), (−1, 0). Bottom row: µ = (0, 0), (1, 1), (2, 2), (5, 5).]
FIGURE – Prior distribution of moments: $\nu_\theta$ (first column) and $\varrho_\theta$ (second column).
Posterior
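Let $\theta_1, \theta_2, \ldots, \theta_n$ be a sample of size $n$ s.t. $\theta_i \mid f \sim f$, independently, and
$f \sim PPT_M(\alpha, \rho, f_0)$.
Consider a data augmentation approach (Tanner, 1991): define latent resultant
lengths $R_1, R_2, \ldots, R_n$ s.t. $(\Theta_i, R_i)$ and $(X_{1i}, X_{2i})$ are in 1:1 correspondence.
Posterior conditionals:
$Y_{m,j,k} \mid \text{data} \sim \mathrm{Dir}(\alpha_{m,j,k} + N_{m,j,k})$, where
$N_{m,j,k} = (N_{m+1,2j-1,2k-1}, N_{m+1,2j-1,2k}, N_{m+1,2j,2k-1}, N_{m+1,2j,2k})$ collects the numbers of augmented observations falling in the corresponding sets.
\[
f(r_i \mid Y, \theta_i) \propto \left\{ \prod_{m=1}^{M} Y_{m,\; j^{(r_i\cos\theta_i)}_{m},\; k^{(r_i\sin\theta_i)}_{m}} \right\} f_0(r_i\cos\theta_i, r_i\sin\theta_i)\, r_i .
\]
This requires a Metropolis-Hastings (MH) step with random walk proposal $\mathrm{Ga}(\kappa, \kappa / r^{(t)})$, which has coefficient of
variation $1/\sqrt{\kappa}$.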
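One MH update for a latent resultant length can be sketched as below. The proposal is the gamma random walk Ga(κ, κ/r) centred at the current value; because it is not symmetric, the acceptance ratio includes the proposal densities. The target used in the example is hypothetical: it keeps only f0 and the Jacobian and leaves out the tree product, and all function names are assumptions.

```python
import math
import random

def log_gamma_pdf(x, shape, rate):
    """Log density of Ga(shape, rate) at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)

def mh_update_r(r, log_target, kappa, rng):
    """One Metropolis-Hastings update of r with proposal Ga(kappa, kappa / r)."""
    # random.gammavariate takes (shape, scale); scale r / kappa gives mean r.
    prop = rng.gammavariate(kappa, r / kappa)
    if prop <= 0.0:
        return r  # guard against numerical underflow of the proposal
    log_ratio = (log_target(prop) - log_target(r)
                 + log_gamma_pdf(r, kappa, kappa / prop)
                 - log_gamma_pdf(prop, kappa, kappa / r))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return prop
    return r

def make_log_target(theta, mu=(0.0, 0.0)):
    """Hypothetical target: log{f0(r cos t, r sin t) * r} up to a constant."""
    c = mu[0] * math.cos(theta) + mu[1] * math.sin(theta)
    return lambda r: -0.5 * (r * r - 2.0 * r * c) + math.log(r)

rng = random.Random(1)
log_target = make_log_target(theta=1.0)
r, draws = 1.0, []
for _ in range(5000):
    r = mh_update_r(r, log_target, kappa=4.0, rng=rng)
    draws.append(r)
```

Larger κ gives smaller proposal steps (coefficient of variation 1/√κ), which raises the acceptance rate at the cost of slower exploration.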
Simulated data
Sampled from a projected mixture of bivariate normals, $f(x) = \sum_{j=1}^{4} \pi_j N_2(x \mid \eta_j, I)$, with
$\pi = (0.1, 0.2, 0.4, 0.3)$ and $\eta_1 = (1.5, 1.5)$, $\eta_2 = (-1, 1)$, $\eta_3 = (-1, -2)$,
$\eta_4 = (1.5, -1.5)$.
Sample sizes: $n = 50$ and $n = 500$.
Fitted $PPT_M(\alpha, \rho, f_0)$, with $f_0 = N_2(\mu, I)$, $\mu \in \{(0, 0), (1, 1), (2, 2)\}$, $\rho(m) = m^{\delta}$
with $\delta = 1.1$, $\alpha \in \{0.5, 1, 2\}$ and $M = 4$.
MCMC: 10,000 iterations, with a burn-in of 1,000 and thinning that keeps every 5th draw.
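The simulation design can be reproduced with a short sketch. The weights and means are those listed on the slide; the function name, the seed, and the reading of the MCMC settings (discard the burn-in first, then keep every 5th draw) are assumptions.

```python
import math
import random

WEIGHTS = (0.1, 0.2, 0.4, 0.3)
MEANS = ((1.5, 1.5), (-1.0, 1.0), (-1.0, -2.0), (1.5, -1.5))

def sample_projected_mixture(n, seed=0):
    """Draw n angles from the projected four-component normal mixture."""
    rng = random.Random(seed)
    angles = []
    for _ in range(n):
        eta = rng.choices(MEANS, weights=WEIGHTS, k=1)[0]  # pick a component
        x1 = rng.gauss(eta[0], 1.0)
        x2 = rng.gauss(eta[1], 1.0)
        angles.append(math.atan2(x2, x1) % (2.0 * math.pi))
    return angles

theta_small = sample_projected_mixture(50)
theta_large = sample_projected_mixture(500, seed=1)

# Indices of retained MCMC draws under the assumed burn-in and thinning rule.
kept = list(range(1000, 10000, 5))
print(len(kept))  # 1800
```

Under that reading, posterior summaries would be based on 1,800 retained draws.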
Simulated data
[Figure: four panels showing f(theta) against theta.]
FIGURE – Posterior estimates with n = 500. Columns: α = 0.5 and α = 2. Rows: µ = (0, 0) and µ = (2, 2).
El Triunfo Reserve
Data: temporal activity (time of the day) for three animal species: peccary, tapir
and deer.
Data sizes are 16, 35 and 115, respectively.
Fitted $PPT_M(\alpha, \rho, f_0)$, with $f_0 = N_2(\mu, I)$, $\mu = (0, 0)$, $\rho(m) = m^{1.1}$ and $M = 4$.
Tried $\alpha \in \{0.5, 1, 2\}$ to compare. Placed hyper-prior $\alpha \sim \mathrm{Ga}(1, 2)$.
LPML goodness-of-fit statistics:
α                  Peccary    Tapir     Deer
0.5                −23.05     −61.02    −208.31
1                  −23.22     −60.20    −206.92
2                  −24.10     −59.57    −205.68
Ga(1, 2)           −23.40     −60.15    −206.77
Proj. Normal       −26.52     −59.43    −207.54
DPM Proj. Normal   −24.64     −59.56    −204.31
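The slide does not spell out how LPML is computed. A minimal sketch, assuming the usual definition as the sum of log conditional predictive ordinates (CPO), with each CPO estimated by the harmonic mean of the likelihood over MCMC draws; the function name and input layout are hypothetical.

```python
import math

def lpml(log_lik):
    """Sum of log CPO_i, where log_lik[t][i] = log f(theta_i | draw t).

    CPO_i is estimated by the harmonic mean of f(theta_i | draw t) over t,
    computed on the log scale for numerical stability.
    """
    T = len(log_lik)
    n = len(log_lik[0])
    total = 0.0
    for i in range(n):
        neg = [-log_lik[t][i] for t in range(T)]
        mx = max(neg)
        # log of the average of 1 / f over the draws
        log_mean_inv = mx + math.log(sum(math.exp(v - mx) for v in neg) / T)
        total += -log_mean_inv
    return total

# One observation with likelihood 0.5 and 0.25 in two draws:
# harmonic mean = 1 / ((2 + 4) / 2) = 1/3.
example = lpml([[math.log(0.5)], [math.log(0.25)]])
```

Larger (less negative) values indicate better predictive fit, which is how the table rows are compared.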
El Triunfo Reserve
[Figure: three panels showing f(theta) against theta.]
FIGURE – Posterior estimates. Peccary (top left), tapir (top right) and deer (bottom).
Peccaries are seen from 6:00 to 18:00 hrs.
El Triunfo Reserve
[Figure: boxplots for peccary, tapir and deer.]
FIGURE – Posterior distribution of $\nu_\theta$. Preferred activity time: peccaries (midday), tapirs (20:30 hours) and deer (19:00 hours).
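Reading a mean direction as a time of day needs a convention for mapping the circle to the clock. The sketch below assumes that angle 0 corresponds to 00:00 and that a full turn of 2π covers 24 hours; this mapping and the function name are assumptions, not taken from the slides.

```python
import math

def angle_to_clock(nu):
    """Convert an angle in radians to an HH:MM string on a 24-hour clock."""
    hours = (nu % (2.0 * math.pi)) / (2.0 * math.pi) * 24.0
    total_minutes = int(round(hours * 60.0)) % (24 * 60)
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"

print(angle_to_clock(math.pi))  # 12:00
```

Negative angles, such as those returned by an arctangent, are wrapped into the 24-hour range before conversion.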
References
Mardia, K.V. (1972). Statistics of Directional Data. Academic Press, London.
Núñez-Antonio, G. and Gutiérrez-Peña, E. (2005). A Bayesian analysis of
directional data using the projected normal distribution. Journal of Applied
Statistics 32, 995–1001.
Tanner, M.A. (1991). Tools for Statistical Inference: Observed Data and Data
Augmentation Methods. Springer, New York.