
Poincaré Embeddings for Learning Hierarchical Representations

Maximilian Nickel, Douwe Kiela
Facebook AI Research

Presented by Ke (Becky) Bai

Nov. 30th, 2018

1 / 15

Introduction

• Symbolic data exhibits an underlying latent hierarchy (tree-like structure, power-law distributed data).
• Simultaneously capture similarity and hierarchy in the embedding space by unsupervised learning.
• Introduce a novel approach for learning hierarchical representations by embedding entities into hyperbolic space.

2 / 15

3 / 15

Motivations

The distance in the embedding space of the symbolic data reflects their semantic similarity.

• The number of nodes in a tree with branching factor b > 1 grows exponentially with depth.
• The hyperbolic disc area and circle length grow exponentially with the radius.

4 / 15

Embedding Space

Poincaré Ball

The Poincaré ball model of hyperbolic space is a Riemannian manifold (B^d, g_x), where

B^d = \{ x \in \mathbb{R}^d \mid \|x\| < 1 \}  (1)

is the d-dimensional open unit ball, \| \cdot \| denotes the Euclidean norm, and

g_x = \left( \frac{2}{1 - \|x\|^2} \right)^{2} g^E,  (2)

where x ∈ B^d and g^E denotes the Euclidean metric tensor. Let

k_x = \left( \frac{2}{1 - \|x\|^2} \right)^{2}.

Distance

The distance between points θ, x ∈ B^d is computed as

d(\theta, x) = \operatorname{arcosh}\!\left( 1 + 2 \, \frac{\|\theta - x\|^2}{(1 - \|\theta\|^2)(1 - \|x\|^2)} \right).  (3)
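
As a minimal NumPy sketch of Eqs. (2)–(3) (the function names and the NumPy rendering are ours, not part of the paper):

```python
import numpy as np

def conformal_factor(x):
    """k_x = (2 / (1 - ||x||^2))^2, the scale relating g_x to the Euclidean metric (Eq. 2)."""
    return (2.0 / (1.0 - np.dot(x, x))) ** 2

def poincare_distance(theta, x):
    """Hyperbolic distance between two points of the open unit ball (Eq. 3)."""
    sq_diff = np.dot(theta - x, theta - x)
    denom = (1.0 - np.dot(theta, theta)) * (1.0 - np.dot(x, x))
    return np.arccosh(1.0 + 2.0 * sq_diff / denom)
```

Note how points near the boundary (‖x‖ → 1) blow up both the metric scale and the distance, which is what gives the ball its exponentially growing capacity.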

5 / 15

Optimization

\Theta' \leftarrow \arg\min_{\Theta} \mathcal{L}(\Theta) \quad \text{s.t. } \forall \theta_i \in \Theta : \|\theta_i\| < 1.  (4)

\theta_{t+1} = R_{\theta_t}\!\left( -\eta_t \nabla_R \mathcal{L}(\theta_t) \right)  (5)

R denotes the retraction onto B^d at θ_t and η_t denotes the learning rate at time t. The Riemannian gradient can be derived from the Euclidean gradient by rescaling ∇_E with the inverse of the Poincaré ball metric tensor, i.e., k_θ^{-1}:

\nabla_R = k_\theta^{-1} \nabla_E

\nabla_E = \frac{\partial \mathcal{L}(\theta)}{\partial d(\theta, x)} \, \frac{\partial d(\theta, x)}{\partial \theta}  (6)

\theta_{t+1} \leftarrow \operatorname{proj}\!\left( \theta_t - \eta_t \, \frac{(1 - \|\theta_t\|^2)^2}{4} \, \nabla_E \right).  (7)
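
A minimal sketch of update (7), assuming the Euclidean gradient ∇_E of the loss at θ_t is supplied by the caller; the ε used by proj is an illustrative value, not one given on the slides:

```python
import numpy as np

EPS = 1e-5  # assumed margin for proj; the slides do not specify a value

def proj(theta, eps=EPS):
    """Constrain theta to the open unit ball (Eq. 9). The formula theta/||theta|| - eps
    is read here as rescaling the norm to 1 - eps, the usual implementation choice."""
    norm = np.linalg.norm(theta)
    if norm >= 1.0:
        return theta / norm * (1.0 - eps)
    return theta

def rsgd_step(theta, grad_e, lr):
    """One Riemannian SGD step (Eq. 7): rescale the Euclidean gradient by
    k_theta^{-1} = (1 - ||theta||^2)^2 / 4, take a step, then project into the ball."""
    scale = (1.0 - np.dot(theta, theta)) ** 2 / 4.0
    return proj(theta - lr * scale * grad_e)
```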

6 / 15

Optimization in more detail

\frac{\partial d(\theta, x)}{\partial \theta} = \frac{4}{\beta \sqrt{\gamma^2 - 1}} \left( \frac{\|x\|^2 - 2\langle\theta, x\rangle + 1}{\alpha^2}\,\theta - \frac{x}{\alpha} \right),  (8)

where \gamma = 1 + \frac{2}{\alpha\beta}\|\theta - x\|^2, \alpha = 1 - \|\theta\|^2, and \beta = 1 - \|x\|^2.

\operatorname{proj}(\theta) = \begin{cases} \theta/\|\theta\| - \varepsilon & \text{if } \|\theta\| \geq 1 \\ \theta & \text{otherwise,} \end{cases}  (9)
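
Equation (8) translated directly into NumPy (a sketch; numerical safeguards such as keeping γ strictly above 1 are omitted):

```python
import numpy as np

def distance_grad_theta(theta, x):
    """Partial derivative of d(theta, x) with respect to theta (Eq. 8)."""
    alpha = 1.0 - np.dot(theta, theta)   # alpha = 1 - ||theta||^2
    beta = 1.0 - np.dot(x, x)            # beta  = 1 - ||x||^2
    gamma = 1.0 + 2.0 / (alpha * beta) * np.dot(theta - x, theta - x)
    coeff = 4.0 / (beta * np.sqrt(gamma ** 2 - 1.0))
    return coeff * ((np.dot(x, x) - 2.0 * np.dot(theta, x) + 1.0) / alpha ** 2 * theta
                    - x / alpha)
```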

7 / 15

Comparisons

Here u and v are embedding vectors, like θ and x above.

Euclidean Distance

d(u, v) = \|u - v\|^2

Translational Distance

d(u, v) = \|u - v + r\|^2,

where r is a learned global translation vector designed for asymmetric data.
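
For reference, the two baseline distances can be sketched as follows (reading both norms as squared, as in the paper's baselines; the function names are ours):

```python
import numpy as np

def euclidean_distance(u, v):
    """Baseline: squared Euclidean distance d(u, v) = ||u - v||^2."""
    diff = u - v
    return np.dot(diff, diff)

def translational_distance(u, v, r):
    """Baseline for asymmetric data: d(u, v) = ||u - v + r||^2, with a learned global translation r."""
    diff = u - v + r
    return np.dot(diff, diff)
```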

8 / 15

Application 1: Embedding Taxonomies

Let D = {(u, v)} be the set of observed hypernymy relations between noun pairs from WordNet. The loss function is

\mathcal{L}(\Theta) = \sum_{(u,v) \in D} \log \frac{e^{-d(u,v)}}{\sum_{v' \in N(u)} e^{-d(u,v')}},

where N(u) = \{ v' \mid (u, v') \notin D \} \cup \{ v \} is the set of negative examples for u (including v itself).
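
A sketch of evaluating this objective with sampled negatives; the data structures (an embedding dict and a per-noun negative sample list) are our assumptions, not the paper's implementation:

```python
import numpy as np

def poincare_distance(a, b):
    """Hyperbolic distance in the Poincare ball (Eq. 3)."""
    num = 2.0 * np.dot(a - b, a - b)
    den = (1.0 - np.dot(a, a)) * (1.0 - np.dot(b, b))
    return np.arccosh(1.0 + num / den)

def taxonomy_objective(emb, pairs, negatives):
    """Sum over (u, v) in D of log( e^{-d(u,v)} / sum_{v' in N(u)} e^{-d(u,v')} ).

    emb:       dict mapping a noun to its embedding in the Poincare ball
    pairs:     observed hypernymy relations D, a list of (u, v) tuples
    negatives: dict mapping u to sampled nouns v' with (u, v') not in D;
               v itself is added below, matching N(u) = {v' | (u, v') not in D} ∪ {v}
    """
    total = 0.0
    for u, v in pairs:
        pos = np.exp(-poincare_distance(emb[u], emb[v]))
        denom = pos + sum(np.exp(-poincare_distance(emb[u], emb[vp])) for vp in negatives[u])
        total += np.log(pos / denom)
    return total
```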

9 / 15

Application 1: Results

10 / 15

Application 2: Network Embeddings

Let D = {(u, v)} represent the relationships between two people if they co-author a paper. In this social network, the probability of a co-author edge is

P((u, v) = 1) = \frac{1}{e^{(d(u,v) - r)/t} + 1},

where r and t are hyperparameters. The loss is the cross-entropy loss based on this probability.
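
A minimal sketch of the edge probability and its cross-entropy loss; d_uv stands for the Poincaré distance between the two embeddings, and the default r and t below are placeholders, not values from the slides:

```python
import numpy as np

def edge_probability(d_uv, r=1.0, t=0.1):
    """Probability of a co-author edge: P((u, v) = 1) = 1 / (exp((d(u, v) - r) / t) + 1)."""
    return 1.0 / (np.exp((d_uv - r) / t) + 1.0)

def edge_loss(d_uv, label, r=1.0, t=0.1):
    """Binary cross-entropy against the 0/1 co-authorship label."""
    p = edge_probability(d_uv, r, t)
    return -(label * np.log(p) + (1 - label) * np.log(1.0 - p))
```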

11 / 15

Application 2: Results

Table 1: Mean average precision for Reconstruction and Link Prediction on network data.
Columns give the embedding dimensionality (10, 20, 50, 100) for each task.

Dataset                  Method        Reconstruction                 Link Prediction
                                       10     20     50     100       10     20     50     100
AstroPh                  Euclidean     0.376  0.788  0.969  0.989     0.508  0.815  0.946  0.960
(N=18,772; E=198,110)    Poincaré      0.703  0.897  0.982  0.990     0.671  0.860  0.977  0.988
CondMat                  Euclidean     0.356  0.860  0.991  0.998     0.308  0.617  0.725  0.736
(N=23,133; E=93,497)     Poincaré      0.799  0.963  0.996  0.998     0.539  0.718  0.756  0.758
GrQc                     Euclidean     0.522  0.931  0.994  0.998     0.438  0.584  0.673  0.683
(N=5,242; E=14,496)      Poincaré      0.990  0.999  0.999  0.999     0.660  0.691  0.695  0.697
HepPh                    Euclidean     0.434  0.742  0.937  0.966     0.642  0.749  0.779  0.783
(N=12,008; E=118,521)    Poincaré      0.811  0.960  0.994  0.997     0.683  0.743  0.770  0.774

12 / 15

Application 3: Lexical Entailment

We quantify to what degree X is a type of Y via ratings on a scale of [0, 10], to evaluate how well semantic models capture graded lexical entailment.

\text{score}(\text{is-a}(u, v)) = -(1 + \alpha(\|v\| - \|u\|)) \, d(u, v),

where α is a hyperparameter representing the severity of the penalty; the penalty term is \|v\| - \|u\|.
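
A sketch of the scoring function; d_uv is the Poincaré distance between u and v, and alpha's default is only a placeholder:

```python
import numpy as np

def is_a_score(u, v, d_uv, alpha=1000.0):
    """score(is-a(u, v)) = -(1 + alpha * (||v|| - ||u||)) * d(u, v).
    The term ||v|| - ||u|| penalizes pairs whose supposed hypernym v lies
    farther from the origin than u; alpha sets the severity of that penalty."""
    penalty = np.linalg.norm(v) - np.linalg.norm(u)
    return -(1.0 + alpha * penalty) * d_uv
```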

Training procedure

− Train the embedding on WordNet as in Application 1.
− Use the score above to rank all noun pairs in HYPERLEX.
− Calculate Spearman's rank correlation with the ground-truth ranking (a minimal sketch follows below).
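
The last step can be done with SciPy, for example (variable names are ours):

```python
from scipy.stats import spearmanr

def evaluate_hyperlex(model_scores, gold_ratings):
    """Spearman rank correlation between model scores and the HYPERLEX human ratings."""
    rho, _ = spearmanr(model_scores, gold_ratings)
    return rho
```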

13 / 15

Application 3: Results

Table 2: Spearman's ρ for Lexical Entailment on HyperLex.

      FR     SLQS-Sim  WN-Basic  WN-WuP  WN-LCh  Vis-ID  Euclidean  Poincaré
ρ     0.283  0.229     0.240     0.214   0.214   0.253   0.389      0.512

14 / 15

Thanks

15 / 15
