70
Towards Explainable, Data-Efficient, Verifiable Representation Learning Pasquale Minervini University College London

Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Towards Explainable, Data-Efficient, Verifiable Representation Learning

Pasquale Minervini University College London

Page 2: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Symbolic and Sub-Symbolic AI

Page 3: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Symbolic

Ontologies, First-Order Logic, Logic Programming, Knowledge Bases, Theorem Proving..

∀X, Y, Z : grandFather(X, Y ) ⇐ father(X, Z), parent(Z, Y )

Symbolic and Sub-Symbolic AI

Page 4: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Symbolic

Ontologies, First-Order Logic, Logic Programming, Knowledge Bases, Theorem Proving..

∀X, Y, Z : grandFather(X, Y ) ⇐ father(X, Z), parent(Z, Y )

Symbolic and Sub-Symbolic AI

✓Data-Efficient

✓Interpretable and Explainable

✓Easy to Incorporate Knowledge

✓Verifiable

Page 5: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Symbolic

Ontologies, First-Order Logic, Logic Programming, Knowledge Bases, Theorem Proving..

∀X, Y, Z : grandFather(X, Y ) ⇐ father(X, Z), parent(Z, Y )

Sub-Symbolic/Connectionist

Neural Representation Learning, (Deep) Latent Variable Models..

f1( f2(…fn( )…)) = {0.15 ≡ cat0.85 ≡ dog

Symbolic and Sub-Symbolic AI

✓Data-Efficient

✓Interpretable and Explainable

✓Easy to Incorporate Knowledge

✓Verifiable

Page 6: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Sub-Symbolic/Connectionist

Neural Representation Learning, (Deep) Latent Variable Models..

f1( f2(…fn( )…)) = {0.15 ≡ cat0.85 ≡ dog

✓Noisy, Ambiguous, Sensory Data

✓Highly Parallel

✓High Predictive Accuracy

Symbolic

Ontologies, First-Order Logic, Logic Programming, Knowledge Bases, Theorem Proving..

✓Data-Efficient

✓Interpretable and Explainable

✓Easy to Incorporate Knowledge

✓Verifiable

∀X, Y, Z : grandFather(X, Y ) ⇐ father(X, Z), parent(Z, Y )

Symbolic and Sub-Symbolic AI

Page 7: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Sub-Symbolic/Connectionist

Neural Representation Learning, (Deep) Latent Variable Models..

f1( f2(…fn( )…)) = {0.15 ≡ cat0.85 ≡ dog

✓Noisy, Ambiguous, Sensory Data

✓Highly Parallel

✓High Predictive Accuracy

Symbolic

Ontologies, First-Order Logic, Logic Programming, Knowledge Bases, Theorem Proving..

✓Data-Efficient

✓Interpretable and Explainable

✓Easy to Incorporate Knowledge

✓Verifiable

∀X, Y, Z : grandFather(X, Y ) ⇐ father(X, Z), parent(Z, Y )

Symbolic and Sub-Symbolic AI

Page 8: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Knowledge Graph — graph structured Knowledge Base, where knowledge is encoded by relationships between entities.

Knowledge Graphs

Page 9: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Knowledge Graph — graph structured Knowledge Base, where knowledge is encoded by relationships between entities. In practice — set of subject-predicate-object triples, denoting a relationship of type predicate between subject and object.

subject predicate object

Barack Obama was born in Honolulu

Hawaii has capital Honolulu

Barack Obama is politician of United States

Hawaii is located in United States

Barack Obama is married to Michelle Obama

Michelle Obama is a Lawyer

Michelle Obama lives in United States

Knowledge Graphs

Page 10: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Knowledge Graphs

Page 11: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Knowledge Graphs

Page 12: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Link Prediction in Knowledge Graphs

Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

Page 13: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Rule-Based Link Prediction

∀X, Y, Z :married with(X, Y) ⇐

parent of(X, Z),parent of(Y, Z)

Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

Page 14: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

✕ Not always true ✕ Hard to learn from data ✕ Hard to formalise for other modalities

Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

∀X, Y, Z :married with(X, Y) ⇐

parent of(X, Z),parent of(Y, Z)

Rule-Based Link Prediction

Page 15: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

Neural Link Prediction

Page 16: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

P(BO married MO) ∝

fmarried( , )

Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

Neural Link Prediction

Page 17: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Learning Representationsℒ(𝒢 ∣ Θ) = ∑

(s,p,o)∈𝒢

log σ (fp(es, eo))+ ∑

(s,p,o)∉𝒢

log [1 − σ (fp(es, eo))]Washington

Malia Ann Obama

Sasha Obama

Barack Obama

Michelle Obama

lives in

parent of

?

P(BO married MO) ∝

fmarried( , )

Neural Link Prediction

Page 18: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Neural Link Prediction — Scoring Functions

Models Scoring Functions Parameters

RESCAL [Nickel et al. 2011]

TransE [Bordes et al. 2013]

DistMult [Yang et al. 2015]

HolE [Nickel et al. 2016]

ComplEx [Trouillon et al. 2016]

ConvE [Dettmers et al. 2017]

− es + rp − eo2

p

⟨es, rp, eo⟩

Re (⟨es, rp, eo⟩)

r⊤p (ℱ−1 [ℱ[es] ⊙ ℱ[eo]])

f (vec (f ([es; rp] * ω)) W) eo

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk, W ∈ ℝc×k

rp ∈ ℂk

e⊤s Wpeo Wp ∈ ℝk×k

The interaction between the latent features is defined by the scoring function — several variants in the literature:f( ⋅ )

Page 19: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Neural Link Prediction — Scoring Functions

Models Scoring Functions Parameters

RESCAL [Nickel et al. 2011]

TransE [Bordes et al. 2013]

DistMult [Yang et al. 2015]

HolE [Nickel et al. 2016]

ComplEx [Trouillon et al. 2016]

ConvE [Dettmers et al. 2017]

− es + rp − eo2

p

⟨es, rp, eo⟩

Re (⟨es, rp, eo⟩)

r⊤p (ℱ−1 [ℱ[es] ⊙ ℱ[eo]])

f (vec (f ([es; rp] * ω)) W) eo

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk, W ∈ ℝc×k

rp ∈ ℂk

e⊤s Wpeo Wp ∈ ℝk×k

The interaction between the latent features is defined by the scoring function — several variants in the literature:f( ⋅ )

Page 20: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Neural Link Prediction — Scoring Functions

Models Scoring Functions Parameters

RESCAL [Nickel et al. 2011]

TransE [Bordes et al. 2013]

DistMult [Yang et al. 2015]

HolE [Nickel et al. 2016]

ComplEx [Trouillon et al. 2016]

ConvE [Dettmers et al. 2017]

− es + rp − eo2

p

⟨es, rp, eo⟩

Re (⟨es, rp, eo⟩)

r⊤p (ℱ−1 [ℱ[es] ⊙ ℱ[eo]])

f (vec (f ([es; rp] * ω)) W) eo

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk, W ∈ ℝc×k

rp ∈ ℂk

e⊤s Wpeo Wp ∈ ℝk×k

The interaction between the latent features is defined by the scoring function — several variants in the literature:f( ⋅ )

Page 21: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Neural Link Prediction — Scoring Functions

Models Scoring Functions Parameters

RESCAL [Nickel et al. 2011]

TransE [Bordes et al. 2013]

DistMult [Yang et al. 2015]

HolE [Nickel et al. 2016]

ComplEx [Trouillon et al. 2016]

ConvE [Dettmers et al. 2017]

− es + rp − eo2

p

⟨es, rp, eo⟩

Re (⟨es, rp, eo⟩)

r⊤p (ℱ−1 [ℱ[es] ⊙ ℱ[eo]])

f (vec (f ([es; rp] * ω)) W) eo

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk

rp ∈ ℝk, W ∈ ℝc×k

rp ∈ ℂk

e⊤s Wpeo Wp ∈ ℝk×k

The interaction between the latent features is defined by the scoring function — several variants in the literature:f( ⋅ )

Page 22: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Convolutional 2D Knowledge Graph Embeddings

Subject Embedding

Predicate Embedding

[AAAI 2018]

Idea — use ideas from computer vision for modeling the interactions between latent features.

Page 23: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Convolutional 2D Knowledge Graph Embeddings

Subject Embedding

Predicate Embedding

Reshape

“Latent Images”

Idea — use ideas from computer vision for modeling the interactions between latent features.

[AAAI 2018]

Page 24: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Convolutional 2D Knowledge Graph Embeddings

Subject Embedding

Predicate Embedding

Reshape

“Latent Images”

2D ConvolutionsConcatenate

Idea — use ideas from computer vision for modeling the interactions between latent features.

[AAAI 2018]

Page 25: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Convolutional 2D Knowledge Graph Embeddings

Subject Embedding

Predicate Embedding

Reshape

“Latent Images”

2D ConvolutionsConcatenateProjection Logits

Product with Entity Matrix

Idea — use ideas from computer vision for modeling the interactions between latent features.

[AAAI 2018]✓Scalable

✓State-of-the-art Results

Page 26: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Convolutional 2D Knowledge Graph EmbeddingsIdea — use ideas from computer vision for modeling the interactions between latent features.

[AAAI 2018]

0.1

0.225

0.35

0.475

0.6

DistMult ComplEx R-GCN ConvE

MRR Hits@1 Hits@3 Hits@10

✓Efficiency via parameter sharing ✓State-of-the-art Results

Page 27: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Interpreting Knowledge Graph EmbeddingsQuite hard to understand the semantics of the learned representations..

[Minervini et al. ECML 2017]

Page 28: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Interpreting Knowledge Graph EmbeddingsQuite hard to understand the semantics of the learned representations..

.. but we can use their geometric relationships for identifying — and incorporating — semantic relationships between them.

[Minervini et al. ECML 2017]

Page 29: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Regularising Knowledge Graph EmbeddingsQuite hard to understand the semantics of the learned representations..

.. but we can use their geometric relationships for identifying — and incorporating — semantic relationships between them.

[Minervini et al. ECML 2017]

Page 30: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Regularising Knowledge Graph Embeddings

[Minervini et al. ECML 2017]

Page 31: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Regularising Knowledge Graph EmbeddingsQuite hard to understand the semantics of the learned representations..

.. but we can use their geometric relationships for identifying — and incorporating — semantic relationships between them.

[Minervini et al. ECML 2017]

✕ is a(x, y) ∧ is a(y, z) ⇒ is a(x, z)

Page 32: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively:

[Minervini et al. UAI 2017]

Page 33: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively: • An adversary searches for inputs where the model violates constraints

[Minervini et al. UAI 2017]

e.g. x, y, z  such that is a(x, y) ∧ is a(y, z) ∧ ¬is a(x, z)

Page 34: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively: • An adversary searches for inputs where the model violates constraints • The model is regularised to correct such violations.

[Minervini et al. UAI 2017]

Page 35: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively: • An adversary searches for inputs where the model violates constraints, • The model is regularised to correct such violations.

Formally:

minΘ

ℒdata(D ∣ Θ) + λ maxS

ℒviolation(S, D ∣ Θ)

[Minervini et al. UAI 2017]

e.g. S = {x, y, z} such that is a(x, y) ∧ is a(y, z) ∧ ¬is a(x, z)

Page 36: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively: • An adversary searches for inputs where the model violates constraints, • The model is regularised to correct such violations.

Formally:

• Inputs S can be either input space or embedding space

• In most interesting cases, max has closed form solutions

• Constraints are guaranteed to hold everywhere in embedding space.

[Minervini et al. UAI 2017]

minΘ

ℒdata(D ∣ Θ) + λ maxS

ℒviolation(S, D ∣ Θ)

e.g. S = {ex, ey, ez} such that is a(ex, ey) ∧ is a(ey, ez) ∧ ¬is a(ex, ez)

Page 37: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

Idea — adversarial training process where, iteratively: • An adversary searches for inputs where the model violates constraints, • The model is regularised to correct such violations.

Formally:

[Minervini et al. UAI 2017]

e.g. S = {ex, ey, ez} such that is a(ex, ey) ∧ is a(ey, ez) ∧ ¬is a(ex, ez)

✓Incorporates Background Knowledge

✓Verifiable

minΘ

ℒdata(D ∣ Θ) + λ maxS

ℒviolation(S, D ∣ Θ)

Page 38: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

[Minervini et al. UAI 2017]

55

60.75

66.5

72.25

78

TransE KALE-Pre KALE-Joint DistMult ASR-DistMult ComplEx ASR-ComplEx

Hits@3 Hits@5 Hits@10

Page 39: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge via Adversarial Training

[Minervini et al. UAI 2017]

55

60.75

66.5

72.25

78

TransE KALE-Pre KALE-Joint DistMult ASR-DistMult ComplEx ASR-ComplEx

Hits@3 Hits@5 Hits@10

Page 40: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge in Natural Language Inference Models

[Minervini et al. CoNLL 2018]

Natural Language Inference — detect the type of relationship, i.e. entailment, contradiction, neutral, between two sentences.

Page 41: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge in Natural Language Inference Models

If a stentence x contradicts y, then also y contradicts x. If x entails y, and y entails z, then x also entails z.

[Minervini et al. CoNLL 2018]

Natural Language Inference — detect the type of relationship, i.e. entailment, contradiction, neutral, between two sentences.

Page 42: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge in Natural Language Inference Models

If a stentence x contradicts y, then also y contradicts x. If x entails y, and y entails z, then x also entails z.

x) A man in uniform is pushing a medical bed. y) A man is pushing carrying something.

[Minervini et al. CoNLL 2018]

Natural Language Inference — detect the type of relationship, i.e. entailment, contradiction, neutral, between two sentences.

Page 43: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge in Natural Language Inference Models

If a stentence x contradicts y, then also y contradicts x. If x entails y, and y entails z, then x also entails z.

ℒviolation({x, y}) : 0.01 ⇝ 0.92P(x entails y) = 0.72

P(y contradicts x) = 0.93

x) A man in uniform is pushing a medical bed. y) A man is pushing carrying something.

[Minervini et al. CoNLL 2018]

Natural Language Inference — detect the type of relationship, i.e. entailment, contradiction, neutral, between two sentences.

Page 44: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Incorporating Background Knowledge in Natural Language Inference Models

[Minervini et al. CoNLL 2018]

Page 45: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

✓Can learn representations via gradient-based optimisation

✓Can provide explanations to the user (proofs)

✓Can incorporate background knowledge (rules)

✓Can learn interpretable rules from data

Page 46: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

[NAMPI 2018, ACL 2019]

Page 47: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Backward Chaining TL;DR — start with a list of goals, and work backwards from the consequent Q to the antecedent P (if P then Q) to see if any data supports any of the consequents.

[NAMPI 2018, ACL 2019]

Page 48: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Backward Chaining TL;DR — start with a list of goals, and work backwards from the consequent Q to the antecedent P (if P then Q) to see if any data supports any of the consequents.

Recursively reformulate each goal in simpler subgoals:

[NAMPI 2018, ACL 2019]

p(a)p(b)p(c)

q(X) ← p(X)q(a)?

Page 49: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Backward Chaining TL;DR — start with a list of goals, and work backwards from the consequent Q to the antecedent P (if P then Q) to see if any data supports any of the consequents.

Recursively reformulate each goal in simpler subgoals:

[NAMPI 2018, ACL 2019]

p(a)p(b)p(c)

q(X) ← p(X)q(a)?

p(a)

Page 50: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Backward Chaining TL;DR — start with a list of goals, and work backwards from the consequent Q to the antecedent P (if P then Q) to see if any data supports any of the consequents.

Recursively reformulate each goal in simpler subgoals:

[NAMPI 2018, ACL 2019]

p(a)p(b)p(c)

q(X) ← p(X)q(a)?

p(a)

Page 51: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable ReasoningTake a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏 (𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏 (𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

✓ ✓✕

q(X) ← p(X)q(a)?p(a)

p(b)p(c)

…p(a)

Page 52: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏 (𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏 (𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

✓ ✓✓sim = 1sim = 1sim = 0.9

Page 53: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Knowledge Base:

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y ) ⇐𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Z ),𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y ) .

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

Page 54: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Knowledge Base:

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y ) ⇐𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Z ),𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y ) .

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)

proof score S1

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

proof score S2

Page 55: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Knowledge Base:

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y ) ⇐𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Z ),𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y ) .

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)

proof score S1

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

proof score S2

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y )X /𝚊𝚋𝚎 Y/𝚋𝚊𝚛𝚝

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, Z )𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, 𝚋𝚊𝚛𝚝)

Subgoals:

proof score S3

Page 56: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Knowledge Base:

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y ) ⇐𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Z ),𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y ) .

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)

proof score S1

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

proof score S2

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y )X /𝚊𝚋𝚎 Y/𝚋𝚊𝚛𝚝

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, Z )𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, 𝚋𝚊𝚛𝚝)

Subgoals:

proof score S3

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, Z )Z

proof score S4

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛) …

Page 57: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

End-to-End Differentiable Reasoning

[NAMPI 2018, ACL 2019]

Take a standard reasoning algorithm — e.g. backward-chaining — and replace matching between symbols with an end-to-end differentiable comparison operator.

Knowledge Base:𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)

proof score S1

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)

proof score S2

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y )X /𝚊𝚋𝚎 Y/𝚋𝚊𝚛𝚝

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, Z )𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, 𝚋𝚊𝚛𝚝)

Subgoals:

proof score S3

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, Z )Z

proof score S4

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛) …

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)θ1(X, Y ) ⇐ θ2(X, Z ), θ3(Z, Y ) .

∑F∈K

log pKB∖F(F)

− ∑F̃∼corr(F)

log pKB(F̃)

TrainingMaximise Log-Likelihood:

Page 58: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning at ScaleProblem — The number of proof paths grows exponentially with the proof depth.

[NAMPI 2018, ACL 2019]

Page 59: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning at ScaleProblem — The number of proof paths grows exponentially with the proof depth. Solution — Dynamically consider only a subset of proof paths.

Knowledge Base:𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(𝚊𝚋𝚎, 𝚑𝚘𝚖𝚎𝚛)

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(𝚑𝚘𝚖𝚎𝚛, 𝚋𝚊𝚛𝚝)𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y ) ⇐

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Z ),𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y ) .

𝚐𝚛𝚊𝚗𝚍𝙿𝚊𝙾𝚏(𝚊𝚋𝚎, 𝚋𝚊𝚛𝚝)

𝚐𝚛𝚊𝚗𝚍𝙵𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X, Y )X /𝚊𝚋𝚎 Y/𝚋𝚊𝚛𝚝

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X /𝚊𝚋𝚎, Z )𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z, Y/𝚋𝚊𝚛𝚝)

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X /𝚊𝚋𝚎, Z )

𝚏𝚊𝚝𝚑𝚎𝚛𝙾𝚏(X /𝚊𝚋𝚎, Z /𝚑𝚘𝚖𝚎𝚛)

Z

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z /𝚑𝚘𝚖𝚎𝚛, Y/𝚋𝚊𝚛𝚝)

𝚙𝚊𝚛𝚎𝚗𝚝𝙾𝚏(Z /𝚑𝚘𝚖𝚎𝚛, Y/𝚋𝚊𝚛𝚝)

Subgoal:

Subgoals:

[NAMPI 2018, ACL 2019]

Page 60: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning at ScaleProblem — The number of proof paths grows exponentially with the proof depth. Solution — Dynamically consider only a subset of proof paths.

[NAMPI 2018, ACL 2019]

55

60.75

66.5

72.25

78

TransE DistMult ComplEx GNTP

Hits@3 Hits@5 Hits@10

Page 61: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Explainable Neural Link Prediction

[NAMPI 2018, ACL 2019]

Page 62: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning in Natural LanguageProblem — Classic reasoners can only reason over symbolic, unambiguous representations.

[NAMPI 2018, ACL 2019]

Page 63: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning in Natural LanguageProblem — Classic reasoners can only reason over symbolic, unambiguous representations. Solution — Represents Knowledge Base and Natural Language facts in a shared embedding space, and reason over them.

Rule Group p(X, Y) :- q(Y, X) Rules Rule Group p(X, Y) :- q(X, Z), r(Z, Y)

X Y :- Y X

encoder

KB

encoder

Query

AND

containedIn(River Thames, UK)

“London is located in the UK”

“London is standing on the River Thames”

“[X] is located in the [Y]”(X, Y) :- locatedIn(X, Y)

locatedIn(X, Y) :- locatedIn(X, Z), locatedIn(Z, Y)

KB Rep. Text Representations

X Y :- Y XX Y :- Y X X Y :- X Z,

Z YX Y :- X Z,

Z Y

Recurse

k-NN OR

[NAMPI 2018, ACL 2019]

Rule Group p(X, Y) :- q(Y, X) Rules Rule Group p(X, Y) :- q(X, Z), r(Z, Y)

X Y :- Y X

encoder

KB

encoder

Query

AND

containedIn(River Thames, UK)

“London is located in the UK”

“London is standing on the River Thames”

“[X] is located in the [Y]”(X, Y) :- locatedIn(X, Y)

locatedIn(X, Y) :- locatedIn(X, Z), locatedIn(Z, Y)

KB Rep. Text Representations

X Y :- Y XX Y :- Y X X Y :- X Z,

Z YX Y :- X Z,

Z Y

Recurse

k-NN OR

Page 64: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning in Natural Language

[NAMPI 2018, ACL 2019]

Page 65: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Differentiable Reasoning in Natural Language

[NAMPI 2018, ACL 2019]

Page 66: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Open Problems

Page 67: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Reasoning Over Large and Web-scale Corpora• Reasoning over large corpora is still expensive

• We need ways of retrieving and extracting useful fragments of knowledge from massively large corpora

Page 68: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Natural Language Constraints and Explanations

• Background knowledge, constraints, and explanations are communicated in a declarative, formal, logic-like language • We should be able to express them in natural language,

and be able to trust machine-generated explanations

Page 69: Towards Explainable, Data-Efficient, Verifiable Representation …itrc.inha.ac.kr/wp-content/uploads/2019/09/... · 2019-09-02 · Symbolic Ontologies, First-Order Logic, Logic Programming,

Formal Verification of Representation Learning Models

• I showed we can provide provable guarantees for a model to comply with a given set of constraints • How far can we go in this direction? • Can we have models that are provably unbiased and fair?