
Neural-Symbolic Integration

Steffen Hölldobler
International Center for Computational Logic
Technische Universität Dresden, Germany

ICCL, International Center for Computational Logic

Algebra, Logic and Formal Methods in Computer Science


Introduction & Motivation: Overview

I Introduction & Motivation

I Propositional Logic

. Existing Approaches

. Propositional Logic Programs and the Core Method

I First-Order Logic

. Existing Approaches

. First-Order Logic Programs and the Core Method

I The Neural-Symbolic Learning Cycle

I Challenge Problems


Introduction & Motivation: Connectionist Systems

I Well-suited to learn, to adapt to new environments, to degrade gracefully etc.

I Many successful applications.

I Approximate functions.

. Hardly any knowledge about the functions is needed.

. Trained using incomplete data.

I Declarative semantics is not available.

I Recursive networks are hardly understood.

I McCarthy 1988: We still observe a propositional fixation.

I Structured objects are difficult to represent.

. Smolensky 1987: Can we instantiate the power of symbolic computation within fully connectionist systems?


Introduction & Motivation: Logic Systems

I Well-suited to represent and reason about structured objects and structure-sensitive processes.

I Many successful applications.

I Direct implementation of relations and functions.

I Explicit expert knowledge is required.

I Highly recursive structures.

I Well understood declarative semantics.

I Logic systems are brittle.

I Expert knowledge may not be available.

. Can we instantiate the power of connectionist computation within a logic system?


Introduction & Motivation: Objective

I Seek the best of both paradigms!

I Understand the relation between connectionist and logic systems.

I Contribute to the open research problems of both areas.

I Well developed for propositional case.

I Hard problem: going beyond.

I In this lecture:

. Overview of existing approaches.

. Logic programs and recurrent networks.

. Semantic operators for logic programs can be computed by connectionist systems.

. Semantic operators can be learned.

. Logic programs can be extracted.

Neural-Symbolic Integration using the Core Method


Connectionist Networks

I A connectionist network consists of

. a set U of units and

. a set W ⊆ U × U of connections.

I Each connection is labeled by a weight w ∈ R.

I If there is a connection from unit uj to uk, then wkj is its associated weight.

I A unit is specified by

. an input vector ~i = (i1, . . . , im), where ij ∈ R for 1 ≤ j ≤ m,

. an activation function Φ mapping ~i to a potential p ∈ R,

. an output function Ψ mapping p to an (output) value v ∈ R.

I If there is a connection from uj to uk, then wkj vj is the input received by uk along this connection.

I The potential and value of a unit are synchronously recomputed (or updated).

I Often a linear time t is added as a parameter to input, potential and value.

I The state of a network with units u1, . . . , un at time t is (v1(t), . . . , vn(t)).
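The definitions above can be turned into a small executable sketch. The following is our own minimal reading of them, assuming a weighted-sum activation function Φ and the identity output function Ψ; the `Unit` class and `synchronous_update` helper are illustrative names, not part of the lecture.

```python
# Minimal sketch of a connectionist network, assuming Phi = weighted sum
# and Psi = identity (both are our assumptions for illustration).
from dataclasses import dataclass

@dataclass
class Unit:
    phi: callable          # activation function: list of inputs -> potential
    psi: callable          # output function: potential -> value
    value: float = 0.0

def synchronous_update(units, weights):
    """One synchronous step: every unit recomputes its potential and value
    from the values all units had at the previous time step.
    weights[(k, j)] is the weight w_kj of the connection from u_j to u_k."""
    inputs = {  # gather all inputs first, so the update is truly synchronous
        k: [weights[(k, j)] * units[j].value
            for j in range(len(units)) if (k, j) in weights]
        for k in range(len(units))
    }
    for k, u in enumerate(units):
        u.value = u.psi(u.phi(inputs[k]))

# Example: u1 (value 3.0) feeds u0 with weight 2.0.
units = [Unit(phi=sum, psi=lambda p: p),
         Unit(phi=sum, psi=lambda p: p, value=3.0)]
synchronous_update(units, {(0, 1): 2.0})
print(units[0].value)  # 2.0 * 3.0 = 6.0
```

Note that u1 has no incoming connections, so after the step its value drops to Φ([]) = 0; a real input unit would instead be clamped externally, as on the next slide.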


A Simple Connectionist Network

[Figure: input units u1, u2 feed units u3, u4 with weights w31 = w42 = 1; u3 and u4 inhibit each other with weights w34 = w43 = −0.5.]

pi(t + 1) = pi(t) + Σ_{j=1}^{4} wij vj(t)

vi(t) = round(pi(t))

v1(t) = 6 if t = 0, 2 otherwise

v2(t) = 5 if t = 0, 2 otherwise

I What happens if the network is synchronously updated?

I A winner-take-all network is a synchronously updated connectionist network of n units (not counting input units) such that, after each unit receives an initial input at t = 0, eventually only the unit with the highest initial input produces a value greater than 0, whereas the values of all other units are 0.

I Exercise Construct a winner-take-all network of 3 units.
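One possible answer sketch for the exercise: three units with mutual inhibition, values clipped at 0 and updated synchronously. The particular inhibition weight −0.5 (and keeping each unit's own value with weight 1) is our own choice; other weights work as well, and this is not the only valid construction.

```python
# Sketch of a 3-unit winner-take-all network: each unit keeps its own value
# (self-weight 1, an assumption) and is inhibited by the others with
# weight -0.5 (also an assumption); negative values are clipped to 0.
def wta_step(v, inhibition=0.5):
    total = sum(v)
    return [max(0.0, vi - inhibition * (total - vi)) for vi in v]

def winner_take_all(v, steps=20):
    for _ in range(steps):
        v = wta_step(v)
    return v

v = winner_take_all([3.0, 1.0, 2.0])
print(v)  # only the unit with the highest initial input stays positive
```

After one step the losers are already at 0 here; the loop merely confirms the state is stable.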


Literature

I Feldman, Ballard 1982: Connectionist Models and Their Properties. Cognitive Science 6 (3), 205-254.

I McCarthy 1988: Epistemological Challenges for Connectionism. Behavioral and Brain Sciences 11, 44.

I Smolensky 1987: On Variable Binding and the Representation of Symbolic Structures in Connectionist Systems. Report No. CU-CS-355-87, Department of Computer Science & Institute of Cognitive Science, University of Colorado, Boulder.


Propositional Logic

I Existing Approaches

. Finite Automata and McCulloch-Pitts Networks

. Weighted Automata and Semiring Artificial Neural Networks

. Propositional Reasoning and Symmetric/Stochastic Networks

. Other Approaches

I Propositional Logic Programs and the Core Method

. The Very Idea

. Logic Programs

. Propositional Core Method

. Backpropagation

. Knowledge-Based Artificial Neural Networks

. Propositional Core Method using Sigmoidal Units

. Further Extensions


McCulloch-Pitts Networks

I McCulloch, Pitts 1943: Can the activities of nervous systems be modelled by a logical calculus?

I A McCulloch-Pitts network consists of a set U of binary threshold units and a set W ⊆ U × U of weighted connections.

I The set UI of input units is defined as UI = {uk ∈ U | (∀uj ∈ U) wkj = 0}.

I The set UO of output units is defined as UO = {uj ∈ U | (∀uk ∈ U) wkj = 0}.

[Figure: the input units UI feed the McCulloch-Pitts network, which in turn feeds the output units UO.]


Binary Threshold Units

I uk is a binary threshold unit if

Φ(~ik) = pk = Σ_{j=1}^{m} wkj vj

Ψ(pk) = vk = 1 if pk ≥ θk, 0 otherwise,

where θk ∈ R is a threshold.

I Three binary threshold units:

. Negation: w21 = −1, θ2 = −0.5; unit u2 computes v2 = ¬v1.

. Disjunction: w31 = 1, w32 = 1, θ3 = 0.5; unit u3 computes v3 = v1 ∨ v2.

. Conjunction: w31 = 1, w32 = 1, θ3 = 1.5; unit u3 computes v3 = v1 ∧ v2.
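The three example units can be checked directly in code. This is a straightforward encoding of the definition above (weighted-sum potential, output 1 iff the potential reaches the threshold); the helper name `threshold_unit` is ours.

```python
# Binary threshold units: potential p = weighted sum of the inputs,
# output 1 iff p >= theta.
def threshold_unit(weights, theta):
    def unit(*values):
        p = sum(w * v for w, v in zip(weights, values))  # potential
        return 1 if p >= theta else 0                    # binary output
    return unit

neg  = threshold_unit([-1.0], theta=-0.5)      # v2 = not v1
disj = threshold_unit([1.0, 1.0], theta=0.5)   # v3 = v1 or v2
conj = threshold_unit([1.0, 1.0], theta=1.5)   # v3 = v1 and v2

for v1 in (0, 1):
    assert neg(v1) == 1 - v1
    for v2 in (0, 1):
        assert disj(v1, v2) == (v1 | v2)
        assert conj(v1, v2) == (v1 & v2)
print("all three gates verified")
```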


A Simple McCulloch-Pitts Network

I Example Consider the following network of logical threshold units:

"!#

"!#

.5 .5u1 u3

-

-

1

1"!#

"!#

.5 .5

u2 u4

����

��������*HHHH

HHHHHH

HHj

-1-1

"!#

.5u5������������:

XXXXXXXXXXXXz

1

1

I Exercise

. Specify UI and UO.

. What is computed by the network if all units are updated synchronously?

. Specify the states of the network ignoring input and output units.


Finite Automata

I A finite automaton consists of:

. Σ, a finite set of input symbols,

. Φ, a finite set of output symbols,

. Q, a finite set of states,

. q0 ∈ Q, an initial state,

. F ⊂ Q, a set of final states,

. δ : Q × Σ → Q, a state transition function,

. ρ : Q→ Φ, an output function.

I Exercise Let Σ = Φ = {1, 2}, Q = {p, q, r}, F = {r}, q0 = p,

ρ: ρ(p) = 1, ρ(q) = 1, ρ(r) = 2,

δ: δ(p, 1) = q, δ(p, 2) = p, δ(q, 1) = r, δ(q, 2) = q, δ(r, 1) = r, δ(r, 2) = r.

What is computed by this automaton?
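The exercise automaton is small enough to simulate directly. The tables below are transcribed from the slide; the function name `run` and the string encoding of symbols are our own.

```python
# The exercise automaton: states {p, q, r}, inputs/outputs {1, 2},
# q0 = p, F = {r}. It emits rho of each state it enters.
delta = {("p", "1"): "q", ("p", "2"): "p",
         ("q", "1"): "r", ("q", "2"): "q",
         ("r", "1"): "r", ("r", "2"): "r"}
rho = {"p": "1", "q": "1", "r": "2"}

def run(inputs, q0="p", final=frozenset({"r"})):
    q, outputs = q0, []
    for b in inputs:
        q = delta[(q, b)]        # state transition
        outputs.append(rho[q])   # output of the new state
    return "".join(outputs), q in final

print(run("11"))  # ('12', True)
```

Tracing a few inputs suggests the answer to the exercise: the output switches to 2, and the automaton accepts, exactly once it has read at least two 1s.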


Finite Automata and McCulloch-Pitts Networks

I Theorem McCulloch-Pitts networks are finite automata and vice versa.

I Proof

⇒ Exercise.

⇐ Let T = (Σ, Φ, Q, q0, F, δ, ρ) be an automaton with

• Σ = {b1, . . . , bm},
• Φ = {c1, . . . , cr},
• Q = {q0, . . . , qk−1}.

To show: there exists a network N with

• inputs {b′1, . . . , b′m},
• outputs {c′1, . . . , c′r},
• states {q′0, . . . , q′k−1}

such that if T generates cj1, . . . , cjn given bj1, . . . , bjn, then N generates c′j1, . . . , c′jn given b′j1, . . . , b′jn.


Construction of the Network: Inputs and Outputs

I Remember |Σ| = m, |Φ| = r.

I Inputs x1, . . . , xm with b′j = ~x, where xi = 1 if i = j and xi = 0 otherwise.

I Outputs y1, . . . , yr with c′j = ~y, where yi = 1 if i = j and yi = 0 otherwise.


Construction of the Network: Units and Connections

I Remember |Σ| = m, |Φ| = r, |Q| = k.

I qb-units represent that T in state q receives input b (k×m units).

I c-units represent output c (r units).

I Connections

. Let {k1, . . . , kn(k)} = {(q, b) | δ(q, b) = q∗}. Then

vuq∗b∗(t + 1) = 1 if xb∗(t) ∧ [k1(t) ∨ . . . ∨ kn(k)(t)], and 0 otherwise.

. Let {l1, . . . , ln(l)} = {(q, b) | ρ(q) = c}. Then

vuc(t + 1) = 1 if l1(t) ∨ . . . ∨ ln(l)(t), and 0 otherwise.

I The theorem follows by induction on the length of the input sequence.
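One way to mechanize the construction is to track which qb-units fire as a set, rather than wiring explicit threshold units. Two details below are our assumptions, since the slide leaves initialization implicit: the initial state q0 is seeded through a virtual unit, and the network's output lags the automaton's by one step, so a final "flush" step collects the last output.

```python
# Set-level simulation of the qb-unit construction ("T is in state q and
# receives input b"). Seeding of q0 and the one-step output delay are
# assumptions not spelled out on the slide.
def automaton_run(delta, rho, q0, inputs):
    q, out = q0, []
    for b in inputs:
        q = delta[(q, b)]
        out.append(rho[q])
    return out

def network_run(delta, rho, q0, inputs):
    INIT = "<init>"
    d = dict(delta)
    d[(INIT, INIT)] = q0              # virtual unit that seeds state q0
    active = {(INIT, INIT)}
    out = []
    for t, b in enumerate(list(inputs) + [INIT]):   # extra flush step
        # Unit (q*, b) fires iff input b is present and some active (q', b')
        # satisfies delta(q', b') = q*.
        active = {(d[(q2, b2)], b) for (q2, b2) in active if (q2, b2) in d}
        if t > 0:                     # skip the spurious output of step 1
            out.extend(rho[q2] for (q2, b2) in active if q2 in rho)
    return out

# Agreement check on the sample automaton from the exercise:
delta = {("p", "1"): "q", ("p", "2"): "p", ("q", "1"): "r",
         ("q", "2"): "q", ("r", "1"): "r", ("r", "2"): "r"}
rho = {"p": "1", "q": "1", "r": "2"}
for w in ("11", "121", "2", "2221"):
    assert automaton_run(delta, rho, "p", w) == network_run(delta, rho, "p", w)
print("network and automaton agree")
```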


Exercises

I Specify the automaton corresponding to the sample network.

I Specify the network corresponding to the sample finite automaton.

I Complete the proof of the theorem.


Some Remarks on McCulloch-Pitts Networks

I McCulloch-Pitts networks are not just simple reactive systems: their behavior depends on previous inputs as well as the activity within the network.

. Example

[Figure: an example network in which input x feeds a chain of two threshold units (threshold 0.5 each) via weight-1 connections, producing output y.]

I Literature

. Arbib: Brains, Machines and Mathematics. Springer, 2nd edition (1987).

. McCulloch & Pitts: A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics 5, 115-133 (1943).


Weighted Automata and Semiring Artificial Neural Networks

I Bader, Hölldobler, Scalzitti 2004: Can the result by McCulloch and Pitts be extended to weighted automata?

I Let (K, ⊕, ⊙, 0K, 1K) be a semiring.

I uk is a ⊕-unit if Φ(~ik) = pk = ⊕_{j=1}^{m} (wkj ⊙ vj) and Ψ(pk) = vk = pk.

I uk is a ⊙-unit if Φ(~ik) = pk = ⊙_{j=1}^{m} (wkj ⊙ vj) and Ψ(pk) = vk = pk.

I A semiring artificial neural network consists of a set U of ⊕- and ⊙-units and a set W ⊆ U × U of K-weighted connections.
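The ⊕- and ⊙-units can be instantiated over any concrete semiring. As an illustration (our choice, not the paper's running example), here they are over the max-plus (tropical) semiring (R ∪ {−∞}, max, +, −∞, 0):

```python
# Semiring units over the max-plus (tropical) semiring: oplus = max,
# otimes = +, with neutral elements -inf and 0. Psi is the identity.
import math
from functools import reduce

ZERO = -math.inf                      # 0_K, neutral element for oplus = max
def oplus(a, b):  return max(a, b)    # semiring addition
def otimes(a, b): return a + b        # semiring multiplication

def oplus_unit(weights, values):
    """p_k = oplus over j of (w_kj otimes v_j)."""
    return reduce(oplus, (otimes(w, v) for w, v in zip(weights, values)), ZERO)

def otimes_unit(weights, values):
    """p_k = otimes over j of (w_kj otimes v_j); 0.0 is 1_K here."""
    return reduce(otimes, (otimes(w, v) for w, v in zip(weights, values)), 0.0)

print(oplus_unit([1, 2, 3], [4, 5, 6]))   # max(1+4, 2+5, 3+6) = 9
print(otimes_unit([1, 2, 3], [4, 5, 6]))  # (1+4) + (2+5) + (3+6) = 21.0
```

With the Boolean semiring ({0, 1}, ∨, ∧, 0, 1) instead, the same two definitions reduce to the disjunction and conjunction units of McCulloch-Pitts networks.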

I Theorem Weighted automata are semiring artificial neural networks.

I Literature Bader, Hölldobler, Scalzitti 2004: Semiring Artificial Neural Networks and Weighted Automata – and an Application to Digital Image Encoding. In: KI 2004: Advances in Artificial Intelligence, Lecture Notes in Artificial Intelligence 3238, 281-294.


Symmetric Networks

I Hopfield 1982: Can statistical models for magnetic materials explain the behavior of certain classes of networks?

I Original application: associative memory.

I A symmetric network consists of a set U of binary threshold units and a set W ⊆ U × U of weighted connections such that wkj = wjk for all k, j with k ≠ j.

I Asynchronous update procedure: while the state ~v is unstable, update an arbitrary unit.

m0

m0

−1 m5

�����

������

2

HHHHH

HHHHHH

2

m02

}0

}0

ml0}0

}0}0m

}m0

}0

ml0}0ml5}5 }1 m12}1 ml1}m1 }1

ICCL International Center for Computational Logic
Algebra, Logic and Formal Methods in Computer Science


Energy Minimization

I What happens precisely when a symmetric network is updated?

I Consider the energy function

E(t) = −½ Σ_{k,j} wkj vj(t) vk(t) + Σ_k θk vk(t)
     = −Σ_{k<j} wkj vj(t) vk(t) + Σ_k θk vk(t)

describing the state of the network at time t.

I We assume wii = 0 for all units i in the network.

I Exercise

. Specify E(t) for the symmetric networks on the previous page.

. How does an update change the energy of a symmetric network (you may assume that θk = 0 for all k)?

I Theorem E is monotone decreasing, i.e., E(t + 1) ≤ E(t).

I Exercise Does this theorem still hold if we drop the assumption that wij = wji?

I Exercise How plausible is the assumption that wij = wji?
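The theorem can be checked empirically in a few lines of Python. The tiny network below (weights and thresholds chosen arbitrarily for this sketch, not taken from the slides) performs random asynchronous updates and records the energy after each step; by the theorem the recorded sequence never increases.

```python
import random

def energy(w, theta, v):
    # E = -sum_{k<j} w[k][j] v_j v_k + sum_k theta_k v_k
    n = len(v)
    return (sum(theta[k] * v[k] for k in range(n))
            - sum(w[k][j] * v[j] * v[k] for k in range(n) for j in range(k + 1, n)))

def update(w, theta, v, k):
    # binary threshold unit: v_k = 1 iff its potential reaches the threshold
    potential = sum(w[k][j] * v[j] for j in range(len(v)) if j != k)
    v[k] = 1 if potential >= theta[k] else 0

# an arbitrary symmetric network: w[k][j] == w[j][k] and w[k][k] == 0
w = [[0, -1, 2],
     [-1, 0, 2],
     [2, 2, 0]]
theta = [0, 0, 1]

random.seed(0)
v = [1, 1, 0]
trace = [energy(w, theta, v)]
for _ in range(20):
    update(w, theta, v, random.randrange(3))   # asynchronous: one unit at a time
    trace.append(energy(w, theta, v))

# the theorem E(t + 1) <= E(t): energy is monotone along the trace
assert all(later <= earlier for earlier, later in zip(trace, trace[1:]))
```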


Stochastic Networks or Boltzmann Machines

I Hinton, Sejnowski 1983: Can we escape local minima?

I A stochastic network is a symmetric network, but the values are computed probabilistically:

P(vk = 1) = 1 / (1 + e^{(θk − pk)/T})

where T is called pseudo temperature.

I In equilibrium stochastic networks are more likely to be in a state with low energy.

I Kirkpatrick et al. 1983: Can we compute a global minimum?

I Simulated annealing: decrease T gradually.

I Theorem (Geman, Geman 1984) A global minimum is reached if T is decreased in infinitesimally small steps.

I Applications Combinatorial optimization problems like the travelling salesman problem or the graph bipartitioning problem.
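A minimal sketch of the stochastic update rule and of simulated annealing, again on an arbitrary toy network; the weights, seed, and cooling schedule are illustrative assumptions. After annealing, the state is settled deterministically and compared against a brute-force search for the global energy minimum.

```python
import math
import random
from itertools import product

def energy(w, theta, v):
    n = len(v)
    return (sum(theta[k] * v[k] for k in range(n))
            - sum(w[k][j] * v[j] * v[k] for k in range(n) for j in range(k + 1, n)))

def potential(w, v, k):
    return sum(w[k][j] * v[j] for j in range(len(v)) if j != k)

def stochastic_update(w, theta, v, k, T):
    # P(v_k = 1) = 1 / (1 + exp((theta_k - p_k) / T))
    prob_one = 1.0 / (1.0 + math.exp((theta[k] - potential(w, v, k)) / T))
    v[k] = 1 if random.random() < prob_one else 0

def anneal(w, theta, v, T0=10.0, cooling=0.97, steps=400):
    T = T0
    for _ in range(steps):
        stochastic_update(w, theta, v, random.randrange(len(v)), T)
        T = max(T * cooling, 1e-3)               # decrease T gradually
    for _ in range(len(v)):                      # settle at (effectively) T = 0
        for k in range(len(v)):
            v[k] = 1 if potential(w, v, k) >= theta[k] else 0
    return v

random.seed(1)
w = [[0, -1, 2], [-1, 0, 2], [2, 2, 0]]
theta = [0, 0, 1]
v = anneal(w, theta, [0, 0, 0])

# for this tiny network the global minimum can be verified by brute force
best = min(energy(w, theta, list(s)) for s in product([0, 1], repeat=3))
assert energy(w, theta, v) == best
```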


Literature

I Geman, Geman 1984: Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721-741.

I Hinton, Sejnowski 1983: Optimal Perceptual Inference. In: Proceedings of the IEEE Conference on Computer Vision and Recognition, 448-453.

I Hopfield 1982: Neural Networks and Physical Systems with Emergent Collective Computational Abilities. In: Proceedings of the National Academy of Sciences USA, 2554-2558.

I Kirkpatrick et al. 1983: Optimization by Simulated Annealing. Science 220, 671-680.



Propositional Logic

I Variables are p1, . . . , pn.

I Connectives are ¬,∨,∧.

I Atoms are variables.

I Literals are atoms and negated atoms.

I Clauses are (generalized) disjunctions of literals.

I Formulas in clause form are (generalized) conjunctions of clauses.

I Notation Sometimes variables are denoted by different letters if there is a bijection between these letters and p1, . . . , pn.

I Example

(¬o ∨ m) ∧ (¬s ∨ ¬m) ∧ (¬c ∨ m) ∧ (¬c ∨ s) ∧ (¬v ∨ ¬m)

which is abbreviated by

〈[¬o, m], [¬s, ¬m], [¬c, m], [¬c, s], [¬v, ¬m]〉.



Interpretations and Models

I Notation (all symbols may be indexed)

. A denotes an atom.

. L denotes a literal.

. F, G denote formulas.

. C denotes a clause.

I Interpretations are mappings from {p1, . . . , pn} to {0, 1}.

. They can be encoded as ~v.

. They are extended to formulas as follows:

pi(~v) = vi
(¬F)(~v) = 1 − F(~v)
(F ∧ G)(~v) = F(~v) × G(~v)
(F ∨ G)(~v) = F(~v) + G(~v) − F(~v) × G(~v)

I ~v is a model for F iff F (~v) = 1.

I F is satisfiable if it has a model.
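The extension of interpretations to formulas is plain arithmetic and can be transcribed directly; the helper names var, NOT, AND, OR below are ad-hoc choices for this sketch. The formula F is the one worked out on the next slide.

```python
def var(i):    return lambda v: v[i - 1]                    # p_i(v) = v_i
def NOT(f):    return lambda v: 1 - f(v)                    # (¬F)(v) = 1 - F(v)
def AND(f, g): return lambda v: f(v) * g(v)                 # (F ∧ G)(v) = F(v) * G(v)
def OR(f, g):  return lambda v: f(v) + g(v) - f(v) * g(v)   # (F ∨ G)(v)

p1, p2, p3 = var(1), var(2), var(3)

# F = <[¬p1, p2], [p3, ¬p2]>
F = AND(OR(NOT(p1), p2), OR(p3, NOT(p2)))

v = (1, 0, 1)
assert F(v) == 0                  # v is not a model of F
assert OR(p3, NOT(p2))(v) == 1    # but it is a model of the second clause
assert F((1, 1, 1)) == 1          # hence F is satisfiable
```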



Interpretations and Models – Example

I Let F = 〈[¬p1, p2], [p3, ¬p2]〉 and ~v = (1, 0, 1), then:

F(~v) = [¬p1, p2](~v) × [p3, ¬p2](~v)
      = ((¬p1)(~v) + p2(~v) − (¬p1)(~v) × p2(~v)) × (p3(~v) + (¬p2)(~v) − p3(~v) × (¬p2)(~v))
      = ((1 − p1(~v)) + p2(~v) − (1 − p1(~v)) × p2(~v)) × (p3(~v) + (1 − p2(~v)) − p3(~v) × (1 − p2(~v)))
      = ((1 − 1) + 0 − (1 − 1) × 0) × (1 + (1 − 0) − 1 × (1 − 0))
      = 0 × 1
      = 0

I Hence, ~v is not a model for F, but is a model for [p3, ¬p2].

I Exercise

. Is F satisfiable? Prove your claim.

. Is 〈[¬p], [p,¬q], [q]〉 satisfiable? Prove your claim.

. Find all models of 〈[¬o, m], [¬s,¬m], [¬c, m], [¬c, s], [¬v,¬m]〉.


Propositional Reasoning and Energy Minimization

I Pinkas 1991: Is there a link between propositional logic and symmetric networks?

I Let F = 〈C1, . . . , Cm〉 be a propositional formula in clause form.

I We define

τ(C) = 0                                 if C = [ ],
       A                                 if C = [A],
       1 − A                             if C = [¬A],
       τ(C1) + τ(C2) − τ(C1)τ(C2)        if C = (C1 ∨ C2).

τ(F) = Σ_{i=1}^{m} (1 − τ(Ci))

I Example τ(〈[¬o, m], [¬s, ¬m], [¬c, m], [¬c, s], [¬v, ¬m]〉) = vm − cm − cs + sm − om + 2c + o.

I Exercise Compute τ (〈[¬p], [p,¬q], [q]〉).
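The translation τ can be checked by brute force. The clause encoding below (pairs of variable index and sign) is an assumption of this sketch; the formula is the o, m, s, c, v example from above, and the check anticipates the theorem on the next slide: F(~v) = 1 iff τ(F)(~v) = 0, with 0 the global minimum.

```python
from itertools import product

def tau_clause(clause, v):
    # tau([]) = 0; tau([A]) = A; tau([¬A]) = 1 - A;
    # tau(C1 ∨ C2) = tau(C1) + tau(C2) - tau(C1)tau(C2)
    t = 0
    for idx, positive in clause:
        lit = v[idx] if positive else 1 - v[idx]
        t = t + lit - t * lit
    return t

def tau(formula, v):
    return sum(1 - tau_clause(c, v) for c in formula)

def holds(formula, v):
    # clause-form semantics: every clause contains a true literal
    return all(any((v[i] == 1) == pos for i, pos in clause) for clause in formula)

# F = <[¬o, m], [¬s, ¬m], [¬c, m], [¬c, s], [¬v, ¬m]>; a literal is
# (variable index, sign), with sign False meaning the variable is negated
o, m, s, c, V = 0, 1, 2, 3, 4     # V stands for the propositional variable v
F = [[(o, False), (m, True)],
     [(s, False), (m, False)],
     [(c, False), (m, True)],
     [(c, False), (s, True)],
     [(V, False), (m, False)]]

energies = {vec: tau(F, vec) for vec in product([0, 1], repeat=5)}
assert min(energies.values()) == 0
assert all(holds(F, vec) == (e == 0) for vec, e in energies.items())
```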



Propositional Reasoning and Symmetric Networks

I Theorem F(~v) = 1 iff τ(F) has a global minimum at ~v and τ(F)(~v) = 0.

I Compare τ(F) = vm − cm − cs + sm − om + 2c + o

with E(~v) = −Σ_{k<j} wkj vj vk + Σ_k θk vk.

[Figure: the corresponding symmetric network with units u1 = o, u2 = m, u3 = s, u4 = c, u5 = v; reading off the coefficients of τ(F) gives connection weights wom = 1, wcm = 1, wcs = 1, wsm = −1, wvm = −1 and thresholds θo = 1, θc = 2, all other thresholds 0.]


Propositional Non-Monotonic Reasoning

I Pinkas 1991a: Can the above-mentioned approach be extended to non-monotonic reasoning?

I Consider F = 〈(C1, k1), . . . , (Cm, km)〉, where the Ci are clauses and ki ∈ R+.

I The penalty of ~v for (C, k) is k if C(~v) = 0 and 0 otherwise.

I The penalty of ~v for F is the sum of the penalties for the (Ci, ki).

I ~v is preferred over ~w wrt F if the penalty of ~v for F is smaller than the penalty of ~w for F.

I Modify τ to become τ(F) = Σ_{i=1}^{m} ki (1 − τ(Ci)), e.g.,

τ(〈([¬o, m], 1), ([¬s, ¬m], 2), ([¬c, m], 4), ([¬c, s], 4), ([¬v, ¬m], 4)〉) = 4vm − 4cm − 4cs + 2sm − om + 8c + o.

I The corresponding stochastic network computes most preferred interpretations.
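Penalties and preference can be sketched by brute force; the encoding of weighted clauses is an assumption of this sketch, and the formula is the weighted example from above:

```python
from itertools import product

# weighted clause form <(C1, k1), ..., (Cm, km)>; a literal is
# (variable index, sign), with sign False meaning the variable is negated
o, m, s, c, V = 0, 1, 2, 3, 4     # V stands for the propositional variable v
F = [([(o, False), (m, True)], 1),
     ([(s, False), (m, False)], 2),
     ([(c, False), (m, True)], 4),
     ([(c, False), (s, True)], 4),
     ([(V, False), (m, False)], 4)]

def clause_true(clause, vec):
    return any((vec[i] == 1) == pos for i, pos in clause)

def penalty(formula, vec):
    # pay k for every violated clause (C, k)
    return sum(k for clause, k in formula if not clause_true(clause, vec))

vectors = list(product([0, 1], repeat=5))
best = min(penalty(F, vec) for vec in vectors)
preferred = [vec for vec in vectors if penalty(F, vec) == best]

# F itself is satisfiable, so here the most preferred interpretations
# are exactly its models, with penalty 0
assert best == 0
assert (0, 0, 0, 0, 0) in preferred
```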



Exercises and Literature

I Exercise Consider

F = 〈([¬o, m], 1), ([¬s,¬m], 2), ([¬c, m], 4), ([¬c, s], 4), ([¬v,¬m], 4)〉.

. Compute the most preferred interpretations of F .

. What happens if we add (o, 100) to F ?

. What happens if we add (o, 100) and (s, 100) to F ?

I Literature

. Pinkas 1991: Symmetric Neural Networks and Logic Satisfiability. Neural Computation 3, 282-291.

. Pinkas 1991a: Propositional Non-Monotonic Reasoning and Inconsistency in Symmetrical Neural Networks. In: Proceedings International Joint Conference on Artificial Intelligence, 525-530.


Propositional Logic Programs and the Core Method

I The Very Idea

I Logic Programs

I Propositional Core Method

I Backpropagation

I Knowledge-Based Artificial Neural Networks

I Propositional Core Method using Sigmoidal Units

I Further Extensions



The Very Idea

I Various semantics for logic programs coincide with fixed points of associated immediate consequence operators (e.g., Apt, van Emden 1982).

I Banach Contraction Mapping Theorem A contraction mapping f defined on a complete metric space (X, d) has a unique fixed point. The sequence y, f(y), f(f(y)), . . . converges to this fixed point for any y ∈ X.

. Fitting 1994: Consider logic programs whose immediate consequence operator is a contraction.

I Funahashi 1989: Every continuous function on the reals can be uniformly approximated by feedforward connectionist networks.

. Hölldobler, Kalinke, Störr 1999: Consider logic programs whose immediate consequence operator is continuous on the reals.
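The Banach theorem can be illustrated on the reals with the usual metric: the hand-picked f(x) = x/2 + 1 below is a contraction with constant 1/2 and unique fixed point 2, so iterating it from any starting point converges there.

```python
def iterate(f, y, n):
    # the sequence y, f(y), f(f(y)), ... after n steps
    for _ in range(n):
        y = f(y)
    return y

f = lambda x: x / 2 + 1           # contraction with constant 1/2, fixed point 2

x = iterate(f, 17.0, 60)
assert abs(x - 2.0) < 1e-12       # converges to the fixed point ...
assert abs(f(x) - x) < 1e-12      # ... which f leaves (essentially) unchanged
```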


Metrics

I A metric on a space M is a mapping d : M × M → R such that

. d(x, y) = 0 iff x = y,

. d(x, y) = d(y, x), and

. d(x, y) ≤ d(x, z) + d(z, y).

I Let (M, d) be a metric space and S = (si | si ∈ M) a sequence.

. S converges if (∃s ∈ M)(∀ε > 0)(∃N)(∀n ≥ N) d(sn, s) ≤ ε.

. S is Cauchy if (∀ε > 0)(∃N)(∀n, m ≥ N) d(sn, sm) ≤ ε.

. (M, d) is complete if every Cauchy sequence converges.

I A mapping f : M → M is a contraction on (M, d) if (∃ 0 < k < 1)(∀x, y ∈ M) d(f(x), f(y)) ≤ k · d(x, y).


Propositional Logic Programs

I A propositional logic program P over a propositional language L is a finite set of clauses

A ← L1 ∧ . . . ∧ Ln,

where A is an atom, the Li are literals, and n ≥ 0. P is definite if all Li, 1 ≤ i ≤ n, are atoms.

I Let V be the set of all propositional variables occurring in L.

I An interpretation I is a mapping V → {⊤, ⊥}.

I I can be represented by the set of atoms which are mapped to ⊤ under I.

I 2^V is the set of all interpretations.

I Immediate consequence operator TP : 2^V → 2^V:

TP(I) = {A | there is a clause A ← L1 ∧ . . . ∧ Ln ∈ P such that I |= L1 ∧ . . . ∧ Ln}.

I I is a supported model iff TP(I) = I.
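TP and its iteration from the empty interpretation can be transcribed directly; the clause encoding (head, body) with signed body literals is an assumption of this sketch, and the iteration returns a supported model only for programs where it stabilizes, as it does for the two example programs.

```python
def tp(program, interp):
    # T_P(I) = { A | A <- L1 ∧ ... ∧ Ln in P and I |= L1 ∧ ... ∧ Ln }
    return {head for head, body in program
            if all((atom in interp) == positive for atom, positive in body)}

def iterate_tp(program, interp=frozenset(), limit=100):
    # iterate T_P until an interpretation I with T_P(I) = I is reached
    for _ in range(limit):
        nxt = frozenset(tp(program, interp))
        if nxt == interp:
            return set(interp)
        interp = nxt
    raise RuntimeError("no fixed point reached within limit")

# P = {p, q <- p, r <- q}, a definite program; a body literal is
# (atom, sign), with sign False meaning the atom is negated
P = [("p", []), ("q", [("p", True)]), ("r", [("q", True)])]
assert iterate_tp(P) == {"p", "q", "r"}       # the least, supported model

# P' = {p, r <- p ∧ ¬q, r <- ¬p ∧ q} from the Core Method example below
P2 = [("p", []),
      ("r", [("p", True), ("q", False)]),
      ("r", [("p", False), ("q", True)])]
assert iterate_tp(P2) == {"p", "r"}           # lfp(T_P') = {p, r}
```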


Exercises

I Consider P = {p, q ← p, r ← q}.

. Draw the lattice of all interpretations of P wrt the ⊆ ordering.

. Mark the models of P.

. Compute TP(∅), TP(TP(∅)), . . ..

. Mark the supported models of P.

I Let P be a definite program.

. Show that if M1 and M2 are models of P then so is M1 ∩ M2.

. Let M be the least model of P. Show that M is a supported model.



The Core Method

I Let L be a logic language.

I Given a logic program P together with its immediate consequence operator TP.

I Let I be the set of interpretations for P.

I Find a mapping R : I → Rn.

I Construct a feed-forward network computing fP : Rn → Rn, called the core, such that the following holds:

. If TP(I) = J then fP(R(I)) = R(J), where I, J ∈ I.

. If fP(~s) = ~t then TP(R^{-1}(~s)) = R^{-1}(~t), where ~s, ~t ∈ Rn.

I Connect the units in the output layer recursively to the units in the input layer.

I Show that the following holds:

. I = lfp(TP) iff the recurrent network converges to or approximates R(I).

Connectionist model generation using recurrent networks with a feed-forward core.


3-Layer Recurrent Networks

[Figure: a 3-layer feed-forward core with input, hidden, and output layer; the output units are connected back recursively to the input units, turning the core into a recurrent network.]

I At each point in time all units do:

. apply the activation function to obtain the potential,

. apply the output function to obtain the output.



Propositional Core Method using Binary Threshold Units

I Let L be the language of propositional logic over a set V of variables.

I Let P be a propositional logic program, e.g.,

P = {p, r ← p ∧ ¬q, r ← ¬p ∧ q}.

I I = 2^V is the set of interpretations for P.

I TP(I) = {A | A ← L1 ∧ . . . ∧ Lm ∈ P such that I |= L1 ∧ . . . ∧ Lm}.

TP(∅) = {p}
TP({p}) = {p, r}
TP({p, r}) = {p, r} = lfp(TP)


Page 98: €¦ · Introduction & Motivation: Connectionist Systems IWell-suited to learn, to adapt to new environments, to degrade gracefully etc. IMany successful applications. IApproximate

Representing Interpretations

I I = 2^V.

I Let n = |V| and identify V with {1, . . . , n}.

I Define R : I → R^n such that for all 1 ≤ j ≤ n we find:

R(I)[j] = 1 if j ∈ I,
R(I)[j] = 0 if j ∉ I.

E.g., if V = {p, q, r} = {1, 2, 3} and I = {p, r} then R(I) = (1, 0, 1).

I Other encodings are possible, e.g.,

R(I)[j] = 1 if j ∈ I,
R(I)[j] = −1 if j ∉ I.
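A minimal sketch of the encoding R for V = {p, q, r}, covering both variants:

```python
# Vector encoding R of an interpretation over V = {p, q, r} (identified
# with 1..n). The second variant uses -1 instead of 0 for absent atoms.

V = ["p", "q", "r"]

def encode(interp, false_value=0):
    return [1 if atom in interp else false_value for atom in V]

print(encode({"p", "r"}))        # -> [1, 0, 1]
print(encode({"p", "r"}, -1))    # -> [1, -1, 1]
```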



Computing the Core

I Consider again P = {p, r ← p ∧ ¬q, r ← ¬p ∧ q}.

I A translation algorithm translates P into a core of binary threshold units:

[Figure: a 3-layer core with input units p, q, r, one hidden unit per clause, and output units p, q, r; positive body literals are connected with weight ω, negated ones with weight −ω, and the thresholds (multiples of ω/2) separate satisfied from unsatisfied clause bodies.]

I Exercise Specify the core for {p1 ← p2, p1 ← p3 ∧ p4, p1 ← p5 ∧ p6}.
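One way to realize the translation algorithm programmatically. The concrete scheme below (weight ω for positive and −ω for negated body literals, hidden thresholds ω(k − 1/2) for k positive body literals, output thresholds ω/2) is one standard choice and may differ in detail from the diagram above:

```python
# Translating a program into a core of binary threshold units (sketch).
# Clause encoding as before: (head, [(atom, positive), ...]).
# One hidden unit per clause: weights +W / -W, threshold W*(k - 1/2) with
# k the number of positive body literals; each output unit collects its
# clauses with weight W and threshold W/2.

W = 1.0  # the weight omega; any positive value works for threshold units

def step(program, interp, atoms):
    """One pass input -> hidden -> output, i.e., one application of T_P."""
    x = {a: 1 if a in interp else 0 for a in atoms}
    hidden = []
    for head, body in program:
        k = sum(1 for _, positive in body if positive)
        p = sum(W * x[a] if positive else -W * x[a] for a, positive in body)
        hidden.append((head, 1 if p >= W * (k - 0.5) else 0))
    return {a for a in atoms
            if sum(W * v for h, v in hidden if h == a) >= W / 2}

P = [("p", []),
     ("r", [("p", True), ("q", False)]),
     ("r", [("p", False), ("q", True)])]
atoms = {"p", "q", "r"}

print(sorted(step(P, set(), atoms)))    # -> ['p']
print(sorted(step(P, {"p"}, atoms)))    # -> ['p', 'r']
```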



Some Results

I Proposition 2-layer networks cannot compute TP for definite P.

I Theorem For each program P, there exists a core computing TP.

I Recall P = {p, r ← p ∧ ¬q, r ← ¬p ∧ q}.

I Adding recurrent connections:

[Figure: the core for P from the previous slide, extended by recurrent connections of weight 1 leading from each output unit back to the corresponding input unit.]


Strongly Determined Programs

I A logic program P is said to be strongly determined if there exists a metric d on the set of all Herbrand interpretations for P such that TP is a contraction with respect to d.

I Exercise Are the following programs strongly determined?

. {p, q ← p, r ← q},

. {p1 ← p2, p1 ← p3 ∧ p4, p1 ← p5 ∧ p6},

. {p← ¬p}.

I Corollary Let P be a strongly determined program. Then there exists a core with recurrent connections such that the computation with an arbitrary initial input converges and yields the unique fixed point of TP.
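The corollary, and its failure for programs that are not strongly determined, can be illustrated by iterating from arbitrary start interpretations; the code iterates TP directly, which by the theorem above is what the recurrent core computes:

```python
# Iterating T_P as a recurrent computation from an arbitrary initial
# interpretation. {p <- ~p} oscillates between {} and {p}, so it cannot be
# strongly determined; the example program converges from every start.

def tp(program, interp):
    return {h for h, body in program
            if all((a in interp) == pos for a, pos in body)}

def iterate(program, start, limit=10):
    seen, I = [start], start
    for _ in range(limit):
        I = tp(program, I)
        if I in seen:                  # a fixed point or a cycle is reached
            return I, I == seen[-1]    # second component: converged?
        seen.append(I)
    return I, False

P = [("p", []), ("r", [("p", True), ("q", False)]), ("r", [("p", False), ("q", True)])]
Q = [("p", [("p", False)])]            # p <- ~p

print(iterate(P, {"q"}))               # converges to the fixed point {p, r}
print(iterate(Q, set()))               # -> (set(), False): {} -> {p} -> {} -> ...
```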


Time and Space Complexity

I Let n be the number of clauses and m be the number of propositional variables occurring in P.

. 2m + n units, 2mn connections in the core.

. TP(I) is computed in 2 steps.

. The parallel computational model to compute TP(I) is optimal.

. The recurrent network settles down in 3n steps in the worst case.

I Exercise Give an example of a program with worst case time behavior.


Rule Extraction (1)

I Proposition For each core C there exists a program P such that C computes TP.

[Figure: a core with input units u1, u2, hidden units u3, u4, u5, and output units u6, u7, annotated with its weights and thresholds.]

I/O behaviour of the core, with pk the weighted input and vk the output value of unit uk:

v1 v2 | p3 v3 | p4 v4 | p5 v5 | p6 v6 | p7 v7
 0  0 |   0 0 |   0 1 |   0 0 |   0 1 |  −1 0
 0  1 | 1.5 1 |  .3 1 |  .8 1 | 1.8 1 |  .7 1
 1  0 |   1 1 |  −1 0 | −.5 0 |   2 1 |  .7 1
 1  1 | 2.5 1 | −.7 0 |  .3 0 |   2 1 |  .7 1


Rule Extraction (2)

I Extracted program:

P = { q1 ← ¬q1 ∧ ¬q2,

q1 ← ¬q1 ∧ q2, q2 ← ¬q1 ∧ q2,

q1 ← q1 ∧ ¬q2, q2 ← q1 ∧ ¬q2,

q1 ← q1 ∧ q2, q2 ← q1 ∧ q2 }.

I Simplified form: P = {q1, q2 ← q1, q2 ← ¬q1 ∧ q2}.

I One can do much better than this simple approach (see Mayer-Eichberger 2006).
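The decompositional idea can be mimicked pedestrianly: query the network on every interpretation and write one clause per input pattern and derived atom. The function `net` below is a hypothetical stand-in whose behaviour matches the extracted program above:

```python
from itertools import chain, combinations

# Pedestrian rule extraction: enumerate all interpretations, run the core
# once on each, and emit one clause per input pattern and active output.

V = ["q1", "q2"]

def net(interp):
    # hypothetical core behaviour matching the extracted program on the slide
    return {"q1"} | ({"q2"} if interp else set())

def extract(net, atoms):
    program = []
    subsets = chain.from_iterable(combinations(atoms, r) for r in range(len(atoms) + 1))
    for bits in subsets:
        interp = set(bits)
        body = [(a, a in interp) for a in atoms]
        for head in net(interp):
            program.append((head, body))
    return program

for head, body in extract(net, V):
    print(head, "<-", " & ".join(a if pos else "~" + a for a, pos in body))
```

The resulting seven clauses can then be simplified as on the slide.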


Literature

I Apt, van Emden 1982: Contributions to the Theory of Logic Programming. Journal of the ACM 29, 841-862.

I Fitting 1994: Metric Methods – Three Examples and a Theorem. Journal of Logic Programming 21, 113-127.

I Funahashi 1989: On the Approximate Realization of Continuous Mappings by Neural Networks. Neural Networks 2, 183-192.

I Hitzler, Hölldobler, Seda 2004: Logic Programs and Connectionist Networks. Journal of Applied Logic 2, 245-272.

I Hölldobler, Kalinke 1994: Towards a Massively Parallel Computational Model for Logic Programming. In: Proceedings of the ECAI94 Workshop on Combining Symbolic and Connectionist Processing, 68-77.

I Hölldobler, Kalinke, Störr 1999: Approximating the Semantics of Logic Programs by Recurrent Neural Networks. Applied Intelligence 11, 45-59.

I Mayer-Eichberger 2006: Extracting Propositional Logic Programs from Neural Networks: A Decompositional Approach. Bachelor Thesis, TU Dresden.


3-Layer Feed-Forward Networks Revisited

I Theorem (Funahashi 1989) Suppose that Ψ : R → R is non-constant, bounded, monotone increasing and continuous. Let K ⊆ R^n be compact, let f : K → R be continuous, and let ε > 0. Then there exists a 3-layer feed-forward network with output function Ψ for the hidden layer and linear output function for the input and output layer whose input-output mapping f̄ : K → R satisfies

max_{x∈K} |f(x) − f̄(x)| < ε.

. Every continuous function f : K → R can be uniformly approximated by input-output functions of 3-layer feed-forward networks.

I uk is a sigmoidal unit if

Φ(i⃗k) = pk = Σ_{j=1}^{m} wkj vj,
Ψ(pk) = vk = 1 / (1 + e^{β(θk − pk)}),

where θk ∈ R is a threshold (or bias) and β > 0 a steepness parameter.
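The input and output function of such a unit, as a small sketch:

```python
import math

# A sigmoidal unit u_k: Phi computes the weighted input p_k,
# Psi the sigmoidal output with threshold theta and steepness beta.

def unit(weights, values, theta, beta=1.0):
    p = sum(w * v for w, v in zip(weights, values))        # Phi: weighted input
    return 1.0 / (1.0 + math.exp(beta * (theta - p)))      # Psi: output value

print(unit([1.0, 1.0], [1, 1], theta=0.5))   # well above threshold: close to 1
print(unit([1.0, 1.0], [0, 0], theta=0.5))   # below threshold: below 1/2
```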


Backpropagation

I Bryson, Ho 1969, Werbos 1974, Parker 1985, Rumelhart et al. 1986: Can 3-layer feed-forward networks learn a particular function?

I Training set of input-output pairs {(i⃗^l, o⃗^l) | 1 ≤ l ≤ n}.

I Minimize E = Σ_l E^l where E^l = (1/2) Σ_k (o^l_k − v^l_k)^2.

I Gradient descent algorithm to learn appropriate weights.

I Backpropagation

. Initialize weights arbitrarily.

. Do until all input-output patterns are correctly classified:

1 Present input pattern i⃗^l at time t.
2 Compute output pattern v⃗^l at time t + 2.
3 Change weights according to ∆w^l_ij = η δ^l_i v^l_j, where

δ^l_i = Ψ′_i(p^l_i) · (o^l_i − v^l_i) if i is an output unit,
δ^l_i = Ψ′_i(p^l_i) · Σ_k δ^l_k w_ki if i is a hidden unit,

and η > 0 is called the learning rate.
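A minimal backpropagation sketch for a 3-layer network of sigmoidal units following the delta rules above; thresholds are omitted and only a single training pattern is used, so this illustrates one gradient step rather than the full training loop:

```python
import math, random

# One backpropagation step for a 2-2-1 network of sigmoidal units (beta = 1).

def sig(p):
    return 1.0 / (1.0 + math.exp(-p))

def forward(W1, W2, x):
    h = [sig(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    o = [sig(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    return h, o

def backprop_step(W1, W2, x, target, eta=0.5):
    h, o = forward(W1, W2, x)
    d_out = [oi * (1 - oi) * (t - oi) for oi, t in zip(o, target)]
    d_hid = [hi * (1 - hi) * sum(dk * W2[k][i] for k, dk in enumerate(d_out))
             for i, hi in enumerate(h)]
    for k, dk in enumerate(d_out):                       # update output weights
        W2[k] = [w + eta * dk * hi for w, hi in zip(W2[k], h)]
    for i, di in enumerate(d_hid):                       # update hidden weights
        W1[i] = [w + eta * di * xj for w, xj in zip(W1[i], x)]
    # error on this pattern, measured before the update
    return sum((t - oi) ** 2 for oi, t in zip(o, target)) / 2

random.seed(0)
W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(2)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(1)]
e0 = backprop_step(W1, W2, [1.0, 0.0], [1.0])
e1 = backprop_step(W1, W2, [1.0, 0.0], [1.0])
print(e1 < e0)   # one gradient step reduces the error on this pattern
```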


Output Functions Revisited

I Remember the sigmoidal function (with β = 1):

v_i = 1 / (1 + e^{−(Σ_j w_ij v_j + θ_i)}).

I We find

dv_i / d(Σ_j w_ij v_j + θ_i) = v_i (1 − v_i).

I Hence

δ^l_i = v^l_i (1 − v^l_i) (o^l_i − v^l_i) if u_i is an output unit,
δ^l_i = v^l_i (1 − v^l_i) Σ_k δ^l_k w_ki if u_i is a hidden unit.

I Units are active if vi ≥ 0.9 and passive if vi ≤ 0.1.
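The derivative identity dv/dp = v(1 − v) can be checked numerically against a central difference:

```python
import math

# Numerical check of dv/dp = v(1 - v) for the sigmoidal output function.

def v(p):
    return 1.0 / (1.0 + math.exp(-p))

for p in (-2.0, 0.0, 1.5):
    eps = 1e-6
    numeric = (v(p + eps) - v(p - eps)) / (2 * eps)
    analytic = v(p) * (1 - v(p))
    print(p, abs(numeric - analytic) < 1e-8)
```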


Properties

I Learning rate η:

. If η is large, then the system learns rapidly but may oscillate.

. If η is small, then the system learns slowly but will not oscillate.

. In the ideal case η should be adapted during learning:

∆wij(t + 1) = ηδi(t)vj(t) + α∆wij(t)

where α is a constant and α ∆wij(t) is called the momentum term.

I Almost all functions can be learned.

I Learning is NP–hard.

I Literature Rumelhart et al. 1986: Parallel Distributed Processing. MIT Press.


Level Mappings and Hierarchical Logic Programs

I Let V be a set of propositional variables and P be a propositional logic program wrt V.

I A level mapping forP is a function l : V → N.

. We define l(¬A) = l(A).

I P is hierarchical if for all clauses A ← L1 ∧ . . . ∧ Ln ∈ P we find l(A) > l(Li) for all 1 ≤ i ≤ n.
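A small check of the hierarchy condition, using the KBANN example program from a later slide and an assumed level mapping:

```python
# Checking whether a propositional program is hierarchical under a level
# mapping (clause encoding as before; l(~A) = l(A) holds automatically
# because literals carry the atom).

def is_hierarchical(program, level):
    return all(level[head] > level[atom]
               for head, body in program
               for atom, _ in body)

P = [("A", [("B", True), ("C", True), ("D", False)]),
     ("A", [("D", True), ("E", False)]),
     ("H", [("F", True), ("G", True)]),
     ("K", [("A", True), ("H", False)])]
l = {"B": 0, "C": 0, "D": 0, "E": 0, "F": 0, "G": 0, "A": 1, "H": 1, "K": 2}

print(is_hierarchical(P, l))                               # -> True
print(is_hierarchical([("p", [("p", False)])], {"p": 0}))  # -> False
```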



Knowledge Based Artificial Neural Networks

I Towell, Shavlik 1994: Can we do better than empirical learning?

I Sets of hierarchical logic programs, e.g.,

P = {A ← B ∧ C ∧ ¬D, A ← D ∧ ¬E, H ← F ∧ G, K ← A ∧ ¬H}.

[Figure: the KBANN network for P with input units B, C, D, E, F, G, units A and H in the middle, and unit K on top; positive antecedents are connected with weight ω, negated antecedents with weight −ω, and each unit's threshold (e.g., 3ω/2 for A and H) lies between the weighted inputs of satisfied and unsatisfied rule bodies.]


Knowledge Based Artificial Neural Networks – Learning

I Given hierarchical sets of propositional rules as background knowledge.

I Map rules into multi-layer feed forward networks with sigmoidal units.

I Add hidden units (optional).

I Add units for known input features that are not referenced in the rules.

I Fully connect layers.

I Add near-zero random numbers to all links and thresholds.

I Apply backpropagation.

. Empirical evaluation: the system performs better than purely empirical and purely hand-built classifiers.



Knowledge Based Artificial Neural Networks – A Problem

I Works if rules have few conditions and there are few rules with the same head.

[Figure: unit A with antecedents A1, . . . , A10 and threshold 19ω/2, unit B with antecedents B1, . . . , B10 and threshold 19ω/2, both feeding unit C with threshold ω/2.]

I pA = pB = 9ω and vA = vB = 1 / (1 + e^{β(9.5ω − 9ω)}) ≈ 0.46 with β = 1.

I pC = 0.92ω and vC = 1 / (1 + e^{β(0.5ω − 0.92ω)}) ≈ 0.6 with β = 1.
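The problem can be quantified: with 9 of 10 antecedents true, the weighted input 9ω lies only ω/2 below the threshold 19ω/2, so for moderate ω the sigmoidal output stays near 1/2 instead of near 0:

```python
import math

# With 9 of 10 antecedents true, the unit's weighted input 9*omega sits
# just 0.5*omega below the threshold 9.5*omega; only a large omega pushes
# the output clearly below the "passive" region.

def v(p, theta, beta=1.0):
    return 1.0 / (1.0 + math.exp(beta * (theta - p)))

for omega in (0.2, 1.0, 5.0):
    print(omega, round(v(9 * omega, 9.5 * omega), 3))
# -> 0.2 0.475
# -> 1.0 0.378
# -> 5.0 0.076
```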

I Literature Towell, Shavlik 1994: Knowledge Based Artificial Neural Networks. Artificial Intelligence 70, 119-165.


Propositional Core Method using Bipolar Sigmoidal Units

I d’Avila Garcez, Zaverucha, Carvalho 1997: Can we combine the ideas in Hölldobler, Kalinke 1994 and Towell, Shavlik 1994 while avoiding the above mentioned problem?

I Consider a propositional logic language.

I Let I be an interpretation and a ∈ [0, 1].

R(I)[j] = v ∈ [a, 1] if j ∈ I,
R(I)[j] = w ∈ [−1, −a] if j ∉ I.

I Replace threshold and sigmoidal units by bipolar sigmoidal ones, i.e., units with

Φ(i⃗k) = pk = Σ_{j=1}^{m} wkj vj,
Ψ(pk) = vk = 2 / (1 + e^{β(θk − pk)}) − 1,

where θk ∈ R is a threshold (or bias) and β > 0 a steepness parameter.


The Task

I How should a, ω and θi be selected such that:

. vi ∈ [a, 1] or vi ∈ [−1,−a] and

. the core computes the immediate consequence operator?



Hidden Layer Units

I Consider A← L1 ∧ . . . ∧ Ln.

I Let u be the hidden layer unit for this rule.

. Suppose I |= L1 ∧ . . . ∧ Ln.

• u receives input ≥ ωa from the unit representing each Li.
• pu ≥ nωa = p+u.

. Suppose I ⊭ L1 ∧ . . . ∧ Ln.

• u receives input ≤ −ωa from at least one unit representing some Li.
• pu ≤ (n − 1)ω · 1 − ωa = (n − 1)ω − ωa = p−u.

I θu = (p+u + p−u)/2 = (nωa + (n − 1)ω − ωa)/2 = (na + n − 1 − a) · ω/2 = (n − 1)(a + 1) · ω/2.



Output Layer Units

I Let µ be the number of clauses with head A.

I Consider A ← L1 ∧ . . . ∧ Ln.

I Suppose I |= L1 ∧ . . . ∧ Ln.

. pA ≥ ωa + (µ − 1)ω · (−1) = ωa − (µ − 1)ω = p+A.

I Suppose for all rules of the form A ← L1 ∧ . . . ∧ Ln we find I ⊭ L1 ∧ . . . ∧ Ln.

. pA ≤ −µωa = p−A.

I θA = (p+A + p−A)/2 = (ωa − (µ − 1)ω − µωa)/2 = (a − µ + 1 − µa) · ω/2 = (1 − µ)(a + 1) · ω/2.



Computing a Value for a

I p+u > p−u :

. nωa > (n− 1)ω − ωa.

. nωa + ωa > (n− 1)ω.

. a(n + 1)ω > (n− 1)ω.

. a > n−1n+1 .

I p+A > p−A:

. ωa− (µ− 1)ω > −µaω.

. ωa + µaω > (µ− 1)ω.

. a(1 + µ)ω > (µ− 1)ω.

. a > µ−1µ+1 .

I Consider all rules minimum value for a.
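Taking the maximum of the two lower bounds over all rules gives the minimum admissible a. A minimal numerical sketch (the function name and the safety margin are my own, not from the slides; n is the largest clause body size, µ the largest number of clauses sharing a head):

```python
def minimum_a(n: int, mu: int, margin: float = 1e-3) -> float:
    """Smallest admissible activation threshold a, plus a small margin.

    From the derivation above:
      a > (n - 1)/(n + 1)    (from p+_u > p-_u)
      a > (mu - 1)/(mu + 1)  (from p+_A > p-_A)
    """
    bound = max((n - 1) / (n + 1), (mu - 1) / (mu + 1))
    return bound + margin

a = minimum_a(n=3, mu=2)
assert a > (3 - 1) / (3 + 1)   # 0.5, the binding bound here
assert a > (2 - 1) / (2 + 1)   # 1/3
```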

ICCL – International Center for Computational Logic

Algebra, Logic and Formal Methods in Computer Science


Computing a Value for ω

I Ψ(p) = 2/(1 + e^(β(θ − p))) − 1 ≥ a.

I 2/(1 + e^(β(θ − p))) ≥ 1 + a.

I 2/(1 + a) ≥ 1 + e^(β(θ − p)).

I 2/(1 + a) − 1 = (2 − (1 + a))/(1 + a) = (1 − a)/(1 + a) ≥ e^(β(θ − p)).

I ln((1 − a)/(1 + a)) ≥ β(θ − p).

I (1/β) ln((1 − a)/(1 + a)) ≥ θ − p.

I Consider a hidden layer unit:

. (1/β) ln((1 − a)/(1 + a)) ≥ (n − 1)(a + 1)ω/2 − nωa = ((na + n − a − 1 − 2na)/2)ω = ((n − 1 − a(n + 1))/2)ω.

. ω ≥ (2/((n − 1 − a(n + 1))β)) ln((1 − a)/(1 + a)) because a ≥ (n − 1)/(n + 1).

I Consider all hidden and output layer units as well as the case that Ψ(p) ≤ −a: minimum value for ω.
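Choosing ω exactly at this bound makes a hidden unit with threshold θ = (n − 1)(a + 1)ω/2 output exactly a at the minimal active input p = nωa, which can be checked numerically (a sketch under those assumptions; the helper names are my own):

```python
import math

def omega_bound(n: int, a: float, beta: float) -> float:
    # ω ≥ 2/((n - 1 - a(n + 1))·β) · ln((1 - a)/(1 + a)); for
    # a > (n - 1)/(n + 1) both factors are negative, so the bound is positive.
    return 2.0 / ((n - 1 - a * (n + 1)) * beta) * math.log((1 - a) / (1 + a))

def psi(p: float, theta: float, beta: float) -> float:
    # The bipolar sigmoid Ψ used on the slides.
    return 2.0 / (1.0 + math.exp(beta * (theta - p))) - 1.0

n, beta = 3, 1.0
a = 0.6                             # must exceed (n - 1)/(n + 1) = 0.5
w = omega_bound(n, a, beta)
theta = (n - 1) * (a + 1) * w / 2   # hidden unit threshold from the derivation
p = n * w * a                       # smallest input of an "active" unit
assert w > 0
assert abs(psi(p, theta, beta) - a) < 1e-9   # output hits a exactly at the bound
```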


Exercises

I Show that hierarchical programs are strongly determined.

I Consider P = {r ← p ∧ ¬q, r ← ¬p ∧ q, p ← s ∧ t}.

. Compute values for a, ω and θi.

. Specify the core for P.

. How can the approach be extended to handle facts like s and t?

I Consider now P′ = P ∪ {s, t}, where P is as before.

. Show that P′ is strongly determined.

. Show that the recurrent network computes the least model of P ∪ {s, t}.
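For the last exercise, the expected result can be cross-checked with a direct fixpoint computation of the immediate consequence operator (a plain sketch independent of the network construction; since q is never derivable, naive iteration with negation read as 'not in the interpretation' converges here):

```python
# P' = P ∪ {s, t} with P = {r ← p ∧ ¬q, r ← ¬p ∧ q, p ← s ∧ t}.
# Each clause is (head, positive body atoms, negative body atoms).
clauses = [
    ("r", {"p"}, {"q"}),
    ("r", {"q"}, {"p"}),
    ("p", {"s", "t"}, set()),
    ("s", set(), set()),
    ("t", set(), set()),
]

def tp(interp):
    """Immediate consequences of interp."""
    return {h for h, pos, neg in clauses
            if pos <= interp and not (neg & interp)}

interp = set()
while tp(interp) != interp:      # iterate from ∅ until a fixed point is reached
    interp = tp(interp)

assert interp == {"s", "t", "p", "r"}   # the least model of P ∪ {s, t}
```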


Results

I Relation to logic programs is preserved.

I The core is trainable by backpropagation.

I Many interesting applications, e.g.:

. DNA sequence analysis.

. Power system fault diagnosis.

I Empirical evaluation: the system performs better than well-known machine learning systems.

I See d’Avila Garcez, Broda, Gabbay 2002 for details.

I Literature

. d’Avila Garcez, Zaverucha, Carvalho 1997: Logic Programming and Inductive Inference in Artificial Neural Networks. In: Knowledge Representation in Neural Networks, Logos, Berlin, 33-46.

. d’Avila Garcez, Broda, Gabbay 2002: Neural-Symbolic Learning Systems: Foundations and Applications, Springer.



Further Extensions

I Many-valued logic programs

I Modal logic programs

I Answer set programming

I Metalevel priorities

I Rule extraction


Propositional Core Method – Three-Valued Logic Programs

I Kalinke 1994: Consider truth values ⊤, ⊥, u.

I Interpretations are pairs I = 〈I⁺, I⁻〉.

I Immediate consequence operator Φ_P(I) = 〈J⁺, J⁻〉, where

J⁺ = {A | A ← L1 ∧ . . . ∧ Lm ∈ P and I(L1 ∧ . . . ∧ Lm) = ⊤},
J⁻ = {A | for all A ← L1 ∧ . . . ∧ Lm ∈ P : I(L1 ∧ . . . ∧ Lm) = ⊥}.

I Let n = |V| and identify V with {1, . . . , n}.

I Define R : I → R^(2n) as follows:

R(I)[2j − 1] = 1 if j ∈ I⁺, 0 if j ∉ I⁺, and R(I)[2j] = 1 if j ∈ I⁻, 0 if j ∉ I⁻.
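The operator Φ_P and the embedding R can be stated concretely (a sketch; the function names and the encoding of literals are my own, and note that atoms without defining clauses vacuously satisfy the J⁻ condition):

```python
def phi(program, atoms, interp):
    """One application of the three-valued operator Φ_P.

    program: list of (head, body), body a list of literals (atom, positive);
    atoms:   the universe V, here {1, ..., n};
    interp:  pair (Iplus, Iminus) of disjoint sets of atoms.
    """
    Iplus, Iminus = interp

    def val(body):  # three-valued truth value of a conjunction of literals
        vs = []
        for atom, positive in body:
            if atom in Iplus:
                vs.append(positive)
            elif atom in Iminus:
                vs.append(not positive)
            else:
                vs.append(None)            # unknown (u)
        if any(v is False for v in vs):
            return False
        if all(v is True for v in vs):
            return True
        return None

    Jplus = {h for h, b in program if val(b) is True}
    Jminus = {A for A in atoms
              if all(val(b) is False for h, b in program if h == A)}
    return Jplus, Jminus

def embed(interp, n):
    """R : I -> R^(2n); unit 2j-1 codes j ∈ I⁺ and unit 2j codes j ∈ I⁻."""
    Iplus, Iminus = interp
    vec = []
    for j in range(1, n + 1):
        vec += [1 if j in Iplus else 0, 1 if j in Iminus else 0]
    return vec

# V = {1, 2, 3} and P = {1 ← 2 ∧ ¬3}.
prog, atoms = [(1, [(2, True), (3, False)])], {1, 2, 3}
assert phi(prog, atoms, ({2}, {3})) == ({1}, {2, 3})
assert phi(prog, atoms, ({3}, set())) == (set(), {1, 2, 3})
assert embed(({1}, {2}), 3) == [1, 0, 0, 1, 0, 0]
```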



Propositional Core Method – Multi-Valued Logic Programs

I For each program P, there exists a core computing Φ_P, e.g.,

P = {C ← A ∧ ¬B, D ← C ∧ E, D ← ¬C}.

[Figure: core network for P with input and output layers of paired units A, ¬A, B, ¬B, C, ¬C, D, ¬D, E, ¬E, one hidden unit per clause, unit thresholds 1/2, and connection weights ω/2 and 3ω/2.]

I Lane, Seda 2004: Extension to finitely determined sets of truth values.


Propositional Core Method – Modal Logic Programs

I d’Avila Garcez, Lamb, Gabbay 2002.

I Let L be a propositional logic language plus

. the modalities □ and ◇, and

. a finite set of labels w1, . . . , wk denoting worlds.

I Let B be an atom, then □B and ◇B are modal atoms.

I A modal definite logic program P is a set of clauses of the form

wi : A ← A1 ∧ . . . ∧ Am

together with a finite set of relations wi I wj, where wi, wj, 1 ≤ i, j ≤ k, are labels and A, A1, . . . , Am are atoms or modal atoms.

I P = ⋃_{i=1}^{k} Pi, where Pi consists of all clauses labelled with wi.



Modal Logic Programs – Semantics

I Example: P = {w1 : A, w1 : ◇C ← A} ∪ {w2 : B} ∪ {w3 : B} ∪ {w4 : B} ∪ {w1 I w2, w1 I w3, w1 I w4, w2 I w4}.

I Kripke semantics:

[Figure: worlds w1, w2, w3, w4 with accessibility edges w1 I w2, w1 I w3, w1 I w4 and w2 I w4. The build-up annotates each world with the atoms that become true, e.g. A and ◇C at w1 and B at w2, w3, w4; the diamond is witnessed by choosing f_C(w1) = w4, which makes C true at w4.]


Modal Immediate Consequence Operator

I Interpretations are tuples I = 〈I1, . . . , Ik〉.

I Immediate consequence operator MT_P(I) = 〈J1, . . . , Jk〉, where

Ji = {A | there exists A ← A1 ∧ . . . ∧ Am ∈ Pi such that {A1, . . . , Am} ⊆ Ii}
∪ {◇A | there exists wi I wj ∈ P and A ∈ Ij}
∪ {□A | for all wi I wj ∈ P we find A ∈ Ij}
∪ {A | there exists wj I wi ∈ P and □A ∈ Ij}
∪ {A | there exists wj I wi ∈ P, ◇A ∈ Ij and f_A(wj) = wi}
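A direct implementation of one MT_P step makes the five cases explicit (a sketch with a data layout of my own choosing; modal atoms are encoded as tagged pairs, and the □-case is vacuously true at worlds without successors):

```python
def mtp(clauses, rel, f, atoms, interp):
    """One application of MT_P.

    clauses: {i: [(head, [body atoms])]}; heads/bodies may be ('box', A) etc.
    rel:     set of pairs (i, j) encoding w_i I w_j
    f:       {(A, j): i} encoding the choice function f_A(w_j) = w_i
    atoms:   finite universe of plain atoms
    interp:  {i: set of (possibly modal) atoms}
    """
    J = {}
    for i in interp:
        Ji = set()
        for head, body in clauses.get(i, []):     # local definite clauses
            if all(b in interp[i] for b in body):
                Ji.add(head)
        succs = [dst for (src, dst) in rel if src == i]
        preds = [src for (src, dst) in rel if dst == i]
        for A in atoms:
            if any(A in interp[j] for j in succs):
                Ji.add(("dia", A))
            if all(A in interp[j] for j in succs):   # vacuous if no successor
                Ji.add(("box", A))
            if any(("box", A) in interp[j] for j in preds):
                Ji.add(A)
            if any(("dia", A) in interp[j] and f.get((A, j)) == i
                   for j in preds):
                Ji.add(A)
        J[i] = Ji
    return J

# The example program from the previous slides.
clauses = {1: [("A", []), (("dia", "C"), ["A"])],
           2: [("B", [])], 3: [("B", [])], 4: [("B", [])]}
rel = {(1, 2), (1, 3), (1, 4), (2, 4)}
f = {("C", 1): 4}                 # f_C(w1) = w4 witnesses the diamond
atoms = {"A", "B", "C"}

I = {i: set() for i in range(1, 5)}
for _ in range(10):               # iterate; this example reaches a fixed point
    I = mtp(clauses, rel, f, atoms, I)

assert ("dia", "C") in I[1] and ("box", "B") in I[1]
assert "C" in I[4]                # C placed at w4 via f_C(w1) = w4
assert ("box", "C") in I[2]       # w4 is the only successor of w2
```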


Modal Logic Programs – The Translation Algorithm

I Let n = |V| and identify V with {1, . . . , n}.

I Let a ∈ [0, 1].

I Define R : I → R^(3n) as follows:

R(I)[3j − 2] = v ∈ [a, 1] if j ∈ Ij, w ∈ [−1, −a] if j ∉ Ij
R(I)[3j − 1] = v ∈ [a, 1] if □j ∈ Ij, w ∈ [−1, −a] if □j ∉ Ij
R(I)[3j] = v ∈ [a, 1] if ◇j ∈ Ij, w ∈ [−1, −a] if ◇j ∉ Ij

I Translation algorithm such that

. for each world the “local” part of MT_P is computed by a core,

. the cores are turned into recurrent networks, and

. the cores are connected with respect to the given set of relations.



The Example Network

[Figure: one core per world w1, w2, w3, w4, each with input and output units A, □A, ◇A, B, □B, ◇B, C, □C, ◇C; each core is made recurrent, and additional ∧- and ∨-units connect the cores according to the accessibility relations.]


First-Order Logic

I Existing Approaches

. Reflexive Reasoning and SHRUTI

. Connectionist Term Representations

• Holographic Reduced Representations (Plate 1991)
• Recursive Auto-Associative Memory (Pollack 1988)

. Horn logic and CHCL (Hölldobler 1990, Hölldobler, Kurfess 1992)

. Other Approaches

I First-Order Logic Programs and the Core Method

. Initial Approach

. Construction of Approximating Networks

. Topological Analysis and Generalisations

. Employing Iterated Function Systems


Literature

I Hölldobler 1990: A Structured Connectionist Unification Algorithm. In: Proceedings of the AAAI National Conference on Artificial Intelligence, 587-593.

I Hölldobler, Kurfess 1992: CHCL – A Connectionist Inference System. In: Parallelization in Inference Systems, Lecture Notes in Artificial Intelligence 590, 318-342.

I Plate 1991: Holographic Reduced Representations. In: Proceedings of the International Joint Conference on Artificial Intelligence, 30-35.

I Pollack 1988: Recursive auto-associative memory: Devising compositional distributed representations. In: Proceedings of the Annual Conference of the Cognitive Science Society, 33-39.


Reflexive Reasoning

I Humans are capable of performing a wide variety of cognitive tasks with extreme ease and efficiency.

I For traditional AI systems, the same problems turn out to be intractable.

I Human common-sense knowledge: about 10⁸ rules and facts.

I Wanted: “reflexive” decisions within sublinear time.

I Shastri, Ajjanagadde 1993: SHRUTI.


SHRUTI – Knowledge Base

I Finite set of constants C, finite set of variables V.

I Rules:

. (∀X1 . . . Xm) (p1(. . .) ∧ . . . ∧ pn(. . .) → (∃Y1 . . . Yk) p(. . .)).

. p, pi, 1 ≤ i ≤ n, are multi-place predicate symbols.

. Arguments of the pi: variables from {X1, . . . , Xm} ⊆ V.

. Arguments of p are from {X1, . . . , Xm} ∪ {Y1, . . . , Yk} ∪ C.

. {Y1, . . . , Yk} ⊆ V.

. {X1, . . . , Xm} ∩ {Y1, . . . , Yk} = ∅.

I Facts and queries (goals):

. (∃Z1 . . . Zl) q(. . .).

. Multi-place predicate symbol q.

. Arguments of q are from {Z1, . . . , Zl} ∪ C.

. {Z1, . . . , Zl} ⊆ V.


Further Restrictions

I Restrictions to rules, facts, and goals:

. No function symbols except constants.

. Only universally bound variables may occur as arguments in the conditions of a rule.

. All variables occurring in a fact or goal occur only once and are existentially bound.

. An existentially quantified variable is only unified with variables.

. A variable which occurs more than once in the conditions of a rule must occur in the conclusion of the rule and must be bound when the conclusion is unified with a goal.

. A rule is used only a fixed number of times.

Incompleteness.


SHRUTI – Example

I Rules P = { owns(Y, Z) ← gives(X, Y, Z),
owns(X, Y) ← buys(X, Y),
can-sell(X, Y) ← owns(X, Y),
gives(john, josephine, book),
(∃X) buys(john, X),
owns(josephine, ball) }.

I Queries:

can-sell(josephine, book) ↝ yes
(∃X) owns(josephine, X) ↝ yes, {X ↦ book}, {X ↦ ball}
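Both queries can be cross-checked with an ordinary forward-chaining computation over the same rules (a plain sketch of the logic, not of SHRUTI's spreading-activation machinery; the Skolem constant "sk" standing for the existential in buys(john, X) is my encoding choice):

```python
# Facts of the example knowledge base.
facts = {("gives", "john", "josephine", "book"),
         ("buys", "john", "sk"),          # (∃X) buys(john, X), skolemised
         ("owns", "josephine", "ball")}

def step(fs):
    new = set(fs)
    for fact in fs:
        if fact[0] == "gives":            # owns(Y, Z) ← gives(X, Y, Z)
            new.add(("owns", fact[2], fact[3]))
        if fact[0] == "buys":             # owns(X, Y) ← buys(X, Y)
            new.add(("owns", fact[1], fact[2]))
        if fact[0] == "owns":             # can-sell(X, Y) ← owns(X, Y)
            new.add(("can-sell", fact[1], fact[2]))
    return new

while step(facts) != facts:               # forward chain to a fixed point
    facts = step(facts)

# Query 1: can-sell(josephine, book)?
assert ("can-sell", "josephine", "book") in facts
# Query 2: (∃X) owns(josephine, X)? Two answers, X = book and X = ball.
assert {t[2] for t in facts if t[:2] == ("owns", "josephine")} == {"book", "ball"}
```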



SHRUTI : The Network

[Figure: SHRUTI network for the example knowledge base. The predicates gives, buys, owns and can-sell are represented by clusters with collector, enabler and argument nodes; josephine, john, ball and book are constant nodes; the links encode the rules, e.g. from the arguments of can-sell back to those of owns, and from owns to gives and buys, with annotations such as “from john”, “from jos.”, “from book” marking the propagated bindings.]

ICCLInternational Center for Computational Logic

Algebra, Logic and Formal Methods in Computer Science77

Page 218: €¦ · Introduction & Motivation: Connectionist Systems IWell-suited to learn, to adapt to new environments, to degrade gracefully etc. IMany successful applications. IApproximate

Solving the Variable Binding Problem

[Diagram: phase-coded solution of the variable binding problem. Each predicate (can-sell, owns, gives, buys) has collector and enabler nodes plus one node per argument (gives: 1st, 2nd and 3rd arg; the others: 1st and 2nd arg); the entity nodes book, john, ball and josephine fire in the phases of the argument nodes they are bound to.]


SHRUTI – Remarks

I Answers are derived in time proportional to depth of search space.

I Number of units as well as of connections is linear in the size of the knowledge base.

I Extensions:

. compute answer substitutions

. allow a fixed number of copies of rules

. allow multiple literals in the body of a rule

. build in a taxonomy

I ROBIN (Lange, Dyer 1989): signatures instead of phases.

I Biological plausibility.

I Trading expressiveness for time and size.

I Logical reconstruction by Beringer, Hölldobler 1993:

. Reflexive reasoning is reasoning by reduction.


Literature

I Beringer, Hölldobler 1993: On the Adequateness of the Connection Method. In: Proceedings of the AAAI National Conference on Artificial Intelligence, 9-14.

I Shastri, Ajjanagadde 1993: From Associations to Systematic Reasoning: A Connectionist Representation of Rules, Variables and Dynamic Bindings using Temporal Synchrony. Behavioural and Brain Sciences 16, 417-494.

I Lange, Dyer 1989: High-Level Inferencing in a Connectionist Network. Connection Science 1, 181-217.


First-Order Logic Programs and the Core Method

I Initial Approach

I Construction of Approximating Networks

I Topological Analysis and Generalisations

I Employing Iterated Function Systems


Logic Programs

I A logic program P over a first-order language L is a finite set of clauses

A← L1 ∧ . . . ∧ Ln,

where A is an atom, Li are literals and n ≥ 0.

I BL is the set of all ground atoms over L, called the Herbrand base.

I A Herbrand interpretation I is a mapping BL → {⊤,⊥}.

I 2BL is the set of all Herbrand interpretations.

I ground(P) is the set of all ground instances of clauses in P.

I Immediate consequence operator TP : 2BL → 2BL:

TP(I) = {A | there is a clause A ← L1 ∧ . . . ∧ Ln ∈ ground(P) such that I |= L1 ∧ . . . ∧ Ln}.

I I is a supported model iff TP(I) = I.
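For ground programs the operator TP and its fixed points can be sketched directly; the clause encoding (triples of head, positive body atoms, negated body atoms) and the names tp, program are illustrative, not from the slides:

```python
# A minimal sketch of the immediate consequence operator T_P for ground
# programs. An interpretation is the set of atoms mapped to true.

def tp(clauses, interp):
    """One application of T_P: collect heads of clauses whose body is
    true in interp (positive atoms in, negated atoms out)."""
    return {head
            for head, pos, neg in clauses
            if all(a in interp for a in pos)
            and all(a not in interp for a in neg)}

# Example program: p <- ,  q <- p,  r <- q & ~s
program = [("p", [], []), ("q", ["p"], []), ("r", ["q"], ["s"])]

# Iterate T_P from the empty interpretation until a fixed point,
# i.e. until T_P(I) = I: a supported model.
interp = set()
while tp(program, interp) != interp:
    interp = tp(program, interp)

print(sorted(interp))  # ['p', 'q', 'r']
```

The fixed-point test in the loop is exactly the supported-model condition of the last bullet.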


The Initial Approach

I Hölldobler, Kalinke, Störr 1999: Can the core method be extended to first-order logic programs?

I Problem

. Given a logic program P over a first order language L together with TP : 2BL → 2BL.

. BL is countably infinite.

. The method used to relate propositional logic and connectionist systems is not applicable.

. How can the gap between the discrete, symbolic setting of logic and the continuous, real-valued setting of connectionist networks be closed?


The Goal

I Find R : 2BL → R and fP : R → R such that the following conditions hold.

. TP(I) = I′ implies fP(R(I)) = R(I′), and fP(x) = x′ implies TP(R−1(x)) = R−1(x′).

fP is a sound and complete encoding of TP .

. TP is a contraction on 2BL iff fP is a contraction on R.

The contraction property and fixed points are preserved.

. fP is continuous on R.

A connectionist network approximating fP is known to exist.


Acyclic Logic Programs

I Let P be a program over a first order language L.

I A level mapping forP is a function l : BL → N.

. We define l(¬A) = l(A).

I We can associate a metric dL with L and l. Let I, J ∈ 2BL:

dL(I, J) = 0 if I = J, and dL(I, J) = 2−n if n is the smallest level on which I and J differ.

I Proposition (Fitting 1994) (2BL, dL) is a complete metric space.

I P is said to be acyclic wrt a level mapping l, if for every A ← L1 ∧ . . . ∧ Ln ∈ ground(P) we find l(A) > l(Li) for all i.

I Proposition Let P be an acyclic logic program wrt l and dL the metric associated with L and l, then TP is a contraction on (2BL, dL).
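In practice dL can only be evaluated on finitely many levels; a minimal sketch, assuming a bijective level mapping so that levels 1, 2, 3, . . . stand in for atoms and an interpretation is the set of levels whose atom is true (the names d_l and depth are ours):

```python
# Sketch of the metric d_L on interpretations, truncated at `depth`
# levels (the full Herbrand base is countably infinite).

def d_l(i, j, depth=64):
    """Return 2^-n for the smallest level n on which i and j differ,
    or 0 if they agree on all levels up to `depth`."""
    for n in range(1, depth + 1):
        if (n in i) != (n in j):
            return 2.0 ** -n
    return 0.0

i = {1, 2, 3}      # atoms of level 1..3 true
j = {1, 3}         # first difference at level 2
print(d_l(i, j))   # 0.25, i.e. 2^-2
```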


Mapping Interpretations to Real Numbers

I Let D = {r ∈ R | r = Σ∞i=1 ai 4−i, where ai ∈ {0, 1} for all i}.

I Let l be a bijective level mapping.

I {⊤,⊥} can be identified with {0, 1}.

I The set of all mappings BL → {⊤,⊥} can be identified with the set of all mappings N → {0, 1}.

I Let IL be the set of all mappings from BL to {0, 1}.

I Let R : IL → D be defined as R(I) = Σ∞i=1 I(l−1(i)) 4−i.

I Proposition R is a bijection.

We have a sound and complete encoding of interpretations.
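The encoding R, truncated to finitely many base-4 digits, can be sketched as follows (interpretations are again represented by the sets of levels of their true atoms; all names are illustrative):

```python
# Sketch of the encoding R: an interpretation is mapped to the real
# number sum_i I(l^-1(i)) * 4^-i, truncated at `depth` terms.

def encode(interp, depth=32):
    return sum(4.0 ** -i for i in range(1, depth + 1) if i in interp)

# All atoms true: R(B_L) = sum_i 4^-i = 1/3.
everything = set(range(1, 33))
print(abs(encode(everything) - 1 / 3) < 1e-9)   # True

# Restricting digits to {0, 1} keeps interpretations apart: the atom of
# level 1 alone outweighs all atoms of levels >= 2 combined.
print(encode({1}) > encode(set(range(2, 33))))  # True: 1/4 > 1/3 - 1/4
```

Using base 4 with digits {0, 1} (rather than base 2) is what makes R injective on infinite digit sequences, hence a bijection onto D.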


Mapping Immediate Consequence Operators to Functions on the Reals

I We define fP : D → D : r ↦ R(TP(R−1(r))).

[Commutative diagram: TP maps I to I′; fP maps r to r′; the vertical arrows R map I to r and I′ to r′.]

We have a sound and complete encoding of TP .

I Proposition Let P be an acyclic program wrt a bijective level mapping. fP is a contraction on D.

Contraction property and fixed points are preserved.


Approximating Continuous Functions

I Corollary fP is continuous.

I Recall Funahashi’s theorem:

. Every continuous function f : K → R can be uniformly approximated by input-output functions of 3-layer feed forward networks.

I Theorem fP can be uniformly approximated by input-output functions of 3-layer feed forward networks.

. TP can be approximated as well by applying R−1.

Connectionist network approximating immediate consequence operator exists.


An Example

I Consider P = {q(0), q(s(X)) ← q(X)} and let l(q(sn(0))) = n + 1.

. P is acyclic wrt l, l is bijective, R(BL) = 1/3.

. fP(R(I)) = 4−l(q(0)) + Σq(X)∈I 4−l(q(s(X))) = 4−l(q(0)) + Σq(X)∈I 4−(l(q(X))+1) = (1 + R(I))/4.

I Approximation of fP to accuracy ε yields

f(x) ∈ [(1 + x)/4 − ε, (1 + x)/4 + ε].

I Starting with some x and iterating f yields in the limit a value

r ∈ [(1 − 4ε)/3, (1 + 4ε)/3].

I Applying R−1 to r we find

q(sn(0)) ∈ R−1(r) if n < −log4 ε − 1.
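The convergence claimed above can be checked numerically for the exact fP, i.e. ε = 0 (the helper names f and decode are ours):

```python
# Sketch of the example: f_P(x) = (1 + x)/4 for P = {q(0), q(s(X)) <- q(X)}.
# Iterating from any start converges to the fixed point 1/3 = R(B_L);
# reading off the base-4 digits of the limit recovers the true atoms.

def f(x):
    return (1 + x) / 4

x = 0.0
for _ in range(40):
    x = f(x)
print(abs(x - 1 / 3) < 1e-12)   # True: the limit is 1/3

def decode(r, depth=20):
    """Read off base-4 digits of r; digit 1 at position i means the atom
    of level i, here q(s^(i-1)(0)), is true."""
    levels = []
    for i in range(1, depth + 1):
        r *= 4
        digit = int(r + 1e-9)   # guard against float round-off
        if digit:
            levels.append(i)
        r -= digit
    return levels

print(decode(x)[:5])            # [1, 2, 3, 4, 5]: q(0), q(s(0)), ...
```

With ε > 0 the decoded prefix is only reliable up to level −log4 ε − 1, which is exactly the bound on the slide.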


Approximation of Interpretations

I Let P be a logic program over a first order language L and l a level mapping.

I An interpretation I approximates an interpretation J to a degree n ∈ N if for all atoms A ∈ BL with l(A) < n we find I(A) = ⊤ iff J(A) = ⊤.

. I approximates J to a degree n iff dL(I, J) ≤ 2−n.


Approximation of Supported Models

I Given an acyclic logic program P with bijective level mapping.

I Let TP be the immediate consequence operator associated with P and MP the least supported model of P.

I We can approximate TP by a 3-layer feed forward network.

I We can turn this network into a recurrent one.

Does the recurrent network approximate the supported model ofP?

I Theorem For an arbitrary m ∈ N there exists a recurrent network with sigmoidal activation functions for the hidden layer units and linear activation functions for the input and output layer units computing a function fP such that there exists an n0 ∈ N such that for all n ≥ n0 and for all x ∈ [−1, 1] we find

dL(R−1(fP^n(x)), MP) ≤ 2−m.


First Order Core Method – Extensions

I Detailed study of (topological) continuity of semantic operators, Hitzler, Seda 2003 and Hitzler, Hölldobler, Seda 2004:

. many-valued logics,

. larger class of logic programs,

. other approximation theorems.

I A core method for reflexive reasoning, Hölldobler, Kalinke, Wunderlich 2000.

I The graph of fP is an attractor of some iterated function system, Bader 2003 and Bader, Hitzler 2004:

. representation theorems,

. fractal interpolation,

. core with units computing radial basis functions.

I Finitely determined sets of truth values Lane, Seda 2004.


Constructive Approaches: Fibring Artificial Neural Networks

I Fibring function Φ associated with neuron i maps some weights w of a network to new values depending on w and the input x of i (Garcez, Gabbay 2004).

[Diagram: the fibring function Φ of a neuron with input x rewrites the weights w of an embedded network, which produces the output y.]

I Idea: approximate fP by computing the values of atoms with level n = 1, 2, . . ..

[Diagram: units Clause1, Clause2, . . . evaluate the clauses for atoms of level n; a fibring function Φ increments the level counter n (+1), so that the network computes TP(I) from I level by level.]

I Works well for acyclic logic programs with bijective level mapping (Bader, Garcez, Hitzler 2004).



Constructive Approaches: Approximating Piecewise Constant Functions

I Consider graph of fP .

I Approximate fP up to a given level l.

I Construct core computing piecewise constant function.

. Step activation functions.

. Sigmoidal activation functions.

. Radial basis functions.

[Plots: two approximations of the step activation function, by sigmoidal units and by radial basis units, on the interval [−3, 3].]

I Bader, Hitzler, Witzel 2005.
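How a single steep sigmoidal unit emulates a step activation, and hence how a core of such units can compute a piecewise constant function, can be sketched as follows (the threshold and steepness values are illustrative):

```python
import math

# A step activation switching at an illustrative threshold.
def step(x, threshold=0.25):
    return 1.0 if x >= threshold else 0.0

# A steep sigmoid centred on the same threshold approximates the step.
def sigmoid_unit(x, threshold=0.25, steepness=200.0):
    return 1 / (1 + math.exp(-steepness * (x - threshold)))

# Away from the threshold the two activations agree to high accuracy.
for x in (0.1, 0.2, 0.3, 0.4):
    print(abs(step(x) - sigmoid_unit(x)) < 1e-3)  # True each time
```

The same replacement, with radial basis functions instead of sigmoids, underlies the third variant on the slide.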


Open Problems

I How can first order terms be represented and manipulated in a connectionist system? Pollack 1990, Hölldobler 1990, Plate 1994.

I Can the mapping R be learned? Gust, Kühnberger 2004.

I How can first order rules be extracted from a connectionist system?

I How can multiple instances of first order rules be represented in a connectionist system? Shastri 1990.

I What does a theory for the integration of logic and connectionist systems look like?

I Can such a theory be applied in real domains, outperforming conventional approaches?

I How does the core method relate to model-based reasoning approachesin cognitive science (e.g. Barnden 1989, Johnson-Laird, Byrne 1993)?
