Introduction to Neural Networks

Gianluca Pollastri, Head of Lab
School of Computer Science and Informatics and Complex and Adaptive Systems Labs
University College Dublin
[email protected]
2012. 9. 3.


Credits

Geoffrey Hinton, University of Toronto. I borrowed some of his slides for the “Neural Networks” and “Computation in Neural Networks” courses.

Paolo Frasconi, University of Florence. This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).


Recurrent Neural Networks (RNN)

One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science.

Problem: it isn’t easy to represent time with Feedforward Neural Nets: usually time is represented with space.

Attempt to design networks with memory.


RNNs

The idea is having discrete time steps, and considering the hidden layer at time t-1 as an input at time t.

This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.
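A minimal sketch of this idea in Python. The tanh hidden units and the weight names (W_in, W_rec, bias) are assumptions for illustration, not taken from the slides. Each call to step() is an ordinary feed-forward pass; the only "memory" is that the hidden vector from time t-1 is handed back in as an extra input at time t.

```python
import math

def step(x_prev, i_t, W_in, W_rec, bias):
    """One feed-forward pass: hidden state at time t from the hidden
    state at time t-1 and the input at time t (tanh units assumed)."""
    x_t = []
    for h in range(len(bias)):
        a = bias[h]
        a += sum(W_in[h][k] * i_t[k] for k in range(len(i_t)))
        a += sum(W_rec[h][j] * x_prev[j] for j in range(len(x_prev)))
        x_t.append(math.tanh(a))
    return x_t

def run(inputs, W_in, W_rec, bias):
    """Apply step() over discrete time steps, starting from a zero state."""
    x = [0.0] * len(bias)
    states = []
    for i_t in inputs:
        x = step(x, i_t, W_in, W_rec, bias)
        states.append(x)
    return states

# Toy example (hypothetical weights): 1 input unit, 2 hidden units.
W_in = [[0.5], [-0.3]]
W_rec = [[0.1, 0.2], [0.0, -0.4]]
bias = [0.0, 0.1]
states = run([[1.0], [0.0], [0.0]], W_in, W_rec, bias)
```

In the toy run the input is zero at the second step, yet the hidden state there differs from what a memoryless network would produce, because it still carries information from the first step.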


[Figure: recurrent network diagram with nodes It (input), Xt (hidden), Ot (output) and a delay element d on Xt.]

d = delay element


BPTT

BackPropagation Through Time.

If Ot is the output at time t, It the input at time t, and Xt the memory (hidden) at time t, we can model the dependencies as follows:

Xt = f( Xt-1 , It )
Ot = g( Xt , It )


BPTT

We can model both f() and g() with (possibly multilayered) networks.

We can transform the recurrent network by unrolling it in time.

Backpropagation works on any DAG. An RNN becomes one once it’s unrolled.
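The DAG claim can be checked mechanically. The sketch below uses Python's standard graphlib module (Python 3.9+); the node names are illustrative only. The rolled-up graph sends X back to itself through the delay element d, so it contains a cycle; the unrolled graph has one copy of each node per time step and no cycle.

```python
from graphlib import TopologicalSorter, CycleError

def recurrent_graph():
    """Rolled-up network as node -> set of predecessors.
    X feeds the delay element d, and d feeds X: a cycle."""
    return {"X": {"I", "d"}, "d": {"X"}, "O": {"X", "I"}}

def unrolled_graph(T):
    """Same network unrolled over T time steps: X_t depends on X_{t-1}."""
    deps = {}
    for t in range(1, T + 1):
        deps[f"X{t}"] = {f"I{t}"} | ({f"X{t-1}"} if t > 1 else set())
        deps[f"O{t}"] = {f"X{t}", f"I{t}"}
    return deps

def is_dag(deps):
    """True if the dependency graph admits a topological order."""
    try:
        list(TopologicalSorter(deps).static_order())
        return True
    except CycleError:
        return False

# A topological order of the unrolled graph is a valid forward-pass order.
order = list(TopologicalSorter(unrolled_graph(3)).static_order())
```

Here is_dag(recurrent_graph()) is False, while is_dag(unrolled_graph(T)) is True for any T of at least 1. Any topological order of the unrolled graph is a valid order for the forward pass, and its reverse is a valid order for backpropagation.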


[Figure: the network unrolled in time, with inputs It-2 ... It+2, hidden states Xt-2 ... Xt+2 and outputs Ot-2 ... Ot+2.]


Gradient in BPTT

    GRADIENT(I,O,T) {
      # I=inputs, O=outputs, T=targets
      N := size(O);
      X0 := 0;
      for t := 1..N
        Xt := f( Xt-1 , It );
      for t := 1..N {
        Ot := g( Xt , It );
        g.gradient( Ot - Tt );
        δt = g.deltas( Ot - Tt );
      }
      for t := N..1 {
        f.gradient( δt );
        δt-1 += f.deltas( δt );
      }
    }
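The GRADIENT procedure can be made concrete with a small runnable sketch. It assumes scalar states, a tanh transition for f, a linear output for g that ignores It, and a squared-error loss; the parameter names w, u, v are hypothetical and not from the slides. A finite-difference check is included as a sanity test.

```python
import math

def forward(params, inputs):
    """X_t = tanh(w*X_{t-1} + u*I_t), O_t = v*X_t (assumed forms of f and g)."""
    w, u, v = params
    xs = [0.0]                                  # xs[0] is X0 = 0
    for i_t in inputs:
        xs.append(math.tanh(w * xs[-1] + u * i_t))
    outs = [v * x for x in xs[1:]]
    return xs, outs

def loss(params, inputs, targets):
    _, outs = forward(params, inputs)
    return 0.5 * sum((o - y) ** 2 for o, y in zip(outs, targets))

def bptt(params, inputs, targets):
    """Gradient of the loss w.r.t. (w, u, v) by backpropagation through time."""
    w, u, v = params
    xs, outs = forward(params, inputs)
    n = len(inputs)
    gw = gu = gv = 0.0
    delta = [0.0] * (n + 1)                     # delta[t] = dLoss/dX_t
    # Output network g: its weight gradient and the deltas sent to each X_t.
    for t in range(1, n + 1):
        err = outs[t - 1] - targets[t - 1]
        gv += err * xs[t]
        delta[t] += err * v
    # Transition network f, visited backwards in time.
    for t in range(n, 0, -1):
        da = delta[t] * (1.0 - xs[t] ** 2)      # back through tanh
        gw += da * xs[t - 1]
        gu += da * inputs[t - 1]
        delta[t - 1] += da * w                  # pass the delta to X_{t-1}
    return [gw, gu, gv]

def numeric_grad(params, inputs, targets, eps=1e-6):
    """Central finite differences, used only to check bptt()."""
    grads = []
    for k in range(len(params)):
        hi = list(params); hi[k] += eps
        lo = list(params); lo[k] -= eps
        grads.append((loss(hi, inputs, targets) - loss(lo, inputs, targets)) / (2 * eps))
    return grads

params = [0.7, -0.4, 1.3]
inputs = [1.0, 0.5, -1.0, 0.2]
targets = [0.1, -0.2, 0.3, 0.0]
analytic = bptt(params, inputs, targets)
numeric = numeric_grad(params, inputs, targets)
```

The two loops in bptt() mirror the two phases of the pseudocode: first the output network produces a delta for every time step, then the transition network is walked from the last step to the first, accumulating weight gradients and pushing each delta one step back.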


What I will talk about

• Neurons
• Multi-Layered Neural Networks:
    • Basic learning algorithm
    • Expressive power
    • Classification
• How can we *actually* train Neural Networks:
    • Speeding up training
    • Learning just right (not too little, not too much)
    • Figuring out you got it right
• Feed-back networks?
    • Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)
    • Recurrent Neural Networks
    • Bidirectional RNN
    • 2D-RNN
• Concluding remarks


Bidirectional Recurrent Neural Networks (BRNN)


BRNN

Ft = φ( Ft-1 , Ut )
Bt = β( Bt+1 , Ut )
Yt = η( Ft , Bt , Ut )

• φ(), β() and η() are realised with NNs
• φ(), β() and η() are independent from t: stationary


Inference in BRNNs

    FORWARD(U) {
      T := size(U);
      F0 := BT+1 := 0;
      for t := 1..T
        Ft = φ( Ft-1 , Ut );
      for t := T..1
        Bt = β( Bt+1 , Ut );
      for t := 1..T
        Yt = η( Ft , Bt , Ut );
      return Y;
    }
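A runnable sketch of the same three passes in Python. It assumes scalar states, tanh units for the forward and backward chains, and a linear output; the weight names in the dictionary p are hypothetical. Lists are indexed so that F[0] and B[T+1] hold the zero boundary states.

```python
import math

def brnn_forward(U, p):
    """U: list of scalar inputs; p: dict of hypothetical scalar weights."""
    T = len(U)
    F = [0.0] * (T + 2)            # F[0] is the boundary state F0 = 0
    B = [0.0] * (T + 2)            # B[T+1] is the boundary state B_{T+1} = 0
    for t in range(1, T + 1):      # forward chain, left to right
        F[t] = math.tanh(p["wf"] * F[t - 1] + p["uf"] * U[t - 1])
    for t in range(T, 0, -1):      # backward chain, right to left
        B[t] = math.tanh(p["wb"] * B[t + 1] + p["ub"] * U[t - 1])
    # Output at position t combines F_t, B_t and the local input U_t.
    Y = [p["vf"] * F[t] + p["vb"] * B[t] + p["vu"] * U[t - 1]
         for t in range(1, T + 1)]
    return F, B, Y

p = dict(wf=0.5, uf=1.0, wb=0.5, ub=1.0, vf=1.0, vb=1.0, vu=0.0)
Y_a = brnn_forward([1.0, 0.0, 0.0], p)[2]
Y_b = brnn_forward([1.0, 0.0, 2.0], p)[2]   # only the last input differs
```

Changing only the last input changes the first output, because the backward chain carries information from the end of the sequence to its start. With vb set to 0 the model reduces to a unidirectional RNN and the first output no longer sees the future.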


Learning in BRNNs

    GRADIENT(U,Ȳ) {
      # U=inputs, Ȳ=targets
      T := size(U);
      F0 := BT+1 := 0;
      for t := 1..T
        Ft = φ( Ft-1 , Ut );
      for t := T..1
        Bt = β( Bt+1 , Ut );
      for t := 1..T {
        Yt = η( Ft , Bt , Ut );
        [δFt, δBt] = η.backprop&gradient( Yt - Ȳt );
      }
      for t := T..1
        δFt-1 += φ.backprop&gradient( δFt );
      for t := 1..T
        δBt+1 += β.backprop&gradient( δBt );
    }
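A hedged, runnable sketch of this gradient computation in Python, under the same simplifying assumptions as a toy model: scalar states, tanh units in both chains, a linear output, and squared-error loss. All weight names are hypothetical. The forward-chain deltas are propagated from t = T down to 1, and the backward-chain deltas from t = 1 up to T, which is the opposite of the order in which each chain was computed.

```python
import math

NAMES = ("wf", "uf", "wb", "ub", "vf", "vb", "vu")

def forward(p, U):
    T = len(U)
    F = [0.0] * (T + 2)                         # F[0] = F0 = 0
    B = [0.0] * (T + 2)                         # B[T+1] = 0
    for t in range(1, T + 1):
        F[t] = math.tanh(p["wf"] * F[t - 1] + p["uf"] * U[t - 1])
    for t in range(T, 0, -1):
        B[t] = math.tanh(p["wb"] * B[t + 1] + p["ub"] * U[t - 1])
    Y = [p["vf"] * F[t] + p["vb"] * B[t] + p["vu"] * U[t - 1]
         for t in range(1, T + 1)]
    return F, B, Y

def loss(p, U, targets):
    Y = forward(p, U)[2]
    return 0.5 * sum((y - d) ** 2 for y, d in zip(Y, targets))

def gradient(p, U, targets):
    T = len(U)
    F, B, Y = forward(p, U)
    g = dict.fromkeys(NAMES, 0.0)
    dF = [0.0] * (T + 2)                        # dF[t] = dLoss/dF_t
    dB = [0.0] * (T + 2)                        # dB[t] = dLoss/dB_t
    # Output network: its own gradient, plus deltas for F_t and B_t.
    for t in range(1, T + 1):
        err = Y[t - 1] - targets[t - 1]         # output minus target
        g["vf"] += err * F[t]
        g["vb"] += err * B[t]
        g["vu"] += err * U[t - 1]
        dF[t] += err * p["vf"]
        dB[t] += err * p["vb"]
    # Forward chain: backpropagate from t = T down to 1.
    for t in range(T, 0, -1):
        da = dF[t] * (1.0 - F[t] ** 2)
        g["wf"] += da * F[t - 1]
        g["uf"] += da * U[t - 1]
        dF[t - 1] += da * p["wf"]
    # Backward chain: backpropagate from t = 1 up to T.
    for t in range(1, T + 1):
        da = dB[t] * (1.0 - B[t] ** 2)
        g["wb"] += da * B[t + 1]
        g["ub"] += da * U[t - 1]
        dB[t + 1] += da * p["wb"]
    return g

def numeric_gradient(p, U, targets, eps=1e-6):
    """Central finite differences, used only to check gradient()."""
    g = {}
    for k in NAMES:
        hi = dict(p); hi[k] += eps
        lo = dict(p); lo[k] -= eps
        g[k] = (loss(hi, U, targets) - loss(lo, U, targets)) / (2 * eps)
    return g

p = dict(wf=0.6, uf=-0.5, wb=0.3, ub=0.8, vf=1.1, vb=-0.7, vu=0.2)
U = [1.0, -0.5, 0.3, 0.9]
targets = [0.2, 0.0, -0.1, 0.4]
analytic = gradient(p, U, targets)
numeric = numeric_gradient(p, U, targets)
```

Comparing analytic and numeric entry by entry is a cheap way to catch indexing mistakes in either chain before moving to vector-valued states.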


2D RNNs

Pollastri & Baldi 2002, Bioinformatics
Baldi & Pollastri 2003, JMLR
