Introduction to Neural Networks

Gianluca Pollastri, Head of Lab
School of Computer Science and Informatics and
Complex and Adaptive Systems Labs
University College Dublin
[email protected]
Credits

Geoffrey Hinton, University of Toronto. Borrowed some of his slides for his "Neural Networks" and "Computation in Neural Networks" courses.

Paolo Frasconi, University of Florence. This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).
Recurrent Neural Networks (RNN)

One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science.

Problem: it isn't easy to represent time with Feedforward Neural Nets: usually time is represented with space.

Attempt to design networks with memory.
RNNs

The idea is having discrete time steps, and considering the hidden layer at time t-1 as an input at time t.

This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.
[Figure: a recurrent network with input It, hidden/memory Xt and output Ot; d = delay element, feeding Xt-1 back in as an input at time t.]
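As a minimal sketch of this idea in numpy (all sizes, weight names and the tanh nonlinearity are illustrative choices, not from the slides): the hidden state computed at t-1 is simply fed back as an extra input at time t.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; names and dimensions are illustrative, not from the slides.
n_in, n_hid, n_out, T = 3, 4, 2, 5
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input I_t -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # memory X_{t-1} -> hidden
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden X_t -> output O_t

I = rng.normal(size=(T, n_in))   # input sequence I_1..I_T
X = np.zeros(n_hid)              # memory, X_0 = 0
O = []
for t in range(T):
    X = np.tanh(W_xh @ I[t] + W_hh @ X)  # X_t depends on X_{t-1} and I_t
    O.append(W_hy @ X)                   # O_t read off the current memory
O = np.array(O)
print(O.shape)  # (5, 2)
```

The delay element of the figure corresponds to reusing the variable X across loop iterations: no cycle is ever materialised.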
BPTT

BackPropagation Through Time.

If Ot is the output at time t, It the input at time t, and Xt the memory (hidden) at time t, we can model the dependencies as follows:

Xt = f( Xt-1 , It )
Ot = g( Xt , It )
BPTT

We can model both f() and g() with (possibly multilayered) networks.

We can transform the recurrent network by unrolling it in time.

Backpropagation works on any DAG. An RNN becomes one once it's unrolled.
[Figure: the network unrolled in time: inputs It-2..It+2 feed hidden states Xt-2..Xt+2, each Xt also receiving Xt-1, and each Xt producing an output Ot.]
gradient in BPTT

GRADIENT(I,O,T) {
  # I=inputs, O=outputs, T=targets
  T := size(O);
  X0 := 0;
  for t := 1..T
    Xt := f( Xt-1 , It );
  for t := 1..T {
    Ot := g( Xt , It );
    g.gradient( Ot - Tt );
    δt = g.deltas( Ot - Tt );
  }
  for t := T..1 {
    f.gradient( δt );
    δt-1 += f.deltas( δt );
  }
}
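The pseudocode above can be instantiated concretely. Below is a minimal numpy sketch (toy sizes, tanh units and a linear scalar readout are my assumptions, not the slides'): a forward sweep stores all Xt, then the deltas flow backwards, each δt collecting the output error at time t plus the delta arriving from t+1.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, T = 2, 3, 4
W_xh = rng.normal(scale=0.5, size=(n_hid, n_in))
W_hh = rng.normal(scale=0.5, size=(n_hid, n_hid))
w_y  = rng.normal(scale=0.5, size=n_hid)   # scalar output O_t = w_y . X_t

I = rng.normal(size=(T, n_in))   # inputs I_1..I_T
Tgt = rng.normal(size=T)         # targets T_1..T_T

def loss_and_grad(W_hh):
    # forward: X_t = tanh(W_xh I_t + W_hh X_{t-1}), with X_0 = 0
    X = np.zeros((T + 1, n_hid))
    for t in range(1, T + 1):
        X[t] = np.tanh(W_xh @ I[t-1] + W_hh @ X[t-1])
    O = X[1:] @ w_y
    loss = 0.5 * np.sum((O - Tgt) ** 2)
    # backward through time: delta_t gets the local output error
    # plus whatever flows back from t+1
    gW = np.zeros_like(W_hh)
    delta = np.zeros(n_hid)
    for t in range(T, 0, -1):
        delta = delta + (O[t-1] - Tgt[t-1]) * w_y   # error injected at time t
        da = delta * (1 - X[t] ** 2)                # through the tanh
        gW += np.outer(da, X[t-1])                  # gradient w.r.t. W_hh
        delta = W_hh.T @ da                         # delta passed to t-1
    return loss, gW

loss, gW = loss_and_grad(W_hh)
```

A finite-difference check on one entry of W_hh confirms the accumulated gradient matches the unrolled-DAG backpropagation.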
What I will talk about

- Neurons
- Multi-Layered Neural Networks:
  - Basic learning algorithm
  - Expressive power
  - Classification
- How can we *actually* train Neural Networks:
  - Speeding up training
  - Learning just right (not too little, not too much)
  - Figuring out you got it right
- Feed-back networks?
  - Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)
  - Recurrent Neural Networks
  - Bidirectional RNN
  - 2D-RNN
- Concluding remarks
Bidirectional Recurrent Neural Networks (BRNN)
BRNN

Ft = φ( Ft-1 , Ut )
Bt = β( Bt+1 , Ut )
Yt = η( Ft , Bt , Ut )

• φ(), β() and η() are realised with NNs
• φ(), β() and η() are independent of t: stationary
Inference in BRNNs

FORWARD(U) {
  T := size(U);
  F0 := BT+1 := 0;
  for t := 1..T
    Ft = φ( Ft-1 , Ut );
  for t := T..1
    Bt = β( Bt+1 , Ut );
  for t := 1..T
    Yt = η( Ft , Bt , Ut );
  return Y;
}
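The two sweeps can be sketched directly in numpy (a toy instantiation: sizes, tanh hidden units and a linear output are my assumptions). The forward chain F is filled left to right, the backward chain B right to left, and each Yt then sees Ft, Bt and Ut together.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_f, n_b, n_out, T = 3, 4, 4, 2, 6

Wf_u = rng.normal(scale=0.1, size=(n_f, n_in))   # phi: U_t part
Wf_f = rng.normal(scale=0.1, size=(n_f, n_f))    # phi: F_{t-1} part
Wb_u = rng.normal(scale=0.1, size=(n_b, n_in))   # beta: U_t part
Wb_b = rng.normal(scale=0.1, size=(n_b, n_b))    # beta: B_{t+1} part
Wy = rng.normal(scale=0.1, size=(n_out, n_f + n_b + n_in))  # eta

U = rng.normal(size=(T, n_in))
F = np.zeros((T + 2, n_f))   # F[0] is the F_0 = 0 boundary
B = np.zeros((T + 2, n_b))   # B[T+1] = 0 boundary
for t in range(1, T + 1):    # forward sweep: F_t = phi(F_{t-1}, U_t)
    F[t] = np.tanh(Wf_f @ F[t-1] + Wf_u @ U[t-1])
for t in range(T, 0, -1):    # backward sweep: B_t = beta(B_{t+1}, U_t)
    B[t] = np.tanh(Wb_b @ B[t+1] + Wb_u @ U[t-1])
Y = np.array([Wy @ np.concatenate([F[t], B[t], U[t-1]])
              for t in range(1, T + 1)])
print(Y.shape)  # (6, 2)
```

Note that each Yt depends on the entire input sequence: on U1..Ut through Ft, and on Ut..UT through Bt.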
Learning in BRNNs

GRADIENT(U,Y) {
  # U=inputs, Y=targets
  T := size(U);
  F0 := BT+1 := 0;
  for t := 1..T
    Ft = φ( Ft-1 , Ut );
  for t := T..1
    Bt = β( Bt+1 , Ut );
  for t := 1..T {
    Ŷt = η( Ft , Bt , Ut );
    [δFt, δBt] = η.backprop&gradient( Ŷt - Yt );
  }
  for t := T..1
    δFt-1 += φ.backprop&gradient( δFt );
  for t := 1..T
    δBt+1 += β.backprop&gradient( δBt );
}
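For the forward chain this is the same accumulation as BPTT: η injects a δFt at every position, and the deltas then propagate back along the F recurrence. A minimal numpy sketch of the gradient w.r.t. the φ weights (toy sizes, tanh units and a linear scalar η are my assumptions; the β chain is handled symmetrically and omitted here):

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_h, T = 2, 3, 5
Wf = rng.normal(scale=0.4, size=(n_h, n_h + n_in))  # phi: [F_{t-1}; U_t] -> F_t
Wb = rng.normal(scale=0.4, size=(n_h, n_h + n_in))  # beta: [B_{t+1}; U_t] -> B_t
wy = rng.normal(scale=0.4, size=2 * n_h + n_in)     # eta: scalar Y_t
U = rng.normal(size=(T, n_in))
Tgt = rng.normal(size=T)                            # targets

def loss_grad_Wf(Wf):
    F = np.zeros((T + 2, n_h))
    B = np.zeros((T + 2, n_h))
    for t in range(1, T + 1):            # forward sweep
        F[t] = np.tanh(Wf @ np.concatenate([F[t-1], U[t-1]]))
    for t in range(T, 0, -1):            # backward sweep
        B[t] = np.tanh(Wb @ np.concatenate([B[t+1], U[t-1]]))
    Y = np.array([wy @ np.concatenate([F[t], B[t], U[t-1]])
                  for t in range(1, T + 1)])
    loss = 0.5 * np.sum((Y - Tgt) ** 2)
    # delta_F_t injected by eta at every t, then propagated along the F chain
    gWf = np.zeros_like(Wf)
    dF = np.zeros(n_h)
    for t in range(T, 0, -1):
        dF = dF + (Y[t-1] - Tgt[t-1]) * wy[:n_h]   # dL/dF_t from eta
        da = dF * (1 - F[t] ** 2)                  # through the tanh
        gWf += np.outer(da, np.concatenate([F[t-1], U[t-1]]))
        dF = Wf[:, :n_h].T @ da                    # pass to F_{t-1}
    return loss, gWf

loss, gWf = loss_grad_Wf(Wf)
```

Because φ, β and η are stationary, each weight matrix receives one gradient contribution per time step, summed over the sequence.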
2D RNNs

Pollastri & Baldi 2002, Bioinformatics
Baldi & Pollastri 2003, JMLR
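The slides here only show figures and the citations. As a rough, hedged illustration of the idea in those papers (this layout follows my reading of Baldi & Pollastri 2003, and every size and weight below is a toy assumption): the one-dimensional forward/backward chains of the BRNN generalise to four hidden planes on a 2D grid, each filled by a sweep from one of the four corners, so that the output at (i,j) can depend on the whole input grid.

```python
import numpy as np

rng = np.random.default_rng(4)
N, n_in, n_h = 5, 2, 3   # N x N grid (e.g. a residue-residue contact map)
Wdir = rng.normal(scale=0.2, size=(4, n_h, n_h + n_h + n_in))  # one net per sweep
wy = rng.normal(scale=0.2, size=4 * n_h + n_in)                # scalar output net
I = rng.normal(size=(N, N, n_in))

# Four hidden planes; the state at (i,j) in each plane depends on the two
# already-computed neighbours in that plane's sweep direction.
H = np.zeros((4, N + 2, N + 2, n_h))          # zero-padded boundaries
steps = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # (di, dj) per corner sweep
for d, (di, dj) in enumerate(steps):
    rows = range(1, N + 1) if di == 1 else range(N, 0, -1)
    cols = range(1, N + 1) if dj == 1 else range(N, 0, -1)
    for i in rows:
        for j in cols:
            ctx = np.concatenate([H[d, i - di, j], H[d, i, j - dj],
                                  I[i - 1, j - 1]])
            H[d, i, j] = np.tanh(Wdir[d] @ ctx)

# Output at (i,j) sees the local input and all four hidden planes.
Y = np.array([[wy @ np.concatenate([H[0, i, j], H[1, i, j],
                                    H[2, i, j], H[3, i, j], I[i-1, j-1]])
               for j in range(1, N + 1)] for i in range(1, N + 1)])
print(Y.shape)  # (5, 5)
```

As in the 1D case, unrolling over the grid yields a DAG, so backpropagation applies, and the four transition networks are stationary across positions.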