Artificial Intelligence – Lecture 12
● Neural networks
● Unsupervised training
● Decision trees
● The exam
● Example questions
Nearest neighbour classification
• Can't we just find the "closest" example datapoint and use its classification?
• E.g. compute the distance in vector space and find the closest point in P + N (the sets of positive and negative examples)
• Corresponds to splitting the vector space into a Voronoi diagram
• How many examples are required?
• At least enough to cover each "quadrant" of the vector space
• N dimensions of data require at least 2^N examples
• In practice this approach often fails due to lack of data (a minimal sketch follows below)
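A minimal 1-nearest-neighbour sketch in Python (the function and variable names are illustrative, not from the lecture):

import numpy as np

def nearest_neighbour(query, examples, labels):
    # Classify `query` by the label of the closest example (Euclidean distance)
    distances = np.linalg.norm(examples - query, axis=1)
    return labels[np.argmin(distances)]

# Toy data: two positive and two negative examples in 2D
examples = np.array([[1.0, 1.0], [0.8, 0.9], [-1.0, -1.0], [-0.9, -0.8]])
labels = np.array([1, 1, -1, -1])
print(nearest_neighbour(np.array([0.7, 1.1]), examples, labels))  # -> 1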
Preprocessing data
• Speech recognition
• 44 kHz → 44,000 intensity values per second
• A vector per second → far too much data
• Word classes cannot be separated by a hyperplane in this raw representation
• Frequency analysis
• Fourier transform or wavelets
• Sample fewer frequencies → less data (a sketch follows below)
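A minimal sketch of Fourier-based preprocessing with NumPy (the frame length and number of kept frequency bins are illustrative assumptions):

import numpy as np

def frequency_features(signal, frame_len=1024, n_freqs=32):
    # Split the raw signal into frames and keep the magnitudes of the
    # lowest n_freqs frequency bins of each frame
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame
    return spectrum[:, :n_freqs]                    # much smaller feature vectors

# One second of a fake 44 kHz signal -> a compact feature matrix
signal = np.random.randn(44_000)
print(frequency_features(signal).shape)  # (42, 32) instead of 44000 raw values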
Principal component analysis (PCA)
• A statistical method that finds the dominant directions of variance in a set of data points
• Vector A: the direction of highest variance
• Vector B: the direction of second-highest variance, orthogonal to A
• ...
• Reduces a high-dimensional problem to a lower dimension – often down to 2D
• Fewer examples are then required for classification (a sketch follows below)
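A minimal PCA sketch via eigen-decomposition of the covariance matrix in NumPy (names and the choice of 2 components are illustrative):

import numpy as np

def pca(data, n_components=2):
    # Project `data` (one row per example) onto its n_components
    # highest-variance directions
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return centered @ top

# 100 ten-dimensional points reduced to 2D
data = np.random.randn(100, 10)
print(pca(data).shape)  # (100, 2)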
Neural networks
• Based on how the (human) brain works
• ~10^10 neurons x ~10^4 axon connections each
• Electrical charge accumulates in each neuron
• Sometimes a neuron fires an impulse
• The impulse propagates through the axons to ~10^4 other neurons
[Figure: a neuron with its axon]
Perceptron
• A simple model of a single neuron
• if sum(xi * wi) > 0: return 1, else return -1
• Equivalent to linear classification
• Simple training method (a sketch follows below)
• wi += a * (y - f(x)) * xi
• where y is the correct answer and f(x) is the current answer
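A minimal perceptron sketch in Python following the update rule above (the learning rate and epoch count are illustrative assumptions):

import numpy as np

def predict(weights, x):
    # Threshold activation: +1 if the weighted sum (incl. bias) is positive
    return 1 if weights @ np.append(1.0, x) > 0 else -1

def train(examples, targets, a=0.1, epochs=20):
    # Perceptron rule: wi += a * (y - f(x)) * xi, with the bias as w0
    weights = np.zeros(examples.shape[1] + 1)
    for _ in range(epochs):
        for x, y in zip(examples, targets):
            error = y - predict(weights, x)
            weights += a * error * np.append(1.0, x)
    return weights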
Training the perceptron
• Gradient descent
• Each training example takes a small step "down" the error curve
• If xi is positive and y - f(x) is positive, make f(x) higher by increasing wi
• wi += a * (y - f(x)) * xi
Does the training work?
• Guaranteed to succeed if:
• the examples are linearly separable
• the learning rate a is sufficiently small
• Training rule uses gradient descent
Example
• Two inputs + bias, 3 examples
• 3 weights: w0 (bias), w1, w2 (a worked run follows the table)

x1     x2     y
1      1      1
0.5    0.5    1
-1     -1     -1
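A minimal run of the perceptron rule on these three examples (the minus signs in the third example are reconstructed, and the learning rate is an illustrative assumption):

import numpy as np

# The three training examples: inputs x1, x2 and target y
X = np.array([[1.0, 1.0], [0.5, 0.5], [-1.0, -1.0]])
Y = np.array([1, 1, -1])

w = np.zeros(3)   # w0 (bias), w1, w2
a = 0.5           # learning rate
for _ in range(5):
    for x, y in zip(X, Y):
        xb = np.append(1.0, x)        # prepend the constant bias input
        f = 1 if w @ xb > 0 else -1   # perceptron output
        w += a * (y - f) * xb         # wi += a * (y - f(x)) * xi
print(w)  # -> [1. 1. 1.]: the separating line w0 + w1*x1 + w2*x2 = 0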
Limitations of perceptrons
• A perceptron is equivalent to linear classification
• It only works if a hyperplane separating the data exists
• It cannot handle e.g. the XOR problem
Multilayer neural networks
• Multiple layers of simple perceptrons
• Layer N takes its inputs from layer N-1
• Handles complex classification spaces
• Training?
• Backpropagation
• Genetic algorithms
• ...
Backwards propagating neural networks
• How can we train a multilayer neural network? Gradient descent!
• We need a continuous derivative from each perceptron
• The threshold step function is not continuous!
• Solution: use a sigmoid function instead
Backwards propagating neural networks
• How do we compute the gradient "down" for a multilayer perceptron?
• If we know the error at each node and its input, we can use the same update rule as before
• Output layer:
• the error is known from the training data
• Hidden layer node H:
• error_H = Σ_O error_O * w_H,O (summed over the output-layer nodes O that H feeds)
• The error is propagated backwards (a sketch follows below)
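A minimal backpropagation sketch in NumPy on the XOR problem (the network size, learning rate, and iteration count are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR is not linearly separable, so a single perceptron fails,
# but a small 2-4-1 network trained with backpropagation learns it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input  -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
a = 1.0                                          # learning rate

for _ in range(10_000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: output error, then propagate it to the hidden layer
    d_out = (out - Y) * out * (1 - out)   # error times sigmoid derivative
    d_h = (d_out @ W2.T) * h * (1 - h)    # error_H = sum_O error_O * w_H,O
    # Gradient descent step for both layers
    W2 -= a * (h.T @ d_out)
    b2 -= a * d_out.sum(axis=0)
    W1 -= a * (X.T @ d_h)
    b1 -= a * d_h.sum(axis=0)

print(out.round(2).ravel())  # approaches [0. 1. 1. 0.]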
Example of a feedforward multilayer network
• ALVINN
• Drives a vehicle autonomously
• At 70 mph (!)
Unsupervised learning
• Learn to distinguish between data points automatically,
• without any positive/negative examples
• Self-organising maps (Kohonen networks)
• Start from a map of random vector values
• For each input vector V:
• find the vector W in the current map with minimal |W - V|
• update W and its neighbours so they move closer to V
• A generalization of PCA
• Shifts complex data to lower dimensions
• Classify in the lower-dimensional space (a sketch follows below)
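A minimal self-organising map update step in NumPy (the grid size, learning rate, and neighbourhood radius are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
grid = rng.random((10, 10, 3))   # 10x10 map of random 3-D vectors

def som_update(grid, v, lr=0.5, radius=2):
    # Move the best-matching unit and its grid neighbours closer to input v
    dist = np.linalg.norm(grid - v, axis=2)               # |W - V| for every unit
    bi, bj = np.unravel_index(dist.argmin(), dist.shape)  # best-matching unit
    for i in range(max(0, bi - radius), min(grid.shape[0], bi + radius + 1)):
        for j in range(max(0, bj - radius), min(grid.shape[1], bj + radius + 1)):
            grid[i, j] += lr * (v - grid[i, j])           # pull toward v
    return grid

for _ in range(1000):   # train on random 3-D inputs
    grid = som_update(grid, rng.random(3))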
When can we consider neural networks?
• The input is high-dimensional, discrete or real-valued
• The output is discrete or real-valued
• Output is a vector of values
• Possibly noisy data
• Target function form is unknown
• Human readability of result unimportant!
• Examples
• Speech phoneme recognition
• Image classification
• ...
Can we trust our learned model?
• To know how well our model works, we need to test it
• Testing on the training data → ~100% accuracy... hardly realistic!
• Split the data into:
• Training set: used to compute the SVM, NN, etc.
• Validation set: used to evaluate how well the system works (a sketch follows below)
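A minimal train/validation split in NumPy (the 80/20 ratio and names are illustrative assumptions):

import numpy as np

def train_val_split(data, labels, val_fraction=0.2, seed=0):
    # Shuffle the examples, then hold out a fraction for validation
    idx = np.random.default_rng(seed).permutation(len(data))
    n_val = int(len(data) * val_fraction)
    val, train = idx[:n_val], idx[n_val:]
    return data[train], labels[train], data[val], labels[val]

data, labels = np.random.randn(100, 5), np.random.randint(0, 2, 100)
X_train, y_train, X_val, y_val = train_val_split(data, labels)
print(len(X_train), len(X_val))  # 80 20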
About the Exam
• Exam on 10/1
• REMEMBER TO REGISTER FOR IT!
• Questions are in English
• You may use a Swedish-English dictionary
• Answer in Swedish and/or English
• You may use a calculator
• (but most of the math "should" be doable in your head)
• 3 hours. 4-6 exercises, 40 points in total
• DT2001: 20 points for grade 3, 30 points for grade 4 and 35 points for grade 5
• DT2006: 20 points for G, 32.5 points for VG
Example questions
• Explaining concepts
• For each sub-exercise, write 1-3 paragraphs of explanation. Give both an explanation and an example (not only an example).
Example questions
• Algorithmic questions
• Pseudocode is fine
• The plausibility of the idea and concept is what is graded
• NB: you may use notation from Python, Lisp, C, C++ as you like
Example questions
• Computational questions
• You must show the computation and the steps you took!
• No computations – no points!
• A correct answer does not guarantee full points
• a "lucky guess" gives no points
• An incorrect answer does not necessarily rule out full points
• a lapsus (a slip of the pen) can sometimes be ignored
• Is the answer reasonable? (Is the question reasonable?)
Finally
• Double-check that you have read and answered all questions
• Try to answer every question even if you do not know the answer exactly. Write as much as you do know, but do not make wild guesses.
• If you get stuck on one question, move on to the next
• You can come back to the tricky question when you have finished everything else!
• Every incorrect statement in an answer removes points
• If your answer contains the correct answer together with incorrect parts, you will not get full (or possibly even any) points!