Artificial Intelligence – Lecture 12
● Neural networks
● Unsupervised training
● Decision trees
● The exam
● Example questions
Nearest neighbour classification
• Can't we just find the "closest" example datapoint and use its classification?
• E.g. compute the distance in vector space and find the closest point in P + N (the sets of positive and negative examples)
• Corresponds to splitting the vector space into a Voronoi diagram
• How many examples are required?
• At least enough to cover each "quadrant" of the vector space
• N dimensions of data require at least 2^N examples
• In practice this approach often fails due to lack of data (a minimal sketch follows below)
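A minimal 1-nearest-neighbour sketch in Python (the function and variable names are illustrative, not from the lecture):

import numpy as np

def nearest_neighbour(query, examples, labels):
    # Classify `query` by the label of the closest example (Euclidean distance)
    distances = np.linalg.norm(examples - query, axis=1)
    return labels[np.argmin(distances)]

# Toy data: two positive and two negative examples in 2D
examples = np.array([[1.0, 1.0], [0.8, 0.9], [-1.0, -1.0], [-0.9, -0.8]])
labels = np.array([1, 1, -1, -1])
print(nearest_neighbour(np.array([0.7, 1.1]), examples, labels))  # -> 1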
Preprocessing data
• Speech recognition
• 44 kHz → 44,000 intensity values per second
• A vector per second → far too much data
• Word classes cannot be separated by a hyperplane in this raw representation
• Frequency analysis
• Fourier transform or wavelets
• Sample fewer frequencies → less data (a sketch follows below)
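A minimal sketch of Fourier-based preprocessing with NumPy (the frame length and number of kept frequency bins are illustrative assumptions):

import numpy as np

def frequency_features(signal, frame_len=1024, n_freqs=32):
    # Split the raw signal into frames and keep the magnitudes of the
    # lowest n_freqs frequency bins of each frame
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame
    return spectrum[:, :n_freqs]                    # much smaller feature vectors

# One second of a fake 44 kHz signal -> a compact feature matrix
signal = np.random.randn(44_000)
print(frequency_features(signal).shape)  # (42, 32) instead of 44000 raw values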
Principal component analysis (PCA)
• A statistical method that finds the dominant directions of variance in a set of data points
• Vector A: the direction of highest variance
• Vector B: the direction of second-highest variance, orthogonal to A
• ...
• Reduces a high-dimensional problem to a lower dimension – often down to 2D
• Fewer examples are then required for classification (a sketch follows below)
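A minimal PCA sketch via eigen-decomposition of the covariance matrix in NumPy (names and the choice of 2 components are illustrative):

import numpy as np

def pca(data, n_components=2):
    # Project `data` (one row per example) onto its n_components
    # highest-variance directions
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return centered @ top

# 100 ten-dimensional points reduced to 2D
data = np.random.randn(100, 10)
print(pca(data).shape)  # (100, 2)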
Neural networks
• Based on how the (human) brain works
• ~10^10 neurons x ~10^4 axon connections each
• Electrical charge accumulates in each neuron
• Sometimes a neuron fires an impulse
• The impulse propagates through the axons to ~10^4 other neurons
[Figure: a neuron with its axon]
Perceptron
• A simple model of a single neuron
• if sum(xi * wi) > 0: return 1, else return -1
• Equivalent to linear classification
• Simple training method (a sketch follows below)
• wi += a * (y - f(x)) * xi
• where y is the correct answer and f(x) is the current answer
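A minimal perceptron sketch in Python following the update rule above (the learning rate and epoch count are illustrative assumptions):

import numpy as np

def predict(weights, x):
    # Threshold activation: +1 if the weighted sum (incl. bias) is positive
    return 1 if weights @ np.append(1.0, x) > 0 else -1

def train(examples, targets, a=0.1, epochs=20):
    # Perceptron rule: wi += a * (y - f(x)) * xi, with the bias as w0
    weights = np.zeros(examples.shape[1] + 1)
    for _ in range(epochs):
        for x, y in zip(examples, targets):
            error = y - predict(weights, x)
            weights += a * error * np.append(1.0, x)
    return weights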
Training the perceptron
• Gradient descent
• Each training example takes a small step "down" the error curve
• If xi is positive and y - f(x) is positive, make f(x) higher by increasing wi
• wi += a * (y - f(x)) * xi
Does the training work?
• Guaranteed to succeed if:
• the examples are linearly separable
• the learning rate a is sufficiently small
• Training rule uses gradient descent
Example
• Two inputs + bias, 3 examples
• 3 weights: w0 (bias), w1, w2 (a worked run follows the table)

x1     x2     y
1      1      1
0.5    0.5    1
-1     -1     -1
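A minimal run of the perceptron rule on these three examples (the minus signs in the third example are reconstructed, and the learning rate is an illustrative assumption):

import numpy as np

# The three training examples: inputs x1, x2 and target y
X = np.array([[1.0, 1.0], [0.5, 0.5], [-1.0, -1.0]])
Y = np.array([1, 1, -1])

w = np.zeros(3)   # w0 (bias), w1, w2
a = 0.5           # learning rate
for _ in range(5):
    for x, y in zip(X, Y):
        xb = np.append(1.0, x)        # prepend the constant bias input
        f = 1 if w @ xb > 0 else -1   # perceptron output
        w += a * (y - f) * xb         # wi += a * (y - f(x)) * xi
print(w)  # -> [1. 1. 1.]: the separating line w0 + w1*x1 + w2*x2 = 0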
Limitations of perceptrons
• A perceptron is equivalent to linear classification
• It only works if a hyperplane separating the data exists
• It cannot handle e.g. the XOR problem
Multilayer neural networks
• Multiple layers of simple perceptrons
• Layer N takes its inputs from layer N-1
• Handles complex classification spaces
• Training?
• Backpropagation
• Genetic algorithms
• ...
Backwards propagating neural networks
• How can we train a multilayer neural network? Gradient descent!
• We need a continuous derivative from each perceptron
• The threshold step function is not continuous!
• Solution: use a sigmoid function instead
Backwards propagating neural networks
• How do we compute the gradient "down" for a multilayer perceptron?
• If we know the error at each node and its input, we can use the same update rule as before
• Output layer:
• the error is known from the training data
• Hidden layer node H:
• error_H = Σ_O error_O * w_H,O (summed over the output-layer nodes O that H feeds)
• The error is propagated backwards (a sketch follows below)
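A minimal backpropagation sketch in NumPy on the XOR problem (the network size, learning rate, and iteration count are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR is not linearly separable, so a single perceptron fails,
# but a small 2-4-1 network trained with backpropagation learns it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input  -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
a = 1.0                                          # learning rate

for _ in range(10_000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: output error, then propagate it to the hidden layer
    d_out = (out - Y) * out * (1 - out)   # error times sigmoid derivative
    d_h = (d_out @ W2.T) * h * (1 - h)    # error_H = sum_O error_O * w_H,O
    # Gradient descent step for both layers
    W2 -= a * (h.T @ d_out)
    b2 -= a * d_out.sum(axis=0)
    W1 -= a * (X.T @ d_h)
    b1 -= a * d_h.sum(axis=0)

print(out.round(2).ravel())  # approaches [0. 1. 1. 0.]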
Example of a feedforward multilayer network
• ALVINN
• Drives a vehicle autonomously
• At 70 mph (!)
Unsupervised learning
• Learn to distinguish between data points automatically,
• without any positive/negative examples
• Self-organising maps (Kohonen networks)
• Start from a map of random vector values
• For each input vector V:
• find the vector W in the current map with minimal |W - V|
• update W and its neighbours so they move closer to V
• A generalization of PCA
• Shifts complex data to lower dimensions
• Classify in the lower-dimensional space (a sketch follows below)
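A minimal self-organising map update step in NumPy (the grid size, learning rate, and neighbourhood radius are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
grid = rng.random((10, 10, 3))   # 10x10 map of random 3-D vectors

def som_update(grid, v, lr=0.5, radius=2):
    # Move the best-matching unit and its grid neighbours closer to input v
    dist = np.linalg.norm(grid - v, axis=2)               # |W - V| for every unit
    bi, bj = np.unravel_index(dist.argmin(), dist.shape)  # best-matching unit
    for i in range(max(0, bi - radius), min(grid.shape[0], bi + radius + 1)):
        for j in range(max(0, bj - radius), min(grid.shape[1], bj + radius + 1)):
            grid[i, j] += lr * (v - grid[i, j])           # pull toward v
    return grid

for _ in range(1000):   # train on random 3-D inputs
    grid = som_update(grid, rng.random(3))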
When can we consider neural networks?
• The input is high-dimensional, discrete or real-valued
• The output is discrete or real-valued
• Output is a vector of values
• Possibly noisy data
• Target function form is unknown
• Human readability of result unimportant!
• Examples
• Speech phoneme recognition
• Image classification
• ...
Can we trust our learned model?
• To know how well our model works, we need to test it
• Testing on the training data → ~100% accuracy... hardly realistic!
• Split the data into:
• Training set: used to compute the SVM, NN, etc.
• Validation set: used to evaluate how well the system works (a sketch follows below)
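A minimal train/validation split in NumPy (the 80/20 ratio and names are illustrative assumptions):

import numpy as np

def train_val_split(data, labels, val_fraction=0.2, seed=0):
    # Shuffle the examples, then hold out a fraction for validation
    idx = np.random.default_rng(seed).permutation(len(data))
    n_val = int(len(data) * val_fraction)
    val, train = idx[:n_val], idx[n_val:]
    return data[train], labels[train], data[val], labels[val]

data, labels = np.random.randn(100, 5), np.random.randint(0, 2, 100)
X_train, y_train, X_val, y_val = train_val_split(data, labels)
print(len(X_train), len(X_val))  # 80 20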
About the Exam
• Exam on 10/1
• REMEMBER TO REGISTER FOR IT!
• Questions are in English
• You may use a Swedish-English dictionary
• Answer in Swedish and/or English
• You may use a calculator
• (but most of the math "should" be doable in your head)
• 3 hours. 4-6 exercises, 40 points in total
• DT2001: 20 points for grade 3, 30 points for grade 4 and 35 points for grade 5
• DT2006: 20 points for G, 32.5 points for VG
Example questions
• Explaining concepts
• For each sub-exercise, write 1-3 paragraphs of explanation. Give both an explanation and an example (not only an example).
Example questions
• Algorithmic questions
• Pseudocode is fine
• The plausibility of the idea and concept is what is graded
• NB: you may use notation from Python, Lisp, C, C++ as you like
Example questions
• Computational questions
• You must show the computation and the steps you took!
• No computations – no points!
• A correct answer does not guarantee full points
• a "lucky guess" gives no points
• An incorrect answer does not necessarily rule out full points
• a lapsus (a slip of the pen) can sometimes be ignored
• Is the answer reasonable? (Is the question reasonable?)
Finally
• Double-check that you have read and answered all questions
• Try to answer every question even if you do not know the answer exactly. Write as much as you do know, but do not make wild guesses.
• If you get stuck on one question, move on to the next
• You can come back to the tricky question when you have finished everything else!
• Every incorrect statement in an answer removes points
• If your answer contains the correct answer together with incorrect parts, you will not get full (or possibly even any) points!