Page 1:

Extreme Learning Machines

Tony Oakden

ANU AI Masters Project (early Presentation)

4/8/2014

Page 2:

This presentation covers:

• Revision of Neural Network theory

• Introduction to Extreme Learning Machines (ELM)

• Early Results

• Brief description of code

• Discussion of possible future work

Page 3:

Neural Network Revision

• In a single-layer perceptron, inputs are connected directly to the output nodes via weights

• Training is carried out using least squares or a similar method (a minimal sketch follows this list)

• Pros:
  • Simple and quick to train

• Cons:
  • Can only learn to classify linearly separable problems
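A minimal Matlab sketch of that least-squares step (the variable names and toy data here are illustrative, not from the project code):

% Least-squares fit of single-layer perceptron weights on a toy
% linearly separable problem (AND); one sample per column
P  = [0 0 1 1; 0 1 0 1];      % inputs
Pb = [P; ones(1, 4)];         % append a constant bias input
T  = [-1 -1 -1 1];            % targets
W  = T * pinv(Pb);            % least squares: W*Pb approximates T
Y  = sign(W * Pb);            % thresholded outputs, here equal to T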

Page 4:

Hidden Layer

• To classify non-linear data we must add an additional layer of weights between input and output (a hidden layer)

• When combined with a suitable activation function (sigmoidal, for example) the network can classify non-linear functions

• To train the hidden layer we propagate errors at the output back through the network. This is the back-propagation algorithm (a one-step sketch follows this list)

• Pros:
  • Can theoretically classify any data set

• Cons:
  • Training the network can be very slow
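A one-step sketch of that update rule (a minimal illustration with sigmoid hidden units, a linear output, and squared error; the sizes, data, and learning rate eta are arbitrary, and bias terms are omitted for brevity):

% One back-propagation step for a single hidden layer
P  = rand(2, 10);                        % 10 two-dimensional samples
T  = rand(1, 10);                        % targets
W1 = rand(5, 2) * 2 - 1;                 % hidden-layer weights (5 nodes)
W2 = rand(1, 5) * 2 - 1;                 % output weights
eta = 0.1;                               % learning rate
Z  = 1 ./ (1 + exp(-(W1 * P)));          % hidden activations (sigmoid)
E  = W2 * Z - T;                         % output error (linear output)
dW2 = E * Z';                            % gradient for the output weights
dW1 = ((W2' * E) .* Z .* (1 - Z)) * P';  % error propagated back to W1
W2 = W2 - eta * dW2;                     % gradient-descent updates;
W1 = W1 - eta * dW1;                     % repeat until the error is small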

Page 5:

Extreme Learning Machines

• Provide a way to train networks to classify non-linear problems without back propagation

• These networks still use a hidden layer, but the weights and biases in the hidden layer are set to random values

• We only train the output nodes.

• Training is achieved using the least-squares algorithm

• Pros:
  • Very fast training time

• Cons:
  • Less accurate

http://www.ntu.edu.sg/home/egbhuang/
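In matrix form (matching the Matlab code later in this deck), with hidden-layer output matrix H and training targets T, the output weights β are the least-squares solution β = H†T, where H† is the Moore-Penrose pseudoinverse of H.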

Page 6:

Wait, we use random weights? Huh?

• Sounds too good to be true, so let's look at some results:

• http://fastml.com/extreme-learning-machines/

Page 7:

Two Spirals Data Set

• The first set of experiments was carried out with the twin-spiral data set (a possible generator is sketched below).

• This was used because:
  • It is a difficult set to classify

  • It allows easy visualization of results
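One common way to generate such a set (a sketch using the classic two-spirals parametric form; the data used in these experiments may have been generated differently):

% Twin-spiral data in the classic parametric form
n = 97;                                % points per spiral
i = (0:n-1)';
r = 6.5 * (104 - i) / 104;             % radius shrinks along the spiral
a = pi * i / 16;                       % angle grows along the spiral
spiral1 = [r .* sin(a), r .* cos(a)];  % first spiral
spiral2 = -spiral1;                    % second spiral, rotated 180 degrees
P = [spiral1; spiral2]';               % inputs, one sample per column
T = [ones(1, n), -ones(1, n)];         % class labels +1 / -1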

Page 8:

Neural Network trained with back propagation

• 20 nodes in hidden layer

• Training time is 6.4 seconds

• Training accuracy is 100%

(Testing was performed with training data)

Page 9:

Extreme Learning Machine

• 20 nodes in hidden layer

• Training time is 0.02 seconds

• Training accuracy is 69%

Not great…

But…

Page 10:

Extreme Learning continued

• 200 nodes in hidden layer

• Training time is 0.066 seconds

• Training accuracy is 97%

If the number of nodes in the hidden layer is significantly increased, accuracy improves dramatically, yet training still remains much faster than for a traditional network.

Page 11:

Accuracy plotted against hidden-layer size / 20

[Chart: classification accuracy (y-axis, 0 to 1) against hidden-layer size in multiples of 20 nodes (x-axis, 1 to 20)]

Page 12:

Matlab Code

• http://www.ntu.edu.sg/home/egbhuang/reference.html

% create random weights for hidden layer

InputWeight=rand(NumberofHiddenNeurons,NumberofInputNeurons)*2-1;

BiasofHiddenNeurons=rand(NumberofHiddenNeurons,1);

….

tempH=InputWeight*trainData.P;

ind=ones(1,NumberofTrainingData);

BiasMatrix=BiasofHiddenNeurons(:,ind); % Extend the bias matrix BiasofHiddenNeurons to match the dimension of H

tempH=tempH+BiasMatrix;

% Calculate hidden neuron output matrix H

% we can use a variety of activation functions here but we’ll stick to sigmoidal for now…

H = 1 ./ (1 + exp(-tempH));

OutputWeight=pinv(H') * trainData.T'; % pinv gives the Moore-Penrose pseudoinverse

http://www.mathworks.com.au/help/matlab/ref/pinv.html
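A sketch of how the trained network could then be evaluated (assuming a testData.P that holds test inputs in the same column-per-sample layout as trainData.P):

% Evaluate the trained ELM on new inputs
tempH = InputWeight * testData.P;
tempH = tempH + BiasofHiddenNeurons(:, ones(1, size(testData.P, 2)));
H = 1 ./ (1 + exp(-tempH));            % same sigmoid used in training
Y = (H' * OutputWeight)';              % network outputs for the test data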

Page 13:

Conclusion

• As can be seen, training times for ELM are very fast.

• In these early experiments ELM was roughly 100 times faster than traditional back propagation for similar accuracy

• Accuracy is slightly lower: on other data sets back propagation achieved 85% where ELM achieved 80%, but for many applications this is still good enough

• Increasing the number of nodes in the hidden layer improves accuracy at the expense of a small increase in training time

Page 14:

Further research

• Use of ELM with a GA for feature selection (this week's work)

• Experiment with different data sets

• Perform more rigorous analysis of results

• So far we have only looked at binary classifiers. How does the ELM algorithm cope with multi-class classification?

• Can we improve the accuracy of ELM in some way, maybe by combining results with cascade networks?

• What about continuous data sources?

• The second part of the project is cascade networks; can these be combined with ELM in some way?

Page 15:

References

Guang-Bin Huang. "An Insight into Extreme Learning Machines: Random Neurons, Random Features and Kernels." Springer Science+Business Media New York, 2014.