8/2/2019 Neural Assignment 2
NATIONAL UNIVERSITY OF SINGAPORE
NEURAL NETWORKS ASSIGNMENT
BY
SURYA VIGNESH
A0091282H
Q1)a)

%%Steepest descent gradient method
clc;
iteration = 1;                    %% variable iteration is initialized
x(iteration) = 0;
y(iteration) = 0;
f(iteration) = (1-x(iteration))^2 + 100*(y(iteration)-x(iteration)^2)^2;
eta = 0.001;
while f(iteration) > 1e-6         %% beginning of while loop
    fx(iteration) = 2*x(iteration)-2+400*(x(iteration)^3-x(iteration)*y(iteration));
    fy(iteration) = 200*(y(iteration)-x(iteration)^2);
    iteration = iteration + 1;
    x(iteration) = x(iteration-1) - eta*fx(iteration-1);
    y(iteration) = y(iteration-1) - eta*fy(iteration-1);
    f(iteration) = (1-x(iteration))^2 + 100*(y(iteration)-x(iteration)^2)^2;
end
display(iteration);
iteration = iteration - 1;
i = 1:1:iteration;
subplot(2,2,1)
plot(x(i),y(i));                  %% trajectory is plotted
Number of iterations = 14298
f = 9.9933e-007, which is close to 0.
When the learning rate is 0.1, f diverges to infinity.
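The convergence and divergence behaviour described above can be reproduced outside MATLAB. The following Python/NumPy sketch mirrors the steepest-descent loop: the function, gradient, starting point (0, 0), and both learning rates come from the MATLAB code; the iteration cap is an added assumption.

```python
import numpy as np

np.seterr(all='ignore')  # let the divergent run overflow to inf instead of raising

def rosenbrock(x, y):
    """f(x, y) = (1 - x)^2 + 100*(y - x^2)^2, minimised at (1, 1)."""
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def grad(x, y):
    """Analytic gradient, matching the fx/fy expressions in the MATLAB code."""
    return (2 * x - 2 + 400 * (x ** 3 - x * y),
            200 * (y - x ** 2))

def steepest_descent(eta, max_iter=200000, tol=1e-6):
    """Run steepest descent from (0, 0); return (iterations used, final f)."""
    x, y = np.float64(0.0), np.float64(0.0)
    for i in range(max_iter):
        f = rosenbrock(x, y)
        if f < tol or not np.isfinite(f):
            return i, f
        gx, gy = grad(x, y)
        x, y = x - eta * gx, y - eta * gy
    return max_iter, rosenbrock(x, y)

iters, f_final = steepest_descent(eta=0.001)  # converges in roughly 14,000 steps
iters_d, f_div = steepest_descent(eta=0.1)    # diverges: f becomes non-finite
print(iters, f_final, f_div)
```

With eta = 0.001 the iterate count is on the order of the 14298 reported above; with eta = 0.1 the iterates blow up within a handful of steps.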
[Output plot omitted.]
Q1)b)
%%Newton method
clc;
iteration = 1;                    %% variables are initialized
x(iteration) = 0;
y(iteration) = 0;
f(iteration) = (1-x(iteration))^2 + 100*(y(iteration)-x(iteration)^2)^2;
while f(iteration) > 1e-8
    fx(iteration) = 2*x(iteration)-2+400*(x(iteration)^3-x(iteration)*y(iteration));
    fy(iteration) = 200*(y(iteration)-x(iteration)^2);
    H{iteration} = [1200*x(iteration)^2-400*y(iteration)+2 -400*x(iteration); -400*x(iteration) 200];
    iteration = iteration + 1;
    tmp = [x(iteration-1);y(iteration-1)] - inv(H{iteration-1})*[fx(iteration-1);fy(iteration-1)];
    x(iteration) = tmp(1);
    y(iteration) = tmp(2);
    f(iteration) = (1-x(iteration))^2 + 100*(y(iteration)-x(iteration)^2)^2;
end                               %% end of while loop
display(iteration);
i = 1:1:iteration;
subplot(2,2,3)
plot(x(i),y(i));                  %% X-Y trajectory is plotted
xlabel('X');
ylabel('Y');
ylim([-0.2 1.5]);
xlim([0 2]);
subplot(2,2,4)
plot(i,f(i));
xlabel('Iteration');
ylabel('f(x,y)');
ylim([-1 101]);
xlim([0 4]);

Number of iterations = 3
f = 0, which is the global minimum.
Newton's method is much faster, but it requires the inverse of the Hessian matrix, which is computationally intensive when the dimension is high.
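For comparison, here is a Python/NumPy sketch of the same Newton iteration. The gradient and Hessian expressions match the MATLAB code; solving the linear system H d = g (instead of explicitly forming inv(H)) is a standard, numerically preferable substitution.

```python
import numpy as np

def newton_rosenbrock(tol=1e-8, max_iter=50):
    """Newton's method on the Rosenbrock function, starting from (0, 0)."""
    v = np.zeros(2)
    f = (1 - v[0]) ** 2 + 100 * (v[1] - v[0] ** 2) ** 2
    for i in range(max_iter):
        if f < tol:
            return i, f
        x, y = v
        g = np.array([2 * x - 2 + 400 * (x ** 3 - x * y),
                      200 * (y - x ** 2)])
        H = np.array([[1200 * x ** 2 - 400 * y + 2, -400 * x],
                      [-400 * x, 200.0]])
        v = v - np.linalg.solve(H, g)   # Newton step: solve H d = g
        x, y = v
        f = (1 - x) ** 2 + 100 * (y - x ** 2) ** 2
    return max_iter, f

steps, f_min = newton_rosenbrock()
print(steps, f_min)   # reaches the global minimum f = 0 in a few steps
```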
Q2) a) Sequential mode of learning:
%Matlab code for sequential learning mode
clear all;
%% sampling 41 points in the range of [-1,1]
x = -1:0.05:1;
%% generating training data, the desired outputs
y = 1.2*sin(pi*x) - cos(2.4*pi*x);
%% specify the structure and learning algorithm for MLP
net = newff(minmax(x),[1,1],{'tansig','purelin'},'trainlm');
net.trainparam.show = 1000;
net.trainparam.lr = 1;
epoch_s = 10;                 % number of epochs for sequential training
net.trainparam.epochs = 1;    % number of epochs for batch training
net.trainparam.goal = 1e-4;
%% Train the MLP
%[net,tr] = train(net,x,y);
for i = 1 : epoch_s
    index = randperm(length(x));                  % shuffle the input data every epoch
    for j = 1 : length(x)
        net = train(net,x(index(j)),y(index(j))); % perform sequential learning
    end
end
%% Test the MLP; net_output is the output of the MLP, ytest is the desired output
xtest = -1:0.01:1;
ytest = 1.2*sin(pi*xtest) - cos(2.4*pi*xtest);
net_output = sim(net,xtest);
%% Plot out the test results
title('Sequential learning mode, number of hidden neurons = 1');
plot(xtest,ytest,'b+');
hold on;
plot(xtest,net_output,'r*');
hold off
With 1 hidden neuron: With 2 hidden neurons:
Underfitting Underfitting
With 3 hidden neurons: With 4 hidden neurons:
Underfitting Underfitting
With 5 hidden neurons: With 6 hidden neurons:
Underfitting Underfitting
With 7 hidden neurons: With 8 hidden neurons:
Underfitting Underfitting
With 9 hidden neurons: With 10 hidden neurons:
Underfitting Proper fitting
With 50 hidden neurons:
Overfitting
When x = -1.5 and 1.5, the desired outputs are 0.89 and -1.51 respectively; however, the
outputs of the neural networks are:
x\neurons 1 2 3 4 5 6 7 8 9 10 50
1.5 -0.307 -0.3531 0.3379 -0.0644 1.1069 -1.2221 -0.6775 0.9121 0.0079 0.2117 -0.0400
-1.5 -1.258 -2.0589 -1.6231 -1.2101 -1.5609 0.3879 -0.4078 -0.2666 -0.1926 -0.6347 -0.4409
CONCLUSION:
It can be concluded that the MLP cannot make correct predictions outside the training range.
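The desired values quoted above (0.89 at x = -1.5 and -1.51 at x = 1.5) follow directly from the target function used throughout Q2; a quick check in Python:

```python
import numpy as np

def target(x):
    """Training target from Q2: y = 1.2*sin(pi*x) - cos(2.4*pi*x)."""
    return 1.2 * np.sin(np.pi * x) - np.cos(2.4 * np.pi * x)

# Both test points lie outside the training range [-1, 1]
print(round(float(target(-1.5)), 2), round(float(target(1.5)), 2))  # 0.89 -1.51
```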
b) Batch mode learning with trainlm training
%Matlab code for batch mode learning
clear all;
%% sampling 41 points in the range of [-1,1]
x = -1:0.05:1;
%% generating the desired training outputs
y = 1.2*sin(pi*x) - cos(2.4*pi*x);
%% specify the structure and learning algorithm for MLP
net = newff(minmax(x),[1,1],{'tansig','purelin'},'trainlm');
net.trainparam.show = 2000;
net.trainparam.lr = 0.01;
net.trainparam.epochs = 10000;
net.trainparam.goal = 1e-4;
%% Train the MLP
[net,tr] = train(net,x,y);
%% Test the MLP; net_output is the output of the MLP, ytest is the desired output
xtest = -1:0.01:1;
ytest = 1.2*sin(pi*xtest) - cos(2.4*pi*xtest);
net_output = sim(net,xtest);
%% Plot out the test results
title('Batch learning mode, number of hidden neurons = 1');
plot(xtest,ytest,'b+');
hold on;
plot(xtest,net_output,'r-.');
hold off
With 1 hidden neuron: With 2 hidden neurons:
Underfitting Underfitting
With 3 hidden neurons: With 4 hidden neurons:
Underfitting Proper fitting
With 5 hidden neurons: With 6 hidden neurons:
Underfitting Proper fitting
With 7 hidden neurons: With 8 hidden neurons:
Proper fitting Proper fitting
With 9 hidden neurons: With 10 hidden neurons:
Proper fitting Proper fitting
With 50 hidden neurons:
Overfitting
When x = -1.5 and 1.5, the desired outputs are 0.89 and -1.51 respectively; however, the
outputs of the neural networks are:
x\neurons 1 2 3 4 5 6 7 8 9 10 50
1.5 0.8364 -0.4631 -0.3048 -0.5356 0.8312 -0.4394 -0.0756 -0.1517 0.0821 -0.1261 -0.8719
-1.5 -0.8726 -0.8917 -0.8723 3.8711 -0.3020 2.1893 0.3489 0.9576 0.2958 0.4195 -0.9632
CONCLUSION:
It can be concluded that the MLP cannot make correct predictions outside the training range.
c) Batch mode learning with trainbr training
%Matlab code for batch mode learning
clear all;
%% sampling 41 points in the range of [-1,1]
x = -1:0.05:1;
%% generating the desired training outputs
y = 1.2*sin(pi*x) - cos(2.4*pi*x);
%% specify the structure and learning algorithm for MLP
net = newff(minmax(x),[1,1],{'tansig','purelin'},'trainbr');
net.trainparam.show = 2000;
net.trainparam.lr = 0.01;
net.trainparam.epochs = 10000;
net.trainparam.goal = 1e-4;
%% Train the MLP
[net,tr] = train(net,x,y);
%% Test the MLP; net_output is the output of the MLP, ytest is the desired output
xtest = -1:0.01:1;
ytest = 1.2*sin(pi*xtest) - cos(2.4*pi*xtest);
net_output = sim(net,xtest);
%% Plot out the test results
title('Batch learning mode with trainbr, number of hidden neurons = 1');
plot(xtest,ytest,'b+');
hold on;
plot(xtest,net_output,'r-.');
hold off
With 1 hidden neuron: With 2 hidden neurons:
Underfitting Underfitting
With 3 hidden neurons: With 4 hidden neurons:
Proper fitting Proper fitting
With 5 hidden neurons: With 6 hidden neurons:
Proper fitting Proper fitting
With 7 hidden neurons: With 8 hidden neurons:
Proper fitting Proper fitting
With 9 hidden neurons: With 10 hidden neurons:
Proper fitting Proper fitting
With 50 hidden neurons:
Proper fitting
When x = -1.5 and 1.5, the desired outputs are 0.89 and -1.51 respectively; however, the
outputs of the neural networks are:
x\neurons 1 2 3 4 5 6 7 8 9 10 50
1.5 0.7709 -0.7777 -2.3939 -0.5571 -0.5059 1.9951 1.9372 1.2764 0.9614 0.5799 0.0388
-1.5 -1.1340 -0.8978 2.2594 3.5415 2.7612 4.283 2.7395 2.4442 2.5512 1.8424 -0.4493
CONCLUSION:
It can be concluded that the MLP cannot make correct predictions outside the training range.
Q3)
General Implementation Considerations:
1. Preprocessing of Input Vectors:
Dimensionality reduction (optional): principal component analysis (PCA)
Normalization: variance normalization (recommended);(x-m)/d
m is the mean and d is the standard deviation.
Other types of methods to rescale input data into [0,1] or [-1,1]
2. The number of output neurons corresponds to the number of classes, but for a two-class
problem the number of outputs can be either one or two.
3. How many hidden layers? Usually one hidden layer is sufficient.
4. The activation function for the hidden neurons is tansig. The activation function for
the output neuron is preferably logsig, since this is a pattern-classification problem.
5. For an MLP, choose the proper number of hidden neurons by:
singular value decomposition; trial and error; or growing and pruning.
6. Possible learning method: traingd, traingdx, traincgf (recommended), trainlm
7. Robustness issue: for an MLP, due to the random initial weights, it is more reasonable to
run the program several times and average the final results.

Issues of normalization: variance normalization,
(x - m)/d,
where m is the mean and d is the standard deviation.
1) When performing variance normalization, the mean and standard deviation should
be computed over all samples rather than over all variables of one single vector.
Therefore, for different pixels of the image (different variables of the input vector),
the normalization parameters (means and standard deviations) are different.
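This per-pixel convention can be illustrated with a small NumPy sketch. The 4x6 random matrix is a made-up stand-in for the real image data; mapstd in the MATLAB code below plays the same role (up to the sample vs. population standard-deviation convention).

```python
import numpy as np

# Toy stand-in for the image matrix: rows are pixels (features), columns are
# images (samples), matching the MATLAB convention p = [train_fem' train_mal'].
rng = np.random.default_rng(0)
p = rng.uniform(0, 255, size=(4, 6))   # 4 pixels, 6 images

# Variance normalization: mean and std are computed per pixel ACROSS the
# samples (axis=1), not across the pixels of a single image.
m = p.mean(axis=1, keepdims=True)
d = p.std(axis=1, keepdims=True)
pn = (p - m) / d

# Each pixel now has zero mean and unit standard deviation across the samples
print(np.allclose(pn.mean(axis=1), 0), np.allclose(pn.std(axis=1), 1))
```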
2) Other normalization methods can also be applied; they give different results.

a) Face recognition using a single layer perceptron
%%Matlab code for single layer perceptron for face recognition
clear all;
for i = 1:15
    fem_pic = strcat('test_im/f',num2str(i),'.png');
    mal_pic = strcat('test_im/m',num2str(i),'.png');
    test_im_fem = imread(fem_pic);
    test_im_mal = imread(mal_pic);
    test_fem(i,:) = test_im_fem(:);
    test_mal(i,:) = test_im_mal(:);
end;
p_test = double([test_fem' test_mal']);
for i = 1:50
    fem_pic = strcat('train_im/f',num2str(i+15),'.png');
    mal_pic = strcat('train_im/m',num2str(i+15),'.png');
    train_im_fem = imread(fem_pic);
    train_im_mal = imread(mal_pic);
    train_fem(i,:) = train_im_fem(:);
    train_mal(i,:) = train_im_mal(:);
end;
p = double([train_fem' train_mal']);
%% interleave female and male images so the two classes alternate
p = p(:, reshape([1:50; 51:100], 1, []));
t(1:2:100) = 0;    % odd samples: female
t(2:2:100) = 1;    % even samples: male
%t(1:50)=0; t(51:100)=1;
net = newp(p,t);
net.trainParam.epochs = 1000;
net.trainParam.lr = 1;
%net.inputweights{1,1}.initFcn = 'rands';
%net.biases{1}.initFcn = 'rands';
net.trainParam.goal = 1e-6;
[net,tr] = train(net,p,t);
t_training = sim(net,p);
t_test = sim(net,p_test);
training_success = 0;
testing_success = 0;
for i = 1:100    % compare the perceptron output with the target label
    if (t(i) == 0) && (t_training(i) == 0)
        training_success = training_success + 1;
    elseif (t(i) == 1) && (t_training(i) == 1)
        training_success = training_success + 1;
    end
end
for i = 1:30     % first 15 test images are female, last 15 are male
    if (i <= 15) && (t_test(i) == 0)
        testing_success = testing_success + 1;
    elseif (i > 15) && (t_test(i) == 1)
        testing_success = testing_success + 1;
    end
end
display(training_success);
display(testing_success);
b) Multilayer Perceptron with batch learning
%%Matlab code for multilayer perceptron for face recognition
clear all;
for i = 1:15
    fem_pic = strcat('test_im/f',num2str(i),'.png');
    mal_pic = strcat('test_im/m',num2str(i),'.png');
    test_im_fem = imread(fem_pic);
    test_im_mal = imread(mal_pic);
    test_fem(i,:) = test_im_fem(:);
    test_mal(i,:) = test_im_mal(:);
end;
p_test = double([test_fem' test_mal']);
for i = 1:50
    fem_pic = strcat('train_im/f',num2str(i+15),'.png');
    mal_pic = strcat('train_im/m',num2str(i+15),'.png');
    train_im_fem = imread(fem_pic);
    train_im_mal = imread(mal_pic);
    train_fem(i,:) = train_im_fem(:);
    train_mal(i,:) = train_im_mal(:);
end;
p = double([train_fem' train_mal']);
%% interleave female and male images so the two classes alternate
p = p(:, reshape([1:50; 51:100], 1, []));
t(1:2:100) = 0.2;    % odd samples: female
t(2:2:100) = 0.8;    % even samples: male
net = newff(p,t,[1,1],{'logsig','purelin'},'trainlm');
net.trainParam.epochs = 100;
net.trainParam.lr = 1;
net.trainParam.max_fail = 20;
net.trainParam.goal = 1e-6;
net = train(net,p,t);
t_training = sim(net,p);
t_test = sim(net,p_test);
training_success = 0;
testing_success = 0;
for i = 1:100    % threshold the network output at 0.5 and compare with the target
    if (t(i) < 0.5) && (t_training(i) < 0.5)
        training_success = training_success + 1;
    elseif (t(i) > 0.5) && (t_training(i) > 0.5)
        training_success = training_success + 1;
    end
end
for i = 1:30     % first 15 test images are female, last 15 are male
    if (i <= 15) && (t_test(i) < 0.5)
        testing_success = testing_success + 1;
    elseif (i > 15) && (t_test(i) > 0.5)
        testing_success = testing_success + 1;
    end
end
display(training_success);
display(testing_success);
No. of neurons\Attempt 1st Attempt 2nd Attempt Average
Training Testing Training Testing Training Testing
20 90 67 90 77 90 72
40 92 77 92 73 92 75
60 89 76 91 80 90 78
80 87 77 89 73 88 75
*rounded off to the nearest whole number
When input images are shuffled
No.of neurons\Attempt 1st Attempt 2nd Attempt Average
Training Testing Training Testing Training Testing
20 55 63 55 50 55 57
40 47 43 54 53 51 48
60 56 47 53 50 55 49
80 49 50 51 50 50 50
When inputs are normalised before training the network
No.of neurons\Attempt 1st Attempt 2nd Attempt Average
Training Testing Training Testing Training Testing
20 90 83 90 73 90 78
40 94 73 96 87 95 80
60 94 77 91 73 93 75
80 87 80 91 73 89 77
c) MLP with sequential learning

%%Matlab code for MLP with sequential learning
clear all;
for i = 1:15
    fem_pic = strcat('test_im/f',num2str(i),'.png');
    mal_pic = strcat('test_im/m',num2str(i),'.png');
    test_im_fem = imread(fem_pic);
    test_im_mal = imread(mal_pic);
    test_fem(i,:) = test_im_fem(:);
    test_mal(i,:) = test_im_mal(:);
end;
p_test = double([test_fem' test_mal']);
for i = 1:50
    fem_pic = strcat('train_im/f',num2str(i+15),'.png');
    mal_pic = strcat('train_im/m',num2str(i+15),'.png');
    train_im_fem = imread(fem_pic);
    train_im_mal = imread(mal_pic);
    train_fem(i,:) = train_im_fem(:);
    train_mal(i,:) = train_im_mal(:);
end;
p = double([train_fem' train_mal']);
t(1:50) = 0.2;
t(51:100) = 0.8;
[pn,ps] = mapstd(p);                    % variance normalization of the inputs
p_testn = mapstd('apply',p_test,ps);
net = newff(pn,t,[50 1],{'tansig','logsig'},'traingdm');
net.trainParam.epochs = 1;              % set epochs of batch training to 1
net.trainParam.show = 1000;
for i = 1:50
    index = randperm(100);              % shuffle the input data every epoch
    fprintf('\nepoch %d\n',i);
    for j = 1:100
        net = train(net,pn(:,index(j)),t(index(j)));   % perform sequential learning
        fprintf('%d ',j);
    end;
end;
t_test = sim(net,p_testn);
t_training = sim(net,pn);
error_training = 0;
error_testing = 0;
for i = 1:100    % first 50 training images are female (target 0.2), last 50 male (target 0.8)
    if (i <= 50) && (t_training(i) > 0.5)
        error_training = error_training + 1;
    elseif (i > 50) && (t_training(i) < 0.5)
        error_training = error_training + 1;
    end
end
for i = 1:30     % first 15 test images are female, last 15 are male
    if (i <= 15) && (t_test(i) > 0.5)
        error_testing = error_testing + 1;
    elseif (i > 15) && (t_test(i) < 0.5)
        error_testing = error_testing + 1;
    end
end
display(error_training);
display(error_testing);
Conclusion:
1) The single layer perceptron works well for linearly separable problems, but its
performance is inferior to the multilayer perceptron. The single layer perceptron also
works better with preprocessed inputs.
2) Even though shuffling the inputs does not by itself yield great results, improved
shuffling schemes can yield better results.
3) Shuffling presents the training samples in an unpredictable order each epoch, and
thereby the network learns better.
4) Normalisation of the inputs improves the performance of the network considerably.
5) The number of hidden neurons can be decided by trial and error (checking which
configuration yields the best results).
6) The multilayer perceptron with sequential learning is very slow, while the multilayer
perceptron in batch mode yields the same results faster. However, batch learning
requires a large amount of memory to store the inputs, so the trainlm method is not
preferred there (it can run out of memory).
7) For pattern recognition, preprocessing of the input data, such as dimensionality
reduction and normalization, can help to improve the final performance.
8) For large and redundant database, sequential learning would be better.