In machine learning and related fields, artificial neural networks (ANNs) are computational models inspired by animals' central nervous systems (in particular the brain) that are capable of machine learning and pattern recognition. ANNs are generally presented as systems of interconnected "neurons" that compute values from inputs.
Target outputs are y2* = 1 and y3* = 0.5, and the learning rate is 0.5. Show that the updated weights reduce the total error.
lr = 0.5;   % learning rate
w1 = 4;     % hidden neuron weight
w2 = 3;     % first output neuron weight
w3 = 2;     % second output neuron weight
x = 1;      % input
net1 = w1*x;
a1 = 1/(1+exp(-net1));   % a1 is the output of the hidden neuron
net2 = w2*a1;
o1 = 1/(1+exp(-net2));   % o1 is the output of the first output neuron
net3 = w3*a1;
o2 = 1/(1+exp(-net3));   % o2 is the output of the second output neuron
t1 = 1;     % target 1
t2 = 0.5;   % target 2
format long
d2 = (t1-o1)*o1*(1-o1);        % d2 = delta of the first output neuron
d3 = (t2-o2)*o2*(1-o2);        % d3 = delta of the second output neuron
d1 = a1*(1-a1)*(d2*w2+d3*w3);  % delta of the hidden neuron
w2n = w2 + lr*d2*a1   % new w2 (answer)
w3n = w3 + lr*d3*a1   % new w3 (answer)
w1n = w1 + lr*d1*x    % new w1 (answer)
Outputs with the original weights:
o1 = 0.950076058709617
o2 = 0.876968168373950

Updated weights:
w2n = 3.001162689345306
w3n = 1.980029286114346
w1n = 3.999344342217976

Outputs with the updated weights:
o1 = 0.950128539767988
o2 = 0.874833981221324
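To confirm the reduction explicitly, one can recompute the forward pass with the updated weights and compare total squared errors. This is a minimal sketch continuing the script above, assuming the usual sum-of-squared-errors cost:

E_old = 0.5*((t1-o1)^2 + (t2-o2)^2);    % total error before the update
a1n = 1/(1+exp(-w1n*x));                % forward pass with updated weights
o1n = 1/(1+exp(-w2n*a1n));
o2n = 1/(1+exp(-w3n*a1n));
E_new = 0.5*((t1-o1n)^2 + (t2-o2n)^2);  % total error after the update
[E_old E_new]                           % E_new (~0.0715) is smaller than E_old (~0.0723)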
Homework
Approximation of the SinC function
[Figure: plot of the SinC function over x from -10 to 10; y-axis from -0.4 to 1.2]
x          y = sinc(x)
-10.0000   -0.0544
 -9.8000   -0.0374
 -9.6000   -0.0182
 -9.4000    0.0026
 -9.2000    0.0242
 -9.0000    0.0458
 -8.8000    0.0665
 -8.6000    0.0854
 -8.4000    0.1017
 -8.2000    0.1147
 -8.0000    0.1237
 -7.8000    0.1280
 -7.6000    0.1274
 -7.4000    0.1214
 -7.2000    0.1102
 -7.0000    0.0939
 -6.8000    0.0727
 -6.6000    0.0472
 -6.4000    0.0182
 -6.2000   -0.0134
 …          …
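A minimal sketch of how the homework could be set up, assuming the data above are sinc(x) = sin(x)/x sampled in steps of 0.2 and a feedforward network with 10 hidden neurons (the network size is an assumption):

x = -10:0.2:10;
t = sin(x)./x;            % sinc(x) = sin(x)/x
t(isnan(t)) = 1;          % limit value at x = 0
net = feedforwardnet(10); % 10 hidden neurons (assumed)
net = train(net, x, t);
y = net(x);
plot(x, t, 'o', x, y, '-')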
load fisheriris
p = meas';                           % 4 features x 150 samples
a = zeros(3,50); b = zeros(3,50); c = zeros(3,50);
a(1,:) = 1; b(2,:) = 1; c(3,:) = 1;
t = [a, b, c];                       % one-hot targets for the 3 classes
net = newff(p, t, 5);                % 5 hidden neurons
net = train(net, p, t);
y = sim(net, p);
y = sim(net, p(:,150))

y =

   -0.0000    0.0743    0.9258

The third output is the largest, so sample 150 is assigned to the third class (virginica), which is correct.
Data Sets

Regression problems:
simplefit_dataset - Simple fitting dataset.
abalone_dataset - Abalone shell rings dataset.
bodyfat_dataset - Body fat percentage dataset.
building_dataset - Building energy dataset.
chemical_dataset - Chemical sensor dataset.
cho_dataset - Cholesterol dataset.
engine_dataset - Engine behavior dataset.
house_dataset - House value dataset.

Classification problems:
simpleclass_dataset - Simple pattern recognition dataset.
cancer_dataset - Breast cancer dataset.
crab_dataset - Crab gender dataset.
glass_dataset - Glass chemical dataset.
iris_dataset - Iris flower dataset.
thyroid_dataset - Thyroid function dataset.
wine_dataset - Italian wines dataset.
[x,t] = simplefit_dataset;
figure(1)
plot(x,t)                  % the target function
net = feedforwardnet(10);  % 10 hidden neurons
net = train(net,x,t);
view(net)
y = net(x);
figure(2)
plot(x,y)                  % the network's approximation
perf = perform(net,t,y)
[Figure: plot of the simplefit_dataset data; x-axis from 0 to 10, y-axis from 0 to 10]
[x,t] = simplefit_dataset;
net = fitnet(10);          % fitting network with 10 hidden neurons
net = train(net,x,t);
view(net)
y = net(x);
perf = perform(net,t,y)
Weight Initialization

• Use small random values for the initial weights to avoid saturation (see the sketch after this list).
• The connection weights from the inputs to a hidden unit determine the orientation of its hyperplane. The bias determines the distance of the hyperplane from the origin.
• If the data are not centered at the origin, the hyperplane may fail to pass through the data cloud.
• If all the inputs have a small coefficient of variation, it is quite possible that all the initial hyperplanes will miss the data entirely.
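A minimal sketch of such an initialization in MATLAB (the layer sizes and the ±0.1 range are assumptions for illustration):

nInputs = 4; nHidden = 5;                % assumed layer sizes
W = 0.1*(2*rand(nHidden, nInputs) - 1);  % weights uniform in [-0.1, 0.1]
b = 0.1*(2*rand(nHidden, 1) - 1);        % small random biases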
Input normalization (Preprocessing)
• To avoid saturation.
• If the bias terms are all small random numbers, then all the decision surfaces will pass close to the origin. If the data are not centered at the origin, the decision surfaces will not pass through the data points.
Use the ‘mapstd’ command in MATLAB (‘prestd’ in older toolbox versions) to normalize the inputs to zero mean and unit standard deviation.
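For example (a sketch; the variable names p and pnew are assumptions):

[pn, ps] = mapstd(p);               % normalize each row to mean 0, std 1
pnewn = mapstd('apply', pnew, ps);  % apply the same settings to new data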
Consider an MLP with two inputs (X and Y) and 100 hidden units. With normalized inputs and small random weights, the initial hyperplanes fall near the data, so it will be easy to learn a hyperplane passing through any part of the data region at any angle.
Curse of Dimensionality
Example: the Fisher Iris problem is a 3-class pattern-recognition problem. Assume that we take only one feature (x1), say sepal length.
If we are forced to work with a limited quantity of data, then increasing the dimensionality of the space rapidly leads to the point where the data become very sparse, in which case they provide a very poor representation of the mapping.
Idea of PCA
• Reduce the dimensionality of a data set consisting of many interrelated variables by linearly transforming it to a new set of (usually fewer) uncorrelated variables, the principal components (PCs), while retaining as much as possible of the variation present in the original data.
• A PC that accounts for more of the variation has more impact on the observations and is thus intuitively more informative.
Mean, Standard Deviation and Variance
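For a sample x_1, …, x_n, these are the standard definitions:

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s = \sqrt{s^2}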
Covariance
The standard deviation is the average distance from the mean of the data set to a point; covariance extends this idea to measure how much two dimensions vary from their means with respect to each other.
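In formula form (the standard definition):

\mathrm{cov}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})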
The Covariance Matrix
Covariance is always measured between two dimensions. If we have a data set with more than two dimensions, there is more than one covariance that can be calculated. For example, a 3-dimensional data set (x, y, z) yields cov(x,y), cov(x,z), and cov(y,z), which are collected in a covariance matrix.
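For the 3-dimensional case, the covariance matrix is:

C = \begin{pmatrix}
\mathrm{cov}(x,x) & \mathrm{cov}(x,y) & \mathrm{cov}(x,z) \\
\mathrm{cov}(y,x) & \mathrm{cov}(y,y) & \mathrm{cov}(y,z) \\
\mathrm{cov}(z,x) & \mathrm{cov}(z,y) & \mathrm{cov}(z,z)
\end{pmatrix}

The diagonal entries are the variances of the individual dimensions, and the matrix is symmetric since cov(a,b) = cov(b,a).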
The means of the example data are 1.81 (x) and 1.91 (y).
Original data set
[Figure: scatter plot of the data; both axes from -1.5 to 1.5]
f = e(:,2);   % eigenvector with the largest eigenvalue (the feature vector)
r = f'*B';    % project the mean-adjusted data B onto the principal component
r'
ans =
-0.8280
1.7776
-0.9922
-0.2742
-1.6759
-0.9130
0.0991
1.1446
0.4381
1.2239
Step 1: Get some data
Step 2: Subtract the mean
Step 3: Calculate the covariance matrix
Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix
Step 5: Choosing components and forming a feature vector
Step 6: Deriving the new data set
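Putting the six steps together in MATLAB (a sketch; the data are assumed to be the ten 2-D points of the classic worked example whose means, 1.81 and 1.91, match those quoted above, and the variable names B, e, f, r match the earlier snippet):

X = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0;
     2.3 2.7; 2.0 1.6; 1.0 1.1; 1.5 1.6; 1.1 0.9];  % Step 1: get some data
B = X - repmat(mean(X), size(X,1), 1);              % Step 2: subtract the mean
C = cov(B);                                         % Step 3: covariance matrix
[e, d] = eig(C);        % Step 4: eigenvectors e and eigenvalues d (ascending)
f = e(:,2);             % Step 5: keep the PC with the largest eigenvalue
r = f'*B';              % Step 6: derive the new (1-D) data set
                        % (an eigenvector's sign is arbitrary, so r may be negated)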
A principal component is a projection (a linear combination of the original variables) that captures the maximum amount of variation in a dataset while being orthogonal (and therefore uncorrelated) to the previous principal components of the same dataset.
The blue lines represent two consecutive principal components. Note that they are orthogonal (at right angles) to each other.