Lecture 6. Perceptron – The Simplest Neural Network
Saint-Petersburg State Polytechnic University, Distributed Intelligent Systems Dept
ARTIFICIAL NEURAL NETWORKS
The perceptron is one of the first and simplest artificial neural networks, presented in the mid-1950s. It was the first mathematical model to demonstrate the new paradigm of machine learning in a computational environment, and a threshold-logic-unit model for classification tasks.
Prof. Dr. Viacheslav P. Shkodyrev, e-mail: [email protected]




L 6-1. Perceptron - Neural Networks

The perceptron was proposed by Rosenblatt (1958) as the first model for learning with a teacher, i.e. supervised learning. The model is an enhancement of the threshold logic unit (TLU), which is used for the classification of patterns said to be linearly separable. How can this model be formalized and interpreted?

Objective of Lecture 6. This lecture introduces the simplest class of neural networks, the perceptron, and its application to pattern classification. The functionality of the perceptron and of the threshold-logic-unit model can be interpreted geometrically via a separating hyperplane. In this lecture we define the perceptron learning rule, describe perceptron networks and their training algorithms, and discuss the limitations of perceptron networks. You will learn:
- what a single-layer perceptron is, via the threshold-logic-unit model;
- perceptrons as linear and non-linear classifiers, via threshold logic theory;
- multi-layer perceptron networks;
- perceptron learning rules.

L 6-2. The Simple Nonlinear One-Layer Neural Network

It was the perceptron that first demonstrated this new paradigm of trainable computational algorithms. (Figure: input pattern, association unit, threshold.) We solve a classification task when we assign an image, represented by a feature vector, to one of two classes, which we denote A and F, so that class A corresponds to the character a and class F corresponds to the character b. Using the perceptron training algorithm, we can classify two linearly separable classes.

L 6-3. Single-Layer Perceptron as the Simplest Model for Classification

(Figure: architecture of the SLP, showing the input vector $\mathbf{x} = (x_1, x_2, \dots, x_n)$ for input patterns a, b, c, ..., the synaptic weight matrix $\mathbf{W}$, and the threshold logic activation $y = \varphi(u)$, which outputs 1 when $u$ exceeds $u_{thresh}$.) The single-layer perceptron was the first simple model to generate great interest, due to its ability to generalize from its training vectors and to learn from initially randomly distributed connections.

L 6-4. Linearly Separable Classification Problem via SLP

For a one-input perceptron the activation is $u = wx + b$, and the two decision regions are
$S_A = \{x : u(x) < u_{thresh}\}$, $S_B = \{x : u(x) > u_{thresh}\}$.
For a two-input perceptron $u = w_1 x_1 + w_2 x_2 + b$, and likewise
$S_A = \{\mathbf{x} : u(\mathbf{x}) < u_{thresh}\}$, $S_B = \{\mathbf{x} : u(\mathbf{x}) > u_{thresh}\}$.
Geometric interpretation of threshold logic units: the decision boundary $u(\mathbf{x}) = u_{thresh}$ is a point $x_0$ on the $x$ axis for the one-input perceptron and a straight line in the $(x_1, x_2)$ plane for the two-input perceptron, separating the regions $S_A$ and $S_B$.
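To make the decision rule of slide L 6-4 concrete, here is a minimal sketch of a two-input threshold logic unit (Python/NumPy is my choice of language; the weight values w = (1, 1), b = -1.5 and the test points are illustrative assumptions, not values given in the lecture):

```python
import numpy as np

def tlu(x, w, b, u_thresh=0.0):
    """Threshold logic unit: assign class S_B if u(x) = w.x + b exceeds u_thresh, else S_A."""
    u = np.dot(w, x) + b
    return 1 if u > u_thresh else 0

# Hand-picked weights realizing the separating line x1 + x2 = 1.5 (illustrative values only).
w, b = np.array([1.0, 1.0]), -1.5

for x in [np.array(p) for p in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (2.0, 1.0)]]:
    print(x, "-> S_B" if tlu(x, w, b) else "-> S_A")
```

Points below the line $x_1 + x_2 = 1.5$ are reported as $S_A$ and points above it as $S_B$, which is exactly the separating-hyperplane picture of slide L 6-4.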

L 6-5. Limitations of Single-Layer Perceptrons

The Exclusive-Or (XOR) gate problem [Minsky, Papert, 1969]. The XOR logic is:

  Input x1 | Input x2 | Output u
      0    |    0     |    0
      0    |    1     |    1
      1    |    0     |    1
      1    |    1     |    0

(Figure: the four patterns x(0,0), x(0,1), x(1,0), x(1,1) in the $(x_1, x_2)$ plane, and a solution of the XOR gate problem.) A linear separable surface cannot solve the Exclusive-Or gate classification task; the problem is overcome by a multi-layer network.

L 6-6. Two-Layer Perceptron for Non-Linear Separability

With a one-input two-layer perceptron we obtain a closed separable region, cutting out the boundary points $x_{01}$ and $x_{02}$ in the one-dimensional space of $x$:
$S_\Delta = S_1 \cap S_2$, where
$S_1 = \{x : u_1^{(1)}(x) > u_{thresh}\}$, $S_2 = \{x : u_2^{(1)}(x) > u_{thresh}\}$,
the first-layer activations are $u_1^{(1)} = w_1^{(1)} x + b_1$ and $u_2^{(1)} = w_2^{(1)} x + b_2$, and the second-layer unit $u^{(2)}$ combines their outputs with weights $w_1^{(2)}, w_2^{(2)}$.
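As a numerical check of the claim that a multi-layer network overcomes the XOR limitation, here is a minimal sketch of a two-layer network of threshold units with hand-wired (not learned) weights; the particular weight values are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

def step(u, u_thresh=0.0):
    """Hard-limiter (threshold logic) activation."""
    return 1 if u > u_thresh else 0

# Layer 1: two threshold units on the inputs (x1, x2).
W1 = np.array([[1.0, 1.0],   # unit 1 fires when x1 + x2 > 0.5 (OR-like)
               [1.0, 1.0]])  # unit 2 fires when x1 + x2 > 1.5 (AND-like)
b1 = np.array([-0.5, -1.5])

# Layer 2: one threshold unit on the hidden outputs (h1, h2): fires when h1 - h2 > 0.5,
# i.e. it realizes h1 AND NOT h2, which is exactly XOR.
W2, b2 = np.array([1.0, -1.0]), -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = np.array([step(u) for u in W1 @ np.array(x, dtype=float) + b1])
    y = step(W2 @ h + b2)
    print(x, "->", y)   # reproduces the XOR truth table: 0, 1, 1, 0
```

Combining the two first-layer half-planes in the second layer yields a decision region that no single straight line can produce; slides L 6-7 and L 6-8 generalize this construction to convex and non-convex regions.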

L 6-7. Topology Classification of Multilayer NN

With a two-input two-layer perceptron network we can realize a convex separable surface. Each first-layer neuron $i = 1, 2, 3$ computes
$u_i^{(1)} = \mathbf{w}_i^{(1)T}\mathbf{x} + b_i^{(1)}$
and defines its own straight-line decision boundary in the $(x_1, x_2)$ plane (decision boundaries of neurons 1, 2 and 3); the second-layer unit $u^{(2)}$ combines their outputs, so the resulting decision region is
$S_A = S_1 \cap S_2 \cap S_3$,
a convex separable boundary in the two-dimensional space.

L 6-8. Learning of Neural Networks

With a three-layer two-input perceptron network we can realize a non-convex separable surface. The complex concave separable surface is
$Z = S_{\Delta 1} \cup S_{\Delta 2}$, where $S_{\Delta i} = \{\mathbf{x} : u_i^{(2)}(\mathbf{x}) > u_{thresh}\}$, $i = 1, 2$.

L 6-9. Learning Rule for the Single-Layer Perceptron

Learning of the SLP is posed as an optimization task: find
$\mathbf{W}_{opt} : J(\mathbf{W}) \to \min$,
where the cost $J$ is built from the error $\mathbf{e} = \mathbf{y}_{tar} - \mathbf{y}(\mathbf{W})$. Solution of the task: iterative weight updates
$w_{ij}^{(k+1)} = w_{ij}^{(k)} + \Delta w_{ij}^{(k)}$, with $\Delta w_{ij}^{(k)} \propto -\dfrac{\partial J(\mathbf{W})}{\partial w_{ij}^{(k)}}$.
Three classical rules differ in how the error enters the cost:
- Rosenblatt's learning rule, based on quantized-error minimization: $J_R(\mathbf{W}) = \langle \mathbf{e}, \mathbf{e} \rangle$ with $\mathbf{e} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x})$;
- the modified Rosenblatt learning rule, based on non-quantized error minimization: $J_M(\mathbf{W}) = \tfrac{1}{2}\langle \mathbf{e}, \mathbf{e} \rangle$ with a smooth activation in place of the hard limiter;
- the Widrow-Hoff learning rule (delta rule), based on state-error minimization: the error is taken on the activations, $\mathbf{e} = \mathbf{u}_{teach} - \mathbf{u}^{(k)}$.
The aim of learning is to minimize the instantaneous squared error of the output signal.

L 6-10. Rosenblatt's Learning Rule

The first original perceptron learning rule for adjusting the weights was developed by Rosenblatt. We determine the cost function via the quantized error $\mathbf{e}$:
$J_R(\mathbf{W}) = \langle \mathbf{e}, \mathbf{e} \rangle$, $\mathbf{e} = \mathbf{y}_{teach} - \mathrm{sgn}(\mathbf{W}\mathbf{x})$,
where $\mathbf{e} = (e_1, e_2, \dots, e_m)^T$ is the vector of quantized errors with elements $e_j = y_{teach,j} - \mathrm{sgn}(\mathbf{w}_j^T\mathbf{x})$, i.e.
$e_j = \begin{cases} 0, & \text{if } y_{teach,j} = 0,\ y_j = 0,\\ 1, & \text{if } y_{teach,j} = 1,\ y_j = 0,\\ -1, & \text{if } y_{teach,j} = 0,\ y_j = 1,\\ 0, & \text{if } y_{teach,j} = 1,\ y_j = 1. \end{cases}$
Then the weight change, $\Delta w_{ij}^{(k)} \propto -\dfrac{\partial J_R(\mathbf{W})}{\partial w_{ij}^{(k)}}$, is
$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} x_i$:
zero when the output is correct ($e_j^{(k)} = 0$), $+\alpha x_i$ when the unit should have fired but did not ($e_j^{(k)} = 1$), and $-\alpha x_i$ when it fired but should not have ($e_j^{(k)} = -1$).

L 6-11. Modified Rosenblatt Learning Rule

In modern perceptron implementations the hard-limiter function is usually replaced by a smooth nonlinear activation function such as the sigmoid:
$\mathbf{y} = \varphi(\mathbf{W}\mathbf{x})$, with $\varphi(u) = \bigl(1 + \exp(-u)\bigr)^{-1}$ or $\varphi(u) = \tanh(u)$.
We determine the modified cost function via the non-quantized error $\mathbf{e}$:
$J_M(\mathbf{W}) = \tfrac{1}{2}\langle \mathbf{e}, \mathbf{e} \rangle$, $\mathbf{e} = \mathbf{y}_{teach} - \varphi(\mathbf{W}\mathbf{x})$.
Applying $\Delta w_{ij}^{(k)} \propto -\dfrac{\partial J_M(\mathbf{W})}{\partial w_{ij}^{(k)}}$ and an algebraic transformation (the chain rule, sketched after slide L 6-12 below), we get the final equation
$\Delta w_{ij}^{(k)} = \alpha\, e_j^{(k)} \bigl[1 - (y_j^{(k)})^2\bigr] x_i^{(k)}$,
where the factor $1 - (y_j^{(k)})^2$ is the derivative of the $\tanh$ activation.

L 6-12. General Algorithm of the Learning Rule

(Block diagram: initialize $\mathbf{W}_0$; at step $k$ the SLP with weights $\mathbf{W}[k]$ maps the input $\mathbf{x}$ to an output $\mathbf{y}$, which is compared with the teacher signal $\mathbf{y}_{teach}$; the chosen rule, Rosenblatt, modified Rosenblatt or delta rule, produces $\Delta\mathbf{W}[k]$, and the weights are updated to $\mathbf{W}[k+1]$.) Learning of an SLP illustrates a supervised learning rule which aims to assign the input patterns $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_p\}$ to one of the prespecified classes or categories, given that the desired response of the perceptron for every class is known in advance.
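The general algorithm of slide L 6-12, with the Rosenblatt rule of slide L 6-10 plugged in, fits in a few lines of code. This is a minimal sketch only (the data set, learning rate and epoch count are illustrative assumptions, not part of the lecture):

```python
import numpy as np

def train_perceptron(X, y_teach, alpha=0.1, epochs=100):
    """Rosenblatt's rule for one threshold unit: w <- w + alpha * e * x, with e = y_teach - y."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # fold the bias b into w via a constant input 1
    w = np.zeros(Xb.shape[1])                   # initial weights W0
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(Xb, y_teach):
            y = 1 if np.dot(w, x) > 0 else 0    # hard-limiter (threshold logic) output
            e = t - y                           # quantized error: 0, +1 or -1
            if e != 0:
                w += alpha * e * x              # weight change only when the output is wrong
                mistakes += 1
        if mistakes == 0:                       # every training pattern classified correctly
            break
    return w

# Illustrative linearly separable data (assumed for the example, not taken from the slides).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.2, 0.3], [0.9, 0.8]])
y_teach = np.array([0, 0, 0, 1, 0, 1])
print("learned weights (w1, w2, b):", train_perceptron(X, y_teach))
```

For linearly separable data such as this, the loop stops after a finite number of weight changes (the perceptron convergence theorem); for the XOR data of slide L 6-5 it never does, which is the limitation that motivates the multi-layer networks above.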

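For completeness, the "algebraic transformation" referred to on slide L 6-11 is just the chain rule. A short sketch in my own notation, assuming the $\tanh$ activation named on that slide:

$J_M(\mathbf{W}) = \tfrac{1}{2}\sum_j e_j^2$, with $e_j = y_{teach,j} - y_j$ and $y_j = \tanh\Bigl(\sum_i w_{ij} x_i\Bigr)$,

$\dfrac{\partial J_M}{\partial w_{ij}} = e_j \dfrac{\partial e_j}{\partial w_{ij}} = -e_j\, \varphi'(u_j)\, x_i = -e_j \bigl(1 - y_j^2\bigr) x_i$, since $\dfrac{d}{du}\tanh(u) = 1 - \tanh^2(u)$,

so the gradient step $\Delta w_{ij} = -\eta\, \dfrac{\partial J_M}{\partial w_{ij}} = \alpha\, e_j \bigl(1 - y_j^2\bigr) x_i$ is exactly the final equation of slide L 6-11.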
L 6-13. Block Diagram of Rosenblatt's Learning Rule

Rosenblatt's learning rule realizes the weight change value as
$\Delta\mathbf{W}[k] = -\eta\, \nabla J_R(\mathbf{W})$, a matrix with elements
$\Delta w_{ij}[k] = \alpha\, e_j[k]\, x_i = \begin{cases} 0, & \text{if } f_j[k] = 0,\ e_j[k] = 0,\\ \alpha x_i, & \text{if } f_j[k] = 0,\ e_j[k] = 1,\\ -\alpha x_i, & \text{if } f_j[k] = 1,\ e_j[k] = -1,\\ 0, & \text{if } f_j[k] = 1,\ e_j[k] = 0, \end{cases}$
where $f_j[k]$ is the output of unit $j$ at step $k$.

L 6-14. Recommended References

1. Minsky M. L., Papert S. Perceptrons, Expanded Edition: An Introduction to Computational Geometry.
2. Haykin S. Neural Networks: A Comprehensive Foundation. Macmillan, New York, 1994.
3. Fausett L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1994.
4. Cichocki A., Unbehauen R. Neural Networks for Optimization and Signal Processing. Wiley, 1993.