Soft Computing 2
Neural Networks: NN Architecture vs. Learning Method

| NN Architecture    | Gradient Descent                                        | Hebbian                                                      | Competitive                                                             | Stochastic                    |
|--------------------|---------------------------------------------------------|--------------------------------------------------------------|-------------------------------------------------------------------------|-------------------------------|
| Single-layer FFN   | ADALINE (Adaptive Linear Neuron), Hopfield, Perceptron  | AM (Associative Memory), Hopfield                            | LVQ (Learning Vector Quantization), SOFM (Self-Organizing Feature Map) | -                             |
| Multilayer FFN     | CCM (Cauchy Machine), RBF (Radial Basis Function)       | Neocognitron                                                 | -                                                                       | -                             |
| Recurrent networks | RNN                                                     | BAM (Bidirectional AM), BSB (Brain State in a Box), Hopfield | ART (Adaptive Resonance Theory)                                         | Boltzmann and Cauchy Machines |
Adaptive Filtering Problem
Unconstrained Optimization Techniques
- Steepest Descent
- Newton's Method
- Gauss-Newton Method
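For reference, the standard weight-update rules of these three methods can be summarized as follows (this summary is added here for concreteness, not taken from the slides; notation: weight vector w, step size η, gradient g, Hessian H, and Jacobian J of the error vector e):

```latex
% Steepest descent: step against the local gradient
\mathbf{w}(n+1) = \mathbf{w}(n) - \eta\,\mathbf{g}(n)
% Newton's method: scale the step by the inverse Hessian
\mathbf{w}(n+1) = \mathbf{w}(n) - \mathbf{H}^{-1}(n)\,\mathbf{g}(n)
% Gauss-Newton (sum-of-squared-errors cost): approximate the Hessian by J^T J
\mathbf{w}(n+1) = \mathbf{w}(n) - \left(\mathbf{J}^{\mathsf{T}}(n)\,\mathbf{J}(n)\right)^{-1}\mathbf{J}^{\mathsf{T}}(n)\,\mathbf{e}(n)
```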
LMS Algorithm

The Least Mean Square (LMS) algorithm was introduced by Widrow and Hoff in 1959. The LMS algorithm uses estimates of the gradient vector from the available data. LMS incorporates an iterative procedure that makes successive corrections to the weight vector in the direction of the negative of the gradient vector, which eventually leads to the minimum mean square error. Compared to other algorithms, the LMS algorithm is relatively simple: it does not require correlation function calculation, nor does it require matrix inversions.
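A minimal Python sketch of this procedure (the function name and signal layout are illustrative, not from the slides). Each iteration nudges the weight vector along the negative of the instantaneous gradient estimate e(n)·u(n); note there are no correlation matrices and no inversions:

```python
import numpy as np

def lms_filter(x, d, n_taps=4, eta=0.01):
    """Adapt an FIR filter so that its output tracks the desired signal d.

    x : input signal (1-D array), d : desired signal, eta : step size.
    Returns the final weight vector and the error signal.
    """
    w = np.zeros(n_taps)              # initial weight vector
    e = np.zeros(len(x))
    for n in range(n_taps, len(x)):
        u = x[n - n_taps:n][::-1]     # most recent tap-input vector
        y = w @ u                     # filter output
        e[n] = d[n] - y               # estimation error
        w = w + eta * e[n] * u        # correction along the negative gradient estimate
    return w, e
```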
Graph Representation
Properties of the LMS Algorithm

- The solution follows a random trajectory; hence LMS is also referred to as a stochastic gradient algorithm. Steepest descent, by contrast, follows a well-defined trajectory.
- LMS does not require knowledge of the statistics of the environment.
- LMS is simple and robust, as it is model independent.
- Its main drawback is a slow rate of convergence.
Learning Curves
Multilayer Neural Network (Perceptrons)
Each neuron in the network includes a nonlinear activation function, and the network is highly connected.
Backpropagation Algorithm

Backpropagation is a common method of teaching artificial neural networks how to perform a given task. It was first described by Arthur E. Bryson and Yu-Chi Ho in 1969, but it wasn't until 1974 and later, through the work of Paul Werbos, David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, that it gained recognition, and it led to a renaissance in the field of artificial neural network research.

It is a supervised learning method, and is a generalization of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any input in the training set. It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop). The term is an abbreviation for "backward propagation of errors". Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable.
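For reference, the delta rule that backpropagation generalizes updates each weight in proportion to the output error times the corresponding input (η is the learning rate, d the desired output, y the actual output):

```latex
\Delta w_i = \eta\,(d - y)\,x_i
```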
Backpropagation networks are necessarily multilayer perceptrons (usually with one input, one hidden, and one output layer). In order for the hidden layer to serve any useful function, multilayer networks must have nonlinear activation functions for the multiple layers: a multilayer network using only linear activation functions is equivalent to some single-layer, linear network. Nonlinear functions commonly used include the logistic function, the softmax function, and the Gaussian function.
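A small sketch of the two most common choices named above (illustrative code; the point of the nonlinearity is that stacking layers with only linear activations collapses to a single linear map):

```python
import numpy as np

def logistic(x):
    """Logistic (sigmoid) function: squashes activations into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    """Softmax: maps a vector of activations to a probability distribution."""
    z = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return z / z.sum()
```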
Backpropagation networks are a good fit when:
- A large amount of input/output data is available, but you're not sure how to relate it to the output.
- The problem appears to have overwhelming complexity, but there is clearly a solution.
- It is easy to create a number of examples of the correct behavior.
- The solution to the problem may change over time, within the bounds of the given input and output parameters (i.e., today 2 + 2 = 4, but in the future we may find that 2 + 2 = 3.99).
- Outputs can be "fuzzy", or non-numeric.
Limitations of backpropagation learning:
- The convergence obtained from backpropagation learning is very slow.
- The convergence in backpropagation learning is not guaranteed.
- The result may generally converge to any local minimum on the error surface, since stochastic gradient descent exists on a surface which is not flat.
- Backpropagation learning requires input scaling or normalization. Inputs are usually scaled into the range of 0.1 to 0.9 for best performance.
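To make the scaling point concrete, a minimal min-max normalization sketch (the function name and exact range are illustrative; constant features are assumed to have been removed):

```python
import numpy as np

def scale_inputs(X, lo=0.1, hi=0.9):
    """Min-max scale each input feature (column of X) into the range [lo, hi]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return lo + (hi - lo) * (X - x_min) / (x_max - x_min)
```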
Training a Two-Layer Feedforward Network

1. Take the set of training patterns you wish the network to learn: {in_i^p, out_j^p : i = 1..ninputs, j = 1..noutputs, p = 1..npatterns}.
2. Set up your network with ninputs input units fully connected to nhidden nonlinear hidden units via connections with weights, which in turn are fully connected to the noutputs output units.
3. Generate random initial weights, e.g. from the range [-smwt, +smwt].
4. Select an appropriate error function E and learning rate η.
5. Apply the weight update equation Δw = -η ∂E/∂w for each training pattern p. One set of updates of all the weights for all the training patterns is called one epoch of training.
6. Repeat step 5 until the network error function is "small enough".

The extension to networks with more hidden layers should be straightforward. A sketch of these steps follows.
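A minimal Python sketch of steps 2-6 (sigmoidal hidden and output units, sum-of-squares error; biases are added for completeness, and all names are illustrative rather than taken from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_two_layer(X, T, nhidden=4, eta=0.5, smwt=0.5, nepochs=1000, seed=0):
    """Steps 2-6 above: random initial weights in [-smwt, +smwt] and one
    weight update per training pattern; one pass over all patterns = one epoch."""
    rng = np.random.default_rng(seed)
    ninputs, noutputs = X.shape[1], T.shape[1]
    W1 = rng.uniform(-smwt, smwt, (ninputs, nhidden));  b1 = np.zeros(nhidden)
    W2 = rng.uniform(-smwt, smwt, (nhidden, noutputs)); b2 = np.zeros(noutputs)
    for _ in range(nepochs):                  # step 6: repeat until good enough
        for x, t in zip(X, T):                # step 5: per-pattern updates
            h = sigmoid(x @ W1 + b1)          # hidden activations
            y = sigmoid(h @ W2 + b2)          # network outputs
            d_out = (y - t) * y * (1 - y)     # output deltas (sum-of-squares error)
            d_hid = (d_out @ W2.T) * h * (1 - h)
            W2 -= eta * np.outer(h, d_out);  b2 -= eta * d_out
            W1 -= eta * np.outer(x, d_hid);  b1 -= eta * d_hid
    return W1, b1, W2, b2
```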
Practical Considerations for Back-Propagation Learning

Most of the practical considerations necessary for general Back-Propagation learning concern the following questions:
1. Do we need to pre-process the training data? If so, how?
2. How do we choose the initial weights from which we start the training?
3. How do we choose an appropriate learning rate η?
4. Should we change the weights after each training pattern, or after the whole set?
5. Are some activation/transfer functions better than others?
6. How do we avoid flat spots in the error function?
7. How can we avoid local minima in the error function?
8. How do we know when we should stop the training?

However, there are also two important issues specific to multilayer networks:
9. How many hidden units do we need?
10. Should we have different learning rates for the different layers?
How Many Hidden Units?

The best number of hidden units depends in a complex way on many factors, including:
1. The number of training patterns
2. The numbers of input and output units
3. The amount of noise in the training data
4. The complexity of the function or classification to be learned
5. The type of hidden unit activation function
6. The training algorithm

Too few hidden units will generally leave high training and generalisation errors due to underfitting. Too many hidden units will result in low training errors, but will make the training unnecessarily slow, and will result in poor generalisation unless some other technique (such as regularisation) is used to prevent overfitting.

Virtually all rules of thumb you hear about are actually nonsense. A sensible strategy is to try a range of numbers of hidden units and see which works best.
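A hypothetical sketch of that "try a range and see" strategy, reusing the train_two_layer() and sigmoid() functions from the earlier sketch (X_train, T_train, X_val, T_val are assumed to be your own training/validation split):

```python
def forward(X, W1, b1, W2, b2):
    """Forward pass through the trained two-layer network."""
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)

best = None
for nhidden in (2, 4, 8, 16, 32):
    params = train_two_layer(X_train, T_train, nhidden=nhidden)
    err = np.mean((forward(X_val, *params) - T_val) ** 2)  # validation MSE
    if best is None or err < best[0]:
        best = (err, nhidden)
print(f"best validation MSE {best[0]:.4f} with {best[1]} hidden units")
```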
Different Learning Rates for Different Layers?

A network as a whole will usually learn most efficiently if all its neurons are learning at roughly the same speed. So maybe different parts of the network should have different learning rates η. There are a number of factors that may affect the choices:
1. The later network layers (nearer the outputs) will tend to have larger local gradients (deltas) than the earlier layers (nearer the inputs).
2. The activations of units with many connections feeding into or out of them tend to change faster than those of units with fewer connections.
3. Activations required for linear units will be different for Sigmoidal units.
4. There is empirical evidence that it helps to have different learning rates η for the thresholds/biases compared with the real connection weights.

In practice, it is often quicker to just use the same rates η for all the weights and thresholds, rather than spending time trying to work out appropriate differences. A very powerful approach is to use evolutionary strategies to determine good learning rates.
NN Architecture

- Hopfield Network
- Kohonen Self-Organizing Map
- Radial Basis Function Network
- ART (Adaptive Resonance Theory)
- BSB (Brain State in a Box) Model
- Markov Chains
- Helmholtz Machines
- Boltzmann Machine
- Simulated Annealing
- Kalman Filters
- Spatio-Temporal Models of a Neuron
- Bellman Theorem
- Kullback-Leibler Divergence
Applications in Power Systems

Expansion, Generation, Transmission, Distribution, Structural, Reactive power, Reliability
Planning: Generation scheduling, Economic dispatch, OPF, Unit commitment, Reactive power dispatch, Voltage control, Security assessment (static, dynamic), Maintenance scheduling, Contract management

Operation: Equipment monitoring, System load forecasting, Load management, Alarm processing/Fault diagnosis, Service restoration, Network switching, Contingency analysis, FACTS, State estimation

Analysis/Modeling: Power flow, Harmonics, Transient stability, Dynamic stability, Control design, Simulation of operators, Protection