Soft Computing 2



Neural Networks: Architectures and Learning Methods

Single-layer feedforward networks
  Gradient descent: ADALINE (Adaptive Linear Neuron), Hopfield, Perceptron
  Hebbian: AM (Associative Memory), Hopfield
  Competitive: LVQ (Learning Vector Quantization), SOFM (Self-Organizing Feature Map)

Multilayer feedforward networks
  Gradient descent: CCM (Cauchy Machine), RBF (Radial Basis Function) network
  Hebbian / Competitive: Neocognitron

Recurrent networks
  Gradient descent: RNN
  Hebbian: BAM (Bidirectional Associative Memory), BSB (Brain State in a Box), Hopfield
  Competitive: ART (Adaptive Resonance Theory)
  Stochastic: Boltzmann and Cauchy Machines


Adaptive Filtering Problem


Unconstrained Optimization Techniques


Method of Steepest Descent
Newton's Method
Gauss-Newton Method
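Each of these techniques minimizes a cost function E(w) by iteratively adjusting the weight vector. As a minimal illustration (not from the slides), the sketch below applies steepest descent to a simple quadratic cost; the matrix A, vector b, and step size eta are illustrative, and Newton's step is noted in a comment for comparison.

    import numpy as np

    # Quadratic cost E(w) = 0.5 * w^T A w - b^T w, whose gradient is A w - b.
    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([1.0, 1.0])

    def grad(w):
        return A @ w - b

    w = np.zeros(2)      # initial guess
    eta = 0.1            # fixed step size (learning rate)
    for step in range(200):
        # Steepest descent: move against the local gradient.
        w = w - eta * grad(w)
        # Newton's method would instead use w - inv(Hessian) @ grad(w);
        # for this quadratic cost the Hessian is simply A.

    print("steepest descent:", w)
    print("exact solution  :", np.linalg.solve(A, b))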


LMS Algorithm

The Least Mean Square (LMS) algorithm was introduced by Widrow and Hoff in 1959. The LMS algorithm uses estimates of the gradient vector computed from the available data. It incorporates an iterative procedure that makes successive corrections to the weight vector in the direction of the negative of the gradient vector, which eventually leads to the minimum mean square error. Compared to other algorithms, the LMS algorithm is relatively simple: it requires neither correlation function calculations nor matrix inversions.

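A minimal sketch of the LMS procedure described above, assuming NumPy; the filter length, step size, and the signal names x (input) and d (desired response) are illustrative, not from the slides. The update follows e(n) = d(n) - w(n)^T x(n) and w(n+1) = w(n) + eta * e(n) * x(n).

    import numpy as np

    def lms(x, d, num_taps=4, eta=0.05):
        """Adapt an FIR filter so that its output tracks the desired signal d."""
        w = np.zeros(num_taps)                      # initial weight (tap) vector
        for n in range(num_taps - 1, len(x)):
            x_n = x[n - num_taps + 1:n + 1][::-1]   # most recent num_taps input samples
            e_n = d[n] - w @ x_n                    # estimation error e(n)
            w = w + eta * e_n * x_n                 # correction along the negative gradient estimate
        return w

    # Usage example: identify an unknown 4-tap FIR system from noisy observations.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(5000)
    true_w = np.array([0.5, -0.3, 0.2, 0.1])
    d = np.convolve(x, true_w)[:len(x)] + 0.01 * rng.standard_normal(len(x))
    print(lms(x, d))    # should be close to true_w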


Graph Representation


The solution follows a random trajectory (hence LMS is also referred to as a stochastic gradient algorithm), whereas steepest descent follows a well-defined trajectory.
LMS does not require knowledge of the statistics of the environment.
It is simple and robust, since it is model independent.
Its main drawback is a slow rate of convergence.


Learning Curves
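A learning curve for an adaptive algorithm such as LMS plots the squared error, averaged over an ensemble of independent trials, against the iteration number. A minimal sketch of how such a curve could be produced, assuming NumPy and Matplotlib and an illustrative signal model:

    import numpy as np
    import matplotlib.pyplot as plt

    def lms_squared_errors(rng, num_taps=4, eta=0.05, n_samples=500):
        # One independent LMS run: return the squared error at each iteration.
        x = rng.standard_normal(n_samples)
        true_w = np.array([0.5, -0.3, 0.2, 0.1])
        d = np.convolve(x, true_w)[:n_samples] + 0.05 * rng.standard_normal(n_samples)
        w = np.zeros(num_taps)
        sq_err = []
        for n in range(num_taps - 1, n_samples):
            x_n = x[n - num_taps + 1:n + 1][::-1]
            e_n = d[n] - w @ x_n
            w += eta * e_n * x_n
            sq_err.append(e_n ** 2)
        return np.array(sq_err)

    # Ensemble-average the squared error over many independent trials.
    rng = np.random.default_rng(1)
    curve = np.mean([lms_squared_errors(rng) for _ in range(200)], axis=0)

    plt.semilogy(curve)
    plt.xlabel("iteration n")
    plt.ylabel("mean squared error")
    plt.title("LMS learning curve (ensemble average of 200 runs)")
    plt.show()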


Multilayer Neural Networks (Perceptrons)


Key characteristics: each neuron includes a nonlinear activation function, and the network is highly connected.


Backpropagation Algorithm

Backpropagation is a common method of teaching artificial neural networks how to perform a given task. It was first described by Arthur E. Bryson and Yu-Chi Ho in 1969, but it wasn't until 1974 and later, through the work of Paul Werbos, David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams, that it gained recognition, and it led to a renaissance in the field of artificial neural network research.

It is a supervised learning method, and is a generalization of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any input in the training set. It is most useful for feed-forward networks (networks that have no connections that loop). The term is an abbreviation for "backward propagation of errors". Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable.


Backpropagation networks are necessarily multilayer perceptrons (usually with one input, one hidden, and one output layer). In order for the hidden layer to serve any useful function, multilayer networks must have nonlinear activation functions for the multiple layers: a multilayer network using only linear activation functions is equivalent to some single-layer, linear network. Nonlinear activation functions that are commonly used include the logistic function, the softmax function, and the Gaussian function.
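As a brief illustration of those three activation functions (a minimal NumPy sketch; the Gaussian parameters mu and sigma are illustrative):

    import numpy as np

    def logistic(x):
        # Logistic (sigmoid) function: squashes any real input into (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        # Softmax: turns a vector of scores into a probability distribution.
        z = np.exp(x - np.max(x))       # subtract the max for numerical stability
        return z / z.sum()

    def gaussian(x, mu=0.0, sigma=1.0):
        # Gaussian (radial) function, as used in RBF-style units.
        return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

    print(logistic(np.array([-2.0, 0.0, 2.0])))
    print(softmax(np.array([1.0, 2.0, 3.0])))
    print(gaussian(np.array([-1.0, 0.0, 1.0])))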


Backpropagation is well suited to problems where:

A large amount of input/output data is available, but you're not sure how to relate it to the output.
The problem appears to have overwhelming complexity, but there is clearly a solution.
It is easy to create a number of examples of the correct behavior.
The solution to the problem may change over time, within the bounds of the given input and output parameters (i.e., today 2 + 2 = 4, but in the future we may find that 2 + 2 = 3.99).
Outputs can be "fuzzy", or nonnumeric.


Limitations of backpropagation learning:

The convergence obtained from backpropagation learning is very slow.
The convergence in backpropagation learning is not guaranteed.
The result may generally converge to any local minimum on the error surface, since stochastic gradient descent exists on a surface which is not flat.
Backpropagation learning requires input scaling or normalization. Inputs are usually scaled into the range of +0.1 to +0.9 for best performance.
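As a minimal sketch of that kind of input scaling (the array name raw is illustrative), a simple min-max rescaling of each feature into the range [0.1, 0.9] might look like:

    import numpy as np

    def scale_inputs(raw, lo=0.1, hi=0.9):
        # Min-max scale each input feature (column) into the range [lo, hi].
        mins = raw.min(axis=0)
        maxs = raw.max(axis=0)
        return lo + (hi - lo) * (raw - mins) / (maxs - mins)

    # Example: scale a small batch of two-feature input patterns.
    raw = np.array([[2.0, 100.0],
                    [4.0, 300.0],
                    [6.0, 500.0]])
    print(scale_inputs(raw))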


Training a Two-Layer Feedforward Network

1. Take the set of training patterns you wish the network to learn: {in_i^p, out_j^p : i = 1 ... ninputs, j = 1 ... noutputs, p = 1 ... npatterns}.
2. Set up your network with ninputs input units fully connected to nhidden nonlinear hidden units via connections with weights, which in turn are fully connected to noutputs output units via connections with weights.
3. Generate random initial weights, e.g. from the range [-smwt, +smwt].
4. Select an appropriate error function E and learning rate η.
5. Apply the weight update equation for each training pattern p. One set of updates of all the weights for all the training patterns is called one epoch of training.
6. Repeat step 5 until the network error function is "small enough".

The extension to networks with more hidden layers should be obvious.
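A minimal NumPy sketch of this procedure, assuming sigmoid hidden units, linear output units, a sum-of-squares error, and batch updates; the names nhidden, eta, and smwt mirror the steps above, and the XOR data at the end is only a usage example, not from the slides.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_two_layer(inputs, targets, nhidden=4, eta=0.5, smwt=0.5,
                        nepochs=5000, seed=0):
        """Train a two-layer feedforward network with backpropagation."""
        rng = np.random.default_rng(seed)
        ninputs, noutputs = inputs.shape[1], targets.shape[1]
        # Step 3: random initial weights in [-smwt, +smwt] (extra row = bias).
        W1 = rng.uniform(-smwt, smwt, size=(ninputs + 1, nhidden))
        W2 = rng.uniform(-smwt, smwt, size=(nhidden + 1, noutputs))
        X = np.hstack([inputs, np.ones((len(inputs), 1))])    # append bias input
        for epoch in range(nepochs):     # step 6: repeat until the error is small enough
            # Forward pass.
            H = sigmoid(X @ W1)
            Hb = np.hstack([H, np.ones((len(H), 1))])
            Y = Hb @ W2                  # linear output units
            # Backward pass: deltas for the sum-of-squares error 0.5 * sum((Y - targets)^2).
            delta_out = Y - targets
            delta_hid = (delta_out @ W2[:-1].T) * H * (1.0 - H)   # sigmoid derivative
            # Step 5: gradient-descent weight updates for all patterns (one epoch).
            W2 -= eta * Hb.T @ delta_out
            W1 -= eta * X.T @ delta_hid
        return W1, W2

    # Usage example: learn XOR.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)
    W1, W2 = train_two_layer(X, T)
    Hb = np.hstack([sigmoid(np.hstack([X, np.ones((4, 1))]) @ W1), np.ones((4, 1))])
    print(np.round(Hb @ W2, 2))          # outputs should be close to T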


Practical Considerations for Back-Propagation Learning

Most of the practical considerations necessary for general Back-Propagation learning are summarized by the following questions:

1. Do we need to pre-process the training data? If so, how?
2. How do we choose the initial weights from which we start the training?
3. How do we choose an appropriate learning rate η?
4. Should we change the weights after each training pattern, or after the whole set?
5. Are some activation/transfer functions better than others?
6. How do we avoid flat spots in the error function?
7. How can we avoid local minima in the error function?
8. How do we know when we should stop the training?

However, there are also two further important issues:

9. How many hidden units do we need?
10. Should we have different learning rates for the different layers?


How Many Hidden Units?

The best number of hidden units depends in a complex way on many factors, including:

1. The number of training patterns
2. The numbers of input and output units
3. The amount of noise in the training data
4. The complexity of the function or classification to be learned
5. The type of hidden unit activation function
6. The training algorithm

Too few hidden units will generally leave high training and generalisation errors due to under-fitting. Too many hidden units will result in low training errors, but will make the training unnecessarily slow, and will result in poor generalisation unless some other technique (such as regularisation) is used to prevent over-fitting.

Virtually all rules of thumb you hear about are actually nonsense. A sensible strategy is to try a range of numbers of hidden units and see which works best.
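As a rough sketch of that try-a-range strategy (assuming scikit-learn is available; the data set and the candidate sizes are purely illustrative), one can train a network for each candidate number of hidden units and compare validation scores:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    # Illustrative regression data: y = sin(x) plus noise.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(400, 1))
    y = np.sin(X).ravel() + 0.1 * rng.standard_normal(400)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

    for nhidden in (2, 4, 8, 16, 32):
        net = MLPRegressor(hidden_layer_sizes=(nhidden,), max_iter=5000, random_state=0)
        net.fit(X_train, y_train)
        print(nhidden, "hidden units -> validation R^2 =", round(net.score(X_val, y_val), 3))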


Different Learning Rates for Different Layers?

A network as a whole will usually learn most efficiently if all its neurons are learning at roughly the same speed. So maybe different parts of the network should have different learning rates η. There are a number of factors that may affect the choices:

1. The later network layers (nearer the outputs) will tend to have larger local gradients (deltas) than the earlier layers (nearer the inputs).
2. The activations of units with many connections feeding into or out of them tend to change faster than units with few connections.
3. Activations required for linear units will be different from those for sigmoidal units.
4. There is empirical evidence that it helps to have different learning rates η for the thresholds/biases compared with the real connection weights.

In practice, it is often quicker to just use the same rates η for all the weights and thresholds, rather than spending time trying to work out appropriate differences. A very powerful approach is to use evolutionary strategies to determine good learning rates.


NN Architectures

Hopfield Network
Kohonen Self-Organizing Map
Radial Basis Function Network
ART (Adaptive Resonance Theory)
BSB (Brain State in a Box) Model
Markov Chains
Helmholtz Machines
Boltzmann Machine
Simulated Annealing
Kalman Filters
Spatio-Temporal Models of a Neuron
Bellman Theorem
Kullback-Leibler Divergence


Application areas in power systems:

Expansion: Generation, Transmission, Distribution
Structural
Reactive power
Reliability


Generation scheduling, Economic dispatch, OPF, Unit commitment, Reactive power dispatch, Voltage control, Security assessment (Static, Dynamic), Maintenance scheduling, Contract management

Equipment monitoring, System load forecasting, Load management, Alarm processing/Fault diagnosis, Service restoration, Network switching, Contingency analysis, FACTS, State estimation

Analysis/Modeling: Power flow, Harmonics, Transient stability, Dynamic stability, Control design, Simulation of operators, Protection