
Copyright © IFAC Artificial Intelligence in Real Time Control, Valencia, Spain, 1994

NEURAL NETWORK BASED ADAPTIVE CONTROL

E. F. CAMACHO and M. R. ARAHAL

Dpto. Ingeniería de Sistemas y Automática, Univ. of Seville, Spain.

Abstract. This paper presents different ways of using artificial neural networks in adaptive control. A classification of architectures for control using neural networks is presented, showing the existing parallelism with Adaptive Control techniques.

Key Words - Adaptive control; Automatic control; Neural nets; Nonlinear control systems.

1. INTRODUCTION

The Control Theory for linear processes has for some time been considered a well established scientific discipline with powerful techniques for analyzing and designing controllers. The main problems in process control when applying the Linear Control Theory are caused by the fact that:

a) A linear mathematical model of the plant is needed and finding one is not a trivial problem in many cases.

b) Mathematical models of real processes cannot take all aspects of reality into account. Simplifying assumptions have to be made and models are only approximations of reality.

c) Most processes are nonlinear, having nonlinear dynamics and nonlinearities caused by actuators that have a limited range of action and a limited slew rate, as in the case of control valves, which are limited by fully closed and fully open positions and a maximum slew rate. Constructive and/or safety reasons, as well as sensor ranges, cause limits in process variables, as in the case of tank levels, pipe flows and pressures in deposits.

d) Because of changing environmental conditions, such as ambient temperature, humidity etc., most processes are not time invariant.

These problems have been extensively treated in the literature and some new disciplines have appeared to address them. Some of the disciplines have evolved around the Linear Systems Control Community, as is the case of Robust or Adaptive Control, while other disciplines have developed around the Artificial Intelligence Control Community, as is the case of expert, fuzzy, or Neural Control.
In Robust Control the process is usually modeled by a linear model and some of the problems mentioned above are treated by considering uncertainties about the model. The main assumption in most cases is that the underlying process is linear. In Adaptive Control the main idea is that by an appropriate adaptation mechanism, the controller and/or model of the process, linear in most cases, will cope with unknown, changing and possibly nonlinear dynamics. Advanced control strategies, normally based on an exact cancellation of the nonlinear dynamics (Craig, 1988), have to be used for nonlinear processes such as robots. The uncertainties on the dynamic parameters of the processes, such as inertias and payload conditions in robots, have motivated the design of adaptive controllers (Slotine and Li, 1990; Kelly, Carelli and Ortega, 1989; Ortega and Spong, 1988). This type of controller is designed assuming an exact knowledge of the model structure and does not include aspects, such as nonlinear frictions, elasticity in the joints and links, backlash and torque perturbations, which can be found in robots.

The AI type of approaches try somehow to reproduce the behavior of human controllers that are able to use natural intelligence to control processes exhibiting all the problems described. A further difference between both approaches has been that while the Linear Control Community seemed to be more interested in demonstrating results about the stability of proposed control schemes, the AI Control Community seemed more interested in showing that the technique worked on particular processes. This is however changing lately and there are a number of works relating both types of disciplines. Stability analysis is one of the converging fields and some results have appeared in the literature establishing conditions to ensure the stability of AI controllers (Aracil, Ollero and Garcia-Cerezo, 1989). Adaptive Control is another field where there is a strong confluence with AI controllers. The idea of adaptation is strongly linked to the idea of learning, which is fundamental to NN.
Neural Network based controllers have received much attention in recent years. This type of controller exploits the possibilities of neural networks for learning nonlinear functions and/or the possibilities of neural networks to solve certain types of problems where massive parallel computation is required. The learning capability of NN is used to make the controller map a certain function, highly nonlinear most of the time, representing direct dynamics, inverse dynamics or any other characteristics of the process. This is usually done during a, normally long, training period when commissioning the controller in a supervised or unsupervised manner (Psaltis, Sideris and Yamamura, 1988). If the learning capability of the NN is not switched off after the training period, once the controller is commissioned, the NN based controller works as an adaptive controller. The ability of NN for parallel computation has been exploited to implement controllers which require a substantial amount of computation, such as long range predictive controllers where a quadratic optimization problem has to be solved (Quero and Camacho, 1990).

This paper deals with adaptive NN controllers for nonlinear processes. Neural networks, as was mentioned before, have the ability of learning a nonlinear model without a priori knowledge of its structure (Lightbody and Irwin, 1992) and are adequate for working in real time because of their high parallelism. NN seem to be an alternative way of solving some of the problems mentioned above, that is, nonlinear processes with not necessarily known and/or changing dynamics. The main objective of this paper is to show how Neural Networks can be used for adaptive control and to explore the parallelism found in Neural Network Control and Adaptive Control.
2. NEURAL NETWORK BASED CONTROL

The history of NN can be traced back to the 40s with the first models of biological neural cells. A first step in connectionism was taken by McCulloch and Pitts (1943), who modeled an artificial neuron and studied the properties of the resulting network. Hebb observed that a strengthening of the connections between neurons occurs when one cell stimulates another when the latter is firing. This observation was used by Grossberg in the 60s to model neural learning. Hebbian types of rules for learning were used by Rosenblatt's Perceptron (1958), which was later studied in depth by Minsky and Papert (1969). A gradient descent method called the delta rule was used by Widrow and Hoff (1960) to train a NN whose nodes are called ADALINE (Adaptive Linear Element). The backpropagation training algorithm developed in the 70s and 80s is another milestone in the history of artificial NN. It allowed networks with hidden layers to be trained, overcoming the problems that the perceptron had in representing certain types of functions such as the exclusive-OR function (Minsky and Papert, 1969). The introduction of feedback in NN produced dynamical systems with various equilibrium points that were used as associative memories: Hopfield (1982) devised a dynamic structure that has been widely used for the solving of optimization problems.
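As an illustration of the delta rule mentioned above, the following minimal sketch trains a single linear element by gradient descent on the squared output error. The function name, the learning rate and the training data are hypothetical choices made for this example only; they are not taken from Widrow and Hoff (1960).

```python
def train_adaline(samples, rate=0.1, epochs=500):
    """Fit y = w . x + b with the delta (least mean squares) rule."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs)) + bias
            error = target - output
            # Delta rule: step along the negative gradient of error**2 / 2.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Hypothetical training set generated by y = 2*x1 - x2 + 0.5.
samples = [
    ((0.0, 0.0), 0.5),
    ((1.0, 0.0), 2.5),
    ((0.0, 1.0), -0.5),
    ((1.0, 1.0), 1.5),
]
weights, bias = train_adaline(samples)
```

Because the element is linear, this rule can only fit functions that are linear in the inputs, which is the limitation that networks with hidden layers trained by backpropagation later removed.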
NN abilities were soon applied to challenging control problems. Barto, Sutton and Anderson (1983) solved the well-known problem of balancing a pole on a cart. They discussed an important aspect of training NN: the credit assignment problem. The backpropagation rule needs to be told the error made at any time, but in some cases it is only known that an error has been made. Nguyen and Widrow (1990) exploited the ability of NN to learn nonlinear functions in the problem of the dock and the trailer truck. Backing a trailer truck to a loading dock is a hard task even for humans. The control signal was generated by a NN previously trained using backpropagation with the help of an emulator. The emulator consists of another NN that identifies the plant; it has the same inputs as the plant plus the state of the plant. The output of the net is an estimation of the plant's next state (using a discrete-time representation). By backpropagating the error made in the prediction, the net learns the behavior of the plant. Once the emulator knows the plant's dynamics with a certain accuracy, the training of the controller begins. It is commissioned by using backpropagation of the error at the final state. The aim is to find the weights that minimize a measure of the state's error at each time step. But, as the error is only available in the final state, it has to be backpropagated through the plant emulator in order to estimate the controller's error at each step. From trial to trial, each one having different initial conditions, the controller is driven by the emulator to give the correct control law. The fact that the real plant cannot be used to backpropagate the errors made by the controller explains the need for an emulator.

Another way of viewing NN in control is as look-up tables. The NN stores control signals, given the state of the plant and the next-step desired state. In (Kraft and Campagna, 1990) a NN was used to control three types of systems: linear, linear + noise and nonlinear. The performance of the NN controller was compared with a couple of adaptive controllers, STR and MRAC, showing good characteristics.
In 1990 the first number of the new magazine IEEE Transactions on Neural Networks appeared, and its first paper (Narendra and Parthasarathy, 1990) was dedicated to the application of NN to control and identification of nonlinear systems. They suggested structures for identification and control of nonlinear systems with unknown dynamics using NN. Based on simple operations: 1) time delay, 2) summation and multiplication by a constant and 3) the nonlinear activation function, they treated recurrent and multilayer networks in a unified fashion. They used the term generalized NN to name the nets resulting from the combination of the above listed building blocks. A method that allows the parameters of the NN to be dynamically changed was presented as an extension of static backpropagation. By simulation studies they revealed the effectiveness of such structures to identify and control nonlinear plants with unknown structure and parameters.

Sanner and Slotine (1992) proposed a new architecture for adaptive control using Gaussian NN. A Gaussian Network uses Gaussian radial basis functions (RBF) in its nodes to approximate a nonlinear function. Provided that such a function has some degree of smoothness, it was shown that the system formed by the plant and a controller using the NN is stable and that the tracking error will converge to a neighborhood of zero. The aim of the authors was to develop stable adaptive architectures capable of exploiting analog designs for the control of continuous-time nonlinear dynamic systems.

Let us consider a plant whose dynamics have a nonlinear expression relating the n-th derivative of the state with the state and its n-1 first derivatives. The role of the NN, consisting of a single layer of nodes possessing radial Gaussian characteristics, is to provide an estimation of such a function at any time. That is, the net has to uniformly approximate a continuous function with a prespecified accuracy on a compact subset of $R^n$ using a finite number of nodes. It is necessary to prove that such a function can be represented as a linear combination of a set of continuous, known basis functions.
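The output of a single-layer Gaussian network of this kind is a weighted sum of radial basis functions. The sketch below shows that computation only. The function name, the fixed common width and the particular form of the exponent are assumptions made for illustration; the exact parameterization used by Sanner and Slotine (1992) may differ.

```python
import math


def gaussian_rbf_output(x, centers, width, weights):
    """Return sum_i w_i * exp(-||x - c_i||^2 / width^2) for the input vector x."""
    total = 0.0
    for center, weight in zip(centers, weights):
        dist_sq = sum((xj - cj) ** 2 for xj, cj in zip(x, center))
        total += weight * math.exp(-dist_sq / width ** 2)
    return total


# Hypothetical two-node network on a two-dimensional state.
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -1.0]
estimate = gaussian_rbf_output((0.0, 0.0), centers, width=0.5, weights=weights)
```

Since the output is linear in the weights, adapting the network while the centers and width stay fixed amounts to adjusting the coefficients of a linear combination of known basis functions.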
Sanner and Slotine showed that this approximation can be done using Gaussian radial basis functions. The resulting system adjusts the network's weights while controlling the plant. No prior learning is needed. A sliding mode control is set up to prevent the tracking from degrading when the state of the plant is outside the region in which the NN has good performance.

NN have been used to design controllers for highly nonlinear processes such as robot manipulators (Kawato et al., 1987). An adaptive feedback controller has been proposed by Guez and Bar-Kana (1990).

NN have also been used to implement long range receding horizon predictive controllers. Long-range predictive controllers (LRPC), or Model Predictive Controllers (MPC) as they are called in the domain of the process industry, have received a lot of attention in recent years. All these controllers are based on the fact that the process output can be predicted over a horizon from the past process input and output and the potential future control sequence if a suitable parameterized model of the system is known. The name of these types of controllers comes from the way in which the control law is computed: at the present time t the future sequence of manipulated variables is selected in such a way that the predicted response of the process has certain desirable characteristics. Only the first computed manipulated variable is implemented and the process is repeated at time t+1. There have been many LRPC or MPC algorithms proposed in the literature (Garcia, Prett and Morari, 1989; Tan and De Keyser, 1993; Cutler and Ramaker, 1980). Model Algorithmic Control (MAC) (Rouhani and Mehra, 1982) and Generalized Predictive Control (GPC) (Clarke, Mohtadi and Tuffs, 1987a and 1987b) can be mentioned among the most popular. The basic idea of GPC is to calculate a sequence of future control signals which minimizes a multistage cost function defined over a receding control horizon. The index to be optimized is the expectation of a quadratic function measuring the control effort and the distance between the predicted system output and some predicted reference sequence over the receding horizon. The GPC involves the solution of an unconstrained quadratic problem (QP) with N variables, which can easily be obtained by using any standard method for unconstrained QP optimization. These methods cannot, however, solve the constrained problem and, although the amount of computation needed is not very high, it can be a drawback for real time applications. When process variables are bounded, a QP problem with linear constraints has to be solved (Camacho, 1993), which requires a substantial amount of computation for real time problems. Hopfield NN have been used to implement GPC for processes with unbounded (Quero and Camacho, 1990) and bounded signals (Quero, Camacho and Franquelo, 1993).

When the process is nonlinear the problem gets more complex, as the implementation of a GPC requires the optimization of a nonlinear function. NN have also been used in this context to implement GPC. Gómez-Ortega and Camacho (1994) use NN to implement a GPC for path tracking of mobile robots. The NN is trained in a supervised way in an off-line manner, using the output of a numerical optimization algorithm that computed the best control action. Tan and De Keyser (1993) use a NN predictor to implement GPC for nonlinear processes. An interesting fast learning algorithm is also proposed by these authors.
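The receding horizon idea described above can be sketched for the simplest unconstrained case. The example assumes a hypothetical first-order plant y(k+1) = a*y(k) + b*u(k), a constant reference, and a single control value held over the whole prediction horizon, so that the quadratic cost has a closed-form minimizer. It is a simplification and not the GPC algorithm of Clarke, Mohtadi and Tuffs, which works with control increments and a more general model.

```python
def receding_horizon_control(y, reference, a, b, horizon, weight):
    """Control move minimizing sum_j (yhat(k+j) - reference)^2 + weight * u^2.

    Assumed plant: y(k+1) = a*y(k) + b*u(k), with the same u applied at
    every step of the horizon (control horizon of one).
    """
    numerator = 0.0
    denominator = weight
    free = y      # free response: prediction with u = 0
    gain = 0.0    # forced response per unit of u
    for _ in range(horizon):
        free = a * free
        gain = a * gain + b
        numerator += gain * (reference - free)
        denominator += gain * gain
    return numerator / denominator


# Closed loop: only the first move is applied, then the problem is solved again.
a, b = 0.9, 0.5          # hypothetical plant parameters
y = 0.0
outputs = []
for _ in range(30):
    u = receding_horizon_control(y, 1.0, a, b, horizon=5, weight=0.01)
    y = a * y + b * u
    outputs.append(y)
```

With bounds on u or y the minimizer is no longer available in closed form, which is the constrained QP case discussed in the text.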
3. LEARNING AND ADAPTATION

Learning and adaptation are fundamental concepts associated with NN and adaptive control that, although related, are not quite the same. It can be considered that with the adaptive mechanism, an adaptive controller learns the process parameters or a set of adequate controller parameters. On the other hand, the learning phase of a NN can be considered as the adaptation of the NN weights to adequate values. There are, however, some differences when considering the way in which the adaptation mechanism works in adaptive control and how the learning mechanism operates when a NN is learning. These differences are illustrated by Fig. 1.

FIG. 1. Adaptation (a) and learning (b) processes.

In adaptive control, the adaptation is performed in a single trajectory. Normally at the beginning of the trajectory, while the controller is not properly tuned, the process trajectory differs substantially from the reference trajectory. Once the parameters are properly tuned, the process follows the desired trajectory with greater accuracy.

Learning is performed by modifying the NN parameters during repeated performance trials of the desired trajectory. It is like practicing the same stroke of, let us say, tennis a number of times until success has been achieved. This idea is illustrated by Fig. 1b, where the different trajectories obtained at different learning stages are shown. It can be seen that process trajectories reproduce the reference trajectory with more accuracy when learning progresses. A practice strategy has been suggested in which, instead of using the reference trajectory in each learning period, a sequence of trajectories is used. The first element of the sequence is a previously learned trajectory and the last element is the desired trajectory. This approach could solve some problems found in NN learning when the desired trajectory is far from any of the previously learned ones.

The training signals used for adaptation and learning are of great importance and some parallelism can be established between NN learning and adaptive control regarding this issue.

Persistent Excitation. The concept of persistent excitation is crucial to adaptive control. It refers to the need for using a signal for identification purposes which is dynamically representative of the entire class of inputs that the process may be subjected to. Let us consider a process with a transfer function characterized by a set of true parameters $\xi$. Consider the set of adjustable parameters $\theta$ and an appropriate identification algorithm. A signal is persistently exciting if $\theta \to \xi$ when the error between process and model tends to zero asymptotically. When using NN for the identification or control of nonlinear processes, a similar type of concept would be of great interest for answering questions such as: Is the chosen training set adequate? What sort of training pattern should be used to train the NN? What we are looking for are signals which are dynamically representative of the entire class of inputs, and techniques to determine this. Unfortunately, there are no known methods to generate a persistently exciting signal for NN training and only good judgment can be used for this purpose at present.
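For linear models the role of persistent excitation can be checked numerically. The sketch below assumes a hypothetical two-parameter model y(k) = theta1*u(k) + theta2*u(k-1), which is not taken from the paper. Least squares can recover both parameters only if the information matrix built from the regressors [u(k), u(k-1)] is nonsingular, so its determinant is used here as a simple test of the input signal.

```python
def information_matrix_determinant(inputs):
    """Determinant of sum(phi * phi^T) with regressor phi = [u(k), u(k-1)]."""
    s11 = s12 = s22 = 0.0
    for k in range(1, len(inputs)):
        current, previous = inputs[k], inputs[k - 1]
        s11 += current * current
        s12 += current * previous
        s22 += previous * previous
    return s11 * s22 - s12 * s12


constant_input = [1.0] * 20                  # only theta1 + theta2 is visible
varied_input = [1.0, 2.0, 0.0, 1.0, 3.0]     # excites both regressor directions

print(information_matrix_determinant(constant_input))  # 0.0
print(information_matrix_determinant(varied_input))    # 59.0
```

A constant input leaves the matrix singular, so infinitely many parameter pairs give zero model error. No comparable test is available for a nonlinear NN, which is the open issue raised above.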
Adaptation Speed. One of the fundamental problems of adaptive control is the adaptation speed of the controller. In self tuning control theory, two different time scales are assumed for process dynamics and for process parameter changes. In practice, although process parameters tend to change more slowly than process variables, the time scales are not so far apart and a quick tuning is required in most cases. In adaptive control there are two factors that determine adaptation speed. The first one is the choice of appropriate adaptation gains or forgetting factors and the second one is the number of parameters being identified or adjusted. A small adaptation gain will result in a slow adaptation speed, while a high adaptation gain tends to produce oscillations and convergence problems. The adaptation speed decreases considerably with the number of parameters chosen.

When NN are used for adaptation, the same factors dominate the learning or adaptation speed. But while in adaptive controllers the number of adapting parameters tends to be kept small, when using NN there is a substantial number of parameters if all the weights are to be adapted. Some mechanisms to obtain faster adaptation speed for NN based adaptive controllers have been proposed in the literature (Tan and De Keyser, 1993; Carelli, Camacho and Patiño, 1993, 1994).

4. NN BASED ADAPTIVE CONTROLLERS CLASSIFICATION

Many controllers have been proposed including NN as a part of them. Most of them are neural versions of classical adaptive controllers (Åström and Wittenmark, 1989). The way the NN is incorporated in the system differs from one to the others. The most frequently used architectures will be classified according to the role played by the network.

4.1. NN as a controller.
Direct inverse control. An adaptive control scheme using direct inverse control is shown in Fig. 2. If the NN is trained to produce the signal $u_{ff}$ it is possible to substitute the feedforward controller with the NN. In this way, the NN produces the inverse dynamics of the plant, while the feedback controller accounts for non-perfect learning and perturbations. Classic adaptive controllers make use of two elements: the adaptive controller and the adaptation law. In the neural context the controller is implemented by a NN and the adaptation law is the rule used to adjust the parameters of the net (usually connection weights). Kawato's proposal for robot control (Kawato, Uno, Isobe and Suzuki, 1987) matches this structure. The NN is trained to make the feedback control signal zero. In the first stages of the training of the NN the system is stable thanks to the feedback controller.

FIG. 2. Controller for a nonlinear plant.

In (Sanner and Slotine, 1992) a Radial Basis Function NN is used to provide the inverse dynamics of a nonlinear plant. The controller incorporates sliding mode control and adaptive control blended through a modulation function. There are many other architectures that make use of the learning capability of NN to identify the inverse dynamics of a plant. Notice that an inverse of the process dynamics must exist for this scheme to work.

Model Reference Adaptive Control. Classic MRAC can be extended to the neural case. In (Narendra and Parthasarathy, 1990) a NN is used to identify the plant while another NN produces the input to the plant. The objective is to track the output of a reference model. In Fig. 3 network $N_i$ has to be previously trained to identify the input-output behavior of the plant. Later the controller's parameters can be adjusted by backpropagating the tracking errors through the plant identifier. A direct adaptation of the controller is not possible since the plant, whose dynamics are unknown, lies between the controller and the tracking error.

FIG. 3. Indirect adaptive control.

In Fig. 3, the TDL blocks represent tapped delay lines whose function is to provide delayed values of the plant's inputs and outputs.
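A tapped delay line of the kind used in Fig. 3 can be sketched as a fixed-length buffer. The class name and interface below are hypothetical. The sketch returns the stored samples newest first, which is one possible convention for building the input vector of an identifier or controller network.

```python
from collections import deque


class TappedDelayLine:
    """Keeps the last `taps` samples of a signal."""

    def __init__(self, taps, initial=0.0):
        # A bounded deque drops the oldest sample automatically when full.
        self._buffer = deque([initial] * taps, maxlen=taps)

    def push(self, value):
        """Store the newest sample and return all stored samples, newest first."""
        self._buffer.appendleft(value)
        return list(self._buffer)


tdl = TappedDelayLine(taps=3)
for sample in (1.0, 2.0, 3.0, 4.0):
    regressor = tdl.push(sample)
print(regressor)  # [4.0, 3.0, 2.0]
```

One such line per signal (plant input and plant output) yields the delayed values that are concatenated and fed to the networks.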
Other models. To overcome the problem of the lack of previous information about the plant, a learning method has been used that enables one to control processes without the identification stage. Reinforcement learning (RL) uses a critic instead of a teacher. The critic gets a measurement of the performance of the system from the environment. The objective of the adaptation system is to improve the reinforcement signal produced by the critic. Examples of this type of architecture are found in (Barto, Sutton and Anderson, 1983) and in (Zomaya, 1994).

FIG. 4. Associative and critic elements in a RL architecture.

The first example deals with balancing a pole on a cart that can move between two stops. The goal of the controller is to move the cart in such a way as to keep the pole vertical. An error is made when the pole falls or the cart hits the track's bounds. To solve this problem two adaptive elements were used (see Fig. 4): 1) The associative search element (ASE) has as input a codification of the state of the plant, and gives as output a control signal depending on that state (i.e. position and velocity of the cart and pole). 2) The adaptive critic element (ACE) predicts reinforcement from the environment that corresponds to the control action generated by the associative element. Both elements need to be adjusted. The associative search element is constantly modifying its weights thanks to a signal generated by the critic element. This means that the critic drives the decision that the ASE has to take by means of an internal reinforcement. The critic learns from trial to trial to predict the future action of the pole in terms of reinforcement. It receives the reinforcement signal supplied from the exterior. The results showed a better performance of the ASE-ACE system than the classical box system.

In some structures, the NN is given the task of generating a small part of the control signal. In these cases the network accounts for structured and non-structured uncertainties of a model. The main part of the command signal is produced by a conventional controller based on the model. Examples can be found in (Iiguni, Sakai and Tokumaru, 1991) and (Zomaya, 1993).
4.2. NN as estimator.

Internal Model Control. The Internal Model Control scheme proposed in (Economou, Morari and Palsson, 1986) uses a system forward model and an inverse model. The system model's output is compared to the plant's output and the difference is fed back to a controller. This structure can incorporate NN for the identification of nonlinear plants (Hunt and Sbarbaro, 1991).

Predictive Control. A predictive controller produces a command signal that minimizes the squared error between the predicted output of the plant and the reference over a certain temporal horizon at every time step k. A prediction can be computed for linear plants by using a Diophantine equation (Clarke, Mohtadi and Tuffs, 1987). To extend the idea to nonlinear plants a predictor has to be developed. In (Takahashi, 1993) a NN is used to produce a prediction of a nonlinear plant's output.

Inferential Control. In some industrial processes control is difficult due to the fact that the plant's output is not measurable at a proper frequency. This is the case of quality measurements in chemical processes. Inferential control uses secondary measurements to estimate the plant's output. The mapping from secondary to primary variables can be nonlinear and difficult to determine, so a NN is likely to produce good results. Inferential nonlinear control was studied by Morari and Fung (1982) and by Parrish and Brosilow (1988). An implementation of NN to inferential control is given in (Luo, Shao and Zhang, 1993).

4.3. NN as adjustment element.

A NN can be used to adjust a classic controller. Most controllers currently in use in industry are PID due to their simplicity and robustness. However, the tuning of this type of controller is often a burdensome task. In (Akhyar and Omatu, 1993) a NN is used to automatically tune a PID (Fig. 5). The NN's outputs are the parameters of the PID. The NN uses a gradient descent algorithm to learn the adequate mapping using the control error.

FIG. 5. The neural net adjusts the PID controller.
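The interface between a tuning element and a PID controller can be sketched as follows. This is only an illustration of the arrangement in Fig. 5: the class name is hypothetical, the gains passed to `step` stand in for the outputs of the tuning network, and the network and its gradient descent rule from Akhyar and Omatu (1993) are not reproduced here.

```python
class TunablePID:
    """Discrete PID whose gains are supplied from outside at every step."""

    def __init__(self, dt):
        self.dt = dt
        self.integral = 0.0
        self.previous_error = 0.0

    def step(self, error, kp, ki, kd):
        """Return the control signal for the current error and gains."""
        self.integral += error * self.dt
        derivative = (error - self.previous_error) / self.dt
        self.previous_error = error
        return kp * error + ki * self.integral + kd * derivative


pid = TunablePID(dt=0.1)
# Hypothetical gains; in the scheme of Fig. 5 they would come from the NN.
u_first = pid.step(1.0, kp=2.0, ki=0.5, kd=0.1)
u_second = pid.step(1.0, kp=2.0, ki=0.5, kd=0.1)
```

Keeping the gains as arguments rather than attributes makes explicit that the adjustment element, not the controller, owns the parameters.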
5. NN ADAPTIVE CONTROL WITH FAST ADAPTATION SPEED

This section describes a structure for the adaptive control of manipulators proposed by Carelli, Camacho and Patiño (1993, 1994) which does not require a long adaptation period (see Fig. 6). Although the control structure has been designed for robot manipulators it can be applied to any nonlinear process that can be reparameterized in the way described. The controller uses a set of fixed feedforward NN which are trained in an off-line manner. This structure allows the adaptation of the controller to deal with dynamic uncertainties, such as link inertias or payloads, minimizing the amount of computation that has to be performed on-line. As the number of parameters to be adapted is small, the adaptation to changes in robot parameters is faster than when using the learning capabilities of the NN to adapt.

FIG. 6. Adaptive controller using fixed feedforward NN.

5.1. NN Inverse Robot Dynamics.

The inverse dynamics of a robot can be expressed as

$$\tau(t) = M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + g(q) \qquad (5.1)$$

The dynamic structure can be expressed (Khosla and Kanade, 1985) as a linear function of a suitably selected set of robot and load parameters:

[...] (5.2)

It has been shown by Funahashi (1989), Cybenko (1989) and Hornik et al. (1989) that two layer NN can approximate any well-behaved nonlinear function to any desired accuracy.
Consider a set $\{\varphi_i\}$ of N neural networks, each representing the inverse robot dynamics for a determined payload condition characterized by a value $\theta_i$ of the parameters. If we take into account the linear parameterization property of the robot model and assume that each element of the NN set represents the inverse robot dynamics for each load condition,

[...] (5.3)

where $\theta_i = [\theta_{i1}, \theta_{i2}, \ldots, \theta_{ir}]^T$, with $i = 1, 2, \ldots, N$ and $z = [q, \dot{q}, \ddot{q}]^T$.

Now consider a particular robot payload condition characterized by a value of the parameter $\theta$ and assume that it can be expressed as a linear combination of the values $\theta_i$,

$$\theta = \alpha_1\theta_1 + \alpha_2\theta_2 + \cdots + \alpha_N\theta_N \qquad (5.4)$$

The inverse robot dynamics for a payload condition characterized by $\theta$ can be expressed as,

[...] (5.5)

Thus, the inverse robot dynamics can be approximated for any payload condition by,

$$\tau \approx \alpha_1\varphi_1(z) + \alpha_2\varphi_2(z) + \cdots + \alpha_N\varphi_N(z) \qquad (5.6)$$

Equation (5.4) can be written as,

$$\theta = \Theta\,\alpha, \qquad \Theta = [\theta_1, \theta_2, \ldots, \theta_N], \quad \alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T \qquad (5.7)$$

If $N = n$ and $\Theta$ is nonsingular, there is a unique solution given by:

$$\alpha = \Theta^{-1}\theta$$

If the columns of $\Theta$ do not form a basis, because N [...]

[...] $u_{ad}(t) + m(t)\,u_{sl}(t)$ (6.22)

where $u_{pd}$ is a pd-like control term, $u_{ad}$ is an adaptive component provided by the network to account for the nonlinear functions and $u_{sl}(t)$ is the contribution of the sliding mode controller. Notice that $m(t)$ is a modulation function that mixes adaptive and sliding control modes depending upon the situation of the state of the plant in the set A. When the operating point is next to the border, the sliding component is preferred, so that the state is driven back to the set A.

The resulting system (see Fig. 7) adjusts the network's weights while controlling the plant. No prior learning is needed. It is possible to prove that all states in the system remain bounded and the tracking errors asymptotically converge to a neighborhood of zero. See (Sanner and Slotine, 1992) for a complete proof.

7. IMPLEMENTATIONS

Most of the control applications of NN are implemented by programs in digital computers which simulate the behavior of the neural nets.
One of the potential advantages of NN, the inherent parallelism, is therefore lost. Hardware implementations are consequently convenient in order to use NN to their full potential. In order to use NN for adaptive control, implementations must perform some kind of parameter adjustment. This is more easily achieved when using some types of circuits, but at the cost of bigger silicon surfaces. Implementations of NN can be classified in the following categories:

7.1. Neural Processors.

They exhibit a flexibility which is an advantage over other types. Feild and Navlakha (1988) proposed an architecture consisting of two INMOS boards having five transputers connected to an IBM/XT computer. Beynon (1988) developed a network of transputers to simulate the BP training algorithm. In the field of pattern recognition we find the Graph Search Machine (Glinski et al., 1987).

Digital implementations are more robust than analog ones against dispersion in the characteristics of the components. Rasure et al. designed a feedforward net that was able to classify hand-written digits. A 3-D structure of NETSIM boards was proposed in (Garth, 1987). Each board contains communication buses, control circuits and neural coprocessors. To reduce the surface needed in digital circuits, synchronous stochastic implementations can be used (Janer, 1994; Janer, Quero and Franquelo, 1993).

7.3. Specific Analog Circuits.

They use less surface than the other types of implementation but need a more careful design. At Caltech a group of researchers directed by Mead has developed a number of implementations that use architectures based on biological models (Mead and Mahowald, 1988).

7.4. Hybrid Implementation.

The use of hybrid (digital-analog) circuits aims at obtaining a mix of the good traits of both types of implementations while avoiding the bad ones. Murray's group has published many papers dealing with this type of implementation (Murray et al., 1987).
8. CONCLUSIONS

Neural networks have the ability of learning a nonlinear model without a priori knowledge of its structure and are adequate for working in real time because of their high parallelism. The use of NN seems therefore to be a way of implementing adaptive controllers for processes where standard adaptive control is not adequate, that is, nonlinear processes with not necessarily known model structure and/or changing dynamics.

Although the potential of NN for adaptive control has been demonstrated in the literature with different processes, there are still a number of open research issues in the field such as stability, characterization of persistently exciting patterns, adaptation speed and hardware implementation of NN with programmable weights.

ACKNOWLEDGEMENT

The authors would like to acknowledge CICYT for funding the work under grant TAP93-0804 and project CYTED-D.

REFERENCES

Akhyar, S. and S. Omatu (1993). Neuromorphic self-tuning PID controller. In IEEE International Joint Conference on Neural Networks, IJCNN93, pp. 552-557.
Aracil, J., A. Ollero and A. Garcia-Cerezo (1989). Stability indices for the global analysis of expert control systems. IEEE Trans. on System, Man and Cybernetics, 19, 998-1007.
Astrom, K. J. and B. Wittenmark (1989). Adaptive Control. Addison-Wesley, New York.
Barto, A. G., R. S. Sutton and G. W. Anderson (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. on System, Man and Cybernetics, 13, 834-846.
Beynon, T. (1988). A parallel implementation of the backpropagation algorithm on a neural network of transputers. IEEE International Conference on Neural Networks.
Camacho, E. F. (1993). Constrained generalized predictive control. IEEE Trans. on Aut. Control, 30, 327-332.
Camacho, E. F. and J. M. Quero (1991). Precomputation of generalized predictive controllers. IEEE Trans. on Aut. Control, 36, 852-859.
Carelli, R., E. F. Camacho and D. Patiño (1993). Neural network based adaptive control for robots. Proc. of the 2nd European Control Conference, Groningen, 475-480.
Carelli, R., E. F. Camacho and D. Patiño (1994). A neural network based adaptive controller for robots. IEEE Trans. on SMC. To appear.
Clarke, D. W., C. Mohtadi and P. S. Tuffs (1987a). Generalized predictive control. Part I. The basic algorithm. Automatica, 23, 137-148.
Clarke, D. W., C. Mohtadi and P. S. Tuffs (1987b). Generalized predictive control. Part II. Extensions and Interpretations. Automatica, 23, 149-160.
Clarke, D. W. and C. Mohtadi (1989). Properties of generalized predictive control. Automatica, 25, 859-875.
Craig, J. J. (1988). Adaptive Control of Mechanical Manipulators. Addison-Wesley Publishing Co.
Cutler, C. R. and B. L. Ramaker (1980). Dynamic matrix control. A computer control algorithm. Proc. JACC80.
Cybenko, G. (1989). Approximation by superpositions of sigmoidal function. Math. Control Signal Systems, 2, 303-314.
Economou, C. G., M. Morari and B. O. Palsson (1986). Internal model control. 5. Extension to nonlinear systems. Ind. Eng. Chem. Process Des. Dev., 25, 403-411.
Feild, W. B. and J. K. Navlakha (1988). Transputer implementation of Hopfield NN. IEEE Int. Conf. on NN, ICNN88.
Funahashi, K. I. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183-192.
Garcia, C. E., D. M. Prett and M. Morari (1989). Model predictive control: theory and practice - a survey. Automatica, 25, 335-348.
Garth, S. (1987). A chipset for high speed simulation of neural network systems. IEEE Int. Conf. on NN, ICNN87, 443-452.
Glinski, S., T. Lalumia, D. Cassiday, T. Koh, C. Gerveshi, G. Wilson and J. Kumar (1987). The graph search machine: A VLSI architecture for connected word speech recognition and other applications. Proc. IEEE, 75, 1172-1184.
Gómez-Ortega, J. and E. F. Camacho (1994). Neural network GPC for mobile robots path tracking. EURISCON 94. To appear.
Gómez-Ortega, J., E. F. Camacho and J. Quero (1994). Neural network local navigation of mobile robots in a moving obstacles environment. Preprints of Intelligent Components and Instruments for Control Applications, SICICA94, 263-268.
Guez, A. and I. Bar-Kana (1990). Two-degree-of-freedom robot neurocontroller. Proc. 29th Conference on Decision and Control, pp. 3260-3264.
Hopfield, J. J. (1982). Neural Networks and physical systems with emergent collective computational abilities. Proc. of the National Academy of Sciences, 79, 2554-2558.
Hornik, K., M. Stinchcombe and H. White (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359-366.
Hunt, K. J. and D. Sbarbaro (1991). Neural networks for non-linear internal model control. Proc. IEE Pt. D, 138, 431-438.
Kelly, R., R. Carelli and R. Ortega (1989). Adaptive motion control design of robot manipulators: An input-output approach. Int. J. Control, 50, 2563-2581.
Iiguni, Y., H. Sakai and H. Tokumaru (1991). A nonlinear regulator design in the presence of system uncertainties using multilayered neural networks. IEEE Trans. on NN, 2, 410-417.
Janer, C. L. (1994). Parallel stochastic architectures for the microelectronic implementation of neural networks (in Spanish). Ph.D. Thesis, University of Seville, Spain.
Janer, C. L., J. Quero and L. G. Franquelo (1993). Fully parallel summation in a new stochastic neural network architecture. IEEE Int. Conf. on Neural Networks, ICNN93, 1498-1503.
Kawato, M., Y. Uno, M. Isobe and R. Suzuki (1987). A hierarchical model for voluntary movement and its application to robotics. Proc. IEEE Control Systems Magazine, 8, 8-17.
Kraft, L. G. and D. P. Campagna (1990). A comparison between CMAC neural network and two traditional adaptive control systems. IEEE Control Systems Magazine, pp. 36-43.
Lightbody, G. and G. Irwin (1992). Neural networks for nonlinear adaptive control. IFAC Workshop on Algorithms and Architectures for Real Time Control, pp. 1-13.
Luo, R. F., H. H. Shao and Z. J. Zhang (1993). Fuzzy neural nets based inferential control for a high purity distillation column. Automatica. To appear.
Mead, C. A. and M. A. Mahowald (1988). A silicon model of early visual processing. Neural Networks, 1, 91-97.
McCulloch, W. S. and W. Pitts (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 9, 127-147.
Minsky, M. L. and S. A. Papert (1969). Perceptrons. The MIT Press, Cambridge, MA.
Morari, A. J. and K. W. Fung (1982). Nonlinear inferential control. Comput. Chem. Eng., 6, 271-281.
Murray, A. F., D. Del Corso and L. Tarassenko (1991). Pulse-stream VLSI neural networks mixing analog and digital techniques. IEEE Transactions on Neural Networks, 2.
Narendra, K. S. and A. M. Annaswamy (1989). Stable Adaptive Systems. Prentice-Hall, Englewood Cliffs, NJ.
Narendra, K. S. and K. Parthasarathy (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1, 4-27.
Nguyen, D. H. and B. Widrow (1990). Neural networks for self-learning control systems. IEEE Contr. Syst. Mag., 10, 18-23.
Ollero, A., J. Aracil, A. Garcia-Cerezo and A. Barreiro (1993). Stability of Fuzzy Control Systems. In A. Driankov and H. Hellendoorn (Eds), Introduction to Fuzzy Control. Springer-Verlag, Berlin.
Ortega, R. and M. Spong (1988). Adaptive Motion Control of Rigid Robot: A Tutorial. Automatica, 25, 877-888.
Parrish, J. R. and C. B. Brosilow (1988). Nonlinear inferential control. AIChE J., 34, 633-644.
Psaltis, D., A. Sideris and A. A. Yamamura (1988). A multilayered neural network controller. IEEE Control System Magazine, 8, 17-21.
Quero, J. M. and E. F. Camacho (1990). Neural generalized predictive self-tuning controllers. Proc. of the IEEE International Conference on System Engineering, 160-163.
Quero, J. M., E. F. Camacho and L. G. Franquelo (1993). Neural network for constrained predictive control. Transactions on Circuits and Systems, 40, 621-626.
Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
Rouhani, R. and R. K. Mehra (1982). Model algorithmic control: Basic theoretic perspectives. Automatica, 10, 401-414.
Saerens, M., A. Soquet, J.-M. Renders and H. Bersini (1991). Some preliminary comparisons between a neural adaptive controller and a model reference controller. In G. A. Bekey and K. Y. Goldberg (Eds.), NN in Robotics, 131-146.
Sanner, R. M. and J.-J. E. Slotine (1992). Gaussian networks for direct adaptive control. IEEE Trans. on Neural Networks, 3, 833-863.
Slotine, J.-J. E. and W. Li (1990). Adaptive manipulator control: A case study. IEEE Trans. on Automatic Control, AC-33.
Takahashi, Y. (1993). Adaptive predictive control of nonlinear time-varying systems using neural networks. Int. Joint Conference on NN, IJCNN93, 1464-1468.
Tan, Y. and R. De Keyser (1993). Neural network based adaptive predictive control. CEG/ESPRIT/CIME/CIDIG Conf. Advances in MBPC, 77-88.
Widrow, B. and M.E. Hoff (1960). Adaptive switching circuits. 1960 IRE WESCON Convention Record, New York: IRE, pp. 96-104.
Yabuta, T. and T. Yamada (1990). Possibility of neural networks controller for robot manipulators. Proc. Int. Conf. Robotics Automat., 1686-1691.
Zomaya, A.Y. (1994). Reinforcement learning for the adaptive control of nonlinear systems. IEEE Trans. Syst., Man, Cybern., 24, 357-363.
Zomaya, A.Y. and T.M. Nabhan (1993). Centralized and decentralized neuro-adaptive robot controllers. Neural Networks, 6, 223-244.
