Parallel and Distributed Processing Applications in Power System Simulation and Control
Djalma M. Falcão
COPPE/Universidade Federal do Rio de Janeiro
Abstract
Recent advances in computer technology will certainly have a great impact on the methodologies used in power system expansion and operational planning as well as in real-time control. Parallel and distributed processing are among the new technologies that present great potential for application in these areas. Parallel computers use multiple functional or processing units to speed up computation, while distributed processing computer systems are collections of computers joined together by high-speed communication networks, having many objectives and advantages. This paper discusses the potential application areas of parallel and distributed processing in power systems and presents an overview of the current research effort in this subject. The paper also presents comments regarding the research work in this area presently in development in Brazilian institutions and the parallel computation facilities available in Brazil.
1 Introduction
Power system planning and operation in the 1990s face new challenges imposed by a new economic context (deregulation, privatization, etc.), the practical utilization of fast-acting electronic control devices (FACTS and HVDC controls), and the stretching of electric performance by operation close to system limits. Computation tools to simulate and control power systems with these characteristics must be powerful enough to deal with increasingly complex models and solution possibilities overlooked in the past. Moreover, the reduced number of power engineers graduated in the past decade indicates a likely shortage of power system experts in the near future. To overcome this personnel deficiency, the few available experts will need efficient computation tools to properly perform their duties.
Power system computation methods for simulation and control have undergone a great deal of development in the last three decades or so. Techniques for handling large sparse matrices, clever adaptations like the fast decoupled power flow method, etc., made possible the analysis of large-scale power system models. A widely accepted guess is that the introduction of new mathematical or programming methodologies will produce only marginal increases in the performance of these methods [1]. Moreover, single-processor capacity, although dramatically increased in recent years, seems to be approaching physical limits [2, 3, 4]. Therefore,
The author can be reached at COPPE/UFRJ, Caixa Postal 68504, 21945-970 Rio de Janeiro RJ, Brazil. Phone: (021) 260 5010. Fax: (021) 290 6626. E-mail: [email protected].
substantial improvements are more likely to come from the adaptation of the already well-developed power system computation methodologies to the new hardware and software developments being offered by the computer industry. Parallel and distributed processing appear to be among the most promising of these new developments [5]-[7]. Parallel processing consists in the use of multiple hardware components to exploit concurrency in the computation job. The main advantage of parallel processing in power system applications is the speedup of computations, making viable the solution of problems intractable on conventional computers. Distributed computer systems are collections of independent computers joined together by high-speed communication networks, having many objectives and advantages. Power system applications can benefit from this form of decentralized computer architecture due to their geographically distributed nature and, as in many other application areas, from its flexibility, scalability, cost advantage, etc.
This paper introduces and discusses some ideas for the use of parallel and distributed processing in power system simulation and control. Initially, a general description of parallel and distributed processing schemes is presented, with their advantages and disadvantages, as well as promising power system application areas. Next, an overview of the research work currently in development in these topics is presented, together with a perspective view of future developments in this area. Finally, the paper presents comments regarding the research work presently in development in Brazilian institutions and the parallel computation facilities available in Brazil.
3 Parallel and Distributed Processing
Fig. 1: Trends in computers and microprocessors performance (relative performance vs. year, 1965-1990, for microprocessors and minicomputers)
power system simulation. The preventive control strategy, by and large adopted in modern Energy Management Systems (EMS), tries to avoid catastrophic occurrences by a periodic simulation of credible contingencies and system rescheduling and reconfiguration. Obviously, in this case a fast solution time is essential to allow corrective measures to be taken in due time. The state of the art in real-time control hardware and software is so far only able to deal with steady-state models. Important dynamic phenomena, like transient, small-signal, and voltage stability, are almost completely ignored due to the lack of powerful enough computation resources. Probabilistic analysis is also usually left out of the real-time control strategy for the same reason.
Parallel and distributed processing are similar but rely on different information processing concepts. Parallel computers are usually single-frame units in which the idea of concurrent processing is exploited mainly to speed up computations. Distributed computer systems are collections of computers joined together with multiple objectives depending on the envisaged applications. These data processing approaches have achieved practical implementation mainly due to the relatively large increase in microprocessor performance, as can be seen in Figure 1 [4] and Table 1.
3.1 Parallel Processing
Parallel processing can be broadly defined as the utilization of multiple hardware components to perform a computation job. Several levels of parallelism can be found in present-day computers, but the ones most important for the applications discussed in this paper are multiprocessors and multicomputers [3].

A multiprocessor is a computer composed of several processors sharing a common memory. A multicomputer is a computer composed of several processor-memory pairs which communicate between themselves by message passing. Figure 2
2 Power System Simulation and Control
Simulation of power system behavior in the presence of several types of predictable disturbances is a fundamental tool in expansion and operational planning and real-time control of these systems. Owing to the complexity of the phenomena involved in power system operation, simulation studies are usually performed within different time frames: starting with the very fast electromagnetic transients caused by switching operations and lightning, passing through the slower electromechanical oscillations resulting from power unbalance on rotating machines, and eventually reaching the steady-state analysis corresponding to different loading conditions during the daily, weekly or yearly load cycle. Different models are used to represent power system components in these simulations, but some basic characteristics appear in all of them. For instance, the dynamic interaction between apparatus (generators, compensators, etc.) occurs via the transmission network, whose representation introduces some similar structural properties into the models [8, 9, 10]. These models usually consist of sets of non-linear ordinary differential equations, each presenting direct interaction with only a few other equations, and/or sets of algebraic equations whose linearized approximations exhibit similar sparsity patterns.
Power system simulation problems, when compared to simulation problems found in other fields of engineering, may be classified as medium-size problems. However, simulation cases are seldom performed only once. In general, practical expansion and operational planning applications require hundreds or thousands of simulation cases. In some conjectured probabilistic assessment studies, this figure may reach several thousands. Another stringent requirement of the power system simulation problem appears in the case of real-time simulators: unfolding the power system's response at the same rate the physical system would do. These requirements make power system simulation a very time-consuming and/or numerically intensive computer application. To circumvent the lack of enough computer power to satisfy these high requirements, the usual practice so far has been to sacrifice accuracy by the introduction of simplifying assumptions and approximated models. This procedure sometimes leads to meaningless results, very often misunderstood by engineers not well acquainted with the limitations of the models used. Dedicated analog or hybrid devices (TNAs and HVDC simulators) have been used for the simulation of electromagnetic transients, but they are expensive and less flexible than fully digital simulators.
The real-time control environment, apart from specific computation tools, also relies extensively on
Table 1: Microprocessors used in microcomputers, workstations and parallel computers

Microprocessor   Manufacturer    Type  Clock (MHz)  Operating systems       Peak performance (MIPS/MFLOPS)
Alpha 21064      DEC             RISC  200          WNT, OSF/1, VMS         -
MIPS R4400SC     MIPS            RISC  150          IRIX, SVR4.5            -
PA7100           HP              RISC  100          HP-UX                   -
Pentium          Intel           CISC  66           WNT, OS/2, Unix, etc.   -
PowerPC 601      IBM/Motorola    RISC  80           OS/2, AIX, MacOS        -
SuperSparc       SUN/Texas       RISC  60           Unix SVR4, SunOS        -
i860             Intel           RISC  66           WNT, OS/2, Unix, etc.   -
Transputer       Inmos           -     -            -                       -
presents a schematic view of these two forms of parallel computer architecture. Other interesting forms of parallelism are the pipelined processors, used in vector supercomputers, specially tailored for concurrent operation of loops present in vector and matrix operations, and the superscalar processors that use special compilers to optimize the execution of conventional programs by executing concurrently some instructions in a single instruction stream.
Multicomputers, and less extensively multiprocessors, can be built using commodity hardware, in contrast to, for instance, vector supercomputers that require specially designed processors and memories. In particular, the microprocessors used in microcomputers and workstations are becoming standard components of commercially available parallel computers, as can be seen in Table 2. For this reason, such machines are able to present a price/performance advantage over supercomputers based on one or a few customized processors. This fact is driving the high-performance computing industry towards massive parallelism, i.e., computers with hundreds or thousands of processors. Scalability is another interesting feature of parallel computers. Table 2 presents a sample of presently commercially available parallel computers.
Programs developed for conventional computers usually require relatively extensive modifications to be used on multiprocessors and multicomputers. Moreover, the best performance of these machines is usually achieved only with specially developed algorithms and programs. Automatic program parallelization tools are rare in the massively parallel environment, and it is doubtful whether they will be commonly available in the foreseeable future. Fortunately, a strong research effort in universities and industry around the world in the last years has made available a large number of algorithms suitable for parallel implementation [6]. Moreover, parallel computer manufacturers are at present offering development tools much more efficient than they were a few years ago. On the other hand, it is usually easier to convert programs to run on supercomputers and superscalar processors. Automatic vectorization compilers are often available for supercomputers, although the best performance is still achieved only by careful hand customization.
The gain obtained in moving an application to a parallel computer has usually been measured in terms of speedup and efficiency of the parallel implementation when compared to the best sequential version. Speedup is usually defined as the ratio between the execution time of the best sequential code on one processor of the parallel machine and the time to run the parallel code on P processors. Efficiency of the parallelization process is defined as the ratio between the speedup achieved on P processors and P. In the early stages of application development for parallel computers, these two indexes were almost exclusively the determinants of the quality of parallel algorithms. As more parallel machines became commercially available, and practical applications began to be actually implemented, other aspects of the problem started to become important, for instance, the cost/performance ratio (Mflops/$; Mflops = 10^6 floating point operations) attainable in real-life applications. In other cases, although speedup and efficiency are not so high, the implementation on a parallel machine is the only way to achieve the required speed in the computations.
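As an illustration, these two indexes reduce to a pair of one-line formulas. The sketch below is a toy calculation (not from the paper); it also reproduces the efficiency implied by a speedup of 5.61 on 32 processors, a result quoted later in this paper.

```python
# Toy illustration (not from the paper): computing speedup and
# efficiency from measured execution times.
def speedup(t_sequential, t_parallel):
    # Ratio of best sequential time to parallel time on P processors.
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, p):
    # Speedup divided by the number of processors P.
    return speedup(t_sequential, t_parallel) / p

# Example: a speedup of 5.61 on 32 processors implies
# an efficiency of about 17.5%.
s = 5.61
print(round(s / 32 * 100, 1))  # 17.5
```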
3.2 Distributed Processing
A distributed computing system is a collection of autonomous computers, interconnected by a communication network, that work together to satisfy the information processing needs of a company, department, laboratory, etc. [6, 7]. Figure 3 presents a generic view of a distributed computing system. This type of decentralized computing approach is feasible today due to the availability of very powerful desktop computers (microcomputers and workstations) which can be interconnected by very fast networks. Distributed systems are replacing mainframe-based computer systems in both commercial and scientific applications, generating a phenomenon known as downsizing of applications.
Distributed computing systems may include a fewcomputers located in a single room, connected by a
Fig. 2: Parallel computer architectures: multiprocessor (shared memory) and multicomputer (distributed memory)
local area network using coaxial cable as interconnection media, or a large number of computers scattered hundreds of kilometers apart, interconnected by a wide area network based on telephone lines, microwave channels, optical fiber cables, etc. The computers may be of different types and capacities. No single operating system needs to be shared between processors.
Some of the advantages of distributed computing systems over conventional ones are [7]:
• Cost: the average cost per MIPS (million instructions per second) on a mainframe is almost 100 times that on a workstation.
• Scalability: distributed systems can be expanded incrementally.
client-server system, which allows a user to store, access and manipulate data transparently across many computers. This type of system will eventually evolve into a truly distributed operating system, so that the entire network appears to the user as one large computer system.
Fig. 3: Distributed processing (computers interconnected by a high-speed data communication network)
In the last few years, workstation clusters have been gaining acceptance as a platform to implement parallel computing in distributed systems. These are groups of workstations connected by a local area network which can exchange information by message passing. Parallel execution of programs can be achieved using client-server programming languages and tools (Remote Procedure Call (RPC), for instance) or specially developed tools like PVM (Parallel Virtual Machine). The communication speed between workstations is usually low, which makes these systems suitable only for applications in which computation on the processors is much more intense than communication between processors. It is claimed that these systems may be able to use the idle time of a cluster of workstations to perform highly intensive computer jobs in the background.
• Flexibility and Configurability: distributed systems have improved performance and reliability due to redundant data and processing.
• Exploitation of special hardware: special graphic devices, highly efficient number crunching modules (vector or parallel units), etc., can be added easily to the system.
On the other hand, distributed processing is a new and uncharted field. The lack of standards, security and integrity control difficulties, the availability of too many options, etc., raise several technical and management concerns.
Distributed applications can be developed according to several different paradigms [7]. In the simplest one, a terminal emulator makes a computer look like a terminal connected to another computer. A file transfer package allows files to be transferred between different computers. A more elaborate model is the client-server system, in which remotely located programs can exchange information in real time. A distributed data and transaction management system is a sophisticated version of the
Table 2: Commercially available parallel computers.

Name      Manufacturer       Classification  Processor    Topology          No. of Proc.  Peak Performance (GFLOPS)
Paragon   Intel              MIMD            i860 XP      Mesh 2d           8-480         36
nCube 2   nCube              MIMD            Custom       Hypercube         2-1024        2.5
CM-5      Thinking Machines  MIMD            Sun Sparc    Fat Tree          1-16000       1000
SP1       IBM                MIMD            PowerRisc    -                 1-64          8
T3D       Cray               MIMD/SIMD       Alpha 21064  Torus 3d          32-2048       307
Exemplar  Convex             MIMD SMM        PA7100       Tightly Coupled   1-128         25
                                                          Crossbar
4 The Impact of Parallel and Distributed Processing in Power System Applications
From the point of view of potential for parallel processing, power system applications can be classified as:

• Obviously Parallelizable: applications that consist in the solution of almost independent and computationally intensive numerical problems, such as static contingency analysis, Monte Carlo simulations, eigenvalue analysis, security-constrained optimal power flow, etc.

• Not Obviously Parallelizable: applications that require ingenious problem decomposition in order to achieve a reasonable degree of concurrency in the solution algorithms. Most of the applications in this category require the parallel solution of large sets of sparse linear algebraic equations originated from different power network modeling approaches. Examples of this class of applications are power flow computations, dynamic simulation for transient and long-term stability analysis, etc.

Some applications, like electromagnetic transients computation and state estimation, although conceptually belonging to the second category, may be included in the first one due to a nice structure of the equation set to be solved, as will be discussed later in this paper. It should be noted that the border between the two categories above is nebulous and likely to change with advancements in parallel processing methods and architectures.
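To make the obviously parallelizable category concrete, the sketch below farms independent contingency cases out to a worker pool. The power flow itself is replaced by a placeholder (the function `solve_contingency` and its severity index are hypothetical), so only the task-farming pattern is illustrative.

```python
# Hypothetical sketch of an "obviously parallelizable" workload:
# each contingency case is an independent job farmed out to workers.
from concurrent.futures import ThreadPoolExecutor

def solve_contingency(case_id):
    # Placeholder for a full power flow with element `case_id` removed;
    # returns (case_id, worst post-contingency voltage) -- fabricated here.
    return case_id, 1.0 - 0.001 * case_id

contingencies = range(100)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(solve_contingency, contingencies))

# Rank cases by severity (lowest voltage first).
worst = sorted(results, key=lambda r: r[1])[:5]
print(worst[0][0])  # 99
```

Each case touches no shared state, so the speedup is limited essentially by the number of workers and the cost of collecting results.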
The brief history of parallel processing application to power system problems registers an apparent paradox: although several important obviously parallelizable applications are available, most of the research effort has been concentrated on applications in the not obviously parallelizable category. An explanation for this phenomenon may be found in the fact that most of the research in this field has been conducted in universities, where parallel machines have been available for quite a long time. Applications with no obvious parallelism present more interesting challenges from both the theoretical and algorithm design points of view and, therefore, are more attractive to academic researchers. More recently, this trend has been changing, even among academics, as the obviously parallelizable applications have been shown to present challenges as big as those of the other category, although seldom at the numerical algorithmic level.
Major impacts of parallel processing in practical terms are expected to occur in areas in which conventional computer methods and architectures have so far failed to produce adequate results, or in areas that are likely to increase in complexity due to the need of modeling newly introduced equipment. Some candidates are:
• Real-Time Control: dynamic security assessment and correction, voltage stability assessment, security-constrained optimal power flow, etc. [11, 12].
• Real-Time Simulators: electromechanical and electromagnetic transient simulators to be used in design and testing of new equipment, control and protection schemes, operator training, etc. [13]-[15].
• Probabilistic Assessment: composite reliability and other probabilistic techniques based on Monte Carlo and enumeration methods using realistic static and dynamic models [16]-[18].
Another field that can benefit from parallel processing is the automation of power engineering analysis and synthesis studies required both in expansion and operational planning. These studies require long and tedious cycles of case preparation, runs, and analysis of load flow, transient stability, short-circuit, etc. As the availability of qualified engineers to perform these studies is diminishing, a
Fig. 4: Dynamic Simulation Model Decomposition
    ẋ = Ax + Bu        (3)
    I(E, V) = YV       (4)
    u = h(E, V)        (5)
5.1 Dynamic Simulation for Transient Stability Analysis
The mathematical model usually adopted for transient stability analysis consists of a set of ordinary non-linear differential equations, associated with the synchronous machine rotors and their controllers, and a set of non-linear algebraic equations associated with the transmission network, synchronous machine stators, and loads [9]. These equations can be expressed as:

    ẋ = f(x, z)        (1)
    0 = g(x, z)        (2)
where f and g are non-linear vector functions, x is the vector of state variables, and z is the vector of algebraic equation variables.
In the model defined in (1) and (2), the differential equations representing one machine interact with the equations representing other machines only via the network equation variables. From a structural point of view, this model can be visualized as shown in Figure 4: clusters of generators are connected by local transmission and subtransmission networks and interconnected among themselves and to load centers by tie-lines.
In the sequential computing context, several solution schemes have been used to solve the dynamic simulation problem. The main differences between these schemes are in the numerical integration approach (implicit or explicit) and the strategy to solve the differential and algebraic sets of equations (simultaneous or alternating). Implicit integration methods, particularly the trapezoidal rule, have been mostly adopted for this application. The most used schemes are [9]:
1) Alternating Implicit Scheme (AIS): this solution scheme is better understood if (1) and (2) are rewritten as:
• Corporate Integrated Information System: integration of real-time operations with planning functions and other corporate functions, such as accounting, customer services, and management information systems.
5 Power System Applications
This section introduces a review of parallel and distributed processing techniques applied to power system problems. The review contains mainly the work developed by the author and his associates. Research work developed by other groups is also mentioned and reviewed to some extent.
• Distribution Automation: distribution automation projects have been proposed based on the availability of processing power distributed alongside the power distribution network, with the objective of supervised switching operations, voltage/var control, load management, customer interface automation functions, etc. [5, 21, 22].
New areas which may benefit from distributed processing are [23]:
• Distributed EMS: integration of the traditional supervision and control system of the main transmission network with the distribution automation function and down to the customer level.
result of both the utilities' personnel policies and the reduced interest of electrical engineering students in the power engineering area, the development of automated tools to help engineers speed up power engineering studies is becoming an important requirement. This type of application exhibits a large amount of obvious parallel computations (each power flow or transient stability case is almost completely independent from each other) but requires a central coordination tool, most likely based on intelligent processing techniques like expert systems, neural networks, fuzzy logic, etc.
Two distinct reasons motivate the utilization of distributed processing in power system applications: the geographically distributed nature of the power system, and the recognition by the electrical energy industry of the cost and operational advantages of using networks of microcomputers and/or workstations, based on the open system concept, instead of the expensive and less flexible proprietary mainframe-based systems.
Two applications are already established as viable and are being implemented in practice:
• New Architectures for EMS: the traditional dual and similar proprietary architectures of control centers, using mainframe or supermini computers, are being replaced by more versatile and cost-effective networks of high-performance workstations [5, 19, 20].
where F is a vector function associated with the difference algebraic equations, G is a vector function associated with the algebraic equations, and V' is a vector of nodal voltages expanded in its real and imaginary components, are lumped together in an enlarged set of equations which are then solved simultaneously. The Newton method is the usual choice to solve this set of equations. In the k-th iteration of the Newton algorithm, the following set of linear equations has to be solved:

    [ F(x, V') ]^k       [ Q  R ]^k [ Δx  ]^(k+1)
    [ G(x, V') ]    = -  [ S  Y ]   [ ΔV' ]            (8)
where Q, R, S, and Y are Jacobian submatrices. The Jacobian matrix in (8) can be ordered in a Block Bordered Diagonal Form (BBDF) [25, 26] and the system of equations solved by Block-Gaussian elimination. The elements of the Jacobian matrix may be kept constant for several iterations and/or integration steps, in a process known as the Very Dishonest Newton Method (VDHN).
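The VDHN idea of reusing a stale Jacobian can be illustrated on a scalar equation. The sketch below is an assumed toy, not the paper's implementation: the derivative is refreshed only every few iterations, trading slower convergence per iteration for cheaper iterations.

```python
# Toy sketch of the Very Dishonest Newton idea on the scalar equation
# f(x) = x**2 - 2 = 0: the "Jacobian" (here f'(x)) is refreshed only
# every `refresh` iterations instead of at every step.
def vdhn_sqrt2(x0=1.0, refresh=5, tol=1e-10, max_iter=100):
    x = x0
    fprime = 2.0 * x          # initial Jacobian
    for k in range(max_iter):
        fx = x * x - 2.0
        if abs(fx) < tol:
            return x, k
        if k % refresh == 0:  # dishonest: reuse stale derivative otherwise
            fprime = 2.0 * x
        x -= fx / fprime
    return x, max_iter

root, iters = vdhn_sqrt2()
print(abs(root * root - 2.0) < 1e-9)  # True
```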
The difficulties in the parallelization of the dynamic simulation problem with the AIS are concentrated in the network solution. The differential equations associated with the synchronous machines and their controllers are naturally decoupled and easily parallelizable. On the other hand, the network equations constitute a tightly coupled problem
where A is a square, real-valued, block diagonal matrix in which each block is associated with one machine; B is a rectangular, real-valued, blocked matrix in which each block is associated with one machine; u is a vector of interface variables; I is a vector of injected nodal currents; Y is a square, complex-valued, nodal admittance matrix; V is a vector of complex nodal voltages representing the steady-state network behavior; E is a subvector of x required to calculate current injections at generation nodes; and h is a non-linear vector function.
The alternating implicit scheme consists in transforming the differential equations into difference equations, by the application of an implicit integration method, and solving these equations iteratively and alternately with the algebraic equations. A modified version of this scheme, called the Interlaced Alternating Implicit scheme [9, 31, 32], is obtained by relaxing the convergence requirements on the network equation solution; the convergence test is performed on the state variables. This method usually presents better results than the conventional approach.
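The alternating scheme can be sketched on a toy differential-algebraic pair (an assumed example, not the paper's code): the trapezoidal difference equation for the state is iterated alternately with the algebraic constraint, with the convergence test on the state variable as in the interlaced variant.

```python
# Minimal sketch (toy system, not from the paper) of the alternating
# implicit scheme: trapezoidal rule for x' = f(x, z) = z - x, solved
# alternately with the algebraic constraint 0 = g(x, z) = z - 0.5*x.
# Eliminating z gives x' = -0.5*x, so x(t) = exp(-0.5*t).
import math

def simulate(h=0.01, t_end=1.0):
    x, z = 1.0, 0.5
    steps = round(t_end / h)
    for _ in range(steps):
        f_old = z - x                  # f evaluated at the previous step
        x_new = x
        for _ in range(50):            # alternating iterations
            z = 0.5 * x_new            # algebraic solve for z
            # trapezoidal difference equation for the state
            x_next = x + 0.5 * h * (f_old + (z - x_new))
            if abs(x_next - x_new) < 1e-12:  # converge on state variable
                x_new = x_next
                break
            x_new = x_next
        x = x_new
    return x

x1 = simulate()
print(abs(x1 - math.exp(-0.5)) < 1e-4)  # True
```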
2) Simultaneous Implicit Scheme (SIS): in this approach, the network algebraic equations and the algebrized differential equations
5.1.1 Spatial Parallelization
Methods in this category exploit the structural properties of the network or Jacobian equations to be solved in each integration step of conventional simulation schemes (AIS or SIS). Several methods are briefly described below:
1) The Parallel VDHN [27]: this consists in a straightforward parallelization of the VDHN method, simply identifying tasks that can be performed concurrently and allocating them among the processors. This method was implemented on the parallel computers Intel iPSC/2 (distributed memory) and Alliant FX/8 (shared memory), and tests were performed with the IEEE 118-bus system and a US Midwestern system with 662 buses. The results show speedups slightly superior on the iPSC/2, with a strong saturation as the number of processors increases. The maximum obtained speedup was 5.61 on 32 processors (efficiency = 17.5%).
2) The Parallel Newton-W Matrix Method [28]: the main feature of this approach is the use of a parallel version of the Sparse Matrix Inverse Factors [29, 30] in the SIS. The methodology was tested on the shared-memory Symmetry parallel computer with the same test system used in the work cited in the previous item. The results show a worse performance of this method when compared to the parallel VDHN, with a slowdown of 10% to 30% depending on the chosen partitions.
requiring ingenious decomposition schemes and solution methods suitable for parallel applications. The SIS also requires the parallel solution of sets of linear algebraic equations in every integration step, with difficulties similar to the ones described for the AIS.
A basic numerical problem in both simulation schemes is the parallel solution of sets of linear algebraic equations. Direct methods, like LU factorization, have dominated this application in conventional computers. If parallel computers are considered, however, the hegemony of direct methods is no longer guaranteed. In several other engineering and scientific fields, parallel implementations of iterative methods have shown superior performance. Among the most successful iterative methods are the ones belonging to the Conjugate Gradient (CG) category [6]. The parallelization of the solution of the network equations requires the decomposition of the set of equations into a number of subsets equal to the number of processors used in the simulation. An adequate decomposition is fundamental to the success of the parallel solution and needs to take into consideration factors like computation load balancing, convergence rate of the iterative algorithms, etc.
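For reference, a textbook (unpreconditioned) conjugate gradient iteration for a small dense symmetric positive definite system is sketched below in pure Python; a practical power system implementation would, as discussed, use sparse storage, preconditioning, and a parallel decomposition.

```python
# Generic (textbook) conjugate gradient sketch for a small dense
# symmetric positive definite system A x = b.
def matvec(A, x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = b[:]                       # residual r0 = b - A*0
    d = r[:]                       # first conjugate direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:
            break
        Ad = matvec(A, d)
        alpha = rs / dot(d, Ad)
        x = [x_i + alpha * d_i for x_i, d_i in zip(x, d)]
        r = [r_i - alpha * ad_i for r_i, ad_i in zip(r, Ad)]
        rs_new = dot(r, r)
        d = [r_i + (rs_new / rs) * d_i for r_i, d_i in zip(r, d)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
print([round(v, 4) for v in x])  # [0.0909, 0.6364]
```

In exact arithmetic CG converges in at most n iterations; for the network equations, each matrix-vector product is the operation that must be distributed among processors.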
In the last decade or so, several parallel methods were proposed for the solution of the dynamic simulation problem. In the following sections, some of these methods are reviewed.

    F(x, V') = 0        (6)
    G(x, V') = 0        (7)
    Ỹc = Yc - Σ_{i=1}^{p} Yci Yi^{-1} Yic        (11)

    Ĩc = Ic - Σ_{i=1}^{p} Yci Yi^{-1} Ii         (12)

Phase 2: For i = 1, 2, ..., p, solve

    Yi Vi = Ii - Yic Vc        (13)
• The number of subnetwork connections is kept to a minimum in order to reduce communication requirements.

• The subnetworks have approximately the same dimensions and complexity.
Two approaches were used to obtain NBDFs with the above characteristics: a trial-and-error method based on a careful analysis of the system
The solution of the network equations in the form (10)-(13) can be performed in different ways. Considering the implementation on parallel machines, the main issue is how to solve Phase 1 efficiently, since Phase 2 presents an inherently parallel structure. The explicit formation of Ỹc and Ĩc and the solution of (10) by a direct method could be a straightforward approach to this problem. However, it should be noted that Ỹc usually presents a less sparse structure than Yc. Moreover, as stated before, the parallel implementation of direct methods is not an easy task.

The solution of (10) by the CGM does not require the explicit formation of Ỹc, but only of Ỹc Vc^0 and Ĩc in the calculation of the residual r^0, and of Ỹc dk (dk, k = 0, 1, ..., being the conjugate directions) in each iteration. Moreover, if the p independent systems given in (13) are solved by a direct method like LU factorization, the factors of Yi, i = 1, ..., p, can be efficiently used to calculate Ỹc Vc^0, Ĩc and Ỹc dk in parallel.
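The two-phase scheme of (10)-(13) can be illustrated with scalar "blocks" (an assumed toy example, not from the paper): Phase 1 forms and solves the boundary (Schur complement) system, and Phase 2 back-substitutes each subnetwork independently.

```python
# Toy scalar-block illustration (assumed example, not from the paper) of
# two-phase Block-Gaussian elimination on a BBDF system: p internal
# blocks Y_i coupled only through a boundary block Y_c.
def bbdf_solve(Y, Ycoup, Yc, I, Ic):
    # Y[i], I[i]: internal block and injection of subnetwork i (scalars here)
    # Ycoup[i]: coupling between subnetwork i and the boundary bus
    p = len(Y)
    # Phase 1: form and solve the boundary (Schur complement) system,
    # as in (10)-(12)
    Yc_t = Yc - sum(Ycoup[i] * Ycoup[i] / Y[i] for i in range(p))
    Ic_t = Ic - sum(Ycoup[i] * I[i] / Y[i] for i in range(p))
    Vc = Ic_t / Yc_t
    # Phase 2: independent (parallelizable) back-substitutions, as in (13)
    V = [(I[i] - Ycoup[i] * Vc) / Y[i] for i in range(p)]
    return V, Vc

V, Vc = bbdf_solve(Y=[2.0, 3.0], Ycoup=[1.0, 1.0], Yc=4.0,
                   I=[1.0, 2.0], Ic=0.0)
# Verify the first row of the full coupled system: 2*V1 + 1*Vc = I1
print(round(2.0 * V[0] + 1.0 * Vc, 9))  # 1.0
```

Only Phase 1 requires communication between processors; the p solves in Phase 2 can proceed fully in parallel.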
5) The Full CG Approach [33]: in this method the network equations are solved as a whole by a block-parallel version of the Preconditioned CG method. The network matrix is decomposed in such a way that the blocks on the diagonal are weakly coupled to each other, i.e., in a Near Block Diagonal Form (NBDF) [25, 26]. The NBDF is equivalent to the decomposition of the network into weakly coupled subnetworks. A block-diagonal matrix, obtained from the NBDF by neglecting the off-diagonal blocks, is used as a preconditioner. To optimize the algorithm performance, the NBDF should be obtained according to the following:

• The number of subnetworks is fixed and equal to the number of processors used in the simulation.
Phase 1: Solve
$$\tilde{Y}_I V_I = \tilde{I}_I \qquad (10)$$
where:
The BBDF of the network equations can be achieved by re-ordering the network nodes into $p+1$ sets, in which the $p$ first sets correspond to subnetworks connected to each other by the boundary buses in the $(p+1)$-th set. A semiautomatic decomposition method to obtain the BBDF for large networks is described in the next section of this paper. Equation (9) can be solved in a two-phase scheme by Block-Gaussian Elimination as follows:
9) The Parallel Real-Time Digital Simulator [14]: The main objective of this work was the development of a real-time power system simulator for equipment testing based on a massively parallel architecture. The parallel algorithm is based on the AIS using the trapezoidal integration method. One processor is associated to each network bus. The differential equations corresponding to each generator and its controllers are solved in the processor assigned to the bus to which the generator is connected. The network equations are solved by a Gauss-Seidel-like method, also allocating one equation to each processor. Therefore, the number of processors required to perform the simulation is equal to the number of buses in the network. Reported results with a 261-bus network on a 512-node nCube parallel computer show that the required cpu time is not affected by the system dimensions. However, it is doubtful whether this property can be kept valid for larger systems, taking into consideration that the number of iterations required by the Gauss-Seidel algorithm increases considerably with system size. This approach exhibits low speedup and efficiency measured by the traditional indexes. However, it has had a tremendous impact on the research community because it has demonstrated the usefulness of parallel processing in solving a real-world problem.
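The bus-per-processor network solution can be pictured with an ordinary Gauss-Seidel sweep over the nodal equations $Y V = I$; the admittance matrix below is a small hypothetical example, and the sweep is written sequentially, whereas the simulator runs one equation per processor.

```python
import numpy as np

def gauss_seidel(Y, I, V0, tol=1e-10, max_iter=10000):
    """Gauss-Seidel sweeps for the nodal equations Y V = I.

    In the simulator each equation lives on its own processor; here the
    same update is applied bus by bus for clarity.
    """
    V = V0.astype(complex).copy()
    n = len(I)
    for it in range(max_iter):
        max_dv = 0.0
        for k in range(n):
            # Update bus k using the latest values of the other buses.
            sigma = Y[k] @ V - Y[k, k] * V[k]
            v_new = (I[k] - sigma) / Y[k, k]
            max_dv = max(max_dv, abs(v_new - V[k]))
            V[k] = v_new
        if max_dv < tol:
            return V, it + 1
    return V, max_iter

# Small diagonally dominant admittance-like matrix (hypothetical values):
Y = np.array([[ 4+1j, -1+0j, -1+0j],
              [-1+0j,  3+1j, -1+0j],
              [-1+0j, -1+0j,  3+1j]])
I = np.array([1+0j, 0.5+0j, 0.2+0j])
V, iters = gauss_seidel(Y, I, np.zeros(3))
```

The returned iteration count is the quantity that grows with network size in practice, which is the source of the scalability concern raised above.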
4) The Hybrid CG-LU Approach [31, 32]: This method is based on the AIS, the decomposition of the network equations in a BBDF, and a hybrid solution scheme using LU decomposition and the CG method. An efficient parallel implementation of the CG method depends on the existence of a convenient structure of the coefficient matrix. One such structure is the BBDF, which is shown in (9) for equation (4).
• Each processor may be a cluster of q other processors which could be used to solve each subset of equations in parallel, in the so-called space and time parallelization.
Two CG methods suitable for asymmetric sets of linear equations were tested in the implementation of this method. These implementations are referred to as:
• The Bi-Conjugate Gradient method (Bi-CG), in which two sequences of mutually orthogonal residuals are generated by simple relations similar to the one used in the CG method [41].
2) The Space and Time Gauss-Jacobi-Block-Newton Approach [38, 39]: In this approach the waveform relaxation concept is used with a slightly different formulation. The discretization of the differential equations is performed for all integration steps simultaneously, resulting in an extremely large set of algebraic equations. In a first work [38], this set of equations is solved by a parallel version of the Gauss-Jacobi method, with poor performance. In a second work [39], a method called the Gauss-Jacobi-Block-Newton Approach, which consists essentially in
• Low communication requirements: except for the convergence test, each processor needs to communicate only with the processors assigned to the integration steps immediately before and after its own.
• the recently introduced, more robust and efficient variant of the Bi-CG method called Bi-CGSTAB [42].
Both methods were implemented using preconditioning. The preconditioning matrix used is the following block-diagonal matrix:
where p is the number of integration steps taken simultaneously. Matrix A is asymmetric and very sparse. Assuming that each one of the m synchronous machines and its controllers is modeled by K differential equations and that the network has n nodes, the dimension of (7) is (Km + 2n).
1) The Space and Time CG Approach [33]: The calculation of the elements of the mismatch functions $F_i$ and $G_i$, the elements of the Jacobian matrix, and the updating of the components of $x_i$ and $V_i$ in (14) can be performed with perfect parallelism. Therefore, an obvious mapping of these operations onto the multiprocessor system is to allocate the equations corresponding to each integration step to a single processor. The decomposition of (14) should be done in the same way. The advantages of this decomposition approach are:
• Perfect processing load balance.
$$\zeta = -A\,b \qquad (14)$$
where
$$\zeta = [\,F_1 \;\; G_1 \;\; F_2 \;\; G_2 \;\; \cdots \;\; F_p \;\; G_p\,]^T$$
$$b = [\,\Delta x_1 \;\; \Delta V_1 \;\; \Delta x_2 \;\; \Delta V_2 \;\; \cdots \;\; \Delta x_p \;\; \Delta V_p\,]^T$$
$$A = \begin{bmatrix}
Q_1 & R_1 & & & & \\
S_1 & Y_1 & & & & \\
Q_{21} & R_{21} & Q_2 & R_2 & & \\
0 & 0 & S_2 & Y_2 & & \\
 & & & \ddots & & \\
 & & Q_{p,p-1} & R_{p,p-1} & Q_p & R_p \\
 & & 0 & 0 & S_p & Y_p
\end{bmatrix}$$
one-line diagram, and the semiautomatic methodology described in the next section.
The Hybrid CG-LU method has shown a fairly good performance in actual parallel implementations, but presents the disadvantages of applying the CG method to a system of equations with relatively small dimensions and of needing an optimized BBDF. The Full CG method is an attempt to overcome these difficulties. The CG method is expected to present its best performance, both in parallel and vector computers, for very large problems.
The use of preconditioning techniques is advisable for a better performance of the Hybrid CG-LU method and essential for an adequate performance of the Full CG method. Truncated Series preconditioning [40] is used for the Hybrid CG-LU method. For the Full CG method, the Incomplete LU Factorization [6] was chosen as the preconditioning approach after preliminary tests with several methods available in the literature.
5.1.2 Waveform Relaxation
This method [35, 36] consists in decomposing the set of equations describing the power system dynamics into weakly coupled subsystems and solving each subsystem independently for several integration steps to get a first approximation of the time response. The results are then exchanged and the process repeated. The advantages of this method are the possibility of using different integration steps for each subsystem (multirate integration) and the avoidance of the need to solve large sets of linear algebraic equations.
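A minimal sketch of the waveform relaxation idea, using a hypothetical two-state, weakly coupled linear system: each subsystem is integrated over the whole window with the other subsystem's waveform frozen from the previous sweep, and the sweeps are repeated until the waveforms stop changing.

```python
import numpy as np

# Waveform relaxation sketch for a toy weakly coupled linear system
#   x1' = -1.0*x1 + 0.1*x2,   x2' = -2.0*x2 + 0.1*x1
# (hypothetical coefficients, Gauss-Jacobi style exchange between sweeps).

h, steps = 0.01, 100           # integration step and window length
t = np.arange(steps + 1) * h

def integrate(a, c, other, x0):
    """Trapezoidal rule for x' = a*x + c*other(t), with the coupling
    waveform `other` sampled at the same time points and held fixed."""
    x = np.empty(steps + 1)
    x[0] = x0
    for k in range(steps):
        rhs = (1 + h * a / 2) * x[k] + h / 2 * c * (other[k] + other[k + 1])
        x[k + 1] = rhs / (1 - h * a / 2)
    return x

x1 = np.full(steps + 1, 1.0)   # initial waveform guesses (flat)
x2 = np.full(steps + 1, 1.0)
for sweep in range(50):        # relaxation sweeps
    x1_new = integrate(-1.0, 0.1, x2, 1.0)
    x2_new = integrate(-2.0, 0.1, x1, 1.0)
    change = max(np.max(np.abs(x1_new - x1)), np.max(np.abs(x2_new - x2)))
    x1, x2 = x1_new, x2_new
    if change < 1e-12:
        break
```

At convergence the relaxed waveforms satisfy the same trapezoidal equations as a monolithic solve of the coupled system, while each subsystem could have used its own (multirate) step.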
5.1.3 Space and Time Parallelization
This class of methods follows the idea introduced in [37], in which the differential equations given in (1) are algebraized for several integration steps, called integration windows, and solved together with the algebraic equations of this window by the Newton method, similarly to the conventional one-step approach. The resulting enlarged system of linear equations has the following structure:
the application of the VDHN method to the equations associated with each integration step and then the global application of the Gauss-Jacobi method to all integration steps, is used with better results. Both works present results only for simulations of parallel implementations.
5.1.4 Conjugate Gradient Approach Results
The Hybrid CG-LU, Full CG, and Space and Time CG methods described above were tested using different test systems, including a representation of the South-Southern Brazilian interconnected system with 80 machines and 616 buses. The tests were performed on the iPSC/860 computer and on a prototype parallel computer using the Transputer T800 processor. Despite the difficulties in parallelizing this application, the results obtained in these tests showed a considerable reduction in computation time. The CG methods presented adequate robustness, accuracy, and computation speed, establishing themselves firmly as an alternative to direct methods in parallel dynamic simulation. Moderate efficiencies and speedups were achieved, particularly in the tests performed on the iPSC/860, which is partially explained by the relatively low communication/computation speed ratio of the machines used in the tests. It is believed that on other commercially available parallel machines the studied algorithms will be able to achieve higher levels of speedup and efficiency.
5.2 Network Decomposition for Parallel Block-Iterative Solutions
A basic step in the solution of sets of linear equations by block-iterative methods is the decomposition of the coefficient matrix in order to have it mapped onto the parallel architecture. In the case of electrical networks, this is equivalent to decomposing or tearing apart the network into subnetworks of smaller size.
Reference [24] introduces an enhanced version of a semiautomatic network decomposition method for block-iterative solutions on parallel computers, preliminarily reported in [34]. The basic operating principle of the proposed method is the build-up of subnetworks from a given number of starting nodes (seed nodes). The subnetworks are formed by aggregating nodes one by one to these seed nodes. The choice of which node to aggregate to a given seed node depends on a node ranking criterion based on weights previously assigned to all network nodes. Fig. 5 shows an overview of the decomposition method.
The weights used for the node ranking are calculated using the following criterion:
Fig. 5: Decomposition method overview (flow: Seed Nodes Choice, Node Ranking, parallel Subnetwork Formation blocks, Selection of the Decompositions)
where $\Omega_i$ is the set of all branches connected to node $i$; $\Omega_T$ is the set of all network branches; $B_l$ is the susceptance of branch $l$; $nb$ is the number of network branches. The value of $w_i$ indicates the magnitude of the electrical interconnection between a node and its neighbors. The exponent $M$ is used to increase the influence of branches with high susceptance values, in order to differentiate nodes with several low-susceptance interconnections from nodes with few high-susceptance interconnections.
In each step of the process, the next node to be aggregated to the subnetwork is the one with the largest weight not yet aggregated to other subnetworks. The process terminates when the subnetworks can aggregate no more nodes. In the algorithm that generates the NBDF, all nodes are aggregated to the subnetworks, while in the one for the BBDF, the nodes left out of the subnetworks form the interconnection subnetwork.
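The aggregation procedure can be sketched as a greedy loop, shown below on a hypothetical six-node network; the weight formula is only in the spirit of (15)-(16) (branch susceptances raised to an exponent M), not the paper's exact expression.

```python
import numpy as np

def decompose(branches, susceptances, n_nodes, seeds, M=2):
    """Greedy sketch of the seed-node aggregation described in the text.

    branches: list of (i, j) node pairs; susceptances: |B_l| per branch;
    seeds: one starting node per desired subnetwork.
    """
    w = np.zeros(n_nodes)
    adj = {i: set() for i in range(n_nodes)}
    for (i, j), b in zip(branches, susceptances):
        w[i] += abs(b) ** M            # node weight from incident branches
        w[j] += abs(b) ** M
        adj[i].add(j)
        adj[j].add(i)

    assign = {s: k for k, s in enumerate(seeds)}   # node -> subnetwork
    frontier = [set(adj[s]) - set(seeds) for s in seeds]
    grown = True
    while grown:
        grown = False
        for k in range(len(seeds)):                # one aggregation per subnet
            cand = [n for n in frontier[k] if n not in assign]
            if not cand:
                continue
            best = max(cand, key=lambda n: w[n])   # largest-weight node first
            assign[best] = k
            frontier[k] |= adj[best]
            grown = True
    return assign

# Two strongly meshed triangles joined by one weak tie-line (hypothetical):
branches = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
susc     = [ 5.0,   5.0,   5.0,   4.0,   4.0,   4.0,   0.1]
parts = decompose(branches, susc, 6, seeds=[0, 5])
```

Here every node ends up assigned, i.e., the NBDF flavor; generating the BBDF would instead stop the growth at the tie-lines and collect the leftover nodes into the interconnection subnetwork.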
The choice of the seed node set is a relevant step in the decomposition method outlined above. The number of seed nodes determines the number of subnetworks, and their location strongly influences the suitability of the decompositions for parallel solution. Most of the generated decompositions are likely to be adequate for solution by a block-iterative method. However, some of these decompositions require fewer iterations and less computation time than the others, for the same solution method and convergence tolerance. Therefore, it is desirable to be able to spot the most promising decompositions before the effective network equation solution. This can be achieved using a Decomposition Selection
$$w_i = \sum_{l \in \Omega_i} \left( \frac{|B_l|}{\bar{B}} \right)^M \qquad (15)$$
$$\bar{B} = \frac{1}{nb} \sum_{l \in \Omega_T} |B_l| \qquad (16)$$
method. The decomposition method outlined above, using a node ranking based on weights calculated according to (15) and (16), is conducive to block-diagonal dominance of the NBDF's and BBDF's and, therefore, to the convergence of the associated iterative processes for the network equation solutions.
5.3 Simulation of Electromagnetic Transients
In the usual model of the power network for electromagnetic transient studies, all components except transmission lines are modeled by lumped-parameter equivalent circuits composed of voltage and current sources, linear and non-linear resistors, inductors, capacitors, ideal switches, etc. These elements are described in the mathematical model by ordinary differential equations which are solved by step-by-step numerical integration, often using the trapezoidal rule, leading to equivalent circuits consisting of resistors and current sources.
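The companion-circuit idea can be made concrete for a single inductor: with the trapezoidal rule, $v = L\,di/dt$ becomes a conductance $h/2L$ in parallel with a current source built from the previous step. The sketch below (hypothetical element values, not from the paper) steps a series R-L circuit this way and compares the final current with the analytic solution.

```python
import numpy as np

# Trapezoidal companion model of an inductor: after discretization the
# inductor behaves as a conductance Geq = h/(2L) in parallel with a
# history current source  I_hist = i(t-h) + Geq * v(t-h).

E, R, L = 1.0, 1.0, 0.1        # DC source, series resistor, inductor (hypothetical)
h, steps = 1e-4, 5000          # time step and number of steps
Geq = h / (2 * L)

v1, iL = 0.0, 0.0              # node voltage and inductor current
for _ in range(steps):
    I_hist = iL + Geq * v1     # history term from the previous step
    # Nodal equation at node 1: (1/R + Geq) * v1 = E/R - I_hist
    v1 = (E / R - I_hist) / (1 / R + Geq)
    iL = I_hist + Geq * v1     # inductor current at the new time point

t = steps * h
i_exact = (E / R) * (1 - np.exp(-R * t / L))
```

The resistive companion network is what makes each time step a purely algebraic nodal solve, the property the parallel scheme below exploits.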
Transmission lines often have dimensions comparable to the wavelength of the high-frequency transients and, therefore, have to be modeled as distributed-parameter elements described mathematically by partial differential equations (the wave equation). For instance, in a transmission line of length $l$, the voltage and current at a point at a distance $x$ from the sending end, at a time $t$, are related through the following equations:
processing, often correspond to a geographical mapping of the power system onto the multiprocessor topology, as shown below for a two-subnetwork example:
Fig. 6: Natural Decomposition of the Electromagnetic Transients Problem Model
where $G_A$ and $G_B$ are conductance matrices related to linear branch elements; $T_A$ and $T_B$ are non-linear functions related to non-linear branch elements; $E_A$ and $E_B$ are vectors of the unknown node voltages; $I_A^S$, $I_B^S$ are nodal current injections corresponding to independent sources; $I_A^L$, $I_B^L$, $I_A^C$, $I_B^C$ are the nodal injection currents related to the equivalent circuits of inductors and capacitors; and $I_A^H$, $I_B^H$ are the nodal current injections present in the transmission line models. Figure 6 is a graphical representation of this model.
Since $I^S(t)$ is known and $I^H(t)$, $I^L(t)$, and $I^C(t)$ depend only on terms computed in previous integration steps, $E_A(t)$ and $E_B(t)$ can be computed independently in different processors. The computation of the terms $I_A^S(t)$, $I_B^S(t)$, $I_A^L(t)$, $I_B^L(t)$, $I_A^C(t)$, $I_B^C(t)$ can also be executed in parallel, since the equations related to branches in a particular subnetwork depend only on nodal voltages belonging to the same subnetwork. However, the term $I_A^H(t)$ depends on the past terms $I_B^H(t-\tau)$ and $E_B(t-\tau)$, just as $I_B^H(t)$ depends on the past terms $I_A^H(t-\tau)$ and $E_A(t-\tau)$. Since such terms have already been evaluated in previous integration steps, the processors must exchange data in order for each one to be able to compute its part of the vector $I^H(t)$.
An electromagnetic transients simulation methodology, based on the above model, was developed and implemented on a prototype parallel machine using the Transputer T800 processor [43]. Using a careful hand-crafted load balancing procedure, it was possible to achieve very high efficiencies (98% in some cases) in the simulation of real-size networks.
$$-\frac{\partial E(x,t)}{\partial x} = L\,\frac{\partial I(x,t)}{\partial t} + R\,I(x,t) \qquad (17)$$
$$-\frac{\partial I(x,t)}{\partial x} = C\,\frac{\partial E(x,t)}{\partial t} + G\,E(x,t) \qquad (18)$$
where $E(x,t)$ and $I(x,t)$ are $p \times 1$ vectors of phase voltages and currents ($p$ is the number of phases); $R$, $G$, $L$, and $C$ are $p \times p$ matrices of the transmission line parameters.
The wave equation does not have an analytic solution in the time domain in the case of a lossy line, but it has been shown that it can be adequately represented by a traveling wave model consisting of two disjoint equivalent circuits containing a current source in parallel with an impedance at both ends of the line. The values of the current sources are determined by circuit variables computed in past integration steps (history terms).
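A minimal sketch of this traveling-wave (Bergeron-type) model for a lossless line with hypothetical parameters: each end sees the characteristic impedance in parallel with a history current source, and the history terms are exchanged through delay buffers of length equal to the travel time, exactly the data the two processors would swap.

```python
from collections import deque

# Bergeron (lossless traveling-wave) line sketch: a unit step source behind
# a matched resistance feeds node k; node m is open.  tau = D time steps.

Zc, D = 400.0, 50              # characteristic impedance, delay in steps
E, R = 1.0, 400.0              # step source, matched source resistance
steps = 200

hist_k = deque([0.0] * D)      # history terms traveling m -> k
hist_m = deque([0.0] * D)      # history terms traveling k -> m
vk = vm = 0.0
for _ in range(steps):
    Ik, Im = hist_k.popleft(), hist_m.popleft()
    # Node k: source E through R, line entry Zc, history source Ik
    vk = (E / R - Ik) / (1 / R + 1 / Zc)
    # Node m: open end, only the line entry Zc and its history source Im
    vm = -Im * Zc
    i_km = vk / Zc + Ik        # currents flowing into the line at each end
    i_mk = vm / Zc + Im
    hist_k.append(-vm / Zc - i_mk)   # arrives at k after D steps
    hist_m.append(-vk / Zc - i_km)   # arrives at m after D steps
```

After one travel delay the open end sees the classic voltage doubling, and the matched source then absorbs the reflection, so both ends settle at the source voltage.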
This model is nicely structured for parallel processing: subnetworks of lumped parameter circuit elements connected by transmission lines, representing a group of devices in a substation for instance, can be represented by sets of nodal equations that interface with other groups of equations by the variables required to calculate the current sources in the transmission line equivalent circuits. The exploitation of this characteristic of the network model, in the partitioning of the set of equations for parallel
5.4 Small-Signal Stability
Operational problems caused by weakly damped, or even negatively damped, electromechanical oscillations must be detected and eliminated to avoid deterioration in the quality of the supplied electrical energy and damage to equipment. This type of problem can be studied using linearized versions of the power system dynamic model. The great advantage of this approach is the possibility of assessing the performance of control schemes without time simulation. This assessment is conducted through linear control systems analysis and design methods. A large scale numerical problem resulting from the application of these techniques is the computation of eigenvalues and eigenvectors associated with the state matrix of the linearized system model.
A linearized version of (1) and (2) at an operating point $(x_0, z_0)$ is given by
where $J_1, \ldots, J_4$ are Jacobian matrices evaluated at the linearization point. The power system state transition equation can be obtained by eliminating $\Delta z$ from (21):
$$\Delta \dot{x} = \left( J_1 - J_2 J_4^{-1} J_3 \right) \Delta x = A\, \Delta x \qquad (22)$$
where A is the system state matrix whose eigenvalues provide information on the singular point stability of the nonlinear system. Efficient algorithms to obtain the dominant eigenvalues and eigenvectors of A for large scale systems do not require the explicit calculation of this matrix [44, 45]. These algorithms can be directly applied to (21), named the augmented system matrix, whose sparse structure can be fully exploited to reduce both cpu time and memory requirements. These methods require repeated solutions of linear equation sets of the form [46]:
where $u$, $v$ are unknown vectors; $q$ is a complex shift used to make dominant the eigenvalues close to $q$; $I$ is the identity matrix; $r$ is a complex vector; and $k$ is the iteration counter. These equation sets are independent and their solution can be obtained concurrently in different processors. This property makes the eigenvalue problem well suited for parallel processing.
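The role of the shifted solves can be seen in a toy inverse iteration, sketched below on a small hypothetical matrix: repeatedly solving with $(A - qI)$ makes the eigenvalue closest to the shift dominant. In the parallel schemes discussed next, independent solves of this kind (different shifts or trial vectors) are what get distributed across processors.

```python
import numpy as np

def inverse_iteration(A, q, iters=50):
    """Shift-invert iteration: solving (A - q I) v = r repeatedly makes the
    eigenvalue of A closest to the shift q dominant in the iteration."""
    n = A.shape[0]
    M = A - q * np.eye(n)
    v = np.random.default_rng(0).standard_normal(n)
    for _ in range(iters):
        v = np.linalg.solve(M, v)       # one shifted solve per iteration
        v /= np.linalg.norm(v)
    lam = v @ A @ v / (v @ v)           # Rayleigh quotient estimate
    return lam, v

# Small state-matrix-like example with eigenvalues -1, -2, -5 (hypothetical):
A = np.array([[ 0.0,  1.0, 0.0],
              [-2.0, -3.0, 0.5],
              [ 0.0,  0.0, -5.0]])
lam, v = inverse_iteration(A, q=-0.8)   # converges to the eigenvalue -1
```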
In the work reported in [46] and [47], algorithms for the parallel solution of the eigenvalue problem for small-signal stability assessment, using the above formulation, are described, and the results of tests with models of large practical power systems are presented. A first investigatory line of research was
based on the parallelization of the Lop-sided Simultaneous Iterations method [46]. The obvious parallel strategy used was to carry out each trial vector solution on a different processor. Results obtained in tests performed on the iPSC/860 parallel computer, using two large scale representations of the Brazilian South-Southern interconnected power system, presented computation efficiencies around 50%. A second approach to the problem uses a Hybrid Method [47] resulting from the combination of the Bi-Iteration version of the Simultaneous Iteration algorithm and the Inverse Iteration method. The Hybrid algorithm exploits the fast eigenvalue estimation of the Bi-Iteration algorithm and the fast eigenvector convergence of the Inverse Iteration algorithm whenever the initial shift is close to an eigenvalue. In the Inverse Iteration stage, the Hybrid algorithm allows perfect parallelization. The results obtained indicate a superior performance of this method, both in terms of computation speedup and robustness.
5.5 State Estimation
State estimation is a basic module in the Energy Management System (EMS) advanced application software. Its main function is to provide reliable estimates of the quantities required for monitoring and control of the electric power system. In almost all state estimation implementations, a set of measurements obtained by the data acquisition system throughout the whole supervised network, at approximately the same time instant, is centrally processed by a static state estimator at regular intervals or by operator's request. Modern high speed data acquisition equipment is able to obtain new sets of measurements every 1-10 seconds, but the present EMS hardware and software allow state estimation processing only every few minutes. It has been argued that a more useful state estimation operational scheme would be achieved by shortening the time interval between consecutive state estimations to allow a closer monitoring of the system evolution, particularly in emergency situations in which the system state changes rapidly. Another industry trend is to enlarge the supervised network by extending state estimation to low voltage subnetworks. These trends pose the challenge of performing state estimation in a few seconds for networks with thousands of nodes.
The higher frequency of state estimation execution requires the development of faster state estimation algorithms. The larger size of the supervised networks will increase the demand on the numerical stability of the algorithms. Conventional centralized state estimation methods have reached a development stage in which substantial improvements in either speed or numerical robustness are not likely to occur. These facts, together with the technical developments on distributed EMS based on fast data communication network technology, open up the
possibility of parallel and distributed implementations of the state estimation function.
In the work reported in [48], the possibility of parallel and distributed state estimation implementation was exploited, leading to a solution methodology based on conventional state estimation algorithms and a coupling constraint optimization technique.
The information model used in power system state estimation is represented by the equation
where $z$ is an $(m \times 1)$ measurement vector, $x$ is an $(n \times 1)$ true state vector, $h(\cdot)$ is an $(m \times 1)$ vector of nonlinear functions, $w$ is an $(m \times 1)$ measurement error vector, $m$ is the number of measurements, and $n$ is the number of state variables. The usual choices for state variables are the voltage phase angles and magnitudes, while the measurements are active and reactive power flows, node injections, and voltage magnitudes.
As in load flow calculations, it has been found that state estimation algorithms based on decoupled versions of (24) behave quite adequately for the usual power networks. Therefore, the following decoupled model has been mostly adopted:
$$z = h(x) + w \qquad (24)$$
$$z_p = h_p(\theta, v) + w_p \qquad (25)$$
$$z_q = h_q(\theta, v) + w_q \qquad (26)$$
where
$z_p^k$ and $z_q^k$ are vectors of active and reactive measurements in area $k$; dimensions: $(m_p^k \times 1)$ and $(m_q^k \times 1)$, respectively.
$\theta^k$ and $v^k$ are vectors of voltage phase angles and magnitudes in area $k$, including the ones corresponding to the boundary buses; dimensions: $(n_p^k \times 1)$ and $(n_q^k \times 1)$, respectively.
The WLS state estimate for the distributed estimation problem depicted in (27) and (28) can be obtained by solving a constrained optimization problem with a separable objective function and a set of linear constraints introduced to force the state variables in the overlapping areas to assume the same values. The WLS problem is stated as follows:
$$\min \; \sum_{k=1}^{r} \tfrac{1}{2} \left\{ [z_p^k - h_p^k(\cdot)]^T [R_p^k]^{-1} [z_p^k - h_p^k(\cdot)] + [z_q^k - h_q^k(\cdot)]^T [R_q^k]^{-1} [z_q^k - h_q^k(\cdot)] \right\} \qquad (29)$$
$$\text{s. to} \quad \sum_{k=1}^{r} A_p^k \theta^k = 0 \qquad (30)$$
$$\sum_{k=1}^{r} A_q^k v^k = 0 \qquad (31)$$
where $\theta$ $(n_p \times 1)$ and $v$ $(n_q \times 1)$ are the vectors of true voltage phase angles and magnitudes; $p$ and $q$ are subscripts indicating partitions of vectors and matrices corresponding to active and reactive measurements, respectively; $n_p = n - 1$, $n_q = n$, and $n$ is the number of network nodes.
Assume that the power network is decomposed in $r$ areas. The areas are connected through boundary buses which belong simultaneously to both adjacent areas. Therefore, there are overlapping areas which may be observed using both adjacent measurement sets. The number of boundary buses may be kept to a minimum, or a few extra buses may be incorporated in order to facilitate some estimation functions, like bad data processing, for instance. A further assumption is that there are no injection measurements in the overlapping area buses. This assumption does not necessarily represent a practical limitation to the proposed methodology, as actual injection measurement buses in overlapping areas can be replaced by fictitious buses with no injection measurement, connected to the actual buses, now placed outside the overlapping area, by zero impedance lines.
Under the above assumptions, the state estimation problem introduced in (25) and (26) can be decomposed as follows
where $\gamma_p$ and $\gamma_q$ are $(l \times 1)$ vectors of Lagrange multipliers and
calculated at the solution point. As in the integrated state estimation approach, the Gauss-Newton method combined with the usual
$$H_q^k = \frac{\partial h_q^k(\theta^k, v^k)}{\partial v^k}$$
and
$$H_p^k = \frac{\partial h_p^k(\theta^k, v^k)}{\partial \theta^k}$$
$$\frac{\partial L}{\partial \theta^k} = -[H_p^k]^T [R_p^k]^{-1} [z_p^k - h_p^k(\theta^k, v^k)] + [A_p^k]^T \gamma_p = 0\,, \quad k = 1, \ldots, r \qquad (32)$$
$$\frac{\partial L}{\partial v^k} = -[H_q^k]^T [R_q^k]^{-1} [z_q^k - h_q^k(\theta^k, v^k)] + [A_q^k]^T \gamma_q = 0\,, \quad k = 1, \ldots, r \qquad (33)$$
$$\frac{\partial L}{\partial \gamma_p} = \sum_{k=1}^{r} A_p^k \theta^k = 0 \qquad (34)$$
$$\frac{\partial L}{\partial \gamma_q} = \sum_{k=1}^{r} A_q^k v^k = 0 \qquad (35)$$
where $A_p^k$ and $A_q^k$ are $(l \times n_p^k)$ and $(l \times n_q^k)$ matrices, respectively ($l$ is the number of boundary buses), whose nonzero elements are either $1$ or $-1$.
The necessary conditions for the solution of the above problem, derived from the corresponding Lagrangian function $L$, are:
$$z_p^k = h_p^k(\theta^k, v^k) + w_p^k\,, \quad k = 1, \ldots, r \qquad (27)$$
$$z_q^k = h_q^k(\theta^k, v^k) + w_q^k\,, \quad k = 1, \ldots, r \qquad (28)$$
Fast Decoupled assumptions can be used to solve (32)-(35), leading to the following algorithm:
$$\tilde{\theta}^k(i+1) = \theta^k(i) + [C_p^k]^{-1} \Delta b_p^k(i)\,, \quad k = 1, \ldots, r \qquad (36)$$
$$\gamma_p(i+1) = N_p^{-1} \sum_{k=1}^{r} A_p^k \tilde{\theta}^k(i+1) \qquad (37)$$
$$\theta^k(i+1) = \tilde{\theta}^k(i+1) - [C_p^k]^{-1} [A_p^k]^T \gamma_p(i+1)\,, \quad k = 1, \ldots, r \qquad (38)$$
A straightforward implementation of the algorithm given in (36)-(47) leads to a hierarchical estimator. As explained earlier, this kind of estimator is not suitable for parallel or distributed processing. A version of the algorithm more adequate to this type of processing can be obtained by neglecting the off-diagonal elements in matrices $C_p^k$ and $C_q^k$ in equations (37), (38) and (40), (41). In this case, these equations can be merged and rewritten, respectively, as
$$\gamma_q(i+1) = N_q^{-1} \sum_{k=1}^{r} A_q^k \tilde{v}^k(i+1) \qquad (40)$$
$$v^k(i+1) = \tilde{v}^k(i+1) - [C_q^k]^{-1} [A_q^k]^T \gamma_q(i+1)\,, \quad k = 1, \ldots, r \qquad (41)$$
where
$$\Delta b_p^k(i) = [H_p^k]^T [R_p^k]^{-1} \left[ z_p^k - h_p^k(\theta^k(i), v^k(i)) \right] \qquad (42)$$
$$\Delta b_q^k(i) = [H_q^k]^T [R_q^k]^{-1} \left[ z_q^k - h_q^k(\theta^k(i+1), v^k(i)) \right] \qquad (43)$$
$$C_p^k = [H_p^k]^T [R_p^k]^{-1} H_p^k \qquad (44)$$
$$C_q^k = [H_q^k]^T [R_q^k]^{-1} H_q^k \qquad (45)$$
$$N_p = \sum_{k=1}^{r} A_p^k [C_p^k]^{-1} [A_p^k]^T \qquad (46)$$
$$N_q = \sum_{k=1}^{r} A_q^k [C_q^k]^{-1} [A_q^k]^T \qquad (47)$$
and $H_p$ and $H_q$ are calculated at nominal conditions and kept constant in the iterative process.
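A linear, noise-free toy version of the iteration (36)-(38) is sketched below for two areas sharing a single boundary variable; the measurement matrices are hypothetical and the model is linear, whereas the paper's problem is nonlinear and decoupled into active and reactive halves.

```python
import numpy as np

# Two areas, each with a 2-element state whose first component is the
# shared boundary variable; Lagrange multipliers force agreement on it.

rng = np.random.default_rng(1)
H = [rng.standard_normal((4, 2)), rng.standard_normal((4, 2))]  # per-area Jacobians
x_true = [np.array([0.3, 1.0]), np.array([0.3, -0.5])]          # boundary value 0.3 shared
z = [H[k] @ x_true[k] for k in range(2)]                        # noise-free measurements
A = [np.array([[1.0, 0.0]]), np.array([[-1.0, 0.0]])]           # constraint: x1_b - x2_b = 0

C = [H[k].T @ H[k] for k in range(2)]                 # local gain matrices (R = I)
Cinv = [np.linalg.inv(C[k]) for k in range(2)]
N = sum(A[k] @ Cinv[k] @ A[k].T for k in range(2))    # boundary coordination matrix

x = [np.zeros(2), np.zeros(2)]
for _ in range(20):
    # local Gauss-Newton step in each area (parallelizable), cf. (36)
    xt = [x[k] + Cinv[k] @ H[k].T @ (z[k] - H[k] @ x[k]) for k in range(2)]
    # multiplier update from the boundary mismatch, cf. (37)
    gam = np.linalg.solve(N, sum(A[k] @ xt[k] for k in range(2)))
    # correction pushing the shared variable toward agreement, cf. (38)
    x = [xt[k] - Cinv[k] @ A[k].T @ gam for k in range(2)]
```

With consistent measurements the local steps already agree on the boundary value, so the multiplier stays at zero; with noisy data the multiplier step is what pulls the two local estimates together.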
Matrices $N_p$ and $N_q$ defined in (46) and (47) play a key role in the distributed state estimation formulation. These matrices present the following relevant characteristics:
• Structure: represented by a graph whose nodes correspond to the boundary buses; from a particular node, there are links to all nodes corresponding to boundary buses of the adjacent areas.
• Elements: the diagonal elements are the sum of the diagonal elements of the two local area inverse gain matrices ($C_p^k$ and $C_q^k$) involved in the corresponding constraints; the nonzero off-diagonal elements correspond to the off-diagonal elements of the inverse local area gain matrices; in the case of more than one boundary bus in the overlapping area, the off-diagonal elements related to the interconnection of these buses are the sum of the corresponding elements of the two local area inverse gain matrices.
$$\theta^k(i+1) = \tilde{\theta}^k(i+1) + \Delta\tilde{\theta}^k(i+1) \qquad (48)$$
$$v^k(i+1) = \tilde{v}^k(i+1) + \Delta\tilde{v}^k(i+1) \qquad (49)$$
$$\Delta\tilde{\theta}_r^k(i+1) = \pm \frac{g_{rr}^k}{g_{rr}^k + g_{rr}^j} \left[ \tilde{\theta}_r^j(i+1) - \tilde{\theta}_r^k(i+1) \right] \qquad (50)$$
where $g_{rr}^k$ and $g_{rr}^j$ are diagonal elements corresponding to bus $r$ of the inverse gain matrices of the neighboring areas $k$ and $j$, respectively, and the sign is set to $+$ or $-$ according to $A_p^k$. The elements of $\Delta\tilde{v}^k(i+1)$ are calculated similarly.
The algorithm introduced in the previous section calculates the $\theta$ and $v$ updates at every iteration in a synchronous way, i.e., it has to wait until the state vector is updated in all areas (or processors) before it starts a new iteration. This approach presents drawbacks regarding both parallel and distributed processing. In parallel processing, synchronous algorithms usually cannot achieve high efficiency, due to processor idle time, unless a perfect load balancing is obtained, which is not easily achievable in most practical applications. In distributed processing, it requires difficult synchronizing operations and reliable communication systems, which are not usually present in geographically distributed systems. Asynchronous algorithms, on the other hand, are more flexible for the implementation of both parallel and distributed computation processes.
To achieve asynchronous mode operation, the algorithm defined above has to be slightly modified. The modifications concern the way the components of the vectors defined in (48) and (49) are updated. In the synchronous version, these vectors are calculated using values of the state variables obtained in the i-th iteration. In the asynchronous algorithm, these vectors are updated using the latest available value of the state variables. Therefore, whenever a particular processor finishes the updating of its state vector, it transmits this information to its adjacent processors (logical adjacency), reads in the values of the boundary bus state variables from its buffer, regardless of the age of that information, and proceeds to the calculation of the next area state vector update.
where all the elements of $\Delta\tilde{\theta}^k(i+1)$ are null except the ones corresponding to boundary buses, which are given by
$$\tilde{v}^k(i+1) = v^k(i) + [C_q^k]^{-1} \Delta b_q^k(i)\,, \quad k = 1, \ldots, r \qquad (39)$$
and
Owing to the local nature of the state estimation algorithm, simulated experiments have shown that the computation can continue even in the absence of information from other areas. This fact can easily be understood considering that (36) and (39) alone are in fact a local decoupled state estimator. Obviously, in this case the matching of boundary bus state variables would not be achieved. This property makes the algorithm suitable for asynchronous implementation. Based on the above property, several iterative schemes can be derived, incorporating different degrees of asynchronism, as will be discussed in later sections.
The results of computational experiments indicate that the algorithm described above is accurate, robust, and efficient in terms of reducing the required computation time. An important by-product of the work is the indication that state estimation is probably a naturally decoupled problem. Important parallel and distributed state estimation issues, like database access and communication between processors, were not addressed in this work. Also, bad data processing and observability analysis, in the distributed context, were not studied. These issues are topics well suited for further research. Finally, a strong practical advantage of the methodology introduced in this paper is the use of standard state estimation algorithms.
5.7 Team Algorithms
Parallel asynchronous implementations of iterative algorithms are now establishing themselves as good choices for high performance computation in distributed-memory environments, in view of several attractive features that they possess, such as ease of implementation, facility with load balancing, shorter convergence times, and so on [6]. A new context for asynchronism arose with the introduction of the so-called Team Algorithms, which are hybrids of different methods. Such combinations of algorithms were first proposed in the context of power system applications in [52], for sequential computers, and later again in [53, 54, 55], where they were first called Team Algorithms, in the context of asynchronous parallel computing, in which such methods have a natural implementation.
The main idea behind a Team Algorithm (TA) is to partition a complex problem into several subproblems and to solve each subproblem in a different processor of a distributed computer system, choosing for each subproblem one or more methods that best solve it. That way, if there are subproblems with different characteristics for which different methods have been chosen, the resulting combination of methods is a TA.
As an illustration of the TA approach, consider the problem of solving a set of $n$ nonlinear algebraic equations given by:
5.6 Load Flow Computations

$$\Phi(x) = 0\,, \quad x \in R^n \qquad (51)$$
$$\Phi_1(x) = 0 \qquad (54)$$
$$\Phi_2(x) = 0 \qquad (55)$$
An asynchronous implementation of a TA to solve(51), implemented in a distributed memory computing system, can be represented as:
Assume that the elements of (51) can be partitionedas:
$$\Phi(x) = [\,\Phi_1(x) \;\; \Phi_2(x)\,]^T \qquad (52)$$
$$x = [\,x_1 \;\; x_2\,]^T \qquad (53)$$
Then, equation (51) may be rewritten as:
$$x_1(k+1) = g_1[x^1(k)]$$
$$x_2(k+1) = g_2[x^2(k)]$$
$$x(k+1) = c\,x(k) + w_1\,x_1^a(k) + w_2\,x_2^a(k) \qquad (56)$$
where $g_i$ ($i = 1, 2$) are functions representing the iterative processes used in each processor, respectively; $x^i(k)$ represents the most recently received value of vector $x$ in processor $i$ at iteration $k$, while $x_i^a(k)$ ($i = 1, 2$) represents the most recent value of $x_i$ at iteration $k$ available to processor 3 (the Administrator).
The TA given by (56) works as follows: each processor $i$ ($i = 1, 2$) tries to solve its subproblem, (54) or (55), using
• The practical load-flow problem is much more difficult to parallelize than other similar problems, owing to constraints added to the basic system of non-linear algebraic equations.
• Very efficient algorithms are already availablewhich can solve large load-flow problems (morethan 1000 nodes) in a few seconds on relativelyinexpensive computers [1].
More interesting investigatory lines are the parallelization of multipie load-flow solutions (contingency analysis, for instance) [49, 50J and the speedup of load-flow programs in vector processors[51].
Load-flow is a fundamental tool in power system studies. It is by far the most often used program in evaluating system security, configuration adequacy, etc., and as a starting point for other computations such as short circuit, dynamic simulation,etc. Its efficient solution is certainly a fundamental requirement for the overall efficiency of severalintegrated power system analysis and synthesis programs. Therefore, it should be expected a great research effort in the parallelization of the 10ad-f1owalgorithms. Tbat has not been the case, however,for two main reasons:
139
tx txProcessor 1 Processor 2
Xl = Çl (x) X2 = Ç2(X)
Xl X2
Processor 3: Administrator
~ X = C X + Wl Xl + W2 X2
Fig. 7: Example cf Team Algcrithm.
an iterative algorithm represented by the map ç;,using the most recent value of X received from theAdministrator, and transmitting the updated valueXi. At the same time, the Administrator updates X
combining most recentl)' received values of Xi, usingproperly chosen non-negative weíghts (Wl, W2 2: O).The constant c is chosen to ensure that for a solutionx' , the following condition holds:
which. in turno implies that:
c = 1 - Wl - W2·
This solution process is illustrated in figure (7).The advantages of using a TA in distributed load
flow computations is demonstrated in the researchwork reported in [56, 57, 58]. In this work, twoload flow example problems are studied. For theseexamples, neither the Fast Decoupled method northe block Jacobi version of the Y matrix methodcan properly solve the problem; however, a TAcombining both methods solves the problems withthe additional advantage of being easily parallelizedin an asynchronous environment with an excellentspeedup. That way, it was possible to exploit ali thepotentia! of distributed memory computers, as theInte! iPSCj860 hypercube of 8 nodes used for theimplementations described in the referred work.
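To make the update scheme (56) concrete, the sketch below runs a TA on a small invented two-equation system. The equations, the maps g_1 and g_2, and the weights are illustrative assumptions, not the load flow problems or methods of [56, 57, 58]; only the Administrator update and the choice c = 1 - w_1 - w_2 follow the text:

```python
import math

# Team Algorithm sketch on an invented system: processor 1 is assigned
# phi_1(x) = x_1 - cos(x_2) = 0 and processor 2 is assigned
# phi_2(x) = x_2 - 0.5*sin(x_1) = 0.  Each solves its own block, and the
# Administrator blends the candidates with weights w_1, w_2 and
# c = 1 - w_1 - w_2.  The loop is run synchronously for reproducibility;
# in a real asynchronous TA each g_i would work on a stale,
# most-recently-received copy of x.

def g1(x):
    """Processor 1: full-vector candidate with block x_1 updated."""
    return [math.cos(x[1]), x[1]]

def g2(x):
    """Processor 2: full-vector candidate with block x_2 updated."""
    return [x[0], 0.5 * math.sin(x[0])]

def team_algorithm(x0, w1=0.5, w2=0.5, iters=200):
    c = 1.0 - w1 - w2  # fixed-point condition from the text
    x = list(x0)
    for _ in range(iters):
        cand1, cand2 = g1(x), g2(x)  # the two "team members"
        # Administrator update: x <- c*x + w1*cand1 + w2*cand2 (componentwise)
        x = [c * xi + w1 * a + w2 * b for xi, a, b in zip(x, cand1, cand2)]
    return x

x = team_algorithm([1.0, 0.0])
print(x)  # converges to the solution of phi_1 = phi_2 = 0
```

With w_1 = w_2 = 0.5 the Administrator keeps no memory of the previous iterate (c = 0), and the scheme reduces to a damped block fixed-point iteration; convergence conditions for the genuinely asynchronous case are analyzed in [6] and [57].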
6 Research Activities in Brazilian Institutions
In Brazil, research activities in the application of parallel processing to power system problems have been advancing at a fairly fast pace in the last ten years or so. At least three Brazilian institutions have research groups working in this area: COPPE¹, CEPEL², and UNICAMP³.

¹ The Postgraduate School of Engineering of the Federal University of Rio de Janeiro.
² The Brazilian Utilities Research Center.
³ The University of Campinas.
COPPE started a research project on the application of parallel processing to engineering problems in 1986. The project joined together professors, researchers, and graduate students of the Computer Science, Civil, and Electrical Engineering Departments. Its objective is the development of a complete cycle of parallel processing technology comprising hardware and basic and application software. The project has been financed by the Brazilian government agencies FINEP and CNPq/RHAE. An Intel iPSC/860 parallel computer, with hypercube topology and eight processing nodes, has been used in this project. An IBM/SP-2 parallel computer with eight CPUs and a Cray J90/8 supercomputer are in the process of acquisition by COPPE and are planned to become available for research activities in early 1995. As a result of this project, several software methodologies and programs for parallel processing have been developed; the ones related to power system applications have already been mentioned in previous sections of this paper. Also, a prototype of a parallel machine based on the Transputer T800 processor was built and has been used in the development of applications.
CEPEL and UNICAMP started their activities in this research area using a parallel machine known as the PP (Preferential Processor), developed at the research center of the Brazilian telecommunication companies (CPqD/TELEBRAS). This machine is based on a hybrid shared/distributed memory architecture and the Intel 80286 processor. A few prototypes of this machine were built, some of which are still in operation, but the project was later discontinued.
CEPEL developed a parallel programming environment for the PP, based on the MS-DOS operating system extended with facilities for communication and synchronization, which was used in power system applications such as multi-area reliability evaluation, hydrothermal production costing and reliability evaluation, and security constrained optimal power flow [59, 60]. This programming environment was also used at UNICAMP to develop a parallel methodology to solve sets of network equations based on the matrix inverse factors (W-matrix) approach [29]. After the discontinuation of the PP project, the UNICAMP power system group acquired an nCUBE-2 with 64 processors that has been used in research projects on security constrained optimal power flow, probabilistic short-circuit analysis, etc. [18, 61]. CEPEL is also developing a long-term project on the application of distributed architectures and open system concepts to EMS design [20].
The availability of parallel machines for research and development activities in Brazilian institutions has also increased considerably in the last few years.
7 Concluding Remarks

Parallel processing may be the only way to make viable some power system applications requiring high performance computing, like real-time dynamic security assessment, security constrained optimal power flow, real-time simulation of electromagnetic and electromechanical transients, composite reliability assessment using realistic models, etc. Parallel computers are presently available in a price range compatible with power system applications and offering the required computation power.

Two main factors are still impairments to the wide acceptance of these machines in power system applications: the requirement for reprogramming or redevelopment of applications, and the uncertainty about which parallel architecture will prevail. The first problem is inevitable, as automatic parallelization tools are not likely to become practical in the near future, but it has been minimized by the research effort in parallel algorithms and by the availability of more efficient programming tools. The second difficulty is destined to disappear as parallel computers leave the laboratories and arrive in the showrooms of computer manufacturers.

As in the history of sequential computer evolution, a unique and overwhelming solution to the parallel computer architecture problem is not to be expected. It is more likely that a few different architectures will be successful in the next few years and that users will have to decide which one is the most adequate for their application. Moreover, it is probable that commercial processing applications, which are now turning towards parallel processing, are the ones that will shape the future parallel computer market. However, making this scenario a little less uncertain, it should be pointed out that the parallel computer industry tends to make its products follow open system standards, which raises the possibility of developing applications less dependent on a particular architecture.

Distributed processing is a prevailing technology in many information processing applications. Combined with the concept of open systems, it offers a better performance/cost ratio, more reliability and flexibility, scalability, etc. In the power utility environment, it has been used in distribution automation and is substituting for the minis and mainframes in the supervisory and control centers. In the future, networks of microcomputers and workstations will probably integrate operation and planning information systems with accounting and billing processing systems, allowing direct information transfer and resource sharing.

The most likely scenario for the electric utility information processing system of the future is a conglomerate of local area networks serving specific functions (control centers, expansion and operational planning, billing, substation control, etc.), built around standard hardware and software, in which each computer has specific characteristics, like sophisticated graphic terminals, data base servers, etc. Some of the computers may actually be parallel machines acting as number crunching servers for highly intensive numerical applications. These local area networks will be joined together by a wide area network connecting computers in the corporation headquarters and spread throughout the whole area serviced by the utility.

Table 3 gives a partial list of parallel computers and supercomputers available in Brazil. Most of these machines are open to use by the academic community and accessible via computer communication networks.

Table 3: Parallel computers and supercomputers available in Brazil

Make/Model        Number of CPUs   Institution
Intel iPSC/860    8                COPPE
nCUBE-2           64               UNICAMP
IBM SP-1          16               LNCC
IBM SP-1                           UNICAMP
Cray YMP          2                UFRGS
Cray J90/8        8                COPPE*
IBM SP-2          8                COPPE*
* Available in early 1995.

Acknowledgments

Most of the ideas contained in this paper are the result of a long and fruitful collaboration with Prof. E. Kaszkurewicz and Prof. A. Bhaya (COPPE) and our present and former students I.C. Decker, H.L.S. Almeida, M.H.M. Vale, B. Barán, J.M. Campagnolo, C.L.T. Borges, and D.Q. Siqueira. More recent work has been developed in collaboration with Prof. F.F. Wu and Dr. L. Murphy (U.C. Berkeley, USA) in parallel and distributed state estimation, and with Dr. Nelson Martins (CEPEL) and Prof. J.L.R. Pereira (UFJF) in parallel small-signal stability assessment. Part of this paper's text and some figures were extracted from theses and papers written by or with the people referred to above.

References

[1] J.V. Mittsche, "Stretching the Limits of Power System Analysis", IEEE Computer Applications in Power, vol. 6, no. 1, pp. 16-21, January 1993.

[2] IEEE Spectrum, Special Issue: Supercomputers, September 1992.

[3] T.G. Lewis and H. El-Rewini, Introduction to Parallel Computing, Prentice Hall, New York, 1992.

[4] M.J. Quinn, Parallel Computing: Theory and Practice, McGraw-Hill, New York, 1994.
[5] D.J. Tylavsky, A. Bose, et al., "Parallel Processing in Power System Computation", IEEE Transactions on Power Systems, vol. 7, no. 2, pp. 629-637, May 1992.

[6] D.P. Bertsekas and J.N. Tsitsiklis, Parallel and Distributed Computation, Prentice Hall, New York, 1989.

[7] A. Umar, Distributed Computing: A Practical Synthesis, Prentice-Hall, New Jersey, 1993.

[8] B. Stott, "Review of Load Flow Calculation Methods", Proceedings of the IEEE, vol. 62, pp. 916-929, July 1974.

[9] B. Stott, "Power System Dynamic Response Calculations", Proceedings of the IEEE, vol. 67, no. 2, pp. 219-241, February 1979.

[10] H.W. Dommel and W.S. Meyer, "Computation of Electromagnetic Transients", Proceedings of the IEEE, vol. 62, pp. 983-993, July 1974.

[11] B. Stott, O. Alsac, and A. Monticelli, "Security Analysis and Optimization", Proceedings of the IEEE, vol. 75, no. 12, pp. 1623-1644, December 1987.

[12] N.J. Balu, et al., "On-Line Power System Security Analysis", Proceedings of the IEEE, vol. 80, no. 2, pp. 262-280, February 1992.

[13] Y. Sekine, K. Takahashi, and T. Sakaguchi, "Real-Time Simulation of Power System Dynamics", Proceedings of the 11th Power Systems Computation Conference, Avignon, France, Aug. 30 - Sep. 3, 1993.

[14] H. Taoka, I. Iyoda, H. Noguchi, N. Sato, and T. Nakazawa, "Real-Time Digital Simulator for Power System Analysis on a Hypercube Computer", IEEE Transactions on Power Systems, vol. 7, no. 1, pp. 1-10, February 1992.

[15] D. Brandt, R. Wachal, R. Valiquette, and R. Wierckx, "Closed Loop Testing of a Joint VAR Controller Using a Digital Real-Time Simulator", IEEE Transactions on Power Systems, vol. 6, no. 3, pp. 1140-1146, August 1991.

[16] M.V.F. Pereira and N.J. Balu, "Composite Generation/Transmission Reliability Evaluation", Proceedings of the IEEE, vol. 80, no. 4, pp. 470-491, April 1992.

[17] A.M. Leite da Silva, J. Endrenyi, and L. Wang, "Integrated Treatment of Adequacy and Security in Bulk Power System Reliability Evaluation", IEEE Transactions on Power Systems, vol. 8, no. 1, pp. 275-285, February 1993.

[18] F. Sato, A.V. Garcia, and A. Monticelli, "Parallel Implementation of Probabilistic Short-Circuit Analysis by the Monte Carlo Approach", Proceedings of the 18th Power Industry Computer Applications Conference, Scottsdale, AZ, May 1993.

[19] IEEE Tutorial Course on Fundamentals of Supervisory Systems, Publication 91EH0337-6-PWR, 1991.

[20] L.C. Lima et al., "Design and Development of an Open EMS", Proceedings of the Joint International Power Conference (Athens Power Tech), Athens, Greece, September 1993.

[21] IEEE Tutorial Course on Distribution Automation, Publication 88H0280-8-PWR, 1988.

[22] W.R. Cassel, "Distribution Management Systems: Functions and Payback", IEEE Transactions on Power Systems, vol. 8, no. 3, pp. 796-801, August 1993.

[23] L. Murphy and F.F. Wu, "An Open Design Approach for Distributed Energy Management Systems", IEEE Transactions on Power Systems, vol. 8, no. 3, pp. 1172-1179, August 1993.
[24] M.H.M. Vale, D.M. Falcão, and E. Kaszkurewicz, "A Semiautomatic Network Decomposition Method for Block-Iterative Solutions on Parallel Computers", submitted to the 1995 IEEE Power Industry Computer Applications Conference, Salt Lake City, USA, May 1995.

[25] W.L. Hatcher, F.M. Brasch, and J.E. Van Ness, "A Feasibility Study for the Solution of Transient Stability Problems by Multiprocessor Structures", IEEE Transactions on Power Apparatus and Systems, vol. 96, no. 6, pp. 1789-1797, Nov./Dec. 1977.

[26] F.M. Brasch, J.E. Van Ness, and S.C. Kang, "Simulation of a Multiprocessor Network for Power System Problems", IEEE Transactions on Power Apparatus and Systems, vol. 101, no. 2, pp. 295-301, 1982.

[27] J.S. Chai, N. Zhu, A. Bose, and D.J. Tylavsky, "Parallel Newton Type Methods for Power System Stability Analysis Using Local and Shared Memory Multiprocessors", IEEE Transactions on Power Systems, vol. 6, no. 4, pp. 1539-1545, November 1991.

[28] J.S. Chai and A. Bose, "Bottlenecks in Parallel Algorithms for Power System Stability Analysis", IEEE Transactions on Power Systems, vol. 8, no. 1, pp. 9-15, February 1993.

[29] A. Padilha and A. Morelato, "A W-Matrix Methodology for Solving Sparse Network Equations on Multiprocessor Computers", IEEE Transactions on Power Systems, vol. 7, no. 3, pp. 1023-1030, August 1992.

[30] A.F.C. Canto and J.L.R. Pereira, "Solução de Equações da Rede Elétrica Utilizando Processamento Sequencial e Paralelo", Anais do 10º Congresso Brasileiro de Automática, Rio de Janeiro, RJ, September 1994.

[31] I.C. Decker, D.M. Falcão, and E. Kaszkurewicz, "An Efficient Parallel Method for Transient Stability Analysis", Proceedings of the 10th Power Systems Computation Conference, Graz, Austria, pp. 509-516, August 1990.

[32] I.C. Decker, D.M. Falcão, and E. Kaszkurewicz, "Parallel Implementation of a Power System Dynamic Simulation Methodology Using the Conjugate Gradient Method", IEEE Transactions on Power Systems, vol. 7, no. 1, pp. 458-465, February 1992.

[33] I.C. Decker, D.M. Falcão, and E. Kaszkurewicz, "Conjugate Gradient Methods for Power System Dynamic Simulation in Parallel Computers", submitted to the 1995 IEEE PES Summer Meeting.
[34] M.H.M. Vale, D.M. Falcão, and E. Kaszkurewicz, "Electrical Power Network Decomposition for Parallel Computations", IEEE Symposium on Circuits and Systems, San Diego, CA, pp. 2761-2764, May 1992.

[35] M. Ilić-Spong, M.L. Crow, and M.A. Pai, "Transient Stability Simulation by Waveform Relaxation Methods", IEEE Transactions on Power Systems, vol. 2, no. 4, pp. 943-952, November 1987.

[36] M.L. Crow and M. Ilić, "The Parallel Implementation of the Waveform Relaxation Method for Transient Stability Simulations", IEEE Transactions on Power Systems, vol. 5, no. 3, pp. 922-932, August 1990.

[37] F. Alvarado, "Parallel Solution of Transient Problems by Trapezoidal Integration", IEEE Transactions on Power Apparatus and Systems, vol. 98, no. 3, pp. 1080-1090, May/June 1979.

[38] M. La Scala, A. Bose, D.J. Tylavsky, and J.S. Chai, "A Highly Parallel Method for Transient Stability Analysis", IEEE Transactions on Power Systems, vol. 5, no. 4, pp. 1439-1446, November 1990.

[39] M. La Scala, M. Brucoli, F. Torelli, and M. Trovato, "A Gauss-Jacobi-Block Newton Method for Parallel Transient Stability Analysis", IEEE Transactions on Power Systems, vol. 5, no. 4, pp. 1168-1177, November 1990.

[40] J.M. Ortega, Introduction to Parallel and Vector Solution of Linear Systems, Plenum Press, New York, 1988.

[41] J.J. Dongarra, I.S. Duff, D.C. Sorensen, and H.A. van der Vorst, Solving Linear Systems on Vector and Shared Memory Computers, SIAM, 1991.

[42] H.A. van der Vorst, "Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems", SIAM J. Sci. Stat. Comput., vol. 13, no. 2, pp. 631-644, March 1992.

[43] D.M. Falcão, E. Kaszkurewicz, and H.L.S. Almeida, "Application of Parallel Processing Techniques to the Simulation of Power System Electromagnetic Transients", IEEE Transactions on Power Systems, vol. 8, no. 1, pp. 90-96, February 1993.

[44] N. Martins, "Efficient Eigenvalue and Frequency Response Methods Applied to Power System Small-Signal Stability Studies", IEEE Transactions on Power Systems, vol. 1, no. 1, pp. 217-226, February 1986.

[45] N. Martins, L.T.G. Lima, H.J.C.P. Pinto, and N.J.P. Macedo, "A State-of-the-Art Computer Program Package for the Analysis and Control of Small Signal Stability of Large AC/DC Power Systems", Proceedings of the IERE Workshop on New Issues in Power System Simulation, Caen, France, pp. 11-19, March 1992.

[46] J.M. Campagnolo, N. Martins, J.L.R. Pereira, L.T.G. Lima, H.J.C.P. Pinto, and D.M. Falcão, "Fast Small-Signal Stability Assessment Using Parallel Processing", IEEE Transactions on Power Systems, vol. 9, no. 2, May 1994.

[47] J.M. Campagnolo, N. Martins, and D.M. Falcão, "An Efficient and Robust Eigenvalue Method for Small-Signal Stability Assessment in Parallel Computers", presented at the 1994 IEEE PES Summer Meeting, San Francisco, CA, July 1994.
[48] D.M. Falcão, F.F. Wu, and L. Murphy, "Parallel and Distributed State Estimation", presented at the 1994 IEEE PES Summer Meeting, San Francisco, CA, July 1994.

[49] D.Q. Siqueira, Power System Static Contingency Analysis Using Parallel Processing, M.Sc. Thesis, COPPE, 1991 (in Portuguese).

[50] D.P. Pinto and J.L.R. Pereira, "Um Método Integrado para Análise de Contingências Usando Processamento Paralelo", Anais do 10º Congresso Brasileiro de Automática, Rio de Janeiro, RJ, September 1994.

[51] C.L.T. Borges, Performance Assessment of Power Flow Solution Methods for Parallel and Vector Processing, M.Sc. Thesis, COPPE, 1991 (in Portuguese).

[52] Y.P. Dusonchet, S.N. Talukdar, H.E. Sinnot, and A.H. El-Abiad, "Load Flows Using a Combination of Point Jacobi and Newton's Method", IEEE Transactions on Power Apparatus and Systems, vol. 90, pp. 941-949, 1971.

[53] S.N. Talukdar, S.S. Pyo, and R. Mehrotra, "Design Algorithms and Assignments for Distributed Processing", EPRI Report EL-9917, 1983.

[54] R. Mehrotra and S.N. Talukdar, "Task Scheduling on Multiprocessors for Power System Problems", IEEE Transactions on Power Apparatus and Systems, vol. 102, pp. 3590-3597, 1983.

[55] S.N. Talukdar, S.S. Pyo, and T.C. Giras, "Asynchronous Procedures for Parallel Processing", IEEE Transactions on Power Apparatus and Systems, vol. 102, pp. 3652-3659, 1983.

[56] B. Barán, A Study of Asynchronous Team Algorithms, D.Sc. Thesis, COPPE, 1993 (in Portuguese).

[57] B. Barán, E. Kaszkurewicz, and A. Bhaya, "Parallel Asynchronous Team Algorithms: Convergence and Performance Analysis", submitted to SIAM Journal on Scientific Computing, 1994.

[58] B. Barán, E. Kaszkurewicz, and D.M. Falcão, "Team Algorithms in Distributed Load Flow Computations", submitted to the 1995 IEEE PES Winter Meeting.

[59] M.J. Teixeira, H.J.C. Pinto, M.V.F. Pereira, and M.F. McCoy, "Developing Concurrent Processing Applications to Power System Planning and Operations", IEEE Transactions on Power Systems, vol. 5, no. 2, pp. 659-664, May 1990.

[60] H.J.C. Pinto, M.V.F. Pereira, and M.J. Teixeira, "New Parallel Algorithms for the Security Constrained Dispatch with Post-Contingency Corrective Actions", Proceedings of the 10th Power Systems Computation Conference, Graz, Austria, August 1990.

[61] M. Rodrigues, O.R. Saavedra, and A. Monticelli, "Asynchronous Programming Model for the Concurrent Solution of the Security Constrained Optimal Power Flow Problem", IEEE/PES Summer Meeting, Vancouver, Canada, July 1993.