Lecture 1

THE EQUATION ORIENTED STRATEGY FOR PROCESS FLOWSHEETING

Paul I. Barton
Department of Chemical Engineering
Massachusetts Institute of Technology

Cambridge, MA 02139

copyright Paul I. Barton, March 2000

The equation oriented, simultaneous or equation based strategy is the major competing technology to the sequential modular strategy. This strategy is much easier to describe than to realize in a practical software implementation: rather than tearing recycle streams and solving unit models in a modular fashion, an equation oriented simulator assembles all the equations describing a process model together and attempts to solve them simultaneously with a general purpose multi-dimensional root finding code.

One (largely abandoned) approach to general purpose root finding is known as tearing. This tearing at the equation level should not be confused with the tearing at the flowsheet level performed by a sequential modular simulator. In fact, the sequential modular approach can be best interpreted as a decomposition approach in which a two level nested iteration is established. The outer iteration uses a tearing approach to converge, and the inner iteration uses the unit module specific methods to converge. Further, the variables in the outer iteration (the tear variables) are chosen so that the inner iteration breaks down into a series of subproblems (corresponding to the unit modules) that can be solved sequentially. Even further levels of iterative nesting are introduced by the subroutine libraries that calculate physical and transport properties. For example, each bubble temperature calculation during each flowsheet pass typically requires a Newton-Raphson type iteration.

There is one further flowsheeting technology called the simultaneous modular approach. As the name suggests, this lies somewhere between the equation oriented and sequential modular extremes. Broadly speaking, more and more variables are moved from the inner iteration into the outer iteration and more sophisticated techniques are used for convergence and derivative evaluation in the outer iteration. For example, the generalized executive controller with multiple tear streams and design specification, and Broyden’s method for convergence, can be interpreted as a simultaneous modular approach. Other researchers have proposed the use of equation oriented solution techniques and simplified models (e.g., linear, ideal) in the outer loop, and the use of rigorous models in the inner loop to update parameters for the simplified model. In the case of linear models in the outer loop, this can be interpreted as a derivative evaluation strategy. Most modern flowsheeting packages have at least options that support a more simultaneous modular approach.

EQUATION ORIENTED FLOWSHEETING PACKAGES

In more detail, the equation oriented strategy is composed of the following steps:

(i) equations and variables for all the unit operation models in the flowsheet are defined individually (hence the model is still decomposed in a modular fashion, even if the solution strategy does not exploit this).

(ii) these equations and variables are then assembled together into one large system of nonlinear equations.

(iii) additional specifications are added until the degrees of freedom for the equation system are zero and a well posed mathematical problem remains.

(iv) the equation system is solved directly and simultaneously by a general purpose root finding code.

Figure 1 shows how these steps are implemented in the ABACUSS II system.

Clearly, the practicality of this approach hinges on the capabilities of the root finding code to solve a very broad class of problems. Hence, the development of such codes has received a large amount of academic attention (of course, this is entirely unrelated to the fact that this problem is inherently more interesting to academics!).

This approach is clearly more elegant and intellectually satisfying, and has many potential advantages:

(i) it is much more efficient. While the sequential modular approach appeared very efficient for simulation specifications on unidirectional flowsheets (we will see later that the equation oriented approach can take even greater advantage of such problems), as soon as recycles or design specifications are introduced, it becomes prohibitively expensive by comparison due to the need for repeated passes through the flowsheet, and the nested convergence at the unit operation level.

(ii) the artificial distinction between simulation and design specification sets is removed. In general, provided a specification set leads to a well posed problem, there is little difference in computation load between specification sets (unless linearity or the block decomposition are significantly altered from specification set to specification set). Hence, for design problems in particular, the equation oriented approach is much more efficient.

(iii) as an equation oriented simulator really does view unit operation models just as a set of variables and equations, it is much easier to extend the model library and modify existing models (i.e., it is not necessary to provide a subroutine with a system of equations and a solution procedure embedded). In some simulators this has led to the design of high level engineering oriented simulation languages that allow engineers to specify models merely as sets of equations and variables. These languages are effectively high level computer programming languages designed for this specific purpose. Note that they are primarily declarative (i.e., they are used to specify information) as opposed to procedural (i.e., they define a sequence of instructions to be executed). An example of a unit model coded in one of these languages is appended to these notes.

(iv) the previous point leads into the primary advantage of this technology, and why it has not been abandoned: it is much more readily extended to other calculations such as dynamic simulation or flowsheet optimization. As the architecture of the simulator completely decouples the model from particular solution routines, it can make sets of equations and variables directly accessible to several different numerical solution routines. Equation oriented simulators such as ABACUSS II are the closest that current technology comes to realizing the process modelling environment introduced in an earlier lecture.

(v) diagnosis of certain errors in problem formulation is much easier; e.g., if the specification set makes the problem badly posed, but this problem is not localized to a single unit, an equation-oriented simulator can analyze the entire equation system for problems such as singularity.

Given all these advantages, why are we bothering to learn about the sequential modular technology? Clearly there are disadvantages:

(i) the general purpose nonlinear equation solvers are not as robust and reliable as the sequential modular approach. In fact, for most relatively complex models one has to virtually know the answer in order to get ABACUSS II and its counterparts to converge. Again, an evolutionary approach such as modelling and converging the components of a system before attacking the overall system model can help tremendously.

(ii) it makes much larger demands on computer resources, particularly machine memory (big matrices need to be stored). However, from about the late 70’s onwards advances in hardware (e.g., the increasing availability of large amounts of memory) and numerical analysis (e.g., sparse unstructured linear algebra) have removed these objections.

In light of the first deficiency, the equation oriented approach is not currently a viable competitor to the sequential modular approach for steady-state process simulation.

We will be using the ABACUSS II equation oriented process simulator, primarily for dynamic simulation of chemical processes. ABACUSS II has been developed by my students and myself.

[Figure 1 diagram elements: User Input, Language Translator, Data Structures (illustrated with sample model equations), Symbolic Manipulation, Numerical Algorithms, Simulation Executive, Results]

Figure 1: Cycle of Problem Execution in ABACUSS II

The commercial products you are most likely to encounter are Aspen Custom Modeler (formerly called SpeedUp) and gPROMS. SpeedUp was developed over a period of 25 years at Imperial College in London. It first appeared in 1986 as a commercial product, and over the next five years it was widely adopted not for steady-state simulation, but for dynamic simulation and to a lesser extent flowsheet optimization. Until the recent commercialization of gPROMS by a spin-off company also from Imperial College, it has not had a serious competitor in the dynamic simulation market. The equation oriented approach is inherently better suited to dynamic simulation, and, although many have tried, there are fundamental theoretical problems with extending the sequential modular approach to dynamic simulation. I developed the first version of gPROMS during my Ph.D. thesis at Imperial College. All three systems (ABACUSS II, gPROMS and SpeedUp) share a common intellectual heritage, so once you can use one, it is relatively easy to transfer to the others.

MULTI DIMENSIONAL NEWTON’S METHOD

From the above discussion, it is clear that the heart of any equation oriented simulator is the multidimensional root finding code. This is the key to the viability of the technology. It is ironic, particularly in light of the effort devoted to this problem in the 70’s and 80’s, that at the moment, the leading commercial products rely on a slight modification of a 17th century algorithm. As we will be using this algorithm a lot, it is worthwhile studying it in more detail.

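Basically, Newton’s method comes from the Taylor series expansion of a vector of functions in terms of a vector of variables. The solution $\mathbf{x}^*$ is expressed as an expansion around the current estimate $\mathbf{x}^k$. For each function $i$ this leads to:

$$f_i(\mathbf{x}^*) = f_i(\mathbf{x}^k) + \left.\frac{\partial f_i}{\partial x_1}\right|_{\mathbf{x}^k}(x_1^* - x_1^k) + \left.\frac{\partial f_i}{\partial x_2}\right|_{\mathbf{x}^k}(x_2^* - x_2^k) + \cdots + \left.\frac{\partial f_i}{\partial x_n}\right|_{\mathbf{x}^k}(x_n^* - x_n^k) + \text{H.O.T.} \qquad \forall i = 1 \ldots n$$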

where H.O.T. denotes quadratic and higher terms in the expansion. This can also be expressed in matrix notation as:

$$\mathbf{f}(\mathbf{x}^*) = \mathbf{f}(\mathbf{x}^k) + \mathbf{J}(\mathbf{x}^k)(\mathbf{x}^* - \mathbf{x}^k) + \text{H.O.T.}$$

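where $\mathbf{J}$ is the now familiar Jacobian matrix of partial derivatives:

$$\mathbf{J} = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_N} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & & \vdots \\
\vdots & & \ddots & \vdots \\
\dfrac{\partial f_N}{\partial x_1} & \cdots & \cdots & \dfrac{\partial f_N}{\partial x_N}
\end{pmatrix}$$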

Each row contains the gradient vector of the corresponding function.

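If the above expansion is curtailed at the first order (or linear) term, we end up with the following linear approximation to the functions in the region of the current estimate:

$$\mathbf{f}(\mathbf{x}) \approx \mathbf{f}(\mathbf{x}^k) + \mathbf{J}(\mathbf{x}^k)(\mathbf{x} - \mathbf{x}^k)$$

We are interested in finding the point $\mathbf{x}^*$ at which $\mathbf{f}(\mathbf{x}^*) = \mathbf{0}$. The above linear approximation can be used to estimate this point:

$$\mathbf{f}(\mathbf{x}^*) = \mathbf{0} \approx \mathbf{f}(\mathbf{x}^k) + \mathbf{J}(\mathbf{x}^k)(\mathbf{x}^{k+1} - \mathbf{x}^k)$$

where $\mathbf{x}^{k+1}$ is the updated estimate for the solution. This leads to the following system of linear equations that define the iterative process:

$$\mathbf{J}(\mathbf{x}^k)(\mathbf{x}^{k+1} - \mathbf{x}^k) = -\mathbf{f}(\mathbf{x}^k)$$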

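Example:

$$\begin{aligned}
f_1(\mathbf{x}) &= x_1 x_2 - \tfrac{1}{2} x_2^2 = 0 \\
f_2(\mathbf{x}) &= x_2^2 - 2 x_1 x_2 + x_1^2 - 2 = 0
\end{aligned}$$

Partial derivatives:

$$\frac{\partial f_1}{\partial x_1} = x_2 \qquad \frac{\partial f_1}{\partial x_2} = x_1 - x_2$$

$$\frac{\partial f_2}{\partial x_1} = 2x_1 - 2x_2 \qquad \frac{\partial f_2}{\partial x_2} = 2x_2 - 2x_1$$

So,

$$\mathbf{J} = \begin{pmatrix} x_2 & x_1 - x_2 \\ 2x_1 - 2x_2 & 2x_2 - 2x_1 \end{pmatrix}$$

and if $\mathbf{x}^k = (1, 2)^T$:

$$\begin{pmatrix} 2 & -1 \\ -2 & 2 \end{pmatrix} \begin{pmatrix} x_1^{k+1} - 1 \\ x_2^{k+1} - 2 \end{pmatrix} = -\begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

Solve these linear equations to get a new estimate for the solution.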

Algorithm: Multidimensional Newton's Method:

1. $k := 0$; guess $\mathbf{x}^0$

2. evaluate all functions (or residuals) - i.e. calculate $\mathbf{f}(\mathbf{x}^k)$

3. check for convergence. For example, if $\|\mathbf{f}(\mathbf{x}^k)\| \le \varepsilon$ then STOP.

4. evaluate the Jacobian (or derivative evaluation) - i.e. calculate $\mathbf{J}(\mathbf{x}^k)$

5. assemble the system of n linear equations:

$$\mathbf{J}(\mathbf{x}^k)\,\mathbf{p}^k = -\mathbf{f}(\mathbf{x}^k)$$

and solve with any suitable method for $\mathbf{p}^k$.

6. $\mathbf{x}^{k+1} := \mathbf{x}^k + \mathbf{p}^k$

7. $k := k + 1$. Repeat from step 2.
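The seven steps can be sketched in a few lines of Python for the two-equation example above. This is only an illustration, not how ABACUSS II is implemented: the function names (`f`, `jacobian`, `solve_2x2`, `newton`), the tolerance and the use of Cramer's rule for the 2 by 2 linear system are all assumptions made for the sketch, and a real code would use a sparse linear solver in step 5.

```python
import math

def f(x):
    """Residuals of the two-equation example (step 2)."""
    x1, x2 = x
    return [x1 * x2 - 0.5 * x2 ** 2,
            x2 ** 2 - 2.0 * x1 * x2 + x1 ** 2 - 2.0]

def jacobian(x):
    """Analytical Jacobian; each row is the gradient of one function (step 4)."""
    x1, x2 = x
    return [[x2, x1 - x2],
            [2.0 * x1 - 2.0 * x2, 2.0 * x2 - 2.0 * x1]]

def solve_2x2(a, b):
    """Solve the 2x2 linear system a p = b by Cramer's rule (step 5)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ZeroDivisionError("Jacobian is numerically singular")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def newton(x0, eps=1e-10, max_iter=50):
    """Basic (undamped) Newton iteration; returns the solution and all iterates."""
    x = list(x0)
    history = [list(x)]
    for _ in range(max_iter):
        r = f(x)
        if max(abs(v) for v in r) <= eps:       # step 3, max norm
            return x, history
        p = solve_2x2(jacobian(x), [-v for v in r])
        x = [xi + pi for xi, pi in zip(x, p)]   # step 6
        history.append(list(x))
    raise RuntimeError("Newton iteration did not converge")

solution, iterates = newton([1.0, 2.0])
print(iterates[1])   # first update from the initial guess (1, 2)
print(solution)
```

Starting from $(1, 2)^T$ the first linear solve gives the step $(0.5, 1)^T$, so the first update is $(1.5, 3)^T$; the iteration then settles on the root $(\sqrt{2}, 2\sqrt{2})^T$.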

Convergence of this iterative process is very fast close to the solution; the number of correct significant digits will double at each iteration. Current codes can converge several thousand equations in a few seconds. The basic Newton method is usually modified in the following manner:

• any updated $\mathbf{x}^{k+1}$ must obey the physical bounds on the values of variables, so only a fraction of the full Newton step may be taken to ensure these bounds are not violated:

$$\mathbf{x}^{k+1} := \mathbf{x}^k + \alpha\,\mathbf{p}^k$$

where $0 < \alpha \le 1$. This is not to be confused with “trust region” strategies to improve convergence.

• the systems of nonlinear equations to be solved in process flowsheeting are usually large, sparse and unstructured, hence the use of appropriate sparse linear equation solvers for step 5 (see Duff et al., 1986).

• modifications to the iterative process are required if the Jacobian becomes numerically singular at one or more iterates (see below).

• individual elements of the Jacobian may be evaluated numerically (i.e., by a finite difference approximation) or analytically (i.e., from an analytical expression derived by partial differentiation of the model equations with respect to the variable in question). In most applications, this leads to a sparse, hybrid Jacobian composed of some numerical and analytical entries. A quasi Newton update may be applied on successive iterations to the numerical Jacobian entries. A novel feature of ABACUSS II is that it derives analytical expressions for most of the partial derivatives automatically from the equations input by the user via what is called automatic differentiation.
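Two of these modifications are easy to illustrate: limiting the step so that bounds are respected, and approximating Jacobian entries by finite differences. The sketch below is an assumption-laden illustration rather than the scheme used by any particular simulator: `step_fraction` and `fd_jacobian` are hypothetical helper names, the bounds are invented for the demonstration, and a forward difference with a relative perturbation `h` is just one common choice.

```python
def f(x):
    """Residuals of the two-equation example from the Newton's method section."""
    x1, x2 = x
    return [x1 * x2 - 0.5 * x2 ** 2,
            x2 ** 2 - 2.0 * x1 * x2 + x1 ** 2 - 2.0]

def step_fraction(x, p, lower, upper):
    """Largest alpha <= 1 such that x + alpha * p stays inside [lower, upper].

    Assumes x itself lies strictly inside the bounds, so that alpha > 0.
    """
    alpha = 1.0
    for xi, pi, lo, hi in zip(x, p, lower, upper):
        if pi > 0.0 and xi + pi > hi:
            alpha = min(alpha, (hi - xi) / pi)
        elif pi < 0.0 and xi + pi < lo:
            alpha = min(alpha, (lo - xi) / pi)
    return alpha

def fd_jacobian(func, x, h=1e-7):
    """Forward-difference Jacobian: one extra function call per variable."""
    f0 = func(x)
    jac = [[0.0] * len(x) for _ in f0]
    for j in range(len(x)):
        step = h * max(1.0, abs(x[j]))   # scale the perturbation with x_j
        xp = list(x)
        xp[j] += step
        fj = func(xp)
        for i in range(len(f0)):
            jac[i][j] = (fj[i] - f0[i]) / step
    return jac

x_k = [1.0, 2.0]
p_k = [0.5, 1.0]   # full Newton step for the example at x_k
# Hypothetical bounds: x2 may not exceed 2.5, so the full step is cut back.
alpha = step_fraction(x_k, p_k, lower=[0.0, 0.0], upper=[10.0, 2.5])
x_next = [xi + alpha * pi for xi, pi in zip(x_k, p_k)]
jac_fd = fd_jacobian(f, x_k)
print(alpha, x_next)
print(jac_fd)   # close to the analytical Jacobian [[2, -1], [-2, 2]]
```

With these invented bounds only half of the Newton step can be taken. The finite difference entries agree with the analytical ones only to roughly the size of the perturbation, which is one reason analytical derivatives are preferred where they are available.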

The success of this iterative process hinges on the quality of the initial guesses. Newton’s method is what is called a locally convergent method. Put simply, this means that the iterative process will only converge if it is started from an initial guess within a certain region around the solution. If the initial guess is outside this region, a solution will not be found or the iterative process will converge to another root. Unfortunately, there is no way to tell a priori if an initial guess is in this region of convergence. However, the closer an initial guess is to the solution, the more likely it is that convergence will be achieved. Hence providing good initial guesses is the key to success.

BLOCK DECOMPOSITION - PARTITIONING AT THE EQUATION LEVEL

We have already seen how the strongly connected components algorithm can be applied to a process flowsheet represented as a digraph to identify those minimal subsets of units that must be solved simultaneously, and the sequence in which those subsets must be solved. In this section we will discuss algorithmic approaches to identifying a similar partitioning and precedence ordering applied directly to the model equations. This can be viewed as a finer grained decomposition than that possible at the unit operation level.

As usual, the algorithm is applied to the incidence matrix. The objective is to identify independent equation (row) and variable (column) permutations so that the permuted incidence matrix becomes block lower triangular with minimal blocks (i.e., the square diagonal blocks cannot be decomposed further by row and/or column permutations). This block lower triangular structure is illustrated in Figure 2.

More formally, the objective is to identify a variable (column) permutation, i.e.:

$$\mathbf{x} = \mathbf{Q}\mathbf{y}$$

and an equation permutation, i.e.:

$$\mathbf{g}(\mathbf{y}) = \mathbf{P}\,\mathbf{f}(\mathbf{x})$$

where $\mathbf{Q}$ and $\mathbf{P}$ are permutation matrices, such that the incidence matrix corresponding to the permuted Jacobian:

$$\frac{\partial \mathbf{g}}{\partial \mathbf{y}} = \mathbf{P}\,\frac{\partial \mathbf{f}}{\partial \mathbf{x}}\,\mathbf{Q}$$

has a block lower triangular structure (Figure 2) with minimal blocks (i.e., the square diagonal blocks cannot be decomposed further by row and/or column permutations). Note that a nonzero entry $ij$ in the incidence matrix is equivalent to the existence of the partial derivative $J_{ij}$, hence operations on the incidence matrix are sufficient to identify this permutation.

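[Figure 2 shows the permuted Jacobian $\partial \mathbf{g} / \partial \mathbf{y}$ in block lower triangular form]

Figure 2: Illustration of Block Triangular Decomposition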

The advantage of discovering this permutation is that it has decomposed the system of equations into the sequential solution of a series of subproblems. The first square diagonal block of equations can be solved for the variables appearing in these equations independently of all the other equations in the system (because these equations are fully determined in this subset of the variables). Once these variables are calculated, they can be substituted into the equations defining the second block and the second block can be solved for the corresponding variables. The calculation sequence continues in this manner until the whole system is solved.

To elaborate, block decomposition has the following advantages:

• a large system of equations is decomposed into a sequence of smaller systems of equations (faster computation, less computer memory required at each step).

• the partial solution substituted into later blocks in the decomposition improves robustness of the locally convergent algorithms applied to each block - this can be viewed as having exact initial guesses for already calculated variables.

• some of the blocks may be linear (and hence can be solved directly) or blocks involving just one variable/equation (for which root finding procedures are much more robust).

• derivative evaluation for the off diagonal blocks is never required.

Users will see ABACUSS II automatically block decomposing a problem and solving the blocks in a sequence.

Given what we have learnt already, the algorithm to derive a block decomposition is easy to describe. We will describe a two step strategy based on the most efficient algorithm for each step:

1. Duff’s algorithm is applied to obtain a transversal. If the system is structurally singular, terminate. Otherwise a row permutation has been identified such that nonzero entries lie on the diagonal.

2. Use the transversal identified in step 1 to derive the directed graph representation of the system of equations. Apply Tarjan’s algorithm to this graph to identify the strongly connected components and a precedence order. The strongly connected components are equivalent to the diagonal blocks in the block decomposition.

Clearly, the fact that Duff’s algorithm is applied in step 1 implies that we get a check on the structural singularity of the system for free. The fact that the diagonal blocks correspond to the strongly connected components of the graph demonstrates the essential uniqueness of the block triangular form. Permutations of equations or variables within a block (strongly connected component) do not alter the decomposition, and although there may be limited alternative orderings for the blocks (precedence orderings) the fact that the blocks are strongly connected components indicates that the number or size of blocks cannot be reduced further.

Example: solve the system of 5 nonlinear equations:

$$\begin{aligned}
f_1(\mathbf{x}) &= x_1 + x_4 - 10 = 0 \\
f_2(\mathbf{x}) &= x_2^2 x_3 x_4 - x_5 - 6 = 0 \\
f_3(\mathbf{x}) &= x_1 x_2^{1.7} (x_4 - 5) - 8 = 0 \\
f_4(\mathbf{x}) &= x_4 - 3 x_1 + 6 = 0 \\
f_5(\mathbf{x}) &= x_1 x_3 - x_5 + 6 = 0
\end{aligned}$$

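The incidence (or occurrence) matrix and step 1 (the output set assignment, marked $\otimes$) yield:

$$\begin{array}{c|ccccc}
 & x_1 & x_2 & x_3 & x_4 & x_5 \\ \hline
f_1 & \otimes & & & \times & \\
f_2 & & \times & \times & \times & \otimes \\
f_3 & \times & \otimes & & \times & \\
f_4 & \times & & & \otimes & \\
f_5 & \times & & \otimes & & \times
\end{array}$$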

this yields the digraph shown in Figure 3.

[Figure 3 shows a digraph with one node for each of the equations f1, f2, f3, f4 and f5]

Figure 3: Digraph Representation of Equation System

When Tarjan’s algorithm is applied to this digraph, it identifies the following strongly connected components and precedence order (the arrows give the solution sequence):

$$(f_1, f_4) \;\rightarrow\; (f_3) \;\rightarrow\; (f_2, f_5)$$

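and the corresponding permuted incidence matrix is:

$$\begin{array}{c|ccccc}
 & x_1 & x_4 & x_2 & x_3 & x_5 \\ \hline
f_1 & \times & \times & & & \\
f_4 & \times & \times & & & \\
f_3 & \times & \times & \times & & \\
f_5 & \times & & & \times & \times \\
f_2 & & \times & \times & \times & \times
\end{array}$$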
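The two step strategy can be sketched for this example as follows. This is an illustration under stated assumptions: a simple augmenting path assignment stands in for Duff's algorithm (it finds a transversal, but not necessarily with the same efficiency or the same output assignment as shown above), and the names `incidence`, `transversal` and `blocks` are made up for the sketch. Only the occurrence information from the incidence matrix is used.

```python
# Equation -> variables occurring in it (the incidence matrix of the example).
incidence = {
    "f1": ["x1", "x4"],
    "f2": ["x2", "x3", "x4", "x5"],
    "f3": ["x1", "x2", "x4"],
    "f4": ["x1", "x4"],
    "f5": ["x1", "x3", "x5"],
}

def transversal(incidence):
    """Step 1: assign one output variable to every equation (augmenting paths).

    Returns {equation: variable}, or None if the system is structurally singular.
    """
    owner = {}  # variable -> equation currently assigned to it

    def assign(eq, seen):
        for var in incidence[eq]:
            if var in seen:
                continue
            seen.add(var)
            if var not in owner or assign(owner[var], seen):
                owner[var] = eq
                return True
        return False

    for eq in incidence:
        if not assign(eq, set()):
            return None
    return {eq: var for var, eq in owner.items()}

def blocks(incidence, output):
    """Step 2: Tarjan's algorithm on the equation digraph, blocks in solution order."""
    # Edge eq -> other whenever the output variable of eq occurs in other.
    succ = {eq: [o for o in incidence if o != eq and output[eq] in incidence[o]]
            for eq in incidence}
    index, low, stack, on_stack, found = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            found.append(sorted(comp))

    for v in succ:
        if v not in index:
            visit(v)
    # Components are found "downstream first", so reverse for the solve order.
    return found[::-1]

output = transversal(incidence)
order = blocks(incidence, output)
print(output)
print(order)   # [['f1', 'f4'], ['f3'], ['f2', 'f5']]
```

The output assignment found this way need not match the one marked in the incidence matrix (here f1 and f4 swap their output variables), but the blocks and their order are the same, which illustrates the essential uniqueness of the block triangular form.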

We can now submit the subproblems to our Newton solver in the following sequence:

(a) solve:

$$\begin{aligned}
x_1 + x_4 &= 10 \\
x_4 - 3 x_1 &= -6
\end{aligned}$$

for $\{x_1, x_4\}$. Note this can be done directly (in general) with a sparse linear equation solver.

(b) substitute $\{x_1, x_4\}$ into:

$$x_1 x_2^{1.7} (x_4 - 5) = 8$$

and solve for $\{x_2\}$. Note this can be solved by a one dimensional root finding routine.

(c) substitute $\{x_1, x_4, x_2\}$ into:

$$\begin{aligned}
x_2^2 x_3 x_4 - x_5 &= 6 \\
x_1 x_3 - x_5 + 6 &= 0
\end{aligned}$$

and solve for $\{x_3, x_5\}$. In fact, these equations are now linear in $x_3$ and $x_5$!
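The solution sequence (a), (b), (c) can be carried out in a few lines. The sketch below reads the five equations as $x_1 + x_4 - 10 = 0$, $x_2^2 x_3 x_4 - x_5 - 6 = 0$, $x_1 x_2^{1.7}(x_4 - 5) - 8 = 0$, $x_4 - 3x_1 + 6 = 0$ and $x_1 x_3 - x_5 + 6 = 0$. The helper names, the use of bisection for the one dimensional block and the bracket $[0, 10]$ are assumptions chosen for illustration only.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

def bisect(g, lo, hi, tol=1e-12):
    """One dimensional root finding; g(lo) and g(hi) must have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0.0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_lo * g(mid) <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g(mid)
    return 0.5 * (lo + hi)

# Block (a): f1 and f4 are linear in x1 and x4.
x1, x4 = solve_2x2([[1.0, 1.0], [-3.0, 1.0]], [10.0, -6.0])

# Block (b): f3 is a single equation in x2 once x1 and x4 are known.
x2 = bisect(lambda v: x1 * v ** 1.7 * (x4 - 5.0) - 8.0, 0.0, 10.0)

# Block (c): f2 and f5 are linear in x3 and x5 once x1, x4 and x2 are known.
x3, x5 = solve_2x2([[x2 ** 2 * x4, -1.0], [x1, -1.0]], [6.0, -6.0])

residuals = [
    x1 + x4 - 10.0,
    x2 ** 2 * x3 * x4 - x5 - 6.0,
    x1 * x2 ** 1.7 * (x4 - 5.0) - 8.0,
    x4 - 3.0 * x1 + 6.0,
    x1 * x3 - x5 + 6.0,
]
print(x1, x2, x3, x4, x5)
print(max(abs(r) for r in residuals))
```

No five dimensional Newton iteration is needed at all: two small linear solves and one scalar root search reproduce the full solution, which is the point of the decomposition.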

Although the above example is a rather dramatic demonstration, the very high efficiency of the algorithm to derive a block decomposition motivates its application by default to general problems.

It is important to recognize that the above approach will at least recognize the decompositions possible through a study of the flowsheet structure (e.g., if the flowsheet is unidirectional, the system of equations will block decompose). A classic example of a situation in which an equation based analysis will identify a finer decomposition is when the mass balance of a unit or a group of units is decoupled from the energy balance. Although this situation only arises for certain flowsheets, models, and specification sets, the equation based analysis will solve the mass balance and then the energy balance, whereas the unit based analysis will solve both simultaneously.

The Dulmage and Mendelsohn Decomposition

The block decomposition ideas can also be used to develop a semi-automated approach to building a valid specification set (in a structural sense). In ABACUSS II, this feature is called the “Intelligent Degree of Freedom Analyzer.”

Dulmage and Mendelsohn (1963) demonstrated that any m by n incidence matrix (i.e., possibly rectangular) can be transformed by row and column permutations to a canonical decomposition composed of three parts: an over determined part, a fully determined part and an under determined part. This is illustrated by the example shown in Figure 4.

[Figure 4 shows an incidence matrix with rows f1 to f10 and columns x1 to x10, permuted into an Over Determined Part, a Fully Determined Part and an Under Determined Part]

Figure 4: Dulmage and Mendelsohn decomposition for an example incidence matrix

Each part gives very useful information to help debug a model. If the over determined part exists, then the matrix in question is structurally rank deficient. In the example, we have three equations in terms of only two variables, so a transversal cannot be extended for these equations. In general, the equations that end up in the over determined part are structurally inconsistent, in the sense that there are more equations than variables involved. The corresponding system of equations will be either inconsistent or redundant. Hence, this situation can only be rectified by deleting one or more of the equations in the over determined part.

The fully determined part may have a finer internal structure, which is exactly the block lower triangular decomposition discussed above. The variables in the fully determined part are fully determined by the equations (in the structural sense), hence they should not be selected as degrees of freedom.

If the under determined part exists, it means that a unique solution cannot be determined for the problem as it stands; there are fewer equations than variables in the under determined part, and therefore degrees of freedom that must be specified. Only those variables that appear in the under determined part are candidates to be specified as degrees of freedom.

An algorithm that computes the Dulmage and Mendelsohn decomposition for an arbitrary rectangular incidence matrix is described in Pothen and Fan (1990). Suppose that you are developing a model in ABACUSS II and that you are ready to simulate the model. Before starting the numerical solution of a simulation, ABACUSS II uses this algorithm to compute the Dulmage and Mendelsohn decomposition for your problem. Note that (unwittingly) your problem, as currently stated, may have a rectangular incidence matrix; ABACUSS II will give you guidance on how to reformulate your problem so that it has a square incidence matrix. This analysis might lead to any of the following feedback on your model:

• if the number of unknowns is greater than the number of equations, ABACUSS II will report this. You need to specify more degrees of freedom.

• if there is an over determined part, ABACUSS II will tell you that a certain number of equations must be removed from the model. It will also suggest which equations it is necessary to delete. For the example above, it would say delete one out of the set of equations { f1, f2, f3 }.

• if there is an under determined part, ABACUSS II will tell you that a certain number of additional variables must be specified as degrees of freedom. It will also suggest which variables to specify. For the example above, it would say specify one out of the set of variables { x8, x9, x10 }. Note that one has to be a little careful with this. If you specify more than one suggested variable at a time out of a particular group of variables, you may introduce an over determined part in the resulting new incidence matrix formed from deleting the columns specified. The best way to avoid this is to work interactively with ABACUSS II, specifying only one variable from each group at a time, and then asking ABACUSS II to compute a new Dulmage and Mendelsohn decomposition each time. This way the advice on which variables can be selected as degrees of freedom is always up to date.

ABACUSS II will only proceed with a simulation if the incidence matrix is square, and the Dulmage and Mendelsohn decomposition only has a fully determined part. This indicates that the problem is well posed in the structural sense. Clearly, this feature is useful when constructing a model, altering the specification set, or when a structurally singular problem is detected.
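To make the three parts concrete, the sketch below computes a coarse Dulmage and Mendelsohn decomposition for a small made-up incidence matrix. It is not the Pothen and Fan algorithm and not ABACUSS II's analyzer: it uses a plain augmenting path matching followed by alternating searches from the unmatched equations and unmatched variables. The equation names `e1` to `e4`, the variable names `a` to `d` and all function names are hypothetical.

```python
def maximum_matching(rows):
    """Augmenting path matching. rows maps equation -> list of variables.
    Returns {variable: equation} for a matching of maximum size."""
    var_to_eq = {}

    def augment(eq, seen):
        for var in rows[eq]:
            if var in seen:
                continue
            seen.add(var)
            if var not in var_to_eq or augment(var_to_eq[var], seen):
                var_to_eq[var] = eq
                return True
        return False

    for eq in rows:
        augment(eq, set())
    return var_to_eq

def alternating_reach(start, neighbours, partner):
    """Search from unmatched nodes: step along any occurrence (neighbours),
    then back along a matched entry (partner)."""
    own, other = set(start), set()
    frontier = list(start)
    while frontier:
        node = frontier.pop()
        for nb in neighbours[node]:
            if nb in other:
                continue
            other.add(nb)
            mate = partner.get(nb)
            if mate is not None and mate not in own:
                own.add(mate)
                frontier.append(mate)
    return own, other

def coarse_dm(rows):
    """Split equations and variables into over, fully and under determined parts."""
    cols = {}
    for eq, variables in rows.items():
        for var in variables:
            cols.setdefault(var, []).append(eq)
    var_to_eq = maximum_matching(rows)
    eq_to_var = {eq: var for var, eq in var_to_eq.items()}
    free_eqs = [eq for eq in rows if eq not in eq_to_var]
    free_vars = [var for var in cols if var not in var_to_eq]
    over_eqs, over_vars = alternating_reach(free_eqs, rows, var_to_eq)
    under_vars, under_eqs = alternating_reach(free_vars, cols, eq_to_var)
    return {
        "over": (over_eqs, over_vars),
        "under": (under_eqs, under_vars),
        "fully": (set(rows) - over_eqs - under_eqs,
                  set(cols) - over_vars - under_vars),
    }

# Hypothetical 4 equation, 4 variable model that is square but badly posed.
rows = {"e1": ["a"], "e2": ["a"], "e3": ["a", "b"], "e4": ["b", "c", "d"]}
parts = coarse_dm(rows)
n_delete = len(parts["over"][0]) - len(parts["over"][1])
n_specify = len(parts["under"][1]) - len(parts["under"][0])
print(parts)
print("delete", n_delete, "of", sorted(parts["over"][0]))
print("specify", n_specify, "of", sorted(parts["under"][1]))
```

For this made-up model the advice mirrors the feedback described above: delete one of { e1, e2 } and specify one of { c, d }. Note that the matrix is square, so a simple count of equations and unknowns would not have revealed the problem.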

PROCEDURES

In addition to equations defined symbolically in the simulation language, most equation oriented simulators also allow equations to be defined by PROCEDUREs. These are effectively equations of the following functional form:

$$x_i - p(\mathbf{x}) = 0$$

where $p(\mathbf{x})$ is a function evaluated by a call to a FORTRAN subroutine, which is linked into the simulation executable.

If the FORTRAN subroutine above performs some form of iterative calculation in order to evaluate the function, then a nested iteration strategy is established. The outer loop consists of all the model variables, and the inner loop consists of iterative solution of the procedure(s) given the current iterate variables. There is a subtle point here. The equation above is not solved by the procedure. At each step in the outer iterative loop the following residual is evaluated:

$$f_i(\mathbf{x}^k) = x_i^k - p(\mathbf{x}^k)$$

and it will be nonzero except at the solution. In order to evaluate this residual for one step of the outer iteration, the subroutine is called and converged (the inner loop). However, there are no guarantees that $x_i^k$ will equal the value returned by $p$. In the case of, for example, a phase stability calculation, the function is defined by the global solution of an optimization problem (minimization of Gibbs free energy), so the procedure embeds a numerical optimization inside the overall process simulation.
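The subtle point can be illustrated with a toy procedure. In the sketch below, `p` is a hypothetical stand-in for a FORTRAN property routine: it returns the root $y$ of $y^3 + y - a = 0$ by its own inner Newton loop. The outer solver never asks `p` to solve the procedure equation; it only evaluates the residual at the current iterate. The equation, the names and the tolerances are all invented for illustration.

```python
def p(a, tol=1e-12, max_iter=100):
    """Hypothetical procedure: inner Newton loop for y**3 + y - a = 0."""
    y = 0.0
    for _ in range(max_iter):
        g = y ** 3 + y - a
        if abs(g) <= tol:
            return y
        y -= g / (3.0 * y ** 2 + 1.0)
    raise RuntimeError("inner iteration did not converge")

def residual(y_k, a_k):
    """Outer loop residual x_i^k - p(x^k) at the current iterate (y_k, a_k)."""
    return y_k - p(a_k)

# At an arbitrary outer iterate the procedure converges, yet the residual is nonzero.
print(residual(0.5, 2.0))
# Only when the outer variables are mutually consistent does the residual vanish.
print(residual(1.0, 2.0))
```

Each residual evaluation costs a fully converged inner iteration, which is why the nesting matters for efficiency as well as for robustness.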

Procedures are primarily used as interfaces to a physical property subroutine library. This is the primary manner in which physical properties are evaluated. For example, in order to relate a vapour phase specific enthalpy to the composition, temperature and pressure, the following procedure would be used:

$$H_v - p_{H_v}(\mathbf{y}, T, P) = 0$$

where $H_v$, $\mathbf{y}$, $T$, $P$ are model variables representing the enthalpy, vapour phase mole fractions, temperature and pressure, and $p_{H_v}$ is a hook to a FORTRAN routine that calculates vapour phase enthalpy given mole fractions, temperature and pressure. Note that in a transversal on the overall system, $H_v$ does not necessarily have to be assigned to this equation (in fact, it is usually $T$). The procedure equation just establishes a functional relationship between the four variables.

Another advantage of procedures is that they reduce the size of the process model (or equivalently the number of variables in the outer iteration). For example, the evaluation of k-values (vapour liquid equilibrium distribution coefficients) for an ideal vapour phase and nonideal liquid phase uses the formula:

$$k_i(T, P, \mathbf{x}) = \frac{\gamma_i(T, \mathbf{x})\, P_i^{SAT}(T)}{P}$$

in other words, it introduces the activity coefficients and the pure component saturation pressures as extra variables. The use of a procedure to evaluate the k-values obviates the need to introduce these extra variables into the outer iteration. The use of tailored algorithms in the inner iterations (procedures) can also improve the robustness of the overall iterative process.

In earlier days, authors also advocated the use of procedures to represent discontinuous or nondifferentiable functions. The need for this is less these days because of language structures like IF equations that allow explicit declaration of such discontinuous functions. In fact, it is highly undesirable to “hide” discontinuous functions in procedures from the numerical solution routines of the simulator, especially in dynamic simulation (Barton, 1992) (see also below).

FUNCTIONAL, STRUCTURAL AND NUMERICAL SINGULARITY

We are beginning to understand the role that the Jacobian matrix $\mathbf{J}$ plays in the solution of systems of nonlinear equations. For example, any Newton family method relies on an iteration matrix that is either the Jacobian matrix itself (evaluated at the current values of the iterate variables $\mathbf{x}^k$), or some approximation to this Jacobian. If the iteration matrix is singular, the system of linear equations that determines the iterate update cannot be solved uniquely. Thus, the iterative process breaks down and another strategy is required to continue.

In this section, we will briefly introduce the broader implications of singularity of the Jacobian matrix. In fact, the analogy with systems of linear equations is almost exact. For a system of linear equations:

$$\mathbf{A}\mathbf{x} - \mathbf{b} = \mathbf{0} \qquad \text{(SI1)}$$

the matrix of partial derivatives is clearly $\mathbf{A}$ itself. We already know that results concerning the existence and uniqueness of a solution to (SI1) are based on the singularity or nonsingularity of $\mathbf{A}$. Further, we note that if (SI1) is substituted into the Newton iteration formula, then (SI1) is solved in a single Newton step (not exactly a stunning discovery!). In the linear case of (SI1), singular $\mathbf{A}$ implies a singular Jacobian for all realizations of $\mathbf{x}$.
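The single step claim is quick to verify. With $\mathbf{f}(\mathbf{x}) = \mathbf{A}\mathbf{x} - \mathbf{b}$ and Jacobian $\mathbf{A}$, the Newton equations from any starting point $\mathbf{x}^0$ read:

$$\mathbf{A}(\mathbf{x}^1 - \mathbf{x}^0) = -(\mathbf{A}\mathbf{x}^0 - \mathbf{b}) \quad\Longrightarrow\quad \mathbf{A}\mathbf{x}^1 = \mathbf{b}$$

so $\mathbf{x}^1$ satisfies (SI1) regardless of $\mathbf{x}^0$, provided $\mathbf{A}$ is nonsingular so that the linear system has a unique solution.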

Functional Singularity

Roughly speaking, results on the existence and uniqueness of a solution to a system of nonlinear equations:

$$\mathbf{f}(\mathbf{x}) = \mathbf{0} \qquad \text{(SI2)}$$

are based on the singularity (or not) of the Jacobian matrix $\mathbf{J}(\mathbf{x})$. However, singularity of this matrix for only certain realizations of $\mathbf{x}$ does not imply (SI2) does not have a locally unique solution.

Definition: functional singularity refers to the case when $\mathbf{J}(\mathbf{x})$ is singular for all realizations of the vector $\mathbf{x}$. A functionally singular system is badly posed.

Example: consider the nonlinear system (SI3):

$$\begin{aligned}
x_1 + x_2 - 3 &= 0 \\
x_1^2 + 2 x_1 x_2 + x_2^2 - \alpha &= 0
\end{aligned} \qquad \text{(SI3)}$$

The Jacobian matrix is:

$$\mathbf{J}(\mathbf{x}) = \begin{pmatrix} 1 & 1 \\ 2x_1 + 2x_2 & 2x_1 + 2x_2 \end{pmatrix} \qquad \text{(SI4)}$$

Note that this is a matrix of functions rather than fixed values, hence the singularity of the Jacobian is potentially a function of the values assigned to $\mathbf{x}$. Clearly, (SI4) is singular for all $\mathbf{x}$: the second row is a multiple of the first row regardless of the particular realization of $\mathbf{x}$ (although the value of the multiplier will change with $\mathbf{x}$). Hence (SI3) is functionally singular, and we cannot satisfy both existence and local uniqueness of a solution. The qualifier local refers to the fact that nonlinear systems may have multiple (but not infinite) roots, whereas in the case of linear equations the only choices are one or infinite solutions.

A functionally singular system is either redundant (which implies an infinite number ofsolutions) or inconsistent (a solution does not exist). This provides the theoreticalconditions for these two problems which we have already discussed and illustrated inthe linear case.

What are the implications of functional singularity for the example (SI3)?

If α = 9 then the second equation is exactly a function of the first equation (move theconstant to the right-hand side and then square both sides). In this case, the secondequation is redundant with the first (it does not constrain the solution further) and aninfinite family of vectors x will satisfy these equations (i.e., any x satisfying x x1 2 3+ = ).

If α ≠ 9 then no combination of values $(x_1, x_2)^T$ will satisfy both these equations and the system is inconsistent.
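The two cases can be checked numerically. The following is a minimal sketch (the function names, sampling range and seed are illustrative assumptions, not part of the notes): it samples points that satisfy the first equation of (SI3) and evaluates the residual of the second equation, which should be 9 - α everywhere on that line.

```python
import random


def residuals(x1, x2, alpha):
    """Residuals of the two equations in (SI3)."""
    return (x1 + x2 - 3.0, x1 ** 2 + 2.0 * x1 * x2 + x2 ** 2 - alpha)


def second_residual_on_line(alpha, samples=5, seed=0):
    """Second residual of (SI3) at random points satisfying x1 + x2 = 3."""
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        x1 = rng.uniform(-10.0, 10.0)
        x2 = 3.0 - x1                      # enforce the first equation
        values.append(residuals(x1, x2, alpha)[1])
    return values


# alpha = 9: residuals are zero (to rounding), so the system is redundant.
print(second_residual_on_line(9.0))
# alpha = 7: residuals are all 9 - 7 = 2 (to rounding), so it is inconsistent.
print(second_residual_on_line(7.0))
```

Because the second residual is constant along the line defined by the first equation, it is either zero everywhere (redundant) or nonzero everywhere (inconsistent); there is no intermediate case.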

In physical models, functional singularity normally implies an error in formulation of the model, or in selection of a specification set. In the former case, the most common mistake is overspecification of a quantity by the equations that make up a model. For example, consider the subset of model equations (SI5).

$$
\begin{aligned}
L_i &= L_T \, x_i \qquad \forall i = 1, \ldots, NC \\
\sum_{i=1}^{NC} x_i &= 1 \\
\sum_{i=1}^{NC} L_i &= L_T
\end{aligned}
\tag{SI5}
$$

Because of the first equation, the latter two equations effectively say the same thing: the first equation makes the other two redundant with each other (i.e., the third can be derived by substituting the first into the second). In this example, either the second or third equation should be present in the model (to define $L_T$ in terms of the $L_i$), but not both simultaneously (otherwise the model will be functionally singular).
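The redundancy can be made explicit by carrying out the substitution, using only the symbols of (SI5):

$$
\sum_{i=1}^{NC} L_i = \sum_{i=1}^{NC} L_T \, x_i = L_T \sum_{i=1}^{NC} x_i = L_T \cdot 1 = L_T ,
$$

so once the first two equations hold, the third adds no new constraint.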

In deleting the equations that overspecify a quantity, further action may be required. If the extra equation really was superfluous to needs, then no further action is required (in this case, the fact that there are fewer D.O.F. than anticipated can warn of this problem). On the other hand, overspecification of one quantity could mean that another quantity is not properly defined by the system of equations and we have missed an equation in the model formulation. This can usually be detected by structural criteria (remember: structural singularity implies that there is a subset of equations involving fewer variables than equations). If there seem to be too many D.O.F. and a structural analysis forces one to specify a variable that should naturally be an output from the model, it is likely that the variable concerned is not defined by and coupled to the rest of the system, and a further equation is required.

Unfortunately, beyond the structural techniques (see above and below) I know of no systematic technique that can help a modeller derive fully determined and functionally nonsingular systems of equations. In my experience, the responsibility falls to the engineer: experience, physical insight and understanding, certain tricks of the trade, and inevitable errors lead to the final model. As an example of a trick of the trade, there are several arguments that favour the use of:

$$
\sum_{i=1}^{NC} x_i = 1
$$

instead of the equivalent:

$$
\sum_{i=1}^{NC} L_i = L_T
$$

in process models, most notably because the former formulation can avoid multiple nonphysical roots of the equations.

Structural Singularity

This is a concept we have defined and discussed at length, and for which we have studied detection algorithms; all that is really necessary to add is that an incidence matrix is clearly a structural representation of the Jacobian matrix. Hence, structural singularity implies functional singularity, but the converse is not necessarily true (i.e., structural singularity is sufficient for functional singularity).
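To illustrate that the converse fails, the sketch below computes the structural rank of an incidence matrix as the size of a maximum matching between equations and variables. It uses a simple augmenting-path search chosen for brevity, not the algorithms cited in these notes, and the second incidence matrix is a hypothetical example.

```python
def structural_rank(incidence):
    """Size of a maximum matching between equations (rows) and variables (columns)."""
    n_rows = len(incidence)
    n_cols = len(incidence[0]) if incidence else 0
    match_of_col = [-1] * n_cols          # equation currently assigned to each variable

    def try_assign(row, visited):
        # Depth-first search for an augmenting path starting at this equation.
        for col in range(n_cols):
            if incidence[row][col] and not visited[col]:
                visited[col] = True
                if match_of_col[col] == -1 or try_assign(match_of_col[col], visited):
                    match_of_col[col] = row
                    return True
        return False

    return sum(1 for row in range(n_rows) if try_assign(row, [False] * n_cols))


# Incidence matrix of (SI3): both equations involve both variables.
si3 = [[1, 1],
       [1, 1]]
# Hypothetical system in which two equations involve only the first variable.
two_equations_one_variable = [[1, 0, 0],
                              [1, 0, 0],
                              [0, 1, 1]]

print(structural_rank(si3))                         # 2
print(structural_rank(two_equations_one_variable))  # 2
```

The incidence matrix of (SI3) has full structural rank (2 of 2), so a structural test passes even though the system is functionally singular. The hypothetical 3 x 3 system has structural rank 2 and is therefore structurally, and hence functionally, singular.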

Detection of Functional Singularity

The sufficiency of structural singularity implies that efficient structural algorithms (cf. Duff) should always be applied to a model. Detection of functional singularity in this manner is unambiguous. However, if the structural criterion is satisfied, we cannot be sure our model is functionally nonsingular. In this case, we must resort to numerical experimentation: the fact that the Jacobian is singular at several different realizations of x is a good indication (but not proof) of functional singularity. On the other hand, a single counterexample of a nonsingular Jacobian is sufficient to prove functional nonsingularity. Neither of the examples in the section above would be detected by a structural algorithm.
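The numerical experiment described above might be sketched as follows. The function names, sampling range, tolerance and number of trials are all illustrative assumptions. The Jacobians are those of (SI3) and of (SI9), which appears later in these notes.

```python
import random


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return det


def probe(jacobian, n, trials=20, tol=1e-9, seed=0):
    """Return a sample point where |det J| > tol, or None if every sample looks singular."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        if abs(determinant(jacobian(x))) > tol:
            return x          # one counterexample proves functional nonsingularity
    return None               # an indication, not a proof, of functional singularity


def jac_si3(x):
    s = 2.0 * x[0] + 2.0 * x[1]
    return [[1.0, 1.0], [s, s]]


def jac_si9(x):
    return [[1.0, 1.0, 0.0], [1.0, -2.0, 1.0], [2.0 * x[0], 2.0, 0.0]]


print(probe(jac_si3, 2))               # None: singular at every sample
print(probe(jac_si9, 3) is not None)   # True: a nonsingular realization was found
```

The asymmetry in the text shows up in the return value: a returned point is conclusive, whereas `None` only means that no counterexample was found among the samples tried.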

Numerical (Local) Singularity

A system of equations that is functionally nonsingular (and hence well posed) can still have a Jacobian that is singular for a strict subset of realizations of x. As mentioned above, if we are unlucky, the iterative process can yield an iterate vector in this subset. This case is known as numerical or local singularity of the Jacobian, and is illustrated in one dimension in Figure 5, where the Jacobian is singular at points a and b, and a locally unique solution exists at x*.

In higher dimensions, numerical singularity is a consequence of linear dependence of one or more of the gradient vectors of the individual functions $f_i$. The simplest example would be a stationary point in one of the $f_i$ (see Figure 5).

Figure 5: Illustration of numerical singularity in one dimension (plot of f(x) against x, with the Jacobian singular at points a and b and the root at x*)
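A one-dimensional situation of this kind can be reproduced with a small sketch. The cubic below is a hypothetical stand-in for the curve in Figure 5 (its stationary points play the role of a and b), and the function names and tolerances are assumptions.

```python
def f(x):
    # Hypothetical cubic: stationary points at x = -1 and x = 1, single root near x = 2.1
    return x ** 3 - 3.0 * x - 3.0


def dfdx(x):
    return 3.0 * x ** 2 - 3.0


def newton(x, iterations=50, tol=1e-12):
    """Plain Newton iteration; returns None if the derivative vanishes at an iterate."""
    for _ in range(iterations):
        slope = dfdx(x)
        if slope == 0.0:
            return None               # numerical singularity: the step -f/f' is undefined
        step = -f(x) / slope
        x += step
        if abs(step) < tol:
            break
    return x


print(newton(1.0))    # None: the initial guess sits on a stationary point
print(newton(3.0))    # converges to the root x*
```

The problem is well posed (a locally unique root exists), yet the iteration cannot take a step from either stationary point.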

The problem posed by numerical singularities can be illustrated by Newton's method, for which we solve the linear system (SI6) at step k.

$$
J_{x^k} \, p^k = -f_{x^k}
\tag{SI6}
$$

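If J(x) is numerically singular at point $x^k$, Gaussian elimination on $J_{x^k}$ (the numerical realization of the matrix of functions J(x) at point $x^k$) will yield a matrix with the structure:

$$
\begin{bmatrix}
\times & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & 0 & \times & \times & \times \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
p^k =
\begin{bmatrix} \times \\ \times \\ \times \\ \times \\ \times \end{bmatrix}
\tag{SI7}
$$

where the number of nonzero rows equals the rank r (< n) of $J_{x^k}$, and (SI6) cannot determine unique values for $p^k$.

The solution to this problem is simply to ignore the last (n-r) rows in the factorized matrix, to yield an r × n rectangular linear system:

$$
\begin{bmatrix}
\times & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & 0 & \times & \times & \times
\end{bmatrix}
p^k =
\begin{bmatrix} \times \\ \times \\ \times \end{bmatrix}
\tag{SI8}
$$

and assign arbitrary values to the last (n-r) elements of $p^k$, solving (SI8) for the first r elements.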

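In matrix notation:

$$
\begin{bmatrix} U & B \end{bmatrix}
\begin{bmatrix} p_1^k \\ p_2^k \end{bmatrix} = b
\;\Rightarrow\;
U p_1^k + B p_2^k = b
\;\Rightarrow\;
U p_1^k = b - B p_2^k
$$

where $p_2^k$ contains the (n-r) arbitrary values, U is an r × r upper triangular matrix, and B is the remaining r × (n-r) block.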
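The partitioned solve can be sketched as a back substitution on U after moving the arbitrary part to the right-hand side. The function name and the small numerical example are hypothetical, and U is assumed to have a nonzero diagonal.

```python
def solve_partitioned(U, B, b, p2):
    """Solve U p1 = b - B p2 by back substitution and return the full step [p1, p2].

    U is r x r upper triangular (nonzero diagonal assumed), B is r x (n - r),
    b has length r, and p2 holds the n - r arbitrarily assigned values.
    """
    r = len(U)
    rhs = [b[i] - sum(B[i][j] * p2[j] for j in range(len(p2))) for i in range(r)]
    p1 = [0.0] * r
    for i in range(r - 1, -1, -1):
        tail = sum(U[i][j] * p1[j] for j in range(i + 1, r))
        p1[i] = (rhs[i] - tail) / U[i][i]
    return p1 + list(p2)


# Hypothetical 2 x 3 system (r = 2, n = 3) with the last element of p set to 1.
U = [[2.0, 1.0],
     [0.0, 4.0]]
B = [[1.0],
     [2.0]]
b = [5.0, 6.0]
print(solve_partitioned(U, B, b, [1.0]))   # [1.5, 1.0, 1.0]
```

Any choice of `p2` gives a step that satisfies the r retained equations; the choice only affects where the next iterate lands.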

Example: consider the system in (SI9):

$$
\begin{aligned}
x_1 + x_2 - 3 &= 0 \\
x_1 - 2x_2 + x_3 &= 0 \\
x_1^2 + 2x_2 - 6 &= 0
\end{aligned}
\tag{SI9}
$$

One root is $x = (2, 1, 0)^T$ and the Jacobian is:

$$
J(x) =
\begin{bmatrix}
1 & 1 & 0 \\
1 & -2 & 1 \\
2x_1 & 2 & 0
\end{bmatrix}
$$

and this matrix is nonsingular at the solution. However, with the initial guess $x^0 = (1, 1, 1)^T$ (and in fact any point with $x_1 = 1$) we will encounter a numerical singularity.

First Newton iteration:

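$$
\begin{bmatrix}
1 & 1 & 0 \\
1 & -2 & 1 \\
2 & 2 & 0
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}
$$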

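Gaussian elimination yields:

$$
\begin{bmatrix}
1 & 1 & 0 \\
0 & -3 & 1 \\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}
$$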

so we ignore the last equation:

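$$
\begin{bmatrix} 1 & 1 \\ 0 & -3 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}
+
\begin{bmatrix} 0 \\ 1 \end{bmatrix} p_3
=
\begin{bmatrix} 1 \\ -1 \end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix} 1 & 1 \\ 0 & -3 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ -1 - p_3 \end{bmatrix}
$$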

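and take an arbitrary step in $x_3$ (e.g., $p_3 = 0.1$):

$$
\Rightarrow\;
\begin{bmatrix} 1 & 1 \\ 0 & -3 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ -1.1 \end{bmatrix}
\;\Rightarrow\;
p^0 = (0.633,\ 0.367,\ 0.1)^T
$$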

and the iteration formula becomes:

$$
x^1 = x^0 + p^0 = (1.633,\ 1.367,\ 1.1)^T
$$

and $J_{x^1}$ is nonsingular, so the iterative process can proceed.
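The worked example can be reproduced with a short sketch. It assumes elimination without pivoting and assumes that once a zero pivot is met all remaining rows are zero (the structure of (SI7)), which holds for this example but not in general. The function names and the fixed iteration count are illustrative.

```python
def f(x):
    """Residuals of (SI9)."""
    return [x[0] + x[1] - 3.0,
            x[0] - 2.0 * x[1] + x[2],
            x[0] ** 2 + 2.0 * x[1] - 6.0]


def jacobian(x):
    """Jacobian of (SI9); singular whenever x[0] == 1."""
    return [[1.0, 1.0, 0.0],
            [1.0, -2.0, 1.0],
            [2.0 * x[0], 2.0, 0.0]]


def newton_step(J, rhs, free_value=0.1, tol=1e-12):
    """Solve J p = rhs by Gaussian elimination without pivoting.

    Assumption: if a zero pivot appears in row r, rows r..n-1 are ignored,
    as in (SI8), and the last n - r elements of p are set to free_value.
    """
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(J, rhs)]    # augmented matrix [J | rhs]
    rank = n
    for k in range(n):
        if abs(a[k][k]) <= tol:
            rank = k
            break
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n + 1):
                a[i][j] -= factor * a[k][j]
    p = [free_value] * n
    for i in range(rank - 1, -1, -1):               # back substitution on the first r rows
        tail = sum(a[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (a[i][n] - tail) / a[i][i]
    return p


def solve(x, iterations=10):
    """Repeat the (possibly modified) Newton step a fixed number of times."""
    for _ in range(iterations):
        p = newton_step(jacobian(x), [-v for v in f(x)])
        x = [xi + pi for xi, pi in zip(x, p)]
    return x


p0 = newton_step(jacobian([1.0, 1.0, 1.0]), [1.0, 0.0, 3.0])
print([round(v, 3) for v in p0])     # [0.633, 0.367, 0.1]
print(solve([1.0, 1.0, 1.0]))        # approaches the root (2, 1, 0)
```

Only the first step needs the modification; from the second iterate onward the Jacobian has full rank and the sketch reduces to ordinary Newton steps.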

Note that a numerical singularity can occur at a root of the system of equations ($x^2 = 0$ is the simplest example). Typically this will imply multiplicity of the root.
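For this one-dimensional example, writing out the Newton update shows the effect of a Jacobian that vanishes at the root (here the superscript k is the iteration index, as above):

$$
f(x) = x^2, \qquad J(x) = 2x, \qquad
x^{k+1} = x^k - \frac{(x^k)^2}{2\,x^k} = \tfrac{1}{2}\,x^k ,
$$

so the error is only halved at each iteration (linear rather than quadratic convergence), while J(x) tends to zero as the iterates approach the double root x = 0.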

References and Further Reading

• Barton P.I., “The Modelling and Simulation of Combined Discrete/Continuous Processes,” Ph.D. Thesis, University of London, 1992.

• Dulmage A.L. and N.S. Mendelsohn, “Two Algorithms for Bipartite Graphs,” SIAM Journal, 11(1), pp. 183-194, 1963.

• Duff I.S., A.M. Erisman and J.K. Reid, “Direct Methods for Sparse Matrices,” Oxford University Press, 1986.

• Pantelides C.C., “SpeedUp - Recent Advances in Process Simulation,” Computers and Chemical Engineering, 12, pp. 745-755, 1988.

• Perkins J.D. and R.W.H. Sargent, “SPEEDUP - a computer program for steady-state and dynamic simulation and design of chemical processes,” AIChE Symposium Series, 78, pp. 1-11, 1982.

• Perkins J.D., “Equation-oriented flowsheeting,” Second Conference on Foundations of Computer Aided Process Design, Snowmass, Colorado, 1983.

• Pothen A. and C.J. Fan, “Computing the Block Triangular Form of a Sparse Matrix,” ACM Transactions on Mathematical Software, 16(4), pp. 303-324, 1990.

• Sargent R.W.H. and A.W. Westerberg, “SPEED-UP in Chemical Engineering Design,” Trans. Institute of Chemical Engineers, 42, pp. T190-T197, 1964.

• Westerberg A.W., H.P. Hutchinson, R.L. Motard and P. Winter, “Process Flowsheeting,” Cambridge University Press, 1979.

Example of High Level Equation Based Language

# ----------------------------------------------------------------------
# Simple extent reactor model - mass balance only
#
# Author:   Paul I. Barton
# Date:     3/9/00
# Language: ABACUSS II
#
# Parameters:
#
#   NoComp     - number of components in the process streams
#   NoReaction - number of reactions taking place
#   Stoich     - stoichiometric coefficients for the reaction(s)
#   Key        - component number for the key reactant(s)
#
# Simple specification set:
#
#   Inlet flowrates plus one out of molar extent and fractional
#   conversion for each reaction.
#
# Note:
#
#   The molar extent and fractional conversion for each reaction are NOT
#   independent.
#
# ----------------------------------------------------------------------
MODEL Extent_Reactor

  PARAMETER
    NoComp     AS INTEGER
    NoReaction AS INTEGER
    Stoich     AS ARRAY(NoReaction,NoComp) OF INTEGER
    Key        AS ARRAY(NoReaction) OF INTEGER

  VARIABLE
    Molar_Extent          AS ARRAY(NoReaction) OF Positive
    Fractional_Conversion AS ARRAY(NoReaction) OF Fraction
    Flow_In, Flow_Out     AS ARRAY(NoComp) OF Flowrate
    X_In, X_Out           AS ARRAY(NoComp) OF Fraction

  STREAM
    Inlet  : Flow_In AS MainStream
    Outlet : Flow_Out AS MainStream

  EQUATION

    # Material balance
    FOR I := 1 TO NoComp DO
      Flow_Out(I) = Flow_In(I) + SUM(Molar_Extent*Stoich(,I)) ;
    END # for

    # Define the fractional conversion for each reaction
    FOR J := 1 TO NoReaction DO
      Fractional_Conversion(J) =
        (Flow_In(Key(J)) - Flow_Out(Key(J)))/Flow_In(Key(J)) ;
    END # for

    # Define mole fractions in inlet and outlet streams
    Flow_In = X_In*SUM(Flow_In) ;
    Flow_Out = X_Out*SUM(Flow_Out) ;

END # Extent_Reactor
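For readers who want to check the model equations by hand, the following Python sketch evaluates them for one particular specification set (inlet flowrates and molar extents given). It is not ABACUSS II and does not reflect how an equation oriented simulator solves the model: the equations are simply evaluated in a fixed order. The function name, the 0-based component indices and the single-reaction example are assumptions for illustration.

```python
def extent_reactor(flow_in, stoich, key, molar_extent):
    """Evaluate the Extent_Reactor equations with inlet flows and extents specified.

    flow_in      - inlet component flowrates, length NoComp
    stoich       - stoichiometric coefficients, stoich[j][i] for reaction j, component i
    key          - 0-based index of the key reactant for each reaction
    molar_extent - molar extent of each reaction
    """
    n_comp = len(flow_in)
    n_reaction = len(molar_extent)
    # Material balance
    flow_out = [flow_in[i] + sum(molar_extent[j] * stoich[j][i] for j in range(n_reaction))
                for i in range(n_comp)]
    # Fractional conversion of the key reactant for each reaction
    conversion = [(flow_in[k] - flow_out[k]) / flow_in[k] for k in key]
    # Mole fractions in inlet and outlet streams
    x_in = [fi / sum(flow_in) for fi in flow_in]
    x_out = [fo / sum(flow_out) for fo in flow_out]
    return flow_out, conversion, x_in, x_out


# Hypothetical single reaction A + B -> C with A as the key reactant.
flow_out, conversion, x_in, x_out = extent_reactor(
    flow_in=[10.0, 20.0, 0.0], stoich=[[-1, -1, 1]], key=[0], molar_extent=[4.0])
print(flow_out)      # [6.0, 16.0, 4.0]
print(conversion)
```

In the equation based model no such ordering is fixed: the same equations serve equally when the fractional conversion is specified and the molar extent is calculated.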
