Chap04 - Basic Concepts of Optimization

5/26/2018 Chap04 - Basic Concepts of Optimization

1/36

P RT

OPTIMIZ TIONTHEORYND METHODS

art II describes modern techniques of optimization and translates theseconcepts into computational methods and algorithms. A vast literature onoptimization techniques exists, hence we have focused solely on methods whichhave been proved effective for a wide range of problems. Optimization methodshave matured sufficiently during the past twenty years so that fast and reliablemethods are available to solve each important class of problem.Six chapters make up art II of this book, covering the following areas:

1. Mathematical concepts Chapter 4)2. One-dimensional search Chapter 53. Unconstrained multivariable optimization Chapter 64. Linear programming Chapter 75. Nonlinear programming Chapter 86. Optimization involving staged processes and discrete variables Chapter 9

2


2/36

22 OPTIMIZATION THEORY AND METHODS

The topics are grouped so that unconstrained methods are presented first, followed by constrained methods. The last chapter in art II deals with discontinuous integer) variables, a common category of problem in chemical engineeringbut one quite difficult to solve without great effort.s optimization methods as well as computer hardware have been improved over the past two decades, the degree of difficulty of the problems thatcan be solved has expanded significantly. Continued improvements in optimization algorithms and computer hardware and software should enable optimization of large-scale nonlinear problems involving thousands of variables, bothcontinuous and integer, some of which may be stochastic in nature.


3/36

CHAPTER

4BASICCONCEPTS

OF OPTIMIZATION

4 1 Continuity of Functions 1244 2 Unimodal Versus Multimodal Functions 1274 3 Convex and Concave Functions 1294 4 Convex Region 1344 5 Necessary and Sufficient Conditions for an Extremum of an UnconstrainedFunction 1384 6 Interpretation of the Objective Function in Terms of Its Quadratic Approximation 145References 151Problems 152

123


4/36

124 BASIC CONCEPTS OF OPTIMIZATIONIn order to understand the strategy of optimization procedures, certainbasic concepts must be described. In this chapter we will examine the propertiesof objective functions and constraints to establish a basis for analyzing optimiza-tion problems. Those features which are desirable and also undesirable) in the

formulation of an optimization problem are identified. Both qualitative andquantitative characteristics of functions will be described. In addition, we willpresent the necessary and sufficient conditions to guarantee that a supposedextremum is indeed a minimum or a maximum.

4 1 ONTINUITY O FUN TIONSIn carrying out analytical or numerical optimization you will find it preferableand more convenient to work with continuous functions of one or more vari-ables than with functions containing discontinuities. Functions having contin-uous derivatives are also preferred. What does continuity mean? ExamineFig. 4.1. In case A, the function is clearly discontinuous. Is case B also discon-tinuous?

We define the property of continuity as follows. A function of a singlevariable x is continuous at a point Xo ifa) x o) existsb) lim f (x) exists

X-- XO

c) lim f (x) = x o)X- XO

f f (x) is continuous at every point in region R then f(x) is said to be contin-uous throughout R. For case B in Fig. 4.1 the function of x ~ s a "kink" in it,but f (x) does satisfy the property of continuity. However, f '(x) df(x)/dx doesnot. Therefore, the function in case B is continuous but not continuously differ-entIable.

(X l ) (x2 ) \Xl X 2Case A Case BFigure 4.1 Functions with discontinuities in the function and/or derivatives.


5/36

4.1 CONTINUITY OF FUNCTIONS 125

EXAMPLE 4.1 ANALYSIS OF FUNCTIONS FOR CONTINUITYAre the following functions continuous? a) f(x) l/x; b) f(x) In x. In eachcase specify the range of x for which f(x) and rex) are continuous.Solutiona) f(x) = l/x is continuous except at x = 0; f O) is not defined. rex) = -1/x 2 iscontinuous except at x Ob) f(x) In x is continuous for x > O For x S 0 In (x) is not defined. As to

f (x) l/x, see a).

A discontinuity in a function mayor may not cause difficulty in optimiza-tion. In case A in Fig. 4.1, the maximum occurs reasonably far from the discon-tinuity and mayor may not be encountered in the search for the optimum. Incase B if a method of optimization that o ~ s not use derivatives is employed,then the kink in f(x) will probably be unimportant, but methods employingderivatives might fail, because the derivative becomes undefined at the discontin-uity and has different signs on each side of the discontinuity. Hence smallchanges in x do not lead to convergence.One type of discontinuous objective function, one that allows only discretevalues of the independent variable s), occurs frequently in process design be-

cause the process variables assume only specific values rather than continuousvalues. Examples are the cost per unit diameter of pipe, the cost per unit areafor heat exchanger surface, or the insulation cost considered in Example 1.1. Fora pipe, we might represent the installed cost as a function of the pipe diameteras shown in Fig. 4.2. See also Noltie 1978). Although in fact discontinuous, thecost function can for most purposes be approximated as a continuous functionbecause of the relatively small differences ip available pipe diameters. You could

Cost

Commercially availablepipe diameters

Figure 4.2 Installed pipe cost as afunction of diameter.


6/36

26 BASIC CONCEPTS OF OPTIMIZATIONthen disregard the discrete nature .of the functi.on and .optimize the C.ost as if the.diameter were a c.ontinu.ous variable. Once the .optimum value .of the diameter is.obtained f.or the c.ontinu.ous fl.mcti.on the discrete-valued diameter nearest t.o the.optimum that is c.ommercially available can be selected. A sub.optimal value f.orinstalled cost will result, but such a s.oluti.on sh.ould be sufficiently adequate forengineering purp.oses because of the narrow intervals between discrete values .ofthe diameter.

EXAMPLE 4.2 OPTIMIZ TION INVOLVING N INTEGER-VALUEDVARIABLEConsider a catalytic regeneration cycle in which there is a simple trade-off betweencosts incurred during regeneration and the increased revenues due to the regenerated catalyst. Let x 1 be the number of days during which the catalyst is used in thereactor, and l be the number of days for regeneration. The reactor start-up crewis only available in the morning ,shift, so Xl l must be an integer.We will assume that the reactor feed-flow rate q (kg/day) is constant as is thecost of the feed C l ( /kg), the value-of the product C l ( /kg), and the regenerationcost C3 ( /regeneration cycle). We will further assume that the catalyst deterioratesgraduaJly according to the linear relation

d = 1.0 - kXlwhere 1.0 represents the weight fraction conversion of feed at the start of the operating cycle, and k is the deterioration factor in units of weight fraction per day.Define an objective function and find the .optimal value of X lSolution For one complete cycle of operation and regeneration, the objective function for the total profit per day is comprised ofProfit-D = product value - feed cost - (regeneration cost per cycle) (cycles per day)ayor in the defined notation qClxldavg - qC1x l - C3f x) = - - - - - - - - - - ' - -

Xl lwhere d vg = 1.0 - kxt/2).a)

The maximum daily profit for an entire cycle would be obtained by maximizing Eq. a) with respect to X l When the first derivative of Eq a) is set equal tozero and the resulting equation solved for XI the optimum ist [1 2) Xl C)Jl/lx ~ = X l Xl k Xl - --c; qCl

Suppose Xl = 2, k l = 0.02, q = 1000, Cl = 1.0, C l = 0.4, and C3 = 1000. Thenx ~ t = 12.97 (rounded to 13 days if XI is an integer).Clearly, treating XI as a continuous variable may be improper if Xl is 1 2 3etc., but is probably satisfactory if XI is 15 16 17 etc. You might specify Xl interms of shifts of four or eight hours instead of days to obtain finer subdivisions oftime.


7/36

4.2 UNIMODAL VERSUS MULTIMODAL FUNCTIONS 127You will find in real life that other problems involving discrete variablesmay not be so nicely posed. or example, if the cost is a function of the numberof discrete pieces of equipment, such as compressors, the optimization procedurecannot ignore the integer character of the cost function because usually only asmall number of pieces of equipment are involved. You cannot install 1.54 com-

pressors, and rounding off to 1 or 2 compressors may be quite unsatisfactory.This subject will be discussed in more detail in Chap. 9.4 2 UNIMOD L VERSUSMULTIMOD L FUNCTIONSIn formulating an objective function, it is far better, if possible, to choose aunimodal than a multimodal function as a performance criterion. Compare Fig.4.3a with Fig. 4.3b. A unimodal function f(x) in the range specfied for x) has asingle extremum minimum or maximum) whereas a multimodal function hastwo or more extrema. f f (x) = 0 at the extremum, the point is called a station-ary point and can be a maximum or a minimum). Multiple stationary pointsoccur for multimodal functions. The distinction between the global extremum,the biggest or smallest among a set of extrema, and local extrema any extre-mum) becomes significant in numerous practical optimization problems involv-ing nonlinear functions. Numerical procedures usually terminate at a localextremum, and that point may not be the point you are seeking.) More pre-cisely, for a function of a single variable, if x is the point where f(x) reaches amaximum see Fig. 4.3a , unimodality is defined as

a

f(x l < f(x z < f(x*)f(x 4 < f(x 3 < f(x*)

x

Figure 4.3a A unimodal function.b

Xl < Xz < xx < X < X4

x

4.1)


8/36

28 BASIC CONCEPTS O OPTIMIZATION

a b x

Figure 4 3b A multimodal function.

or a maximum, the function f(x) monotonically increases from the left to themaximum, and monotonically decreases to the right of the maximum. An analo-gous set of relations to 4.1) can be written for a minimum. In Fig. 4.4 the pointX represents a saddle point the concept of which is clear when you later onexamine Fig. 4.14). We shall in Chap. 5 describe numerical techniques that arebased on the underlying assumption that the function treated is unimodal. Theproperty of unimodality is difficult to establish analytically, as demonstrated byWilde and Beightler 1967). or functions of one or two variables, the functioncan be plotted, making it obvious whether or not the function is unimodal. orexample, Fig. 4.4 illustrates the contours of a multimodal function of two vari-ables with two minima.

Figure 4.4 Contours of (x , x 2 ) , a multimodal function.


9/36

4 3 CONVEX AND CONCAVEFUNCTIONS

4 3 CONVEX AND CONCAVE FUNCTIONS 29

Determination of convexity, or concavity, will help you establish whether a localoptimal solution is also the global optimal solution (the best among all solu-tions), a matter of some concern in view of the remarks made in Sec 4 2 and inChap. 1 regarding multiple optima. When the objective function is known tohave certain properties (defined below), computation of the optimum can be ac-celerated by using appropriate optimization algorithms. A function is called con-cave over the region R if the following relation holds. For any two differentvalues of x x may be a vector x), a and b lying in the region R,

(4.2)where ) is a scalar having a value between 0 and 1 The function is strictlyconcave i the greater than or equal to sign of (4.2) is replaced with the greaterthan > ) sign.Figure 4 5 illustrates the concepts involved in Eq. (4.2). f the values off x) on each straight line (in the figure the dashed line) connecting pairs offunction values f xa) and f x b for all pairs a and b lie on or below thefunction values themselves, then the function is concave. f the values of f x) oneach straight line between f xa) and f x b lie above or on the function valuesthemselves, then the function is said to be convex. For a convex function theinequality sign would be reversed in Eq. (4.2). What would strictly convex implywith respect to the:::;; sign? The expressions strictly concave or strictly con-vex imply that the dashed lines between f xa) and f xb cannot fall on thefunction itself except at the end points of the line. The common example of afunction that is convex but not strictly convex is a straight line; the line is alsoconcave but not strictly concave.Equation (4.2) is not a convenient equation to use in testing for convexityor concavity. Instead, we will make use of the second derivative of f x), orV2 f x) if x is a vector. v2 f x) is called the Hessian matrix of f x), often de-noted by the symbol H(x), and is the symmetric matrix of second derivatives of

a) Concave function b) Convex function

f x)

o xFigure 4 5 Comparison of concave a) and convex b) functions.

x


10/36

13 BASIC CONCEPTS O OPTIMIZATIONa) Concave function b) Convex function

: (X) /x 7 x

Figure 4.6 Plot of the first derivatives of a quadratic concave and convex function of a singlevariable.

f(x) see App. B . or example, if f(x) is a function of two variables and isquadratic

Suppose we examine the simplest function of one variable that can have curva-ture, a quadratic function, such as shown in Figs. 4 5a and b. A plot of the firstderivative of f(x) would appear as indicated in Fig. 4.6. Figure 4.7 illustrates thevalue of the second derivative, f (x) = d2f(x)/dx 2rom Fig. 4.7, we conclude that if the sign of the second derivative of f(x)in the range of a :::;; x :::;; b is always negative or zero, then f(x) is concave; if thesign of f (x) is positive or zero nonnegative) for a .:::;; x :::;; b, then f(x) will beconvex. Strictly concave means that the sign of f (x) is always negative and

strictly convex means that the sign of f (x) is always positive in the specifiedrange of x. A general proof of these statements is available in Aoki 1971).

a) Concave function

rexo r ~ x

b) Convex function

rexo ~ x

Figure 4.7 Second derivative of f(x) for a quadratic function.


11/36

4.3 CONVEX AND CONCAVE FUNCTIONS 3

EXAMPLE 4.3 ANALYSIS FOR CONVEXITY AND CONCAVITYFor each of the functions below,

a) f(x) = 3x2

b) f(x) = 2xc) f(x) = x 2d) f(x) = 2X2 - x 3determine i f(x) is convex, concave, strictly convex, strictly concave, all, or noneof these classes in the range - ro S; x S; rooSolutiona) f (x) = 6, always positive, hence f(x) is both strictly convex and convex.b) f (x) = 0 for all values of x, hence f(x) is convex and concave. Note straightlines are both convex and concave simultaneously.c) f (x) = -10, always negative, hence f(x) is both strictly concave and concave.d) f (x) = 6':' 3x; may be positive or negative depending on the value of x, hence

f(x) is not convex or concave over the entire range of x.

The concepts of concavity and convexity also apply to a multivariablefunction f(x). For any objective functions, the Hessian matrix H x) must beevaluated to determine the nature of f(x). First let us summarize the definitionsof matrix types:1. H is positive definite if and only if xT H x is >0 for all x O2. H is negative definite if and only if x T H x < 0 for all x O3. H is indefinite if x T H x < 0 for some x and >0 for other x.Definitions 1 and 2 can be extended to H being positive semidefinite xT H x 0)or negative semidefinite xT H x 0), for all x. t can be shown from a Taylorseries expansion that if f(x) has continuous second partial derivatives, f(x) isconcave if and only if the Hessian matrix is negative semidefinite. For f(x) to bestrictly concave, H must be negative definite. For f(x) to be convex H x) mustbe positive semidefinite and for f (x) to be strictly convex, H x) must be positivedefinite.Two convenient tests can be used to establish the status of H x) for strictconvexity:1) All diagonal elements must be positive and the determinants of all leadingprindpal minors, det {M; H)}, and of H x) itself, det H), are positive >0).Keep in mind that H x) must be a symmetric matrix.

Another test is2) All the eigenvalues of H x) are positive > 0 .


12/36

132 BASIC CONCEPTS OF OPTIMIZATIONTable 4.1 Relationship between the character of f x) andthe state of H x)

All the eigen- Determinants of thevalues of H x) leading principalf x) is H x) is are minors of H* L\,)

Strictly convex Positive definite >0 L\ > 0, L\2 > 0, ...Convex Positive semi- ~ L\1 ~ 0, L\2 ~ 0, ...definite

Concave Negative semi- :0;0 L\1 : ; 0, L\2 ~ 0, L\3 : ; 0, ...definite alternating sign)

Strictly concave Negative definite 0, L\3 < 0alternating sign)

Refer to App. B for.details of how to calculate the leading principal minors andeigenvalues. For strict concavity, two alternative definitions similarly can be employed:1) All diagonal elements must be negative and det H) and det {M; H)} > 0if i is even i = 2,4,6, ...); det H) and det {M; H)} < 0 if i is odd

i = 1, 3, 5, ... , where Mi is the ith principal minor determinant.2) All the eigenvalues of H x) are negative < 0).

To establish convexity and concavity, the strict inequalities > or < respectively, in the above tests are replaced by ~ or ~ respectively. f a function has astationary point where the Hessian has eigenvalues of mixed signs, the functionis neither convex nor concave.

Table 4.1 summarizes the relations between convexity, concavity, and thestate of the Hessian matrix of f x). We have omitted the indefinite case for H,that is, when f x) is neither convex or concave.

EXAMPLE 4.4 DETERMINATION OF POSITIVE DEFINITENESSClassify the function f x) = 2xi - 3X I X2 x ~ using the categories in Table 4.1,or state that it does not belong in any of the categories.Solution

af x)-a-- = 4Xl - 3x2Xlaf x)-a-- = 3X I 4X2X 2

a2f x) a2f x)=4 =4axi x ~a2f x) a2f x)= = 3aX IaX2 aX2aX I

Both diagonal elements are positive.


13/36

4 3 CONVEX AND CONCAVE FUNCTIONS 133

The leading principal minors are:M1 order 1 = 4M 2 order 2 = H

detM1 =4detM = 7

hence H x) is positive definite. Consequently, f x) is strictly convex as well asconvex).

EXAMPLE 4 5 DETERMINATION OF POSITIVE DEFINITENESSRepeat the analysis of Example 4.4for f x) = xi + X X 2 + 2X2 + 4olution

H x) = [ ~ ~det H x)) = 1

The principal minors are:detM = 1

det Ml = 2Consequently, f x) does not fall into any of the categories in Table 4 1 We con-clude that no unique extremum exists. Can you demonstrate the same result bycalculating the eigenvalues of H?

EXAMPLE 4 6 DETERMINATION OF CONVEXITY ND CONCAVITYRepeat the analysis of Example 4.4 for f x) = Xl + 3x2 + 6olution

H x) = [ ~ ~hence the function is both convex and concave.EXAMPLE 4 7 DETERMINATION OF CONVEXITYConsider the following objective function: is it convex? Use eigenvalues in theanalysis.

olutiona f x)----;---z- = 4UX I a

2 f x)----;---z- = 3UX 2Therefore the Hessian matrix is

H x) = [ ~ ~


14/36

134 BASIC CONCEPTS OF OPTIMIZATION

Next determine the eigenvalues of H x).

[ 4 - I Xdet 21X2 - 71X 8 = 0

1X1 = 5.561X2 = 1.44

Because both eigenvalues are positive, the function is strictly convex and convex,of course) for all values of Xl and X 2

EXAMPLE 4 8 CLASSIFICATION O STATIONARY POINTSFind the stationary points and their classification for the nonlinear function

f x) = xi ~ - 3x 8x 2 2Solution The stationary points are found from solving simultaneously

ando 2- = 0 = 3x - 3xo= =2x2 8oX 2

and are 1, -4) and -1, -4 . Next find the Hessian:

[6X1 OH x) = 0 2

For x* = 1, -4), H is positive definite, hence f x*) is convex at that point. Forx* = -1, -4), H is indefinite. This point corresponds to a saddlepoint, a topic tobe discussed later in Sec. 4.6.

In some instances we can use the concept that a sum of convex concave)functions is also convex concave). We can examine any f x) in terms of itscomponent parts. I f the function is separable into f x) = f l X) f ix , and iff l X) is convex and f 2 X) is convex, then f x) is convex. For examplef x) = a x - c1)Z az{xz - CZ)2 ai 0

is convex.4 4 CONVEX REGIONA convex region set of points) plays a useful role in optImIzation involvingconstraints. Figure 4.8 illustrates the concept of a convex region and one that is


15/36

a)

Convexregion

Figure 4.8 Convex and nonconvex regions.

4.4 CONVEX REGION 135b)

not convex. A convex set of points exists if for any two points in a region, Xaand Xb, all points x = p Xa 1 - P,)Xb where ;; p ::;; 1, on the line joining Xaand b are in the set. In Fig. 4.8b note how this requirement is not satisfiedalong the dashed line. f a region is completely bounded by concave functionsfor the case in which all g; x) 0, then the functions form a closed convexregion. For the case in which all the inequality constraints stated in the formgi X) ::;; re convex functions, the functions form a convex region. Keep inmind that straight lines are both concave and convex functions.

EX MPLE 4.9 DETECTION OF A CONVEX REGIONDoes the following set of constraints that form a closed region form a convexregion

x i X 2 1X 1 2 ::;; 2

Solution. A plot of the two functions indicates the region circumscribed is closed.The arrows in Fig. E4.9 designate the directions in which the inequalities hold.Write the inequality constraints as i O Therefore

gl X) = x i X 2 - 1 0gzCx) = Xl X 2 - 2 0

That the enclosed region is convex can be demonstrated by showing that bothgl X) and gzCx) are concave functions:

H[gl X)] = [ ~ ~H[g2 X)] = [ ~ ~

Since all eigenvalues are zero or negative, according to Table 4.1 both gl and g2are concave and the region is convex.


16/36


Figure E4 9 Convex region composed of two concave functions.

EXAMPLE 4 10 CONSTRUCTION OF A CONVEX REGION

Construct the region given by the following inequality constraints; is it convex?Xl ::;; 6; x ::;; 6; l 0; l X ::;; 6; X 0Solution See Fig. E4 1O for the region delineated by the inequality constraints. Byvisual inspection the region is convex. This set of linear inequality constraints

Figure E4 10 Diagram of region defined by linear inequality constraints.


17/36

4 4 CONVEX REGION 137

forms a convex region since all the constraints are concave. In this case the convexregion is closed.

As mentioned before, the existence of convex regions has an importantbearing on optimization of functions subject to constraints. For example, examine Fig. 4 9 In optimization problems involving constraints, points that satisfyall the constraints are said to be feasible points; all other points are nonfeasible

....- ....------............

/ \ \Feasible-++---'-----+-, ~ - _ ; - _ - , ----ill--l)nconstrainedmaximum / / maximumF e a s i b l e - ~ ~ ~ ~region

Local

Feasible - - ~ ~ ~region

,

~ ' r F ' = - - - . / / 1/

.....

a)

~ ~ ~ ~ ~ ~ ~ ? f . l f f - - - - - - f - ~ - U n c o n s t r a i n e d

b)

- ..... 30, ........ 20 2515

c)

maximum

Local and globalmaximum- -

........ 50-4035

Figure 4.9 Effect of the character of the region of search in solving constrained optimization problems.


18/36

138 BASIC CONCEPTS OF OPTIMIZATIONpoints. Inequality constraints specify a feasible region comprised of the set ofpoints that are feasible whereas equality constraints limit the feasible set of pointsto hypersurfaces (multidimensional surfaces defined by hi X 1, X2 , , xn) = 0),curves, or perhaps even a single point. Solid lines in Fig. 4.9 represent inequalityconstraint boundaries. Dashed lines are the contours of the objective function.All points that satisfy all the inequality constraints a) as strict inequalities aretermed interior points, and b) as equalities are termed boundary points. All otherpoints are exterior points.

In Fig. 4.9a the maximum of the unconstrained objective function lies outside the feasible region. t is clear that the given set of constraints requires thatthe search for the optimum terminate at a lesser value of the objective function(for the maximum) termed the constrained solution. In Fig. 4.9a one of theconstraints at the optimal solution is active, that is, the inequality constraint issatisfied at its boundary as an equality =0). f no constraints are active atthe optimal solution then the unconstrained maximum can be reached as shownin Fig. 4.9b. Figure 4.9c illustrates the importance of having the constraints comprise a convex region if a global extremum is to be located. The search for themaximum of f x), if initiated in the left-hand part of the feasible region, mightwell terminate at point A, whereas a search starting in the right-hand side of theregion might terminate at B. Therefore, we conclude that the nature of thesearch region has an important bearing on the potential for obtaining suitableresults in optimization. f the feasible region in Fig. 4.9c were extended, asshown by the dashed line, to make the region convex, then the search from anyinitial point would converge to the same answer.4.5 NECESSARY AND SUFFICIENTCONDITIONS FOR AN EXTREMUMOF AN UNCONSTRAINED FUNCTIONIn optimization of an unconstrained function we are concerned with finding theminimum or maximum of an objective function f x), a function of one or morevariables. The problem can be interpreted geometrically as finding the point inan n-dimension space at which the function has an extremum. Examine Fig. 4.10in which the contours of a function of two variables are displayed.An optimal point x* is completely specified by satisfying what are calledthe necessary and sufficient conditions for optimality. A condition N is necessaryfor a result R if R can be true only if the condition is true R => N). However,the reverse is not true, that is if N is true, R is not necessarily true. A conditionis sufficient for a result R if R is true if the condition is true (S => R). A conditionT is necessary and sufficient for result R if R is true if and only if T is true

T ~ R ) .The easiest way to develop the necessary and sufficient conditions for aminimum or maximum of f x) is to start with a Taylor series expansion aboutthe presumed extremum x

f x) = f x*) VTf x*) Ax t{AX T)V2f x*)Ax 03(Ax) (4.3)


19/36

4 5 NECESSARY AND SUFFICIENT CONDITIONS OR EXTREMUM O UNCONSTRAINED FUNCTION 139X

X2

igure 4 10a A function of two variableswith a single stationary point the extre-mum.

Figure 4 10b A function of two variables withthree stationary points and two extrema andB


20/36


where ~ = x - x*, the perturbation of x from x*. We assume all terms in Eq.4.3) exist and are continuous, but will ignore the terms of order 3 or higher[ O i ~ x ) ] , and simply analyze what occurs for various cases involving just theterms through the second ortfer.

By defining a local minimum is a point x* such that no other point in thevicinity of x yields a value of f x) less than f x*), orf x) - f x*) ~ 0 4.4)

x* is a global minimum if 4.4) holds for any x in the n-dimensional space of x).Similarly, x is a local maximum iff x) - f x*) ;; 0 4.5)

Examine the second term on the right-hand side of Eq. 4.3): V T f x * ) ~ x .Because ~ is arbitrary and can have both plus and minus values of elements,we must insist that V f x*) = 0 or otherwise we could add a term to f x*) sothat Eq. 4.4) for a minimum, or 4.5) for a maximum, would be violated. Hence,a necessary condition for a minimum or maximum of f x) is that the gradient off x) vanishes at x*

Vf x*) = 0 4.6)that is, x* is a stationary point.With the second term on the right-hand side of Eq. 4.3) forced to be zero,we next examine the third term: 1 < : ~ X T ) V 2 f x * ) ~ x . This term establishes thecharacter of the stationary point minimum, maximum, or saddle point). In Fig.4.10b A and B are minima while C is a saddle point. Note how movement alongone of the perpendicular search directions dashed lines) from point C increasesf x) whereas movement in the other direction decreases f x). Thus, satisfactionof the necessary conditions does not guarantee a minimum or maximum. Figure4 11 illustrates the character of f x) if the objective function is a function of asingle variable.

f x)

xa inflection point scalar equivalent to a saddle point)b global maximum and local maximum)c local minimumd local maximum

igure 4 11 A function exhibitingdifferent types of stationary points.


21/36

4.5 NECESSARY AND SUFFICIENT CONDITIONS FOR EXTREMUM OF UNCONSTRAINED FUNCTION 4

To establish the existence of a minimum or maximum at x*, we knowfrom Eq. (4.3) with Vf(x*) = 0 and the conclusions reached in Sec. 4.3 concerning convexity that for Llx i 0

'l/2f(x*)Positive definitePositive semidefiniteNegative definiteNegative semidefiniteIndefinite

>0~ o


22/36

142 BASIC CONCEPTS OF OPTIMIZATIONf(x)

1 o xFigure E4 11

I f both first and second derivatives vanish at the stationary point, thenfurther analysis is required to evaluate the nature of the function. For functions ofa single variable, take successively higher derivatives and evaluate them at the stationary point. Continue this procedure until one of the higher derivatives is notzero (the nth one); hence, f (x*), f (x*), ... , pn-l)(x*) all vanish. Two cases mustbe analyzed:a) I f n is even, the function attains a maximum or a minimum; a positive sign of

f(n) indicates a minimum, a negative sign a maximum.b) I f n is odd, the function exhibits a saddle point.For more details refer to Beveridge and Schechter (1970).

For application of these guidelines to f(x) = x you will find d4j(x)/dx4 =24 for which n is even and the derivative is positive, so that a minimum exists.

EXAMPLE 4 12 CALCULATION OF EXTREMAIdentify the stationary points of the following function (Fox, 1971), and determineif any extrema exist.

Solution For this function, three stationary points can be located by settingVf(x) = :

af(x) 2= 4 4X2 - 2Xl - 2Xl = 0UX 2

a)

b)

The set of nonlinear equations a) and b) has to be solved, say by Newton smethod, to get the pairs (Xl x 2) as follows:


23/36

4.5 NECESSARY AND SUFFICIENT CONDITIONS FOR EXTREMUM OF UNCONSTRAINED FUNCTION 43

Stationary point Hessian matrix(x X z) f x) eigenvalues Classification

1.941, 3.854) 0.9855 37.03 0.97 Local minimum-1.053 1.028) -0.5134 10.5 3.5 Local minimum

also the globalminimum)0.6117, 1.4929) 2.83 7.0 -2.56 Saddle point

Figure 4.10b shows contours for the objective function in this example. Notethat the global minimum can only be identified by evaluating f x) for all thelocal minima. For general nonlinear objective functions, it is usually difficult toascertain the nature of the stationary points without detailed examination ofeach point.

EXAMPLE 4.13In many types of processes such as batch constant-pressure filtration or fixed-bedion exchange, the production rate decreases as a function of time. At some optimaltime, tOP production is terminated at poPt and the equipment is cleaned. FigureE4.13a illustrates the cumulative throughput pet) as a function of time t) for sucha process. For one cycle of production and cleaning, the overall production rate is

R t) = pet)t tc (a)where R t) is the overall production rate per cycle mass/time) and tc is the cleaning time assumed to be constant).Determine the maximum production rate and show that popt is indeed themaximum throughout.

pet)

Figure E4.13a


24/36


Solution Differentiate R t) with respect to t and equate the derivative to 0:dR t) = - pet) [dP t)/dt] t te) = 0

dt t tPOP = dP t) I t te)dt OP b)

The geometric interpretation of Eq. b) is the classical result (Walker et aI. 1937)that the tangent to pet) at POP intersects the time axis at - te Examine Fig. E4 13bThe maximum overall production rate is

P t)

igure E4 13b

POPROP =tOP te

Slope = dP t) Idt opt

d2 P ~ t (the slope) is negativedt~ __________ __ _____o

igure E4.13c

c)


25/36

4.6 INTERPRETATION OF THE OBJECTIVE FUNCTION 145Does popt meet the sufficiency condition to be a maximum? Is

d2 R t) 2P t) - 2[dP t)jdt] t tc) [d2P t)/dt2 ] t t=


26/36


f x)

Figure 4 12 Geometry of second order objective function of two independent variables circularcontours.

Figure 4 13 Geometry of second order objective function of two independent variables ellipticalcontours.


27/36

4 6 INTERPRETATION OF THE OBJECTIVE FUNCTION 47

f x)

Figure 4 14 Geometry of second order objective function of two independent variables saddlepoint.

f x)

Figure 4 15 Geometry of second order objective function of two independent variables valley.


28/36


Figure 4.16 Geometry of second-order objective function of two independent variables-fallingvalley.

different types of surfaces corresponding to each case that arises for quadraticfunction. By implication, analysis of a function of many variables via examination of the eigenvalues can be conducted whereas contour plots are limited tofunctions of only two or three variables.Figures 4.12 and 4.13 correspond to objective functions in well-posed optimization problems. In Table 4.2, cases 1 and 2 Fig. 4.12) correspond to contours of f x) that are concentric circles, but such functions rarely occur inpractice. Elliptical contours such as correspond to cases 3 and 4 are most likelyfor well-behaved functions. Cases 5 to 10 correspond to degenerate problems,those in which there is no finite maximum/minimum and/or perhaps nonuniqueoptima appear.

EXAMPLE 4.14 INTERPRETATION OF AN OBJECTIVE FUNCTIONIN TERMS OF ITS EIGENVALUESExamine the function f x) = 2xi ~ 2x t x Z

H x) = [ ~ ~ d [ 4 - a 2 ] _ 0et 2 2 - aZ - a 4 = 0

The eigenvalues are = 6 J36 - 16)/2 = 3 fl both positive, but not equal,corresponding to case 4 in Table 4.2.


29/36

4.6 INTERPRETATION OF THE OBJECTIVE FUNCTION 149

H(x) = [ ~ ~ [ 2 - IX 2 Jdet 2 2 _ IX = 0SO and

This case corresponds to case 9 in Table 4.2.

For well-posed quadratic objective functions the contours always form aconvex region; for more general nonlinear functions, they do not see for example Figs. 4.10a or 4.10b . t is helpful to construct contour plots to assist inanalyzing the performance of multivariable optimization techniques when applied to problems of two or three dimensions. Most computer libraries havecontour plotting routines to generate the desired figures.

As indicated in Table 4.2, the eigenvalues of the Hessian matrix of f x)indicate the shape of a function. For a positive definite symmetric matrix, theeigenvectors refer to App. B) form an orthonormal set. For example, in twodimensions, if the eigenvectors are Vi and V2 , ViV2 = 0 the eigenvectors are perpendicular to each other). The eigenvectors also correspond to the directions ofthe principal axes of the contours of f x). This information can be used to makea variable transformation which eliminates the tilt in the contours, and yieldssearch directions that are efficient. The next example illustrates such a calculation.

EXAMPLE 4 15For f x) = 3xi 1X 2 l . 5 x ~ find the equations for the principal axes and determine a transformation x = Vz such that = allzi 2 2 z ~ thus eliminating theinteraction term.Solution The optimum is at 0,0). First, compute H(x) and the correspondingeigenvalues and eigenvectors.

The eigenvalues are 2 and 7 and the corresponding eigenvectors are

andRefer to App. B for the procedure to calculate these quantities. Note that viv = O

From linear algebra Amundson, 1966), the transformation which eliminatesthe interaction (X 1X2) term is

x=Vz where


30/36

156 BASIC CONCEPTS OF OPTIMIZATIONSubstituting for V and z = G J

Xl = Zl +2Z2X 2 = Zl Z

a)b)

f Eqs. a) and b) are substituted into the objective function the result in terms ofZl and Z is

fez = 5zI 1 7 5 z ~The equations for the principal axes can be found from the eigenvectors

[ 1 2] and [2 1]X 2 = O x lX 2 = 2 . 0 X l

Figure E4 15 illustrates the contours of f x) and the principal axes for this Example The variable transformation represents rotation of the variable axes from theXl - X 2 plane to the Zl - Z plane.

Figure E4 1S Contours and principal axes for f = 3xi 2x t x2 5 x ~


31/36

SUPPLEMENTARY REFERENCES 5

This example was somewhat simplified because the origin of the principalaxes and the origin for the original axes were the same. f they were not the same,then the origin for the principal axes would be at the extremum, and you wouldtransform the variables so that the new variables in the principal axes, y, would berelated to x by y x - x*, and thus eliminate linear terms in f x).

One of the primary requirements of any successful optimization techniqueis the ability to be able to move rapidly in a local region along a narrow valley(in minimization) toward the minimum of the objective function. In other words,an efficient algorithm will select a search direction which generally follows theaxis of the valley rather than jumping back and forth across the valley. Valleys(ridges in maximization) occur quite frequently, at least locally, and these typesof surfaces have the potential to slow down greatly the search for the optimum.A valley lies in the direction of the eigenvector associated with a small eigenvalue of the Hessian matrix of the objective function. For example, if theHessian matrix of a quadratic function isHG ~then the eigenvalues are ) l = 1 and ()(2 = 10. The eigenvector associated with()(l = 1, that is, the l axis, is lined up with the valley in the ellipsoid. Transformation techniques such as those discussed above can be used to allow the problem to be more efficiently solved by a search technique. (See Chap. 6.Valleys and ridges corresponding to cases 1 through 4 can lead to a minimum or maximum, respectively, but not for other cases. Do you see why?REFERENCESAmundson, N. R., Mathematical Methods n Chemical Engineering: Matrices and Their ApplicationPrentice-Hall, Englewood Cliffs, New Jersey (1966).Aoki, M., Introduction to Optimization Techniques Macmillan Co., New York (1971), p.24.Beveridge, G. S. G., and R. S. Schechter, Optimization: Theory and Practice McGraw-Hill, NewYork (1970), p. 126.Fox, R. L., Optimization Methods for Engineering Design Addison-Wesley, Reading. Massachusetts(1971), p.42.Noltie, C. B., Optimum Pipe Size Selection Gulf Pub . Co., Houston, Texas (1978).Walker, W. H., W. K Lewis, W. H. McAdams, and E R. Gilliland, Principles of Chemical Engineer-

ing 3d ed., McGraw-Hill, New York (1937), p. 357.Wilde, D. J., and C. S. Beightler, Foundations of Optimization Prentice-Hall, Englewood Cliffs, NewJersey (1967), p. 220.

SUPPLEMENT RY REFERENCESAvriel, M., Nonlinear Programming Prentice-Hall, Englewood Cliffs, New Jersey, (1976).Jeter, M. W., Mathematical Programming Marcel Dekker, New York (1986).Walsh, G. R., Methods of Optimization John Wiley, New York (1975).Wilde, D. J., Optimum Seeking Methods Prentice-Hall, Englewood Cliffs, New Jersey (1964).


32/36


PRO LEMS4 1 Classify the following functions as continuous specify the range) or discrete:

a) f x) = eb) f x) = aXn - 1 b xo - x n) where Xn represents a stage in a distillationcolumnc) f x) = n - xsI 1 xs) where n = concentration of vapor from a still and

Xs is the concentration in the still4 2 The future worth S of a series of n uniform payments each of amount P is

PS = [ 1 i n - 1]I

where i is the interest rate per period. f i is considered to be the only variable, is tdiscrete of continuous? Explain. Repeat for n. Repeat for both nand i being variables.

4 3 In a plant the gross profit P in dollars isP = nS - nV F)

where n is the number of units produced per year, S is the sales price in dollars perunit, V is the variable cost of production in dollars per unit, and F is the fixedcharge in dollars. Suppose that the average unit cost is calculated asnV FAverage unit cost = n

Discuss under what circumstances n can be treated as a continuous variable.4 4 One rate of return is the ratio of net profit P to total investment

R = 100 P l - t) = 100 1 _ t) [S - V Fin)]I lin

where t is the fraction tax rate and I is the total investment in dollars. Find themaximum R as a function of n for a given I if n is a continuous variable. Repeat if nis discrete. See Prob. 4.3 for other notation.)

4 5 Classify the following functions as unimodal, multimodal, or neither:a) f x) = 2x 3b) f x) = x 2c) f x) = a cos x)

4 6 Are the following functions unimodal, multimodal, or neither:a) f x) = xi ~ X 1X 2b) f x) = xf - 4xix2 I X ~c) f x) = e - x f + X ~ )


33/36

PROBLEMS 534.7. Determine the convexity or concavity of the following objective functions:

a) f x l , Xz) = Xl - Xz)Z ~b) f x l , Xz, X 3 ) = xi ~ ~c) f x l , Xz) = eX eX

4.8. Given a linear objective function,f = Xl Xz

explain why a nonconvex region such as region A in Fig. P4.8 will yield difficultiesin the search for the maximum.

5

Xl and Xz must lie in region A

o 5FigureP4.8

Why is region A not convex?4.9. Are the following functions convex? Strictly convex? Why?

a) 2xi 2x l xZ x ~ XI 8xz 25What are the optimum values of Xl and xz?b) e5x4.10. A reactor converts an organic compound to product P by heating the material inthe presence of an additive A (mole fraction = xA ). The additive can be injected intothe reactor, while steam can be injected into a heating coil inside the reactor toprovide heat. Some conversion can be obtained by heating without addition of A,

and vice versa.The product P can be sold for 50 per Ib-mol. For lib-mol of feed, the costof the additive (in dollars per Ib-mol) as a function of XA is given by the formula,2.0 lOxA 0 x ~ The cost of the steam (in dollars) as a function of S is 1.00.003S 2.0 x 1O- 6Sz. S = Ib steamjlb-mol feed). The yield equation is p = 0.10.3xA O.OOlS O OOOlxAS; Yp = Ib-mol product Pjlb-mol feed.


34/36

154 BASIC CONCEPTS OF OPTIMIZATIONa) Formulate the profit function basis of 1.0 lb-mol feed) in terms of X A and S.

f = income - costsb) Maximize f subject to the constraints

:5:xA :5:1 S ~by any method you choose.

c) Is f a concave function? Demonstrate mathematically why it is or why it is notconcave.

d) Is the region of search convex? Why?4 11 Show that f = e"1 eX2 is convex. Is it also strictly convex?4 12 Show that f = xi is convex.4 13 Is the following region constructed by the four constraints convex? Closed?

Xz 1 - XlXz :5: 1 O.5x lXl :5: 2Xz 0

4 14 Does the following set of constraints form a convex region?gl X) = - x i x 9 0gz{x) = -Xl - Xz 1 0

4 15 Consider the following problem:Minimize f x) = xi X zSubject to gl X) = xi ~ - 9 :5: 0

gz{x) = Xl ~ - 1 :5: 0g3 X) = Xl xz) - 1 :5: 0

Does the constraint set form a convex region? Is it closed? Hint: a plot will helpdecide.)

4 16 The objective function work requirement for a three-stage compressor can be ex-pressed as .f = ;:f Z8 ;:f Z8 ;:f Z8

Pl = 1 atm and P4 = 10 atm. The minimum occurs at a pressure ratio for each stageof ViO. Is f convex for 1 :5: pz :5: 10, 1 :5: P :5: 10?

4 17 Happel and Jordan [Chemical Process Economics, Marcel Dekker, New York, 1975,p. 178] reported an objective function cost) for the design of a distillation columnas folloW :'" f = 14720 100 - P) 6560R - 30.2PR 6560 - 30.2P+ 19.5n 5000R - 23PR 5000 - 23P O.5

23.2 [5000R - 23PR 5000 - 23P]O.6Z


35/36

PROBLEMS 155where n = number of theoretical stages, R = reflux ratio, and P = percent recoveryin bottoms stream. They reported the optimum occurs at R = 8, n= 55, and P =99. Is f convex at this point? Are there nearby regions where f is not convex?

4 18 Consider the functiony = x - a 2

Note that x = a minimizes y. Let z = x 2 - 4x 16. Does the solution to x 2 - 4x +16 = 0,

4 J 4 8x = - 2 = 2 j2j3minimize z? j = J=i .

4 19 The following objective function can be seen by inspection to have a minimum atx=O:Can the criteria of Sec. 4.5 be applied to;jtest this outcome?

4 20 a) Consider the objective function,f = 6xi ~ 6X 1X2 x ~

Find the stationary points and classify them using the Hessian matrix.b) Repeat for

c) Repeat for

4 21 Classify the stationary points ofa) f = 4 x 3 20b) f = x 3 3x2 X 5c) f = X4 - 2x 2 1d) f = xi - 8X X 2 ~

according to Table 4.2/4 22 List stationary poinfs and their classification maximum, minimum, saddle point) of

a) f = xi Xl x ~ 6x 2 4b) f = Xl X 2 xi - 4X l X2 x ~

4 23 Show that the eigenvectors of the Hessian for a two-variable quadratic optimizationproblem are orthogonal.4 24 Figure P4.24 shows the contours of a quadratic optimization problem, which has aminimum at 2, 2) where f x) = 5. I f the general quadratic function is described by

f = txTAx bTX cgive characteristics of A b and c; be as quantitative as possible.


36/36

56 BASIC CONCEPTS OF OPTIMIZATION6 . 0 ~ - - ~ - - ~ - - ~ ~ - - ~ - - - - -

- 2 . 0 L . - - - - - L - - ~ - - - _ _ _ _ _ _ : L - - - _ : L : _ - - _ _ : : l2.0 0 2 0 4 0 6.0 8.0

Figure P4.244.25. We wish to minimize

f x) = 12xl 2xi x ~ - 2X 1X2a l ) a2) a3) a4)

a) Find the stationary point and determine if it is a maximum or minimum basedon the Hessian matrix.b) The coefficients of each term a l , a2, a3, a4) can affect whether a maximum or aminimum occurs. f three of the four coefficients are fixed at the above valuesand we vary the remaining one, find the individual values of a l , a2, a3, and a4that change the character of the stationary point e.g., from minimum to saddlepoint).

4.26. Forf = x X 1X 2 ~find the transformation x = Vz such thatf = l lz i z ~

Illustrate the transformed coordinate system on a contour plot of f x).

Documents

Chap04 - Basic Concepts of Optimization