Guidelines for Assessing Ill Conditioning in Linear and Integer Programs

© 2009 IBM Corporation

®

Guidelines for Assessing Ill Conditioning in Linear and Integer

Programs

Welcome

Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX Users IBM Software Group

2

IBM Software Group


Presenter:

Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX [email protected]

Moderator:

Gabriela WilsonOptimization Campaign Specialist

On 24 Producer:

Craig Chamides


®

Guidelines for Assessing Ill Conditioning in Linear and Integer

Programs

Welcome

Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX Users IBM Software Group

4

IBM Software Group


Objective

Enable more precise assessment of ill conditioning in linear and integer programs

Discuss some techniques to either mitigate or avoid potential ill conditioning in LP and MIP models

5

IBM Software Group


Outline

Description of ill conditioning

Assessment of ill conditioning

Modeling pitfalls that can contribute to ill conditioning

– Formulation alternatives to avoid ill conditioning

Conclusions

6

IBM Software Group


Problem Definition

Ill Conditioning

– Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?

Meteorological Model

Meteorological Model

Data to 3 decimal places

(.506)

Data to 6 decimal places

(.506127)

7

IBM Software Group


Problem Definition

Ill Conditioning

– Motivated by work of meteorologist & mathematician Edward Lorenz

– Lorenz focused on small changes in initial conditions, resulting trajectories in nonlinear meteorological models• Lorenz subsequently became a pioneer in the field of Chaos Theory

– Ill conditioning extends beyond the nonlinear meteorological models on which Lorenz worked

– More generally, a mathematical model or system is ill conditioned when a small change in the input can result in a large change to the computed solution

8

IBM Software Group


Problem Definition

Ill Conditioning

– Small change in input leads to big change in output

– Can we quantitatively measure ill conditioning?• For many mathematical systems or models, quantitative measures have

yet to be discovered. But, sometimes we can measure it.

• Specifically, we can measure ill conditioning when solving square linear systems of equations

9

IBM Software Group


Condition Number of a Square Matrix (Turing 1948; Rice, 1966) CPLEX solves square linear systems of form:

Exact solution is:

How will a change to the input vector b affect the computed solution x?

Cauchy-Schwarz inequality:

Cauchy-Schwarz for original system:

Combine and rearrange:

bBx bBx 1

bBx 1

b

bBB

x

x 1

)(1 bbBxx

bBx 1

xBb

10

IBM Software Group


Condition Number Condition number of B is defined as

As condition number increases, upper bound on change in solution relative to change in data also increases

Even if the modeler doesn‘t change the data, finite precision computers can introduce small changes

– Machine precision for 64 bit double = 1e-16• Most numbers (e.g. 1/3, 5/12) can‘t be represented exactly

– Ill conditioned linear systems can yield much different results on different machines

• Different floating point representations due to different chips• Same floating point representation, but compilers use it differently

1 BBB

bbBxx )(

11

IBM Software Group


A square linear system is ill conditioned if the condition number is large, well conditioned if the condition number is small

– Higham, Accuracy and Stability of Numeric Algorithms

– Duff, Erisman and Reid, Direct Methods for Sparse Matrices

– Gill, Murray and Wright, Practical Optimization

– Golub and Van Loan, Matrix Computations

– The usual suspects on the web say the same thing as well

What constitutes a large or small value?

– 1000? 10000000? 10000000000000?

And what should we do for algorithms that solve a series of interrelated square linear systems?

– Linear, Mixed Integer, Quadratic and Nonlinear Programming algorithms all do this

Assessment of Ill Conditioning

12

IBM Software Group


What constitutes a large or small value?

– Depends on machine, data and algorithm precision ( ), algorithm tolerances (t)

– Ill conditioning can occur when round-off error associated with precision is large enough to influence algorithm decisions

– Classify based on threshold defined by t /• Four distinct categories

– Example: CPLEX has default algorithmic tolerances of 1e-6, runs double precision arithmetic on machines with precision of ~1e-16• t / = 1e-6/1e-16 = 1e+10 is a key threshold


bbBxx )(

≥ t ?

13

IBM Software Group



Condition number is a bound for the increase of the error:

Basic epsilons:

– Machine precision (double): 1e-16

– Default feasibility and optimality tolerance: 1e-6

Classification of condition numbers for LP bases:

– Stable: 0 ≤ (B) < 1e+7

– Suspicious: 1e+7 ≤ (B) < 1e+10

– Unstable: 1e+10 ≤ (B) < 1e+14

– Ill-posed: 1e+14 ≤ (B)

bbBxx )(

14

IBM Software Group



CPLEX provides condition numbers for the basis associated with the current solution when solving LPs and QPs

What about MIP?

Previously, CPLEX provided Kappa for the associated fixed LP:

– Some indication of condition of primal solution• But optimal basis for fixed LP may not match that of node LP that yielded

the associated integer solution

– Optimality proof of MIP is based on pruning during tree search and thus not available with final solution

How reliable is it?

– Need to monitor condition number of all optimal bases used during Branch-and-Cut search

– Performance impact: computing precisely is equivalent to ~m/4 simplex iterations (m = number of constraints in the model)

• Can be mitigated by sampling, estimating kappa

1B

15

IBM Software Group


MIP Kappa feature, available starting with CPLEX 12.2

– Sample from the series of condition numbers• New parameter CPX_PARAM_MIPKAPPA with settings:

-1: off 0: auto (defaults to off) 1: sample 2: use every optimal basis

– Classification thresholds provide percentages of each category

– Provide an assessment for users unfamiliar with ill conditioning

If enabled, categorize condition numbers of optimal bases• Stable• Suspicious• Unstable• Ill-posed


16

IBM Software Group


MIP Kappa sample output :

Branch-and-cut subproblem optimization:Max condition number: 3.5490e+16Percentage of stable bases: 0.0%Percentage of suspicious bases: 86.9%Percentage of unstable bases: 13.0%Percentage of ill-posed bases: 0.1%Attention level: 0.048893CPLEX encountered numerical difficulties while solving this model.

Attention level

– =0 if only stable bases encountered

– >0 if at least one basis encountered that is not stable

– Max value is 1 (all bases ill-posed)

– Can view as the probability of numerical difficulties given the sample of condition numbers


17

IBM Software Group


Branch-and-cut subproblem optimization:Max condition number: 3.5490e+16Percentage of stable bases: 0.0%Percentage of suspicious bases: 86.9%Percentage of unstable bases: 13.0%Percentage of ill-posed bases: 0.1%Attention level: 0.048893CPLEX encountered numerical difficulties while solving this model.

Ideal output:

– Attention level 0 (no suspicious, unstable or ill posed bases)

Good output:

– No ill-posed or unstable bases

Bad output:

– Any ill posed bases

– More than 5% unstable bases


18

IBM Software Group


Alternate Interpretations of an Ill Conditioned Basis

Condition number of Simplex Solutions

– Simplex solution is intersection point of n hyperplanes

19

IBM Software Group



– Simplex solution is intersection point of n hyperplanesb = change in hyperplane (input); x = change in solution

(output)

b

x

= x / b 1: well conditioned!


20

IBM Software Group




b

x

= x / b >>1: ill conditioned!


21

IBM Software Group




stable Solution

ill-conditioned Solution


22

IBM Software Group



Distance to singularity of a matrix is the reciprocal of its condition number (Gastinel, Kahan).

– Implies that linear combinations of rows or columns of B that are close to 0 imply ill conditioning:

)(/1)(dist

singular :||||

min:)(dist

pp

p

pp }{

BB

BBB

BB

B

dconditione ill hence and singular, toclose is

,|||| ,|||| , if

B

vvBT

23

IBM Software Group


Now that we can better assess the meaning of the basis condition numbers, what can we do about it?

– Ill conditioning can occur under perfect arithmetic.• For some models, even perfect data, algorithm and machine

precision may not address the problem. Consider adjustments to existing formulation, or alternate

formulations that provide the solution to the ill conditioned model.

– But, in most cases, finite precision can perturb the exact system of equations we wish to solve, resulting in significant changes to the computed solution.• Calculate data, formulate model and configure algorithm to keep such

perturbations as small as possible

– Condition number provides a worst case bound on the effect• CPLEX provides good quality solutions on majority of models

containing some basis condition numbers in [1e+10, 1e+14]

Implications of Ill Conditioning

24

IBM Software Group


Can’t avoid perturbations due to machine precision

– But, increasing tolerances when condition number is high can prevent algorithmic decisions based on round-off error associated with machine precision (increase t).

– More precise input values are better (reduce )• Always calculate and input model data in double precision• Machine precision for 32 bit floats ~1e-8

Condition numbers > 1e+2 could result in algorithmic decisions based on machine precision based round-off error If you really need to use single precision in the model data, increase CPLEX’s feasibility and optimality tolerances above the defaults of 1e-6

Implications for the Practitioner (Data Input)

≥ t ?

bbBxx )(

25

IBM Software Group


Minimize perturbations involving other factors

– Model data values • Don’t divide big numbers by small numbers in data calculations

Increases round-off error

• Make sure all procedures that calculate the data are implemented in a numerically stable manner• Less round-off error if all data values of similar order of magnitude

Mix of large and small numbers results in more shifting of the exponents, loss of precision in the mantissa.– Use CPLEX’s aggressive scaling if unavoidable

However, similar orders of magnitude in the coefficients does not guarantee absence of ill conditioning

Implications for the Practitioner (Data Calculation)

bbBxx )(

≥ t ?

26

IBM Software Group


Minimize perturbations involving other factors (continued)

– Algorithmic strategies• Numerically stable implementations of all steps

Simplex method ratio test must avoid dividing big numbers by small ones

• Especially important for the basis factorization in the simplex method Set CPLEX’s Markowitz tolerance to .90 with ill conditioned models Consider turning on CPLEX’s numerical emphasis parameter

Implications for the Practitioner (Algorithm)

bbBxx )(

≥ t ?

27

IBM Software Group


Implications for the Practitioner (Formulation)

Avoid nearly linear dependent rows or columns

– Such linear combinations of rows and columns often arise from round-off error in the data

dconditione ill hence and singular, toclose is

,|||| ,|||| , if

B

vvBT

28

IBM Software Group


Examples

Model data values

– Avoid rounding if you can, or round as precisely as possible

– Matrices can be ill conditioned despite small spread of coefficients

– Exact formulation: Maximize x1 + x2 c1: 1/3 x1 + 2/3 x2 = 1 c2: x1 + 2 x2 = 3

– Imprecisely rounded, single [double] precision Maximize x1 + x2 c1: .33333333 x1 + .66666667 x2 = 1[ c1: .33333333333333333 x1 + .6666666666666667 x2 = 1] c2: x1 + 2x2 = 3

– Scale to integral value whenever possible: Maximize x1 + x2 c1: x1 + 2 x2 = 3 c2: x1 + 2 x2 = 3

)(/1)(dist

singular :||||

min:)(dist

pp

p

pp }{

BB

BBB

BB

B

(results in near singular matrix)(better)

(best)

29

IBM Software Group


Examples

) 10 ~(error )b (b/ a - a/b ) a/(b ~ a/b

)10 ~(error /a b/a )/a (b ~ b/a

)10 1/30000, b 3, (a b/a and a/b compare b, aFor

0

8-

-8

Don’t divide big numbers by small numbers in data calculations

) 10 ~(error )b (b/ a - a/b ) a/(b ~ a/b

)10 ~(error /a b/a )/a (b ~ b/a

)10 1/30000, b 3, (a

8-

16-

-16

30

IBM Software Group


Examples Why doesn’t CPLEX set the Markowitz tolerance to .90 by default?

– The simplex method performs Gaussian elimination • Stability test to avoid dividing big numbers by small ones

– Markowitz tolerance defines acceptable pivot relative to maximum element in a matrix column

– Trade off between accuracy and sparsity of factorization• Most models are not ill conditioned; avoid the loss of sparsity in the factorization by using a smaller Markowitz tolerance (.01 by default)

a11 a12 a13

a21 0 a23

a31 0 0

a11 = 10000, a21 = 101,

a31 = 99, Markowitz = .01

Threshold = 10000*.01 = 100

31

IBM Software Group


Examples

integer

10 ,0

) with constraint(only 0

subject to

)0 ,( Minimize

i

ii

iii

TT

z

zx

zMzx

bAx

fczfxc

“integer feasible” solution within integrality tolerance that violates intent of the model (trickle flow)

Consider alternate formulations to improve numerics

– Fixed costs on continuous variables using big-Ms:

7

810

1 unless branchingfor eligiblenot

1/1 ,100

eMz

eMxzeMx

i

iii

MxzzMxMzx iiiiii //

• LP relaxation solution

• CPLEX default integrality tolerance: 1e-5

(Mixture of large and small numbers)

32

IBM Software Group


Examples

To get correct answers with big-M formulation– Use smallest possible value of big-M that doesn’t violate intent of model

• Bound strengthening in CPLEX presolve often does this automatically

– Set integrality tolerance to 0

– Set simplex tolerances to minimum values, 1e-9

– Ask for more accuracy on a potentially ill-conditioned system• Turn on numerical emphasis parameter

Many users are unfamiliar with issues– Frequent source of CPLEX customer calls– One of most popular CPLEX FAQs

– But should they have to be?

Issues

33

IBM Software Group


Examples

integer

10 ,0

00

subject to

)0 ,( Minimize

i

ii

ii

TT

z

zx

xz

bAx

fczfxc

(integer feasible solutions aligned with intent of the model)branching requires i constraintindicator

0 ,100 ii zx

Indicator constraint formulation for fixed costs on continuous variables

– LP relaxation solution

(CPLEX branches on these directly)

34

IBM Software Group


Examples Which approach to use?

– Indicator formulation more precise representation of model• Indicator and big-M formulation equivalent when M=∞

– If we can use modest values for big-M, indicator formulation tends to be weaker

– Use indicator constraints, let CPLEX decide whether to replace with big-Ms if preprocessing can deduce big-M values of modest size• Presolve tightens the indicator formulation (improved further in

CPLEX 12.2.0) Presolve on indicators (improved) Node presolve on indicators Probing on by default Probing on indicator constraints Re-presolve by default

35

IBM Software Group


Solution Assessment Solution quality (available in all CPLEX APIs)Max. unscaled (scaled) bound infeas. = 5.39174e-07 (5.39174e-07)

(Reduce feasibility tolerance, continue optimizing to decrease bound infeasibilities)

Max. unscaled (scaled) reduced-cost infeas. = 8.61859e-09 (8.61859e-09)

(Reduce optimality tolerance, continue optimizing to decrease reduced cost infeasibilities)

Max. unscaled (scaled) Ax-b resid. = 6.29803e-11 (3.14901e-11)

(If exceeds feasibility tolerance, CPLEX feasibility decisions based on round-off error)

Max. unscaled (scaled) c-B'pi resid. = 5.83291e-14 (5.83291e-14)

(If exceeds optimality tolerance, CPLEX optimality decisions based on round-off error)

Max. unscaled (scaled) |x| = 4084.58 (3892.32)

Max. unscaled (scaled) |slack| = 2562.23 (2562.23)

Max. unscaled (scaled) |pi| = 202.512 (202.512)

Max. unscaled (scaled) |red-cost| = 202.233 (202.233)

Condition number of scaled basis = 1.4e+06

(Use to assess sensitivity of solution to perturbations in the model data)

36

IBM Software Group


Summary and Conclusion

Ill conditioning can occur under perfect arithmetic

– But finite precision introduces perturbations that ill conditioning may magnify• Can’t avoid machine precision

– Take steps to reduce unnecessary perturbations• Use highest precision possible in data calculations

• Either avoid mixtures of large and small coefficients in the problem data, or consider tuning the optimizer to deal with them.

• Optimization algorithm must be numerically stable

• Consider using CPLEX’s numerical emphasis parameter

– Take steps to reduce the ill conditioning in the model• Avoid near linear dependencies in the constraint matrix rows and

columns

• Consider indicator constraints instead of big M formulation.

37

IBM Software Group


Summary and Conclusion

Assessing high condition numbers

– Increasing algorithm tolerances can mitigate effects of large condition numbers• But improvements to formulation, accuracy of input are more robust

– 1e+10, 1e+14 are important thresholds on common machines

– CPLEX’s MIP Kappa feature enables assessment of ill conditioning in MIPs

CPLEX solves most models with some or all of these problems just fine

– Familiarity with these remedies can save the practitioner precious time when they do encounter a badly ill conditioned model.

38

IBM Software Group


Discussion

What other features in CPLEX Optimization Studio would help you bridge the gap between the mathematical model and the practical application?

– How useful would a minimal subset of an ill conditioned model that remains ill conditioned be?

Let us know ([email protected]).

39

IBM Software Group


References/Further Reading

Higham, Accuracy and Stability of Numeric Algorithms

Duff, Erisman and Reid, Direct Methods for Sparse Matrices

Gill, Murray and Wright, Practical Optimization

Golub and Van Loan, Matrix Computations

http://www-01.ibm.com/support/docview.wss?uid=swg21399993

(CPLEX technote on ill conditioning)

40

IBM Software Group


QUESTIONS & ANSWERSAn inspiring Smarter Planet quote???

“A measure of the success of CPLEX in academia is that at any conference where papers are presented, 95 percent of the papers that mention a solver mention CPLEX.” Source: INFORMS

41

IBM Software Group


Merci

Grazie

Gracias

Obrigado

Danke

Japanese

French

Russian

German

Italian

Spanish

Brazilian Portuguese Arabic

Traditional Chinese

Simplified Chinese

Thai

Documents

Guidelines for Assessing Ill Conditioning in Linear and Integer Programs