Upload
muncel
View
51
Download
0
Embed Size (px)
DESCRIPTION
Welcome. Guidelines for Assessing Ill Conditioning in Linear and Integer Programs. Ed Klotz Mathematical Programming Specialist Advanced Support for CPLEX Users IBM Software Group. - PowerPoint PPT Presentation
Citation preview
© 2009 IBM Corporation
®
Guidelines for Assessing Ill Conditioning in Linear and Integer
Programs
Welcome
Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX Users IBM Software Group
2
IBM Software Group
© 2009 IBM Corporation
Presenter:
Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX [email protected]
Moderator:
Gabriela WilsonOptimization Campaign Specialist
On 24 Producer:
Craig Chamides
© 2009 IBM Corporation
®
Guidelines for Assessing Ill Conditioning in Linear and Integer
Programs
Welcome
Ed KlotzMathematical Programming SpecialistAdvanced Support for CPLEX Users IBM Software Group
4
IBM Software Group
© 2009 IBM Corporation
Objective
Enable more precise assessment of ill conditioning in linear and integer programs
Discuss some techniques to either mitigate or avoid potential ill conditioning in LP and MIP models
5
IBM Software Group
© 2009 IBM Corporation
Outline
Description of ill conditioning
Assessment of ill conditioning
Modeling pitfalls that can contribute to ill conditioning
– Formulation alternatives to avoid ill conditioning
Conclusions
6
IBM Software Group
© 2009 IBM Corporation
Problem Definition
Ill Conditioning
– Does the flap of a butterfly’s wings in Brazil set off a tornado in Texas?
Meteorological Model
Meteorological Model
Data to 3 decimal places
(.506)
Data to 6 decimal places
(.506127)
7
IBM Software Group
© 2009 IBM Corporation
Problem Definition
Ill Conditioning
– Motivated by work of meteorologist & mathematician Edward Lorenz
– Lorenz focused on small changes in initial conditions, resulting trajectories in nonlinear meteorological models• Lorenz subsequently became a pioneer in the field of Chaos Theory
– Ill conditioning extends beyond the nonlinear meteorological models on which Lorenz worked
– More generally, a mathematical model or system is ill conditioned when a small change in the input can result in a large change to the computed solution
8
IBM Software Group
© 2009 IBM Corporation
Problem Definition
Ill Conditioning
– Small change in input leads to big change in output
– Can we quantitatively measure ill conditioning?• For many mathematical systems or models, quantitative measures have
yet to be discovered. But, sometimes we can measure it.
• Specifically, we can measure ill conditioning when solving square linear systems of equations
9
IBM Software Group
© 2009 IBM Corporation
Condition Number of a Square Matrix (Turing 1948; Rice, 1966) CPLEX solves square linear systems of form:
Exact solution is:
How will a change to the input vector b affect the computed solution x?
Cauchy-Schwarz inequality:
Cauchy-Schwarz for original system:
Combine and rearrange:
bBx bBx 1
bBx 1
b
bBB
x
x 1
)(1 bbBxx
bBx 1
xBb
10
IBM Software Group
© 2009 IBM Corporation
Condition Number Condition number of B is defined as
As condition number increases, upper bound on change in solution relative to change in data also increases
Even if the modeler doesn‘t change the data, finite precision computers can introduce small changes
– Machine precision for 64 bit double = 1e-16• Most numbers (e.g. 1/3, 5/12) can‘t be represented exactly
– Ill conditioned linear systems can yield much different results on different machines
• Different floating point representations due to different chips• Same floating point representation, but compilers use it differently
1 BBB
bbBxx )(
11
IBM Software Group
© 2009 IBM Corporation
A square linear system is ill conditioned if the condition number is large, well conditioned if the condition number is small
– Higham, Accuracy and Stability of Numeric Algorithms
– Duff, Erisman and Reid, Direct Methods for Sparse Matrices
– Gill, Murray and Wright, Practical Optimization
– Golub and Van Loan, Matrix Computations
– The usual suspects on the web say the same thing as well
What constitutes a large or small value?
– 1000? 10000000? 10000000000000?
And what should we do for algorithms that solve a series of interrelated square linear systems?
– Linear, Mixed Integer, Quadratic and Nonlinear Programming algorithms all do this
Assessment of Ill Conditioning
12
IBM Software Group
© 2009 IBM Corporation
What constitutes a large or small value?
– Depends on machine, data and algorithm precision ( ), algorithm tolerances (t)
– Ill conditioning can occur when round-off error associated with precision is large enough to influence algorithm decisions
– Classify based on threshold defined by t /• Four distinct categories
– Example: CPLEX has default algorithmic tolerances of 1e-6, runs double precision arithmetic on machines with precision of ~1e-16• t / = 1e-6/1e-16 = 1e+10 is a key threshold
Assessment of Ill Conditioning
bbBxx )(
≥ t ?
13
IBM Software Group
© 2009 IBM Corporation
Assessment of Ill Conditioning
Condition number is a bound for the increase of the error:
Basic epsilons:
– Machine precision (double): 1e-16
– Default feasibility and optimality tolerance: 1e-6
Classification of condition numbers for LP bases:
– Stable: 0 ≤ (B) < 1e+7
– Suspicious: 1e+7 ≤ (B) < 1e+10
– Unstable: 1e+10 ≤ (B) < 1e+14
– Ill-posed: 1e+14 ≤ (B)
bbBxx )(
14
IBM Software Group
© 2009 IBM Corporation
Assessment of Ill Conditioning
CPLEX provides condition numbers for the basis associated with the current solution when solving LPs and QPs
What about MIP?
Previously, CPLEX provided Kappa for the associated fixed LP:
– Some indication of condition of primal solution• But optimal basis for fixed LP may not match that of node LP that yielded
the associated integer solution
– Optimality proof of MIP is based on pruning during tree search and thus not available with final solution
How reliable is it?
– Need to monitor condition number of all optimal bases used during Branch-and-Cut search
– Performance impact: computing precisely is equivalent to ~m/4 simplex iterations (m = number of constraints in the model)
• Can be mitigated by sampling, estimating kappa
1B
15
IBM Software Group
© 2009 IBM Corporation
MIP Kappa feature, available starting with CPLEX 12.2
– Sample from the series of condition numbers• New parameter CPX_PARAM_MIPKAPPA with settings:
-1: off 0: auto (defaults to off) 1: sample 2: use every optimal basis
– Classification thresholds provide percentages of each category
– Provide an assessment for users unfamiliar with ill conditioning
If enabled, categorize condition numbers of optimal bases• Stable• Suspicious• Unstable• Ill-posed
Assessment of Ill Conditioning
16
IBM Software Group
© 2009 IBM Corporation
MIP Kappa sample output :
Branch-and-cut subproblem optimization:Max condition number: 3.5490e+16Percentage of stable bases: 0.0%Percentage of suspicious bases: 86.9%Percentage of unstable bases: 13.0%Percentage of ill-posed bases: 0.1%Attention level: 0.048893CPLEX encountered numerical difficulties while solving this model.
Attention level
– =0 if only stable bases encountered
– >0 if at least one basis encountered that is not stable
– Max value is 1 (all bases ill-posed)
– Can view as the probability of numerical difficulties given the sample of condition numbers
Assessment of Ill Conditioning
17
IBM Software Group
© 2009 IBM Corporation
Branch-and-cut subproblem optimization:Max condition number: 3.5490e+16Percentage of stable bases: 0.0%Percentage of suspicious bases: 86.9%Percentage of unstable bases: 13.0%Percentage of ill-posed bases: 0.1%Attention level: 0.048893CPLEX encountered numerical difficulties while solving this model.
Ideal output:
– Attention level 0 (no suspicious, unstable or ill posed bases)
Good output:
– No ill-posed or unstable bases
Bad output:
– Any ill posed bases
– More than 5% unstable bases
Assessment of Ill Conditioning
18
IBM Software Group
© 2009 IBM Corporation
Alternate Interpretations of an Ill Conditioned Basis
Condition number of Simplex Solutions
– Simplex solution is intersection point of n hyperplanes
19
IBM Software Group
© 2009 IBM Corporation
Condition number of Simplex Solutions
– Simplex solution is intersection point of n hyperplanesb = change in hyperplane (input); x = change in solution
(output)
b
x
= x / b 1: well conditioned!
Alternate Interpretations of an Ill Conditioned Basis
20
IBM Software Group
© 2009 IBM Corporation
Condition number of Simplex Solutions
– Simplex solution is intersection point of n hyperplanes
b
x
= x / b >>1: ill conditioned!
Alternate Interpretations of an Ill Conditioned Basis
21
IBM Software Group
© 2009 IBM Corporation
Condition number of Simplex Solutions
– Simplex solution is intersection point of n hyperplanes
stable Solution
ill-conditioned Solution
Alternate Interpretations of an Ill Conditioned Basis
22
IBM Software Group
© 2009 IBM Corporation
Alternate Interpretations of an Ill Conditioned Basis
Distance to singularity of a matrix is the reciprocal of its condition number (Gastinel, Kahan).
– Implies that linear combinations of rows or columns of B that are close to 0 imply ill conditioning:
)(/1)(dist
singular :||||
min:)(dist
pp
p
pp }{
BB
BBB
BB
B
dconditione ill hence and singular, toclose is
,|||| ,|||| , if
B
vvBT
23
IBM Software Group
© 2009 IBM Corporation
Now that we can better assess the meaning of the basis condition numbers, what can we do about it?
– Ill conditioning can occur under perfect arithmetic.• For some models, even perfect data, algorithm and machine
precision may not address the problem. Consider adjustments to existing formulation, or alternate
formulations that provide the solution to the ill conditioned model.
– But, in most cases, finite precision can perturb the exact system of equations we wish to solve, resulting in significant changes to the computed solution.• Calculate data, formulate model and configure algorithm to keep such
perturbations as small as possible
– Condition number provides a worst case bound on the effect• CPLEX provides good quality solutions on majority of models
containing some basis condition numbers in [1e+10, 1e+14]
Implications of Ill Conditioning
24
IBM Software Group
© 2009 IBM Corporation
Can’t avoid perturbations due to machine precision
– But, increasing tolerances when condition number is high can prevent algorithmic decisions based on round-off error associated with machine precision (increase t).
– More precise input values are better (reduce )• Always calculate and input model data in double precision• Machine precision for 32 bit floats ~1e-8
Condition numbers > 1e+2 could result in algorithmic decisions based on machine precision based round-off error If you really need to use single precision in the model data, increase CPLEX’s feasibility and optimality tolerances above the defaults of 1e-6
Implications for the Practitioner (Data Input)
≥ t ?
bbBxx )(
25
IBM Software Group
© 2009 IBM Corporation
Minimize perturbations involving other factors
– Model data values • Don’t divide big numbers by small numbers in data calculations
Increases round-off error
• Make sure all procedures that calculate the data are implemented in a numerically stable manner• Less round-off error if all data values of similar order of magnitude
Mix of large and small numbers results in more shifting of the exponents, loss of precision in the mantissa.– Use CPLEX’s aggressive scaling if unavoidable
However, similar orders of magnitude in the coefficients does not guarantee absence of ill conditioning
Implications for the Practitioner (Data Calculation)
bbBxx )(
≥ t ?
26
IBM Software Group
© 2009 IBM Corporation
Minimize perturbations involving other factors (continued)
– Algorithmic strategies• Numerically stable implementations of all steps
Simplex method ratio test must avoid dividing big numbers by small ones
• Especially important for the basis factorization in the simplex method Set CPLEX’s Markowitz tolerance to .90 with ill conditioned models Consider turning on CPLEX’s numerical emphasis parameter
Implications for the Practitioner (Algorithm)
bbBxx )(
≥ t ?
27
IBM Software Group
© 2009 IBM Corporation
Implications for the Practitioner (Formulation)
Avoid nearly linear dependent rows or columns
– Such linear combinations of rows and columns often arise from round-off error in the data
dconditione ill hence and singular, toclose is
,|||| ,|||| , if
B
vvBT
28
IBM Software Group
© 2009 IBM Corporation
Examples
Model data values
– Avoid rounding if you can, or round as precisely as possible
– Matrices can be ill conditioned despite small spread of coefficients
– Exact formulation: Maximize x1 + x2 c1: 1/3 x1 + 2/3 x2 = 1 c2: x1 + 2 x2 = 3
– Imprecisely rounded, single [double] precision Maximize x1 + x2 c1: .33333333 x1 + .66666667 x2 = 1[ c1: .33333333333333333 x1 + .6666666666666667 x2 = 1] c2: x1 + 2x2 = 3
– Scale to integral value whenever possible: Maximize x1 + x2 c1: x1 + 2 x2 = 3 c2: x1 + 2 x2 = 3
)(/1)(dist
singular :||||
min:)(dist
pp
p
pp }{
BB
BBB
BB
B
(results in near singular matrix)(better)
(best)
29
IBM Software Group
© 2009 IBM Corporation
Examples
) 10 ~(error )b (b/ a - a/b ) a/(b ~ a/b
)10 ~(error /a b/a )/a (b ~ b/a
)10 1/30000, b 3, (a b/a and a/b compare b, aFor
0
8-
-8
Don’t divide big numbers by small numbers in data calculations
) 10 ~(error )b (b/ a - a/b ) a/(b ~ a/b
)10 ~(error /a b/a )/a (b ~ b/a
)10 1/30000, b 3, (a
8-
16-
-16
30
IBM Software Group
© 2009 IBM Corporation
Examples Why doesn’t CPLEX set the Markowitz tolerance to .90 by default?
– The simplex method performs Gaussian elimination • Stability test to avoid dividing big numbers by small ones
– Markowitz tolerance defines acceptable pivot relative to maximum element in a matrix column
– Trade off between accuracy and sparsity of factorization• Most models are not ill conditioned; avoid the loss of sparsity in the factorization by using a smaller Markowitz tolerance (.01 by default)
a11 a12 a13
a21 0 a23
a31 0 0
a11 = 10000, a21 = 101,
a31 = 99, Markowitz = .01
Threshold = 10000*.01 = 100
31
IBM Software Group
© 2009 IBM Corporation
Examples
integer
10 ,0
) with constraint(only 0
subject to
)0 ,( Minimize
i
ii
iii
TT
z
zx
zMzx
bAx
fczfxc
“integer feasible” solution within integrality tolerance that violates intent of the model (trickle flow)
Consider alternate formulations to improve numerics
– Fixed costs on continuous variables using big-Ms:
7
810
1 unless branchingfor eligiblenot
1/1 ,100
eMz
eMxzeMx
i
iii
MxzzMxMzx iiiiii //
• LP relaxation solution
• CPLEX default integrality tolerance: 1e-5
(Mixture of large and small numbers)
32
IBM Software Group
© 2009 IBM Corporation
Examples
To get correct answers with big-M formulation– Use smallest possible value of big-M that doesn’t violate intent of model
• Bound strengthening in CPLEX presolve often does this automatically
– Set integrality tolerance to 0
– Set simplex tolerances to minimum values, 1e-9
– Ask for more accuracy on a potentially ill-conditioned system• Turn on numerical emphasis parameter
Many users are unfamiliar with issues– Frequent source of CPLEX customer calls– One of most popular CPLEX FAQs
– But should they have to be?
Issues
33
IBM Software Group
© 2009 IBM Corporation
Examples
integer
10 ,0
00
subject to
)0 ,( Minimize
i
ii
ii
TT
z
zx
xz
bAx
fczfxc
(integer feasible solutions aligned with intent of the model)branching requires i constraintindicator
0 ,100 ii zx
Indicator constraint formulation for fixed costs on continuous variables
– LP relaxation solution
(CPLEX branches on these directly)
34
IBM Software Group
© 2009 IBM Corporation
Examples Which approach to use?
– Indicator formulation more precise representation of model• Indicator and big-M formulation equivalent when M=∞
– If we can use modest values for big-M, indicator formulation tends to be weaker
– Use indicator constraints, let CPLEX decide whether to replace with big-Ms if preprocessing can deduce big-M values of modest size• Presolve tightens the indicator formulation (improved further in
CPLEX 12.2.0) Presolve on indicators (improved) Node presolve on indicators Probing on by default Probing on indicator constraints Re-presolve by default
35
IBM Software Group
© 2009 IBM Corporation
Solution Assessment Solution quality (available in all CPLEX APIs)Max. unscaled (scaled) bound infeas. = 5.39174e-07 (5.39174e-07)
(Reduce feasibility tolerance, continue optimizing to decrease bound infeasibilities)
Max. unscaled (scaled) reduced-cost infeas. = 8.61859e-09 (8.61859e-09)
(Reduce optimality tolerance, continue optimizing to decrease reduced cost infeasibilities)
Max. unscaled (scaled) Ax-b resid. = 6.29803e-11 (3.14901e-11)
(If exceeds feasibility tolerance, CPLEX feasibility decisions based on round-off error)
Max. unscaled (scaled) c-B'pi resid. = 5.83291e-14 (5.83291e-14)
(If exceeds optimality tolerance, CPLEX optimality decisions based on round-off error)
Max. unscaled (scaled) |x| = 4084.58 (3892.32)
Max. unscaled (scaled) |slack| = 2562.23 (2562.23)
Max. unscaled (scaled) |pi| = 202.512 (202.512)
Max. unscaled (scaled) |red-cost| = 202.233 (202.233)
Condition number of scaled basis = 1.4e+06
(Use to assess sensitivity of solution to perturbations in the model data)
36
IBM Software Group
© 2009 IBM Corporation
Summary and Conclusion
Ill conditioning can occur under perfect arithmetic
– But finite precision introduces perturbations that ill conditioning may magnify• Can’t avoid machine precision
– Take steps to reduce unnecessary perturbations• Use highest precision possible in data calculations
• Either avoid mixtures of large and small coefficients in the problem data, or consider tuning the optimizer to deal with them.
• Optimization algorithm must be numerically stable
• Consider using CPLEX’s numerical emphasis parameter
– Take steps to reduce the ill conditioning in the model• Avoid near linear dependencies in the constraint matrix rows and
columns
• Consider indicator constraints instead of big M formulation.
37
IBM Software Group
© 2009 IBM Corporation
Summary and Conclusion
Assessing high condition numbers
– Increasing algorithm tolerances can mitigate effects of large condition numbers• But improvements to formulation, accuracy of input are more robust
– 1e+10, 1e+14 are important thresholds on common machines
– CPLEX’s MIP Kappa feature enables assessment of ill conditioning in MIPs
CPLEX solves most models with some or all of these problems just fine
– Familiarity with these remedies can save the practitioner precious time when they do encounter a badly ill conditioned model.
38
IBM Software Group
© 2009 IBM Corporation
Discussion
What other features in CPLEX Optimization Studio would help you bridge the gap between the mathematical model and the practical application?
– How useful would a minimal subset of an ill conditioned model that remains ill conditioned be?
Let us know ([email protected]).
39
IBM Software Group
© 2009 IBM Corporation
References/Further Reading
Higham, Accuracy and Stability of Numeric Algorithms
Duff, Erisman and Reid, Direct Methods for Sparse Matrices
Gill, Murray and Wright, Practical Optimization
Golub and Van Loan, Matrix Computations
http://www-01.ibm.com/support/docview.wss?uid=swg21399993
(CPLEX technote on ill conditioning)
40
IBM Software Group
© 2009 IBM Corporation
QUESTIONS & ANSWERSAn inspiring Smarter Planet quote???
“A measure of the success of CPLEX in academia is that at any conference where papers are presented, 95 percent of the papers that mention a solver mention CPLEX.” Source: INFORMS
41
IBM Software Group
© 2009 IBM Corporation
Merci
Grazie
Gracias
Obrigado
Danke
Japanese
French
Russian
German
Italian
Spanish
Brazilian Portuguese Arabic
Traditional Chinese
Simplified Chinese
Thai