Experts in numerical algorithms and HPC services
What's New in Mathematical Optimisation from NAG
Jan Fiala, Benjamin Marteau
Nonlinear programming: active set versus interior point methods
  Overview
  Sequential quadratic programming
  Interior point methods
  Illustration on a few examples
Mixed integer nonlinear optimisation
Semidefinite programming
  Sample applications in finance
Coming next
  Large-scale linear programming
  Derivative-free solver for calibration
Working with customers
Nonlinear optimisation
Problems of the form:
  min_{x ∈ R^n} f(x)
  subject to h_k(x) = 0, k = 1,...,m_e
             g_k(x) ≤ 0, k = 1,...,m_i
Two different approaches:
  Sequential quadratic programming: an active set method, based on Gill et al., Stanford University
  Interior point method: based on Wächter and Biegler, Carnegie Mellon University
Formalisation of the problem
Karush-Kuhn-Tucker (KKT) optimality conditions:
Stationarity condition
  ∇f(x) + ∑_{k=1}^{m_e} λ_k ∇h_k(x) + ∑_{k=1}^{m_i} µ_k ∇g_k(x) = 0
Primal feasibility condition
  h(x) = 0
  g(x) ≤ 0
Dual feasibility condition
  µ_k ≥ 0 for all k = 1,...,m_i
Complementarity condition
  µ_k g_k(x) = 0 for all k = 1,...,m_i
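The KKT conditions are easy to check numerically. A minimal sketch on a made-up toy problem, min x1² + x2² subject to x1 + x2 = 1, whose minimiser and multiplier are known in closed form (there are no inequality constraints, so dual feasibility and complementarity hold trivially):

```python
import numpy as np

# Toy problem (illustrative): min f(x) = x1^2 + x2^2  s.t.  h(x) = x1 + x2 - 1 = 0
grad_f = lambda x: 2 * x                   # gradient of the objective
grad_h = lambda x: np.array([1.0, 1.0])    # gradient of the equality constraint
h = lambda x: x[0] + x[1] - 1.0

x_star = np.array([0.5, 0.5])              # known minimiser
lam = -1.0                                 # multiplier solving the stationarity condition

# Stationarity: grad f(x*) + lam * grad h(x*) should vanish
stationarity = grad_f(x_star) + lam * grad_h(x_star)
print(stationarity, h(x_star))
```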
Two approaches to tackle these equations
The complementarity condition is problematic due to its combinatorial nature.
Two distinct strategies:
  An SQP solver guesses which constraints are binding
  An IPM perturbs the equation
Sequential quadratic programming
Definition
An inequality constraint k is said to be active at x if it is binding (g_k(x) = 0).
SQP methods iteratively build the set of active constraints by solving quadratic programs:
Initialisation: choose a first estimate of the solution x_0, build a quadratic model of the objective around x_0, and take a first guess at the set of active constraints
Iteration k:
  Solve the quadratic program, warm-started by the active set estimate
  Update x_{k+1} and the set of active constraints
  Build a new quadratic model around x_{k+1}
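The active-set behaviour can be seen with any SQP-type solver; a sketch using SciPy's SLSQP on a small convex problem (both the problem and the solver choice are illustrative stand-ins, not NAG's e04vh):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative problem: min (x1-1)^2 + (x2-2)^2
# s.t. x1 + x2 <= 2, x1 >= 0, x2 >= 0.
# At the solution the constraint x1 + x2 <= 2 is active (binding).
res = minimize(
    lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2,
    np.array([0.0, 0.0]),
    method="SLSQP",  # an SQP-type solver
    constraints=[{"type": "ineq", "fun": lambda x: 2.0 - x[0] - x[1]}],
    bounds=[(0.0, None), (0.0, None)],
)
print(res.x)  # the minimiser lies on the constraint boundary x1 + x2 = 2
```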
A few characteristics of SQP methods
  Perform lots of inexpensive iterations
  Work on the null space of the constraints
  The more active constraints there are, the cheaper the iterations are
As a consequence, SQP methods scale very well to large NLP problems with a high number of constraints.
Interior point methods
If one tries to solve the KKT system directly, the complementarity condition turns out to be problematic. An IPM iteration can therefore:
  Relax the complementarity condition (µ_k g_k(x) = ν with ν > 0)
  Perform one Newton iteration towards the solution of the relaxed KKT system
  Update the current solution estimate and the relaxation parameter ν
Interior point methods aim at finding a sequence of points converging to the solution that satisfy the constraints strictly.
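The relax-and-follow idea can be sketched on a one-dimensional toy problem (illustrative only, not NAG's e04st): minimise f(x) = x subject to x ≥ 1 via the log-barrier x − ν log(x − 1), driving ν → 0. The barrier's stationarity condition 1 − ν/(x − 1) = 0 is exactly the relaxed KKT system for this problem.

```python
# Sketch of an interior point iteration (illustrative): min x  s.t.  x >= 1.
x, nu = 2.0, 1.0            # strictly feasible start, initial relaxation
for _ in range(30):
    # one Newton step on the relaxed stationarity condition 1 - nu/(x-1) = 0
    grad = 1.0 - nu / (x - 1.0)
    hess = nu / (x - 1.0) ** 2
    step = grad / hess
    while x - step <= 1.0:  # damp the step to stay strictly feasible
        step *= 0.5
    x -= step
    nu *= 0.5               # shrink the relaxation parameter
print(x)                    # approaches the constrained minimiser x* = 1
```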
A few characteristics of Interior Point methods
  Perform a few expensive iterations
  In the absence of constraints, behave as a Newton method
As a consequence, Interior Point methods scale very well to large NLP problems with a small number of constraints.
Illustration on a few highly constrained problems
Problems were selected from the CUTEr test set.

Name        Vars    Constrs   e04vh (SQP), time (s)   e04st (IPM), time (s)
MINC44      1113     1033                      0.28                    7.60
READING8    2002     1000                      9.78                  251.12
NCVXQP6    10000     7500                      3.60                  613.38
MADSSCHJ     201      398                      0.34                    5.51
Illustration on a few weakly constrained problems
Problems were selected from the CUTEr test set.

Name        Vars    Constrs   e04vh (SQP), time (s)   e04st (IPM), time (s)
JIMACK      3549        0                   542.42                    8.12
OSORIO     10201      202                   303.00                    0.78
TABLE8      1271       72                     3.80                    0.04
OBSTCLBL   10000        1                    40.84                    0.50

The number of constraints is not the only factor...
Other characteristics

IPM (e04st) advantages:
  • Efficient on unconstrained or loosely constrained problems
  • Can exploit 2nd derivatives
  • Efficient also for quadratic problems
  • Better use of multi-core architectures
  • New and simpler interface
  • Infeasibility detection

SQP (e04vh) advantages:
  • Efficient on highly constrained problems
  • Can capitalize on a good initial point
  • Stays feasible with respect to the linear constraints throughout the optimization
  • Usually better results on pathological problems
  • Usually requires fewer function evaluations
  • Allows warm starting
Mixed integer nonlinear optimisation
Problems of the form:
  min_{x ∈ R^n, y ∈ Z^m} f(x, y)
  subject to l ≤ c(x, y) ≤ u
x: continuous variables
y: integer variables
SQP with branch-and-cut techniques
Ordinal variables
Does not require the model to be evaluated at fractional values of the integer variables
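The branching idea behind such solvers can be sketched on a one-variable toy problem (a plain branch-and-bound sketch, not h02da's branch-and-cut): solve the continuous relaxation, and if its solution is fractional, split the domain at the fractional value.

```python
import math

# Illustrative: min (y - 2.6)^2 over integer y in [-10, 10].
def solve_relaxed(lo, hi):
    """Continuous relaxation: clip the unconstrained minimiser to [lo, hi]."""
    y = min(max(2.6, lo), hi)
    return y, (y - 2.6) ** 2

def branch_and_bound(lo=-10, hi=10):
    best_y, best_val = None, math.inf
    stack = [(lo, hi)]
    while stack:
        lo, hi = stack.pop()
        if lo > hi:
            continue
        y, val = solve_relaxed(lo, hi)
        if val >= best_val:
            continue                             # prune: bound no better than incumbent
        if abs(y - round(y)) < 1e-9:
            best_y, best_val = round(y), val     # integral solution found
        else:                                    # branch on the fractional value
            stack += [(lo, math.floor(y)), (math.ceil(y), hi)]
    return best_y

print(branch_and_bound())   # -> 3, the integer closest to 2.6
```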
Some characteristics
It might be necessary to use integer variables in an optimization model, for example:
  Cardinality constraints
  Decision logic between variables (e.g. constraints only present if a certain variable is nonzero)
  Variables that can only take values inside a predefined set
  ...
Included in the NAG Library at Mark 25 as h02da. Based on Schittkowski et al., University of Bayreuth.
Semidefinite Programming (SDP)
Linear Programming (LP):
  well-known, well-researched
  convex (local → global)
  strong theoretical properties
  but only linear
Extensions:
  NLP: but some nice properties are lost (e.g., convexity, duality theory)
  SDP: retain the theory, change the geometry
    add a matrix inequality: a symmetric matrix is positive semidefinite (all eigenvalues are nonnegative)
    highly nonlinear
    notation: A(x) ⪰ 0
Semidefinite Programming (SDP) formulation
LP → SDP → BMI-SDP
  min_{x ∈ R^n} c^T x
  subject to l_B ≤ Bx ≤ u_B
             l_x ≤ x ≤ u_x
Semidefinite Programming (SDP) formulation
LP → SDP → BMI-SDP
  min_{x ∈ R^n} c^T x
  subject to l_B ≤ Bx ≤ u_B
             l_x ≤ x ≤ u_x
             A(x) = A_0 + ∑_{i=1}^n x_i A_i ⪰ 0
A_i: given symmetric matrices
A(x) is linear in x; LMI = linear matrix inequality
with a special choice of the A_i, A(x) can be a matrix variable X
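Whether an LMI A(x) ⪰ 0 holds at a given x follows directly from the eigenvalue characterisation. A small sketch (the matrices A_0, A_1, A_2 are made up for illustration):

```python
import numpy as np

# Illustrative LMI: A(x) = A0 + x1*A1 + x2*A2, PSD iff all eigenvalues >= 0.
A0 = np.eye(2)
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[1.0, 0.0], [0.0, -1.0]])

def lmi_holds(x):
    A = A0 + x[0] * A1 + x[1] * A2
    return bool(np.linalg.eigvalsh(A).min() >= 0.0)

print(lmi_holds([0.5, 0.0]), lmi_holds([2.0, 0.0]))   # True False
```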
Semidefinite Programming (SDP) formulation
LP → SDP → BMI-SDP
  min_{x ∈ R^n} c^T x + (1/2) x^T H x
  subject to l_B ≤ Bx ≤ u_B
             l_x ≤ x ≤ u_x
             A(x) = A_0 + ∑_{i=1}^n x_i A_i + ∑_{i,j=1}^n x_i x_j Q_ij ⪰ 0
further (quadratic) extension
BMI = bilinear matrix inequality
unique to NAG, included at Mark 26 as e04sv
in collaboration with Kočvara et al., University of Birmingham
Semidefinite Programming (SDP) Applications?
SDP = a special tool; it's there when you need it!
  very powerful concept
  matrix constraints might not appear naturally ⇒ reformulations, relaxations
  structural optimization, chemical engineering, combinatorial optimization, statistics, control and system theory, polynomial optimization, ...
The aim here is to spark interest. Warning: I am not a quant!
SDP Applications in Finance
Positive semidefinite requirement appears directly:
  construction of a correlation/covariance matrix
  nearest correlation matrix (with constraints)
  robust (worst-case) portfolio optimization
  calibration of the volatility structure for Libor market swaptions
Eigenvalue optimization (min/max eigenvalue or singular value, matrix condition number, nuclear norm as a heuristic for rank minimization, ...):
  risk management: limit the Γ of your portfolio
Relaxations:
  many relaxations of (NP-hard) combinatorial problems
  Asian option pricing bounds(?)
Reformulations: polynomial nonnegativity ↔ matrix inequality
  (e.g., interpolation by nonnegative splines)
  Lyapunov stability of ODEs... in finance?
Nearest Correlation Matrix (with Constraints)
  min_X ∑_{i,j=1}^n (X_ij − H_ij)^2
  subject to X_ii = 1, i = 1,...,n
             X ⪰ 0
correlation matrix = symmetric positive semidefinite matrix with unit diagonal
H: approximate correlation matrix
X: new (true) correlation matrix, closest to H in the Frobenius norm
Do not use SDP for the vanilla NCM problem due to the algorithmic complexity; the special solvers in Chapter G02 are preferable.
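The flavour of the problem can be seen with simple alternating projections (a sketch only: without the Dykstra-style correction used by Higham-type methods it returns a valid correlation matrix near H rather than the provably nearest one; the G02 solvers are the proper tool; the matrix H below is invented):

```python
import numpy as np

def corr_near(H, iters=200):
    """Alternate projections onto the PSD cone and the unit-diagonal set."""
    X = H.copy()
    for _ in range(iters):
        w, V = np.linalg.eigh((X + X.T) / 2.0)
        X = (V * np.maximum(w, 0.0)) @ V.T   # project onto the PSD cone
        np.fill_diagonal(X, 1.0)             # restore the unit diagonal
    return X

# An "approximate correlation matrix" H that is slightly indefinite:
H = np.array([[1.0, 0.9, 0.7],
              [0.9, 1.0, 0.3],
              [0.7, 0.3, 1.0]])
X = corr_near(H)
print(np.linalg.eigvalsh(X).min(), np.abs(X - H).max())
```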
Nearest Correlation Matrix (with Constraints)
  min_X ∑_{i,j=1}^n (X_ij − H_ij)^2
  subject to X_ii = 1, i = 1,...,n
             X ⪰ 0
Possible new constraints:
  fix elements: X_ij = H_ij for some i, j
  element-wise bounds: l_ij ≤ X_ij ≤ u_ij
  smallest eigenvalue constraint: X ⪰ λ_min I, where λ_min is given
  limit the condition number: λ_max I ⪰ X ⪰ λ_min I, λ_max ≤ κ λ_min, where κ is given and λ_min, λ_max are new variables
Nearest Correlation Matrix (with Constraints)
  min_X ∑_{i,j=1}^n (X_ij − H_ij)^2
  subject to X_ii = 1, i = 1,...,n
             X ⪰ 0
Possible different objectives:
  weight elements: ∑ W_ij (X_ij − H_ij)^2
  consider the portfolio VaR_α: −λ Z_α^2 w^T D X D w + ∑ (X_ij − H_ij)^2
  D: deviations (d_ii = σ_i), w: asset allocation, λ: weighting factor
Full control over the formulation!
Robust Portfolio Optimization
mean-variance analysis is often very sensitive to the data
are the nominal µ̂ (expected returns) and Σ̂ (covariance) correct?
robust EF = limit the sensitivity of the results by incorporating an uncertainty model on the parameters
choose the solution in the worst-case scenario (see Boyd '07)
  min (µ − r1 + λ)^T Σ^{-1} (µ − r1 + λ)
  subject to Fµ ≥ 0
             |µ_i − µ̂_i| ≤ α_1 |µ̂_i|, i = 1,...,n
             |1^T µ − 1^T µ̂| ≤ α_2 |1^T µ̂|
             |Σ_ij − Σ̂_ij| ≤ β_1 |Σ̂_ij|, i,j = 1,...,n
             ‖Σ − Σ̂‖_F ≤ β_2 ‖Σ̂‖_F
             Σ ⪰ 0
             λ ≥ 0
Calibration of the volatility structure
How to extract correlation information from market option prices?
Assume a LIBOR market model with covariance structure X and swap weights Ω = w w^T.
Under some assumptions, swaption prices are given by the Black-Scholes formula with volatility parameter σ = Tr(ΩX).
Task: calibrate X to observed swaption market prices:
  find X
  subject to Tr(ΩX) = σ
             X ⪰ 0
where the σ are the observed swaption implied vols
Calibration of the volatility structure (cont.)
The correlation X in the previous feasibility problem is not unique, so one can choose an objective:
  min or max the price of some other option: min/max Tr(ΩX)
  norm of X: min ‖X‖
  smoothness: min ‖∆X‖
  robustness via the Bid/Ask spread: max t s.t. σ_Bid + t ≤ Tr(ΩX) ≤ σ_Ask − t
  rank of X as a heuristic via the nuclear norm of X
Risk management: how to construct a positive-Γ portfolio?
Assume an existing portfolio Π of derivatives/exotics on underlyings S_i: Π = F(S_1, ..., S_n).
Π must be risk managed; the usual Delta hedging imposes ∂Π/∂S = 0.
But Delta hedging only works for very small movements in the underlyings; for larger moves one would like to keep a positive (or small) Γ, as
  dΠ = (∂Π/∂S)^T dS + (1/2) dS^T (∂²Π/∂S²) dS + ...
To construct positive Γ, buy x_i units of a vanilla option p_i on S_i and y_i of the underlying S_i:
  min_{x,y} ∑ x_i p_i(S_i) + y_i S_i
  subject to ∂²F/∂S² + diag(x_i ∂²p_i/∂S_i²) ⪰ 0
             ∂F/∂S_i + x_i ∂p_i/∂S_i + y_i = 0, i = 1,...,n
Coming next: a new LP solver
NAG = the Amazon of optimization
(a one-stop shop for all you need in optimization)
Constant evolution of the library:
  based on our roadmap
  customers' requests
  latest research & collaborations
  ... ongoing hard work
New LP solver:
  a new solver for large-scale LP problems
  based on an interior point method (IPM)
  filling a gap in the library
  significant speed-up
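The shape of the problem class the new solver targets, on a toy instance (sketched here with SciPy's linprog, which is unrelated to the NAG routine):

```python
from scipy.optimize import linprog

# Illustrative LP: min -x1 - 2*x2  s.t.  x1 + x2 <= 4, x1 <= 2, x >= 0.
res = linprog(c=[-1.0, -2.0],
              A_ub=[[1.0, 1.0], [1.0, 0.0]],
              b_ub=[4.0, 2.0],
              method="highs")   # default bounds are x >= 0
print(res.x, res.fun)           # optimum at (0, 4) with objective -8
```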
Coming next: DFO for calibration
Standard data-fitting (calibration) problem: given observed data [t_i, y_i] and a model f(·; x) depending on model parameters x.
Task: find x to fit the data as closely as possible, typically in the least-squares sense:
  min_x ∑ (y_i − f(t_i; x))^2
Additional requirements:
  small number of parameters (< 100)
  black-box model, no derivatives available
  possibly expensive and/or inaccurate function evaluations
  typically a reasonable starting point; a small improvement is sufficient
  ⇒ finite differences shouldn't be used!
New derivative-free optimization (DFO) solver exploiting the problem structure (the only one of its kind!)
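The setup can be illustrated with a generic derivative-free method on a synthetic calibration (SciPy's Nelder-Mead stands in for the solver described above, and the exponential model and data are invented for the example):

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data from the model f(t; x) = x0 * exp(x1 * t) with x = (2, -1.5).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def sumsq(x):
    """Black-box least-squares objective: no derivatives used."""
    return np.sum((y - x[0] * np.exp(x[1] * t)) ** 2)

res = minimize(sumsq, x0=[1.0, -1.0], method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-12})
print(res.x)   # recovers approximately (2, -1.5)
```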
Working with customers
Sometimes an out-of-the-box solution is not sufficient!
  Is it possible to speed up the solver?
  Does the model fit the solver?
  Can a special problem structure be exploited?
NAG Mathematical Optimization Consultancy is ready to help:
  choice and tuning of the solver
  adjustments to the model
  bespoke solver development
Examples of optimisation projects
Energy & Commodities Trading Co.: The client's model was demonstrating unusual behaviour: a significant memory footprint and slow convergence. Analysis of the model showed that a more suitable, equivalent reformulation was available. When the model was adjusted, the solver performed as expected.
Financial Services Software Vendor: An extended site visit allowed us to discuss the client's problem in detail and helped to identify and fix a weak point that was causing convergence issues.
Financial Brokerage Co.: The client wanted a class of problems to be solved within a prescribed time limit. After an initial assessment of the problem, a possible solution was identified using recent research from Stanford University. A bespoke solution was delivered during a short consulting engagement. The new solver drastically improved the performance, so that even bigger problems could be considered by the client.