Directions in Optimization/Nonlinear Programming Stats Dept Retreat, Oct 27, 2012 Mihai Anitescu
Optimization interests
• Linear, Quadratic, Conic, Semi-infinite, Nonconvex – but primarily nonlinear programming:
\[ \min_x f(x) \quad \text{s.t. } g(x) \le 0 \ (\in K), \quad h(x) = 0 \]
• Variational inequalities, nonlinear complementarity, differential variational inequalities:
\[ (\tilde{x} - x)^T F(x) \ge 0, \ \forall \tilde{x} \in K; \qquad x \in K \perp F(x) \in K^* \]
• Stochastic programming:
\[ f(x) = f\left(x_d, \{x(u)\}_u\right) = \int \phi\left(x_d, x(u)\right) du \]
1. DIFFERENTIAL VARIATIONAL INEQUALITIES
Mihai Anitescu, STAT 343, Autumn 12. Not for use outside UC
Differential variational inequalities -- DVI
• DVIs appear whenever both dynamics and inequalities/ switching appear in model description.
• Dense granular flow: granular matter is the second most-manipulated industrial material after water!
• Microstructure evolution (how does fatigue appear in steel?)
• Model predictive control: What is the optimal way to control a complex dynamical system such as the power grid?
DVI formulation
• Differential variational inequalities: Mixture of differential equations and variational inequalities.
• In the case of complementarity,
DVI Approaches
• Recall the DVI (for C = R+):
\[ \dot{x} = f(t, x(t), u(t)); \quad u \ge 0 \perp F(t, x(t), u(t)) \ge 0 \]
• Smoothing:
\[ \dot{x} = f(t, x(t), u(t)); \quad u_i F_i(t, x(t), u(t)) = \epsilon, \quad i = 1, 2, \ldots, n_u \]
• Followed by forward Euler. Easy to implement!! But stiff!
\[ u_i^{n} F_i(t^{n-1}, x^{n-1}, u^{n-1}) = \epsilon, \quad i = 1, 2, \ldots, n_u; \qquad x^{n+1} = x^n + h f(t^n, x^n, u^n) \]
• I specialize in time-stepping. Can be much faster IF you can solve the subproblem:
\[ x^{n+1} = x^n + h f(t^{n+1}, x^{n+1}, u^{n+1}); \qquad u^{n+1} \ge 0 \perp F(t^{n+1}, x^{n+1}, u^{n+1}) \ge 0 \]
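The contrast between smoothing and time-stepping can be sketched on a toy problem that is not from the slides: the scalar DVI x' = -1 + u with 0 ≤ u ⊥ x ≥ 0 and x(0) = 1, whose exact solution falls linearly to zero at t = 1 and then rests there with u = 1. The function names, step sizes and ε below are illustrative assumptions. For this toy problem the time-stepping subproblem is a scalar complementarity problem with a closed-form solution. The smoothed model replaces the complementarity condition by u x = ε.

```python
def smoothed_forward_euler(x0, h, eps, steps):
    """Forward Euler on the smoothed model x' = -1 + u with u * x = eps."""
    x = x0
    for _ in range(steps):
        u = eps / x              # smoothed complementarity: u * x = eps
        x = x + h * (-1.0 + u)   # explicit step; stable near contact only if h < 2 * eps
    return x


def time_stepping(x0, h, steps):
    """Implicit time-stepping: each step solves 0 <= u, x + h * (-1 + u) >= 0, complementary."""
    x, u = x0, 0.0
    for _ in range(steps):
        u = max(0.0, (h - x) / h)  # closed-form solution of the scalar complementarity problem
        x = x + h * (-1.0 + u)
    return x, u


# Time-stepping reaches the constraint with a large step, h = 0.1.
x_ts, u_ts = time_stepping(1.0, 0.1, 30)

# The smoothed model needs h well below eps and settles at x = eps instead of 0.
x_sm = smoothed_forward_euler(1.0, 1e-4, 1e-3, 30000)
```

In this sketch the smoothed iteration linearizes near contact to a map with multiplier 1 - h/ε, which is where the stiffness comes from: the step must shrink with ε, while the time-stepping scheme has no such restriction.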
Granular flow: PBNR
• 160’000 Uranium-Graphite spheres, 600’000 contacts on average
• Two million primal variables, six million dual variables.
• A new convex subproblem approach which is much faster.
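\[ M \frac{dv}{dt} = \sum_{j = 1, 2, \ldots, p} \left( c_n^{(j)} n^{(j)} + \beta_1^{(j)} t_1^{(j)} + \beta_2^{(j)} t_2^{(j)} \right) + f_c(q, v) + k(t, q, v) \]
\[ \frac{dq}{dt} = v \]
\[ c_n^{(j)} \ge 0 \perp \Phi^{(j)}(q) \ge 0, \quad j = 1, 2, \ldots, p \]
\[ \left( \beta_1^{(j)}, \beta_2^{(j)} \right) = \operatorname{argmin}_{\mu^{(j)} c_n^{(j)} \ge \sqrt{ \left( \beta_1^{(j)} \right)^2 + \left( \beta_2^{(j)} \right)^2 }} \left[ \left( v^T t_1^{(j)} \right) \beta_1 + \left( v^T t_2^{(j)} \right) \beta_2 \right], \quad j = 1, 2, \ldots, p \]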
Microstructure Evolution
Evolution of phases, e.g., defects in solids
Dynamic Optimization (DO) – Model Predictive Control
Applications:
• Data Assimilation (Estimation), Transmission Planning, Power State Forecasting, Generation Control, Buildings Optimization, Markets
[Diagram labels: DO, Discretization, Data, Large-Scale NLP]
• It is very expensive … but if I write its optimality conditions and look at them as a differential variational inequality, I can approximate the solution with a limited amount of work per step!
Outstanding Research Issues
• Many applications are “undermodeled” due to “fear of switches” in dynamics; lots to do in modeling: Ice Sheet Modeling, Microstructure Evolution, Hybrid Systems.
• What are proper splittings in time that have the best stability/accuracy properties?
• Convergence theory for PDE-based DVI.
• Efficient solvers for problems whose active set changes often? E.g., how do we adapt multigrid?
• How do we reuse information optimally from step to step (hot-starting)? For example, interior-point methods are notorious for not being able to do that.
• Etc.
2. DECISION UNDER UNCERTAINTY : STOCHASTIC OPTIMIZATION
A leading paradigm for optimization under uncertainty: stochastic programming.
• Two-stage stochastic programming with recourse (“here-and-now”)
Mihai Anitescu - Optimization under uncertainty
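\[ \min_{x_0} \left\{ f_0(x_0) + \mathbb{E}\left[ \min_{x} f(x, \omega) \right] \right\} \]
\[ \text{s.t. } g_0(x_0) = b_0, \ x_0 \ge 0 \ \text{(outer problem)}; \qquad \text{s.t. } h(x, \omega) = b(\omega) - g(x_0, \omega), \ x(\omega) \ge 0 \ \text{(inner problem)} \]
Second-stage random data \( \xi(\omega) \): continuous. After sampling: discrete, \( \xi_1, \xi_2, \ldots, \xi_S \):
\[ \min_{x_0, x_1, x_2, \ldots, x_S} f_0(x_0) + \frac{1}{S} \sum_{k=1}^{S} f_k(x_k) \]
\[ \text{subj. to } g_0(x_0) = b_0, \quad g_k(x_0) + h_k(x_k) = b_k, \quad x_0 \ge 0, \ x_k \ge 0, \quad k = 1, \ldots, S. \]
Inference analysis with M samples, by sample average approximation (SAA) or bootstrapping: \( x_0^N - x_0^* \sim \ ? \)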
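As a minimal sketch of sample average approximation, consider a toy two-stage problem that is not from the slides: a newsvendor who orders x0 at unit cost `cost` before demand ξ is known, then sells min(x0, ξ) at unit price `price`. The problem, the uniform demand distribution, and all names below are illustrative assumptions. The expectation over ξ is replaced by an average over sampled scenarios, and the first-stage decision is found by a simple grid search.

```python
import random


def saa_objective(x0, samples, cost, price):
    """First-stage cost plus the sample average of the second-stage (recourse) value."""
    recourse = sum(-price * min(x0, xi) for xi in samples) / len(samples)
    return cost * x0 + recourse


def solve_saa(samples, cost, price, candidates):
    """Pick the candidate first-stage decision with the smallest SAA objective."""
    return min(candidates, key=lambda x0: saa_objective(x0, samples, cost, price))


rng = random.Random(0)
samples = [rng.uniform(0.0, 100.0) for _ in range(4000)]  # sampled scenarios xi_1, ..., xi_S
candidates = [0.5 * i for i in range(201)]                # grid over [0, 100]

x0_saa = solve_saa(samples, cost=1.0, price=2.0, candidates=candidates)

# For uniform demand on [0, 100] the true optimum solves P(xi > x0) = cost / price.
x0_true = 100.0 * (1.0 - 1.0 / 2.0)
```

The gap between `x0_saa` and `x0_true` is exactly the quantity the inference question above asks about; rerunning with different seeds gives a rough empirical picture of its distribution.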
2.1 A PERHAPS UNEXPECTED APPLICATION. SCALABLE MAX LIKELIHOOD FOR GAUSSIAN PROCESSES
Maximum Likelihood Estimation (MLE)
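• A family of covariance functions parameterized by θ: φ(x; θ)
• Maximize the log-likelihood to estimate θ:
\[ \max_\theta L(\theta) = \log\left\{ (2\pi)^{-n/2} (\det K)^{-1/2} \exp\left( -y^T K^{-1} y / 2 \right) \right\} = -\frac{1}{2} y^T K^{-1} y - \frac{1}{2} \log(\det K) - \frac{n}{2} \log 2\pi \]
• First-order optimality (also known as the score equations):
\[ \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y - \frac{1}{2} \operatorname{tr}\left[ K^{-1} (\partial_j K) \right] = 0 \]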
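For completeness, the score equations follow from the log-likelihood by two standard matrix-calculus identities. The short derivation below assumes that \( \partial_j \) denotes the partial derivative with respect to \( \theta_j \) and that K depends smoothly on θ.

```latex
% Standard identities for a smooth, invertible K(\theta):
\partial_j \left( K^{-1} \right) = -K^{-1} (\partial_j K) K^{-1},
\qquad
\partial_j \log(\det K) = \operatorname{tr}\left[ K^{-1} (\partial_j K) \right].

% Differentiating L(\theta) = -\tfrac12 y^T K^{-1} y - \tfrac12 \log(\det K) - \tfrac{n}{2} \log 2\pi term by term:
\partial_j L(\theta)
  = \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y
  - \frac{1}{2} \operatorname{tr}\left[ K^{-1} (\partial_j K) \right].

% Setting \partial_j L(\theta) = 0 for every j gives the score equations.
```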
Maximum Likelihood Estimation (MLE)
\[ \max_\theta \ -\frac{1}{2} y^T K^{-1} y - \frac{1}{2} \log(\det K) - \frac{n}{2} \log 2\pi \]
The log-det term poses a significant challenge for large-scale computations:
• Cholesky of K: prohibitively expensive!
• log(det K) = tr(log K): need some matrix function methods to handle the log
• No existing method to evaluate the log-det term to sufficient accuracy
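To make the cost of the direct route concrete, here is a minimal dense sketch of computing log(det K) from a Cholesky factor K = L Lᵀ, using log(det K) = 2 Σ log L_ii. It is a textbook illustration on a tiny matrix, not the method proposed in the talk, and the function names are illustrative. The triple loop makes the O(n³) work visible, which is what becomes prohibitive when n reaches 10⁶.

```python
import math


def cholesky(a):
    """Dense Cholesky factorization a = L L^T of a symmetric positive definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def logdet_spd(a):
    """log(det a) = 2 * sum(log L_ii) for a symmetric positive definite matrix a."""
    L = cholesky(a)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(a)))


K_small = [[4.0, 2.0], [2.0, 3.0]]  # det = 8
value = logdet_spd(K_small)
```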
Sample Average Approximation of Maximum Likelihood Estimation (MLE)
We (Anitescu et al. 2012) consider approximately solving the first-order optimality conditions instead:
• A randomized trace estimator tr(A) = E[uᵀAu]
  – u has i.i.d. entries taking ±1 with equal probability
\[ \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y - \frac{1}{2} \operatorname{tr}\left[ K^{-1} (\partial_j K) \right] \approx \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y - \frac{1}{2N} \sum_{i=1}^{N} u_i^T \left[ K^{-1} (\partial_j K) \right] u_i = 0 \]
• It becomes a stochastic nonlinear equation
• As N tends to infinity, the solution approaches the true estimate
• Numerically, one must solve linear systems with O(N) right-hand sides.
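The randomized trace estimator can be sketched in a few lines. The example below is an illustration on a small explicit matrix, with hypothetical names such as `hutchinson_trace`. In the setting of the slides the matrix would be K⁻¹(∂ⱼK), applied through linear solves rather than stored. The estimator only needs matrix-vector products, so `matvec` is passed in as a function.

```python
import random


def hutchinson_trace(matvec, n, num_samples, rng):
    """Estimate tr(A) as the average of u^T A u over random sign vectors u."""
    total = 0.0
    for _ in range(num_samples):
        u = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # i.i.d. +/-1 entries
        au = matvec(u)
        total += sum(ui * vi for ui, vi in zip(u, au))
    return total / num_samples


A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]  # trace = 9


def matvec(u):
    """Apply the explicit matrix A to a vector."""
    return [sum(aij * uj for aij, uj in zip(row, u)) for row in A]


estimate = hutchinson_trace(matvec, 3, 20000, random.Random(0))
```

Only the off-diagonal entries contribute variance, so the estimate is exact for a diagonal matrix and noisier when the off-diagonal mass is large.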
Convergence of Stochastic Programming - SAA
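• Let
\[ \theta : \text{truth} \]
\[ \hat{\theta} : \text{sol of } \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y - \frac{1}{2} \operatorname{tr}\left[ K^{-1} (\partial_j K) \right] = 0 \]
\[ \hat{\theta}^N : \text{sol of } F = \frac{1}{2} y^T K^{-1} (\partial_j K) K^{-1} y - \frac{1}{2N} \sum_{i=1}^{N} u_i^T \left[ K^{-1} (\partial_j K) \right] u_i = 0 \]
• First result:
\[ [V^N]^{-1/2} (\hat{\theta}^N - \hat{\theta}) \xrightarrow{D} \text{standard normal}, \qquad V^N = [J^N]^{-T} \Sigma^N [J^N]^{-1} \]
where
\[ J^N = \nabla F(\hat{\theta}^N) \quad \text{and} \quad \Sigma^N = \operatorname{cov}\{ F(\hat{\theta}^N) \} \]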
Simulation: We scale
• Truth θ = [7, 10], Matern ν = 1.5
[Figure, left panel: curves labeled 1 and 2 versus matrix dimension n (10^4 to 10^6); vertical axis from 6 to 11.]
[Figure, right panel: time (seconds), 10^1 to 10^5, versus matrix dimension n (10^4 to 10^6).]
64x64 grid: 2.56 mins, func eval: 7
128x128 grid: 6.62 mins, func eval: 7
256x256 grid: 1.1 hours, func eval: 8
512x512 grid: 2.74 hours, func eval: 8
1024x1024 grid: 11.7 hours, func eval: 8
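One way to read the timing figures is to fit a power law time ≈ C·nᵖ, with n the number of grid points, by least squares in log-log coordinates. The sketch below takes the five reported wall-clock times at face value. The slides do not state the hardware or whether it changed between runs, so the fitted exponent is only indicative.

```python
import math

# Reported run times, converted to minutes, keyed by grid side length.
timings_minutes = {64: 2.56, 128: 6.62, 256: 1.1 * 60, 512: 2.74 * 60, 1024: 11.7 * 60}


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den


# Matrix dimension n is the number of grid points, side * side.
points = [(side * side, minutes) for side, minutes in timings_minutes.items()]
exponent = loglog_slope(points)
```

On these numbers the fitted exponent comes out close to 1, that is, roughly linear growth in n. A dense Cholesky factorization would instead grow like n³.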
2.1 Stochastic Programming for GP
• Prof. Stein will have some as well
• How do we precondition?
• Are there classes of processes for which we can prove global convergence?
• Can we prove global convergence without function values?
• Can we exploit hierarchical structure?
• Can we approximate the matrix-vector multiplication efficiently AND maintain positive definiteness?
• (The ScalaGAUSS project).
2.2: DECISION (CONTROL, DESIGN, PLANNING) OF ENERGY SYSTEMS UNDER UNCERTAINTY
Ambient Condition Effects in Energy Systems
Operation of Energy Systems is Strongly Affected by Ambient Conditions
- Power Grid Management: Predict Spatio-Temporal Demands (Douglas et al. 1999)
- Power Plants: Generation levels affected by air humidity and temperature (General Electric)
- Petrochemical: Heating and Cooling Utilities (ExxonMobil)
- Buildings: Heating and Cooling Needs (Braun et al. 2004)
- (Focus) Next Generation Energy Systems assume a major renewable energy penetration: Wind + Solar + Fossil (Beyer et al. 1999)
- Increased reliance on renewables must account for variability of ambient conditions, which cannot be done deterministically …
- We must optimize operational and planning decisions accounting for the uncertainty in ambient conditions (and others, e.g. demand)
- Optimization Under Uncertainty.
Wind Power Profiles
Multifaceted Mathematics for Complex Energy Systems (M2ACS)
Project Director: Mihai Anitescu, Argonne National Lab
Goals:
• By taking a holistic view, develop deep mathematical understanding and effective algorithms to remove current bottlenecks in analysis, simulation, and optimization of complex energy systems.
• Address the mathematical and computational complexities of analyzing, designing, planning, maintaining, and operating the nation's electrical energy systems and related infrastructure.
Integrated Novel Mathematics Research:
• Predictive modeling that accounts for uncertainty and errors
• Mathematics of decisions that allow hierarchical, data-driven and real-time decision making
• Scalable algorithms for optimization and dynamic simulation
• Integrative frameworks leveraging model reduction and multiscale analysis
Long-Term DOE Impact:
• Development of new mathematics at the intersection of multiple mathematical sub-domains
• Addresses a broad class of applications for complex energy systems, such as:
  • Planning for power grid and related infrastructure
  • Analysis and design for renewable energy integration
Team: Argonne National Lab (Lead), Pacific Northwest National Lab, Sandia National Lab, University of Wisconsin, University of Chicago
Representative decision-making activities and their time scales in electric power systems. Image courtesy of Chris de Marco (U-Wisconsin).
Stochastic Predictive Control
[Block diagram labels: Dynamic System Model, Stochastic Optimization, Weather Model, Low-Level Control, Forecast, Energy System, Set-Points, Forecast & Uncertainty, Measurements, Stochastic NLMPC]
Two-stage Stoch Prog:
\[ \min_{x_0} \left\{ f_0(x_0) + \mathbb{E}\left[ \min_{x} f(x, \omega) \right] \right\} \]
\[ \text{subj. to } g_0(x_0) = b_0, \quad g_i(x_0, x_i) = b_i, \quad i = 1, 2, \ldots, S, \quad x_0 \ge 0, \ x_i \ge 0 \]
Stochastic Unit Commitment with Wind Power
• Wind Forecast – WRF (Weather Research and Forecasting) Model
  – Real-time grid-nested 24h simulation
  – 30 samples require 1h on 500 CPUs (Jazz@Argonne)
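\[ \min \ \text{COST} = \frac{1}{N_s} \sum_{s \in \mathcal{S}} \left( \sum_{j \in \mathcal{N}} \sum_{k \in \mathcal{T}} c^{p}_{sjk} + c^{u}_{jk} + c^{d}_{jk} \right) \]
\[ \text{s.t. } \sum_{j \in \mathcal{N}} p_{sjk} + \sum_{j \in \mathcal{N}_{wind}} p^{wind}_{sjk} = D_k, \quad s \in \mathcal{S}, \ k \in \mathcal{T} \]
\[ \sum_{j \in \mathcal{N}} \bar{p}_{sjk} + \sum_{j \in \mathcal{N}_{wind}} p^{wind}_{sjk} \ge D_k + R_k, \quad s \in \mathcal{S}, \ k \in \mathcal{T} \]
\[ \text{ramping constr., min. up/down constr.} \]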
Zavala et al. 2010.
Thermal Units Schedule? Minimize Cost, Satisfy Demand, Have a Reserve, Dispatch through network.
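The structure of the stochastic unit commitment problem can be sketched on a deliberately tiny, made-up instance: one time period, three thermal units, three wind scenarios. This is not the Illinois study. Commitment decisions are shared by all scenarios (first stage), while dispatch adapts to each wind scenario (second stage). All numbers and names are illustrative assumptions. Ramping and minimum up/down constraints are omitted, and the problem is solved by brute-force enumeration rather than by a mixed-integer solver.

```python
from itertools import product


def dispatch_cost(units, committed, net_demand):
    """Merit-order dispatch: serve net demand with the cheapest committed units first."""
    remaining = max(net_demand, 0.0)
    cost = 0.0
    for (capacity, marginal_cost, _), on in sorted(zip(units, committed), key=lambda uc: uc[0][1]):
        if on:
            p = min(capacity, remaining)
            cost += marginal_cost * p
            remaining -= p
    return cost if remaining <= 1e-9 else float("inf")


def stochastic_unit_commitment(units, demand, reserve, wind_scenarios):
    """Enumerate commitments; minimize start-up cost plus scenario-averaged dispatch cost."""
    best_cost, best_commitment = float("inf"), None
    for committed in product((0, 1), repeat=len(units)):
        capacity = sum(u[0] for u, on in zip(units, committed) if on)
        # Reserve constraint in every scenario: committed capacity + wind >= demand + reserve.
        if any(capacity + w < demand + reserve for w in wind_scenarios):
            continue
        startup = sum(u[2] for u, on in zip(units, committed) if on)
        expected = sum(dispatch_cost(units, committed, demand - w) for w in wind_scenarios) / len(wind_scenarios)
        if startup + expected < best_cost:
            best_cost, best_commitment = startup + expected, committed
    return best_cost, best_commitment


# Hypothetical units: (capacity in MW, marginal cost per MWh, start-up cost).
units = [(100.0, 10.0, 50.0), (100.0, 30.0, 20.0), (50.0, 50.0, 5.0)]
cost, commitment = stochastic_unit_commitment(units, demand=150.0, reserve=10.0, wind_scenarios=[20.0, 40.0, 60.0])
```

In this instance the two cheapest units are committed and the expensive peaking unit stays off. Enumeration is only viable for a handful of units, which is why the integer variables are listed among the challenges at scale.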
[Figure: Total Power [MW], 0 to 1200, versus Time [hr], 0 to 72.]
Unit commitment & energy dispatch with uncertain wind power generation for the State of Illinois, assuming 20% wind power penetration, using the same wind farm sites as those existing today.
Full integration with 10 thermal units to meet demands. Consider dynamics of start-up, shutdown, set-point changes.
The solution is only 1% more expensive than the one with exact information. Solution on average infeasible at 10%.
[Plot labels: wind power; legend: Demand, Samples, Wind, Thermal]
Wind power forecast and stochastic programming
2.2 Stochastic Programming Challenges
• How do we produce efficient confidence intervals for SAA (as the number of samples may be limited)?
• How do we efficiently solve the resulting, much larger problem (we solve problems with 3B variables, but about 10 times slower than we would like)?
• How do we deal with integer variables (millions of them)?
• How do we insert economic actors (economic equilibria) when we have integer variables?
• Can we use machine learning concepts to reduce decision space?
• How do we achieve stability, low memory and fast convergence?
• …