Upload
ophrah
View
53
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Scalable Multi-Stage Stochastic Programming. Cosmin Petra and Mihai Anitescu Mathematics and Computer Science Division Argonne National Laboratory DOE Applied Mathematics Program Meeting April 3-5, 2010. Motivation . Sources of uncertainty in complex energy systems Weather - PowerPoint PPT Presentation
Citation preview
Scalable Multi-Stage Stochastic Programming
Cosmin Petra and Mihai Anitescu
Mathematics and Computer Science DivisionArgonne National Laboratory
DOE Applied Mathematics Program MeetingApril 3-5, 2010
Motivation
Sources of uncertainty in complex energy systems– Weather– Consumer Demand– Market prices
Applications @Argonne – Anitescu, Constantinescu, Zavala– Stochastic Unit Commitment with Wind Power Generation– Energy management of Co-generation– Economic Optimization of a Building Energy System
2
Stochastic Unit Commitment with Wind Power
Wind Forecast – WRF(Weather Research and Forecasting) Model– Real-time grid-nested 24h simulation – 30 samples require 1h on 500 CPUs (Jazz@Argonne)
3
1min COST
s.t. , ,
, ,
ramping constr., min. up/down constr.
wind
wind
p u dsjk jk jk
s j ks
sjk kj
windsjk
j
wik ksj
ndsk
jjk
j
c c cN
p D s k
p D R s k
p
p
S N T
N
N
N
N
S T
S T
Slide courtesy of V. Zavala & E. Constantinescu
Economic Optimization of a Building Energy System
4
1
0
2
2( ,0)
1min COST = ( ) ( ) ( ) ( ) ( )
s.t. ( ) ( ) ( ) ( ) ( ,0) , (0)
, ( ) ( ,0)
t TSelec elec gas
elec c elec h gas hj t
jgas elec elec j j jI
I h h c I W I I
j j jj jW W WI W
C C C dS
TC S T T T T
T T TT T k
x
( , )
0
0, ( , ) 0
(0, ) ( ), [ , ]( ) , {1
(
, ,
)
}
jA
jj W
WL
jW Wmin j maxI I I
TT L k
T x T x t t TT T T j S
T
Proactive management - temperature forecasting & electricity prices Minimize daily energy costs
1 hr 3 hr 6 hr 9 hr 12 hr 16 hr 24 hr0
20
40
60
80
100
Horizon
Rel
ativ
e C
ost [
%]
Slide courtesy of V. Zavala
Optimization under Uncertainty Two-stage stochastic programming with recourse (“here-and-now”)
5
0
0 0( )) ( ,xx
Min f x M xin f E_subj. to. 0 0 0
0
0
) )( ( ((
)0, ) 0
A bA B b
xx x
x
x
0 1 20, 1, , ,
1( )) (S
S
x iixx ix
Min f x xfS
0 0 0
0
0 1, ...,
,0, 0,
k k k k
k Sk
A bA B bx x
xx x
subj. to.1 2, , , S
( ) : ( ( ), ( ), ( ), ( ), ( ))A B b Q c
continuous discrete
Sampling
Inference Analysis
M samples
Sample average approximation (SAA)
Multi-stage SAA SP Problems – Scenario formulation
6
0
1 4
762 53
7 7
0 0
12
T Ti i i i i
i i
QMin x cx x
s.t. 0 0 0
1 0 1 1 1
2 1 2 2 2
3 1 3 3 3
4 4 4 4
5 4 5 5 5
6 4 6 6 6
7 7 7 7
0 2 4
0
4
1 3 5 6 70 0 0 0 0 0 0 0
xx x
x xx x
x xx xx x
A bA B b
A B bA B b
A B bA B bA B bA B b
x x x x x x xx x
x
Depth-first traversal of the scenario tree Nested half-arrow shaped Jacobian Block separable obj. func.
Linear Algebra of Primal-Dual Interior-Point Methods
7
12
T Tx Qx c x
subj. to.
Min
0Ax bx
Convex quadratic problem
0
T xQ Arhs
yA
IPM Linear System
1 1
1 1
2 2
2 2
01 2 0
0
0 00 0
0 00 0
0 00 0
0 0 00 0 0 0 0 0 0
T
T
TS S
S ST T T T
S
H BB A
H BB A
H BB A
A A A H AA
Two-stage SP
arrow-shaped linear system(via a permutation)
Multi-stage SP
nested
The Direct Schur Complement Method (DSC) Uses the arrow shape of H
Solving Hz=r
8
20
1 11 1 1 10
2 22 2 2
0
10 20 01 2 0
T T T
T T
T TS NS S S S
S c cS
T
Tc
T
L DH G L LL DH G L L
L DH G L LL L L L DG G G H L
10
1
10
, , 1, ,
, .
T T
T Ti i i c
i i i i i i i i
S
ci
c
L D L H L G L D i S
H H G D L CC G L
1
10 0
10
, 1, ,i i i
c i i
S
i
Lw r i S
w L r wL
1 , 0,...,i i iv iD Sw
10 0
0 0 ,
1, .
c
T Ti i i i
L v
L v
z
z
i S
L z
Back substitution
Implicit factorization
Diagonal solve Forward substitution
High performance computing with DSC
9
Gondzio (OOPS) 6-stages 1 billion variables Zavala et.al., 2007 (in IPOPT) Our experiments (PIPS) – strong scaling is investigated
Building energy system• Almost linear scaling
Unit commitment • Relaxation solved• Largest instance
• 28.9 millions variables• 1000 cores
12 x
on Fusion @ Argonne
Scalability of DSC
10
Unit commitment 76.7% efficiency
but not always the case
Large number of 1st stage variables: 38.6% efficiency
on Fusion @ Argonne
Preconditioned Schur Complement (PSC)
11
1
0 01
i i i
Nii
i
N
w r
r L
L
r w
1 , 0,...,i i iv iD Nw
0 0T T
i i i iz L v L z
10
1
Ti
N
ii
iH H GC G
1
0 1, , ,,T Ti i i i i i i iL L iD L H G L D N
(separate process)
TM M ML D L M
00 Krylov( , , )C Mz r
The Stochastic Preconditioner The exact structure of C is
IID subset of n scenarios:
The stochastic preconditioner (Petra & Anitescu, 2010)
For C use the constraint preconditioner (Keller et. al., 2000)
12
1
11
00
0
1.
0
T T Ti i i i
S
iiQ A B Q B A A
CA
S
SS
1 2{ , , , }nk k k K
1
1
1
01 .
i i i ii
n
k k k kki
T TnS Q A B B AQ
n
0
0
.0
TnS A
MA
The “Ugly” Unit Commitment Problem
13
DSC on P processes vs PSC on P+1 processOptimal use of PSC – linear scaling
Factorization of the preconditioner can not behidden anymore.
• 120 scenarios
Quality of the Stochastic Preconditioner
“Exponentially” better preconditioning (Petra & Anitescu 2010)
Proof: Hoeffding inequality
Assumptions on the problem’s random data1. Boundedness2. Uniform full rank of and
14
21
24 24Pr(| ( exp) 1| ) 2
||2 ||n SS max
np L S
S pS
)(A )(B
1
11
01
i i i ii
T Tn k
n
k k k ki
S Q A B BQ An
1
01
11 S
S i i i iT T
iiS Q A B BQ A
S
not restrictive
Quality of the Constraint Preconditioner
has an eigenvalue 1 with order of multiplicity .
The rest of the eigenvalues satisfy
Proof: based on Bergamaschi et. al., 2004.
15
0
0 0
TnS A
MA
0
0 0
TSS A
CA
p r
1M C 2r
m1
ax1 1( ) ( ) ( )0 .min n S n SS S M C S S
The Krylov Methods Used for
BiCGStab using constraint preconditioner M
Preconditioned Projected CG (PPCG) (Gould et. al., 2001)– Preconditioned projection onto the
– Does not compute the basis for Instead,
–
16
1
0 0 0 0T T
nP S ZZ Z Z
0 .KerA
1 10 0 0 0 0 0( )T
Ny xA rA A S
0
0
Pr .00
Tn g rS A
guA
is computed from
100 0
200 00
TS xS A r
yA r
0 0Cz =r
0Z 0 .KerA
Performance of the preconditioner Eigenvalues clustering & Krylov iterations
Affected by the well-known ill-conditioning of IPMs.
17
1
01
0
( ) , where and S
( ( )) ( ) ) ) )( ( ( )(
n N N
T T
S
S Q A D
S
Q A
S
BD B
E
A Parallel Interior-Point Solver for Stochastic Programming (PIPS)
Convex QP SAA SP problems
Input: users specify the scenario tree
Object-oriented design based on OOQP
Linear algebra: tree vectors, tree matrices, tree linear systems
Scenario based parallelism – tree nodes (scenarios) are distributed across processors– inter-process communication based on MPI– dynamic load balancing
Mehrotra predictor-corrector IPM
18
Tree Linear Algebra – Data, Operations & Linear Systems
Data– Tree vector: b, c, x, etc– Tree symmetric matrix: Q– Tree general matrix: A
Operations
Linear systems: for each non-leaf node a two-stage problem is solved via Schur complement methods as previously described.
19
12
T Tx Qx c x
subj. to.
Min
0Ax bx
0
1 4
762 53
*( ) ( )
( )
,( )( )
,
(
,
( )
)
n n n
n n
n a n a n n n
T T Tn n n c c
c C
n
n
u nx n
Ax A x B x n
y B y
u v vQ
A A y
x Q
TT
T
* {0}(6) 4 (ancestor map)(1) {2,3} (children map)
aC
T T
T
Parallelization – Tree Distribution
The tree is distributed across processes. Example: 3 processes
Dynamic load balancing of the tree– Number partitioning problem --> graph partitioning --> METIS
20
CPU 1 CPU 2 CPU 3
0
1
2 3 7
0
4
65
0
4
Conclusions
The DSC method offers a good parallelism in an IPM framework.
The PSC method improves the scalability.
PIPS – solver for SP problems.
PIPS is ready for larger problems: 100,000 cores.
21
Future work New math / stat
– Asynchronous optimization– SAA error estimate
New scalable methods for a more efficient software– Better interconnect between (iterative) linear algebra and sampling
• importance-based preconditioning• multigrid decomposition
– Target: emerging exa architectures
PIPS– IPM hot-start, parallelization of the nodes– Ensure compatibility with other paradigms: NLP, conic progr., MILP/MINLP solvers– A ton of other small enhancements
Ensure computing needs for important applications – Unit commitment with transmission constraints & market integration (Zavala)
22
Thank you for your attention!
Questions?
23