High Performance Solvers for Semidefinite Programs
Makoto Yamashita @ Tokyo Tech
Katsuki Fujisawa @ Chuo Univ
Mituhiro Fukuda @ Tokyo Tech
Kazuhiro Kobayashi @ NMRI
Kazuhide Nakata @ Tokyo Tech
Maho Nakata @ RIKEN
KSIAM Annual Meeting @ Jeju, 2011/11/25 (2011/11/25-2011/11/26)
This talk is supported by Ewha University
Our interests & SDPA Family
How fast can we solve SDPs? How large an SDP can we solve? How accurately can we solve SDPs?
SDPA Homepage http://sdpa.sf.net/
SDPA: base solver
SDPARA: parallel
SDPA-M: Matlab
SDPA-C: structural sparsity
SDPARA-C: parallel, structural sparsity
SDPA-GMP: multiple precision
SDPA Online Solver
1. Log in to the online solver
2. Upload your problem
3. Push the 'Execute' button
4. Receive the result via Web/Mail
http://sdpa.sf.net/ ⇒ Online Solver
Outline
1. SDP Applications
2. Primal-Dual Interior-Point Methods
3. Inside of SDPARA (Large & Fast)
4. Inside of SDPA-GMP (Accurate)
5. Conclusion
SDP Applications
Control Theory
Quantum Chemistry
Sensor Network Localization Problem
Polynomial Optimization
SDP Applications 1. Control Theory
Against swing, we want to keep stability.
Stability Condition ⇒ Lyapunov Condition ⇒ SDP
SDP Applications 2. Quantum Chemistry
Ground state energy
Locate electrons
Schrödinger Equation ⇒ Reduced Density Matrix ⇒ SDP
SDP Applications 3. Sensor Network Localization
Distance Information ⇒ Sensor Locations
Protein Structure
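SDP Applications 4. Polynomial Optimization
\[ \min:\ \text{Polynomial} \quad \text{s.t.}\quad \text{Polynomial constraints} \]
NP-hard in general
Very good lower bound by SDP relaxation method
For example,
\[ \min:\ f(x) = \sum_{i=1}^{n-1} \left( (x_i - 1)^2 + 100\,(x_{i+1} - x_i^2)^2 \right), \quad x \in \mathbb{R}^n \]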
SDP Applications
Control Theory
Quantum Chemistry
Polynomial Optimization
Sensor Network Localization Problem
How Large & How Fast & How Accurate
Many Applications
Standard form
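\[ (P)\quad \min\ C \bullet X \quad \text{s.t.}\quad A_k \bullet X = b_k\ (k = 1, \ldots, m),\quad X \succeq O \]
\[ (D)\quad \max\ \sum_{k=1}^{m} b_k z_k \quad \text{s.t.}\quad \sum_{k=1}^{m} A_k z_k + Y = C,\quad Y \succeq O \]
The variables are $X, Y \in \mathbb{S}^n$, $z \in \mathbb{R}^m$
Inner Product is $X \bullet Y = \sum_{i,j=1}^{n} X_{ij} Y_{ij}$
The size is roughly determined by
$m$: the number of equality constraints in $(P)$
$n$: the size of $X$ and $Y$
Our target: $m > 30{,}000$
Ordinary solver: $m < 10{,}000$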
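As a minimal sketch of the standard form, the following Python snippet builds a hypothetical toy instance (n = 2, m = 1; the data `C`, `A`, `b` are made up for illustration and do not come from the talk) and checks that the duality gap of a feasible pair equals $X \bullet Y$.

```python
import numpy as np

def inner(P, Q):
    """Inner product P . Q = sum_ij P_ij * Q_ij for symmetric matrices."""
    return float(np.sum(P * Q))

# Hypothetical toy data (assumption, not from the talk): n = 2, m = 1.
C = np.diag([1.0, 2.0])
A = [np.eye(2)]           # A_1
b = np.array([1.0])       # b_1

# Primal-feasible X: A_1 . X = b_1 and X is positive semidefinite.
X = np.diag([1.0, 0.0])

# Dual-feasible (Y, z): sum_k A_k z_k + Y = C and Y is positive semidefinite.
z = np.array([1.0])
Y = C - sum(z_k * A_k for z_k, A_k in zip(z, A))

primal_value = inner(C, X)      # objective of (P)
dual_value = float(b @ z)       # objective of (D)
gap = primal_value - dual_value
print(primal_value, dual_value, gap, inner(X, Y))
```

For any feasible pair the gap $C \bullet X - \sum_k b_k z_k$ equals $X \bullet Y \ge 0$; here both are zero, so this pair is optimal.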
Primal-Dual Interior-Point Methods
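[Figure: the feasible region in $(X, Y, z) \in \mathbb{S}^n \times \mathbb{S}^n \times \mathbb{R}^m$. Starting from $(X^0, Y^0, z^0)$, the iterates $(X^1, Y^1, z^1)$, $(X^2, Y^2, z^2)$, ... follow the central path toward the optimal solution $(X^*, Y^*, z^*)$. Each step moves along a search direction $(dX, dY, dz)$ toward a target on the central path.]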
Schur Complement Matrix
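Schur Complement Equation:
\[ B\,dz = r \]
\[ dY = D - \sum_{j=1}^{m} A_j\,dz_j \]
\[ \widetilde{dX} = (R - X\,dY)\,Y^{-1}, \qquad dX = (\widetilde{dX} + \widetilde{dX}^T)/2 \]
Schur Complement Matrix (SCM):
\[ B_{ij} = (X A_i Y^{-1}) \bullet A_j \]
1. ELEMENTS (Evaluation of SCM)
2. CHOLESKY (Cholesky factorization of SCM)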
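The two steps can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the formula $B_{ij} = (X A_i Y^{-1}) \bullet A_j$ is taken as the definition of the SCM, the data are random, and the helper names (`schur_complement_matrix`, `random_symmetric`, `random_pd`) are hypothetical. SDPA itself uses far more elaborate sparse formulas.

```python
import numpy as np

def schur_complement_matrix(A, X, Y):
    """ELEMENTS: B_ij = (X A_i Y^{-1}) . A_j for symmetric A_i, positive definite X, Y."""
    m = len(A)
    Y_inv = np.linalg.inv(Y)
    B = np.empty((m, m))
    for i in range(m):
        G = X @ A[i] @ Y_inv             # shared by every entry of row i
        for j in range(m):
            B[i, j] = np.sum(G * A[j])   # inner product G . A_j
    return B

rng = np.random.default_rng(0)

def random_symmetric(n):
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

def random_pd(n):
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

n, m = 4, 3
A = [random_symmetric(n) for _ in range(m)]
X, Y = random_pd(n), random_pd(n)
r = rng.standard_normal(m)

B = schur_complement_matrix(A, X, Y)
L = np.linalg.cholesky(B)                          # CHOLESKY: B = L L^T
dz = np.linalg.solve(L.T, np.linalg.solve(L, r))   # solve B dz = r
```

Each row of `B` costs matrix products of size n, which is why ELEMENTS can dominate the run time when both m and n are large.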
Computation time on single processor
              Control      POP
ELEMENTS        22228      668
CHOLESKY         1593     1992
Total           23986     2713
Time unit is second, SDPA 7, Xeon 5460 (3.16GHz)
95%
SDPARA replaces these bottlenecks by parallel computation
ELEMENTS ⇒ Row-wise distribution
CHOLESKY ⇒ Two-dimensional block-cyclic distribution
Row-wise distribution
All rows are independent
Assign processors in a cyclic manner
Simple idea ⇒ Very EFFICIENT
High scalability
[Figure: Example with $B \in \mathbb{S}^8$, $B_{ij} = (X A_i Y^{-1}) \bullet A_j$. The eight rows of $B$ are assigned to Processors 1 to 4 in a cyclic manner, two rows per processor.]
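The row-wise idea can be sketched as follows. This is a serial simulation under assumptions, not SDPARA's MPI code: the loop over `rank` stands in for independent processors, the data are random, and the names `rows_of` and `compute_row` are hypothetical.

```python
import numpy as np

def rows_of(rank, num_procs, m):
    """Row indices (0-based) handled by processor `rank` under cyclic assignment."""
    return list(range(rank, m, num_procs))

def compute_row(i, A, X, Y_inv):
    """Row i of B: B_ij = (X A_i Y^{-1}) . A_j for j = 0, ..., m-1."""
    G = X @ A[i] @ Y_inv
    return np.array([np.sum(G * A_j) for A_j in A])

rng = np.random.default_rng(1)
m, n, num_procs = 8, 4, 4

A = []
for _ in range(m):
    M = rng.standard_normal((n, n))
    A.append((M + M.T) / 2)
M = rng.standard_normal((n, n))
X = M @ M.T + n * np.eye(n)
M = rng.standard_normal((n, n))
Y = M @ M.T + n * np.eye(n)
Y_inv = np.linalg.inv(Y)

B = np.zeros((m, m))
for rank in range(num_procs):            # each iteration stands in for one processor
    for i in rows_of(rank, num_procs, m):
        B[i, :] = compute_row(i, A, X, Y_inv)   # no data from other rows is needed
```

Because `compute_row` reads only `A`, `X` and `Y_inv`, the processors never exchange data while the rows are being evaluated.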
Block Algorithm for Cholesky factorization
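Triangular Factorization
\[ B = U^T U \qquad (U:\ \text{upper triangular matrix}) \]
\[ \begin{pmatrix} B_{11} & B_{12} \\ B_{12}^T & B_{22} \end{pmatrix} = \begin{pmatrix} U_{11}^T & O \\ U_{12}^T & U_{22}^T \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ O & U_{22} \end{pmatrix} = \begin{pmatrix} U_{11}^T U_{11} & U_{11}^T U_{12} \\ U_{12}^T U_{11} & U_{12}^T U_{12} + U_{22}^T U_{22} \end{pmatrix} \]
$B_{11} \in \mathbb{S}^p$, $B_{22} \in \mathbb{S}^{m-p}$ (e.g., $p = 4$)
1. $B_{11} = U_{11}^T U_{11}$ (Small Cholesky factorization)
2. $U_{12} = U_{11}^{-T} B_{12}$
3. $B_{22} \leftarrow B_{22} - U_{12}^T U_{12}$
Steps 2 and 3 are Block Updates ⇒ Parallel Computing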
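The three steps translate into a short serial sketch in NumPy. It is an illustration under assumptions, not the ScaLAPACK routine: the function name `block_cholesky` is hypothetical, and step 2 uses a general solver rather than a dedicated triangular solve.

```python
import numpy as np

def block_cholesky(B, p):
    """Return upper triangular U with B = U^T U, processing p x p diagonal blocks."""
    B = np.array(B, dtype=float)    # working copy; the trailing block is overwritten
    m = B.shape[0]
    U = np.zeros_like(B)
    for s in range(0, m, p):
        e = min(s + p, m)
        # 1. Small Cholesky factorization: B11 = U11^T U11.
        U11 = np.linalg.cholesky(B[s:e, s:e]).T
        U[s:e, s:e] = U11
        if e < m:
            # 2. Block update: U12 = U11^{-T} B12.
            U12 = np.linalg.solve(U11.T, B[s:e, e:])
            U[s:e, e:] = U12
            # 3. Block update: B22 <- B22 - U12^T U12.
            B[e:, e:] -= U12.T @ U12
    return U

rng = np.random.default_rng(2)
M = rng.standard_normal((8, 8))
B0 = M @ M.T + 8 * np.eye(8)        # hypothetical positive definite test matrix
U = block_cholesky(B0, p=4)
print(np.allclose(U.T @ U, B0))
```

Steps 2 and 3 are matrix-matrix operations on large blocks, which is what makes them suitable for parallel execution.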
Two-dimensional block-cyclic distribution
ScaLAPACK library
Redistributing B from the row-wise distribution to TDBCD requires network communication
Cholesky on TDBCD is much faster than Cholesky on the row-wise distribution
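Example: $B \in \mathbb{S}^8$, block size 2, Processors 1 to 4 arranged as a 2 x 2 grid (each entry shows the processor that owns it)
1 1 2 2 1 1 2 2
1 1 2 2 1 1 2 2
3 3 4 4 3 3 4 4
3 3 4 4 3 3 4 4
1 1 2 2 1 1 2 2
1 1 2 2 1 1 2 2
3 3 4 4 3 3 4 4
3 3 4 4 3 3 4 4
[Figure: the same $B$ under the row-wise distribution, with rows assigned to Processors 1 to 4 in a cyclic manner, shown for comparison.]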
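The ownership pattern of the 8 x 8 example can be reproduced with a small index calculation. This is a sketch assuming block size 2 and a 2 x 2 processor grid numbered row by row, as read off the layout; the function name `owner` is hypothetical and is not a ScaLAPACK call.

```python
def owner(i, j, block=2, grid_rows=2, grid_cols=2):
    """1-based processor number that owns entry (i, j), with 0-based i and j."""
    proc_row = (i // block) % grid_rows   # block row index, wrapped around the grid
    proc_col = (j // block) % grid_cols   # block column index, wrapped around the grid
    return proc_row * grid_cols + proc_col + 1

layout = [[owner(i, j) for j in range(8)] for i in range(8)]
for row in layout:
    print(" ".join(str(p) for p in row))
```

Every processor receives blocks from all parts of the matrix, so the work stays balanced as the Cholesky factorization shrinks the active trailing block.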
Numerical Results of SDPARA
Quantum Chemistry (m=7230, SCM=100%), middle size
SDPARA 7.3.1, Xeon X5460, 3.16GHz x2, 48GB memory
Time in seconds (plotted on a log scale in the original slide)
Servers          1       4      16
ELEMENTS     28678    7192    1826
CHOLESKY       548     131      47
Total        29700    7764    2294
ELEMENTS 15x speedup
CHOLESKY 12x speedup
Total 13x speedup
Very FAST!!
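The quoted speedups can be checked with a little arithmetic. The sketch below assumes that the fused plot labels split as 1826 / 548 and 131 / 47 seconds. The parallel efficiency (speedup divided by 16) is an added illustration and is not a figure from the talk.

```python
# Seconds per number of servers, read off the plot labels (assumed split, see above).
times = {
    "ELEMENTS": {1: 28678, 4: 7192, 16: 1826},
    "CHOLESKY": {1: 548, 4: 131, 16: 47},
    "Total": {1: 29700, 4: 7764, 16: 2294},
}

# Speedup on 16 servers relative to 1 server.
speedup = {name: t[1] / t[16] for name, t in times.items()}
# Fraction of the ideal 16x speedup that is achieved.
efficiency = {name: s / 16 for name, s in speedup.items()}

for name in times:
    print(f"{name}: {speedup[name]:.1f}x speedup, {efficiency[name]:.0%} efficiency")
```

ELEMENTS scales almost ideally because its rows are independent, while CHOLESKY needs communication between servers.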
Acceleration by Multiple Threading
Modern processors have multiple cores
Multithreading is becoming common
[Figure: the eight rows of $B$ are assigned to 2 Processors x 2 Threads on each processor (Processor1:Thread1, Processor1:Thread2, Processor2:Thread1, Processor2:Thread2), two rows per thread.]
Two-level Parallel Computing
Comparison with PCSDP
PCSDP: developed by Ivanov & de Klerk
SDP: B.2P Quantum Chemistry (m = 7230, SCM = 100%)
Xeon X5460, 3.16GHz x2 (8core), 48GB memory
Servers       1       2       4      8     16
PCSDP     53768   27854   14273   7995   4050
SDPARA     5983    3002    1680    901    565
Time unit is second
SDPARA is about 8x faster (7.2x to 9.3x, depending on the number of servers) by MPI & Multi-Threading (two-level parallelization)
Extremely Large-Scale SDPs
16 Servers [Xeon X5670 (2.93GHz), 128GB Memory]
Problem          m         SCM    time
Esc32_b (QAP)    198,432   100%   129,186 seconds (1.5 days)
Other solvers can handle only $m \le 30{,}000$
The LARGEST solved SDP in the world
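A back-of-the-envelope calculation shows why a problem of this size needs distributed memory. The sketch assumes that "SCM 100%" means a fully dense Schur complement matrix stored in 8-byte double precision; both points are assumptions, not statements from the talk.

```python
m = 198_432                      # number of equality constraints (Esc32_b)
bytes_per_entry = 8              # assumption: dense storage in double precision

scm_gb = m * m * bytes_per_entry / 1e9   # memory for the m x m SCM in GB
cluster_gb = 16 * 128                    # total memory of 16 servers with 128GB each
days = 129_186 / 86_400                  # reported run time converted to days

print(f"SCM: {scm_gb:.0f} GB, cluster: {cluster_gb} GB, time: {days:.2f} days")
```

Under these assumptions the SCM alone needs about 315 GB, more than one 128GB server holds but well within the 2048 GB of the whole cluster.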
Numerical Accuracy
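One weak point of PDIPM:
The optimal solution $(X^*, Y^*, z^*)$ satisfies $X^* Y^* = O$, where $(X^*, Y^*) = \lim_{k \to \infty} (X^k, Y^k)$, so at least one of $X^*$, $Y^*$ is singular.
PDIPM requires $(X^k)^{-1}$ & $(Y^k)^{-1}$; for example, $B_{ij} = (X A_i Y^{-1}) \bullet A_j$.
Eventually, numerical trouble (often, Cholesky fails)
Ordinary double precision in C or C++
Arbitrary precision in the GMP library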
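The loss of accuracy can be seen on a tiny example. The sketch assumes a hypothetical 2 x 2 pair with $X Y = \mu I$, mimicking iterates near the central path; the function name `central_path_pair` is made up for illustration.

```python
import numpy as np

def central_path_pair(mu):
    """Hypothetical 2 x 2 iterates with X Y = mu * I."""
    X = np.diag([1.0, mu])
    Y = np.diag([mu, 1.0])
    return X, Y

for mu in (1e-2, 1e-8, 1e-16):
    X, Y = central_path_pair(mu)
    # The condition number of Y grows like 1 / mu as mu approaches 0.
    print(mu, np.linalg.cond(Y), np.max(np.abs(np.linalg.inv(Y))))
```

Once the condition number reaches about $10^{16}$, double precision has no reliable digits left in $Y^{-1}$, which is where the Cholesky factorization of the SCM tends to break down.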
Numerical Precision
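Ordinary double precision:
64 bit = 1 bit (sign) + 11 bit (exponent) + 52 bit (fraction), i.e. 53 significant bits with the implicit leading 1
Fields a (sign), b (exponent), c (fraction): value = $(-1)^a \times 2^b \times 1.c$
accuracy = $10^{-16}$
SDPA-GMP: Replace BLAS (Basic Linear Algebra Subprograms) by MPLAPACK (Multiple precision LAPACK)
We can set the number of bits of the fraction part arbitrarily (for example, 200 bit = $10^{-53}$).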
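The effect of the fraction length can be demonstrated with the Python standard library. Here the `decimal` module stands in for GMP purely for illustration (SDPA-GMP itself uses the GMP library), and the 60-digit setting is an arbitrary choice.

```python
import sys
from decimal import Decimal, getcontext

# Double precision: 53 significant bits, machine epsilon 2^-52 (about 2.2e-16).
eps = sys.float_info.epsilon
mantissa_bits = sys.float_info.mant_dig

# In double precision a term of size 1e-20 is lost when added to 1.
double_sum = 1.0 + 1e-20

# With 60 significant decimal digits the same sum is represented exactly.
getcontext().prec = 60
exact_sum = Decimal(1) + Decimal("1e-20")

print(mantissa_bits, eps, double_sum == 1.0, exact_sum)
```

The price of the extra digits is speed, since every arithmetic operation is done in software instead of in hardware.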
Numerically Hard problem Test Problem
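PDIPM is stable if Slater's condition holds:
\[ \exists\, X \succ O \quad \text{s.t.}\quad A_k \bullet X = b_k\ (k = 1, \ldots, m) \]
Graph Partition Problem
\[ \min:\ C \bullet X \quad \text{s.t.}\quad e e^T \bullet X = \text{(small parameter} \ge 0\text{)},\quad e_i e_i^T \bullet X = 1\ (i = 1, \ldots, n),\quad X \succeq O \]
With the parameter equal to 0, the problem has no interior:
\[ \{ X : e e^T \bullet X = 0,\ e_i e_i^T \bullet X = 1,\ X \succ O \} = \emptyset \]
Small parameter ⇒ Numerically Hard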
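A small numerical check illustrates why the graph partition constraints with right-hand side 0 leave no interior. The point `X` below is a hypothetical feasible matrix chosen for illustration (n = 4); it is not data from the talk.

```python
import numpy as np

n = 4
e = np.ones(n)

# Hypothetical feasible point: X = v v^T with v = (1, -1, 1, -1).
v = np.array([1.0, -1.0, 1.0, -1.0])
X = np.outer(v, v)

diag_ok = np.allclose(np.diag(X), 1.0)    # e_i e_i^T . X = X_ii = 1 for every i
sum_zero = abs(e @ X @ e) < 1e-12         # e e^T . X = e^T X e = 0
min_eig = np.linalg.eigvalsh(X).min()     # 0 up to rounding, so X is singular

print(diag_ok, sum_zero, min_eig)
```

Any feasible $X$ satisfies $e^T X e = 0$, while a positive definite $X$ would give $e^T X e > 0$, so every feasible point lies on the boundary of the semidefinite cone.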
Numerical Results of SDPA-GMP
Small parameter ⇒ Numerically Hard
Parameter   Solver     Accuracy    Time (second)
1.0e-1      SDPA       1.08e-8          2.03
            SDPA-GMP   4.80e-48     77760.19
1.0e-15     SDPA       1.63e-7          2.26
            SDPA-GMP   2.97e-48     82115.52
0           SDPA       5.26e-9          2.36
            SDPA-GMP   7.29e-24    105325.74
SDPA-GMP uses 300 digits
24 digits even for the no-interior case
Conclusion
SDPARA ⇒ How Fast & How Large: 100 times & $m \approx 200{,}000$
SDPA-GMP ⇒ How Accurate: $10^{-48}$
http://sdpa.sf.net/ & Online solver
Thank you very much for your attention.