Multilevel Optimization Methods for Engineering Design and PDE-Constrained Optimization
Stephen G. Nash, George Mason University
Joint with R. Michael Lewis, College of William & Mary
[email protected]
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Setting
• Optimize a high-fidelity model:
    minimize fh(a) subject to <constraints>
• Also available: an easier-to-solve low-fidelity model:
    minimize fH(a) subject to <constraints>
• How can you exploit the low-fidelity model?
Some Applications
• PDE-constrained optimization
• Aeronautical design
• Nano-porous materials
• Image processing
• VLSI design
In many cases, there may be a hierarchy of lower-fidelity models
Example: Minimal Surface
[Figure: minimal surface computed on successively finer grids; N=3², 8×10⁴ flops; N=9², 2×10⁶ flops; N=18², 2×10⁷ flops; N=27², 1×10⁸ flops]
An Example: Model Framework
• An optimization model governed by a system of differential equations
• S(a, u) = 0: system of PDEs
    Design variables: a
    State variables: u
    Vary the discretization

    minimize f(a) = F(a, u(a))
    subject to S(a, u(a)) = 0
User-supplied Information
• Procedure to solve S(a, u) = 0 for u given a
• Procedure to evaluate Fh(a, u) and ∇a Fh(a, u) for any level h
• Procedures to implement the downdate I_h^H and update I_H^h operators:
    I_H^h = <constant> × (I_h^H)ᵀ
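As an illustration, the update and downdate operators might look like this for 1-D grids with nH interior coarse points and nh = 2nH + 1 interior fine points. The linear-interpolation update, the constant c = 1/2, and the zero-Dirichlet boundary convention are assumptions of this sketch, not necessarily the talk's actual operators.

```python
import numpy as np

def update(aH):
    """Update (prolongation) I_H^h: linear interpolation from a coarse
    1-D grid with nH interior points to a fine grid with 2*nH + 1 points."""
    nH = len(aH)
    ah = np.zeros(2 * nH + 1)
    ah[1::2] = aH                              # coarse node j sits at fine node 2j+1
    ah[2:-1:2] = 0.5 * (aH[:-1] + aH[1:])      # interior even nodes: average neighbors
    ah[0], ah[-1] = 0.5 * aH[0], 0.5 * aH[-1]  # zero Dirichlet boundary assumed
    return ah

def downdate(ah):
    """Downdate (restriction) I_h^H = (1/2) * (I_H^h)^T,
    i.e. full weighting with stencil [1/4, 1/2, 1/4]."""
    return 0.25 * ah[0:-2:2] + 0.5 * ah[1::2] + 0.25 * ah[2::2]
```

Building both operators as explicit matrices confirms the scaled-transpose relation I_h^H = ½ (I_H^h)ᵀ numerically.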
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Some Simplifications (for this talk)
• Either:
    No constraints in optimization models, or
    Constraint equations solved exactly
• But computational approaches are designed to extend to the constrained case:
    avoid explicit use of the (reduced) Hessian
    only need Hessian-vector products
    do not assume sparsity or a known sparsity pattern
Model Management: Algorithmic Template
• Given some initial guess ak of the solution: set a(0) ← ak
• (pre-smoothing) partially minimize fh to get a(1)
• (recursion) Compute
    v = ∇fH(a(1)) − ∇fh(a(1))
  Obtain a(2) by solving
    minimize fs(a) = fH(a) − vᵀa
  subject to bounds on a. Define search direction e = a(2) − a(1); line search: a(3) ← a(1) + αe
• (post-smoothing) partially minimize fh to get a(4)
• Set ak+1 ← a(4)
Multilevel (no coarsening): Algorithmic Template
• Given some initial guess ak of the solution: set a(0) ← ak
• (pre-smoothing) partially minimize fh to get a(1)
• (recursion) Compute
    v = ∇fH(a(1)) − ∇fh(a(1))
  Obtain a(2) by solving
    minimize fs(a) = fH(a) − vᵀa
  subject to bounds on a. Define search direction e = a(2) − a(1); line search: a(3) ← a(1) + αe
• (post-smoothing) partially minimize fh to get a(4)
• Set ak+1 ← a(4)
Multilevel: MG/Opt Algorithmic Template
• Given some initial guess ak of the solution: set a(0) ← ak
• (pre-smoothing) partially minimize fh to get a(1)
• (recursion) Compute
    v = ∇fH(I_h^H a(1)) − I_h^H ∇fh(a(1))
  Obtain a(2) by solving
    minimize fs(a) = fH(a) − vᵀa
  subject to bounds on a. Define search direction e = I_H^h (a(2) − I_h^H a(1)); line search: a(3) ← a(1) + αe
• (post-smoothing) partially minimize fh to get a(4)
• Set ak+1 ← a(4)
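To make the template concrete, here is a minimal two-level MG/Opt cycle on an unconstrained quadratic. The fixed-step gradient smoother, the exact coarse solve, the Galerkin coarse model, and the grid sizes are all assumptions of this sketch; in MG/Opt proper, the coarse subproblem is itself minimized recursively, and any convergent optimizer (e.g. truncated Newton) can serve as the smoother.

```python
import numpy as np

def smooth(grad, a, iters, lr=0.4):
    # "partial minimization": a few fixed-step gradient-descent steps
    for _ in range(iters):
        a = a - lr * grad(a)
    return a

def mgopt_cycle(ah, fh, gh, gH, AH, bH, R, P, nu=2):
    """One two-level MG/Opt cycle (sketch).  R is the downdate I_h^H,
    P is the update I_H^h, both given as matrices."""
    a1 = smooth(gh, ah, nu)              # pre-smoothing on the fine level
    aH0 = R @ a1
    v = gH(aH0) - R @ gh(a1)             # shift making f_s first-order consistent
    # coarse subproblem min f_H(a) - v^T a, solved exactly here
    a2 = np.linalg.solve(AH, bH + v)
    e = P @ (a2 - aH0)                   # prolonged coarse correction
    t = 1.0                              # backtracking line search on f_h
    while fh(a1 + t * e) > fh(a1) and t > 1e-8:
        t *= 0.5
    return smooth(gh, a1 + t * e, nu)    # post-smoothing

# demo: linear-interpolation update and its scaled transpose as downdate
nH, nh = 7, 15
P = np.zeros((nh, nH))
for j in range(nH):
    P[2 * j, j], P[2 * j + 1, j], P[2 * j + 2, j] = 0.5, 1.0, 0.5
R = 0.5 * P.T

A = 2 * np.eye(nh) - np.eye(nh, k=1) - np.eye(nh, k=-1)  # 1-D Laplacian
b = np.ones(nh)
AH, bH = R @ A @ P, R @ b                                # Galerkin coarse model
fh = lambda a: 0.5 * a @ A @ a - b @ a
gh = lambda a: A @ a - b
gH = lambda a: AH @ a - bH

a = np.zeros(nh)
for _ in range(5):
    a = mgopt_cycle(a, fh, gh, gH, AH, bH, R, P)
```

For a quadratic fine objective this cycle reduces to classical two-grid iteration, so a few cycles drive the fine-level gradient down sharply.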
The Reduced Hessian
• Properties of the reduced Hessian govern the behavior of MG/Opt
• Not the same as the PDE S(a, u): e.g., a hyperbolic PDE can have an elliptic reduced Hessian
• If L = Lagrangian and Sa, Su = Jacobians:
    ∇²f = L_aa − S_a^* S_u^{−*} L_ua − L_au S_u^{−1} S_a + S_a^* S_u^{−*} L_uu S_u^{−1} S_a
• We don’t know its properties or sparsity pattern
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Some of the Justifications
• Richer class of models
• Guarantees of convergence
• Better operator properties than for PDEs alone
• Good performance (even far from the solution)
• Connection to other optimization methods
Optimization Models are More Flexible
• Applies to a large variety of optimization models and constraints (not just for solving PDEs)
• Can add additional constraints: bounds, inequalities
• A true generalization of multigrid
Analogy: Nonlinear equations vs. Optimization
• If we solve the optimality conditions ∇f(x) = 0: lim_k ∇f(x_k) = 0
• If we minimize f(x): lim_k ∇f(x_k) = 0
Convergence
• If the underlying optimization algorithm is guaranteed to converge (to a stationary point) without the multilevel strategy,
• then MG/Opt is guaranteed to converge (to a stationary point)
When will MG/Opt work well?
• convex ≈ elliptic ≈ positive definite ≈ “nice”
• The reduced Hessian will be positive (semi) definite at the solution
• Multigrid works well for elliptic PDEs
• Optimization methods work well on convex problems
A Sample Model Problem
• Match a target function u*:
    minimize f(a) = F(a, u(a)) = ‖u − u*‖² + ‖u_x − u_x*‖²
• where u(a) solves the 1-way wave eqn.:
    u_t + c u_x = 0,  u(x, 0) = a(x)
• with 0 ≤ x ≤ 1, 0 ≤ t ≤ 1
• Computations use c = constant = 1
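A sketch of how f might be evaluated: march the state equation forward in time and measure the misfit to the target at the final time. The first-order upwind scheme, the CFL number, and the fixed inflow value are assumptions of this sketch; the talk does not specify the discretization.

```python
import numpy as np

def solve_state(a, c=1.0, T=1.0):
    """Advance u_t + c*u_x = 0, u(x,0) = a(x), to time T with a
    first-order upwind scheme (assumes c > 0; inflow value held fixed)."""
    n = len(a)
    dx = 1.0 / (n - 1)
    steps = int(np.ceil(T * c / (0.8 * dx)))   # CFL number 0.8
    lam = c * (T / steps) / dx
    u = a.astype(float).copy()
    for _ in range(steps):
        u[1:] -= lam * (u[1:] - u[:-1])        # upwind difference
    return u

def misfit(a, u_star, T=1.0):
    """f(a) = ||u - u*||^2 + ||u_x - u_x*||^2 at the final time."""
    u = solve_state(a, T=T)
    dx = 1.0 / (len(u) - 1)
    du, du_star = np.gradient(u, dx), np.gradient(u_star, dx)
    return np.sum((u - u_star) ** 2) + np.sum((du - du_star) ** 2)
```

By construction, an initial condition whose advected state matches the target gives zero misfit, and a shifted initial profile gives a clearly positive value.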
Model Problem: Wave Eqn.
• Hyperbolic equation
• Initial value moves without dissipation or dispersion
• Multigrid methods (applied to the constraint alone) are not ideal: the usual approach is to march forward in time
[Figure: initial solution and the solution at t = 1; the profile is advected unchanged]
Model Problem: Analysis of Continuous Problem
• The reduced Hessian is
    I − d²/dx²
• This is like the 1-dimensional Laplacian
• Ideal for multigrid
• Likely to cause difficulties for general-purpose large-scale optimization methods
• Analogous results for the discretized model problem
Model Problem: Computations (cont.)
• Number of design variables: n = 1025 [1,051,650 total variables: n(n+1)]

                       n=1025  n=513  n=257  n=129  n=65   n=33
  Optimization    it       99
                  ls      100
                  cg      967
  Successive      it       23     25     25     25     25     19
  refinement      ls      100     26     26     26     26     20
                  cg      956    216    225    214    220    145
  MG/Opt          it       10     12     14     16     18    232
                  ls       20     24     28     32     36    242
                  cg       57     65     79     89    112   1974

  (it = outer iterations; ls = line searches; cg = inner CG iterations)
Choice of Comparative Algorithms
• Why only compare MG/Opt with traditional optimization algorithms (and not MG for systems of equations)?
    Inequality constraints may be present
    Optimality conditions are not elliptic in the constrained case
    Hard to derive the reduced Hessian/system (thus hard to identify a good preconditioner)
    No obvious relationship between the original optimization model and the reduced system
MG/Opt & Steepest Descent
• The coarse-level problem is a first-order approximation to the fine-level problem:
    gradient of coarse-level problem at aH = I_h^H [gradient of fine-level problem at ah]
• Analogous to the first-order approximation used to derive the steepest-descent method
MG/Opt & Newton’s Method
• Multilevel line search: let φs(α) = f(a + α eh)
    Well-scaled search direction: φs′(1) ≈ 0
    Search direction of the form eh = I_H^h eH
• If the subproblems are solved accurately, then:
    φs′(1) = eHᵀ [I_h^H ∇²fh I_H^h − ∇²fH] eH + O(‖eH‖³)
• The search direction is “Newton-like”
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Scenario
• Apply an algorithm (e.g., model management); suppose that it does not work well
• Why not? Examine the results of diagnostic tests, performed as part of the optimization algorithm
    – Diagnostic tests have low overhead
    – Analogous to condition-number estimators
• Now what options do you have? Manual versus automatic
Critical Condition
• Multilevel:
    ∇fs(I_h^H a(1)) = I_h^H ∇fh(a(1))
• Can be automatically guaranteed through additive (as here) or multiplicative corrections
• Convergence is guaranteed regardless of the quality of the approximate models
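The additive correction can be checked in a few lines. The objectives and the downdate operator below are hypothetical stand-ins; the point is only that, by construction of the shift v, the coarse-model gradient at I_h^H a(1) reproduces the downdated fine gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
nH, nh = 7, 15
gh = lambda a: np.tanh(a) + 0.1 * a   # gradient of a hypothetical fine objective
gH = lambda a: np.sin(a) + 0.2 * a    # gradient of a hypothetical coarse objective
R = rng.random((nH, nh)) / nh         # any downdate operator I_h^H

a1 = rng.standard_normal(nh)          # current fine-level iterate a(1)
aH0 = R @ a1
v = gH(aH0) - R @ gh(a1)              # additive correction
gs = lambda a: gH(a) - v              # gradient of f_s(a) = f_H(a) - v^T a

# the critical condition holds by construction:
assert np.allclose(gs(aH0), R @ gh(a1))
```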
Sufficient to Consider Four Properties
• Nonlinearity
• Model Consistency
• Level Complementarity
• Separability across Levels
Some assessment tests assume use of a truncated-Newton method (TN) based on the conjugate-gradient method (CG)
Some tests assume coarsening: ah → aH
Diagnostic Test #1: Nonlinearity

TN Search Directions
• Let φ(α) = f(a + αp)
• Line search: approximates min_α φ(α)
• For the search directions from TN: φ′(1) = O(‖p‖³)
• Test: Is φ′(1) ≈ 0?
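A sketch of the test: estimate φ′(1) by central differences along a Newton direction. The quadratic and the sine perturbation below are illustrative assumptions; for the quadratic the Newton step gives φ′(1) = 0 exactly, while the added nonlinearity makes φ′(1) visibly nonzero.

```python
import numpy as np

def dphi_at_1(f, a, p, h=1e-5):
    """Central-difference estimate of phi'(1), phi(alpha) = f(a + alpha*p)."""
    return (f(a + (1 + h) * p) - f(a + (1 - h) * p)) / (2 * h)

rng = np.random.default_rng(1)
n = 20
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)                         # SPD model Hessian (assumed)
b = rng.standard_normal(n)

f_quad = lambda a: 0.5 * a @ H @ a - b @ a          # quadratic: test passes
f_nl = lambda a: f_quad(a) + 5 * np.sum(np.sin(a))  # strong nonlinearity added

a = rng.standard_normal(n)
p = np.linalg.solve(H, b - H @ a)                   # Newton direction for f_quad

print(abs(dphi_at_1(f_quad, a, p)))                 # essentially zero
print(abs(dphi_at_1(f_nl, a, p)))                   # nonzero: nonlinearity flagged
```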
Diagnostic Test #2: Model Consistency
Compare predicted and actual reductions in the multilevel line search
Predicted & Actual Reduction
• Predicted reduction: reduction in the coarse-level objective (via standard optimization)
• Actual reduction: reduction in the fine-level objective (via the multilevel line search)
• Difference between the (scaled) actual & predicted reductions:
    ½ eHᵀ [I_h^H ∇²fh(a0) I_H^h − ∇²fH(a0)] eH + O(‖eH‖³)
  (the bracketed term measures consistency of the problems; the O(‖eH‖³) term measures nonlinearity)
Diagnostic Test #3: Level Complementarity
Does the coarse level correspond to the near null space of the fine-level Hessian?
Algebraic Smoothness
• Optimizer: TN based on conjugate gradients
    CG reduces error corresponding to large eigenvalues on the fine level
    Complementary components correspond to small eigenvalues (“near null space”)
• Does the coarse level correspond to the near null space of the reduced Hessian?
    Extend ideas from adaptive algebraic multigrid for linear problems …
Near Null-Space
• The error in the design variables should lie in the near null-space of the reduced Hessian
• The generalized Rayleigh quotient should be small:
    RQ(eh) = (Gh eh)ᵀ(Gh eh) / (ehᵀ Gh eh)
  (Gh: reduced Hessian, not known; eh: error in the design variables, not known)
Practical Test
• We must estimate:
    Norm of the reduced Hessian (estimate via the CG method)
    Error in the design variables (use the multilevel search direction)
• Test: Is
    R(eh) = (Gh eh)ᵀ(Gh eh) / (T(Gh) · ehᵀ eh)
  small?
  (eh: multilevel search direction; T(Gh): norm estimate from the CG method; Gh eh: matrix-vector product, as in TN)
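A sketch with a 1-D Laplacian standing in for the (unknown) reduced Hessian. Power iteration substitutes here for the CG/Lanczos norm estimate, and smooth/oscillatory test vectors take the place of the multilevel search direction; all of these are assumptions of the sketch. Only Hessian-vector products are used, as in TN.

```python
import numpy as np

n = 63
G = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # stand-in reduced Hessian
Gv = lambda v: G @ v                                  # Hessian-vector product only

def norm_estimate(Gv, n, iters=20, seed=0):
    # power iteration, standing in for the CG/Lanczos estimate of ||G_h||
    v = np.random.default_rng(seed).standard_normal(n)
    for _ in range(iters):
        v = Gv(v)
        v /= np.linalg.norm(v)
    return np.linalg.norm(Gv(v))

def rayleigh_ratio(e, Gv, Gnorm):
    # R(e) = (Ge)^T (Ge) / ( ||G|| e^T e ): small iff e lies near the null space
    Ge = Gv(e)
    return (Ge @ Ge) / (Gnorm * (e @ e))

Gnorm = norm_estimate(Gv, n)
x = np.arange(1, n + 1) / (n + 1)
smooth_e = np.sin(np.pi * x)      # smoothest eigenvector: near null space
rough_e = np.sin(n * np.pi * x)   # most oscillatory eigenvector

print(rayleigh_ratio(smooth_e, Gv, Gnorm))   # tiny: test satisfied
print(rayleigh_ratio(rough_e, Gv, Gnorm))    # large: not near the null space
```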
Diagnostic Test #4: Separability across Levels
Compare corresponding fine-level and coarse-level Hessian-vector products
Separability?
• Can the fine-level and coarse-level components of the solution be computed separately?
• How much do they interact?
• Is the reduced Hessian (nearly) block diagonal in terms of fine-level and coarse-level components?
Is it Possible to Test for Separability?
• How do you test for separability of the reduced Hessian when:
    You don’t compute the Hessian
    You can’t construct/analyze the Hessian
    You only have function & gradient values and update & downdate operators
• Our test is based on Hessian-vector products: already estimated by TN
Rough Idea
• Write the reduced Hessian in block form, based on high/low frequencies:
    Gh = [ Ghh   GhH ]
         [ GhHᵀ  GHH ]
• Use “perfect” update/downdate operators:
    I_h^H ph = 0,   I_h^H (I_H^h pH) = pH
• Compare coarse/fine Hessian-vector products:
    I_h^H [Gh (I_H^h pH)] − GH pH = 0 if separable
Perturbation Analysis
• Apply MG/Opt to
    minimize Fh(a) = ½ aᵀ Gh a + bhᵀ a + higher-order terms
• Assume the user-supplied procedures are correct
• Assume the nonlinearity test is satisfied
• Then MG/Opt solves a perturbed problem:
    minimize ½ aᵀ (Gh + δGh) a + (bh + δbh)ᵀ a
• How large are the perturbations?
Perturbation Analysis (cont.)
• The perturbations δGh and δbh split into terms, each small when the corresponding property holds:
    small if the problem is separable across levels
    small if the models are consistent
    small if level complementarity holds
What if the diagnostic tests are not satisfied?
• Further analysis based on problem-specific techniques
• Nonlinearity: Is it worthwhile to use a sophisticated optimization method far from the solution?
• Model Consistency: Over-coarsening? Programming errors?
• Level Complementarity: Add or improve a preconditioner
• Separability: Use a different optimization method? Delay using multilevel until closer to the solution of the optimization problem?
Computational Tests
• Tests based on specified choices for the reduced Hessian
• Test problems chosen to isolate a particular property and measure the sensitivity of the diagnostic tests (multilevel already known to work well)
• Ideal case: the reduced Hessian is a discretized Laplacian
• Assume the nonlinearity test is satisfied: use quadratic optimization problems (the nonlinearity test has been studied in other contexts)
Level Complementarity
• Laplacian versus Laplacian with permuted eigenvalues
• Satisfies separability and problem consistency

  nH    nh     Laplacian  Permuted  Ratio
  7     15     0.03       0.34      11.7
  15    31     0.04       0.21       4.8
  31    63     0.04       0.57      13.6
  63    127    0.07       0.30       4.5
  127   255    0.13       0.25       1.9
  255   511    0.20       0.34       1.7
  511   1023   0.26       0.77       3.0
Separability
• Diagonalize the Laplacian:
    Gh = V [ Dh  0 ; 0  DH ] Vᵀ
• Test problems:
    Gh(ε) = V [ Dh  εR ; εRᵀ  DH ] Vᵀ
• R is random, with norm 1
• Satisfies problem consistency and, for small values of ε, level complementarity
Model Consistency
• Test problems derived from the discretized Laplacian:
    G̃H = Qᵀ GH Q, where Q is an orthogonal matrix constructed from I + εR
• Q is orthogonal
• R is random, with norm 1
• Satisfies level complementarity and separability
Outline
• Introduction
• Model management and multilevel methods
• Justification for optimization-based multilevel methods
• Diagnostic tests for multilevel methods
Related Research
• Vast literature on multigrid methods for PDEs
• Optimization-based multigrid methods
    Based on the full approximation scheme (Brandt, 1977) applied to optimality conditions for the optimization model
    Lewis & Nash (2005), SIAM J. Sci. Comput., v. 26, pp. 1811-1837
• Model management
    Alexandrov & Lewis (2001), Optimization and Engineering, v. 2, pp. 413-430
Related Research (cont.)
• Diagnostic tests and related ideas for optimization-based multilevel methods
    Nash & Lewis (2008), www.math.wm.edu/~buckaroo/pubs/LeNa08a.pdf
• Adaptive algebraic multigrid
    Brandt (1977), Math. Comp., v. 31, pp. 333-390
    Brannick & Zikatanov (2006), Tech. Report, Penn. State University
    Brezina et al. (2006), SIAM J. Sci. Comput., v. 27, pp. 1261-1286
• Stopping rules for inexact Newton methods
    Eisenstat & Walker (1996), SIAM J. Sci. Comput., v. 17, pp. 16-32