Materials Process Design and Control Laboratory
Cornell University
A NONLINEAR DIMENSION REDUCTION STRATEGY FOR GENERATING DATA
DRIVEN STOCHASTIC INPUT MODELS
Baskar Ganapathysubramanian and Nicholas Zabaras
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering, Cornell University
Ithaca, NY 14853-3801
[email protected], [email protected]
http://mpdc.mae.cornell.edu
TRANSPORT IN HETEROGENEOUS MEDIA
- Thermal and fluid transport in heterogeneous media is ubiquitous
- Applications range from large-scale systems (geothermal systems) to the small scale
- Most critical devices/applications utilize heterogeneous, polycrystalline or functionally graded materials
- Properties depend on the distribution of material/microstructure
- But we usually possess only limited information about the microstructure/property distribution
Incorporate limited information into stochastic analysis:
- worst-case scenarios
- variations in physical properties
Hydrodynamic transport through heterogeneous permeable media
Thermal transport through polycrystalline and functionally graded materials
PROBLEM OF INTEREST
Interested in modeling diffusion through heterogeneous random media
Aim: To develop a procedure to predict statistics of properties of heterogeneous materials undergoing diffusion-based transport
What is given
- Realistically, one usually has access to only a few experimental 2D images of the microstructure. Statistics of the heterogeneous microstructure can then be extracted from these images.
- This is our starting point
Account for the uncertainties in the topology of the heterogeneous media
FRAMEWORK FOR ANALYSIS OF HETEROGENEOUS MEDIA
1. Property extraction: Extract properties P1, P2, ..., Pn that the structure satisfies. These properties are usually statistical: volume fraction, two-point correlation, autocorrelation.
2. Microstructure/property reconstruction: Reconstruct realizations of the structure satisfying the correlations.
3. Reduced model: Construct a reduced stochastic model of property variations from the data. This model must be able to approximate the class of structures.
4. Stochastic analysis: Solve the heterogeneous property problem in the reduced stochastic space to compute property variations.
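The property-extraction step can be illustrated with a minimal sketch. It assumes a binary (two-phase) image with periodic boundaries and uses a synthetic random image in place of experimental data. The function names `volume_fraction` and `two_point_correlation` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def volume_fraction(img):
    """Fraction of pixels that belong to phase 1 of a binary image."""
    return float(np.mean(img))

def two_point_correlation(img):
    """Periodic two-point probability S2(r): the chance that two pixels
    separated by the shift r = (dy, dx) both lie in phase 1.
    Computed as an autocorrelation through the FFT."""
    f = np.fft.fft2(img.astype(float))
    return np.fft.ifft2(f * np.conj(f)).real / img.size

# Synthetic stand-in for an experimental image (assumption: uncorrelated pixels).
rng = np.random.default_rng(0)
img = (rng.random((64, 64)) < 0.3).astype(int)

vf = volume_fraction(img)
s2 = two_point_correlation(img)
print("volume fraction:", vf)
print("S2 at zero shift:", s2[0, 0])       # equals the volume fraction
print("S2 at shift (10, 10):", s2[10, 10])  # close to vf**2 for uncorrelated pixels
```

At zero shift the two-point probability reduces to the volume fraction, which is a quick consistency check on any implementation.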
DEVELOPING INPUT STOCHASTIC MODELS
Data-driven techniques for encoding the variability in properties into a viable, finite-dimensional stochastic model.
Related advances use Bayesian modeling and random domain decomposition.
The aim is to create a seamless technique that utilizes the tools of the mature field of property/microstructure reconstruction.
First investigations into constructing data-driven reduced-order representations of topological/material/property distributions utilized a Principal Component Analysis (PCA/POD/KLE) based approach.
Generate 3D samples from the microstructure space and apply PCA to them:
$I = a_1 I_1 + a_2 I_2 + \dots + a_n I_n$
(a sample $I$ written as a weighted sum of PCA basis images $I_1, \dots, I_n$ with coefficients $a_1, \dots, a_n$)
Convert variability of property/microstructure to variability of coefficients.
Not all combinations of coefficients are allowed. A subspace-reducing methodology [1] was developed to find the space of allowable coefficients that reconstruct plausible microstructures.
1. B. Ganapathysubramanian, N. Zabaras, Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method, J Comp Physics, in press
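The linear (PCA) representation can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the samples are built as exact combinations of three random basis images, so three modes reproduce them. All names (`samples`, `modes`, `a`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, k = 200, 400, 3                  # samples, pixels per sample, retained modes

# Synthetic data set (assumption): each sample is a combination of k basis images.
basis = rng.standard_normal((k, n))
samples = rng.standard_normal((N, k)) @ basis

# PCA through the SVD of the centred sample matrix.
mean = samples.mean(axis=0)
U, s, Vt = np.linalg.svd(samples - mean, full_matrices=False)
modes = Vt[:k]                         # PCA basis images (eigen-images)
a = (samples - mean) @ modes.T         # coefficients a_1..a_k of every sample

# Variability of the samples is now carried by the coefficient vectors.
recon = mean + a @ modes
rel_err = np.linalg.norm(recon - samples) / np.linalg.norm(samples)
print("coefficient array shape:", a.shape)
print("relative reconstruction error:", rel_err)
```

Sampling arbitrary coefficient vectors `a` would not in general give plausible microstructures, which is why the allowable coefficient region has to be identified separately.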
INPUT STOCHASTIC MODELS: LINEAR APPROACH
Further related issues:
- How to generalize it to other properties/structures? Can PCA be applied to other classes of microstructures, say, polycrystals?
- How does convergence change as the amount of information increases? What happens to the computational cost?
- PCA-based approaches find the smallest coordinate representation of the data, but assume that the data lies in a linear vector space.
What is the result when the data lies in a nonlinear space?
As the number of input samples increases, PCA-based approaches tend to overestimate the dimensionality of the reduced representation, which becomes computationally challenging.
[Figure: number of eigenvectors versus number of samples]
Only guaranteed to discover the true structure of data lying on a linear subspace of the high dimensional input space
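A small numerical illustration of this limitation, as a sketch on assumed toy data: points on a helix are described by a single parameter, yet PCA needs all three coordinates to capture 95% of the variance. The helix and the 95% threshold are illustrative choices, not values from the slides.

```python
import numpy as np

# A helix: a one-dimensional manifold embedded nonlinearly in 3D.
t = np.linspace(0.0, 4.0 * np.pi, 500)             # the single intrinsic parameter
X = np.column_stack([np.cos(t), np.sin(t), t / (2.0 * np.pi)])

# PCA: eigenvalues of the sample covariance matrix, largest first.
Xc = X - X.mean(axis=0)
eigvals = np.linalg.eigvalsh(Xc.T @ Xc / len(X))[::-1]
energy = np.cumsum(eigvals) / eigvals.sum()

# Number of linear modes needed to capture 95% of the variance.
n_modes = int(np.searchsorted(energy, 0.95) + 1)
print("cumulative variance fractions:", energy)
print("PCA modes needed:", n_modes, "(intrinsic dimension is 1)")
```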
NONLINEAR APPROACHES TO MODEL REDUCTION:
IDEAS FROM IMAGE PROCESSING, PSYCHOLOGY
NONLINEAR REDUCTION: THE KEY IDEA
Set of images. Each image = 64x64 = 4096 pixels.
Each image is a point in 4096-dimensional space.
But all the images are related: they are pictures of the same object in different poses.
That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^{4096}$.
Can we get a parametric representation of this curve?
Problem: Can the parameters that define this manifold be extracted, given ONLY these images (points in $\mathbb{R}^{4096}$)?
Solution: Each image can be uniquely represented as a point in 2D space (UD, LR).
Strategy: based on the ‘manifold learning’ problem
[Figure: different images of the same object, with changes in up-down (UD) and left-right (LR) poses]
NONLINEAR REDUCTION: EXTENSION TO INPUT MODELS
[Figure: different microstructure realizations satisfying some experimental correlations]
Given: some experimental correlations that the microstructure/property variation satisfies.
Construct several plausible ‘images’ of the microstructure/property.
Each of these ‘images’ consists of, say, n pixels.
Each image is a point in n-dimensional space.
But all the ‘images’ are related.
That is, all these images lie on a unique curve (manifold) in $\mathbb{R}^n$.
Can a low-dimensional parameterization of this curve be computed?
Strategy: based on a variant of the ‘manifold learning’ problem.
A FORMAL DEFINITION OF THE PROBLEM
State the problem as a parameterization problem (also called the manifold learning problem):
Given a set of N unordered points belonging to a manifold $\mathcal{M}$ embedded in a high-dimensional space $\mathbb{R}^n$, find a low-dimensional region $\mathcal{A} \subset \mathbb{R}^d$ that parameterizes $\mathcal{M}$, where d << n.
Classical methods in manifold learning include Principal Component Analysis (PCA) and multidimensional scaling (MDS).
These methods have been shown to extract optimal mappings when the manifold is embedded linearly or almost linearly in the input space.
In most cases of interest, the manifold is nonlinearly embedded in the input space, making the classical methods of dimension reduction highly approximate.
Two approaches have been developed that can extract nonlinear structures while maintaining the computational advantage offered by PCA [1,2].
1. J. B. Tenenbaum, V. De Silva, J. C. Langford, A global geometric framework for nonlinear dimension reduction Science 290 (2000), 2319-2323.
2. S Roweis, L. Saul., Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science 290 (2000) 2323--2326.
AN INTUITIVE PICTURE OF THE STRATEGY
- Attempt to reduce dimensionality while preserving the geometry at all scales.
- Ensure that nearby points on the manifold map to nearby points in the low-dimensional space and faraway points map to faraway points in the low dimensional space.
[Figure: 3D data; PCA (linear approach); non-linear approach: unraveling the curve]
KEY CONCEPT
[Figure: Euclidean distance versus geodesic distance between points A and B on the manifold]
1) Geometry can be preserved if the distances between the points are preserved – Isometric mapping.
2) The geometry of the manifold is reflected in the geodesic distance between points
3) First step towards reduced representation is to construct the geodesic distances between all the sample points
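The difference between the two distances can be made concrete with a toy case (an assumed example, not from the slides): two end points of a unit semicircle are a Euclidean distance 2 apart, while the distance measured along the curve is π. The sketch approximates the geodesic by summing short hops along the sampled curve.

```python
import numpy as np

# Sample a unit semicircle; A and B are its two end points.
theta = np.linspace(0.0, np.pi, 1000)
curve = np.column_stack([np.cos(theta), np.sin(theta)])
A, B = curve[0], curve[-1]

# Straight-line (Euclidean) distance ignores the shape of the manifold.
euclidean = np.linalg.norm(B - A)

# Geodesic distance: sum of the short hops between consecutive samples.
hops = np.linalg.norm(np.diff(curve, axis=0), axis=1)
geodesic = hops.sum()

print("Euclidean distance:", euclidean)
print("Geodesic distance :", geodesic)
```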
THE NONLINEAR MODEL REDUCTION ALGORITHM
[Flowchart: Given N unordered samples → Compute pairwise geodesic distance → Perform MDS on this distance matrix → N points in a low-dimensional space]
1) Given the N unordered sample points (microstructures, property maps, ...).
2) Compute the geodesic distance between each pair of samples (i, j).
3) Given the pairwise distance matrix between the N objects, compute the locations of N points $\{\xi_i\}$ in $\mathbb{R}^d$ such that the distances between these points are arbitrarily close to the given distance matrix. This is the basic premise of a group of statistical methods called multidimensional scaling (MDS) [1].
1. T.F.Cox, M.A.A.Cox, Multidimensional scaling, 1994, Chapman and Hall
MATHEMATICAL DETAILS
1. Properties of the manifold $\mathcal{M} \subset \mathbb{R}^n$
How to compute geodesic distance? Sum over short hops.
This needs a notion of distance between samples, and there is flexibility in defining the distance measure.
The distance measure defines the properties of the manifold that the samples lie on.
The distance measure, $\mathcal{D}$, is based on how much the microstructures vary. It is defined as the difference in statistical correlation between two microstructures.
Observations:
1. $(\mathcal{M}, \mathcal{D})$ is a metric space.
2. $(\mathcal{M}, \mathcal{D})$ is a compact metric space [1].
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
MATHEMATICAL DETAILS
2. Mapping a compact manifold to a low-dimensional set
We have no notion of the geometry of the manifold to start with, and hence cannot construct true geodesic distances.
Approximate the geodesic distance using the concept of graph distance $\mathcal{D}_G(i,j)$: the distance between faraway points is computed as a sequence of small hops.
This approximation, $\mathcal{D}_G$, asymptotically matches the actual geodesic distance $\mathcal{D}_{\mathcal{M}}$ in the limit of a large number of samples [1,2] (Theorem 4.5 in [1]).
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. M.Bernstein, V. deSilva, J.C.Langford, J.B.Tenenbaum, Graph approximations to geodesics on embedded manifolds, Dec 2000
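The graph-distance approximation can be sketched as below. This is a minimal version under stated assumptions: neighbours are chosen with a k-nearest-neighbour rule on Euclidean distances (the slides use a correlation-based distance between microstructures instead), and shortest paths are found with the Floyd-Warshall recursion. The function name `graph_distances` and the semicircle test data are hypothetical.

```python
import numpy as np

def graph_distances(X, k):
    """Approximate geodesic distances by shortest paths on a
    k-nearest-neighbour graph built from Euclidean distances."""
    N = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    G = np.full((N, N), np.inf)
    nearest = np.argsort(D, axis=1)[:, 1:k + 1]      # column 0 is the point itself
    rows = np.repeat(np.arange(N), k)
    G[rows, nearest.ravel()] = D[rows, nearest.ravel()]
    G = np.minimum(G, G.T)                             # make the graph undirected
    np.fill_diagonal(G, 0.0)
    for m in range(N):                                 # Floyd-Warshall: chain short hops
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return G

# Unordered samples on a unit semicircle (true end-to-end geodesic length: pi).
rng = np.random.default_rng(0)
theta = rng.permutation(np.linspace(0.0, np.pi, 100))
X = np.column_stack([np.cos(theta), np.sin(theta)])

G = graph_distances(X, k=2)
i_a, i_b = int(np.argmin(theta)), int(np.argmax(theta))
print("graph distance between end points:", G[i_a, i_b])
print("Euclidean distance between end points:", np.linalg.norm(X[i_a] - X[i_b]))
```

As the number of samples grows, the graph distance between the end points approaches π, in line with the asymptotic statement above.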
MATHEMATICAL DETAILS
3. MDS and choosing the dimensionality of the reduced space
Perform MDS on the geodesic distance matrix, i.e., perform an eigenvalue decomposition of the (double-centered) squared geodesic distance matrix. The eigenvectors associated with the largest d eigenvalues, scaled by the square roots of those eigenvalues, give the coordinates of the N points.
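The MDS step can be sketched with the classical (Torgerson) formulation, which is an assumption about the variant used: the squared distance matrix is double-centred and the leading eigenpairs give the coordinates. The function name `classical_mds` is hypothetical, and the check uses plain Euclidean distances so that the embedding can be verified exactly.

```python
import numpy as np

def classical_mds(D, d):
    """Classical MDS: coordinates in R^d whose pairwise distances
    approximate the entries of the distance matrix D."""
    N = len(D)
    J = np.eye(N) - np.ones((N, N)) / N          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gram matrix from squared distances
    w, V = np.linalg.eigh(B)                     # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]                # indices of the d largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Check on points that truly live in 2D: distances are reproduced exactly.
rng = np.random.default_rng(0)
P = rng.random((30, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)

xi = classical_mds(D, d=2)
D_xi = np.linalg.norm(xi[:, None, :] - xi[None, :, :], axis=-1)
print("max distance mismatch:", np.abs(D_xi - D).max())
```

Feeding the graph (approximate geodesic) distance matrix into the same routine instead of `D` gives the Isomap-style embedding.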
The manifold has an intrinsic dimensionality. How to choose the correct value of d? (This is related to issues of accuracy and computational effort.)
Estimate the dimensionality of the manifold using a novel geometrical probability approach (developed by A. Hero et al.).
It is based on ideas from graph theory: the rate of convergence of the length functional, L, of the minimal spanning tree of the geodesic distance matrix is related to the dimensionality, d [1,2].
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. J.A.Costa, A.O.Hero, Geodesic Entropic Graphs for Dimension and Entropy Estimation in Manifold Learning, IEEE Trans. on Signal Processing, 52 (2004) 2210--2221.
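A rough sketch of dimension estimation from the minimal spanning tree follows. It assumes the scaling $L \sim N^{(d-1)/d}$ for the tree length, so that the slope $a$ of $\log L$ against $\log N$ gives $d = 1/(1-a)$. This specific form is an assumption here, not a formula recovered from the slides. The test data is a flat 2D sheet placed in 3D, for which Euclidean and geodesic distances coincide. The function name `mst_length` is hypothetical.

```python
import numpy as np

def mst_length(D):
    """Total edge length of the minimal spanning tree (Prim's algorithm)."""
    N = len(D)
    in_tree = np.zeros(N, dtype=bool)
    in_tree[0] = True
    best = D[0].copy()                       # cheapest link from the tree to each vertex
    total = 0.0
    for _ in range(N - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        total += candidates[j]
        in_tree[j] = True
        best = np.minimum(best, D[j])
    return total

# Samples on a flat 2D sheet embedded in 3D (assumed toy data).
rng = np.random.default_rng(0)
uv = rng.random((800, 2))
X = np.column_stack([uv, np.zeros(len(uv))])

sizes = [100, 200, 400, 800]
lengths = []
for n in sizes:
    P = X[:n]
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    lengths.append(mst_length(D))

# Slope of log L versus log N, then d = 1 / (1 - a) under the assumed scaling.
a, _ = np.polyfit(np.log(sizes), np.log(lengths), 1)
d_est = 1.0 / (1.0 - a)
print("estimated intrinsic dimension:", d_est)
```

The estimate is noisy for small sample sizes, so in practice it would be rounded to the nearest integer and checked against the reconstruction error.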
THE REDUCED ORDER STOCHASTIC MODEL
[Flowchart: Given N unordered samples in $\mathcal{M} \subset \mathbb{R}^n$ → N points in a low-dimensional space $\mathcal{A} \subset \mathbb{R}^d$]
The procedure results in N points in a low-dimensional space. The geodesic distance + MDS step (the Isomap algorithm [1]) results in a low-dimensional, convex, connected space [2], $\mathcal{A} \subset \mathbb{R}^d$.
Using the N samples, the reduced space $\mathcal{A}$ is constructed from these N low-dimensional points.
$\mathcal{A}$ serves as the surrogate space for $\mathcal{M}$.
Access variability in $\mathcal{M}$ by sampling over $\mathcal{A}$.
BUT we have only come up with the $\mathcal{M} \to \mathcal{A}$ map. We need the $\mathcal{A} \to \mathcal{M}$ map too.
1. J. B. Tenenbaum, V. De Silva, J. C. Langford, A global geometric framework for nonlinear dimension reduction, Science 290 (2000), 2319-2323.
2. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
THE REDUCED ORDER STOCHASTIC MODEL
Only N pairs are available to construct the $\mathcal{A} \to \mathcal{M}$ map. There are various possibilities, depending on the specific problem at hand, but one has to be conscious of computational effort and efficiency.
Three such possibilities are illustrated here. Error bounds can be computed [1].
1. Nearest neighbor map
2. Local linear interpolation
3. Local linear interpolation with projection
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
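The first two maps can be sketched as below. The interpolation weights are assumed to be inverse-distance weights over the k nearest low-dimensional samples; the slides do not state the weighting, so this is one plausible choice. The third option (with projection) is not sketched. Function names and the random test data are hypothetical.

```python
import numpy as np

def nearest_neighbor_map(xi, xi_samples, x_samples):
    """Return the high-dimensional sample whose low-dimensional image is closest to xi."""
    i = int(np.argmin(np.linalg.norm(xi_samples - xi, axis=1)))
    return x_samples[i]

def local_linear_map(xi, xi_samples, x_samples, k=3):
    """Blend the k nearest high-dimensional samples with inverse-distance weights."""
    dist = np.linalg.norm(xi_samples - xi, axis=1)
    nbrs = np.argsort(dist)[:k]
    if dist[nbrs[0]] == 0.0:                 # xi coincides with a sample point
        return x_samples[nbrs[0]]
    w = 1.0 / dist[nbrs]
    w /= w.sum()                             # weights sum to one
    return w @ x_samples[nbrs]

# N = 20 known pairs (xi_i, x_i): 2D coordinates and 50-pixel "images" (toy data).
rng = np.random.default_rng(0)
xi_samples = rng.random((20, 2))
x_samples = rng.random((20, 50))

xi_new = np.array([0.5, 0.5])
x_nn = nearest_neighbor_map(xi_new, xi_samples, x_samples)
x_ll = local_linear_map(xi_new, xi_samples, x_samples, k=3)
x_at_sample = local_linear_map(xi_samples[5], xi_samples, x_samples, k=3)
print(x_nn.shape, x_ll.shape)
```

The nearest neighbor map is cheapest but piecewise constant; the interpolated map varies continuously between samples at the cost of a neighbor search and a weighted sum per query.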
THE LOW DIMENSIONAL STOCHASTIC MODEL
[Flowchart: Given N unordered samples → Compute pairwise geodesic distance → Perform MDS on this distance matrix → N points in a low-dimensional space]
The algorithm consists of two parts.
1) Compute the low-dimensional representation of a set of N unordered sample points belonging to a high-dimensional space ($\mathcal{M} \to \mathcal{A}$).
To use this model in a stochastic collocation framework, one must sample points in $\mathcal{A}$.
2) For an arbitrary point $\xi \in \mathcal{A}$, one must find the corresponding point $x \in \mathcal{M}$: compute the mapping $\mathcal{A} \to \mathcal{M}$.
NUMERICAL EXAMPLE
Given an experimental image of a two-phase metal-metal composite (silver-tungsten composite) [1].
Find the variability in temperature arising from uncertainty in the knowledge of the exact 3D material distribution of the specified microstructure.
Problem strategy:
- Extract pertinent statistical information from the experimental image
- Reconstruct a dataset of plausible 3D microstructures
- Construct a low-dimensional parametrization of this space of microstructures
- Solve the SPDE for temperature evolution using this input model in a stochastic collocation framework
[Figure labels: T = -0.5, T = 0.5]
1. S. Umekawa, R. Kotfila, O. D. Sherby, Elastic properties of a tungsten-silver composite above and below the melting point of silver J. Mech. Phys. Solids 13 (1965) 229-230
TWO PHASE MATERIAL
[Figure panels: experimental image; experimental statistics; GRF statistics; realizations of 3D microstructure]
NON LINEAR DIMENSION REDUCTION
The developments detailed before are applied to find a low dimensional representation of these 1000 microstructure samples.
The optimal representation of these points was a 9-dimensional region.
It can be shown theoretically that these points in 9D space form a convex region in $\mathbb{R}^9$.
This convex region now represents the low dimensional stochastic input space
Use sparse grid collocation strategies to sample this space.
COMPUTATIONAL DETAILS
Computational domain of each deterministic problem: 65x65x65 pixels
Construction of the stochastic solution: sparse grid collocation with a level 5 interpolation scheme
Number of deterministic problems solved: 26017
Computational platform: 50 nodes on a local Linux cluster (2 x 3.2 GHz)
Total time: 210 minutes
Total number of dof's: $65^3 \times 26017 \approx 7 \times 10^9$
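The problem counts above can be cross-checked with a short calculation. The sketch assumes a Smolyak sparse grid built on nested Clenshaw-Curtis points (1, 3, 5, 9, ... points per 1D level), which the slides do not state explicitly. Under that assumption a level 5 grid in 9 dimensions has exactly 26017 points. The function name `sparse_grid_size` is hypothetical.

```python
def sparse_grid_size(dim, level):
    """Number of points in a Smolyak sparse grid with nested
    Clenshaw-Curtis nodes (assumed rule), counted by total level."""
    # New 1D points introduced at each level: 1, 2, 2, 4, 8, ...
    new_1d = [1 if j == 0 else 2 if j == 1 else 2 ** (j - 1)
              for j in range(level + 1)]
    counts = [1] + [0] * level               # zero dimensions: a single point
    for _ in range(dim):                     # add one dimension at a time
        counts = [sum(counts[t - j] * new_1d[j] for j in range(t + 1))
                  for t in range(level + 1)]
    return sum(counts)

n_pts = sparse_grid_size(dim=9, level=5)     # deterministic problems to solve
dofs = 65 ** 3 * n_pts                       # one unknown per pixel per problem
print("collocation points:", n_pts)
print("total degrees of freedom:", dofs)
```

Each collocation point is an independent deterministic solve, which is why the work distributes naturally over the cluster nodes.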
MEAN TEMPERATURE PROFILE
[Figure: (a) temperature contour; (b-d) temperature isocontours; (e-g) temperature slices]
HIGHER ORDER TEMPERATURE STATISTICS
[Figure: (a) temperature contour; (b) temperature isocontours; (c) PDF of temperature; (d-f) temperature slices]
CONCLUSIONS
1. Developed an efficient data-driven nonlinear model reduction technique for converting experimental statistics into viable stochastic input models.
2. The technique meshes seamlessly with any reconstruction method.
3. Showcased the framework by constructing a reduced model of the topology of a two-phase material given limited statistical data.
4. This methodology has significant applications to problems where working in high-dimensional spaces is computationally intractable: visualizing property evolution, process-property maps, searching and contouring.
RELATED PUBLICATIONS:
1. B. Ganapathysubramanian and N. Zabaras, "A non-linear dimension reduction methodology for generating data-driven stochastic input models", submitted to Journal of Computational Physics.
2. B. Ganapathysubramanian and N. Zabaras, "Modelling diffusion in random heterogeneous media: Data-driven models, stochastic collocation and the variational multi-scale method", Journal of Computational Physics, in press.