Slycat Ensemble Analysis
Patricia J. Crossno, Timothy M. Shead, Milosz A. Sielicki, Warren L. Hunt, Shawn Martin, and Ming-Yu Hsieh
Sandia National Laboratories
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
SAND 2014-1399P
Patricia J. Crossno: [email protected]
Timothy M. Shead: [email protected]
Milosz A. Sielicki: [email protected]
Warren L. Hunt: [email protected]
Shawn Martin: [email protected]
Ming-Yu Hsieh: [email protected]
Analysis Tasks:
• Find strongest input/output correlations
• Find inputs with least impact on outputs
• Find anomalous simulation runs
CCA Visual Representations
Scatterplot: Each Simulation Relative to Ensemble
Distance off diagonal shows difference from ensemble as a whole, plus potential anomalies.
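The off-diagonal reading can be sketched in a few lines. This is a minimal illustration, not Slycat's code: it assumes each simulation is a point (input meta-feature, output meta-feature) on comparably scaled axes, and the function names and sample scores are hypothetical.

```python
import math

def off_diagonal_distances(points):
    """Perpendicular distance of each (x, y) point from the line y = x."""
    return [abs(x - y) / math.sqrt(2) for x, y in points]

def most_anomalous(points, top=3):
    """Indices of the `top` points farthest from the diagonal."""
    dist = off_diagonal_distances(points)
    order = sorted(range(len(points)), key=lambda i: dist[i], reverse=True)
    return order[:top]

# Made-up scores for five simulations (illustration only).
scores = [(0.1, 0.12), (-0.5, -0.45), (0.9, 0.1), (0.3, 0.33), (-0.2, 0.5)]
print(most_anomalous(scores, top=2))
```

Ranking by distance only surfaces candidates; whether a run is truly anomalous still needs the kind of inspection the scatterplot supports.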
Purple = Outputs
Bar chart: Ensemble-wide Relationships
Viewing 1st CCA component in both views
Input x1 has the least impact on outputs y1 and y2
250 simulations, each color-coded by its y1 output value
Selected simulation
Positive many-to-many correlation (bar color the same) between x25 & x14 and y2 & y1
Green = Inputs
Inputs x25 & x14 have the most impact on both outputs y1 and y2
Viewing 2nd CCA component in both bar chart & scatterplot
250 simulations, each color-coded by its x23 input value
Inputs and outputs sorted by correlation strength within CCA2 component
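A small sketch of this ordering, assuming each variable carries one signed correlation value per CCA component. The variable names and values below are invented for illustration, and the helper names are hypothetical rather than part of Slycat.

```python
def sort_by_strength(correlations, descending=True):
    """Order (variable, correlation) pairs by absolute correlation."""
    return sorted(correlations.items(),
                  key=lambda kv: abs(kv[1]),
                  reverse=descending)

def relationship(a, b):
    """Same sign reads as a positive relationship, opposite signs as inverse."""
    return "positive" if a * b > 0 else "inverse"

# Invented values for one component (not taken from the poster).
component = {"in_a": 0.12, "in_b": -0.85, "in_c": 0.40, "out_a": 0.91}

for name, value in sort_by_strength(component):
    print(f"{name:6s} {value:+.2f}")
print(relationship(component["in_b"], component["out_a"]))
```

Passing `descending=False` reverses the order, which mirrors toggling a sort between decreasing and increasing.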
x23 selected for scatterplot color-coding (dark green row highlight)
Three distinct groups of input values
Inverse correlation (red vs. blue) between x23 & y4; CCA3 captures relationship between x8 & y3
Scatterplot color-coding changed by clicking on y4 row (darker purple highlight)
Three output value groups map to the 3 input groups
250 simulations, each color-coded by its y4 output value
Click CCA column header to select CCA component in views
Viewing 3rd CCA component in both bar chart & scatterplot
Inverse correlation between x8 & y3; CCA2 captures relationship between x23 & y4
250 simulations, each color-coded by its x8 input value
x8 inputs range from low (blue) to high (red)
x8 selected for scatterplot color-coding (dark green row highlight)
Click header triangle to sort variables (toggles from decreasing to increasing)
250 simulations, each color-coded by its y3 output value
Corresponding y3 outputs inversely range from high (red) to low (blue)
Scatterplot color-coding changed by clicking on y3 row (darker purple highlight)
Approach: Canonical Correlation Analysis (CCA)
[Diagram: a table of simulations (s1 … sn) by features (inputs i1 … ik, outputs o1 … om) is passed to CCA. CCA yields structure correlations, a table of features (inputs and outputs) by CCA components (c1 … ck), and, for each component (e.g., CCA1), a plot of the simulations by input meta-feature versus output meta-feature.]
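As a rough illustration of the approach, the sketch below computes canonical correlations, per-simulation meta-feature scores, and structure correlations with NumPy. It is not Slycat's implementation: the QR/SVD formulation, the function names `cca` and `structure_correlations`, and the synthetic 250-run ensemble are all assumptions made for the example.

```python
import numpy as np

def cca(X, Y):
    """Canonical correlation analysis via QR and SVD (illustrative sketch).

    X: (n simulations, k inputs); Y: (n simulations, m outputs).
    Assumes n exceeds k and m and that columns are linearly independent.
    """
    Xc = X - X.mean(axis=0)           # center each input column
    Yc = Y - Y.mean(axis=0)           # center each output column
    Qx, _ = np.linalg.qr(Xc)          # orthonormal basis for the inputs
    Qy, _ = np.linalg.qr(Yc)          # orthonormal basis for the outputs
    U, r, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    x_scores = Qx @ U                 # input meta-feature per simulation
    y_scores = Qy @ Vt.T              # output meta-feature per simulation
    return r, x_scores, y_scores      # r: one correlation per component

def structure_correlations(V, scores):
    """Pearson correlation of each original variable with each component."""
    Vc = V - V.mean(axis=0)
    Vc = Vc / np.linalg.norm(Vc, axis=0)
    Sc = scores - scores.mean(axis=0)
    Sc = Sc / np.linalg.norm(Sc, axis=0)
    return Vc.T @ Sc                  # shape: (variables, components)

# Synthetic ensemble: the first output depends on the second input only.
rng = np.random.default_rng(0)
X = rng.normal(size=(250, 3))
noise = rng.normal(size=(250, 2))
Y = np.column_stack([X[:, 1] + 0.1 * noise[:, 0], noise[:, 1]])

r, x_scores, y_scores = cca(X, Y)
input_corr = structure_correlations(X, x_scores)
print("canonical correlations:", r)
print("input structure correlations, component 1:", np.abs(input_corr[:, 0]))
```

In this toy setup the second input should dominate the first component, which is the kind of pattern the bar chart is meant to expose; plotting `x_scores[:, 0]` against `y_scores[:, 0]` gives the matching per-simulation view.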
Slycat Sensitivity Analysis
Input parameters
Simulation Ensemble
Simple Regression (1-to-1)
Multiple Regression (Many-to-1)
Model Confidence
How About Many-to-Many Correlations?
Problem: Electrical Circuit Simulation Sensitivity Analysis
Rerun CCA analysis between all inputs and y4 to find strongest correlations (all-to-1)
All-to-y4 analysis
4 anomalous runs share common x248 values
2641 simulations, each color-coded by its x248 input value (strongest)
All-to-y4 analysis
4 anomalous runs share common x255 values
2641 simulations, each color-coded by its x255 input value (2nd strongest)
2641 simulations, each color-coded by its y4 output value
4 anomalous runs in y4 values
All to all analysis
Finding Anomalous Simulations
Finding Most Significant Inputs
Objectives:
• Map Output Variability Back to Inputs
• Reduce Number of Input Parameters
• Reduce Number of Simulations to Run
• Identify Anomalous Runs
• Increase Model Confidence
266 scrollable inputs
Note: R² is increasing & P is decreasing with each CCA component
Available Open Source: https://github.com/sandialabs/slycat
Reduce Inputs & Simulations
In the 2641-run ensemble above, analysis allowed input parameters to be reduced from 266 to 21, decreasing simulation time ten-fold.