Materials Informatics for the ICME CyberInfrastructure
Kim F. Ferris, Pacific Northwest National Laboratory, Richland, WA
Dumont M. Jones, Proximate Technologies, Columbus, OH
May 12, 2011
Project ID: LM038
This presentation does not contain any proprietary, confidential, or otherwise restricted information
Project Overview
Budget
• Total project funding: DOE, $350K
• FY09 funding: $60K
• FY10 funding: $170K
• FY11 funding: ~$120K
Project Timeline
• Start: mid-Aug 2009
• Finish: October 2011
• ~2/3 complete (Feb 2011)
Technology Gaps/Barriers
• Greater materials diversity / joining dissimilar materials
• Information gaps in ICME cyberinfrastructure: data sufficiency; incomplete structure among partners; completeness and availability of repositories
• Mg processing model development: first principles to processing (multiscale); Mg-sheet forming
Partners
• Ford Motor Company (John Allison)
• Mississippi State University (Mark Horstemeyer, Tomasz Haupt)
• Canada Center for Mineral & Energy Technology (CANMET) (Kevin Boyle)
• University of Virginia (Sean Agnew)
• Metals Bank, Korea (YM Rhyim)
• ESPEI Workgroup (Liu, Wolverton, Haupt)
• Joining Materials Knowledge (Beckham)
Project Overview: Project’s place in the cyberinfrastructure
Current Programs
Repository
• Federated/consolidated
• Data exchange support
• ISO 10303 STEP support
This Informatics Effort
Analysis DB
• Fills similar role to data warehouse
• Informatics storage/retrieval focused
• Analytic platform, not a re-formed repository
[Diagram: analysis database schema with tables record_<prototype>, <var>, var, source, rule, and system]
Informatics repository augmentation
• Property correlations
• ID feature space for complex data such as processing sequences
• When can data from distinct systems be safely fused
• Coverage and lineage anomalies
• Insufficient information for prediction (predictor diversity)
• Property definition inconsistencies
• Material definition inconsistencies
Overview: Basic Tasks & Motivation
[Diagram: modeling chain across Atomic, Mesoscale, and Macroscopic scales, with boxes labeled Physical Inputs and Algorithm Inputs]
1) As parameters are passed from scale to scale, do models adequately represent each scale and what information is lost?
2) How can the ‘model’ as a whole be verified; reliable data for verification
3) Data from different sources needed for greater variety of materials and joining
4) Does data support design?
Project Objectives
Challenges:
• Greater diversity of materials in design problems
• Reliable information needed for new materials and model development, and verification
• Mg-alloy performance properties can conflict. Use data-driven design/informatics to address this challenge?
Objectives:
• Develop materials informatics techniques for robust model development
• Identify gaps in the fundamental information contained in the ICME knowledge infrastructure
• Augment knowledge: address gaps/barriers to developing Mg alloys
• Collaborate with current repositories on knowledge content
Milestones / Deliverables
Month/Year: Milestone / Deliverables
Aug 2009: Program start.
Feb 2010: Milestone: Assess data support needs within context of workgroup research. Deliverable: Preliminary toolkit and data-storage design, assessment of data support.
Sep 2010: Milestone: Toolkit for informatics data assessment; preliminary Mg-alloy data assessment.* Deliverables: Create and deliver toolkit for lightweight alloys. Validated tools against prototype alloy design data.
Jun 2011: Milestone: Extend work to multiple repositories. Deliverable: Apply toolkit to address global data sufficiency.
Technical Approach
Task 1: Informatics Toolkit Design
• Contextual assessment of data support for workgroup models development
• Identify algorithms and storage approach for materials knowledge retention and generation
Task 2: Informatics Toolkit Development
• Informatics toolkit for data sufficiency/completeness, support for model development, and knowledge augmentation
Task 3: Analytic support for cyberinfrastructure
• Assessment of data support for model development
• Contextual assessment of data support compatible with heterogeneous sources
• Collaborative contact with data sources (MSSU, CanMet, MetalsBank, DataSoc)
Accomplishments & Impacts
Guidance to First Principles and Cyberinfrastructure Workgroups for data completeness
• Assessed data support for first principles development and materials repository
Data sufficiency: informatics tools to assess modeling power of finite data sets
• Data analytics for first principles and sheet forming groups for robust models
• Tools to assess repository support for information-driven design
• Knowledge content of multiple information sources: Repository/repository, PI/repository, PI/PI
Guidance on integrating ICME applications with repositories and cyberinfrastructure
• Design of prototype data schema
• Knowledge retention model for workgroups
Accomplishment #1: Analysis Database Schema
[Diagram: analysis database schema]
record_<prototype> (one for each variable prototype)
• PK record_id
• FK4 sys_id
• FK3 source_id
• FK1 var1_id, val1, val1_min, val1_max
• FK2 [pvar1_id], [pval1], [pval1_min], [pval1_max], ...
<var> (one for each variable)
• record_id, sys_id, var1_id, val1, val1_min, val1_max, [pvar1_id], [pval1], [pval1_min], [pval1_max], ...
var
• PK id
• name, description, cardinality, prototype, unit
source
• PK id
• name, description, ...
rule
• FK1 origin_var_id
• FK2 destination_var_id
• rule
system
• id, creation_time
• Developed prototype schema as example for knowledge retention
• Identify key weaknesses in conversion from conventional storage
• Strongly-typed records
• Formed design constraints for information transport
Ferris, KF, and DM Jones. 2010. “Assessing Data in Support of Structure-Processing-Property Relationships in Mg-alloy Design.” TMS 2010.
Jones, DM, and KF Ferris. 2010. “Generic Materials Property Data Storage and Retrieval for Alloy Material Applications.” TMS 2010.
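The schema above can be sketched as a small in-memory SQLite database. This is only an illustration of the table layout named on the slide: the column types, the prototype name "scalar", and the demo rows (including the density value) are assumptions made for the example, not part of the original design.

```python
import sqlite3

# Table and column names follow the slide; SQL types are assumed.
# "record_scalar" stands in for record_<prototype> with a hypothetical
# prototype called "scalar".
SCHEMA = """
CREATE TABLE var (
    id INTEGER PRIMARY KEY,
    name TEXT, description TEXT, cardinality TEXT, prototype TEXT, unit TEXT
);
CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE system (id INTEGER PRIMARY KEY, creation_time TEXT);
CREATE TABLE rule (
    origin_var_id INTEGER REFERENCES var(id),
    destination_var_id INTEGER REFERENCES var(id),
    rule TEXT
);
CREATE TABLE record_scalar (
    record_id INTEGER PRIMARY KEY,
    sys_id INTEGER REFERENCES system(id),
    source_id INTEGER REFERENCES source(id),
    var1_id INTEGER REFERENCES var(id),
    val1 REAL, val1_min REAL, val1_max REAL,
    pvar1_id INTEGER REFERENCES var(id),  -- optional parameter variable
    pval1 REAL, pval1_min REAL, pval1_max REAL
);
"""

def build_demo_db():
    """Create the tables and insert one placeholder record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO var VALUES (1, 'Density', 'bulk density', 'one', 'scalar', 'g/cm3')"
    )
    conn.execute("INSERT INTO source VALUES (1, 'demo source', 'illustrative only')")
    conn.execute("INSERT INTO system VALUES (1, '2011-05-12')")
    # Placeholder value with min/max bounds; not taken from any repository.
    conn.execute(
        "INSERT INTO record_scalar (sys_id, source_id, var1_id, val1, val1_min, val1_max) "
        "VALUES (1, 1, 1, 1.8, 1.7, 1.9)"
    )
    return conn

def values_for(conn, var_name):
    """Return (sys_id, value, unit) for every record of the named variable."""
    rows = conn.execute(
        "SELECT r.sys_id, r.val1, v.unit FROM record_scalar r "
        "JOIN var v ON v.id = r.var1_id WHERE v.name = ?",
        (var_name,),
    )
    return rows.fetchall()

conn = build_demo_db()
print(values_for(conn, "Density"))
```

Keeping the variable definition (name, unit) in `var` and only an ID in the record table is one way to read the "strongly-typed records" bullet: a value cannot be stored without pointing at a defined variable, system, and source.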
Technical Accomplishment 2: Enabling the Phase Diagram Infrastructure to MSSU Repository
First principles workgroup: Bridging first principles calculation to phase diagrams
PNNL Contribution
• Informatics effort provided Query/Response ‘glue’ for ESPEI, MSSU Cyberinfrastructure, first principles, Calphad.
• Automated procedure, retaining critical information for data
• ‘Glue’ formalism applicable to other information joining situations
[Diagram labels: ESPEI (Materials Informatics), MSSU Data Repository, DB SQL, DBMS-SQL, Query, Response, Calphad, ThermoCalc]
Technical Accomplishment 3: Tools to Assess Data Completeness
[Figure: four schematic panels]
Data Completeness (axes: Structure, Properties)
• Are the existing data complete for the desired design parameters?
• Annotations: "OK…"; "More data are required, extrapolation is too great"; "Existing system"; "To-be-created system"
System Definition (axes: Structure, Density)
• Do systems have incomplete structure definitions, so one structure gives more than one property result?
• Density models will be ambiguous!
Property Definition & Scale (axes: Structure, Density; curves labeled small scale and large scale)
• Are properties ambiguously defined? Should they be resolved by scale or other parameters?
• Here, a well-defined system has different density measurements, depending on scale.
• These must be clearly distinguished and parameterized in the database
• Different end-properties may require one, or both measurements.
Property Correlation (axes: Structure, Properties)
• Are some properties correlated, so informatics can proceed without complete measures of all properties?
• Annotations: "Correlated for a class of structures"; "Not correlated"
Are the existing data complete for the desired design parameters?
Data alone do not guarantee successful information-driven design.
Example: Predicting mechanical constants from individual data points
• Harmonic behavior
• Two different phases with different responses
Knowing there are two different groupings is critical to correct predictions/models/behaviors.
Not having enough information about this data may lead to:
• Linear fit
• Quadratic model for average behavior
• Averaging non-equivalent systems (two phases)
• Greater property value uncertainties; validation/verification difficulties
[Figure: Energy vs. Displacement; a step-through animation sequence illustrates the challenge]
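The two-phase example can be reproduced numerically. The sketch below uses synthetic harmonic energy curves, E = ½k(x − x0)², with assumed stiffness and equilibrium values for two hypothetical phases. It fits a quadratic to each phase separately and to the pooled points, showing that the pooled fit returns an "average" stiffness belonging to neither phase.

```python
import numpy as np

def harmonic_energy(x, k, x0):
    """Harmonic energy E = 0.5 * k * (x - x0)^2."""
    return 0.5 * k * (x - x0) ** 2

def fitted_stiffness(x, e):
    """Fit E = a*x^2 + b*x + c and return the stiffness k = 2a."""
    a, _b, _c = np.polyfit(x, e, 2)
    return 2.0 * a

def rms_residual(x, e):
    """Root-mean-square residual of the quadratic fit."""
    coeffs = np.polyfit(x, e, 2)
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - e) ** 2)))

# Synthetic data for two hypothetical phases (all parameter values assumed).
x = np.linspace(-1.0, 1.0, 21)
e_a = harmonic_energy(x, k=2.0, x0=-0.3)
e_b = harmonic_energy(x, k=8.0, x0=0.4)

# Pooled data: the phase label has been lost.
x_pooled = np.concatenate([x, x])
e_pooled = np.concatenate([e_a, e_b])

k_a = fitted_stiffness(x, e_a)
k_b = fitted_stiffness(x, e_b)
k_pooled = fitted_stiffness(x_pooled, e_pooled)

print("per-phase stiffness:", k_a, k_b)
print("pooled stiffness:", k_pooled)
print("pooled RMS residual:", rms_residual(x_pooled, e_pooled))
```

With the phase label retained, each fit recovers its own constant with essentially zero residual. Without it, the pooled fit reports a single intermediate constant and a large residual, which is the "greater property value uncertainties" outcome listed above.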
Incomplete System Definition Can Lead to Incorrect Conclusions
Major property deviations for system definition = alloy+processing+post treatment+shape
[Figure: Property Value Deviations Resulting From Incomplete System Definition (Metals Bank). Axes: systems, with labels such as AZ31:F:Hot Rolling:Plate, covering AM50, AZ31, AZ31B, AZ61, AZ91, AZ91D, and ZA84 entries; and properties (Average_grain_size, Brinell_Hardness_HB, Tensile_Elongation, Tensile_Yield_Strength, Ultimate_Tensile_Strength, and weight_percent of Al, Cu, Mg, Mn, Sb, Sr, Zn). Color scale: % deviation in property value, 0 to 100.]
• Illustration: % Deviation in Property Values for ‘Nominally’ Equivalent Systems
• Problem: Improperly joined information not useful for verification and validation
Tools for data assessment (Metals Bank): Contribution of system definition components to property error
[Figure: Estimated impact of system definition components on property errors. Value axis: component value (~fractional standard deviation), -1 to 1. Category axis: system definition components (sys alloy, sys posttreatment, sys processing, sys shape) and measured properties (Average grain size through Ultimate Tensile Strength). Series: PCA 1 - alloy, PCA 2 - posttreatment, PCA 3 - processing, PCA 4 - shape.]
• Create reduced system definitions and their associated error for each measured property by aggregation
• Principal components analysis, using indicator variables to describe the reduced system definition components
• Normalize components to unit indicator variable height for comparison; flag non-varying components
Principle: the importance of a system definition component relates to the observed increase in error when that component is removed.
Example: for density, observed error using alloy+shape as a system definition is nearly the same as that for alloy alone.
Shape is not a crucial system definition component for density.
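The principle can be illustrated with a leave-one-component-out comparison. The sketch below is a simplified stand-in, not the PCA procedure described on the slide: it groups records by a reduced system definition, measures the RMS deviation of values from their group mean, and reports how much that error grows when each component is dropped. The function names and the four records are hypothetical placeholders, not Metals Bank data.

```python
from collections import defaultdict
from math import sqrt

COMPONENTS = ("alloy", "posttreatment", "processing", "shape")

def pooled_error(records, components):
    """RMS deviation of values from their group mean, grouping by `components`."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[c] for c in components)
        groups[key].append(rec["value"])
    total_sq = 0.0
    count = 0
    for values in groups.values():
        mean = sum(values) / len(values)
        total_sq += sum((v - mean) ** 2 for v in values)
        count += len(values)
    return sqrt(total_sq / count)

def component_importance(records, components=COMPONENTS):
    """Increase in pooled error when each component is removed from the definition."""
    full = pooled_error(records, components)
    return {
        c: pooled_error(records, [k for k in components if k != c]) - full
        for c in components
    }

# Synthetic density-like records: the value depends on alloy only (assumed).
RECORDS = [
    {"alloy": "A", "posttreatment": "F", "processing": "Hot Rolling", "shape": "Plate", "value": 1.77},
    {"alloy": "A", "posttreatment": "F", "processing": "Hot Rolling", "shape": "Sheet", "value": 1.77},
    {"alloy": "B", "posttreatment": "F", "processing": "Hot Rolling", "shape": "Plate", "value": 1.81},
    {"alloy": "B", "posttreatment": "F", "processing": "Hot Rolling", "shape": "Sheet", "value": 1.81},
]

importance = component_importance(RECORDS)
for name, delta in importance.items():
    print(f"{name}: error increase {delta:.4f}")
```

In this toy data set, dropping shape leaves the error unchanged while dropping alloy increases it, which mirrors the density example above. Components that never vary in the data (here posttreatment and processing) also show zero increase, so they would need to be flagged separately as non-varying rather than read as unimportant.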
Tools to identify system definition components controlling property value deviations
[Figure: Estimated impact of system definition components on property errors (same chart as on the previous slide), annotated with “Constrained” and “Equilibrium” regions and with the non-varying components marked.]
• Two important signatures: “thermodynamic,” “constrained”
• Error for core mechanical properties depends on all components
Basic data support: Tools for data density...what properties, alloys measured, what aren’t.
[Figure: Measurement density for alloys. Axes: composition (alloys from AM100A through ZK60A) and property (Aging_Temperature through Vickers_Hardness_HV, plus weight_percent of alloying elements). Color scale: number of reported values, 5 to 30.]
• Data density map for lightweight alloy section of Metals Bank
• Illustrates dense and sparse data regions; information gaps
• Caveat: Data by itself does not necessarily constitute knowledge
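A data density map of this kind reduces to counting reported values per alloy/property pair. The following is a minimal sketch, assuming each record is simply an (alloy, property) pair for one reported value. The alloy and property names are borrowed from the figure labels, but the records themselves are made up for illustration.

```python
from collections import Counter

def density_map(records):
    """Count reported values for each (alloy, property) pair."""
    return Counter(records)

def information_gaps(counts, alloys, properties):
    """List (alloy, property) pairs with no reported values."""
    return sorted(
        (alloy, prop)
        for alloy in alloys
        for prop in properties
        if counts[(alloy, prop)] == 0
    )

# Hypothetical records: one tuple per reported value.
RECORDS = [
    ("AZ31", "Density"),
    ("AZ31", "Density"),
    ("AZ31", "Tensile_Yield_Strength"),
    ("AZ91D", "Tensile_Yield_Strength"),
]

counts = density_map(RECORDS)
gaps = information_gaps(
    counts,
    alloys=["AZ31", "AZ91D"],
    properties=["Density", "Tensile_Yield_Strength"],
)
print(counts[("AZ31", "Density")])
print(gaps)
```

The counts only show where measurements exist. As the caveat above notes, a densely populated cell can still be uninformative if the system definitions behind those values are incomplete.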
Tools for data assessment: Data coverage summary (Metals Bank/ASM)
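[Figure: data coverage summary for Metals Bank (MB) and ASM. Alloy groups: AZ91[A-C,E], AZ31, AZ91, AZ91D, AZ31B, AZ61*, AZ81*, AM*, AS*, AZ92*, Other (18). Property categories: Elongation, Strength, Equilibrium, Corrosion, Processing/Tempering, Shape, Hardness. Coverage levels: High, Moderate, Light.]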
Summary of Technical Accomplishments
• Assessed data support in current data resources and current state of data capabilities (Milestone Feb 2010)
• Analysis repository schema and tested implementation (Milestone Feb 2010)
  • Design helps resolve incomplete system or property definitions
  • Allows direct entry of "as is" data in its original form
  • Accommodates differing system and property definitions
• Translated knowledge retention schema into information transfer method for first principles workgroup (ThermoCalc-ESPEI-MSSU-First Principles)
• Repository assessment tools; assess support for information-driven validation, verification and design (Milestones Sep 2010)
  • Diagram database content; compare data population (single & combined)
  • Diagram of property value deviations as a function of system; identify system/property combinations with ambiguous property values
  • Map and diagram impact of system definition components against property deviations
  • Systematically look for associations between properties and processing variables
Collaborations
Partners/Collaborators
• Mississippi State University (materials repository)
• Pennsylvania State University (ESPEI integration with repository)
• ThermoCalc (ESPEI integration, Calphad model)
• Materials Atlas (Iowa State)
• CANMET (Kevin Boyle)
• Sean Agnew (University of Virginia)
• MetalsBank (materials repository)
• Discussion in progress for collaborations with other repository developers (ASM)
Technology Transfer
• Techniques/design recommendations for software/repository integration (Pennsylvania State University collaboration)
• Informatics tools / database design available to ICME community
• Sharepoint site: https://spteams1.pnl.gov/sites/mat_informatics/default.aspx
Future Work
FY2011
Informatics support for multiple data resources
• Repository+Repository, Individual+Repository, Individual+Individual
• Display ‘global’ information gaps (multi-resource)
• Physics-based data merging
• ‘Multi-repository’ additions to system definitions
Materials informatics toolkit
• Data completeness and support
• Multiphysics deconvolution for knowledge/system definition augmentation
• Modeling power visualization for workgroup validation and verification tasks
Follow-on Proposal
• Feature development to identify new or missing multiphysics
• Bridge new compositions to enable more robust design
• Granularity and multiscale hierarchy in data resources
• Robust strategies to identify/address global information gaps
Technical Back-Up Slides (limit of 5)
Technical Approach – Data Completeness
Identify and fill gaps in ICME knowledge infrastructure required for Mg/Alloy design
• Identify data ambiguities: “same” measurement, different value (missing context in repository)
• Identify data discrepancies between current repositories
• Assure data are sufficiently complete to engage Phase II modeling: structure-processing/property relations for alloys
Method: materials informatics techniques
• Look for structure or processing data correlated with final properties
• Examine impact of associating data across multiple length scales
• Examine whether data forms must be updated to enable modeling
• Anomaly detection: property values that are not defined by specified structure-processing information; ambiguities across repositories
Collaborate with current repositories on inputs and outcomes
Technical Approach - Knowledge Augmentation
What: after identifying data completeness problems, suggest knowledge augmentation in current alloy/processing databases
• Make outcomes available for updates to repositories as needed
• Informatics tools
Why: Current data may be incomplete or ambiguous for modeling purposes
• Assess and enable overall data support for information-driven design
• Examine issues with data ambiguities
• Knowledge assessments are measured against a problem (what is to be learned)
How:
• Look for structure or processing data correlated with final properties
• Examine impact of associating data across multiple length scales
• Examine whether data forms must be updated to enable modeling
• Anomaly detection: property values that are not defined by specified structure-processing information; ambiguities across repositories
Example: Data completeness and knowledge augmentation in MSSU repository
Goal: Identify compositions and processing conditions for Mg alloys consistent with improved properties
Data resource: MSSU/CAVS Materials Properties repository
Within the repository
• Available: Approximately 30 Mg alloys
• Available: Stress-strain curves for each alloy
• Not available: Detailed composition or processing information
• Not available: Yield strength numbers deduced from the stress-strain curve
In advance of detailed analysis (knowledge augmentation)
• Processing sequences should be added to the repository
• (Bulk) composition should be added to the repository
• A yield stress feature should be added to the stress-strain curves
• Other features, based on experience and theory
• Try for a small set first, assessing potential association with yield stress
More generally, systematically identify missing information, anomalies in characterization, and associations between structure and properties…
Example: Knowledge augmentation; where, how
It is easy to intermingle properties and systems…
• Inconsistent or ambiguous properties (name, form)
• Difficult to compare across systems
• Merged systems (e.g. Alnico II)
• Merged properties (e.g. Form)
• Augmentation in this case: can rectify ambiguities
Table 1. An example of alloy property data [CRC Handbook 64th edition]
System | Name | Composition | Density (g/cm3) | Tensile Strength (kg/mm2) | Form
1 | Alnico I | Fe 37-39; Al 12; Ni 20-22; Co 5 | 6.9 | 2.9 | Cast, isotropic
2 | Alnico II | Fe 52.5; Al 12; Ni 24-26; Cu 6; Co 12.5 | 7.1 | 2.1 | Cast, isotropic
2 | Alnico II | Fe 52.5; Al 12; Ni 24-26; Cu 6; Co 12.5 | 7.1 | 45.7 | Sintered, isotropic
3 | Alnico III | Fe 59-61; Al 12; Ni 24-26; Cu 3 | 6.9 | 8.5 | Cast, isotropic
4 | Alnico IV | Fe 55-56; Al 12; Ni 27-28; Co 5 | 7.0 | 6.3 | Cast, isotropic
5 | Hypothetical Calculation | Fe 40, Ni 24, Co 5 | 9.0 | 60 | Repeated fcc unit cell
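The merged-system problem in Table 1 can be detected mechanically: group rows by the system key and flag any key that maps to more than one property value. The sketch below transcribes the system number, name, tensile strength, and form columns for the four Alnico systems. The function name and the choice of grouping keys are illustrative assumptions.

```python
from collections import defaultdict

# Rows transcribed from Table 1: (system, name, tensile strength, form).
ROWS = [
    (1, "Alnico I", 2.9, "Cast, isotropic"),
    (2, "Alnico II", 2.1, "Cast, isotropic"),
    (2, "Alnico II", 45.7, "Sintered, isotropic"),
    (3, "Alnico III", 8.5, "Cast, isotropic"),
    (4, "Alnico IV", 6.3, "Cast, isotropic"),
]

def ambiguous_systems(rows, key):
    """Return keys whose rows report more than one distinct tensile strength."""
    groups = defaultdict(set)
    for row in rows:
        groups[key(row)].add(row[2])
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}

# System definition = system number only: Alnico II is ambiguous.
by_id = ambiguous_systems(ROWS, key=lambda r: r[0])

# Augmented definition = system number plus form: the ambiguity is resolved.
by_id_and_form = ambiguous_systems(ROWS, key=lambda r: (r[0], r[3]))

print(by_id)
print(by_id_and_form)
```

Promoting "Form" from a descriptive column to part of the system definition is the kind of augmentation the slide describes: the same check that flags Alnico II under the original key returns nothing once the key is extended.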
More Detailed Summary -Toolkit Accomplishments
Repository assessment tools; assess support for information-driven validation, verification and design
• Diagram of database content
  • Ability to survey data population and compare repository content
  • Locate information gaps; identify/prioritize maximum leverage points
  • Identify data conditions conducive to robust validation of model predictions/experimental measurements
• Diagram of property value deviations as a function of system
  • Identify repository system/property combinations with ambiguous property values (anomalies/system and measurement uncertainties)
  • Indicate which systems/property combinations are most susceptible to large property variations
• Map and diagram impact of system definition components against property deviations
• Systematically look for associations between properties and processing variables