The Materials Project Ecosystem: A Complete Software and Data Platform for Materials Informatics
Shyue Ping Ong, University of California, San Diego
“Information wants to be free.” – Stewart Brand, 1960s
“Information wants to be free and code wants to be wrong.”
– RSA Conference 2008
“Materials information and code wants to be free and right.”
The Materials Project is an open science project to make the computed properties of all known inorganic materials publicly available to all researchers to accelerate materials innovation.
June 2011: The Materials Genome Initiative is launched. It aims to “fund computational tools, software, new methods for material characterization, and the development of open standards and databases that will make the process of discovery and development of advanced materials faster, less expensive, and more predictable”.
https://www.materialsproject.org
As of Jun 5 2015
¨ Over 58,000 unique compounds, and growing
¨ Diverse set of many properties
¤ Structural (lattice parameters, atomic positions, etc.)
¤ Energetic (formation energies, phase stability, etc.)
¤ Electronic structure (DOS, band structures)
¤ Elastic constants
¨ Suite of Web Apps for materials analysis
User-friendly Web Apps
¨ Materials Explorer: Search for materials by formula, elements or properties
¨ Battery Explorer: Search for battery materials by voltage, capacity and other properties
¨ Crystal Toolkit: Design new materials from existing materials
¨ Structure Predictor: Predict novel structures
¨ Phase Diagram App: Generate compositional and grand canonical phase diagrams
¨ Pourbaix Diagram App: Generate Pourbaix diagrams
¨ Reaction Calculator: Balance reactions and calculate their enthalpies
Materials Project data in user papers
M. Meinert, M.P. Geisler, Phase stability of chromium based compensated ferrimagnets with inverse Heusler structure, J. Magn. Magn. Mater. 341 (2013) 72–74.
J. Rustad, Density functional calculations of the enthalpies of formation of rare-earth orthophosphates, Am. Mineral. 97 (2012) 791–799.
M. Fondell, T.J. Jacobsson, M. Boman, T. Edvinsson, Optical quantum confinement in low dimensional hematite, J. Mater. Chem. A. 2 (2014) 3352.
Web frontend is only the tip of the iceberg…
pymatgen, FireWorks, REST API, custodian, MPWorks, MPEnv, rubicon
Hierarchical design of codebases keeps infrastructure nimble to changes
WORKFLOW CODE
CHEMISTRY CODE
Many types of use cases
Crystal workflows: FireWorks, pymatgen, custodian, MPWorks
Molecule workflows: FireWorks, pymatgen, custodian, rubicon (private)
pymatgen
FireWorks
External: MAST, MaterialsHub
External: Berlin ML, JGI, MoDeNa
Sustainable software development
¨ Open-source
¤ Managed via
¤ More eyes => robustness
¤ Contributions from all over the world
¨ Benevolent dictators
¤ Unified vision
¤ Quality control
¨ Clear documentation
¤ Prevent code rot
¤ More users
¨ Continuous integration and testing
¤ Ensure code is always working
Python Materials Genomics (pymatgen)
¨ Core materials analysis powering the Materials Project
¨ Defines core extensible Python objects for materials data representation.
¨ Provides a robust and well-documented set of structure and thermodynamic analysis tools relevant to many applications.
¨ Establishes an open platform for researchers to collaboratively develop sophisticated analyses of materials data.
Extensive Materials Analysis Capabilities
Input/Output
objects (Modular, Reusable, Extendable)
Defects and Transformations
Electronic Structure
XRD Patterns
Phase and Pourbaix Diagrams
Functional properties
Comprehensively documented
Continuously tested and integrated
Active dev/user community
www.pymatgen.org stats
• > 6000 views per month on average
• (~50% increase from previous year)
v2.9.12 → v3.0.13 (*Python 2/3 compatible!)
Other improvements
• ABINIT support
• Defects (Haranczyk/LBNL)
• Qchem (JCESR)
• Bug fixes & improvements
Very active user community!
81 forks (developers making changes and contributing)
The commit rate has slowed somewhat, as expected for a maturing and robust code base.
Pymatgen-db
¨ Database add-on for pymatgen. Enables the creation of Materials Project-style MongoDB (www.mongodb.org) databases for management of materials data. Key features:
¤ Query engine for easy translation of MongoDB docs to useful pymatgen objects for analysis purposes.
¤ Includes a clean and intuitive web UI (the Materials Genomics UI) for exploring Mongo collections.
¤ http://pythonhosted.org//pymatgen-db/
Custodian
¨ Simple, robust and flexible just-in-time (JIT) job management framework.
¤ Wrappers to perform error checking, job management and error recovery.
¤ Error recovery is an important aspect of high-throughput (HT) work: O(100,000) jobs + 1% error rate => O(1000) errored jobs.
¤ Existing sub-packages for error handling for VASP, NWChem and QChem calculations.
¨ Blue: Controlled by subclasses of Job
¨ Red: Defined by ErrorHandlers.
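The job / error-handler split described above can be illustrated with a toy loop. This is only a sketch of the general check-correct-rerun pattern, not custodian's actual API: `FlakyJob`, `DivergenceHandler` and `run_with_recovery` are hypothetical names invented for the illustration, and the "calculation" is a stand-in that converges once its step size is small enough.

```python
"""Toy job/error-handler loop. All names are hypothetical, not custodian's API."""


class FlakyJob:
    """Stand-in for a calculation that fails until its settings are corrected."""

    def __init__(self):
        self.settings = {"step_size": 1.0}
        self.output = None

    def run(self):
        # Pretend the calculation only converges with a small enough step.
        small_enough = self.settings["step_size"] <= 0.25
        self.output = "converged" if small_enough else "diverged"


class DivergenceHandler:
    """Detects a diverged run and halves the step size as the correction."""

    def check(self, job):
        return job.output == "diverged"

    def correct(self, job):
        job.settings["step_size"] /= 2


def run_with_recovery(job, handlers, max_errors=5):
    """Run the job, apply corrections when a handler fires, then rerun.

    Returns the number of corrections applied; raises after max_errors.
    """
    corrections = 0
    while True:
        job.run()
        triggered = [h for h in handlers if h.check(job)]
        if not triggered:
            return corrections
        if corrections >= max_errors:
            raise RuntimeError("too many errors; giving up")
        for handler in triggered:
            handler.correct(job)
            corrections += 1


job = FlakyJob()
n_corrections = run_with_recovery(job, [DivergenceHandler()])
print(n_corrections, job.output, job.settings)
```

Keeping detection (`check`) separate from repair (`correct`) means new failure modes can be covered by adding handlers without touching the job logic.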
Concrete Example for VASP calculations
¨ An extensive set of rules has been codified for running VASP calculations
¨ Significantly reduces error rate of calculations (< 1%)
VaspJob class
¨ auto_npar: automatically modifies NPAR in INCAR to a relatively optimal number based on detected number of processors! Enhances vasp calculation efficiency by ~10-30%!!!
¨ auto_gamma: If this is a gamma-only calculation and a gamma compiled version of vasp exists, use it. Another 10-20% increase in efficiency!
¨ Even without error handling, custodian already significantly improves resource utilization of running VASP calculations!
VaspJob(vasp_cmd, output_file="vasp.out", auto_npar=True, auto_gamma=True, …<other options>...)
FireWorks is the Workflow Manager
Custom material
A cool material!! Lots of information about cool material!!
Submit!
Input generation (parameter choice)
Workflow mapping
Supercomputer submission / monitoring
Error handling
File Transfer
File Parsing / DB insertion
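The stages listed above form a dependency chain that a workflow manager has to execute in order. The sketch below is not FireWorks code. It uses only Python's standard-library `graphlib` to show the underlying idea, and the stage names and dependency edges are assumptions read off the slide for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each stage maps to the set of stages that must finish before it can start.
# Stage names follow the slide; the edges between them are assumed.
STAGE_DEPENDENCIES = {
    "input_generation": set(),
    "workflow_mapping": {"input_generation"},
    "submission": {"workflow_mapping"},
    "error_handling": {"submission"},
    "file_transfer": {"error_handling"},
    "file_parsing": {"file_transfer"},
    "db_insertion": {"file_parsing"},
}


def execution_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(STAGE_DEPENDENCIES)
print(" -> ".join(order))
```

The same function handles branching workflows: any stage whose prerequisites are all complete becomes eligible to run.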
FireWorks as a platform
Community can write any workflow in FireWorks → We can automate it over most supercomputing resources
structure
charge
Band structure
DOS
Optical
phonons
XAFS spectra
GW
Workflows in Development by Internal/External Collaborations
¨ Elastic constants (in production)
¨ Thermal properties (Phonon / GIBBS: in testing)
¨ Surfaces (in testing)
¨ GW / hybrid calculations
¨ ABINIT workflows (Geoffroy Hautier, UCL)
¨ Any code can be added and automated
Materials Project DB
How do I access MP data?
Option 1: Direct access
Most flexible and powerful, but
• User needs to know db language
• Security is an issue
• Fragile – if db tech or schema changes, user’s analysis breaks
Option 2: Web Apps
Pros
• Intuitive and user-friendly
• Secure
Cons
• Significant loss in flexibility and power
Option 3: Web Apps built on RESTful API
Pros
• Intuitive and user-friendly
• Secure
• Programmatic access for developers and researchers
The Materials API
¨ An open platform for accessing Materials Project data based on REpresentational State Transfer (REST) principles.
¨ Flexible and scalable to cater to a large number of users, with different access privileges.
¨ Simple to use and code agnostic.
A REST API maps a URL to a resource.
Example: GET https://api.dropbox.com/1/account/info
Returns information about a user’s account.
Methods: GET, POST, PUT, DELETE, etc.
Response: Usually JSON or XML or both
Who implements REST APIs?
https://www.materialsproject.org/rest/v2/materials/Fe2O3/vasp/energy
Preamble: https://www.materialsproject.org/rest/v2
Request type: materials
Identifier: Fe2O3; typically a formula (Fe2O3), id (1234) or chemical system (Li-Fe-O)
Data type: vasp (vasp, exp, etc.)
Property: energy
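As a rough illustration of the URL anatomy above, the pieces can be joined programmatically. This is a minimal sketch: `materials_url` is a hypothetical helper, not part of pymatgen or the Materials API, and it assumes the labels map onto the example URL from left to right (preamble, request type, identifier, data type, property).

```python
from urllib.parse import quote

# Preamble and request type as they appear in the example URL on the slide.
PREAMBLE = "https://www.materialsproject.org/rest/v2"
REQUEST_TYPE = "materials"


def materials_url(identifier, data_type, prop):
    """Join preamble, request type, identifier, data type and property.

    identifier: a formula ("Fe2O3"), an id, or a chemical system ("Li-Fe-O").
    data_type:  e.g. "vasp" or "exp".
    prop:       the property name, e.g. "energy".
    """
    segments = [REQUEST_TYPE, identifier, data_type, prop]
    # Percent-encode each segment so unusual characters cannot break the path.
    return PREAMBLE + "/" + "/".join(quote(s, safe="") for s in segments)


print(materials_url("Fe2O3", "vasp", "energy"))
print(materials_url("Li-Fe-O", "vasp", "energy"))
```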
Secure access
¨ An individual API key provides secure access with defined privileges.
¨ All https requests must supply the API key as either an “x-api-key” header or a GET/POST “API_KEY” parameter.
¨ API key available at https://www.materialsproject.org/dashboard
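One way to attach the key as an “x-api-key” header is sketched below with Python's standard library. `build_request` is a hypothetical helper and `"YOUR_API_KEY"` is a placeholder. The request object is only constructed here; nothing is sent over the network.

```python
import urllib.request

EXAMPLE_URL = "https://www.materialsproject.org/rest/v2/materials/Fe2O3/vasp/energy"


def build_request(url, api_key):
    """Create a GET request that carries the API key in the x-api-key header."""
    return urllib.request.Request(url, headers={"x-api-key": api_key})


request = build_request(EXAMPLE_URL, "YOUR_API_KEY")  # placeholder key
print(request.get_method(), request.full_url)
print(request.header_items())

# Sending it would look like this (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     body = response.read().decode("utf-8")
```

Passing the key in a header rather than as a URL parameter keeps it out of the request URL itself.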
Sample output (JSON)
¨ Intuitive response format
¨ Machine-readable (JSON parsers available for most programming languages)
¨ Metadata provides provenance for tracking
{
  "created_at": "2014-07-18T11:23:25.415382",
  "valid_response": true,
  "version": {
    "pymatgen": "2.9.9",
    "db": "2014.04.18",
    "rest": "1.0"
  },
  "response": [
    {
      "energy": -67.16532048,
      "material_id": "mp-24972"
    },
    {
      "energy": -132.33035197,
      "material_id": "mp-542309"
    },
    {…}, {…}, {…}, {…}, {…}, {…}, {…}, {…}
  ],
  "copyright": "Materials Project, 2012"
}
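Because the response is JSON, it can be consumed with any standard parser. The sketch below parses a shortened copy of the sample output (only the two entries shown in full are included) and collects the energies by material id. `energies_by_id` is a hypothetical helper name.

```python
import json

# Shortened copy of the sample response, with only the two expanded entries.
SAMPLE_RESPONSE = """
{
  "created_at": "2014-07-18T11:23:25.415382",
  "valid_response": true,
  "version": {"pymatgen": "2.9.9", "db": "2014.04.18", "rest": "1.0"},
  "response": [
    {"energy": -67.16532048, "material_id": "mp-24972"},
    {"energy": -132.33035197, "material_id": "mp-542309"}
  ],
  "copyright": "Materials Project, 2012"
}
"""


def energies_by_id(payload):
    """Parse a response document and map each material_id to its energy."""
    document = json.loads(payload)
    if not document.get("valid_response"):
        raise ValueError("the API did not return a valid response")
    return {entry["material_id"]: entry["energy"] for entry in document["response"]}


energies = energies_by_id(SAMPLE_RESPONSE)
for material_id, energy in energies.items():
    print(material_id, energy)
```

Checking `valid_response` before reading `response` avoids silently processing an error document.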
Can I really access any piece of data in the Materials Project?
Github-powered RESTful documentation http://bit.ly/materialsapi
Via the shockingly powerful https://www.materialsproject.org/rest/v2/query
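A request body for the query endpoint might be assembled as below. This is a sketch under stated assumptions: the form-field names `criteria` and `properties`, the MongoDB-style criteria syntax, and the document fields (`elements`, `nelements`, `pretty_formula`, `formation_energy_per_atom`) are all assumed here and should be checked against the documentation at http://bit.ly/materialsapi. Nothing is sent over the network.

```python
import json
from urllib.parse import parse_qs, urlencode

QUERY_URL = "https://www.materialsproject.org/rest/v2/query"


def build_query_body(criteria, properties):
    """Encode a criteria dict and a property list as a form-encoded POST body.

    Assumption: the endpoint expects JSON strings in fields named
    "criteria" and "properties".
    """
    return urlencode(
        {
            "criteria": json.dumps(criteria),
            "properties": json.dumps(properties),
        }
    )


# Hypothetical query: ternary compounds that contain Li, Fe and O.
criteria = {"elements": {"$all": ["Li", "Fe", "O"]}, "nelements": 3}
properties = ["pretty_formula", "formation_energy_per_atom"]

body = build_query_body(criteria, properties)
print(body)
```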
Demo http://localhost:8888/notebooks
The Materials API + pymatgen in Education – UCSD’s NANO 106
¨ Data mined over the Materials Project’s 49,000+ unique crystals
http://www.bit.ly/sg_stats
P21/c is the most common space group, comprising ~9.8% of all compounds
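A statistic of this kind reduces to counting space-group symbols and dividing by the total. The sketch below shows the computation on a tiny made-up list. The symbols in `toy_symbols` are illustrative only and are not Materials Project data, and `space_group_fractions` is a hypothetical helper.

```python
from collections import Counter


def space_group_fractions(symbols):
    """Return (symbol, fraction) pairs sorted from most to least common."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return [(symbol, n / total) for symbol, n in counts.most_common()]


# Made-up input for illustration; real input would be one symbol per crystal.
toy_symbols = ["P2_1/c", "Pnma", "P2_1/c", "Fm-3m", "P-1", "P2_1/c", "Pnma", "C2/c"]

fractions = space_group_fractions(toy_symbols)
for symbol, fraction in fractions:
    print(f"{symbol}: {fraction:.1%}")
```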
The Materials Virtual Lab @ UCSD’s One-click AIMD
Starting candidates
Topological Screening (augmented by DFT)
Stability (phase & EW) screening
Diffusivity
Optimized candidates
Automated “one-click” MD workflow based on pymatgen, custodian and fireworks
AIMD SDSC
Multi-week AIMD simulation
Statistical exclusionary screening
Y. Mo, S. P. Ong, G. Ceder, “Insights into Diffusion Mechanisms in P2 Layered Oxide Materials by First-Principles Calculations”, submitted
Automated pathway extraction + NEB
Coming soon (full launch in next few weeks)!!
Sounds good, where do I learn more?
¨ The Materials Project
¤ https://www.materialsproject.org/open
¨ The Materials API Github Doc
¤ http://bit.ly/materialsapi
¨ The Materials Virtual Lab (MAVRL) @ UCSD
¤ Slides from Workshop on MP infrastructure (http://mavrl.org/software)
Thank you.