35
www.data61.csiro.au DATA ANALYTICS IN NANOMATERIALS DISCOVERY Michael Fernandez | OCE-Postdoctoral Fellow September 2016

DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

www.data61.csiro.au

DATA ANALYTICS IN NANOMATERIALS DISCOVERY

Michael Fernandez | OCE-Postdoctoral Fellow September 2016

Page 2: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez3 |

Materials Discovery Process

Integrating computational methods and informationwith sophisticated computational and analyticaltools to shorten the duration of materialsdevelopment from 10-20 years to 2 or 3 years.

Materials Genome Project

Page 3: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez2 |

Materials and Molecular Modeling

• Deep learning and GPU computationsChris Watkins

• Big data and HPC integrationPiotr SzulYulia Arzhaeva

Collaborators

Team leaderAmanda Barnard

Sun Baichuan

Page 4: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez2 |

Outline

• Material Discovery Process• Methods for atomistic simulations of materials

• Experimental confirmation

• Data-driven Computatioanl Nanomaterials Discovery• Hypothetical material space sampling (structure generation and statistics)

• In silico high-throughput characterization (atomistic simulations and machine learning)

• Data storage, analytics, exploitation and integration

Page 5: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez4 |

Atomistic Simulations of MaterialsTheory can predict properties of materials

𝐻 = 𝐸𝑛𝑢𝑐𝑙𝑒𝑖({𝑅𝐼})-σ𝑖=1𝑁𝑒 𝛻𝑖

2 + 𝑉𝑛𝑢𝑐𝑙𝑒𝑖 𝑟𝑖 +1

2σ𝑖≠𝑗𝑁𝑒 1

𝑟𝑖−𝑟𝑗

• Quantum chemistry methods can cover any chemistry

• Empirical potentials exist for large number of elements

• Computation is scalable and generically deployable

Page 6: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez5 |

Atomistic Simulations of Materials

Computational predictions later confirmed by experiments

• Self-assembly mechanism of nanodiamonds

Barnard, A. S. et al. Nanoscale 3, 958–62 (2011)

The arrow indicates (111)|(111) interface between two 4 nm sized nanodiamonds.

DFTB simulations of the surface electrostatic potential of dodecahedral diamond nanoparticles of a) 2.2 nm and b) 2.5 nm

Page 7: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez6 |

Materials libraries

Experiment design

DatabasePerformance

Measurement

Lead materials scale up

Knowledge for rational

design

Banks of materials

Hypothetical materials

In-silico screening

Modern materials discovery cycle

• Polydispersive systems• Exponential increase in

complexity and diversity• Nearly infinite combinatorial

problem

Potyrailo, R. et al. ACS Comb. Sci. 13,579–633 (2011).

Nano-

Theory, modeling and

informatics

Page 8: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez7 |

Nanomaterials ScreeningDeparting from the Edisonian approach

vs.

We can accurately predict a property, so it can be computed for entire materials spaces

Combinatorial In silico design

In silico structure generation

Page 9: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez7 |

Nanomaterials Screening

• Systematic and extensive materials performance ranking.

• “Big data” discovery of structure-property relationships in unknown materials domains.

• Accelerated identification of high potential candidates and rational design principles.

Page 10: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez7 |

Data Analytics Challenges

Information representation• Fingerprints

Information extraction• Multivariate statistical analysis

Knowledge discovery• Data mining and machine learning

Knowledge representation• Visualization

Page 11: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez9 |

Polydispersity Challenge in Nanomaterials

Polydispersive sample Quasi-monodisperse sample

Purification

Controlled synthesis

$$$

Polydispersity can be detrimental for high-performing applications

Purification of polydispersive nanoparticles samples is expensive

Page 12: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez10 |

Data Analytics of Nanocarbons

Nanodiamonds Graphenes

Virtual structures relaxed using TB-DFT

Page 13: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez10 |

Archetypal Analysis (AA)

The predictors of Xi are finite mixtures of archetypes Zj, which are convex combinations of the observations.

Finds a k m matrix Z that corresponds to the archetypal or ”pure patterns” in the data in such away that each data point can be represented as a mixture of those archetypes. In other words, thearchetypal analysis yields the two n k coefficient matrices α and β, which minimize the residualsum of squares:

Cutler, A. & Breiman, L. Archetypal Analysis. Technometrics 36, 338–347 (1994).

Page 14: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez11 |

Archetypal Analysis of Nanocarbons

Nanodiamonds

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

Graphene nanoflakes

Page 15: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Nanocarbons Prototypes

Nanodiamonds prototypes

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

Graphene prototypes

Page 16: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Estimation of Nanodiamonds Properties

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

Page 17: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Structural Diversity Challenge

Defects, oxidation and edge passivation yield large structural diversity

Graphene nanoflakes

Trigonal Rectangular Hexagonal

Page 18: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

PP

P

Silicon qbits

Structural Diversity Challenge

Single Si substitution by P yields a large structural diversity

Page 19: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Metal-Organic Framework (MOF)

nitroimidazole

benzimidazole

Zn2+ CO2 capture and sequestration(Science, 2008)

ZIF-68

Structural Diversity Challenge

Page 20: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Metal-Organic Framework (MOF)

In-silico Combinatorial design

Structural Diversity Challenge

Page 21: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Metal-Organic Framework (MOF)

Modification with of 35 functional groups gives a total of ~1.5 million (354) unique combinations

MOF ZBPOrganic Linkersa) b)

Structural Diversity Challenge

Page 22: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Machine Learning Approach

Machine learning prediction of functional properties

Machine learningFeature fingerprints

Page 23: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Data Analytics Challenge

Binary decision tree of the Band Gap of graphene

Fernandez, M., Shi, H. & Barnard, A. S Carbon (2016). doi:10.1016/j.carbon.2016.03.005

Features• Surface area• Number of atoms• Shape aspect ratio

Accuracy80%

Page 24: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Machine Learning vs. Atomistic Simulations

Estimation of the graphene Band Gap from topological features

Pi and Pj , are the values of a bond order of thecarbon atoms in graphene, while L is the topologicaldistance, whilst Lij is a delta function delta function

Fernandez, M. et al. ACS Comb. Sci. (2016) doi:10.1021/acscombsci.6b00094

Page 25: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Machine Learning of Graphene

Fernandez, M.; Shi, H.; Barnard, A et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

Radial Distribution Function (RDF) scores for graphene

the summation is over the N atom pairs in the graphene structure, andrij is the distance of these pairs and B is a smoothing parameter set to 10.

Page 26: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Machine Learning vs. Atomistic Simulations

Ionization Potential

Fernandez, M.; Shi, H.; Barnard, A et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

Energy of the Fermi level

From RDF scores

Page 27: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Machine Learning vs. Atomistic SimulationsMachine learning prediction of gas adsorption in MOF

Fernandez, M., et al. J. Phys. Chem. C 117, 14095–14105 (2013).

Page 28: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Accuracy and Coverage Challenges

Syst

em

siz

eAccuracy of electronic calculations methods vs. system size

TBDF/Semiempirical

Density Functional

Accuracy

Quantum Monte Carlo

Coupled Cluster

5000 atoms

500 atoms

100 atoms

20 atoms

Page 29: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez15 |

Data-driven ChallengeMachine learning for large material spaces

∆3

Machine learning predictions:-Functional property value or threshold-Accuracy of different quantum-chemistry methods =f( )structure

Machine Learning

Partial Database Screening

Full DatabasePredictions

Ramakrishnan, R. et al. J. Chem. Theory Comput. 11, 2087–2096 (2015). Fernandez, M et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

Page 30: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Challenges and Limitations

E

QMC

B3LYP

6,095 isomers of C7H10O2

big gapbig gap

Accuracy Gap Between Different Levels of Theory

Page 31: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data Analytics for Nanomaterials| Michael Fernandez13 |

Accuracy Gap Predictions

Predictions

Machine learning calibration

Page 32: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Data-driven High-throughput Screening

outputsData storage

Input jobsQueue management

Computational resources

Resubmit or kill failed runs

Finished runs

Accuracy refinement

Data Analytics for Nanomaterials| Michael Fernandez13 |

Page 33: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Self-organization Map (SOM) of NPs

Data Analytics for Nanomaterials| Michael Fernandez13 |

Ag-NP

SOM

Electrostatic Potential

Page 34: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

Deep Learning for Nanomaterials

Data Analytics for Nanomaterials| Michael Fernandez13 |

C2

Feature maps10x10

5x5convolution

2x2subsampling

5x5convolution

2x2subsampling Fully connected

input32x32

C1

Feature maps28x28

S1

Feature maps14x14

S2

Feature maps5x5

B1B2

Output

ClassificationFeature extraction

Image32x32

Page 35: DATA ANALYTICS IN NANOMATERIALS DISCOVERY...10 | Data Analytics for Nanomaterials| Michael Fernandez Archetypal Analysis (AA) The predictors of Xi are finite mixtures of archetypes

www.data61.csiro.au

Data61Michael Fernandeze [email protected]

w https://research.csiro.au/mmm/

Thank you