Review Article
Machine Learning for Condensed Matter Physics
Edwin A. Bedolla-Montiel1, Luis Carlos Padierna1 and Ramón Castañeda-Priego1
1 División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
E-mail: [email protected]
Abstract. Condensed Matter Physics (CMP) seeks to understand the microscopic
interactions of matter at the quantum and atomistic levels, and describes how these
interactions result in both mesoscopic and macroscopic properties. CMP overlaps
with many other important branches of science, such as Chemistry, Materials Science,
Statistical Physics, and High-Performance Computing. With the advancements
in modern Machine Learning (ML) technology, a keen interest in applying these
algorithms to further CMP research has created a compelling new area of research at
the intersection of both fields. In this review, we aim to explore the main areas within CMP that have successfully applied ML techniques to further research, such as the
description and use of ML schemes for potential energy surfaces, the characterization
of topological phases of matter in lattice systems, the prediction of phase transitions
in off-lattice and atomistic simulations, the interpretation of ML theories with physics-
inspired frameworks and the enhancement of simulation methods with ML algorithms.
We also discuss in detail the main challenges and drawbacks of using ML methods on
CMP problems, as well as some perspectives for future developments.
Keywords: machine learning, condensed matter physics
Submitted to: J. Phys.: Condens. Matter
arXiv:2005.14228v3 [physics.comp-ph] 14 Aug 2020
[Figure 1 is a schematic diagram linking "Condensed Matter Physics + Machine Learning" to the topics covered in this review: Hard Matter (classical and modern ML approaches; topological phases of matter; phase transitions and critical parameters; enhanced simulation methods), Materials Modeling (machine-learned potentials; atomistic feature engineering; machine-learned free energy surfaces; enhanced simulation methods), Physics-inspired Machine Learning Theory (restricted Boltzmann machines; renormalization group; explanation of deep learning theory; interpretation of ML models) and Soft Matter (colloidal systems; experimental setups; phase transitions; structural and dynamical properties).]
Figure 1. Schematic representation of the potential applications of ML to CMP discussed in this review.
1. Introduction
Currently, machine learning (ML) technology has seen widespread use in various aspects
of modern society: automatic language translation, movie recommendations, face
recognition in social media, fraud detection, and more everyday life activities [1] are
all powered by a diverse application of ML methods. Tasks like human-like image
classification performed by machines [2]; the effectiveness of understanding what a
person means when writing a sentence within a context [3]; the ability to distinguish a
face from a car in a picture [4]; and automated medical image analysis [5] are some of
the most outstanding advancements that stem from the mainstream use of modern ML
technology [6]. Inspired by these notable applications, scientists have successfully applied
ML technology to scientific fields, such as matter engineering [7], drug discovery [8] and
protein structure predictions [9].
Influenced by the overwhelming effectiveness of ML technology in other scientific
fields, physicists have also applied such methods to specific subfields within Physics,
with the main objective of making progress in the most challenging research questions.
The motivating reasons for this surge of interest at the intersection between Condensed
Matter Physics (CMP) and ML are vast. For instance, images, language and music
exhibit power-law decaying correlations equivalent to those seen in classical or quantum
many-body systems at their critical points [10]; very large data sets, with high-
dimensional inputs found in materials science [11] are the quintessential problem that
modern ML methods are tailored to deal with. It seems as though ML is well-suited to be applied to CMP research problems given all these deep connections between the two fields, though not without drawbacks. Such is the case of the encoding into a latent space of features that represent the atomistic structure found in soft matter and physical chemistry molecular systems [12], a challenge that is currently being tackled with modern natural language processing and object detection techniques. The challenge of encoding and feature selection, as well as other key challenges of ML applications to CMP, will be discussed in further detail later on.
In this work, we review the contributions of ML methods to Physics with the aim
of providing a starting point for both computer scientists and physicists interested in
getting a better understanding of the interaction between these two research fields. For
easier navigation of the topics covered in this review, the diagram shown in figure 1
displays a schematic representation of the outline, but we shall summarize them briefly
here.
We start the review with a very brief overview of both CMP and ML for the sake of self-containment, to help reduce the gap between both areas. In particular, the overview of CMP is meant to emphasize the difference between its two main branches, i.e. Hard and Soft Matter, because these areas have a fundamental physical difference that will later influence the way ML techniques are applied to their respective types of problems.
We then continue by exploring some of the research inquiries within hard condensed
matter physics. Hard Matter has seen most of the ML applications, for instance in
phase transition detection and critical phenomena for lattice models. We continue with
the review by analyzing how both classical ML and Deep Learning (DL) techniques have
been applied to obtain detailed descriptions of strongly correlated, frustrated and out-
of-equilibrium systems. We investigate how standard simulation methods, like Monte
Carlo, have also seen enhancements from using ML models.
Moving forward, we examine a new research area dubbed physics-inspired ML
theory, an area where the most rigorous and fundamental physics frameworks have
been applied to ML to obtain a deeper insight of the learning mechanisms of ML
models. We discuss how in this engaging research area a robust understanding of
the learning mechanism behind Restricted Boltzmann Machines has been obtained.
We move on to reviewing how common theoretical physics frameworks, such as the
Renormalization Group, have been used with interesting results to give insights into the
learning mechanisms of the most common models in DL.
The review continues with the exploration of Materials modeling, Soft Matter,
and related fields, which recently have seen a strong surge of ML applications, mainly
in the subfields of materials science and colloidal systems, e.g., proteins and complex
molecular structures. Enhanced intelligent modeling, atomistic feature engineering, and ML potentials and free energy surfaces have been the primary topics of research, with the
identification of a phase transition as well as the determination of the phases of matter
being a close second.
Since one has to be aware that every technique has some limitations, the review is closed with some of the main challenges and drawbacks of ML applications to CMP, in addition to some perspectives and outlooks on current challenges.
Before we continue with the review, we would like to address that many other
notable applications within realated fields of CMP have been omitted due to lack of
depth of knowledge in such fields and space in this short review. Such applications
include graph-based ML force fields [13] and the deep learning architecture SchNet used
to model atomistic systems [14]; deep learning methodologies used to predict the ground-
state energy of an electron and effectively solving the Schrodinger equation for various
potentials [15]; the use of ML techniques for many-body quantum states [16, 17]; the
use of crystal graph convolutional neural networks to directly learn material properties
from the connection of atoms in the crystal [18]; the utilization of ML supervised and
unsupervised techniques to approximate density functionals, as well as providing some
insights into how these techniques might achieve such approximations [19, 20]. Lastly, for the reader interested in applications of ML to a wide range of areas within Physics, we would like to suggest the recent review by Carleo et al [21], which discusses in detail some of the most relevant works and research areas.
2. Overview of Machine Learning
In this section, common ML concepts, including those used in this review, are described,
following a specific hierarchy composed of category, problem, model, and method or
architecture. Figure 2 illustrates the relationship between these concepts. The interested
reader can get further information from this partial taxonomy by following the references
provided in the figure caption.
Machine Learning is a research field within Artificial Intelligence aimed at constructing
programs that use example data or past experience to solve a given problem [22]. The
ML algorithms can be divided into categories of supervised, unsupervised, reinforcement
and deep learning [47, 22, 6]: in supervised learning the aim is to learn a mapping from
the input data to an output whose correct values are provided by a supervisor; in
unsupervised learning, there is no such supervisor and the goal is to find the regularities
or useful properties of the structure in the input; in reinforcement learning, the learner
is an agent that takes actions in an environment and receives reward or penalty for
its actions in trying to solve a problem, the output is a sequence of actions, and the
main purpose is to learn from past good sequences to generate a policy that maximizes
a reward; deep learning is a special kind of ML, designed to deal with high-dimensional
environments. DL may be regarded as a transverse category, since it has been used to
overcome the limitations of traditional algorithms of supervised [48], unsupervised [41], and
reinforcement learning [31]. DL is focused on learning nested concepts, e.g., the image of
a person, in terms of simpler concepts, such as the corners, contours or textures within
[Figure 2 is a schematic diagram organizing ML concepts into four levels. Categories: supervised, unsupervised, reinforcement and deep learning. Problems: classification, regression, data imputation, etc.; clustering, novelty detection, ranking, etc.; path tracking, optimal control, games, etc. Models: kernel methods and neural networks. Methods & architectures: SVC, SVR, LS-SVM, SV Clustering, Kernel PCA, SV Ranking, KLSPI, SVM-A2C, KBR, DKNs, DEK, TWSVC, etc.; MLP, LSTM, RBF Networks, AE, RBM, GAN, ART, RAN, CNN, Deep Q-Network, DBM, etc.]
Figure 2. Partial taxonomy of Machine Learning methods and architectures. (i)
A detailed description of other models not presented in the figure, such as graphical
or Bayesian models can be consulted in [22]. (ii) For further information regarding
the ML categories please confer to [6] and [23]. (iii) Examples of methods for
classification, regression and data imputation are discussed in [24], [25], and [26],
respectively. (iv) Examples of methods for clustering, novelty detection and ranking
can be found in [27], [28], and [29], respectively. (v) Examples of methods for path
tracking, optimal control and games are introduced in [30], [31], [32], respectively. (vi)
For further information regarding kernel methods and neural networks models please
confer to [33] and [34], respectively. (vii) KLSPI - Kernel-based least squares policy
iteration [35]. (viii) SVM-A2C - Advantage Actor-Critic with Support Vector Machine
Classification [36]. (ix) KBR - Kernel-based relational reinforcement Learning [37].
(x) DKN - Deep Kernel Networks [38]. (xi) DEK - Deep Embedding Kernel [39].
(xii) TWSVC - Twin Support Vector Clustering [40]. (xiii) Some architectures have
been used in more than one category, for instance, variants of Convolutional Neural
Networks (CNN) were employed both in supervised and unsupervised problems [41],
and Multilayer Perceptrons (MLP) have been used for supervised and reinforcement
learning [33, 42]. (xiv) LSTM - Long short-term memories and RBF - Radial
Basis Function Networks [34]. (xv) AE - Autoencoders and RBM - Restricted
Boltzmann Machines [23]. (xvi) GAN - Generative Adversarial Networks [43]. (xvii)
ART - Adaptive Resonance Theory Networks [44] and RAN - Resource Allocation
Networks [45]. (xviii) Deep Q-Network [32] and DBM - Deep Boltzmann Machines [46].
the image [6]. The DL approach enhances previous ML models such as neural networks
and kernel methods in several aspects, for instance by extending the capability of dealing
with large-scale data, and providing new architectures, operators, and learning methods
for the solution of problems with higher precision [32, 49]. As a result, DL methods are able to deal with huge amounts of heterogeneous information and to learn directly from raw data, without the user-dependent preprocessing step (feature extraction) required by classical ML methods, in which the main features of the raw data are filtered and supplied as valid training vectors.
Although Figure 2 presents several ML concepts, there exist other important models
like Decision Trees, Gaussian Processes, and methods such as boosting, LASSO, ridge
regression, k-means, etc. [50, 51, 52], that are not included in this brief overview in
order to focus on those concepts that frequently appear in the rest of the manuscript.
Specifically, we will restrict ourselves to the description of Neural Networks (feed-forward,
autoencoders, restricted Boltzmann machines and convolutional) and kernel methods
(support vector machines). These methods and architectures are now described, with
the exception of restricted Boltzmann machines that, because of their relevance, are
presented in subsequent sections. Useful references are included for readers interested
in obtaining a deeper insight.
Artificial Neural Networks (ANNs) are models of ML easily found in science
and engineering applications. ANNs attempt to simulate the decision process of
neurons of biological central nervous systems [53]. Many ANNs are computational
graphs [54, 34, 55]. In these graphs, the nodes compute activation functions (sigmoidal,
radial basis, and any other Tauber-Wiener function [56]) according to the input
data provided by the connecting edges. The learning process consists of assigning weights to the edges and updating them according to a learning algorithm, such as backpropagation, Levenberg-Marquardt, Stochastic Gradient Descent, or Natural Gradient Descent [57, 58, 59]. The tasks that ANNs can solve depend on the architecture
that specifies how data is processed and stored. The storage of information can be
either by using an external memory [60, 61] or as an implicit memory codified into the
weights of the network, the latter being the most common approach. The number of existing ANN architectures is vast, but a project that tries to track them can be consulted in Ref. [62].
Multilayer feed-forward Neural Networks (FFNNs) are one of the simplest ANN
architectures, where neurons are organized in layers [63]. The default version of FFNNs
assumes that all neurons in one layer are connected to those of the next layer. Neurons
of the input layer receive the data to be processed and then pass it to the next layer; successive layers feed into one another in the forward direction from input to
output. The key point is that the transfer of information between layers is controlled
by nonlinear activation functions evaluated by the neurons. A neat illustration of this architecture and further details about its variants can be consulted in section 1.2.2 of [34]. Two examples of these networks are the Multilayer Perceptrons and Radial
Basis Function Networks. When the learning process in FFNNs is supervised, in addition to the training data, X ⊂ ℝ^d, provided to the input layer, expected output data, Y ⊂ ℝ, labeling each input vector is required to verify the performance of the network. A learning algorithm is then employed based on the network output until the best approximation to an underlying decision function, f : X → Y, is achieved. FFNNs have been proven to be universal approximators of any Borel measurable function [64]. In addition to function approximation, supervised FFNNs may be
used for classification, prediction, and control tasks [65]. FFNNs have also been applied
to reinforcement learning problems [42, 32, 66].
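As a concrete illustration of this layered picture, the following minimal sketch (in Python/NumPy, with arbitrary layer sizes and randomly initialized, untrained weights; our own construction, not taken from any of the cited works) implements the forward pass of a two-layer FFNN:

import numpy as np

def sigmoid(z):
    # Sigmoidal activation, one of the nonlinear functions mentioned above.
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Each layer computes an affine map (the weighted edges of the
    # computational graph) followed by a nonlinear activation.
    h = sigmoid(W1 @ x + b1)   # hidden layer
    y = sigmoid(W2 @ h + b2)   # output layer
    return y

rng = np.random.default_rng(0)
d, n_hidden, n_out = 4, 8, 1            # illustrative layer sizes
W1, b1 = rng.normal(size=(n_hidden, d)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)
print(forward(rng.normal(size=d), W1, b1, W2, b2))

In supervised training, the weights W1, b1, W2, b2 would be updated, e.g., by backpropagation, so that the output approximates the labels Y.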
Autoencoders (AEs) constitute a different type of NN architecture, considered a
special case of FFNNs, and designed to learn a latent representation of the training
data in order to perform data compression or dimensionality reduction [67, 68]. AEs
consist of two parts: an encoder function—g(x),x ∈ X—, and a decoder that produces
a reconstruction, r = f(g(x)). The main purpose of AEs is to generate an efficient
encoding of the input data instead of learning to copy this data perfectly. Despite the success of ANNs in solving a vast number of applications in both science and engineering [69, 70], this model has been criticized for not providing explicit representations of the learned solutions [71, 72, 73]. Since ANNs are considered black-box models, it becomes difficult to interpret their results, which is especially important in applications such as medicine, business or self-driving cars, where the reliability of the model must be guaranteed [73]. However, new techniques have been developed to
increase the interpretation capabilities of NNs in order to extract new insights from
complex physical, chemical and biological systems [74].
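To make the encoder/decoder structure explicit, here is a minimal sketch of the composition r = f(g(x)); the weights below are random placeholders (an untrained model, purely illustrative) for what a learning algorithm would fit:

import numpy as np

rng = np.random.default_rng(1)
d, k = 16, 2                      # input and latent dimensions (illustrative)
W_enc = rng.normal(size=(k, d))   # encoder weights (would be learned)
W_dec = rng.normal(size=(d, k))   # decoder weights (would be learned)

def g(x):
    # Encoder: compress x into a k-dimensional latent code.
    return np.tanh(W_enc @ x)

def f(h):
    # Decoder: reconstruct the input from the latent code.
    return W_dec @ h

x = rng.normal(size=d)
r = f(g(x))                       # reconstruction r = f(g(x))
# Training would minimize the reconstruction error ||x - r||^2 over many
# samples, e.g. by stochastic gradient descent.
print(np.linalg.norm(x - r))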
Convolutional Neural Networks (CNNs) are DL architectures specialized for
processing data with a grid-like topology (time-series as one-dimensional data or images
as two-dimensional grids of pixels) that use convolution in place of general matrix
multiplication in at least one of their layers. The simplest CNN includes an input
layer that receives the raw data; next, a convolutional layer filters local regions from the input layer's output. The use of convolution brings many benefits: it provides a means of working with inputs of variable size; it has the property of translation invariance; it is more efficient in terms of memory than dense matrix multiplication because of its parameter-sharing property; and feature extraction is performed without human interference. The convolution is performed on two elements, the input data and a
kernel whose values are assigned automatically by the learning algorithm, instead of
being predefined by the user. After convolution, a pooling layer is then used to down-
sample the spatial dimensions; lastly, a fully connected layer computes the class scores,
and an output layer provides the final result [6].
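The convolution-plus-pooling pipeline just described can be sketched in a few lines of NumPy; the kernel values below are random placeholders for what the learning algorithm would assign, and the 8x8 input is an arbitrary stand-in:

import numpy as np

def conv2d(image, kernel):
    # 'Valid' 2D cross-correlation: the same small kernel is slid over the
    # whole image (parameter sharing and translation equivariance).
    H, W = image.shape
    h, w = kernel.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def max_pool(x, size=2):
    # Pooling layer: down-sample the spatial dimensions via local maxima.
    H, W = x.shape
    H, W = H - H % size, W - W % size
    return x[:H, :W].reshape(H // size, size, W // size, size).max(axis=(1, 3))

rng = np.random.default_rng(2)
image = rng.normal(size=(8, 8))       # e.g. a small two-dimensional grid
kernel = rng.normal(size=(3, 3))      # values assigned by the learning algorithm
features = max_pool(np.maximum(conv2d(image, kernel), 0.0))  # ReLU + pooling
print(features.shape)

A fully connected layer acting on the flattened feature map would then compute the class scores.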
Support Vector Machines (SVMs) are explicit ML models that belong to the family
of kernel methods and emerged as an alternative to ANNs [24]. Both ANNs and
SVMs have the ability to generalize and may be designed to solve practically the same
tasks (classification, regression, density estimations, clustering, among others) [75, 76].
However, while ANNs were inspired by the biological analogy to the brain, SVMs were
inspired by statistical learning theory [77]. Because of their solid theoretical basis, SVMs can provide explicit solutions with measurable generalization capability by conducting structural risk minimization; by solving a convex quadratic programming problem, the dreaded problem of local minima that permeates ANNs is avoided; and finally, the model can represent the learned solution based on few parameters [78, 79]. However, SVMs are limited in the size of the datasets they can handle, since training a standard SVM has a time complexity between O(N²) and O(N³), with N the number of input vectors; furthermore, the associated Gram matrix of size N × N must be allocated [80, 81]. This limitation applies to general-purpose SVM solvers like LIBSVM [82]. Different approaches have been followed to deal with large-scale datasets. For instance, if the kernel function can be limited to the linear one (as in high-dimensional sparse spaces),
then efficient solvers like LIBLINEAR [83] or PEGASOS [84] can be implemented to
solve problems with hundreds of thousands of input vectors in seconds. Other strategies
like sampling, boosting or hierarchical training have been used to speed up the SVM
training with non-linear kernels [81, 85]. For Deep Kernel Learning methods, where a
concatenation of several kernel functions can be reformulated as neural networks [86],
some strategies focus on designing maps in the Reproducing Kernel Hilbert Space to reduce the complexity of evaluating Deep Kernel Networks, which scales quadratically in N and linearly in the depth of the trained networks [38]. Based on these limitations
and proposed solutions, an application-based analysis should be done to choose the
correct set of SVM algorithms or related kernel methods.
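As a hedged illustration of this choice, the sketch below (scikit-learn on synthetic data; names and sizes are arbitrary) contrasts a non-linear RBF SVM, whose training scales roughly between O(N²) and O(N³), with a linear solver backed by LIBLINEAR, which is better suited to large-scale problems:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Non-linear (RBF) kernel: suited to moderately sized data sets.
clf_rbf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

# Linear kernel via an efficient solver (LIBLINEAR), appropriate for
# large-scale or high-dimensional sparse problems.
clf_lin = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

print(clf_rbf.n_support_.sum(), "support vectors (RBF)")
print(clf_lin.score(X, y), "training accuracy (linear)")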
3. Overview of Condensed Matter Physics
Understanding the intricate phenomena of all types of matter and their properties is
the driving force of condensed matter physicists. From quantum theory all the way to
materials science, CMP encompasses a large diversity of subfields within physics, as it
deals with different time and length scales depending on both the molecular details and
the type of matter being analyzed. To study such systems, theoretical, computational
and experimental physicists collaborate to gain a deeper insight into the behavior of matter in and out of equilibrium. We shall briefly talk about the two main areas within CMP, namely, hard matter and soft matter, as well as the essential models and the central problems these research areas focus on. More information about these research areas can be found in standard textbooks, see, e.g., [87, 88, 89], and reviews [90, 91].
3.1. Soft Condensed Matter
Fluids, liquid crystals, polymers, colloids, and many more are the principal types of
matter that are studied by Soft Condensed Matter, or Soft Matter (SM). SM has the peculiar property of responding easily to external perturbations: a small shear force or thermal fluctuations can change its flow and, with it, its properties as a whole. We are surrounded by these materials in our everyday life, such as glass windows, toothpaste and hair-styling gel. These
materials can show diverse properties, and when some external force is applied they will exhibit a different set of properties; in all cases, SM shows the fascinating property of self-organization of its molecules [92]. Proteins and bio-molecules are also good examples of SM: they change their structure when subjected to thermal fluctuations, while also showing a pattern of self-assembly [93, 94], as is the case in most types of SM. All in
all, SM is a subfield of CMP that gained a lot of attention when first introduced in the
1970s by Nobel prize laureate Pierre-Gilles de Gennes [95], because it has proven to be of
great importance to science in general, with very useful applications, as well as pushing
the boundaries in theoretical, experimental and computational Physics.
In order to explore all these properties and changes in a SM system, a model
that describes most of these types of matter is needed. Most fluids and colloidal
dispersions show a repulsive behavior, which can be modeled very well with the hard-
sphere model [96]. A system modeled as a hard-sphere dispersion shows an infinite repulsive interaction when the interparticle separation is less than the particle diameter, and no interaction otherwise. There is no attractive interaction included in this model. The hard-sphere model is a very simple one, yet it exhibits intriguing and unexpected equilibrium and non-equilibrium states [97, 98, 99]. There are some
interesting properties that define the hard-sphere model in SM. First of all, the internal
energy of any allowed system configuration is always zero. Furthermore, forces between
particles and variations in the free energy—defined from classical thermodynamics
as F = E − TS, with E being the internal energy, T the temperature and S the
entropy—are both determined entirely by the entropy. Moreover, the entropy in a
hard-sphere system depends directly on the total volume occupied by the hard spheres,
commonly defined as the volume fraction, φ. When φ is small, the system shows
the properties of an ideal gas, but as φ starts to increase, particles interact with each
other and their motion is restricted by collisions with other nearby particles. Even more
interesting is the fact that the phase transitions of hard-sphere systems are completely
defined by φ [100]. In general, a SM system does not need to be composed only of hard spheres; it can also be built with non-spherical molecules, for example, rods, spherocylinders (cylinders with hemispherical caps) and hard disks, just to mention a few examples. Nevertheless, all these systems experience a large diversity of phase
transitions [101, 102, 103].
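The hard-sphere rule is simple enough to state in code. The following sketch (our own construction, assuming a cubic periodic box and the minimum-image convention) computes the volume fraction φ and checks whether a configuration is allowed, i.e., overlap-free:

import numpy as np

def volume_fraction(n_particles, diameter, box_length):
    # phi = N * (pi/6) * sigma^3 / V for monodisperse hard spheres.
    v_sphere = np.pi * diameter**3 / 6.0
    return n_particles * v_sphere / box_length**3

def is_valid_configuration(positions, diameter, box_length):
    # Hard-sphere rule: a forbidden (infinite-energy) state if any pair is
    # closer than one diameter; zero interaction otherwise.
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            d = positions[i] - positions[j]
            d -= box_length * np.round(d / box_length)  # minimum image
            if np.dot(d, d) < diameter**2:
                return False
    return True

rng = np.random.default_rng(3)
pos = rng.uniform(0.0, 10.0, size=(50, 3))
print(volume_fraction(50, 1.0, 10.0))          # dilute, gas-like regime
print(is_valid_configuration(pos, 1.0, 10.0))  # random placements may overlap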
The description of phase transitions and critical phenomena is one of the main challenges in all of CMP. When a thermodynamic system changes its uniform physical properties to another set of properties, the system encounters a phase transition.
In SM, phase transitions are one of the most important phenomena that can be analyzed
given the diversity of phases that could appear in unexpected circumstances [104, 105].
In particular, a gas-liquid phase transition is defined by a temperature, known as the critical temperature, Tc, and a density, called the critical density, ρc. These quantities are also known as the critical point of the system, meaning that two phases can coexist when the system is held at these special values. When neither of these thermodynamical quantities can be employed, an order parameter can be used to determine the phase transitions of a system. Order parameters are special quantities that measure the degree of order across a phase transition; they take different values in the different phases of the system [106]. For instance, when studying
the liquid-gas transition in a fluid, a specified order parameter can take the value of
zero in the gaseous phase, but it will take a non-zero value in the liquid state. In this
scenario, the density difference between both phases (ρl − ρg)—where ρl is the density
in the liquid state and ρg in the gas state—can be a useful order parameter.
When traversing the phase diagram of a SM system, one might encounter the
so-called topological defects. A topological defect in ordered systems happens when
the order parameter space is continuous everywhere except at isolated points, lines or
surfaces [107]. This can be the case of a system which is stuck in-between two phases,
and there is no continuous distortion or movement from the particles that can return the
system to an ordered phase. One such example is that of nematic liquid crystals [108].
The nematic phase is observed in a hard-rod system, when all the rods align in a
certain direction. Hard rods are allowed to move and to rotate freely. Furthermore, the
neighboring hard rods tend to align in the same direction. But if it so happens that
some hard rods are aligned in one direction and the rest in another direction, then the
system contains topological defects [109]. These defects depend on the symmetry of the
order parameter of the phases, as well as the topological properties of the space in which
the transformation is happening. Topological defects are of the utmost importance in
a large variety of systems in SM, e.g., they determine the strength of the liquid crystal
mentioned before.
Nevertheless, these types of systems and problems are not so easily solved. How
can we tell when a system will be prone to topological defects? Is it possible to
determine the phases that a system will come across given its configuration? Even
more challenging questions keep arising when these types of exotic matter appear, and theoretical, experimental or even computational frameworks alone are not always enough to tackle them. Physicists are always open to new ways to approach such challenges, and ML is slowly becoming a powerful tool to deal with these complex systems; in this review we shall discuss some of the ways ML has been applied to such problems, as well as comment on some of the issues that can arise from using such techniques.
However, it is important to note that SM also experiences thermodynamic
transitions to solid-like phases. For example, in the case of the hard-sphere model,
when the packing fraction φ approaches the value φ ≈ 0.492 [110, 111], it undergoes a first-order liquid-solid transition [112, 113]. This thermodynamic behavior can also be studied using theoretical and experimental techniques employed when one deals with Hard Matter. Thus, at some point, SM also lies at the boundary with Hard Matter.
3.2. Hard Condensed Matter
The properties of materials that show structural rigidity, such as solids, metals, insulators or semiconductors, are the main focus of Hard Condensed Matter, or Hard Matter (HM). Furthermore, in contrast to SM systems, HM typically deals with systems on smaller scales, i.e., matter that is governed by atomic and quantum-mechanical interactions. Entropy no longer determines the free energy of a HM system; the internal energy now does.
One particular example of interest in HM is the phenomenon of superconductivity. A
material is considered a type I superconductor if all electrical resistance is absent and
it is a perfect diamagnet [114] —i.e., all magnetic flux fields are excluded from the
system—; a type II superconductor is not completely diamagnetic because it does not exhibit a complete Meissner effect [115]. The phenomenon of superconductivity is
described by both quantum and statistical mechanical interactions. This means that
superconductivity can be seen as a macroscopic manifestation of the laws of quantum
mechanics. As was the case with SM, we are also surrounded by HM materials, for
instance, semiconductors are the principal components in microelectronics [116], which
in turn are the components of every digital device we use, ranging from computers to
smart cellphones. The applications that stem from HM are too great to enumerate in
this short review, and the theoretical, experimental and computational frameworks that
have derived from HM are nothing short of revolutionary; we refer the reader interested in these applications to a technical perspective article [117], and to a more thorough book on some of the modern aspects of HM, as well
as its applications to technology [118].
In HM, we also need a reference model that can help explain at the basic level
some of the investigated materials. Many other materials can use a similar reference
model, but with different assumptions and physical quantities. Such is the case of lattice
systems, and, in particular, the Ising model [119]. The Ising model is the simplest
model that can explain ferromagnetic properties, in addition to accounting for quantum
interactions. It is defined on a lattice, much as if it were a crystalline solid. On each site of the lattice there is a discrete variable σk that takes one of two values, σk ∈ {−1, +1}. The energy of a configuration is given by the following Hamiltonian

H(σ) = −∑〈i,j〉 Jij σi σj − µ ∑j hj σj, (1)
where Jij is a pairwise interaction between lattice sites; the notation 〈i, j〉 indicates
that i and j are lattice site neighbors; µ is the magnetic moment and hj is an external
magnetic field. As originally solved by Ising in 1925 [120], when the system lies on a one-dimensional lattice it shows no phase transition. Later, Lars Onsager proved in 1944 [121] that when the Ising model lies on a two-dimensional lattice a continuous phase
transition is observed. For dimensions larger than two, different theoretical [122, 123]
and computational [124] frameworks have been proposed.
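To make Eq. (1) concrete, the following minimal sketch (our own construction, assuming uniform nearest-neighbor coupling J, uniform field h, and periodic boundaries on a square lattice) evaluates the energy of a configuration and performs one single-spin-flip Metropolis update of the kind used in the Monte Carlo simulations discussed later:

import numpy as np

def ising_energy(spins, J=1.0, h=0.0, mu=1.0):
    # Eq. (1) for a square lattice with uniform J and h and periodic
    # boundaries; each bond is counted once via right and down neighbors.
    interaction = (np.sum(spins * np.roll(spins, 1, axis=0))
                   + np.sum(spins * np.roll(spins, 1, axis=1)))
    return -J * interaction - mu * h * np.sum(spins)

def metropolis_step(spins, beta, J=1.0):
    # One single-spin-flip Metropolis update at inverse temperature beta.
    L = spins.shape[0]
    i, j = np.random.randint(L, size=2)
    neighbors = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                 + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
    dE = 2.0 * J * spins[i, j] * neighbors   # energy cost of flipping (i, j)
    if dE <= 0 or np.random.random() < np.exp(-beta * dE):
        spins[i, j] *= -1
    return spins

spins = np.random.choice([-1, 1], size=(16, 16))
print(ising_energy(spins))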
While the Ising model has seen some very useful applications, it is not the only
model that has shaped the research interests within CMP, as it cannot possibly contain and explain all phenomena seen in HM. Generalized models, such as the n-vector model [125] or the Potts model [126], are extensively studied models in HM that have been successfully applied to explain even more intricate phenomena [127, 128].
In general, all these models show phase transitions, in the same way as SM systems.
This research area grew even more when the Berezinskii-Kosterlitz-Thouless (BKT)
phase transition was discovered [129, 130, 131, 132]. The BKT transition was unveiled
on two-dimensional lattice systems, such as the XY model—which is a special case of
the n-vector model—. The BKT transition consists of a phase transition that is driven
by topological excitations and does not break any symmetries in the system [133]. This
type of phase transition was quite abstract in the 1970s when first introduced, but it turned out to be a great theoretical framework, and the transition was later observed in exotic materials [134, 135], which in turn showed puzzling properties. Such was the impact of the BKT transition in CMP that its discovery was recognized with the Nobel Prize in Physics in 2016 [136].
However, the BKT transition is a complicated transition to study. Hard condensed
matter physicists are always trying to find a better way to look for these kinds of unique aspects in materials. It is an elusive and complicated challenge. In the same spirit as in SM, ML has already been put to the test on this task: if it is possible to re-formulate the problem at hand as a ML problem, it can then be solved by standard ML techniques. But, as mentioned before, the task of re-formulation and feature selection is not simple; there is currently no standard way to approach it in CMP applications, contrary to most ML models and applications. We shall examine these applications and their potential challenges as well as drawbacks in more detail in the following sections.
4. Hard Matter and Machine Learning
In this section, we shall review the main techniques and ML models applied to CMP,
mainly to HM. These applications might constitute a paradigm of ML applications to CMP due to the generally good results obtained. The structure of HM data is similar to the one found in computer vision applications in ML, among other applications. Furthermore, these schemes have also exposed most of the major shortcomings of such approaches, such as feature selection and the training and tuning of ML models.
4.1. Phase transitions and critical parameters
One of the most prominent uses of NNs was on the two-dimensional ferromagnetic Ising model, which has a Hamiltonian similar to Eq. (1), with the aim of identifying the critical temperature, Tc, at which the system leaves an ordered phase—the ferromagnetic
phase—and enters a disordered phase—the paramagnetic phase. Carrasquilla and
Melko [137] introduced a new paradigm by essentially formulating the problem of
phase transitions as a supervised classification problem, where the ferromagnetic phase
constitutes one class and the paramagnetic phase is another one, then a ML algorithm
is used to discriminate between both of them under specific conditions. By sampling
configurations using standard Monte Carlo simulations for different system sizes, and
then feeding this raw data to a FFNN, an estimate of the correlation-length critical
exponent ν is obtained directly from the output of the last layer of the trained NN.
Finite-size scaling theory [138] is then employed to obtain an estimate of Tc. Even if the geometry of the lattice is modified, the power of abstraction provided by the NN [34] is enough to predict the critical parameters of a geometry it was never trained on; this makes it a very convenient tool to identify critical parameters for systems that do not have a defined order parameter [139].
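The supervised formulation itself is compact enough to sketch. In the example below, crude toy configurations (aligned spins with random flips) stand in for proper Monte Carlo samples of the two phases; this is a schematic of the workflow, not the setup of [137]:

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
L, n_per_class = 8, 200

def toy_configs(flip_prob):
    # Stand-in for Monte Carlo sampled configurations: start from the fully
    # aligned state and flip each spin independently with probability
    # flip_prob (a crude proxy, not real Ising sampling).
    base = np.ones((n_per_class, L * L))
    flips = rng.random(base.shape) < flip_prob
    return np.where(flips, -base, base)

X = np.vstack([toy_configs(0.05),    # "ferromagnetic-like" class
               toy_configs(0.5)])    # "paramagnetic-like" class
y = np.array([0] * n_per_class + [1] * n_per_class)

# Raw spin configurations are fed directly to a small FFNN classifier,
# following the spirit of the supervised formulation described above.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
print(clf.score(X, y))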
4.2. Topological phases of matter
Having shown that ML is able to identify phases of matter in simple models, such as those with a Hamiltonian similar to Eq. (1), researchers have become interested in trying out these algorithms on more complex systems, and a lot of work has been put into the topic of identifying topological phases of matter using ML algorithms
for strongly correlated topological systems [140, 141, 142, 143, 144, 145, 146, 147, 148,
149, 150]. For instance, Zhang et al [151] have applied FFNNs as well as CNNs to
find the BKT phase transition, as well as constructing a complete phase diagram of the
XY and generalized XY models, with very promising results. In the case of the FFNN
approach, Zhang et al found out that a simple NN architecture consisting of one input
layer, one hidden layer with four neurons and one output layer is more than enough to
obtain the critical exponent ν for the XY model. We can compare this to the results
obtained by Kim and Kim [152], where they explain a similar behavior but for the Ising
model. In their work, analyzing a tractable model of a NN, they found out that a
NN architecture consisting of one input layer, one hidden layer with two neurons and
one output layer is sufficient to obtain the critical exponent ν for the Ising model. If a shallow architecture suffices to obtain good approximations for the critical exponent ν then, perhaps, such models can be further simplified into more robust and analytically tractable ones. Such is the case of the Support Vector Machine methodology developed by Giannetti et al [153].
It is important to keep track of the model complexity, especially for complicated models, to avoid overfitting and to reduce the amount of data needed to train the model. In the case of the NN methodologies presented before, we can argue that such models are easier to train if there are enough data samples in the data set; when this is not the case, overfitting will occur, yielding low-accuracy results. NNs are also faster to train because these models can be implemented on accelerated computing platforms, such as Graphical Processing Units (GPUs). On the other hand, SVMs cannot be readily accelerated with such techniques, but they are less prone to overfitting: when the support vectors—the subset of the data set needed to construct the decision function and thus make a prediction—are fewer than the total number of samples, the model complexity is low enough that a smaller data set might be enough to obtain desirable approximations. Still, SVMs need to be adjusted according to the task,
so hyperparameter tuning must always be part of the data processing pipeline, which in
turn might take longer to obtain results. If this is the case, then simpler methods—such
as logistic regression or kernel ridge regression [154]—could potentially be better suited
for the task of approximating critical exponents due to the fact that these models are
simple to implement, need fewer data samples and most implementations are numerically
stable and robust; at the very least these simple models could potentially provide a
starting point for such approximations.
New and interesting methodologies are always arising in order to solve the problem
of phase transitions. Shiina et al [155] developed a generalized version of the
methodology proposed by Carrasquilla and Melko [137] for a variety of models, namely,
the Ising model, the q-state Potts model [156] and the q-state clock model. Instead of
using the raw data obtained from Monte Carlo simulations as proposed in [137], Shiina et al considered the configuration of a specific long-range spatial correlation function
for each model, effectively exploiting feature engineering, hence providing meaningful
data to the NN, which now contains relevant information from the systems at hand.
This strategy has the advantage of generalization for the already trained NN, thus it
can be readily applied to similar systems without further manipulation of the physical
model or the production of new simulation data. The strategy is also quite general in the sense that it can be applied to multi-component systems, with similar precision to standard theoretical methods.
Finally, we remark on some of the advances in topological electronic phases. The most
fundamental quantity to characterize these states is the so called topological invariant,
whose value determines the topological class of the system. In particular, interfaces
between systems with different topological invariants show topologically protected
excitations, resilient towards perturbations respecting the symmetry class of the system.
Mapping topological invariants using NNs, as performed in [157], shows that supervised
learning is an efficient methodology to characterize the local topology of a system. DL
has also seen some useful applications to these types of systems, e.g., for random quantum
systems [158, 159, 160].
4.3. Classical and modern ML approaches
The success that ML has seen in the area of Lattice Field Theory and Spin Systems has given it a special place in the toolbox of physicists, providing a different perspective on a given problem based on different ML models, each with its own advantages and drawbacks. It is especially important to identify the tools that have seen the most use in this topic, the main one being FFNNs, whose generalization and abstraction capabilities have made them the workhorse of most of the applications in this field—particularly when the problem can be formulated as a supervised learning problem.
For more classical approaches, SVMs have also been implemented on the phase
transition problem [148, 153] with promising results given their ease of use and physical
interpretation of the results, which we shall discuss in detail in section 5.4. It is useful
to compare the SVM method against the NN one presented in the previous subsection, especially given the fact that both techniques can yield very similar results; equivalently, we can ask ourselves: when should we use one technique or the other? As mentioned in the previous subsection, SVMs are a great tool when the number of samples in the data set is low; this is because SVMs only need a small subset, i.e., the support vectors, to effectively predict some value. The main disadvantage of SVMs is their training time complexity, which depends entirely on the implementation used and the data set. For example, the most popular implementation of a SVM solver is LIBSVM [82], which scales between O(M×N²) and O(M×N³), where M is the number of features and N is the number of samples in the data set. SVMs depend strongly on the hyperparameters used, so an incorrectly tuned SVM could take O(N³) time to train in the worst-case scenario, while also yielding poor results. On the other hand, if the user is careful in tuning and selecting a good kernel, as presented in [153], one can solve a CMP problem with very high precision; in this case, some Physics insight into the problem can be helpful in providing good parameters for the SVM.
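In practice, this tuning step is commonly automated with a cross-validated grid search. The sketch below (scikit-learn on synthetic data, with an arbitrary parameter grid) illustrates the idea; physical insight can be used to narrow the ranges of C and gamma further:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# An incorrectly tuned SVM can be both slow and inaccurate, so a grid
# search with cross-validation over C and gamma is a sensible baseline.
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)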
If we now turn our attention to NNs and their training time complexity, it normally scales as O(N³), provided the NN uses an efficient linear algebra and matrix multiplication implementation. If we also account for the backpropagation operations, which scale linearly with the number of layers in the network, we can see that NNs are in principle significantly more difficult to train than SVMs. Research is being carried out to understand why NNs are hard to train in theory and yet efficiently trained in practice [161, 162]. Part of the reason is that modern numerical linear algebra implementations are efficient and can be accelerated with modern computing technologies, such as GPUs. This makes NNs a great tool when the number of data samples is large, while training can also happen online, i.e., feeding the network with newly created samples. In short, both techniques give similar results provided there are enough representative data samples from the problem. A simple criterion for choosing between both methodologies depends greatly on the data set and the problem, but starting off with SVMs can give a good approximation; in the case of needing even more precision, or if the data set contains a large number of features, NNs might be a better choice for the problem at hand.
Many other classical techniques—specifically the ones that stem from unsupervised learning methods—have become prominent algorithms. For instance, one such technique is the nonlinear dimensionality reduction algorithm t-distributed Stochastic Neighbor Embedding (t-SNE) [163], which Zhang et al [151] used to map high-dimensional features onto lower-dimensional spaces in order to identify three distinct phases in the site percolation model. In this methodology, the idea is to take different site configurations, each handled as a single array of values from the lattice, and to use t-SNE to map this high-dimensional space into a two-dimensional one, such that when enough iterations are performed three distinct clusters appear: one for each ordered phase (the percolating and non-percolating phases), and one for the transition region between them. Although t-SNE is mostly a visualization tool, by employing other unsupervised learning techniques, such as k-means [154] or spectral clustering [164], together with the results obtained from t-SNE, one could possibly obtain a good approximation of the transition point in the percolation model.
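The combination suggested above, t-SNE followed by a clustering step, can be sketched as follows; the random arrays below merely stand in for percolation configurations (occupied = 1, empty = 0) at three occupation probabilities and are not real simulation data:

import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Stand-in for site-percolation configurations flattened to 1D arrays,
# generated at three different occupation probabilities.
configs = np.vstack([rng.random((100, 256)) < p for p in (0.3, 0.59, 0.8)])

# Map the high-dimensional configurations to two dimensions with t-SNE,
# then look for clusters; here k-means plays the role of the follow-up
# unsupervised step suggested in the text.
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(configs.astype(float))
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
print(np.bincount(labels))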
Another prime example is Principal Component Analysis (PCA) [165, 154], which has proved to be one of the best feature extraction algorithms that can be used in Spin Systems [166, 167], as it can be readily applied to raw configurations to extract information from the system and build different features, which can then be compared to physical observables like structure factors and order parameters. For instance, Wei [167] established a methodology to build a matrix M with different site configurations from the Ising model—using a simplified version of the Hamiltonian from Eq. (1)—and then perform PCA on M. This yields a set of new component vectors that describe the high-dimensional space of M in a new, lower-dimensional space. These new vectors are used as a basis set onto which the original samples are projected, and it is in this new space that clustering techniques like k-means can be used in order to find a phase transition.
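A minimal sketch of this PCA workflow, with crude stand-ins for the Ising configurations rather than genuine Monte Carlo samples, could look as follows:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
L, n = 16, 300

# Stand-in for Ising configurations: "ordered" samples biased towards +1
# or -1, and "disordered" samples of random spins, each flattened into a
# row of the matrix M.
ordered = np.sign(rng.normal(loc=rng.choice([-3, 3], size=(n, 1)),
                             size=(n, L * L)))
disordered = rng.choice([-1, 1], size=(n, L * L))
M = np.vstack([ordered, disordered]).astype(float)

pca = PCA(n_components=2).fit(M)
projected = pca.transform(M)
# For the Ising model the leading principal component is known to
# correlate with the magnetization, the physical order parameter.
magnetization = M.mean(axis=1)
print(np.corrcoef(projected[:, 0], magnetization)[0, 1])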
DL has also been wielded as a powerful technique to understand quantum phase
transitions [168] in the form of adversarial NNs [169]; to provide a thorough description of
many-body quantum systems [170] by employing Variational Autoencoders [171, 6]; and
CNNs [6] have been applied, for instance, to learn quantum topological invariants [172],
or to solve frustrated many-body models [173]. One of the main reasons for DL to
be such a strong asset is that it can handle complex data representations and achieve
good accuracy results, provided these models have enough data to learn from. The main drawback of these models is precisely the fact that the data must be representative of the task at hand; the data must also preserve the invariants of the system, if the system has them, such as invariance to translations and rotations. These types of issues are mostly solved with data augmentation techniques, such as adding some white noise to the data samples or creating additional mirrored samples, although other such modifications can be done. But again, this implies that the user of such models must know the problem well enough to perform such adjustments for the sake of stable training and good results.
4.4. Enhanced Simulation Methods
One of the standard computer simulation techniques in HM, along with SM, is the Monte Carlo (MC) method, an unbiased probabilistic method that enables the sampling of configurations from a classical or quantum many-body system [174, 175, 176, 177]. Throughout the decades, specific modifications have been performed to the MC method in order to enhance it, creating better, more rigorous methods that can be used in all types of simulations, from observing phase transitions to computing quantum path integrals.
It is in this situation that ML has something to offer in order to improve the original
MC method; such is the case of the Self-learning Monte Carlo (SLMC) method developed
by Li et al [178]. In this novel scheme, the classical MC method is used to obtain configurations from a Hamiltonian similar to Eq. (1), which will serve as the training data; we call this set of configurations s, drawn from a state space S, where each sample has a probability density given by some function p: S → ℝ+. Then a ML method is employed—linear regression, in this case—to learn the rules that guide the configuration updates in the simulation, i.e., a parametrized effective model wθ that approximates the original model w, with wθ: S̃ → ℝ+ and S̃ ⊇ S, such that wθ ≈ w. This
enables the MC method to learn by itself under which conditions a given configuration is
updated without external manipulation or guidance, efficiently sampling configurations
for the most general possible model. Demonstrated on a generalized version of the
Hamiltonian given by Eq. (1), the SLMC method is extremely efficient when producing
configurations from the system, lowering the autocorrelation time of each of the updates. Li et al discussed that, although a ten-fold speedup over the original MC method is obtained, the SLMC still suffers from the quintessential problem of the original MC method: when approaching the thermodynamic limit, the method slows down heavily, rendering it useless. Almost simultaneously, Huang and Wang [179] developed
a similar scheme but using restricted Boltzmann Machines (RBMs). This interesting
approach consists of building special types of RBMs that can use the physical properties
of the system being studied. The strategy by Huang and Wang lowers the autocorrelation time and raises the acceptance ratio of sampled configurations. The SLMC
method has sparked interesting new research [180, 181, 182, 183, 184]; recently, Nagai,
Okumura and Tanaka [185] have shown that the SLMC is even more promising by
coupling it with Behler-Parrinello NNs [186], making it more general and less biased
towards specific physical systems.
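The model-fitting step at the heart of SLMC can be illustrated in a few lines. The sketch below (our own construction, not the implementation of [178]) fits a single effective nearest-neighbor coupling by linear regression to the energies of a synthetic "original" model:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
L, n_samples = 8, 500

def nn_correlator(spins):
    # Feature for the effective model: the nearest-neighbor sum
    # sum_<ij> s_i s_j on a periodic square lattice.
    return (np.sum(spins * np.roll(spins, 1, axis=0))
            + np.sum(spins * np.roll(spins, 1, axis=1)))

# Training data: configurations and their energies under the "original"
# model, here faked as a nearest-neighbor part plus a small perturbation,
# standing in for a more complicated Hamiltonian.
configs = rng.choice([-1, 1], size=(n_samples, L, L))
features = np.array([[nn_correlator(s)] for s in configs])
E_original = -1.0 * features[:, 0] + 0.1 * rng.normal(size=n_samples)

# Fit J_eff so that E_eff = -J_eff * sum_<ij> s_i s_j approximates the
# original energies; cheap updates are then proposed with the effective
# model and corrected against the original one.
reg = LinearRegression().fit(-features, E_original)
print("J_eff =", reg.coef_[0])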
Despite its advantages, the SLMC method still has some shortcomings and
drawbacks. As with every other ML model, good and representative training data is required in order to train the model; the more complex the model, i.e., the more parameters it contains to be fitted, the more data samples are needed to avoid overfitting. This might become a problem if the original MC method used to sample configurations is already slow. Another important drawback is that the ML model needed to approximate the probability density of the configuration set must be robust enough to handle the problem at hand, so some intervention from the user is required in order to choose a good model; this is not a simple task because it would imply that the dynamics of the system must be known from the start, which might remove all advantages of using ML models for automated computation of the problem. To alleviate
some of these disadvantages, Bojesen [187] proposed a reinforcement learning approach,
defined as the Policy-guided Monte Carlo (PGMC) method. In this methodology,
the focus is now on an automatic approach to target the proposal generation for each configuration, which can be done by tuning the policy for making updates. Showing promising results in some models, the PGMC proves to be a better candidate for automatic MC sampling, although it also has its own shortcomings, primarily the fact that if the available information is inadequate, the training might become unstable and the probability density will be biased, ultimately breaking ergodicity.
Nevertheless, these methods prove that ML models can be used in CMP to obtain some gains in computational efficiency and simplicity, making them a potentially good tool for more challenging problems.
4.5. Outlook
Current research is geared towards enhancing mainstream methodologies in CMP with ML, enabling physicists to obtain high-precision results while also providing some insight into the learning mechanisms used by the ML algorithms employed. Another current area of research in Spin Systems is that of figuring out which ML techniques are more suitable to understand out-of-equilibrium systems [188] or even more complex frustrated models [189], just to name a few. Challenges such as representative data set generation and feature selection must also be overcome within this area if we wish to see these methodologies become more popular. This matter is even more important for out-of-equilibrium systems, because it is not a simple task to generate useful training data sets.
5. Physics-inspired Machine Learning Theory
5.1. Explanation of Deep Learning Theory
The gain in knowledge between ML and CMP is not as one-sided as one might think. ML has also benefited greatly from the thorough and rigorous theoretical frameworks that CMP possesses. Despite their huge success in human-like tasks—e.g., the classification of images [2]—, ML, and more importantly DL, models are not always able to explain why this success is even possible in the first place, because these models might not reach the algorithmic transparency of other classical ML models, e.g., linear models. Schmidt et al [11] mention that representative datasets, a good knowledge of the training process, and a comprehensive validation of the model can usually overcome this obstacle. Other classical algorithms, such as MC computations, are not always analytically tractable, so we might argue that this problem is not unique to ML models.
With this in mind, an interpretation of the models used can be done to some extent,
and even more so with the help of some theoretical frameworks, such as those found
within Physics.
In 2016, Zhang et al [190] took it upon themselves to run detailed experiments
that showed the lack of complete understanding of the current learning theory from
a theoretical perspective; on the practical side of things, DL is still very successful in its results. Since then, physicists have tried to give insights [191] into this topic,
creating a fruitful research area with extremely interesting cross-fertilization between
both communities. For a more thorough and complete analysis on the explanation and
interpretability of DL contributions until now, the reader can refer to the recent review
on the topic by Fan, Xiong and Wang [192].
5.2. Renormalization Group
The lack of theoretical understanding of the learning theory in most ML algorithms seemed like a good challenge for condensed matter physicists, who are experienced in manipulating theoretical frameworks such as the Renormalization Group (RG) [122, 123]. RG consists of a mathematical framework that allows a very rigorous and systematic investigation of a many-body system by working with short-distance fluctuations and building up this information to understand long-distance interactions of the whole system. In 2014, Mehta and Schwab [193] provided such an insight
by creating a mapping between the Variational Renormalization Group and the DL
workflow that is common nowadays. This turned out to be an interesting take on the
problem, resulting in a very rigorous framework by which one could explain the learning
theory that Deep NNs follow when data is fed to them. In the same spirit, Li and
Wang [194] proposed a new scheme, this time in a more systematic way, by means of
hierarchical transformations of the data from an abstract space—the space of features
present in the data—to a latent space. It turns out that this approach has an exact
and tractable likelihood, facilitating unbiased training with a step-by-step scheme; they
proved this through a practical example using Monte Carlo configurations sampled from
the one-dimensional and two-dimensional Ising models. More research regarding a deep
link between RG and DL has been carried out since then, for instance, Koch-Janusz and
Ringel [195] showed that even though the association between RG and DL is strong, most of the practical methodology needed to achieve good results from it is not so readily available. By using feed-forward NNs, they proved that an RG flow can be obtained without discarding the important degrees of freedom that are sometimes integrated out when following the RG framework. This seems like a promising and enriching area of research
as new insights are constantly developed [196, 197, 198].
5.3. Restricted Boltzmann Machines
Instead of giving a complete explanation of a learning theory, CMP has been able to give thorough explanations of specific ML algorithms; such is the case of the Restricted Boltzmann Machine (RBM) [199]. This type of model is a very interesting case of the inspiration that ML has drawn from statistical physics. In the RBM, a NN is trained on a given data set; the model learns complex internal representations of the data and, once trained, outputs an approximation of the probability distribution of the input data. The resulting approximate probability distribution is then linked to a certain ML task, be it classification [200], dimensionality reduction [201], feature selection/engineering [202], or many others. The underlying structure of the model is not a conventional FFNN; instead, it is a type of model built on a graph [203]. The principal mechanism by which the RBM learns features is by computing an energy function E(h, v) from certain hidden (h) and visible (v) units that form a configuration; after computing this energy function one evaluates the Boltzmann factor of the energy to obtain a probability for the given configuration, P(h, v) = e^{-E(h,v)}/Z. From classical statistical mechanics we know that the partition function Z = \sum_{h,v} e^{-E(h,v)} is the normalizing constant needed to ensure that P(h, v) is a proper probability distribution. It is this link between the Boltzmann description and the energy function of the RBM that makes it an ideal candidate to be analyzed within the statistical mechanics framework.
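To make this link concrete, the following is a minimal NumPy sketch (not a trained model) of the energy function and Boltzmann weights of a toy RBM with binary units; the sizes, couplings, and configuration are placeholder values chosen so that the partition function can be enumerated exactly.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 4, 3          # tiny sizes so Z can be enumerated exactly
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # couplings
a = np.zeros(n_visible)             # visible biases
b = np.zeros(n_hidden)              # hidden biases

def energy(v, h):
    """RBM energy E(h, v) = -a.v - b.h - v.W.h for binary units."""
    return -(a @ v + b @ h + v @ W @ h)

# Exact partition function: sum of Boltzmann factors over all (v, h) pairs
configs_v = list(itertools.product([0, 1], repeat=n_visible))
configs_h = list(itertools.product([0, 1], repeat=n_hidden))
Z = sum(np.exp(-energy(np.array(v), np.array(h)))
        for v in configs_v for h in configs_h)

v0 = np.array([1, 0, 1, 0])
h0 = np.array([0, 1, 0])
p = np.exp(-energy(v0, h0)) / Z     # Boltzmann probability P(h, v)
print(f"E = {energy(v0, h0):.4f}, P = {p:.6f}")
```

In practice the sizes are far too large for this enumeration, which is precisely why sampling-based training schemes such as contrastive divergence, discussed next, are needed.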
Interestingly enough, RBMs are extremely versatile, despite being a ML method whose training mechanisms are not as fast as those of more modern ML algorithms; special technical schemes have therefore been developed for the task. This is possibly the greatest drawback of using RBMs: the training procedure is not unique, and the most common procedure to train such models is the so-called contrastive divergence algorithm [204]. This algorithm can be tricky to use effectively and requires a certain amount of practical experience to decide how to tune the corresponding hyperparameters of the model. A good practical guide to training RBMs is the one by Hinton [205], where special care is taken to inform the reader of the uses and shortcomings of the training procedure for RBMs. We shall discuss some of the attempts at providing insight into the training mechanisms of RBMs by using theoretical frameworks.
For example, Huang and Toyoizumi [206] introduced a mean-field theory
approach to perform the training using a message-passing-based method that evaluates
the partition function, together with its gradients without requiring statistical
sampling—which is one of the drawbacks of traditional learning schemes—. Further
work on understanding the thermodynamics of the model was carried out [207] with
encouraging results, showing that the learning mechanism in general is completely
tractable, albeit complex when the model enters a non-linear regime. The RBM can also
be seen as the inverse Ising model [208], where instead of obtaining physical quantities
like order parameters and critical exponents from system configurations, one has as
input the physical observables and intends to obtain insights on the system itself. With
this theoretical framework in mind, the first steps to produce rigorous explanations of
the algorithm were done by Decelle et al. [209] by studying the linear regime of the
training dynamics using spectral analysis.
RBMs are rich and resourceful models that have been applied to a number of
hard tasks; for instance, the modeling of protein families from sequential data [210, 211].
Another such complex task is that of continuous speech recognition, on which a special
type of RBM was employed with promising results [212]. Within the realm of CMP, these
models have been used to construct accurate ground-state wave functions of strongly
interacting and entangled quantum spins, as well as fermionic models on lattices [213].
RBMs are currently among the most flexible algorithms, despite the disadvantages of their training mechanisms. These models are particularly well suited to study problems within CMP and quantum many-body physics [214]. With further research efforts to make RBMs scalable to larger data sets and to speed up their training mechanisms, it might be possible for RBMs to become a standard tool in ML applications to CMP problems.
5.4. Interpretation of Machine Learning models
Despite the success they have achieved, ML algorithms are prone to a lack of interpretability, resulting in so-called black-box algorithms that cannot grant a fully rigorous explanation of the learning mechanisms and features that produce their outputs. However, for a condensed matter physicist, it is most important to understand the core mechanism by which the ML model obtained the desired output, so that it can later be linked to a useful physical model. As an example, we can take the model with the Hamiltonian described by Eq. (1) and train a FFNN on simulation data, following the methodology of Carrasquilla and Melko [137]; once the NN has been trained, the critical exponent is obtained and used to compute the critical temperature, but the question remains: what enabled the NN to obtain such information? Is there a mechanism to extract meaningful physical knowledge from the weights and layers of the NN?
To answer these types of questions, researchers have had to resort to arduous extraction and analysis of the underlying model architectures. For example, Suchsland and Wessel [215] trained a shallow FFNN on two-dimensional configurations from Hamiltonians similar to Eq. (1), then proceeded to comprehensively analyze the weight matrices obtained from the training scheme, while also performing a meticulous study of the reasons why activation layers and their respective neurons fire a response when required; they found that both elements (training weights and activation layers) play a crucial role in understanding how the NN is able to output the physical observables to which we give meaning. They also determined that the learning mechanism corresponds to a domain-specific understanding of the system, i.e., in the case of the Ising model, the NN effectively learned the structure of the surrounding neighbors of a particular spin; with this information it then proceeded to compute the magnetization order parameter as part of the computations involved in the training and prediction strategies.
Another approach is to use classical ML algorithms like SVMs given that these
models provide a rigorous mathematical formulation for the mechanism by which they
learn from given features in the data. Ponte and Melko [148], as well as Gianetti et
al. [153] exploited these models and used the underlying decision function of the SVMs
to map the Ising model’s order parameters and obtained a robust way to extract the
critical exponents from the system. Within this approach, one does not need to provide a meticulous explanation of the learning mechanism of the model, since this has already been done in its own theoretical formulation; instead, an adjustment of the physical model with respect to the ML algorithm needs to be performed in order to successfully link both frameworks. Nonetheless, this approach has the disadvantage that it can be hard to map the ML method to the physical model, as it requires a deep and experienced understanding of the physical system, while simultaneously needing a comprehensive awareness of the theoretical formulation of the ML algorithm being used.
5.5. Outlook
The approach of using CMP to illustrate the training scheme, as well as the loss function landscape, of a ML algorithm has proven to be both intriguing and effective. Baity-Jesi
et al. [216] used the theory of glassy dynamics to give an interesting interpretation of
the training dynamics in DL models. Similar work was performed by Geiger et al. [217], but this time with the theory of jamming transitions, disclosing why poor minima of the loss function of a FFNN are not encountered in the overparametrized regime, i.e., when there are more parameters to fit in a NN than training samples. Dynamical explanations of training schemes have the advantage of providing precise frameworks that have been understood in the physics community for a long time, but they are an intricate solution, as they need a scrupulous analysis of the model, leaving little room for generalization to other ML models. It is true that FFNNs, as well as Deep NNs, are the least understood ML models, but they are not the only ones that need a carefully detailed framework; perhaps in future research we will see how other ML models benefit from the powerful techniques developed so far by the CMP community.
6. Materials Modeling
6.1. Enhanced Simulation Methods
Molecular dynamics (MD) is a classical simulation technique that integrates Newton’s
equations of motion at the microscale to simulate the dynamical evolution of atoms
and molecules [218]. It is, along with the MC method, a fundamental tool in computer
simulations within CMP, specifically in HM, SM and Quantum Chemistry. The powerful
method of MD enables the scientist to obtain physical observables in a controlled, simulated environment, bypassing limitations that most experimental setups have to deal with. MD has become a cornerstone method to model complex structures, such as proteins [219] and chemical structures [220], just to name a few. But such complex structures involve large numbers of particles, with their many degrees of freedom, as well as complicated interactions, in order to obtain meaningful results out of the simulations. This creates a difficult challenge to overcome, as efficient exploration of the phase space is a hard task, even with MD simulation code accelerated on modern hardware. In order to reduce the computational burden of modeling such many-body systems, special enhanced sampling methods have been developed [221], but not even these schemes can fully alleviate the arduousness of modeling complicated systems.
systems. ML is seen as a promising candidate to surpass modern solutions in MD
simulations, as shown primarily by Sidky and Whitmer [222]. In their novel approach,
authors employed FFNNs to learn the free energies of relevant chemical processes,
Machine Learning for Condensed Matter Physics 23
effectively enhancing MD simulations with ML and creating a comprehensive method
to obtain high-precision information from simulations that were not previously readily
available. Another interesting procedure to enhance MD simulations was the automated
discovery of features that could be potentially fed into intelligent models, making it
possible to create a systematic and automated workflow for materials research. Sultan
and Pande [223] developed such techniques by training several classical ML methods
—Support Vector Machines, Logistic Regression, and FFNNs —, showing that given
a MD framework one could set up a complete, self-adjustable scheme to obtain the
system’s free energy, entropy, structural information and many more observables.
Another powerful tool to study many-body systems is Density Functional
Theory [224] (DFT), which is a quantum mechanical framework for the investigation
of the electronic structure of atoms. We refer the reader to useful references for more
information on DFT [225, 226]. Although a very robust tool, its main disadvantage is that the quality of DFT results depends on the approximations and functionals used; it is also computationally demanding for large systems. The use of NNs to enhance
these computations has shown favorable results, for instance, in the work by Custodio,
Filletti and Franca [227], where an approximation of a density functional by means of
a NN reduces the computational cost, while simultaneously obtaining accurate results
for a vast range of fermionic systems. For similar applications to DFT, we would like to
refer the reader to a review where a number of different methods are used to solve the
Schrodinger equation and related problems of constructing functionals for DFT [228].
Although in its early stages, the enhancement of classical methods with ML is an intriguing area of research. When no rigorous foundation is needed, ML models are able to exceed human performance in the modeling of atomistic simulations, making them suitable candidates to become the mainstream methods in the future. Of course, there is
a long path ahead to fully embrace these methods, as a complete simulation strategy is
not easily available yet; further research on the coupling between molecular simulations
and ML schemes is the primary way to achieve this.
6.2. Machine-learned potentials
Nowadays, computer simulations are a standard research technique in CMP and its sub-fields. These simulations produce enormous quantities of data depending on the sub-field in which they are employed; for example, biophysics and applied soft matter have produced one of the largest databases, the Protein Data Bank [229], which contains freely available theoretical and experimental molecular information as a by-product of research on specific topics. This explosion of data available to the public has attracted scientists to the ML methodologies created to deal with the so-called big data outbreak [230] of recent years, effectively uniting both fields in an attempt to use such well-tested intelligent algorithms in the understanding and automation of computer-aided models.
One such task is that of constructing complex potential energy surfaces (PESs)
with the experimental and computational data available [12]. This has turned out to
be a successful application to materials science, physical chemistry and computational
chemistry, as it allows scientists to obtain high-quality, high-precision representations of
atomistic interactions of all kinds of materials and compounds. In atomistic simulations
and experiments, scientists are always interested in obtaining a good representation of PESs because it guarantees the full recovery of the potential energy (and, as a consequence, the full atomistic interactions) of a many-body system. The baseline technique that provides such information is the coupling of MD with DFT, which turns out to be a very computationally demanding scheme, even for the simplest cases. This was the starting point of a long-standing effort to use ML models that could compute such complicated functions, though not without the usual challenges of good data representation, stable training and good hyperparameter tuning. The first attempt was made by Bank et al [231], who proposed to fit FFNNs with information produced by some low-dimensional models of a CO molecule chemisorbed on a Ni(111) surface. They obtained promising results, such as faster evaluation of the potential energy by means of the NN model compared to the original empirical model. The data sets used by Bank et al comprised between two and twelve features, or degrees of freedom, making the NN models easy and simple to train.
But this is not always the case; in fact, one could argue that it is never the case, as systems like proteins consist of thousands of particles and electronic densities, hence a large number of degrees of freedom need to be computed simultaneously [232]. Another challenge for these types of ML potentials is that of the descriptors, or features, that must be fed to an intelligent model. Data fed into ML models should account for all possible variations and transformations seen in real physical models, to extend the generalization of the ML schemes and avoid bias in them.
It took more than ten years after the pioneering work of Bank et al for Behler and Parrinello [186] to take this challenge to a new level by proposing a similar methodology employing FFNNs, now advocating a more general approach to computing the PES of an atomistic system by creating what they called atom-centered symmetry functions. We will describe this method in detail, as it is an important achievement in the application of ML to CMP and materials modeling.
The main purpose of the Behler-Parrinello approach is to represent the total energy E of the system as a sum of atomic contributions E_i, such that E = \sum_i E_i. This is done by approximating E with a NN. The input of the NN should reflect information of the system from the contribution of all corresponding atoms, while also preserving important information about the local energetic environment. In this sense, symmetry functions are used as features, transforming the positions into other values that contain more useful information about the system. In order to define the energetically relevant local environment, a cutoff function f_c of the interatomic distance R_{ij} is employed, which
has the form

f_c(R_{ij}) = \begin{cases} 0.5 \left[ \cos\left( \pi R_{ij} / R_c \right) + 1 \right] & \text{for } R_{ij} \le R_c \, , \\ 0 & \text{for } R_{ij} > R_c \, . \end{cases} \qquad (2)
Next, radial symmetry functions are constructed as a sum of Gaussian functions with parameters \eta and R_s,

G_i^1 = \sum_{j \ne i}^{\text{all}} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij}) . \qquad (3)
Finally, angular terms are constructed for all triplets of atoms by summing the cosine values of the angles \theta_{ijk} = (\mathbf{R}_{ij} \cdot \mathbf{R}_{ik}) / (R_{ij} R_{ik}) centered at atom i, with \mathbf{R}_{ij} = \mathbf{R}_i - \mathbf{R}_j, and the following symmetry function is constructed

G_i^2 = 2^{1-\zeta} \sum_{j,k \ne i}^{\text{all}} \left( 1 + \lambda \cos\theta_{ijk} \right)^{\zeta} e^{-\eta \left( R_{ij}^2 + R_{ik}^2 + R_{jk}^2 \right)} f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk}) , \qquad (4)
with parameters \lambda (= +1, -1), \eta and \zeta. These parameters, along with the ones defined in Eq. (3), need to be found before these symmetry functions can be used as input to the NN. The parameter fitting is done by performing DFT calculations for the system to be studied; by computing the root mean squared error (RMSE) between the fit and the true values from the DFT computations, configurations for which the error of the fit is too large are added to the training set. With this information, a training and testing data set is constructed and then used to train the NN, which in turn yields an approximation for E. This approximation is then used to compute different observables and quantities of the system.
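As an illustration, the following is a minimal NumPy sketch of Eqs. (2)-(4) for a small cluster of atoms; the configuration and the parameter values (R_c, \eta, R_s, \zeta, \lambda) are placeholders chosen for demonstration, not fitted values.

```python
import numpy as np

def f_c(r, r_c):
    """Cutoff function of Eq. (2)."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g1(positions, i, r_c, eta, r_s):
    """Radial symmetry function G1_i of Eq. (3) for atom i."""
    total = 0.0
    for j, r_j in enumerate(positions):
        if j == i:
            continue
        r_ij = np.linalg.norm(positions[i] - r_j)
        total += np.exp(-eta * (r_ij - r_s) ** 2) * f_c(r_ij, r_c)
    return total

def g2(positions, i, r_c, eta, zeta, lam):
    """Angular symmetry function G2_i of Eq. (4) for atom i."""
    total = 0.0
    n = len(positions)
    for j in range(n):
        for k in range(n):
            if i in (j, k) or j == k:
                continue
            rij = positions[j] - positions[i]
            rik = positions[k] - positions[i]
            rjk = positions[k] - positions[j]
            d_ij, d_ik, d_jk = map(np.linalg.norm, (rij, rik, rjk))
            cos_theta = rij @ rik / (d_ij * d_ik)   # cosine of theta_ijk
            total += ((1.0 + lam * cos_theta) ** zeta
                      * np.exp(-eta * (d_ij**2 + d_ik**2 + d_jk**2))
                      * f_c(d_ij, r_c) * f_c(d_ik, r_c) * f_c(d_jk, r_c))
    return 2.0 ** (1.0 - zeta) * total

# Toy configuration of four atoms; all values are illustrative only
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                [0.0, 1.1, 0.0], [0.9, 1.0, 0.3]])
print(g1(pos, 0, r_c=2.5, eta=1.0, r_s=1.0))
print(g2(pos, 0, r_c=2.5, eta=0.5, zeta=1.0, lam=1.0))
```

In an actual implementation one evaluates several such functions, with different parameter sets, per atom, and the resulting vectors form the NN input for each atomic contribution E_i.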
This methodology introduced a paradigm shift: it was no longer a mere proof of concept that ML algorithms could be used in these research fields, and many other attempts have been carried out since then. Following in the footsteps of Behler and Parrinello, Bartok et al [233] proposed a similar methodology, but with a completely different ML scheme (in this case, Gaussian Processes [51]), which in turn needed new atomistic descriptors. We shall discuss atomistic descriptors in more detail in the next section.
The advantages of such techniques in all the sub-fields of CMP are manifold, the main one being that simulations are now faster and more precise. Another advantage of such methods is that we can leverage the retention of the features learned by the NN: by employing transfer learning [234], the model does not need to be trained again to compute the same interactions it has already learned. For the reader interested in more detailed discussions on transfer learning, we point out some reviews [235, 236] that give enough information on the current advancements on this topic. Nevertheless, this approach has one significant drawback, which is the fact that most NNs cannot learn sequential tasks, which in turn creates a phenomenon called catastrophic forgetting [237, 238, 239, 240]. In the case of catastrophic forgetting, the NN abruptly loses most of the information learned from a previous task, and is unable to use it in the new task at hand. Transfer learning is a ML technique that needs meticulous manipulation from the user, involving fine tuning, a training process where some of the learned information is kept and some is learned again, followed by joint training, where new information is learned using both previous and newly acquired information obtained from fine tuning. Research is currently being carried out [241, 242] to overcome this disadvantage of NNs and shared learning.
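As a rough illustration of the fine-tuning step, the following PyTorch sketch freezes the feature layers of a hypothetical pretrained network and retrains only its head; the architecture and layer sizes are placeholders, not those of any published potential.

```python
import torch
import torch.nn as nn

# Toy 'pretrained' network: feature extractor plus a task-specific head
model = nn.Sequential(
    nn.Linear(30, 64), nn.ReLU(),   # feature layers to keep frozen
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),               # head to retrain for the new task
)

# Freeze every parameter, then unfreeze only the head
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

# Only the trainable parameters are handed to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
print(sum(p.numel() for p in model.parameters() if p.requires_grad))
```

Joint training would then unfreeze the remaining layers, usually with a smaller learning rate, once the new head has converged.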
Machine-learned potentials have also been successfully applied to explaining complex interactions, like the van der Waals interaction in water molecules and the reason this interaction results in the unique properties of water [243]. Another interesting application was the simulation of solid-liquid interfaces with MD [244], an application that was not thought to be computationally simple or even tractable. One of the remaining challenges in this area of research is a paradoxical one: these ML potentials require very large data sets that are computationally demanding to produce in order to train the intelligent model. One possible approach to alleviate this issue is to build and maintain a decentralized data bank similar to the Protein Data Bank (which could become a cornerstone of modern computational materials science and chemistry), enabling anyone interested in this type of research to collect insights from the acquired data without needing to run large-scale simulations or re-train complicated NN architectures. This is an interesting research field at the junction of many other specialized sub-fields, growing ever faster thanks to the computing power currently available, and setting up the pathway for new and interesting intelligent materials modeling, which could well become the future of materials science.
6.3. Atomistic Feature Engineering
So far, we have discussed the different ways in which scientists have been able to apply ML technology in different sub-fields of CMP, but lattice systems seem to prevail, with most of the reported research involving these systems. Why is that?
One of the fundamental parts of using a ML method is the data input with specific features. For instance, images are described by pixels in a two-dimensional grid, where the pixels are the features that numerically represent the image; these pixels are then fed to a ML model, e.g., a classifier. Another clear example is numerical inputs that can be readily used in regression tasks, and then used in a prediction model. We wish to obtain a similar representation for physical systems, so as to ease the use of ML models in condensed matter systems. Recall that the intrinsic features of lattice systems are the spins (\sigma_i); these spins do not change location, they stay put and only change their spin value, which is taken from a finite, discrete set of values. We can easily see a link between images and pixels, and a lattice system and its spins; both can be described by a set of discrete features that remain at fixed locations as the system evolves and changes. But we cannot say the same about atomistic and off-lattice systems, and thus we cannot use the same representation as for lattice systems, because atomistic systems are described primarily by the constant movement of their atoms and particles, and by their many degrees of freedom (for instance, a three-dimensional system has 3N translational degrees of freedom, with N being the total number of particles). By using the raw positional coordinates, velocities or accelerations, we might not be able to convey the true features that the system possesses; if at all possible, they could be conveyed in a biased way, e.g., by defining an arbitrary coordinate system or by not being invariant to translations and rotations. In order to overcome this limitation, a special mechanism needs to be developed to feed this data to the ML algorithms. It is this encoding of raw atomistic data into a set of special features that has seen a strong surge in recent years when dealing with materials of any kind.
Atomistic descriptors have been studied heavily throughout the years [245, 246,
247, 248, 249, 250, 251] because they are the gateway to applying ML methodologies in
atomistic systems. The reader is referred to these reviews [12, 252] for more information
on the topic. Descriptors are a systematic way of encoding physical information in a
set of feature vectors that can then be fed to a ML algorithm. These encodings need to
have the following properties: a) the descriptors must be invariant to any type of point
symmetry, for example, translations, rotations and permutations; b) the computation
of the descriptor must be fast as it needs to be evaluated for every possible structure in
the system; and c) the descriptors have to be differentiable with respect to the atomic
positions to enable the computation of analytical gradients for the forces. Computing these descriptors is essentially what the ML literature calls feature selection, and most of the time these procedures are carried out in an automated way within a normal ML workflow. Once a particular descriptor has been chosen, all the ML methodology that we have seen applied to lattice systems can be employed on atomistic and off-lattice systems.
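Property a) is easy to check numerically. The following sketch uses the simplest such descriptor, the sorted list of all pairwise distances, and verifies that it is unchanged under a random rigid motion; the point set itself is a random placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)

def pair_distance_descriptor(positions):
    """Sorted list of all pairwise distances: invariant to translations,
    rotations, and permutations of the particles."""
    n = len(positions)
    d = [np.linalg.norm(positions[i] - positions[j])
         for i in range(n) for j in range(i + 1, n)]
    return np.sort(d)

pos = rng.random((6, 3))

# Random orthogonal matrix (QR of a Gaussian matrix) plus a translation
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pos @ Q.T + np.array([1.0, -2.0, 0.5])

assert np.allclose(pair_distance_descriptor(pos),
                   pair_distance_descriptor(moved))
print("descriptor unchanged under rotation + translation")
```

Descriptors used in practice, such as the symmetry functions of Section 6.2, add the locality and differentiability requirements on top of this basic invariance.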
6.4. Outlook
ML-enhanced materials modeling is a strong research area, with a focus on streamlining computations and automating material discovery. Descriptors that require less data should be a primary research direction, as data demand is currently one of the main shortcomings. A standard framework and benchmark data sets could also further this area, paving the way towards an automated workflow.
7. Soft Matter
At this stage, most of the ML applications we have discussed have primarily been focused on HM and atomistic many-body systems. This is largely because, in those cases, ML methods can be applied using data sets that need little feature engineering. Off-lattice and SM systems, on the other hand, need thorough feature engineering and a different approach to training ML models.
The applications we will review in this section reflect this inherent challenge of
finding representative features that could enable ML models to be used in tasks common
to all CMP, e.g., phase transitions and critical phenomena.
7.1. Phase Transitions
Phase transitions are the cornerstone of ML applications to lattice systems, but with the use of atomistic descriptors these applications can easily be extended to off-lattice systems, which are usually found in SM. Jadrich, Lindquist and Truskett [253, 254] developed the idea of using the set of distances between particles as a descriptor for the system, and applied unsupervised learning, in particular Principal Component Analysis (PCA), to these descriptors. They wanted to test whether a ML model was able to detect a phase transition in hard-disk and hard-sphere systems. When the unsupervised method was applied, the output of the model was later found to be the positional bond-order parameter of the system [255, 256], which effectively determines whether the system has undergone a phase transition. In other words, the ML model was able to automatically compute the order parameter without having been told to do so. A similar approach to colloidal systems was taken by Boattini and coworkers [257]. Instead of using inter-particle distances, they computed the orientational bond-order parameters [258, 259] and then used an unsupervised DL method (a FFNN-based autoencoder) to obtain a latent representation of the computed order parameters. With all these preprocessing steps, they fed the encoded data to a clustering algorithm and obtained detailed descriptions of all the phases of matter in the system at hand. This methodology is systematic and lends itself nicely to an automated form of phase classification in colloidal systems.
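A schematic version of this workflow, with synthetic random configurations standing in for actual Monte Carlo hard-disk data, could look as follows; the feature construction follows the spirit of Refs. [253, 254] rather than their exact recipe.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

def distance_features(positions, box, k=10):
    """Per-configuration feature vector: for each particle, the sorted
    distances to its k nearest neighbours (minimum-image convention)."""
    feats = []
    for i in range(len(positions)):
        dr = positions - positions[i]
        dr -= box * np.round(dr / box)          # periodic boundaries
        d = np.sort(np.linalg.norm(dr, axis=1))[1:k + 1]
        feats.append(d)
    return np.concatenate(feats)

# Placeholder 'configurations': random points in place of MC snapshots
box = 10.0
X = np.array([distance_features(rng.random((64, 2)) * box, box)
              for _ in range(200)])

pca = PCA(n_components=2)
z = pca.fit_transform(X)                        # low-dimensional projection
print(pca.explained_variance_ratio_)
# For real hard-disk data, Refs. [253, 254] found the leading component
# to track the positional bond-order parameter, separating the phases.
```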
7.2. Colloidal systems
Liquid crystals are a state of matter that has both the properties of a liquid and of a solid crystal [260]. In the simplest case, liquid crystals can be studied using one of the reference models of SM, i.e., hard rods or similar anisotropic particles [261]. These types of systems exhibit properties similar to those present in HM, such as topological defects. The main issue with reusing the same methodology is that building a data set with useful and representative features for systems made of liquid crystal molecules is not trivial, mainly because positions and other physical observables can take any continuous value instead of a specific value from a given set. In the work by Walters, Wei and Chen [262], a time-series approach using Recurrent NNs was employed, with the basic descriptors of a liquid crystal system, namely the orientation angles and positions of all the particles, as inputs. By using long short-term memory networks [263, 264] (LSTM), they showed that off-lattice simulation data can be used as input to identify several types of topological defects. This is a great step towards the use of mainstream ML techniques in SM, as one of the main drawbacks, namely feature selection and data set building, is resolved by this kind of method. Although collecting a data set from computer simulations poses few problems, doing so in experimental setups might be harder.
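A minimal sketch of such a sequence classifier, assuming per-frame descriptor vectors (e.g., orientation angles and positions flattened per frame) and a hypothetical set of four defect classes, might look like the following; this is an illustration of the idea, not the architecture of Ref. [262].

```python
import torch
import torch.nn as nn

class DefectLSTM(nn.Module):
    """Trajectory of per-frame descriptors -> LSTM -> defect-type logits."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # classify from the last time step

# Placeholder: 32 particles, 3 descriptors each, 50 frames per trajectory
model = DefectLSTM(n_features=3 * 32, n_classes=4)
x = torch.rand(8, 50, 3 * 32)             # synthetic trajectories
print(model(x).shape)                      # -> torch.Size([8, 4])
```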
Similarly, Terao [265] used CNNs to analyze liquid crystals, as well as different types of solid-like structures, by computing special descriptors for the system. In particular, the approach presented in that work can be seen as a generalization of the one typically used for HM, which is to transform the off-lattice data into a common structure normally used in computer vision. By extracting subsystems with few particles from the total system, a conversion to grayscale images of size l × l (in the case of two-dimensional systems) was performed by using the following expression,
u_{i,j} = C \sum_{n=1}^{m_p} \exp\left( - \frac{|\mathbf{r}_n - \mathbf{R}_{ij}|^2}{a^2} \right),

where \mathbf{r}_n is the position of the n-th particle in the subsystem of size L_s and total number of particles m_p; \mathbf{R}_{ij} is the position vector of the center of pixel (i, j); and the constant C is determined such that the set u_{i,j} is rescaled within the
interval [0, 1]. The advantage of this method is its generalization potential: no physical input is needed for the technique to work, so a large range of systems can be studied under the same methodology. One possible disadvantage is that some systems might need to conserve certain symmetries, and this should be taken into account when constructing the data set.
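A minimal sketch of this conversion, with placeholder values for the image size l and smearing width a, and assuming the constant C is implemented by rescaling with the maximum pixel value:

```python
import numpy as np

def config_to_image(positions, box, l=32, a=0.5):
    """Map 2D particle coordinates to an l x l grayscale image by
    Gaussian smearing, then rescale the pixels to [0, 1]."""
    px = box / l                                   # pixel width
    centers = (np.arange(l) + 0.5) * px            # pixel-center coordinates
    gx, gy = np.meshgrid(centers, centers)         # grid of R_ij vectors
    u = np.zeros((l, l))
    for x, y in positions:
        u += np.exp(-((gx - x) ** 2 + (gy - y) ** 2) / a ** 2)
    return u / u.max()                             # C chosen via rescaling

rng = np.random.default_rng(3)
img = config_to_image(rng.random((50, 2)) * 10.0, box=10.0)
print(img.shape, img.min(), img.max())
```

The resulting arrays can then be fed to any standard image classifier, e.g., a CNN, exactly as one would with ordinary picture data.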
Polymers are materials that consist of large molecules composed of smaller subunits. These materials are of great interest in SM because most polymers are surrounded by a continuum, turning them into colloidal systems [266]. Polymers are also found in nature, e.g., deoxyribonucleic acid (DNA), present in almost all living organisms. Even though these systems have been studied extensively throughout the years, computational and experimental issues always arise. It is in these situations that ML might be a suitable approach. One such example is the design of a computational method for understanding and optimizing the properties of complex physical systems by Menon et al [267]. In that work, physical and statistical modeling are integrated into a hierarchical ML framework, together with simple experiments that probe the underlying forces which combine to determine changes in slurry rheology due to dispersants. One of the main contributions of this work is showing how physical and chemical information can reduce the number of samples in a data set needed for efficient ML modeling. If data sets are constructed using experimental data and some insight into the problem, the use of ML techniques might be easier to achieve. An issue raised in the work by Menon et al is that data sets might be subject to correlations between system variables and system responses. This could be a potential issue when training some ML models, affecting training stability and the quality of the results.
A more recent approach to polymers is the work by Xu and Jiang [268], where a ML methodology is used to study the swelling of polymers immersed in liquids. In this work, the importance of using chemical and physical descriptors from the beginning to construct the ML framework is highlighted. By constructing and using these descriptors, molecular representations are employed, along with data augmentation, to study the relationships between molecular fragments and swelling degrees with the help of PCA. Various other applications are constantly arising, such as the discovery and design of new types of polymers with ML methods [269]. As ML methods help with complex representations of data, they could potentially be used to accelerate the discovery of innovative materials within polymer science.
7.3. Properties of equilibrium and non-equilibrium systems
The structural and dynamical properties of equilibrium and non-equilibrium systems,
such as complex liquids, glasses, gels, granular materials, and many more, are extensively
studied under the branch of colloidal SM [270, 271]. Common tools, such as computer simulations, are used to study such properties, but for some systems these tools are computationally demanding to the point that special facilities, such as high-performance computing (HPC) clusters, are needed. Even though we are able to use this kind of computational resource in the current era, we are still limited by the amount of computation needed for large, physically representative systems. Studying simple reference models, Moradzadeh and Aluru [272] were able to build a DL autoencoder framework to compute structural properties of these simple systems. Using long-running MD simulations, they collected several position snapshots for a diverse range of Lennard-Jones systems and, using the autoencoder as a denoising framework, computed the time-averaged radial distribution function (RDF). This was achieved using a data set built with three main features: the thermodynamic state, composed of the temperature and density (T, \rho), at each time step, and the RDF snapshot at that exact moment. Both time and data are efficiently reduced with this methodology, as only 100 samples are needed once the network has been trained, although a large data set is employed for the training itself.
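A minimal sketch of such a denoising autoencoder is given below, with placeholder tensors standing in for per-snapshot RDF histograms and their time-averaged targets; this is a toy illustration of the idea, not the architecture of Ref. [272].

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Dense autoencoder mapping an RDF histogram (here 100 bins)
    through a low-dimensional bottleneck and back."""
    def __init__(self, n_bins=100, latent=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_bins, 32), nn.ReLU(), nn.Linear(32, latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_bins))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch: noisy per-snapshot RDFs and their clean targets
x_noisy = torch.rand(16, 100)
x_clean = torch.rand(16, 100)

loss = loss_fn(model(x_noisy), x_clean)   # train to denoise towards targets
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```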
Non-equilibrium systems have also seen some applications of ML. Employing SVMs, Schoenholz et al [273] found that structure is important to glassy dynamics in three dimensions. The results were obtained by performing MD simulations of a bidisperse Kob-Andersen Lennard-Jones glass [274] in three dimensions at different densities \rho and temperatures T above its dynamical glass transition temperature. At each density, a training set of 6,000 particles was selected, taken from a MD trajectory at the lowest T studied, to build a hyperplane in \mathbb{R}^M using SVMs. This hyperplane is then used to calculate S_i(t) for each particle i at each time t during an interval of 30,000 time steps at each \rho and T, where S_i is the softness of particle i, defined as the shortest distance between its position in \mathbb{R}^M and the hyperplane, with S_i > 0 if i lies on the soft side of the hyperplane and S_i < 0 otherwise. With this method, it is shown that there is structure hidden in the disorder of glassy liquids, and this structure can be quantified by softness, computed through a ML framework.
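Schematically, the softness computation amounts to a signed distance from a SVM hyperplane; the sketch below uses synthetic descriptors and labels in place of the actual structure functions and rearrangement labels of Ref. [273].

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(4)

# Placeholder structure functions: M-dimensional descriptor per particle,
# labelled 1 if the particle 'rearranged' in some time window, else 0
M = 20
X = rng.normal(size=(6000, M))
y = (X[:, 0] + 0.3 * rng.normal(size=6000) > 0).astype(int)

svm = LinearSVC(C=0.1, max_iter=10000).fit(X, y)

# Softness S_i: signed distance from particle i's descriptor to the
# hyperplane, positive on the 'soft' (rearranging) side
w, b = svm.coef_[0], svm.intercept_[0]
S = (X @ w + b) / np.linalg.norm(w)
print(S[:5])
```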
New ML applications to design and create glass materials are steadily appearing, as presented in the review by Han et al [275], where challenges and limitations of these applications to glassy systems are also discussed.
7.4. Experimental setups and Machine Learning
Experiments are a fundamental part of CMP, providing results that can later be compared with theoretical and computational predictions. Experimental setups have also been enhanced with ML models in different aspects of SM.
In the work by Li et al [276], collective dynamics, measured through piezoelectric relaxation studies, are explored to identify the onset of a structural phase transition in nanometer-scale volumes. Aided by k-means clustering on raw experimental data, Li et al were able to automatically determine phase diagrams for systems where identification of the underpinning physical mechanisms is difficult to achieve and the microscopic degrees of freedom are generally unavailable for observation. Although these results are more precise than those obtained with traditional tools, experimental setups might still be difficult and expensive to perform each time, so ML methods might be able to alleviate these shortcomings by requiring less data or performing more efficiently than other techniques.
We have touched several times on the fact that data, and the number of samples needed, are one of the main shortcomings for the application of ML methods to CMP. This is even more of a challenge in experimental setups, where collecting large quantities of data samples might not be possible for a given system. The end-to-end framework developed by Minor et al [277] shows that this problem can be overcome. The collection of physical data of a liquid crystalline mesophase was performed using an experimental setup, i.e., a mechanical quench experiment. The ML pipeline consisted of real-time object detection using the YOLOv2 [278] architecture to detect topological defects in the system. Standardization and image enhancement techniques were needed to mimic real-world inaccuracies in the training data set. The model achieved spatial and number resolution comparable to the human annotations against which the pipeline was compared, with a significant improvement in the time spent on each frame, thus resulting in a dramatic increase in the time resolution (more frames analyzed).
7.5. Outlook
As we have discussed so far, the fundamental step in applying ML to SM and off-lattice systems is to define and use a set of meaningful descriptors, but the choice of descriptors is vast as well as system-dependent. In the case of experimental setups, the efficient use of data samples is necessary. In order to fully embrace ML in SM and colloidal systems, a more automated approach will need to be developed, posing a new and fascinating research area. Recently, DeFever et al [279] developed a generalized approach that does not use hand-made descriptors; instead, a complete DL workflow is used with raw coordinates from simulations, and it is able to obtain promising results about phase transitions, both in and out of thermodynamic equilibrium, in SM systems. This scheme is a great step towards a fully automated strategy for off-lattice systems, as it reduces the amount of bias a scientist might introduce by using certain descriptors for given systems. The only downside of employing these types of DL methodologies is the large amount of data needed, so perhaps in future research we will see approaches where one can use a small amount of data to obtain similar results. These disadvantages seem to disappear in end-to-end frameworks such as the one described by Minor et al [277]; still, the amount of data might be an obstacle when applying ML models. This is discussed in detail in the next section.
8. Challenges and Perspectives
At this point of the review, we have discussed some of the major applications of ML to
CMP. We have also touched on other useful insights provided by theoretical frameworks
on ML, as well as discussing important points on various shortcomings and drawbacks
of using ML models in diverse applications. In this section, however, we would like to
expand on such discussions and address the main challenges that we see in the current
climate of this research area.
8.1. Benchmark data sets and performance metrics
In order to test new ML algorithms, a benchmark data set is needed. This data set serves the purpose of assessing the performance of a particular algorithm by means of a defined performance metric. One such example is the extremely popular MNIST data set [280], which is currently extensively used to test new ML algorithms. The most common task to solve on this data set is to check whether the proposed ML algorithm can successfully recognize all ten different classes of handwritten digits, in a supervised learning fashion. Thus, the performance metric is the accuracy, which corresponds to the total number of correct predictions over the total number of samples in the data set. This means that if a proposed NN architecture scores a 97% accuracy on the MNIST data set, the architecture correctly classifies, on average, 97 out of every 100 handwritten digits. Although MNIST has been widely used, it is not the only benchmark, and most researchers agree that this data set should be replaced by a more challenging one, for instance the Fashion-MNIST [281].
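For concreteness, the metric itself is a one-liner; the labels below are toy values.

```python
import numpy as np

y_true = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])   # ground-truth digits
y_pred = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])   # model predictions

accuracy = np.mean(y_true == y_pred)   # correct predictions / total samples
print(f"accuracy = {accuracy:.0%}")    # -> 90%
```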
With this in mind, the main point we wish to convey is that there is an important need for standardized and well-tested benchmark data sets in the current research area of ML and CMP, along with their respective performance metrics. Benchmark data sets would be used to try and test new ML models, verifying whether such models can provide the expected results and achieve a good score on the respective performance metric. We can look at the example of computing the critical temperature of an Ising model without the external field shown in Eq. (1). In this example, we have seen several possible ML approaches (NNs, SVMs, DL, Reinforcement Learning) that can achieve good results on this problem, so we might agree that such a problem can be taken as a benchmark. The Ising model is a simple one, which can be computed with standard MC methods, and a standard benchmark data set is fairly easy to obtain. This is not the case, however, for other CMP systems and problems, for example machine-learned potentials. As discussed in Ref. [12], the construction of such models currently requires very large reference data sets from electronic structure calculations, making the construction computationally very demanding and time consuming. This means that it is neither simple nor easy to test newly created ML models that tackle this problem; countless computing hours must be spent to obtain representative data samples just to test whether the proposed ML method performs well even in the simplest cases. Producing data sets for other CMP systems is even more complicated, in particular for those that come from experimental setups. Experiments can be very costly, and obtaining clean, labeled and representative data is not something that is currently being addressed completely.
The solution to this problem is to generate well-tested standard data sets for each problem in CMP. Although this is not a simple task, some efforts have been made in current research areas. Related to materials modeling and CMP, MoleculeNet by Wu et al [282] is a contribution that curates multiple public datasets, establishes metrics for evaluation, and offers high-quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms. Another effort, although in the realm of quantum chemistry, is the QM9 data set [283] for molecular structures and their properties. Few attempts are seen in the realm of SM, where one could test the simplest cases, namely the hard-sphere model, binary hard-sphere mixtures and some other simple-liquid benchmarks. This could signal a different approach to solving the problem of performance metrics and benchmark data sets, by means of physical correctness and standard results drawn from physical theories. For instance, when testing new molecular dynamics algorithms, the Lennard-Jones potential [284] is a well-established benchmark, given that all its properties are well known and studied. Instead of having to compute these values each time one wishes to test a new ML model, or having to read and extract the data set from the literature, a joint effort to build a common and curated data bank might help scientists to quickly test their newly proposed algorithms.
A final approach to solving this problem could be the reuse and sharing of data sets created for a given problem. While solving a problem within CMP, the data set created can be shared through an open service, e.g., Zenodo [285], where other scientists can easily access such information and use it in an open-science fashion [286]. Benchmarks are essential for the prevalence and proliferation of newly proposed ML algorithms applied to CMP, and this should be one of the main challenges to overcome.
8.2. Feature selection and Training of ML models
Another key challenge in the application of ML to CMP is that of feature selection. ML models rely on a data set that is representative of the system itself; without representative features, models might not provide physical insight or meaning. If a data set is not correctly labeled or structured, training can be unstable and different issues can arise. In standard ML, various feature selection techniques are available, along with some useful metrics [287]. Feature selection strongly depends on the data set and the problem alike, so most feature selection methods cannot be transferred directly to CMP problems. For them to be employed, physics criteria and insight into the problem are mandatory. For example, it requires the expertise of a SM physicist to distinguish between a nematic phase and a smectic phase in a liquid crystal, or to identify the symmetries of a given system that need to be conserved when using a ML model. In this review, we have seen some of these shortcomings when talking about atomistic descriptors and other applications. Unfortunately, this is a challenge that is not so easily resolved, as it requires experience with both the problem itself and the ML model to be used. One such example that we have already covered is the description of hard spheres and hard disks using the distances between particles shown in Ref. [253]. This feature selection is not a standard one in ML; it comes directly from the Physics of the problem at hand, and is an important contribution to feature selection in ML and CMP. Research of this kind might be needed even more than creating new ML models, so that more CMP systems can be tested with more feature selection techniques. Another possible approach to such a problem could be the automation of feature selection by means of ML models, for instance using a modification of the scheme proposed by Bojensen [187]. It is interesting to see that most automation methodologies use Reinforcement Learning, which might indicate that it is well suited to solving some of these drawbacks.
We now discuss the issue of training and using ML models in general. Handling and training ML models well enough to obtain good results is not so simple at first. Nowadays, there are many standard techniques used to train ML models, such as regularization, early stopping and many others; for a thorough and detailed explanation of these techniques, we refer the reader to the excellent review by Mehta et al [288], written especially for physicists. The main point here is that training ML models has its difficulties, and most of the time it is not easy to decide which of these techniques to implement, as not all of them are needed for the training to be stable enough. Dealing with overfitting and underfitting is one of the main issues in training ML models, especially in CMP applications, because of the possible lack of data samples and the difficulty of choosing the correct ML algorithm. Moreover, tuning the hyperparameters of ML models requires time and patience from the user, as it can be a time-consuming task, especially when approaching the problem with a ML model for which there is little knowledge of how to use it correctly. There are solutions to this problem, such as Bayesian optimization [289], a robust tool which can help tune hyperparameters in an automated fashion, with good results. Fortunately, more and more research is being carried out to alleviate this problem in ML, for instance the DropConnect approach [290], which greatly helps with the overfitting problem.
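As a simple illustration of automated tuning (here a plain random search rather than full Bayesian optimization), the following sketch cross-validates an SVM over hyperparameter ranges; the data set and the ranges are placeholders, not recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic classification data standing in for a physical data set
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Random search over C and gamma with 5-fold cross-validation
search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    {"C": loguniform(1e-2, 1e3), "gamma": loguniform(1e-4, 1e1)},
    n_iter=25, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Bayesian optimization replaces the random draws with a surrogate model that proposes promising hyperparameters, usually reaching good settings in fewer evaluations.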
In general, most ML models are not readily applicable to CMP problems, mostly because of the lack of good data representations and the need for model tuning. There
is a diverse set of possible solutions to these problems brought directly from ML, but
these need to be tested in CMP problems to ensure such techniques are viable. New
research should be carried out in this way to enhance current and recently proposed ML
applications to CMP, and to make it easier for physicists to employ ML models.
8.3. Replicability of results
It is a known fact that current ML research faces a reproducibility crisis [291]: even with detailed descriptions from the original paper, an independent implementation of a model often does not yield the same results as those claimed in the original work. One of the main problems behind this phenomenon is that most research papers do not provide the code that was used to obtain the results presented in the work. Even if the code is provided, many of the claimed results cannot be obtained because of the stochasticity of certain algorithms; for example, when using dropout [292], a technique that reduces overfitting by randomly turning on or off certain parts of the NN architecture. The intrinsic randomness of most ML models is another key issue in solving the replicability problem. Most researchers deal with this problem by sharing the trained models along with the code used. This can greatly alleviate the problem, but it is still a rare practice in current ML research. Nonetheless, important efforts are being carried out to solve this problem. OpenML [293] is an initiative to collect and centralize ML models, data sets, pipelines and, most importantly, reproducible results. Another great example is DeepDIVA [294], an infrastructure designed to enable the quick and intuitive setup of reproducible experiments, with a large range of useful analysis functionality. Current research in this area [295, 296, 297] is necessary to create outreach and encourage practices that can lead to reproducible science.
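On the practical side, a first step any author can take is to pin down the sources of randomness; a minimal sketch for a NumPy/PyTorch pipeline follows (note that determinism flags can cost performance and do not cover every operation).

```python
import os
import random

import numpy as np
import torch

def set_seed(seed=0):
    """Fix the main sources of randomness for a reproducible run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic (may slow down training)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    os.environ["PYTHONHASHSEED"] = str(seed)

set_seed(42)
```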
Code and data sharing is not a common practice within CMP. Some researchers believe that the original paper must suffice to reproduce the results claimed; this might be true for most of the proposed methodologies and techniques, but not so much for ML applications to CMP. In order for ML to work with CMP, reproducibility is a key factor that needs to be present from the beginning. By focusing on sharing the full training pipeline, the data set used to train a given ML model, and the code used, ML can be quite successful in CMP and, most importantly, it can be implemented by anyone who wishes to use such tools. Until now, there has been no such approach in the current climate of ML and CMP, so it is a great opportunity to start implementing these practices of code, data and model sharing to ensure the reproducibility of results.
8.4. Perspectives
ML and CMP have created a very enriching and prosperous joint research area. Modern
ML techniques have been applied to CMP with outstanding results, and Physics has
given fruitful insights into the theoretical background of ML. But in order to embrace
and fully adopt ML tools into standard physics workflows, some challenges need to be
overcome.
Currently, there is no standard way to couple both ML and Physics simulation
software suites; the main approach is to use available software and hand-craft code that
can use the strengths of both parts. Physicists have seen the great advantage of using standard simulation software suites like LAMMPS [298] or ESPResSO [299],
but these are not meant to be used with ML workflows. A universal coupling scheme might be able to simplify research in this exciting scientific area.
Quantum systems have also seen a great deal of advances with ML applications [10], so it might be interesting to see whether the same schemes can be modified and applied to larger-scale systems, such as soft materials, as we have seen is possible in the topic of phase transitions.
New interesting theoretical insights have also been proposed to explain ML
by applying not only CMP, but other Physics frameworks, like the AdS/CFT
correspondence [300]. Physics is full of these rich and rigorous frameworks and new
research into the area might be able to give a full and thorough explanation of the
theory of Deep Learning and its learning mechanisms.
All in all, ML and CMP are just starting to exchange ideas, creating a fertile research area where the work is geared towards an automated, faster and simplified workflow that will enable condensed matter physicists to unravel the deep and intricate features of many-body systems.
Acknowledgments
This work was financially supported by Conacyt grant 287067.
References
[1] LeCun Y, Bengio Y and Hinton G 2015 Nature 521 436–444
[2] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Van Esesn B C, Awwal
A A S and Asari V K 2018 The history began from alexnet: A comprehensive survey on deep
learning approaches (Preprint 1803.01164)
[3] Young T, Hazarika D, Poria S and Cambria E 2018 IEEE Computational Intelligence Magazine
13 55–75
[4] Voulodimos A, Doulamis N, Doulamis A and Protopapadakis E 2018 Computational Intelligence
and Neuroscience 2018
[5] Litjens G, Kooi T, Bejnordi B E, Setio A A A, Ciompi F, Ghafoorian M, Van Der Laak J A,
Van Ginneken B and Sanchez C I 2017 Medical image analysis 42 60–88
[6] Goodfellow I, Bengio Y and Courville A 2016 Deep Learning (MIT Press) URL http://www.
deeplearningbook.org
Machine Learning for Condensed Matter Physics 37
[7] Sanchez-Lengeling B and Aspuru-Guzik A 2018 Science 361 360–365 ISSN 0036-
8075 (Preprint https://science.sciencemag.org/content/361/6400/360.full.pdf) URL
https://science.sciencemag.org/content/361/6400/360
[8] Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah
P, Spitzer M et al. 2019 Nature Reviews Drug Discovery 18 463–477
[9] Senior A W, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zıdek A, Nelson A W,
Bridgland A et al. 2020 Nature 1–5
[10] Carrasquilla J 2020 Machine learning for quantum matter (Preprint 2003.11040)
[11] Schmidt J, Marques M R, Botti S and Marques M A 2019 npj Computational Materials 5 1–36
[12] Behler J 2016 Journal of Chemical Physics 145 ISSN 00219606
[13] Webb M A, Delannoy J Y and de Pablo J J 2019 Journal of Chemical Theory and Computation
15 1199–1208 ISSN 1549-9618 publisher: American Chemical Society URL https://doi.org/
10.1021/acs.jctc.8b00920
[14] Schutt K T, Sauceda H E, Kindermans P J, Tkatchenko A and Muller K R 2018 The Journal of
Chemical Physics 148 241722 ISSN 0021-9606 publisher: American Institute of Physics URL
https://aip.scitation.org/doi/full/10.1063/1.5019779
[15] Mills K, Spanner M and Tamblyn I 2017 Physical Review A 96 042113 publisher: American
Physical Society URL https://link.aps.org/doi/10.1103/PhysRevA.96.042113
[16] Carleo G and Troyer M 2017 Science 355 602–606 ISSN 0036-8075, 1095-9203 publisher:
American Association for the Advancement of Science Section: Research Articles URL https:
//science.sciencemag.org/content/355/6325/602
[17] Glasser I, Pancotti N, August M, Rodriguez I D and Cirac J I 2018 Physical Review X 8 011006
publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevX.
8.011006
[18] Xie T and Grossman J C 2018 Physical Review Letters 120 145301 publisher: American Physical
Society URL https://link.aps.org/doi/10.1103/PhysRevLett.120.145301
[19] Snyder J C, Rupp M, Hansen K, Muller K R and Burke K 2012 Physical Review Letters 108
253002 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/
PhysRevLett.108.253002
[20] Li L, Snyder J C, Pelaschier I M, Huang J, Niranjan U N, Duncan P, Rupp M, Muller
K R and Burke K 2016 International Journal of Quantum Chemistry 116 819–833 ISSN
1097-461X eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/qua.25040 URL https:
//onlinelibrary.wiley.com/doi/abs/10.1002/qua.25040
[21] Carleo G, Cirac I, Cranmer K, Daudet L, Schuld M, Tishby N, Vogt-Maranto L and Zdeborova
L 2019 Reviews of Modern Physics 91 045002 publisher: American Physical Society URL
https://link.aps.org/doi/10.1103/RevModPhys.91.045002
[22] Alpaydin E 2014 Introduction to Machine Learning 3rd ed (Cambridge: MIT Press)
[23] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Hasan M, Van Essen B C,
Awwal A A and Asari V K 2019 Electronics 8 292
[24] Vapnik V 1998 Statistical Learning Theory 1st ed (New York: John Wiley and Sons, Inc)
[25] Smola A J and Scholkopf B 2004 Statistics and computing 14 199–222
[26] Zhang Y and Liu Y 2009 IEEE Signal Processing Letters 16 414–417
[27] Ben-Hur A, Horn D, Siegelmann H T and Vapnik V 2001 Journal of Machine Learning Research
2 125–137
[28] Hoffmann H 2007 Pattern recognition 40 863–874
[29] Tan S, Zheng F, Liu L, Han J and Shao L 2016 IEEE Transactions on Circuits and Systems for
Video Technology 28 356–363
[30] Liu J, Huang Z, Xu X, Zhang X, Sun S and Li D 2020 IEEE Transactions on Systems, Man, and
Cybernetics: Systems
[31] Nguyen T T, Nguyen N D and Nahavandi S 2020 IEEE Transactions on Cybernetics 1–14 ISSN
2168-2275
[32] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller
M, Fidjeland A K, Ostrovski G et al. 2015 Nature 518 529–533
[33] Shawe-Taylor J, Cristianini N et al. 2004 Kernel methods for pattern analysis (Cambridge
university press)
[34] Aggarwal C C 2018 Neural Networks and Deep Learning: A Textbook 1st ed (Springer Publishing
Company, Incorporated) ISBN 3319944622
[35] Xu X, Hu D and Lu X 2007 IEEE Transactions on Neural Networks 18 973–992
[36] An Y, Ding S, Shi S and Li J 2018 Pattern Recognition Letters 111 30–35
[37] Driessens K, Ramon J and Gartner T 2006 Machine Learning 64 91–119
[38] Jiu M and Sahbi H 2019 Pattern Recognition 88 447 – 457 ISSN 0031-3203 URL http:
//www.sciencedirect.com/science/article/pii/S0031320318304266
[39] Le L and Xie Y 2019 Neurocomputing 339 292–302
[40] Bai L, Shao Y H, Wang Z and Li C N 2019 Knowledge-Based Systems 163 227–240
[41] Sanakoyeu A, Bautista M A and Ommer B 2018 Pattern Recognition 78 331 – 343 ISSN 0031-3203
URL http://www.sciencedirect.com/science/article/pii/S0031320318300293
[42] Tesauro G 1992 Practical issues in temporal difference learning Advances in neural information
processing systems pp 259–266
[43] Liu H, Zhou J, Xu Y, Zheng Y, Peng X and Jiang W 2018 Neurocomputing 315 412–424
[44] da Silva L E B, Elnabarawy I and Wunsch II D C 2019 Neural Networks 120 167–203
[45] Tan A H, Lu N and Xiao D 2008 IEEE Transactions on Neural Networks 19 230–244
[46] Salakhutdinov R and Hinton G 2009 Deep boltzmann machines Artificial intelligence and
statistics pp 448–455
[47] Bishop C 2006 Pattern Recognition and Machine Learning Information Science and
Statistics (Springer) ISBN 9780387310732 URL https://books.google.com.mx/books?id=
qWPwnQEACAAJ
[48] Hu Y, Luo S, Han L, Pan L and Zhang T 2020 Artificial Intelligence in Medicine
102 101764 ISSN 0933-3657 URL http://www.sciencedirect.com/science/article/pii/
S093336571830366X
[49] Hinton G E and Salakhutdinov R R 2006 Science 313 504–507
[50] Quinlan J R 1986 Machine Learning 1 81–106
[51] Williams C K and Rasmussen C E 2006 Gaussian processes for machine learning 2nd ed (MIT
press Cambridge, MA)
[52] Roth V 2004 IEEE transactions on neural networks 15 16–28
[53] Graupe D 2019 Principles of Artificial Neural Networks: Basic Designs to Deep Learning 4th ed
(Singapore: World Scientific)
[54] Aggarwal C 2020 Linear Algebra and Optimization for Machine Learning: A Textbook (Springer
International Publishing) ISBN 9783030403447 URL https://books.google.com.mx/books?
id=EdTkDwAAQBAJ
[55] Guresen E and Kayakutlu G 2011 Procedia Computer Science 3 426 – 433 ISSN 1877-
0509 world Conference on Information Technology URL http://www.sciencedirect.com/
science/article/pii/S1877050910004461
[56] Chen T and Chen H 1995 IEEE Transactions on Neural Networks 6 911–917
[57] Shrestha A and Mahmood A 2019 IEEE Access 7 53040–53065
[58] Bottou L 2012 Stochastic gradient descent tricks Neural Networks: Tricks of the Trade:
Second Edition ed Montavon G, Orr G B and Muller K R (Berlin, Heidelberg: Springer
Berlin Heidelberg) pp 421–436 ISBN 978-3-642-35289-8 URL https://doi.org/10.1007/
978-3-642-35289-8_25
[59] Pascanu R and Bengio Y 2013 Revisiting natural gradient for deep networks (Preprint 1301.3584)
[60] Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwinska A, Colmenarejo
S G, Grefenstette E, Ramalho T, Agapiou J et al. 2016 Nature 538 471–476
[61] Quan Z, Zeng W, Li X, Liu Y, Yu Y and Yang W 2019 IEEE Transactions on Neural Networks
and Learning Systems 31 813–826
[62] Van Veen F and Leijnen S 2019 The Neural Network Zoo (The Asimov Institute) URL
https://www.asimovinstitute.org/neural-network-zoo
[63] Rosenblatt F 1958 Psychological Review 65 386–408
[64] Hornik K, Stinchcombe M and White H 1989 Neural Networks 2 359–366
[65] Jain A K, Mao J and Mohiuddin K M 1996 Computer 29 31–44
[66] Frommberger L 2010 Qualitative spatial abstraction in reinforcement learning (Springer Science
& Business Media)
[67] Bourlard H and Kamp Y 1988 Biological cybernetics 59 291–294
[68] Snoek J, Adams R P and Larochelle H 2012 Journal of Machine Learning Research 13 2567–2588
[69] Blaschke T, Olivecrona M, Engkvist O, Bajorath J and Chen H 2018 Molecular Informatics 37
1700123 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/minf.201700123)
URL https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700123
[70] Talwar D, Mongia A, Sengupta D and Majumdar A 2018 Scientific reports 8 1–11
[71] Vellido A, Lisboa P J and Vaughan J 1999 Expert Systems with applications 17 51–70
[72] Hsieh T J, Hsiao H F and Yeh W C 2011 Applied soft computing 11 2510–2525
[73] Bohanec M, Borstnar M K and Robnik-Sikonja M 2017 Expert Systems with Applications 71
416–428
[74] Montavon G, Samek W and Muller K R 2018 Digital Signal Processing 73 1–15
[75] Padierna L C, Carpio M, Rojas-Domınguez A, Puga H and Fraire H 2018 Pattern Recognition
84 211 – 225 ISSN 0031-3203 URL http://www.sciencedirect.com/science/article/pii/
S0031320318302280
[76] Rojas-Domınguez A, Padierna L C, Valadez J M C, Puga-Soberanes H J and Fraire H J 2017
IEEE Access 6 7164–7176
[77] Vapnik V N 1999 IEEE transactions on neural networks 10 988–999
[78] Maldonado S and Lopez J 2014 Information Sciences 268 328–341
[79] Xu Y, Yang Z and Pan X 2016 IEEE transactions on neural networks and learning systems 28
359–370
[80] Tsang I W, Kwok J T and Cheung P M 2005 Journal of Machine Learning Research 6 363–392
URL https://www.jmlr.org/papers/v6/tsang05a.html
[81] Sadrfaridpour E, Razzaghi T and Safro I 2019 Machine Learning 108 1879–1917
[82] Chang C C and Lin C J 2011 ACM Transactions on Intelligent Systems and Technology 2(3)
27:1–27:27 software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[83] Fan R E, Chang K W, Hsieh C J, Wang X R and Lin C J 2008 Journal of Machine Learning
Research 9 1871–1874
[84] Shalev-Shwartz S, Singer Y, Srebro N and Cotter A 2011 Mathematical programming 127 3–30
[85] Nandan M, Khargonekar P P and Talathi S S 2014 J. Mach. Learn. Res. 15 59–98 ISSN 1532-4435
[86] Bohn B, Griebel M and Rieger C 2019 Journal of Machine Learning Research 20 1–32 URL
http://jmlr.org/papers/v20/17-621.html
[87] Chaikin P M and Lubensky T C 1995 Principles of Condensed Matter Physics (Cambridge:
Cambridge University Press)
[88] Anderson P W 2018 Basic notions of condensed matter physics (CRC Press)
[89] Girvin S M and Yang K 2019 Modern Condensed Matter Physics (Cambridge University Press)
[90] Kohn W 1999 Rev. Mod. Phys. 71(2) S59–S77
[91] Lubensky T 1997 Solid State Communications 102 187 – 197 ISSN 0038-1098 highlights in
Condensed Matter Physics and Materials Science
[92] Witten T A 1999 Rev. Mod. Phys. 71(2) S367–S373
[93] Yan H, Park S H, Finkelstein G, Reif J H and LaBean T H 2003 Science 301 1882–1884 ISSN
0036-8075
[94] Zhang S, Marini D M, Hwang W and Santoso S 2002 Current Opinion in Chemical Biology 6
865 – 871 ISSN 1367-5931
[95] De Gennes P 2005 Soft Matter 1(1) 16–16
[96] Russel W B, Saville D A and Schowalter W R 1991 Colloidal Dispersions Cambridge
Monographs on Mechanics (Cambridge University Press) ISBN 9780521426008 URL https:
//books.google.com.mx/books?id=3shp8Kl6YoUC
[97] Eberle A P R, Wagner N J and Castaneda Priego R 2011 Phys. Rev. Lett. 106(10) 105704 URL
https://link.aps.org/doi/10.1103/PhysRevLett.106.105704
[98] Eberle A P, Castaneda-Priego R, Kim J M and Wagner N J 2012 Langmuir 28 1866–1878
[99] Valadez-Perez N E, Benavides A L, Scholl-Paschinger E and Castaneda-Priego R 2012 The
Journal of chemical physics 137 084905
[100] Cheng Z, Chaikin P, Russel W, Meyer W, Zhu J, Rogers R and Ottewill R 2001 Materials &
Design 22 529–534
[101] McGrother S C, Gil-Villegas A and Jackson G 1998 Molecular Physics 95 657–
673 (Preprint https://doi.org/10.1080/00268979809483199) URL https://doi.org/10.
1080/00268979809483199
[102] McGrother S C, Gil-Villegas A and Jackson G 1996 Journal of Physics: Condensed Matter 8
9649–9655
[103] Bleil S, von Grunberg H H, Dobnikar J, Castaneda-Priego R and Bechinger C 2006 Europhysics
Letters (EPL) 73 450–456
[104] Anderson V J and Lekkerkerker H N 2002 Nature 416 811–815
[105] Zinn-Justin J 2007 Phase Transitions and Renormalization Group Oxford Graduate Texts (OUP
Oxford) ISBN 9780199227198 URL https://books.google.com.mx/books?id=mUMVDAAAQBAJ
[106] Binney J J, Dowrick N J, Fisher A J and Newman M E J 1992
The Theory of Critical Phenomena: An Introduction to the Renormalization Group Oxford
Science Publ (Clarendon Press) ISBN 9780198513933 URL https://books.google.com.mx/
books?id=lvZcGJI3X9cC
[107] Mermin N D 1979 Rev. Mod. Phys. 51(3) 591–648
[108] Chuang I, Yurke B, Pargellis A N and Turok N 1993 Phys. Rev. E 47(5) 3343–3356
[109] Gil-Villegas A, McGrother S C and Jackson G 1997 Chemical Physics Letters 269
441 – 447 ISSN 0009-2614 URL http://www.sciencedirect.com/science/article/pii/
S0009261497003072
[110] Alder B J and Wainwright T E 1957 The Journal of Chemical Physics 27 1208–1209 ISSN 0021-
9606 publisher: American Institute of Physics URL https://aip.scitation.org/doi/10.
1063/1.1743957
[111] Fernandez L A, Martın-Mayor V, Seoane B and Verrocchio P 2012 Physical Review Letters 108
165701 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/
PhysRevLett.108.165701
[112] Hoover W G and Ree F H 1968 The Journal of Chemical Physics 49 3609–3617 ISSN 0021-9606
publisher: American Institute of Physics URL https://aip.scitation.org/doi/10.1063/
1.1670641
[113] Robles M, Lopez de Haro M and Santos A 2014 The Journal of Chemical Physics 140 136101
ISSN 0021-9606 publisher: American Institute of Physics URL https://aip.scitation.org/
doi/10.1063/1.4870524
[114] Annett J F 2004 Superconductivity, superfluids, and condensates Oxford Master Series in
Condensed Matter Physics (Oxford University Press) ISBN 9780198507550
[115] Tinkham M 1996 Introduction to superconductivity 2nd ed International Series in Pure and
Applied Physics (McGraw Hill) ISBN 9780070648784
[116] Santi E, Eskandari S, Tian B and Peng K 2018 6 - power electronic modules Power Electronics
Handbook (Fourth Edition) ed Rashid M H (Butterworth-Heinemann) pp 157 – 173 fourth
edition ed ISBN 978-0-12-811407-0 URL http://www.sciencedirect.com/science/article/
pii/B9780128114070000064
[117] Yeh N C 2008 AAPPS Bulletin 18 11
[118] National Research Council 2007 Condensed-Matter and Materials Physics: The Science of the World Around
Us (Washington, DC: The National Academies Press) ISBN 978-0-309-10969-7
[119] Cipra B A 1987 The American Mathematical Monthly 94 937–959 (Preprint https://doi.
org/10.1080/00029890.1987.12000742) URL https://doi.org/10.1080/00029890.1987.
12000742
[120] Ising E 1925 Z. Phys. 31 253–258
[121] Onsager L 1944 Phys. Rev. 65(3-4) 117–149
[122] Wilson K G 1971 Phys. Rev. B 4(9) 3174–3183
[123] Wilson K G 1971 Phys. Rev. B 4(9) 3184–3205 URL https://link.aps.org/doi/10.1103/
PhysRevB.4.3184
[124] Ferrenberg A M and Landau D P 1991 Phys. Rev. B 44(10) 5081–5091 URL https://link.
aps.org/doi/10.1103/PhysRevB.44.5081
[125] Stanley H E 1968 Phys. Rev. Lett. 20(12) 589–592 URL https://link.aps.org/doi/10.1103/
PhysRevLett.20.589
[126] Ashkin J and Teller E 1943 Phys. Rev. 64(5-6) 178–184 URL https://link.aps.org/doi/10.
1103/PhysRev.64.178
[127] Kunz H and Zumbach G 1992 Phys. Rev. B 46(2) 662–673 URL https://link.aps.org/doi/
10.1103/PhysRevB.46.662
[128] Straley J P and Fisher M E 1973 Journal of Physics A: Mathematical, Nuclear and General 6
1310–1326 URL https://doi.org/10.1088/0305-4470/6/9/007
[129] Berezinsky V 1971 Sov. Phys. JETP 32 493–500 URL https://inspirehep.net/literature/
61186
[130] Berezinsky V 1972 Sov. Phys. JETP 34 610 URL https://inspirehep.net/literature/
1716846
[131] Kosterlitz J M and Thouless D J 1973 Journal of Physics C: Solid State Physics 6 1181–1203
[132] Kosterlitz J M 1974 Journal of Physics C: Solid State Physics 7 1046–1060
[133] Chen J, Liao H J, Xie H D, Han X J, Huang R Z, Cheng S, Wei Z C, Xie Z Y and Xiang T 2017
Chinese Physics Letters 34 050503
[134] Yong J, Lemberger T R, Benfatto L, Ilin K and Siegel M 2013 Phys. Rev. B 87(18) 184505 URL
https://link.aps.org/doi/10.1103/PhysRevB.87.184505
[135] Lin Y H, Nelson J and Goldman A M 2012 Phys. Rev. Lett. 109(1) 017002 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.109.017002
[136] Gibney E and Castelvecchi D 2016 Nature News 538 18
[137] Carrasquilla J and Melko R G 2017 Nature Physics 13 431–434 ISSN 1745-2473
[138] Nightingale P 1982 Journal of Applied Physics 53 7927–7932
[139] Kogut J B 1979 Rev. Mod. Phys. 51(4) 659–713
[140] Deng D L, Li X and Das Sarma S 2017 Phys. Rev. B 96(19) 195145
[141] Broecker P, Carrasquilla J, Melko R G and Trebst S 2017 Scientific reports 7 1–10
[142] Tanaka A and Tomiya A 2017 Journal of the Physical Society of Japan 86 063001
[143] Zhang Y, Melko R G and Kim E A 2017 Phys. Rev. B 96(24) 245119
[144] Rodriguez-Nieva J F and Scheurer M S 2019 Nature Physics 15 790–795
[145] Beach M J S, Golubeva A and Melko R G 2018 Phys. Rev. B 97(4) 045207 URL https:
//link.aps.org/doi/10.1103/PhysRevB.97.045207
[146] Melko R G and Gingras M J P 2004 Journal of Physics: Condensed Matter 16 R1277–R1319
URL https://doi.org/10.1088/0953-8984/16/43/r02
[147] Ch’ng K, Carrasquilla J, Melko R G and Khatami E 2017 Phys. Rev. X 7(3) 031038 URL
https://link.aps.org/doi/10.1103/PhysRevX.7.031038
[148] Ponte P and Melko R G 2017 Phys. Rev. B 96(20) 205146 URL https://link.aps.org/doi/
10.1103/PhysRevB.96.205146
[149] Morningstar A and Melko R G 2017 J. Mach. Learn. Res. 18 5975–5991 ISSN 1532-4435
[150] Efthymiou S, Beach M J S and Melko R G 2019 Phys. Rev. B 99(7) 075113 URL https:
//link.aps.org/doi/10.1103/PhysRevB.99.075113
[151] Zhang W, Liu J and Wei T C 2019 Physical Review E 99 1–16 ISSN 24700053 (Preprint
1804.02709)
[152] Kim D and Kim D H 2018 Phys. Rev. E 98(2) 022138
[153] Giannetti C, Lucini B and Vadacchino D 2019 Nuclear Physics B 944 114639 ISSN 05503213
[154] Friedman J, Hastie T and Tibshirani R 2001 The elements of statistical learning vol 1 (Springer
series in statistics New York)
[155] Shiina K, Mori H, Okabe Y and Lee H K 2020 Scientific Reports 10
[156] Wu F Y 1982 Rev. Mod. Phys. 54(1) 235–268 URL https://link.aps.org/doi/10.1103/
RevModPhys.54.235
[157] Carvalho D, Garcıa-Martınez N A, Lado J L and Fernandez-Rossier J 2018 Phys. Rev. B 97(11)
115453 URL https://link.aps.org/doi/10.1103/PhysRevB.97.115453
[158] Ohtsuki T and Ohtsuki T 2016 Journal of the Physical Society of Japan 85 123706
(Preprint https://doi.org/10.7566/JPSJ.85.123706) URL https://doi.org/10.7566/
JPSJ.85.123706
[159] Ohtsuki T and Ohtsuki T 2017 Journal of the Physical Society of Japan 86 044708
(Preprint https://doi.org/10.7566/JPSJ.86.044708) URL https://doi.org/10.7566/
JPSJ.86.044708
[160] Ohtsuki T and Mano T 2020 Journal of the Physical Society of Japan 89 022001 (Preprint https:
//doi.org/10.7566/JPSJ.89.022001) URL https://doi.org/10.7566/JPSJ.89.022001
[161] Livni R, Shalev-Shwartz S and Shamir O 2014 On the Computational Efficiency
of Training Neural Networks Advances in Neural Information Processing Systems
27 ed Ghahramani Z, Welling M, Cortes C, Lawrence N D and Weinberger
K Q (Curran Associates, Inc.) pp 855–863 URL http://papers.nips.cc/paper/
5267-on-the-computational-efficiency-of-training-neural-networks.pdf
[162] Song L, Vempala S, Wilmes J and Xie B 2017 On the Complexity of Learning
Neural Networks Advances in Neural Information Processing Systems 30 ed Guyon
I, Luxburg U V, Bengio S, Wallach H, Fergus R, Vishwanathan S and Gar-
nett R (Curran Associates, Inc.) pp 5514–5522 URL http://papers.nips.cc/paper/
7135-on-the-complexity-of-learning-neural-networks.pdf
[163] van der Maaten L and Hinton G 2008 Journal of Machine Learning Research 9 2579–2605
[164] Ng A Y, Jordan M I and Weiss Y 2001 On spectral clustering: Analysis and an algorithm
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (MIT Press) pp 849–
856
[165] Pearson K 1901 The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science
2 559–572
[166] Wetzel S J 2017 Phys. Rev. E 96(2) 022140 URL https://link.aps.org/doi/10.1103/
PhysRevE.96.022140
[167] Wang L 2016 Phys. Rev. B 94(19) 195105 URL https://link.aps.org/doi/10.1103/
PhysRevB.94.195105
[168] Huembeli P, Dauphin A and Wittek P 2018 Phys. Rev. B 97(13) 134109
[169] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and
Bengio Y 2014 Generative adversarial nets Advances in neural information processing systems
pp 2672–2680
[170] Rocchetto A, Grant E, Strelchuk S, Carleo G and Severini S 2018 npj Quantum Information 4
1–7
[171] Kingma D P and Welling M 2013 arXiv preprint arXiv:1312.6114
[172] Zhang P, Shen H and Zhai H 2018 Phys. Rev. Lett. 120(6) 066401 URL https://link.aps.org/
doi/10.1103/PhysRevLett.120.066401
[173] Liang X, Liu W Y, Lin P Z, Guo G C, Zhang Y S and He L 2018 Phys. Rev. B 98(10) 104426
URL https://link.aps.org/doi/10.1103/PhysRevB.98.104426
[174] Baumgartner A, Burkitt A, Ceperley D, De Raedt H, Ferrenberg A, Heermann D, Herrmann H,
Landau D, Levesque D, von der Linden W et al. 2012 The Monte Carlo method in condensed
matter physics vol 71 (Springer Science & Business Media)
[175] Newman M and Barkema G 1999 Monte Carlo methods in statistical physics chapters 1–4 (New
York: Oxford University Press)
[176] Gubernatis J E 2003 The monte carlo method in the physical sciences: celebrating the 50th
anniversary of the metropolis algorithm The Monte Carlo Method in the Physical Sciences vol
690
[177] Landau D P and Binder K 2014 A guide to Monte Carlo simulations in statistical physics
(Cambridge university press)
[178] Liu J, Qi Y, Meng Z Y and Fu L 2017 Physical Review B 95 1–5 ISSN 24699969
[179] Huang L and Wang L 2017 Phys. Rev. B 95(3) 035105 URL https://link.aps.org/doi/10.
1103/PhysRevB.95.035105
[180] Liu J, Shen H, Qi Y, Meng Z Y and Fu L 2017 Phys. Rev. B 95(24) 241104 URL https:
//link.aps.org/doi/10.1103/PhysRevB.95.241104
[181] Nagai Y, Shen H, Qi Y, Liu J and Fu L 2017 Phys. Rev. B 96(16) 161102 URL https:
//link.aps.org/doi/10.1103/PhysRevB.96.161102
[182] Shen H, Liu J and Fu L 2018 Phys. Rev. B 97(20) 205140 URL https://link.aps.org/doi/
10.1103/PhysRevB.97.205140
[183] Chen C, Xu X Y, Liu J, Batrouni G, Scalettar R and Meng Z Y 2018 Phys. Rev. B 98(4) 041102
URL https://link.aps.org/doi/10.1103/PhysRevB.98.041102
[184] Pilati S, Inack E M and Pieri P 2019 Phys. Rev. E 100(4) 043301 URL https://link.aps.org/
doi/10.1103/PhysRevE.100.043301
[185] Nagai Y, Okumura M and Tanaka A 2020 Phys. Rev. B 101(11) 115111 URL https://link.
aps.org/doi/10.1103/PhysRevB.101.115111
[186] Behler J and Parrinello M 2007 Phys. Rev. Lett. 98(14) 146401
[187] Bojesen T A 2018 Physical Review E 98 063303 publisher: American Physical Society URL
https://link.aps.org/doi/10.1103/PhysRevE.98.063303
[188] Venderley J, Khemani V and Kim E A 2018 Phys. Rev. Lett. 120(25) 257204 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.120.257204
[189] Choo K, Neupert T and Carleo G 2019 Phys. Rev. B 100(12) 125124 URL https://link.aps.
org/doi/10.1103/PhysRevB.100.125124
[190] Zhang C, Bengio S, Hardt M, Recht B and Vinyals O 2016 Understanding deep learning requires
rethinking generalization (Preprint 1611.03530)
[191] Lin H W, Tegmark M and Rolnick D 2017 Journal of Statistical Physics 168 1223–1247
[192] Fan F, Xiong J and Wang G 2020 On interpretability of artificial neural networks (Preprint
2001.02522)
[193] Mehta P and Schwab D J 2014 An exact mapping between the Variational Renormalization Group
and Deep Learning (Preprint 1410.3831) URL http://arxiv.org/abs/1410.3831
[194] Li S H and Wang L 2018 Phys. Rev. Lett. 121(26) 260601 URL https://link.aps.org/doi/
10.1103/PhysRevLett.121.260601
[195] Koch-Janusz M and Ringel Z 2018 Nature Physics 14 578–582
[196] Lenggenhager P M, Gokmen D E, Ringel Z, Huber S D and Koch-Janusz M 2020 Phys. Rev. X
10(1) 011037 URL https://link.aps.org/doi/10.1103/PhysRevX.10.011037
[197] de Mello Koch E, de Mello Koch A, Kastanos N and Cheng L 2020 Short sighted deep learning
(Preprint 2002.02664)
[198] Hu H Y, Li S H, Wang L and You Y Z 2019 Machine learning holographic mapping by neural
network renormalization group (Preprint 1903.00804)
[199] Smolensky P 1986 Information processing in dynamical systems: Foundations of harmony theory
Tech. rep. Colorado Univ at Boulder Dept of Computer Science
[200] Larochelle H and Bengio Y 2008 Classification using discriminative restricted boltzmann machines
Proceedings of the 25th International Conference on Machine Learning ICML ’08 (New York,
NY, USA: Association for Computing Machinery) p 536–543 ISBN 9781605582054 URL
https://doi.org/10.1145/1390156.1390224
[201] Hinton G E and Salakhutdinov R R 2006 Science 313 504–507 ISSN 0036-8075 (Preprint https:
//science.sciencemag.org/content/313/5786/504.full.pdf) URL https://science.
sciencemag.org/content/313/5786/504
[202] Coates A, Ng A and Lee H 2011 An analysis of single-layer networks in unsupervised feature
learning Proceedings of the fourteenth international conference on artificial intelligence and
statistics pp 215–223
[203] Fischer A and Igel C 2014 Pattern Recognition 47 25–39
[204] Hinton G E 2002 Neural Computation 14 1771–1800 ISSN 0899-7667 publisher: MIT Press URL
https://doi.org/10.1162/089976602760128018
[205] Hinton G E 2012 A Practical Guide to Training Restricted Boltzmann Machines Neural Networks:
Tricks of the Trade: Second Edition Lecture Notes in Computer Science ed Montavon G, Orr
G B and Muller K R (Berlin, Heidelberg: Springer) pp 599–619 ISBN 978-3-642-35289-8 URL
https://doi.org/10.1007/978-3-642-35289-8_32
[206] Huang H and Toyoizumi T 2015 Phys. Rev. E 91(5) 050101
[207] Decelle A, Fissore G and Furtlehner C 2018 Journal of Statistical Physics 172 1576–1608
[208] Nguyen H C, Zecchina R and Berg J 2017 Advances in Physics 66 197–261 (Preprint https:
//doi.org/10.1080/00018732.2017.1341604) URL https://doi.org/10.1080/00018732.
2017.1341604
[209] Decelle A, Fissore G and Furtlehner C 2017 EPL (Europhysics Letters) 119 60001 URL
https://doi.org/10.1209/0295-5075/119/60001
[210] Tubiana J, Cocco S and Monasson R 2019 Elife 8 e39397
[211] Tubiana J, Cocco S and Monasson R 2019 Neural Computation 31 1671–1717 pMID: 31260391
(Preprint https://doi.org/10.1162/neco_a_01210) URL https://doi.org/10.1162/
neco_a_01210
[212] Dahl G, Ranzato M, Mohamed A r and Hinton G E 2010 Phone recognition with the mean-
covariance restricted boltzmann machine Advances in neural information processing systems
pp 469–477
[213] Nomura Y, Darmawan A S, Yamaji Y and Imada M 2017 Phys. Rev. B 96(20) 205152 URL
https://link.aps.org/doi/10.1103/PhysRevB.96.205152
[214] Melko R G, Carleo G, Carrasquilla J and Cirac J I 2019 Nature Physics 15 887–892
[215] Suchsland P and Wessel S 2018 Phys. Rev. B 97(17) 174435 URL https://link.aps.org/doi/
10.1103/PhysRevB.97.174435
[216] Baity-Jesi M, Sagun L, Geiger M, Spigler S, Arous G B, Cammarota C, LeCun Y, Wyart M and
Biroli G 2019 Journal of Statistical Mechanics: Theory and Experiment 2019 124013
[217] Geiger M, Spigler S, d’Ascoli S, Sagun L, Baity-Jesi M, Biroli G and Wyart M 2019 Phys. Rev.
E 100(1) 012115 URL https://link.aps.org/doi/10.1103/PhysRevE.100.012115
[218] Allen M P and Tildesley D J 2017 Computer simulation of liquids (Oxford university press)
[219] Lindahl E and Sansom M S 2008 Current Opinion in Structural Biology 18 425 – 431 ISSN 0959-
440X membranes / Engineering and design URL http://www.sciencedirect.com/science/
article/pii/S0959440X08000304
[220] Badia A, Lennox R B and Reven L 2000 Accounts of Chemical Research 33 475–481 pMID:
10913236
[221] Sidky H, Colon Y J, Helfferich J, Sikora B J, Bezik C, Chu W, Giberti F, Guo A Z, Jiang X,
Lequieu J, Li J, Moller J, Quevillon M J, Rahimi M, Ramezani-Dakhel H, Rathee V S, Reid
D R, Sevgen E, Thapar V, Webb M A, Whitmer J K and de Pablo J J 2018 The Journal of
Chemical Physics 148 044104
[222] Sidky H and Whitmer J K 2018 The Journal of Chemical Physics 148 104111
[223] Sultan M M and Pande V S 2018 The Journal of Chemical Physics 149 094106
[224] Marx D and Hutter J 2009 Ab initio molecular dynamics: basic theory and advanced methods
(Cambridge University Press)
[225] Koch W and Holthausen M 2015 A Chemist’s Guide to Density Functional Theory (Wiley) ISBN
9783527802814
[226] Parr R G 1980 Density functional theory of atoms and molecules Horizons of quantum chemistry
(Springer) pp 5–15
[227] Custodio C A, Filletti E R and Franca V V 2019 Scientific reports 9 1–7
[228] Manzhos S 2020 Machine Learning: Science and Technology 1 013002 URL https://doi.org/
10.1088/2632-2153/ab7d30
[229] Berman H, Henrick K, Nakamura H and Markley J L 2006 Nucleic Acids Research 35 D301–D303
ISSN 0305-1048
[230] Hilbert M and Lopez P 2011 Science 332 60–65 ISSN 0036-8075 (Preprint https://science.
sciencemag.org/content/332/6025/60.full.pdf) URL https://science.sciencemag.
org/content/332/6025/60
[231] Blank T B, Brown S D, Calhoun A W and Doren D J 1995 The Journal of Chemical Physics
103 4129–4137
[232] Ozboyaci M, Kokh D B, Corni S and Wade R C 2016 Quarterly Reviews of Biophysics 49 e4
[233] Bartok A P, Payne M C, Kondor R and Csanyi G 2010 Phys. Rev. Lett. 104(13) 136403
[234] Pan S J and Yang Q 2010 IEEE Transactions on Knowledge and Data Engineering 22 1345–1359
[235] Torrey L and Shavlik J 2010 Transfer Learning (IGI Global) pp 242–264 ISBN 9781605667669
URL www.igi-global.com/chapter/transfer-learning/36988
[236] Weiss K, Khoshgoftaar T M and Wang D 2016 Journal of Big Data 3 9 ISSN 2196-1115 URL
https://doi.org/10.1186/s40537-016-0043-6
[237] French R M 1999 Trends in Cognitive Sciences 3 128–135 ISSN 1364-6613 URL http://www.
sciencedirect.com/science/article/pii/S1364661399012942
[238] McCloskey M and Cohen N J 1989 Catastrophic Interference in Connectionist Networks: The
Sequential Learning Problem Psychology of Learning and Motivation vol 24 ed Bower G H
(Academic Press) pp 109–165 URL http://www.sciencedirect.com/science/article/pii/
S0079742108605368
[239] Kumaran D, Hassabis D and McClelland J L 2016 Trends in Cognitive Sciences 20
512–534 ISSN 1364-6613 URL http://www.sciencedirect.com/science/article/pii/
S1364661316300432
[240] McClelland J L, McNaughton B L and O’Reilly R C 1995 Psychological Review 102 419–457
ISSN 0033-295X publisher: American Psychological Association
[241] Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu A A, Milan K, Quan J,
Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D and Hadsell R 2017
Proceedings of the National Academy of Sciences 114 3521–3526 ISSN 0027-8424 publisher:
National Academy of Sciences eprint: https://www.pnas.org/content/114/13/3521.full.pdf
URL https://www.pnas.org/content/114/13/3521
[242] Li Z and Hoiem D 2018 IEEE Transactions on Pattern Analysis and Machine Intelligence 40
2935–2947 ISSN 1939-3539
[243] Morawietz T, Singraber A, Dellago C and Behler J 2016 Proceedings of the National Academy of
Sciences 113 8368–8373 ISSN 0027-8424
[244] Natarajan S K and Behler J 2016 Phys. Chem. Chem. Phys. 18(41) 28704–28725
[245] Bartok A P, Kondor R and Csanyi G 2013 Phys. Rev. B 87(18) 184115
[246] Rupp M, Tkatchenko A, Muller K R and von Lilienfeld O A 2012 Phys. Rev. Lett. 108(5) 058301
[247] Thompson A, Swiler L, Trott C, Foiles S and Tucker G 2015 Journal of Computational Physics
285 316 – 330 ISSN 0021-9991 URL http://www.sciencedirect.com/science/article/
pii/S0021999114008353
[248] Geiger P and Dellago C 2013 The Journal of Chemical Physics 139 164105
[249] Balabin R M and Lomakina E I 2011 Phys. Chem. Chem. Phys. 13(24) 11710–11718
[250] Schutt K T, Glawe H, Brockherde F, Sanna A, Muller K R and Gross E K U 2014 Phys. Rev. B
89(20) 205118 URL https://link.aps.org/doi/10.1103/PhysRevB.89.205118
[251] Faber F, Lindmaa A, von Lilienfeld O A and Armiento R 2015 International Journal of Quantum
Chemistry 115 1094–1101 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/
qua.24917) URL https://onlinelibrary.wiley.com/doi/abs/10.1002/qua.24917
[252] Ceriotti M 2019 The Journal of Chemical Physics 150 150901
[253] Jadrich R B, Lindquist B A and Truskett T M 2018 The Journal of Chemical Physics 149 194109
[254] Jadrich R B, Lindquist B A, Pineros W D, Banerjee D and Truskett T M 2018 The Journal of
Chemical Physics 149 194110
[255] Mak C H 2006 Phys. Rev. E 73(6) 065104 URL https://link.aps.org/doi/10.1103/
PhysRevE.73.065104
[256] Bagchi K, Andersen H C and Swope W 1996 Phys. Rev. Lett. 76(2) 255–258 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.76.255
[257] Boattini E, Dijkstra M and Filion L 2019 The Journal of Chemical Physics 151 154901
[258] Steinhardt P J, Nelson D R and Ronchetti M 1983 Phys. Rev. B 28(2) 784–805 URL https:
//link.aps.org/doi/10.1103/PhysRevB.28.784
[259] Lechner W and Dellago C 2008 The Journal of Chemical Physics 129 114707
[260] Chandrasekhar S 1992 Liquid Crystals Cambridge Monographs on Physics (Cambridge University
Press) ISBN 9780521417471 URL https://books.google.com.mx/books?id=GYDxM1fPArMC
[261] Sluckin T, Dunmur D and Stegemeyer H 2004 Crystals That Flow: Classic Papers from the
History of Liquid Crystals Liquid Crystals Book Series (Taylor & Francis) ISBN 9780415257893
[262] Walters M, Wei Q and Chen J Z Y 2019 Physical Review E 99 062701 publisher: American
Physical Society URL https://link.aps.org/doi/10.1103/PhysRevE.99.062701
[263] Hochreiter S and Schmidhuber J 1997 Neural Computation 9 1735–1780
[264] Greff K, Srivastava R K, Koutnık J, Steunebrink B R and Schmidhuber J 2017 IEEE Transactions
on Neural Networks and Learning Systems 28 2222–2232 ISSN 2162-2388
[265] Terao T 2020 Soft Materials 0 1–13
[266] de Gennes P G 1979 Scaling Concepts in Polymer Physics (Cornell University Press) ISBN
9780801412035 URL https://books.google.com.mx/books?id=ApzfJ2LYwGUC
[267] Menon A, Gupta C, Perkins K M, DeCost B L, Budwal N, Rios R T, Zhang K, Poczos B and
Washburn N R 2017 Molecular Systems Design & Engineering 2 263–273 publisher: Royal
Society of Chemistry URL https://pubs.rsc.org/en/content/articlelanding/2017/me/
c7me00027h
[268] Xu Q and Jiang J 2020 ACS Applied Polymer Materials Publisher: American Chemical Society
URL https://doi.org/10.1021/acsapm.0c00586
[269] Wu S, Kondo Y, Kakimoto M a, Yang B, Yamada H, Kuwajima I, Lambard G, Hongo K, Xu
Y, Shiomi J, Schick C, Morikawa J and Yoshida R 2019 npj Computational Materials 5 1–11
ISSN 2057-3960 number: 1 Publisher: Nature Publishing Group URL https://www.nature.
com/articles/s41524-019-0203-2
[270] Hamley I 2013 Introduction to Soft Matter: Synthetic and Biological Self-Assembling Materials
(Wiley) ISBN 9781118681428 URL https://books.google.com.mx/books?id=Qd794-DI49wC
[271] Jones R A L 2002 Soft Condensed Matter Oxford Master Series in Physics
(OUP Oxford) ISBN 9780198505891 URL https://books.google.com.mx/books?id=Hl_
HBPUvoNsC
[272] Moradzadeh A and Aluru N R 2019 The Journal of Physical Chemistry Letters 10 7568–
7576 publisher: American Chemical Society URL https://doi.org/10.1021/acs.jpclett.
9b02820
[273] Schoenholz S S, Cubuk E D, Sussman D M, Kaxiras E and Liu A J 2016 Nature Physics
12 469–471 ISSN 1745-2481 number: 5 Publisher: Nature Publishing Group URL https:
//www.nature.com/articles/nphys3644
[274] Kob W and Andersen H C 1994 Phys. Rev. Lett. 73(10) 1376–1379 URL https://link.aps.
org/doi/10.1103/PhysRevLett.73.1376
[275] Liu H, Fu Z, Yang K, Xu X and Bauchy M 2019 Journal of Non-Crystalline Solids
119419 ISSN 0022-3093 URL http://www.sciencedirect.com/science/article/pii/
S0022309319302662
[276] Li L, Yang Y, Zhang D, Ye Z G, Jesse S, Kalinin S V and Vasudevan R K 2018 Science Advances
4 eaap8672 ISSN 2375-2548 publisher: American Association for the Advancement of Science
Section: Research Article URL https://advances.sciencemag.org/content/4/3/eaap8672
[277] Minor E N, Howard S D, Green A A S, Glaser M A, Park C S and Clark N A 2020 Soft
Matter 16 1751–1759 ISSN 1744-6848 publisher: The Royal Society of Chemistry URL
https://pubs.rsc.org/en/content/articlelanding/2020/sm/c9sm01979k
[278] Redmon J and Farhadi A 2016 YOLO9000: better, faster, stronger (Preprint 1612.08242)
[279] DeFever R S, Targonski C, Hall S W, Smith M C and Sarupria S 2019 Chem. Sci. 10(32) 7503–
7515
[280] LeCun Y, Bottou L, Bengio Y and Haffner P 1998 Proceedings of the IEEE 86 2278–2324 ISSN
1558-2256
[281] Xiao H, Rasul K and Vollgraf R 2017 Fashion-MNIST: a novel image dataset for benchmarking
machine learning algorithms (Preprint 1708.07747) URL http://arxiv.org/abs/1708.07747
[282] Wu Z, Ramsundar B, N Feinberg E, Gomes J, Geniesse C, S Pappu A, Leswing K and
Pande V 2018 Chemical Science 9 513–530 publisher: Royal Society of Chemistry URL
https://pubs.rsc.org/en/content/articlelanding/2018/sc/c7sc02664a
[283] Ramakrishnan R, Dral P O, Rupp M and von Lilienfeld O A 2014 Scientific Data 1 140022 ISSN
2052-4463 number: 1 Publisher: Nature Publishing Group URL https://www.nature.com/
articles/sdata201422
[284] Verlet L 1967 Physical Review 159 98–103 publisher: American Physical Society URL https:
//link.aps.org/doi/10.1103/PhysRev.159.98
[285] Zenodo - Research. Shared. URL https://zenodo.org/
[286] Allen C and Mehler D M A 2019 PLOS Biology 17 e3000246 ISSN 1545-7885 publisher: Public
Library of Science URL https://journals.plos.org/plosbiology/article?id=10.1371/
journal.pbio.3000246
[287] Cai J, Luo J, Wang S and Yang S 2018 Neurocomputing 300 70–79 ISSN 0925-2312 URL
http://www.sciencedirect.com/science/article/pii/S0925231218302911
[288] Mehta P, Bukov M, Wang C H, Day A G R, Richardson C, Fisher C K and Schwab D J 2019
Physics Reports 810 1–124 ISSN 0370-1573 URL http://www.sciencedirect.com/science/
article/pii/S0370157319300766
[289] Snoek J, Larochelle H and Adams R P 2012 Practical Bayesian Optimiza-
tion of Machine Learning Algorithms Advances in Neural Information Process-
ing Systems 25 ed Pereira F, Burges C J C, Bottou L and Weinberger K Q
(Curran Associates, Inc.) pp 2951–2959 URL http://papers.nips.cc/paper/
4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf
[290] Wan L, Zeiler M, Zhang S, LeCun Y and Fergus R 2013 Regularization of Neural Networks using
DropConnect International Conference on Machine Learning pp 1058–1066 ISSN 1938-7228
URL http://proceedings.mlr.press/v28/wan13.html
[291] Hutson M 2018 Science 359 725–726 ISSN 0036-8075, 1095-9203 publisher: American Association
for the Advancement of Science Section: In Depth URL https://science.sciencemag.org/content/359/6377/725
[292] Srivastava N, Hinton G, Krizhevsky A, Sutskever I and Salakhutdinov R 2014 The Journal of
Machine Learning Research 15 1929–1958 ISSN 1532-4435
[293] Vanschoren J, van Rijn J N, Bischl B and Torgo L 2013 SIGKDD Explorations 15 49–60 URL
http://doi.acm.org/10.1145/2641190.2641198
[294] Alberti M, Pondenkandath V, Wursch M, Ingold R and Liwicki M 2018 DeepDIVA: A
Highly-Functional Python Framework for Reproducible Experiments 2018 16th International
Conference on Frontiers in Handwriting Recognition (ICFHR) pp 423–428
[295] Forde J Z and Paganini M 2019 arXiv:1904.10922 [cs, stat] URL http://arxiv.org/abs/1904.10922
[296] Allen C and Mehler D M A 2019 PLOS Biology 17 1–14 URL https://doi.org/10.1371/
journal.pbio.3000246
[297] Samuel S, Loffler F and Konig-Ries B 2020 arXiv:2006.12117 [cs, stat] URL http://arxiv.org/abs/2006.12117
[298] Plimpton S 1993 Fast parallel algorithms for short-range molecular dynamics Tech. rep. Sandia
National Labs., Albuquerque, NM (United States) URL https://lammps.sandia.gov/index.
html
[299] Weik F, Weeber R, Szuttor K, Breitsprecher K, de Graaf J, Kuron M, Landsgesell J, Menke H,
Sean D and Holm C 2019 The European Physical Journal Special Topics 227 1789–1816
[300] Hashimoto K, Sugishita S, Tanaka A and Tomiya A 2018 Phys. Rev. D 98(4) 046019 URL
https://link.aps.org/doi/10.1103/PhysRevD.98.046019