Review Article
Machine Learning for Condensed Matter Physics
Edwin A. Bedolla-Montiel1, Luis Carlos Padierna1 and Ramón Castañeda-Priego1
1 División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
E-mail: [email protected]
Abstract. Condensed Matter Physics (CMP) seeks to understand the microscopic
interactions of matter at the quantum and atomistic levels, and describes how these
interactions result in both mesoscopic and macroscopic properties. CMP overlaps
with many other important branches of science, such as Chemistry, Materials Science,
Statistical Physics, and High-Performance Computing. With the advancements
in modern Machine Learning (ML) technology, a keen interest in applying these
algorithms to further CMP research has created a compelling new area of research at
the intersection of both fields. In this review, we aim to explore the main areas within CMP that have successfully applied ML techniques to further research, such as the
description and use of ML schemes for potential energy surfaces, the characterization
of topological phases of matter in lattice systems, the prediction of phase transitions
in off-lattice and atomistic simulations, the interpretation of ML theories with physics-
inspired frameworks and the enhancement of simulation methods with ML algorithms.
We also discuss in detail the main challenges and drawbacks of using ML methods on
CMP problems, as well as some perspectives for future developments.
Keywords: machine learning, condensed matter physics
Submitted to: J. Phys.: Condens. Matter
arXiv:2005.14228v3 [physics.comp-ph] 14 Aug 2020
[Figure 1 is a schematic diagram linking "Condensed Matter Physics + Machine Learning" to the topics covered in this review: Hard Matter (classical and modern ML approaches; topological phases of matter; phase transitions and critical parameters; enhanced simulation methods), Materials Modeling (machine-learned potentials; atomistic feature engineering; machine-learned free energy surfaces; enhanced simulation methods), Physics-inspired Machine Learning Theory (restricted Boltzmann machines; renormalization group; explanation of deep learning theory; interpretation of ML models) and Soft Matter (colloidal systems; experimental setups; phase transitions; structural and dynamical properties).]
Figure 1. Schematic representation of the potential applications of ML to CMP discussed in this review.
1. Introduction
Currently, machine learning (ML) technology has seen widespread use in various aspects
of modern society: automatic language translation, movie recommendations, face
recognition in social media, fraud detection, and more everyday life activities [1] are
all powered by a diverse application of ML methods. Tasks like human-like image
classification performed by machines [2]; the effectiveness of understanding what a
person means when writing a sentence within a context [3]; the ability to distinguish a
face from a car in a picture [4]; and automated medical image analysis [5] are some of
the most outstanding advancements that stem from the mainstream use of modern ML
technology [6]. Inspired by these notable applications, scientists have successfully applied
ML technology to scientific fields, such as matter engineering [7], drug discovery [8] and
protein structure predictions [9].
Influenced by the overwhelming effectiveness of ML technology in other scientific
fields, physicists have also applied such methods to specific subfields within Physics,
with the main objective of making progress in the most challenging research questions.
The motivating reasons for this surge of interest at the intersection between Condensed
Matter Physics (CMP) and ML are vast. For instance, images, language and music
exhibit power-law decaying correlations equivalent to those seen in classical or quantum
many-body systems at their critical points [10]; very large data sets, with high-
dimensional inputs found in materials science [11] are the quintessential problem that
modern ML methods are tailored to deal with. It seems as though ML is well-suited to be applied to CMP research problems given all these deep connections between the two fields, though not without drawbacks. Such is the case of the encoding into a latent space of features that represent the atomistic structure found in soft matter and physical chemistry molecular systems [12], a challenge that is currently being tackled with modern natural language processing and object detection techniques. The challenge of encoding and feature selection, as well as other key challenges of ML applications to CMP, will be discussed in further detail later on.
In this work, we review the contributions of ML methods to Physics with the aim
of providing a starting point for both computer scientists and physicists interested in
getting a better understanding of the interaction between these two research fields. For
easier navigation of the topics covered in this review, the diagram shown in figure 1
displays a schematic representation of the outline, but we shall summarize them briefly
here.
We start the review with a very brief overview of both CMP and ML for the sake of self-containment, to help reduce the gap between both areas. In particular, the overview of CMP is meant to emphasize the difference between its two main branches, i.e. Hard and Soft Matter, because these areas have a fundamental physical difference that will later influence the way ML techniques are applied to their respective types of problems.
We then continue by exploring some of the research inquiries within hard condensed
matter physics. Hard Matter has seen most of the ML applications, for instance in
phase transition detection and critical phenomena for lattice models. We continue with
the review by analyzing how both classical ML and Deep Learning (DL) techniques have
been applied to obtain detailed descriptions of strongly correlated, frustrated and out-
of-equilibrium systems. We investigate how standard simulation methods, like Monte
Carlo, have also seen enhancements from using ML models.
Moving forward, we examine a new research area dubbed physics-inspired ML
theory, an area where the most rigorous and fundamental physics frameworks have
been applied to ML to obtain a deeper insight of the learning mechanisms of ML
models. We discuss how in this engaging research area a robust understanding of
the learning mechanism behind Restricted Boltzmann Machines has been obtained.
We move on to reviewing how common theoretical physics frameworks, such as the
Renormalization Group, have been used with interesting results to give insights into the
learning mechanisms of the most common models in DL.
The review continues with the exploration of Materials modeling, Soft Matter,
and related fields, which recently have seen a strong surge of ML applications, mainly
in the subfields of materials science and colloidal systems, e.g., proteins and complex
molecular structures. Enhanced intelligent modeling, atomistic feature engineering, and ML potentials and free energy surfaces have been the primary topics of research, with the
identification of a phase transition as well as the determination of the phases of matter
being a close second.
Since one has to be aware that every technique has some limitations, the review is closed with some of the main challenges and drawbacks of ML applications to CMP, in addition to some perspectives and outlooks on current challenges.
Before we continue with the review, we would like to address that many other
notable applications within realated fields of CMP have been omitted due to lack of
depth of knowledge in such fields and space in this short review. Such applications
include graph-based ML force fields [13] and the deep learning architecture SchNet used
to model atomistic systems [14]; deep learning methodologies used to predict the ground-
state energy of an electron and effectively solving the Schrodinger equation for various
potentials [15]; the use of ML techniques for many-body quantum states [16, 17]; the
use of crystal graph convolutional neural networks to directly learn material properties
from the connection of atoms in the crystal [18]; the utilization of ML supervised and
unsupervised techniques to approximate density functionals, as well as providing some
insights into how these techniques might achieve such approximations [19, 20]. Lastly, for the reader interested in applications of ML to a wide range of areas within Physics, we would like to suggest the recent review by Carleo et al [21], which discusses in detail some of the most relevant works and research areas.
2. Overview of Machine Learning
In this section, common ML concepts, including those used in this review, are described,
following a specific hierarchy composed of category, problem, model, and method or
architecture. Figure 2 illustrates the relationship between these concepts. The interested
reader can get further information from this partial taxonomy by following the references
provided in the figure caption.
Machine Learning is a research field within Artificial Intelligence aimed at constructing
programs that use example data or past experience to solve a given problem [22]. The
ML algorithms can be divided into categories of supervised, unsupervised, reinforcement
and deep learning [47, 22, 6]: in supervised learning the aim is to learn a mapping from
the input data to an output whose correct values are provided by a supervisor; in
unsupervised learning, there is no such supervisor and the goal is to find the regularities
or useful properties of the structure in the input; in reinforcement learning, the learner
is an agent that takes actions in an environment and receives reward or penalty for
its actions in trying to solve a problem, the output is a sequence of actions, and the
main purpose is to learn from past good sequences to generate a policy that maximizes
a reward; deep learning is a special kind of ML, designed to deal with high-dimensional
environments. DL may be regarded as a transverse category, since it has been used to
overcome the limitations of traditional algorithms of supervised [48], unsupervised [41], and
reinforcement learning [31]. DL is focused on learning nested concepts, e.g., the image of
a person, in terms of simpler concepts, such as the corners, contours or textures within
[Figure 2 is a schematic diagram organizing ML concepts into four levels. Categories: supervised, unsupervised, reinforcement and deep learning. Problems: classification, regression, data imputation, etc.; clustering, novelty detection, ranking, etc.; path tracking, optimal control, games, etc. Models: kernel methods and neural networks. Methods & architectures: SVC, SVR, LS-SVM, SV Clustering, Kernel PCA, SV Ranking, KLSPI, SVM-A2C, KBR, DKNs, DEK, TWSVC, etc.; MLP, LSTM, RBF Networks, AE, RBM, GAN, ART, RAN, CNN, Deep Q-Network, DBM, etc.]
Figure 2. Partial taxonomy of Machine Learning methods and architectures. (i)
A detailed description of other models not presented in the figure, such as graphical
or Bayesian models can be consulted in [22]. (ii) For further information regarding
the ML categories please confer to [6] and [23]. (iii) Examples of methods for
classification, regression and data imputation are discussed in [24], [25], and [26],
respectively. (iv) Examples of methods for clustering, novelty detection and ranking
can be found in [27], [28], and [29], respectively. (v) Examples of methods for path
tracking, optimal control and games are introduced in [30], [31], [32], respectively. (vi)
For further information regarding kernel methods and neural networks models please
confer to [33] and [34], respectively. (vii) KLSPI - Kernel-based least squares policy
iteration [35]. (viii) SVM-A2C - Advantage Actor-Critic with Support Vector Machine
Classification [36]. (ix) KBR - Kernel-based relational reinforcement Learning [37].
(x) DKN - Deep Kernel Networks [38]. (xi) DEK - Deep Embedding Kernel [39].
(xii) TWSVC - Twin Support Vector Clustering [40]. (xiii) Some architectures have
been used in more than one category, for instance, variants of Convolutional Neural
Networks (CNN) were employed both in supervised and unsupervised problems [41],
and Multilayer Perceptrons (MLP) have been used for supervised and reinforcement
learning [33, 42]. (xiv) LSTM - Long short-term memories and RBF - Radial
Basis Function Networks [34]. (xv) AE - Autoencoders and RBM - Restricted
Boltzmann Machines [23]. (xvi) GAN - Generative Adversarial Networks [43]. (xvii)
ART - Adaptive Resonance Theory Networks [44] and RAN - Resource Allocation
Networks [45]. (xviii) Deep Q-Network [32] and DBM - Deep Boltzmann Machines [46].
the image [6]. The DL approach enhances previous ML models such as neural networks
and kernel methods in several aspects, for instance by extending the capability of dealing
with large-scale data, and providing new architectures, operators, and learning methods
for the solution of problems with higher precision [32, 49]. As a result, DL methods are able to deal with huge amounts of heterogeneous information and to learn directly from raw data, without the user-dependent preprocessing step (feature extraction) required by classical ML methods, in which the main features of the raw data are filtered and supplied as valid training vectors.
Although Figure 2 presents several ML concepts, there exist other important models
like Decision Trees, Gaussian Processes, and methods such as boosting, LASSO, ridge
regression, k-means, etc. [50, 51, 52], that are not included in this brief overview in
order to focus on those concepts that frequently appear in the rest of the manuscript.
Specifically, we will restrict ourselves to the description of Neural Networks (feed-forward,
autoencoders, restricted Boltzmann machines and convolutional) and kernel methods
(support vector machines). These methods and architectures are now described, with
the exception of restricted Boltzmann machines that, because of their relevance, are
presented in subsequent sections. Useful references are included for readers interested
in obtaining a deeper insight.
Artificial Neural Networks (ANNs) are models of ML easily found in science
and engineering applications. ANNs attempt to simulate the decision process of
neurons of biological central nervous systems [53]. Many ANNs are computational
graphs [54, 34, 55]. In these graphs, the nodes compute activation functions (sigmoidal,
radial basis, and any other Tauber-Wiener function [56]) according to the input
data provided by the connecting edges. The learning process consists of assigning weights to the edges and updating them according to a learning algorithm, such as backpropagation, Levenberg-Marquardt, Stochastic Gradient Descent, or Natural Gradient Descent [57, 58, 59]. The tasks that ANNs can solve depend on the architecture
that specifies how data is processed and stored. The storage of information can be
either by using an external memory [60, 61] or as an implicit memory codified into the
weights of the network, the latter being the most common approach. The number of existing ANN architectures is vast, but a project that tries to track them can be consulted in Ref. [62].
Multilayer feed-forward Neural Networks (FFNNs) are one of the simplest ANN
architectures, where neurons are organized in layers [63]. The default version of FFNNs
assumes that all neurons in one layer are connected to those of the next layer. Neurons
of the input layer receive the data to be processed and then pass it to the next layer; successive layers feed into one another in the forward direction from input to
output. The key point is that the transfer of information between layers is controlled
by nonlinear activation functions evaluated by the neurons. A neat illustration of this architecture and further details about its variants can be consulted in section 1.2.2 of [34]. Two examples of these networks are the Multilayer Perceptrons and Radial
Basis Function Networks. When the learning process in FFNNs is supervised, in addition to the training data, X ⊂ ℝ^d, provided to the input layer, expected output data, Y ⊂ ℝ, labeling each input vector is required to verify the performance of the network. A learning algorithm is then employed based on the network output until the best approximation to an underlying decision function, f : X → Y, is achieved. FFNNs have been proven to be universal approximators of any Borel measurable function [64]. In addition to function approximation, supervised FFNNs may be
used for classification, prediction, and control tasks [65]. FFNNs have also been applied
to reinforcement learning problems [42, 32, 66].
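As a concrete illustration of this layered picture, the following minimal sketch (in Python/NumPy, with arbitrary layer sizes and randomly initialized, untrained weights; our own construction, not taken from any of the cited works) implements the forward pass of a two-layer FFNN:

import numpy as np

def sigmoid(z):
    # Sigmoidal activation, one of the nonlinear functions mentioned above.
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Each layer computes an affine map (the weighted edges of the
    # computational graph) followed by a nonlinear activation.
    h = sigmoid(W1 @ x + b1)   # hidden layer
    y = sigmoid(W2 @ h + b2)   # output layer
    return y

rng = np.random.default_rng(0)
d, n_hidden, n_out = 4, 8, 1            # illustrative layer sizes
W1, b1 = rng.normal(size=(n_hidden, d)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)
print(forward(rng.normal(size=d), W1, b1, W2, b2))

In supervised training, the weights W1, b1, W2, b2 would be updated, e.g., by backpropagation, so that the output approximates the labels Y.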
Autoencoders (AEs) constitute a different type of NN architecture, considered a
special case of FFNNs, and designed to learn a latent representation of the training
data in order to perform data compression or dimensionality reduction [67, 68]. AEs
consist of two parts: an encoder function—g(x),x ∈ X—, and a decoder that produces
a reconstruction, r = f(g(x)). The main purpose of AEs is to generate an efficient
encoding of the input data instead of learning to copy this data perfectly. Despite the success of ANNs in solving a vast number of applications in both science and engineering [69, 70], this model has been criticized for not providing explicit representations of the learned solutions [71, 72, 73]. Since ANNs are considered black-box models, it becomes difficult to interpret their results, which is especially important in applications such as medicine, business or self-driving cars, where the reliability of the model must be guaranteed [73]. However, new techniques have been developed to
increase the interpretation capabilities of NNs in order to extract new insights from
complex physical, chemical and biological systems [74].
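To make the encoder/decoder structure explicit, here is a minimal sketch of the composition r = f(g(x)); the weights below are random placeholders (an untrained model, purely illustrative) for what a learning algorithm would fit:

import numpy as np

rng = np.random.default_rng(1)
d, k = 16, 2                      # input and latent dimensions (illustrative)
W_enc = rng.normal(size=(k, d))   # encoder weights (would be learned)
W_dec = rng.normal(size=(d, k))   # decoder weights (would be learned)

def g(x):
    # Encoder: compress x into a k-dimensional latent code.
    return np.tanh(W_enc @ x)

def f(h):
    # Decoder: reconstruct the input from the latent code.
    return W_dec @ h

x = rng.normal(size=d)
r = f(g(x))                       # reconstruction r = f(g(x))
# Training would minimize the reconstruction error ||x - r||^2 over many
# samples, e.g. by stochastic gradient descent.
print(np.linalg.norm(x - r))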
Convolutional Neural Networks (CNNs) are DL architectures specialized for
processing data with a grid-like topology (time-series as one-dimensional data or images
as two-dimensional grids of pixels) that use convolution in place of general matrix
multiplication in at least one of their layers. The simplest CNN includes an input
layer that receives the raw data; next, a convolutional layer filters local regions from the input layer's output. The use of convolution brings many benefits: it provides a means of working with inputs of variable size; it has the property of translation invariance; it is more efficient in terms of memory than dense matrix multiplication because of its parameter-sharing property; and feature extraction is performed without human interference. The convolution is performed on two elements, the input data and a
kernel whose values are assigned automatically by the learning algorithm, instead of
being predefined by the user. After convolution, a pooling layer is then used to down-
sample the spatial dimensions; lastly, a fully connected layer computes the class scores,
and an output layer provides the final result [6].
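The convolution-plus-pooling pipeline just described can be sketched in a few lines of NumPy; the kernel values below are random placeholders for what the learning algorithm would assign, and the 8x8 input is an arbitrary stand-in:

import numpy as np

def conv2d(image, kernel):
    # 'Valid' 2D cross-correlation: the same small kernel is slid over the
    # whole image (parameter sharing and translation equivariance).
    H, W = image.shape
    h, w = kernel.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def max_pool(x, size=2):
    # Pooling layer: down-sample the spatial dimensions via local maxima.
    H, W = x.shape
    H, W = H - H % size, W - W % size
    return x[:H, :W].reshape(H // size, size, W // size, size).max(axis=(1, 3))

rng = np.random.default_rng(2)
image = rng.normal(size=(8, 8))       # e.g. a small two-dimensional grid
kernel = rng.normal(size=(3, 3))      # values assigned by the learning algorithm
features = max_pool(np.maximum(conv2d(image, kernel), 0.0))  # ReLU + pooling
print(features.shape)

A fully connected layer acting on the flattened feature map would then compute the class scores.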
Support Vector Machines (SVMs) are explicit ML models that belong to the family
of kernel methods and emerged as an alternative to ANNs [24]. Both ANNs and
SVMs have the ability to generalize and may be designed to solve practically the same
tasks (classification, regression, density estimations, clustering, among others) [75, 76].
However, while ANNs were inspired by the biological analogy to the brain, SVMs were
inspired by statistical learning theory [77]. Because of their solid theoretical basis, SVMs can provide explicit solutions with measurable generalization capability by conducting structural risk minimization; by solving a convex quadratic programming problem, the dreaded problem of local minima that permeates ANNs is avoided; and finally, the model can represent the learned solution based on few parameters [78, 79]. However, SVMs are limited in the size of the datasets they can handle, since training a standard SVM has a time complexity between O(N²) and O(N³), with N the number of input vectors; furthermore, the associated Gram matrix of size N × N must be allocated [80, 81]. This limitation applies to general-purpose SVM solvers like LIBSVM [82]. Different approaches have been followed to deal with large-scale datasets. For instance, if the kernel function can be limited to the linear one (as in high-dimensional sparse spaces),
then efficient solvers like LIBLINEAR [83] or PEGASOS [84] can be implemented to
solve problems with hundreds of thousands of input vectors in seconds. Other strategies
like sampling, boosting or hierarchical training have been used to speed up the SVM
training with non-linear kernels [81, 85]. For Deep Kernel Learning methods, where a
concatenation of several kernel functions can be reformulated as neural networks [86],
some strategies focus on designing maps in the Reproducing Kernel Hilbert Space to reduce the complexity of evaluating Deep Kernel Networks, which scales quadratically in N and linearly in the depth of the trained networks [38]. Based on these limitations
and proposed solutions, an application-based analysis should be done to choose the
correct set of SVM algorithms or related kernel methods.
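As a hedged illustration of this choice, the sketch below (scikit-learn on synthetic data; names and sizes are arbitrary) contrasts a non-linear RBF SVM, whose training scales roughly between O(N²) and O(N³), with a linear solver backed by LIBLINEAR, which is better suited to large-scale problems:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Non-linear (RBF) kernel: suited to moderately sized data sets.
clf_rbf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

# Linear kernel via an efficient solver (LIBLINEAR), appropriate for
# large-scale or high-dimensional sparse problems.
clf_lin = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

print(clf_rbf.n_support_.sum(), "support vectors (RBF)")
print(clf_lin.score(X, y), "training accuracy (linear)")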
3. Overview of Condensed Matter Physics
Understanding the intricate phenomena of all types of matter and their properties is
the driving force of condensed matter physicists. From quantum theory all the way to
materials science, CMP encompasses a large diversity of subfields within physics, as it
deals with different time and length scales depending on both the molecular details and
the type of matter being analyzed. To study such systems, theoretical, computational
and experimental physicists collaborate to gain a deeper insight into the behavior of matter in and out of equilibrium. We shall briefly talk about the two main areas within CMP, namely, hard matter and soft matter, as well as the essential models and the central problems these research areas focus on. More information about these research areas can be found in standard textbooks, see, e.g., [87, 88, 89], and reviews [90, 91].
3.1. Soft Condensed Matter
Fluids, liquid crystals, polymers, colloids, and many more are the principal types of
matter that are studied by Soft Condensed Matter, or Soft Matter (SM). SM has the peculiar property of responding easily to external perturbations: a small shear force or thermal fluctuations can change its flow and, with it, its properties as a whole. We are surrounded by these materials in our everyday life, such as glass windows, toothpaste and hair-styling gel. These
materials can show diverse properties, and when some external force is applied they will exhibit a different set of properties; in all cases, SM shows the fascinating property of self-organization of its molecules [92]. Proteins and bio-molecules are also good examples of SM: they change their structure when subjected to thermal fluctuations, while also showing a pattern of self-assembly [93, 94], as is the case in most types of SM. All in
all, SM is a subfield of CMP that gained a lot of attention when first introduced in the
1970s by Nobel prize laureate Pierre-Gilles de Gennes [95], because it has proven to be of
great importance to science in general, with very useful applications, as well as pushing
the boundaries in theoretical, experimental and computational Physics.
In order to explore all these properties and changes in a SM system, a model
that describes most of these types of matter is needed. Most fluids and colloidal
dispersions show a repulsive behavior, which can be modeled very well with the hard-
sphere model [96]. A system modeled as a hard-sphere dispersion shows an infinite repulsive interaction when the interparticle separation is less than the particle diameter, and no interaction otherwise. There is no attractive interaction included in this model. The hard-sphere model is a very simple one, yet it exhibits intriguing and unexpected equilibrium and non-equilibrium states [97, 98, 99]. There are some
interesting properties that define the hard-sphere model in SM. First of all, the internal
energy of any allowed system configuration is always zero. Furthermore, forces between
particles and variations in the free energy—defined from classical thermodynamics
as F = E − TS, with E being the internal energy, T the temperature and S the
entropy—are both determined entirely by the entropy. Moreover, the entropy in a
hard-sphere system depends directly on the total volume occupied by the hard spheres,
commonly defined as the volume fraction, φ. When φ is small, the system shows
the properties of an ideal gas, but as φ starts to increase, particles interact with each
other and their motion is restricted by collisions with other nearby particles. Even more
interesting is the fact that the phase transitions of hard-sphere systems are completely
defined by φ [100]. In general, a SM system does not need to be composed only of hard spheres; it can also be built with non-spherical molecules, for example, rods, spherocylinders (cylinders with hemispherical caps) and hard disks, just to mention a few examples. Nevertheless, all these systems experience a large diversity of phase
transitions [101, 102, 103].
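The hard-sphere rule is simple enough to state in code. The following sketch (our own construction, assuming a cubic periodic box and the minimum-image convention) computes the volume fraction φ and checks whether a configuration is allowed, i.e., overlap-free:

import numpy as np

def volume_fraction(n_particles, diameter, box_length):
    # phi = N * (pi/6) * sigma^3 / V for monodisperse hard spheres.
    v_sphere = np.pi * diameter**3 / 6.0
    return n_particles * v_sphere / box_length**3

def is_valid_configuration(positions, diameter, box_length):
    # Hard-sphere rule: a forbidden (infinite-energy) state if any pair is
    # closer than one diameter; zero interaction otherwise.
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            d = positions[i] - positions[j]
            d -= box_length * np.round(d / box_length)  # minimum image
            if np.dot(d, d) < diameter**2:
                return False
    return True

rng = np.random.default_rng(3)
pos = rng.uniform(0.0, 10.0, size=(50, 3))
print(volume_fraction(50, 1.0, 10.0))          # dilute, gas-like regime
print(is_valid_configuration(pos, 1.0, 10.0))  # random placements may overlap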
The description of phase transitions and critical phenomena is one of the main challenges in all of CMP. When a thermodynamic system changes its uniform physical properties to another set of properties, the system encounters a phase transition.
In SM, phase transitions are one of the most important phenomena that can be analyzed
given the diversity of phases that could appear in unexpected circumstances [104, 105].
In particular, a gas-liquid phase transition is defined by a temperature, known as the critical temperature, Tc, and a density, called the critical density, ρc. These quantities are also known as the critical point of the system, meaning that two phases can coexist when the system is held at these special values. When neither of these thermodynamical quantities can be employed, an order parameter can be used to determine the phase transitions of a system. Order parameters are special quantities that measure the degree of order across a phase transition; they take different values in the different phases of the system [106]. For instance, when studying
the liquid-gas transition in a fluid, a specified order parameter can take the value of
zero in the gaseous phase, but it will take a non-zero value in the liquid state. In this
scenario, the density difference between both phases (ρl − ρg)—where ρl is the density
in the liquid state and ρg in the gas state—can be a useful order parameter.
When traversing the phase diagram of a SM system, one might encounter the
so-called topological defects. A topological defect in ordered systems happens when
the order parameter space is continuous everywhere except at isolated points, lines or
surfaces [107]. This can be the case of a system which is stuck in-between two phases,
and there is no continuous distortion or movement from the particles that can return the
system to an ordered phase. One such example is that of nematic liquid crystals [108].
The nematic phase is observed in a hard-rod system, when all the rods align in a
certain direction. Hard rods are allowed to move and to rotate freely. Furthermore, the
neighboring hard rods tend to align in the same direction. But if it so happens that
some hard rods are aligned in one direction and the rest in another direction, then the
system contains topological defects [109]. These defects depend on the symmetry of the
order parameter of the phases, as well as the topological properties of the space in which
the transformation is happening. Topological defects are of the utmost importance in
a large variety of systems in SM, e.g., they determine the strength of the liquid crystal
mentioned before.
Nevertheless, these types of systems and problems are not so easily solved. How
can we tell when a system will be prone to topological defects? Is it possible to
determine the phases that a system will come across given its configuration? Even
more challenging questions keep arising when these types of exotic matter appear, and theoretical, experimental or even computational frameworks alone are not always enough to tackle them. Physicists are always open to new ways to approach such challenges, and ML is slowly becoming a powerful tool to deal with these complex systems; in this review we shall discuss some of the ways ML has been applied to such problems, as well as comment on some of the issues that can arise from using such techniques.
However, it is important to note that SM also experiences thermodynamic
transitions to solid-like phases. For example, in the case of the hard-sphere model,
when the packing fraction φ approaches the value φ ≈ 0.492 [110, 111], it undergoes a first-order liquid-solid transition [112, 113]. This thermodynamic behavior can also be studied using theoretical and experimental techniques employed when one deals with Hard Matter. Thus, at some point, SM also lies at the boundary with Hard Matter.
3.2. Hard Condensed Matter
The properties of materials that show structural rigidity, such as solids, metals, insulators or semiconductors, are the main focus of Hard Condensed Matter, or Hard Matter (HM). Furthermore, in contrast to SM systems, HM typically deals with systems on smaller scales, i.e., matter that is governed by atomic and quantum-mechanical interactions. Entropy no longer determines the free energy of a HM system; the internal energy now does.
One particular example of interest in HM is the phenomenon of superconductivity. A
material is considered a type I superconductor if all electrical resistance is absent and
it is a perfect diamagnet [114] —i.e., all magnetic flux fields are excluded from the
system—; a type II superconductor is not completely diamagnetic because it does not exhibit a complete Meissner effect [115]. The phenomenon of superconductivity is
described by both quantum and statistical mechanical interactions. This means that
superconductivity can be seen as a macroscopic manifestation of the laws of quantum
mechanics. As was the case with SM, we are also surrounded by HM materials, for
instance, semiconductors are the principal components in microelectronics [116], which
in turn are the components of every digital device we use, ranging from computers to
smart cellphones. The applications that stem from HM are too great to enumerate in
this short review, and the theoretical, experimental and computational frameworks that
have derived from HM are nothing short of revolutionary; we refer the reader interested in these applications to a technical perspective article [117], and to a more thorough book on some of the modern aspects of HM, as well
as its applications to technology [118].
In HM, we also need a reference model that can help explain at the basic level
some of the investigated materials. Many other materials can use a similar reference
model, but with different assumptions and physical quantities. Such is the case of lattice
systems, and, in particular, the Ising model [119]. The Ising model is the simplest
model that can explain ferromagnetic properties, in addition to accounting for quantum
interactions. It is defined on a lattice, much as if it were a crystalline solid. On each site of the lattice there is a discrete variable σk that takes one of two values, σk ∈ {−1, +1}. The energy of a configuration is given by the following Hamiltonian

H(σ) = −∑〈i,j〉 Jij σi σj − µ ∑j hj σj, (1)
where Jij is a pairwise interaction between lattice sites; the notation 〈i, j〉 indicates
that i and j are lattice site neighbors; µ is the magnetic moment and hj is an external
magnetic field. As originally solved by Ising in 1925 [120], when the system lies on a one-dimensional lattice it shows no phase transition. Later, Lars Onsager proved in 1944 [121] that when the Ising model lies on a two-dimensional lattice a continuous phase
transition is observed. For dimensions larger than two, different theoretical [122, 123]
and computational [124] frameworks have been proposed.
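To make Eq. (1) concrete, the following minimal sketch (our own construction, assuming uniform nearest-neighbor coupling J, uniform field h, and periodic boundaries on a square lattice) evaluates the energy of a configuration and performs one single-spin-flip Metropolis update of the kind used in the Monte Carlo simulations discussed later:

import numpy as np

def ising_energy(spins, J=1.0, h=0.0, mu=1.0):
    # Eq. (1) for a square lattice with uniform J and h and periodic
    # boundaries; each bond is counted once via right and down neighbors.
    interaction = (np.sum(spins * np.roll(spins, 1, axis=0))
                   + np.sum(spins * np.roll(spins, 1, axis=1)))
    return -J * interaction - mu * h * np.sum(spins)

def metropolis_step(spins, beta, J=1.0):
    # One single-spin-flip Metropolis update at inverse temperature beta.
    L = spins.shape[0]
    i, j = np.random.randint(L, size=2)
    neighbors = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                 + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
    dE = 2.0 * J * spins[i, j] * neighbors   # energy cost of flipping (i, j)
    if dE <= 0 or np.random.random() < np.exp(-beta * dE):
        spins[i, j] *= -1
    return spins

spins = np.random.choice([-1, 1], size=(16, 16))
print(ising_energy(spins))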
While the Ising model has seen some very useful applications, it is not the only
model that has shaped the research interests within CMP, as it cannot possibly contain and explain all phenomena seen in HM. Generalized models, such as the n-vector model [125] or the Potts model [126], are extensively studied models in HM that have been successfully applied to explain even more intricate phenomena [127, 128].
In general, all these models show phase transitions, in the same way as SM systems.
This research area grew even more when the Berezinskii-Kosterlitz-Thouless (BKT)
phase transition was discovered [129, 130, 131, 132]. The BKT transition was unveiled
on two-dimensional lattice systems, such as the XY model—which is a special case of
the n-vector model—. The BKT transition consists of a phase transition that is driven
by topological excitations and does not break any symmetries in the system [133]. This
type of phase transition was quite abstract in the 1970s when first introduced, but it turned out to be a great theoretical framework, and the transition was later observed in exotic materials [134, 135], which in turn showed puzzling properties. Such was the impact of the BKT transition in CMP that its discovery was recognized with the Nobel Prize in Physics in 2016 [136].
However, the BKT transition is a complicated transition to study. Hard condensed
matter physicists are always trying to find a better way to look for these kinds of unique aspects in materials. It is an elusive and complicated challenge. In the same spirit as in SM, ML has already been put to the test on this task: if it is possible to re-formulate the problem at hand as a ML problem, it can then be solved by standard ML techniques. But, as mentioned before, the task of re-formulation and feature selection is not simple; there is currently no standard way to approach it in CMP applications, contrary to most ML models and applications. We shall examine these applications and their potential challenges as well as drawbacks in more detail in the following sections.
4. Hard Matter and Machine Learning
In this section, we shall review the main techniques and ML models applied to CMP,
mainly to HM. These applications might constitute a paradigm of ML applications to CMP due to the generally good results obtained. The structure of HM data is similar to the one found in computer vision applications in ML, among other applications. Furthermore, these schemes have also exposed most of the major shortcomings of such approaches, such as feature selection and the training and tuning of ML models.
4.1. Phase transitions and critical parameters
One of the most prominent uses of NNs was on the two-dimensional ferromagnetic Ising model, which has a Hamiltonian similar to Eq. (1), with the aim of identifying the critical temperature, Tc, at which the system leaves an ordered phase—the ferromagnetic
phase—and enters a disordered phase—the paramagnetic phase. Carrasquilla and
Melko [137] introduced a new paradigm by essentially formulating the problem of
phase transitions as a supervised classification problem, where the ferromagnetic phase
constitutes one class and the paramagnetic phase is another one, then a ML algorithm
is used to discriminate between both of them under specific conditions. By sampling
configurations using standard Monte Carlo simulations for different system sizes, and
then feeding this raw data to a FFNN, an estimate of the correlation-length critical
exponent ν is obtained directly from the output of the last layer of the trained NN.
Finite-size scaling theory [138] is then employed to obtain an estimate of Tc. Even if the geometry of the lattice is modified, the power of abstraction provided by the NN [34] is enough to predict the critical parameters of a geometry it was never trained on; this makes it a very convenient tool to identify critical parameters for systems that do not have a defined order parameter [139].
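The supervised formulation itself is compact enough to sketch. In the example below, crude toy configurations (aligned spins with random flips) stand in for proper Monte Carlo samples of the two phases; this is a schematic of the workflow, not the setup of [137]:

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
L, n_per_class = 8, 200

def toy_configs(flip_prob):
    # Stand-in for Monte Carlo sampled configurations: start from the fully
    # aligned state and flip each spin independently with probability
    # flip_prob (a crude proxy, not real Ising sampling).
    base = np.ones((n_per_class, L * L))
    flips = rng.random(base.shape) < flip_prob
    return np.where(flips, -base, base)

X = np.vstack([toy_configs(0.05),    # "ferromagnetic-like" class
               toy_configs(0.5)])    # "paramagnetic-like" class
y = np.array([0] * n_per_class + [1] * n_per_class)

# Raw spin configurations are fed directly to a small FFNN classifier,
# following the spirit of the supervised formulation described above.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
print(clf.score(X, y))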
4.2. Topological phases of matter
Having shown that ML is able to identify phases of matter in simple models, such as those with a Hamiltonian similar to Eq. (1), researchers have become interested in trying out these algorithms on more complex systems, and a lot of work has been put into the topic of identifying topological phases of matter using ML algorithms
for strongly correlated topological systems [140, 141, 142, 143, 144, 145, 146, 147, 148,
149, 150]. For instance, Zhang et al [151] have applied FFNNs as well as CNNs to
find the BKT phase transition, as well as constructing a complete phase diagram of the
XY and generalized XY models, with very promising results. In the case of the FFNN
approach, Zhang et al found out that a simple NN architecture consisting of one input
layer, one hidden layer with four neurons and one output layer is more than enough to
obtain the critical exponent ν for the XY model. We can compare this to the results
obtained by Kim and Kim [152], where they explain a similar behavior but for the Ising
model. In their work, analyzing a tractable model of a NN, they found out that a
NN architecture consisting of one input layer, one hidden layer with two neurons and
one output layer is sufficient to obtain the critical exponent ν for the Ising model. If a shallow architecture suffices to obtain good approximations for the critical exponent ν then, perhaps, such models can be further simplified into more robust and analytically tractable ones. Such is the case of the Support Vector Machine methodology developed by Giannetti et al [153].
It is important to keep track of the model complexity, especially for complicated models, to avoid overfitting and to reduce the amount of data needed to train the model. In the case of the NN methodologies presented before, we can argue that such models are easier to train if there are enough data samples in the data set; when this is not the case, overfitting will occur, yielding low-accuracy results. NNs are also faster to train because these models can be implemented on accelerated computing platforms, such as Graphical Processing Units (GPUs). On the other hand, SVMs cannot be readily accelerated with such techniques, but they are less prone to overfitting: when the support vectors—the subset of the data set needed to construct the decision function and thus make a prediction—are fewer than the total number of samples, the model complexity is low enough that a smaller data set might be enough to obtain desirable approximations. Still, SVMs need to be adjusted according to the task,
so hyperparameter tuning must always be part of the data processing pipeline, which in
turn might take longer to obtain results. If this is the case, then simpler methods—such
as logistic regression or kernel ridge regression [154]—could potentially be better suited
for the task of approximating critical exponents due to the fact that these models are
simple to implement, need fewer data samples and most implementations are numerically
stable and robust; at the very least these simple models could potentially provide a
starting point for such approximations.
New and interesting methodologies are always arising in order to solve the problem
of phase transitions. Shiina et al [155] developed a generalized version of the
methodology proposed by Carrasquilla and Melko [137] for a variety of models, namely,
the Ising model, the q-state Potts model [156] and the q-state clock model. Instead of
using the raw data obtained from Monte Carlo simulations as proposed in [137], Shiina et al considered the configuration of a specific long-range spatial correlation function
for each model, effectively exploiting feature engineering, hence providing meaningful
data to the NN, which now contains relevant information from the systems at hand.
This strategy has the advantage of generalization for the already trained NN, thus it
can be readily applied to similar systems without further manipulation of the physical
model or the production of new simulation data. The strategy is also quite general in the sense that it can be applied to multi-component systems, with similar precision to standard theoretical methods.
Finally, we remark on some of the advances in topological electronic phases. The most
fundamental quantity to characterize these states is the so called topological invariant,
whose value determines the topological class of the system. In particular, interfaces
between systems with different topological invariants show topologically protected
excitations, resilient towards perturbations respecting the symmetry class of the system.
Mapping topological invariants using NNs, as performed in [157], shows that supervised
learning is an efficient methodology to characterize the local topology of a system. DL
has also seen some useful applications to these types of systems, e.g., for random quantum
systems [158, 159, 160].
4.3. Classical and modern ML approaches
The success that ML has seen in the area of Lattice Field Theory and Spin Systems has given it a special place in the toolbox of physicists, providing a different perspective on a given problem based on different ML models, each with its own advantages and drawbacks. It is especially important to identify the tools that have seen the most use in this topic, the main one being FFNNs, whose generalization and abstraction capabilities have made them the workhorse of most of the applications in this field—particularly when the problem can be formulated as a supervised learning problem.
For more classical approaches, SVMs have also been implemented on the phase
transition problem [148, 153] with promising results given their ease of use and physical
interpretation of the results, which we shall discuss in detail in section 5.4. It is useful
to compare the SVM method against the NN one presented in the previous subsection, especially given the fact that both techniques can yield very similar results; equivalently, we can ask ourselves: when should we use one technique or the other? As mentioned in the previous subsection, SVMs are a great tool when the number of samples in the data set is low; this is because SVMs only need a small subset, i.e., the support vectors, to effectively predict some value. The main disadvantage of SVMs is their training time complexity, which depends entirely on the implementation used and the data set. For example, the most popular implementation of a SVM solver is LIBSVM [82], which scales between O(M×N²) and O(M×N³), where M is the number of features and N is the number of samples in the data set. SVMs depend strongly on the hyperparameters used, so an incorrectly tuned SVM could take O(N³) time to train in the worst-case scenario, while also yielding poor results. On the other hand, if the user is careful in tuning and selecting a good kernel, as presented in [153], one can solve a CMP problem with very high precision; in this case, some Physics insight into the problem can be helpful in providing good parameters for the SVM.
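In practice, this tuning step is commonly automated with a cross-validated grid search. The sketch below (scikit-learn on synthetic data, with an arbitrary parameter grid) illustrates the idea; physical insight can be used to narrow the ranges of C and gamma further:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# An incorrectly tuned SVM can be both slow and inaccurate, so a grid
# search with cross-validation over C and gamma is a sensible baseline.
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)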
If we now turn our attention to NNs and their training time complexity, it normally scales as O(N³), provided the NN uses an efficient linear algebra and matrix multiplication implementation. If we also account for the backpropagation operations, which scale linearly with the number of layers in the network, we can see that NNs are in principle significantly more difficult to train than SVMs. Research is being carried out to understand why NNs are hard to train in theory and yet efficiently trained in practice [161, 162]. Part of the reason is that modern numerical linear algebra implementations are efficient and can be accelerated with modern computing technologies, such as GPUs. This makes NNs a great tool when the number of data samples is large, while training can also happen online, i.e., feeding the network with newly created samples. In short, both techniques give similar results provided there are enough representative data samples from the problem. A simple criterion for choosing between both methodologies depends greatly on the data set and the problem, but starting off with SVMs can give a good approximation; in the case of needing even more precision, or if the data set contains a large number of features, NNs might be a better choice for the problem at hand.
Many other classical techniques—specifically the ones that stem from unsupervised learning methods—have become prominent algorithms. For instance, one such technique is the nonlinear dimensionality reduction algorithm t-distributed Stochastic Neighbor Embedding (t-SNE) [163], which Zhang et al [151] used to map high-dimensional features onto lower-dimensional spaces in order to identify three distinct phases in the site percolation model. In this methodology, the idea is to take different site configurations, each handled as a single array of values from the lattice, and to use t-SNE to map this high-dimensional space into a two-dimensional one, such that when enough iterations are performed three distinct clusters appear: one for each ordered phase (the percolating and non-percolating phases), and one for the transition region between them. Although t-SNE is mostly a visualization tool, by employing other unsupervised learning techniques, such as k-means [154] or spectral clustering [164], together with the results obtained from t-SNE, one could possibly obtain a good approximation of the transition point in the percolation model.
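The combination suggested above, t-SNE followed by a clustering step, can be sketched as follows; the random arrays below merely stand in for percolation configurations (occupied = 1, empty = 0) at three occupation probabilities and are not real simulation data:

import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Stand-in for site-percolation configurations flattened to 1D arrays,
# generated at three different occupation probabilities.
configs = np.vstack([rng.random((100, 256)) < p for p in (0.3, 0.59, 0.8)])

# Map the high-dimensional configurations to two dimensions with t-SNE,
# then look for clusters; here k-means plays the role of the follow-up
# unsupervised step suggested in the text.
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(configs.astype(float))
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
print(np.bincount(labels))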
Another prime example is Principal Component Analysis (PCA) [165, 154], which has proved to be one of the best feature extraction algorithms that can be used in Spin Systems [166, 167], as it can be readily applied to raw configurations to extract information from the system and build different features, which can then be compared to physical observables like structure factors and order parameters. For instance, Wei [167] established a methodology to build a matrix M with different site configurations from the Ising model—using a simplified version of the Hamiltonian from Eq. (1)—and then perform PCA on M. This yields a set of new component vectors that describe the high-dimensional space of M in a new, lower-dimensional space. These new vectors are used as a basis set onto which the original samples are projected, and it is in this new space that clustering techniques like k-means can be used in order to find a phase transition.
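A minimal sketch of this PCA workflow, with crude stand-ins for the Ising configurations rather than genuine Monte Carlo samples, could look as follows:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
L, n = 16, 300

# Stand-in for Ising configurations: "ordered" samples biased towards +1
# or -1, and "disordered" samples of random spins, each flattened into a
# row of the matrix M.
ordered = np.sign(rng.normal(loc=rng.choice([-3, 3], size=(n, 1)),
                             size=(n, L * L)))
disordered = rng.choice([-1, 1], size=(n, L * L))
M = np.vstack([ordered, disordered]).astype(float)

pca = PCA(n_components=2).fit(M)
projected = pca.transform(M)
# For the Ising model the leading principal component is known to
# correlate with the magnetization, the physical order parameter.
magnetization = M.mean(axis=1)
print(np.corrcoef(projected[:, 0], magnetization)[0, 1])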
DL has also been wielded as a powerful technique to understand quantum phase
transitions [168] in the form of adversarial NNs [169]; to provide a thorough description of
many-body quantum systems [170] by employing Variational Autoencoders [171, 6]; and
CNNs [6] have been applied, for instance, to learn quantum topological invariants [172],
or to solve frustrated many-body models [173]. One of the main reasons for DL to
be such a strong asset is that it can handle complex data representations and achieve
good accuracy results, provided these models have enough data to learn from. The main drawback of these models is precisely the fact that the data must be representative of the task at hand; the data must also preserve the invariants of the system, if the system has them, such as invariance to translations and rotations. These types of issues are mostly solved with data augmentation techniques, such as adding some white noise to the data samples or creating additional mirrored samples, although other such modifications can be done. But again, this implies that the user of such models must know the problem well enough to perform such adjustments for the sake of stable training and good results.
4.4. Enhanced Simulation Methods
One of the standard computer simulation techniques in HM, along with SM, is the Monte Carlo (MC) method, an unbiased probabilistic method that enables the sampling of configurations from a classical or quantum many-body system [174, 175, 176, 177]. Throughout the decades, specific modifications have been performed to the MC method in order to enhance it, creating better, more rigorous methods that can be used in all types of simulations, from observing phase transitions to computing quantum path integrals.
It is in this situation that ML has something to offer in order to improve the original
MC method; such is the case of the Self-learning Monte Carlo (SLMC) method developed
by Li et al [178]. In this novel scheme, the classical MC method is used to obtain configurations from a Hamiltonian similar to Eq. (1), which will serve as the training data; we call this set of configurations s, drawn from a state space S, where each sample has a probability density given by some function p: S → ℝ+. Then a ML method is employed—linear regression, in this case—to learn the rules that guide the configuration updates in the simulation, i.e., a parametrized effective model wθ that approximates the original model w, with wθ: S̃ → ℝ+ and S̃ ⊇ S, such that wθ ≈ w. This
enables the MC method to learn by itself under which conditions a given configuration is
updated without external manipulation or guidance, efficiently sampling configurations
for the most general possible model. Demonstrated on a generalized version of the
Hamiltonian given by Eq. (1), the SLMC method is extremely efficient when producing
configurations from the system, lowering the autocorrelation time of each of the updates. Li et al discussed that, although a ten-fold speedup over the original MC method is obtained, the SLMC still suffers from the quintessential problem of the original MC method: when approaching the thermodynamic limit, the method slows down heavily, rendering it useless. Almost simultaneously, Huang and Wang [179] developed
a similar scheme but using restricted Boltzmann Machines (RBMs). This interesting
approach consists of building special types of RBMs that can use the physical properties
of the system being studied. The strategy by Huang and Wang lowers the autocorrelation time and raises the acceptance ratio of sampled configurations. The SLMC
method has sparked interesting new research [180, 181, 182, 183, 184]; recently, Nagai,
Okumura and Tanaka [185] have shown that the SLMC is even more promising by
coupling it with Behler-Parrinello NNs [186], making it more general and less biased
towards specific physical systems.
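The model-fitting step at the heart of SLMC can be illustrated in a few lines. The sketch below (our own construction, not the implementation of [178]) fits a single effective nearest-neighbor coupling by linear regression to the energies of a synthetic "original" model:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
L, n_samples = 8, 500

def nn_correlator(spins):
    # Feature for the effective model: the nearest-neighbor sum
    # sum_<ij> s_i s_j on a periodic square lattice.
    return (np.sum(spins * np.roll(spins, 1, axis=0))
            + np.sum(spins * np.roll(spins, 1, axis=1)))

# Training data: configurations and their energies under the "original"
# model, here faked as a nearest-neighbor part plus a small perturbation,
# standing in for a more complicated Hamiltonian.
configs = rng.choice([-1, 1], size=(n_samples, L, L))
features = np.array([[nn_correlator(s)] for s in configs])
E_original = -1.0 * features[:, 0] + 0.1 * rng.normal(size=n_samples)

# Fit J_eff so that E_eff = -J_eff * sum_<ij> s_i s_j approximates the
# original energies; cheap updates are then proposed with the effective
# model and corrected against the original one.
reg = LinearRegression().fit(-features, E_original)
print("J_eff =", reg.coef_[0])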
Despite its advantages, the SLMC method still has some shortcomings and
drawbacks. As with every other ML model, good and representative training data is required in order to train the model; the more complex the model, i.e., the more parameters it contains to be fitted, the more data samples are needed to avoid overfitting. This might become a problem if the original MC method used to sample configurations is already slow. Another important drawback is that the ML model needed to approximate the probability density of the configuration set must be robust enough to handle the problem at hand, so some intervention from the user is required in order to choose a good model; this is not a simple task because it would imply that the dynamics of the system must be known from the start, which might remove all advantages of using ML models for automated computation of the problem. To alleviate
some of these disadvantages, Bojesen [187] proposed a reinforcement learning approach,
defined as the Policy-guided Monte Carlo (PGMC) method. In this methodology,
the focus is now on an automatic approach to target the proposal generation for each configuration, which can be done by tuning the policy for making updates. Showing promising results in some models, the PGMC proves to be a better candidate for automatic MC sampling, although it also has its own shortcomings, primarily the fact that if the available information is inadequate, the training might become unstable and the probability density will be biased, ultimately breaking ergodicity.
Nevertheless, these methods prove that ML models can be used in CMP to obtain some gains in computational efficiency and simplicity, making them a potentially good tool for more challenging problems.
4.5. Outlook
Current research is geared towards enhancing mainstream methodologies in CMP with ML, enabling physicists to obtain high-precision results while also providing some insight into the learning mechanisms used by the ML algorithms employed. Another current area of research in Spin Systems is that of figuring out which ML techniques are more suitable to understand out-of-equilibrium systems [188] or even more complex frustrated models [189], just to name a few. Challenges such as representative data set generation and feature selection must also be overcome within this area if we wish to see these methodologies become more popular. This matter is even more important for out-of-equilibrium systems, because it is not a simple task to generate useful training data sets.
5. Physics-inspired Machine Learning Theory
5.1. Explanation of Deep Learning Theory
The gain in knowledge between ML and CMP is not as one-sided as one might think. ML has also benefited greatly from the thorough and rigorous theoretical frameworks that CMP possesses. Despite their huge success in human-like tasks—e.g., the classification of images [2]—, ML, and more importantly DL, models are not always able to explain why this success is even possible in the first place, because these models might not reach the algorithmic transparency of other classical ML models, e.g., linear models. Schmidt et al [11] mention that representative datasets, a good knowledge of the training process, and a comprehensive validation of the model can usually overcome this obstacle. Other classical algorithms, such as MC computations, are not always analytically tractable, so we might argue that this problem is not unique to ML models.
With this in mind, an interpretation of the models used can be done to some extent,
and even more so with the help of some theoretical frameworks, such as those found
within Physics.
In 2016, Zhang et al [190] took it upon themselves to run detailed experiments
that showed the lack of complete understanding of the current learning theory from
a theoretical perspective; on the practical side of things, DL is still very successful in its results. Since then, physicists have tried to give insights [191] into this topic,
creating a fruitful research area with extremely interesting cross-fertilization between
both communities. For a more thorough and complete analysis on the explanation and
interpretability of DL contributions until now, the reader can refer to the recent review
on the topic by Fan, Xiong and Wang [192].
5.2. Renormalization Group
The lack of theoretical understanding of the learning theory in most ML algorithms seemed like a good challenge for condensed matter physicists, who are experienced in manipulating theoretical frameworks such as the Renormalization Group (RG) [122, 123]. RG consists of a mathematical framework that allows a very rigorous and systematic investigation of a many-body system by working with short-distance fluctuations and building up this information to understand long-distance interactions of the whole system. In 2014, Mehta and Schwab [193] provided such an insight
by creating a mapping between the Variational Renormalization Group and the DL
workflow that is common nowadays. This turned out to be an interesting take on the
problem, resulting in a very rigorous framework by which one could explain the learning
theory that Deep NNs follow when data is fed to them. In the same spirit, Li and
Wang [194] proposed a new scheme, this time in a more systematic way, by means of
hierarchical transformations of the data from an abstract space—the space of features
present in the data—to a latent space. It turns out that this approach has an exact
and tractable likelihood, facilitating unbiased training with a step-by-step scheme; they
proved this through a practical example using Monte Carlo configurations sampled from
the one-dimensional and two-dimensional Ising models. More research regarding a deep
link between RG and DL has been carried out since then, for instance, Koch-Janusz and
Ringel [195] showed that even though the association between RG and DL is strong, most of the practical methodology needed to achieve good results from it is not so readily available. By using feed-forward NNs, they proved that an RG flow can be obtained without discarding the important degrees of freedom that are sometimes integrated out when following the RG framework. This seems like a promising and enriching area of research
as new insights are constantly developed [196, 197, 198].
5.3. Restricted Boltzmann Machines
Instead of giving a complete explanation of a learning theory, CMP has been able to give thorough explanations of specific ML algorithms; such is the case of the Restricted Boltzmann Machine (RBM) [199]. This type of model is a very interesting case of the inspiration that ML has drawn from statistical physics. In the RBM, a NN is trained on a given data set; the model learns complex internal representations of the data and, once trained, outputs an approximation of the probability distribution of the input data. The resulting approximate probability distribution is then linked to a certain ML task, be it classification [200], dimensionality reduction [201], feature selection/engineering [202], or many others. The underlying structure of the model is not a conventional FFNN; instead, it is a type of model built on a graph [203]. The principal mechanism by which the RBM learns features is by computing an energy function E(h, v) from certain hidden (h) and visible (v) units that form a configuration; after computing this energy function one evaluates the Boltzmann factor of the energy to obtain a probability for the given configuration, P(h, v) = e^{-E(h,v)}/Z. From classical statistical mechanics we know that the partition function Z = \sum_{h,v} e^{-E(h,v)} is the normalizing constant needed to ensure that P(h, v) is a proper probability distribution. It is this link between the Boltzmann description and the energy function of the RBM that makes it an ideal candidate to be analyzed within the statistical mechanics framework.
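To make this link concrete, the following is a minimal NumPy sketch (not a trained model) of the energy function and Boltzmann weights of a toy RBM with binary units; the sizes, couplings, and configuration are placeholder values chosen so that the partition function can be enumerated exactly.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 4, 3          # tiny sizes so Z can be enumerated exactly
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # couplings
a = np.zeros(n_visible)             # visible biases
b = np.zeros(n_hidden)              # hidden biases

def energy(v, h):
    """RBM energy E(h, v) = -a.v - b.h - v.W.h for binary units."""
    return -(a @ v + b @ h + v @ W @ h)

# Exact partition function: sum of Boltzmann factors over all (v, h) pairs
configs_v = list(itertools.product([0, 1], repeat=n_visible))
configs_h = list(itertools.product([0, 1], repeat=n_hidden))
Z = sum(np.exp(-energy(np.array(v), np.array(h)))
        for v in configs_v for h in configs_h)

v0 = np.array([1, 0, 1, 0])
h0 = np.array([0, 1, 0])
p = np.exp(-energy(v0, h0)) / Z     # Boltzmann probability P(h, v)
print(f"E = {energy(v0, h0):.4f}, P = {p:.6f}")
```

In practice the sizes are far too large for this enumeration, which is precisely why sampling-based training schemes such as contrastive divergence, discussed next, are needed.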
Interestingly enough, RBMs are extremely versatile, despite being a ML method whose training mechanisms are not as fast as those of more modern ML algorithms; special technical schemes have therefore been developed for the task. This is possibly the greatest drawback of using RBMs: the training procedure is not unique, and the most common procedure to train such models is the so-called contrastive divergence algorithm [204]. This algorithm can be tricky to use effectively and requires a certain amount of practical experience to decide how to tune the corresponding hyperparameters of the model. A good practical guide to training RBMs is the one by Hinton [205], where special care is taken to inform the reader of the uses and shortcomings of the training procedure for RBMs. We shall discuss some of the attempts at providing insight into the training mechanisms of RBMs by using theoretical frameworks.
For example, Huang and Toyoizumi [206] introduced a mean-field theory
approach to perform the training using a message-passing-based method that evaluates
the partition function, together with its gradients without requiring statistical
sampling—which is one of the drawbacks of traditional learning schemes—. Further
work on understanding the thermodynamics of the model was carried out [207] with
encouraging results, showing that the learning mechanism in general is completely
tractable, albeit complex when the model enters a non-linear regime. The RBM can also
be seen as the inverse Ising model [208], where instead of obtaining physical quantities
like order parameters and critical exponents from system configurations, one has as
input the physical observables and intends to obtain insights on the system itself. With
this theoretical framework in mind, the first steps to produce rigorous explanations of
the algorithm were done by Decelle et al. [209] by studying the linear regime of the
training dynamics using spectral analysis.
RBMs are rich and resourceful models that have been applied to a number of
hard tasks; for instance, the modeling of protein families from sequential data [210, 211].
Another such complex task is that of continuous speech recognition, on which a special
type of RBM was employed with promising results [212]. Within the realm of CMP, these
models have been used to construct accurate ground-state wave functions of strongly
interacting and entangled quantum spins, as well as fermionic models on lattices [213].
RBMs are currently among the most flexible algorithms, despite the disadvantages of their training mechanisms. These models are particularly well suited to study problems within CMP and quantum many-body physics [214]. With further research efforts to make RBMs scalable to larger data sets and to speed up their training mechanisms, it might be possible for RBMs to become a standard tool in ML applications to CMP problems.
5.4. Interpretation of Machine Learning models
Despite the success they have achieved, ML algorithms are prone to a lack of interpretability, resulting in so-called black-box algorithms that cannot grant a fully rigorous explanation of the learning mechanisms and features that produce their outputs. However, for a condensed matter physicist, it is most important to understand the core mechanism by which the ML model obtained the desired output, so that it can later be linked to a useful physical model. As an example, we can take the model with the Hamiltonian described by Eq. (1) and train a FFNN on simulation data, following the methodology of Carrasquilla and Melko [137]; once the NN has been trained, the critical exponent is obtained and used to compute the critical temperature, but the question remains: what enabled the NN to obtain such information? Is there a mechanism to extract meaningful physical knowledge from the weights and layers of the NN?
To answer these types of questions, researchers have had to resort to arduous extraction and analysis of the underlying model architectures. For example, Suchsland and Wessel [215] trained a shallow FFNN on two-dimensional configurations from Hamiltonians similar to Eq. (1), then proceeded to comprehensively analyze the weight matrices obtained from the training scheme, while also performing a meticulous study of the reasons why activation layers and their respective neurons fire a response when required; they found that both elements (training weights and activation layers) play a crucial role in understanding how the NN is able to output the physical observables to which we give meaning. They also determined that the learning mechanism corresponds to a domain-specific understanding of the system, i.e., in the case of the Ising model, the NN effectively learned the structure of the surrounding neighbors of a particular spin; with this information it then proceeded to compute the magnetization order parameter as part of the computations involved in the training and prediction strategies.
Another approach is to use classical ML algorithms like SVMs given that these
models provide a rigorous mathematical formulation for the mechanism by which they
learn from given features in the data. Ponte and Melko [148], as well as Gianetti et
al. [153] exploited these models and used the underlying decision function of the SVMs
to map the Ising model’s order parameters and obtained a robust way to extract the
critical exponents from the system. Within this approach, one does not need to provide a meticulous explanation of the learning mechanism of the model, since this has already been done in its own theoretical formulation; instead, an adjustment of the physical model with respect to the ML algorithm needs to be performed in order to successfully link both frameworks. Nonetheless, this approach has the disadvantage that it can be hard to map the ML method to the physical model, as it requires a deep and experienced understanding of the physical system, while simultaneously needing a comprehensive awareness of the theoretical formulation of the ML algorithm being used.
5.5. Outlook
The approach of using CMP to illustrate the training scheme, as well as the loss function landscape, of a ML algorithm has proven to be both intriguing and effective. Baity-Jesi
et al. [216] used the theory of glassy dynamics to give an interesting interpretation of
the training dynamics in DL models. Similar work was performed by Geiger et al. [217], but this time with the theory of jamming transitions, disclosing why poor minima of the loss function of a FFNN are not encountered in the overparametrized regime, i.e., when there are more parameters to fit in a NN than training samples. Dynamical explanations of training schemes have the advantage of providing precise frameworks that have been understood in the physics community for a long time, but they are an intricate solution, as they need a scrupulous analysis of the model, leaving little room for generalization to other ML models. It is true that FFNNs, as well as Deep NNs, are the least understood ML models, but they are not the only ones that need a carefully detailed framework; perhaps in future research we will see how other ML models benefit from the powerful techniques developed so far by the CMP community.
6. Materials Modeling
6.1. Enhanced Simulation Methods
Molecular dynamics (MD) is a classical simulation technique that integrates Newton’s
equations of motion at the microscale to simulate the dynamical evolution of atoms
and molecules [218]. It is, along with the MC method, a fundamental tool in computer
simulations within CMP, specifically in HM, SM and Quantum Chemistry. The powerful
method of MD enables the scientist to obtain physical observables in a controlled, simulated environment, bypassing limitations that most experimental setups have to deal with. MD has become a cornerstone method to model complex structures, such as proteins [219] and chemical structures [220], just to name a few. But such complex structures involve large numbers of particles, with their many degrees of freedom, as well as complicated interactions, in order to obtain meaningful results out of the simulations. This creates a difficult challenge to overcome, as efficient exploration of the phase space is a hard task, even with MD simulation code accelerated on modern hardware. In order to reduce the computational burden of modeling such many-body systems, special enhanced sampling methods have been developed [221], but not even these schemes can fully alleviate the arduousness of modeling complicated systems.
systems. ML is seen as a promising candidate to surpass modern solutions in MD
simulations, as shown primarily by Sidky and Whitmer [222]. In their novel approach,
authors employed FFNNs to learn the free energies of relevant chemical processes,
Machine Learning for Condensed Matter Physics 23
effectively enhancing MD simulations with ML and creating a comprehensive method
to obtain high-precision information from simulations that were not previously readily
available. Another interesting procedure to enhance MD simulations was the automated
discovery of features that could be potentially fed into intelligent models, making it
possible to create a systematic and automated workflow for materials research. Sultan
and Pande [223] developed such techniques by training several classical ML methods
—Support Vector Machines, Logistic Regression, and FFNNs —, showing that given
a MD framework one could set up a complete, self-adjustable scheme to obtain the
system’s free energy, entropy, structural information and many more observables.
Another powerful tool to study many-body systems is Density Functional
Theory [224] (DFT), which is a quantum mechanical framework for the investigation
of the electronic structure of atoms. We refer the reader to useful references for more
information on DFT [225, 226]. Although a very robust tool, its main disadvantage is that the quality of DFT results depends on the approximations and functionals used; it is also computationally demanding for large systems. The use of NNs to enhance
these computations has shown favorable results, for instance, in the work by Custodio,
Filletti and Franca [227], where an approximation of a density functional by means of
a NN reduces the computational cost, while simultaneously obtaining accurate results
for a vast range of fermionic systems. For similar applications to DFT, we would like to
refer the reader to a review where a number of different methods are used to solve the
Schrodinger equation and related problems of constructing functionals for DFT [228].
Although in its early stages, the enhancement of classical methods with ML is an intriguing area of research. When no rigorous foundation is needed, ML models are able to exceed human performance in the modeling of atomistic simulations, making them suitable candidates to become the mainstream methods in the future. Of course, there is
a long path ahead to fully embrace these methods, as a complete simulation strategy is
not easily available yet; further research on the coupling between molecular simulations
and ML schemes is the primary way to achieve this.
6.2. Machine-learned potentials
Nowadays, computer simulations are a standard research technique in CMP and its sub-fields. These simulations produce enormous quantities of data depending on the sub-field in which they are employed; for example, biophysics and applied soft matter have produced one of the largest databases, the Protein Data Bank [229], which contains freely available theoretical and experimental molecular information as a by-product of research on specific topics. This explosion of data available to the public has attracted scientists to the ML methodologies created to deal with the so-called big data outbreak [230] of recent years, effectively uniting both fields in an attempt to use such well-tested intelligent algorithms in the understanding and automation of computer-aided models.
One such task is that of constructing complex potential energy surfaces (PESs)
with the experimental and computational data available [12]. This has turned out to
be a successful application to materials science, physical chemistry and computational
chemistry, as it allows scientists to obtain high-quality, high-precision representations of
atomistic interactions of all kinds of materials and compounds. In atomistic simulations
and experiments, scientists are always interested in obtaining a good representation of PESs because it guarantees the full recovery of the potential energy (and, as a consequence, the full atomistic interactions) of a many-body system. The baseline technique that provides such information is the coupling of MD with DFT, which turns out to be a very computationally demanding scheme, even for the simplest cases. This was the starting point of a long-standing effort to use ML models that could compute such complicated functions, though not without the usual challenges of good data representation, stable training and good hyperparameter tuning. The first attempt was made by Bank et al [231], who proposed to fit FFNNs with information produced by some low-dimensional models of a CO molecule chemisorbed on a Ni(111) surface. They obtained promising results, such as faster evaluation of the potential energy by means of the NN model compared to the original empirical model. The data sets used by Bank et al comprised between two and twelve features, or degrees of freedom, making the NN models easy and simple to train.
But this is not always the case; in fact, one could argue that it is never the case, as systems like proteins consist of thousands of particles and electronic densities, hence a large number of degrees of freedom need to be computed simultaneously [232]. Another challenge for these types of ML potentials is that of the descriptors, or features, that must be fed to an intelligent model. Data fed into ML models should account for all possible variations and transformations seen in real physical models, to extend the generalization of the ML schemes and avoid bias in them.
It took more than ten years after the pioneering work of Bank et al for Behler and Parrinello [186] to take this challenge to a new level by proposing a similar methodology employing FFNNs, now advocating a more general approach to computing the PES of an atomistic system by creating what they called atom-centered symmetry functions. We will describe this method in detail, as it is an important achievement in the application of ML to CMP and materials modeling.
The main purpose of the Behler-Parrinello approach is to represent the total energy E of the system as a sum of atomic contributions E_i, such that E = \sum_i E_i. This is done by approximating E with a NN. The input of the NN should reflect information of the system from the contribution of all corresponding atoms, while also preserving important information about the local energetic environment. In this sense, symmetry functions are used as features, transforming the positions into other values that contain more useful information about the system. In order to define the energetically relevant local environment, a cutoff function f_c of the interatomic distance R_{ij} is employed, which
has the form

f_c(R_{ij}) = \begin{cases} 0.5 \left[ \cos\left( \pi R_{ij} / R_c \right) + 1 \right] & \text{for } R_{ij} \le R_c \, , \\ 0 & \text{for } R_{ij} > R_c \, . \end{cases} \qquad (2)
Next, radial symmetry functions are constructed as a sum of Gaussian functions with parameters \eta and R_s,

G_i^1 = \sum_{j \ne i}^{\text{all}} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij}) . \qquad (3)
Finally, angular terms are constructed for all triplets of atoms by summing the cosine values of the angles \theta_{ijk} = (\mathbf{R}_{ij} \cdot \mathbf{R}_{ik}) / (R_{ij} R_{ik}) centered at atom i, with \mathbf{R}_{ij} = \mathbf{R}_i - \mathbf{R}_j, and the following symmetry function is constructed

G_i^2 = 2^{1-\zeta} \sum_{j,k \ne i}^{\text{all}} \left( 1 + \lambda \cos\theta_{ijk} \right)^{\zeta} e^{-\eta \left( R_{ij}^2 + R_{ik}^2 + R_{jk}^2 \right)} f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk}) , \qquad (4)
with parameters \lambda (= +1, -1), \eta and \zeta. These parameters, along with the ones defined in Eq. (3), need to be found before these symmetry functions can be used as input to the NN. The parameter fitting is done by performing DFT calculations for the system to be studied; by computing the root mean squared error (RMSE) between the fit and the true values from the DFT computations, configurations for which the error of the fit is too large are added to the training set. With this information, a training and testing data set is constructed and then used to train the NN, which in turn yields an approximation for E. This approximation is then used to compute different observables and quantities of the system.
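As an illustration, the following is a minimal NumPy sketch of Eqs. (2)-(4) for a small cluster of atoms; the configuration and the parameter values (R_c, \eta, R_s, \zeta, \lambda) are placeholders chosen for demonstration, not fitted values.

```python
import numpy as np

def f_c(r, r_c):
    """Cutoff function of Eq. (2)."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def g1(positions, i, r_c, eta, r_s):
    """Radial symmetry function G1_i of Eq. (3) for atom i."""
    total = 0.0
    for j, r_j in enumerate(positions):
        if j == i:
            continue
        r_ij = np.linalg.norm(positions[i] - r_j)
        total += np.exp(-eta * (r_ij - r_s) ** 2) * f_c(r_ij, r_c)
    return total

def g2(positions, i, r_c, eta, zeta, lam):
    """Angular symmetry function G2_i of Eq. (4) for atom i."""
    total = 0.0
    n = len(positions)
    for j in range(n):
        for k in range(n):
            if i in (j, k) or j == k:
                continue
            rij = positions[j] - positions[i]
            rik = positions[k] - positions[i]
            rjk = positions[k] - positions[j]
            d_ij, d_ik, d_jk = map(np.linalg.norm, (rij, rik, rjk))
            cos_theta = rij @ rik / (d_ij * d_ik)   # cosine of theta_ijk
            total += ((1.0 + lam * cos_theta) ** zeta
                      * np.exp(-eta * (d_ij**2 + d_ik**2 + d_jk**2))
                      * f_c(d_ij, r_c) * f_c(d_ik, r_c) * f_c(d_jk, r_c))
    return 2.0 ** (1.0 - zeta) * total

# Toy configuration of four atoms; all values are illustrative only
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                [0.0, 1.1, 0.0], [0.9, 1.0, 0.3]])
print(g1(pos, 0, r_c=2.5, eta=1.0, r_s=1.0))
print(g2(pos, 0, r_c=2.5, eta=0.5, zeta=1.0, lam=1.0))
```

In an actual implementation one evaluates several such functions, with different parameter sets, per atom, and the resulting vectors form the NN input for each atomic contribution E_i.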
This methodology introduced a paradigm shift: it was no longer a mere proof of concept that ML algorithms could be used in these research fields, and many other attempts have been carried out since then. Following in the footsteps of Behler and Parrinello, Bartok et al [233] proposed a similar methodology, but with a completely different ML scheme (in this case, Gaussian Processes [51]), which in turn needed new atomistic descriptors. We shall discuss atomistic descriptors in more detail in the next section.
The advantages of such techniques in all the sub-fields of CMP are manifold, the main one being that simulations are now faster and more precise. Another advantage of such methods is that we can leverage the retention of the features learned by the NN: by employing transfer learning [234], the model does not need to be trained again to compute the same interactions it has already learned. For the reader interested in more detailed discussions on transfer learning, we point out some reviews [235, 236] that give enough information on the current advancements on this topic. Nevertheless, this approach has one significant drawback, which is the fact that most NNs cannot learn sequential tasks, which in turn creates a phenomenon called catastrophic forgetting [237, 238, 239, 240]. In the case of catastrophic forgetting, the NN abruptly loses most of the information learned from a previous task, and is unable to use it in the new task at hand. Transfer learning is a ML technique that needs meticulous manipulation from the user, involving fine tuning, a training process where some of the learned information is kept and some is learned again, followed by joint training, where new information is learned using both previous and newly acquired information obtained from fine tuning. Research is currently being carried out [241, 242] to overcome this disadvantage of NNs and shared learning.
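As a rough illustration of the fine-tuning step, the following PyTorch sketch freezes the feature layers of a hypothetical pretrained network and retrains only its head; the architecture and layer sizes are placeholders, not those of any published potential.

```python
import torch
import torch.nn as nn

# Toy 'pretrained' network: feature extractor plus a task-specific head
model = nn.Sequential(
    nn.Linear(30, 64), nn.ReLU(),   # feature layers to keep frozen
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),               # head to retrain for the new task
)

# Freeze every parameter, then unfreeze only the head
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

# Only the trainable parameters are handed to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
print(sum(p.numel() for p in model.parameters() if p.requires_grad))
```

Joint training would then unfreeze the remaining layers, usually with a smaller learning rate, once the new head has converged.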
Machine-learned potentials have also been successfully applied to explaining complex interactions, like the van der Waals interaction in water molecules and the reason this interaction results in the unique properties of water [243]. Another interesting application was the simulation of solid-liquid interfaces with MD [244], an application that was not thought to be computationally simple or even tractable. One of the remaining challenges in this area of research is a paradoxical one: these ML potentials require very large data sets that are computationally demanding to produce in order to train the intelligent model. One possible approach to alleviate this issue is to build and maintain a decentralized data bank similar to the Protein Data Bank (which could become a cornerstone of modern computational materials science and chemistry), enabling anyone interested in this type of research to collect insights from the acquired data without needing to run large-scale simulations or re-train complicated NN architectures. This is an interesting research field at the junction of many other specialized sub-fields, growing ever faster thanks to the computing power currently available, and setting up the pathway for new and interesting intelligent materials modeling, which could well become the future of materials science.
6.3. Atomistic Feature Engineering
So far, we have discussed the different ways in which scientists have been able to apply ML technology in different sub-fields of CMP, but lattice systems seem to prevail, with most of the reported research involving these systems. Why is that?
One of the fundamental parts of using a ML method is the data input with specific features. For instance, images are described by pixels in a two-dimensional grid, where the pixels are the features that numerically represent the image; these pixels are then fed to a ML model, e.g., a classifier. Another clear example is numerical inputs that can be readily used in regression tasks, and then used in a prediction model. We wish to obtain a similar representation for physical systems, so as to ease the use of ML models in condensed matter systems. Recall that the intrinsic features of lattice systems are the spins (\sigma_i); these spins do not change location, they stay put and only change their spin value, which is taken from a finite, discrete set of values. We can easily see a link between images and pixels, and a lattice system and its spins; both can be described by a set of discrete features that remain at fixed locations as the system evolves and changes. But we cannot say the same about atomistic and off-lattice systems, and thus we cannot use the same representation as for lattice systems, because atomistic systems are described primarily by the constant movement of their atoms and particles, and by their many degrees of freedom (for instance, a three-dimensional system has 3N translational degrees of freedom, with N being the total number of particles). By using the raw positional coordinates, velocities or accelerations, we might not be able to convey the true features that the system possesses; if at all possible, they could be conveyed in a biased way, e.g., by defining an arbitrary coordinate system or by not being invariant to translations and rotations. In order to overcome this limitation, a special mechanism needs to be developed to feed this data to the ML algorithms. It is this encoding of raw atomistic data into a set of special features that has seen a strong surge in recent years when dealing with materials of any kind.
Atomistic descriptors have been studied heavily throughout the years [245, 246,
247, 248, 249, 250, 251] because they are the gateway to applying ML methodologies in
atomistic systems. The reader is referred to these reviews [12, 252] for more information
on the topic. Descriptors are a systematic way of encoding physical information in a
set of feature vectors that can then be fed to a ML algorithm. These encodings need to
have the following properties: a) the descriptors must be invariant to any type of point
symmetry, for example, translations, rotations and permutations; b) the computation
of the descriptor must be fast as it needs to be evaluated for every possible structure in
the system; and c) the descriptors have to be differentiable with respect to the atomic
positions to enable the computation of analytical gradients for the forces. Computing these descriptors is essentially what the ML literature calls feature selection, and most of the time these procedures are carried out in an automated way within a normal ML workflow. Once a particular descriptor has been chosen, all the ML methodology that we have seen applied to lattice systems can be employed on atomistic and off-lattice systems.
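Property a) is easy to check numerically. The following sketch uses the simplest such descriptor, the sorted list of all pairwise distances, and verifies that it is unchanged under a random rigid motion; the point set itself is a random placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)

def pair_distance_descriptor(positions):
    """Sorted list of all pairwise distances: invariant to translations,
    rotations, and permutations of the particles."""
    n = len(positions)
    d = [np.linalg.norm(positions[i] - positions[j])
         for i in range(n) for j in range(i + 1, n)]
    return np.sort(d)

pos = rng.random((6, 3))

# Random orthogonal matrix (QR of a Gaussian matrix) plus a translation
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = pos @ Q.T + np.array([1.0, -2.0, 0.5])

assert np.allclose(pair_distance_descriptor(pos),
                   pair_distance_descriptor(moved))
print("descriptor unchanged under rotation + translation")
```

Descriptors used in practice, such as the symmetry functions of Section 6.2, add the locality and differentiability requirements on top of this basic invariance.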
6.4. Outlook
ML-enhanced materials modeling is a strong research area, with a focus on streamlining computations and automating material discovery. Descriptors that require less data should be a primary research direction, as data demand is currently one of the main shortcomings. A standard framework and benchmark data sets could also further this area, paving the way towards an automated workflow.
7. Soft Matter
At this stage, most of the ML applications we have discussed have primarily been focused on HM and atomistic many-body systems. This is largely because, in those cases, ML methods can be applied using data sets that need little feature engineering. Off-lattice and SM systems, on the other hand, need thorough feature engineering and a different approach to training ML models.
The applications we will review in this section reflect this inherent challenge of
finding representative features that could enable ML models to be used in tasks common
to all CMP, e.g., phase transitions and critical phenomena.
7.1. Phase Transitions
Phase transitions are the cornerstone of ML applications to lattice systems, but with the use of atomistic descriptors these applications can easily be extended to off-lattice systems, which are usually found in SM. Jadrich, Lindquist and Truskett [253, 254] developed the idea of using the set of distances between particles as a descriptor for the system, and applied unsupervised learning, in particular Principal Component Analysis (PCA), to these descriptors. They wanted to test whether a ML model was able to detect a phase transition in hard-disk and hard-sphere systems. When the unsupervised method was applied, the output of the model was later found to be the positional bond-order parameter of the system [255, 256], which effectively determines whether the system has undergone a phase transition. In other words, the ML model was able to automatically compute the order parameter without having been told to do so. A similar approach to colloidal systems was taken by Boattini and coworkers [257]. Instead of using inter-particle distances, they computed the orientational bond-order parameters [258, 259] and then used an unsupervised DL method (a FFNN-based autoencoder) to obtain a latent representation of the computed order parameters. With all these preprocessing steps, they fed the encoded data to a clustering algorithm and obtained detailed descriptions of all the phases of matter in the system at hand. This methodology is systematic and lends itself nicely to an automated form of phase classification in colloidal systems.
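A schematic version of this workflow, with synthetic random configurations standing in for actual Monte Carlo hard-disk data, could look as follows; the feature construction follows the spirit of Refs. [253, 254] rather than their exact recipe.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

def distance_features(positions, box, k=10):
    """Per-configuration feature vector: for each particle, the sorted
    distances to its k nearest neighbours (minimum-image convention)."""
    feats = []
    for i in range(len(positions)):
        dr = positions - positions[i]
        dr -= box * np.round(dr / box)          # periodic boundaries
        d = np.sort(np.linalg.norm(dr, axis=1))[1:k + 1]
        feats.append(d)
    return np.concatenate(feats)

# Placeholder 'configurations': random points in place of MC snapshots
box = 10.0
X = np.array([distance_features(rng.random((64, 2)) * box, box)
              for _ in range(200)])

pca = PCA(n_components=2)
z = pca.fit_transform(X)                        # low-dimensional projection
print(pca.explained_variance_ratio_)
# For real hard-disk data, Refs. [253, 254] found the leading component
# to track the positional bond-order parameter, separating the phases.
```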
7.2. Colloidal systems
Liquid crystals are a state of matter that has both the properties of a liquid and of a solid crystal [260]. In the simplest case, liquid crystals can be studied using one of the reference models of SM, i.e., hard rods or similar anisotropic particles [261]. These types of systems exhibit properties similar to those present in HM, such as topological defects. The main issue with reusing the same methodology is that building a data set with useful and representative features for systems made of liquid crystal molecules is not trivial, mainly because positions and other physical observables can take any continuous value instead of a specific value from a given set. In the work by Walters, Wei and Chen [262], a time-series approach using Recurrent NNs was employed, with the basic descriptors of a liquid crystal system, namely the orientation angles and positions of all the particles, as inputs. By using long short-term memory networks [263, 264] (LSTM), they showed that off-lattice simulation data can be used as input to identify several types of topological defects. This is a great step towards the use of mainstream ML techniques in SM, as one of the main drawbacks, namely feature selection and data set building, is resolved by this kind of method. Although collecting a data set from computer simulations poses few problems, doing so in experimental setups might be harder.
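A minimal sketch of such a sequence classifier, assuming per-frame descriptor vectors (e.g., orientation angles and positions flattened per frame) and a hypothetical set of four defect classes, might look like the following; this is an illustration of the idea, not the architecture of Ref. [262].

```python
import torch
import torch.nn as nn

class DefectLSTM(nn.Module):
    """Trajectory of per-frame descriptors -> LSTM -> defect-type logits."""
    def __init__(self, n_features, n_classes, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])      # classify from the last time step

# Placeholder: 32 particles, 3 descriptors each, 50 frames per trajectory
model = DefectLSTM(n_features=3 * 32, n_classes=4)
x = torch.rand(8, 50, 3 * 32)             # synthetic trajectories
print(model(x).shape)                      # -> torch.Size([8, 4])
```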
Similarly, Terao [265] used CNNs to analyze liquid crystals, as well as different types of solid-like structures, by computing special descriptors for the system. In particular, the approach presented in that work can be seen as a generalization of the one typically used for HM, which is to transform the off-lattice data into a common structure normally used in computer vision. By extracting subsystems with few particles from the total system, a conversion to grayscale images of size l × l (in the case of two-dimensional systems) was performed by using the following expression,
u_{i,j} = C \sum_{n=1}^{m_p} \exp\left( - \frac{|\mathbf{r}_n - \mathbf{R}_{ij}|^2}{a^2} \right),

where \mathbf{r}_n is the position of the n-th particle in the subsystem of size L_s and total number of particles m_p; \mathbf{R}_{ij} is the position vector of the center of pixel (i, j); and the constant C is determined such that the set u_{i,j} is rescaled within the
interval [0, 1]. The advantage of this method is its generalization potential: no physical input is needed for the technique to work, so a large range of systems can be studied under the same methodology. One possible disadvantage is that some systems might need to conserve certain symmetries, and this should be taken into account when constructing the data set.
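A minimal sketch of this conversion, with placeholder values for the image size l and smearing width a, and assuming the constant C is implemented by rescaling with the maximum pixel value:

```python
import numpy as np

def config_to_image(positions, box, l=32, a=0.5):
    """Map 2D particle coordinates to an l x l grayscale image by
    Gaussian smearing, then rescale the pixels to [0, 1]."""
    px = box / l                                   # pixel width
    centers = (np.arange(l) + 0.5) * px            # pixel-center coordinates
    gx, gy = np.meshgrid(centers, centers)         # grid of R_ij vectors
    u = np.zeros((l, l))
    for x, y in positions:
        u += np.exp(-((gx - x) ** 2 + (gy - y) ** 2) / a ** 2)
    return u / u.max()                             # C chosen via rescaling

rng = np.random.default_rng(3)
img = config_to_image(rng.random((50, 2)) * 10.0, box=10.0)
print(img.shape, img.min(), img.max())
```

The resulting arrays can then be fed to any standard image classifier, e.g., a CNN, exactly as one would with ordinary picture data.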
Polymers are materials that consist of large molecules composed of smaller subunits. These materials are of great interest in SM because most polymers are surrounded by a continuum, turning them into colloidal systems [266]. Polymers are also found in nature, e.g., deoxyribonucleic acid (DNA), present in almost all living organisms. Even though these systems have been studied extensively throughout the years, computational and experimental issues always arise. It is in these situations that ML might be a suitable approach. One such example is the design of a computational method for understanding and optimizing the properties of complex physical systems by Menon et al [267]. In that work, physical and statistical modeling are integrated into a hierarchical ML framework, together with simple experiments that probe the underlying forces which combine to determine changes in slurry rheology due to dispersants. One of the main contributions of this work is showing how physical and chemical information can reduce the number of samples in a data set needed for efficient ML modeling. If data sets are constructed using experimental data and some insight into the problem, the use of ML techniques might be easier to achieve. An issue raised in the work by Menon et al is that data sets might be subject to correlations between system variables and system responses. This could be a potential issue when training some ML models, affecting training stability and the quality of the results.
A more recent approach to polymers is the work by Xu and Jiang [268], where a ML methodology is used to study the swelling of polymers immersed in liquids. In this work, the importance of using chemical and physical descriptors from the beginning to construct the ML framework is highlighted. By constructing and using these descriptors, molecular representations are employed, along with data augmentation, to study the relationships between molecular fragments and swelling degrees with the help of PCA. Various other applications are constantly arising, such as the discovery and design of new types of polymers with ML methods [269]. As ML methods help with complex representations of data, they could potentially be used to accelerate the discovery of innovative materials within polymer science.
7.3. Properties of equilibrium and non-equilibrium systems
The structural and dynamical properties of equilibrium and non-equilibrium systems,
such as complex liquids, glasses, gels, granular materials, and many more, are extensively
studied under the branch of colloidal SM [270, 271]. Common tools, such as computer simulations, are used to study such properties, but for some systems these tools are computationally demanding to the point that special facilities, such as high-performance computing (HPC) clusters, are needed. Even though we are able to use this kind of computational resource in the current era, we are still limited by the amount of computation needed for large, physically representative systems. Studying simple reference models, Moradzadeh and Aluru [272] were able to build a DL autoencoder framework to compute structural properties of these simple systems. Using long-running MD simulations, they collected several position snapshots for a diverse range of Lennard-Jones systems and, using the autoencoder as a denoising framework, computed the time-averaged radial distribution function (RDF). This was achieved using a data set built with three main features: the thermodynamic state, composed of the temperature and density (T, \rho), at each time step, and the RDF snapshot at that exact moment. Both time and data are efficiently reduced with this methodology, as only 100 samples are needed once the network has been trained, although a large data set is employed for the training itself.
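A minimal sketch of such a denoising autoencoder is given below, with placeholder tensors standing in for per-snapshot RDF histograms and their time-averaged targets; this is a toy illustration of the idea, not the architecture of Ref. [272].

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Dense autoencoder mapping an RDF histogram (here 100 bins)
    through a low-dimensional bottleneck and back."""
    def __init__(self, n_bins=100, latent=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_bins, 32), nn.ReLU(), nn.Linear(32, latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, n_bins))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch: noisy per-snapshot RDFs and their clean targets
x_noisy = torch.rand(16, 100)
x_clean = torch.rand(16, 100)

loss = loss_fn(model(x_noisy), x_clean)   # train to denoise towards targets
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())
```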
Non-equilibrium systems have also seen some applications of ML. Employing SVMs, Schoenholz et al [273] found that structure is important to glassy dynamics in three dimensions. The results were obtained by performing MD simulations of a bidisperse Kob-Andersen Lennard-Jones glass [274] in three dimensions at different densities \rho and temperatures T above its dynamical glass transition temperature. At each density, a training set of 6,000 particles was selected, taken from a MD trajectory at the lowest T studied, to build a hyperplane in \mathbb{R}^M using SVMs. This hyperplane is then used to calculate S_i(t) for each particle i at each time t during an interval of 30,000 time steps at each \rho and T, where S_i is the softness of particle i, defined as the shortest distance between its position in \mathbb{R}^M and the hyperplane, with S_i > 0 if i lies on the soft side of the hyperplane and S_i < 0 otherwise. With this method, it is shown that there is structure hidden in the disorder of glassy liquids, and this structure can be quantified by softness, computed through a ML framework.
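Schematically, the softness computation amounts to a signed distance from a SVM hyperplane; the sketch below uses synthetic descriptors and labels in place of the actual structure functions and rearrangement labels of Ref. [273].

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(4)

# Placeholder structure functions: M-dimensional descriptor per particle,
# labelled 1 if the particle 'rearranged' in some time window, else 0
M = 20
X = rng.normal(size=(6000, M))
y = (X[:, 0] + 0.3 * rng.normal(size=6000) > 0).astype(int)

svm = LinearSVC(C=0.1, max_iter=10000).fit(X, y)

# Softness S_i: signed distance from particle i's descriptor to the
# hyperplane, positive on the 'soft' (rearranging) side
w, b = svm.coef_[0], svm.intercept_[0]
S = (X @ w + b) / np.linalg.norm(w)
print(S[:5])
```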
New ML applications to design and create glass materials are steadily appearing, as presented in the review by Han et al [275], where challenges and limitations of these applications to glassy systems are also discussed.
7.4. Experimental setups and Machine Learning
Experiments are a fundamental part of CMP, providing results that can later be compared with theoretical and computational predictions. Experimental setups have also been enhanced with ML models in different aspects of SM.
In the work by Li et al [276], collective dynamics, measured through piezoelectric relaxation studies, are explored to identify the onset of a structural phase transition in nanometer-scale volumes. Aided by k-means clustering on raw experimental data, Li et al were able to automatically determine phase diagrams for systems where identification of the underpinning physical mechanisms is difficult to achieve and the microscopic degrees of freedom are generally unavailable for observation. Although these results are more precise than those obtained with traditional tools, experimental setups might still be difficult and expensive to perform each time, so ML methods might be able to alleviate these shortcomings by requiring less data or performing more efficiently than other techniques.
We have touched several times on the fact that data, and the number of samples needed, are one of the main shortcomings for the application of ML methods to CMP. This is even more of a challenge in experimental setups, where collecting large quantities of data samples might not be possible for a given system. The end-to-end framework developed by Minor et al [277] shows that this problem can be overcome. The collection of physical data of a liquid crystalline mesophase was performed using an experimental setup, i.e., a mechanical quench experiment. The ML pipeline consisted of real-time object detection using the YOLOv2 [278] architecture to detect topological defects in the system. Standardization and image enhancement techniques were needed to mimic real-world inaccuracies in the training data set. The model achieved spatial and number resolution comparable to the human annotations against which the pipeline was compared, with a significant improvement in the time spent on each frame, thus resulting in a dramatic increase in the time resolution (more frames analyzed).
7.5. Outlook
As we have discussed so far, the fundamental step in applying ML to SM and off-lattice systems is to define and use a set of meaningful descriptors, but the choice of descriptors is vast as well as system-dependent. In the case of experimental setups, the efficient use of data samples is necessary. In order to fully embrace ML in SM and colloidal systems, a more automated approach will need to be developed, posing a new and fascinating research area. Recently, DeFever et al [279] developed a generalized approach that does not use hand-made descriptors; instead, a complete DL workflow is used with raw coordinates from simulations, and it is able to obtain promising results about phase transitions, both in and out of thermodynamic equilibrium, in SM systems. This scheme is a great step towards a fully automated strategy for off-lattice systems, as it reduces the amount of bias a scientist might introduce by using certain descriptors for given systems. The only downside of employing these types of DL methodologies is the large amount of data needed, so perhaps in future research we will see approaches where one can use a small amount of data to obtain similar results. These disadvantages seem to disappear in end-to-end frameworks such as the one described by Minor et al [277]; still, the amount of data might be an obstacle when applying ML models. This is discussed in detail in the next section.
8. Challenges and Perspectives
At this point of the review, we have discussed some of the major applications of ML to
CMP. We have also touched on other useful insights provided by theoretical frameworks
on ML, as well as discussing important points on various shortcomings and drawbacks
of using ML models in diverse applications. In this section, however, we would like to
expand on such discussions and address the main challenges that we see in the current
climate of this research area.
8.1. Benchmark data sets and performance metrics
In order to test new ML algorithms, a benchmark data set is needed. This data set serves the purpose of assessing the performance of a particular algorithm by means of a defined performance metric. One such example is the extremely popular MNIST data set [280], which is currently extensively used to test new ML algorithms. The most common task to solve on this data set is to check whether the proposed ML algorithm can successfully recognize all ten different classes of handwritten digits, in a supervised learning fashion. Thus, the performance metric is the accuracy, which corresponds to the total number of correct predictions over the total number of samples in the data set. This means that if a proposed NN architecture scores a 97% accuracy on the MNIST data set, the architecture correctly classifies, on average, 97 out of every 100 handwritten digits. Although MNIST has been widely used, it is not the only benchmark, and most researchers agree that this data set should be replaced by a more challenging one, for instance the Fashion-MNIST [281].
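For concreteness, the metric itself is a one-liner; the labels below are toy values.

```python
import numpy as np

y_true = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])   # ground-truth digits
y_pred = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])   # model predictions

accuracy = np.mean(y_true == y_pred)   # correct predictions / total samples
print(f"accuracy = {accuracy:.0%}")    # -> 90%
```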
With this in mind, the main point we wish to convey is that there is an important need for standardized and well-tested benchmark data sets in the current research area of ML and CMP, along with their respective performance metrics. Benchmark data sets would be used to try and test new ML models, verifying whether such models can provide the expected results and achieve a good score on the respective performance metric. We can look at the example of computing the critical temperature of an Ising model without the external field shown in Eq. (1). In this example, we have seen several possible ML approaches (NNs, SVMs, DL, Reinforcement Learning) that can achieve good results on this problem, so we might agree that such a problem can be taken as a benchmark. The Ising model is a simple one, which can be computed with standard MC methods, and a standard benchmark data set is fairly easy to obtain. This is not the case, however, for other CMP systems and problems, for example machine-learned potentials. As discussed in Ref. [12], the construction of such models currently requires very large reference data sets from electronic structure calculations, making the construction computationally very demanding and time consuming. This means that it is neither simple nor easy to test newly created ML models that tackle this problem; countless computing hours must be spent to obtain representative data samples just to test whether the proposed ML method performs well even in the simplest cases. Producing data sets for other CMP systems is even more complicated, in particular for those that come from experimental setups. Experiments can be very costly, and obtaining clean, labeled and representative data is not something that is currently being addressed completely.
The solution to this problem is to generate well-tested standard data sets for each problem in CMP. Although this is not a simple task, some efforts have been made in current research areas. Related to materials modeling and CMP, MoleculeNet by Wu et al [282] is a contribution that curates multiple public datasets, establishes metrics for evaluation, and offers high-quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms. Another effort, although in the realm of quantum chemistry, is the QM9 data set [283] for molecular structures and their properties. Few attempts are seen in the realm of SM, where one could test the simplest cases, namely the hard-sphere model, binary hard-sphere mixtures and some other simple-liquid benchmarks. This could signal a different approach to solving the problem of performance metrics and benchmark data sets, by means of physical correctness and standard results drawn from physical theories. For instance, when testing new molecular dynamics algorithms, the Lennard-Jones potential [284] is a well-established benchmark, given that all its properties are well known and studied. Instead of having to compute these values each time one wishes to test a new ML model, or having to read and extract the data set from the literature, a joint effort to build a common and curated data bank might help scientists to quickly test their newly proposed algorithms.
A final approach to solving this problem could be the reuse and sharing of data sets created for a given problem. While solving a problem within CMP, the data set created can be shared through an open service, e.g., Zenodo [285], where other scientists can easily access such information and use it in an open-science fashion [286]. Benchmarks are essential for the prevalence and proliferation of newly proposed ML algorithms applied to CMP, and this should be one of the main challenges to overcome.
8.2. Feature selection and Training of ML models
Another key challenge in the application of ML to CMP is that of feature selection. ML models rely on a data set that is representative of the system itself; without representative features, models might not provide physical insight or meaning. If a data set is not correctly labeled or structured, training can be unstable and different issues can arise. In standard ML, various feature selection techniques are available, along with some useful metrics [287]. Feature selection strongly depends on the data set and the problem alike, so most feature selection methods cannot be transferred directly to CMP problems. For them to be employed, physics criteria and insight into the problem are mandatory. For example, it requires the expertise of a SM physicist to distinguish between a nematic phase and a smectic phase in a liquid crystal, or to identify the symmetries of a given system that need to be conserved when using a ML model. In this review, we have seen some of these shortcomings when talking about atomistic descriptors and other applications. Unfortunately, this is a challenge that is not so easily resolved, as it requires experience with both the problem itself and the ML model to be used. One such example that we have already covered is the description of hard spheres and hard disks using the distances between particles shown in Ref. [253]. This feature selection is not a standard one in ML; it comes directly from the Physics of the problem at hand, and is an important contribution to feature selection in ML and CMP. Research of this kind might be needed even more than creating new ML models, so that more CMP systems can be tested with more feature selection techniques. Another possible approach to such a problem could be the automation of feature selection by means of ML models, for instance using a modification of the scheme proposed by Bojensen [187]. It is interesting to see that most automation methodologies use Reinforcement Learning, which might indicate that it is well suited to solving some of these drawbacks.
We now discuss the issue of training and using ML models in general. Handling and training ML models well enough to obtain good results is not so simple at first. Nowadays, there are many standard techniques used to train ML models, such as regularization, early stopping and many others; for a thorough and detailed explanation of these techniques, we refer the reader to the excellent review by Mehta et al [288], written especially for physicists. The main point here is that training ML models has its difficulties, and most of the time it is not easy to decide which of these techniques to implement, as not all of them are needed for the training to be stable enough. Dealing with overfitting and underfitting is one of the main issues in training ML models, especially in CMP applications, because of the possible lack of data samples and the difficulty of choosing the correct ML algorithm. Moreover, tuning the hyperparameters of ML models requires time and patience from the user, as it can be a time-consuming task, especially when approaching the problem with a ML model for which there is little knowledge of how to use it correctly. There are solutions to this problem, such as Bayesian optimization [289], a robust tool which can help tune hyperparameters in an automated fashion, with good results. Fortunately, more and more research is being carried out to alleviate this problem in ML, for instance the DropConnect approach [290], which greatly helps with the overfitting problem.
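As a simple illustration of automated tuning (here a plain random search rather than full Bayesian optimization), the following sketch cross-validates an SVM over hyperparameter ranges; the data set and the ranges are placeholders, not recommendations.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic classification data standing in for a physical data set
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Random search over C and gamma with 5-fold cross-validation
search = RandomizedSearchCV(
    SVC(kernel="rbf"),
    {"C": loguniform(1e-2, 1e3), "gamma": loguniform(1e-4, 1e1)},
    n_iter=25, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Bayesian optimization replaces the random draws with a surrogate model that proposes promising hyperparameters, usually reaching good settings in fewer evaluations.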
In general, most ML models are not readily applicable to CMP problems, mostly because of the lack of good data representations and the need for model tuning. There
is a diverse set of possible solutions to these problems brought directly from ML, but
these need to be tested in CMP problems to ensure such techniques are viable. New
research should be carried out in this way to enhance current and recently proposed ML
applications to CMP, and to make it easier for physicists to employ ML models.
8.3. Replicability of results
It is a known fact that current ML research faces a reproducibility crisis [291]: even with detailed descriptions from the original paper, an independent implementation of a model often does not yield the same results as those claimed in the original work. One of the main problems behind this phenomenon is that most research papers do not provide the code that was used to obtain the results presented in the work. Even if the code is provided, many of the claimed results cannot be obtained because of the stochasticity of certain algorithms; for example, when using dropout [292], a technique that reduces overfitting by randomly turning on or off certain parts of the NN architecture. The intrinsic randomness of most ML models is another key issue in solving the replicability problem. Most researchers deal with this problem by sharing the trained models along with the code used. This can greatly alleviate the problem, but it is still a rare practice in current ML research. Nonetheless, important efforts are being carried out to solve this problem. OpenML [293] is an initiative to collect and centralize ML models, data sets, pipelines and, most importantly, reproducible results. Another great example is DeepDIVA [294], an infrastructure designed to enable the quick and intuitive setup of reproducible experiments, with a large range of useful analysis functionality. Current research in this area [295, 296, 297] is necessary to create outreach and encourage practices that can lead to reproducible science.
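On the practical side, a first step any author can take is to pin down the sources of randomness; a minimal sketch for a NumPy/PyTorch pipeline follows (note that determinism flags can cost performance and do not cover every operation).

```python
import os
import random

import numpy as np
import torch

def set_seed(seed=0):
    """Fix the main sources of randomness for a reproducible run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Make cuDNN deterministic (may slow down training)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    os.environ["PYTHONHASHSEED"] = str(seed)

set_seed(42)
```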
Code and data sharing is not a common practice within CMP. Some researchers believe that the original paper must suffice to reproduce the results claimed; this might be true for most of the proposed methodologies and techniques, but not so much for ML applications to CMP. In order for ML to work with CMP, reproducibility is a key factor that needs to be present from the beginning. By focusing on sharing the full training pipeline, the data set used to train a given ML model, and the code used, ML can be quite successful in CMP and, most importantly, it can be implemented by anyone who wishes to use such tools. Until now, there has been no such approach in the current climate of ML and CMP, so it is a great opportunity to start implementing these practices of code, data and model sharing to ensure the reproducibility of results.
8.4. Perspectives
ML and CMP have created a very enriching and prosperous joint research area. Modern
ML techniques have been applied to CMP with outstanding results, and Physics has
given fruitful insights into the theoretical background of ML. But in order to embrace
and fully adopt ML tools into standard physics workflows, some challenges need to be
overcome.
Currently, there is no standard way to couple both ML and Physics simulation
software suites; the main approach is to use available software and hand-craft code that
can use the strengths of both parts. Physicists have seen the great advantage of using standard simulation software suites like LAMMPS [298] or ESPResSO [299],
but these are not meant to be used with ML workflows. A universal coupling scheme might be able to simplify research in this exciting scientific area.
Quantum systems have also seen a great deal of advances with ML applications [10], so it might be interesting to see whether the same schemes can be modified and applied to larger-scale systems, such as soft materials, as we have seen is possible in the topic of phase transitions.
New interesting theoretical insights have also been proposed to explain ML
by applying not only CMP, but other Physics frameworks, like the AdS/CFT
correspondence [300]. Physics is full of these rich and rigorous frameworks and new
research into the area might be able to give a full and thorough explanation of the
theory of Deep Learning and its learning mechanisms.
All in all, ML and CMP are just starting to exchange ideas, creating a fertile research area where the work is geared towards an automated, faster and simplified workflow that will enable condensed matter physicists to unravel the deep and intricate features of many-body systems.
Acknowledgments
This work was financially supported by Conacyt grant 287067.
References
[1] LeCun Y, Bengio Y and Hinton G 2015 Nature 521 436–444
[2] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Van Esesn B C, Awwal
A A S and Asari V K 2018 The history began from alexnet: A comprehensive survey on deep
learning approaches (Preprint 1803.01164)
[3] Young T, Hazarika D, Poria S and Cambria E 2018 IEEE Computational Intelligence Magazine
13 55–75
[4] Voulodimos A, Doulamis N, Doulamis A and Protopapadakis E 2018 Computational Intelligence
and Neuroscience 2018
[5] Litjens G, Kooi T, Bejnordi B E, Setio A A A, Ciompi F, Ghafoorian M, Van Der Laak J A,
Van Ginneken B and Sanchez C I 2017 Medical image analysis 42 60–88
[6] Goodfellow I, Bengio Y and Courville A 2016 Deep Learning (MIT Press) URL http://www.
deeplearningbook.org
Machine Learning for Condensed Matter Physics 37
[7] Sanchez-Lengeling B and Aspuru-Guzik A 2018 Science 361 360–365 ISSN 0036-
8075 (Preprint https://science.sciencemag.org/content/361/6400/360.full.pdf) URL
https://science.sciencemag.org/content/361/6400/360
[8] Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah
P, Spitzer M et al. 2019 Nature Reviews Drug Discovery 18 463–477
[9] Senior A W, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zıdek A, Nelson A W,
Bridgland A et al. 2020 Nature 1–5
[10] Carrasquilla J 2020 Machine learning for quantum matter (Preprint 2003.11040)
[11] Schmidt J, Marques M R, Botti S and Marques M A 2019 npj Computational Materials 5 1–36
[12] Behler J 2016 Journal of Chemical Physics 145 ISSN 00219606
[13] Webb M A, Delannoy J Y and de Pablo J J 2019 Journal of Chemical Theory and Computation
15 1199–1208 ISSN 1549-9618 publisher: American Chemical Society URL https://doi.org/
10.1021/acs.jctc.8b00920
[14] Schutt K T, Sauceda H E, Kindermans P J, Tkatchenko A and Muller K R 2018 The Journal of
Chemical Physics 148 241722 ISSN 0021-9606 publisher: American Institute of Physics URL
https://aip.scitation.org/doi/full/10.1063/1.5019779
[15] Mills K, Spanner M and Tamblyn I 2017 Physical Review A 96 042113 publisher: American
Physical Society URL https://link.aps.org/doi/10.1103/PhysRevA.96.042113
[16] Carleo G and Troyer M 2017 Science 355 602–606 ISSN 0036-8075, 1095-9203 publisher:
American Association for the Advancement of Science Section: Research Articles URL https:
//science.sciencemag.org/content/355/6325/602
[17] Glasser I, Pancotti N, August M, Rodriguez I D and Cirac J I 2018 Physical Review X 8 011006
publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevX.
8.011006
[18] Xie T and Grossman J C 2018 Physical Review Letters 120 145301 publisher: American Physical
Society URL https://link.aps.org/doi/10.1103/PhysRevLett.120.145301
[19] Snyder J C, Rupp M, Hansen K, Muller K R and Burke K 2012 Physical Review Letters 108
253002 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/
PhysRevLett.108.253002
[20] Li L, Snyder J C, Pelaschier I M, Huang J, Niranjan U N, Duncan P, Rupp M, Muller
K R and Burke K 2016 International Journal of Quantum Chemistry 116 819–833 ISSN
1097-461X eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/qua.25040 URL https:
//onlinelibrary.wiley.com/doi/abs/10.1002/qua.25040
[21] Carleo G, Cirac I, Cranmer K, Daudet L, Schuld M, Tishby N, Vogt-Maranto L and Zdeborova
L 2019 Reviews of Modern Physics 91 045002 publisher: American Physical Society URL
https://link.aps.org/doi/10.1103/RevModPhys.91.045002
[22] Alpaydin E 2014 Introduction to Machine Learning 3rd ed (Cambridge: MIT Press)
[23] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Hasan M, Van Essen B C,
Awwal A A and Asari V K 2019 Electronics 8 292
[24] Vapnik V 1998 Statistical Learning Theory 1st ed (New York: John Wiley and Sons, Inc)
[25] Smola A J and Scholkopf B 2004 Statistics and computing 14 199–222
[26] Zhang Y and Liu Y 2009 IEEE Signal Processing Letters 16 414–417
[27] Ben-Hur A, Horn D, Siegelmann H T and Vapnik V 2001 Journal of Machine Learning Research
2 125–137
[28] Hoffmann H 2007 Pattern recognition 40 863–874
[29] Tan S, Zheng F, Liu L, Han J and Shao L 2016 IEEE Transactions on Circuits and Systems for
Video Technology 28 356–363
[30] Liu J, Huang Z, Xu X, Zhang X, Sun S and Li D 2020 IEEE Transactions on Systems, Man, and
Cybernetics: Systems
[31] Nguyen T T, Nguyen N D and Nahavandi S 2020 IEEE Transactions on Cybernetics 1–14 ISSN
2168-2275
[32] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller
M, Fidjeland A K, Ostrovski G et al. 2015 Nature 518 529–533
[33] Shawe-Taylor J, Cristianini N et al. 2004 Kernel methods for pattern analysis (Cambridge
university press)
[34] Aggarwal C C 2018 Neural Networks and Deep Learning: A Textbook 1st ed (Springer Publishing
Company, Incorporated) ISBN 3319944622
[35] Xu X, Hu D and Lu X 2007 IEEE Transactions on Neural Networks 18 973–992
[36] An Y, Ding S, Shi S and Li J 2018 Pattern Recognition Letters 111 30–35
[37] Driessens K, Ramon J and Gartner T 2006 Machine Learning 64 91–119
[38] Jiu M and Sahbi H 2019 Pattern Recognition 88 447 – 457 ISSN 0031-3203 URL http:
//www.sciencedirect.com/science/article/pii/S0031320318304266
[39] Le L and Xie Y 2019 Neurocomputing 339 292–302
[40] Bai L, Shao Y H, Wang Z and Li C N 2019 Knowledge-Based Systems 163 227–240
[41] Sanakoyeu A, Bautista M A and Ommer B 2018 Pattern Recognition 78 331 – 343 ISSN 0031-3203
URL http://www.sciencedirect.com/science/article/pii/S0031320318300293
[42] Tesauro G 1992 Practical issues in temporal difference learning Advances in neural information
processing systems pp 259–266
[43] Liu H, Zhou J, Xu Y, Zheng Y, Peng X and Jiang W 2018 Neurocomputing 315 412–424
[44] da Silva L E B, Elnabarawy I and Wunsch II D C 2019 Neural Networks 120 167–203
[45] Tan A H, Lu N and Xiao D 2008 IEEE Transactions on Neural Networks 19 230–244
[46] Salakhutdinov R and Hinton G 2009 Deep boltzmann machines Artificial intelligence and
statistics pp 448–455
[47] Bishop C 2006 Pattern Recognition and Machine Learning Information Science and
Statistics (Springer) ISBN 9780387310732 URL https://books.google.com.mx/books?id=
qWPwnQEACAAJ
[48] Hu Y, Luo S, Han L, Pan L and Zhang T 2020 Artificial Intelligence in Medicine
102 101764 ISSN 0933-3657 URL http://www.sciencedirect.com/science/article/pii/
S093336571830366X
[49] Hinton G E and Salakhutdinov R R 2006 Science 313 504–507
[50] Quinlan J R 1986 Machine Learning 1 81–106
[51] Williams C K and Rasmussen C E 2006 Gaussian processes for machine learning 2nd ed (MIT
press Cambridge, MA)
[52] Roth V 2004 IEEE transactions on neural networks 15 16–28
[53] Graupe D 2019 Principles of Artificial Neural Networks: Basic Designs to Deep Learning 4th ed
(Singapore: World Scientific)
[54] Aggarwal C 2020 Linear Algebra and Optimization for Machine Learning: A Textbook (Springer
International Publishing) ISBN 9783030403447 URL https://books.google.com.mx/books?
id=EdTkDwAAQBAJ
[55] Guresen E and Kayakutlu G 2011 Procedia Computer Science 3 426 – 433 ISSN 1877-
0509 world Conference on Information Technology URL http://www.sciencedirect.com/
science/article/pii/S1877050910004461
[56] Chen T and Chen H 1995 IEEE Transactions on Neural Networks 6 911–917
[57] Shrestha A and Mahmood A 2019 IEEE Access 7 53040–53065
[58] Bottou L 2012 Stochastic gradient descent tricks Neural Networks: Tricks of the Trade:
Second Edition ed Montavon G, Orr G B and Muller K R (Berlin, Heidelberg: Springer
Berlin Heidelberg) pp 421–436 ISBN 978-3-642-35289-8 URL https://doi.org/10.1007/
978-3-642-35289-8_25
[59] Pascanu R and Bengio Y 2013 Revisiting natural gradient for deep networks (Preprint 1301.3584)
[60] Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwinska A, Colmenarejo
S G, Grefenstette E, Ramalho T, Agapiou J et al. 2016 Nature 538 471–476
[61] Quan Z, Zeng W, Li X, Liu Y, Yu Y and Yang W 2019 IEEE Transactions on Neural Networks
and Learning Systems 31 813–826
[62] Van Veen F and Leijnen S 2019 The Neural Network Zoo (The Asimov Institute) URL
https://www.asimovinstitute.org/neural-network-zoo
[63] Rosenblatt F 1958 Psychological Review 65 386–408
[64] Hornik K, Stinchcombe M and White H 1989 Neural Networks 2 359–366
[65] Jain A K, Mao J and Mohiuddin K M 1996 Computer 29 31–44
[66] Frommberger L 2010 Qualitative spatial abstraction in reinforcement learning (Springer Science
& Business Media)
[67] Bourlard H and Kamp Y 1988 Biological cybernetics 59 291–294
[68] Snoek J, Adams R P and Larochelle H 2012 Journal of Machine Learning Research 13 2567–2588
[69] Blaschke T, Olivecrona M, Engkvist O, Bajorath J and Chen H 2018 Molecular Informatics 37
1700123 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/minf.201700123)
URL https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700123
[70] Talwar D, Mongia A, Sengupta D and Majumdar A 2018 Scientific reports 8 1–11
[71] Vellido A, Lisboa P J and Vaughan J 1999 Expert Systems with applications 17 51–70
[72] Hsieh T J, Hsiao H F and Yeh W C 2011 Applied soft computing 11 2510–2525
[73] Bohanec M, Borstnar M K and Robnik-Sikonja M 2017 Expert Systems with Applications 71
416–428
[74] Montavon G, Samek W and Muller K R 2018 Digital Signal Processing 73 1–15
[75] Padierna L C, Carpio M, Rojas-Domınguez A, Puga H and Fraire H 2018 Pattern Recognition
84 211 – 225 ISSN 0031-3203 URL http://www.sciencedirect.com/science/article/pii/
S0031320318302280
[76] Rojas-Domınguez A, Padierna L C, Valadez J M C, Puga-Soberanes H J and Fraire H J 2017
IEEE Access 6 7164–7176
[77] Vapnik V N 1999 IEEE transactions on neural networks 10 988–999
[78] Maldonado S and Lopez J 2014 Information Sciences 268 328–341
[79] Xu Y, Yang Z and Pan X 2016 IEEE transactions on neural networks and learning systems 28
359–370
[80] Tsang I W, Kwok J T and Cheung P M 2005 Journal of Machine Learning Research 6 363–392
URL https://www.jmlr.org/papers/v6/tsang05a.html
[81] Sadrfaridpour E, Razzaghi T and Safro I 2019 Machine Learning 108 1879–1917
[82] Chang C C and Lin C J 2011 ACM Transactions on Intelligent Systems and Technology 2(3)
27:1–27:27 software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[83] Fan R E, Chang K W, Hsieh C J, Wang X R and Lin C J 2008 Journal of Machine Learning
Research 9 1871–1874
[84] Shalev-Shwartz S, Singer Y, Srebro N and Cotter A 2011 Mathematical programming 127 3–30
[85] Nandan M, Khargonekar P P and Talathi S S 2014 J. Mach. Learn. Res. 15 59–98 ISSN 1532-4435
[86] Bohn B, Griebel M and Rieger C 2019 Journal of Machine Learning Research 20 1–32 URL
http://jmlr.org/papers/v20/17-621.html
[87] Chaikin P M and Lubensky T C 1995 Principles of Condensed Matter Physics (Cambridge:
Cambridge University Press)
[88] Anderson P W 2018 Basic notions of condensed matter physics (CRC Press)
[89] Girvin S M and Yang K 2019 Modern Condensed Matter Physics (Cambridge University Press)
[90] Kohn W 1999 Rev. Mod. Phys. 71(2) S59–S77
[91] Lubensky T 1997 Solid State Communications 102 187 – 197 ISSN 0038-1098 highlights in
Condensed Matter Physics and Materials Science
[92] Witten T A 1999 Rev. Mod. Phys. 71(2) S367–S373
[93] Yan H, Park S H, Finkelstein G, Reif J H and LaBean T H 2003 Science 301 1882–1884 ISSN
0036-8075
[94] Zhang S, Marini D M, Hwang W and Santoso S 2002 Current Opinion in Chemical Biology 6
865 – 871 ISSN 1367-5931
[95] De Gennes P 2005 Soft Matter 1(1) 16–16
[96] Russel W B, Saville D A and Schowalter W R 1991 Colloidal Dispersions Cambridge
Monographs on Mechanics (Cambridge University Press) ISBN 9780521426008 URL https:
//books.google.com.mx/books?id=3shp8Kl6YoUC
[97] Eberle A P R, Wagner N J and Castaneda Priego R 2011 Phys. Rev. Lett. 106(10) 105704 URL
https://link.aps.org/doi/10.1103/PhysRevLett.106.105704
[98] Eberle A P, Castaneda-Priego R, Kim J M and Wagner N J 2012 Langmuir 28 1866–1878
[99] Valadez-Perez N E, Benavides A L, Scholl-Paschinger E and Castaneda-Priego R 2012 The
Journal of chemical physics 137 084905
[100] Cheng Z, Chaikin P, Russel W, Meyer W, Zhu J, Rogers R and Ottewill R 2001 Materials &
Design 22 529–534
[101] McGrother S C, Gil-Villegas A and Jackson G 1998 Molecular Physics 95 657–
673 (Preprint https://doi.org/10.1080/00268979809483199) URL https://doi.org/10.
1080/00268979809483199
[102] McGrother S C, Gil-Villegas A and Jackson G 1996 Journal of Physics: Condensed Matter 8
9649–9655
[103] Bleil S, von Grunberg H H, Dobnikar J, Castaneda-Priego R and Bechinger C 2006 Europhysics
Letters (EPL) 73 450–456
[104] Anderson V J and Lekkerkerker H N 2002 Nature 416 811–815
[105] Zinn-Justin J 2007 Phase Transitions and Renormalization Group Oxford Graduate Texts (OUP
Oxford) ISBN 9780199227198 URL https://books.google.com.mx/books?id=mUMVDAAAQBAJ
[106] Binney J J, Dowrick N J, Fisher A J and Newman M E J 1992
The Theory of Critical Phenomena: An Introduction to the Renormalization Group Oxford
Science Publ (Clarendon Press) ISBN 9780198513933 URL https://books.google.com.mx/
books?id=lvZcGJI3X9cC
[107] Mermin N D 1979 Rev. Mod. Phys. 51(3) 591–648
[108] Chuang I, Yurke B, Pargellis A N and Turok N 1993 Phys. Rev. E 47(5) 3343–3356
[109] Gil-Villegas A, McGrother S C and Jackson G 1997 Chemical Physics Letters 269
441 – 447 ISSN 0009-2614 URL http://www.sciencedirect.com/science/article/pii/
S0009261497003072
[110] Alder B J and Wainwright T E 1957 The Journal of Chemical Physics 27 1208–1209 ISSN 0021-
9606 publisher: American Institute of Physics URL https://aip.scitation.org/doi/10.
1063/1.1743957
[111] Fernandez L A, Martın-Mayor V, Seoane B and Verrocchio P 2012 Physical Review Letters 108
165701 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/
PhysRevLett.108.165701
[112] Hoover W G and Ree F H 1968 The Journal of Chemical Physics 49 3609–3617 ISSN 0021-9606
publisher: American Institute of Physics URL https://aip.scitation.org/doi/10.1063/
1.1670641
[113] Robles M, Lopez de Haro M and Santos A 2014 The Journal of Chemical Physics 140 136101
ISSN 0021-9606 publisher: American Institute of Physics URL https://aip.scitation.org/
doi/10.1063/1.4870524
[114] Annett J F 2004 Superconductivity, superfluids, and condensates Oxford Master Series in
Condensed Matter Physics (Oxford University Press) ISBN 9780198507550
[115] Tinkham M 1996 Introduction to superconductivity 2nd ed International Series in Pure and
Applied Physics (McGraw Hill) ISBN 9780070648784
[116] Santi E, Eskandari S, Tian B and Peng K 2018 6 - power electronic modules Power Electronics
Handbook (Fourth Edition) ed Rashid M H (Butterworth-Heinemann) pp 157 – 173 fourth
edition ed ISBN 978-0-12-811407-0 URL http://www.sciencedirect.com/science/article/
pii/B9780128114070000064
[117] Yeh N C 2008 AAPPS Bulletin 18 11
[118] National Research Council 2007 Condensed-Matter and Materials Physics: The Science of the World Around
Us (Washington, DC: The National Academies Press) ISBN 978-0-309-10969-7
[119] Cipra B A 1987 The American Mathematical Monthly 94 937–959 (Preprint https://doi.
org/10.1080/00029890.1987.12000742) URL https://doi.org/10.1080/00029890.1987.
12000742
[120] Ising E 1925 Z. Phys. 31 253–258
[121] Onsager L 1944 Phys. Rev. 65(3-4) 117–149
[122] Wilson K G 1971 Phys. Rev. B 4(9) 3174–3183
[123] Wilson K G 1971 Phys. Rev. B 4(9) 3184–3205 URL https://link.aps.org/doi/10.1103/
PhysRevB.4.3184
[124] Ferrenberg A M and Landau D P 1991 Phys. Rev. B 44(10) 5081–5091 URL https://link.
aps.org/doi/10.1103/PhysRevB.44.5081
[125] Stanley H E 1968 Phys. Rev. Lett. 20(12) 589–592 URL https://link.aps.org/doi/10.1103/
PhysRevLett.20.589
[126] Ashkin J and Teller E 1943 Phys. Rev. 64(5-6) 178–184 URL https://link.aps.org/doi/10.
1103/PhysRev.64.178
[127] Kunz H and Zumbach G 1992 Phys. Rev. B 46(2) 662–673 URL https://link.aps.org/doi/
10.1103/PhysRevB.46.662
[128] Straley J P and Fisher M E 1973 Journal of Physics A: Mathematical, Nuclear and General 6
1310–1326 URL https://doi.org/10.1088/0305-4470/6/9/007
[129] Berezinsky V 1971 Sov. Phys. JETP 32 493–500 URL https://inspirehep.net/literature/
61186
[130] Berezinsky V 1972 Sov. Phys. JETP 34 610 URL https://inspirehep.net/literature/
1716846
[131] Kosterlitz J M and Thouless D J 1973 Journal of Physics C: Solid State Physics 6 1181–1203
[132] Kosterlitz J M 1974 Journal of Physics C: Solid State Physics 7 1046–1060
[133] Chen J, Liao H J, Xie H D, Han X J, Huang R Z, Cheng S, Wei Z C, Xie Z Y and Xiang T 2017
Chinese Physics Letters 34 050503
[134] Yong J, Lemberger T R, Benfatto L, Ilin K and Siegel M 2013 Phys. Rev. B 87(18) 184505 URL
https://link.aps.org/doi/10.1103/PhysRevB.87.184505
[135] Lin Y H, Nelson J and Goldman A M 2012 Phys. Rev. Lett. 109(1) 017002 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.109.017002
[136] Gibney E and Castelvecchi D 2016 Nature News 538 18
[137] Carrasquilla J and Melko R G 2017 Nature Physics 13 431–434 ISSN 1745-2473
[138] Nightingale P 1982 Journal of Applied Physics 53 7927–7932
[139] Kogut J B 1979 Rev. Mod. Phys. 51(4) 659–713
[140] Deng D L, Li X and Das Sarma S 2017 Phys. Rev. B 96(19) 195145
[141] Broecker P, Carrasquilla J, Melko R G and Trebst S 2017 Scientific reports 7 1–10
[142] Tanaka A and Tomiya A 2017 Journal of the Physical Society of Japan 86 063001
[143] Zhang Y, Melko R G and Kim E A 2017 Phys. Rev. B 96(24) 245119
[144] Rodriguez-Nieva J F and Scheurer M S 2019 Nature Physics 15 790–795
[145] Beach M J S, Golubeva A and Melko R G 2018 Phys. Rev. B 97(4) 045207 URL https:
//link.aps.org/doi/10.1103/PhysRevB.97.045207
[146] Melko R G and Gingras M J P 2004 Journal of Physics: Condensed Matter 16 R1277–R1319
URL https://doi.org/10.1088/0953-8984/16/43/r02
[147] Ch’ng K, Carrasquilla J, Melko R G and Khatami E 2017 Phys. Rev. X 7(3) 031038 URL
https://link.aps.org/doi/10.1103/PhysRevX.7.031038
[148] Ponte P and Melko R G 2017 Phys. Rev. B 96(20) 205146 URL https://link.aps.org/doi/
10.1103/PhysRevB.96.205146
[149] Morningstar A and Melko R G 2017 J. Mach. Learn. Res. 18 5975–5991 ISSN 1532-4435
[150] Efthymiou S, Beach M J S and Melko R G 2019 Phys. Rev. B 99(7) 075113 URL https:
//link.aps.org/doi/10.1103/PhysRevB.99.075113
[151] Zhang W, Liu J and Wei T C 2019 Physical Review E 99 1–16 ISSN 24700053 (Preprint
1804.02709)
[152] Kim D and Kim D H 2018 Phys. Rev. E 98(2) 022138
[153] Giannetti C, Lucini B and Vadacchino D 2019 Nuclear Physics B 944 114639 ISSN 05503213
[154] Friedman J, Hastie T and Tibshirani R 2001 The elements of statistical learning vol 1 (Springer
series in statistics New York)
[155] Shiina K, Mori H, Okabe Y and Lee H K 2020 Scientific Reports 10
[156] Wu F Y 1982 Rev. Mod. Phys. 54(1) 235–268 URL https://link.aps.org/doi/10.1103/
RevModPhys.54.235
[157] Carvalho D, Garcıa-Martınez N A, Lado J L and Fernandez-Rossier J 2018 Phys. Rev. B 97(11)
115453 URL https://link.aps.org/doi/10.1103/PhysRevB.97.115453
[158] Ohtsuki T and Ohtsuki T 2016 Journal of the Physical Society of Japan 85 123706
(Preprint https://doi.org/10.7566/JPSJ.85.123706) URL https://doi.org/10.7566/
JPSJ.85.123706
[159] Ohtsuki T and Ohtsuki T 2017 Journal of the Physical Society of Japan 86 044708
(Preprint https://doi.org/10.7566/JPSJ.86.044708) URL https://doi.org/10.7566/
JPSJ.86.044708
[160] Ohtsuki T and Mano T 2020 Journal of the Physical Society of Japan 89 022001 (Preprint https:
//doi.org/10.7566/JPSJ.89.022001) URL https://doi.org/10.7566/JPSJ.89.022001
[161] Livni R, Shalev-Shwartz S and Shamir O 2014 On the Computational Efficiency
of Training Neural Networks Advances in Neural Information Processing Systems
27 ed Ghahramani Z, Welling M, Cortes C, Lawrence N D and Weinberger
K Q (Curran Associates, Inc.) pp 855–863 URL http://papers.nips.cc/paper/
5267-on-the-computational-efficiency-of-training-neural-networks.pdf
[162] Song L, Vempala S, Wilmes J and Xie B 2017 On the Complexity of Learning
Neural Networks Advances in Neural Information Processing Systems 30 ed Guyon
I, Luxburg U V, Bengio S, Wallach H, Fergus R, Vishwanathan S and Gar-
nett R (Curran Associates, Inc.) pp 5514–5522 URL http://papers.nips.cc/paper/
7135-on-the-complexity-of-learning-neural-networks.pdf
[163] van der Maaten L and Hinton G 2008 Journal of Machine Learning Research 9 2579–2605
[164] Ng A Y, Jordan M I and Weiss Y 2001 On spectral clustering: Analysis and an algorithm
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (MIT Press) pp 849–
856
[165] Pearson K 1901 The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science
2 559–572
[166] Wetzel S J 2017 Phys. Rev. E 96(2) 022140 URL https://link.aps.org/doi/10.1103/
PhysRevE.96.022140
[167] Wang L 2016 Phys. Rev. B 94(19) 195105 URL https://link.aps.org/doi/10.1103/
PhysRevB.94.195105
[168] Huembeli P, Dauphin A and Wittek P 2018 Phys. Rev. B 97(13) 134109
[169] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and
Bengio Y 2014 Generative adversarial nets Advances in neural information processing systems
pp 2672–2680
[170] Rocchetto A, Grant E, Strelchuk S, Carleo G and Severini S 2018 npj Quantum Information 4
1–7
[171] Kingma D P and Welling M 2013 arXiv preprint arXiv:1312.6114
[172] Zhang P, Shen H and Zhai H 2018 Phys. Rev. Lett. 120(6) 066401 URL https://link.aps.org/
doi/10.1103/PhysRevLett.120.066401
[173] Liang X, Liu W Y, Lin P Z, Guo G C, Zhang Y S and He L 2018 Phys. Rev. B 98(10) 104426
URL https://link.aps.org/doi/10.1103/PhysRevB.98.104426
[174] Baumgartner A, Burkitt A, Ceperley D, De Raedt H, Ferrenberg A, Heermann D, Herrmann H,
Landau D, Levesque D, von der Linden W et al. 2012 The Monte Carlo method in condensed
matter physics vol 71 (Springer Science & Business Media)
[175] Newman M and Barkema G 1999 Monte Carlo methods in statistical physics chapters 1–4 (New
York: Oxford University Press)
[176] Gubernatis J E 2003 The monte carlo method in the physical sciences: celebrating the 50th
anniversary of the metropolis algorithm The Monte Carlo Method in the Physical Sciences vol
690
[177] Landau D P and Binder K 2014 A guide to Monte Carlo simulations in statistical physics
(Cambridge university press)
[178] Liu J, Qi Y, Meng Z Y and Fu L 2017 Physical Review B 95 1–5 ISSN 24699969
[179] Huang L and Wang L 2017 Phys. Rev. B 95(3) 035105 URL https://link.aps.org/doi/10.
1103/PhysRevB.95.035105
[180] Liu J, Shen H, Qi Y, Meng Z Y and Fu L 2017 Phys. Rev. B 95(24) 241104 URL https:
//link.aps.org/doi/10.1103/PhysRevB.95.241104
[181] Nagai Y, Shen H, Qi Y, Liu J and Fu L 2017 Phys. Rev. B 96(16) 161102 URL https:
//link.aps.org/doi/10.1103/PhysRevB.96.161102
[182] Shen H, Liu J and Fu L 2018 Phys. Rev. B 97(20) 205140 URL https://link.aps.org/doi/
10.1103/PhysRevB.97.205140
[183] Chen C, Xu X Y, Liu J, Batrouni G, Scalettar R and Meng Z Y 2018 Phys. Rev. B 98(4) 041102
URL https://link.aps.org/doi/10.1103/PhysRevB.98.041102
[184] Pilati S, Inack E M and Pieri P 2019 Phys. Rev. E 100(4) 043301 URL https://link.aps.org/
doi/10.1103/PhysRevE.100.043301
[185] Nagai Y, Okumura M and Tanaka A 2020 Phys. Rev. B 101(11) 115111 URL https://link.
aps.org/doi/10.1103/PhysRevB.101.115111
[186] Behler J and Parrinello M 2007 Phys. Rev. Lett. 98(14) 146401
[187] Bojesen T A 2018 Physical Review E 98 063303 publisher: American Physical Society URL
https://link.aps.org/doi/10.1103/PhysRevE.98.063303
[188] Venderley J, Khemani V and Kim E A 2018 Phys. Rev. Lett. 120(25) 257204 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.120.257204
[189] Choo K, Neupert T and Carleo G 2019 Phys. Rev. B 100(12) 125124 URL https://link.aps.
org/doi/10.1103/PhysRevB.100.125124
[190] Zhang C, Bengio S, Hardt M, Recht B and Vinyals O 2016 Understanding deep learning requires
rethinking generalization (Preprint 1611.03530)
[191] Lin H W, Tegmark M and Rolnick D 2017 Journal of Statistical Physics 168 1223–1247
[192] Fan F, Xiong J and Wang G 2020 On interpretability of artificial neural networks (Preprint
2001.02522)
[193] Mehta P and Schwab D J 2014 An exact mapping between the Variational Renormalization Group
and Deep Learning (Preprint 1410.3831) URL http://arxiv.org/abs/1410.3831
[194] Li S H and Wang L 2018 Phys. Rev. Lett. 121(26) 260601 URL https://link.aps.org/doi/
10.1103/PhysRevLett.121.260601
[195] Koch-Janusz M and Ringel Z 2018 Nature Physics 14 578–582
[196] Lenggenhager P M, Gokmen D E, Ringel Z, Huber S D and Koch-Janusz M 2020 Phys. Rev. X
10(1) 011037 URL https://link.aps.org/doi/10.1103/PhysRevX.10.011037
[197] de Mello Koch E, de Mello Koch A, Kastanos N and Cheng L 2020 Short sighted deep learning
(Preprint 2002.02664)
[198] Hu H Y, Li S H, Wang L and You Y Z 2019 Machine learning holographic mapping by neural
network renormalization group (Preprint 1903.00804)
[199] Smolensky P 1986 Information processing in dynamical systems: Foundations of harmony theory
Tech. rep. Colorado Univ at Boulder Dept of Computer Science
[200] Larochelle H and Bengio Y 2008 Classification using discriminative restricted boltzmann machines
Proceedings of the 25th International Conference on Machine Learning ICML ’08 (New York,
NY, USA: Association for Computing Machinery) p 536–543 ISBN 9781605582054 URL
https://doi.org/10.1145/1390156.1390224
[201] Hinton G E and Salakhutdinov R R 2006 Science 313 504–507 ISSN 0036-8075 (Preprint https:
//science.sciencemag.org/content/313/5786/504.full.pdf) URL https://science.
sciencemag.org/content/313/5786/504
[202] Coates A, Ng A and Lee H 2011 An analysis of single-layer networks in unsupervised feature
learning Proceedings of the fourteenth international conference on artificial intelligence and
statistics pp 215–223
[203] Fischer A and Igel C 2014 Pattern Recognition 47 25–39
[204] Hinton G E 2002 Neural Computation 14 1771–1800 ISSN 0899-7667 publisher: MIT Press URL
https://doi.org/10.1162/089976602760128018
[205] Hinton G E 2012 A Practical Guide to Training Restricted Boltzmann Machines Neural Networks:
Tricks of the Trade: Second Edition Lecture Notes in Computer Science ed Montavon G, Orr
G B and Muller K R (Berlin, Heidelberg: Springer) pp 599–619 ISBN 978-3-642-35289-8 URL
https://doi.org/10.1007/978-3-642-35289-8_32
[206] Huang H and Toyoizumi T 2015 Phys. Rev. E 91(5) 050101
[207] Decelle A, Fissore G and Furtlehner C 2018 Journal of Statistical Physics 172 1576–1608
[208] Nguyen H C, Zecchina R and Berg J 2017 Advances in Physics 66 197–261 (Preprint https:
//doi.org/10.1080/00018732.2017.1341604) URL https://doi.org/10.1080/00018732.
2017.1341604
[209] Decelle A, Fissore G and Furtlehner C 2017 EPL (Europhysics Letters) 119 60001 URL
https://doi.org/10.1209/0295-5075/119/60001
[210] Tubiana J, Cocco S and Monasson R 2019 Elife 8 e39397
[211] Tubiana J, Cocco S and Monasson R 2019 Neural Computation 31 1671–1717 pMID: 31260391
(Preprint https://doi.org/10.1162/neco_a_01210) URL https://doi.org/10.1162/
neco_a_01210
[212] Dahl G, Ranzato M, Mohamed A r and Hinton G E 2010 Phone recognition with the mean-
covariance restricted boltzmann machine Advances in neural information processing systems
pp 469–477
[213] Nomura Y, Darmawan A S, Yamaji Y and Imada M 2017 Phys. Rev. B 96(20) 205152 URL
https://link.aps.org/doi/10.1103/PhysRevB.96.205152
[214] Melko R G, Carleo G, Carrasquilla J and Cirac J I 2019 Nature Physics 15 887–892
[215] Suchsland P and Wessel S 2018 Phys. Rev. B 97(17) 174435 URL https://link.aps.org/doi/
10.1103/PhysRevB.97.174435
[216] Baity-Jesi M, Sagun L, Geiger M, Spigler S, Arous G B, Cammarota C, LeCun Y, Wyart M and
Biroli G 2019 Journal of Statistical Mechanics: Theory and Experiment 2019 124013
[217] Geiger M, Spigler S, d’Ascoli S, Sagun L, Baity-Jesi M, Biroli G and Wyart M 2019 Phys. Rev.
E 100(1) 012115 URL https://link.aps.org/doi/10.1103/PhysRevE.100.012115
[218] Allen M P and Tildesley D J 2017 Computer simulation of liquids (Oxford university press)
[219] Lindahl E and Sansom M S 2008 Current Opinion in Structural Biology 18 425 – 431 ISSN 0959-
440X membranes / Engineering and design URL http://www.sciencedirect.com/science/
article/pii/S0959440X08000304
[220] Badia A, Lennox R B and Reven L 2000 Accounts of Chemical Research 33 475–481 pMID:
10913236
[221] Sidky H, Colon Y J, Helfferich J, Sikora B J, Bezik C, Chu W, Giberti F, Guo A Z, Jiang X,
Lequieu J, Li J, Moller J, Quevillon M J, Rahimi M, Ramezani-Dakhel H, Rathee V S, Reid
D R, Sevgen E, Thapar V, Webb M A, Whitmer J K and de Pablo J J 2018 The Journal of
Chemical Physics 148 044104
[222] Sidky H and Whitmer J K 2018 The Journal of Chemical Physics 148 104111
[223] Sultan M M and Pande V S 2018 The Journal of Chemical Physics 149 094106
[224] Marx D and Hutter J 2009 Ab initio molecular dynamics: basic theory and advanced methods
(Cambridge University Press)
[225] Koch W and Holthausen M 2015 A Chemist’s Guide to Density Functional Theory (Wiley) ISBN
9783527802814
[226] Parr R G 1980 Density functional theory of atoms and molecules Horizons of quantum chemistry
(Springer) pp 5–15
[227] Custodio C A, Filletti E R and Franca V V 2019 Scientific reports 9 1–7
[228] Manzhos S 2020 Machine Learning: Science and Technology 1 013002 URL https://doi.org/
10.1088/2632-2153/ab7d30
[229] Berman H, Henrick K, Nakamura H and Markley J L 2006 Nucleic Acids Research 35 D301–D303
ISSN 0305-1048
[230] Hilbert M and Lopez P 2011 Science 332 60–65 ISSN 0036-8075 (Preprint https://science.
sciencemag.org/content/332/6025/60.full.pdf) URL https://science.sciencemag.
org/content/332/6025/60
[231] Blank T B, Brown S D, Calhoun A W and Doren D J 1995 The Journal of Chemical Physics
103 4129–4137
[232] Ozboyaci M, Kokh D B, Corni S and Wade R C 2016 Quarterly Reviews of Biophysics 49 e4
[233] Bartok A P, Payne M C, Kondor R and Csanyi G 2010 Phys. Rev. Lett. 104(13) 136403
[234] Pan S J and Yang Q 2010 IEEE Transactions on Knowledge and Data Engineering 22 1345–1359
[235] Torrey L and Shavlik J 2010 Transfer Learning (IGI Global) pp 242–264 ISBN 9781605667669
URL www.igi-global.com/chapter/transfer-learning/36988
[236] Weiss K, Khoshgoftaar T M and Wang D 2016 Journal of Big Data 3 9 ISSN 2196-1115 URL
https://doi.org/10.1186/s40537-016-0043-6
[237] French R M 1999 Trends in Cognitive Sciences 3 128–135 ISSN 1364-6613 URL http://www.
sciencedirect.com/science/article/pii/S1364661399012942
[238] McCloskey M and Cohen N J 1989 Catastrophic Interference in Connectionist Networks: The
Sequential Learning Problem Psychology of Learning and Motivation vol 24 ed Bower G H
(Academic Press) pp 109–165 URL http://www.sciencedirect.com/science/article/pii/
S0079742108605368
[239] Kumaran D, Hassabis D and McClelland J L 2016 Trends in Cognitive Sciences 20
512–534 ISSN 1364-6613 URL http://www.sciencedirect.com/science/article/pii/
S1364661316300432
[240] McClelland J L, McNaughton B L and O’Reilly R C 1995 Psychological Review 102 419–457
ISSN 0033-295X publisher: American Psychological Association
[241] Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu A A, Milan K, Quan J,
Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D and Hadsell R 2017
Proceedings of the National Academy of Sciences 114 3521–3526 ISSN 0027-8424 publisher:
National Academy of Sciences eprint: https://www.pnas.org/content/114/13/3521.full.pdf
URL https://www.pnas.org/content/114/13/3521
[242] Li Z and Hoiem D 2018 IEEE Transactions on Pattern Analysis and Machine Intelligence 40
2935–2947 ISSN 1939-3539
[243] Morawietz T, Singraber A, Dellago C and Behler J 2016 Proceedings of the National Academy of
Sciences 113 8368–8373 ISSN 0027-8424
[244] Natarajan S K and Behler J 2016 Phys. Chem. Chem. Phys. 18(41) 28704–28725
[245] Bartok A P, Kondor R and Csanyi G 2013 Phys. Rev. B 87(18) 184115
[246] Rupp M, Tkatchenko A, Muller K R and von Lilienfeld O A 2012 Phys. Rev. Lett. 108(5) 058301
[247] Thompson A, Swiler L, Trott C, Foiles S and Tucker G 2015 Journal of Computational Physics
285 316 – 330 ISSN 0021-9991 URL http://www.sciencedirect.com/science/article/
pii/S0021999114008353
[248] Geiger P and Dellago C 2013 The Journal of Chemical Physics 139 164105
[249] Balabin R M and Lomakina E I 2011 Phys. Chem. Chem. Phys. 13(24) 11710–11718
[250] Schutt K T, Glawe H, Brockherde F, Sanna A, Muller K R and Gross E K U 2014 Phys. Rev. B
89(20) 205118 URL https://link.aps.org/doi/10.1103/PhysRevB.89.205118
[251] Faber F, Lindmaa A, von Lilienfeld O A and Armiento R 2015 International Journal of Quantum
Chemistry 115 1094–1101 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/
qua.24917) URL https://onlinelibrary.wiley.com/doi/abs/10.1002/qua.24917
[252] Ceriotti M 2019 The Journal of Chemical Physics 150 150901
[253] Jadrich R B, Lindquist B A and Truskett T M 2018 The Journal of Chemical Physics 149 194109
[254] Jadrich R B, Lindquist B A, Pineros W D, Banerjee D and Truskett T M 2018 The Journal of
Chemical Physics 149 194110
[255] Mak C H 2006 Phys. Rev. E 73(6) 065104 URL https://link.aps.org/doi/10.1103/
PhysRevE.73.065104
[256] Bagchi K, Andersen H C and Swope W 1996 Phys. Rev. Lett. 76(2) 255–258 URL https:
//link.aps.org/doi/10.1103/PhysRevLett.76.255
[257] Boattini E, Dijkstra M and Filion L 2019 The Journal of Chemical Physics 151 154901
[258] Steinhardt P J, Nelson D R and Ronchetti M 1983 Phys. Rev. B 28(2) 784–805 URL https:
//link.aps.org/doi/10.1103/PhysRevB.28.784
[259] Lechner W and Dellago C 2008 The Journal of Chemical Physics 129 114707
[260] Chandrasekhar S 1992 Liquid Crystals Cambridge Monographs on Physics (Cambridge University
Press) ISBN 9780521417471 URL https://books.google.com.mx/books?id=GYDxM1fPArMC
[261] Sluckin T, Dunmur D and Stegemeyer H 2004 Crystals That Flow: Classic Papers from the
History of Liquid Crystals Liquid Crystals Book Series (Taylor & Francis) ISBN 9780415257893
[262] Walters M, Wei Q and Chen J Z Y 2019 Physical Review E 99 062701 publisher: American
Physical Society URL https://link.aps.org/doi/10.1103/PhysRevE.99.062701
[263] Hochreiter S and Schmidhuber J 1997 Neural Computation 9 1735–1780
[264] Greff K, Srivastava R K, Koutnık J, Steunebrink B R and Schmidhuber J 2017 IEEE Transactions
on Neural Networks and Learning Systems 28 2222–2232 ISSN 2162-2388
[265] Terao T 2020 Soft Materials 0 1–13
[266] de Gennes P G 1979 Scaling Concepts in Polymer Physics (Cornell University Press) ISBN
9780801412035 URL https://books.google.com.mx/books?id=ApzfJ2LYwGUC
[267] Menon A, Gupta C, Perkins K M, DeCost B L, Budwal N, Rios R T, Zhang K, Poczos B and
Washburn N R 2017 Molecular Systems Design & Engineering 2 263–273 publisher: Royal
Society of Chemistry URL https://pubs.rsc.org/en/content/articlelanding/2017/me/
c7me00027h
[268] Xu Q and Jiang J 2020 ACS Applied Polymer Materials Publisher: American Chemical Society
URL https://doi.org/10.1021/acsapm.0c00586
[269] Wu S, Kondo Y, Kakimoto M a, Yang B, Yamada H, Kuwajima I, Lambard G, Hongo K, Xu
Y, Shiomi J, Schick C, Morikawa J and Yoshida R 2019 npj Computational Materials 5 1–11
ISSN 2057-3960 number: 1 Publisher: Nature Publishing Group URL https://www.nature.
com/articles/s41524-019-0203-2
[270] Hamley I 2013 Introduction to Soft Matter: Synthetic and Biological Self-Assembling Materials
(Wiley) ISBN 9781118681428 URL https://books.google.com.mx/books?id=Qd794-DI49wC
[271] Jones R A L 2002 Soft Condensed Matter Oxford Master Series in Physics
(OUP Oxford) ISBN 9780198505891 URL https://books.google.com.mx/books?id=Hl_
HBPUvoNsC
[272] Moradzadeh A and Aluru N R 2019 The Journal of Physical Chemistry Letters 10 7568–
7576 publisher: American Chemical Society URL https://doi.org/10.1021/acs.jpclett.
9b02820
[273] Schoenholz S S, Cubuk E D, Sussman D M, Kaxiras E and Liu A J 2016 Nature Physics
12 469–471 ISSN 1745-2481 number: 5 Publisher: Nature Publishing Group URL https:
//www.nature.com/articles/nphys3644
[274] Kob W and Andersen H C 1994 Phys. Rev. Lett. 73(10) 1376–1379 URL https://link.aps.
org/doi/10.1103/PhysRevLett.73.1376
[275] Liu H, Fu Z, Yang K, Xu X and Bauchy M 2019 Journal of Non-Crystalline Solids
119419 ISSN 0022-3093 URL http://www.sciencedirect.com/science/article/pii/
S0022309319302662
[276] Li L, Yang Y, Zhang D, Ye Z G, Jesse S, Kalinin S V and Vasudevan R K 2018 Science Advances
4 eaap8672 ISSN 2375-2548 publisher: American Association for the Advancement of Science
Section: Research Article URL https://advances.sciencemag.org/content/4/3/eaap8672
[277] Minor E N, Howard S D, Green A A S, Glaser M A, Park C S and Clark N A 2020 Soft
Matter 16 1751–1759 ISSN 1744-6848 publisher: The Royal Society of Chemistry URL
https://pubs.rsc.org/en/content/articlelanding/2020/sm/c9sm01979k
[278] Redmon J and Farhadi A 2016 YOLO9000: better, faster, stronger (Preprint 1612.08242)
[279] DeFever R S, Targonski C, Hall S W, Smith M C and Sarupria S 2019 Chem. Sci. 10(32) 7503–
7515
[280] LeCun Y, Bottou L, Bengio Y and Haffner P 1998 Proceedings of the IEEE 86 2278–2324 ISSN
1558-2256
[281] Xiao H, Rasul K and Vollgraf R 2017 Fashion-MNIST: a novel image dataset for benchmarking
machine learning algorithms (Preprint 1708.07747) URL http://arxiv.org/abs/1708.07747
[282] Wu Z, Ramsundar B, N Feinberg E, Gomes J, Geniesse C, S Pappu A, Leswing K and
Pande V 2018 Chemical Science 9 513–530 publisher: Royal Society of Chemistry URL
https://pubs.rsc.org/en/content/articlelanding/2018/sc/c7sc02664a
[283] Ramakrishnan R, Dral P O, Rupp M and von Lilienfeld O A 2014 Scientific Data 1 140022 ISSN
2052-4463 number: 1 Publisher: Nature Publishing Group URL https://www.nature.com/
articles/sdata201422
[284] Verlet L 1967 Physical Review 159 98–103 publisher: American Physical Society URL https:
//link.aps.org/doi/10.1103/PhysRev.159.98
[285] Zenodo - Research. Shared. URL https://zenodo.org/
[286] Allen C and Mehler D M A 2019 PLOS Biology 17 e3000246 ISSN 1545-7885 publisher: Public
Library of Science URL https://journals.plos.org/plosbiology/article?id=10.1371/
journal.pbio.3000246
[287] Cai J, Luo J, Wang S and Yang S 2018 Neurocomputing 300 70–79 ISSN 0925-2312 URL
http://www.sciencedirect.com/science/article/pii/S0925231218302911
[288] Mehta P, Bukov M, Wang C H, Day A G R, Richardson C, Fisher C K and Schwab D J 2019
Physics Reports 810 1–124 ISSN 0370-1573 URL http://www.sciencedirect.com/science/
article/pii/S0370157319300766
[289] Snoek J, Larochelle H and Adams R P 2012 Practical Bayesian Optimiza-
tion of Machine Learning Algorithms Advances in Neural Information Process-
ing Systems 25 ed Pereira F, Burges C J C, Bottou L and Weinberger K Q
(Curran Associates, Inc.) pp 2951–2959 URL http://papers.nips.cc/paper/
4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf
[290] Wan L, Zeiler M, Zhang S, LeCun Y and Fergus R 2013 Regularization of Neural Networks using
DropConnect International Conference on Machine Learning pp 1058–1066 ISSN 1938-7228
URL http://proceedings.mlr.press/v28/wan13.html
[291] Hutson M 2018 Science 359 725–726 ISSN 0036-8075, 1095-9203 publisher: American Association
for the Advancement of Science Section: In Depth URL https://science.sciencemag.org/content/359/6377/725
[292] Srivastava N, Hinton G, Krizhevsky A, Sutskever I and Salakhutdinov R 2014 The Journal of
Machine Learning Research 15 1929–1958 ISSN 1532-4435
[293] Vanschoren J, van Rijn J N, Bischl B and Torgo L 2013 SIGKDD Explorations 15 49–60 URL
http://doi.acm.org/10.1145/2641190.2641198
[294] Alberti M, Pondenkandath V, Wursch M, Ingold R and Liwicki M 2018 DeepDIVA: A
Highly-Functional Python Framework for Reproducible Experiments 2018 16th International
Conference on Frontiers in Handwriting Recognition (ICFHR) pp 423–428
[295] Forde J Z and Paganini M 2019 arXiv:1904.10922 [cs, stat] URL http://arxiv.org/abs/1904.10922
[296] Allen C and Mehler D M A 2019 PLOS Biology 17 1–14 URL https://doi.org/10.1371/
journal.pbio.3000246
[297] Samuel S, Loffler F and Konig-Ries B 2020 arXiv:2006.12117 [cs, stat] URL http://arxiv.org/abs/2006.12117
[298] Plimpton S 1993 Fast parallel algorithms for short-range molecular dynamics Tech. rep. Sandia
National Labs., Albuquerque, NM (United States) URL https://lammps.sandia.gov/index.
html
[299] Weik F, Weeber R, Szuttor K, Breitsprecher K, de Graaf J, Kuron M, Landsgesell J, Menke H,
Sean D and Holm C 2019 The European Physical Journal Special Topics 227 1789–1816
[300] Hashimoto K, Sugishita S, Tanaka A and Tomiya A 2018 Phys. Rev. D 98(4) 046019 URL
https://link.aps.org/doi/10.1103/PhysRevD.98.046019