
Adding Machine Learning within Hamiltonians: Renormalization Group Transformations, Symmetry Breaking and Restoration

Dimitrios Bachtis,1,∗ Gert Aarts,2,† and Biagio Lucini1,3,‡

1Department of Mathematics, Swansea University, Bay Campus, SA1 8EN, Swansea, Wales, UK
2Department of Physics, Swansea University, Singleton Campus, SA2 8PP, Swansea, Wales, UK

3Swansea Academy of Advanced Computing, Swansea University, Bay Campus, SA1 8EN, Swansea, Wales, UK
(Dated: September 30, 2020)

We present a physical interpretation of machine learning functions, opening up the possibility to control properties of statistical systems via the inclusion of these functions in Hamiltonians. In particular, we include the predictive function of a neural network, designed for phase classification, as a conjugate variable coupled to an external field within the Hamiltonian of a system. Results in the two-dimensional Ising model evidence that the field can induce an order-disorder phase transition by breaking or restoring the symmetry, in contrast with the field of the conventional order parameter which can only cause explicit symmetry breaking. The critical behaviour is then studied by proposing reweighting that is agnostic to the original Hamiltonian and forming a renormalization group mapping on quantities derived from the neural network. Accurate estimates of the critical fixed point and the operators that govern the divergence of the correlation length are provided. We conclude by discussing how the method provides an essential step towards bridging machine learning and physics.

I. INTRODUCTION

At the heart of our understanding of phase transitions lies a mathematical apparatus called the renormalization group [1-6]. Central concepts behind its application are those of scale invariance and universality: the former relates to the observation that at criticality the considered phenomena can be described by a scale-invariant theory, and the latter to the notion that systems seemingly unrelated in their microscopic descriptions have a large-scale behaviour that is governed by an identical set of relevant operators. Computational frameworks of the renormalization group [7, 8] have seen resounding success in condensed matter systems [9-12] and lattice field theories [13-15].

Recently, deep learning [16], which pertains to a class of machine learning methods that progressively extract hierarchical structures in data, has impacted certain aspects of computational science. Artificial neural networks, consisting of multiple layers, have been efficiently applied in various research fields, including particle physics and cosmology as well as statistical mechanics. For a recent review see Refs. [17, 18]. Notable examples include the use of neural networks to study phase transitions [19-25] and quantum many-body systems [26], while investigations at the intersection of machine learning and the renormalization group have recently emerged [27-31]. In consequence, an in-depth understanding of the underlying mechanics of machine learning algorithms, as well as a simultaneous advancement of efficient ways to implement them in physical problems, is a crucial step to be undertaken by physicists [32].

[email protected][email protected][email protected]

In this paper a physical interpretation of machine learning is presented. In particular, we consider the predictive function of a neural network, designed for phase classification, as a conjugate variable coupled to an external field, and introduce it as a term in the Hamiltonian of a system. Given this formulation, we propose reweighting that is agnostic to the original Hamiltonian to explore if the external field generates a richer structure than the one associated with the conventional order parameter, and if it can induce a phase transition by breaking or restoring the system's symmetry. The critical behaviour can then be investigated using histogram-reweighted extrapolations from configurations obtained in one phase and without knowledge of the Hamiltonian.

To study the phase transition, we propose a real-space renormalization group transformation that is formulated on machine learning quantities. In particular, a mapping is established between an original and a rescaled system using the neural network function and its field, overcoming the need to rely on observables related to the original Hamiltonian. We then explore, based on minimally-sized lattices, its capability to locate the critical fixed point and to extract the operators of the renormalization group transformation. The entirety of critical exponents can then be obtained using scaling relations and a complete study of the phase transition can be conducted.

We validate our proposal in the two-dimensional Ising model using quantities derived from the machine learning algorithm. By giving a physical interpretation to the function of a neural network as a Hamiltonian term, we explore the effect of the coupled field on the considered system, demonstrate that it can break or restore the reflection symmetry by inducing a phase transition, and extract with high accuracy the location of the critical inverse temperature and the operators of the renormalization group transformation that govern the divergence of the correlation length.



FIG. 1. The architecture of the fully-connected neural network (see App. B). A renormalization group mapping is formulated between an original and a rescaled system based on the predictive function of the neural network.

II. NEURAL NETWORKS AS HAMILTONIAN TERMS

We consider a statistical system, such as the Ising model (see App. A), which is described by a Hamiltonian E. The equilibrium occupation probabilities of the system are of Boltzmann form and are given by:

\[ p_\sigma = \frac{\exp[-\beta E_\sigma]}{\sum_\sigma \exp[-\beta E_\sigma]}, \qquad (1) \]

where β is the inverse temperature, σ a state of the system and Z = Σ_σ exp[−βE_σ] the partition function. When the system is in equilibrium, the expectation value of an arbitrary observable O is:

\[ \langle O \rangle = \frac{\sum_\sigma O_\sigma \exp[-\beta E_\sigma]}{\sum_\sigma \exp[-\beta E_\sigma]}. \qquad (2) \]

After a neural network is trained on a system for phase classification (see App. B), the learned neural network function f(·) can be applied to a configuration σ, converting f_σ into a statistical mechanical observable with an associated Boltzmann weight [35]. In addition, we consider f_σ as equivalent to the conditional probability f_σ ≡ P^b_σ that a configuration belongs to the broken-symmetry phase. Consequently, f_σ is an intensive quantity bounded in [0, 1] and, since it has no dependence on the size of the system, we can multiply it by the volume V and recast V f_σ as an extensive property.

We are now able to investigate the extensive neural network function V f by introducing it in the Hamiltonian of the system. In particular, we consider V f as a conjugate variable that couples to an external field Y and define a modified Hamiltonian:

\[ E_Y = E - V f Y. \qquad (3) \]

The expectation value of the neural network function can then be expressed as a derivative of the modified partition function Z_Y with respect to the field:

\[ \langle f \rangle = \frac{1}{\beta V} \frac{\partial \log Z_Y}{\partial Y} = \frac{\sum_\sigma f_\sigma \exp[-\beta E_\sigma + \beta V f_\sigma Y]}{\sum_\sigma \exp[-\beta E_\sigma + \beta V f_\sigma Y]}. \qquad (4) \]

Setting the neural network field Y to zero recovers the standard definition of Eq. (2). Differentiating Eq. (4) with respect to the field gives:

\[ \chi_f = \frac{\partial \langle f \rangle}{\partial Y} = \beta V \left( \langle f^2 \rangle - \langle f \rangle^2 \right). \qquad (5) \]

The quantity χ_f is recognized as a susceptibility. It is a measure of the response of the predictive function f to changes in the neural network field Y. Consequently, the opportunity to study the effect of a non-zero field Y ≠ 0 in the statistical system is now available. One way to achieve this is to conduct Monte Carlo sampling using the modified Hamiltonian of Eq. (3) to obtain configurations of this modified system. However, an alternative option that overcomes the need for sampling is the use of histogram reweighting [33, 34], where machine learning derived observables can also be reweighted in parameter space [35].
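
As an illustration, the estimator of Eq. (5) can be evaluated directly from per-configuration network outputs. The following minimal Python sketch assumes an array f_samples of measured values f_σ (the names are illustrative, not from the paper):

import numpy as np

def susceptibility_f(f_samples, beta, volume):
    # Eq. (5): chi_f = beta * V * (<f^2> - <f>^2), the response of the
    # predictive function f to changes in the neural network field Y.
    f = np.asarray(f_samples)
    return beta * volume * (np.mean(f**2) - np.mean(f)**2)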

III. SYMMETRY BREAKING AND RESTORATION

Consider a set of N obtained configurations σ_i from a system whose explicit form of the Hamiltonian E is not known. These configurations have been drawn from an equilibrium distribution, described by Eq. (1), and can be utilized with reweighting to predict the behaviour of the modified system when the neural network field Y is set to non-zero values.

To achieve this we define the expectation value of an arbitrary observable O, estimated during a Markov chain Monte Carlo simulation, in the modified system that we aim to sample:

\[ \langle O \rangle = \frac{\sum_{i=1}^{N} O_{\sigma_i} p_{\sigma_i}^{-1} \exp[-\beta E_{\sigma_i} + \beta V f_{\sigma_i} Y]}{\sum_{i=1}^{N} p_{\sigma_i}^{-1} \exp[-\beta E_{\sigma_i} + \beta V f_{\sigma_i} Y]}, \qquad (6) \]

where p are the sampling probabilities of the equilibrium distribution. The probabilities p of the original system, defined in Eq. (1), can be substituted for p_{σ_i} to obtain:

\[ \langle O \rangle = \frac{\sum_{i=1}^{N} O_{\sigma_i} \exp[\beta V f_{\sigma_i} Y]}{\sum_{i=1}^{N} \exp[\beta V f_{\sigma_i} Y]}. \qquad (7) \]


FIG. 2. Mean neural network function ⟨f⟩ versus external field Y at inverse temperatures β = 0.43, 0.440687, 0.45 (right to left). The statistical uncertainty is comparable with the width of the lines.

Using Eq. (7) one can calculate the expectation value of an observable for a modified system with a non-zero neural network field Y ≠ 0 by using configurations of the original system sampled at inverse temperature β and with zero field Y = 0. Before investigating the effect of Y ≠ 0 on machine learning derived observables, we recall that the function f has emerged by training the neural network on obtained configurations, where no knowledge about the explicit form of the Hamiltonian is introduced. As a result, both Eq. (7) and the function f have no immediate dependence on the Hamiltonian, and the opportunity to conduct reweighting that is Hamiltonian-agnostic is present.
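
In practice, Eq. (7) is a weighted average over the stored configurations. A minimal Python sketch, assuming arrays obs and f holding the per-configuration measurements O_{σ_i} and f_{σ_i} (illustrative names, not from the paper):

import numpy as np

def reweight_to_field(obs, f, beta, volume, Y):
    # Hamiltonian-agnostic reweighting of Eq. (7): extrapolate <O> to a
    # non-zero neural network field Y from configurations sampled at Y = 0.
    logw = beta * volume * np.asarray(f) * Y
    w = np.exp(logw - logw.max())  # shift by the maximum for numerical stability
    return np.sum(np.asarray(obs) * w) / np.sum(w)

Since only the network outputs f_{σ_i} enter the weights, no evaluation of the Hamiltonian is required.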

For the case of the neural network function f, results obtained using Eq. (7) can be seen in Fig. 2. A two-dimensional Ising model of size L = 64 in each dimension is simulated at inverse temperatures β = 0.43 in the symmetric phase, β = 0.440687 at the known inverse critical temperature, and β = 0.45 in the broken-symmetry phase. We observe that, regardless of the phase that the system is in, positive and negative values of the external field Y drive the system towards the broken-symmetry or the symmetric phase, respectively. To gain further insights, we recall that the neural network function is correlated with the probability P^s that a configuration is associated with the symmetric phase through f ≡ P^b = 1 − P^s. Consequently, the associated field which is coupled to f is anticipated to have the observed behaviour.

To measure the response of the predictive function f to changes in the field we calculate the susceptibility χ_f, which is depicted in Fig. 3. We note that χ_f has maximum values, evidencing the crossing of a phase transition. The results indicate that the field can induce an order-disorder phase transition in the Ising model by breaking or restoring the symmetry of the system. This is in contrast with the external field associated with the conventional order parameter (the magnetization) that, irrespective of its sign, induces an explicit breaking of the symmetry. Based on universality, the emerging phase transition is assumed to be governed by the same relevant operators, and the critical behaviour of the neural network field can now be studied with the renormalization group.

FIG. 3. Mean susceptibility of the neural network function ⟨χ_f⟩ versus external field Y at inverse temperatures β = 0.43, 0.440687, 0.45 (right to left). The statistical uncertainty is comparable with the width of the lines.

IV. RENORMALIZATION GROUP

A. Locating the Critical Fixed Point

We consider a configuration of an Ising model that has been obtained at a specific inverse temperature β and has an emergent correlation length ξ. By applying the blocking procedure with the majority rule and a rescaling factor of b = 2 (see App. C), the reduction in the lattice size, L′ = L/b, will also induce an analogous reduction in the correlation length:

\[ \xi' = \frac{\xi}{b}. \qquad (8) \]

The correlation length ξ is a quantity that emerges dynamically when the system is approaching the critical point β ≈ β_c and is therefore dependent on the value of the inverse temperature, ξ(β). The rescaled system has a reduced correlation length ξ′ and is, consequently, representative of an Ising model at a different inverse temperature, with ξ′(β′). At the critical fixed point β′ = β = β_c the correlation length in the thermodynamic limit ξ(β_c, L = ∞) diverges, and intensive quantities of the original and the rescaled system become equal.

This opens up the opportunity to use an observable derived from a machine learning algorithm to locate the critical fixed point. In particular, we consider, at β = β_c, the neural network function f, which has been expressed as a statistical mechanical observable and is therefore dependent on the inverse temperature:

\[ f(\beta_c) = f'(\beta_c). \qquad (9) \]

FIG. 4. Mean neural network function ⟨f⟩ versus inverse temperature β for an original and a rescaled system of size L = L′ = 32.

In Fig. 4, the predictive function f has been drawn for an original and a rescaled system. We recall that, under the assumption that configurations of the rescaled system appear with the Boltzmann probabilities of the original Ising Hamiltonian, observables of the rescaled system can be reweighted as observables of the original [36]. We note that the two lines cross at β^f_c ≈ 0.44055, yielding a first estimate of the location of the critical inverse temperature. The results are obtained using reweighting on a Monte Carlo dataset which has been simulated near the known inverse critical temperature β_c ≈ 0.440687. When the inverse critical temperature is not known, it can be estimated by iterating the same procedure on points of intersection until convergence [36]. Given this knowledge, the operators and the critical fixed point of the renormalization group transformation can then be calculated in a quantitative manner.

B. Extracting Operators of the Renormalization Group

The original and the rescaled systems are located at inverse temperatures β and β′, and their neural network functions are, therefore, related through:

\[ f(\beta') = f'(\beta). \qquad (10) \]

For the case of the inverse critical temperature, Eq. (10) reduces to Eq. (9). We are now able to form, based on Eq. (10), a renormalization group mapping that associates the two inverse temperatures:

\[ \beta' = f^{-1}(f'(\beta)). \qquad (11) \]

The correlation length then diverges in the thermodynamic limit according to the relations ξ ∼ |t|^{−ν} and ξ′ ∼ |t′|^{−ν} for an original and a rescaled system, respectively, where t = (β_c − β)/β_c is the reduced inverse temperature. Dividing the two relations, cf. Eq. (8), we obtain:

\[ \left( \frac{t}{t'} \right)^{-\nu} = b. \qquad (12) \]

The renormalization group mapping is then linearized through a Taylor expansion to leading order in the proximity of the fixed point [37], to obtain:

\[ \beta_c - \beta' = (\beta_c - \beta) \left. \frac{d\beta'}{d\beta} \right|_{\beta_c}. \qquad (13) \]

FIG. 5. Rescaled inverse temperature β′ versus inverse temperature β. The intersection with g(x) = x gives the critical fixed point β = β′ = β_c. The dashed lines indicate the statistical uncertainty.

By substituting into Eq. (12) we are able to calculate the correlation length exponent:

\[ \nu = \frac{\log b}{\log \left. \frac{d\beta'}{d\beta} \right|_{\beta_c}}. \qquad (14) \]

In Fig. 5, results based on Eq. (11) are depicted using Hamiltonian-dependent reweighting on the inverse temperature [35]. We obtain an estimate of the critical fixed point β_c = 0.44063(21) and the correlation length exponent ν = 1.01(2).
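
To make the construction of Eqs. (11)-(14) explicit, the sketch below inverts a tabulated monotone curve ⟨f⟩(β) by interpolation, locates the fixed point as the intersection of β′(β) with g(x) = x (cf. Fig. 5), and extracts ν from the slope at the fixed point. The logistic curves are synthetic placeholders standing in for the reweighted neural network outputs of Fig. 4, so the construction runs end to end; in an actual analysis they come from the trained network and reweighting.

import numpy as np

b = 2.0                                   # rescaling factor of the blocking step
betas = np.linspace(0.43, 0.45, 2001)
beta_c_exact = 0.440687
# Synthetic stand-ins for <f>(beta) of the original system and <f'>(beta) of
# the rescaled system; the rescaled curve is steeper, as in Fig. 4.
f_orig = 1.0 / (1.0 + np.exp(-800.0 * (betas - beta_c_exact)))
f_resc = 1.0 / (1.0 + np.exp(-1600.0 * (betas - beta_c_exact)))

# Eq. (11): beta' = f^{-1}(f'(beta)), inverting the monotone curve <f>(beta).
beta_prime = np.interp(f_resc, f_orig, betas)

# Restrict to the region where the inversion is not clamped at the grid edges,
# then locate the fixed point as the intersection with the identity (Fig. 5).
valid = np.where((f_resc > f_orig[0]) & (f_resc < f_orig[-1]))[0]
ic = valid[np.argmin(np.abs(beta_prime[valid] - betas[valid]))]

# Eq. (14): nu = log b / log(d beta'/d beta |_{beta_c}), via finite differences.
slope = np.gradient(beta_prime, betas)[ic]
nu = np.log(b) / np.log(slope)
print(f"beta_c ~ {betas[ic]:.6f}, nu ~ {nu:.3f}")  # recovers nu = 1 for these curves

An analogous mapping Y′ = f⁻¹(f′(Y)) at β = β_c, built from the Hamiltonian-agnostic weights of Eq. (7), yields θ_Y through Eq. (16).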

Since the neural network field Y induces a phase transition in the system, it is bound to affect the correlation length. Another critical exponent, θ_Y, can then be defined when the field converges to zero at the critical fixed point:

\[ \xi \sim |Y|^{-\theta_Y}. \qquad (15) \]

Following an analogous derivation, and formulating a mapping Y′ = f^{-1}(f′(Y)) for the field, the corresponding critical exponent is then calculated through:

\[ \theta_Y = \frac{\log b}{\log \left. \frac{dY'}{dY} \right|_{Y=0}}. \qquad (16) \]

Fig. 6 shows the results for the case of the neural network field, where we obtain the value of the critical exponent θ_Y = 0.534(3), using Hamiltonian-agnostic reweighting based on Eq. (7).

FIG. 6. Rescaled field Y′ versus original field Y at β = 0.440687. The dashed lines indicate the statistical uncertainty.

The phase transition of the Ising model is fully described by the two relevant operators ν and θ (see App. A). The exponent θ governs the divergence of the correlation length in terms of the external field h that is coupled to the conventional order parameter. We note that the predictive function f is reminiscent of an effective order parameter (see Fig. 2). We find that the numerical value of the exponent θ_Y agrees within statistical errors with θ = 8/15. We hence conclude that Y couples to the same relevant operator as the external magnetic field. The results are summarized in Table I and the remaining critical exponents can be calculated through scaling relations (see App. A).

We emphasize that the operators and the critical fixed point have been calculated using observables derived from the neural network implementation and their reweighted extrapolations, where no explicit information about the symmetries of the Hamiltonian was introduced.

V. CONCLUSIONS

Machine learning can become physically interpretable by being introduced as a term in the Hamiltonian of a statistical system, allowing an efficient study of a phase transition based on renormalization group arguments. Even with minimally-sized lattices, a highly accurate calculation of the critical fixed point and the operators of the two-dimensional Ising model is conducted. The inclusion of a neural network field within the Hamiltonian enables a certain control over the considered system, revealing a structure that is richer than the one associated with the field of the conventional order parameter. In particular, the neural network field can drive a system towards both the ordered and the disordered phase, giving rise to an order-disorder phase transition, in contrast with the external magnetic field that only causes explicit symmetry breaking.

Numerous research directions can be anticipated. Any machine learning function can, in principle, be instilled within Hamiltonians to drive a system to desired behaviours, making the method generally applicable. For instance, a function from a simple model might be used to study a complicated one [38]. The effect of a neural network field in systems with an absence of an order parameter [19] is now also readily available to explore, a prospect that is further enhanced through the use of Hamiltonian-agnostic reweighting. Most importantly, by using Monte Carlo simulations to sample configurations of modified systems that include machine learning functions, an in-depth understanding of the underlying mechanics can be obtained.

TABLE I. Estimates for the critical exponents ν, θ_Y and the critical fixed point β_c of the two-dimensional Ising model.

          β_c            ν        θ_Y, θ
RG+NN     0.44063(21)    1.01(2)  θ_Y = 0.534(3)
Exact     ln(1 + √2)/2   1        θ = 8/15

In conclusion, by including machine learning as a term in the Hamiltonian, an essential step towards bridging machine learning and physics is established, one that could potentially alter our understanding of machine learning algorithms and their effects on systems.

VI. ACKNOWLEDGEMENTS

The authors received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 813942. The work of GA and BL has been supported in part by the UKRI Science and Technology Facilities Council (STFC) Consolidated Grant ST/P00055X/1. The work of BL is further supported in part by the Royal Society Wolfson Research Merit Award WM170010 and by the Leverhulme Foundation Research Fellowship RF-2020-461\9. Numerical simulations have been performed on the Swansea SUNBIRD system. This system is part of the Supercomputing Wales project, which is part-funded by the European Regional Development Fund (ERDF) via Welsh Government. We thank COST Action CA15213 THOR for support.

Appendix A: The Ising Model

We consider the two-dimensional Ising model on a square lattice, which is described by the Hamiltonian:

\[ E = -J \sum_{\langle ij \rangle} s_i s_j - h \sum_i s_i, \qquad (A1) \]

where ⟨ij⟩ denotes a sum over nearest-neighbour interactions, J is the coupling constant, which is set to one, and h is the external magnetic field, which is set to zero. The system undergoes a second-order phase transition at the critical inverse temperature β_c:

\[ \beta_c = \frac{1}{2} \ln(1 + \sqrt{2}) \approx 0.440687. \qquad (A2) \]

The order parameter of the Ising model is the magnetization. We often consider the absolute magnetization, normalized by the volume V = L × L of the system:

\[ m = \frac{1}{V} \left| \sum_i s_i \right|. \qquad (A3) \]
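
For concreteness, Eqs. (A1) and (A3) can be evaluated on a spin configuration as in the following minimal Python sketch (not from the paper), which assumes periodic boundaries and counts each nearest-neighbour bond once via right and down shifts:

import numpy as np

def ising_energy(spins, J=1.0, h=0.0):
    # Eq. (A1) on a square lattice with periodic boundary conditions.
    bonds = spins * np.roll(spins, 1, axis=0) + spins * np.roll(spins, 1, axis=1)
    return -J * bonds.sum() - h * spins.sum()

def magnetization(spins):
    # Eq. (A3): absolute magnetization normalized by the volume.
    return np.abs(spins.sum()) / spins.size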

The fluctuations of the magnetization are equivalent to the magnetic susceptibility, defined as:

\[ \chi = \beta V \left( \langle m^2 \rangle - \langle m \rangle^2 \right). \qquad (A4) \]

To measure the distance from the critical point, a dimensionless reduced inverse temperature is defined:

\[ t = \frac{\beta_c - \beta}{\beta_c}. \qquad (A5) \]

As the system approaches the critical temperature, β ≈ β_c, finite-size effects dominate and fluctuations, such as the magnetic susceptibility χ, have maximum values which act as phase transition indicators.

The phase transition of the Ising model is described in completeness by two relevant operators that govern the divergence of the correlation length. One is the critical exponent ν, which is defined via:

\[ \xi \sim |t|^{-\nu}, \qquad (A6) \]

and the second one is the critical exponent θ, which is given through:

\[ \xi \sim |h|^{-\theta}, \qquad (A7) \]

where h is the external magnetic field. Eq. (A7) is valid when β = β_c and h → 0.

Given the two relevant operators ν and θ, the critical exponents that govern the divergence of the specific heat (α), magnetization (β^{(m)}) and magnetic susceptibility (γ) can then be calculated through scaling relations:

\[ \alpha = 2 - \nu d, \qquad (A8) \]

\[ \beta^{(m)} = \nu \left[ d - \frac{1}{\theta} \right], \qquad (A9) \]

\[ \gamma = \nu \left[ \frac{2}{\theta} - d \right], \qquad (A10) \]

\[ \delta = \frac{1}{d\theta - 1}, \qquad (A11) \]

where d = 2 is the dimensionality of the system.
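
As a check, substituting the exact two-dimensional values ν = 1, θ = 8/15 and d = 2 into Eqs. (A8)-(A11) gives α = 0, β^{(m)} = 1/8, γ = 7/4 and δ = 15, reproducing the known Ising exponents.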

Appendix B: Neural Network Architecture and Simulation Details

The neural network architecture is comprised of a fully-connected layer (FC1) with a rectified linear unit (ReLU) non-linear function, defined as k(x) = max(0, x). The result is then passed to a second fully-connected layer (FC2) with 32 units and a ReLU function, and is subsequently forwarded to a third fully-connected layer (FC3) with 2 units and a softmax function. The training is conducted based on the Adam algorithm with a learning rate of 10^{-4} and a batch size of 8. The architecture is implemented with TensorFlow and the Keras library.
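
A minimal Keras sketch consistent with this description is given below; the lattice size of the flattened input and the width of FC1 are not stated in the text, so the values used here are placeholder assumptions:

import tensorflow as tf

L = 32  # placeholder lattice size; the input is the flattened configuration
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(L * L,)),  # FC1 (placeholder width)
    tf.keras.layers.Dense(32, activation="relu"),                        # FC2
    tf.keras.layers.Dense(2, activation="softmax"),                      # FC3: (P^s, P^b)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # Adam, learning rate 10^-4
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# Training would then proceed with batch_size=8 on labelled configurations, e.g.
# model.fit(train_x, train_y, batch_size=8, validation_data=(val_x, val_y)).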

Configurations are sampled with Markov chain Monte Carlo simulations using the Wolff algorithm [39], and are chosen to be minimally correlated. The data set is comprised of 1000 configurations for each inverse temperature, of which 100 have been chosen to create a validation set. Specifically, the training ranges chosen to sample configurations are β = 0.27, . . . , 0.36 in the symmetric phase and β = 0.52, . . . , 0.61 in the broken-symmetry phase, with a step size of 0.01. The architecture has been optimized on the Ising model based on the values of the training and the validation loss.

Appendix C: The Blocking Procedure

A common choice of a transformation for the Ising model is the blocking procedure with a rescaling factor of b = 2 and the majority rule. To apply the blocking procedure, the lattice structure is initially separated into blocks of size b × b. Within each block of the original system, degrees of freedom that have distinct values are counted and a majority rule defines the rescaled degree of freedom. When the counted degrees are equal, the choice is made arbitrarily. For an illustration of all possible outcomes see Fig. 7. The application of a blocking procedure preserves the large-scale information of a configuration and we assume that it results in a rescaled system of size L′ = L/b that is a valid representation of an Ising model [36].
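
A minimal Python sketch of this transformation (illustrative, not from the paper) sums the spins within each b × b block, applies the majority rule via the sign, and breaks ties at random:

import numpy as np

def block_majority(spins, rng, b=2):
    # Majority-rule blocking of an L x L configuration of +/-1 spins with
    # rescaling factor b: sum the spins in each b x b block and take the sign.
    L = spins.shape[0]
    blocks = spins.reshape(L // b, b, L // b, b).sum(axis=(1, 3))
    rescaled = np.sign(blocks)
    ties = rescaled == 0                                    # equal counts in a block
    rescaled[ties] = rng.choice([-1, 1], size=ties.sum())   # arbitrary choice
    return rescaled

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(64, 64))
print(block_majority(spins, rng).shape)  # (32, 32): L' = L / b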

Appendix D: Binning Error Analysis

The error analysis is conducted with the binning method to address statistical errors associated with the finite Monte Carlo data sets. In particular, each Monte Carlo data set, comprised of 10000 minimally correlated configurations, is separated into n = 10 groups. Results are calculated using the data in each group. The standard deviation is then given by:

\[ \sigma_x = \sqrt{\frac{1}{n-1} \left( \overline{x^2} - \bar{x}^2 \right)}. \qquad (D1) \]
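
A minimal Python sketch of Eq. (D1), where the group means are formed from the full time series (illustrative names, not from the paper):

import numpy as np

def binning_error(samples, n=10):
    # Eq. (D1): split the series into n groups, average within each group,
    # and take the standard deviation of the group means.
    x = np.array([g.mean() for g in np.array_split(np.asarray(samples), n)])
    return np.sqrt((np.mean(x**2) - np.mean(x)**2) / (n - 1))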

FIG. 7. An illustration of the blocking procedure with a rescaling factor of b = 2 and the majority rule. For the two cases at the bottom, the choice of the rescaled degree of freedom was made arbitrarily.


[1] K. G. Wilson, Renormalization group and critical phenomena. I. Renormalization group and the Kadanoff scaling picture, Phys. Rev. B 4, 3174 (1971).
[2] K. G. Wilson, Renormalization group and critical phenomena. II. Phase-space cell analysis of critical behavior, Phys. Rev. B 4, 3184 (1971).
[3] L. P. Kadanoff, Scaling laws for Ising models near Tc, Physics Physique Fizika 2, 263 (1966).
[4] K. G. Wilson, The renormalization group: Critical phenomena and the Kondo problem, Rev. Mod. Phys. 47, 773 (1975).
[5] K. G. Wilson and M. E. Fisher, Critical exponents in 3.99 dimensions, Phys. Rev. Lett. 28, 240 (1972).
[6] K. G. Wilson and J. Kogut, The renormalization group and the ε expansion, Physics Reports 12, 75 (1974).
[7] R. H. Swendsen, Monte Carlo renormalization group, Phys. Rev. Lett. 42, 859 (1979).
[8] S.-k. Ma, Renormalization group by Monte Carlo methods, Phys. Rev. Lett. 37, 461 (1976).
[9] R. H. Swendsen, Monte Carlo calculation of renormalized coupling parameters. I. d = 2 Ising model, Phys. Rev. B 30, 3866 (1984).
[10] D. Ron, R. H. Swendsen, and A. Brandt, Inverse Monte Carlo renormalization group transformations for critical phenomena, Phys. Rev. Lett. 89, 275701 (2002).
[11] H. W. J. Blöte, J. R. Heringa, A. Hoogland, E. W. Meyer, and T. S. Smit, Monte Carlo renormalization of the 3D Ising model: Analyticity and convergence, Phys. Rev. Lett. 76, 2613 (1996).
[12] D. Ron, A. Brandt, and R. H. Swendsen, Surprising convergence of the Monte Carlo renormalization group for the three-dimensional Ising model, Phys. Rev. E 95, 053305 (2017).
[13] K. Akemi, M. Fujisaki, M. Okuda, Y. Tago, P. de Forcrand, T. Hashimoto, S. Hioki, O. Miyamura, T. Takaishi, A. Nakamura, and I. O. Stamatescu, Scaling study of pure gauge lattice QCD by Monte Carlo renormalization group method, Phys. Rev. Lett. 71, 3063 (1993).
[14] A. Hasenfratz, Investigating the critical properties of beyond-QCD theories using Monte Carlo renormalization group matching, Phys. Rev. D 80, 034505 (2009).
[15] A. Hasenfratz, Conformal or walking? Monte Carlo renormalization group studies of SU(3) gauge models with fundamental fermions, Phys. Rev. D 82, 014506 (2010).
[16] I. J. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, Cambridge, MA, USA, 2016), http://www.deeplearningbook.org.
[17] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Machine learning and the physical sciences, Rev. Mod. Phys. 91, 045002 (2019).
[18] J. Carrasquilla, Machine learning for quantum matter, Advances in Physics: X 5, 1797528 (2020).
[19] J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nature Physics 13, 431 (2017).
[20] E. L. van Nieuwenburg, Y.-H. Liu, and S. Huber, Learning phase transitions by confusion, Nature Physics 13, 435 (2017).
[21] J. Venderley, V. Khemani, and E.-A. Kim, Machine learning out-of-equilibrium phases of matter, Phys. Rev. Lett. 120, 257204 (2018).
[22] J. F. Rodriguez-Nieva and M. S. Scheurer, Identifying topological order through unsupervised machine learning, Nature Physics 15, 790 (2019).
[23] C. Giannetti, B. Lucini, and D. Vadacchino, Machine learning as a universal tool for quantitative investigations of phase transitions, Nuclear Physics B 944, 114639 (2019).
[24] M. N. Chernodub, H. Erbin, V. A. Goy, and A. V. Molochkov, Topological defects and confinement with machine learning: The case of monopoles in compact electrodynamics, Phys. Rev. D 102, 054501 (2020).
[25] D. L. Boyda, M. N. Chernodub, N. V. Gerasimeniuk, V. A. Goy, S. D. Liubimov, and A. V. Molochkov, Machine-learning physics from unphysics: Finding deconfinement temperature in lattice Yang-Mills theories from outside the scaling window (2020), arXiv:2009.10971 [hep-lat].
[26] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
[27] S.-H. Li and L. Wang, Neural network renormalization group, Phys. Rev. Lett. 121, 260601 (2018).
[28] P. Mehta and D. J. Schwab, An exact mapping between the variational renormalization group and deep learning (2014), arXiv:1410.3831 [stat.ML].
[29] M. Koch-Janusz and Z. Ringel, Mutual information, neural networks and the renormalization group, Nature Physics 14, 578 (2018).
[30] S. Iso, S. Shiba, and S. Yokoo, Scale-invariant feature extraction of neural network and renormalization group flow, Phys. Rev. E 97, 053304 (2018).
[31] C. Bény, Deep learning and the renormalization group (2013), arXiv:1301.3124 [quant-ph].
[32] L. Zdeborová, Understanding deep learning is also a job for physicists, Nature Physics 16, 602 (2020).
[33] A. M. Ferrenberg and R. H. Swendsen, New Monte Carlo technique for studying phase transitions, Phys. Rev. Lett. 61, 2635 (1988).
[34] A. M. Ferrenberg and R. H. Swendsen, Optimized Monte Carlo data analysis, Phys. Rev. Lett. 63, 1195 (1989).
[35] D. Bachtis, G. Aarts, and B. Lucini, Extending machine learning classification capabilities with histogram reweighting, Phys. Rev. E 102, 033303 (2020).
[36] M. E. J. Newman and G. T. Barkema, Monte Carlo Methods in Statistical Physics (Clarendon Press, Oxford, 1999).
[37] T. Niemeijer and J. M. J. van Leeuwen, in Phase Transitions and Critical Phenomena, edited by C. Domb and M. S. Green (Academic Press, 1976).
[38] D. Bachtis, G. Aarts, and B. Lucini, A mapping of distinct phase transitions to a neural network (2020), arXiv:2007.00355 [cond-mat.stat-mech].
[39] U. Wolff, Collective Monte Carlo updating for spin systems, Phys. Rev. Lett. 62, 361 (1989).