
Stochastic self-assembly of incommensurate clusters · THE JOURNAL OF CHEMICAL PHYSICS 136, 084110 (2012) Stochastic self-assembly of incommensurate clusters M. R. D’Orsogna,1 G


  • THE JOURNAL OF CHEMICAL PHYSICS 136, 084110 (2012)

    Stochastic self-assembly of incommensurate clusters

    M. R. D’Orsogna,1 G. Lakatos,2 and T. Chou3,a)

    1 Department of Mathematics, CSUN, Los Angeles, California 91330-8313, USA
    2 Department of Chemistry, The University of British Columbia, Vancouver, BC V6T-1Z1, Canada
    3 Departments of Biomathematics and Mathematics, UCLA, Los Angeles, California 90095-1766, USA

    (Received 11 January 2012; accepted 7 February 2012; published online 29 February 2012)

    Nucleation and molecular aggregation are important processes in numerous physical and biological systems. In many applications, these processes often take place in confined spaces, involving a finite number of particles. Analogous to treatments of stochastic chemical reactions, we examine the classic problem of homogeneous nucleation and self-assembly by deriving and analyzing a fully discrete stochastic master equation. We enumerate the highest probability steady states, and derive exact analytical formulae for quenched and equilibrium mean cluster size distributions. Upon comparison with results obtained from the associated mass-action Becker-Döring equations, we find striking differences between the two corresponding equilibrium mean cluster concentrations. These differences depend primarily on the divisibility of the total available mass by the maximum allowed cluster size, and the remainder. When such mass “incommensurability” arises, a single remainder particle can “emulsify” the system by significantly broadening the equilibrium mean cluster size distribution. This discreteness-induced broadening effect is periodic in the total mass of the system but arises even when the system size is asymptotically large, provided the ratio of the total mass to the maximum cluster size is finite. Ironically, classic mass-action equations are fairly accurate in the coarsening regime, before equilibrium is reached, despite the presence of large stochastic fluctuations found via kinetic Monte-Carlo simulations. Our findings define a new scaling regime in which results from classic mass-action theories are qualitatively inaccurate, even in the limit of large total system size. © 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3688231]

    I. INTRODUCTION

    Self-assembly arises in countless physical and biological processes, over many length and time scales [1]. Atoms and molecules can nucleate to form small multiphase structures that can influence overall bulk material properties. For example, adatoms adsorbed on growing surfaces aggregate to form islands whose shapes and characteristics control epitaxial synthesis of thin films and layered materials. In more recent years, advances in nanotechnology have opened the possibility of controlling the self-assembly of specifically designed mesoscopic [2] and macroscopic [3] parts into functional components. Because of the importance of nucleation and self-assembly in material science, these processes have been extensively studied [4–6].

    Nucleation and self-assembly are also ubiquitous in cellular biology. The polymerization of actin filaments [7–11] and amyloid fibrils [12], the assembly of virus capsids [13–17] and of antimicrobial peptides into transmembrane pores [18, 19], the recruitment of transcription factors, and the self-assembly of clathrin-coated pits [20–22] are all important cell-level processes that can be cast as initial binding and self-assembly problems for which there is great interest in developing theoretical tools.

    Because it is such a fundamental process across so many disciplines, there is a vast, longstanding literature on nucleation and self-assembly, from both the theoretical and experimental perspectives [23]. Theoretical models were initially developed using mass-action kinetics, as exemplified by the Becker-Döring (BD) equations describing the evolution of the mean concentrations of clusters of a given size [24]. These well-studied mass-action equations implicitly employ the mean-field assumption by neglecting correlations. Nevertheless, solutions to the BD equations exhibit rich behavior, including metastable particle distributions [25], multiple time scales [26], and nontrivial convergence to equilibrium and coarsening [25, 27, 28]. Becker-Döring-type models implicitly assume infinite system sizes and/or the possibility of infinitely large cluster sizes, without addressing the importance of how these limits are taken. To date, all the biological and physical applications described above have been modeled almost exclusively using BD-type equations. As a result, the discreteness and cluster “stoichiometry” in these problems have not been explored.

    a) Author to whom correspondence should be addressed. Electronic mail: [email protected]

    In this paper, we carefully investigate a simple homogeneous nucleation and growth process starting from the fundamental, multi-dimensional, fully stochastic master equation. In particular, we consider the probability of the system to be in a state with specific numbers of clusters of each size. The fully stochastic master equation governing the evolution of the state probabilities is derived, simulated, and solved analytically in different steady-state limits. Upon comparing the mean cluster concentrations found from the stochastic master equation with those calculated from numerical integration of the mean-field BD equations, we find qualitative differences, even in the large system size limit. Our results especially highlight the importance of system size in self-assembly, and how finite system sizes lead to cluster concentrations dramatically different from those obtained by solving classical, mean-field BD equations.

    FIG. 1. Homogeneous nucleation and growth in a closed unit volume initiated with $M = 30$ monomers. If the constant monomer detachment rate $q$ is small, monomers will be nearly exhausted in the long time limit. In this example, the final cluster size distribution consists of two dimers, one trimer, one 4-mer, one pentamer, and two hexamers. In this paper we will consider the limit of slow, but non-zero monomer detachment rates $q/p \equiv \varepsilon \ll 1$.

    We begin by considering the simple homogeneous nucleation and growth process in a closed system containing $M$ total particles, as depicted in Fig. 1. Monomers first bind together to form dimers. Larger clusters are formed by successive one-by-one binding of monomers. Individual monomers can also detach from clusters, shrinking them. Depending on the molecular details and application, different monomer binding and unbinding rate structures [25, 26, 28, 29], cluster fragmentation/coagulation rules [24, 30], and monomer sources [26, 31, 32] have been considered. For example, in epitaxial growth, atoms may be continuously deposited on the substrate, giving rise to a constant flux of adatoms (monomers) that contribute to island nucleation and growth. In other applications, self-assembly may occur in small volumes, such as during the assembly of biomolecular aggregates inside cells. Here, monomer production may be slow compared to the growth dynamics and the total number of monomers, whether free or part of clusters, can be assumed constant.

    Especially within biophysics, cluster sizes are usually limited by the finite total mass of the system, or by some intrinsic cluster stoichiometry. For example, virus capsids, clathrin-coated pits, and antimicrobial peptide pores typically consist of $N \sim 100$–$1000$, $N \sim 10$–$20$, and $N \sim 5$–$8$ molecular subunits, respectively. By defining the self-assembly problem with fixed total mass and a maximum cluster size, we can now study the process under different limits of total mass $M$ and maximum cluster size $N$, and clarify how different large system size limits can be properly taken, finding unexpected results. Our analysis highlights the importance of stochastics and particle discreteness in a wide class of applications of self-assembly models.

    In general, the monomer attachment and detachment rates $p_k$ and $q_k$ depend on the cluster size $k$. For example, clusters that grow spherically can be modeled by attachment and detachment rates that are proportional to the cluster surface area, $p_k, q_k \sim k^{2/3}$. Other scalings for attachment and detachment rates have also been used to model specific physical limits of nucleation [28]. For the sake of simplicity we assume in this work that monomer attachment and detachment occur at constant, cluster size-independent rates $p$ and $q$, respectively. This scaling would be appropriate for, say, the growth of a one-dimensional filament, where attachment and detachment always occur at only one or both ends. Henceforth, we will also focus on the case $p \gg q$, as this strong binding limit best reveals the importance of stochasticity and finite-size effects in this class of models.

    II. MASS-ACTION TREATMENT

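    We first briefly review classical mass-action nucleation theory. Since the underlying assumption is that all clusters are well-mixed, we assume, without loss of generality, a closed system of fixed unit volume. The mass-action BD equations can be written in dimensionless form:

$$
\begin{aligned}
\dot{c}_1(t) &= -c_1^2 - c_1\sum_{j=2}^{N-1} c_j + 2\varepsilon c_2 + \varepsilon\sum_{j=3}^{N} c_j, \\
\dot{c}_2(t) &= -c_1 c_2 + \tfrac{1}{2}c_1^2 - \varepsilon c_2 + \varepsilon c_3, \\
\dot{c}_k(t) &= -c_1 c_k + c_1 c_{k-1} - \varepsilon c_k + \varepsilon c_{k+1}, \\
\dot{c}_N(t) &= c_1 c_{N-1} - \varepsilon c_N,
\end{aligned}
\tag{1}
$$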

    where $\varepsilon \equiv q/p$. For a system defined with fixed unit volume, the $c_k$ in Eqs. (1) represent the mean number of clusters of size $k$. However, to better distinguish the solutions of BD equations from the mean cluster numbers derived from subsequent stochastic analyses, we shall continue to refer to $c_k(t)$ as “concentrations.” Note that setting $\varepsilon = 0$ precludes monomer detachment, giving rise to irreversibility and ergodicity-breaking in the evolution to the final cluster size distribution [33]. When $0 < \varepsilon \ll 1$, detachment occurs, and a true equilibrium cluster size distribution will be slowly reached. Since we also assume mass conservation, the constraint $\sum_{k=1}^{N} k c_k = M$ is also imposed to fix the total number of monomer particles, whether free or part of clusters, to $M$. We also use the initial condition $c_k(t=0) = M\delta_{k,1}$ (where the Kronecker $\delta$-function $\delta_{i,j} = 1$ if $i = j$, and zero otherwise), corresponding to all mass in the form of monomers at $t = 0$.

    Each term in the BD equations represents a particular attachment or detachment event. For example, the $-c_1^2$ term on the RHS of the first equation arises from the $\sim c_1^2/2$ ways of dimerizing two monomers so that the time rate of change of $c_1(t)$ is proportional to $-2c_1^2/2 = -c_1^2$. Equations (1) have been solved numerically and extensively analyzed in asymptotic limits under the total mass constraint [25–28].

    Figure 2 plots the numerical solution to Eqs. (1) as a function of time for $N = 4$ and $\varepsilon = 10^{-5}$. In general, $c_{k>1}(t)$ initially rise at the expense of $c_1(t)$. After monomers are significantly depleted, the mean cluster size distributions remain nearly constant at “quenched” or “metastable” values $c_k^*$. Since detachment is slow ($\varepsilon \ll 1$), the long-lived values $c_k^*$ correspond to a quenched size distribution before detachment and redistribution of mass have had time to occur appreciably. After a long time $t_c \sim \varepsilon^{-1}$, this metastable size distribution begins to “coarsen,” eventually reaching an equilibrium distribution $c_k^{\mathrm{eq}}$.


    FIG. 2. Numerical solution of Eqs. (1) for the cluster concentrations $c_k(t)$, plotted as a function of $\log_{10} t$. In this example, $N = 4$, $M = 9$, and $\varepsilon = 10^{-5}$. Analytic expressions for both the intermediate-time metastable concentrations $c_k^*$ and the long-time equilibrium concentrations $c_k^{\mathrm{eq}}$ (indicated at the right for $k = 2$ and $k = 4$, respectively) can be found in Sec. 1 of Appendix. The typical coarsening time $t_c \sim \varepsilon^{-1}$ is indicated in this plot by the vertical dashed line at $\log_{10}(\varepsilon^{-1}) = 5$. In the mass-action BD theory, $c_k(t)/M$ is nearly independent of $M$ (see Fig. 3(c)).
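As a rough numerical illustration of the quench described above, the following sketch integrates Eqs. (1) with a fixed-step fourth-order Runge-Kutta scheme for $N = 4$, $M = 9$, and $\varepsilon = 10^{-5}$, stopping well before the coarsening time $t_c \sim \varepsilon^{-1}$. It is not the authors' integrator: the function names `bd_rhs` and `integrate_bd`, the step size, and the stopping time $t = 50$ are assumptions made for illustration only.

```python
def bd_rhs(c, eps):
    """Right-hand side of Eqs. (1); c[k-1] holds c_k and N = len(c) >= 3."""
    N = len(c)
    c1 = c[0]
    d = [0.0] * N
    # Monomers: lost to dimerization and attachment, gained by detachment.
    d[0] = -c1 * c1 - c1 * sum(c[1:N - 1]) + 2 * eps * c[1] + eps * sum(c[2:])
    # Dimers.
    d[1] = -c1 * c[1] + 0.5 * c1 * c1 - eps * c[1] + eps * c[2]
    # Intermediate sizes 3 <= k <= N-1.
    for k in range(2, N - 1):
        d[k] = -c1 * c[k] + c1 * c[k - 1] - eps * c[k] + eps * c[k + 1]
    # Largest clusters can only shrink by detachment.
    d[N - 1] = c1 * c[N - 2] - eps * c[N - 1]
    return d


def integrate_bd(N, M, eps, t_end, dt=0.005):
    """Fixed-step RK4 (an illustrative choice) from c_k(0) = M * delta_{k,1}."""
    c = [float(M)] + [0.0] * (N - 1)
    for _ in range(int(round(t_end / dt))):
        k1 = bd_rhs(c, eps)
        k2 = bd_rhs([x + 0.5 * dt * a for x, a in zip(c, k1)], eps)
        k3 = bd_rhs([x + 0.5 * dt * a for x, a in zip(c, k2)], eps)
        k4 = bd_rhs([x + dt * a for x, a in zip(c, k3)], eps)
        c = [x + dt * (a + 2 * b + 2 * g + h) / 6.0
             for x, a, b, g, h in zip(c, k1, k2, k3, k4)]
    return c


c_meta = integrate_bd(N=4, M=9, eps=1e-5, t_end=50.0)
total_mass = sum((k + 1) * ck for k, ck in enumerate(c_meta))
print("c_k at t = 50:", c_meta)
print("total mass   :", total_mass)
```

At this intermediate time the monomer concentration should be nearly exhausted and the remaining $c_k$ should sit close to the metastable values quoted in Sec. 1 of Appendix, while the total mass stays at $M$. An explicit fixed-step scheme is only practical here because the run stops long before $t_c$; following the coarsening out to $t \gg \varepsilon^{-1}$ would call for an adaptive or implicit integrator.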

    The metastable concentrations $c_k^*$ can be computed exactly in the $\varepsilon \to 0^+$ asymptotic limit by directly setting $\varepsilon = 0$ in Eqs. (1). In this case, Eqs. (1) can be linearized via a proper time rescaling, effectively eliminating the multiplicative term $c_1$ from the RHS. The mathematical details are outlined in Sec. 1 of Appendix, which also shows that $c_k^* \propto M$.

    The equilibrium concentrations $c_k^{\mathrm{eq}}$ for small $\varepsilon$ can be found by setting $\dot{c}_k = 0$ in Eqs. (1) and perturbatively solving the resulting algebraic equations. As described in Sec. 1 of Appendix, this leads to

$$
c_k^{\mathrm{eq}} \approx \frac{\varepsilon}{2}\left(\frac{2M}{\varepsilon N}\right)^{k/N}\left[1 - \frac{k(N-1)}{N^2}\left(\frac{\varepsilon N}{2M}\right)^{1/N} + \ldots\right].
\tag{2}
$$

    At equilibrium, we find $c_k^{\mathrm{eq}} \gg c_{k-1}^{\mathrm{eq}}$, and for $\varepsilon \ll 1$, the maximal cluster of size $N$ dominates with concentration $c_N^{\mathrm{eq}} \approx M/N$, while $c_k^{\mathrm{eq}}$ […] $N$, there is a question as to how to take the limits $M \to \infty$ and $N \to \infty$. Recognizing these obvious deficiencies in BD-type mass-action equations, we now perform a more careful analysis of the fully discrete stochastic master equation to reveal dramatic differences between mass-action and stochastic results, even in the large system limit.
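To see how the two-term expansion in Eq. (2) behaves for concrete parameters, the sketch below compares it against a direct numerical solution of the steady-state conditions. Setting $\dot{c}_k = 0$ in Eqs. (1) is satisfied by the pairwise balances $c_1^2/2 = \varepsilon c_2$ and $c_1 c_{k-1} = \varepsilon c_k$, so that $c_k = (\varepsilon/2)x^k$ for $k \ge 2$ with $x = c_1/\varepsilon$; the code assumes this balanced solution is the relevant equilibrium and fixes $x$ by bisection on the mass constraint. The function names and the bisection solver are illustrative choices, not part of the original analysis.

```python
def ceq_approx(k, N, M, eps):
    """Two-term expansion of Eq. (2) for the equilibrium concentration c_k."""
    small = (eps * N / (2.0 * M)) ** (1.0 / N)
    lead = 0.5 * eps * (2.0 * M / (eps * N)) ** (k / N)
    return lead * (1.0 - k * (N - 1) / N**2 * small)


def ceq_numeric(N, M, eps):
    """Solve the mass constraint for x = c_1/eps by bisection.

    Assumes the balanced steady state c_1 = eps*x, c_k = (eps/2)*x**k (k >= 2).
    Returns [c_1, ..., c_N].
    """
    def mass(x):
        return eps * x + sum(k * 0.5 * eps * x**k for k in range(2, N + 1))

    lo, hi = 0.0, 1.0
    while mass(hi) < M:          # bracket the root
        hi *= 2.0
    for _ in range(200):         # mass(x) is monotonic, so bisection is safe
        mid = 0.5 * (lo + hi)
        if mass(mid) < M:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return [eps * x] + [0.5 * eps * x**k for k in range(2, N + 1)]


N, M, eps = 4, 9, 1e-5
exact = ceq_numeric(N, M, eps)
for k in range(2, N + 1):
    print(k, exact[k - 1], ceq_approx(k, N, M, eps))
```

For these parameters the largest clusters hold almost all of the mass, with $c_N^{\mathrm{eq}}$ a few percent below $M/N$, in line with the $\varepsilon^{1/N}$ correction in Eq. (2).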

    III. STOCHASTIC ANALYSIS

    Consider the probability density $P(\{n\}; t) \equiv P(n_1, n_2, \ldots, n_N; t)$ of the system being in a state with $n_1$ monomers, $n_2$ dimers, $n_3$ trimers, ..., $n_N$ $N$-mers. Using the above definition $\varepsilon \equiv q/p$, the full stochastic master equation describing the time evolution of $P(\{n\}; t)$ is [31, 32]

$$
\begin{aligned}
\dot{P}(\{n\};t) = {} & -\Lambda(\{n\})\,P(\{n\};t) \\
& + \tfrac{1}{2}(n_1+2)(n_1+1)\,W_1^{+}W_1^{+}W_2^{-}\,P(\{n\};t) \\
& + \varepsilon(n_2+1)\,W_2^{+}W_1^{-}W_1^{-}\,P(\{n\};t) \\
& + \sum_{i=2}^{N-1}(n_1+1)(n_i+1)\,W_1^{+}W_i^{+}W_{i+1}^{-}\,P(\{n\};t) \\
& + \varepsilon\sum_{i=3}^{N}(n_i+1)\,W_1^{-}W_{i-1}^{-}W_i^{+}\,P(\{n\};t),
\end{aligned}
\tag{3}
$$

    where $P(\{n\}; t) = 0$ if any $n_i < 0$, $\Lambda(\{n\}) = \tfrac{1}{2}n_1(n_1-1) + \sum_{i=2}^{N-1} n_1 n_i + \varepsilon\sum_{i=2}^{N} n_i$ is the total rate out of configuration $\{n\}$, and $W_j^{\pm}$ is the unit raising/lowering operator on the number of clusters of size $j$. For example,

$$
W_1^{+}W_i^{+}W_{i+1}^{-}\,P(\{n\};t) \equiv P(n_1+1, \ldots, n_i+1, n_{i+1}-1, \ldots; t).
\tag{4}
$$

    To be consistent with the analysis of the BD equations of Sec. II, we will assume that all the mass is initially in the form of monomers: $P(\{n\}; t=0) = \delta_{n_1,M}\,\delta_{n_2,0}\cdots\delta_{n_N,0}$. By construction, the stochastic dynamics described by Eq. (3) obey the total mass conservation constraint

$$
M = \sum_{k=1}^{N} k\,n_k.
\tag{5}
$$

    The mean numbers of clusters of size $k$, defined by $\langle n_k(t)\rangle \equiv \sum_{\{n\}} n_k P(\{n\}; t)$, are the direct counterparts to the mean cluster “concentrations” $c_k(t)$ obtained from numerical solutions of the BD equations in Sec. II.

    We first simulate the stochastic master equation using a kinetic Monte-Carlo (KMC) or residence time algorithm as described by Bortz et al. [34] and detailed in Sec. 2 of Appendix. Figure 3 plots mean cluster numbers $\langle n_k(t)\rangle$ derived from simulations of Eq. (3) with $N = 8$, $M = 16, 17$, and $\varepsilon = 10^{-6}$. Also shown for comparison are the mean-field cluster concentrations $c_k(t)$ (thin, dashed curves) derived from numerical solutions of the BD equations. Our results show that during short and intermediate times $t \ll \varepsilon^{-1}$, there is little difference between the predictions for $M = 16$ and $M = 17$. Moreover, in this regime the mass-action concentrations closely track the mean cluster numbers, $c_k(t) \approx \langle n_k(t)\rangle$.

    The most striking differences between the $M = 16$ and $M = 17$ cases occur at long times $t \gg \varepsilon^{-1}$. For $M = 2N = 16$ (Fig. 3(a)) the mass-action solution $c_k^{\mathrm{eq}}$ roughly approximates $\langle n_k(t)\rangle$, while for $M = 17$, there is a dramatic difference between $c_k^{\mathrm{eq}}$ and the asymptotically exact mean numbers $\langle n_k^{\mathrm{eq}}\rangle$. Figure 3(c) highlights the differences between $c_k^{\mathrm{eq}}$ and $\langle n_k^{\mathrm{eq}}\rangle$, particularly for $k = N = 8$ (red curves). The approximation $c_k^{\mathrm{eq}} \sim \langle n_k^{\mathrm{eq}}\rangle$ is qualitatively reasonable only when $M$ is divisible by $N$, or when $M$ is very large. Even in this case, $c_k^{\mathrm{eq}}$ converges slowly to $\langle n_k^{\mathrm{eq}}\rangle$ as $\varepsilon \to 0^+$, especially for large $k$. For example, $c_N^{\mathrm{eq}}$ converges to $\langle n_N^{\mathrm{eq}}\rangle$ but with a remaining error term of order $\varepsilon^{1/N}$.

    FIG. 3. Mean cluster numbers $\langle n_k(t)\rangle$ obtained from averaging $10^6$ KMC simulations of a stochastic self-assembly process with $\varepsilon = 10^{-6}$. Only $k = 2, 4, 6, 8$ are displayed. (a) For $N = 8$ and $M = 16$, nearly all the mass is concentrated in $\langle n_8^{\mathrm{eq}}\rangle \approx 2$ at equilibrium. (b) For $N = 8$, $M = 17$, a much broader equilibrium mean cluster distribution arises. For comparison, the numerical solution for $c_k(t)$ from the BD equations is displayed by the dashed curves. The simulation and mean-field results agree well with each other at short and intermediate times, but diverge at long times as equilibrium is approached, particularly when the total mass $M$ is indivisible by $N$. Interestingly, even though the BD equations are most accurate for short and intermediate times, it is during the intermediate metastable regime that the variations in the cluster concentrations are largest across simulation trajectories (see Sec. 2 of Appendix). (c) The difference between $c_k^{\mathrm{eq}}$ and $\langle n_k^{\mathrm{eq}}\rangle$ (plotted in units of $M/N$ here) is highlighted as a function of $M$. The red dashed line corresponds to $c_8^{\mathrm{eq}}$ (which is nearly independent of $M$), while the open circles correspond to $\langle n_8^{\mathrm{eq}}\rangle$ found from Monte-Carlo simulation. Note that $c_8^{\mathrm{eq}} \sim \langle n_8^{\mathrm{eq}}\rangle$ only when $\varepsilon \to 0^+$ and $M$ is divisible by $N = 8$, or when $M \to \infty$. The filled red circles correspond to $M = 16$ and $M = 17$ as detailed in (a) and (b), respectively. A few other mean concentrations, $\langle n_{4,6,7}^{\mathrm{eq}}\rangle$, along with the corresponding $c_{4,6,7}^{\mathrm{eq}}$ (dashed lines) are also plotted for reference.

    To quantitatively understand these differences, and how stochastic and finite-size effects influence the self-assembly process, we must find analytic ways of computing the relevant configurational probabilities $P(\{n\}; t \to \infty)$ obeying the master equation. Since we assume that detachment is slow, the most highly weighted equilibrium configurations are those with the fewest total number of clusters. For each set $\{M, N\}$, we can thus enumerate the states with the lowest number of clusters and use detailed balance to compute their relative weights.

    As an explicit example, consider the possible states for the simple case $N = 4$, $M = 9$ shown in Fig. 4. When $0 < \varepsilon \ll 1$, nearly all the weight settles into states with the lowest number of clusters ($N_{\min} = 3$ here). As detailed in Sec. 3 of Appendix, applying detailed balance between the $N_{\min} = 3$ and $N_{\min} + 1 = 4$ states, neglecting corrections of $O(\varepsilon)$, we find $\langle n_1\rangle \approx \langle n_2\rangle \approx 6/13$, $\langle n_3\rangle \approx 9/13$, and $\langle n_4\rangle \approx 18/13$.

    To extend our results to general $M$ and $N$, we start from the state with the highest possible number of maximum-sized clusters, given by the integer part of $M/N$, and distribute the remaining particles among the smaller ones. The number of largest clusters is then successively reduced until all mass is exhausted. In this way, we inductively enumerate all states with near minimal total numbers of clusters. We then use detailed balance to compute the relative equilibrium weights of these few-cluster states and find closed-form solutions for the mean equilibrium cluster numbers $\langle n_k^{\mathrm{eq}}\rangle$. In order to write our analytic solutions we consider the following representation of mass $M = \sigma N - j$, where $\sigma$ denotes the maximum possible number of largest clusters, and $0 \le j \le N - 1$ represents the remainder of $M/N$. We thus arrive at one of the main results of this paper: exact solutions to the expected equilibrium cluster numbers in the $\varepsilon \to 0^+$ limit:

$$
\begin{aligned}
\langle n_N^{\mathrm{eq}}\rangle &= \frac{\sigma(\sigma-1)}{(\sigma+j-1)}, \\
\langle n_{N-k}^{\mathrm{eq}}\rangle &= \frac{\sigma(\sigma-1)\,j(j-1)\cdots(j-k+1)}{(\sigma+j-1)(\sigma+j-2)\cdots(\sigma+j-k-1)}.
\end{aligned}
\tag{6}
$$

    These expressions are valid for $0 \le j < N - 1$ and all $k$. Note that within all minimum cluster number states, the smallest cluster that can be formed is of size $N - j$. Therefore, $\langle n_{N-k}^{\mathrm{eq}}\rangle = 0$ for $k > j$, which is automatically satisfied by Eq. (6). In the special case $j = N - 1$, the total mass can also be expressed as $M = \sigma N - (N-1) = (\sigma-1)N + 1$. Therefore, $j = N - 1$ corresponds to adding a single monomer to a system with initially $M = (\sigma-1)N$ monomers. In this case, the formulae for the equilibrium concentrations must take into account combinatoric factors of 2 that arise when monomers appear in the populated configurations. A careful enumeration for $j = N - 1$ yields

$$
\begin{aligned}
\langle n_1^{\mathrm{eq}}\rangle &= \frac{2(N-1)!}{D(\sigma, N-1)}, \\
\langle n_{N-k}^{\mathrm{eq}}\rangle &= \frac{\prod_{\ell=1}^{k}(N-\ell)\,\prod_{i=1}^{N-k-1}(\sigma-2+i)}{D(\sigma, N-1)}, \\
\langle n_N^{\mathrm{eq}}\rangle &= \frac{(\sigma-1)\,D(\sigma-1, N-1)}{D(\sigma, N-1)},
\end{aligned}
\tag{7}
$$

    where $D(\sigma, j) \equiv j! + \prod_{\ell=1}^{j-1}(\sigma+\ell)$. Equations (6) and (7) are asymptotically exact in the $\varepsilon \to 0^+$ limit. We verified that results derived from our Monte-Carlo simulations closely match these formulae for small $\varepsilon$.

    FIG. 4. Enumeration of the configurations $(n_1, n_2, n_3, n_4)$ for the $N = 4$, $M = 9$ case, arranged by total number of clusters. Only three distinct states with a minimum number of clusters $N_{\min} = 3$ arise. These states are all connected by monomer attachment/detachment steps with states having $N_{\min} + 1 = 4$ clusters. By identifying the states that connect the three minimum cluster states, we can compute the weights of each of these states in the $\varepsilon \to 0^+$ limit. If $\varepsilon = 0$, the system loses ergodicity and appreciable probability will be forever trapped in state $(0, 3, 1, 0)$, leading to a very different metastable, pre-coarsening distribution $\langle n_k^*\rangle$.

    FIG. 5. The expected equilibrium cluster numbers $\langle n_k^{\mathrm{eq}}\rangle$ in the $\varepsilon \to 0^+$ limit, plotted as functions of $1 \le k \le N = 8$ and total mass $M$.
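The closed forms in Eqs. (6) and (7) are straightforward to evaluate exactly. The sketch below does so with rational arithmetic; it is an illustrative transcription rather than the authors' code, and the function name `mean_equilibrium_numbers` is hypothetical. Two restrictions are assumptions made here, not statements from the text: the code requires $M > N$ (so that $\sigma \ge 2$ and no factor in Eq. (6) degenerates to $0/0$), and it applies the middle line of Eq. (7) only for $1 \le k \le N - 2$, leaving $k = N - 1$ to the separate $\langle n_1^{\mathrm{eq}}\rangle$ formula.

```python
from fractions import Fraction
from math import factorial, prod


def D(sigma, j):
    """D(sigma, j) = j! + prod_{l=1}^{j-1} (sigma + l), as defined after Eq. (7)."""
    return factorial(j) + prod(sigma + l for l in range(1, j))


def mean_equilibrium_numbers(M, N):
    """Return [<n_1>, ..., <n_N>] in the eps -> 0+ limit as exact Fractions."""
    if M <= N:
        raise ValueError("this sketch assumes M > N")
    sigma = -(-M // N)           # ceiling of M/N, so that M = sigma*N - j
    j = sigma * N - M            # 0 <= j <= N - 1
    n = [Fraction(0)] * N        # n[k-1] holds <n_k>
    if j < N - 1:                # Eq. (6)
        n[N - 1] = Fraction(sigma * (sigma - 1), sigma + j - 1)
        for k in range(1, j + 1):
            num = sigma * (sigma - 1) * prod(j - m for m in range(k))
            den = prod(sigma + j - 1 - m for m in range(k + 1))
            n[N - k - 1] = Fraction(num, den)
    else:                        # Eq. (7), one monomer above a multiple of N
        d = D(sigma, N - 1)
        n[0] = Fraction(2 * factorial(N - 1), d)
        for k in range(1, N - 1):
            num = (prod(N - l for l in range(1, k + 1))
                   * prod(sigma - 2 + i for i in range(1, N - k)))
            n[N - k - 1] = Fraction(num, d)
        n[N - 1] = Fraction((sigma - 1) * D(sigma - 1, N - 1), d)
    return n


for M in (16, 17):
    print(M, [str(x) for x in mean_equilibrium_numbers(M, 8)])
```

For $N = 4$, $M = 9$ this evaluation returns $6/13$, $6/13$, $9/13$, and $18/13$, the values quoted earlier, and in every case the result satisfies the mass constraint $\sum_k k\langle n_k^{\mathrm{eq}}\rangle = M$, which makes a convenient sanity check when transcribing the formulae.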

    Figure 5 plots our exact result $\langle n_k^{\mathrm{eq}}\rangle$ for $N = 8$ and varying total mass $M$. The previously analyzed cases $M = 16, 17$ are represented by the last two rows to the right. When $j = 0$, such as in the $M = 16$ case, the maximum cluster size $N$ is commensurate with the total mass $M$, so all the mass is deposited into the largest clusters. The inaccuracy of the BD mass-action equations can be readily seen by considering the incommensurate case where $j > 0$ and the total monomer number $M$ is not an integer multiple of the maximum cluster size $N$. Recall from Eq. (2) that the $\varepsilon \to 0^+$ limit of the mass-action equations always gives $c_k^{\mathrm{eq}}$ […] a large number (in this case 7) of additional nontrivial 3-cluster states are possible:

$$
\begin{array}{ll}
(0, 1, 0, 0, 0, 0, 1, 1), & (0, 0, 0, 1, 0, 1, 1, 0), \\
(0, 0, 1, 0, 0, 1, 0, 1), & (0, 0, 0, 0, 2, 0, 1, 0), \\
(0, 0, 0, 1, 1, 0, 0, 1), & (0, 0, 1, 0, 0, 0, 2, 0), \\
\text{and } (0, 0, 0, 0, 1, 2, 0, 0). &
\end{array}
\tag{8}
$$

    The equilibrium weights of these 8 new states are comparable, resulting in a very flat mean cluster size distribution as shown in Fig. 5. When $j < N - 1$, or when $\sigma$ is large, this dispersal effect diminishes. From Eqs. (7), $\langle n_{N-1}^{\mathrm{eq}}\rangle / \langle n_N^{\mathrm{eq}}\rangle \sim N^2/M$, showing that the BD result $c_k^{\mathrm{eq}} \sim (M/N)\delta_{k,N}$ is asymptotically accurate when $M \gg N^2$, or equivalently, when $\sigma \gg N$. Thus, the periodically varying curve $(N/M)\langle n_k^{\mathrm{eq}}\rangle$ in Fig. 3(c) asymptotes to the mass-action result ($\varepsilon \to 0^+$) as $M/N^2 \to \infty$.

    IV. SUMMARY AND CONCLUSIONS

    We have formulated the fully stochastic model of homogeneous nucleation and self-assembly. From the associated stochastic master equation, analytic results for the expected equilibrium cluster numbers were computed, verified with KMC simulations, and compared with those derived from the mass-action Becker-Döring equations. At intermediate times, we find that the mean cluster concentrations derived from mean-field (mass-action BD equations) and stochastic treatments are qualitatively similar. Ironically, in this regime, stochastic fluctuations are strongest (Sec. 2 of Appendix). Recursion relations for computing the expected cluster numbers in the intermediate-time, metastable regime are also presented in Sec. 4 of Appendix.

    Surprisingly, stochastic and mass-action results differ dramatically in the coarsened, equilibrium regime where stochastic fluctuations are small, but where mass incommensurability arises. Our results indicate a dramatic broadening of the mean cluster size distribution when the total mass $M$ is indivisible by $N$ and is just above a multiple of $N$. This “dispersal” effect is especially prevalent for small $\sigma$, even if the total mass $M$ or system size is large. In general, the mean cluster numbers increase with cluster size, but in the special case $M = N + 1$, the cluster size distribution can even be “inverted” ($\langle n_k^{\mathrm{eq}}\rangle > \langle n_\ell^{\mathrm{eq}}\rangle$ when $k < \ell$). Our findings originate from the analysis of the discrete stochastic master equation and are completely neglected by mean-field theories.

    Our discrete stochastic model also allows us to consider self-assembly in systems of differing total mass and maximum cluster size. Table I lists regimes of validity and results for three different models: mass-action Becker-Döring equations without an imposed maximum cluster size, Becker-Döring equations with a fixed finite maximum cluster size $N$, and the fully stochastic master equation. Three different ways of taking the large system limits $M, N \to \infty$ are considered. The first case of $N = \infty$ with $M$ finite corresponds to nucleation with unbounded cluster sizes. All models yield a single cluster of size $M$, but give different rates of coarsening. If $M \gg N^2$, the finite-$N$ BD model matches the asymptotically exact stochastic model where all the mass is concentrated into the largest allowed cluster. In this case, concentrations $c_k^{\mathrm{eq}}$ […]

    TABLE I. Accuracy and validity regimes for equilibrium cluster numbers of different nucleation models in the $\varepsilon \ll 1$ limit. The results indicated by * or † match in the $\varepsilon \to 0^+$ limit, but they approach their common result very differently as $\varepsilon \to 0$.

    Equilibrium cluster numbers ($\varepsilon \ll 1$)    $M/N \to 0$    $M/N$ finite        $M/N \gg N$
    BD ($N = \infty$)                                    Eq. (2)*       ...                 ...
    BD (finite $N$)                                      Eq. (2)*       Eq. (2)             Eq. (2)†
    Stochastic model                                     Eq. (6)*       Eqs. (6) and (7)    Eqs. (6) and (7)†


    Upon using the Laplace transform $\tilde{c}_k(s) = \int_0^\infty c_k(\tau)\,e^{-s\tau}\,d\tau$, and the initial condition $c_k(t=0) = M\delta_{k,1}$, Eqs. (A5) can be solved to find

$$
\tilde{c}_1(s) = \frac{2Ms(s+1)^{N-2}}{(2s^2+2s+1)(s+1)^{N-2} - 1}.
\tag{A6}
$$

    This expression can be inverse Laplace transformed to find $c_1(\tau)$. By defining $\tau^*$ as the rescaled time at which monomers are depleted, $c_1(\tau^*) = 0$, we can find the quenched concentrations of all the other larger clusters $c_k(\tau^*)$ by Laplace inverting the expressions

$$
\begin{aligned}
\tilde{c}_k(s) &= \frac{Ms(s+1)^{N-k-1}}{(2s^2+2s+1)(s+1)^{N-2} - 1}, \\
\tilde{c}_N(s) &= \frac{M}{(2s^2+2s+1)(s+1)^{N-2} - 1},
\end{aligned}
\tag{A7}
$$

    and evaluating the results at $\tau^*$. Clearly, the solutions to the metastable concentrations $c_k^*$ are proportional to the total mass $M$. In addition, if $N \le 5$, the poles of $\tilde{c}_1(s)$ can be found analytically, and $\tau^*$ and $c_k(\tau^*)$ can be found as solutions to simple transcendental equations. For $N = 4$, we find $\tau^* = 1.7124$, and

$$
c_1^* \equiv 0, \quad c_2^* = 0.9445, \quad c_3^* = 1.0058, \quad c_4^* = 1.0234.
\tag{A8}
$$

    The mean-field concentrations stay close to these values until a time of order $\varepsilon^{-1}$, when detachment and mass redistribution allow the clusters to coarsen towards their final equilibrium values $c_k^{\mathrm{eq}}$.
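The quenched values can also be recovered without inverting the Laplace transforms. The sketch below assumes that the rescaled time is $d\tau = c_1\,dt$, so that Eqs. (1) with $\varepsilon = 0$ become the linear system $dc_1/d\tau = -c_1 - \sum_{j=2}^{N-1} c_j$, $dc_2/d\tau = c_1/2 - c_2$, $dc_k/d\tau = c_{k-1} - c_k$, and $dc_N/d\tau = c_{N-1}$, which is consistent with Eqs. (A6) and (A7). It integrates this system with a fixed-step Runge-Kutta scheme until $c_1$ first reaches zero. The function name `quench`, the step size, and the choice $M = 9$ (the value used in Fig. 2) are illustrative assumptions.

```python
def quench(N, M, dt=1e-4, tau_max=100.0):
    """Integrate the eps = 0, time-rescaled (linear) BD system until c_1 = 0.

    Returns (tau_star, [c_1, ..., c_N] at tau_star). Requires N >= 3.
    """
    if N < 3:
        raise ValueError("this sketch assumes N >= 3")

    def rhs(c):
        d = [0.0] * N
        d[0] = -c[0] - sum(c[1:N - 1])
        d[1] = 0.5 * c[0] - c[1]
        for k in range(2, N - 1):
            d[k] = c[k - 1] - c[k]
        d[N - 1] = c[N - 2]
        return d

    def rk4_step(c):
        k1 = rhs(c)
        k2 = rhs([x + 0.5 * dt * a for x, a in zip(c, k1)])
        k3 = rhs([x + 0.5 * dt * a for x, a in zip(c, k2)])
        k4 = rhs([x + dt * a for x, a in zip(c, k3)])
        return [x + dt * (a + 2 * b + 2 * g + h) / 6.0
                for x, a, b, g, h in zip(c, k1, k2, k3, k4)]

    c = [float(M)] + [0.0] * (N - 1)
    tau = 0.0
    while tau < tau_max:
        new = rk4_step(c)
        if new[0] <= 0.0:
            # Linear interpolation inside the step where c_1 changes sign.
            f = c[0] / (c[0] - new[0])
            return tau + f * dt, [a + f * (b - a) for a, b in zip(c, new)]
        c, tau = new, tau + dt
    raise RuntimeError("c_1 did not reach zero before tau_max")


tau_star, c_star = quench(N=4, M=9)
print("tau* =", tau_star)
print("c*   =", c_star)
```

Because the rescaled system is linear, the output scales with $M$, in line with $c_k^* \propto M$; running it with $M = 9$ allows a direct comparison with the numbers in Eq. (A8).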

    2. Kinetic Monte-Carlo simulations

    To check our analytic results and explore parameter regimes, we performed extensive KMC simulations of the self-assembly process (Eq. (3)) using the Bortz-Kalos-Lebowitz continuous-time algorithm. For each combination of $M$ and $N$, $N_{\mathrm{runs}} = 10^6$ separate simulations were performed, with data for the time evolution of the cluster populations aggregated across the simulations. Specifically, simulation data were recorded every 0.01 time units up to a simulation time of 100, and then recorded every 100 time units thereafter. A simulation time of 100 was generally sufficient for the process to reach the long-lived metastable state. Time series for the average cluster populations $\langle n_k(t)\rangle$ were computed by averaging the independent simulations according to

$$
\langle n_k(i\,\Delta t)\rangle = N_{\mathrm{runs}}^{-1}\sum_{r=1}^{N_{\mathrm{runs}}} n_k^{(r)}(i\,\Delta t),
\tag{A9}
$$

    where $r$ indexes the simulation run, $i$ specifies the time step, and $\Delta t$ is the magnitude of each time step. Estimates for equilibrium populations were provided by $\langle n_k(t)\rangle$ with $t \approx 10^{10}$.

    All simulations used to generate our results were started from $n_1 = M$, and the insensitivity of the equilibrium cluster populations to initial conditions was verified for a number of random cases. Our simulations also provide the entire distribution of cluster numbers, which can be used to compute higher moments of cluster concentrations $n_k$ for each size $k$. For example, we define the variance of the concentrations of each $k$-cluster by

$$
\sigma_k(i\,\Delta t) \equiv N_{\mathrm{runs}}^{-1}\sum_{r=1}^{N_{\mathrm{runs}}}\left(n_k^{(r)}(i\,\Delta t) - \langle n_k(i\,\Delta t)\rangle\right)^2,
\tag{A10}
$$

    and plot them in Fig. 6. Note that the variance is largest in the metastable regime and diminishes in the equilibrium regime if $M$ is divisible by $N$. For incommensurate masses, significant variance remains. Provided sufficient trajectories are sampled, our simulations can also be used to accurately construct the full distribution of each concentration $n_k$. In Fig. 7, we plot the probability $\mathrm{Prob}(n_5 = m)$ of observing $m$ pentamers for eight time slices. Figure 8 shows the distributions of each cluster size at equilibrium. Note the dispersal, or “emulsification,” of the system when $M = 33 = 4 \times 8 + 1$.

    FIG. 6. (a) The variance (Eq. (A10)) for $N = 8$, $M = 16$, and $\varepsilon = 10^{-6}$. (b) The variance when $M = 17$. The variance is large during the metastable regime and is insensitive to $\varepsilon \to 0^+$. At equilibrium, the variance is extremely small when $M$ is divisible by $N$, but is measurable when there is an extra monomer.
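A minimal residence-time (continuous-time) simulation of this kind can be sketched as follows. It is not the authors' implementation: the event rates are read off from the total exit rate $\Lambda(\{n\})$ of Eq. (3), the ensemble mean and variance are evaluated at a single final time rather than on a time grid, and the names `kmc_run` and `mean_and_variance`, the run count, and the seed are illustrative assumptions.

```python
import random


def kmc_run(M, N, eps, t_end, rng):
    """One residence-time trajectory; returns [n_1, ..., n_N] at time t_end."""
    n = [0] * (N + 1)            # n[k] = number of k-mers; index 0 is unused
    n[1] = M
    t = 0.0
    while True:
        events = []              # (rate, kind, cluster size involved)
        if n[1] >= 2:            # dimerization of two monomers
            events.append((0.5 * n[1] * (n[1] - 1), "attach", 1))
        for i in range(2, N):    # monomer attaches to an i-mer, i < N
            if n[1] and n[i]:
                events.append((n[1] * n[i], "attach", i))
        for i in range(2, N + 1):  # monomer detaches from an i-mer
            if n[i]:
                events.append((eps * n[i], "detach", i))
        total = sum(rate for rate, _, _ in events)
        if total == 0.0:
            break
        t += rng.expovariate(total)   # exponentially distributed waiting time
        if t > t_end:
            break
        x = rng.random() * total      # pick an event with probability rate/total
        for rate, kind, i in events:
            x -= rate
            if x < 0.0:
                break
        if kind == "attach":
            if i == 1:
                n[1] -= 2
                n[2] += 1
            else:
                n[1] -= 1
                n[i] -= 1
                n[i + 1] += 1
        elif i == 2:                  # dimer splits into two monomers
            n[2] -= 1
            n[1] += 2
        else:
            n[i] -= 1
            n[i - 1] += 1
            n[1] += 1
    return n[1:]


def mean_and_variance(M, N, eps, t_end, n_runs, seed=0):
    """Ensemble mean (Eq. (A9)) and variance (Eq. (A10)) at the single time t_end."""
    rng = random.Random(seed)
    runs = [kmc_run(M, N, eps, t_end, rng) for _ in range(n_runs)]
    mean = [sum(r[k] for r in runs) / n_runs for k in range(N)]
    var = [sum((r[k] - mean[k]) ** 2 for r in runs) / n_runs for k in range(N)]
    return mean, var


mean, var = mean_and_variance(M=17, N=8, eps=1e-6, t_end=100.0, n_runs=1000)
print("mean:", mean)
print("var :", var)
```

With $t_{\mathrm{end}} = 100$ the trajectories sample the metastable regime; reaching the equilibrium regime at $t \gg \varepsilon^{-1}$ with the same code requires far more events per trajectory, since each rare detachment is followed by rapid reattachment.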

    FIG. 7. The fraction of simulation trajectories with $n_5 = m$ at various times (axes: number of pentamers $m$, $\log(t)$, and $\mathrm{Prob}(n_5 = m)$). The simulation parameters are $N = 8$, $M = 33$, and $\varepsilon = 10^{-6}$.

    FIG. 8. The distributions of each cluster size at equilibrium determined from KMC simulations with $N = 8$ and $\varepsilon = 10^{-6}$. (a) For $M = 31$ nearly all the mass resides in three $k = 8$ clusters, with one remaining $k = 7$ cluster. For $M = 32$, all the distributions of $n_k$ are peaked at zero, except for $\mathrm{Prob}(n_8 = 4) = 1$ (not shown). (b) Upon adding yet one more monomer ($M = 33$), the mean cluster concentration distribution disperses significantly and smaller clusters arise. The broader distribution of cluster concentrations arises from indivisibility of the total mass by the maximum cluster size, resulting in a proliferation of allowable configurations as illustrated by the states listed in Eq. (8).

    3. Equilibrium solution $\langle n_k^{\mathrm{eq}}\rangle$

    Figure 4 displays all accessible states for an $N = 4$, $M = 9$ self-assembly process started with all monomers ($n_1(t=0) = M = 9$). For the slow detachment (small $\varepsilon$) process nearly all the weight at equilibrium is distributed among states with the lowest number of clusters, three in this case. To compute the partitioning, we consider transitions only among these low cluster states, as illustrated in Fig. 9.

    The transition rates are determined by the number ofways each transition can occur, and are labeled in Fig. 9.Therefore, detailed balance requires Peq(2, 0, 1, 1) = εPeq(0,1, 1, 1), 2Peq(2, 0, 1, 1) = 2εPeq(1, 0, 0, 2), 2Peq(1, 1, 2, 0)= εPeq(0, 1, 1, 1), and Peq(1, 1, 2, 0) = 3εPeq(0, 0, 3, 0).Imposing normalization of the three-cluster states, neglect-ing corrections of O(ε), we find Peq(0, 1, 1, 1) ≈ Peq(1,0, 0, 2) ≈ 6/13 and Peq(0, 0, 3, 0) ≈ 1/13. These proba-bilities yield 〈neq1 〉 ≈ 6/13, 〈neq2 〉 ≈ 6/13, 〈neq3 〉 ≈ 9/13, and

    .

    .

    .

    .

    .

    .(2,0,1,1)

    3 clusters(0,0,3,0)

    4 clusters(1,1,2,0)

    13ε

    (1,0,0,2)

    2ε2

    ε2

    (0,1,1,1)

    FIG. 9. Enumeration of the transitions between the two lowest cluster num-ber states (3 and 4 clusters) for the N = 4, M = 9 case. Only three distinctstates with a minimum number of clusters Nmin = 3 arise. These states areall connected by monomer attachment/detachment steps with states havingNmin + 1 = 4 clusters. By identifying the states that connect the three mini-mum cluster states, we can compute the weights of each of these states in theε → 0+ limit.

    〈neq4 〉 ≈ 18/13. In general, we extend this analysis to generalM and N by enumerating all low cluster number states. Thisis done by identifying the state containing the highest pos-sible number of largest clusters, and distributing the remain-ing mass into smaller clusters. The number of largest clustersis then successively decreased and only states with the low-est number of total clusters are counted. States generated thisway are connected through detailed balance to find the rela-tive probabilities of the dominant states. After normalizationwe find Peq({n}) and 〈neqk 〉 as presented in Eqs. (6) and (7).
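The ε → 0+ weights for the N = 4, M = 9 example can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes attachment rates n_1(n_1 − 1)/2 for dimer formation and n_1 n_k for a monomer joining a k-mer, and a detachment rate ε n_k for a k-mer shedding a monomer, which is the convention implied by the four detailed-balance relations in the text. The function names are hypothetical. Instead of enumerating only the low cluster number states, the sketch walks attachment moves from the all-monomer state, assigns each state a detailed-balance coefficient (its probability is that coefficient times ε raised to the number of clusters), and then keeps the states with the fewest clusters.

```python
from collections import deque
from fractions import Fraction


def attachments(state):
    """List (next_state, attach_rate, n_detach) for each monomer attachment.

    state = (n_1, ..., n_N). n_detach is the number of product clusters in
    next_state, so the assumed reverse (detachment) rate is eps * n_detach.
    """
    n, N = list(state), len(state)
    moves = []
    if n[0] >= 2:  # two monomers form a dimer
        m = n.copy()
        m[0] -= 2
        m[1] += 1
        moves.append((tuple(m), Fraction(n[0] * (n[0] - 1), 2), m[1]))
    for k in range(2, N):  # a k-mer (k = 2..N-1) accepts a monomer
        if n[0] >= 1 and n[k - 1] >= 1:
            m = n.copy()
            m[0] -= 1
            m[k - 1] -= 1
            m[k] += 1
            moves.append((tuple(m), Fraction(n[0] * n[k - 1]), m[k]))
    return moves


def equilibrium_small_eps(M, N):
    """Leading-order equilibrium weights of the minimum-cluster states."""
    start = (M,) + (0,) * (N - 1)
    coeff = {start: Fraction(1)}  # P(state) = coeff * eps**(number of clusters)
    queue = deque([start])
    while queue:
        a = queue.popleft()
        for b, r_att, n_det in attachments(a):
            if b not in coeff:
                # detailed balance: P(a) * r_att = P(b) * eps * n_det
                coeff[b] = coeff[a] * r_att / n_det
                queue.append(b)
    n_min = min(sum(s) for s in coeff)
    dominant = {s: c for s, c in coeff.items() if sum(s) == n_min}
    norm = sum(dominant.values())
    return {s: c / norm for s, c in dominant.items()}


def mean_cluster_numbers(probs):
    """Mean number of k-mers, k = 1..N, for a dict {state: probability}."""
    N = len(next(iter(probs)))
    return [sum(p * s[k] for s, p in probs.items()) for k in range(N)]


if __name__ == "__main__":
    P = equilibrium_small_eps(9, 4)
    for state, p in sorted(P.items()):
        print(state, p)
    print(mean_cluster_numbers(P))
```

Calling `equilibrium_small_eps(9, 4)` returns the three states (0, 1, 1, 1), (1, 0, 0, 2), and (0, 0, 3, 0) with weights 6/13, 6/13, and 1/13, in agreement with the values in the text. Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.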

    4. Recursion for metastable concentrations 〈n_k^*〉

    Besides Monte-Carlo simulations and our main analytic results for the mean equilibrium cluster concentrations 〈n_k^eq〉, we can also derive a recursion relation that can be used to compute the expected cluster numbers for the irreversible assembly process where ε = 0. In this case, some probability is trapped in configurations that cannot further evolve (e.g., the state (0, 3, 1, 0) in Fig. 4). These “quenched” probabilities P*({n}) lead to a good approximation to the expected, metastable cluster densities 〈n_k^*〉 when ε > 0 is small. Further coarsening of these metastable cluster size distributions arises only after t ≫ ε^−1, when detachment and mass redistribution have had time to occur.


    We consider irreversible dynamics (ε = 0) and derive a recursion relation for the frozen steady-state probability density P*(n_1, . . . , n_N), which emerges either when n_1 = 0 or when n_1 = 1 and n_k = 0 for all k ≠ 1, N. In the first case, all monomers have been depleted, while in the second one, there are no incomplete clusters available to accept the single remaining monomer. Note that the latter case arises only when (M − 1) is a multiple of N.

    In order to construct the quenched steady state we start from the initial configuration when all mass is in the form of monomers, n_1 = M and P*(M, 0, . . . , 0) = 1. From this initial state, the only other state that can be constructed is that containing one dimer, which leads also to P*(M − 2, 1, 0, . . . , 0) = 1. This means that our initial free monomer probability will “migrate” in its entirety to a new configuration with one dimer and M − 2 monomers. There are no other choices for this first step. We now find the relative weights of the second generation states ((M − 4, 2, 0, . . . , 0) and (M − 3, 0, 1, . . . , 0)) formed from (M − 2, 1, 0, . . . , 0):

    \[
    (M-2, 1, 0, \ldots, 0) \;\longrightarrow\;
    \begin{cases}
    (M-4, 2, 0, \ldots, 0) \\
    (M-3, 0, 1, \ldots, 0)
    \end{cases}
    \tag{A11}
    \]

    This process is continued until the quenched steady state is reached where n_1 = 0, or n_1 = 1 and n_k = 0 for all k ≠ 1, N. For the simple case of N = 3, only two possibilities exist: monomer aggregation leads to the formation of a dimer, or a monomer and a dimer can give rise to a trimer. The recursion relation that determines the quenched probabilities is

    \[
    P^*(n_1, n_2, n_3) = \frac{M - 2n_2 - 3n_3 + 1}{M - 3n_3 - 1}\, P^*(n_1 + 2, n_2 - 1, n_3)\Big|_{n_2 \ge 1}
    + \frac{2(n_2 + 1)}{M - 3n_3 + 2}\, P^*(n_1 + 1, n_2 + 1, n_3 - 1)\Big|_{n_3 \ge 1},
    \tag{A12}
    \]

    where we have explicitly taken into account the fact that M = n_1 + 2n_2 + 3n_3. The prefactors can be calculated by balancing the flux of material into each state. For example, the first term in Eq. (A12) arises from the following consideration. State (n_1, n_2, n_3) arises only via states (n_1 + 2, n_2 − 1, n_3) (dimer formation) and (n_1 + 1, n_2 + 1, n_3 − 1) (trimer formation). To calculate the contribution from state (n_1 + 2, n_2 − 1, n_3), we note that the latter equilibrates only with states (n_1, n_2, n_3) and (n_1 + 1, n_2 − 2, n_3 + 1). There are (n_1 + 2)(n_1 + 1)/2 ways of creating a new dimer from (n_1 + 2) monomers. Similarly, there are (n_1 + 2)(n_2 − 1) ways to create a trimer from (n_1 + 2) monomers and (n_2 − 1) dimers. The relative weight w_1 for material transferring from state (n_1 + 2, n_2 − 1, n_3) to (n_1, n_2, n_3) is thus given by

    \[
    w_1 = \frac{(n_1 + 2)(n_1 + 1)}{(n_1 + 2)(n_1 + 1) + 2(n_1 + 2)(n_2 - 1)}.
    \tag{A13}
    \]

    Upon simplifying and using the mass conservation constraint we find that w_1 corresponds to the first prefactor in Eq. (A12). Similarly, to calculate the contribution from state (n_1 + 1, n_2 + 1, n_3 − 1), we note that it equilibrates with states (n_1, n_2, n_3) and (n_1 − 1, n_2 + 2, n_3 − 1). There are (n_1 + 1)n_1/2 ways of creating a new dimer from n_1 + 1 monomers. Similarly, there are (n_1 + 1)(n_2 + 1) ways to create a trimer from (n_1 + 1) monomers and (n_2 + 1) dimers. The relative weight w_2 for material transferring from state (n_1 + 1, n_2 + 1, n_3 − 1) → (n_1, n_2, n_3) is given by

    \[
    w_2 = \frac{2(n_1 + 1)(n_2 + 1)}{2(n_1 + 1)(n_2 + 1) + (n_1 + 1)n_1},
    \tag{A14}
    \]

    which is exactly the second prefactor in Eq. (A12). We can apply the same reasoning for general N to find relationships among the steady-state probabilities in the irreversible case ε = 0:

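    \[
    \begin{aligned}
    P^*(n_1, n_2, \ldots, n_N) ={}& \frac{n_1 + 1}{n_1 - 1 + 2\sum_{k=2}^{N-1} n_k}\, P^*(n_1 + 2, n_2 - 1, n_3, \ldots, n_N)\Big|_{n_2 \ge 1} \\
    &+ \sum_{j=2}^{N-2} \frac{2(n_j + 1)}{n_1 + 2\sum_{k=2}^{N-1} n_k}\, P^*(n_1 + 1, \ldots, n_j + 1, n_{j+1} - 1, \ldots, n_N)\Big|_{n_{j+1} \ge 1} \\
    &+ \frac{2(n_{N-1} + 1)}{n_1 + 2 + 2\sum_{k=2}^{N-1} n_k}\, P^*(n_1 + 1, n_2, \ldots, n_{N-1} + 1, n_N - 1)\Big|_{n_N \ge 1}.
    \end{aligned}
    \tag{A15}
    \]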

    It is understood that all occupation numbers must be non-negative and the process stops whenever n_1 = 0, or n_1 = 1 and n_k = 0 for all k ≠ 1, N.
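For N = 3 the recursion in Eq. (A12) is short enough to evaluate directly. The following is a minimal sketch, assuming only Eq. (A12) with the starting value P*(M, 0, 0) = 1. The function names `make_pstar` and `quenched_states` are hypothetical, and states are indexed by (n_2, n_3) with n_1 fixed by mass conservation. Memoization is used so that each state is evaluated once.

```python
from fractions import Fraction
from functools import lru_cache


def make_pstar(M):
    """Return a function pstar(n2, n3) implementing Eq. (A12) for N = 3.

    The monomer number follows from mass conservation, n1 = M - 2*n2 - 3*n3.
    """

    @lru_cache(maxsize=None)
    def pstar(n2, n3):
        n1 = M - 2 * n2 - 3 * n3
        if n1 < 0 or n2 < 0 or n3 < 0:
            return Fraction(0)  # not a valid occupation vector
        if n2 == 0 and n3 == 0:
            return Fraction(1)  # initial all-monomer state
        total = Fraction(0)
        if n2 >= 1:  # reached by dimer formation from (n1 + 2, n2 - 1, n3)
            w1 = Fraction(M - 2 * n2 - 3 * n3 + 1, M - 3 * n3 - 1)
            total += w1 * pstar(n2 - 1, n3)
        if n3 >= 1:  # reached by trimer formation from (n1 + 1, n2 + 1, n3 - 1)
            w2 = Fraction(2 * (n2 + 1), M - 3 * n3 + 2)
            total += w2 * pstar(n2 + 1, n3 - 1)
        return total

    return pstar


def quenched_states(M):
    """(n2, n3) pairs that are frozen for N = 3: n1 = 0, or n1 = 1 with n2 = 0."""
    frozen = []
    for n3 in range(M // 3 + 1):
        for n2 in range((M - 3 * n3) // 2 + 1):
            n1 = M - 2 * n2 - 3 * n3
            if n1 == 0 or (n1 == 1 and n2 == 0):
                frozen.append((n2, n3))
    return frozen


if __name__ == "__main__":
    M = 4
    pstar = make_pstar(M)
    for n2, n3 in quenched_states(M):
        print((M - 2 * n2 - 3 * n3, n2, n3), pstar(n2, n3))
```

For M = 4 the only frozen states are (0, 2, 0) and (1, 0, 1). The intermediate state (2, 1, 0) can form a dimer in one way or a trimer in two ways, so the recursion assigns them weights 1/3 and 2/3.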

    This algorithm is difficult to implement for large M because the number of computations grows exponentially with M. However, we can straightforwardly apply it to our small-system examples. For example, for the N = 4, M = 9 case illustrated in Fig. 4, we find that for ε = 0, the system settles to four configurations with the following weights:

    \[
    P^*(0, 3, 1, 0) = \frac{921}{5488}, \quad
    P^*(0, 0, 3, 0) = \frac{2873}{24696}, \quad
    P^*(0, 1, 1, 1) = \frac{4015}{7056}, \quad
    P^*(1, 0, 0, 2) = \frac{259}{1764}.
    \]


    These probabilities yield the mean cluster numbers 〈n_1^*〉 = 0.14683, 〈n_2^*〉 = 1.07248, 〈n_3^*〉 = 1.08584, and 〈n_4^*〉 = 0.86267.
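The quenched weights and mean cluster numbers for N = 4, M = 9 can be reproduced by pushing probability forward through the irreversible (ε = 0) dynamics, one attachment event at a time. The following is a minimal sketch rather than the authors' implementation. It assumes the same counting used for Eqs. (A13) and (A14): n_1(n_1 − 1)/2 ways to form a dimer and n_1 n_k ways for a monomer to join a k-mer with k < N. The function names are hypothetical. Forward propagation is used here as an alternative to evaluating Eq. (A15) state by state; both split the probability of a state among its successors in proportion to the number of ways each transition can occur.

```python
from collections import defaultdict
from fractions import Fraction


def transitions(state):
    """List (next_state, rate) for irreversible monomer attachment."""
    n, N = list(state), len(state)
    moves = []
    if n[0] >= 2:  # two monomers form a dimer
        m = n.copy()
        m[0] -= 2
        m[1] += 1
        moves.append((tuple(m), Fraction(n[0] * (n[0] - 1), 2)))
    for k in range(2, N):  # a k-mer (k = 2..N-1) accepts a monomer
        if n[0] >= 1 and n[k - 1] >= 1:
            m = n.copy()
            m[0] -= 1
            m[k - 1] -= 1
            m[k] += 1
            moves.append((tuple(m), Fraction(n[0] * n[k - 1])))
    return moves


def quenched_probabilities(M, N):
    """Probability of each frozen state, starting from M free monomers."""
    level = {(M,) + (0,) * (N - 1): Fraction(1)}
    frozen = defaultdict(Fraction)
    # Every attachment lowers the cluster count by one, so states can be
    # processed generation by generation.
    while level:
        nxt = defaultdict(Fraction)
        for state, p in level.items():
            moves = transitions(state)
            if not moves:  # no monomer can attach: the state is quenched
                frozen[state] += p
                continue
            total = sum(rate for _, rate in moves)
            for target, rate in moves:
                nxt[target] += p * rate / total
        level = nxt
    return dict(frozen)


def mean_cluster_numbers(probs):
    """Mean number of k-mers, k = 1..N, for a dict {state: probability}."""
    N = len(next(iter(probs)))
    return [sum(p * s[k] for s, p in probs.items()) for k in range(N)]


if __name__ == "__main__":
    P = quenched_probabilities(9, 4)
    for state, p in sorted(P.items()):
        print(state, p)
    print([round(float(x), 5) for x in mean_cluster_numbers(P)])
```

With M = 9 and N = 4 this yields four frozen states whose weights match the fractions 921/5488, 2873/24696, 4015/7056, and 259/1764, and they sum to one. The number of states visited grows quickly with M, so the sketch is intended only for small systems such as this one.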

    From the recursion relations, it is clear that the exact results 〈n_k^eq〉 depend on M in a nontrivial manner. Although the results for c_k^* derived from BD equations are simply proportional to M (see Eqs. (A6) and (A7)), they cannot be exact. However, comparing the values of c_k^eq in Eqs. (A8) with the values for 〈n_k^*〉 above, they form a reasonable approximation to the exact results. We have also shown that the c_k^* give reasonable approximations to 〈n_k^*〉 for all values of N and M that we have investigated. This good approximation holds despite the large variances in the mean cluster numbers in the metastable regime, as shown in Fig. 6.

