1
ITR: Non-equilibrium surface growth and the scalability of parallel discrete-event simulations for large asynchronous systems
NSF DMR-0113049
http://www.rpi.edu/~korniss/Research/gk_research.html
2
PIs: Gyorgy Korniss (Rensselaer), Mark Novotny (Mississippi State U.)
postdoc: Alice Kolakowska (Mississippi State U.)
graduate student: H. Guclu (Rensselaer)
undergraduate students: Katie Barbieri, John Marsh, Brad McAdams (Rensselaer); Shannon Wheeler (MSState)
collaborators: P.A. Rikvold (Florida State U.), Z. Toroczkai (CNLS, Los Alamos), B.D. Lubachevsky, Alan Weiss (Lucent/Bell Labs)
Funded by NSF DMR/ITR, The Research Corporation, DOE (NERSC, SCRI/CSIT, LANL),
Rensselaer, Mississippi State U.
3
"Nature"?
computer architectures + algorithms
4
Discrete-event systems
• Cellular communication networks (call arrivals)
• Internet traffic routing/queueing systems
• Modeling the evolution of spatially extended interacting systems: updates in the "local" configuration as discrete events
  - Magnetization dynamics in condensed matter (Ising model with single-spin-flip Glauber dynamics)
  - Spatial epidemic models (contact process)
• Dynamics is asynchronous
• Updates in the local "configuration" are discrete events in continuous time (Poisson arrivals) ⇒ discrete-event simulation
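To illustrate what "discrete events in continuous time" means in practice, here is a minimal serial sketch. It assumes every site carries an independent Poisson stream of unit rate (the rate and the function name `serial_poisson_events` are illustrative choices, not taken from the slides). The superposition of `n_sites` such streams is a Poisson stream of rate `n_sites`, so events can be generated one at a time with a uniformly chosen site.

```python
import random


def serial_poisson_events(n_sites, n_events, seed=0):
    """Generate (time, site) update events for n_sites unit-rate Poisson streams.

    Assumption: all sites share the same rate 1, so the merged stream has
    rate n_sites and each event belongs to a uniformly chosen site.
    """
    rng = random.Random(seed)
    t = 0.0
    events = []
    for _ in range(n_events):
        t += rng.expovariate(n_sites)   # waiting time of the merged stream
        site = rng.randrange(n_sites)   # which site's stream fired
        events.append((t, site))
    return events


if __name__ == "__main__":
    for time, site in serial_poisson_events(n_sites=4, n_events=5):
        print(f"t = {time:.4f}: update site {site}")
```

Each event is processed strictly after the previous one, which is why this style of simulation looks inherently serial at first sight.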
5
Parallelization for asynchronous dynamics
The paradoxical task:
  - (algorithmically) parallelize (physically) non-parallel dynamics
Difficulties:
  - Discrete events (updates) are not synchronized by a global clock
  - Traditional algorithms appear inherently serial (e.g., Glauber dynamics attempts one site/spin update at a time)
• However, these algorithms are not inherently serial (B.D. Lubachevsky '87)
6
Parallel discrete-event simulation
• Spatial decomposition on a lattice/grid (for systems with short-range interactions, only local synchronization between subsystems)
• Changes/updates: independent Poisson arrivals
• Each subsystem/block of sites, carried by a processing element (PE), must have its own local simulated time, $\{\tau_i\}$ ("virtual time")
• Synchronization scheme
  - PEs must concurrently advance their own Poisson streams, without violating causality
7
Two approaches
• Conservative
  - PE "idles" if causality is not guaranteed
  - utilization, $\langle u \rangle$: fraction of non-idling PEs
[Figure: local simulated times $\tau_i$ versus site index $i$, d=1]
• Optimistic (or speculative)
  - PEs assume no causality violations
  - Rollbacks to previous states once a causality violation is found (extensive state saving or reverse simulation)
  - Rollbacks can cascade ("avalanches")
8
Basic conservative approach
"Worst-case" analysis:
• One site per PE, $N_{PE} = L^d$
• $t = 0, 1, 2, \ldots$ parallel steps
• $\tau_i(t)$: fluctuating time horizon
• Local time increments are iid exponential random variables
• Advance only if $\tau_i \le \min\{\tau_{nn}\}$ (nn: nearest neighbors)
• Scalability modeling
  - utilization (efficiency) $\langle u(t) \rangle$ (fraction of non-idling PEs): density of local minima
  - width (spread) of time surface:
\[ w^2(t) = \frac{1}{N_{PE}} \sum_{i=1}^{N_{PE}} \left[ \tau_i(t) - \bar{\tau}(t) \right]^2 \]
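The update rule and the two observables can be tried out with a short simulation. The following is a minimal sketch for d=1, assuming a ring of PEs (periodic boundaries), unit-mean exponential increments, and a flat initial horizon $\tau_i(0) = 0$; the function name `simulate_time_horizon` is hypothetical.

```python
import random


def simulate_time_horizon(L, steps, seed=0):
    """Basic conservative scheme on a ring of L PEs, one site per PE.

    Returns two lists: the utilization u(t) and the squared width w^2(t)
    after each parallel step.
    """
    rng = random.Random(seed)
    tau = [0.0] * L
    utilization, width2 = [], []
    for _ in range(steps):
        # All PEs decide from the same snapshot of the time horizon.
        # tau[i - 1] wraps around for i = 0 (periodic boundary).
        active = [tau[i] <= min(tau[i - 1], tau[(i + 1) % L]) for i in range(L)]
        for i in range(L):
            if active[i]:
                tau[i] += rng.expovariate(1.0)  # iid exponential increment
        mean_tau = sum(tau) / L
        utilization.append(sum(active) / L)
        width2.append(sum((x - mean_tau) ** 2 for x in tau) / L)
    return utilization, width2


if __name__ == "__main__":
    u, w2 = simulate_time_horizon(L=200, steps=2000)
    print("late-time utilization:", sum(u[-500:]) / 500)
    print("final w^2:", w2[-1])
```

Because the PE holding the global minimum of $\tau$ can always advance, the scheme never deadlocks. The late-time utilization from such a run can be compared with the steady-state value reported in the talk.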
9
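Coarse graining for the stochastic time surface evolution
\[ \tau_i(t+1) - \tau_i(t) = \Theta\big(\tau_{i-1}(t) - \tau_i(t)\big)\, \Theta\big(\tau_{i+1}(t) - \tau_i(t)\big)\, \eta_i(t) \]
• $\Theta(\cdot)$ is the Heaviside step function
• $\eta_i(t)$: iid exponential random numbers
Kardar-Parisi-Zhang equation:
\[ \partial_t \tau = \frac{\partial^2 \tau}{\partial x^2} - \lambda \left( \frac{\partial \tau}{\partial x} \right)^2 + \eta(x,t) \]
Steady state (d=1), Edwards-Wilkinson Hamiltonian:
\[ P[\tau(x)] \propto \exp\left[ -\frac{1}{2D} \int dx \left( \frac{\partial \tau}{\partial x} \right)^2 \right] \]
• Random-walk profile: short-range correlated local slopes
G. K., Toroczkai, Novotny, Rikvold, '00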
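As a rough check on the random-walk picture, suppose (an idealization, not a claim made on the slide) that the local slopes $\phi_i = \tau_{i+1} - \tau_i$ were fully independent and symmetric about zero. A PE advances only when it sits at a local minimum of the time horizon, so the utilization would be

```latex
\langle u \rangle \approx P(\phi_{i-1} < 0)\, P(\phi_i > 0)
                  = \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{4} .
```

The d=1 value $\langle u \rangle_\infty \approx 0.2464$ reported in the talk lies slightly below 1/4, which is consistent with the slopes being short-range correlated rather than strictly independent.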
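10
• Universality/roughness
\[ \langle w^2(t) \rangle_L \sim \begin{cases} t^{2\beta}, & \text{if } t \ll t_\times \\ L^{2\alpha}, & \text{if } t \gg t_\times \end{cases} \]
\[ t_\times \sim L^z, \quad z = \alpha/\beta \]
$\beta \approx 0.33$, $\alpha \approx 0.5$ (exact KPZ: $\beta = 1/3$, $\alpha = 1/2$)
• Utilization/efficiency
\[ \langle u \rangle_\infty \approx 0.2464 \]
\[ \langle u \rangle_L \cong \langle u \rangle_\infty + \frac{\text{const.}}{L} \]
\[ \sigma_L = \sqrt{\langle u^2 \rangle_L - \langle u \rangle_L^2} \sim 1/L^{1/2} \]
(d=1)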
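To make the scaling relations concrete, the exact KPZ exponents quoted above fix the dynamic exponent and the crossover time directly (plain arithmetic on the slide's values):

```latex
z = \frac{\alpha}{\beta} = \frac{1/2}{1/3} = \frac{3}{2},
\qquad t_\times \sim L^{3/2},
\qquad \langle w^2 \rangle_L \sim L^{2\alpha} = L \quad (t \gg t_\times).
```

For example, doubling the number of PEs in d=1 lengthens the transient before the steady state by a factor of about $2^{3/2} \approx 2.8$ and doubles the saturated $\langle w^2 \rangle_L$.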
11
Higher-d simulations (one site per PE), $N_{PE} = L^d$
d=1: $\langle u \rangle_\infty \approx 0.246$
d=2: $\langle u \rangle_\infty \approx 0.12$
d=3: $\langle u \rangle_\infty \approx 0.075$
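The $\langle u \rangle_\infty$ values are infinite-size limits. One simple way to estimate such a limit from finite runs is a two-point extrapolation, assuming the finite-size form $\langle u \rangle_L \cong \langle u \rangle_\infty + \text{const.}/L^{2(1-\alpha)}$ used in these slides. The function name and the sample numbers below are illustrative only; the default $\alpha = 1/2$ is the d=1 value, and for other dimensions you would have to supply your own estimate.

```python
def extrapolate_utilization(L1, u1, L2, u2, alpha=0.5):
    """Two-point estimate of (u_inf, const) from
    u_L = u_inf + const / L**(2 * (1 - alpha)).
    """
    x1 = L1 ** (-2.0 * (1.0 - alpha))
    x2 = L2 ** (-2.0 * (1.0 - alpha))
    const = (u1 - u2) / (x1 - x2)   # slope in the variable x = L^(-2(1-alpha))
    u_inf = u1 - const * x1         # intercept at x -> 0, i.e. L -> infinity
    return u_inf, const


if __name__ == "__main__":
    # Synthetic inputs generated from the assumed form (not measured data).
    u_inf, const = extrapolate_utilization(100, 0.2490, 1000, 0.24666)
    print(f"u_inf ~ {u_inf:.4f}, const ~ {const:.3f}")
```

With real data a least-squares fit over several system sizes would be more robust than two points; this sketch only shows the shape of the calculation.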
12
Implications for scalability
Simulation reaches steady state for $t \gg L^z$ (arbitrary d)
• Simulation phase: scalable
\[ \langle u \rangle_L \cong \langle u \rangle_\infty + \frac{\text{const.}}{L^{2(1-\alpha)}} \]
  - $\langle u \rangle_\infty$, the asymptotic average growth rate (simulation speed or utilization), is non-zero
  - Krug and Meakin, '90
• Measurement (data management) phase: not scalable
\[ \langle w^2 \rangle_L \sim L^{2\alpha} \]
  - measurement at $\tau_{meas}$ (e.g., simple averages)
• But CAN be made scalable by considering complex underlying communication topologies among PEs
13
Actual implementation
1. Local time increment: $\Delta\tau = -\ln(r)$, $r \in U(0,1)$
2. If the chosen site is on the boundary, the PE must wait until $\tau \le \min\{\tau_{nn}\}$
$l \times l$ blocks, $N_{PE} = (L/l)^2$
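The two steps above can be sketched as a small model that tracks only the PEs' local times, not the spins. This is a hedged illustration rather than the production code behind the talk: it assumes periodic boundaries, nearest-neighbor interactions, and that a blocked PE keeps its pending site until it may proceed. The names `block_utilization`, `n_pe_side`, and `pending` are hypothetical.

```python
import math
import random


def block_utilization(n_pe_side, l, steps, seed=0):
    """Utilization per parallel step for n_pe_side x n_pe_side PEs,
    each carrying an l x l block of sites (periodic boundaries)."""
    rng = random.Random(seed)
    n = n_pe_side
    tau = [[0.0] * n for _ in range(n)]
    # Site (x, y) inside each PE's block chosen for its next update.
    pending = [[(rng.randrange(l), rng.randrange(l)) for _ in range(n)]
               for _ in range(n)]
    util = []
    for _ in range(steps):
        ready = []
        for a in range(n):
            for b in range(n):
                x, y = pending[a][b]
                # Local times of the neighboring PEs that this site touches.
                nbrs = []
                if x == 0:
                    nbrs.append(tau[(a - 1) % n][b])
                if x == l - 1:
                    nbrs.append(tau[(a + 1) % n][b])
                if y == 0:
                    nbrs.append(tau[a][(b - 1) % n])
                if y == l - 1:
                    nbrs.append(tau[a][(b + 1) % n])
                # Interior sites have no constraint (nbrs is empty).
                if all(tau[a][b] <= t for t in nbrs):
                    ready.append((a, b))
        for a, b in ready:
            # 1 - random() lies in (0, 1], which avoids log(0).
            tau[a][b] += -math.log(1.0 - rng.random())
            pending[a][b] = (rng.randrange(l), rng.randrange(l))
        util.append(len(ready) / (n * n))
    return util


if __name__ == "__main__":
    for l in (1, 4, 16):
        u = block_utilization(n_pe_side=8, l=l, steps=400)
        print(f"l = {l:2d}: late-time utilization ~ {sum(u[-100:]) / 100:.3f}")
```

For `l = 1` every site is a boundary site and the rule reduces to the basic one-site-per-PE scheme; larger blocks make boundary picks rarer, so PEs idle less often.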
14
Application: metastability and dynamic phase transition in spatially extended bistable systems
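\[ Q_i = \frac{1}{2 t_{1/2}} \int s_i(t)\, dt \]
$Q_i$: period-averaged spin
$t_{1/2}$: half-period of the oscillating field
$\langle \tau(T,H) \rangle$: metastable lifetime
[Figure: configurations of $\{s_i\}$ and $\{Q_i\}$ for $t_{1/2} < \langle\tau\rangle$, $t_{1/2} \approx \langle\tau\rangle$, and $t_{1/2} > \langle\tau\rangle$]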
15
Application: metastability and hysteresis
Kinetic Ising model
• $L \times L$ lattice with periodic boundary conditions
• Single-spin-flip Glauber dynamics
• Periodic square-wave field of amplitude $H$; half-period: $t_{1/2}$
\[ \mathcal{H} = -J \sum_{\langle i,j \rangle} s_i s_j - H(t) \sum_{i=1}^{L^2} s_i, \qquad J > 0, \quad s_i = \pm 1 \]
Magnetization: $m(t) = (1/L^2) \sum_i s_i(t)$
$T < T_c$, $H \to -H$ at $t = 0$: $m = 1$; escape from the metastable well at $t = \tau$: $m = 0$
Lifetime: $\langle\tau\rangle = \langle\tau(T,H)\rangle$
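For orientation, here is a minimal serial sketch of the kinetic Ising model described on this slide: single-spin-flip Glauber dynamics on an $L \times L$ periodic lattice driven by a square-wave field, returning the period-averaged magnetization for each period. It is not the parallel implementation from the talk. Assumptions: $k_B = 1$, time is measured in Monte Carlo steps per spin, `half_period` is an integer number of such steps, and the function names are hypothetical.

```python
import math
import random


def glauber_sweep(spins, L, J, H, T, rng):
    """Attempt L*L single-spin flips (one Monte Carlo step per spin)."""
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        s = spins[i][j]
        nn_sum = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                  + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2.0 * s * (J * nn_sum + H)  # energy change for s -> -s
        # Glauber flip probability; the exponent is clamped to avoid overflow.
        if rng.random() < 1.0 / (1.0 + math.exp(min(dE / T, 700.0))):
            spins[i][j] = -s


def period_averaged_magnetization(L, J, H0, T, half_period, n_periods, seed=0):
    """Return one period-averaged magnetization value per field period."""
    rng = random.Random(seed)
    spins = [[1] * L for _ in range(L)]  # start fully magnetized, m = 1
    Q = []
    for _ in range(n_periods):
        m_sum = 0.0
        for step in range(2 * half_period):
            # Square wave: the first half-period opposes the initial m = 1.
            H = -H0 if step < half_period else H0
            glauber_sweep(spins, L, J, H, T, rng)
            m_sum += sum(map(sum, spins)) / (L * L)
        Q.append(m_sum / (2 * half_period))
    return Q


if __name__ == "__main__":
    Q = period_averaged_magnetization(L=16, J=1.0, H0=0.3, T=1.8,
                                      half_period=50, n_periods=5)
    print([round(q, 3) for q in Q])
```

The sum over time steps is a discrete stand-in for the integral of $m(t)$ over one full period, normalized by $2 t_{1/2}$.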
16
Hysteresis and dynamic response
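• Periodic square-wave field of amplitude $H_o$
• Half-period $t_{1/2}$; $\Theta = t_{1/2} / \langle \tau(T,H_o) \rangle$
\[ \mathcal{H} = -J \sum_{\langle i,j \rangle} s_i s_j - H(t) \sum_i s_i \]
$\Theta \gg 1$: symmetric limit cycle
$\Theta \ll 1$: asymmetric limit cycle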
17
Dynamic Phase Transition (DPT)
• $\Theta \gg \Theta_c$: $|Q| \approx 0$, symmetric dynamic phase
• $\Theta \ll \Theta_c$: $|Q| \approx 1$, symmetry-broken dynamic phase
• $\Theta = \Theta_c \approx 1$ ($t_{1/2} \approx \langle\tau\rangle$): large fluctuations in $Q$ → DPT
\[ Q = \frac{1}{2 t_{1/2}} \int m(t)\, dt, \qquad \Theta = \frac{t_{1/2}}{\langle \tau(T,H) \rangle} \]
Sides et al., PRL '98, PRE '99; G.K. et al., PRE '01
Order parameter fluctuations and 4th order cumulant: finite-size scaling evidence for a continuous (dynamic) phase transition
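The two quantities named here can be computed from a time series of $Q$ values, one per field period. The sketch below assumes the standard Binder form of the fourth-order cumulant, $U_L = 1 - \langle Q^4 \rangle / (3 \langle Q^2 \rangle^2)$, and the fluctuation measure $L^2 (\langle Q^2 \rangle - \langle |Q| \rangle^2)$. The slide itself does not spell out either formula, so treat both definitions and the function name as assumptions.

```python
def order_parameter_stats(Q, L):
    """Summary statistics of the dynamic order parameter.

    Q : sequence of period-averaged magnetizations (one per period)
    L : linear system size
    """
    n = len(Q)
    abs_mean = sum(abs(q) for q in Q) / n
    q2 = sum(q * q for q in Q) / n
    q4 = sum(q ** 4 for q in Q) / n
    return {
        "abs_mean": abs_mean,                            # <|Q|>
        "fluctuations": L * L * (q2 - abs_mean ** 2),    # L^2 (<Q^2> - <|Q|>^2)
        "cumulant": 1.0 - q4 / (3.0 * q2 * q2),          # Binder-type cumulant
    }


if __name__ == "__main__":
    # A perfectly ordered series |Q| = 1 gives cumulant 2/3 and zero fluctuations.
    print(order_parameter_stats([1.0, -1.0, 1.0, -1.0], L=64))
```

In a finite-size scaling analysis, cumulant curves for different $L$ plotted against $\Theta$ are expected to cross near the transition, which is one common way to locate $\Theta_c$.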
18
Large-scale finite-size analysis of the dynamic phase transition: Absence of the Tri-critical Point
\[ Q = \frac{1}{2 t_{1/2}} \int m(t)\, dt \]
period-averaged magnetization (dynamic order parameter)
19
Summary and outlook
• The tools and machinery of non-equilibrium statistical physics (coarse-graining, finite-size scaling, universality, etc.) can be applied to scalability modeling and algorithm engineering
• Conservative schemes can be made scalable
• Optimistic schemes: rollbacks (avalanches in virtual time): self-organized criticality?
• Non-Poisson asynchrony (e.g., in "fat-tail" internet traffic)
• Applications: metastability, nucleation, and dynamic phase transition in spatially extended bistable systems