ROSS for PDES Research Elsa Gonsiorowski Justin LaPre

press3.mcs.anl.gov/summerofcodes2015/files/2015/07/justin-elsa-su…


ROSS for PDES Research

Elsa Gonsiorowski, Justin LaPre

CD Carothers

[Chart: vertical axis 0 to 12; horizontal axis years 1994 to 2015]

“Efficient Optimistic Parallel Simulations Using Reverse Computation”

[Chart: vertical axis 0 to 30; horizontal axis years 1994 to 2015]

“On the Role of Burst Buffers in Leadership-Class Storage Systems”

[Chart: vertical axis 0 to 50; horizontal axis years 1994 to 2015]

Research vs Software Engineering

• Doxygen Automated Documentation
• GitHub (and GitHub Issues)
• LP-Printf
• Continuous Integration with Travis
• AVL Tree
• LORAIN
• Delta Encoding

• AVL Tree (PADS 2014)
• LORAIN (PADS 2014)
• Delta Encoding (WSC 2015)
• Gates (MASCOTS 2012)
• Automatic Model Generation (PADS 2015)

Issues

[Chart: issue counts. Open: 11, 14, 5. Closed: 111, 12, 2. User names shown: gonsie, JohnPJenkins, carns, laprej, markplagge, mmubarak]

Punch Card

Research

• AVL Tree (PADS 2014)
• LORAIN (PADS 2014)
• Delta Encoding (WSC 2015)
• Gates (MASCOTS 2012)
• Automatic Model Generation (PADS 2015)
• RIO (??)

Gate-Level Circuits Simulation

• Events are electrical signals

• Logical processes (LPs) are the boolean logic gates

[Animated diagram: 0/1 signal values at the gates, with a timeline labeled Timestamp: 1 2 3 4]
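The mapping on this slide can be sketched in a few lines of C. This is only an illustration under assumed names: `signal_event`, `nand_gate_lp`, and `nand_gate_handle` are hypothetical and are not ROSS data structures or part of the gates model. The sketch shows one LP (a two-input NAND gate) consuming one event (a signal change on a pin) and reporting whether a new event would need to be sent downstream.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; these are not ROSS data structures. */
typedef struct {
    double timestamp;  /* simulation time at which the signal arrives */
    int    pin;        /* which input pin the signal drives (0 or 1) */
    bool   value;      /* new logic level on that pin */
} signal_event;

typedef struct {
    bool inputs[2];    /* current level on each input pin */
    bool output;       /* current level on the output pin */
} nand_gate_lp;

/* Event handler: apply the incoming signal, recompute the output, and
 * return true if the output changed, i.e. if a new event would have to
 * be scheduled for the gates wired downstream. */
bool nand_gate_handle(nand_gate_lp *gate, const signal_event *ev)
{
    assert(ev->pin == 0 || ev->pin == 1);
    gate->inputs[ev->pin] = ev->value;

    bool new_output = !(gate->inputs[0] && gate->inputs[1]);
    bool changed = (new_output != gate->output);
    gate->output = new_output;
    return changed;
}
```

Driving pin 0 high leaves the output at 1, so nothing is sent on; driving pin 1 high as well flips the output to 0, which is the point at which a real model would schedule an event for each downstream gate.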

PDES: Optimistic Synchronization

• Time Warp algorithm

• minimal global time synchronization

• local recovery when out-of-order event detected

• Reverse computation and/or state-saving

[Animated diagram: 0/1 signal values at the gates, with a timeline labeled Timestamp: 1 2 3 4]
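A minimal sketch of local recovery with state-saving, assuming a single LP whose state is one integer. The names `sim_lp`, `event`, and `lp_receive` are hypothetical and not the ROSS API. The sketch is also simplified: a real Time Warp engine returns rolled-back events to the pending queue and cancels already-sent messages with anti-messages, whereas here the undone events are simply re-executed on the spot.

```c
#include <assert.h>

#define MAX_PROCESSED 64

/* Hypothetical types for illustration; not the ROSS API. */
typedef struct { double ts; int delta; } event;

typedef struct {
    int   state;                     /* LP state */
    int   n;                         /* number of processed events */
    event processed[MAX_PROCESSED];  /* processed events, in timestamp order */
    int   saved[MAX_PROCESSED];      /* copy of state taken before each event */
} sim_lp;

/* Process one event, saving the prior state so it can be undone later. */
static void forward(sim_lp *p, event ev)
{
    assert(p->n < MAX_PROCESSED);
    p->saved[p->n] = p->state;            /* state-saving */
    p->processed[p->n] = ev;
    p->n++;
    p->state = p->state * 2 + ev.delta;   /* deliberately order-dependent */
}

/* Deliver an event optimistically. If it is a straggler (older than events
 * already processed), roll those events back first. Returns the number of
 * events that were rolled back. */
int lp_receive(sim_lp *p, event ev)
{
    event redo[MAX_PROCESSED];
    int rolled = 0;

    /* Local recovery: undo every processed event later than the new one. */
    while (p->n > 0 && p->processed[p->n - 1].ts > ev.ts) {
        p->n--;
        p->state = p->saved[p->n];
        redo[rolled++] = p->processed[p->n];
    }

    forward(p, ev);

    /* redo[] holds the undone events newest first; replay oldest first. */
    for (int i = rolled - 1; i >= 0; i--)
        forward(p, redo[i]);

    return rolled;
}
```

Delivering events with timestamps 1, 3, 2 ends in the same state as delivering 1, 2, 3, at the cost of one rollback. No global synchronization is involved; the LP repairs its own history.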


PDES Research

• How can we automate event rollback?

• Incremental state-saving with delta encoding

• Automatic reverse computation with LORAIN


Delta Encoding

• Motivation: not all event handlers are reversible, e.g., a while loop containing if statements

• We have to fall back to state-saving, but it consumes too much memory!

• Source control systems typically store changes as deltas

• Often a substantial win when changes are small

• PDES state changes are typically small

• Deltas on sizable models will likely have many zeroes, i.e., things are the same

• Excellent opportunity for compression!

Approach

• In the main event loop in ROSS:

  • memcpy() LP state into a buffer, state_before

  • Run event

  • Now we have state_after

  • Call delta_diff function on state_before and state_after

  • Compress diff with LZ4, save with event
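The loop above can be sketched as follows. Several things here are assumptions: the slide names a `delta_diff` function but not its signature, so the one below is hypothetical; a byte-wise XOR is used as the delta, which may differ from what ROSS actually does; `lp_state`, `handle_event`, `delta_apply`, and `run_event_with_delta` are made-up names; and the LZ4 compression step is left out so the example needs no external library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical LP state, for illustration only. */
typedef struct { long long counter; double load; char tag[8]; } lp_state;

/* XOR delta (an assumed encoding): unchanged bytes come out as zero. */
void delta_diff(const unsigned char *before, const unsigned char *after,
                unsigned char *delta, size_t n)
{
    for (size_t i = 0; i < n; i++)
        delta[i] = before[i] ^ after[i];
}

/* Rollback: XOR-ing the delta into the post-event state restores the
 * pre-event state, because (before ^ after) ^ after == before. */
void delta_apply(unsigned char *state, const unsigned char *delta, size_t n)
{
    for (size_t i = 0; i < n; i++)
        state[i] ^= delta[i];
}

/* Stand-in for a model's forward event handler. */
void handle_event(lp_state *s)
{
    s->counter += 1;
}

/* One iteration of the event loop: snapshot, run, diff.
 * Returns the number of non-zero bytes in the delta. */
size_t run_event_with_delta(lp_state *s, unsigned char *delta)
{
    assert(delta != NULL);

    lp_state before;
    memcpy(&before, s, sizeof before);      /* state_before */
    handle_event(s);                        /* *s is now state_after */
    delta_diff((const unsigned char *)&before,
               (const unsigned char *)s, delta, sizeof *s);

    size_t nonzero = 0;
    for (size_t i = 0; i < sizeof *s; i++)
        nonzero += (delta[i] != 0);
    return nonzero;
}
```

For a 24-byte state where one counter goes from 41 to 42, only a single delta byte is non-zero. Long runs of zero bytes like this are what make the delta a good input for a compressor.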

Example: OLSR Protocol

[Diagram: OLSR protocol structure. Labels: Input, Output, Generation, Forwarding, OLSR Message (HELLO, TC, MID), Route Calculation, Local Node, and Information Repositories: Link Set, Neighbor Set, 2 Hop Neighbor Set, Multipoint Relay Set, Multipoint Relay Selector Set, Topology Information Base, Duplicate Set, Multiple Interface Association Set]

OLSR Performance for Various Approaches

[Figure: OLSR run time comparison for delta encoding, conservative, and optimistic with state saving. Vertical axis: running time (seconds), ticks at 1, 4, 16, 64, 256, 1024, 4096. Horizontal axis: cores used (128, 518, 2048). Series: Delta encoding, Conservative, Optimistic w/ state saving]

Delta Encoding Uses Far Less Memory!

[Figure: OLSR model memory consumption for delta encoding and state saving. Vertical axis: memory used (KB), from 2^13 to 2^22. Horizontal axis: Delta Encoding, State Saving. Legend: LP State, Event Memory, Buddy System]

Motivation Behind LORAIN

• Writing reverse code is hard!

• Example: swapping into messages is common!

    // Forward
    if (msg->val == 1) { SWAP(lp->val, msg->val); }

    // Reverse
    if (msg->val == 1) { SWAP(lp->val, msg->val); }

Looks “obviously” correct forward and backward, but it’s not.

    // Forward
    if (msg->val == 1) { SWAP(lp->val, msg->val); }

    // Reverse
    if (lp->val == 1) { SWAP(lp->val, msg->val); }

Hard error to spot!
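The difference between the two reverse handlers is easier to see when both are run against the same forward handler. The sketch below wraps the slide's snippets in functions. The `SWAP` macro body, the `lp_state` and `message` struct definitions, and the `round_trip_restores` helper are assumptions added for illustration.

```c
/* Assumed definition of SWAP for int fields. */
#define SWAP(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

typedef struct { int val; } lp_state;  /* hypothetical LP state */
typedef struct { int val; } message;   /* hypothetical event message */

void forward(lp_state *lp, message *msg)
{
    if (msg->val == 1) { SWAP(lp->val, msg->val); }
}

/* First version from the slide: re-tests msg->val, but the forward swap
 * has already moved the 1 out of the message. */
void reverse_buggy(lp_state *lp, message *msg)
{
    if (msg->val == 1) { SWAP(lp->val, msg->val); }
}

/* Second version from the slide: tests lp->val, where the 1 lives after
 * the forward swap. */
void reverse_fixed(lp_state *lp, message *msg)
{
    if (lp->val == 1) { SWAP(lp->val, msg->val); }
}

/* Returns 1 if forward followed by the given reverse handler restores
 * both the LP value and the message value, 0 otherwise. */
int round_trip_restores(void (*reverse)(lp_state *, message *),
                        int lp_val, int msg_val)
{
    lp_state lp = { lp_val };
    message msg = { msg_val };
    forward(&lp, &msg);
    reverse(&lp, &msg);
    return lp.val == lp_val && msg.val == msg_val;
}
```

With `lp->val = 7` and `msg->val = 1`, the forward handler leaves `lp->val = 1` and `msg->val = 7`. The first reverse handler then sees `msg->val == 7`, skips the swap, and leaves the LP corrupted; the second one swaps back. Note that the second version still relies on `lp->val` not being 1 when the forward branch was skipped; recording the branch outcome alongside the event is a common way to remove that assumption.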

What Does LLVM IR Look Like?

    void test_if(void)
    {
        if (test_if_x_condition) {
            test_if_x = test_if_x + 1;
        }
    }

    define void @test_if() nounwind uwtable ssp {
      %1 = load i32* @test_if_x_condition, align 4
      %2 = icmp ne i32 %1, 0
      br i1 %2, label %3, label %6

    ; <label>:3                                       ; preds = %0
      %4 = load i32* @test_if_x, align 4
      %5 = add nsw i32 %4, 1
      store i32 %5, i32* @test_if_x, align 4
      br label %6

    ; <label>:6                                       ; preds = %3, %0
      ret void
    }

3 Steps

• Augment the appropriate “message” data structure to support swap operations

  • Catch “destructive” operations, e.g., x = 5

• Instrument / analyze the forward event handler

  • Mark non-important store instructions with metadata

• Clone and invert the forward event handler
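To make the idea of a "destructive" operation concrete, here is a hand-written C sketch of what a forward handler and its inverse might look like once the message has been augmented. This is not LORAIN output (LORAIN works on LLVM IR), and the struct layouts, the `saved_x` field, and the handler names are all hypothetical.

```c
typedef struct { int x; int count; } lp_state;

typedef struct {
    int new_x;    /* payload carried by the event */
    int saved_x;  /* augmentation: room for the value a destructive store overwrites */
} message;

void forward_handler(lp_state *lp, message *msg)
{
    /* Constructive operation: can be inverted by subtraction alone. */
    lp->count = lp->count + 1;

    /* Destructive operation (like x = 5): the old value would be lost,
     * so it is swapped into the message before being overwritten. */
    msg->saved_x = lp->x;
    lp->x = msg->new_x;
}

void reverse_handler(lp_state *lp, message *msg)
{
    /* Undo the forward handler's operations in the opposite order. */
    lp->x = msg->saved_x;
    lp->count = lp->count - 1;
}
```

The increment needs no extra storage because its inverse can be computed, while the assignment can only be undone because the overwritten value was kept in the message.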

Continuing Research

RIO: Checkpointing API For ROSS

• checkpoint.readme.txt

• checkpoint.metadata.mh

• checkpoint.data-#


Using RIO

• Compile in with CMake USE_RIO option

• --io-parts=n and --io-files=n flags or function call

• io_lptype struct

• Mapping extension

• Implement: Model size function

• Implement: Serialize and deserialize functions
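As a rough idea of what the last two bullets ask of a model author, the sketch below implements a size function plus a serialize/deserialize pair for a small state struct. The function names and signatures are assumptions made for illustration and are not RIO's actual callback signatures; the `model_state` struct is hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model state to be checkpointed. */
typedef struct { long long packets_sent; double busy_time; } model_state;

/* Model size function: how many bytes this LP needs in the checkpoint. */
size_t model_size(const model_state *s)
{
    (void)s;  /* fixed-size state; a variable-size model would inspect s */
    return sizeof s->packets_sent + sizeof s->busy_time;
}

/* Serialize: copy each field into the checkpoint buffer in a fixed order. */
void model_serialize(const model_state *s, void *buffer)
{
    assert(buffer != NULL);
    unsigned char *out = buffer;
    memcpy(out, &s->packets_sent, sizeof s->packets_sent);
    memcpy(out + sizeof s->packets_sent, &s->busy_time, sizeof s->busy_time);
}

/* Deserialize: read the fields back in the same order. */
void model_deserialize(model_state *s, const void *buffer)
{
    assert(buffer != NULL);
    const unsigned char *in = buffer;
    memcpy(&s->packets_sent, in, sizeof s->packets_sent);
    memcpy(&s->busy_time, in + sizeof s->packets_sent, sizeof s->busy_time);
}
```

Copying field by field rather than dumping the whole struct keeps the checkpoint layout independent of any padding the compiler inserts.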


YOUR PAPER HERE

T. Liu, N. Wolfe, C. D. Carothers, Wei Ji, and X. George Xu, “Optimizing the Monte Carlo neutron cross-section construction code, XSBench, to MIC and GPU platforms”, In Proceedings of the ANS MC2015 - Joint International Conference on Mathematics and Computation (M&C), Supercomputing in Nuclear Applications (SNA) and the Monte Carlo (MC) Methods, Nashville, TN, April 19-23, 2015.

N. Wolfe, C. D. Carothers, T. Liu and G. Xu, “Concurrent CPU, GPU and MIC Execution Algorithms for Archer Monte Carlo Code Involving Photon and Neutron Radiation Transport Problems”, In Proceedings of the ANS MC2015 - Joint International Conference on Mathematics and Computation (M&C), Supercomputing in Nuclear Applications (SNA) and the Monte Carlo (MC) Methods, Nashville, TN, April 19-23, 2015.

M. Mubarak, C. D. Carothers, P. Carns and R. B. Ross, “Using Massively Parallel Simulation for MPI Collective Communication Modeling in Extreme-Scale Networks”, In Proceedings of the 2014 Winter Simulation Conference, Savannah, GA, December 2014.

S. Snyder, P. Carns, R. B. Ross, J. Jenkins, K. Harms, M. Mubarak and C. D. Carothers, “A Case for Epidemic Fault Detection and Group Membership in HPC Storage Systems”, In Proceedings of the 5th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2014) as part of Supercomputing (SC'14), New Orleans, LA, November 2014.