RCUArray: An RCU-like Parallel-Safe Distributed Resizable ... · •Not inherently thread-safe to...

RCUArray: An RCU-like Parallel-Safe Distributed Resizable Array

By Louis Jenkins

The ProblemParallel-Safe Resizing

• Not inherently thread-safe to access memory while it is being resized• Memory has to be ‘moved’ from the smaller storage into larger storage

• Concurrent loads and stores can result in undefined behavior• Stores after memory is moved can be lost entirely

• Loads and Stores after the smaller storage is reclaimed can produce undefined behavior

• Why not just synchronize access?• Not scalable

• What do we need?

• What do we need?1. Allow concurrent access to both smaller and larger storage

2. Ensure safe memory management of smaller storage

3. Ensure that stores to old memory are visible in larger storage

Read-Copy-Update (RCU)• Synchronization strategy that favors performance of readers over writers

• Read the current snapshot 𝑠

𝑆 = 𝑏1

• Read the current snapshot 𝑠• Copy 𝑠 to create 𝑠′

𝑆 = 𝑏1

𝑆′ = 𝑏1

• Update applied to s′…

𝑆 = 𝑏1

𝑆′ = 𝑏1, 𝑏2

• Update applied to s′, 𝑠′ becomes new current snapshot

𝑆 = 𝑏1

𝑆′ = 𝑏1, 𝑏2

• Update applied to s′, 𝑠′ becomes new current snapshot• Not always applicable in all situations

• Must be safe to access at least two different snapshots of the same data

𝑆 = 𝑏1 𝑆′ = 𝑏1, 𝑏2

Reader

Read-Copy-Update (RCU)

Read-Copy-Update

• Readers Concurrent with Readers

Reader-Writer Locks

• Readers Concurrent With Readers

• Synchronization strategy that favors performance of readers over writers• Read the current snapshot 𝑠• Copy 𝑠 to create 𝑠′

Read-Copy-Update

• Writers Mutually Exclusive with Writers

Reader-Writer Locks

Read-Copy-Update

• Readers Concurrent with Writers

Reader-Writer Locks

• Readers Mutually Exclusive with Writers

Distributed RCU• Privatization and Snapshots

• Each node in the cluster has its own local snapshotLocale #0 Locale #1

Locale #2 Locale #3

𝑆 = 𝑏1 𝑆 = 𝑏1

• Each node in the cluster has its own local snapshot

• All local snapshots point to the same block

Locale #0 Locale #1

Locale #2 Locale #3

𝑆 = 𝑏1 𝑆 = 𝑏1

• Reader Concurrency• Readers will read from local snapshot only

• All readers regardless of node will see same block

• All stores to 𝑏1 are seen by any snapshot or node

Locale #0 Locale #1

Locale #2 Locale #3

𝑆 = 𝑏1 𝑆 = 𝑏1

Reader Reader

• Writer Mutual Exclusion• Use a distributed lock

Locale #0 Locale #1

Locale #2 Locale #3

𝑆 = 𝑏1 𝑆 = 𝑏1

Reader Reader

• Perform each update local to each node

Locale #0 Locale #1

Locale #2 Locale #3

𝑆′ = 𝑏1, 𝑏2 𝑆′ = 𝑏1, 𝑏2

Reader Reader

• Perform each update local to each node

• Results• Fast and parallel-safe loads/stores across multiple nodes

• Allow for loads and stores to be immediately visible

• 40x faster resizing than naïve Block Distribution at 32-nodes

Locale #0 Locale #1

Locale #2 Locale #3

𝑆′ = 𝑏1, 𝑏2 𝑆′ = 𝑏1, 𝑏2

Reader Reader

RCUArray – Resizing Example

Set of readers 𝑅 begin using snapshot 𝑠

RCUArray – Resizing

Writer acquires Cluster Lock

Writer clones 𝑠 to create 𝑠′

𝑅 𝑠 𝑠′

Writer appends block 𝑏2 to 𝑠′𝑠 𝑠′

𝑠 𝑠′

Writer updates current snapshot to 𝑠′𝑠 𝑠′

𝑠 𝑠′

Set of readers 𝑅′ begin accessing 𝑠′𝑠 𝑠′

𝑠 𝑠′

𝑅′

Readers 𝑅 finish using 𝑠

𝑠 𝑠′

𝑏1 𝑏2

𝑠 𝑠′

𝑅′𝑅′

Reclaim 𝑠𝑠′

𝑏1 𝑏2

𝑠 𝑠′

𝑏1 𝑏2

𝑅′𝑅′

Writer releases cluster lock𝑠′

𝑏1 𝑏2

𝑅′𝑠′

𝑏1 𝑏2

𝑅′

Network Atomics vs Remote Execution Atomics• In Chapel, pointers to potentially remote memory are widened to 128-bits

• 64-bit Address, 32-bit Locale id, 32-bit Sub-locale id (NUMA)

• Cray’s Aeries NIC only supports 64-bit network atomic operations• Atomics via remote execution proves to be significantly slower than network atomics

• Distributed wait-free algorithms can scale with network atomics• Must have a low constant bounds in inter-node communications

Network Execution 26x faster (32 Nodes) Network Execution 20x faster (32 Nodes)

RCUArray as a Dynamic Heap

• Replacing Wide Pointers• Blocks have locality information

• 64-bits vs 128-bits

• Network Atomics

• Recycling Memory• Each node recycles indices to local

blocks

• Dynamic Heap• Parallel-Safe and Fast Resizing

• Distributed across multiple locales

• Great as a per data-structure heap

Conclusion• Chapel makes RCU easier…

• Lot of abstraction and language constructs• Privatization

• Parallel remote tasks

• Including Distributed RCU…

• RCUArray as a distribution • Exploring implementation under Domain map Standard Interface (DSI)

• Memory Management Related Efforts• Current efforts to add Quiescent State-Based “Garbage Collector” into language

• 75% finished runtime changes… but on hold

• Plans to introduce a Epoch-Based “Garbage Collector” as a Chapel module…

RCUArray: An RCU-like Parallel-Safe Distributed Resizable ... · •Not inherently thread-safe to...

Documents

Linked Lists A dynamically resizable, efficient implementation of a list

A Fast Concurrent and Resizable Robin Hood Hash Table

Gaming Desktop - Enabling Resizable BAR

Type-safe Off-heap Memory for Scaladownloads.typesafe.com/website/presentations/...Type-safe Off-heap Memory for Scala Denys Shabalin, LAMP/EPFL

How to create customizable, resizable graphics in MS Publisher for print or on the web

Oracle 12cR2 Database In-Memory: Adventures with ... · 12-10-2017 · Dynamically Resizable In-Memory Area •In 12.1.0.2, the In-Memory Area was in essence a static cache •Resizing

Morphing Upper Torso: A Resizable and Adjustable EVA ... - TDL

Watchdog: Hardware for Safe and Secure Manual Memory Management

Project Snowflake: Non-blocking Safe Manual Memory ... · 95 Project Snowflake: Non-blocking Safe Manual Memory Management in .NET MATTHEW PARKINSON, DIMITRIOS VYTINIOTIS, KAPIL VASWANI,

Experience With Safe Manual Memory-Management in Cyclonemwh/papers/ismm.pdf · Experience With Safe Manual Memory-Management in Cyclone ... BDW with type-safe stack allocation and

Hazard Pointers: Safe Memory Reclamation of Lock-Free Objects Maged M. Michael

Today Show How to improve memory- 6 mins Today Show How to improve memory- 6 mins sist_safety_mode=1&safe=active

Making the Java Memory Model Safe - IPD Snelting · Making the Java Memory Model Safe A:3 the memory model. In fact, I reuse the existing type-safety proofs from previous work [Lochbihler

Safe driving and memory loss

Project Snow ake: Non-blocking Safe Manual Memory ... · PDF fileProject Snow ake: Non-blocking Safe Manual Memory Management in .NET Matthew Parkinson Dimitrios Vytiniotis Kapil Vaswani

Safe Memory Regions for Big Data Processinggowthamk.github.io/docs/broom-techrep.pdf · Safe Memory Regions for Big Data Processing Gowtham Kaki Purdue University, USA gkaki@purdue.edu

Using Amazon EC2 in Computer and Network Security Lab … · 2. Use of Amazon EC2 “Amazon EC2 is a Web service that provides resizable compute ... (60.5 GB of memory, 88 EC2 compute

Watchdog: Hardware for Safe and Secure Manual Memory ... · Watchdog: Hardware for Safe and Secure Manual Memory Management and Full Memory Safety Santosh Nagarakatte Milo M. K. …

Sightline Enterprise Data Manager (EDM) Software Release Notes · 2014. 8. 11. · Sightline EDM Release Notes June 2014 3 Resizable Left-Side Bar Left-side menu bar is now resizable,

xCalls: Safe I/O in Memory Transactions