Programming Language Seminar
Concurrency I-1: Java and C# Memory Models

Peter Sestoft
Friday 2013-10-25
www.itu.dk


Outline for today, 1
•  Why parallel programming?
•  Concurrency in Java and C#
   – Problem: shared mutable state (data, fields)
   – Solutions:
      •  Locks, synchronized
      •  AtomicInteger, AtomicLong, AtomicReference
•  Concurrency without locks
   – Weird behavior legal in Java and C# for speed
   – Safe publication
      •  The double meaning of synchronized
      •  The meaning of volatile
   – Immutability and visibility
      •  The double meaning of final

Why parallel programming?
•  Until 2003, CPUs became faster every year
   – So sequential software became faster every year
•  Today, CPUs are still 2-4 GHz as in 2003
   – So sequential software has not become faster
•  Instead, we get
   – Multicore: 2, 4, 8, ... CPUs on a chip
   – Vector instructions (4 x MAC, SIMD, SSE) in CPUs
   – Superfast Graphics Processing Units (GPU)
      •  96 simple CUDA cores in this ancient 2009 laptop
      •  3072 simple but fast CUDA cores in Nvidia Tesla K10
•  Herb Sutter: The free lunch is over (2005)
•  More speed requires parallel programming
   – But parallel programming is difficult and error-prone
   – ... with existing means: threads, synchronization, ...

A simple counter, incremented in parallel

Simple counter:

class BareCounter implements Counter {
  private int counter = 0;
  public void inc() { counter++; }
}

Many threads increment counter in parallel:

Thread[] ts = new Thread[threads];
for (int j=0; j<threads; j++)
  ts[j] = new Thread() {
    public void run() {
      for (int i=0; i<iterations; i++)
        counter.inc();
    }
  };
for (int j=0; j<threads; j++) ts[j].start();
for (int j=0; j<threads; j++) ts[j].join();

This goes wrong, of course. Why?
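One way to see why: counter++ is a read-modify-write in three steps (read the field, add one, write the result back). Two threads can read the same old value and both write old+1, so one increment is lost. The following is a minimal sketch of a harness that makes this observable. The class LostUpdateDemo, the method runInParallel and the get() method on Counter are assumptions added for illustration; the slides do not show the Counter interface.

```java
// Sketch only: LostUpdateDemo, runInParallel and get() are illustrative
// names that do not appear in the slides.
interface Counter {
  void inc();
  int get();
}

class BareCounter implements Counter {
  private int counter = 0;
  public void inc() { counter++; }   // read, add 1, write back: not atomic
  public int get() { return counter; }
}

class LostUpdateDemo {
  // Starts `threads` threads that each call inc() `iterations` times,
  // waits for all of them, and returns the final count.
  static int runInParallel(final Counter counter, int threads, final int iterations) {
    Thread[] ts = new Thread[threads];
    for (int j = 0; j < threads; j++)
      ts[j] = new Thread() {
        public void run() {
          for (int i = 0; i < iterations; i++)
            counter.inc();
        }
      };
    for (Thread t : ts) t.start();
    try {
      for (Thread t : ts) t.join();  // join() makes the threads' writes visible here
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return counter.get();
  }

  public static void main(String[] args) {
    int threads = 4, iterations = 1000000;
    int result = runInParallel(new BareCounter(), threads, iterations);
    System.out.println("expected " + threads * iterations + ", got " + result);
  }
}
```

On a multicore machine the reported count is typically lower than the expected one, but the exact number varies from run to run, and a run that happens to give the expected count proves nothing.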

Locks: Ensure mutual exclusion

Synchronized counter:

class SyncCounter implements Counter {
  private int counter = 0;
  public synchronized void inc() { counter++; }
}

This works. Why?

Really, an abbreviation for this code:

class SyncCounter implements Counter {
  private int counter = 0;
  public void inc() {
    synchronized(this) { counter++; }
  }
}

File ConcurrentCounters.java

Locking/synchronization
•  A lock does not guarantee anything in itself
•  Disciplined use of locks can lead to
   – Exclusive access to shared mutable state
   – And hence consistent update of the state
•  Easy to misuse
   – Forget synchronized one place => anarchy
•  Low performance under high contention
   – Context switches
•  Not compositional
   – Using multiple locks can lead to deadlock
   – Easy to avoid by always locking in the same order
   – But hard to know that libraries, GUI, ... do
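The rule "always lock in the same order" can be sketched as follows. The Account class, its id field and the transfer method are hypothetical names chosen for illustration, and ordering by a unique id is one possible way to fix a global lock order, not the only one.

```java
// Sketch only: Account, transfer and LockOrderDemo are hypothetical names.
class Account {
  private final int id;      // unique id, used only to fix a global lock order
  private long balance;

  Account(int id, long balance) { this.id = id; this.balance = balance; }

  synchronized long getBalance() { return balance; }

  // Always locks the account with the smaller id first, so concurrent
  // transfers a->b and b->a cannot each hold one lock and wait for the other.
  static void transfer(Account from, Account to, long amount) {
    Account first = from.id < to.id ? from : to;
    Account second = from.id < to.id ? to : from;
    synchronized (first) {
      synchronized (second) {
        from.balance -= amount;
        to.balance += amount;
      }
    }
  }
}

class LockOrderDemo {
  // Two threads transfer in opposite directions; returns the final balances.
  static long[] runTransfers(final int rounds) {
    final Account a = new Account(1, 1000), b = new Account(2, 1000);
    Thread t1 = new Thread() {
      public void run() { for (int i = 0; i < rounds; i++) Account.transfer(a, b, 1); }
    };
    Thread t2 = new Thread() {
      public void run() { for (int i = 0; i < rounds; i++) Account.transfer(b, a, 1); }
    };
    t1.start(); t2.start();
    try {
      t1.join(); t2.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return new long[] { a.getBalance(), b.getBalance() };
  }

  public static void main(String[] args) {
    long[] r = runTransfers(100000);
    System.out.println("a=" + r[0] + " b=" + r[1]);
  }
}
```

If transfer instead locked from first and then to, the two threads above could deadlock. The sketch only works because every caller goes through transfer, which is exactly the discipline that is hard to verify for library or GUI code.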

Atomic update (Java 5)

Atomic counter:

class AtomicCounter implements Counter {
  private final AtomicInteger counter = new AtomicInteger();
  public void inc() { counter.getAndIncrement(); }
}

•  This uses an atomic x86 instruction
•  Mono JITted code, from CIL, from C#
   – See file Interlocked.cs

Atomic variables
•  Java
   – java.util.concurrent.atomic package
   – AtomicInteger, AtomicReference<T>, ...
•  C#/.NET
   – System.Threading.Interlocked class
   – Add(ref int, int), Exchange<T>(ref T, T), ...
•  More efficient than locking/synchronized
   – When applicable
   – Translates directly to x86 instructions
•  We shall look more into these next week
   – In lock-free algorithms
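As a small preview of how such atomic operations are used, here is a sketch of an increment written as an explicit compare-and-set retry loop on an AtomicInteger. CasCounter and CasCounterDemo are hypothetical names; the loop imitates the effect of getAndIncrement and is not meant to show how the library implements it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: CasCounter and CasCounterDemo are hypothetical names.
class CasCounter {
  private final AtomicInteger counter = new AtomicInteger();

  // Returns the value before the increment, like getAndIncrement().
  public int inc() {
    while (true) {
      int old = counter.get();
      // Succeeds only if no other thread changed counter since the read.
      if (counter.compareAndSet(old, old + 1))
        return old;
      // Otherwise another thread got in between: read again and retry.
    }
  }

  public int get() { return counter.get(); }
}

class CasCounterDemo {
  // Starts `threads` threads that each call inc() `iterations` times.
  static int runInParallel(int threads, final int iterations) {
    final CasCounter counter = new CasCounter();
    Thread[] ts = new Thread[threads];
    for (int j = 0; j < threads; j++)
      ts[j] = new Thread() {
        public void run() {
          for (int i = 0; i < iterations; i++)
            counter.inc();
        }
      };
    for (Thread t : ts) t.start();
    try {
      for (Thread t : ts) t.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return counter.get();
  }

  public static void main(String[] args) {
    System.out.println(runInParallel(4, 1000000));
  }
}
```

No thread ever blocks here: a failed compareAndSet only means that some other thread made progress, which is the basic idea behind lock-free algorithms.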

Strange but legal behavior
•  Java Language Specification, sect 17.4:
   – Run these code fragments in two threads
   – Assume A and B shared fields, initially 0

     Thread 1      Thread 2
     r2=A;         r1=B;
     B=1;          A=2;

•  What are the possible results?
   – Strangely, r1==1 and r2==2 is possible
•  The Java (or C#/.NET) memory model
   – Does not guarantee sequential consistency
      •  Not between threads, only within each individual thread
   – Compiler may reorder and share memory accesses
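The two fragments can be tried out with a small harness such as the sketch below. ReorderDemo and runOnce are hypothetical names. Starting fresh threads for each trial makes the surprising outcome rare, so it may never show up on a given machine and JVM; its absence does not mean the outcome is forbidden.

```java
// Sketch only: ReorderDemo and runOnce are hypothetical names.
class ReorderDemo {
  int A = 0, B = 0;   // shared fields, deliberately neither volatile nor locked
  int r1, r2;         // each written by one thread, read only after join()

  // Runs the two fragments once in two threads and returns {r1, r2}.
  static int[] runOnce() {
    final ReorderDemo s = new ReorderDemo();
    Thread t1 = new Thread() { public void run() { s.r2 = s.A; s.B = 1; } };
    Thread t2 = new Thread() { public void run() { s.r1 = s.B; s.A = 2; } };
    t1.start(); t2.start();
    try {
      t1.join(); t2.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return new int[] { s.r1, s.r2 };
  }

  public static void main(String[] args) {
    int[][] counts = new int[2][2];   // indexed by r1 (0 or 1) and r2/2 (0 or 1)
    for (int i = 0; i < 20000; i++) {
      int[] r = runOnce();
      counts[r[0]][r[1] / 2]++;
    }
    for (int r1 = 0; r1 <= 1; r1++)
      for (int h = 0; h <= 1; h++)
        System.out.println("r1==" + r1 + " r2==" + 2 * h + ": " + counts[r1][h]);
  }
}
```

Under sequential consistency the combination r1==1 and r2==2 would be impossible, because each read would have to come after the other thread's later write.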

Why permit such strange behaviors?
•  More comprehensible example from JLS 17.4
   – Assume p, q shared, p==q and p.x==0

     Thread 1          Thread 2
     r1 = p;           r6 = p;
     r2 = r1.x;        r6.x = 3;
     r3 = q;
     r4 = r3.x;
     r5 = r1.x;

   – Classic compiler optimization: reuse r2 instead of reading r1.x again

     Thread 1          Thread 2
     r1 = p;           r6 = p;
     r2 = r1.x;        r6.x = 3;
     r3 = q;
     r4 = r3.x;
     r5 = r2;

     (p.x seems to switch from r2=0 to r4=3 and back to r5=0)

Sequential consistency
•  The volatile field modifier
   – avoids these compiler optimizations
   – offers a number of guarantees (in Java and C#)
   – but loses some performance
•  IntArray.IsSorted example, sequential
   – Files VolatileArray.java, VolatileArray.cs
   – Java, 0.085 sec non-volatile, 0.346 sec volatile
   – C# MS 0.194 sec non-volatile, 0.320 sec volatile
   – C# Mono 0.319 sec in both cases
   – In this particular case, Mono does no optimization
      •  See machine code, in source file

Java and C#
•  Java
   – Java Language Specification (JLS), Java 7, 2013: section 8.3.1.4 Volatile Fields (brief) and section 17.4 Memory Model (rather complicated)
   – JVM Specification: just refers to JLS
•  C#/.NET
   – C# Language Specification 17.3.4 Volatile Fields
   – CLI Ecma-335 standard section I.12.6.7:
      •  "volatile read has acquire semantics ... the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence"
      •  "volatile write has release semantics ... the write ... occur after any memory references ... prior to the write ..."

Thread-unsafe integer holder

public class MutableInteger {
  private int value;

  public int get() { return value; }

  public void set(int value) { this.value = value; }
}

•  One thread may never see the updates performed by another one

Thread-safe integer holder

public class MutableInteger {
  private int value;

  public synchronized int get() { return value; }

  public synchronized void set(int value) { this.value = value; }
}

•  Locking (synchronized) has two effects:
   – Mutual exclusion
   – Visibility of memory updates: all fields visible to thread A before it releases a lock are visible to thread B after it acquires the same lock ("synchronizes")

Visibility by synchronization

[Figure from Goetz p. 37: lock "release" in one thread followed by lock "acquire" in another]

Another thread-safe integer holder?

public class MutableInteger {
  private volatile int value;
  public int get() { return value; }
  public void set(int value) { this.value = value; }
}

Not in the book, but should work

•  The volatile modifier has one effect:
   – Visibility of memory updates: all fields visible to thread A before it writes the volatile field are visible to thread B after it reads that field (it "synchronizes")
•  Stronger guarantee than in C/C++
   – Affects visibility of all fields, not just the volatile
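That a volatile write and read affect the visibility of other, non-volatile fields can be illustrated with the sketch below. VolatilePublishDemo, payload and ready are hypothetical names introduced for this example.

```java
// Sketch only: VolatilePublishDemo, payload and ready are hypothetical names.
class VolatilePublishDemo {
  private int payload = 0;                  // ordinary field
  private volatile boolean ready = false;   // volatile flag

  void writer() {
    payload = 42;   // ordinary write ...
    ready = true;   // ... published by the volatile write that follows it
  }

  int reader() {
    while (!ready) { }   // volatile read; spins until the writer's write is seen
    return payload;      // after seeing ready==true, guaranteed to see 42
  }

  // Runs reader and writer in two threads; returns what the reader saw.
  static int run() {
    final VolatilePublishDemo d = new VolatilePublishDemo();
    final int[] seen = new int[1];
    Thread r = new Thread() { public void run() { seen[0] = d.reader(); } };
    Thread w = new Thread() { public void run() { d.writer(); } };
    r.start(); w.start();
    try {
      r.join(); w.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

If ready were not volatile, the reader would be allowed to spin forever, since nothing would force it ever to see the write to ready, and even if it did leave the loop it could still read payload as 0.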

C#/.NET
•  CLI Ecma-335 standard section I.12.6.7:
   – "A volatile write has release semantics ... the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence"
   – "volatile read has acquire semantics ... the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence"
•  So same as Java: volatile write+read has the visibility effect of lock release+acquire
   – (but not the mutual exclusion effect, of course)

Goetz factorization servlet example: Stateless servlet

public class StatelessFactorizer ... implements Servlet {
  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    BigInteger[] factors = factor(i);
    encodeIntoResponse(resp, factors);
  }

  BigInteger extractFromRequest(ServletRequest req) { ... }

  BigInteger[] factor(BigInteger i) { ... }

  void encodeIntoResponse(ServletResponse resp, ...) { ... }
}

•  No concurrent access to any shared state
•  All state is thread-confined (local variables)

http://www.javaconcurrencyinpractice.com/listings.html

Goetz factorization servlet example: Count accesses in shared int

public class UnsafeCountingFactorizer ... {
  private long count = 0;               // Shared state

  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    BigInteger[] factors = factor(i);
    ++count;
    encodeIntoResponse(resp, factors);
  }
}

Unsafe

•  Concurrent access to shared mutable state
•  Unsafe because the ++count operation is not atomic
   – Risk of lost updates

Goetz factorization servlet example: Count accesses with atomic int

public class CountingFactorizer ... {
  private final AtomicLong count = new AtomicLong(0);   // Shared state

  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    BigInteger[] factors = factor(i);
    count.incrementAndGet();
    encodeIntoResponse(resp, factors);
  }
}

Safe

•  Concurrent access to shared mutable state
•  Safe because operation is atomic
   – No lost updates
•  Could we use synchronized instead?

Goetz factorization servlet example: Cache last factorization

public class UnsafeCachingFactorizer ... {
  private final AtomicReference<BigInteger> lastNumber = ...;
  private final AtomicReference<BigInteger[]> lastFactors = ...;

  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    if (i.equals(lastNumber.get()))
      encodeIntoResponse(resp, lastFactors.get());
    else {
      BigInteger[] factors = factor(i);
      lastNumber.set(i);
      lastFactors.set(factors);
      encodeIntoResponse(resp, factors);
    }
  }
}

Unsafe, may violate invariant

•  Invariant: lastNumber = product of lastFactors
•  Can we use synchronized here?

Goetz factorization servlet example: Cache last factorization, I

public class CachedFactorizer ... {
  private BigInteger lastNumber;
  private BigInteger[] lastFactors;

  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    BigInteger[] factors = null;
    synchronized (this) {
      if (i.equals(lastNumber))
        factors = lastFactors.clone();
    }
    if (factors == null) {
      factors = factor(i);
      synchronized (this) {
        lastNumber = i;
        lastFactors = factors.clone();
      }
    }
    encodeIntoResponse(resp, factors);
  }
}

Preserves invariant

Why needed?

Immutable factor cache

public class OneValueCache {
  private final BigInteger lastNumber;
  private final BigInteger[] lastFactors;

  public OneValueCache(BigInteger i, BigInteger[] factors) {
    lastNumber = i;
    // Null check added: new OneValueCache(null, null) would otherwise
    // throw NullPointerException at factors.length
    lastFactors = factors == null ? null : Arrays.copyOf(factors, factors.length);
  }

  public BigInteger[] getFactors(BigInteger i) {
    if (lastNumber == null || !lastNumber.equals(i))
      return null;
    else
      return Arrays.copyOf(lastFactors, lastFactors.length);
  }
}

•  Final fields, and
•  instance-private copies of arrays, and
•  BigInteger instances are immutable

Goetz factorization servlet example: Cache last factorization, II

public class VolatileCachedFactorizer ... {
  private volatile OneValueCache cache = new OneValueCache(null, null);

  public void service(ServletRequest req, ServletResponse resp) {
    BigInteger i = extractFromRequest(req);
    BigInteger[] factors = cache.getFactors(i);
    if (factors == null) {
      factors = factor(i);
      cache = new OneValueCache(i, factors);
    }
    encodeIntoResponse(resp, factors);
  }
}

•  Volatile field cache ensures visibility
•  Immutable cache object
   – avoids shared mutable state and ensures visibility

NB!
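To experiment with this pattern outside a servlet container, a servlet-free variant can be sketched as below. CachingFactorizer, factorsOf and the naive trial-division factor method are assumptions made for illustration, and the null check in the OneValueCache constructor is added here so that the initial empty cache can be constructed.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: CachingFactorizer, factorsOf and factor are hypothetical.
class OneValueCache {
  private final BigInteger lastNumber;
  private final BigInteger[] lastFactors;

  OneValueCache(BigInteger i, BigInteger[] factors) {
    lastNumber = i;
    lastFactors = factors == null ? null : Arrays.copyOf(factors, factors.length);
  }

  BigInteger[] getFactors(BigInteger i) {
    if (lastNumber == null || !lastNumber.equals(i)) return null;
    return Arrays.copyOf(lastFactors, lastFactors.length);
  }
}

class CachingFactorizer {
  private volatile OneValueCache cache = new OneValueCache(null, null);

  BigInteger[] factorsOf(BigInteger i) {
    // One volatile read gives a consistent (number, factors) snapshot.
    BigInteger[] factors = cache.getFactors(i);
    if (factors == null) {
      factors = factor(i);
      // One volatile write publishes number and factors together.
      cache = new OneValueCache(i, factors);
    }
    return factors;
  }

  // Naive trial division; intended for small positive numbers only.
  static BigInteger[] factor(BigInteger i) {
    List<BigInteger> fs = new ArrayList<BigInteger>();
    BigInteger n = i, d = BigInteger.valueOf(2);
    while (d.multiply(d).compareTo(n) <= 0) {
      if (n.mod(d).signum() == 0) { fs.add(d); n = n.divide(d); }
      else d = d.add(BigInteger.ONE);
    }
    if (n.compareTo(BigInteger.ONE) > 0) fs.add(n);
    return fs.toArray(new BigInteger[0]);
  }

  public static void main(String[] args) {
    CachingFactorizer f = new CachingFactorizer();
    System.out.println(Arrays.toString(f.factorsOf(BigInteger.valueOf(60))));
  }
}
```

Two threads may still both compute the factors of the same number and overwrite each other's cache object. That costs some duplicated work but never breaks the invariant, because a reader always sees one complete OneValueCache.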

Semantics of final fields
•  Final has two effects
   – field cannot be updated after initialization, and
   – field's value is visible after construction
•  Java Language Specification 17.5: A thread that can only see a reference to an object after [that object's constructor has finished] is guaranteed to see the correctly initialized values for that object's final fields
•  This is similar to volatile fields
•  But the JIT compiler can perform lots of optimizations (caching, ...) on final fields that are not possible for volatile fields

JLS example 17.5-1

class FinalFieldExample {
  final int x;
  int y;
  static FinalFieldExample f;

  public FinalFieldExample() {
    x = 3;
    y = 4;
  }

  // Thread 1. Writes to f after constructor finished
  static void writer() {
    f = new FinalFieldExample();
  }

  // Thread 2
  static void reader() {
    if (f != null) {
      int i = f.x; // guaranteed to see 3
      int j = f.y; // could see 0
    }
  }
}

•  What about C#/.NET readonly fields?
   – No mention found in C# Language Specification (readonly) or Ecma-335 CLI Specification (initonly)
   – In fact, no such guarantee intended, see mails from Microsoft (Carol Eidt and Eric Eilebrecht) 2012-12-04

Visibility of memory updates
•  Caused by synchronized/lock
•  Caused by volatile
•  Caused by final (visible after construction)
•  Caused by CAS and similar (next week)
•  Caused by concurrent collections, in
   – Java package java.util.concurrent
   – .NET namespace System.Collections.Concurrent
   – and older synchronized collections
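The last point can be illustrated with a hand-off through a BlockingQueue from java.util.concurrent: what a thread did before putting an object into the queue is visible to the thread that takes it out. The classes Message and QueueHandoffDemo are hypothetical names for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: Message and QueueHandoffDemo are hypothetical names.
class Message {
  int value;   // deliberately mutable: neither final nor volatile
}

class QueueHandoffDemo {
  // The producer fills in a Message and hands it over through the queue;
  // returns the value the consumer saw.
  static int run() {
    final BlockingQueue<Message> queue = new ArrayBlockingQueue<Message>(1);
    final int[] seen = new int[1];
    Thread producer = new Thread() {
      public void run() {
        Message m = new Message();
        m.value = 42;                 // ordinary write before put()
        try {
          queue.put(m);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
    Thread consumer = new Thread() {
      public void run() {
        try {
          seen[0] = queue.take().value;   // sees 42: put() happens-before take()
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
    consumer.start(); producer.start();
    try {
      consumer.join(); producer.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

The guarantee covers only writes made before the put. If the producer kept modifying the Message after handing it over, those later writes would again need their own synchronization.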


Reading
•  Week 1 (this week)
   – Read Goetz et al.: Java Concurrency in Practice, chapters 1, 2, 3, 4, 5
   – Look at Java Language Specification, section 17.4 http://docs.oracle.com/javase/specs/jls/se7/jls7.pdf
•  Week 2
   – Goetz et al.: Java Concurrency in Practice, chapter 15
   – Michael and Scott: Simple, fast, and practical ...
   – Herlihy & Shavit: The Art of Multiprocessor Programming, chapters 3 and 9