(do “Concurrency in Clojure”) (by (and “Alex” “Nithya”))

Clojure concurrency


DESCRIPTION

Concurrency concepts (STM) in Clojure, compared with existing implementations of concurrency in imperative languages


Page 1: Clojure concurrency

(do “Concurrency in Clojure”)

(by (and “Alex” “Nithya”))

Page 2: Clojure concurrency

Agenda

– Introduction

– Features

– Concurrency, Locks, Shared state

– STM

– Vars, Atoms, Refs, Agents

– Stock picker example

– Q&A

Page 3: Clojure concurrency

Introduction

What is Clojure?

– Lisp style

– Runs on JVM/CLR

Why Clojure?

– Immutable Persistent Data structures

– FP aspects

– Concurrency

– Open Source

– Short and sweet

[Diagram: Java Libraries, JVM, Evaluator, Clojure/REPL, Byte code]

public class StringUtils {
    public static boolean isBlank(String str) {
        int strLen;
        if (str == null || (strLen = str.length()) == 0) {
            return true;
        }
        for (int i = 0; i < strLen; i++) {
            if ((Character.isWhitespace(str.charAt(i)) == false)) {
                return false;
            }
        }
        return true;
    }
}

(defn blank? [s] (every? #(Character/isWhitespace %) s))

[Annotations on the defn above: fn name (blank?), parameters ([s]), body]

Page 4: Clojure concurrency

Immutable data structures

Functions as first class objects, closures

Java Interop

Tail Recursion

Features

(def a-vector [1 2 3])
(def a-list '(1 2 3))
(def a-map {:A "A"})
(def a-set #{"A"})
(defn add [x] (fn [y] (+ x y))) ; returns a closure over x

Page 5: Clojure concurrency

Lazy evaluation - abstract sequences + library

– "cons cell" - (cons 4 '(1 2 3))

Features

[Diagram: a seq item holds First and Rest, plus instructions to generate the next component]

(defn lazy-counter-iterate [base increment]
  (iterate (fn [n] (+ n increment)) base))

user=> (def iterate-counter (lazy-counter-iterate 2 3))
user=> (nth iterate-counter 1000000)

user=> (nth (lazy-counter-iterate 2 3) 1000000)
3000002

Page 6: Clojure concurrency

Mutable objects are the new spaghetti code

– Hard to understand, test, reason about

– Concurrency disaster

– Default architecture (Java/C#/Python/Ruby/Groovy)

State – You are doing it wrong

[Diagram: Object (Data + Behaviour) and Object 2 (Data + Behaviour)]

Page 7: Clojure concurrency

Mutable Variables

Identity points to a different state after the update which is supported via atomic references to values.

[Diagram: location: Chennai, then location: Bangalore]

Values are constants, they never change

(def location (ref ""))

Identity - has different states at different points in time

State - the value of an identity at a point in time
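The identity/state/value distinction can be sketched with the slide's `location` ref. This is a minimal illustration only; the name `snapshot` is an assumption introduced here.

```clojure
;; An identity (the ref) takes on different immutable values over time.
(def location (ref "Chennai"))

;; Capture the current state: a plain, immutable value.
(def snapshot @location)

;; Move the identity to a new state inside a transaction.
(dosync (ref-set location "Bangalore"))

(println snapshot)   ; the captured value is unaffected by the update
(println @location)  ; the identity now refers to the new value
```

Anyone holding `snapshot` keeps a stable value; only the ref itself moves on.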

Page 8: Clojure concurrency

Interleaving / parallel coordinated execution

Usual Issues

– Deadlock

– Livelock

– Race condition

UI should be functional with tasks running

Techniques - Locking, CAS, TM, Actors

Concurrency

Page 9: Clojure concurrency

One thread per lock - blocking

lock/synchronized (resource) { .. }

Cons

– Reduces concurrency

– Readers block readers

– What order? - deadlock, livelock

– Overlapping/partial operations

– Priority inversion

public class LinkedBlockingQueue<E> {
    public E peek() {
        final ReentrantLock takeLock = this.takeLock;
        takeLock.lock();
        try {
            Node<E> first = head.next;
            if (first == null)
                return null;
            else
                return first.item;
        } finally {
            takeLock.unlock();
        }
    }
}

Locks

Page 10: Clojure concurrency

A CAS operation takes three operands - a memory location (V), an expected old value (A), and a new value (B)

Basis for lock-free and wait-free algorithms

Deadlocks are avoided

Cons

– Complicated to implement

– JSR 166 - not intended to be used directly by most developers

public class AtomicInteger extends Number {
    public final int getAndSet(int newValue) {
        for (;;) {
            int current = get();
            if (compareAndSet(current, newValue))
                return current;
        }
    }

    public final boolean compareAndSet(int expect, int update) {
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }
}

Compare And Swap
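The same compare-and-swap pattern is available on Clojure atoms through `compare-and-set!`. The sketch below assumes a keyword-valued atom; `get-and-set!` is a hypothetical helper written for illustration, not part of clojure.core.

```clojure
;; An atom holding a keyword state (keywords are interned, so the
;; identity comparison done by compare-and-set! behaves as expected).
(def state (atom :idle))

;; Succeeds: the current value matches the expected value :idle.
(def first-attempt (compare-and-set! state :idle :running))

;; Fails: the atom now holds :running, so the expected value is stale.
(def second-attempt (compare-and-set! state :idle :stopped))

;; Hand-written retry loop in the style of AtomicInteger.getAndSet.
;; Hypothetical helper, shown only to make the retry explicit.
(defn get-and-set! [a new-value]
  (loop []
    (let [current @a]
      (if (compare-and-set! a current new-value)
        current
        (recur)))))

(def previous (get-and-set! state :stopped))
```

In everyday code `swap!` and `reset!` hide this loop; the explicit version only shows what they do underneath.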

Page 11: Clojure concurrency

Enhancing Read Parallelism

Multi-reader/Single -writer locks

– Readers don't block each other

– Writers wait for readers

CopyOnWrite Collections

– Read snapshot

– Copy & Atomic writes

– Expensive

– Multi-step writes still require locks

public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}

Page 12: Clojure concurrency

Threads modify shared memory

A thread doesn't bother about other threads

Records every read/write in a log

Analogous to database transactions

ACI(!D)

– Atomic -> All changes commit or rollback

– Consistency -> if validation fn fails transaction fails

– Isolation -> Partial changes in txn won't be visible to other threads

– Not durable -> changes are lost if s/w crashes or h/w fails

Software Transactional Memory
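Atomicity can be sketched with two refs and a transaction that fails midway. The ref names and the simulated failure are assumptions made for illustration.

```clojure
(def balance-a (ref 100))
(def balance-b (ref 0))

;; Both alters commit together, or neither does.
(dosync
  (alter balance-a - 30)
  (alter balance-b + 30))

;; An exception aborts the transaction; the first alter is discarded.
(def outcome
  (try
    (dosync
      (alter balance-a - 50)
      (throw (ex-info "simulated failure" {}))
      (alter balance-b + 50))
    (catch clojure.lang.ExceptionInfo _ :rolled-back)))
```

After the failed transaction the refs still hold the values committed by the first one.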

Page 13: Clojure concurrency

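Clojure Transactions

(defstruct stock :name :quantity)
(struct stock "CSK" 40)

[Diagram: three traders (Adam, Mr B, Minion) run transactions (Txn) against a shared StocksList holding CSK. Labels: Buy 10 (R ~ 40, U ~ 30); Buy 5 (R ~ 40, U ~ 35, txn fails and retries with R ~ 30, U ~ 25); Sell 10 (R ~ 30, U ~ 40); CSK ~ 30]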

Transaction Creation - (dosync (...))
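A minimal sketch of the stock scenario, assuming the stock list is a ref holding a map from symbol to quantity. The data shape and the `trade!` helper are assumptions, not the presenters' implementation.

```clojure
;; Shared stock list: symbol -> quantity available (assumed shape).
(def stocks (ref {"CSK" 40}))

(defn trade!
  "Adjusts the quantity of sym by delta inside a transaction."
  [sym delta]
  (dosync
    (alter stocks update sym + delta)))

;; Three traders act concurrently; a transaction that loses a
;; conflict is retried automatically by the STM.
(def trades
  [(future (trade! "CSK" -10))   ; buy 10
   (future (trade! "CSK" -5))    ; buy 5
   (future (trade! "CSK" 10))])  ; sell 10

;; Wait for every trade to finish.
(run! deref trades)
```

Whatever the interleaving, the final quantity is 40 - 10 - 5 + 10, because each transaction either commits against the latest value or retries.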

Page 14: Clojure concurrency

Clojure STM

Concurrency semantics for references

• Automatic/enforced

• No locks!

Clojure does not replace the Java thread system, rather it works with it.

Clojure functions (IFn) implement java.util.concurrent.Callable and java.lang.Runnable
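Because functions are both Runnable and Callable, they can be handed straight to the Java threading APIs. A small sketch follows; the var names are illustrative.

```clojure
(import '(java.util.concurrent Callable Executors))

;; A Clojure fn is a Runnable, so java.lang.Thread accepts it.
(def from-thread (promise))
(doto (Thread. (fn [] (deliver from-thread :ran-in-thread)))
  (.start)
  (.join))

;; A Clojure fn is also a Callable, so an ExecutorService can
;; return its result through a java.util.concurrent.Future.
(def from-pool
  (let [pool (Executors/newFixedThreadPool 2)
        ^Callable task (fn [] (+ 1 2))] ; hint selects submit(Callable)
    (try
      (.get (.submit pool task))
      (finally (.shutdown pool)))))
```

The `^Callable` hint matters because `submit` is overloaded for both Runnable and Callable, and a fn satisfies both.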

Page 15: Clojure concurrency

STM

Pros

– Optimistic, increased concurrency - no thread waiting

– Deadlock/livelock is prevented/handled by the transaction manager

– Data is consistent

– Simplifies conceptual understanding – less effort

Cons

– Overhead of transaction retrying

– Performance hit (<4 processors) from maintaining committed values, storing in-transaction values, and locks for committing transactions

– Cannot perform any operation that cannot be undone, including most I/O

– Solved using queues (Agents in Clojure)
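The agent-based workaround for side effects can be sketched as follows. Actions sent to an agent from inside a transaction are held until the transaction commits, so a retried transaction does not repeat them. The `audit-log` agent stands in for real I/O and is an assumption of this sketch.

```clojure
(def inventory (ref 40))
(def audit-log (agent []))  ; hypothetical log; stands in for real I/O

(defn buy! [qty]
  (dosync
    (alter inventory - qty)
    ;; The send is dispatched only if and when the transaction commits.
    (send audit-log conj {:bought qty})))

(buy! 10)
(await audit-log)  ; block until the queued action has run
```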

Page 16: Clojure concurrency

Persistent Data Structures

Immutable + maintain old versions

Structure sharing – not full copies

Thread/Iteration safe

Clojure data structures are persistent

Hash map and vector – array mapped hash tries (Bagwell)

Sorted map – red black tree

MVCC – Multi-version concurrency control

Support sequencing, meta-data

Pretty fast: Near constant time read access for maps and vectors (actually O(log32 n))
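A short illustration of persistence: "modifying" a collection returns a new version and leaves the old one intact. The names are illustrative.

```clojure
(def v1 [1 2 3])
(def v2 (conj v1 4))      ; new version; v1 is untouched

(def m1 {:a 1})
(def m2 (assoc m1 :b 2))  ; new version; m1 is untouched
```

Since `v1` and `m1` can never change, they can be shared across threads without locking.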

Page 17: Clojure concurrency

PersistentHashMap

32 children per node, so O(log32 n)

static interface INode {
    INode assoc(int shift, int hash, Object key, Object val, Box addedLeaf);
    LeafNode find(int hash, Object key);
}

[Diagram label: BitMapIndexedNode]

Page 18: Clojure concurrency

Concurrency Library

Coordinating multiple activities happening simultaneously

Reference Types

Refs

Atoms

Agents

Vars

               Uncoordinated    Coordinated
Synchronous    Var, Atom        Ref
Asynchronous   Agent

Page 19: Clojure concurrency

Vars

Per-thread mutables, atomic read/write

(def) is shared root binding – can be unbound

(binding) to set up a per-thread override

Bindings can only be used when def is defined at the top level

(set!) if per-thread binding

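(def ^:dynamic x 10) ; global (root) binding, shared by all threads

(defn get-val [] (+ x y)) ; y is assumed to be defined elsewhere

(defn f []
  (println x)
  (binding [x 2] (get-val)))

[Diagram: threads T1 and T2; a thread outside the binding can't see the bound value]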
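Thread isolation of bindings can be sketched like this. The names `*x*` and `read-x` are assumptions, and `^:dynamic` is required for `binding` in Clojure 1.3 and later.

```clojure
(def ^:dynamic *x* 10)   ; root binding, visible to every thread

(defn read-x [] *x*)

(def seen-by-other-thread (promise))

(def seen-in-binding
  (binding [*x* 2]
    ;; A plain Thread does not inherit this thread's bindings.
    (let [t (Thread. (fn [] (deliver seen-by-other-thread (read-x))))]
      (.start t)
      (.join t))
    (read-x)))
```

A raw `Thread` is used deliberately: `future` and agent actions convey the caller's bindings, so they would see the overridden value.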

Page 20: Clojure concurrency

Vars

Safe use mutable storage location via thread isolation

Thread specific Values

Setting thread local dynamic binding

Scenarios:

Used for constants and configuration variables such as *in*, *out*, *err*

Manually changing a program while running (def max-users 10)

Functions defined with defn are stored in Vars, which enables re-definition of functions – AOP-like, e.g. enabling logging

user=> (def ^:dynamic variable 1)
#'user/variable
user=> (.start (Thread. (fn [] (println variable))))
nil
1

user=> (defn print-variable [] (println variable))
#'user/print-variable
user=> (.start (Thread. (fn [] (binding [variable 42] (print-variable)))))
nil
42

(set! var-symbol value)

(defn ^:dynamic say-hello [] (println "Hello"))
(binding [say-hello #(println "Goodbye")] (say-hello))

Page 21: Clojure concurrency

Vars...

Augmenting the behavior

– Memoization – to wrap functions

Has great power

Should be used sparingly

Not pure functions

(ns test-memoization)

(defn ^:dynamic triple [n]
  (Thread/sleep 100)
  (* n 3))

(defn invoke_triple []
  (map triple [1 2 3 4 4 3 2 1]))

(time (dorun (invoke_triple)))
; -> "Elapsed time: 801.084578 msecs"

(time (dorun (binding [triple (memoize triple)] (invoke_triple))))
; -> "Elapsed time: 401.87119 msecs"

Page 22: Clojure concurrency

Atoms

Single value shared across threads

Reads are atomic

Writes are atomic

Coordinated updates across multiple atoms are not possible

(def current-track (atom "Ooh la la la"))

(deref current-track) or @current-track

(reset! current-track "Humma Humma")

(reset! current-track {:title "Humma Humma", :composer "What???"})

(def current-track (atom {:title "Ooh la la la", :composer "ARR"}))

(swap! current-track assoc :title "Hosana")
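A small sketch of why `swap!` is safe under contention: several threads update one atom and no increment is lost. The counter and the thread counts are illustrative.

```clojure
(def hits (atom 0))

;; Ten threads each increment the atom 1000 times.
;; swap! retries on contention, so no update is lost.
(def workers
  (doall (for [_ (range 10)]
           (future (dotimes [_ 1000] (swap! hits inc))))))

;; Wait for all workers to finish.
(run! deref workers)
```

Because `swap!` may call its function more than once, that function should be free of side effects.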

Page 23: Clojure concurrency

Refs

Mutable reference to an immutable state

Shared use of mutable storage location via STM

ACI and retry properties

Reads are atomic

Writes inside an STM txn

Page 24: Clojure concurrency

Refs in Txn

In-transaction values

• Maintained by each txn

• Only visible to code running in the txn

• Committed at end of txn if successful

• Cleared after each txn try

Committed values

• Maintained by each Ref in a circular linked-list (tvals field)

• Each has a commit "timestamp" (point field in TVal objects)

Page 25: Clojure concurrency

Changing Ref

(ref-set ref new-value)

(alter ref function arg*)

A conflicting change by another txn triggers a txn retry

Commute

( commute ref function arg*)

Order of changes doesn't matter

Another txn change will not invoke retry

Commit -> all commute fns invoked using latest commit values

Example:Adding objects to collection

(def account1 (ref 1000))
(def account2 (ref 2000))

(defn transfer
  "transfers amount of money from a to b"
  [a b amount]
  (dosync
    (alter a - amount)
    (alter b + amount)))

(transfer account1 account2 300)
(transfer account2 account1 50)

;@account1 -> 750
;@account2 -> 2250
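Where the order of updates does not matter, `commute` can replace `alter`. A minimal sketch with a shared counter follows; the names are assumptions.

```clojure
(def page-views (ref 0))

(defn record-view! []
  (dosync
    ;; Increments commute: the total is the same in any order, so
    ;; concurrent transactions need not retry on each other's account.
    (commute page-views inc)))

(def viewers
  (doall (for [_ (range 20)]
           (future (dotimes [_ 50] (record-view!))))))

;; Wait for all viewers to finish.
(run! deref viewers)
```

`commute` suits only functions whose results do not depend on ordering; a transfer that must check a balance first still calls for `alter`.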

Page 26: Clojure concurrency

Validators

Validators:

Invoked when the transaction is to commit

When fails -> IllegalStateException is thrown

(ref initial-value :validator validator-fn)

user=> (def my-ref (ref 5))
#'user/my-ref
user=> (set-validator! my-ref (fn [x] (< 0 x)))
nil
user=> (dosync (alter my-ref - 10))
#<CompilerException java.lang.IllegalStateException: Invalid Reference State>
user=> @my-ref
5
user=> (dosync (alter my-ref - 10) (alter my-ref + 15))
10
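The validator behaviour can be sketched in script form, catching the exception instead of letting it reach the REPL. The names are illustrative.

```clojure
;; The validator is attached at creation and checked at commit time.
(def stock-level (ref 5 :validator (fn [x] (pos? x))))

;; 5 - 10 = -5 fails validation, so the commit is rejected.
(def failed
  (try
    (dosync (alter stock-level - 10))
    :committed
    (catch IllegalStateException _ :rejected)))

(def after-failure @stock-level)

;; Intermediate values may be invalid; only the committed value
;; (5 - 10 + 15 = 10) is validated.
(def committed
  (dosync
    (alter stock-level - 10)
    (alter stock-level + 15)))
```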

Page 27: Clojure concurrency

Watches

Called when state changes

Called on an identity

Example:

( add-watch identity key watch-function)

(defn function-name [key identity old-val new-val] expressions)

(remove-watch identity key)

user=> (defn my-watch [key identity old-val new-val]
         (println (str "Old: " old-val))
         (println (str "New: " new-val)))
#'user/my-watch
user=> (def my-ref (ref 5))
#'user/my-ref
user=> (add-watch my-ref "watch1" my-watch)
#<Ref 5>
user=> (dosync (alter my-ref inc))
Old: 5
New: 6
6
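Watches work on any reference type. A sketch with an atom that records each change follows; the names and the `:recorder` key are assumptions.

```clojure
(def temperature (atom 20))
(def history (atom []))

;; The watch fn receives the key, the reference, and the old and new values.
(add-watch temperature :recorder
           (fn [_key _ref old-val new-val]
             (swap! history conj [old-val new-val])))

(reset! temperature 25)
(swap! temperature inc)

;; After removal, further changes are no longer recorded.
(remove-watch temperature :recorder)
(reset! temperature 0)
```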

Page 28: Clojure concurrency

Other features...

Write skew

Ensure

– Doesn't change the state of ref

– Forces a txn retry if ref changes

– Ensures that ref is not changed during the txn
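A sketch of `ensure` guarding an invariant that spans two refs. The account names and the rule that the combined balance must stay non-negative are assumptions chosen for illustration.

```clojure
(def checking (ref 100))
(def savings  (ref 100))

(defn withdraw!
  "Withdraws from `from` only if the combined balance stays non-negative."
  [from other amount]
  (dosync
    ;; ensure protects `other` from concurrent writes until this txn
    ;; ends, even though this txn only reads it.
    (let [total (+ @from (ensure other))]
      (if (>= (- total amount) 0)
        (do (alter from - amount) :ok)
        :insufficient))))

(def first-result  (withdraw! checking savings 150))
(def second-result (withdraw! savings checking 150))
```

Without `ensure`, two concurrent withdrawals could each read the other account, both pass the check, and together break the invariant.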

Page 29: Clojure concurrency

Agents

Agents share asynchronous independent changes between threads

State changes through actions (functions)

Actions are sent through send, send-off

Agents run in thread pools

- send uses a fixed thread pool sized to the number of processors, for CPU-bound actions

- send-off uses an expandable thread pool, for actions that may block (e.g. I/O)

Page 30: Clojure concurrency

Agents

Only one action per agent runs at a time

Actions of all Agents get interleaved amongst threads in a thread pool

Agents are reactive - no imperative message loop and no blocking receive

Page 31: Clojure concurrency

Agents

(def my-agent (agent 5))

(send my-agent + 3)

(send my-agent / 0)

(send my-agent + 1)

java.lang.RuntimeException: Agent is failed, needs restart

(agent-error my-agent)

(restart-agent my-agent 5 :clear-actions true)
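The agent lifecycle above can be sketched as a script. `wait-for-error` is a hypothetical polling helper added here, since `await` must not be used on an agent that is about to fail.

```clojure
(def tally (agent 5))

(send tally + 3)
(await tally)                    ; blocks until queued actions finish
(def after-send @tally)

;; An action that throws puts the agent into a failed state.
(send tally / 0)

;; Hypothetical helper: poll until the agent reports an error or time out.
(defn wait-for-error [a timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        (agent-error a) true
        (> (System/currentTimeMillis) deadline) false
        :else (do (Thread/sleep 10) (recur))))))

(def failed? (wait-for-error tally 5000))
(def value-while-failed @tally)  ; state from before the failing action

;; Reset the state, drop pending actions, and resume.
(restart-agent tally 5 :clear-actions true)
(send tally + 1)
(await tally)
```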

Page 32: Clojure concurrency

Concurrency

Page 33: Clojure concurrency

Parallel Programming

(defn heavy [f]
  (fn [& args]
    (Thread/sleep 1000)
    (apply f args)))

(time (+ 5 5))
;>>> "Elapsed time: 0.035009 msecs"
(time ((heavy +) 5 5))
;>>> "Elapsed time: 1000.691607 msecs"

pmap

(time (doall (map (heavy inc) [1 2 3 4 5])))
;>>> "Elapsed time: 5001.055055 msecs"
(time (doall (pmap (heavy inc) [1 2 3 4 5])))
;>>> "Elapsed time: 1004.219896 msecs"

(pvalues (+ 5 5) (- 5 3) (* 2 4))
(pcalls #(+ 5 2) #(* 2 5))

Page 34: Clojure concurrency

– Process

Pid = spawn(fun() -> loop(0) end)
Pid ! Message,
.....

– Receiving Process

receive
    Message1 ->
        Actions1;
    Message2 ->
        Actions2;
    ...
after Time ->
    TimeOutActions
end

[Diagram: processes on two machines exchanging immutable messages]

Erlang

Actors - a process that executes a function.

Process - a lightweight user-space thread.

Mailbox - essentially a queue with multiple producers

Page 35: Clojure concurrency

Actor Model

In an actor model, state is encapsulated in an actor (identity) and can only be affected/seen via the passing of messages (values).

In an asynchronous system like Erlang’s, reading some aspect of an actor’s state requires sending a request message, waiting for a response, and the actor sending a response.

Principles

* No shared state

* Lightweight processes

* Asynchronous message-passing

* Mailboxes to buffer incoming messages

* Mailbox processing with pattern matching

Page 36: Clojure concurrency

Actor Model

Advantages

Lots of computers (= fault tolerant scalable ...)

No locks

Location Transparency

Not for Clojure

Actor model was designed for distributed programs – location transparency

Complex programming model involving 2 message conversation for simple reads

Potential for deadlock, since receiving a message blocks

Copy structures to be sent

Coordinating between multiple actors is difficult
