E81 CSE 532S: Advanced Multi-Paradigm Software Development
Christopher Gill and Venkita Subramonian
Department of Computer Science and Engineering
Washington University, St. Louis
[email protected]
Deadlock Avoidance in Pattern Oriented Software Architecture
Main Themes of this Talk
• Vertical vs. horizontal design of architectures
• Consider how design choices have side effects
• Design scenario case study
  – A distributed request-response system
  – Concurrency as a scarce resource
  – Avoiding deadlock efficiently
• Also illustrative as a Service Access and Configuration pattern language
  – Interceptor to manipulate concurrency at key points
  – Component Configurator to install the DA protocol
Motivating Scenario
• Request-response client-server applications
  – E.g., CORBA, N-tier web servers, etc.
  – Any “component” can send requests and obtain results
• Distributed data and analysis components
  – Requests may be sent locally or remotely
  – Processing a request may require making new requests
  – Thus, request chains may span endsystem boundaries
• Raises new concerns related to liveness and safety
  – Assume a reactive/multithreaded concurrency architecture
    • Reactor threads service upcalls to clients
    • Overall number of threads is bounded (e.g., due to overhead)
  – Local thread management decisions impact global behavior
    • E.g., WaitOnReactor leads to stacking of result handling
    • E.g., the Stackless approach increases request/result matching overhead
    • E.g., WaitOnConnection raises deadlock issues
Vertical Design of an Architecture
• From the HS/HA & LF lecture
  – Designed to handle streams of mixed duration requests
• Focused on interactions among local mechanisms
  – Concurrency and synchronization concerns
  – Hand-offs among threads
• Well suited for “hub and spokes” or “processing pipeline” style applications
• However, in some applications a distributed view is more appropriate
[Figure: Mixed Duration Request Handlers (MDRH) — a leader thread enqueues requests and hands off to follower threads via reactor thread chains]
Horizontal Design of an Architecture
• Application components are implemented as handlers
  – Use reactor threads to run input and output methods
  – Send requests to other handlers via sockets and upcalls
  – These in turn define key interception points end-to-end
• Example of a multi-host request/result chain
  – h1 to h2, h2 to h3, h3 to h4
[Figure: handlers h1 through h4 distributed across reactors r1, r2, and r3, connected by sockets]
WaitOnConnection Strategy
[Figure: client-server sequence diagram — the client-side reactor thread waits on the connection (step 3) while the servant processes the callback; deadlock can occur here]
• Handler waits on the socket connection for the reply
  – Makes a blocking call to the socket’s recv() method
• Benefits
  – No interference from other requests that arrive while the reply is pending
• Drawbacks
  – One less thread in the Reactor for new requests
  – Could allow deadlocks when upcalls are nested
WaitOnReactor Strategy
• Handler returns control to the reactor until the reply comes back
  – Reactor can keep processing other requests while replies are pending
• Benefits
  – Thread available, no deadlock
  – Thread stays fully occupied
• Drawbacks
  – Interleaving of request/reply processing
  – Interference from other requests issued while the reply is pending
[Figure: client-server sequence diagram — the client-side thread waits on the reactor instead of the connection, so deadlock is avoided]
Blocking with WaitOnReactor
• The Wait-on-Reactor strategy could cause interleaved request/reply processing
• The blocking factor could be large or even unbounded
  – Based on the upcall duration
  – And on the sequence of other intervening upcalls
• Blocking factors may affect real-time properties of other endsystems
  – Call chains can have a cascading blocking effect
[Figure: timeline — the f5 reply is queued while f3 and then f2 complete, so f5’s reply processing incurs a blocking factor due to f2]
“Stackless” WaitOnReactor Variant
• What if we didn’t “stack” processing of results?
  – But instead allowed them to be handled asynchronously as they are ready
  – Thanks to Caleb Hines for pointing out this exemplar from “Stackless Python”
• Benefits
  – No interference from other requests that arrive while the reply is pending
  – No risk of deadlock, as the thread still returns to the reactor
• Drawbacks
  – Significant increase in implementation complexity
  – Time and space overhead to match requests to results (hash maps, AO, pointer ACTs, etc. could help, though)
Could WaitOnConnection Be Used?
• Its main limitation is its potential for deadlock
  – Yet it offers low overhead and ease of use
• Could we make a system deadlock-free …
  – if we knew its call graph … and were careful about how threads were allowed to proceed?
Deadlock Problem in Terms of Call Graph
• Each reactor is assigned a color
• Deadlock can exist
  – If a call chain contains more than Kc segments of color C
  – Where Kc is the number of threads in the node with color C
  – E.g., f3-f2-f4-f5-f2 needs at least 2 & 1
[Figure: call graph with nodes f1 through f5]
From Venkita Subramonian and Christopher Gill, “A Generative Programming Framework for Adaptive Middleware”, 37th Hawaii International Conference on System Sciences (HICSS ’04)
Simulation Showing Thread Exhaustion
• Increasing the number of reactor threads may not always prevent deadlock
• Can model this formally (UPPAAL, IF)
[Figure: three flows (Flow1–Flow3) from Client1–Client3 through event handlers EH11–EH33 on Server1 and Server2, each server with its own reactor (Reactor1, Reactor2)]
Clients send requests
 3: Client3 : TRACE_SAP_Buffer_Write(13,10)
 4: Unidir_IPC_13_14 : TRACE_SAP_Buffer_Transfer(13,14,10)
 5: Client2 : TRACE_SAP_Buffer_Write(7,10)
 6: Unidir_IPC_7_8 : TRACE_SAP_Buffer_Transfer(7,8,10)
 7: Client1 : TRACE_SAP_Buffer_Write(1,10)
 8: Unidir_IPC_1_2 : TRACE_SAP_Buffer_Transfer(1,2,10)
Reactor1 makes upcalls to event handlers
10: Reactor1_TPRHE1 ---handle_input(2,1)---> Flow1_EH1
12: Reactor1_TPRHE2 ---handle_input(8,2)---> Flow2_EH1
14: Reactor1_TPRHE3 ---handle_input(14,3)---> Flow3_EH1
Flow1 proceeds
15: Time advanced by 25 units. Global time is 28
16: Flow1_EH1 : TRACE_SAP_Buffer_Write(3,10)
17: Unidir_IPC_3_4 : TRACE_SAP_Buffer_Transfer(3,4,10)
19: Reactor2_TPRHE4 ---handle_input(4,4)---> Flow1_EH2
20: Time advanced by 25 units. Global time is 53
21: Flow1_EH2 : TRACE_SAP_Buffer_Write(5,10)
22: Unidir_IPC_5_6 : TRACE_SAP_Buffer_Transfer(5,6,10)
Flow2 proceeds
23: Time advanced by 25 units. Global time is 78
24: Flow2_EH1 : TRACE_SAP_Buffer_Write(9,10)
25: Unidir_IPC_9_10 : TRACE_SAP_Buffer_Transfer(9,10,10)
27: Reactor2_TPRHE5 ---handle_input(10,5)---> Flow2_EH2
28: Time advanced by 25 units. Global time is 103
29: Flow2_EH2 : TRACE_SAP_Buffer_Write(11,10)
30: Unidir_IPC_11_12 : TRACE_SAP_Buffer_Transfer(11,12,10)
Flow3 proceeds
31: Time advanced by 25 units. Global time is 128
32: Flow3_EH1 : TRACE_SAP_Buffer_Write(15,10)
33: Unidir_IPC_15_16 : TRACE_SAP_Buffer_Transfer(15,16,10)
35: Reactor2_TPRHE6 ---handle_input(16,6)---> Flow3_EH2
36: Time advanced by 25 units. Global time is 153
37: Flow3_EH2 : TRACE_SAP_Buffer_Write(17,10)
38: Unidir_IPC_17_18 : TRACE_SAP_Buffer_Transfer(17,18,10)
39: Time advanced by 851 units. Global time is 1004
Solution: Deadlock Avoidance Protocols
• César Sánchez: PhD dissertation at Stanford
• Paul Oberlin: MS project here at WUSTL
• Avoid interactions leading to deadlock
  – a liveness property
• Like synchronization, achieved via scheduling
  – Upcalls are delayed until enough threads are ready
• But introduces small blocking delays – a timing property
  – In real-time systems, also a safety property
DA Protocol Overview
• Designed* and proven+ by César Sánchez, Henny Sipma, and Zohar Manna (Stanford)
• Regulates upcalls based on the number of available reactor threads and the call graph’s “thread height”
  – Does not allow exhaustion
• BASIC-P protocol implemented in the ACE TP Reactor
  – Using handle suspension and resumption
  – Backward compatible, minimal overhead

*Sánchez, Sipma, Subramonian, and Gill, “Thread Allocation Protocols for Distributed Real-Time and Embedded Systems”, FORTE 2005
+Sánchez, Sipma, Manna, Subramonian, and Gill, “On Efficient Distributed Deadlock Avoidance for Real-Time and Embedded Systems”, IPDPS 2006
[Figure: same topology as the thread-exhaustion simulation — Client1–Client3, Reactor1 and Reactor2 on Server1 and Server2, flows Flow1–Flow3 through event handlers EH11–EH33]
Choosing our First Patterns
• Two main issues must be addressed
  – How a reactor can know how many threads it is safe to allocate at once to a handler
  – How reactors can use that information at run-time
• For the first issue, we need a way to obtain call-graph depth (the number of threads needed)
  – Can specify this a priori (give a table to the reactor)
  – Can also ask objects to “call downstream” to obtain graph heights from their children
• Can use ACT and Interceptor to implement this
• Can use Component Configurator to decouple and enable installation of standard vs. (various) DA protocol services
Choosing our Next Patterns
• Second issue
  – How reactors can use that information at run-time
• Need to determine when it is OK to dispatch
  – Maintain counters of thread upcalls in progress
  – Can use Interceptor again to implement this
• Need to block threads until it is OK to proceed
  – Can use a form of Scoped Locking
    • But modify the guard condition so it only proceeds when safe
    • Can think of this as similar to leader election in L/F
• Need to record when threads complete
  – Again, scoped locking decreases the “in-use count”
  – And this may be done within an “upcall” interceptor
Timing Traces: DA Protocol at Work
• Timing traces from the model and from execution show the DA protocol regulating the flows to use available resources without deadlock
[Figure: per-flow timing traces for Flow1–Flow3 across reactors R1 and R2, showing event handlers EH11–EH33]
DA Blocking Delay (Simulated vs. Actual)
[Figure: model execution vs. actual execution traces, showing the blocking delay for Client2 and for Client3]
Overhead of ACE TP Reactor with DA
• Negligible overhead with no DA protocol
• Overhead increases with the number of event handlers, because of their suspension and resumption on protocol entry and exit
Concluding Remarks
• Horizontal design of architectures
  – Often as important as vertical design
  – Not as frequently encountered/addressed
• Design choices demand further effort
  – Needed deadlock avoidance theory and implementation to make WaitOnConnection an effective option
• That effort leads to further design choices
  – How to implement deadlock avoidance efficiently
• And those in turn lead to further design forces
  – For example, DA protocol blocking factors and costs