Failure Handling in a Modal Language
Nels Eric Beckman
Research Talk
Institute for Software Research
October 30, 2006
Failure Handling in a Modal Language
ISR
Claims Made in this Talk
• ML5 is an elegant language for programming distributed systems.
• In the face of node failure, the meaning of ML5 programs becomes unclear.
• We propose extensions to ML5 that make their meaning clear.
  • (In reality, this research is a work in progress.)
ML5
• A Programming Language for Distributed Systems
• Based on a Modal Logic
  • i.e. A Logic With an Embedded Notion of Place
• Tom Murphy’s Thesis Work
• Targeted for Grid Programming
ML5, Briefly...
• Allows Hosts to Send ‘Thunks’ to One Another for Execution
  • In practice, code can be more cleanly decomposed.
• Has an Advanced Type System
  • Location-specific resources can be typed as such.
RPC-Style Distributed Programming
[Animated figure, slides 5–14. Labels: PC, Host, Active thread, Blocked thread, Message.
One host shows the code: fun a = rpc(“b”,19.x.x.x) + r
The other host shows the code: fun b = return x;
Messages shown during the animation, in order: rpc “b”, then ret x.]
ML5 Illustration
[Animated figure, slides 15–22. Labels: PC, Host, Location of thread, Migration of thread.]
Example
• Remotely Finding List’s Sum (RPC)
Server Code:
class ListServ {
    List<Integer> myList = new ...
    List<Integer> getList() {
        return myList;
    }
}
Example
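• Remotely Finding List’s Sum (RPC)
Client Code:
class ListClient {
    ListServerStub myServ = new ...
    public void foo() {
        int count = 0;  // running sum; was undeclared on the slide
        List<Integer> list = myServ.getList();  // the whole list crosses the network
        for (Integer item : list) {
            count += item.intValue();
        }
        if (count > 40)  // "greater than 40", matching the later slides
            ...
    }
}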
Example
• Remotely Finding List’s Sum (RPC)
• To Fix Should We:
  • Add a new server operation that returns true if a list’s sum is greater than 40?
    • Weird if operation is only used once.
    • We wouldn’t structure application this way in a centralized setting.
  • Bite the performance bullet and send the whole list?
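The two options above can be sketched in plain Java. This is only an illustration under stated assumptions: SketchListServer, sumGreaterThan, and the integersSent counter are hypothetical names, and the "server" is an ordinary in-process object rather than a real RPC stub.

```java
import java.util.List;

// Hypothetical stand-in for the remote list server; no real RPC framework is involved.
class SketchListServer {
    private final List<Integer> myList;
    int integersSent = 0;  // counts list elements shipped to the client

    SketchListServer(List<Integer> myList) {
        this.myList = myList;
    }

    // Option 2 on the slide: send the whole list to the client.
    List<Integer> getList() {
        integersSent += myList.size();
        return List.copyOf(myList);
    }

    // Option 1 on the slide: a special-purpose operation; only a boolean is returned.
    boolean sumGreaterThan(int threshold) {
        int sum = 0;
        for (int item : myList) {
            sum += item;
        }
        return sum > threshold;
    }
}

class RpcOptionsDemo {
    // Client-side check: fetch the list, then sum it locally.
    static boolean clientSideCheck(SketchListServer server) {
        int count = 0;
        for (int item : server.getList()) {
            count += item;
        }
        return count > 40;
    }

    public static void main(String[] args) {
        SketchListServer server = new SketchListServer(List.of(10, 20, 15));
        System.out.println(clientSideCheck(server));    // true (10 + 20 + 15 = 45)
        System.out.println(server.sumGreaterThan(40));  // true
        System.out.println(server.integersSent);        // 3, all from the client-side check
    }
}
```

Both calls give the same answer; they differ in how much data moves and in whether the server interface grows an operation that exists for a single caller.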
Example
• Remotely Finding List’s Sum (ML5)
Before:
fun foo remote_host remote_list_ref =
  let fun sum a_list =
        foldl op+ 0 a_list
  in
    if sum (
         get[remote_host]( !remote_list_ref )
       ) > 40
    then true
    else false
  end
Example
• Remotely Finding List’s Sum (ML5)
After:
fun foo remote_host remote_list_ref =
  let fun sum a_list =
        foldl op+ 0 a_list
  in
    get[remote_host](
      if sum ( !remote_list_ref ) > 40
      then true
      else false
    )
  end
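The difference between the ‘Before’ and ‘After’ placements of get can be mimicked in Java by passing a function to the host that owns the data. This is a rough single-process sketch, not ML5’s semantics: SketchHost, fetchList, evalHere, and integersShipped are all assumed names made up for illustration.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical model of a host that owns a list; names are illustrative only.
class SketchHost {
    private final List<Integer> localList;
    int integersShipped = 0;  // list elements that left this host

    SketchHost(List<Integer> localList) {
        this.localList = localList;
    }

    // "Before": the list is dereferenced here and shipped back to the caller.
    List<Integer> fetchList() {
        integersShipped += localList.size();
        return List.copyOf(localList);
    }

    // "After": the caller's code runs here, next to the data; only its result travels.
    <R> R evalHere(Function<List<Integer>, R> body) {
        return body.apply(localList);
    }
}

class GetPlacementDemo {
    static int sum(List<Integer> aList) {
        int total = 0;
        for (int item : aList) {
            total += item;
        }
        return total;
    }

    // Mirrors the "Before" version: sum is computed on the calling host.
    static boolean fooBefore(SketchHost remote) {
        return sum(remote.fetchList()) > 40;
    }

    // Mirrors the "After" version: the whole comparison runs on the remote host.
    static boolean fooAfter(SketchHost remote) {
        return remote.evalHere(list -> sum(list) > 40);
    }

    public static void main(String[] args) {
        SketchHost remote = new SketchHost(List.of(10, 20, 15));
        System.out.println(fooAfter(remote));        // true
        System.out.println(remote.integersShipped);  // 0
        System.out.println(fooBefore(remote));       // true
        System.out.println(remote.integersShipped);  // 3
    }
}
```

The point of the slide survives the translation: the code is structured the same way in both versions, and only the position of the remote evaluation changes.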
Types
• ML5 Type System Embeds a Notion of Place
  • Some values can be used at any place.
    • e.g. Primitive data types, structures
  • Some values can only be used at the location where they make sense.
    • e.g. File descriptors, reference cells, printers
Just a Few Types…
• τ@w – “The type τ is well-typed on host w.”
Just a Few Types…
• get[w’,a]e – “Evaluate e on host w’ and return the result to the current host. Change e’s type from @w’ to @w.”
• Example:
fun foo (x: int ref @w’, a: w’ addr @w) =
  get[w’,a]( !x + !x )
  • The body, !x + !x, is typed int@w’.
  • The whole get expression is typed int@w.
Just a Few Types…
• □τ – “Suspended code that can be evaluated anywhere. Produces a value of type τ.”
• Example:
(let fun sum il = foldl op+ 0 il
 in
   box (sum [1,2,3,4,5])
 end) : □int @w
Just a Few Types…
• ◊τ – “A value of type τ that exists at some other location.”
• Example:
here (ref 5) : ◊(int ref) @w
But What About Host Failure?
• What happens here?
(* at host 1 *)
get[w_2, a_2](
  (* at host 2 *)
  !int_ref_at_w_2 +
  get[w_3, a_3](
    (* at host 3 *)
    !int_ref_at_w_3))
• Host 2 dies!
  • Throw an exception?
  • Continue on from Host 3?
• With a fallback added at host 3:
(* at host 1 *)
get[w_2, a_2](
  (* at host 2 WHICH DOESN’T EXIST! *)
  !int_ref_at_w_2 +
  get[w_3, a_3](
    (* at host 3 *)
    !int_ref_at_w_3
    or_if_i_cant_return (...)))
What We Want (Intuitively)
callcc x =>
  (* at host 1 *)
  get[w_2, a_2](
    (* at host 2 *)
    !int_ref_at_h_2 +
    get[w_3, a_3](
      (* at host 3 *)
      !int_ref_at_h_3
      or_if_i_cant_return (throw (raise NetFail) to x)))
• Don’t actually throw something through the network.
• Have host one detect the failure.
Isn’t This Just a ‘Timeout’ Exception?
• A Good Question:
  • “Why not just have the ‘get’ operation throw a timeout exception, like in Java?”
  • e.g.
get[w_2, a_2] (
  !int_on_w2
) handle TimeOut => (* do something *)
Answers
1. This is actually a little smarter than just ‘timeout.’
2. The ‘Implicit Spawn’ Problem
get[w_2, a_2] (
  (* extremely complicated op *)
) handle TimeOut => (* do something *)
[Figure: two threads, labeled T1 and T2.]
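The ‘implicit spawn’ effect can be observed with an ordinary Java timeout. The sketch below is an analogy under stated assumptions, not ML5: a single-thread executor stands in for host 2, the delays are arbitrary, and ImplicitSpawnDemo and its method name are hypothetical. The caller (T1) times out and moves on, while the submitted work (T2) keeps running and still performs its side effect.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class ImplicitSpawnDemo {
    // Returns true if the caller timed out before the side effect happened,
    // and the side effect was nevertheless performed afterwards.
    static boolean effectHappensAfterTimeout() {
        ExecutorService remoteHost = Executors.newSingleThreadExecutor(); // stands in for host 2
        AtomicBoolean sideEffect = new AtomicBoolean(false);
        CountDownLatch remoteDone = new CountDownLatch(1);
        try {
            Future<Integer> reply = remoteHost.submit(() -> {
                Thread.sleep(500);       // the "extremely complicated op"
                sideEffect.set(true);    // an effect the caller no longer expects
                remoteDone.countDown();
                return 42;
            });

            boolean timedOut = false;
            try {
                reply.get(50, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                timedOut = true;         // T1 gives up and continues...
            }
            boolean effectAtTimeout = sideEffect.get();

            remoteDone.await(5, TimeUnit.SECONDS);  // ...while T2 is still running
            return timedOut && !effectAtTimeout && sideEffect.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            remoteHost.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(effectHappensAfterTimeout());
    }
}
```

Catching the timeout tells the caller nothing about what the other thread goes on to do, which is the gap the following slides address.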
What We Need
• Share the Fact that Host 1 Has ‘Given Up’
• Kill the Thread ASAP
• Make That Thread’s Actions Irrelevant
  • Each host gets a chance to ‘undo’ potential effects.
• All with ‘Best Effort’
One More Wrinkle
[Animated figure, slides 49–51. Labels: Catom 1, Catom 2. Step captions, in order: Grab ‘continuation’; Assign ‘Catom1’ to ‘myLeader’.]
The Design, In Short
try
  e_1
continuing
  e_2
end
1. Execute e_1
2. In the event of node failure... the entire expression will throw an exception on this host.
3. On the other hosts, e_2 will be executed, and its value discarded.
The Design, In Short
(* host 1 *)
try
  (* set all of my neighbors’ ‘myLeader’ to host 1 *)
continuing
  if !myLeader = SOME host_1
  then myLeader := NONE
  else ()
end
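The try/continuing behavior described above can be approximated in a single Java process. This is a loose model, not the ML5 extension itself: SketchCatom, NodeFailure, tryContinuing, and claimLeadership are hypothetical names, hosts are plain objects, and a node failure is simulated by throwing an exception when a chosen host is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical single-process model of a catom; fields are illustrative only.
class SketchCatom {
    final String name;
    String myLeader = null;  // null plays the role of NONE

    SketchCatom(String name) {
        this.name = name;
    }
}

class NodeFailure extends RuntimeException {
    NodeFailure(String message) {
        super(message);
    }
}

class TryContinuingDemo {
    // Runs e1 on each host in turn, remembering the hosts where it completed.
    // On NodeFailure, e2 runs on every remembered host (its result is discarded)
    // and the failure is rethrown to the originating host.
    static void tryContinuing(List<SketchCatom> hosts,
                              Consumer<SketchCatom> e1,
                              Consumer<SketchCatom> e2) {
        List<SketchCatom> visited = new ArrayList<>();
        try {
            for (SketchCatom host : hosts) {
                e1.accept(host);
                visited.add(host);
            }
        } catch (NodeFailure failure) {
            for (SketchCatom host : visited) {
                e2.accept(host);
            }
            throw failure;
        }
    }

    // Sets every neighbor's myLeader to host_1; failAt names the neighbor that "dies".
    // Returns false if the operation failed and was undone.
    static boolean claimLeadership(List<SketchCatom> neighbors, String failAt) {
        try {
            tryContinuing(neighbors,
                host -> {
                    if (host.name.equals(failAt)) {
                        throw new NodeFailure(host.name + " died");
                    }
                    host.myLeader = "host_1";
                },
                host -> {
                    // The 'continuing' branch: undo only our own assignment.
                    if ("host_1".equals(host.myLeader)) {
                        host.myLeader = null;
                    }
                });
            return true;
        } catch (NodeFailure failure) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<SketchCatom> neighbors =
            List.of(new SketchCatom("c2"), new SketchCatom("c3"), new SketchCatom("c4"));
        System.out.println(claimLeadership(neighbors, "c4"));  // false
        System.out.println(neighbors.get(0).myLeader);         // null
    }
}
```

In this model the undo runs synchronously before the exception reaches the caller; in a real distributed setting it would be best effort and would happen independently on each host.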
ML5-C: Error Continuations
[Animated figure, slides 57–67. Labels: Host, Visited Host, Location of thread, Migration of thread, PC.
Code shown on the hosts: try ... continuing l: ... end
Frame captions, in order: Store Cont(stack); Store Cont(▪;l); Store Cont(▪;l); Error! Error!; Restore Cont. Restore Cont.
Final frames: one PC at “raise Fail) handle...” and another PC at “l:”.]
Interesting Note
• In Failure Case, We Have to Reason About Client and Server.
  • (The avoidance of this was one of the touted benefits of ML5!)
Future Work
• This Work is Not Yet Finished
• More Restrictive Modal Basis
  • Only neighbor catoms are accessible
  • This would be a ‘lower level’ language in some sense.
Thanks!
Additional Questions?
Failure Handling is More Natural
• In Claytronics, Failure is Possible at Any Moment.
• Intuitively, it would be nice to say:
try {
    // a complex, multi-host operation
}
catch (Failure v) {
    // take an alternate
    // course of action.
}
So You Want to See the Typing Rules...
Note: These rules represent just a snapshot of the work.