CPL 2016, week 9: Erlang fault tolerance and distributed programming
Oleg Batrashev
Institute of Computer Science, Tartu, Estonia
April 4, 2016
Overview
Previous week
- Erlang: functional core and agents
Today
- Erlang fault tolerance and distributed programming
Next weeks
- Clojure language and asynchronous Javascript
Fault tolerance 27/50 -
General ideas
- system process: handles exit signals from linked processes
  - traps exit messages {'EXIT',Pid,Why}
  - default: a usual process dies if Why ≠ normal
- link: connection between 2 processes
  - symmetric, must be set explicitly
- exit signal: sent to the set of linked processes
  - when a process dies: {'EXIT',B,Why}
  - when a process finishes: {'EXIT',B,normal}
[Figure: processes A and B are linked; when B exits, the signal {'EXIT',B,Why} travels to A]
Fault tolerance 28/50 -
Links
- a link defines an error propagation path between 2 processes
- if one dies, the other gets an exit signal
- links are established with
  - link(B), or
  - spawn_link(Fun)
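A minimal sketch of error propagation over a link. The module name link_demo and the reason boom are made up for illustration. The parent does not trap exits, so the exit of its linked child takes the parent down with the same reason.

```erlang
-module(link_demo).
-export([run/0]).

%% Spawns a parent that links to a crashing child, then reports
%% whether the parent was taken down as well.
run() ->
    {Parent, Ref} = spawn_monitor(fun() ->
        %% spawn_link creates the child and the link in one step
        spawn_link(fun() -> exit(boom) end),
        %% the parent would wait forever if nothing killed it
        receive never -> ok end
    end),
    receive
        {'DOWN', Ref, process, Parent, Reason} -> {parent_died, Reason}
    after 1000 ->
        parent_alive
    end.
```

The monitor is used only to observe the parent from outside; unlike a link, it is one-directional and does not propagate the exit.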
Fault tolerance 29/50 -
Signals
Exit signal
- generated when a process dies or finishes
- {'EXIT',Pid,Why}
  - Why=normal if a process just finishes (i.e. recursion ends)
  - Why=<exception desc> if there was a problem
  - exit(Why) may be called by a process to stop itself
- sent to all linked processes
Faking death: exit(Pid2, Why)
- sends {'EXIT',Pid,Why} to process Pid2, where Pid is the caller
- the caller continues execution
Fault tolerance 30/50 -
System processes
- usual process
  - dies if it receives an exit signal from any linked process where Why ≠ normal
- system process
  - set with process_flag(trap_exit,true)
  - traps exit signals from linked processes
  - messages {'EXIT',Pid,Why} are added to its mailbox
- exit signals with Why=kill, sent with exit(Pid,kill), are not caught at all!
  - the process is killed, even a system process
  - {'EXIT',Pid,killed} is broadcast to all linked processes (notice that kill is propagated as killed)
Fault tolerance 31/50 -
Example (1)
on_exit(Pid, Fun) ->
    spawn(fun() ->
              process_flag(trap_exit, true),
              link(Pid),
              receive
                  {'EXIT', Pid, Why} ->
                      Fun(Why)
              end
          end).

- creates a process that "monitors" the process with the given Pid
- upon exit calls the given function Fun
Fault tolerance 32/50 -
Example (2)
F = fun() ->
        receive
            X -> list_to_atom(X)
        end
    end.
Pid = spawn(F).
on_exit(Pid,
        fun(Why) ->
            io:format("~p died with ~p~n", [Pid, Why])
        end).

- creates a process that transforms lists to atoms
- adds an error handler with on_exit
- now sending Pid ! hello.
  - results in <0.61.0> died with:{badarg,[{· · ·
Fault tolerance 33/50 -
Summary of exit signals
What happens if
- a process with the given trap_exit setting (i.e. system or not)
- receives the given exit signal

trap_exit   Exit signal   Action
true        kill          Die: broadcast the exit signal killed to the link set
true        X             Add {'EXIT',Pid,X} to the mailbox
false       normal        Continue: do nothing, signal vanishes
false       kill          Die: broadcast the exit signal killed to the link set
false       X             Die: broadcast the exit signal X to the link set
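The table can be checked experimentally. The sketch below is not from the lecture; the module exit_table and the result atoms mailbox, died and continues are hypothetical names. It sends a signal with exit/2 to a fresh process and reports what happened to it.

```erlang
-module(exit_table).
-export([outcome/2]).

%% outcome(TrapExit, Signal): start a process with the given trap_exit
%% flag, send it the exit signal Signal, and report the observed action.
outcome(TrapExit, Signal) ->
    Self = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        process_flag(trap_exit, TrapExit),
        Self ! {ready, self()},
        receive
            {'EXIT', _From, Why0} -> Self ! {trapped, self(), Why0}
        end,
        receive stop -> ok end
    end),
    %% wait until the flag is set, otherwise the signal could arrive too early
    receive {ready, Pid} -> ok end,
    exit(Pid, Signal),
    Result = receive
                 {trapped, Pid, Why} -> {mailbox, Why};
                 {'DOWN', Ref, process, Pid, Reason} -> {died, Reason}
             after 200 ->
                 continues          % nothing happened: the signal vanished
             end,
    %% clean up the helper process and any pending 'DOWN' message
    erlang:demonitor(Ref, [flush]),
    exit(Pid, kill),
    Result.
```

For example, exit_table:outcome(true, x) corresponds to the second row of the table and exit_table:outcome(false, normal) to the third.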
Fault tolerance 34/50 -
Idioms for trapping exits
1. Don't care about new process

Pid = spawn(fun() -> ... end)

2. Want to die if new process dies

Pid = spawn_link(fun() -> ... end)

3. Want to handle errors if new process dies

...
process_flag(trap_exit, true),
Pid = spawn_link(fun() -> ... end),
loop(...).

loop(State) ->
    receive
        {'EXIT', SomePid, Reason} ->
            %% do something with the error
            loop(State1)
        ...
    end.
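One possible way to fill in idiom 3 is a process that restarts its worker on failure. This is only a sketch under assumptions: the module name keeper and the restart policy (restart on any abnormal exit, stop on normal) are illustrative and not part of the slides.

```erlang
-module(keeper).
-export([start/1]).

%% start(Fun) runs Fun in a linked worker and restarts it whenever the
%% worker exits abnormally. Returns the pid of the keeper process.
start(Fun) ->
    spawn(fun() ->
              process_flag(trap_exit, true),
              loop(Fun, spawn_link(Fun))
          end).

loop(Fun, Worker) ->
    receive
        {'EXIT', Worker, normal} ->
            ok;                            % worker finished: stop keeping
        {'EXIT', Worker, _Reason} ->
            loop(Fun, spawn_link(Fun))     % worker failed: start a new one
    end.
```

OTP supervisors implement the same idea with configurable restart strategies and restart limits.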
Distributed programming 35/50 -
Overview
- In a trusted environment: allow any code to be run remotely
  - Distributed Erlang: all message passing and error handling work automatically
- In a non-trusted environment: restrict what can be run
  - lib_chan library: RPC-like and Erlang specific; automates serialization of Erlang objects
  - socket-based programming: low-level, but can be used with other languages
- Binary data manipulation in Erlang: allows encoding/decoding messages for socket-based programming
Distributed programming 36/50 Erlang specific -
Outline
- Fault tolerance
- Distributed programming
  - Erlang specific
    - Distributed Erlang
    - Chan library
  - Binary data manipulation
  - Socket programming
- Erlang support libraries
Distributed programming 37/50 Erlang specific - Distributed Erlang
Erlang node
- an Erlang node is a separate Erlang VM
  - may be run on the same or a different host
  - if it fails/exits then other VMs are not directly affected
- all communication between two or more nodes is transparent
  - the programmer does not see a difference
  - except when creating processes or making explicit remote calls
- values (numeric, tuples, etc.) are copied
- agents are provided with proxies to communicate
Distributed programming 38/50 Erlang specific - Distributed Erlang
Running nodes
- run erl with the argument -sname <nodename>
  - may run any code from another local node
- run erl with the argument -name <nodename@host>
  - may run on a local or remote network
  - use the same version of the code
  - nodes must have the same cookie: -setcookie <abc>
  - make sure the Erlang Port Mapper Daemon port is available (4369)
  - choose a range of ports to be used (-kernel inet_dist_listen_min <port>)
Distributed programming 39/50 Erlang specific - Distributed Erlang
Example
- rmt.erl

-module(rmt).
-export([runme/0]).

runme() ->
    io:format("Running on ~p~n", [node()]).

- Run node a

erl -sname a
1> c(rmt).
2> rmt:runme().

- Run node b

erl -sname b
1> rpc:call(a@mycomputer, rmt, runme, []).
Running on a@mycomputer

- replace mycomputer with your host name
Distributed programming 40/50 Erlang specific - Distributed Erlang
Distribution primitives
- built-in rpc module

rpc:call(Node, Mod, Function, Args) -> Result | {badrpc, Reason}

- built-in global module
- extended spawn functions, etc.

spawn(Node, Fun) -> Pid
spawn(Node, Mod, Func, ArgList) -> Pid
spawn_link(Node, Fun) -> Pid
spawn_link(Node, Mod, Func, ArgList) -> Pid
disconnect_node(Node) -> bool() | ignored
node() -> Node
node(Arg) -> Node
nodes() -> [Node]
Pid ! Msg
{RegName, Node} ! Msg
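A small sketch of these primitives that runs without a second node: it uses node() itself as the target, which is an assumption made only to keep the example self-contained. With a real cluster, Node would be a name such as a@mycomputer. The module name dist_demo is hypothetical.

```erlang
-module(dist_demo).
-export([run/0]).

run() ->
    Node = node(),                    % stand-in for a remote node name
    %% remote procedure call: lists:sum([1,2,3]) evaluated on Node
    Sum = rpc:call(Node, lists, sum, [[1, 2, 3]]),
    %% spawn a process on Node; it reports the node it runs on
    Self = self(),
    Pid = spawn(Node, fun() -> Self ! {hello_from, node()} end),
    From = receive
               {hello_from, N} -> N
           after 1000 ->
               timeout
           end,
    %% node(Pid) tells where a process lives
    {Sum, node(Pid) =:= Node, From =:= Node}.
```

The calling code looks the same whether Node is local or remote, which is the transparency the previous slides describe.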
Distributed programming 42/50 Binary data manipulation -
Binary data
- used with data from external programs/sources
- double bracket syntax to define binaries (arrays)

2> R=31,G=0,B=0.
3> Color = <<R:5, G:6, B:5>>.
<<248,0>>

- Integer:NumOfBits
- tricky with endianness (big or little)
- shorthand for strings

4> <<64,65,66>>.
<<"@AB">>

- IoList is a list of integers (0..255), binaries, or IoLists

5> Lst = [<<1,2,3>>, 4, [<<5,6>>, 7], 8].

- convenient if we want to defer combining into the final binary

6> list_to_binary(Lst).
<<1,2,3,4,5,6,7,8>>
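The 5-6-5 bit layout from the Color example can be wrapped into a pair of functions. The module name rgb565 is made up for this sketch; the same segment sizes work both for building and for matching a binary.

```erlang
-module(rgb565).
-export([pack/3, unpack/1]).

%% Pack three colour components into 16 bits: 5 for red, 6 for green,
%% 5 for blue. Values that do not fit are truncated to their low bits.
pack(R, G, B) ->
    <<R:5, G:6, B:5>>.

%% The same segment sizes are used in a pattern to take the value apart.
unpack(<<R:5, G:6, B:5>>) ->
    {R, G, B}.
```

rgb565:pack(31, 0, 0) gives the <<248,0>> shown in the shell session above.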
Distributed programming 43/50 Binary data manipulation -
BIFs for binaries
BIF: built-in function in Erlang
- list_to_binary(IoList) -> binary()
  - many IO functions accept IoLists, so no need to transform
- split_binary(Bin,Pos) -> {Bin1,Bin2}
  - pattern matching for binaries may be more convenient

7> <<Z:3/binary, Rest/binary>> = <<1,2,3,4,5,6,7,8>>.
<<1,2,3,4,5,6,7,8>>
8> Z.
<<1,2,3>>
9> Rest.
<<4,5,6,7,8>>

- term_to_binary(Term) -> Bin
- binary_to_term(Bin) -> Term
- size(Bin) -> Int
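A short sketch exercising these BIFs; the module and function names (bif_demo, roundtrip, head_tail) are hypothetical. It shows that split_binary/2 and the pattern-matching form give the same result, and that term_to_binary/1 followed by binary_to_term/1 restores the original term.

```erlang
-module(bif_demo).
-export([roundtrip/1, head_tail/2]).

%% Serialize any Erlang term to a binary and read it back.
roundtrip(Term) ->
    binary_to_term(term_to_binary(Term)).

%% Split Bin after N bytes in two ways and check that both agree.
head_tail(Bin, N) when is_binary(Bin), N =< byte_size(Bin) ->
    {H, T} = split_binary(Bin, N),
    <<H2:N/binary, T2/binary>> = Bin,
    {H, T} = {H2, T2}.                 % fails with badmatch if they differ
```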
Distributed programming 44/50 Binary data manipulation -
The bit syntax
Full syntax for binaries
- <<E1,E2,...,En>> where Ei are the elements
- each element has one of the forms

Ei = Value |
     Value:Size |
     Value/TypeSpecifierList |
     Value:Size/TypeSpecifierList

- where TypeSpecifierList is a hyphen-separated list of
  - endianness: big | little | native
  - sign: signed | unsigned
  - type: integer | float | binary
- Size is in bits for integer/float and in bytes for binaries

13> <<3:16/big-unsigned-integer, 6,7>>.
<<0,3,6,7>>
14> <<3:16/integer, 6,7, 2.1415:64/float>>.
<<0,3,6,7,64,1,33,202,192,131,18,111>>
15> <<A:2/binary, B:16/little-integer>> = <<1,2,3,0>>.
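The last shell line can be turned into an encode/decode pair. This is an illustrative sketch: the module bits_demo and the field names Tag and Len are assumptions, chosen to mirror the 2-byte binary plus 16-bit little-endian integer of that example.

```erlang
-module(bits_demo).
-export([encode/2, decode/1]).

%% A 2-byte tag followed by a 16-bit little-endian unsigned length.
encode(Tag, Len) when is_binary(Tag), byte_size(Tag) =:= 2 ->
    <<Tag/binary, Len:16/little-unsigned-integer>>.

%% Size is in bytes for the binary segment and in bits for the integer.
decode(<<Tag:2/binary, Len:16/little-unsigned-integer>>) ->
    {Tag, Len}.
```

Decoding <<1,2,3,0>> this way binds the tag to <<1,2>> and the length to 3, because the bytes 3,0 are read least significant byte first.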
Distributed programming 45/50 Binary data manipulation -
Macros
- Erlang has macros that allow defining constants

-define(BYTE, 8/signed-big-integer).
-define(INT, 32/signed-big-integer).
-define(LONG, 64/signed-big-integer).

- use them in the code with the question mark

<<DLen:?INT, Salt:DLen/binary>> = Content,
{params, binary_to_list(Salt)};

- notice how the length of the binary is read from the same stream first
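A self-contained sketch of the length-prefixed pattern above. The module name lenprefix and the trailing Rest segment are additions for illustration; the ?INT macro is the one defined on the slide.

```erlang
-module(lenprefix).
-export([encode/1, decode/1]).

-define(INT, 32/signed-big-integer).

%% Write the length of the payload as a 32-bit integer, then the payload.
encode(Salt) when is_list(Salt) ->
    Bin = list_to_binary(Salt),
    <<(byte_size(Bin)):?INT, Bin/binary>>.

%% DLen is bound by the first segment and used as the size of the second.
decode(<<DLen:?INT, Salt:DLen/binary, Rest/binary>>) ->
    {binary_to_list(Salt), Rest}.
```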
Distributed programming 47/50 Socket programming -
Sockets
Erlang gen_tcp library
- gen_tcp:connect(Host,Port,Options) -> {ok, Socket}
- list of options, some possible:
  - binary: open in binary mode
  - {packet, Len}: Len is the number of bytes before each packet that define the length of the packet; Erlang splits the stream and delivers whole packets
    - 0 means deliver the unchanged stream
- the process that created the socket is the controlling process
  - process exit closes the socket
  - data from the socket is delivered as a {tcp,Socket,Bin} message
  - on socket close the process gets {tcp_closed,Socket}
Distributed programming 48/50 Socket programming -
Parallel server
start_parallel_server() ->
    {ok, Listen} = gen_tcp:listen(...),
    spawn(fun() -> par_connect(Listen) end).

par_connect(Listen) ->
    {ok, Socket} = gen_tcp:accept(Listen),
    spawn(fun() -> par_connect(Listen) end),
    loop(Socket).

loop(...) -> ...

- be cautious of the controlling process
- spawn a new process to listen for another connection
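One way the elided parts could be filled in is an echo server. Everything beyond par_connect/1 is an assumption of this sketch: the module name par_echo, the option list, the use of port 0 (let the OS choose a port), and the helper echo_once/2 used as a client. A dedicated owner process keeps the listening socket open, since the socket closes when its controlling process exits.

```erlang
-module(par_echo).
-export([start/0, echo_once/2]).

%% Start the server on an OS-chosen port; returns {ok, Port}.
start() ->
    Parent = self(),
    spawn(fun() ->
              {ok, Listen} = gen_tcp:listen(0, [binary, {packet, 4},
                                                {active, true}]),
              {ok, Port} = inet:port(Listen),
              spawn(fun() -> par_connect(Listen) end),
              Parent ! {port, Port},
              %% the owner must stay alive, or Listen is closed
              receive stop -> ok end
          end),
    receive {port, P} -> {ok, P} after 1000 -> {error, timeout} end.

par_connect(Listen) ->
    {ok, Socket} = gen_tcp:accept(Listen),
    %% hand the listening socket to a fresh acceptor, then serve this client
    spawn(fun() -> par_connect(Listen) end),
    loop(Socket).

loop(Socket) ->
    receive
        {tcp, Socket, Bin} ->
            ok = gen_tcp:send(Socket, Bin),   % echo the packet back
            loop(Socket);
        {tcp_closed, Socket} ->
            ok
    end.

%% Client helper: send one packet and wait for the echoed reply.
echo_once(Port, Msg) ->
    {ok, S} = gen_tcp:connect({127,0,0,1}, Port,
                              [binary, {packet, 4}, {active, false}]),
    ok = gen_tcp:send(S, Msg),
    {ok, Reply} = gen_tcp:recv(S, 0, 1000),
    ok = gen_tcp:close(S),
    Reply.
```

With {packet, 4} on both sides each send arrives as one whole message, so the client never sees a partial reply.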
Distributed programming 49/50 Socket programming -
Control issues
Different regimes for socket reception
- active: process gets {tcp, Socket, Bin} messages
  - may be flooded with messages, but non-blocking
- passive: process must call gen_tcp:recv(Socket,N)
  - may block the server (client) if buffers are empty (full)
- mixed: active for a single message
  - create socket with {active,once}
  - re-enable after each message with inet:setopts(Socket, [{active,once}])
  - best of both worlds
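A sketch of the mixed regime, under assumptions: the module name once_echo, a single-connection server, and echoing as the per-message work are all illustrative. The socket starts passive and is switched to {active, once} before each receive, so at most one data message is queued at a time.

```erlang
-module(once_echo).
-export([start/0]).

%% Accept one connection on an OS-chosen port; returns {ok, Port}.
start() ->
    Parent = self(),
    spawn(fun() ->
              {ok, Listen} = gen_tcp:listen(0, [binary, {packet, 0},
                                                {active, false}]),
              {ok, Port} = inet:port(Listen),
              Parent ! {port, Port},
              {ok, Socket} = gen_tcp:accept(Listen),
              loop(Socket)
          end),
    receive {port, P} -> {ok, P} after 1000 -> {error, timeout} end.

loop(Socket) ->
    %% ask for exactly one message; the socket turns passive afterwards
    ok = inet:setopts(Socket, [{active, once}]),
    receive
        {tcp, Socket, Bin} ->
            ok = gen_tcp:send(Socket, Bin),
            loop(Socket);                 % re-enable on the next iteration
        {tcp_closed, Socket} ->
            ok
    end.
```

A fast sender is slowed down by TCP flow control instead of filling the mailbox of the receiving process.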
Erlang support libraries 50/50 -
List of Erlang libraries
- ETS and DETS: large data storage mechanisms
  - ETS in memory, DETS on disk
  - key-value; mutable, unlike the Erlang core!
  - table types: sets, ordered sets, bags, duplicate bags
- OTP (Open Telecom Platform) is like J2EE to Java
  - gen_server: transaction and hot-swap
  - supervision trees (erl -man supervisor)
- Mnesia: the Erlang database
  - relational, replication
- Crypto library, e.g. to make a SHA-1 hash
  - Digest = crypto:hash(sha, Salt++"mypassword")
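A brief ETS sketch showing mutability and the difference between two of the table types. The module name ets_demo and the table and key names are made up for illustration.

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% set: one object per key, inserting again overwrites in place
    T = ets:new(demo, [set]),
    true = ets:insert(T, {answer, 41}),
    true = ets:insert(T, {answer, 42}),
    Set = ets:lookup(T, answer),
    %% bag: several objects per key, but no identical duplicates
    B = ets:new(demo_bag, [bag]),
    true = ets:insert(B, [{k, 1}, {k, 2}, {k, 1}]),
    Bag = lists:sort(ets:lookup(B, k)),
    ets:delete(T),
    ets:delete(B),
    {Set, Bag}.
```

A duplicate_bag table would keep both copies of {k, 1}.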