Recursion & Erlang, FunctionalConf 14, Bangalore

recursion and erlang

functionalconf ’14, Bangalore

Bhasker Kode, Helpshift

@bhaskerkode @linesofkode http://github.com/helpshift

git clone https://github.com/bosky101/fuconf


PART I

historical walkthrough: recursion and erlang

automatic computing symposium, Germany, October 1955

ACM, US, 1957

goals of the IAL

mathematical notation

printable in publications

“should be able to be translated to programs”

great strides

agreed upon the BNF

renamed from IAL to Algol

spanned 50 centres and 74 persons

formalized pass by value/reference

agreed to meet in 1960

features

variables

assignment

expressions

selection

iteration

maybe some sort of procedure

clashes

to use “,” or a “.” in numbers?

US industrial programmers had real-world inputs and recommendations

pass by reference was confusing

“Add recursion?” Lisp had just added recursion a few months before release

Dijkstra saw the potential of recursion.

With editor Naur, he added it secretly to the draft over the phone from Amsterdam.

Hence called the “Amsterdam Plot”

Recursion was a major step. It was introduced in a sneaky way.

1986

Erlang was designed for writing concurrent programs that “run forever.”

Joe Armstrong at Ericsson Labs.

#telephony star #1 on erlang team

expressive

robust

object oriented

concurrency

ecosystem

long running

easy to debug

closures

immutability

single-threaded / multi-threaded

recursion

actor model

hot code loading

language design vs compiler design vs consequences

why concurrency was #1

Idea came from existing telephone exchanges

Let’s say there are two separate phone calls

One call going down should not affect the other

They should not share anything: no shared pointers

So copy data over: the beginnings of message passing.

Be able to work on different registers simultaneously.
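A minimal sketch of “copy data over, no sharing pointers”, written in C for illustration only. The `mailbox` struct, `MSG_MAX` and `send_copy` are hypothetical names, not part of Erlang or of the talk's code; the point is just that the receiver ends up with its own bytes.

```c
#include <assert.h>
#include <string.h>

#define MSG_MAX 32

/* Hypothetical one-slot mailbox owned by the receiving process. */
struct mailbox {
    char slot[MSG_MAX];
    int  full;
};

/* Send by copying: the receiver gets its own bytes, not a pointer into
   the sender's memory. Returns 0 on success, -1 if the mailbox is full
   or the message does not fit. */
int send_copy(struct mailbox *to, const char *msg) {
    assert(to != NULL && msg != NULL);
    size_t len = strlen(msg);
    if (to->full || len >= MSG_MAX) {
        return -1;
    }
    memcpy(to->slot, msg, len + 1);   /* copy including the terminating NUL */
    to->full = 1;
    return 0;
}
```

If the sender changes or frees its buffer after `send_copy` returns, the mailbox contents stay as they were sent, which is the isolation the two-phone-calls example asks for.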

thought process

with concurrency as the first goal, everything else was either worked out around it or was a consequence of decisions made around concurrency

concurrency existed before erlang

• Ada (with tasks)

• EriPascal (an Ericsson dialect of Pascal with concurrent processes)

• Chill (with processes)

• PLEX (processes implemented in hardware. Also inspired linking errors)

• and Euclid

#emp2 to #emp1

“3 properties of a programming language were central to the efficient operation of a concurrent language or operating system.”

1) the time to create a process.

2) the time to perform a context switch between two different processes

3) the time to copy a message between two processes

3 emps

“We rapidly adopted a philosophy of message passing by copying and no sharing of data resources. Again, this was counter to the mood of the time, where threads and shared resources protected by semaphores were the dominant way of programming concurrent systems.”

funny incident

• the 3 employees working on erlang went to a conference just like this

• asked speakers what would happen if one of the many machines crashes

• reply was “no failures” or “can’t do anything”. This was still 1985.

• Joe Armstrong first wrote this in the algebra (hurray for Algol 60)

• a Prolog engineer saw the algebra and said, “I can program this”

• Joe was amazed. Learned Prolog.

• prior to Prolog, the implementation was in Smalltalk

• dial a number, send that as a message to exchange.

• dial a valid number, other phone should ring

• if it’s busy, get back busy

• if it’s not busy, get back not busy

state machines

“the software for call control was best modeled using finite state machines that undergo state transitions in response to protocol messages.”

What started as an experiment in adding concurrency to Prolog became more of a language in its own right, and this language acquired the name Erlang in 1987.

with concurrency as goal

“What we didn’t realise at the time was that copying data between processes increases process isolation, increases concurrency and simplifies the construction of distributed systems.”

history contd.

• http://www.erlang-factory.com/conference/SFBay2010/speakers/joearmstrong

Mike Williams Employee #2

Employee #1

Employee #3

PART II

under the hood of recursion

• usually a function gets executed and returns to where it was called

return 1 + foo(a)

• but in a tail optimised function, it effectively does not have to remember where to go back to

return tail(1,a)

• #gajini

int r(int x) {
    if (x == 1) {
        return 1;
    }
    return x * r(x - 1);
}

r(3)

recursive

r(3)
3 * r(2)
3 * (2 * r(1))
3 * (2 * 1)
3 * 2
6

int tr(int x, int y) {
    if (x == 0) {
        return y;
    }
    return tr(x - 1, x * y);
}

tr(3,1)

tail recursive

tr(3,1)
tr(3-1, 3 * 1)
tr(2-1, 2 * 3)
tr(1-1, 1 * 6)
6
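One way to see why a tail call “does not have to remember where to go back to”: the tail-recursive function can be rewritten as a loop that overwrites its own arguments. The sketch below repeats `tr` from the slide and adds `tr_loop`, a hypothetical name used here only for illustration. It shows roughly what a tail-call-optimising compiler can do, not what any particular compiler is guaranteed to emit.

```c
#include <assert.h>

/* Tail-recursive factorial, as on the slide: the recursive call is the
   last thing the function does, so nothing is left to do after it returns. */
int tr(int x, int y) {
    assert(x >= 0);
    if (x == 0) {
        return y;
    }
    return tr(x - 1, x * y);
}

/* Loop form of the same function: the arguments are overwritten in place
   and control jumps back to the top, so no call frame is kept per step. */
int tr_loop(int x, int y) {
    assert(x >= 0);
    while (x != 0) {
        y = x * y;   /* new accumulator, computed from the old x */
        x = x - 1;   /* new counter */
    }
    return y;
}
```

`tr_loop(3, 1)` walks through the same (x, y) pairs as the trace above, (3,1), (2,3), (1,6), (0,6), and returns 6.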



• 3 procedural functions approach

• 1 recursive function approach

• 1 tail recursive function approach

see who is knocking on 3 doors

3 doors. “anybody here? no? ok.” x 3

“anybody here? oh, another door.”

“anybody here? oh, another door.” x 3

“oh, a backdoor!”

procedural call stack

recursive call stack

tail recursive call stack

3 * r(2)
3 * (2 * r(1))
3 * (2 * 1)
3 * 2
6

tr(3-1, 3 * 1)
tr(2-1, 2 * 3)
tr(1-1, 1 * 6)
6

are these both recursive? which of these is easier to spawn off?

which of these is easier to stop midway? which of these is easier to debug?


recursive: good backtrace, can’t change order

recursive easier to parallelise

Amdahl’s Law

“The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program.”
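The usual formula behind that statement is speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. A small sketch in C; the function name `amdahl_speedup` is made up for this example.

```c
#include <assert.h>

/* Amdahl's law: with a fraction p of the work parallelisable and n processors,
   speedup = 1 / ((1 - p) + p / n). */
double amdahl_speedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

Even with 1000 processors, a program that is 90% parallel stays just under a 10x speedup, because the sequential 10% never gets faster.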

which brings us to

spawning/forking

PART III

let’s look at some code

what is a daemon

• fair to say a daemon is a long running process?

• fair to say a long running process is a tail-recursive function?

rough skeleton for daemon in C

a long running process is in fact a tail-recursive function
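A sketch of that idea in C, not the skeleton shown in the talk. `daemon_loop` and `handle_tick` are hypothetical names, and the `ticks_left` counter exists only so the example terminates; a real daemon would loop until told to stop. Note that C compilers may turn the tail call into a jump but are not required to, so this illustrates the shape rather than recommending it for production C.

```c
#include <assert.h>

/* Hypothetical unit of work: here it just counts how often it ran. */
static int handle_tick(int handled) {
    return handled + 1;
}

/* A daemon loop written as a tail-recursive function. A real daemon would
   never reach the base case; ticks_left exists only so the sketch stops. */
int daemon_loop(int ticks_left, int handled) {
    assert(ticks_left >= 0);
    if (ticks_left == 0) {
        return handled;              /* shutdown: report how much work was done */
    }
    /* Tail call: the state for the next iteration is passed as arguments. */
    return daemon_loop(ticks_left - 1, handle_tick(handled));
}
```

`daemon_loop(5, 0)` runs five iterations and returns 5. All state lives in the arguments, which is the same shape as a long running process that carries its state from one call to the next.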

a daemon in erlang

Does this block the scheduler? Is this an infinite loop?

Let’s start foo and bar on new processes and see what happens?!

runs a function as a new process. here we ask it to spawn both foo, then bar.

spawn

erlang’s scheduler splits CPU time between multiple concurrent processes based on “reductions” (the number of times a function is called)
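A toy simulation of reduction-based scheduling, assuming a simple round-robin over processes with a fixed budget per turn. The struct, the function name and the budget value used in the example are all hypothetical; this does not reproduce the real Erlang scheduler, only the idea that a process is switched out after a set number of function calls.

```c
#include <assert.h>

/* Hypothetical process: only tracks how many function calls ("reductions")
   it still needs, and the round in which it finished. */
struct proc {
    int reductions_left;
    int finished_round;   /* 0 if it never had work to do */
};

/* Run every unfinished process for at most `budget` reductions per round,
   then move on to the next one. Returns the number of rounds needed. */
int run_scheduler(struct proc procs[], int nprocs, int budget) {
    assert(nprocs >= 0);
    assert(budget > 0);
    int round = 0;
    int running = 0;
    for (int i = 0; i < nprocs; i++) {
        procs[i].finished_round = 0;
        if (procs[i].reductions_left > 0) {
            running++;
        }
    }
    while (running > 0) {
        round++;
        for (int i = 0; i < nprocs; i++) {
            if (procs[i].reductions_left <= 0) {
                continue;                       /* already done */
            }
            int slice = procs[i].reductions_left < budget
                            ? procs[i].reductions_left
                            : budget;
            procs[i].reductions_left -= slice;  /* "execute" the slice */
            if (procs[i].reductions_left == 0) {
                procs[i].finished_round = round;
                running--;
            }
        }
    }
    return round;
}
```

With two processes needing 5000 and 100 reductions and a budget of 2000, the small one finishes in round 1 and the large one in round 3, so a long computation cannot starve a short one.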

use 8 cores, so 8 schedulers

fork and start a daemon that again does nothing…

as long as the heart is beating, you are alive

same way: as long as a process is

a) tail-recursive, or

b) waiting to receive (on its inbox)

…it is alive

#expressive #erlang


compose to suit your needs

tail-recursion + network programming = killer combo

still think erlang has hard to read syntax?

busting some erlang myths

meh

Leslie Lamport, “father of distributed computing” and recent Turing Awardee, said that if you are not representing your distributed program as a state machine, you’re doing something wrong.

PART III conclusion

tail recursive functions are the foundation for processes

processes are the foundation for state machines

tail recursive functions / processes with sockets are the foundation for network programming

network programming is the foundation of distributed computing.


erlang & distributed systems

so in other words, erlang’s strength in distributed computing can indirectly be attributed to supporting tail recursion

Concurrency comes with its challenges

parallelism and ordering, another great debate and tale of priorities and compromises

found in any producer/consumer scenario

[figure: messages 1,2,3,4,5,6 go in; they are split up as 1,2 / 3 / 4 / 5 / 6; they come out as 5,4,1,2,3,6]

eg: ekaf_partition_strategy


why state machines

send(Socket, Data)

Foo = receive(Socket)

socket already closed, re-open it

socket already open

data being received currently

data being sent currently

socket closed before sending

socket closed before receiving

timeouts

and lastly, not friendly towards batched sending
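The cases listed above are what an explicit state machine makes visible. Below is a small sketch in C, assuming four socket states and a handful of events. All the names (`sock_state`, `sock_event`, `next_state`, `run_socket`) are invented for this example and the transitions are one possible choice, not the talk's design. The driver is tail-recursive, carrying only the current state from one step to the next.

```c
#include <assert.h>
#include <stddef.h>

enum sock_state { SOCK_CLOSED, SOCK_OPEN, SOCK_SENDING, SOCK_RECEIVING };
enum sock_event { EV_OPEN, EV_SEND, EV_SENT, EV_RECV, EV_RECEIVED,
                  EV_CLOSE, EV_TIMEOUT };

/* One transition: every (state, event) pair has an explicit answer. */
enum sock_state next_state(enum sock_state s, enum sock_event e) {
    if (e == EV_CLOSE || e == EV_TIMEOUT) {
        return SOCK_CLOSED;          /* closing or timing out always ends CLOSED */
    }
    switch (s) {
    case SOCK_CLOSED:
        /* socket already closed: a request re-opens it (and must be retried) */
        return (e == EV_OPEN || e == EV_SEND || e == EV_RECV) ? SOCK_OPEN
                                                              : SOCK_CLOSED;
    case SOCK_OPEN:
        if (e == EV_SEND) return SOCK_SENDING;
        if (e == EV_RECV) return SOCK_RECEIVING;
        return SOCK_OPEN;            /* socket already open: EV_OPEN is a no-op */
    case SOCK_SENDING:
        /* data being sent currently: new requests are ignored until EV_SENT */
        return (e == EV_SENT) ? SOCK_OPEN : SOCK_SENDING;
    case SOCK_RECEIVING:
        /* data being received currently: same idea, wait for EV_RECEIVED */
        return (e == EV_RECEIVED) ? SOCK_OPEN : SOCK_RECEIVING;
    }
    return s;
}

/* Tail-recursive driver: the current state is the only thing carried along. */
enum sock_state run_socket(enum sock_state s, const enum sock_event *events,
                           size_t n) {
    assert(events != NULL || n == 0);
    if (n == 0) {
        return s;
    }
    return run_socket(next_state(s, events[0]), events + 1, n - 1);
}
```

Starting from `SOCK_CLOSED`, the events send, send, sent lead through OPEN and SENDING back to OPEN: the first send only re-opens the socket, the second one actually starts sending.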

“India can never join the G7. Why?”

@linesofkode

“Because if they do, it’ll be the G8”

recursion and erlang #ftw


@bhaskerkode http://github.com/helpshift




BONUS PART V (if we have time)

are language designers sadists?

weird syntax/semantics

immutable variables

monkey patching

boilerplate code

callback hell

weird stack traces

bad handling of floats

whitespace, colons, semicolons, commas, arrows

compiler design vs language goals vs consequences

expressive

robust

object oriented

concurrency

ecosystem

long running

easy to debug

closures

immutability

single-threaded / multi-threaded

recursion

actor model




So, no. Language designers do not draw joy from making you cringe.

language designer: “i want to be concurrent”

compiler designer: “ok, let’s try not to bottleneck on resources. that way it’ll be easier to garbage collect.”

language: “ok, i’ll favour immutability. this may mean recreating entire objects/trees/structures and passing them around. OO folks won’t like this.”

compiler: “btw, since variables won’t change, can i re-use the call frame?”

language: “ok. as i need to add the sum of 100 numbers, i’ll create A1=1, A2=increment(A1), … A100=increment(A99)”

compiler: “that looks too repetitive. use A = increment(100, previous) and write a recursive fn that calls itself until 0.”
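A sketch of where that dialogue ends up: summing numbers with no reassigned variables, by a function that calls itself until it reaches 0. `sum_down` is a hypothetical name and this is an interpretation of the exchange above, written in C for illustration.

```c
#include <assert.h>

/* Adds n + (n-1) + ... + 1 to acc by calling itself until n reaches 0.
   No variable is ever reassigned: each call gets fresh arguments. */
int sum_down(const int n, const int acc) {
    assert(n >= 0);
    if (n == 0) {
        return acc;
    }
    return sum_down(n - 1, acc + n);   /* tail call: the frame can be re-used */
}
```

`sum_down(100, 0)` returns 5050 without ever naming A1 through A100.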

concurrency

OO

mutability

recursion

message passing

speed

if you were to pick 1 as your top priority and see its effects


CPS

readability

debugging

tail call optimised threads

1

• Guido was in favour of not losing stack traces, and pointed out that tail-call optimised functions would be tougher to debug

• Trivia Alert! An early contributor to python was an anonymous lisp hacker “Amrit Prem” who added map, reduce, lambda to python in 1.0.0

TCO/recursion in other languages

• the java compiler doesn’t do TCO

• but scala allows tail recursive functions

• in fact, if you annotate a function with @tailrec it won’t compile until you make the function tail recursive

• lua compiler does tail call optimisation


• The ECMAScript 4 spec was originally going to add support for TCO, but it was dropped.

• ECMAScript 6 does add it, but it only becomes a standard in mid 2015 (PS: it also has destructuring, which is cool)

• C compilers do tail call optimisation, but do not necessarily support tail recursion (where the last function called is in fact the same function)


pseudo-code notes

# not tail call optimised but is recursive
a(n){ …. return 1 + a(n-1) }

# is tail call optimised but not recursive
a(n){ …. return b(n-1) }

# is tail call optimised and is recursive
a(n){ …. return a(n-1) }

• in erlang, if a function a calls b, and a function b calls a, its memory footprint will never increase.

• in fact, all processes that receive messages need to be tail-recursive

• this is the fundamental construct behind long lived processes and building state machines in erlang
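The “a calls b, b calls a” shape from the first bullet looks like this when written in C. `is_even` and `is_odd` are the usual textbook pair, chosen here for illustration. Every call is in tail position, so a runtime that eliminates tail calls, as the bullets above describe for erlang, can run this without growing the stack. C compilers may do the same but are not required to.

```c
#include <assert.h>
#include <stdbool.h>

bool is_odd(unsigned n);

/* is_even and is_odd call each other, and each call is the last thing
   the caller does, so no work is pending when the callee returns. */
bool is_even(unsigned n) {
    if (n == 0) {
        return true;
    }
    return is_odd(n - 1);
}

bool is_odd(unsigned n) {
    if (n == 0) {
        return false;
    }
    return is_even(n - 1);
}
```

`is_even(10)` bounces between the two functions ten times before returning true. A state machine where each state is a function that tail-calls the next state has exactly this structure.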

TCO/recursion in erlang




