Purely Functional Data Structures and Monoids

Donnacha Oisín Kidney

May 9, 2020

Purely Functional Data Structures and Monoids · 2020-05-26 · Purely Functional Data Structures. Why Do We Need Them? Why do pure functional languages need a different way to do


Purely Functional Data Structures


Why Do We Need Them?

Why do pure functional languages need a different way to do data structures? Why can’t we just use traditional algorithms from imperative programming?

To answer that question, we’re going to look at a very simple algorithm in an imperative language, and we’re going to see how not to translate it into Haskell.

The mistake we make may well be one which you have made in the past!


A Simple Imperative Algorithm


(in Python)


We’re going to write a function to create an array filled with some ints.


It works like this.

>>> create_array_up_to(5)
[0, 1, 2, 3, 4]


This is its implementation.

def create_array_up_to(n):
    array = []
    for i in range(n):
        array.append(i)
    return array

We first initialise an empty array. Then we loop through the numbers from 0 to n-1, and append each number onto the array. Finally, we return the array.

Trying to Translate it to Haskell


We’re going to run into a problem with this line.

def create_array_up_to(n):
    array = []
    for i in range(n):
        array.append(i)  # ⇐
    return array

The append function mutates array: after calling append, the value of the variable array changes.

1 array = [1,2,3]
2 print(array)
3 array.append(4)
4 print(array)

array has different values before and after line 3.

We can’t do that in an immutable language! A variable’s value cannot change from one line to the next in Haskell.


Append in Haskell

Instead of mutating variables, in Haskell when we want to change a data structure we usually write a function which returns a new value: the old data structure with the change applied.

append :: Array a -> a -> Array a

myArray = [1, 2, 3]
myArray2 = myArray `append` 4

main = do
  print myArray
  print myArray2


Translating it to Haskell

Let’s look at the imperative algorithm, and try to translate it bit-by-bit.

def create_array_up_to(n):
    array = []
    for i in range(n):
        array.append(i)
    return array

First we’ll need to write the type signature and skeleton of the Haskell function. What should the type be?

We tend not to use loops in functional languages, but this loop in particular follows a very common pattern which has a name and function in Haskell. What is it?

foldl is the function we need. How would the output have differed if we used foldr instead?

createArrayUpTo :: Int -> Array Int
createArrayUpTo n =
  foldl
    (\array i -> append array i)
    emptyArray
    [0 .. n - 1]

Is there a shorter way to write this, that doesn’t include a lambda?

Overall running time: Python O(n), Haskell O(n²)
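For readers more at home in Python, the same fold can be sketched with functools.reduce. This is an illustration only and not part of the original slides: the helper append and the function create_array_up_to_pure are hypothetical names, and array + [x] stands in for a copying, non-mutating append.

```python
from functools import reduce

def append(array, x):
    # Non-mutating append: builds a new list and leaves `array` untouched.
    # Copying the old elements makes each call O(len(array)).
    return array + [x]

def create_array_up_to_pure(n):
    # reduce is Python's left fold: it threads an accumulator through the
    # sequence, mirroring the lambda passed to foldl in the Haskell version.
    return reduce(lambda array, i: append(array, i), range(n), [])

print(create_array_up_to_pure(5))  # prints [0, 1, 2, 3, 4]
```

No variable is ever reassigned or mutated here; each step produces a fresh list from the previous one.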


Why the performance difference?

It comes down to the different complexities of append.

          Python    Haskell
append    O(1)      O(n)

def create_array_up_to(n):
    array = []
    for i in range(n):
        array.append(i)
    return array

createArrayUpTo :: Int -> Array Int
createArrayUpTo n =
  foldl
    (\array i -> append array i)
    emptyArray
    [0 .. n - 1]

Both implementations call append n times, so the cost of each call determines the overall asymptotics: n × O(1) = O(n) in Python, but n × O(n) = O(n²) in Haskell.
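A back-of-the-envelope count makes the gap concrete. The sketch below is an illustration only (the function name and counter are hypothetical): it tallies how many element copies a copying append performs over n calls, assuming each call copies every existing element.

```python
def copies_with_copying_append(n):
    # Build [0 .. n-1] with a non-mutating append and count element copies.
    array = []
    copies = 0
    for i in range(n):
        copies += len(array)   # every existing element is copied...
        array = array + [i]    # ...into a fresh list with i on the end
    return array, copies

array, copies = copies_with_copying_append(5)
print(array)   # prints [0, 1, 2, 3, 4]
print(copies)  # prints 10, i.e. 0 + 1 + 2 + 3 + 4
```

In general the total is n(n-1)/2 copies, which grows as O(n²); an in-place append performs none of them.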


Forgetful Imperative Languages


Why is the imperative version so much more efficient? Why is append O(1)?

1 array = [1,2,3]
2 print(array)
3 array.append(4)
4 print(array)

To run this code efficiently, most imperative interpreters will look for the space next to 3 in memory, and put 4 there: an O(1) operation.

(Of course, sometimes the “space next to 3” will already be occupied! There are clever algorithms you can use to handle this case.)

Semantically, in an imperative language we are allowed to “forget” the contents of array on line 1: [1,2,3]. That array has been irreversibly replaced by [1,2,3,4].
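One well-known way to handle an occupied “space next to 3” is to over-allocate: reserve spare capacity and, when it runs out, move the whole array to a block twice the size. The sketch below is a toy model of that idea, not a description of how any particular interpreter is implemented; the class name GrowableArray and its fields are hypothetical.

```python
class GrowableArray:
    """Toy dynamic array that doubles its capacity when full."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None] * self.capacity
        self.copies = 0  # element copies caused by relocation

    def append(self, x):
        if self.size == self.capacity:
            # No free slot: relocate to a block twice as large.
            self.capacity *= 2
            new_slots = [None] * self.capacity
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.copies += 1
            self.slots = new_slots
        # The common case: write into the next free slot in place.
        self.slots[self.size] = x
        self.size += 1

    def to_list(self):
        return self.slots[:self.size]

arr = GrowableArray()
for i in range(1000):
    arr.append(i)
print(arr.copies)  # prints 1023, fewer than 2 * 1000
```

Relocations are rare enough that total copying stays below 2n, so append is O(1) on average (amortised). Note that this only works because the old array may be overwritten.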


Haskell doesn’t Forget

The Haskell version of append looks similar at first glance:

myArray = [1, 2, 3]
myArray2 = myArray `append` 4

But we can’t edit the array [1, 2, 3] in memory, because myArray still exists!

main = do
  print myArray
  print myArray2

>>> main
[1,2,3]
[1,2,3,4]

As a result, our only option is to copy, which is O(n).


The Problem


In immutable languages, old versions of data structures have to be kept around in case they’re looked at.

For arrays, this means we have to copy on every mutation (i.e. append is O(n)).

Solutions?

1. Find a way to disallow access of old versions of data structures.
   This approach is beyond the scope of this lecture! However, for interested students: linear type systems can enforce this property. You may have heard of Rust, a programming language whose ownership system is built on closely related ideas (strictly speaking, affine rather than linear types).

2. Find a way to implement data structures that keep their old versions efficiently.
   This is the approach we’re going to look at today.


Keeping History Efficiently

Consider the linked list.

myArray = 1 → 2 → 3

To “prepend” an element (i.e. append to front), you might assume we would have to copy again:

myArray2 = 0 → 1 → 2 → 3

However, this is not the case: the new cell 0 can point at the existing cells of myArray, so nothing is copied.

The same trick also works with deletion: myArray3 reuses the tail of myArray.

myArray3 = 2 → 3
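The sharing described above can be sketched in Python with immutable cells. This is a minimal illustration under assumed names (Cons, cons, uncons and to_list are hypothetical, not from the slides); None plays the role of the empty list.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass(frozen=True)
class Cons:
    head: Any
    tail: Optional["Cons"]  # None marks the end of the list

def cons(x, xs):
    # O(1): allocate one new cell that points at the existing list.
    return Cons(x, xs)

def uncons(xs):
    # O(1): hand back the head and the existing tail, copying nothing.
    return xs.head, xs.tail

def to_list(xs):
    out = []
    while xs is not None:
        out.append(xs.head)
        xs = xs.tail
    return out

my_array = cons(1, cons(2, cons(3, None)))
my_array2 = cons(0, my_array)      # prepend
_, my_array3 = uncons(my_array)    # delete from the front

print(to_list(my_array))   # prints [1, 2, 3]
print(to_list(my_array2))  # prints [0, 1, 2, 3]
print(to_list(my_array3))  # prints [2, 3]
print(my_array2.tail is my_array)  # prints True: the cells are shared
print(my_array3 is my_array.tail)  # prints True
```

All three versions stay valid at once, yet only one new cell was allocated for the prepend and none for the deletion.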


Persistent Data Structures

Persistent Data Structure
A persistent data structure is a data structure which preserves all versions of itself after modification.

An array is “persistent” in some sense, if all operations are implemented by copying. It just isn’t very efficient.

A linked list is much better: it can do persistent cons and uncons in O(1) time.

Immutability
While the semantics of languages like Haskell necessitate this property, they also facilitate it.

After several additions and deletions onto some linked structure we will be left with a real rat’s nest of pointers and references: strong guarantees that no-one will mutate anything are essential for that mess to be manageable.
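To see why that guarantee matters, here is a sketch of what can go wrong without it. It is an illustration only, using mutable two-element Python lists [head, tail] as hypothetical list cells: two versions share a tail, and a single in-place write silently changes both.

```python
def to_list(cell):
    # Walk [head, tail] cells until the None terminator.
    out = []
    while cell is not None:
        out.append(cell[0])
        cell = cell[1]
    return out

shared = [2, [3, None]]   # mutable cells holding 2 -> 3
version_a = [1, shared]   # 1 -> 2 -> 3
version_b = [0, shared]   # 0 -> 2 -> 3, sharing the same tail

shared[0] = 99            # someone mutates the shared cell in place

print(to_list(version_a))  # prints [1, 99, 3]
print(to_list(version_b))  # prints [0, 99, 3]
```

With immutable cells, as in Haskell, that write is impossible, so sharing between versions is always safe.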


Git

As it happens, all of you have already been using a persistent data structure!

Git is perhaps the most widely-used persistent data structure in the world.

It works like a persistent file system: when you make a change to a file, git remembers the old version, instead of deleting it!

To do this efficiently, it doesn’t just store a new copy of the repository whenever a change is made; instead, it uses some of the tricks and techniques we’re going to look at in the rest of this talk.


The Book

Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, June 1999.

Much of the material in this lecture comes directly from this book.

It’s also on your reading list for your algorithms course next year.


Arrays

While our linked list can replace a normal array for some applications, in general it’s missing some of the key operations we might want.

Indexing in particular is O(n) on a linked list but O(1) on an array.

We’re going to build a data structure which gets to O(log n) indexing in a pure way.


Implementing a Functional Algorithm: Merge Sort


Merge Sort

Merge sort is a classic divide-and-conquer algorithm.

It divides up a list into singleton lists, and then repeatedly merges adjacent sublists until only one is left.


Visualisation of Merge Sort

[2, 6, 10, 7, 8, 1, 9, 3, 4, 5]

[2] [6] [10] [7] [8] [1] [9] [3] [4] [5]

[2, 6] [7, 10] [1, 8] [3, 9] [4, 5]

[2, 6, 7, 10] [1, 3, 8, 9] [4, 5]

[1, 2, 3, 6, 7, 8, 9, 10] [4, 5]

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


Just to demonstrate some of the complexity of the algorithm when implemented imperatively, here it is in Python.

You do not need to understand the following slide!


    def merge_sort(arr):
        lsz, tsz, acc = 1, len(arr), []
        while lsz < tsz:
            for ll in range(0, tsz-lsz, lsz*2):
                lu, rl, ru = ll+lsz, ll+lsz, min(tsz, ll+lsz*2)
                while ll < lu and rl < ru:
                    if arr[ll] <= arr[rl]:
                        acc.append(arr[ll])
                        ll += 1
                    else:
                        acc.append(arr[rl])
                        rl += 1
                acc += arr[ll:lu] + arr[rl:ru]
            acc += arr[len(acc):]
            arr, lsz, acc = acc, lsz*2, []
        return arr


How can we improve it?

Merge sort is actually an algorithm perfectly suited to a functional implementation.

In translating it over to Haskell, we are going to make the following improvements:

• We will abstract out some patterns, like the fold pattern.

• We will do away with index arithmetic, instead using pattern-matching.

• We will avoid complex while conditions.

• We won’t mutate anything.

• We will add a healthy sprinkle of types.

Granted, all of these improvements could have been made to the Python code, too.


Merge in Haskell

We’ll start with a function that merges two sorted lists.

    merge :: Ord a => [a] -> [a] -> [a]
    merge [] ys = ys
    merge xs [] = xs
    merge (x : xs) (y : ys)
      | x <= y    = x : merge xs (y : ys)
      | otherwise = y : merge (x : xs) ys

>>> merge [1,8] [3,9]

[1,3,8,9]


Using the Merge to Sort

Next: how do we use this merge to sort a list?

We know how to combine 2 sorted lists, and that combine function has an identity, so how do we use it to combine n sorted lists?

merge xs [] = xs

foldr?


The Problem with foldr

    sort :: Ord a => [a] -> [a]
    sort xs = foldr merge [] [[x] | x <- xs]

Unfortunately, this is actually insertion sort!

    merge [x] ys = insert x ys

The problem is that foldr is too unbalanced.

    foldr (⊕) ∅ [1 .. 5] = 1 ⊕ (2 ⊕ (3 ⊕ (4 ⊕ (5 ⊕ ∅))))

[Figure: the right-nested tree for this expression; every ⊕ node has a single element on its left and the rest of the fold on its right.]

Merge sort crucially divides the work in a balanced way!
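The claim that the foldr version is insertion sort can be checked directly. The sketch below restates merge, renames the unbalanced sort to `foldrSort` (a name chosen here to avoid confusion with `Data.List.sort`), and compares it with a textbook insertion sort built from `Data.List.insert`.

```haskell
import Data.List (insert)

merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- The unbalanced sort: every merge has a singleton on the left.
foldrSort :: Ord a => [a] -> [a]
foldrSort xs = foldr merge [] [[x] | x <- xs]

-- Insertion sort: insert each element into the sorted rest.
insertionSort :: Ord a => [a] -> [a]
insertionSort = foldr insert []
```

Since `merge [x] ys` walks `ys` until it finds the place for `x`, each step of `foldrSort` does the same work as `insert x ys`.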


Visualisation of Merge Sort

[2, 6, 10, 7, 8, 1, 9, 3, 4, 5]

[2] [6] [10] [7] [8] [1] [9] [3] [4] [5]

[2, 6] [7, 10] [1, 8] [3, 9] [4, 5]

[2, 6, 7, 10] [1, 3, 8, 9] [4, 5]

[1, 2, 3, 6, 7, 8, 9, 10] [4, 5]

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


A More Balanced Fold

    treeFold :: (a -> a -> a) -> [a] -> a
    treeFold (⊕) [x] = x
    treeFold (⊕) xs  = treeFold (⊕) (pairMap xs)
      where
        pairMap (x1 : x2 : xs) = x1 ⊕ x2 : pairMap xs
        pairMap xs = xs

This can be used quite similarly to how you might use foldl or foldr:

    sum = treeFold (+)

(although we would probably change the definition a little to catch the empty list, but we won’t look at that here)

The fundamental difference between this fold and, say, foldr is that it’s balanced, which is extremely important for merge sort.
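One possible way to catch the empty list, which the talk leaves open, is to return a `Maybe`. The sketch below is an assumption about how that could look, not the author's definition; `treeFoldMaybe` and `showShape` are names introduced here for illustration.

```haskell
-- Same pairing strategy as treeFold, but total: Nothing for an empty list.
treeFoldMaybe :: (a -> a -> a) -> [a] -> Maybe a
treeFoldMaybe _ []  = Nothing
treeFoldMaybe _ [x] = Just x
treeFoldMaybe f xs  = treeFoldMaybe f (pairMap xs)
  where
    -- Combine adjacent elements; an odd element at the end is carried over.
    pairMap (x1 : x2 : rest) = f x1 x2 : pairMap rest
    pairMap rest = rest

-- Folds with a function that records the bracketing as a string.
showShape :: [String] -> Maybe String
showShape = treeFoldMaybe (\l r -> "(" ++ l ++ " + " ++ r ++ ")")
```

Evaluating `showShape ["1", "2", "3", "4", "5"]` in GHCi makes the balanced bracketing visible.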


Visualisation of treeFold

    treeFold (⊕) [1 .. 10]
      = treeFold (⊕) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
      = treeFold (⊕) [1 ⊕ 2, 3 ⊕ 4, 5 ⊕ 6, 7 ⊕ 8, 9 ⊕ 10]
      = treeFold (⊕) [(1 ⊕ 2) ⊕ (3 ⊕ 4), (5 ⊕ 6) ⊕ (7 ⊕ 8), 9 ⊕ 10]
      = treeFold (⊕) [((1 ⊕ 2) ⊕ (3 ⊕ 4)) ⊕ ((5 ⊕ 6) ⊕ (7 ⊕ 8)), 9 ⊕ 10]
      = (((1 ⊕ 2) ⊕ (3 ⊕ 4)) ⊕ ((5 ⊕ 6) ⊕ (7 ⊕ 8))) ⊕ (9 ⊕ 10)


Visualisation of foldr

Compare to foldr :

    foldr (⊕) ∅ [1 .. 5] = 1 ⊕ (2 ⊕ (3 ⊕ (4 ⊕ (5 ⊕ ∅))))

[Figure: the right-nested tree for this expression; every ⊕ node has a single element as its left child.]


Visualisation of Merge Sort in Haskell

    treeFold merge [2, 6, 10, 7, 8, 1, 9, 3, 4, 5] =

Each row is one level of the fold; adjacent lists are merged pairwise:

    [2] [6] [10] [7] [8] [1] [9] [3] [4] [5]
    [2, 6] [7, 10] [1, 8] [3, 9] [4, 5]
    [2, 6, 7, 10] [1, 3, 8, 9] [4, 5]
    [1, 2, 3, 6, 7, 8, 9, 10] [4, 5]
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


Sort Algorithm

    sort :: Ord a => [a] -> [a]
    sort [] = []
    sort xs = treeFold merge [[x] | x <- xs]
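Putting the pieces together, a self-contained version of the whole sort might look like the following. It restates merge and treeFold from the earlier slides in plain ASCII, using `f` in place of the ⊕ argument.

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Balanced fold; only defined for non-empty lists.
treeFold :: (a -> a -> a) -> [a] -> a
treeFold _ [x] = x
treeFold f xs  = treeFold f (pairMap xs)
  where
    pairMap (x1 : x2 : rest) = f x1 x2 : pairMap rest
    pairMap rest = rest

-- Merge sort: wrap every element in a singleton list, then fold with merge.
sort :: Ord a => [a] -> [a]
sort [] = []
sort xs = treeFold merge [[x] | x <- xs]
```

The explicit `sort [] = []` clause is what protects treeFold from the empty list.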


So Why Is This Algorithm Fast?

It’s down to the pattern of the fold itself.

Because it splits the input evenly, the full algorithm is O(n log n) time.

If we had just used foldr, we would have defined insertion sort, which is O(n²).
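To make the difference concrete, the sketch below counts comparisons rather than measuring time. `mergeCount`, `combine`, `foldrComparisons` and `treeComparisons` are helper names invented for this illustration. On a descending input of length n, the foldr version performs 0 + 1 + ... + (n - 1) = n(n - 1)/2 comparisons, because each new element has to travel past the whole sorted rest.

```haskell
-- merge that also reports how many comparisons it made.
mergeCount :: Ord a => [a] -> [a] -> (Int, [a])
mergeCount [] ys = (0, ys)
mergeCount xs [] = (0, xs)
mergeCount (x : xs) (y : ys)
  | x <= y    = let (c, zs) = mergeCount xs (y : ys) in (c + 1, x : zs)
  | otherwise = let (c, zs) = mergeCount (x : xs) ys in (c + 1, y : zs)

-- Merge two counted lists, accumulating the comparison totals.
combine :: Ord a => (Int, [a]) -> (Int, [a]) -> (Int, [a])
combine (c1, xs) (c2, ys) = (c1 + c2 + c, zs)
  where
    (c, zs) = mergeCount xs ys

treeFold :: (a -> a -> a) -> [a] -> a
treeFold _ [x] = x
treeFold f xs  = treeFold f (pairMap xs)
  where
    pairMap (x1 : x2 : rest) = f x1 x2 : pairMap rest
    pairMap rest = rest

-- Comparisons made by the unbalanced (insertion sort) version.
foldrComparisons :: Ord a => [a] -> Int
foldrComparisons xs = fst (foldr combine (0, []) [(0, [x]) | x <- xs])

-- Comparisons made by the balanced (merge sort) version.
treeComparisons :: Ord a => [a] -> Int
treeComparisons [] = 0
treeComparisons xs = fst (treeFold combine [(0, [x]) | x <- xs])
```

Trying both functions on `[n, n - 1 .. 1]` for growing n in GHCi shows the quadratic count pulling away from the balanced one.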


Monoids


    class Monoid a where
      ε   :: a
      (•) :: a -> a -> a

Monoid
A monoid is a set with a neutral element ε, and a binary operator •, such that:

    (x • y) • z = x • (y • z)
    x • ε = x
    ε • x = x
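In the standard library the same class exists with different spellings: ε is `mempty` and • is `<>`. As a small sketch, the three laws can be written as an executable check on sample values; `lawsHold` is a name introduced here, and passing it on a few values is evidence rather than a proof.

```haskell
import Data.Monoid (Sum (..))

-- Checks associativity and both identity laws for three given values.
lawsHold :: (Eq a, Monoid a) => a -> a -> a -> Bool
lawsHold x y z =
  (x <> y) <> z == x <> (y <> z) -- associativity
    && x <> mempty == x          -- right identity
    && mempty <> x == x          -- left identity

-- Two sample checks: integers under addition, and lists under (++).
sumExample :: Bool
sumExample = lawsHold (Sum 1) (Sum 2) (Sum (3 :: Int))

listExample :: Bool
listExample = lawsHold "ab" "c" ""
```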


Examples of Monoids

• ℕ, under either + or ×.

• Lists:

    instance Monoid [a] where
      ε   = []
      (•) = (++)

• Ordered lists, with merge.
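The last example can be spelled out with the standard `Semigroup` and `Monoid` classes. The wrapper type `Sorted` below is hypothetical and only exists to keep this instance apart from the ordinary list instance; it assumes the wrapped list is already in ascending order.

```haskell
-- A list that is assumed to be in ascending order.
newtype Sorted a = Sorted { getSorted :: [a] }
  deriving (Eq, Show)

merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- The binary operator is merge ...
instance Ord a => Semigroup (Sorted a) where
  Sorted xs <> Sorted ys = Sorted (merge xs ys)

-- ... and the neutral element is the empty list.
instance Ord a => Monoid (Sorted a) where
  mempty = Sorted []
```

For instance, `Sorted [1, 8] <> Sorted [3, 9]` corresponds to the `merge [1,8] [3,9]` example from earlier.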


Let’s Rewrite treeFold to use Monoids

    treeFold :: Monoid a => [a] -> a
    treeFold []  = ε
    treeFold [x] = x
    treeFold xs  = treeFold (pairMap xs)
      where
        pairMap (x1 : x2 : xs) = (x1 • x2) : pairMap xs
        pairMap xs = xs

We can actually prove that this version returns the same results as foldr, as long as the monoid laws are followed.

It just performs the fold in a more efficient way.
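Written against the standard `Monoid` class (with `mempty` for ε and `<>` for •), the same fold can be compared with foldr on concrete inputs. The names `treeFoldM` and `agreesWithFoldr` are illustrative; the check is a spot test on sample data, not the proof mentioned above.

```haskell
-- Balanced fold over any monoid; total thanks to the empty-list clause.
treeFoldM :: Monoid a => [a] -> a
treeFoldM []  = mempty
treeFoldM [x] = x
treeFoldM xs  = treeFoldM (pairMap xs)
  where
    pairMap (x1 : x2 : rest) = (x1 <> x2) : pairMap rest
    pairMap rest = rest

-- Compares the balanced fold with the right fold on one input.
agreesWithFoldr :: (Eq a, Monoid a) => [a] -> Bool
agreesWithFoldr xs = treeFoldM xs == foldr (<>) mempty xs
```

List concatenation is associative but not commutative, so a list of strings is a reasonable input for checking that the element order is preserved.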


We’ve already seen one monoid we can use this fold with: ordered lists.

Another is floating-point numbers under summation. Using foldr or foldl will give you O(n) error growth, whereas using treeFold will give you O(log n).
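A rough way to observe this is to add up many copies of a value that is not exactly representable, in single precision so that rounding shows up quickly. The sketch below is an experiment to run, with `naiveSum`, `pairwiseSum` and `samples` as names made up for it; the exact digits printed will depend on the input chosen.

```haskell
import Data.List (foldl')

treeFold :: (a -> a -> a) -> [a] -> a
treeFold _ [x] = x
treeFold f xs  = treeFold f (pairMap xs)
  where
    pairMap (x1 : x2 : rest) = f x1 x2 : pairMap rest
    pairMap rest = rest

-- Left-to-right summation: one long chain of roundings.
naiveSum :: [Float] -> Float
naiveSum = foldl' (+) 0

-- Balanced (pairwise) summation: each value passes through O(log n) additions.
pairwiseSum :: [Float] -> Float
pairwiseSum [] = 0
pairwiseSum xs = treeFold (+) xs

-- 100000 copies of 0.1; the exact sum would be 10000.
samples :: [Float]
samples = replicate 100000 0.1
```

Comparing `naiveSum samples` and `pairwiseSum samples` against 10000 in GHCi shows how far each one drifts.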


Let’s Make It Incremental


treeFold currently processes the input in one big operation.

However, if we were able to process the input incrementally, with useful intermediate results, there are some other applications we can use the fold for.


A Binary Data Structure

We’re going to build a data structure based on the binary numbers.

For, say, 10 elements, we have the following binary number:

I₈ O₄ I₂ O₁

(With each bit annotated with its significance)

This number tells us how to arrange 10 elements into perfect trees.

[Figure: the elements 1 to 10 arranged into one perfect tree per I bit: a tree of 8 elements and a tree of 2.]
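The correspondence between the bits and the tree sizes can be sketched as a small function. `treeSizes` is a hypothetical helper, not part of the talk, and it assumes a non-negative argument.

```haskell
-- Sizes of the perfect trees needed to hold n elements: one power of two
-- for every set bit of n, smallest first. Assumes n >= 0.
treeSizes :: Int -> [Int]
treeSizes n = [2 ^ i | (i, isSet) <- zip [0 :: Int ..] (bits n), isSet]
  where
    -- Bits of a non-negative number, least significant first.
    bits :: Int -> [Bool]
    bits 0 = []
    bits m = odd m : bits (m `div` 2)
```

For 10 elements the set bits are those with significance 2 and 8, matching the two trees described above.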



The Incremental Type

We can write this as a datatype:

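    -- Each entry pairs a gap (the number of O bits skipped) with the value
    -- stored at the next I bit.
    type Incremental a = [(Int, a)]

    -- Insert an element, like incrementing a binary number: a zero gap means
    -- the slot is occupied, so combine and carry into the next position.
    cons :: (a -> a -> a) -> a -> Incremental a -> Incremental a
    cons f = go 0
      where
        go i x []            = [(i, x)]
        go i x ((0, y) : ys) = go (i + 1) (f x y) ys
        go i x ((j, y) : ys) = (i, x) : (j - 1, y) : ys

    -- Combine the remaining values from left to right.
    run :: (a -> a -> a) -> Incremental a -> a
    run f = foldr1 f . map snd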

And we can even implement treeFold using it:

    treeFold :: (a -> a -> a) -> [a] -> a
    treeFold f = run f . foldr (cons f) []
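To see the binary counter at work, it may help to fold with an operator that records how it was applied. The sketch below assumes that the zero-gap case of `go` carries the combined value into the next position, as incrementing a binary number would. `paren` and `shape` are hypothetical names introduced for illustration.

```haskell
type Incremental a = [(Int, a)]

-- Assumption: a zero gap triggers a carry into the next position.
cons :: (a -> a -> a) -> a -> Incremental a -> Incremental a
cons f = go 0
  where
    go i x []            = [(i, x)]
    go i x ((0, y) : ys) = go (i + 1) (f x y) ys
    go i x ((j, y) : ys) = (i, x) : (j - 1, y) : ys

run :: (a -> a -> a) -> Incremental a -> a
run f = foldr1 f . map snd

treeFold :: (a -> a -> a) -> [a] -> a
treeFold f = run f . foldr (cons f) []

-- Hypothetical operator: wraps its arguments in brackets so that the
-- nesting of the fold stays visible in the result.
paren :: String -> String -> String
paren x y = "(" ++ x ++ y ++ ")"

-- Five elements is 101 in binary: one single element at bit 0, one skipped
-- bit, then a perfect tree of four elements at bit 2.
shape :: Incremental String
shape = foldr (cons paren) [] (map pure "abcde")
```

With these definitions `shape` is `[(0, "a"), (1, "((bc)(de))")]`, and `treeFold paren (map pure "abcde")` gives `"(a((bc)(de)))"`.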


We can now use the function incrementally.

    treeScanl f = map (run f) . tail . scanl (flip (cons f)) []
    treeScanr f = map (run f) . init . scanr (cons f) []

We could, for instance, sort all of the tails of a list efficiently in this way (although I’m not sure why you’d want to!).

    treeScanr merge (map pure [2, 6, 1, 3, 4, 5]) ≡
      [[1, 2, 3, 4, 5, 6], [1, 3, 4, 5, 6], [1, 3, 4, 5], [3, 4, 5], [4, 5], [5]]

A more practical use is to extract the k smallest elements from a list, which can be achieved with a variant on this fold.
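The sorted-tails example can be reproduced with a self-contained sketch. Two assumptions are made here: `merge` is taken to be the usual merge of two sorted lists, and `cons` is taken to carry on a zero gap.

```haskell
type Incremental a = [(Int, a)]

-- Assumption: a zero gap triggers a carry into the next position.
cons :: (a -> a -> a) -> a -> Incremental a -> Incremental a
cons f = go 0
  where
    go i x []            = [(i, x)]
    go i x ((0, y) : ys) = go (i + 1) (f x y) ys
    go i x ((j, y) : ys) = (i, x) : (j - 1, y) : ys

run :: (a -> a -> a) -> Incremental a -> a
run f = foldr1 f . map snd

treeScanl, treeScanr :: (a -> a -> a) -> [a] -> [a]
treeScanl f = map (run f) . tail . scanl (flip (cons f)) []
treeScanr f = map (run f) . init . scanr (cons f) []

-- Assumed definition: merge two sorted lists into one sorted list.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x : xs) (y : ys)
  | x <= y    = x : merge xs (y : ys)
  | otherwise = y : merge (x : xs) ys

-- Every tail of the input, sorted.
sortedTails :: [[Int]]
sortedTails = treeScanr merge (map pure [2, 6, 1, 3, 4, 5])

-- The same machinery gives running totals from the left.
runningSums :: [Int]
runningSums = treeScanl (+) [1, 2, 3, 4]
```

Each intermediate structure is shared with the next step of the scan, so no tail is sorted from scratch.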



But, as we saw already, the only required element here is the Monoid.

If we remember back to the (ℕ, 0, +) monoid, we can now build a collection which tracks the number of elements it has.

    data Tree a
      = Leaf { size :: Int, val :: a }
      | Node { size :: Int, lchild :: Tree a, rchild :: Tree a }

    leaf :: a -> Tree a
    leaf x = Leaf 1 x

    node :: Tree a -> Tree a -> Tree a
    node xs ys = Node (size xs + size ys) xs ys
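A small sketch shows what the smart constructors buy us: the cached `size` always agrees with a full count of the leaves, but reading it takes constant time. `countLeaves` and `example` are hypothetical names used only here.

```haskell
data Tree a
  = Leaf { size :: Int, val :: a }
  | Node { size :: Int, lchild :: Tree a, rchild :: Tree a }

leaf :: a -> Tree a
leaf x = Leaf 1 x

-- The size of a node is the sum of its children's sizes: the (+) monoid.
node :: Tree a -> Tree a -> Tree a
node xs ys = Node (size xs + size ys) xs ys

-- Hypothetical helper: recount the leaves the slow way, ignoring the cache.
countLeaves :: Tree a -> Int
countLeaves (Leaf _ _)   = 1
countLeaves (Node _ l r) = countLeaves l + countLeaves r

example :: Tree Char
example = node (node (leaf 'a') (leaf 'b')) (leaf 'c')
```

For `example`, both `size` and `countLeaves` give 3, as long as trees are only ever built with `leaf` and `node`.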


Not so useful, no, but remember that we have a way to build this type incrementally, in a balanced way.

    type Array a = Incremental (Tree a)

Insertion is O(log n):

    insert :: a -> Array a -> Array a
    insert x = cons node (leaf x)

    fromList :: [a] -> Array a
    fromList = foldr insert []


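And finally lookup, the key feature missing from our persistent implementation of arrays, is also O(log n):

    -- Index into a single tree, using the cached sizes to pick a branch.
    lookupTree :: Int -> Tree a -> a
    lookupTree _ (Leaf _ x) = x
    lookupTree i (Node _ xs ys)
      | i < size xs = lookupTree i xs
      | otherwise   = lookupTree (i - size xs) ys

    -- Skip whole trees until the index falls inside one.
    -- (This name clashes with Prelude's lookup, which needs hiding.)
    lookup :: Int -> Array a -> Maybe a
    lookup = flip (foldr f b)
      where
        b _ = Nothing
        f (_, x) xs i
          | i < size x = Just (lookupTree i x)
          | otherwise  = xs (i - size x)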
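Putting the pieces together, here is a minimal end-to-end sketch of the array. It assumes that `cons` carries on a zero gap and that indices are non-negative; `letters` is a hypothetical example value.

```haskell
import Prelude hiding (lookup)

type Incremental a = [(Int, a)]

-- Assumption: a zero gap triggers a carry into the next position.
cons :: (a -> a -> a) -> a -> Incremental a -> Incremental a
cons f = go 0
  where
    go i x []            = [(i, x)]
    go i x ((0, y) : ys) = go (i + 1) (f x y) ys
    go i x ((j, y) : ys) = (i, x) : (j - 1, y) : ys

data Tree a
  = Leaf { size :: Int, val :: a }
  | Node { size :: Int, lchild :: Tree a, rchild :: Tree a }

leaf :: a -> Tree a
leaf x = Leaf 1 x

node :: Tree a -> Tree a -> Tree a
node xs ys = Node (size xs + size ys) xs ys

type Array a = Incremental (Tree a)

insert :: a -> Array a -> Array a
insert x = cons node (leaf x)

fromList :: [a] -> Array a
fromList = foldr insert []

lookupTree :: Int -> Tree a -> a
lookupTree _ (Leaf _ x) = x
lookupTree i (Node _ xs ys)
  | i < size xs = lookupTree i xs
  | otherwise   = lookupTree (i - size xs) ys

-- Walk the list of trees, subtracting each skipped tree's size.
lookup :: Int -> Array a -> Maybe a
lookup = flip (foldr f b)
  where
    b _ = Nothing
    f (_, x) xs i
      | i < size x = Just (lookupTree i x)
      | otherwise  = xs (i - size x)

-- Ten elements: stored as a tree of 2 followed by a tree of 8.
letters :: Array Char
letters = fromList "abcdefghij"
```

Here `lookup 0 letters` is `Just 'a'`, `lookup 9 letters` is `Just 'j'`, and `lookup 10 letters` is `Nothing`. There are at most O(log n) trees to skip and each has depth at most O(log n), which is where the bound comes from.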


Finger Trees


So we have seen a number of techniques today:

• Using pointers and sharing to make a data structure persistent.

• Using monoids to describe folding operations.

• Using balanced folding operations to take an O(n) operation to an O(log n) one (in terms of time and other things like error growth).

• Using a number-based data structure to incrementalise some of those folds.

• Using that incremental structure to implement things like lookup.

There is a single data structure which does pretty much all of this, and more: the Finger Tree.


Finger Trees

Ralf Hinze and Ross Paterson. Finger Trees: A Simple General-purpose Data Structure. Journal of Functional Programming, 16(2):197–217, 2006.

A monoid-based tree-like structure, much like our “Incremental” type.

However, much more general.

Supports insertion, deletion, but also concatenation.

Also our lookup function is more generally described by the “split” operation.

All based around some monoid.


Uses for Finger Trees

Just by switching out the monoid for something else we can get an almost entirely different data structure.

• Priority Queues

• Search Trees

• Priority Search Queues (think: Dijkstra’s Algorithm)

• Prefix Sum Trees

• Array-like random-access lists: this is precisely what’s done in Haskell’s Data.Sequence.
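As a brief illustration of the last item, the sketch below uses Data.Sequence from the containers package for the operations discussed above: indexing, insertion at either end, concatenation, and splitting. The example values are hypothetical, and `Seq.lookup` assumes a reasonably recent version of containers.

```haskell
import Data.Sequence (Seq, (<|), (|>), (><))
import qualified Data.Sequence as Seq

xs :: Seq Int
xs = Seq.fromList [1 .. 10]

-- Indexing by position, like our lookup.
third :: Maybe Int
third = Seq.lookup 2 xs

-- Insertion at both ends, plus concatenation.
joined :: Seq Int
joined = (0 <| xs) >< (Seq.fromList [11, 12] |> 13)

-- Splitting at a position.
halves :: (Seq Int, Seq Int)
halves = Seq.splitAt 4 xs
```

Here `third` is `Just 3`, `joined` holds the numbers 0 to 13 in order, and `halves` separates the first four elements from the remaining six.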
