Category Theory for Beginners
Your data structures are made of maths!
Melbourne Scala User Group, Mar 2015
@KenScambler
Data structures
More and more “logical”
Less low-level memory/hardware connection
More and more support for immutability
We can use maths to reason about them!
Category Theory can reveal even deeper symmetries
Algebraic Data Types
Constants, sums, products, exponents
We can directly manipulate them with algebra!
Products
A × B
NinjaTurtles × BluesBrothers = NTs and BBs
4 × 2 = 8
Integers as types?
We can actually use integers to represent our types!
The integers correspond to the size of the type
Products in code
(NinjaTurtle, BluesBrother)
NinjaTurtle (4) × BluesBrother (2) = 8
case class TurtleAndBrother(foo1: NinjaTurtle, foo2: BluesBrother)
NinjaTurtle (4) × BluesBrother (2) = 8
trait Confluster {
  def defrabulate(): BluesBrother
  def empontigle(): NinjaTurtle
}
BluesBrother (2) × NinjaTurtle (4) = 8
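To make the counting concrete, here is a minimal sketch that enumerates every value of the product type. The member names (Leonardo, Jake and so on) are assumptions for illustration only; the slides give just the sizes 4 and 2.

```scala
object ProductDemo {
  // Hypothetical four-value type standing in for the slide's NinjaTurtle.
  sealed trait NinjaTurtle
  case object Leonardo extends NinjaTurtle
  case object Donatello extends NinjaTurtle
  case object Raphael extends NinjaTurtle
  case object Michelangelo extends NinjaTurtle

  // Hypothetical two-value type standing in for the slide's BluesBrother.
  sealed trait BluesBrother
  case object Jake extends BluesBrother
  case object Elwood extends BluesBrother

  val turtles: List[NinjaTurtle] = List(Leonardo, Donatello, Raphael, Michelangelo)
  val brothers: List[BluesBrother] = List(Jake, Elwood)

  // Every value of the product pairs one turtle with one brother.
  val pairs: List[(NinjaTurtle, BluesBrother)] =
    for (t <- turtles; b <- brothers) yield (t, b)
}
```

Under these assumptions, `ProductDemo.pairs.size` is 4 × 2 = 8.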
Sums
A + B
Boolean = {true, false}
Boolean + Shapes = Boolean or Shapes
2 + 4 = 6
Sums in code
sealed trait Holiday
case object Christmas extends Holiday
case object Easter extends Holiday
case object AnzacDay extends Holiday
Christmas (1) + Easter (1) + AnzacDay (1) = 3
sealed trait Opt[A]
case class Some[A](a: A) extends Opt[A]
case class None[A]() extends Opt[A]
Some[A] (A) + None[A] (1) = A + 1
Either[Holiday, Opt[Boolean]]
Holiday (3) + Opt[Boolean] (2 + 1 = 3)
3 + 3 = 6
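The same count can be checked by listing the values. This is only a sketch: it uses the standard library's Option in place of the deck's Opt, and the alias HolidayOrFlag is a made-up name.

```scala
object SumDemo {
  sealed trait Holiday
  case object Christmas extends Holiday
  case object Easter extends Holiday
  case object AnzacDay extends Holiday

  // Hypothetical alias; Option stands in for the deck's Opt.
  type HolidayOrFlag = Either[Holiday, Option[Boolean]]

  val holidays: List[Holiday] = List(Christmas, Easter, AnzacDay)               // 3
  val optBooleans: List[Option[Boolean]] = List(None, Some(true), Some(false))  // 2 + 1

  // A value of the sum is either a Left(holiday) or a Right(optional boolean).
  val all: List[HolidayOrFlag] =
    holidays.map(h => Left(h): HolidayOrFlag) ++
      optBooleans.map(o => Right(o): HolidayOrFlag)
}
```

`SumDemo.all.size` comes out as 3 + 3 = 6.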
Exponents
B^A
A type to the power of another type?? Huh?
Boolean^Shapes = Shapes → Boolean
2^3 = 8
Function types = exponents!
A → B = B^A
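One way to see why a function type is an exponent is to enumerate every total function between two small types. The sketch below assumes a three-value Shape type (Circle, Square, Triangle are invented names) and represents each function by its table of outputs.

```scala
object ExponentDemo {
  // Hypothetical three-value type; the slide's shape pictures did not survive.
  sealed trait Shape
  case object Circle extends Shape
  case object Square extends Shape
  case object Triangle extends Shape

  val shapes: List[Shape] = List(Circle, Square, Triangle)
  val booleans: List[Boolean] = List(true, false)

  // A total function Shape => Boolean is fully described by its output table,
  // so we build every possible table, one shape at a time.
  val tables: List[Map[Shape, Boolean]] =
    shapes.foldLeft(List(Map.empty[Shape, Boolean])) { (acc, s) =>
      for (table <- acc; b <- booleans) yield table + (s -> b)
    }

  // Each table can be wrapped as an ordinary function.
  val functions: List[Shape => Boolean] =
    tables.map(t => (s: Shape) => t(s))
}
```

Each shape independently picks one of 2 outputs, so there are 2 × 2 × 2 = 2^3 = 8 tables.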
Exponents in code
def getTurtle(): NinjaTurtles
def getTurtle: () => NinjaTurtles
1 → 4
4^1 = 4
Functions “with no arguments” are tacitly from a singleton type such as Unit
Singleton types carry no information.
trait State[S, A] {
  def run(state: S): (A, S)
}
S → A×S
(A×S)^S
            Sets    Scala     Algebra
Zero        {}      Nothing   0
One         {•}     Unit      1
Products    A×B     (A, B)    AB
Sums        A∪B     A \/ B    A+B
Exponents   B^A     A => B    B^A
Currying
def tupled[A, B, C](a: A, b: B): C
def curried[A, B, C]: A => (B => C)
tupled: A×B → C = C^(AB)
curried: A → (B → C) = (C^B)^A = C^(AB)
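The algebraic identity (C^B)^A = C^(AB) corresponds to a pair of conversions that undo each other. A minimal sketch, with `curry`, `uncurry` and `area` as illustrative names rather than anything from the talk:

```scala
object CurryDemo {
  // From C^(A×B) to (C^B)^A: turn a function on pairs into a function returning a function.
  def curry[A, B, C](f: ((A, B)) => C): A => (B => C) =
    a => b => f((a, b))

  // From (C^B)^A back to C^(A×B).
  def uncurry[A, B, C](g: A => (B => C)): ((A, B)) => C = {
    case (a, b) => g(a)(b)
  }

  // Hypothetical example function on pairs.
  val area: ((Int, Int)) => Int = { case (w, h) => w * h }

  val curriedArea: Int => Int => Int = curry(area)
  val roundTrip: ((Int, Int)) => Int = uncurry(curriedArea)
}
```

Going there and back gives a function that agrees with the original on every input, which is what the matching exponents predict.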
Recursion
sealed trait List[+A]
case class Cons[A](h: A, t: List[A]) extends List[A]
case object Nil extends List[Nothing]
Cons: A × L(A)
Nil: 1
L(A) = 1 + A × L(A)
Expanding a list…
L(a) = 1 + a L(a)
     = 1 + a (1 + a L(a))
     = 1 + a + a^2 (1 + a L(a))
     …
     = 1 + a + a^2 + a^3 + a^4 + a^5 + …
Nil, or 1-length, or 2-length, or 3-length, or 4-length, etc.
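Each term a^n of the series counts the lists of exactly n elements. A small sketch that checks this for Boolean (a = 2), using the standard library List rather than the deck's own definition; `listsOfLength` is a hypothetical helper and assumes n ≥ 0:

```scala
object ListCountDemo {
  // All lists of exactly `n` elements drawn from `values` (assumes n >= 0).
  def listsOfLength[A](values: List[A], n: Int): List[List[A]] =
    if (n == 0) {
      List(Nil) // the "1" term: only Nil
    } else {
      // the "a × L(a)" term: one head, then a shorter list
      for (h <- values; t <- listsOfLength(values, n - 1)) yield h :: t
    }

  val booleans: List[Boolean] = List(true, false) // a = 2

  // Sizes for lengths 0 to 4: 1, a, a^2, a^3, a^4.
  val counts: List[Int] =
    (0 to 4).toList.map(n => listsOfLength(booleans, n).size)
}
```

With a = 2 the counts are 1, 2, 4, 8, 16, matching 1 + a + a^2 + a^3 + a^4.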
What does it mean for
two types to have the
same number?
Not identical… but
isomorphic
Terminology
isomorphism
Equal-shape-ism
“Sorta kinda the same-ish” but I want to sound really smart
- Programmers
One-to-one mapping between two objects so you can go back-and-forth without losing information
These 4 Shapes ≅ The Wiggles (functions in Set)
4 = 4
There can be lots of
isos between two
objects!
If there’s at least one, we
can say they are
isomorphic
or A ≅ B
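In code, an isomorphism can be sketched as two functions that undo each other. The `Iso` case class below is a hypothetical helper, not a library type; the example shows two different isomorphisms between Boolean (2) and Either[Unit, Unit] (1 + 1).

```scala
object IsoDemo {
  // Hypothetical helper: a pair of functions that are expected to be mutual inverses.
  final case class Iso[A, B](to: A => B, from: B => A)

  val allBooleans: List[Boolean] = List(true, false)
  val allEithers: List[Either[Unit, Unit]] = List(Left(()), Right(()))

  // One isomorphism: true <-> Right, false <-> Left.
  val boolAsEither: Iso[Boolean, Either[Unit, Unit]] =
    Iso[Boolean, Either[Unit, Unit]](
      to = b => if (b) Right(()) else Left(()),
      from = e => e.isRight
    )

  // A second, different isomorphism between the same two types.
  val flipped: Iso[Boolean, Either[Unit, Unit]] =
    Iso[Boolean, Either[Unit, Unit]](
      to = b => if (b) Left(()) else Right(()),
      from = e => e.isLeft
    )

  // Checks both round trips over the listed values.
  def roundTrips[A, B](iso: Iso[A, B], as: List[A], bs: List[B]): Boolean =
    as.forall(a => iso.from(iso.to(a)) == a) &&
      bs.forall(b => iso.to(iso.from(b)) == b)
}
```

Both `boolAsEither` and `flipped` pass the round-trip check, which illustrates that there can be several isos between the same two objects.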
Programming language syntax
Programming language FEATURES!!!
NAMES NAMES NAMES
actual structure
class Furpular { }
class Diggleton { }
Looking at data structures
algebraically lets us
compare the true structure
actual structure ≅ actual structure, but faster
Knowing an isomorphism, we can rewrite for
• Performance
• Memory usage
• Elegance
with proven correctness!*
*mumble mumble non-termination
Category Theory
Objects: A, B
Arrows: A → B
Arrows join objects. They can represent absolutely anything.
Categories generalise functions over sets
f: A → B
Arrows compose like functions
f: A → B, g: B → C
g ∘ f: A → C
Every object has an identity arrow, just like the identity function
That’s all a category is!
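Scala types and functions give a familiar instance of these ingredients: types as objects, functions as arrows. The sketch below uses made-up functions `f` and `g` purely for illustration.

```scala
object CategoryDemo {
  val f: Int => String = n => "#" * n       // an arrow A → B
  val g: String => Int = s => s.length * 2  // an arrow B → C

  // Composition: g ∘ f is an arrow A → C.
  val gAfterF: Int => Int = g compose f

  // Identity arrows for the objects Int and String.
  val idInt: Int => Int = x => x
  val idString: String => String = s => s

  // Composing with an identity on either side leaves f unchanged.
  val fAfterId: Int => String = f compose idInt
  val idAfterF: Int => String = idString compose f
}
```

For any input, `fAfterId` and `idAfterF` give the same result as `f` itself.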
Products in CT
first: A × B → A, second: A × B → B
trait Product[A,B] {
  def first: A
  def second: B
}
Sums in CT
Left: A → A + B, Right: B → A + B
sealed trait Sum[A,B]
case class Left[A,B](a: A) extends Sum[A,B]
case class Right[A,B](b: B) extends Sum[A,B]
Opposite categories
C and C^op
In C: f: A → B, g: B → C, g ∘ f: A → C
In C^op: f: B → A, g: C → B, f ∘ g: C → A
Isomorphic!
Just flip the arrows, and
reverse composition!
In C: A ← A×B → B
In C^op: A → A+B ← B
A product in C is a sum in C^op
A sum in C is a product in C^op
Sums are
isomorphic to
Products!
Terminology
dual
An object and its equivalent in the opposite category are dual to each other.
Co-(thing)
Often we call something’s dual a Co-(thing)
Coproducts
Sums are also called Coproducts
Tightening the definitions
first: A × B → A, second: A × B → B
trait ProductPlusPlus[A,B] {
  def first: A
  def second: B
  def banana: Banana
  def brother: BluesBrother
}
Does that still count as A × B?
No way!
first: A × B → A, second: A × B → B
Umpire, with arrows theA: Umpire → A and someB: Umpire → B
trait Umpire {
  def theA: A
  def someB: B
}
∃ unique arrow from Umpire to A × B
Instances
Umpire: theA = a, someB = b
trait Umpire {
  def theA: A = a
  def someB: B = b
}
ProductPlusPlus: (a, b, some banana, some brother)
not actually unique
Requiring a unique arrow
from a 3rd object that
independently knows A
and B proves that there’s
no extra gunk.
But wait!
What if Umpire has
special knowledge about
other products?
first: A × B → A, second: A × B → B
Umpire, with arrows theA: Umpire → A and someB: Umpire → B
trait Umpire {
  def theA: A
  def someB: B
  def specialOtherProd: (A,B)
}
It could introduce strange
new things with its special
knowledge!
We need to know nothing
about the object other
than the two arrows!
P, with arrows to A and B
???, with arrows ? and ? to A and B
If, for every object ??? that has arrows to A and B,
there exists a unique arrow from ??? to P that agrees with those arrows,
then P is “the” product of A and B!
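For Scala tuples, the arrow promised by this property can be written down directly. In this sketch `pair` is a hypothetical name for that mediating arrow, and `Umpire` is made concrete with Int and String only for illustration.

```scala
object ProductUmpDemo {
  // The two projections of the product.
  def first[A, B](p: (A, B)): A = p._1
  def second[A, B](p: (A, B)): B = p._2

  // Given any X with arrows to A and B, build the arrow X → A × B.
  def pair[X, A, B](toA: X => A, toB: X => B): X => (A, B) =
    x => (toA(x), toB(x))

  // Hypothetical third object playing the Umpire role.
  final case class Umpire(theA: Int, someB: String)

  val mediating: Umpire => (Int, String) =
    pair[Umpire, Int, String](u => u.theA, u => u.someB)
}
```

Following `mediating` with `first` gives back `theA`, and following it with `second` gives back `someB`; any function with both of those properties has to return exactly `(theA, someB)`, which is the uniqueness part.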
Same with
sums!
S, with arrows from A and B
???, with arrows ? and ? from A and B
If, for every object ??? that has arrows from A and B,
there exists a unique arrow from S to ??? that agrees with those arrows,
then S is “the” sum of A and B!
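The dual construction for Scala's Either can be sketched the same way. Here `either` is a hypothetical name for the arrow out of the sum, and `describe` is an invented example.

```scala
object SumUmpDemo {
  // Given any X with arrows from A and B, build the arrow A + B → X.
  def either[A, B, X](fromA: A => X, fromB: B => X): Either[A, B] => X = {
    case Left(a)  => fromA(a)
    case Right(b) => fromB(b)
  }

  // Hypothetical example: both Int and Boolean can be turned into a String.
  val describe: Either[Int, Boolean] => String =
    either[Int, Boolean, String](n => s"number $n", b => s"flag $b")
}
```

Feeding `describe` a `Left` behaves like the Int arrow and feeding it a `Right` behaves like the Boolean arrow, so it agrees with both original arrows.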
Terminology
universal property
universal mapping property (UMP)
“The most efficient solution to a problem”
Proves that
- We generate only what we need
- We depend on only what we need
Compare to programming:
trait Monoid[M] {
  def id: M
  def compose(a: M, b: M): M
}
trait Foldable[F[_]] {
  def foldMap[M: Monoid, A](fa: F[A], f: A => M): M
}
Like UMPs, type parameters
“for all F”
“for all A and M where M is a Monoid”
don’t just prove what your code is,
but what it isn’t.
Proving what your code isn’t
prevents bloat and error, and
promotes reuse.
Proving what your code is allows
you to use it.
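A minimal sketch of how these two traits might be instantiated and used. The instances `intAddition` and `listFoldable` are assumptions for illustration; the talk only gives the trait definitions.

```scala
object FoldableDemo {
  trait Monoid[M] {
    def id: M
    def compose(a: M, b: M): M
  }

  trait Foldable[F[_]] {
    def foldMap[M: Monoid, A](fa: F[A], f: A => M): M
  }

  // Hypothetical instance: integers under addition.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def id: Int = 0
    def compose(a: Int, b: Int): Int = a + b
  }

  // Hypothetical instance: folding a standard List.
  // The implementation can only use `id` and `compose`; it knows nothing else about M.
  val listFoldable: Foldable[List] = new Foldable[List] {
    def foldMap[M: Monoid, A](fa: List[A], f: A => M): M = {
      val m = implicitly[Monoid[M]]
      fa.foldLeft(m.id)((acc, a) => m.compose(acc, f(a)))
    }
  }

  // Total length of all strings: map each to its length, combine with +.
  val totalLength: Int =
    listFoldable.foldMap(List("ab", "cde"), (s: String) => s.length)
}
```

Because `foldMap` is written for all M with a Monoid, the body cannot accidentally depend on anything specific to Int.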
Types vs Algebra vs CT
            Scala     Algebra   CT
Zero        Nothing   0         ?
One         Unit      1         ?
Products    (A, B)    AB        ?
Sums        A \/ B    A+B       ?
Exponents   A => B    B^A       ?
Intermission
Injections
f: A → B
A function is injective if it maps 1-to-1 onto a subset of the codomain
All the information in A is preserved…
But everything else in B is lost.
Another way of looking at it…
g, h: C → A, f: A → B
If f ∘ g = f ∘ h, then g = h
Injections in code
User(firstName = "Bob", lastName = "Smith", age = 73)
User → JSON
{"firstName": "Bob", "lastName": "Smith", "age": 73}
We don’t lose information converting to JSON.
But from JSON?
We lose structure…
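The asymmetry can be sketched without any JSON library. The encoder and decoder below are hand-rolled assumptions for illustration (and assume names contain no quotes or backslashes); they are not how a real JSON codec works.

```scala
object JsonInjectionDemo {
  final case class User(firstName: String, lastName: String, age: Int)

  // Hand-rolled encoder: every User maps to its own distinct JSON string.
  def toJson(u: User): String =
    s"""{"firstName": "${u.firstName}", "lastName": "${u.lastName}", "age": ${u.age}}"""

  // Matches only strings in exactly the shape produced by toJson.
  private val UserJson =
    """\{"firstName": "([^"]*)", "lastName": "([^"]*)", "age": (\d+)\}""".r

  // Going back is only partial: most JSON strings are not users.
  def fromJson(json: String): Option[User] = json match {
    case UserJson(first, last, age) => Some(User(first, last, age.toInt))
    case _                          => None
  }
}
```

Encoding and then decoding returns the original user, but decoding an arbitrary JSON value such as `[1, 2, 3]` yields `None`: the mapping is injective, not surjective.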
Surjections
f: A → B
A function is surjective if it maps onto the whole of the codomain
All the information in B is preserved…
But everything else in A is lost.
f: A → B, g, h: B → C
If g ∘ f = h ∘ f, then g = h
Injections generalised
Monomorphisms
g, h: C → A, f: A → B
f is monic:
If f ∘ g = f ∘ h, then g = h
Surjections generalised
Epimorphisms
f: A → B, g, h: B → C
f is epic:
If g ∘ f = h ∘ f, then g = h
A mono in C is an epi in C^op
Monos are dual to epis.
Bijections
f: A → B
A function is bijective if it maps 1-to-1 onto the whole codomain
We don’t lose any information A → B → A…
Nor do we lose information B → A → B…
The CT equivalent is of course…
Isomorphisms!
Injection Surjection Bijection
Mono Epi Iso
Motivation
So much software is just mapping
between things!
form → {…} → Order(a,b,c) → INSERT INTO… → Receipt(…) → <xml> → receipt
Logs on disk
It is essential to understand how
information is preserved in flows like
this
Chinese
whispers!
Bugs proliferate where data is lost!
Injections and surjections tell us
what is preserved and what is lost
Bijections are especially valuable
Conclusion
Data structures are
made out of maths!
How we map between them
is maths too.
Understanding the
underlying shape of data
and functions is enormously
helpful for more robust,
error-free software
Isomorphism is more
interesting than
equality!
Isomorphic types can be
rewritten, optimised without
error.
Isomorphic mappings allow
us to preserve information
Universal properties exemplify
• Depending on the minimum
• Producing the minimum
Again, Category Theory
shows us deeper, simpler
patterns, unifying concepts
that otherwise look different
Further reading
Awodey, “Category Theory”
Lawvere & Schanuel, “Conceptual Mathematics: an introduction to categories”
Jeremy Kun, “Math ∩ Programming” at http://jeremykun.com/
Chris Taylor, “The algebra of algebraic datatypes”
http://chris-taylor.github.io/blog/2013/02/10/the-algebra-of-algebraic-data-types/
http://chris-taylor.github.io/blog/2013/02/11/the-algebra-of-algebraic-data-types-part-ii/
http://chris-taylor.github.io/blog/2013/02/13/the-algebra-of-algebraic-data-types-part-iii/
Bartosz Milewski, “Categories for Programmers”
http://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/
http://bartoszmilewski.com/2015/03/13/function-types/