27
OCaml Datatypes Part II: An Exercise in Type Design COS 326 Andrew W. Appel Princeton University slides copyright 2013-2015 David Walker and Andrew W. Appel

OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

OCaml Datatypes Part II: An Exercise in Type Design

COS 326 Andrew W. Appel

Princeton University

slides copyright 2013-2015 David Walker and Andrew W. Appel

Page 2: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

A Note on Parameterized Type Definitions

Page 3: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

type (‘key, ‘val) tree = Leaf | Node of ‘key * ‘val * (‘key, ‘val) tree * (‘key, ‘val) tree type ‘a stree = (string, ‘a) tree type sitree = int stree

type ‘x f = body

arg f

definition:

use:

type f x = body

f arg

definition:

use:

General form: A Better Notation:

Page 4: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Take-home Message • Think of parameterized types like functions:

– a function that take a type as an argument – produces a type as a result

• Theoretical basis:

– System F-omega – a typed lambda calculus with general type-level functions as

well as value-level functions

Page 5: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Example Type Design

5

IBM developed GML (Generalize Markup Language) in 1969 • http://en.wikipedia.org/wiki/IBM_Generalized_Markup_Language • Precursor to SGML, HTML and XML

:h1.Chapter 1: Introduction :p.GML supported hierarchical containers, such as :ol :li.Ordered lists (like this one), :li.Unordered lists, and :li.Definition lists :eol. as well as simple structures. :p.Markup Minimization (later generalized and formalized in SGML), allowed the end-tags to be omitted for the “h1” and “p” elements.

Page 6: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Simplified GML

6

To process a GML document, an OCaml program would: • Read a series of characters from a text file & Parse GML structure • Represent the information content as an OCaml data structure • Analyze or transform the data structure • Print/Store/Communicate results We will focus on how to represent and transform the information content of a GML document.

Page 7: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Example Type Design

7

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

• A GML document consists of: – a list of elements

• An element is either: – a word or markup applied to an element

• Markup is either: – italicize, bold, or a font name

Page 8: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Example Data

8

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

let d = [ Formatted (Bold, Formatted (Font “Arial”, Words [“Chapter”;“One”])); Words [“It”; ”was”; ”a”; ”dark”; ”&”; ”stormy; ”night.”; "A"]; Formatted (Ital, Words[“shot”]); Words [“rang”; ”out.”] ];;

Page 9: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Challenge

9

• Change all of the “Arial” fonts in a document to “Courier”. • Of course, when we program functionally, we implement

change via a function that – receives one data structure as input – builds a new (different) data structure as an output

Page 10: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Challenge

10

• Change all of the “Arial” fonts in a document to “Courier”.

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 11: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Challenge

11

• Change all of the “Arial” fonts in a document to “Courier”.

• Technique: approach the problem top down, work on doc first:

let rec chfonts (elts:doc) : doc =

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 12: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Challenge

12

• Change all of the “Arial” fonts in a document to “Courier”.

• Technique: approach the problem top down, work on doc first:

let rec chfonts (elts:doc) : doc = match elts with | [] -> | hd::tl ->

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 13: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Challenge

13

• Change all of the “Arial” fonts in a document to “Courier”.

• Technique: approach the problem top down, work on doc first:

let rec chfonts (elts:doc) : doc = match elts with | [] -> [] | hd::tl -> (chfont hd)::(chfonts tl)

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 14: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

14

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing the font of an element:

let rec chfont (e:elt) : elt =

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 15: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

15

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing the font of an element:

let rec chfont (e:elt) : elt = match e with | Words ws -> | Formatted(m,e) ->

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 16: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

16

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing the font of an element:

let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) ->

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 17: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

17

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing the font of an element:

let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) -> Formatted(chmarkup m, chfont e)

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 18: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

18

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing a markup:

let chmarkup (m:markup) : markup =

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 19: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Changing fonts in an element

19

• Change all of the “Arial” fonts in a document to “Courier”.

• Next work on changing a markup:

let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m

type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list

Page 20: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Summary: Changing fonts in an element

20

• Change all of the “Arial” fonts in a document to “Courier” • Lesson: function structure follows type structure

let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) -> Formatted(chmarkup m, chfont e) let rec chfonts (elts:doc) : doc = match elts with | [] -> [] | hd::tl -> (chfont hd)::(chfonts tl)

Page 21: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Poor Style

21

• Consider again our definition of markup and markup change:

type markup = Ital | Bold | Font of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m

Page 22: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Poor Style

22

• What if we make a change:

type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m

the underscore silently catches all possible alternatives this may not be what we want -- perhaps there is an Arial TT font it is better if we are alerted of all functions whose implementation may need to change

Page 23: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Better Style

23

• Original code:

type markup = Ital | Bold | Font of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | Ital | Bold -> m

Page 24: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Better Style

24

• Updated code:

type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | Ital | Bold -> m

..match m with | Font "Arial" -> Font "Courier" | Ital | Bold -> m.. Warning 8: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: TTFont _

Page 25: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

Better Style

25

• Updated code, fixed:

• Lesson: use the type checker where possible to help you maintain your code

type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font "Arial" -> Font "Courier" | TTFont "Arial" -> TTFont "Courier" | Font s -> Font s | TTFont s -> TTFont s | Ital | Bold -> m

Page 26: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

A couple of practice problems

26

• Write a function that gets rid of immediately redundant markup in a document. – Formatted(Ital, Formatted(Ital,e)) can be simplified to

Formatted(Ital,e) – write maps and folds over markups

• Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are books, and others are conference papers. Journals have a name, number and issue; books have an ISBN number; All of these entries should have a title and author. – design a sorting function – design maps and folds over your bibliography entries

Page 27: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are

To Summarize

27

• Design recipe for writing OCaml code: – write down English specifications

• try to break problem into obvious sub-problems – write down some sample test cases – write down the signature (types) for the code – use the signature to guide construction of the code:

• tear apart inputs using pattern matching – make sure to cover all of the cases! (OCaml will tell you)

• handle each case, building results using data constructor – this is where human intelligence comes into play – the “skeleton” given by types can almost be done

automatically! • clean up your code

– use your sample tests (and ideally others) to ensure correctness