Certifying Compilation for Standard ML in a Type Analysis Framework

Leaf Petersen

Carnegie Mellon University


Motivation

Types

• Types capture facts about programs.
  – Fact: This procedure expects a 32 bit integer.
  – Fact: This address points to executable code.
  – Fact: This data structure was produced here.
• Programmers use types:
  – To keep their facts straight.
    • Capture and preserve invariants.
  – To check their facts.
    • Typechecker verifies truth.
  – Manage complexity.

Types and Compilers

• Compilers use types.
  – Predict size of data.
  – Eliminate unnecessary dynamic checks.
• Most compilers forget types early.

P1 : T1 → P2 → ... → Pn → P.o

Type Preserving Compilation

• Transform types with program.
  – Optimize code based on types.
  – Verify that invariants still hold.
  – Emit types on object code.

P1 : T1 → P2 : T2 → ... → Pn : Tn → P.o : To

TILT

• Type preserving compiler
  – Standard ML.
  – Sparc, Alpha, (now) x86 backends
  – Perry Cheng, Chris Stone, Leaf Petersen, Dave Swasey, and others.
• Intermediate languages are typed
  – Type based optimizations.
  – Internal correctness checks.
• Generates typed x86 object code (this thesis).

Why TILT?

• Want to compile SML efficiently.
  – Separate compilation is a must.
• Traditional optimizations.
  – Loop optimizations, CSE, constant folding, and many more.
• New challenges for optimization.
  – Polymorphism, GC, 1st class functions, modules, etc.

Example: Unknown Types.

• Module interfaces (and polymorphism) introduce unknown types:
• Clients compiled against interface
  – Cannot know what t is (may be instantiated multiple times)
  – Cannot predict size of value (if sizes vary).
  – Cannot predict traceability of value (ptr/non-ptr).

Old Solutions

• C, C++, Java: No unknown types.
  – Objects: “partially known” types.
• Traditional ML/Lisp compilers: Uniform data representation.
  – All values are same size (e.g. 32 bits).
  – Large values (e.g. 64 bit floats) must be boxed.
  – Traceability dealt with via tagging (e.g. 31 bit ints).
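
The tagging idea in the last bullet can be sketched in a few lines of Python. This is an illustration only: it assumes the common convention that a set low bit marks an integer and a clear low bit marks a word-aligned pointer, which the slide does not specify, and the function names are hypothetical.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1

def tag_int(n):
    """Encode a 31-bit signed integer as a tagged 32-bit word (low bit = 1)."""
    if not -(1 << 30) <= n < (1 << 30):
        raise OverflowError("value does not fit in 31 bits")
    return ((n << 1) | 1) & WORD_MASK

def untag_int(word):
    """Decode a tagged word back into a signed integer."""
    payload = word >> 1              # drop the tag bit, leaving 31 bits
    if payload >= (1 << 30):         # sign bit of the 31-bit payload is set
        payload -= (1 << 31)
    return payload

def is_pointer(word):
    """Assumed convention: word-aligned pointers have a clear low bit."""
    return (word & 1) == 0
```

The collector can decide traceability from the low bit alone, at the cost of one bit of integer range and a shift on every arithmetic operation.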

TILT Solution

• Types tell size and traceability of data.
• Unknown types are instantiated with known types at runtime.
  – Most compilers discard types before generating code.
• TILT: Keep types at runtime and use them to dynamically determine layout and traceability.
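
A rough sketch of what "types tell size and traceability" could look like at runtime, written in Python. The type representations (plain strings), the sizes (a 32-bit word, 64-bit floats), and all names here are assumptions made for illustration; they are not TILT's actual runtime structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layout:
    size_bytes: int
    traceable: bool   # True if the GC must follow the value as a pointer

# Hypothetical runtime type representations mapped to their layouts.
LAYOUTS = {
    "Int": Layout(4, False),         # unboxed 32-bit integer
    "Float": Layout(8, False),       # unboxed 64-bit float
    "BoxedFloat": Layout(4, True),   # pointer to a heap-allocated float
    "Array": Layout(4, True),        # pointer to a heap-allocated array
}

def layout_of(type_rep):
    """Look up size and traceability from a runtime type representation."""
    return LAYOUTS[type_rep]

def roots_to_trace(frame):
    """frame is a list of (type_rep, value); return the values the GC must follow."""
    return [value for (type_rep, value) in frame if layout_of(type_rep).traceable]
```

Because traceability comes from the type rather than from a tag bit in the value, integers keep their full 32 bits.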

Type analysis

• type Optarray[t] =
    Typecase[t]
      of Boxed(Float) => Array64[Float]
       | _ => Array32[t]
• Note:
  – Optarray[Int] == Array32[Int]
  – Optarray[a] where a is unknown is dynamic
• Constructor for type Optarray?
  – optarray[t] : int x t -> Optarray[t]

Type analysis

• optarray[t](len : int, init : t) : Optarray[t] =
    typecase [t]
      of Boxed(Float) => new_array64[Float](len, unbox(init))
       | _ => new_array32[t](len, init)
• For statically known types, reduces at compile time
  – optarray[Int](10,0) = new_array32[Int](10,0)
• For unknown types, reduces at runtime
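
The runtime case of the typecase above can be mimicked in Python by passing a type representation as an ordinary argument. This is a loose analogue, not TILT's mechanism: the tuple encoding of types, the Box class, and the use of array("d") for flat 64-bit storage are all assumptions made for illustration.

```python
from array import array

# Hypothetical runtime type representations.
INT = ("Int",)
FLOAT = ("Float",)

def boxed(t):
    return ("Boxed", t)

class Box:
    """A heap-allocated cell holding one value."""
    def __init__(self, value):
        self.value = value

def unbox(b):
    return b.value

def optarray(t, length, init):
    """Choose the array representation by inspecting the type argument."""
    if t == boxed(FLOAT):
        # new_array64[Float](len, unbox(init)): flat storage of raw doubles
        return array("d", [unbox(init)] * length)
    # new_array32[t](len, init): one word-sized slot per element
    return [init] * length
```

When t is known at the call site, a compiler can pick the branch statically, which is the compile-time reduction noted above; only calls at unknown types pay for the dynamic test.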

Type-passing Optimizations

• Type analysis:
  – Enables global representation optimizations in the presence of unknown types.
• TILT uses types at runtime for:
  – Better data-layouts.
    • Unboxed arrays of 64 bit floats
    • 32 bit ints
    • Optimized sum representations
  – Flatten aggregate arguments into registers.
  – Mostly tag-free garbage collection.


There’s more

• Types can help with generating efficient code.

• But not the end of the story....

Mobile Code

• Code has become mobile.
  – May know very little about producer.
• Examples:
  – Web applets.
  – Grid computing.
  – Binary installations/upgrades.
  – Application downloads.
• High risk from malicious/wrong code.

The Certification Problem

• Source language safety is checkable.
  – Typechecker checks the programmer’s facts.
• Raw object code is not checkable.
• Safety relies on trust in:
  – Safety of source language.
  – Correctness/identity of producer/compiler.
  – Integrity of the object code.

Java Approach

• Java bytecode
  – High-level language (almost Java)
  – Can be typechecked
• Interpreted
  – slow, somewhat complicated
• JIT compiled
  – somewhat faster, quite complicated
• Large trusted computing base

Certified Code

• Typed object code
  – Types certify safety
• Code consumer
  – Does no compiling
  – Checks that certificate applies (easy)
  – Small trusted computing base
• Several instances exist:
  – TAL: Typed Assembly Language
  – PCC: Proof Carrying Code
  – Many extensions and variations

Certifying Compilers

• Programs in safe languages
  – Types provide needed annotations
  – Compiler can emit code with certificate of type/memory safety
• Certifying compilers exist for:
  – Safe subsets of C (TAL & PCC)
  – Java (PCC)
• Now for Standard ML


Types in Compilation

• Types can be used to generate efficient code.

• Types can be used to generate certified code.

• Want to combine the two paradigms.


My Thesis

Certifying compilation of type analyzing code is feasible for a full modern language such as Standard ML.

Two compilers

• Theoretical compiler
  – Formal translation
  – Prove important properties
  – Guide the implementation
• Real compiler
  – Follows the structure of the theoretical compiler
  – Targets a real certified code system.


Theory

Theoretical compiler

• Three languages:
  – Singleton free MIL
  – LIL
  – Idealized TAL (ITAL)
• Formal translations:
  – MIL to LIL
  – Closure conversion of LIL code
  – LIL to ITAL

Languages

• Singleton free MIL
  – Lambda calculus
  – Syntactic restriction to named form
  – Type analysis through primitives
• LIL
  – Much more fine-grained than MIL
    • type and type analysis representation
    • closure representation
• ITAL
  – Machine language
  – Idealized TAL
  – Simplified TAL with LX primitives for type analysis

Translations

• MIL to LIL
  – Very different type structure
  – Moderately different term structure
  – See my dissertation.
• Closure conversion
  – Very standard
• LIL to ITAL
  – Type structure is almost identical
  – Term structure is very different
    • Explicit control flow
    • Binding replaced with state modification

LIL typing

$\Psi; \Delta; \Gamma \vdash e : \tau$

• $\Psi$ – LIL heap context
• $\Delta$ – LIL type context
• $\Gamma$ – LIL term context
• $e$ – LIL expression (named form)
• $\tau$ – LIL type for $e$

ITAL typing

$\Psi; \Delta; M \vdash I$ ok

• $\Psi$ – ITAL heap context
• $\Delta$ – ITAL type context
• $M$ – ITAL register file type
• $I$ – ITAL instruction sequence


Register files

• A register file type $M$ maps registers to ITAL types
  – e.g. $M(r) = \tau$
  – Notation: $M\{r : \tau\}$ means $M$ with the type of $r$ set to $\tau$.
• Designated stack pointer register sp
  – $M(\mathrm{sp}) = \sigma$
  – $\sigma$ describes the stack slots

LIL to ITAL Translations

• $|\Psi|$ - heap context translation
• $|\Delta|$ - type context translation
• $|\tau|$ - type translation
• Exp $e$ maps to instruction seq $I$
• But what is the translation of a term context?

Register files

• LIL variables occupy ITAL registers (or stack slots)
• Hence, the translation of a LIL context is an ITAL register file.
• Problem: what register file?
• Variables are related to registers via register allocation.

Register allocation

• Previous work builds register allocation into the translation.
  – Complex and tedious
  – Unclear how to incorporate real RA (e.g. graph coloring)
  – Consequently, toy register allocators are used in formal presentations
• Better idea: translate with respect to abstract register allocator.

Allocator

Definition: An allocator A is an object such that:

1. For every variable x:
   – A(x) = r or A(x) = sp(i)
2. frmsz(A) is a natural number
3. For every LIL typing context $\Gamma$ and stack type $\sigma$, $|\Gamma|_A = M$ for some register file type $M$
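
One way to picture the three parts of this definition is as a small interface. The Python sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, the assignment is a finite dictionary rather than a total function on variables, and the type translation is taken to be the identity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reg:
    name: str          # a machine register, e.g. "r1"

@dataclass(frozen=True)
class StackSlot:
    index: int         # sp(i): slot i of the current frame

class Allocator:
    def __init__(self, assignment, frame_size):
        for var, loc in assignment.items():
            if isinstance(loc, StackSlot):
                assert 0 <= loc.index < frame_size, var
            else:
                assert isinstance(loc, Reg), var
        self._assignment = dict(assignment)
        self._frame_size = frame_size

    def __call__(self, var):
        """Part 1: A(x) is either a register or a stack slot."""
        return self._assignment[var]

    def frmsz(self):
        """Part 2: the frame size is a natural number."""
        return self._frame_size

    def translate_context(self, gamma, sigma):
        """Part 3: map a term context (var -> type) and the stack type below
        the frame to a register file type, including an entry for sp."""
        regs = {}
        frame = [None] * self._frame_size      # None marks an unused slot
        for var, ty in gamma.items():
            loc = self(var)
            if isinstance(loc, Reg):
                regs[loc.name] = ty
            else:
                frame[loc.index] = ty
        regs["sp"] = frame + list(sigma)
        return regs
```

Nothing in this interface stops two live variables from sharing a register; ruling that out is the job of a separate goodness condition, not of the definition itself.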

Translation judgment

$\Psi; \Delta; \Gamma; A, \sigma \vdash e : \tau \leadsto I$

• $\Psi$ – LIL heap context
• $\Delta$ – LIL type context
• $\Gamma$ – LIL term context
• $A$ – Allocator
• $\sigma$ – describes stack below frame
• $I$ – ITAL instruction sequence
• For this talk, I’m ignoring exceptions, other stuff.

Translation judgment

$\Psi; \Delta; \Gamma; A[z \mapsto r_1, x \mapsto r_1, y \mapsto r_2], \sigma \vdash z = x + y : \mathrm{int} \leadsto \texttt{add r1,r2}$
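
The example above, where the allocator places z and x in the same register, can be generalised in a small sketch. The Python below is an assumption-laden illustration, not the translation from the dissertation: it handles only register-allocated variables, uses two-operand instructions of the form "op dst,src", and the function name is hypothetical.

```python
def translate_add(alloc, z, x, y):
    """Translate z = x + y to two-operand code, given an allocator that maps
    variables to register names (stack slots are not handled here)."""
    rz, rx, ry = alloc[z], alloc[x], alloc[y]
    if rz == rx:
        # The destination already holds x: one instruction suffices.
        return [f"add {rz},{ry}"]
    if rz == ry:
        # The destination already holds y: addition commutes.
        return [f"add {rz},{rx}"]
    # Otherwise copy x into the destination first.
    return [f"mov {rz},{rx}", f"add {rz},{ry}"]
```

The emitted code depends on the allocator but not on how it was computed, which is the sense in which the translation is parameterised by register allocation.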

Question

$\Psi; \Delta; \Gamma; A, \sigma \vdash e : \tau \leadsto I$

• Why should $I$ be well-typed?
• Is the equational theory rich enough?
  – Easy to rely on equations that don’t hold
• Want to show soundness:
  – Each translation maps well-typed terms to well-typed terms.
• Doesn’t hold for all allocators: only the good ones.

Good allocator for $\Gamma$

Definition: Let $M = |\Gamma|_A$. We say that A is a good allocator for $\Gamma$ if:

1. $M(\mathrm{sp}) = \sigma_f \circ \sigma$ such that $\mathrm{frmsz}(A) = f$
2. $|\epsilon|_A$ is the empty machine state.
3. If $\Gamma = \Gamma_1, x{:}\tau, \Gamma_2$ then
   a) A is a good allocator for $\Gamma_1$ and $\Gamma_2$
   b) If A(x) = r then $|\Gamma|_A = |\Gamma_1, \Gamma_2|_A\{r : |\tau|\}$
   c) If A(x) = sp(i) then something similar.
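
Clause 3(b) can be tested mechanically on small examples. The Python sketch below checks only that clause, only for register-allocated variables, and treats the type translation as the identity; the function names and the list-of-pairs encoding of a context are assumptions made for illustration.

```python
def translate_context(alloc, gamma):
    """Register part of the context translation.
    gamma is an ordered list of (variable, type) pairs."""
    regfile = {}
    for var, ty in gamma:
        regfile[alloc[var]] = ty
    return regfile

def satisfies_3b(alloc, gamma):
    """For each x:t in gamma with A(x) = r, check that translating all of
    gamma equals translating gamma without x and then setting r to t."""
    full = translate_context(alloc, gamma)
    for i, (var, ty) in enumerate(gamma):
        rest = gamma[:i] + gamma[i + 1:]
        expected = translate_context(alloc, rest)
        expected[alloc[var]] = ty
        if full != expected:
            return False
    return True
```

An allocator that puts two variables of different types into the same register fails the check, because removing one of them changes the type recorded for that register.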

Good allocator for e

Definition: An allocator A is a good allocator for an expression e if:

1. For all derivations of $\Psi; \Delta; \Gamma \vdash e : \tau$, A is a good allocator for $\Gamma$.
2. A is a good allocator for all sub-expressions of e.

Soundness

Theorem: If A is a good allocator for e and $\Psi; \Delta; \Gamma \vdash e : \tau$ and $\sigma$ is a well-formed stack type and $\Psi; \Delta; \Gamma; A, \sigma \vdash e : \tau \leadsto I$ then $|\Psi|; |\Delta|; M \vdash I$ ok, where $M = |\Gamma|_A$.

Benefits of this approach

• Theory close to implementation
• Register allocation is a parameter
  – Separates out the mechanism
  – Concise specification of interface between code gen and RA
  – Translation isn’t bogged down with algorithmic details of RA

Downside: completeness

• Depends on register allocator
  – Full completeness doesn’t hold
  – Possible to show parametric completeness?
  – Not clear what this means
• Worthwhile tradeoff
  – Formal presentation very close to implementation
  – In practice:
    • Soundness is hard (implementation had bugs).
    • Completeness is just a matter of covering all cases.
• Likely that this can be solved (future work)

Summary (Theory)

• Formal translations:
  – MIL to LIL
  – Closure conversion of LIL code
  – LIL to ITAL
• Proof of soundness for each
• New approach to dealing with typed RA
• Provides a guide for......


Practice

Real Compiler

• Implemented a certifying back end for TILT.
  – Targets TAL for x86.
• Type representation and analysis made explicit
  – Not gc interface (yet).
• Data layout issues made explicit.
  – Boxing/unboxing.
  – Closure representations.
  – Heap data layout.

[Figure: original TILT compilation pipeline]

SML Source → Elaborate → HIL (Typed) → Phase split → MIL (Typed) → Optimize → MIL (Typed) → Code Gen → RTL (Untyped)

• Elaborate: Typecheck
• Phase split: Eliminate modules, some data rep
• Optimize: Shrinking inlining, speculative inlining, CSE/dead code elim, constant folding, uncurrying, monomorphization, flattening, eta reduction, closure conversion, hoisting, others
• Code Gen: Code generation, type representation, untyped output! Subsequent compilation is mostly standard.

New TILT IL

• LIL: Low-level internal language
  – Based on LX (Crary & Weirich)
• Data representation explicit
• Still lambda calculus-ish
  – Call/return (not CPS)
• All heap allocation explicit
• Type analysis implemented at the term level
  – Neat Trick
  – See the dissertation

[Figure: certifying TILT back end pipeline]

Front end → MIL (Typed) → Type rep → LIL (Typed) → Optimize → LIL (Typed) → Closure Conv → LIL (Typed) → Code Gen → TAL (Typed)

• Front end: Singleton elim
• Type rep: Dynamic type reps, data rep structure, unified allocation
• Optimize: CSE/dead code elim, constant folding, eta reduction, switch reduction, others
• Closure Conv: Types and terms, recursive code, some opts
• Code Gen: Direct to TALx86, reg alloc/cogen, small peephole opts

Compilation

fib.sml → TILT → fib/asm.tal → TALx86 → fib/obj.to, fib/obj.o

Separate Compilation

fib.sml :> fib.int
(imports: Int.int, TextIO.int)

TILT Compilation Model

fib.sml :> fib.int
(imports: Int.int, TextIO.int)

fib/asm.tal :> fib/asm_e.tali
(imports: Int/asm_e.tali, TextIO/asm_e.tali)

Annotation size

• Annotation overhead for a file (e.g. fib)
  – size(fib/fib.to) + size(fib/fib_e.tali)
• Important question: how big?
  – Mobile code requires small annotation
  – Annotation size affects checking time
• Optimizing for size:
  – Not part of my thesis!!
• Important to measure
  – Want to understand the baseline
  – Actually pretty good!

Micro-Benchmarks

Tag     Description
takc    curried function calls
taku    uncurried function calls
Fib     fib, fact with default Int
Fib32   fib, fact with 32 bit ints
PI      approximation of pi (fp)

Selected Larger Benchmarks

Tag       Description                            LOC
msort     Merge sort (lists)                      48
life      Game of life (lists)                   205
pqueens   P queens problem (arrays)              292
frank     Small theorem prover                   473
leroy     Knuth-Bendix completion (exceptions)   537
simple    Spherical fluid dynamics               860
tyan      Gröbner basis calculation              896
lexgen    Lexical-analyzer generator            1178
pia       Perspective inversion algorithm       2074

Benchmark sizes (abs)

[Figure: bar chart of absolute sizes (y-axis 0.00 to 600.00) for each benchmark: Takc, Taku, PI, fib, fib32, Isort, msort, Quicksort2, Quicksort, btimes, fft, TimeAndRun, PQueens, BarnesHut, life, frank, leroy, simple, tyan, pia, lexgen, boyer. Series: obj.o, obj.to, asm_e.tali, asm_i.tali]

Type overhead (relative)

[Figure: bar chart of relative sizes (y-axis 0% to 100%). Series: obj.o, obj.to, asm_e.tali, asm_i.tali]

Annotation size

• Average factor of 5 increase
• Factor of 2-3 for larger programs
• Many opportunities for improvement
• Separate compilation overhead
• Additional issues, discussion in my dissertation

Performance

• Really not part of my thesis!!
  – Not an optimizing back end
• Nonetheless, important points:
  – Valuable to measure and understand
  – Not a toy compiler

Comparing Apples to Fish

• MLton
  – Whole program compiler.
  – Very good code!
• SML/NJ
  – Incremental compiler
  – Widely used, supports interactive loop
• TILT
  – Separate compilation
  – Conservative (malloc based) GC
• TILT (Whole program)

Normalized runtimes

[Figure: bar chart of normalized runtimes (y-axis 0 to 4) for Taku, Takc, Fib, Fib32, PI. Series: TILT, TILT (Whole), SML/NJ, MLton]

Normalized runtimes

[Figure: bar chart of normalized runtimes (y-axis 0 to 16) for Life, Leroy, Simple, Tyan, Msort, Pia, Lexgen, PQueens. Series: TILT, TILT (Whole), SML/NJ 110.42, MLton]

Certifying TILT

• Type preserving, optimizing, certifying compiler for Standard ML.
• Interesting theoretical and practical challenges.
  – Data representation.
  – Type analysis representation.
  – Engineering challenges
  – Proof scalability challenges


My Thesis

Certifying compilation of type analyzing code is feasible for a full modern language such as Standard ML.

Related work

• Certified Code
  – PCC (Necula & Lee), Foundational PCC (Appel)
  – TAL (Morrisett, et al.), Foundational TAL (Crary)
• Typed compilation
  – TIL, TILT (CMU)
  – Popcorn (Cornell)
  – Special-J (Cedilla Systems)
  – Flint (Yale)

Where to now?

• Data representation.
  – Integrating GC semantics in IL
  – Typed models of GC
• TILT and typed compilation.
  – Incorporate memory allocation work.
  – Extend space of safety policies.
• Language design.
  – Combining LLL and HLL paradigms cleanly.

Implications of goodness

• If $\Gamma(x) = \tau$ and A(x) = r then $|\Gamma|_A(r) = |\tau|$
• $|\Gamma, x{:}\tau|_A = |\Gamma|_A\{r : |\tau|\}$
• $|\Gamma|_A[|c|/\alpha] = |\Gamma[c/\alpha]|_A$

Effects of singleton elimination

[Figure: bar chart (y-axis 0 to 1.5) for Life, Leroy, Simple, Tyan, Msort, Pia, Lexgen, Fib, Fib32, PI, PQueens, Takc, Taku. Series: SingElim, NoSingElim]

Component sizes (rel)

[Figure: bar chart of relative component sizes (y-axis 0% to 100%) for Basis, Lib, Arg, Link, Bench. Series: obj.o, obj.to, asm_e.tali, asm_i.tali]

Component sizes (abs)

[Figure: bar chart of absolute component sizes (y-axis 0.00 to 18000.00), with data table]

                Basis       Lib       Arg      Link     Bench
asm_i.tali      50.17      4.45      0.11      0.00      1.96
asm_e.tali    6917.00   3731.66     91.22      0.00   1718.69
obj.to        8953.31   5658.56    117.81   7264.31   2787.91
obj.o         1183.08   1005.05     11.09     46.05    988.47
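
The figures in this table can be combined with the annotation-size definition given earlier in the talk (type object plus exported interface). The Python sketch below does that arithmetic. Expressing the result as a ratio to obj.o is a choice made here for illustration, and the units are whatever the chart used, which the slide does not state.

```python
# Sizes copied from the component size table (columns: Basis, Lib, Arg, Link, Bench).
SIZES = {
    "Basis": {"obj.o": 1183.08, "obj.to": 8953.31, "asm_e.tali": 6917.00, "asm_i.tali": 50.17},
    "Lib":   {"obj.o": 1005.05, "obj.to": 5658.56, "asm_e.tali": 3731.66, "asm_i.tali": 4.45},
    "Arg":   {"obj.o": 11.09,   "obj.to": 117.81,  "asm_e.tali": 91.22,   "asm_i.tali": 0.11},
    "Link":  {"obj.o": 46.05,   "obj.to": 7264.31, "asm_e.tali": 0.00,    "asm_i.tali": 0.00},
    "Bench": {"obj.o": 988.47,  "obj.to": 2787.91, "asm_e.tali": 1718.69, "asm_i.tali": 1.96},
}

def annotation_size(row):
    """Annotation size as defined earlier: type object plus exported interface."""
    return row["obj.to"] + row["asm_e.tali"]

def annotation_ratio(row):
    """Annotation size relative to the untyped object code (an assumed metric)."""
    return annotation_size(row) / row["obj.o"]

if __name__ == "__main__":
    for component, row in SIZES.items():
        print(f"{component:6s} annotation/object = {annotation_ratio(row):.2f}")
```

Computed this way, the benchmark column comes out at roughly four and a half times the object code size.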

Type overhead (relative)

[Figure: chart with y-axis 0.00 to 600.00 and x-axis 1 to 134. Series: obj.o, obj.to, asm_e.tali, asm_i.tali]

Normalized runtimes

[Figure: bar chart of normalized runtimes (y-axis 0 to 16) for Life, Leroy, Simple, Tyan, Msort, Pia, Lexgen, Taku, Takc, PQueens, Fib, Fib32, PI. Series: TILT, TILT (Whole), SML/NJ 110.42, MLton]

Benchmark sizes (abs)

[Figure: bar chart of absolute sizes (y-axis 0.00 to 600.00) for each benchmark: Takc, Taku, PI, fib, fib32, Isort, msort, Quicksort2, Quicksort, btimes, fft, TimeAndRun, PQueens, BarnesHut, life, frank, leroy, simple, tyan, pia, lexgen, boyer. Series: obj.o, obj.to, asm_e.tali, asm_i.tali]

Assembly vs object sizes (abs)

[Figure: bar chart of absolute sizes (y-axis 0.00 to 16000.00), with data table]

                  Basis       Lib       Arg      Link     Bench
asm.tal        13740.51  12948.89    155.55    188.54   6963.54
obj.o+obj.to   10136.38   6663.61    128.90   7310.36   3776.37


Static reps

Kind of static representations of types:

Static type representations:


Interpretation function

Maps reps to the types being represented

Note:


Static reps

Define type constructor for optimized arrays

Uses case analysis instead of Typecase


Dynamic case

Define term constructor for optimized arrays

Type erasure

• Wait!! Still branching on types!
  – Wanted to get rid of “Typecase”, just replaced it with “case” on types!
• Clever trick:
  – Encode type sums as term sums
  – Use type refinement to reflect term info back into type level
• Replace “case” on types with “case” on terms.

MIL Types

• Type size controlled using definitions
  – Let a be c in E
  – Singleton kinds: a::S(c)
• Advantages
  – Optimizer improves sharing.
  – Sharing is intrinsic in the calculus
  – Needed for efficient compilation anyway
• Disadvantages
  – Equality is contextual, not syntactic
  – Massive duplication of mechanism

Complete Larger benchmarks

• Isort – insertion sort (lists)
• Msort – merge sort (lists)
• Quicksort – quick sort (arrays)
• fft – fast Fourier transform
• pqueens – p queens problem (array intensive)
• barnes hut – n body simulation
• life – game of life
• frank – small theorem prover
• leroy – Knuth-Bendix completion
• simple – spherical fluid dynamics
• tyan – Gröbner basis calculation
• Pia – perspective inversion algorithm (image processing)
• lexgen – lexical-analyzer generator
• boyer – theorem proving

Standard ML

• Type/memory safe.
• General purpose language.
  – Mutable data, standard libraries, GC.
• Advanced features.
  – Polymorphism (templates/generics).
  – First class functions (inner classes).
  – Modules and interfaces.
  – Type inference.
• Language features of tomorrow’s industrial languages.

LIL

• LIL: Low-level internal language
  – Based on LX (Crary & Weirich)
• Data representation explicit
• Still lambda calculus-ish
  – Call/return (not CPS)
• All heap allocation explicit

Engineering benefits

• Compiler can largely ignore types
  – No need to optimize
  – Equality is syntactic
• Type size controlled by separate mechanisms
  – Hash-consing
  – Higher-order abstract syntax
  – At TAL level, via ad-hoc definitions
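
Hash-consing, the first mechanism listed, can be sketched briefly. The Python below is a generic illustration of the technique, not TILT's implementation; the class names and the Pair/Int constructors are hypothetical.

```python
class Node:
    """A type constructor applied to already-interned argument nodes."""
    __slots__ = ("ctor", "args")

    def __init__(self, ctor, args):
        self.ctor = ctor
        self.args = args

class HashCons:
    """Intern table: structurally equal types share a single node."""

    def __init__(self):
        self._table = {}

    def mk(self, ctor, *args):
        # Arguments are already interned, so their identities suffice as a key.
        key = (ctor, tuple(id(a) for a in args))
        node = self._table.get(key)
        if node is None:
            node = Node(ctor, args)
            self._table[key] = node   # the table keeps nodes alive, so ids stay valid
        return node

    def __len__(self):
        return len(self._table)

def tree_size(node):
    """Size of the type if it were written out as a tree with no sharing."""
    return 1 + sum(tree_size(a) for a in node.args)

def nested_pairs(table, depth):
    """Build Pair(Pair(...), Pair(...)) nested `depth` levels over Int."""
    t = table.mk("Int")
    for _ in range(depth):
        t = table.mk("Pair", t, t)
    return t
```

With sharing, a type nested to depth 10 occupies 11 nodes, while its tree form has 2047; equality of interned types reduces to an identity comparison, which matches the "equality is syntactic" point above.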

New Challenges

• Controlling type size!
  – Types can be very large as trees.
  – Must maintain/traverse DAG structure.
• Must optimize types.
  – Types exist at runtime.
  – Inlining, CSE, etc must be done on types.
• Compiler must maintain well-typedness.

Libraries and linking

• Basis: Standard ML Basis library
  – provides basic language and OS functionality
• SML/NJ Lib
  – provides extended data structures
• Arg
  – command line processing
• Link
  – Compiler generated link unit

Compilation (over-simplified)

fib.sml → TILT → fib/asm.tal → TALx86 → fib/obj.to, fib/obj.o


Example: Unknown Types.

Implements


Example: Uniform Data Rep.

0.0x

0x =

LIL Type Analysis

• Neat trick: typecase implemented at the term level!
  – All runtime data exists as ordinary terms.
  – All control flow branches are on terms.
  – Types can be erased before running.
• Accomplished via type refinement
  – Kind structure tracks type/rep connection.
• Too involved for this talk
  – Detailed development in my dissertation