A Type Theory for Memory Allocation and Data Layout

Leaf Petersen, Robert Harper, Karl Crary and Frank Pfenning

Carnegie Mellon University

Views of data

• High-level languages
  – Abstract view of data, characterized by operations
  – e.g. pairs:
    • Introduction: (e1,e2) : t1 x t2
    • Elimination: fst e : t1, snd e : t2
• Low-level languages
  – Concrete view of data, characterized by layout in memory
  – e.g. C structs:
    • Contiguous layout
    • Memory size determined by type
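The contrast between the two views can be sketched in a few lines of Python. This is an illustration only, not code from the talk: `pair`, `fst` and `snd` mirror the introduction and elimination forms above, and the format string `<ii` (two little-endian 32-bit integers, no padding) is an assumed layout standing in for a C struct.

```python
import struct

# High-level view: a pair is characterized by its operations.
def pair(e1, e2):
    return (e1, e2)      # introduction

def fst(e):
    return e[0]          # elimination

def snd(e):
    return e[1]          # elimination

# Low-level view: a pair is characterized by its layout in memory.
# "<ii" is an assumed layout: two little-endian 32-bit ints, no padding.
PAIR_LAYOUT = struct.Struct("<ii")

p = pair(3, 4)
raw = PAIR_LAYOUT.pack(fst(p), snd(p))        # contiguous bytes
size = PAIR_LAYOUT.size                       # fixed by the type alone
second = struct.unpack_from("<i", raw, 4)[0]  # read a field by its offset
```

In the high-level view nothing is said about where the components live; in the low-level view the second component is simply "the four bytes at offset 4".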

Data layout

• Usually programmers don’t care
  – But sometimes have to
  – Marshalling, interaction with low-level devices, precise control of initialization, interoperability
  – Generally no type safety
• Compilers have to care
  – Represent high-level data abstractions
  – Allocation and initialization code

(3,(4,5)) : int x (int x int)

[Figure: alternative memory layouts for the pair (3,(4,5))]

Type theory for data layout

• Expose the fine structure
  – Expose memory layout in types
  – Implementation choices explicit
  – High-level object types defined in terms of low-level memory types
  – High-level operations on objects broken down into low-level operations on memory
• What is the fine structure of memory?

Initialization

• Data objects
  – Created by initializing raw memory.
• Initialization changes types
  – e.g. from ns (raw, uninitialized memory) to int
• Commonly dealt with via linearity
  – New memory is linear
  – No aliases
  – Linear type theory handles re-typing

Adjacency

• Memory provides a primitive notion of adjacent items: e.g. 3 next to 4.
• Large objects are composed of adjacent smaller objects.
• Sub-objects are referenced by offsets or interior pointers.

[Figure: 3 stored next to 4]

Associativity

• Adjacency is associative: the same memory layout is described by:
  – (3 next to 4) next to 5
  – 3 next to (4 next to 5)
• But not commutative!
  – 3 next to 4 ≠ 4 next to 3

[Figure: 3, 4 and 5 stored in adjacent cells]
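A small Python sketch can make both properties concrete. It is an illustration under stated assumptions: nested tuples stand for "next to" groupings (not for pointers), and the hypothetical helper `layout` flattens a grouping into the sequence of cells it occupies.

```python
def layout(obj):
    """Flatten nested 'next to' groupings into the sequence of memory cells."""
    if isinstance(obj, tuple):
        cells = []
        for part in obj:
            cells.extend(layout(part))
        return cells
    return [obj]

left = layout(((3, 4), 5))     # (3 next to 4) next to 5
right = layout((3, (4, 5)))    # 3 next to (4 next to 5)
swapped = layout((4, 3))       # 4 next to 3
```

Both groupings of 3, 4, 5 flatten to the same cells, so regrouping costs nothing at run time, whereas swapping two neighbours yields a different memory image.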

Indirection

• Not all objects are adjacent
• Memory supports a notion of indirection (pointers or labels).
  – Refer to non-adjacent data via indirection
  – 3 next to (pointer to (4 next to 5))

[Figure: 3 next to a pointer to the adjacent cells 4 and 5]

Ordered Type Theory

• Linear type theory handles initialization
  – Doesn’t capture other memory properties
• Ordered type theory
  – Variables used exactly once (linear)
  – Variables may not be permuted.
  – Adjacent variables remain adjacent
  – No weakening, contraction, or exchange.
• Claim: Ordered constructs admit a natural interpretation as adjacency and indirection.
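The three missing structural rules can be illustrated with a toy check in Python. This is a sketch for a tiny fragment only (variables and ordered pairing), not the type system of the talk: a term is either a variable name or a 2-tuple of terms, and the hypothetical function `ordered_ok` accepts a term only if it consumes the ordered context exactly.

```python
def occurrences(term):
    """List the ordered variables of a term, left to right.

    A term is a variable name (str) or an ordered pairing of two
    terms (a 2-tuple).
    """
    if isinstance(term, str):
        return [term]
    left, right = term
    return occurrences(left) + occurrences(right)

def ordered_ok(context, term):
    """Accept a term only if it uses every context variable exactly once
    (no weakening, no contraction) and in context order (no exchange)."""
    return occurrences(term) == list(context)

ctx = ["a1", "a2", "a3"]

fine_right = ordered_ok(ctx, ("a1", ("a2", "a3")))
fine_left = ordered_ok(ctx, (("a1", "a2"), "a3"))
weakened = ordered_ok(ctx, ("a1", "a2"))                   # a3 is dropped
contracted = ordered_ok(ctx, ("a1", ("a1", ("a2", "a3"))))  # a1 used twice
exchanged = ordered_ok(ctx, ("a2", ("a1", "a3")))          # a1, a2 permuted
```

Only the first two terms are accepted; regrouping is allowed, while dropping, duplicating or reordering a variable is rejected.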

Variables and Resources

• Typing judgments: x1:τ1, …, xn:τn ; a1:σ1, …, am:σm ⊢ M : τ
• Ordering of x’s does not matter.
  – Unrestricted variables, bound to small objects
• Ordering and usage of a’s does matter.
  – Bound to memory
  – Adjacent variables bound to adjacent memory

Ordered product

• Ordered product (fuse): M1 • M2 : τ1 • τ2
• Ordered products model adjacency

Adjacency

• 3 next to 4
  – 3 • 4 : int • int
• 3 next to 4 next to 5
  – 3 • (4 • 5) : int • (int • int)
  – (3 • 4) • 5 : (int • int) • int

[Figures: 3 next to 4; 3 next to 4 next to 5]

Memory properties

• Associativity:
  – (τ1 • τ2) • τ3 and τ1 • (τ2 • τ3) are isomorphic
  – Functions witness isomorphism
• Non-commutativity:
  – τ1 • τ2 and τ2 • τ1 are not isomorphic
  – No function mapping one to the other (in general)

Indirection

• Ordered modality models indirection
  – !M : !τ corresponds to a pointer to M
  – Non-linear, un-ordered term


(3,(4,5)) : int x (int x int)

int x (int x int) ⇝ !(int • !(int • int))
(3,(4,5)) ⇝ !(3 • !(4 • 5))

[Figure: pointer to 3 next to a pointer to the adjacent cells 4 and 5]

(3,(4,5)) : int x (int x int)

int x (int x int) ⇝ !(!int • !(!int • !int))
(3,(4,5)) ⇝ !(!3 • !(!4 • !5))

[Figure: the same pair with every component, including each integer, behind its own pointer]

(3,(4,5)) : int x (int x int)

int x (int x int) ⇝ !(int • (int • int))
(3,(4,5)) ⇝ !(3 • (4 • 5))

[Figure: pointer to the three adjacent cells 3, 4 and 5]
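The three interpretations of (3,(4,5)) shown on these slides differ only in where pointers are introduced. The following Python sketch compares them under simple assumptions that are not from the talk: the heap is a list of objects, each object is a list of adjacent cells, a pointer is the tuple `("ptr", index)`, and all function names are hypothetical.

```python
def alloc(heap, cells):
    """Append an object (a run of adjacent cells) and return a pointer to it."""
    heap.append(list(cells))
    return ("ptr", len(heap) - 1)

def boxed_pairs(heap, v):
    """!(int • !(int • int)): one heap object per pair, integers inline."""
    if isinstance(v, int):
        return v
    return alloc(heap, [boxed_pairs(heap, part) for part in v])

def fully_boxed(heap, v):
    """!(!int • !(!int • !int)): every integer gets its own object too."""
    if isinstance(v, int):
        return alloc(heap, [v])
    return alloc(heap, [fully_boxed(heap, part) for part in v])

def flat(heap, v):
    """!(int • (int • int)): the whole nested pair is one object."""
    def cells(u):
        if isinstance(u, int):
            return [u]
        return [c for part in u for c in cells(part)]
    return alloc(heap, cells(v))

def objects_and_cells(strategy, v):
    """Count heap objects and total cells used by one representation."""
    heap = []
    strategy(heap, v)
    return len(heap), sum(len(obj) for obj in heap)

v = (3, (4, 5))
heap = []
root = boxed_pairs(heap, v)
```

For this value the sketch uses 2 objects and 4 cells with boxed pairs, 5 objects and 7 cells when fully boxed, and 1 object with 3 cells when flat, which is the kind of implementation choice the types make explicit.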

Explicit Allocation

• Ordered type theory
  – Fine structure of data layout
  – But not allocation
• For example: !(x • x)
  – Each time x is instantiated, new object
  – Initialized atomically
• Make allocation explicit
  – Remove !M from syntax
  – Add allocation primitives to introduce !

Memory Allocation

• A well-known allocation protocol for copying garbage collectors:
  – Reserve: obtain raw, un-initialized space.
  – Initialize: assign values to individual locations.
  – Allocate: baptize some or all as valid objects.
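The three steps can be sketched as a toy bump-pointer heap in Python. This is an assumption-laden illustration, not the calculus of the talk: the class `Heap`, its method names, and the use of `None` for raw memory are all hypothetical, and the fields `ap` and `lp` are read here as an allocation pointer and a limit pointer. The checks that the type system would perform statically are done at run time instead.

```python
class Heap:
    """Toy bump-pointer heap following reserve / initialize / allocate."""

    UNINIT = None  # stands in for raw, uninitialized memory

    def __init__(self, size):
        self.cells = [self.UNINIT] * size
        self.ap = 0  # assumed allocation pointer: start of the free area
        self.lp = 0  # assumed limit pointer: end of the reserved area

    def reserve(self, n):
        """Obtain n raw cells starting at AP."""
        if self.ap + n > len(self.cells):
            raise MemoryError("out of space; a real collector would run here")
        self.lp = self.ap + n

    def initialize(self, offset, value):
        """Assign a value to one reserved location (offset is relative to AP)."""
        addr = self.ap + offset
        if not (self.ap <= addr < self.lp):
            raise IndexError("write outside the reserved space")
        self.cells[addr] = value

    def allocate(self, n):
        """Baptize the first n reserved cells as an object; return its address."""
        if self.ap + n > self.lp:
            raise ValueError("allocating more than was reserved")
        if any(c is self.UNINIT for c in self.cells[self.ap:self.ap + n]):
            raise ValueError("allocating uninitialized memory")
        addr = self.ap
        self.ap += n
        return addr

# Build x = (0,(1,2)) from a single reservation of four cells.
h = Heap(8)
h.reserve(4)
h.initialize(0, 1)
h.initialize(1, 2)
inner = h.allocate(2)              # the pair (1,2)
h.initialize(0, 0)
h.initialize(1, ("ptr", inner))
x = h.allocate(2)                  # the pair (0, pointer to (1,2))
```

Allocating only part of the reserved space, as done for the inner pair, corresponds to "some or all" in the third step.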

Example: Memory Allocation

x = (0,(1,2))

[Figure: heap with AP and LP pointers, showing the steps Reserve, Initialize, Allocate: raw cells (?) are filled with 1, 2, 0 and the result is bound to x]

Memory Allocation

• Type system separates terms and expressions
  – Terms M: no effects
  – Expressions E: have effects
• Allocation is an effect
  – Allocation primitives are expressions

Allocating a Pair

• Allocate (1,2):
  – Reserve space at a.
  – Create names for parts. Resource a is used up!
  – Initialize a1, using it up. Re-introduce b1:int
  – Fuse parts and allocate.

Coalescing Reservation

Allocate two pairs: (1,2) and (3,4)
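The idea behind coalescing can be sketched in Python: two reservations of two cells each are merged into one reservation of four. The class `CountingHeap` and the functions `separate` and `coalesced` are hypothetical names for this illustration; the counter simply records how often space was reserved, on the assumption that each reservation is where a space check would happen.

```python
class CountingHeap:
    """Toy heap that counts how many reservations were made."""

    def __init__(self):
        self.cells = []
        self.reserved = 0      # raw cells currently available
        self.reservations = 0  # number of reserve() calls so far

    def reserve(self, n):
        self.reservations += 1
        self.reserved += n

    def alloc_pair(self, first, second):
        """Initialize two reserved cells and allocate them as one object."""
        if self.reserved < 2:
            raise ValueError("not enough reserved space")
        self.reserved -= 2
        addr = len(self.cells)
        self.cells += [first, second]
        return addr

def separate(heap):
    """One reservation per pair."""
    heap.reserve(2)
    p = heap.alloc_pair(1, 2)
    heap.reserve(2)
    q = heap.alloc_pair(3, 4)
    return p, q

def coalesced(heap):
    """A single reservation covers both pairs."""
    heap.reserve(4)
    p = heap.alloc_pair(1, 2)
    q = heap.alloc_pair(3, 4)
    return p, q

h1, h2 = CountingHeap(), CountingHeap()
r1, r2 = separate(h1), coalesced(h2)
```

Both versions produce the same heap contents and the same addresses; the coalesced one reserves once instead of twice.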


Summary

• Type theory for describing data layout
  – Adjacency requirements.
  – Precise control over representations.
• Type system for allocation:
  – Allocate raw memory.
  – Initialize, destructively changing types.
  – Ensures correct use of allocation protocol.
  – Permits code motion optimizations.

What I’m not telling you

• It’s more subtle than it seems.
  – Plain ordered λ-calculus doesn’t work.
  – Need notion of size-preserving terms, other refinements.
• For details see the paper
  – Technical presentation and examples.
  – Interpretation of a calculus with pairs.

Current and Future Work

• POPL paper
  – Only finite products
• Technical Report:
  – Sums, recursive types, ordered functions.
  – Extended coercion language.
• Ongoing
  – Dynamic extent (arrays)
  – Other allocation models

Conclusion

• Ordered type theory is a natural framework for modeling data layout.
  – Low-level issues dealt with entirely realistically in a λ-calculus setting.
  – Correctness of allocation and initialization protocols can be captured in the type system.