The University of North Carolina at Chapel Hill Lecture 13: Complex Types and Garbage Collection COMP 524 Programming Language Concepts Stephen Olivier March 17, 2009 Based on slides by A. Block, notes by N. Fisher, F. Hernandez-Campos, and D. Stotts


Goals

•Discuss Arrays, Pointers, Recursive Types, Strings, Sets, Lists, Equality, and Garbage Collection


Arrays

•Arrays are usually stored in contiguous locations

• Row-major order: consecutive elements of a row are adjacent in memory (as in C)

• Column-major order: consecutive elements of a column are adjacent in memory (as in Fortran)


Multidimensional Arrays: Address Calculations

A: array [L1..U1] of array [L2..U2] of array [L3..U3] of element_type

[Figure: a three-dimensional block of elements indexed by i, j, and k, with lower bounds L1, L2, and L3]

S3 = size of element
S2 = (U3 - L3 + 1) x S3
S1 = (U2 - L2 + 1) x S2

Address of A[i,j,k] = Address of A + (i-L1)xS1 + (j-L2)xS2 + (k-L3)xS3

Optimized: (i x S1) + (j x S2) + (k x S3) + Address of A - [(L1 x S1) + (L2 x S2) + (L3 x S3)]

•The term [(L1 x S1) + (L2 x S2) + (L3 x S3)] can be determined at compile time
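The two forms of the address calculation can be compared with a small C sketch. The descriptor type `array3_desc` and both function names are hypothetical, introduced only for illustration; offsets are in bytes relative to the address of A.

```c
#include <stddef.h>

/* Hypothetical descriptor for A[L1..U1][L2..U2][L3..U3]. */
typedef struct {
    long lo[3];      /* L1, L2, L3 */
    long hi[3];      /* U1, U2, U3 */
    long elem_size;  /* S3 */
} array3_desc;

/* Direct form: (i-L1)*S1 + (j-L2)*S2 + (k-L3)*S3. */
long offset_direct(const array3_desc *d, long i, long j, long k)
{
    long s3 = d->elem_size;
    long s2 = (d->hi[2] - d->lo[2] + 1) * s3;
    long s1 = (d->hi[1] - d->lo[1] + 1) * s2;
    return (i - d->lo[0]) * s1 + (j - d->lo[1]) * s2 + (k - d->lo[2]) * s3;
}

/* Optimized form: constant_part depends only on the bounds, so a compiler
   could fold it into a single constant when the bounds are static. */
long offset_optimized(const array3_desc *d, long i, long j, long k)
{
    long s3 = d->elem_size;
    long s2 = (d->hi[2] - d->lo[2] + 1) * s3;
    long s1 = (d->hi[1] - d->lo[1] + 1) * s2;
    long constant_part = d->lo[0] * s1 + d->lo[1] * s2 + d->lo[2] * s3;
    return i * s1 + j * s2 + k * s3 - constant_part;
}
```

With all lower bounds set to zero, the result should agree with the offset a C compiler computes for an ordinary `int a[3][4][5]`.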


Heap-based Allocation

•The heap is a region of storage in which sub-blocks can be allocated and deallocated.


Issues with Heap Allocation

•Pointers

• Used in value model of variables

• Not necessary for reference model

•Dangling References

•Garbage Collection


Pointers

•Pointers serve two purposes:

• Efficient (and sometimes intuitive) access to elaborated objects (as in C)

• Dynamic creation of linked data structures, in conjunction with a heap storage manager.

•Several languages (e.g., Pascal) restrict pointers to accessing things in the heap

•Pointers are used with a value model of variables

• They aren’t needed with a reference model (already implicit)


C Pointers and Arrays

•C pointers and arrays are closely linked; as function parameters, these declarations are equivalent:

int *a == int a[]
int **a == int *a[]


•But equivalencies don’t always hold

• Specifically, a declaration allocates an array if it specifies a size for the first dimension

• Otherwise it allocates a pointer

int **a, int *a[]   // Pointer to pointer to int
int *a[n]           // n-element array of row pointers
int a[n][m]         // 2-D array


Contiguous 2D Arrays vs. Row Pointers (C)

Contiguous 2-D array:

char schools[][5] = {"UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU"};   /* 35B data */

Array of row pointers:

char *schools[] = {"UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU"};   /* 28B data + 28B ptr, assuming 4-byte pointers */
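The byte counts for the two layouts can be checked with `sizeof`. This is a minimal sketch; the names `schools_2d`, `schools_ptr`, and the three helper functions are made up for the example. The pointer total depends on the platform (28 bytes only where a pointer is 4 bytes).

```c
#include <stddef.h>
#include <string.h>

/* Contiguous layout: 7 rows of 5 bytes each, short rows padded with '\0'. */
static const char schools_2d[][5] = {"UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU"};

/* Row-pointer layout: 7 pointers to separately stored strings. */
static const char *const schools_ptr[] = {"UNC", "DOOK", "BC", "WAKE", "UVA", "VT", "FSU"};

/* Bytes occupied by the contiguous array. */
size_t contiguous_bytes(void) { return sizeof schools_2d; }

/* Bytes occupied by the row pointers alone. */
size_t pointer_bytes(void) { return sizeof schools_ptr; }

/* Bytes occupied by the strings the row pointers refer to (with terminators). */
size_t string_bytes(void)
{
    size_t total = 0;
    size_t n = sizeof schools_ptr / sizeof schools_ptr[0];
    for (size_t r = 0; r < n; r++)
        total += strlen(schools_ptr[r]) + 1;
    return total;
}
```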


Recursive Types (now with Pointers!): Binary Tree

C

struct chr_tree {
    struct chr_tree *l, *r;
    char val;
};

Ada

type chr_tree;
type chr_tree_ptr is access chr_tree;
type chr_tree is record
    left, right : chr_tree_ptr;
    val : character;
end record;

Binary Tree with Explicit Pointers

[Figure: in C, the tree with root R, children X and Y, and Y's children Z and W; each node holds explicit left and right pointers]


Binary Tree in Reference Model

[Figure: the same tree in Lisp, built from cons cells (C) and atoms (A)]

Lisp

'(#\R (#\X () ()) (#\Y (#\Z () ()) (#\W () ())))


Problems with Explicit Reclamation

•Explicit reclamation of heap objects is problematic

• The programmer may forget to deallocate some objects

• Causing memory leaks

• For example, the programmer may forget to include the delete statement

• References to deallocated objects may not be reset

• Creating dangling references

[Figure: ptr1 and ptr2 refer to the same heap object; once the object is deleted through ptr1, ptr2 is left dangling]


Dealing with Dangling References

•Tombstones: Use an intermediary device

• Pointers refer to the tombstone rather than the actual object

• Indirection costs

• Tombstones invalidated on deallocation

• Problem: when do we reclaim the tombstones themselves?

•Locks and Keys: each pointer holds a key that must match a lock stored in the object

• Check key against lock on each access (costly)

• Reset lock to zero on deallocation; the block should get a different lock value upon reuse


Tombstones

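new(ptr1);      [ptr1 -> tombstone -> object]

ptr2 := ptr1;   [ptr1 and ptr2 -> the same tombstone -> object]

delete(ptr1);   [object reclaimed; the tombstone is invalidated, so neither ptr1 nor ptr2 reaches freed storage]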
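The tombstone sequence above might look as follows in C. This is only a sketch: the names `tombstone`, `tptr`, `ts_new`, `ts_delete`, and `ts_deref` are invented for the example, and the tombstone itself is never reclaimed, which is exactly the open problem the slide mentions.

```c
#include <stdlib.h>

/* A tombstone sits between every pointer and the object it refers to. */
typedef struct {
    void *object;   /* NULL once the object has been deallocated */
} tombstone;

typedef tombstone *tptr;   /* program "pointers" refer to tombstones */

/* new(p): allocate the object together with its tombstone. */
tptr ts_new(size_t size)
{
    tombstone *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->object = malloc(size);
    if (t->object == NULL) { free(t); return NULL; }
    return t;
}

/* delete(p): free the object but keep the tombstone, marked invalid. */
void ts_delete(tptr p)
{
    free(p->object);
    p->object = NULL;
}

/* Every access pays one extra indirection. A dangling reference yields
   NULL, which the caller can test, instead of a stale address. */
void *ts_deref(tptr p)
{
    return p->object;
}
```

Because both `ptr1` and `ptr2` hold the address of the same tombstone, invalidating it through one of them is visible through the other.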


Locks and Keys

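new(ptr1);      [ptr1 key: 135942; object lock: 135942]

ptr2 := ptr1;   [ptr1 key: 135942; ptr2 key: 135942; object lock: 135942]

delete(ptr1);   [ptr1 key: 135942; ptr2 key: 135942; object lock: 0, so neither key matches]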
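A rough C sketch of the locks-and-keys check follows. All names (`lk_object`, `lk_ptr`, `lk_new`, `lk_valid`, `lk_delete`) are hypothetical. One simplification is labeled loudly: `lk_delete` only zeroes the lock and does not hand the storage back, because reading the lock word of memory already passed to `free` would be undefined behaviour in C; a real allocator would have to keep the lock word readable.

```c
#include <stdlib.h>

typedef struct {
    unsigned long lock;   /* stored at the start of every heap object */
    int payload;
} lk_object;

typedef struct {
    unsigned long key;    /* copied along with the address on assignment */
    lk_object *addr;
} lk_ptr;

static unsigned long next_lock = 135942;   /* arbitrary nonzero starting value */

/* new(p): allocate an object and give the pointer a matching key. */
lk_ptr lk_new(void)
{
    lk_ptr p = {0, NULL};
    lk_object *o = malloc(sizeof *o);
    if (o != NULL) {
        o->lock = next_lock++;   /* each allocation gets a fresh lock value */
        o->payload = 0;
        p.key = o->lock;
        p.addr = o;
    }
    return p;
}

/* An access is legal only while the pointer's key matches the object's lock. */
int lk_valid(lk_ptr p)
{
    return p.addr != NULL && p.addr->lock == p.key;
}

/* delete(p): zero the lock so every outstanding key stops matching.
   Simplification: the storage is not released here (see the lead-in). */
void lk_delete(lk_ptr p)
{
    if (lk_valid(p))
        p.addr->lock = 0;
}
```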


Garbage Collection

•Automatic reclamation of the space used by objects that are no longer useful:

• Developed for functional languages

• Essential in this programming paradigm.

• More and more popular in imperative languages

• Java, C#, Python

•Generally slower than manual reclamation, but it eliminates a very frequent programming error


Garbage Collection: Techniques

•When is an object no longer useful?

•There are several garbage collection techniques that answer this question in different ways

• Reference counting

• Mark-and-sweep collection

• Store-and-copy

• Generational collection


Reference Counting

•Each object has an associated reference counter

•The counter is kept up to date, and the object is deallocated when the counter reaches zero

[Figure: stack variables ptr1 and ptr2 both point to the heap string “foo”, whose count is 2; as each pointer is dropped the count falls to 1 and then to 0, at which point “foo” is reclaimed]
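The counting discipline can be sketched in C with explicit retain and release calls. The names `rc_string`, `rc_new`, `rc_retain`, `rc_release`, and the `rc_freed` counter are assumptions made for this example; in a language with built-in reference counting the compiler or runtime would insert these updates automatically.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int count;       /* number of references to this object */
    char text[16];
} rc_string;

static int rc_freed = 0;   /* counts deallocations, for illustration only */

/* Allocate a string object with one reference (the one returned). */
rc_string *rc_new(const char *s)
{
    rc_string *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->count = 1;
    strncpy(o->text, s, sizeof o->text - 1);
    o->text[sizeof o->text - 1] = '\0';
    return o;
}

/* Copying a reference increments the count. */
rc_string *rc_retain(rc_string *o)
{
    o->count++;
    return o;
}

/* Dropping a reference decrements it; at zero the object is reclaimed. */
void rc_release(rc_string *o)
{
    if (--o->count == 0) {
        free(o);
        rc_freed++;
    }
}
```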


Reference Counting: Problems

•Extra overhead of storing and updating reference counts

•Strong typing required

• Impossible in a language like C

• It cannot be used for variant records

•It doesn’t work with circular data structures

• The problem lies in defining a useful object as one with one or more references


Reference Counting: Circular Data Structures

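[Figure, before: the stack variable stooges points to a circular list in the heap: “larry” (count 2) -> “moe” (count 1) -> “curly” (count 1) -> “larry”]

[Figure, after: stooges no longer refers to the list, yet every count is still 1, so nothing is reclaimed]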
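The leak caused by a cycle can be reproduced with a small reference-counted list in C. The names `rc_node`, `node_new`, `node_link`, `node_release`, and `nodes_freed` are hypothetical and exist only for this sketch.

```c
#include <stdlib.h>

typedef struct rc_node {
    int count;               /* references to this node */
    const char *name;
    struct rc_node *next;
} rc_node;

static int nodes_freed = 0;  /* counts deallocations, for illustration only */

rc_node *node_new(const char *name)
{
    rc_node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->count = 1;            /* the reference returned to the caller */
    n->name = name;
    n->next = NULL;
    return n;
}

/* Store a reference to `to` in from->next and count it. */
void node_link(rc_node *from, rc_node *to)
{
    to->count++;
    from->next = to;
}

/* Drop one reference; a node that reaches zero also drops its own
   reference to its successor before being freed. */
void node_release(rc_node *n)
{
    if (n != NULL && --n->count == 0) {
        node_release(n->next);
        free(n);
        nodes_freed++;
    }
}
```

If three nodes are linked into a ring and every outside reference is then released, each count settles at 1 and `nodes_freed` stays at 0: the ring is unreachable but never reclaimed.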


Mark-and-Sweep Collection

• A better definition: a useless object is one that cannot be reached by following a chain of valid pointers starting from outside the heap.

• Mark-and-Sweep GC applies this definition


Mark-and-Sweep

• Algorithm:

• Mark every block in the heap as useless

• Starting with all pointers outside the heap, recursively explore all linked data structures, marking each block reached as useful

• Add every block that remains marked useless to the free list.

• Run whenever free space is low
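The three steps of the algorithm can be sketched over a toy heap in C. Everything here is an assumption made for illustration: the heap is a fixed array of equal-sized blocks named `heap`, each block has at most two outgoing pointers, and the "free list" is represented simply by clearing an `in_use` flag.

```c
#include <stddef.h>

#define HEAP_BLOCKS 8

typedef struct block {
    int in_use;                  /* currently allocated to the program? */
    int useful;                  /* mark bit */
    struct block *left, *right;  /* outgoing pointers, possibly NULL */
} block;

static block heap[HEAP_BLOCKS];  /* toy heap of fixed-size blocks */

/* Step 2 helper: recursively mark everything reachable from p. */
static void mark(block *p)
{
    if (p == NULL || p->useful) return;   /* already visited: stops on cycles */
    p->useful = 1;
    mark(p->left);
    mark(p->right);
}

/* Runs one collection and returns the number of blocks reclaimed. */
int collect(block **roots, int nroots)
{
    int reclaimed = 0;
    for (int i = 0; i < HEAP_BLOCKS; i++)       /* step 1: everything useless */
        heap[i].useful = 0;
    for (int i = 0; i < nroots; i++)            /* step 2: explore from roots */
        mark(roots[i]);
    for (int i = 0; i < HEAP_BLOCKS; i++) {     /* step 3: sweep */
        if (heap[i].in_use && !heap[i].useful) {
            heap[i].in_use = 0;                 /* stands in for the free list */
            heap[i].left = heap[i].right = NULL;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Unlike reference counting, this reclaims an unreachable cycle, because reachability rather than the number of references decides what is useful.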


Mark-and-Sweep Collection: Problems

•Block must begin with an indication of its size

• Not needed if we use type descriptors that indicate the size

•A stack of depth proportional to the longest reference chain is required

• AND! We are usually running low on space when running the GC

• Possible to implement without a stack!

•Must “stop the world” to run

• Much work in parallel garbage collection to address this

• Also can be used periodically in combination with reference counting

Mark-and-Sweep Collection: Pointer Reversal

[Figure, in seven steps: the collector explores the tree with root R, children X and Y, and Y's children Z and W. As it moves down into a block, it reverses the pointer it followed so that the pointer refers back to the block it came from; as it returns, it restores the pointer. No separate stack is needed.]
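Pointer reversal can be sketched for blocks with two pointer fields, in the style of the Schorr-Waite algorithm. The type `pr_node`, the function `mark_by_reversal`, and the use of a whole `int` for the per-block state are assumptions made for readability; a real collector would pack the state into a couple of bits.

```c
#include <stddef.h>

typedef struct pr_node {
    struct pr_node *left, *right;
    int state;   /* 0 = unmarked; 1..3 = marked, with 0..2 children started */
} pr_node;

/* Marks every node reachable from root using no stack and no recursion. */
void mark_by_reversal(pr_node *root)
{
    pr_node *parent = NULL;   /* head of the chain of reversed pointers */
    pr_node *cur = root;
    pr_node *tmp;

    if (cur == NULL || cur->state != 0) return;
    cur->state = 1;

    while (cur != NULL) {
        if (cur->state == 1) {                 /* try the left child */
            cur->state = 2;
            tmp = cur->left;
            if (tmp != NULL && tmp->state == 0) {
                cur->left = parent;            /* reverse: remember the way back */
                parent = cur;
                cur = tmp;
                cur->state = 1;
            }
        } else if (cur->state == 2) {          /* try the right child */
            cur->state = 3;
            tmp = cur->right;
            if (tmp != NULL && tmp->state == 0) {
                cur->right = parent;
                parent = cur;
                cur = tmp;
                cur->state = 1;
            }
        } else {                               /* both children done: go back up */
            if (parent == NULL) break;         /* back at the root: finished */
            if (parent->state == 2) {          /* we came down parent's left */
                tmp = parent->left;
                parent->left = cur;            /* restore the original pointer */
            } else {                           /* we came down parent's right */
                tmp = parent->right;
                parent->right = cur;
            }
            cur = parent;
            parent = tmp;
        }
    }
}
```

When the function returns, every reachable node has state 3 and all `left` and `right` fields hold their original values again.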


Store-and-Copy

•Used to reduce external fragmentation

•S-C divides the available space in half and allocates everything in one half until it’s full

•When that happens, it copies each useful block to the other half, reclaims everything left behind, and switches the roles of the two halves.

•Drawback: only half of the heap can be used at a time
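The copying step can be sketched with two fixed-size halves in C. This technique is more commonly called stop-and-copy or copying collection; the scan loop below follows Cheney's approach. All names (`cell`, `space_a`, `space_b`, `sc_alloc`, `sc_collect`) and the single-root interface are assumptions for the example.

```c
#include <stddef.h>

#define SEMI_CELLS 8

typedef struct cell {
    struct cell *left, *right;
    struct cell *forward;   /* new address, set once the cell has been copied */
    int value;
} cell;

static cell space_a[SEMI_CELLS], space_b[SEMI_CELLS];
static cell *from_space = space_a, *to_space = space_b;
static int used = 0;        /* cells allocated so far in from_space */

/* Allocate sequentially in the active half; NULL means "collect first". */
cell *sc_alloc(int value)
{
    if (used == SEMI_CELLS) return NULL;
    cell *c = &from_space[used++];
    c->left = c->right = c->forward = NULL;
    c->value = value;
    return c;
}

/* Copy one cell into to_space (at most once) and return its new address. */
static cell *sc_copy(cell *c, int *next)
{
    if (c == NULL) return NULL;
    if (c->forward == NULL) {
        cell *n = &to_space[(*next)++];
        *n = *c;            /* children still point into the old half for now */
        c->forward = n;
    }
    return c->forward;
}

/* Copy everything reachable from *root, then switch the roles of the halves. */
void sc_collect(cell **root)
{
    int next = 0, scan = 0;
    *root = sc_copy(*root, &next);
    while (scan < next) {                   /* fix up the children of copied cells */
        cell *c = &to_space[scan++];
        c->left = sc_copy(c->left, &next);
        c->right = sc_copy(c->right, &next);
    }
    cell *tmp = from_space;
    from_space = to_space;
    to_space = tmp;
    used = next;                            /* live cells are now contiguous */
}
```

Live cells end up packed at the start of the new half, which is how the scheme removes external fragmentation.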


Generational Collection

•Based on the idea that objects are often short-lived

•Heap space is divided into regions based on how many times objects have been seen by the garbage collector

•The region of objects allocated since the last GC (called the “nursery”) is explored for reclamation before the regions containing older objects


Lists

•We’ve seen lists in ML

•In Lisp, homogeneity is not required

• A list is technically defined recursively as either the empty list or a pair consisting of an object (which may be either a list or an atom) and another (shorter) list

• In Lisp, in fact, a program is a list, and can extend itself at run time by constructing a list and executing it.

•Lists are also supported in some imperative languages


List Comprehensions

•Specify expression, enumerator, and filter(s)

•Example: squares of odd numbers less than 100:

[i*i | i <- [1..100], i `mod` 2 == 1]

•Provided in Haskell, Miranda, Python
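For comparison, the same list written out as an explicit loop in C, which has no comprehensions. The function name `odd_squares` and its buffer-based interface are assumptions for this sketch; the comments tie each part of the loop to the expression, enumerator, and filter of the comprehension.

```c
#include <stddef.h>

/* Writes the squares of the odd numbers in 1..100 into out (capacity cap)
   and returns how many values were written. */
size_t odd_squares(int *out, size_t cap)
{
    size_t n = 0;
    for (int i = 1; i <= 100; i++) {      /* enumerator: i <- [1..100] */
        if (i % 2 == 1 && n < cap)        /* filter: i `mod` 2 == 1 */
            out[n++] = i * i;             /* expression: i*i */
    }
    return n;
}
```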


Strings

•A string may be regarded as merely an array of characters or as a distinct data type

•Some languages that consider them character arrays nonetheless provide some syntactic sugar for them

• For example, C: char str[] = "hello world";

• Orthogonality problem: can’t assign string literal after declaration

•Support for variable-length strings

• Built-in type in functional languages (Lisp, Scheme, ML)

• String class in object-oriented languages (C++, Java, C#)
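The orthogonality problem with C strings shows up in a few lines. This is a minimal sketch; the function name `string_demo` is made up, and the line that would not compile is left as a comment.

```c
#include <string.h>

/* Returns 1 if the array-based string operations behave as described. */
int string_demo(void)
{
    char str[] = "hello world";   /* sugar: array sized to 11 chars + '\0' */
    char buf[12];
    const char *p;

    /* buf = "hello world";  would not compile: arrays are not assignable */
    strcpy(buf, str);             /* the copy has to be made explicitly */

    p = "hello world";            /* a pointer, by contrast, can be assigned */

    return sizeof str == 12 && strcmp(buf, p) == 0;
}
```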


Sets

•A set is an unordered collection of an arbitrary number of distinct values of a common type.

•The universe comprises all the possible values that could be in the set

• e.g., the 256 possible 8-bit character codes

•Often represented as a bit vector for fast set operations

• Each bit represents one possible member of the set

• This is not feasible for a large universe, e.g., the integers

• Use a hash table or tree instead
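A bit-vector set over a universe of 256 character codes might be sketched in C as follows. The type `charset` and the `set_*` functions are hypothetical names chosen for the example.

```c
#include <limits.h>
#include <string.h>

#define UNIVERSE 256
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define WORDS ((UNIVERSE + WORD_BITS - 1) / WORD_BITS)

/* One bit per possible member of the set. */
typedef struct { unsigned long bits[WORDS]; } charset;

void set_clear(charset *s) { memset(s->bits, 0, sizeof s->bits); }

void set_add(charset *s, unsigned char c)
{
    s->bits[c / WORD_BITS] |= 1UL << (c % WORD_BITS);
}

int set_has(const charset *s, unsigned char c)
{
    return (int)((s->bits[c / WORD_BITS] >> (c % WORD_BITS)) & 1UL);
}

/* Union and intersection handle WORD_BITS members per machine operation. */
void set_union(charset *out, const charset *a, const charset *b)
{
    for (size_t i = 0; i < WORDS; i++)
        out->bits[i] = a->bits[i] | b->bits[i];
}

void set_intersect(charset *out, const charset *a, const charset *b)
{
    for (size_t i = 0; i < WORDS; i++)
        out->bits[i] = a->bits[i] & b->bits[i];
}
```

The whole set occupies 32 bytes, which is why the representation stops being practical once the universe is as large as the integers.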


Equality and Assignment

•What does equality mean for complex types?

• Shallow comparison: refer to the same object

• Deep comparison: identical structure with identical values throughout all data members

•Can also have shallow or deep assignment

• Tricky with pointers: in a shallow copy, pointers in both copies point to the same objects; a deep copy must duplicate the pointed-to objects as well
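The difference between shallow and deep comparison and assignment can be seen on a C struct with one pointer member. The `person` type and the four functions are assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int id;
    char *name;    /* pointer member: where shallow and deep differ */
} person;

/* Shallow comparison: do both references denote the same object? */
int shallow_equal(const person *a, const person *b) { return a == b; }

/* Deep comparison: identical values throughout, following the pointer. */
int deep_equal(const person *a, const person *b)
{
    return a->id == b->id && strcmp(a->name, b->name) == 0;
}

/* Shallow assignment: the copy shares the original's name buffer. */
void shallow_copy(person *dst, const person *src) { *dst = *src; }

/* Deep assignment: the copy gets its own name buffer. Returns 0 on failure. */
int deep_copy(person *dst, const person *src)
{
    char *name = malloc(strlen(src->name) + 1);
    if (name == NULL) return 0;
    strcpy(name, src->name);
    dst->id = src->id;
    dst->name = name;
    return 1;
}
```

After a shallow copy, a change made through the original's `name` is visible in the copy; after a deep copy it is not.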
