Knowing your garbage collector
Francisco Fernandez Castano
upclose.me
@fcofdezc
January 31, 2015
Francisco Fernandez Castano (@fcofdezc) Python GC January 31, 2015 1 / 37
Overview
1 Introduction
    Motivation
    Concepts
2 Algorithms
    CPython RC
    PyPy
Motivation
Managing memory manually is hard.
Who owns the memory?
Should I free these resources?
What happens with double frees?
Dangling pointers
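int *func(void)
{
    int num = 1234;  /* local variable: lives in func's stack frame */
    /* ... */
    return &num;     /* BUG: this address is invalid once func returns */
}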
Ownership
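#include <stdlib.h>

int *func(void)
{
    int *num = malloc(10 * sizeof(int));  /* heap allocation outlives func */
    /* ... */
    return num;  /* the caller now owns this memory and must free() it */
}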
John McCarthy
Basic concepts
Heap
A data structure in which objects may be allocated or deallocated in any order.
Mutator
The part of a running program which executes application code.
Collector
The part of a running program responsible for garbage collection.
Garbage collection
Definition
Garbage collection is automatic memory management. While the mutator runs, it routinely allocates memory from the heap. If more memory than available is needed, the collector reclaims unused memory and returns it to the heap.
CPython GC
The CPython implementation has garbage collection.
CPython's GC algorithm is reference counting plus a cycle detector.
The cycle detector is a generational GC.
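The generational part can be inspected from Python through the standard `gc` module. A minimal sketch, assuming a CPython interpreter; the printed numbers vary between versions and runs, so none are shown here.

```python
import gc

# One allocation threshold and one counter per generation.
thresholds = gc.get_threshold()
counts = gc.get_count()
print("thresholds:", thresholds)
print("counts:    ", counts)

# Collect only the youngest generation; gc.collect() with no argument
# runs a full collection instead.
unreachable_found = gc.collect(0)
print("unreachable objects found:", unreachable_found)
```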
Young objects
[elem * 2 for elem in elements]
balance = (a / b / c) * 4
'asdadsasd-xxx'.replace('x', 'y').replace('a', 'b')
foo.bar()
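Expressions like these create temporaries that die almost immediately. A rough sketch of that behaviour, assuming CPython (where reference counting finalizes an object as soon as its last reference goes away); `Temp` and `doubled` are hypothetical names used only for illustration.

```python
class Temp:
    """Hypothetical value wrapper that counts creations and finalizations."""
    created = 0
    finalized = 0

    def __init__(self, value):
        self.value = value
        Temp.created += 1

    def __del__(self):
        Temp.finalized += 1

    def doubled(self):
        # Returns a new object; the receiver may become garbage right away.
        return Temp(self.value * 2)


# Three objects are created, but only the last one is still referenced
# once the statement finishes.
result = Temp(1).doubled().doubled()
print(Temp.created, Temp.finalized, result.value)
```

On other implementations, such as PyPy, the finalizers may run later, so the second counter can differ.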
PyObject
typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;
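The `ob_refcnt` field can be observed from Python with `ctypes`. This is only a sketch and leans on several assumptions: a standard (non-debug, non-free-threaded) CPython build where `_PyObject_HEAD_EXTRA` is empty, so the counter sits at the start of the object, and `id()` returning the object's address. Newer CPython versions pack extra bits next to the counter, so the code compares two readings instead of interpreting the raw value. `read_ob_refcnt` is a hypothetical helper.

```python
import ctypes


def read_ob_refcnt(address):
    """Read the Py_ssize_t stored at the start of the object at `address`."""
    return ctypes.c_ssize_t.from_address(address).value


obj = object()
addr = id(obj)            # CPython detail: id() is the object's memory address

before = read_ob_refcnt(addr)
alias = obj               # bind one more name to the same object
after = read_ob_refcnt(addr)

print("change in ob_refcnt:", after - before)
```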
PyTypeObject
typedef struct _typeobject {
    PyObject_VAR_HEAD
    const char *tp_name;
    Py_ssize_t tp_basicsize, tp_itemsize;
    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    void *tp_reserved;
    /* ... */
} PyTypeObject;
Reference Counting Algorithm
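The idea can be sketched as a toy model in pure Python. This is not CPython's implementation; `RcObj`, `incref`, `decref` and `add_ref` are hypothetical names, and the "heap" is just ordinary Python objects with an explicit counter.

```python
class RcObj:
    """A toy heap object with an explicit reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.children = []   # outgoing references held by this object


freed = []  # names of reclaimed objects, in the order they were freed


def incref(obj):
    obj.refcount += 1


def decref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        # The last reference is gone: release everything this object
        # referred to (which may cascade), then free the object itself.
        for child in obj.children:
            decref(child)
        obj.children.clear()
        freed.append(obj.name)


def add_ref(src, dst):
    """Make src refer to dst."""
    src.children.append(dst)
    incref(dst)


a, b = RcObj("a"), RcObj("b")
incref(a)        # a root (for example a variable) refers to a
add_ref(a, b)    # a -> b
decref(a)        # the root goes away; a and then b become garbage
print(freed)     # ['b', 'a']
```

Memory is reclaimed at the moment the count reaches zero, without any separate collection pass.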
Cycles
l = []
l.append(l)
del l
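The effect can be observed with a weak reference. A minimal sketch, assuming CPython; `Node` is a hypothetical stand-in for the list, because plain lists cannot be weakly referenced.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for the list in the snippet above."""


gc.disable()                  # keep automatic cycle collection out of the picture
node = Node()
node.me = node                # the object refers to itself, like l.append(l)
probe = weakref.ref(node)     # observes the object without keeping it alive

del node                      # drops the only external reference
alive_after_del = probe() is not None

gc.collect()                  # run the cycle detector explicitly
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

After `del`, the reference count is still one because of the self-reference, so reference counting alone never frees the object; the cycle detector does.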
PyGC_Head
typedef union _gc_head {
    struct {
        union _gc_head *gc_next;
        union _gc_head *gc_prev;
        Py_ssize_t gc_refs;
    } gc;
    double dummy; /* force worst-case alignment */
} PyGC_Head;
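Only objects that can take part in reference cycles carry this header and are linked into the collector's lists. From Python, `gc.is_tracked` reports whether an object is currently tracked; a small sketch, assuming CPython:

```python
import gc

# Atomic objects cannot contain references to other objects, so the
# cycle detector does not need to track them. Containers can.
tracked = {
    "int": gc.is_tracked(42),
    "str": gc.is_tracked("text"),
    "list": gc.is_tracked([]),
}
print(tracked)
```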
CPython Memory Allocator
Demo
Reference counting
Pros: It is incremental; memory is freed as the program runs, as soon as a count drops to zero.
Cons: Detecting cycles can be hard.
Cons: Per-object size overhead.
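The size overhead can be estimated from Python. A rough sketch, assuming a standard CPython build where every object starts with `ob_refcnt` and `ob_type`; other builds may use a larger header, so the comparison is only a lower bound.

```python
import struct
import sys

# Native sizes of the two PyObject fields: Py_ssize_t ("n") and a pointer ("P").
refcount_bytes = struct.calcsize("n")
header_bytes = struct.calcsize("nP")

# The smallest possible Python object still pays for the full header.
smallest_object = sys.getsizeof(object())

print("refcount field:", refcount_bytes, "bytes")
print("object header: ", header_bytes, "bytes")
print("bare object(): ", smallest_object, "bytes")
```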
PyPy
Mark and Sweep Algorithm
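A toy version of the algorithm in pure Python, as a sketch only. It is not PyPy's actual collector; `MsObj`, `mark`, `sweep` and `collect` are hypothetical names, and the heap is modelled as a plain list.

```python
class MsObj:
    """A toy heap object for a mark-and-sweep sketch."""

    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references
        self.marked = False


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, reset the marks."""
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False
            survivors.append(obj)
    return survivors


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = (MsObj(name) for name in "abcd")
a.refs.append(b)              # a -> b, reachable from the root
c.refs.append(d)              # c <-> d form a cycle with no path from a root
d.refs.append(c)

heap = collect([a, b, c, d], roots=[a])
print([obj.name for obj in heap])   # ['a', 'b']
```

The unreachable cycle between `c` and `d` is reclaimed because reachability, not a counter, decides what survives.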
Mark and sweep
Pros: Can collect cycles.
Cons: A basic implementation stops the world: the mutator is paused while the collector runs.
Questions?
The End