The Python Interpreter is Fun and Not At All Terrifying: Opcodes

name: Alex Golec
twitter: @alexandergolec, not @alexgolec :(
email: akg2136 (rhymes with cat) columbia dot (short for education)
this talk lives at: blog.alexgolec.com

Python opcodes


The Python interpreter converts programs to bytecode before beginning execution. Execution itself consists of looping over these bytecodes and performing a specific operation for each one. This talk gives a very brief overview of the main classes of bytecodes. This presentation was given as a lightning talk at the Boston Python Meetup group on July 24th, 2012.


Page 1: Python opcodes

The Python Interpreter is Fun and Not At All Terrifying: Opcodes

name: Alex Golec

twitter: @alexandergolec, not @alexgolec :(

email: akg2136 (rhymes with cat) columbia dot (short for education)

this talk lives at: blog.alexgolec.com


Page 2: Python opcodes

Python is Bytecode-Interpreted

• Your Python program is compiled down to bytecode

• Bytecode is sort of like assembly for the Python virtual machine

• The interpreter executes each of these bytecodes one by one
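A minimal sketch of the compile-then-execute split, using the built-in compile() and eval(). It assumes Python 3 (the talk itself uses CPython 2.7.2), and the expression string is made up for illustration.

```python
# Step 1: source text is compiled into a code object holding bytecode.
source = "3 * 3 + 4 * 3 + 4"                   # illustrative expression
code = compile(source, "<example>", "eval")

# The raw bytecode is just a bytes string attached to the code object.
raw_bytecode = code.co_code

# Step 2: the interpreter executes that bytecode.
result = eval(code)

print(type(raw_bytecode).__name__, len(raw_bytecode))
print(result)
```

The exact bytes in co_code differ between interpreter versions, so only the result of running them is stable.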


Page 3: Python opcodes

Before we Begin

• This presentation was written using CPython 2.7.2, which ships with the Mac OS X Mountain Lion GM image

• The more adventurous among you will find that minor details will differ on PyPy / IronPython / Jython


Page 4: Python opcodes

The Interpreter is Responsible For:

• Issuing commands to objects and maintaining stack state

• Flow Control

• Managing namespaces

• Turning code objects into functions and classes


Page 5: Python opcodes

Issuing Commands to Objects and Maintaining Stack State


Page 6: Python opcodes

The dis Module

    >>> import dis
    >>> def parabola(x):
    ...     return x*x + 4*x + 4
    ...
    >>> dis.dis(parabola)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_FAST                0 (x)
                  6 BINARY_MULTIPLY
                  7 LOAD_CONST               1 (4)
                 10 LOAD_FAST                0 (x)
                 13 BINARY_MULTIPLY
                 14 BINARY_ADD
                 15 LOAD_CONST               1 (4)
                 18 BINARY_ADD
                 19 RETURN_VALUE

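• Instructions that take an argument are three bytes long; those without one (such as BINARY_MULTIPLY) are a single byte, as the offsets in the listing show

• Opcodes have friendly(ish) mnemonics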
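The same listing can be walked programmatically. This sketch assumes Python 3.4 or later, where dis.get_instructions is available; the opcode names and offsets it prints will not match the CPython 2.7.2 output on the slide.

```python
import dis

def parabola(x):
    return x * x + 4 * x + 4

# dis.get_instructions yields one Instruction object per bytecode instruction.
instructions = list(dis.get_instructions(parabola))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# Collect just the mnemonics, e.g. to count or filter them.
opnames = [ins.opname for ins in instructions]
```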


Page 7: Python opcodes

Example: Arithmetic Operations

• We don’t know the type of x!

• How does BINARY_MULTIPLY know how to perform multiplication?

• What if I pass a string?

• Note the lack of registers; the Python virtual machine is stack-based

    >>> def parabola(x):
    ...     return x*x + 4*x + 4
    ...
    >>> dis.dis(parabola)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_FAST                0 (x)
                  6 BINARY_MULTIPLY
                  7 LOAD_CONST               1 (4)
                 10 LOAD_FAST                0 (x)
                 13 BINARY_MULTIPLY
                 14 BINARY_ADD
                 15 LOAD_CONST               1 (4)
                 18 BINARY_ADD
                 19 RETURN_VALUE
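To make the stack-based point concrete, here is a toy evaluator for the listing above. It is a sketch, not how CPython is implemented: the (mnemonic, argument) tuples, the run function and the BINARY_OPS table are all hypothetical stand-ins.

```python
import operator

# The parabola listing, transcribed as (mnemonic, argument) pairs.
PROGRAM = [
    ("LOAD_FAST", "x"), ("LOAD_FAST", "x"), ("BINARY_MULTIPLY", None),
    ("LOAD_CONST", 4), ("LOAD_FAST", "x"), ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
    ("LOAD_CONST", 4), ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

BINARY_OPS = {"BINARY_MULTIPLY": operator.mul, "BINARY_ADD": operator.add}

def run(program, local_vars):
    """Execute the toy program using nothing but a value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])      # push a local variable
        elif opname == "LOAD_CONST":
            stack.append(arg)                  # push a constant
        elif opname in BINARY_OPS:
            right = stack.pop()                # operands come off the stack...
            left = stack.pop()
            stack.append(BINARY_OPS[opname](left, right))  # ...result goes back on
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)

print(run(PROGRAM, {"x": 3}))
```

No instruction names a register or an operand location; every operand is whatever happens to be on top of the stack.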


Page 8: Python opcodes

Things the Interpreter Doesn’t Do: Typed Method Dispatch

• The Python interpreter does not know anything about how to add two numbers (or objects, for that matter)

• Instead, it simply maintains a stack of objects, and when it comes time to perform an operation, it asks the objects themselves to perform it

• The result gets pushed onto the stack
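A short sketch of what "asking the objects" looks like from the Python side. The Meters class is hypothetical; the point is that the same multiplication in times_three ends up in whichever __mul__ the operand provides.

```python
class Meters:
    """Hypothetical type that decides for itself what '*' means."""

    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        return Meters(self.value * other)

def times_three(x):
    # The function body is compiled once; the operand picks the behavior.
    return x * 3

print(times_three(2))                # handled by int
print(times_three("ab"))             # handled by str: repetition
print(times_three(Meters(2)).value)  # handled by Meters.__mul__
```

This also answers the string question for parabola: "ab" * "ab" raises TypeError, because str has no way to multiply itself by another str.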


Page 9: Python opcodes

Flow Control


Page 10: Python opcodes

Flow Control

• Jumps can be relative or absolute

• Relevant opcodes:

  • JUMP_FORWARD

  • POP_JUMP_IF_[TRUE/FALSE]

  • JUMP_IF_[TRUE/FALSE]_OR_POP

  • JUMP_ABSOLUTE

  • SETUP_LOOP

  • [BREAK/CONTINUE]_LOOP

    >>> def abs(x):
    ...     if x < 0:
    ...         x = -x
    ...     return x
    ...
    >>> dis.dis(abs)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_CONST               1 (0)
                  6 COMPARE_OP               0 (<)
                  9 POP_JUMP_IF_FALSE       22

      3          12 LOAD_FAST                0 (x)
                 15 UNARY_NEGATIVE
                 16 STORE_FAST               0 (x)
                 19 JUMP_FORWARD             0 (to 22)

      4     >>   22 LOAD_FAST                0 (x)
                 25 RETURN_VALUE
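The difference between absolute and relative jumps can be sketched with a toy interpreter loop that keeps a program counter. Everything here is hypothetical: jump targets are list indices rather than byte offsets, and COMPARE_LT stands in for COMPARE_OP with the < argument.

```python
# The abs listing, transcribed for the toy interpreter below.
ABS_PROGRAM = [
    ("LOAD_FAST", "x"),          # 0
    ("LOAD_CONST", 0),           # 1
    ("COMPARE_LT", None),        # 2
    ("POP_JUMP_IF_FALSE", 8),    # 3: absolute target
    ("LOAD_FAST", "x"),          # 4
    ("UNARY_NEGATIVE", None),    # 5
    ("STORE_FAST", "x"),         # 6
    ("JUMP_FORWARD", 0),         # 7: relative to the next instruction
    ("LOAD_FAST", "x"),          # 8
    ("RETURN_VALUE", None),      # 9
]

def run(program, local_vars):
    local_vars = dict(local_vars)    # do not mutate the caller's dict
    stack = []
    pc = 0
    while True:
        opname, arg = program[pc]
        pc += 1                      # default: fall through to the next one
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opname == "LOAD_CONST":
            stack.append(arg)
        elif opname == "STORE_FAST":
            local_vars[arg] = stack.pop()
        elif opname == "COMPARE_LT":
            right = stack.pop()
            left = stack.pop()
            stack.append(left < right)
        elif opname == "UNARY_NEGATIVE":
            stack.append(-stack.pop())
        elif opname == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg             # absolute: overwrite the counter
        elif opname == "JUMP_FORWARD":
            pc += arg                # relative: add to the counter
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)

print(run(ABS_PROGRAM, {"x": -5}), run(ABS_PROGRAM, {"x": 3}))
```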


Page 11: Python opcodes

Managing Namespaces


Page 12: Python opcodes

Simple Namespaces

• Variables, functions, etc. are all treated identically

    >>> def example():
    ...     variable = 1
    ...     def function():
    ...         print 'function'
    ...     del variable
    ...     del function
    ...
    >>> dis.dis(example)
      2           0 LOAD_CONST               1 (1)
                  3 STORE_FAST               0 (variable)

      3           6 LOAD_CONST               2 (<code object function at 0x10c545930, file "<stdin>", line 3>)
                  9 MAKE_FUNCTION            0
                 12 STORE_FAST               1 (function)

      5          15 DELETE_FAST              0 (variable)

      6          18 DELETE_FAST              1 (function)
                 21 LOAD_CONST               0 (None)
                 24 RETURN_VALUE

• Once the name is assigned to the object, the interpreter completely forgets everything about it except the name
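A small sketch of the same idea from inside Python, assuming Python 3. The alias name is added for illustration: del removes a name from the namespace, and the function object survives as long as some other name still points at it.

```python
def example():
    variable = 1

    def function():
        return "function"

    alias = function      # a second name bound to the same function object
    del variable          # integers and functions are unbound the same way
    del function          # removes only the name, not the object
    return alias, sorted(locals())

alias, remaining_names = example()
print(remaining_names)
print(alias())
```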


Page 13: Python opcodes

Turning Code Objects into Functions and Classes


Page 14: Python opcodes

Functions First!

    >>> def square(inputfunc):
    ...     def f(x):
    ...         return inputfunc(x) * inputfunc(x)
    ...     return f
    ...
    >>> dis.dis(square)
      2           0 LOAD_CLOSURE             0 (inputfunc)
                  3 BUILD_TUPLE              1
                  6 LOAD_CONST               1 (<code object f at 0x10c545a30, file "<stdin>", line 2>)
                  9 MAKE_CLOSURE             0
                 12 STORE_FAST               1 (f)

      4          15 LOAD_FAST                1 (f)
                 18 RETURN_VALUE

• The compiler generates code objects and sticks them in memory
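The code objects the compiler "sticks in memory" can be inspected, and one can be turned into a function by hand. This is a sketch assuming Python 3; types.FunctionType plays roughly the role MAKE_FUNCTION plays in the listing, and outer and double are made-up names.

```python
import types

def outer():
    def double(x):
        return 2 * x
    return double

# The nested function's code object sits among outer's constants.
code_objects = [c for c in outer.__code__.co_consts
                if isinstance(c, types.CodeType)]
double_code = code_objects[0]

# Pair the code object with a globals dict to get a callable function.
rebuilt = types.FunctionType(double_code, {})
print(double_code.co_name, rebuilt(21))

# For closures, the captured variables travel with the function object.
def square(inputfunc):
    def f(x):
        return inputfunc(x) * inputfunc(x)
    return f

squared_abs = square(abs)
print(squared_abs.__closure__[0].cell_contents is abs, squared_abs(-3))
```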


Page 15: Python opcodes

Now Classes!

    >>> def make_point(dimension, names):
    ...     class Point:
    ...         def __init__(self, *data):
    ...             pass
    ...         dimension = dimensions  # 'dimensions' is looked up as a global, not the 'dimension' parameter
    ...     return Point
    ...
    >>> dis.dis(make_point)
      2           0 LOAD_CONST               1 ('Point')
                  3 LOAD_CONST               3 (())
                  6 LOAD_CONST               2 (<code object Point at 0x10c545c30, file "<stdin>", line 2>)
                  9 MAKE_FUNCTION            0
                 12 CALL_FUNCTION            0
                 15 BUILD_CLASS
                 16 STORE_FAST               2 (Point)

      6          19 LOAD_FAST                2 (Point)
                 22 RETURN_VALUE

BUILD_CLASS()

Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of the names of the base classes, and TOS2 the class name.
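The three ingredients BUILD_CLASS takes (a name, a tuple of bases, and a dictionary of methods) are the same three arguments accepted by the built-in type(). The sketch below assumes Python 3, where BUILD_CLASS itself no longer exists and class statements go through LOAD_BUILD_CLASS instead; the init function and the value 3 are made up for illustration.

```python
def init(self, *data):
    self.data = data

name = "Point"                                    # what TOS2 would hold
bases = ()                                        # what TOS1 would hold
namespace = {"__init__": init, "dimension": 3}    # what TOS would hold

# type(name, bases, namespace) assembles a class from the three pieces.
Point = type(name, bases, namespace)

p = Point(1, 2, 3)
print(Point.__name__, Point.dimension, p.data)
```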


Page 16: Python opcodes

Other Things

• Exceptions

• Loops

• Technically flow control, but they’re a little more involved


Page 17: Python opcodes

Now, We Have Some Fun


Page 18: Python opcodes

What to Do With Our Newly Acquired Knowledge of Dark Magic?


Page 19: Python opcodes

Write your own Python interpreter!


Page 20: Python opcodes

Static Code Analysis!


Page 21: Python opcodes

Understand How PyPy Does It!


Page 22: Python opcodes

Buy Me Alcohol!

Or at least provide me with pleasant conversation


Page 23: Python opcodes

Slideshare-only Bonus Slide: Exception Handling!


Page 24: Python opcodes

• The exception context is pushed by SETUP_EXCEPT

• If an exception is thrown, control jumps to the address stored in the top exception context, in this case offset 15

• If there is no top exception context, the interpreter halts and notifies you of the error

• The opcodes at offsets 15 through 31 (highlighted in yellow on the original slide) check whether the exception thrown matches the type named in the except statement and, if so, execute the except block

• At END_FINALLY, the interpreter is responsible for popping the exception context, and either re-raising the exception, in which case the next-topmost exception context will trigger, or returning from the function

• Notice that two groups of opcodes (highlighted in red on the original slide) will never be executed

  • The first (offsets 11 and 12) sits between a return and a jump target

  • The second (offsets 33 and 36) is only reachable by jumping from dead code

• CPython’s philosophy of architectural and implementation simplicity tolerates such minor inefficiencies

    >>> def list_get(lst, pos):
    ...     try:
    ...         return lst[pos]
    ...     except IndexError:
    ...         return None
    ...     # there is an invisible "return None" here
    ...
    >>> dis.dis(list_get)
      2           0 SETUP_EXCEPT            12 (to 15)

      3           3 LOAD_FAST                0 (lst)
                  6 LOAD_FAST                1 (pos)
                  9 BINARY_SUBSCR
                 10 RETURN_VALUE
                 11 POP_BLOCK
                 12 JUMP_FORWARD            18 (to 33)

      4     >>   15 DUP_TOP
                 16 LOAD_GLOBAL              0 (IndexError)
                 19 COMPARE_OP              10 (exception match)
                 22 POP_JUMP_IF_FALSE       32
                 25 POP_TOP
                 26 POP_TOP
                 27 POP_TOP

      5          28 LOAD_CONST               0 (None)
                 31 RETURN_VALUE
            >>   32 END_FINALLY
            >>   33 LOAD_CONST               0 (None)
                 36 RETURN_VALUE
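The matching and re-raising behavior described above can be observed from plain Python. In this sketch (Python 3 assumed), the outer function and the events list are hypothetical additions: an exception that the inner handler does not match is re-raised and picked up by the next handler out.

```python
events = []

def list_get(lst, pos):
    try:
        return lst[pos]
    except IndexError:
        events.append("inner handler matched IndexError")
        return None

def outer(lst, pos):
    try:
        return list_get(lst, pos)
    except TypeError:
        # Reached only because list_get's handler did not match.
        events.append("outer handler matched TypeError")
        return "bad index type"

found = list_get([10, 20, 30], 1)        # no exception at all
missing = list_get([10, 20, 30], 7)      # IndexError: inner handler runs
propagated = outer([10, 20, 30], "a")    # TypeError: inner handler is skipped

print(found, missing, propagated)
print(events)
```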


Page 25: Python opcodes

Thanks!
