Python opcodes

DESCRIPTION

The Python interpreter converts programs to bytecode before beginning execution. Execution itself consists of looping over these bytecodes and performing a specific operation for each one. This talk gives a very brief overview of the main classes of bytecodes. This presentation was given as a lightning talk at the Boston Python Meetup group on July 24th, 2012.

The Python Interpreter is Fun and Not At All Terrifying: Opcodes

name: Alex Golec

twitter: @alexandergolec (not @alexgolec :( )

email: akg2136 (rhymes with cat) columbia dot (short for education)

this talk lives at: blog.alexgolec.com

Python is Bytecode-Interpreted

• Your Python program is compiled down to bytecode

• Sort of like assembly for the Python virtual machine

• The interpreter executes each of these bytecodes one by one
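
As a rough illustration of the compile-then-execute split described above, the built-in compile() turns source text into a code object whose co_code attribute holds the raw bytecode. The source string and the variable names here are made up for this sketch, and the exact bytes depend on the interpreter version.

```python
# Compile a tiny program into a code object (nothing runs yet).
source = "result = 6 * 7"
code = compile(source, "<example>", "exec")

# co_code is the raw bytecode the interpreter loops over.
raw = code.co_code
print(type(raw).__name__, len(raw))

# exec() hands the code object to the interpreter for execution.
namespace = {}
exec(code, namespace)
print(namespace["result"])  # 42
```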

Before we Begin

• This presentation was written using CPython 2.7.2, which ships with the Mac OS X Mountain Lion GM image

• The more adventurous among you will find that minor details will differ on PyPy / IronPython / Jython

The Interpreter is Responsible For:

• Issuing commands to objects and maintaining stack state

• Flow Control

• Managing namespaces

• Turning code objects into functions and classes

Issuing Commands to Objects and Maintaining Stack State

The dis Module

>>> import dis
>>> def parabola(x):
...     return x*x + 4*x + 4
...
>>> dis.dis(parabola)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (4)
             10 LOAD_FAST                0 (x)
             13 BINARY_MULTIPLY
             14 BINARY_ADD
             15 LOAD_CONST               1 (4)
             18 BINARY_ADD
             19 RETURN_VALUE

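• Each instruction that takes an argument is exactly three bytes; instructions without one (such as BINARY_MULTIPLY) take a single byte, as the offsets show

• Opcodes have friendly (ish) mnemonics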
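
For readers on Python 3.4 or later, dis.get_instructions() exposes the same information as structured objects rather than printed text. This is a sketch only: the talk used CPython 2.7, where this function does not exist, and the mnemonics and offsets printed by newer interpreters will differ from the listing above.

```python
import dis

def parabola(x):
    return x*x + 4*x + 4

# Each Instruction carries its byte offset, mnemonic, and decoded argument.
instructions = list(dis.get_instructions(parabola))
for instr in instructions:
    print(instr.offset, instr.opname, instr.argrepr)

# A mnemonic is just a readable label for a small integer opcode.
first = instructions[0]
print(first.opcode, dis.opname[first.opcode])
```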

Example: Arithmetic Operations

• We don’t know the type of x!

• How does BINARY_MULTIPLY know how to perform multiplication?

• What if I pass a string?

• Note the lack of registers; the Python virtual machine is stack-based

>>> def parabola(x):
...     return x*x + 4*x + 4
...
>>> dis.dis(parabola)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (4)
             10 LOAD_FAST                0 (x)
             13 BINARY_MULTIPLY
             14 BINARY_ADD
             15 LOAD_CONST               1 (4)
             18 BINARY_ADD
             19 RETURN_VALUE

Things the Interpreter Doesn’t Do: Typed Method Dispatch

• The Python interpreter does not know anything about how to add two numbers (or objects, for that matter)

• Instead, it simply maintains a stack of objects, and when it comes time to perform an operation, asks them to perform the operation

• The result gets pushed onto the stack
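
The idea can be sketched with a toy stack machine. This is not CPython's real evaluation loop: run() and PARABOLA are names invented for this example, and only the opcode mnemonics are borrowed from the disassembly of parabola. The point is that the loop never inspects types; operator.mul and operator.add simply ask the operands to do the work.

```python
import operator

# Binary opcodes map to functions that delegate to the operands themselves
# (operator.mul(a, b) ends up calling a.__mul__ or b.__rmul__).
BINARY_OPS = {
    "BINARY_MULTIPLY": operator.mul,
    "BINARY_ADD": operator.add,
}

def run(program, local_vars):
    """Execute a list of (opname, arg) pairs on a value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opname == "LOAD_CONST":
            stack.append(arg)
        elif opname in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[opname](left, right))
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)

# Hand-written equivalent of: return x*x + 4*x + 4
PARABOLA = [
    ("LOAD_FAST", "x"), ("LOAD_FAST", "x"), ("BINARY_MULTIPLY", None),
    ("LOAD_CONST", 4), ("LOAD_FAST", "x"), ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
    ("LOAD_CONST", 4), ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

print(run(PARABOLA, {"x": 3}))    # 25
print(run(PARABOLA, {"x": 1.5}))  # 12.25
```

Passing a string for x fails with a TypeError raised by the string object itself when asked to multiply by another string, not by any type check in the loop.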

Flow Control

• Jumps can be relative or absolute

• Relevant opcodes:

• JUMP_FORWARD

• POP_JUMP_IF_[TRUE/FALSE]

• JUMP_IF_[TRUE/FALSE]_OR_POP

• JUMP_ABSOLUTE

• SETUP_LOOP

• [BREAK/CONTINUE]_LOOP

>>> def abs(x):
...     if x < 0:
...         x = -x
...     return x
...
>>> dis.dis(abs)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (0)
              6 COMPARE_OP               0 (<)
              9 POP_JUMP_IF_FALSE       22

  3          12 LOAD_FAST                0 (x)
             15 UNARY_NEGATIVE
             16 STORE_FAST               0 (x)
             19 JUMP_FORWARD             0 (to 22)

  4     >>   22 LOAD_FAST                0 (x)
             25 RETURN_VALUE
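
One way to pick the jump instructions out of a function programmatically, sketched for Python 3.4 or later: the dis module publishes the lists hasjrel and hasjabs of opcodes whose argument is a jump target. The function is named my_abs here only to avoid shadowing the built-in; the jump mnemonics and target offsets printed will vary by interpreter version.

```python
import dis

def my_abs(x):
    if x < 0:
        x = -x
    return x

# Opcodes whose argument encodes a relative or absolute jump target.
jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)

# For jump instructions, argval is the resolved target offset.
jumps = [
    (instr.offset, instr.opname, instr.argval)
    for instr in dis.get_instructions(my_abs)
    if instr.opcode in jump_opcodes
]
for offset, opname, target in jumps:
    print("%3d %-24s -> %s" % (offset, opname, target))
```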

Managing Namespaces

Simple Namespaces

• Variables, functions, etc. are all treated identically

>>> def example():
...     variable = 1
...     def function():
...         print 'function'
...     del variable
...     del function
...
>>> dis.dis(example)
  2           0 LOAD_CONST               1 (1)
              3 STORE_FAST               0 (variable)

  3           6 LOAD_CONST               2 (<code object function at 0x10c545930, file "<stdin>", line 3>)
              9 MAKE_FUNCTION            0
             12 STORE_FAST               1 (function)

  5          15 DELETE_FAST              0 (variable)

  6          18 DELETE_FAST              1 (function)
             21 LOAD_CONST               0 (None)
             24 RETURN_VALUE

• Once the name is assigned to the object, the interpreter completely forgets everything about it except the name
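
A small sketch of the same point in Python 3 syntax: the local variable and the nested function sit side by side in one table of local names, and a function name can be rebound or deleted like any other. The names greet and alias are invented for this example.

```python
def example():
    variable = 1
    def function():
        print('function')
    del variable
    del function

# Both names are ordinary entries in the function's local-variable table.
print(example.__code__.co_varnames)  # ('variable', 'function')

def greet():
    return "hello"

alias = greet    # a second name bound to the same function object
del greet        # removes the name, not the object
print(alias())   # hello
```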

Turning Code Objects into Functions and Classes

Functions First!

>>> def square(inputfunc):
...     def f(x):
...         return inputfunc(x) * inputfunc(x)
...     return f
...
>>> dis.dis(square)
  2           0 LOAD_CLOSURE             0 (inputfunc)
              3 BUILD_TUPLE              1
              6 LOAD_CONST               1 (<code object f at 0x10c545a30, file "<stdin>", line 2>)
              9 MAKE_CLOSURE             0
             12 STORE_FAST               1 (f)

  4          15 LOAD_FAST                1 (f)
             18 RETURN_VALUE

• The compiler generates code objects and sticks them in memory
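
Those code objects can be inspected directly. The sketch below, written for Python 3, finds the compiled body of f among the constants of square and checks that the function returned at run time wraps that very object. The name plus_one_squared and the lambda are illustrative only.

```python
import types

def square(inputfunc):
    def f(x):
        return inputfunc(x) * inputfunc(x)
    return f

# The compiled body of f is stored as a constant of the enclosing function.
inner = [c for c in square.__code__.co_consts if isinstance(c, types.CodeType)]
print(inner[0].co_name)      # f
print(inner[0].co_freevars)  # ('inputfunc',)

# Calling square() wraps that code object in a function with a closure cell.
plus_one_squared = square(lambda v: v + 1)
print(plus_one_squared(2))                    # 9
print(plus_one_squared.__code__ is inner[0])  # True
```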

Now Classes!

>>> def make_point(dimension, names):
...     class Point:
...         def __init__(self, *data):
...             pass
...         dimension = dimensions
...     return Point
...
>>> dis.dis(make_point)
  2           0 LOAD_CONST               1 ('Point')
              3 LOAD_CONST               3 (())
              6 LOAD_CONST               2 (<code object Point at 0x10c545c30, file "<stdin>", line 2>)
              9 MAKE_FUNCTION            0
             12 CALL_FUNCTION            0
             15 BUILD_CLASS
             16 STORE_FAST               2 (Point)

  6          19 LOAD_FAST                2 (Point)
             22 RETURN_VALUE

BUILD_CLASS()

Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of the names of the base classes, and TOS2 the class name.
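
The three stack items that BUILD_CLASS consumes line up with the three arguments of the built-in type(). A minimal sketch, assuming the default metaclass; the helper init and the attribute values are made up for illustration. In Python 3 the BUILD_CLASS opcode no longer exists (class creation goes through LOAD_BUILD_CLASS), but the three-argument call to type() still works this way.

```python
def init(self, *data):
    self.data = data

# type(name, bases, namespace): class name, tuple of bases, methods dictionary.
Point = type("Point", (object,), {"__init__": init, "dimension": 3})

p = Point(1, 2, 3)
print(Point.__name__, Point.dimension, p.data)  # Point 3 (1, 2, 3)
```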

Other Things

• Exceptions

• Loops

• Technically flow control, but they’re a little more involved

Now, We Have Some Fun

What to Do With Our Newly Acquired Knowledge of Dark Magic?

Write your own Python interpreter!

Static Code Analysis!

Understand How PyPy Does It!

Buy Me Alcohol!

Or at least provide me with pleasant conversation

Slideshare-only Bonus Slide: Exception Handling!

• The exception context is pushed by SETUP_EXCEPT

• If an exception is thrown, control jumps to the address of the top exception context, in this case opcode 15

• If there is no top exception context, the interpreter halts and notifies you of the error

• The opcodes highlighted in yellow on the original slide (the handler beginning at offset 15) check whether the raised exception matches the type named in the except statement and, if it does, execute the except block

• At END_FINALLY, the interpreter is responsible for popping the exception context, and either re-raising the exception, in which case the next-topmost exception context will trigger, or returning from the function

• Notice that the opcodes highlighted in red on the original slide will never be executed

• The first (offsets 11 and 12): between a return and a jump target

• The second (offsets 33 and 36): only reachable by jumping from dead code.

• CPython’s philosophy of architectural and implementation simplicity tolerates such minor inefficiencies

>>> def list_get(lst, pos):
...     try:
...         return lst[pos]
...     except IndexError:
...         return None
...     # there is an invisible "return None" here
...
>>> dis.dis(list_get)
  2           0 SETUP_EXCEPT            12 (to 15)

  3           3 LOAD_FAST                0 (lst)
              6 LOAD_FAST                1 (pos)
              9 BINARY_SUBSCR
             10 RETURN_VALUE
             11 POP_BLOCK
             12 JUMP_FORWARD            18 (to 33)

  4     >>   15 DUP_TOP
             16 LOAD_GLOBAL              0 (IndexError)
             19 COMPARE_OP              10 (exception match)
             22 POP_JUMP_IF_FALSE       32
             25 POP_TOP
             26 POP_TOP
             27 POP_TOP

  5          28 LOAD_CONST               0 (None)
             31 RETURN_VALUE
        >>   32 END_FINALLY
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE
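
The three paths through list_get can be exercised from ordinary code: the normal return, the matching except block, and a non-matching exception that is re-raised to the caller. This is only a behavioral sketch; the dictionary argument is a hypothetical way to trigger a KeyError, which IndexError does not match.

```python
def list_get(lst, pos):
    try:
        return lst[pos]
    except IndexError:
        return None

print(list_get([10, 20, 30], 1))  # 20   (normal return inside the try block)
print(list_get([], 0))            # None (IndexError matched, except block ran)

# A KeyError does not match IndexError, so it propagates to the caller.
try:
    list_get({}, "missing")
except KeyError as exc:
    print("propagated:", repr(exc))
```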

Thanks!
