Building Interpreters with PyPy

Page 1: Building Interpreters with PyPy

Building Interpreters with PyPy

Page 2: Building Interpreters with PyPy

About me

• Computer science bachelor student at TU Berlin

• Programming/Python since ~2008

• Primarily involved with Pocoo projects (Sphinx, Werkzeug, Flask, Babel, …)

Page 3: Building Interpreters with PyPy

PyPy Python Interpreter

• Fast Python implementation

• Just-in-Time compilation

• Proper garbage collection (no reference counting)

• Written in Python

Page 4: Building Interpreters with PyPy

PyPy Translation Toolchain

• Capable of compiling (R)Python

• Garbage collection

• Tracing just-in-time compiler generator

• Software transactional memory?

Page 5: Building Interpreters with PyPy

PyPy-based interpreters

• Topaz (Ruby)

• HippyVM (PHP)

• Pyrolog (Prolog)

• pycket (Racket)

• Various other interpreters (Scheme, JavaScript, Io, Game Boy)

Page 6: Building Interpreters with PyPy

RPython

• Python subset

• Statically typed

• Garbage collected

• Standard library almost entirely unavailable

• Some missing builtins (print, open(), …)

• rpython.rlib

• Exceptions are (sometimes) ignored

• Not really a language, rather a "state"

Page 7: Building Interpreters with PyPy

Hello RPython

# hello_rpython.py
import os

def entry_point(argv):
    os.write(2, "Hello, World!\n")
    return 0

def target(driver, argv):
    return entry_point, None

Page 8: Building Interpreters with PyPy

$ rpython hello_rpython.py
…
$ ./hello_rpython-c
Hello, World!

Page 9: Building Interpreters with PyPy

Goal

• BASIC interpreter capable of running Hamurabi

• Bytecode based

• Garbage Collection

• Just-In-Time Compilation

Page 10: Building Interpreters with PyPy
Page 11: Building Interpreters with PyPy

Live play session

Page 12: Building Interpreters with PyPy

Architecture

Source → Parser → AST → Compiler → Bytecode → Virtual Machine

Page 13: Building Interpreters with PyPy

10 PRINT TAB(32);"HAMURABI"
20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT
80 PRINT "TRY YOUR HAND AT GOVERNING ANCIENT SUMERIA"
90 PRINT "FOR A TEN-YEAR TERM OF OFFICE.":PRINT
95 D1=0: P1=0
100 Z=0: P=95:S=2800: H=3000: E=H-S
110 Y=3: A=H/Y: I=5: Q=1
210 D=0
215 PRINT:PRINT:PRINT "HAMURABI: I BEG TO REPORT TO YOU,": Z=Z+1
217 PRINT "IN YEAR";Z;",";D;"PEOPLE STARVED,";I;"CAME TO THE CITY,"
218 P=P+I
227 IF Q>0 THEN 230
228 P=INT(P/2)
229 PRINT "A HORRIBLE PLAGUE STRUCK! HALF THE PEOPLE DIED."
230 PRINT "POPULATION IS NOW";P
232 PRINT "THE CITY NOW OWNS ";A;"ACRES."
235 PRINT "YOU HARVESTED";Y;"BUSHELS PER ACRE."
250 PRINT "THE RATS ATE";E;"BUSHELS."
260 PRINT "YOU NOW HAVE ";S;"BUSHELS IN STORE.": PRINT
270 REM *** MORE CODE THAT DID NOT FIT INTO THE SLIDE FOLLOWS

Page 14: Building Interpreters with PyPy

Parser

Source → Parser → Abstract Syntax Tree (AST)

Page 15: Building Interpreters with PyPy

Parser

Source → Parser → AST

Source → Lexer → Tokens → Parser → AST

Page 16: Building Interpreters with PyPy

RPLY

• Based on PLY, which is based on Lex and Yacc

• Lexer generator

• LALR parser generator

Page 17: Building Interpreters with PyPy

Lexer

from rply import LexerGenerator

lg = LexerGenerator()

lg.add("NUMBER", "[0-9]+")
# …
lg.ignore(" +")  # whitespace

lexer = lg.build().lex

Page 18: Building Interpreters with PyPy

lg.add('NUMBER', r'[0-9]*\.[0-9]+')
lg.add('PRINT', r'PRINT')
lg.add('IF', r'IF')
lg.add('THEN', r'THEN')
lg.add('GOSUB', r'GOSUB')
lg.add('GOTO', r'GOTO')
lg.add('INPUT', r'INPUT')
lg.add('REM', r'REM')
lg.add('RETURN', r'RETURN')
lg.add('END', r'END')
lg.add('FOR', r'FOR')
lg.add('TO', r'TO')
lg.add('NEXT', r'NEXT')
lg.add('NAME', r'[A-Z][A-Z0-9$]*')
lg.add('(', r'\(')
lg.add(')', r'\)')
lg.add(';', r';')
lg.add('STRING', r'"[^"]*"')
lg.add(':', r'\r?\n')
lg.add(':', r':')
lg.add('=', r'=')
lg.add('<>', r'<>')
lg.add('-', r'-')
lg.add('/', r'/')
lg.add('+', r'\+')
lg.add('>=', r'>=')
lg.add('>', r'>')
lg.add('***', r'\*\*\*.*')
lg.add('*', r'\*')
lg.add('<=', r'<=')
lg.add('<', r'<')

Page 19: Building Interpreters with PyPy

>>> from basic.lexer import lex
>>> source = open("hello.bas").read()
>>> for token in lex(source):
...     print token
...
Token("NUMBER", "10")
Token("PRINT", "PRINT")
Token("STRING", '"HELLO BASIC!"')
Token(":", "\n")
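To make the token stream above less magical, here is a minimal sketch of a regex-driven lexer written with only the standard library. It is a stand-in for illustration, not how RPLY is implemented: the rule table is a shortened, hypothetical copy of the slide's rules, tokens are plain tuples instead of Token objects, and the sketch simply takes the first rule that matches at the current position.

```python
import re

# Hypothetical, shortened rule table; the first matching rule wins.
RULES = [
    ("NUMBER", r"[0-9]+"),
    ("PRINT", r"PRINT"),
    ("NAME", r"[A-Z][A-Z0-9$]*"),
    ("STRING", r'"[^"]*"'),
    (":", r"\r?\n"),
    (":", r":"),
]
IGNORE = re.compile(r" +")  # whitespace between tokens is skipped


def lex(source):
    """Yield (token_type, text) pairs; raise ValueError on unknown input."""
    compiled = [(name, re.compile(pattern)) for name, pattern in RULES]
    pos = 0
    while pos < len(source):
        skipped = IGNORE.match(source, pos)
        if skipped:
            pos = skipped.end()
            continue
        for name, regex in compiled:
            match = regex.match(source, pos)
            if match:
                yield name, match.group()
                pos = match.end()
                break
        else:
            raise ValueError("unexpected character at position %d" % pos)


tokens = list(lex('10 PRINT "HELLO BASIC!"\n'))
```

Note that the line break is emitted as a ":" token, just like in the slide's rule list, so the grammar can treat newlines and colons as the same statement terminator.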

Page 20: Building Interpreters with PyPy

Grammar

• A set of formal rules that defines the syntax

• terminals = tokens

• nonterminals = rules defining a sequence of one or more (non)terminals

Page 21: Building Interpreters with PyPy


Page 22: Building Interpreters with PyPy

program :
program : line
program : line program

Page 23: Building Interpreters with PyPy

line : NUMBER statements

Page 24: Building Interpreters with PyPy

statements : statement
statements : statement statements

Page 25: Building Interpreters with PyPy

statement : PRINT :
statement : PRINT expressions :
expressions : expression
expressions : expression ;
expressions : expression ; expressions

Page 26: Building Interpreters with PyPy

statement : NAME = expression :

Page 27: Building Interpreters with PyPy

statement : IF expression THEN NUMBER :

Page 28: Building Interpreters with PyPy

statement : INPUT NAME :

Page 29: Building Interpreters with PyPy

statement : GOTO NUMBER :
statement : GOSUB NUMBER :
statement : RETURN :

Page 30: Building Interpreters with PyPy

statement : REM *** :

Page 31: Building Interpreters with PyPy

statement : FOR NAME = NUMBER TO NUMBER :
statement : NEXT NAME :

Page 32: Building Interpreters with PyPy

statement : END :

Page 33: Building Interpreters with PyPy

expression : NUMBER
expression : NAME
expression : STRING
expression : operation
expression : ( expression )
expression : NAME ( expression )

Page 34: Building Interpreters with PyPy

operation : expression + expression
operation : expression - expression
operation : expression * expression
operation : expression / expression
operation : expression <= expression
operation : expression < expression
operation : expression = expression
operation : expression <> expression
operation : expression > expression
operation : expression >= expression

Page 35: Building Interpreters with PyPy

AST

from rply.token import BaseBox

class Program(BaseBox):
    def __init__(self, lines):
        self.lines = lines

Page 36: Building Interpreters with PyPy

class Line(BaseBox):
    def __init__(self, lineno, statements):
        self.lineno = lineno
        self.statements = statements

Page 37: Building Interpreters with PyPy

class Statements(BaseBox):
    def __init__(self, statements):
        self.statements = statements

Page 38: Building Interpreters with PyPy

class Print(BaseBox):
    def __init__(self, expressions, newline=True):
        self.expressions = expressions
        self.newline = newline

Page 39: Building Interpreters with PyPy

Page 40: Building Interpreters with PyPy

Parser

from rply import ParserGenerator

pg = ParserGenerator(["NUMBER", "PRINT", …])

Page 41: Building Interpreters with PyPy

@pg.production("program : ")
@pg.production("program : line")
@pg.production("program : line program")
def program(p):
    if len(p) == 2:
        return Program([p[0]] + p[1].get_lines())
    return Program(p)

Page 42: Building Interpreters with PyPy

@pg.production("line : number statements")
def line(p):
    return Line(p[0], p[1].get_statements())

Page 43: Building Interpreters with PyPy

@pg.production("op : expression + expression")
@pg.production("op : expression * expression")
def op(p):
    if p[1].gettokentype() == "+":
        return Add(p[0], p[2])
    elif p[1].gettokentype() == "*":
        return Mul(p[0], p[2])

Page 44: Building Interpreters with PyPy

pg = ParserGenerator([…], precedence=[
    ("left", ["+", "-"]),
    ("left", ["*", "/"])
])
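The effect of such a precedence table can be illustrated without a parser generator. The sketch below uses precedence climbing, a different technique from the LALR tables RPLY generates, and only shows what "left-associative, with * and / binding tighter than + and -" means for the resulting tree. The PRECEDENCE dictionary, the tuple-based tree and the function name are assumptions made for this example.

```python
# Binding strength per operator, mirroring the precedence table above:
# entries listed later bind tighter. All operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_expression(tokens, pos=0, min_prec=1):
    """Parse tokens starting at pos and return (tree, next_pos).

    Operands are plain numbers; trees are nested (op, left, right) tuples.
    """
    left = tokens[pos]
    pos += 1
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        # Left associativity: the right operand may only contain
        # operators that bind strictly tighter than `op`.
        right, pos = parse_expression(tokens, pos + 1, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, pos


tree, _ = parse_expression([1, "+", 2, "*", 3])
```

Here `tree` groups the multiplication first, and an input such as `[8, "-", 3, "-", 2]` nests to the left, so it evaluates as (8 - 3) - 2.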

Page 45: Building Interpreters with PyPy

parse = pg.build().parse

Page 46: Building Interpreters with PyPy

Compiler/Virtual Machine

AST → Compiler → Bytecode → Virtual Machine

Page 47: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program

Page 48: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0

Page 49: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.frames = []

Page 50: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.frames = []
        self.iterators = {}

Page 51: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.frames = []
        self.iterators = {}
        self.stack = []

Page 52: Building Interpreters with PyPy

class VM(object):
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.frames = []
        self.iterators = {}
        self.stack = []
        self.variables = {}

Page 53: Building Interpreters with PyPy

class VM(object):
    ...
    def execute(self):
        while self.pc < len(self.program.instructions):
            self.execute_bytecode(self.program.instructions[self.pc])

Page 54: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_bytecode(self, code):
        raise NotImplementedError(code)

Page 55: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_bytecode(self, code):
        if isinstance(code, TYPE):
            self.execute_TYPE(code)
        ...
        else:
            raise NotImplementedError(code)
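The fetch-and-dispatch loop from the last few slides can be tried out in isolation. The following is a minimal sketch with three made-up instruction classes and a TinyVM that is not part of the talk's code. It keeps the same shape: a program counter, a value stack, and an isinstance chain that picks the handler.

```python
class Instruction(object):
    pass


class Number(Instruction):
    def __init__(self, value):
        self.value = value


class Add(Instruction):
    pass


class Mul(Instruction):
    pass


class TinyVM(object):
    """Hypothetical, stripped-down VM used only for this illustration."""

    def __init__(self, instructions):
        self.instructions = instructions
        self.pc = 0
        self.stack = []

    def execute(self):
        # Run until the program counter walks off the end of the program.
        while self.pc < len(self.instructions):
            self.execute_bytecode(self.instructions[self.pc])
        return self.stack

    def execute_bytecode(self, code):
        if isinstance(code, Number):
            self.stack.append(code.value)
        elif isinstance(code, Add):
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(left + right)
        elif isinstance(code, Mul):
            right = self.stack.pop()
            left = self.stack.pop()
            self.stack.append(left * right)
        else:
            raise NotImplementedError(code)
        self.pc += 1


# 1 + 2 * 3 in postfix order: operands first, then the operators.
result = TinyVM([Number(1.0), Number(2.0), Number(3.0), Mul(), Add()]).execute()
```

Every handler is responsible for advancing the program counter. That is what later lets jump instructions set it to an arbitrary target instead.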

Page 56: Building Interpreters with PyPy

Bytecode

class Program(object):
    def __init__(self):
        self.instructions = []

Page 57: Building Interpreters with PyPy

class Instruction(object):
    pass

Page 58: Building Interpreters with PyPy

class Number(Instruction):
    def __init__(self, value):
        self.value = value

class String(Instruction):
    def __init__(self, value):
        self.value = value

Page 59: Building Interpreters with PyPy

class Print(Instruction):
    def __init__(self, expressions, newline):
        self.expressions = expressions
        self.newline = newline

Page 60: Building Interpreters with PyPy

class Call(Instruction):
    def __init__(self, function_name):
        self.function_name = function_name

Page 61: Building Interpreters with PyPy

class Let(Instruction):
    def __init__(self, name):
        self.name = name

Page 62: Building Interpreters with PyPy

class Lookup(Instruction):
    def __init__(self, name):
        self.name = name

Page 63: Building Interpreters with PyPy

class Add(Instruction):
    pass

class Sub(Instruction):
    pass

class Mul(Instruction):
    pass

class Equal(Instruction):
    pass

...

Page 64: Building Interpreters with PyPy

class GotoIfTrue(Instruction):
    def __init__(self, target):
        self.target = target

class Goto(Instruction):
    def __init__(self, target, with_frame=False):
        self.target = target
        self.with_frame = with_frame

class Return(Instruction):
    pass

Page 65: Building Interpreters with PyPy

class Input(Instruction):
    def __init__(self, name):
        self.name = name

Page 66: Building Interpreters with PyPy

class For(Instruction):
    def __init__(self, variable):
        self.variable = variable

class Next(Instruction):
    def __init__(self, variable):
        self.variable = variable

Page 67: Building Interpreters with PyPy

class Program(object):
    def __init__(self):
        self.instructions = []
        self.lineno2instruction = {}

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None:
            for i, instruction in enumerate(self.instructions):
                instruction.finalize(self, i)

Page 68: Building Interpreters with PyPy

def finalize(self, program, index):
    self.target = program.lineno2instruction[self.target]
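The reason for a separate finalize step is that a GOTO may point at a line that has not been compiled yet, so its instruction index is unknown when the jump is emitted. The sketch below shows this two-pass idea on its own. The assemble helper, the Noop class and the simplified finalize signature are assumptions made for the example.

```python
class Instruction(object):
    def finalize(self, lineno2instruction):
        pass  # most instructions carry no jump target


class Noop(Instruction):
    pass


class Goto(Instruction):
    def __init__(self, target):
        self.target = target  # a BASIC line number until finalize() runs

    def finalize(self, lineno2instruction):
        # Replace the line number by the index of that line's first instruction.
        self.target = lineno2instruction[self.target]


def assemble(lines):
    """lines is a list of (lineno, [instructions]) pairs in source order."""
    instructions = []
    lineno2instruction = {}
    # Pass 1: lay out the instructions and remember where each line starts.
    for lineno, line_instructions in lines:
        lineno2instruction[lineno] = len(instructions)
        instructions.extend(line_instructions)
    # Pass 2: every line is known now, so forward jumps can be resolved.
    for instruction in instructions:
        instruction.finalize(lineno2instruction)
    return instructions


jump = Goto(40)
program = assemble([
    (10, [Noop(), Noop()]),
    (20, [jump]),          # forward jump to a line that comes later
    (30, [Noop()]),
    (40, [Noop()]),
])
```

After assembly, `jump.target` holds an instruction index rather than a line number, so the VM can assign it directly to its program counter.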

Page 69: Building Interpreters with PyPy

class Program(BaseBox):
    ...
    def compile(self):
        with bytecode.Program() as program:
            for line in self.lines:
                line.compile(program)
        return program

Page 70: Building Interpreters with PyPy

class Line(BaseBox):
    ...
    def compile(self, program):
        program.lineno2instruction[self.lineno] = len(program.instructions)
        for statement in self.statements:
            statement.compile(program)

Page 71: Building Interpreters with PyPy


Page 72: Building Interpreters with PyPy

class Print(Statement):
    def compile(self, program):
        for expression in self.expressions:
            expression.compile(program)
        program.instructions.append(
            bytecode.Print(
                len(self.expressions),
                self.newline
            )
        )

Page 73: Building Interpreters with PyPy

class Print(Statement):
    ...
    def compile(self, program):
        for expression in self.expressions:
            expression.compile(program)
        program.instructions.append(
            bytecode.Print(
                len(self.expressions),
                self.newline
            )
        )
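Because the expressions are compiled first and the Print instruction only carries their count, the VM has to take exactly that many values off the stack, in the order they were pushed. The slides do not show the matching VM handler, so the function below is a guess at its core logic. Its name, its signature and the use of str() for formatting are all assumptions.

```python
def execute_print(stack, count, newline):
    """Pop `count` values from `stack` and return the text PRINT would emit."""
    if count:
        values = stack[-count:]   # oldest of the `count` values comes first
        del stack[-count:]
    else:
        values = []               # guard: stack[-0:] would be the whole stack
    text = "".join(str(value) for value in values)
    return text + ("\n" if newline else "")


stack = ["UNRELATED", "POPULATION IS NOW ", 95.0]
text = execute_print(stack, 2, True)
```

Values below the printed ones stay on the stack untouched, which is why the count has to be stored in the instruction at compile time.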

Page 74: Building Interpreters with PyPy

class Let(Statement):
    ...
    def compile(self, program):
        self.value.compile(program)
        program.instructions.append(
            bytecode.Let(self.name)
        )

Page 75: Building Interpreters with PyPy

class Input(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Input(self.variable)
        )

Page 76: Building Interpreters with PyPy

class Goto(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Goto(self.target)
        )

class Gosub(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Goto(
                self.target,
                with_frame=True
            )
        )

class Return(Statement):
    ...
    def compile(self, program):
        program.instructions.append(
            bytecode.Return()
        )

Page 77: Building Interpreters with PyPy

class For(Statement):
    ...
    def compile(self, program):
        self.start.compile(program)
        program.instructions.append(
            bytecode.Let(self.variable)
        )
        self.end.compile(program)
        program.instructions.append(
            bytecode.For(self.variable)
        )

Page 78: Building Interpreters with PyPy

class WrappedObject(object):
    pass

class WrappedString(WrappedObject):
    def __init__(self, value):
        self.value = value

class WrappedFloat(WrappedObject):
    def __init__(self, value):
        self.value = value

Page 79: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_number(self, code):
        self.stack.append(WrappedFloat(code.value))
        self.pc += 1

    def execute_string(self, code):
        self.stack.append(WrappedString(code.value))
        self.pc += 1

Page 80: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_call(self, code):
        argument = self.stack.pop()
        if code.function_name == "TAB":
            # Unwrap the WrappedFloat before converting it to an int.
            self.stack.append(WrappedString(" " * int(argument.value)))
        elif code.function_name == "RND":
            self.stack.append(WrappedFloat(random.random()))
        ...
        self.pc += 1

Page 81: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_let(self, code):
        value = self.stack.pop()
        self.variables[code.name] = value
        self.pc += 1

    def execute_lookup(self, code):
        value = self.variables[code.name]
        self.stack.append(value)
        self.pc += 1

Page 82: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_add(self, code):
        right = self.stack.pop()
        left = self.stack.pop()
        # Add the unwrapped floats, then wrap the result again.
        self.stack.append(WrappedFloat(left.value + right.value))
        self.pc += 1

Page 83: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_goto_if_true(self, code):
        condition = self.stack.pop()
        if condition:
            self.pc = code.target
        else:
            self.pc += 1

Page 84: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_goto(self, code):
        if code.with_frame:
            self.frames.append(self.pc + 1)
        self.pc = code.target

Page 85: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_return(self, code):
        self.pc = self.frames.pop()
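GOSUB and RETURN together form a tiny call mechanism: the jump pushes the address of the following instruction, and RETURN pops it. The following self-contained sketch replays that behaviour on a made-up program encoded as tuples. The opcode names and the run function are hypothetical and only serve this illustration.

```python
def run(program):
    """Execute a list of tuple-encoded instructions and return the emitted text."""
    pc = 0
    frames = []   # return addresses, one per active GOSUB
    output = []
    while pc < len(program):
        op = program[pc]
        if op[0] == "gosub":
            frames.append(pc + 1)   # resume after the GOSUB on RETURN
            pc = op[1]
        elif op[0] == "return":
            pc = frames.pop()
        elif op[0] == "emit":
            output.append(op[1])
            pc += 1
        elif op[0] == "end":
            break
        else:
            raise NotImplementedError(op)
    return output


program = [
    ("emit", "before"),   # 0
    ("gosub", 4),         # 1: call the subroutine at index 4
    ("emit", "after"),    # 2: execution resumes here
    ("end",),             # 3
    ("emit", "inside"),   # 4: subroutine body
    ("return",),          # 5
]
output = run(program)
```

Since frames is a stack, nested GOSUBs return in the reverse order of their calls without any extra bookkeeping.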

Page 86: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_input(self, code):
        value = WrappedFloat(float(raw_input() or "0.0"))
        self.variables[code.name] = value
        self.pc += 1

Page 87: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_for(self, code):
        self.pc += 1
        # Remember where the loop body starts and the unwrapped end value.
        self.iterators[code.variable] = (
            self.pc, self.stack.pop().value
        )

Page 88: Building Interpreters with PyPy

class VM(object):
    ...
    def execute_next(self, code):
        loop_begin, end = self.iterators[code.variable]
        current_value = self.variables[code.variable].value
        next_value = current_value + 1.0
        if next_value <= end:
            self.variables[code.variable] = \
                WrappedFloat(next_value)
            self.pc = loop_begin
        else:
            del self.iterators[code.variable]
            self.pc += 1
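Two consequences of this FOR/NEXT scheme are easy to miss: the bound is only checked at NEXT, so the body always runs at least once, and the loop variable keeps its last in-range value after the loop. The sketch below replays the same bookkeeping in a plain function. The function name, the fixed variable name "I" and the callback-style body are assumptions made for the example.

```python
def run_for_loop(start, end, body):
    """Call body(value) as the FOR/NEXT bytecodes would; return the final value."""
    variables = {"I": float(start)}   # what Let stores before For runs
    iterators = {"I": float(end)}     # loop bound recorded by For
    while True:
        body(variables["I"])          # the body runs before any bound check
        next_value = variables["I"] + 1.0
        if next_value <= iterators["I"]:
            variables["I"] = next_value   # NEXT jumps back to the loop start
        else:
            del iterators["I"]            # NEXT falls through to the next instruction
            return variables["I"]


seen = []
final = run_for_loop(1, 3, seen.append)

seen_once = []
final_once = run_for_loop(5, 1, seen_once.append)
```

With start 5 and end 1 the body still executes once. That matches many classic BASIC dialects, but differs from a Python `for` over an empty range.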

Page 89: Building Interpreters with PyPy

Entry Point

def entry_point(argv):
    try:
        filename = argv[1]
    except IndexError:
        print("You must supply a filename")
        return 1
    content = read_file(filename)
    tokens = lex(content)
    ast = parse(tokens)
    program = ast.compile()
    vm = VM(program)
    vm.execute()
    return 0

Page 90: Building Interpreters with PyPy

JIT (in PyPy)

1. Identify "hot" loops

2. Create trace inserting guards based on observed values

3. Optimize trace

4. Compile trace

5. Execute machine code instead of interpreter
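Step 1 can be pictured as a counter attached to every loop header. The toy sketch below covers only that step and is not how RPython's JIT is implemented: it assumes a log of (from_pc, to_pc) jumps, treats every backward jump as closing a loop, and uses a made-up threshold.

```python
HOT_THRESHOLD = 3  # hypothetical; real thresholds are much higher


def find_hot_loops(jump_log, threshold=HOT_THRESHOLD):
    """Return loop headers in the order in which they became hot.

    jump_log is an iterable of (from_pc, to_pc) pairs. A jump to an earlier
    or equal program counter is counted as one more iteration of the loop
    starting at to_pc.
    """
    counts = {}
    hot = []
    for from_pc, to_pc in jump_log:
        if to_pc <= from_pc:  # backward jump closes a loop
            counts[to_pc] = counts.get(to_pc, 0) + 1
            if counts[to_pc] == threshold:
                hot.append(to_pc)
    return hot


# Loop at pc 2 runs three times, loop at pc 7 once, forward jumps are ignored.
log = [(5, 2), (5, 2), (9, 7), (1, 4), (5, 2), (1, 4)]
hot = find_hot_loops(log)
```

Only loops that cross the threshold are worth tracing, because recording and compiling a trace costs far more than interpreting a few iterations.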

Page 91: Building Interpreters with PyPy

from rpython.rlib.jit import JitDriver

jitdriver = JitDriver(
    greens=["pc", "vm", "program", "frames", "iterators"],
    reds=["stack", "variables"]
)

Page 92: Building Interpreters with PyPy

class VM(object):
    ...
    def execute(self):
        while self.pc < len(self.program.instructions):
            jitdriver.jit_merge_point(
                vm=self, pc=self.pc, …
            )

Page 93: Building Interpreters with PyPy

Benchmark

10 N = 1
20 IF N <= 10000 THEN 40
30 END
40 GOSUB 100
50 IF R = 0 THEN 70
60 PRINT "PRIME"; N
70 N = N + 1: GOTO 20
100 REM *** ISPRIME N -> R
110 IF N <= 2 THEN 170
120 FOR I = 2 TO (N - 1)
130 A = N: B = I: GOSUB 200
140 IF R <> 0 THEN 160
150 R = 0: RETURN
160 NEXT I
170 R = 1: RETURN
200 REM *** MOD A -> B -> R
210 R = A - (B * INT(A / B))
220 RETURN
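For readers who do not speak BASIC, here is a line-by-line port of the benchmark to Python. It is a hypothetical translation written for this transcript, not the Python implementation measured in the talk. It returns the numbers instead of printing them, and it keeps the quirk of line 110, which reports 1 as prime.

```python
def mod(a, b):
    # Line 210: R = A - (B * INT(A / B))
    return a - (b * int(a / b))


def is_prime(n):
    # Lines 110 and 170: everything up to 2 counts as prime, including 1.
    if n <= 2:
        return 1
    # Lines 120 to 160: trial division by every I from 2 to N - 1.
    for i in range(2, n):
        if mod(n, i) == 0:
            return 0
    return 1


def primes_up_to(limit):
    # Lines 10 to 70: test every N from 1 to the limit.
    return [n for n in range(1, limit + 1) if is_prime(n)]


small = primes_up_to(20)
```

The work grows quadratically with the limit, because every candidate is divided by all smaller numbers. That makes the inner FOR loop the hot loop a tracing JIT will pick up.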

Page 94: Building Interpreters with PyPy

cbmbasic 58.22s

basic-c 5.06s

basic-c-jit 2.34s

Python implementation (CPython) 2.83s

Python implementation (PyPy) 0.11s

C implementation 0.03s
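The relative speedups follow directly from the timings above. The short calculation below only restates the slide's numbers; the helper function and variable names are made up for the example.

```python
# Timings in seconds, copied from the slide.
timings = {
    "cbmbasic": 58.22,
    "basic-c": 5.06,
    "basic-c-jit": 2.34,
    "Python implementation (CPython)": 2.83,
    "Python implementation (PyPy)": 0.11,
    "C implementation": 0.03,
}


def speedup(slower, faster):
    """How many times faster `faster` is compared to `slower`."""
    return timings[slower] / timings[faster]


jit_gain = speedup("basic-c", "basic-c-jit")             # effect of the JIT alone
vs_cbmbasic = speedup("cbmbasic", "basic-c-jit")         # against cbmbasic
gap_to_c = speedup("basic-c-jit", "C implementation")    # remaining gap to C
```

So the JIT makes the interpreter roughly 2.2 times faster, the result is about 25 times faster than cbmbasic, and hand-written C is still 78 times faster than the JIT-enabled interpreter.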

Page 95: Building Interpreters with PyPy

Questions?

Page 96: Building Interpreters with PyPy

These slides are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License