
PyPy: is it ready for production? The sequel


DESCRIPTION

Slidedeck for my talk at Pycon Singapore 2013.


Page 1: Pypy is-it-ready-for-production-the-sequel

pypy: is it ready for production?

the sequel

Mark Rees

CTO

Century Software (M) Sdn Bhd

Page 2: Pypy is-it-ready-for-production-the-sequel

pypy & me

not affiliated with pypy team

have followed its development since 2004

use cpython and jython at work

used ironpython for small projects

gave a similar talk at PyConAU 2012

the question:

would pypy improve performance of some of our workloads?

i am a manager who still wants to be a programmer, so i did the analysis

Page 3: Pypy is-it-ready-for-production-the-sequel

pypy

history

- first sprint 2003, EU project from 2004 – 2007

- open source project from 2007 https://bitbucket.org/pypy

- pypy 1.4 first release suitable for "production" 12/2010

what is pypy?

- RPython translation toolchain, a framework for generating dynamic programming language implementations

- an implementation of Python in Python using the framework

Page 4: Pypy is-it-ready-for-production-the-sequel

pypy

current release

pypy 2.0 released may 2013

latest iteration 2.0.2

want to know more about pypy

- http://pypy.org/

- david beazley pycon 2012 keynote http://goo.gl/5PXFQ

- how the pypy jit works http://goo.gl/dKgFp

- why pypy by example http://goo.gl/vpQyJ

Page 5: Pypy is-it-ready-for-production-the-sequel

production ready – a definition

it runs

it satisfies the project requirements

its design was well thought out

it's stable

it's maintainable

it's scalable

it's documented

it works with the python modules we use

it is as fast or faster than cpython

http://programmers.stackexchange.com/questions/61726/define-production-ready

Page 6: Pypy is-it-ready-for-production-the-sequel

pypy – does it run?

of course, it runs

See http://pypy.readthedocs.org/en/latest/cpython_differences.html

for differences between PyPy and CPython
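
Because behaviour can differ between the two interpreters, it helps to know at run time which one is executing the code. A minimal sketch, assuming only the standard library; the function name `interpreter_info` is made up for this example:

```python
import platform
import sys


def interpreter_info():
    """Return (implementation name, True if running on PyPy)."""
    # platform.python_implementation() returns e.g. "CPython" or "PyPy".
    name = platform.python_implementation()
    # PyPy ships a built-in module named __pypy__ that CPython lacks.
    is_pypy = "__pypy__" in sys.builtin_module_names
    return name, is_pypy


if __name__ == "__main__":
    name, is_pypy = interpreter_info()
    print("%s (pypy: %s)" % (name, is_pypy))
```

Either check alone is usually enough; both are shown so they can be compared.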

Page 7: Pypy is-it-ready-for-production-the-sequel

pypy – other production criteria

does it satisfy the project requirements

- yes

was its design well thought out

- I would assume so

is it stable

- yes

is it maintainable

- 7 out of 10

is it scalable

- stackless & greenlets built in

is it documented

- cpython docs for functionality, rpython toolchain 8 out of 10

Page 8: Pypy is-it-ready-for-production-the-sequel

pypy – does it work with the modules we use

standard library modules supported:

__builtin__, __pypy__, _ast, _bisect, _codecs, _collections, _ffi, _hashlib,

_io, _locale, _lsprof, _md5, _minimal_curses, _multiprocessing, _random,

_rawffi, _sha, _socket, _sre, _ssl, _warnings, _weakref, _winreg, array,

binascii, bz2, cStringIO, clr, cmath, cpyext, crypt, errno, exceptions,

fcntl, gc, imp, itertools, marshal, math, mmap, operator, oracle, parser,

posix, pyexpat, select, signal, struct, symbol, sys, termios, thread, time,

token, unicodedata, zipimport, zlib

these modules are supported but written in python:

cPickle, _csv, ctypes, datetime, dbm, _functools, grp, pwd, readline, resource, sqlite3, syslog, tputil

many python libs are known to work, like:

ctypes, django, pyglet, sqlalchemy, PIL

See https://bitbucket.org/pypy/compatibility/wiki/Home for a more exhaustive list.
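
A quick first pass at "does it work with the modules we use" is to try importing each dependency under the interpreter being evaluated. The sketch below is an illustration, not part of the talk: the helper name `check_modules` and the module list are hypothetical, and a clean import only shows that the module loads, not that every code path works.

```python
import importlib


def check_modules(names):
    """Try to import each module; map name -> None on success, else the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Hypothetical dependency list; replace with the modules your project imports.
    wanted = ["csv", "sqlite3", "ctypes", "no_such_module_xyz"]
    for name, error in sorted(check_modules(wanted).items()):
        status = "ok" if error is None else "MISSING: " + error
        print("%-20s %s" % (name, status))
```

Running the same script under cpython and under pypy gives two lists that can be compared side by side.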

Page 9: Pypy is-it-ready-for-production-the-sequel

pypy – does it work with the modules we use

pypy c-api support is beta, worked most of the time but failed with reportlab:

Fatal error in cpyext, CPython compatibility layer, calling PySequence_GetItem
Either report a bug or consider not using this particular extension
<OpErrFmt object at 0x7f94582f3100>
RPython traceback:
  File "pypy_module_cpyext_api_1.c", line 30287, in PySequence_GetItem
  File "pypy_module_cpyext_pyobject.c", line 1056, in BaseCpyTypedescr_realize
  File "pypy_objspace_std_objspace.c", line 3404, in allocate_instance__W_ObjectObject
  File "pypy_objspace_std_typeobject.c", line 33781, in W_TypeObject_check_user_subclass
Segmentation fault

But this was the only compatibility issue we had running all of our python code under pypy and we could fall back to pure python reportlab extensions anyway.
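
Falling back to a pure python implementation can be sketched as below. This is an illustration under stated assumptions, not how reportlab selects its extensions: the helper names and the module name `hypothetical_fast_json` are invented. Note that try/except only catches a failed import; a segmentation fault like the one above kills the process, so on pypy the sketch skips the C candidate entirely.

```python
import importlib
import platform


def import_first(*candidates):
    """Return the first module in candidates that imports cleanly."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (candidates,))


def pick_backend(c_name, py_name):
    """Prefer the C extension on cpython, go straight to pure python on pypy."""
    if platform.python_implementation() == "PyPy":
        return importlib.import_module(py_name)
    return import_first(c_name, py_name)


# Hypothetical accelerator name first, pure python fallback second.
json_impl = pick_backend("hypothetical_fast_json", "json")
```

The rest of the code then uses `json_impl` without caring which implementation was chosen.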

Page 10: Pypy is-it-ready-for-production-the-sequel

pypy – does it work with the modules you use

IPython notebook requires tornado & zeromq

Page 11: Pypy is-it-ready-for-production-the-sequel

pypy – does it work with the modules you use

Page 12: Pypy is-it-ready-for-production-the-sequel

pypy – does it run as fast as cpython

http://speed.pypy.org/

but!

Page 13: Pypy is-it-ready-for-production-the-sequel

pypy django benchmark

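# Python 2 code; the imports and settings setup below are not on the slide.
import time

from django.conf import settings
settings.configure()  # standalone template use needs configured settings
from django.template import Context, Template

DJANGO_TMPL = Template("""<table>
{% for row in table %}
<tr>{% for col in row %}<td>{{ col|escape }}</td>{% endfor %}</tr>
{% endfor %}
</table>
""")

def test_django(count):
    # 150 x 150 table of integers to render.
    table = [xrange(150) for _ in xrange(150)]
    context = Context({"table": table})

    # Warm up Django.
    DJANGO_TMPL.render(context)
    DJANGO_TMPL.render(context)

    # Time each render separately.
    times = []
    for _ in xrange(count):
        t0 = time.time()
        data = DJANGO_TMPL.render(context)
        t1 = time.time()
        times.append(t1 - t0)
    return times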
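
The same warm-up-then-time pattern can be applied to any callable. The sketch below is a generic illustration, not the benchmark harness used for the talk; `timed_runs` and `summarize` are names chosen for this example. The warm-up calls matter on pypy because the JIT needs a few iterations before it has compiled the hot loops.

```python
import time


def timed_runs(func, count, warmup=2):
    """Call func `warmup` times untimed, then return `count` wall-clock timings."""
    for _ in range(warmup):
        func()
    times = []
    for _ in range(count):
        t0 = time.time()
        func()
        times.append(time.time() - t0)
    return times


def summarize(times):
    """Return (min, mean, max) of a non-empty list of timings."""
    return min(times), sum(times) / len(times), max(times)


if __name__ == "__main__":
    # Stand-in workload; replace with the code you want to measure.
    timings = timed_runs(lambda: sum(i * i for i in range(100000)), count=10)
    print("min %.4fs  mean %.4fs  max %.4fs" % summarize(timings))
```

Reporting the minimum and maximum next to the mean makes it easier to spot runs that were disturbed by other load on the machine.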

Page 14: Pypy is-it-ready-for-production-the-sequel

my csv to xml benchmark

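import csv  # not shown on the slide; SAXWriter is the author's own helper

def bench(data, output):
    f = open(data, 'rb')
    fn = ['age', ...]  # remaining field names elided on the slide
    reader = csv.DictReader(f, fn)
    writer = SAXWriter(output)
    writer.start_doc()
    writer.start_tag('data')
    try:
        # One <row> element per csv record, one child element per column.
        for row in reader:
            writer.start_tag('row')
            for key in row.keys():
                writer.tag(key.replace(' ', '_'), body=row[key])
            writer.end_tag('row')
    finally:
        f.close()
    writer.end_tag('data')
    writer.end_doc()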
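
The slide does not show `SAXWriter`. One way such a class could look, assuming it is a thin wrapper around the standard library's `xml.sax.saxutils.XMLGenerator`, is sketched below. This is a hypothetical stand-in whose method names are taken from the calls in `bench`; the real class in the benchmark repository may differ.

```python
import io
from xml.sax.saxutils import XMLGenerator


class SAXWriter(object):
    """Hypothetical stand-in for the SAXWriter used in the benchmark."""

    def __init__(self, output, encoding="utf-8"):
        self._gen = XMLGenerator(output, encoding)

    def start_doc(self):
        self._gen.startDocument()

    def end_doc(self):
        self._gen.endDocument()

    def start_tag(self, name, attrs=None):
        self._gen.startElement(name, attrs or {})

    def end_tag(self, name):
        self._gen.endElement(name)

    def tag(self, name, body=""):
        """Write <name>body</name>; XMLGenerator escapes the body text."""
        self.start_tag(name)
        if body:
            self._gen.characters(body)
        self.end_tag(name)


def demo():
    """Write one row with one column and return the generated XML."""
    out = io.StringIO()
    writer = SAXWriter(out)
    writer.start_doc()
    writer.start_tag("data")
    writer.start_tag("row")
    writer.tag("age", body="42")
    writer.end_tag("row")
    writer.end_tag("data")
    writer.end_doc()
    return out.getvalue()
```

The sketch targets Python 3 (`io.StringIO` as a text sink); the benchmark on the slide is Python 2 code.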

Page 15: Pypy is-it-ready-for-production-the-sequel

my pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark      cpython 2.7.3    pypy-jit 1.9               pypy-jit 2.0.2
bm_csv2xml     88.26/94.04      28.89 (3.0549 x faster)    23.86 (3.7728 x faster)

average execution time (in seconds)

Page 16: Pypy is-it-ready-for-production-the-sequel

my pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark      cpython 2.7.3    pypy-jit 1.9               pypy-jit 2.0.2
bm_csv2xml     88.26/94.04      28.89 (3.0549 x faster)    23.86 (3.7728 x faster)
bm_csv         1.54/1.65        5.89 (3.8122 x slower)     1.72 (0.9825 x slower)

average execution time (in seconds)

Page 17: Pypy is-it-ready-for-production-the-sequel

my pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark      cpython 2.7.3    pypy-jit 1.9               pypy-jit 2.0.2
bm_csv2xml     88.26/94.04      28.89 (3.0549 x faster)    23.86 (3.7728 x faster)
bm_csv         1.54/1.65        5.89 (3.8122 x slower)     1.72 (0.9825 x slower)
bm_openpyxl    1.31/1.21        3.26 (2.4871 x slower)     3.15 (2.6051 x slower)

average execution time (in seconds)

Page 18: Pypy is-it-ready-for-production-the-sequel

my pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark      cpython 2.7.3    pypy-jit 1.9               pypy-jit 2.0.2
bm_csv2xml     88.26/94.04      28.89 (3.0549 x faster)    23.86 (3.7728 x faster)
bm_csv         1.54/1.65        5.89 (3.8122 x slower)     1.72 (0.9825 x slower)
bm_openpyxl    1.31/1.21        3.26 (2.4871 x slower)     3.15 (2.6051 x slower)
bm_xhtml2pdf   1.91/1.95        3.27 (1.7155 x slower)     4.22 (2.1637 x slower)

average execution time (in seconds)
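
The slides do not say how the "x faster" and "x slower" figures were derived. A plausible reading, stated here as an assumption, is the ratio of the two average times. With the rounded values from the table this gives roughly 3.06 for bm_csv2xml on pypy 1.9 (88.26 / 28.89), close to the 3.0549 shown; the small gap is consistent with the slide using unrounded averages. The function names below are made up for this example.

```python
def speed_ratio(cpython_seconds, pypy_seconds):
    """Return cpython time / pypy time; above 1.0 means pypy was faster."""
    return cpython_seconds / pypy_seconds


def describe(cpython_seconds, pypy_seconds):
    """Format the ratio as 'N.NNx faster' or 'N.NNx slower' from pypy's point of view."""
    ratio = speed_ratio(cpython_seconds, pypy_seconds)
    if ratio >= 1.0:
        return "%.2fx faster" % ratio
    return "%.2fx slower" % (1.0 / ratio)


if __name__ == "__main__":
    # Rounded averages taken from the table above (first cpython figure).
    print("bm_csv2xml, pypy 1.9:", describe(88.26, 28.89))
    print("bm_csv, pypy 1.9:", describe(1.54, 5.89))
```

Expressing a slowdown as a factor above 1 avoids ambiguous entries such as "0.98 x slower".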

Page 19: Pypy is-it-ready-for-production-the-sequel

my pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark      cpython 2.7.3    pypy-jit 1.9                pypy-jit 2.0.2
bm_interp      5412/5248        12556 (2.32 x larger)       21880 (4.1692 x larger)
bm_csv2xml     7048/7064        55180 (7.8292 x larger)     55232 (7.8188 x larger)
bm_csv         5812/5180        52200 (8.9814 x larger)     52176 (10.0726 x larger)
bm_openpyxl    12656/12656      77252 (6.1040 x larger)     80428 (6.3549 x larger)
bm_xhtml2pdf   48880/34884      236792 (4.8444 x larger)    101376 (2.906 x larger)

max memory use
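
The slide gives neither the unit nor the method used to measure peak memory. One way to take a comparable reading on a Unix system, offered as an assumption rather than the method behind the table, is the standard library's `resource.getrusage`, which reports the peak resident set size in kilobytes on Linux and in bytes on macOS. The function name `max_rss_kb` is made up for this example.

```python
import resource
import sys


def max_rss_kb():
    """Peak resident set size of this process, normalised to kilobytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        # macOS reports bytes; Linux reports kilobytes.
        peak //= 1024
    return peak


if __name__ == "__main__":
    before = max_rss_kb()
    # Stand-in workload that allocates memory; replace with the benchmark body.
    data = [str(i) for i in range(500000)]
    print("peak rss before: %d KB, after: %d KB" % (before, max_rss_kb()))
```

The `resource` module is Unix-only, and it appears in the list of modules pypy supports, so the same script can run under both interpreters.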

Page 20: Pypy is-it-ready-for-production-the-sequel

what is the pypy jit doing?

https://bitbucket.org/pypy/jitviewer/

Page 21: Pypy is-it-ready-for-production-the-sequel

modified csv pypy benchmarks

https://bitbucket.org/hexdump42/pypy-benchmarks

benchmark        cpython 2.7.3    pypy-jit 1.9               pypy-jit 2.0.2
bm_csv2xml_mod   88.25/90.02      23.65 (3.7315 x faster)    21.76 (4.0556 x faster)
bm_csv_mod       1.62/1.69        1.89 (0.8571 x slower)     1.68 (0.9643 x slower)

average execution time (in seconds)

Page 22: Pypy is-it-ready-for-production-the-sequel

is pypy ready for production

1. it runs

2. it satisfies the project requirements

3. its design was well thought out

4. it's stable

5. it's maintainable

6. it's scalable

7. it's documented

8. it works with the python modules we use

9. it can be as fast or faster than cpython

Page 23: Pypy is-it-ready-for-production-the-sequel

some other reasons to consider pypy

cffi – C foreign function interface for python

- http://cffi.readthedocs.org/

pypy version of numpy

py3k version of pypy work-in-progress

check out the STM/AME project

- https://speakerdeck.com/pyconslides/pypy-python-without-the-gil-by-armin-rigo-and-maciej-fijalkowski

You can help

http://www.pypy.org/howtohelp.html

Page 24: Pypy is-it-ready-for-production-the-sequel

now for something different

Page 25: Pypy is-it-ready-for-production-the-sequel

cffi better than ctypes?

Page 26: Pypy is-it-ready-for-production-the-sequel

cffi better than ctypes?

Page 27: Pypy is-it-ready-for-production-the-sequel

contact details

Mark Rees

mark at censof dot com

+Mark Rees

@hexdump42

hex-dump.blogspot.com

http://www.slideshare.net/hexdump42/pypy-isitreadyforproductionthesequel

http://goo.gl/8IPuX