Slidedeck for my talk at Pycon Singapore 2013.
the sequel
Mark Rees
CTO
Century Software (M) Sdn Bhd
is it ready for production?
pypy & me
not affiliated with pypy team
have followed its development since
2004
use cpython and jython at work
used ironpython for small projects
gave a similar talk at PyConAU 2012
the question:
would pypy improve performance of
some of our workloads?
i am a manager who still wants to be a
programmer, so i did the analysis
pypy
history
- first sprint 2003, EU project from 2004 – 2007
- open source project from 2007
https://bitbucket.org/pypy
- pypy 1.4 first release suitable for “production”
12/2010
what is pypy?
- RPython translation toolchain, a framework for
generating dynamic programming language
implementations
- an implementation of Python in Python using the
framework
pypy
current release
pypy 2.0 released may 2013
latest iteration 2.0.2
want to know more about pypy
- http://pypy.org/
- david beazley pycon 2012 keynote
http://goo.gl/5PXFQ
- how the pypy jit works http://goo.gl/dKgFp
- why pypy by example http://goo.gl/vpQyJ
production ready – a definition
it runs
it satisfies the project requirements
its design was well thought out
it's stable
it's maintainable
it's scalable
it's documented
it works with the python modules we use
it is as fast or faster than cpython
http://programmers.stackexchange.com/questions/61726/define-production-ready
pypy – does it run?
of course, it runs
See http://pypy.readthedocs.org/en/latest/cpython_differences.html
for differences between PyPy and CPython
pypy – other production criteria
does it satisfy the project requirements
- yes
is its design well thought out
- I would assume so
is it stable
- yes
is it maintainable
- 7 out of 10
is it scalable
- stackless & greenlets built in
is it documented
- cpython docs for functionality, rpython toolchain 8 out
of 10
pypy – does it work with the modules we use
standard library modules supported:
__builtin__, __pypy__, _ast, _bisect, _codecs, _collections, _ffi, _hashlib,
_io, _locale, _lsprof, _md5, _minimal_curses, _multiprocessing, _random,
_rawffi, _sha, _socket, _sre, _ssl, _warnings, _weakref, _winreg, array,
binascii, bz2, cStringIO, clr, cmath, cpyext, crypt, errno, exceptions,
fcntl, gc, imp, itertools, marshal, math, mmap, operator, oracle, parser,
posix, pyexpat, select, signal, struct, symbol, sys, termios, thread, time,
token, unicodedata, zipimport, zlib
these modules are supported but written in python:
cPickle, _csv, ctypes, datetime, dbm, _functools, grp, pwd, readline,
resource, sqlite3, syslog, tputil
many python libs are known to work, like:
ctypes, django, pyglet, sqlalchemy, PIL.
See https://bitbucket.org/pypy/compatibility/wiki/Home for a more
exhaustive list.
pypy – does it work with the modules we use
pypy c-api support is beta, worked most of
the time but failed with reportlab:
Fatal error in cpyext, CPython compatibility layer, calling PySequence_GetItem
Either report a bug or consider not using this particular extension
<OpErrFmt object at 0x7f94582f3100>
RPython traceback:
  File "pypy_module_cpyext_api_1.c", line 30287, in PySequence_GetItem
  File "pypy_module_cpyext_pyobject.c", line 1056, in BaseCpyTypedescr_realize
  File "pypy_objspace_std_objspace.c", line 3404, in allocate_instance__W_ObjectObject
  File "pypy_objspace_std_typeobject.c", line 33781, in W_TypeObject_check_user_subclass
Segmentation fault
But this was the only compatibility issue we
had running all of our python code under
pypy, and we could fall back to the pure python
reportlab extensions anyway.
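A minimal sketch of the general "try the accelerated module, fall back to pure python" pattern. This is not reportlab's own mechanism; `import_first` and the module name `hypothetical_fast_json` are made up for illustration, and the stdlib `json` module stands in for the pure python fallback.

```python
import importlib


def import_first(*module_names):
    """Return the first module in module_names that imports cleanly."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (module_names,))


# Hypothetical accelerator name first, pure python stdlib module as fallback.
json_impl = import_first("hypothetical_fast_json", "json")
```

Note that this only helps when the extension is missing or fails to import. A segmentation fault like the one above cannot be caught with try/except, so in that case the C extension has to be left out of the install.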
pypy – does it work with the modules you use
IPython notebook requires tornado & zeromq
pypy – does it run as fast as cpython
http://speed.pypy.org/
but!
pypy django benchmark
import time

from django.conf import settings
settings.configure()  # standalone use, no project settings module
from django.template import Context, Template

DJANGO_TMPL = Template("""<table>
{% for row in table %}
<tr>{% for col in row %}<td>{{ col|escape }}</td>{% endfor %}</tr>
{% endfor %}
</table>
""")

def test_django(count):
    table = [xrange(150) for _ in xrange(150)]
    context = Context({"table": table})
    # Warm up Django.
    DJANGO_TMPL.render(context)
    DJANGO_TMPL.render(context)
    times = []
    for _ in xrange(count):
        t0 = time.time()
        data = DJANGO_TMPL.render(context)
        t1 = time.time()
        times.append(t1 - t0)
    return times
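The warm-up-then-measure pattern used by the django benchmark can be pulled out into a small reusable helper. A minimal sketch: `time_calls` and its `warmup` parameter are names invented here, not part of the benchmark suite.

```python
import time


def time_calls(func, count, warmup=2):
    """Call func `warmup` times untimed, then time `count` further calls."""
    for _ in range(warmup):
        func()  # gives a JIT a chance to compile the hot paths first
    times = []
    for _ in range(count):
        t0 = time.time()
        func()
        t1 = time.time()
        times.append(t1 - t0)
    return times


# Example workload: sum a small range five times after two warm-up calls.
times = time_calls(lambda: sum(range(1000)), 5)
```

Keeping the warm-up calls out of the timed loop matters more on pypy than on cpython, because the first iterations include the cost of tracing and compiling.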
my csv to xml benchmark
import csv

# SAXWriter is the benchmark's own XML writer class (not shown on the slide)
def bench(data, output):
    f = open(data, 'rb')
    fn = ['age', ...]  # remaining field names elided on the slide
    reader = csv.DictReader(f, fn)
    writer = SAXWriter(output)
    writer.start_doc()
    writer.start_tag('data')
    try:
        for row in reader:
            writer.start_tag('row')
            for key in row.keys():
                writer.tag(key.replace(' ', '_'), body=row[key])
            writer.end_tag('row')
    finally:
        f.close()
    writer.end_tag('data')
    writer.end_doc()
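The slide does not show `SAXWriter`; the real class lives in the benchmark repository. To get a feel for what `bench` calls, here is a hypothetical stand-in built on the stdlib `xml.sax.saxutils.XMLGenerator`. Only the method names come from the slide; the signatures, the `encoding` default and the implementation are assumptions, and the sketch is written for Python 3.

```python
import io
from xml.sax.saxutils import XMLGenerator


class SAXWriter(object):
    """Hypothetical stand-in, NOT the class used in the benchmark."""

    def __init__(self, output, encoding="utf-8"):
        self._gen = XMLGenerator(output, encoding)

    def start_doc(self):
        self._gen.startDocument()

    def end_doc(self):
        self._gen.endDocument()

    def start_tag(self, name, attrs=None):
        self._gen.startElement(name, attrs or {})

    def end_tag(self, name):
        self._gen.endElement(name)

    def tag(self, name, body=None):
        # One complete element; characters() escapes <, > and &.
        self.start_tag(name)
        if body is not None:
            self._gen.characters(body)
        self.end_tag(name)


buf = io.StringIO()
writer = SAXWriter(buf)
writer.start_doc()
writer.start_tag('data')
writer.tag('age', body='4 < 5')
writer.end_tag('data')
writer.end_doc()
xml_text = buf.getvalue()
```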
my pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks
benchmark      cpython 2.7.3   pypy-jit 1.9             pypy-jit 2.0.2
bm_csv2xml     88.26/94.04     28.89 (3.0549x faster)   23.86 (3.7728x faster)
bm_csv         1.54/1.65       5.89 (3.8122x slower)    1.72 (0.9825x slower)
bm_openpyxl    1.31/1.21       3.26 (2.4871x slower)    3.15 (2.6051x slower)
bm_xhtml2pdf   1.91/1.95       3.27 (1.7155x slower)    4.22 (2.1637x slower)
average execution time (in seconds)
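The "x faster" and "x slower" figures appear to be ratios of average execution times. A minimal sketch of that arithmetic, assuming the ratio is always the larger time over the smaller one; `mean` and `compare` are names invented here. For bm_csv2xml on pypy 1.9 this gives 88.26 / 28.89, about 3.055, in line with the table; other cells may have been computed from unrounded averages.

```python
def mean(times):
    """Average of a list of per-run timings in seconds."""
    return sum(times) / float(len(times))


def compare(cpython_secs, pypy_secs):
    """Return (ratio, label) describing pypy relative to cpython."""
    if pypy_secs <= cpython_secs:
        return cpython_secs / pypy_secs, "faster"
    return pypy_secs / cpython_secs, "slower"


# bm_csv2xml, cpython 2.7.3 vs pypy-jit 1.9 (numbers from the table above).
ratio, label = compare(88.26, 28.89)
```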
my pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks
benchmark      cpython 2.7.3   pypy-jit 1.9               pypy-jit 2.0.2
bm_interp      5412/5248       12556 (2.32x larger)       21880 (4.1692x larger)
bm_csv2xml     7048/7064       55180 (7.8292x larger)     55232 (7.8188x larger)
bm_csv         5812/5180       52200 (8.9814x larger)     52176 (10.0726x larger)
bm_openpyxl    12656/12656     77252 (6.1040x larger)     80428 (6.3549x larger)
bm_xhtml2pdf   48880/34884     236792 (4.8444x larger)    101376 (2.906x larger)
max memory use
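The slides do not say how max memory use was measured. One way to read a process's peak memory from inside python is the stdlib `resource` module (Unix only, and listed above as supported on pypy). A minimal sketch: `peak_rss` is a name invented here, and the unit of `ru_maxrss` depends on the platform (kilobytes on Linux, bytes on macOS).

```python
import resource


def peak_rss():
    """Peak resident set size of this process, in the platform's ru_maxrss units."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


before = peak_rss()
# Same shape of data as the django benchmark's table, materialised as lists.
table = [list(range(150)) for _ in range(150)]
after = peak_rss()
```

Because this is a high-water mark, `after` can never be lower than `before`; it only shows growth, not memory that was later released.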
what is the pypy jit doing?
https://bitbucket.org/pypy/jitviewer/
modified csv pypy benchmarks
https://bitbucket.org/hexdump42/pypy-benchmarks
benchmark        cpython 2.7.3   pypy-jit 1.9             pypy-jit 2.0.2
bm_csv2xml_mod   88.25/90.02     23.65 (3.7315x faster)   21.76 (4.0556x faster)
bm_csv_mod       1.62/1.69       1.89 (0.8571x slower)    1.68 (0.9643x slower)
average execution time (in seconds)
is pypy ready for production
1. it runs
2. it satisfies the project requirements
3. its design was well thought out
4. it's stable
5. it's maintainable
6. it's scalable
7. it's documented
8. it works with the python modules we use
9. it can be as fast or faster than cpython
some other reasons to consider pypy
cffi – C foreign function interface for python
- http://cffi.readthedocs.org/
pypy version of numpy
py3k version of pypy work-in-progress
check out the STM/AME project
- https://speakerdeck.com/pyconslides/pypy-python-without-the-gil-by-armin-rigo-and-maciej-fijalkowski
You can help
http://www.pypy.org/howtohelp.html
now for something different
cffi better than ctypes?
contact details
Mark Rees
mark at censof dot com
+Mark Rees
@hexdump42
hex-dump.blogspot.com
http://www.slideshare.net/hexdump42/pypy-isitreadyforproductionthesequel
http://goo.gl/8IPuX