
gevent network library

Denis Bilenko

gevent.org

Problem statement

    from urllib2 import urlopen
    response = urlopen('http://gevent.org')
    body = response.read()

How to manage concurrent connections?

Problem statement

    def on_response_read(response):
        d = response.read()
        d.addCallbacks(on_body_read, on_error)

    def on_error(error):
        ...

    def on_body_read(body):
        ...

    d = readURL('http://gevent.org')
    d.addCallbacks(on_response_read, on_error)
    reactor.run()

Possible answer: Async framework (Twisted, asyncore, ...)

simplicity is lost

Problem statement

    import urllib2
    from threading import Thread

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    # one OS thread per connection
    t1 = Thread(target=read_url, args=('http://gevent.org',))
    t1.start()
    t2 = Thread(target=read_url, args=('http://python.org',))
    t2.start()
    t1.join()
    t2.join()

Possible answer: Threads

resource hog

Memory required for 10k connections

twisted: 55 MB
threading: 400 MB

gevent (greenlet + libevent)

    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])

concurrent fetch

Memory required for 10k connections

twisted: 55 MB
gevent: 70 MB
threading: 400 MB
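To put the totals in per-connection terms, the figures above can be divided by the 10k connections. This is only a rough sketch of the arithmetic: it assumes 1 MB = 1024 KB and that memory is spread evenly across connections, neither of which the slides state.

```python
# Totals from the slide, in MB, for 10k connections.
CONNECTIONS = 10_000
total_mb = {"twisted": 55, "gevent": 70, "threading": 400}

# Assumption: 1 MB = 1024 KB, memory divided evenly across connections.
per_conn_kb = {name: mb * 1024 / CONNECTIONS for name, mb in total_mb.items()}

for name, kb in per_conn_kb.items():
    print(f"{name}: {kb:.1f} KB per connection")
```

On these assumptions gevent sits within a couple of KB of twisted per connection, while threading costs several times more.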

greenlet

from greenlet import greenlet

    >>> def myfunction(arg):
    ...     return arg + 1
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    3

from greenlet import greenlet

    >>> MAIN = greenlet.getcurrent()
    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    'hello'
    >>> g.switch('hello to you')
    3

switching deep down the stack

    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    >>> def top_function(arg):
    ...     return myfunction(arg)
    >>> g = greenlet(top_function)
    >>> g.switch(2)
    'hello'

from greenlet import greenlet

• primitive pseudothreads, share same OS thread
• switched explicitly via switch() and throw()
• organized in a tree, each has .parent except MAIN
• switch(), throw() and .parent reserved for gevent

http://codespeak.net/py/0.9.2/greenlet.html
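For readers without greenlet installed, the value passing in the switch() examples can be approximated with a plain generator. This is an analogy only, not greenlet's API: `yield` plays the role of MAIN.switch(), and `next()`/`send()` play the role of g.switch().

```python
def myfunction(arg):
    # Hand 'hello' back to the caller and pause, like MAIN.switch('hello').
    reply = yield 'hello'
    assert reply == 'hello to you'
    return arg + 1

g = myfunction(2)
first = next(g)                 # runs until the yield; first is 'hello'
try:
    g.send('hello to you')      # resumes; the generator then returns
except StopIteration as stop:
    result = stop.value         # the return value, arg + 1
```

The difference the "switching deep down the stack" slide points at is that a generator can only pause in its own frame, whereas a greenlet can switch away from inside any nested call such as top_function() calling myfunction().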

How gevent uses greenlet

[Diagram: the HUB greenlet, the MAIN greenlet, and spawned greenlets]

Hub: greenlet that runs event loop

    import greenlet
    from gevent import core

    class Hub(greenlet.greenlet):

        def run(self):
            core.dispatch()  # wrapper for event_dispatch()

    def get_hub():
        # return the global Hub instance,
        # creating one if it does not exist
        ...

gevent/hub.py

Event loop

• libevent 1.4.x or 2.0.5-beta
• gevent.core: wraps libevent API (like pyevent)

    >>> def print_hello():
    ...     print 'hello'
    >>> gevent.core.timer(1, print_hello)
    <timer ...>
    >>> gevent.core.dispatch()
    hello
    1 # return value (no more events)

Implementation of gevent.sleep()

    def sleep(seconds=0):
        """Put the current greenlet to sleep"""
        switch = getcurrent().switch
        timer = core.timer(seconds, switch)
        try:
            get_hub().switch()
        finally:
            timer.cancel()
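The hub-plus-timer pattern can be sketched with the standard library alone. The code below is a toy under loud assumptions: `ToyHub` and `worker` are hypothetical names, generators stand in for greenlets, and a heap of deadlines stands in for libevent timers. It is not how gevent is implemented, only the same shape: a task registers a wake-up time, gives up control, and the loop resumes it when the timer fires.

```python
import heapq
import itertools
import time

class ToyHub:
    """Toy event loop: resumes generator tasks when their timers fire."""

    def __init__(self):
        self._timers = []              # heap of (deadline, seq, task)
        self._seq = itertools.count()  # tie-breaker so tasks are never compared

    def spawn(self, task):
        self._schedule(task, 0)

    def _schedule(self, task, delay):
        deadline = time.monotonic() + delay
        heapq.heappush(self._timers, (deadline, next(self._seq), task))

    def run(self):
        # Loop until no timers remain, like dispatch() returning.
        while self._timers:
            deadline, _, task = heapq.heappop(self._timers)
            time.sleep(max(0.0, deadline - time.monotonic()))
            try:
                delay = next(task)     # resume until the task asks to sleep
            except StopIteration:
                continue               # task finished
            self._schedule(task, delay)

def worker(name, delay, log):
    log.append(name + " start")
    yield delay                        # "sleep": hand control back to the hub
    log.append(name + " done")

log = []
hub = ToyHub()
hub.spawn(worker("slow", 0.1, log))
hub.spawn(worker("fast", 0.02, log))
hub.run()
print(log)
```

Both workers start before either finishes, and the one with the shorter sleep finishes first, even though everything runs in a single OS thread.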

Cooperative socket

• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def recv(self, size):
        while True:
            try:
                return self._sock.recv(size)
            except error, ex:
                if ex[0] == EWOULDBLOCK:
                    wait_read(self.fileno())
                else:
                    raise

Cooperative socket

• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def wait_read(fileno):
        switch = getcurrent().switch
        event = core.read_event(fileno, switch)
        try:
            get_hub().switch()
        finally:
            event.cancel()

gevent/socket.py
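The retry loop in recv() can be tried out with only the standard library. This sketch is written for Python 3, where a non-blocking recv() with no data raises BlockingIOError rather than error with EWOULDBLOCK. `cooperative_recv` is a hypothetical name, and select.select() stands in for switching to the hub, so unlike gevent this version really does block the thread while it waits.

```python
import select
import socket

def wait_read(sock, timeout=None):
    """Wait until sock is readable (stand-in for switching to the hub)."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise TimeoutError("socket not readable in time")

def cooperative_recv(sock, size, wait=wait_read):
    """Synchronous-looking recv() on top of a non-blocking socket."""
    while True:
        try:
            return sock.recv(size)
        except BlockingIOError:
            wait(sock)  # nothing to read yet: wait, then retry

# Usage: a connected pair of sockets, one side non-blocking.
a, b = socket.socketpair()
a.setblocking(False)

waits = []

def send_then_wait(sock):
    # Simulates "another task" producing data while we are waiting.
    waits.append(1)
    b.sendall(b"hi")
    wait_read(sock, timeout=1)

data = cooperative_recv(a, 2, wait=send_then_wait)
a.close()
b.close()
```

The caller sees an ordinary blocking-style call, while the socket underneath never blocks in recv() itself.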

Cooperative socket

• gevent.socket
• dns queries are resolved through libevent-dns (getaddrinfo, gethostbyname)
• gevent.ssl

Monkey patching

    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])

Monkey patching

Patches:
• socket and ssl modules
• time.sleep, select.select
• thread and threading
Beware:
• libraries that wrap C libraries (e.g. MySQLdb)
• Disk I/O
• things not yet patched: subprocess, os.system, sys.stdin
Tested with httplib, urllib2, mechanize, mysql-connector, SQLAlchemy, ...
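The general mechanism behind monkey patching is replacing a module attribute at runtime so that unsuspecting code picks up the replacement. The sketch below shows only that mechanism with the standard library: `fake_sleep` and `legacy_code` are hypothetical names, and this is not what monkey.patch_all() does internally.

```python
import time

calls = []

def fake_sleep(seconds):
    """Hypothetical replacement: records the request instead of blocking."""
    calls.append(seconds)

def legacy_code():
    # Looks up time.sleep on the module at call time, so it sees the patch.
    time.sleep(5)

original_sleep = time.sleep
time.sleep = fake_sleep          # the "patch"
try:
    legacy_code()                # returns immediately instead of sleeping
finally:
    time.sleep = original_sleep  # always restore the real function
```

Code that ran `from time import sleep` before the patch keeps its reference to the original function, which is presumably why the slides call patch_all() on the very first line.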

Greenlet objects

    from gevent import monkey; monkey.patch_all()
    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    # spawn() returns Greenlet objects
    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])

Greenlet objects

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    # these two lines are equivalent to g = gevent.spawn(read_url, url)
    g = Greenlet(read_url, url)
    g.start()

    # wait for it to complete
    g.join()

    # or raise an exception and wait to exit
    g.kill()

Greenlet objects

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    # these two lines are equivalent to g = gevent.spawn(read_url, url)
    g = Greenlet(read_url, url)
    g.start()

    # wait for it to complete (or timeout expires)
    g.join(timeout=2)

    # or raise and wait to exit (or timeout expires)
    g.kill(timeout=2)

Timeouts

    with gevent.Timeout(5):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # raises Timeout if not done after 5 seconds

    with gevent.Timeout(5, False):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # exits block if not done after 5 seconds

Beware: catch-all “except:”, non-yielding code
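The catch-all warning can be illustrated without gevent. The sketch assumes, as I believe is the case for gevent.Timeout, that the timeout exception derives from BaseException rather than Exception; `FakeTimeout` is a hypothetical stand-in, not gevent's class.

```python
class FakeTimeout(BaseException):
    """Hypothetical stand-in for gevent.Timeout (not gevent's class)."""

def swallow_everything(work):
    try:
        work()
    except:                # catch-all: also traps the timeout
        return "swallowed"
    return "finished"

def catch_errors_only(work):
    try:
        work()
    except Exception:      # ordinary errors only; the timeout propagates
        return "handled"
    return "finished"

def timed_out_work():
    raise FakeTimeout()

outcome = swallow_everything(timed_out_work)

try:
    catch_errors_only(timed_out_work)
    propagated = False
except FakeTimeout:
    propagated = True
```

With the bare `except:` the timeout never reaches the enclosing `with` block, so the code carries on as if nothing had expired. The other half of the warning, non-yielding code, is a separate problem: a greenlet that never switches to the hub gives the timer no chance to fire at all.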

API

• socket, ssl
• Greenlet
• Timeout
• Event, AsyncResult
• Queue (also JoinableQueue, PriorityQueue, LifoQueue)
  – Queue(0) is a synchronous channel
• Pool
• StreamServer: TCP and SSL servers
• WSGI servers

WSGI servers

• gevent.wsgi
  – uses libevent-http
  – efficient, but lacks important features
• gevent.pywsgi
  – uses gevent sockets
• green unicorn (gunicorn.org)
  – its own parser or gevent’s server
  – pre-fork workers

Caveat emptor

• Reduced portability
  – no Jython, IronPython
  – not all platforms supported by CPython
• PyThreadState is shared
  – exc_info (saved/restored by gevent)
  – tracing, profiling info

Future plans

• http://code.google.com/p/gevent/issues/list
• alternative coroutine libraries
  – Stackless
  – swapcontext
• more libevent:
  – http client
  – buffered socket operations
  – priorities
• process handling (gevent.subprocess)
• even more stable API with 1.0

Examples

• bitbucket.org/denis/gevent/src/tip/examples/
• chat.gevent.org
• omegle.com
• ProjectsUsingGevent
  – gevent-mysql
  – psycopg2
• bit.ly/use-gevent
  – websockets, web crawlers, facebook apps

Summary

• coroutines are easy-to-use threads
• as efficient as async libraries
• works well if app is I/O bound
• simple API, many things familiar
• works with unsuspecting 3rd party modules

Thank you!

gevent.org
@gevent
