Python in Web 2.0 Websites
Hong Qiangning (洪强宁), QCon Beijing 2010
http://www.flickr.com/photos/arnolouise/2986467632/
Saturday, April 24, 2010
About Me
• Python programmer
• Started using Python in 2002
• Working entirely in Python since 2004
• http://www.douban.com/people/hongqn/
• http://twitter.com/hongqn
Python
• Python is a programming language that lets you work more quickly and integrate your systems more effectively. You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs. (via http://python.org/)
Languages at Douban
• Other (Pyrex/R/Erlang/Go/Shell): 1%
• C++: 3%
• JavaScript: 12%
• C: 27%
• Python: 58%
Why Python?
Easy to learn
• Hello World: 1 minute
• A small utility script: 1 afternoon
• A practical program: 1 week
• Building Douban: 3 months
Rapid development
Counting lines of code per language: 13 lines

    import os
    from collections import defaultdict

    d = defaultdict(int)

    for dirpath, dirnames, filenames in os.walk('.'):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            ext = os.path.splitext(filename)[1]
            d[ext] += len(list(open(path)))

    for ext, n_lines in d.items():
        print ext, n_lines
Easy to collaborate
• Mandatory indentation keeps code structure clear and readable
• Pythonic conventions discourage strongly personal styles
Easy to deploy
• Three steps to go live:
  1. svn ci
  2. svn up
  3. restart
Widely applicable
• Web applications
• Offline computation
• Operations scripts
• Data analysis
Rich resources
• Batteries included: 200+ modules in the standard library
• PyPI: 9613 packages currently
• Networking / databases / desktop / games / scientific computing / security / text processing / ...
• Easily extensible
More importantly, Lao Zhao (老赵) recommends Python too
Just kidding :-p
Examples
Web Server
• python -m SimpleHTTPServer
web.py

    import web

    urls = (
        '/(.*)', 'hello'
    )
    app = web.application(urls, globals())

    class hello:
        def GET(self, name):
            if not name:
                name = 'World'
            return 'Hello, ' + name + '!'

    if __name__ == "__main__":
        app.run()

http://webpy.org/
Flask

    from flask import Flask
    app = Flask(__name__)

    @app.route("/<name>")
    def hello(name):
        if not name:
            name = 'World'
        return 'Hello, ' + name + '!'

    if __name__ == "__main__":
        app.run()

http://flask.pocoo.org/
WSGI
http://www.python.org/dev/peps/pep-0333/
Why so many Python web frameworks?
• Because you can write your own framework in 3 hours and a total of 60 lines of Python code.
• http://bitworking.org/news/Why_so_many_Python_web_frameworks
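To make the "60 lines" claim concrete, here is a minimal sketch of a routing layer written directly on the WSGI interface, in Python 3. The names MicroApp, route, hello, and call_app are illustrative assumptions and do not come from any framework mentioned in this talk.

```python
import re
from wsgiref.util import setup_testing_defaults


class MicroApp:
    """A toy WSGI framework: regex routes mapped to handler functions."""

    def __init__(self):
        self.routes = []  # list of (compiled pattern, handler)

    def route(self, pattern):
        """Decorator that registers a handler for a URL pattern."""
        def register(handler):
            self.routes.append((re.compile('^' + pattern + '$'), handler))
            return handler
        return register

    def __call__(self, environ, start_response):
        """The WSGI entry point: dispatch on PATH_INFO."""
        path = environ.get('PATH_INFO', '/')
        headers = [('Content-Type', 'text/plain; charset=utf-8')]
        for pattern, handler in self.routes:
            match = pattern.match(path)
            if match:
                body = handler(*match.groups())
                start_response('200 OK', headers)
                return [body.encode('utf-8')]
        start_response('404 Not Found', headers)
        return [b'Not Found']


app = MicroApp()


@app.route('/(.*)')
def hello(name):
    return 'Hello, ' + (name or 'World') + '!'


def call_app(application, path):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, response_headers):
        captured['status'] = status

    body = b''.join(application(environ, start_response))
    return captured['status'], body.decode('utf-8')
```

To serve it for real, the standard library's wsgiref.simple_server.make_server('', 8000, app).serve_forever() should be enough for local experiments.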
doctest

    def cube(x):
        """
        >>> cube(10)
        1000
        """
        return x * x * x

    def _test():
        import doctest
        doctest.testmod()

    if __name__ == "__main__":
        _test()
nose
http://somethingaboutorange.com/mrl/projects/nose/

    from cube import cube

    def test_cube():
        result = cube(10)
        assert result == 1000
numpy

    >>> from numpy import *
    >>> A = arange(4).reshape(2, 2)
    >>> A
    array([[0, 1],
           [2, 3]])
    >>> dot(A, A.T)
    array([[ 1,  3],
           [ 3, 13]])

http://numpy.scipy.org/
ipython

    $ ipython -pylab
    In [1]: X = frange(0, 10, 0.1)
    In [2]: Y = [sin(x) for x in X]
    In [3]: plot(X, Y)

http://numpy.scipy.org/
virtualenv
Creates a clean, isolated Python environment

    $ python go-pylons.py --no-site-packages mydevenv
    $ cd mydevenv
    $ source bin/activate
    (mydevenv)$ paster create -t new9 helloworld

http://virtualenv.openplans.org/
Pyrex/Cython

    cdef extern from "math.h":
        double sin(double)

    cdef double f(double x):
        return sin(x*x)
Philosophy: Pythonic
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
Chinese translation by Lai Yonghao (赖勇浩):
http://bit.ly/pyzencn
优美胜于丑陋
明了胜于晦涩
简洁胜于复杂
复杂胜于凌乱
扁平胜于嵌套
间隔胜于紧凑
可读性很重要
即便假借特例的实用性之名,也不可违背这些规则
不要包容所有错误,除非你确定需要这样做
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
当存在多种可能,不要尝试去猜测
而是尽量找一种,最好是唯一一种明显的解决方案
虽然这并不容易,因为你不是 Python 之父
做也许好过不做,但不假思索就动手还不如不做
如果你无法向人描述你的方案,那肯定不是一个好方案;反之亦然
命名空间是一种绝妙的理念,我们应当多加利用
Simple is better than complex

Java:

    class HelloWorld {
        public static void main(String args[]) {
            System.out.println("Hello World!");
        }
    }

Python:

    print "Hello World!"
Readability counts
• Mandatory block indentation; no {} and no end
• No cryptic characters (except "@" for decorators)

    if limit is not None and len(ids) > limit:
        ids = random.sample(ids, limit)
TOOWTDI
• There (should be) Only One Way To Do It.
• vs. Perlish TIMTOWTDI (There Is More Than One Way To Do It)

    a = [1, 2, 3, 4, 5]

    # index-based loop
    b = []
    for i in range(len(a)):
        b.append(a[i]*2)

    # iterate over the elements directly
    b = []
    for x in a:
        b.append(x*2)

    # list comprehension: the version the slides end on
    b = [x*2 for x in a]
http://twitter.com/hongqn/status/9883515681
http://twitter.com/robbinfan/status/9879724095
Seeing is believing
Python: http://www.flickr.com/photos/nicksieger/281055485/
C: http://www.flickr.com/photos/nicksieger/281055530/
The pictures speak for themselves
Ruby: http://www.flickr.com/photos/nicksieger/280661836/
Java: http://www.flickr.com/photos/nicksieger/280662707/
Using Python's language features to simplify development
Case 0
• Keep the default configuration in svn; developer and production environments override it only where needed
• The configuration needs composite data structures (e.g. lists)
• Multiple config files, merged automatically at deploy time?
• Write a parser for a custom config file format?
config.py:

    MEMCACHED_ADDR = ['localhost:11211']

    from local_config import *

local_config.py:

    MEMCACHED_ADDR = [
        'frodo:11211',
        'sam:11211',
        'pippin:11211',
        'merry:11211',
    ]

When the file name does not end in .py, exec can be used instead.
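The exec variant mentioned above could look roughly like the following Python 3 sketch. The DEFAULTS dict, the load_config helper, and the .conf file name are assumptions made for illustration; the talk does not show this code.

```python
import os
import tempfile

# Default settings, as they would be checked in to version control.
DEFAULTS = {'MEMCACHED_ADDR': ['localhost:11211'], 'DEBUG': False}


def load_config(local_path):
    """Return the defaults overlaid with whatever the local file assigns."""
    config = dict(DEFAULTS)
    if os.path.exists(local_path):
        with open(local_path) as f:
            source = f.read()
        # Run the override file with `config` as its global namespace, so
        # plain assignments in the file replace the default values.
        exec(compile(source, local_path, 'exec'), config)
        config.pop('__builtins__', None)  # exec adds this key implicitly
    return config


def demo():
    """Write a throwaway override file and load it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'local_config.conf')
        with open(path, 'w') as f:
            f.write("MEMCACHED_ADDR = ['frodo:11211', 'sam:11211']\n")
        return load_config(path)
```

Because the override file is executed as Python, it should only ever come from a trusted source such as the deployment itself.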
Case 1
• Some pages can only be accessed with a specific permission
    class GroupUI(object):
        def new_topic(self, request):
            if self.group.can_post(request.user):
                return new_topic_ui(self.group)
            else:
                request.response.set_status(403, "Forbidden")
                return error_403_ui(msg="You must be a group member to post")

        def join(self, request):
            if self.group.can_join(request.user):
                ...

    class Group(object):
        def can_post(self, user):
            return self.has_member(user)

        def can_join(self, user):
            return not self.has_banned(user)
    class GroupUI(object):
        @check_permission('post', msg="You must be a group member to post")
        def new_topic(self, request):
            return new_topic_ui(self.group)

        @check_permission('join', msg="You cannot join this group")
        def join(self, request):
            ...

    class Group(object):
        def can_post(self, user):
            return self.has_member(user)

        def can_join(self, user):
            return not self.has_banned(user)
decorator

    def print_before_exec(func):
        def _(*args, **kwargs):
            print "decorated"
            return func(*args, **kwargs)
        return _

    @print_before_exec
    def double(x):
        print x*2

    double(10)

Output:

    decorated
    20
    class check_permission(object):
        def __init__(self, action, msg=None):
            self.action = action
            self.msg = msg

        def __call__(self, func):
            def _(ui, req, *args, **kwargs):
                # look up can_<action> on the object that owns the permission
                f = getattr(ui.perm_obj, 'can_' + self.action)
                if f(req.user):
                    return func(ui, req, *args, **kwargs)
                raise BadPermission(ui.perm_obj, self.action, self.msg)
            return _
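For readers who want to run the idea end to end, here is a self-contained Python 3 sketch of the same decorator. BadPermission, Group, Request, and the perm_obj wiring are simplified stand-ins invented for this example; Douban's real classes are not shown in the talk.

```python
import functools


class BadPermission(Exception):
    """Raised when the current user may not perform an action (hypothetical)."""

    def __init__(self, obj, action, msg=None):
        super().__init__(msg or 'permission denied: ' + action)
        self.obj = obj
        self.action = action


class check_permission:
    """Decorator: call perm_obj.can_<action>(user) before the wrapped method."""

    def __init__(self, action, msg=None):
        self.action = action
        self.msg = msg

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(ui, request, *args, **kwargs):
            checker = getattr(ui.perm_obj, 'can_' + self.action)
            if checker(request.user):
                return func(ui, request, *args, **kwargs)
            raise BadPermission(ui.perm_obj, self.action, self.msg)
        return wrapper


class Group:
    def __init__(self, members):
        self.members = set(members)

    def can_post(self, user):
        return user in self.members


class Request:
    def __init__(self, user):
        self.user = user


class GroupUI:
    def __init__(self, group):
        self.perm_obj = group  # the object whose can_* methods are consulted

    @check_permission('post', msg='You must be a group member to post')
    def new_topic(self, request):
        return 'new topic form'
```

A single exception handler near the top of the request cycle can then turn BadPermission into a 403 page, which is what removes the repeated if/else from every view method.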
Case 2
• Call functions asynchronously through a message queue
    def send_notification_mail(email, subject, body):
        msg = MSG_SEND_MAIL + '\0' + email + '\0' + subject + '\0' + body
        mq.put(msg)

    def async_worker():
        msg = mq.get()
        msg = msg.split('\0')
        cmd = msg[0]
        if cmd == MSG_SEND_MAIL:
            email, subject, body = msg[1:]
            fromaddr = '[email protected]'
            email_body = make_email_body(fromaddr, email, subject, body)
            smtp = smtplib.SMTP('mail')
            smtp.sendmail(fromaddr, email, email_body)
        elif cmd == MSG_xxxx:
            ...
        elif cmd == MSG_yyyy:
            ...
    @async
    def send_notification_mail(email, subject, body):
        fromaddr = '[email protected]'
        email_body = make_email_body(fromaddr, email, subject, body)
        smtp = smtplib.SMTP('mail')
        smtp.sendmail(fromaddr, email, email_body)
    import sys
    import cPickle

    def async(func):
        mod = sys.modules[func.__module__]
        fname = 'origin_' + func.__name__
        # keep the original function reachable under a new module-level name
        setattr(mod, fname, func)
        def _(*a, **kw):
            body = cPickle.dumps((mod.__name__, fname, a, kw))
            mq.put(body)
        return _

    def async_worker():
        modname, fname, a, kw = cPickle.loads(mq.get())
        __import__(modname)
        mod = sys.modules[modname]
        getattr(mod, fname)(*a, **kw)
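A runnable Python 3 sketch of the same pattern follows. Two things differ from the slide and are assumptions of this example: the decorator is called run_async, because async is a reserved word in modern Python, and the original functions are kept in a plain registry dict rather than being re-attached to their module. An in-process queue.Queue stands in for the real message queue.

```python
import pickle
import queue

mq = queue.Queue()   # stand-in for a real message queue
_registry = {}       # task name -> original (undecorated) function
sent = []            # records what the demo task did


def run_async(func):
    """Replace func with a stub that enqueues the call instead of running it."""
    name = f'{func.__module__}.{func.__qualname__}'
    _registry[name] = func

    def enqueue(*args, **kwargs):
        # Only the task name and its arguments travel through the queue.
        mq.put(pickle.dumps((name, args, kwargs)))
    return enqueue


def async_worker():
    """Take one message off the queue and run the original function."""
    name, args, kwargs = pickle.loads(mq.get())
    return _registry[name](*args, **kwargs)


@run_async
def send_notification_mail(email, subject, body):
    sent.append((email, subject, body))  # a real task would talk to SMTP here
```

Calling send_notification_mail(...) returns immediately after enqueueing; the work happens only when a worker process calls async_worker(). In a multi-process setup the worker must import the modules that define the tasks so the registry is populated, which is the job the __import__ call does on the slide.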
Case 3
• Cache function results (SQL queries, expensive computations, etc.)
    def get_latest_review_id():
        review_id = mc.get('latest_review_id')
        if review_id is None:
            review_id = exc_sql("select max(id) from review")
            mc.set('latest_review_id', review_id)
        return review_id
    @cache('latest_review_id')
    def get_latest_review_id():
        return exc_sql("select max(id) from review")
    def cache(key):
        def deco(func):
            def _(*args, **kwargs):
                r = mc.get(key)
                if r is None:
                    r = func(*args, **kwargs)
                    mc.set(key, r)
                return r
            return _
        return deco
    def get_review(id):
        key = 'review:%s' % id
        review = mc.get(key)
        if review is None:  # cache miss
            id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
            review = Review(id, author_id, text)
            mc.set(key, review)
        return review

What if the cache key needs to be generated dynamically?
How do we write the decorator when the cache key must be generated dynamically?

    @cache('review:{id}')
    def get_review(id):
        id, author_id, text = exc_sql("select id, author_id, text from review where id=%s", id)
        return Review(id, author_id, text)
    import inspect

    def cache(key_pattern, expire=0):
        def deco(f):
            arg_names, varargs, varkw, defaults = inspect.getargspec(f)
            if varargs or varkw:
                raise Exception("varargs are not supported")
            gen_key = gen_key_factory(key_pattern, arg_names, defaults)

            def _(*a, **kw):
                key = gen_key(*a, **kw)
                r = mc.get(key)
                if r is None:
                    r = f(*a, **kw)
                    mc.set(key, r, expire)
                return r
            return _
        return deco
inspect.getargspec

    >>> import inspect
    >>> def f(a, b=1, c=2):
    ...     pass
    ...
    >>> inspect.getargspec(f)
    ArgSpec(args=['a', 'b', 'c'], varargs=None, keywords=None, defaults=(1, 2))
    >>> def f(a, b=1, c=2, *args, **kwargs):
    ...     pass
    ...
    >>> inspect.getargspec(f)
    ArgSpec(args=['a', 'b', 'c'], varargs='args', keywords='kwargs', defaults=(1, 2))
hint:
• str.format in Python 2.6: '{id}'.format(id=1) => '1'
• dict(zip(['a', 'b', 'c'], [1, 2, 3])) => {'a': 1, 'b': 2, 'c': 3}
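The slides leave gen_key_factory as an exercise. One possible implementation built from the str.format and dict(zip(...)) hints is sketched below in Python 3. It is an assumption about how the helper could work, not Douban's actual code; it uses inspect.getfullargspec (the modern replacement for getargspec), a plain dict as a stand-in for the memcached client, and omits the expire parameter.

```python
import inspect

mc = {}  # stand-in for a memcached client, for illustration only


def gen_key_factory(key_pattern, arg_names, defaults):
    """Build a function mapping call arguments to a concrete cache key."""
    defaults = defaults or ()
    # Defaults belong to the last len(defaults) argument names.
    first_default = len(arg_names) - len(defaults)
    default_map = dict(zip(arg_names[first_default:], defaults))

    def gen_key(*args, **kwargs):
        values = dict(default_map)
        values.update(zip(arg_names, args))  # positional arguments
        values.update(kwargs)                # keyword arguments
        return key_pattern.format(**values)
    return gen_key


def cache(key_pattern):
    def deco(func):
        spec = inspect.getfullargspec(func)
        if spec.varargs or spec.varkw:
            raise TypeError('*args/**kwargs are not supported')
        gen_key = gen_key_factory(key_pattern, spec.args, spec.defaults)

        def wrapper(*args, **kwargs):
            key = gen_key(*args, **kwargs)
            result = mc.get(key)
            if result is None:  # cache miss
                result = func(*args, **kwargs)
                mc[key] = result
            return result
        return wrapper
    return deco


calls = []


@cache('review:{id}')
def get_review(id, lang='en'):
    calls.append(id)  # lets us see how often the real function runs
    return {'id': id, 'lang': lang}
```

Rejecting *args and **kwargs keeps the mapping from arguments to names unambiguous, which is presumably why the slide's decorator raises in that case.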
Case 4
• A feed reader shows entries from several feeds at once, merged and sorted by entry_id.
    class Feed(object):
        def get_entries(self, limit=10):
            ids = exc_sqls("select id from entry where feed_id=%s "
                           "order by id desc limit %s", (self.id, limit))
            return [Entry.get(id) for id in ids]

    class FeedCollection(object):
        def get_entries(self, limit=10):
            mixed_entries = []
            for feed in self.feeds:
                entries = feed.get_entries(limit=limit)
                mixed_entries += entries
            mixed_entries.sort(key=lambda e: e.id, reverse=True)
            return mixed_entries[:limit]

Rows queried from the database = len(self.feeds) * limit
Wasted Entry.get calls = (len(self.feeds) - 1) * limit
iterator and generator

    def fib():
        x, y = 1, 1
        while True:
            yield x
            x, y = y, x+y

    def odd(seq):
        return (n for n in seq if n%2)

    def less_than(seq, upper_limit):
        for number in seq:
            if number >= upper_limit:
                break
            yield number

    print sum(odd(less_than(fib(), 4000000)))
itertools
• count([n]) --> n, n+1, n+2, ...
• cycle(p) --> p0, p1, ... plast, p0, p1, ...
• repeat(elem [,n]) --> elem, elem, elem, ... endless or up to n times
• izip(p, q, ...) --> (p[0], q[0]), (p[1], q[1]), ...
• islice(seq, [start,] stop [, step]) --> elements from seq[start:stop:step]
• ... and more ...
    import sys
    from itertools import islice

    class Feed(object):
        def iter_entries(self):
            start_id = sys.maxint
            while True:
                entry_ids = exc_sqls("select id from entry where feed_id=%s "
                                     "and id<%s order by id desc limit 5",
                                     (self.id, start_id))
                if not entry_ids:
                    break
                for entry_id in entry_ids:
                    yield Entry.get(entry_id)
                start_id = entry_ids[-1]

    class FeedCollection(object):
        def iter_entries(self):
            return imerge(*[feed.iter_entries() for feed in self.feeds])

        def get_entries(self, limit=10):
            return list(islice(self.iter_entries(), limit))

Rows queried from the database = len(self.feeds) * 5 to len(self.feeds) * 5 + limit - 5
Wasted Entry.get calls = 0
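The imerge helper is not defined on the slides. In Python 3.5 and later, heapq.merge with reverse=True can play that role, as in the sketch below. The in-memory ENTRY_IDS table, the batch size, and the fetched list are illustrative assumptions standing in for the database and for Entry.get.

```python
import heapq
from itertools import islice

# Fake "entry" table: feed_id -> entry ids, newest first (illustrative data).
ENTRY_IDS = {
    'a': [90, 70, 50, 30, 10],
    'b': [80, 60, 40, 20],
    'c': [100, 5],
}
fetched = []  # ids that were actually loaded, to show how lazy the merge is


def iter_entries(feed_id, batch=2):
    """Yield entry ids of one feed, newest first, loading `batch` at a time."""
    ids = ENTRY_IDS[feed_id]
    for start in range(0, len(ids), batch):
        for entry_id in ids[start:start + batch]:
            fetched.append(entry_id)  # stands in for Entry.get(entry_id)
            yield entry_id


def imerge(*iterables):
    """Lazily merge descending iterables into one descending stream."""
    return heapq.merge(*iterables, reverse=True)


def get_entries(feed_ids, limit=10):
    merged = imerge(*[iter_entries(f) for f in feed_ids])
    return list(islice(merged, limit))
```

Because both the per-feed generators and the merge are lazy, asking for the newest four entries touches only a few items per feed instead of loading `limit` entries from every feed up front.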
Decorators and generators are powerful tools for simplifying code
Case 5
• Optimize the deserialization time of immutable objects
    class User(object):
        def __init__(self, id, username, screen_name, sig):
            self.id = id
            self.username = username
            self.screen_name = screen_name
            self.sig = sig

    user = User('1002211', 'hongqn', 'hongqn', "巴巴布、巴巴布巴布巴布!")
cPickle vs. marshal

    $ python -m timeit -s '
    > from user import user
    > from cPickle import dumps, loads
    > s = dumps(user, 2)' \
    > 'loads(s)'
    100000 loops, best of 3: 6.6 usec per loop

    $ python -m timeit -s '
    > from user import user
    > from marshal import dumps, loads
    > d = (user.id, user.username, user.screen_name, user.sig)
    > s = dumps(d, 2)' 'loads(s)'
    1000000 loops, best of 3: 0.9 usec per loop

7x speedup
cPickle vs. marshal

    $ python -c '
    > import cPickle, marshal
    > from user import user
    > print "pickle:", len(cPickle.dumps(user, 2))
    > print "marshal:", len(marshal.dumps((user.id, \
    >     user.username, user.screen_name, user.sig), 2))'
    pickle: 129
    marshal: 74

43% space saving
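The comparison above can be reproduced on a current interpreter with a small Python 3 script such as the following sketch. The helper names to_marshal, from_marshal, and compare are assumptions of this example, and the numbers it reports will differ from the 2010 figures on the slides.

```python
import marshal
import pickle
import timeit


class User:
    def __init__(self, id, username, screen_name, sig):
        self.id = id
        self.username = username
        self.screen_name = screen_name
        self.sig = sig


def to_marshal(user):
    """Serialize only the field values, as a plain tuple."""
    return marshal.dumps((user.id, user.username, user.screen_name, user.sig))


def from_marshal(data):
    """Rebuild a User from bytes produced by to_marshal."""
    return User(*marshal.loads(data))


def compare(user, number=10000):
    """Return (pickle_seconds, marshal_seconds, pickle_size, marshal_size)."""
    pickled = pickle.dumps(user, 2)
    marshaled = to_marshal(user)
    t_pickle = timeit.timeit(lambda: pickle.loads(pickled), number=number)
    t_marshal = timeit.timeit(lambda: marshal.loads(marshaled), number=number)
    return t_pickle, t_marshal, len(pickled), len(marshaled)
```

The trade-off is that marshal handles only core built-in types and its format may change between Python versions, so this kind of shortcut suits short-lived cache entries better than long-term storage.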
namedtuple

    from collections import namedtuple

    User = namedtuple('User', 'id username screen_name sig')

    user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")

    user.username
    -> 'hongqn'
__metaclass__

    class User(tuple):
        __metaclass__ = NamedTupleMetaClass
        __attrs__ = ['id', 'username', 'screen_name', 'sig']

    user = User('1002211', 'hongqn', 'hongqn', sig="巴巴布、巴巴布巴布巴布!")

    s = marshal.dumps(user.__marshal__())
    User.__load_marshal__(marshal.loads(s))
    from operator import itemgetter

    class NamedTupleMetaClass(type):
        def __new__(mcs, name, bases, dict):
            assert bases == (tuple,)
            attrs = dict['__attrs__']
            for i, a in enumerate(attrs):
                dict[a] = property(itemgetter(i))
            dict['__slots__'] = ()
            # a bare `tuple` here would not bind to the instance
            dict['__marshal__'] = lambda self: tuple(self)
            dict['__load_marshal__'] = classmethod(tuple.__new__)
            dict['__getnewargs__'] = lambda self: tuple(self)
            # generate __new__ with the attribute names as its parameters
            argtxt = repr(tuple(attrs)).replace("'", "")[1:-1]
            template = ("def newfunc(cls, %s):\n"
                        "    return tuple.__new__(cls, (%s))" % (argtxt, argtxt))
            namespace = {}
            exec template in namespace
            dict['__new__'] = namespace['newfunc']
            return type.__new__(mcs, name, bases, dict)
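For comparison, the same idea can be written in Python 3 without generating source code through exec. The sketch below is an adaptation under stated assumptions, not the author's implementation: it uses the metaclass= keyword, builds __new__ as an ordinary closure, and renames the helpers to to_marshal and from_marshal.

```python
import marshal
from operator import itemgetter


class NamedTupleMeta(type):
    """Turn `__attrs__` into read-only, index-backed properties on a tuple."""

    def __new__(mcs, name, bases, namespace):
        assert bases == (tuple,)
        attrs = tuple(namespace['__attrs__'])
        for index, attr in enumerate(attrs):
            namespace[attr] = property(itemgetter(index))
        namespace['__slots__'] = ()  # no per-instance __dict__

        def new(cls, *args, **kwargs):
            # Accept positional and keyword arguments, like namedtuple does.
            values = list(args)
            for attr in attrs[len(args):]:
                values.append(kwargs.pop(attr))
            if kwargs or len(values) != len(attrs):
                raise TypeError('wrong arguments for ' + name)
            return tuple.__new__(cls, values)

        namespace['__new__'] = new
        namespace['to_marshal'] = lambda self: marshal.dumps(tuple(self))
        namespace['from_marshal'] = classmethod(
            lambda cls, data: tuple.__new__(cls, marshal.loads(data)))
        return super().__new__(mcs, name, bases, namespace)


class User(tuple, metaclass=NamedTupleMeta):
    __attrs__ = ['id', 'username', 'screen_name', 'sig']
```

Instances behave like tuples, so deserialization is a single marshal.loads call followed by tuple.__new__, with no per-attribute assignment.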
Warning!
Case 6
• Simplify calls to request.get_environ(key)
• e.g. request.get_environ('REMOTE_ADDR') --> request.remote_addr
descriptor
• An object with a __get__, __set__, or __delete__ method

    class Descriptor(object):
        def __get__(self, instance, owner):
            return 'descriptor'

    class Owner(object):
        attr = Descriptor()

    owner = Owner()
    owner.attr  -->  'descriptor'
Common descriptors
• classmethod
• staticmethod
• property

    class C(object):
        def get_x(self):
            return self._x
        def set_x(self, x):
            self._x = x
        x = property(get_x, set_x)
    class environ_getter(object):
        def __init__(self, key, default=None):
            self.key = key
            self.default = default

        def __get__(self, obj, objtype):
            if obj is None:
                return self
            return obj.get_environ(self.key, self.default)

    class HTTPRequest(quixote.http_request.HTTPRequest):
        # locals() inside a class body is the class namespace being built
        for key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME',
                    'REQUEST_URI', 'HTTP_HOST']:
            locals()[key.lower()] = environ_getter(key)
        del key
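The descriptor can be exercised without Quixote using a small stand-in request class, as in this Python 3 sketch. FakeRequest is hypothetical, and the descriptors are attached with setattr after the class is created instead of through locals(), which is a deliberately more conservative variant of the trick on the slide.

```python
class environ_getter:
    """Descriptor exposing one environ key as a read-only attribute."""

    def __init__(self, key, default=None):
        self.key = key
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:  # accessed on the class itself
            return self
        return obj.get_environ(self.key, self.default)


class FakeRequest:
    """Hypothetical request class; only get_environ mirrors the slide."""

    def __init__(self, environ):
        self.environ = environ

    def get_environ(self, key, default=None):
        return self.environ.get(key, default)


# Attach one descriptor per environ key once the class exists.
for _key in ['HTTP_REFERER', 'REMOTE_ADDR', 'SERVER_NAME']:
    setattr(FakeRequest, _key.lower(), environ_getter(_key))
del _key
```

With this in place, FakeRequest({'REMOTE_ADDR': '10.0.0.1'}).remote_addr reads straight from the environ dict, and missing keys fall back to the descriptor's default.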
Case 7
• Make urllib.urlopen automatically use a SOCKS proxy to get past the firewall
Monkey Patch

    import httplib

    orig_connect = httplib.HTTPConnection.connect

    def _patched_connect(self):
        if HOSTS_BLOCKED.match(self.host):
            return _connect_via_socks_proxy(self)
        else:
            return orig_connect(self)

    def _connect_via_socks_proxy(self):
        ...

    httplib.HTTPConnection.connect = _patched_connect
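The patching mechanics can be tried safely in Python 3, where httplib has become http.client. In the sketch below the proxy connection is only a placeholder that records the host, the HOSTS_BLOCKED pattern is a made-up example, and the patch is wrapped in a context manager so it can be undone; all of these are assumptions added for illustration.

```python
import http.client
import re
from contextlib import contextmanager

HOSTS_BLOCKED = re.compile(r'(^|\.)blocked\.example$')  # hypothetical pattern
proxied_hosts = []  # records which connections were diverted


def _connect_via_proxy(conn):
    """Placeholder: a real version would open a socket through the proxy."""
    proxied_hosts.append(conn.host)


@contextmanager
def patched_connect():
    """Temporarily replace HTTPConnection.connect, restoring it afterwards."""
    original = http.client.HTTPConnection.connect

    def _patched_connect(self):
        if HOSTS_BLOCKED.search(self.host):
            return _connect_via_proxy(self)
        return original(self)

    http.client.HTTPConnection.connect = _patched_connect
    try:
        yield
    finally:
        http.client.HTTPConnection.connect = original
```

Scoping the patch with a context manager is a design choice of this sketch; the slide applies the patch globally at import time, which affects every caller in the process.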
Things to watch out for when using Python
• Pythonic!
• Avoid gotchas http://www.ferg.org/projects/python_gotchas.html
• Unicode / Character Encoding
• GIL (Global Interpreter Lock)
• Garbage Collection
Development environment
• Editors: Vim / Emacs / Ulipad
• Version control: Subversion / Mercurial / Git
• Wiki / issue tracking / code browsing: Trac
• Continuous integration: Bitten
Python Implementations
• CPython http://www.python.org/
• Unladen Swallow http://code.google.com/p/unladen-swallow/
• Stackless Python http://www.stackless.com/
• IronPython http://ironpython.net/
• Jython http://www.jython.org/
• PyPy http://pypy.org/
Thanks to our country, thanks to everyone
Q & A