- Gayatri Nittala
About:
- Optimization: what, when and where?
- General optimization strategies
- Python-specific optimizations
- Profiling
Golden Rule:
"First make it work. Then make it right.
Then make it fast!" - Kent Beck
The process:
1. Get it right.
2. Test that it's right.
3. Profile if slow.
4. Optimize.
5. Repeat from 2.
Supporting tools: test suites, source control
Make it right & pretty!
Good Programming :-)
- General optimization strategies
- Python optimizations
Optimization:
- Aims to improve the result, not to make it perfect
- Programming with performance tips
When to start?
Need for optimization
- Are you sure you need to do it at all?
- Is your code really so bad?
- Benchmarking: fast enough vs. faster
Time for optimization
- Is it worth the time to tune it?
- How much time is going to be spent running that code?
Cost of optimization
- costly developer time
- addition of new features
- new bugs in algorithms
- speed vs. space
Optimize only if necessary!
Where to start?
Are you sure you're done coding? Otherwise you are frosting a half-baked cake.
"Premature optimization is the root of all evil!"
- Don Knuth
Working, well-architected code is always a must
General strategies
- Algorithms: the big-O notation
- Architecture
- Choice of data structures
- LRU techniques
- Move loop-invariant code out of loops
- Nested loops
- try...except instead of if...else
- Multithreading for I/O-bound code
- DBMS instead of flat files
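Two of the strategies above, sketched in Python 3 as an illustration rather than as code from the original slides. functools.lru_cache is the standard-library LRU cache; the function names slow_square, scale_naive and scale_hoisted are hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=128)  # LRU technique: keep the 128 most recently used results
def slow_square(n):
    """Stand-in for an expensive computation (hypothetical example)."""
    return n * n


def scale_naive(values, angle):
    # math.cos(angle) never changes inside the loop, yet is recomputed each time
    return [v * math.cos(angle) for v in values]


def scale_hoisted(values, angle):
    # Loop-invariant code moved out of the loop: compute the factor once
    factor = math.cos(angle)
    return [v * factor for v in values]


for n in [3, 4, 3, 3, 4]:
    slow_square(n)

# Only the two distinct inputs ran the function body; the rest were cache hits
print(slow_square.cache_info())
print(scale_naive([1.0, 2.0], 0.5) == scale_hoisted([1.0, 2.0], 0.5))
```

Both techniques leave the results unchanged; they only remove repeated work, which is why they are safe first steps once a profiler has pointed at the loop.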
Big-O: the boss!
- Describes the performance of an algorithm as a function of N, the input size to the algorithm
- O(1): constant time
- O(log n): logarithmic
- O(n): linear
- O(n^2): quadratic
Common big-O's

Order        Said to be "... time"   Examples
-----------  ----------------------  ----------------------------------------
O(1)         constant                key in dict, dict[key] = value,
                                     list.append(item)
O(log n)     logarithmic             binary search
O(n)         linear                  item in sequence, str.join(list)
O(n log n)                           list.sort()
O(n^2)       quadratic               nested loops (with constant-time bodies)
Note the notation

O(N^2):

    def slow(it):
        result = []
        for item in it:
            result.insert(0, item)
        return result

O(N):

    def fast(it):
        result = []
        for item in it:
            result.append(item)
        result.reverse()
        return result

In fast(), the append loop can also be written as result = list(it).
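A minimal Python 3 sketch that checks the two versions agree and times them with timeit. The input size and the number of timing runs are arbitrary choices for illustration, and fast() here uses the list(it) shortcut.

```python
import timeit


def slow(it):
    result = []
    for item in it:
        result.insert(0, item)   # O(N) per insert: shifts every existing element
    return result


def fast(it):
    result = list(it)            # O(N) copy of the input
    result.reverse()             # O(N) in-place reversal
    return result


data = range(5000)
assert slow(data) == fast(data)  # same output, different cost

t_slow = timeit.timeit(lambda: slow(data), number=5)
t_fast = timeit.timeit(lambda: fast(data), number=5)
print(f"slow: {t_slow:.4f}s  fast: {t_fast:.4f}s")
```

Doubling the input size should roughly quadruple the time for slow() but only double it for fast(), which is the practical meaning of O(N^2) vs. O(N).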
Big-O's of Python building blocks
- lists: vectors
- dictionaries: hash tables
- sets: hash tables
Let L be any list, T any string (plain or Unicode), D any dict, and S any set, with (say) numbers as items (with O(1) hashing and comparison), and x any number:

O(1): len(L), len(T), len(D), len(S), L[i], T[i], D[i], del D[i], if x in D, if x in S, S.add(x), S.remove(x), additions or removals to/from the right end of L
O(N): loops on L, T, D, S; general additions or removals to/from L (not at the right end); all methods on T; if x in L; if x in T; most methods on L; all shallow copies

O(N log N): L.sort() in general (but O(N) if L is already nearly sorted or reverse-sorted)
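The claim about nearly sorted input can be observed by counting comparisons. This is a sketch, not part of the original slides: the Counted wrapper class and count_comparisons helper are hypothetical, and exact counts depend on the CPython version.

```python
import random


class Counted:
    """Wraps a value and counts every '<' comparison made during sorting."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def count_comparisons(values):
    """Sort a wrapped copy of values; return how many comparisons list.sort() made."""
    Counted.comparisons = 0
    items = [Counted(v) for v in values]
    items.sort()
    return Counted.comparisons


n = 1000
ordered = list(range(n))
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)   # fixed seed so runs are repeatable

print("sorted input:  ", count_comparisons(ordered))
print("reversed input:", count_comparisons(ordered[::-1]))
print("shuffled input:", count_comparisons(shuffled))
```

For sorted and reverse-sorted input the count stays close to N, while the shuffled list needs several times more comparisons.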
Right Data Structure
- lists, sets, dicts, tuples
- collections: deque, defaultdict, namedtuple
Choose them based on the functionality:
- search an element in a sequence
- append
- intersection
- remove from middle
- dictionary initializations
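Membership test, list vs. set:

    my_list = list(range(n))
    n in my_list             # O(N) scan

    my_set = set(range(n))
    n in my_set              # O(1) hash lookup

Deleting a slice, list vs. deque:

    my_list[start:end] = []

    my_deque.rotate(-end)
    for counter in range(end - start):
        my_deque.pop()
    my_deque.rotate(start)   # restore the original order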
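A runnable Python 3 sketch of the two comparisons above. The sizes are arbitrary, delete_slice is a hypothetical helper name, and it assumes 0 <= start <= end <= len(d).

```python
import timeit
from collections import deque

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Membership: the list is scanned item by item, the set is one hash lookup.
missing = -1  # worst case for the list: every element gets compared
t_list = timeit.timeit(lambda: missing in as_list, number=20)
t_set = timeit.timeit(lambda: missing in as_set, number=20)
print(f"in list: {t_list:.5f}s  in set: {t_set:.5f}s")


def delete_slice(d, start, end):
    """Remove the items at positions start..end-1 from deque d, in place."""
    d.rotate(-end)               # bring the slice to the right end
    for _ in range(end - start):
        d.pop()                  # O(1) removal from the right end
    d.rotate(start)              # put the remaining items back in order


my_deque = deque(range(6))
delete_slice(my_deque, 2, 4)
print(list(my_deque))            # [0, 1, 4, 5]
```

The final rotate(start) matters: without it the deque keeps the right elements but in rotated order.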
Grouping values by key:

    s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

With defaultdict:

    from collections import defaultdict
    d = defaultdict(list)
    for k, v in s:
        d[k].append(v)
    sorted(d.items())
    # [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

With dict.setdefault:

    d = {}
    for k, v in s:
        d.setdefault(k, []).append(v)
    sorted(d.items())
    # [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
Python Performance Tips
- built-in modules
- string concatenation
- lookups and local variables
- dictionary initialization
- dictionary lookups
- import statements
- loops
Built-ins
- Highly optimized
- Example: sort a list of tuples by its n-th field

Hand-written:

    def sortby(somelist, n):
        nlist = [(x[n], x) for x in somelist]
        nlist.sort()
        return [val for (key, val) in nlist]

With built-ins:

    import operator
    n = 1
    somelist.sort(key=operator.itemgetter(n))
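A small Python 3 sketch with made-up sample data, showing that the two approaches sort on the same field but can order ties differently. The decorated version falls back to comparing the whole tuple, while key= keeps the original order of equal keys.

```python
import operator

rows = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 2)]  # sample data


def sortby(somelist, n):
    """Decorate-sort-undecorate: sort on (field, whole item) pairs."""
    nlist = [(x[n], x) for x in somelist]
    nlist.sort()
    return [val for (key, val) in nlist]


by_dsu = sortby(rows, 1)
by_key = sorted(rows, key=operator.itemgetter(1))   # stable sort on field 1 only

print(by_dsu)   # [('apple', 1), ('kiwi', 2), ('fig', 3), ('pear', 3)]
print(by_key)   # [('apple', 1), ('kiwi', 2), ('pear', 3), ('fig', 3)]
```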
String Concatenation

Instead of:

    s = ""
    for substring in list:
        s += substring

use:

    s = "".join(list)

Instead of:

    out = "<html>" + head + prologue + query + tail + "</html>"

use:

    out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)

or:

    out = "<html>%(head)s%(prologue)s%(query)s%(tail)s</html>" % locals()
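The same idioms as a runnable Python 3 sketch. The sample strings are placeholders, and the f-string line is a Python 3.6+ alternative that the original slides do not mention.

```python
parts = [str(i) for i in range(1000)]

# Repeated += may copy the growing string on each step.
s_loop = ""
for piece in parts:
    s_loop += piece

# join works out the final size once and copies each piece once.
s_join = "".join(parts)
assert s_loop == s_join

# Placeholder page fragments, for illustration only
head, prologue, query, tail = "<head/>", "<p>intro</p>", "<p>q</p>", "<p>bye</p>"

out_percent = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
out_fstring = f"<html>{head}{prologue}{query}{tail}</html>"
assert out_percent == out_fstring
print(out_fstring)
```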
Searching:
- using 'in'
    - O(1) if the right-hand side is a set or dictionary
    - O(N) if the right-hand side is a string, list or tuple
- using 'hasattr'
    - the cost differs between the case where the searched value is an attribute and the case where it is not
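One way to measure the attribute case is to time the common lookup styles for an attribute that exists and one that does not. This is a Python 3 sketch; the Config class and the three helper functions are hypothetical, and the timings will vary by machine and interpreter version.

```python
import timeit


class Config:
    debug = True


cfg = Config()


def with_hasattr(obj, name):
    # Look before you leap: one check, then one fetch
    return getattr(obj, name) if hasattr(obj, name) else None


def with_try(obj, name):
    # Cheap when the attribute exists, pays for the exception when it does not
    try:
        return getattr(obj, name)
    except AttributeError:
        return None


def with_default(obj, name):
    # getattr with a default handles the missing case internally
    return getattr(obj, name, None)


for name in ("debug", "missing"):
    for func in (with_hasattr, with_try, with_default):
        t = timeit.timeit(lambda: func(cfg, name), number=100_000)
        print(f"{name:8s} {func.__name__:13s} {t:.4f}s")
```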
Loops:
- list comprehensions
- map() acts as a for loop moved into C, which helps when the body of the loop is a function call

    newlist = []
    for word in oldlist:
        newlist.append(word.upper())

    newlist = [s.upper() for s in oldlist]

    newlist = map(str.upper, oldlist)
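The three forms side by side in Python 3, with sample data added for illustration. One difference from the Python 2 code on the slide: map() now returns a lazy iterator, so it has to be wrapped in list() to get a list back.

```python
oldlist = ["spam", "eggs", "ham"]   # sample data

# Explicit loop
newlist_loop = []
for word in oldlist:
    newlist_loop.append(word.upper())

# List comprehension
newlist_comp = [word.upper() for word in oldlist]

# map: lazy in Python 3, so materialize it with list()
newlist_map = list(map(str.upper, oldlist))

assert newlist_loop == newlist_comp == newlist_map
print(newlist_map)   # ['SPAM', 'EGGS', 'HAM']
```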
Lookups and local variables:
- avoid re-evaluating function references inside loops
- accessing local variables is faster than accessing global variables

    upper = str.upper
    newlist = []
    append = newlist.append
    for word in oldlist:
        append(upper(word))
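A Python 3 sketch for measuring the effect rather than assuming it. The function names plain and prebound are hypothetical, and the gap may be small on recent interpreter versions, which is one more reason to time it.

```python
import timeit

oldlist = ["word"] * 10_000   # sample data


def plain(words):
    newlist = []
    for word in words:
        # newlist.append and str.upper are looked up again on every iteration
        newlist.append(str.upper(word))
    return newlist


def prebound(words):
    upper = str.upper            # looked up once, then a fast local name
    newlist = []
    append = newlist.append      # bound method looked up once
    for word in words:
        append(upper(word))
    return newlist


assert plain(oldlist) == prebound(oldlist)
print("plain:   ", timeit.timeit(lambda: plain(oldlist), number=50))
print("prebound:", timeit.timeit(lambda: prebound(oldlist), number=50))
```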
Dictionaries
- Initialization: try...except
- Lookups: string.maketrans
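A Python 3 sketch of the initialization idiom next to two common alternatives, using a made-up word list. The slide does not say how string.maketrans was used, so the last lines only show its Python 3 spelling, str.maketrans, as an assumption about the intent.

```python
from collections import defaultdict

words = "the quick brown fox jumps over the lazy dog the end".split()

# try/except: cheap when most keys already exist
counts_try = {}
for w in words:
    try:
        counts_try[w] += 1
    except KeyError:
        counts_try[w] = 1

# dict.get with a default value
counts_get = {}
for w in words:
    counts_get[w] = counts_get.get(w, 0) + 1

# defaultdict creates the initial value on first access
counts_dd = defaultdict(int)
for w in words:
    counts_dd[w] += 1

assert counts_try == counts_get == dict(counts_dd)
print(counts_try["the"])            # 3

# A translation table replaces one dictionary lookup per character
table = str.maketrans("abc", "xyz")
print("aabbcc".translate(table))    # xxyyzz
```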
Regular expressions:
- REs are better than writing a loop
- Built-in string functions are better than REs
- Compiled REs are faster when a pattern is reused

    re.search('^[A-Za-z]+$', source)

    x = re.compile('^[A-Za-z]+$').search
    x(source)
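A Python 3 sketch of the precompiled form, with sample strings added for illustration. It also shows why the built-in str.isalpha() is not a drop-in replacement for this particular pattern.

```python
import re

PATTERN = r'^[A-Za-z]+$'
is_alpha_re = re.compile(PATTERN).search   # compile once, keep the bound method

samples = ["Python", "py3", "", "profile"]   # sample inputs

for source in samples:
    via_module = re.search(PATTERN, source) is not None
    via_compiled = is_alpha_re(source) is not None
    assert via_module == via_compiled       # same answer either way
    print(repr(source), via_compiled)

# str.isalpha() also accepts non-ASCII letters, unlike [A-Za-z]+
accented = "\u00e9"
print(accented.isalpha(), is_alpha_re(accented) is not None)   # True False
```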
Imports
- avoid import *
- import only when required (inside functions)
- lazy imports
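A minimal Python 3 sketch of an import placed inside a function. The function to_json is a hypothetical example and json merely stands in for a module that is only needed on some code paths.

```python
import sys


def to_json(data):
    # Import inside the function: the module is loaded on the first call only;
    # later calls find it in sys.modules and pay just for a lookup.
    import json
    return json.dumps(data, sort_keys=True)


print(to_json({"b": 1, "a": 2}))   # {"a": 2, "b": 1}
print("json" in sys.modules)       # True once to_json has run
```

The trade-off is that the import statement still runs on every call, so this pays off for rarely used code paths, not for functions called in a tight loop.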
exec and eval
- better to avoid
- if needed, compile once and then evaluate
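A sketch of compiling once and evaluating many times with the built-in compile() and eval(). The expression and variable name are made up for illustration.

```python
expression = "x * x + 1"   # sample expression

# Compile once: parsing and bytecode generation happen here
code = compile(expression, "<expression>", "eval")

# Evaluate many times with different values of x.
# The emptied __builtins__ only limits name lookups; it is not a security boundary.
results = [eval(code, {"__builtins__": {}}, {"x": x}) for x in range(5)]
print(results)   # [1, 2, 5, 10, 17]
```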
Summary on loop optimization (extracted from an essay by Guido):
- only optimize when there is a proven speed bottleneck
- small is beautiful
- use intrinsic operations
- avoid calling functions written in Python in your inner loop
- local variables are faster than globals
- try to use map(), filter() or reduce() to replace an explicit for loop (map with a built-in function, for loop with inline code)
- check your algorithms for quadratic behaviour
- and last but not least: collect data. Python's excellent profile module can quickly show the bottleneck in your code
Bottlenecks might be unintentional, so better not to rely on intuition!
The right answer to improve performance: use PROFILERS

Spot it right!
- Hotspots
- Fact and fake (profiler vs. programmer's intuition!)
    - Threads
    - I/O operations
    - Logging
    - Encoding and decoding
    - Lookups
- Rewrite just the hotspots!
    - Psyco/Pyrex
    - C extensions
Profilers
- timeit / time.clock (time.clock was removed in Python 3.8; time.perf_counter replaces it)
- profile / cProfile
- Visualization
    - RunSnakeRun
    - Gprof2Dot
    - PyCallGraph
timeit
- measures precisely the performance of small code snippets
- the two convenience functions, timeit and repeat:

    timeit.repeat(stmt[, setup[, timer[, repeat=3[, number=1000000]]]])
    timeit.timeit(stmt[, setup[, timer[, number=1000000]]])

- can also be used from the command line:

    python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-h] [statement ...]
    import timeit

    timeit.timeit('for i in xrange(10): oct(i)', 'gc.enable()')
    1.7195474706909972

    timeit.timeit('for i in range(10): oct(i)', 'gc.enable()')
    2.1380978155005295

    python -m timeit -n1000 -s'x=0' 'x+=1'
    1000 loops, best of 3: 0.0166 usec per loop

    python -m timeit -n1000 -s'x=0' 'x=x+1'
    1000 loops, best of 3: 0.0169 usec per loop
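The examples above are Python 2 (hence xrange). A Python 3 sketch of the same kind of measurement through the timeit API follows; the loop counts are reduced so it finishes quickly, and the printed figures will differ from the slide's.

```python
import timeit

# setup runs once per timing run; stmt runs `number` times
t_range = timeit.timeit("for i in range(10): oct(i)", setup="pass", number=100_000)
print(f"range loop: {t_range:.4f}s for 100000 runs")

# repeat() returns one total per run; the minimum is the least disturbed figure
runs = timeit.repeat("x += 1", setup="x = 0", repeat=3, number=100_000)
usec_per_loop = min(runs) / 100_000 * 1e6
print(f"x += 1: best of {len(runs)}: {usec_per_loop:.4f} usec per loop")

# A callable can be timed directly instead of a string
total = timeit.timeit(lambda: sum(range(100)), number=10_000)
print(f"sum(range(100)): {total:.4f}s for 10000 runs")
```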
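Cost of try...except, from the command line:

    python -mtimeit "try:" "  str.__nonzero__" "except AttributeError:" "  pass"
    1000000 loops, best of 3: 1.53 usec per loop

    python -mtimeit "try:" "  int.__nonzero__" "except AttributeError:" "  pass"
    10000000 loops, best of 3: 0.102 usec per loop

The try block is cheap when the attribute exists (int.__nonzero__) and much more expensive when the exception is actually raised (str.__nonzero__).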
test_timeit.py:

    def f():
        try:
            str.__nonzero__
        except AttributeError:
            pass

    if __name__ == '__main__':
        f()

    python -mtimeit -s "from test_timeit import f" "f()"
    100000 loops, best of 3: 2.5 usec per loop
cProfile/profile
- Deterministic profiling
- The run-time performance, with statistics
- Small snippets bring big changes!

    import cProfile
    cProfile.run(command[, filename])

    python -m cProfile [-o output_file] [-s sort_order] myscript.py
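A Python 3 sketch of profiling a single call in-process with cProfile.Profile and reading the result through pstats. The functions check and example are hypothetical stand-ins, not the author's profile_example.py.

```python
import cProfile
import io
import pstats


def check(i):
    return i % 3 == 0


def example():
    return sum(1 for i in range(10_000) if check(i))


profiler = cProfile.Profile()
result = profiler.runcall(example)   # profile one call and keep its return value

buffer = io.StringIO()               # collect the report instead of printing it directly
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)   # top 5 entries by cumulative time

print(result)                        # 3334
print(buffer.getvalue())
```

runcall() is used here instead of cProfile.run() because run() executes a command string in the __main__ namespace, which is less convenient inside a larger program.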
cProfile statistics

    E:\pycon12>profile_example.py
             100004 function calls in 0.306 CPU seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.014    0.014    0.014    0.014 :0(setprofile)
            1    0.000    0.000    0.292    0.292 <string>:1(<module>)
            1    0.000    0.000    0.306    0.306 profile:0(example())
            0    0.000             0.000          profile:0(profiler)
            1    0.162    0.162    0.292    0.292 profile_example.py:10(example)
       100000    0.130    0.000    0.130    0.000 profile_example.py:2(check)
Using the stats
- The pstats module
- View and compare stats

    import cProfile
    cProfile.run('foo()', 'fooprof')

    import pstats
    p = pstats.Stats('fooprof')
    p.strip_dirs().sort_stats(-1).print_stats()
    p.sort_stats('cumulative').print_stats(10)
    p.sort_stats('file').print_stats('__init__')
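A self-contained Python 3 sketch of the same workflow: save a profile to a file, load it with pstats, then sort and filter it. The function foo is a made-up workload, and a temporary directory is used so the example leaves no file behind.

```python
import cProfile
import os
import pstats
import tempfile


def foo():
    return sorted(str(i) for i in range(20_000))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fooprof")

    profiler = cProfile.Profile()
    profiler.runcall(foo)
    profiler.dump_stats(path)     # same file format that cProfile.run(cmd, filename) writes

    p = pstats.Stats(path)        # load the saved profile
    p.strip_dirs()                # drop directory prefixes from file names
    p.sort_stats("cumulative").print_stats(10)   # ten most expensive by cumulative time
    p.sort_stats("tottime").print_stats("foo")   # only entries matching the regex "foo"
```

Saving the profile to a file is what makes comparison possible: two runs can be dumped separately and inspected side by side later.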
Visualization
A picture is worth a thousand words!
Other tools to visualize profiles:
- kcachegrind
- RunSnakeRun
- GProf2Dot
- PyCallGraph
- PyProf2CallTree

RunSnakeRun:

    E:\pycon12>runsnake D:\simulation_gui.profile
Don't be too clever. Don't sweat it too much. Develop an instinct for the sort of code that
Python runs well.
References
- http://docs.python.org
- http://wiki.python.org/moin/PythonSpeed/PerformanceTips/
- http://sschwarzer.com/download/optimization_europython2006.pdf
- http://oreilly.com/python/excerpts/python-in-a-nutshell/testing-debugging.html
Questions?