Why Python (for Statisticians)


Why Python? (For Stats People)

@__mharrison__ © 2013

About Me

● 12+ years Python
● Worked in Data Analysis, HA, Search, Open Source, BI, and Storage
● Author of multiple Python Books

Book

Treading on Python Volume 1: meant to make people proficient in Python quickly

Why Python?

General Purpose Language

“I’d rather do math in a general-purpose language than do general-purpose programming in a math language.”

John D. Cook

Who's Using Python?

● Startups (on HN)
● Data Scientists (Strata)
● Big Companies

Who

● Google
● NASA
● ILM
● Red Hat
● Finance
● Instagram
● Pinterest
● YouTube
● ...

Open Source

Free in both senses of the word

Batteries Included

● Text
● Network
● JSON
● Command Line
● Files
● XML
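
A small sketch of what "batteries included" can look like in practice, using only standard-library modules (json, xml.etree.ElementTree, argparse). The sample data, the field names, and the --scale option are made up for illustration.

```python
import argparse
import json
import xml.etree.ElementTree as ET

# JSON: parse a string into Python dicts and lists (sample data is made up)
record = json.loads('{"name": "trial-1", "values": [2, 4, 8]}')

# XML: parse a small document and pull out the numeric fields
root = ET.fromstring("<obs><v>2</v><v>4</v><v>8</v></obs>")
xml_values = [int(node.text) for node in root.findall("v")]

# Command line: declare a hypothetical --scale option and parse an explicit
# argument list (a real script would call parse_args() with no arguments)
parser = argparse.ArgumentParser(description="Scale some values")
parser.add_argument("--scale", type=int, default=1)
args = parser.parse_args(["--scale", "10"])

scaled = [v * args.scale for v in record["values"]]
print(scaled)       # [20, 40, 80]
print(xml_values)   # [2, 4, 8]
```

Nothing here needs to be installed separately; all three modules ship with Python.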

Large Community

PyPI - Python Package Index

● Web
● Database
● GUI
● Scientific
● Network Programming
● Games

Large Community

● User Groups
● PyLadies
● Conferences

Local

● utahpython.org - 2nd Thurs. 7pm
● Utah Open Source Conference

Tooling

● Editors
● Testing
● Profiling
● Debugging
● Documentation

Optimizes for Programmer Time

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”

Donald Knuth

Executable Pseudocode

function quicksort(array)
    if length(array) ≤ 1
        return array  // an array of zero or one elements is already sorted
    select and remove a pivot element pivot from 'array'  // see '#Choice of pivot' below
    create empty lists less and greater
    for each x in array
        if x ≤ pivot then append x to less
        else append x to greater
    return concatenate(quicksort(less), list(pivot), quicksort(greater))  // two recursive calls

http://en.wikipedia.org/wiki/Quicksort

Executable Pseudocode

>>> def quicksort(array):
...     array = list(array)  # work on a copy so the caller's list is untouched
...     if len(array) <= 1:
...         return array
...     # // is integer division; list indices must be ints in Python 3
...     pivot = array.pop(len(array) // 2)
...     lt = []
...     gt = []
...     for item in array:
...         if item < pivot:
...             lt.append(item)
...         else:
...             gt.append(item)
...     return quicksort(lt) + [pivot] + quicksort(gt)

But...

Python has Timsort. Optimized for real-world data (takes advantage of inherent order) and written in C. (Stolen by Java, Android, and Octave)
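
In practice this means calling the built-in sorted() (or list.sort()) rather than hand-writing a sort. A minimal sketch; the sample values and records are invented for illustration. Python's built-in sort is stable, so items with equal keys keep their original relative order.

```python
# Built-in sorting; the sample data below is made up.
nearly_sorted = [1, 2, 3, 5, 4, 6, 7, 9, 8]
print(sorted(nearly_sorted))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Stability: sorting by group keeps the original order within each group.
records = [("b", 2), ("a", 3), ("b", 1), ("a", 1)]
by_group = sorted(records, key=lambda rec: rec[0])
print(by_group)  # [('a', 3), ('a', 1), ('b', 2), ('b', 1)]

# list.sort() sorts in place and returns None.
nearly_sorted.sort(reverse=True)
print(nearly_sorted)  # [9, 8, 7, 6, 5, 4, 3, 2, 1]
```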

Multi-paradigm Language

● Imperative
● Object Oriented
● Functional

Imperative

>>> def sum(items):
...     total = 0
...     for item in items:
...         total = total + item
...     return total

>>> sum([2, 4, 8])
14

OO

>>> class Summer:
...     def __init__(self):
...         self.items = []
...     def add_item(self, item):
...         self.items.append(item)
...     def sum(self):
...         return sum(self.items)

>>> s = Summer()
>>> s.add_item(2)
>>> s.add_item(3)
>>> s.sum()
5

Functional

>>> import operator
>>> from functools import reduce  # reduce is a built-in only in Python 2
>>> sum = lambda x: reduce(operator.add, x)

>>> sum([4, 8, 22])
34

Why Not Python?

Slow

Sometimes you have to optimize. Python has good C integration.
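
One way to see the trade-off is to time a pure-Python loop against the equivalent built-in, which CPython implements in C. A rough sketch using the standard timeit module; the function name py_sum and the data size are arbitrary choices, and timings vary by machine, so none are shown.

```python
import timeit

def py_sum(items):
    """Sum with an explicit Python-level loop."""
    total = 0
    for item in items:
        total = total + item
    return total

data = list(range(100_000))

# Both approaches must agree before comparing speed.
assert py_sum(data) == sum(data)

# Time 20 runs of each; the built-in sum runs its loop in C.
loop_time = timeit.timeit(lambda: py_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print("python loop:  %.4f s" % loop_time)
print("built-in sum: %.4f s" % builtin_time)
```

Measuring first keeps the optimization effort aimed at code that is actually slow.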

If it ain't broke don't fix it

Don't replace existing solutions for fun

R has more depth

Though Python is catching up in some areas

Going Forward

IPython Notebook

● Notebook w/ integrated graphs

Libraries

● NumPy - matrix math
● SciPy - scientific libraries
● scipy.stats - stats
● statsmodels - modeling
● pandas - dataframe
● matplotlib - graphing
● scikit-learn - machine learning
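
A small taste of the stack, assuming NumPy is installed (it is a third-party package, not part of the standard library). The data is invented for illustration, and the sketch covers only a few descriptive statistics and a straight-line fit.

```python
import numpy as np

# Made-up sample: y is an exact linear function of x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

mean_x = x.mean()             # arithmetic mean
sd_x = x.std(ddof=1)          # sample standard deviation (n - 1 denominator)
r = np.corrcoef(x, y)[0, 1]   # Pearson correlation between x and y

# Least-squares line fit; polyfit returns coefficients, highest degree first
slope, intercept = np.polyfit(x, y, 1)

print(mean_x, sd_x, r, slope, intercept)
```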

That's all

Questions? Tweet me

For beginning Python secrets see Treading on Python Volume 1

@__mharrison__
http://hairysun.com
