Python at Yhat (August 2013)

Python at Yhat
Dev StackUp
August 2013

Agenda

About Yhat

How we use Python

Questions

We need to reduce churn.

Okay. I'll look into it.

Lots of conversations like this

I figured out that....some complex stuff about vector space that'll improve...

....and that's how we'll reduce churn.

Sounds good. Let's do that...

The "a ha" moment isn't the end.

Now what?

Any of you know what Gradient Boosting is?

So when can we go live with the new model?

What goes on in the Kludge?

● Rewriting Code
● Batch Jobs
● PMML

How can we...

- eliminate implementation time
- let data scientists use their favorite tools

...without altering your workflow

How do we do this?

R: great for analysis
● Built for analysis and statistics
● Everything is tabular
● Active community; 4000+ packages

R: bad for applications
● Not web friendly
● Everything is tabular
● Slow
● A list of R grievances:
  ○ https://github.com/tdsmith/aRrg

Hooking R up to Python

R code > Compile to Bytecode > Execute from Python

{
  "data": {
    "foo": 100,
    "bar": 200
  }
}

Incoming data for prediction

Make prediction from Python using compiled R

Returned via REST API

Prediction sent back to Python web server

{
  "prob": 0.95
}

Same Python server

Plug in different scientific environments

{
  "prob": 0.87
}

Predictions sent back up the chain and to the client

Result

● Ensures cross environment validation
● Extensible to other languages
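One way to picture "plug in different scientific environments" behind the same Python server is a small registry of predict functions, with a check that the environments agree on the same input. The sketch below is an assumption about how that could look, not the talk's actual code: `BACKENDS`, `register`, `predict`, and `environments_agree` are hypothetical names, and both backends are plain-Python stand-ins with made-up coefficients rather than real R or Python models.

```python
import math

# Hypothetical registry mapping an environment name to a predict function.
BACKENDS = {}


def register(name):
    """Decorator that adds a predict function to the registry."""
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


@register("r")
def predict_r(data):
    # Stand-in for a call into compiled R code.
    z = 0.01 * data["foo"] + 0.005 * data["bar"] - 1.0
    return {"prob": 1.0 / (1.0 + math.exp(-z))}


@register("python")
def predict_python(data):
    # Stand-in for a native Python model: same logistic function,
    # written via the identity sigmoid(z) = 0.5 * (1 + tanh(z / 2)).
    z = 0.01 * data["foo"] + 0.005 * data["bar"] - 1.0
    return {"prob": 0.5 * (1.0 + math.tanh(z / 2.0))}


def predict(environment, data):
    """Dispatch a prediction to the chosen environment."""
    return BACKENDS[environment](data)


def environments_agree(data, tolerance=1e-9):
    """Cross-environment check: every backend returns the same probability."""
    probs = [predict(name, data)["prob"] for name in BACKENDS]
    return max(probs) - min(probs) <= tolerance


sample = {"foo": 100, "bar": 200}
print(predict("r", sample), environments_agree(sample))
```

Adding another language then means registering one more function; the server code that calls `predict` does not change.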

Want to try?
yhathq.com

We're Hiring
info@yhathq.com
yhathq.com/jobs
