BIG DATA, We have a communication problem. GINORMOUS SYSTEMS April 30–May 1, 2013 Washington, D.C. Daniel Tunkelang Head of Query Understanding, LinkedIn

Page 1: BIG DATA , We have a communication problem

BIG DATA, We have a communication problem.

GINORMOUS SYSTEMS
April 30–May 1, 2013
Washington, D.C.

Daniel Tunkelang
Head of Query Understanding, LinkedIn

Page 2: BIG DATA , We have a communication problem

BIG DATA IS EVERYWHERE

Page 3: BIG DATA , We have a communication problem

BIG DATA POWERS EVERYTHING

Page 4: BIG DATA , We have a communication problem

DATA SCIENTISTS WORRY ABOUT VOLUME, VELOCITY, VARIETY, …

Page 5: BIG DATA , We have a communication problem

BUT THE BOTTLENECK ISN’T COMPUTATIONAL

IT’S COGNITIVE

Page 6: BIG DATA , We have a communication problem

TOOLS AUGMENT HUMAN INTELLECT

BIG DATA IS A TOOL

Doug Engelbart, inventor of the mouse, hypertext, etc.

Page 7: BIG DATA , We have a communication problem

NOT EVERYONE SUBSCRIBES TO THIS POINT OF VIEW…

Claudia Perlich, Chief Scientist of media6degrees, speaking at TTI/Vanguard 2012 Conference on Understanding Understanding:

Page 8: BIG DATA , We have a communication problem

SHE HAS A POINT

Page 9: BIG DATA , We have a communication problem

BUT PREDICTIVE MODELING IS NOT ENOUGH

Page 10: BIG DATA , We have a communication problem

TRAINING DATA?

OBJECTIVE FUNCTION?

Page 11: BIG DATA , We have a communication problem

WE NEED A PEOPLE-CENTRIC APPROACH TO BIG DATA

INTERPRETABILITY

INTERACTION

INSIGHT

Page 12: BIG DATA , We have a communication problem

LET’S START WITH INTERPRETABILITY

Page 13: BIG DATA , We have a communication problem

EXAMPLE: SVM vs. DECISION TREE

Page 14: BIG DATA , We have a communication problem

DECISION TREES HAVE FLAWS…

DISCRETE

Page 15: BIG DATA , We have a communication problem

BUT THEY COMMUNICATE

(if they’re shallow)

early splits provide big picture…

fat leaves guide feature engineering

…or reveal training data problems
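As a rough illustration of why a shallow tree communicates, the sketch below fits a single split (a depth-1 tree) and prints it as a readable rule with leaf sizes. Everything in it is an assumption made for illustration: the toy spam data, the feature names `num_links` and `account_age_days`, and the choice of Gini impurity as the split criterion. It is not the model used in the talk.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(rows, labels, feature_names):
    """Try every midpoint threshold on every feature; keep the lowest weighted impurity."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, name, threshold, left, right)
    return best


def describe(split):
    """Render the split as two human-readable lines, one per leaf."""
    _, name, threshold, left, right = split
    return [
        f"if {name} <= {threshold:g}: spam rate {sum(left) / len(left):.2f} (n={len(left)})",
        f"else: spam rate {sum(right) / len(right):.2f} (n={len(right)})",
    ]


# Hypothetical toy data: (num_links, account_age_days), label 1 = spam.
FEATURES = ["num_links", "account_age_days"]
ROWS = [(0, 400), (1, 30), (5, 2), (7, 300), (0, 10), (6, 1), (1, 200), (8, 50)]
LABELS = [0, 0, 1, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    for line in describe(best_split(ROWS, LABELS, FEATURES)):
        print(line)
```

The printed rule is the "big picture" of an early split, and the leaf sizes show where a fat leaf might deserve another feature or a closer look at the training data.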

Page 16: BIG DATA , We have a communication problem

WHICH SUPPORTS ITERATION

Page 17: BIG DATA , We have a communication problem

INTERPRETABILITY DELIVERS

Key search leader favors rule-based approach for key scoring algorithms.

Replaced regression with decision tree in local search model: gained accuracy and insight.

Using trees to recognize spam, analyze search abandonment, model / quantify social proof.

Page 18: BIG DATA , We have a communication problem

GO DEEP vs INTERPRETABILITY

A KEY DATA SCIENCE TRADE-OFF

Page 19: BIG DATA , We have a communication problem

ON TO INTERACTION

Page 20: BIG DATA , We have a communication problem
Page 21: BIG DATA , We have a communication problem

DON’T OVERPAY FOR PRECISION

Page 22: BIG DATA , We have a communication problem

BE FAST, CHEAP, AND 98% RIGHT

http://metamarkets.com/2012/fast-cheap-and-98-right-cardinality-estimation-for-big-data/
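To make the trade-off concrete, here is a minimal sketch of approximate distinct counting with a k-minimum-values (KMV) estimator. KMV is chosen here only because it is short; it is not claimed to be the method described in the linked post. The function names, the use of SHA-1 as the hash, and the default `k=1024` are all assumptions for illustration. With k retained hashes the relative error is on the order of 1/sqrt(k), so roughly 3% for k=1024, while memory stays fixed no matter how many items stream past.

```python
import hashlib
import heapq


def hash01(item):
    """Map an item to a pseudo-uniform float in [0, 1) using SHA-1 (illustrative choice)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(items, k=1024):
    """Estimate the number of distinct items while keeping only the k smallest hashes."""
    heap = []        # max-heap (via negation) of the k smallest hashes seen so far
    members = set()  # the same hashes, for O(1) duplicate checks
    for item in items:
        h = hash01(item)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash beats the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct hashes seen: the count is exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)


if __name__ == "__main__":
    # Hypothetical event stream: 200,000 events from 50,000 distinct user ids.
    events = (f"user-{i % 50000}" for i in range(200_000))
    print("estimated distinct users:", estimate_distinct(events))
```

An exact count would need to remember every id; the sketch remembers about a thousand numbers and accepts a few percent of error in return.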

Page 23: BIG DATA , We have a communication problem

ARE PEOPLE THAT IMPATIENT?

tolerable wait time for web users

0.1s increase in latency significantly reduces # of searches, ad revenue

tl;dr: YES

Page 24: BIG DATA , We have a communication problem

IMPATIENCE IS GOOD

SPEED MATTERS

Page 25: BIG DATA , We have a communication problem

INSIGHT

Page 26: BIG DATA , We have a communication problem

http://blog.takejune.com/archives/52334044.html

Page 27: BIG DATA , We have a communication problem

BE TRENDY AND NORMALIZE

vs
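One reading of "normalize" here is to divide a term's count by the total volume for the same period, so that a trend reflects a change in share rather than overall growth. The sketch below assumes that reading; the function name and the weekly counts are hypothetical.

```python
def share_of_total(term_counts, total_counts):
    """Return each period's term count as a fraction of that period's total volume."""
    return [
        term / total if total else 0.0
        for term, total in zip(term_counts, total_counts)
    ]


# Hypothetical weekly counts: mentions of a term vs. all queries that week.
TERM_COUNTS = [100, 200, 400]
TOTAL_COUNTS = [10_000, 20_000, 40_000]

if __name__ == "__main__":
    print("raw counts:      ", TERM_COUNTS)
    print("normalized share:", share_of_total(TERM_COUNTS, TOTAL_COUNTS))
```

In this made-up series the raw count quadruples while the normalized share does not move, which is the kind of false trend normalization is meant to remove.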

Page 28: BIG DATA , We have a communication problem

Sept. 11th

Abu Ghraib

Weapons Inspectors

SOLVE FOR INTERESTINGNESS

Page 29: BIG DATA , We have a communication problem

COMPUTE POTENTIAL INSIGHTS

APPLY HUMAN INTUITION

Page 30: BIG DATA , We have a communication problem

SUMMARY: Let’s have a conversation with Big Data.

INTERPRETABILITY

INTERACTION

INSIGHT

Page 31: BIG DATA , We have a communication problem