The Curse of Big Data in Mobile Analytics
Dr. Guodong (Gordon) Gao M-CERSI Workshop, 9/11/2015
Mobile devices = Big Data
User-generated data
Facebook ingests 500 terabytes of new data every day.
Text messages, diet log, photos, videos, …
System-generated data
App download and usage
Gesture, touches
Communications with other wearable devices
Sensor-generated data
6 billion mobile phones
Geo-location data, pedometer, heartbeat sensor, and oxygen saturation sensor
Even more data
Causal inference
Most statistical methods measure correlation, not causation.
For actionable knowledge, we need causation! Does the rooster's crowing cause the sun to rise?
Confusing correlation with causality can be dangerous.
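A minimal sketch of how correlation can appear without causation, assuming a hidden common cause. The variable names, noise levels, and sample size are illustrative assumptions, not taken from the talk. Two variables driven by the same hidden factor correlate strongly, yet when one of them is set at random (an intervention), the correlation disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
n = 5000

# Hypothetical hidden common cause (think "dawn" for rooster and sunrise).
hidden = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Observational data: x and y both depend on the hidden factor, not on each other.
x = [h + rng.gauss(0.0, 0.5) for h in hidden]
y = [h + rng.gauss(0.0, 0.5) for h in hidden]
observational_r = pearson(x, y)

# Intervention: x is assigned at random, so it no longer tracks the hidden factor.
x_randomized = [rng.gauss(0.0, 1.0) for _ in range(n)]
y_after = [h + rng.gauss(0.0, 0.5) for h in hidden]
interventional_r = pearson(x_randomized, y_after)

print(f"observational correlation:  {observational_r:.2f}")
print(f"interventional correlation: {interventional_r:.2f}")
```

Under these assumptions the observational correlation is large while the interventional one is close to zero, which is why acting on a correlation alone would not change the outcome.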
Does Anne Hathaway help Warren Buffett get richer?
The curse of big data
Heterogeneity in Treatment Effects (HTE)
Sub-group analysis helps answer:
Which sub-group will benefit from this treatment?
Should I prescribe the treatment to this particular patient?
With dozens of variables and thousands of combinations, we can define sub-groups in many ways,
e.g. with 10 variables, each with 3 levels, there are 3^10 = 59,049 combinations!
We are doomed to find something statistically significant in certain sub-groups.
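The sub-group problem can be sketched with a small simulation. This is an illustration under stated assumptions, not the speaker's analysis: each sub-group is treated as an independent test, outcomes are standard normal with known variance, and the true treatment effect is zero everywhere. The function names, the 1,000 sub-groups, and the 50 patients per arm are hypothetical choices.

```python
import random
from statistics import NormalDist, mean


def null_subgroup_p_value(rng, n_per_arm=50):
    """Two-sided z-test p-value for one sub-group with no true treatment effect."""
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    diff = mean(treated) - mean(control)
    std_err = (2.0 / n_per_arm) ** 0.5  # unit variance assumed known in both arms
    z = diff / std_err
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def count_false_positives(n_subgroups=1000, alpha=0.05, seed=0):
    """Count sub-groups that look 'significant' even though nothing works."""
    rng = random.Random(seed)
    p_values = [null_subgroup_p_value(rng) for _ in range(n_subgroups)]
    return sum(p < alpha for p in p_values)


# 10 variables with 3 levels each, as in the slide.
n_combinations = 3 ** 10

false_positives = count_false_positives()

# One conventional (and conservative) correction: Bonferroni divides alpha
# by the number of sub-groups that could have been tested.
bonferroni_threshold = 0.05 / n_combinations

print(f"possible sub-groups: {n_combinations}")
print(f"'significant' sub-groups out of 1000 null tests: {false_positives}")
print(f"Bonferroni-adjusted threshold: {bonferroni_threshold:.2e}")
```

With a 0.05 threshold, roughly 5% of the null sub-groups are expected to pass by chance alone, so testing many sub-group definitions almost guarantees a spurious finding unless the threshold is adjusted or the sub-groups are specified in advance.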
Yet another curse of big data
Do not ignore the fundamentals
Patient #11