Big Data: Big Deal or Big Distraction?

Agenda:
1. What is Big Data?
2. Why YOU Should Care About (Big) Data?
3. A Brief Introduction to Big Data Econometrics




Theory vs. measurement 2 min video:

Short vs long term good BHS blog post:

Internet of People

Ed Moed's lecture: the time people are online and what they are doing.

The Problem With People

"Today computers are almost wholly dependent on human beings for information -- by typing, pressing a record button, taking a digital picture or scanning a bar code... The problem is, people have limited time, attention and accuracy -- all of which means they are not very good at capturing data about things in the real world."

Kevin Ashton, "That 'Internet of Things' Thing", RFID Journal, July 22, 2009

Really, how valuable is all the data about the number of people who watch cat videos? What can we learn from what is trending on Twitter?

General problem with social science: people don't create data for analytics. The result is selection bias, measurement error and unobservable heterogeneity.

Internet of Things

Wikipedia page with good quotes:

What do people want? (But remember Jobs)
Where are the people we want? Customization, ad placement
What will sales be? Predicting the future
Are my ads working? The ATTRIBUTION problem

Hal Varian: Prediction, summarization, estimation, hypothesis testing

Pandora WSJ article:


You will either work for one of these companies, use one of their products or contract for one of their services in both your personal and professional lives.

The Publicis CEO noted that "the communication and marketing landscape has undergone dramatic changes in recent years, including the exponential development of new media giants, the explosion of Big Data, blurring of the roles of all players and profound changes in consumer behavior." (WSJ 7/28/13)

"a $35.1 billion cross-border linkup that shows how Big Data is making Madison Avenue look more like Wall Street." (WSJ 7/28/13)

Interesting interview about digital channels:

Key WSJ article (read some quotes): Big Data is making Madison Avenue look more like Wall Street.

Creative vs. Analytical

Primacy of creative is declining: too many people, not enough robots. Those being hired in creative are freelancers. Haves and have-nots, like in many other industries!

A Brief Introduction to Big Data Econometrics

A. What can we do with data?
B. Correlation vs. Causation
C. Types of data
   i. Cross section
   ii. Time series
   iii. Panel
D. Fit, overfit, validation
E. Tools of the trade
   i. Regression, logit, probit
   ii. Trees & Forests
   iii. Bayesian simulation

Prediction, summarization, estimation, hypothesis testing

Fast Company interview with Nate Silver (data in sports and politics) on hype of big data revolution:


Prediction, Summarization, Estimation, Hypothesis Testing

There is a tension between prediction (we don't care why it works as long as it works) and causal modeling.

Correlation vs. Causation

Is Marriage Good for Your Health? Tara Parker-Pope, 4/14/10

"Contemporary studies, for instance, have shown that married people are less likely to get pneumonia, have surgery, develop cancer or have heart attacks. A group of Swedish researchers has found that being married or cohabiting at midlife is associated with a lower risk for dementia. A study of two dozen causes of death in the Netherlands found that in virtually every category, ranging from violent deaths like homicide and car accidents to certain forms of cancer, the unmarried were at far higher risk than the married."

What can get in the way of determining CAUSATION? ENDOGENEITY

1. Reverse causality (also selection bias): healthier people are more likely to get married

2. Unobservable characteristics such as time preference, aptitude, genetics
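
The second source of endogeneity can be illustrated with a small simulation. This is a made-up sketch, not something from the lecture: an unobserved trait (called `patient` here, a stand-in for something like time preference) raises both the chance of being married and the chance of being healthy, while marriage has no effect on health by construction. All probabilities are assumptions chosen for illustration.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate people whose unobserved trait drives both marriage and health."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        patient = rng.random() < 0.5                         # unobserved trait (assumed)
        married = rng.random() < (0.7 if patient else 0.3)   # trait raises marriage odds
        healthy = rng.random() < (0.8 if patient else 0.4)   # does NOT depend on married
        people.append((patient, married, healthy))
    return people

def health_rate(rows):
    """Share of people in rows who are healthy."""
    return sum(healthy for _, _, healthy in rows) / len(rows)

def naive_gap(people):
    """Healthy share among the married minus healthy share among the unmarried."""
    married = [p for p in people if p[1]]
    unmarried = [p for p in people if not p[1]]
    return health_rate(married) - health_rate(unmarried)

def adjusted_gap(people):
    """Average the married-vs-unmarried gap within each level of the trait."""
    gaps = []
    for trait in (True, False):
        group = [p for p in people if p[0] == trait]
        gaps.append(naive_gap(group))
    return sum(gaps) / len(gaps)

people = simulate()
print(f"naive gap: {naive_gap(people):.3f}")
print(f"gap holding the trait fixed: {adjusted_gap(people):.3f}")
```

Under these assumed probabilities the naive gap is about 0.16 in expectation even though marriage does nothing, and it shrinks toward zero once the trait is held fixed. In real data the trait is not observed, which is exactly the problem.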

Counterfactual

1. What would happen if we change the cause?

2. Is there a plausible alternative explanation?

What would sales have been if the ad did not run?
What would people do if they did not use Google?
What would people buy if the weather was warmer?

Big data opens more possibilities for natural experiments because there are more people getting exposed to more stuff.
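
The question "what would sales have been if the ad did not run?" can never be answered for a single customer, because only one of the two outcomes is ever observed. Random assignment gets around this on average. The sketch below is a hypothetical simulation (the sales level, noise and a true lift of 2.0 are all assumptions): each unit has both potential outcomes, a coin flip decides which one we see, and the difference in group means recovers the lift.

```python
import random

def run_experiment(n=20000, true_lift=2.0, seed=1):
    """Return the treated-minus-control difference in mean sales."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        sales_without_ad = rng.gauss(50.0, 3.0)        # potential outcome, no ad
        sales_with_ad = sales_without_ad + true_lift   # potential outcome, ad shown
        if rng.random() < 0.5:                         # coin-flip assignment
            treated.append(sales_with_ad)              # only one outcome is observed
        else:
            control.append(sales_without_ad)
    return sum(treated) / len(treated) - sum(control) / len(control)

estimated_lift = run_experiment()
print(f"estimated lift: {estimated_lift:.2f} (true lift assumed to be 2.0)")
```

A natural experiment plays the same role when something outside the firm's control exposes otherwise similar people to different conditions.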

Cross-section Data: Lots of observations at one point in time.

Time Series Data: One observation over time.

Panel Data: Multiple observations of the same thing over time.
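
As a rough illustration of the three data types, the sketch below lays out made-up store sales in each shape. The store names, weeks and sales figures are hypothetical.

```python
# Cross-section: many units observed once (three stores, one week).
cross_section = [
    {"store": "A", "week": 1, "sales": 120},
    {"store": "B", "week": 1, "sales": 95},
    {"store": "C", "week": 1, "sales": 143},
]

# Time series: one unit observed repeatedly (one store, three weeks).
time_series = [
    {"store": "A", "week": 1, "sales": 120},
    {"store": "A", "week": 2, "sales": 131},
    {"store": "A", "week": 3, "sales": 118},
]

# Panel: the same units, each observed repeatedly (two stores, three weeks).
panel = [
    {"store": s, "week": w, "sales": v}
    for s, weekly in {"A": [120, 131, 118], "B": [95, 90, 101]}.items()
    for w, v in enumerate(weekly, start=1)
]

def shape(rows):
    """Return (number of distinct units, number of distinct periods)."""
    return len({r["store"] for r in rows}), len({r["week"] for r in rows})

print(shape(cross_section), shape(time_series), shape(panel))
# prints (3, 1) (1, 3) (2, 3)
```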

Varian article posted on Moodle

Fit, Overfit, Validation and Out-of-Sample Prediction

The key with big data is that we have bigger samples, so we can do more with separate estimation and validation sub-samples.
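
A minimal sketch of the estimation/validation idea, under assumptions of my own: the data are simulated noisy points on a sine curve, and the model is a simple nearest-neighbor average (not a method named in the slides). With k=1 the model memorizes the estimation sample and fits it perfectly, which is overfitting; the held-out validation sample reveals that a smoother model (k=15) predicts better out of sample.

```python
import math
import random

def make_data(n, rng):
    """Noisy observations of a smooth curve (an assumed data-generating process)."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0.0, 0.5)) for x in xs]

def knn_predict(train, x, k):
    """Average the outcomes of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared prediction error on data for a model fit on train."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)

rng = random.Random(2)
estimation = make_data(300, rng)   # sub-sample used to fit the model
validation = make_data(300, rng)   # held-out sub-sample, never used in fitting

for k in (1, 15):
    in_sample = mse(estimation, estimation, k)
    out_of_sample = mse(estimation, validation, k)
    print(f"k={k:2d}  in-sample MSE={in_sample:.3f}  out-of-sample MSE={out_of_sample:.3f}")
```

The in-sample error for k=1 is exactly zero because every point is its own nearest neighbor, so in-sample fit alone says nothing about predictive quality.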

Linear Regression
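
For reference, a one-variable least-squares fit can be written in a few lines. This is a sketch using the textbook closed-form formulas; the ad-spend and sales numbers are made up for illustration.

```python
def ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up numbers: weekly ad spend (in $1,000s) and sales (units).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.1, 13.9, 16.2, 17.8, 20.0]

intercept, slope = ols(ad_spend, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

The slope describes how sales move with ad spend in this sample. Whether it measures the effect of ads is the causation question raised earlier.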

Logit/Probit Regression
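
Logit and probit both turn a linear index into a probability between 0 and 1; they differ only in the curve used. The sketch below compares the two curves using hypothetical coefficients (an intercept of -2.0 and a slope of 0.5 on a made-up "ads seen" variable), with no estimation involved.

```python
import math
from statistics import NormalDist

def logit_prob(index):
    """Logit: logistic CDF of the linear index."""
    return 1.0 / (1.0 + math.exp(-index))

def probit_prob(index):
    """Probit: standard normal CDF of the linear index."""
    return NormalDist().cdf(index)

# Hypothetical coefficients for P(purchase) as a function of ads seen.
intercept, slope = -2.0, 0.5

for ads_seen in (0, 2, 4, 6, 8):
    index = intercept + slope * ads_seen
    print(ads_seen, round(logit_prob(index), 3), round(probit_prob(index), 3))
```

Estimated coefficients are on different scales in the two models, so the same numbers are plugged in here only to compare the shapes of the curves.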

Book: lectures:

Trees and Forests
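
A heavily simplified sketch of the idea, under my own assumptions: the "tree" is a single split on one variable (a stump), and the "forest" is a set of stumps fit on bootstrap resamples that vote on the prediction. Real random forests grow deeper trees and also sample variables at each split. The data are made up.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Find the threshold on x that best separates labels 0 and 1.

    rows is a list of (x, label). The stump predicts 1 when x > threshold.
    """
    best_threshold, best_errors = None, len(rows) + 1
    for threshold in sorted({x for x, _ in rows}):
        errors = sum((x > threshold) != bool(label) for x, label in rows)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def fit_forest(rows, n_trees=25, seed=3):
    """Bagging: fit one stump per bootstrap resample of the data."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def forest_predict(thresholds, x):
    """Majority vote across the stumps."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Made-up data: minutes on site and whether the visitor bought (1) or not (0).
visits = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1), (8, 1), (9, 1)]

stump = fit_stump(visits)
forest = fit_forest(visits)
print("single split: predict a purchase when minutes >", stump)
print("forest vote at 2 minutes:", forest_predict(forest, 2))
print("forest vote at 8 minutes:", forest_predict(forest, 8))
```

Averaging many trees fit on resampled data makes the prediction less sensitive to any one noisy observation, such as the non-buyer at 5 minutes here.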

From Varian

Bayesian Statistics

1. Uninformative prior probability
2. Gather data
3. Conditional probability of observing the data
4. Updated probability
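
The prior, data, update cycle can be sketched with the simplest textbook case: a uniform Beta(1, 1) prior on an unknown click rate, updated with batches of click data. The Beta-binomial model and the batch counts are assumptions chosen for illustration; the slides do not specify a model.

```python
def update(prior_a, prior_b, successes, failures):
    """Beta prior plus binomial data gives a Beta posterior (conjugate update)."""
    return prior_a + successes, prior_b + failures

def mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Uninformative prior: Beta(1, 1) is uniform on [0, 1].
a, b = 1, 1
print("prior mean:", mean(a, b))

# Gather data in batches and update after each one.
# Made-up batches of (clicks, non-clicks).
for clicks, misses in [(7, 3), (12, 8), (30, 20)]:
    a, b = update(a, b, clicks, misses)
    print(f"after {clicks + misses} more observations: posterior mean = {mean(a, b):.3f}")
```

Each posterior becomes the prior for the next batch, so the estimate is refined as more data arrive.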

With BIG DATA we can repeat this process over and over again with multiple models to get better predictions!

Coursera: Machine Learning by Andrew Ng

Introduction to Statistical Learning
Book: videos & problem sets:!iom530/c21o7

Feeling a bit overwhelmed?

"... it's no wonder that the latest fad in the business world is 'Big Data' ... Big data can be an extraordinary tool, helping to gather new information about our behavior and preferences. What it can't explain is why we do what we do." WSJ 3/22/14