
Digital Marketing Class: Big data


Slides on Big Data, covering Hadoop, algorithms, cloud computing, and more.


Page 1: Digital Marketing Class: Big data

Big Data

Scope
• Data produced every 2 days = all data created prior to 2003 (Google’s Eric Schmidt)
• Volume of Big Data worldwide doubles every 1.2 years
• 90% of data was produced in the last 2 years
• 2012: 2.5 exabytes of data created every day
• Increased digitization of communications & transactions: mobile, IoT
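The doubling figure on this slide is easy to turn into arithmetic. The following is a minimal sketch, assuming steady exponential growth at the slide's 1.2-year doubling time; the function name `growth_factor` is hypothetical and used only for illustration.

```python
DOUBLING_TIME_YEARS = 1.2  # from the slide: volume doubles every 1.2 years


def growth_factor(years, doubling_time=DOUBLING_TIME_YEARS):
    """Return how many times larger the data volume is after `years`.

    Assumes steady exponential growth, which is a simplification
    and not a forecast.
    """
    return 2 ** (years / doubling_time)


if __name__ == "__main__":
    for years in (1.2, 2.4, 6.0):
        print(f"after {years} years: x{growth_factor(years):.1f}")
```

Under this assumption, one doubling period doubles the volume and two periods quadruple it, which is why small differences in the doubling time compound quickly over a few years.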

Digital Marketing [email protected] University of Delaware

#buad477 follow @alexbrownracing http://www.udel.edu/alex/classes

Page 2: Digital Marketing Class: Big data

Big Data

Generally
• Data versus actionable insights (information)
  – Orgs typically use only a small portion of their data
• Good for increasing engagement with current customers
• Increases lock-in
• Helps identify new products


Page 3: Digital Marketing Class: Big data

Big Data

Generally
• Useful when the past is a good predictor of the future
• Provides real-time insights (dashboards)
• Is it a revolution?
• Very buzzy, re: VCs etc.
• Example: Google Flu Trends
  – Fall 2008


Page 4: Digital Marketing Class: Big data

Big Data

Data Sources
• Internet of Things (Sensors)
  – Ex: Parking meters, Smart houses
• Machine to Machine networks: 250m 2014
  – APIs versus Open Standards
• Data exhaust (ex. Social media)
• Clickstream data


Page 5: Digital Marketing Class: Big data

Big Data

Data Sources
• Location data
  – In-store (iBeacon, sensors) and geofencing
• Loyalty program data (Target)
• Transactional data (Kroger)
• Structured versus unstructured data
• Primary to secondary data (unimagined use)
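Geofencing, mentioned under location data, comes down to a distance check between a device and a fixed point. The following is a minimal sketch using the haversine great-circle formula; the function names, the tuple layout `(lat, lon)` and the radius are assumptions made for illustration, not any vendor's API.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(shopper, store, radius_m):
    """True if the shopper's (lat, lon) lies within radius_m of the store's (lat, lon)."""
    return haversine_m(*shopper, *store) <= radius_m
```

A retailer's app could call `inside_geofence` each time it receives a location update and trigger an offer when the result flips to true. In-store iBeacon positioning works differently (Bluetooth signal proximity rather than GPS coordinates), so this sketch covers only the outdoor case.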


Page 6: Digital Marketing Class: Big data

Big Data

Three Aspects
• Ingestion & Storage (cloud)
  – cost
• Analytics
  – value
• Visualization
  – communication


Page 7: Digital Marketing Class: Big data

Big Data

Analytics
• Continuous
  – Real-time reporting: dashboards
• Predictive
  – Netflix
  – Google and house sales
• Sentiment
  – Sales down, Twitter chatter


Page 8: Digital Marketing Class: Big data

Big Data

Defining Big Data: 3 Vs
• Volume
• Velocity
• Variety
• Veracity (AP Hack “Obama hurt”)


Page 9: Digital Marketing Class: Big data

Big Data

Technology: Hadoop MapReduce
• Google origins: MapReduce, used for indexing the web and examining user behavior to improve the algorithm
• Yahoo! & the Hadoop open source project
• Cornerstone technology for Big Data
  – For now? Google’s Cloud Dataflow
• Makes Big Data more manageable
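The MapReduce idea behind Hadoop can be illustrated with the classic word-count example. The following is a minimal single-process sketch of the map, shuffle and reduce steps; it is not Hadoop's API, and the function names are hypothetical. In a real cluster the map and reduce steps run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the results."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```

Because each document is mapped independently and each word is reduced independently, the work splits naturally across cheap machines, which is the property the slides describe as cheap and scalable.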


Page 10: Digital Marketing Class: Big data

Big Data

Technology: Hadoop MapReduce
• Cheap, scalable
• Fast
• Store, then figure out
• Open Source
  – Cloudera
  – Hortonworks


Page 11: Digital Marketing Class: Big data

Big Data

Cloud Computing
• Business and consumer markets
• Increasing volumes of data to manage & interpret
• Moore’s law


Page 12: Digital Marketing Class: Big data

Big Data

Cloud Computing
• Virtualization
• Variable cost
• Scalability and agility


Page 13: Digital Marketing Class: Big data

Big Data

Cloud Computing
• Run apps from anywhere, from any device
• Easier to upgrade apps
• Makes IT more accessible
• Software as a Service (SaaS)
  – Ex: Salesforce.com


Page 14: Digital Marketing Class: Big data

Big Data

Cloud Computing
• Very buzzy re: VCs
• Key players: Google, Amazon, Windows Azure, Oracle, IBM, Rackspace, Salesforce.com
• Privacy issues
• Data ownership and security?


Page 15: Digital Marketing Class: Big data

Big Data

Changes in Analysis
• Data starved to data abundance
  – Too expensive, and not enough of it
• Samples and statistical significance
• Hypothesis driven versus Data driven: end of theory?
• Correlations to design algorithms


Page 16: Digital Marketing Class: Big data

Big Data

Changes in Analysis
• Patterns in the data: relationships
• What versus Why?
  – Amazon recommendations (risks?)
• Causation versus Correlation (good enough)
• Limitation: we know what, but not why
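The "what, but not why" limitation can be seen in how a correlation is computed. The following is a minimal sketch of the Pearson correlation coefficient in plain Python; the function name is hypothetical, and any data passed to it is the caller's own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Returns a value between -1 and 1. It measures how closely the two
    series move together, and says nothing about which one (if either)
    causes the other.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

The formula only uses the paired values, so swapping the two arguments gives the same result. That symmetry is one way to see why a strong correlation may be good enough for a recommendation but cannot explain the reason behind it.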


Page 17: Digital Marketing Class: Big data

Big Data

Criticism
• Bias in data sources
  – Ex: Twitter and election results
• Importance of human intuition
• Filter bubble


Page 18: Digital Marketing Class: Big data

Big Data

Algorithms
• A repeating set of rules, a series of instructions
• Designed from the data, or purchased
• Predicts the future, based on the past (Big Data)
• Unique input data passes through the algorithm and produces a unique output (a probability)
• Replaces human intuition, or a mix of both


Page 19: Digital Marketing Class: Big data

Big Data

Machine Learning & Algorithms
• Increasing returns on data
• Designed into algorithm (real-time)
• Examples
  – Google (search)
  – Amazon (human editors versus algorithm): what, not why
  – Facebook (news feed)
  – Netflix (recommendations)
  – Zynga (data from game play)


Page 20: Digital Marketing Class: Big data

Big Data

Old World, New World
• Old World: Analogue, same content by vehicle, medium, retail store
• New World: Digital, unique content driven by algorithms to increase engagement and lock-in


Page 21: Digital Marketing Class: Big data

Big Data

Old World, New World
• New World Examples:
  – Netflix
  – Amazon
  – In-store retail experience w/ mobile app and iBeacon technology (smart retail): Macy’s?
  – Target, pregnant teen


Page 22: Digital Marketing Class: Big data

Big Data

Unintended Consequence: Filter Bubble effect
• Human curated content versus serendipity
• Editors versus wisdom of the crowd
• Personal universe of online information
• Past clickstreams determine future choice: information determinism
  – Netflix, Amazon etc.
• Echo chamber effect


Page 23: Digital Marketing Class: Big data

Big Data

Unintended Consequence: Filter Bubble effect
• Echo chamber effect
• What we want to see, versus what we should see
  – Amazon / Netflix recommendations (what, not why)
  – Good enough recs. versus risks?
• Personalized: Google, Facebook etc.
• Personalization drives ad revenue
• Wikipedia is a standout


Page 24: Digital Marketing Class: Big data

Big Data

Privacy
• Law (and keeping up)
• Terms of Service (bargain): binary choice
• Terms of Service: changes (for business model)
• Scope of reach: Facebook / Google
  – EU: Right to be Forgotten?
  – EU: Privacy rules change, Google
• Tension: privacy versus personalization
• Opt in versus opt out: by default
  – Org donations; Facebook Nearby


Page 25: Digital Marketing Class: Big data

Big Data

Privacy
• Knowledge of data collection: invisible
• Knowledge of data use (secondary: sometimes unimagined)
• Creep factor (how did they know?)
  – Ad retargeting
  – Snapchat?


Page 26: Digital Marketing Class: Big data

Big Data

Privacy
• Combining data can de-anonymize
• Data brokers: reselling
  – Acxiom
• Hacking, data theft (Target, Home Depot, Kmart, etc.)
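The point that combining data can de-anonymize is easiest to see as a join between two datasets. The following is a minimal sketch, assuming an "anonymous" dataset and a public one that share a few quasi-identifier fields; the field names (`zip`, `birth_date`, `sex`, `name`) and the function name are hypothetical and chosen only for illustration.

```python
def link_records(anonymous_rows, public_rows, keys=("zip", "birth_date", "sex")):
    """Attach a name to each 'anonymous' row that matches exactly one
    public row on the given quasi-identifier fields."""
    # Index the public dataset by its quasi-identifier combination.
    index = {}
    for row in public_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    linked = []
    for row in anonymous_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the row
            linked.append({**row, "name": names[0]})
    return linked
```

Neither dataset contains both a name and the sensitive detail on its own. The risk appears only once the two are combined, which is why removing names from a dataset does not by itself make it anonymous.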
