alex-brown
Slides on Big Data, covering Hadoop, algorithms, cloud computing, and more.
Big Data
Scope
• Data produced every 2 days = all data prior to 2003 … Google’s Schmidt
• Volume of Big Data worldwide doubles every 1.2 yrs
• 90% of data was produced in last 2 years
• 2012: every day, 2.5 exabytes of data created
• Increased digitization of communications & transactions: mobile, IoT
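The doubling figure above can be converted into a growth multiplier for any time span. This is a minimal sketch, assuming "doubles every 1.2 yrs" means steady exponential growth; the function name `growth_factor` is made up for illustration and is not from the slides.

```python
def growth_factor(years: float, doubling_time_years: float = 1.2) -> float:
    """Multiplier applied to data volume after `years`.

    Assumes steady exponential growth with the given doubling time,
    which is an interpretation of the slide, not a stated fact.
    """
    return 2 ** (years / doubling_time_years)


if __name__ == "__main__":
    # One full doubling period gives 2x by construction.
    print(growth_factor(1.2))
    # Implied multiplier over one year and over five years.
    print(round(growth_factor(1.0), 2))
    print(round(growth_factor(5.0), 1))
```

Under this assumption the volume grows by a bit less than 1.8x per year.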
Digital Marketing [email protected] University of Delaware
#buad477 follow @alexbrownracing http://www.udel.edu/alex/classes
Generally
• Data versus actionable insights (Information)
  – Orgs typically use small portion of data
• Good for increasing engagement with current customers
• Increases lock-in
• Helps identify new products
Generally
• Useful when past is good predictor of future
• Provides real time insights (dashboards)
• Is it a revolution?
• Very buzzy, re: VCs etc.
• Example: Google Flu Trends
  – Fall 2008
Data Sources
• Internet of Things (Sensors)
  – Ex: Parking meters, Smart houses
• Machine to Machine networks: 250m 2014
  – APIs versus Open Standards
• Data exhaust (ex. Social media)
• Clickstream data
Data Sources
• Location data
  – In-store (iBeacon, sensors) and geofencing
• Loyalty program data (Target)
• Transactional data (Kroger)
• Structured versus Unstructured data
• Primary to Secondary data (unimagined use)
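Geofencing, mentioned under location data above, comes down to a distance test: is a device within some radius of a point of interest? The sketch below uses the standard haversine great-circle formula; the function names and any coordinates passed in are illustrative assumptions, not part of the slides.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(lat: float, lon: float,
                    fence_lat: float, fence_lon: float,
                    radius_m: float) -> bool:
    """True when the point lies within `radius_m` of the fence centre."""
    return haversine_m(lat, lon, fence_lat, fence_lon) <= radius_m
```

A retailer's app could call `inside_geofence` with the phone's reported position and a store's coordinates to decide whether to trigger an offer.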
Three Aspects
• Ingestion & Storage (cloud)
  – cost
• Analytics
  – value
• Visualization
  – communication
Analytics
• Continuous
  – Real-time reporting: Dashboards
• Predictive
  – Netflix
  – Google and house sales
• Sentiment
  – Sales down, Twitter chatter
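Sentiment analytics on social chatter can be illustrated with a very simple word-list scorer. This is only a sketch: the two word lists are hypothetical, and production sentiment tools use far richer models than counting words.

```python
POSITIVE = {"love", "great", "good", "happy"}   # hypothetical lexicon
NEGATIVE = {"hate", "bad", "broken", "awful"}   # hypothetical lexicon


def sentiment_score(text: str) -> int:
    """Positive-word count minus negative-word count for one message."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def average_sentiment(messages: list[str]) -> float:
    """Mean score across messages; 0.0 when there are no messages."""
    if not messages:
        return 0.0
    return sum(sentiment_score(m) for m in messages) / len(messages)
```

Tracking `average_sentiment` over time next to sales figures is one way to spot the "sales down, Twitter chatter" pattern.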
Defining Big Data: 3 Vs
• Volume
• Velocity
• Variety
• Veracity, often added as a fourth V (AP Hack “Obama hurt”)
Technology: Hadoop MapReduce
• Google Origins: MapReduce: indexing web, examining user behavior to improve algorithm
• Yahoo! & Hadoop Open Source project
• Cornerstone technology for Big Data
  – For now? Google’s Cloud Dataflow
• Makes Big Data more manageable
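The MapReduce idea can be sketched in a few lines with the classic word-count example. This is plain single-process Python, not Hadoop's API; the function names are illustrative, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Reduce: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: list[str]) -> dict[str, int]:
    # On a cluster each document would be mapped on a different node;
    # here the documents are simply processed one after another.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle(pairs))
```

For example, `word_count(["big data", "big cloud"])` counts "big" twice and the other two words once each.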
Technology: Hadoop MapReduce
• Cheap, scalable
• Fast
• Store, then figure out
• Open Source
  – Cloudera
  – Hortonworks
Cloud Computing
• Business and consumer markets
• Increasing volumes of data to manage & interpret
• Moore’s law
Cloud Computing
• Virtualization
• Variable cost
• Scalability and agility
Cloud Computing
• Run apps from anywhere, from any device
• Easier to upgrade apps
• Makes IT more accessible
• Software as a Service (SaaS)
  – Ex: Salesforce.com
Cloud Computing
• Very buzzy re: VCs
• Key players: Google, Amazon, Windows Azure, Oracle, IBM, Rackspace, Salesforce.com
• Privacy issues
• Data ownership and security?
Changes in Analysis
• Data starved to data abundance
  – Too expensive, and not enough of it
• Samples and statistical significance
• Hypothesis driven versus Data driven: end of theory?
• Correlations to design algorithms
Changes in Analysis
• Patterns in the data: relationships
• What versus Why?
  – Amazon recommendations (risks?)
• Causation versus Correlation (good enough)
• Limitation: we know what, but not why
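The "what, but not why" limitation can be shown with a small correlation calculation. The figures below are made up for illustration: two series that both rise on hot days correlate strongly, yet neither causes the other. The `pearson` helper is a hand-written version of the standard Pearson coefficient.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily figures; hot weather pushes both numbers up.
ice_cream_sales = [36, 40, 48, 52, 60, 64]
sunburn_cases = [5, 9, 10, 15, 16, 21]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.2f}")
```

The coefficient is close to 1, which is enough to make a useful prediction, but it says nothing about the hidden driver (temperature) behind both series.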
Criticism
• Bias in data sources
  – Ex: Twitter and election results
• Importance of human intuition
• Filter bubble
Algorithms
• A repeating set of rules, series of instructions
• Designed from the data, or purchased
• Predicts future, based on the past (Big Data)
• Input data unique, passes algorithm, output unique (creates probability)
• Replaces human intuition: or mix of both
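The points above (rules designed from past data, new input in, probability out) can be sketched with a deliberately simple frequency-based predictor. The customer segments and purchase history are hypothetical, and real systems use far more elaborate models.

```python
from collections import Counter

# Hypothetical past records: (customer segment, did they buy?)
history = [
    ("student", True), ("student", False), ("student", False), ("student", False),
    ("parent", True), ("parent", True), ("parent", True), ("parent", False),
]


def fit(records):
    """'Designed from the data': count visits and purchases per segment."""
    visits, purchases = Counter(), Counter()
    for segment, bought in records:
        visits[segment] += 1
        purchases[segment] += int(bought)
    return visits, purchases


def predict(model, segment):
    """Return the past purchase rate for a segment as a probability."""
    visits, purchases = model
    if visits[segment] == 0:
        return None  # no past data, so this rule has nothing to say
    return purchases[segment] / visits[segment]


model = fit(history)
print(predict(model, "student"), predict(model, "parent"))
```

The `None` case is where human intuition still has to fill in: the past offers no guidance for a segment it has never seen.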
Machine Learning & Algorithms
• Increasing returns on data
• Designed into algorithm (real-time)
• Examples
  – Google (search)
  – Amazon (human editors versus algorithm) what, not why
  – Facebook (news feed)
  – Netflix (recommendations)
  – Zynga (data from game play)
Old World, New World
• Old World: Analogue, same content by vehicle, medium, retail store
• New World: Digital, unique content driven by algorithms to increase engagement and lock-in
Old World, New World
• New World Examples:
  – Netflix
  – Amazon
  – In store retail experience w/ mobile app and iBeacon technology (smart retail) Macy’s?
  – Target, pregnant teen
Unintended Consequence: Filter Bubble effect
• Human curated content versus serendipity
• Editors versus wisdom of the crowd
• Personal universe of online information
• Past clickstreams determine future choice: information determinism
  – Netflix, Amazon etc.
• Echo chamber effect
Unintended Consequence: Filter Bubble effect
• Echo chamber effect
• What we want to see, versus what we should see
  – Amazon / Netflix recommendations (what, not why)
  – Good enough recs. versus risks?
• Personalized: Google, Facebook etc.
• Personalization drives ad revenue
• Wikipedia is a standout
Privacy
• Law (and keeping up)
• Terms of Service (bargain) Binary choice
• Terms of Service: changes (for business model)
• Scope of reach: Facebook / Google
  – EU: Right to be Forgotten?
  – EU: Privacy rules change, Google
• Tension, privacy versus personalization
• Opt in, versus Opt out: by default
  – Org donations; Facebook Nearby
Privacy
• Knowledge of data collection: invisible
• Knowledge of data use (secondary: sometimes unimagined)
• Creep factor (how did they know?)
  – Ad retargeting
  – Snapchat?
Privacy
• Combining data can de-anonymize
• Data brokers: reselling
  – Acxiom
• Hacking, data theft (Target, Home Depot, Kmart, etc.)
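The first point above, that combining data can de-anonymize, can be illustrated by joining a nameless dataset to a named one on shared attributes. All records, field names, and the choice of ZIP code, birth year, and sex as linking fields are hypothetical, chosen only to show the mechanism.

```python
# Hypothetical "anonymized" purchases: no names, but other attributes remain.
purchases = [
    {"zip": "19716", "birth_year": 1990, "sex": "F", "item": "prenatal vitamins"},
    {"zip": "19711", "birth_year": 1985, "sex": "M", "item": "running shoes"},
]

# Hypothetical second dataset that carries names (for example, a public list).
public_records = [
    {"name": "Person A", "zip": "19716", "birth_year": 1990, "sex": "F"},
    {"name": "Person B", "zip": "19711", "birth_year": 1985, "sex": "M"},
    {"name": "Person C", "zip": "19711", "birth_year": 1985, "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def reidentify(anonymous_rows, named_rows):
    """Link rows whose quasi-identifiers match exactly one named person."""
    index = {}
    for row in named_rows:
        key = tuple(row[field] for field in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(row["name"])

    matches = []
    for row in anonymous_rows:
        key = tuple(row[field] for field in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # a unique match reveals who the row belongs to
            matches.append((names[0], row["item"]))
    return matches


print(reidentify(purchases, public_records))
```

Only the first purchase is linked to a name here; the second stays ambiguous because two people share the same attributes, which is why more combined data means less anonymity.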