This survey of big data and advanced technology as it relates to media was given to an industry association in November of 2012.
“Big Data” for Media
November 20, 2012
This presentation was prepared by SoundView Technology Group and is being provided to the Danish Media Association to promote discussion and provide information. All information contained herein is freely shareable by the association as well as anyone participating in the workshop. Attribution is appreciated but not required.
1979-1981 Hardware, Programming
1981-1992 Applied AI, CMU, IBM
1993-2004 Equity Research, Banking, Brokerage, Investing
2004-Present Advisor, Entrepreneur, Researcher, Publisher, Programmer
Kris Tuttle
Related Work
Eddie Obeng: After Midnight
Discussion Points
• Flyover of recent significant related events and developments
• Unpacking “Big Data” and looking at the pieces
• The intersection of big data and media:
o The obvious – advertising, targeting, conversions
o Data analysis = content?
o Content discovery, generation and ranking
o Hyper-personalization & attenuation
• Wrap up and transition to Q&A
© SoundView Technology Group 2012
Flyover – Evolution of what’s News
Source: Monday Note
Flyover – Artist Control
Flyover – “Amateur” Content Development
The Kickstarter model is disrupting the way things are created, produced and initially adopted.
Flyover – Sourcing & Distribution
Flyover – Data Analysis as Content
Flyover – Mobile
Big Data Applications
How Big Data Looks
Smörgåsbord
Size
Speed
Sloppy
Unpacking Big Data
• Often not that big in terms of size: frequently less than 1 TB
• There is truly big data too, and it is sometimes enormous (~20 PB/day)
• Tends to have additional special features:
o Multiple sources – transactional, log files, databases, geospatial
o Semi-structured or unstructured – clickstreams, HTML, sensor data
o Shorter duration – seconds to days versus weeks to months
o More flexible – sometimes schema-less, and built for speed
• Today the default technology for most projects is Hadoop
• Some machine learning is becoming a common feature
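As a rough illustration of the map/reduce pattern that Hadoop popularized, the sketch below counts page views from semi-structured clickstream lines on a single machine in plain Python. It does not use Hadoop's API; the log format, field names and function names are all hypothetical, chosen only to show how "sloppy" input can be tolerated.

```python
import json
from collections import defaultdict

# Hypothetical clickstream: one JSON object per line, with some bad records.
RAW_LINES = [
    '{"user": "a", "page": "/sports", "ts": 1}',
    '{"user": "b", "page": "/news", "ts": 2}',
    'not json at all',
    '{"user": "a", "page": "/news", "ts": 3}',
    '{"user": "c", "ts": 4}',
]


def map_phase(line):
    """Emit (page, 1) for one raw line; emit nothing for unusable input."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return []
    if not isinstance(record, dict):
        return []
    page = record.get("page")
    return [(page, 1)] if page else []


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def count_page_views(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


PAGE_VIEWS = count_page_views(RAW_LINES)
```

In a real cluster the map and reduce steps would run in parallel across many machines; here they run in sequence only to make the data flow visible.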
Tools are Immature
New DB Technologies
New Language Technology
Data for Targeting & Conversion
• Integrating geospatial data for local advertising
• Intersecting social data with news and search for customization
• Going beyond “A/B testing” and using real-time data analysis to improve content on the fly
• Saved data opens the door to machine learning and better algorithms
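One common way to go beyond a fixed A/B test is an epsilon-greedy scheme, which keeps shifting traffic toward whichever variant is performing best while still exploring occasionally. The sketch below is only an illustration of that idea, not a method taken from the presentation; the class name, the headline labels and the click probabilities are made up.

```python
import random


class EpsilonGreedy:
    """Choose among content variants, favoring the best click rate so far."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.shows = {v: 0 for v in variants}
        self.clicks = {v: 0 for v in variants}
        self.rng = random.Random(seed)

    def rate(self, variant):
        """Observed click-through rate; 0.0 until the variant has been shown."""
        shows = self.shows[variant]
        return self.clicks[variant] / shows if shows else 0.0

    def choose(self):
        # Explore a random variant with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))
        return max(self.shows, key=self.rate)

    def record(self, variant, clicked):
        """Store the outcome of showing one variant to one reader."""
        self.shows[variant] += 1
        if clicked:
            self.clicks[variant] += 1


def simulate(true_rates, rounds=5000, seed=42):
    """Simulate readers using hypothetical click probabilities per variant."""
    bandit = EpsilonGreedy(list(true_rates), epsilon=0.1, seed=seed)
    reader = random.Random(seed + 1)
    for _ in range(rounds):
        variant = bandit.choose()
        bandit.record(variant, reader.random() < true_rates[variant])
    return bandit
```

A call such as `simulate({"headline_1": 0.05, "headline_2": 0.10})` returns the bandit so its `shows` and `clicks` counters can be inspected. Because every impression is recorded, the same saved data could later feed the machine learning mentioned above.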
Data Analysis as Content
• Building more proprietary data and surveys is a no-brainer
• In-house data scientists and coding capabilities will help
• There are quite a few good text analysis and processing tools out there, e.g. NLTK for Python
• NoSQL and network DB technology can help
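To give a flavor of the text analysis mentioned above, here is a minimal sketch that finds the most frequent terms across a few articles using only the Python standard library. The sample sentences and the tiny stopword list are invented for illustration; a toolkit such as NLTK provides far richer tokenizers and stopword lists.

```python
import re
from collections import Counter

# Tiny, hypothetical stopword list; real projects would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(documents, n=3):
    """Return the n most frequent non-stopword terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOPWORDS)
    return counts.most_common(n)


# Made-up sample articles.
ARTICLES = [
    "The election polls tightened in the final week.",
    "Polls and data analysis shaped the election coverage.",
    "Data journalism is growing in newsrooms.",
]

TOP = top_terms(ARTICLES, n=3)
```

Term counts like these are a starting point for turning an archive into content of its own, for example topic trends over time.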
Data-driven Editorial & Contributions
Using Data to Hyper-Personalize
• The future of reader relationships is about more than content quality – readers will grow to expect highly personalized content
• Managing, distributing and “learning” from collected data requires a much more sophisticated view of how content is represented, managed and distributed
Getting Started – Education
http://bigdatauniversity.com/
Getting Started – Data Sources
Getting Started – Vendors
Getting Hygge with Big Data
• We are still in the very early days of implementation – nothing is precluded
• Appoint or hire a “data czar” – more than a point person, someone who can translate the technology into implementation opportunities
• For each line of business, map revenue growth and profit margins to variables that might be improved with more data and better analysis
• In-house custom coding capability is a strategic advantage – data integration remains a big challenge
• Start building some proprietary data – surveys, histories, aggregations
• Are any ideas worth implementing in the local market? (Zite, Huffington Post, NYT/Nate Silver)
• Experiment, experiment, experiment
Kris Tuttle
+1-617-934-1877 (US)
+33(0)6.7439.8593 (France)