The Big Data Revolution
Learning from Big Data: Text, Feelings and Machine Learning
Cases
Conclusions
Big Data
Daniel Hardt
IT Management, CBS
Supply Chain Leaders Forum, 3 September 2015
Daniel Hardt Big Data
Outline
1 The Big Data Revolution
    Watson
    Google Self-Driving Car
2 Learning from Big Data: Text, Feelings and Machine Learning
    Sentiment Analysis
    Mining Facebook for Feelings
3 Cases
    Big Transportation and Trade Data Analytics
    Big Data at Roskilde Festival
4 Conclusions
Increasing Availability of Data
Data Challenges
Characteristics
Lots of Photos
fstoppers.com
Big Data: Definition
Big data: data sets so large or complex that traditional data processing applications are inadequate. (Wikipedia)
Is this Surprising?
Moore’s law: computing power doubles every 2 years (roughly)
forums.xkcd.com
Is this True?
Big data: data sets so large or complex that traditional data processing applications are inadequate. (Wikipedia)
Increase of data is keeping pace with processing power
In fact, increase in data is itself supporting new ways to process data: Artificial Intelligence
Watson: The Jeopardy Challenge
Watson: Jeopardy
Jeopardy is Hard!
Watson: Health Care
IBM – Evolution of Computing
Cognitive Computing
Google Self-Driving Car
The Google Car’s View of the World
Madrigal (2014), Atlantic
The Trick: “Crawling” the World
Google wants to make the self-driving car problem into a Big Data problem
Car has ultra-detailed map for every road it travels on, “down to tiny details like the position and height of every single curb . . . a precision measured in inches”
Google has mapped 2,000 miles of road. The US road network has 4 million miles of road. “It is work,” Urmson added, shrugging, “but it is not intimidating work.”
Madrigal (2014), Atlantic
Liu, Bing. "Sentiment analysis and subjectivity." Handbook of Natural Language Processing 2 (2010): 568.
Sentiment analysis or opinion mining is the computational study of opinions, sentiments and emotions expressed in text
Lots of Buzz!
Business
Machine Learning Methods
Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. "Thumbs up?: sentiment classification using machine learning techniques." Proceedings of the ACL-02 conference on Empirical methods in natural language processing, Volume 10. Association for Computational Linguistics, 2002.
Bag of words: With lexicon of m words, each document d is represented by the document vector (n_1(d), n_2(d), ..., n_m(d)), where n_i(d) is the number of times word i occurs in d
Machine Learning: Naive Bayes, Maximum Entropy, Support Vector Machines
Naive Bayes:
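A minimal sketch of the bag-of-words document vector and a multinomial Naive Bayes classifier built on it, in plain Python. The toy training texts, the labels, the function names and the add-one smoothing are all assumptions made for illustration; none of them come from the talk or from the cited paper's experiments.

```python
import math
from collections import Counter

def bag_of_words(text, lexicon):
    """Return the document vector (n_1(d), ..., n_m(d)) of word counts."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in lexicon]

def train_naive_bayes(vectors, labels):
    """Estimate log priors and smoothed log word likelihoods per class."""
    m = len(vectors[0])
    log_prior, log_like = {}, {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        log_prior[c] = math.log(len(rows) / len(vectors))
        totals = [sum(col) for col in zip(*rows)]  # count of each word in class c
        denom = sum(totals) + m                    # add-one smoothing (assumption)
        log_like[c] = [math.log((t + 1) / denom) for t in totals]
    return log_prior, log_like

def classify(vector, model):
    """Pick the class maximizing log P(c) + sum_i n_i(d) * log P(w_i | c)."""
    log_prior, log_like = model
    def score(c):
        return log_prior[c] + sum(n * lp for n, lp in zip(vector, log_like[c]))
    return max(log_prior, key=score)

# Hypothetical toy training data, for illustration only.
train = [("great wonderful movie", "pos"),
         ("awesome great fun", "pos"),
         ("boring awful movie", "neg"),
         ("awful annoying plot", "neg")]
lexicon = sorted({w for text, _ in train for w in text.split()})
vectors = [bag_of_words(text, lexicon) for text, _ in train]
labels = [label for _, label in train]
model = train_naive_bayes(vectors, labels)

print(classify(bag_of_words("great fun movie", lexicon), model))  # prints "pos"
```

Words outside the lexicon are simply ignored by `bag_of_words`, which is one common convention and not necessarily the one used in the work described here.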
Data: Facebook Feelings
Arousal and Valence: Data
Five Basic Feelings: Data
Animated
    Excited 155291
    Pumped 2979
    Surprised 752
    Amused 14993
Joy
    Happy 114259
    Wonderful 54691
    Awesome 22351
    Super 5794
    Great 55180
    Fantastic 3596
    Delighted 805
    Satisfied 1349
    Content 628
    Hopeful 21399
Angry
    Angry 12680
    Pissed 3851
    Annoyed 16839
    Frustrated 1145
    Disappointed 2534
    Disgusted 1566
Fearful
    Worried 3274
    Scared 2075
    Anxious 1002
    Shocked 1391
    Confused 3904
Empowered
    Determined 29850
    Confident 2341
    Accomplished 6570
    Proud 31363
Classifier
Basic Feelings (5-way classification)
    Classifier: MaxEnt
    Training Accuracy: .87
    Testing Accuracy: .75 (10-fold validation)
Arousal (2-way classification)
    Classifier: MaxEnt
    Training Accuracy: .99
    Testing Accuracy: .80 (10-fold validation)
Valence (2-way classification)
    Classifier: MaxEnt
    Training Accuracy: .99
    Testing Accuracy: .83 (10-fold validation)
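The gap between training and testing accuracy above comes from evaluating on data the classifier has not seen. A rough sketch of how a k-fold held-out accuracy can be computed follows. The majority-label classifier is a hypothetical stand-in (not the MaxEnt model from the slides), and the toy labels and function names are assumptions for illustration.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most 1."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(examples, labels, train_fn, predict_fn, k=10):
    """Return mean held-out accuracy over k folds (assumes len(examples) >= k)."""
    accuracies = []
    for fold in k_fold_indices(len(examples), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(examples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        model = train_fn(train_x, train_y)
        correct = sum(predict_fn(model, examples[i]) == labels[i] for i in fold)
        accuracies.append(correct / len(fold))
    return sum(accuracies) / len(accuracies)

# Hypothetical stand-in classifier: always predict the most common training label.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

# Hypothetical toy data: 14 "joy" examples followed by 6 "angry" examples.
examples = list(range(20))
labels = ["joy"] * 14 + ["angry"] * 6
accuracy = cross_validate(examples, labels, train_majority, predict_majority, k=10)
print(round(accuracy, 2))
```

In practice the examples would usually be shuffled before splitting, so that each fold contains a mix of classes; the contiguous folds here keep the sketch deterministic.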
Two-D Classification: Valence and Arousal
Two-D Classification: Comparisons
Feeling Meter: Manual Assessment
Test Set: 160 examples from different sources
Manual Task: Order Feelings Expressed (1 is most expressed, 5 least; 0 not expressed at all)
Results: Binary Decision: is feeling expressed or not? (Ignore examples where 1st coder notes no feelings expressed, which leaves 92 examples)
Agreement on Feelings Expressed
    1st coder vs 2nd coder: 0.797 (366 out of 459, in 92 cases)
    1st coder vs System: 0.734 (337 out of 459, in 92 cases)
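One way such an agreement score could be computed is sketched below: each ranking is reduced to a binary "expressed or not" decision per feeling, and the score is the fraction of decisions on which the two coders match. The data layout, the function name and the toy rankings are assumptions for illustration; the talk does not describe the exact procedure.

```python
def binary_agreement(coder_a, coder_b):
    """Fraction of (example, feeling) decisions on which two coders agree.

    Each argument is a list with one dict per example, mapping a feeling
    name to its rank: 0 means "not expressed", 1..5 means expressed.
    Returns (agreement, number_agreed, number_of_decisions).
    """
    agree = total = 0
    for ranks_a, ranks_b in zip(coder_a, coder_b):
        for feeling in ranks_a:
            expressed_a = ranks_a[feeling] > 0
            expressed_b = ranks_b.get(feeling, 0) > 0
            agree += expressed_a == expressed_b
            total += 1
    return agree / total, agree, total

# Hypothetical rankings for two examples, for illustration only.
coder_1 = [
    {"Animated": 1, "Joy": 2, "Angry": 0, "Fearful": 0, "Empowered": 0},
    {"Animated": 0, "Joy": 0, "Angry": 1, "Fearful": 2, "Empowered": 0},
]
system = [
    {"Animated": 1, "Joy": 0, "Angry": 0, "Fearful": 0, "Empowered": 2},
    {"Animated": 0, "Joy": 0, "Angry": 1, "Fearful": 1, "Empowered": 0},
]
print(binary_agreement(coder_1, system))  # (0.8, 8, 10)

# The reported scores are the same kind of ratio:
print(round(366 / 459, 3))  # 0.797
print(round(337 / 459, 3))  # 0.734
```

Raw agreement of this kind does not correct for agreement expected by chance, which is worth keeping in mind when comparing the two figures.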
Assessment of Country Logistics Systems
What are logistics and supply chain costs in different countries?
Specific transportation system cost categories like road, rail, air etc.
Interaction of these costs with each other and with information and communication systems
Relevant to the investment decision-making considerations of firms
An Analysis based on Reports
Assessing Relevant Factors from Reports
A New Analysis using Language Technology
Extract relevant factors based on distribution of words and terms in reports
Use metrics like TF-IDF, which finds terms that are likely to be characteristic of a given text
With automatic analysis, can consider 10 or 100 times larger quantities of data: reports over a ten-year period, with dozens of countries
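A minimal sketch of TF-IDF scoring in plain Python, to show why it surfaces terms characteristic of one text: a term scores highly when it is frequent in a document but rare across the collection. The report snippets and function names are hypothetical, and the particular weighting (relative term frequency times log(N / document frequency)) is one common variant chosen here as an assumption.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: score} dict per document.

    tf is the term's relative frequency within the document; idf is
    log(N / df), where df is the number of documents containing the term.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

def top_terms(score_dict, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(score_dict, key=lambda t: (-score_dict[t], t))[:k]

# Hypothetical report snippets, for illustration only.
reports = [
    "customs delays raise freight costs",
    "rail freight costs fall as rail capacity grows",
    "port congestion raises freight costs",
]
scores = tf_idf(reports)
print(top_terms(scores[1], k=1))  # ['rail']
print(scores[0]["freight"])       # 0.0, since "freight" occurs in every report
```

Terms such as "freight" and "costs" appear in every snippet and so receive a score of zero, while "rail" stands out for the second report.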
Roskilde slides
from Per Østergaard Jacobsen (CBS) and Henrik Hammer Eliassen (IBM)
Big Data Analysis
Where do people go?
What do they buy?
Machine Learning and AI can predict: under a given set of conditions (weather, previous movements, age, gender, etc.), what is the probability of a given purchase?
Watson technology is being brought to bear on such questions
Relevant for supply chain
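As a rough illustration of the prediction question, the probability of a purchase under given conditions can be estimated by simple counting over past records. This is only a baseline sketch: the records, attribute names and function name are hypothetical, and it is not a description of the Watson-based approach used at Roskilde.

```python
from collections import Counter

def purchase_probability(records, conditions, item):
    """Estimate P(item purchased | conditions) as a relative frequency.

    records: list of (attributes_dict, purchased_item) pairs.
    conditions: dict of attribute values a record must match.
    Returns None when no record matches the conditions.
    """
    matching = [bought for attrs, bought in records
                if all(attrs.get(k) == v for k, v in conditions.items())]
    if not matching:
        return None
    return Counter(matching)[item] / len(matching)

# Hypothetical purchase records, for illustration only.
records = [
    ({"weather": "rain", "age": "18-25"}, "poncho"),
    ({"weather": "rain", "age": "18-25"}, "beer"),
    ({"weather": "rain", "age": "26-35"}, "poncho"),
    ({"weather": "sun", "age": "18-25"}, "beer"),
]
p = purchase_probability(records, {"weather": "rain"}, "poncho")
print(round(p, 2))  # 0.67
```

Counting breaks down when few records match a detailed set of conditions, which is where learned models that generalize across conditions become useful.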
Conclusions
All industries will be fundamentally transformed by Big Data
Many changes in areas like transportation and consumer forecasting that are crucial for supply chain management
Lots of things happening at CBS!