Transcript
Page 1: Consumer sentiment analysis  with Twitter

Consumer sentiment analysis with Twitter

Reetta Suonperä

August 2013

My dataset

• Two months, one csv.gz file per day

• In total about 1.2 billion tweets

• Example record: It's always easy for a person to say get over, but you don't feel what heart feels to make that statment|PrettynPinkC215|2011-02-01T04:01:16Z|2011-02-01T04:00:48Z|1296532876139018784|
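A minimal sketch of how one daily file in this layout could be read. The record above looks pipe-delimited with a trailing "|", so the code assumes exactly that. The column names in FIELDS and the helper names parse_tweets and read_day are hypothetical; the slides do not name the fields or say which timestamp is which.

```python
import csv
import gzip

# Hypothetical column names inferred from the sample record (not given in the slides).
FIELDS = ["text", "user", "timestamp_a", "timestamp_b", "tweet_id"]


def parse_tweets(lines):
    """Yield one dict per pipe-delimited record, skipping malformed rows."""
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        # The sample record ends with "|", which produces an empty last field.
        if row and row[-1] == "":
            row = row[:-1]
        if len(row) != len(FIELDS):
            continue  # e.g. tweet text that itself contains "|"
        yield dict(zip(FIELDS, row))


def read_day(path):
    """Stream the records of one gzip-compressed daily file."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        yield from parse_tweets(handle)
```

Streaming row by row avoids loading a whole day into memory, which matters at roughly 1.2 billion tweets in total. Rows whose text contains a literal "|" are silently dropped here; a real pipeline would need a rule for them.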

The tools I use

• General approach: natural language processing (NLP)

• The Natural Language Toolkit (NLTK)

Introduction: the consumer sentiment index

• A survey-based indicator of consumer confidence or sentiment

• History goes back to 1946 at the University of Michigan

• Ireland’s consumer sentiment index by the ESRI since 1996

ESRI survey questions

• Q1: Economic situation in the country (next 12 months)

• Q2: Unemployment in the country (next 12 months)

• Q3: Household financial situation (12 months ago)

• Q4: Household financial situation (next 12 months)

• Q5: Good/bad time to buy large household items

Answers: positive/neutral/negative

This is what it looks like:

The KBC/ESRI consumer sentiment index

We can speculate on what drives sentiment – but we can’t really know

On the June 2013 improvement in households’ assessment of their personal finances:

“We think that the ECB rate cut in May played some role … a combination of low inflation, early summer sales and increasing signs of improvement in the residential property market could have contributed…”

On the decline in the July 2013 index:

“We think reports that the Irish economy had fallen back into recession and a couple of high profile job loss announcements unnerved consumers last month.”

Motivation: why using Twitter could help

• More timely

• Continuous information

• Save money

• Insight into what drives sentiment

Previous research

• O’Connor et al (2010): From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series

• An index based on tweets containing the word “jobs” correlates with the Michigan index and Gallup’s daily poll

• Indices based on “economy” or “job” correlate poorly!

The process (simplified)

Initial wordlist topics

• General economic situation: general economy, good times, bad times, econ policy

• Unemployment/employment: job losses, job gains

• Household financial situation: general, income, credit, feeling broke, feeling flush

• Buying climate (major household items): acquire/buy, cost, pricey, bargain

Using WordNet to expand seed wordlist

• Use WordNet to find synonyms for initial keyword list:

• Words have many different meanings

• Include part-of-speech tag

• Word doesn’t exist in WordNet?

• Output does not include tenses or plurals
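The expansion step above could be sketched roughly as follows. This is an illustration under assumptions, not the code behind the slides: it assumes NLTK 3 with the WordNet corpus downloaded (for example via nltk.download("wordnet")), and the helper names wordnet_synonyms and expand_wordlist are hypothetical. wn.synsets(word, pos=...) and Synset.lemma_names() are the NLTK calls used.

```python
def wordnet_synonyms(word, pos):
    """Collect lemma names from every WordNet synset of `word` with the given POS."""
    # Imported here so the rest of the sketch works without NLTK installed.
    from nltk.corpus import wordnet as wn

    names = set()
    # Restricting by POS ('n', 'v', 'a', 'r') trims unrelated senses of the word.
    for synset in wn.synsets(word, pos=pos):
        for lemma in synset.lemma_names():
            names.add(lemma.replace("_", " ").lower())
    return names


def expand_wordlist(seeds, lookup=wordnet_synonyms):
    """Map each (word, pos) seed to a sorted list of the seed plus its synonyms."""
    expanded = {}
    for word, pos in seeds:
        synonyms = lookup(word, pos)
        # A word missing from WordNet yields no synsets; the seed itself is kept.
        expanded[(word, pos)] = sorted(synonyms | {word})
    return expanded
```

A call such as expand_wordlist([("buy", "v"), ("bargain", "n")]) would return base-form lemmas only, so plurals and tenses still need to be handled separately, for example by stemming both the wordlist and the tweets. The result also mixes all senses that share the POS tag, so some manual pruning is likely needed.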

Pre-processing tasks

• Regular expressions for more basic tasks:

• Cleaning, tokenising URLs, usernames

• NLTK functionality for more complex tasks

• Stopword removal, stemming, POS-tagging
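The regular-expression part of the pre-processing might look something like the sketch below. The patterns, the placeholder tokens "<url>" and "<user>", and the function name preprocess are all assumptions made for illustration; the slides do not show the actual expressions. The NLTK steps (stopword removal, stemming, POS-tagging) are not covered here.

```python
import re

# Illustrative patterns only; real tweets contain more edge cases than these cover.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
USER_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"<url>|<user>|[a-z0-9]+(?:'[a-z]+)?")


def preprocess(text):
    """Lowercase a tweet, replace URLs and usernames with placeholders, and tokenise."""
    text = text.lower()
    text = URL_RE.sub("<url>", text)    # every link becomes one placeholder token
    text = USER_RE.sub("<user>", text)  # every @mention becomes one placeholder token
    return TOKEN_RE.findall(text)       # punctuation is dropped, contractions are kept
```

Replacing URLs and usernames with a single token each keeps the information that a link or mention was present while stopping thousands of unique strings from inflating the vocabulary.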

Fine selection – not there yet…

• Do more filtering using bigrams?

• “I broke”

• “pay cut”

• “new job”

• Use POS tags?

• Classification?
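One way the bigram idea could work is sketched below. The slides list the three bigrams without saying how they would be used, so the split into a "keep" set and a "drop" set is an assumption: here "pay cut" and "new job" are treated as signals to keep, and "I broke" as a likely false positive for the keyword "broke" (as in "I broke my phone"). The names KEEP_BIGRAMS, DROP_BIGRAMS, bigrams and fine_select are hypothetical.

```python
# Hypothetical bigram sets built from the three examples on the slide.
KEEP_BIGRAMS = {("pay", "cut"), ("new", "job")}
DROP_BIGRAMS = {("i", "broke")}


def bigrams(tokens):
    """Return adjacent token pairs, e.g. ['new', 'job'] -> [('new', 'job')]."""
    return list(zip(tokens, tokens[1:]))


def fine_select(tokens, keep, drop=frozenset()):
    """Keep a tweet if it has a wanted bigram and no excluded bigram."""
    pairs = set(bigrams(tokens))
    if pairs & set(drop):
        return False  # an excluded pattern overrides any wanted one
    return bool(pairs & set(keep))
```

With lowercased tokens, fine_select("got a new job today".split(), KEEP_BIGRAMS, DROP_BIGRAMS) keeps the tweet, while "i broke my phone" is rejected. Fixed bigram lists are brittle, which is presumably why POS tags and classification are listed as alternatives.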

The to-do list

• Finalise fine selection

• Sentiment classification

• Visualisation

Resources

• www.nltk.org

• Natural Language Processing with Python: http://nltk.org/book/

• Python Text Processing with NLTK 2.0 Cookbook

Resources

• O’Connor et al (2010): From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series

• Bollen et al (2011): Twitter mood predicts the stock market

• Bollen et al (2011): Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena

• Go et al (2009): Twitter sentiment classification using distant supervision

• Jiang et al (2011): Target-dependent Twitter Sentiment Classification

Questions?

