Studying sentiment on social media Ana Isabel Canhoto - Oxford Brookes University www.anacanhoto.com Canhoto 2015 1

Challenges of using Twitter for sentiment analysis


Page 1: Challenges of using Twitter for sentiment analysis

Studying sentiment on social media

Ana Isabel Canhoto - Oxford Brookes University
www.anacanhoto.com

Canhoto 2015 1

Page 2: Challenges of using Twitter for sentiment analysis

Why study sentiment?

Emotions impact on:
• Information retrieval
• Information processing
• Information retention
• Decision-making
• Behaviour
• Assessment of consumption experiences

Image source: http://images.flatworldknowledge.com/sirgy/sirgy-fig06_x003.jpg

Page 3: Challenges of using Twitter for sentiment analysis

What are we talking about when we talk about sentiment analysis?

More: http://www.mxmindia.com/2012/03/tweets-take-wing-in-airline-social-media-study/

Page 4: Challenges of using Twitter for sentiment analysis

Traditional approaches - Experiments

More: http://www.psych.nyu.edu/amodiolab/Publications_files/Social_Psychological_Methods_of_Emotion_Elicitation.pdf

Page 5: Challenges of using Twitter for sentiment analysis

Traditional approaches – Interviews

Page 6: Challenges of using Twitter for sentiment analysis

The Social Media Promise

• Real time
• Unprompted
• No need to recall past behaviour
• Non-intrusive
• Cost-effective…

Page 7: Challenges of using Twitter for sentiment analysis

Source: http://cs-wordpress.s3.amazonaws.com/crowdsource-v4/uploads/2013/11/sentiment-analysis-ui.png

Page 8: Challenges of using Twitter for sentiment analysis

Pratik Thakar, Head of creative content for Coca-Cola Asia-Pacific:

“Every office has a listening centre listening to what people are saying about our brands, good and bad, 24 hours a day. We look at what’s trending and how we can respond [to discussions about Coca-Cola] and to anything happening in the world. (…) I believe that social media is a big focus group. It’s a good way to identify trends and what people are talking about”

Source: http://www.campaignasia.com/Article/402239,Dont+believe+everything+you+hear+Cokes+Pratik+Thakar.aspx

Page 9: Challenges of using Twitter for sentiment analysis

Many turning to third parties for automated tracking and analysis of SM conversations…

44% of businesses engaged in sentiment analysis
Source: Hilpern, K. 'In it to win it?' The Marketer, July-August 2012, pp.34-37

Estimated cumulative revenues of c. $2bn in 2014

Source: http://breakthroughanalysis.com/2013/12/30/aw-re-aw-text-analytics-industry-study_start-ups-and-aquisition-activities_max-breitsprecher/

How accurate are these tools?

Page 10: Challenges of using Twitter for sentiment analysis

Promotional literature: accuracy rates of 70% - 80% (Davis & O’Flaherty, 2012)
– Not clear how the coefficients were calculated
– Not possible to independently verify these claims

Page 11: Challenges of using Twitter for sentiment analysis

Open access

Page 12: Challenges of using Twitter for sentiment analysis

Sources of vulnerability

Page 13: Challenges of using Twitter for sentiment analysis

• Accuracy: extent to which different researchers agree on the classification of a particular data object (Gwet, 2012)
– System vs human coders
– System A vs System B…
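Agreement between two coders can be quantified in several ways, and the slide does not say which coefficient was used. As a minimal sketch, assuming two coders labelled the same tweets, the code below computes raw percent agreement and Cohen's kappa (a common chance-corrected measure, chosen here only for illustration). The label lists are hypothetical, not data from the study.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which both coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: probability both coders pick the same label independently
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six tweets (not taken from the study)
human = ["positive", "negative", "neutral", "positive", "negative", "positive"]
tool = ["positive", "neutral", "neutral", "positive", "positive", "positive"]

print(round(percent_agreement(human, tool), 3))
print(round(cohens_kappa(human, tool), 3))
```

The same two functions apply to either comparison on the slide: system vs human coders, or System A vs System B.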

Page 14: Challenges of using Twitter for sentiment analysis

Conversations about coffee
• Food and beverages = most widely discussed topic on social media (Forsyth, 2011)
• ‘Charged with a wide range of cultural meanings’ (Grinshpun, 2014)
• Subject of many (netnographic) studies - e.g., Kozinets, 2002

Page 15: Challenges of using Twitter for sentiment analysis

• Sample of 200 tweets
• Search terms: ‘coffee’ + variants ‘latte’, ‘mocha’, ‘cappuccino’, ‘espresso’ and ‘Americano’, as well as the terms ‘flavour’, ‘aroma’ and ‘caffeine’
• Multiple users
– Exclude manufacturers and retailers
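The sampling steps above could be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: tweets are assumed to be dictionaries with hypothetical "user" and "text" keys, the list of manufacturer and retailer accounts is assumed to be supplied by the analyst, and the random draw of 200 is an assumption about how the sample was taken.

```python
import random
import re

# Search terms listed on the slide
SEARCH_TERMS = [
    "coffee", "latte", "mocha", "cappuccino", "espresso",
    "americano", "flavour", "aroma", "caffeine",
]
TERM_PATTERN = re.compile(r"\b(" + "|".join(SEARCH_TERMS) + r")\b", re.IGNORECASE)


def select_sample(tweets, excluded_users, size=200, seed=0):
    """Keep tweets that match a search term and are not from excluded accounts,
    then draw a random sample of at most `size` tweets."""
    excluded = {user.lower() for user in excluded_users}
    matching = [
        tweet for tweet in tweets
        if TERM_PATTERN.search(tweet["text"])
        and tweet["user"].lower() not in excluded
    ]
    if len(matching) <= size:
        return matching
    return random.Random(seed).sample(matching, size)


# Hypothetical input, for illustration only
tweets = [
    {"user": "alice", "text": "Love my latte this morning"},
    {"user": "BrandCo", "text": "Try our new espresso blend"},
    {"user": "bob", "text": "Nice weather today"},
]
sample = select_sample(tweets, excluded_users=["brandco"])
print([tweet["user"] for tweet in sample])
```

Whole-word matching keeps brand hashtags such as #Nespresso from matching ‘espresso’ by accident; whether that is desirable depends on the research question.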

Page 16: Challenges of using Twitter for sentiment analysis

Analysis - Stage 1: Polarity of emotion
• Positive vs. negative
– As per Koppel & Schler (2006): comments that did not express an emotion were given the code ‘neutral’
• Manual + 2 automated tools
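The slide does not say how the two automated tools assign polarity. Purely as an illustration of the coding scheme, a toy lexicon-based coder might look like the sketch below. The word lists are hypothetical, and the "mixed" code for ties is an assumption added here; it is not part of the positive / negative / neutral scheme on the slide.

```python
import re

# Hypothetical mini-lexicons; real tools use far larger dictionaries
POSITIVE_WORDS = {"great", "good", "yummy", "love"}
NEGATIVE_WORDS = {"sucks", "bad", "awful", "hate"}


def code_polarity(text):
    """Assign a polarity code by counting lexicon hits in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE_WORDS for word in words)
    negative_hits = sum(word in NEGATIVE_WORDS for word in words)
    if positive_hits == 0 and negative_hits == 0:
        return "neutral"  # no emotion expressed, as per Koppel & Schler (2006)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "mixed"  # assumption: ties are flagged rather than forced into a code


print(code_polarity("Having a great cup of coffee"))
print(code_polarity("Coffee at 9am"))
print(code_polarity("This espresso is awful"))
```

A word-counting approach like this has no notion of context or target, which is one reason automated and manual codes can diverge.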

Page 17: Challenges of using Twitter for sentiment analysis

Page 18: Challenges of using Twitter for sentiment analysis

Analysis - Stage 2: Type of emotion
• As per Plutchik (2001)
• Manual + 3 automated tools

Page 19: Challenges of using Twitter for sentiment analysis

Page 20: Challenges of using Twitter for sentiment analysis

Page 21: Challenges of using Twitter for sentiment analysis

Messages where all types of coders agreed

Examples:
“Found a euro cent on my walk and have a great cup of coffee in hand. Monday is already off to a good start”

“Feeling much more alive this morning now that I’ve had my coffee. Thank you #Nespresso”

Clearly positive!

Page 22: Challenges of using Twitter for sentiment analysis

Messages where automated tools agreed (but different from manual coding)

Example:
“In uni. I think without this cup of coffee I would hulk out”

Very short segments

Page 23: Challenges of using Twitter for sentiment analysis

The rest

Example:
“The early shift sucks. Oh well at least my latte is yummy :)”

Multiple objects

Multiple emotions

Page 24: Challenges of using Twitter for sentiment analysis

Example:
“100 copies of Ghosts sold overnight means a definite Starbucks run this morning. Possibly coffee out twice this week! Maybe even sushi!!”

Lack of emotionally charged words

Page 25: Challenges of using Twitter for sentiment analysis

Example:
“How the heck am I supposed to be able to sleep well without coffee in my system? fucking snow”

Subtlety - Negative sentiment due to absence of product

Page 26: Challenges of using Twitter for sentiment analysis

Example:
“Having coffee with my grandma before work right now. QT”

Syntax and style, especially abbreviations and slang

Page 27: Challenges of using Twitter for sentiment analysis

Example:
“This coffee shop needs to change there music up every once and a while. Or maybe I should go home”

Target of emotion is not coffee!

Page 28: Challenges of using Twitter for sentiment analysis

Page 29: Challenges of using Twitter for sentiment analysis

Page 30: Challenges of using Twitter for sentiment analysis

Compounded by:
• Very short segments of text
• Rich in abbreviations and slang
• Typos or grammatical errors
• Specific culture and netiquette of the media
• Skills of data analyst

Page 31: Challenges of using Twitter for sentiment analysis

As a consequence:
• Inaccurate representation of the overall sentiment [towards coffee]
– Both sentiment polarity and emotional state
• Segments that should have been excluded from the analysis were retained in the corpus of data
– Might skew results
• Concerns with the quality of the insights and subsequent decisions

Page 32: Challenges of using Twitter for sentiment analysis

To improve accuracy [1/2]:
• Take into consideration the social context of the conversation
– E.g., Tweets before or after the one being coded; wide patterns (e.g., Mondays); cultural connotations (e.g., Japan vs. UK)
– But what about sarcasm and highly contextualised uses of language? (e.g., Sick)

Pratik Thakar:
“When people say good things, you don’t just take it as it is. Someone might be asking them to say it; there might be some design mechanism working. But when people are unhappy, they go super-loud, and they are genuine at that time.”
Source: http://www.campaignasia.com/Article/402239,Dont+believe+everything+you+hear+Cokes+Pratik+Thakar.aspx

Page 33: Challenges of using Twitter for sentiment analysis

To improve accuracy [2/2]:
• Develop dictionaries that reflect the specific syntax and style
• Software solutions that “translate” commonly used abbreviations and typos
– E.g., BRB – be right back
– Changing norms – e.g., LOL
• Familiarise with software
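A "translation" step of the kind suggested above could be sketched as a dictionary lookup applied before sentiment coding. This is a minimal illustration, not a description of any particular tool: only the BRB entry comes from the slide, while the other entries are hypothetical examples. Terms whose meaning is shifting, such as LOL, are deliberately left out, since a fixed mapping cannot capture changing norms.

```python
import re

# "brb" is the example from the slide; the other entries are hypothetical
ABBREVIATIONS = {
    "brb": "be right back",
    "thx": "thanks",
    "gr8": "great",
}
TOKEN_PATTERN = re.compile(r"\b\w+\b")


def expand_abbreviations(text, table=ABBREVIATIONS):
    """Replace each known abbreviation with its expansion, leaving other words as-is."""
    def replace_token(match):
        word = match.group(0)
        return table.get(word.lower(), word)

    return TOKEN_PATTERN.sub(replace_token, text)


print(expand_abbreviations("BRB, getting a gr8 coffee. Thx!"))
```

Keeping the table separate from the function makes it straightforward to maintain a dictionary per platform or community, in line with the first bullet on the slide.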
