25
15 th Web Information System Engineering (WISE 2014) Thessaloniki, 14 October 2014 An Ensemble Model for Cross-Domain Polarity Classification on Twitter Adam Tsakalidis, Symeon Papadopoulos, Yiannis Kompatsiaris Information Technologies Institute (ITI) Center for Research & Technology Hellas (CERTH) #wiseconf2 014

An Ensemble Model for Cross-Domain Polarity Classification on Twitter

Embed Size (px)

DESCRIPTION

Presentation on polarity classification on Twitter using a new ensemble method @ WISE 2014, Thessaloniki, Greece

Citation preview

Page 1: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

15th Web Information System Engineering (WISE 2014)Thessaloniki, 14 October 2014

An Ensemble Model for Cross-Domain PolarityClassification on TwitterAdam Tsakalidis, Symeon Papadopoulos, Yiannis Kompatsiaris

Information Technologies Institute (ITI)Center for Research & Technology Hellas (CERTH)

#wiseconf2014

Page 2: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014 #2

Overview

• The Problem

• Existing Approaches

• Proposed Ensemble Model

• Experimental Study

• Summary – Future Work

Page 3: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014 #3

Polarity Classificationmotivation & definition

Page 4: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014 #4

Polarity Classification - Problem

Finding about whether a piece of text (e.g. tweet) conveys a positive or negative sentiment about a topic or entity of interest (person, party, brand, etc.)

Page 5: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Polarity Classification - Applications

• Monitoring sentiment about a brand could help take appropriate measures in case of negative sentiment increase (e.g. apology, assistance, compensation).

• Aggregating over many tweets could help form an overall impression of the public opinion about the topic/entity of interest.

#5

Page 6: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

The Cross-Domain Challenge

• Depending on the domain (and sometimes even the topic) of interest, the vocabulary and phrasing of sentiment may vary wildly.

• Most sentiment detection approaches require manually created ground truth from the same domain to achieve high performance expensive!

• We propose a framework that leverages an ensemble of sentiment detectors to tackle the cross-domain challenge.

#6

Page 7: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Related Work

#7

Page 8: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014 #8

Background: Sentiment Analysis

Two main problems: subjectivity & polarity

RepresentationRepresentation LearningLearningTextText SentimentSentiment

NBNB SVMSVM

Hybrid- Ensemble

Hybrid- EnsembleSSLSSLUnsupervisedUnsupervisedSupervisedSupervisedOtherOtherPOSPOSBigramsBigramsUnigramsUnigrams

Page 9: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Hybrid Models

Barbosa and Feng, COLING 2010 • Combines sentiment labels from three independent sources• Considers the quality of individual sources• Considers the different bias of individual sources

Gonçalves et al., COSN 2013 • First evaluates 8 different methods with respect to coverage

and agreement• Proposes a combined method, which combines individual

method outputs based on their performance (first in terms of coverage and then in terms of accuracy) on an independent dataset

#9

Page 10: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Ensemble Model for Cross-Domain Polarity Classificationapproach description

#10

Page 11: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Method Overview

#11

TBRTBR

FBRFBR

CRCR

LBRLBR

CL_TBRCL_TBR

CL_FBRCL_FBR

TweetTweetCL_CRCL_CR

CL_LBRCL_LBR

HCHC

LCLC

COMBCOMB

representation learning ensembling

Page 12: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Representation

• TBR – binary n-grams – n-grams with tf– n-grams with tf-idf– n = 1, 2, 3 9 representations

• FBR

• CR– TBR + POS tagging

• LBR– SentiWordNet (5D): sum of scores for Noun, Verbs, Adjectives,

Adverbs, sum of keywords (#pos - #neg)– Opinion Lexicon: Sum of keywords (#pos - #neg)

#12

TBR, FBR and CR with tf < 2 were removed!

Page 13: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Ensemble Learning

• Combine two types of classifiers: HC and LC• Hybrid classifier (HC) [domain-dependent]

i: tweet, c: class (pos/neg), r: TBR, FBR, LR, wr: weights defined based on accuracy on ED dataset, pr: individual classifier output

• Lexicon-based classifier (LC) [domain-independent]• If the outputs of the two classifiers agree, then

assign the agreed class to the tweet.• If the outputs disagree, use a domain-adapted

classifier (TBR) trained on the agreed tweets.

#13

Page 14: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Practical aspects

• In order for the ensemble model to work well, the two types of classifiers need to be individually accurate at sufficient rates, since we trust their outputs to be used as a new training set!

• In a real-time scenario, we need to allocate some time for the domain adaptation process to take place. This should be defined on a per case basis.

#14

adaptation ensemble model operation

Page 15: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Experimental Study

#15

Page 16: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Datasets

• Emoticons Dataset (ED)– 250K tweets collected over 2-days (mid-march 2014) containing

happy/sad emoticons (removed retweets) [English only!]– used for feature extraction, preliminary development

• Development set (DEV)– 10K tweets (equally balanced) collected in the same way as ED

• Stanford Twitter dataset (STS)– STS-1: 177 pos, 184 neg– STS-2: 108 pos, 75 neg

• Obama Healthcare Reform (HCR)– dev-set: 135 pos, 328 neg– test-set: 116 pos, 383 neg

• Obama-McCain Debate (OMD)– subset (agreement>50%): 707 pos, 1190 neg

#16

Page 17: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Model building

• Performance of TBR variants, role of POS (on 33% of ED)

• Performance of LR features

• Comparative performance of all classifiers (on DEV)

#17

Page 18: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Results

#18

over-optimistic estimate of real-world performance since 90% of set used for training

more realistic estimate of real-world performance / still, far better than other out-of-domain methods

Page 19: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Role of classifier agreement

• As expected, accuracy on disagreed tweets is lower– results produced using a classifier trained on “noisy” data– more challenging cases

• A similar case when training using automatically collected data based on emoticons

#19

Page 20: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Conclusions & Future Work

#20

Page 21: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Conclusions & Future Work

Conclusion•Ensemble model in tandem with domain adaptation is an effective strategy for tackling the cross-domain challenge!

Future Work•Incorporate the “neutral” class and evaluate•Test with more datasets•Check with additional classifiers as input

#21

Page 22: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Thank you!

#22

Questions?

https://github.com/socialsensor/sentiment-analysis

http://socialsensor.eu/

[email protected] / [email protected]@warwick.ac.uk

Page 23: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Key References (1/3)• Alina Andreevskaia and Sabine Bergler. When specialists and generalists work

together: Overcoming domain dependence in sentiment tagging. In ACL, pages 290-298, 2008

• Luciano Barbosa and Junlan Feng. Robust sentiment detection on twitter from biased and noisy data. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pages 36-44. ACL, 2010

• Adam Bermingham and Alan F Smeaton. Classifying sentiment in microblogs: is brevity an advantage? In Proceedings of the 19th ACM international conference on Information and knowledge management, pages 1833-1836. ACM, 2010

• Albert Bifet and Eibe Frank. Sentiment knowledge discovery in twitter streaming data. In Discovery Science, pages 1-15. Springer, 2010

• Johan Bollen, Huina Mao, and Alberto Pepe. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In ICWSM, 2011

• Samuel Brody and Nicholas Diakopoulos. Cooooooooooooooollllllllllllll !!!!!!!!!!!!!!: using word lengthening to detect sentiment in microblogs. In Proceedings of the Conference on EMNLP, pages 562-570. ACL, 2011.

#23

Page 24: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Key References (2/3)

• Andrea Esuli and Fabrizio Sebastiani. Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC, vol. 6, pages 417-422, 2006

• Alec Go, Richa Bhayani, and Lei Huang. Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, pages 1-12, 2009

• Pollyanna Goncalves, Matheus Araujo, Fabricio Benevenuto, and Meeyoung Cha. Comparing and combining sentiment analysis methods. In Proceedings of the first ACM COSN, pages 27-38. ACM, 2013

• Minqing Hu and Bing Liu. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 168-177. ACM, 2004

• Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent twitter sentiment classification. In Proceedings of the 49th Annual Meeting of the ACL: HLT-Vol. 1, pages 151-160. ACL, 2011

• Jonathon Read. Using emoticons to reduce dependency in machine learning techniques for sentiment classification. In Proceedings of the ACL Student Research Workshop, pages 43-48. ACL, 2005

• Hassan Saif, Yulan He, and Harith Alani. Semantic sentiment analysis of twitter. In The Semantic Web-ISWC 2012, pages 508-524. Springer, 2012

#24

Page 25: An Ensemble Model for Cross-Domain Polarity Classification on Twitter

#wiseconf2014

Key References (3/3)

• Emmanouil Schinas, Symeon Papadopoulos, Sotiris Diplaris, Yiannis Kompatsiaris, Yosi Mass, Jonathan Herzig, and Lazaros Boudakidis. Eventsense: Capturing the pulse of large-scale events by mining social media streams. In Proceedings of the 17th Panhellenic Conference on Informatics, pages 17-24. ACM, 2013

• Michael Speriosu, Nikita Sudan, Sid Upadhyay, and Jason Baldridge. Twitter polarity classification with label propagation over lexical links and the follower graph. In Proceedings of the First workshop on Unsupervised Learning in NLP, pages 53-63. ACL, 2011

• Kristina Toutanova, Dan Klein, Christopher D Manning, and Yoram Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 Conference of the North American Chapter of the ACL on HLT-Vol. 1, pages 173-180. ACL, 2003

• Andranik Tumasjan, Timm Oliver Sprenger, Philipp Sandner, and Isabell Welpe. Predicting elections with twitter: What 140 characters reveal about political sentiment. In ICWSM 2010, 10:178-185, 2010

• Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on HLT and EMNLP, pages 347-354. ACL, 2005

• Ley Zhang, Riddhiman Ghosh, Mohamed Dekhil, Meichun Hsu, and Bing Liu. Combining Lexicon-based and learning-based methods for twitter sentiment analysis. HP Laboratories, Technical Report HPL-2011, 89, 2011.

#25