
On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter



Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords by using pre-compiled stopword lists or more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining high classification performance while reducing data sparsity and substantially shrinking the feature space.


Page 1: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter

Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani

Knowledge Media Institute, The Open University,

Milton Keynes, United Kingdom

The 9th edition of the Language Resources and Evaluation Conference, Reykjavik, Iceland

Page 2: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

• Sentiment Analysis

• Twitter

• Stopwords Removal Methods

• Comparative Study

• Conclusion

Outline

Page 3: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

“Sentiment analysis is the task of identifying positive and negative opinions, emotions and evaluations in text”


The main dish was delicious → Opinion

The main dish was salty and horrible → Opinion

It is a Syrian dish → Fact

Sentiment Analysis

Page 4: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter
Page 5: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter
Page 6: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopwords Removal

Page 7: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopwords Removal in Twitter Sentiment Analysis

- Kouloumpis et al. 2011

- Pak & Paroubek, 2010

- Asiaee et al., 2012

- Bollen et al., 2011

- Bifet and Frank, 2010

- Speriosu et al., 2011

- Zhang & Yuan, 2013

- Gokulakrishnan et al 2012

- Saif et al., 2012

- Hu et al., 2013

- Camara et al., 2013

Removing stopwords is USEFUL? The cited works are split between YES and NO.

Page 8: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

• Precompiled

• Very popular

• Outdated

• Domain-Independent

Classic Stopword Lists

Page 9: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

• Unsupervised Methods

– Term Frequency

– Term-based Random Sampling

• Supervised

– Term Entropy Measures

– Maximum Likelihood Estimation

Automatic Stopwords Generation Methods
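As a rough illustration of the unsupervised, term-frequency flavour of these methods, the sketch below counts term occurrences over a tokenised tweet collection and proposes the most frequent terms as stopword candidates. The function name, input format and cut-off are illustrative assumptions, not the exact procedure of the paper.

```python
from collections import Counter

def frequency_stoplist(tweets_tokens, top_k=20):
    """Propose the top_k most frequent corpus terms as stopword candidates.

    tweets_tokens: list of token lists, one per tweet (assumed input format).
    """
    counts = Counter(token for tokens in tweets_tokens for token in tokens)
    return [term for term, _ in counts.most_common(top_k)]

# Toy usage with made-up tweets
tweets = [["the", "movie", "was", "great"],
          ["the", "service", "was", "awful"],
          ["i", "loved", "the", "movie"]]
print(frequency_stoplist(tweets, top_k=3))   # most frequent terms first, e.g. ['the', 'movie', 'was']
```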

Page 10: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopwords Removal for Twitter Sentiment Analysis

Page 11: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopword Analysis Set-Up (1)

[Figure: Proportion of negative vs. positive tweets in each dataset]

Dataset    OMD    HCR    STS    SemEval    WAB    GASP
Negative   688    957    1402   1590       2580   5235
Positive   393    397    632    3781       2915   1050

Datasets

Page 12: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopword Analysis Set-Up (2)

Stopwords Removal Methods

1. The Baseline Method

– (no removal of stopwords)

2. The Classic Method

– This method removes stopwords obtained from pre-compiled lists

– Here, the Van stoplist is used (a filtering sketch follows)
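A minimal sketch of the classic method, assuming tweets are already tokenised: tokens are simply filtered against a pre-compiled stoplist. The Van stoplist itself is not reproduced here; the short hard-coded list below is only a stand-in for illustration.

```python
# Classic (pre-compiled) stopword removal: filter tokens against a fixed list.
# The tiny list below is an illustrative stand-in, not the actual Van stoplist.
PRECOMPILED_STOPLIST = {"the", "a", "an", "is", "was", "it", "and", "or", "to", "of"}

def remove_stopwords(tokens, stoplist=PRECOMPILED_STOPLIST):
    """Return the tokens of a tweet with stoplist entries removed."""
    return [t for t in tokens if t.lower() not in stoplist]

print(remove_stopwords(["The", "main", "dish", "was", "delicious"]))
# -> ['main', 'dish', 'delicious']
```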

Page 13: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopword Analysis Set-Up (3)

Stopwords Removal Methods

3. Methods based on Zipf’s Law

- TF-High Method

Removing the most frequent words in the corpus

- TF1 Method

Removing singleton words (i.e., words that occur only once in the corpus)

- IDF Method

Removing words with a low inverse document frequency (IDF); a combined sketch of the three methods follows
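The sketch below illustrates how the three Zipf's-law-based stoplists could be derived from raw term statistics: the most frequent terms (TF-High), singleton terms (TF1), and low-IDF terms. The input format and the cut-off parameters are illustrative assumptions, not the settings used in the paper.

```python
import math
from collections import Counter

def zipf_stoplists(tweets_tokens, high_k=10, idf_threshold=1.0):
    """Derive TF-High, TF1 and low-IDF stoplists from term statistics.

    tweets_tokens: list of token lists, one per tweet (assumed input format).
    high_k and idf_threshold are illustrative parameters only.
    """
    tf = Counter(t for tokens in tweets_tokens for t in tokens)       # term frequency
    df = Counter(t for tokens in tweets_tokens for t in set(tokens))  # document frequency
    n_tweets = len(tweets_tokens)

    tf_high = {t for t, _ in tf.most_common(high_k)}                  # most frequent terms
    tf1 = {t for t, c in tf.items() if c == 1}                        # singleton terms
    low_idf = {t for t in df if math.log(n_tweets / df[t]) < idf_threshold}
    return tf_high, tf1, low_idf
```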

Page 14: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopword Analysis Set-Up (4)

Stopwords Removal Methods

4. Term-based Random Sampling (TBRS)

5. The Mutual Information Method (MI)
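For the MI method, one plausible supervised formulation is sketched below: each term is scored by the mutual information between its presence in a tweet and the tweet's sentiment label, and the least informative terms are proposed as stopword candidates. This is an assumed reading of an MI-based stoplist, not necessarily the paper's exact formulation; TBRS is not sketched here.

```python
import math
from collections import Counter

def mi_stoplist(tweets_tokens, labels, bottom_k=20):
    """Score each term by the mutual information between its presence in a
    tweet and the tweet's sentiment label; return the bottom_k least
    informative terms as stopword candidates (an assumed formulation).
    """
    n = len(tweets_tokens)
    label_counts = Counter(labels)
    term_counts = Counter()        # number of tweets containing the term
    joint_counts = Counter()       # number of tweets with (term, label)
    for tokens, label in zip(tweets_tokens, labels):
        for term in set(tokens):
            term_counts[term] += 1
            joint_counts[(term, label)] += 1

    scores = {}
    for term, tc in term_counts.items():
        p_present = tc / n
        mi = 0.0
        for label, lc in label_counts.items():
            p_label = lc / n
            joint = joint_counts[(term, label)]
            # contributions of the (present, label) and (absent, label) cells
            for count, p_x in ((joint, p_present), (lc - joint, 1.0 - p_present)):
                p_joint = count / n
                if p_joint > 0 and p_x > 0:
                    mi += p_joint * math.log(p_joint / (p_x * p_label))
        scores[term] = mi
    return sorted(scores, key=scores.get)[:bottom_k]
```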

Page 15: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Stopword Analysis Set-Up (5)

Twitter Sentiment Classifiers

– Two Supervised Classifiers:

• Maximum Entropy (MaxEnt)

• Naïve Bayes (NB)

– Performance measured using accuracy and F1

– 10-fold cross-validation (a minimal sketch follows)
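As a concrete (assumed) setup, the sketch below uses scikit-learn: LogisticRegression stands in for the MaxEnt classifier and MultinomialNB for NB, both evaluated with 10-fold cross-validation over simple unigram counts. The feature representation and parameters are illustrative, not the paper's exact configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import MultinomialNB

def evaluate(tweets, labels):
    """10-fold cross-validated accuracy and F1 for NB and a MaxEnt stand-in.

    tweets: list of (already filtered) tweet strings; labels: 0/1 sentiment labels.
    """
    X = CountVectorizer().fit_transform(tweets)   # unigram counts (assumed features)
    for name, clf in [("NB", MultinomialNB()),
                      ("MaxEnt", LogisticRegression(max_iter=1000))]:
        scores = cross_validate(clf, X, labels, cv=10, scoring=("accuracy", "f1"))
        print(name,
              "accuracy=%.3f" % scores["test_accuracy"].mean(),
              "F1=%.3f" % scores["test_f1"].mean())
```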

Page 16: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Experimental Results

Assess the impact of removing stopwords by observing fluctuations in the following (see the sketch after this list):

- Classification Performance

- Feature space

- Data Sparsity
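For concreteness, here is a small sketch of how the latter two quantities might be measured from a tweet-term count matrix: the feature-space size (vocabulary size), its reduction rate relative to a baseline vocabulary, and the sparsity degree, taken here as the fraction of zero entries in the matrix. This definition of sparsity degree is an assumption and may differ from the one used in the paper.

```python
from sklearn.feature_extraction.text import CountVectorizer

def filtering_stats(tweets, baseline_vocab_size=None):
    """Feature-space size, optional reduction rate, and sparsity degree.

    The sparsity degree is taken here as the fraction of zero entries in the
    tweet-term count matrix (an assumed definition).
    """
    X = CountVectorizer().fit_transform(tweets)     # sparse tweet-term matrix
    n_tweets, vocab_size = X.shape
    sparsity = 1.0 - X.nnz / float(n_tweets * vocab_size)
    reduction = None
    if baseline_vocab_size:
        reduction = 100.0 * (1.0 - vocab_size / float(baseline_vocab_size))
    return vocab_size, reduction, sparsity
```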

Page 17: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Experimental Results (1)

1. Classification Performance

[Figure: Baseline classification performance of the MaxEnt and NB classifiers across all datasets (OMD, HCR, STS-Gold, SemEval, WAB, GASP); left panel: Accuracy (%), right panel: F1 (%)]

Page 18: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Experimental Results (2)

1. Classification Performance

[Figure: Average Accuracy (%) and F1 (%) of the MaxEnt and NB classifiers using the different stoplists (Baseline, Classic, TF1, TF-High, IDF, TBRS, MI)]

Page 19: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Experimental Results (3)

2. Feature Space

Reduction rate (%) on the feature space of the various stoplists:

Stoplist    Baseline   Classic   TF1     TF-High   IDF     TBRS   MI
Reduction   0.00       5.50      65.24   0.82      11.22   6.06   19.34

[Figure: Proportion of singleton words (TF=1) to non-singleton words (TF>1) in each dataset (OMD, HCR, STS-Gold, SemEval, WAB, GASP)]

Page 20: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Experimental Results (4)

3. Data Sparsity

[Figure: Stoplist impact on the sparsity degree of all datasets (OMD, HCR, STS-Gold, SemEval, WAB, GASP); sparsity degree ranges from roughly 0.988 to 1.000 across the Baseline, Classic, TF1, TF-High, IDF, TBRS and MI stoplists]

Page 21: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

The Ideal Stoplist (1)

• The ideal stopword removal method is the one which:

– Helps maintain a high classification performance,

– Shrinks the classifier’s feature space,

– Reduces the data sparsity,

– Has low runtime and storage complexity

– Has minimal human supervision

Page 22: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

The Ideal Stoplist (2)

Average accuracy, F1, reduction rate on the feature space, and data sparsity of the six stoplist methods. Positive sparsity values refer to an increase in the sparsity degree, while negative values refer to a decrease in the sparsity degree.

Overall Analysis Results

Page 23: On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of  Twitter

Conclusion

• We studied how six different stopword removal methods affect sentiment polarity classification on Twitter.

• The use of pre-compiled (classic) stoplists has a negative impact on the classification performance.

• The TF1 stopword removal method obtains the best trade-off:

– Reducing the feature space by nearly 65%,

– Decreasing the data sparsity degree by up to 0.37%, and

– Maintaining a high classification performance.