Implicit Sentiment Mining in Twitter Streams


An implicit sentiment mining algorithm that works on large text corpora, and its application to detecting media bias.

RIP Boris Strugatski

Science Fiction will never be the same

Implicit Sentiment Mining (do you tweet like Hamas?)

Maksim Tsvetovat

Jacqueline Kazil

Alexander Kouznetsov

My book

Twitter predicts stock market

Sentiment Mining, old-school

• Start with a corpus of words that have a sentiment orientation (bad/good):

• “awesome”: +1

• “horrible”: -1

• “donut”: 0 (neutral)

• Compute the sentiment of a text by averaging all the words in the text
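In code, the old-school approach is just a dictionary lookup and an average. A minimal sketch (the tiny lexicon here is made up for illustration; real lexicons such as ANEW or SentiWordNet score thousands of words):

```python
# Tiny illustrative lexicon; real ones have thousands of scored entries.
LEXICON = {"awesome": 1, "horrible": -1, "donut": 0}

def naive_sentiment(text):
    # Average the scores of the lexicon words found in the text.
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0
```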

…however…

• This doesn’t quite work (not reliably, at least).

• Human emotions are actually quite complex

• … Anyone surprised?

We do things like this:

“This restaurant would deserve highest praise if you were a cockroach” (a real Yelp review ;-)

We do things like this:

“This is only a flesh wound!”

We do things like this:

“This concert was f**ing awesome!”

We do things like this:

“My car just got rear-ended! F**ing awesome!”

We do things like this:

“A rape is a gift from God” (he lost! Good ;-)

To sum up…

• Ambiguity is rampant

• Context matters

• Homonyms are everywhere

• Neutral words become charged as discourse changes, and charged words lose their meaning

More Sentiment Analysis

• We can parse text using POS (parts-of-speech) identification

• This helps with homonyms and some ambiguity

More Sentiment Analysis

• Create rules with amplifier words and inverter words:

– “This concert (np) was (v) f**ing (AMP) awesome (+1)” = +2

– “But the opening act (np) was (v) not (INV) great (+1)” = -1

– “My car (np) got (v) rear-ended (v)! F**ing (AMP) awesome (+1)” = +2??
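A rough sketch of such rules (the lexicon, amplifier, and inverter lists below are illustrative, not a real rule set):

```python
LEXICON = {"awesome": 1, "great": 1, "horrible": -1}
AMPLIFIERS = {"f**ing", "really", "very"}   # double the next sentiment score
INVERTERS = {"not", "never"}                # flip the next sentiment score

def rule_sentiment(tokens):
    score, mult = 0, 1
    for tok in tokens:
        t = tok.lower()
        if t in AMPLIFIERS:
            mult *= 2
        elif t in INVERTERS:
            mult *= -1
        elif t in LEXICON:
            score += mult * LEXICON[t]
            mult = 1
    return score
```

Note that “My car got rear-ended! F**ing awesome” still scores +2 under these rules — they have no notion of context, which is exactly the failure the slide flags.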

To do this properly…

• Valence (good vs. bad)

• Relevance (me vs. others)

• Immediacy (now/later)

• Certainty (definitely/maybe)

• … And about 9 more less-significant dimensions

Samsonovich A., Ascoli G.: Cognitive map dimensions of the human value system extracted from the natural language. In Goertzel B. (Ed.): Advances in Artificial General Intelligence (Proc. 2006 AGIRI Workshop), IOS Press, pp. 111-124 (2007).

This is hard

• But worth it? Michelle de Haaff (2010), Sentiment Analysis, Hard But Worth It!, CustomerThink

Sentiment, Gangnam Style!

Hypothesis

• Support for a political candidate, party, brand, country, etc. can be detected by observing indirect indicators of sentiment in text

Mirroring – unconscious copying of words or body language

Fay, W. H.; Coleman, R. O. (1977). "A human sound transducer/reproducer: Temporal capabilities of a profoundly echolalic child". Brain and language 4 (3): 396–402

Marker words

• All speakers have some words and expressions in common (e.g. conservative, liberal, party designation, etc.)

• However, everyone has a set of trademark words and expressions that makes them unique.
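One simple way to extract such trademark words — a frequency-ratio heuristic, not necessarily the exact method used in this work — is to compare each speaker's word frequencies against the pooled discourse and keep the words the speaker uses disproportionately often:

```python
from collections import Counter

def marker_words(speaker_tokens, background_tokens, min_ratio=3.0):
    # Keep words the speaker uses much more often than the pooled
    # discourse; min_ratio is an assumed tuning knob.
    spk, bg = Counter(speaker_tokens), Counter(background_tokens)
    n_spk, n_bg = sum(spk.values()), sum(bg.values())
    markers = set()
    for w, c in spk.items():
        p_spk = c / n_spk
        p_bg = (bg[w] + 1) / (n_bg + 1)   # add-one smoothing for unseen words
        if p_spk / p_bg >= min_ratio:
            markers.add(w)
    return markers
```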

GOP Presidential Candidates

Israel vs. Hamas on Twitter

Observing Mirroring

• We detect marker words and expressions in social-media speech and compute sentiment by observing and counting mirrored phrases

The research question

• Is the media biased towards Israel or Hamas in the current conflict?

• What is the slant of various media sources?

Data harvest

• Get Twitter feeds for:

– @IDFSpokesperson

– @AlQuassam

– Twitter feeds for CNN, BBC, CNBC, NPR, Al-Jazeera, FOX News – all filtered to only include articles on Israel and Gaza

• (more text == more reliable results)

Fast Computational Linguistics

Text Cleaning

• Tweet text is dirty (RT, VIA, #this and @that, ROFL, etc.)

• Use a stoplist to produce a stripped-down tweet

import string

stoplist_str = """a
a's
able
about
...
zero
rt
via"""

stoplist = [w.strip() for w in stoplist_str.split('\n') if w != '']

Language ID

• Language identification is pretty easy…

• Every language has a characteristic distribution of tri-grams (3-letter sequences);

– E.g. English is heavy on the “the” trigram

• Use the open-source library “guess-language”
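A toy version of trigram-based language ID — the guess-language library does this properly; the profiles below are built from tiny sample sentences purely for illustration:

```python
from collections import Counter

def trigrams(text):
    # Character trigram counts over a normalized (lowercased, letters-only) string.
    text = "".join(ch for ch in text.lower() if ch.isalpha() or ch == " ")
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

# Tiny illustrative profiles; a real library ships profiles built
# from large corpora for dozens of languages.
PROFILES = {
    "en": set(trigrams("the quick brown fox jumps over the lazy dog and the cat")),
    "fr": set(trigrams("le chat et le chien mangent dans la maison pres de la mer")),
}

def guess_lang(text, profiles=PROFILES):
    # Pick the language whose profile covers the most trigram mass of the text.
    tg = trigrams(text)
    total = sum(tg.values())
    return max(profiles,
               key=lambda lang: sum(c for g, c in tg.items()
                                    if g in profiles[lang]) / total)
```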

Stemming

• Stemming identifies the root of a word, stripping away:

– Suffixes, prefixes, verb tense, etc.

• “stemmer”, “stemming”, “stemmed” ->> “stem”

• “go”, “going”, “gone” ->> “go”
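A toy suffix-stripping stemmer to illustrate the idea — production pipelines use the Porter or Snowball algorithms (e.g. via NLTK's PorterStemmer) rather than anything this naive:

```python
def simple_stem(word):
    # Toy suffix stripper, for illustration only.
    for suffix in ("ing", "ed", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            word = word[:-len(suffix)]
            break
    # Undo final-consonant doubling: "stemm" -> "stem"
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word
```

This handles the “stemmer/stemming/stemmed” family from the slide, but not irregular forms like “gone” — collapsing those requires a real stemmer or lemmatizer.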

Term Networks

• Output of the cleaning step is a term vector

• The union of term vectors is a term network

• 2-mode network linking speakers with bigrams

• 2-mode network linking locations with bigrams

• Edge weight = number of occurrences of edge bigram/location or candidate/location
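The 2-mode network can be kept as a simple weighted edge list — a sketch (the handle and tokens below are placeholders):

```python
from collections import Counter

def bigrams(tokens):
    # Adjacent word pairs from a cleaned token list.
    return list(zip(tokens, tokens[1:]))

def add_to_network(network, speaker, tokens):
    # 2-mode network: edges connect a speaker to the bigrams they use;
    # edge weight = number of occurrences.
    for bg in bigrams(tokens):
        network[(speaker, bg)] += 1

net = Counter()
add_to_network(net, "@IDFSpokesperson", ["rocket", "fire", "from", "gaza"])
add_to_network(net, "@IDFSpokesperson", ["rocket", "fire", "continues"])
```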

Build a larger net

• Periodically purge single co-occurrences

– Edge weights are power-law distributed

– Single co-occurrences account for ~90% of the data

• Periodically discount and purge old co-occurrences

– Discourse changes, and the data should reflect it.
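The discount-and-purge step might look like this sketch (the decay factor and threshold are assumed values, not taken from the talk):

```python
def prune(network, decay=0.5, threshold=1.0):
    # Discount every edge weight, then purge edges that fall to or below
    # the threshold -- dropping both stale edges and single co-occurrences.
    for edge in list(network):
        network[edge] *= decay
        if network[edge] <= threshold:
            del network[edge]
    return network
```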

Israel vs. Hamas on Twitter

Israel, Hamas and Media

Metrics computation

• Extract ego-networks for IDF and Hamas

• Extract ego-networks for media organizations

• Compute the Hamming distance H(c,l)

– The cardinality of the intersection set between two networks

– Or… how much does CNN mirror Hamas? What about FOX?

• Normalize to a percentage of support
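A sketch of this metric: intersect a media source's ego-network with each camp's ego-network and normalize the overlap sizes so the two shares sum to 1 (the edge sets below are placeholders):

```python
def mirroring_share(media_edges, idf_edges, hamas_edges):
    # Overlap (intersection cardinality) between the media ego-network and
    # each camp's ego-network, normalized to shares that sum to 1.
    idf = len(media_edges & idf_edges)
    hamas = len(media_edges & hamas_edges)
    total = idf + hamas
    if total == 0:
        return {"IDF": 0.0, "Hamas": 0.0}
    return {"IDF": idf / total, "Hamas": hamas / total}
```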

Aggregate & Normalize

• Aggregate speech differences and similarities by media source

• Normalize values

Media Sources, Hamas and IDF

Source      IDF    Hamas
CNBC        0.601  0.399
FOX         0.493  0.507
BBC         0.537  0.463
CNN         0.586  0.414
AlJazeera   0.530  0.470
NPR         0.579  0.421

[Chart: Ron Paul, Romney, Gingrich, Santorum support by U.S. state, March 2012 (based on Twitter support); per-state bars on a 0–1.2 scale omitted]

Conclusions

• This works pretty well! ;-)

• However – it only works in aggregates, especially on Twitter.

• More text == better accuracy.

Conclusions

• The algorithm is cheap:

– O(n) for words on ingest – real-time on a stream

– O(n^2) for storage (pruning helps a lot)

• Storage can go to Redis

– Make use of built-in set operations
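A sketch of the Redis idea, with plain Python sets standing in for Redis keys so it runs without a server — with redis-py, the same operations would be r.sadd(key, *members) and r.sinter(key1, key2):

```python
# In-memory stand-in for a Redis instance (key -> set of members).
store = {}

def sadd(key, *members):            # mirrors Redis SADD
    store.setdefault(key, set()).update(members)

def sinter(*keys):                  # mirrors Redis SINTER
    sets = [store.get(k, set()) for k in keys]
    return set.intersection(*sets) if sets else set()

# Illustrative keys/members -- one set of bigram phrases per speaker.
sadd("bigrams:CNN", "rocket fire", "cease fire", "gaza strip")
sadd("bigrams:IDF", "rocket fire", "iron dome")
```

Keeping each ego-network as a Redis set makes the intersection-cardinality metric a single server-side SINTER call instead of application-side loops.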

Recommended