Toward Formal Reasoning with Epistemic Policies about Information Quality in the Twittersphere

VIStology, Inc - Fusion 2011 1


Brian Ulicny

VIStology, Inc.

[email protected]

Mieczyslaw Kokar

Northeastern University and VIStology, Inc.

[email protected]



Arab Spring Uprisings 2011


Situation Awareness (?): Al Jazeera’s Twitter Monitor


Situation Awareness: Attention Spikes from Twitter


Situation Awareness: Flu Trends from Social Media

Detecting influenza outbreaks by analyzing Twitter messages

Aron Culotta

arXiv:1007.4748v1 [cs.IR] 27 Jul 2010


Twitter as Open Source Intel


Reliability (Source)

A: Completely reliable. It refers to a tried and trusted source which can be depended upon with confidence.

B: Usually reliable. It refers to a source which has been successfully used in the past but for which there is still some element of doubt in particular cases.

C: Fairly reliable. It refers to a source which has occasionally been used in the past and upon which some degree of confidence can be based.

D: Not usually reliable. It refers to a source which has been used in the past but has proved more often than not unreliable. (JC3IEDM: The probability of producing erroneous information is high (>30%).)

E: Unreliable. It refers to a source which has been used in the past and has proved unworthy of any confidence.

F: Reliability cannot be judged. It refers to a source which has not been used in the past.

Credibility (Reported Information)

1: Confirmed by Other Sources. It can be stated with certainty that the reported information originates from another source than the already existing information on the same subject. (JC3IEDM: 3 Independent Sources)

2: Probably True. The independence of the source of any item of information cannot be guaranteed, but from the quantity and quality of previous reports, its likelihood is nevertheless regarded as sufficiently established. (JC3IEDM: 2 Independent Sources)

3: Possibly True. Despite there being insufficient confirmation to establish any higher degree of likelihood, a freshly reported item of information that does not conflict with previously reported behaviour pattern of target. (1 …)

4: Doubtful. An item of information which tends to conflict with the previously reported or established behaviour pattern of an intelligence target.

5: Improbable. An item of information that positively contradicts previously reported information or conflicts with the established behaviour pattern of an intelligence target in a marked degree.

6: Truth of information cannot be judged.

Confidence = <Reliability, Credibility>
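The confidence pair can be represented directly as a small data structure. The following is a minimal sketch, not part of the original slides; the class name `Confidence` and its methods are hypothetical, and the labels are copied from the STANAG 2022 scale above.

```python
from dataclasses import dataclass

# Labels copied from the STANAG 2022 reliability and credibility scales.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}
CREDIBILITY = {
    1: "Confirmed by other sources",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Truth cannot be judged",
}


@dataclass(frozen=True)
class Confidence:
    """Hypothetical container for Confidence = <Reliability, Credibility>."""
    reliability: str  # one of "A".."F"
    credibility: int  # one of 1..6

    def __post_init__(self):
        # Reject values outside the two scales.
        if self.reliability not in RELIABILITY:
            raise ValueError(f"unknown reliability grade: {self.reliability!r}")
        if self.credibility not in CREDIBILITY:
            raise ValueError(f"unknown credibility grade: {self.credibility!r}")

    def code(self) -> str:
        """Short two-character form, e.g. 'A1'."""
        return f"{self.reliability}{self.credibility}"

    def __str__(self) -> str:
        return (f"<{self.reliability}: {RELIABILITY[self.reliability]}, "
                f"{self.credibility}: {CREDIBILITY[self.credibility]}>")
```

For example, `Confidence("A", 1).code()` gives `"A1"`, and constructing a pair with a grade outside the scale raises `ValueError`.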


Problem Statement

• How can we assess not only the volume of tweets per time period

• And the frequency of terms they contain

• But the reliability, credibility & confidence of the information they convey

• In a potentially adversarial situation?


Naïve STANAG 2022 for Twitter

• Reliability = F: Cannot Be Judged

– All “sources not used in the past”

• Credibility = 1: Confirmed by Other Sources

– More than two string identical tweets?

• Or Credibility = 3, Possibly True

– Because Sources not Independent

– Because Path between all sources in Twitter graph


Need

• Tractable Way to Calculate:

– Twitter Source Reliability

– Twitter Content Credibility

– Twitter Source Independence

• Where

– Entire Twitter graph contains 105 Million Users (as of April, 2010)

– 55 Million Tweets per Day

– 3 Billion Requests per day to Twitter API


The Argument from Google

• There are too many Twitter sources to evaluate their reliability directly.

• However, Google has shown that there is great value in using eigenvector centrality (PageRank) as a proxy for reliability.

• Therefore, we assume that a PageRank-like metric correlates with Reliability because

• (1) We assume that people do not pass along information they believe to be unreliable

• (2) Eigenvector centrality/retweet influence, unlike simple indegree centrality, is difficult to fake.


Not Every Twitter User is Real

CENTCOM Operation Earnest Voice


TunkRank as Reliability

• Influence(X) = Expected number of people who will read a tweet that X tweets, including all retweets of that tweet. For simplicity, we assume that, if a person reads the same message twice (because of retweets), both readings count.

• If X is a member of Followers(Y), then there is a 1/||Following(X)|| probability that X will read a tweet posted by Y, where Following(X) is the set of people that X follows.

• If X reads a tweet from Y, there’s a constant probability p that X will retweet it.

• Influence(X) = Σ over Y in Followers(X) of (1 + p · Influence(Y)) / ||Following(Y)||

D. Tunkelang. 2009. A Twitter Analog to PageRank. http://thenoisychannel.com/2009/01/13/a-twitter-analog-to-pagerank/
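The definition above can be computed by fixed-point iteration. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the recurrence from the cited Tunkelang post, Influence(X) = Σ over Y in Followers(X) of (1 + p · Influence(Y)) / ||Following(Y)||, and the function name `tunkrank`, the default `p=0.05`, and the iteration count are all illustrative choices.

```python
def tunkrank(following, p=0.05, iterations=50):
    """Approximate TunkRank by repeated substitution.

    following: dict mapping each user to the set of users they follow.
    p: assumed constant retweet probability (illustrative default).
    Returns a dict mapping each user to an influence score.
    """
    # Collect every user that appears as a follower or as someone followed.
    users = set(following)
    for followed in following.values():
        users |= set(followed)

    # Invert the "following" relation to get each user's followers.
    followers = {u: set() for u in users}
    for u, followed in following.items():
        for v in followed:
            followers[v].add(u)

    influence = {u: 0.0 for u in users}
    for _ in range(iterations):
        # Each follower y reads x's tweet with probability 1/|Following(y)|
        # and passes it on to their own audience with probability p.
        influence = {
            x: sum((1 + p * influence[y]) / len(following[y])
                   for y in followers[x])
            for x in users
        }
    return influence
```

With two users who follow only user `c`, `c` scores 2.0 and the two followers score 0.0, since nobody reads their tweets.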


TunkRank as Reliability

Figure: TunkRank vs Indegree Centrality (log scale)

Mapping TunkRank to STANAG 2022 Reliability

TunkRank              STANAG 2022 Reliability
> 90th percentile     A: Completely Reliable
> 80th percentile     B: Usually Reliable
> 50th percentile     C: Fairly Reliable
< 50th percentile     D: Not Usually Reliable
< 10th percentile     E: Unreliable
Undefined             F: Cannot Be Determined
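The mapping can be sketched as a threshold lookup. The function names below are hypothetical, and two details are assumptions the table leaves open: a percentile of exactly 50 is assigned to D, and the percentile rank is taken as the share of users with a strictly lower score.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores strictly below `score` (assumed definition)."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def stanag_reliability(percentile):
    """Map a TunkRank percentile (0-100, or None if undefined) to a grade."""
    if percentile is None:
        return "F"  # TunkRank undefined: cannot be determined
    if percentile > 90:
        return "A"
    if percentile > 80:
        return "B"
    if percentile > 50:
        return "C"
    if percentile < 10:
        return "E"
    return "D"  # 10th to 50th percentile; exactly 50 placed here by assumption
```

A source in the 99th percentile maps to A and one in the 0th percentile maps to E.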


Unreliability Indicators

• If X retweets a message, e.g:

RT @Whitehouse Zombie uprising in Scranton

• And there is no corresponding original tweet

• Then X is E: Unreliable.

• If X tweets a message with the same URL (shortened or dereferenced)

• But different content

• More than twice

• Then X is D: Not Usually Reliable.

• (On the other hand: Verification: Reliability )
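The two indicators can be sketched as checks over a batch of tweets. This is an illustrative sketch, not the authors' code: tweets are assumed to be dicts with `user` and `text` keys, URLs are assumed to be already dereferenced, "more than twice" is read as more than two distinct texts around the same URL, and an original tweet only counts if it appears in the same batch.

```python
import re
from collections import defaultdict

RT_PATTERN = re.compile(r"^RT @(\w+):?\s+(.*)$", re.DOTALL)
URL_PATTERN = re.compile(r"https?://\S+")


def unreliability_flags(tweets):
    """Return {user: 'E' or 'D'} for users matching either indicator."""
    # Original (non-retweet) messages, keyed by lowercased author and text.
    originals = {
        (t["user"].lower(), t["text"].strip())
        for t in tweets
        if not RT_PATTERN.match(t["text"])
    }
    flags = {}
    texts_by_user_url = defaultdict(set)
    for t in tweets:
        match = RT_PATTERN.match(t["text"])
        if match:
            cited = (match.group(1).lower(), match.group(2).strip())
            if cited not in originals:
                flags[t["user"]] = "E"  # retweet with no matching original
        for url in URL_PATTERN.findall(t["text"]):
            # Record the text surrounding each URL this user posted.
            surrounding = URL_PATTERN.sub("", t["text"]).strip()
            texts_by_user_url[(t["user"], url)].add(surrounding)
    for (user, _url), texts in texts_by_user_url.items():
        # Same URL, more than two different texts; E takes precedence over D.
        if len(texts) > 2 and flags.get(user) != "E":
            flags[user] = "D"
    return flags
```

In a real deployment the original tweet would have to be looked up in the cited account's timeline rather than in the sampled batch, since the sample may simply have missed it.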


Source Independence

• There is a path connecting (nearly) every user in the Twitter graph.

• This does not mean that there is no source independence in Twitter.

• We count any sources as independent if they originate the message, and

• The shortest path between them is ≥ 4.

• In T.H. dataset, 4/20 tweets cite same NY Times URL via 3 shortened URLs.

• So, not independent.

• Other news sources: 2 cite Guardian, 1 BBC, 1 Der Spiegel, 1 WaPo, 1 Times of London

• No explicit Retweets

• No Implicit Retweets

• => 16 originating sources

• Compute distance between remaining sources
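The distance test can be sketched with a breadth-first search. This is a sketch under assumptions not stated on the slide: the follow graph is treated as undirected, it is given as an adjacency dict, and two sources with no connecting path in the sampled graph are counted as independent. The function names are hypothetical.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Number of hops between two users, or None if no path exists.

    graph: dict mapping each user to an iterable of neighbouring users.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def independent(graph, a, b, min_distance=4):
    """Two originating sources are independent if their distance is >= 4."""
    dist = shortest_path_length(graph, a, b)
    return dist is None or dist >= min_distance
```

Because only distances below 4 matter, the search never needs to look further than a few hops from each source, which keeps the computation tractable on a small joint neighbourhood rather than the full Twitter graph.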


Sameness of Content

• String identical tweets are not independent. Implicit retweets:

– @BWJones: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T 4/20/2011 6:16:25 PM

– @Frieze_magazine: Tim Hetherington, photographer and 'Restrepo' co-director, killed in Misrata, Libya http://nyti.ms/dIm29T 4/20/2011 7:01:30 PM

• Custom Regexes to handle dead/alive

– Tweet =~ (<subject> .* (dead|died|killed|not alive|RIP)) &&

– Tweet !~ (<subject> .* (not (dead|died|killed))) => Dead

• Tim Hetherington, Restrepo director has been killed in Misurata

– OR: Tweet =~ (<subject> .* (alive|(not (killed|dead|died)))) &&

– Tweet !~ (<subject> .* (not alive|RIP)) => Alive

• E.g. C H still alive. (true positive) Wish T H were still alive (false positive)

• Misses: C H in serious condition ( |= alive)

• Credibility from the ratio of tweets asserting P to tweets asserting not-P:

– > 2x P vs not-P: Confirmed P; Improbable not-P

– > 1.5x P vs not-P: Probably True P; Doubtful not-P

– ~same P, not-P: Possibly True P; Possibly True not-P

• 435 Tweets report C H dead; vs 7 C H alive: Confirmed: C H Dead; Improbable: C H not Dead.
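The regex rules and ratio thresholds above can be sketched as two small functions. This is an illustrative reading, not the authors' code: the function names are hypothetical, matching is case-insensitive and word-bounded by assumption, any ratio of 1.5 or less is treated as "~same", and grade 6 is returned when there are no reports at all.

```python
import re


def classify(tweet, subject):
    """Return 'dead', 'alive', or None, following the slide's regex rules."""
    s = re.escape(subject)
    flags = re.IGNORECASE
    dead = re.search(rf"{s}.*\b(dead|died|killed|not alive|RIP)\b", tweet, flags)
    not_dead = re.search(rf"{s}.*\bnot (dead|died|killed)\b", tweet, flags)
    alive = re.search(rf"{s}.*\b(alive|not (killed|dead|died))\b", tweet, flags)
    not_alive = re.search(rf"{s}.*\b(not alive|RIP)\b", tweet, flags)
    if dead and not not_dead:
        return "dead"
    if alive and not not_alive:
        return "alive"
    return None


def credibility(p_count, not_p_count):
    """Return (credibility of P, credibility of not-P) as STANAG grades."""
    if p_count < not_p_count:
        # Evaluate from the majority side, then swap the result back.
        not_p_grade, p_grade = credibility(not_p_count, p_count)
        return p_grade, not_p_grade
    if p_count == 0:
        return 6, 6  # assumption: no reports, truth cannot be judged
    ratio = float("inf") if not_p_count == 0 else p_count / not_p_count
    if ratio > 2:
        return 1, 5  # Confirmed / Improbable
    if ratio > 1.5:
        return 2, 4  # Probably True / Doubtful
    return 3, 3      # Possibly True for both
```

With the counts reported on the slide, `credibility(435, 7)` returns `(1, 5)`. The sketch reproduces the documented false positive: `classify("Wish T H were still alive", "T H")` returns `"alive"`. It counts raw tweets, so retweets and non-independent sources should be removed before the counts are taken.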



Recap: Algorithm

• Identify set of Tweets by Search API on name

• Classify into Dead/Alive content

• Calculate TunkRank on Users

– Discount false retweeters

• Calculate Source Independence

– Group same media URLs; retweets, implicit retweets

– Calculate distance between sources for joint network two hops out for each source.

• @NYTImesPhoto: An attack in Misurata, Libya today killed the photographer Tim Hetherington. 4/20/2011 7:11:15 PM

– TunkRank: 99th percentile; > 5 independent sources assert T H died; 0 alive

– <A: Completely Reliable, 1: Confirmed by Other Sources>

• @Cmovila: Sad news Tim Hetherington died in Misrata now when covering the front line. 4/20/2011 4:39:57 PM

– TunkRank: 0th percentile; > 5 independent sources assert T H died; 0 alive

– <E: Unreliable, 1: Confirmed by Other Sources>

• T H Alive: 5: Improbable
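The grouping step of the recap (same media URLs, retweets, implicit retweets) can be sketched as a single pass over tweets in time order. This is a sketch under assumptions: tweets are dicts with `user`, `text`, and `time` keys, shortened URLs are already dereferenced, and the earliest tweet in each group is kept as the originating source. The slides do not say which member of a group, if any, is kept. The account name `x` in the example is hypothetical.

```python
import re
from datetime import datetime

URL_PATTERN = re.compile(r"https?://\S+")


def originating_sources(tweets):
    """Return the users whose tweets are not (implicit) retweets or URL repeats."""
    seen_texts = set()
    seen_urls = set()
    origins = []
    for t in sorted(tweets, key=lambda t: t["time"]):
        text = t["text"].strip()
        if text.startswith("RT @"):
            continue  # explicit retweet
        urls = set(URL_PATTERN.findall(text))
        repeated = text in seen_texts or bool(urls & seen_urls)
        seen_texts.add(text)
        seen_urls |= urls
        if not repeated:
            origins.append(t["user"])
    return origins


headline = ("Tim Hetherington, photographer and 'Restrepo' co-director, "
            "killed in Misrata, Libya http://nyti.ms/dIm29T")
example = [
    {"user": "Frieze_magazine", "text": headline,
     "time": datetime(2011, 4, 20, 19, 1, 30)},
    {"user": "BWJones", "text": headline,
     "time": datetime(2011, 4, 20, 18, 16, 25)},
    {"user": "x", "text": "RT @BWJones " + headline,
     "time": datetime(2011, 4, 20, 18, 30, 0)},
    {"user": "Cmovila",
     "text": "Sad news Tim Hetherington died in Misrata now when covering the front line.",
     "time": datetime(2011, 4, 20, 16, 39, 57)},
]
print(originating_sources(example))  # ['Cmovila', 'BWJones']
```

The surviving users are the candidates whose pairwise graph distance is then computed to decide independence.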


Notional Architecture

Twitter Search API

Tweet to RDF Conversion

Message Classifier

Twitter API

Distance Calculator

BaseVISor Inference Engine

Tweets Augmented with STANAG 2022 Assessments

TunkRank API


Conclusions

• Treating all Tweets as equally legitimate is OK in non-adversarial, high-volume situations.

• As OSINT, Tweets need to be evaluated according to the STANAG 2022 rubric.

• We have outlined tractable ways to calculate reliability (TunkRank), credibility (sameness of content) and source (in)dependence.

• By converting Tweets to RDF, we can reason about them formally with a formal reasoner (BaseVISor).

• Future work: a large-scale demonstration of efficacy in distinguishing low-confidence death rumors from high-confidence death notices on Twitter.


Questions?
