Open Source Social Media Analytics for Intelligence and Security Informatics Applications Invited Talk at 4th International Big Data Analytics Conference (BDA), Hyderabad, India 2015 December 17, 2015 Swati Agarwal PhD Scholar at Information Management and Data Analytics Group IIIT Delhi, India ([email protected]) Ashish Sureka Principal Scientist at ABB Corporate Research Center Bangalore, India ([email protected]) Vikram Goyal Associate Professor at Indraprastha Institute of Information Technology New Delhi, India ([email protected])




Learning Objective

Agarwal S., Sureka A., Goyal V. - Invited Talk @ 4th International Big Data Analytics Conference, Hyderabad, India (December 17, 2015)

What

• Open source social media intelligence
• Focus of tutorial (sub-topics under the field of ISI research)

Why

• Need and importance of research in the field of ISI
• Technical and computational challenges

How

• Case studies showing the applications of CSR in the domain of ISI
• Future directions


Learning Objective


Remember

Understand

Application

Research

• Keywords and terminology
• What is Intelligence and Security Informatics
• Need and importance of research in the field of ISI
• Current state of the art and scope of research in the domain
• What ideas of computer science research (CSR) you can apply in the ISI domain
• Applying CSR methods for solving the problems raised by ISI


Tutorial Structure


Session I

Introduction to the field of Intelligence and Security Informatics

Need and Scope of Research in the field

Uniqueness of Problem/Area (Technical Challenges)

Session II

Technology Landscape and Current state-of-the-art

Case studies on open source social media intelligence applications in ISI

Leading and relevant venues for publication in the domain of ISI


Agenda

Importance of ISI

Introduction

Technical Challenges

Focus and Scope

Technology Landscape

Case Studies

ISI Leading Venues

Conclusions

Future Directions

References


Importance of Intelligence and Security Informatics


Terrorism

An illegal use of violence, aimed against civilians, in order to achieve different kinds of political ends.


• Secret Intelligence Service (MI6) in United Kingdom
• Started after 9/11 attacks in United States
• FBI, Nation’s Prime Federal Law Enforcement Organization
• Research and Analysis Wing (RAW), started after Sino-Indian and Indo-Pakistan War


Image Sources: Wikipedia


Intelligence and Security Informatics

Technologist, Sociologist, Psychologist

How Computer Science Research can be used for border control, terrorism, crime prevention on Cyber-Security


“The medium is the message.”

- Marshall McLuhan

Role of Social Media


Role of Social Media

• Al-Qaida encourages “homegrown terrorism” and recruitment of young people
• Information shared on social media reaches more individuals who may carry out “lone wolf” operations against Western targets
• 42% of teenagers (15-17 yrs) in the United States use social networking websites


Popular Social Media Websites Statistics

• 50M tweets per day, 640M users on Twitter
• 100 hours of video uploaded on YouTube every minute
• 124.3B posts, 264.6M blogs on Tumblr
• >75M daily active users, 40B photos uploaded on Instagram

For any government organization, law enforcement agency or security analyst, constant monitoring and manual annotation of each post made on social media portals is overwhelmingly impractical.
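To make the scale concrete, here is a rough back-of-the-envelope sketch based on the 50M tweets per day quoted above. The annotation time (10 seconds per post) and the 8-hour working day are hypothetical assumptions chosen only for illustration; they are not figures from the talk.

```python
import math

# Figure quoted on the slide.
TWEETS_PER_DAY = 50_000_000

# Hypothetical assumptions, for illustration only.
SECONDS_PER_ANNOTATION = 10
WORK_HOURS_PER_DAY = 8


def tweets_per_second(tweets_per_day: int) -> float:
    """Average arrival rate over a 24-hour day."""
    return tweets_per_day / (24 * 60 * 60)


def annotators_needed(tweets_per_day: int,
                      seconds_per_annotation: int,
                      work_hours_per_day: int) -> int:
    """Number of full-time annotators needed to label every post each day."""
    total_seconds = tweets_per_day * seconds_per_annotation
    seconds_per_annotator = work_hours_per_day * 3600
    return math.ceil(total_seconds / seconds_per_annotator)


if __name__ == "__main__":
    rate = tweets_per_second(TWEETS_PER_DAY)
    staff = annotators_needed(TWEETS_PER_DAY, SECONDS_PER_ANNOTATION, WORK_HOURS_PER_DAY)
    print(f"{rate:.0f} tweets per second, {staff} annotators per day")
```

Under these assumptions, a single platform alone would need a workforce in the tens of thousands, which is the motivation for automated approaches.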


• 3 men from Mumbai, Maharashtra join ISIS in Iraq (age group 20-25)
• Jake Bilardi, manipulated, radicalized, Jihadist (18 yrs. old, 10th std. student)
• Ali Shukri Amin (17 yrs. old) from Virginia, sentenced to 11 yrs. of prison for raising funds for ISIS through Twitter (@AmeerkiWitness)

Image Sources:

1. http://indiatoday.intoday.in/story/agencies-on-the-tail-of-maharashtras-missing-jihadis/1/372442.html
2. http://www.news.com.au/national/freshfaced-westerners-are-being-lulled-into-terrorism-by-isis-propaganda/news-story/8448148e3a0c33c01b95db4dc1bab492
3. http://www.ibtimes.com/who-ali-shukri-amin-virginia-isis-teenager-behind-pro-islamic-state-twitter-sentenced-2073208


Introduction


Open Source Intelligence

“Intelligence collected and inferred from publicly available and overt sources of information”


Examples: Open Source Data Portals

• Infochimps
• Social Media Websites API


Image Sources: Website Developer Page, Official Websites of Data Portals


Open Source Social Media Intelligence

“A sub-field within OSINT with a focus on extracting insights from publicly available data in Web 2.0 platforms”


Examples: Web 2.0 Open Source Data Portals

• Social Media Websites API
• Wikipedia
• Social Curation (theme or topic based content sharing)
• Crowd Sourcing Websites


Image Sources: Website Developer Page, Official Websites of Data Portals


Online Social Media

• Current generation of the web
• Highly participative and collaborative
• Allows users to post content, messages and comments


Online Social Media

Popularity

Anonymity

High Reachability

Social Networking

Low Publication Barriers


Image Sources: Google Images, http://maliaholleron.com/wp-content/uploads/2012/12/Social-Media-Icons-cloud-300x256.png


• Video Sharing and Hosting Website
• Social Networking Website
• Micro-Blogging Websites
• Community Based Questions and Answering
• Social Bookmarking


Image Sources: Official Websites


Intelligence & Security Informatics

“A field of study concerning investigation and development of counter terrorism, national and international security support systems and applications”


Online Radicalization

Image Sources: http://www.trbimg.com/img-5178e6b1/turbine/la-na-tt-internet-imams-20130425-001/600


Online Radicalization

• Hate and extremism promotion
• Promoting certain ideology and beliefs
• Forming virtual communities online
• Recruiting members for hate promotion
• Brainwashing people affected by several incidents


Online Radicalization

Tags/keywords associated with the post:

boko haram, islam, muslim, radical islam, terrorist, jihad, isis, liberals, al queda, holy quran, je suis charlie, Islamic Jihad, Israel, ak47, guns, rifle, hate, anti-American, violence, battle, weapon, religion of peace


Image Sources: tumblr.com


Online Radicalization

Swarmcast: How Jihadist Networks Maintain a Persistent Online Presence- Ali Fisher (Perspectives on Terrorism, June 2015)

Image Sources: twitter.com


Example: Constructing Communities

A Focused Crawler for Mining Hate and Extremism Promoting Users and Communities on YouTube

Swati Agarwal, Ashish Sureka

Indraprastha Institute of Information Technology, Delhi (IIIT-D)

{swatia, ashish}@iiitd.ac.in

Image Sources: Google Images, antitrustlair.files


Online Civil Disobedience

“Planning and mobilizing of civil unrest related events via web platforms”

Examples: Protests, Public Demonstrations and Riots


Online Civil Disobedience

Image Sources: twitter.com


Machine Learning Techniques

Social Network Analysis

Visualization

Text Mining and Analytics


Technical Challenges


1. Massive Size and High Velocity

• 50M tweets per day, 640M users on Twitter
• 100 hours of videos uploaded on YouTube every minute
• 124.3B posts, 264.6M blogs on Tumblr
• >75M daily active users, 40B photos uploaded on Instagram


2. Rich User Interaction on Social Media Websites

Following, Follower, Re-tweet, Favorite, Share

Like, Comment, Share, Subscriber, Subscription, Friend, Featured Channel

Ask, Follower, Follower, Like, Re-blog, Share, Submit, Reply


3. Multilingualism

Image Sources: twitter.com


4. Noisy Content

Presence of low quality messages and content of low relevance

Image Sources: twitter.com


4. Noisy Content

Grammatical and spelling errors

Image Sources: twitter.com


4. Noisy Content

Presence of non-standard acronyms and abbreviations

Image Sources: twitter.com


4. Noisy Content

Use of emoticons and incorrect capitalization: informal nature of content

Image Sources: twitter.com
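One common way to cope with the kinds of noise listed under "Noisy Content" (informal capitalization, emoticons, elongated words, non-standard abbreviations) is a light normalization pass before any text mining. The following is a minimal sketch, not the pipeline used by the authors; the abbreviation table and the emoticon pattern are small hypothetical examples that a real system would replace with proper lexicons.

```python
import re

# Hypothetical, deliberately tiny lookup table (illustration only).
ABBREVIATIONS = {"u": "you", "r": "are", "gr8": "great", "plz": "please", "b4": "before"}

URL_RE = re.compile(r"https?://\S+")
# Matches a few common western-style emoticons such as :) ;-( :D
EMOTICON_RE = re.compile(r"[:;=]-?[)(DPp]")
# Three or more repeats of the same character, e.g. "soooo" or "!!!"
REPEAT_RE = re.compile(r"(.)\1{2,}")


def normalize(post: str) -> str:
    """Return a lower-cased, de-noised version of a social media post."""
    text = URL_RE.sub(" ", post)           # drop links
    text = EMOTICON_RE.sub(" ", text)      # drop emoticons
    text = text.lower()                    # undo erratic capitalization
    text = REPEAT_RE.sub(r"\1\1", text)    # squeeze elongations to two characters
    tokens = re.findall(r"[a-z0-9']+", text)
    # Expand abbreviations that appear in the (hypothetical) table.
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


if __name__ == "__main__":
    print(normalize("Plz come B4 its 2 late :) soooo GR8!!! http://t.co/x"))
```

The pass is intentionally conservative: it only removes or rewrites tokens it recognizes, so multilingual or domain-specific terms pass through untouched.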


5. Data Annotation and Ground Truth

• Need/Importance: the basis for several machine learning algorithms and for examining their performance
• Effort intensive: a vast amount of data is uploaded on social media every second
• Imbalanced data: 1 in 100,000 posts is hate promoting or posted by extremist users
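The 1-in-100,000 ratio also explains why plain accuracy is a poor yardstick in this setting. The sketch below builds a synthetic label set with exactly that ratio (an illustrative construction, not data from the talk) and scores a trivial classifier that never flags anything.

```python
def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Synthetic labels: 100,000 posts with a single positive (the slide's ratio).
y_true = [0] * 99_999 + [1]
# A trivial "classifier" that labels every post as negative.
always_negative = [0] * 100_000

accuracy, precision, recall = metrics(y_true, always_negative)

if __name__ == "__main__":
    print(f"accuracy={accuracy:.5f} precision={precision:.2f} recall={recall:.2f}")
```

The trivial classifier is almost perfectly accurate yet finds none of the posts of interest, which is why precision and recall on the minority class are the more informative measures here.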


6. Manipulation, Fabrication and Adversarial Behavior

Fake information

Image Sources: twitter.com


6. Manipulation, Fabrication and Adversarial Behavior

Rumors

Image Sources: twitter.com


6. Manipulation, Fabrication and Adversarial Behavior

Manipulative/Misleading Information

Commonly seen in videos

Textual

• Title

• Tags

• Description

Media

• Thumbnail

• Annotation


6. Manipulation, Fabrication and Adversarial Behavior

Image Sources: youtube.com


Focus and Scope


Focus of Tutorial

Exploring two sub-problems within the broader area of Intelligence and Security Informatics:

1. Online Radicalization Detection (presence of extremist content, users and communities on social media websites)
2. Online Civil Unrest Prediction (an early prediction of civil unrest related events using social media content)


Technology Landscape


Online Radicalization


[Figure: timeline, 2006 to 2015, of techniques used for online radicalization detection, in two tracks: Content Identification, and User and Community Identification. Techniques shown: KNN, Support Vector Machine (SVM), Naïve Bayes, Boosting, Adaboost, Logistic Regression, Decision Tree, Rule Based Classifier, Regularized Least Square, Rocchio, N-gram, Language Modeling, Keyword Based Flagging, Honeypots, Face Recognition, TREC, Exploratory Data Analysis (EDA), Topical Crawler (TC), Best First Search, Shark Search, BFS, DFS Link Analysis, Clustering (Blog Spider), OSLOM, Blockmodeling, Multi-dimensional Scaling, Spring Embedder.]


Online Civil Unrest Prediction


[Figure: techniques used for online civil unrest prediction, shown with the cited works. Techniques shown: Stochastic Hybrid Dynamic Model; Avatar ensembles of decision trees; Clustering Algorithm; Map Reduce; Apache Pig; Naïve Bayes; Logistic Regression; Maximum Likelihood; Generalized Linear Model; Dynamic Query Expansion; Heterogeneous Graph Modeling; Mathematical and Theoretical Model; Probabilistic Soft Logic. Works cited: Colbaugh et al.; Hua et al.; Naren et al.; Jiejun Xu et al.; Chen et al.; Compton et al.; Filchenkov et al.; Sathappan et al.]


Case Studies


1. Identification of Online Extremist Content, Users and Hidden Communities on YouTube

S. Agarwal and A. Sureka, A Focused Crawler for Mining Hate and Extremism Promoting Users, Videos and Communities on YouTube, 25th ACM Conference on Hypertext and Social Media (HT 2014), 1-4 Sep 2014, Santiago, Chile


YouTube

Largest and most popular free video hosting and sharing website, founded in 2005


Research Problem

“Mining user generated content on social media platforms to identify topic based hate promoting content and locating hidden & virtual communities of extremist users.”


Research Contributions


• Application of focused crawler for identifying extremist content and locating hate promoting users on YouTube
  • Best First Search
  • Shark Search
• Content based characterization of hate promoting videos
  • Focus of the content shown in video
  • Targeted audiences
  • Keywords present in the contextual metadata and spoken in the video


Focused/Topical Crawler

Agarwal S., Sureka A., Goyal V. - Invited Talk @ 4th International Big Data Analytics Conference, Hyderabad, India (December 17, 2015)

Source: Dongyang Hou et. al; 2014


1. Subscriber

2. Featured Channel

3. Public Contact

[Figure: frontier expansion example. Channels reached through {Subscriber, Featured Channel, Public Contact} links are assigned relevance scores (0.55, 0.85, 0.35, 0.25, 0.32 in the example), and only channels with a score > th are followed. Threshold: 0.30]
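The thresholded expansion in this example can be sketched as a best-first crawl over a toy graph. This is only an illustration under stated assumptions: the graph, the node names and the `relevance` lookup are hypothetical, the scores are borrowed from the figure, and the real crawler would fetch Subscriber, Featured Channel and Public Contact links from YouTube instead of reading a dictionary. Shark Search, which additionally passes a decayed score from parent to child, is not shown.

```python
import heapq

def best_first_crawl(seeds, get_links, relevance, threshold=0.30, max_nodes=100):
    """Best-first focused crawl: always expand the highest-scoring frontier node."""
    # heapq is a min-heap, so scores are negated to pop the best node first.
    frontier = [(-relevance(s), s) for s in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    while frontier and len(relevant) < max_nodes:
        neg_score, node = heapq.heappop(frontier)
        if -neg_score <= threshold:
            continue  # at or below the threshold: treated as irrelevant, not expanded
        relevant.append(node)
        for nxt in get_links(node):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-relevance(nxt), nxt))
    return relevant

# Hypothetical toy graph; edges stand in for subscriber / featured-channel / contact links.
links = {"seed": ["A", "B", "C"], "A": ["D"], "B": ["E"]}
scores = {"seed": 1.0, "A": 0.55, "B": 0.85, "C": 0.25, "D": 0.35, "E": 0.32}

crawled = best_first_crawl(["seed"], lambda n: links.get(n, []), scores.get)
print(crawled)
```

With a threshold of 0.30 the node scoring 0.25 is discarded, while the nodes scoring 0.32 and 0.35 are still visited, only later than the higher-scoring ones.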

Proposed Framework

[Figure: proposed framework. The only label recovered from the diagram is “Titles of Videos”.]


Experimental Setup

• Discriminatory feature set:
    • YouTube profile summary
    • User activity feeds (titles of videos uploaded, shared, commented on and favorited)
• Training data: discriminatory features for 35 hate-promoting channels (HinduismIslam, IndiaEternal, ISIcyberAGENT, etc.)
• Dynamic parameters:
    • 10 different YouTube channels as seeds (PakistanRoxxx, hiddenpakistani, GreaterPakistan, etc.)
    • Character n-gram (3, 5)
    • Threshold value for relevance computation (-2.0, -2.5, -3.0)
• Focused crawler: Best First Search and Shark Search
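The slides do not state how the relevance score is computed from character n-grams. One plausible reading, given the negative threshold values, is an average log-probability under a character n-gram profile learned from the training channels. The sketch below assumes exactly that: add-one smoothing, base-10 logarithms, the class name `CharNgramProfile` and the neutral placeholder texts are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of a lower-cased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

class CharNgramProfile:
    """Character n-gram profile with add-one smoothing (illustrative only)."""

    def __init__(self, training_texts, n=3):
        self.n = n
        self.counts = Counter()
        for text in training_texts:
            self.counts.update(char_ngrams(text, n))
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra slot for unseen n-grams

    def score(self, text):
        """Average log10 probability per n-gram; closer to 0 means more similar."""
        grams = char_ngrams(text, self.n)
        if not grams:
            return float("-inf")
        logp = sum(
            math.log10((self.counts[g] + 1) / (self.total + self.vocab))
            for g in grams
        )
        return logp / len(grams)

# Neutral placeholder texts standing in for profile summaries and video titles.
profile = CharNgramProfile(
    ["pasta recipe with tomato sauce", "easy tomato pasta recipe"], n=3
)
on_topic = profile.score("tomato pasta")
off_topic = profile.score("quantum physics lecture")
print(on_topic, off_topic)
```

A channel would then be kept when its score exceeds the chosen threshold, so a stricter threshold such as -2.0 admits fewer channels than -3.0.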

Experimental Results: Focused Crawler

Confusion Matrix for Best First Search and Shark Search Algorithm (60 iterations each)

Best First Search approach:

Actual \ Predicted | Relevant | Irrelevant
Relevant | 921 | 314
Irrelevant | 125 | 67

Shark Search approach:

Actual \ Predicted | Relevant | Irrelevant
Relevant | 991 | 295
Irrelevant | 55 | 29

Accuracy Results for Best First Search and Shark Search Algorithm (60 iterations each)

Approach | TPR (Recall) | TNR | PPV (Precision) | NPV | F1-Score | Accuracy
BFS | 0.75 | 0.35 | 0.88 | 0.18 | 0.81 | 0.69
SSA | 0.77 | 0.35 | 0.95 | 0.09 | 0.85 | 0.74
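The accuracy figures can be recomputed from the confusion-matrix counts using the standard definitions of each metric. The function name and dictionary keys below are illustrative choices; the counts are the ones reported for the two crawlers.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # TPR
    tnr = tn / (tn + fp)           # true negative rate
    precision = tp / (tp + fp)     # PPV
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "recall": recall, "tnr": tnr, "precision": precision,
        "npv": npv, "f1": f1, "accuracy": accuracy,
    }

# Rows are actual classes, columns are predicted classes.
bfs = binary_metrics(tp=921, fn=314, fp=125, tn=67)
ssa = binary_metrics(tp=991, fn=295, fp=55, tn=29)

for name, m in (("BFS", bfs), ("SSA", ssa)):
    print(name, {k: round(v, 2) for k, v in m.items()})
```

Rounded to two decimals, both rows match the reported values.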

Social Network Analysis

[Figure: community graph of YouTube users after applying focused crawlers: Best First Search (left), Shark Search (right)]

Examples: Content-Based Characterization of Videos

Videos | YT Category | Avg. Length (Sec) | Content Focus | Target Audience | Keywords
43 | News, Non-profit | 151.68 | Honor Killing, Harassment | Women, Refugee People, Children | Child Marriage, Rape, Responsibility, Protest, Women, Asylum, Arrested, Brutal, Slave
93 | News, Auto-Vehicle, Politics | 2526.16 | Islam Promotion | Jewish, Muslim People | Taliban, Bombs, Battle, Courage, Allah, Islam, Courage, Belief, Macca, Money, Shaheed, Enemies Of Islam
30 | Entertainment | 319.61 | Anti-India | India Haters | Kashmir, Poverty, Liberate, Hindu, Beggars, Pundit, Untouchable, Extremism, Attack, Killed, Anti-Muslim, Anti-Pakistani, Hatred, Masks, Freedom

Examples: Content-Based Characterization of Videos (continued)

Videos | YT Category | Avg. Length (Sec) | Content Focus | Target Audience | Keywords
25 | News, Politics & Education | 1225.56 | Liberate Kashmir | Kashmiri People | Muslim, Army, Military, 1947, Partition, Azad Kashmir, Liberate Kashmir, Pakistan, India, Killing, Murder, Border, Fighting, Democracy, Martyr, Torture
83 | News & Politics | 349.28 | Anti-Muslims | Pakistan Haters | Kashmir, Jihad, Pakistan, India, Quran, Muslim, Hindu, Qatil, Zakir Naik, Hate Speech, Masjid, Pandit, Defence, Madarsa, Tribute, Bharat, America, Attack, Napak, Holy, Kabba

Content-Based Characterization of Videos

Type of Video Content: Speech, News Segments, Drawing, Interviews, Group Discussion, Animated Videos, Lectures, Cartoon and Comics, Debate, Recorded Videos, Textual Messages, Pictures with Background Music

2. Early Prediction of Civil Unrest Related Events on Twitter

Twitter

Largest and most popular free micro-blogging and social networking website, founded in 2006

Tweet

• An expression of a moment or idea
• Text, Photos, Videos, Location, Poll, External URLs, hashtag, @
• Maximum length of 140 characters

Activities

• Comment on a Tweet and join the conversation
• Share a Tweet with your followers and add your thoughts before sharing
• Favorite a Tweet to let the author know that you like their post
• Assign a topic to a Tweet using ‘#’ to make the Tweet easily searchable
• Follow other micro-bloggers to receive their Tweets in your newsfeed

Research Problem

• A characterization study on an open source Twitter dataset to investigate the feasibility of building an event forecasting model
• Evaluating the performance of machine learning and statistical forecasting models on real-world data

Research Problem

• Spatial: location l where the protest is going to happen
• Temporal: time expression ti (day, date and time) of the protest
• Topic: topic to, i.e. what the protest is about (root cause of the protest)


Research Contributions

• A content-based characterization and semantic enrichment of raw tweets to classify “crowd-buzz & commentary” and “mobilization & planning” microposts
• Investigating the application of trend analysis (captured along the sliding window) for event forecasting

Proposed Framework

[Figure: proposed framework. Labels recovered from the diagram: “Learning Models”, “Named Entity Recognition”.]

WebObservatory: https://web-001.ecs.soton.ac.uk/wo/dataset

Experimental Setup

• Dataset:
    • ‘Immigration’ tweets dataset downloaded from Southampton University
    • 2M tweets (October 1, 2013 to February 28, 2014)
    • Collection of tweets containing the terms ‘immigration’, ‘migration’, ‘immigrant’, ‘migrant’
• Civil unrest related events:
    • FastForFamilies in National Mall, USA
    • Christmas Island Hunger Strike, Australia
• Sliding window: 7 days
    • For the 16th January event: 9 to 15 January
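The 7-day sliding window can be expressed as a small date calculation. This is a minimal sketch: the function names are illustrative, and the year 2014 is an assumption inferred from the dataset range (October 2013 to February 2014), since the slide gives only the day and month.

```python
from datetime import date, timedelta

def sliding_window(event_day, days=7):
    """Return the first and last day of the window that ends the day before the event."""
    start = event_day - timedelta(days=days)
    end = event_day - timedelta(days=1)
    return start, end

def in_window(tweet_day, event_day, days=7):
    """True if a tweet posted on tweet_day falls inside the event's window."""
    start, end = sliding_window(event_day, days)
    return start <= tweet_day <= end

event = date(2014, 1, 16)  # year assumed from the dataset range
window = sliding_window(event)
print(window[0].isoformat(), "to", window[1].isoformat())
```

Tweets posted on the event day itself fall outside the window, which keeps the forecasting input strictly earlier than the event.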

[Figure: available locations of user profiles for the Christmas Island Hunger Strike event]

Lead Indicator Classifier

• Crowd-Buzz and Commentary
• Planning and Mobilization

Event Forecasting Model

• Pairs of named entities: (ti, to), (to, l) and (l, ti)
• Extract all expressions of entity x that are θ-frequent
• Extract all expressions of paired entity y that are Ψ-frequent for at least one x
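The two extraction steps can be sketched as frequency counting over enriched tweets. Several things here are assumptions made for illustration: θ and Ψ are treated as absolute tweet counts, each tweet is a dictionary mapping an entity type to the expressions found in it, and the toy tweets and the function name `frequent_pairs` are hypothetical.

```python
from collections import Counter

def frequent_pairs(tweets, x_key, y_key, theta, psi):
    """Return (x, y) pairs where x occurs in >= theta tweets and
    y co-occurs with that x in >= psi tweets."""
    # Step 1: expressions of entity x that are theta-frequent.
    x_counts = Counter()
    for tweet in tweets:
        x_counts.update(set(tweet.get(x_key, ())))
    frequent_x = {x for x, count in x_counts.items() if count >= theta}

    # Step 2: expressions of paired entity y that are psi-frequent for some frequent x.
    pair_counts = Counter()
    for tweet in tweets:
        for x in set(tweet.get(x_key, ())) & frequent_x:
            for y in set(tweet.get(y_key, ())):
                pair_counts[(x, y)] += 1
    return {pair: count for pair, count in pair_counts.items() if count >= psi}

# Hypothetical enriched tweets: "l" holds locations, "to" holds topics.
tweets = [
    {"l": ["Christmas Island"], "to": ["hunger strike"]},
    {"l": ["Christmas Island"], "to": ["hunger strike"]},
    {"l": ["Christmas Island"], "to": ["asylum"]},
    {"l": ["National Mall"], "to": ["fasting"]},
]
pairs = frequent_pairs(tweets, "l", "to", theta=2, psi=2)
print(pairs)
```

The same function covers the other two pairings by swapping the keys, for example a time-expression key with the topic key.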

[Figure: labels recovered from the diagram: US, AU, EU, Right of Asylum, Migrant Worker, Refugee]

Experimental Results

Confusion Matrix for Crowd-Buzz and Mobilization & Planning Classifiers

Crowd-Buzz classifier:

Actual \ Predicted | C&C | NA
C&C | 719 | 109
NA | 137 | 82007

Mobilization and Planning classifier:

Actual \ Predicted | M&P_E | M&P_G | NA
M&P_E | 127 | 6 | 9
M&P_G | 7 | 1236 | 22
NA | 13 | 18 | 81541

Accuracy Results for Crowd-Buzz and Mobilization & Planning Classifiers

Classifier | Precision | Recall | F1-Score
C&C | 0.84 | 0.86 | 0.85
M&P | 0.89 | 0.81 | 0.85
    | 0.96 | 0.98 | 0.97
    | 0.99 | 0.99 | 0.99

Experimental Results

[Figure: distribution of χ2 and p-value for frequent pairs of locations (Australia (left), U.S. (right)) and topics for 3 consecutive days in the sliding window (Christmas Island Hunger Strike)]
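A χ2 statistic and p-value for one location-topic pair can be obtained from a 2x2 contingency table of tweet counts. The sketch below uses the Pearson χ2 test without continuity correction and the closed-form p-value for one degree of freedom; whether the authors used this exact variant is not stated, and the counts are hypothetical.

```python
import math

def chi_square_2x2(both, x_only, y_only, neither):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    both    : tweets mentioning location x and topic y
    x_only  : tweets mentioning x but not y
    y_only  : tweets mentioning y but not x
    neither : tweets mentioning neither
    Assumes every row and column total is non-zero.
    """
    n = both + x_only + y_only + neither
    row1, row2 = both + x_only, y_only + neither
    col1, col2 = both + y_only, x_only + neither
    chi2 = n * (both * neither - x_only * y_only) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for one day in the sliding window.
chi2, p = chi_square_2x2(both=40, x_only=10, y_only=10, neither=40)
print(chi2, p)
```

A pair would be reported as significantly correlated when its p-value falls below a chosen significance level, for example 0.05.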

ISI Leading Conferences and Journals

SocialComm SASCVV

SocialComm, ISI, EISIC, WebSci, PAISI, ICWSM

Journals: Security Informatics, CyberTerrorism, Criminal Justice and Popular Culture

Machine Learning/NLP: BDA, COMAD, ICDCIT, PAKDD, SIGKDD, CIKM, AAAI

Conclusions

• OSSINT is a growing research field and has attracted the attention of several researchers in the past decade
• We discuss two major applications of OSSINT: Online Radicalization and Online Civil Disobedience
• YouTube is the most widely used platform for online radicalization, hate and extremism promotion
• KNN, SVM, Clustering, Naïve Bayes, KBF and EDA are commonly used techniques for hate promotion detection

Conclusions (continued)

• A focused crawler is an efficient approach to locating hate-promoting content on YouTube
• User activity feeds can be used as discriminatory features to identify a hate-promoting user channel
• The SSA approach has higher precision, recall and F-score than the BFS approach
• Social Network Analysis helps us to locate hidden communities and the users playing a major role in a community
• Hate-promoting users upload videos targeting specific audiences who are affected by the incidents shown in the video

Conclusions (continued)

• Twitter plays an important role in facilitating the mobilization and planning of civil unrest related events
• Clustering, Logistic Regression, Dynamic Query Expansion and Ensemble Learning are the most commonly used techniques for civil unrest event prediction
• We present an approach for early detection of civil unrest related events, evaluated on a real-world dataset (open source Twitter data)
• We perform semantic enrichment on raw tweets and filter event-related tweets (crowd-buzz and mobilization)

Conclusions (continued)

• We develop a frequency-based model on enriched tweets and find those pairs of locations, topics and times that are significantly correlated
• Trend analysis of spatial, temporal and topic-based entities in a sliding window is an efficient approach for event forecasting
• Early identification of crowd-buzz and mobilization tweets adds value to event forecasting

Future Directions

• Investigating the application of parallel corpora for detecting online radicalization communities across platforms
• Detecting protest related events in real time, and events with overlapping sliding windows (multiple events occurring in the same time frame)

References

1. Agarwal, S., Sureka, A.: Copyright infringement detection of music videos on youtube by mining video and uploader meta-data. In: Big Data Analytics (BDA). pp. 48–67 (2013)
2. Agarwal, S., Sureka, A.: A focused crawler for mining hate and extremism promoting videos on youtube. In: 25th ACM Conference on Hypertext and Social Media (HT). pp. 294–296 (2014)
3. Agarwal, S., Sureka, A.: Learning to classify hate and extremism promoting tweets. In: Intelligence and Security Informatics Conference (JISIC). pp. 320–320 (2014)
4. Agarwal, S., Sureka, A.: Topic-specific youtube crawling to detect online radicalization. In: Databases in Networked Information Systems (DNIS), 2015. pp. 133–151 (2015)
5. Agarwal, S., Sureka, A.: A topical crawler for uncovering hidden communities of extremist micro-bloggers on tumblr. In: 5th Workshop on Making Sense of Microposts (MICROPOSTS) (2015)
6. Agarwal, S., Sureka, A.: Using common-sense knowledge-base for detecting word obfuscation in adversarial communication. In: Workshop on Future Information Security (FIS) (2015)
7. Agarwal, S., Sureka, A.: Using knn and svm based one-class classifier for detecting online radicalization on twitter. In: Distributed Computing and Internet Technology (ICDCIT). pp. 431–442 (2015)
8. Aggarwal, N., Agarwal, S., Sureka, A.: Mining youtube metadata for detecting privacy invading harassment and misdemeanor videos. In: Privacy, Security and Trust (PST). pp. 84–93 (2014)

9. Budak, C., Georgiou, T., Agrawal, D., El Abbadi, A.: Geoscope: Online detection of geo-correlated information trends in social networks. Proceedings of the VLDB Endowment 7, 229–240 (2013)
10. Compton, R., Lee, C.: Detecting future social unrest in unprocessed twitter data: emerging phenomena and big data. In: Intelligence and Security Informatics (ISI). pp. 56–60 (2013)
11. Fu, T., Huang, C.N., Chen, H.: Identification of extremist videos in online video sharing sites. In: Intelligence and Security Informatics, 2009. ISI ’09. IEEE International Conference on. pp. 179–181 (June 2009)
12. Hua, T., Lu, C.T., Ramakrishnan, N.: Analyzing civil unrest through social media. Computer 46(12), 80–84 (Dec 2013)
13. Kwok, I., Wang, Y.: Locate the hate: Detecting tweets against blacks. In: AAAI (2013)
14. Qazvinian, V., Rosengren, E., Radev, D.R., Mei, Q.: Rumor has it: Identifying misinformation in microblogs. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. pp. 1589–1599. Stroudsburg, PA, USA (2011)
15. Ramakrishnan, N., Butler, P., Muthiah, S.: ’beating the news’ with embers: Forecasting civil unrest using open source indicators. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 1799–1808. KDD ’14, ACM, New York, NY, USA (2014)
16. Wang, M., Alan, C.G.: Intelligence and security informatics. Pacific Asia Workshop (PAISI) (2011)

Thanks! Questions?

Contact:

[email protected]

[email protected]

[email protected]