REALIZING BUSINESS VALUE FROM OPEN SOURCE DATA AND OPEN SOURCE INTELLIGENCE

Presented by: Chris Morgan

Ikanow oanyc summit



Page 2: Ikanow oanyc summit

http://bit.ly/data-vending

DATA AND ART (PRIMER)

Providing value from the potential of bad news: serving out a bag of salty potato chips

Harnessing the power of open data and sentiment

Page 3: Ikanow oanyc summit

Intelligence

Data → Operational Lens → Intelligence

Intelligence is information that has been transformed to meet an operational need.

Page 4: Ikanow oanyc summit

Intelligence Cycle

No matter what methodology you use… intelligence analysis is an iterative process.

Page 5: Ikanow oanyc summit

Open Source Analysis Goals

• Provide value to the organization – turn data into intelligence using an “operational lens”
• Ensure cyclical feedback occurs during collection, processing, analysis, and consumption
• Validate that a particular network is the right source of data for the questions you need answered

Page 6: Ikanow oanyc summit

Common Pitfalls

Analyzing What Instead of Why

The important thing is often not what people are saying… but why they are saying it.

Page 7: Ikanow oanyc summit

Common Pitfalls

Using the Wrong Analysis Tools

Reporting tools rarely help dig into the why. Many common tools, reports, and metrics are misleading:
– Word clouds atomize message context
– Sentiment metrics are often highly inaccurate
– Information in aggregate hides more than it reveals
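The last point can be illustrated with a tiny example. This is only a sketch with made-up numbers for two hypothetical senders, not data from the talk: their weekly sentiment moves in opposite directions, so the pooled average stays flat and hides both trends.

```python
from statistics import mean

# Made-up weekly average sentiment for two hypothetical senders.
sender_a = [0.5, 0.25, -0.25, -0.5]   # steadily turning negative
sender_b = [-0.5, -0.25, 0.25, 0.5]   # steadily turning positive

# Pooling both senders per week yields a flat line at zero...
pooled = [mean(pair) for pair in zip(sender_a, sender_b)]

# ...while the per-sender view shows two opposite trends.
change_a = sender_a[-1] - sender_a[0]
change_b = sender_b[-1] - sender_b[0]

print("pooled:", pooled)
print("change for sender A:", change_a)
print("change for sender B:", change_b)
```

Looking only at `pooled` would suggest nothing happened, which is why the per-sender breakdown matters.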

Page 8: Ikanow oanyc summit

Use Case

Sentiment Analysis

http://bit.ly/ikanow-and-r

Page 9: Ikanow oanyc summit

Enron Sentiment Analysis

Data source

~500,000 publicly available Enron emails


Page 10: Ikanow oanyc summit

Enron Sentiment Analysis

Hypothesis

Use sentiment analysis as a first-order process to prioritize and streamline the overall analysis process.


Page 11: Ikanow oanyc summit

Enron Sentiment Analysis

Caveats

• Sentiment was attributed only to the sender
• Not a complete representation of an organization's email corpus
• Counteraction of uneven coverage was estimated
• Not a full analysis of the set of information (the objective was to use sentiment analysis as a reduction technique)


Page 12: Ikanow oanyc summit

Workflow

• Data Ingestion Process
– Extraction of entities, events, facts, and some basic statistics
• Aggregation and Reduction
– Aggregation of keywords with sentiment from each email
– Average sentiment score
– Follow-on aggregation by email address of the sender over a given week (average sentiment score)
• Visualize and Analyze
– Imported into Infinit.e and R for visualization
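The aggregation and reduction steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the Infinit.e and R pipeline used in the talk. The record layout (`sender`, `sent`, `keyword_scores`), the helper names, and the sample values are all hypothetical, and the ISO calendar week is assumed as the weekly bucket.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical input: one record per email, with sentiment scores already
# attached to the keywords extracted during ingestion.
emails = [
    {"sender": "a@example.com", "sent": date(2001, 10, 1), "keyword_scores": [0.4, 0.2]},
    {"sender": "a@example.com", "sent": date(2001, 10, 3), "keyword_scores": [-0.6]},
    {"sender": "a@example.com", "sent": date(2001, 10, 9), "keyword_scores": [-0.8, -0.4]},
    {"sender": "b@example.com", "sent": date(2001, 10, 2), "keyword_scores": [0.5]},
]


def email_score(email):
    """Average the keyword sentiment scores of a single email."""
    return mean(email["keyword_scores"])


def weekly_sender_sentiment(emails):
    """Average per-email scores by (sender, ISO year, ISO week)."""
    buckets = defaultdict(list)
    for email in emails:
        iso = email["sent"].isocalendar()
        iso_year, iso_week = iso[0], iso[1]
        buckets[(email["sender"], iso_year, iso_week)].append(email_score(email))
    return {key: mean(scores) for key, scores in buckets.items()}


weekly = weekly_sender_sentiment(emails)
for key in sorted(weekly):
    print(key, round(weekly[key], 2))
```

Each key in `weekly` is one sender-week, which is the unit the later charts and findings work with. Emails with no scored keywords would need to be filtered out first, since `mean` raises an error on an empty list.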


Page 13: Ikanow oanyc summit

Workflow

One email sender's weekly average sentiment across time

• Horizontal Bar
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Left
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Right
– Heuristic: weeks with abrupt negative shifts indicated problems in the organization
– Positive sentiment = Blue
– Negative sentiment = Red
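The "abrupt negative shift" heuristic could be expressed along these lines. This is a sketch under stated assumptions: the talk does not say how "abrupt" was defined, so the week-over-week comparison, the `min_drop` threshold, and the sample scores below are all hypothetical.

```python
def abrupt_negative_shifts(weekly_scores, min_drop=0.5):
    """Return (week, drop) pairs where the average fell sharply week over week.

    weekly_scores: list of (week_label, average_sentiment) in time order.
    min_drop: hypothetical threshold; the talk does not state one.
    """
    flagged = []
    for (_, prev), (week, curr) in zip(weekly_scores, weekly_scores[1:]):
        drop = prev - curr
        if drop >= min_drop:
            flagged.append((week, drop))
    return flagged


# Made-up scores for one sender, for illustration only.
series = [("2001-W38", 0.25), ("2001-W39", 0.5), ("2001-W40", -0.25), ("2001-W41", -0.5)]
flagged = abrupt_negative_shifts(series)
print(flagged)
```

Here only the week in which the average falls from 0.5 to -0.25 is flagged; the smaller decline in the following week stays below the assumed threshold.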

Page 14: Ikanow oanyc summit

Workflow

Close-up snapshot of a subset of 20 individuals' average email sentiment scores over time

Page 15: Ikanow oanyc summit

Workflow

Individual analysis based on the reduction of the information by the sentiment analysis process

Page 16: Ikanow oanyc summit

Findings

• Indicators and Additional Analysis
– 801 weeks highlighted out of 11,500 weeks as important for further investigation
– Keywords found could further be used to statistically investigate the 801 weeks highlighted for manual review
– Individual evaluation of emails highlighted through a reduction process (case construction)
– Pipeline created for further analysis
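To put the reduction in proportion, the two figures above can be turned into percentages. The counts come from the slide; the variable names are only for illustration.

```python
highlighted_weeks = 801
total_weeks = 11_500

share_flagged = highlighted_weeks / total_weeks  # fraction left for manual review
reduction = 1 - share_flagged                    # fraction set aside by the filter

print(f"Flagged for review: {share_flagged:.1%}")  # Flagged for review: 7.0%
print(f"Set aside:          {reduction:.1%}")      # Set aside:          93.0%
```

In other words, the sentiment filter left roughly one week in fourteen for manual review.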

Page 17: Ikanow oanyc summit

Lessons Learned

1. Drastically reduced the timeline necessary for case construction

Page 18: Ikanow oanyc summit

Lessons Learned

2. Multiple contexts for this type of technique

• Intelligence Analysis
• E-Discovery
• Brand Management
• Social Media Analysis

Page 19: Ikanow oanyc summit

Lessons Learned

3. Only negative shifts were investigated; analysis of positive shifts could easily be applied to different questions in other use cases

Page 20: Ikanow oanyc summit

Lessons Learned

4. R and Infinit.e provide an interesting technology integration for evaluating and reducing unstructured data

Page 21: Ikanow oanyc summit

Chris Morgan

www.ikanow.com

THANK YOU

github.com/ikanow/infinit.e