Tim Budden: "Unlocking Insights from Social Data"


Unlocking insights from social data

Tim Budden, VP Data Science at DataSift

Drew Conway’s Data Science Venn Diagram

Agenda

1 Expanding data universe
2 Evolution of social data
3 Social data analytics
4 Privacy by design
5 Examples

1 Expanding digital universe

Expanding digital universe

Expanding human data universe

2 Evolution of social data

The evolution of social data

From public to non-public spaces:

Public, walled garden, 1 to 1, image-based

Public

Where brands and consumers most commonly engage directly. This is where customer support and brand perception can be addressed directly by a brand.

Walled garden

Users engage each other in a non-public but large network. This is where users are more candid about their aspirations and attitudes toward brands.

1 to 1

Users engage each other directly on a one-to-one or small group basis. Thus far this space has been considered largely off limits to brands.

Image-based

Public spaces where people showcase their best visual content.

3 Social data analytics

Business applications of social media

Unlocking Insights from 2.1B People on Social Networks

2.1B People Globally on Social Networks

Challenges to extracting insights from data:

Volume and velocity
Natural Language
Privacy

Example analytics project: Run on the banks?

The Bank of England experimented with predicting a bank run in the days preceding the Scottish independence referendum.

Observed a spike on 15 September in tweets mentioning “RBS” and “run”.

Scottish independence referendum

Run on the banks?

“Great run there! Arm tackles don’t bring down good RBs”

Ambiguity in natural language
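
The false positive above can be reproduced with a minimal sketch of keyword matching. The first tweet is a hypothetical example invented for illustration; the second is the quote from the slide. The function names and the case-sensitive heuristic are assumptions, not the method the Bank of England used.

```python
import re

TWEETS = [
    # Hypothetical tweet, invented for illustration only.
    "Worried about a run on RBS before the referendum",
    # Quote from the slide: American football, "RBs" means running backs.
    "Great run there! Arm tackles don't bring down good RBs",
]

def naive_match(text: str) -> bool:
    """Case-insensitive substring filter for both 'rbs' and 'run'."""
    lowered = text.lower()
    return "rbs" in lowered and "run" in lowered

def stricter_match(text: str) -> bool:
    """Require the exact upper-case token 'RBS' plus the whole word 'run'."""
    has_bank = re.search(r"\bRBS\b", text) is not None
    has_run = re.search(r"\brun\b", text, re.IGNORECASE) is not None
    return has_bank and has_run

for tweet in TWEETS:
    print(naive_match(tweet), stricter_match(tweet), tweet)
```

The naive filter flags both tweets, while the case-sensitive variant rejects the football one. Casing is a fragile signal on social media, so this only illustrates why ambiguity needs more than keyword matching.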

Synonymity in natural language

word2vec

king - man = queen - woman

Berlin - Germany = Paris - France
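
The vector arithmetic behind these analogies can be sketched with toy numbers. The vectors below are hand-made for illustration and are not trained word2vec embeddings; the axis meanings and the `analogy` helper are assumptions introduced here.

```python
import math

# Hand-made toy vectors (NOT trained embeddings). The axes loosely stand for
# [royalty, gender, fruit] and exist only to illustrate the arithmetic.
VECTORS = {
    "king":  [0.9,  0.8, 0.0],
    "queen": [0.9, -0.8, 0.0],
    "man":   [0.1,  0.8, 0.0],
    "woman": [0.1, -0.8, 0.0],
    "apple": [0.0,  0.0, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding inputs."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

# king - man + woman lands nearest to queen in this toy space.
print(analogy("king", "man", "woman"))
```

With real embeddings the equality is only approximate, which is why the lookup uses nearest neighbour by cosine similarity rather than an exact match.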

https://spacy.io/demos/sense2vec?NFL

4 Privacy by design

How can information useful to business be extracted from non-public spaces, while wholeheartedly respecting people’s privacy?

Think in terms of audiences and demographics, not individuals

Figure: Henman Hill at Wimbledon. Example comments: “Come on Djokovic!”, “Come on Roger!”, “Go for it Novak!”, “Great shot Federer!”. Chart categories: Djokovic, Federer; female, male.
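
Counting by audience segment rather than by person might look like the sketch below. The comment texts come from the slide; the author ids, the gender labels, and the alias list are hypothetical and invented purely for illustration.

```python
from collections import Counter

# Hypothetical records: only the comment text comes from the slide.
# Author ids and gender labels are invented for illustration.
RECORDS = [
    {"author": "a1", "gender": "female", "text": "Come on Djokovic!"},
    {"author": "a2", "gender": "male", "text": "Come on Roger!"},
    {"author": "a3", "gender": "female", "text": "Go for it Novak!"},
    {"author": "a4", "gender": "male", "text": "Great shot Federer!"},
]

# Assumed alias list mapping first names and surnames to one player.
ALIASES = {
    "Djokovic": ("djokovic", "novak"),
    "Federer": ("federer", "roger"),
}

def player_mentioned(text):
    """Return the player a comment refers to, or None if no alias matches."""
    lowered = text.lower()
    for player, names in ALIASES.items():
        if any(name in lowered for name in names):
            return player
    return None

def audience_counts(records):
    """Count comments per (player, gender); author ids never reach the output."""
    counts = Counter()
    for record in records:
        player = player_mentioned(record["text"])
        if player is not None:
            counts[(player, record["gender"])] += 1
    return counts

for segment, count in sorted(audience_counts(RECORDS).items()):
    print(segment, count)
```

The output holds only segment totals, so the analysis answers "which audience supports whom" without exposing who said what.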

Think in terms of topics and attitudes, not verbatim

Sumptuous interior!

Beautiful lines!

Lots of storage

PYLON: Anonymised and Aggregated insights

Text available to algorithms but not output

Aggregated results

Audience sizes are quantised: minimum bucket size and intervals

Anonymised: all Personally Identifiable Information (PII) is dropped
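
The quantisation and anonymisation rules above can be sketched as follows. The threshold, the interval, the rounding direction, and the field names are all hypothetical values chosen for illustration; they are not PYLON's actual settings or API.

```python
from typing import Optional

MIN_BUCKET = 100   # hypothetical minimum audience size that may be reported
INTERVAL = 100     # hypothetical rounding interval for reported sizes
PII_FIELDS = {"user_id", "name"}  # hypothetical PII field names

def drop_pii(record: dict) -> dict:
    """Return a copy of the record without any PII fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}

def quantise(count: int,
             min_bucket: int = MIN_BUCKET,
             interval: int = INTERVAL) -> Optional[int]:
    """Suppress small audiences and round the rest down to the interval.

    Rounding down (rather than to the nearest interval) is an assumption.
    """
    if count < min_bucket:
        return None  # too few people: report nothing
    return (count // interval) * interval

print(drop_pii({"user_id": 42, "name": "example", "gender": "male"}))
print(quantise(57), quantise(1234))
```

Suppressing small buckets prevents a result from singling out a handful of people, and rounding prevents exact counts from being differenced across queries.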

API

DS

Topic Data is Multi-Dimensional. Build Insights into Content, Engagement, Audiences

CONTENT: Can’t wait to take the kids to watch Star Wars VII

DEMOGRAPHICS: Gender: Male; Age Range: 35-44; Region: California, USA

SENTIMENT: Negative, Neutral, Positive

TOPIC ANALYSIS: Automatic classification of related topics, e.g. Star Wars VII (Film)

LINKS: Analyze URLs shared across Facebook

ENGAGEMENT: Engagement and Demographics around Likes, Comments and Shares

TEXT ANALYSIS: Privacy-safe aggregate analysis of text

5 Examples

Analysing and visualising automotive

websequencediagrams.com

Writing the script with Facebook topic data

Unlocking Insights from 2.1B People on Social Networks

2.1B People Globally on Social Networks

Challenges to extracting insights from data:

Volume and velocity
Natural Language
Privacy

THANK YOU
