Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting Social Network Data from Twitter Shalin Hai-Jew Kansas State University 2014 National Extension Technology Conference May 2014

Presentation Overview

• This presentation introduces methods for extracting and analyzing social network data from Twitter using NodeXL: hashtag conversations (and emergent events), event graphs, search networks, and user ego neighborhoods. There will be direct demonstrations and discussions of how to analyze social network graphs. The extracted data may be further analyzed with human- and/or machine-based sentiment analysis.

Self-Intros

• Do you use Twitter? If so, how?

• Who do you follow on Twitter, and why?

• Have you analyzed your own social networks on Twitter? What’s the company you keep (online)?

• Have you ever created a hashtag for a formal conference event? Were you able to gain some insights about what your participants were experiencing during the conference?

• What would you like to learn in this session?

* My goal for you is to learn capability (what is fairly easily possible), not method… Method is for another day, another time.

Twitter Social Networking and Microblogging Social Media Platform

• 140-character text-based Tweets
• Images (Twitpics) and videos (Vine)
• Accounts as humans, ‘bots (collecting and re-tweeting information, sensor networks), and cyborgs (humans and ‘bots co-Tweeting)
• Created in 2006 and based out of San Francisco, California
• 500 million registered users in 2012
• 340 million Tweets a day as the “SMS of the Internet”
• Has attracted a range of public, private, and governmental organizations; groups (religious, political, advocacy, and others); individuals
• Has an application programming interface (API) which enables some limited access to their public data

Electronic Social Network Analysis

• Extraction of social network data from social media platforms (through their APIs): social networking sites, email systems, wikis, blogs, microblogging sites, web networks, and others
• Node-link, vertex-edge, entity-relationship
• A form of structure mining with implications for
  • Organizational analysis
  • Entity (node) analysis
  • Social ties
  • Understandings of social structure and power
  • Diffusion of innovation, information, culture, attitudes, and other transmissible resources
  • Electronic event analysis

Some Basics of E-SNA

Some Basics of E-SNA (cont.)

• Core-periphery dynamic and influence (and power) / “primary” and “secondary” membership in the network
• Knowledge and influence
• Collection of resources
• Clustering
• Motif censuses, network structures, network topologies, geodesic distance, connectivity
• Bridging
• Network structure, network topology
• Thick ties / tight coupling in electronic social spaces
• Thin ties / loose coupling in electronic social spaces
• Homophily vs. heterophily
• The company you keep

Some Basics of E-SNA (cont.)

Global Social Network Structures

• Betweenness centrality (shortest path betweenness centrality)
• Closeness centrality (closeness of a node to all other nodes in the network graph)
• Eigenvector centrality (closeness to important neighbors)
• Clustering coefficient (the amount of clustering in a network)

Local Social Network Structures

• Degree centrality (in-degree and out-degree)
• Clustering coefficient (embeddedness)
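
As a rough illustration of the measures listed above, the following minimal sketch computes in-degree, out-degree, closeness centrality, and the local clustering coefficient in plain Python. It is not NodeXL's implementation. The account names and edges are hypothetical, and closeness and clustering are computed on an undirected view of the graph, which is a simplifying assumption.

```python
from collections import deque

# Hypothetical directed edges: (source account, target account)
edges = [("ann", "bob"), ("bob", "ann"), ("ann", "cy"), ("cy", "bob"), ("dee", "ann")]

nodes = sorted({n for e in edges for n in e})
out_nbrs = {n: set() for n in nodes}
in_nbrs = {n: set() for n in nodes}
for src, dst in edges:
    out_nbrs[src].add(dst)
    in_nbrs[dst].add(src)

# Undirected view (assumption of this sketch) for closeness and clustering
und = {n: out_nbrs[n] | in_nbrs[n] for n in nodes}

def closeness(node):
    """Number of other reachable nodes divided by the sum of BFS distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in und[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def clustering(node):
    """Share of a node's neighbor pairs that are themselves connected."""
    nbrs = sorted(und[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in und[nbrs[i]])
    return links / (k * (k - 1) / 2)

in_degree = {n: len(in_nbrs[n]) for n in nodes}
out_degree = {n: len(out_nbrs[n]) for n in nodes}

for n in nodes:
    print(n, in_degree[n], out_degree[n], round(closeness(n), 2), round(clustering(n), 2))
```

In this toy graph, "ann" touches every other account directly, so her closeness is the highest, while "dee" sits on the periphery with a single tie.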

Units of Analysis

• Entity: Node or vertex
• Relationships: Links, edges
• Dyads, triads, … motifs (different relational structures)
• Clusters and sub-clusters (groups or meta-nodes)
• Islands
• Pendants (one node, one link); whiskers (one link, multiple nodes)
• Isolates
• Ego neighborhoods
• Social network
• Multiple social networks
• “Big data” universes
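
Several of these units can be picked out of a network mechanically. The sketch below, on a hypothetical undirected graph, finds isolates, pendants, and connected components (islands) using only the Python standard library. The rule used for pendants here (a degree-one node attached to a better-connected node) is an assumption of this example, not a NodeXL definition.

```python
# Hypothetical undirected network: node -> set of neighbors
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},   # hangs off the main cluster by one link
    "e": {"f"},   # e and f form a separate island (a dyad)
    "f": {"e"},
    "g": set(),   # no links at all
}

# Isolates: nodes with no links
isolates = sorted(n for n, nbrs in graph.items() if not nbrs)

# Pendants (as defined for this sketch): degree-one nodes whose single
# neighbor has more than one link, so dyad members are not counted
pendants = sorted(
    n for n, nbrs in graph.items()
    if len(nbrs) == 1 and len(graph[next(iter(nbrs))]) > 1
)

def connected_components(adj):
    """Group nodes into components with a depth-first traversal."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(sorted(comp))
    # Largest component first
    return sorted(components, key=len, reverse=True)

components = connected_components(graph)
print(isolates, pendants, components)
```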

Why Learn about Electronic Social Networks?

• Understand respective roles in the community
• Identify informally influential individuals who are otherwise hidden
• Monitor what messages are moving through the network to understand public sentiment and understandings
• Plan diffusion of prosocial information and actions; head off negative diffusions in a social network
• Wire new networks for social and individual resilience (such as regarding health, emotion, economics, and other)
• Rewire social networks for different objectives and aims; optimize social groups based on what is known about people’s socializing and preferences

E-SNA on Twitter….

• Hashtag conversations (#)
• Event graphs (unfolding formal and informal events by hashtags and key words)
• Search networks
• Understanding user (account) social networks
• Ego neighborhoods on Twitter (direct alters)
• Clusters and sub-clusters; islands; pendants; isolates
• Motif censuses
• Egos

Questions so Far?

• What do you think about (electronic) social network analysis (and structure mining)? Do you think that the assumptions are valid? Why or why not?

• What do you think about electronic social network analysis?

Hashtag Conversations

• Narrow-casting (to a distinct small group) and broad-casting (communicating broadly to any who care to follow)
• Identifying the messages shared
• Sentiments
• Semantics
• Main conversationalists
• Calls to action
• Identifying the networks of accounts in connection to each other around this discussion
• Observing the interactions between accounts (nodes or vertices) around the particular discussion
• Identifying the “mayor of your hashtag” (using Dr. Marc A. Smith’s phrasing) or the influential discussants and their important (central, widely followed, re-tweeted) messaging

Eventgraphs

• Mapped networks of interactions based around a physical or virtual or other event (in this case)
• Formal, informal, or semi-formal
• Planned or unplanned events
• Conferences with disambiguated or original hashtags; may include online or augmented reality games to increase participation (planned)
• Accidents, mass health events, or unusual “spectacle” occurrences (unplanned)
• Micro (local or distributed) or mass (locationally clustered or distributed)
• Trending microblogging messaging over time (exponential messaging to peaks or multiple peaks and gradual diminishment or steep drop-off)
• Multimedial with microblogged text, images, and video; interactive; dynamic
• Identification of the main geographical locations of the discussants

Search (Social) Networks (Online)

• Identification of
  • particular topics in discussion (the less ambiguous the term, the better; otherwise, the tools will track a broad range of terms with various word senses)
  • discussants (social media platform accounts)
  • main messaging of the discussants (Tweet or microblogging streams)
  • main physical locations of the discussants (based on noisy geo information)

User Social Networks

• Node / vertex / entity / agent analysis
• Link / edge / arc / tie / relationship analysis
• Identification of the alters in the ego neighborhood
• Analysis of transitivity among the alters in the ego neighborhood
• Capture of a 2-degree social network on Twitter
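
To show what an ego neighborhood and a 2-degree capture look like as data, here is a minimal sketch in plain Python. The "follows" edges and account names are hypothetical. The helper ignores edge direction when expanding outward, and the density of ties among alters is used as a simple stand-in for transitivity; both are assumptions of this example.

```python
# Hypothetical "follows" edges: (follower, followed)
follows = [("ego", "a"), ("ego", "b"), ("c", "ego"), ("a", "b"), ("b", "d"), ("c", "e")]

def ego_network(edges, ego, degree=1):
    """Return the nodes within `degree` hops of ego (direction ignored)
    and every edge whose two endpoints are both in that set."""
    nbrs = {}
    for s, t in edges:
        nbrs.setdefault(s, set()).add(t)
        nbrs.setdefault(t, set()).add(s)
    frontier, members = {ego}, {ego}
    for _ in range(degree):
        frontier = {n for f in frontier for n in nbrs.get(f, ())} - members
        members |= frontier
    kept = [(s, t) for s, t in edges if s in members and t in members]
    return members, kept

# 1-degree capture: ego, direct alters, and ties among the alters
members1, edges1 = ego_network(follows, "ego", degree=1)
alters = members1 - {"ego"}

# Ties among alters only, and the share of possible alter pairs that are tied
alter_ties = [(s, t) for s, t in edges1 if "ego" not in (s, t)]
k = len(alters)
alter_density = len({frozenset(e) for e in alter_ties}) / (k * (k - 1) / 2) if k > 1 else 0.0

# 2-degree capture: also pulls in the alters' own contacts
members2, edges2 = ego_network(follows, "ego", degree=2)

print(sorted(alters), alter_ties, round(alter_density, 2), sorted(members2))
```

A 2-degree capture grows quickly on real accounts, which is one reason the size and degree of the crawl are treated as input parameters later in this deck.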

Motif Censuses

• Understanding of the global nature of the network
• The power structures within the network
• The clusters, sub-clusters, islands, pendants, and isolates
• The social individuals and entities within the network
• The transmissibles moving through the network
• Static (vs. dynamic) information captures

The Data Extraction and Network Visualization Tool: NodeXL (Network Overview, Discovery and Exploration for Excel)

Network Overview, Discovery and Exploration for Excel (NodeXL)

• NodeXL
• Free and open-source code
• Data scraping from social media platforms through their respective APIs (of publicly available information only)
• Add-on to Excel (formerly known as NetMap)
• Available on the Microsoft CodePlex platform
• Requires Windows (or Parallels on Mac)
• Sponsored by the Social Media Research Foundation
• NodeXL Graph Gallery for shared graphs and datasets

Types of Data Extractions from Twitter

NodeXL (relations, structure, select contents)

• #hashtag
• Search
• Twitter “List Network”
• Twitter User Network

NCapture of NVivo (semantics, message contents)

• Twitter User Tweets
• Twitter List Tweets

Input Parameters

• Size of the crawl
• Degree of the crawl
• Image capture
• Tweet capture
• Direction (followed by / following / both)
• Edge definition: Followed / following; replies-to; mentions
• Tweet column
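
The edge definition determines what counts as a link in the resulting graph. As a hedged illustration, the sketch below builds "replies-to" and "mentions" edges from a few hypothetical tweets. The field names (`user`, `text`, `in_reply_to`) and the hashtag are invented for this example and are not Twitter's API schema or NodeXL's internals.

```python
import re

MENTION = re.compile(r"@(\w+)")

# Hypothetical tweets; field names are illustrative only
tweets = [
    {"user": "ann", "text": "@bob great session on #example", "in_reply_to": "bob"},
    {"user": "bob", "text": "Thanks @ann and @cy!", "in_reply_to": None},
    {"user": "cy", "text": "Slides are up #example", "in_reply_to": None},
]

def build_edges(tweets):
    """Turn tweets into (source, target, relationship) edges."""
    edges = []
    for tw in tweets:
        author = tw["user"]
        reply_to = tw["in_reply_to"]
        if reply_to:
            edges.append((author, reply_to, "replies-to"))
        for name in MENTION.findall(tw["text"]):
            # Do not double-count the account already linked by the reply
            if name != reply_to:
                edges.append((author, name, "mentions"))
        if not reply_to and not MENTION.search(tw["text"]):
            # Self-loop so the author still appears as a node
            # (a choice made for this sketch)
            edges.append((author, author, "tweet"))
    return edges

edges = build_edges(tweets)
for edge in edges:
    print(edge)
```

Choosing only one relationship type (for example, replies-to) yields a sparser graph than combining replies and mentions, so the same crawl can produce quite different pictures.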

Data Processing: Graph Metrics

• Degree, in-degree, out-degree
• Betweenness and closeness centralities
• Eigenvector centrality
• Vertex clustering coefficient
• Vertex PageRank
• Edge reciprocation
• Words and word pairs
• Twitter search network top items
• …and others
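
Two of the metrics above are simple enough to sketch by hand. The example below, in plain Python with hypothetical edges and tweet texts, computes the share of reciprocated edges and counts words and adjacent word pairs. It only approximates the idea behind these metrics and makes no claim to match NodeXL's exact calculations.

```python
from collections import Counter

# Hypothetical directed edges (source, target)
edges = {("ann", "bob"), ("bob", "ann"), ("ann", "cy"), ("dee", "ann")}

# An edge is reciprocated when the reverse edge is also present
reciprocated = {(s, t) for s, t in edges if (t, s) in edges}
reciprocated_edge_ratio = len(reciprocated) / len(edges)

# Hypothetical tweet texts
tweets = ["extension agents share data", "agents share data on twitter"]

words = Counter()
pairs = Counter()
for text in tweets:
    tokens = text.lower().split()
    words.update(tokens)
    # Adjacent word pairs within a single tweet
    pairs.update(zip(tokens, tokens[1:]))

print(reciprocated_edge_ratio)
print(words["data"], pairs[("share", "data")])
```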

Data Processing: Grouping

• Group by vertex attribute
• Group by connected component
• Group by cluster
• Group by motif

Data Visualization

• Type of layout algorithm applied to the data
• Autofill
• Labeling of vertices
• Labeling of edges
• Graph pane
• Graph options
• Zoom
• Scale

Dynamic Filtering

• Adjust parameters (with the sliders) to limit what is visualized
• Change up the time zones to analyze what is being communicated and by whom at which time (UTC / coordinated universal time)
• Capture broadly and then focus in using dynamic filtering
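
The "capture broadly, then focus" idea can be mimicked outside NodeXL's sliders. The sketch below filters hypothetical timestamped edges down to a UTC time window in plain Python. The accounts and dates are invented for illustration.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical edges with the UTC time of the underlying tweet
edges = [
    ("ann", "bob", datetime(2014, 5, 20, 14, 5, tzinfo=UTC)),
    ("bob", "cy", datetime(2014, 5, 20, 16, 40, tzinfo=UTC)),
    ("cy", "ann", datetime(2014, 5, 21, 9, 15, tzinfo=UTC)),
]

def in_window(edges, start, end):
    """Keep edges whose timestamp falls in the half-open window [start, end)."""
    return [(s, t, ts) for s, t, ts in edges if start <= ts < end]

day_one = in_window(
    edges,
    datetime(2014, 5, 20, tzinfo=UTC),
    datetime(2014, 5, 21, tzinfo=UTC),
)
print([(s, t) for s, t, _ in day_one])
```

Keeping every timestamp timezone-aware and in UTC avoids mixing local times when the discussants are spread across regions.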

Data Analysis

• Use both the dataset and the visualizations (they complement each other, and both are necessary for full understanding)
• Capture the Tweets column and import that into a text analysis software program

Limits -> Controlling for Input Parameters for the Data Extraction

• Social media platform (Twitter and its data processing rate limits), even with an account for “whitelisting” (and the time-of-day of the data extraction through its data-streaming API)
• NodeXL (up to about 300,000 records or so)
• Computational power of researcher machine
• Computer memory of researcher machine
• No early indicator of size of data crawl or the acquire-ability of the electronic social network
• Costly (computational and time expense) non-captures at system limits

Addendum

• May apply Boolean operators into the query (and query multiple terms simultaneously)
• May use macros
• May re-crawl using original parameters of a data extraction
• May automate data extractions

Some Sample Graph Visualizations From NodeXL Extractions from Twitter

Note: Other details have been excluded because these visualizations are incomplete without the graph metrics and other complementary data…and it would be misrepresentational to explain the contexts of the data crawl behind the social network graphs incompletely. All of these graphs may be found in fuller detail and some with downloadable data sets on the NodeXL Graph Gallery. At the graph gallery, put “SHJ” in the Search bar at the top right.

Grid

Circle Layout (Ring Lattice Graph)

Harel-Koren Fast Multiscale with Vertex Labels

Random Layout Algorithm, Images at the Vertices

Sugiyama Layout of Groups, Force-Based Overall Network Layout

Harel-Koren Fast Multiscale

Horizontal Sine Wave

Harel-Koren Fast Multiscale

Motif, Harel-Koren Fast Multiscale

Harel-Koren Fast Multiscale

Fruchterman-Reingold Layout, Partitioned

3D Fruchterman-Reingold Force-Based Graph

Circle Layout / Ring Lattice Graph at Group Level, Force-Based Layout at Network Level

Harel-Koren Fast Multiscale

Harel-Koren Fast Multiscale

Fruchterman-Reingold Layout, Imagery for Vertices

Random Layout of Groups, Force-Based Layout of Network with Combined Edges

Harel-Koren Fast Multiscale Layout at Cluster Level, Force-Based Layout at Network Level

Motifs Extraction (Census), Sugiyama Layout at Network Level

Harel-Koren Fast Multiscale for Groups, Force-Based Layout at Network Level

Clustering by Clauset-Newman-Moore, Network Layout with Harel-Koren Fast Multiscale

Motifs at Group Level, Spiral at Network Level

Random at Group Level, Packed Rectangles for Network

Harel-Koren Fast Multiscale for Clusters, Treemap Layout for Network

Horizontal Sine Wave Layout (on beta)

Harel-Koren Fast Multiscale

Sugiyama, Stacked Rectangles

Fruchterman-Reingold

Fruchterman-Reingold

Harel-Koren Fast Multiscale

Harel-Koren Fast Multiscale

Motif, Fruchterman-Reingold, on Grid

Grid, Imagery on Vertices

Multi-Sequence Mixed Visualization

And…

NodeXL Graph Server

• Continuous crawl based on a certain term or account for over a month
• Academic purposes only
• Must be requested through Dr. Marc A. Smith (Connected Action Consulting Group @ [email protected])
• Not retroactive crawls (a limitation of Twitter)

NodeXL Beta Layouts

• Treemap
• Packed rectangles
• Force directed

Mixing Up Datasets

Twitter Data Grants

• Feb. 2014
• Twitter Engineering Blog

Other Sources

• Content-sharing sites (with public APIs)
  • YouTube
  • Flickr
• Social networking sites (with public APIs)
  • Facebook
  • LinkedIn
• Email networks
• Web networks
• Wiki networks

Semantic (Meaning) Analysis of a Tweet Stream Using NCapture (add-in to Google Chrome and MS Internet Explorer browsers) and NVivo (a qualitative and mixed methods data analysis tool)

(Partial) Twitter Feed Capture using NCapture of NVivo 10

Word Cloud based on Word Frequency Count from Twitter Feed (Gist)

Geolocation (Lat / Long) Data of Active Twitter User Accounts on a Tweet Stream / Feed

Word Similarity Analysis

Word Frequency Treemap (classical content analysis)

Word Search Word Tree (and Stemming)

Manual Analysis…through Coding, Categorizing, and Evaluation

• Data reduction
• Summary
• Matrix analysis
• Coding and analysis

Topic | Pro (sentiment) | Con (sentiment)

Human-Machine Analysis

• Network Text Analysis Theory (language modeled as networks of words and relations)
• Semantic network
  • Nodes: concepts or ideas, ideational kernels
  • Links: statements, relationships (strength of relationship, directionality such as agreement / disagreement or positive / negative, type of relation, sentiment)
  • Network: semantic map, union of all statements
• May be a one-mode network (all nodes of a type)
  • Concepts
• May be a multi-modal network (based on ontological coding with various mixes of node types)
  • Persons, places, concepts, sentiments, locations, and others
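
A one-mode semantic network of concepts can be sketched as weighted co-occurrence links. In the minimal example below, each hypothetical "statement" is a list of already-coded concepts, and two concepts are linked whenever they appear in the same statement. This is a simplification for illustration and is not how AutoMap or ORA builds its networks.

```python
from collections import Counter
from itertools import combinations

# Hypothetical statements, each reduced to its coded concepts
statements = [
    ["twitter", "hashtag", "conference"],
    ["hashtag", "conference"],
    ["twitter", "network"],
]

# Link weight = number of statements in which both concepts appear
links = Counter()
for concepts in statements:
    for a, b in combinations(sorted(set(concepts)), 2):
        links[(a, b)] += 1

# The semantic map is the union of all statement-level links
nodes = sorted({c for pair in links for c in pair})
print(nodes)
for (a, b), weight in sorted(links.items()):
    print(a, b, weight)
```

A multi-modal version would tag each node with a type (person, place, concept, and so on) before linking, following whatever ontology the coder has chosen.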

Human-Machine Analysis (cont.)

• Meta-network analysis based on a text corpus / merged text corpuses
  • Drawn from unstructured natural language text data
  • Identification of users (account holders on Twitter) and their interrelationships with others based on messaging, re-Tweeting, and following / not following
• May use Carnegie Mellon University’s freeware text-mining tool AutoMap 3.0.10.18 on Windows (by the Center for Computational Analysis of Social and Organizational Systems, CASOS) (2001 – present)
  • Graph visualizations in 2D and 3D made in ORA-NetScenes (CASOS)
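
As a rough illustration of identifying users and their interrelationships from messaging and re-Tweeting, the sketch below pulls @-mentions out of raw Tweet text and counts them as directed edges. It is not the AutoMap or NodeXL procedure: the sample Tweets, the account names, and the `extract_edges` helper are hypothetical, and treating any text that starts with "RT @" as a re-Tweet is a simplifying assumption.

```python
import re
from collections import Counter

# A Twitter-style @-mention: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")

def extract_edges(tweets):
    """Count directed (author, other_user, relation) edges from Tweet text.

    Simplification: every @-mention inside a Tweet that starts with "RT @"
    is labeled "retweet"; all other mentions are labeled "mention".
    """
    edges = Counter()
    for author, text in tweets:
        relation = "retweet" if text.startswith("RT @") else "mention"
        for user in MENTION.findall(text):
            if user.lower() != author.lower():  # skip self-references
                edges[(author.lower(), user.lower(), relation)] += 1
    return edges

# Hypothetical sample data: (author, Tweet text).
tweets = [
    ("alice", "RT @bob: Great session at #example"),
    ("carol", "Thanks @alice and @bob for the demo"),
    ("bob", "Slides are up #example"),
]

edges = extract_edges(tweets)
for edge, count in sorted(edges.items()):
    print(edge, count)
```

The resulting edge list is the kind of table a graphing tool expects: one row per pair of accounts, with a relation type and a weight.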

Human-Machine Analysis (cont.)

• AutoMap…requires data pre-processing (setting parameters)
  • Requires text corpuses as .txt files (transcoding from .doc, .docx, .HTML, or other)
  • May combine multiple text sets (through merging); can then query on the whole set or on the individual text sets
  • May create “stop words” (or “delete”) lists to de-noise data (with “stop words” like relative pronouns, personal pronouns, articles, conjunctions, and other words with less semantic meaning)
  • May use universal or domain-specific “thesauruses” to define, filter, and hone the meta-network extractions
    • Enables the defining of sentiment
  • Requires testing of a sample set and meta-network visualization to ensure appropriateness of the data refinements
  • Involves the design of meta-networks and ontologies from the text corpuses
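
The stop-word ("delete") list and thesaurus steps above can be illustrated outside of AutoMap with a short Python sketch. The word lists, the thesaurus entries, and the `preprocess` function are made up for the example and are far smaller than a real list would be; AutoMap's own file formats and settings are not shown here.

```python
import re

# Hypothetical "delete" list: pronouns, articles, conjunctions, and similar words.
STOP_WORDS = {"the", "a", "an", "and", "or", "who", "which", "that",
              "i", "you", "we", "it", "to", "of", "is", "at"}

# Hypothetical domain-specific thesaurus: maps variants onto one preferred concept.
THESAURUS = {
    "k-state": "kansas_state_university",
    "ksu": "kansas_state_university",
    "tweets": "tweet",
}

def preprocess(text):
    """Lowercase, tokenize, apply the thesaurus, then drop stop words."""
    tokens = re.findall(r"[a-z0-9][a-z0-9'-]*", text.lower())
    tokens = [THESAURUS.get(token, token) for token in tokens]
    return [token for token in tokens if token not in STOP_WORDS]

cleaned = preprocess("We read the tweets that K-State and KSU posted")
print(cleaned)
```

Running a small sample like this before processing the full corpus is a quick way to check whether the lists remove too much or too little.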

Human-Machine Analysis (cont.)

• …requires data processing and data visualization
  • May run the textual data processing
  • Includes a web scraper for the main social media platforms in its ScriptRunner feature
• …requires data post-processing
  • Includes accessing AutoMap data from ORA-NetScenes to create network visualizations
  • Includes data “mining” for meaning / sense-making (identification of patterns)
  • Includes data visualization analysis
• Note: The work may require re-running this cycle multiple times for different data queries.

Sampler: Wordle™ Word Cloud to Create an Emergent Thesaurus

Sampler: Excerpt from a Year’s Worth of a Blog’s Text Corpus

Sampler: @kstate_pres Tweets Visualization

Demos?

• Would you like to see how to set up a simple data crawl from Twitter using NodeXL? (Note: Twitter rate limiting may mean that a completed data extraction may not be achieved, but you can at least see what a basic setup may look like.)

• Any questions?

Conclusion and Contact

• Dr. Shalin Hai-Jew
  • Instructional Designer
  • Information Technology Assistance Center
  • Kansas State University
  • 212 Hale Library
  • 785-532-5262
  • [email protected]

• Thanks to Dr. Marc A. Smith, sociologist and Chief Social Scientist for Connected Action, for generously presenting a webinar at K-State to our faculty and staff. Also, Tony Capone, NodeXL developer, made the NodeXL beta available to me and has been very gracious and encouraging.