Understanding Network Structure Through User Attributes and Behavior
Describe this network
What does it mean?
Hints
Built from the photo-sharing website Flickr. On Flickr, photos are labeled with descriptive keywords called tags.
Nodes represent tags, and an edge between two tags indicates that they were used to describe the same image. E.g., if an image is tagged with the words “desk” and “keyboard”, the network would show a line connecting those two words.
The network is a 1.5 egocentric network of a single tag
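A minimal sketch of how such a network could be built, assuming each image is simply a list of tags. The image data and the function names (build_tag_network, ego_network_1_5) are hypothetical, and "1.5" is taken here to mean the ego, its neighbors, and the edges among those neighbors.

```python
from itertools import combinations

def build_tag_network(images):
    """Return an adjacency dict: tag -> set of tags used on the same image."""
    adj = {}
    for tags in images:
        unique = set(tags)
        for tag in unique:
            adj.setdefault(tag, set())
        # Every pair of tags on one image gets an edge.
        for a, b in combinations(sorted(unique), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def ego_network_1_5(adj, ego):
    """Keep the ego, its neighbors, and only the edges among those nodes."""
    nodes = {ego} | adj[ego]
    return {n: adj[n] & nodes for n in nodes}

# Hypothetical tagged images, for illustration only.
images = [
    ["desk", "keyboard"],
    ["desk", "lamp", "keyboard"],
    ["keyboard", "piano"],
    ["piano", "concert"],
]
adj = build_tag_network(images)
ego = ego_network_1_5(adj, "desk")
print(sorted(ego))
```

In this toy data, "piano" is two steps from "desk", so it is left out of the ego network even though it is connected to "keyboard".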
What can we say now?
Now with content...
Connecting Content and Structure
Structural attributes only tell us a little
Must look at data about nodes and edges to really understand what is happening in a network
“Node X has high betweenness” is only a description of a statistic
“Node X has high betweenness, and the data show that he connects a group of people from the US with a group of people from Spain” tells us what his role is and why it is important.
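The contrast above can be sketched in code: first compute the statistic, then look at node data to interpret it. The graph, the country attribute, and the node names are hypothetical. Betweenness is computed with Brandes' algorithm for unweighted, undirected graphs, which is one common choice and not necessarily the method used for the figures in these slides.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical network: x links a US pair (a, b) with a Spain pair (d, e).
adj = {
    "a": {"b", "x"}, "b": {"a", "x"},
    "d": {"e", "x"}, "e": {"d", "x"},
    "x": {"a", "b", "d", "e"},
}
country = {"a": "US", "b": "US", "d": "Spain", "e": "Spain", "x": "US"}

bc = betweenness(adj)
top = max(bc, key=bc.get)
bridged = sorted({country[n] for n in adj[top]})
print(top, bc[top], bridged)
```

The number alone says only that x scores highest. The attribute lookup is what shows that x sits between a US group and a Spain group.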
Example Analysis
The network is a 1.5 egocentric network of a search term on YouTube
Nodes represent videos that match the search term
Links indicate that two videos share at least one keyword other than the search term
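The linking rule above can be sketched as follows, assuming each video comes with a list of keywords. The video IDs and the function name build_video_network are hypothetical, and lowercasing keywords before comparing them is an assumption, not something the slides specify.

```python
from itertools import combinations

def build_video_network(videos, search_term):
    """Link two videos if they share a keyword other than the search term."""
    term = search_term.lower()
    keywords = {
        vid: {k.lower() for k in kws} - {term}
        for vid, kws in videos.items()
    }
    adj = {vid: set() for vid in videos}
    for a, b in combinations(videos, 2):
        if keywords[a] & keywords[b]:   # at least one shared keyword
            adj[a].add(b)
            adj[b].add(a)
    return adj

# Hypothetical videos that all match the search term "cubs".
videos = {
    "v1": ["Cubs", "baseball", "Chicago"],
    "v2": ["cubs", "Wrigley Field", "baseball"],
    "v3": ["polar", "bear", "cubs", "cute"],
    "v4": ["cheetah", "cute", "cubs"],
}
adj = build_video_network(videos, "cubs")
print(adj)
```

Removing the search term first matters: every video contains it, so keeping it would connect every pair of videos.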
More Data
Search term is “cubs”
Initial thoughts about what you see in the network?
Getting into content: Graph Level
Choose a few videos from each cluster and watch them
See what they are about
Look at their keywords
Selected nodes in white and black
White Nodes’ Keywords
Cubs, CubFans, baseball, Chicago, Please, Stop, Believing
mlb, 2k12, baseball, major, legaue, ronnie, woo, wilckers, wrigley, cubbies, north, side, billy, goat, curse, illinois, ps3, playstiation, cubs
MLB, 12, The Show, MLB 2k12, Diamond Dynasty, Baseball, triple play, world series, home run derby, PS MOVE, Jose Bautista Chicago, Cubs, win, sports, playstation, ps3, ps vita, video game, so real it's it’s unreal
Chicago Cubs, Chicago, Cubs, Wrigley Field, Opening Day, 2011, number one fan, sports fans, baseball, major leagues
Chicago, Cubs, Spring, Training, Baseball, Tony, Campana, Brett, Jackson, Sports, Hohokam, Park, Cactus, League
Black Nodes’ Keywords
dog, dogs, puppies, pup, cute, adorable, snuggle, bear cub, Medvjedić, Bär, orsacchiotto, brown bear cub, bears, teddy, medo srečko, cubs, medvedji mladič, slovenia, slovenija
National Geographic, polar, bear, cubs, mother, mom, parent, learn, teach, cute, fluffy, sweet, predator, arctic, predation, hunt
Tiger, Rescue, Lions, Leopards, Cubs, Kittens, Tiger cubs, Wild animal orphanage, Big Cat Rescue, Texas, Tigers, Rescued, Scary, Roar, Rawr, Attack, Aggressive, Sanctuary, Global
tiger, tigress, cubs, machli, fight, nick, ranthamore, croc, crocodile, mugger, india, rajastan, valmik, thapar, bbc, wildlife
cheetah, cheetahs, african, wild, cute, animals, baby, BBC, cubs
Conclusions
The cluster of nodes with the white samples represents videos about the baseball team the Chicago Cubs
The cluster of nodes with the black samples represents videos about baby animals (bear cubs, tiger cubs, etc.)
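One rough way to support conclusions like these is to count which keywords recur across the sampled videos in each cluster. The sketch below uses abbreviated excerpts of the keyword lists shown earlier. The function name top_keywords is hypothetical, and counting each keyword once per video is an assumption.

```python
from collections import Counter

def top_keywords(keyword_lists, search_term, n=2):
    """Most frequent keywords across sampled videos, ignoring the search term."""
    term = search_term.lower()
    counts = Counter()
    for kws in keyword_lists:
        # A set ensures each keyword counts once per video.
        counts.update({k.lower() for k in kws} - {term})
    return counts.most_common(n)

# Abbreviated excerpts of the sampled keyword lists.
white = [
    ["Cubs", "CubFans", "baseball", "Chicago"],
    ["Chicago Cubs", "Chicago", "Cubs", "Wrigley Field", "baseball"],
    ["Chicago", "Cubs", "Spring", "Training", "Baseball"],
]
black = [
    ["National Geographic", "polar", "bear", "cubs", "cute"],
    ["cheetah", "cheetahs", "african", "wild", "cute", "animals", "baby", "BBC", "cubs"],
    ["tiger", "tigress", "cubs", "bbc", "wildlife"],
]
white_top = top_keywords(white, "cubs")
black_top = top_keywords(black, "cubs")
print(white_top, black_top)
```

Keyword counts only suggest a label for a cluster. Watching a few of the videos is still what confirms it.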
Getting into Content: Node Level
Individual nodes may represent different types of entities in a network
This requires understanding node attributes and linking them to each node's role in the network.
Example: Nodes colored by department
Example: Detecting User Roles
Study by Welser, Gleave, and Smith, 2007.
Examined the roles users play in discussion groups
Example Network
Breaking Down Into Egocentric Nets
Observations
36 nodes have only one neighbor, and in almost all cases that neighbor has a high degree and has replied to the central node.
Another 17 nodes have two neighbors with this same pattern.
This accounts for nearly 60% of the nodes in the network.
Do these nodes have something in common?
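A tally like the one in these observations can be produced by counting nodes by their number of neighbors. The sketch below uses a small hypothetical network, not the study's data, and the function name degree_shares is made up for illustration.

```python
from collections import Counter

def degree_shares(adj):
    """Map each degree to the fraction of nodes that have that degree."""
    degree_counts = Counter(len(nbrs) for nbrs in adj.values())
    total = len(adj)
    return {deg: count / total for deg, count in degree_counts.items()}

# Hypothetical network: a hub h with mostly low-degree neighbors.
adj = {
    "h": {"a", "b", "c", "d", "e"},
    "a": {"h", "e"},
    "b": {"h"},
    "c": {"h"},
    "d": {"h"},
    "e": {"h", "a"},
}
shares = degree_shares(adj)
low_degree_share = shares.get(1, 0) + shares.get(2, 0)
print(shares, low_degree_share)
```

The share of nodes with one or two neighbors is the kind of figure quoted in the observations. Whether those nodes have anything else in common is a content question that the count cannot answer.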
Diving Into Content
Group nodes by attributes of their egocentric networks
Look at the behavior of those nodes in discussion groups to see if there are patterns
This involves actually reading their posts and understanding the communication on a content level, not just a network structure level.
Findings
Nodes with high out-degree and low network density tend to answer a lot of questions, but not engage in a lot of discussion
Nodes with low degree are generally asking questions. They get a reply and then stop participating.
The researchers found many other patterns
These results rely on connecting structure to content
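The first two findings can be turned into a rough structural screen, sketched below. The reply data, the thresholds (high_out, low_density, low_degree), the role labels, and the function names are all hypothetical and are not taken from the study. Density is computed here as the fraction of possible ties among a node's neighbors, which is one reasonable reading of "network density" for an egocentric network.

```python
def neighbors(replies, node):
    """Everyone the node replied to or was replied to by."""
    nbrs = set()
    for src, dst in replies:
        if src == node:
            nbrs.add(dst)
        elif dst == node:
            nbrs.add(src)
    nbrs.discard(node)
    return nbrs

def ego_density(replies, node):
    """Fraction of possible ties among the node's neighbors that exist."""
    nbrs = neighbors(replies, node)
    k = len(nbrs)
    if k < 2:
        return 0.0
    ties = {frozenset((s, d)) for s, d in replies
            if s in nbrs and d in nbrs and s != d}
    return len(ties) / (k * (k - 1) / 2)

def suggest_role(replies, node, high_out=3, low_density=0.2, low_degree=2):
    """Suggest a role from structure alone; thresholds are illustrative."""
    out_degree = len({d for s, d in replies if s == node and d != node})
    if out_degree >= high_out and ego_density(replies, node) <= low_density:
        return "possible answerer"
    if len(neighbors(replies, node)) <= low_degree:
        return "possible question asker"
    return "unclassified"

# Hypothetical (replier, original poster) pairs.
replies = [
    ("expert", "q1"), ("expert", "q2"), ("expert", "q3"), ("expert", "q4"),
    ("q1", "q2"),
]
roles = {n: suggest_role(replies, n) for n in ["expert", "q3"]}
print(roles)
```

A screen like this only proposes candidates. Confirming a role still means reading what those users actually posted.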
Conclusions
To understand a network, we need more than structural attributes
Connecting structure with analysis of content can lead to much deeper insights about what is happening in a network.
This connection is critical for full, deep, and insightful network analysis.