29
Social Networks, CompSci 49s, 11/16/2006 1 Social Networks as a Foundation for Computer Science Jeffrey Forbes http://www.cs.duke.edu/csed/socialnet

Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Embed Size (px)

Citation preview

Page 1: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 1

Social Networksas a Foundationfor Computer Science

Jeffrey Forbeshttp://www.cs.duke.edu/csed/socialnet

Page 2: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 2

A Future for Computer Science?

Page 3: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 3

Is there a Science of Networks? From Erdos numbers to random graphs to Internet

From FOAF to Selfish Routing: apparent similarities between many human and technological systems & organization

Modeling, simulation, and hypotheses Compelling concepts

• Metaphor of viral spread• Properties of connectivity has qualitative and

quantitative effects Computer Science?

From the facebook to tomogravity How do we model networks, measure them, and

reason about them? What mathematics is necessary? Will the real-world intrude?

Page 4: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 4

Physical Networks The Internet

Vertices: Routers Edges: Physical connections

Another layer of abstraction Vertices: Autonomous systems Edges: peering agreements Both a physical and business network

Other examples US Power Grid Interdependence and August 2003

blackout

Page 5: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 5

What does the Internet look like?

Page 6: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 6

US Power Grid

Page 7: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 7

Business & Economic Networks Example: eBay bidding

vertices: eBay users links: represent bidder-seller or buyer-seller fraud detection: bidding rings

Example: corporate boards vertices: corporations links: between companies that share a board

member Example: corporate partnerships

vertices: corporations links: represent formal joint ventures

Example: goods exchange networks vertices: buyers and sellers of commodities links: represent “permissible” transactions

Page 8: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 8

Content Networks

Example: Document similarity Vertices: documents on web Edges: Weights defined by similarity See TouchGraph GoogleBrowser

Conceptual network: thesaurus Vertices: words Edges: synonym relationships

Page 9: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 9

Enron

Page 10: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 10

Social networks Example: Acquaintanceship networks

vertices: people in the world links: have met in person and know last names hard to measure

Example: scientific collaboration vertices: math and computer science

researchers links: between coauthors on a published paper Erdos numbers : distance to Paul Erdos Erdos was definitely a hub or connector; had

507 coauthors How do we navigate in such networks?

Page 11: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 11

Page 12: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 12

Acquaintanceship & more

Page 13: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 13

Network Models (Barabasi)

Differences between Internet, Kazaa, Chord Building, modeling, predicting

Static networks, Dynamic networks Modeling and simulation

Random and Scale-free Implications?

Structure and Evolution Modeling via Touchgraph

Page 14: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 14

Web-based social networkshttp://trust.mindswap.org

Myspace 73,000,000 Passion.com 23,000,000 Friendster 21,000,000 Black Planet 17,000,000 Facebook 8,000,000

Who’s using these, what are they doing, how often are they doing it, why are they doing it?

Page 15: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 15

Golbeck’s Criteria

Accessible over the web via a browser

Users explicitly state relationships Not mined or inferred

Relationships visible and browsable by others Reasons?

Support for users to make connections Simple HTML pages don’t suffice

Page 16: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 16

CSE 112, Networked Life (UPenn)

Find the person in Facebook with the most friends Document your process

Find the person with the fewest friends What does this mean?

Search for profiles with some phrase that yields 30-100 matches Graph degrees/friends, what is

distribution?

Page 17: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 17

CompSci 1: Overview CS0

Audioscrobbler and last.fm Collaborative filtering What is a neighbor? What is the network?

Page 18: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 18

What can we do with real data?

How do we find a graph’s diameter? This is the maximal shortest path

between any pair of vertices Can we do this in big graphs?

What is the center of a graph? From rumor mills to DDOS attacks How is this related to diameter?

Demo GUESS (as augmented at Duke) IM data, Audioscrobbler data

Page 19: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 19

My recommendations at Amazon

Page 20: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 20

And again…

Page 21: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 21

How do search engines work? Hotbot, Yahoo, Alta Vista, Excite, … Inverted index with buckets of words

Insight: use matrix to represent how many times a term appears in one page

Columns: pages & Rows: terms Problems?

Return pages that have the keyword - in what order? Early solution: return those pages with most

occurrences of term first Problems? Solution?

• Use structure of the web to do the work for us• What did Google do?

Page 22: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 22

Google’s PageRank

web site xxx

web site yyyy

web site a b c d e f g

web

site

pdq pdq ..

web site yyyy

web site a b c d e f g

web site xxx

Inlinks are “good” (recommendations)

Inlinks from a “good” site are better than inlinks from a “bad” site

but inlinks from sites with many outlinks are not as “good”...

“Good” and “bad” are relative.

web site xxx

Page 23: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 23

Google’s PageRank

web site xxx

web site yyyy

web site a b c d e f g

web

site

pdq pdq ..

web site yyyy

web site a b c d e f g

web site xxx Imagine a “pagehopper”

that always either

• follows a random link, or

• jumps to random page

Page 24: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 24

Google’s PageRank(Brin & Page, http://www-db.stanford.edu/~backrub/google.html)

web site xxx

web site yyyy

web site a b c d e f g

web

site

pdq pdq ..

web site yyyy

web site a b c d e f g

web site xxx Imagine a “pagehopper”

that always either

• follows a random link, or

• jumps to random page

PageRank ranks pages by the amount of time the pagehopper spends on a page:

• or, if there were many pagehoppers, PageRank is the expected “crowd size”

Page 25: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 25

Collaborative Filtering Goal: predict the utility of an item to a

particular user based on a database of user profiles User profiles contain user preference

information Preference may be explicit or implicit

• Explicit means that a user votes explicitly on some scale

• Implicit means that the system interprets user behavior or selections to impute a vote

Problems Missing data: voting is neither complete

nor uniform Preferences may change over time Interface issues

Page 26: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 26

Memory-based methods

Store all user votes and generalize from them to predict vote for new item

Predicted vote of active user a for item j:

where there are n users with non-zero weights, vi,j is the vote of user i and item j, is a normalizing factor,

w() is a weighting function between users• Distance metric• Correlation or similarity

Page 27: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 27

Computing weights - Cosine Correlation

In information retrieval, documents are represented as vectors of word frequencies For CF, we treat preferences as vector

• Documents -> users• Word frequencies -> votes

Similarity is then the cosine between two vectors Dot product of the vectors divided by

the product of their magnitudes

Page 28: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 28

Computing weights - Pearson & Spearman correlation

Pearson Correlation First used for CF in GroupLens project

[Resnick et al., 1994] Relatively efficient to calculate

incrementally

Spearman Correlation same as Pearson but calculations are

done on rank of va,j and vi,j

Page 29: Social Networks, CompSci 49s, 11/16/20061 Social Networks as a Foundation for Computer Science Jeffrey Forbes

Social Networks, CompSci 49s, 11/16/2006 29

Model-based methods

Really what we want is the expected value of the user’s vote

Cluster Models• Users belong to certain classes in C with

common tastes• Naive Bayes Formulation

• Calculate Pr(vi|C=c) from training set

Bayesian Network Models