How the Semantic Web is Being Used: An Analysis of FOAF Documents


How the Semantic Web is Being Used: An Analysis of FOAF Documents

Li Ding, Lina Zhou, Tim Finin, Anupam Joshi

eBiquity Lab, Department of CSEE, University of Maryland Baltimore County


Outline

  • Introduction
    • The six popular ontologies
    • FOAF vocabulary
    • Why FOAF
  • Building FOAF Document collection
    • FOAF Document Identification
    • FOAF Document Discovery
    • Popular Properties of foaf:Person
  • Applications
    • Personal Information Fusion
    • Social Network Analysis


The Six Most Popular Ontologies

[Figure: the six ontologies shown are RDF, DC, RSS, FOAF, RDFS, and MCVB]

The statistics were generated by http://swoogle.umbc.edu

FOAF vocabulary (http://xmlns.com/foaf/0.1/)


Why FOAF

  • Information Creators
    • Community membership management
    • Unique Person Identification (privacy preserved)
    • Indicating Authorship
  • Information Consumers
    • Provenance tracking
    • Social networking
      • Expose community information to newcomers
      • Match interests
    • Trust building block


Identify a FOAF document

  1. D is an RDF document.
  2. D uses the FOAF namespace.
  3. The RDF graph serialized by D contains the sub-graph below.
  4. D defines one and only one master Person.

[Figure: sub-graph in which a node X has rdf:type foaf:Person and is connected to a node Z by a property foaf:Y]

  • D is a generic FOAF document when conditions 1, 2, and 3 are met.
  • D is a strict FOAF document when conditions 1, 2, 3, and 4 are met.
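The generic test (conditions 1 to 3) can be approximated as below. This is a minimal sketch and not the authors' implementation: it assumes the document is serialized as RDF/XML, it inspects the XML tree instead of the parsed RDF graph, and it only recognizes properties written as child elements. The function name `is_generic_foaf` is hypothetical. Condition 4 is left out because the slides do not define how the master Person is determined.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def is_generic_foaf(rdf_xml):
    """Approximate conditions 1-3 on an RDF/XML string (hypothetical helper)."""
    try:
        root = ET.fromstring(rdf_xml)
    except ET.ParseError:
        return False  # not well-formed XML, so condition 1 fails
    if root.tag != f"{{{RDF_NS}}}RDF":
        return False  # assumption: an RDF document has an rdf:RDF root element

    for node in root.iter():
        # X rdf:type foaf:Person, written either as a typed node element ...
        typed_by_tag = node.tag == f"{{{FOAF_NS}}}Person"
        # ... or as an explicit rdf:type child pointing at foaf:Person.
        typed_by_child = any(
            child.tag == f"{{{RDF_NS}}}type"
            and child.get(f"{{{RDF_NS}}}resource") == FOAF_NS + "Person"
            for child in node
        )
        if not (typed_by_tag or typed_by_child):
            continue
        # X foaf:Y Z: at least one child property from the FOAF namespace.
        # Finding one also shows that the FOAF namespace is used (condition 2).
        if any(child.tag.startswith(f"{{{FOAF_NS}}}") for child in node):
            return True
    return False
```

A production classifier would parse the document with a full RDF parser so that other serializations and property attributes are handled as well.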


FOAF document Discovery

  • Bootstrap: using a web search engine (got 10,000 docs)
  • Discovery: using rdfs:seeAlso semantics (got 1.5M docs)

Top 7 FOAF websites
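The discovery step above can be pictured as a breadth-first crawl that starts from the bootstrap set and follows rdfs:seeAlso links. The sketch below is an illustration under stated assumptions, not the crawler used for the study: documents are assumed to be RDF/XML, links are assumed to be absolute URLs in rdf:resource attributes, and `see_also_links`, `discover`, `fetch`, and `max_docs` are hypothetical names. The caller supplies `fetch`, so no particular HTTP library is implied.

```python
from collections import deque
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def see_also_links(rdf_xml):
    """Return the rdf:resource target of every rdfs:seeAlso element."""
    try:
        root = ET.fromstring(rdf_xml)
    except ET.ParseError:
        return []  # not well-formed XML, so there is nothing to follow
    resource = f"{{{RDF_NS}}}resource"
    links = []
    for element in root.iter(f"{{{RDFS_NS}}}seeAlso"):
        target = element.get(resource)
        if target:
            links.append(target)
    return links


def discover(seed_urls, fetch, max_docs=1000):
    """Breadth-first crawl; fetch(url) returns RDF/XML text or None."""
    found = list(dict.fromkeys(seed_urls))[:max_docs]  # ordered, no duplicates
    known = set(found)
    queue = deque(found)
    while queue:
        body = fetch(queue.popleft())
        if body is None:
            continue  # unreachable document
        for link in see_also_links(body):
            if link not in known and len(found) < max_docs:
                known.add(link)
                found.append(link)
                queue.append(link)
    return found
```

Passing `fetch` as a parameter keeps the crawl logic testable offline, for example with `pages.get` over a dictionary of canned documents.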


Popular properties of foaf:Person (1/2)

Top 10 popular properties (per document)

Rank  non-blog (26,936)         liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
1     foaf:mbox_sha1sum (0.84)  foaf:mbox_sha1sum (1.0)       foaf:name (0.80)
2     foaf:homepage (0.66)      dc:description (1.0)          foaf:mbox_sha1sum (0.71)
3     foaf:name (0.64)          dc:title (1.0)                foaf:nick (0.51)
4     foaf:nick (0.61)          foaf:nick (1.0)               foaf:homepage (0.40)
5     foaf:weblog (0.60)        foaf:page (1.0)               foaf:depiction (0.35)
6     foaf:knows (0.44)         foaf:weblog (0.99)            foaf:weblog (0.30)
7     foaf:mbox (0.38)          rdfs:seeAlso (0.85)           foaf:knows (0.28)
8     foaf:img (0.38)           foaf:knows (0.85)             foaf:surname (0.27)
9     bio:olb (0.35)            foaf:dateOfBirth (0.71)       foaf:firstName (0.26)
10    rdfs:seeAlso (0.34)       foaf:interest (0.67)          rdfs:seeAlso (0.26)
11    -                         -                             foaf:mbox (0.26)

*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7,276 evenly sampled documents.
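foaf:mbox_sha1sum ranks first in two of the three columns above. The FOAF vocabulary defines it as the SHA1 hash of a person's mailto: URI, which lets a document identify a person without exposing the address itself. A minimal sketch of that computation follows; the function name and the example address used in any call are hypothetical.

```python
import hashlib


def mbox_sha1sum(email):
    """Hex SHA1 digest of the mailto: URI for an e-mail address."""
    # The hash covers the full URI, including the "mailto:" scheme prefix.
    mailto_uri = email if email.startswith("mailto:") else "mailto:" + email
    return hashlib.sha1(mailto_uri.encode("utf-8")).hexdigest()
```

Two documents that carry the same digest can be treated as describing the same mailbox owner, which is what makes the property useful as an identifier.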


Popular properties of foaf:Person (2/2)

Top 10 popular properties (per instance)

Rank  non-blog (26,936)         liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
1     foaf:name (0.84)          dc:title (1.74)               foaf:name (0.69)
2     foaf:knows (0.79)         foaf:interest (1.68)          foaf:mbox_sha1sum (0.65)
3     foaf:homepage (0.63)      foaf:nick (1.04)              rdfs:seeAlso (0.39)
4     foaf:mbox_sha1sum (0.51)  foaf:weblog (1.00)            foaf:nick (0.26)
5     rdfs:seeAlso (0.40)       rdfs:seeAlso (0.99)           foaf:homepage (0.18)
6     dc:title (0.31)           foaf:knows (0.95)             foaf:mbox (0.15)
7     foaf:nick (0.22)          foaf:page (0.95)              foaf:weblog (0.15)
8     foaf:weblog (0.18)        dc:description (0.046)        foaf:firstName (0.11)
9     foaf:mbox (0.15)          foaf:mbox_sha1sum (0.046)     foaf:surname (0.11)
10    daml:equivalentTo (0.13)  foaf:dateOfBirth (0.046)      foaf:depiction (0.10)
11    -                         -                             foaf:knows (0.07)

*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7,276 evenly sampled documents.
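The two tables differ in their denominator. The sketch below shows one plausible reading, which is an assumption because the slides do not give the formulas: the per-document score is the fraction of documents in which a property appears on some foaf:Person, and the per-instance score is the number of uses divided by the number of foaf:Person instances (which would explain values above 1, such as 1.74). The function name and input layout are hypothetical.

```python
from collections import Counter


def property_popularity(docs):
    """docs: list of documents; each document is a list of person instances;
    each instance is a list of property names (repeats allowed)."""
    docs_using = Counter()   # property -> number of documents that use it
    total_uses = Counter()   # property -> number of uses over all instances
    instance_count = 0
    for doc in docs:
        seen_in_doc = set()
        for person in doc:
            instance_count += 1
            total_uses.update(person)
            seen_in_doc.update(person)
        docs_using.update(seen_in_doc)
    per_document = {p: n / len(docs) for p, n in docs_using.items()}
    per_instance = {p: n / instance_count for p, n in total_uses.items()}
    return per_document, per_instance
```

Under this reading a property that is repeated on one instance, such as several foaf:knows links, raises the per-instance score but counts only once per document.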


Collecting Personal Information

http://www.cs.umbc.edu/~dingli1/foaf.rdf

http://www-2.cs.cmu.edu/People/fgandon/foaf.rdf


Caution: Collision? Mistake!

http://www.mindswap.org/~katz/2002/11/jordan.foaf

http://www.ilrt.bris.ac.uk/people/cmdjb/webwho.xrdf
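One common way to fuse person descriptions from several documents is to merge instances that share a value for an identifying property. The sketch below is an assumption about the approach, since the slides do not spell out the fusion rule: it keys on foaf:mbox_sha1sum and foaf:homepage, which the FOAF vocabulary declares inverse functional, and merges with a small union-find. The function name and the dictionary layout of an instance are hypothetical.

```python
def fuse_persons(instances, keys=("foaf:mbox_sha1sum", "foaf:homepage")):
    """Group instance indexes that share a value for any identifying key.

    instances: list of dicts mapping a property name to a single value.
    Returns a sorted list of groups, each a sorted list of indexes.
    """
    parent = list(range(len(instances)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_owner = {}  # (key, value) -> index of the first instance seen with it
    for index, instance in enumerate(instances):
        for key in keys:
            value = instance.get(key)
            if value is None:
                continue
            owner = first_owner.setdefault((key, value), index)
            root_a, root_b = find(owner), find(index)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two groups

    groups = {}
    for index in range(len(instances)):
        groups.setdefault(find(index), []).append(index)
    return sorted(groups.values())
```

The caution on this slide applies directly: if two unrelated people publish the same identifying value by mistake, a rule like this merges them into one group.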



SNA1: Instances of foaf:Person per doc

  • Zipf’s distribution
  • Sloppy tail: a few person directory documents contain thousands of instances

[Figure: cumulative distribution on log-scaled axes; x-axis: # of persons; y-axis: # of FOAF documents]
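A curve like the one in this figure can be derived from raw per-document counts as sketched below. Two assumptions are made here that the slides do not confirm: "cumulative" is read as the number of documents with at least k persons, and the Zipf-like shape is summarized by an ordinary least-squares slope in log-log space. Both function names are hypothetical.

```python
from collections import Counter
import math


def cumulative_distribution(values):
    """Return [(k, number of items with value >= k)] for each distinct k."""
    counts = Counter(values)
    remaining = len(values)
    points = []
    for k in sorted(counts):
        points.append((k, remaining))
        remaining -= counts[k]
    return points


def loglog_slope(points):
    """Least-squares slope of log10(y) against log10(x).

    Needs positive coordinates and at least two distinct x values.
    """
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance
```

A roughly straight line with negative slope on log-log axes is the usual visual sign of a Zipf-like distribution; the same two helpers apply to the group-size, degree, and component-size plots.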


SNA2: Instances of foaf:Person per group

  • Zipf’s distribution
  • Sloppy tail: some instances are wrongly fused due to incorrect FOAF documents

[Figure: cumulative distribution on log-scaled axes; x-axis: group size (# of persons); y-axis: # of groups]

A group refers to a fused person


SNA3: In-degree of group

  • Zipf’s distribution
  • Sharp tail: few FOAF documents have large in-degrees

[Figure: cumulative distribution on log-scaled axes; x-axis: in degree of group; y-axis: # of groups]


SNA4: Out-degree of group

  • Zipf’s distribution
  • Sloppy tail: a few person directory documents

[Figure: cumulative distribution on log-scaled axes; x-axis: out degree per group; y-axis: # of groups]


SNA5: Patterns of FOAF Network

  • Four types of group
    • Isolated
    • Only in
      • only one inlink (97%)
    • Only out
    • Both (intermediate)
  • Basic Patterns
    • Singleton: (isolated)
    • Star: (only out) an active person publishes friends
    • Clique: a small group
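The four group types follow from each group's in-degree and out-degree. The sketch below assumes, without confirmation from the slides, that links are directed pairs between groups (for example derived from foaf:knows), that duplicate links count once, and that self-links are ignored. The function name and the label strings are hypothetical.

```python
from collections import Counter


def classify_groups(groups, knows_edges):
    """Label each group as isolated, only in, only out, or both.

    groups: iterable of group identifiers.
    knows_edges: iterable of (source_group, target_group) pairs.
    Returns (labels, in_degree, out_degree).
    """
    in_degree, out_degree = Counter(), Counter()
    for source, target in set(knows_edges):  # duplicate links count once
        if source == target:
            continue  # assumption: a group linking to itself is ignored
        out_degree[source] += 1
        in_degree[target] += 1

    labels = {}
    for group in groups:
        has_in = in_degree[group] > 0
        has_out = out_degree[group] > 0
        if has_in and has_out:
            labels[group] = "both"
        elif has_in:
            labels[group] = "only in"
        elif has_out:
            labels[group] = "only out"
        else:
            labels[group] = "isolated"
    return labels, in_degree, out_degree
```

The returned degree counters can also feed the in-degree and out-degree distributions shown on the two preceding slides.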


SNA6: Size of components

  • Zipf’s distribution
  • Sloppy head: singleton
  • Sloppy tail: blog websites (e.g. www.livejournal.com)

[Figure: cumulative distribution on log-scaled axes; x-axis: # of groups per connected component; y-axis: # of connected components]
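Component sizes such as those plotted here can be computed with a breadth-first search over the group network. The sketch below treats links as undirected, so it finds weakly connected components; that choice is an assumption, since the slides do not say which notion of connectivity was used. The function name is hypothetical.

```python
from collections import defaultdict, deque


def component_sizes(nodes, edges):
    """Sizes of connected components, largest first (edges are undirected)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Include nodes that only appear in edges as well as isolated nodes.
    all_nodes = set(nodes) | set(neighbours)
    visited = set()
    sizes = []
    for start in all_nodes:
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for other in neighbours[node]:
                if other not in visited:
                    visited.add(other)
                    queue.append(other)
        sizes.append(size)
    return sorted(sizes, reverse=True)
```

Isolated groups show up as components of size 1, which corresponds to the singleton head noted on this slide.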


SNA7: Growth of FOAF network

[Figure: three panels labeled 1, 2, and 3]


The Map of FOAF network (Jun 2004)

[Figure: network map with regions labeled www.livejournal.com, www.ecademy.com, Blog.livedoor.jp, and non-blog]


Questions?

Demo: http://apple.cs.umbc.edu/semdis

Swoogle: http://swoogle.umbc.edu

eBiquity group: http://ebiquity.umbc.edu