@
How the Semantic Web is Being Used: An Analysis of FOAF Documents
Li Ding, Lina Zhou, Tim Finin, Anupam Joshi
eBiquity Lab, Department of CSEE
University of Maryland Baltimore County
@
Outline
Introduction
  The six popular ontologies
  FOAF vocabulary
  Why FOAF
Building FOAF Document collection
  FOAF Document Identification
  FOAF Document Discovery
  Popular Properties of foaf:Person
Applications
  Personal Information Fusion
  Social Network Analysis
@
The Six Most Popular Ontologies
RDF
DC
RSS
FOAF
RDFS
MCVB
The statistics are generated by Swoogle (http://swoogle.umbc.edu)
FOAF vocabulary (http://xmlns.com/foaf/0.1/)
@
Why FOAF
Information Creators
  Community membership management
  Unique Person Identification (privacy preserved)
  Indicating Authorship
Information Consumers
  Provenance tracking
  Social networking
    Expose community information to newcomers
    Match interests
  Trust building block
@
Identify a FOAF document
1. D is an RDF document.
2. D uses the FOAF namespace.
3. The RDF graph serialized by D contains the sub-graph below.
4. D defines one and only one master Person.
Sub-graph (figure): X rdf:type foaf:Person, with X and Z connected by a FOAF property foaf:Y
D is a generic FOAF document when conditions 1, 2, and 3 are met.
D is a strict FOAF document when conditions 1, 2, 3, and 4 are met.
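The conditions above can be sketched as a small classifier. This is a minimal sketch, not the authors' implementation: it assumes the document has already been parsed into (subject, predicate, object) string triples (condition 1), that X is the subject of the foaf:Y edge, and that the caller supplies the set of master Persons, since the slides do not say how a master Person is determined. The names classify_foaf and master_persons are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def classify_foaf(triples, master_persons):
    """Classify a parsed RDF document as 'strict', 'generic', or 'not FOAF'.

    triples: iterable of (subject, predicate, object) strings; being able
             to parse the document at all is taken as condition 1.
    master_persons: subjects the caller regards as master Persons
                    (hypothetical input; the slides do not define the rule).
    """
    triples = list(triples)

    # Condition 2: at least one term lives in the FOAF namespace.
    uses_foaf = any(term.startswith(FOAF) for t in triples for term in t)

    # Condition 3: some X typed foaf:Person also carries a FOAF property.
    # Assumption: X is the subject of the foaf:Y edge.
    persons = {s for s, p, o in triples
               if p == RDF_TYPE and o == FOAF + "Person"}
    has_pattern = any(s in persons and p.startswith(FOAF)
                      for s, p, o in triples)

    if not (uses_foaf and has_pattern):
        return "not FOAF"

    # Condition 4: exactly one master Person among the Person instances.
    if len(set(master_persons) & persons) == 1:
        return "strict"
    return "generic"
```

Passing an empty master_persons set yields at most a generic classification, which matches the case where condition 4 cannot be established.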
@
FOAF Document Discovery
Bootstrap: using a web search engine (got 10,000 docs)
Discovery: using rdfs:seeAlso semantics (got 1.5M docs)
Top 7 FOAF websites
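The two discovery steps can be sketched as a breadth-first crawl: start from seed URLs found by a search engine, then follow rdfs:seeAlso links. This is a minimal sketch under assumptions, not the crawler used for the study: fetch_see_also is a hypothetical callable that returns the rdfs:seeAlso targets of a URL, and the limit parameter is added only to keep the example bounded.

```python
from collections import deque


def discover(seeds, fetch_see_also, limit=1000):
    """Breadth-first discovery of documents reachable via rdfs:seeAlso.

    seeds: bootstrap URLs (e.g. from a web search engine).
    fetch_see_also: hypothetical callable, url -> iterable of linked URLs.
    limit: stop enqueuing once this many URLs are known (illustrative only).
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        for target in fetch_see_also(url):
            if target not in seen and len(seen) < limit:
                seen.add(target)
                queue.append(target)
    return seen
```

In the example used for testing, an in-memory dictionary stands in for the network, so the function can be exercised without fetching anything.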
@
Popular properties of foaf:Person (1/2)
Top 10 popular properties (per document)

Rank  non-blog (26,936)          liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
1     foaf:mbox_sha1sum (0.84)   foaf:mbox_sha1sum (1.0)       foaf:name (0.80)
2     foaf:homepage (0.66)       dc:description (1.0)          foaf:mbox_sha1sum (0.71)
3     foaf:name (0.64)           dc:title (1.0)                foaf:nick (0.51)
4     foaf:nick (0.61)           foaf:nick (1.0)               foaf:homepage (0.40)
5     foaf:weblog (0.60)         foaf:page (1.0)               foaf:depiction (0.35)
6     foaf:knows (0.44)          foaf:weblog (0.99)            foaf:weblog (0.30)
7     foaf:mbox (0.38)           rdfs:seeAlso (0.85)           foaf:knows (0.28)
8     foaf:img (0.38)            foaf:knows (0.85)             foaf:surname (0.27)
9     bio:olb (0.35)             foaf:dateOfBirth (0.71)       foaf:firstName (0.26)
10    rdfs:seeAlso (0.34)        foaf:interest (0.67)          rdfs:seeAlso (0.26)
11    -                          -                             foaf:mbox (0.26)

*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7,276 evenly sampled documents.
@
Popular properties of foaf:Person (2/2)
Top 10 popular properties (per instance)

Rank  non-blog (26,936)          liveJournal.com (20,298,073)  DS-FOAF-SMALL* (33,790)
1     foaf:name (0.84)           dc:title (1.74)               foaf:name (0.69)
2     foaf:knows (0.79)          foaf:interest (1.68)          foaf:mbox_sha1sum (0.65)
3     foaf:homepage (0.63)       foaf:nick (1.04)              rdfs:seeAlso (0.39)
4     foaf:mbox_sha1sum (0.51)   foaf:weblog (1.00)            foaf:nick (0.26)
5     rdfs:seeAlso (0.40)        rdfs:seeAlso (0.99)           foaf:homepage (0.18)
6     dc:title (0.31)            foaf:knows (0.95)             foaf:mbox (0.15)
7     foaf:nick (0.22)           foaf:page (0.95)              foaf:weblog (0.15)
8     foaf:weblog (0.18)         dc:description (0.046)        foaf:firstName (0.11)
9     foaf:mbox (0.15)           foaf:mbox_sha1sum (0.046)     foaf:surname (0.11)
10    daml:equivalentTo (0.13)   foaf:dateOfBirth (0.046)      foaf:depiction (0.10)
11    -                          -                             foaf:knows (0.07)

*DS-FOAF-SMALL is a new dataset from Oct 2004, based on 7,276 evenly sampled documents.
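The two rankings differ in what they normalize by. The sketch below illustrates one plausible reading, which is an assumption rather than a definition taken from the slides: the per-document score is the fraction of documents in which some foaf:Person uses the property, and the per-instance score is the average number of times the property occurs per foaf:Person instance (which would explain values above 1). The function name and input layout are hypothetical.

```python
from collections import Counter


def property_stats(docs):
    """Compute per-document and per-instance property scores.

    docs: list of documents; each document is a list of person instances;
          each person is a list of property names, one entry per triple.
    Returns (per_document, per_instance) dictionaries.
    """
    doc_hits = Counter()    # documents in which the property appears
    inst_hits = Counter()   # total occurrences across all persons
    n_docs = max(len(docs), 1)
    n_persons = max(sum(len(persons) for persons in docs), 1)

    for persons in docs:
        seen_in_doc = set()
        for props in persons:
            inst_hits.update(props)
            seen_in_doc.update(props)
        doc_hits.update(seen_in_doc)

    per_document = {p: c / n_docs for p, c in doc_hits.items()}
    per_instance = {p: c / n_persons for p, c in inst_hits.items()}
    return per_document, per_instance
```

Under this reading a property such as foaf:knows can score low per document yet high per instance when a few documents list many friends.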
@
Collecting Personal Information
http://www.cs.umbc.edu/~dingli1/foaf.rdf
http://www-2.cs.cmu.edu/People/fgandon/foaf.rdf
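Collecting information about one person from several documents can be sketched as merging person instances that share an identifying value. This is a minimal sketch under assumptions: the slides do not state the fusion rule, so the example assumes that instances sharing a value such as foaf:mbox_sha1sum (declared inverse functional in the FOAF vocabulary) describe the same person. The function name and input layout are hypothetical.

```python
def fuse_persons(instances):
    """Group person instances that share at least one identifying value.

    instances: list of (instance_id, set_of_identifier_values), where the
               values are assumed to come from an identifying property
               such as foaf:mbox_sha1sum.
    Returns a list of sets of instance ids, one set per fused person.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}  # identifier value -> first instance seen with it
    for pid, keys in instances:
        find(pid)
        for key in keys:
            if key in first_owner:
                parent[find(pid)] = find(first_owner[key])
            else:
                first_owner[key] = pid

    groups = {}
    for pid, _ in instances:
        groups.setdefault(find(pid), set()).add(pid)
    return list(groups.values())
```

Because merging is transitive, a single document that carries a wrong identifier value can join two unrelated people into one group.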
@
Caution: Collision? Mistake!
http://www.mindswap.org/~katz/2002/11/jordan.foaf
http://www.ilrt.bris.ac.uk/people/cmdjb/webwho.xrdf
@
SNA1: Instances of foaf:Person per document
Zipf's distribution
Sloppy tail: a few person directory documents contain thousands of instances
Figure: cumulative distribution (log-log); x-axis: # of persons, y-axis: # of FOAF documents
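A cumulative distribution like the one plotted for persons per document can be computed from raw counts. This is a minimal sketch that assumes the plot shows, for each value x, the number of items whose value is at least x; the slides do not say which direction the cumulation runs, and the function name is hypothetical.

```python
from collections import Counter


def cumulative_counts(values):
    """Return sorted (x, number of items with value >= x) pairs.

    values: one number per item, e.g. foaf:Person instances per document.
    """
    histogram = Counter(values)
    remaining = len(values)
    points = []
    for x in sorted(histogram):
        points.append((x, remaining))
        remaining -= histogram[x]
    return points
```

Plotting these points on log-log axes gives a roughly straight line when the data follow a Zipf-like distribution.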
@
SNA2: Instances of foaf:Person per group
Zipf's distribution
Sloppy tail: some instances are wrongly fused due to incorrect FOAF documents
A group refers to a fused person
Figure: cumulative distribution (log-log); x-axis: group size (# of persons), y-axis: # of groups
@
SNA3: In-degree of group
Zipf's distribution
Sharp tail: few FOAF documents have large in-degrees
Figure: cumulative distribution (log-log); x-axis: in-degree of group, y-axis: # of groups
@
SNA4: Out-degree of group
Zipf's distribution
Sloppy tail: few person directory documents
Figure: cumulative distribution (log-log); x-axis: out-degree per group, y-axis: # of groups
@
SNA5: Patterns of FOAF Network
Four types of group
  Isolated
  Only in
    only one inlink (97%)
  Only out
  Both (intermediate)
Basic Patterns:
  Singleton: (isolated)
  Star: (only out) an active person publishes friends
  Clique: a small group
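The four group types follow from whether a group has incoming links, outgoing links, both, or neither. This is a minimal sketch, assuming the network is given as directed (source, target) pairs between groups; the function name and the label strings are illustrative.

```python
def classify_groups(nodes, edges):
    """Label each group as isolated, only in, only out, or both.

    nodes: iterable of group ids.
    edges: iterable of directed (source, target) pairs between groups.
    """
    has_in = {n: False for n in nodes}
    has_out = {n: False for n in nodes}
    for src, dst in edges:
        # Groups that appear only in edges are registered here as well.
        has_out[src] = True
        has_in.setdefault(src, False)
        has_in[dst] = True
        has_out.setdefault(dst, False)

    labels = {}
    for n in has_in:
        if has_in[n] and has_out[n]:
            labels[n] = "both"
        elif has_in[n]:
            labels[n] = "only in"
        elif has_out[n]:
            labels[n] = "only out"
        else:
            labels[n] = "isolated"
    return labels
```

In these terms a singleton is an isolated group, and the center of a star is an "only out" group whose targets are "only in" groups.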
@
SNA6: Size of components
Zipf's distribution
Sloppy head: singleton
Sloppy tail: blog websites (e.g. www.livejournal.com)
Figure: cumulative distribution (log-log); x-axis: # of groups per connected component, y-axis: # of connected components
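Component sizes can be obtained with a plain graph traversal. This is a minimal sketch that assumes link direction is ignored when forming components (weak connectivity); the slides do not say which notion of connectivity was used, and the function name is hypothetical.

```python
from collections import defaultdict


def component_sizes(nodes, edges):
    """Return the sizes of connected components, largest first.

    nodes: iterable of group ids.
    edges: iterable of (a, b) links; direction is ignored (assumption).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    sizes = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal of one component.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)
```

Isolated groups show up as components of size 1, while a large blog site whose members link to each other forms one very large component.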
@
SNA7: Growth of FOAF network
Figure: three panels (1, 2, 3)
@
The Map of FOAF network (Jun 2004)
Figure labels: www.livejournal.com, www.ecademy.com, blog.livedoor.jp, non-blog
@
Questions?
Demo: http://apple.cs.umbc.edu/semdis
Swoogle: http://swoogle.umbc.edu
eBiquity group: http://ebiquity.umbc.edu