Interlinking Personal Semantic Data on the Desktop and the Web
Laura Drăgan
Outline
// Introduction
Background and motivation
Research questions
// Directions and results
Within the Semantic Desktop
To the Web of Data
A use case to rule them all
// Conclusion
Research answers
Future work
Background
Personal Information Management (timeline: 1945, 1962, 1965, 1968)
Web
Semantic Web
Semantic Desktop
Motivation
Use the framework provided by the Semantic Desktop to build useful applications and services
Research questions
Q1. How to build semantic applications and tools for the Semantic Desktop to provide the best experience for the users, while creating reusable semantic data?
Q2. How to expand the scope of the Semantic Desktop into the realm of the Web of Data, to benefit the users and enhance their experience?
Q1 sub-questions
semantic applications for the Semantic Desktop
How to create semantic data that is complete, correct, safe, and provides a high degree of interlinking with the already existing network of semantic data on the desktop?
How to reuse existing Semantic Desktop data in an application?
How to design the human-computer interaction in an application for the Semantic Desktop?
How to correctly evaluate a semantic application?
Q2 sub-questions
connect the Semantic Desktop with the Web of Data
How to find Web instances representing the same real-world thing described by a Semantic Desktop resource?
How to use the Web information which is related to a desktop resource?
How to make desktop data available online safely?
Directions
1. Within the Semantic Desktop
2. To the Web of Data
The two directions can be, should be, and are combined.
Within the Semantic Desktop
SemNotes
Challenges described by Q1
create new semantic data: Data representation; Data management
reuse existing Semantic Desktop data: Interlinking
design the human-computer interaction: Visualisation
correctly evaluate a semantic application: Task-based comparison to Evernote
Data representation
<...> a pimo:Note ;
    nao:prefLabel "holiday plans" ;
    nao:created "2010-09-16T21:08:54.29Z"^^xsd:dateTime ;
    nao:lastModified "2010-09-17T10:59:01.58Z"^^xsd:dateTime ;
    nao:numericRating "9"^^xsd:int ;
    nao:description "..."^^xsd:string ;
    nao:hasTag <...> ;
    pimo:isRelated <...> , <...> .
Interlinking
Annotation suggestions:
Based on the content of the note.
Certain types preferred.
Preference based on past use and matched length.
"... brian ..." -> Brian Davis, Brian Wall
"... brian davis ..." -> Brian Davis
Interlinking algorithm
Algorithm
scan text; identify possible entities
for each possible entity find a list of desktop resource candidates
compute score for each possible candidate
filter list by score
sort by score
present the candidates to the user
create the relation only if the user chooses a resource
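The steps above can be sketched roughly as follows. This is a minimal illustration, not the SemNotes implementation: the `Resource` class, the `TYPE_PREFERENCE` weights, the scoring formula and the threshold are all assumptions chosen so that the "brian" example behaves as on the Interlinking slide.

```python
import re
from dataclasses import dataclass

@dataclass
class Resource:
    label: str      # e.g. "Brian Davis"
    rtype: str      # e.g. "pimo:Person"
    past_uses: int  # how often the user has linked this resource before

# Assumed weights; the slides only state that certain types are preferred.
TYPE_PREFERENCE = {"pimo:Person": 1.0, "pimo:Project": 0.8}

def find_mentions(text, resources, max_len=3):
    """Scan the text for word runs whose words all occur in some resource label."""
    vocab = {w for r in resources for w in r.label.lower().split()}
    words = re.findall(r"\w+", text.lower())
    mentions, i = [], 0
    while i < len(words):
        j = i
        while j < len(words) and j - i < max_len and words[j] in vocab:
            j += 1
        if j > i:
            mentions.append(" ".join(words[i:j]))
            i = j
        else:
            i += 1
    return mentions

def score(mention, res):
    """Illustrative score: matched length, type preference and past use."""
    m_words = mention.lower().split()
    l_words = res.label.lower().split()
    matched = sum(1 for w in m_words if w in l_words)
    if matched == 0:
        return 0.0
    coverage = (matched / len(m_words)) * (matched / len(l_words))
    return coverage * TYPE_PREFERENCE.get(res.rtype, 0.5) * (1.0 + 0.1 * res.past_uses)

def suggest(mention, resources, threshold=0.3):
    """Score every candidate, filter by score, sort by score (best first)."""
    scored = [(score(mention, r), r) for r in resources]
    kept = [(s, r) for s, r in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r.label for _, r in kept]

def create_relation(note_uri, chosen_uri):
    """Create the link only if the user actually picked a resource."""
    if chosen_uri is None:
        return None
    return (note_uri, "pimo:isRelated", chosen_uri)
```

With two contacts named Brian Davis and Brian Wall, `suggest("brian", ...)` returns both, while `suggest("brian davis", ...)` returns only Brian Davis, because the longer match pushes the other candidate below the assumed threshold.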
Visualisation - HCI
Visualisation - versions
Evaluation
Task-based experiment
Comparison of SemNotes to Evernote
The effort of interlinking is lower than the effort spent when searching.
Evaluation
Experimental setup
20 participants: 14 use note-taking regularly; 5 use Evernote in their daily activity
Familiar data: 130 contacts; 20 scientific papers; 50 notes
8 tasks: 2 tasks to familiarise the participants with the dataset; 6 tasks focused on note-taking, varying the complexity
Measurements: time spent; mouse clicks; keystrokes
Evaluation
Tasks
Find notes tagged with todo
Find to-dos that are related to DERI
Find a to-do related to a presentation given by John
Take a note about planning a social event for your group
Find a note containing minutes from the last meeting about the NICE project. Change the date of the next meeting planned
Take a note for the action item assigned to you at the last meeting
Evaluation
Quantitative results
Time spent note-taking: no significant differences
Time spent searching: SemNotes significantly faster for complex queries; no significant difference for simple queries
Questionnaire results: Faster; Better
Evaluation
Quantitative results
Task | Time: Avg, Med, t | Clicks: Avg, Med, t
T10.500.1520.16700.692
T2-8-8-2.94-0.333-1-0.48
T3-0.1251 -0.046 0.8571 1.426
T40.0630.0160.4866.06782.026
T514.357131.7134.81221.527
T60.2490.2431.00420.8123.08
But ...
The desktop is no longer the sole repository of personal information: social networks; mobile devices; cloud services
But the Semantic Desktop, as efficient as it might become with semantic tools and interconnected data, is no longer the only repository of personal data, or, some would say, even the main one.
To the Web of Data
Challenges described by Q2
(Q2.1.) find Web aliases of Semantic Desktop resources
Finding Web Aliases
Web alias = Web resource representing the same real-world entity as the desktop resource
Different identifiers: nepomuk:/res/Angela vs. http://angelaonthe.net/foaf/me
Different vocabularies
The sheer size of the Web of Data
2 Step approach
Candidate Selection
Query various Web of Data sources
Identify candidate URIs
Retrieve data for each candidate
Candidate Filtering
Compute similarity score.
Filter the candidates.
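The two steps can be wired together along these lines. This is only a sketch under stated assumptions: `sources`, `fetch` and `similarity` are hypothetical callables supplied by the caller (a search function per Web of Data source, a function that retrieves the data behind a URI, and a scoring function), and the default threshold is arbitrary.

```python
def find_web_aliases(local, sources, fetch, similarity, threshold=0.5):
    """Return candidate URIs for a desktop resource, best match first."""
    # Step 1: candidate selection. Query every source and collect distinct URIs.
    candidates = []
    for search in sources:
        for uri in search(local):
            if uri not in candidates:
                candidates.append(uri)

    # Step 2: candidate filtering. Retrieve each candidate's data, score it
    # against the local resource, and keep those at or above the threshold.
    scored = [(similarity(local, fetch(uri)), uri) for uri in candidates]
    kept = [(s, uri) for s, uri in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [uri for _, uri in kept]
```

Passing the sources and the scoring function in as arguments keeps the pipeline independent of any particular search engine or matching configuration.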
Candidate Selection
Determined set of sources: specific requirements; restricted domain
Semantic search engine: generic domain; unknown data sources
Candidate Filtering
Filter by type
Compute similarity score
Filter by score
Input: (local, web) pair; returns: score
Matching Module
Input: (local, web) pair
Type matching? No: return 0. Yes: compute score.
Score compared with threshold (No / Yes branches); return score.
Matching Parameters
String matching (SM): exact matching versus approximate string matching
e.g. Koeln vs. Köln
Weighted properties (WP): weighted participation of properties in the final score
e.g. email address more exact than name
Multi-valued properties (MVP): all matching values for a property contribute to the score
e.g. authors' names for a paper
Score Calculation
Driven by the local data
score = (weighted sum of matching props) / (total sum of all weighted props)
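A possible reading of the score formula, combined with the three matching parameters (SM, WP, MVP), is sketched below. The record layout (a dict with `type` and `props`), the use of `difflib` for approximate matching, the `min_ratio` cut-off and the default threshold are assumptions made for illustration; types are assumed to be already mapped to a common vocabulary before comparison.

```python
from difflib import SequenceMatcher

def values_match(a, b, approximate=True, min_ratio=0.8):
    """SM: exact comparison, or approximate comparison via a similarity ratio."""
    a, b = a.strip().lower(), b.strip().lower()
    if not approximate:
        return a == b
    return SequenceMatcher(None, a, b).ratio() >= min_ratio

def match_score(local, web, weights=None, multi_valued=True,
                approximate=True, threshold=0.5):
    """Score a (local, web) pair; 0.0 on type mismatch or below the threshold."""
    if local["type"] != web["type"]:
        return 0.0
    weights = weights or {}          # WP: missing weights default to 1.0
    matched = total = 0.0
    # Driven by the local data: only properties of the local resource count.
    for prop, local_values in local["props"].items():
        w = weights.get(prop, 1.0)
        web_values = web["props"].get(prop, [])
        if not multi_valued:         # MVP off: single value per property
            local_values, web_values = local_values[:1], web_values[:1]
        for lv in local_values:
            total += w
            if any(values_match(lv, wv, approximate) for wv in web_values):
                matched += w
    score = matched / total if total else 0.0
    return score if score >= threshold else 0.0
```

For instance, with an email weight of 3 and a name weight of 1, a candidate that shares only the name scores 1/4, which illustrates why a mismatching email counts for more than a matching name.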
Evaluation
Manually constructed gold standard: data collection; relevance judgements
IR measures: effect of parameter settings; adjust thresholds
Data collection
Desktop data
50 people (nco:PersonContact)
50 music albums (nmo:MusicAlbum)
50 publications (nfo:PaginatedTextDocument)
11,917 triples
Web data
20 candidates for each desktop resource -> 3000 URIs
1,530,686 triples
Relevance Judgements
3000 pairs x 3 experts
Fleiss' K = 0.638 0.214
Average pairwise agreement 92.252%
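For reference, Fleiss' kappa for a fixed number of raters per item can be computed as below. This is the standard textbook formula, shown as a sketch; the input layout (one row per judged pair, one column per relevance category) is an assumption, and the code does not reproduce the figures reported above.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned item i to category j.

    Every row must sum to the same number of raters (3 experts here).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Observed agreement per item, then averaged over all items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items

    # Agreement expected by chance, from the overall category proportions.
    p_cats = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in p_cats)

    return (p_bar - p_exp) / (1 - p_exp)
```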
IR Measures
MAP
NDCG
P@k (k=1,2,3,4,5)
Baseline: exact match
all properties count equally
single value considered for each property
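The three measures can be computed from a ranked list of binary relevance judgements, as in this sketch. It assumes that every relevant candidate appears somewhere in the ranked list (here, the 20 candidates per desktop resource), so average precision is normalised by the number of relevant items found in the list.

```python
import math

def precision_at_k(relevant, k):
    """relevant: 0/1 judgements in ranked order; fraction relevant in the top k."""
    return sum(relevant[:k]) / k

def average_precision(relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """MAP: average precision averaged over all queries (desktop resources)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

def ndcg(relevant, k=None):
    """Discounted cumulative gain, normalised by the gain of the ideal ordering."""
    k = k or len(relevant)
    dcg = sum(rel / math.log2(rank + 1)
              for rank, rel in enumerate(relevant[:k], start=1))
    ideal = sorted(relevant, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0
```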
Evaluation Results
Approximate string matching improves results for albums and people
does not help for publications
Weights and multiple values, when combined, improve results for publications, but not for the other types
Merging the two directions
A use case
Note -> Blog post
[Semantic] note-taking -> [Semantic] blogging
[Preserve context] [Preserve privacy]
Steps
(Note-taking & annotation)
(Entity matching)
Transformation: on the local side; extension to SemNotes
Publication: on the server side; according to Linked Data principles
Levels and layers
Ontology level
Local - Nepomuk ontologies
Remote - SIOC, FOAF, DC, ...
pimo:Note -> sioc:Post
nao:Tag -> sioct:Tag
pimo:Person -> foaf:Person
pimo:Project -> doap:Project
pimo:Event -> ical:Vevent
nao:prefLabel -> rdfs:label
nao:created -> dcterms:created
nao:lastModified -> dcterms:modified
nao:hasTag -> sioc:topic
pimo:isRelated -> sioc:related_to
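The mapping above can be applied to a note's triples with a simple lookup, as in the following sketch. Triples are represented as plain tuples of prefixed names, and dropping any class or property that has no mapping is an assumption of this example (one way to avoid publishing unmapped desktop data), not something stated in the slides.

```python
CLASS_MAP = {
    "pimo:Note": "sioc:Post",
    "nao:Tag": "sioct:Tag",
    "pimo:Person": "foaf:Person",
    "pimo:Project": "doap:Project",
    "pimo:Event": "ical:Vevent",
}

PROPERTY_MAP = {
    "nao:prefLabel": "rdfs:label",
    "nao:created": "dcterms:created",
    "nao:lastModified": "dcterms:modified",
    "nao:hasTag": "sioc:topic",
    "pimo:isRelated": "sioc:related_to",
}

def translate(triples):
    """Rewrite (subject, predicate, object) tuples from Nepomuk to Web vocabularies."""
    translated = []
    for s, p, o in triples:
        if p == "rdf:type":
            if o in CLASS_MAP:
                translated.append((s, p, CLASS_MAP[o]))
        elif p in PROPERTY_MAP:
            translated.append((s, PROPERTY_MAP[p], o))
        # Anything else is dropped in this sketch (assumption).
    return translated
```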
Data level
Local - notes, desktop resources (tags included)
Remote - blog posts, Web resources, tags
http://semnotes.deri.ie/notes/note/id
http://semnotes.deri.ie/notes/resource/id
http://semnotes.deri.ie/notes/tag/label
Application level - local
Plugin for SemNotes
Ask server for server URLs for the new note and resources
Replace desktop URIs with the server URLs in the note
Add RDFa to the note
Push the transformed note to the server
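The URI replacement step might look like the sketch below. The `url_for` mapping stands in for whatever the server returns when asked for URLs; both the function name and the example id used in the usage note are hypothetical.

```python
def replace_uris(note_content, url_for):
    """Swap every desktop URI in the note for its server URL.

    url_for maps desktop URIs to server URLs (assumed to come from the server).
    Longer URIs are replaced first so that a URI which is a prefix of another
    one cannot corrupt the longer URI.
    """
    for uri in sorted(url_for, key=len, reverse=True):
        note_content = note_content.replace(uri, url_for[uri])
    return note_content
```

For example, with `{"nepomuk:/res/Angela": "http://semnotes.deri.ie/notes/resource/42"}` (the id 42 is made up), a link to `nepomuk:/res/Angela` in the note body is rewritten to point at the server URL.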
Application level - remote
Web server with MySQL, PHP, ARC2
Create new URLs for resources
Receive and process the note
Publish the data online
Published data
Research answers
How to build semantic applications and tools for the Semantic Desktop?
SemNotes: create new data; reuse existing data; HCI; evaluation
How to expand the scope of the Semantic Desktop into the Web of Data?
Web aliases; semantic blogging use case
Future work
Information Extraction algorithms and methods: create multiple types of relations based on the text; extract new entities from text; extract links between entities mentioned in the notes
Explore visualisations: personal data browser
Large-scale user study of semantic personal information usage and behaviours
Summary
1. Within the Semantic Desktop: SemNotes
2. To the Web of Data: Web aliases
1 + 2: semantic publishing use case
Digital Enterprise Research Institute www.deri.ie