Copyright @ Sebastian Ryszard Kruk, http://www.sebastiankruk.com/
Semantic Digital Libraries: Improving Usability of Information Discovery
with Semantic and Social Services
Sebastian Ryszard Kruk
Presentation Outline
Ontologies
Navigation
JeromeDL - the prototype
Problem statement and hypothesis
Evaluation
Problem and Hypothesis
Problem Statement
Digital library users are:
  missing a librarian => problems with information discovery and understanding complex metadata
  missing peers => cannot share experience with other users visiting the library
  missing connection with other sources => library resources cannot become a part of the information processing workflow
Digital library systems:
  knowledge organization systems => islands of highly organized information
  poor information discovery => losing their position to other sources
  incompatible taxonomies and schemata => losing the potential of rich metadata
Hypothesis
Semantic and social technologies in digital libraries improve information discovery compared to classic approaches:
  Users find information more easily
  Precision in searching is improved
  Users’ satisfaction is increased
  Users retain more information
[Figure: Semantic Digital Libraries at the intersection of the Semantic Web, Web 2.0, and Digital Libraries. Labels: expressiveness, interoperability, tagging, communities, controlled vocabularies, knowledge organization systems.]
Semantic Digital Libraries: Architecture & Ontologies
SemDL Architecture
[Figure: SemDL architecture. Dimensions: Users, Content, System. Actors and agents: communities of users, external services, DL designers, service developers, DL administrators, UI agents. Layers: Data Presentation Layer, Data Abstraction Layer, Data Access and Manipulation Layer, Data Sources. Services: Advanced Mgmt. Services, Basic Services, Information Access Services, Interoperability Services.]
Existing reference digital library architectures:
  Alexandria DL architecture (Frew et al., 1998)
  DELOS reference model (actors) (Candela et al., 2007a)
  Interaction Triptych Model (Fuhr et al., 2007)
Missing:
  Object Model: integration of metadata, reuse of library resources
  Digital Library Services: interoperability, user annotation, advanced search and browsing
Published in: Kruk et al., 2005 (DEXA); Kruk and McDaniel, 2008 (Springer); Kruk et al., 2009 (accepted to TEL)
Ontologies for SemDL
Requirements:
  Support for a complex and dynamic structure of information objects; reuse, aggregation; scientific publications workflow
  Support for rich, interconnected and interoperable bibliographic metadata; align existing concepts, e.g., MARC21, BibTeX, Dublin Core, SKOS, Address Ontology
  Support for communities of library users: FOAF, SIOC, Tom Gruber’s Tagging Ontology
  Support for rights management; model based on ODRL and XACML
Published in: Kruk et al., 2005 (DEXA); Kruk and Haslhofer, 2006 (NKOS, ECDL); Kruk and McDaniel, 2008 (Springer); Kruk et al., 2009 (accepted to TEL)
SemDL Ontologies Example
[Figure: example RDF graph built around the "SemDL" book, spanning four ontology groups: Community Ontologies, Structure Ontology, Bibliographic Ontologies, Rights Management Ontology. Node labels: SemDL Book, Sebastian Kruk, John Doe, DERI, DERIans, Introduction, intro.pdf, Abstract, Corrib Collection, SemDL Tutorial, Pittsburgh, Digital Libraries (directory), Book (term), (Tagging), (License), read, 20%. Edge labels: marcont:hasCreator, marcont:hasAffiliation, marcont:hasRelatedEvent, marcont:hasAbstract, hasAddress, jdl:hasPart, jdl:hasRepresentation, (is in), foaf:knows, xfoaf:trustLevel, sscf:issuedBy, sscf:isIn, dc:creator, sioc:related_to, tagging:hasTerm, rdfs:label, eac:hasLicense.]
Ontologies designed: JeromeDL structure ontology, MarcOnt bibliographic ontology, FOAFRealm/SSCF ontology, Extensible Access Control ontology, S3B Tagging Ontology
Ontologies used: FOAF, SKOS, SIOC, Address ontology
Semantic Digital Libraries: Navigation
Social Semantic Collaborative Filtering
Motivation:
  support identifying and finding experts, and propagating their expertise
  allow users to express their interests and to filter the knowledge base using disambiguation mechanisms
  feature security mechanisms for efficient and secure information gathering and dissemination
Model:
  graph of quantified social relations
  graph of inclusions of collections annotated with KOS concepts social relations
  access control based on the position in the social network
Published in: Grzonkowski, Gzella, Kruk, et al., 2009 (Journal of Web Based Communities); Choi, Kruk, et al., 2006 (IRW, WWW); Kruk, et al, 2006 (ASWC); Kruk and Decker, 2005 (Semantic Desktop Workshop, ISWC)
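The access-control part of the model can be illustrated with a minimal sketch. It assumes that the social graph stores a quantified value on every relation, that PD in constraints such as ACL(PD, Damian) < 2 means the number of hops from the owner, and that FQ is a quality value combined along a path by multiplication. The multiplication rule, the function names, and the edge values below are illustrative assumptions, not taken from the SSCF publications; only the user names come from the example figure.

```python
import heapq
from collections import deque

# Hypothetical social graph: FRIENDS[a][b] is the quantified relation a -> b.
# User names follow the example figure; the numeric values are invented.
FRIENDS = {
    "Damian": {"Bob": 0.9, "Eric": 0.5},
    "Bob": {"Alice": 0.9},
    "Eric": {"Gerald": 0.3},
    "Alice": {},
    "Gerald": {},
}


def path_distance(graph, source, target):
    """Number of hops on the shortest path from source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for friend in graph.get(node, {}):
            if friend == target:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None


def friendship_quality(graph, source, target):
    """Best product of edge values over any path (assumed combination rule)."""
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated quality
    while heap:
        neg_quality, node = heapq.heappop(heap)
        quality = -neg_quality
        if node == target:
            return quality
        if quality < best.get(node, 0.0):
            continue  # stale heap entry
        for friend, value in graph.get(node, {}).items():
            candidate = quality * value
            if candidate > best.get(friend, 0.0):
                best[friend] = candidate
                heapq.heappush(heap, (-candidate, friend))
    return 0.0  # target not reachable


def can_access(graph, owner, requester, max_distance=None, min_quality=None):
    """Check constraints of the form ACL(PD, owner) < d and ACL(FQ, owner) > q."""
    if max_distance is not None:
        dist = path_distance(graph, owner, requester)
        if dist is None or not dist < max_distance:
            return False
    if min_quality is not None:
        if not friendship_quality(graph, owner, requester) > min_quality:
            return False
    return True
```

With these invented values, `can_access(FRIENDS, "Damian", "Bob", max_distance=2)` holds because Bob is one hop away, while Alice fails the same distance constraint but passes `min_quality=0.8` through the Damian, Bob, Alice path.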
Social Semantic Collaborative Filtering
[Figure: example SSCF network. Users: Alice, Bob, Caroline, Damian, Eric, Felix, Gerald. Collection labels: Bibliographic Ontologies, Legacy Ontologies, Ontology, Mediation, Artificial Intelligence, Digital Libraries, Libraries, Distributed Systems, P2P Systems, Semantic Web. Relation labels: FQ=80%, FQ=50%, FQ=30%, FQ=10%. Access control labels: ACL(PD, Damian) < 2; ACL(FQ, Damian) > 80%.]
Evaluation of SSCF Model
Question for evaluation:
  Is the social network better informed with SSCF?
Assumptions for evaluation model:
  The quality of the information provided by a user on a certain collection is proportional to the expertise level of the user on the topic of the collection.
  It is possible to find a user with a high expertise on the given topic within the social network.
Evaluation setup:
  a model of the social network - 1000 users
  distribution of relationships: bell-curved (µ = 25, σ = 12.5) and Zipfian (θ = 1.9)
Measuring:
  Average Maximal Expertise (R) - average value of the highest expertise level found within a given degree of separation (R)
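The measure can be made concrete with a small simulation sketch. It covers only the bell-curved setup and makes several assumptions that are not stated on the slide: expertise levels are drawn uniformly from [0, 1], relations are symmetric, and each user draws a target number of friends from a normal distribution (so actual degrees end up somewhat higher than the target). Function names and the seed are hypothetical.

```python
import random
from collections import deque


def build_network(n_users, mean_degree, sd_degree, rng):
    """Random undirected graph; each user draws a target friend count from a normal distribution."""
    friends = [set() for _ in range(n_users)]
    for user in range(n_users):
        wanted = max(1, round(rng.gauss(mean_degree, sd_degree)))
        wanted = min(wanted, n_users - 1)
        while len(friends[user]) < wanted:
            other = rng.randrange(n_users)
            if other != user:
                friends[user].add(other)
                friends[other].add(user)
    return friends


def average_maximal_expertise(friends, expertise, max_r):
    """For each R in 0..max_r: mean over users of the highest expertise within R hops."""
    n_users = len(friends)
    totals = [0.0] * (max_r + 1)
    for start in range(n_users):
        # best_at[d] is the highest expertise among users at exactly distance d
        best_at = [0.0] * (max_r + 1)
        best_at[0] = expertise[start]
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if dist == max_r:
                continue
            for other in friends[node]:
                if other not in seen:
                    seen.add(other)
                    best_at[dist + 1] = max(best_at[dist + 1], expertise[other])
                    queue.append((other, dist + 1))
        running = 0.0
        for r in range(max_r + 1):
            running = max(running, best_at[r])  # cumulative maximum up to R
            totals[r] += running
    return [total / n_users for total in totals]


def simulate(n_users=1000, mean_degree=25, sd_degree=12.5, max_r=6, seed=0):
    """Run one bell-curved experiment and return the measure for R = 0..max_r."""
    rng = random.Random(seed)
    friends = build_network(n_users, mean_degree, sd_degree, rng)
    expertise = [rng.random() for _ in range(n_users)]  # assumed uniform expertise
    return average_maximal_expertise(friends, expertise, max_r)
```

Calling `simulate()` returns one value per degree of separation. Under these assumptions the numbers will not reproduce the reported results; the sketch only shows how the measure is defined. The full 1000-user run can take a while in pure Python, so smaller `n_users` values are convenient for experimenting.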
Evaluation of SSCF Model
Q1: Can a user access information gathered by the domain experts?
  For Zipf’s distribution, the average maximal expertise for R = 6 is 91% - answer: very probable
  For the bell-curved distribution, the average maximal expertise for R > 3 is above 96% - answer: even more probable
Q2: Is the average expertise level higher in the social network?
  For both types of distributions, the average expertise of a single member (R = 0) is much lower than in the social network.
[Figure: average maximal expertise (0% to 100%) versus degree of separation R (0 to 6) for the Zipf (θ = 1.9) and Bell (σ = 12.5) distributions.]
Shortcomings of faceted navigation
  RDF is not a homogeneous information space
  join operator is unintuitive to the end user (Oren, 2006)
  no filtering based on only given value
  no union and difference operators
  most solutions are monolithic (no MVC)
  poor accessibility: information overload
Extended model
  Extensions to inverted and existential operators
  Browse and similarity operators
  New combination operators: union, difference, binding
[Figure: chain of browse operations starting from DERI: browse-(affiliation), browse-(knows), browse-(creator), leading to the resources being sought (?).]
Published in: Kruk et al., 2007 (ODBASE)
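A minimal sketch of how such operators can be read as set operations over RDF-like triples. The triple data, the function names, and the interpretation of browse-(p) as "follow property p backwards" are assumptions made for illustration; this is not the MultiBeeBrowse API, and the binding and similarity operators are not covered.

```python
# Hypothetical knowledge base of (subject, property, value) triples.
TRIPLES = {
    ("paper1", "creator", "Kruk"),
    ("paper2", "creator", "Decker"),
    ("paper3", "creator", "Kruk"),
    ("Kruk", "affiliation", "DERI"),
    ("Decker", "affiliation", "DERI"),
    ("Kruk", "knows", "Decker"),
}


def select(triples, prop, value):
    """Basic selection: subjects whose property has the given value."""
    return {s for s, p, v in triples if p == prop and v == value}


def exists(triples, prop):
    """Existential selection: subjects that have any value for the property."""
    return {s for s, p, v in triples if p == prop}


def browse(triples, subjects, prop):
    """Browse forward: values reached from the given subjects via the property."""
    return {v for s, p, v in triples if p == prop and s in subjects}


def browse_inverse(triples, values, prop):
    """Inverted browse: subjects that point to any of the given values via the property."""
    return {s for s, p, v in triples if p == prop and v in values}


# Chain from the figure, read as inverted browsing starting at DERI.
people_at_deri = browse_inverse(TRIPLES, {"DERI"}, "affiliation")
their_contacts = browse_inverse(TRIPLES, people_at_deri, "knows")
papers = browse_inverse(TRIPLES, their_contacts, "creator")

# Combination operators map onto plain set union and difference.
by_either = select(TRIPLES, "creator", "Kruk") | select(TRIPLES, "creator", "Decker")
not_by_kruk = exists(TRIPLES, "creator") - select(TRIPLES, "creator", "Kruk")
```

Keeping every operator a function from sets to sets is what lets them compose freely, which is the property the extended model relies on when it adds union and difference.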
MultiBeeBrowse
  Zoomable User Interface: basic, structured, browsing and complete history view
  Collaborative Browsing (using SSCF and RSS)
  Adaptable Browsing Interface (incl. concept suggestions, facet labeling, results presentation)
  Services for Accessible Faceted Navigation
  Model of meta-operations:
[Figure: meta-operations decision tree over a forest of faceted navigation decision trees. Operations shown: browse(dc:creator), search(name:"Decker"), similar-(dc:creator), filter(dc:creator, "Kruk"), and two sum nodes.]
Evaluation
Comparing three solutions:
Features comparison:

Operator         MBB   BrowseRDF   Longwell   Other
search           +     ±           ±          -
selection        +     ±           ±          ±
exist. property  +     +           -          -
browse           +     -           -          -
combine          +     ±           ±          ±

[Figure: bar chart comparing MultiBeeBrowse, BrowseRDF, and Longwell across the categories Friendly, Average, and Hard to use (vertical axis 0 to 20); individual bar values are not recoverable from the extraction.]
Semantic Digital Libraries: JeromeDL - the prototype
JeromeDL
Semantic digital library project based on cooperation of:
  Gdańsk University of Technology
  DERI, National University of Ireland, Galway
Distributed under Open Source (BSD) license
10+ instances worldwide:
  DERI, Ireland: Library, Books, EastWeb DL
  GUT, Poland: WBSS, Kashebian, PMR Journal
  INEGI, Mexico: internal digital library
  dContentWare, Italy: core of the project
  Bosco Inc., India: 1000+ resources
  WKU, KY, USA: learning materials repository
Published in: Kruk, Decker and Zieborak, 2005 (DEXA); Kruk et al., 2007 (Semantic Web Challenge, ISWC), Kruk et al., 2008 (ECDL), Kruk and McDaniel, 2008 (Springer)
Differentiators of JeromeDL
  combining semantic bibliographic descriptions and social media
  advanced, personalized search solutions
  social networking platform integrated with user profiling component
  extensible access control system based on social network relations
  collaborative filtering and browsing
  dynamic collections
  integration with other Web 2.0 services
3-layered Architecture
[Figure: JeromeDL components overlaid on the SemDL architecture (Users, Content, System, with their actors, data layers, and service categories). Service groups: Social Services, Semantic Services, Classic Services. Components: Collaborative Filtering, Collaborative Browsing, Blogging, Tagging, comments, Mediation Services, Natural Language Query Template (NLQ), Identity Management, Filtering and Browsing, KOS, Community Driven Taxonomies, Ontologized Metadata, Digital Library Resources, Distributed Search, Security & Access Control, Full-text Index & Search. Other labels: resource, WordNet, DMoz.]
Search and Browsing
  TagsTreeMaps - filtering with hierarchical tags
  MultiBeeBrowse - social browsing
  Dynamic collections - defined based on triple filtering and SPARQL queries
  Recommendations of related resources based on semantic resource description
  Query templates in natural language
  Semantic Query Expansion based on user’s context and semantic annotations
  Social Semantic Collaborative Filtering
  Flexible API for integration of external services, e.g., Exhibit (SIMILE, MIT)
Semantic Digital Libraries: Evaluation
Evaluation Procedure
Evaluating usability (system, user)
Two digital libraries in their basic (vanilla) setup:
  JeromeDL - semantic digital library
  DSpace - classic digital library (control group)
Database:
  noise: 529 articles from DERI JeromeDL instances
  reference set: 35 articles on Internet psychology
Participants: 59 commenced evaluation, 26 completed
[Figure: evaluation workflow. Initial Tasks (registration, getting to know the library) with Initial Questionnaires; Question-Answering Tasks (Task 1, Task 2, Task 3) with QA Questionnaires; after a long time, a Memory Task (one of the QA tasks, no library access) with a Memory Questionnaire; Final Questionnaire.]
Published in: Kruk et al., 2008 (ECDL), Kruk and McDaniel, 2008 (Springer)
Questions for Evaluation (1)
Do semantic and social services improve the quality of answers?
  slightly better results for the JeromeDL group, improving significantly over time (statistical significance of the results close to the acceptance threshold)
Do semantic and social services increase the quality of references provided by the participants?
  slightly better results for the JeromeDL group, improving significantly over time (could not confirm statistical significance)
Do semantic and social services increase the satisfaction from using a digital library? (statistical significance)
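[Figure: bar chart comparing JeromeDL and DSpace on task 1, task 2, task 3, and the average (vertical axis 0 to 23); plotted values 8.23, 9.39, 1.88, 13.41, 19.84, 22.69, 21.99, 14.86 (assignment of values to bars is not recoverable from the extraction).]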
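Questions for Evaluation (2)
Which services are found to be most useful?
  recommendations and social filtering (results statistically significant)
Do semantic and social services increase information retention? (results statistically significant)
  Quality of answers: JeromeDL - 2.78, DSpace - 2.44
  Accuracy of references: JeromeDL - 6, DSpace - 1
Satisfaction:
[Figure: satisfaction scores for JeromeDL and DSpace on understanding, ease of execution, and intuitiveness; plotted values -1.00, -17.22, 10.89, 21.11, 2.00, 29.11 (assignment of values to bars is not recoverable from the extraction).]
[Figure: answers to "Would you like to continue using this library?" for JeromeDL and DSpace; plotted values 46.15% and 84.62% (assignment of values to bars is not recoverable from the extraction).]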
Conclusions
I have presented
Architecture and ontologies for Semantic Digital Libraries
Examples of search and browsing services:
Social Semantic Collaborative Filtering
MultiBeeBrowse
JeromeDL - the prototype
Evaluation of semantic and social services
What about the hypothesis?
Semantic and social technologies in digital libraries improve information discovery compared to classic approaches:
✓ Users find information more easily
✓ Precision in searching is improved
✓ Users’ satisfaction is increased
✓ Users retain more information
The Impact
1 Book: Kruk, McDaniel: Semantic Digital Libraries (Springer, 2008) [300+ copies sold]
30+ Papers (excluding 9 chapters in the book):
  JeromeDL: IIS 2004, DEXA 2005, ECDL Demo Session 2005 Workshop, InfoBazy 2005, ICIW 2006 (best paper), Semantic Web Challenge at ISWC 2007, SemTech 2007, MCAST Workshop 2007, Dev. Track WWW 2008, ECDL 2008, InfoBazy 2008, The Electronic Library Journal
  FOAFRealm: FOAF Workshop 2004, TEHOSS 2005, MoSO @ MDM 2006, IRW2006 @ WWW2006, ASWC 2006, WBC 2007, International Journal of WBC, Semantic Web Challenge at ISWC 2007, Media in Transition 2007
  MarcOnt: DublinCore 2005, ECDL Poster Session 2005, International Artificial Intelligence Research Society Conference 2007
  MultiBeeBrowse: ODBASE 2007, Conference on Teaching and Learning 2007, CHI 2008
  Social Semantic Collaborative Filtering: Semantic Desktop at ISWC, 2005
  NLQ: IADIS International Conference WWW/Internet 2006
  Didaskon/IKHarvester: EC-TEL 2007, LACLO 2006, IEEE ICSC 2006
  HyperCuP: ESWC Demo Session 2006
5 Tutorials: JCDL2006, ESWC 2007, WWW 2007, JCDL 2008, ICSD 2009 (upcoming)
3 Invited talks: EPFL, UCD, Polish Information Processing Society
3 Workshops: Irish DL Summit, Web Archiving, Special Session at NKOS 2006
10+ open source projects - corrib.org, opensource.knowledgehives.com
17 MSc Theses supervised at GUT
Startup company (Knowledge Hives) continuing R&D efforts initiated in SemDL domain
Semantic and Social Services Improve Usability of Information Discovery in Semantic Digital Libraries
Sebastian Ryszard Kruk
http://www.sebastiankruk.com/