
Undefined 1 (2015) 1–27. IOS Press

Aemoo: Linked Data Exploration based on Knowledge Patterns

Andrea Giovanni Nuzzolese a,*, Valentina Presutti a, Aldo Gangemi a,c, Silvio Peroni a,b and Paolo Ciancarini a,b

a STLab-ISTC, Consiglio Nazionale delle Ricerche, Rome, Italy
b Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
c LIPN, Université Paris 13, Sorbonne Cité, UMR CNRS, Paris, France

Abstract. This paper presents a novel approach to Linked Data exploration that uses Encyclopedic Knowledge Patterns (EKPs) as relevance criteria for selecting, organising, and visualising knowledge. EKPs are discovered by mining the linking structure of Wikipedia and evaluated by means of a user-based study, which shows that they are cognitively sound as models for building entity summarisations. We implemented a tool named Aemoo that supports EKP-driven knowledge exploration and integrates data coming from heterogeneous resources, namely static and dynamic knowledge as well as text and Linked Data. Aemoo is evaluated by means of controlled, task-driven user experiments in order to assess its usability and its ability to provide relevant and serendipitous information as compared to two existing tools: Google and RelFinder.

Keywords: Exploratory search, Knowledge exploration, Visual exploration, Knowledge Patterns, Analysis of Linked Data

1 Introduction

In the Semantic Web vision [7], agents were supposed to leverage Web knowledge in order to help humans in solving knowledge-intensive tasks. Nowadays, Linked Data is feeding the Semantic Web by publishing datasets that rely on URIs and RDF. However, it is still difficult to enable homogeneous and contextualised access to Web knowledge for both humans and machines, because of the heterogeneity of Linked Data and the lack of relevance criteria (a.k.a. the knowledge boundary problem [21]) for providing tailored data.

The heterogeneity problem is due to the different data semantics, ontologies and vocabularies used in linked datasets. In fact, Linked Data is composed of datasets from different domains (e.g. life science, geography, government). Moreover, some of them classify data according to a reference ontology (e.g. DBpedia), while others just provide access to raw RDF data. For example, if we consider the case of aggregating data from different linked datasets, we need a shared intensional meaning over the things described in these datasets in order to properly mash up facts about those things. The scenario is even more complex if we also take into account dynamic data coming from a variety of sources, like social streams (e.g. Twitter) and news (e.g. Google News).

* Corresponding author. E-mail: andrea.nuzzolese@istc.cnr.it

On the other hand, the knowledge boundary problem consists in the difficulty of identifying what configuration of data is really meaningful with respect to specific application tasks. Identifying meaningful data involves the need to establish clear relevance criteria to be applied as a filter on data. As an example, we can consider an application that leverages Linked Data to provide a summary on some topic. If the topic of the summary is the philosopher Immanuel Kant, this application should provide users with tailored information concerning facts about Kant’s major works and ideas, but should skip facts that are too peculiar or curious, such as the nationality of his grandfather.

0000-0000/15/$00.00 © 2015 – IOS Press and the authors. All rights reserved

Elsewhere [21] we introduced a vision for the Semantic Web based on Knowledge Patterns (KPs) as basic units of meaning for addressing both the heterogeneity and the knowledge boundary problems. More recently, we introduced Encyclopedic Knowledge Patterns [37] (EKPs) as a particular kind of KP that is empirically discovered by mining the linking structure of Wikipedia pages. EKPs provide knowledge units that can answer the following competency question:

What are the most relevant entity types (T_1, ..., T_n) that are involved in the description of an entity of type C?

For example, the EKP for describing philosophers should possibly include types such as book, university and writer, because a philosopher is typically linked to these entity types. We assume that EKPs are cognitively sound because they emerge from the largest existing multi-domain knowledge source collaboratively built by humans with an encyclopedic task in mind. Whether Wikipedia article writing and hyperlinking confirm this intuition for each entity type in its ontology is an empirical matter, and we have performed data mining and experiments [37] to this aim, proving that automatically extracted EKPs can be trusted as relevant configurations of data for a given entity type.

In this paper we exploit EKPs for designing a system that helps humans to address summarisation and discovery tasks. These tasks can be classified as exploratory search tasks, as they involve different phases, i.e. look-up, learning and investigation [29], that characterise the strategies that humans adopt while exploring the Web. For example, consider a student who is asked to build a concept map about a topic for her homework. She starts by looking up specific terms in a keyword-based search engine (e.g. Google), then she moves through search results and hyperlinks in order to investigate the available information about the topic, and finally learns new knowledge she can use to address her task.

Hypotheses. The work presented in this paper is grounded on two working hypotheses: (i) EKPs provide a unifying view as well as a relevance criterion for building entity-centric summaries, and (ii) they can be exploited effectively for helping humans in exploratory search tasks.

Contribution. The contributions of this paper are the following:

– a detailed description of the Aemoo¹ visual exploratory search system, based on the extraction of EKPs from Wikipedia and their evaluation (formerly explained in [37]). We briefly introduced Aemoo in [38], whereas in this paper we provide an exhaustive description of the system and of the method used for applying EKPs in order to provide entity summarisation and knowledge aggregation;

– an original evaluation of Aemoo by means of controlled, task-driven user experiments in order to assess its usability and its ability to provide relevant and serendipitous information.

It is worth mentioning that there are state-of-the-art systems that provide semantic mash-up and browsing capabilities, such as [542526]. However, they mostly focus on presenting linked data coming from different sources and visualising it in interfaces that mirror the linked data structure. Instead, Aemoo organises and filters the retrieved knowledge in order to show only relevant information to users, and explains why a certain piece of information is included.

The rest of the paper is organised as follows. Section 2 presents a method for extracting EKPs from Wikipedia. Section 3 presents Aemoo, a system based on EKPs for knowledge exploration. Section 4 describes the experiments we conducted for evaluating the EKPs and the system. Section 5 discusses evaluation results, limits and possible solutions to improve the system. Section 6 presents the related work. Finally, Section 7 summarises the contribution and illustrates future work.

2 Encyclopedic Knowledge Patterns

A general formal theory for Knowledge Patterns (KPs) does not exist yet. Different independent theories have been developed so far, and KPs have been proposed with different names and flavours across different research areas, such as linguistics [18], artificial intelligence [313], cognitive sciences [4, 20] and, more recently, the Semantic Web [21]. As discussed in [21], it is possible to identify a shared meaning for KPs across those different theories as “a structure that is used to organize our knowledge, as well as for interpreting, processing or anticipating information”.

¹ http://www.aemoo.org


In linguistics, frames are a form of KPs that were introduced by Fillmore in 1968 [18] in his work about case grammar. In a case grammar, each verb selects a number of deep cases which form its case frame. A case frame describes important aspects of the semantic valency of verbs, adjectives and nouns. Fillmore elaborated further the theory of case frames and in 1976 he introduced frame semantics [19]. According to Fillmore, a frame is:

“... any system of concepts related in such a way that to understand any one of them you have to understand the whole structure in which it fits; when one of the things in such a structure is introduced into a text, or into a conversation, all of the others are automatically made available.” [19]

A frame is comparable to a cognitive schema, another form of KPs, since it has a prototypical form that can be applied to a variety of concrete cases that fit that form. According to cognitive science theories, e.g. [4], humans are able to recognize frames, to repeatedly apply them in so-called manifestations of a frame, and to learn new frames that can become part of their competence. Frames are considered cognitively relevant since they are used by humans to successfully interact with their environment when some information structuring is needed.

In computer science, frames were introduced by Minsky [31], who recognized that frames convey both cognitive and computational value in representing and organizing knowledge. His definition was:

“... a remembered framework to be adapted to fit reality by changing details as necessary. A frame is a data-structure for representing a stereotyped situation, like being in a certain kind of living room, or going to a child’s birthday party.” [31]

In knowledge engineering, the term Knowledge Pattern was used by Clark [11]. The notion of KP introduced by Clark is slightly different from frames as introduced by both Fillmore and Minsky. In fact, according to Clark, KPs are first-order theories which provide a general schema able to provide terminological grounding, and morphisms for enabling mappings among knowledge bases that use different terms for representing the same theory. Clark recognizes KPs as general templates denoting recurring theory schemata, and his approach is similar to the use of theories and morphisms in the formal specification of software.

More recently, Knowledge Patterns have been revamped in the context of the Semantic Web by Gangemi and Presutti [21]. Their notion of KPs encompasses those proposed by Fillmore, Minsky and Clark, and goes further, envisioning KPs as research objects of knowledge engineering and the Semantic Web from the viewpoint of empirical science.

In [37] we introduced the Encyclopedic Knowledge Patterns (EKPs). EKPs were discovered by mining the structure of Wikipedia articles. They are a special kind of knowledge pattern: they express the core elements that are used for describing entities of a certain type with an encyclopedic task in mind. The cognitive soundness of EKPs is bound to an important working hypothesis about the process of knowledge construction realized by the Wikipedia crowds: each article is linked to other articles when defining or describing the entity referenced by the article. DBpedia, in accordance with this hypothesis, has RDF-ized a) the entities referenced by articles as resources, b) the wikilinks as relations between those resources, and c) the types of the resources as OWL classes. EKPs are grounded in the assumption that wikilink relations in DBpedia, i.e. instances of the dbpo:wikiPageWikiLink property², convey a rich encyclopedic knowledge that can be formalized as knowledge patterns.

An EKP is a small ontology that defines a class S and the typical relations R_i between individuals from S and other individuals from the most relevant classes C_j that are used to describe the individuals from S. In [37] we defined a method for extracting EKPs by analysing the structure of wikilinks. This method is based on three main steps, i.e.:

– gathering the knowledge architecture of a dataset. The goal of this step is to create a model able to provide an overview of the structure of wikilinks by exploiting the dataset of Wikipedia page links available in DBpedia³. For this purpose we used an OWL vocabulary called knowledge architecture⁴ [44], which allows one to gather a landscape view on a dataset even with no prior knowledge of its vocabulary. In our case the overviews are focused on the identification of type paths in the dataset of wikilinks. A type path is a property path (limited to length 1 in this work, i.e. a triple pattern) whose occurrences have (i) the same rdf:type for their subject nodes and (ii) the same rdf:type for their object nodes.

– EKP discovery. This step is focused on the discovery of EKPs emerging from data in a bottom-up fashion. For this purpose we used a function called pathPopularity(P_{i,j}, S_i), which records the ratio of how many distinct resources of a certain type participate as subject in a path to the total number of resources of that type. Intuitively, it indicates the popularity of a path for a certain subject type within a dataset. An EKP for a certain type S is identified by the set of type paths having S as subject type whose pathPopularity(P_{i,j}, S_i) values are above a certain threshold t, ranging from 0 to 1. In order to decide a value for t, we built a prototypical ranking of the pathPopularity(P_{i,j}, S_i) scores, called pathPopularity_DBpedia. Then we computed a K-Means clustering on the pathPopularity_DBpedia scores to hypothesise a threshold value for t. The clustering generated one large cluster (85% of the 40 ranks) with pathPopularity_DBpedia scores below 0.18 and three small clusters with pathPopularity_DBpedia scores above 0.23. Hence we set t = 0.18. We recommend reading our previous work [37] for further details about the identification of the threshold t.

– OWL2 formalization of EKPs. In this step we apply a refactoring procedure to the dataset resulting from the previous steps in order to formalise EKPs as OWL2 ontologies.

² Prefixes dbpo:, dbpedia: and ka: stand for http://dbpedia.org/ontology/, http://dbpedia.org/resource/ and http://www.ontologydesignpatterns.org/ont/lod-analysis-path.owl, respectively.
³ The dataset is named dbpedia_page_links_en.
⁴ http://www.ontologydesignpatterns.org/ont/lod-analysis-path.owl

Fig. 1. The UML class diagram for the EKP ekp:Philosopher. The arrows among classes represent universal restrictions.

Figure 1 shows the EKP ekp:Philosopher⁵ resulting from our method. It reports the types of entities that are most frequently (according to pathPopularity values) associated with entities typed as philosophers in Wikipedia by means of wikilink relations. According to the method described, we generated 231 EKPs out of 272 DBPO classes⁶ and published them into the ODP repository⁷.
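As a minimal illustration of the EKP discovery step described in this section, the following Python sketch computes pathPopularity for type paths of length 1 and keeps the paths above the threshold t = 0.18. The function names, the in-memory input format and the toy data are our own assumptions for illustration; they are not taken from the implementation used in [37], which operates on the DBpedia datasets.

```python
from collections import defaultdict


def path_popularity(wikilinks, rdf_type):
    """Return {(subject_type, object_type): popularity} for length-1 type paths.

    wikilinks: iterable of (subject, object) pairs (hypothetical in-memory input)
    rdf_type:  dict mapping each resource to its single, most specific type
    """
    # Distinct subjects of each type that take part in each type path.
    subjects_on_path = defaultdict(set)
    for subj, obj in wikilinks:
        if subj in rdf_type and obj in rdf_type:
            subjects_on_path[(rdf_type[subj], rdf_type[obj])].add(subj)

    # Total number of resources per type (the denominator of the ratio).
    resources_of_type = defaultdict(int)
    for resource_type in rdf_type.values():
        resources_of_type[resource_type] += 1

    return {
        path: len(subjects) / resources_of_type[path[0]]
        for path, subjects in subjects_on_path.items()
    }


def discover_ekp(popularity, subject_type, t=0.18):
    """Object types whose path popularity for subject_type is above threshold t."""
    return sorted(
        obj_type
        for (subj_type, obj_type), score in popularity.items()
        if subj_type == subject_type and score > t
    )


# Toy data (invented): six philosophers, all linked to a city,
# three linked to a book, one linked to a company.
philosophers = ["Kant", "Hume", "Plato", "Hegel", "Locke", "Spinoza"]
rdf_type = {p: "Philosopher" for p in philosophers}
rdf_type.update({f"City_of_{p}": "City" for p in philosophers})
rdf_type.update({"Book_A": "Book", "Book_B": "Book", "Book_C": "Book",
                 "Amazon": "Organisation"})
wikilinks = [(p, f"City_of_{p}") for p in philosophers]
wikilinks += [("Kant", "Book_A"), ("Hume", "Book_B"), ("Hegel", "Book_C"),
              ("Kant", "Amazon")]

popularity = path_popularity(wikilinks, rdf_type)
ekp_philosopher = discover_ekp(popularity, "Philosopher")
```

In this toy dataset the Philosopher-to-Organisation path is used by one philosopher out of six (about 0.17), so it falls below t and is left out of the pattern, while the City and Book paths are kept.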

3 Encyclopedic Knowledge Patterns as relevance criteria for exploratory search

Aemoo is a tool that exploits relevance strategies based on EKPs for supporting exploratory search tasks. It uses EKPs as a unifying view for aggregating knowledge from static (i.e. DBpedia and Wikipedia) as well as dynamic (i.e. Twitter and Google News) sources. We presented a preliminary version of Aemoo in [38].

⁵ The prefix ekp: stands for the namespace http://www.ontologydesignpatterns.org/ekp
⁶ As at DBpedia version 3.7.
⁷ http://ontologydesignpatterns.org/aemoo/ekp

We assume that the human action of linking entities on the Web reflects the way humans organise their knowledge. EKPs reflect the most frequent links between entity types; hence our hypothesis is that EKPs can be used for selecting the most relevant entities to be included in an entity-centric summary that can support users in knowledge exploration. In fact, Aemoo's novelty is its ability to build entity-centric summaries by applying EKPs as lenses over data. In this way, Aemoo performs both enrichment and filtering of information. Enrichment and filtering are the two actions performed in order to address the knowledge heterogeneity and boundary problems. Users are guided through their navigation by both reducing and focusing the amount of available data: given an entity, instead of being presented with a bunch of triples or a big unfocused graph, users navigate through units of knowledge and move between them without losing the overview of the entity. In practice, an EKP determines a topic context according to its entity type. All relations between resources that emerge from the selected EKP are used as the basis for (i) selecting the information to be aggregated and (ii) visualising it in a concept map fashion. We discuss the first point in the next section and the visualisation in Section 3.2.

3.1 Knowledge enrichment and filtering

Knowledge enrichment and filtering is obtained by performing the following steps:

– identity resolution of a subject (provided by a user query) against Linked Data entities;

– selection of the EKP corresponding to the subject type;

– filtering and enrichment of static data about the subject according to the model provided by the selected EKP;

– filtering and enrichment of dynamic data about the subject according to the model provided by the selected EKP;

– aggregation of peculiar knowledge about the subject.

In the next paragraphs we detail these steps.

Identity resolution. Identity resolution is performed in two possible ways:

– a semi-automatic approach that leverages a DBpedia index based on the Entityhub⁸ component of Apache Stanbol¹⁰. This approach is semi-automatic because the system returns a list of possible entities that match the subject of a user query based on the value of their rdfs:label annotation. The user selects an entity within the list;

– a completely automatic approach based on Entity Linking¹¹. Aemoo uses the Enhancer component of Apache Stanbol for entity linking. In this case, the identity of the subject of a user query is resolved against DBpedia by choosing the entity with the highest linking confidence.
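The two resolution strategies can be sketched as follows. This is only an illustration under our own assumptions: the Candidate record, the function names and the confidence values are hypothetical, and the sketch does not call Apache Stanbol, whose Entityhub and Enhancer components perform the actual matching and linking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for an entity returned by an index or a linker."""
    uri: str
    label: str
    confidence: float = 0.0


def match_by_label(query, index):
    """Semi-automatic case: list the entities whose label contains the query.

    The user is then expected to pick one entry from the returned list.
    """
    needle = query.strip().lower()
    return [c for c in index if needle in c.label.lower()]


def pick_most_confident(candidates):
    """Automatic case: keep the candidate with the highest linking confidence."""
    return max(candidates, key=lambda c: c.confidence, default=None)


# Invented example data.
index = [
    Candidate("dbpedia:Immanuel_Kant", "Immanuel Kant"),
    Candidate("dbpedia:Kant_(crater)", "Kant (crater)"),
    Candidate("dbpedia:Rome", "Rome"),
]
choices = match_by_label("kant", index)

linked = [
    Candidate("dbpedia:Immanuel_Kant", "Immanuel Kant", 0.92),
    Candidate("dbpedia:Kant_(crater)", "Kant (crater)", 0.31),
]
best = pick_most_confident(linked)
```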

EKP selection. An EKP provides a unit of knowledge that can be used for answering the question: “What are the most relevant entity types (T_1, ..., T_n) that are involved in the description of an entity of type C?” Thus, given a certain entity, Aemoo needs to (i) identify its type and (ii) select its corresponding EKP in order to build its entity-centric summary. Given a subject entity¹², Aemoo relies on an index that associates EKPs with DBPO types. This index is built during the extraction of EKPs from Wikipedia. Selecting an EKP from the index involves the following:

– identification of the most specific DBPO type t for a subject entity. This allows us to avoid multi-typing and to be compliant with the method used for generating type paths (cf. Section 2);

– retrieval of the EKP associated with t from the index. If no association is available, Aemoo traverses the DBPO taxonomy of superclasses iteratively until an association is found. If still no association is found, the EKP for owl:Thing, i.e. ekp:Thing¹³, is selected. This is the most generic type, and its EKP was generated by analysing wikilinks having as subjects those entities whose only type is owl:Thing. We remind the reader that in this work we ignore types defined in ontologies other than DBPO (e.g. YAGO [49]).

⁸ The Entityhub is a component of the Apache Stanbol project which relies on Apache Solr⁹ for building a customised entity-centric index of a linked dataset.
¹⁰ http://stanbol.apache.org
¹¹ Entity linking is the ability to resolve the referent of a term in a text against an entity from a known knowledge base.
¹² The subject of a user query.
¹³ http://ontologydesignpatterns.org/ekp/owl/Thing.owl


As an example, the type dbpo:Philosopher would be selected for the entity dbpedia:Immanuel_Kant, and the EKP ekp:Philosopher¹⁴ retrieved from the index accordingly. Figure 1 shows the UML class diagram for this EKP, in which it is possible to see the typical relations emerging between entities of this type and other entity types.
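The selection procedure above can be sketched in a few lines of Python. The dictionaries standing in for the DBPO taxonomy and for the EKP index, the function name and the tie-breaking rule (the first of several equally specific types is taken) are assumptions made for this illustration; the taxonomy is also assumed to be acyclic with at most one superclass per class.

```python
def select_ekp(entity_types, superclass_of, ekp_index, fallback="ekp:Thing"):
    """Pick the EKP for an entity from its asserted types.

    entity_types:  list of types asserted for the entity
    superclass_of: dict mapping a class to its direct superclass (hypothetical)
    ekp_index:     dict mapping a class to the EKP associated with it
    """
    # Collect every class that is an ancestor of some asserted type.
    ancestors = set()
    for t in entity_types:
        parent = superclass_of.get(t)
        while parent is not None:
            ancestors.add(parent)
            parent = superclass_of.get(parent)

    # The most specific type is an asserted type that is nobody's ancestor.
    specific = [t for t in entity_types if t not in ancestors]
    current = specific[0] if specific else None

    # Walk up the taxonomy until a class with an associated EKP is found.
    while current is not None:
        if current in ekp_index:
            return ekp_index[current]
        current = superclass_of.get(current)
    return fallback


# Invented fragment of a taxonomy and of an EKP index.
superclass_of = {
    "dbpo:Philosopher": "dbpo:Person",
    "dbpo:Astronaut": "dbpo:Person",
    "dbpo:Person": "dbpo:Agent",
    "dbpo:Agent": "owl:Thing",
}
ekp_index = {"dbpo:Philosopher": "ekp:Philosopher", "dbpo:Person": "ekp:Person"}

kant_ekp = select_ekp(["dbpo:Person", "dbpo:Philosopher", "owl:Thing"],
                      superclass_of, ekp_index)
astronaut_ekp = select_ekp(["dbpo:Astronaut"], superclass_of, ekp_index)
untyped_ekp = select_ekp(["owl:Thing"], superclass_of, ekp_index)
```

In this example the astronaut has no EKP of its own in the toy index, so the sketch falls back to the EKP of its superclass, and an entity typed only as owl:Thing receives the generic pattern.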

Filtering and enrichment of static data. Aemoo uses EKPs as lenses over data for organising and filtering the knowledge to be presented about a specific subject entity. EKPs are used for automatically generating SPARQL CONSTRUCT queries to be executed against DBpedia for retrieving the relevant data. We remark that EKPs evolve over time and that the results returned by Aemoo will be affected accordingly. Thus, the filtering and the organisation of data are pattern-based and dynamic. For example, the following SPARQL query is generated by using the ekp:Philosopher EKP:

     1  CONSTRUCT
     2  {
     3    ?entity a ?type .
     4    dbpedia:Immanuel_Kant
     5      ?ekp_property ?entity
     6    .
     7  }
     8  WHERE {
     9    GRAPH <dbpedia_page_links_en> {        # wikilinks of the subject
    10      dbpedia:Immanuel_Kant
    11        dbpo:wikiPageWikiLink ?entity
    12    }
    13    GRAPH <dbpedia_instance_types_en> {    # type of each linked entity
    14      ?entity a ?type
    15    }
    16    GRAPH <ekp_philosopher> {              # keep only types allowed by the EKP
    17
    18      ?ekp_property
    19        rdfs:domain dbpo:Philosopher .
    20      ?ekp_property
    21        rdfs:range ?type
    22    }
    23  }

¹⁴ http://ontologydesignpatterns.org/ekp/owl/Philosopher.owl
¹⁵ http://wit.istc.cnr.it:8894/sparql

In the query above, dbpedia_page_links_en, dbpedia_instance_types_en and ekp_philosopher are three datasets in our SPARQL endpoint¹⁵, i.e. the DBpedia dataset of Wikipedia links, the DBpedia dataset of instance types and the dataset identifying the ekp:Philosopher EKP, respectively. The query implements the concept of knowledge boundary in Aemoo, since it filters out the wikilinks that are not typical of a specific subject, using its reference EKP as relevance criterion. The query can be explained as follows:

– the subject entity is associated with its page links at lines 10 and 11;

– each linked entity is bound to its type at line 14;

– linked entities are filtered based on their types according to the EKP. Based on the definition of type paths [37], we have:

    * S_i = dbpo:Philosopher
    * q = ?ekp_property
    * O_j = ?type

  where ?type is a variable bound to the object type and ?ekp_property is a variable bound to the EKP local property for a wikilink type (e.g. ekpS:linksToCity). This property is generated during the formalisation step of the method described in Section 2. By constraining the results with the domain and range of ?ekp_property (lines 18-21), the result of the query consists of all wikilinks that match the ekp:Philosopher EKP.
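To illustrate how such a query could be generated from a subject and its EKP, the following Python sketch fills a template with the subject entity, its type and the name of the EKP graph. The function name and its parameters are hypothetical and the template is our own simplification; the default graph names follow the example query, and prefix declarations are omitted as in that query.

```python
def build_construct_query(subject, subject_type, ekp_graph,
                          links_graph="dbpedia_page_links_en",
                          types_graph="dbpedia_instance_types_en"):
    """Return a SPARQL CONSTRUCT query that keeps only EKP-compliant wikilinks.

    subject:      prefixed name of the subject entity
    subject_type: prefixed name of its most specific DBPO type
    ekp_graph:    name of the graph holding the EKP for that type
    """
    return f"""CONSTRUCT {{
  ?entity a ?type .
  {subject} ?ekp_property ?entity .
}}
WHERE {{
  GRAPH <{links_graph}> {{
    {subject} dbpo:wikiPageWikiLink ?entity .
  }}
  GRAPH <{types_graph}> {{
    ?entity a ?type .
  }}
  GRAPH <{ekp_graph}> {{
    ?ekp_property rdfs:domain {subject_type} ;
                  rdfs:range ?type .
  }}
}}"""


query = build_construct_query("dbpedia:Immanuel_Kant",
                              "dbpo:Philosopher",
                              "ekp_philosopher")
```

The sketch only builds the query text; sending it to an endpoint and handling the returned RDF are left out.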

These queries generate entity-centric RDF models used by Aemoo for building a summary description for a specific subject entity. For example, considering dbpedia:Immanuel_Kant as subject entity, its resulting entity-centric model is the following:

    dbpedia:Immanuel_Kant
        a dbpo:Philosopher ;
        rdfs:label "Immanuel Kant"@en ;
        ekpS:linksToCity
            dbpedia:Königsberg ;
        ekpS:linksToLanguage
            dbpedia:Latin ;
        ekpS:linksToBook
            dbpedia:Critique_of_Pure_Reason .

    dbpedia:Königsberg a dbpo:City ;
        rdfs:label "Königsberg"@en .

    dbpedia:Latin a dbpo:Language ;
        rdfs:label "Latin"@en .

    dbpedia:Critique_of_Pure_Reason
        a dbpo:Book ;
        rdfs:label
            "Critique of Pure Reason"@en .

Aemoo further enriches the resulting model by identifying possible semantic relations that provide a label to wikilinks. It associates each wikilink type with a list of popular DBpedia semantic relations holding between the types involved in that wikilink type. For example, considering a wikilink type ekpS:linksToCity, which links entities of types dbpo:Person (subject) and dbpo:Place (object), we can extract from DBpedia that the most popular semantic relations linking these types of entities are dbpo:birthPlace and dbpprop:placeOfBirth¹⁶. These semantic relations are added to the previous RDF model by means of skos:relatedMatch axioms. For example:

    ekp_phil:linksToCity
        a owl:ObjectProperty ;
        skos:relatedMatch
            dbpo:birthPlace , dbpprop:placeOfBirth .

A wikilink is a hyperlink in a Wikipedia page that is (almost¹⁷) always surrounded by a piece of text. This piece of text is very important because it can be used as linguistic evidence for explaining the nature of the relation between a subject entity and the object entity linked by a wikilink. Thus, Aemoo extracts this piece of text (with a maximum of 50 words) and enriches the model with an OWL2 annotation representing this linguistic evidence. For example, given the triple

    dbpedia:Immanuel_Kant ekpS:linksToCity dbpedia:Königsberg

its annotation will be the following:

    [] a owl:Axiom ;
        owl:annotatedSource
            dbpedia:Immanuel_Kant ;
        owl:annotatedProperty
            ekpS:linksToCity ;
        owl:annotatedTarget
            dbpedia:Königsberg ;
        grounding:hasLinguisticEvidence
            aemoo:wiki-sentence .

    aemoo:wiki-sentence a doco:Sentence ;
        frbr:partOf wikipedia:Immanuel_Kant ;
        c4o:hasContent
            "Immanuel Kant was born in 1724 in Königsberg, Prussia (since 1946 the city of Kaliningrad, Kaliningrad Oblast, Russia), as the fourth of nine children."@en .

    wikipedia:Immanuel_Kant
        a fabio:WikipediaEntry .

¹⁶ This information is limited by the coverage of DBpedia.
¹⁷ An exception is represented, for example, by the infoboxes having wikilinks in table cells without any surrounding text.

Here the triple is annotated with the property grounding:hasLinguisticEvidence¹⁸, while the linguistic evidence is identified by aemoo:wiki-sentence¹⁹. The linguistic evidence is (i) typed as a doco:Sentence²⁰ [12], (ii) declared to be extracted from Wikipedia with the property frbr:partOf²¹ and (iii) associated with the text by means of the property c4o:hasContent²² [15].

Filtering and enrichment of dynamic data. Besides static data sources, Aemoo uses two dynamic data sources (a set that can be easily extended): Twitter and Google News. In fact, the entity-centric model built by Aemoo can be extended in order to include (i) the current stream of Twitter messages and (ii) the available articles provided by Google News that mention the subject entity together with other entities that can be resolved on DBpedia. For example, let us consider the following tweet:

“Lots of people love to read Kant here in Rome”

In this tweet, the named entities “Kant” and “Rome” are resolved to the entities dbpedia:Immanuel_Kant and dbpedia:Rome. These in turn have types ekp:Philosopher and ekp:City, respectively. Aemoo uses this mechanism for extending an entity-centric model by aggregating new (heterogeneous) data. According to the representation schema previously described, the entity-centric model is then extended as follows:

    [] a owl:Axiom ;
        owl:annotatedSource
            dbpedia:Immanuel_Kant ;
        owl:annotatedProperty
            ekpS:linksToCity ;
        owl:annotatedTarget
            dbpedia:Rome ;
        grounding:hasLinguisticEvidence
            twitter:tweet_id .

    twitter:tweet_id a fabio:Tweet ;
        c4o:hasContent
            "Lots of people love to read Kant here in Rome"@en .

¹⁸ The prefix grounding: identifies the ontology http://ontologydesignpatterns.org/cp/owl/grounding.owl
¹⁹ The prefix aemoo: identifies the local namespace used by Aemoo.
²⁰ The prefix doco: identifies the ontology http://purl.org/spar/doco
²¹ The prefix frbr: identifies the ontology http://purl.org/vocab/frbr/core
²² The prefix c4o: identifies the ontology http://purl.org/spar/c4o [15]

Here the triple is annotated with the property grounding:hasLinguisticEvidence, and the linguistic evidence in this case is identified by twitter:tweet_id²³. The tweet is (i) typed as fabio:Tweet [40]²⁴ and (ii) associated with the text by means of the property c4o:hasContent.

It is worth clarifying that not all co-occurrences are selected, but only those whose types are compliant with the intensional schema provided by the reference EKP for the subject entity. In fact, let us consider the tweet:

“Just bought another Kant’s book on Amazon”

The entity dbpedia:Amazon, typed as dbpo:Organisation, would not be added to the entity-centric model associated with dbpedia:Immanuel_Kant because, according to the relevance criterion provided by the EKP ekp:Philosopher, relations between dbpo:Philosopher entities and dbpo:Organisation entities are not central for philosophers.

Aggregation of peculiar knowledge. However, the link between dbpedia:Immanuel_Kant and dbpedia:Amazon is kept and stored as peculiar information for this specific subject. We refer to this kind of data as “curiosities” about a subject, and let the user visualise them in a complementary entity-centric model (cf. Section 3.2). Curiosities are captured by the wikilinks of a subject whose type path falls in the long tail, i.e. its pathPopularity value is below the EKP threshold (cf. Section 2). While EKPs allow us to capture the core relevant knowledge about a subject, based on the relations that are popularly used for all (or most) entities of its same type, curiosities show features that are peculiar to that subject, hence providing a different relevance criterion for selection: what makes it a distinguished entity as compared to the others of its same type. Let us consider two subjects as an example: “Barack Obama” and “Arnold Schwarzenegger”. Both are office holders according to DBpedia, hence their entity-centric model based on the relevance criterion of identifying the core knowledge about them will be built according to the dbpo:OfficeHolder EKP, which will allow Aemoo to show information about legislatures, elections, other office holders, etc. If, for the same subjects, we want to identify information that is more unusual to find for office holders, then Aemoo generates a complementary entity-centric model for the subjects based on curiosities. This model will include, for example, the information that Barack Obama was awarded a Nobel Prize, and a list of movies that starred Arnold Schwarzenegger. However, differently from the core knowledge filtered by EKPs, these data are noisy; hence further studies include the investigation of these complementary models with the aim of conceiving strategies for refining this “peculiarity-based” relevance criterion.

²³ The tweet_id has to be replaced with an actual tweet identifier available from Twitter.
²⁴ The prefix fabio: identifies the ontology http://purl.org/spar/fabio [40].
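The split between core knowledge and curiosities can be sketched as a simple partition of the entities linked to a subject. The function name, the input format and the example types below are assumptions made for illustration only; in particular, the set of types attributed here to the philosopher pattern is invented and not copied from the published EKP.

```python
def partition_links(linked_entities, ekp_types):
    """Split linked entities into EKP-compliant ones and curiosities.

    linked_entities: list of (entity, entity_type) pairs for one subject
    ekp_types:       set of object types allowed by the subject's EKP
    """
    core, curiosities = [], []
    for entity, entity_type in linked_entities:
        if entity_type in ekp_types:
            core.append(entity)          # typical for the subject type
        else:
            curiosities.append(entity)   # long-tail link, kept separately
    return core, curiosities


# Hypothetical object types of the philosopher pattern.
philosopher_types = {"dbpo:City", "dbpo:Book", "dbpo:Language"}

links_of_kant = [
    ("dbpedia:Rome", "dbpo:City"),
    ("dbpedia:Critique_of_Pure_Reason", "dbpo:Book"),
    ("dbpedia:Amazon", "dbpo:Organisation"),
]
core, curiosities = partition_links(links_of_kant, philosopher_types)
```

The same partition applies to wikilinks and to co-occurrences found in tweets or news, since both are judged by the type of the linked entity.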

3.2 Knowledge visualisation

In this section we motivate our design choices concerning the visualisation and the presentation of data in Aemoo.

Dadzie and Rowe [14], starting from the conclusions of Shneiderman [47], discuss the requirements that visualisation systems should fulfil for enabling the consumption of Linked Data depending on the target user, i.e. a tech-user (a user with a good knowledge of Linked Data and Semantic Web technologies) or a lay-user (a user with little knowledge of Linked Data and Semantic Web technologies). We focused on the design of the Aemoo user interface (UI) for providing a visual presentation of data in order to enable its usage by lay-users. Hence, we took into account the following requirements:

(R1) data extraction and aggregation in order to provide overviews of the underlying data;
(R2) intuitive navigation;
(R3) support for filtering to users with no knowledge of formal query syntax, data structure and specific vocabularies used in the dataset;
(R4) exploratory knowledge discovery;
(R5) history mechanism for enabling undo or retracing of navigation steps;
(R6) support to detail data on demand.

The method described in Section 3.1 was designed to address requirements R1 and R3. Then, the main issue is to provide an intuitive visualisation (R2) that benefits from the way Aemoo extracts, aggregates and filters data for enabling exploratory search (R4). Designing a UI for supporting humans in exploratory search means reducing the cognitive load required for activities like look-up, learning and investigation [29]. Novak and Gowin [36] demonstrated that concept maps are an effective means for representing and communicating knowledge for supporting humans in undertaking understanding and learning tasks. Concept maps, introduced by [34], are diagrams that are typically used to depict relationships between concepts, and they can be easily adopted in our case for visualising knowledge organised by means of EKPs. In fact, EKPs, similarly to concept maps, include concepts (i.e. DBPO classes) and relations among them. Thus, we claim that visualising knowledge in a concept map is a fair solution to help humans to construct their mental models [13] for addressing exploratory search tasks with Aemoo.

Fig. 2. Aemoo user interface. (a) Initial summary page for the query “Immanuel Kant”. (b) Exploring knowledge between Immanuel Kant and Königsberg. On the left side there is a list of explanation snippets that provide evidence of the relation with respect to the enabled sources.

Fig. 3. Aemoo breadcrumb.

In fact, the concept map related to a subject entity is visualised by Aemoo using a radial graph with one-degree connections, having:

– the subject entity in the center, represented as a square node;

– circular nodes around the subject entity that represent sets of resources of a certain type. Namely, they are the types of the resources linked to the subject entity that Aemoo aggregated and enriched from the various sources (cf. requirement R1). We refer to them as node sets. As previously described, such types are the ones that a user would intuitively expect to see in a summary description of an entity according to its type, as we described in Section 2.
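The structure just described can be sketched as a small data model. This is only an illustration under stated assumptions: the function name `build_node_sets`, the dictionary layout and the sample links are hypothetical and are not taken from the Aemoo code base.

```python
from collections import defaultdict


def build_node_sets(links, ekp_types):
    """Group linked entities into node sets, one per type allowed by the EKP.

    links     -- iterable of (entity_label, entity_type) pairs (hypothetical input)
    ekp_types -- set of types that the pattern for the subject's type retains
    """
    node_sets = defaultdict(list)
    for entity, entity_type in links:
        if entity_type in ekp_types:  # types outside the pattern are filtered out
            node_sets[entity_type].append(entity)
    return dict(node_sets)


# Hypothetical sample data, for illustration only.
links = [
    ("Königsberg", "City"),
    ("Prussia", "Country"),
    ("Critique of Pure Reason", "Book"),
    ("Jena", "City"),
]
ekp_types = {"City", "Country"}

# One-degree radial graph: a central subject plus its surrounding node sets.
radial_graph = {
    "subject": "Immanuel Kant",
    "node_sets": build_node_sets(links, ekp_types),
}
```

In this sketch the subject is the single central node and each key of `node_sets` corresponds to one circular node, so the graph never grows beyond one level below the root.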

The radial graph, referred to as a bona fide metaphor for presenting data by some authors [16], was preferred to other visualisation solutions (e.g. tabular, tree, etc.) because it directly mirrors the structure of an EKP (cf. Figure 1) and provides a "compass", deriving from the radial layout, that is meant to make navigation intuitive for the user (cf. requirement R2). The compass summarises and organises knowledge, and users can rely on it (i) to enable exploratory features and (ii) to detail data on demand (cf. requirement R6). In fact, the user can hover with the pointer on the circular nodes of the graph in order to expand its knowledge and gather more and more detailed information. Additionally, the radial graph is interactive: all its elements can be clicked, including the new interface objects that are triggered by navigating it, causing the user to change and enrich the focus of the exploration. The interaction aspect is designed to address requirement R4, i.e. exploratory knowledge discovery. It is worth noting that, when there is much information, radial graphs can easily display messy and overcrowded layouts. However, in Aemoo the radial graph with one-degree connections prevents such situations, as the graph goes only one level below the root and the relationships are unweighted. Additionally, EKPs capture only 10 type paths on average (cf. our paper on EKPs [37]). Thus, in the most frequent case, Aemoo presents a radial graph having a subject entity surrounded by 10 node sets at most. Moreover, radial layouts make it easier to keep the focus on the central area of the graph [9], which in the case of Aemoo includes both the subject entity and the node sets. Hence, the radial graph of Aemoo (which is based on a graph with one-degree connections) organises the entity-centric knowledge in a way that is compact and focused on the visualisation of the relations between a subject entity and its surrounding node sets. For example, a tree layout would scale less easily: in case of many node sets, it would require more horizontal space in order to display the subject entity at the top and all node sets at the same level below it. Similarly, a tabular layout would organise the subject entity and its related node sets in a more scattered way, compared to the simple relations that Aemoo presents between a subject entity and its node sets. We remark that this rationale was intended for the Aemoo case. It cannot be applied to radial graphs with multiple-degree connections.

Figure 2(a) shows an example of a concept map having Immanuel Kant as subject entity. Aemoo splits the interface into two parts: (i) on the center-right side it visualises the concept map and (ii) on the left side it visualises a widget named entity abstract. An entity abstract provides a high-level overview on selected entities, fulfilling requirement R1 along with the concept map. This overview consists of an entity label with its thumbnail, DBPO type and abstract. A user can click on an entity label or on its DBPO type in order to be redirected to their corresponding pages in DBpedia. Similarly, a user can open the Wikipedia page associated with the subject entity by clicking on the link "(go to Wikipedia page)" that Aemoo shows at the end of the abstract (cf. Figure 2(a)).

The concept map is interactive and serves as pre-viously argued as an entity-centric navigation toolfor gathering more detailed information and browsingamong DBpedia entities More in detail the follow-ing elements of the interactive interface are designedto address requirements R3 R4 and R6

– a square box (cf. the arrow with id 1 in Figure 2(b)) is visualised by hovering the pointer on any node set. Such a box contains the list of the entities belonging to a node set. The list is paginated in order to present 10 entities per page. We refer to these boxes as entity boxes. Each entity in an entity box is depicted along with an icon that indicates the provenance (i.e. Wikipedia, Twitter or Google News). These icons are also shown for node sets in order to summarise the provenance of their contained entities. In an early prototype of Aemoo, the content of the entity boxes was visualised by exploding the node sets as new circular nodes. We ran a first set of preliminary tests by asking volunteer users to play with Aemoo and provide open feedback on its early interface prototype. Most of the feedback suggested minor changes (e.g. on the positioning of the abstract and explanations), which we implemented. Nevertheless, a major request emerged from most users: to keep the graph visualising only one-degree relations and to move the content of the sets elsewhere, in order to keep the interface content readable and easier to navigate. In addition to this empirical observation, we claim that this choice is also motivated by the inherent difference between the relation linking the subject entity to the node sets and the relation linking the node sets to the entities that they contain. This is why Aemoo now separates entity boxes from the concept map, which keeps the interface easier to navigate and reflects the different semantics of the visualised relations.

– the relations between the subject entities and the surrounding node sets are always displayed by using an unlabeled edge representing a set of wikilink relations. Aemoo also shows a list of semantic relations extracted from DBpedia that might express the intension of the unlabeled relation. This list is displayed in a tooltip that appears when hovering the pointer on any edge (cf. Section 3.1). Such relations are (i) represented as skos:relatedMatch axioms in the entity-centric model generated by Aemoo for the subject entity (cf. Section 3.1) and (ii) extracted according to their frequencies for the specific path type. The tooltip is used in order to enable the detail-on-demand functionality (cf. requirement R6). For example, Figure 2(a) shows a tooltip for the edge expressing a set of wikilink relations between Immanuel Kant and the node set City. Such a tooltip provides dbpo:birthPlace and dbpprop:placeOfBirth (the prefix dbpprop stands for the namespace http://dbpedia.org/property/) as possible semantic relations for that type of link.

– a list of explanations appears in a widget on the left-bottom of the interface (cf. the arrow with id 2 in Figure 2(b)) by hovering on any entity in an entity box. These explanations provide details-on-demand (cf. requirement R6) and come from the Wikipedia text surrounding the wikilink, as described in Section 3.1. The explanations can be used by humans for understanding the semantics of the relations between the entities visualised by the concept map. For example, the arrow with id 2 in Figure 2(b) points to the explanations of the relations between Immanuel Kant and Königsberg. A user can easily understand that the link represents two relations: Immanuel Kant was born in Königsberg, and there is a statue of Kant in that city. Explanations also have associated icons, which indicate their provenance.

(a) The radial graph displayed by Aemoo contains curiosities about Alan Turing.

(b) The radial graph displayed by Aemoo contains curiosities about Prussia.

Fig. 4. Aemoo displaying curiosities about Alan Turing and Prussia.

– any entity in an entity box is clickable. This enables navigation and detail-on-demand capabilities (cf. requirements R2 and R6). In fact, a user can change the focus (i.e. the subject entity) of the concept map at any time by clicking on any entity in an entity box. When the focus changes, the concept map is rearranged according to the new subject entity and its type by applying the appropriate EKP. Figure 3 shows the situation after some exploration steps that changed the focus to Prussia as subject entity. At the center-bottom of the interface there is the exploratory history (cf. the arrow with id 4 in Figure 3), named breadcrumb. The breadcrumb fulfils both requirements R4 and R5, as it allows users to retrace their exploratory steps at any time and provides them with updated information about their exploratory path.
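Two of the interaction mechanisms listed above, the pagination of entity boxes and the breadcrumb, lend themselves to a compact sketch. The names `paginate` and `Breadcrumb` are hypothetical, and the choice that retracing discards the later steps of the trail is an assumption made for the example; the text does not specify how Aemoo handles it.

```python
def paginate(entities, page, page_size=10):
    """Return the entities shown on a 1-based page of an entity box."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return entities[start:start + page_size]


class Breadcrumb:
    """Exploration history: an ordered trail of visited subject entities."""

    def __init__(self):
        self._trail = []

    def visit(self, entity):
        """Record a change of focus to a new subject entity."""
        self._trail.append(entity)

    def retrace(self, index):
        """Go back to an earlier step (assumption: later steps are dropped)."""
        if not 0 <= index < len(self._trail):
            raise IndexError("no such step in the exploration path")
        self._trail = self._trail[: index + 1]
        return self._trail[-1]

    @property
    def path(self):
        """Copy of the current exploratory path, oldest step first."""
        return list(self._trail)
```

A session that moves from Immanuel Kant to Königsberg and then to Prussia would call `visit` three times; `retrace(0)` would bring the focus back to Immanuel Kant.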

The sources to be used for populating the conceptmap can be chosen by users through a set of check-boxes that appear at the top-right corner of the inter-face (cf Figure 2(b))

A blue link located in the top-center of the interface, under the search bar, allows users to switch to the "curiosities" about a subject entity. When clicking on this link, the knowledge is again arranged in a concept map fashion and enriched with news and tweets, just as it happens for the previous summary, but this time the node sets are selected with a different criterion: they are types of resources that are unusual to be included in the description of, e.g., a country, hence possibly providing insights into what distinguishes, e.g., Prussia from other countries (cf. Section 3.1). We use the same visualisation metaphor as for the presentation of the core knowledge about a subject entity in order to keep the interface coherent and to ensure a smooth interaction between the two views. Figure 4(a) and Figure 4(b) show the radial graphs containing the curiosities about Alan Turing and Prussia, respectively. These graphs report unusual or less common relations between the subject entities and the entity types identified by the node sets, e.g. the relation between Alan Turing and Optic nerve (cf. Figure 4(a)) and that between Prussia and Baltic See (cf. Figure 4(b)).

Table 1 provides a summary of Aemoo UI compo-nents indicating the requirements they are designed toaddress

3.3 Implementation details

Aemoo is released as a RESTful architecture it con-sists of a server side component implemented as aJava-based REST service and a client side componentbased on HTML and JavaScript

Table 1. UI components with respect to design requirements

UI component R1 R2 R3 R4 R5 R6

Concept map ✓ ✓ ✓ ✓

Entity abstract ✓

Entity box ✓ ✓

Explanation widget ✓

Breadcrumb ✓ ✓

The overview of the architecture at the server sideincluding also the components for EKP extraction isdepicted in Figure 5

The architecture is designed by using the Component-based [22] and the REST [17] architectural styles

The Knowledge Pattern (KP) extractor is composedof the following components

– KP extraction coordinator, which takes care of the coordination of the overall extraction process;

– Property path identifier, which is responsible for the identification of type paths;

– Property path storage, which manages the storage of identified paths;

– Property path analyzer, which draws boundaries around paths in order to formalise KPs;

– KP repository manager, which is responsible for the storage, indexing and fetching of KPs.

Aemoo is composed of the following components

– Aemoo coordinator, which coordinates all the activities;

– Identity resolver, which is in charge of resolving a user query with respect to entities in Linked Data;

– KP selector, which selects an appropriate KP according to the entity identified;

– Knowledge filter, which takes care of applying a KP to raw RDF data;

– Knowledge aggregator, which aggregates knowledge from other sources with respect to the KP selected.
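The way these components cooperate can be illustrated with a toy pipeline. This is a sketch under strong assumptions: the actual components are Java OSGi bundles, whereas the function names, the in-memory dictionaries and the identifiers below are all hypothetical and only mirror the roles listed above. The knowledge aggregator is left out for brevity.

```python
def resolve_identity(query, index):
    """Identity resolver: map a user query to an entity identifier (toy lookup)."""
    return index[query.strip().lower()]


def select_kp(entity, entity_types, kps):
    """KP selector: pick the pattern registered for the entity's type."""
    return kps[entity_types[entity]]


def filter_knowledge(entity, links, kp, entity_types):
    """Knowledge filter: keep only links whose target type belongs to the KP."""
    return [
        (source, target)
        for source, target in links
        if source == entity and entity_types.get(target) in kp
    ]


def explore(query, index, entity_types, kps, links):
    """Coordinator: chain resolution, pattern selection and filtering."""
    entity = resolve_identity(query, index)
    kp = select_kp(entity, entity_types, kps)
    return entity, filter_knowledge(entity, links, kp, entity_types)


# Hypothetical toy data, for illustration only.
index = {"immanuel kant": "ex:Immanuel_Kant"}
entity_types = {
    "ex:Immanuel_Kant": "Philosopher",
    "ex:Koenigsberg": "City",
    "ex:Categorical_imperative": "Concept",
}
kps = {"Philosopher": {"City", "Philosopher"}}
links = [
    ("ex:Immanuel_Kant", "ex:Koenigsberg"),
    ("ex:Immanuel_Kant", "ex:Categorical_imperative"),
]

result = explore("Immanuel Kant", index, entity_types, kps, links)
```

Here the link to the entity typed as Concept is discarded because that type is not part of the toy pattern for Philosopher.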

All components are implemented as Java OSGi [53]bundles components and services and some of themcan be accessed through the RESTful interfaces ex-posed by the Aemoo REST provider (ie Aemoo co-ordinator and KP selector) and the KP extractor RESTprovider (ie KP extraction coordinator KP reposi-tory manager)

The client side interacts with the other components via REST interfaces through AJAX. Additionally, it handles the visualization of Aemoo through the JavaScript InfoVis Toolkit,27 a JavaScript library that supports the creation of interactive data visualizations for the Web.

Fig. 5. Overview of the architecture of Aemoo and the EKP extractor.

27 http://thejit.org

4 Evaluation

In Section 1 we hypothesised that EKPs provide intuitive entity-centric summaries. We also hypothesised that EKPs can be exploited for visualising Linked Data in order to help humans in exploratory search tasks. In Section 4.1 we summarise the experimental setup that we defined in [37] and used for assessing the cognitive soundness of EKPs, while in Section 4.2 we describe the experimental setup used for assessing our working hypothesis.

4.1 Cognitive soundness of EKPs

In [37] we carried out a user-based study to assess the cognitive soundness of EKPs. We intended to make EKPs emerge from human consensus and to compare them to those extracted automatically from Wikipedia. In that study we asked 17 participants to indicate the core relevant types of things (object types) that could be used to describe a certain type of things (subject types). For example, for the subject "Country" (such as Germany), core object types can be "Language" (e.g. German), "Country", i.e. other countries with which it borders (e.g. Denmark, Poland, Austria, Czech Republic, Switzerland, France, Luxembourg, Belgium and Netherlands), etc. The participants had different nationalities (Italy, Germany, France, Japan, Serbia, Sweden, Tunisia and Netherlands) and different mother tongues, although they were all fluent in English. Having participants with different nationalities and native languages allowed us to observe whether EKPs are perceived as sound units of meaning independently from one specific language or culture, at least for those represented in our study. Although the multi-cultural and multi-language character of EKPs cannot be assessed with proof,28 the good inter-rater agreement (Kendall's W = 0.68) and the high reliability of the raters (Cronbach's alpha = 0.93) encourage further dedicated study.
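For readers unfamiliar with the two statistics, the following sketch shows how they are conventionally computed. It uses the textbook definitions (Kendall's W without tie correction, Cronbach's alpha with sample variances) and made-up inputs; it is not the analysis script used in [37], and the function names are ours.

```python
from statistics import variance


def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m raters ranking n items.

    rankings -- list of m lists, each holding the ranks 1..n given by one rater
                (assumption: no tied ranks).
    """
    m, n = len(rankings), len(rankings[0])
    totals = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def cronbach_alpha(rows):
    """Cronbach's alpha for a matrix with one row per case and one column per item."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

With these definitions W ranges from 0 (no agreement) to 1 (identical rankings), and alpha does not exceed 1, which is reached when all items vary together perfectly.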

In order to compare the EKPs annotated by humans to the EKPs empirically extracted from Wikipedia, we computed the correlation between the scores assigned by participants and a ranking function named pathPopularityDBpedia, representing the trend of pathPopularity values for Wikipedia EKPs. We recorded high correlation on average (Spearman's ρ = 0.75), meaning that the identified value for t (cf. Section 2) is a stable criterion for determining knowledge boundaries for EKPs. More details can be found in our previous work [37].
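As a reminder of what the reported coefficient measures, Spearman's ρ is the Pearson correlation computed on ranks. The sketch below is a generic implementation with average ranks for ties; the function names are ours and the inputs in the usage note are invented, not the study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

For instance, `spearman_rho(human_scores, path_popularity)` would compare the scores assigned by participants with the popularity values of the same type paths, where both argument names are placeholders for equally long lists.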

4.2 Experimental setup for evaluating Aemoo

In order to assess the validity of our hypothesis(ie EKPs provide intuitive entity-centric summariesthat can support humans during exploratory searchtasks) we carried out a user-based study whose aim istwofold

– evaluating the system usability of Aemoo;

– analysing users' feedback about their interaction with the UI of Aemoo.

For this purpose we defined three tasks involving the look-up, learning and investigation [29] phases that characterise the strategies that humans adopt while exploring the Web. Each task could be undertaken by using one of three tools: Google, RelFinder and Aemoo. The tool to be used was automatically selected by a system built for the evaluation; hence it was not a choice of the participants but was constrained by the experimental setting. Using three tools allowed us to conduct the evaluation of Aemoo as a comparative analysis. Google and RelFinder provided us with two viable and suitable choices to compare with: (i) although Google does not provide an interface specially designed for exploratory search, it is currently the most used exploratory tool on the Web. Users have developed their own methods for exploring and discovering knowledge by using Google, and they are very familiar with its interface. We expected that comparing with Google would give us insights on how Aemoo is perceived as compared to a popular and well known (exploratory) search interface. For this reason Google provides a reference baseline; (ii) RelFinder [14] is a tool supporting visual exploratory search on linked data. It is very popular among Semantic Web experts but less known to general users. It uses a graph visualisation metaphor and gives users the possibility to filter data according to a number of fixed criteria. Comparing with RelFinder allows us to assess both the usability of our visualisation interface with respect to RelFinder's and the effectiveness of the EKP-based relevance criterion for automatic summarisation as compared to manual filtering.

28 It has to be remarked that we had a small number of participants and they were all highly educated, fluent in English and mainly from European countries. We assume that participants from European countries have many cultural and linguistic aspects (as speakers of Indo-European languages) in common.

It has to be noted that the three tools rely on different background data sources (i.e. Google on potentially the whole Web, RelFinder on a number of linked datasets, and Aemoo on DBpedia, Twitter and Google News). This would cause an issue during the analysis of results, as the data gathered by users from the different tools would not be straightforwardly comparable: one may not be able to judge whether a difference in the task solutions is due to a more effective exploration support or to a larger/smaller set of available data sources. In order to make the results comparable, we constrained the tools' background knowledge to Wikipedia (for Google) and to DBpedia (for RelFinder), which constitute the data intersection of the three tools.

The three tasks of the controlled experiment are thefollowing

Task 1 - Summarisation: the participants were asked to build a summary of the subject "Alan Turing" with the information they could get from the tool at hand. They were asked (i) to collect, as quickly as possible, all elements and their relations that could possibly be included in such a summary and record them, and (ii) to score them based on their relevance and unexpectedness. The information had to be recorded as a set of triples <subject, object, description>. For example, the fact that "Alan Turing worked at the University of Manchester" is valid information to include in a summary about Alan Turing. Let us suppose that a participant is able to find this information with Aemoo (e.g. by navigating the node sets of the radial graph having "Alan Turing" as subject entity); then she has to record a triple as follows:

– subject = Alan Turing

– object = University of Manchester

– description = Alan Turing worked at the University of Manchester

Here the subject and the object identify the two elements of the relation found, and the description provides an explanation about the nature of such a relation as it is understood by the participants (e.g. by reading the explanations provided by Aemoo).

For each identified triple, the user has to separately rate its relevance and unexpectedness with respect to the subject, using a 5-point scale ranging from 1 (Irrelevant/banal) to 5 (Relevant/unexpected). Participants were free to start their exploration from any concept, but they were assigned a specific tool to use for exploring information, with a maximum time of 10 minutes for each task. They were instructed to include all possible triples they could get from the tool results, whether interesting, wrong, relevant, obvious, etc., and to rate them accordingly. They were invited to act as if the finalisation of the summary would happen at a later stage, based on their ratings.
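The record that participants produce in Task 1 can be modelled as a small validated data structure. The class below is a hypothetical sketch (it is not part of AemooEval), and the two ratings in the usage note are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedTriple:
    """A <subject, object, description> triple with its two 1-5 ratings."""

    subject: str
    obj: str
    description: str
    relevance: int        # 1 = irrelevant, 5 = relevant
    unexpectedness: int   # 1 = banal, 5 = unexpected

    def __post_init__(self):
        # Reject ratings that fall outside the 5-point scale.
        for name in ("relevance", "unexpectedness"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")
```

The example from the text would be recorded as `RatedTriple("Alan Turing", "University of Manchester", "Alan Turing worked at the University of Manchester", relevance=5, unexpectedness=2)`, where the two ratings are placeholders.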

Task 2 - Related entities: the participants were asked to find as many objects of a certain type as possible which have a relation to the given subject. The actual question was: What are the places related to "Snow White"? Hence the participants had to provide a list of all places related to "Snow White" with the help of the provided tool. Examples are Germany, Albania, Russia, etc.

Task 3 - Relation finding: the participants were asked to find one or more relations between two subjects and to describe their semantics. The actual question was: What is the relation between "Snow White" and "Charlize Theron"? Participants had to report the list of relations they were able to find with the help of the provided tool. Possible elements of such a list could be "acts in the movie", "plays role in the movie", etc.

Each group performed the three tasks (on the same subjects) twice, using two different tools,29 one of which was always Aemoo. On the one hand, performing the same task twice might introduce a bias, as during the second iteration the cognitive load could be reduced due to the previous investigation of the same subject. On the other hand, this procedure was meant to encourage participants to provide feedback based on the self-assessment of their user experience, by also comparing different tools on the same tasks. The feedback was collected by means of open questions at the end of the experiment (cf. Table 2). In order to mitigate the bias possibly affecting the results, the experiment was designed to balance the number of participants using Aemoo as first tool with the number of participants using an alternative system (either Google or RelFinder [25]) as first tool. We recorded the experiment executions by keeping track of the ordering; hence we were able to separate the results collected from the first run of the tasks from those collected from the second run.

29 The first iteration consisting of the three tasks performed with one tool, and the second iteration consisting of the same three tasks performed with a second tool.

At the end of each iteration the participants wereasked to rate ten statements using a five-point Likertscale (from 1 Strongly Disagree to 5 Strongly Agree)and to answer five open questions The ten statementswere those of the System Usability Scale (SUS) [8]The SUS is a well-known metric used for evaluatingthe usability of a system It has the advantage of be-ing technology-independent and reliable even with avery small sample size [46] It also provides a two-factor orthogonal structure which can be used to scorethe scale on independent Usability and Learnability di-mensions [46]
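The standard SUS scoring rule can be written in a few lines. The sketch below follows the usual procedure (odd-numbered statements contribute the response minus 1, even-numbered statements contribute 5 minus the response, and the sum is multiplied by 2.5); the function name is ours.

```python
def sus_score(responses):
    """Compute the System Usability Scale score (0-100) for one questionnaire.

    responses -- the ten answers in questionnaire order, each between 1 and 5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be on a 1-5 scale")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5
```

A participant who answers 3 to every statement obtains 50, while the most favourable pattern (5 on odd items, 1 on even items) yields 100.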

Table 2 reports the five open questions aimed atcollecting feedback (pros and cons) from the partic-ipants about the quality of their experience with Ae-moo The user feedback from this questionnaire hasbeen used to perform a qualitative analysis of Aemoobased on Grounded theory [48] a method often used inSocial Sciences to extract relevant concepts from un-structured corpora of natural language resources (textsinterviews or questionnaires)

We developed an ad-hoc web application named AemooEval,30 for supporting the experiment. The application was designed to allow users to perform their tasks with any of the tools (i.e. Aemoo, Google, RelFinder) without the need to switch context and interface for recording their findings and providing their judgments. Figure 6 shows the interface of the tool during the execution of Task 1 (i.e. Summarisation). The "Show more" button at the top of the page shows or hides a text providing guidelines for undertaking the task. The main body of the interface embeds the tool to be used (Aemoo in the Figure), while the bottom of the interface provides text fields and checkboxes for collecting input from users, e.g. the triple <subject, object, description> for Task 1.

Fig. 6. The interface of the Web-based tool designed and implemented for carrying out the user study.

30 httpwitistccnritswengcgi-bin

For Task 2 and Task 3 the bottom part of the in-terface only includes text fields as no ranking is re-quested AemooEval takes care of managing task iter-ations automatically selecting the tool to be used at acertain iteration by guaranteeing the balance of the al-ternating sequence of tool usage storing user feedbackand metadata (user id iteration time to perform thetask etc) and enforcing the time constraint
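The balancing of tool order can be illustrated with a simple counterbalancing scheme. This is only one possible way to obtain the balance described above; the assignment rule and the function name are assumptions and do not document how AemooEval actually selects tools.

```python
def assign_tools(participant_ids, alternatives=("Google", "RelFinder")):
    """Assign each participant an ordered pair (first tool, second tool).

    Assumed rule: participants alternate between starting with Aemoo and
    starting with an alternative tool, and the alternative tool itself
    rotates every two participants.
    """
    plan = {}
    for position, participant in enumerate(participant_ids):
        other = alternatives[(position // 2) % len(alternatives)]
        if position % 2 == 0:
            plan[participant] = ("Aemoo", other)
        else:
            plan[participant] = (other, "Aemoo")
    return plan


plan = assign_tools(range(32))
```

With 32 participants this rule gives 16 sessions that start with Aemoo and 16 that start with another tool, and each alternative tool is used by 16 participants.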

The three tasks were performed by 32 participants, aged between 20 and 35 years and equally distributed in terms of gender. All the participants were undergraduate students in computer science coming from the University of Bologna in Italy and the University of Paris 13 in France. The participants were divided into 5 groups and supervised by an evaluator who was in charge of supporting them during the experiments. The evaluator provided participants with an introductory description of the experiment's goal and tasks, a brief tutorial about how to use AemooEval, and a brief tutorial on the three tools, i.e. Aemoo, RelFinder and Google.


Fig 7 Answers provided by participants to the questions concerning their background related to the experiment Question labels correspond tothe question numbers in Table 3 Standard deviation values are expressed between brackets and shown as black vertical lines in the chart

Table 2. Open questions for grounded theory-based analysis

Nr  Question
1   How effectively did the system support you in answering to the previous tasks?
2   What were the most useful features of the system that helped you to perform your tasks?
3   What were the main weaknesses that the system exhibited in supporting your tasks?
4   Would you suggest any additional features that would have helped you to accomplish your tasks?
5   Did you (need to) open and read Wikipedia pages when using the system? If yes, please explain the motivation.

In order to assess the background and skills of participants before running the experiments, they were asked to rate31 twelve statements, reported in Table 3, composing a self-assessment questionnaire. Figure 7 shows the averages of scores for each statement and their standard deviations. According to the results of the self-assessment questionnaire, participants confirmed that they had little knowledge of Semantic Web and Linked Data technologies and of DBpedia (statements 6, 11, 12). They declared that they had little experience of exploratory search (statement 1) and no knowledge of RelFinder and Aemoo (statements 2, 3), though, as expected, they were familiar with Google (statement 4). Furthermore, the knowledge about the two subjects explored during the execution of the tasks was comparable among the participants (statements 7, 8), as can be observed from the small standard deviation values (0.18 and 0.16, respectively).

31 Using a 5-point Likert scale from 1 (Strongly Disagree) to 5 (Strongly Agree).

Table 3. Self-assessment questionnaire

Nr  Question
1   I have extensive experience in exploratory search
2   I am an expert user of Aemoo
3   I am an expert user of RelFinder
4   I am an expert user of Google
5   I frequently use the Web to explore information and to perform tasks such as homework, presentations, working analysis, reports
6   I have extensive experience in Semantic Web and Linked Data technologies
7   I have detailed knowledge of Alan Turing
8   I have detailed knowledge of Snow White
9   I know what Wikipedia is
10  I frequently use Wikipedia
11  I know what DBpedia is
12  I frequently use DBpedia

The next section shows and discusses the results ofthe experiments

5 Results and discussion

Results. In order to compare the performance of participants in executing their tasks with the support of the three tools (i.e. Aemoo, RelFinder and Google), we consider the time spent by participants for providing each answer (e.g. each triple in Task 1). Our intuition is that this measure can give us an insight on how well the tools support users in undertaking their tasks, especially when there is a significant difference among the three tools. The observed performance is reported in Figure 8. On average, Aemoo performs better than RelFinder and Google. The better result on average is due to Task 2, for which Aemoo significantly outperforms the other tools, while RelFinder has a slightly better performance in Task 1 and Google in Task 3.

Fig. 8. Number of answers per minute for each task and tool.

(a) SUS scores and standard deviation values for Aemoo, RelFinder and Google. Standard deviation values are expressed between brackets and shown as black vertical lines in the chart.

(b) P-values computed with a Tukey's HSD pairwise comparison of the SUS scores between Aemoo/RelFinder, Aemoo/Google and Google/RelFinder.

Fig. 9. SUS scores (cf. Figure 9(a)) and p-values (cf. Figure 9(b)).
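The measure plotted in Figure 8, the number of answers per minute for each task and tool, can be derived from the recorded sessions as sketched below. The record layout and the pooling of answers and time over all sessions of the same task and tool are assumptions; the figures in the usage note are invented.

```python
from collections import defaultdict


def answers_per_minute(records):
    """Aggregate answer rates per (task, tool) pair.

    records -- iterable of (task, tool, n_answers, seconds_spent) tuples,
               one per participant session (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0.0])
    for task, tool, n_answers, seconds in records:
        totals[(task, tool)][0] += n_answers
        totals[(task, tool)][1] += seconds
    # Convert pooled seconds to minutes and skip pairs with no recorded time.
    return {
        key: answers / (seconds / 60)
        for key, (answers, seconds) in totals.items()
        if seconds > 0
    }
```

For example, two Task 1 sessions with Aemoo that produced 10 and 14 triples in 600 seconds each would give a pooled rate of 1.2 answers per minute.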

A main focus of our experiment is to assess how Aemoo is perceived by participants in terms of its usability. To this aim we computed the SUS for the three tools and show the results in Figure 9(a). We distinguish the results of the first iterations (only considering the questionnaire filled in after performing the tasks the first time) from the results of the second iterations (only considering the questionnaire filled in after the second iteration). We also report the aggregated results of the two iterations. Values between brackets provide standard deviations, and they are also reported in the chart as vertical black bars. SUS values range between 0 (Unusable) and 100 (Usable). Based on empirical studies [46], a SUS score of 68 represents the average usability value for a system. The same work demonstrates that SUS makes it possible to reliably assess the usability of a system even with a small number of participants. Aemoo usability is satisfactory (average > 68); however, we were also interested in investigating its usability in comparison to the other tools.

(a) Learnability scores.

(b) P-values computed with a Tukey's HSD pairwise comparison of the learnability scores between Aemoo/RelFinder, Aemoo/Google and Google/RelFinder.

(c) Usability scores.

(d) P-values computed with a Tukey's HSD pairwise comparison of the usability scores between Aemoo/RelFinder, Aemoo/Google and Google/RelFinder.

Fig. 10. Learnability (cf. Figure 10(a)) and Usability (cf. Figure 10(c)) values with corresponding p-values (cf. Figures 10(b) and 10(d)) measuring significance. Standard deviation values are expressed between brackets and shown as black vertical lines in the chart.

Figure 9(b) shows the p-values computed by using Tukey's HSD (honest significant difference) method, indicating the statistical significance of the pairwise comparison among the three tools. Unfortunately, the evidence is insufficient for claiming the significance of the comparison between Google and Aemoo; however, the data are reported for completeness and for possible use in future work. We have strong evidence (p < 0.01) supporting the hypothesis that Aemoo is more usable than RelFinder (RelFinder is also less usable than Google) if we consider both iterations. This claim is also supported by moderate evidence if we consider the iterations separately (0.01 < p ≤ 0.05).

In addition to the SUS score, we analysed two sub-parameters (Learnability and Usability [28]) for the three systems. The value range for these parameters is [0 (Hard to learn/use), 1 (Easy to learn/use)]. The results of this analysis are reported in Figure 10. Figure 10(a) shows the learnability scores for the three systems and their standard deviations. According to Figure 10(b), the evidence is insufficient for claiming the significance of the comparison; however, we report the results for completeness and for possible use in future investigation. As far as Usability is concerned, Figure 10(c) shows the obtained scores (with their standard deviations). We have strong evidence for claiming that Aemoo is more usable than RelFinder (moderate evidence supports the same claim if we consider the two iterations separately).
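The two sub-scales can be sketched as follows, assuming the commonly used split of the SUS statements in which statements 4 and 10 form the Learnability factor and the remaining eight form the Usability factor. Rescaling each factor to the [0, 1] range by dividing by its maximum is also an assumption, chosen to match the range stated above; the function names are ours.

```python
LEARNABILITY_ITEMS = (4, 10)  # assumed two-factor split of the SUS statements


def item_contribution(item, response):
    """Standard SUS contribution of one statement, between 0 and 4."""
    return response - 1 if item % 2 == 1 else 5 - response


def sus_subscales(responses):
    """Return (learnability, usability), each rescaled to the range [0, 1]."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    contributions = {
        item: item_contribution(item, response)
        for item, response in enumerate(responses, start=1)
    }
    learnability = sum(contributions[i] for i in LEARNABILITY_ITEMS) / 8
    usability = sum(
        value for item, value in contributions.items()
        if item not in LEARNABILITY_ITEMS
    ) / 32
    return learnability, usability
```

The divisors 8 and 32 are the maximum attainable sums for two and eight statements respectively, since each statement contributes at most 4.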

The results of the open questionnaire (cf. Table 2) were analysed by using Grounded Theory [48]. We proceeded first with the open coding and then with the axial coding. The open coding aims at extracting actual relevant sentences, called codes, from the answers. The axial coding rephrases the original codes so as to define conceptual clusters capturing semantic connections among codes. Each conceptual cluster was associated with a priority score (the greater the number of codes feeding a conceptual cluster, the higher its priority) in order to identify the most important issues arising from participants' feedback. Figure 11 shows the results of the open questionnaire, limited to the most frequently mentioned codes (ordinate values report the number of mentions normalised on a scale between 0 and 1).
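The priority score of the conceptual clusters amounts to counting codes per cluster. The sketch below normalises the counts by the largest one so that values fall between 0 and 1; this normalisation is an assumption, since the text does not state which one was applied, and the cluster labels in the usage note are invented.

```python
from collections import Counter


def cluster_priorities(cluster_of_each_code):
    """Map each conceptual cluster to its share of mentions, scaled to (0, 1].

    cluster_of_each_code -- one cluster label per extracted code.
    """
    counts = Counter(cluster_of_each_code)
    if not counts:
        return {}
    top = max(counts.values())
    return {cluster: n / top for cluster, n in counts.items()}
```

For instance, `cluster_priorities(["intuitive layout", "intuitive layout", "slow loading"])` gives the first cluster priority 1.0 and the second 0.5.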

Finally, Figure 12 shows the ratings (based on a Likert scale) that participants provided for judging relevance and unexpectedness of the results provided by the three systems during Task 1 (cf. Section 4.2), considering both iterations. Aemoo (r = 4.11, u = 2.92) performed slightly better than RelFinder (r = 3.97, u = 2.64) and Google (r = 4.05, u = 2.2). Table 4 shows the results by separating the values for the first and the second iterations, respectively (see footnote 32), which confirm the previous ones, with the only difference that Aemoo and RelFinder have nearly the same results for relevance ratings in the first iteration.

Table 4. Relevance and unexpectedness ratings according to participants' feedback, by taking into account the specific iteration. r1 and r2 indicate the relevance values reported during the first and second iteration, respectively. Similarly, u1 and u2 indicate unexpectedness ratings.

Tool        r1     u1     r2     u2
Aemoo       3.8    2.82   4.42   3.02
RelFinder   3.81   2.53   4.13   2.75
Google      4      2.18   4.1    2.22
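As a consistency check on the figures above, the overall ratings quoted in the text coincide with the unweighted mean of the per-iteration values of Table 4, once the values are read as decimals on the Likert scale. The short sketch below reproduces this arithmetic; that the overall ratings were actually obtained this way is an assumption.

```python
# Per-iteration ratings from Table 4: (r1, u1, r2, u2).
TABLE_4 = {
    "Aemoo": (3.8, 2.82, 4.42, 3.02),
    "RelFinder": (3.81, 2.53, 4.13, 2.75),
    "Google": (4.0, 2.18, 4.1, 2.22),
}


def combined_ratings(r1, u1, r2, u2):
    """Average relevance and unexpectedness over the two iterations."""
    return round((r1 + r2) / 2, 2), round((u1 + u2) / 2, 2)


for tool, ratings in TABLE_4.items():
    r, u = combined_ratings(*ratings)
    print(f"{tool}: r = {r}, u = {u}")
```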

Discussion. In information retrieval, accuracy is the typical measure used to evaluate the performance of a system: the higher the precision and recall, the more accurate the system. Unfortunately, exploratory search tasks are typically associated with undefined and uncertain goals [57]. For example, there are a myriad of correct answers that a user can find for providing a summary about the topic "Alan Turing" (cf. Task 1). It is nearly impossible to classify all correct and wrong answers in order to use precision and recall appropriately. Additionally, we are not interested in the accuracy of results as much as we are in evaluating the data visualisation (UI usability) and filtering capabilities (EKPs) of Aemoo.

For this reason, our evaluation focused on (i) the analysis of the time required by participants for completing their tasks, (ii) the SUS, (iii) a grounded analysis of open user feedback, and (iv) user ratings about relevance and unexpectedness of the results provided by the tools.

The time needed by participants for completing the experiment tasks might be biased by a variety of factors that include, for example, users' expertise about a certain topic or their familiarity with the system. However, in our experiment participants turned out (i) not to be Semantic Web experts, (ii) to be unfamiliar with the tools (with the exception of Google), and (iii) to be unfamiliar with the subjects of the tasks (cf. Figure 7). Hence, the observed performances give us a reasonably reliable insight about the effectiveness of the different tools in supporting exploratory search tasks.

32 "r" stands for relevance, "u" stands for unexpectedness. Subscripts indicate the iteration.

Fig. 11. A chart of the most mentioned pros and cons in the open questionnaires.

Fig. 12. Relevance and unexpectedness ratings (and their standard deviations between brackets) according to participants' feedback.

As far as UI usability is concerned, the SUS-based analysis provided us with a good overview of the system performance. Aemoo's usability can be considered average, which is a satisfactory result, especially considering the comparison with RelFinder's usability. Also, the results of the task-driven experiments reinforce the findings related to EKP cognitive soundness, i.e. EKPs provide an effective filtering criterion for building automatic entity summarisations. However, the same results point out the need for further improving Aemoo, including smart mechanisms for supporting exploratory browsing, especially for identifying relations between two or more entities.

We acknowledge significant room for improvement and, in this respect, the grounded analysis (based on the open questionnaire) provides us with insights into the most critical issues to be addressed. In more detail, the main cons reported by users, which we will consider in future development of the system, include the lack of a mechanism to perform comparisons among different entities and the lack of a mechanism for supporting temporary storage (e.g. a basket).

In addition to these, we think that automatic relation finding (provided by RelFinder), if appropriately integrated in the interface, can provide additional value to Aemoo's results, e.g. by indicating when a specific semantic relation is known between two entities. Some participants appreciated the facet-based filtering provided by RelFinder, and a comparable number of them judged it as awkward. This is probably due to the potentially huge number of relations between entities that can be proposed as filtering options to the users. In our opinion, this issue is mainly due to scalability problems that, if addressed appropriately, can turn this functionality into an added value for data visualisation. Based on this observation, we plan to investigate the trade-off between automatic (e.g. based on EKPs in Aemoo) and manual filtering for future integration in Aemoo.

As far as information presentation is concerned, Google turned out to be the best system among the three. Although this may be a fair assessment, it is reasonable to think that this judgement is due to the extreme popularity of its interface. If we consider only Aemoo and RelFinder, systems with which the participants were unfamiliar (cf. Figure 7), Aemoo received a higher number of positive comments concerning information presentation, which further supports the positive outcomes of the EKP assessment analysis and of the SUS analysis. Participants particularly appreciated Aemoo's browsing interface and the way relations among entities were visually presented (cf. Figure 11). Also, users reported positive comments about the visualisation of explanations, which they found useful, indicating that Aemoo succeeds in providing this type of data on demand.

A final aspect worth remarking is the feedback about relevance and unexpectedness, which together provide an indication of the capability of the system to produce serendipitous results. Serendipity can be informally defined as a beneficial discovery that happens in an unexpected way; it has been recently described as unexpected relevance [50]. The intuition is simple: the more a result is at the same time relevant and unexpected, the more it is serendipitous. However, this is a tricky aspect to evaluate due to its strongly subjective character. Furthermore, considering the relatively small population involved in our experiments, we have insufficient evidence for claiming significance. Notwithstanding these limits, the results we obtained are worth reporting and allow us to formulate reasonable speculation on their interpretation. The results of the self-assessment questionnaire (cf. Figure 7) show that all participants declared a comparable level of knowledge about the experiment subjects (i.e. low standard deviation values). This, in addition to the good ratings provided for relevance and unexpectedness, suggests that Aemoo shows a promising behaviour as far as serendipity is concerned, i.e. it was able to provide users with relevant and unexpected results during the experiments.

6 Related work

Many existing solutions for exploring Linked Data are based on semantic mash-up or browsing applications. Examples are [54,25,26], which leverage the semantic relations asserted in the linked datasets without applying any criterion for defining a boundary that could be used for tailoring or contextualizing knowledge. For example, RelFinder [25] provides the visualisation of existing relations between two or more DBpedia entities. These relations can be simple or can include more complex paths. The visualisation of relations can be manually filtered by the user according to criteria of relation length, entity types and property names.

An increasing amount of research has been done on more sophisticated relevance criteria for addressing the lack of a knowledge boundary or the heterogeneity problem when summarising, recommending or browsing Linked Data. Many of these works present novel approaches (see below for a quick overview); however, to the best of our knowledge, none of them leverages ontology patterns (i.e. EKPs), which is the main contribution that distinguishes our approach from existing ones.

Entity summarisation. In [51] a diversity-aware algorithm called DIVERSUM is presented. It generates graphical entity summaries extracted from Wikipedia. Differently from Aemoo, the algorithm selects triples related to an entity by measuring both their relevance and diversity, and it is aimed at providing the most important triples about an entity with the highest coverage of diverse information. Similarly to Aemoo, DIVERSUM visualises entity summaries by using a radial neighbourhood graph that can be compared to a concept map. Although the algorithm shows good results, the approach is only focused on the selection of triples to be presented in a graph, and there is no support for user exploration, e.g. explanations or interactive graphs. The approach proposed by RELIN [10] differs from Aemoo because it computes entity summaries by using a variant of the random surfer model, which is based on two kinds of actions, i.e. relational move and informational jump, that follow non-uniform probability distributions. It leverages the relatedness and informativeness of description elements for ranking entity triples using patterns. A similar approach is presented by SUMMARUM [52], which uses the PageRank algorithm. PageRank is computed in order to assign relevance scores to the triples having as subject a certain DBpedia entity. Both RELIN and SUMMARUM rely on a PageRank-like algorithm for building entity summaries, while the summarisation performed by Aemoo is pattern-based.
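To make the notion of a PageRank-like score concrete, the sketch below implements the textbook random surfer model with uniform jumps by power iteration over a small link graph. It is neither the RELIN variant (which uses non-uniform relational moves and informational jumps) nor the SUMMARUM implementation; the damping factor, the iteration count and the toy graph in the usage note are assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    graph: mapping from a node to the list of nodes it links to.
    Nodes without outgoing links spread their rank uniformly.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the uniform "jump" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, ())
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

On a toy graph in which "Bletchley_Park" and "Enigma_machine" both link to "Alan_Turing", and "Alan_Turing" links back to "Enigma_machine", the entity with the most incoming links obtains the highest score, and the scores sum to 1.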

Browsing Linked Data. [14] presents a detailed discussion about the challenges and the requirements to consume and visualise Linked Data, along with an analysis of the state of the art about existing solutions and systems that support the presentation, either textual or visual, of Linked Data. According to the classification provided by [14], Aemoo can be considered a visualisation-based approach to exploring Linked Data. Similarly to existing solutions such as [5,1,25,23,6,41,55], Aemoo is designed to enable the consumption of Linked Data by lay users by providing visualization, filtering, data overview, data merging and browsing capabilities in order to reduce the cognitive load and support humans in knowledge exploration and discovery (cf. Section 3). Differently from systems like [5,25,55], Aemoo does not provide users with explicit filtering widgets for enabling, for example, faceted browsing of data, because the filtering mechanism is transparent to the user and relies on EKPs. Additionally, Aemoo filters, aggregates and visualises data from a different perspective, i.e. it focuses on the structure of wikilinks, which are untyped hyperlinks among Wikipedia pages that reflect the way things are intuitively connected by humans.

Other systems that rely on faceted browsing for filtering results include Yovisto [56], which is a platform that provides exploratory capabilities specialised in academic lecture recordings and conference talks, and Visor [43]. Visor facilitates the navigation process by introducing a multi-pivot paradigm, which allows users to identify key elements in the data space, called pivots. Different filtering solutions are proposed by the Discovery Hub [30] and LED [32], which mainly differ from Aemoo because the user's exploratory path is used for computing results at each exploratory step. Hence, the filtering mechanism does not depend on fixed schemas, such as EKPs in Aemoo. In fact, the Discovery Hub uses a spreading activation algorithm for weighting an origin entity and consequently propagating the weights to its neighbours, while LED exploits users' queries in order to create tag clouds aimed at suggesting related knowledge to users during exploratory search tasks.
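The general idea of spreading activation can be sketched as follows: an origin entity receives an initial weight, and at each step a decayed fraction of the weight of the active nodes is split among their neighbours. This is a generic, simplified formulation and not the algorithm implemented by the Discovery Hub; the decay factor, the number of steps, the firing threshold and the toy graph in the usage note are all assumptions.

```python
def spread_activation(graph, origin, decay=0.5, steps=2, threshold=0.01):
    """Propagate activation from `origin` through `graph`.

    graph: mapping from a node to the list of its neighbours.
    Returns a mapping from each reached node to its accumulated activation.
    """
    activation = {origin: 1.0}
    frontier = {origin: 1.0}
    for _ in range(steps):
        next_frontier = {}
        for node, value in frontier.items():
            neighbours = graph.get(node, ())
            if not neighbours:
                continue
            # Each neighbour receives an equal, decayed share.
            share = value * decay / len(neighbours)
            if share < threshold:
                continue  # too weak to fire
            for neighbour in neighbours:
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + share
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier
    return activation
```

Starting from "Alan_Turing" with two neighbours, each neighbour receives 0.25 after the first step, and a node reachable only through one of them receives 0.125 after the second step.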

The radial visualisation used by Aemoo is, in general, a well-known visualisation metaphor in the literature [16,23,2,24]. Aemoo uses the radial visualisation for rendering concept maps deriving from EKPs, which provide a cognitively sound and intuitive way of representing knowledge [36]. There are systems that allow users to build graphical concept maps, such as Cmap (footnote 33) [35]; nevertheless, they are not directly comparable with Aemoo, as they aim at supporting users in designing and editing concept maps, which is different from the task addressed by Aemoo (i.e. exploratory search and entity summarisation).

EKPs can be compared to Fresnel lenses [42], which provide a solution for defining implementation-independent templates for data presentation and are used by some state-of-the-art systems, e.g. DBpedia Mobile [5], Marbles (footnote 34) or IsaViz [41]. However, EKPs were conceived as units of meaning to be used not only for data presentation but, more generally, for data exchange. EKPs also provide a relevance criterion for automatically filtering data, while Fresnel lenses only support presentation design. In fact, Fresnel lenses provide more details about how to actually present data. A possible future development for Aemoo is to design a mapping between EKPs and Fresnel that supports customised presentation interfaces for EKP-based filtered data.

Recommending Linked Data. To some extent, the EKP-based filtering of Aemoo can be abstracted as a recommendation mechanism. Although recommender systems are not directly comparable to Aemoo, for the sake of completeness we report here some relevant work in this area.

MORE [33] leverages DBpedia, Freebase and LinkedMDB in order to recommend movies. It computes similarities between movies thanks to an adaptation of the Vector Space Model (VSM). Seevl [39] is a recommendation system that provides personalised access to and exploration of a knowledge base about music facts, which is created by exploiting DBpedia. The core of the system is an algorithm called DBrec, which computes the relatedness among entities of the knowledge base by looking at shared relations, both incoming and outgoing. The authors in [27] present a recommendation system for retrieving music related to a point of interest (POI). The system exploits a spreading activation algorithm in order to weight the relatedness between musicians and POIs in DBpedia.

33 http://cmap.ihmc.us
34 http://mes.github.io/marbles
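As a reminder of how similarity is typically computed in a Vector Space Model, the sketch below represents each item as a sparse vector of weighted features and compares two items by cosine similarity. It shows the plain VSM only, not the adaptation used by MORE; the feature names in the usage note are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors.

    a, b: mappings from a feature (e.g. a property-value pair) to its weight.
    Returns 0.0 when either vector is empty or all-zero.
    """
    dot = sum(weight * b.get(feature, 0.0) for feature, weight in a.items())
    norm_a = sqrt(sum(weight * weight for weight in a.values()))
    norm_b = sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two movies described by {"genre:drama": 1.0, "director:X": 1.0} and {"genre:drama": 1.0} share one of two features, which gives a similarity of 1/sqrt(2); movies with no feature in common obtain 0.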

7 Conclusions and future work

This paper presents a novel approach for Linked Data exploration, which uses Encyclopedic Knowledge Patterns (EKPs) as relevance criteria for selecting, organising and visualising knowledge. EKPs were discovered by mining the linking structure of Wikipedia. A system called Aemoo has been implemented for supporting EKP-driven exploration as well as integration of data coming from heterogeneous resources, namely static (i.e. DBpedia and Wikipedia) and dynamic knowledge (i.e. Twitter and Google News).

Our work is grounded on two working hypotheses: (i) the cognitive soundness of EKPs and (ii) their effectiveness in summarising and aggregating knowledge to address exploratory search tasks.

The first hypothesis was validated by means of a user study aimed at making EKPs emerge from human consensus and at comparing them to those automatically extracted from Wikipedia. Results show high values of correlation, combined with good inter-rater agreement and very high reliability of human raters.

The second hypothesis was validated by means of controlled, task-driven user experiments in order to assess Aemoo's usability and ability to provide relevant and serendipitous information as compared to two existing tools, Google and RelFinder.

Currently we are working on several extensions. Examples include:

– improving the automatic interpretation of hypertext links by hybridizing NLP with Semantic Web techniques. In this respect, we have recently obtained very good results by designing a novel Open Knowledge Extraction (OKE) approach and its implementation, called Legalo (footnote 35) [45], which performs unsupervised, open-domain and abstractive knowledge extraction from text for producing directly usable machine-readable information;

– providing visual analytics interfaces that compare different entities having the same type;

– providing different views on the same entity by allowing users to change the applied lens, i.e. EKP;

– adding a basket functionality, which allows users to save the summary data of their exploration in RDF;

– integrating the EKP-based approach with user profiles for boundary creation.

References

[1] Auer, S., Doehring, R., Dietzold, S., 2010. Less - template-based syndication and presentation of linked data. In: Aroyo, L., et al. (Ed.), The Semantic Web: Research and Applications (ESWC 2010). Vol. 6089 of Lecture Notes in Computer Science. Springer, pp. 211–224.

[2] Baldassarre, C., Daga, E., Gangemi, A., Gliozzo, A. M., Salvati, A., Troiani, G., 2010. Semantic scout: Making sense of organizational knowledge. In: Cimiano, P., Pinto, H. S. (Eds.), Knowledge Engineering and Management by the Masses (EKAW 2010). Vol. 6317 of Lecture Notes in Computer Science. Springer, pp. 272–286.

[3] Barker, K., Porter, B., Clark, P., 2001. A Library of Generic Concepts for Composing Knowledge Bases. In: Proceedings of the 1st International Conference on Knowledge Capture (K-Cap 1997). ACM Press, New York, pp. 14–21.

[4] Barsalou, L. W., 1999. Perceptual symbol systems. Behavioral and brain sciences 22 (04), 577–660.

[5] Becker, C., Bizer, C., 2008. DBpedia Mobile: A Location-Enabled Linked Data Browser. In: Bizer, C., et al. (Ed.), Workshop on Linked Data on the Web (LDOW 2008). Vol. 369 of CEUR Workshop Proceedings. CEUR-WS.org, pp. 81–82.

[6] Berners-Lee, T., Chen, Y., Chilton, L., Connolly, D., Dhanaraj, R., Hollenbach, J., Lerer, A., Sheets, D., 2006. Tabulator: Exploring and analyzing linked data on the semantic web. In: Proceedings of the 3rd International Semantic Web User Interaction Workshop. Vol. 2006. pp. 1–16.

[7] Berners-Lee, T., Hendler, J., Lassila, O., May 2001. The Semantic Web. Scientific American 284 (5), 34–43.

[8] Brooke, J., 1996. SUS - a quick and dirty usability scale. Usability evaluation in industry 189 (194), 4–7.

[9] Burch, M., Konevtsova, N., Heinrich, J., Hoeferlin, M., Weiskopf, D., Dec. 2011. Evaluation of traditional, orthogonal, and radial tree diagrams by an eye tracking study. Visualization and Computer Graphics, IEEE Transactions on 17 (12), 2440–2448.

35 http://wit.istc.cnr.it/stlab-tools/legalo


[10] Cheng, G., Tran, T., Qu, Y., 2011. Relin: Relatedness and informativeness-based centrality for entity summarization. In: Aroyo, L., et al. (Ed.), International Semantic Web Conference (1). Vol. 7031 of Lecture Notes in Computer Science. Springer, pp. 114–129.

[11] Clark, P., Thompson, J., Porter, B., 2000. Knowledge Patterns. In: Cohn, A. G., Giunchiglia, F., Selman, B. (Eds.), KR2000: Principles of Knowledge Representation and Reasoning. Morgan Kaufmann, San Francisco, pp. 591–600.

[12] Constantin, A., Peroni, S., Pettifer, S., Shotton, D., Vitali, F., 2015. The Document Components Ontology (DoCO). To appear in Semantic Web – Interoperability, Usability, Applicability.

[13] Craik, K., 1943. The Nature of Explanation. Cambridge University Press.

[14] Dadzie, A.-S., Rowe, M., 2011. Approaches to visualising Linked Data: A survey. Semantic Web 2 (2), 89–124.

[15] Di Iorio, A., Nuzzolese, A. G., Peroni, S., Shotton, D., Vitali, F., 2014. Describing bibliographic references in RDF. In: Castro, Alexander García, et al. (Ed.), Proceedings of 4th Workshop on Semantic Publishing (SePublica 2014). Vol. 1155 of CEUR Workshop Proceedings. CEUR-WS.org, pp. 41–56.

[16] Draper, G. M., Livnat, Y., Riesenfeld, R. F., 2009. A survey of radial methods for information visualization. Visualization and Computer Graphics, IEEE Transactions on 15 (5), 759–776.

[17] Fielding, R. T., 2000. REST: architectural styles and the design of network-based software architectures. Doctoral dissertation, University of California, Irvine.

[18] Fillmore, C., 1968. The case for the case. In: Bach, E., Harms, R. (Eds.), Universals in Linguistic Theory. Rinehart and Winston, New York, pp. 21–119.

[19] Fillmore, C. J., 1976. Frame semantics and the nature of language. Annals of the New York Academy of Sciences 280 (1), 20–32.

[20] Gallese, V., Metzinger, T., 2003. Motor ontology: the representational reality of goals, actions and selves. Philosophical Psychology 16 (3), 365–388.

[21] Gangemi, A., Presutti, V., 2010. Towards a Pattern Science for the Semantic Web. Semantic Web 1 (1-2), 61–68.

[22] Garlan, D., Monroe, R. T., Wile, D., 2000. Acme: Architectural description of component-based systems. In: Leavens, G. T., Sitaraman, M. (Eds.), Foundations of Component-Based Systems. Cambridge University Press, pp. 47–68.

[23] Goyal, S., Westenthaler, R., 2004. RDF Gravity (RDF Graph Visualization Tool). Salzburg Research, Austria.

[24] Hastrup, T., Cyganiak, R., Bojars, U., 2008. Browsing linked data with fenfire. In: Proceedings of the WWW2008 Workshop on Linked Data on the Web, LDOW 2008, Beijing, China, April 22, 2008. Vol. 369. pp. 1–2.

[25] Heim, P., Hellmann, S., Lehmann, J., Lohmann, S., Stegemann, T., 2009. RelFinder: Revealing Relationships in RDF Knowledge Bases. In: Proceedings of the 3rd International Conference on Semantic and Media Technologies (SAMT). Vol. 5887 of Lecture Notes in Computer Science. Springer, pp. 182–187.

[26] Heim, P., Ziegler, J., Lohmann, S., 2008. gFacet: A browser for the web of data. In: Auer, S., Dietzold, S., Lohmann, S., Ziegler, J. (Eds.), Proceedings of the International Workshop on Interacting with Multimedia Content in the Social Semantic Web (IMC-SSW'08). CEUR-WS, pp. 49–58.

[27] Kaminskas, M., Fernández-Tobías, I., Ricci, F., Cantador, I., 2012. Knowledge-based music retrieval for places of interest. In: Liem, C., et al. (Ed.), Proceedings of the 2nd International Workshop on Music Information Retrieval with User-Centered and Multimodal Strategies (MIRUM 2012). ACM, pp. 19–24.

[28] Lewis, J. R., Sauro, J., 2009. The Factor Structure of the System Usability Scale. In: Kurosu, M. (Ed.), HCI (10). Vol. 5619 of Lecture Notes in Computer Science. Springer, pp. 94–103.

[29] Marchionini, G., Apr. 2006. Exploratory search: from finding to understanding. Commun. ACM 49 (4), 41–46.

[30] Marie, N., Gandon, F., Ribière, M., Rodio, F., 2013. Discovery hub: on-the-fly linked data exploratory search. In: Sabou, M., Blomqvist, E., Noia, T. D., Sack, H., Pellegrini, T. (Eds.), I-SEMANTICS. ACM, pp. 17–24.

[31] Minsky, M., 1975. A Framework for Representing Knowledge. In: Winston, P. (Ed.), The Psychology of Computer Vision. McGraw-Hill, pp. 211–281.

[32] Mirizzi, R., Di Noia, T., 2010. From exploratory search to web search and back. In: Proceedings of the 3rd workshop on Ph.D. students in information and knowledge management. PIKM '10. ACM, New York, NY, USA, pp. 39–46.

[33] Noia, T. D., Mirizzi, R., Ostuni, V. C., Romito, D., Zanker, M., 2012. Linked open data to support content-based recommender systems. In: Presutti, V., Pinto, H. S. (Eds.), I-SEMANTICS. ACM, pp. 1–8.

[34] Novak, J. D., 1977. A theory of education. Cornell University Press, Ithaca, NY.

[35] Novak, J. D., Cañas, A. J., 2006. The theory underlying concept maps and how to construct and use them, research report 2006-01 Rev 2008-01. Florida Institute for Human and Machine Cognition.

[36] Novak, J. D., Gowin, D. B., 1984. Learning how to learn. Cambridge University Press.

[37] Nuzzolese, A. G., Gangemi, A., Presutti, V., Ciancarini, P., 2011. Encyclopedic Knowledge Patterns from Wikipedia Links. In: Aroyo, L., Noy, N., Welty, C. (Eds.), Proceedings of the 10th International Semantic Web Conference (ISWC 2011). Springer, pp. 520–536.

[38] Nuzzolese, A. G., Presutti, V., Gangemi, A., Musetti, A., Ciancarini, P., 2013. Aemoo: Exploring knowledge on the web. In: Proceedings of the 5th Annual ACM Web Science Conference. ACM, pp. 272–275.

[39] Passant, A., 2010. dbrec - Music Recommendations Using DBpedia. In: Proceedings of the 9th International Semantic Web Conference (ISWC 2010). Springer, pp. 209–224.

[40] Peroni, S., Shotton, D., 2012. FaBiO and CiTO: Ontologies for describing bibliographic resources and citations. J. Web Sem. 17, 33–43.

[41] Pietriga, E., 2003. IsaViz: A visual authoring tool for RDF. World Wide Web Consortium. [Online]. Available: http://www.w3.org/2001/11/IsaViz

[42] Pietriga, E., Bizer, C., Karger, D. R., Lee, R., 2006. Fresnel: A Browser-Independent Presentation Vocabulary for RDF. In: The Semantic Web - ISWC 2006, 5th International Semantic Web Conference, ISWC 2006, Athens, GA, USA, November 5-9, 2006, Proceedings. pp. 158–171. URL http://dx.doi.org/10.1007/11926078_12

[43] Popov, I. O., Schraefel, M. M. C., Hall, W., Shadbolt, N., 2011. Connecting the dots: A multi-pivot approach to data exploration. In: Aroyo, L., et al. (Ed.), International Semantic Web Conference (1). Vol. 7031 of Lecture Notes in Computer Science. Springer, pp. 553–568.


[44] Presutti, V., Aroyo, L., Adamou, A., Schopman, B. A. C., Gangemi, A., Schreiber, G., 2011. Extracting core knowledge from linked data. In: Hartig, O., Harth, A., Sequeda, J. (Eds.), COLD. Vol. 782 of CEUR Workshop Proceedings. CEUR-WS.org, pp. 37–48.

[45] Presutti, V., Consoli, S., Nuzzolese, A. G., Reforgiato Recupero, D., Gangemi, A., Bannour, I., Zargayouna, H., 2014. Uncovering the semantics of Wikipedia pagelinks. In: Janowicz, K., Schlobach, S., Lambrix, P., Hyvönen, E. (Eds.), Knowledge Engineering and Knowledge Management. Vol. 8876 of Lecture Notes in Computer Science. Springer International Publishing, pp. 413–428.

[46] Sauro, J., 2011. A practical guide to the system usability scale: Background, benchmarks & best practices. Measuring Usability LLC.

[47] Shneiderman, B., 1996. The eyes have it: A task by data type taxonomy for information visualizations. In: VL. pp. 336–343.

[48] Strauss, A., Corbin, J., 1998. Basics of Qualitative Research: Techniques and Procedures for developing Grounded Theory. Sage Publications, Inc.

[49] Suchanek, F., Kasneci, G., Weikum, G., 2008. Yago - A Large Ontology from Wikipedia and WordNet. Elsevier Journal of Web Semantics 6 (3), 203–217.

[50] Sun, T., Zhang, M., Mei, Q., 2013. Unexpected relevance: An empirical study of serendipity in retweets. In: Kiciman, E., Ellison, N. B., Hogan, B., Resnick, P., Soboroff, I. (Eds.), ICWSM. The AAAI Press, pp. 1–10.

[51] Sydow, M., Pikula, M., Schenkel, R., 2013. The notion of diversity in graphical entity summarisation on semantic knowledge graphs. J. Intell. Inf. Syst. 41 (2), 109–149.

[52] Thalhammer, A., Rettinger, A., 2014. Browsing DBpedia entities with summaries. In: The Semantic Web: ESWC 2014 Satellite Events. Springer, pp. 511–515.

[53] The OSGi Alliance, March 2010. OSGi Service Platform Release 4 Version 4.2, Enterprise Specification. Committee specification, Open Services Gateway initiative (OSGi).

[54] Tummarello, G., Cyganiak, R., Catasta, M., Danielczyk, S., Delbru, R., Decker, S., 2010. Sig.ma: Live views on the web of data. Web Semantics: Science, Services and Agents on the World Wide Web 8 (4), 355–364. Semantic Web Challenge 2009: User Interaction in Semantic Web research.

[55] Tvarozek, M., Bieliková, M., 2010. Factic: Personalized Exploratory Search in the Semantic Web. In: Benatallah, B., et al. (Ed.), Proceedings of the 10th International Conference on Web Engineering (ICWE 2010). Vol. 6189 of Lecture Notes in Computer Science. Springer, pp. 527–530.

[56] Waitelonis, J., Sack, H., 2012. Towards exploratory video search using linked data. Multimedia Tools Appl. 59 (2), 645–672.

[57] White, R. W., Kules, B., Bederson, B. B., 2005. Exploratory search interfaces: categorization, clustering and beyond: report on the XSI 2005 workshop at the Human-Computer Interaction Laboratory, University of Maryland. SIGIR Forum 39 (2), 52–56.


sic units of meaning for addressing both the heterogeneity and the knowledge boundary problems. More recently, we introduced Encyclopedic Knowledge Patterns [37] (EKPs) as a particular kind of KP that is empirically discovered by mining the linking structure of Wikipedia pages. EKPs provide knowledge units that can answer the following competency question:

What are the most relevant entity types (T1, ..., Tn) that are involved in the description of an entity of type C?

For example, the EKP for describing philosophers should possibly include types such as book, university and writer, because a philosopher is typically linked to these entity types. We assume that EKPs are cognitively sound because they emerge from the largest existing multi-domain knowledge source collaboratively built by humans with an encyclopedic task in mind. Whether Wikipedia article writing and hyperlinking confirm this intuition for each entity type in its ontology is an empirical matter, and we have performed data mining and experiments [37] to this aim, proving that automatically extracted EKPs can be trusted as relevant configurations of data for a given entity type.
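The competency question can be illustrated with a deliberately simplified sketch: given typed entities and untyped links between them, it measures, for each object type T, the fraction of entities of type C that link to at least one entity of type T, and keeps the types above a cut-off. This only conveys the intuition; it is not the extraction method of [37], and the function name, the threshold value and the toy data in the usage note are assumptions.

```python
from collections import defaultdict


def extract_ekp(links, types, subject_type, threshold=0.5):
    """Return {object type: popularity} for the given subject type.

    links: iterable of (source entity, target entity) pairs (untyped links).
    types: mapping from an entity to its type.
    Popularity of T is the share of subject-type entities linking to
    at least one entity of type T; types below `threshold` are dropped.
    """
    subjects = [entity for entity, t in types.items() if t == subject_type]
    if not subjects:
        return {}
    linking_subjects = defaultdict(set)
    for source, target in links:
        if types.get(source) == subject_type and target in types:
            linking_subjects[types[target]].add(source)
    popularity = {
        object_type: len(sources) / len(subjects)
        for object_type, sources in linking_subjects.items()
    }
    return {t: p for t, p in popularity.items() if p >= threshold}
```

With two philosophers that both link to a book, while only one links to a city and only one to a university, a threshold of 0.6 keeps Book (popularity 1.0) and discards City and University (popularity 0.5 each).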

In this paper we exploit EKPs for designing a system that helps humans to address summarisation and discovery tasks. These tasks can be classified as exploratory search tasks, as they involve different phases, i.e. look-up, learning and investigation [29], that characterise the strategies that humans adopt while exploring the Web. For example, consider a student who is asked to build a concept map about a topic for her homework. She starts by looking up specific terms in a keyword-based search engine (e.g. Google), then she moves through search results and hyperlinks in order to investigate the available information about the topic, finally learning new knowledge she can use to address her task.

Hypotheses. The work presented in this paper is grounded on two working hypotheses: (i) EKPs provide a unifying view as well as a relevance criterion for building entity-centric summaries, and (ii) they can be exploited effectively for helping humans in exploratory search tasks.

Contribution. The contributions of this paper are the following:

– a detailed description of the Aemoo (footnote 1) visual exploratory search system, based on the extraction of EKPs from Wikipedia and their evaluation (formerly explained in [37]). We briefly introduced Aemoo in [38], whereas in this paper we provide an exhaustive description of the system and the method used for applying EKPs in order to provide entity summarisation and knowledge aggregation;

– an original evaluation of Aemoo by means of controlled, task-driven user experiments in order to assess its usability and ability to provide relevant and serendipitous information.

It is worth mentioning that there are state-of-the-art systems that provide semantic mash-up and browsing capabilities, such as [54,25,26]. However, they mostly focus on presenting linked data coming from different sources and visualising it in interfaces that mirror the linked data structure. Instead, Aemoo organises and filters the retrieved knowledge in order to show only relevant information to users, and provides the motivation for why a certain piece of information is included. The rest of the paper is organised as follows: Section 2 presents a method for extracting EKPs from Wikipedia; Section 3 presents Aemoo, a system based on EKPs for knowledge exploration; Section 4 describes the experiments we conducted for evaluating the EKPs and the system; Section 5 discusses evaluation results, limits and possible solutions to improve the system; Section 6 presents the related work; finally, Section 7 summarises the contribution and illustrates future work.

2 Encyclopedic Knowledge Patterns

A general formal theory for Knowledge Patterns (KPs) does not exist yet. Different independent theories have been developed so far, and KPs have been proposed with different names and flavours across different research areas, such as linguistics [18], artificial intelligence [31,3], cognitive sciences [4,20] and, more recently, the Semantic Web [21]. As discussed in [21], it is possible to identify a shared meaning for KPs across those different theories, as "a structure that is used to organize our knowledge, as well as for interpreting, processing or anticipating information".

1 http://www.aemoo.org





















































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References



























































































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References






































































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References
































































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References























































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References














































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References







































































Entity abstract 3

Entity box 3 3


Breadcrumb 3 3




















4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References

































































4 Evaluation


27httpthejitorg



















of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References














































































of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References

































































of Manchester






























search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References
















































































search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References











































































search







































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References





















































































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References














































































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References






































































Tool r1 u1 r2 u2

Aemoo 38 282 442 302RelFinder 381 253 413 275

Google 4 218 41 222


















6 Related work





























References











































































6 Related work





























References




































































6 Related work





























References























































































References












































































References
































































































































Documents

IOS Press Aemoo: Linked Data Exploration based on ...pdfs.semanticscholar.org/3388/fb0d11652e372fa29cdd... · into a text, or into a conversation, all of the others are automatically