http://latc-project.eu
D1.6.1 Interface Definitions for 24/7 Platform
Project GA No. FP7-256975 Project acronym LATC Start date of project 2010-09-01 Document due date 2011-02-28 Actual date of delivery 2011-02-28 Lead Partner DERI Reply to Michael Hausenblas, [email protected] Document status FINAL
FP7-256975 LOD Around The Clock (LATC)
Project GA No. FP7-256975 Project acronym LATC Project full title Linking Open Data Around The Clock Dissemination level PU Number of pages 17 Task responsible DERI Other contributors All partners Author(s) Michael Hausenblas, Richard Cyganiak EC Project Officer Stefano Bertolo Keywords platform, design, interfaces, API, linksets
Table of Contents
Executive Summary
24/7 Platform Goals and Design
  1.1 Why Linking?
  1.2 Target User Community
  1.3 Datasets now and then
  1.4 Software as a Service vs. Software Distribution
  1.5 Design Assumptions and Goals
24/7 Platform User Roles
  1.6 End-user Roles
  1.7 Administrator Roles
Interface Definitions
  1.8 24/7 Platform Components
    1.8.1 Workbench
    1.8.2 Metadata Store (MDS)
    1.8.3 Data Source Inventory (DSI)
    1.8.4 Console API
    1.8.5 Console
    1.8.6 Crawler & Indexer
    1.8.7 Runtime
  1.9 Interfaces
    1.9.1 I1: Workbench – Console API
    1.9.2 I2: Workbench – MDS
    1.9.3 I3: Workbench – Crawler & Indexer
    1.9.4 I4: Console API – MDS
    1.9.5 I5: Console API – Runtime
    1.9.6 I6: MDS – Runtime
    1.9.7 I7: MDS – Crawler & Indexer
  1.10 Components Interplay
    1.10.1 Link Generation
    1.10.2 Quality Assurance
  1.11 Dependencies and External Interfaces
Appendix A
Executive Summary
This deliverable motivates the LATC 24/7 Platform design and design goals. It defines the Platform scope as well as the target users. The 24/7 Platform components are introduced, the interfaces between the components are defined and the workflow to generate links using the 24/7 Platform is described. This deliverable establishes an initial understanding of the components and the interfaces and is refined over the project's runtime in various deliverables (highlighted in the respective places).

24/7 Platform Goals and Design
The LATC 24/7 Interlinking Platform (24/7 Platform, in short) produces linksets [1] based on two major inputs: i) link specifications, and ii) datasets. In the following we describe the why, what, and how of the 24/7 Platform.
1.1 Why Linking?
Linked Data enables lightweight and straightforward data integration scenarios. This is mainly achieved through providing explicit, typed connections between entities in different datasets. An application using different datasets can pull related entities directly (based on the links) from different datasets to solve a certain task. In contrast, with datasets that are not interlinked, one quite often has to use out-of-band information to integrate entities.
In a sense, the Linked Data ecosystem, including the LOD cloud, indexers, etc., is an example of a Data Space Support Platform (DSSP [2]). One of the distinct features of Linked Data is its inherent support for data discovery. By following the links in the LOD cloud, one is able to explore new, related data items that can in turn be integrated, if desired. The type of the link can be used to determine if and how to integrate the “target entity”.
1.2 Target User Community
Experience shows that designing a system without a concrete user (group) in mind is counterproductive. We have hence identified the primary user group of the 24/7 Platform:

“The community of people with an interest in or that deal with EU level data”

This group, referred to as EU data users in the following, includes:
- Original data owners: in general all European institutions and agencies that have data (in all forms, including PDFs, spreadsheets, etc.), such as Eurostat, EEA, etc.
- Application developers: people who develop (Web) applications and want to benefit from the available Linked Open Data.
- Data analysts: people who want to use the Linked Open Data to discover relations between entities or perform experiments with it, for example journalists or researchers.
- Linked Data enthusiasts: early adopters, Semantic Web researchers and practitioners, software developers and data engineers who contribute to the infrastructure and carry out schema-level and data-level tasks (quality, verification, exploration, etc.).

[1] http://www.w3.org/2001/sw/interest/void/#linkset
[2] http://portal.acm.org/citation.cfm?id=1107502
In addition to the above, we note that EU country-level initiatives for publishing Linked Open Data (for example, as reported in “Technical workshop on the goals and requirements for a pan-European data portal” [3]) are in scope for the 24/7 Platform, especially in EU member countries that already have an active community, such as the UK. The latter is important, as early adopters and testers will very likely be recruited from the pool of people who already have experience with Linked Open Data and can provide suggestions regarding new features and optimisation.
Furthermore, we expect interlinking to happen not only between EU-level datasets, but also from and to the country-level datasets. For example, in the statistical domain, both Eurostat and national statistics bodies provide respective datasets, making them a logical target of mutual interlinking.
Naturally, the usage of the 24/7 Platform is not limited to EU data users; it is, however, primarily designed to be used by them.
1.3 Datasets now and then
In the realm of the primary 24/7 Platform user group identified above, the EU data users, we expect the datasets to develop as depicted in Fig. 1.

Figure 1 – EU level dataset development.
[3] http://cordis.europa.eu/fp7/ict/content-knowledge/docs/report-ws-pan-eu-dat-porta_en.pdf
From the LATC point of view, this means:
- Before project start (end of 2010), a few, mainly experimental LOD datasets, such as Eurostat, are available.
- In late 2012, at project end, more than 20 new LOD datasets in the EU-level data area have been made available through LATC WP2, as well as some more directly by the original data owners (with LATC support, for example the PUBLINK programme together with the LOD2 project).
- In 2020, the majority of EU data owners publish their data into the LOD cloud; nevertheless, the LATC-provided datasets still act as a backbone.
1.4 Software as a Service vs. Software Distribution
The 24/7 Platform can be understood as Software as a Service (SaaS) or as a software distribution, that is, individual components (as discussed below) with well-defined interfaces to interact. In general, Cloud Computing is typically divided into three layers (Fig. 2).

Figure 2 – Cloud Computing categorisation.

The 24/7 Platform is in this sense a SaaS, providing users the functionality to produce links between LOD datasets, with all the advantages that come along with it, including scalability, reliability and convenience.
In addition to the generic SaaS attributes, the 24/7 Platform provides (in contrast to the software distribution) some specific advantages that stem from the integrated Sindice Crawler & Indexer and the bespoke way datasets are used in the Platform.
1.5 Design Assumptions and Goals
The primary goal of the 24/7 Platform is to support the user community defined above, the EU data users. A number of secondary goals exist, which include but are not limited to:
- Provide a demonstration and verification of the Software Distribution.
- Enlarge the Linked Open Data cloud.
- Craft high-quality, global linksets for general use.
- Establish incentives for others to publish their data, raise awareness, and sharpen the link quality concept.

The key design goals for the 24/7 Platform, in order to allow the EU data users to create and use linksets, are:
1. Make the 24/7 Platform easy to use, hence posing a low entry barrier.
2. Provide a fast and tight feedback loop for link generation.
3. Keep the coupling between components to a minimum.
4. Ensure quality through separation into personal and public workspaces.
24/7 Platform User Roles
We have identified two kinds of roles in the 24/7 Platform: LATC end-users and LATC administrators. The former are typically representatives of the EU data users group, the latter come from within the LATC project.

1.6 End-user Roles
We further differentiate the end-user role into:
- Link Consumer, a casual end-user who is mainly interested in using the links produced in the 24/7 Platform, and
- Link Author, a type of power user who both produces links and is interested in using their own links as well as links produced by others.

1.7 Administrator Roles
The LATC admin role is sub-divided into:
- Operator, focusing on the overall monitoring, maintenance and function of the 24/7 Platform, including minimising down-times, (re)starting and upgrading components, and managing users.
- Linkset Reviewer, performing the Quality Assessment by reviewing generated linksets and vetting them accordingly.
Interface Definitions
In the following, the 24/7 Platform components, their interaction and overall interplay are explained, along with the user roles defined above.

1.8 24/7 Platform Components
We briefly summarise the function of each component here. Where applicable, screenshots of the current state of UI components are provided. An overview of the 24/7 Platform is depicted in Fig. 3, showing all components conceptually, the system boundaries as well as the external interactions.

Figure 3 – 24/7 Platform overview.
1.8.1 Workbench
The LATC Workbench allows creating link specifications and is typically used by a Link Author. It is a specialised version of the Silk Workbench, operated by FUB. The Workbench provides both a UI component and a backend component to handle Reference Linksets. A Link Author constructs one or more link tasks in the Workbench and typically uses Reference Linksets to assess the quality of the links produced: to enable this, the Workbench operates a local version of Silk, allowing the Link Author to preview a generated Linkset. The current state of the Workbench is shown in Fig. 4 and Fig. 5.
Figure 4 – The LATC Workbench: workspace.

Figure 5 – The LATC Workbench: editor.
1.8.2 Metadata Store (MDS)

The LATC Metadata Store (MDS) is the central hub for all dataset (DS) and Linkset (LS) metadata in the 24/7 Platform. It is a backend component, operated by TALIS, and deals with:
- List of curated datasets (C-DS) from CKAN
- List of host-based datasets (H-DS) from Sindice
- Sindice-coverage statistics for datasets
- Metadata for generated Linksets, including precision, recall and a pointer to the Reference Linkset.

Internally, the MDS uses VoID [4] to represent DS/LS metadata and to take the C-DS via CKAN into account. It is assumed that C-DS are maintained entirely via CKAN.
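
The document does not state how the precision and recall stored in the MDS are computed. As a minimal sketch, assuming the usual definitions against the known-correct links of a Reference Linkset (the function name and the sample link pairs are hypothetical), the two measures could be derived as follows:

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # (source URI, target URI); hypothetical representation


def precision_recall(generated: Set[Link], reference: Set[Link]) -> Tuple[float, float]:
    """Compare generated links with the correct links of a reference linkset.

    Assumption: 'reference' holds only links known to be correct.
    """
    correct = generated & reference
    # Precision: share of generated links that are confirmed by the reference.
    precision = len(correct) / len(generated) if generated else 0.0
    # Recall: share of reference links that were actually generated.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Illustrative data only, not taken from a real link run.
generated = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:9")}
reference = {("a:1", "b:1"), ("a:2", "b:2"), ("a:4", "b:4"), ("a:5", "b:5")}
precision, recall = precision_recall(generated, reference)
```

Here two of the three generated links are confirmed and two of the four reference links are found.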
[4] http://www.w3.org/2001/sw/interest/void/
The MDS provides a feed of vetted linksets to other components and also to external users.
Additionally, the MDS acts as the backend for the Data Source Inventory (DSI). The DSI and the MDS communicate via an internal protocol, not in scope of this document.
1.8.3 Data Source Inventory (DSI)

The LATC Data Source Inventory (DSI) is a UI component operated by TALIS. It supports the following use cases:
- Allows Link Authors to find datasets to link against.
- Enables a Link Author to study example resources in order to decide how to write a link specification or whether a link specification is feasible.
- Helps a Link Consumer to find interesting LATC-generated Linksets.
- Notifies a Link Consumer about re-generated Linksets via feeds.
- Enables any user to explore all available datasets.
An early version of the DSI is shown in Fig. 6.

Figure 6 – The LATC Data Source Inventory (DSI).

The M12 deliverable D1.2.1 First Deployment of Data Source Inventory will detail the MDS and DSI as introduced here.
1.8.4 Console API

The LATC Console API controls the execution of link tasks towards the Runtime and acts as an intermediary towards the Workbench. It is a backend component, operated by VUA, and deals with:
- A list of link tasks to be executed
- The status of the link runs
Additionally, the Console API acts as the backend for the Console; these two components communicate via an internal protocol, not in scope of this document.
The Console API exposes an HTTP API [5] that supports the following operations:

Link Tasks
  GET api/tasks
    Returns the ordered list of configuration files to run. The answer is a JSON array with entries indicating the UUID (“identifier”) and full name (“title”) of the configuration file.
  GET api/task/{UUID}/configuration
    Returns the XML configuration file associated with this UUID.
  PUT api/task/{UUID}/configuration
    Updates the item with a new configuration file whose content is passed in a form variable “configuration”.
  DELETE api/task/{UUID}
    Deletes a configuration file. When executed, the file is removed from the database and from the running queue. All the associated reports are also deleted.
  GET api/task/{UUID}
    Returns extra information about the configuration with the indicated UUID. This currently consists of the title (“title”), a long description (“description”), the identifier (“identifier”) and the position in the processing queue (“position”).
  POST api/tasks
    Proposes an XML file for addition. The file is passed as a multi-part form element with the name “fileToUpload”. Upon insertion, an upload report is automatically generated and the configuration is added to the end of the queue.

Link Runs
  GET api/task/{UUID}/notifications
    Returns a JSON array of reports for the configuration under UUID.
  POST api/task/{UUID}/notifications
    Creates a new report. The API expects a form with the parameters “message” and “severity”. An optional JSON array can be stored in the variable “data”. The date is automatically set to the date and time at which the report was uploaded.
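
To make the operations above concrete, the following is a minimal sketch of how a client might construct some of these requests with the Python standard library. It only builds request objects and sends nothing. The class name, the base URL, the severity value "info" and the choice of URL-encoded form bodies are assumptions for illustration; the document only states that a form with the named parameters is expected.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, List, Optional


class ConsoleApiRequests:
    """Builds (but does not send) requests for the Console API paths listed above."""

    def __init__(self, base_url: str) -> None:
        # Normalise so that paths such as "api/tasks" can simply be appended.
        self.base_url = base_url.rstrip("/") + "/"

    def list_tasks(self) -> urllib.request.Request:
        return urllib.request.Request(self.base_url + "api/tasks", method="GET")

    def get_configuration(self, uuid: str) -> urllib.request.Request:
        url = f"{self.base_url}api/task/{uuid}/configuration"
        return urllib.request.Request(url, method="GET")

    def delete_task(self, uuid: str) -> urllib.request.Request:
        return urllib.request.Request(f"{self.base_url}api/task/{uuid}", method="DELETE")

    def post_notification(
        self, uuid: str, message: str, severity: str, data: Optional[List[Any]] = None
    ) -> urllib.request.Request:
        fields = {"message": message, "severity": severity}
        if data is not None:
            # The optional "data" variable carries a JSON array.
            fields["data"] = json.dumps(data)
        # Assumption: the form is sent URL-encoded.
        body = urllib.parse.urlencode(fields).encode("utf-8")
        return urllib.request.Request(
            f"{self.base_url}api/task/{uuid}/notifications",
            data=body,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            method="POST",
        )


# Hypothetical base URL and task identifier.
api = ConsoleApiRequests("http://example.org/LATC-console/")
report = api.post_notification("1234", "done", "info", data=[1, 2])
```

A request built this way could be sent with `urllib.request.urlopen(report)` against a running Console API.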
1.8.5 Console
The LATC Console is the main access point for an Operator, providing status information about the 24/7 Platform, including health, link runs, errors, quality measures, etc., and control options for link tasks. The Console is a UI component operated by VUA. The current version of the Console is shown in Fig. 7.
[5] Note that the URIs are formatted according to the IETF draft on ‘URI Templates’, see http://tools.ietf.org/id/draft-gregorio-uritemplate
Figure 7 – The LATC Console.

1.8.6 Crawler & Indexer

The Sindice Crawler & Indexer is a backend component operated by DERI. It provides access to host-based datasets (H-DS) and is described separately in the M6 deliverable D1.1 Deployment of Crawler and Indexer Module.
1.8.7 Runtime
The LATC Runtime is a backend component operated by DERI. The Runtime uses a Silk MapReduce version and Hadoop. It takes a list of link tasks and produces Linksets along with metadata (in VoID) as well as log information, collectively known as the link run.
The M12 deliverable D1.2.1 First Deployment of Linking Engine will detail the Runtime as introduced here.
1.9 Interfaces
In order to function, a number of components in the 24/7 Platform need to communicate with each other via a defined interface. An interface in this context is a defined communication exchange between two components. The initial interfaces as of the time of writing are captured in Table 1.

X … not applicable    – … not defined    Ik … defined interface k

                   Workbench  DSI  Console  Console API  MDS  Runtime  Crawler & Indexer
Workbench          X          –    –        I1           I2   –        I3
DSI                X          X    –        –            –    –        –
Console            X          X    X        –            –    –        –
Console API        X          X    X        X            I4   I5       –
MDS                X          X    X        X            X    I6       I7
Runtime            X          X    X        X            X    X        –
Crawler & Indexer  X          X    X        X            X    X        X

Table 1 – Component Coupling Matrix.
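
As an illustration, the defined entries of the coupling matrix can be encoded as a small lookup structure. This is only a sketch: the names `COMPONENTS`, `INTERFACES` and `interface_between` are hypothetical, and pairs are treated as unordered because the matrix fills only its upper triangle.

```python
from typing import Optional

COMPONENTS = [
    "Workbench", "DSI", "Console", "Console API",
    "MDS", "Runtime", "Crawler & Indexer",
]

# Defined interfaces from Table 1, keyed by the unordered pair of components.
INTERFACES = {
    frozenset({"Workbench", "Console API"}): "I1",
    frozenset({"Workbench", "MDS"}): "I2",
    frozenset({"Workbench", "Crawler & Indexer"}): "I3",
    frozenset({"Console API", "MDS"}): "I4",
    frozenset({"Console API", "Runtime"}): "I5",
    frozenset({"MDS", "Runtime"}): "I6",
    frozenset({"MDS", "Crawler & Indexer"}): "I7",
}


def interface_between(first: str, second: str) -> Optional[str]:
    """Return the interface label for two components, or None if none is defined."""
    for name in (first, second):
        if name not in COMPONENTS:
            raise ValueError(f"unknown component: {name}")
    return INTERFACES.get(frozenset({first, second}))
```

For example, `interface_between("MDS", "Workbench")` yields `"I2"`, while a pair marked "–" in the table, such as DSI and Runtime, yields `None`.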
1.9.1 I1: Workbench – Console API

The Workbench submits a list of link tasks through the Console API and learns about the link runs via an Atom feed.

1.9.2 I2: Workbench – MDS

The Workbench requires metadata of the datasets to be linked, including name and access methods (SPARQL endpoint, etc.). This dataset metadata is provided by the MDS. The Workbench uses SPARQL to query the MDS, which provides the metadata expressed in VoID.

1.9.3 I3: Workbench – Crawler & Indexer

For the preview of the Linksets, the Workbench needs access to the content of H-DS, provided through the Sindice SPARQL endpoint.

1.9.4 I4: Console API – MDS

The Console API needs the following information from the MDS, provided via SPARQL:
- Access information, e.g., SPARQL endpoint location
- Link run statistics, including precision/recall
- Dataset modification status

1.9.5 I5: Console API – Runtime
The Runtime retrieves a list of link tasks via the Console API and sends status information per link run, again using the Console API.
Upon execution, the LATC Runtime checks the Console API for link tasks that require execution. If any link tasks require execution, the Runtime retrieves the link tasks from the Console API, and launches appropriate Silk MapReduce jobs. The Runtime takes care of parallelizing jobs where possible. Upon job completion, the Runtime posts a response back to the Console API, and triggers any further required actions (such as loading the results into the Linkset API, etc.).

[Diagrams for I1 to I5 omitted: Workbench – Console API (link task, LRN feed); Workbench – Metadata Store (DS desc); Workbench – Sindice Crawler & Indexer (DS); Console API – Metadata Store (status); Console API – LATC Runtime (link task list, status).]
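
The interaction described for I5 can be sketched as a simple polling routine. This is an illustrative outline only, not the Runtime's actual implementation (which is written in Java and runs Silk MapReduce jobs): the function and parameter names are hypothetical, the severity values "info" and "error" are assumptions, the jobs run sequentially rather than in parallel, and all communication is injected as callables so that the sketch can run without a Console API or Hadoop.

```python
from typing import Callable, Dict, Iterable, List


def run_pending_tasks(
    fetch_tasks: Callable[[], Iterable[Dict[str, str]]],
    fetch_configuration: Callable[[str], str],
    run_job: Callable[[str], int],
    post_status: Callable[[str, str, str], None],
) -> List[str]:
    """Fetch pending link tasks, run each one, and report a status per link run."""
    completed = []
    for task in fetch_tasks():
        uuid = task["identifier"]
        try:
            spec_xml = fetch_configuration(uuid)
            link_count = run_job(spec_xml)  # stands in for a Silk MapReduce job
        except Exception as error:
            post_status(uuid, f"link run failed: {error}", "error")
            continue
        post_status(uuid, f"{link_count} links generated", "info")
        completed.append(uuid)
    return completed


# Stubs standing in for the Console API and the job launcher.
reports = []


def fake_job(spec_xml: str) -> int:
    if "broken" in spec_xml:
        raise ValueError("invalid XML")
    return 42


done = run_pending_tasks(
    fetch_tasks=lambda: [
        {"identifier": "t1", "title": "good spec"},
        {"identifier": "t2", "title": "bad spec"},
    ],
    fetch_configuration=lambda uuid: "<Silk/>" if uuid == "t1" else "broken",
    run_job=fake_job,
    post_status=lambda uuid, message, severity: reports.append((uuid, severity)),
)
```

Injecting the four callables keeps the control flow testable in isolation; a real deployment would back them with the HTTP operations of the Console API.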
1.9.6 I6: MDS – Runtime

The Runtime might need dataset access information (such as SPARQL endpoint locations for C-DS) and needs to know from the MDS whether datasets were modified. This information is retrieved via SPARQL. Further, the Runtime informs the MDS about newly generated Linksets via VoID/SPARQL Update.

1.9.7 I7: MDS – Crawler & Indexer

The MDS retrieves a list of H-DS and the coverage of H-DS via the Sindice API.
1.10 Components Interplay
In the following, the overall workflow and components interplay to generate a Linkset is explained. A core principle is that the dataset URIs of the datasets involved in the link tasks are passed around in the 24/7 Platform. This ensures that all components have a shared understanding of the datasets and can pull the relevant information for their tasks from the MDS, if needed.

1.10.1 Link Generation

A Link Author (LA) either creates a dataset or decides to interlink existing datasets from the LOD cloud. The LA either manually enters the dataset into CKAN (turning it into a C-DS) or lets it be indexed via Sindice/Sitemaps [6] (H-DS). Then, the LA uses the 24/7 Platform to create links:

1. The LA selects the datasets to be linked from the DSI.
2. The LA creates a Link task in the personal workspace.
3. The LA either creates Reference Linksets in the Workbench or uploads existing Reference Linksets into the Workbench.
4. The LA previews Linksets and the quality assessment based on the Reference Linkset in the Workbench.
5. The Workbench transfers the link task to the Console.
6. The Console instructs the Runtime of pending link tasks and receives the status of performed link runs.
7. The Console notifies the Workbench about link runs.

[6] http://sindice.com/developers/publishing
[Diagrams for I6 and I7 omitted: Metadata Store – LATC Runtime (LS updates, DS metadata); Metadata Store – Sindice Crawler & Indexer (H-DS list, coverage).]
1.10.2 Quality Assurance

The LATC Description of Work mentions a Quality Assurance (QA) Module. The experience gathered with the experimental 24/7 Platform set up in the first six months has shown that it is more realistic that several components together perform QA; this is what we call the internal QA.
In addition, in what is now known as the external QA, a number of approaches (based on discovering dead links, examining the Billion Triple Challenge datasets, game-based discovery, etc.) will be used to enhance the internal QA. This will be detailed in the M12 deliverable D1.4.1 First Deployment of QA Module.
Typically, after having produced a Linkset in the personal workspace, a Link Author wants to make the Linkset available to the wider public. This is where the internal QA kicks in: if a Link Author decides to submit his or her Linksets to the public workspace, the following happens:

1. The Link Author submits a previously generated Linkset to the public workspace.
2. The Linkset Reviewer uses the Workbench to gather a list of pending Linksets for the public workspace.
3. The Linkset Reviewer uses the MDS to mark a Linkset as vetted. Only vetted Linksets are shown in the DSI.
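
The three steps above amount to a small state progression for each Linkset. The following sketch models it under assumed names (`LinksetStatus`, `submit`, `pending`, `vet`, `visible_in_dsi` and the example URIs are all hypothetical); the document does not prescribe how the Workbench or the MDS store this status.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class LinksetStatus(Enum):
    PERSONAL = "personal"  # lives in the Link Author's personal workspace
    PENDING = "pending"    # submitted to the public workspace, awaiting review
    VETTED = "vetted"      # approved by a Linkset Reviewer


@dataclass
class Linkset:
    uri: str
    status: LinksetStatus = LinksetStatus.PERSONAL


def submit(linkset: Linkset) -> None:
    """Step 1: the Link Author submits a Linkset to the public workspace."""
    if linkset.status is not LinksetStatus.PERSONAL:
        raise ValueError("only personal Linksets can be submitted")
    linkset.status = LinksetStatus.PENDING


def pending(linksets: List[Linkset]) -> List[Linkset]:
    """Step 2: the Linkset Reviewer gathers the pending Linksets."""
    return [ls for ls in linksets if ls.status is LinksetStatus.PENDING]


def vet(linkset: Linkset) -> None:
    """Step 3: the Linkset Reviewer marks a pending Linkset as vetted."""
    if linkset.status is not LinksetStatus.PENDING:
        raise ValueError("only pending Linksets can be vetted")
    linkset.status = LinksetStatus.VETTED


def visible_in_dsi(linksets: List[Linkset]) -> List[Linkset]:
    """Only vetted Linksets are shown in the DSI."""
    return [ls for ls in linksets if ls.status is LinksetStatus.VETTED]


first = Linkset("http://example.org/linkset/1")
second = Linkset("http://example.org/linkset/2")
submit(first)
review_queue = pending([first, second])
vet(first)
shown = visible_in_dsi([first, second])
```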
1.11 Dependencies and External Interfaces
There are three external interfaces provided and/or used by the 24/7 Platform:
- The CKAN API, used by the MDS. DERI collaborates with the OKF directly and via LOD2 to ensure the necessary information is provided in a sustainable way.
- The LOD cloud, which is crawled and indexed by Sindice as described in the M6 deliverable D1.1 Deployment of Crawler and Indexer Module.
- The Linkset API, where the linksets themselves are provided, which will be part of Sindice (the metadata in form of VoID descriptions is maintained by the MDS), also to ensure that the produced linksets are indexed in turn.
Appendix A
The interfaces listed above have been derived from an experimental setup of the 24/7 Platform. This setup was performed within the first six months of the LATC project and is, along with observations, described in the following.
The experimental LATC Runtime (available via the LATC SourceForge repository [7]) is written in Java and packaged as a JAR file. It communicates with the Console (also available via the SF repo and online [8]) through an HTTP API for getting Silk link specification files and posting link run results. The MapReduce job runs on a Hadoop Distributed File System (HDFS), producing linksets and linkset metadata [9].
The LATC Runtime has the following inputs:
- Link specification: the Silk link specification in XML format, which is obtained from the Console.
- Blacklist file: a list of link specifications that should not be selected for execution (time-outs or failures).
- A configuration that can be provided as a file or via the command line.

The output of the Runtime is threefold: a linkset, a respective description of the linkset in VoID and a link run status report. There are two kinds of status reports:
- The success report contains how many links were generated as well as the base URI of the linkset.
- The failed report mentions the reason why a link run could not generate links, such as SPARQL endpoint down or timeout, HDFS problem or invalid XML.

At present, the specification files are provided by the Console, covering 22 files, where 12 files were executed successfully, 8 files failed due to the SPARQL endpoint and invalid specification files, and four files took too long to generate the linkset. The average time for executing a link run is 326.75 seconds.

Example Runtime configuration file:
  HADOOP_PATH = hadoop-0.20.2
  HDFS_USER = xxx
  LATC_CONSOLE_HOST = http://fspc409.few.vu.nl/LATC-console/
  LINKS_FILE_STORE = links.nt
  RESULTS_HOST = http://demo.sindice.net/latctemp
  RESULT_LOCAL_DIR = results
  SPEC_FILE = spec.xml
  VOID_FILE = void.ttl
Example Runtime blacklist file:
  climb_silk_link_spec
  db-geolinkd-boris
  dbpedia_drugbank_drugs
  dbpedia-lgd_city
  dbpedia-lgd_city2
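
As an illustration of how these two inputs might be consumed, the sketch below parses a KEY = VALUE configuration and a blacklist, then filters candidate link specifications. It is written in Python for brevity (the Runtime itself is Java), and it assumes one entry per line in both files and that blacklist entries match link specification names; the function names are hypothetical.

```python
from typing import Dict, List, Set


def parse_config(text: str) -> Dict[str, str]:
    """Parse 'KEY = VALUE' lines; blank lines and lines without '=' are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def parse_blacklist(text: str) -> Set[str]:
    """Collect the non-empty lines as names of link specifications to skip."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def selectable(spec_names: List[str], blacklist: Set[str]) -> List[str]:
    """Keep only the link specifications that are not blacklisted."""
    return [name for name in spec_names if name not in blacklist]


config = parse_config(
    "HADOOP_PATH = hadoop-0.20.2\n"
    "LATC_CONSOLE_HOST = http://fspc409.few.vu.nl/LATC-console/\n"
    "SPEC_FILE = spec.xml\n"
    "VOID_FILE = void.ttl\n"
)
blacklist = parse_blacklist("climb_silk_link_spec\ndbpedia-lgd_city\ndbpedia-lgd_city2\n")
to_run = selectable(["dbpedia-lgd_city", "dbpedia-lgd_island"], blacklist)
```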
[7] http://sourceforge.net/projects/latc/
[8] http://fspc409.few.vu.nl/LATC-console/
[9] http://demo.sindice.net/latctemp
Example VoID file, describing the generated linkset (for the dbpedia-lgd_island.xml link specification):
  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix owl: <http://www.w3.org/2002/07/owl#> .
  @prefix void: <http://rdfs.org/ns/void#> .
  @prefix : <#> .

  :dbpedia a void:Dataset;
      void:sparqlEndpoint <http://live.dbpedia.org/sparql/>;
      .

  :linkedgeodata a void:Dataset;
      void:sparqlEndpoint <http://linkedgeodata.org/sparql/>;
      .

  :dbpedia2linkedgeodata a void:Linkset ;
      void:linkPredicate owl:sameAs;
      void:target :dbpedia;
      void:target :linkedgeodata ;
      void:triples 9139;