The presentation shows which datasets have been converted to RDF and interlinked within the LATC EU project. In particular, it shows the typical conversion process for one example dataset - the EU financial transparency system.
Jens Lehmann AKSW Group, University of Leipzig
6 June 2012
Realising and Exploiting the EU Data Cloud
European Data Forum, Copenhagen, Denmark
Dataset Presentations
EU-Level Dataset Development
List of LATC Datasets
Categories: Business, Legal, Institutions
FTS (EU finance)
Eur-Lex (European Law)
EuroStat (Statistical Data)
CORDIS (EU projects, finance)
N-Lex (National Law)
Institution List
Euraxess (EU jobs, companies)
Taxation & Customs
EU Who is Who
EURES (EU jobs)
EU Patent Office
EU Barometer
EC Competition (market overview)
EU Agencies
European Election Results
eSBN (eBusiness solutions)
PreLex (inter-institutional law)
European Parliament Media
UNODC (drugs & crime statistics)
European Central Bank Statistics
Other: Eventseer, Sciencewise
Total: 22 datasets
http://latc-project.eu/datasets/
Financial Transparency System
Step 1: Analysing the Dataset
The Financial Transparency System (FTS) contains information about 110,000+ EU grants
Contains beneficiaries, amount of funding, year, responsible department, country etc.
Covers years 2007 – 2010
Originally published in HTML, XML and CSV
Financial Transparency System
Step 2: Modelling the Data in RDF and OWL
Michael Martin, Claus Stadler, Philipp Frischmuth, Jens Lehmann: Increasing the Financial Transparency of European Commission Project Funding. Semantic Web Journal (under review)
Financial Transparency System
Step 3: Converting the Dataset
Java classes generated automatically from XML Schema
XML data accessible as Java Objects → script based transformation
High flexibility for data cleansing and special cases
Source code of transformation
● https://github.com/AKSW/FTS-EC-2-RDF/
Pipeline (diagram): XSD → JAXB → Java classes; XML → JAXB → Java objects → transformation → RDF
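The conversion idea (bind XML records to typed objects, clean them, then emit RDF) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's JAXB-based Java code; the XML layout, the `http://example.org/fts/` namespace and all property names are hypothetical and do not reflect the real FTS schema or ontology.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical namespace and XML layout, for illustration only.
FTS = "http://example.org/fts/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

SAMPLE_XML = """
<commitments>
  <commitment id="c1" year="2009">
    <beneficiary>Example University</beneficiary>
    <country>de</country>
    <amount>125000.50</amount>
  </commitment>
</commitments>
"""


@dataclass
class Commitment:
    """Typed object standing in for a schema-generated class."""
    id: str
    year: str
    beneficiary: str
    country: str
    amount: str


def parse_commitments(xml_text):
    """Bind each <commitment> element to a Commitment object."""
    root = ET.fromstring(xml_text)
    return [
        Commitment(
            id=el.get("id"),
            year=el.get("year"),
            beneficiary=el.findtext("beneficiary").strip(),
            # Simple cleansing step: normalise country codes to upper case.
            country=el.findtext("country").strip().upper(),
            amount=el.findtext("amount").strip(),
        )
        for el in root.iter("commitment")
    ]


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(c):
    """Serialise one Commitment as a list of N-Triples lines."""
    s = f"<{FTS}commitment/{c.id}>"
    return [
        f"{s} <{RDF_TYPE}> <{FTS}Commitment> .",
        f'{s} <{FTS}year> "{c.year}"^^<{XSD}gYear> .',
        f'{s} <{FTS}beneficiaryName> "{escape_literal(c.beneficiary)}" .',
        f"{s} <{FTS}country> <{FTS}country/{c.country}> .",
        f'{s} <{FTS}amount> "{c.amount}"^^<{XSD}decimal> .',
    ]


triples = [t for c in parse_commitments(SAMPLE_XML) for t in to_ntriples(c)]
print("\n".join(triples))
```

Working on objects rather than raw XML is what makes per-field cleansing and special cases easy to express, which is the flexibility the slide refers to.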
Financial Transparency System
Step 4: Publishing the Dataset
Landing Page, Linked Data, SPARQL endpoint, browser at http://fts.publicdata.eu via OntoWiki
Metadata: Datahub (http://thedatahub.org)
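Once a SPARQL endpoint is published, clients can query it over plain HTTP. The sketch below only builds the request following the SPARQL protocol (GET with a `query` parameter) and does not send it; the `/sparql` path on `fts.publicdata.eu` is an assumption, since the slide gives only the host name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path; the slides only name the host fts.publicdata.eu.
ENDPOINT = "http://fts.publicdata.eu/sparql"


def build_sparql_request(endpoint, query):
    """Build a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would execute the query, provided the endpoint is reachable at that address.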
Financial Transparency System
Step 5: Enriching the Dataset
Linking with LIMES (http://limes.aksw.org)
Link targets:
● LinkedGeoData: cities
● DBpedia: cities, countries, years, schema
Geo-Coding of beneficiaries on city and address level – 45k coordinates
Metadata: author, license, source, statistics using Dublin Core, VoID, Data Cube
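The linking step can be pictured as matching labels between two datasets and emitting `owl:sameAs` links above a similarity threshold. The sketch below is a naive illustration with `difflib` from the standard library; it is not the LIMES algorithm (which avoids comparing every pair), and the URIs, labels and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical sample data: URI -> city label.
fts_cities = {
    "http://example.org/fts/city/1": "Leipzig",
    "http://example.org/fts/city/2": "Kobenhavn",
    "http://example.org/fts/city/3": "Atlantis",
}
dbpedia_cities = {
    "http://dbpedia.org/resource/Leipzig": "Leipzig",
    "http://dbpedia.org/resource/Copenhagen": "København",
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_links(source, target, threshold=0.8):
    """Return (source URI, target URI, score) for the best match per source."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = max(
            ((t_uri, similarity(s_label, t_label)) for t_uri, t_label in target.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            links.append((s_uri, best_uri, best_score))
    return links


links = find_links(fts_cities, dbpedia_cities)
for s_uri, t_uri, score in links:
    print(f"<{s_uri}> <http://www.w3.org/2002/07/owl#sameAs> <{t_uri}> .  # score {score:.2f}")
```

The threshold trades precision against recall: a source city with no good counterpart (here the made-up "Atlantis") simply yields no link.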
Financial Transparency System
Step 6: Queries, Applications, Visualisation
RDF version allows:
● Find organisations with highest funding
● Compare funding across countries / beneficiaries
● Compare funding per year and country (from FTS) with gross domestic product (from DBpedia) – see next slide
→ overall increases transparency and may serve as input for research policy strategies
Financial Transparency System
SELECT * {
  {
    SELECT ?ftsyear ?ftscountry (SUM(?amount) AS ?funding) {
      ?com rdf:type fts-o:Commitment .
      ?com fts-o:year ?year .
      ?year rdfs:label ?ftsyear .
      ?com fts-o:benefit ?benefit .
      ?benefit fts-o:detailAmount ?amount .
      ?benefit fts-o:beneficiary ?beneficiary .
      ?beneficiary fts-o:country ?country .
      ?country owl:sameAs ?ftscountry .
    }
    GROUP BY ?ftsyear ?ftscountry
  }
  {
    SELECT ?dbpcountry ?gdpyear ?gdpnominal {
      ?dbpcountry rdf:type dbp-o:Country .
      ?dbpcountry dbp-p:gdpNominal ?gdpnominal .
      ?dbpcountry dbp-p:gdpNominalYear ?gdpyear .
    }
  }
  FILTER ((?ftsyear = str(?gdpyear)) && (?ftscountry = ?dbpcountry))
}
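The logic of this query (sum funding per year and country, then join with GDP where year and country agree) can be traced on a few in-memory rows. The sketch below uses made-up figures purely to show the aggregation and the join condition; it does not query FTS or DBpedia.

```python
from collections import defaultdict

# Hypothetical rows: (year label, linked DBpedia country URI, amount).
commitments = [
    ("2009", "http://dbpedia.org/resource/Germany", 100.0),
    ("2009", "http://dbpedia.org/resource/Germany", 50.0),
    ("2010", "http://dbpedia.org/resource/France", 80.0),
]
# Hypothetical rows: (DBpedia country URI, GDP year, nominal GDP).
gdp_rows = [
    ("http://dbpedia.org/resource/Germany", 2009, 3000.0),
    ("http://dbpedia.org/resource/France", 2009, 2500.0),
]


def funding_vs_gdp(commitments, gdp_rows):
    """Aggregate funding per (year, country) and join it with GDP rows."""
    funding = defaultdict(float)
    for year, country, amount in commitments:
        funding[(year, country)] += amount  # SUM(?amount) per group

    joined = []
    for country, gdp_year, gdp_nominal in gdp_rows:
        key = (str(gdp_year), country)  # mirrors ?ftsyear = str(?gdpyear)
        if key in funding:
            joined.append((key[0], country, funding[key], gdp_nominal))
    return joined


result = funding_vs_gdp(commitments, gdp_rows)
for year, country, funding, gdp in result:
    print(year, country, funding, gdp)
```

Only rows where both year and country match survive the join, so the French 2010 funding is dropped here because the sample GDP figure is for 2009.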
European Employment Services
European Employment Services (EURES): cooperation network for the free movement of workers in the EU
Publishes 1.2+ million job vacancies, 700,000 CVs, 25,000 employers
RDF version can be used to:
● Compare geographical and economic information for new jobs (DBpedia, LGD)
● Compare salaries relative to standards in the job region
● Assess the quality of nearby schools
European Employment Services
Neither API nor dump available → site scraping
Modelling considered existing ontologies
Published using D2R: http://www4.wiwiss.fu-berlin.de/eures/
7 million triples, classes: Offer, Skill, Employer
3000 links to DBpedia cities + regions + countries + languages + currencies, LEXVO languages, Eurostat
Updates can be performed by scraping only new pages
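Incremental updating by scraping only new pages can be sketched as keeping a set of already-processed URLs and fetching just the difference. The function names, the example URLs and the stand-in `fetch` callback below are hypothetical; a real scraper would download and parse each page instead.

```python
def select_new_pages(listed_urls, seen):
    """Return the listed URLs that have not been scraped yet, in order."""
    return [url for url in listed_urls if url not in seen]


def incremental_update(listed_urls, seen, fetch):
    """Fetch only unseen pages and record them as seen."""
    scraped = {}
    for url in select_new_pages(listed_urls, seen):
        scraped[url] = fetch(url)
        seen.add(url)
    return scraped


# Hypothetical state: one page was scraped in an earlier run.
seen = {"http://example.org/jobs/1"}
listing = [
    "http://example.org/jobs/1",
    "http://example.org/jobs/2",
    "http://example.org/jobs/3",
]

# Stand-in for an HTTP download, so the sketch runs offline.
fetched = incremental_update(listing, seen, fetch=lambda url: f"<html>{url}</html>")
print(sorted(fetched))
```

This handles newly published vacancies only; detecting changed or withdrawn pages would need extra bookkeeping such as timestamps or content hashes.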
Euraxess
Contains research jobs in EU, 6400 organisations, 1700 open jobs, 61000 registered researchers, 18000 researcher CVs
http://ec.europa.eu/euraxess/
Contains information about people, jobs, skills, languages etc.
Links to DBpedia languages and LEXVO languages
Euraxess + EURES Query
Query: aggregates information about jobs and companies in a country from two different sources
SELECT DISTINCT ?job ?company WHERE {
  SERVICE <http://www4.wiwiss.fu-berlin.de/eures/sparql> {
    ?job eures:country ?countryjob .
    ?countryjob a eures:Country .
    ?countryjob rdfs:label ?n .
  }
  SERVICE <http://www4.wiwiss.fu-berlin.de/euraxess/sparql> {
    ?company euraxess:country ?countrycomp .
    ?countrycomp a euraxess:Country .
    ?countryjob owl:sameAs ?countrycomp .
  }
}
Summary / Take Away Messages
Linked Data increasingly important in EU E-Government
Many RDF conversion tools/techniques available depending on source format
Linked Data simplifies data integration – added value by enrichment, e.g. linking to other data sets or schema creation
LOD cloud provides rich background information
Thanks for your Attention!