In and out: Work flows between library data and linked-data at the National Library of Spain
Semantic Web in Libraries, November 25-27, 2019, Hamburg
Ricardo Santos National Library of Spain
BIBLIOTECA NACIONAL DE ESPAÑA
@3186Ricardo #datosbne
WHAT IS DATOS.BNE?
A fully functioning linked-data-based gateway into the library's resources:
- Library resources (as in any other catalogue)
- Data about authors
- Data about vocabularies
- Data about items
- Connected to the usual library services (digital library items, loan services, document reproduction services)
MAIN FEATURES
FRBR-based
- Entity-based
- Entity-driven search
- Entity-based ranking of search results
Linked and enriched
- Interlinks between FRBR entities
- Linked to prominent equivalent concepts
- Enriched with outside data
BASIC DATA MODEL
From MARC to FRBR entities
From a 2-D structure to…
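The move from flat records to interlinked entities can be pictured with a minimal sketch. The class and attribute names below are illustrative assumptions that follow the standard FRBR Group 1 hierarchy (Work, Expression, Manifestation, Item); they are not the actual datos.bne model, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    call_number: str  # hypothetical attribute


@dataclass
class Manifestation:
    title: str
    place_of_publication: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    creator: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Walk the Work > Expression > Manifestation > Item tree and count copies."""
    return sum(
        len(m.items) for e in work.expressions for m in e.manifestations
    )


# Invented sample data: one work, one Spanish expression, two editions.
sample_work = Work(
    title="Don Quijote de la Mancha",
    creator="Cervantes Saavedra, Miguel de",
    expressions=[
        Expression(
            language="spa",
            manifestations=[
                Manifestation("Don Quijote", "Madrid", [Item("CALL-1")]),
                Manifestation("Don Quijote", "Barcelona", [Item("CALL-2")]),
            ],
        )
    ],
)
```

In a flat MARC view each edition is a separate record; in the entity view both editions hang from a single work, which is what makes entity-driven search and ranking possible.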
Behind the scenes
MARiMbA (data generation)
- Data (MARC)
- Data analysis
- Mapping and sorting
- Linked Data generation
- Linking and enrichment
Publication pipeline
- Indexing (Elasticsearch)
- Double storing: front-end (MongoDB) and triple store (Virtuoso)
- Ranking based on FRBR relationships
Peeking inside Marimba
- Ingestion and analysis of MARC21 records
- Mapping 1: sorting
- Mapping 2: model entity linking
- Mapping 3: data properties
- RDF generation
Sorting is based on records. Example: an authority record with a given tag and a given subfield combination is an entity:
XX 100 $a $d → C1005
Object properties are based on internal keys or inferred.
Data properties are based on subfields:
260 $a → P3003 (place of publication)
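The two rule types above (sorting a record into an entity class, and mapping subfields to data properties) can be sketched as simple lookup tables. This is only an illustration under stated assumptions: the record layout (plain dictionaries), the function names, the exact-match rule on subfield combinations and the sample identifiers are hypothetical, not Marimba's actual implementation. Only the two rules themselves (100 $a $d → C1005 and 260 $a → P3003) come from the slide.

```python
from typing import Dict, List, Optional, Tuple

# Sorting rules: (record type, tag, exact subfield combination) -> entity class.
CLASS_RULES = [
    ("authority", "100", frozenset({"a", "d"}), "C1005"),
]

# Data property rules: (tag, subfield code) -> property.
DATA_PROPERTY_RULES: Dict[Tuple[str, str], str] = {
    ("260", "a"): "P3003",  # place of publication
}


def classify(record: dict) -> Optional[str]:
    """Return the entity class for a record, or None if no rule matches."""
    for rec_type, tag, subfields, entity_class in CLASS_RULES:
        field = record["fields"].get(tag)
        if (
            record["type"] == rec_type
            and field is not None
            and frozenset(field) == subfields
        ):
            return entity_class
    return None


def data_properties(record: dict) -> List[Tuple[str, str]]:
    """Return (property, value) pairs for every subfield covered by a rule."""
    pairs = []
    for tag, subfields in record["fields"].items():
        for code, value in subfields.items():
            prop = DATA_PROPERTY_RULES.get((tag, code))
            if prop is not None:
                pairs.append((prop, value))
    return pairs


# Hypothetical sample records (identifiers are placeholders).
authority_record = {
    "id": "XX0000001",
    "type": "authority",
    "fields": {"100": {"a": "Cervantes Saavedra, Miguel de", "d": "1547-1616"}},
}
bibliographic_record = {
    "id": "bib0000001",
    "type": "bibliographic",
    "fields": {"245": {"a": "Don Quijote"}, "260": {"a": "Madrid", "c": "1605"}},
}
```

Here `classify(authority_record)` yields `"C1005"` and `data_properties(bibliographic_record)` yields `[("P3003", "Madrid")]`. Keeping the rules as data rather than code means a mapping can be revised without touching the generation logic.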
Dataflows
Sources feeding DATOS.BNE.ES:
- MARC21 (95%)
- Digital Library (2%)
- Other sources
Other sources
- Identifiers from VIAF (links: sameAs)
- DBpedia
They remain only in datos.bne.
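A sameAs link of this kind is a single RDF triple. The sketch below builds one as an N-Triples line using only string formatting. The URI patterns for datos.bne and VIAF resources and both identifiers are assumptions for illustration; the `owl:sameAs` property URI is the standard one.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Assumed URI patterns; check them against the live services before use.
BNE_RESOURCE = "http://datos.bne.es/resource/{}"
VIAF_RESOURCE = "http://viaf.org/viaf/{}"


def same_as_triple(bne_id: str, viaf_id: str) -> str:
    """Return one N-Triples line linking a BNE entity to its VIAF cluster."""
    subject = BNE_RESOURCE.format(bne_id)
    obj = VIAF_RESOURCE.format(viaf_id)
    return f"<{subject}> <{OWL_SAME_AS}> <{obj}> ."


# Placeholder identifiers, not real records.
line = same_as_triple("XX0000001", "12345")
```

Because such links are generated at the linked-data stage, they can live in the triple store without ever being written back to the MARC records in the library system.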
Enrichments and data imports
Main enrichment: EVERYDAY JOB
Other enrichments and imports:
Source → Library system → datos.bne
- Sources: libraries and beyond
- Data imported: identifiers, equivalencies and attributes
2019: Wikidata
Achievement: more than 80,000 person authority records enriched with data from Wikidata
Mapping BNE IDs to Wikidata IDs, using the VIAF identifier data dump
Extracting selected properties (in Spanish where available), using the Wikidata API:
- Retrieve selected properties for every ID in JSON
- Sort and group
- Convert to MARC21 fields according to a mapping
Performed by the datos.bne technological partner, Memorandum Multimedia
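The steps above can be sketched roughly as follows. Everything here is an assumption for illustration rather than the partner's actual code: the simplified tab-separated layout of the VIAF links, the property-to-MARC mapping (fields 374 and 370 are real MARC21 authority fields, but the mapping BNE used is not given in the slides), the function names and the sample data. The request parameters and the `claims` / `mainsnak` / `datavalue` structure follow the Wikidata `wbgetentities` API. No network call is made; the sketch only builds the URL and processes an already parsed response.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# Hypothetical mapping: Wikidata property -> (MARC21 authority tag, subfield).
PROPERTY_TO_MARC = {
    "P106": ("374", "a"),  # occupation
    "P19": ("370", "a"),   # place of birth
}


def map_bne_to_wikidata(lines):
    """Join BNE and Wikidata IDs that share a VIAF cluster.

    Assumes simplified lines of the form 'cluster<TAB>SOURCE|local_id'.
    """
    bne, wkp = {}, {}
    for line in lines:
        cluster, source_id = line.rstrip("\n").split("\t")
        source, _, local_id = source_id.partition("|")
        if source == "BNE":
            bne[cluster] = local_id
        elif source == "WKP":
            wkp[cluster] = local_id
    return {bne[c]: wkp[c] for c in bne if c in wkp}


def entity_url(qids):
    """Build a wbgetentities request URL for a batch of Wikidata IDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "claims",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def extract_item_ids(entity, prop):
    """Return the item IDs stated for one property of a parsed entity."""
    ids = []
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "novalue" and "somevalue" statements
        value = snak["datavalue"]["value"]
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def pick_label(labels, qid, preferred=("es", "en")):
    """Prefer the Spanish label, falling back to English, else None."""
    for lang in preferred:
        if lang in labels.get(qid, {}):
            return labels[qid][lang]
    return None


def to_marc_fields(entity, labels):
    """Group values by mapped field, sort them, and render 'TAG $code value'."""
    fields = []
    for prop, (tag, code) in PROPERTY_TO_MARC.items():
        found = {pick_label(labels, q) for q in extract_item_ids(entity, prop)}
        for value in sorted(found - {None}):
            fields.append(f"{tag} ${code} {value}")
    return fields


def _claim(qid):
    """Build a minimal item-valued claim for the sample data below."""
    return {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": qid}}}}


# Illustrative sample: one entity with two occupations; the second label is
# deliberately missing its Spanish form to show the fallback.
sample_entity = {"claims": {"P106": [_claim("Q36180"), _claim("Q49757")]}}
sample_labels = {
    "Q36180": {"es": "escritor", "en": "writer"},
    "Q49757": {"en": "poet"},
}
marc_fields = to_marc_fields(sample_entity, sample_labels)
# marc_fields == ["374 $a escritor", "374 $a poet"]
```

Separating the ID join, the extraction and the MARC rendering keeps each step testable on its own, which matters when the output is loaded back into authority records.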
Quality check-ups:
General overview: property P737 (influenced by) was disregarded
Work on the Occupation property (P106):
- Sorting, grouping and arranging for revision by vocabulary experts
- Decision on every value:
  o Keep it
  o Align it with the BNE Subject Vocabulary
  o Disregard it
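The review workflow can be sketched as a frequency-sorted queue plus a decision table. The decision table contents, the function names and the sample terms are invented for illustration and do not reflect the BNE Subject Vocabulary or the experts' actual decisions. Treating unreviewed values as held back is also an assumption of this sketch.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class Decision(Enum):
    KEEP = "keep"
    ALIGN = "align"
    DISREGARD = "disregard"


# Hypothetical expert decisions: value -> (decision, aligned term or None).
REVIEWED = {
    "escritor": (Decision.KEEP, None),
    "novelista": (Decision.ALIGN, "Novelistas"),
    "ser humano": (Decision.DISREGARD, None),
}


def review_queue(values):
    """Group identical values and sort by frequency, most common first."""
    return Counter(values).most_common()


def apply_decision(value: str) -> Optional[str]:
    """Return the term to load, or None if disregarded or not yet reviewed."""
    if value not in REVIEWED:
        return None  # assumption: unreviewed values are held back
    decision, aligned = REVIEWED[value]
    if decision is Decision.KEEP:
        return value
    if decision is Decision.ALIGN:
        return aligned
    return None


# Invented sample of extracted occupation values.
extracted = ["escritor", "escritor", "escritor", "novelista", "novelista", "ser humano"]
queue = review_queue(extracted)
loaded = [t for t in (apply_decision(v) for v, _ in queue) if t is not None]
```

Sorting by frequency lets the vocabulary experts settle the values that affect the most records first.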
Occupation quality check-up:
From Wikidata to datosbne
A funny thing happened on my way to datos.bne
datos.bne Persons’ advanced search
Wikidata: concerns
PROVENANCE
Provenance is explicit on the library side of the data
No mapping or mechanism has been implemented on the RDF side
QUALITY
Wikidata is exposed to data vandalism
Keep calm and don't be afraid of errors
Ricardo Santos Head of Technical Services Department
[email protected] @3186ricardo
Pº de Recoletos 20-22 28071 Madrid
España T +34 915 807 800
www.bne.es
BIBLIOTECA NACIONAL DE ESPAÑA
NATIONAL LIBRARY OF SPAIN