Dutch WW2 underground newspapers on Wikipedia
6th International DBpedia Community Meeting, 12-02-2016, The Hague
Olaf Janssen, Koninklijke Bibliotheek
[email protected] - @ookgezellig - slideshare.net/OlafJanssenNL CC-BY-SA
http://www.4en5meiamsterdam.nl/attachment/47454
During WW2, about 1,300 Dutch underground newspapers were issued
In every shape & form…
http://www.4en5meiamsterdam.nl/attachment/47454
http://resolver.kb.nl/resolve?urn=ddd:010436323
http://resolver.kb.nl/resolve?urn=ddd:010442948
http://resolver.kb.nl/resolve?urn=ddd:010447825
http://resolver.kb.nl/resolve?urn=ddd:010450508
From well-known big titles
(including Parool, Vrij Nederland, Trouw, de Waarheid)
To very small, home-made, pamphlet-like issues
After the war, many titles
1) were (physically) preserved at the NIOD …
https://commons.wikimedia.org/wiki/File:Verzetskrant_in_archiefdozen_bij_het_NIOD.jpg – CC-BY-SA - OlafJanssen
The national Institute for War, Holocaust and Genocide Studies in Amsterdam
http://opac-gonext.oclc.org:8180/DB=8/XMLPRS=Y/PPN?PPN=107123223
.. were 2) described in formal library catalogues
Bibliographic metadata
.. were 3) digitized in Delpher …
The Dutch national aggregator for historic full-text newspapers, books and magazines
http://resolver.kb.nl/resolve?urn=ddd:010424553:mpeg21:p001
• Scans • Full-text OCR
.. and were 4) contextualized & interlinked
1 by 1 in a book
Context
Relations (semantics, linked data)
• Newspaper - Placename
• Newspaper - Persons
• Newspaper - Other newspapers
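The relations between a newspaper and its places, persons and other newspapers can be pictured as subject-predicate-object triples. The following is only a minimal sketch: the `example.org` namespace, the predicate names and the placeholder identifiers are all assumptions made for illustration and do not come from the project's actual data model.

```python
# Hypothetical namespace; the project's real URIs are not given in the slides.
BASE = "http://example.org/verzetskranten/"


def uri(kind: str, local_id: str) -> str:
    """Build an illustrative URI for a newspaper, place or person."""
    return f"{BASE}{kind}/{local_id}"


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Placeholder identifiers only; no real newspaper data is implied.
paper = uri("newspaper", "newspaper-A")
triples = [
    (paper, BASE + "publishedIn", uri("place", "place-X")),        # Newspaper - Placename
    (paper, BASE + "involvedPerson", uri("person", "person-Y")),   # Newspaper - Persons
    (paper, BASE + "relatedTo", uri("newspaper", "newspaper-B")),  # Newspaper - Other newspapers
]

print(to_ntriples(triples))
```

Each printed line is one statement; a triplestore could load such statements and follow them from a newspaper to its places and persons.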
This book has been OCRed into PDF (CC-BY-SA)
http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)
• Available online (PDF, flat file)
• Open license (CC-BY-SA)
• Convert PDF into structured, linked data
• Link titles to KB-catalogue (metadata) and Delpher (full-text)
• Link titles, people and places to external sources (DBpedia, VIAF, Gemeentegeschiedenis.nl, Nationaal Archief, Biografisch Portaal)
So:
a lot of information is available about these WW2 underground newspapers
(and the related persons & places) …
... but the chunks of data are (largely)
unconnected!
http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg
... making discovery, understanding & research
for many people harder than necessary.
https://nl.wikipedia.org/wiki/Categorie:Illegale_pers_in_de_Tweede_Wereldoorlog
Today, only 14 of these 1,300 newspapers are described on the Dutch Wikipedia (WP:NL)
The Wikiproject Verzetskranten will change this!
Systematically and uniformly describe & interlink all 1,300 Dutch underground newspapers from WW2 on Wikipedia
tinyurl.com/verzetskranten
Automatically makes data available for open reuse projects
Wikidata -- DBpedia -- Dataviz
From 14 to 1,300 titles
Global approach
1. Make central LOD-database
2. Build article template
3. Generate WP-article stubs -- using 1. and 2.
4. Involve WP-community to expand stubs into full WP-articles
5. Make dataset available for open reuse: Wikidata -- DBpedia -- Dataviz -- et al.
For the first time, data about underground newspapers is systematically collected and linked online!
LOD-database for underground newspapers
• Convert PDF into structured, linked data
• RDF-triplestore (Virtuoso, SPARQL, Bibframe)
• Link titles to KB-catalogue (metadata) & Delpher (full-text), using PPNs (unique IDs for publications in NL)
• Link titles, people and places to external sources: DBpedia, Wikipedia, VIAF, Gemeentegeschiedenis.nl, Nationaal Archief, Biografisch Portaal
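Linking a title to the catalogue and to Delpher comes down to turning an identifier into a URL. A minimal sketch, assuming the two URL patterns that appear earlier in these slides (the `resolver.kb.nl` resolver with a `ddd:` identifier and the catalogue address with a `PPN` parameter) can be reused as-is; the function names are hypothetical, and the two example identifiers are not claimed to describe the same newspaper.

```python
from urllib.parse import parse_qs, urlparse

# URL patterns copied from links shown earlier in the slides (assumed stable).
RESOLVER = "http://resolver.kb.nl/resolve"
CATALOGUE = "http://opac-gonext.oclc.org:8180/DB=8/XMLPRS=Y/PPN"


def delpher_link(ddd_id: str) -> str:
    """Build a Delpher resolver link from a 'ddd' identifier."""
    return f"{RESOLVER}?urn=ddd:{ddd_id}"


def catalogue_link(ppn: str) -> str:
    """Build a catalogue link from a PPN."""
    return f"{CATALOGUE}?PPN={ppn}"


def ddd_id_from_link(url: str) -> str:
    """Extract everything after the 'ddd:' prefix of a resolver link's urn."""
    urn = parse_qs(urlparse(url).query)["urn"][0]
    prefix, _, ident = urn.partition(":")
    if prefix != "ddd":
        raise ValueError(f"not a ddd urn: {urn}")
    return ident


print(delpher_link("010436323"))
print(catalogue_link("107123223"))
```

Keeping only the identifiers in the database and deriving the links on demand means a change of URL pattern needs a fix in one place only.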
http://www.4en5meiamsterdam.nl/attachment/47454
So we have a LOD-database with data about 1,300 underground newspapers
Using an article template we can generate 1,300 uniform and interlinked WP-stubs
https://c1.staticflickr.com/9/8281/7699231918_11a7356c38_b.jpg
LOD-db + article template = article stub
https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)
Grey = • From database • Predefined fixed strings
All that WP-writers need to add manually to create a full article
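The "LOD-db + article template = article stub" step can be sketched as filling predefined fixed strings with values from a database record. This is only an illustration: the template wording, the field names (`title`, `place`, `catalogue_url`) and the placeholder record are assumptions, not the project's actual template or data.

```python
from string import Template

# Fixed strings plus $placeholders that are filled from a database record.
# Wording and field names are hypothetical.
STUB_TEMPLATE = Template(
    "'''$title''' was an underground newspaper published in $place "
    "during the Second World War.\n"
    "\n"
    "== External links ==\n"
    "* [$catalogue_url Catalogue record]\n"
)


def render_stub(record: dict) -> str:
    """Fill the template; raises KeyError if the record lacks a field."""
    return STUB_TEMPLATE.substitute(record)


# Placeholder record; no real newspaper data is implied.
record = {
    "title": "Example Title",
    "place": "Example Place",
    "catalogue_url": "http://example.org/catalogue/123",
}

print(render_stub(record))
```

Using `substitute` rather than `safe_substitute` makes an incomplete record fail loudly instead of producing a stub with a leftover `$placeholder`.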
Current status
Global approach
1. Make central LOD-database
2. Build article template
3. Generate WP-article stubs
4. Involve WP-community to expand stubs into full WP-articles
This month
March onwards
http://upload.wikimedia.org/wikipedia/commons/1/12/Planning_tank_operations,_Siege_of_Tobruk_cph.3b18203.jpg
Questions?
[email protected] - @ookgezellig
tinyurl.com/verzetskranten