demo.datanetworkservice.nl/qaa
Saving queries
Data analysis and sharing in the humanities
Dirk Roorda – SDE:T&I 2012-03-20
Interedition bootcamp Leuven:
◦ geotags as annotations
Lorentz workshop biblical scholarship
◦ queries as annotations
Now
◦ queries and features as annotations
Tomorrow
◦ annotations across variants
why here?
we promote sustained access to data
◦ dans.knaw.nl/en/content/about-dans
we offer an online archiving system
◦ easy.dans.knaw.nl
we offer access to research information NL
◦ narcis.nl/
we organised the data seal of approval
◦ datasealofapproval.org/
we do research ourselves
◦ research-development
from where?
data, tools, interfaces, systems ...
... form a tangled web
... all of whose components are evolving
... with limited backward compatibility
data is relatively stable
... but, without the rest
... also meaningless
persistence in ict
preserving tools:
◦ must they run for ever (and ever)?
◦ including previous versions?
◦ even if newer tools do a better job?
preserving the track record of some tools
◦ meaningful queries and their results
◦ visualised in successive versions of the data
preserving versions of the data
◦ common parts have common addresses
preserving data and tools
annotations
queries as annotations
the WIVU case
preparing the corpus
select
    word_objects.first_monad,
    word_mdf_text_set.string_value
from
    word_objects
inner join
    word_mdf_text_set
on
    word_mdf_text_set.id_d = word_objects.mdf_text
order by
    word_objects.first_monad
ב|1 ראשית|2 ברא|3 אלהים|4 את|5 ה|6 שמים|7 ו|8 את|9 ה|10 ארץ|11
anchored words
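The anchored word list can be produced along the following lines. This is a minimal sketch, not the code behind the demo. It assumes an in-memory SQLite database as a stand-in for the real database, keeps the table and column names of the query, and fills the tables with a few toy words in Latin script. The helper names `anchored_words` and `toy_database` are hypothetical.

```python
import sqlite3

# The query from the slide, with its original table and column names.
QUERY = """
select
    word_objects.first_monad,
    word_mdf_text_set.string_value
from word_objects
inner join word_mdf_text_set
    on word_mdf_text_set.id_d = word_objects.mdf_text
order by word_objects.first_monad
"""


def anchored_words(conn):
    """Return the text as 'word|monad' tokens, sorted by monad number."""
    rows = conn.execute(QUERY).fetchall()
    return [f"{word}|{monad}" for monad, word in rows]


def toy_database():
    """Build a tiny stand-in database (assumed schema, toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table word_mdf_text_set "
        "(id_d integer primary key, string_value text)"
    )
    conn.execute(
        "create table word_objects (first_monad integer, mdf_text integer)"
    )
    conn.executemany(
        "insert into word_mdf_text_set values (?, ?)",
        [(10, "be"), (11, "reshit"), (12, "bara")],
    )
    # Inserted out of order on purpose: the query has to sort by monad.
    conn.executemany(
        "insert into word_objects values (?, ?)",
        [(3, 12), (1, 10), (2, 11)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    print(" ".join(anchored_words(toy_database())))
```

Each token pairs a word with its monad number, so annotations can later point at the number rather than at a position in one particular copy of the text.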
preparing the queries
query results
corpus and annotations
[diagram labels: OANNOT, WIVU text, WIVU features, WIVU queries; example query: select objects where tense=perfect]
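One way to read this picture: features and queries are both annotations that point at word anchors. The sketch below is a hedged illustration of that idea only. The `Annotation` class, its fields, the toy feature values and the simplified handling of the query condition are assumptions made for this example. They do not reflect the actual schema or query language of the demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    kind: str       # "feature" or "query" (hypothetical labels)
    body: str       # e.g. "tense=perfect", or the text of a query
    targets: tuple  # monad numbers the annotation points at


def feature(name, value, monad):
    """A feature value attached to a single word anchor."""
    return Annotation("feature", f"{name}={value}", (monad,))


def run_query(condition, annotations):
    """Evaluate a toy 'select objects where <condition>' query.

    The result is itself an annotation: its body is the query text
    and its targets are the anchors of the matching words.
    """
    hits = sorted(
        monad
        for a in annotations
        if a.kind == "feature" and a.body == condition
        for monad in a.targets
    )
    return Annotation("query", f"select objects where {condition}", tuple(hits))


# Toy feature annotations on three word anchors.
FEATURES = [
    feature("tense", "perfect", 3),
    feature("tense", "imperfect", 7),
    feature("tense", "perfect", 12),
]

if __name__ == "__main__":
    result = run_query("tense=perfect", FEATURES)
    print(result.body, "->", result.targets)
```

Because the saved query result only refers to anchors, it can be stored and shared separately from the text and from the feature data it was computed on.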
flexibility (1)
[diagram labels: OANNOT, WIVU text, WIVU features, WIVU queries, Westminster features, Westminster queries; example query: select objects where tense=perfect]
flexibility (2)
[diagram labels: OANNOT, WIVU text, WIVU features, WIVU queries, Westminster text; example query: select objects where tense=perfect]
be my guest @ http://demo.datanetworkservice.nl/qaa
with help from Eko Indarto, Vesa Åkerman, Paul Boon, Andrea Scharnhorst
the demo app
done the decoupling
◦ ... into anchored text and annotations
bodies and targets are not (yet) web resources
◦ should anchors be tied to a fixed copy?
◦ or can anchors be incarnation-free?
◦ there is use for local addressing
mostly raw sql queries, no data abstraction
did and did not
public website on cloud server @ €22/month
web2py framework
◦ mysql + apache connections
◦ very easy deployment
Implementation details
formalism webapp data-prep
sql 90 80
python 250
perl 650
javascript 300
html 50
css 60
shell-script 280
Coding statistics
WIVU text: 426,499 words
WIVU queries: 22 with 2250 results
WIVU features: 2,334,817 values
Data details
yes, Features-As-Annotations and Queries-As-Annotations are feasible
but for maximum profit we must:
◦ find an addressing scheme for text monads ... that is stable under “variants”
◦ use it for a small portion of the Greek NT
◦ ... and make an analysis tool that works across variants
then we have Portable Features and Portable Queries
◦ and preservable too
Next challenge
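To make the addressing problem concrete, here is a small illustration. It is not the stable addressing scheme this slide asks for. It uses a swapped-in generic technique, Python's `difflib.SequenceMatcher`, to carry word anchors from one toy variant to another, so that annotations on matched words remain valid. The two word lists and the function name `carry_anchors` are hypothetical.

```python
from difflib import SequenceMatcher


def carry_anchors(base, variant):
    """Map 1-based word positions in `base` to positions in `variant`.

    Only words inside exactly matching blocks are mapped. Words that
    were changed, inserted or deleted get no counterpart.
    """
    matcher = SequenceMatcher(a=base, b=variant, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.a + offset + 1] = block.b + offset + 1
    return mapping


# Two toy variants of the same passage (illustrative data only).
BASE = ["in", "the", "beginning", "god", "created"]
VARIANT = ["in", "beginning", "god", "himself", "created"]

if __name__ == "__main__":
    print(carry_anchors(BASE, VARIANT))
```

In this toy case the second word of the base text has no counterpart in the variant, so an annotation that targets it cannot be carried over automatically. A scheme that is stable under variants would have to deal with such gaps explicitly.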