This paper deals with national data standardization efforts in Denmark and discusses the role Topic Maps – and topic maps – may play in a new standardization strategy currently being considered by the Danish National IT and Telecom Agency. The strategy entails a paradigm shift from syntactic data standards based on XML schema to a more semantically based approach involving, among other things, the development, publishing and sharing of so-called definitions. The paper accounts for the historical, political and technical context of the strategy pointing out some of the opportunities and constraints this context poses for the introduction and application of Topic Maps as a recognized “data standardization standard” in Denmark.
National Data Standardization: A Place
for Topic Maps?
Lars Johnsen
”There is something rotten in the state of Denmark”
- Shakespeare (1601)
2009: Challenges => modernization of the public sector => digitalization =>
interoperability
2001: IT standardization efforts take off
2006: B103 – Parliament votes for open standards in Danish e-government
2008: Seven sets of open standards made compulsory, including web standards, document standards and data standards
Where is Topic Maps …?
To be or not to be – standardizing *
* … that’s one of the questions
Data standardization in Denmark
◦ Background and status
◦ New strategy being implemented by ITST (the national agency for IT and Telecom)
◦ Technological infrastructure
Paradigm shift from syntax to semantics or even subject-centric thinking …
Why not TM?
Agenda
Data standards - OIOXML
OIOXML schemas
Domain schemas
Core components
Problems with OIOXML schemas
Content
Little documentation
◦ What legislation?
◦ What contexts?
◦ What projects, web services, etc.?
Findability and usability
Uploading
Organizing
Viewing
Concepts
No real semantics:
<MaritalStatusCode>
◦ married
◦ divorced
◦ widow
◦ registered partnership
◦ abolition of partnership
◦ longest living partner
◦ deceased
◦ unmarried
</MaritalStatusCode>
What do you read, my lord? Words, words, words
A new strategy in the pipeline
Semantic definitions
Concept systems
Taxonomies
Data definitions
Message definitions
WSDL files
OIOXML schemas
Concept systems: building on existing methodologies
FORM – a reference model of public services in Denmark
Social services
Benefits
Individual benefits
Allocation of housing benefit
Action
Object
From concept systems to taxonomies and thesauri
In the future: semantic definitions as pivots
<SemanticDefinition definitionIdentifier="…" systemIdentifier="…" versionIdentifier="…">
  <Localization languageCode="DA">
    <Term roleCode="preferred">…</Term>
    <Term roleCode="admitted">…</Term>
    …
    <Description roleCode="definition">…</Description>
  </Localization>
  …
</SemanticDefinition>
Semantic definition
client
From semantics to syntax
Semantic definition
client status
Client
Client_status: custody, …
Data definition
<ClientStatusCode>
  <enumeration value="custody" />
  …
</ClientStatusCode>
OIOXML
“Though this be madness, yet there is method in it”
has
Semantic definition custody
is a
Conceptual layer
◦ Concept systems and taxonomies = TAO
   topics/subjects, properties, associations, roles
◦ Semantic definition = PSI
   names, descriptions, explanations, etc. of subjects (in one or more languages)
Content or resource layer
◦ OIOXML schemas, data definitions, WSDL files = occurrences/addressable subjects
◦ Embedded OIO schemas = part-whole associations between addressable subjects, etc.
TM model, really
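The mapping above can be sketched in XTM 2.0. This is a minimal illustration, not part of the proposed OIO strategy: the example.org identifiers, the topic ids and the role types (subtype/supertype) are all assumptions made for the sketch. It shows a concept as a topic, a semantic definition supplying its subject identifier, a data definition attached as an occurrence, and the "is a" relation from the client status diagram as an association.

```xml
<topicMap version="2.0" xmlns="http://www.topicmaps.org/xtm/">
  <!-- Stub topics for the types used below (all ids are hypothetical) -->
  <topic id="OIOconcept"/>
  <topic id="data-definition"/>
  <topic id="is-a"/>
  <topic id="subtype"/>
  <topic id="supertype"/>

  <!-- Conceptual layer: a concept from a concept system becomes a topic -->
  <topic id="client-status">
    <!-- Assumed PSI; a published semantic definition would supply the real one -->
    <subjectIdentifier href="http://example.org/psi/client-status"/>
    <instanceOf><topicRef href="#OIOconcept"/></instanceOf>
    <name><value>client status</value></name>
    <!-- Resource layer: the data definition is attached as an occurrence -->
    <occurrence>
      <type><topicRef href="#data-definition"/></type>
      <resourceRef href="http://example.org/schemas/ClientStatusCode.xsd"/>
    </occurrence>
  </topic>

  <topic id="custody">
    <subjectIdentifier href="http://example.org/psi/custody"/>
    <instanceOf><topicRef href="#OIOconcept"/></instanceOf>
    <name><value>custody</value></name>
  </topic>

  <!-- "custody is a client status" expressed as a typed association -->
  <association>
    <type><topicRef href="#is-a"/></type>
    <role>
      <type><topicRef href="#subtype"/></type>
      <topicRef href="#custody"/>
    </role>
    <role>
      <type><topicRef href="#supertype"/></type>
      <topicRef href="#client-status"/>
    </role>
  </association>
</topicMap>
```

The point of the sketch is that the conceptual layer and the resource layer live in one model: schemas and definitions stay where they are published and are only referenced from the topics.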
So, why not simply adopt TM as a unifying model or standard for representing and organizing ”data standards data” in Denmark …?
This question becomes even more relevant …
Instead an entirely new OIOXML language is being developed …
Web 2.0 platform (Digitaliser.dk)
◦ Information architecture centered on groups
◦ Functionality for news, debates, user networks, etc.
◦ Tagging as primary organization vehicle
REST API
◦ Resources, representations and structured URLs
Finding, publishing and sharing ..
Prediction: a dire need for an organizational/navigational overlay integrating
distributed conceptual information, resources and metadata on Digitaliser.dk
Why not adopt TM as a unifying technology for
organizing and integrating data standards content on
Digitaliser.dk (and elsewhere)?
Topic Maps as a technology working ”covertly” behind the scenes or topic maps as accessible information products in their own right?
OIO topic maps – truly open information sets about the
”state of Denmark”
Topic maps as resources on Digitaliser.dk?
Public institutions and authorities
The vision of OIO Topic Maps …
Public services (e.g. FORM)
Core concepts
Laws and guidelines
Public web services
Integration
Workflows
<topicMap version="2.0" xmlns="http://www.topicmaps.org/xtm/">
  <topic id="enke">
    <subjectIdentifier href="http://digitaliser.dk/resource/123" />
    <subjectIdentifier href="http://api.digitaliser.dk/rest/resources/123/artefacts/enke.xml/content" />
    <instanceOf><topicRef href="#OIOconcept" /></instanceOf>
    <name>
      <scope><topicRef href="#DA" /></scope>
      <value>enke</value>
    </name>
    <occurrence>
      <type><topicRef href="#OIOXML" /></type>
      <resourceData datatype="http://www.w3.org/2001/XMLSchema#anyType">
        <MaritalStatusCode>widow</MaritalStatusCode>
      </resourceData>
    </occurrence>
  </topic>
  ...
</topicMap>
XTM 2.0 comes in handy …!
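Part of what makes XTM handy for integration is merging on subject identity: under the Topic Maps data model, topics that share a subject identifier are treated as one topic. A minimal sketch, assuming two hypothetical sources that describe the same subject (the topic ids and the second source are invented for illustration; the identifier is the one used in the sample topic):

```xml
<topicMap version="2.0" xmlns="http://www.topicmaps.org/xtm/">
  <!-- Topic as one source might publish it (hypothetical id) -->
  <topic id="enke-concept">
    <subjectIdentifier href="http://digitaliser.dk/resource/123"/>
    <name><value>enke</value></name>
  </topic>

  <!-- Topic from a second, hypothetical source about the same subject -->
  <topic id="widow-legal">
    <subjectIdentifier href="http://digitaliser.dk/resource/123"/>
    <name><value>widow</value></name>
  </topic>
</topicMap>
```

A conforming Topic Maps processor should merge the two into a single topic carrying both names, with no explicit mapping between the sources.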
General lack of knowledge
”Topic Maps is a metadata standard like Dublin Core”
”Concepts like subject, topic, identity and merging are simply too hard to explain”
”We are not having that RDF and OWL stuff, therefore we cannot adopt TM either”
The TM vocabulary does not translate very easily into Danish
Why is TM not a national standard?
And of course we do not have a Lars Marius or Steve P …
Emne - subjekt (topic/subject)
Identifikator - indikator (subject identifier vs indicator)
Adresse (locator)
Belæg – hændelse - forekomst (occurrence)
Sammenslutning – sammenlægning - fusion (merging)
Translating TM into Danish
Play the standard card (”TM is an international ISO standard, you know”)?
Create solutions! (chicken and egg problem)
RDF: friend or foe?
The ”versatility problem”: promoting TM as Semantic Web technology, metadata format, or what?
So, what should we do?
Perhaps Digitaliser.dk will solve the problem for us …
”The rest is silence!”