Classification in contextualising, mapping and switching between vocabularies
Aida Slavic, UDC Consortium, The Hague
Colloquium Research Information Systems and Science Classifications: revisiting the NARCIS classification
28.09.2018
OUTLINE
• Classification
• Vocabulary mapping
• Mapping projects
• Mapping and linked data
Aida Slavic The Hague 28 September 2018
• classification for presentation of knowledge: philosophical classifications, pedagogical classifications
• classification for organization of knowledge: scientific, economic or administrative classifications
• classifications for applications of knowledge: encyclopaedic classifications, lexicographical and linguistic classifications, classifications in ontologies (expert systems)
• classifications for communication of knowledge: library & documentary, i.e. bibliographic classifications
KNOWLEDGE CLASSIFICATIONS
• classification schemes
  • position of each subject is determined by its relation to other subjects
  • concept is usually represented by symbols
  • types: general and special knowledge schemes
§ alphabetical subject indexing languages
  § concept is represented by natural language terms
  § position of subject determined by its name
  § types:
    § subject headings (syntagmatic relationships between concepts): can list concepts from all areas of knowledge, no semantic relationships between entries
    § thesauri (language control, semantic relationships between concepts within a limited field of knowledge): each concept can only have one broader category
KNOWLEDGE ORGANIZATION SYSTEMS (KOS)
Bamboo sharks
Blind sharks
Brachaeluridae
Brachaelurus
Carpet sharks
Cartilaginous fishes
Chiloscyllium
Chondrichthyes
Cladoselachiformes
Collared carpet sharks
Elasmobranchii
Galeomorphii
Ginglymostoma
Ginglymostomatidae
Hemiscylliidae
Hemiscyllium
Hybodontiformes
Nurse sharks
Orectolobiformes
Parascylliidae
Parascyllium
Rhincodon
Rhincodontidae
Shark - bamboo s., blind s., carpet s., collared carpet s., nurse s., whale s., etc.
Whale shark
Xenacanthiformes
ALPHABETICAL vs SYSTEMATIC ARRANGEMENT
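The contrast between the two arrangements can be sketched in a few lines of Python. The parent links below are an assumption: they follow standard zoological taxonomy for a subset of the terms listed on this slide and are not taken from the slide itself. The function names are illustrative only.

```python
# Assumed parent links (standard zoological taxonomy) for a subset of the terms.
PARENT = {
    "Chondrichthyes": None,
    "Elasmobranchii": "Chondrichthyes",
    "Galeomorphii": "Elasmobranchii",
    "Orectolobiformes": "Galeomorphii",
    "Brachaeluridae": "Orectolobiformes",
    "Ginglymostomatidae": "Orectolobiformes",
    "Hemiscylliidae": "Orectolobiformes",
    "Parascylliidae": "Orectolobiformes",
    "Rhincodontidae": "Orectolobiformes",
}


def alphabetical(parent):
    """Order terms by name only, as an alphabetical list does."""
    return sorted(parent)


def systematic(parent, root=None, depth=0):
    """Order terms by hierarchy: each term is followed by its narrower terms."""
    lines = []
    for term in sorted(t for t, p in parent.items() if p == root):
        lines.append("  " * depth + term)
        lines.extend(systematic(parent, term, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(alphabetical(PARENT)))
    print()
    print("\n".join(systematic(PARENT)))
```

In the alphabetical list, related families are scattered by the accident of their names; in the systematic list they sit together under their common order.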
English full ed. 1982 (class 628) Dutch abridged ed. 2013
ALPHABETICAL INDEX TO CLASSIFICATION
The index shows all subject fields in which a given term and/or concept appears (relative index). It can list only the terms as they appear in the schedules, or include additional terms and synonyms to assist discovery.
alphabetical display of the thesaurus systematic display of the thesaurus
“THESAURUSIFICATION” (UDC)
• class content is represented by a notation (language independent)
=512.16 Southern group of Turkic languages
Jižní skupina turkických jazyků [Czech]
Южная группа тюркских языков [Russian]
तुर्क भाषाओं का दक्षिणी समूह [Hindi]
Թուրքական լեզուների հարավային խումբ [Armenian]
Νότια ομάδα των Τουρκικών γλωσσών [Greek]
[captions also shown in Chinese, Bengali, Japanese and Kannada]
• hierarchical organization
MAIN CLASSIFICATION FEATURES
• disciplinary organization (in compliance with some recognized forms of knowledge and scientific and educational consensus)
• perspective or aspect classifications: one concept, many fields of knowledge
• organize information about entities and not entities themselves
• express context in which knowledge may be created, presented, recorded or used (form of presentation, carrier)
• semantic and syntactic relationships
UNIVERSAL BIBLIOGRAPHIC CLASSIFICATIONS
[Diagram: the concept "rabbit" placed under several disciplines]
Zoology > Mammalia (mammals) > Lagomorpha (lagomorphs) > Leporidae > Oryctolagus (rabbit)
Agriculture > Hunting > Small game > Rabbit
Agriculture > Animal husbandry > Rodents kept for fur > Rabbit
Agriculture > Animal husbandry > Animals kept as pets > Rabbit
Various industries > Textile industry > Animal fibres > Hare fur. Rabbit fur > Angora rabbit hair
Home economics > Preparation of foodstuff > Methods of cooking > Braising > Braised rabbit
ASPECT CLASSIFICATION
BEYOND HIERARCHY
The expressive power of classification lies not only in the level of specificity, or the number of classes and subclasses available…
It lies in the possibility to index, and enable searching of, content that can be complex, inter- or cross-disciplinary and unpredictable…
CLASSIFICATION VOCABULARY (UDC)
MAIN CLASSES (DISCIPLINES)
SPECIAL AUXILIARY CONCEPTS
  '
  .0
  -1/-9
COMMON AUXILIARY CONCEPTS
  LANGUAGE =…
  FORM (0…)
  PLACE (1/9)
  ETHNICS (=…)
  TIME "…"
  PROPERTIES -02
  MATERIALS -03
  RELATIONS -04
  PERSONS -05
15,000 CLASSES
55,000 CLASSES
FACETED STRUCTURE AND SYNTHESIS IN UDC
MAIN TABLES
Discipline 1
Discipline 2
Discipline 3
81 Linguistics and languages
  811.112.2 German
  811.112.22 Upper German
  811.112.24 Middle German
  811.112.3/.4 Low German
  811.112.3 Plattdeutsch
  811.112.4 Frisian
  811.112.5 Dutch
  811.112.58 Dutch based pidgin and creole
COMMON AUXILIARIES
Materials
Language
Time
Form
(1/9) Place
  (4) Europe
  (430) Germany
  (436) Austria
  (437.3) Czech Republic
  (437.5) Slovakia
  (438) Poland
SPECIAL AUXILIARY NUMBERS
-1/-9 Schools, trends, methods
  -116 Structuralism
  -116.2 Geneva school
'0 Origins and periods of languages
'1/'9 General theory of linguistics
  '1 Metatheory
  '2 Subject fields, facets of linguistics
  '34 Phonetics. Phonology
  '35 Graphemics. Orthography
  '36 Grammar
  '37 Semantics
RELATIONSHIPS BETWEEN SUBJECTS
37 :004 Education : Computers
338.48 :61 Tourism : Medicine
602.72 :17 Embryonic cloning : Ethics
FACET SYNTHESIS
History, Scotland, 19th century
94(410.5)"18"
The same notation with its facets set apart:
94 (410.5) "18"
FACET INDICATORS
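A minimal sketch of how facet indicators let software split a synthesized notation back into its parts. It assumes only the indicators shown in these slides (parentheses for place, form and ethnic grouping, quotation marks for time, an equals sign for language, a colon for relations between subjects). Real UDC syntax has more cases, and `parse_udc` and `parse_compound` are hypothetical names, not part of any UDC tool.

```python
import re

# Simplified facet indicators taken from the slides. Order matters: the
# parenthesised facets are removed before the bare "=" language facet.
FACET_PATTERNS = [
    ("form", re.compile(r"\((0[^)]*)\)")),
    ("ethnic", re.compile(r"\(=([^)]*)\)")),
    ("place", re.compile(r"\(([1-9][^)]*)\)")),
    ("time", re.compile(r'["“]([^"”]*)["”]')),
    ("language", re.compile(r"=(\d[\d.]*)")),
]


def parse_udc(notation):
    """Split one notation into its main number and common auxiliary facets."""
    facets = {}
    rest = notation.replace(" ", "")
    for name, pattern in FACET_PATTERNS:
        found = pattern.findall(rest)
        if found:
            facets[name] = found
            rest = pattern.sub("", rest)
    # Whatever is left (main number, special auxiliaries) is kept unparsed.
    facets["main"] = rest
    return facets


def parse_compound(notation):
    """Split a colon-related notation such as 37:004 into its subjects."""
    return [parse_udc(part) for part in notation.split(":")]


if __name__ == "__main__":
    print(parse_udc('94(410.5)"18"'))
    print(parse_compound("37 :004"))
```

Because each facet carries its own indicator, a search system can match on place or time alone without knowing the discipline the number was built in.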
MORE DETAILS
-057 Persons according to occupation, work, livelihood, education
  -057.17 Managers in general. The management
  -057.177 Higher management. Top management
  -057.177.3 Directors. Board members
  -057.177.32 Non-executive directors
  -057.177.321 Deputy directors. Assistant directors
-056 Persons according to constitution, health, disposition, hereditary or other traits
  -056.2 Persons according to physical state and health
  -056.25 Persons according to nourishment (nutritional state) or body weight
  -056.257 Overweight persons. Overnourished. Fat. Obese. Hypertrophic
-053 Persons according to age or age-groups
  -053.8 Adults. Grown-ups
  -053.88 Persons in late middle age (troisième âge)
Combined: -056.257 -057.177 -053.88
Overweight persons (-056.257), top management (-057.177), persons in late middle age (-053.88)
612.12-009.92
Angina pectoris
1963 - GEIS (Groupe d’Etude sur l’Information Scientifique) first idea of linking information centres via an intermediary language
INTERMEDIATE LANGUAGE (1960-1980)
UNISIST (1960s) project on an international information network considered the UDC as a serious candidate for a 'switching language'
• Upon reports from experts who looked into existing bibliographic classifications it was concluded that, although the least faulty, UDC would require too much work
• Standard Reference Code (5000 subjects organized hierarchically) was proposed as a remedy; some initial work was put into this but the idea never took off
UNISIST PROJECT (1960s)
• Broad System of Ordering (BSO): the British Classification Research Group proposed that a new, fully faceted broad scheme be built to serve as an intermediary language. BSO was created in 1979 (not implemented, used or developed since that date) (http://www.ucl.ac.uk/fatks/bso/outline.htm#910)
• Information Coding Classification, by I. Dahlberg
• Bliss Bibliographic Classification 2, by J. Mills, V. Broughton
None of the newly created schemes has ever been implemented or widely used, completed or updated
NEW SCHEMES FOR “SWITCHING”
The most common way of complementing a classification scheme with an alphabetical indexing language in a library environment. Example: UDC + subject heading index in the National Subject Authority File (CZENAS):
CLASSIFICATION & SUBJECT HEADING SYSTEMS (1)
Trends towards thesaurus-like and descriptor-like systems
CLASSIFICATION & SUBJECT HEADING SYSTEMS (2)
• project Multilingual Subject Access to Catalogues of National Libraries (MSAC) 2005-2007 : mapping subject heading systems of 8 national libraries (Czech Republic, Slovakia, Slovenia, Croatia, Macedonia, Lithuania and Latvia) using UDC as a pivot
• MSAC results were utilized in the TEL-ME-MORE project (The European Library: Modular Extensions for Mediating Online Resources)
CLASSIFICATION & SUBJECT HEADING SYSTEMS (3)
WTO Library Thesaurus –> UDC (2008)
UDC
Example of an in-house solution, not connected to the UDC management. Problems:
§ not updated to match terminology changes in UDC
§ data cannot be publicly shared as it does not have copyright clearance
UDC & THESAURI MAPPING
Available at: http://www.taranco.eu/udk/
Konverteringstabell mellan UDK och SAB (conversion table between UDC and SAB)
UDC – SAB (Swedish Classification System)
Mapping of around 400 UDC/DDC classes to Conspectus subject categories
Dewey UDC Conspectus
UDC/DDC MAPPING BY CZECH NATIONAL LIBRARY
UDC – LBC (Library Bibliographic Class.)
UDC LBC
LBC is the main Russian classification scheme. A mapping of an outline of 1000 classes has been produced by the Book Chamber of Ukraine.
HILT (High Level Thesaurus Project) 2000-2004
• JISC (UK Joint Information Systems Committee) funded, in Phase 2 changed to building a terminology service providing access to mapping data
• pivot type intellectual mapping to Dewey Summaries (1000 classes) from thesauri, classifications and subject headings (LCSH, AAT, MeSH, UNESCO, etc.)
• publishing these mappings and making them available as SKOS data
• the service closed in 2004
PUBLISHING AND SHARING KOS MAPPINGS (1)
CrissCross project (2006-2010)
• funded by the German Research Foundation at the time of the publication of the German translation of Dewey by the German National Library and the beginning of indexing the German national bibliography using Dewey
• mapping between SWD (German subject headings) and Dewey
Coli-conc (2017-)
• project by Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG), Common Library Network
• creation of concordances between different vocabularies used in German libraries: Basic Classification (BC), Dewey, Regensburg (network) Classification (RVK), Standard Thesaurus Wirtschaft (STW) and Universal Decimal Classification (UDC)
• Coli-conc is a mapping registry; it provides infrastructure for mapping management, creation, assessment and quality control (Cocoda module)
PUBLISHING AND SHARING KOS MAPPINGS (2)
UDC
Vocabulary 2
Vocabulary 1
DDC         UDC
536 Heat    536 Heat. Thermodynamics [English]
Chaleur     Chaleur. Thermodynamique [French]
Теплота     Тепло. Термодинамика [Russian]
[Chinese captions]
Examples from UDC and DDC summaries
UDC SUMMARY AS A SWITCHING LANGUAGE (1)
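The pivot (switching language) idea can be sketched as follows. This is a minimal illustration, assuming each vocabulary has been mapped term by term to a UDC class. The two vocabularies and the function name `pivot_map` are hypothetical; only the class 536 with its captions and the auxiliary =162.3 Czech come from the slides.

```python
from collections import defaultdict


def pivot_map(source_to_pivot, target_to_pivot):
    """Link source and target terms that are mapped to the same pivot class."""
    pivot_to_target = defaultdict(list)
    for term, pivot in target_to_pivot.items():
        pivot_to_target[pivot].append(term)
    return {
        term: sorted(pivot_to_target.get(pivot, []))
        for term, pivot in source_to_pivot.items()
    }


# Hypothetical vocabularies, each mapped once to UDC.
vocabulary_1_to_udc = {"Heat": "536", "Czech language": "=162.3"}
vocabulary_2_to_udc = {"Chaleur": "536"}

if __name__ == "__main__":
    # "Heat" reaches "Chaleur" through 536; "Czech language" has no counterpart.
    print(pivot_map(vocabulary_1_to_udc, vocabulary_2_to_udc))
```

The design point is economy: with n vocabularies, a pivot needs n mappings, whereas direct pairwise mapping needs n(n-1)/2.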
example of the UDC class =162.3 Czech [Common auxiliary of language]
UDC SUMMARY AS A SWITCHING LANGUAGE (2)
2600 UDC classes > 1000 DDC classes
UDC SUMMARY > DEWEY SUMMARIES
• many other research projects dealt with KOS mapping: DESIRE, Renardus, TEL-ME-MORE, TELplus, STITCH, CARMEN, KoMoHe, Europeana, etc.
• numerous good quality vocabularies have already been published as SKOS, i.e. linked data (https://www.w3.org/2001/sw/wiki/SKOS/Datasets), although better linking between KOS is yet to happen
• publishing, use and re-use of KOS mappings is still sparse
cf. Isaac, A. (2010) “Progress in Semantic Mapping”, NKOS Workshop, Glasgow, 9 September.
CONTINUOUS INTEREST IN MAPPINGS
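As a rough illustration of what publishing a mapping as SKOS involves, the sketch below writes mapping statements as Turtle using the standard SKOS mapping properties. The concept URIs under example.org are placeholders and do not reflect the real UDC or DDC linked data URIs; `mapping_to_turtle` is a hypothetical helper.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# The five mapping properties defined by the SKOS recommendation.
MAPPING_PROPERTIES = {
    "exactMatch",
    "closeMatch",
    "broadMatch",
    "narrowMatch",
    "relatedMatch",
}


def mapping_to_turtle(mappings):
    """Serialize (source URI, mapping property, target URI) triples as Turtle."""
    lines = [f"@prefix skos: <{SKOS_NS}> ."]
    for source, relation, target in mappings:
        if relation not in MAPPING_PROPERTIES:
            raise ValueError(f"not a SKOS mapping property: {relation}")
        lines.append(f"<{source}> skos:{relation} <{target}> .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder URIs for the two classes numbered 536.
    print(mapping_to_turtle([
        ("http://example.org/udc/536", "closeMatch", "http://example.org/ddc/536"),
    ]))
```

Choosing closeMatch rather than exactMatch here is itself an assumption: the two captions for 536 differ in scope, and the mapping relation is an editorial decision.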
• relevant for alphabetical indexing languages (especially thesauri), typically one-to-one mapping (pivot mapping not excluded)
• NLP, various methodologies to determine the equivalence between terms, concepts and hierarchical relationships between different thesauri, utilising concept equivalence algorithms
• large (digital) collections of documents indexed by thesauri are also processed to enhance/confirm semantic similarities
• speedy progress made possible through publishing KOS as open linked data (SKOS): explicit semantics & resolved copyright issues for vocabularies already published as LOD
cf.
S. Faro, E. Francesconi, V. Sandrucci (2007) EUROVOC Studies LOT2: “D1.5 – Thesauri KOS analysis and selected thesaurus mapping methodology on the project case-study” (http://eurovoc.europa.eu/drupal/sites/all/files/Presentation-D1.5_Mapping_Methodology.pdf)
A. Isaac et al. (2011) “Europeana and semantic alignment of vocabularies”, 10th NKOS Workshop, Berlin, September 2011. https://at-web1.comp.glam.ac.uk/pages/research/hypermedia/nkos/nkos2011/presentations/Isaac-NKOS11.pdf
AUTOMATIC MAPPING / ONTOLOGY MAPPING
• mapping between KOS in the background driven by practical needs (organizations, national information networks)
• mapping to/between universal library classifications (dominance imposed by national bibliographies, physical collections), esp. subject heading systems
• Internet ... cross-collection, cross-language, cross-domain resource discovery and mapping between vocabularies: central to numerous international projects
• solutions for sharing and publishing KOS & KOS mappings, terminological services, KOS linked data, automatic vocabulary mapping, ontology mapping, big data bottom-up approach in categorization of information
SUMMARY
• need for vocabulary mapping registries and terminological services (updates and sustainability problem)
• discovery and recovery of in-house/localized mapping outputs (not known to the KOS data owners)
• clarify copyright and linked data (LD) related issues (with respect to mappings)
WORK AHEAD
Useful links:
• UDC Summary (2,600 classes, 57 languages): http://www.udcsummary.info
• UDC Summary as linked data: http://udcdata.info/
• UDC Hub (complete UDC schedules online, 71,000 classes): http://www.udc-hub.com/
THANK YOU