Cross domain knowledge discovery, complex system theory and semantic web

Aida Slavic, Christophe Gueret, Andrea Scharnhorst

Why Otlet?


Paul Otlet - Mundaneum

• International Bibliographic Institute

• Ordering the knowledge of the world

• Universal Decimal Classification

• Palais Mondial

All pictures shown on the following slides are under the copyright of the Mundaneum

Classification

The scientific process

Workplaces - workspaces

Encyclopedia Universalis Mundaneum – a Visual Encyclopedia

The What, How and Why together

Begreifen & Vermitteln

Otlet - Visions

• Every information scientist (= all of us, says Martin White) should visit the Mundaneum once in her/his life

• Without history we are condemned to eternal repetition (Mircea Eliade: prehistoric societies live in eternal return).

• Visual language experiments, exhibitions, objects

• Retrieval service (see B. Rayward: Proceedings Classifications & Visualizations 2013) – entrepreneur

The UDC as part of Otlet’s heritage!

Designing interfaces to collections – visually enhanced browsing

All datasets in EASY - the digital research data archive at DANS at one glance. www.drasticdata.nl

Steps towards visually enhanced/faceted browsing

Creation of descriptions

Structure of a collection

(composition, evolution)

Visualizations Search functions

Implementation • Combination with IR • User behavior • Application to ‘living collections’

AH  CS   M&P   SoSc  

• Translated into over 40 languages, used in over 130 countries:
  • bibliographies and bibliographic databases
  • libraries (also some museums, archives)
  • digital collections, web portals, alerting services

• Annually updated and distributed as a file: 18 versions/‘editions’ since 1992
  • 1992: 60,000 classes; 2011: over 70,000 classes
  • over 10,000 classes cancelled
  • over 19,000 new classes added

Back to UDC – Facts and Figures

Understanding the evolution of Knowledge Organization Systems

0. Science and Knowledge. Computer Science. Information

1. Philosophy. Psychology

2. Religion. Theology

3. Social Sciences. Statistics. Politics. Economy. Law

5. Mathematics. Natural Sciences

6. Applied Sciences. Medicine. Technology

7. Arts. Entertainment. Sport

8. Language. Linguistics. Literature

9. Geography. Biography. History

4. Philology

Auxiliaries

(a) (b)

More research needed – various KOS

• ACM – Veslava Osinska

• MeSH – Alexander Petersen/Orion Penner

• ISI Classifications – Sandor Soos

• Wikipedia – Janusz Kertesz, Krzysztof Suchecki

• ….

• Cross-language linking of concepts, i.e. managing links between concepts and languages

=512.16 Jižní skupina turkických jazyků [Czech]

Южная группа тюркских языков [Russian]

तुर्की भाषाओं का दक्षिणी समूह [Hindi]

Թուրքական լեզուների հարավային խումբ [Armenian]

Νότια ομάδα των Τουρκικών γλωσσών [Greek]

突厥南部语 [Chinese]

দক্ষিণী শ্রেণির তুর্কি ভাষাসমূহ [Bengali]

チュルク語南部群 [Japanese]

ಟರ್ಕಿ ಭಾಷೆಗಳ ದಕ್ಷಿಣ ಭಾಗದ ಸಮೂಹ [Kannada]

• Hierarchies: graphic knowledge presentation, browsing knowledge space (supporting interactive user behaviour)

• Linking concepts: ‘fish’ in zoology, in sport, in cooking, in the food industry, in animal husbandry

UDC Applications: Scope and Potentials

 

       

Sharks

Natural Sciences > Biology > Animals > Vertebrata > Pisces (Fishes) > Elasmobranchii > Sharks

Arts. Recreation. Entertainment. Sport > Film. Cinema (motion pictures) > Film genres > Documentary films > Documentaries about sharks

Social Sciences > Economic science > Economic sectors > Tourism > Adventure tourism > Swimming with sharks

Arts. Recreation. Entertainment. Sport > Sport > Sport fishing > Sea fishing > Shark fishing

Applied Sciences > Agriculture > Fishing > Fishing for deep-sea species > Shark fishing

Applied Sciences > Industries > Leather industry > Fish skin > Sharkskin

Linking concepts across knowledge
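
The shark example can be sketched in a few lines of Python. This is only an illustration of the idea, not how UDC data is actually stored or queried: the hierarchy paths are typed in by hand from the slide, and the function name `contexts_for` is a hypothetical helper. It collects every path whose final caption mentions a term and groups the hits by top-level class, which is one simple way to see a concept surface in several domains at once.

```python
from collections import defaultdict

# Hierarchy paths copied from the slide; each list runs from main class to leaf.
PATHS = [
    ["Natural Sciences", "Biology", "Animals", "Vertebrata",
     "Pisces (Fishes)", "Elasmobranchii", "Sharks"],
    ["Arts. Recreation. Entertainment. Sport", "Film. Cinema (motion pictures)",
     "Film genres", "Documentary films", "Documentaries about sharks"],
    ["Social Sciences", "Economic science", "Economic sectors", "Tourism",
     "Adventure tourism", "Swimming with sharks"],
    ["Arts. Recreation. Entertainment. Sport", "Sport", "Sport fishing",
     "Sea fishing", "Shark fishing"],
    ["Applied Sciences", "Agriculture", "Fishing",
     "Fishing for deep-sea species", "Shark fishing"],
    ["Applied Sciences", "Industries", "Leather industry", "Fish skin",
     "Sharkskin"],
]


def contexts_for(term, paths):
    """Group by main class every path whose leaf caption contains `term`."""
    found = defaultdict(list)
    needle = term.lower()
    for path in paths:
        if needle in path[-1].lower():
            found[path[0]].append(" > ".join(path))
    return dict(found)


if __name__ == "__main__":
    for main_class, hits in contexts_for("shark", PATHS).items():
        print(main_class)
        for hit in hits:
            print("   ", hit)
```

A plain substring match on captions is the crudest possible linking strategy; it is used here only because the slide gives captions rather than notations.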

Two requirements:

•  publishing classification: open access to classification vocabulary for m2m processing

•  publishing library catalogues: open access to collections and collections’ metadata for m2m processing

Classifications can be used on the Web to:

• Improve and enrich semantics and access points in the retrieval of information

• Enable information discovery across collections and languages

KOS and Libraries in the web of knowledge

Linked Open Data – Big Data

See also: Tutorial Linked Data: stap voor stap. Paul Hermans, http://www.den.nl/nieuws/bericht/3075/

What are Linked (Open) Data?

What is the clue?

Peter Richmond as chair of MP0801

Peter Richmond publishing with Sorin Solomon

Peter Richmond using EI/M0HBL

Ah! This is one person! URI … /this is a person/ this is this person
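
A minimal sketch of why a shared URI is "the clue", assuming a made-up identifier and made-up predicate names (`chairOf`, `coAuthorWith`, `uses`); none of these come from a real vocabulary. Three statements from different contexts use the same subject URI, so a program can merge them into one description without guessing whether the three "Peter Richmond" strings refer to the same person.

```python
# Hypothetical URI for illustration only; the slide elides the real one.
PERSON = "http://example.org/person/peter-richmond"

# Subject-predicate-object statements, as they might arrive from three sources.
TRIPLES = [
    (PERSON, "chairOf", "MP0801"),
    (PERSON, "coAuthorWith", "Sorin Solomon"),
    (PERSON, "uses", "EI/M0HBL"),
]


def describe(subject, triples):
    """Merge all statements about one URI into a single description."""
    facts = {}
    for s, predicate, obj in triples:
        if s == subject:
            facts.setdefault(predicate, []).append(obj)
    return facts


if __name__ == "__main__":
    print(describe(PERSON, TRIPLES))
```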

Designed for machines by humans!

Occupied by machines guided by humans!

Retrievable by (some) humans!

Information/Data in databases live urban (some say in ghettos)

Information/Data in the semantic web live in the wild – self-organized, endangered

Science+Engineering Data models + standards, Web technologies Algorithms

Connecting collections of data by programs (machine-to-machine). XML/RDF presentation relies on unique identification of resources (URIs) pointing to one another.

COLLECTION CATALOGUES

XML/RDF export

UDC

XML/RDF export

KOS and Libraries stream into the LOD cloud – what is the problem?

UDC as LOD

The first stage contains the following UDC data:

• UDC number (notation) → skos:notation
• class identifier (URI) → skos:Concept
• broader class (URI) → skos:broader
• caption → skos:prefLabel
• including note → skos:note
• application note → skos:note
• scope note → skos:scopeNote
• examples → skos:example
• see also reference → skos:related

example of the UDC class =162.3 Czech [Common auxiliary of language]
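
The original example on this slide did not survive extraction. As a stand-in, here is a rough sketch of what a SKOS record for this class could look like in Turtle, generated with plain string formatting. The base URI and the class identifiers (`c162-3`, `c162`) are placeholders, not the real UDC linked-data identifiers, and the broader class is assumed purely for illustration. Only the SKOS namespace and the property names listed in the mapping are taken as given.

```python
# Placeholder namespace; the real UDC URI scheme is not shown in the slides.
BASE = "http://example.org/udc/"
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


def skos_turtle(class_id, notation, caption, broader_id=None, lang="en"):
    """Render one class as a Turtle snippet (no escaping of quotes in input)."""
    lines = [
        f"<{BASE}{class_id}> a skos:Concept ;",
        f'    skos:notation "{notation}" ;',
    ]
    if broader_id is not None:
        lines.append(f"    skos:broader <{BASE}{broader_id}> ;")
    lines.append(f'    skos:prefLabel "{caption}"@{lang} .')
    return SKOS_PREFIX + "\n".join(lines)


if __name__ == "__main__":
    # Identifiers and the broader class are hypothetical.
    print(skos_turtle("c162-3", "=162.3", "Czech", broader_id="c162"))
```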

For (machine) eyes only! Who said machines make life easier?

• Enable automatic redirection on the Web from cancelled UDC numbers. The UDC MRF database holds data as follows:

• SKOS does not offer a solution for presenting this kind of data at the moment

• But in RDF, it is possible to use other models in combination with SKOS…
  • Dublin Core for versioning: links the two classes using properties in the terms namespace: isReplacedBy; replaces

UDC CLASS NUMBER: 22
DESCRIPTION: The Bible. Holy scripture
REPLACED BY: 26-23 Judaism – Scriptures - Bible
             27-23 Christianity – Scriptures - Bible
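
The redirection idea can be sketched as follows, using only the record shown above. The table layout and both function names are assumptions made for this example, not the structure of the UDC MRF database. One function resolves a cancelled notation to its current successors (following chains in case a successor is itself cancelled later); the other emits the paired Dublin Core statements in both directions.

```python
# Replacement table built from the single record shown on the slide.
REPLACED_BY = {"22": ["26-23", "27-23"]}


def redirect_targets(notation, replaced_by=REPLACED_BY):
    """Return the current class(es) for a notation, following replacements."""
    seen, current = set(), []
    stack = [notation]
    while stack:
        n = stack.pop()
        if n in seen:  # guard against accidental cycles in the data
            continue
        seen.add(n)
        successors = replaced_by.get(n)
        if successors:
            stack.extend(reversed(successors))  # keep the listed order
        else:
            current.append(n)
    return current


def dcterms_triples(replaced_by=REPLACED_BY):
    """Express each replacement in both directions with Dublin Core terms."""
    triples = []
    for old, successors in replaced_by.items():
        for new in successors:
            triples.append((old, "dcterms:isReplacedBy", new))
            triples.append((new, "dcterms:replaces", old))
    return triples


if __name__ == "__main__":
    print(redirect_targets("22"))
    for triple in dcterms_triples():
        print(triple)
```

Note that one cancelled class can map to several successors, so a real redirect service would have to offer a choice rather than a single target.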

Issues - 1

37:004 Application of computers in education

32:37 Relationships between politics and education

§  Library catalogues or authority data shared on the web contain many pre-combined numbers, and when published as linked data these numbers may have their own URI

§  How to link representations between notations from the original schemes and complex subject expressions developed at the point of indexing?

§  Complex UDC expressions appear in the process of use and may not appear in the original scheme

Issues - 2
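
One possible way to relate a pre-combined number back to the scheme, sketched under strong simplifications: only the simple colon relation from the two examples above is handled (real UDC syntax has further connectors and auxiliaries, which this ignores), and the base URI is a placeholder. Each component notation is mapped to a scheme URI, so the complex expression can point at the classes it was built from.

```python
from urllib.parse import quote

BASE = "http://example.org/udc/"  # placeholder, not a real UDC namespace


def components(expression):
    """Split a colon-combined expression into its component notations."""
    return [part for part in expression.split(":") if part]


def link_to_scheme(expression, base=BASE):
    """Map each component notation to a (hypothetical) scheme URI."""
    return {part: base + quote(part, safe="") for part in components(expression)}


if __name__ == "__main__":
    for expr in ("37:004", "32:37"):
        print(expr, link_to_scheme(expr))
    # Both expressions share the class 37 (education in the captions above).
    print(set(components("37:004")) & set(components("32:37")))
```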

Issues - 3

• Versions of UDC
  • Can be controlled in the editions
  • But what about actual use in libraries? There, UDC numbers do not carry the year in which they were assigned

• Updates of UDC – how to give the web a memory
  • Provenance
  • Memento
  • URI …/last/….

Summary

• Heritage of Otlet and other Information/Documentation pioneers -> Mundaneum 2015 Exhibit Knowledge Maps

• Self-organized knowledge creation and KOS belong together: both are intriguing objects of study, and are best studied in combination

• Transition to Big Data in the form of Linked (Open) Data requires careful and inventive preparations

• It is a long way from visionary drawings to useful visual navigation, but starting the journey is worthwhile and needed.