Overview of technological solutions to terminology services

Doug Tudhope

Hypermedia Research Unit

University of Glamorgan

JISC Terminology Workshop, London, February 2004

Presentation

Networked Knowledge Organisation Systems/Services

Broad review of technological approaches

NKOS Lifecycle

Introduce Workshop Demonstrations

Critical Issues and possible gaps

References

Taxonomy of Knowledge Organisation Systems

Term Lists

Authority Files, Glossaries, Gazetteers, Dictionaries

Classification and Categorization

Subject Headings

Classification Schemes and Taxonomies

eg DDC, scientific taxonomies

Relationship Schemes

Thesauri

Semantic Networks (eg WordNet)

(Ontologies)

Hodg00, http://www.clir.org/pubs/abstract/pub91abst.html

KOS ctd.

Thesauri

3 Standard Relationships between concepts (Aitc00)

Equivalence, Hierarchical, Associative

Inherent domain lexicon (lead-in vocabulary)

Concept definitions and warrant (Scope Notes)

Ontologies

Higher level conceptualisation (McGu02, Noy)

formal definition of relationships

inference rules and definition of roles (sometimes)

KOS an element of ontologies and schemas

Jaco03, Ontologies and the Semantic Web, ASIST Bulletin, April/May 2003, Special Issue on Semantic Web
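The three standard thesaurus relationships listed on this slide (equivalence, hierarchical, associative), together with lead-in vocabulary and scope notes, can be pictured as a small data structure. The following is only a minimal sketch: the class names, method names and the toy terms are assumptions made for illustration and are not taken from Aitc00 or any thesaurus standard.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus concept with the three standard relationship types."""
    preferred: str                               # preferred term (descriptor)
    use_for: set = field(default_factory=set)    # equivalence: lead-in terms (UF)
    broader: set = field(default_factory=set)    # hierarchical: BT
    narrower: set = field(default_factory=set)   # hierarchical: NT
    related: set = field(default_factory=set)    # associative: RT
    scope_note: str = ""                         # definition / warrant


class Thesaurus:
    """Hypothetical in-memory thesaurus, for illustration only."""

    def __init__(self):
        self.concepts = {}
        self.lead_in = {}  # lower-cased entry term -> preferred term

    def add(self, preferred, use_for=(), scope_note=""):
        concept = Concept(preferred, set(use_for), scope_note=scope_note)
        self.concepts[preferred] = concept
        # Both the preferred term and its lead-in terms resolve to the concept.
        for term in (preferred, *use_for):
            self.lead_in[term.lower()] = preferred
        return concept

    def link_hierarchy(self, broader, narrower):
        # Hierarchical links are stored in both directions (BT and NT).
        self.concepts[broader].narrower.add(narrower)
        self.concepts[narrower].broader.add(broader)

    def link_related(self, a, b):
        # Associative links are symmetric.
        self.concepts[a].related.add(b)
        self.concepts[b].related.add(a)

    def resolve(self, entry_term):
        """Map a searcher's entry term to the preferred term, if known."""
        return self.lead_in.get(entry_term.lower())


# Toy content, invented for this sketch.
thesaurus = Thesaurus()
thesaurus.add("Vehicles")
thesaurus.add("Bicycles", use_for=["Bikes", "Push bikes"],
              scope_note="Human-powered two-wheeled vehicles.")
thesaurus.add("Cycling")
thesaurus.link_hierarchy("Vehicles", "Bicycles")
thesaurus.link_related("Bicycles", "Cycling")
```

Here `thesaurus.resolve("bikes")` leads the searcher from a lead-in term to the preferred term "Bicycles", which is the role the inherent domain lexicon plays in retrieval.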

Recent Sources

NKOS: Networked Knowledge Organization Systems/Services

http://jodi.ecs.soton.ac.uk/?vol=4&iss=4 NKOS JoDI Special Issue

http://www.multites.com/conference03.htm MultiTes Conference

http://nkos.slis.kent.edu/ JCDL and ECDL Workshops 2003

http://www.lub.lu.se/SEMKOS/ SEMKOS IP Proposal Resources

Semantic Web - RDF/XML, RDF Schema, Metalog, OWL

http://www.w3.org/2001/sw/ W3C Semantic Web Activity

http://www.semanticweb.org/

http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb

http://www.w3c.rl.ac.uk/SWAD/thesaurus.html SWAD-Europe Thesaurus index

Semantic Grid - Semantic Web, Web service, eScience, GRID links

http://www.semanticgrid.org/

http://www.w3.org/2002/ws/ W3C Web Services Activity

http://www.ariadne.ac.uk/issue29/gardner/intro.html Gardner’s Intro to Web Services

JISC Application Area

Search/retrieval for educational purposes(?)

students, teachers, researchers

possibly

Generalised search

possibly integrated into applications

triggered to take account of context (eg Brow02)

link eScience applications?

Current operational systems (eg RDN)

lack terminology services

some browsing categories

but not integrated into search

Technologies

Information Science

Controlled Terminology

Information Retrieval (probabilistic, full text)

Intellectual/Automatic Indexing

search/browse, user interfaces

Facet Analysis

Ontology Engineering (AI Knowledge Representation)

formal (finer grained) representation, description logics

automated reasoning, Semantic Web

Distributed Systems

Z39.50, Web Services, Semantic Grid

Language Engineering

Social Engineering

Enriching / Formalising KOS

KOS Legacy - large (multilingual) vocabularies, indexed multimedia (and print) collections

Product of peer review and follow standards

However

Not utilised to full potential in some applications

Designed for human inspection, semantic structure not explicitly represented

May be inconsistently evolved from various sources

Opportunity to formalise / enrich

Partly a matter of representation in RDF/XML

but may be inconsistencies in logical structure

--> deconstruction and ontological formalisation

--> mutually exclusive concept structures

Facet Analysis (a link between technologies)

Fundamental categories / foundational concepts

eg CRG: Entity, Part, Property, Material, Process, Operation, Product, Agent, Space, Time, ...

Mapped to facets for particular KOS

Basis of several scientific and industrial KOS

Synthesis rules for principled combination of concepts

rules for combining base concepts when indexing/querying

Browsing and Searching applications

KOS integration into DL services from Hill02 Research Agenda KOS/DL

Taxonomy of KOS - KOS types linked to DL service protocols

Registries of KOS and KOS-level metadata to represent them

XML/RDF KOS representations - customisable

Core set of relationship types across all KOS

General KOS service protocol

from which protocols for specific types of KOS can be derived

Robust linking model in which DL entities (collections, objects, and services) can refer to KOS entities (concepts, labels, and relationships)

Visualization tools that fully use and display the rich semantics embedded in KOS

=> move towards a model of search service ‘flow’?

- how semantic search services combine

Terminology Services from Koch04 Structured Overview - Activities to advance the powerful use of vocabularies

Searching for concepts

schemes in registries

concepts/terms in taxonomy servers

Search support for queries

collection finding

cross-searching, cross-browsing, mapping services

KOS browsing and user interface/visualisation

query expansion, disambiguation

automatic indexing and classification

extraction/mining of terms

translation support using vocabularies

Workshop Demonstrations

… in context of NKOS Information Lifecycle

NKOS Information Lifecycle

KOS creation and maintenance

Mapping, merging vocabularies

Document creation and maintenance

Indexing, classification, annotation

intellectual, automatic

Discovery of services and databases/collections

Searching for concepts --> controlled terminology, auto-disambiguation

Querying and result display

Cross-searching, cross-browsing, mapping services

KOS browsing and user interface/visualisation

Query expansion

Extraction/mining of terms

Translation support using vocabularies

Content integration and mediation

High Level Thesaurus (HILT) - Information Science

Pilot Terminology Service

HILT team, Wordmap s/w, OCLC

discovery of collections

cross-searching JISC collections

mapping from Terminologies to DDC spine

DDC, LCSH, UNESCO, MeSH, AAT
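Mapping terminologies to a DDC spine means each vocabulary is mapped once to DDC rather than pairwise to every other vocabulary. The sketch below shows the idea only. The mapping table is invented for illustration and is not HILT data, and the function names are hypothetical.

```python
# Illustrative mappings only: (scheme, term) -> DDC class number.
# These pairs are assumptions for the sketch, not mappings from the HILT pilot.
TO_DDC = {
    ("LCSH", "Medicine"): "610",
    ("MeSH", "Medicine"): "610",
    ("UNESCO", "Medical sciences"): "610",
    ("AAT", "architecture"): "720",
    ("LCSH", "Architecture"): "720",
}


def build_spine_index(to_ddc):
    """Invert the table: DDC number -> list of (scheme, term) mapped to it."""
    index = {}
    for (scheme, term), ddc in to_ddc.items():
        index.setdefault(ddc, []).append((scheme, term))
    return index


def map_term(scheme, term, target_scheme, to_ddc=TO_DDC):
    """Translate a term into another scheme by going through the DDC spine."""
    ddc = to_ddc.get((scheme, term))
    if ddc is None:
        return []  # term has no mapping to the spine
    spine = build_spine_index(to_ddc)
    return sorted(t for s, t in spine[ddc] if s == target_scheme)
```

With this table, `map_term("MeSH", "Medicine", "LCSH")` goes from MeSH to DDC 610 and back out to the LCSH terms mapped to the same class. With n vocabularies the spine needs n mapping tables, where direct pairwise mapping would need one for every pair.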

geoXwalk - Geographic Information Science

Geo-spatial Gazetteer Service

Edina, Data Archive, CIE

feature (concept) searching

geographic searching, spatial operators

spatial result visualisation, flexible footprint

geoparser - automated geographic indexing
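Feature searching combined with spatial operators over footprints can be sketched with bounding boxes. This is a simplified illustration, not the geoXwalk implementation: the gazetteer entries, their rough coordinates and the function names are assumptions, and real footprints may be polygons rather than boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Bounding-box footprint in decimal degrees (no antimeridian handling)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other):
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Hypothetical gazetteer: (name, feature type, rough illustrative footprint).
GAZETTEER = [
    ("Wales", "country", BBox(-5.4, 51.3, -2.6, 53.5)),
    ("Pontypridd", "town", BBox(-3.40, 51.58, -3.30, 51.62)),
    ("London", "city", BBox(-0.5, 51.3, 0.3, 51.7)),
]


def search(query_box, feature_type=None, within=False):
    """Return names whose footprint intersects (or lies within) the query box."""
    hits = []
    for name, ftype, footprint in GAZETTEER:
        if feature_type and ftype != feature_type:
            continue  # feature (concept) filter
        if within:
            matched = query_box.contains(footprint)
        else:
            matched = query_box.intersects(footprint)
        if matched:
            hits.append(name)
    return hits
```

A query box over south Wales, for example `BBox(-4.0, 51.3, -2.9, 51.9)`, intersects both the Wales and Pontypridd footprints but fully contains only Pontypridd, which is the difference between the two operators.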

Renardus - Information Science

cross-browsing service

NetLab, UKOLN, ILRT, SUB, …

classification mapping via DDC

cross-searching EU subject gateways

(multilingual) user interface for browsing

in large classifications

Learning and Teaching Portal & SSL - Information Science

Systems Simulations Ltd, Index+

Learning and Teaching Support Network

Web-based thesaurus service

vocabulary management - ‘Suggest a Term’

data entry

browse and search

CIE Health Demonstrator - Information Science, facet analysis

Adiuri Systems Ltd (from IDEA Project)

Waypoint Health Info search demonstrator

faceted, multi-concept ‘query via browsing’

‘non-zero match’, postings displayed

faceted browsing user interface
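Faceted 'query via browsing' with 'non-zero match' can be read as: after each selection, offer only those further concepts that still leave at least one matching record, and show the postings count for each. The sketch below illustrates that reading. The records, facet names and function names are invented and do not describe the Waypoint demonstrator itself.

```python
from collections import Counter

# Hypothetical records indexed by facet -> set of concepts.
RECORDS = {
    "doc1": {"condition": {"Asthma"}, "audience": {"Children"},
             "treatment": {"Inhalers"}},
    "doc2": {"condition": {"Asthma"}, "audience": {"Adults"},
             "treatment": {"Inhalers"}},
    "doc3": {"condition": {"Diabetes"}, "audience": {"Adults"},
             "treatment": {"Diet"}},
}


def matching(selection):
    """Records indexed with every selected (facet, concept) pair."""
    return {
        doc for doc, facets in RECORDS.items()
        if all(concept in facets.get(facet, set())
               for facet, concept in selection)
    }


def next_choices(selection):
    """Concepts that can still be added with a non-zero result, with postings."""
    chosen = set(selection)
    postings = Counter()
    for doc in matching(selection):
        for facet, concepts in RECORDS[doc].items():
            for concept in concepts:
                if (facet, concept) not in chosen:
                    postings[(facet, concept)] += 1
    return dict(postings)
```

After selecting `("condition", "Asthma")`, the interface would offer "Children", "Adults" and "Inhalers" with their postings, but not "Diabetes" or "Diet", since those would give an empty result.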

COHSE Conceptual Open Hypermedia - Ontology, description logic, hypertext navigation

Link Navigation Using Ontologies

Manchester, Southampton University

Open Hypermedia System (Soton DLS)

open-source downloadable tools for

Ontology and Annotation Services:

eg OilEd lightweight ontology editor for DAML+OIL

OpenGALEN - Ontology, GRAIL logic, facet analysis

OpenGALEN Common Reference Model - medical coding and classification systems

Manchester University

faceted; compositional rather than traditional enumerative medical codes

multilingual GALEN-in-use Project

OpenKnoME, GALEN Case Env toolsets

Co-ODE: Collaborative Open Ontology Development Env - Ontology management

Manchester University new project

develop Ontology management tools

as plugins for Protégé (Stanford)

building on earlier experience with OilEd

concern with usability

FACET: faceted knowledge organisation for semantic retrieval - Information Science, facet analysis

University of Glamorgan, Science Museum

faceted, multi-concept bestmatch search

semantic expansion as browsing service

faceted thesaurus search interface

standalone and Web demonstrators
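Semantic expansion and best-match search can be illustrated by traversing thesaurus relationships outward from a query term, with a cost per relationship type and a cut-off. This is a generic sketch and is not claimed to be the FACET project's algorithm: the costs, the cut-off, the toy thesaurus fragment and the scoring formula are all assumptions.

```python
import heapq

# Hypothetical traversal costs per relationship type.
COST = {"NT": 1.0, "BT": 1.5, "RT": 2.0}

# Toy thesaurus fragment: term -> list of (relationship, target term).
LINKS = {
    "Locomotives": [("NT", "Steam locomotives"), ("NT", "Diesel locomotives"),
                    ("BT", "Rail vehicles"), ("RT", "Railways")],
    "Steam locomotives": [("BT", "Locomotives"), ("RT", "Steam engines")],
    "Diesel locomotives": [("BT", "Locomotives")],
    "Rail vehicles": [("NT", "Locomotives"), ("NT", "Carriages")],
    "Railways": [("RT", "Locomotives")],
    "Steam engines": [("RT", "Steam locomotives")],
    "Carriages": [("BT", "Rail vehicles")],
}


def expand(term, max_cost):
    """Cheapest traversal cost to every term reachable within max_cost."""
    best = {term: 0.0}
    queue = [(0.0, term)]
    while queue:
        cost, current = heapq.heappop(queue)
        if cost > best.get(current, float("inf")):
            continue  # stale queue entry
        for rel, target in LINKS.get(current, []):
            new_cost = cost + COST[rel]
            if new_cost <= max_cost and new_cost < best.get(target, float("inf")):
                best[target] = new_cost
                heapq.heappush(queue, (new_cost, target))
    return best


def rank(query_term, indexed_terms, max_cost=3.0):
    """Score index terms by closeness to the query term (1.0 = exact match)."""
    distances = expand(query_term, max_cost)
    scored = [(1.0 - distances[t] / (max_cost + 1.0), t)
              for t in indexed_terms if t in distances]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

Raising `max_cost` widens the expansion, so the same function can serve either a conservative query expansion or a broader browsing suggestion list.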

E-Biosci: EC platform e-publishing and info integration in Life Sciences - Information Science

European Molecular Biology Organisation

Collexis B. V. technology:

semantic matching conceptual fingerprints

link genomic data + life sciences research lit

multilingual

integrated search: full text/data/researchers

peer-reviewed, different publishing models

SKOS: Simple knowledge organisation for the semantic web - Information Science, Ontology

CCLRC, SWAD-EUROPE project

Migrate existing KOS to SemWeb via common RDF schema for thesauri and for inter-thesaurus mapping (formal OWL spec planned)

use cases for thesaurus services

lightweight RDF service demonstrators using Jena RDF API toolkit
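Migrating a thesaurus entry to RDF amounts to emitting triples for its labels and relationships against a shared schema. The sketch below hand-serialises N-Triples in plain Python rather than using the Jena toolkit mentioned above. It assumes the SKOS Core namespace and property names as published by W3C, and the `example.org` concept namespace and function names are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/thesaurus/"  # hypothetical concept namespace


def uri(local_name):
    """Wrap a concept identifier as an N-Triples URI reference."""
    return "<" + BASE + local_name + ">"


def literal(text, lang="en"):
    """Language-tagged literal with backslashes and quotes escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"@%s' % (escaped, lang)


def concept_triples(local_name, pref_label, alt_labels=(), broader=(), related=()):
    """Triples for one concept: type, labels, hierarchical and associative links."""
    subject = uri(local_name)
    triples = [
        (subject, "<" + RDF_TYPE + ">", "<" + SKOS + "Concept>"),
        (subject, "<" + SKOS + "prefLabel>", literal(pref_label)),
    ]
    triples += [(subject, "<" + SKOS + "altLabel>", literal(a)) for a in alt_labels]
    triples += [(subject, "<" + SKOS + "broader>", uri(b)) for b in broader]
    triples += [(subject, "<" + SKOS + "related>", uri(r)) for r in related]
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join("%s %s %s ." % t for t in triples)
```

For example, `to_ntriples(concept_triples("bicycles", "Bicycles", alt_labels=["Bikes"], broader=["vehicles"]))` yields four statements: the concept's type, its preferred label, one alternative label and one broader link.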

Some critical issues

Standards

User Interface

Gaps?

Critical issues (1) Standards

Ongoing initiatives to revise thesaurus standards

ANSI/NISO Z39.19

BS 5723 and BS 6723 - Dext03

BSI public draft soon, extended scope, interoperability

Thesaurus Representations

RDF - SWAD03; Topic Map - Ligh03; various XML

Possibilities to extend current relationships by specialisation, enriching standards but maintaining compatibility

KOS Service Protocols - Bind04

service oriented approach with composite service provision

not based on atomic elements of data structures and relationships

expansion service provision

NKOS Registry - Vizi01; MEG Registry Project

Cost/benefit issues

Thesaurus long-lived, pragmatic and useful tool

cost-effective granularity of relationships for some search apps

Domain lexicon (UF/ALTs, Scope Notes)

Cost/benefit issues in KOS formalisation

Application dependent level of precision in concept use

Some apps very precise use of concepts (medical?)

Other apps may vary in concept application (humanities?)

Indexer - Searcher variation

Results based on probable relevance judgements

Critical issues (2) User interface

User interface critical given controlled terminology demands

Offer different options

Move beyond minimal assumptions of current web search engines on users, query structure, collections

Link with service protocol issues

kind of interfaces easily afforded

Accessibility issues

Critical issues (3) Gaps?

Language Engineering

Related standards - Shre03

POS tagging tools

large statistical corpora --> source of context data for disambiguation, annotation, proactive search

JISC-specific corpora?

Collect portal use data --> taxonomies, synonyms

Time-varying synonyms - BBCi04

Probabilistic IR

term frequency information, automatic weighting
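Automatic weighting from term frequency information can be illustrated with the familiar tf-idf scheme. It is used here as a simple stand-in and is not itself a probabilistic retrieval model. The toy documents and the function name are assumptions for the sketch.

```python
import math
from collections import Counter

# Toy collection, invented for this sketch.
DOCS = {
    "d1": "thesaurus mapping service thesaurus",
    "d2": "gazetteer service",
    "d3": "ontology editor",
}


def tf_idf(docs):
    """Weight each term by its frequency in a document and rarity in the collection."""
    tokenised = {doc_id: text.split() for doc_id, text in docs.items()}
    n_docs = len(tokenised)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenised.values() for term in set(tokens))
    weights = {}
    for doc_id, tokens in tokenised.items():
        counts = Counter(tokens)
        weights[doc_id] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return weights
```

In this collection "thesaurus" outweighs "service" within `d1`, because it occurs twice there and in no other document, whereas "service" also appears in `d2`.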

Social Engineering?

What do users really want?

Problems of introducing new technologies

Sometimes a matter of both reflecting and shaping user needs

Done implicitly by successful projects

but also extant literature on sociology/philosophy of innovation

Lessons from:

Participatory Design, Rapid Application Development - Tudh00

evolving network: prototypes, user expectations, requirements and working practices

Lead / Ambassador Users

training, tailoring and advocacy / motivation.

Contact Information

Doug Tudhope

School of Computing

University of Glamorgan

Pontypridd CF37 1DL

Wales, UK

dstudhope@glam.ac.uk

http://www.comp.glam.ac.uk/pages/staff/dstudhope

References

Aitchison J., Gilchrist A., Bawden D. 2000. Thesaurus construction and use: a practical manual (4th edition). London: ASLIB.

BBCi. A day in the life of BBCi search. http://www.currybet.net/articles/day_in_the_life/index.shtml

Binding C., Tudhope D. 2004. KOS at your Service: Programmatic Access to Knowledge Organisation Systems. JoDI 4(4), http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Binding/

Brown P. 2002. From information retrieval to hypertext linking. New Review of Hypermedia and Multimedia,8, 231-255.

Dextre Clarke S. 2003. BS 8723 : a new British Standard for structured vocabularies. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/dextre_clarke.ppt

Hill et al. 2002. Integration of Knowledge Organization Systems into Digital Library Architectures. ASIST SigCR - http://www.lub.lu.se/SEMKOS/docs/Hill_KOSpaper7-2-final.doc

Hodge Gail, 2000. Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. CLIR Pub91. April 2000. http://www.clir.org/pubs/abstract/pub91abst.html

Jacob Elin. 2003. Ontologies and the Semantic Web. ASIST Bulletin, April/May 2003, Special Issue on Semantic Web. http://www.asis.org/Bulletin/Apr-03/BulletinAprMay03.pdf

Koch T. Activities to advance the powerful use of vocabularies in the digital environment - Structured overview. http://www.lub.lu.se/~traugott/drafts/seattlespec-vocab.html

Light R. 2003. XML (and Topic Maps). http://www.richardlight.org.uk/thesauri/thesauri.htm

McGuinness D. 2002. Ontologies Come of Age. In: (Fensel et al eds.) Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press.

MultiTes 2003. Conference on Thesauri and Taxonomies http://www.multites.com/conference03.htm

References ctd.

NKOS: Networked Knowledge Organization Systems/Services, http://nkos.slis.kent.edu/

NKOS 2003. Workshop ECDL. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-Workshop.php

NKOS 2004. New Applications of Knowledge Organization Systems. NKOS Special Issue, JoDI. http://jodi.ecs.soton.ac.uk/?vol=4&iss=4

Noy N., McGuinness D. Ontology Development 101: A Guide to Creating Your First Ontology. http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html

Shreve G. 2003. Terminology Standards. http://www.glam.ac.uk/soc/research/hypermedia/NKOS-workshop%20Folder/Shreve.ppt

Soergel D. The representation of Knowledge Organization Structure (KOS) data: a multiplicity of standards. http://www.glam.ac.uk/soc/research/hypermedia/publications/SoergelNKOS2001KOSStandards.PDF

SWAD-Europe Thesaurus Activity. http://www.w3c.rl.ac.uk/SWAD/thesaurus.html

Tudhope D, Beynon-Davies P, Mackay H. 2000. Prototyping praxis: Constructing computer systems and building belief. Human Computer Interaction, 15(4), 353-383. http://www.glam.ac.uk/soc/research/hypermedia/publications/tudhope-2000.pdf

Vizine-Goetz D. 2001. NKOS Registry - draft proposal for KOS-level metadata. http://staff.oclc.org/~vizine/NKOS/Thesaurus_Registry_version3_rev.htm
