Introductory Review of Current Knowledge Organization Systems/Structures/Services (KOS)

Marcia Lei Zeng

Second International Seminar on Subject Access to Information, Helsinki, Finland, 29-30 November 2007

M.L.Zeng @ ISSAI, Helsinki,2007 2

Purpose of this talk

• Introduce different types of knowledge organization systems/structures/services (KOS)

• Provide a common terminology and background

1. KOS overview (1)

Knowledge organization systems/structures/services (KOS) encompass all types of schemes for organizing information and promoting knowledge management. – (Gail Hodge, 2000)

1. KOS overview (2)

These systems
• model the underlying semantic structure of a domain, and
• provide semantics, navigation, and translation through labels, definitions, typing, relationships, and properties for concepts. – (Hill et al. 2002, Koch and Tudhope 2004).

A Taxonomy of KOS

[Figure: KOS types arranged along two axes, Structure and Function]

• Term Lists: Pick lists, Glossaries/Dictionaries, Synonym Rings
• Metadata-like Models: Authority Files, Gazetteers, Directories
• Classification & Categorization: Subject Headings, Categorization schemes, Taxonomies, Classification schemes
• Relationship Models: Thesauri, Semantic networks, Ontologies

2. Fundamentals of KOS Approaches

• 2.1 Eliminating ambiguity
• 2.2 Controlling synonyms or equivalents
• 2.3 Making explicit semantic relationships
– Hierarchical relationships
– Hierarchical + other associative relationships
• 2.4 Presenting relationships as well as properties of concepts

2.1 Eliminating ambiguity

• Ambiguity: terms having the same spelling (homographs) that represent different concepts or meanings

• Ambiguity exists when a given term can be used to represent completely different concepts.

Ambiguity / Homographs

Source: Z39.19-2005, p.25

To eliminate ambiguity (1)

1. Adding a qualifier to a term -- one of the major methods used by almost every type of KOS, especially lists of subject headings and thesauri.

• e.g., Mercury (automobile)
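As a minimal sketch of how a qualifier separates homographs, the Python fragment below stores each concept under its term plus a parenthetical qualifier. Only "Mercury (automobile)" comes from the slide; the other two headings, the glosses, and the helper name `candidates` are illustrative assumptions.

```python
# Hypothetical mini-vocabulary: each homograph carries a parenthetical qualifier.
VOCABULARY = {
    "Mercury (automobile)": "a car brand",
    "Mercury (planet)": "the planet closest to the Sun",
    "Mercury (metal)": "the chemical element Hg",
}

def candidates(term: str) -> list[str]:
    """Return every qualified heading whose base term matches `term`."""
    prefix = term.lower() + " ("
    return sorted(h for h in VOCABULARY if h.lower().startswith(prefix))

# A bare query for the ambiguous term surfaces all qualified headings,
# so the user (or indexer) can pick the intended concept.
print(candidates("mercury"))
```

Keeping the qualifier inside the heading string mirrors how subject heading lists and thesauri display it; a production system might store it as a separate field instead.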

To eliminate ambiguity (2)

2. Providing a scope note -- another major method used by almost every type of KOS, especially lists of subject headings, classifications, and thesauri.

Screenshot from MeSH, http://www.nlm.nih.gov/mesh/MBrowser.html, Entry: mercury

To eliminate ambiguity (3)

3. Providing a context for a term

What are these?

• Flying Horse
• King Fisher
• Royal Challenge
• Heineken
• Budweiser
• Miller-Lite
• Bud-Light

Drinks
• Flying Horse
• King Fisher
• Royal Challenge
• Taj Mahal
• Hayward’s 2000
• Heineken
• Corona
• Budweiser
• Miller-Lite
• Bud-Light

Lists (Picklists)

A type of controlled vocabulary included in the NISO Z39.19 standard

• Lists are used to describe aspects of content objects or entities that have a limited number of possibilities.

• Examples include:
– geography (e.g., country, state, city),
– language (e.g., English, French, Swedish),
– format (e.g., text, image, sound), or
– … …

Lists can be used effectively for both browsing and searching.

• In browsing, items are directly accessed when the list of terms is reviewed and one term is selected.

Source: http://www.ncbi.nlm.nih.gov/genome/guide/human/resources.shtml

• In searching, a list may be used to access content in a single term search, or the terms from the list may be used to limit a retrieved set by another attribute of interest for the user (one or more terms in the search).

Source: Google’s advanced search http://www.google.com

pick lists

Waterford County Image Archive, http://www.waterfordcountyimages.org

List - Definition, Purpose, and Uses

• A list (also called a pick list) is a limited set of terms arranged as a simple alphabetical list or in some other logically evident way.
– A list is a series of terms in some sequential order.
– Terms can be ordered alphabetically, chronologically, numerically, etc.

Exercise: Which list is better?

• The defining characteristics of a list are that the terms:
· are all members of the same set or class of items (e.g., countries, products)
· are not overlapping in meaning
· are equal in terms of specificity (granularity)
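The characteristics above can be partly enforced in software. The sketch below is an assumption-laden illustration, not a prescribed method: the function names are hypothetical, and the "not overlapping" rule is only approximated by rejecting duplicate labels. Whether terms belong to one class and share the same granularity still needs editorial judgment.

```python
def make_pick_list(terms):
    """Build a pick list, rejecting blank or duplicate terms (case-insensitive)."""
    seen = set()
    cleaned = []
    for term in terms:
        label = term.strip()
        key = label.casefold()
        if not label or key in seen:
            raise ValueError(f"blank or duplicate term: {term!r}")
        seen.add(key)
        cleaned.append(label)
    # Present the terms in a simple alphabetical order.
    return sorted(cleaned, key=str.casefold)

def is_valid_choice(pick_list, value):
    """A value is acceptable only if it is one of the listed terms."""
    return value in pick_list

# Language terms taken from the examples earlier in the talk.
LANGUAGES = make_pick_list(["Swedish", "English", "French"])
print(LANGUAGES)
```

Used behind a pull-down menu, `is_valid_choice` would reject any value that is not on the list, which is what keeps a pick list a controlled vocabulary rather than free text.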

Typical applications

• Lists are frequently used to display small sets of terms that are to be used for quite narrowly defined purposes such as a web pull-down list or list of menu choices.

2.2 Controlling synonyms or equivalents

• Synonyms: terms with the same or similar meanings

1. True synonyms (unusual)
– mean exactly the same thing and are used in precisely the same context

2. Near synonyms (most common)

1. True Synonyms

• common and technical names
– salt vs. sodium chloride
• changes in usage of terms over time
– electronic calculating machines vs. computers
• in different languages
– eyeglasses, spectacles, glasses
• acronyms
– BBC, British Broadcasting Corporation; MPG, miles per gallon
• variant spellings:
– cancelled, canceled; honor, honour

2. Near Synonyms

• Same stem
– computing, computers, computed, microcomputers, supercomputers
• Overlapping concepts
– medicine, drugs
– fired, laid off
– forest, woods
– arid, dry
• General and specific terms
Coffee
– Double Espresso
– Latte
– Cappuccino
– Short Black
– Macchiato
– Flat White
– etc.

Synonymy

Source: Z39.19-2005, p.25

• Each distinct concept should refer to a unique linguistic form.

• Information or content that is provided to a user should not spread across the system under multiple access points, but should be gathered together in one place.

Authority File

… …
150    World War, 1939-1945
450    European War, 1939-1945
450    Second World War, 1939-1945
450    World War 2, 1939-1945
450    World War II, 1939-1945
450    World War Two, 1939-1945

Source: FAST: Faceted Application of Subject Terminology, http://fast.oclc.org/

Controlling synonyms: there will only be one term used to represent a given concept or entity.

or:

Thesaurus

World War, 1939-1945
UF    European War, 1939-1945
UF    Second World War, 1939-1945
UF    World War 2, 1939-1945
UF    World War II, 1939-1945
UF    World War Two, 1939-1945

European War, 1939-1945
USE World War, 1939-1945

Second World War, 1939-1945
USE World War, 1939-1945

World War 2, 1939-1945
USE World War, 1939-1945

World War II, 1939-1945
USE World War, 1939-1945

World War Two, 1939-1945
USE World War, 1939-1945
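The UF/USE pairs in the thesaurus example can be read as a lookup table: every entry term resolves to exactly one preferred term. A minimal sketch in Python follows; the dictionary layout and the function name `preferred_term` are assumptions made for illustration, while the terms themselves are copied from the example.

```python
# Preferred term -> its "used for" (UF) entry terms, copied from the example.
USED_FOR = {
    "World War, 1939-1945": [
        "European War, 1939-1945",
        "Second World War, 1939-1945",
        "World War 2, 1939-1945",
        "World War II, 1939-1945",
        "World War Two, 1939-1945",
    ],
}

# Invert UF into USE references: every entry term points to one preferred term.
USE = {
    entry: preferred
    for preferred, entries in USED_FOR.items()
    for entry in entries
}

def preferred_term(term: str) -> str:
    """Resolve an entry term to its preferred term; preferred terms map to themselves."""
    if term in USED_FOR:
        return term
    return USE[term]  # raises KeyError for terms outside the vocabulary

print(preferred_term("World War II, 1939-1945"))
```

Deriving USE from UF, rather than maintaining both by hand, keeps the two directions of the equivalence relationship reciprocal.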

Source: Art and Architecture Thesaurus (AAT)

Source: Medical Subject Headings (MeSH)

Synonym Rings

A type of controlled vocabulary included in the NISO Z39.19 standard

astronaut
spaceman
cosmonaut
spationaut
taikonaut

A synonym ring connects a set of words that are defined as equivalent for retrieval.

An example from International SEMATECH.

A search for Silicon would look like this:

Your search was submitted as “SILICON” or “SI”

Synonym Rings are used --
• to expand queries for content objects
– If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster.
• in systems where the underlying content objects are left in their unstructured natural language format
– The control is achieved through the interface by drawing together similar terms into these clusters.
• in conjunction with search engines
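Query expansion with a synonym ring can be sketched in a few lines of Python. This is an illustration under stated assumptions: the ring reuses the astronaut example from the earlier slide, the sample documents are invented, and matching is naive whitespace tokenization rather than anything a real search engine would do.

```python
# One ring per set; every member is treated as equivalent for retrieval.
RINGS = [
    {"astronaut", "spaceman", "cosmonaut", "spationaut", "taikonaut"},
]

def expand_query(term):
    """Return all ring members for `term`, or just the term if it is in no ring."""
    key = term.casefold()
    for ring in RINGS:
        if key in ring:
            return ring
    return {key}

def search(term, documents):
    """Retrieve documents containing any term of the expanded query."""
    terms = expand_query(term)
    return [doc for doc in documents if terms & set(doc.casefold().split())]

# Hypothetical unstructured content objects.
DOCS = [
    "A cosmonaut returned from orbit",
    "The taikonaut program expanded",
    "Weather report for Helsinki",
]
print(search("astronaut", DOCS))
```

Note that the documents themselves are never re-indexed; the control happens entirely at query time, which is why rings suit unstructured natural-language content.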

Poverty mitigation

Poverty alleviation

Poverty elimination

Poverty reducation

Poverty eradication

Poverty abatement

Poverty prevention

Poverty reduction

Rings can include all kinds of synonyms - true, misspellings, predecessors, abbreviations

Source: Bedford, 2006 ppt.

Exercise

• Find synonyms of this type of object:

2.3 Making explicit semantic relationships – Hierarchical relationships

Birds
. Cardinals
. Doves
. Robins
. Wrens

All specific names of birds are kinds of birds.

Scientific Taxonomy
An example: Leatherback turtle

Phylum: Chordata
Class: Reptilia
Subclass: Anapsida
Order: Testudines
Suborder: Cryptodira
Family: Dermochelyidae
Genus: Dermochelys
Species: Dermochelys coriacea (Leatherback turtle)
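A hierarchy like the leatherback turtle example can be stored as child-to-parent links and walked upward. The sketch below uses the taxon names from the slide; the `BROADER` mapping and the helper names are assumptions chosen for illustration.

```python
# Each entry maps a class to its immediate superordinate (broader) class.
BROADER = {
    "Dermochelys coriacea": "Dermochelys",
    "Dermochelys": "Dermochelyidae",
    "Dermochelyidae": "Cryptodira",
    "Cryptodira": "Testudines",
    "Testudines": "Anapsida",
    "Anapsida": "Reptilia",
    "Reptilia": "Chordata",
}

def ancestors(name):
    """Walk upward from `name`, collecting every broader class in order."""
    chain = []
    while name in BROADER:
        name = BROADER[name]
        chain.append(name)
    return chain

def is_kind_of(name, candidate):
    """True if `candidate` is somewhere above `name` in the hierarchy."""
    return candidate in ancestors(name)

print(ancestors("Dermochelys"))
print(is_kind_of("Dermochelys coriacea", "Reptilia"))
```

Because each class has a single parent here, the structure is a strict tree; a polyhierarchy would need a list of parents per class.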

Classifications

superordinate classes (e.g., parents)
. coordinate classes (e.g., siblings)
. . subordinate classes (e.g., children)
. . subordinate classes
. coordinate classes
. coordinate classes
. . subordinate classes

relationship types: generic, instance, and whole-part

2.3 Making explicit semantic relationships – Associative relationships (not hierarchical)

Relationship                        Example
Part / Whole                        Bicycle / Bicycle Wheel
Cause / Effect                      Accident / Injury
Process / Agent                     Velocity measurement / Speedometer
Action / Product                    Writing / Publication
Action / Patient                    Teaching / Student
Concept or Thing / Properties       Steel alloy / Corrosion resistance
Concept or Thing / Origins          Water / Well
Thing or Action / Counter-agent     Pest / Pesticide
Raw material / Product              Grapes / Wine
Action / Property                   Communication / Communication skills
Antonyms                            Single people / Married people

Source: Z39.19-2005, p.29
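Associative relationships are usually reciprocal: if Grapes is related to Wine, then Wine is related to Grapes. A minimal sketch of that bookkeeping follows; the pairs and relationship labels come from the Z39.19 examples on the slide, while the `RELATED` store and the function names are hypothetical.

```python
from collections import defaultdict

# term -> set of (related term, relationship label)
RELATED = defaultdict(set)

def relate(a, b, kind):
    """Record an associative (related-term) link in both directions."""
    RELATED[a].add((b, kind))
    RELATED[b].add((a, kind))

relate("Accident", "Injury", "Cause / Effect")
relate("Grapes", "Wine", "Raw material / Product")
relate("Pest", "Pesticide", "Thing or Action / Counter-agent")

def related_terms(term):
    """List the terms associated with `term`, ignoring the relationship label."""
    return sorted(other for other, _ in RELATED.get(term, ()))

print(related_terms("Wine"))
```

Storing the label alongside each link keeps the option of showing users why two terms are related, not only that they are.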

KOS in Use at World Bank

• Topic Thesaurus (500,000+ English terms, French and Spanish language versions in progress now)

• Topic Classification Scheme (30 top classes, 700+ subtopics, 300+ subsubtopics)

• Business Function Thesaurus (50,000 terms and growing)

• Business Function Classification Scheme (5 business areas, 30 lines of business, 300+ business processes)

• Country-Region classification scheme (6 regions, ca. 200 countries)

• Content Type Classification Scheme (8 content types, 300+ secondary content types – in refinement now)

• Media-Format Classification Scheme

• Country Name Authority Control (synonym, predecessor, successor sources)

• Edition Statements Authority Control

• Publisher Name Authority Control

• Organization Authority Control

• Language Authority Control

• Series Name/Collection Title Authority Control

• Translation Type Authority Control

Source: Bedford, 2007, ASIST

Vision of An Enterprise Advanced Search

[Figure labels: Pick lists, Hierarchical taxonomy, Synonym Rings]

Source: Revised based on Bedford, 2006 ppt.

[Figure labels: Synonym Rings, Thesaurus, Metadata]

Source: Revised based on Bedford, 2006 ppt.

2.4 Presenting relationships as well as properties of concepts

• Entity types
• Relationship types
• Properties

Semantic networks

organize sets of terms representing concepts, modeled as the nodes in a network of variable relationship types.
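One common way to hold such a network in code is a list of (subject, relation, object) triples, where the relation type labels each edge. The triples and the `neighbors` helper below are hypothetical illustrations, not content from any particular KOS.

```python
# Hypothetical (subject, relation, object) triples; the relation labels each edge.
TRIPLES = [
    ("Virus", "is_a", "Organism"),
    ("Virus", "causes", "Disease"),
    ("Drug", "treats", "Disease"),
]

def neighbors(node, relation=None):
    """Objects reachable from `node`, optionally restricted to one relation type."""
    return [
        obj
        for subj, rel, obj in TRIPLES
        if subj == node and (relation is None or rel == relation)
    ]

print(neighbors("Virus"))
print(neighbors("Virus", relation="causes"))
```

Unlike a thesaurus with its fixed set of relationship kinds, nothing here limits how many relation types the network may use.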

UMLS Semantic Network

135 Semantic Types and 54 Semantic Relation Types

Source: Noy, N. F. and Tu, S.W. (2003).

Ontologies

Classes

attributes

instances
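The three building blocks named above map loosely onto object-oriented code, which can make them easier to picture. The sketch below is an analogy only, with a hypothetical `Organization`/`Publisher` model; real ontology languages add formal semantics and reasoning that Python classes do not provide.

```python
from dataclasses import dataclass

@dataclass
class Organization:              # a class
    name: str                    # attribute
    country: str                 # attribute

@dataclass
class Publisher(Organization):   # a subclass inherits the attributes above
    imprint: str = ""            # attribute specific to publishers

# An instance: one individual member of the Publisher class.
clir = Publisher(
    name="Council on Library and Information Resources",
    country="United States",
)

print(isinstance(clir, Organization))
```

The instance is a member of both `Publisher` and `Organization`, which is the same class-membership inference an ontology makes along its class hierarchy.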

The Graph view of relations

A Taxonomy of KOS © 2007 Zeng

[Figure: matrix relating KOS types to their structure and major functions]

Structure: Flat, Two dimensions, Multiple dimensions

Major functions: eliminating ambiguity; controlling synonyms; establishing relationships: hierarchical; establishing relationships: associative; presenting properties

KOS types:
• Term Lists: Pick lists, Glossaries/Dictionaries, Synonym Rings
• Metadata-like Models: Authority Files, Gazetteers, Directories
• Classification & Categorization: Subject Headings, Categorization schemes, Taxonomies, Classification schemes
• Relationship Models: Thesauri, Semantic networks, Ontologies

Networked KOS (NKOS)

• KOS are not used in isolation;
• KOS may be used, re-used, and re-purposed in web-based services;
• KOS are used for:
– organizing, indexing, cataloging, and searching, AND
– learning, knowledge modeling, reasoning, etc.
• NKOS need to be machine-processable and machine-understandable
– (more to discuss later today)

References

• Hodge, Gail (2000). Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Washington, DC: Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub91/contents.html http://www.clir.org/pubs/reports/pub91/pub91.pdf

• Hill, Linda, Buchel, Olha, Janee, Greg, and Zeng, Marcia L. (2002). Integration of knowledge organization systems into digital library architectures. In: Mai, Jens-Erik, et al. ed.: Advances in classification research, volume 13, proceedings of the 13th ASIST SIG/CR Workshop, 17 November 2002, Philadelphia PA, pp. 62-68.

• Koch, Traugott and Tudhope, Douglas. 2004. User-centred approaches to Networked Knowledge Organization Systems/Services (NKOS): Background. http://www2.db.dk/nkos-workshop/#Background
