Classification Systems Spring 2006, 3 April Bharat Mehra IS 520 (Organization and Representation of Information) School of Information Sciences University of Tennessee


Classification Systems

Spring 2006, 3 April

Bharat Mehra

IS 520 (Organization and Representation of Information)

School of Information Sciences

University of Tennessee

Assignment 4: Subject Access

Objectives:
to understand different subject access methods
to compare these methods

Part I. Controlled Vocabulary

In UTK OPAC, select the subject index to browse April First and Holidays. Look at the LC Authority Records for the two concepts to understand the structure of the controlled vocabulary: authorized heading, lead-in terms (Use For), narrower terms, broader terms, and the corresponding LCC number (similar to the Relative Index in DDC).

Look at the use of the heading Holidays in pre-coordinated subject cataloging in UTK OPAC: What types of subdivisions are being used? Find examples for topical subdivision, geographical subdivision, chronological subdivision, and form subdivision.

Browse the list forward (Next Page button) and backward (Previous Page button) to see how various holidays (New Year’s Day and Thanksgiving Day) are dispersed in the alphabetical listing. Are the headings in near proximity always related concepts?

Part II: Classifications

Take a tour of DDC at http://www.oclc.org/dewey/resources/tour/default.htm

Read the comparison of DDC and LCC, both enumerative classifications, at http://staff.oclc.org/~vizine/Intercat/vizine-goetz.htm

Read “Was Ranganathan a Yahoo!?” about the Colon Classification, a faceted classification, at http://scout.wisc.edu/Projects/PastProjects/toolkit/enduser/archive/1998/euc-9803.html

Assignment 4: Subject Access

Report

The report should read like a well-organized essay. No need to answer the specific questions above; just use the results you obtained as examples to illustrate or back up your arguments. You must use some examples from the above activities to make your points and to show that you gained some understanding while performing the above tasks.

Your essay should have sections that include the following parts:

Summarize your understanding of the roles of controlled vocabularies in providing subject access to intellectual works

Summarize your understanding of the roles of classifications in organizing information objects in physical libraries

Compare classification systems with alphabetical subject headings or thesauri (controlled vocabulary) in providing subject access (pros and cons)

Discuss the new roles of controlled vocabularies and classifications in organizing electronic resources on the Web

Subject Analysis and Classification

Subject analysis: Is part of creating metadata that deals with the conceptual analysis of an information object to determine what it is about, and

Translating “aboutness” of an info object to create controlled vocabulary terms for subject headings and classification notations

Knowledge Classification

A logical system for the organization of knowledge

The division of knowledge into classes usually is based on disciplines

Classes are arranged into a hierarchical and coherent framework

Knowledge Classification: Multistage Process

Identifying property of interest

Distinguishing objects that possess that property or those which lack it

Grouping objects that have the property into one class

Identifying relationships between classes

Finding distinctions within classes to arrive at subclasses

Classical theory: From general to specific

Problems???

Fuzzy Set Theory (Lotfi Zadeh)

Some categories are well defined, others are not

Continuum of property rather than discrete marks

If categories defined by properties members share, then no member should be “better” than the others (prototypes)

Categories should be independent of humans doing the categorization

Ad hoc categories: on the spur of the moment

Classificatory Structure (Tree)

This may be arranged using indentation as seen often in printed schedules

Philosophy
Natural sciences
    Math
        Algebra
        Geometry
    Astronomy
    Physics
    Chemistry
    ……
    Plants
Literature

Classification (Print Format)

Philosophy
Natural sciences
Literature
****************
Natural sciences
- Math
- Astronomy
- Physics
- Chemistry
……
- Plants
Literature

1

Natural sciences
- Math
-- Algebra
-- Geometry
- Astronomy

……

201

Linearization using notations

The linear order of these concepts using numeric notation 100 ... 500 510 512 ... 516 ... 520 ... 530 ... 540 ... 580 ... 800

100 Philosophy
500 Natural sciences
    510 Math
        512 Algebra
        516 Geometry
    520 Astronomy
    530 Physics
    540 Chemistry
    ……
    580 Plants
800 Literature

Classification vs. Alphabetical Order

Classified order:
100 Philosophy
...
500 Natural sciences
510 Math
512 Algebra
...
516 Geometry
...
520 Astronomy
...
530 Physics
...
540 Chemistry
...
580 Plants
...
800 Literature

Alphabetical order:
Algebra 512
Astronomy 520
Chemistry 540
Geometry 516
Literature 800
Math 510
Natural sciences 500
Philosophy 100
Physics 530
Plants 580
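The contrast between the two orders can be sketched in a few lines of Python. This is only an illustration using the ten class numbers from the slide; the function names (classified_order, alphabetical_order, neighbours) are hypothetical helpers, not part of any library system.

```python
# Class numbers and captions taken from the example schedule on the slide.
SCHEDULE = {
    "100": "Philosophy",
    "500": "Natural sciences",
    "510": "Math",
    "512": "Algebra",
    "516": "Geometry",
    "520": "Astronomy",
    "530": "Physics",
    "540": "Chemistry",
    "580": "Plants",
    "800": "Literature",
}


def classified_order(schedule):
    """Return (notation, caption) pairs sorted by notation."""
    return sorted(schedule.items(), key=lambda item: int(item[0]))


def alphabetical_order(schedule):
    """Return (caption, notation) pairs sorted by caption."""
    return sorted((caption, notation) for notation, caption in schedule.items())


def neighbours(sequence, target):
    """Return the entries directly before and after the entry whose first field is target."""
    index = [entry[0] for entry in sequence].index(target)
    return sequence[max(index - 1, 0):index] + sequence[index + 1:index + 2]


if __name__ == "__main__":
    # In classified order Algebra sits between Math and Geometry.
    print(neighbours(classified_order(SCHEDULE), "512"))
    # In alphabetical order Algebra sits next to Astronomy.
    print(neighbours(alphabetical_order(SCHEDULE), "Algebra"))
```

The notation keeps related subjects (Math, Algebra, Geometry) adjacent, while the alphabetical list places Algebra next to Astronomy simply because of spelling.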

Algorithm for Browsing and Searching

Traversing the hierarchical (tree) structure is a top-down process

At each level of hierarchy the searcher must select one node to expand to the next level

Think about how you find information using a Web directory: what is the path?
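A minimal sketch of this top-down process, assuming the small example tree from the earlier slides stored as nested dictionaries. The names TREE, browse, and find_path are illustrative only.

```python
# Example hierarchy from the slides; an empty dict marks a leaf.
TREE = {
    "Philosophy": {},
    "Natural sciences": {
        "Math": {"Algebra": {}, "Geometry": {}},
        "Astronomy": {},
        "Physics": {},
        "Chemistry": {},
        "Plants": {},
    },
    "Literature": {},
}


def browse(tree, choices):
    """Expand one chosen node per level, starting at the root.

    Returns the path taken and the options visible at the final level.
    """
    path = []
    level = tree
    for choice in choices:
        if choice not in level:
            raise KeyError(f"{choice!r} is not available at this level")
        path.append(choice)
        level = level[choice]
    return path, sorted(level)


def find_path(tree, target, path=()):
    """Depth-first search for target; return the top-down path or None."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return list(current)
        found = find_path(children, target, current)
        if found:
            return found
    return None


if __name__ == "__main__":
    print(browse(TREE, ["Natural sciences", "Math"]))
    print(find_path(TREE, "Geometry"))
```

browse mirrors what a searcher does in a Web directory (one click per level), while find_path shows the path a searcher would need to know in advance to reach a given topic.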

Library Classification

A way that helps organize information objects by grouping subjects in the manner which is most useful to the users

The most-used systems are:
LCC: Library of Congress Classification
DDC: Dewey Decimal Classification

Hierarchical

Enumerative: attempts to assign a designation for every subject concept needed in the system

LCC more enumerative than DDC

UDC: now faceted

Classification Schemes

Verbal description (topic by topic) of things/concepts that can be represented

Arrangement of verbal descriptions in classed or logical order

Notational system alongside each verbal description (schedules)

Cross-references for navigation within the schedules

Alphabetical index of terms used in schedule (and synonyms)

Instructions for use

Organization that maintains classification scheme

Ranganathan’s Colon Classification: A Faceted Approach

Parts of the whole: faces of a diamond

Notations for subparts strung together

5 fundamental categories of a subject:
Personality (focal or most specific subject)
Material
Energy (activity, operation, process)
Space (place)
Time

e.g., design of wooden furniture in eighteenth century America

Faceted indicators: not convenient for shelves

Convenient in the age of the Internet. Why?
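One way to approach the question is to treat each facet as a separate searchable field. The sketch below is an assumption-laden illustration: the assignment of the furniture example to the five facets is one plausible reading of the slide, and the second and third records are made-up sample data added only so that filtering has something to exclude.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetedSubject:
    """A subject described by the five fundamental categories."""
    personality: str
    material: str
    energy: str
    space: str
    time: str


RECORDS = [
    # The slide's example: design of wooden furniture in eighteenth century America.
    FacetedSubject("furniture", "wood", "design", "America", "18th century"),
    # Hypothetical records, invented for illustration.
    FacetedSubject("furniture", "metal", "design", "America", "20th century"),
    FacetedSubject("furniture", "wood", "repair", "France", "18th century"),
]


def search(records, **wanted):
    """Return records whose facets equal every requested facet value."""
    return [
        record
        for record in records
        if all(getattr(record, facet) == value for facet, value in wanted.items())
    ]


if __name__ == "__main__":
    print(search(RECORDS, material="wood", space="America"))
```

A shelf allows only one linear order, but an online system can filter on any combination of facets in any order.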

Library Classification: Functions

Arrange items in a logical manner on the shelves

Locate known work through call number: shared mark on item and catalog

Collocate “like” items: chosen property is subject

Provide systematic display of bibliographic entries in printed catalogs, indexes, etc.

Help in direct retrieval

Basics

Successive stages of classes and subclasses with a chosen property as the basis of each stage

Hierarchical tree structure: Genus and species

Facets, arrays, chain, citation order

Classification Concepts

Broad vs. Close Classification
Classification of Knowledge vs. Classification of a Particular Collection (Literary warrant)
Integrity vs. Keeping Pace with Knowledge
Fixed vs. Relative Location
Closed vs. Open Stacks
Location Device (call number) vs. Collocation Device (classification notation)

Library of Congress Classification

A -- GENERAL WORKS

B -- PHILOSOPHY. PSYCHOLOGY. RELIGION

C -- AUXILIARY SCIENCES OF HISTORY

D -- HISTORY (GENERAL) AND HISTORY OF EUROPE

E -- HISTORY: AMERICA

F -- HISTORY: AMERICA

G -- GEOGRAPHY. ANTHROPOLOGY. RECREATION

H -- SOCIAL SCIENCES

J -- POLITICAL SCIENCE

K -- LAW

L -- EDUCATION

M -- MUSIC AND BOOKS ON MUSIC

N -- FINE ARTS

P -- LANGUAGE AND LITERATURE

Q -- SCIENCE

R -- MEDICINE

S -- AGRICULTURE

T -- TECHNOLOGY

U -- MILITARY SCIENCE

V -- NAVAL SCIENCE

Z -- BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL)

Library of Congress Classification

Subclass B Philosophy (General)
Subclass BC Logic
Subclass BD Speculative philosophy
Subclass BF Psychology
Subclass BH Aesthetics
Subclass BJ Ethics
Subclass BL Religions. Mythology. Rationalism
Subclass BM Judaism
Subclass BP Islam. Bahaism. Theosophy, etc.
Subclass BQ Buddhism
Subclass BR Christianity
Subclass BS The Bible
Subclass BT Doctrinal Theology
Subclass BV Practical Theology
Subclass BX Christian Denominations

Library of Congress Classification

Subclass B
B1-5802 Philosophy (General)
    B69-99 General works
    B108-5802 By period (Including individual philosophers and schools of philosophy)
        B108-708 Ancient
        B720-765 Medieval
        B770-785 Renaissance
        B790-5802 Modern
            B808-849 Special topics and schools of philosophy
            B850-5739 By region or country
            B5800-5802 By religion

Subclass BC
BC1-199 Logic
    BC11-39 History
        BC25-39 By period
    BC60-99 General works

LCC—Some Features

Notations: lack of built-in hierarchy; alphanumeric; linearization

Advantages: comprehensive, flexible, inclusive, adaptive / hospitable

Cons: difficult to search hierarchically

Dewey Decimal Classification

• From the divine to the mundane (except 000)
• Choosing decimals for its categories allows it to be purely numerical and infinitely hierarchical
• Faceted classification: combines elements from different parts of the structure to construct a number representing the subject content
• Except for general works and fiction, works are classified principally by subject, with extensions for subject relationships, place, time or type of material, producing classification numbers of not less than three digits but otherwise of indeterminate length with a decimal point before the fourth digit, where present
• Classmarks are to be read as numbers, in the order: 050, 220, 330.973, 331 etc.

Dewey Decimal Classification

Main classes => divisions => sections

The system is made up of ten categories:
000 Computers, information and general reference
100 Philosophy and psychology
200 Religion
300 Social sciences
400 Language
500 Science and mathematics
600 Technology
700 Arts and recreation
800 Literature
900 History and geography

330 for economy + 94 for Europe = 330.94 European economy; 973 for United States + 005 form division for periodicals = 973.005, periodicals concerning the United States generally
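The two worked examples above, and the rule that classmarks are read as numbers, can be sketched as follows. This is a simplification that only mirrors the slide's arithmetic (a three-digit base, a decimal point, then the added digits); real DDC number building follows the instructions in the schedules and tables, which this sketch does not model. The function names are hypothetical.

```python
from decimal import Decimal


def build_number(base, extension):
    """Append extension digits to a three-digit base, with the point after the third digit."""
    if len(base) != 3 or not base.isdigit() or not extension.isdigit():
        raise ValueError("expected a three-digit base and a digit-only extension")
    return f"{base}.{extension}"


def shelf_order(classmarks):
    """Sort classmarks by their value as decimal numbers."""
    return sorted(classmarks, key=Decimal)


if __name__ == "__main__":
    print(build_number("330", "94"))   # economy + Europe
    print(build_number("973", "005"))  # United States + periodicals
    print(shelf_order(["331", "330.973", "220", "050"]))
```

Decimal is used rather than float so that long classmarks are compared exactly.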

Dewey Decimal Classification

000 Generalities
001 Knowledge
002 The book
003 Systems
004 Data processing; Computer science
005 Computer programming, programs
006 Special computer methods
007 Not assigned or no longer used
010 Bibliography
011 Bibliographies
012 Bibliographies of individuals

100 Philosophy & psychology
101 Theory of philosophy
102 Miscellany of philosophy
103 Dictionaries of philosophy
104 Not assigned or no longer used
105 Serial publications of philosophy
106 Organizations of philosophy
107 Education, research in philosophy

200 Religion
201 Philosophy of Christianity
202 Miscellany of Christianity
203 Dictionaries of Christianity
204 Special topics
205 Serial publications of Christianity
206 Organizations of Christianity
207 Education, research in Christianity
208 Kinds of persons in Christianity
209 History & geography of Christianity
210 Natural theology
211 Concepts of God
212 Existence, attributes of God

Comparison of DDC & LCC

DDC:
Knowledge
Arabic numerals
Universal
Uneven classes
Logical placement of subjects
Developer (“generalist”)
Mnemonic

LCC:
Literary warrant
Alphanumeric
American
Hospitable
Logical hierarchies often lost
Developer (“Specialists”)
Confusing notation

How Call Numbers Work

Every book is given a unique call number to serve as an address for locating the book on the shelf

Call Numbers LCC

Call number has two parts: the class number (Library of Congress Classification or Dewey Decimal Classification) and the Cutter number or book number

CUTTER NUMBER for a book usually consists of the first letter of the author's last name and a series of numbers (from a table designed to help maintain an alphabetical arrangement of names).

Conley, Ellen C767
Conley, Robert C768
Cook, Robin C77
Cook, Thomas C773

How do we keep the call number unique if the library has several works by the same author?

813.54 Cook, Robin
C77a Acceptable Risk
C77f Fever
C77fa Fatal Cure
(a, f, fa: work mark or work letter)

Call Numbers DDC

Call number has two parts: Dewey Decimal Classification and the Cutter number or book number

Call Numbers DDC

813.54 L52f Farthest shore, Ursula Le Guin

813.54 L52fo Four ways to forgiveness, Ursula Le Guin

813.54 L52p Planet of Exile, Ursula Le Guin

813.54 L52Z B54 Approaches to the Fiction of Ursula Le Guin, James Bittner

813.54 is the Dewey number for American Literature after 1945, L52Z is the Cutter number for Ursula Le Guin, Z is for a work of criticism, B54 is for James Bittner, the author of Approaches.....

The capital Z, the last letter in the alphabet, ensures that all criticism is shelved after the author's works
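The shelving effect of work marks and the Z convention can be sketched with a sort key. The call numbers below are the ones shown on the slides; the case-insensitive comparison is an assumption made for this illustration (a plain ASCII string sort would place the capital Z before the lowercase work marks), and shelf_key and shelve are hypothetical names.

```python
from decimal import Decimal

# (class number, book number, title) as shown on the slides, deliberately out of order.
ITEMS = [
    ("813.54", "L52Z B54", "Approaches to the Fiction of Ursula Le Guin"),
    ("813.54", "L52p", "Planet of Exile"),
    ("813.54", "L52fo", "Four ways to forgiveness"),
    ("813.54", "L52f", "Farthest shore"),
]


def shelf_key(item):
    """Order by class number first, then by book number ignoring letter case."""
    class_number, book_number, _title = item
    return (Decimal(class_number), book_number.lower())


def shelve(items):
    """Return the items in shelf order."""
    return sorted(items, key=shelf_key)


if __name__ == "__main__":
    for class_number, book_number, title in shelve(ITEMS):
        print(class_number, book_number, title)
```

Because Cutter digits are read like decimals, a character-by-character comparison also reproduces the author table shown earlier: C767, C768, C77, C773.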

Assign Call Numbers

Select appropriate class number from the schedule

Add auxiliary number from tables or based on rules to extend the class number

Add cutter number as book mark (use cutter tables)

Call Numbers using LCC

QE534.2.B64

Call numbers can begin with one, two, or three letters

The first letter of a call number represents one of the 21 major divisions of the LCC System. In the example, the subject "Q" is Science.

The second letter "E" represents a subdivision of the sciences, Geology. All books in the QE's are primarily about Geology.

Books in categories E, United States History, and F, Local U.S. History and American History, do not have a second letter (exception: in Canada, FC is used for Canadian history).

Books about Law, K's, can have three letters, such as KFH, Law of Hawaii. Some areas of history (D) also have three-letter call numbers.

Call Numbers using LCC

Numbers after letters. The first set of numbers in a call number helps to define a book's subject.

"534.2" tells us more about the book's subject. Books in the range QE 500-625 are about "Dynamic and Structural Geology"

Books with call numbers QE534.2 are specifically "Earthquakes, Seismology - General Works - 1970 to Present"

One of the most frequently used numbers in call numbers is "1", which is often used for general periodicals in a given subject area. For example, Q1.S3 is the call number for the journal Science.

Journals are also given call numbers based on the specific subject. For example, QE531.E32 is the call number for the journal Earthquake Spectra, as QE531 is the call number for periodicals about "Earthquakes, Seismology"

Call Number using LCC

In QE534.2.B64, the B64 is taken from the two-number table and represents the author's last name (Bruce A. Bolt). The book is Earthquakes.

Some books have two Cutters; the first one is usually a further breakdown of the subject matter. QA 76.76 H94 M88 is a book located in the Mathematics section of the Q's:
QA 76 is about Computer Science
The ".76" indicates Special Topics in Automation
"H94" tells us that this is a book about HTML
"M88" represents the last name of the first author, “Musciano”
The book is HTML: The Definitive Guide

Call Number using LCC

Class mark: Letters, Numbers, Decimal ...

Cutter numbers: Letters plus numbers
-- single Cutter as a book mark
-- double Cutters: a first Cutter number as class extension (by topic, geographic area, etc.); a second Cutter number as book number
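The structure just described can be sketched as a small parser. This is a simplified illustration that handles only the shapes shown on these slides (one to three capital letters, a number with an optional decimal, then one or more Cutters); real LCC call numbers can also carry dates, volume numbers, and other elements that it ignores. The regular expressions and the parse_lcc function are assumptions of this sketch, not a standard.

```python
import re

# One to three class letters, then a class number with an optional decimal part.
CLASS_MARK = re.compile(r"^([A-Z]{1,3})\s*(\d+(?:\.\d+)?)(.*)$")
# A Cutter: one capital letter followed by digits.
CUTTER = re.compile(r"[A-Z]\d+")


def parse_lcc(call_number):
    """Split a call number into class letters, class number, and Cutters."""
    match = CLASS_MARK.match(call_number.strip())
    if match is None:
        raise ValueError(f"not a recognizable LCC call number: {call_number!r}")
    letters, number, rest = match.groups()
    cutters = CUTTER.findall(rest)
    return {
        "class_letters": letters,
        "class_number": number,
        # With two Cutters, the first extends the class and the last is the book number.
        "topic_cutters": cutters[:-1],
        "book_number": cutters[-1] if cutters else None,
    }


if __name__ == "__main__":
    print(parse_lcc("QE534.2.B64"))
    print(parse_lcc("QA 76.76 H94 M88"))
```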

Call Number in MARC

050 00 $a Q184 $b I87

050 00 $a QA76.9.C64 $b C36
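In MARC field 050, subfield $a carries the classification number and subfield $b the item (Cutter) number. A minimal sketch of reading such a line follows, assuming the simple textual display used on these slides (tag, indicators, then delimiter-prefixed subfields); it is not a parser for real MARC transmission records, and the function names are hypothetical.

```python
def parse_field(line, delimiter="$"):
    """Split a displayed MARC field into tag, indicators, and (code, value) subfields."""
    head, *subfields = line.split(delimiter)
    parts = head.split()
    tag = parts[0]
    indicators = parts[1] if len(parts) > 1 else ""
    parsed = [(sub[0], sub[1:].strip()) for sub in subfields if sub]
    return {"tag": tag, "indicators": indicators, "subfields": parsed}


def call_number_from_050(line, delimiter="$"):
    """Join subfield a (class number) and subfield b (item number) of an 050 field."""
    field = parse_field(line, delimiter)
    if field["tag"] != "050":
        raise ValueError("expected an 050 field")
    values = dict(field["subfields"])
    return " ".join(part for part in (values.get("a"), values.get("b")) if part)


if __name__ == "__main__":
    print(call_number_from_050("050 00 $a Q184 $b I87"))
    print(call_number_from_050("050 00 $a QA76.9.C64 $b C36"))
```

The delimiter parameter is there because the sample records later in this deck display subfields with ‡ rather than $.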

Application -- One

How to organize periodicals on the shelves?
Method 1. Alphabetical by title
Method 2. Classification

Pros and cons of each method?

How to organize monographs in a series?
Method 1. ASIST conference proceedings as a monograph series (see record 1)
Method 2. As a serial: journals or magazines (see record 2)

Application – Two

Serials: publications issued in successive parts that are intended to continue indefinitely

Monograph series contain individual objects that are complete bibliographic units (not intended to be continued indefinitely)

Pros and cons of each practice?

Definitions

1. A volume of the conference proceedings as a monograph (series)

[Some fields omitted]

020 ‡a 0914236733 (pbk.)

040 ‡a DLC ‡c DLC ‡d DLC

050 10 ‡a Z674.2

110 2 ‡a American Society for Information Science. ‡b Meeting (43rd : ‡d 1980 : ‡c Anaheim, Calif.)

245 10 ‡a Communicating information / ‡c edited by Alan R. Benenfeld and Edward John Kazlauskas.

260 ‡a White Plains, N.Y. : ‡b Published for American Society for Information Science by Knowledge Industry Publications, ‡c 1980.

300 ‡a xii, 417 p. : ‡b ill. ; ‡c 28 cm.

490 1 ‡a Proceedings of the ASIS annual meeting, ‡x 0044-7870 ; ‡v 17

504 ‡a Includes bibliographical references and indexes.

650 0 ‡a Information services ‡x Congresses.

650 0 ‡a Communication ‡x Congresses.

700 10 ‡a Benenfeld, Alan R.

700 10 ‡a Kazlauskas, Edward John.

810 2 ‡a American Society for Information Science. ‡b Meeting. ‡t Proceedings of the ASIS annual meeting; ‡v 17.

2. The conference proceedings as a serial

[Some fields omitted]

022 ‡a 0044-7870

110 2 ‡a American Society for Information Science. ‡b Meeting

245 10 ‡a Proceedings of the ... ASIS annual meeting.

246 3 ‡a Proceedings of the ... American Society for Information Science annual meeting

260 ‡a Washington, DC :‡b American Society for Information Science, ‡c c1963-

300 ‡a v. ;‡c 29 cm.

310 ‡a Annual

362 0 ‡a Vol. 17 (October 5-10, 1980) -

500 ‡a Most vols. also have a distinctive title.

515 ‡a Vol 17 has title Communicating information.

510 2 ‡a Physics abstracts ‡x 0036-8091

510 2 ‡a Engineering index annual (1968) ‡x 0360-8557

510 2 ‡a Computer & control abstract ‡x 0036-8113

650 0 ‡a Information science ‡v Congresses.

Well organized subject headings -- beyond listing

Medical Subject Headings (MeSH): http://www.nlm.nih.gov/mesh/2005/MeSHtree.A.html

Purpose of Classification

Provides meaningful subject access via retrieval tool

Provides collocation of objects of a like nature (Cutter)

Provides a logical location for similar objects

Saves user time

Purpose of Classification

Because books are classified by subject, you can often find several helpful books on the same shelf, or nearby

Other subject access tools

Facet -- synthetical classification was developed to overcome the limitations of enumerative hierarchical classifications to allow combination of classes

Taxonomy -- organization or subject oriented: classification of things, or the principles underlying the classification

Ontology -- building shareable knowledge structures (among people, computer, …): "What are the fundamental categories of being?"

Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. The first steps in weaving the Semantic Web into the structure of the existing Web are already under way. In the near future, these developments will usher in significant new functionality as machines become much better able to process and "understand" the data that they merely display at present.

---Tim Berners-Lee et al., Scientific American, May 17, 2001