Metadata Standards and Applications
6. Vocabularies: Attributes and Values

Goals of Session
- Understand how different vocabularies are used in metadata
- Learn about relationships in vocabularies
- Understand methods of encoding vocabularies for various purposes
- Learn about how registries are used to document vocabularies

Vocabulary Issues
- Where vocabularies occur in metadata
- Establishment of formal relationships among terms (where appropriate)
- Testing and validation of terms
- The role of Metadata Registries

Why bother?
- To improve retrieval, i.e., to get an optimum balance of precision and recall
  - Precision: How many of the retrieved records are relevant?
  - Recall: How many of the relevant records did you retrieve?
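
The two questions above correspond to two simple ratios. The following is a minimal sketch of how they might be computed for a single query; the record identifiers are invented for illustration and the function name is hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # records that are both retrieved and relevant
    # Precision: share of the retrieved records that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of the relevant records that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record identifiers, invented for illustration.
retrieved = ["r1", "r2", "r3", "r4"]
relevant = ["r2", "r4", "r5", "r6", "r7"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.5 0.4
```

Here two of the four retrieved records are relevant (precision 0.5), and two of the five relevant records were found (recall 0.4).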

Improving recall and precision
- Controlled Vocabularies improve recall by addressing synonyms [attire vs. dress vs. clothing]
- Controlled Vocabularies improve precision by addressing homographs [bridge (game) vs. bridge (structure) vs. bridge (dental device)]
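
One way to picture both effects is a lookup that collapses synonyms onto a single preferred term and splits a homograph into qualified terms. This is only a sketch: the slide does not say which of the three synonyms is preferred, so the choice of "clothing" is an assumption, and the names PREFERRED, HOMOGRAPHS and resolve are hypothetical.

```python
# Entry terms (synonyms) mapped to one preferred term.
# Assumption: "clothing" is the preferred term; the slide does not say.
PREFERRED = {"attire": "clothing", "dress": "clothing", "clothing": "clothing"}

# Homographs carry a parenthetical qualifier so each sense is a distinct term.
HOMOGRAPHS = {
    "bridge": ["bridge (game)", "bridge (structure)", "bridge (dental device)"],
}


def resolve(term):
    """Return the controlled terms that a user-entered word could mean."""
    key = term.strip().lower()
    if key in HOMOGRAPHS:
        return HOMOGRAPHS[key]   # ambiguous: the user must pick one sense
    if key in PREFERRED:
        return [PREFERRED[key]]  # synonym collapsed to the preferred term
    return []                    # not in the vocabulary


print(resolve("Attire"))  # ['clothing']
print(resolve("bridge"))
```

Indexing everything under the preferred term helps recall; forcing a choice among qualified senses helps precision.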

Types of Controlled Vocabularies
- Lists
- Synonym Rings
- Taxonomy
- Thesaurus
- [Classification Schemes]
- Ontology

Thesauri & Classification
- Some knowledge management researchers feel that these are essentially the same, with the primary difference being whether the preferred term is a notation
- As the need for machine-readable encoding grows, some additional differences are emerging

Lists
- A list is a simple group of terms
- Example:
    Alabama
    Alaska
    Arkansas
    California
    Colorado
    . . . .
- Frequently used in Web site pick lists and pull down menus

Synonym Rings
- Synonym rings are used to expand queries for content objects
  - If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster
- Synonym rings are often used in systems where the underlying content objects are left in their unstructured natural language format
  - The control is achieved through the interface by drawing together similar terms into these clusters
- Synonym rings are used in conjunction with search engines and provide a minimal amount of control of the diversity of the language found in the texts of the underlying documents
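
Query expansion with a synonym ring can be sketched in a few lines. The documents stay as unstructured text and only the query is widened. The second ring and the sample documents are invented for illustration, and the whitespace tokenizer is a deliberate simplification.

```python
RINGS = [
    {"attire", "dress", "clothing"},  # synonyms from the recall example
    {"car", "automobile", "auto"},    # invented for illustration
]


def expand(query):
    """Return every term in the ring containing the query, or just the query."""
    q = query.lower()
    for ring in RINGS:
        if q in ring:
            return ring
    return {q}


def search(query, documents):
    """Return ids of documents whose text contains any term of the expanded query."""
    terms = expand(query)
    return [doc_id for doc_id, text in documents.items()
            if terms & set(text.lower().split())]


# Hypothetical unstructured documents.
DOCS = {
    "d1": "Formal attire is required",
    "d2": "Winter clothing sale",
    "d3": "Bridge repair schedule",
}
print(search("dress", DOCS))  # ['d1', 'd2']
```

Neither matching document contains the word "dress"; both are found because the ring treats the three terms as interchangeable.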

Taxonomies
- A taxonomy is a set of preferred terms, all connected by a hierarchy or polyhierarchy
- Example:
    Chemistry
      Organic chemistry
        Polymer chemistry
          Nylon
- Frequently used in web navigation systems
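
A taxonomy used for navigation can be held as child-to-parent links. The sketch below assumes that each term in the example is nested under the one listed before it, and it handles a single hierarchy only; a polyhierarchy would need a list of parents per term. The names PARENT and breadcrumb are hypothetical.

```python
# Child -> parent links for the example hierarchy.
# Assumption: each term is nested under the term listed before it.
PARENT = {
    "Organic chemistry": "Chemistry",
    "Polymer chemistry": "Organic chemistry",
    "Nylon": "Polymer chemistry",
}


def breadcrumb(term):
    """Return the path from the top term down to `term`."""
    path = [term]
    while path[-1] in PARENT:          # climb until a term has no parent
        path.append(PARENT[path[-1]])
    return list(reversed(path))


print(" > ".join(breadcrumb("Nylon")))
# Chemistry > Organic chemistry > Polymer chemistry > Nylon
```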

Thesauri
- A thesaurus is a controlled vocabulary with multiple types of relationships
- Example:
    Rice
      UF paddy
      BT Cereals
      BT Plant products
      NT Brown rice
      RT Rice straw
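
The Rice entry can be modelled as a small data structure in which every relationship is stored together with its reciprocal (BT/NT, UF/USE, RT/RT). This is a minimal sketch; the class and method names are hypothetical and do not come from any particular thesaurus software.

```python
from collections import defaultdict

# Each relationship type and the reciprocal recorded on the target term.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


class Thesaurus:
    def __init__(self):
        # term -> relationship type -> set of related terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def add(self, term, rel, target):
        """Record `term rel target` and the reciprocal link on `target`."""
        self.relations[term][rel].add(target)
        self.relations[target][RECIPROCAL[rel]].add(term)

    def show(self, term):
        """Format one entry in the usual thesaurus display order."""
        lines = [term]
        for rel in ("USE", "UF", "BT", "NT", "RT"):
            for target in sorted(self.relations[term][rel]):
                lines.append(f"  {rel} {target}")
        return "\n".join(lines)


t = Thesaurus()
t.add("Rice", "UF", "paddy")
t.add("Rice", "BT", "Cereals")
t.add("Rice", "BT", "Plant products")
t.add("Rice", "NT", "Brown rice")
t.add("Rice", "RT", "Rice straw")
print(t.show("Rice"))
print(t.show("paddy"))
```

The second call prints the entry for the non-preferred term, which points back to Rice with USE.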

Ontology
- A useful definition: “An arrangement of concepts and relations based on an underlying model of reality.”
  - Ex.: Organs, symptoms, and diseases in medicine
- No real agreement on definition; every community uses the term in a slightly different way

Thesaural Relationships
- Relationship types:
  - Use/Used For: indicates preferred term
  - Hierarchy: indicates broader and narrower terms
  - Associative: almost unlimited types of relationships may be used
- The thesaurus is the most complex format for controlled vocabularies and is widely used.

Z39.19 Types of Concepts
- Things and their physical parts
- Materials
- Activities or processes
- Events or occurrences
- Properties or states of persons, things, materials or actions
- Disciplines or subject fields
- Units of measurement
- Unique entities

Examples
- Birds (things)
- Ornithology (discipline)
- Feathers (materials)
- Flying (activity or process)
- Bird counts (event)
- Barn Owl (unique entity)

Relationships
- Equivalence
- Hierarchical
- Associative

Equivalence Relationships
- Term A and Term B overlap completely
A = B

Hierarchical Relationships
- Term A is included in Term B
[Diagram: A inside B]

Associative Relationships
- Semantics of terms A and B overlap
[Diagram: A and B partially overlapping]

Expressing Relationship

Hierarchy rules
- Relationships must be independent of context
- Examples:
  - Mice (BT Rodents); Rodents (NT Mice)
  - NOT Mice (BT Pests); Pests (NT Mice)

Hierarchy rules
- Terms must represent the same type of entity
- Examples:
  - Shoes (BT Footwear); Footwear (NT Shoes)
  - NOT Shoes (BT Shoemaking); Shoemaking (NT Shoes)
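
The same-type rule lends itself to a mechanical check if each term is tagged with one of the Z39.19 concept types. The sketch below assumes such tags exist; the type assignments and the function name are illustrative. The context-independence rule (Mice under Rodents, not under Pests) is an editorial judgment and is not something this kind of check can catch.

```python
# Assumed concept-type tags, using the Z39.19 categories.
CONCEPT_TYPE = {
    "Shoes": "thing",
    "Footwear": "thing",
    "Shoemaking": "activity",
}


def check_bt(narrower, broader):
    """Return None if the BT link joins terms of the same type, else a message."""
    nt_type, bt_type = CONCEPT_TYPE[narrower], CONCEPT_TYPE[broader]
    if nt_type != bt_type:
        return f"{narrower} ({nt_type}) cannot have BT {broader} ({bt_type})"
    return None


print(check_bt("Shoes", "Footwear"))    # None
print(check_bt("Shoes", "Shoemaking"))  # Shoes (thing) cannot have BT Shoemaking (activity)
```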

Vocabulary Management
- The degree of control over a vocabulary is (mostly) independent of its type
  - Uncontrolled: Anybody can add anything at any time and no effort is made to keep things consistent
  - Managed: Software makes sure there is a list that is consistent (no duplicates, no orphan nodes) at any one time. Almost anybody can add anything, subject to consistency rules
  - Controlled: A documented process is followed for the update of the vocabulary. Few people have authority to change the list. Software may help, but emphasis is on human processes and custodianship
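
The consistency rules of a managed vocabulary can be sketched as a validation pass. This sketch assumes that "duplicate" means the same term ignoring case and that an "orphan node" is a non-top term whose parent is missing from the list; both definitions, and the function name, are assumptions made for illustration.

```python
def consistency_errors(terms, parent, roots):
    """Check a term list for duplicates and orphan nodes.

    terms: list of term strings
    parent: dict mapping each child term to its parent term
    roots: set of top terms that need no parent
    """
    errors = []
    seen = set()
    for term in terms:
        key = term.casefold()  # assumption: duplicates are compared ignoring case
        if key in seen:
            errors.append(f"duplicate: {term}")
        seen.add(key)
    known = set(terms)
    for term in terms:
        if term in roots:
            continue
        # Assumption: an orphan is a non-top term with no known parent.
        if parent.get(term) not in known:
            errors.append(f"orphan: {term}")
    return errors


terms = ["Footwear", "Shoes", "shoes", "Sandals"]
parent = {"Shoes": "Footwear", "shoes": "Footwear"}
print(consistency_errors(terms, parent, roots={"Footwear"}))
# ['duplicate: shoes', 'orphan: Sandals']
```

A managed system would run a check like this on every addition; a controlled vocabulary adds a documented human approval step on top.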

Informal Vocabularies
- New movement towards ‘bottom up’ classification goes by many names:
  - Tagging
  - Social bookmarking
  - Folksonomies
- Many in this movement, seeing problems of scale, are moving towards more formalization

Libraries/Museums and Tagging
- Penn Tags
  - Still experimental, primarily internal to Penn
  - http://tags.library.upenn.edu/help/
- Library of Congress Flickr project
  - Open public tagging, still unclear how results will be used
  - http://www.flickr.com/photos/library_of_congress/
- The Art Museum Social Tagging Project
  - Research/software project focused on museum application
  - http://www.steve.museum/

Current Encoding Standards: Authorities
- MARC 21
  - Authority Format used for names, subjects, series
  - Classification Format used for subject classification
- MADS (a derivative of MARC authorities)
  - Used primarily for names

[Figure: MARC 21 Authority Name]
[Figure: MARC 21 Authority Subject]
[Figure: MARC 21 Classification LCC]
[Figure: MARC 21 Classification DDC]

What is MADS?
- Metadata Authority Description Schema
  - A companion to MODS for authority data using XML
  - Defines a subset of MARC authority elements using language-based tags
  - Elements have the same definitions as the equivalent MODS elements
- MADS can be used for metadata about people, organizations, events, subjects, time periods, genres, geographics and occupations

MADS Elements
- Authority
  - name
  - titleInfo
  - topic
  - temporal
  - genre
  - geographic
  - hierarchicalGeographic
  - occupation
- Related
  - same subelements
- Variant
  - same subelements
- Note
- Affiliation
- url
- Identifier
- fieldOfActivity
- Extension
- recordInfo
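
To show how the authority and variant elements fit together, here is a rough sketch that builds a personal-name record with Python's standard XML library. Several details are assumptions rather than anything stated on the slide: the namespace URI is taken to be the one for MADS version 2, and the namePart subelement and the type attribute are borrowed from the MODS name structure. The names are invented, and the output has not been checked against the schema.

```python
import xml.etree.ElementTree as ET

MADS_NS = "http://www.loc.gov/mads/v2"  # assumed: namespace of MADS version 2
ET.register_namespace("mads", MADS_NS)


def q(tag):
    """Qualify a local element name with the MADS namespace."""
    return f"{{{MADS_NS}}}{tag}"


def personal_name_record(preferred, variants):
    """Build a record with one authority name and any number of variants."""
    mads = ET.Element(q("mads"))
    authority = ET.SubElement(mads, q("authority"))
    # Assumption: name/namePart and type="personal" follow the MODS pattern.
    name = ET.SubElement(authority, q("name"), {"type": "personal"})
    ET.SubElement(name, q("namePart")).text = preferred
    for form in variants:
        variant = ET.SubElement(mads, q("variant"))
        vname = ET.SubElement(variant, q("name"), {"type": "personal"})
        ET.SubElement(vname, q("namePart")).text = form
    return mads


# Invented names, for illustration only.
record = personal_name_record("Smith, John", ["Smith, J."])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The authority element carries the established form and each variant element carries a cross-reference form, which mirrors the Use/Used For relationship of a thesaurus.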

New/Upcoming Standards: Authorities
- Functional Requirements for Authority Data (FRAD)
  - A new model for authority information
  - Developed by the IFLA Working Group on Functional Requirements and Numbering of Authority Records (FRANAR)
- VIAF (Virtual International Authority File)
  - Prototype at: http://orlabs.oclc.org/viaf/
- A Review of the Feasibility of an International Standard Authority Data Number (ISADN)
- Simple Knowledge Organization System (SKOS), a W3C standard

Functions of the Authority File
- Document decisions
- Serve as reference tool
- Control forms of access points
- Support access to bibliographic files
- Link bibliographic and authority files
(Slide from Glenn Patton)

[Figure: FRANAR Concept Model, top]
[Figure: FRANAR Concept Model, bottom]

FRAD person attributes
- From FRBR (AACR2 additions to names):
  - Dates associated with the person
  - Title of person
  - Other designation associated with the person
- New:
  - Gender
  - Place of birth
  - Place of death
  - Country
  - Place of residence
  - Affiliation
  - Address
  - Language of person
  - Field of activity
  - Profession/occupation
  - Biography/history
(Slide from Ed Jones)

[Figure: VIAF Search Result]
[Figure: VIAF DNB Display]

SKOS
- Simple Knowledge Organization System (SKOS)
  - A World Wide Web Consortium (W3C) standard
  - Based on RDF and OWL
  - Currently resolving “last call” comments, will be finalized in early 2009
  - http://www.w3.org/skos/
The skos:Concept class allows you to assert that a resource is a conceptual resource. That is, the
resource is itself a concept.
[Figure: The RDF/XML Encoded Version]
Preferred and Alternative Lexical Labels
[Figure: The RDF/XML Encoded Version]
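
As a rough stand-in for the encoded versions, which are not reproduced here, the following sketch generates RDF/XML for one skos:Concept with a preferred label, alternative labels and a broader link. It reuses the Rice entry from the thesaurus example. The URIs under example.org are hypothetical, the function name is invented, and this is not the encoding shown in the original slides.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def skos_concept(uri, pref_label, alt_labels=(), broader=(), lang="en"):
    """Build an rdf:RDF element describing a single skos:Concept."""
    root = ET.Element(f"{{{RDF}}}RDF")
    concept = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
    pref = ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {f"{{{XML}}}lang": lang})
    pref.text = pref_label                      # the preferred lexical label
    for label in alt_labels:
        alt = ET.SubElement(concept, f"{{{SKOS}}}altLabel", {f"{{{XML}}}lang": lang})
        alt.text = label                        # an alternative (Used For) label
    for target in broader:
        ET.SubElement(concept, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": target})
    return root


# Hypothetical URIs, invented for illustration.
root = skos_concept(
    "http://example.org/vocab/rice",
    "Rice",
    alt_labels=["paddy"],
    broader=["http://example.org/vocab/cereals"],
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In thesaurus terms, skos:prefLabel plays the role of the preferred term, skos:altLabel the role of UF, and skos:broader the role of BT.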
Registries: the Big Picture
(Adapted from Wagner & Weibel, “The Dublin Core Metadata Registry: Requirements, Implementation, and Experience” JoDI, 2005)

Why Registries?
- Support the “interoperability cycle”:
  - Discovery of available schemes and schemas for description of resources
  - Promote reuse of extant schemes and schemas
  - Access to machine-readable and human-readable services
  - Support for crosswalking and translation
- Coping with a “state of perpetual metadata heterogeneity” (Bianchi and Petrone)

What Do Registries Register?
- Metadata Schemas (element sets, formats)
  - Crosswalks between metadata schemas
- Controlled Vocabularies
  - Mappings between vocabularies
- Application Profiles
  - Schema and vocabulary information in combination with specific usage instruction
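
A crosswalk of the kind a registry might hold is, at its simplest, a table from source elements to target elements. The sketch below applies such a table to a flattened record and reports what it could not translate. The element pairs are an illustrative and incomplete MODS-to-Dublin-Core selection, not an official crosswalk, and the flat path notation is an assumption made to keep the example short.

```python
# Illustrative, incomplete element mapping; not an official crosswalk.
MODS_TO_DC = {
    "titleInfo/title": "title",
    "name/namePart": "creator",
    "subject/topic": "subject",
    "abstract": "description",
}


def crosswalk(record, mapping):
    """Translate a flat {source_element: [values]} record.

    Returns the translated record and the list of unmapped source elements.
    """
    target, unmapped = {}, []
    for element, values in record.items():
        if element in mapping:
            target.setdefault(mapping[element], []).extend(values)
        else:
            unmapped.append(element)  # information that would be lost
    return target, unmapped


# Hypothetical source record.
source = {
    "titleInfo/title": ["Rice farming"],
    "subject/topic": ["Rice"],
    "physicalDescription/extent": ["12 p."],
}
converted, lost = crosswalk(source, MODS_TO_DC)
print(converted)  # {'title': ['Rice farming'], 'subject': ['Rice']}
print(lost)       # ['physicalDescription/extent']
```

Reporting the unmapped elements makes the loss of detail visible, which is one of the costs of metadata heterogeneity that registries are meant to help manage.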

[Figure: Dublin Core Registry, Term Level]
[Figure: NSDL Registry, Property Vocabulary List]
[Figure: NSDL Registry, Property Vocabulary Detail]
[Figure: Element Detail RDF]
[Figure: Concept Vocabulary Detail]
[Figure: Concept Vocabulary XML Schema]

Please Play!
- The NSDL Registry has a “sandbox” where anyone can try out the registry software:
  - http://sandbox.metadataregistry.org
- Please feel free to play in the Registry Sandbox!
- Note: The production registry is open as well, but not for play …

Acknowledgements
- Some slides used here are from presentations by Marcia Zeng and Alistair Miles