Metadata Standards and Applications
8. Metadata Interoperability and Quality Issues
Goals of Session
Understand interoperability protocols (OpenURL for reference, OAI-PMH for metadata sharing)
Understand crosswalking and mapping as it relates to interoperability
Investigate issues concerning metadata quality
What's the Point About Interoperability?
For users, it's about resource discovery (user tasks)
– What's out there?
– Is it what I need for my task?
– Can I use it?
For resource creators, it's about distribution and marketing
– How can I increase the number of people who find my resources easily?
– How can I justify the funding required to make these resources available?
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting (http://www.openarchives.org)
Roots in the ePrint community, although applicability is much broader
Mission: "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content"
"Content" in this context is actually "metadata about content"
Metadata About the Resource
OAI-PMH in a Nutshell
Essentially provides a simple protocol for "harvest" and "exposure" of metadata records
Specifies a simple "wrapper" around metadata records, providing metadata about the record itself
OAI-PMH is about the metadata, not about the resources
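The "wrapper" idea can be illustrated with a short Python sketch. It assumes a single record in the shape used by OAI-PMH 2.0 responses (a header carrying metadata about the record, followed by the metadata itself); the identifier, set name, and title are made up for illustration.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical record: the header is the wrapper, the metadata is the payload.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2005-03-14</datestamp>
    <setSpec>maps</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Hypothetical map of the harbor</dc:title>
    </oai_dc:dc>
  </metadata>
</record>
"""

def split_record(xml_text):
    """Return (wrapper information, Dublin Core titles) for one record."""
    record = ET.fromstring(xml_text)
    header = record.find(OAI + "header")
    wrapper = {
        "identifier": header.findtext(OAI + "identifier"),
        "datestamp": header.findtext(OAI + "datestamp"),
        "sets": [s.text for s in header.findall(OAI + "setSpec")],
    }
    metadata = record.find(OAI + "metadata")
    titles = [t.text for t in metadata.iter(DC + "title")]
    return wrapper, titles

wrapper, titles = split_record(SAMPLE_RECORD)
```

Note that nothing in the wrapper points at the resource itself; it describes only the metadata record.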
The OAI World
Divided into two categories
– Data providers: "A data provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata"
– Service providers: "A service provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services"
Other Important Definitions
Archive: not the same as "archive" as used in libraries; more like "repository"
Protocol: a set of rules defining communication between systems; FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store
Inside OAI Repositories
repository - A repository is a network-accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters.
resource - A resource is the object or "stuff" that metadata is "about", whether physical or digital, stored in the repository or a constituent of another database.
item - An item is a constituent of a repository from which metadata about a resource can be disseminated.
record - A record is metadata in a specific metadata format.
OAI Goals
Low barrier to participation
– Server software available in many programming languages, intended to be easy to install
– Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data
Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control
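Flow control in OAI-PMH 2.0 works through a resumptionToken: a repository may return a long list in chunks, and the harvester sends each token back to get the next chunk. The loop below is a minimal sketch of that pattern. The fetch_page callable and the PAGES table are hypothetical stand-ins for real HTTP requests, so the example runs without a network.

```python
def harvest_all(fetch_page):
    """Collect identifiers from every chunk of an incomplete list.

    fetch_page(token) must return (identifiers, next_token); an empty or
    missing next_token means the list is complete.
    """
    collected = []
    token = None
    while True:
        identifiers, token = fetch_page(token)
        collected.extend(identifiers)
        if not token:
            return collected

# Hypothetical repository that splits three identifiers across two chunks.
PAGES = {
    None: (["oai:example.org:1", "oai:example.org:2"], "page-2"),
    "page-2": (["oai:example.org:3"], ""),
}

def fake_fetch(token):
    return PAGES[token]

all_ids = harvest_all(fake_fetch)
```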
OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns formats available
ListSets --> returns list of sets available
ListIdentifiers --> returns identifiers only
ListRecords --> returns records (optionally limited to a set)
GetRecord --> returns a particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)
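Each of the six requests is an ordinary HTTP request whose verb argument names the operation. The sketch below only builds the request URLs; the base URL, set name, and identifier are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical endpoint

def oai_request(verb, **arguments):
    """Build the URL for one OAI-PMH request."""
    params = {"verb": verb}
    params.update(arguments)
    return BASE_URL + "?" + urlencode(params)

identify_url = oai_request("Identify")
list_url = oai_request("ListRecords", metadataPrefix="oai_dc", set="maps")
get_url = oai_request(
    "GetRecord",
    identifier="oai:example.org:item-1",
    metadataPrefix="oai_dc",
)
```

Pasting such a URL into a browser is a quick way to see what a given repository returns.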
Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally the latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources
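Selective harvesting by date uses the from and until arguments of the list requests. A minimal sketch, assuming a harvester that remembers the header datestamps it has already seen and asks only for records changed since the latest one; the endpoint and dates are made up.

```python
from urllib.parse import urlencode

def selective_harvest_url(base_url, metadata_prefix, from_date=None, until_date=None):
    """Build a ListRecords URL limited to a datestamp range."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return base_url + "?" + urlencode(params)

def latest_datestamp(header_datestamps):
    """Datestamps of the same granularity sort correctly as plain strings."""
    return max(header_datestamps)

# Datestamps taken from record headers in an earlier (hypothetical) harvest.
seen = ["2005-03-14", "2005-04-02", "2005-01-30"]
next_url = selective_harvest_url(
    "http://repository.example.org/oai", "oai_dc", from_date=latest_datestamp(seen)
)
```

Because the datestamp tracks changes to the metadata record, a resource that changes while its metadata stays the same will not show up in such a harvest.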
OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
   Provenance
   Rights
About Container Example
OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level
OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show the report "Distinct Metadata Schemas" from the left-hand menu
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema; look for providers and sample records
What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)
"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website
OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access
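What a link server does with a citation can be sketched in a few lines. This is a toy model, not how any particular resolver product works: the holdings table, the campus network range, and both target URLs are hypothetical, and user context is reduced to an IP address check.

```python
import ipaddress

# Hypothetical user context: anyone inside this range counts as "on campus".
CAMPUS_NETWORK = ipaddress.ip_network("192.0.2.0/24")

# Hypothetical holdings: ISSN -> first licensed year and full-text target.
HOLDINGS = {
    "1234-5678": {"first_year": 1995, "target": "http://fulltext.example.org/journal"},
}

def resolve(citation, user_ip):
    """Return the most appropriate link for this citation and this user."""
    on_campus = ipaddress.ip_address(user_ip) in CAMPUS_NETWORK
    holding = HOLDINGS.get(citation.get("issn"))
    if on_campus and holding and int(citation["date"]) >= holding["first_year"]:
        return holding["target"]
    # Fall back to a catalogue search when full text is not available.
    return "http://catalogue.example.org/search?issn=" + citation.get("issn", "")

citation = {"issn": "1234-5678", "date": "1998"}
on_campus_link = resolve(citation, "192.0.2.17")
off_campus_link = resolve(citation, "198.51.100.5")
```

The same citation yields different links for different users, which is the sense in which the link server defines the user context.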
Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
From a record in an abstracting and indexing (A&I) database to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
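An OpenURL of this kind is a resolver base URL followed by the citation metadata as key/value pairs. The sketch below rebuilds the example link from its parts and reads it back; the key names are the ones in the slide's example, and the helper function names are made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

RESOLVER = "http://sfxserver.uni.edu/sfxmenu"  # resolver base from the example link

def make_openurl(resolver, citation):
    """Attach citation metadata to the resolver address."""
    return resolver + "?" + urlencode(citation)

def read_openurl(openurl):
    """Recover the citation metadata, as a resolver would on arrival."""
    query = urlsplit(openurl).query
    return {key: values[0] for key, values in parse_qs(query).items()}

citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}
link = make_openurl(RESOLVER, citation)
round_trip = read_openurl(link)
```

Changing only RESOLVER points the same citation at a different institution's link server.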
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies
Beginning to Define Quality
Experience of the library community: BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional "buddy system"
How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability
Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility
Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"
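One way to make the second statement measurable is to count, for each expected element, the share of records that carry a non-empty value. This is a sketch under stated assumptions: the list of expected elements and the four sample records are invented, and real thresholds would come from a community's own application profile.

```python
# Hypothetical list of elements a community expects in every record.
EXPECTED_ELEMENTS = ["title", "creator", "date", "language", "format"]

def completeness_by_element(records, expected=EXPECTED_ELEMENTS):
    """Fraction of records with a non-empty value for each expected element."""
    total = len(records)
    report = {}
    for element in expected:
        filled = sum(1 for record in records if record.get(element))
        report[element] = filled / total if total else 0.0
    return report

records = [
    {"title": "A", "creator": "X", "date": "1998", "language": "en"},
    {"title": "B", "date": "2001", "format": "image/jpeg"},
    {"title": "C", "creator": "", "date": "1999", "language": "fr"},
    {"title": "D", "creator": "Y", "date": "2003", "language": "en", "format": "text/html"},
]
report = completeness_by_element(records)
```

An empty string counts as missing here, which is a design choice; a value such as "unknown" would still count as present and needs a separate check.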
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations and usages in general
Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?
Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values
Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process
Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as "premium" or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
   Includes some formatting and color coding
– Disadvantages
   Assumes consistency/predictability
   Difficult to determine extent of problems found
   Tedious at best
Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
   Better sorting and control by reviewer
– Disadvantages
   Unwieldy for large files
   Requires sustained focus from reviewer
   Requires translation into tab-delimited file
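The translation into a tab-delimited file can be scripted. The sketch below assumes each record is a small XML fragment with Dublin Core child elements and writes one row per element value; the record IDs, titles, and column names are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

def records_to_tsv(records_xml):
    """Flatten (record_id, xml_text) pairs into tab-delimited rows."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, xml_text in records_xml:
        root = ET.fromstring(xml_text)
        for child in root.iter():
            # Keep only elements in the Dublin Core namespace.
            if child.tag.startswith(DC):
                value = (child.text or "").strip()
                writer.writerow([record_id, child.tag[len(DC):], value])
    return out.getvalue()

SAMPLE = [
    ("rec-1", '<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
              "<dc:title>Harbor map</dc:title><dc:date>1998</dc:date></dc>"),
    ("rec-2", '<dc xmlns:dc="http://purl.org/dc/elements/1.1/">'
              "<dc:title>Field notes</dc:title></dc>"),
]
tsv = records_to_tsv(SAMPLE)
```

The one-row-per-value layout sorts easily by element or by value in a spreadsheet, and it is also the shape a plotting tool needs for an element-by-record view.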
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
   View of several data dimensions simultaneously
   Reviewer controls data display
   Tends to pull reviewer focus to anomalies
   Handles fairly large files at one time while allowing subset views
   Display manipulation possible without programmers
– Disadvantages
   High cost of software
   Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
2 records without language element
format element present inconsistently
Easy to rescale axis on the fly and scroll through records
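The gaps that the scatter plot makes visible can also be listed directly. A minimal sketch, assuming each record has already been reduced to the set of element names it contains; the record IDs and element sets are made up so that two records lack a language element.

```python
def records_missing(records, expected):
    """Map each expected element to the record IDs that lack it."""
    missing = {element: [] for element in expected}
    for record_id, elements in records.items():
        for element in expected:
            if element not in elements:
                missing[element].append(record_id)
    return missing

# Hypothetical records, reduced to the element names present in each.
records = {
    "rec-1": {"title", "language", "format"},
    "rec-2": {"title"},
    "rec-3": {"title", "language"},
    "rec-4": {"title", "format"},
}
gaps = records_missing(records, ["title", "language", "format"])
```

A list like this answers "which records?" but, unlike the plot, does not show patterns across thousands of records at a glance.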
Table View
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
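Both observations from the table view (placeholder values and date syntax) lend themselves to an automated check. The sketch below tests syntax only, against the W3CDTF patterns YYYY, YYYY-MM, YYYY-MM-DD, and full date-times with a time zone; it does not check that months or days are in range. The list of "no information" strings is an assumption and would need tuning for a real collection.

```python
import re

# W3CDTF shapes: year, year-month, full date, or date-time with time zone.
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

# Hypothetical placeholder strings that carry no real information.
NO_INFORMATION = {"unknown", "n/a", "none", "n.d.", "undated"}

def classify_date(value):
    """Label a date value as 'no-information', 'w3cdtf', or 'other'."""
    text = value.strip()
    if not text or text.lower() in NO_INFORMATION:
        return "no-information"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"

labels = [classify_date(v) for v in ["1998", "1998-07-04", "ca. 1900", "Unknown"]]
```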
Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies
… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the "general good"
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access"
"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, Two Paths to Interoperable Metadata
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record's content to another format, including
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
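Element-to-element mapping, the simplest of these steps, can be written as a lookup table. The sketch below uses only the two equivalences named in the Godby quotation (title to MARC 245, and author, written here as the Dublin Core creator element, to MARC 100); it is not a complete or authoritative crosswalk, and the sample record is invented. Elements with no mapping are reported rather than silently dropped.

```python
# Two equivalences from the Godby quotation; a real crosswalk has many more.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}

def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Convert element values to (MARC tag, value) pairs; report what is lost."""
    converted, unmapped = [], []
    for element, values in dc_record.items():
        tag = mapping.get(element)
        if tag is None:
            unmapped.append(element)
            continue
        for value in values:
            converted.append((tag, value))
    return converted, unmapped

record = {
    "title": ["Harbor map"],
    "creator": ["Doe, Jane"],
    "coverage": ["Boston"],
}
fields, lost = crosswalk(record)
```

Hierarchy resolution and content conversion (for example, reformatting dates or names) would be further steps on top of this lookup.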
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
DC has one element refinement
– Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
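The loss of substructure in this mapping can be seen in a short sketch. It assumes MODS version 3 markup, keeps the initial article from nonSort as the DC-Lib statement asks, and sends titles typed "alternative" to the DC refinement. The sample record is invented, and joining the parts with a period and a space is this sketch's own choice, not something either standard prescribes.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Hypothetical MODS record with a structured title and an alternative title.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Harbor Atlas</title>
    <partNumber>Part 2</partNumber>
    <partName>Charts</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Atlas of the Harbor</title>
  </titleInfo>
</mods>
"""

def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into a (DC element, string) pair."""
    root = ET.fromstring(mods_xml)
    dc = []
    for info in root.findall(MODS + "titleInfo"):
        non_sort = info.findtext(MODS + "nonSort", default="")
        title = info.findtext(MODS + "title", default="")
        flat = non_sort + title  # initial article stays in the DC value
        for name in ("partNumber", "partName"):
            part = info.findtext(MODS + name)
            if part:
                flat += ". " + part  # separator is an assumption of this sketch
        element = "alternative" if info.get("type") == "alternative" else "title"
        dc.append((element, flat))
    return dc

dc_titles = mods_titles_to_dc(SAMPLE)
```

Going the other way is harder: once the parts are flattened into one string, the nonSort and part boundaries cannot be recovered reliably.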
Exercise
Evaluate a small set of human- and machine-created metadata
Goals of Session
Understand interoperability protocols (OpenURL for reference, OAI-PMH for metadata sharing)
Understand crosswalking and mapping as it relates to interoperability
Investigate issues concerning metadata quality

What's the Point About Interoperability?
For users, it's about resource discovery (user tasks)
– What's out there?
– Is it what I need for my task?
– Can I use it?
For resource creators, it's about distribution and marketing
– How can I increase the number of people who find my resources easily?
– How can I justify the funding required to make these resources available?

OAI-PMH
Open Archives Initiative-Protocol for Metadata Harvesting (http://www.openarchives.org)
Roots in the ePrint community, although applicability is much broader
Mission: "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content"
Content in this context is actually "metadata about content"

Metadata About the Resource

OAI-PMH in a Nutshell
Essentially provides a simple protocol for "harvest" and "exposure" of metadata records
Specifies a simple "wrapper" around metadata records, providing metadata about the record itself
OAI-PMH is about the metadata, not about the resources

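The "wrapper" idea can be sketched in a few lines of Python. This is a minimal illustration, not a harvester: the sample record, its identifier, and its values are made up, and `split_record` is a hypothetical helper name. Only the namespace URIs and the header element names (identifier, datestamp, setSpec) come from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by OAI-PMH 2.0 and its oai_dc metadata format.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up record for illustration; the identifier and values are hypothetical.
SAMPLE_RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-42</identifier>
    <datestamp>2008-03-15</datestamp>
    <setSpec>maps</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Map of the county</dc:title>
      <dc:date>1897</dc:date>
    </oai_dc:dc>
  </metadata>
</record>
"""


def split_record(xml_text):
    """Return (wrapper, description): header fields vs. Dublin Core values."""
    record = ET.fromstring(xml_text)
    header = record.find("oai:header", NS)
    # The header is metadata about the record itself.
    wrapper = {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "sets": [s.text for s in header.findall("oai:setSpec", NS)],
    }
    # The metadata part describes the resource.
    description = {}
    for element in record.find("oai:metadata/oai_dc:dc", NS):
        name = element.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        description.setdefault(name, []).append(element.text)
    return wrapper, description


wrapper, description = split_record(SAMPLE_RECORD)
```

Note that the two dates differ in meaning: the header datestamp (2008-03-15) says when the metadata record last changed, while dc:date (1897) describes the resource.
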
The OAI World
Divided into two categories
– Data providers: "A data provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata"
– Service providers: "A service provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services"

Other important definitions
Archive: Not the same as 'archive' used in libraries; more like "repository"
Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store

Inside OAI Repositories
repository - A repository is a network accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
resource - A resource is the object or "stuff" that metadata is "about", whether physical or digital, stored in the repository or a constituent of another database
item - An item is a constituent of a repository from which metadata about a resource can be disseminated
record - A record is metadata in a specific metadata format

OAI Goals
Low barrier to participation
– Server software available in many programming languages, intended to be easy to install
– Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of each metadata record and thus provide further support for granularity of harvesting
OAI-PMH supports flow control (a repository can return a long list in parts, using resumption tokens)

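Flow control can be illustrated with a short Python sketch of the harvester's loop. It assumes nothing about a real server: `fake_fetch_page` is a hypothetical stand-in that serves five made-up records two at a time, and its token format is invented. What it does take from the protocol is that a follow-up request carries only the verb and the resumptionToken, and that an empty token ends the list.

```python
def harvest(fetch_page, metadata_prefix="oai_dc"):
    """Yield every record, following resumption tokens until none is returned."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        records, token = fetch_page(params)
        yield from records
        if not token:
            break
        # The token is an exclusive argument: it replaces all the others.
        params = {"verb": "ListRecords", "resumptionToken": token}


# Hypothetical repository contents, served two records per response.
FAKE_RECORDS = ["rec-1", "rec-2", "rec-3", "rec-4", "rec-5"]


def fake_fetch_page(params, page_size=2):
    """Stand-in for an HTTP request; the token here is just the next offset."""
    start = int(params.get("resumptionToken", 0))
    end = start + page_size
    next_token = str(end) if end < len(FAKE_RECORDS) else ""
    return FAKE_RECORDS[start:end], next_token


harvested = list(harvest(fake_fetch_page))
```

In a real harvester, `fetch_page` would issue the HTTP request and parse the resumptionToken element out of the XML response.
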
OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns formats available
ListSets --> returns list of sets available
ListIdentifiers --> returns record headers (identifiers) only, without the metadata
ListRecords --> returns records, optionally limited to a set
GetRecord --> returns particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)

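Each of these requests is an ordinary HTTP GET with a `verb` parameter. As a rough sketch, the Python below builds the request URLs and checks the arguments that the OAI-PMH 2.0 specification marks as required. The base URL and identifier are hypothetical, and `build_request` is an illustrative helper, not part of any library.

```python
from urllib.parse import urlencode

# Arguments each verb requires, per the OAI-PMH 2.0 specification.
REQUIRED_ARGS = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
    "GetRecord": {"identifier", "metadataPrefix"},
}


def build_request(base_url, verb, **args):
    """Return the GET URL for one OAI-PMH request."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # A resumptionToken replaces all other arguments, so skip the check then.
    if "resumptionToken" not in args:
        missing = sorted(REQUIRED_ARGS[verb] - set(args))
        if missing:
            raise ValueError(f"{verb} requires: {', '.join(missing)}")
    query = urlencode({"verb": verb, **args})
    return f"{base_url}?{query}"


BASE_URL = "http://example.org/oai"  # hypothetical repository address

identify_url = build_request(BASE_URL, "Identify")
get_record_url = build_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:example.org:item-42",  # hypothetical identifier
    metadataPrefix="oai_dc",
)
```

Pasting a URL of this shape into a browser, with a real repository's base URL, returns the XML response directly.
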
Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources

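Selective harvesting by date can be sketched as follows. The `from` and `until` argument names and their inclusive behaviour come from the OAI-PMH specification; the helper names and the sample headers are hypothetical, and `changed_between` only mimics what a repository would do on its side.

```python
from datetime import date


def selective_harvest_args(metadata_prefix, from_date=None, until_date=None):
    """Build ListRecords arguments limited to a datestamp range."""
    args = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        args["from"] = from_date.isoformat()
    if until_date is not None:
        args["until"] = until_date.isoformat()
    return args


def changed_between(headers, from_date=None, until_date=None):
    """Keep identifiers whose header datestamp falls in the range (inclusive)."""
    selected = []
    for header in headers:
        stamp = date.fromisoformat(header["datestamp"])
        if from_date is not None and stamp < from_date:
            continue
        if until_date is not None and stamp > until_date:
            continue
        selected.append(header["identifier"])
    return selected


# Made-up record headers; datestamps record metadata changes, not resource dates.
HEADERS = [
    {"identifier": "oai:example.org:1", "datestamp": "2007-11-02"},
    {"identifier": "oai:example.org:2", "datestamp": "2008-01-15"},
    {"identifier": "oai:example.org:3", "datestamp": "2008-03-15"},
]

last_harvest = date(2008, 1, 15)
args = selective_harvest_args("oai_dc", from_date=last_harvest)
updated = changed_between(HEADERS, from_date=last_harvest)
```

A service provider that stores the date of its last harvest can use it as the next `from` value and fetch only records changed since then.
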
OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
  – Provenance
  – Rights

About Container Example

OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record Level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left hand menu, "Distinct Metadata Schemas"
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records

What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full-text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/

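The example link is a resolver address followed by citation metadata as key/value pairs. A small Python sketch can take it apart and re-target the same citation at another resolver. The second resolver address is hypothetical, and `parse_openurl` and `build_openurl` are illustrative names, not part of any OpenURL library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The example link from the slide, with its punctuation restored.
OPENURL = (
    "http://sfxserver.uni.edu/sfxmenu"
    "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134"
)


def parse_openurl(url):
    """Split an OpenURL into (resolver address, citation metadata)."""
    parts = urlsplit(url)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; this citation has one value per key.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


def build_openurl(resolver, citation):
    """Attach citation metadata to a resolver address."""
    return f"{resolver}?{urlencode(citation)}"


resolver, citation = parse_openurl(OPENURL)

# Same citation, different institution's link server (hypothetical address).
other_library_link = build_openurl("http://resolver.example.edu/menu", citation)
```

Because the citation travels separately from the resolver address, each institution's link server can decide which copy is appropriate for its own users.
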
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community: BIBCO & NACO
– Agreed upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional "buddy system"

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"

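One simple way to make completeness measurable is to score each record against the elements a community expects. The Python below is a sketch under assumptions: the expected element list is a hypothetical local profile, the records are made up, and counting an element as present only when it has a non-blank value is a choice made for this example.

```python
# Hypothetical local profile: the elements every record is expected to carry.
EXPECTED_ELEMENTS = ["title", "creator", "date", "language", "format"]


def has_value(record, element):
    """True when the element is present with at least one non-blank value."""
    return any(value.strip() for value in record.get(element, []))


def completeness(record, expected=EXPECTED_ELEMENTS):
    """Fraction of expected elements that carry a value."""
    present = [name for name in expected if has_value(record, name)]
    return len(present) / len(expected)


def missing_elements(record, expected=EXPECTED_ELEMENTS):
    """Expected elements that are absent or blank."""
    return [name for name in expected if not has_value(record, name)]


# Made-up records: element name -> list of values (elements may repeat).
RECORDS = {
    "rec-1": {"title": ["Map of the county"], "creator": ["Smith, Jane"], "date": ["1897"]},
    "rec-2": {"title": ["Letters"], "creator": [""], "date": ["1901"],
              "language": ["en"], "format": ["text"]},
}

scores = {record_id: completeness(record) for record_id, record in RECORDS.items()}
```

A score like this says nothing about accuracy: a record can be complete and still wrong, which is why completeness is only one of the criteria listed.
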
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as "premium" or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  – Includes some formatting and color coding
– Disadvantages
  – Assumes consistency/predictability
  – Difficult to determine extent of problems found
  – Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  – Better sorting and control by reviewer
– Disadvantages
  – Unwieldy for large files
  – Requires sustained focus from reviewer
  – Requires translation into tab-delimited file

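The translation into a tab-delimited file can be scripted. The Python sketch below writes one row per record id, element name, and value, a layout chosen for this example because it sorts and filters easily in a spreadsheet. The records are made up and `to_tab_delimited` is a hypothetical helper name.

```python
import csv
import io


def to_tab_delimited(records):
    """Flatten records into tab-delimited text: one row per element value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


# Made-up records: element name -> list of values (elements may repeat).
RECORDS = {
    "rec-1": {"title": ["Map of the county"], "date": ["1897"]},
    "rec-2": {"title": ["Letters"], "date": ["unknown"]},
}

table = to_tab_delimited(RECORDS)
lines = table.splitlines()
```

Writing `table` to a file with a `.tsv` or `.txt` extension gives something a spreadsheet can open directly.
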
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  – View of several data dimensions simultaneously
  – Reviewer controls data display
  – Tends to pull reviewer focus to anomalies
  – Handles fairly large files at one time while allowing subset views
  – Display manipulation possible without programmers
– Disadvantages
  – High cost of software
  – Requires translation into tab-delimited file

Element Names vs Record Ids (Scatter Plot)

Missing Elements (Scatter Plot)
– 2 records without language element
– format element present inconsistently
– Easy to rescale axis on the fly and scroll through records

Table View
– Non-empty "no information" values that may confuse end users
– Only DC Date elements are selected for display
– The only W3CDTF syntax present is four digits
– Sorted by element value

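Checks like the ones in this table view can be automated. The Python sketch below sorts date values into three groups: "no information" placeholders, values that follow W3CDTF syntax, and everything else. The placeholder list is a hypothetical starting point, and the pattern checks syntax only (it would accept 2008-02-31, for instance).

```python
import re

# Syntax of the W3C Date and Time Formats profile: YYYY, YYYY-MM, YYYY-MM-DD,
# or a full date plus time with a required time zone designator.
W3CDTF = re.compile(
    r"\d{4}"                              # year
    r"(-(0[1-9]|1[0-2])"                  # optional month
    r"(-(0[1-9]|[12]\d|3[01])"            # optional day
    r"(T([01]\d|2[0-3]):[0-5]\d"          # optional hours and minutes
    r"(:[0-5]\d(\.\d+)?)?"                # optional seconds and fraction
    r"(Z|[+-]([01]\d|2[0-3]):[0-5]\d)"    # time zone, required with a time
    r")?)?)?"
)

# Hypothetical list of "no information" values; extend it for local data.
PLACEHOLDERS = {"unknown", "n/a", "none", "n.d.", "undated", "-"}


def classify_date(value):
    """Return 'placeholder', 'w3cdtf', or 'other' for one date value."""
    cleaned = value.strip()
    if cleaned.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.fullmatch(cleaned):
        return "w3cdtf"
    return "other"


sample_values = ["1897", "2008-03-15", "unknown", "circa 1900", "2008-13-01"]
report = {value: classify_date(value) for value in sample_values}
```

Counting the three groups across a whole harvest gives a quick picture of how consistently dates were entered.
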
Improving Metadata Quality ...
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

... Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the "general good"
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, "Two Paths to Interoperable Metadata"

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
Stylesheets can be created to transform metadata based on crosswalks

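Element to element mapping, the simplest part of a crosswalk, can be sketched as a lookup table. The two equivalences below are the ones named in the Godby quotation (title to MARC 245, author to MARC 100, with the Dublin Core element written as `creator`); real crosswalks also specify indicators, subfields, and content conversions. The sample record and the function name are hypothetical.

```python
# Equivalences taken from the quotation; a real crosswalk has many more rows.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk_record(record, mapping):
    """Map element values to target fields; report elements with no target."""
    converted = []
    unmapped = []
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            # No equivalent in the table: this data would be lost in conversion.
            unmapped.append(element)
            continue
        for value in values:
            converted.append((target, value))
    return converted, unmapped


# Made-up Dublin Core record: element name -> list of values.
DC_RECORD = {
    "title": ["Map of the county"],
    "creator": ["Smith, Jane"],
    "language": ["en"],
}

converted, unmapped = crosswalk_record(DC_RECORD, DC_TO_MARC)
```

Keeping the mapping in a table rather than in code is one way to make the expert knowledge behind a crosswalk easier to share and review.
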
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
MODS title includes an attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

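The mapping can be sketched in Python as follows. The sample MODS fragment is made up. The decisions to join subelements with ". ", to put a space after the nonSort text, and to send every title type other than "alternative" to plain title are assumptions made for this illustration, not rules taken from a published crosswalk.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Made-up MODS fragment for illustration.
SAMPLE_MODS = """\
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Annals of Example County</title>
    <partNumber>Part 2</partNumber>
    <partName>The Later Years</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Example County Annals</title>
  </titleInfo>
</mods>
"""


def subelement_text(title_info, name):
    """Text of one titleInfo subelement, or '' when it is absent."""
    text = title_info.findtext(f"mods:{name}", default="", namespaces=MODS_NS)
    return (text or "").strip()


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into a (DC element, string) pair."""
    pairs = []
    for title_info in ET.fromstring(mods_xml).findall("mods:titleInfo", MODS_NS):
        # Keep the initial article, as the DC-Lib best practice asks.
        article = subelement_text(title_info, "nonSort")
        title = subelement_text(title_info, "title")
        main = f"{article} {title}" if article else title
        # DC title has no substructure, so part number and name are appended.
        pieces = [main,
                  subelement_text(title_info, "partNumber"),
                  subelement_text(title_info, "partName")]
        flattened = ". ".join(piece for piece in pieces if piece)
        # Assumption: only type="alternative" keeps its distinction in DC.
        is_alternative = title_info.get("type") == "alternative"
        pairs.append(("alternative" if is_alternative else "title", flattened))
    return pairs


dc_titles = mods_titles_to_dc(SAMPLE_MODS)
```

The conversion is one-way: once the parts are joined into a single string, the MODS substructure and any authority link cannot be recovered from the DC record.
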
Exercise
Evaluate a small set of human and machine-created metadata
Whatrsquos the Point About Whatrsquos the Point About InteroperabilityInteroperability
For users itrsquos about resource discovery For users itrsquos about resource discovery (user tasks)(user tasks)ndash Whatrsquos out thereWhatrsquos out therendash Is it what I need for my taskIs it what I need for my taskndash Can I use itCan I use it
For resource creators itrsquos about For resource creators itrsquos about distribution and marketingdistribution and marketingndash How can I increase the number of people who How can I increase the number of people who
find my resources easilyfind my resources easilyndash How can I justify the funding required to make How can I justify the funding required to make
these resources availablethese resources availableMetadata Standards amp ApplicationsMetadata Standards amp Applications 33
Metadata Standards amp ApplicationsMetadata Standards amp Applications 44
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting (http://www.openarchives.org)
Roots in the ePrint community, although applicability is much broader
Mission: "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content"
"Content" in this context is actually "metadata about content"

Metadata About the Resource

OAI-PMH in a Nutshell
Essentially provides a simple protocol for "harvest" and "exposure" of metadata records
Specifies a simple "wrapper" around metadata records, providing metadata about the record itself
OAI-PMH is about the metadata, not about the resources

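As an illustration of the "wrapper" idea, the following minimal sketch separates the header of a harvested record (metadata about the record itself) from the descriptive metadata it carries. The sample record, its identifier and the function name are invented for this example; only the OAI-PMH 2.0 and Dublin Core namespaces are taken from the published specifications.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical record, shaped like one <record> inside an OAI-PMH response.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2006-03-15</datestamp>
    <setSpec>maps</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>A Sample Map</dc:title>
    </oai_dc:dc>
  </metadata>
</record>
"""


def split_record(xml_text):
    """Return (wrapper, titles): the header fields and the Dublin Core titles."""
    record = ET.fromstring(xml_text.strip())
    header = record.find(OAI_NS + "header")
    wrapper = {
        "identifier": header.findtext(OAI_NS + "identifier"),
        "datestamp": header.findtext(OAI_NS + "datestamp"),
        "sets": [s.text for s in header.findall(OAI_NS + "setSpec")],
    }
    titles = [t.text for t in record.iter(DC_NS + "title")]
    return wrapper, titles


wrapper, titles = split_record(SAMPLE_RECORD)
print(wrapper)
print(titles)
```

Note that the datestamp in the header describes the metadata record, not the resource the record is about.
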
The OAI World
Divided into two categories
  – Data providers: "A data provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata"
  – Service providers: "A service provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services"

Other important definitions
Archive: Not the same as "archive" as used in libraries; more like "repository"
Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store

Inside OAI Repositories
repository - A repository is a network accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
resource - A resource is the object or "stuff that metadata is about", whether physical or digital, stored in the repository or a constituent of another database
item - An item is a constituent of a repository from which metadata about a resource can be disseminated
record - A record is metadata in a specific metadata format

OAI Goals
Low barrier to participation
  – Server software available in many programming languages, intended to be easy to install
  – Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control

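Flow control in OAI-PMH works by letting a repository return a large list in pages, each page carrying a resumption token that the harvester sends back to obtain the next page. The sketch below shows only that loop. The `fetch_page` callable and the in-memory `fake_fetch_page` stand-in are assumptions made for illustration; a real harvester would issue HTTP requests and parse the XML responses instead.

```python
def harvest_all(fetch_page):
    """Collect every record by following resumption tokens.

    fetch_page(token) must return (records, next_token), where next_token
    is None (or empty) once the list is complete.
    """
    records = []
    token = None  # the first request carries no token
    while True:
        page, token = fetch_page(token)
        records.extend(page)
        if not token:
            break
    return records


def fake_fetch_page(token):
    """Hypothetical repository that serves three records in two pages."""
    pages = {
        None: (["rec1", "rec2"], "t1"),
        "t1": (["rec3"], None),
    }
    return pages[token]


print(harvest_all(fake_fetch_page))
```
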
OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns formats available
ListSets --> returns list of sets available
ListIdentifiers --> returns identifiers only
ListRecords --> returns the records in a set (full records, not only their identifiers)
GetRecord --> returns a particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)

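Each of the six requests is an ordinary HTTP request whose `verb` parameter names the operation. A minimal sketch of building such request URLs follows; the base URL and the record identifier are placeholders, not a real repository, and the helper name is invented for this example.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed on the slide.
VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build the URL for one OAI-PMH request against base_url."""
    if verb not in VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


BASE = "http://repository.example.org/oai"  # hypothetical data provider
print(oai_request_url(BASE, "Identify"))
print(oai_request_url(BASE, "ListRecords", metadataPrefix="oai_dc", set="maps"))
print(oai_request_url(BASE, "GetRecord",
                      identifier="oai:example.org:item-1",
                      metadataPrefix="oai_dc"))
```
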
Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources

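Selective harvesting by date can be sketched as a ListRecords request carrying `from` and `until` arguments, optionally narrowed to a set. The base URL and the function and parameter names below are assumptions for illustration, and day-level granularity is assumed for the datestamps.

```python
from datetime import date
from urllib.parse import urlencode


def selective_harvest_url(base_url, metadata_prefix,
                          since=None, until=None, set_spec=None):
    """Build a ListRecords URL limited by metadata datestamp and/or set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since is not None:
        params["from"] = since.isoformat()   # e.g. 2006-01-01
    if until is not None:
        params["until"] = until.isoformat()
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Ask only for records whose metadata changed since 1 January 2006.
print(selective_harvest_url("http://repository.example.org/oai", "oai_dc",
                            since=date(2006, 1, 1)))
```

A harvester would typically remember the date of its last successful harvest and pass it as `since` on the next run, so that only changed metadata records are transferred.
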
OAI-PMH Optional Containers
Repository level
  – Rights
  – Branding
Record level
  – About
      Provenance
      Rights

About Container Example

OAI Rights Expressions
Rights expressions are valid at three levels
  – Repository
  – Set
  – Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
  – http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu, "Distinct Metadata Schemas"
  – http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  – Choose a schema; look for providers and sample records

What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
  – http://www.ukoln.ac.uk/distributed-systems/openurl/

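To make the structure of an OpenURL concrete, the sketch below splits the sample link into the address of the link resolver and the citation metadata transported to it. The function name is invented for this example, and the link is the illustrative one from the slide, not a working service.

```python
from urllib.parse import parse_qs, urlsplit


def parse_openurl(openurl):
    """Return (resolver, citation) for a key=value style OpenURL."""
    parts = urlsplit(openurl)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs yields a list per key; this sketch keeps the first value only.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


SAMPLE = ("http://sfxserver.uni.edu/sfxmenu"
          "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134")

resolver, citation = parse_openurl(SAMPLE)
print(resolver)
print(citation)
```

The first part identifies the institution's link server; the second part is the citation (ISSN, year, volume, issue, start page) from which the link server decides which copies or services to offer.
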
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
  – Agreed-upon standards for library quality
  – Training and documentation in support of practitioners
  – Review and enforcement of standards by means of institutional "buddy system"

How Does Quality Happen?
Lessons from the library community
  – Quality is quantifiable and measurable
  – To be effective, enforcement of standards of quality must take place at the community level
Furthermore
  – Data problems are not unique to particular communities
  – General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"

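One simple way to put a number on completeness is the share of records in which each expected element carries a non-empty value. The sketch below is only one possible measure, not a standard metric; the record layout (a dict per record) and the element names are assumptions for illustration.

```python
def element_completeness(records, expected_elements):
    """Return {element: fraction of records with a non-empty value}."""
    total = len(records)
    result = {}
    for element in expected_elements:
        filled = sum(
            1 for record in records
            if (record.get(element) or "").strip()
        )
        result[element] = filled / total if total else 0.0
    return result


# Hypothetical sample: four records, two of them lacking a language value.
records = [
    {"title": "A", "language": "en"},
    {"title": "B", "language": ""},
    {"title": "C"},
    {"title": "D", "language": "fr"},
]
print(element_completeness(records, ["title", "language"]))
```

A low fraction for an element that the community expects everywhere is a prompt for review, though a full element is not necessarily an accurate one.
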
Accuracy
Information provided in values should be correct and factual
Editing applied to
  – Eliminate typos
  – Ensure conforming name expressions
  – Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
  – Target object changes but metadata does not
Lag
  – Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
  – Librarians: once and it's done
  – Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
  – Metadata as "premium" or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
  – Advantages
      Includes some formatting and color coding
  – Disadvantages
      Assumes consistency/predictability
      Difficult to determine extent of problems found
      Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
  – Advantages
      Better sorting and control by reviewer
  – Disadvantages
      Unwieldy for large files
      Requires sustained focus from reviewer
      Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
  – Advantages
      View of several data dimensions simultaneously
      Reviewer controls data display
      Tends to pull reviewer focus to anomalies
      Handles fairly large files at one time while allowing subset views
      Display manipulation possible without programmers
  – Disadvantages
      High cost of software
      Requires translation into tab-delimited file

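Both the spreadsheet and the visual-analysis approaches require the metadata to be translated into a tab-delimited file first. A minimal sketch of that translation follows, assuming each record has already been parsed into a record id plus a dict of element names to lists of values. The layout, the `record_id` column name and the " | " separator for repeated values are choices made for this example.

```python
import csv
import io


def records_to_tsv(records, elements):
    """Flatten (record_id, {element: [values]}) pairs into tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id"] + list(elements))
    for record_id, fields in records:
        # Repeated elements are joined so that each record stays on one row.
        row = [" | ".join(fields.get(element, [])) for element in elements]
        writer.writerow([record_id] + row)
    return buffer.getvalue()


sample = [("r1", {"title": ["A"], "date": ["1998", "2001"]})]
print(records_to_tsv(sample, ["title", "date", "language"]))
```

Empty cells then show directly where an element is missing, which is what the sorting and scatter-plot views rely on.
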
Element Names vs Record Ids (Scatter Plot)

Missing Elements (Scatter Plot)
2 records without language element
format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value

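The kind of review shown in the table view can be partly automated. The sketch below sorts date values into rough categories: empty, placeholder "no information" values, values that follow W3CDTF syntax, and everything else. The placeholder list is an assumption and would need to be tuned to the data at hand, and the pattern checks syntax only (it does not reject an impossible month such as 13).

```python
import re

# W3CDTF syntax: YYYY, YYYY-MM, YYYY-MM-DD, or a full date-time with time zone.
W3CDTF = re.compile(
    r"\d{4}"
    r"(-\d{2}"
    r"(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?"
)

# Hypothetical placeholder values; real collections will have their own.
NO_INFO = {"unknown", "n/a", "none", "n.d."}


def classify_date(value):
    """Return 'empty', 'no-information', 'w3cdtf' or 'other' for a date value."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in NO_INFO:
        return "no-information"
    if W3CDTF.fullmatch(text):
        return "w3cdtf"
    return "other"


for sample in ["1998", "1998-03-15", "Unknown", "March 1998", ""]:
    print(repr(sample), classify_date(sample))
```
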
Improving Metadata Quality ...
Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application Profiles
  – Training materials, tools, methodologies

... Over Time
Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the "general good"
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, "Two Paths to Interoperable Metadata"

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record's content to another format, including
  – Element-to-element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks

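Element-to-element mapping, the simplest part of a crosswalk, can be sketched as a lookup table applied to each record. The two mappings below follow the equivalences mentioned in the Godby quotation (title to MARC 245, author to MARC 100), with the Dublin Core element name `creator` assumed for "author". The record layout and function name are invented for this example, and a production crosswalk would also need the hierarchy resolution and content conversions listed above.

```python
# Partial, illustrative mapping only; not a complete DC-to-MARC crosswalk.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk(record, mapping):
    """Map {element: [values]} to target fields; report what could not be mapped."""
    converted = {}
    unmapped = []
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)  # data that would be lost in conversion
            continue
        converted.setdefault(target, []).extend(values)
    return converted, unmapped


record = {
    "title": ["A Sample Map"],
    "creator": ["Doe, Jane"],
    "coverage": ["Ohio"],
}
print(crosswalk(record, DC_TO_MARC))
```

Returning the unmapped elements alongside the converted record makes the limits of the crosswalk visible instead of silently dropping data.
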
Available Crosswalks
Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards make sharing sometimes problematic
  – Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS:title to DC:title
Includes attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
Other attributes
  – ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS:title to DC:title
DC has one element refinement: Alternative
  – DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

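The mapping described above can be sketched in a few lines: the structured MODS title is flattened into a single string, the initial article held in nonSort is put back in front, and titles typed "alternative" go to the DC refinement. The sample record is invented, the ". " separator between parts is an arbitrary choice, and sending every other title type to plain title is a simplification made for this example. The sketch also assumes that nonSort carries its own trailing space.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Hypothetical MODS fragment with a structured title and an alternative title.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Atlas of Ohio</title>
    <partNumber>Part 2</partNumber>
    <partName>Rivers</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Ohio Atlas</title>
  </titleInfo>
</mods>"""


def mods_titles_to_dc(mods_xml):
    """Flatten MODS titleInfo elements into DC title / alternative strings."""
    root = ET.fromstring(mods_xml)
    dc = {"title": [], "alternative": []}
    for info in root.findall(MODS + "titleInfo"):
        non_sort = info.findtext(MODS + "nonSort", default="")
        title = info.findtext(MODS + "title", default="")
        parts = [non_sort + title]  # keep the initial article, per DC-Lib
        for name in ("partNumber", "partName"):
            value = info.findtext(MODS + name)
            if value:
                parts.append(value)
        flat = ". ".join(p.strip() for p in parts if p.strip())
        key = "alternative" if info.get("type") == "alternative" else "title"
        dc[key].append(flat)
    return dc


print(mods_titles_to_dc(SAMPLE_MODS))
```

The conversion is lossy in one direction: once the parts are joined into one string, the MODS substructure cannot be reliably recovered from the DC title.
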
Exercise
Evaluate a small set of human and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 44
OAI-PMHOAI-PMH Open Archives Initiative-Protocol for Open Archives Initiative-Protocol for
Metadata Harvesting (Metadata Harvesting (httpwwwopenarchivesorg))
Roots in the ePrint community although Roots in the ePrint community although applicability is much broaderapplicability is much broader
Mission ldquoThe Open Archives Initiative Mission ldquoThe Open Archives Initiative develops and promotes interoperability develops and promotes interoperability standards that aim to facilitate the standards that aim to facilitate the efficient dissemination of contentrdquoefficient dissemination of contentrdquo
Content in this context is actually Content in this context is actually ldquometadata about contentrdquoldquometadata about contentrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 55
Metadata About the Resource
Metadata Standards amp ApplicationsMetadata Standards amp Applications 66
OAI-PMH in a NutshellOAI-PMH in a Nutshell
Essentially provides a simple protocol Essentially provides a simple protocol for ldquoharvestrdquo and ldquoexposurerdquo of for ldquoharvestrdquo and ldquoexposurerdquo of metadata recordsmetadata records
Specifies a simple ldquowrapperrdquo around Specifies a simple ldquowrapperrdquo around metadata records providing metadata records providing metadata about the record itselfmetadata about the record itself
OAI-PMH is about the OAI-PMH is about the metadatametadata not not about the about the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 77
The OAI WorldThe OAI World Divided into two categoriesDivided into two categories
ndash Data providers ldquoA data provider Data providers ldquoA data provider maintains one or more repositories (web maintains one or more repositories (web servers) that support the OAI-PMH as a servers) that support the OAI-PMH as a means of exposing metadatardquomeans of exposing metadatardquo
ndash Service providers ldquoA service provider Service providers ldquoA service provider issues OAI-PMH requests to data issues OAI-PMH requests to data providers and uses the metadata as a providers and uses the metadata as a basis for building value-added servicesrdquobasis for building value-added servicesrdquo
Other important definitions
- Archive: not the same as ‘archive’ used in libraries; more like “repository”
- Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
- Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store
Inside OAI Repositories
- repository - A repository is a network accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
- resource - A resource is the object or “stuff that metadata is about”, whether physical or digital, stored in the repository or a constituent of another database
- item - An item is a constituent of a repository from which metadata about a resource can be disseminated
- record - A record is metadata in a specific metadata format
OAI Goals
- Low barrier to participation
  - Server software available in many programming languages, intended to be easy to install
  - Server-less implementation available now via “Static repository” (essentially a web page that looks like an OAI response and can be harvested as such)
- Limited set of commands
- Predictable responses and flows of data
Other OAI Info
- Responses are encoded in XML syntax
- OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
- Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
- Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
- OAI-PMH supports flow control
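Flow control in OAI-PMH works through a resumptionToken: a repository may return a list in parts, and the harvester asks for the next part by sending the token back. The sketch below shows that loop under stated assumptions: `fetch` is a hypothetical caller-supplied function that returns response XML for a set of request arguments, and `fake_fetch` stands in for a real repository with two invented pages.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Collect identifiers across all partial ListIdentifiers responses."""
    args = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(args))
        identifiers += [e.text for e in root.iter(OAI + "identifier")]
        token = root.find(f".//{OAI}resumptionToken")
        # A missing or empty token means the list is complete.
        if token is None or not (token.text or "").strip():
            return identifiers
        # Follow-up requests carry only the verb and the token.
        args = {"verb": "ListIdentifiers",
                "resumptionToken": token.text.strip()}

def _page(ids, token):
    """Build a tiny, invented ListIdentifiers response for the demo."""
    headers = "".join(
        f"<header><identifier>{i}</identifier></header>" for i in ids)
    return ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
            f"<ListIdentifiers>{headers}"
            f"<resumptionToken>{token}</resumptionToken>"
            "</ListIdentifiers></OAI-PMH>")

def fake_fetch(args):
    """Stand-in for an HTTP call: first page has a token, second ends the list."""
    if "resumptionToken" not in args:
        return _page(["oai:example.org:1", "oai:example.org:2"], "page2")
    return _page(["oai:example.org:3"], "")

all_ids = harvest_identifiers(fake_fetch)
```

Passing the fetch function in keeps the loop testable without a network connection; a real harvester would also need error handling and polite retry behavior, which are left out here.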
OAI Requests
- Identify --> returns general information about the particular OAI server
- ListMetadataFormats --> returns formats available
- ListSets --> returns list of sets available
- ListIdentifiers --> returns identifiers only
- ListRecords --> returns records in a set
- GetRecord --> returns particular record
- Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)
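Each of the six requests is an ordinary HTTP request whose `verb` argument names the operation. As a rough sketch, the helper below assembles such request URLs; the base URL and identifier are hypothetical, and `oai_request_url` is an illustrative name, not part of any library.

```python
from urllib.parse import urlencode

VERBS = {"Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord"}

def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for one of the six verbs."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # "from" is a Python keyword, so callers pass from_ instead.
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"

BASE = "http://repository.example.org/oai"  # hypothetical endpoint

identify_url = oai_request_url(BASE, "Identify")
record_url = oai_request_url(
    BASE, "GetRecord",
    identifier="oai:example.org:item-1",  # hypothetical identifier
    metadataPrefix="oai_dc",
)
```

Rejecting unknown verbs up front mirrors the “limited set of commands” goal: there are only six things a harvester can ask for.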
Dates Used in OAI-PMH
- Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
- Datestamps are also used in record headers in responses
- Datestamps are particular to a repository
- Repeat: OAI dates are about the metadata, not the resources
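Selective harvesting by date can be sketched as follows. OAI-PMH datestamps are expressed in UTC, either at day granularity (YYYY-MM-DD) or at seconds granularity (YYYY-MM-DDThh:mm:ssZ). The function names and the sample date are illustrative assumptions.

```python
from datetime import datetime, timezone

def oai_datestamp(moment, granularity="day"):
    """Format an aware datetime as an OAI-PMH UTC datestamp."""
    utc = moment.astimezone(timezone.utc)
    if granularity == "day":
        return utc.strftime("%Y-%m-%d")
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")

def incremental_arguments(last_harvest, metadata_prefix="oai_dc"):
    """Arguments for a ListRecords request limited to metadata records
    whose datestamp is on or after the previous harvest."""
    return {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": oai_datestamp(last_harvest),
    }

# Hypothetical time of the previous harvest.
last = datetime(2008, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
args = incremental_arguments(last)
```

Day granularity is used by default because every repository is required to support it; seconds granularity should only be sent to repositories that declare it in their Identify response.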
OAI-PMH Optional Containers
- Repository level
  - Rights
  - Branding
- Record level
  - About
    - Provenance
    - Rights
About Container Example
OAI Rights Expressions
- Rights expressions are valid at three levels
  - Repository
  - Set
  - Record
- Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level
OAI Best Practices (DLF & NSDL)
- Guidelines for data providers and service providers
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
- Best Practices for Shareable Metadata
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
OAI In Practice
- The UIUC OAI-PMH Data Provider Registry
  - http://gita.grainger.uiuc.edu/registry/searchform.asp
- Includes most known data providers
- Link on home page to Service Providers
- Provides multiple reports, sample records, browses, search, etc.
- Ex.: Show report from left hand menu “Distinct Metadata Schemas”
  - http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  - Choose a schema, look for providers and sample records
What’s an OpenURL?
- The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
- Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)
“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website
OpenURL Characteristics
- Protocol operates between an information resource and a service component
- Service component is called a “link server” or “link resolver”
- Link server defines the user context
- Takes source citation and determines whether a user has access
Distinguishing Users
- Uses information stored in a cookie (the CookiePusher mechanism)
- Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
- Identifies a user’s IP address
- Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
- From a record in an abstracting and indexing database (A&I) to the full-text described by the record
- From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
- From a reference in a journal article to a record matching that reference in an A&I database
- From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
- http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
- An OpenURL demo
  - http://www.ukoln.ac.uk/distributed-systems/openurl/
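The example link above, written out as a full URL, has two parts: the address of the link resolver and a query string carrying the citation metadata. A small sketch using only the Python standard library shows how the two can be separated and put back together; the function names are illustrative, not part of any OpenURL toolkit.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

def parse_openurl(url):
    """Split an OpenURL into the resolver address and the citation metadata."""
    parts = urlsplit(url)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; this sketch keeps the first value.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation

def build_openurl(resolver, citation):
    """Attach citation metadata to a (possibly different) link resolver."""
    return f"{resolver}?{urlencode(citation)}"

url = ("http://sfxserver.uni.edu/sfxmenu"
       "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134")
resolver, citation = parse_openurl(url)
rebuilt = build_openurl(resolver, citation)
```

Because the citation travels separately from the resolver address, the same metadata can be handed to whichever institution's link server applies to the user, which is what makes context-sensitive linking possible.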
Defining and Ensuring Metadata Quality
- What constitutes quality
- Techniques for evaluating and enforcing consistency and predictability
- Automated metadata creation: advantages and disadvantages
- Metadata maintenance strategies
Beginning to Define Quality
- Experience of the library community: BIBCO & NACO
  - Agreed upon standards for library quality
  - Training and documentation in support of practitioners
  - Review and enforcement of standards by means of institutional “buddy system”
How Does Quality Happen?
- Lessons from the library community
  - Quality is quantifiable and measurable
  - To be effective, enforcement of standards of quality must take place at the community level
- Furthermore
  - Data problems are not unique to particular communities
  - General strategies can improve interoperability
Quality Measurement Criteria
- Completeness
- Accuracy
- Provenance
- Conformance to expectations
- Logical consistency and coherence
- Timeliness (Currency and Lag)
- Accessibility
Completeness
- “Metadata should describe the target objects as completely as economically feasible”
- “Element set should be applied to the target object population as completely as possible”
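One simple way to put a number on completeness is to count, for each expected element, the share of records that carry a non-empty value. The sketch below assumes records are plain dictionaries mapping element names to lists of values; the sample records and the list of expected elements are invented for illustration.

```python
def element_completeness(records, expected_elements):
    """Share of records with a non-empty value for each expected element."""
    total = len(records)
    report = {}
    for element in expected_elements:
        filled = sum(
            1 for record in records
            # An element counts only if at least one value is non-blank.
            if any(value.strip() for value in record.get(element, []))
        )
        report[element] = filled / total if total else 0.0
    return report

# Invented sample records for illustration.
records = [
    {"title": ["Map of Ithaca"], "language": ["en"], "format": ["image/jpeg"]},
    {"title": ["Letter, 1902"], "language": [""]},
    {"title": ["Field notes"], "format": ["text/plain"]},
    {"title": ["Untitled"], "language": ["en"]},
]
report = element_completeness(records, ["title", "language", "format"])
```

A count like this says nothing about whether the values are accurate or useful; it only shows where the element set has not been applied across the population.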
Accuracy
- Information provided in values should be correct and factual
- Editing applied to
  - Eliminate typos
  - Ensure conforming name expressions
  - Ensure standard abbreviations, usages in general
Provenance
- Who prepared the metadata? What do we know about the preparer?
- What methods were used to create the metadata? Is it human created or created by machine?
- What transformations have been applied since creation?
- Where has it been before?
Conformance to Expectations
- Contains elements a community would expect to find
- Controlled vocabularies are well-chosen and explicitly exposed to downstream users
- Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
- Standard mechanisms like application profiles and common crosswalks are used
- Similar structures and appearance are enabled for search results
- There is very limited reliance on defaulted values
Timeliness
- Currency
  - Target object changes but metadata does not
- Lag
  - Target object disseminated before some or all metadata is available
- “Metadata aging” is affected by cultural differences between librarians and technologists
  - Librarians: once and it’s done
  - Technologists: metadata as an iterative process
Accessibility
- Barriers to accessibility may be economic, technical, or organizational
  - Metadata as “premium” or proprietary information
  - Unreadable for technical reasons (file formats, etc.)
  - Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
- Random sampling (XMLSpy)
  - Advantages
    - Includes some formatting and color coding
  - Disadvantages
    - Assumes consistency/predictability
    - Difficult to determine extent of problems found
    - Tedious at best
Evaluating Metadata (2)
- Spreadsheets (Microsoft Excel)
  - Advantages
    - Better sorting and control by reviewer
  - Disadvantages
    - Unwieldy for large files
    - Requires sustained focus from reviewer
    - Requires translation into tab-delimited file
Evaluating Metadata (3)
- Visual Graphical Analysis (Spotfire)
  - Advantages
    - View of several data dimensions simultaneously
    - Reviewer controls data display
    - Tends to pull reviewer focus to anomalies
    - Handles fairly large files at one time while allowing subset views
    - Display manipulation possible without programmers
  - Disadvantages
    - High cost of software
    - Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
- 2 records without language element
- format element present inconsistently
- Easy to rescale axis on the fly and scroll through records
Table View
- Non-empty “no information” values that may confuse end users
- Only DC Date elements are selected for display
- The only W3CDTF syntax present is four digits
- Sorted by element value
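The two problems noted in the table view (placeholder values and inconsistent date syntax) lend themselves to a simple automated check. The sketch below sorts date values into rough categories. The placeholder list is an assumption made for illustration, and the pattern covers only the date-only levels of W3CDTF (YYYY, YYYY-MM, YYYY-MM-DD); it checks shape, not whether the month and day are valid.

```python
import re

# Date-only W3CDTF levels: YYYY, YYYY-MM, YYYY-MM-DD (shape check only).
W3CDTF_DATE = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")

# Assumed list of "no information" values; a real project would build
# this list from what actually appears in its own data.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}

def classify_date(value):
    """Label a date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF_DATE.match(text):
        return "w3cdtf"
    return "other"

# Invented sample values for illustration.
values = ["1998", "2004-05-17", "n.d.", "circa 1900", "May 2001", ""]
labels = [classify_date(v) for v in values]
```

Counting the labels across a harvested set gives a quick picture of how much of the date data a downstream service could actually sort or search on.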
Improving Metadata Quality ...
- Documentation
  - Basic standards, best practice guidelines, examples
  - Exposure and maintenance of local and community vocabularies
  - Application Profiles
  - Training materials, tools, methodologies
... Over Time
- Culture change
  - Support for documentation and exchange of knowledge and experience
  - Routine contribution to the “general good”
  - More focused research on practical metadata use and quality considerations
  - Better project-based and community-wide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”
“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”
Crosswalks
- In general: semantic mapping of elements between source and target metadata standards
- The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including
  - Element to element mapping
  - Hierarchy and object resolution
  - Metadata content conversions
  - Stylesheets can be created to transform metadata based on crosswalks
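Element to element mapping, the simplest part of a crosswalk, can be recorded in machine-processable form as a lookup table. The sketch below uses the two equivalences mentioned in the Godby quotation (title to MARC 245, author to MARC 100; the Dublin Core element name for author is creator). The sample record and the function name are invented for illustration.

```python
# Dublin Core element -> MARC tag, limited to the two pairs named in the text.
CROSSWALK = {"title": "245", "creator": "100"}

def apply_crosswalk(record, crosswalk):
    """Map each source element to its target field.

    Returns the converted (tag, value) pairs and the list of source
    elements that have no equivalent in the crosswalk.
    """
    converted, unmapped = [], []
    for element, values in record.items():
        tag = crosswalk.get(element)
        if tag is None:
            # Report the gap instead of dropping the data silently.
            unmapped.append(element)
            continue
        converted.extend((tag, value) for value in values)
    return converted, unmapped

# Invented sample record for illustration.
dc_record = {
    "title": ["Sample Report"],
    "creator": ["Doe, Jane"],
    "coverage": ["Ohio"],
}
converted, unmapped = apply_crosswalk(dc_record, CROSSWALK)
```

This covers only the mapping table. Hierarchy resolution, content conversion, and target-side rules such as repeatability or subfields would all need additional logic, which is where most of the real effort in a conversion goes.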
Available Crosswalks
- Library of Congress
  - http://www.loc.gov/marc/marcdocz.html
- MIT
  - http://libraries.mit.edu/guides/subjects/metadata/mappings.html
- Getty
  - http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
- Differences in granularity (complex vs. simple scheme)
  - Some data might be lost
  - Differences in semantics can occur
  - Differences in use of content standards make sharing sometimes problematic
  - Properties may vary (e.g., repeatability)
- Converting everything may not always be the best solution
Example: Mapping MODS:title to DC:title
- Includes attribute for type of title
  - Abbreviated
  - Translated
  - Alternative
  - Uniform
- Other attributes
  - ID, authority, displayLabel, xLink
- Subelements: title, partName, partNumber, nonSort
Mapping MODS:title to DC:title
- DC has one element refinement: Alternative
  - DC title has no substructure; MODS allows for subelements for partNumber, partName
- Best practice statement in DC-Lib says to include initial article
  - MODS parses into <nonSort>
- MODS can link to a title in an authority file if desired
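A rough sketch of this mapping shows where structure is lost. The sample titleInfo below is invented, and the choices made here are assumptions rather than an official mapping: the initial article from nonSort is kept at the front of the title, part number and part name are appended with a period separator, and only type="alternative" is carried over as the DC refinement.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Invented sample for illustration.
SAMPLE = """
<titleInfo xmlns="http://www.loc.gov/mods/v3">
  <nonSort>The </nonSort>
  <title>Metadata Handbook</title>
  <partNumber>Part 2</partNumber>
  <partName>Crosswalks</partName>
</titleInfo>
"""

def mods_title_to_dc(title_info_xml):
    """Flatten a MODS titleInfo into a single DC title string."""
    info = ET.fromstring(title_info_xml)

    def text(name):
        # Missing subelements are treated as empty strings.
        return info.findtext(MODS + name) or ""

    # Keep the initial article by rejoining nonSort and title.
    title = text("nonSort") + text("title")
    parts = [p for p in (text("partNumber"), text("partName")) if p]
    flat = ". ".join([title] + parts)
    # DC offers one refinement; every other title type becomes a plain title.
    element = "alternative" if info.get("type") == "alternative" else "title"
    return element, flat

element, flat = mods_title_to_dc(SAMPLE)
```

Going in this direction is straightforward because the target is simpler; the reverse trip could not reliably recover which words were the article, the part number, or the part name.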
Exercise
- Evaluate a small set of human and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 55
Metadata About the Resource
Metadata Standards amp ApplicationsMetadata Standards amp Applications 66
OAI-PMH in a NutshellOAI-PMH in a Nutshell
Essentially provides a simple protocol Essentially provides a simple protocol for ldquoharvestrdquo and ldquoexposurerdquo of for ldquoharvestrdquo and ldquoexposurerdquo of metadata recordsmetadata records
Specifies a simple ldquowrapperrdquo around Specifies a simple ldquowrapperrdquo around metadata records providing metadata records providing metadata about the record itselfmetadata about the record itself
OAI-PMH is about the OAI-PMH is about the metadatametadata not not about the about the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 77
The OAI WorldThe OAI World Divided into two categoriesDivided into two categories
ndash Data providers ldquoA data provider Data providers ldquoA data provider maintains one or more repositories (web maintains one or more repositories (web servers) that support the OAI-PMH as a servers) that support the OAI-PMH as a means of exposing metadatardquomeans of exposing metadatardquo
ndash Service providers ldquoA service provider Service providers ldquoA service provider issues OAI-PMH requests to data issues OAI-PMH requests to data providers and uses the metadata as a providers and uses the metadata as a basis for building value-added servicesrdquobasis for building value-added servicesrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 88
Metadata Standards amp ApplicationsMetadata Standards amp Applications 99
Other important definitionsOther important definitions
ArchiveArchive Not the same as lsquoarchiversquo used in Not the same as lsquoarchiversquo used in libraries more like ldquorepositoryrdquolibraries more like ldquorepositoryrdquo
ProtocolProtocol a set of rules defining a set of rules defining communication between systems FTP communication between systems FTP (File Transfer Protocol) and HTTP (File Transfer Protocol) and HTTP (Hypertext Transport Protocol) are other (Hypertext Transport Protocol) are other examples of Internet protocolsexamples of Internet protocols
HarvestingHarvesting the gathering together of the gathering together of metadata from a number of distributed metadata from a number of distributed repositories into a combined data storerepositories into a combined data store
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1010
Inside OAI RepositoriesInside OAI Repositories repositoryrepository - A - A repositoryrepository is a network is a network
accessible server that can process accessible server that can process requests A requests A repositoryrepository is managed by a is managed by a data provider to expose metadata to data provider to expose metadata to harvestersharvesters
resourceresource - A - A resourceresource is the object or is the object or stuff that metadata is aboutrdquo whether stuff that metadata is aboutrdquo whether physical or digital stored in the repository physical or digital stored in the repository or a constituent of another databaseor a constituent of another database
itemitem - An - An itemitem is a constituent of a is a constituent of a repository from which metadata about a repository from which metadata about a resource can be disseminated resource can be disseminated
recordrecord - A - A recordrecord is metadata in a specific is metadata in a specific metadata formatmetadata format
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1111
OAI Goals
Low barrier to participation
  - Server software is available in many programming languages and is intended to be easy to install
  - A server-less implementation is now available via the “static repository” (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Datestamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control

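Flow control in OAI-PMH works through resumption tokens: a list response may be incomplete and carry a resumptionToken, which the harvester sends back to request the next chunk. The sketch below shows that loop in Python. The `fetch` argument is a hypothetical callable (not part of any library) that takes the request parameters and returns the response XML as a string, so the network layer is left out.

```python
import xml.etree.ElementTree as ET

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Collect every record identifier, following resumption tokens.

    `fetch` is an assumed helper: it receives a dict of request
    parameters and returns the XML response as a string.
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI_NS + "header"):
            identifiers.append(header.findtext(OAI_NS + "identifier"))
        token = root.find(".//" + OAI_NS + "resumptionToken")
        # A missing or empty token means the list is complete.
        if token is None or not (token.text or "").strip():
            return identifiers
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers",
                  "resumptionToken": token.text.strip()}
```

Passing the fetcher in as a parameter keeps the loop testable against canned responses before it is pointed at a live repository.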
OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns the metadata formats available
ListSets --> returns the list of sets available
ListIdentifiers --> returns identifiers (record headers) only
ListRecords --> returns the full records, optionally limited to a set
GetRecord --> returns a particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)

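Each of the six requests is an ordinary HTTP request whose `verb` parameter names the command. A minimal Python sketch of building such request URLs follows. The base URL, set name and record identifier are made-up placeholders, and `oai_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real one is published by the data provider.
BASE_URL = "http://repository.example.org/oai"

VERBS = {"Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord"}


def oai_request(verb, **arguments):
    """Build the request URL for one of the six OAI-PMH verbs."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return BASE_URL + "?" + urlencode({"verb": verb, **arguments})


print(oai_request("Identify"))
print(oai_request("ListRecords", metadataPrefix="oai_dc", set="maps"))
print(oai_request("GetRecord", identifier="oai:example.org:123",
                  metadataPrefix="oai_dc"))
```

Pasting a URL of this shape into a browser against a real repository returns the XML response directly, which is a quick way to explore a data provider.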
Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally the latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources

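Selective harvesting by date uses the optional `from` and `until` arguments of the list requests. The Python sketch below builds such a request and checks that each datestamp uses one of the two UTC granularities OAI-PMH defines (day, or seconds with a trailing Z). The function name and the base URL in the usage line are illustrative assumptions.

```python
import re
from urllib.parse import urlencode

# Day granularity (YYYY-MM-DD) or seconds granularity (YYYY-MM-DDThh:mm:ssZ).
DATESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}Z)?$")


def selective_harvest_url(base_url, from_date=None, until_date=None,
                          metadata_prefix="oai_dc"):
    """Build a ListRecords request limited to a datestamp range."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    for name, value in (("from", from_date), ("until", until_date)):
        if value is None:
            continue
        if not DATESTAMP.match(value):
            raise ValueError(f"not an OAI-PMH datestamp: {value!r}")
        params[name] = value
    return base_url + "?" + urlencode(params)


# Ask only for records whose metadata changed since 1 January 2008.
print(selective_harvest_url("http://repository.example.org/oai",
                            from_date="2008-01-01"))
```

The range is matched against the datestamp of the metadata record, so a resource that changed without a metadata update will not show up in the result.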
OAI-PMH Optional Containers
Repository level
  - Rights
  - Branding
Record level
  - About
    - Provenance
    - Rights

About Container Example

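The slide's own example is not preserved in this transcript. As a stand-in, the Python sketch below parses a record-level about container carrying a provenance statement. The element names follow the OAI provenance guidelines as best recalled here, and all values in the sample (URLs, identifiers, dates) are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OAI provenance schema.
PROV = "{http://www.openarchives.org/OAI/2.0/provenance}"

# Illustrative sample only; every value below is made up.
SAMPLE_ABOUT = """
<about xmlns="http://www.openarchives.org/OAI/2.0/">
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2008-02-02T14:10:02Z" altered="true">
      <baseURL>http://repository.example.org/oai</baseURL>
      <identifier>oai:example.org:123</identifier>
      <datestamp>2008-01-15</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>
"""


def origin_of(about_xml):
    """Summarise where a re-exposed record was originally harvested from."""
    root = ET.fromstring(about_xml.strip())
    origin = root.find(".//" + PROV + "originDescription")
    return {
        "harvested": origin.get("harvestDate"),
        "altered": origin.get("altered") == "true",
        "source_repository": origin.findtext(PROV + "baseURL"),
        "source_identifier": origin.findtext(PROV + "identifier"),
    }


print(origin_of(SAMPLE_ABOUT))
```

An aggregator that re-exposes harvested records can use a statement like this to tell downstream harvesters where a record came from and whether it was changed along the way.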
OAI Rights Expressions
Rights expressions are valid at three levels
  - Repository
  - Set
  - Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
  - http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show the report “Distinct Metadata Schemas” from the left-hand menu
  - http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  - Choose a schema; look for providers and sample records

What’s an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)

“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes the source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user’s IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing (A&I) database to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
  - http://www.ukoln.ac.uk/distributed-systems/openurl/

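An OpenURL of this key/value style has two parts: the address of the link resolver and the citation metadata carried in the query string. The Python sketch below splits the slide's example URL into those parts and rebuilds it. `parse_openurl` and `build_openurl` are hypothetical helper names, not part of any OpenURL library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def parse_openurl(url):
    """Split an OpenURL into the resolver address and the citation metadata."""
    parts = urlsplit(url)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; keep the first value of each.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


def build_openurl(resolver, citation):
    """Attach citation metadata to a resolver address."""
    return resolver + "?" + urlencode(citation)


example = ("http://sfxserver.uni.edu/sfxmenu"
           "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134")
resolver, citation = parse_openurl(example)
print(resolver)
print(citation)
```

Because the citation travels separately from the resolver address, the same metadata can be sent to a different institution's link server simply by swapping the first argument of `build_openurl`.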
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
  - Agreed-upon standards for library quality
  - Training and documentation in support of practitioners
  - Review and enforcement of standards by means of an institutional “buddy system”

How Does Quality Happen?
Lessons from the library community
  - Quality is quantifiable and measurable
  - To be effective, enforcement of standards of quality must take place at the community level
Furthermore
  - Data problems are not unique to particular communities
  - General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”

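One simple way to put a number on completeness is to count, for each element, the share of records that carry at least one non-empty value. The Python sketch below does that. The record structure (a dict mapping element names to lists of string values) and the sample data are assumptions for illustration, and this is only one of several possible completeness measures.

```python
def completeness(records, elements):
    """Share of records with at least one non-empty value, per element."""
    total = len(records)
    report = {}
    for element in elements:
        filled = sum(
            1 for record in records
            if any(value.strip() for value in record.get(element, []))
        )
        report[element] = filled / total if total else 0.0
    return report


# Made-up sample: four records, two of them without a language element.
sample = [
    {"title": ["Atlas of Ohio"], "language": ["en"]},
    {"title": ["River maps"], "language": ["en"]},
    {"title": ["Untitled survey"], "language": [""]},
    {"title": ["Field notes"]},
]
print(completeness(sample, ["title", "language", "format"]))
```

Whitespace-only values are treated as missing here. Placeholder values such as “unknown” would still count as present, so a real review would need to screen for those separately.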
Accuracy
Information provided in values should be correct and factual
Editing applied to
  - Eliminate typos
  - Ensure conforming name expressions
  - Ensure standard abbreviations and usage in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
  - Target object changes but metadata does not
Lag
  - Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
  - Librarians: once and it’s done
  - Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
  - Metadata as “premium” or proprietary information
  - Unreadable for technical reasons (file formats, etc.)
  - Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
  - Advantages
    - Includes some formatting and color coding
  - Disadvantages
    - Assumes consistency/predictability
    - Difficult to determine extent of problems found
    - Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
  - Advantages
    - Better sorting and control by reviewer
  - Disadvantages
    - Unwieldy for large files
    - Requires sustained focus from reviewer
    - Requires translation into tab-delimited file

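The translation into a tab-delimited file can be scripted. The Python sketch below writes one row per record and one column per element, joining repeated values with a separator. The record structure, the `" | "` separator and the sample data are assumptions made for illustration.

```python
import csv
import io


def to_tab_delimited(records, elements):
    """Render records as tab-delimited text: one row per record,
    one column per element, repeated values joined with ' | '."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id"] + list(elements))
    for record_id, fields in records.items():
        row = [" | ".join(fields.get(element, [])) for element in elements]
        writer.writerow([record_id] + row)
    return buffer.getvalue()


# Made-up sample records keyed by record identifier.
sample = {
    "rec1": {"title": ["Atlas of Ohio"], "date": ["1998", "2001"]},
    "rec2": {"title": ["River maps"]},
}
print(to_tab_delimited(sample, ["title", "date"]))
```

Missing elements come out as empty cells, which makes gaps easy to sort to the top once the file is opened in a spreadsheet.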
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
  - Advantages
    - View of several data dimensions simultaneously
    - Reviewer controls data display
    - Tends to pull reviewer focus to anomalies
    - Handles fairly large files at one time while allowing subset views
    - Display manipulation possible without programmers
  - Disadvantages
    - High cost of software
    - Requires translation into tab-delimited file

Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
Annotations on the plot:
  - 2 records without a language element
  - format element present inconsistently
  - Easy to rescale the axis on the fly and scroll through records

Table View
Annotations on the table:
  - Non-empty “no information” values that may confuse end users
  - Only DC Date elements are selected for display
  - The only W3CDTF syntax present is four digits
  - Sorted by element value

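The two problems flagged in the table view, placeholder values and date values that do not follow W3CDTF, can also be screened for automatically. The Python sketch below sorts date values into rough classes. The placeholder list is an assumption, and the pattern checks W3CDTF syntax only, so an impossible month such as 13 would still pass.

```python
import re

# W3CDTF shapes: YYYY, YYYY-MM, YYYY-MM-DD, or a full date with a time
# and a time zone designator (Z or +hh:mm / -hh:mm).
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

# Assumed list of "no information" placeholders; extend it per collection.
PLACEHOLDERS = {"n/a", "none", "unknown", "no date", "n.d.", "undated"}


def classify_date(value):
    """Sort a date value into one of four rough classes."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "no-information"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"


for value in ["1998", "1998-05-12", "ca. 1900", "n.d.", "19980512"]:
    print(value, "->", classify_date(value))
```

Counting the classes across a harvested set gives a quick picture of how much date normalization a collection would need before dates can be searched or sorted reliably.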
Improving Metadata Quality ...
Documentation
  - Basic standards, best practice guidelines, examples
  - Exposure and maintenance of local and community vocabularies
  - Application Profiles
  - Training materials, tools, methodologies

... Over Time
Culture change
  - Support for documentation and exchange of knowledge and experience
  - Routine contribution to the “general good”
  - More focused research on practical metadata use and quality considerations
  - Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The metadata conversion specification covers the transformations required to convert the content of a metadata record to another format, including
  - Element-to-element mapping
  - Hierarchy and object resolution
  - Metadata content conversions
  - Stylesheets can be created to transform metadata based on crosswalks

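At its simplest, the element-to-element part of a crosswalk is a lookup table. The Python sketch below uses the two equivalences named in the Godby quotation (title to MARC 245, author to MARC 100, with `creator` used here as the Dublin Core element name for the author). The record structure and function name are illustrative assumptions. It ignores hierarchy, content conversion and repeatability rules, all of which a real conversion has to handle.

```python
# Element-to-element mapping; both pairs are taken from the quotation.
CROSSWALK = {
    "title": "245",
    "creator": "100",
}


def crosswalk_record(dc_record, mapping=CROSSWALK):
    """Apply an element-to-element mapping to one record.

    Returns the converted fields plus whatever had no target,
    so data that would be lost in conversion stays visible.
    """
    converted, unmapped = {}, {}
    for element, values in dc_record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = values
        else:
            converted.setdefault(target, []).extend(values)
    return converted, unmapped


# Made-up sample record.
record = {"title": ["Atlas of Ohio"], "creator": ["Smith, J."],
          "coverage": ["Ohio"]}
print(crosswalk_record(record))
```

Returning the unmapped elements alongside the converted ones makes the limits of a crosswalk concrete: anything in the second dict has no one-to-one correspondence in the target scheme.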
Available Crosswalks
Library of Congress
  - http://www.loc.gov/marc/marcdocz.html
MIT
  - http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
  - http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
  - Some data might be lost
  - Differences in semantics can occur
  - Differences in use of content standards make sharing sometimes problematic
  - Properties may vary (e.g. repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS:title to DC:title
MODS includes an attribute for type of title
  - Abbreviated
  - Translated
  - Alternative
  - Uniform
Other attributes
  - ID; authority; displayLabel; xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS:title to DC:title
DC has one element refinement: Alternative
  - DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include the initial article
  - MODS parses it into <nonSort>
MODS can link to a title in an authority file if desired

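The mapping can be sketched in Python as a function that flattens one MODS titleInfo element into a single DC value. Two choices here are assumptions of this sketch, not a published crosswalk: every typed title (abbreviated, translated, alternative, uniform) is sent to DC's Alternative refinement, and the parts are joined with a full stop.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace.
MODS = "{http://www.loc.gov/mods/v3}"


def mods_title_to_dc(title_info):
    """Flatten one MODS <titleInfo> element into (dc_element, value)."""
    def text_of(name):
        return (title_info.findtext(MODS + name) or "").strip()

    # Put the initial article back in front of the title, since DC
    # has no counterpart to <nonSort>.
    parts = [f"{text_of('nonSort')} {text_of('title')}".strip()]
    # DC title has no substructure, so part information is appended.
    for name in ("partNumber", "partName"):
        if text_of(name):
            parts.append(text_of(name))
    # Assumption: any typed title becomes the Alternative refinement.
    element = "alternative" if title_info.get("type") else "title"
    return element, ". ".join(parts)


# Made-up sample title.
sample = ET.fromstring(
    '<titleInfo xmlns="http://www.loc.gov/mods/v3">'
    "<nonSort>The</nonSort><title>Atlas of Ohio</title>"
    "<partNumber>Part 2</partNumber><partName>Rivers</partName>"
    "</titleInfo>"
)
print(mods_title_to_dc(sample))
```

The conversion only runs cleanly in this direction. Going from the flattened DC value back to MODS would mean guessing where the article and the part information begin, which is the granularity problem described under Problems With Converted Records.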
Exercise
Evaluate a small set of human- and machine-created metadata
HarvestingHarvesting the gathering together of the gathering together of metadata from a number of distributed metadata from a number of distributed repositories into a combined data storerepositories into a combined data store
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1010
Inside OAI RepositoriesInside OAI Repositories repositoryrepository - A - A repositoryrepository is a network is a network
accessible server that can process accessible server that can process requests A requests A repositoryrepository is managed by a is managed by a data provider to expose metadata to data provider to expose metadata to harvestersharvesters
resourceresource - A - A resourceresource is the object or is the object or stuff that metadata is aboutrdquo whether stuff that metadata is aboutrdquo whether physical or digital stored in the repository physical or digital stored in the repository or a constituent of another databaseor a constituent of another database
itemitem - An - An itemitem is a constituent of a is a constituent of a repository from which metadata about a repository from which metadata about a resource can be disseminated resource can be disseminated
recordrecord - A - A recordrecord is metadata in a specific is metadata in a specific metadata formatmetadata format
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1111
OAI GoalsOAI Goals Low barrier to participationLow barrier to participation
ndash Server software available in many Server software available in many programming languages intended to be programming languages intended to be easy to installeasy to install
ndash Server-less implementation available now Server-less implementation available now via ldquoStatic repositoryrdquo (essentially a web via ldquoStatic repositoryrdquo (essentially a web page that looks like an OAI response and page that looks like an OAI response and can be harvested as such)can be harvested as such)
Limited set of commandsLimited set of commands Predictable responses and flows of Predictable responses and flows of
datadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1212
Other OAI InfoOther OAI Info Responses are encoded in XML syntaxResponses are encoded in XML syntax OAI-PMH supports any metadata format OAI-PMH supports any metadata format
encoded in XMLmdashSimple Dublin Core is the encoded in XMLmdashSimple Dublin Core is the minimal format specified minimal format specified
Data Providers may define a logical set Data Providers may define a logical set hierarchy to support levels of granularity hierarchy to support levels of granularity for harvesting by Service Providersfor harvesting by Service Providers
Date stamps flag the last change of the Date stamps flag the last change of the metadata set and thus provide further metadata set and thus provide further support for granularity of harvestingsupport for granularity of harvesting
OAI-PMH supports flow controlOAI-PMH supports flow control
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1313
OAI RequestsOAI Requests Identify--gtReturns general information Identify--gtReturns general information
about the particular OAI serverabout the particular OAI server ListMetadataFormats--gtreturns formats ListMetadataFormats--gtreturns formats
availableavailable ListSets--gtreturns list of sets availableListSets--gtreturns list of sets available ListIdentifiers--gtreturns identifiers onlyListIdentifiers--gtreturns identifiers only ListRecords--gtreturns record ids in a setListRecords--gtreturns record ids in a set GetRecord--gtreturns particular recordGetRecord--gtreturns particular record Try it out at the UIUC OIA Registry Try it out at the UIUC OIA Registry ((
httpgitagraingeruiuceduregistrysearchformasp))
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1414
Dates Used in OAI-PMHDates Used in OAI-PMH
Datestamps are used as values in requests Datestamps are used as values in requests to support selective harvesting by date to support selective harvesting by date (generally latest update date of the (generally latest update date of the metadata record)metadata record)
Datestamps are also used in record Datestamps are also used in record headers in responsesheaders in responses
Datestamps are particular to a repositoryDatestamps are particular to a repository Repeat OAI dates are about the Repeat OAI dates are about the metadatametadata
not the not the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1515
OAI-PMH Optional ContainersOAI-PMH Optional Containers
Repository levelRepository levelndash RightsRightsndash BrandingBranding
Record levelRecord levelndash AboutAbout
ProvenanceProvenanceRightsRights
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1616
About Container ExampleAbout Container Example
OAI Rights Expressions
- Rights expressions are valid at three levels
  - Repository
  - Set
  - Record
- Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
- Guidelines for data providers and service providers
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
- Best Practices for Shareable Metadata
  - http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
- The UIUC OAI-PMH Data Provider Registry
  - http://gita.grainger.uiuc.edu/registry/searchform.asp
- Includes most known data providers
- Link on home page to Service Providers
- Provides multiple reports, sample records, browses, search, etc.
- Ex.: Show report from left-hand menu, "Distinct Metadata Schemas"
  - http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  - Choose a schema, look for providers and sample records

What's an OpenURL?
- The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
- Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
- Protocol operates between an information resource and a service component
- Service component is called a "link server" or "link resolver"
- Link server defines the user context
- Takes source citation and determines whether a user has access

Distinguishing Users
- Uses information stored in a cookie (the CookiePusher mechanism)
- Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
- Identifies a user's IP address
- Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
- From a record in an abstracting and indexing (A&I) database to the full text described by the record
- From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
- From a reference in a journal article to a record matching that reference in an A&I database
- From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
- http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
- An OpenURL demo
  - http://www.ukoln.ac.uk/distributed-systems/openurl/

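The example link above is a resolver address followed by citation metadata as key/value pairs. A minimal sketch of assembling and reading back such a link follows. The resolver address and citation values are the ones from the slide example, while the function names are made up for this illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Resolver (link server) address taken from the slide example.
RESOLVER = "http://sfxserver.uni.edu/sfxmenu"


def build_openurl(resolver, citation):
    """Attach citation metadata to a link-server address."""
    return resolver + "?" + urlencode(citation)


def parse_openurl(url):
    """Recover the citation metadata carried by an OpenURL-style link."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}

link = build_openurl(RESOLVER, citation)
print(link)
print(parse_openurl(link))
```

The source only transports the metadata. Deciding which copy the user may reach is left to the link server that receives it.
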
Defining and Ensuring Metadata Quality
- What constitutes quality?
- Techniques for evaluating and enforcing consistency and predictability
- Automated metadata creation: advantages and disadvantages
- Metadata maintenance strategies

Beginning to Define Quality
- Experience of the library community: BIBCO & NACO
  - Agreed-upon standards for library quality
  - Training and documentation in support of practitioners
  - Review and enforcement of standards by means of an institutional "buddy system"

How Does Quality Happen?
- Lessons from the library community
  - Quality is quantifiable and measurable
  - To be effective, enforcement of standards of quality must take place at the community level
- Furthermore
  - Data problems are not unique to particular communities
  - General strategies can improve interoperability

Quality Measurement Criteria
- Completeness
- Accuracy
- Provenance
- Conformance to expectations
- Logical consistency and coherence
- Timeliness (Currency and Lag)
- Accessibility

Completeness
- "Metadata should describe the target objects as completely as economically feasible"
- "Element set should be applied to the target object population as completely as possible"

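One simple way to put a number on the second statement is to count, for each expected element, the share of records that carry a non-empty value. The sketch below assumes records are held as dictionaries mapping element names to lists of values. The sample records and the list of expected elements are invented for illustration.

```python
def element_completeness(records, expected_elements):
    """Return, per element, the fraction of records with a non-empty value."""
    total = len(records)
    result = {}
    for element in expected_elements:
        filled = sum(
            1
            for record in records
            if any(value.strip() for value in record.get(element, []))
        )
        result[element] = filled / total if total else 0.0
    return result


# Invented sample: four records with uneven use of language and format.
sample_records = [
    {"title": ["Map of Ithaca"], "language": ["en"], "format": ["image/jpeg"]},
    {"title": ["Field notes"], "language": ["en"], "format": [""]},
    {"title": ["Survey data"], "format": ["text/csv"]},
    {"title": ["Interview"], "format": ["audio/mpeg"]},
]

scores = element_completeness(sample_records, ["title", "language", "format"])
print(scores)
```

An empty string counts as missing here. Whether a placeholder such as "unknown" should also count as missing is a local policy decision this sketch does not make.
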
Accuracy
- Information provided in values should be correct and factual
- Editing applied to
  - Eliminate typos
  - Ensure conforming name expressions
  - Ensure standard abbreviations and usages in general

Provenance
- Who prepared the metadata? What do we know about the preparer?
- What methods were used to create the metadata? Is it human-created or created by machine?
- What transformations have been applied since creation?
- Where has it been before?

Conformance to Expectations
- Contains elements a community would expect to find
- Controlled vocabularies are well-chosen and explicitly exposed to downstream users
- Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
- Standard mechanisms like application profiles and common crosswalks are used
- Similar structures and appearance are enabled for search results
- There is very limited reliance on defaulted values

Timeliness
- Currency
  - Target object changes but metadata does not
- Lag
  - Target object disseminated before some or all metadata is available
- "Metadata aging" is affected by cultural differences between librarians and technologists
  - Librarians: once and it's done
  - Technologists: metadata as an iterative process

Accessibility
- Barriers to accessibility may be economic, technical, or organizational
  - Metadata as "premium" or proprietary information
  - Unreadable for technical reasons (file formats, etc.)
  - Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
- Random sampling (XMLSpy)
  - Advantages
    - Includes some formatting and color coding
  - Disadvantages
    - Assumes consistency/predictability
    - Difficult to determine extent of problems found
    - Tedious at best

Evaluating Metadata (2)
- Spreadsheets (Microsoft Excel)
  - Advantages
    - Better sorting and control by reviewer
  - Disadvantages
    - Unwieldy for large files
    - Requires sustained focus from reviewer
    - Requires translation into tab-delimited file

Evaluating Metadata (3)
- Visual Graphical Analysis (Spotfire)
  - Advantages
    - View of several data dimensions simultaneously
    - Reviewer controls data display
    - Tends to pull reviewer focus to anomalies
    - Handles fairly large files at one time while allowing subset views
    - Display manipulation possible without programmers
  - Disadvantages
    - High cost of software
    - Requires translation into tab-delimited file

Element Names vs. Record IDs (Scatter Plot) (slide image not reproduced)

Missing Elements (Scatter Plot)
- 2 records without language element
- format element present inconsistently
- Easy to rescale axis on the fly and scroll through records

Table View
- Non-empty "no information" values that may confuse end users
- Only DC Date elements are selected for display
- The only W3CDTF syntax present is four digits
- Sorted by element value

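The two date problems noted on this slide (placeholder values and values that do not follow W3CDTF) can also be flagged automatically. The sketch below is one possible check, not a complete validator: the placeholder list is an assumption chosen for illustration, and the pattern tests syntax only, so an impossible date such as 2001-02-30 would still pass.

```python
import re

# W3CDTF forms: YYYY, YYYY-MM, YYYY-MM-DD, and date plus time with a zone.
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?"
    r"(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?$"
)

# Assumed examples of non-empty "no information" values.
PLACEHOLDERS = {"n.d.", "unknown", "none", "no date"}


def classify_date(value):
    """Label a date value as 'placeholder', 'w3cdtf', or 'other'."""
    cleaned = value.strip()
    if cleaned.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(cleaned):
        return "w3cdtf"
    return "other"


for sample in ["1998", "2004-06-15", "n.d.", "June 1998", "1998-13"]:
    print(sample, "->", classify_date(sample))
```

Sorting the "other" and "placeholder" values, as in the table view, tends to reveal a small number of recurring patterns that can be corrected in bulk.
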
Improving Metadata Quality ...
- Documentation
  - Basic standards, best practice guidelines, examples
  - Exposure and maintenance of local and community vocabularies
  - Application Profiles
  - Training materials, tools, methodologies

... Over Time
- Culture change
  - Support for documentation and exchange of knowledge and experience
  - Routine contribution to the "general good"
  - More focused research on practical metadata use and quality considerations
  - Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, "Two Paths to Interoperable Metadata"

Crosswalks
- In general: semantic mapping of elements between source and target metadata standards
- The process of metadata conversion specification includes transformations required to convert a metadata record's content to another format, including
  - Element-to-element mapping
  - Hierarchy and object resolution
  - Metadata content conversions
  - Stylesheets can be created to transform metadata based on crosswalks

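Element-to-element mapping, the simplest of the transformations listed, can be sketched as a lookup table. The two equivalences below are the ones named in the Godby quotation (title to MARC 245, author to MARC 100). The record layout, the `creator` key used for "author", and the function name are assumptions made for this example. Repeatability rules of the target format are not enforced.

```python
# Equivalences from the quotation; "creator" stands in for "author".
CROSSWALK = {
    "title": "245",
    "creator": "100",
}


def dc_to_marc(dc_record):
    """Map DC elements to MARC tags; report elements with no mapping."""
    marc_fields = {}
    unmapped = {}
    for element, values in dc_record.items():
        tag = CROSSWALK.get(element)
        if tag is None:
            # No equivalence recorded: this content would be lost.
            unmapped[element] = list(values)
        else:
            marc_fields.setdefault(tag, []).extend(values)
    return marc_fields, unmapped


# Invented sample record.
sample = {
    "title": ["Crosswalks in practice"],
    "creator": ["Doe, Jane"],
    "coverage": ["Ithaca, N.Y."],
}

fields, leftovers = dc_to_marc(sample)
print(fields)
print(leftovers)
```

Returning the unmapped elements separately makes the limits of the crosswalk visible instead of silently dropping data.
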
Available Crosswalks
- Library of Congress
  - http://www.loc.gov/marc/marcdocz.html
- MIT
  - http://libraries.mit.edu/guides/subjects/metadata/mappings.html
- Getty
  - http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
- Differences in granularity (complex vs. simple scheme)
  - Some data might be lost
  - Differences in semantics can occur
  - Differences in use of content standards make sharing sometimes problematic
  - Properties may vary (e.g., repeatability)
- Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
- Includes attribute for type of title
  - Abbreviated
  - Translated
  - Alternative
  - Uniform
- Other attributes
  - ID, authority, displayLabel, xLink
- Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
- DC has one element refinement: Alternative
  - DC title has no substructure; MODS allows for subelements for partNumber, partName
- Best practice statement in DC-Lib says to include initial article
  - MODS parses into <nonSort>
- MODS can link to a title in an authority file if desired

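The loss of structure in this mapping can be seen in a small sketch that flattens a MODS title into a single DC value. The sample fragment, the ". " separator between parts, and the rule that any typed title goes to the Alternative refinement are assumptions made for illustration, not a published crosswalk.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Invented MODS fragment with an initial article parsed into nonSort.
SAMPLE = """<titleInfo xmlns="http://www.loc.gov/mods/v3">
  <nonSort>The</nonSort>
  <title>Olympics</title>
  <partNumber>Part 2</partNumber>
  <partName>A history</partName>
</titleInfo>"""


def mods_title_to_dc(title_info):
    """Flatten a MODS titleInfo element into (dc_element, value)."""

    def text(name):
        child = title_info.find(MODS + name)
        return (child.text or "").strip() if child is not None else ""

    # Keep the initial article, since DC title has no nonSort equivalent.
    main = " ".join(part for part in (text("nonSort"), text("title")) if part)
    parts = [part for part in (main, text("partNumber"), text("partName")) if part]
    # Assumption: titles with a type attribute map to the refinement.
    element = "alternative" if title_info.get("type") else "title"
    return element, ". ".join(parts)


dc_element, dc_value = mods_title_to_dc(ET.fromstring(SAMPLE))
print(dc_element, "=", dc_value)
```

Going the other way is harder: nothing in the flattened DC value says where the article, part number, and part name begin and end.
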
Exercise
- Evaluate a small set of human and machine-created metadata

The OAI World
- Divided into two categories
  - Data providers: "A data provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata"
  - Service providers: "A service provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services"

Other important definitions
- Archive: not the same as "archive" used in libraries; more like "repository"
- Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
- Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store

Inside OAI Repositories
- repository: A repository is a network-accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
- resource: A resource is the object or "stuff" that metadata is "about", whether physical or digital, stored in the repository or a constituent of another database
- item: An item is a constituent of a repository from which metadata about a resource can be disseminated
- record: A record is metadata in a specific metadata format

OAI Goals
- Low barrier to participation
  - Server software available in many programming languages, intended to be easy to install
  - Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
- Limited set of commands
- Predictable responses and flows of data

Other OAI Info
- Responses are encoded in XML syntax
- OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
- Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
- Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
- OAI-PMH supports flow control

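Flow control in OAI-PMH works through resumption tokens: a repository may return a long list in pieces, handing back a token with each incomplete piece that the harvester sends in its next request. The sketch below shows the harvester's loop against a simulated repository. The page contents, token values, and function names are invented, and no network access is involved.

```python
def harvest(fetch_page):
    """Collect identifiers across pages by following resumption tokens.

    fetch_page(token) must return (identifiers, next_token), where
    next_token is None once the list is complete.
    """
    identifiers = []
    token = None
    while True:
        page, token = fetch_page(token)
        identifiers.extend(page)
        if not token:
            break
    return identifiers


# Simulated repository that splits five identifiers over three responses.
PAGES = {
    None: (["oai:example:1", "oai:example:2"], "token-1"),
    "token-1": (["oai:example:3", "oai:example:4"], "token-2"),
    "token-2": (["oai:example:5"], None),
}


def fake_fetch(token):
    """Stand-in for an HTTP request to a data provider."""
    return PAGES[token]


harvested = harvest(fake_fetch)
print(harvested)
```

Keeping the fetch step as a separate function means the same loop could be pointed at a real data provider by swapping in a function that issues the HTTP request and parses the response.
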
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1313
OAI RequestsOAI Requests Identify--gtReturns general information Identify--gtReturns general information
about the particular OAI serverabout the particular OAI server ListMetadataFormats--gtreturns formats ListMetadataFormats--gtreturns formats
availableavailable ListSets--gtreturns list of sets availableListSets--gtreturns list of sets available ListIdentifiers--gtreturns identifiers onlyListIdentifiers--gtreturns identifiers only ListRecords--gtreturns record ids in a setListRecords--gtreturns record ids in a set GetRecord--gtreturns particular recordGetRecord--gtreturns particular record Try it out at the UIUC OIA Registry Try it out at the UIUC OIA Registry ((
httpgitagraingeruiuceduregistrysearchformasp))
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1414
Dates Used in OAI-PMHDates Used in OAI-PMH
Datestamps are used as values in requests Datestamps are used as values in requests to support selective harvesting by date to support selective harvesting by date (generally latest update date of the (generally latest update date of the metadata record)metadata record)
Datestamps are also used in record Datestamps are also used in record headers in responsesheaders in responses
Datestamps are particular to a repositoryDatestamps are particular to a repository Repeat OAI dates are about the Repeat OAI dates are about the metadatametadata
not the not the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1515
OAI-PMH Optional ContainersOAI-PMH Optional Containers
Repository levelRepository levelndash RightsRightsndash BrandingBranding
Record levelRecord levelndash AboutAbout
ProvenanceProvenanceRightsRights
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1616
About Container ExampleAbout Container Example
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1717
OAI Rights ExpressionsOAI Rights Expressions
Rights expressions are valid at three Rights expressions are valid at three levelslevelsndash RepositoryRepositoryndash SetSetndash RecordRecord
Rights expressed at the Repository Rights expressed at the Repository and Set levels are not a substitute for and Set levels are not a substitute for expressions at the Record Levelexpressions at the Record Level
OAI Best Practices (DLF amp OAI Best Practices (DLF amp NSDL)NSDL)
Guidelines for data providers and Guidelines for data providers and service providersservice providersndash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpindexphpMain_Page1048708mediawikioaibpindexphpMain_Page1048708 Best Practices for Shareable Best Practices for Shareable
MetadataMetadatandash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpPublicTOC1048708mediawikioaibpPublicTOC1048708
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1818
OAI In PracticeOAI In Practice
The UIUC OAI-PMH Data Provider RegistryThe UIUC OAI-PMH Data Provider Registryndash httpgitagraingeruiuceduregistrysearchformasp
Includes most known data providersIncludes most known data providers Link on home page to Service ProvidersLink on home page to Service Providers Provides multiple reports sample records Provides multiple reports sample records
browses search etcbrowses search etc Ex Show report from left hand menu Ex Show report from left hand menu
ldquoDistinct Metadata Schemasrdquo ldquoDistinct Metadata Schemasrdquo ndash httpgitagraingeruiuceduregistryListSchemasasp ndash Choose a schema look for providers and sample records Choose a schema look for providers and sample records
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1919
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes source citation and determines whether a user has access

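The link server's job described above can be sketched in a few lines of Python. This is a toy illustration, not how SFX or any real resolver works: the `Holding` class, the `HOLDINGS` table, the `"licensed"` user group, and the example.org addresses are all hypothetical stand-ins for an institution's knowledge base and user context.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    issn: str
    first_year: int
    last_year: int
    target_url: str


# Hypothetical holdings; a real link server consults a maintained knowledge base.
HOLDINGS = [
    Holding("1234-5678", 1990, 2005, "http://fulltext.example.org/journal-a"),
]


def resolve(citation, user_groups):
    """Return (label, url) service links for a citation in a given user context."""
    year = int(citation["date"])
    links = []
    for holding in HOLDINGS:
        covered = (holding.issn == citation["issn"]
                   and holding.first_year <= year <= holding.last_year)
        # Full text is offered only when the user context allows access.
        if covered and "licensed" in user_groups:
            links.append(("full text", holding.target_url))
    if not links:
        # Fallback service when no accessible copy is found.
        links.append(("interlibrary loan", "http://ill.example.org/request"))
    return links


citation = {"issn": "1234-5678", "date": "1998", "volume": "12",
            "issue": "2", "spage": "134"}
on_campus = resolve(citation, {"licensed"})
walk_in = resolve(citation, set())
```

The same citation yields different links for different users, which is the sense in which the link server "defines the user context".
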
Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl

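The example link above is a resolver address followed by citation metadata as key/value pairs. A minimal Python sketch of building and reading such a link follows; the helper names `build_openurl` and `parse_openurl` are made up for this illustration, and only the standard library is used.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://sfxserver.uni.edu/sfxmenu"  # resolver address from the slide


def build_openurl(base_url, citation):
    """Append citation metadata to the resolver address as a query string."""
    return base_url + "?" + urlencode(citation)


def parse_openurl(url):
    """Recover the citation metadata a resolver would receive."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


citation = {"issn": "1234-5678", "date": "1998", "volume": "12",
            "issue": "2", "spage": "134"}
url = build_openurl(BASE_URL, citation)
round_trip = parse_openurl(url)
```

Because the metadata travels in the link itself, the source database needs to know only the resolver's address, not where the full text lives.
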
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community: BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional “buddy system”

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”

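One rough way to put a number on the second statement is to count, for each expected element, the share of records that carry a non-empty value. The sketch below assumes records are held as dictionaries of element name to list of values; the `EXPECTED` list and the sample records are invented for illustration.

```python
from collections import Counter

# Hypothetical element set a community might expect in every record.
EXPECTED = ["title", "creator", "date", "language", "format"]


def completeness(records, expected=EXPECTED):
    """Return, per element, the fraction of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for element in expected:
            values = record.get(element, [])
            # An element present but blank does not count as populated.
            if any(value.strip() for value in values):
                counts[element] += 1
    total = len(records)
    if total == 0:
        return {}
    return {element: counts[element] / total for element in expected}


records = [
    {"title": ["A"], "creator": ["X"], "date": ["1998"],
     "language": ["en"], "format": ["text/html"]},
    {"title": ["B"], "creator": [""], "date": ["2001"]},
]
report = completeness(records)
```

A ratio like this says nothing about whether the values are correct; it measures only how fully the element set was applied.
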
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations and usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
    • Includes some formatting and color coding
– Disadvantages
    • Assumes consistency/predictability
    • Difficult to determine extent of problems found
    • Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
    • Better sorting and control by reviewer
– Disadvantages
    • Unwieldy for large files
    • Requires sustained focus from reviewer
    • Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
    • View of several data dimensions simultaneously
    • Reviewer controls data display
    • Tends to pull reviewer focus to anomalies
    • Handles fairly large files at one time while allowing subset views
    • Display manipulation possible without programmers
– Disadvantages
    • High cost of software
    • Requires translation into tab-delimited file

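Both the spreadsheet and the visual-analysis approaches require translating the metadata into a tab-delimited file. A minimal sketch of that step is shown below, assuming the records have already been parsed into dictionaries keyed by record identifier. The one-row-per-value layout, the column names, and the sample identifiers are assumptions, not a format required by Excel or Spotfire.

```python
import csv
import io


def to_tab_delimited(records):
    """Flatten {record_id: {element: [values]}} into one row per value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


records = {
    "oai:example.org:1": {"title": ["First item"], "date": ["1998"]},
    "oai:example.org:2": {"title": ["Second item"]},
}
tsv = to_tab_delimited(records)
```

A long layout like this keeps repeated elements on separate rows, so a reviewer can sort by element or by value without losing track of the record each value came from.
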
Element Names vs. Record IDs (Scatter Plot)
[Figure not reproduced]

Missing Elements (Scatter Plot)
[Figure not reproduced; callouts on the plot:]
– 2 records without language element
– format element present inconsistently
– Easy to rescale axis on the fly and scroll through records

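The gaps that the scatter plot makes visible can also be listed programmatically. The sketch below reports, per record, which required elements are absent or empty. The `required` list and the sample records are invented, and they merely mimic the situation in the callouts (two records without a language element, format present inconsistently).

```python
def missing_elements(records, required):
    """Map each record id to the required elements it lacks."""
    gaps = {}
    for record_id, elements in records.items():
        # An element counts as missing if absent or if its value list is empty.
        absent = [name for name in required if not elements.get(name)]
        if absent:
            gaps[record_id] = absent
    return gaps


records = {
    "rec1": {"title": ["A"], "language": ["en"], "format": ["text/html"]},
    "rec2": {"title": ["B"], "format": ["image/jpeg"]},
    "rec3": {"title": ["C"], "language": ["fr"]},
    "rec4": {"title": ["D"]},
}
gaps = missing_elements(records, ["title", "language", "format"])
without_language = [rid for rid, absent in gaps.items() if "language" in absent]
```
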
Table View
[Figure not reproduced; callouts on the table:]
– Non-empty “no information” values that may confuse end users
– Only DC Date elements are selected for display
– The only W3CDTF syntax present is four digits
– Sorted by element value

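A check like the one behind the table view can be approximated with a regular expression for the W3CDTF profile (year, year-month, full date, or date with time and time zone). The pattern below is a simplified sketch: it tests only the shape of the string, not whether months, days, or hours are in range, and the sample values are invented.

```python
import re

# Shape of W3CDTF values: YYYY, YYYY-MM, YYYY-MM-DD, or a full date followed by
# Thh:mm[:ss[.s]] and a time zone designator (Z or +hh:mm / -hh:mm).
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-\d{2}"
    r"(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?$"
)


def classify_dates(values):
    """Split date values into W3CDTF-shaped strings and everything else."""
    report = {"w3cdtf": [], "other": []}
    for value in values:
        key = "w3cdtf" if W3CDTF.match(value.strip()) else "other"
        report[key].append(value)
    return report


report = classify_dates(
    ["1998", "2001-05", "2003-11-02", "n.d.", "unknown", "c1987",
     "2004-01-15T09:30:00Z"]
)
```

Values that land in `other`, such as "n.d." or "unknown", are the non-empty "no information" values the callout warns about.
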
Improving Metadata Quality ...
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application profiles
– Training materials, tools, methodologies

... Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert metadata record content to another format, including
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks

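Element-to-element mapping can be recorded as a small lookup table and applied mechanically. The sketch below uses the two equivalences named in the Godby quotation (title to MARC 245, author/creator to MARC 100). The overflow tags 246 and 700 for repeated values, the subfield `a`, and the sample record are illustrative assumptions rather than a published crosswalk.

```python
# Illustrative crosswalk table: the first value of an element goes to the
# "first" tag, and any repeats go to the "repeat" tag (an assumed convention
# for handling target fields that cannot be repeated).
DC_TO_MARC = {
    "title": {"first": "245", "repeat": "246"},
    "creator": {"first": "100", "repeat": "700"},
}


def crosswalk(dc_record):
    """Return (MARC-like fields, list of source elements with no mapping)."""
    fields, dropped = [], []
    for element, values in dc_record.items():
        rule = DC_TO_MARC.get(element)
        if rule is None:
            # No equivalence recorded: this data would be lost in conversion.
            dropped.append(element)
            continue
        for position, value in enumerate(values):
            tag = rule["first"] if position == 0 else rule["repeat"]
            fields.append((tag, "a", value))
    return fields, dropped


dc_record = {
    "title": ["Metadata Basics"],
    "creator": ["Doe, Jane", "Roe, Richard"],
    "coverage": ["Ithaca"],
}
fields, dropped = crosswalk(dc_record)
```

Returning the unmapped elements alongside the converted fields makes data loss visible instead of silent.
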
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

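The loss of substructure described above can be seen in a small Python sketch that flattens a MODS titleInfo into a single DC string. The punctuation used to join the parts, the rule that only type="alternative" maps to the Alternative refinement (all other types collapse to plain title), and the sample record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MODS_NS = "{http://www.loc.gov/mods/v3}"


def mods_title_to_dc(title_info):
    """Flatten one MODS titleInfo element into a (dc_element, value) pair."""
    def text(tag):
        node = title_info.find(MODS_NS + tag)
        return node.text.strip() if node is not None and node.text else ""

    # Keep the initial article (nonSort) in front of the title.
    non_sort, title = text("nonSort"), text("title")
    full = f"{non_sort} {title}".strip() if non_sort else title
    parts = [part for part in (text("partNumber"), text("partName")) if part]
    if parts:
        full = full + ". " + ", ".join(parts)  # joining punctuation is assumed
    is_alternative = title_info.get("type") == "alternative"
    return ("alternative" if is_alternative else "title"), full


SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Annual Report</title>
    <partNumber>Part 2</partNumber>
    <partName>Finances</partName>
  </titleInfo>
  <titleInfo type="alternative"><title>Yearly Report</title></titleInfo>
</mods>"""

root = ET.fromstring(SAMPLE)
dc_titles = [mods_title_to_dc(ti) for ti in root.findall(MODS_NS + "titleInfo")]
```

The conversion is one-way: once the parts are joined into one string, the MODS subelements cannot be reliably recovered from the DC value.
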
Exercise
Evaluate a small set of human- and machine-created metadata

Other important definitions
Archive: not the same as ‘archive’ as used in libraries; more like “repository”
Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store

Inside OAI Repositories
repository - A repository is a network-accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
resource - A resource is the object or “stuff” that metadata is “about”, whether physical or digital, stored in the repository or a constituent of another database
item - An item is a constituent of a repository from which metadata about a resource can be disseminated
record - A record is metadata in a specific metadata format

OAI Goals
Low barrier to participation
– Server software available in many programming languages, intended to be easy to install
– Server-less implementation available now via “Static repository” (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control

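Flow control in OAI-PMH works through a resumptionToken: a repository may return a partial list plus a token, and the harvester sends the token back to get the next batch. The Python sketch below shows only that follow-up step. The repository address, the sample responses, and the token value are invented, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def next_request(base_url, response_xml):
    """Return the follow-up request URL, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    # A missing or empty token means there is nothing further to fetch.
    if token is None or not (token.text or "").strip():
        return None
    verb = root.find(f"{OAI_NS}request").get("verb")
    # With a resumptionToken, the verb and the token are the only arguments.
    query = urlencode({"verb": verb, "resumptionToken": token.text.strip()})
    return base_url + "?" + query


PARTIAL = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-01T19:20:30Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://repository.example.org/oai</request>
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier><datestamp>2002-05-01</datestamp></header>
    <resumptionToken>batch-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

FINAL = PARTIAL.replace("<resumptionToken>batch-2</resumptionToken>",
                        "<resumptionToken/>")

follow_up = next_request("http://repository.example.org/oai", PARTIAL)
done = next_request("http://repository.example.org/oai", FINAL)
```
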
OAI Requests
Identify -> returns general information about the particular OAI server
ListMetadataFormats -> returns formats available
ListSets -> returns list of sets available
ListIdentifiers -> returns identifiers only
ListRecords -> returns the records (headers and metadata), optionally limited to a set
GetRecord -> returns particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)

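Each of the six requests is an ordinary HTTP request with a `verb` argument plus a few others. The sketch below builds such request URLs in Python and checks the arguments each verb requires on a first request. The function name `oai_request`, the repository address, and the sample identifier are hypothetical, and the check ignores the resumptionToken case.

```python
from urllib.parse import urlencode

# Arguments each verb requires on an initial request.
REQUIRED_ARGS = {
    "Identify": (),
    "ListMetadataFormats": (),
    "ListSets": (),
    "ListIdentifiers": ("metadataPrefix",),
    "ListRecords": ("metadataPrefix",),
    "GetRecord": ("identifier", "metadataPrefix"),
}


def oai_request(base_url, verb, **args):
    """Build an OAI-PMH request URL, rejecting unknown verbs and missing arguments."""
    if verb not in REQUIRED_ARGS:
        raise ValueError(f"unknown verb: {verb}")
    missing = [name for name in REQUIRED_ARGS[verb] if name not in args]
    if missing:
        raise ValueError(f"{verb} requires: {', '.join(missing)}")
    return base_url + "?" + urlencode({"verb": verb, **args})


BASE = "http://repository.example.org/oai"  # hypothetical repository address
identify_url = oai_request(BASE, "Identify")
record_url = oai_request(BASE, "GetRecord",
                         identifier="oai:example.org:1",
                         metadataPrefix="oai_dc")
```
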
Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources

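Selective harvesting by date can be sketched as two steps: read the datestamps from the record headers of the last harvest, then ask only for records changed since the latest one. In the Python sketch below the sample response and repository address are invented, and the string comparison assumes all datestamps share one granularity (here, day).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def latest_datestamp(response_xml):
    """Return the most recent header datestamp in a response, or None."""
    root = ET.fromstring(response_xml)
    stamps = [node.text for node in root.iter(OAI_NS + "datestamp")]
    # Same-granularity UTC datestamps sort correctly as plain strings.
    return max(stamps) if stamps else None


def incremental_request(base_url, since, metadata_prefix="oai_dc"):
    """Build a ListRecords request for metadata changed on or after `since`."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix,
                       "from": since})
    return base_url + "?" + query


RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier><datestamp>2002-05-01</datestamp></header>
    <header><identifier>oai:example.org:2</identifier><datestamp>2002-05-03</datestamp></header>
  </ListIdentifiers>
</OAI-PMH>"""

since = latest_datestamp(RESPONSE)
url = incremental_request("http://repository.example.org/oai", since)
```

Because `from` is inclusive, records stamped exactly at `since` come back again, so a harvester should expect and tolerate duplicates.
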
OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
    • Provenance
    • Rights

About Container Example
[Figure not reproduced]

OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu “Distinct Metadata Schemas”
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records

Metadata Standards amp ApplicationsMetadata Standards amp Applications 1919
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
ldquoldquoThe OpenURL standard enables a user who has The OpenURL standard enables a user who has retrieved an article citation for example to obtain retrieved an article citation for example to obtain immediate access to the most appropriate copy of immediate access to the most appropriate copy of that object through the implementation of extended that object through the implementation of extended linking services The selection of the best copy is linking services The selection of the best copy is based on user and organizational preferences based on user and organizational preferences regarding the location of the copy its cost and regarding the location of the copy its cost and agreements with information suppliers and similar agreements with information suppliers and similar considerations This selection occurs without the considerations This selection occurs without the knowledge of the user it is made possible by the knowledge of the user it is made possible by the transport of metadata with the OpenURL link from transport of metadata with the OpenURL link from the source citation to a resolver (the link server) the source citation to a resolver (the link server) which stores the preference information and the which stores the preference information and the links to the appropriate materialrdquolinks to the appropriate materialrdquo
--OpenURL Overview SFX website--OpenURL Overview SFX website
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL CharacteristicsOpenURL Characteristics
Protocol operates between an Protocol operates between an information resource and a service information resource and a service componentcomponent
Service component is called a ldquolink Service component is called a ldquolink serverrdquo or ldquolink resolverrdquoserverrdquo or ldquolink resolverrdquo
Link server defines the user contextLink server defines the user context Takes source citation and determines Takes source citation and determines
whether a user has accesswhether a user has access
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2323
Distinguishing UsersDistinguishing Users
Uses information stored in a cookie Uses information stored in a cookie (the CookiePusher mechanism)(the CookiePusher mechanism)
Uses information contained in a Uses information contained in a digital certificate such as the one digital certificate such as the one proposed by the DLF digital proposed by the DLF digital certificates prototype projectcertificates prototype project
Identifies a users IP addressIdentifies a users IP address Obtains user attributes via the Obtains user attributes via the
Shibboleth frameworkShibboleth framework
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo

How Does Quality Happen?
Lessons from the library community
  – Quality is quantifiable and measurable
  – To be effective, enforcement of standards of quality must take place at the community level
Furthermore
  – Data problems are not unique to particular communities
  – General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"
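
One way to put a number on the second statement is to count, for each element, the share of records that carry a non-empty value. The sketch below assumes records have already been parsed into dictionaries of element name to list of values; the function name and the sample records are invented for illustration.

```python
from collections import Counter

def element_completeness(records, elements):
    """Return, for each element, the share of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for element in elements:
            values = record.get(element, [])
            # An element that is present but blank does not count as complete.
            if any(v.strip() for v in values):
                counts[element] += 1
    total = len(records)
    return {e: (counts[e] / total if total else 0.0) for e in elements}

# Invented sample records (element name -> list of values).
records = [
    {"title": ["Map of Ithaca"], "language": ["en"], "format": ["image/jpeg"]},
    {"title": ["Field notes"], "language": [""]},
    {"title": ["Oral history"], "format": ["audio/mpeg"]},
    {"title": ["Letters"], "language": ["fr"]},
]
report = element_completeness(records, ["title", "language", "format"])
print(report)
```

A report like this only says whether a value exists, not whether it is accurate, so it complements rather than replaces the other criteria.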

Accuracy
Information provided in values should be correct and factual
Editing applied to
  – Eliminate typos
  – Ensure conforming name expressions
  – Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
  – Target object changes but metadata does not
Lag
  – Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
  – Librarians: once and it's done
  – Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
  – Metadata as "premium" or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
  – Advantages
      Includes some formatting and color coding
  – Disadvantages
      Assumes consistency/predictability
      Difficult to determine extent of problems found
      Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
  – Advantages
      Better sorting and control by reviewer
  – Disadvantages
      Unwieldy for large files
      Requires sustained focus from reviewer
      Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
  – Advantages
      View of several data dimensions simultaneously
      Reviewer controls data display
      Tends to pull reviewer focus to anomalies
      Handles fairly large files at one time while allowing subset views
      Display manipulation possible without programmers
  – Disadvantages
      High cost of software
      Requires translation into tab-delimited file

Element Names vs. Record IDs (Scatter Plot)

Missing Elements (Scatter Plot)
2 records without language element
Format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
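
The two observations in this table view (placeholder values and the date syntax actually in use) can also be checked programmatically. The sketch below classifies date values with a regular expression that follows the W3CDTF patterns (year, year-month, full date, and date with time and zone). It checks syntax only, not whether the month or day is in range, and the placeholder list is an assumption rather than anything taken from the data shown on the slide.

```python
import re

# Syntax-only pattern for W3CDTF: YYYY, YYYY-MM, YYYY-MM-DD,
# or a full date followed by Thh:mm[:ss[.s]] and a time zone designator.
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-\d{2}"
    r"(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?$"
)

# Assumed examples of "no information" values; adjust to the collection at hand.
PLACEHOLDERS = {"n/a", "unknown", "none", "no date", "n.d."}

def classify_date(value):
    """Label a date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"

for sample in ["1998", "1998-03-15", "March 1998", "unknown"]:
    print(sample, "->", classify_date(sample))
```

Counting the labels across a harvested set gives a quick picture of how much of the date data downstream services could actually sort or filter on.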

Improving Metadata Quality ...
Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application Profiles
  – Training materials, tools, methodologies

... Over Time
Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the "general good"
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including
  – Element to element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks
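
The element-to-element step can be illustrated with a lookup table. The sketch below records only the two equivalences named in the Godby quotation (title to MARC 245, author to MARC 100), treats the quotation's "author" as the Dublin Core creator element as an assumption, and reports anything it cannot map rather than silently dropping it. Hierarchy resolution and content conversion are deliberately left out.

```python
# Illustrative crosswalk table: Dublin Core element -> MARC field tag.
# Only the two equivalences mentioned in the quotation are included.
DC_TO_MARC = {"title": "245", "creator": "100"}

def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Split a DC record into mapped MARC fields and unmapped leftovers."""
    marc, unmapped = {}, {}
    for element, values in dc_record.items():
        tag = mapping.get(element)
        target = marc if tag else unmapped
        target.setdefault(tag or element, []).extend(values)
    return marc, unmapped

# Invented sample record.
marc, unmapped = crosswalk(
    {"title": ["Field notes"], "creator": ["Hicks, S."], "coverage": ["Ithaca"]}
)
print(marc)
print(unmapped)
```

Keeping the table as data rather than burying it in code is one way to address the quotation's concern that mapping knowledge is often recorded in forms that are not machine-processable.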

Available Crosswalks
Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards make sharing sometimes problematic
  – Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
Includes attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
Other attributes
  – ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
  – DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
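
The differences listed above can be seen in a small conversion. The sketch below flattens a MODS titleInfo element into a single string by putting the nonSort article back in front of the title and appending the part subelements. The joining punctuation, the sample record, and the rule that any typed titleInfo becomes the Alternative refinement are all illustrative assumptions, not a published mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample titleInfo element.
SAMPLE = """<mods:titleInfo xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:nonSort>The </mods:nonSort>
  <mods:title>Olympics</mods:title>
  <mods:partNumber>Part 2</mods:partNumber>
  <mods:partName>A history</mods:partName>
</mods:titleInfo>"""

def mods_title_to_dc(title_info):
    """Flatten one MODS titleInfo element into (DC element name, string)."""
    def text(name):
        node = title_info.find(f"mods:{name}", MODS_NS)
        return node.text if node is not None and node.text else ""

    # Keep the initial article by rejoining nonSort and title.
    main = (text("nonSort") + text("title")).strip()
    parts = [p.strip() for p in (text("partNumber"), text("partName")) if p.strip()]
    # Assumption: a titleInfo with a type attribute maps to the refinement.
    dc_element = "alternative" if title_info.get("type") else "title"
    return dc_element, ". ".join([main] + parts)

result = mods_title_to_dc(ET.fromstring(SAMPLE))
print(result)
```

The conversion is one-way: once the parts are joined, the target string no longer says which words were the article, the part number, or the part name, which is the loss of granularity the preceding slides warn about.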

Exercise
Evaluate a small set of human and machine-created metadata

Other important definitions
Archive: Not the same as 'archive' used in libraries; more like "repository"
Protocol: a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are other examples of Internet protocols
Harvesting: the gathering together of metadata from a number of distributed repositories into a combined data store

Inside OAI Repositories
repository - A repository is a network accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters
resource - A resource is the object or "stuff" that metadata is "about", whether physical or digital, stored in the repository or a constituent of another database
item - An item is a constituent of a repository from which metadata about a resource can be disseminated
record - A record is metadata in a specific metadata format

OAI Goals
Low barrier to participation
  – Server software available in many programming languages; intended to be easy to install
  – Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML -- Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control

OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns formats available
ListSets --> returns list of sets available
ListIdentifiers --> returns identifiers only
ListRecords --> returns records, optionally limited to a set
GetRecord --> returns particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)
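
Each of the six requests is an ordinary HTTP request in which the `verb` argument names the operation. The sketch below only builds the request URLs; the base URL, set name, and identifier are hypothetical, while the argument names (`metadataPrefix`, `set`, `identifier`) are those defined by OAI-PMH.

```python
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical repository

def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update({k: v for k, v in arguments.items() if v is not None})
    return BASE_URL + "?" + urlencode(params)

identify_url = oai_request("Identify")
list_url = oai_request("ListRecords", metadataPrefix="oai_dc", set="maps")
get_url = oai_request(
    "GetRecord", identifier="oai:example.org:123", metadataPrefix="oai_dc"
)
for url in (identify_url, list_url, get_url):
    print(url)
```

Pasting a URL of this shape for a real repository into a browser returns the XML response directly, which is a convenient way to explore a data provider before writing a harvester.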

Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources
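
To show where datestamps sit in a response, the sketch below reads the record headers from a heavily simplified ListIdentifiers response (a real response also carries a response date and an echo of the request) and keeps the records changed on or after the last harvest. The identifiers and dates are invented. In practice a harvester would normally pass the date in the request's `from` argument and let the repository do the filtering.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Simplified, invented response body for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2007-01-15</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2007-03-02</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

def changed_since(response_xml, last_harvest):
    """Return (identifier, datestamp) pairs stamped on or after last_harvest."""
    root = ET.fromstring(response_xml)
    changed = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        # Day-granularity datestamps in YYYY-MM-DD form compare correctly as strings.
        if datestamp >= last_harvest:
            changed.append((identifier, datestamp))
    return changed

changed = changed_since(SAMPLE_RESPONSE, "2007-02-01")
print(changed)
```

Note that the datestamp here records when the metadata record last changed in this repository; it says nothing about when the described resource was created or modified.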

OAI-PMH Optional Containers
Repository level
  – Rights
  – Branding
Record level
  – About
      Provenance
      Rights

About Container Example

OAI Rights Expressions
Rights expressions are valid at three levels
  – Repository
  – Set
  – Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record Level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
  – http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left hand menu "Distinct Metadata Schemas"
  – http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  – Choose a schema, look for providers and sample records

What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access
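
The resolver's side of the exchange can be pictured as a lookup: parse the citation out of the incoming link, compare it with what the institution holds, and offer a matching service. The sketch below is a toy under stated assumptions; the holdings table, the target URL, and the fallback service are invented, and a real link server weighs far more context (user, licenses, preferences) than this.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical holdings table: ISSN -> (first licensed year, full-text URL).
HOLDINGS = {"1234-5678": (1995, "http://fulltext.example.org/journal")}

def resolve(openurl):
    """Pick a service for the citation carried in an OpenURL query string."""
    query = parse_qs(urlsplit(openurl).query)
    issn = query.get("issn", [""])[0]
    year = query.get("date", [""])[0][:4]
    held = HOLDINGS.get(issn)
    # Offer full text only when the cited year falls within the licensed range.
    if held and year.isdigit() and int(year) >= held[0]:
        return {"service": "full text", "target": held[1]}
    return {"service": "interlibrary loan", "target": None}

answer = resolve(
    "http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134"
)
print(answer)
```

Because the decision is made at the resolver, two users following the same citation link at different institutions can be sent to different copies.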

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full-text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
Timeliness
• Currency
  – Target object changes but metadata does not
• Lag
  – Target object disseminated before some or all metadata is available
• "Metadata aging" is affected by cultural differences between librarians and technologists
  – Librarians: once and it's done
  – Technologists: metadata as an iterative process

Accessibility
• Barriers to accessibility may be economic, technical, or organizational
  – Metadata as "premium" or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
• Random sampling (XMLSpy)
  – Advantages
    • Includes some formatting and color coding
  – Disadvantages
    • Assumes consistency/predictability
    • Difficult to determine extent of problems found
    • Tedious at best

Evaluating Metadata (2)
• Spreadsheets (Microsoft Excel)
  – Advantages
    • Better sorting and control by reviewer
  – Disadvantages
    • Unwieldy for large files
    • Requires sustained focus from reviewer
    • Requires translation into tab-delimited file

Evaluating Metadata (3)
• Visual Graphical Analysis (Spotfire)
  – Advantages
    • View of several data dimensions simultaneously
    • Reviewer controls data display
    • Tends to pull reviewer focus to anomalies
    • Handles fairly large files at one time while allowing subset views
    • Display manipulation possible without programmers
  – Disadvantages
    • High cost of software
    • Requires translation into tab-delimited file
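Both the spreadsheet and the visual-analysis approaches need the metadata translated into a tab-delimited file first. The sketch below shows one way that step might look in Python. The input structure (record ID mapped to element names and lists of values), the column names, and the function name records_to_tsv are assumptions made for illustration; they do not come from any of the tools named on the slides.

```python
import csv
import io


def records_to_tsv(records):
    """Flatten {record_id: {element: [values]}} into tab-delimited text.

    One output row per element value, so repeated elements stay visible.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            for value in values:
                writer.writerow([record_id, element, value])
    return out.getvalue()


# Hypothetical sample records, for illustration only.
sample = {
    "rec1": {"title": ["A"], "date": ["1998"]},
    "rec2": {"title": ["B"]},
}
print(records_to_tsv(sample))
```

A long layout with one row per value is chosen here because it can be sorted by element or by record without reshaping.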
Element Names vs Record Ids (Scatter Plot)
[Figure: scatter plot of element names against record IDs]

Missing Elements (Scatter Plot)
[Figure: scatter plot with callouts]
  – 2 records without language element
  – format element present inconsistently
  – Easy to rescale axis on the fly and scroll through records
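The gaps that the scatter plot makes visible (records without a language element, a format element present only some of the time) can also be listed programmatically. This is a minimal sketch, assuming records are held as a mapping of record ID to element names and value lists; the function name and the sample data are hypothetical.

```python
def missing_elements(records, expected):
    """Return {element: [record ids lacking it]} for each expected element
    that is absent or empty in at least one record."""
    report = {}
    for record_id, elements in records.items():
        for element in expected:
            if not elements.get(element):
                report.setdefault(element, []).append(record_id)
    return report


# Hypothetical sample records, for illustration only.
sample = {
    "r1": {"title": ["A"], "language": ["en"], "format": ["text/html"]},
    "r2": {"title": ["B"], "format": ["image/jpeg"]},
    "r3": {"title": ["C"]},
}
print(missing_elements(sample, ["title", "language", "format"]))
```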
Table View
[Figure: table view with callouts]
  – Non-empty "no information" values that may confuse end users
  – Only DC Date elements are selected for display
  – The only W3CDTF syntax present is four digits
  – Sorted by element value
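The two date problems called out in the table view (placeholder values and inconsistent W3CDTF use) lend themselves to a simple automated check. The sketch below tests syntax only, not whether a month or day is in range. The list of "no information" strings is a hypothetical example; any real collection would need its own list.

```python
import re

# W3CDTF syntax: YYYY, YYYY-MM, YYYY-MM-DD, or a date-time with a time zone
# designator (Z or +hh:mm / -hh:mm). Value ranges are not checked.
W3CDTF = re.compile(
    r"\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?"
)

# Hypothetical placeholder strings, assumed for illustration.
NO_INFORMATION = {"n.d.", "unknown", "none", "n/a"}


def classify_date(value):
    """Label a date value as 'no-information', 'w3cdtf', or 'other'."""
    text = value.strip()
    if text.lower() in NO_INFORMATION:
        return "no-information"
    if W3CDTF.fullmatch(text):
        return "w3cdtf"
    return "other"


for value in ["1998", "2001-05-17", "n.d.", "May 1998"]:
    print(value, "->", classify_date(value))
```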
Improving Metadata Quality ...
• Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application Profiles
  – Training materials, tools, methodologies

... Over Time
• Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the "general good"
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation
Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts--Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on--but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, Two Paths to Interoperable Metadata
Crosswalks
• In general: semantic mapping of elements between source and target metadata standards
• A metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including:
  – Element-to-element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks
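Element-to-element mapping can be pictured as a lookup table applied to each record. The sketch below is illustrative only: the two equivalences come from the Godby quotation (title to MARC 245, author to MARC 100), the Dublin Core element name "creator" stands in for "author", and the function name and sample record are hypothetical. Elements with no entry in the table are returned separately, since such values would otherwise be dropped in conversion.

```python
# Only these two equivalences are taken from the quoted text; "creator" is
# used as the Dublin Core element name for what the quote calls "author".
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk(record, mapping):
    """Apply an element-to-element mapping.

    Returns (converted, unmapped): converted is keyed by target field,
    unmapped keeps source elements that have no equivalent in the mapping.
    """
    converted = {}
    unmapped = {}
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.setdefault(element, []).extend(values)
        else:
            converted.setdefault(target, []).extend(values)
    return converted, unmapped


# Hypothetical sample record, for illustration only.
record = {
    "title": ["Moby Dick"],
    "creator": ["Melville, Herman"],
    "rights": ["Public domain"],
}
converted, unmapped = crosswalk(record, DC_TO_MARC)
print(converted)
print(unmapped)
```

A real crosswalk also has to handle hierarchy and content conversions, which a flat table like this one does not attempt.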
Available Crosswalks
• Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
• MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
• Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
• Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards sometimes make sharing problematic
  – Properties may vary (e.g., repeatability)
• Converting everything may not always be the best solution
Example: Mapping MODS:title to DC:title
• Includes attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
• Other attributes
  – ID, authority, displayLabel, xLink
• Subelements: title, partName, partNumber, nonSort

Mapping MODS:title to DC:title
• DC has one element refinement: Alternative
  – DC title has no substructure; MODS allows for subelements for partNumber, partName
• Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
• MODS can link to a title in an authority file if desired
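The loss of substructure when going from a MODS title to a DC title can be made concrete with a small sketch. Here a MODS title is assumed to have already been parsed into a Python dictionary with keys named after the subelements listed above. The function name, the ". " separator, and the rule that any typed title becomes the DC "alternative" refinement are illustrative assumptions, not a published mapping.

```python
def mods_title_to_dc(title_info):
    """Flatten a parsed MODS title into (dc_element, single string).

    nonSort (the initial article) is put back in front of the title, and
    partNumber / partName are appended, because DC title has no substructure.
    """
    main = title_info.get("nonSort", "") + title_info["title"]
    parts = [main]
    for key in ("partNumber", "partName"):
        if title_info.get(key):
            parts.append(title_info[key])
    # Assumption: a title carrying a type attribute is treated as alternative.
    element = "alternative" if title_info.get("type") else "title"
    return element, ". ".join(parts)


# Hypothetical parsed title, for illustration only.
example = {
    "nonSort": "The ",
    "title": "Lord of the Rings",
    "partNumber": "Part 1",
    "partName": "The Fellowship of the Ring",
}
print(mods_title_to_dc(example))
```

The conversion is one-way: once the parts are joined into a single string, the original subelements cannot be recovered reliably.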
Exercise
• Evaluate a small set of human- and machine-created metadata
Inside OAI Repositories
• repository - A repository is a network-accessible server that can process requests. A repository is managed by a data provider to expose metadata to harvesters.
• resource - A resource is the object or "stuff" that metadata is "about", whether physical or digital, stored in the repository or a constituent of another database.
• item - An item is a constituent of a repository from which metadata about a resource can be disseminated.
• record - A record is metadata in a specific metadata format.
OAI Goals
• Low barrier to participation
  – Server software available in many programming languages, intended to be easy to install
  – Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
• Limited set of commands
• Predictable responses and flows of data
Other OAI Info
• Responses are encoded in XML syntax
• OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
• Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
• Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
• OAI-PMH supports flow control
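Flow control in OAI-PMH 2.0 works through a resumptionToken element: a repository may return a partial list plus a token, and the harvester sends the token back to get the next batch. The sketch below shows how a harvester might decide on its next request. The base URL, the sample response, and the function name are hypothetical, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def next_request(base_url, response_xml, verb="ListRecords"):
    """Return the URL for the next batch, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty token means there is nothing more to fetch.
    if token is None or not (token.text or "").strip():
        return None
    query = urlencode({"verb": verb, "resumptionToken": token.text.strip()})
    return base_url + "?" + query


# Hypothetical partial response, trimmed to the part that matters here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>batch-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(next_request("http://example.org/oai", SAMPLE))
```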
OAI Requests
• Identify --> returns general information about the particular OAI server
• ListMetadataFormats --> returns formats available
• ListSets --> returns list of sets available
• ListIdentifiers --> returns identifiers only
• ListRecords --> returns the records in a set
• GetRecord --> returns particular record
• Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)
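Each of the six requests is an ordinary HTTP request with a verb parameter and a small set of permitted arguments. The sketch below builds request URLs and rejects arguments a verb does not accept. The argument lists follow OAI-PMH 2.0; the base URL, identifier, and function name are hypothetical, and the check is looser than the specification (it does not enforce required or exclusive arguments).

```python
from urllib.parse import urlencode

# Arguments each verb accepts in OAI-PMH 2.0.
ALLOWED_ARGS = {
    "Identify": set(),
    "ListMetadataFormats": {"identifier"},
    "ListSets": {"resumptionToken"},
    "ListIdentifiers": {"from", "until", "metadataPrefix", "set", "resumptionToken"},
    "ListRecords": {"from", "until", "metadataPrefix", "set", "resumptionToken"},
    "GetRecord": {"identifier", "metadataPrefix"},
}


def build_request(base_url, verb, args=None):
    """Build an OAI-PMH request URL, refusing unknown verbs or arguments."""
    args = args or {}
    if verb not in ALLOWED_ARGS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    unexpected = sorted(set(args) - ALLOWED_ARGS[verb])
    if unexpected:
        raise ValueError(f"{verb} does not accept: {', '.join(unexpected)}")
    return base_url + "?" + urlencode({"verb": verb, **args})


# Hypothetical repository and identifier, for illustration only.
print(build_request("http://example.org/oai", "Identify"))
print(build_request(
    "http://example.org/oai",
    "GetRecord",
    {"identifier": "oai:example.org:1", "metadataPrefix": "oai_dc"},
))
```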
Dates Used in OAI-PMH
• Datestamps are used as values in requests to support selective harvesting by date (generally the latest update date of the metadata record)
• Datestamps are also used in record headers in responses
• Datestamps are particular to a repository
• Repeat: OAI dates are about the metadata, not the resources
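A harvester can read the datestamps from record headers and keep the most recent one as the starting point for its next selective harvest. The sketch below parses a hypothetical, trimmed ListIdentifiers response. It assumes all datestamps from the repository use the same granularity, so that comparing them as strings orders them correctly.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def header_datestamps(response_xml):
    """Return [(identifier, datestamp)] for every record header found."""
    root = ET.fromstring(response_xml)
    return [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


# Hypothetical response, trimmed to the headers.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2007-11-02</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2008-01-15</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

pairs = header_datestamps(SAMPLE)
# Latest metadata change seen; a candidate "from" value for the next harvest.
latest = max(stamp for _, stamp in pairs)
print(pairs)
print(latest)
```

These dates describe when the metadata record changed in this repository, not when the described resource was created or modified.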
OAI-PMH Optional Containers
• Repository level
  – Rights
  – Branding
• Record level
  – About
    • Provenance
    • Rights

About Container Example
[Figure not recovered]
OAI Rights Expressions
• Rights expressions are valid at three levels
  – Repository
  – Set
  – Record
• Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
• Guidelines for data providers and service providers
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
• Best Practices for Shareable Metadata
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
OAI In Practice
• The UIUC OAI-PMH Data Provider Registry
  – http://gita.grainger.uiuc.edu/registry/searchform.asp
• Includes most known data providers
• Link on home page to Service Providers
• Provides multiple reports, sample records, browses, search, etc.
• Ex.: Show report from left-hand menu "Distinct Metadata Schemas"
  – http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  – Choose a schema, look for providers and sample records
What's an OpenURL?
• The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
• Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website
OpenURL Characteristics
• Protocol operates between an information resource and a service component
• Service component is called a "link server" or "link resolver"
• Link server defines the user context
• Takes source citation and determines whether a user has access

Distinguishing Users
• Uses information stored in a cookie (the CookiePusher mechanism)
• Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
• Identifies a user's IP address
• Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
• From a record in an abstracting and indexing (A&I) database to the full text described by the record
• From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
• From a reference in a journal article to a record matching that reference in an A&I database
• From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
• http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
• An OpenURL demo
  – http://www.ukoln.ac.uk/distributed-systems/openurl/
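An OpenURL of this kind is a resolver address followed by citation metadata carried as query parameters. The sketch below separates the two using Python's standard library. The URL is the slide's example with its stripped punctuation restored, which is an assumption about the original; the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_openurl(url):
    """Split an OpenURL into (resolver address, citation metadata dict)."""
    parts = urlsplit(url)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns lists; keep the first value of each key for brevity.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


# The slide's example, with punctuation restored (assumed).
EXAMPLE = (
    "http://sfxserver.uni.edu/sfxmenu"
    "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134"
)

resolver, citation = parse_openurl(EXAMPLE)
print(resolver)
print(citation)
```

The link server receives exactly this metadata and decides, from its stored preferences, which copy or service to offer.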
Defining and Ensuring Metadata Quality
• What constitutes quality?
• Techniques for evaluating and enforcing consistency and predictability
• Automated metadata creation: advantages and disadvantages
• Metadata maintenance strategies

Beginning to Define Quality
• Experience of the library community--BIBCO & NACO
  – Agreed-upon standards for library quality
  – Training and documentation in support of practitioners
  – Review and enforcement of standards by means of institutional "buddy system"
How Does Quality Happen?
• Lessons from the library community
  – Quality is quantifiable and measurable
  – To be effective, enforcement of standards of quality must take place at the community level
• Furthermore
  – Data problems are not unique to particular communities; general strategies can improve interoperability

Quality Measurement Criteria
• Completeness
• Accuracy
• Provenance
• Conformance to expectations
• Logical consistency and coherence
• Timeliness (Currency and Lag)
• Accessibility
Completeness
• "Metadata should describe the target objects as completely as economically feasible"
• "Element set should be applied to the target object population as completely as possible"
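One rough way to put a number on completeness is the share of expected elements that a record actually populates. The sketch below is only one possible measure, assuming a record is a mapping of element names to value lists; the expected element list, the sample record, and the function name are hypothetical, and the score says nothing about whether the values are accurate.

```python
def record_completeness(record, expected):
    """Fraction of expected elements that have at least one value."""
    if not expected:
        raise ValueError("expected element list must not be empty")
    filled = sum(1 for element in expected if record.get(element))
    return filled / len(expected)


# Hypothetical expected elements and record, for illustration only.
EXPECTED = ["title", "creator", "date", "language"]
record = {"title": ["A"], "creator": ["B"], "date": [], "language": ["en"]}
print(record_completeness(record, EXPECTED))
```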
Accuracy
• Information provided in values should be correct and factual
• Editing applied to
  – Eliminate typos
  – Ensure conforming name expressions
  – Ensure standard abbreviations and usages in general
Provenance
• Who prepared the metadata? What do we know about the preparer?
• What methods were used to create the metadata? Is it human created or created by machine?
• What transformations have been applied since creation?
• Where has it been before?
Conformance to Expectations
• Contains elements a community would expect to find
• Controlled vocabularies are well chosen and explicitly exposed to downstream users
• Metadata is reflective of community thinking about necessary compromises
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The metadata conversion specification covers the transformations required to convert the content of a metadata record to another format, including:
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
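
A minimal sketch of element-to-element mapping, assuming a crosswalk can be held as a plain lookup table. Only the title/245 and author/100 pairs come from the Godby quotation; the table name, the function, the use of the DC element name "creator" for author, and the sample record are illustrative assumptions. Unmapped elements are returned separately so that anything the crosswalk cannot carry stays visible.

```python
# Hypothetical, deliberately tiny crosswalk: Dublin Core element -> MARC tag.
# Only title -> 245 and author (DC "creator") -> 100 come from the slides.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk(dc_record):
    """Map a simple DC record (element -> list of values) onto MARC tags.

    Returns (marc_fields, unmapped) so that data the crosswalk cannot
    carry is reported rather than silently dropped.
    """
    marc_fields = {}
    unmapped = {}
    for element, values in dc_record.items():
        tag = DC_TO_MARC.get(element)
        if tag is None:
            unmapped[element] = list(values)
        else:
            marc_fields.setdefault(tag, []).extend(values)
    return marc_fields, unmapped


# Made-up sample record, for illustration only.
record = {
    "title": ["Crosswalks in practice"],
    "creator": ["Doe, Jane"],
    "coverage": ["Ithaca, N.Y."],
}
fields, leftovers = crosswalk(record)
print(fields)
print(leftovers)
```

A real crosswalk would also need the hierarchy resolution and content conversions listed above; a flat table covers only the first bullet.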

Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
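
The flattening described above can be sketched as follows, assuming the MODS title information sits in `titleInfo` elements in the MODS version 3 namespace. The sample record, the function names, the ". " separator between parts, and the choice to send every title type other than "alternative" to plain DC title are illustrative assumptions, not a published mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Made-up sample record, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Collected Letters</title>
    <partNumber>Vol. 2</partNumber>
    <partName>Later Years</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Letters</title>
  </titleInfo>
</mods>"""


def _text(parent, name):
    """Return the text of a MODS child element, or '' if it is absent."""
    node = parent.find("mods:" + name, MODS_NS)
    return node.text if node is not None and node.text else ""


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into a (dc_element, string) pair."""
    results = []
    root = ET.fromstring(mods_xml)
    for info in root.findall("mods:titleInfo", MODS_NS):
        # Keep the initial article: nonSort is prepended unchanged.
        title = _text(info, "nonSort") + _text(info, "title")
        parts = [p for p in (_text(info, "partNumber"), _text(info, "partName")) if p]
        flat = ". ".join([title] + parts)  # separator is an assumption
        # DC offers only the "alternative" refinement; other types collapse to title.
        element = "alternative" if info.get("type") == "alternative" else "title"
        results.append((element, flat))
    return results


dc_titles = mods_titles_to_dc(SAMPLE)
print(dc_titles)
```

The substructure (part number, part name, non-sorting article) survives only as punctuation inside one string, which is the granularity loss the slide points to.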

Exercise
Evaluate a small set of human- and machine-created metadata

OAI Goals
Low barrier to participation
– Server software available in many programming languages, intended to be easy to install
– Server-less implementation available now via "Static repository" (essentially a web page that looks like an OAI response and can be harvested as such)
Limited set of commands
Predictable responses and flows of data

Other OAI Info
Responses are encoded in XML syntax
OAI-PMH supports any metadata format encoded in XML; Simple Dublin Core is the minimal format specified
Data Providers may define a logical set hierarchy to support levels of granularity for harvesting by Service Providers
Date stamps flag the last change of the metadata set and thus provide further support for granularity of harvesting
OAI-PMH supports flow control
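
Flow control in OAI-PMH works through resumption tokens: a partial list response carries a `resumptionToken`, and the harvester repeats the request with that token until an empty token (or none) comes back. The sketch below is one way a harvester loop might look. The base URL, the function names, and the two canned responses are hypothetical; `fetch` is passed in as a parameter so the example runs without a live server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield every record identifier, following resumptionTokens.

    `fetch` is any callable that takes a URL and returns response XML
    as a string.
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.find(".//" + OAI + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one, marks the last chunk
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}


# Two canned responses standing in for a repository (illustrative only).
PAGES = {
    False: '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
           "<header><identifier>oai:example.org:1</identifier></header>"
           "<resumptionToken>page2</resumptionToken></ListIdentifiers></OAI-PMH>",
    True: '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
          "<header><identifier>oai:example.org:2</identifier></header>"
          "<resumptionToken/></ListIdentifiers></OAI-PMH>",
}


def fake_fetch(url):
    return PAGES["resumptionToken=" in url]


ids = list(harvest_identifiers("http://example.org/oai", fake_fetch))
print(ids)
```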

OAI Requests
Identify --> returns general information about the particular OAI server
ListMetadataFormats --> returns formats available
ListSets --> returns list of sets available
ListIdentifiers --> returns identifiers only
ListRecords --> returns records (optionally limited to a set)
GetRecord --> returns particular record
Try it out at the UIUC OAI Registry (http://gita.grainger.uiuc.edu/registry/searchform.asp)
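
Each request is an ordinary HTTP request whose `verb` argument names one of the six commands. A small sketch of how the URLs might be assembled; the base URL, the set name "maps", and the record identifier are hypothetical, while the argument names (`metadataPrefix`, `set`, `identifier`) are those defined by the protocol.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/oai"  # hypothetical repository address


def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    args = {key: value for key, value in arguments.items() if value is not None}
    return BASE_URL + "?" + urlencode({"verb": verb, **args})


print(oai_request("Identify"))
print(oai_request("ListMetadataFormats"))
print(oai_request("ListSets"))
print(oai_request("ListIdentifiers", metadataPrefix="oai_dc"))
print(oai_request("ListRecords", metadataPrefix="oai_dc", set="maps"))
print(oai_request("GetRecord", identifier="oai:example.org:1", metadataPrefix="oai_dc"))
```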

Dates Used in OAI-PMH
Datestamps are used as values in requests to support selective harvesting by date (generally latest update date of the metadata record)
Datestamps are also used in record headers in responses
Datestamps are particular to a repository
Repeat: OAI dates are about the metadata, not the resources
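
Selective harvesting by date can be pictured as a filter over record-header datestamps, with the `from` and `until` request arguments as inclusive bounds. The sketch below assumes day-level datestamps (YYYY-MM-DD); the function name and the sample headers are made up for illustration.

```python
from datetime import date


def select_by_datestamp(headers, from_=None, until=None):
    """Return identifiers whose datestamp lies within [from_, until].

    `headers` is a list of (identifier, "YYYY-MM-DD") pairs. The datestamp
    is the date the metadata record last changed, not a date of the resource.
    """
    lower = date.fromisoformat(from_) if from_ else None
    upper = date.fromisoformat(until) if until else None
    chosen = []
    for identifier, stamp in headers:
        day = date.fromisoformat(stamp)
        if lower is not None and day < lower:
            continue
        if upper is not None and day > upper:
            continue
        chosen.append(identifier)
    return chosen


# Made-up record headers, for illustration only.
headers = [
    ("oai:example.org:1", "2007-01-15"),
    ("oai:example.org:2", "2007-03-02"),
    ("oai:example.org:3", "2007-06-30"),
]
changed = select_by_datestamp(headers, from_="2007-03-02")
print(changed)
```

A harvester that remembers the date of its last visit can pass it as `from` and pick up only the records whose metadata changed since then.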

OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
  Provenance
  Rights

About Container Example

OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu, "Distinct Metadata Schemas"
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema; look for providers and sample records

What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full-text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
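
The example OpenURL above, with its punctuation restored, splits into two parts: the address of the link resolver and the citation metadata carried as query arguments. A short sketch of how a resolver might pull them apart, using only the standard library; the function name is an assumption.

```python
from urllib.parse import parse_qs, urlsplit

OPENURL = (
    "http://sfxserver.uni.edu/sfxmenu"
    "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134"
)


def parse_openurl(url):
    """Split an OpenURL into (resolver address, citation metadata)."""
    parts = urlsplit(url)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; each key occurs once in this example.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


resolver, citation = parse_openurl(OPENURL)
print(resolver)
print(citation)
```

The same citation can be pointed at a different institution's link server by swapping the resolver address and leaving the query arguments alone.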

Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community--BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional "buddy system"

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"
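
The second statement, how fully an element set is applied across a population of records, lends itself to a simple count. A minimal sketch, assuming each record is a mapping from element name to a list of values; the expected-element list, the function name, and the sample records are hypothetical.

```python
# Hypothetical local profile: elements every record is expected to carry.
EXPECTED = ["title", "creator", "date", "language", "format"]


def completeness(records, expected=EXPECTED):
    """Return, per expected element, the share of records with a non-empty value."""
    total = len(records)
    report = {}
    for element in expected:
        filled = sum(
            1 for record in records
            if any(value.strip() for value in record.get(element, []))
        )
        report[element] = filled / total if total else 0.0
    return report


# Made-up records, for illustration only.
records = [
    {"title": ["Map of Ithaca"], "creator": ["Doe, Jane"], "date": ["1998"], "language": ["en"]},
    {"title": ["Letters"], "creator": [""], "date": ["1902"], "format": ["text/html"]},
    {"title": ["Field notes"], "date": ["unknown"], "language": ["en"]},
    {"title": ["Untitled"], "date": ["2001-05"]},
]
report = completeness(records)
print(report)
```

A count like this shows how often an element is present, not whether its values are any good: the "unknown" date above still counts as filled.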

Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as "premium" or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  Includes some formatting and color coding
– Disadvantages
  Assumes consistency/predictability
  Difficult to determine extent of problems found
  Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  Better sorting and control by reviewer
– Disadvantages
  Unwieldy for large files
  Requires sustained focus from reviewer
  Requires translation into tab-delimited file
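
The translation into a tab-delimited file can be scripted. A sketch, assuming records are already parsed into a mapping of record id to elements and values; the column names, the function name, and the sample data are illustrative. One row per value keeps repeated elements visible when the file is sorted in a spreadsheet.

```python
import csv
import io


def to_tab_delimited(records):
    """Flatten records into tab-delimited text, one row per element value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


# Made-up records, for illustration only.
sample = {
    "rec1": {"title": ["A map"], "date": ["1998"]},
    "rec2": {"title": ["Letters"], "date": ["1902", "unknown"]},
}
table = to_tab_delimited(sample)
print(table)
```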

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file

Element Names vs. Record Ids (Scatter Plot)

Missing Elements (Scatter Plot)
2 records without language element
Format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
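
The two observations about date values, placeholder strings and the limited use of W3CDTF syntax, can also be checked in code. The sketch below assumes a regular expression that tests only the shape of a W3CDTF value (year, year-month, full date, or date with time and zone), not whether the month or day is in range. The placeholder list and the function name are hypothetical and would need tuning per collection.

```python
import re

# Shape check only: YYYY, YYYY-MM, YYYY-MM-DD, or a date with time and zone.
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

# Hypothetical "no information" strings; real collections will have their own.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}


def classify_date(value):
    """Label a date value as 'placeholder', 'w3cdtf', or 'other'."""
    cleaned = value.strip()
    if cleaned.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(cleaned):
        return "w3cdtf"
    return "other"


for sample in ["1998", "1998-07-04", "Unknown", "ca. 1900", "1997-07-16T19:20:30+01:00"]:
    print(sample, "->", classify_date(sample))
```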

Improving Metadata Quality ...
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

... Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the "general good"
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
CrosswalksCrosswalks
In general Semantic mapping of elements In general Semantic mapping of elements between source and target metadata standardsbetween source and target metadata standards
The process of metadata conversion specification The process of metadata conversion specification includes transformations required to convert a includes transformations required to convert a metadata record content to another format metadata record content to another format includingincludingndash Element to element mappingElement to element mappingndash Hierarchy and object resolutionHierarchy and object resolutionndash Metadata content conversionsMetadata content conversionsndash Stylesheets can be created to transform Stylesheets can be created to transform
metadata based on crosswalksmetadata based on crosswalks
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4747
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4848
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4949
Available CrosswalksAvailable Crosswalks
Library of CongressLibrary of Congressndash httpwwwlocgovmarcmarcdoczhtmlhttpwwwlocgovmarcmarcdoczhtml
MITMITndash httplibrariesmiteduguideshttplibrariesmiteduguides
subjectsmetadatamappingshtmlsubjectsmetadatamappingshtml GettyGetty
ndash httpwwwgettyeduresearchhttpwwwgettyeduresearchconducting_researchstandardsconducting_researchstandardsintrometadatacrosswalkshtmlintrometadatacrosswalkshtml
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5050
Problems With Converted Problems With Converted RecordsRecords
Differences in granularity (complex Differences in granularity (complex vs simple scheme)vs simple scheme)ndash Some data might be lostSome data might be lostndash Differences in semantics can occurDifferences in semantics can occurndash Differences in use of content standards Differences in use of content standards
make sharing sometimes problematicmake sharing sometimes problematicndash Properties may vary (eg repeatability)Properties may vary (eg repeatability)
Converting everything may not Converting everything may not always be the best solutionalways be the best solution
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5151
ExampleMappingExampleMappingMODStitle to DCtitleMODStitle to DCtitle
Includes attribute for type of titleIncludes attribute for type of titlendash AbbreviatedAbbreviatedndash TranslatedTranslatedndash AlternativeAlternativendash UniformUniform
Other attributesOther attributesndash IDauthoritydisplayLabelxLinkIDauthoritydisplayLabelxLink
Subelements title partName Subelements title partName partNumber nonSortpartNumber nonSort
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5252
Mapping MODStitle toMapping MODStitle toDCtitleDCtitle
DC has one element refinementDC has one element refinement
AlternativeAlternativendash DC title has no substructure MODS allows for DC title has no substructure MODS allows for
subelements for partNumber partNamesubelements for partNumber partName Best practice statement in DC-Lib says to Best practice statement in DC-Lib says to
include initial article include initial article ndash MODS parses intoltnonSortgtMODS parses intoltnonSortgt
MODS can link to a title in an authority file MODS can link to a title in an authority file if desiredif desired
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5353
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5454
ExerciseExercise
Evaluate a small set of human and Evaluate a small set of human and machine-created metadatamachine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1212
Other OAI InfoOther OAI Info Responses are encoded in XML syntaxResponses are encoded in XML syntax OAI-PMH supports any metadata format OAI-PMH supports any metadata format
encoded in XMLmdashSimple Dublin Core is the encoded in XMLmdashSimple Dublin Core is the minimal format specified minimal format specified
Data Providers may define a logical set Data Providers may define a logical set hierarchy to support levels of granularity hierarchy to support levels of granularity for harvesting by Service Providersfor harvesting by Service Providers
Date stamps flag the last change of the Date stamps flag the last change of the metadata set and thus provide further metadata set and thus provide further support for granularity of harvestingsupport for granularity of harvesting
OAI-PMH supports flow controlOAI-PMH supports flow control
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1313
OAI RequestsOAI Requests Identify--gtReturns general information Identify--gtReturns general information
about the particular OAI serverabout the particular OAI server ListMetadataFormats--gtreturns formats ListMetadataFormats--gtreturns formats
availableavailable ListSets--gtreturns list of sets availableListSets--gtreturns list of sets available ListIdentifiers--gtreturns identifiers onlyListIdentifiers--gtreturns identifiers only ListRecords--gtreturns record ids in a setListRecords--gtreturns record ids in a set GetRecord--gtreturns particular recordGetRecord--gtreturns particular record Try it out at the UIUC OIA Registry Try it out at the UIUC OIA Registry ((
httpgitagraingeruiuceduregistrysearchformasp))
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1414
Dates Used in OAI-PMHDates Used in OAI-PMH
Datestamps are used as values in requests Datestamps are used as values in requests to support selective harvesting by date to support selective harvesting by date (generally latest update date of the (generally latest update date of the metadata record)metadata record)
Datestamps are also used in record Datestamps are also used in record headers in responsesheaders in responses
Datestamps are particular to a repositoryDatestamps are particular to a repository Repeat OAI dates are about the Repeat OAI dates are about the metadatametadata
not the not the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1515

OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
    Provenance
    Rights

About Container Example
[Figure: example of an about container, not reproduced]

OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record Level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu "Distinct Metadata Schemas"
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records

What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)

"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access
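
What a link server does with a source citation can be sketched roughly as below: it looks up the copies licensed for the user's institution and returns only those that cover the cited year, with a fallback service otherwise. Everything in this sketch (the `HOLDINGS` table, the institution names, the example.com URLs, and the `resolve` function) is hypothetical; real link servers keep far richer knowledge bases.

```python
# Hypothetical knowledge base: institution -> ISSN -> licensed copies.
HOLDINGS = {
    "university-a": {
        "1234-5678": [
            {"label": "Publisher full text", "first_year": 1995,
             "url": "http://publisher.example.com/1234-5678"},
            {"label": "Aggregator full text", "first_year": 2001,
             "url": "http://aggregator.example.com/1234-5678"},
        ],
    },
    "university-b": {},
}

# Hypothetical fallback offered when no licensed copy covers the citation.
FALLBACK = ("Request via interlibrary loan", "http://ill.example.org/request")


def resolve(citation, institution, holdings=HOLDINGS):
    """Return (label, url) service links appropriate for this user context."""
    year = int(citation["date"])
    copies = holdings.get(institution, {}).get(citation["issn"], [])
    links = [(c["label"], c["url"]) for c in copies if c["first_year"] <= year]
    return links or [FALLBACK]


if __name__ == "__main__":
    citation = {"issn": "1234-5678", "date": "1998", "volume": "12", "spage": "134"}
    for institution in ("university-a", "university-b"):
        print(institution, resolve(citation, institution))
```

The same citation yields different links for different institutions, which is the sense in which the link server, not the source database, defines the user context.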

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework
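
Of the four mechanisms, IP-address identification is the simplest to illustrate. The sketch below maps a client address to an institution by checking it against registered network ranges, using Python's standard `ipaddress` module. The institution names and the ranges (taken from the address blocks reserved for documentation) are assumptions for illustration only.

```python
import ipaddress

# Hypothetical registrations: institution -> networks it has declared.
INSTITUTION_NETWORKS = {
    "university-a": [ipaddress.ip_network("192.0.2.0/24")],
    "university-b": [
        ipaddress.ip_network("198.51.100.0/25"),
        ipaddress.ip_network("2001:db8::/32"),
    ],
}


def institution_for(ip_text):
    """Return the institution whose registered networks contain the address, or None."""
    address = ipaddress.ip_address(ip_text)
    for institution, networks in INSTITUTION_NETWORKS.items():
        # Membership is False when the address and network versions differ.
        if any(address in network for network in networks):
            return institution
    return None


if __name__ == "__main__":
    for client in ("192.0.2.17", "198.51.100.200", "2001:db8::1"):
        print(client, institution_for(client))
```

An address-based check identifies a location rather than a person, which is one reason cookies, certificates, or Shibboleth attributes are listed as alternatives.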

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full-text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
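
An OpenURL of this form has two parts: the address of the link resolver and a query string carrying the citation metadata. As a small sketch using only the Python standard library, the code below separates the two and re-targets the same citation at a different resolver. The second resolver address, `http://resolver.example.org/menu`, is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EXAMPLE = (
    "http://sfxserver.uni.edu/sfxmenu"
    "?issn=1234-5678&date=1998&volume=12&issue=2&spage=134"
)


def parse_openurl(openurl):
    """Split an OpenURL into (resolver base URL, citation metadata dict)."""
    parts = urlsplit(openurl)
    resolver = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; this sketch keeps the first value only.
    citation = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return resolver, citation


def build_openurl(resolver, citation):
    """Attach citation metadata to a resolver address."""
    return resolver + "?" + urlencode(citation)


if __name__ == "__main__":
    resolver, citation = parse_openurl(EXAMPLE)
    print(resolver)
    print(citation)
    # The same citation, sent to another institution's resolver.
    print(build_openurl("http://resolver.example.org/menu", citation))
```

Keeping the citation separate from the resolver address is what lets each institution direct the same reference to its own link server.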

Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community--BIBCO & NACO
– Agreed upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional "buddy system"

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"
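
The second statement suggests one simple measurement: for each element, the share of records in the population that carry a usable value. The sketch below is one possible way to compute it, assuming records are held as dictionaries mapping element names to lists of string values. The sample records and the function name are hypothetical.

```python
def element_completeness(records, elements):
    """Return, per element, the fraction of records with a non-blank value."""
    if not records:
        return {element: 0.0 for element in elements}
    result = {}
    for element in elements:
        filled = sum(
            1
            for record in records
            # An element counts only if some value is more than whitespace.
            if any(value.strip() for value in record.get(element, []))
        )
        result[element] = filled / len(records)
    return result


# Hypothetical sample population.
RECORDS = [
    {"title": ["Maps of Ohio"], "language": ["en"], "format": ["image/jpeg"]},
    {"title": ["Field notes"], "language": [""]},
    {"title": ["Letters"], "language": ["en"], "format": ["text/html"]},
    {"title": ["Diary"], "format": [" "]},
]

if __name__ == "__main__":
    print(element_completeness(RECORDS, ["title", "language", "format"]))
```

A fill rate says only that something is present; whether the value is correct or useful falls under the other criteria.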

Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process
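
Both timeliness problems can be checked mechanically when the relevant dates are recorded. As a rough sketch, the function below flags a currency problem when the object was modified after its metadata, and reports lag as the number of days the object was available before metadata existed. The field names (`object_released`, `object_modified`, `metadata_created`, `metadata_modified`) are hypothetical.

```python
from datetime import date


def timeliness(record):
    """Report currency (stale metadata) and lag (days without metadata)."""
    lag_days = (record["metadata_created"] - record["object_released"]).days
    return {
        # Currency: the object changed after the metadata was last touched.
        "stale": record["object_modified"] > record["metadata_modified"],
        # Lag: metadata arrived after the object was disseminated.
        "lag_days": max(lag_days, 0),
    }


if __name__ == "__main__":
    sample = {
        "object_released": date(2008, 1, 10),
        "object_modified": date(2008, 6, 1),
        "metadata_created": date(2008, 1, 24),
        "metadata_modified": date(2008, 1, 24),
    }
    print(timeliness(sample))
```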

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as "premium" or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
    Includes some formatting and color coding
– Disadvantages
    Assumes consistency/predictability
    Difficult to determine extent of problems found
    Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
    Better sorting and control by reviewer
– Disadvantages
    Unwieldy for large files
    Requires sustained focus from reviewer
    Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
    View of several data dimensions simultaneously
    Reviewer controls data display
    Tends to pull reviewer focus to anomalies
    Handles fairly large files at one time while allowing subset views
    Display manipulation possible without programmers
– Disadvantages
    High cost of software
    Requires translation into tab-delimited file
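
Both the spreadsheet and the graphical-analysis approaches need the metadata translated into a tab-delimited file. One possible shape for that translation, sketched below, is one row per element occurrence: record id, element name, value. The wrapper format (`<records>` containing `<record id="...">`) is an assumption made for this example; only the Dublin Core namespace is standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical input: simple Dublin Core elements inside an ad hoc wrapper.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="rec1"><dc:title>Maps of Ohio</dc:title><dc:date>1998</dc:date></record>
  <record id="rec2"><dc:title>Field notes</dc:title></record>
</records>"""


def records_to_tsv(xml_text):
    """Flatten Dublin Core records into tab-delimited rows: id, element, value."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record in root.iter("record"):
        for child in record:
            # Keep only elements from the Dublin Core namespace.
            if child.tag.startswith("{" + DC_NS + "}"):
                element = child.tag.split("}", 1)[1]
                writer.writerow([record.get("id"), element, (child.text or "").strip()])
    return out.getvalue()


if __name__ == "__main__":
    print(records_to_tsv(SAMPLE), end="")
```

The long, three-column layout keeps repeated elements without inventing extra columns, and it can be sorted by element or by value once loaded into a spreadsheet.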

Element Names vs. Record Ids (Scatter Plot)
[Figure: scatter plot of element names against record ids, not reproduced]

Missing Elements (Scatter Plot)
[Figure: scatter plot, not reproduced; its callouts follow]
2 records without language element
format element present inconsistently
Easy to rescale axis on the fly and scroll through records
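
The gaps that the scatter plot makes visible can also be listed directly. The sketch below reports, for each record, which expected elements are absent or blank. It assumes records keyed by id, each a dictionary of element names to value lists. The sample data is hypothetical, arranged to echo the callouts (two records without a language element, format present inconsistently).

```python
def missing_elements(records, expected):
    """Map each record id to the expected elements it lacks (omits complete records)."""
    report = {}
    for record_id, elements in records.items():
        missing = [
            name
            for name in expected
            if not any(value.strip() for value in elements.get(name, []))
        ]
        if missing:
            report[record_id] = missing
    return report


# Hypothetical sample data.
RECORDS = {
    "rec1": {"title": ["Maps of Ohio"], "language": ["en"], "format": ["image/jpeg"]},
    "rec2": {"title": ["Field notes"], "format": ["text/html"]},
    "rec3": {"title": ["Letters"], "language": ["en"]},
    "rec4": {"title": ["Diary"]},
}

if __name__ == "__main__":
    for record_id, missing in missing_elements(RECORDS, ["title", "language", "format"]).items():
        print(record_id, "lacks", ", ".join(missing))
```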

Table View
[Figure: table view, not reproduced; its callouts follow]
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
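
The two problems called out in the table view, placeholder values and dates that do not follow W3CDTF, lend themselves to an automated check. The sketch below classifies each date value as a placeholder, a W3CDTF date, or something else. The regular expression follows the six forms in the W3C Date and Time Formats note; the list of placeholder strings is an assumption and would need to reflect what a given collection actually contains.

```python
import re

# The six W3CDTF forms: YYYY, YYYY-MM, YYYY-MM-DD, then time with zone designator.
W3CDTF = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?"
)

# Hypothetical list of non-empty values that carry no information.
PLACEHOLDERS = {"unknown", "n/a", "none", "n.d.", "no date", "not available"}


def classify_date(value):
    """Return 'placeholder', 'w3cdtf', or 'other' for one date value."""
    text = value.strip()
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.fullmatch(text):
        return "w3cdtf"
    return "other"


if __name__ == "__main__":
    for value in ("1998", "1998-05-12", "May 12, 1998", "n.d.", "Unknown"):
        print(repr(value), classify_date(value))
```

The pattern checks syntax only; it accepts a day such as 31 in any month, so calendar validity would need a further step.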

Improving Metadata Quality ...
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

... Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the "general good"
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts--Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on--but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, "Two Paths to Interoperable Metadata"

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
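
Element-to-element mapping can be recorded in a machine-processable form, as in this minimal sketch. It uses only the two equivalences mentioned in the Godby quotation (title to MARC 245, author/creator to MARC 100). Sending additional values to 246 and 700 is an assumption made here because 245 and 100 are not repeatable; published crosswalks make their own, more detailed choices. Elements with no mapping are returned separately so that the loss is visible.

```python
# Illustrative crosswalk: DC element -> (tag for first value, tag for further values).
CROSSWALK = {
    "title": ("245", "246"),
    "creator": ("100", "700"),
}


def dc_to_marc(record):
    """Convert a DC record (element -> list of values) to (MARC fields, unmapped)."""
    fields = []
    unmapped = {}
    for element, values in record.items():
        if element not in CROSSWALK:
            unmapped[element] = list(values)  # no equivalence recorded: data would be lost
            continue
        first_tag, repeat_tag = CROSSWALK[element]
        for position, value in enumerate(values):
            fields.append((first_tag if position == 0 else repeat_tag, value))
    return fields, unmapped


if __name__ == "__main__":
    record = {
        "title": ["Maps of Ohio", "Ohio maps"],
        "creator": ["Smith, Jane", "Lee, Sam"],
        "coverage": ["Ohio"],
    }
    fields, unmapped = dc_to_marc(record)
    for tag, value in fields:
        print(tag, value)
    print("unmapped:", unmapped)
```

Holding the mapping in a table rather than in conversion logic makes it easier to review, share, and turn into a stylesheet.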

Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
MODS title includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement
– Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
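
The flattening involved in this mapping can be sketched as follows. Each MODS `titleInfo` becomes a single string: the `nonSort` article is put back in front of the title, and `partNumber` and `partName` are appended. Only `type="alternative"` maps to the DC refinement; the other types fall back to plain title, so the type distinction is lost. The separator punctuation and the sample record are assumptions of this sketch, not rules taken from a published mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = "{http://www.loc.gov/mods/v3}"

# Hypothetical sample record.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort><title>Journal of Examples</title>
    <partNumber>Part 2</partNumber><partName>Methods</partName>
  </titleInfo>
  <titleInfo type="alternative"><title>J. Ex.</title></titleInfo>
  <titleInfo type="uniform"><title>Examples journal</title></titleInfo>
</mods>"""


def title_info_to_dc(title_info):
    """Flatten one MODS titleInfo into a (DC element name, value) pair."""
    def text(tag):
        return (title_info.findtext(MODS_NS + tag) or "").strip()

    # DC title has no substructure, so the initial article rejoins the title.
    main_title = " ".join(p for p in (text("nonSort"), text("title")) if p)
    parts = [main_title] + [p for p in (text("partNumber"), text("partName")) if p]
    # Only "alternative" has a DC refinement; other types become plain title.
    element = "alternative" if title_info.get("type") == "alternative" else "title"
    return element, ". ".join(parts)


def mods_titles_to_dc(mods_xml):
    """Return (element, value) pairs for every titleInfo in a MODS record."""
    root = ET.fromstring(mods_xml)
    return [title_info_to_dc(t) for t in root.iter(MODS_NS + "titleInfo")]


if __name__ == "__main__":
    for element, value in mods_titles_to_dc(SAMPLE):
        print(element, "=", value)
```

The conversion runs in one direction only: once the parts are joined into a string, the MODS substructure and any authority link cannot be recovered from the DC value.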

Exercise
Evaluate a small set of human and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1313
OAI RequestsOAI Requests Identify--gtReturns general information Identify--gtReturns general information
about the particular OAI serverabout the particular OAI server ListMetadataFormats--gtreturns formats ListMetadataFormats--gtreturns formats
availableavailable ListSets--gtreturns list of sets availableListSets--gtreturns list of sets available ListIdentifiers--gtreturns identifiers onlyListIdentifiers--gtreturns identifiers only ListRecords--gtreturns record ids in a setListRecords--gtreturns record ids in a set GetRecord--gtreturns particular recordGetRecord--gtreturns particular record Try it out at the UIUC OIA Registry Try it out at the UIUC OIA Registry ((
httpgitagraingeruiuceduregistrysearchformasp))
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1414
Dates Used in OAI-PMHDates Used in OAI-PMH
Datestamps are used as values in requests Datestamps are used as values in requests to support selective harvesting by date to support selective harvesting by date (generally latest update date of the (generally latest update date of the metadata record)metadata record)
Datestamps are also used in record Datestamps are also used in record headers in responsesheaders in responses
Datestamps are particular to a repositoryDatestamps are particular to a repository Repeat OAI dates are about the Repeat OAI dates are about the metadatametadata
not the not the resourcesresources
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1515
OAI-PMH Optional ContainersOAI-PMH Optional Containers
Repository levelRepository levelndash RightsRightsndash BrandingBranding
Record levelRecord levelndash AboutAbout
ProvenanceProvenanceRightsRights
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1616
About Container ExampleAbout Container Example
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1717
OAI Rights ExpressionsOAI Rights Expressions
Rights expressions are valid at three Rights expressions are valid at three levelslevelsndash RepositoryRepositoryndash SetSetndash RecordRecord
Rights expressed at the Repository Rights expressed at the Repository and Set levels are not a substitute for and Set levels are not a substitute for expressions at the Record Levelexpressions at the Record Level
OAI Best Practices (DLF amp OAI Best Practices (DLF amp NSDL)NSDL)
Guidelines for data providers and Guidelines for data providers and service providersservice providersndash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpindexphpMain_Page1048708mediawikioaibpindexphpMain_Page1048708 Best Practices for Shareable Best Practices for Shareable
MetadataMetadatandash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpPublicTOC1048708mediawikioaibpPublicTOC1048708
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1818
OAI In PracticeOAI In Practice
The UIUC OAI-PMH Data Provider RegistryThe UIUC OAI-PMH Data Provider Registryndash httpgitagraingeruiuceduregistrysearchformasp
Includes most known data providersIncludes most known data providers Link on home page to Service ProvidersLink on home page to Service Providers Provides multiple reports sample records Provides multiple reports sample records
browses search etcbrowses search etc Ex Show report from left hand menu Ex Show report from left hand menu
ldquoDistinct Metadata Schemasrdquo ldquoDistinct Metadata Schemasrdquo ndash httpgitagraingeruiuceduregistryListSchemasasp ndash Choose a schema look for providers and sample records Choose a schema look for providers and sample records
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1919
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
ldquoldquoThe OpenURL standard enables a user who has The OpenURL standard enables a user who has retrieved an article citation for example to obtain retrieved an article citation for example to obtain immediate access to the most appropriate copy of immediate access to the most appropriate copy of that object through the implementation of extended that object through the implementation of extended linking services The selection of the best copy is linking services The selection of the best copy is based on user and organizational preferences based on user and organizational preferences regarding the location of the copy its cost and regarding the location of the copy its cost and agreements with information suppliers and similar agreements with information suppliers and similar considerations This selection occurs without the considerations This selection occurs without the knowledge of the user it is made possible by the knowledge of the user it is made possible by the transport of metadata with the OpenURL link from transport of metadata with the OpenURL link from the source citation to a resolver (the link server) the source citation to a resolver (the link server) which stores the preference information and the which stores the preference information and the links to the appropriate materialrdquolinks to the appropriate materialrdquo
--OpenURL Overview SFX website--OpenURL Overview SFX website
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL CharacteristicsOpenURL Characteristics
Protocol operates between an Protocol operates between an information resource and a service information resource and a service componentcomponent
Service component is called a ldquolink Service component is called a ldquolink serverrdquo or ldquolink resolverrdquoserverrdquo or ldquolink resolverrdquo
Link server defines the user contextLink server defines the user context Takes source citation and determines Takes source citation and determines
whether a user has accesswhether a user has access
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2323
Distinguishing UsersDistinguishing Users
Uses information stored in a cookie Uses information stored in a cookie (the CookiePusher mechanism)(the CookiePusher mechanism)
Uses information contained in a Uses information contained in a digital certificate such as the one digital certificate such as the one proposed by the DLF digital proposed by the DLF digital certificates prototype projectcertificates prototype project
Identifies a users IP addressIdentifies a users IP address Obtains user attributes via the Obtains user attributes via the
Shibboleth frameworkShibboleth framework
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define Quality
• Experience of the library community: BIBCO & NACO
  – Agreed-upon standards for library quality
  – Training and documentation in support of practitioners
  – Review and enforcement of standards by means of institutional “buddy system”
How Does Quality Happen?
• Lessons from the library community
  – Quality is quantifiable and measurable
  – To be effective, enforcement of standards of quality must take place at the community level
• Furthermore
  – Data problems are not unique to particular communities
  – General strategies can improve interoperability
Quality Measurement Criteria
• Completeness
• Accuracy
• Provenance
• Conformance to expectations
• Logical consistency and coherence
• Timeliness (Currency and Lag)
• Accessibility
Completeness
• “Metadata should describe the target objects as completely as economically feasible”
• “Element set should be applied to the target object population as completely as possible”
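One way to make the second statement measurable is to count, for each element, the share of records that carry a non-empty value. The Python sketch below assumes records are plain dictionaries. The function name and the sample records are invented for illustration and are not part of any standard.

```python
from collections import Counter


def element_coverage(records, elements):
    """Return, per element, the fraction of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for element in elements:
            value = record.get(element)
            # Whitespace-only values are treated as missing.
            if value is not None and str(value).strip():
                counts[element] += 1
    total = len(records)
    return {e: (counts[e] / total if total else 0.0) for e in elements}


# Invented sample records.
records = [
    {"title": "Map of Ohio", "date": "1901", "language": "en"},
    {"title": "Letter", "date": "", "language": "en"},
    {"title": "Photograph", "date": "1955"},
    {"title": "Diary", "language": "  "},
]

coverage = element_coverage(records, ["title", "date", "language"])
for element, share in coverage.items():
    print(f"{element}: {share:.0%}")
```

A coverage figure says nothing about whether the values are correct. It only shows how widely the element set has been applied across the collection.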
Accuracy
• Information provided in values should be correct and factual
• Editing applied to
  – Eliminate typos
  – Ensure conforming name expressions
  – Ensure standard abbreviations, usages in general
Provenance
• Who prepared the metadata? What do we know about the preparer?
• What methods were used to create the metadata? Is it human created or created by machine?
• What transformations have been applied since creation?
• Where has it been before?
Conformance to Expectations
• Contains elements a community would expect to find
• Controlled vocabularies are well-chosen and explicitly exposed to downstream users
• Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
• Standard mechanisms like application profiles and common crosswalks are used
• Similar structures and appearance are enabled for search results
• There is very limited reliance on defaulted values
Timeliness
• Currency
  – Target object changes but metadata does not
• Lag
  – Target object disseminated before some or all metadata is available
• “Metadata aging” is affected by cultural differences between librarians and technologists
  – Librarians: once and it’s done
  – Technologists: metadata as an iterative process
Accessibility
• Barriers to accessibility may be economic, technical, or organizational
  – Metadata as “premium” or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
• Random sampling (XMLSpy)
  – Advantages
    • Includes some formatting and color coding
  – Disadvantages
    • Assumes consistency/predictability
    • Difficult to determine extent of problems found
    • Tedious at best
Evaluating Metadata (2)
• Spreadsheets (Microsoft Excel)
  – Advantages
    • Better sorting and control by reviewer
  – Disadvantages
    • Unwieldy for large files
    • Requires sustained focus from reviewer
    • Requires translation into tab-delimited file
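The “translation into tab-delimited file” step can be small. The Python sketch below flattens records into one row per record and one column per element, using the standard csv module. The record layout, the function name, and the " | " separator for repeated values are assumptions made for the example.

```python
import csv
import io


def records_to_tsv(records, elements):
    """Write one row per record and one column per element, tab-delimited."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id"] + elements)
    for record_id, record in records.items():
        # Repeated elements are joined with " | " so each record stays on one row.
        row = [" | ".join(record.get(element, [])) for element in elements]
        writer.writerow([record_id] + row)
    return buffer.getvalue()


# Invented sample records: element name -> list of values.
records = {
    "rec1": {"title": ["Map of Ohio"], "date": ["1901"]},
    "rec2": {"title": ["Letter"], "subject": ["Correspondence", "Family"]},
}

tsv = records_to_tsv(records, ["title", "date", "subject"])
print(tsv)
```

Empty cells in the output correspond to missing elements, so sorting a column in a spreadsheet groups the incomplete records together.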
Evaluating Metadata (3)
• Visual Graphical Analysis (Spotfire)
  – Advantages
    • View of several data dimensions simultaneously
    • Reviewer controls data display
    • Tends to pull reviewer focus to anomalies
    • Handles fairly large files at one time while allowing subset views
    • Display manipulation possible without programmers
  – Disadvantages
    • High cost of software
    • Requires translation into tab-delimited file
Element Names vs. Record IDs (Scatter Plot)
Missing Elements (Scatter Plot)
[Figure: scatter plot of element names against record IDs, with these annotations]
• 2 records without language element
• format element present inconsistently
• Easy to rescale axis on the fly and scroll through records
Table View
[Figure: table view of element values, with these annotations]
• Non-empty “no information” values that may confuse end users
• Only DC Date elements are selected for display
• The only W3CDTF syntax present is four digits
• Sorted by element value
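The two problems noted on the table view, placeholder values and dates that do not follow W3CDTF, can also be screened for in bulk. The Python sketch below is one possible approach. The placeholder list is invented for illustration, and the regular expression is a simplified reading of the W3CDTF profile that checks syntax only (it does not reject impossible dates such as 30 February).

```python
import re

# Simplified W3CDTF pattern: YYYY, YYYY-MM, YYYY-MM-DD, or a full date
# with a time and a required time zone designator (Z or +hh:mm / -hh:mm).
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?$"
)

# Hypothetical list of "no information" strings; a real project would
# build this list from its own data.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d.", "undated"}


def classify_date(value):
    """Label a date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"


for sample in ["1998", "1998-05-12", "Unknown", "c. 1900", ""]:
    print(repr(sample), "->", classify_date(sample))
```

Counting the labels across a whole harvest gives a quick picture of how much date normalization a collection would need before its dates can be sorted or searched reliably.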
Improving Metadata Quality …
• Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application profiles
  – Training materials, tools, methodologies
… Over Time
• Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the “general good”
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”
“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata
Crosswalks
• In general: semantic mapping of elements between source and target metadata standards
• The process of metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including
  – Element to element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks
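Element-to-element mapping, the first step listed above, can be recorded as a machine-processable lookup table. The Python sketch below uses only the two equivalences mentioned in the Godby quotation (title to MARC 245, author to MARC 100). The subfield codes, the function name, and the sample record are assumptions added for illustration, and the Dublin Core element name "creator" stands in for "author".

```python
# Illustrative crosswalk table holding only the two equivalences quoted above.
# The subfield code "a" is an assumption added for the example.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
}


def crosswalk(dc_record, mapping):
    """Convert a DC-style record into (tag, subfield, value) triples.

    Returns the converted fields plus the names of elements that have
    no entry in the mapping and were therefore dropped.
    """
    fields, unmapped = [], []
    for element, values in dc_record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)
            continue
        tag, subfield = target
        for value in values:
            fields.append((tag, subfield, value))
    return fields, unmapped


# Invented sample record: element name -> list of values.
dc_record = {
    "title": ["An Example Title"],
    "creator": ["Doe, Jane"],
    "coverage": ["North America"],
}

fields, unmapped = crosswalk(dc_record, DC_TO_MARC)
print(fields)
print("dropped:", unmapped)
```

Reporting the unmapped elements makes data loss visible instead of silent. Hierarchy resolution and content conversion (for example, reformatting names or dates) would need further steps that this sketch leaves out.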
Available Crosswalks
• Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
• MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
• Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
• Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards make sharing sometimes problematic
  – Properties may vary (e.g., repeatability)
• Converting everything may not always be the best solution
Example: Mapping MODS title to DC title
• Includes attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
• Other attributes
  – ID, authority, displayLabel, xLink
• Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
• DC has one element refinement: Alternative
  – DC title has no substructure; MODS allows for subelements for partNumber, partName
• Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
• MODS can link to a title in an authority file if desired
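The mapping described above can be sketched in Python with the standard library XML parser. The sample record is invented. The joining rule (initial article kept in front, part information appended after a period) and the choice to send every title type other than "alternative" to plain title are assumptions made for the example, not rules from MODS or Dublin Core.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample record.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Example Atlas</title>
    <partNumber>Part 2</partNumber>
    <partName>Maps</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Atlas of Examples</title>
  </titleInfo>
</mods>
"""


def child_text(parent, name):
    """Return the stripped text of a MODS child element, or an empty string."""
    node = parent.find(f"mods:{name}", MODS_NS)
    return node.text.strip() if node is not None and node.text else ""


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into a (DC element, string) pair."""
    root = ET.fromstring(mods_xml)
    result = []
    for info in root.findall("mods:titleInfo", MODS_NS):
        # Keep the initial article in front of the title.
        lead = " ".join(
            p for p in (child_text(info, "nonSort"), child_text(info, "title")) if p
        )
        pieces = [
            p
            for p in (lead, child_text(info, "partNumber"), child_text(info, "partName"))
            if p
        ]
        if not pieces:
            continue
        # Assumption: only type="alternative" maps to the Alternative refinement.
        element = "alternative" if info.get("type") == "alternative" else "title"
        result.append((element, ". ".join(pieces)))
    return result


dc_titles = mods_titles_to_dc(SAMPLE)
print(dc_titles)
```

The conversion runs cleanly in this direction because structure is only being discarded. Going from the flat DC string back to MODS subelements would require guessing where the article and part information begin.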
Exercise
• Evaluate a small set of human- and machine-created metadata
Dates Used in OAI-PMH
• Datestamps are used as values in requests to support selective harvesting by date (generally the latest update date of the metadata record)
• Datestamps are also used in record headers in responses
• Datestamps are particular to a repository
• Repeat: OAI dates are about the metadata, not the resources
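Selective harvesting by date comes down to adding from and until arguments to a ListRecords request. The Python sketch below builds such a request URL. The repository address and the function name are hypothetical, while verb, metadataPrefix, from, until, and set are argument names defined by OAI-PMH.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None,
                     until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request, optionally limited by datestamp."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # records changed on or after this date
    if until_date:
        params["until"] = until_date    # records changed on or before this date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository address.
url = list_records_url(
    "http://repository.example.org/oai", "oai_dc", from_date="2008-01-01"
)
print(url)
```

A harvester would typically store the date of its last successful harvest and pass it as from_date on the next run. Because the datestamps describe the metadata record, a resource that changes without its record changing will not be picked up this way.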
OAI-PMH Optional Containers
• Repository level
  – Rights
  – Branding
• Record level
  – About
    • Provenance
    • Rights
About Container Example
OAI Rights Expressions
• Rights expressions are valid at three levels
  – Repository
  – Set
  – Record
• Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level
OAI Best Practices (DLF & NSDL)
• Guidelines for data providers and service providers
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
• Best Practices for Shareable Metadata
  – http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
OAI In Practice
• The UIUC OAI-PMH Data Provider Registry
  – http://gita.grainger.uiuc.edu/registry/searchform.asp
• Includes most known data providers
• Link on home page to Service Providers
• Provides multiple reports, sample records, browses, search, etc.
• Ex.: Show report from left-hand menu “Distinct Metadata Schemas”
  – http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
  – Choose a schema, look for providers and sample records
What’s an OpenURL?
• The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
• Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)
“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website
OpenURL Characteristics
• Protocol operates between an information resource and a service component
• Service component is called a “link server” or “link resolver”
• Link server defines the user context
• Takes source citation and determines whether a user has access
Distinguishing Users
• Uses information stored in a cookie (the CookiePusher mechanism)
• Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
• Identifies a user’s IP address
• Obtains user attributes via the Shibboleth framework
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts--Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on--but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes the transformations required to convert a metadata record's content to another format, including:
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks

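The element-to-element mapping step can be sketched as a simple lookup table. This is a minimal illustration, not a real crosswalk: the two pairs (title to MARC 245, author/creator to MARC 100) are the ones named in the Godby quotation, while the record layout, the sample values, and the function name `crosswalk` are assumptions made for the example. Unmapped elements are returned separately so that data loss stays visible.

```python
# Hypothetical crosswalk table: Dublin Core element -> MARC field tag.
# Only these two pairs are named in the quoted text; published crosswalks
# (for example the Library of Congress one) are far more detailed.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",  # "Dublin Core author" in the quotation
}


def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Convert a DC record (element -> list of values) to (tag, value) pairs.

    Returns (fields, unmapped); unmapped holds everything the table
    could not place, instead of dropping it silently.
    """
    fields = []
    unmapped = {}
    for element, values in dc_record.items():
        tag = mapping.get(element)
        if tag is None:
            unmapped[element] = list(values)
            continue
        for value in values:
            fields.append((tag, value))
    return fields, unmapped


# Made-up sample record for illustration only.
record = {
    "title": ["An Example Title"],
    "creator": ["Doe, Jane"],
    "coverage": ["1998"],
}
fields, unmapped = crosswalk(record)
```

Returning the unmapped elements rather than discarding them reflects the point that there is rarely a one-to-one correspondence between schemes.
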
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS:title to DC:title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS:title to DC:title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

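The flattening described above can be sketched in a few lines. This is a hedged illustration only: the sample XML, the function name `mods_title_to_dc`, the ". " separator between parts, and the decision to send every typed title to the single DC refinement "alternative" are all assumptions made for the example, not a published mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def mods_title_to_dc(title_info):
    """Flatten one MODS <titleInfo> element into (dc_element, string)."""

    def text(tag):
        node = title_info.find(f"{{{MODS_NS}}}{tag}")
        return node.text.strip() if node is not None and node.text else ""

    # DC title has no substructure, so the subelements are concatenated.
    # The initial article held in <nonSort> is kept in the DC value.
    title = " ".join(p for p in (text("nonSort"), text("title")) if p)
    parts = [title, text("partNumber"), text("partName")]
    value = ". ".join(p for p in parts if p)

    # Assumption: any typed title (abbreviated, translated, alternative,
    # uniform) goes to "alternative", since DC has no closer match.
    element = "alternative" if title_info.get("type") else "title"
    return element, value


# Made-up sample for illustration only.
SAMPLE = """<titleInfo xmlns="http://www.loc.gov/mods/v3">
  <nonSort>The</nonSort>
  <title>Example Series</title>
  <partNumber>Part 2</partNumber>
  <partName>Sample Volume</partName>
</titleInfo>"""

element, value = mods_title_to_dc(ET.fromstring(SAMPLE))
```

The conversion is one-way: once the parts are joined into a single string, the MODS substructure cannot be recovered from the DC value.
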
Exercise
Evaluate a small set of human- and machine-created metadata

OAI-PMH Optional Containers
Repository level
– Rights
– Branding
Record level
– About
  Provenance
  Rights

About Container Example

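As an illustration of a record-level about container carrying provenance, the sketch below reads one with the Python standard library. The sample XML is made up for this example and is not the one shown on the slide; the element and attribute names follow the OAI provenance guidelines as best recalled and should be checked against the published schema. The function name `read_provenance` and the repository URL are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"prov": "http://www.openarchives.org/OAI/2.0/provenance"}

# Made-up sample; names assumed from the OAI provenance guidelines.
SAMPLE_ABOUT = """<about xmlns="http://www.openarchives.org/OAI/2.0/">
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2002-02-02T14:10:02Z" altered="true">
      <baseURL>http://repository.example.org/oai</baseURL>
      <identifier>oai:repository.example.org:item-1</identifier>
      <datestamp>2002-01-01</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>"""


def read_provenance(about_xml):
    """Return the origin of a harvested record, or None if not stated."""
    about = ET.fromstring(about_xml)
    origin = about.find("prov:provenance/prov:originDescription", NS)
    if origin is None:
        return None
    return {
        "harvest_date": origin.get("harvestDate"),
        "altered": origin.get("altered") == "true",
        "base_url": origin.findtext("prov:baseURL", namespaces=NS),
        "identifier": origin.findtext("prov:identifier", namespaces=NS),
    }


info = read_provenance(SAMPLE_ABOUT)
```
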
OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu, “Distinct Metadata Schemas”
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records

What’s an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)

“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user’s IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing (A&I) database to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/

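The example link above is a resolver base URL followed by citation metadata as key/value pairs. A minimal sketch of assembling and re-reading such a link with the Python standard library follows; the function name `build_openurl` is an assumption for illustration, and the base URL and keys are simply those of the example link, not a complete OpenURL implementation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(base_url, **citation):
    """Append citation metadata to a link-resolver base URL as a query string."""
    return f"{base_url}?{urlencode(citation)}"


# Base URL and keys copied from the example link in the slide.
url = build_openurl(
    "http://sfxserver.uni.edu/sfxmenu",
    issn="1234-5678",
    date="1998",
    volume="12",
    issue="2",
    spage="134",
)

# A resolver would read the same pairs back out of the query string.
citation = parse_qs(urlsplit(url).query)
```

Because the metadata travels in the link itself, the resolver needs no prior knowledge of the citing source to decide which copy to offer.
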
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community--BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional “buddy system”

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”

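One way to make the second statement measurable is to count, per element, the share of records that carry at least one non-blank value. The sketch below assumes records are held as dictionaries of element name to list of values; that layout, the sample data, and the function name `completeness` are illustrative assumptions.

```python
def completeness(records, expected_elements):
    """Fraction of records with at least one non-blank value, per element."""
    total = len(records)
    report = {}
    for element in expected_elements:
        present = sum(
            1
            for rec in records
            if any(value.strip() for value in rec.get(element, []))
        )
        report[element] = present / total if total else 0.0
    return report


# Made-up records for illustration only.
records = [
    {"title": ["A"], "language": ["en"], "format": ["text/html"]},
    {"title": ["B"], "language": [""]},  # blank value counts as missing
    {"title": ["C"], "format": ["image/jpeg"]},
    {"title": ["D"], "language": ["fr"]},
]
report = completeness(records, ["title", "language", "format"])
```

A ratio like this says nothing about whether the values are accurate; it only shows where the element set has not been applied.
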
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it’s done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  Includes some formatting and color coding
– Disadvantages
  Assumes consistency/predictability
  Difficult to determine extent of problems found
  Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  Better sorting and control by reviewer
– Disadvantages
  Unwieldy for large files
  Requires sustained focus from reviewer
  Requires translation into tab-delimited file

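The translation into a tab-delimited file can be sketched as follows, writing one row per record ID, element name, and value so that a spreadsheet can sort on any column. The `<records>`/`<record id="...">` wrapper, the sample data, and the function name `dc_to_tsv` are assumptions for illustration; harvested OAI-PMH responses are structured differently and would need their own traversal.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up sample with a hypothetical wrapper element.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="rec1"><dc:title>First item</dc:title><dc:date>1998</dc:date></record>
  <record id="rec2"><dc:title>Second item</dc:title><dc:language>en</dc:language></record>
</records>"""


def dc_to_tsv(xml_text):
    """Return tab-delimited text: one row per (record ID, element, value)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record in ET.fromstring(xml_text).iter("record"):
        for child in record:
            # Keep only Dublin Core elements; strip the namespace prefix.
            if child.tag.startswith(f"{{{DC_NS}}}"):
                name = child.tag.split("}", 1)[1]
                value = (child.text or "").strip()
                writer.writerow([record.get("id"), name, value])
    return out.getvalue()


tsv = dc_to_tsv(SAMPLE)
```

The same long format (ID, element, value) also suits the scatter-plot and table views used for visual analysis.
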
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file

Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
2 records without language element
format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty “no information” values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value

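The two observations in this view (placeholder values and date syntax) can also be checked automatically. The sketch below is a rough illustration: the regular expression covers only the date-only W3CDTF forms (YYYY, YYYY-MM, YYYY-MM-DD) and does not check days against month lengths, and the placeholder list and the function name `classify_date` are assumptions; a real list would come from inspecting the data.

```python
import re

# Date-only subset of W3CDTF; time-of-day forms are deliberately omitted.
W3CDTF_DATE = re.compile(
    r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$"
)

# Assumed "no information" strings, for illustration only.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}


def classify_date(value):
    """Label a DC Date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF_DATE.match(text):
        return "w3cdtf"
    return "other"


# Made-up values for illustration only.
values = ["1998", "1998-07-16", "ca. 1900", "n.d.", ""]
labels = [classify_date(v) for v in values]
```
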
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
CrosswalksCrosswalks
In general Semantic mapping of elements In general Semantic mapping of elements between source and target metadata standardsbetween source and target metadata standards
The process of metadata conversion specification The process of metadata conversion specification includes transformations required to convert a includes transformations required to convert a metadata record content to another format metadata record content to another format includingincludingndash Element to element mappingElement to element mappingndash Hierarchy and object resolutionHierarchy and object resolutionndash Metadata content conversionsMetadata content conversionsndash Stylesheets can be created to transform Stylesheets can be created to transform
metadata based on crosswalksmetadata based on crosswalks
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4747
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4848
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4949
Available CrosswalksAvailable Crosswalks
Library of CongressLibrary of Congressndash httpwwwlocgovmarcmarcdoczhtmlhttpwwwlocgovmarcmarcdoczhtml
MITMITndash httplibrariesmiteduguideshttplibrariesmiteduguides
subjectsmetadatamappingshtmlsubjectsmetadatamappingshtml GettyGetty
ndash httpwwwgettyeduresearchhttpwwwgettyeduresearchconducting_researchstandardsconducting_researchstandardsintrometadatacrosswalkshtmlintrometadatacrosswalkshtml
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5050
Problems With Converted Problems With Converted RecordsRecords
Differences in granularity (complex Differences in granularity (complex vs simple scheme)vs simple scheme)ndash Some data might be lostSome data might be lostndash Differences in semantics can occurDifferences in semantics can occurndash Differences in use of content standards Differences in use of content standards
make sharing sometimes problematicmake sharing sometimes problematicndash Properties may vary (eg repeatability)Properties may vary (eg repeatability)
Converting everything may not Converting everything may not always be the best solutionalways be the best solution
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5151
ExampleMappingExampleMappingMODStitle to DCtitleMODStitle to DCtitle
Includes attribute for type of titleIncludes attribute for type of titlendash AbbreviatedAbbreviatedndash TranslatedTranslatedndash AlternativeAlternativendash UniformUniform
Other attributesOther attributesndash IDauthoritydisplayLabelxLinkIDauthoritydisplayLabelxLink
Subelements title partName Subelements title partName partNumber nonSortpartNumber nonSort
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5252
Mapping MODStitle toMapping MODStitle toDCtitleDCtitle
DC has one element refinementDC has one element refinement
AlternativeAlternativendash DC title has no substructure MODS allows for DC title has no substructure MODS allows for
subelements for partNumber partNamesubelements for partNumber partName Best practice statement in DC-Lib says to Best practice statement in DC-Lib says to
include initial article include initial article ndash MODS parses intoltnonSortgtMODS parses intoltnonSortgt
MODS can link to a title in an authority file MODS can link to a title in an authority file if desiredif desired
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5353
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5454
ExerciseExercise
Evaluate a small set of human and Evaluate a small set of human and machine-created metadatamachine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1616
About Container ExampleAbout Container Example
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1717
OAI Rights ExpressionsOAI Rights Expressions
Rights expressions are valid at three Rights expressions are valid at three levelslevelsndash RepositoryRepositoryndash SetSetndash RecordRecord
Rights expressed at the Repository Rights expressed at the Repository and Set levels are not a substitute for and Set levels are not a substitute for expressions at the Record Levelexpressions at the Record Level
OAI Best Practices (DLF amp OAI Best Practices (DLF amp NSDL)NSDL)
Guidelines for data providers and Guidelines for data providers and service providersservice providersndash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpindexphpMain_Page1048708mediawikioaibpindexphpMain_Page1048708 Best Practices for Shareable Best Practices for Shareable
MetadataMetadatandash httpwebservicesitcsumicheduhttpwebservicesitcsumichedu
mediawikioaibpPublicTOC1048708mediawikioaibpPublicTOC1048708
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1818
OAI In PracticeOAI In Practice
The UIUC OAI-PMH Data Provider RegistryThe UIUC OAI-PMH Data Provider Registryndash httpgitagraingeruiuceduregistrysearchformasp
Includes most known data providersIncludes most known data providers Link on home page to Service ProvidersLink on home page to Service Providers Provides multiple reports sample records Provides multiple reports sample records
browses search etcbrowses search etc Ex Show report from left hand menu Ex Show report from left hand menu
ldquoDistinct Metadata Schemasrdquo ldquoDistinct Metadata Schemasrdquo ndash httpgitagraingeruiuceduregistryListSchemasasp ndash Choose a schema look for providers and sample records Choose a schema look for providers and sample records
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1919
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
ldquoldquoThe OpenURL standard enables a user who has The OpenURL standard enables a user who has retrieved an article citation for example to obtain retrieved an article citation for example to obtain immediate access to the most appropriate copy of immediate access to the most appropriate copy of that object through the implementation of extended that object through the implementation of extended linking services The selection of the best copy is linking services The selection of the best copy is based on user and organizational preferences based on user and organizational preferences regarding the location of the copy its cost and regarding the location of the copy its cost and agreements with information suppliers and similar agreements with information suppliers and similar considerations This selection occurs without the considerations This selection occurs without the knowledge of the user it is made possible by the knowledge of the user it is made possible by the transport of metadata with the OpenURL link from transport of metadata with the OpenURL link from the source citation to a resolver (the link server) the source citation to a resolver (the link server) which stores the preference information and the which stores the preference information and the links to the appropriate materialrdquolinks to the appropriate materialrdquo
--OpenURL Overview SFX website--OpenURL Overview SFX website
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user’s IP address
Obtains user attributes via the Shibboleth framework

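Of the mechanisms listed for distinguishing users, identification by IP address is the simplest to illustrate. The following is a minimal sketch only, not a description of how any particular link server works: the network ranges (reserved documentation addresses), the institution names, and the function name institution_for_ip are all hypothetical.

```python
import ipaddress

# Hypothetical mapping of institutional network ranges to a user context.
# The ranges below are reserved documentation addresses, not real campuses.
INSTITUTION_NETWORKS = {
    "Example University": [ipaddress.ip_network("192.0.2.0/24")],
    "Example Public Library": [ipaddress.ip_network("198.51.100.0/25")],
}


def institution_for_ip(address):
    """Return the institution whose network contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for institution, networks in INSTITUTION_NETWORKS.items():
        if any(ip in network for network in networks):
            return institution
    return None


print(institution_for_ip("192.0.2.17"))
print(institution_for_ip("203.0.113.5"))
```

A resolver built this way would fall back to one of the other mechanisms (cookie, certificate, Shibboleth attributes) when the lookup returns None, for example for a user working off campus.
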
Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl

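The example link above is a resolver base URL followed by citation metadata as key/value pairs. A minimal sketch of how a source might assemble such a link, and how a resolver might read the citation back out, is given here. The keys and values are those of the slide's example, with the URL punctuation restored; the function names build_openurl and parse_openurl are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL of the link resolver, taken from the example on the slide.
RESOLVER_BASE = "http://sfxserver.uni.edu/sfxmenu"


def build_openurl(base, citation):
    """Append citation metadata to the resolver base URL as a query string."""
    return base + "?" + urlencode(citation)


def parse_openurl(url):
    """Recover the citation metadata, as a resolver would on receipt."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}
url = build_openurl(RESOLVER_BASE, citation)
print(url)
print(parse_openurl(url))
```

Because the citation travels with the link rather than being hard-coded to one target, the same metadata can be resolved to different copies by different institutions' link servers.
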
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional “buddy system”

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”

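The second statement, applying the element set across the whole object population, lends itself to a simple measurement: for each expected element, the share of records that carry a usable value. The sketch below assumes records are already parsed into dictionaries; the sample records, the list of expected elements, and the function names are hypothetical.

```python
# Hypothetical sample records: each maps an element name to a list of values.
records = [
    {"title": ["Annual report"], "creator": ["Smith, J."], "language": ["en"]},
    {"title": ["Field notes"], "creator": [], "language": ["en"]},
    {"title": ["Survey data"], "creator": ["Lee, K."]},
]

# Elements an (assumed) local application profile expects in every record.
EXPECTED_ELEMENTS = ["title", "creator", "language"]


def has_value(record, element):
    """True when the element is present with at least one non-blank value."""
    return any(value.strip() for value in record.get(element, []))


def element_completeness(records, elements):
    """Fraction of records that carry a usable value for each element."""
    total = len(records)
    return {
        element: sum(has_value(r, element) for r in records) / total
        for element in elements
    }


for element, share in element_completeness(records, EXPECTED_ELEMENTS).items():
    print(f"{element}: {share:.0%}")
```

A figure like this says nothing about whether the values are correct; it only shows where the element set has not been applied.
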
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it’s done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
    Includes some formatting and color coding
– Disadvantages
    Assumes consistency/predictability
    Difficult to determine extent of problems found
    Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
    Better sorting and control by reviewer
– Disadvantages
    Unwieldy for large files
    Requires sustained focus from reviewer
    Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
    View of several data dimensions simultaneously
    Reviewer controls data display
    Tends to pull reviewer focus to anomalies
    Handles fairly large files at one time while allowing subset views
    Display manipulation possible without programmers
– Disadvantages
    High cost of software
    Requires translation into tab-delimited file

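Both the spreadsheet and the visual analysis approaches list translation into a tab-delimited file as a cost. One way to do that translation is to flatten each record into rows of record identifier, element name, and value. The sketch below assumes the records have already been parsed into Python tuples; the identifiers, the sample values, and the column names are hypothetical.

```python
import csv
import io

# Hypothetical harvested records: an identifier plus (element, value) pairs.
records = [
    ("oai:example.org:1", [("title", "Annual report"), ("date", "1998")]),
    ("oai:example.org:2", [("title", "Field notes"), ("date", "unknown")]),
]


def to_tab_delimited(records):
    """Flatten records into tab-delimited text, one row per element value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, pairs in records:
        for element, value in pairs:
            writer.writerow([record_id, element, value])
    return buffer.getvalue()


print(to_tab_delimited(records))
```

One row per element value keeps repeated elements intact and gives the reviewer two ready-made axes, record and element, for sorting or plotting.
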
Element Names vs. Record IDs (Scatter Plot)

Missing Elements (Scatter Plot)
2 records without language element
Format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty “no information” values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value

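The two problems called out in the table view, placeholder values and date syntax, can also be checked mechanically. The sketch below sorts a handful of date values and labels each one. It covers only the date-only forms of W3CDTF (YYYY, YYYY-MM, YYYY-MM-DD); the placeholder list, the sample values, and the function name classify_date are hypothetical.

```python
import re

# Date-only forms of W3CDTF: YYYY, YYYY-MM, YYYY-MM-DD.
# The profile's time-of-day forms are deliberately left out of this sketch.
W3CDTF_DATE = re.compile(r"\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?")

# Hypothetical placeholder strings; a real list would come from the data.
NO_INFORMATION = {"unknown", "n/a", "none", "no date", "not available"}


def classify_date(value):
    """Label a date value as 'w3cdtf', 'no information', or 'other'."""
    text = value.strip()
    if text.lower() in NO_INFORMATION:
        return "no information"
    if W3CDTF_DATE.fullmatch(text):
        return "w3cdtf"
    return "other"


for value in sorted(["1998", "unknown", "2004-13", "ca. 1950", "2001-05-07"]):
    print(f"{value!r}: {classify_date(value)}")
```

The pattern checks syntax only, so a value such as 2001-02-31 would still pass; a stricter check would need real calendar validation.
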
Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
A metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks

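Element-to-element mapping, the first item in the list above, is the part of a crosswalk that is easiest to make machine-processable. The sketch below records the two equivalences named in the Godby quotation as a lookup table and applies them to one record. The sample record and the function name crosswalk are hypothetical, and "creator" is used as the Dublin Core element name for what the quotation calls author.

```python
# Element-to-element crosswalk. Only the two equivalences named in the
# quotation are included; "creator" is the Dublin Core element for author.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk(dc_record, mapping):
    """Convert (element, value) pairs to (tag, value) pairs.

    Returns the converted fields and the pairs that had no mapping, so that
    data lost in conversion is reported rather than silently dropped.
    """
    converted, unmapped = [], []
    for element, value in dc_record:
        if element in mapping:
            converted.append((mapping[element], value))
        else:
            unmapped.append((element, value))
    return converted, unmapped


# Hypothetical sample record.
dc_record = [
    ("title", "An Example Title"),
    ("creator", "Doe, Jane"),
    ("rights", "All rights reserved"),
]
fields, leftovers = crosswalk(dc_record, DC_TO_MARC)
print(fields)
print(leftovers)
```

A lookup table covers only the first item in the list. Hierarchy resolution and content conversion need more than a table: MARC 100, for instance, is not repeatable, so a record with several creators would need additional handling that this sketch leaves out.
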
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement
– Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

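The points above can be tried out on a small record. The sketch below reads MODS titleInfo elements, puts the nonSort article back in front of the title, appends partNumber and partName, and maps a type of alternative to the DC refinement. Several choices here are assumptions rather than anything the slides prescribe: the sample record, the ". " separator between parts, the function names, and the decision to map every other title type to plain title.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical sample record used only for illustration.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Example Atlas</title>
    <partNumber>Part 2</partNumber>
    <partName>Maps</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Atlas of Examples</title>
  </titleInfo>
</mods>
"""


def flatten_title(title_info):
    """Join MODS title subelements into one unstructured string."""
    def text(name):
        return title_info.findtext(f"mods:{name}", default="", namespaces=MODS_NS)

    # Keep the initial article by putting nonSort back in front of the title.
    main = text("nonSort") + text("title").strip()
    parts = [main] + [text(n).strip() for n in ("partNumber", "partName")]
    return ". ".join(part for part in parts if part)


def mods_titles_to_dc(xml_text):
    """Map each titleInfo to a ('title' | 'alternative', value) pair."""
    root = ET.fromstring(xml_text)
    pairs = []
    for title_info in root.findall("mods:titleInfo", MODS_NS):
        kind = title_info.get("type")
        element = "alternative" if kind == "alternative" else "title"
        pairs.append((element, flatten_title(title_info)))
    return pairs


print(mods_titles_to_dc(SAMPLE))
```

The conversion runs in one direction only: once the parts are joined into a single string, the MODS substructure cannot be recovered from the DC title. This is the granularity loss that converted records are prone to.
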
Exercise
Evaluate a small set of human and machine-created metadata

OAI Rights Expressions
Rights expressions are valid at three levels
– Repository
– Set
– Record
Rights expressed at the Repository and Set levels are not a substitute for expressions at the Record level

OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC

OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu “Distinct Metadata Schemas”
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records

What’s an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full content described by the metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
ldquoldquoThe OpenURL standard enables a user who has The OpenURL standard enables a user who has retrieved an article citation for example to obtain retrieved an article citation for example to obtain immediate access to the most appropriate copy of immediate access to the most appropriate copy of that object through the implementation of extended that object through the implementation of extended linking services The selection of the best copy is linking services The selection of the best copy is based on user and organizational preferences based on user and organizational preferences regarding the location of the copy its cost and regarding the location of the copy its cost and agreements with information suppliers and similar agreements with information suppliers and similar considerations This selection occurs without the considerations This selection occurs without the knowledge of the user it is made possible by the knowledge of the user it is made possible by the transport of metadata with the OpenURL link from transport of metadata with the OpenURL link from the source citation to a resolver (the link server) the source citation to a resolver (the link server) which stores the preference information and the which stores the preference information and the links to the appropriate materialrdquolinks to the appropriate materialrdquo
--OpenURL Overview SFX website--OpenURL Overview SFX website
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL CharacteristicsOpenURL Characteristics
Protocol operates between an Protocol operates between an information resource and a service information resource and a service componentcomponent
Service component is called a ldquolink Service component is called a ldquolink serverrdquo or ldquolink resolverrdquoserverrdquo or ldquolink resolverrdquo
Link server defines the user contextLink server defines the user context Takes source citation and determines Takes source citation and determines
whether a user has accesswhether a user has access
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2323
Distinguishing UsersDistinguishing Users
Uses information stored in a cookie Uses information stored in a cookie (the CookiePusher mechanism)(the CookiePusher mechanism)
Uses information contained in a Uses information contained in a digital certificate such as the one digital certificate such as the one proposed by the DLF digital proposed by the DLF digital certificates prototype projectcertificates prototype project
Identifies a users IP addressIdentifies a users IP address Obtains user attributes via the Obtains user attributes via the
Shibboleth frameworkShibboleth framework
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
2 records without language element
format element present inconsistently
Easy to rescale axis on the fly and scroll through records
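The same check the scatter plot supports visually (which records lack which elements) can be approximated in code. This is a rough sketch under assumptions: the input shape (a mapping from record id to the element names present) and the function name `missing_elements` are invented for the example.

```python
from collections import defaultdict

def missing_elements(records, expected):
    """Map each expected element to the record ids that lack it.

    records:  dict of record_id -> list of element names present
    expected: element names every record is expected to carry
    """
    missing = defaultdict(list)
    for record_id, elements in records.items():
        present = set(elements)
        for name in expected:
            if name not in present:
                missing[name].append(record_id)
    return {name: sorted(ids) for name, ids in missing.items()}

# Hypothetical data: two records have no language element.
records = {
    "rec1": ["title", "date", "language", "format"],
    "rec2": ["title", "date"],
    "rec3": ["title", "format"],
}
print(missing_elements(records, ["title", "language", "format"]))
```

Unlike a random sample, a pass over every record gives the full extent of each gap.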
Table View
Non-empty "no information" values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
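The two problems called out in the table view (placeholder values and date syntax) can be flagged automatically. The sketch below is illustrative only: the `NO_INFO` list of placeholder strings is an assumption, since the slide does not say which values appeared, and the regular expression follows the W3CDTF profile's date levels (year, year-month, full date, and date with time and zone) without checking calendar validity such as days per month.

```python
import re

# W3CDTF levels: YYYY, YYYY-MM, YYYY-MM-DD, and date plus time with zone.
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?$"
)

# Assumed examples of non-empty values that carry no information.
NO_INFO = {"unknown", "n/a", "none", "no date", "not available", "-"}

def classify_date(value):
    """Label a DC Date value as empty, no-information, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in NO_INFO:
        return "no-information"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"

for value in ["1998", "1998-07-16", "July 1998", "Unknown", ""]:
    print(repr(value), classify_date(value))
```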
Improving Metadata Quality ...
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies
... Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the "general good"
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access"
"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, Two Paths to Interoperable Metadata
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes the transformations required to convert a metadata record's content to another format, including:
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
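Element-to-element mapping, the simplest of the transformations listed above, can be sketched as a lookup table. The two equivalences come from the Godby quotation (title to MARC 245, author to MARC 100); the key `creator` is used because that is the Dublin Core element name for an author. The function name `crosswalk` and the record shape are assumptions for this example, and a real crosswalk would also need indicators, subfields, and content conversions.

```python
# Equivalences taken from the quotation above; a real crosswalk has many more.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}

def crosswalk(record, mapping):
    """Convert (element, value) pairs using a source-to-target mapping.

    Returns two lists: the converted pairs, and the pairs that had no
    target and would be lost in conversion.
    """
    converted, unmapped = [], []
    for element, value in record:
        target = mapping.get(element)
        if target is None:
            unmapped.append((element, value))
        else:
            converted.append((target, value))
    return converted, unmapped

record = [
    ("title", "Two Paths to Interoperable Metadata"),
    ("creator", "Godby, Jean"),
    ("rights", "All rights reserved"),
]
converted, unmapped = crosswalk(record, DC_TO_MARC)
print(converted)
print(unmapped)
```

Reporting the unmapped pairs makes data loss visible instead of silent.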
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS:title to DC:title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS:title to DC:title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
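A minimal sketch of this mapping follows, to show where the MODS substructure goes when it is flattened. Several choices here are assumptions rather than anything the slides state: the function names, joining the parts with ". ", and sending every typed title (abbreviated, translated, alternative, uniform) to the single DC refinement Alternative. The initial article from nonSort is kept in front of the title, in line with the DC-Lib practice mentioned above.

```python
import xml.etree.ElementTree as ET

MODS_NS = "{http://www.loc.gov/mods/v3}"

def flatten_title_info(title_info):
    """Collapse one MODS titleInfo element into a single flat string."""
    def text_of(name):
        child = title_info.find(MODS_NS + name)
        return child.text.strip() if child is not None and child.text else ""

    # Keep the initial article (nonSort) in front of the title.
    main = " ".join(p for p in (text_of("nonSort"), text_of("title")) if p)
    parts = [p for p in (main, text_of("partNumber"), text_of("partName")) if p]
    # Assumption: parts are joined with ". "; DC title has no substructure.
    return ". ".join(parts)

def mods_titles_to_dc(mods_xml):
    """Map MODS titles to DC title / alternative (illustrative only)."""
    root = ET.fromstring(mods_xml)
    dc = {"title": [], "alternative": []}
    for title_info in root.iter(MODS_NS + "titleInfo"):
        flat = flatten_title_info(title_info)
        if not flat:
            continue
        # Assumption: any typed title goes to the one available refinement.
        key = "title" if title_info.get("type") is None else "alternative"
        dc[key].append(flat)
    return dc

SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><nonSort>The</nonSort><title>Annals of Example</title>
    <partNumber>Part 2</partNumber><partName>Appendix</partName></titleInfo>
  <titleInfo type="abbreviated"><title>Ann. Ex.</title></titleInfo>
</mods>"""

print(mods_titles_to_dc(SAMPLE))
```

The conversion is lossy in one direction: once the parts are joined, the type of title and the boundaries between nonSort, partNumber, and partName cannot be recovered from the DC record.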
Exercise
Evaluate a small set of human and machine-created metadata
OAI Best Practices (DLF & NSDL)
Guidelines for data providers and service providers
– http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page
Best Practices for Shareable Metadata
– http://webservices.itcs.umich.edu/mediawiki/oaibp/?PublicTOC
OAI In Practice
The UIUC OAI-PMH Data Provider Registry
– http://gita.grainger.uiuc.edu/registry/searchform.asp
Includes most known data providers
Link on home page to Service Providers
Provides multiple reports, sample records, browses, search, etc.
Ex.: Show report from left-hand menu, "Distinct Metadata Schemas"
– http://gita.grainger.uiuc.edu/registry/ListSchemas.asp
– Choose a schema, look for providers and sample records
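Once a data provider and schema have been chosen from the registry, the provider can be queried directly. The sketch below only builds request URLs; it does not contact any server. The verbs `ListMetadataFormats` and `ListRecords` and the `metadataPrefix` argument are part of OAI-PMH, while the base URL `http://repository.example.org/oai` and the helper name `oai_request` are placeholders invented for this example.

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a base URL, a verb, and arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"

# Placeholder base URL; substitute one taken from the registry.
BASE = "http://repository.example.org/oai"

# Which metadata schemas does this provider expose?
print(oai_request(BASE, "ListMetadataFormats"))

# Harvest records in simple Dublin Core.
print(oai_request(BASE, "ListRecords", metadataPrefix="oai_dc"))
```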
What's an OpenURL?
The OpenURL provides a standardized format for transporting bibliographic metadata about objects between information services
Provides a basis for building services via the notion of an extended service-link, which moves beyond the classic notion of a reference link (a link from metadata to the full-content described by the metadata)
"The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material."
-- OpenURL Overview, SFX website
OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a "link server" or "link resolver"
Link server defines the user context
Takes source citation and determines whether a user has access
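The last two points (the link server holds the user context and decides access from the source citation) can be illustrated with a toy resolver. Everything here is hypothetical: the holdings table, its field names, the institution names, and the `resolve` function are invented to show the idea and do not describe SFX or any real link server.

```python
def resolve(citation, user_context, holdings):
    """Return the targets this user may follow for this citation.

    citation:     dict with "issn" and "year"
    user_context: dict with "institution"
    holdings:     list of dicts describing licensed coverage
    """
    links = []
    for entry in holdings:
        same_journal = entry["issn"] == citation["issn"]
        in_range = entry["from_year"] <= citation["year"] <= entry["to_year"]
        licensed = user_context["institution"] in entry["institutions"]
        if same_journal and in_range and licensed:
            links.append(entry["target"])
    return links

# Hypothetical holdings known to the link server.
HOLDINGS = [
    {"issn": "1234-5678", "from_year": 1995, "to_year": 2005,
     "institutions": {"Example University"}, "target": "publisher-full-text"},
    {"issn": "1234-5678", "from_year": 1980, "to_year": 1999,
     "institutions": {"Example University", "Sample College"},
     "target": "aggregator-archive"},
]

citation = {"issn": "1234-5678", "year": 1998}
print(resolve(citation, {"institution": "Example University"}, HOLDINGS))
print(resolve(citation, {"institution": "Sample College"}, HOLDINGS))
```

The same citation yields different links for different users, which is the point of keeping the user context on the resolver rather than in the source record.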
Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full-text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
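The example link above is a resolver address followed by citation metadata as key/value pairs. A short sketch of building and reading such a link follows, using the slide's own example values; the helper names `build_openurl` and `parse_openurl` are invented for the illustration, and the sketch covers only this simple key/value style of OpenURL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def build_openurl(resolver_base, citation):
    """Append citation metadata to a link resolver's base address."""
    return f"{resolver_base}?{urlencode(citation)}"

def parse_openurl(openurl):
    """Recover the citation metadata a resolver would receive."""
    query = urlsplit(openurl).query
    return {key: values[0] for key, values in parse_qs(query).items()}

citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}
link = build_openurl("http://sfxserver.uni.edu/sfxmenu", citation)
print(link)
print(parse_openurl(link))
```

Because the metadata travels in the link itself, the same citation can be sent to a different institution's resolver by changing only the base address.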
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies
Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional "buddy system"
How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability
Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility
Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general
Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?
Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values
Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
"Metadata aging" is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process
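Currency and lag, as defined above, are both comparisons between timestamps. The sketch below assumes such timestamps are available for the object and its metadata, which the slides do not promise; the function names `is_stale` and `lag` are made up for the illustration.

```python
from datetime import datetime, timedelta

def is_stale(object_modified, metadata_modified):
    """Currency problem: the object changed after its metadata was last updated."""
    return object_modified > metadata_modified

def lag(object_released, metadata_available):
    """Lag: how long the object circulated before metadata became available."""
    delta = metadata_available - object_released
    # If the metadata came first, there was no lag.
    return max(delta, timedelta(0))

# Hypothetical timestamps for one object.
released = datetime(2007, 3, 1)
described = datetime(2007, 3, 15)
revised = datetime(2008, 1, 10)

print(lag(released, described))      # 14 days, 0:00:00
print(is_stale(revised, described))  # True
```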
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
CrosswalksCrosswalks
In general Semantic mapping of elements In general Semantic mapping of elements between source and target metadata standardsbetween source and target metadata standards
The process of metadata conversion specification The process of metadata conversion specification includes transformations required to convert a includes transformations required to convert a metadata record content to another format metadata record content to another format includingincludingndash Element to element mappingElement to element mappingndash Hierarchy and object resolutionHierarchy and object resolutionndash Metadata content conversionsMetadata content conversionsndash Stylesheets can be created to transform Stylesheets can be created to transform
metadata based on crosswalksmetadata based on crosswalks
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4747
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4848
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4949
Available CrosswalksAvailable Crosswalks
Library of CongressLibrary of Congressndash httpwwwlocgovmarcmarcdoczhtmlhttpwwwlocgovmarcmarcdoczhtml
MITMITndash httplibrariesmiteduguideshttplibrariesmiteduguides
subjectsmetadatamappingshtmlsubjectsmetadatamappingshtml GettyGetty
ndash httpwwwgettyeduresearchhttpwwwgettyeduresearchconducting_researchstandardsconducting_researchstandardsintrometadatacrosswalkshtmlintrometadatacrosswalkshtml
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5050
Problems With Converted Problems With Converted RecordsRecords
Differences in granularity (complex Differences in granularity (complex vs simple scheme)vs simple scheme)ndash Some data might be lostSome data might be lostndash Differences in semantics can occurDifferences in semantics can occurndash Differences in use of content standards Differences in use of content standards
make sharing sometimes problematicmake sharing sometimes problematicndash Properties may vary (eg repeatability)Properties may vary (eg repeatability)
Converting everything may not Converting everything may not always be the best solutionalways be the best solution
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5151
ExampleMappingExampleMappingMODStitle to DCtitleMODStitle to DCtitle
Includes attribute for type of titleIncludes attribute for type of titlendash AbbreviatedAbbreviatedndash TranslatedTranslatedndash AlternativeAlternativendash UniformUniform
Other attributesOther attributesndash IDauthoritydisplayLabelxLinkIDauthoritydisplayLabelxLink
Subelements title partName Subelements title partName partNumber nonSortpartNumber nonSort
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5252
Mapping MODStitle toMapping MODStitle toDCtitleDCtitle
DC has one element refinementDC has one element refinement
AlternativeAlternativendash DC title has no substructure MODS allows for DC title has no substructure MODS allows for
subelements for partNumber partNamesubelements for partNumber partName Best practice statement in DC-Lib says to Best practice statement in DC-Lib says to
include initial article include initial article ndash MODS parses intoltnonSortgtMODS parses intoltnonSortgt
MODS can link to a title in an authority file MODS can link to a title in an authority file if desiredif desired
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5353
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5454
ExerciseExercise
Evaluate a small set of human and Evaluate a small set of human and machine-created metadatamachine-created metadata
OAI In PracticeOAI In Practice
The UIUC OAI-PMH Data Provider RegistryThe UIUC OAI-PMH Data Provider Registryndash httpgitagraingeruiuceduregistrysearchformasp
Includes most known data providersIncludes most known data providers Link on home page to Service ProvidersLink on home page to Service Providers Provides multiple reports sample records Provides multiple reports sample records
browses search etcbrowses search etc Ex Show report from left hand menu Ex Show report from left hand menu
ldquoDistinct Metadata Schemasrdquo ldquoDistinct Metadata Schemasrdquo ndash httpgitagraingeruiuceduregistryListSchemasasp ndash Choose a schema look for providers and sample records Choose a schema look for providers and sample records
Metadata Standards amp ApplicationsMetadata Standards amp Applications 1919
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website
OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes source citation and determines whether a user has access
Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
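The example link above is a resolver base URL followed by citation metadata as key/value pairs. A minimal sketch of how a source might assemble such a link, using only the keys that appear in the example; the function name `build_openurl` is illustrative and not part of any OpenURL specification:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_openurl(resolver_base, citation):
    """Append citation metadata to a link resolver base URL as a query string."""
    return f"{resolver_base}?{urlencode(citation)}"


# Citation keys and values are taken from the example link on the slide.
citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}
openurl = build_openurl("http://sfxserver.uni.edu/sfxmenu", citation)
print(openurl)

# On the other side, a link resolver would read the same keys back out
# of the query string before deciding which services to offer.
parsed = parse_qs(urlsplit(openurl).query)
print(parsed["issn"], parsed["spage"])
```

The source only transports the metadata; the choice of the appropriate copy is left to the resolver.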
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies
Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional “buddy system”
How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities
– General strategies can improve interoperability
Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility
Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”
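One way to put a number on the second statement is to count, for each element, the share of records that carry a non-empty value. A minimal sketch, assuming records are already available as dictionaries of element name to list of values; the sample records and the name `element_coverage` are made up for illustration:

```python
from collections import Counter


def element_coverage(records, element_set):
    """Return, per element, the fraction of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for element in element_set:
            values = record.get(element, [])
            # Whitespace-only values are treated as missing.
            if any(value.strip() for value in values):
                counts[element] += 1
    total = len(records)
    if total == 0:
        return {element: 0.0 for element in element_set}
    return {element: counts[element] / total for element in element_set}


# Hypothetical sample records, not taken from any real collection.
records = [
    {"title": ["Annual report"], "language": ["en"], "format": ["text/html"]},
    {"title": ["Field notes"], "language": [""], "format": []},
    {"title": ["Survey data"], "language": ["en"]},
    {"title": ["Map of the region"], "format": ["image/jpeg"]},
]
coverage = element_coverage(records, ["title", "language", "format"])
for element, share in coverage.items():
    print(f"{element}: {share:.0%}")
```

A coverage figure says nothing about whether the values are accurate, so it addresses only this one criterion.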
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general
Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?
Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values
Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process
Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
   Includes some formatting and color coding
– Disadvantages
   Assumes consistency/predictability
   Difficult to determine extent of problems found
   Tedious at best
Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
   Better sorting and control by reviewer
– Disadvantages
   Unwieldy for large files
   Requires sustained focus from reviewer
   Requires translation into tab-delimited file
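The translation into a tab-delimited file can be scripted. A minimal sketch, assuming the records have already been parsed into a dictionary keyed by record identifier; the layout (one row per record, element, and value) and the name `records_to_tsv` are one possible choice, not a prescribed format:

```python
import csv
import io


def records_to_tsv(records):
    """Flatten records into tab-delimited rows: record_id, element, value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            # Repeated elements produce one row per value.
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


# Hypothetical record used only to show the output shape.
records = {"rec1": {"title": ["A"], "date": ["1998", "n.d."]}}
tsv = records_to_tsv(records)
print(tsv)
```

One row per value keeps repeated elements visible, so the reviewer can sort by element or by value.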
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
   View of several data dimensions simultaneously
   Reviewer controls data display
   Tends to pull reviewer focus to anomalies
   Handles fairly large files at one time while allowing subset views
   Display manipulation possible without programmers
– Disadvantages
   High cost of software
   Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
2 records without language element
Format element present inconsistently
Easy to rescale axis on the fly and scroll through records
Table View
Non-empty “no information” values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
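The same kind of review of date values can be approximated in a script. A rough sketch that sorts values into W3CDTF-shaped strings, “no information” placeholders, and everything else; the placeholder list is an assumption and would need adjusting to a real collection, and the pattern checks only the shape of the value, not whether the month or day is in range:

```python
import re
from collections import Counter

# Shapes allowed by W3CDTF: YYYY, YYYY-MM, YYYY-MM-DD, and full date-times
# with a time zone designator.
W3CDTF = re.compile(
    r"^\d{4}"                            # YYYY
    r"(-\d{2}"                           # -MM
    r"(-\d{2}"                           # -DD
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?"    # Thh:mm[:ss[.s]]
    r"(Z|[+-]\d{2}:\d{2}))?"             # Z or +hh:mm / -hh:mm
    r")?)?$"
)

# Assumed placeholder strings; illustrative only.
PLACEHOLDERS = {"n.d.", "unknown", "none", "n/a", "no date"}


def classify_date(value):
    """Label one date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"


# Hypothetical values of the kind a table view might reveal.
values = ["1998", "n.d.", "May 1998", "2004-07-16", "unknown", " "]
summary = Counter(classify_date(value) for value in values)
print(summary)
```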
Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies
… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”
“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record's content to another format, including
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
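Element-to-element mapping is the simplest of these steps to show in code. A minimal sketch using only the two equivalences named in the Godby quotation (title to MARC 245, author/creator to MARC 100); the subfield codes, the sample record, and the name `crosswalk_record` are assumptions made for illustration, and a real crosswalk would cover many more elements:

```python
# Source element -> (MARC tag, subfield code). The subfield "a" is an
# assumption; the quotation names only the tags.
CROSSWALK = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
}


def crosswalk_record(dc_record, crosswalk=CROSSWALK):
    """Map source elements to target fields and report what has no target."""
    fields, unmapped = [], []
    for element, values in dc_record.items():
        target = crosswalk.get(element)
        if target is None:
            # No equivalence defined: this content would be lost.
            unmapped.append(element)
            continue
        tag, subfield = target
        for value in values:
            fields.append((tag, subfield, value))
    return fields, unmapped


# Hypothetical source record.
dc_record = {
    "title": ["An example title"],
    "creator": ["Doe, Jane"],
    "rights": ["Public domain"],
}
fields, unmapped = crosswalk_record(dc_record)
print(fields)
print("No mapping for:", unmapped)
```

Reporting the unmapped elements makes the limits of the crosswalk visible instead of dropping data silently. The sketch does not handle hierarchy, content conversion, or target fields that cannot repeat.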
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS:title to DC:title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS:title to DC:title
DC has one element refinement
– Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
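The flattening described above can be sketched as follows. The sample XML, the punctuation used to join the parts, and the name `mods_title_to_dc` are assumptions for illustration; only the `alternative` type is given its own target here, and the other title types fall back to plain title as a simplification:

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}


def mods_title_to_dc(title_info):
    """Flatten one MODS <titleInfo> into a (dc_element, value) pair."""
    def text(name):
        node = title_info.find(f"mods:{name}", MODS_NS)
        return node.text.strip() if node is not None and node.text else ""

    # Rejoin the initial article that MODS keeps apart in <nonSort>.
    value = " ".join(part for part in (text("nonSort"), text("title")) if part)
    # DC title has no substructure, so part number and part name are
    # appended as plain text (the ". " separator is an assumption).
    for part in (text("partNumber"), text("partName")):
        if part:
            value += f". {part}"
    element = "alternative" if title_info.get("type") == "alternative" else "title"
    return element, value


# Hypothetical MODS fragment.
MODS_XML = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Olympics</title>
    <partNumber>Part 2</partNumber>
    <partName>A history</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Olympic Games</title>
  </titleInfo>
</mods>"""

root = ET.fromstring(MODS_XML)
dc_titles = [mods_title_to_dc(ti) for ti in root.findall("mods:titleInfo", MODS_NS)]
print(dc_titles)
```

The conversion runs in one direction only: once the parts are joined into a single string, the MODS substructure cannot be recovered from the DC value.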
Exercise
Evaluate a small set of human- and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2020
Whatrsquos an OpenURLWhatrsquos an OpenURL
The OpenURL provides a standardized The OpenURL provides a standardized format for transporting bibliographic format for transporting bibliographic metadata about objects between metadata about objects between information servicesinformation services
Provides a basis for building services via Provides a basis for building services via the notion of an the notion of an extended service-linkextended service-link which moves beyond the classic notion of which moves beyond the classic notion of a a reference linkreference link (a link from metadata to (a link from metadata to the full-content described by the the full-content described by the metadata)metadata)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2121
ldquoldquoThe OpenURL standard enables a user who has The OpenURL standard enables a user who has retrieved an article citation for example to obtain retrieved an article citation for example to obtain immediate access to the most appropriate copy of immediate access to the most appropriate copy of that object through the implementation of extended that object through the implementation of extended linking services The selection of the best copy is linking services The selection of the best copy is based on user and organizational preferences based on user and organizational preferences regarding the location of the copy its cost and regarding the location of the copy its cost and agreements with information suppliers and similar agreements with information suppliers and similar considerations This selection occurs without the considerations This selection occurs without the knowledge of the user it is made possible by the knowledge of the user it is made possible by the transport of metadata with the OpenURL link from transport of metadata with the OpenURL link from the source citation to a resolver (the link server) the source citation to a resolver (the link server) which stores the preference information and the which stores the preference information and the links to the appropriate materialrdquolinks to the appropriate materialrdquo
--OpenURL Overview SFX website--OpenURL Overview SFX website
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL CharacteristicsOpenURL Characteristics
Protocol operates between an Protocol operates between an information resource and a service information resource and a service componentcomponent
Service component is called a ldquolink Service component is called a ldquolink serverrdquo or ldquolink resolverrdquoserverrdquo or ldquolink resolverrdquo
Link server defines the user contextLink server defines the user context Takes source citation and determines Takes source citation and determines
whether a user has accesswhether a user has access
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2323
Distinguishing UsersDistinguishing Users
Uses information stored in a cookie Uses information stored in a cookie (the CookiePusher mechanism)(the CookiePusher mechanism)
Uses information contained in a Uses information contained in a digital certificate such as the one digital certificate such as the one proposed by the DLF digital proposed by the DLF digital certificates prototype projectcertificates prototype project
Identifies a users IP addressIdentifies a users IP address Obtains user attributes via the Obtains user attributes via the
Shibboleth frameworkShibboleth framework
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value

Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
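Element-to-element mapping, the simplest of these steps, can be sketched as a lookup table. The table below holds only the two equivalences named in the Godby quotation (title to 245, author to 100) and is illustrative, not a complete or authoritative crosswalk. The function name and the sample record are hypothetical.

```python
# Illustrative crosswalk table: only the two equivalences quoted in the slides.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",  # the quotation's "author"; the DC element is named creator
}


def crosswalk(dc_record, table=DC_TO_MARC):
    """Convert (dc_element, value) pairs to (marc_tag, value) pairs.

    Returns the converted fields plus the source pairs that had no mapping,
    so that anything lost in conversion stays visible.
    """
    converted, unmapped = [], []
    for element, value in dc_record:
        tag = table.get(element)
        if tag is None:
            unmapped.append((element, value))
        else:
            converted.append((tag, value))
    return converted, unmapped


# Hypothetical source record.
record = [
    ("title", "Sample title"),
    ("creator", "Doe, Jane"),
    ("coverage", "North America"),
]
fields, leftovers = crosswalk(record)
print(fields)
print(leftovers)
```

Returning the unmapped pairs instead of silently dropping them is a design choice made here so that the limits of a crosswalk show up in the output.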

Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
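The flattening described here can be sketched in a few lines of Python. The sample record, the function names, and the ". " separator between title parts are assumptions made for illustration. So is the choice to fall back to plain title for every title type other than alternative, which discards the type information.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical MODS fragment with a main title and an alternative title.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Collected Letters</title>
    <partNumber>Volume 2</partNumber>
    <partName>1850-1860</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Letters</title>
  </titleInfo>
</mods>"""


def flatten_title(title_info):
    """Join the MODS subelements into one string, initial article first."""
    def text(name):
        node = title_info.find("mods:" + name, MODS_NS)
        return node.text.strip() if node is not None and node.text else ""

    # nonSort holds the initial article; the flattened DC value keeps it in front.
    head = " ".join(part for part in (text("nonSort"), text("title")) if part)
    parts = [head] + [text(name) for name in ("partNumber", "partName")]
    return ". ".join(part for part in parts if part)


def mods_titles_to_dc(xml_text):
    """Return (dc_element, value) pairs for every titleInfo in a MODS record."""
    root = ET.fromstring(xml_text)
    pairs = []
    for title_info in root.findall("mods:titleInfo", MODS_NS):
        # Only type="alternative" has a DC refinement; other types become title.
        element = "alternative" if title_info.get("type") == "alternative" else "title"
        pairs.append((element, flatten_title(title_info)))
    return pairs


print(mods_titles_to_dc(SAMPLE))
```

The conversion runs in one direction only: once partNumber and partName are folded into a single string, the MODS substructure cannot be recovered from the DC value.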

Exercise
Evaluate a small set of human- and machine-created metadata

“The OpenURL standard enables a user who has retrieved an article citation, for example, to obtain immediate access to the most appropriate copy of that object through the implementation of extended linking services. The selection of the best copy is based on user and organizational preferences regarding the location of the copy, its cost, and agreements with information suppliers, and similar considerations. This selection occurs without the knowledge of the user; it is made possible by the transport of metadata with the OpenURL link from the source citation to a resolver (the link server), which stores the preference information and the links to the appropriate material.”
-- OpenURL Overview, SFX website

OpenURL Characteristics
Protocol operates between an information resource and a service component
Service component is called a “link server” or “link resolver”
Link server defines the user context
Takes source citation and determines whether a user has access

Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user's IP address
Obtains user attributes via the Shibboleth framework

Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal

OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
– http://www.ukoln.ac.uk/distributed-systems/openurl/
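The example link is a resolver address followed by citation metadata as key-value pairs. As a minimal sketch, it can be assembled and taken apart with Python's standard `urllib.parse` module. The resolver address and the keys come from the slide's example; the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address of the link resolver, taken from the example on the slide.
RESOLVER = "http://sfxserver.uni.edu/sfxmenu"


def build_openurl(resolver, citation):
    """Append citation metadata to the resolver address as a query string."""
    return resolver + "?" + urlencode(citation)


def parse_openurl(url):
    """Recover the citation metadata a link resolver would receive."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; the example has one value per key.
    return {key: values[0] for key, values in parse_qs(query).items()}


citation = {
    "issn": "1234-5678",
    "date": "1998",
    "volume": "12",
    "issue": "2",
    "spage": "134",
}
url = build_openurl(RESOLVER, citation)
print(url)
print(parse_openurl(url))
```

The source of the citation only has to know the resolver's address. Deciding which copy the user may access is left to the link server, as described in the preceding slides.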

Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies

Beginning to Define Quality
Experience of the library community -- BIBCO & NACO
– Agreed-upon standards for library quality
– Training and documentation in support of practitioners
– Review and enforcement of standards by means of institutional “buddy system”

How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities; general strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”
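The second statement lends itself to a simple measurement: for each expected element, the share of records that carry at least one non-blank value. This is a rough sketch; the sample records, the list of expected elements, and the function name are hypothetical, and the slides do not prescribe this particular measure.

```python
# Hypothetical records: each maps element names to lists of values.
RECORDS = [
    {"title": ["Annual report"], "date": ["1998"], "language": ["en"]},
    {"title": ["Field notes"], "date": [""]},
    {"title": ["Survey data"], "language": ["en"]},
    {"title": ["Minutes"], "date": ["2001-05"], "language": ["fr"]},
]

# Elements a (hypothetical) application profile expects in every record.
EXPECTED = ["title", "date", "language"]


def element_coverage(records, expected):
    """Fraction of records holding at least one non-blank value per element."""
    coverage = {}
    for element in expected:
        filled = sum(
            1 for record in records
            if any(value.strip() for value in record.get(element, []))
        )
        coverage[element] = filled / len(records)
    return coverage


coverage = element_coverage(RECORDS, EXPECTED)
for element, share in coverage.items():
    print(f"{element}: {share:.0%}")
```

A blank value counts as missing here, so an element that is present but empty does not inflate the score.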

Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it's done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
    Includes some formatting and color coding
– Disadvantages
    Assumes consistency/predictability
    Difficult to determine extent of problems found
    Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
    Better sorting and control by reviewer
– Disadvantages
    Unwieldy for large files
    Requires sustained focus from reviewer
    Requires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2222
OpenURL CharacteristicsOpenURL Characteristics
Protocol operates between an Protocol operates between an information resource and a service information resource and a service componentcomponent
Service component is called a ldquolink Service component is called a ldquolink serverrdquo or ldquolink resolverrdquoserverrdquo or ldquolink resolverrdquo
Link server defines the user contextLink server defines the user context Takes source citation and determines Takes source citation and determines
whether a user has accesswhether a user has access
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2323
Distinguishing UsersDistinguishing Users
Uses information stored in a cookie Uses information stored in a cookie (the CookiePusher mechanism)(the CookiePusher mechanism)
Uses information contained in a Uses information contained in a digital certificate such as the one digital certificate such as the one proposed by the DLF digital proposed by the DLF digital certificates prototype projectcertificates prototype project
Identifies a users IP addressIdentifies a users IP address Obtains user attributes via the Obtains user attributes via the
Shibboleth frameworkShibboleth framework
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?
Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises
Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values
Timeliness
Currency
  – Target object changes but metadata does not
Lag
  – Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
  – Librarians: once and it’s done
  – Technologists: metadata as an iterative process
Accessibility
Barriers to accessibility may be economic, technical, or organizational
  – Metadata as “premium” or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)
Evaluating Metadata (1)
Random sampling (XMLSpy)
  – Advantages
      Includes some formatting and color coding
  – Disadvantages
      Assumes consistency/predictability
      Difficult to determine extent of problems found
      Tedious at best
Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
  – Advantages
      Better sorting and control by reviewer
  – Disadvantages
      Unwieldy for large files
      Requires sustained focus from reviewer
      Requires translation into tab-delimited file
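The translation into a tab-delimited file can be sketched as a flattening step that writes one row per element value, so a reviewer can sort by element or by value in a spreadsheet. This is a minimal sketch under stated assumptions: the input layout (a dictionary of record IDs to element/value lists), the column names, and the function name `records_to_tsv` are all hypothetical.

```python
import csv
import io


def records_to_tsv(records):
    """Flatten {record_id: {element: [values]}} into tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            # One row per value keeps repeated elements visible.
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


# Hypothetical sample records.
records = {
    "rec001": {"title": ["Map of Ithaca"], "date": ["1998"]},
    "rec002": {"title": ["Field notes"], "subject": ["Botany", "Ecology"]},
}

tsv_text = records_to_tsv(records)
print(tsv_text)
```

The "long" layout (record, element, value) is a design choice for this sketch; a "wide" layout with one column per element is harder to build when elements repeat.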
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
  – Advantages
      View of several data dimensions simultaneously
      Reviewer controls data display
      Tends to pull reviewer focus to anomalies
      Handles fairly large files at one time while allowing subset views
      Display manipulation possible without programmers
  – Disadvantages
      High cost of software
      Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
[Figure: scatter plot of element names against record IDs]
Missing Elements (Scatter Plot)
[Figure: scatter plot with the following annotations]
  – 2 records without language element
  – format element present inconsistently
  – Easy to rescale axis on the fly and scroll through records
Table View
[Figure: table view with the following annotations]
  – Non-empty “no information” values that may confuse end users
  – Only DC Date elements are selected for display
  – The only W3CDTF syntax present is four digits
  – Sorted by element value
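The two problems flagged in the table view, placeholder values and date syntax, can also be checked programmatically. The sketch below sorts date values into W3CDTF-conformant strings, "no information" placeholders, and everything else. The regular expression follows the W3CDTF granularities (year, year-month, full date, and date with time and zone); the placeholder list and the function name `classify_date` are assumptions made for illustration and would need to be tuned to a real collection.

```python
import re

# W3CDTF: YYYY, YYYY-MM, YYYY-MM-DD, or a full date followed by
# Thh:mm[:ss[.s]] and a time zone designator (Z or +hh:mm / -hh:mm).
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?$"
)

# Hypothetical list of "no information" strings; not from the slides.
PLACEHOLDERS = {"unknown", "n/a", "none", "n.d.", "no date", "not available"}


def classify_date(value):
    """Return 'empty', 'placeholder', 'w3cdtf', or 'other' for a date value."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"


values = ["1998", "2004-07-16", "unknown", "ca. 1920", "July 1999", "n.d."]
for value in values:
    print(f"{value!r}: {classify_date(value)}")
```

The pattern checks syntax only; it does not reject impossible dates such as 30 February.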
Improving Metadata Quality …
Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application Profiles
  – Training materials, tools, methodologies
… Over Time
Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the “general good”
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”
“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including:
  – Element to element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks
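Element-to-element mapping can be sketched as a lookup table applied to each record. The example below is illustrative only: the two table rows come from the mappings quoted above (Dublin Core title to MARC 245, Dublin Core author, which DC names `creator`, to MARC 100), while the subfield code `a`, the sample record, and the function name `crosswalk` are assumptions. Hierarchy resolution, content conversion, and repeatability rules are not handled.

```python
# Element-to-element crosswalk table. Only the title and creator rows come
# from the quoted mappings; the subfield code "a" is an assumption.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
}


def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Map a DC record to (tag, subfield, value) triples.

    Returns the converted fields plus the names of elements that had no
    mapping, so that data loss is reported rather than silent.
    """
    fields, unmapped = [], []
    for element, values in dc_record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)
            continue
        tag, subfield = target
        for value in values:
            fields.append((tag, subfield, value))
    return fields, unmapped


# Hypothetical sample record.
dc_record = {
    "title": ["Sample title"],
    "creator": ["Doe, Jane"],
    "coverage": ["Ithaca, N.Y."],
}

fields, unmapped = crosswalk(dc_record)
print(fields)
print("No mapping for:", unmapped)
```

Keeping the table as data rather than code is one way to make the expert-identified equivalences machine-processable, and it is the same information a stylesheet would encode.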
Available Crosswalks
Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards sometimes make sharing problematic
  – Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS title to DC title
MODS title includes an attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
Other attributes
  – ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
DC has one element refinement: Alternative
  – DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
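The loss of substructure in this mapping can be seen in a small sketch that flattens a MODS title into a single DC title string. It assumes the MODS version 3 namespace and the `titleInfo` wrapper element used in MODS records; the joining punctuation (". " between parts), the treatment of every non-alternative type as a plain title, the sample record, and the function name `mods_title_to_dc` are assumptions of this example rather than rules from a published crosswalk.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}


def mods_title_to_dc(title_info):
    """Flatten one MODS <titleInfo> element into (dc_element, string)."""

    def text_of(name):
        node = title_info.find(f"mods:{name}", MODS_NS)
        return node.text.strip() if node is not None and node.text else ""

    # Keep the initial article by re-attaching <nonSort> to the title.
    title = " ".join(p for p in (text_of("nonSort"), text_of("title")) if p)
    # DC title has no substructure, so parts are appended to the string.
    for part in (text_of("partNumber"), text_of("partName")):
        if part:
            title = f"{title}. {part}"
    # Only type="alternative" has a DC refinement; other types become title.
    element = "alternative" if title_info.get("type") == "alternative" else "title"
    return element, title


# Hypothetical sample record.
sample = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Collected Letters</title>
    <partNumber>Volume 2</partNumber>
    <partName>Later Years</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Letters</title>
  </titleInfo>
</mods>
""".strip()

root = ET.fromstring(sample)
results = [mods_title_to_dc(t) for t in root.findall("mods:titleInfo", MODS_NS)]
for element, value in results:
    print(f"{element}: {value}")
```

Once flattened, the part number and part name can no longer be told apart from the rest of the title, which is the granularity loss the mapping implies.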
Exercise
Evaluate a small set of human-created and machine-created metadata
Distinguishing Users
Uses information stored in a cookie (the CookiePusher mechanism)
Uses information contained in a digital certificate, such as the one proposed by the DLF digital certificates prototype project
Identifies a user’s IP address
Obtains user attributes via the Shibboleth framework
Examples of Extended Service Links
From a record in an abstracting and indexing database (A&I) to the full text described by the record
From a record describing a book in a library catalogue to a description of the same book in an Internet book shop
From a reference in a journal article to a record matching that reference in an A&I database
From a citation in a journal article to a record in a library catalogue that shows the library holdings of the cited journal
OpenURL Examples & Demo
http://sfxserver.uni.edu/sfxmenu?issn=1234-5678&date=1998&volume=12&issue=2&spage=134
An OpenURL demo
  – http://www.ukoln.ac.uk/distributed-systems/openurl/
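An OpenURL of the kind shown above is a link resolver's base address followed by citation metadata as key/value pairs. The sketch below assembles one with Python's standard library; the resolver address and the keys mirror the example URL above, and the function name `build_openurl` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(resolver_base, **citation):
    """Append citation metadata to a link resolver's base URL."""
    return f"{resolver_base}?{urlencode(citation)}"


url = build_openurl(
    "http://sfxserver.uni.edu/sfxmenu",
    issn="1234-5678",
    date="1998",
    volume="12",
    issue="2",
    spage="134",
)
print(url)

# A resolver would parse the same keys back out of the query string.
citation = parse_qs(urlsplit(url).query)
print(citation["issn"], citation["spage"])
```

Because the citation travels in the link rather than pointing at a fixed target, the same metadata can be sent to whichever resolver is appropriate for the user's institution.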
Defining and Ensuring Metadata Quality
What constitutes quality?
Techniques for evaluating and enforcing consistency and predictability
Automated metadata creation: advantages and disadvantages
Metadata maintenance strategies
Beginning to Define Quality
Experience of the library community--BIBCO & NACO
  – Agreed-upon standards for library quality
  – Training and documentation in support of practitioners
  – Review and enforcement of standards by means of institutional “buddy system”
How Does Quality Happen?
Lessons from the library community
  – Quality is quantifiable and measurable
  – To be effective, enforcement of standards of quality must take place at the community level
Furthermore
  – Data problems are not unique to particular communities
  – General strategies can improve interoperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2424
Examples of Extended Service Examples of Extended Service LinksLinks
From a record in an abstracting and indexing From a record in an abstracting and indexing database (AampI) to the full-text described by the database (AampI) to the full-text described by the recordrecord
From a record describing a book in a library From a record describing a book in a library catalogue to a description of the same book in an catalogue to a description of the same book in an Internet book shopInternet book shop
From a reference in a journal article to a record From a reference in a journal article to a record matching that reference in an AampI databasematching that reference in an AampI database
From a citation in a journal article to a record in a From a citation in a journal article to a record in a library catalogue that shows the library holdings library catalogue that shows the library holdings of the cited journalof the cited journal
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2525
OpenURL Examples amp DemoOpenURL Examples amp Demo
httpsfxserveruniedusfxmenuissn=1234-5678ampdate=1998ampvolume=12ampissue=2ampspage=134
An OpenURL demoAn OpenURL demondash httpwwwukolnacukdistributed-syste
msopenurl
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality Happen?
Lessons from the library community
– Quality is quantifiable and measurable
– To be effective, enforcement of standards of quality must take place at the community level
Furthermore
– Data problems are not unique to particular communities; general strategies can improve interoperability

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
“Metadata should describe the target objects as completely as economically feasible”
“Element set should be applied to the target object population as completely as possible”

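One way to put a number on the second statement is to count, for each element, the share of records that carry a non-empty value. A minimal sketch under stated assumptions: records are held as dictionaries of element name to list of values, and the function name and sample records are hypothetical:

```python
from collections import Counter

def element_completeness(records, elements):
    """Return, per element, the fraction of records with a non-empty value."""
    filled = Counter()
    for record in records:
        for name in elements:
            values = record.get(name, [])
            # An element counts only if at least one value is not blank.
            if any(v.strip() for v in values):
                filled[name] += 1
    total = len(records)
    return {name: (filled[name] / total if total else 0.0) for name in elements}

# Hypothetical sample records.
records = [
    {"title": ["Annual report"], "language": ["en"], "format": ["text/html"]},
    {"title": ["Field notes"], "language": [""]},
    {"title": ["Survey data"], "format": ["text/csv"]},
    {"title": ["Map of the harbor"], "language": ["en"]},
]
rates = element_completeness(records, ["title", "language", "format"])
print(rates)
```

A blank value is treated as missing here, which is a design choice: an empty element adds nothing for the user even though it is technically present.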
Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human-created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it’s done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  Includes some formatting and color coding
– Disadvantages
  Assumes consistency/predictability
  Difficult to determine extent of problems found
  Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  Better sorting and control by reviewer
– Disadvantages
  Unwieldy for large files
  Requires sustained focus from reviewer
  Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file

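Both the spreadsheet and the visual-analysis approaches need the records flattened into a tab-delimited file first. A minimal sketch of that translation step using only the Python standard library; the `<records>`/`<record id="…">` wrapper and the sample values are assumptions made for illustration, and a real harvest would have its own envelope around the Dublin Core elements:

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical input: a simple wrapper around unqualified Dublin Core.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="rec1"><dc:title>Annual report</dc:title><dc:date>2004</dc:date></record>
  <record id="rec2"><dc:title>Field notes</dc:title><dc:language>en</dc:language></record>
</records>"""

def to_tab_delimited(xml_text):
    """Flatten each DC element into one (record id, element, value) row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record in ET.fromstring(xml_text).iter("record"):
        for child in record:
            # ElementTree reports names as '{namespace}local'; keep only DC ones.
            if child.tag.startswith("{" + DC_NS + "}"):
                local_name = child.tag.split("}", 1)[1]
                writer.writerow([record.get("id"), local_name, (child.text or "").strip()])
    return out.getvalue()

tsv = to_tab_delimited(SAMPLE)
print(tsv)
```

One row per element occurrence, rather than one row per record, keeps repeated elements visible and lets the reviewer sort by element name or by value.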
Element Names vs. Record IDs (Scatter Plot)

Missing Elements (Scatter Plot)
– 2 records without language element
– format element present inconsistently
– Easy to rescale axis on the fly and scroll through records

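The same question the scatter plot answers visually (which records lack which elements) can be asked in a few lines of code. A minimal sketch, assuming each record has already been reduced to the set of element names it contains; the sample data and function names are hypothetical:

```python
def presence_matrix(records, elements):
    """Map each record id to a row of booleans: is the element present?"""
    return {
        rec_id: [name in found for name in elements]
        for rec_id, found in records.items()
    }

def missing(records, element):
    """List the ids of records that lack the given element."""
    return sorted(rec_id for rec_id, found in records.items() if element not in found)

# Hypothetical sample: record id -> set of element names found in that record.
records = {
    "rec1": {"title", "date", "language", "format"},
    "rec2": {"title", "date"},
    "rec3": {"title", "language"},
    "rec4": {"title", "date", "format"},
}
elements = ["title", "date", "language", "format"]

for rec_id, row in presence_matrix(records, elements).items():
    # One text row per record: 'x' where the element occurs, '.' where it is absent.
    print(rec_id, "".join("x" if present else "." for present in row))
print(missing(records, "language"))
```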
Table View
– Non-empty “no information” values that may confuse end users
– Only DC Date elements are selected for display
– The only W3CDTF syntax present is four digits
– Sorted by element value

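The two problems called out in the table view, placeholder values and date syntax, lend themselves to an automated first pass. A minimal sketch: the regular expression is a simplified check of W3CDTF syntax (it does not verify that a day exists in a given month), and the list of placeholder strings is an assumption to be adjusted per collection:

```python
import re

# Simplified W3CDTF check: YYYY, YYYY-MM, YYYY-MM-DD, optionally followed by
# a time with a time zone designator (Z or +hh:mm / -hh:mm).
W3CDTF = re.compile(
    r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

# Assumed placeholder strings; extend this set for a given collection.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d.", "undated"}

def classify_date(value):
    """Sort a date value into 'empty', 'placeholder', 'w3cdtf', or 'other'."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"

for sample in ["1998", "2004-07-16T19:20:30+01:00", "Unknown", "ca. 1920", ""]:
    print(repr(sample), classify_date(sample))
```

Values that land in "other" are not necessarily wrong; they are the ones a reviewer should look at by hand.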
Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks

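Element-to-element mapping, the first step listed above, can be recorded as a machine-processable table instead of a Web page or spreadsheet. A minimal sketch: the two rows are the equivalences named in the Godby quotation (with “author” read as the DC creator element, which is an assumption), and the function name and sample record are hypothetical. A real crosswalk has many more rows and usually subfield-level detail:

```python
# Equivalences named in the quotation; "author" is read here as DC creator.
DC_TO_MARC = {"title": "245", "creator": "100"}

def crosswalk(dc_record, mapping):
    """Apply an element-to-element mapping; return (converted, unmapped)."""
    converted, unmapped = [], []
    for element, values in dc_record.items():
        target = mapping.get(element)
        for value in values:
            if target is None:
                # No equivalent in the target scheme: report it, do not drop it silently.
                unmapped.append((element, value))
            else:
                converted.append((target, value))
    return converted, unmapped

# Hypothetical sample record.
dc_record = {
    "title": ["Field notes"],
    "creator": ["Doe, Jane"],
    "rights": ["Public domain"],
}
converted, unmapped = crosswalk(dc_record, DC_TO_MARC)
print(converted)
print(unmapped)
```

Returning the unmapped values separately makes the data loss visible, so the decision about what to do with them stays with a person.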
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

Example: Mapping MODS title to DC title
MODS title includes an attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

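The differences listed above can be seen by flattening a MODS title into a single DC string. A minimal sketch with the Python standard library: the sample record is hypothetical, the punctuation used to join the parts is an assumption, and sending every typed title to the Alternative refinement is one possible choice rather than a prescribed rule:

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Hypothetical sample record.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Annual Report</title>
    <partNumber>Part 2</partNumber>
    <partName>Finances</partName>
  </titleInfo>
  <titleInfo type="abbreviated"><title>Ann. Rep.</title></titleInfo>
</mods>"""

def mods_titles_to_dc(xml_text):
    """Flatten each MODS titleInfo into a ('title' | 'alternative', string) pair."""
    pairs = []
    for info in ET.fromstring(xml_text).findall(MODS + "titleInfo"):
        parts = [
            (info.findtext(MODS + name) or "").strip()
            for name in ("nonSort", "title", "partNumber", "partName")
        ]
        # Keep the initial article, then join the remaining pieces with '. '
        # (the joining punctuation is an assumption of this sketch).
        article, rest = parts[0], [p for p in parts[1:] if p]
        text = " ".join(filter(None, [article, ". ".join(rest)]))
        # A typed titleInfo (abbreviated, translated, alternative, uniform)
        # lands in the single DC refinement; the type distinction is lost.
        element = "alternative" if info.get("type") else "title"
        pairs.append((element, text))
    return pairs

dc_titles = mods_titles_to_dc(SAMPLE)
print(dc_titles)
```

The conversion only runs cleanly in one direction: once the parts are joined into a single string, the nonSort, partNumber, and partName boundaries cannot be recovered reliably.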
Exercise
Evaluate a small set of human- and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2626
Defining and Ensuring Metadata Defining and Ensuring Metadata QualityQuality
What constitutes qualityWhat constitutes quality Techniques for evaluating and Techniques for evaluating and
enforcing consistency and enforcing consistency and predictabilitypredictability
Automated metadata creation Automated metadata creation advantages and disadvantagesadvantages and disadvantages
Metadata maintenance strategiesMetadata maintenance strategies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo

Accuracy
Information provided in values should be correct and factual
Editing applied to
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it’s done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  Includes some formatting and color coding
– Disadvantages
  Assumes consistency/predictability
  Difficult to determine extent of problems found
  Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  Better sorting and control by reviewer
– Disadvantages
  Unwieldy for large files
  Requires sustained focus from reviewer
  Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file
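Both the spreadsheet and the visual-analysis approaches need the records translated into a tab-delimited file first. The following is a minimal sketch of that step, assuming simple Dublin Core records inside a hypothetical `<records>`/`<record id="...">` wrapper; real harvested data will be wrapped differently, and the column names are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical input: a wrapper element holding simple Dublin Core records.
SAMPLE_XML = """
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="rec1">
    <dc:title>Map of Ithaca</dc:title>
    <dc:date>1998</dc:date>
  </record>
  <record id="rec2">
    <dc:title>Field notes</dc:title>
    <dc:language>en</dc:language>
  </record>
</records>
"""

def to_tab_delimited(xml_text):
    """Flatten each DC element into one row: record id, element name, value."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record in root.findall("record"):
        for child in record:
            # ElementTree spells namespaced tags as "{namespace}localname".
            if child.tag.startswith("{" + DC_NS + "}"):
                name = child.tag.split("}", 1)[1]
                writer.writerow([record.get("id"), name, (child.text or "").strip()])
    return buffer.getvalue()

print(to_tab_delimited(SAMPLE_XML))
```

One row per element value (rather than one row per record) keeps repeated elements intact and suits plots of element names against record ids.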

Element Names vs Record Ids (Scatter Plot)

Missing Elements (Scatter Plot)
2 records without language element
Format element present inconsistently
Easy to rescale axis on the fly and scroll through records

Table View
Non-empty “no information” values that may confuse end users
Only DC Date elements are selected for display
The only W3CDTF syntax present is four digits
Sorted by element value
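The two problems called out in the table view, placeholder values and non-W3CDTF dates, can also be flagged programmatically. This is a rough sketch: the pattern covers the W3CDTF forms (year, year-month, full date, and date-time with a time zone designator) but checks syntax only, and the `PLACEHOLDERS` list is an assumed example rather than a standard vocabulary.

```python
import re
from collections import Counter

# Syntax-only check of the W3CDTF profile; it does not validate ranges,
# so a month of "13" would still pass.
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)

# Assumed examples of non-empty "no information" values.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}

def classify_date(value):
    """Label a date value as empty, placeholder, w3cdtf, or other."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"

# Hypothetical dc:date values pulled from a set of records.
dates = ["1998", "1998-07-16", "Unknown", "ca. 1900", "", "1997-07-16T19:20+01:00"]
summary = Counter(classify_date(value) for value in dates)
print(dict(summary))
```

Sorting the "other" group by value, as the table view does, is usually the quickest way to see which local date conventions are in use.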

Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation

Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access?”

“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
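Element-to-element mapping can be pictured as a lookup table applied to each record. The sketch below uses only the two equivalences named in the Godby quotation (title to MARC 245, author to MARC 100); the subfield code `a`, the dict-based record layout, and the function name `crosswalk` are assumptions made for illustration, not a published crosswalk.

```python
# Element-to-element table. The targets for title and creator follow the
# quotation above (its "author" is the Dublin Core creator element);
# the subfield code "a" is an assumption.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
}

def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Return (mapped MARC-like fields, source elements with no mapping)."""
    fields, unmapped = [], []
    for element, values in dc_record.items():
        target = mapping.get(element)
        if target is None:
            # Report what the table cannot place instead of dropping it silently.
            unmapped.append(element)
            continue
        tag, code = target
        for value in values:
            fields.append({"tag": tag, "subfields": {code: value}})
    return fields, unmapped

# Hypothetical source record.
fields, unmapped = crosswalk({
    "title": ["Map of Ithaca"],
    "creator": ["Doe, Jane"],
    "rights": ["Public domain"],
})
print(fields)
print("unmapped:", unmapped)
```

Keeping the table as data rather than code is one way to address the concern in the quotation that mapping knowledge is often not machine-processable.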

Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
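The repeatability problem can be made concrete with a small sketch. Everything here is hypothetical: the element names, the `target_repeatable` table, and the policy of keeping only the first value when the target element cannot repeat. The point is simply that a converter can report what did not fit instead of losing it silently.

```python
def convert_with_repeatability(source, target_repeatable):
    """Copy values into the target scheme, reporting values that do not fit.

    source: dict of element name -> list of values
    target_repeatable: dict of element name -> bool (may the element repeat?)
    """
    converted, lost = {}, {}
    for element, values in source.items():
        if element not in target_repeatable:
            lost[element] = list(values)        # no equivalent element at all
        elif target_repeatable[element] or len(values) <= 1:
            converted[element] = list(values)
        else:
            converted[element] = values[:1]     # keep the first value only
            lost[element] = values[1:]          # the rest has nowhere to go
    return converted, lost

# Hypothetical source record and target scheme.
source_record = {
    "creator": ["Doe, Jane", "Roe, Richard"],
    "title": ["Map of Ithaca"],
    "extent": ["3 p."],
}
converted, lost = convert_with_repeatability(
    source_record, {"creator": False, "title": False}
)
print("converted:", converted)
print("lost:", lost)
```

A report like `lost` also helps decide whether converting everything is worthwhile or whether some elements are better left in the source format.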

Example: Mapping MODS:title to DC:title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS:title to DC:title
DC has one element refinement: Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
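The loss of substructure can be sketched as a flattening step. The code below reads MODS `titleInfo` elements and produces a single string per title, re-attaching the initial article from `nonSort`. The joining punctuation and the rule that any typed `titleInfo` becomes the Alternative refinement are assumptions chosen for this illustration, not an official MODS-to-DC mapping, and the sample record is invented.

```python
import xml.etree.ElementTree as ET

MODS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical MODS fragment with a main title and an abbreviated title.
SAMPLE_MODS = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The</nonSort>
    <title>Atlas of New York</title>
    <partNumber>Part 2</partNumber>
    <partName>Tompkins County</partName>
  </titleInfo>
  <titleInfo type="abbreviated">
    <title>Atlas N.Y.</title>
  </titleInfo>
</mods>
"""

def text_of(parent, name):
    """Return the stripped text of a MODS child element, or ''."""
    child = parent.find("mods:" + name, MODS)
    return (child.text or "").strip() if child is not None else ""

def mods_titles_to_dc(xml_text):
    """Flatten each MODS titleInfo into a (DC element, string) pair."""
    root = ET.fromstring(xml_text)
    pairs = []
    for info in root.findall("mods:titleInfo", MODS):
        # Re-attach the initial article that MODS keeps apart in nonSort.
        lead = " ".join(
            p for p in (text_of(info, "nonSort"), text_of(info, "title")) if p
        )
        parts = [lead, text_of(info, "partNumber"), text_of(info, "partName")]
        flat = ". ".join(p for p in parts if p)
        # Assumption: untyped titleInfo is the main title; typed ones are variants.
        element = "alternative" if info.get("type") else "title"
        pairs.append((element, flat))
    return pairs

for element, value in mods_titles_to_dc(SAMPLE_MODS):
    print(f"{element}: {value}")
```

The conversion is one-way: once the parts are joined into a string, the distinction between title, part number, and part name cannot be recovered from the DC record.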

Exercise
Evaluate a small set of human and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2727
Beginning to Define QualityBeginning to Define Quality
Experience of the library Experience of the library community--BIBCO amp NACOcommunity--BIBCO amp NACOndash Agreed upon standards for library Agreed upon standards for library
qualityqualityndash Training and documentation in support Training and documentation in support
of practitionersof practitionersndash Review and enforcement of standards Review and enforcement of standards
by means of institutional ldquobuddy by means of institutional ldquobuddy systemrdquosystemrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
CrosswalksCrosswalks
In general Semantic mapping of elements In general Semantic mapping of elements between source and target metadata standardsbetween source and target metadata standards
The process of metadata conversion specification The process of metadata conversion specification includes transformations required to convert a includes transformations required to convert a metadata record content to another format metadata record content to another format includingincludingndash Element to element mappingElement to element mappingndash Hierarchy and object resolutionHierarchy and object resolutionndash Metadata content conversionsMetadata content conversionsndash Stylesheets can be created to transform Stylesheets can be created to transform
metadata based on crosswalksmetadata based on crosswalks
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4747
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4848
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4949
Available CrosswalksAvailable Crosswalks
Library of CongressLibrary of Congressndash httpwwwlocgovmarcmarcdoczhtmlhttpwwwlocgovmarcmarcdoczhtml
MITMITndash httplibrariesmiteduguideshttplibrariesmiteduguides
subjectsmetadatamappingshtmlsubjectsmetadatamappingshtml GettyGetty
ndash httpwwwgettyeduresearchhttpwwwgettyeduresearchconducting_researchstandardsconducting_researchstandardsintrometadatacrosswalkshtmlintrometadatacrosswalkshtml
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5050
Problems With Converted Problems With Converted RecordsRecords
Differences in granularity (complex Differences in granularity (complex vs simple scheme)vs simple scheme)ndash Some data might be lostSome data might be lostndash Differences in semantics can occurDifferences in semantics can occurndash Differences in use of content standards Differences in use of content standards
make sharing sometimes problematicmake sharing sometimes problematicndash Properties may vary (eg repeatability)Properties may vary (eg repeatability)
Converting everything may not Converting everything may not always be the best solutionalways be the best solution
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5151
ExampleMappingExampleMappingMODStitle to DCtitleMODStitle to DCtitle
Includes attribute for type of titleIncludes attribute for type of titlendash AbbreviatedAbbreviatedndash TranslatedTranslatedndash AlternativeAlternativendash UniformUniform
Other attributesOther attributesndash IDauthoritydisplayLabelxLinkIDauthoritydisplayLabelxLink
Subelements title partName Subelements title partName partNumber nonSortpartNumber nonSort
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5252
Mapping MODStitle toMapping MODStitle toDCtitleDCtitle
DC has one element refinementDC has one element refinement
AlternativeAlternativendash DC title has no substructure MODS allows for DC title has no substructure MODS allows for
subelements for partNumber partNamesubelements for partNumber partName Best practice statement in DC-Lib says to Best practice statement in DC-Lib says to
include initial article include initial article ndash MODS parses intoltnonSortgtMODS parses intoltnonSortgt
MODS can link to a title in an authority file MODS can link to a title in an authority file if desiredif desired
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5353
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5454
ExerciseExercise
Evaluate a small set of human and Evaluate a small set of human and machine-created metadatamachine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2828
How Does Quality HappenHow Does Quality Happen
Lessons from the library communityLessons from the library communityndash Quality is quantifiable and measurableQuality is quantifiable and measurablendash To be effective enforcement of standards To be effective enforcement of standards
of quality must take place at the of quality must take place at the community levelcommunity level
FurthermoreFurthermorendash Data problems are not unique to particular Data problems are not unique to particular
communitiescommunitiesndash general strategies can improve general strategies can improve
interoperabilityinteroperability
Metadata Standards amp ApplicationsMetadata Standards amp Applications 2929
Quality Measurement CriteriaQuality Measurement Criteria
CompletenessCompleteness AccuracyAccuracy ProvenanceProvenance Conformance to expectationsConformance to expectations Logical consistency and coherenceLogical consistency and coherence Timeliness (Currency and Lag)Timeliness (Currency and Lag) AccessibilityAccessibility
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
Accessibility
Barriers to accessibility may be economic, technical, or organizational
  – Metadata as "premium" or proprietary information
  – Unreadable for technical reasons (file formats, etc.)
  – Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
  – Advantages
      Includes some formatting and color coding
  – Disadvantages
      Assumes consistency/predictability
      Difficult to determine extent of problems found
      Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
  – Advantages
      Better sorting and control by reviewer
  – Disadvantages
      Unwieldy for large files
      Requires sustained focus from reviewer
      Requires translation into tab-delimited file

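The "translation into tab-delimited file" step can be sketched in a few lines of Python. This is only an illustration: it assumes simple Dublin Core records inside a hypothetical `<records>`/`<record id="...">` wrapper, and the sample data and the function name `records_to_tsv` are invented for the example rather than taken from the slides.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample input; real harvested records will be wrapped differently.
SAMPLE = f"""<records xmlns:dc="{DC_NS}">
  <record id="r1">
    <dc:title>Annual report</dc:title>
    <dc:date>2004</dc:date>
  </record>
  <record id="r2">
    <dc:title>Field notes</dc:title>
    <dc:language>en</dc:language>
  </record>
</records>"""


def records_to_tsv(xml_text):
    """Emit one row per element occurrence: record id, element name, value."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record in root.iter("record"):
        for child in record:
            # Strip the "{namespace}" prefix ElementTree puts on tag names.
            name = child.tag.split("}")[-1]
            writer.writerow([record.get("id"), name, (child.text or "").strip()])
    return out.getvalue()


tsv = records_to_tsv(SAMPLE)
print(tsv)
```

One row per element occurrence (rather than one row per record) keeps repeated elements visible and suits sorting by element name or value in a spreadsheet.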
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
  – Advantages
      View of several data dimensions simultaneously
      Reviewer controls data display
      Tends to pull reviewer focus to anomalies
      Handles fairly large files at one time while allowing subset views
      Display manipulation possible without programmers
  – Disadvantages
      High cost of software
      Requires translation into tab-delimited file

Element Names vs. Record IDs (Scatter Plot)
[Figure: scatter plot of element names against record IDs]

Missing Elements (Scatter Plot)
[Figure: scatter plot with the following annotations]
  – 2 records without language element
  – format element present inconsistently
  – Easy to rescale axis on the fly and scroll through records

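The same "which records lack which elements" question can be answered without a plotting tool. The sketch below is a rough stand-in for the scatter plot, not a reproduction of it: the `(record_id, element)` rows, the expected element list, and the function name `missing_elements` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (record_id, element) pairs, e.g. read from a tab-delimited export.
ROWS = [
    ("r1", "title"), ("r1", "language"), ("r1", "format"),
    ("r2", "title"), ("r2", "format"),
    ("r3", "title"),
    ("r4", "title"), ("r4", "language"),
]


def missing_elements(rows, expected):
    """Map each expected element to the sorted record ids that lack it."""
    present = defaultdict(set)
    for record_id, element in rows:
        present[record_id].add(element)
    return {
        element: sorted(rid for rid, found in present.items() if element not in found)
        for element in expected
    }


report = missing_elements(ROWS, expected=["title", "language", "format"])
for element, record_ids in report.items():
    print(f"{element}: missing from {len(record_ids)} record(s) {record_ids}")
```

A record that has no elements at all never appears in the rows, so it would need to be added from a separate list of record ids.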
Table View
[Figure: table view with the following annotations]
  – Non-empty "no information" values that may confuse end users
  – Only DC Date elements are selected for display
  – The only W3CDTF syntax present is four digits
  – Sorted by element value

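A check of this kind can also be scripted. The sketch below sorts date values into W3CDTF-conformant, placeholder, and other. The regular expression follows the W3CDTF profile (YYYY, YYYY-MM, YYYY-MM-DD, and the date-time forms with a time zone designator) but tests syntax only, so an impossible date such as 2004-02-31 still passes. The placeholder list and the function name `classify_date` are assumptions made for illustration.

```python
import re

# Syntax-only W3CDTF pattern; it does not verify that the calendar date exists.
W3CDTF = re.compile(
    r"\d{4}"                                # YYYY
    r"(-(0[1-9]|1[0-2])"                    # -MM
    r"(-(0[1-9]|[12]\d|3[01])"              # -DD
    r"(T([01]\d|2[0-3]):[0-5]\d"            # Thh:mm
    r"(:[0-5]\d(\.\d+)?)?"                  # :ss and optional fraction
    r"(Z|[+-]([01]\d|2[0-3]):[0-5]\d)"      # time zone designator (required with a time)
    r")?)?)?"
)

# Hypothetical "no information" strings; adjust to what the data actually contains.
PLACEHOLDERS = {"unknown", "n/a", "none", "n.d.", "no date"}


def classify_date(value):
    """Return 'empty', 'placeholder', 'w3cdtf', or 'other' for one date value."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.fullmatch(text):
        return "w3cdtf"
    return "other"


for sample in ["1998", "2004-07-16", "Unknown", "circa 1900", ""]:
    print(repr(sample), "->", classify_date(sample))
```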
Improving Metadata Quality ...
Documentation
  – Basic standards, best practice guidelines, examples
  – Exposure and maintenance of local and community vocabularies
  – Application Profiles
  – Training materials, tools, methodologies

... Over Time
Culture change
  – Support for documentation and exchange of knowledge and experience
  – Routine contribution to the "general good"
  – More focused research on practical metadata use and quality considerations
  – Better project-based and community-wide documentation

Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"

"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, Two Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including:
  – Element to element mapping
  – Hierarchy and object resolution
  – Metadata content conversions
  – Stylesheets can be created to transform metadata based on crosswalks

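Element-to-element mapping is, at its simplest, a lookup table. The sketch below uses only the two equivalences named in the quotation (title to MARC 245, author to MARC 100). The subfield codes, the `creator` alias, and the function name `crosswalk_record` are illustrative assumptions, not a published crosswalk.

```python
# Equivalences taken from the quotation (title -> 245, author -> 100).
# The subfield codes and the "creator" alias are illustrative assumptions.
CROSSWALK = {
    "title": ("245", "a"),
    "author": ("100", "a"),
    "creator": ("100", "a"),
}


def crosswalk_record(dc_record):
    """Convert a list of (element, value) pairs; return (fields, unmapped)."""
    fields, unmapped = [], []
    for element, value in dc_record:
        target = CROSSWALK.get(element)
        if target is None:
            # Surface data the crosswalk would otherwise silently lose.
            unmapped.append((element, value))
        else:
            tag, subfield = target
            fields.append((tag, subfield, value))
    return fields, unmapped


fields, unmapped = crosswalk_record([
    ("title", "Field notes"),
    ("author", "Doe, Jane"),
    ("rights", "Public domain"),
])
print(fields)
print(unmapped)
```

Returning the unmapped pairs separately makes the limits of the crosswalk explicit, which matters when there is no one-to-one correspondence between schemes.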
Available Crosswalks
Library of Congress
  – http://www.loc.gov/marc/marcdocz.html
MIT
  – http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
  – http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
  – Some data might be lost
  – Differences in semantics can occur
  – Differences in use of content standards make sharing sometimes problematic
  – Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution

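The repeatability problem can be made concrete with a small sketch. It assumes a source element that repeats and a target element that may not. Keeping only the first value is just one possible policy, chosen here for illustration, and the function name `fit_to_target` is hypothetical.

```python
def fit_to_target(values, repeatable):
    """Return (kept, dropped) for one target element.

    If the target element is not repeatable, only the first value is kept;
    the rest are returned so the loss is visible rather than silent.
    """
    if repeatable or len(values) <= 1:
        return list(values), []
    return [values[0]], list(values[1:])


kept, dropped = fit_to_target(["Botany", "Field work", "Herbaria"], repeatable=False)
print("kept:", kept)
print("dropped:", dropped)
```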
Example: Mapping MODS title to DC title
MODS title:
Includes attribute for type of title
  – Abbreviated
  – Translated
  – Alternative
  – Uniform
Other attributes
  – ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement
  – Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
  – MODS parses into <nonSort>
MODS can link to a title in an authority file if desired

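A minimal Python sketch of this mapping follows. It flattens each MODS `titleInfo` into a single string, putting the `nonSort` article back in front of the title and appending `partNumber` and `partName`. The sample record, the ". " separator, the space inserted after `nonSort`, and the fallback of every type other than `alternative` to plain title are assumptions made for the example, not rules stated in the slides.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical sample record.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Winter's Tale</title>
    <partNumber>Part 2</partNumber>
    <partName>Sources</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Winter's Tale sources</title>
  </titleInfo>
</mods>"""


def _text(info, name):
    """Stripped text of a MODS subelement, or '' when it is absent."""
    return info.findtext(f"mods:{name}", default="", namespaces=MODS_NS).strip()


def mods_titles_to_dc(xml_text):
    """Flatten each MODS titleInfo into one (dc_element, string) pair."""
    root = ET.fromstring(xml_text)
    results = []
    for info in root.findall("mods:titleInfo", MODS_NS):
        # Keep the initial article: nonSort goes back in front of the title.
        title = " ".join(p for p in (_text(info, "nonSort"), _text(info, "title")) if p)
        parts = [title] + [p for p in (_text(info, "partNumber"), _text(info, "partName")) if p]
        # Only type="alternative" has a DC refinement; other types fall back to title.
        element = "alternative" if info.get("type") == "alternative" else "title"
        results.append((element, ". ".join(parts)))
    return results


dc_titles = mods_titles_to_dc(SAMPLE)
print(dc_titles)
```

The conversion is one-way: once the parts are joined into a single DC string, the MODS substructure cannot be reliably recovered.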
Exercise
Evaluate a small set of human and machine-created metadata

Quality Measurement Criteria
Completeness
Accuracy
Provenance
Conformance to expectations
Logical consistency and coherence
Timeliness (Currency and Lag)
Accessibility

Completeness
"Metadata should describe the target objects as completely as economically feasible"
"Element set should be applied to the target object population as completely as possible"

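One simple way to put a number on completeness is the share of expected elements that carry a usable value. The sketch below is a rough indicator only; the expected element set, the sample record, and the function name `completeness` are hypothetical, and the slides do not prescribe this formula.

```python
def completeness(record, expected):
    """Fraction of expected elements that have at least one non-blank value."""
    filled = sum(
        1 for element in expected
        if any(value.strip() for value in record.get(element, []))
    )
    return filled / len(expected)


EXPECTED = ["title", "creator", "date", "language"]  # hypothetical element set
record = {"title": ["Field notes"], "date": ["2004"], "language": [""]}
score = completeness(record, EXPECTED)
print(f"completeness: {score:.2f}")
```

Blank values count as missing here, so an element that is present but empty does not inflate the score.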
Accuracy
Information provided in values should be correct and factual
Editing applied to
  – Eliminate typos
  – Ensure conforming name expressions
  – Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Metadata Standards amp ApplicationsMetadata Standards amp Applications 3030
CompletenessCompleteness
ldquoldquoMetadata should describe the target Metadata should describe the target objects as completely as objects as completely as economically feasiblerdquoeconomically feasiblerdquo
ldquoldquoElement set should be applied to Element set should be applied to the target object population as the target object population as completely as possiblerdquocompletely as possiblerdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3131
AccuracyAccuracy
Information provided in values Information provided in values should be correct and factualshould be correct and factual
Editing applied toEditing applied tondash Eliminate typosEliminate typosndash Ensure conforming name expressionsEnsure conforming name expressionsndash Ensure standard abbreviations usages Ensure standard abbreviations usages
in generalin general
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3232
ProvenanceProvenance
Who prepared the metadata What Who prepared the metadata What do we know about the preparerdo we know about the preparer
What methods were used to create What methods were used to create the metadata Is it human created or the metadata Is it human created or created by machinecreated by machine
What transformations have been What transformations have been applied since creationapplied since creation
Where has it been beforeWhere has it been before
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata

Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The metadata conversion specification includes the transformations required to convert the content of a metadata record to another format, including:
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
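
The element-to-element step of a crosswalk can be sketched as a lookup table. This is a minimal illustration, not a complete crosswalk: only the two equivalences named in the Godby quotation (title to MARC 245, author to MARC 100) are used, the key `creator` is assumed as the Dublin Core element behind "author", and the function name and sample record are hypothetical.

```python
# Hypothetical crosswalk table: Dublin Core element -> MARC tag.
# Only these two equivalences are taken from the quotation above.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",  # assumption: "Dublin Core author" means dc:creator
}


def crosswalk(record, mapping):
    """Map a flat source record to target fields.

    Returns (converted, unmapped) so that elements without an
    equivalence are reported instead of silently dropped.
    """
    converted = {}
    unmapped = {}
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped[element] = values
        else:
            converted.setdefault(target, []).extend(values)
    return converted, unmapped


# Illustrative record; the values are made up.
record = {
    "title": ["Crosswalks in practice"],
    "creator": ["Example, Author"],
    "format": ["text/html"],
}
converted, unmapped = crosswalk(record, DC_TO_MARC)
print(converted)
print(unmapped)
```

Returning the unmapped elements separately makes the limits of the crosswalk visible to whoever reviews the converted records.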

Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html

Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
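
The repeatability problem can be made concrete with a small sketch. It assumes a hypothetical target property that allows only one value; the function name and the sample titles are illustrative, and keeping the first value is just one possible policy.

```python
def fit_repeatability(values, repeatable):
    """Fit source values to a target property.

    Returns (kept, dropped). When the target property is not
    repeatable, only the first value is kept and the rest are
    returned as dropped so the loss can be logged or reviewed.
    """
    if repeatable or len(values) <= 1:
        return list(values), []
    return list(values[:1]), list(values[1:])


# A source record with two titles converted to a non-repeatable target.
kept, dropped = fit_repeatability(["Main title", "Variant title"], repeatable=False)
print(kept)
print(dropped)
```

Reporting the dropped values is what lets a project decide whether converting everything is worthwhile for a given target scheme.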

Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort

Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
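
A sketch of this flattening, using Python's standard `xml.etree.ElementTree`. Several points are assumptions rather than part of the slides: the sample record is invented, the MODS title wrapper is taken to be the `titleInfo` element in the `http://www.loc.gov/mods/v3` namespace, parts are joined with ". ", and only `type="alternative"` is sent to the DC refinement while every other title type falls back to plain title.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample record for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Collected Letters</title>
    <partNumber>Volume 2</partNumber>
    <partName>1850-1860</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Letters</title>
  </titleInfo>
</mods>"""


def text_of(parent, name):
    """Return the text of a MODS child element, or '' if it is absent."""
    node = parent.find(f"mods:{name}", MODS_NS)
    return (node.text or "") if node is not None else ""


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into one (dc_element, value) pair."""
    root = ET.fromstring(mods_xml)
    pairs = []
    for info in root.findall("mods:titleInfo", MODS_NS):
        # Keep the initial article: nonSort text is prepended as written.
        title = text_of(info, "nonSort") + text_of(info, "title").strip()
        parts = [
            p.strip()
            for p in (text_of(info, "partNumber"), text_of(info, "partName"))
            if p.strip()
        ]
        # Assumption: parts are appended with ". " since DC has no substructure.
        value = ". ".join([title] + parts)
        # Assumption: only type="alternative" maps to the DC refinement.
        dc_element = "alternative" if info.get("type") == "alternative" else "title"
        pairs.append((dc_element, value))
    return pairs


for element, value in mods_titles_to_dc(SAMPLE):
    print(element, "=", value)
```

The part number and part name survive only as text inside a single string, which is the loss of granularity described earlier: the DC value cannot be parsed back into MODS subelements reliably.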

Exercise
Evaluate a small set of human and machine-created metadata

Accuracy
Information provided in values should be correct and factual
Editing applied to:
– Eliminate typos
– Ensure conforming name expressions
– Ensure standard abbreviations, usages in general

Provenance
Who prepared the metadata? What do we know about the preparer?
What methods were used to create the metadata? Is it human created or created by machine?
What transformations have been applied since creation?
Where has it been before?

Conformance to Expectations
Contains elements a community would expect to find
Controlled vocabularies are well-chosen and explicitly exposed to downstream users
Metadata is reflective of community thinking about necessary compromises

Logical Consistency/Coherence
Standard mechanisms like application profiles and common crosswalks are used
Similar structures and appearance are enabled for search results
There is very limited reliance on defaulted values

Timeliness
Currency
– Target object changes but metadata does not
Lag
– Target object disseminated before some or all metadata is available
“Metadata aging” is affected by cultural differences between librarians and technologists
– Librarians: once and it’s done
– Technologists: metadata as an iterative process

Accessibility
Barriers to accessibility may be economic, technical, or organizational
– Metadata as “premium” or proprietary information
– Unreadable for technical reasons (file formats, etc.)
– Metadata may not be properly linked to relevant object(s)

Evaluating Metadata (1)
Random sampling (XMLSpy)
– Advantages
  Includes some formatting and color coding
– Disadvantages
  Assumes consistency/predictability
  Difficult to determine extent of problems found
  Tedious at best

Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)
– Advantages
  Better sorting and control by reviewer
– Disadvantages
  Unwieldy for large files
  Requires sustained focus from reviewer
  Requires translation into tab-delimited file

Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
  View of several data dimensions simultaneously
  Reviewer controls data display
  Tends to pull reviewer focus to anomalies
  Handles fairly large files at one time while allowing subset views
  Display manipulation possible without programmers
– Disadvantages
  High cost of software
  Requires translation into tab-delimited file
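
Both the spreadsheet and the visual-analysis approach require translating records into a tab-delimited file. A minimal sketch of that step follows, assuming records are already parsed into a dictionary of element lists; the record ids, values, and the three-column layout (record id, element, value) are illustrative choices, not something the slides specify.

```python
import csv
import io

# Hypothetical parsed records: record id -> element -> list of values.
records = {
    "rec1": {"title": ["A"], "date": ["2004"], "language": ["en"]},
    "rec2": {"title": ["B"], "date": ["unknown"]},
}


def to_tab_delimited(records):
    """Write one row per (record id, element, value) as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    for record_id, elements in records.items():
        for element, values in elements.items():
            for value in values:
                writer.writerow([record_id, element, value])
    return buffer.getvalue()


tsv = to_tab_delimited(records)
print(tsv)
```

One row per value keeps repeated elements visible and gives a plotting tool the two columns it needs to chart element names against record ids.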

Element Names vs. Record IDs (Scatter Plot)

Missing Elements (Scatter Plot)
– 2 records without language element
– format element present inconsistently
– Easy to rescale axis on the fly and scroll through records
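
The same check can be approximated without a plotting tool. The sketch below lists, for each expected element, the records that lack it; the list of expected elements and the three sample records are hypothetical, chosen to mirror the annotations above.

```python
# Hypothetical list of elements a reviewer expects in every record.
EXPECTED = ["title", "date", "language", "format"]

# Hypothetical parsed records: record id -> element -> list of values.
records = {
    "rec1": {"title": ["A"], "date": ["2004"], "language": ["en"], "format": ["text/html"]},
    "rec2": {"title": ["B"], "date": ["1999"], "format": ["image/jpeg"]},
    "rec3": {"title": ["C"], "date": ["unknown"]},
}


def missing_elements(records, expected):
    """Map each expected element to the ids of records that lack it."""
    report = {}
    for element in expected:
        # An element counts as missing if absent or present with no values.
        report[element] = [
            record_id
            for record_id, elements in records.items()
            if not elements.get(element)
        ]
    return report


report = missing_elements(records, EXPECTED)
for element, ids in report.items():
    print(f"{element}: missing in {len(ids)} record(s) {ids}")
```

A report like this gives the extent of a problem across the whole file, which random sampling alone makes difficult to determine.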

Table View
– Non-empty “no information” values that may confuse end users
– Only DC Date elements are selected for display
– The only W3CDTF syntax present is four digits
– Sorted by element value
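
A rough way to sort date values into the groups noted above is a pattern check. This sketch makes several assumptions: the regular expression covers only the date-only W3CDTF levels (YYYY, YYYY-MM, YYYY-MM-DD) and not the forms that include a time, it does not verify that a day exists in the given month, and the set of "no information" placeholders is a made-up list that a project would replace with the values it actually sees.

```python
import re

# Date-only W3CDTF levels: YYYY, YYYY-MM, YYYY-MM-DD (time forms not covered).
W3CDTF_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")

# Hypothetical "no information" values; adjust to the data being reviewed.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}


def classify_date(value):
    """Classify a date value as empty, placeholder, w3cdtf, or other."""
    cleaned = value.strip()
    if not cleaned:
        return "empty"
    if cleaned.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF_DATE.match(cleaned):
        return "w3cdtf"
    return "other"


values = ["1999", "2004-05", "unknown", "May 2004", ""]
labels = [classify_date(v) for v in values]
for value, label in zip(values, labels):
    print(repr(value), "->", label)
```

Separating placeholders from genuinely empty values matters because a non-empty "unknown" passes a simple presence check yet still tells the end user nothing.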

Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies

… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3333
Conformance to ExpectationsConformance to Expectations
Contains elements a community Contains elements a community would expect to find would expect to find
Controlled vocabularies are well-Controlled vocabularies are well-chosen and explicitly exposed to chosen and explicitly exposed to downstream usersdownstream users
Metadata is reflective of community Metadata is reflective of community thinking about necessary thinking about necessary compromises compromises
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3434
Logical ConsistencyCoherenceLogical ConsistencyCoherence
Standard mechanisms like Standard mechanisms like application profiles and common application profiles and common crosswalks are usedcrosswalks are used
Similar structures and appearance Similar structures and appearance are enabled for search resultsare enabled for search results
There is very limited reliance on There is very limited reliance on defaulted valuesdefaulted values
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3535
TimelinessTimeliness CurrencyCurrency
ndash Target object changes but metadata does Target object changes but metadata does notnot
LagLagndash Target object disseminated before some Target object disseminated before some
or all metadata is availableor all metadata is available ldquoldquoMetadata agingrdquo is affected by Metadata agingrdquo is affected by
cultural differences between librarians cultural differences between librarians and technologistsand technologistsndash Librarians once and itrsquos doneLibrarians once and itrsquos donendash Technologists metadata as an iterative Technologists metadata as an iterative
processprocess
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
– Advantages
    View of several data dimensions simultaneously
    Reviewer controls data display
    Tends to pull reviewer focus to anomalies
    Handles fairly large files at one time while allowing subset views
    Display manipulation possible without programmers
– Disadvantages
    High cost of software
    Requires translation into tab-delimited file
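Both the spreadsheet and the visual-analysis approaches require translation into a tab-delimited file. The following is a minimal sketch of that step, assuming simple Dublin Core records wrapped in a hypothetical `<records>`/`<record id="...">` container; the container, the sample data, and the function names are illustrative and do not come from the slides.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample input: the <records>/<record> wrapper is an assumption.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="r1">
    <dc:title>Harvest report</dc:title>
    <dc:date>2004</dc:date>
    <dc:language>en</dc:language>
  </record>
  <record id="r2">
    <dc:title>Field notes</dc:title>
    <dc:date>unknown</dc:date>
  </record>
</records>"""


def records_to_rows(xml_text):
    """Return one (record_id, element, value) tuple per DC element."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("record"):
        for child in record:
            # ElementTree spells namespaced tags as "{namespace}localname".
            if child.tag.startswith("{" + DC_NS + "}"):
                name = child.tag.split("}", 1)[1]
                rows.append((record.get("id"), name, (child.text or "").strip()))
    return rows


def rows_to_tsv(rows):
    """Serialize the rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["record_id", "element", "value"])
    writer.writerows(rows)
    return buf.getvalue()


rows = records_to_rows(SAMPLE)
tsv = rows_to_tsv(rows)
```

One row per element occurrence (rather than one row per record) keeps repeated elements intact and suits sorting by element name or value in a spreadsheet.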
Element Names vs. Record IDs (Scatter Plot)
Missing Elements (Scatter Plot)
[Figure: scatter plot of elements by record, with callouts]
– 2 records without language element
– format element present inconsistently
– Easy to rescale axis on the fly and scroll through records
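The missing-element check that the scatter plot makes visible can also be approximated without visualization software. This is a rough sketch, assuming the metadata has already been reduced to (record id, element, value) rows; the sample rows, the expected-element list, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows in (record_id, element, value) form.
ROWS = [
    ("r1", "title", "Harvest report"),
    ("r1", "language", "en"),
    ("r1", "format", "text/html"),
    ("r2", "title", "Field notes"),
    ("r2", "format", "text/html"),
    ("r3", "title", "Survey"),
]


def missing_elements(rows, expected):
    """Map each record id to the expected elements it lacks."""
    present = defaultdict(set)
    for record_id, element, _value in rows:
        present[record_id].add(element)
    wanted = set(expected)
    return {
        record_id: sorted(wanted - elements)
        for record_id, elements in present.items()
        if wanted - elements
    }


gaps = missing_elements(ROWS, ["title", "language", "format"])
# gaps == {"r2": ["language"], "r3": ["format", "language"]}
```

Unlike the plot, this reports only the elements named in the expected list, so the list itself is a policy decision (for example, taken from an application profile).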
Table View
[Figure: table of element values, with callouts]
– Non-empty “no information” values that may confuse end users
– Only DC Date elements are selected for display
– The only W3CDTF syntax present is four digits
– Sorted by element value
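The two date problems called out in the table view (placeholder values and non-W3CDTF syntax) can be flagged with a simple classifier. This is a sketch under stated assumptions: the regular expression checks W3CDTF syntax only (it does not reject impossible dates such as 30 February), and the placeholder list is a hypothetical starting point, not a standard vocabulary.

```python
import re

# Syntax-only check for the W3CDTF profile: YYYY, YYYY-MM, YYYY-MM-DD,
# and date-time forms with a required time zone designator.
W3CDTF = re.compile(
    r"^\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?$"
)

# Hypothetical list of "no information" values; extend for local data.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}


def classify_date(value):
    """Return 'placeholder', 'w3cdtf', or 'other' for a date string."""
    text = value.strip()
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF.match(text):
        return "w3cdtf"
    return "other"


results = [classify_date(v) for v in ["2004", "Unknown", "c. 1950", "2004-13"]]
# results == ["w3cdtf", "placeholder", "other", "other"]
```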
Improving Metadata Quality …
Documentation
– Basic standards, best practice guidelines, examples
– Exposure and maintenance of local and community vocabularies
– Application Profiles
– Training materials, tools, methodologies
… Over Time
Culture change
– Support for documentation and exchange of knowledge and experience
– Routine contribution to the “general good”
– More focused research on practical metadata use and quality considerations
– Better project-based and community-wide documentation
Crosswalking
“Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems.”
-- Mary Woodley, “Crosswalks: The Path to Universal Access”
“Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts – Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on – but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages.”
-- Jean Godby, “Two Paths to Interoperable Metadata”
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The metadata conversion specification includes the transformations required to convert a metadata record’s content to another format, including:
– Element to element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
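Element-to-element mapping, the simplest part of a crosswalk, can be sketched as a lookup table. The two mappings below are the ones named in the Godby quotation (title to MARC 245, author/creator to MARC 100); the record layout, the sample values, and the function name are assumptions made for illustration, and a real crosswalk would also specify indicators and subfields.

```python
# Source element -> target MARC tag, taken from the two examples in the
# Godby quotation. "creator" is the Dublin Core element for the author.
CROSSWALK = {
    "title": "245",
    "creator": "100",
}


def apply_crosswalk(record, crosswalk):
    """Split (element, value) pairs into converted fields and leftovers."""
    converted, unmapped = [], []
    for element, value in record:
        tag = crosswalk.get(element)
        if tag is None:
            # Keep what the crosswalk does not cover so the loss is visible.
            unmapped.append((element, value))
        else:
            converted.append((tag, value))
    return converted, unmapped


# Hypothetical source record.
record = [
    ("title", "Field notes"),
    ("creator", "Doe, Jane"),
    ("rights", "Public domain"),
]
converted, unmapped = apply_crosswalk(record, CROSSWALK)
```

Returning the unmapped pairs separately, rather than silently dropping them, makes it easier to judge how much data a given crosswalk would lose.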
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards make sharing sometimes problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
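The repeatability problem can be illustrated with creators: Dublin Core creator is repeatable, while MARC field 100 is not, so a conversion has to decide where additional names go. The sketch below assumes one possible policy (first creator to 100, the rest to the repeatable added-entry field 700); that policy and the function name are illustrative choices, not a rule stated in the slides.

```python
def map_creators(creators):
    """Map a list of DC creator values to (MARC tag, value) pairs.

    Assumed policy: the first creator becomes the main entry (100);
    later creators become added entries (700), which may repeat.
    """
    fields = []
    for index, name in enumerate(creators):
        tag = "100" if index == 0 else "700"
        fields.append((tag, name))
    return fields


fields = map_creators(["Doe, Jane", "Roe, Richard", "Poe, Pat"])
# fields == [("100", "Doe, Jane"), ("700", "Roe, Richard"), ("700", "Poe, Pat")]
```

Converting in the other direction collapses 100 and 700 into the same element, so the main-entry distinction is one of the pieces of data that might be lost.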
Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
DC has one element refinement: Alternative
– DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
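The title mapping can be sketched as a flattening step: the MODS subelements are joined into one string, with the initial article from nonSort kept in front. This is a rough illustration using Python's standard `xml.etree.ElementTree`; the sample record, the ". " separator between parts, and the function names are assumptions, and title types other than "alternative" simply fall back to plain title here.

```python
import xml.etree.ElementTree as ET

MODS = "{http://www.loc.gov/mods/v3}"

# Hypothetical MODS fragment for illustration.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Annual report</title>
    <partNumber>Part 2</partNumber>
    <partName>Finances</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Yearly report</title>
  </titleInfo>
</mods>"""


def flatten_title(title_info):
    """Join nonSort, title, partNumber and partName into one string."""
    def text(name):
        node = title_info.find(MODS + name)
        return node.text if node is not None and node.text else ""

    # nonSort keeps its trailing space so the article joins cleanly.
    title = text("nonSort") + text("title").strip()
    parts = [p.strip() for p in (text("partNumber"), text("partName")) if p.strip()]
    # The ". " separator is an assumed display convention.
    return ". ".join([title] + parts)


def mods_titles_to_dc(xml_text):
    """Return (dc_name, value) pairs for every MODS titleInfo."""
    root = ET.fromstring(xml_text)
    pairs = []
    for info in root.findall(MODS + "titleInfo"):
        # Only "alternative" has a DC refinement; other types lose their label.
        name = "alternative" if info.get("type") == "alternative" else "title"
        pairs.append((name, flatten_title(info)))
    return pairs


dc_titles = mods_titles_to_dc(SAMPLE)
```

The result cannot be converted back: once the parts are joined, nothing marks where the article, part number, or part name began.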
Exercise
Evaluate a small set of human- and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3636
AccessibilityAccessibility
Barriers to accessibility may be Barriers to accessibility may be economic technical or organizationaleconomic technical or organizationalndash Metadata as ldquopremiumrdquo or proprietary Metadata as ldquopremiumrdquo or proprietary
informationinformationndash Unreadable for technical reasons (file Unreadable for technical reasons (file
formats etc)formats etc)ndash Metadata may not be properly linked to Metadata may not be properly linked to
relevant object(s)relevant object(s)
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3737
Evaluating Metadata (1)Evaluating Metadata (1)
Random sampling (XMLSpy)Random sampling (XMLSpy)ndash AdvantagesAdvantages
Includes some formatting and color codingIncludes some formatting and color coding
ndash DisadvantagesDisadvantagesAssumes consistencypredictabilityAssumes consistencypredictabilityDifficult to determine extent of problems Difficult to determine extent of problems
foundfoundTedious at bestTedious at best
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3838
Evaluating Metadata (2)Evaluating Metadata (2)
Spreadsheets (Microsoft Excel)Spreadsheets (Microsoft Excel)ndash AdvantagesAdvantages
Better sorting and control by reviewerBetter sorting and control by reviewer
ndash DisadvantagesDisadvantagesUnwieldy for large filesUnwieldy for large filesRequires sustained focus from reviewerRequires sustained focus from reviewerRequires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)
- Advantages
  - View of several data dimensions simultaneously
  - Reviewer controls data display
  - Tends to pull reviewer focus to anomalies
  - Handles fairly large files at one time while allowing subset views
  - Display manipulation possible without programmers
- Disadvantages
  - High cost of software
  - Requires translation into tab-delimited file
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
- 2 records without language element
- format element present inconsistently
- Easy to rescale axis on the fly and scroll through records
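The same kind of gap the scatter plot exposes can also be listed programmatically. The following is only a minimal sketch, assuming the records have already been parsed into dictionaries keyed by element name; the sample records, the `EXPECTED` list, and both function names are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical sample records: element name -> list of values.
RECORDS = {
    "rec1": {"title": ["A"], "language": ["en"], "format": ["text/html"]},
    "rec2": {"title": ["B"], "format": ["image/jpeg"]},
    "rec3": {"title": ["C"], "language": ["fr"]},
}

# Elements the reviewer expects in every record (an assumption for this sketch).
EXPECTED = ["title", "language", "format"]


def missing_elements(records, expected):
    """Return {record_id: [expected elements that are absent or empty]}."""
    report = {}
    for rec_id, elements in records.items():
        missing = [name for name in expected if not elements.get(name)]
        if missing:
            report[rec_id] = missing
    return report


def missing_counts(records, expected):
    """Count, per element, how many records lack it."""
    counts = Counter()
    for missing in missing_elements(records, expected).values():
        counts.update(missing)
    return counts


if __name__ == "__main__":
    for rec_id, missing in missing_elements(RECORDS, EXPECTED).items():
        print(rec_id, "is missing:", ", ".join(missing))
```

A per-element count such as `missing_counts` gives the "extent of problems" that random sampling makes hard to determine.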
Table View
- Non-empty "no information" values that may confuse end users
- Only DC Date elements are selected for display
- The only W3CDTF syntax present is four digits
- Sorted by element value
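A check like the one this table view supports can be approximated with a pattern test on each date value. This is a sketch only: the regular expression follows the W3CDTF profile (YYYY, YYYY-MM, YYYY-MM-DD, and the forms with time and zone), while the function name and the three labels it returns are assumptions made for illustration.

```python
import re

# W3CDTF: YYYY[-MM[-DD[Thh:mm[:ss[.s]]TZD]]], where TZD is "Z" or +hh:mm / -hh:mm.
W3CDTF = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?"
)


def classify_date(value):
    """Label a date value as year-only W3CDTF, fuller W3CDTF, or not W3CDTF."""
    value = value.strip()
    if not W3CDTF.fullmatch(value):
        return "not W3CDTF"
    return "year only" if len(value) == 4 else "fuller W3CDTF"


if __name__ == "__main__":
    # Hypothetical values, including a non-empty "no information" string.
    for sample in ["1999", "1999-05-31", "unknown", "05/31/1999"]:
        print(repr(sample), "->", classify_date(sample))
```

The pattern checks syntax only; it does not reject impossible calendar dates such as a 31st day in a 30-day month.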
Improving Metadata Quality ...
Documentation
- Basic standards, best practice guidelines, examples
- Exposure and maintenance of local and community vocabularies
- Application Profiles
- Training materials, tools, methodologies
... Over Time
Culture change
- Support for documentation and exchange of knowledge and experience
- Routine contribution to the "general good"
- More focused research on practical metadata use and quality considerations
- Better project-based and community-wide documentation
Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"
"Metadata schema transformations are more complex than purely structural transforms because they require a set of equivalences identified by human experts -- Dublin Core title can be mapped to MARC 245, Dublin Core author can be mapped to MARC 100, and so on -- but this important knowledge is recorded in a multitude of ways that are not standardized and not always machine-processable, including Web pages, databases, spreadsheets, PDF documents, and the source code of many computer languages."
-- Jean Godby, "Two Paths to Interoperable Metadata"
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The process of metadata conversion specification includes transformations required to convert a metadata record content to another format, including:
- Element to element mapping
- Hierarchy and object resolution
- Metadata content conversions
- Stylesheets can be created to transform metadata based on crosswalks
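Element-to-element mapping can be pictured as a lookup table applied to each record. The sketch below is illustrative only: it encodes just the two equivalences named in the Godby quotation (title to MARC 245, author/creator to MARC 100). The subfield code `a`, the tuple layout, and the function name are assumptions of this sketch, not part of any published crosswalk.

```python
# Crosswalk table: source element -> (target MARC tag, subfield code).
# Only the two equivalences quoted above are included; a real crosswalk has many more.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
}


def crosswalk(dc_record, mapping=DC_TO_MARC):
    """Convert {element: [values]} into a list of (tag, subfield, value) tuples.

    Also returns the names of source elements that have no mapping, so the
    reviewer can see what the conversion would drop. Repeatable source
    elements that map to a non-repeatable target need extra rules, which
    this sketch does not attempt.
    """
    fields, unmapped = [], []
    for element, values in dc_record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)
            continue
        tag, code = target
        for value in values:
            fields.append((tag, code, value))
    return fields, unmapped


if __name__ == "__main__":
    # Hypothetical record used only to exercise the table.
    record = {
        "title": ["Moby Dick"],
        "creator": ["Melville, Herman"],
        "rights": ["Public domain"],
    }
    fields, unmapped = crosswalk(record)
    print(fields)
    print("No mapping for:", unmapped)
```

Keeping the table as data rather than burying it in code is one way to make the equivalences easier to review and share.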
Available Crosswalks
Library of Congress
- http://www.loc.gov/marc/marcdocz.html
MIT
- http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
- http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
- Some data might be lost
- Differences in semantics can occur
- Differences in use of content standards make sharing sometimes problematic
- Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS:title to DC:title
Includes attribute for type of title
- Abbreviated
- Translated
- Alternative
- Uniform
Other attributes
- ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS:title to DC:title
DC has one element refinement: Alternative
- DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
- MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
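The loss of substructure described above can be seen by flattening a MODS title into a single DC string. This is a rough sketch, assuming the MODS v3 namespace and the `titleInfo` wrapper element that holds the subelements. The sample record, the ". " separator, and the choice to send every non-alternative title type to `dc:title` are assumptions for illustration, not a published mapping.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical MODS fragment used only for this sketch.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Hypothetical Journal</title>
    <partNumber>Part 2</partNumber>
    <partName>Field Notes</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Hyp. J.</title>
  </titleInfo>
</mods>
"""


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into a (dc_property, string) pair."""
    root = ET.fromstring(mods_xml.strip())
    results = []
    for info in root.findall("mods:titleInfo", MODS_NS):

        def text(name):
            node = info.find(f"mods:{name}", MODS_NS)
            return (node.text or "").strip() if node is not None else ""

        # Keep the initial article: nonSort is re-attached in front of title.
        title = " ".join(p for p in (text("nonSort"), text("title")) if p)
        # DC title has no substructure, so the part subelements are appended as text.
        parts = [p for p in (title, text("partNumber"), text("partName")) if p]
        flat = ". ".join(parts)
        is_alt = info.get("type") == "alternative"
        results.append(("dcterms:alternative" if is_alt else "dc:title", flat))
    return results


if __name__ == "__main__":
    for prop, value in mods_titles_to_dc(SAMPLE):
        print(prop, "=", value)
```

Once flattened, the boundaries between article, title, and part information cannot be recovered from the DC value, which is the granularity problem noted for converted records.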
Exercise
Evaluate a small set of human and machine-created metadata
Metadata Standards amp ApplicationsMetadata Standards amp Applications 3939
Evaluating Metadata (3)Evaluating Metadata (3)
Visual Graphical Analysis (Spotfire)Visual Graphical Analysis (Spotfire)ndash AdvantagesAdvantages
View of several data dimensions simultaneouslyView of several data dimensions simultaneously Reviewer controls data displayReviewer controls data display Tends to pull reviewer focus to anomaliesTends to pull reviewer focus to anomalies Handles fairly large files at one time while allowing Handles fairly large files at one time while allowing
subset viewssubset views Display manipulation possible without programmersDisplay manipulation possible without programmers
ndash DisadvantagesDisadvantages High cost of softwareHigh cost of software Requires translation into tab-delimited fileRequires translation into tab-delimited file
Metadata Standards amp Applications
40
Element Names vs Record Ids (Scatter Plot)
Metadata Standards amp Applications
41
Missing Elements (Scatter Plot)
2 records without
language element
format element present
inconsistently
Easy to rescale axis on the fly
and scroll through records
Metadata Standards amp Applications
42
Table View
Non-empty ldquono informationrdquo
values that may confuse end users
Only DC Date elements are
selected for display
The only W3CDTF syntax present is four
digits
Sorted by element value
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4343
Improving Metadata Quality hellipImproving Metadata Quality hellip
DocumentationDocumentationndash Basic standards best practice Basic standards best practice
guidelines examplesguidelines examplesndash Exposure and maintenance of local and Exposure and maintenance of local and
community vocabulariescommunity vocabulariesndash Application ProfilesApplication Profilesndash Training materials tools methodologiesTraining materials tools methodologies
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4444
hellip hellip Over TimeOver Time
Culture changeCulture changendash Support for documentation and Support for documentation and
exchange of knowledge and experienceexchange of knowledge and experiencendash Routine contribution to the ldquogeneral Routine contribution to the ldquogeneral
goodrdquogoodrdquondash More focused research on practical More focused research on practical
metadata use and quality considerationsmetadata use and quality considerationsndash Better project-based and community-Better project-based and community-
wide documentationwide documentation
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4545
CrosswalkingCrosswalking
ldquoldquoCrosswalks support conversion projects and Crosswalks support conversion projects and semantic interoperability to enable semantic interoperability to enable searching across heterogeneous searching across heterogeneous distributed databases Inherently there distributed databases Inherently there are limitations to crosswalks there is are limitations to crosswalks there is rarely a one-to-one correspondence rarely a one-to-one correspondence between the fields or data elements in between the fields or data elements in different information systemsrdquodifferent information systemsrdquo
-- Mary Woodley -- Mary Woodley ldquoCrosswalks The Path to Universal AccessrdquoldquoCrosswalks The Path to Universal Accessrdquo
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4646
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
CrosswalksCrosswalks
In general Semantic mapping of elements In general Semantic mapping of elements between source and target metadata standardsbetween source and target metadata standards
The process of metadata conversion specification The process of metadata conversion specification includes transformations required to convert a includes transformations required to convert a metadata record content to another format metadata record content to another format includingincludingndash Element to element mappingElement to element mappingndash Hierarchy and object resolutionHierarchy and object resolutionndash Metadata content conversionsMetadata content conversionsndash Stylesheets can be created to transform Stylesheets can be created to transform
metadata based on crosswalksmetadata based on crosswalks
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4747
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4848
Metadata Standards amp ApplicationsMetadata Standards amp Applications 4949
Available CrosswalksAvailable Crosswalks
Library of CongressLibrary of Congressndash httpwwwlocgovmarcmarcdoczhtmlhttpwwwlocgovmarcmarcdoczhtml
MITMITndash httplibrariesmiteduguideshttplibrariesmiteduguides
subjectsmetadatamappingshtmlsubjectsmetadatamappingshtml GettyGetty
ndash httpwwwgettyeduresearchhttpwwwgettyeduresearchconducting_researchstandardsconducting_researchstandardsintrometadatacrosswalkshtmlintrometadatacrosswalkshtml
Metadata Standards amp ApplicationsMetadata Standards amp Applications 5050
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
- Some data might be lost
- Differences in semantics can occur
- Differences in use of content standards sometimes make sharing problematic
- Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
Example: Mapping MODS title to DC title
Includes attribute for type of title
- Abbreviated
- Translated
- Alternative
- Uniform
Other attributes
- ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
DC has one element refinement: Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
- MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
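The flattening described above can be sketched as follows. This is an illustrative sketch only: the function name, the sample record, and the join order of the parts are assumptions. So is the choice to send every title type other than "alternative" to plain DC title. The container element is written as `titleInfo`, as in the MODS schema.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}


def mods_title_to_dc(title_info):
    """Flatten one MODS <titleInfo> element into (dc_property, text).

    The substructure (nonSort, partNumber, partName) is collapsed into a
    single string, because DC title has no subelements to receive it.
    """
    parts = []
    # Join order is an assumption made for this sketch.
    for name in ("nonSort", "title", "partNumber", "partName"):
        child = title_info.find(f"mods:{name}", MODS_NS)
        if child is not None and child.text:
            parts.append(child.text.strip())
    # Only type="alternative" has a DC refinement; other types fall back to title.
    prop = "alternative" if title_info.get("type") == "alternative" else "title"
    return prop, " ".join(parts)


# Made-up sample record, for illustration only.
SAMPLE = """<mods:titleInfo xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:nonSort>The</mods:nonSort>
  <mods:title>Example Title</mods:title>
  <mods:partNumber>Part 2</mods:partNumber>
  <mods:partName>Appendices</mods:partName>
</mods:titleInfo>"""

print(mods_title_to_dc(ET.fromstring(SAMPLE)))
```

Once the parts are joined, the target record can no longer tell the initial article or the part number from the rest of the title, which is the kind of loss a complex-to-simple conversion brings.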
Exercise
Evaluate a small set of human- and machine-created metadata
Element Names vs Record Ids (Scatter Plot)
Missing Elements (Scatter Plot)
- 2 records without language element
- format element present inconsistently
- Easy to rescale axis on the fly and scroll through records
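The same kind of check can be approximated without a visualization tool. The sketch below assumes records held as a dictionary of `{record_id: {element: [values]}}`; the function name and the sample data are hypothetical.

```python
def missing_elements(records, expected):
    """Map each expected element name to the ids of records that lack it.

    An element counts as missing when it is absent or has only blank values.
    """
    report = {name: [] for name in expected}
    for record_id, record in records.items():
        for name in expected:
            values = record.get(name, [])
            if not any(value.strip() for value in values):
                report[name].append(record_id)
    return report


# Made-up sample data, for illustration only.
records = {
    "rec1": {"title": ["A"], "language": ["en"], "format": ["text"]},
    "rec2": {"title": ["B"], "format": [""]},
    "rec3": {"title": ["C"], "language": ["fr"]},
}
print(missing_elements(records, ["language", "format"]))
```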
Table View
- Non-empty "no information" values that may confuse end users
- Only DC Date elements are selected for display
- The only W3CDTF syntax present is four digits
- Sorted by element value
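A rough way to sort date values into the categories noted above is sketched here. The placeholder list is an assumption (collections vary in which "no information" strings they use), and the pattern covers only the date-only W3CDTF levels (YYYY, YYYY-MM, YYYY-MM-DD). It checks syntax only, not whether the day exists in the calendar.

```python
import re

# Date-only W3CDTF levels; forms with a time of day are not covered in this sketch.
W3CDTF_DATE = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")

# Hypothetical "no information" strings; adjust to what a collection actually contains.
PLACEHOLDERS = {"unknown", "n/a", "none", "no date", "n.d."}


def classify_date(value):
    """Label a date value as 'empty', 'placeholder', 'w3cdtf', or 'other'."""
    text = value.strip()
    if not text:
        return "empty"
    if text.lower() in PLACEHOLDERS:
        return "placeholder"
    if W3CDTF_DATE.match(text):
        return "w3cdtf"
    return "other"


for sample in ["1999", "1999-07-04", "Unknown", "c. 1900", ""]:
    print(repr(sample), classify_date(sample))
```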
Improving Metadata Quality ...
Documentation
- Basic standards, best practice guidelines, examples
- Exposure and maintenance of local and community vocabularies
- Application Profiles
- Training materials, tools, methodologies
... Over Time
Culture change
- Support for documentation and exchange of knowledge and experience
- Routine contribution to the "general good"
- More focused research on practical metadata use and quality considerations
- Better project-based and community-wide documentation
Crosswalking
"Crosswalks support conversion projects and semantic interoperability to enable searching across heterogeneous distributed databases. Inherently, there are limitations to crosswalks; there is rarely a one-to-one correspondence between the fields or data elements in different information systems."
-- Mary Woodley, "Crosswalks: The Path to Universal Access?"
ldquoldquoMetadata schema transformations are more Metadata schema transformations are more complex than purely structural transforms complex than purely structural transforms because they require a set of equivalences because they require a set of equivalences identified by human expertsmdashDublin Core title identified by human expertsmdashDublin Core title can be mapped to MARC 245 Dublin Core can be mapped to MARC 245 Dublin Core author can be mapped to MARC 100 and so onauthor can be mapped to MARC 100 and so onmdashbut this important knowledge is recorded in a mdashbut this important knowledge is recorded in a multitude of ways that are not standardized and multitude of ways that are not standardized and not always machine-processable including not always machine-processable including Web pages databases spreadsheets PDF Web pages databases spreadsheets PDF documents and the source code of many documents and the source code of many computer languagesrdquocomputer languagesrdquo -- Jean Godby -- Jean Godby Two Paths to Interoperable MetadataTwo Paths to Interoperable Metadata
Crosswalks
In general: semantic mapping of elements between source and target metadata standards
The metadata conversion specification covers the transformations required to convert the content of a metadata record to another format, including:
– Element-to-element mapping
– Hierarchy and object resolution
– Metadata content conversions
– Stylesheets can be created to transform metadata based on crosswalks
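As a rough illustration of element-to-element mapping, a crosswalk can be held as a lookup table and applied to a flat record. The sketch below is only that, a sketch: the table holds just the two Dublin Core to MARC pairs quoted above (the Dublin Core element for an author is written here as "creator"), and the function name, record layout, and sample values are hypothetical, not part of any standard or library.

```python
# Hypothetical crosswalk table: Dublin Core element -> MARC field tag.
# Only the two pairs mentioned in the quotation above are included.
DC_TO_MARC = {
    "title": "245",
    "creator": "100",
}


def crosswalk(record, mapping):
    """Apply an element-to-element mapping to a flat record.

    record maps source element names to lists of values.
    Returns (converted, unmapped): converted maps target fields to lists
    of values; unmapped keeps source elements with no equivalent.
    """
    converted = {}
    unmapped = {}
    for element, values in record.items():
        target = mapping.get(element)
        if target is None:
            # No equivalence was defined by a human expert for this element.
            unmapped[element] = list(values)
        else:
            converted.setdefault(target, []).extend(values)
    return converted, unmapped


# Made-up sample record, for illustration only.
dc_record = {
    "title": ["An Example Title"],
    "creator": ["Example, Author"],
    "rights": ["Example rights statement"],
}
marc_fields, leftovers = crosswalk(dc_record, DC_TO_MARC)
```

Returning the unmapped elements separately, rather than dropping them silently, makes the limits of the crosswalk visible to whoever reviews the converted records.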
Available Crosswalks
Library of Congress
– http://www.loc.gov/marc/marcdocz.html
MIT
– http://libraries.mit.edu/guides/subjects/metadata/mappings.html
Getty
– http://www.getty.edu/research/conducting_research/standards/intrometadata/crosswalks.html
Problems With Converted Records
Differences in granularity (complex vs. simple scheme)
– Some data might be lost
– Differences in semantics can occur
– Differences in use of content standards sometimes make sharing problematic
– Properties may vary (e.g., repeatability)
Converting everything may not always be the best solution
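The repeatability problem can be made concrete with a small sketch. It assumes a hypothetical target scheme in which some fields are not repeatable; the function name, the profile, and the sample values are all invented for illustration. Values that do not fit are reported rather than silently discarded.

```python
def fit_to_target(record, repeatable):
    """Fit a record to a target scheme's repeatability rules.

    record maps field names to lists of values; repeatable maps field
    names to True/False (fields not listed are treated as repeatable).
    Returns (kept, lost), where lost holds the values that were dropped.
    """
    kept = {}
    lost = {}
    for field, values in record.items():
        if repeatable.get(field, True) or len(values) <= 1:
            kept[field] = list(values)
        else:
            # Non-repeatable target field: keep the first value only.
            kept[field] = values[:1]
            lost[field] = values[1:]
    return kept, lost


# Hypothetical target profile and made-up source record.
target_profile = {"title": False, "subject": True}
source_record = {
    "title": ["Main Title", "Variant Title"],
    "subject": ["Metadata", "Interoperability"],
}
kept, lost = fit_to_target(source_record, target_profile)
```

Keeping the first value is only one possible policy; concatenating values or routing the extras to another field are equally plausible, and the choice belongs in the conversion specification.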
Example: Mapping MODS title to DC title
Includes attribute for type of title
– Abbreviated
– Translated
– Alternative
– Uniform
Other attributes
– ID, authority, displayLabel, xLink
Subelements: title, partName, partNumber, nonSort
Mapping MODS title to DC title
DC has one element refinement: Alternative
DC title has no substructure; MODS allows for subelements for partNumber, partName
Best practice statement in DC-Lib says to include initial article
– MODS parses into <nonSort>
MODS can link to a title in an authority file if desired
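The points above can be sketched in Python with the standard library's ElementTree. This is an illustrative sketch, not an official MODS-to-DC stylesheet: the sample record is made up, the function names are hypothetical, and joining parts with ". " is an assumed flattening convention. The initial article from nonSort is kept in the output, and a titleInfo with type "alternative" is mapped to the Alternative refinement.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}

# Made-up MODS fragment for illustration.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Example Title</title>
    <partNumber>Part 2</partNumber>
    <partName>An Example Part</partName>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Another Example Title</title>
  </titleInfo>
</mods>
"""


def text_of(parent, name):
    """Return the text of a MODS child element, or '' if it is absent."""
    node = parent.find(f"mods:{name}", NS)
    return node.text if node is not None and node.text else ""


def mods_titles_to_dc(mods_xml):
    """Flatten each MODS titleInfo into an (element, value) pair for DC."""
    root = ET.fromstring(mods_xml)
    out = []
    for info in root.findall("mods:titleInfo", NS):
        # Keep the initial article: nonSort text is prepended to the title.
        main = (text_of(info, "nonSort") + text_of(info, "title")).strip()
        parts = [main]
        # DC title has no substructure, so the parts are appended as text.
        for sub in ("partNumber", "partName"):
            value = text_of(info, sub).strip()
            if value:
                parts.append(value)
        element = "alternative" if info.get("type") == "alternative" else "title"
        out.append((element, ". ".join(parts)))
    return out


dc_titles = mods_titles_to_dc(SAMPLE)
```

Other type values (abbreviated, translated, uniform) fall through to plain title here, which shows the granularity loss: the distinction exists in MODS but has no place in the simpler scheme.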
Exercise
Evaluate a small set of human- and machine-created metadata