diane-i-hillmann
DESCRIPTION
Presentation at ALA Midwinter Dallas at the Cataloging Norms IG. Describes the differences between management at the record level and at the statement level.
From Records To Statements
Taking the Leap
ALA Dallas, 1/20/12
What’s different about statement data?
• Library data compliance has been defined by consensus since MARC was a pup
• But outside the MARC silo we need different strategies
• To accomplish this we need to look at value, costs and investments very differently
Flickr photo by Robert Jagendorf
What Are Statements?
• A MARC record can be viewed as an aggregation of statements
  • All the attribute = value pairs relate to the same resource
• In a linked data world, statements are dis-aggregated and each carries the relationship to a resource as the ‘subject’ of each triple
• Though it seems more complicated to deal with statements in isolation, it is really simpler (the complication is that we still know little about working this way)
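The dis-aggregation described above can be sketched in a few lines of Python. This is only an illustration, not a MARC parser: the record is a plain dictionary, and the attribute names, the `example.org` identifier, and the `record_to_statements` helper are all assumptions made for the example.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str    # identifier of the resource being described
    predicate: str  # the attribute (property) name
    obj: str        # the value


def record_to_statements(resource_id: str, record: dict[str, list[str]]) -> list[Statement]:
    """Dis-aggregate one record into standalone statements.

    Every attribute = value pair becomes its own triple, and each triple
    carries the resource identifier as its subject.
    """
    return [
        Statement(resource_id, attribute, value)
        for attribute, values in record.items()
        for value in values
    ]


# Hypothetical record: the attribute names and values are invented.
record = {
    "title": ["An Example Title"],
    "creator": ["Example, Author"],
    "subject": ["Metadata", "Linked data"],
}

statements = record_to_statements("http://example.org/resource/1", record)
for statement in statements:
    print(statement)
```

Each statement can now be stored, evaluated, or replaced without touching the others, which is the sense in which the record becomes a unit of transport rather than a unit of management.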
Future Metadata Strategies
• Statement level rather than record level management
• Records as units of transport rather than units of management
• Emphasis on evaluation coming in and provenance going out
• Shift in human effort from creating standard cataloging to careful human intervention in machine-based processes
• Extensive use of data created outside libraries
• Intelligent re-use of our legacy data
http://dcpapers.dublincore.org/ojs/pubs/article/view/770/766
Managing Statements
[Possible] New Roles for Librarians
• Aggregators of relevant metadata content
  • Developing methods to expose & redistribute without a central node
• Modeling and documenting best practices in metadata creation, improvement and exposure
  • Application profiles important in this effort
• Developers of vocabularies using bibliographic relationships
• Innovators in using social networks to enhance bibliographic description
Re-Thinking Metadata Management
Harvest/Ingest Plan
• Choosing data sources
  • There are known sources out there, some of them are of good quality, others are usable, with improvement
• Tools are needed to help pull data, validate it, cache it, and set it up for evaluation
  • Most of these tasks can/should be set up with automated processes, with alerts to human minders when something goes wrong
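A minimal sketch of the pull, validate, cache, and alert sequence above follows. It assumes statements arrive as `(subject, predicate, object)` tuples. The `fetch` and `alert` callables, the `max_reject_ratio` threshold, and the sample source are hypothetical stand-ins for whatever a real harvester would use.

```python
from typing import Callable, Iterable


def validate(statement: tuple) -> bool:
    """Accept only three-part statements whose parts are non-empty strings."""
    return len(statement) == 3 and all(
        isinstance(part, str) and bool(part.strip()) for part in statement
    )


def harvest(
    fetch: Callable[[], Iterable[tuple]],
    alert: Callable[[str], None],
    max_reject_ratio: float = 0.2,  # assumed threshold, tune per source
) -> list[tuple]:
    """Pull statements, keep the valid ones, and alert a human on trouble."""
    try:
        pulled = list(fetch())
    except Exception as exc:  # any failure to pull is reported, not hidden
        alert(f"fetch failed: {exc}")
        return []

    cache = [s for s in pulled if validate(s)]
    rejected = len(pulled) - len(cache)
    if pulled and rejected / len(pulled) > max_reject_ratio:
        alert(f"{rejected} of {len(pulled)} statements failed validation")
    return cache  # ready to be set up for evaluation


def sample_source() -> list[tuple]:
    # Invented data: the second statement has an empty value.
    return [
        ("ex:1", "title", "A"),
        ("ex:1", "creator", ""),
        ("ex:2", "title", "B"),
    ]


alerts: list[str] = []
cached = harvest(sample_source, alerts.append)
print(cached)
print(alerts)
```

The run is fully automated. The human minder only sees a message when the source fails or when too many statements are rejected.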
Metadata Evaluation
• Evaluation needs to scale well beyond random sampling
• Statistical and data mining tools need to be brought into the process, to provide both ‘overview’ and specifics of whole data sets
• Improvement specifications, techniques, quality criteria and tools need to be iterative, granular, and shareable
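One way to get an overview of a whole data set, rather than a random sample, is to profile every statement. The sketch below is an assumption-laden illustration: statements are `(subject, predicate, object)` tuples, and the `profile` function and its two measures (usage counts and per-resource coverage) are chosen for the example, not taken from the presentation.

```python
from collections import Counter


def profile(statements: list[tuple[str, str, str]]) -> dict:
    """Summarize a whole data set instead of sampling it.

    Returns the number of distinct resources, how often each property is
    used, and the share of resources that carry each property at least once.
    """
    subjects = {subject for subject, _, _ in statements}
    predicate_counts = Counter(predicate for _, predicate, _ in statements)

    # Count each (resource, property) pair once to measure coverage.
    pairs = {(subject, predicate) for subject, predicate, _ in statements}
    resources_per_predicate = Counter(predicate for _, predicate in pairs)
    coverage = {
        predicate: count / len(subjects)
        for predicate, count in resources_per_predicate.items()
    }
    return {
        "resources": len(subjects),
        "predicate_counts": dict(predicate_counts),
        "coverage": coverage,
    }


# Invented sample data.
sample = [
    ("ex:1", "title", "A"),
    ("ex:1", "subject", "X"),
    ("ex:1", "subject", "Y"),
    ("ex:2", "title", "B"),
]
report = profile(sample)
print(report)
```

A profile like this gives the overview. The specifics come from drilling into the properties whose coverage or counts look wrong.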
Testing, Monitoring & Re-evaluation
• Data will change, and processes must be able to detect that, based on data profiles
  • Human intervention should be limited
• Tools need to be built so that non-programmers can run them
  • Reading logs, monitoring error reports, checking results, writing specs, can/should be done by data specialists (a.k.a. catalogers w/training)
• Looking for opportunities for programmers and catalogers to learn together is essential
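Detecting change against a data profile might look like the sketch below. It assumes a profile is simply a mapping from property name to the share of resources that carry it. The `detect_drift` helper, the `tolerance` value, and the sample figures are hypothetical.

```python
def detect_drift(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.1,  # assumed threshold for "worth a human look"
) -> list[str]:
    """Compare two coverage profiles and report properties that moved."""
    findings = []
    for predicate in sorted(set(baseline) | set(current)):
        before = baseline.get(predicate, 0.0)  # absent means zero coverage
        after = current.get(predicate, 0.0)
        if abs(after - before) > tolerance:
            findings.append(
                f"{predicate}: coverage moved from {before:.2f} to {after:.2f}"
            )
    return findings


# Invented profiles from two harvests of the same source.
baseline = {"title": 1.0, "subject": 0.8}
current = {"title": 0.98, "subject": 0.4, "contributor": 0.3}

findings = detect_drift(baseline, current)
for line in findings:
    print(line)
```

The output is plain text that a data specialist can read from a log or an error report, so running and interpreting the check does not require a programmer.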
Re-distribution Plan
• If we improve data, we need to expose how we did it (and what we did), for the use of downstream consumers
  • New metadata provenance efforts designed to do this at the statement level
• This strategy can only exist successfully where open licenses allow innovation and wide re-use
• Ideally, distribution AND redistribution should be accomplished with Application Profiles
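To make statement-level provenance concrete, the sketch below attaches a source to every statement and links each improved statement back to the one it replaced. It does not follow any particular provenance vocabulary. The field names, the `improve` and `history` helpers, and the sample values are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    source: str                                  # who asserted this statement
    derived_from: Optional["Statement"] = None   # the statement it improves on
    activity: Optional[str] = None               # what was done to produce it


def improve(original: Statement, new_value: str, activity: str, agent: str) -> Statement:
    """Return an improved statement that records what was done and by whom."""
    return Statement(
        original.subject,
        original.predicate,
        new_value,
        source=agent,
        derived_from=original,
        activity=activity,
    )


def history(statement: Statement) -> list[str]:
    """Walk back through the chain of statements, newest first."""
    steps = []
    current: Optional[Statement] = statement
    while current is not None:
        step = f"{current.obj!r} from {current.source}"
        if current.activity:
            step += f" via {current.activity}"
        steps.append(step)
        current = current.derived_from
    return steps


# Invented example: a harvested date is normalized locally.
harvested = Statement("ex:1", "date", "c1999.", source="hypothetical-provider")
cleaned = improve(harvested, "1999", "date normalization", "local-cleanup")
trail = history(cleaned)
print(trail)
```

A downstream consumer who receives `cleaned` can see both what was changed and how, and can decide whether to trust the improvement or fall back to the original value.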
Will This Shift Cost Too Much?
• It’s the human effort that costs us
  • Cost of traditional cataloging is far too high, for increasingly dubious value
• Our current investments have reached the end of their usefulness
  • All the possible efficiencies for traditional cataloging have already been accomplished
• Waiting for leadership from the big players costs us valuable time with no guarantees of results
• We need to figure out how to invest in more distributed innovation and focused collaboration
ROI in the LOD World
• Free metadata is essential in a ‘culture economy’
  • We need eyeballs, attention, connection for our content!
• Thinking about ROI based on recovering the cost of creating metadata is a dead end
• To drive people to your content, you need to put your data out there
  • But once it’s there, it’s out of your control, and we need to get comfortable with that
Thank you! Questions?
Contact info: [email protected]
Metadata Matters: http://managemetadata.com/blog