Gathering Audio Metadata for the Monterey Jazz Festival Concerts
OLAC 2006
By Nancy J. Hoebelheinrich, Stanford University Libraries
NjH, Stanford University Libraries, 27 - 28 October, OLAC 2006
Workshop Goals
Surface issues associated with gathering MD req’s for access & long term preservation of audio files
Demonstrate how to use METS for content packaging &
– MODS for description & retention of logical & physical structures of digital audio objects
– PREMIS for preservation MD
– AES Draft Data Dictionary & JHove for Format MD
Monterey Jazz Festival Project Description
Multi-year, multi-part project initiated jointly by Stanford University Libraries and the Monterey Jazz Festival
Goal to preserve and provide access to approximately 750 original audio and 92 original video recordings
Recordings
– Date from 1958 to present
– Document the world's longest running jazz festival
Project Description, cont.
Grant funding provided by:
– Grammy Foundation
– National Historical Publications and Records Commission
– Save America’s Treasures
Current timeline: October 1, 2005 – September 30, 2008.
Collection Description
Complete collection currently comprises over
– 1,200 sound recordings
– 370 moving image materials
– 130 linear feet of paper-based records of the founding organization
Forms a unique collection of historic recordings of high research value, currently inaccessible to scholars due to the condition and format of the materials
Approximately 750 tapes have been selected to be digitized
Formats: ¼” and ½” analog reel tape, audiocassette, and digital audio tape (only audio for this project)
Intentions for Collection
Creation of master and derivative digital audio files
Augmentation of existing descriptive MD to access component level files
Entire digital collection will be accessible to listeners on Stanford campus
MD made accessible to the public via the SULAIR web; [selected sound clips may also be available]
Deposit into preservation repository (SDR)
Descriptive / Structural MD Req’s per curator & SDR
Retain relationships among “tracks” or segments, tape-side and tape to allow physical access to analog artifact
Replicate physical structure, but also provide direct access to the logical structure
“Find”, “identify” & “select” by tape, performer(s), performance, date
Minimal MD Req’s for SDR
Structural
Descriptive enough for minimal access
Admin
– Technical for Audio
– Preservation
– Rights
MD Packaged with its resource
FM Pro MD @ beginning of project
Field tags =
– Tape number
– Performer (of all on given tape) by group with individual & instrument also listed
– Performance (of all songs on the tape, differentiated by performer)
– Date of performance
[Slide image; callout: Extra performers]
[Slide image; callout: Extra group performer]
[Slide image; callouts: Date #1, Date #2, Date #3]
The plot thickens…
How to [retain] link between Descriptive MD and “digital-physical” files??
– Assigned “markers” = virtual BEGIN / END determined by timestamps
– Files & structural naming conventions
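As a rough illustration of the "marker" idea, the sketch below turns a list of timestamped markers on one tape-side file into begin/end segments. The marker labels, the times, and the rule that each segment ends where the next one begins are assumptions made for the example; the slides do not say how the project's markers were actually stored or interpreted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str
    begin: float  # seconds from the start of the tape-side file
    end: float


def segments_from_markers(markers, total_duration):
    """Turn (label, start_seconds) markers into begin/end segments.

    Assumption: a segment ends where the next marker begins, and the
    last segment ends at the end of the file.
    """
    ordered = sorted(markers, key=lambda m: m[1])
    segments = []
    for i, (label, begin) in enumerate(ordered):
        end = ordered[i + 1][1] if i + 1 < len(ordered) else total_duration
        if end <= begin:
            raise ValueError(f"marker {label!r} has no positive duration")
        segments.append(Segment(label, begin, end))
    return segments


def as_timecode(seconds):
    """Format whole seconds as HH:MM:SS."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical markers for one tape side (times are invented).
markers = [("Track1", 0.0), ("Track2", 312.0), ("Track3", 655.0)]
segments = segments_from_markers(markers, total_duration=1800.0)
for seg in segments:
    print(seg.label, as_timecode(seg.begin), as_timecode(seg.end))
```

Keeping the boundaries as data rather than cutting the audio lets one file serve both the physical view (the tape side) and the logical view (the tracks).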
Why worry about digital object structure?
So many files
No inherent order
Just streams of bits
Physical structure by naming convention, hmm….
0001pm.wav
0001pm.sfk
0001pm.wav.gpk
0001pm.wav.mem
0001sh.wav
0001sh.mrk
0001sh.cd
0001sh.wav.gpk
0001sh.wav.mem
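A small sketch of what "structure by naming convention" asks of software: the file names above have to be parsed to recover which tape and which version each file belongs to. Reading "pm" as preservation master and "sh" as service high is an assumption based on the terms used later in the deck, and the regular expression is a guess at the convention, not the project's own rule.

```python
import re
from collections import defaultdict

# Assumed meaning of the two-letter role codes.
ROLE_NAMES = {"pm": "preservation master", "sh": "service high"}

# Four-digit tape number, role code, then one or more extensions.
NAME_PATTERN = re.compile(r"^(?P<tape>\d{4})(?P<role>pm|sh)\.(?P<ext>.+)$")


def parse_name(filename):
    """Split a file name into (tape number, role code, extension)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the naming convention")
    return match.group("tape"), match.group("role"), match.group("ext")


def group_by_tape_and_role(filenames):
    """Collect file names under their (tape, role) pair."""
    groups = defaultdict(list)
    for name in filenames:
        tape, role, _ext = parse_name(name)
        groups[(tape, role)].append(name)
    return dict(groups)


files = [
    "0001pm.wav", "0001pm.sfk", "0001pm.wav.gpk", "0001pm.wav.mem",
    "0001sh.wav", "0001sh.mrk", "0001sh.cd", "0001sh.wav.gpk", "0001sh.wav.mem",
]
groups = group_by_tape_and_role(files)
for (tape, role), names in sorted(groups.items()):
    print(tape, ROLE_NAMES[role], names)
```

Everything the parser knows lives in the pattern; if a file is renamed, or a later system uses a different convention, the relationships are lost.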
Physical structure by file naming w/ directories
sul-dl-nas1\mjf\Batch01\040606\
  PM\
    0001pm.wav
    0001pm.sfk
    0001pm.wav.gpk
    0001pm.wav.mem
  SH\
    0001sh.wav
    0001sh.mrk
    0001sh.cd
    0001sh.wav.gpk
    0001sh.wav.mem
Long term storage bets
Different naming conventions
Different directory structures, if any
Need for device & OS independence
Value in “packaging” of metadata & content together even if stored separately
What to do?
Packaging = Descriptive + Structure
METS = (Logical structure expressed as) Descriptive MD + (Physical Structure expressed as) Structural Map
How does METS work?
Initial scope limited to objects comprised of text, image, audio & video files
Technical Components
– Primary XML Schema
– Extension Schema
– Controlled Vocabularies
– Community based profiles
METS XML Schema
METS Document
– METS Header
– Descriptive Metadata
– Administrative Metadata
– Content File Inventory
– Structural Map
– Structural Link
– Behaviors
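For orientation, the sketch below builds an empty METS document whose children are the schema's element names for the seven sections in the diagram (metsHdr, dmdSec, amdSec, fileSec, structMap, structLink, behaviorSec). It is only a skeleton: a schema-valid document needs IDs, at least one div inside structMap, and so on.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# Element names from the METS schema, in schema order.
SECTIONS = [
    "metsHdr",      # METS Header
    "dmdSec",       # Descriptive Metadata
    "amdSec",       # Administrative Metadata
    "fileSec",      # Content File Inventory
    "structMap",    # Structural Map
    "structLink",   # Structural Link
    "behaviorSec",  # Behaviors
]


def build_skeleton():
    """Return a <mets> root with one empty child per section."""
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


skeleton = build_skeleton()
xml_text = ET.tostring(skeleton, encoding="unicode")
print(xml_text)
```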
Structural Map is key
Digital Object modeled as logical or physical tree structure (e.g., book with chapters with subchapters, image file with encoded text transcription file and audio file of oral interview…)
Every node in tree can be associated with descriptive/administrative metadata and…
– Individual/multiple files (or portions thereof) or
– Other METS documents
Associated Metadata
Descriptive
– Endorsed XML schemas of these standards to date: MARCXML, Dublin Core simple, MODS; can use others such as FGDC, VRA, etc.
Administrative
– Technical (Z39.87 for still images, Text endorsed)
– Rights, Source
– Digital Provenance (PREMIS endorsed)
Can be associated with entire digital object or subcomponent(s)
Can be multiple instances; type used is not prescribed
Can be contained internally (as XML or binary files)
Can be contained externally by reference (using Xlink)
Provides controlled vocabularies for tags and declaration of standards used
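The internal/external distinction above corresponds to two METS elements: mdWrap, which carries the metadata inside the METS file, and mdRef, which points to it with an XLink href. The sketch below builds one descriptive section of each kind; MDTYPE is the controlled-vocabulary attribute that declares which standard is used. The section IDs, the title, and the example.org URL are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS_NS), ("mods", MODS_NS), ("xlink", XLINK_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, name):
    """Qualified tag or attribute name in ElementTree's {uri}name form."""
    return f"{{{ns}}}{name}"


def dmd_internal(section_id, title):
    """dmdSec with a MODS record wrapped inside the METS document."""
    dmd = ET.Element(q(METS_NS, "dmdSec"), {"ID": section_id})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = title
    return dmd


def dmd_external(section_id, href):
    """dmdSec that only references a MODS record stored elsewhere."""
    dmd = ET.Element(q(METS_NS, "dmdSec"), {"ID": section_id})
    ET.SubElement(
        dmd,
        q(METS_NS, "mdRef"),
        {"LOCTYPE": "URL", "MDTYPE": "MODS", q(XLINK_NS, "href"): href},
    )
    return dmd


# Placeholder values, not taken from the project.
internal = dmd_internal("DMD1", "Tape 0001, side A")
external = dmd_external("DMD2", "https://example.org/mods/tape-0001.xml")
print(ET.tostring(internal, encoding="unicode"))
print(ET.tostring(external, encoding="unicode"))
```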
Ex., simple METS Object
Book
– Desc MD (MARC or DC or MODS)
– FileX = Pg1
  – Tech MD: Image
  – Admin MD (Digiprov)
– FileY = Pg2
  – Tech MD: Image
  – Admin MD (Digiprov)
– Admin MD: Rights
Ex., Audio METS Object
Audio tape-side
– Desc MD (MARC or DC or MODS)
– FileX = Track1
  – Tech MD: Audio
  – Admin MD (Digiprov)
– FileY = Track2, Track3
  – Tech MD: Audio
  – Admin MD (Digiprov)
– Desc MD for Track (DC or MODS)
– Admin MD: Rights
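One way the audio object above could be expressed in a structMap is sketched below: the tape side is a div, each track is a child div, and a track that occupies only part of a file points to it through an area element with BEGIN and END time codes. The IDs FileX and FileY come from the diagram; the time codes and the TYPE and LABEL values are invented for illustration and are not the project's actual encoding.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(name):
    """Qualified METS tag name in ElementTree's {uri}name form."""
    return f"{{{METS_NS}}}{name}"


# (label, file ID, optional (begin, end) time range). Times are invented.
TRACKS = [
    ("Track1", "FileX", None),
    ("Track2", "FileY", ("00:00:00", "00:05:12")),
    ("Track3", "FileY", ("00:05:12", "00:11:40")),
]


def build_struct_map(tracks):
    """Build a structMap with one tape-side div holding one div per track."""
    struct_map = ET.Element(q("structMap"), {"TYPE": "physical"})
    side = ET.SubElement(
        struct_map, q("div"), {"TYPE": "tape-side", "LABEL": "Audio tape-side"}
    )
    for order, (label, file_id, time_range) in enumerate(tracks, start=1):
        track = ET.SubElement(
            side, q("div"), {"TYPE": "track", "LABEL": label, "ORDER": str(order)}
        )
        if time_range is None:
            # The whole file is the track.
            ET.SubElement(track, q("fptr"), {"FILEID": file_id})
        else:
            # Only a time range of the file is the track.
            begin, end = time_range
            fptr = ET.SubElement(track, q("fptr"))
            ET.SubElement(
                fptr,
                q("area"),
                {"FILEID": file_id, "BETYPE": "TIME", "BEGIN": begin, "END": end},
            )
    return struct_map


struct_map = build_struct_map(TRACKS)
print(ET.tostring(struct_map, encoding="unicode"))
```

Because Track2 and Track3 both reference FileY, the physical file stays intact while the logical tracks remain individually addressable.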
First, descriptive
FMPro
qDC
MODS
finalDMDTemplate
PDF
Taking advantage of the technologies
Mechanism for keeping tracks (segments) connected to tape-side
– using mods:relatedItem to nest, or not
– Retaining IDs from data provider – SDR
Using subfields / attributes to trigger code events, e.g., subject/genre & title information
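The nesting option can be sketched as follows: a MODS record for the tape side carries one relatedItem of type "constituent" per track, and each level keeps its provider-assigned ID in an identifier element. The IDs and titles below are invented, and the project's real template may differ in which elements it uses.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def q(name):
    """Qualified MODS tag name in ElementTree's {uri}name form."""
    return f"{{{MODS_NS}}}{name}"


def add_title_and_id(parent, title, local_id):
    """Attach a titleInfo/title and a local identifier to a MODS element."""
    title_info = ET.SubElement(parent, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title
    identifier = ET.SubElement(parent, q("identifier"), {"type": "local"})
    identifier.text = local_id


def tape_side_record(side_id, side_title, tracks):
    """MODS record for a tape side with its tracks nested as constituents."""
    mods = ET.Element(q("mods"))
    add_title_and_id(mods, side_title, side_id)
    for track_id, track_title in tracks:
        related = ET.SubElement(mods, q("relatedItem"), {"type": "constituent"})
        add_title_and_id(related, track_title, track_id)
    return mods


# Hypothetical IDs and titles for illustration only.
record = tape_side_record(
    "0001-A",
    "Tape 0001, side A",
    [("0001-A-01", "Track 1"), ("0001-A-02", "Track 2")],
)
print(ET.tostring(record, encoding="unicode"))
```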
Viewing the XML
See dmdSec
See fileSec
See structMap
Administrative MD
rightsMD using PREMIS Rights
sourceMD used AES draft data dictionary elements
techMD for format specific MD
– Preservation Master (Broadcast wave, uncompressed) (AES & JHove)
– Service High (Broadcast wave, compressed) (AES & JHove)
Viewing the XML
See amdSec
– rightsMD
– sourceMD
– techMD
  For file
  For format
Questions, Comments?
References:
Monterey Jazz Festival http://www.montereyjazzfestival.org/50th/
Archive of Recorded Sound MJF Collection, Stanford University Libraries http://library.stanford.edu/depts/ars/collections/jazz.html
METS http://www.loc.gov/standards/mets/
Dublin Core Metadata Initiative http://uk.dublincore.org/schemas/xmls/
MODS http://www.loc.gov/standards/mods/
PREMIS http://www.oclc.org/research/projects/pmwg/
Audio Preservation information, see http://palimpsest.stanford.edu/bytopic/audio/
JHOVE (JSTOR/Harvard Object Validation Environment) http://hul.harvard.edu/jhove/
Acknowledgements
Special thanks and acknowledgement to Hannah Frost, Media Preservation Librarian at SULAIR
Contact:
Nancy Hoebelheinrich, [email protected]
And, why are we doing this???
MFOO29-BillieH
MF00229-BillieH2