Emerging Standards for Complex Works. Howard Besser, UCLA School of Education & Information, http://www.gseis.ucla.edu/~howard. Background & Context for Standards; MOA2: Structural & Administrative Metadata; NISO/DLF: Technical Imaging Standards.
Besser--CNI/JISC 6/16/00 1
Emerging Standards for Complex Works
Howard Besser
UCLA School of Education & Information
http://www.gseis.ucla.edu/~howard
Emerging Standards for Complex Works
- Background & Context for Standards
- MOA2: Structural & Administrative Metadata
- NISO/DLF: Technical Imaging Standards
- Identification/provenance
- Rich Media
- Longevity
Key problems we’re facing
- Discovery
- Longevity
- Interoperability
Traditional Digital Library Model
[Diagram: users reach four digital libraries (DL), each through its own search & presentation layer]
Ideal Digital Library Model
[Diagram: users reach four digital libraries (DL) through a single search & presentation layer]
For Interoperability Digital Libraries Need Standards
- Descriptive Metadata for consistent description
- Discovery Metadata for finding
- Administrative Metadata for viewing and maintaining
- Structural Metadata for navigation
- ...
- Terms & Conditions Metadata for controlling access
- ...
Why are Standards and Metadata consensus important?
- Managing digital files over time
- Longevity
- Interoperability
- Veracity
- Recording in a consistent manner
- Will give vendors incentive to create applications that support this
Collaborative Metadata Projects
- Dublin Core
- NSF/ERCIM Digital Collaboratory
- OCLC CORC Project
- Visual Resources Association (VRA) Core
- Encoded Archival Description (EAD)
- Consortium for the Computer Interchange of Museum Information (CIMI)
- Records Export for Art and Cultural Heritage (REACH)
CORC--Cooperative Online Resource Catalog
- both bib records & webliographies (pathfinders)
- supports both AACR2/MARC and DC
- began 1/99, scheduled availability 7/00
- 100-200 participants
  – Academic libraries
  – OCLC networks, special libraries, public libraries, state & national libraries, consortia
Making of America II
- Background of the DLF Project
- Administrative Metadata
- Structural Metadata
MOA2 Goal is Interoperability
- Book example
DLF Metadata for Interoperability Testbed: the MOA II Project
- R & D
- Distributed Repositories
- Transportation, 1869-1900
- Testbed Project
- Best Practices
- Structural and administrative metadata
Previous Projects/Background
Library Standards Background UC Berkeley Background Finding Aids EAD SGML EAD “Digital Archives”
MOA II Classes of Objects
- Continuous Tone Photos
- Photo Albums
- Diaries, journals, letterpress books
- Ledgers
- Correspondence
MOA II Metadata
- Administrative Metadata
  – for enhancing resource management
- Structural Metadata
  – for reflecting internal hierarchies and relationships between parts
- Raw/Seared/Cooked
MOA II Behaviors
- Navigation
- Display/Print
MOA II Best practices
- Use/Users/Collection: Benchmarking
- Masters vs. Derivatives
- Scanning
- Administrative Metadata
- Structural Metadata
Scanning Best Practices
- Think about users (and potential users), uses, and type of material/collection
- Scan at the highest quality that does not exceed what the likely users, uses, and material require
- Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
- Many documents that appear to be bitonal are actually better represented with greyscale scans
- Include color bar and ruler in the scan
- Use objective measurements to determine scanner settings (do NOT attempt to make the image look good on your particular monitor or use image processing to color correct)
- Don’t use lossy compression
- Store in a common (standardized) file format
- Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)
Why Scale is important
Administrative Metadata: to uniquely identify a digital resource and manage it over time
- Information about where the various pieces/versions of the object reside
- Information to view the digital object
- Information about the scanning process
Structural Metadata: that which is relevant to presentation of the digital object to the user
- metadata defining the “object”: a book, a diary, a photo album
- metadata defining the “sub-objects”: pages (physical) or chapters and subheads (intellectual)
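The object/sub-object distinction can be pictured as a small tree. The following is only a minimal sketch, not the MOA II encoding: the class name `StructuralNode`, its fields, and the file names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class StructuralNode:
    """One node in a hypothetical structural-metadata tree."""
    kind: str                          # e.g. "diary", "chapter", "page"
    label: str                         # human-readable name used for navigation
    image_file: Optional[str] = None   # set only on physical pages
    children: List["StructuralNode"] = field(default_factory=list)

    def pages(self) -> Iterator["StructuralNode"]:
        """Yield the physical pages in reading order (depth-first)."""
        if self.kind == "page":
            yield self
        for child in self.children:
            yield from child.pages()


# Intellectual structure (chapters) wrapped around physical structure (pages).
diary = StructuralNode("diary", "Travel diary", children=[
    StructuralNode("chapter", "Departure", children=[
        StructuralNode("page", "p. 1", "diary_0001.tif"),
        StructuralNode("page", "p. 2", "diary_0002.tif"),
    ]),
    StructuralNode("chapter", "Arrival", children=[
        StructuralNode("page", "p. 3", "diary_0003.tif"),
    ]),
])

page_files = [p.image_file for p in diary.pages()]
```

A viewer could use the chapter labels for navigation and the page list for page turning, which is the presentation role the slide assigns to structural metadata.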
SGML, XML, HTML
- TEI for structured humanities text
- EAD for Finding Aids
NISO/DLF Image Metadata Workshop: Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)
Image Metadata
Focus on Metadata that may prove helpful for
- management
- use
- preservation
- ...
Image Metadata
Break-out Groups: Work Done
- Characteristics and Features of Images
- Image Production and Reformatting Features
- Image Identification and Integrity
NISO/DLF Image Metadata Workshop (4/99)
Image Technical Information: Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)
Image Metadata Elements for Data Dictionary
Data Dictionary Entries
- Element Name
- Definition (short) of the element name
- Is the element required? (Identified as: Mandatory, Mandatory if Applicable, Recommended, Optional)
- How is the value of the element represented?
- Examples
- When is this data collected?
- What is the purpose of this data?
- Who would the identified users be?
- How is the metadata used?
- What other metadata standards reference it?
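The questions above map naturally onto a record type, one field per question. This is a sketch under assumptions: the names `DictionaryEntry`, `Requirement`, and `missing_required` are hypothetical, and the sample entry is invented for illustration rather than taken from the NISO/DLF drafts.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Requirement(Enum):
    """The four requirement levels named on the slide."""
    MANDATORY = "Mandatory"
    MANDATORY_IF_APPLICABLE = "Mandatory if Applicable"
    RECOMMENDED = "Recommended"
    OPTIONAL = "Optional"


@dataclass
class DictionaryEntry:
    element_name: str
    definition: str
    requirement: Requirement
    representation: str                      # how the value is represented
    examples: List[str] = field(default_factory=list)
    collected_when: str = ""
    purpose: str = ""
    users: str = ""
    usage: str = ""
    referenced_by: List[str] = field(default_factory=list)


# Hypothetical entry, for illustration only.
entry = DictionaryEntry(
    element_name="Compression",
    definition="Compression scheme applied to the image data",
    requirement=Requirement.MANDATORY_IF_APPLICABLE,
    representation="controlled term",
    examples=["none", "LZW"],
    collected_when="at capture or reformatting",
)


def missing_required(entries: List[DictionaryEntry], record: Dict[str, str]) -> List[str]:
    """Return names of Mandatory elements that a metadata record lacks."""
    return [e.element_name for e in entries
            if e.requirement is Requirement.MANDATORY
            and e.element_name not in record]
```

Keeping the requirement level as data lets a validator report which mandatory elements a given image record is missing.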
Image Metadata Elements for Data Dictionary
Characteristics and Features Element List
- Format Issues:
- Resolution Issues:
- Encoding:
- Compression:
- Others:
Image Metadata Elements for Data Dictionary
Image Production Element List (Pertaining to the Image)
- In-image target(s):
- System target(s), associated with the object:
- Responsible agent
- Rationale:
- Hardware:
- Software:
Image Metadata Elements for Data Dictionary
Image Production Element List (Pertaining to the Process)
- Format of the image
- Intrinsic characteristics of the image
- Identification
- Provides a means for defining methodology including documentation and rationale
- Who is involved with the file?
- Who created the image file?
- Who commissioned the creation of the image file (i.e., the chartering entity), as opposed to: Who is the responsible agency? Who is the owner?
- Where
- What
- When: necessary dates including: capture date/time, modification
- Checksum
- Navigational aid
- Encoding tools
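The "Checksum", "When", and "Who created" elements can be recorded together at capture time and used later to check integrity. A minimal sketch, assuming SHA-256 as the checksum algorithm (the slides do not name one); the function and field names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fixity_record(data: bytes, created_by: str) -> dict:
    """Build a small production-metadata record for an image file's bytes."""
    return {
        "checksum_algorithm": "SHA-256",   # assumption: not specified in the slides
        "checksum": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "created_by": created_by,
        "recorded": datetime.now(timezone.utc).isoformat(),
    }


def is_intact(data: bytes, record: dict) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    return hashlib.sha256(data).hexdigest() == record["checksum"]


# Usage: record once at capture, verify whenever the file is copied or migrated.
master = b"stand-in bytes for a scanned master file"
record = fixity_record(master, created_by="scanning lab")
```

Storing the algorithm name next to the value matters for longevity: a later custodian needs to know how the checksum was produced in order to verify it.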
Image Metadata
NISO/DLF Image Metadata: In Progress
- Data Dictionary for both “Characteristics & Features” and for “Image Production Elements” due end of 6/00
Finding Image Origins
Identification/Provenance (Images)
- The number of variant forms of a work can be enormous
- Image Families
- A digital image frequently has many layers of parentage
- Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation")
- how to deal with different versions derived from the same scan or different encoding schemes
- Vocabulary Standards to express this
The number of variant forms of a work can be enormous
- different views of the same object
- different lighting of the same object
- different scans of the same photo
- different resolutions
- different compression schemes
- different compression ratios
- different file storage formats
- different details of the same image
- ...
Image Families
Identification/Provenance
- how to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF)
- Vocabulary Standards to express this
  – VRA Surrogate Categories
  – CIMI's “Image Elements”
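An image family can be modeled by giving every version a pointer to the version it was derived from, in the spirit of a Dublin Core "Source" link. The sketch below is illustrative only; `ImageVersion`, `lineage`, and the identifiers are hypothetical names, not part of VRA or CIMI.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ImageVersion:
    identifier: str
    role: str                      # e.g. "master", "medium res", "browse"
    file_format: str               # e.g. "TIFF", "JFIF"
    source: Optional[str] = None   # identifier of the parent version, if any


def lineage(versions: Dict[str, ImageVersion], identifier: str) -> List[str]:
    """Walk parent links from a version back to the original capture."""
    chain: List[str] = []
    current: Optional[str] = identifier
    while current is not None:
        chain.append(current)
        current = versions[current].source
    return chain


family = {v.identifier: v for v in [
    ImageVersion("scan-001", "master", "TIFF"),
    ImageVersion("scan-001-med", "medium res", "JFIF", source="scan-001"),
    ImageVersion("scan-001-browse", "browse", "JFIF", source="scan-001-med"),
]}

browse_lineage = lineage(family, "scan-001-browse")
```

Walking the chain shows how many generations separate a browse image from the master, which bears on the quality and veracity questions raised above.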
Other Metadata
- Description of depiction/surrogate (What VRA calls its "Surrogate Categories")
- Description of original object
- Rights and Reproduction Information
- Location Information
Metadata for Digital Commerce
- DOI
- <indecs>
<indecs>
- formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts)
- will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model
- focus is on encoding rights at a high level, not on resource discovery
- likely to involve metadata schema registration and directory to allow interoperation of personal identifiers for rightsholders and users
- supported by EEC DG-13
- First meeting July 1999
- http://www.indecs.org/
Problems & Potentials of Rich Media
- Types of Rich Media
- Technologies and problems
- Opportunities--a scenario
- Metadata
- Indexing
Some Types of Rich Media
- Moving image materials
- Multimedia
- Interactive programs
- Computer art
After an uphill battle, tech and Tinseltown find common ground
(USA Today, 3/3/00)
Projected Changes: Prospect of digitized movies already has some mourning loss of film
(SF Chronicle, 3/5/00)
Video Technology to Make the Head Spin (NYT 3/2/00)
ECI - Hole in Space (both)
ECI - 84-locations
ECI - 84-Community Memory
ECI - 84-MOCA
ECI - Avatars & Humans
ECI - Avatar Stage
Complexity of Rich Media
- Works often have artistic nature (including video games)
- Enormous number of elements can, at times, be very important to preserve (pacing, original artifact, elements used to construct the artifact)
- Too complex to save every one of these aspects for every type of material
- Importance of saving documentation
Rich Media Technologies
- Streaming media vs. Downloaded files
- Bandwidth and compression
- Need to offload functions onto clients
The Inter-relation Problem
- Info is increasingly inter-related to other info
- How do we make our own Info persist when it points to and integrates with Info owned by others?
- What is the boundary of a set of information (or even of a digital object)?
The Translation Problem
- Content translated into new delivery devices changes meaning
  – A photo vs. a painting
  – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  – Behaviors
Problems of Rich Media
- Complexity of formats (storage & compression)
- Synchronicity between media/streams
- Pieces and Boundaries
- Persistent IDs
- Interactivity
- Historical context
- Content
- Recontextualization (Postmodernism)
Opportunities--a scenario
- Huge stable online DB of rich media (Prelinger Archives)
- Creators create new works that consist mainly of links to and transitions between pieces of the rich media DB
- Works are not really assembled until run-time
- Securing IP permission may shift from capital-intensive producer to end-user
- Economics of media production may change drastically
Structural Metadata for Complex Objects
- MPEG 4
- SMIL
Synchronized Multimedia Integration Language (SMIL)
- For repurposing and reuse in different ways
- Use XML to reference various pieces in different ways
- Supported by RealMedia but not Microsoft or Macromedia
MPEG 4
- Object-oriented
- Very low level of granularity (even objects vs backgrounds)
- Scalable bandwidth use
- Binary Format for Scenes (BIFS) borrows concepts from VRML
Indexing of Moving Image Materials
- Whole works vs. parts of Works
- MPEG 7
- Approaches to segmentation & thumbnail representation
- Closed caption indexing
- Audio description indexing
- Semiotics
Other Types of Metadata
- Longevity
- Identification/Provenance
- Rights Management
The Short Life of Digital Info: Digital Longevity Problems
- Disappearing Information
- The Viewing Problem
- The Scrambling Problem
- The Inter-relation Problem
- The Custodial Problem
- The Translation Problem
The Viewing Problem
- Digital Info requires a whole infrastructure to view it
- Each piece of that infrastructure is changing at an incredibly rapid rate
- How can we ever hope to deal with all the permutations and combinations?
The Scrambling Problem
Dangers from:
- Compression to ease storage & delivery
- Container Architecture to enhance digital commerce
The Inter-relation Problem
- Info is increasingly inter-related to other info
- How do we make our own Info persist when it points to and integrates with Info owned by others?
- What is the boundary of a set of information (or even of a digital object)?
The Custodial Problem
- How do we decide what to save?
- Who should save it?
- How should they save it?
  – methods for later access: emulation, migration, etc.
  – issues of authenticity and evidence
The Translation Problem
- Content translated into new delivery devices changes meaning
  – A photo vs. a painting
  – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  – Behaviors
Pieces of the Solution (1/2)
- We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
- We should discourage scrambling
- We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
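Many image formats already self-identify in a limited way through a fixed byte signature at the start of the file. The sketch below reads only those leading bytes; the function name `identify_format` and the choice of formats are illustrative, and a signature says nothing about compression or about the application needed to view the file.

```python
from typing import Optional

# Well-known leading byte signatures ("magic numbers") for a few image formats.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the leading bytes match a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None  # unknown or scrambled: the object does not identify itself


# Usage: pass the first few bytes read from a file opened in binary mode.
sample = identify_format(b"GIF89a" + b"\x00" * 10)
```

An encrypted or proprietary container defeats this check entirely, which is one concrete reason to discourage scrambling.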
Pieces of the Solution (2/2)
- People and organizations wishing to make information persist need guidelines on how to go about doing it
- We need to better understand how translating from one storage or display format to another affects the meaning of a work
- We need to save the “behaviors” of a digital object, not just its “contents”
Metadata can be the first line of defense
- Can tell you
  – where the file is (if you can’t find the file)
  – where more info about the file is (if you have the file but most other metadata has become separated)
  – what the file format is
  – what the compression scheme is
  – what application program and version is needed for the file
Groups Working on the Big Longevity Problem
http://sunsite.Berkeley.EDU/Imaging/Databases/Longevity/
- CPA Task Force
- CPA Study Group
- Getty “Time & Bits” Conference
- Internet Archive
- Long Now
Migration/Refreshing
- Impact on evidential value
Emerging Standards for Complex Works
Howard Besser
UCLA School of Education & Information
- http://www.gseis.ucla.edu/~howard/image-meta.html
- http://sunsite.Berkeley.EDU/moa2/
- http://www.gseis.ucla.edu/~howard/Classes/287-moving.html
- http://www.gseis.ucla.edu/~howard/Classes/287-mov-index-bib.html
- http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/
- http://www.getty.edu/gri/standard/intrometadata/
- http://sunsite.Berkeley.EDU/Imaging/Databases/#standards
- http://sunsite.Berkeley.EDU/Longevity/
- http://www.ifla.org/II/metadata.htm
- http://purl.oclc.org/metadata/dublin_core/
- http://purl.oclc.org/corc/
- http://lcweb.loc.gov/ead/
- http://sunsite.berkeley.edu/Metadata/sp2000.html
Data Structures: The VRA Core
- 28 elements specifically for visual resource collections
- Work Description Categories
- Visual Document Description Categories
- http://www.oberlin.edu/~art/vra/dsc.html
VRA Core: Work Description Categories
- Work type
- Title
- Measurements
- Material
- Technique
- Creator
- Role
- Date
- Repository name
- Repository place
- Repository number
- Current site
- Original site
- Style/period/group/movement
- Nationality/culture
- Subject
- Related work
- Relationship type
- Notes
VRA Core: Visual Document Description Categories
- Visual document type
- Visual document format
- Visual document measurements
- Visual document date
- Visual document owner
- Visual document owner number
- Visual document view description
- Visual document subject
- Visual document source
Thesaurus for Graphic Materials
- designed for subject indexing of pictorial materials, particularly large general collections of historical images
- for cataloging and retrieval
- good for general audiences and broad approaches to the material
- TGM-I: Subject Terms & TGM-II: Genre and Physical Characteristic Terms
- http://lcweb.loc.gov/rr/print/tgm/toc.html
AAT (Art & Architecture Thesaurus)
- 120,000 terms for describing objects, textual materials, images, architecture, and material culture from antiquity to present
- large and complex
- http://www.getty.edu/gri/vocabularies/
ULAN (Union List of Artist Names)
- name authority
- http://www.getty.edu/gri/vocabularies/
Thesaurus of Geographic Names
- over 1 million records
- hierarchical and global
- throughout history
- most records include coordinates and descriptive notes
Semantics/Syntax/Structure
- Semantics
  – meaning, as defined by a community to meet their particular needs (DC)
- Syntax
  – a systematic arrangement of data elements for machine processing
  – facilitates the exchange and use of metadata among various applications (HTML, XML, RDF)
- Structure
  – a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
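The date 1/11/99 shows why rules for field contents are needed: the same string yields two different dates depending on the convention applied. A small sketch, assuming two-digit years in the 1900s and using ISO 8601 as the unambiguous target form; the function name `to_iso` is hypothetical.

```python
from datetime import datetime


def to_iso(value: str, day_first: bool) -> str:
    """Normalize a d/m/yy or m/d/yy string to ISO 8601 (YYYY-MM-DD)."""
    fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
    return datetime.strptime(value, fmt).date().isoformat()


us_reading = to_iso("1/11/99", day_first=False)   # 11 January 1999
eu_reading = to_iso("1/11/99", day_first=True)    # 1 November 1999
```

Without a stated structure rule, two repositories could exchange the same field value and disagree about its meaning by almost ten months.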
Metadata Mapping
- Crosswalks
- Resource Description Framework (RDF)
Crosswalks
- mapping between differing metadata structures
- eliminate the need for monolithic, universally adopted standards
- focus on flexibility and interoperability
- RDF-based metadata registries
Crosswalk Example
CDWA | Object ID | CIMI Schema | FDA | VRA Core Categories | USMARC | Dublin Core
OBJECT/WORK (core) | | | Document Classification-Catalog Level (core); Document Classification-Group Type | | |
Object/Work-Type (core) | Type of Object | objectNAME | Document Classification-Document Type (core); Purpose-Purpose (Broad) (core); Purpose-Purpose (Narrow) | W1. Work Type | 655 Genre-Form | Type
Object/Work-Components quantity | | | Document Classification-Extent | | 300a Physical Description-Extent |
ORIENTATION/ARRANGEMENT Description | | | | | |
TITLES OR NAMES (core) | Title | objectTitle; bibliographicTitle | Group/Item Identification-Repository Title; Group/Item Identification-Descriptive Title (core); Group/Item Identification-Inscribed Title | W2. Title | 24Xa Title and Title-Related Information | Title
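In code, a crosswalk is a lookup table from one scheme's field names to another's. The sketch below uses only the two VRA Core rows from the example above (W1. Work Type and W2. Title) with their USMARC and Dublin Core equivalents; the function name `crosswalk` and the sample record values are hypothetical.

```python
from typing import Dict

# VRA Core category -> equivalent field in each target scheme (from the example rows).
CROSSWALK: Dict[str, Dict[str, str]] = {
    "W1. Work Type": {"USMARC": "655", "Dublin Core": "Type"},
    "W2. Title": {"USMARC": "24Xa", "Dublin Core": "Title"},
}


def crosswalk(record: Dict[str, str], target: str) -> Dict[str, str]:
    """Re-key a VRA Core record into the target scheme, dropping unmapped fields."""
    out: Dict[str, str] = {}
    for field_name, value in record.items():
        mapping = CROSSWALK.get(field_name)
        if mapping and target in mapping:
            out[mapping[target]] = value
    return out


# Hypothetical record, for illustration only.
vra_record = {"W1. Work Type": "photograph", "W2. Title": "Untitled view"}
dc_record = crosswalk(vra_record, "Dublin Core")
marc_record = crosswalk(vra_record, "USMARC")
```

Fields with no equivalent in the target scheme are silently dropped here, which is exactly the information loss a real crosswalk has to document.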
Resource Description Framework (RDF, spec released 2/99)
- W3C Metadata activity
- designed to move the Web beyond simple links to semantically-rich relationships between resources
- metadata application using XML as a common syntax for exchange and processing
- flexible architecture for managing diverse application-specific metadata packets that can be processed by machines
- associates resources, property types, and corresponding values
- http://www.w3.org/RDF/
RDF
- Resources (character strings, names, digital objects)
- Property (“is the author of”)
- Value
- resources+properties=relationships
- many different relationships can be reflected
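The resource/property/value model amounts to a list of three-part statements. A minimal sketch follows; the `Statement` type, the helper `values_of`, and the sample statements (including the pairing of this URL with these values) are assumptions for illustration, not part of the RDF specification.

```python
from typing import List, NamedTuple


class Statement(NamedTuple):
    resource: str        # the thing being described
    property_type: str   # the kind of relationship
    value: str           # what the property points to


# Hypothetical statements about one resource.
statements: List[Statement] = [
    Statement("http://www.gseis.ucla.edu/~howard", "creator", "Howard Besser"),
    Statement("http://www.gseis.ucla.edu/~howard", "title", "Home page"),
]


def values_of(stmts: List[Statement], resource: str, property_type: str) -> List[str]:
    """Return every value asserted for a resource under one property type."""
    return [s.value for s in stmts
            if s.resource == resource and s.property_type == property_type]


creators = values_of(statements, "http://www.gseis.ucla.edu/~howard", "creator")
```

Because every statement has the same shape, statements from different communities can sit in one list and be queried together.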
XML-encoded RDF
<?xml:namespace ns="http://www.w3.org/RDF/RDF" prefix="RDF" ?>
<?xml:namespace ns="http://purl.oclc.org/DC/" prefix="DC" ?>
<RDF:RDF>
  <RDF:Description>
    <DC:Creator>Howard Besser</DC:Creator>
  </RDF:Description>
</RDF:RDF>
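The namespace declarations above follow an early draft syntax. As a rough modern equivalent, the same statement can be produced with Python's standard library, using the namespace URIs from the later RDF and Dublin Core recommendations. The subject URL in `rdf:about` is an assumption made for illustration; the slide does not say which resource is being described.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the later W3C RDF and Dublin Core specifications.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

root = ET.Element(f"{{{RDF_NS}}}RDF")
description = ET.SubElement(
    root,
    f"{{{RDF_NS}}}Description",
    # Assumed subject of the statement; not given on the slide.
    {f"{{{RDF_NS}}}about": "http://www.gseis.ucla.edu/~howard"},
)
creator = ET.SubElement(description, f"{{{DC_NS}}}creator")
creator.text = "Howard Besser"

xml_text = ET.tostring(root, encoding="unicode")
```

The structure is unchanged from the slide: a description of one resource carrying one Dublin Core property and its value.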