Embedded Metadata

“That media is broken right now. And I think the embedding process has brought the media to an all time low.” - Amy Goodman on Bill Moyers Journal, April 5, 2009

David Rice | www.avpreserve.com | AMIA 2010, Philadelphia

David Rice, AudioVisual Preservation Solutions, AMIA 2010, Philadelphia, November 4th, 2010
Why Embedded Metadata?

- What is this content?
- Where did it come from?
- How can I get additional information about this?
- What rights do I have to this content?

Music Metadata Example: ID3

- Metadata 1 - embedded in a web page, searchable

- Metadata 2 (ID3) - embedded within file, not searchable but exchangeable

- Metadata 3 - Derivative of ID3 tags in Metadata 2. Facilitates collection management and local search and browse.

Securing the relationship between metadata and essence
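To make "embedded within file" concrete, here is a minimal Python sketch that reads an ID3v1 tag, the fixed 128-byte block that trails the audio data. The slide does not say which ID3 version it has in mind; ID3v1 is used here only because its layout is the simplest. The function names and the sample bytes are illustrative assumptions.

```python
from typing import Optional


def make_id3v1(title: str, artist: str, album: str, year: str,
               comment: str = "", genre_index: int = 255) -> bytes:
    """Build a 128-byte ID3v1 tag (hypothetical helper for the demo)."""
    def pad(text: str, width: int) -> bytes:
        return text.encode("latin-1")[:width].ljust(width, b"\x00")

    return (b"TAG" + pad(title, 30) + pad(artist, 30) + pad(album, 30)
            + pad(year, 4) + pad(comment, 30) + bytes([genre_index]))


def read_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3v1 fields found in the last 128 bytes, or None."""
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre_index": tag[127],
    }


# Stand-in bytes for the essence; a real file would hold MPEG audio frames.
essence = b"\xff\xfb" + b"\x00" * 64
tagged = essence + make_id3v1("Camels at a Zoo", "Unknown", "", "2004")
print(read_id3v1(tagged))
```

The tag travels with the essence when the file is copied, which is the "exchangeable" property named above; the essence bytes themselves are untouched.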

• ffmpeg -i camels.avi -vn -acodec libfaac -ab 64k -ac 2 temp.aac
• ffmpeg -an -deinterlace -i camels.avi -s 320x240 -r 20 -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - 2>/dev/null | ffmpeg -an -f rawvideo -s 320x240 -r 20 -i - -f yuv4mpegpipe - 2>/dev/null | x264 --bitrate 512 --vbv-maxrate 768 --vbv-bufsize 1024 --profile baseline --pass 1 /dev/stdin --demuxer y4m -o temp.h264
• ffmpeg -an -deinterlace -i camels.avi -s 320x240 -r 20 -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - 2>/dev/null | ffmpeg -an -f rawvideo -s 320x240 -r 20 -i - -f yuv4mpegpipe - 2>/dev/null | x264 --bitrate 512 --vbv-maxrate 768 --vbv-bufsize 1024 --profile baseline --pass 2 /dev/stdin --demuxer y4m -o temp.h264
• mp4creator -c temp.h264 -r 20 t2.mp4
• mp4creator -c temp.aac -interleave t2.mp4
• ffmpeg -i t2.mp4 -acodec copy -vcodec copy -metadata title="Camels at a Zoo - http://www.archive.org/details/camels" -metadata year="2004" -metadata comment="license:http://creativecommons.org/licenses/by-nc/3.0/" camels_512kb.mp4
• mp4creator -optimize camels_512kb.mp4
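Only the sixth command embeds metadata; the others transcode and package. As a sketch, that step can be generated from a dictionary of tags so the same values can be reused across files. The helper name `ffmpeg_metadata_command` is hypothetical; the flags are the ones shown on the slide, and the sketch only builds the argument list without running ffmpeg.

```python
import shlex


def ffmpeg_metadata_command(src: str, dst: str, tags: dict) -> list:
    """Build an ffmpeg argument list that stream-copies src to dst
    and adds one -metadata key=value pair per tag."""
    cmd = ["ffmpeg", "-i", src, "-acodec", "copy", "-vcodec", "copy"]
    for key, value in tags.items():
        cmd += ["-metadata", f"{key}={value}"]
    cmd.append(dst)
    return cmd


cmd = ffmpeg_metadata_command(
    "t2.mp4",
    "camels_512kb.mp4",
    {
        "title": "Camels at a Zoo - http://www.archive.org/details/camels",
        "year": "2004",
        "comment": "license:http://creativecommons.org/licenses/by-nc/3.0/",
    },
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, assuming an ffmpeg build that accepts these options is installed.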

qtmux

qtmux --metadatatype metadata -o out.mov inVideo.mov inAudio.mov

use -siv to activate save-in-place

multiple metadatatypes and input videos welcome

accepted metadatatypes: Album, Artist, Author, Chapter, Comment, Composer, Copyright, CreationDate, Description, Director, Disclaimer, EncodedBy, FullName, Genre, HostComputer, Information, Keywords, Make, Model, OriginalArtist, OriginalFormat, OriginalSource, Performers, Producer, Product, Software, SpecialPlaybackRequirements, Track, Warning, Writer, URLLink, EditDate

also accepted are metadatatypes of 4 characters starting with @
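A small Python sketch of the acceptance rule described above, useful for checking a batch of field names before calling qtmux. It assumes that names are case-sensitive and that "4 characters starting with @" means four characters in total including the @; neither point is confirmed by the slide. The function name is hypothetical.

```python
# Names copied from the list of accepted metadatatypes above.
ACCEPTED_METADATATYPES = frozenset("""
Album Artist Author Chapter Comment Composer Copyright CreationDate
Description Director Disclaimer EncodedBy FullName Genre HostComputer
Information Keywords Make Model OriginalArtist OriginalFormat
OriginalSource Performers Producer Product Software
SpecialPlaybackRequirements Track Warning Writer URLLink EditDate
""".split())


def is_accepted_metadatatype(name: str) -> bool:
    """True for a listed name or a 4-character code starting with '@'.

    Assumption: the 4 characters include the leading '@'.
    """
    return name in ACCEPTED_METADATATYPES or (
        len(name) == 4 and name.startswith("@"))


for candidate in ["Copyright", "@cpy", "Colour", "@toolong"]:
    print(candidate, is_accepted_metadatatype(candidate))
```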

qtmedia
-File- "/Users/daverice/Downloads/B03774-12.mov_bitc.mov"
-Info-
  Hinted: No
  Ref Movie: No
  Size: 600.00 x 480.00
  Duration: 32.00 seconds
  DataRate: 90077.97 bytes/secs
  BitRate: 0 bits/secs
  User Data:
    ©enc: WITNESS Media Archive
    oris: E007653 (WITNESS Media Archive); Prior source: B03774 (WITNESS Media Archive); FCP Reel Value: B03774
    cmmt: Master (Generation).
    name: [Screening of ON THE FRONTLINES in Gasorwe, Nuyinga, and Mwaro] (3869)
    perf: Bukeni Tete Waruzi
    genr: Raw
    prod: P-OTF On the Frontlines: Child Soldiers in the DRC.
    keyw: Child Soldiers, Film and Video Screenings, WITNESS Events, WITNESS Partners, WITNESS, Burundi, Africa, Gasorwe, Nuyinga, Democratic Republic of Congo (formerly Zaire), Mwaro
    cprt: WITNESS/AJEDI-ka

http://mediainfo.sourceforge.net/

VideoCount : 1
AudioCount : 1
TextCount : 11
Video_Format_List : AVC
Video_Language_List : en
Audio_Format_List : DTS
Text_Format_List : ASS / ASS / ASS / ASS / ASS / ASS / ASS / ASS / ASS / ASS / ASS
Text_Language_List : en / fr / es / sv / da / fi / hu / ro / de / cs / sl
FolderName : /Users/daverice/Projects/samples/samples2
FileName : matroska+h264+dca+0x0000+memory-usage
FileExtension : mkv
Format : Matroska
FileSize : 262144000
Duration : 8437088
Duration/String3 : 02:20:37.088
OverallBitRate : 248563
Encoded_Date : UTC 2007-05-30 07:44:45
File_Modified_Date : UTC 2010-02-21 12:59:57
File_Modified_Date_Local : 2010-02-21 07:59:57
Encoded_Application : mkvmerge v2.0.2 ('You're My Flame') built on Feb 21 2007 23:40:55
Encoded_Library : libebml v0.7.7 + libmatroska v0.8.1
Encoded_Library/String : libebml v0.7.7 + libmatroska v0.8.1
Cover : Yes
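A report like this is easy to load into a dictionary for comparison against a catalog record. The sketch below assumes one "Key : Value" pair per line and splits on the first " : " only; `parse_fields` is a hypothetical helper, not part of MediaInfo. It also shows that OverallBitRate can be re-derived from FileSize (bytes) and Duration (milliseconds).

```python
def parse_fields(report: str) -> dict:
    """Parse 'Key : Value' lines; lines without ' : ' are skipped."""
    fields = {}
    for line in report.splitlines():
        key, separator, value = line.partition(" : ")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the report above.
sample = """\
Format : Matroska
FileSize : 262144000
Duration : 8437088
OverallBitRate : 248563
Encoded_Application : mkvmerge v2.0.2 ('You're My Flame') built on Feb 21 2007 23:40:55
"""

general = parse_fields(sample)

# bits per second = bytes * 8 / seconds, with Duration given in milliseconds
estimated = round(int(general["FileSize"]) * 8 * 1000 / int(general["Duration"]))
print(estimated, general["OverallBitRate"])
```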

http://mediainfo.sourceforge.net/

Video
UniqueID : 1
Format : AVC
Format_Profile : [email protected]
Format_Settings : CABAC / 8 Ref Frames
CodecID : V_MPEG4/ISO/AVC
Duration/String3 : 02:20:37.104
Width : 1920
Height : 800
PixelAspectRatio : 1.000
DisplayAspectRatio : 2.400
FrameRate : 23.976
ColorSpace : YUV
ChromaSubsampling : 4:2:0
BitDepth : 8
ScanType : Progressive
Delay/String3 : 00:00:00.000
Title : Letters From Iwo Jima (2006)
Encoded_Library/Name : x264
Encoded_Library/Version : core 55
Encoded_Library_Settings : cabac=1 / ref=5 / deblock=1:-6:-6 / analyse=0x3:0x133 / me=umh / subme=7 / brdo=1 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=2 / deadzone=21,11 / chroma_qp_offset=0 / threads=3 / nr=0 / decimate=1 / mbaff=0 / bframes=3 / b_pyramid=1 / b_adapt=1 / b_bias=0 / direct=3 / wpredb=1 / bime=1 / keyint=250 / keyint_min=25 / scenecut=40(pre) / rc=2pass / bitrate=6546 / ratetol=1.0 / rceq='blurCplx^(1-qComp)' / qcomp=0.60 / qpmin=10 / qpmax=51 / qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40 / pb_ratio=1.30 / aq=1:0.3:15.0
Language : en

http://mediainfo.sourceforge.net/

Audio
UniqueID : 881684881
Format : DTS
Format/Info : Digital Theater Systems
CodecID : A_DTS
Duration/String3 : 02:20:37.088
BitRate_Mode : CBR
BitRate : 1509750
Channel(s) : 6
ChannelPositions : Front: L C R, Side: L R, LFE
SamplingRate : 48000
SamplingCount : 404980224
Resolution : 24
BitDepth : 24
Delay/String3 : 00:00:00.000
Delay_Source : Container
StreamSize : 1592236701
Title : Japanese DTS 5.1 (1.5mbps)

Text #1
UniqueID : 3861542542
Format : ASS
CodecID : S_TEXT/ASS
CodecID/Info : Advanced Sub Station Alpha
Codec : S_TEXT/ASS
Language : en
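The audio fields are internally consistent, and that can be checked mechanically: SamplingCount divided by SamplingRate gives the duration reported in Duration/String3. A short Python sketch, with `format_ms` as a hypothetical helper:

```python
def format_ms(milliseconds: int) -> str:
    """Format a millisecond count as HH:MM:SS.mmm."""
    hours, rest = divmod(milliseconds, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# Values copied from the Audio section above.
sampling_count = 404980224
sampling_rate = 48000

# samples / (samples per second) = seconds; scaled to integer milliseconds
duration_ms = sampling_count * 1000 // sampling_rate
print(duration_ms, format_ms(duration_ms))
```

The result matches both the audio Duration/String3 value and the Duration of 8437088 in the general section of the report.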

Categories of Embedded Metadata

Organization Information

TOTAL_PARTS: Total number of parts defined at the first lower level. (e.g. if TargetType is ALBUM, the total number of tracks of an audio CD)

PART_NUMBER: Number of the current part of the current level. (e.g. if TargetType is TRACK, the track number of an audio CD)

PART_OFFSET: A number to add to PART_NUMBER when the parts at that level don't start at 1. (e.g. if TargetType is TRACK, the track number of the second audio CD)

Source: http://www.matroska.org/technical/specs/tagging/index.html
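A one-line reading of PART_OFFSET, as a Python sketch. The function name and the 12-track example are assumptions made for illustration; the only rule taken from the definitions above is that the offset is added to PART_NUMBER.

```python
def absolute_part_number(part_number: int, part_offset: int = 0) -> int:
    """Position of a part once PART_OFFSET is added to PART_NUMBER."""
    return part_number + part_offset


# Hypothetical two-disc set: the first disc holds 12 tracks, so tracks on
# the second disc carry PART_OFFSET = 12.
print(absolute_part_number(3, part_offset=12))
```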

Categories of Embedded Metadata

Titles

TITLE: The title of this item. For example, for music you might label this "Canon in D", or for a video's audio track you might use "English 5.1". This is akin to the TIT2 tag in ID3.

SUBTITLE: Subtitle of the entity.

Temporal Information

DATE_RELEASED: The time that the item was originally released. This is akin to the TDRL tag in ID3.

DATE_RECORDED: The time that the recording began. This is akin to the TDRC tag in ID3.

DATE_ENCODED: The time that the encoding of this item was completed. This is akin to the TDEN tag in ID3.

DATE_TAGGED: The time that the tags were done for this item. This is akin to the TDTG tag in ID3.

DATE_DIGITIZED: The time that the item was transferred to a digital medium. This is akin to the IDIT tag in RIFF.

DATE_WRITTEN: The time that the writing of the music/script began.

DATE_PURCHASED: Information on when the file was purchased (see also purchase tags).

Source: http://www.matroska.org/technical/specs/tagging/index.html
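Title and date tags like these are usually written to a small XML file and then merged into the Matroska container. The Python sketch below builds such a file; the element names (Tags, Tag, Targets, TargetTypeValue, Simple, Name, String) follow the tag XML layout used by mkvmerge as I understand it and should be checked against the mkvmerge documentation before use. The function name and tag values are illustrative.

```python
import xml.etree.ElementTree as ET


def build_tags_xml(simple_tags: dict, target_type_value: int = 50) -> str:
    """Serialize name/value pairs as one Tag element with Simple children.

    Assumption: 50 is the target level used for a whole movie or album.
    """
    tags = ET.Element("Tags")
    tag = ET.SubElement(tags, "Tag")
    targets = ET.SubElement(tag, "Targets")
    ET.SubElement(targets, "TargetTypeValue").text = str(target_type_value)
    for name, value in simple_tags.items():
        simple = ET.SubElement(tag, "Simple")
        ET.SubElement(simple, "Name").text = name
        ET.SubElement(simple, "String").text = value
    return ET.tostring(tags, encoding="unicode")


xml_text = build_tags_xml({
    "TITLE": "Letters From Iwo Jima (2006)",
    "DATE_ENCODED": "2007-05-30",
})
print(xml_text)
```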

Categories of Embedded Metadata

Search and Classification

GENRE: The main genre (classical, ambient-house, synthpop, sci-fi, drama, etc). The format follows the infamous TCON tag in ID3.

MOOD: Intended to reflect the mood of the item with a few keywords, e.g. "Romantic", "Sad" or "Uplifting". The format follows that of the TMOO tag in ID3.

ORIGINAL_MEDIA_TYPE: Describes the original type of the media, such as, "DVD", "CD", "computer image," "drawing," "lithograph," and so forth. This is akin to the TMED tag in ID3.

CONTENT_TYPE: The type of the item. e.g. Documentary, Feature Film, Cartoon, Music Video, Music, Sound FX, ...

SUBJECT: Describes the topic of the file, such as "Aerial view of Seattle."

DESCRIPTION: A short description of the content, such as "Two birds flying."

KEYWORDS: Keywords to the item separated by a comma, used for searching.

SUMMARY: A plot outline or a summary of the story.

SYNOPSIS: A description of the story line of the item.

INITIAL_KEY: The initial key that a musical track starts in. The format is identical to ID3.

PERIOD: Describes the period that the piece is from or about. For example, "Renaissance".

LAW_RATING: Depending on the country it's the format of the rating of a movie (P, R, X in the USA, an age in other countries or a URI defining a logo).

ICRA: The ICRA content rating for parental control. (Previously RSACi)

Source: http://www.matroska.org/technical/specs/tagging/index.html

Categories of Embedded Metadata

Rights and Legal Information

COPYRIGHT: The copyright information as per the copyright holder. This is akin to the TCOP tag in ID3.

PRODUCTION_COPYRIGHT: The copyright information as per the production copyright holder. This is akin to the TPRO tag in ID3.

TERMS_OF_USE: The terms of use for this item. This is akin to the USER tag in ID3.

Technical Information

ENCODER: The software or hardware used to encode this item. ("LAME" or "XviD")

ENCODER_SETTINGS: A list of the settings used for encoding this item. No specific format.

BPS: The average bits per second of the specified item. This is only the data in the Blocks, and excludes headers and any container overhead.

BPM: Average number of beats per minute in the complete target (e.g. a chapter). Usually a decimal number.

MEASURE: In music, a measure is a unit of time in Western music like "4/4". It represents a regular grouping of beats, a meter, as indicated in musical notation by the time signature. The majority of the contemporary rock and pop music you hear on the radio these days is written in the 4/4 time signature.

TUNING: It is saved as a frequency in hertz to allow near-perfect tuning of instruments to the same tone as the musical piece (e.g. "441.34" in Hertz). The default value is 440.0 Hz.

Source: http://www.matroska.org/technical/specs/tagging/index.html
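The BPS definition (payload bits divided by duration, container overhead excluded) can be sketched in Python and tried against the MediaInfo audio stream shown earlier in this deck, where StreamSize is 1592236701 bytes and the duration is 8437088 ms. The function name is hypothetical, and treating StreamSize as the block payload is an assumption.

```python
def average_bps(payload_bytes: int, duration_ms: int) -> int:
    """Average bits per second of the payload, rounded to an integer."""
    return round(payload_bytes * 8 * 1000 / duration_ms)


# StreamSize and duration copied from the MediaInfo audio stream example.
print(average_bps(1592236701, 8437088))
```

The result equals the BitRate of 1509750 that MediaInfo reports for the same stream.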

Categories of Embedded Metadata

Other Metadata Entities

ARTIST, LEAD_PERFORMER, ACCOMPANIMENT, COMPOSER, ARRANGER, LYRICIST, CONDUCTOR, DIRECTOR, ASSISTANT_DIRECTOR, DIRECTOR_OF_PHOTOGRAPHY, SOUND_ENGINEER, ART_DIRECTOR, PRODUCTION_DESIGNER, CHOREOGRAPHER, COSTUME_DESIGNER, ACTOR

CHARACTER, WRITTEN_BY, SCREENPLAY_BY, EDITED_BY, PRODUCER, COPRODUCER, EXECUTIVE_PRODUCER, DISTRIBUTED_BY, MASTERED_BY, ENCODED_BY, MIXED_BY, REMIXED_BY, PRODUCTION_STUDIO, THANKS_TO, PUBLISHER, LABEL

Source: http://www.matroska.org/technical/specs/tagging/index.html

Embedded Metadata Crosswalks
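A crosswalk can be as simple as a lookup table. The Python sketch below uses only the Matroska-to-ID3 equivalences quoted on the category slides ("akin to the ... tag in ID3"); it is not a complete or authoritative mapping, and the function name is hypothetical. Tags with no known equivalent are returned separately so nothing is dropped silently.

```python
# Equivalences stated on the Matroska tagging slides in this deck.
MATROSKA_TO_ID3 = {
    "TITLE": "TIT2",
    "DATE_RELEASED": "TDRL",
    "DATE_RECORDED": "TDRC",
    "DATE_ENCODED": "TDEN",
    "DATE_TAGGED": "TDTG",
    "GENRE": "TCON",
    "MOOD": "TMOO",
    "ORIGINAL_MEDIA_TYPE": "TMED",
    "COPYRIGHT": "TCOP",
    "PRODUCTION_COPYRIGHT": "TPRO",
    "TERMS_OF_USE": "USER",
}


def crosswalk(tags: dict, mapping: dict) -> tuple:
    """Rename tags via mapping; return (mapped, unmapped) dictionaries."""
    mapped, unmapped = {}, {}
    for name, value in tags.items():
        target = mapping.get(name)
        if target is None:
            unmapped[name] = value
        else:
            mapped[target] = value
    return mapped, unmapped


mapped, unmapped = crosswalk(
    {"TITLE": "Canon in D", "GENRE": "classical", "PERIOD": "Renaissance"},
    MATROSKA_TO_ID3,
)
print(mapped)
print(unmapped)
```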

Validation Requirements

Protection of the Essence vs. Chunk Ordering

Federal Agencies Digitization Guidelines Initiative (FADGI)

BWF MetaEdit 1.0.0
Usage: "bwfmetaedit [--Options...] FileName1 [Filename2...]"

Options:
--Help, -h Display this help and exit
--Version Display AVWG BWF MetaEdit version and exit

*******************************************************************************

Reject file if:
--reject-riff2rf64 transformation to RF64 is requested
--reject-overwrite existing data must not be overwritten (only add)

Accept file if:
--accept-nopadding padding byte is missing

Continue parsing if:
--continue-errors there is an error in one or more input files

File modification options:
--append, -a place new or expanded chunks at the end of the file

--verbose, -v Display more details about modified values

--simulate, -s Simulate only (no write)

*******************************************************************************

Extract Technical Metadata to:
--out-tech Display technical data
--out-tech= specified file in CSV format

*******************************************************************************

Fill files with:
--in-core= data from the specified Core file
--in-core-remove clear data (remove bext and INFO)
--Description= specified bext description
--Originator= specified bext originator
--OriginatorReference= specified bext originator reference
--OriginationDate= specified bext origination date
--OriginationTime= specified bext origination time
--Timereference= specified bext time reference
--UMID= specified bext umid
--History= specified bext history
--IARL= specified INFO IARL
--ISFT= specified INFO ISFT
--xxxx= specified INFO xxxx...
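The bext options above map onto fixed-width fields in the Broadcast Wave bext chunk. The Python sketch below packs and unpacks the 602-byte fixed part using the field widths from EBU Tech 3285 as I recall them (256, 32, 32, 10 and 8 bytes of text, a 64-bit time reference, a 16-bit version, a 64-byte UMID, then 190 bytes of loudness and reserved space), followed by the free-length coding history. It is not BWF MetaEdit's code; the function names and sample values are hypothetical.

```python
import struct

# Fixed part of the bext payload, little-endian, no alignment padding.
BEXT_FIXED = "<256s32s32s10s8sQH64s190s"


def pack_bext(description, originator, originator_reference,
              origination_date, origination_time, time_reference,
              history=""):
    """Build a bext payload; short text fields are NUL-padded by struct."""
    def field(text, width):
        return text.encode("ascii")[:width]

    fixed = struct.pack(
        BEXT_FIXED,
        field(description, 256), field(originator, 32),
        field(originator_reference, 32), field(origination_date, 10),
        field(origination_time, 8), time_reference,
        1,      # version
        b"",    # UMID left empty
        b"")    # loudness and reserved bytes left as zeros
    return fixed + history.encode("ascii")


def unpack_bext(payload):
    """Return the bext fields, keyed by the option names listed above."""
    (description, originator, reference, date, time,
     time_reference, _version, _umid, _rest) = struct.unpack_from(
        BEXT_FIXED, payload)

    def text(raw):
        return raw.split(b"\x00", 1)[0].decode("ascii")

    history = payload[struct.calcsize(BEXT_FIXED):]
    return {
        "Description": text(description),
        "Originator": text(originator),
        "OriginatorReference": text(reference),
        "OriginationDate": text(date),
        "OriginationTime": text(time),
        "Timereference": time_reference,
        "History": text(history),
    }


payload = pack_bext("Example description", "Example Archive", "EX-0001",
                    "2010-11-04", "09:30:00", 0,
                    history="A=PCM,F=48000,W=24,M=stereo")
print(unpack_bext(payload))
```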

Extract Core Document to:
--out-core current display (std::cout) in CSV format
--out-core= specified file in CSV format
--out-core-xml filename.core.xml (1 output per file) in XML format

*******************************************************************************

--in-XMP= Insert XMP from the specified file
--in-XMP-remove Remove XMP
--in-XMP-xml Insert XMP from filename._PMX.xml
--out-XMP-xml Save XMP in filename._PMX.xml

*******************************************************************************

--in-aXML= Insert aXML from the specified file
--in-aXML-remove Remove aXML
--in-aXML-xml Insert aXML from filename.aXML.xml
--out-aXML-xml Save aXML in filename.aXML.xml

*******************************************************************************

--in-iXML= Insert iXML from the specified file
--in-iXML-remove Remove iXML
--in-iXML-xml Insert iXML from filename.iXML.xml
--out-iXML-xml Save iXML in filename.iXML.xml

*******************************************************************************

--MD5-Evaluate Evaluate MD5 for audio data
--MD5-Verify Verify MD5 for audio data
--MD5-Embed Embed MD5 for audio data
--MD5-Embed-Overwrite Embed MD5 for audio data - Allow overwriting
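The MD5 options refer to the audio data only, which is what separates protection of the essence from chunk ordering: a whole-file checksum changes whenever a metadata chunk is added or moved, while a checksum of the data chunk does not. The Python sketch below illustrates the idea on two tiny in-memory WAVE files. It is not BWF MetaEdit's implementation, does not handle RF64, and all names in it are hypothetical.

```python
import hashlib
import struct


def chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """One RIFF chunk: id, little-endian size, payload, pad to even length."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def wave_file(chunks) -> bytes:
    body = b"WAVE" + b"".join(chunks)
    return b"RIFF" + struct.pack("<I", len(body)) + body


def iter_chunks(data: bytes):
    """Yield (id, payload) for each top-level chunk of a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size % 2)


def essence_md5(data: bytes) -> str:
    """MD5 of the data chunk payload only."""
    for chunk_id, payload in iter_chunks(data):
        if chunk_id == b"data":
            return hashlib.md5(payload).hexdigest()
    raise ValueError("no data chunk found")


# PCM, mono, 48 kHz, 16-bit format chunk and 256 bytes of stand-in audio.
fmt = chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 48000, 96000, 2, 16))
audio = chunk(b"data", bytes(range(256)))
info = chunk(b"LIST", b"INFO" + chunk(b"ISFT", b"example\x00"))

metadata_first = wave_file([fmt, info, audio])
metadata_last = wave_file([fmt, audio, info])  # as if appended at the end

print(hashlib.md5(metadata_first).hexdigest() ==
      hashlib.md5(metadata_last).hexdigest())
print(essence_md5(metadata_first) == essence_md5(metadata_last))
```

The first comparison is false and the second is true: the two files differ as byte streams, yet the essence checksum is identical.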

BWF MetaEdit
