Multimedia Semantics: Metadata, Analysis and Interaction Raphaël Troncy <[email protected] > Multimedia Semantics, EURECOM (FR)

Multimedia Semantics - SSMS 2010


Multimedia Semantics: Metadata, Analysis and Interaction. Lecture Talk at the 5th Summer School on Multimedia Semantics (SSMS), August 2010, Amsterdam, The Netherlands


Page 1: Multimedia Semantics - SSMS 2010

Multimedia Semantics: Metadata, Analysis and Interaction

Raphaël Troncy <[email protected]> Multimedia Semantics, EURECOM (FR)

Page 2: Multimedia Semantics - SSMS 2010

31/08/2010 - Multimedia Semantics: Metadata, Analysis and Interaction -SSMS 2010 - 2

Some BIG numbers

User Generated Content (July 2010)
  4.3+ billion photos (50% are public, 30% are tagged)
  30+ billion photos (2.5 billion per month)
  110+ million videos
  24 hours of video uploaded per minute ≈ 90,000 full-length movies per week
  2 billion videos served a day

Archived TV content
  1.5 million hours ≈ 120 km of shelves
  300,000 hours | 1 petabyte / year

News content

Content is difficult to search and reuse
  Barely visible to search engines
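
The upload figure above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the two numbers quoted on the slide; the implied average movie length is derived here and is not stated in the source.

```python
# Upload rate quoted on the slide: 24 hours of video per minute.
HOURS_UPLOADED_PER_MINUTE = 24
MINUTES_PER_WEEK = 60 * 24 * 7

# Total hours of video uploaded in one week.
hours_per_week = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_WEEK

# The slide equates this to about 90,000 full-length movies,
# which implies the average movie length computed below.
MOVIES_PER_WEEK = 90_000
implied_movie_hours = hours_per_week / MOVIES_PER_WEEK

print(hours_per_week)                 # 241920
print(round(implied_movie_hours, 2))  # 2.69
```

So the equivalence holds if a "full-length movie" is taken to run a little under three hours.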

Page 3: Multimedia Semantics - SSMS 2010

Why is it so difficult to find appropriate multimedia content, to reuse and repurpose content previously published and to present this content in interfaces that vary with user needs?

Page 4: Multimedia Semantics - SSMS 2010

Image/Video indexing

Techniques used by mainstream search engines
  The search term occurs in the filename, in the caption or in user tags
  No semantics

Image indexing: main problem
  An image is not alphabetic: there are no countable discrete units that, in combination, provide the meaning of the image
  Image descriptors are not given with the image: one needs to extract or interpret them

Video indexing: additional problem
  A video additionally has a temporal dimension to take into account
  A video has a priori no discrete units either (i.e. frames, shots and sequences cannot be absolutely defined)

Page 5: Multimedia Semantics - SSMS 2010


Sounds Familiar?

[Arnold Smeulders, PAMI, 2000] The semantic gap is the lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation.

Page 6: Multimedia Semantics - SSMS 2010

A little drop of semantics goes a long way

Jim Hendler [1997]

Page 7: Multimedia Semantics - SSMS 2010

From signal … to symbols … to meaning … to users

Applications: Security in Multimedia, Multimedia on the Web

Multimedia Research Themes @EURECOM


Content Analysis Content Modeling & Indexing

Multimedia Semantics & Interaction

Audio processing

Video Segmentation

Emotion Recognition

Video Indexation

Video Summarization

Facial+Body Biometrics

Semantic Web

Social networks

Multimedia Interaction


Page 8: Multimedia Semantics - SSMS 2010

Learning Objectives

Learn how to get metadata (machine learning)
  (Semantic) multimedia analysis … or the science of labeling
  (Semantic) audio processing (ASR + NER + background knowledge)

Explore various multimedia metadata formats
  Be aware of the advantages and limitations of various models
  Know the interoperability issues and understand COMM, a Core Ontology for Multimedia
  Learn about the W3C Ontology for Media Resources

Discuss exploratory interfaces based on rich multimedia metadata semantics
  Know how to link and expose your data on the web
  See various multimedia presentation interfaces

Page 9: Multimedia Semantics - SSMS 2010

Agenda

1. Semantics in multimedia analysis
  Detecting concepts in video and speech
  Evaluating interactive search tasks

2. Semantics in metadata
  MPEG-7 based ontologies and COMM: a Core Ontology for Multimedia
  Expose your data following 4 basic principles and re-use a growing amount of publicly open datasets

3. Semantics in user interfaces
  Provide meaningful presentation of underlying data
  HTML5: a game changer for video on the web
  Event-centric interfaces for browsing rich media collections

Page 10: Multimedia Semantics - SSMS 2010


Overview of Canonical Processes

Page 11: Multimedia Semantics - SSMS 2010


Canonical Processes Possible Flow

Page 12: Multimedia Semantics - SSMS 2010


The Importance of the Annotations

Page 13: Multimedia Semantics - SSMS 2010


The science of labeling

Automatically detecting the presence of a concept in a video stream

Naming visual information

airplane

Page 14: Multimedia Semantics - SSMS 2010


The Computer Vision Approach

Building detectors one at a time

a face detector for frontal faces

a face detector for non-frontal faces

3 years later

One (or more) PhD for every new concept

Page 15: Multimedia Semantics - SSMS 2010


So how about these?

[Cees Snoek and Marcel Worring, SSMS, 2007]

Page 16: Multimedia Semantics - SSMS 2010


A Simple Concept Detector

[Cees Snoek and Marcel Worring, SSMS, 2007]

Page 17: Multimedia Semantics - SSMS 2010


Support Vector Machine

[Cees Snoek and Marcel Worring, SSMS, 2007]

Page 18: Multimedia Semantics - SSMS 2010


Supervised Learner

[Cees Snoek and Marcel Worring, SSMS, 2007]

Page 19: Multimedia Semantics - SSMS 2010

NIST TRECVID Evaluation

Until 2001, everybody defined their own concepts
  Using specific and small data sets
  Hard to compare methodologies

Since 2001, worldwide evaluation by NIST
  Promote progress in video retrieval search
  Provide common datasets (shots, ASR, key frames)
  Use open, metrics-based evaluation

Large-Scale Concept Ontology for Multimedia

Page 20: Multimedia Semantics - SSMS 2010

Success and Criticism

More and more concept detectors available:
  TRECVID 2005: 101 concept lexicon
  TRECVID 2006: 491 concept lexicon
  MediaMill Challenge 2007: 572 concept lexicon

... but the focus is on the final result
  Evaluation measures the relative merit of indexing methods and ignores intermediary steps, while systems become more complex (several features and learning methods)

... but the concept detectors developed do not match user information needs

Page 21: Multimedia Semantics - SSMS 2010

TRECVID Interactive Video Search Task

Query selection: by keyword, by concept, by example

Topics unknown

Test set
  English (2004)
  Chinese (2005-6)
  Dutch (2007-8-9)

Page 22: Multimedia Semantics - SSMS 2010

VideOlympics

Benchmark performance cannot be the sole criterion
  Experience of the searcher counts
  Usability of systems matters

VideOlympics: live interactive search task
  Simultaneous exposure of video retrieval systems
  Showcase that goes beyond a regular demo session

Fun to do (participants) & fun to watch (audience)

Page 23: Multimedia Semantics - SSMS 2010

VideOlympics Setup

One display
TRECVID-like queries
Results pushed by searchers

Page 24: Multimedia Semantics - SSMS 2010

How to make video viewable to the blind?

What is required to make video accessible on the Web?

How to increase the number of accessible videos?

Technologies:
  Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
  Addressing: pointing to, retrieving, transmitting only parts of media
  Rendering: video visualization for the impaired, Braille output

Expected benefits for:
  disabled people, getting better access to video
  video providers, reaching a wider audience
  the Web in general, using semantic annotations
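
The "Addressing" item can be illustrated with the temporal selector of the W3C Media Fragments URI syntax (`#t=start,end`, in seconds). This is a minimal sketch: the slide does not say which addressing scheme is used, so the choice of syntax, the helper names and the example URL are assumptions made for illustration.

```python
def _format_seconds(seconds: float) -> str:
    """Write whole seconds without a trailing '.0' (81.0 -> '81')."""
    value = float(seconds)
    return str(int(value)) if value.is_integer() else str(value)


def temporal_fragment(media_uri: str, start_s: float, end_s: float) -> str:
    """Return media_uri with a temporal fragment selecting [start_s, end_s)."""
    if start_s < 0 or end_s <= start_s:
        raise ValueError("expected 0 <= start_s < end_s")
    # Drop any existing fragment before appending the temporal one.
    base = media_uri.split("#", 1)[0]
    return f"{base}#t={_format_seconds(start_s)},{_format_seconds(end_s)}"


# Hypothetical video URL, used only for illustration.
print(temporal_fragment("http://example.org/video.ogv", 81, 95.5))
# http://example.org/video.ogv#t=81,95.5
```

A player or server that understands the fragment can then render or transmit only the addressed part instead of the whole file.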

Page 25: Multimedia Semantics - SSMS 2010

ACAV: Collaborative Annotation for Video Accessibility

Produce (semantic) annotations of multimedia content:
  Automatically: speaker diarization, speech recognition
  Manually: collaborative annotations, template

Generate multimodal presentation of annotated content
  Subtitles / Surtitles / Closed captioning
  Braille output
  Media Fragment access

Page 26: Multimedia Semantics - SSMS 2010

Accessibility Features for Visually Impaired and Blind People

[Figure: annotation tracks rendered alongside the audio track as Braille, audio description and auditory icons]

Annotations:
  Characters: The mother, her son | The man and his friend | The son, the man
  Scenery: In the shop | In the street
  Man’s actions: Put on his shoes | Walk in the street
  Son’s actions: Look at his mother

The multimodal presentation of annotations depends on video context and user preferences

Page 27: Multimedia Semantics - SSMS 2010

Accessibility Features for Deaf People

[Figure: annotation tracks rendered over the video track as subtitles and surtitles]

Annotations:
  Mother’s dialogues: How are you?
  Son’s dialogues: Hi mom | Fine and you?
  Sound: Car horn

The presentation of annotations depends on video context and user preferences

Page 28: Multimedia Semantics - SSMS 2010

Social annotations

Annotation corrections, enhancement

Audio description (for visually impaired)

Automatic annotations

Speaker diarization: Who spoke and when?

Speech recognition: Transcription

Annotations

Hi mom

How are you ?Mother

Son Fine and you ?

Sound Car horn

Producing Video Annotations


Annotations

Ho mom

How are you ?Mother

Son Fine
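
The two automatic steps on this slide (who spoke when, and what was said) can be combined by pairing each transcribed utterance with the speaker turn it overlaps most. This is a minimal sketch under stated assumptions: the timings and function names are hypothetical, and the texts are the uncorrected automatic transcriptions shown on the slide.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start_s, end_s, label)


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_speakers(turns: List[Segment],
                       utterances: List[Segment]) -> List[Tuple[str, str]]:
    """Pair each transcribed utterance with the speaker turn it overlaps most."""
    result = []
    for u_start, u_end, text in utterances:
        best = max(turns, key=lambda t: overlap(u_start, u_end, t[0], t[1]))
        result.append((best[2], text))
    return result


# Hypothetical timings; texts are the automatic transcriptions from the slide.
turns = [(0.0, 1.2, "Son"), (1.2, 2.5, "Mother"), (2.5, 3.4, "Son")]
utterances = [(0.1, 1.0, "Ho mom"), (1.3, 2.4, "How are you ?"), (2.6, 3.2, "Fine")]
for speaker, text in attribute_speakers(turns, utterances):
    print(f"{speaker}: {text}")
```

The recognition errors ("Ho mom", the truncated "Fine") are kept on purpose: they are what the social annotation step is meant to correct.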

Page 29: Multimedia Semantics - SSMS 2010

Speech Processing


Page 30: Multimedia Semantics - SSMS 2010

Demo: http://acav.eurecom.fr/


Page 31: Multimedia Semantics - SSMS 2010

The Advene prototype


Enriched Media Player

Timeline with typed annotations

Braille emulation

Rendering views


Page 32: Multimedia Semantics - SSMS 2010

Preliminary study (1/2)

Semi-structured interviews with blind users (n=2)
  Participants' habits when watching programs with audio description
  Audio description process
  Multimodal presentations of descriptions

Requirements:
  R1: generate additional descriptions and provide unobtrusive access to descriptions (tactile access for blind Braille readers)
  R2: descriptions at various levels of granularity and verbosity
  R3: use the system's multimodal output to provide two or more descriptions (e.g. speech synthesis and Braille display)

Page 33: Multimedia Semantics - SSMS 2010

Preliminary study (2/2)

Goal: see whether we can use auditory icons to convey the rhythm of the editing of a movie to blind users
  e.g.: sound of a locomotive arriving from the right to convey the concept of a travelling (tracking) shot from right to left

Experiment and questionnaires (n=16+9)
  Viewing with headsets of 5 min of Ratatouille, http://www.imdb.com/title/tt0382932/

Results:
  Rhythm and movie dynamics better perceived
  Auditory icons are useful, but must be limited (5 max) and be very different from the main soundtrack of the movie
  Editing cues: change of scenes, camera movement, flashback (e.g. NCIS)
  Audio zoom (e.g. Survivor)

Page 34: Multimedia Semantics - SSMS 2010

ACAV Architecture


ASR Engine: Sphinx/HTK

NER + full text index with the transcription

Interlinking with the Linked Data Cloud to enable semantic search

Page 35: Multimedia Semantics - SSMS 2010

Agenda

1. Semantics in multimedia analysis
  Detecting concepts in video and speech
  Evaluating interactive search tasks

2. Semantics in metadata
  MPEG-7 based ontologies and COMM: a Core Ontology for Multimedia
  Expose your data following 4 basic principles and re-use a growing amount of publicly open datasets

3. Semantics in user interfaces
  Provide meaningful presentation of underlying data
  HTML5: a game changer for video on the web
  Event-centric interfaces for browsing rich media collections

Page 36: Multimedia Semantics - SSMS 2010

What is Ontology ?

Ontology (from the Greek ὄν, genitive ὄντος: of being (neuter participle of εἶναι: to be) and -λογία, -logia: science, study, theory) is the philosophical study of the nature of being, existence or reality in general, as well as the basic categories of being and their relations.

Science of Being (Aristotle, Metaphysics, IV, 1)

Tries to answer the questions:
  What characterizes being?
  Eventually, what is being?
  How should things be classified?

Page 37: Multimedia Semantics - SSMS 2010

Why is this Funny?

In “The analytical language of John Wilkins”*, Jorge Luis Borges writes about a “certain Chinese encyclopaedia” that has the following categorization of animals:

(a) belonging to the emperor,
(b) embalmed,
(c) tame,
(d) sucking pigs,
(e) sirens,
(f) fabulous,
(g) stray dogs,
(h) included in the present classification,
(i) frenzied,
(j) innumerable,
(k) drawn with a very fine camelhair brush,
(l) et cetera,
(m) having just broken the water pitcher,
(n) that from a long way off look like flies.

* http://agents.umbc.edu/misc/johnWilkins.html

Page 38: Multimedia Semantics - SSMS 2010

Ontology in Computers

An ontology is an engineering artifact consisting of:
  A vocabulary used to describe (a particular view of) some domain
  An explicit specification of the intended meaning of the vocabulary
    Almost always includes how concepts should be classified
  Constraints capturing additional knowledge about the domain

Ideally, an ontology should:
  Capture a shared understanding of a domain of interest
  Provide a formal and machine-manipulable model of the domain

Page 39: Multimedia Semantics - SSMS 2010

Ontologies: more definitions

An ontology is a "formal, explicit specification of a shared conceptualization".

Ontologies define the concepts and relationships used to describe and represent an area of knowledge. Ontologies are used to classify the terms used in a particular application, characterize possible relationships, and define possible constraints on using those relationships. In practice, ontologies can be very complex (with several thousands of terms) or very simple (describing one or two concepts only).


Page 40: Multimedia Semantics - SSMS 2010

What is a Multimedia Ontology?

Page 41: Multimedia Semantics - SSMS 2010


Multimedia: Description methods

MPEG-1

MPEG-2

MPEG-4

MPEG-7

MPEG-21

ISO W3C

Page 42: Multimedia Semantics - SSMS 2010


MPEG-7: a multimedia description language?

ISO standard since December of 2001

Main components:
  Descriptors (Ds) and Description Schemes (DSs)
  DDL (XML Schema + extensions)

Concerns all types of media

Part 5: MDS (Multimedia Description Schemes)

[Figure: overview of the MDS tools]
  Basic elements: Schema Tools, Basic datatypes, Links & media localization, Basic Tools
  Content management: Media, Creation & Production, Usage
  Content description: Structural aspects, Semantic aspects
  Navigation & Access: Summaries, Views, Variations
  Content organization: Collections, Models
  User interaction: User Preferences, User History

Page 43: Multimedia Semantics - SSMS 2010

MPEG-7 and the Semantic Web

MDS Upper Layer represented in RDFS
  2001: Hunter
  Later on: link to the ABC upper ontology

MDS fully represented in OWL-DL
  2004: Tsinaraki et al., DS-MIRF model

MPEG-7 fully represented in OWL-DL
  2005: Garcia and Celma, Rhizomik model
  Fully automatic translation of the whole standard

MDS and Visual parts represented in OWL-DL
  2007: Arndt et al., COMM model
  Re-engineering MPEG-7 using DOLCE design patterns

Page 44: Multimedia Semantics - SSMS 2010

Requirements [aceMedia, MMSEM XG]

MPEG-7 compliance
  Support most descriptors (decomposition, visual, audio)

Syntactic and semantic interoperability
  Shared and formal semantics represented in a Web language (OWL, RDF/XML, RDFa, etc.)

Separation of concerns
  Domain knowledge versus multimedia specific information

Modularity
  Enable customization of multimedia ontology

Extensibility
  Enable inclusion of further descriptors (non MPEG-7)

Page 45: Multimedia Semantics - SSMS 2010

MPEG-7 Based Ontologies

                         Hunter             DS-MIRF            Rhizomik        COMM
Foundational ontologies  ABC                None               None            DOLCE
Complexity               OWL-Full           OWL-DL             OWL-DL          OWL-DL
Coverage                 MDS+Visual         MDS+CS             All             MDS+Visual
Applications             Digital Libraries  Digital Libraries  Digital Rights  MM Analysis

Page 46: Multimedia Semantics - SSMS 2010


Common Scenario

The "Big Three" at the Yalta Conference (Wikipedia)

Page 47: Multimedia Semantics - SSMS 2010

Common Scenario: Tagging Approach

The "Big Three" at the Yalta Conference (Wikipedia)

Localize a region (Reg1)
  Draw a bounding box, a circle around a shape

Annotate the content
  Interpret the content
  Tag: Winston Churchill, UK Prime Minister, Allied Forces, WWII

Page 48: Multimedia Semantics - SSMS 2010

Common Scenario: SW Approach

The "Big Three" at the Yalta Conference (Wikipedia)

Localize a region (Reg1)
  Draw a bounding box, a circle around a shape

Annotate the content
  Interpret the content
  Link to knowledge on the Web

:Reg1 foaf:depicts dbpedia:Winston_Churchill
dbpedia:Winston_Churchill skos:altLabel "Sir Winston Leonard Spencer-Churchill"
dbpedia:Winston_Churchill rdf:type foaf:Person
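
Unlike free-text tags, the three statements on this slide can be queried by a program. The sketch below is only an illustration: it stores the statements as plain Python tuples rather than in an RDF library, and the helper names `match` and `depicted_people` are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# The three statements from the slide, written as (subject, predicate, object).
TRIPLES: List[Triple] = [
    (":Reg1", "foaf:depicts", "dbpedia:Winston_Churchill"),
    ("dbpedia:Winston_Churchill", "skos:altLabel", "Sir Winston Leonard Spencer-Churchill"),
    ("dbpedia:Winston_Churchill", "rdf:type", "foaf:Person"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def depicted_people(triples: List[Triple], region: str) -> List[str]:
    """Resources depicted by `region` that are typed as foaf:Person."""
    depicted = [o for _, _, o in match(triples, s=region, p="foaf:depicts")]
    return [r for r in depicted if match(triples, s=r, p="rdf:type", o="foaf:Person")]


print(depicted_people(TRIPLES, ":Reg1"))  # ['dbpedia:Winston_Churchill']
```

The second lookup is what the tagging approach cannot offer: the region is linked to a resource that carries its own typed description.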

Page 49: Multimedia Semantics - SSMS 2010

Hunter's MPEG-7 Ontology

mpeg7:image

http://en.wikipedia.org/wiki/Image:Yalta_Conference.jpg

"5 25 10 20 15 15 10 10 5 15"^^xsd:string

mpeg7:MediaLocator

mpeg7:depicts

mpeg7:spatial_decomposition

Reg1

mpeg7:Polygon

mpeg7:SpatialMask

mpeg7:Coords

The Big Three at the Yalta Conference

mpeg7:StillRegion

rdf:type

rgb(25,255,255)

dbpedia:Churchill

mpeg7:DominantColor

mpeg7:depicts

Page 50: Multimedia Semantics - SSMS 2010

DS-MIRF MPEG-7 Ontology

mpeg7:image

http://en.wikipedia.org/wiki/Image:Yalta_Conference.jpg

"5 25 10 20 15 15 10 10 5 15"^^xsd:string

mpeg7:MediaLocator

mpeg7:CreationInformation

mpeg7:SpatialDecomposition

Reg1

mpeg7:SubRegion

mpeg7:SpatialMask

mpeg7:Polygon

The Big Three at the Yalta Conference

mpeg7:StillRegion

rdf:type

dbpedia:Churchill

mpeg7:RelatedMaterial

mpeg7:Creation

mpeg7:Coords

mpeg7:dim

mpeg7:MediaURI

mpeg7:Title

contentString

Page 51: Multimedia Semantics - SSMS 2010

Rhizomik MPEG-7 Ontology

mpeg7:image

http://en.wikipedia.org/wiki/Image:Yalta_Conference.jpg

"5 25 10 20 15 15 10 10 5 15"^^xsd:string

mpeg7:MediaLocator

mpeg7:CreationInformation

mpeg7:spatial_decomposition

Reg1

mpeg7:SubRegion

mpeg7:SpatialMask

mpeg7:Polygon

The Big Three at the Yalta Conference

mpeg7:SegmentType

rdf:type

dbpedia:Churchill

mpeg7:Semantic

mpeg7:Title

mpeg7:Coords

mpeg7:dim

Page 52: Multimedia Semantics - SSMS 2010

COMM: Fragment Identification

core:image-data

http://en.wikipedia.org/wiki/Image:Yalta_Conference.jpg

loc:spatial-mask-role

"5 25 10 20 15 15 10 10 5 15"^^xsd:string

dns:realized-by

loc:region-locator-descriptor

loc:bounding-box

dns:plays

dns:played-by

dns:defines

data:has-rectangle

dns:setting core:semantic-annotation

core:semantic-label-role

dns:defines

dns:played-by

dbpedia:Churchill

foaf:Person

rdf:type
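
The coordinate string in the diagram and the `loc:bounding-box` node are related: a bounding box can be derived from the polygon. The sketch below assumes the string is a flat list of interleaved x y pixel coordinates, which the slide does not state; the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def parse_points(coords: str) -> List[Point]:
    """Split a whitespace-separated coordinate string into (x, y) pairs."""
    values = [int(v) for v in coords.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return list(zip(values[0::2], values[1::2]))


def bounding_box(points: List[Point]) -> Tuple[int, int, int, int]:
    """Return (x, y, width, height) of the smallest enclosing axis-aligned rectangle."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# Coordinate string taken from the slide; the x y interleaving is an assumption.
points = parse_points("5 25 10 20 15 15 10 10 5 15")
print(points)                # [(5, 25), (10, 20), (15, 15), (10, 10), (5, 15)]
print(bounding_box(points))  # (5, 10, 10, 15)
```

Under the same assumption, the resulting rectangle is the kind of value a region locator such as `data:has-rectangle` would carry.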

Page 53: Multimedia Semantics - SSMS 2010

Comparison

Link with domain semantics
  Hunter: ABC model + mpeg7:depicts relationship
  DS-MIRF: Domain ontologies need to subclass the general MPEG-7 categories
  Rhizomik: Use the mpeg7:semantic relationship
  COMM: Semantic Annotation pattern

MPEG-7 coverage
  Hunter: extension of the MPEG-7 visual descriptors
  COMM:
    Formalization of the context of the annotation
    Representation of the method (algorithm) that provides the annotation

Page 54: Multimedia Semantics - SSMS 2010

Comparison

Modeling decisions:
  DS-MIRF and Rhizomik: 1-to-1 translation from MPEG-7 to OWL/RDF
  Hunter: Simplification and link to the ABC upper model
  COMM: NO 1-to-1 translation
    Need for patterns: use DOLCE, a well-designed foundational ontology, as a modeling basis

Scalability:

           Hunter   DS-MIRF   Rhizomik   COMM
Triples    11       27        20         19

Page 55: Multimedia Semantics - SSMS 2010


Page 56: Multimedia Semantics - SSMS 2010


Research Problem

The "Big Three" at the Yalta Conference (Wikipedia)

A history of G8 violence (video) (© Reuters)

Multimedia objects are complex
  Compound information objects, fragment identification

Semantic annotation
  Subjective interpretation, context dependent

Linked data principle
  Open to reuse existing knowledge

MPEG-7

RDF

D&S | OIO

Reg1

Seq1

Seq4

Page 57: Multimedia Semantics - SSMS 2010

COMM: Design Rationale

Approach:
  NO 1-to-1 translation from MPEG-7 to OWL/RDF
  Need for patterns: use DOLCE, a well-designed foundational ontology, as a modeling basis

Design patterns:
  Ontology of Information Objects (OIO)
    Formalization of information exchange
    Multimedia = complex compound information objects
  Descriptions and Situations (D&S)
    Formalization of context
    Multimedia = contextual interpretation (situation)

Define multimedia patterns that translate MPEG-7 into the DOLCE vocabulary

Page 58: Multimedia Semantics - SSMS 2010

COMM: Core Functionalities

Most important MPEG-7 functionalities:
  Decomposition of multimedia content into segments
  Annotation of segments with metadata
    Administrative metadata: creation & production
    Content-based metadata: audio/visual descriptors
    Semantic metadata: interface with domain specific ontologies

Note that all are subjective and context dependent situations

Page 59: Multimedia Semantics - SSMS 2010

COMM: D&S / OIO Patterns

Definition of design patterns for decomposition and annotation based on D&S and OIO

MPEG-7 describes digital data (multimedia information objects) with digital data (annotation)
  Digital data entities are information objects
  Decompositions and annotations are situations that satisfy the rules of a method or algorithm

Page 60: Multimedia Semantics - SSMS 2010

COMM: Decomposition Pattern

[Figure: decomposition pattern; parts labelled MPEG-7]

Page 61: Multimedia Semantics - SSMS 2010


COMM: Annotation Pattern

MPEG-7

Page 62: Multimedia Semantics - SSMS 2010


COMM: Semantic Pattern

Domain Ontologies

Page 63: Multimedia Semantics - SSMS 2010


COMM: Modules

Decomposition Pattern

Annotation Pattern

Page 64: Multimedia Semantics - SSMS 2010

Example 1: Region Annotation

core:image-data

http://en.wikipedia.org/wiki/Image:Yalta_Conference.jpg

loc:spatial-mask-role

"5 25 10 20 15 15 10 10 5 15"^^xsd:string

dns:realized-by

loc:region-locator-descriptor

loc:bounding-box

dns:plays

dns:played-by

dns:defines

data:has-rectangle

dns:setting core:semantic-annotation

core:semantic-label-role

dns:defines

dns:played-by

http://en.wikipedia.org/wiki/Churchill

foaf:Person

rdf:type

Page 65: Multimedia Semantics - SSMS 2010


Example 2: Sequence Annotation

core:image-data

http://www.reuters.com/news/video/summitVideo?videoId=56114

loc:temporal-mask-role

"1:21"^^xsd:time

dns:realized-by

loc:media-time-descriptor

loc:media-time-point

dns:plays

dns:played-by

dns:defines

data:has-time

dns:setting core:semantic-annotation

core:semantic-label-role

dns:defines

dns:played-by

tgn:Gothenburg

tgn:Sweden

skos:broader

Page 66: Multimedia Semantics - SSMS 2010


Page 67: Multimedia Semantics - SSMS 2010

W3C Ontology for Media Resources

“The ontology for media resources is meant to bridge the different descriptions of media resources on the Web, as opposed to media resources in local archives or musea. It is defined based on a core set of properties which covers basic metadata to describe media resources. Further it defines syntactic and semantic level mappings between elements from existing formats. The ontology is supposed to foster the interoperability among various kinds of metadata formats currently used to describe media resources on the Web.”


http://www.w3.org/TR/mediaont-10/

Page 68: Multimedia Semantics - SSMS 2010

Media Ontology: A useful set of mappings

Identifier | Format | Example | Reference
cl11 | CableLabs 1.1 | cl11:Writer_Display | CableLabs 1.1
dig35 | DIG35 | dig35:ipr_name/ipr_person@description='Image Creator' | DIG35
dc | Dublin Core | dc:creator | Dublin Core
ebucore | EBUCore | ebuc:creator | EBUCore
exif | EXIF 2.2 | exif:Artist | EXIF
id3 | ID3 | id3:TCOM | ID3
iptc | IPTC | iptc:Creator | IPTC
lom21 | LOM 2.1 | lom21:LifeCycle/Contribute/Entity | LOM
ma | Core properties of the MA WG | ma:creator | 4 Property definitions
media | Media RDF | media:Recording | Media RDF
mrss | Media RSS | mrss:credit@role='author' | Media RSS
mets | METS | mets:agency | METS
mpeg7 | MPEG-7 | mpeg7:CreationInformation/Creation/Creator/Agent | MPEG-7
dms | DMS-1 | dms:Participant/Person | DMS-1
tva | TV-Anytime | tva:CreditsList/CreditsItem | TV-Anytime
txf | TXFeed | txf:author | TXFeed
xmp | XMP | xmpDM:composer | XMP
yt | YouTube Data API Protocol | yt:author | YouTube Data API Protocol
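
A mapping table of this kind is typically applied by rewriting format-specific properties to a core property. The sketch below is illustrative only: it assumes that the listed example properties correspond to `ma:creator`, which is a simplification for this sketch rather than the normative mapping, and the function name and records are hypothetical.

```python
from typing import Dict

# Illustrative subset only: which source properties correspond to ma:creator
# is an assumption made for this sketch, not the normative mapping table.
CREATOR_PROPERTIES: Dict[str, str] = {
    "dc": "dc:creator",
    "exif": "exif:Artist",
    "iptc": "iptc:Creator",
    "txf": "txf:author",
    "yt": "yt:author",
}


def to_media_ontology(format_id: str, record: Dict[str, str]) -> Dict[str, str]:
    """Copy the format-specific creator value, if present, to ma:creator."""
    source_property = CREATOR_PROPERTIES.get(format_id)
    if source_property is None or source_property not in record:
        return {}
    return {"ma:creator": record[source_property]}


# Hypothetical records describing the same photo in two formats.
print(to_media_ontology("exif", {"exif:Artist": "Jane Doe"}))  # {'ma:creator': 'Jane Doe'}
print(to_media_ontology("dc", {"dc:creator": "Jane Doe"}))     # {'ma:creator': 'Jane Doe'}
```

Once both records expose the same core property, a client can query for the creator without knowing which source format the metadata came from.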

Page 69: Multimedia Semantics - SSMS 2010

Media Ontology: classes


Page 70: Multimedia Semantics - SSMS 2010

Media Ontology: object properties


Page 71: Multimedia Semantics - SSMS 2010

Media Ontology: datatype properties


Page 72: Multimedia Semantics - SSMS 2010


Page 73: Multimedia Semantics - SSMS 2010

Media Ontology exemplified on Flickr


Page 74: Multimedia Semantics - SSMS 2010


Linked Data Cloud

Page 75: Multimedia Semantics - SSMS 2010

Linked Data Principles

Tim Berners-Lee [2006] (Design Issues)

1. Use URIs to identify things (anything, not just documents);

2. Use HTTP URIs (globally unique names, distributed ownership) so that people can look up those names;

3. Provide useful information in RDF when someone looks up a URI;

4. Include RDF links to other URIs to enable discovery of related information.
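
Principles 2 and 3 amount to an ordinary HTTP lookup in which the client asks for RDF. The sketch below uses only the Python standard library; the media types in the `Accept` header and the function names are choices made for illustration, and whether a given server honours the header is an assumption.

```python
import urllib.request


def rdf_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks the server for an RDF serialization of `uri`."""
    accept = "text/turtle, application/rdf+xml;q=0.9"
    return urllib.request.Request(uri, headers={"Accept": accept})


def look_up(uri: str, timeout: float = 10.0) -> bytes:
    """Dereference `uri` and return the response body (performs a network call)."""
    with urllib.request.urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network; look_up() would.
request = rdf_request("http://dbpedia.org/resource/Winston_Churchill")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Principle 4 then comes into play on the client side: any URIs found in the returned description can be looked up in the same way.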

Page 76: Multimedia Semantics - SSMS 2010

Interlinking Multimedia

dbpedia:Zidane

foaf:depicts

nar:location

geonames:2950159

nar:subject

nc:15054000

events:id

wp:2006_FIFA_World_Cup#Final

Page 77: Multimedia Semantics - SSMS 2010

Image Annotation with Linked Data

The "Big Three" at the Yalta Conference (Wikipedia)

Localize a region (bounding box): Reg1

Annotate the content (interpretation)
  Tag: Winston Churchill, UK Prime Minister, Allied Forces, WWII
  Link to knowledge on the Web

:Reg1 foaf:depicts dbpedia:Winston_Churchill
----------------------------------------------
dbpedia:Winston_Churchill dbpedia:spouse dbpedia:Clementine_Churchill
dbpedia:Winston_Churchill owl:sameAs fbase:Winston_Churchill

Page 78: Multimedia Semantics - SSMS 2010

Video Annotation with Linked Data

A history of G8 violence (video) (© Reuters)

Localize a region: Seq1, Seq4

Annotate the content
  Tag: G8 Summit, Heiligendamm, 2007; EU Summit, Gothenburg, 2001
  Link to knowledge on the Web

:Seq1 foaf:depicts dbpedia:33rd_G8_Summit
----------------------------------------------
dbpedia:33rd_G8_Summit foaf:based_near geo:Heiligendamm
geo:Heiligendamm skos:broader geo:Germany

Page 79: Multimedia Semantics - SSMS 2010

Media Annotations

Using structured knowledge on the Web

:Clip foaf:depicts dbpedia:Laughter
:Clip foaf:depicts dbpedia:Boris_Yeltsin
:Clip foaf:depicts dbpedia:Bill_Clinton
:Clip foaf:depicts dbpedia:Hyde_Park,New_York
----------------------------------------------
dbpedia:Hyde_Park,New_York owl:sameAs fbase:hyde_park
fbase:hyde_park skos:broader fbase:new_york_state

• Annotate the content (interpretation)
  Boris Yeltsin, Bill Clinton, laugh, Bosnia, Hyde Park

Page 80: Multimedia Semantics - SSMS 2010

Answer abstract queries

Research Problems Data modeling, vocabulary alignment, disambiguation

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX yago: <http://dbpedia.org/class/yago/>
SELECT ?Clip
WHERE {
  ?Clip foaf:depicts dbpedia:Laughter ,
                     yago:PresidentsOfTheRussianFederation ,
                     yago:President110468559 .
}
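Why such an abstract query can match a clip annotated only with concrete people: the class names are resolved through type statements on the depicted resources. A minimal in-memory sketch, not a SPARQL engine; the two rdf:type statements and the function names are assumptions added for illustration, only the foaf:depicts annotations come from the slides.

```python
RDF_TYPE = "rdf:type"

TRIPLES = {
    # Annotations from the slides
    (":Clip", "foaf:depicts", "dbpedia:Laughter"),
    (":Clip", "foaf:depicts", "dbpedia:Boris_Yeltsin"),
    (":Clip", "foaf:depicts", "dbpedia:Bill_Clinton"),
    # Assumed background knowledge (hypothetical type statements)
    ("dbpedia:Boris_Yeltsin", RDF_TYPE, "yago:PresidentsOfTheRussianFederation"),
    ("dbpedia:Bill_Clinton", RDF_TYPE, "yago:President110468559"),
}


def depicts(clip, target, triples):
    """True if the clip depicts the target itself or some instance of it."""
    depicted = {o for s, p, o in triples if s == clip and p == "foaf:depicts"}
    instances = {s for s, p, o in triples if p == RDF_TYPE and o == target}
    return target in depicted or bool(depicted & instances)


def find_clips(targets, triples):
    """Return the clips that satisfy every target of the abstract query."""
    clips = {s for s, p, o in triples if p == "foaf:depicts"}
    return sorted(c for c in clips if all(depicts(c, t, triples) for t in targets))


QUERY = [
    "dbpedia:Laughter",
    "yago:PresidentsOfTheRussianFederation",
    "yago:President110468559",
]
result = find_clips(QUERY, TRIPLES)
print(result)
```

The research problems listed on the slide show up directly here: the answer depends on which vocabulary the type statements use and on whether the depicted names were disambiguated to the right resources.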

Page 81: Multimedia Semantics - SSMS 2010

Find connection between media

Unexpected relationships: enable further discovery, exploration

Research problems Where should we stop in the exploration? When does it start to be intrusive for the end-user?

:Clip foaf:depicts dbpedia:Boris_Yeltsin
:Clip foaf:depicts dbpedia:Bill_Clinton
:Clip foaf:depicts fbase:Laughter

Page 82: Multimedia Semantics - SSMS 2010

Agenda

1. Semantics in multimedia analysis
  Detecting concepts in video and speech
  Evaluating interactive search tasks

2. Semantics in metadata
  MPEG-7 based ontologies and COMM: a Core Ontology for Multimedia
  Expose your data following 4 basic principles and re-use a growing amount of publicly open datasets

3. Semantics in user interfaces
  Provide meaningful presentation of underlying data
  HTML5: a game changer for video on the web
  Event-centric interfaces for browsing rich media collections

Page 83: Multimedia Semantics - SSMS 2010

Who are the users?

Why would they use the cloud?

What tasks can be supported?

How will the semantics help?

Page 84: Multimedia Semantics - SSMS 2010

How can semantics help?

Query construction
  disambiguate input (auto-completion)
  selection of available terms (grouping and ranking algorithms)

(Semantic) search algorithm
  graph traversal
  query expansion
  RDFS/OWL reasoning

Presentation of search results
  grouping by property
  visualization on timeline, map, etc.

Page 85: Multimedia Semantics - SSMS 2010

Provide meaningful presentation of data

Page 86: Multimedia Semantics - SSMS 2010

... and behind the scene

Page 87: Multimedia Semantics - SSMS 2010

... link an artist to more data

Page 88: Multimedia Semantics - SSMS 2010

... myspace

Page 89: Multimedia Semantics - SSMS 2010

... last.fm

Page 90: Multimedia Semantics - SSMS 2010

... IMDb

Page 91: Multimedia Semantics - SSMS 2010

Going through the Walled Gardens

David Simonds: Everywhere and nowhere. 19 May 2008, The Economist.

Page 92: Multimedia Semantics - SSMS 2010

Reinventing HTML

Tim Berners-Lee (27/10/2006, blog post)

«The attempt to get the world to switch to XML … all at once didn't work. The large HTML-generating public did not move … Some large communities did shift and are enjoying the fruits of well-formed systems … The plan is to charter a completely new HTML group. »

Page 93: Multimedia Semantics - SSMS 2010

Page 94: Multimedia Semantics - SSMS 2010

Basic Layout in HTML5

Page 95: Multimedia Semantics - SSMS 2010

HTML5 Audio / Video

Native support in the browser
No need for plug-ins anymore: Flash, Silverlight, QuickTime, Windows Media

DOM APIs for scripts to control the playback

<audio src="music.oga" controls>
  <a href="music.oga">Download song</a>
</audio>

<video src="video.ogv" controls poster="poster.jpg" width="320" height="240">
  <a href="video.ogv">Download movie</a>
</video>

Page 96: Multimedia Semantics - SSMS 2010

HTML5 Codecs

Media containers:
  MPEG 4 (extension .mp4)
  Ogg (extension .ogg)
  AVI (extension .avi)
  Flash video (extension .flv)
  WebM: container based on a profile of Matroska

Media codecs:
  MPEG 4: various implementations (Xvid is open source) but various patents on this codec
  H.264: variant of MPEG 4, high compression. It is used by YouTube for HD and by Blu-ray
  Theora: free codec. It is generally used within the Ogg container
  VP8: open video compression format released by Google (On2)

Page 97: Multimedia Semantics - SSMS 2010

HTML5 Audio / Video specification

Element: <audio>, <video>

Attributes for both:
  src: URL of the media container
  autobuffer: true/false, media starts loading with the page (later replaced by preload in the HTML5 specification)
  autoplay: true/false, media starts playing automatically
  loop: true/false
  controls: true/false, display default controls

Attributes for <video>:
  width, height: dimensions displayed
  poster: URL of a still image replacing the video
  videoWidth, videoHeight: original dimensions of the video (read-only DOM properties)

Page 98: Multimedia Semantics - SSMS 2010

HTML5 <source> Element

Use the <source> element to provide alternative streams and let the browser choose among them based on its media and codec support:

<audio>
  <source src="music.oga" type="audio/ogg"/>
  <source src="music.mp3" type="audio/mpeg"/>
</audio>

<video poster="poster.jpg">
  <source src="video.3gp" type="video/3gpp" media="handheld"/>
  <source src="video.ogv" type='video/ogg; codecs="theora, vorbis"'/>
  <source src="video.mp4" type="video/mp4"/>
</video>
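How the browser's choice among <source> elements can be pictured: it walks the list in document order and takes the first stream it can play. The sketch below is a deliberate simplification, assuming a browser that only compares the base MIME type; it ignores the codecs parameter and the media attribute, and the function name is hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def pick_source(sources: Iterable[Tuple[str, str]], supported: Set[str]) -> Optional[str]:
    """Return the src of the first source whose base MIME type is supported."""
    for src, mime in sources:
        base = mime.split(";")[0].strip().lower()  # drop parameters such as codecs
        if base in supported:
            return src
    return None  # nothing playable: the fallback content would be shown


# The three streams of the <video> example, in document order
VIDEO_SOURCES = [
    ("video.3gp", "video/3gpp"),
    ("video.ogv", 'video/ogg; codecs="theora, vorbis"'),
    ("video.mp4", "video/mp4"),
]

chosen = pick_source(VIDEO_SOURCES, {"video/ogg", "video/mp4"})
print(chosen)
```

Because order matters, the preferred format should come first in the markup.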

Demo

Page 99: Multimedia Semantics - SSMS 2010

Sarkozy Laughing with Putin?

http://www.youtube.com/watch?v=7fMCTo-GQ2A#t=34s

Page 100: Multimedia Semantics - SSMS 2010

Clinton Laughing with Yeltsin?

http://www.youtube.com/watch?v=sxoh1z6s_Cw#t=15s

• Temporal annotation in YouTube
  ... but the UA seeks, buffers and downloads the resource
  ... and the YouTube syntax is different from Google Video, Vimeo, DailyMotion, etc.

Page 101: Multimedia Semantics - SSMS 2010

Media Fragments

Every popular web site does it ...
  region-based annotation in Flickr
  temporal sequence annotation in YouTube (#t=34s, #t=15s)

... BUT:
  region-based annotations cannot be exported
  YouTube syntax is different from DailyMotion, Vimeo, etc.

Page 102: Multimedia Semantics - SSMS 2010

W3C Media Fragments WG

http://www.w3.org/2008/WebVideo/Fragments/

Page 103: Multimedia Semantics - SSMS 2010

W3C Media Fragments WG

Provide URI-based mechanisms for uniquely identifying fragments of media objects on the Web, such as video, audio, and images.

Page 104: Multimedia Semantics - SSMS 2010

Use Case

Aidem received on her Facebook wall a status message containing a Media Fragment URI
  Use a ‘#’!
  Highlight a video sequence
  Highlight a region to pay attention to

Page 105: Multimedia Semantics - SSMS 2010

Requirements

r01: Temporal fragments: a clipping along the time dimension from a start to an end time that are within the duration of the media resource

r02: Spatial fragments: a clipping of an image region, only consider rectangular regions

r03: Track fragments: a track as exposed by a container format of the media resource

r04: Named fragments: a media fragment - either a track, a time section, or a spatial region - that has been given a name through some sort of annotation mechanism
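The four requirements map onto four fragment dimensions. The sketch below parses them from the part of a URI after ‘#’. The keys t, xywh, track and id follow the W3C Media Fragments URI draft rather than anything shown on these slides, and only the simplest value forms are handled (plain seconds, pixel coordinates); the example.org URIs and the function name are hypothetical.

```python
from urllib.parse import urlsplit


def parse_media_fragment(uri: str) -> dict:
    """Parse the fragment part of a URI into temporal, spatial, track and named parts."""
    result = {}
    for pair in urlsplit(uri).fragment.split("&"):
        if "=" not in pair:
            continue
        name, value = pair.split("=", 1)
        if name == "t":  # r01: temporal, "start,end" in seconds
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif name == "xywh":  # r02: spatial, rectangular region in pixels
            result["xywh"] = tuple(int(v) for v in value.split(","))
        elif name in ("track", "id"):  # r03: track, r04: named fragment
            result[name] = value
    return result


clip = parse_media_fragment("http://example.org/video.ogv#t=10,20")
region = parse_media_fragment("http://example.org/image.jpg#xywh=160,120,320,240")
print(clip, region)
```

A missing end time is kept as None, meaning "until the end of the resource".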

Page 106: Multimedia Semantics - SSMS 2010

Side Conditions

Restrict to what the container format (encapsulating the compressed media content) can express (and expose), thus no transcoding

Protocols covered: HTTP(S), FILE, RTSP, RTMP

http://www.w3.org/TR/media-frags-reqs/

Page 107: Multimedia Semantics - SSMS 2010

Media Fragments processing

General principle:
  Smart UA will strip out the fragment definition and encode it into custom HTTP headers ...
  (Media) Servers will handle the request, slice the media content and serve just the fragment, while old ones will serve the whole resource

Four recipes proposed
  UA knows how to map a fragment into bytes
  UA sends a Range request expressed in a custom unit
  Variant with cacheability
  Server serves a playable media resource

Page 108: Multimedia Semantics - SSMS 2010

Recipe 1: UA mapped byte ranges

The User Agent knows how to map a custom unit into bytes and sends a normal Range request expressed in bytes
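A sketch of the mapping step in recipe 1, assuming the UA already holds a seek index for the resource (for instance from the container header). The index values, the total size and the function name are made up for illustration; only the resulting Range header in bytes is standard HTTP.

```python
import bisect

# Hypothetical seek index: (time in seconds, byte offset) of each keyframe
INDEX = [(0.0, 0), (5.0, 40_000), (10.0, 85_000),
         (15.0, 131_000), (20.0, 180_000), (25.0, 226_000)]
TOTAL_SIZE = 270_000  # hypothetical size of the resource in bytes


def byte_range_header(start, end, index, total_size):
    """Map a time interval in seconds onto a plain HTTP byte Range header."""
    times = [t for t, _ in index]
    # Last keyframe at or before the start time
    first = max(bisect.bisect_right(times, start) - 1, 0)
    # First keyframe at or after the end time (or the end of the file)
    last = bisect.bisect_left(times, end)
    first_byte = index[first][1]
    last_byte = (index[last][1] if last < len(index) else total_size) - 1
    return {"Range": f"bytes={first_byte}-{last_byte}"}


header = byte_range_header(12.0, 18.0, INDEX, TOTAL_SIZE)
print(header)
```

The requested interval is widened to keyframe boundaries, so the UA receives slightly more than 12 s to 18 s and trims the rest at playback time.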

Page 109: Multimedia Semantics - SSMS 2010

Recipe 1: UA mapped byte ranges

Page 110: Multimedia Semantics - SSMS 2010

Recipe 2: Server mapped byte ranges

The UA sends a Range request expressed in a custom unit (e.g. seconds), the server answers directly with a 206 Partial Content and indicates the mapping between bytes and the custom unit
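A sketch of the header exchange in recipe 2. The unit name t:npt and the Content-Range-Mapping header follow my reading of the W3C Media Fragments draft and should be checked against the specification; the byte offsets, the duration and both function names are hypothetical.

```python
def time_range_request(start: float, end: float) -> dict:
    """Headers a UA might send to ask for a time interval (syntax assumed)."""
    return {"Range": f"t:npt={start:g}-{end:g}"}


def partial_content_headers(t_start, t_end, duration, b_first, b_last, size):
    """Status and headers of a hypothetical server answer for that request."""
    byte_range = f"bytes {b_first}-{b_last}/{size}"
    time_range = f"t:npt {t_start:g}-{t_end:g}/0-{duration:g}"
    headers = {
        "Content-Range": byte_range,
        # Tells the UA which time interval the returned bytes really cover
        "Content-Range-Mapping": f"{{ {time_range} }} = {{ {byte_range} }}",
    }
    return 206, headers


request = time_range_request(10, 20)
status, response = partial_content_headers(9.85, 21.16, 653.79,
                                           19147, 22880, 35614993)
print(request)
print(status, response)
```

The interval served (9.85 s to 21.16 s here) can be wider than the one requested, because the server cuts at positions the container format can expose.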

Page 111: Multimedia Semantics - SSMS 2010

Recipe 2: Server mapped byte ranges

Page 112: Multimedia Semantics - SSMS 2010

Implementation

Media Fragment server (4 recipes supported):
  Ninsuna: http://ninsuna.elis.ugent.be/MediaFragmentsServer

Media Fragment user agents:
  Ninsuna Flash player: http://ninsuna.elis.ugent.be/MediaFragmentsPlayer
    Supports recipe 1
  Silvia Pfeiffer's experiment with HTML5 + JS: http://annodex.net/~silvia/itext/mediafrag.html
    Supports recipe 1 (for .ogg files and time dimension)
  Firefox plugin under development in order to support all recipes (HTML5 + XMLHttpRequest)

Page 113: Multimedia Semantics - SSMS 2010

Towards an Event-Based Multimedia Web

Page 114: Multimedia Semantics - SSMS 2010

We have directories of events...

Page 115: Multimedia Semantics - SSMS 2010

Page 116: Multimedia Semantics - SSMS 2010

Page 117: Multimedia Semantics - SSMS 2010

Page 118: Multimedia Semantics - SSMS 2010

Page 119: Multimedia Semantics - SSMS 2010

Page 120: Multimedia Semantics - SSMS 2010

Page 121: Multimedia Semantics - SSMS 2010

Page 122: Multimedia Semantics - SSMS 2010

Page 123: Multimedia Semantics - SSMS 2010

Page 124: Multimedia Semantics - SSMS 2010

Page 125: Multimedia Semantics - SSMS 2010

We have knowledge about “many things”...

Page 126: Multimedia Semantics - SSMS 2010

Page 127: Multimedia Semantics - SSMS 2010

Event-centric interfaces

Action or occurrence taking place at a certain time at a specific location
  Useful for organizing and browsing collections of media
  Useful for discovering complex relationships between data

Need for an expressive event model for connecting pieces of data

Not Yet Another Model!

Page 128: Multimedia Semantics - SSMS 2010

There are already many event ontologies

Event Model Ontology URL

CIDOC CRM http://cidoc.ics.forth.gr/OWL/cidoc_v4.2.owl

ABC Ontology http://metadata.net/harmony/ABC/ABC.owl

Event Ontology http://purl.org/NET/c4dm/event.owl#

EventsML-G2 http://www.iptc.org/EventsML/

Dolce+DnS Ultralite http://www.loa-cnr.it/ontologies/DUL.owl

F http://events.semantic-multimedia.org/ontology/2008/12/15/model.owl

OpenCyc Ontology http://www.opencyc.org/

SEM http://semanticweb.cs.vu.nl/2009/04/event/

Page 129: Multimedia Semantics - SSMS 2010

Fundamental Types of Events

Aspect: ongoing activity vs transition between states
  cyc:Event ∩ cyc:StaticSituation ≤ cyc:Situation
  cidoc:E5.Event ∩ cidoc:E3.Condition_State ≤ cidoc:E2.Temporal_Entity
  abc:Event is a transition between abc:Situation ≈ cidoc:E3.Condition_State

Agentivity: who has produced the event?
  cyc:Action, dul:Action ≤ Event
  E7.Activity ≤ E5.Event
  abc:Action ∩ abc:Event = Ø
    Events are fully described as a set of actions taken by specific agents
    Issue for modeling e.g. earthquakes

Interpretation matters!
  Identifiable changes or not? Agency can be assigned?
  dul:Situation describes dul:Event
  dul:Action, dul:Process ≤ dul:Event

Page 130: Multimedia Semantics - SSMS 2010

Events and Temporal Intervals

Relating events to chronological spans of time
  Persistent, socially attributed meanings
  Arbitrary system for subdividing an abstract space

Modeling a class for temporal intervals and using an object property (OP)
  ABC, CIDOC, EO (owl:TemporalEntity)

Modeling an XML Schema typed value and using a datatype property (DP)
  Pro: simplicity, values expressed as xsd:date or xsd:dateTime
  Con: inability to express uncertain periods, or periods that do not coincide with date units

Having two properties
  dul:hasEventDate ... literal value
  dul:isObservableAt ... dul:TimeInterval

Page 131: Multimedia Semantics - SSMS 2010

Events, Spaces and Places

Relating events to places
  Semantically significant places
  Abstract spatial regions

Support spatial regions only: ABC, CIDOC, EO
  eo:Event eo:place wgs84:SpatialThing
  cidoc:E5.Event cidoc:P7.took_place_at cidoc:E53.Place

Support the place/space distinction
  dul:Event dul:hasLocation dul:Place
  dul:Event dul:hasRegion dul:SpaceRegion
  Most flexible approach: allows resolving to places with no geographical coordinate system (e.g. mythical events, Second Life)

Page 132: Multimedia Semantics - SSMS 2010

Participation in events

Object involvement in events:
  Simple involvement in event:
    abc:Event abc:involves owl:Thing (≤ abc:Actuality)
    cidoc:E5.Event cidoc:P12.occurred_in_the_presence_of cidoc:E77
    dul:Event dul:hasParticipant dul:Object
    eo:Event eo:factor owl:Thing
  Tangible thing which results from an event:
    abc:Event abc:hasResult owl:Thing
    eo:Event eo:product owl:Thing

Agent participation in events:
  abc:hasParticipant ≤ abc:hasPresence
  cidoc:P11.had_participant ≤ cidoc:P14.carried_out_by
  dul:involvesAgent ≤ dul:hasParticipant

Page 133: Multimedia Semantics - SSMS 2010

Events, Influence, Purpose and Causality

Making broad assertions linking events to any thing
  cidoc:P12.occurred_in_the_presence_of, cidoc:P15.was_influenced_by
  eo:factor, abc:hasResult

F model uses the DnS pattern

Page 134: Multimedia Semantics - SSMS 2010

Events, Parts and Composition

Event A being part of event B ≠ A's timespan ⊆ B's timespan
  cidoc:P86.falls_within for expressing containment among timespans
  cidoc:P9.consist_of ≈ eo:sub_event ≈ abc:isSubEventOf

Linking sub-events with parthood: dul:hasPart
  The 20th century contains the year 1923
  World War II included Pearl Harbor

Linking sub-events with composition: dul:hasConstituent
  The French Revolution is composed of the storming of the Bastille

Page 135: Multimedia Semantics - SSMS 2010

Towards a Linked Data Event Model

Page 136: Multimedia Semantics - SSMS 2010

Some mappings in LODE

ABC          CIDOC                            DUL             EO      LODE
atTime       P4.has_time_span                 isObservableAt  time    atTime
-            P7.took_place_at                 -               place   inSpace
inPlace      -                                hasLocation     -       atPlace
involves     P12.occurred_in_the_presence_of  hasParticipant  factor  involved
hasPresence  P11.had_participant              involvesAgent   agent   involvedAgent
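How such a mapping table can be put to work: rewriting event descriptions that use one of the four source vocabularies into LODE terms. A minimal sketch; the prefixes abc:, cidoc:, dul: and eo: are the ones used on earlier slides, while the lode: prefix, the example triples and the function name are assumptions for illustration.

```python
# Property mappings as read from the table (source property -> LODE property)
TO_LODE = {
    "abc:atTime": "lode:atTime",
    "cidoc:P4.has_time_span": "lode:atTime",
    "dul:isObservableAt": "lode:atTime",
    "eo:time": "lode:atTime",
    "cidoc:P7.took_place_at": "lode:inSpace",
    "eo:place": "lode:inSpace",
    "abc:inPlace": "lode:atPlace",
    "dul:hasLocation": "lode:atPlace",
    "abc:involves": "lode:involved",
    "cidoc:P12.occurred_in_the_presence_of": "lode:involved",
    "dul:hasParticipant": "lode:involved",
    "eo:factor": "lode:involved",
    "abc:hasPresence": "lode:involvedAgent",
    "cidoc:P11.had_participant": "lode:involvedAgent",
    "dul:involvesAgent": "lode:involvedAgent",
    "eo:agent": "lode:involvedAgent",
}


def to_lode(triple):
    """Rewrite the predicate of a triple into LODE; unknown predicates are kept."""
    subject, predicate, obj = triple
    return (subject, TO_LODE.get(predicate, predicate), obj)


# Hypothetical event descriptions in two different source vocabularies
source = [
    ("ex:final", "cidoc:P7.took_place_at", "ex:berlin"),
    ("ex:final", "eo:agent", "dbpedia:Zidane"),
    ("ex:final", "rdfs:label", "2006 FIFA World Cup Final"),
]
mapped = [to_lode(t) for t in source]
print(mapped)
```

Once both descriptions use the same property names, events coming from different directories can be queried and merged uniformly.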

Page 137: Multimedia Semantics - SSMS 2010

Page 138: Multimedia Semantics - SSMS 2010

What to do in Nimes in July?

Page 139: Multimedia Semantics - SSMS 2010

Events and Media

Experiences documented by Media

Events are observable occurrences grouping People, Places, Time

Page 140: Multimedia Semantics - SSMS 2010

Goal

1. Discover PAST, PRESENT and FUTURE events
2. Live, relive and predict experiences through shared media
3. Identify meaningful and/or interesting relationships between events/media/people

Page 141: Multimedia Semantics - SSMS 2010

Exploratory Study

Online Survey (n=28), 2 group discussions (n=35)

Existing Technologies
• Opinions
• Interests
• Suggestions
• Benefits/drawbacks

Past Experiences (Memorable Events)
• Discovery
• Decision making
• Registering & sharing
• Meaningful relationships

Scenarios

Requirements

1st Design Concept

Page 142: Multimedia Semantics - SSMS 2010

Results (1/3)

Discovery
  Invitations and recommendations
  Rely on traditional media
  Social networks (Facebook - students)
  Previously attended events or venues

Decision Making
  Who's joining?
  Where, when, how much? (constraints)
  What? (e.g. type, performer, topic)
  Subjective factors (fun, atmosphere)

Page 143: Multimedia Semantics - SSMS 2010

Results (2/3)

Registering and Sharing
  Communicating their experience
  Pictures and short videos (for sharing)
  Media directories and social networks

Meaningful Relationships
  Similar categories, attributes and content
  User attendance (similar interests, behaviors)
  Repeated events (e.g. annual festivals)

Page 144: Multimedia Semantics - SSMS 2010

Results (3/3)

Event Directories
  Single-source event overview & information, which allows opportunistic/serendipitous discovery
  Limited exploration/browsing features
  Information overload (cluttered, difficult)
  Information incompleteness (coverage, decision)

Media Directories
  Aid decision making, remembering and sharing experiences

Social Networks
  Allow communication, sharing and event attendance

Page 145: Multimedia Semantics - SSMS 2010

Page 146: Multimedia Semantics - SSMS 2010

Services

Existing services to explore, share and discover events

Aggregate these heterogeneous data sources

Enrich with media and social data

Page 147: Multimedia Semantics - SSMS 2010

Semantization of Data

LastFM events 2 LODE

Search on machine tags “lastfm:events”: 1,438,128 results

Lastfm + Flickr APIs:
...Events[ event_id, ...medias[photo_id, user_id, url_t, url_o, title, description]]

Upcoming + Flickr (363,137)
Eventful, Dailymotion, YouTube?
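How machine tags can tie photos to events: a sketch that groups photo records by the event identifier carried in their tags. It assumes Flickr's general machine-tag shape namespace:predicate=value and a tag of the form lastfm:event=<id>; the photo records, identifiers and function names are made up, and no API is called.

```python
from collections import defaultdict


def parse_machine_tag(tag: str):
    """Split 'namespace:predicate=value' into its parts, or return None."""
    head, equals, value = tag.partition("=")
    namespace, colon, predicate = head.partition(":")
    if not (equals and colon and namespace and predicate and value):
        return None
    return namespace, predicate, value


def group_by_event(photos):
    """Map each Last.fm event id to the ids of the photos tagged with it."""
    events = defaultdict(list)
    for photo in photos:
        for tag in photo["tags"]:
            parsed = parse_machine_tag(tag)
            if parsed and parsed[:2] == ("lastfm", "event"):
                events[parsed[2]].append(photo["photo_id"])
    return dict(events)


# Hypothetical photo records, shaped like the medias[...] structure above
PHOTOS = [
    {"photo_id": "p1", "user_id": "u1", "tags": ["lastfm:event=1234", "concert"]},
    {"photo_id": "p2", "user_id": "u2", "tags": ["lastfm:event=5678"]},
    {"photo_id": "p3", "user_id": "u1", "tags": ["upcoming:event=42", "lastfm:event=1234"]},
    {"photo_id": "p4", "user_id": "u3", "tags": ["holiday"]},
]

grouped = group_by_event(PHOTOS)
print(grouped)
```

The same grouping applies to Upcoming by matching the upcoming namespace instead.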

Page 148: Multimedia Semantics - SSMS 2010

LODE Example

Jack recorded a video with his mobile phone camera while he was attending the Haiti Relief concert from Radiohead given on January 24th, 2010 in LA. He thinks it was a really nice experience and wants to share it on-line. He would also like to see how other people experienced the show.
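The scenario can be written down as a handful of triples. A sketch only: the property names atTime, atPlace and involvedAgent come from the LODE mapping table, while lode:Event, the ex: identifiers, the second video and the plain date literal are assumptions for illustration (a fuller description would point to a time interval resource).

```python
EVENT = "ex:radiohead-haiti-relief-2010-01-24"  # hypothetical event URI

TRIPLES = [
    (EVENT, "rdf:type", "lode:Event"),
    (EVENT, "lode:atTime", '"2010-01-24"'),  # simplified: literal date
    (EVENT, "lode:atPlace", "dbpedia:Los_Angeles"),
    (EVENT, "lode:involvedAgent", "dbpedia:Radiohead"),
    (EVENT, "lode:involvedAgent", "ex:jack"),
    # Media documenting the event
    ("ex:jack-video", "foaf:depicts", EVENT),
    ("ex:jack-video", "foaf:maker", "ex:jack"),
    ("ex:other-video", "foaf:depicts", EVENT),
    ("ex:other-video", "foaf:maker", "ex:anna"),
]


def media_by_others(event, me, triples):
    """Media depicting the event that were not made by the given person."""
    depicting = {s for s, p, o in triples if p == "foaf:depicts" and o == event}
    mine = {s for s, p, o in triples if p == "foaf:maker" and o == me}
    return sorted(depicting - mine)


others = media_by_others(EVENT, "ex:jack", TRIPLES)
print(others)
```

Sharing the video amounts to publishing the two ex:jack-video triples; seeing how other people experienced the show is the query in media_by_others.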

Page 149: Multimedia Semantics - SSMS 2010

Page 150: Multimedia Semantics - SSMS 2010

Page 151: Multimedia Semantics - SSMS 2010

Page 152: Multimedia Semantics - SSMS 2010

Jamiroquai @ Sziget Festival (Budapest)

Page 153: Multimedia Semantics - SSMS 2010

Take Home Message

Concept detection challenges: machine learning and IR
  Features can be extracted and used to describe multimedia content
  Show generality of approach, dynamic nature of video (event)
  Show that an ontology can help

Semantic metadata representation challenges: KR
  Media and metadata can be passed around and among systems
  Reuse what is there
  Expose what you make

Interaction challenges: CHI
  Users can be given much richer and more flexible access to (semantically annotated) content
  ... but we are still figuring out how to do this!

Page 154: Multimedia Semantics - SSMS 2010

Credits

Many people
  Cees Snoek, Marcel Worring, Alex Hauptmann, Alan Smeaton, Ivan Herman, Krishna Chandramouli, David Simonds, Laurent Le Meur

Colleagues from the Interactive Information Access Group, CWI Amsterdam

Datasets

http://www.slideshare.net/troncy