50
Machine Vision Background Region Annotation by Hierarchical Region Topic Discovery Probabilistic Generative Models for Machine Vision Davide Bacciu [email protected] Dipartimento di Informatica Università di Pisa Corso di Intelligenza Artificiale Prof. Alessandro Sperduti Università degli Studi di Padova A.A. 2008/2009 Davide Bacciu Probabilistic Generative Models for Machine Vision

Probabilistic generative models for machine vision

  • Upload
    zukun

  • View
    123

  • Download
    1

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Probabilistic Generative Models for MachineVision

Davide Bacciu

[email protected] di Informatica

Università di Pisa

Corso di Intelligenza ArtificialeProf. Alessandro Sperduti

Università degli Studi di PadovaA.A. 2008/2009

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 2: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Outline

1 Machine Vision BackgroundIntroductionVisual Content RepresentationMachine Vision Applications

2 Region Annotation by Hierarchical Region Topic DiscoveryVisual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 3: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Introduction

Visual content representationVisual features - how to identify and describe the single bitsof visual informationImage representation - how to describe the visualinformation within a picture

Machine vision applicationsScene and object recognitionImage annotation

Focus on Probabilistic Generative Models and the Bag ofWords image representation

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 4: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Visual Features Descriptors

How do we describe the single bits of visual information?

Global descriptorsColor histogramsShape featuresTextureSpatial Envelope

Local descriptorsGabor filtersDifferential filtersDistribution-baseddescriptors

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 5: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Scale Invariant Feature Transform (SIFT)

Calculate gradient magnitudeand orientation at a givenkeypoint (x , y) and scale σCompute 3D histogram ofmagnitude orientations in a4× 4 neighborhood of thekeypointAngles are quantized to 8directionsRobust to geometric andillumination distortion

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 6: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Visual Features Detectors

How do we identify the interesting/informative parts of animage?

Feature detectorsSegment/Blob detectorsCorner/Interest point detectorsAffine invariant keypoint detectorsDense sampling

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 7: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Image Representation

Straightforward for global descriptorsAn image corresponds to its descriptorImages are confronted, classified and indexed based ontheir global color and/or texture histograms

Requires an additional step for local descriptorsLocal descriptors provide only information on a portion ofthe sceneProbabilistic modeling of segmentsBag of words

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 8: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Bag of Words Document Representation

Count the occurrences of each dictionary word in yourdocumentRepresent the document as a vector of word counts

Definition (Bag of Words Assumption)

Word order is not relevant for determining document semantics

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 9: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Bag of Words Image Representation

Each image is a document and each visual patch is a wordCount the occurrences of each dictionary visual word(visterm) in your imageRepresent the image as a vector of visterm counts(histogram)

...

Codebook

CollectionImage

Image Collection

Vector

k−means

Input Image

Local patches

Quantization

BOW

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 10: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Image Understanding Applications

Images are typically represented as vector of featuresDiscover the semantics of an image from its vectorialrepresentation

Object recognition (e.g. Airplane)Scene recognition (e.g. Forest)Image annotation (e.g. Sky, Tree, Boat, Sea, ...)

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 11: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Scene and Object Recognition

Structured approachesObject/scene categories arerepresented as collections ofconnected parts characterized bydistinctive appearance and spatialpositionPart-based probabilistic models

Weakly structured approachesBased on bag-of-word assumptionLatent aspect models

Place Context

PixelContext

Part Context

Object Context

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 12: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Latent Aspect Discovery

Rooted into text processingProbabilistic models describing the generative process ofimage content using hidden (i.e. latent) variablesModel likelihood is used to measure the fit of the unknownparameters with respect to the observationsModel parameters are fitted by Expectation Maximization(EM) or variational methods

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 13: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Probabilistic Latent Semantic Analysis (pLSA) (I)

P(d) P(z|d) P(w|z)

D

z wd

Wd

Observations are collected in the integer matrix n(wj ,di)

Every observation is associated to a latent topic k bymeans of the hidden variable zk

Each hidden topic k denotes a particular visual theme (e.g.an object)

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 14: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Probabilistic Latent Semantic Analysis (pLSA) (II)

P(d) P(z|d) P(w|z)

D

z wd

Wd

pLSA is trained by Expectation Maximization (EM)Conditional independence is used to decompose modellikelihood

L(N|θ) =D∏

i=1

W∏j=1

P(wj ,di)n(wj ,di )

P(wj ,di) = P(di)P(wj |di) = P(di)K∑

k=1

P(zk |di)P(wj |zk )

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 15: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

An Object Recognition Example

Caltech-4 Dataset: cars, faces, motorbikes and airplanes on acluttered backgroundFit a pLSA model with 7 latent topics (4 objects + 3 background)Determine the most likely topic of each visual word in image di

P(zk |wj ,di) =P(zk |di)P(wj |zk )∑K

k=1 P(zk |di)P(wj |zk )

Figure: Images by Sivic et al 2005Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 16: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Latent Dirichlet Allocation (LDA)

D

z

K

w

Wd

θ

β φ

α

pLSA provides a generative model only for the training dataLDA treats P(zk |di) as a latent random variable, samplingit from a Dirichlet distribution P(θ|α)

Maximizes data likelihood

P(w |φ, α, β) =

∫ ∑z

P(w |z, φ)P(z|θ)P(θ|α)P(φ|β)dθ

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 17: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Hierarchical Latent Topic Models...

Dependent Pachinko Allocation Model (Li and McCallum, 2006)

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 18: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Image Annotation

Learn association between image content and textTypically exploits region-based image representation

Image segmentationColor and texture descriptors

Non-probabilistic image annotationInformation retrieval approaches (e.g. vector based)Multiple instance learning

Probabilistic Image AnnotationMachine translation approachProbabilistic generativeLatent topic models

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 19: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Latent Aspect Models in Image Annotation

z

v

D

t

b

β

σ

µ

Md

Bd

α θ

z

D

v

b

t

Md

µ

σ

β

Bd

α θ

GM-LDA CORR-LDA

Based on a Mixture of Gaussians modeling of the Bd imageregions and a multinomial distribution for annotations Md

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 20: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

Image Annotation - Is it Worth the Hassle?

The Web is full of annotated multimedia information

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 21: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

IntroductionVisual Content RepresentationMachine Vision Applications

How Informative is Image Representation?

BOW assumption is strong in the visual domain: spatialcontext matters!Region-based representation is often too coarse andunstable.How can we retain the robustness of BOW whileintroducing spatial context?

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 22: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Introduction

Overtake flat region-based and keypoint-basedapproachesStructured representation conveying richer semantics anddiscriminative power than a BOW model without the rigidityof a part-based approachKey idea

Introduce a multi-resolution representation of visual contentdrawing both on region-based and keypoint-based modelsIntroduce a multi-layered structure at the semantic level oflatent topic discovery

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 23: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Multi-Resolution Image Representation

Multi-resolution V describing visual information at differentspatial granularityBOW representation is based on assumption that spatialdistribution of visual keywords can be neglected whendetermining the overall image appearance

Biased view holding mostly for corpora characterized bysmall variability of the pictorial contentNeed of structured approach that can isolate semanticallydifferent image portions

Text representationDocuments are collections of paragraphsParagraphs are collections of words

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 24: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Region-Keypoints Representation

bag−of−words

bag−of−regions

segmented image ...

wr

r1

r2

r3

r4

r5

Use an unsupervised learning model to extract perceptuallycoherent portions of the image rl

Use dense sampling to extract a set of local visual descriptors

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 25: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Hierarchical Region-Topic Probabilistic LatentSemantic Analysis (HRPLSA)

tz

D

T

d w

rd

φ

Wr

Complete model likelihood

LC(N|θ) =D∏

i=1

Ri∏l=1

K∑k=1

P(zk |di)Zilk

Wl∏j=1

T∑p=1

P(tp|zk )(φjp)Tkjp

nilj

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 26: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Model Fitting

HRPLSA parameters are fitted by applying ExpectationMaximization to the complete log-likelihood, e.g.

P(tp|zk ) =

∑Di=1

∑Ril=1 P(zk |di , rl)

∑Wlj=1 niljP(tp|zk ,wj)∑D

i ′=1∑Ri′

l ′=1 P(zk |di ′ , rl ′)ni ′l ′·

Model parameters θP(zk |di) - Document-specific mixing proportions of regiontopicsP(tp|zk ) - Mixing proportions of keyword topics given regionaspectsφjp - Multinomial keyword-topic distribution

Probabilistic inference is extended to images outside thetraining pool by folding-in

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 27: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Modeling Images and Words by HRPLSA(HRPLSA-ANN)

streetlight

sing

awning

window

sidewalk

plants

car

road

...

...

...

...

dsky

building

P (a|t)

P (a|z)

w1 w2 wW1

tW1t2

zR

t1t1

z2z1

t1t1

w1

t2

w2 wWR

tWR

Connection between region-topics (keyword-topics) and word islearned by maximizing constrained annotation log-likelihood

Lcal(N|θ′) =D∑

i=1

F∑f=1

mfi logK∑

k=1

P(af |zk )P(zk |di)

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 28: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Probabilistic InferenceA Few Examples

Most likely visual aspect within an image di

z i = arg maxzk∈Z

P(zk |di)

Most likely visual aspect z∗ for an image segment rl

z∗ = arg maxzk∈Z

P(zk |di , rl)

Most likely annotation a∗ for region rl in image dnew

a∗ = arg maxaf∈A

K∑k=1

P(af |zk )P(zk |dnew )

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 29: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Unsupervised Discovery and Segmentation of VisualClasses

Microsoft Research Cambridge dataset (MSRC-B1)240 images and 9 visual classesGround truth segmentation of visual classes

Image segments are assigned to the most likelydiscovered aspect and segmentation overlap is measured

ρor =GTo ∩ Rr

GTo ∪ Rr.

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 30: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Semantic Segmentation

Discovery of semantic visual aspects corrects segmentationerrors

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 31: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

MSRC-B1 Semantic Segmentation Result

Table: Segmentation Overlap on the MSRC-B1 Dataset

Algorithm Airplanes Bicycles Buildings Cars Cows Faces Grass Tree Sky Average

dLDA 0.43 0.50 0.16 0.45 0.52 0.44 0.60 0.71 0.74 0.51LDA10 0.11 0.06 0.09 0.14 0.11 0.40 0.41 0.69 0.41 0.27LDA15 0.08 0.56 0.06 0.14 0.14 0.43 0.57 0.62 0.41 0.33LDA20 0.10 0.50 0.40 0.15 0.58 0.45 0.54 0.46 0.50 0.40LDA25 0.14 0.52 0.21 0.17 0.48 0.44 0.45 0.59 0.50 0.39

HPRLSA1015 0.32 0.37 0.59 0.68 0.74 0.44 0.87 0.81 0.90 0.62

HPRLSA1020 0.35 0.56 0.65 0.67 0.74 0.47 0.89 0.47 0.91 0.63

HPRLSA1025 0.35 0.52 0.75 0.63 0.73 0.50 0.89 0.71 0.86 0.66

HPRLSA1520 0.36 0.49 0.56 0.60 0.68 0.52 0.89 0.77 0.91 0.64

HPRLSA1525 0.42 0.52 0.61 0.68 0.69 0.45 0.90 0.61 0.92 0.65

HPRLSA2025 0.44 0.43 0.48 0.52 0.67 0.48 0.88 0.75 0.91 0.61

HPRLSA1025 0.16 0.32 0.15 0.40 0.43 0.22 0.28 0.33 0.38 0.30

HPRLSA1525 0.24 0.31 0.13 0.38 0.37 0.19 0.25 0.21 0.45 0.28

Comparative analysis

HRPLSA

Multi-segmentation Latent Dirichlet Allocation (LDA)

Hierarchical Latent Dirichlet Allocation (hLDA)Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 32: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

MSRC-B1 Segmentation Example I

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 33: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

MSRC-B1 Segmentation Example II

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 34: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

MSRC-B1 Segmentation Example III

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 35: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Image Annotation

Washington Ground Truth dataset854 images manually annotated with 392 words427 training/test images

LabelMe dataset2688 images annotated with ∼ 400 words800 training / 1888 test images

Comparative analysisHRPLSA-ANNpLSA-FEATURESAnnotation Propagation (AP)

Precision/recall analysis

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 36: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Washington DatasetImage Annotation

Precision/Recall Plot

0 0.1 0.2 0.3 0.4 0.5 0.6 0.70.35

0.4

0.45

0.5

0.55

0.6

0.65

0.7

0.75

0.8

recall

prec

isio

n

plsa245

hrplsa245

ap245

plsa188

hrplsa188

ap188

plsa158

hrplsa158

ap158

Comparative precision/recall results on the Washington dataset

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 37: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Washington DatasetImage Annotation Example

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 38: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Washington DatasetRegion Annotation Example (I)

Examples of good HRPLSA-ANN annotations

clouds

people

people

house

skyclouds

tree

clouds

clouds50

100

150

200

250

300

350

400

450

tree treetree

tree

treetree

treesidewalk

tree

tree

building

50

100

150

200

250

300

350

400

450

500

window

clear

tree

house

treetree

tree

tree

tree

clear

tree

building

building

building

50

100

150

200

250

300

350

400

450

500

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 39: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Washington DatasetRegion Annotation Example (II)

Examples of bad HRPLSA-ANN annotations

flower

treetree

tree

treetree

tree

tree

tree

tree

tree

tree

tree

bush

50

100

150

200

250

300

350

400

450

500

tree

tree

tree

sky

cloudsclouds clouds

water

skysky

50

100

150

200

250

300

350

400

450

500

cloudshouse

tree

peopletree

people

tree

sky

tree

overcast50

100

150

200

250

300

350

400

450

500

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 40: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

Washington DatasetAnnotated Segments

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 41: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetImage Annotation

Precision/Recall Plot

0 0.1 0.2 0.3 0.4 0.5 0.6 0.70.35

0.4

0.45

0.5

0.55

0.6

0.65

0.7

0.75

0.8

Recall

Pre

cisi

on

plsa60

hrplsa60,80

plsa25

hrplsa25,50

ap

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 42: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetExample of Annotated Images

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 43: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetRecalling Specific Terms

Balcony

precision recall0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

prec

isio

n/re

call

balcony

hrplsaplsaap

Sign

precision recall0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

prec

isio

n/re

call

sign

hrplsaplsaap

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 44: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetRegion Annotation Issue I

Region annotation tends to predict only a limited number ofwords

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

build

ing car

field

mou

ntain

pers

onro

ad sea

sky

skys

crap

ertre

ewat

er

window

0 100 200 3000

200

400coast

0 100 200 3000

1000

2000forest

0 100 200 3000

500

1000highway

0 100 200 3000

1000

2000 city

0 100 200 3000

500

1000mountain

0 100 200 3000

500

1000country

0 100 200 3000

1000

2000street

0 100 200 3000

1000

2000tallbuilding

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 45: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetRegion Annotation Issue II

Distribution of region-topics in mountain caption

Topic 40

Topic 20

Topic 11

The unbalanced distribution of term occurrences introduces a bias inthe generation of the segment caption that yields to distinct regionaspects being associated to the same frequently co-occurring word

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 46: Probabilistic generative models for machine vision

Machine Vision BackgroundRegion Annotation by Hierarchical Region Topic Discovery

Visual Content RepresentationHierarchical Probabilistic Latent Semantic AnalysisExperimental Evaluation

LabelMe DatasetAnnotated Segments

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 47: Probabilistic generative models for machine vision

ConclusionConcluding RemarksBibliography and Online Resources

Wrapping up...

Inspiration can be sought in related application fieldsBag of words representationGenerative models for textNeed to pay attention to the underlying assumptions of amodel!

Hierarchical multi-resolution latent aspect model (HRPLSA)Unsupervised semantic segmentation of visual contentMulti-level image captioning

Future challengesGeneralize multi-resolution representation and topichierarchy beyond a two-layered structureVideo/multimedia understanding

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 48: Probabilistic generative models for machine vision

ConclusionConcluding RemarksBibliography and Online Resources

Online ResourcesA notable introductory course to object recognition (with code)

http://people.csail.mit.edu/torralba/shortCourseRLOC/

Code repositories

http://www.cs.ubc.ca/~lowe/keypoints/ (OriginalSIFT software by D. Lowe)

http://vlfeat.org/ (SIFT, MSER and VQ routines)

www.robots.ox.ac.uk/~vgg/research/affine/ (Affinedetectors and more)

http://www.di.ens.fr/~russell/projects/mult_seg_discovery/index.html (Multiple segmentation LDA)

Image annotation tool and repository

http://labelme.csail.mit.edu/ (LabelMe)

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 49: Probabilistic generative models for machine vision

ConclusionConcluding RemarksBibliography and Online Resources

Bibliography

D. Bacciu, A Perceptual Learning Model to Discover the Hierarchical Latent Structure of Image Collections,Ph.D. Thesis, 2008.

J. Sivic and A. Zisserman. Video google: a text retrieval approach to object matching in videos.Proceedings of the Ninth IEEE International Conference on Computer Vision, ICCV 2003, pages1470–1477, vol.2, Oct. 2003.

J. Sivic, B.C. Russell, A.A. Efros, A. Zisserman, and W.T. Freeman. Discovering objects and their location inimages. Proceedings of the Tenth IEEE International Conference on Computer Vision, ICCV 2005,370–377, Oct. 2005.

J. Sivic, B. C. Russell, A. Zisserman, W. T. Freeman, and A. A. Efros. Unsupervised discovery of visualobject class hierarchies. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2008.

K. Barnard, P. Duygulu, N. de Freitas, D. Forsyth, D. M. Blei, and M. I. Jordan. Matching words and pictures.Journal of Machine Learning Research, vol. 3, 1107–1135, Feb. 2003.

D. M. Blei and M. I. Jordan. Modeling annotated data. Proceedings of the International Conference onResearch and Development in Information Retrieval, SIGIR 2003, Aug. 2003.

F. Monay and D. Gatica-Perez. Modeling semantic aspects for cross-media image indexing. IEEETransactions on Pattern Analysis and Machine Intelligence, 29(10):1802–1817, Oct. 2007.

Davide Bacciu Probabilistic Generative Models for Machine Vision

Page 50: Probabilistic generative models for machine vision

ConclusionConcluding RemarksBibliography and Online Resources

Thank You

Davide Bacciu Probabilistic Generative Models for Machine Vision