

Discovering Objects and their Location in Images

Josef Sivic¹, Bryan C. Russell², Alexei A. Efros³, Andrew Zisserman¹ and William T. Freeman²

Goal: Discover visual object categories and their segmentation given a collection of unlabelled images

Introduction

Represent an image as a histogram of “visual words”

The topic discovery models

Probabilistic Latent Semantic Analysis (pLSA) [Hofmann’99]

Experiment I: Caltech Dataset

pLSA graphical model

Five samples from a ‘motorbike’ visual word

Improving localization using doublets

¹ Oxford University, ² MIT, ³ Carnegie Mellon University

Experiment II: MIT dataset

Overview

Find topic vectors P(w|z) common to all documents and mixture coefficients P(z|d) specific to each document. Fit model by maximizing likelihood of data using EM.
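The EM fit described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `plsa`, the random initialisation, the fixed iteration count and the small constant `1e-12` guarding against division by zero are all assumptions made for the example. `counts[d, w]` is the number of times visual word w occurs in image d.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA by EM (illustrative sketch).

    counts: (D, W) array of word counts n(w, d).
    Returns P(w|z) with shape (K, W) and P(z|d) with shape (D, K).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialisation (an assumption), normalised to valid distributions.
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(w|z) P(z|d); shape (D, K, W).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_w_z, p_z_d
```

The dense (D, K, W) posterior array keeps the sketch short; for a few thousand images and a vocabulary of about 1,000 words one would more likely loop over documents or use sparse counts.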

pLSA Model fitting:

Assign each image to a topic with the highest P(z|d)
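A small sketch of this assignment rule, together with a confusion table of the kind reported in the results. The helper names are hypothetical, and the ground-truth labels are assumed to be used only for evaluation, never for learning.

```python
import numpy as np

def assign_topics(p_z_d):
    """Index of the topic with the highest P(z|d) for each image."""
    return np.asarray(p_z_d).argmax(axis=1)

def confusion_table(true_labels, topics, n_classes, n_topics):
    """table[c, k] = number of images of true class c assigned to topic k."""
    table = np.zeros((n_classes, n_topics), dtype=int)
    # np.add.at accumulates correctly when an index pair repeats.
    np.add.at(table, (np.asarray(true_labels), np.asarray(topics)), 1)
    return table
```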

Learn K = (5,6,7) topics

Background is better modelled by multiple topics

Pre-learning background topics on a separate background dataset improves results

Performance on novel images is comparable with the semi-supervised method of [Fergus et al.’03]

Confusion tables for K = 5, 6, 7 learned topics

Form a new vocabulary from pairs of locally co-occurring regions
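One way such a doublet vocabulary could be formed is sketched below. The poster does not give the construction details, so the neighbourhood rule (each region paired with its `n_neighbours` spatially nearest regions) and the representation of a doublet as an unordered pair of word ids are assumptions made for illustration.

```python
from collections import Counter

import numpy as np

def form_doublets(positions, words, n_neighbours=5):
    """Return one unordered word pair (a, b), a <= b, per neighbouring region pair.

    positions: (N, 2) region centres; words: N visual word ids.
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise squared distances between region centres, shape (N, N).
    d2 = ((positions[:, None, :] - positions[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a region is not its own neighbour
    k = min(n_neighbours, len(positions) - 1)
    pairs = set()
    for i in range(len(positions)):
        for j in np.argsort(d2[i])[:k]:
            # Store each region pair once, whichever side found it.
            pairs.add((min(i, int(j)), max(i, int(j))))
    return [tuple(sorted((int(words[i]), int(words[j])))) for i, j in sorted(pairs)]

def doublet_histogram(doublets):
    """Count how often each doublet occurs in one image."""
    return Counter(doublets)
```

The resulting doublet counts play the same role as the singlet word histogram when fitting the topic model.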

Doublet example I

Doublet example II

Doublet segmentation

Singlet segmentation

4 of the 10 learned topics shown by the 5 most probable images for each topic

- 2873 images, learn 10 topics

Singlet segmentation

All detected visual words

“Buildings” “Trees / Grass”

“Bookshelves” “Computers”

Example Images with multiple objects

Image representation

Approach: 1) Represent an image as a collection of visual words

2) Apply topic discovery models from statistical text analysis

Results

Histogram of visual words

• Detect affine covariant regions

• Represent each region by a SIFT descriptor

• Build visual vocabulary by k-means clustering (K~1,000)

• Assign each region to the nearest cluster centre
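The last two steps of this pipeline can be sketched as below, assuming the region descriptors (for example 128-dimensional SIFT vectors) have already been extracted. The plain Lloyd's k-means, its initialisation and the function names are illustrative choices, not the authors' implementation.

```python
import numpy as np

def assign_words(descriptors, centres):
    """Index of the nearest cluster centre (the visual word) per descriptor."""
    # Squared Euclidean distances, shape (N, k).
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, D) array of cluster centres."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from k distinct descriptors.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[idx].astype(float)
    for _ in range(n_iter):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[j] = members.mean(axis=0)
    return centres

def word_histogram(descriptors, centres):
    """Histogram of visual words for the descriptors of one image."""
    return np.bincount(assign_words(descriptors, centres), minlength=len(centres))
```

The vocabulary is built once from descriptors pooled over the whole collection; each image is then reduced to one histogram of length k.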

Five samples from an ‘airplane’ visual word

Mikolajczyk and Schmid’02, Schaffalitzky and Zisserman’02, Matas et al. ’02, Lowe’99, Sivic and Zisserman’03

Examples of visual words

Doublet formation

Segmentation

For a given word w_i in document d_j examine posterior probability over topics.

Faces, Motorbikes, Airplanes, Cars

Background I, Background II, Background III

Visual words colour coded according to the topic with the highest probability
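Under the pLSA model the posterior over topics for word w_i in document d_j is proportional to P(w_i|z) P(z|d_j), so the colour coding can be sketched as below. The array layout (`p_w_z` with shape (K, W), `p_z_d` with shape (D, K)) and the function names are assumptions made for the example.

```python
import numpy as np

def word_topic_posterior(p_w_z, p_z_d, doc, word_ids):
    """P(z | w_i, d_j) for each listed word occurrence; shape (n, K)."""
    # Numerator P(w_i|z) P(z|d_j) for every occurrence and topic.
    joint = p_w_z[:, word_ids].T * p_z_d[doc][None, :]
    return joint / joint.sum(axis=1, keepdims=True)

def label_words(p_w_z, p_z_d, doc, word_ids):
    """Topic with the highest posterior for each word occurrence."""
    return word_topic_posterior(p_w_z, p_z_d, doc, word_ids).argmax(axis=1)
```

Each detected region then takes the colour of its label, and regions whose label is an object topic rather than a background topic give the rough segmentation.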

Example motorbike segmentation

Example airplane segmentation

Image Classification

Four object categories: faces, motorbikes, airplanes and cars (rear view), 3,190 images in total, plus 900 background images

LDA graphical model

Latent Dirichlet Allocation (LDA) [Blei et al.’03]

Treat multinomial weights over topics as random variables. Fit model using Gibbs sampling [Griffiths and Steyvers’04].

Results shown only for pLSA. LDA had very similar performance.

Experiment III: Application to image retrieval

Learn topic vectors on Caltech database

Represent new query image in terms of learned topic vectors

Retrieved images using visual word histograms

Retrieved images using pLSA ‘object’ coefficients P(z|d)
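A sketch of how a query could be expressed in the learned topics and used for ranking. Holding P(w|z) fixed and running EM only over the query's P(z|d) is the usual pLSA "fold-in" heuristic; that the poster used exactly this procedure, and the Euclidean distance used for ranking, are assumptions.

```python
import numpy as np

def fold_in(hist, p_w_z, n_iter=50):
    """Estimate P(z|d) for a new word histogram with P(w|z) held fixed."""
    hist = np.asarray(hist, dtype=float)
    n_topics = p_w_z.shape[0]
    p_z = np.full(n_topics, 1.0 / n_topics)
    for _ in range(n_iter):
        # E-step: posterior over topics for every word, shape (K, W).
        joint = p_z[:, None] * p_w_z
        post = joint / (joint.sum(axis=0, keepdims=True) + 1e-12)
        # M-step: update only the mixture coefficients.
        p_z = (post * hist[None, :]).sum(axis=1)
        p_z /= p_z.sum() + 1e-12
    return p_z

def rank_by_topics(query_p_z, database_p_z_d):
    """Database indices sorted from most to least similar to the query."""
    dist = np.linalg.norm(database_p_z_d - query_p_z[None, :], axis=1)
    return np.argsort(dist)
```

Ranking directly on the raw word histograms, the baseline in this experiment, corresponds to applying the same distance to the histograms instead of to P(z|d).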

Example face segmentation

Represent each keyframe using topic vectors learned on Caltech database

Pretty Woman (6,641 keyframes)

Retrieve images within Caltech database

Query image

pLSA

Retrieve images in movie Pretty Woman

Raw word histograms

Precision – Recall plot
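For reference, the points of such a plot can be computed from a ranked result list as in the sketch below, where `ranked_relevant` is a hypothetical list of 0/1 flags marking whether each retrieved image, in rank order, shows the queried object.

```python
import numpy as np

def precision_recall(ranked_relevant):
    """Precision and recall after each position of a ranked list."""
    rel = np.asarray(ranked_relevant, dtype=bool)
    hits = np.cumsum(rel)
    precision = hits / np.arange(1, len(rel) + 1)
    # Guard against a list that contains no relevant item.
    recall = hits / max(int(rel.sum()), 1)
    return precision, recall
```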

Find visual words

Form histograms

Discover topics

Visual Polysemy. A single visual word occurring on different (but locally similar) parts of different object categories.

Visual Synonyms. Two different visual words representing a similar part of an object (e.g. the wheel of a motorbike).

w … visual words

d … documents (images)

z … topics (‘objects’)

P(z|d) and P(w|z) are multinomial distributions
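In this notation the standard pLSA model writes the word distribution of each document as a mixture over K topics, and EM maximises the log-likelihood of the observed counts. Here n(w_i, d_j), the number of occurrences of word w_i in document d_j, is notation introduced for this note:

```latex
P(w_i \mid d_j) = \sum_{k=1}^{K} P(w_i \mid z_k)\, P(z_k \mid d_j),
\qquad
\mathcal{L} = \sum_{i}\sum_{j} n(w_i, d_j)\, \log P(w_i \mid d_j)
```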
