High-Level Computer Vision - Final Project (Max Planck Institute for Informatics)

Logistics

• Start date: 12.06

• Proposal:

‣ 3 slides due 10.06 – task / setup definition

• Interim presentation (5+5 min) on 03.07

‣ Progress report / problems encountered / feedback

• Final presentation on 23.07

‣ Progress and presentation evaluation

• Written report submitted on 23.07

‣ Report evaluation

• Reports must indicate the assignments of each group member


Project goal

• Choose an existing computer vision method

• Choose a dataset and task:

‣ Datasets: Caltech4, Caltech101, Buffy Stickmen, HOI, UKBench, etc.

‣ Tasks: object class recognition, object detection/localization, person identification, gender recognition, scene classification

• Apply the method to the task and present an analysis of your results

‣ Necessary simplifications are OK (e.g. additional annotations)

‣ Can you think of a new twist to the method?

Proposal Slides Structure

• Slide 1 – Task and motivation

‣ Task statement and definitions

‣ Motivation

‣ Related work

• Slide 2 – Models, tools, novelty

‣ Tentative material and methods

- From coursework, open source and research code

‣ Investigation

- Feature, model, etc.

• Slide 3 – Analysis

‣ Benchmark

- Evaluation dataset and metric


Project report structure

• Title

• Abstract

• Introduction

• Related work

• Proposed method explained

• Experimental results

‣ Dataset and benchmark

• Conclusions and Future work

• References


Datasets

• Caltech-4/Caltech-101

‣ object class recognition

‣ object localization


Datasets

• Buffy:

‣ person identification: recognize Buffy

‣ gender recognition


Datasets

• RGB-D Object Dataset (http://www.cs.washington.edu/rgbd-dataset/)

‣ Object instance retrieval

‣ Object category classification


Datasets

• RGB-D Indoor Scenes Dataset (http://cs.nyu.edu/~silberman/datasets/)

‣ Scene classification

‣ Object detection, recognition, segmentation


Datasets

• UK Bench (www.vis.uky.edu/~stewe/ukbench/)

‣ Object instance retrieval


Datasets

• Human-Object Interaction (HOI)

‣ sport scene recognition


Dataset

• Or capture your own

‣ Digital camera, mobile phone, Google Glass, etc.

‣ Microsoft Kinect

‣ Web / Google image search


Methods

• Bag-of-Words (BoW)

‣ Lecture 3 on 30.04

‣ Tools: VLFeat (exercise 4), SVM (exercise 5), Random Forests

• Relevant paper

‣ G. Csurka, C. Bray, C. Dance, and L. Fan. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision (ECCV), 2004.

• Publicly available SVM implementations

‣ http://svmlight.joachims.org/

‣ http://www.csie.ntu.edu.tw/~cjlin/liblinear/
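
As a rough illustration of the Bag-of-Words representation named on this slide, the sketch below assigns each local descriptor to its nearest visual word and builds a normalized histogram that could then be fed to an SVM. It assumes the vocabulary has already been computed (for example by k-means on training descriptors); the function names and the toy 2-D descriptors are hypothetical and are not taken from the exercises or from VLFeat.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor (squared Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, vocabulary):
    """L1-normalized histogram of visual-word counts for one image."""
    if not descriptors:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]


# Toy 2-D "descriptors" for illustration; real SIFT descriptors are 128-dimensional.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.0, 6.0)]
histogram = bow_histogram(descriptors, vocabulary)
print(histogram)
```

In a project, the brute-force nearest-word search would normally be replaced by the quantization routines of the chosen toolbox; the histogram per image is the feature vector passed to the classifier.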


Methods

• Histograms of Oriented Gradients (HOG)

‣ Lecture 7 on 28.05

‣ Tools: HOG/SVM (exercise 5), Random Forests

• Relevant paper

‣ N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. In CVPR, 2005.

• An optimized HOG implementation may be found at

‣ http://www.cs.brown.edu/~pff/latent/
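
To make the core idea of HOG concrete, here is a minimal sketch of the orientation histogram for a single cell: each pixel votes for its unsigned gradient orientation, weighted by gradient magnitude. It is a simplification under stated assumptions (centered differences on interior pixels only, hard binning, no block normalization) and is not the optimized implementation linked above; the function name is hypothetical.

```python
import math


def cell_orientation_histogram(cell, num_bins=9):
    """Histogram of unsigned gradient orientations (0 to 180 degrees) for one cell.

    `cell` is a list of rows of grayscale intensities. Gradients use centered
    differences on interior pixels only, and each pixel votes with its gradient
    magnitude into a single bin (no interpolation between neighbouring bins).
    """
    histogram = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no orientation to vote for
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = min(int(angle // bin_width), num_bins - 1)
            histogram[bin_index] += magnitude
    return histogram


# A vertical edge: intensity increases from left to right, so gradients point along x.
cell = [[0, 0, 10, 10] for _ in range(4)]
hog_cell = cell_orientation_histogram(cell)
print(hog_cell)
```

A full descriptor would concatenate such histograms over a grid of cells and normalize them over overlapping blocks, as described in the paper cited above.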


Methods

• Interest points, vocabulary trees and geometry

‣ Lecture 3 on 30.04

‣ Tools:

- Hessian, Harris, DoG keypoint detectors (exercise 3)

- SIFT descriptor

- Vocabulary tree (exercise 4)

- Epipolar geometry (Homography H, Fundamental matrix F)

• Relevant paper

‣ D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.


Methods

• Implicit Shape Model (ISM)

‣ Lecture 8 on 11.06

‣ Tools: VLFeat

• Paper:

‣ B. Leibe, A. Leonardis, B. Schiele. Robust Object Detection with Interleaved Categorization and Segmentation. In Special Issue on Learning for Recognition and Recognition for Learning (IJCV), 2008.


Methods

• Pictorial Structures (PS)

‣ Lecture 9 on 18.06

‣ Tools: annotation tool for body parts (provided), distance transform (provided), HOG descriptor for part appearance

• Paper:

‣ P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part Based Models. In IEEE TPAMI, 2010.


Methods

• Image segmentation (IS) and video segmentation (VS)

‣ Lecture 10 on 02.07

‣ Tools:

- IS (http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html)

- VS (http://www.d2.mpi-inf.mpg.de/content/video-segmentation-superpixels-0)

- Interactive (GrabCut)

‣ Papers

- P. Arbelaez, M. Maire, C. Fowlkes and J. Malik. Contour Detection and Hierarchical Image Segmentation. IEEE TPAMI 2011.

- F. Galasso, R. Cipolla and B. Schiele. Video Segmentation with Superpixels. In ACCV 2012.

- C. Rother, V. Kolmogorov and A. Blake. GrabCut. In SIGGRAPH 2004.


Features

• RGB, RG color histograms

• Harris, Hessian, Laplacian, DoG keypoints

• SIFT descriptors

• HOG descriptors

• Others:

‣ Local Binary Patterns (LBP)

‣ …
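
As one concrete example of the simplest feature in this list, the sketch below computes an RG color histogram. It assumes that "RG" refers to normalized rg chromaticity, r = R/(R+G+B) and g = G/(R+G+B), which discards overall brightness; if the course uses a different definition, the binning step is the only part that changes. The function name and the bin count are hypothetical choices.

```python
def rg_histogram(pixels, num_bins=4):
    """Normalized 2-D histogram over rg chromaticity for a list of (R, G, B) pixels."""
    histogram = [[0.0] * num_bins for _ in range(num_bins)]
    counted = 0
    for red, green, blue in pixels:
        total = red + green + blue
        if total == 0:
            continue  # chromaticity is undefined for pure black
        r = red / total
        g = green / total
        # Values equal to 1.0 are clamped into the last bin.
        r_bin = min(int(r * num_bins), num_bins - 1)
        g_bin = min(int(g * num_bins), num_bins - 1)
        histogram[r_bin][g_bin] += 1.0
        counted += 1
    if counted:
        histogram = [[value / counted for value in row] for row in histogram]
    return histogram


# Bright red and dark red fall into the same bin because brightness is normalized away.
pixels = [(255, 0, 0), (128, 0, 0), (0, 200, 0), (0, 0, 0)]
hist = rg_histogram(pixels)
print(hist[3][0], hist[0][3])
```

Two such histograms can be compared with any histogram distance (for example intersection or chi-square), which gives a simple baseline before moving to SIFT or HOG features.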


For a good project: 5 W’s

• What? (a problem)

• Why? (motivation)

• How? (proposed strategy)

• Where? (dataset and benchmark)

• Who? (team assignments)

It is recommended to include

• A baseline

It is desired to include your considerations on

• Influence of parameter and dataset choice

• Results: what is expected and what is surprising, not just numbers!

Observations must be substantiated by results or references


Example Projects

• Gender Recognition

• Age Recognition

• Good vs. bad fruit recognition

• Other recognition tasks, etc.

• Object recognition or detection with the Kinect (RGB + Depth)

• Image retrieval for 3D objects

• Image retrieval, keypoints and invariance

• Object retrieval in videos, on a mobile phone

• Image and 3D object retouch

• Person identification

• Image and video segmentation

• Detection and segmentation


• Questions?
