
CS 1699: Intro to Computer Vision

Intro to Machine Learning & Visual Recognition – Part II

Prof. Adriana Kovashka, University of Pittsburgh

October 27, 2015

Homework

• HW3: push back to Nov. 3

• HW4: push back to Nov. 24

• HW5: make half-length, still 10% of overall grade, out Nov. 24, due Dec. 10

• Take out a small piece of paper and vote yes/no on this proposal

Other announcements

• Piazza

– Can get participation credit by asking/answering

• Feedback from surveys

Plan for today

• Visual recognition problems

• Recognition pipeline

– Features and data

– Challenges

• Overview of some methods for classification

• Challenges and trade-offs

Some translations

• Feature vector = descriptor = representation

• Recognition often involves classification

• Classes = categories (hence classification = categorization)

• Training = learning a model (e.g. classifier), happens at training time from training data

• Classification = prediction, happens at test time

Classification

• Given a feature representation for images, learn a model for distinguishing features from different classes

[Figure: zebra vs. non-zebra feature space with a decision boundary]

Slide credit: L. Lazebnik

Image categorization

• Cat vs Dog

Slide credit: D. Hoiem

Image categorization

• Object recognition

Caltech 101 Average Object Images
Slide credit: D. Hoiem

Image categorization

• Place recognition

Places Database [Zhou et al. NIPS 2014]
Slide credit: D. Hoiem

Region categorization

• Material recognition

[Bell et al. CVPR 2015]
Slide credit: D. Hoiem

Recognition: A machine learning approach

The machine learning framework

• Apply a prediction function to a feature representation of the image to get the desired output:

f( [image of an apple] ) = “apple”

f( [image of a tomato] ) = “tomato”

f( [image of a cow] ) = “cow”

Slide credit: L. Lazebnik

The machine learning framework

y = f(x)

• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set

• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)

Here y is the output, f is the prediction function, and x is the image feature.

Slide credit: L. Lazebnik
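Aside (not from the slides): a minimal sketch of this train/test loop in Python. The library (scikit-learn), the choice of classifier, and the synthetic data are all illustrative assumptions.

```python
# A minimal sketch of the y = f(x) framework. Library and data are
# illustrative assumptions, not from the slides.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training set {(x1, y1), ..., (xN, yN)}: feature vectors with labels.
X_train = rng.normal(size=(100, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# Training: estimate f by minimizing prediction error on the training set.
f = LogisticRegression().fit(X_train, y_train)

# Testing: apply f to a never-before-seen example x and output y = f(x).
x_test = rng.normal(size=(1, 2))
print(f.predict(x_test))
```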

Steps

[Pipeline figure: Training: training images and training labels → image features → training → learned model. Testing: test image → image features → learned model → prediction.]

Slide credit: D. Hoiem and L. Lazebnik

BOARD

Popular global image features

• Raw pixels (and simple functions of raw pixels)

• Histograms, bags of features

• GIST descriptors [Oliva and Torralba, 2001]

• Histograms of oriented gradients (HOG) [Dalal and Triggs, 2005]

Slide credit: L. Lazebnik

• Color

• Texture (filter banks or HOG over regions)

[Figure: L*a*b* color space, HSV color space]

What kind of things do we compute histograms of?

Slide credit: D. Hoiem

What kind of things do we compute histograms of?

• Histograms of descriptors

• “Bag of visual words”

SIFT – [Lowe IJCV 2004]

Slide credit: D. Hoiem
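Aside (not from the slides): a minimal sketch of one such histogram feature, a global color histogram, in Python. The bin count and the random stand-in image are illustrative assumptions.

```python
# A sketch of a histogram-style global feature: a per-channel color histogram.
import numpy as np

def color_histogram(image, bins=8):
    """Concatenate per-channel intensity histograms into one feature vector."""
    feats = []
    for c in range(image.shape[2]):          # e.g. R, G, B channels
        hist, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist)
    feat = np.concatenate(feats).astype(float)
    return feat / feat.sum()                 # normalize away the image size

image = np.random.randint(0, 256, size=(64, 64, 3))  # stand-in for a real image
print(color_histogram(image).shape)                   # (24,): 8 bins x 3 channels
```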

Steps

[Pipeline figure: Training: training images and training labels → image features → training → learned model. Testing: test image → image features → learned model → prediction.]

Slide credit: D. Hoiem and L. Lazebnik

Recognition training data

• Images in the training set must be annotated with the “correct answer” that the model is expected to produce

[Example: training image labeled “Motorbike”]

Slide credit: L. Lazebnik

Challenges: robustness

• Illumination
• Object pose
• Clutter
• Viewpoint
• Intra-class appearance
• Occlusions

Slide credit: K. Grauman

Challenges: importance of context

Slide credit: Fei-Fei, Fergus & Torralba

Painter identification

• How would you learn to identify the author of a painting?

Goya Kirchner Klimt Marc Monet Van Gogh

Plan for today

• Visual recognition problems

• Recognition pipeline

– Features and data

– Challenges

• Overview of some methods for classification

• Challenges and trade-offs

One way to think about it…

• Training labels dictate that two examples are the same or different, in some sense

• Features and distances define visual similarity

• Goal of training is to learn feature weights so that visual similarity predicts label similarity

– Linear classifier: confidence in positive label is a weighted sum of features

– What are the weights?

• We want the simplest function that is confidently correct

Adapted from D. Hoiem

BOARD

Supervised classification

• Given a collection of labeled examples, come up with a function that will predict the labels of new examples.

• How good is some function that we come up with to do the classification?

• Depends on

– Mistakes made

– Cost associated with the mistakes

[Figure: training examples labeled “four” and “nine”, and a novel input to classify]

Slide credit: K. Grauman

Supervised classification

• Given a collection of labeled examples, come up with a function that will predict the labels of new examples.

• Consider the two-class (binary) decision problem

– L(4→9): Loss of classifying a 4 as a 9

– L(9→4): Loss of classifying a 9 as a 4

• Risk of a classifier s is expected loss:

R(s) = Pr(4→9 | using s) · L(4→9) + Pr(9→4 | using s) · L(9→4)

• We want to choose a classifier so as to minimize this total risk.

Slide credit: K. Grauman
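Aside (not from the slides): a tiny Python sketch of acting on this idea, picking the label with the lower expected loss. The posterior and loss values are made-up numbers, chosen to show how an asymmetric loss shifts the decision.

```python
# Pick the label whose expected loss is smaller, given P(4|x) and a loss matrix.
# All numbers are illustrative assumptions.
def min_risk_label(p4, loss_4to9=1.0, loss_9to4=10.0):
    p9 = 1.0 - p4
    loss_choose_4 = p9 * loss_9to4   # we say "four" but the truth is "nine"
    loss_choose_9 = p4 * loss_4to9   # we say "nine" but the truth is "four"
    return "four" if loss_choose_4 < loss_choose_9 else "nine"

# With this asymmetric loss we must be very sure before saying "four":
print(min_risk_label(p4=0.8))   # -> "nine", despite P(4|x) = 0.8
```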

Supervised classification

[Figure: class-conditional distributions over feature value x]

Optimal classifier will minimize total risk. At the decision boundary, either choice of label yields the same expected loss.

If we choose class “four” at the boundary, expected loss is:

P(class is 9 | x) L(9→4) + P(class is 4 | x) L(4→4) = P(class is 9 | x) L(9→4), since L(4→4) = 0

If we choose class “nine” at the boundary, expected loss is:

P(class is 4 | x) L(4→9)

So, the best decision boundary is at the point x where

P(class is 9 | x) L(9→4) = P(class is 4 | x) L(4→9)

Slide credit: K. Grauman

Supervised classification

To classify a new point, choose class with lowest expected loss; i.e., choose “four” if

P(class is 9 | x) L(9→4) < P(class is 4 | x) L(4→9)

[Figure: class-conditional distributions over feature value x, with shaded regions for the loss of choosing “four” and the loss of choosing “nine”]

Optimal classifier will minimize total risk. At the decision boundary, either choice of label yields the same expected loss.

Slide credit: K. Grauman

Disclaimers

• We will often assume the same loss for all possible types of misclassifications

• We won’t always build probability distributions – often we’ll just find a decision boundary (using discriminative methods)

• What’s the simplest classifier you can think of?

Nearest neighbor classifier

f(x) = label of the training example nearest to x

• All we need is a distance function for our inputs

• No training required!

[Figure: a test example between training examples from class 1 and training examples from class 2]

Slide credit: L. Lazebnik
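Aside (not from the slides): the whole classifier fits in a few lines of Python. Euclidean distance is an assumption here; the slide only requires some distance function.

```python
# Nearest neighbor classification: no training, just a distance function.
import numpy as np

def nn_classify(X_train, y_train, x):
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every training example
    return y_train[np.argmin(dists)]             # label of the nearest one
```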

K-Nearest Neighbors classification

k = 5

Slide credit: D. Lowe

• For a new point, find the k closest points from training data

• Labels of the k points “vote” to classify

If the query lands here, the 5 NN consist of 3 negatives and 2 positives, so we classify it as negative. (In the figure: black = negative, red = positive.)

1-nearest neighbor

[Figure: 2D feature space (axes x1, x2) with training points from two classes (x and o) and query points (+); each query takes the label of its single nearest neighbor]

Slide credit: D. Hoiem

3-nearest neighbor

[Figure: the same 2D points; each query (+) is labeled by a majority vote of its 3 nearest neighbors]

Slide credit: D. Hoiem

5-nearest neighbor

[Figure: the same 2D points; each query (+) is labeled by a majority vote of its 5 nearest neighbors]

Slide credit: D. Hoiem

What are the tradeoffs of having too large a k? Too small a k?
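Aside (not from the slides): one way to probe that question empirically, a Python sketch using scikit-learn's KNeighborsClassifier on synthetic data. With k = 1 the classifier memorizes the training set (training accuracy 1.0, so it also memorizes noise); with a very large k the boundary over-smooths toward the majority class.

```python
# Probing the effect of k on a synthetic problem with a circular class boundary.
# Data and choices of k are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)  # positives inside a circle

for k in (1, 5, 51):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, knn.score(X, y))   # training accuracy; k = 1 is always 1.0
```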

A nearest neighbor recognition example:

im2gps: Estimating Geographic Information from a Single Image.

James Hays and Alexei Efros. CVPR 2008.

http://graphics.cs.cmu.edu/projects/im2gps/

Where in the World?

Slides: James Hays


How much can an image tell about its geographic location?

Slide credit: James Hays

Nearest Neighbors according to gist + bag of SIFT + color histogram + a few others

Slide credit: James Hays

6+ million geotagged photos by 109,788 photographers

Slides: James Hays

A scene is a single surface that can be represented by global (statistical) descriptors

Spatial Envelope Theory of Scene Representation, Oliva & Torralba (2001)

Slide credit: Aude Oliva

Scene Matches

[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays

The Importance of Data

[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Slides: James Hays

Nearest neighbor classifier

f(x) = label of the training example nearest to x

• All we need is a distance function for our inputs

• No training required!

[Figure: a test example between training examples from class 1 and training examples from class 2]

Slide credit: L. Lazebnik

Evaluating Classifiers

• Accuracy

– # correctly classified / # all test examples

• Precision/recall

– Precision = # retrieved positives / # retrieved

– Recall = # retrieved positives / # positives

• F-measure = 2PR / (P + R)
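Aside (not from the slides): the three metrics above computed directly from their definitions, on made-up binary predictions.

```python
# Accuracy, precision, recall, and F-measure from their definitions.
# The label arrays are illustrative assumptions.
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])

accuracy  = (y_pred == y_true).mean()          # correct / all test examples
retrieved = y_pred == 1                        # examples classified positive
precision = (y_true[retrieved] == 1).mean()    # retrieved positives / retrieved
recall    = (y_pred[y_true == 1] == 1).mean()  # retrieved positives / positives
f_measure = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f_measure)  # 0.75 0.75 0.75 0.75
```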

Discriminative classifiers

Learn a simple function of the input features that correctly predicts the true labels on the training set

Training Goals

1. Accurate classification of training data

2. Correct classifications are confident

3. Classification function is simple

y = f(x)

Slide credit: D. Hoiem

Linear classifier

• Find a linear function to separate the classes

f(x) = sgn(w1x1 + w2x2 + … + wDxD) = sgn(w · x)

Slide credit: L. Lazebnik

What about this line?
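Aside (not from the slides): a sketch of evaluating f(x) = sgn(w · x) with assumed, hand-picked weights; how to learn w from data is exactly one of the open questions in the cons list below.

```python
# Evaluating a linear classifier f(x) = sgn(w . x). The weights are
# made-up; learning them is the next topic.
import numpy as np

def linear_classify(w, x):
    # Positive on one side of the hyperplane w . x = 0, negative on the other
    # (np.sign returns 0 exactly on the boundary).
    return np.sign(np.dot(w, x))

w = np.array([1.0, -2.0])                          # assumed weights
print(linear_classify(w, np.array([3.0, 1.0])))    # -> 1.0
print(linear_classify(w, np.array([0.0, 1.0])))    # -> -1.0
```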

NN vs. linear classifiers

• NN pros:

+ Simple to implement

+ Decision boundaries not necessarily linear

+ Works for any number of classes

+ Nonparametric method

• NN cons:

– Need good distance function

– Slow at test time (large search problem to find neighbors)

– Storage of data

• Linear pros:

+ Low-dimensional parametric representation

+ Very fast at test time

• Linear cons:

– Only works for two classes as formulated

– How to train the linear function?

– What if data is not linearly separable?

Adapted from L. Lazebnik
