Enhancing Human-Machine Communication via Visual Attributes
Devi Parikh, Virginia Tech
Interacting with Vision Systems
User Supervisor
Semantic Gap
Mode of communication is important
• Necessary for communication
  – Language that humans understand (semantic)
  – Language that machines understand (visual)
• Attributes
  – Example: furry, natural, chubby, shiny, etc.
  – Better features, deeper image understanding, etc. (Farhadi et al., Kumar et al., Lampert et al., etc.)
  – Human-machine communication
Role of the Human: User, Supervisor
Communicator: Human, Machine
• Image Search: “My missing brother is fuller-faced than this boy.”
• Instilling Domain Knowledge: “Polar bears are white and larger than rabbits.”
• Characterizing Failure Modes: “If the image is blurry or the face is not frontal, I may fail.”
• Interpretable Models: “I think this is a polar bear because this is a white and furry animal.”
• Active and Interactive Learning
• Reading Between the Lines
Image Search
Query: “black shoes”
…
Binary Relevance Feedback
Image Search
Query: “black shoes”
…
“shinier than these”
“more formal than these”
…
Relative Attributes
Openness
Linear ranking function: open
Training
Testing
[Parikh and Grauman, ICCV 2011]
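The slide above mentions a linear ranking function learned per attribute (here, openness). A minimal sketch of that idea follows, assuming a simplified pairwise hinge loss trained by subgradient descent rather than the exact formulation of the cited paper. The function name `train_ranker`, the toy features, and the ordered pairs are all hypothetical.

```python
import numpy as np

def train_ranker(X, ordered_pairs, epochs=200, lr=0.1, reg=0.01):
    """Learn w so that w @ X[i] exceeds w @ X[j] for each (i, j) pair.

    Each pair (i, j) states that image i shows MORE of the attribute
    than image j. Simplified pairwise hinge loss, subgradient descent.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = reg * w                      # L2 regularization term
        for i, j in ordered_pairs:
            diff = X[i] - X[j]
            if w @ diff < 1.0:              # margin violated for this pair
                grad -= diff
        w -= lr * grad
    return w

# Toy image features (assumption): column 0 loosely tracks openness.
X = np.array([[0.0, 0.3],
              [1.0, 0.1],
              [2.0, 0.4],
              [3.0, 0.2]])
pairs = [(1, 0), (2, 1), (3, 2)]            # (more open, less open)

w = train_ranker(X, pairs)                  # "training"
scores = X @ w                              # "testing": score any image
print(scores)
```

At test time the learned `w` scores any image, so two images can be compared on the attribute even if neither appeared in a training pair.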
Image Search
• System has pre-trained relative attribute predictors
• Relevance of image = # constraints satisfied
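The relevance rule above (relevance = number of feedback constraints an image satisfies) can be sketched as follows. This is an illustration under assumptions: the attribute scores are made-up numbers standing in for the output of pre-trained predictors, and `relevance` and the feedback tuple layout are hypothetical names.

```python
import numpy as np

def relevance(scores, feedback):
    """Count, per image, how many feedback constraints it satisfies.

    scores:   dict mapping attribute name -> array of predicted strengths
    feedback: list of (attribute, reference_image_index, "more" | "less")
    """
    n_images = len(next(iter(scores.values())))
    rel = np.zeros(n_images, dtype=int)
    for attr, ref, direction in feedback:
        s = scores[attr]
        if direction == "more":
            rel += (s > s[ref]).astype(int)
        else:
            rel += (s < s[ref]).astype(int)
    return rel

# Hypothetical predicted attribute strengths for four shoe images.
scores = {
    "shiny":  np.array([0.2, 0.8, 0.5, 0.9]),
    "formal": np.array([0.7, 0.9, 0.1, 0.3]),
}
# "shinier than image 2" and "more formal than image 3"
feedback = [("shiny", 2, "more"), ("formal", 3, "more")]

rel = relevance(scores, feedback)
ranking = np.argsort(-rel, kind="stable")   # most relevant first
print(rel, ranking)
```

Image 1 satisfies both constraints and is ranked first; each further round of feedback adds constraints and narrows the top of the ranking.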
…
“shinier”, “more formal”
WhittleSearch
[Figure axes: shiny, formal]
[Kovashka, Parikh and Grauman, CVPR 2012] (Patent pending)
Whittle Search: Demo (Online)
[Prepared by Naman Agrawal, Demo at CVPR 2013]
(Patent pending)
Traditional Active Learning
Is this a forest? No, this is not a forest.
[Parkash and Parikh, ECCV 2012]
Classifier Feedback
“I think this is a forest. What do you think?”
“No, this is too open to be a forest.”
…
“Ah! These images must not be forests either then.”
[Images more open than query]
Pre-trained relative attributes
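The inference in the dialogue ("too open to be a forest", so every image that is even more open is not a forest either) can be sketched as below. This is a sketch under assumptions: the openness scores are invented values standing in for a pre-trained relative attribute predictor, and `propagate_negative` is a hypothetical helper name.

```python
import numpy as np

def propagate_negative(attr_scores, query, labels):
    """Return indices to treat as negatives for the rejected category.

    The supervisor said image `query` has too much of the attribute.
    The query itself, plus every still-unlabeled image with a higher
    attribute score, becomes a negative example.
    """
    more = np.flatnonzero(attr_scores > attr_scores[query])
    unlabeled_more = [int(i) for i in more if labels[i] is None]
    return [int(query)] + unlabeled_more

# Hypothetical predicted openness for five images.
openness = np.array([0.1, 0.6, 0.9, 0.7, 0.3])
# None marks an unlabeled image; image 3 already has a label.
labels = [None, None, None, "coast", None]

# Feedback on image 1: "No, this is too open to be a forest."
not_forest = propagate_negative(openness, query=1, labels=labels)
print(not_forest)
```

One statement from the supervisor therefore yields several negative labels, where a plain "no" would yield only one.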
Learn attributes on the fly
Classifier Feedback
“I think this is a forest. What do you think?”
“No, this is too open to be a forest.”
“Ah! These images must be less open than query.”
…
[images labeled as forest]
[Biswas and Parikh, CVPR 2013]
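When the attribute predictor is not available up front, the same feedback can instead be read as training data for it: images already labeled as forest must be less open than the query. A minimal sketch of turning one feedback statement into ordered pairs follows; `pairs_from_feedback` and the label list are hypothetical, and how the cited work actually consumes such constraints is not shown here.

```python
def pairs_from_feedback(query, labels, category):
    """Build (more, less) index pairs from one feedback statement.

    The query was judged to have too much of the attribute to belong
    to `category`, so it has more of the attribute than every image
    already labeled as `category`.
    """
    return [(query, i)
            for i, label in enumerate(labels)
            if label == category and i != query]

# None marks an unlabeled image.
labels = ["forest", None, "coast", "forest", None]

# Feedback on image 1: "No, this is too open to be a forest."
pairs = pairs_from_feedback(1, labels, "forest")
print(pairs)
```

Ordered pairs of this form are the usual input to a pairwise ranking learner, so each round of feedback can refine the attribute model as well as the category labels.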
Classifier Feedback
• Learning attributes on the fly
  – Start only with unlabeled images (+ a supervisor)
  – Categories and attributes learnt from scratch
• Confidence in instances
• Active learning for learning with attributes-based classifier feedback
Classifier Feedback
[Plot: accuracy (y-axis, 20 to 70) against number of iterations (x-axis, 0 to 300). Curves: no attributes-based feedback; Parkash et al., ECCV 2012; proposed.]
[Parkash and Parikh, ECCV 2012]
[Biswas and Parikh, CVPR 2013]
WhittleSearch
Query: “black shoes”
…
“shinier than these”
“more formal than these”
…
Image Search
[Parikh and Grauman, ICCV 2013]
Saying the Right Thing
“Smiling more than”
“Not smiling”
[Sadovnik, Gallagher, Parikh and Chen, ICCV 2013]
• Improved image search, description
Saliency of Attributes
• Improved image search, zero-shot learning, description
White, furry Scary, sharp teeth
[Turakhia and Parikh, ICCV 2013]
• Accessing user’s intentions for mental image search
• More usable computer vision systems even with their imperfections
• Trustworthy systems: key for effective human-machine teams
• Integrating AI with today’s machine learning tools
• Getting more from what the human says without added human effort
Enhanced human-machine communication via attributes for improved visual recognition
Thank you!