Grammars in computer vision
Presented by: Thomas Kollar
Slides courtesy of Song-Chun Zhu
[Diagram: feature types for recognition, arranged by object size. Inside the object (intrinsic features): pixels, parts, global appearance. Outside the object (contextual features): local context, global context.]
Kruppa & Schiele (03), Fink & Perona (03)
Carbonetto, de Freitas, Barnard (03), Kumar, Hebert (03)
He, Zemel, Carreira-Perpinan (04), Moore, Essa, Monson, Hayes (99)
Strat & Fischler (91), Torralba (03), Murphy, Torralba & Freeman (03)
Agarwal & Roth (02), Moghaddam, Pentland (97), Turk, Pentland (91), Vidal-Naquet, Ullman (03)
Heisele et al. (01), Agarwal & Roth (02), Krempp, Geman, Amit (02), Dorko, Schmid (03)
Fergus, Perona, Zisserman (03), Fei-Fei, Fergus, Perona (03), Schneiderman, Kanade (00), Lowe (99), etc.
Context in computer vision
Why grammars?
Guzman (SEE), 1968; Noton and Stark, 1971; Hansen & Riseman (VISIONS), 1978; Barrow & Tenenbaum, 1978; Brooks (ACRONYM), 1979; Marr, 1982; Ohta & Kanade, 1978; Yakimovsky & Feldman, 1973
[Ohta & Kanade 1978]
Why grammars?
Which papers?
F. Han and S.C. Zhu, Bottom-up/Top-down Image Parsing with Attribute Grammar, 2005.
Zijian Xu; A hierarchical compositional model for representation and sketching of high-resolution human images, PhD Thesis 2007.
Song-Chun Zhu and David Mumford; A stochastic grammar of images, 2007.
L. Lin, S. Peng, J. Porway, S.C. Zhu, and Y. Wang, An empirical study of object category recognition: sequential testing with generalized samples, 2007.
Datasets
Large-scale image labeling
Our Goal:
Three projects using and-or graphs
1. Modeling an environment with rectangles.
2. Creating sketches.
Commonalities
• Use context-sensitive grammars (called And-Or graphs in these papers)
• Provide top-down and bottom-up influence
• Most are generative all the way to the pixel level
• Configuration matters: e.g. they don't assume independence given the parent; these constraints can take the form of an MRF
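As a toy illustration of the And-Or idea (not the papers' actual data structures; all names below are invented), an And-node composes all of its children while an Or-node switches among alternatives, so the number of possible configurations multiplies at And-nodes and adds at Or-nodes:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    kind: str                      # "and", "or", or "terminal"
    children: List["Node"] = field(default_factory=list)

def count_configurations(node: Node) -> int:
    """Number of distinct parse trees (configurations) rooted here."""
    if node.kind == "terminal":
        return 1
    child_counts = [count_configurations(c) for c in node.children]
    if node.kind == "and":          # all children must appear
        prod = 1
        for c in child_counts:
            prod *= c
        return prod
    return sum(child_counts)        # "or": pick exactly one alternative

# Toy grammar: face -> eyes AND mouth; mouth is open OR closed.
face = Node("face", "and", [
    Node("eyes", "terminal"),
    Node("mouth", "or", [Node("open", "terminal"),
                         Node("closed", "terminal")]),
])
print(count_configurations(face))  # 2 configurations
```

The exponential growth of this count with depth is exactly why these grammars can cover large within-category variation with few rules.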
Challenges
Objects have large within-category variations
Scenes have variation
Challenges
Descriptions of people show large variation
Grammar definition
And-or graphs
Modeling with rectangles
Six production rules
Two examples
Three phases
1. Bottom-up detection: compute edge segments and a number of vanishing points. The segments are grouped into line sets by vanishing point, and rectangle hypotheses are found using RANSAC, generating a set of bottom-up rectangle proposals.
2. Initialize the terminal nodes greedily: pick the most promising hypotheses, i.e. those with the heaviest weight by increase in posterior probability.
3. Incorporate top-down influence: each step of the algorithm picks the most promising proposal among the 5 candidate rules by increase in posterior probability. When a new non-terminal node is accepted, (1) insert it and create a new proposal, (2) reweight the proposals, and (3) pass attributes between the node and its parent.
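The three phases might be caricatured as follows; every function, name, and score in this sketch is a made-up stand-in for illustration, not the authors' implementation:

```python
# Hedged, toy sketch of the three-phase pipeline: bottom-up proposals,
# then greedy acceptance while the (invented) posterior gain is positive.
def bottom_up_proposals(image):
    # Phase 1 stand-in: would run edge detection + RANSAC.
    return ["rect1", "rect2", "rect3"]

def posterior_gain(node, accepted):
    # Phases 2-3 stand-in: increase in posterior if `node` is accepted.
    base = {"rect1": 0.9, "rect2": 0.7, "rect3": -0.2}
    return base[node]

def parse(image):
    proposals = bottom_up_proposals(image)        # 1. bottom-up detection
    accepted = []
    while proposals:                              # 2./3. greedy acceptance
        best = max(proposals, key=lambda p: posterior_gain(p, accepted))
        if posterior_gain(best, accepted) <= 0:
            break                                 # no proposal raises the posterior
        accepted.append(best)
        proposals.remove(best)
    return accepted

print(parse(None))  # ['rect1', 'rect2']
```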
Probability Models
G* = arg max_G p(I | C(G), C_free) · p(G) · p(C_free)

• p(C_free) follows the primal sketch model.
• p(G) is the probability of the parse tree.
• p(I | C(G), C_free) is the reconstruction likelihood.
Probability Models
p(G) = p(l(A_o)) · ∏_{A ∈ V^N(G)} [ p(n(A) | l(A)) · p(X(A) | l(A), n(A)) · ∏_{B ∈ child(A)} p(X(B) | X(A)) ]

• p(l) is the probability of a rule, e.g. p(l(A) = "cube") = q.
• p(n | l) is the probability of the number of components given the type of rule, e.g. p(n(A) = 3 | l(A) = "cube") = 1.
• p(X | l, n) is the probability of the geometry of A, e.g. each square should look reasonable.
• p(X(B) | X(A)) ensures regularities between the geometries, e.g. that aligned rectangles have almost the same shape, or, for the line rule, that everything lines up.
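A minimal numeric illustration of how such a parse-tree prior factorizes; the probability tables are invented, and for simplicity this charges p(l) at every node rather than only at the root:

```python
import math

# Toy tables (made up for illustration).
rule_prob = {"cube": 0.2, "line": 0.3, "rect": 0.5}   # p(l(A))

def n_given_l(n, l):
    """p(n(A) | l(A)): e.g. a 'cube' always expands into 3 faces."""
    if l == "cube":
        return 1.0 if n == 3 else 0.0
    return 1.0 / 5                                    # uniform over 1..5

def log_prior(parse):
    """parse: list of (rule_label, n_children, geom_prob, align_probs)."""
    logp = 0.0
    for label, n, geom, aligns in parse:
        logp += math.log(rule_prob[label])            # p(l(A))
        logp += math.log(n_given_l(n, label))         # p(n(A) | l(A))
        logp += math.log(geom)                        # p(X(A) | l, n)
        for a in aligns:                              # p(X(B) | X(A))
            logp += math.log(a)
    return logp

# One 'cube' node with 3 children, each fairly well aligned.
toy = [("cube", 3, 0.8, [0.9, 0.9, 0.9])]
print(round(log_prior(toy), 3))
```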
Probability Models
p(I | C) = (1/Z) exp{ −(1/2σ²) Σ_{k=1}^{N} Σ_{(x,y) ∈ Λ_{sk,k}} ( I(x,y) − B_k(x,y; s_k, θ_k) )² } · ∏_{m=1}^{M} h( I_{nsk,m} )

• Primal sketch model: on the sketchable region, the image is reconstructed by image bases, I(x,y) = B(x − x_k, y − y_k; θ_k, …) over the domain of base k; the M non-sketchable regions are modeled by texture histograms h.
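As a toy illustration of the reconstruction term, the negative log-likelihood reduces (up to constants) to a squared error between the image and the sketch bases over the sketchable pixels; the pixel values and coverage below are invented:

```python
# Squared-error reconstruction term over the sketchable region,
# assuming a fixed noise level sigma (a simplification).
def neg_log_likelihood(image, bases, sigma=1.0):
    """image, bases: dicts {(x, y): intensity}; bases covers Lambda_sk."""
    sse = sum((image[p] - v) ** 2 for p, v in bases.items())
    return sse / (2 * sigma ** 2)

img = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.2}   # toy 3-pixel image
sketch = {(0, 0): 0.9, (0, 1): 0.5}             # base covers two pixels
print(round(neg_log_likelihood(img, sketch), 3))
```

Pixels outside the sketchable region ((1, 0) here) contribute nothing to this term; in the paper they are handled by the texture model instead.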
Inference: bottom-up detection of rectangles
• RANSAC is run to propose a number of rectangles using vanishing points
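For intuition, here is a generic RANSAC sketch that fits a 2D line to noisy points; the paper's version proposes vanishing points and rectangle hypotheses instead, so this is only an analogy:

```python
import random

def ransac_line(points, iters=200, tol=0.1, seed=0):
    """Return the largest inlier set found for a line through 2 samples."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b, c = y2 - y1, x1 - x2, x2 * y1 - x1 * y2   # line ax + by + c = 0
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm < tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# 10 points on y = 2x + 1 plus two outliers.
pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 9), (7, 0)]
print(len(ransac_line(pts)))  # the 10 collinear points survive
```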
Inference: initialize terminal nodes
• Input: candidate set of rectangles from the previous phase
• Output: a set of non-terminal nodes representing rectangles
• While not done:
  • re-compute the weights
  • greedily select the rectangle with the highest weight
  • create a new non-terminal node in the grammar
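A hedged sketch of this greedy loop, with a made-up weight (a score discounted by overlap with already-accepted rectangles) standing in for the increase in posterior probability:

```python
def greedy_select(candidates, overlap, min_weight=0.0):
    """candidates: {name: score}; overlap(a, b) -> penalty in [0, 1]."""
    accepted = []
    remaining = dict(candidates)
    while remaining:
        # Re-compute weights: score discounted by overlap with accepted.
        weights = {c: s - sum(overlap(c, a) for a in accepted)
                   for c, s in remaining.items()}
        best = max(weights, key=weights.get)
        if weights[best] <= min_weight:
            break                # no candidate still raises the posterior
        accepted.append(best)
        del remaining[best]
    return accepted

# Toy example: r1 and r2 overlap heavily, r3 is independent.
cands = {"r1": 0.9, "r2": 0.8, "r3": 0.5}
olap = lambda a, b: 0.9 if {a, b} == {"r1", "r2"} else 0.0
print(greedy_select(cands, olap))  # ['r1', 'r3']: r2 is suppressed
```

Re-computing the weights after every acceptance is what makes the procedure context-sensitive: accepting r1 changes how promising r2 looks.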
Inference: incorporate top-down influence
• Input: non-terminal rectangles from the previous step
• Output: a parse graph
• While not done:
  • re-compute the weights
  • greedily select the highest-weight candidate rule
  • add the rule to the parse graph along with any top-down predictions
• Weights are computed similarly to before.
Example of top-down/bottom-up inference
Results
ROC curve
Generating sketches
Additional semantics
Challenges
• Geometric deformations: clothes are very flexible
• Photometric variabilities: a large variety of colors, shading and texture
• Topological configurations: a combinatorial number of clothing designs
Decomposing a sketch
And-Or graph
“In a computing and recognition phase, we first activate some sub-templates in a bottom-up step. For example, we can detect the face and skin color to locate the coarse position of some components, which help to predict the positions of other components by context.”
Sketch sub-parts
Example grammar
Sub-templates
Probability model
Overview of the algorithm
Sketch results
Conclusions
A grammar-based model was presented for generating sketches.
Markov random fields are used at the lowest level.
Top-down/bottom-up inference was performed.