Upload
loraine-barnett
View
216
Download
1
Embed Size (px)
Citation preview
Geometric Hashing: A General and Efficient Model-Based
Recognition Scheme
Geometric Hashing: A General and Efficient Model-Based
Recognition Scheme
Yehezkel Lamdan and Haim J. WolfsonYehezkel Lamdan and Haim J. Wolfson
ICCV 1988ICCV 1988
Presented by Budi PurnomoPresented by Budi Purnomo
Nov 23rd 2004Nov 23rd 2004
Yehezkel Lamdan and Haim J. WolfsonYehezkel Lamdan and Haim J. Wolfson
ICCV 1988ICCV 1988
Presented by Budi PurnomoPresented by Budi Purnomo
Nov 23rd 2004Nov 23rd 2004
MotivationMotivation
Object recognition (ultimate goal of most Object recognition (ultimate goal of most computer vision research).computer vision research).
Inputs: Inputs:
• A database of objects.A database of objects.
• A scene or image to recognize.A scene or image to recognize.
Problems:Problems:
1.1. Objects in the scene undergo some transformations.Objects in the scene undergo some transformations.
2.2. Objects may partially occlude each other.Objects may partially occlude each other.
3.3. Computationally expensive to retrieve each object Computationally expensive to retrieve each object from database and compare it against the observed from database and compare it against the observed scene.scene.
Object recognition (ultimate goal of most Object recognition (ultimate goal of most computer vision research).computer vision research).
Inputs: Inputs:
• A database of objects.A database of objects.
• A scene or image to recognize.A scene or image to recognize.
Problems:Problems:
1.1. Objects in the scene undergo some transformations.Objects in the scene undergo some transformations.
2.2. Objects may partially occlude each other.Objects may partially occlude each other.
3.3. Computationally expensive to retrieve each object Computationally expensive to retrieve each object from database and compare it against the observed from database and compare it against the observed scene.scene.
Problem StatementProblem Statement
Recognition under Similarity Recognition under Similarity Transformation:Transformation:
““Is there a transformed (rotated, Is there a transformed (rotated, translated and scaled) subset of some translated and scaled) subset of some model point-set which matches a subset model point-set which matches a subset of the scene point-set?”of the scene point-set?”
Recognition under Similarity Recognition under Similarity Transformation:Transformation:
““Is there a transformed (rotated, Is there a transformed (rotated, translated and scaled) subset of some translated and scaled) subset of some model point-set which matches a subset model point-set which matches a subset of the scene point-set?”of the scene point-set?”
OutlineOutline
1.1. Key ideaKey idea
2.2. General FrameworkGeneral Framework
3.3. Recognition under Various TransformationsRecognition under Various Transformations
4.4. Recognition of 3D Objects from 2D ImagesRecognition of 3D Objects from 2D Images
5.5. Recognition of Polyhedra ObjectsRecognition of Polyhedra Objects
6.6. ComparisonsComparisons
– AlignmentAlignment
– Generalized Hough TransformGeneralized Hough Transform
1.1. Key ideaKey idea
2.2. General FrameworkGeneral Framework
3.3. Recognition under Various TransformationsRecognition under Various Transformations
4.4. Recognition of 3D Objects from 2D ImagesRecognition of 3D Objects from 2D Images
5.5. Recognition of Polyhedra ObjectsRecognition of Polyhedra Objects
6.6. ComparisonsComparisons
– AlignmentAlignment
– Generalized Hough TransformGeneralized Hough Transform
Key Idea (1/8)Key Idea (1/8)
Recognizing a pentagon in an image
Key Idea (2/8)Key Idea (2/8)
Blue: 1
Key Idea (3/8)Key Idea (3/8)
Red: 1
Key Idea (4/8)Key Idea (4/8)
Green: 5
Key Idea (5/8)Key Idea (5/8)
Purple: 1
Key Idea (6/8)Key Idea (6/8)
Brown: 1
Key Idea (7/8)Key Idea (7/8)
Blue: 1Red: 1Green: 5Purple: 1Brown: 1
Object is a pentagon!
Key Idea (8/8)Key Idea (8/8)
Blue: 1Red: 2Green: 2Purple: 1Brown: 1
Object is NOT a pentagon!
Brute Force RecognitionBrute Force Recognition
Let Let mm: points on the model,: points on the model,
nn: points on the scene.: points on the scene.
Recognize a single model: O((Recognize a single model: O((mm x x nn))2 2 x x tt))
where where tt is the complexity to verify the is the complexity to verify the
model against the scene.model against the scene.
If If mm==nn, and , and tt==nn, then we have O(, then we have O(nn55) to ) to recognize a single model.recognize a single model.
Let Let mm: points on the model,: points on the model,
nn: points on the scene.: points on the scene.
Recognize a single model: O((Recognize a single model: O((mm x x nn))2 2 x x tt))
where where tt is the complexity to verify the is the complexity to verify the
model against the scene.model against the scene.
If If mm==nn, and , and tt==nn, then we have O(, then we have O(nn55) to ) to recognize a single model.recognize a single model.
General Framework (1/2)General Framework (1/2)
Two stages algorithm:Two stages algorithm:
1.1. Preprocessing (for each model):Preprocessing (for each model):
For each feature points pair:For each feature points pair:• Define a local coordinate basis on this pair. Define a local coordinate basis on this pair.
• Compute and quantize all other feature points Compute and quantize all other feature points in this coordinate basis. in this coordinate basis.
• Record (model, basis) in a hash table.Record (model, basis) in a hash table.
Two stages algorithm:Two stages algorithm:
1.1. Preprocessing (for each model):Preprocessing (for each model):
For each feature points pair:For each feature points pair:• Define a local coordinate basis on this pair. Define a local coordinate basis on this pair.
• Compute and quantize all other feature points Compute and quantize all other feature points in this coordinate basis. in this coordinate basis.
• Record (model, basis) in a hash table.Record (model, basis) in a hash table.
General Framework (2/2)General Framework (2/2)
2.2. Online recognition (given a scene, extract feature Online recognition (given a scene, extract feature points):points):
a)a) Pick arbitrary ordered pair:Pick arbitrary ordered pair:
• Compute the other points using this pair as a basis.Compute the other points using this pair as a basis.
• For all the transformed points, vote all records (model, basis) For all the transformed points, vote all records (model, basis) appear in the corresponding entry in the hash table, and appear in the corresponding entry in the hash table, and histogram them.histogram them.
b)b) Matching candidates: (model, basis) pairs with large Matching candidates: (model, basis) pairs with large number of votes.number of votes.
c)c) Recover the transformation that results in the best least-Recover the transformation that results in the best least-squares match between all corresponding feature squares match between all corresponding feature points.points.
d)d) Transform the features, and verify against the input Transform the features, and verify against the input image features (if fails, repeat to 1).image features (if fails, repeat to 1).
2.2. Online recognition (given a scene, extract feature Online recognition (given a scene, extract feature points):points):
a)a) Pick arbitrary ordered pair:Pick arbitrary ordered pair:
• Compute the other points using this pair as a basis.Compute the other points using this pair as a basis.
• For all the transformed points, vote all records (model, basis) For all the transformed points, vote all records (model, basis) appear in the corresponding entry in the hash table, and appear in the corresponding entry in the hash table, and histogram them.histogram them.
b)b) Matching candidates: (model, basis) pairs with large Matching candidates: (model, basis) pairs with large number of votes.number of votes.
c)c) Recover the transformation that results in the best least-Recover the transformation that results in the best least-squares match between all corresponding feature squares match between all corresponding feature points.points.
d)d) Transform the features, and verify against the input Transform the features, and verify against the input image features (if fails, repeat to 1).image features (if fails, repeat to 1).
Two Stages Algorithm (1/2) Two Stages Algorithm (1/2)
[1]
Two Stages Algorithm (2/2)Two Stages Algorithm (2/2)
[1]
ComplexityComplexity
Assume Assume mm==nn, and , and kk is the number of point to is the number of point to define the basis.define the basis.
• Preprocessing: O(Preprocessing: O(nnk+1k+1) for a single model.) for a single model.
• Recognition: O(Recognition: O(nnk+1k+1) against all objects in ) against all objects in the database.the database.
Assume Assume mm==nn, and , and kk is the number of point to is the number of point to define the basis.define the basis.
• Preprocessing: O(Preprocessing: O(nnk+1k+1) for a single model.) for a single model.
• Recognition: O(Recognition: O(nnk+1k+1) against all objects in ) against all objects in the database.the database.
Under Various Transformations (1/2)Under Various Transformations (1/2)
1.1. Translation in 2D and 3D.Translation in 2D and 3D.• 1-point basis.1-point basis.
• O(nO(n22).).
2.2. Similarity transformation in 2D.Similarity transformation in 2D.• 2-point basis.2-point basis.
• O(nO(n33).).
3.3. Similarity transformation in 3D.Similarity transformation in 3D.• 3-point basis.3-point basis.
• O(nO(n44).).
1.1. Translation in 2D and 3D.Translation in 2D and 3D.• 1-point basis.1-point basis.
• O(nO(n22).).
2.2. Similarity transformation in 2D.Similarity transformation in 2D.• 2-point basis.2-point basis.
• O(nO(n33).).
3.3. Similarity transformation in 3D.Similarity transformation in 3D.• 3-point basis.3-point basis.
• O(nO(n44).).
Under Various Transformations (2/2)Under Various Transformations (2/2)
4.4. Affine transformationAffine transformation• 3-point basis.3-point basis.
• O(nO(n44))
5.5. Projective transformationProjective transformation• 4-point basis.4-point basis.
• O(nO(n55))
4.4. Affine transformationAffine transformation• 3-point basis.3-point basis.
• O(nO(n44))
5.5. Projective transformationProjective transformation• 4-point basis.4-point basis.
• O(nO(n55))
Recognition of 3D Objects from 2D Images (1/5)Recognition of 3D Objects from 2D Images (1/5)
1.1. Correspondence of planesCorrespondence of planes• Preprocessing: consider planar sections of the Preprocessing: consider planar sections of the
3D object which contain three of more interest 3D object which contain three of more interest points. points.
• Hash (model, plane, basis) triplet.Hash (model, plane, basis) triplet.
• Use either projective transformation or affine Use either projective transformation or affine transformation.transformation.
• Once the planes correspondence have been Once the planes correspondence have been established, the position of the entire 3D body established, the position of the entire 3D body is solved.is solved.
1.1. Correspondence of planesCorrespondence of planes• Preprocessing: consider planar sections of the Preprocessing: consider planar sections of the
3D object which contain three of more interest 3D object which contain three of more interest points. points.
• Hash (model, plane, basis) triplet.Hash (model, plane, basis) triplet.
• Use either projective transformation or affine Use either projective transformation or affine transformation.transformation.
• Once the planes correspondence have been Once the planes correspondence have been established, the position of the entire 3D body established, the position of the entire 3D body is solved.is solved.
Recognition of 3D Objects from 2D Images (2/5)Recognition of 3D Objects from 2D Images (2/5)
2.2. Singular affine transformationSingular affine transformation
A x + b = U A x + b = U wherewhere
AA : 2x3 affine matrix: 2x3 affine matrix
xx : 3x1 3D vector: 3x1 3D vector
bb : 2x1 2D translation vector: 2x1 2D translation vector
UU : 2x1 image : 2x1 image
2.2. Singular affine transformationSingular affine transformation
A x + b = U A x + b = U wherewhere
AA : 2x3 affine matrix: 2x3 affine matrix
xx : 3x1 3D vector: 3x1 3D vector
bb : 2x1 2D translation vector: 2x1 2D translation vector
UU : 2x1 image : 2x1 image
Recognition of 3D Objects from 2D Images (3/5)Recognition of 3D Objects from 2D Images (3/5)
A set of four non-coplanar points in A set of four non-coplanar points in 3D defines a 3D affine basis: 3D defines a 3D affine basis: – One point as originOne point as origin
– The vectors between origin and the other three The vectors between origin and the other three points as the unit (oblique) coordinate system.points as the unit (oblique) coordinate system.
Preprocess the model points in this Preprocess the model points in this four-basis point.four-basis point.
A set of four non-coplanar points in A set of four non-coplanar points in 3D defines a 3D affine basis: 3D defines a 3D affine basis: – One point as originOne point as origin
– The vectors between origin and the other three The vectors between origin and the other three points as the unit (oblique) coordinate system.points as the unit (oblique) coordinate system.
Preprocess the model points in this Preprocess the model points in this four-basis point.four-basis point.
Recognition of 3D Objects from 2D Images (4/5)Recognition of 3D Objects from 2D Images (4/5)
Recognition: Recognition:
• Pick four points: Pick four points: pp00, , pp11, , pp22, and , and pp33 --> three vectors: --> three vectors:
vv11, , vv22, and , and vv33 in the 2D image. in the 2D image.
• ExistsExists: : v v11 + + v v2 2 + + v v33 = 0, = 0, wherewhere ( (, , , , ) )
≠ 0≠ 0
• A point A point pp in the image, with in the image, with v v be the vector from be the vector from pp0 0
to to pp..
• Vote for all t ≠ 0 (Vote for all t ≠ 0 (a line with parametera line with parameter t): t):
vv = ( = (++tt) ) vv11 + ( + ( + + tt) ) vv22 + ( + (tt) ) vv3, 3, where where
((, , ) is the coordinate of ) is the coordinate of vv in the in the vv11, , vv22
basis.basis.
Recognition: Recognition:
• Pick four points: Pick four points: pp00, , pp11, , pp22, and , and pp33 --> three vectors: --> three vectors:
vv11, , vv22, and , and vv33 in the 2D image. in the 2D image.
• ExistsExists: : v v11 + + v v2 2 + + v v33 = 0, = 0, wherewhere ( (, , , , ) )
≠ 0≠ 0
• A point A point pp in the image, with in the image, with v v be the vector from be the vector from pp0 0
to to pp..
• Vote for all t ≠ 0 (Vote for all t ≠ 0 (a line with parametera line with parameter t): t):
vv = ( = (++tt) ) vv11 + ( + ( + + tt) ) vv22 + ( + (tt) ) vv3, 3, where where
((, , ) is the coordinate of ) is the coordinate of vv in the in the vv11, , vv22
basis.basis.
Recognition of 3D Objects from 2D Images (5/5)Recognition of 3D Objects from 2D Images (5/5)
3.3. Establishing a viewing angle with Establishing a viewing angle with similarity transformation.similarity transformation.• Tesselate a viewing sphere (uniform in Tesselate a viewing sphere (uniform in
spherical coordinates).spherical coordinates).
• Record (model, basis, angle) in the hash table.Record (model, basis, angle) in the hash table.
• 2-point basis: O(n2-point basis: O(n33) (the same order as without ) (the same order as without viewing angle because the viewing angle viewing angle because the viewing angle introduces only a constant factor -- introduces only a constant factor -- independent of the scene).independent of the scene).
3.3. Establishing a viewing angle with Establishing a viewing angle with similarity transformation.similarity transformation.• Tesselate a viewing sphere (uniform in Tesselate a viewing sphere (uniform in
spherical coordinates).spherical coordinates).
• Record (model, basis, angle) in the hash table.Record (model, basis, angle) in the hash table.
• 2-point basis: O(n2-point basis: O(n33) (the same order as without ) (the same order as without viewing angle because the viewing angle viewing angle because the viewing angle introduces only a constant factor -- introduces only a constant factor -- independent of the scene).independent of the scene).
Recognition of Polyhedral ObjectsRecognition of Polyhedral Objects
Polygonal objectsPolygonal objects• Choose an edge as the basis, record (model, Choose an edge as the basis, record (model,
basis edge) in the hash table. basis edge) in the hash table.
• Preprocessing and recognition is O(nPreprocessing and recognition is O(n22).).
Polygonal objectsPolygonal objects• Choose an edge as the basis, record (model, Choose an edge as the basis, record (model,
basis edge) in the hash table. basis edge) in the hash table.
• Preprocessing and recognition is O(nPreprocessing and recognition is O(n22).).
[1]
Comparisons (1/2)Comparisons (1/2)
1.1. With alignment method.With alignment method.• Use exhaustive enumeration of all possible Use exhaustive enumeration of all possible
pairs in the objects and the images.pairs in the objects and the images.
• Geometric hashing can process all models Geometric hashing can process all models simultaneously, while the alignment method simultaneously, while the alignment method processes models sequentially.processes models sequentially.
• The alignment method does not require any The alignment method does not require any additional memory, while geometric hashing additional memory, while geometric hashing requires a large memory to store hash table.requires a large memory to store hash table.
• Geometric hashing more efficient if:Geometric hashing more efficient if:
• The scene contains enough features (6-10) for The scene contains enough features (6-10) for efficient recognition by voting.efficient recognition by voting.
• There are many models.There are many models.
1.1. With alignment method.With alignment method.• Use exhaustive enumeration of all possible Use exhaustive enumeration of all possible
pairs in the objects and the images.pairs in the objects and the images.
• Geometric hashing can process all models Geometric hashing can process all models simultaneously, while the alignment method simultaneously, while the alignment method processes models sequentially.processes models sequentially.
• The alignment method does not require any The alignment method does not require any additional memory, while geometric hashing additional memory, while geometric hashing requires a large memory to store hash table.requires a large memory to store hash table.
• Geometric hashing more efficient if:Geometric hashing more efficient if:
• The scene contains enough features (6-10) for The scene contains enough features (6-10) for efficient recognition by voting.efficient recognition by voting.
• There are many models.There are many models.
Comparisons (2/2)Comparisons (2/2)
2.2. With Generalized Hough Transform With Generalized Hough Transform (GHT).(GHT).• GHT quantizes all possible (continuous) GHT quantizes all possible (continuous)
transformations between the model and transformations between the model and the scene into a set of bins, whilethe scene into a set of bins, while
• Geometric Hashing quantizes just the Geometric Hashing quantizes just the (discrete) transformation represented by (discrete) transformation represented by the basis.the basis.
2.2. With Generalized Hough Transform With Generalized Hough Transform (GHT).(GHT).• GHT quantizes all possible (continuous) GHT quantizes all possible (continuous)
transformations between the model and transformations between the model and the scene into a set of bins, whilethe scene into a set of bins, while
• Geometric Hashing quantizes just the Geometric Hashing quantizes just the (discrete) transformation represented by (discrete) transformation represented by the basis.the basis.
SummarySummary
• Ability to recognize objects that have Ability to recognize objects that have undergo an arbitrary transformation.undergo an arbitrary transformation.
• Can perform partial matching.Can perform partial matching.
• Efficient and can be parallelized easily.Efficient and can be parallelized easily.
• Use transformation-invariant access key to Use transformation-invariant access key to the hash table.the hash table.
• Two phases (preprocessing and Two phases (preprocessing and recognition).recognition).
• Require a large memory to store hash table.Require a large memory to store hash table.
• Ability to recognize objects that have Ability to recognize objects that have undergo an arbitrary transformation.undergo an arbitrary transformation.
• Can perform partial matching.Can perform partial matching.
• Efficient and can be parallelized easily.Efficient and can be parallelized easily.
• Use transformation-invariant access key to Use transformation-invariant access key to the hash table.the hash table.
• Two phases (preprocessing and Two phases (preprocessing and recognition).recognition).
• Require a large memory to store hash table.Require a large memory to store hash table.
ReferencesReferences
[1] Yehezkel Lamdan and Haim J. [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition and Efficient Model-Based Recognition Scheme, Scheme, ICCVICCV, 1988., 1988.
[1] Yehezkel Lamdan and Haim J. [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition and Efficient Model-Based Recognition Scheme, Scheme, ICCVICCV, 1988., 1988.