Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd

Geometric Hashing: A General and Efficient Model-Based

Recognition Scheme

Geometric Hashing: A General and Efficient Model-Based

Recognition Scheme

Yehezkel Lamdan and Haim J. WolfsonYehezkel Lamdan and Haim J. Wolfson

ICCV 1988ICCV 1988

Presented by Budi PurnomoPresented by Budi Purnomo

Nov 23rd 2004Nov 23rd 2004

Yehezkel Lamdan and Haim J. WolfsonYehezkel Lamdan and Haim J. Wolfson

ICCV 1988ICCV 1988

Presented by Budi PurnomoPresented by Budi Purnomo

Nov 23rd 2004Nov 23rd 2004

MotivationMotivation

Object recognition (ultimate goal of most Object recognition (ultimate goal of most computer vision research).computer vision research).

Inputs: Inputs:

• A database of objects.A database of objects.

• A scene or image to recognize.A scene or image to recognize.

Problems:Problems:

1.1. Objects in the scene undergo some transformations.Objects in the scene undergo some transformations.

2.2. Objects may partially occlude each other.Objects may partially occlude each other.

3.3. Computationally expensive to retrieve each object Computationally expensive to retrieve each object from database and compare it against the observed from database and compare it against the observed scene.scene.

Object recognition (ultimate goal of most Object recognition (ultimate goal of most computer vision research).computer vision research).

Inputs: Inputs:

• A database of objects.A database of objects.

• A scene or image to recognize.A scene or image to recognize.

Problems:Problems:

1.1. Objects in the scene undergo some transformations.Objects in the scene undergo some transformations.

2.2. Objects may partially occlude each other.Objects may partially occlude each other.

3.3. Computationally expensive to retrieve each object Computationally expensive to retrieve each object from database and compare it against the observed from database and compare it against the observed scene.scene.

Problem StatementProblem Statement

Recognition under Similarity Recognition under Similarity Transformation:Transformation:

““Is there a transformed (rotated, Is there a transformed (rotated, translated and scaled) subset of some translated and scaled) subset of some model point-set which matches a subset model point-set which matches a subset of the scene point-set?”of the scene point-set?”

Recognition under Similarity Recognition under Similarity Transformation:Transformation:

““Is there a transformed (rotated, Is there a transformed (rotated, translated and scaled) subset of some translated and scaled) subset of some model point-set which matches a subset model point-set which matches a subset of the scene point-set?”of the scene point-set?”

OutlineOutline

1.1. Key ideaKey idea

2.2. General FrameworkGeneral Framework

3.3. Recognition under Various TransformationsRecognition under Various Transformations

4.4. Recognition of 3D Objects from 2D ImagesRecognition of 3D Objects from 2D Images

5.5. Recognition of Polyhedra ObjectsRecognition of Polyhedra Objects

6.6. ComparisonsComparisons

– AlignmentAlignment

– Generalized Hough TransformGeneralized Hough Transform

1.1. Key ideaKey idea

2.2. General FrameworkGeneral Framework

3.3. Recognition under Various TransformationsRecognition under Various Transformations

4.4. Recognition of 3D Objects from 2D ImagesRecognition of 3D Objects from 2D Images

5.5. Recognition of Polyhedra ObjectsRecognition of Polyhedra Objects

6.6. ComparisonsComparisons

– AlignmentAlignment

– Generalized Hough TransformGeneralized Hough Transform

Key Idea (1/8)Key Idea (1/8)

Recognizing a pentagon in an image


Blue: 1


Red: 1


Green: 5


Purple: 1


Brown: 1


Blue: 1Red: 1Green: 5Purple: 1Brown: 1

Object is a pentagon!


Blue: 1Red: 2Green: 2Purple: 1Brown: 1

Object is NOT a pentagon!

Brute Force RecognitionBrute Force Recognition

Let Let mm: points on the model,: points on the model,

nn: points on the scene.: points on the scene.

Recognize a single model: O((Recognize a single model: O((mm x x nn))2 2 x x tt))

where where tt is the complexity to verify the is the complexity to verify the

model against the scene.model against the scene.

If If mm==nn, and , and tt==nn, then we have O(, then we have O(nn55) to ) to recognize a single model.recognize a single model.

Let Let mm: points on the model,: points on the model,

nn: points on the scene.: points on the scene.

Recognize a single model: O((Recognize a single model: O((mm x x nn))2 2 x x tt))

where where tt is the complexity to verify the is the complexity to verify the

model against the scene.model against the scene.

If If mm==nn, and , and tt==nn, then we have O(, then we have O(nn55) to ) to recognize a single model.recognize a single model.

General Framework (1/2)General Framework (1/2)

Two stages algorithm:Two stages algorithm:

1.1. Preprocessing (for each model):Preprocessing (for each model):

For each feature points pair:For each feature points pair:• Define a local coordinate basis on this pair. Define a local coordinate basis on this pair.

• Compute and quantize all other feature points Compute and quantize all other feature points in this coordinate basis. in this coordinate basis.

• Record (model, basis) in a hash table.Record (model, basis) in a hash table.

Two stages algorithm:Two stages algorithm:

1.1. Preprocessing (for each model):Preprocessing (for each model):

For each feature points pair:For each feature points pair:• Define a local coordinate basis on this pair. Define a local coordinate basis on this pair.

• Compute and quantize all other feature points Compute and quantize all other feature points in this coordinate basis. in this coordinate basis.

• Record (model, basis) in a hash table.Record (model, basis) in a hash table.

General Framework (2/2)General Framework (2/2)

2.2. Online recognition (given a scene, extract feature Online recognition (given a scene, extract feature points):points):

a)a) Pick arbitrary ordered pair:Pick arbitrary ordered pair:

• Compute the other points using this pair as a basis.Compute the other points using this pair as a basis.

• For all the transformed points, vote all records (model, basis) For all the transformed points, vote all records (model, basis) appear in the corresponding entry in the hash table, and appear in the corresponding entry in the hash table, and histogram them.histogram them.

b)b) Matching candidates: (model, basis) pairs with large Matching candidates: (model, basis) pairs with large number of votes.number of votes.

c)c) Recover the transformation that results in the best least-Recover the transformation that results in the best least-squares match between all corresponding feature squares match between all corresponding feature points.points.

d)d) Transform the features, and verify against the input Transform the features, and verify against the input image features (if fails, repeat to 1).image features (if fails, repeat to 1).

2.2. Online recognition (given a scene, extract feature Online recognition (given a scene, extract feature points):points):

a)a) Pick arbitrary ordered pair:Pick arbitrary ordered pair:

• Compute the other points using this pair as a basis.Compute the other points using this pair as a basis.

• For all the transformed points, vote all records (model, basis) For all the transformed points, vote all records (model, basis) appear in the corresponding entry in the hash table, and appear in the corresponding entry in the hash table, and histogram them.histogram them.

b)b) Matching candidates: (model, basis) pairs with large Matching candidates: (model, basis) pairs with large number of votes.number of votes.

c)c) Recover the transformation that results in the best least-Recover the transformation that results in the best least-squares match between all corresponding feature squares match between all corresponding feature points.points.

d)d) Transform the features, and verify against the input Transform the features, and verify against the input image features (if fails, repeat to 1).image features (if fails, repeat to 1).

Two Stages Algorithm (1/2) Two Stages Algorithm (1/2)

[1]

Two Stages Algorithm (2/2)Two Stages Algorithm (2/2)

[1]

ComplexityComplexity

Assume Assume mm==nn, and , and kk is the number of point to is the number of point to define the basis.define the basis.

• Preprocessing: O(Preprocessing: O(nnk+1k+1) for a single model.) for a single model.

• Recognition: O(Recognition: O(nnk+1k+1) against all objects in ) against all objects in the database.the database.

Assume Assume mm==nn, and , and kk is the number of point to is the number of point to define the basis.define the basis.

• Preprocessing: O(Preprocessing: O(nnk+1k+1) for a single model.) for a single model.

• Recognition: O(Recognition: O(nnk+1k+1) against all objects in ) against all objects in the database.the database.

Under Various Transformations (1/2)Under Various Transformations (1/2)

1.1. Translation in 2D and 3D.Translation in 2D and 3D.• 1-point basis.1-point basis.

• O(nO(n22).).

2.2. Similarity transformation in 2D.Similarity transformation in 2D.• 2-point basis.2-point basis.

• O(nO(n33).).


• O(nO(n44).).

1.1. Translation in 2D and 3D.Translation in 2D and 3D.• 1-point basis.1-point basis.

• O(nO(n22).).


• O(nO(n33).).


• O(nO(n44).).

Under Various Transformations (2/2)Under Various Transformations (2/2)

4.4. Affine transformationAffine transformation• 3-point basis.3-point basis.

• O(nO(n44))

5.5. Projective transformationProjective transformation• 4-point basis.4-point basis.

• O(nO(n55))

4.4. Affine transformationAffine transformation• 3-point basis.3-point basis.

• O(nO(n44))

5.5. Projective transformationProjective transformation• 4-point basis.4-point basis.

• O(nO(n55))

Recognition of 3D Objects from 2D Images (1/5)Recognition of 3D Objects from 2D Images (1/5)

1.1. Correspondence of planesCorrespondence of planes• Preprocessing: consider planar sections of the Preprocessing: consider planar sections of the

3D object which contain three of more interest 3D object which contain three of more interest points. points.

• Hash (model, plane, basis) triplet.Hash (model, plane, basis) triplet.

• Use either projective transformation or affine Use either projective transformation or affine transformation.transformation.

• Once the planes correspondence have been Once the planes correspondence have been established, the position of the entire 3D body established, the position of the entire 3D body is solved.is solved.

1.1. Correspondence of planesCorrespondence of planes• Preprocessing: consider planar sections of the Preprocessing: consider planar sections of the

3D object which contain three of more interest 3D object which contain three of more interest points. points.

• Hash (model, plane, basis) triplet.Hash (model, plane, basis) triplet.

• Use either projective transformation or affine Use either projective transformation or affine transformation.transformation.

• Once the planes correspondence have been Once the planes correspondence have been established, the position of the entire 3D body established, the position of the entire 3D body is solved.is solved.


2.2. Singular affine transformationSingular affine transformation

A x + b = U A x + b = U wherewhere

AA : 2x3 affine matrix: 2x3 affine matrix

xx : 3x1 3D vector: 3x1 3D vector

bb : 2x1 2D translation vector: 2x1 2D translation vector

UU : 2x1 image : 2x1 image

2.2. Singular affine transformationSingular affine transformation

A x + b = U A x + b = U wherewhere

AA : 2x3 affine matrix: 2x3 affine matrix

xx : 3x1 3D vector: 3x1 3D vector

bb : 2x1 2D translation vector: 2x1 2D translation vector

UU : 2x1 image : 2x1 image


A set of four non-coplanar points in A set of four non-coplanar points in 3D defines a 3D affine basis: 3D defines a 3D affine basis: – One point as originOne point as origin

– The vectors between origin and the other three The vectors between origin and the other three points as the unit (oblique) coordinate system.points as the unit (oblique) coordinate system.

Preprocess the model points in this Preprocess the model points in this four-basis point.four-basis point.

A set of four non-coplanar points in A set of four non-coplanar points in 3D defines a 3D affine basis: 3D defines a 3D affine basis: – One point as originOne point as origin

– The vectors between origin and the other three The vectors between origin and the other three points as the unit (oblique) coordinate system.points as the unit (oblique) coordinate system.

Preprocess the model points in this Preprocess the model points in this four-basis point.four-basis point.


Recognition: Recognition:

• Pick four points: Pick four points: pp00, , pp11, , pp22, and , and pp33 --> three vectors: --> three vectors:

vv11, , vv22, and , and vv33 in the 2D image. in the 2D image.

• ExistsExists: : v v11 + + v v2 2 + + v v33 = 0, = 0, wherewhere ( (, , , , ) )

≠ 0≠ 0

• A point A point pp in the image, with in the image, with v v be the vector from be the vector from pp0 0

to to pp..

• Vote for all t ≠ 0 (Vote for all t ≠ 0 (a line with parametera line with parameter t): t):

vv = ( = (++tt) ) vv11 + ( + ( + + tt) ) vv22 + ( + (tt) ) vv3, 3, where where

((, , ) is the coordinate of ) is the coordinate of vv in the in the vv11, , vv22

basis.basis.

Recognition: Recognition:

• Pick four points: Pick four points: pp00, , pp11, , pp22, and , and pp33 --> three vectors: --> three vectors:

vv11, , vv22, and , and vv33 in the 2D image. in the 2D image.

• ExistsExists: : v v11 + + v v2 2 + + v v33 = 0, = 0, wherewhere ( (, , , , ) )

≠ 0≠ 0

• A point A point pp in the image, with in the image, with v v be the vector from be the vector from pp0 0

to to pp..

• Vote for all t ≠ 0 (Vote for all t ≠ 0 (a line with parametera line with parameter t): t):

vv = ( = (++tt) ) vv11 + ( + ( + + tt) ) vv22 + ( + (tt) ) vv3, 3, where where

((, , ) is the coordinate of ) is the coordinate of vv in the in the vv11, , vv22

basis.basis.


3.3. Establishing a viewing angle with Establishing a viewing angle with similarity transformation.similarity transformation.• Tesselate a viewing sphere (uniform in Tesselate a viewing sphere (uniform in

spherical coordinates).spherical coordinates).

• Record (model, basis, angle) in the hash table.Record (model, basis, angle) in the hash table.

• 2-point basis: O(n2-point basis: O(n33) (the same order as without ) (the same order as without viewing angle because the viewing angle viewing angle because the viewing angle introduces only a constant factor -- introduces only a constant factor -- independent of the scene).independent of the scene).

3.3. Establishing a viewing angle with Establishing a viewing angle with similarity transformation.similarity transformation.• Tesselate a viewing sphere (uniform in Tesselate a viewing sphere (uniform in

spherical coordinates).spherical coordinates).

• Record (model, basis, angle) in the hash table.Record (model, basis, angle) in the hash table.

• 2-point basis: O(n2-point basis: O(n33) (the same order as without ) (the same order as without viewing angle because the viewing angle viewing angle because the viewing angle introduces only a constant factor -- introduces only a constant factor -- independent of the scene).independent of the scene).

Recognition of Polyhedral ObjectsRecognition of Polyhedral Objects

Polygonal objectsPolygonal objects• Choose an edge as the basis, record (model, Choose an edge as the basis, record (model,

basis edge) in the hash table. basis edge) in the hash table.

• Preprocessing and recognition is O(nPreprocessing and recognition is O(n22).).

Polygonal objectsPolygonal objects• Choose an edge as the basis, record (model, Choose an edge as the basis, record (model,

basis edge) in the hash table. basis edge) in the hash table.

• Preprocessing and recognition is O(nPreprocessing and recognition is O(n22).).

[1]

Comparisons (1/2)Comparisons (1/2)

1.1. With alignment method.With alignment method.• Use exhaustive enumeration of all possible Use exhaustive enumeration of all possible

pairs in the objects and the images.pairs in the objects and the images.

• Geometric hashing can process all models Geometric hashing can process all models simultaneously, while the alignment method simultaneously, while the alignment method processes models sequentially.processes models sequentially.

• The alignment method does not require any The alignment method does not require any additional memory, while geometric hashing additional memory, while geometric hashing requires a large memory to store hash table.requires a large memory to store hash table.

• Geometric hashing more efficient if:Geometric hashing more efficient if:

• The scene contains enough features (6-10) for The scene contains enough features (6-10) for efficient recognition by voting.efficient recognition by voting.

• There are many models.There are many models.

1.1. With alignment method.With alignment method.• Use exhaustive enumeration of all possible Use exhaustive enumeration of all possible

pairs in the objects and the images.pairs in the objects and the images.

• Geometric hashing can process all models Geometric hashing can process all models simultaneously, while the alignment method simultaneously, while the alignment method processes models sequentially.processes models sequentially.

• The alignment method does not require any The alignment method does not require any additional memory, while geometric hashing additional memory, while geometric hashing requires a large memory to store hash table.requires a large memory to store hash table.

• Geometric hashing more efficient if:Geometric hashing more efficient if:

• The scene contains enough features (6-10) for The scene contains enough features (6-10) for efficient recognition by voting.efficient recognition by voting.

• There are many models.There are many models.

Comparisons (2/2)Comparisons (2/2)

2.2. With Generalized Hough Transform With Generalized Hough Transform (GHT).(GHT).• GHT quantizes all possible (continuous) GHT quantizes all possible (continuous)

transformations between the model and transformations between the model and the scene into a set of bins, whilethe scene into a set of bins, while

• Geometric Hashing quantizes just the Geometric Hashing quantizes just the (discrete) transformation represented by (discrete) transformation represented by the basis.the basis.

2.2. With Generalized Hough Transform With Generalized Hough Transform (GHT).(GHT).• GHT quantizes all possible (continuous) GHT quantizes all possible (continuous)

transformations between the model and transformations between the model and the scene into a set of bins, whilethe scene into a set of bins, while

• Geometric Hashing quantizes just the Geometric Hashing quantizes just the (discrete) transformation represented by (discrete) transformation represented by the basis.the basis.

SummarySummary

• Ability to recognize objects that have Ability to recognize objects that have undergo an arbitrary transformation.undergo an arbitrary transformation.

• Can perform partial matching.Can perform partial matching.

• Efficient and can be parallelized easily.Efficient and can be parallelized easily.

• Use transformation-invariant access key to Use transformation-invariant access key to the hash table.the hash table.

• Two phases (preprocessing and Two phases (preprocessing and recognition).recognition).

• Require a large memory to store hash table.Require a large memory to store hash table.

• Ability to recognize objects that have Ability to recognize objects that have undergo an arbitrary transformation.undergo an arbitrary transformation.

• Can perform partial matching.Can perform partial matching.

• Efficient and can be parallelized easily.Efficient and can be parallelized easily.

• Use transformation-invariant access key to Use transformation-invariant access key to the hash table.the hash table.

• Two phases (preprocessing and Two phases (preprocessing and recognition).recognition).

• Require a large memory to store hash table.Require a large memory to store hash table.

ReferencesReferences

[1] Yehezkel Lamdan and Haim J. [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition and Efficient Model-Based Recognition Scheme, Scheme, ICCVICCV, 1988., 1988.

[1] Yehezkel Lamdan and Haim J. [1] Yehezkel Lamdan and Haim J. Wolfson, Geometric Hashing: A General Wolfson, Geometric Hashing: A General and Efficient Model-Based Recognition and Efficient Model-Based Recognition Scheme, Scheme, ICCVICCV, 1988., 1988.

Documents

Geometric Hashing: A General and Efficient Model-Based Recognition Scheme Yehezkel Lamdan and Haim J. Wolfson ICCV 1988 Presented by Budi Purnomo Nov 23rd