
A Graph Theoretic Model for Semantic Annotation of Articulated Shape-Parts using Zernike Moment based Features

1 Sourav Saha, 2 Laboni Nayak, 3 Saptarsi Goswami, 4 Priya Ranjan Sinha Mahapatra

1 Institute of Engineering & Management, Kolkata
Email: [email protected]
2 Institute of Engineering & Management, Kolkata
Email: [email protected]
3 A.K.Choudhury School of IT, Calcutta University, 700106, India
Email: [email protected]
4 Department of Computer Science, Kalyani University, 741235, India
Email: [email protected]

Abstract

In this paper, we attempt to solve the problem of automated semantic part annotation for articulated objects. This problem is more challenging than standard object detection, object segmentation and pose estimation tasks because the semantic parts of articulated objects often have similar appearance and varying positions. To tackle these challenges, we build a graph theoretic decomposition model to represent the structural composition of semantic parts. We determine features of the decomposed shape-parts using Zernike moments, and these features undergo an ANN based training process to generate a shape-annotation model. Our experiment is conducted on the well-known MPEG-7 shape data set. The performance of the proposed model is compared with human perception based annotation, and the experimental results indicate that the model is capable of achieving promising performance.

Keywords: Computer Vision; Shape Analysis; Zernike Moments; Shape Part Annotation.

1. Introduction

The past few years have witnessed significant progress on various object-level visual recognition tasks, such as object detection and object segmentation [1]. Understanding how different parts of an object are related and where the parts are located has become an increasingly important topic in computer vision [2, 3]. Some part-level visual recognition tasks, such as human pose estimation (predicting joints) [4], have been studied extensively. However, there are only a few works on semantic part segmentation, such as human parsing [5] and car parsing [6]. In some applications (e.g., activity analysis), it would be of great use if computers could produce a richer part segmentation instead of just a set of key points or a bounding box of an entire object.

In this paper, we attempt the challenging task of semantic part segmentation for articulated objects. Since articulated objects often have a homogeneous appearance over the whole body, hierarchical segmentation methods [7] cannot produce quality proposals for semantic parts. Besides, current classifiers are not able to distinguish between different semantic parts since they usually act on the whole appearance, which varies a lot in the case of an articulated object. Shapes also vary widely with viewpoint and pose. Therefore, it is very challenging to build a model that effectively combines object appearance, parts of a shape and spatial relations among parts under varying viewpoints and poses, while still allowing efficient learning and inference.

Inspired by [8], a shape decomposition model is proposed in this paper to capture structural relations among parts. We develop a graph-theoretical framework that yields an efficient shape-decomposition model for articulated objects under various poses and viewpoints. The intuitive basis of the decomposition model is that the structure of an articulated object can often be described by compositions of its flexible and non-flexible parts, where each part most likely corresponds to a maximal clique in its graphical representation.

International Journal of Pure and Applied Mathematics, Volume 119 No. 12 2018, 12869-12883, ISSN: 1314-3395 (on-line version), url: http://www.ijpam.eu, Special Issue


It is also important to design an efficient inferential learning strategy using discriminative features for the proposed shape decomposition model. We determine the features of shape-parts using Zernike moments, and these feature vectors undergo an Artificial Neural Network (ANN) based learning process to generate a classifier for automated shape annotation. Our experiment is conducted on the well-known MPEG-7 shape data set, and the outcome of the proposed model is compared with human perception based annotation. The experimental results indicate that the proposed model can perform meaningful segmentation of a compound shape and can also classify the segmented parts to map them to their semantic meaning, as shown in Figure 1.

2. Related Works

In terms of method, our work is related to [9], where a compositional model was used for horse segmentation. However, that compositional shape model did not incorporate the variations in appearance caused by movement of the flexible parts of a compound shape. Only a few poses and viewpoints were modelled to identify shape parts. There was also work on automatically learning the compositional structure/hierarchical dictionary [10], but the algorithms did not consider semantic parts and were not evaluated on a standard dataset.

In [5, 11], Dong et al. generated segment proposals by super pixel/over-segmentation algorithms, and then used these segments as building blocks for the whole human body through either a compositional method or an And-Or graph. Our task is inherently quite different from such parsing because articulated objects often have a roughly homogeneous appearance throughout the whole interior region of the body. As a result, super pixel/over-segmentation algorithms sometimes fail to produce semantically significant segments for an animal body. Besides, in challenging datasets like Pascal VOC, cluttered background and unclear boundaries further degrade the super pixel quality. Therefore, super pixel-based methods for shape parsing are not appropriate for identifying meaningful parts of an articulated object.

Our work bears a similarity to [8] in the sense that a graph based decomposition model is used to deal with variations of a complex shape due to viewpoints/poses or varied orientation of flexible parts. But our decomposition model is able to capture spatial relations between shapes. Besides, our task is part annotation for articulated objects of various poses and viewpoints, which appears more challenging than landmark localization for faces.

There are many works in the literature on modeling object shape, such as [10, 12]. But they were aimed only at object-level detection or segmentation. None of them explored graph theoretic concepts as an effective tool for shape decomposition in combination with moment based shape descriptors. In the subsequent section, we describe the proposed model, which combines graph theoretic approaches with a Zernike moment based shape descriptor for automated shape annotation.

Figure 1 (a) Shape Part Segmentation (b) Shape Part Annotation


3. Proposed Work

Figure 2 Overall Flow of the proposed model

3.1 Graph Theoretical Shape Decomposition

Shape decomposition is a fundamental step towards shape analysis and understanding. It is widely used in shape recognition, shape retrieval, skeleton extraction and motion planning. We primarily explore graph theory coupled with a perception based heuristic strategy to obtain a visually meaningful shape partitioning. The proposed model uses a polygonal approximation to represent a shape as a simpler graph in which each polygonal side represents an edge. Such a graph representation of a shape allows graph theoretic approaches to be applied effectively. We use the concept of an approximated vertex-visibility graph to generate viable cuts for decomposition in the shape-representative graph [Fig. 3]. We propose a heuristic based iterative multi-stage clique extraction strategy to decompose the shape depending on its visibility graph. A few refinements are proposed by exploring the options of (a) merging correlated parts for better visual interpretation and (b) inserting antipodal points of reflex vertices in the polygonal approximation to generate more viable cuts. The decomposition based on the proposed model appears to be coherent with human observation to a large degree.

3.1.1 Visibility Graph

Figure 3 a) Object b) Visibility Graph


The visibility graph of a simple polygon is a well-known geometric feature useful in many applications. In computational geometry, two vertices are said to form a visible pair in a polygon if and only if the line segment joining them lies inside the polygon. Given a simple polygon P = {v1, v2, …, vn}, a line segment lies inside P if it does not intersect the exterior of P. An undirected graph G = (V, E) is called the visibility graph of P if the vertices V = {v1, v2, …, vn} correspond to the vertices of P and an edge occurs in E between two vertices vi and vj in V if and only if vi and vj are visible in P. Figure 3b depicts the vertex-visibility graph of a polygonal approximation of the shape in Figure 3a. Understandably, the connection between a visible pair is equivalent to a line of sight.
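The visibility test described above can be sketched in a few lines of Python. This is only an illustrative approximation, not the construction used in the paper: the function names are hypothetical, the polygon is assumed to be simple with vertices listed in boundary order, and a pair is accepted when the joining segment properly crosses no polygon edge and its midpoint lies inside the polygon. Degenerate configurations, such as a segment that grazes a reflex vertex, are not handled robustly.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def properly_cross(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    d1, d2 = orient(p, q, a), orient(p, q, b)
    d3, d4 = orient(a, b, p), orient(a, b, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def point_in_polygon(pt, poly):
    """Ray-casting test; points exactly on the boundary may go either way."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_hit = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_hit > x:
                inside = not inside
    return inside


def visibility_graph(poly):
    """Return the set of visible vertex pairs (i, j), i < j, of a simple polygon."""
    n = len(poly)
    edges = set()
    for i, j in combinations(range(n), 2):
        if j - i == 1 or (i == 0 and j == n - 1):
            edges.add((i, j))  # polygon sides are always visible pairs
            continue
        p, q = poly[i], poly[j]
        if any(properly_cross(p, q, poly[k], poly[(k + 1) % n]) for k in range(n)):
            continue
        mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
        if point_in_polygon(mid, poly):
            edges.add((i, j))
    return edges


# Hypothetical L-shaped hexagon used purely as an example input.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(sorted(visibility_graph(l_shape)))
```

The brute-force check is cubic in the number of vertices, which is acceptable for the small vertex counts produced by a polygonal approximation.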

3.1.2 Shape Decomposition through Iterative Multi-Stage Maximal Clique Extraction

A clique is a complete sub-graph of a graph. A maximal clique is a complete sub-graph that is not contained in any other complete sub-graph, i.e. it cannot be extended by including any more adjacent vertices. In Figure 2, the polygonal vertices v3, v4, v5, v8, v9, and v10 construct a maximal clique with boundary-cuts {v3v10, v5v8}. A boundary-cut can be considered as an interface between two adjacent shape-parts of an object. Intuitively, a maximal clique most likely corresponds to a perceptually decomposable, semantically meaningful part of a shape. For example, the maximal clique with vertices v3, v4, v5, v8, v9, and v10 in Figure 2 corresponds to a meaningful part of the object. Based on this notion, the proposed model explores maximal cliques in order to decompose a shape into its semantic components. The most popular technique to find all maximal cliques of a given undirected graph was presented by Bron and Kerbosch [13]. The basic form of the Bron-Kerbosch algorithm is a recursive backtracking algorithm that searches for all possible maximal cliques in a given graph. We develop a heuristic strategy based on the principal concept of the Bron-Kerbosch algorithm [13] to iteratively extract maximal cliques from the visibility graph of a polygonal shape approximation. The proposed heuristic algorithm is presented as Algorithm 2: GetShapePartition.
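For reference, the basic recursive backtracking form of the Bron-Kerbosch algorithm [13] can be sketched as below. The function name and the adjacency-dictionary representation are illustrative choices made here, and the small example graph is hypothetical; the sketch enumerates all maximal cliques without pivoting and does not include the paper's heuristic.

```python
def bron_kerbosch(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    r: current clique, p: candidate vertices, x: already processed vertices.
    """
    p = set(adj) if p is None else set(p)
    x = set(x)
    if not p and not x:
        yield frozenset(r)  # r cannot be extended, so it is maximal
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p.discard(v)  # v has been fully explored
        x.add(v)      # exclude v from later cliques at this level


# Hypothetical graph: a triangle {1, 2, 3} with a pendant vertex 4 attached to 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
for clique in bron_kerbosch(example):
    print(sorted(clique))
```

Because the search visits every maximal clique, its running time grows exponentially with the number of vertices in the worst case, which motivates applying it to a coarse polygonal approximation rather than to the raw contour.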

The main objective of Algorithm 2: GetShapePartition is to find a suitable clique partitioning of a polygonal shape approximation governed by a perception based heuristic rule. The proposed algorithm only considers maximal cliques; henceforth, for the sake of simplicity, a maximal clique is also referred to as a clique. From a perception based partitioning perspective, the following criteria act as the intuitive basis of an effective heuristic for selecting the most suitable clique, corresponding to a meaningful shape-part, from the graph representation of a polygonal shape approximation.

Heuristic (area, boundary_cut): The proposed work considers the area as well as the average boundary-cut length of a clique at every step as heuristic parameters. The objective is to obtain a maximum-area clique with minimum average boundary-cut length. Mathematically, the maximization objective function is chosen as the ratio of the area to the average boundary-cut length, which gives the heuristic score of a clique.
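Written out, the heuristic score described above takes the following form. The symbols $H$, $C$, $k$ and $c_i$ are notation introduced here for illustration only: $C$ is a candidate clique, $c_1, \dots, c_k$ are its boundary-cuts and $\lVert c_i \rVert$ is the length of a cut.

$$H(C) = \frac{\mathrm{Area}(C)}{\frac{1}{k}\sum_{i=1}^{k} \lVert c_i \rVert}$$

For instance, a clique of area 1.5 with a single boundary-cut of length $\sqrt{2}$ would score $1.5 / \sqrt{2} \approx 1.06$. How a clique with no boundary-cut ($k = 0$) is scored is not stated in the text and would have to be fixed by convention.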

Illustration of the Proposed Shape Partitioning Algorithm GetShapePartition:

Here, we illustrate the flow of Algorithm 2: GetShapePartition, which is applied to partition a polygonal shape approximation based on the above mentioned heuristic criteria. At every step, the objective is to select a clique from the residual graph which maximizes the objective function denoted as Heuristic (area, boundary_cut). With reference to Fig. 4, P1: {v4, v5, v9, v10, v12, v13} is segmented at the first iteration stage as the most distinctive convex shape-part based on maximization of our heuristic. Removal of P1 leads to three disconnected but distinct cliques, namely P2, P3, and P4, as shown in Fig. 4b. These cliques correspond to meaningful shape-parts, and removal of the cliques {P1, P2, P3, P4} results in a residual graph having nodes {v1, v2, v3, v4, v13, v14}. Subsequently, clique P5: {v1, v2, v3, v14} is segmented on maximization of the heuristic, leaving P6 as the remaining final part. Since our algorithm generates maximal cliques using the idea of the Bron-Kerbosch algorithm, the worst-case time complexity is $O(3^{n/3})$ for an n-vertex graph [14].


Figure 4 Illustration: (a) Stage 1: Segmentation of P1 (b) Stage 2: Segmentation of P2, P3, P4

(c) Stage 3: Segmentation of P5 (d) Segmentation of P6
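The staged extraction illustrated above can be approximated by the following sketch. The listing of Algorithm 2 is not reproduced in this text, so this is a simplified recursive variant written under explicit assumptions rather than the authors' procedure: all function names are hypothetical, cliques of fewer than three vertices are ignored, a clique without any boundary-cut is given an infinite score, and after a clique is removed each boundary-cut closes off a sub-polygon that is partitioned independently. The visibility relation of the example L-shaped hexagon is entered by hand.

```python
from math import hypot, inf


def maximal_cliques(adj, r=frozenset(), p=None, x=frozenset()):
    """Basic Bron-Kerbosch enumeration (no pivoting)."""
    p = set(adj) if p is None else set(p)
    x = set(x)
    if not p and not x:
        yield r
        return
    for v in list(p):
        yield from maximal_cliques(adj, r | {v}, p & adj[v], x & adj[v])
        p.discard(v)
        x.add(v)


def cuts_of(clique, chain):
    """Order clique vertices along the chain and list its boundary-cuts."""
    pos = {v: i for i, v in enumerate(chain)}
    ordered = sorted(clique, key=pos.__getitem__)
    pairs = zip(ordered, ordered[1:] + ordered[:1])
    cuts = [(a, b) for a, b in pairs if (pos[b] - pos[a]) % len(chain) != 1]
    return ordered, cuts


def area(points, ordered):
    """Shoelace area of the polygon through the ordered vertices."""
    total = 0.0
    for a, b in zip(ordered, ordered[1:] + ordered[:1]):
        (x1, y1), (x2, y2) = points[a], points[b]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def score(points, clique, chain):
    """Area divided by mean boundary-cut length (assumed inf if no cut)."""
    ordered, cuts = cuts_of(clique, chain)
    if not cuts:
        return inf
    lengths = [hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])
               for a, b in cuts]
    return area(points, ordered) / (sum(lengths) / len(lengths))


def get_shape_partition(points, adj, chain=None):
    """Greedily pick the best clique, then recurse into each cut-off piece."""
    chain = list(range(len(points))) if chain is None else chain
    members = set(chain)
    sub = {v: adj[v] & members for v in chain}  # induced visibility graph
    cliques = [c for c in maximal_cliques(sub) if len(c) >= 3]
    if not cliques:
        return []
    best = max(cliques, key=lambda c: score(points, c, chain))
    parts = [best]
    pos = {v: i for i, v in enumerate(chain)}
    for a, b in cuts_of(best, chain)[1]:
        i, j = pos[a], pos[b]
        piece = chain[i:j + 1] if i < j else chain[i:] + chain[:j + 1]
        parts += get_shape_partition(points, adj, piece)
    return parts


# Hypothetical L-shaped hexagon with a hand-entered visibility relation.
pts = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
vis = {0: {1, 2, 3, 4, 5}, 1: {0, 2, 3}, 2: {0, 1, 3},
       3: {0, 1, 2, 4, 5}, 4: {0, 3, 5}, 5: {0, 3, 4}}
for part in get_shape_partition(pts, vis):
    print(sorted(part))
```

Recursing per sub-polygon yields the same set of parts as repeatedly selecting the best clique from the whole residual graph, since removing a clique only affects the piece it belongs to; only the order in which parts are reported differs.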

3.3 Zernike Moments (ZM) as Features for Convex Shape-Part

Teague [15] suggested the use of continuous orthogonal moments to overcome the problems associated with geometric and invariant moments. He introduced two different continuous orthogonal moments, Zernike and Legendre moments, based on the orthogonal Zernike and Legendre polynomials, respectively. Several studies have shown the superiority of Zernike moments over Legendre moments due to their better feature representation capability and low noise sensitivity [16]. Therefore, we choose Zernike moments as the shape descriptor feature to represent a decomposed convex part. The complex Zernike moments ($Z_{nm}$) are derived from the orthogonal Zernike polynomials, as expressed below.

$$V_{nm}(x, y) = V_{nm}(r\cos\theta,\, r\sin\theta) = R_{nm}(r)\exp(jm\theta)$$

$$R_{nm}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^{s}\, \frac{(n-s)!}{s!\,\left(\frac{n+|m|}{2}-s\right)!\,\left(\frac{n-|m|}{2}-s\right)!}\; r^{\,n-2s}$$


where $n$ is a non-negative integer, $m$ is an integer such that $n - |m|$ is even and $|m| \le n$, $r = \sqrt{x^2 + y^2}$, and $\theta = \tan^{-1}(y/x)$.

Projecting the image function onto the basis set, the Zernike moment $Z_{nm}$ of order $n$ with repetition $m$ is usually computed as below.

$$Z_{nm} = \frac{n+1}{\pi} \sum_{x} \sum_{y} f(x, y)\, V_{nm}(x, y), \quad x^2 + y^2 < 1$$

Interestingly, the magnitude of the moments stays the same after rotation. Hence, the magnitudes of the Zernike moments of the image can be taken as rotation invariant features [16]. Zernike moments (ZM) have the following advantages [16]:

Rotation invariance: As noted above, the magnitudes of Zernike moments are invariant to rotation.

Robustness: They are robust to noise and minor variations in shape.

Expressiveness: Since the basis is orthogonal, they have minimum information redundancy.
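To make the definitions concrete, the radial polynomial and the discrete moment sum can be sketched in plain Python as follows. The function names and the mapping of pixel centres of a square image onto the unit disc are assumptions made for this illustration; the paper does not state which normalisation it uses. The sum follows the formula as written in this section. The more common convention multiplies by the complex conjugate of $V_{nm}$, which changes the phase of $Z_{nm}$ but not its magnitude for a real-valued image.

```python
import cmath
from math import atan2, factorial, hypot, pi


def radial_poly(n, m, r):
    """Zernike radial polynomial R_nm(r) for n - |m| even and |m| <= n."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("require |m| <= n and n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = (-1) ** s * factorial(n - s) / (
            factorial(s)
            * factorial((n + m) // 2 - s)
            * factorial((n - m) // 2 - s)
        )
        total += coeff * r ** (n - 2 * s)
    return total


def zernike_moment(image, n, m):
    """Discrete Z_nm of a square image given as a list of rows.

    Pixel centres are mapped to [-1, 1] x [-1, 1] (an assumed normalisation);
    only pixels with x^2 + y^2 < 1 contribute.
    """
    size = len(image)
    acc = 0j
    for row in range(size):
        for col in range(size):
            value = image[row][col]
            if value == 0:
                continue
            x = (2 * col + 1 - size) / size
            y = (2 * row + 1 - size) / size
            r = hypot(x, y)
            if r >= 1.0:
                continue
            acc += value * radial_poly(n, m, r) * cmath.exp(1j * m * atan2(y, x))
    return (n + 1) / pi * acc


# Hypothetical binary shape-part and its 90-degree rotation.
shape = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
rotated = [list(r) for r in zip(*shape[::-1])]
for n, m in [(2, 0), (2, 2), (4, 2)]:
    print(n, m, abs(zernike_moment(shape, n, m)), abs(zernike_moment(rotated, n, m)))
```

On a pixel grid, invariance is exact only for rotations that map the grid onto itself, such as the 90-degree turn used here; for arbitrary angles the magnitudes agree only approximately because of resampling.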

We consider the magnitudes of Zernike moments with values greater than or equal to one as significant features, while the order n is varied from 0 up to 10 [Table 1]. These feature vectors are labeled with their respective semantically meaningful shape-part classes, and an ANN classification tool is used to generate the shape-annotation model based on these features. During automated annotation, a shape-part is segmented using the clique-based partitioning method discussed in the previous section and is subsequently annotated based on the ANN classifier's prediction.
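The selection of significant magnitudes can be sketched as below. The helper names are hypothetical. The example uses the Table 1 entries up to order 4, and entries reported there only as "<1" are represented by a placeholder value of 0.0, which is an assumption made so that the filter has something to reject.

```python
def zernike_indices(max_order=10):
    """All (n, m) with 0 <= n <= max_order, 0 <= m <= n and n - m even."""
    return [(n, m) for n in range(max_order + 1) for m in range(n % 2, n + 1, 2)]


def feature_vector(magnitudes, max_order=10, threshold=1.0):
    """Keep magnitudes >= threshold, in increasing (n, m) order."""
    return [
        magnitudes[nm]
        for nm in zernike_indices(max_order)
        if nm in magnitudes and magnitudes[nm] >= threshold
    ]


BELOW_ONE = 0.0  # placeholder for entries listed only as "<1" in Table 1

table1_up_to_order_4 = {
    (0, 0): 10598, (1, 1): BELOW_ONE, (2, 0): 10364, (2, 2): 24,
    (3, 1): BELOW_ONE, (3, 3): BELOW_ONE, (4, 0): 3614, (4, 2): 24, (4, 4): 3607,
}
print(len(zernike_indices(10)))
print(feature_vector(table1_up_to_order_4, max_order=4))
```

For orders 0 to 10 the enumeration produces 36 index pairs, matching the number of rows in Table 1. In that table only the 21 even-order entries reach the threshold, which is also the length of each vector in Table 2.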

Table 1 Magnitudes of Zernike Moments of a Shape-Part

 n   m   magnitude     n   m   magnitude     n    m    magnitude
 0   0   10598         6   0   3369          8    8    2205
 1   1   <1            6   2   25            9    1    <1
 2   0   10364         6   4   3379          9    3    <1
 2   2   24            6   6   25            9    5    <1
 3   1   <1            7   1   <1            9    7    <1
 3   3   <1            7   3   <1            9    9    <1
 4   0   3614          7   5   <1            10   0    1983
 4   2   24            7   7   <1            10   2    25
 4   4   3607          8   0   2205          10   4    1968
 5   1   <1            8   2   23            10   6    25
 5   3   <1            8   4   2218          10   8    1983
 5   5   <1            8   6   23            10   10   25

4. Experimental Results

Evaluating the performance of any shape annotation model is a complex issue, mainly due to the subjectivity of human vision based judgment. The merit of such a scheme depends on how closely the outcome of the model matches human perception based annotation. The most intuitive criterion for estimating the qualitative effectiveness of the scheme is to examine the similarity between the annotation obtained through the proposed model and the annotation based on human perception. We use the well-known, publicly available MPEG-7 shape data set [http://www.dabi.temple.edu/ shape/MPEG7/dataset.html (1999)] to establish the viability of the proposed model. Table 2 lists the Zernike moment feature vector for each identifiable part of an image-object. One interesting observation is that identical shape-parts of an object produce similar feature vectors. Table 3 shows the qualitative merit of the proposed model. The annotation of a shape-part belonging to an object is presented in terms of a specific color. The effectiveness of the proposed model seems promising, as evident in Table 3. The quantitative aspect of the proposed model depends on the accuracy of the ANN classifier, which is built on the Zernike feature vectors. The qualitative merit of such a classification scheme is usually determined in terms of a confusion matrix, whereas the accuracy is measured based on True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) values [Accuracy = (TP + TN) / (TP + TN + FP + FN)]. As per our observation, the proposed framework seems to perform reasonably well, with an accuracy of nearly 96%.
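The accuracy measure quoted above can be computed as in the short sketch below. The counts and the confusion matrix are hypothetical numbers chosen only to illustrate the arithmetic; they are not the experimental data of the paper. For a multi-class confusion matrix, the analogous quantity is assumed here to be the fraction of samples on the diagonal.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


def multiclass_accuracy(confusion):
    """Fraction of correctly classified samples in a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts for illustration only.
print(accuracy(tp=48, tn=48, fp=2, fn=2))
print(multiclass_accuracy([[18, 1, 1], [0, 19, 1], [1, 0, 19]]))
```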

Table 2 Magnitudes of Zernike Moments as Feature Vector with order (n) = 0, 1, 2, …, 10

Magnitudes of Zernike Moments as Feature Vector (one row per shape-part image; images not reproduced)

17594, 17285, 34, 5962, 33, 5968, 5663, 34, 5655, 34, 3644, 33, 3634, 33, 3644, 3326, 35, 3339, 35, 3327, 35
6046, 5869, 5, 2070, 4, 2075, 1901, 5, 1894, 5, 1282, 4, 1272, 4, 1282, 1096, 5, 1108, 5, 1096, 5
18897, 18572, 47, 6402, 46, 6408, 6086, 48, 6078, 48, 3912, 46, 3901, 46, 3911, 3576, 48, 3590, 48, 3577, 48
6007, 5822, 27, 2058, 26, 2064, 1883, 27, 1875, 27, 1277, 25, 1266, 25, 1277, 1083, 28, 1096, 28, 1083, 28
2890, 2767, 4, 999, 4, 1005, 885, 4, 878, 4, 629, 4, 619, 4, 629, 497, 4, 510, 4, 497, 4
10598, 10365, 25, 3614, 24, 3607, 3369, 25, 3379, 25, 2206, 24, 2219, 24, 2206, 1983, 26, 1968, 26, 1983, 26
2441, 2325, 12, 847, 11, 853, 741, 12, 733, 12, 536, 11, 526, 11, 536, 411, 13, 425, 13, 411, 13
2325, 2215, 2, 807, 2, 812, 706, 2, 698, 2, 511, 1, 501, 1, 511, 391, 2, 405, 2, 392, 2
10080, 9850, 15, 3432, 15, 3437, 3211, 15, 3203, 15, 2110, 14, 2100, 14, 2110, 1871, 16, 1883, 16, 1871, 16
2032, 1922, 19, 709, 18, 715, 608, 20, 600, 20, 453, 17, 441, 17, 452, 332, 20, 346, 20, 332, 20
5451, 5283, 4, 1868, 4, 1874, 1709, 4, 1701, 4, 1160, 4, 1149, 4, 1159, 982, 4, 995, 4, 982, 4
12581, 12319, 32, 4276, 31, 4282, 4023, 33, 4015, 33, 2623, 31, 2613, 31, 2623, 2351, 33, 2364, 33, 2351, 33
7186, 6993, 10, 2455, 9, 2461, 2270, 10, 2263, 10, 1517, 9, 1507, 9, 1516, 1314, 10, 1326, 10, 1314, 10
5240, 5066, 27, 1799, 26, 1805, 1635, 28, 1627, 28, 1119, 25, 1108, 25, 1119, 936, 28, 950, 28, 936, 28
20216, 19890, 23, 6842, 22, 6848, 6526, 23, 6518, 23, 4175, 22, 4165, 22, 4175, 3841, 23, 3853, 23, 3841, 23
9409, 9189, 3, 3205, 3, 3210, 2993, 3, 2986, 3, 1972, 3, 1962, 3, 1972, 1742, 3, 1755, 3, 1743, 3

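Table 3 Performance of the proposed model

Image-1   Image-2   Image-3   Annotation
(images not reproduced)       wing, body, tentacle
(images not reproduced)       head, neck, body, hump, leg
(images not reproduced)       head, neck, body, tail, leg
(images not reproduced)       trunk, head, body, leg
(images not reproduced)       ear, head, body, leg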

5. Conclusion

In this paper, we propose a model for automated semantic part annotation of articulated objects. The semantic parts of articulated objects often have similar appearance and highly varying positions, which pose challenges to the automated annotation task. We build a graph theoretic multi-stage decomposition model to represent the structural composition of semantic parts. Zernike moments are used to determine shape descriptive features of the decomposed shape-parts. These feature vectors are labeled with semantically meaningful shape-part classes and undergo the training phase of an ANN classification process to generate the shape-annotation model. During automated annotation, a shape-part is isolated using the clique-based partitioning method discussed previously, and each isolated shape-part is subsequently annotated based on the ANN classifier's prediction. One interesting observation from our experiment is that identical shape-parts of an object produce similar Zernike moment based feature vectors. Our experiment is conducted on the well-known MPEG-7 shape data set, and the performance of our model is compared with human perception based annotation. The experimental results indicate promising efficacy of the proposed model when the shape-parts are identifiable by analyzing the shape contour in its planar form.

References

[1] Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D., 2010. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1627–1645.

[2] Arbeláez, P., Hariharan, B., Gu, C., Gupta, S., Bourdev, L., and Malik, J., 2012. Semantic segmentation using regions and parts. In International Conference on Computer Vision and Pattern Recognition.

[3] Carreira, J., Caseiro, R., Batista, J., and Sminchisescu, C., 2012. Semantic Segmentation with Second-Order Pooling. In: Fitzgibbon A., Lazebnik S., Perona P., Sato Y., Schmid C. (eds) Computer Vision – ECCV 2012. Lecture Notes in Computer Science, vol 7578. Springer, Berlin, Heidelberg.

[4] Yang, Y., and Ramanan, D., 2011. Articulated pose estimation with flexible mixtures-of-parts. In International Conference on Computer Vision and Pattern Recognition.

[5] Dong, J., Chen, Q., Xia, W., Huang, Z., and Yan, S., 2013. A deformable mixture parsing model with parselets. In IEEE International Conference on Computer Vision.

[6] Eslami, S. and Williams, C., 2012. A generative model for parts-based object segmentation. In Advances in Neural Information Processing Systems (pp. 100-107).

[7] Arbelaez, P., Maire, M., Fowlkes, C., and Malik, J., 2011. Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5):898–916.

[8] Saha, S., Mandal, A., Sheth, P., Narnoli, H., and Mahapatra, P.R.S., 2017. A Computer Vision Framework for Partitioning of Image-Object through Graph Theoretical Heuristic Approach. In International Conference on Computational Intelligence, Communications, and Business Analytics (CICBA).

[9] Zhu, L., Chen, Y., and Yuille, A., 2010. Learning a hierarchical deformable template for rapid deformable object parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6):1029–1043.

[10] Fidler, S., and Leonardis, A., 2007. Towards scalable representations of object categories: Learning a hierarchy of parts. In International Conference on Computer Vision and Pattern Recognition.

[11] Dong, J., Chen, Q., Shen, X., Yang, J., and Yan, S., 2014. Towards unified human parsing and pose estimation. In International Conference on Computer Vision and Pattern Recognition.

[12] Ferrari, V., Jurie, F., and Schmid, C., 2010. From images to shape models for object detection. International Journal of Computer Vision, 87(3):284–303.

[13] Bron, C. and Kerbosch, J., 1973. Algorithm 457: Finding All Cliques of an Undirected Graph. Communications of the ACM, 16(9):575–577.

[14] Tomita, E., Tanaka, A., and Takahashi, H., 2006. The Worst-case Time Complexity for Generating All Maximal Cliques and Computational Experiments. Theoretical Computer Science - Computing and combinatorics, Elsevier Science Publishers Ltd., 363(1):28–42.

[15] Teague, M. R., 1980. Image analysis via the general theory of moments. Journal of the Optical Society of America, 70(8):920–930.

[16] Kim, W. Y., and Kim, Y. S., 2000. A region-based shape descriptor using Zernike moments. Signal Processing: Image Communication, 16(1-2):95–102.

Authors Biography

Sourav Saha is currently an Assistant Professor at the Department of Computer Science and Engineering, Institute of Engineering and Management. He graduated (B.Tech) in Computer Science & Engineering from Kalyani University in 2000, and obtained his Master of Engineering (M.E.) degree in Computer Science and Engineering from Bengal Engineering and Science University in 2002. He has numerous international and national publications in reputed journals and conferences. His research interests include Computer Vision, Cellular Automata, Pattern Recognition etc.

Laboni Nayak is currently pursuing a Master of Technology at the Department of Computer Science and Engineering, Institute of Engineering and Management. She graduated (B.Tech) in Computer Science & Engineering from MAKAUT in 2015. Her research interests are in the fields of Computer Vision, Image Processing, Pattern Recognition etc.

Saptarsi Goswami is currently an Assistant Professor at A.K.Choudhury School of IT (AKCSIT), Calcutta University. He received his B.Tech in Electronics Engineering from MNIT, Jaipur in 2001. He completed his M.Tech (CS) at AKCSIT, Calcutta University. He has submitted his Ph.D. thesis on 'Feature Selection'. He has 16+ years of working experience, of which the first 11 years were in the IT industry and the next 5 years in academics. He has 40+ research papers in reputed international journals and conferences. His areas of expertise include data warehousing, business intelligence, software engineering and machine learning.

Priya Ranjan Sinha Mahapatra is currently an Associate Professor at the Department of Computer Science and Engineering, University of Kalyani. He received his Ph.D. degree from the University of Kalyani. He has numerous international and national publications in reputed journals and conferences. His research interests lie in the fields of Computational Geometry, Algorithms, Computer Vision etc.
