
Clustering Face Carvings: Exploring the Devatas of Angkor Wat

Brendan Klare, Pavan Mallapragada, Anil K. Jain1
Michigan State University, East Lansing, MI, U.S.A.
{klarebre,pavanm,jain}@cse.msu.edu

Kent Davis
DatASIA, Inc.
Holmes Beach, FL, U.S.A.
[email protected]

Abstract: We propose a framework for clustering and visualization of images of face carvings at archaeological sites. The pairwise similarities among face carvings are computed by performing Procrustes analysis on local facial features (eyes, nose, mouth, etc.). The distance between corresponding face features is computed using point distribution models; the final pairwise similarity is the weighted sum of feature similarities. A web-based interface is provided to allow domain experts to interactively assign different weights to each face feature, and display hierarchical clustering results in 2D or 3D projections obtained by multidimensional scaling. The proposed framework has been successfully applied to the devata goddesses depicted in the ancient Angkor Wat temple. The resulting clusterings and visualization will enable a systematic anthropological, ethnological and artistic analysis of nearly 1,800 stone portraits of devatas of Angkor Wat.

Keywords: Facial carvings; Angkor Wat; devatas; clustering; facial components; archaeology

    I. INTRODUCTION

Ancient archaeological sites contain thousands of highly detailed sculptures carved over several generations. A significant number of these sculptures contain intricate human face carvings which, in essence, describe the ethnological, artistic and social environment of the time. Archaeologists and other experts manually compare and group similar facial carvings to study the history and progress of a society. When dealing with a large collection of sculptures, experts must resort to computational techniques, since human analysis is often subjective and infeasible [1]. Clustering and visualization are exploratory data analysis [9] techniques that assist in forming objective hypotheses that can later be confirmed by experts.

The notion of similarity between face carvings is critical to generating meaningful clusters. In order to measure this similarity, face carvings must be represented using features that capture the salient aspects of a face. Earlier approaches [4], [8] for clustering face carvings were limited by the unavailability of computational resources and modeling methodology, so they relied on rather primitive features to represent faces. 3D models have also been used to measure the similarity of carvings from the Bayon temple using laser range sensors [6]. Automatic face recognition has made substantial progress over the last three decades [11], and state-of-the-art approaches use more complex models to represent detailed aspects of faces.

1 Anil K. Jain's research was partially supported by the World Class University (WCU) program through the National Research Foundation of Korea, funded by the Ministry of Education, Science and Technology (R31-2008-000-10008-0).

In this paper, we propose an exploratory framework for clustering and visualization of face carvings. Face carvings are represented using shape information in the form of local point distribution models built from different facial components (eyes, nose, mouth, etc.). The similarity between two face carvings is computed as a weighted combination of the similarities between corresponding facial components. This measure of similarity is flexible and allows experts to generate different clusterings by changing the weights of different facial components. An interface is provided where the expert can visualize the variations in clustering by interactively changing the importance of different facial components, and choose specific clusterings to form hypotheses of interest. Multidimensional scaling is used to map the faces into a two- or three-dimensional space for visualization.

The proposed framework is demonstrated on a collection of 252 digital photographs of face carvings from the 12th century Hindu temple of Angkor Wat in modern-day Cambodia [5]. The Angkor Wat temple contains carvings of 1,796 contemporary 12th century women (referred to hereafter as devatas) depicted in detailed portraits (see Figure 1). According to archaeologists and ethnologists, the women appear to represent a range of ethnicities residing in this region. They may, therefore, represent social and political relationships among different corners of the Khmer Empire. Identifying their origins could indicate the political or cultural dominance of certain regions and provide a genetic fingerprint of the empire's influence. Probable ethnicities of these devatas include (i) Khmer (Central Cambodia), (ii) Lao (NE Thailand & Laos), (iii) Thai-Lanna (North Central Thai), (iv) Chinese (East Asian), (v) Indian (South Asian), (vi) Cham (Coastal Vietnam), (vii) Primitive (Mountain Tribes) and (viii) Archipelago (Malaysian-Indonesian).

    II. FACE CARVING REPRESENTATION

Traditional appearance-based face recognition approaches, namely Eigenfaces (PCA) and Fisherfaces (LDA), extract features from the pixel intensities on the face. However, the texture information used in appearance-based methods cannot be leveraged in representing the images of face carvings because of: (i) the nature of stone carvings (different stone materials result in different textures), and (ii) degradation of the carvings due to exposure to harsh weather conditions for over 800 years. This restricts the face carvings of Angkor Wat to being represented using shape information alone.

Figure 1. Two different face carvings of devatas with 140 landmarks used to represent each face. Subsets of these manually labeled points are used to describe different face features such as (a) eyes and mouth and (b) nose and face outline.

    A. Face Landmark Points

The proposed approach uses a set of landmark points to represent the shape of a face. For a given face image I_i, let P_i = (p_1^x, p_2^x, ..., p_n^x, p_1^y, p_2^y, ..., p_n^y)^T be the set of n landmark points, where p_j^x and p_j^y are the x- and y-coordinates of the j-th landmark point. Further, let p_j = (p_j^x, p_j^y). Given the relatively low image quality of the devata faces of Angkor Wat, a set of n = 140 landmarks is manually marked. Examples of two such landmark annotations can be seen in Figure 1. While 68 points are typically used to model a face, we use a larger number of points to include the ears and neck. Typically, the process of extracting the landmark set P_i from subject i can be automated using the Active Shape Model (ASM) [2]. However, since an ASM relies on texture information to build a robust model, it is not applicable to face carvings due to the lack of reliable texture information. To minimize the impact of errors in manual landmark placement, we first trace the boundary of each facial component. A new set of landmarks is then resampled along the contour at equal spacings. In our application, three times as many points were resampled along the contour, which resulted in reliable landmarks for matching two faces. In the case of damaged carvings, we approximated the corresponding landmark locations.
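The equal-spacing resampling step can be sketched as follows. This is a minimal illustration, not the authors' implementation; the use of NumPy, the function name, and linear interpolation along the traced boundary are assumptions.

```python
import numpy as np

def resample_contour(points, n_out):
    """Resample a traced contour (n x 2 array of x, y points) at equal
    arc-length spacing. Hypothetical helper; linear interpolation is an
    assumption, since the paper does not specify the resampling scheme."""
    points = np.asarray(points, dtype=float)
    # Cumulative arc length along the traced boundary.
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    # Equally spaced positions along the total contour length.
    targets = np.linspace(0.0, arc[-1], n_out)
    x = np.interp(targets, arc, points[:, 0])
    y = np.interp(targets, arc, points[:, 1])
    return np.stack([x, y], axis=1)
```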

    B. Procrustes Analysis

The similarity between two sets of landmarks can vary due to their translation, rotation, and scale with respect to a canonical image. Therefore, it is necessary to align the face images such that the computed similarity quantifies the difference due to shape alone. Procrustes analysis [3] has been successfully used to normalize face images by rigidly aligning multiple sets of points to minimize the distance between corresponding landmarks. The landmark points in each image are compensated for differences in translation, rotation, and scale using Procrustes analysis as follows:

1) The translational component is removed by subtracting the mean of the landmarks from each point, i.e., p_j ← p_j − (1/n) Σ_{j=1}^{n} p_j.

2) The scale is normalized by fixing the concatenated vector of points P_i to unit norm, i.e., P_i ← P_i / ||P_i||_2.

3) The rotational differences are normalized by declaring the first set of points P_1 to have rotation angle θ_1 = 0, and using least squares minimization to find each θ_i such that the distance between P_1 and P_i is minimized. (A code sketch of these three steps is given below.)
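A minimal sketch of the alignment, assuming NumPy and using the closed-form orthogonal Procrustes solution (via SVD) for the rotation step rather than an explicit angle search; the function and variable names are illustrative only.

```python
import numpy as np

def procrustes_align(point_sets):
    """Align a list of (n x 2) landmark arrays: remove translation, fix
    scale to unit norm, then rotate each shape onto the first one."""
    aligned = []
    for P in point_sets:
        P = np.asarray(P, dtype=float)
        P = P - P.mean(axis=0)           # 1) remove translation
        P = P / np.linalg.norm(P)        # 2) normalize scale to unit norm
        aligned.append(P)

    ref = aligned[0]                     # 3) P_1 is the rotation reference
    for i in range(1, len(aligned)):
        P = aligned[i]
        # Orthogonal Procrustes: rotation R minimizing ||P R - ref||_F.
        U, _, Vt = np.linalg.svd(P.T @ ref)
        aligned[i] = P @ (U @ Vt)
    return aligned
```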

    C. Point Distribution Model

After normalizing each point set, face carvings are represented using a Point Distribution Model (PDM). PDMs have been shown to be successful for representing faces in ASMs [2] and for face recognition between sketches and photographs [10]. Since only the shape information is persistent when matching sketches to photographs, that setting resembles the current scenario, where texture information is not usable. We follow an approach similar to [10], where the distances are computed by comparing local features extracted from facial components (eyes, nose, mouth, etc.), as well as the global relationship between the locations of the components.

A PDM represents a set of landmarks using a statistical representation obtained by PCA. In particular, given n facial landmark points and point sets from m faces, the data are first centered at the origin by subtracting the global mean μ. The mean-centered data points P̃_i = P_i − μ, for i = 1, ..., m, are collected into the columns of the pattern matrix Φ ∈ R^{2n×m}, where the i-th column of Φ is the centered vector P̃_i. The covariance matrix C = (1/(m−1)) Φ Φ^T is computed, and the top d eigenvectors V ∈ R^{2n×d} of C are selected such that 95% of the data variance is retained. For any subject i, the shape P_i is represented as b_i = V^T P̃_i.
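A sketch of the PDM construction, assuming NumPy; taking the SVD of the centered data rather than forming the covariance matrix explicitly, and the variable names, are implementation choices not taken from the paper.

```python
import numpy as np

def build_pdm(shapes, var_retained=0.95):
    """Build a Point Distribution Model from aligned shapes.

    shapes: (m x 2n) array, one flattened landmark vector per face.
    Returns the mean shape mu, the basis V (2n x d), and the coefficient
    vectors b_i = V^T (P_i - mu) stacked as rows of B."""
    X = np.asarray(shapes, dtype=float)
    mu = X.mean(axis=0)
    Xc = X - mu
    # Principal directions of the covariance via SVD of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (X.shape[0] - 1)
    cum = np.cumsum(var) / var.sum()
    d = int(np.searchsorted(cum, var_retained)) + 1   # retain 95% variance
    V = Vt[:d].T
    B = Xc @ V
    return mu, V, B
```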

    D. Global and Local Features

At a coarse level, the similarity between a pair of face images is determined by (i) the local similarities between the components, and (ii) the global arrangement of the components relative to each other.

Figure 2. Clustering results using different weight vectors w: (a) example clusters based on the shape of the eyes; (b) example clusters based on the shape of the mouth. The square grids in the top row denote the set of possible weights an archaeologist may browse through to see the clusterings obtained using those weights. The closer the user is to a corner of the grid, the higher the weight of the associated face component. The two examples show the prototypes of 8 clusters obtained with weight vectors that only use the (a) eyes, and (b) mouth.

Local Point Models: Given a set of K facial components (e.g., nose, mouth, eyes, etc.), each component is independently aligned using Procrustes analysis and a PDM is constructed for the component. We denote the coefficient vector for each component by b_i^k, and all coefficient vectors together as b_i = [b_i^1, b_i^2, ..., b_i^K]. The similarity between two face carvings is computed using a weighted Euclidean distance, where different facial components are assigned different weights.

Global Point Model: Local models capture only the similarity between corresponding components of the face. However, the relative placement of these components is also important in characterizing a face. The global relation between the components is captured as follows: given K components, the K(K−1)/2 pairwise distances between the centers of mass of all possible pairs of components are computed.
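The two descriptors can be sketched as follows, assuming NumPy; the function names and the exact weighted-distance form are illustrative readings of the description above.

```python
import numpy as np
from itertools import combinations

def global_feature(components):
    """Pairwise distances between the centers of mass of the K facial
    components of one face (the K(K-1)/2 global descriptor above)."""
    centers = [np.asarray(c, dtype=float).mean(axis=0) for c in components]
    return np.array([np.linalg.norm(a - b)
                     for a, b in combinations(centers, 2)])

def local_distance(b_i, b_j, weights):
    """Weighted Euclidean distance between the per-component PDM
    coefficient vectors of two faces."""
    return sum(w * np.linalg.norm(np.asarray(u) - np.asarray(v))
               for w, u, v in zip(weights, b_i, b_j))
```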

    III. CLUSTERING FACE CARVINGS

Given K facial components (Section II), the similarity matrix for component k is

S_k(i, j) = 1 − (||b_i^k − b_j^k||_2 − min(D_k)) / (max(D_k) − min(D_k)),   (1)

where D_k(i, j) = ||b_i^k − b_j^k||_2 is the matrix of pairwise distances between the component-k coefficient vectors, so that each entry of S_k lies in [0, 1].

Data clustering is performed by combining the K local similarity matrices S_k ∈ R^{m×m} (k = 1, ..., K) and the global similarity matrix S_g into a single similarity matrix S = w_g S_g + (1 − w_g) Σ_{k=1}^{K} w_k S_k, where 0 ≤ w_g ≤ 1 and Σ_{k=1}^{K} w_k = 1.

The feature weights are introduced to emphasize certain facial features for data exploration. Different weights result in different clusterings, providing different insights into the facial portrait collection.
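A sketch of the conversion from component distances to the normalized similarities of Eq. (1), and of the weighted combination into S; NumPy and the function names are assumptions.

```python
import numpy as np

def component_similarity(B):
    """Similarity matrix of Eq. (1): pairwise Euclidean distances between
    the rows of B (coefficient vectors b_i^k), rescaled to [0, 1] and
    flipped so that 1 means most similar."""
    B = np.asarray(B, dtype=float)
    D = np.linalg.norm(B[:, None, :] - B[None, :, :], axis=-1)
    return 1.0 - (D - D.min()) / (D.max() - D.min())

def combine_similarities(local_sims, global_sim, w_local, w_global):
    """S = w_g * S_g + (1 - w_g) * sum_k w_k * S_k."""
    S = w_global * np.asarray(global_sim, dtype=float)
    for w_k, S_k in zip(w_local, local_sims):
        S += (1.0 - w_global) * w_k * np.asarray(S_k, dtype=float)
    return S
```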

    A. Visualization

Visualization forms a key aspect of data analysis. In our application it assists a domain expert in browsing through the different clusterings obtained by using different weightings of the facial components. A web-based interface was developed to allow an expert to alter the facial component weight vector w and visualize the clusterings as the weights are altered. A discrete set of weights is used to display a K-sided convex polygon, in which each of the K vertices corresponds to one of the K facial features. When the user selects a point within the polygon, the weight vector w is determined based on the inverse distance of the selected point from each of the K vertices. As the user interactively modifies the weights, an updated clustering is generated and displayed, allowing the user to investigate the groupings as a function of the weights. For each weighting, the similarity matrix S is used to project the data points into either a two- or three-dimensional space using multidimensional scaling [7].
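A sketch of the two visualization ingredients described above: mapping a selected point in the K-sided polygon to a weight vector by inverse distance to the vertices, and projecting the faces with multidimensional scaling. The use of scikit-learn's MDS and the conversion of similarities to dissimilarities as 1 − S are assumptions.

```python
import numpy as np
from sklearn.manifold import MDS

def weights_from_point(click_xy, vertices):
    """Weight vector w from the inverse distances of the selected point
    to the K polygon vertices (normalized to sum to one)."""
    d = np.linalg.norm(np.asarray(vertices, float) - np.asarray(click_xy, float),
                       axis=1)
    inv = 1.0 / np.maximum(d, 1e-9)    # guard against clicking on a vertex
    return inv / inv.sum()

def project_faces(S, dim=2):
    """Project the faces into 2-D or 3-D by metric MDS on dissimilarities
    derived from the combined similarity matrix S."""
    D = 1.0 - np.asarray(S, dtype=float)
    np.fill_diagonal(D, 0.0)
    mds = MDS(n_components=dim, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(D)
```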

    IV. EXPERIMENTAL RESULTS

The proposed clustering and visualization framework was applied to a subset of 252 facial images of devatas, collected from the West Gopura (or entrance pavilion) of the Angkor Wat temple. While the landmarks for many different facial components were marked, in this study we used only four of the major facial components (eyes, nose, mouth, and face outline) for clustering the devatas into 8 groups (possibly corresponding to the 8 ethnic groups listed in Sec. I). We performed clustering using complete-link clustering, which allows for dendrogram analysis by domain experts.

Figure 3. (Scaled) heat map indicating the percentage of pairwise labels satisfied by each weighted combination of facial components. The brighter the location, the larger the number of constraints satisfied. These weightings can be explored further using the interactive web page1.
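A sketch of the complete-link step using SciPy's hierarchical clustering, again treating 1 − S as the dissimilarity; these library choices are assumptions, not the authors' code.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def complete_link_clusters(S, n_clusters=8):
    """Cut a complete-link dendrogram built from the combined similarity
    matrix S into the requested number of clusters."""
    D = 1.0 - np.asarray(S, dtype=float)
    np.fill_diagonal(D, 0.0)
    # SciPy expects a condensed (upper-triangular) distance vector.
    Z = linkage(squareform(D, checks=False), method="complete")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```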

Figure 2 shows two separate clustering results with different selections of weights. The figure also shows the associated 2-D projections of the 252 faces. The two 8-cluster groupings are based on using just the eyes or just the mouth in computing the face similarities. An interactive web page of our proposed clustering framework is also available1, where users can specify the weights interactively. Once a clustering of interest is found, it can be further explored by clicking on any displayed cluster prototype to view images of all carvings in that cluster.

    A. Evaluation

Since ground truth cluster labels are not available for the devata images, external cluster evaluation indices cannot be applied. While it is difficult to manually provide cluster labels for all the images, it is relatively easier to determine whether a pair of images should be placed in the same cluster or not. A group of faculty and students at the Khmer Arts Academy in Phnom Penh, Cambodia, labeled a set of 243 randomly chosen pairs of images. Figure 3 shows a heat map of the fraction of pairwise constraints satisfied by the clusterings using different weighted combinations of the facial components. The brighter the block, the larger the number of pairwise constraints satisfied. The brightest block (6th row, 3rd column) corresponds to around 50% of the human pairwise labels being satisfied by the clustering (note that for 8 clusters of equal size, only 12.5% of the constraints can be satisfied by random clustering). Utilizing the pairwise labels as side information during clustering can further enhance the performance of clustering in terms of this evaluation metric. Figure 3 also reveals that the face outline and nose are considered important facial components for manually labeling the pairs. Note that the objective is not to strictly adhere to the human categorization, but to suggest possible non-obvious groupings that can help the experts draw interesting conclusions.
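The evaluation measure, the fraction of human pairwise judgments that a clustering satisfies, can be sketched as follows; the data layout (index pairs plus same/different judgments) is an assumption about how the 243 labeled pairs would be stored.

```python
import numpy as np

def pairwise_constraint_score(labels, pairs, same_cluster):
    """Fraction of labeled pairs for which the clustering agrees with the
    human judgment (same cluster vs. different clusters)."""
    labels = np.asarray(labels)
    hits = sum((labels[i] == labels[j]) == bool(want)
               for (i, j), want in zip(pairs, same_cluster))
    return hits / len(pairs)
```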

    1http://www.cse.msu.edu/~klarebre/angkor/cluster/index.html

The different clusters generated by our framework are currently being analyzed by archaeologists to discover possible answers to long-standing questions regarding the devatas of Angkor Wat. Using the interactive web page, the current hypotheses being examined include: (i) Do the Angkor Wat devatas represent the different ethnic groups listed in Section I? (ii) Does the location of certain devatas within the temple have any meaning? (In order to further answer this question, we are currently labeling the remaining devata carvings with landmarks.) (iii) Is there any correlation between face carvings that may indicate how many different sculptors were used in crafting the devata carvings?

    V. CONCLUSIONS

A method for grouping face carvings at archaeological sites is presented that: (i) computes the local distance along the predominant modes of variation for different facial components, (ii) allows domain experts to change the importance of each component by assigning different weights in computing the similarity, and (iii) provides a visualization tool to illustrate how the clusters change with different weights. The proposed framework results in many meaningful clusters of devata face carvings of the magnificent Angkor Wat temple. These clusterings suggest hypotheses to domain experts for their validation. Future work will attempt to find a richer set of clusters using semi-supervised clustering techniques, by incorporating instance-level pairwise constraints provided by experts into the grouping process.

    REFERENCES

[1] J. A. Barcelo, editor. Computational Intelligence in Archaeology. Information Science Reference, 2008.

[2] T. F. Cootes et al. Active shape models - their training and application. Computer Vision and Image Understanding, 61(1):38-59, 1995.

[3] J. C. Gower. Generalized Procrustes analysis. Psychometrika, 40(1):33-51, 1975.

[4] E. Guralnick. The proportions of some archaic Greek sculptured figures: a computer analysis. Computers and the Humanities, 10:153-169, 1976.

[5] H. Jessup and T. Zephir. Sculpture of Angkor and Ancient Cambodia: Millennium of Glory. National Gallery of Art, Washington, DC, 1997.

[6] M. Kamakura et al. Classification of Bayon faces using 3D models. In Virtual Systems and Multimedia, 2005.

[7] J. B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1):1-27, 1964.

[8] G. Siromoney, S. Govindaraju, and M. Bagavandas. Temple carvings of southern India. Perspectives in Computing, 5(1):34-43, 1985.

[9] J. W. Tukey. Exploratory Data Analysis. Addison-Wesley, 1st edition, 1977.

[10] P. Yuen and C. Man. Human face image searching system using sketches. IEEE Trans. Systems, Man and Cybernetics, 37(4):493-504, July 2007.

[11] W. Zhao et al. Face recognition: A literature survey. ACM Comput. Surv., 35(4):399-458, 2003.