Literature Review: Facial Expression Recognition


    Informatics Research Review

    Emotion recognition through facial expressions

    Konstantinos Kanellis

    January 19, 2012


    1 Introduction

In the last two decades, many efforts have been made to improve Human-Computer Interaction (HCI). These efforts focus on making the interaction more natural, so that it resembles Human-Human Interaction. Humans use speech as the main way of communicating with each other, but they also use non-verbal signals such as gestures, body postures and facial expressions to emphasize certain parts of speech and to display emotions. The most expressive way humans display emotions is through facial expressions [3, 23]. Studies [38, 14] showed that there are six basic emotions that are universal and displayed cross-culturally by facial expressions: anger, disgust, fear, joy, sadness and surprise. Even though emotion recognition is a rather natural process that happens effortlessly for humans [15], for computers it is a difficult and computationally intensive process.

Emotion recognition systems have numerous applications, covering almost every interaction between a human and a computer or robot. For instance, they can be used in the game and entertainment industry; an example would be a more elaborate interactive process in games, closely resembling human-like interaction and communication. Other areas that could take advantage of emotion recognition systems are lie detection, psychiatric and neuro-psychiatric studies, emotion-sensitive automatic tutoring systems, tour-guide robots, face image compression and synthetic face animation [24], and video surveillance and security systems [20, 12, 39, 22].

Due to the importance of facial expression recognition and the wide range of its applications, several efforts have been made. One of the pioneers was Suwa [42] in 1978, followed by Mase [30] in the early 1990s, who used optical flow for facial expression recognition. Continued efforts over the following 20 years produced further improvements in the field [44, 26, 40, 18, 37, 1, 29, 13, 32, 36], introducing novel approaches in the process.

In this review, three different methods of emotion recognition through facial expressions will be analyzed and compared based on their results on the same database. Furthermore, this review will propose improvements to existing methods that would make them more suitable and precise for real-life situations.


2 Basic Structure of Facial Expression Analysis Systems

Facial expression analysis consists of three stages: face acquisition, feature extraction and representation, and facial expression recognition (Figure 1).

    Figure 1: Basic structure of facial expression analysis systems [27]

Face acquisition is the stage where the face's position is detected in order to distinguish it from the background and make feature extraction possible. This stage is affected by the illumination and the pose of the face. Small variations in the illumination of the face do not affect face acquisition much; badly illuminated faces, on the other hand, are a significant problem [6]. Pose variations may distort the facial expression or even make it partially disappear [18].

Feature extraction and representation is the next stage, where the facial changes of the located face are extracted and represented by features. There are two approaches: geometric feature-based methods, which focus on the shape and location of facial components such as the eyes and mouth in order to extract features that describe the face geometry, and appearance-based methods, which apply image filters to the whole face or to parts of the face in order to extract a feature vector [27].

The last stage is facial expression recognition. Recognition methods can be divided into two groups: frame-based methods, which recognize the expression in each frame separately (static classifiers, e.g. Bayesian networks, neural networks, Support Vector Machines), and sequence-based methods, which use information from a sequence of frames to recognize the expression (temporal classifiers, e.g. HMMs) [27, 1, 36, 37].

    3 Related studies

In this part, three emotion recognition methods will be examined. Each method introduces a different approach, but all use the same dataset. The first two are appearance-based methods, while the last one is a geometric feature-based method.

    3.1 First Method

The first of the examined approaches is [5] by M. Bartlett, G. Littlewort, M. Frank et al. The method with the best results among the techniques they presented was AdaSVM, a combination of the Adaptive Boosting algorithm (AdaBoost) as a feature selection technique and an SVM classifier.

In the beginning, the faces in the dataset are detected and resized to 48 x 48 pixels, with the distance between the eyes being approximately 24 pixels.

For the feature extraction step, a bank of Gabor filters at 8 orientations and 9 spatial frequencies (72 filters in total) is used [25, 28, 4]. A Gabor filter is a complex sinusoid modulated by a 2D Gaussian function; Gabor filters can be configured to extract a particular band of frequency features from an image. The features of an image are extracted by convolving the Gabor filters with the image, which results in 8 x 9 x 48 x 48 = 165888 Gabor features. The number of features is reduced to 900 using the AdaBoost [4, 41, 46] feature selection algorithm, which selects the appropriate Gabor feature for different image locations based on its importance to classification accuracy. AdaBoost is an iterative algorithm that treats the Gabor filters as weak classifiers. In every iteration, AdaBoost selects the classifier with the lowest weighted classification error through exhaustive search. The error is then used to update the sample weights so that wrongly classified samples receive larger weights. Each new feature is selected conditional on the features already chosen in previous iterations, with the goal of reducing the error left by the previous filters.
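To make the feature extraction step concrete, the sketch below applies a Gabor filter bank to a 48 x 48 face crop and flattens the responses into one long feature vector, as described above. It assumes OpenCV and NumPy; the kernel size and the exact sigma/wavelength values are illustrative choices, not the parameters used in [5].

```python
import cv2
import numpy as np

def gabor_features(face48):
    """Convolve a 48x48 grayscale face with a bank of Gabor filters
    (8 orientations x 9 spatial frequencies) and return the flattened
    filter responses as one feature vector."""
    orientations = [i * np.pi / 8 for i in range(8)]        # 8 orientations
    wavelengths = [2 * 2 ** (k / 2) for k in range(9)]      # 9 frequencies, 1/2-octave steps (illustrative)
    features = []
    for lam in wavelengths:
        for theta in orientations:
            kernel = cv2.getGaborKernel(ksize=(9, 9), sigma=0.5 * lam,
                                         theta=theta, lambd=lam,
                                         gamma=1.0, psi=0)
            response = cv2.filter2D(face48.astype(np.float32), cv2.CV_32F, kernel)
            features.append(response.ravel())               # 48*48 values per filter
    return np.concatenate(features)                         # 72 * 48 * 48 = 165888 features
```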

After feature extraction, a Support Vector Machine (SVM) classifier is used to classify the features. An SVM is a supervised classifier that constructs a hyperplane to separate input data that are not linearly separable in the initial data space. In order to do so, the SVM uses kernel functions to non-linearly transform the initial data into a high-dimensional feature space, where the data can be optimally separated by a hyperplane.
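A minimal sketch of this classification step follows, using scikit-learn's SVC on placeholder data standing in for the AdaBoost-selected Gabor features; the kernel and hyperparameters are illustrative rather than those reported in [5].

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder data standing in for the 900 AdaBoost-selected Gabor features
# per image and the 7 expression labels (values are random, for illustration only).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 900)), rng.integers(0, 7, size=200)
X_test, y_test = rng.normal(size=(50, 900)), rng.integers(0, 7, size=50)

# The kernel implicitly maps the features into a space where a separating
# hyperplane exists; SVC handles the multi-class case internally.
clf = SVC(kernel="rbf", gamma="scale", C=1.0)
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))
```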

The dataset used to evaluate the AdaSVM classifier (the combination of AdaBoost and SVM) was the Cohn-Kanade dataset. Other classifiers were also tested: AdaBoost used directly as a classifier, an SVM without pre-processing of the features, and Linear Discriminant Analysis (LDA). The results, shown in Table 1, indicate that the best classifier was AdaSVM, with a 93.3% recognition rate. Remarkably, the full expression recognition process runs in real time.

Table 1: Leave-one-out generalization performance of AdaBoost, SVMs, AdaSVMs and LDA [5]

    3.2 Second Method

The second of the examined expression recognition methods was introduced by C. Shan, S. Gong and P. W. McOwan [31]. It is based on the idea that facial images can be represented by micro-patterns, which can be described by Local Binary Patterns (LBP). LBP is a fast, low-computational-cost method that can extract facial features effectively even from low-resolution images [33, 34].

The system takes a face image as input, so there is no need for face detection. Instead, the image is processed to bring the face to a standard size based on the distance between the centres of the eyes, which should be 55 pixels. The width of the cropped face is about twice the eye distance and its height about three times the eye distance. Finally, the facial image is cropped to 110 x 150 pixels based on the position of the eyes.
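As an illustration of this normalization step, the sketch below rescales a grayscale image so that the inter-ocular distance becomes 55 pixels and crops a 110 x 150 window around the eyes. It assumes the eye centres are already known (e.g. from a landmark detector); the vertical placement of the eyes in the crop and the lack of boundary checks are simplifications, not details taken from [31].

```python
import numpy as np
import cv2

def normalize_face(gray, left_eye, right_eye,
                   eye_dist=55, out_w=110, out_h=150):
    """Scale the image so the inter-ocular distance is eye_dist pixels
    and crop an out_w x out_h window around the eyes."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    scale = eye_dist / np.hypot(rx - lx, ry - ly)
    resized = cv2.resize(gray, None, fx=scale, fy=scale)
    # Eye midpoint in the resized image.
    cx, cy = (lx + rx) / 2 * scale, (ly + ry) / 2 * scale
    # Place the eyes roughly one third from the top of the crop (illustrative offset,
    # and no handling of crops that fall outside the image).
    x0, y0 = int(cx - out_w / 2), int(cy - out_h / 3)
    return resized[y0:y0 + out_h, x0:x0 + out_w]
```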

For the feature extraction step, the LBP method mentioned above is used. The image is converted to grayscale and divided into sub-regions. For each pixel of each sub-region, a binary number is calculated. The binary number is the output of comparing a pixel's value with the values of its surrounding pixels: if we consider the pixel as the centre of a circle, the comparison is made with the pixels lying on the perimeter of that circle. The basic LBP operator uses a small 3 x 3 neighbourhood, i.e. a circle of radius 1 pixel (Figure 2, left). The extended LBP operator allows any radius R and any number of sampling points P in the neighbourhood and is denoted LBP_{P,R} [35] (Figure 2, right). A further extension of LBP uses only the binary numbers that contain at most two bitwise transitions between 0 and 1; these are called uniform patterns and are used to reduce the number of labels. The usage of uniform patterns is denoted LBP^{u2} [35]. For instance, 00000000, 00110000 and 11100001 are uniform patterns. Finally, a histogram of the uniform patterns is created for each sub-region, and all the histograms are concatenated to produce a global histogram, which is the description of the face. In this implementation, the images were divided into 42 regions (a 6-row x 7-column grid of 18 x 21-pixel regions), which gives a good ratio of recognition performance to computational cost [2] (Figure 3, left). The operator used is LBP^{u2}_{8,2}, which has 59 labels.
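A minimal sketch of this descriptor is shown below, using scikit-image's local_binary_pattern with the non-rotation-invariant uniform mapping (59 labels for P = 8); the grid is handled generically with array splitting rather than fixed 18 x 21 blocks, and the exact grid shape is an assumption.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_face_descriptor(face, grid=(7, 6), P=8, R=2):
    """Compute the LBP^{u2}_{P,R} code image of a grayscale face, split it into
    grid sub-regions, and concatenate the per-region histograms."""
    codes = local_binary_pattern(face, P, R, method="nri_uniform")  # code values 0..58
    n_bins = P * (P - 1) + 3                                        # 59 labels for P = 8
    hists = []
    for band in np.array_split(codes, grid[0], axis=0):
        for region in np.array_split(band, grid[1], axis=1):
            h, _ = np.histogram(region, bins=n_bins, range=(0, n_bins), density=True)
            hists.append(h)
    return np.concatenate(hists)                                    # 42 regions * 59 bins
```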

Figure 2: Left: the basic LBP operator [2]. Right: two examples of the extended LBP operator [35]: a circular (8, 1) neighbourhood and a circular (12, 1.5) neighbourhood [31]

Figure 3: Left: a face image divided into 6 x 7 sub-regions. Right: the weights used for the weighted dissimilarity measure. Black squares indicate weight 0.0, dark gray 1.0, light gray 2.0 and white 4.0 [31]

For the classification of emotions, two different techniques are compared: template matching with a nearest-neighbour classifier, and an SVM classifier. In template matching, during training the LBP global histograms of the images that belong to the same class are averaged to build a histogram template for that class. During testing, the LBP histogram of the input facial image is matched to the closest class template. For the matching, the chi-square statistic (χ²) with different weights for each face region is used (Figure 3, right), according to the importance of the information contained in each region. Facial expressions are mostly conveyed by the eye and mouth areas, so the corresponding regions receive higher weights.
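A sketch of the weighted chi-square template matching could look like the following; the region weights and the 59-bins-per-region layout follow the description above, and the templates are assumed to be the per-class averaged histograms.

```python
import numpy as np

def weighted_chi_square(h1, h2, region_weights, bins_per_region=59):
    """Weighted chi-square distance between two concatenated LBP histograms,
    with one weight per face sub-region (eye/mouth regions weighted higher)."""
    diff = (h1 - h2) ** 2 / (h1 + h2 + 1e-10)          # small constant avoids division by zero
    per_region = diff.reshape(-1, bins_per_region).sum(axis=1)
    return np.dot(region_weights, per_region)

def classify(h, templates, region_weights):
    """templates: dict mapping emotion label -> averaged class histogram."""
    return min(templates, key=lambda c: weighted_chi_square(h, templates[c], region_weights))
```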

For the SVM classifier, the classification function for a set of labelled examples $T = \{(x_i, y_i)\}$, $i = 1, \dots, l$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$, is given by

$$f(x) = \operatorname{sgn}\Big(\sum_{i=1}^{l} \alpha_i y_i K(x_i, x) + b\Big)$$

where the $\alpha_i$ are Lagrange multipliers, $b$ is the bias parameter of the optimal hyperplane, and $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ is a kernel function.

An SVM decides between two classes. To accomplish multi-class classification, a cascade of binary classifiers combined with a voting scheme is used.
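The sketch below illustrates such a one-versus-one voting scheme built from binary SVMs; scikit-learn's SVC actually performs this internally, so the explicit loop is purely for illustration, and the polynomial kernel settings are assumptions rather than the parameters of [31].

```python
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

def train_one_vs_one(X, y):
    """Train one binary SVM per pair of emotion classes."""
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)
        models[(a, b)] = SVC(kernel="poly", degree=3).fit(X[mask], y[mask])
    return models

def predict_by_voting(models, x):
    """Each pairwise classifier casts one vote; the class with most votes wins."""
    votes = {}
    for clf in models.values():
        label = clf.predict(x.reshape(1, -1))[0]
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```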

Table 2: Results of LBP with template matching and LBP with SVM [31]

Experiments were run on the Cohn-Kanade dataset for both classifiers. The best results were given by the combination of LBP with an SVM using a polynomial kernel function; the recognition results are shown in Table 2. Another experiment was run to evaluate LBP over different image resolutions, including very low ones (Table 3). The results reflect the effectiveness of LBP in real-world environments, where low-resolution video input is often all that is available.

Table 3: Results of the LBP-based algorithm on various image resolutions [31]

3.3 Third Method

A different approach to the problem was given by I. Cohen et al. [9]. This method uses Piecewise Bezier Volume Deformation (PBVD) as the face tracking method [43]. In the first frame of the image sequence, the eye corners and mouth corners are detected, and their positions are used as landmarks to fit a face model of 16 surface patches embedded in Bezier volumes (Figure 4(a)). This model looks like a wireframe that wraps the face and can track changes of facial features such as the eyebrows and mouth. To calculate the magnitude of the feature motion in the 2D images, template matching between frames at different resolutions is used. These magnitudes are translated into 3D motion vectors called Motion Units (Figure 4(b)), which are used as features for the classification process. The Motion Units are similar to Ekman's Action Units [16].

Figure 4: (a) The wireframe model and (b) the facial motion measurements [9]

The idea behind the selection of the classifier is to find a structure that takes into account the dependencies among the features. Tree-Augmented Naive Bayes (TAN) [21] is a classifier that partially fulfils this requirement: the class node has no parent, and each feature has as parents the class node and at most one other feature (Figure 5).

    Figure 5: An example of a TAN classifier. [9]

During the learning process, the classifier has no fixed Bayesian network structure; instead, it searches for the structure that maximizes the likelihood function given the training data. The method used to find the best TAN structure is a modified Chow-Liu algorithm [7, 21]. The algorithm (Figure 6) calculates the conditional mutual information of pairs of features given the class and, using these quantities as edge weights, constructs a maximum weighted spanning tree. The spanning tree is built with Kruskal's algorithm [11] (Figure 7). Another interesting point is that the learning algorithm for the TAN [21] is defined for discrete features, whereas the feature space of this method is continuous, which complicates the computation of the pairwise quantities for a general distribution of features. This problem was solved by assuming the distribution of the features to be Gaussian; the joint distribution then factorizes as

$$p(c, x_1, x_2, \dots, x_n) = p(c) \prod_{i=1}^{n} p(x_i \mid \mathrm{pa}_{x_i}, c)$$

where $\mathrm{pa}_{x_i}$ denotes the feature parent of $x_i$ in the TAN structure.
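The structure search can be sketched as follows under the Gaussian assumption: the class-conditional mutual information between every pair of features is computed from the per-class correlation coefficients, and a maximum weighted spanning tree is built over the features. SciPy's minimum spanning tree on negated weights stands in here for the explicit Kruskal implementation used in [9]; this is a sketch, not the authors' code.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def tan_feature_tree(X, y):
    """Return the edges of the maximum weighted spanning tree over the features,
    where edge weights are class-conditional mutual informations I(Xi; Xj | C)
    under a Gaussian assumption: I = -0.5 * sum_c p(c) * log(1 - rho_c(i, j)^2)."""
    n_features = X.shape[1]
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / len(y)
    weights = np.zeros((n_features, n_features))
    for c, p_c in zip(classes, priors):
        rho = np.corrcoef(X[y == c], rowvar=False)   # per-class feature correlation matrix
        np.fill_diagonal(rho, 0.0)
        weights += -0.5 * p_c * np.log(1.0 - rho ** 2 + 1e-12)
    # SciPy only offers a *minimum* spanning tree, so negate the (non-negative) weights;
    # zero-weight edges are ignored, which is acceptable for a sketch.
    neg = -weights
    np.fill_diagonal(neg, 0.0)
    mst = minimum_spanning_tree(neg).tocoo()
    return list(zip(mst.row, mst.col))               # feature-to-feature edges of the TAN
```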

They also used Gaussian Naive Bayes, Cauchy Naive Bayes and an HMM as classifiers. The best results on the Cohn-Kanade dataset were obtained with the Gaussian TAN. The HMM was tested on their own dataset for technical reasons. The tests were performed five times with the leave-one-out cross-validation method. The recognition rate on the Cohn-Kanade database is 73.22% ± 1.24 with a 95% confidence interval. Table 4 displays the results of all methods on the Cohn-Kanade dataset. The authors showed that frame-based (static) classifiers are easier to implement and train than dynamic classifiers (HMMs), which are based on sequential data. On the other hand, they noted that static classifiers can be unreliable for video sequences because of misclassification of frames that are not at the peak of the expression.


    Figure 6: TAN learning algorithm. [9]

Figure 7: Kruskal's maximum weighted spanning tree algorithm [9]

Table 4: Recognition rates for the Cohn-Kanade database together with their 95% confidence intervals [9]

    4 Conclusion

In this review we examined three different approaches to facial expression recognition. These methods were trained and tested with the Cohn-Kanade dataset. The method with the best results is the combination of the AdaBoost feature selection technique with an SVM classifier (AdaSVM), with a recognition accuracy of 93.3% and real-time operation. In comparison, the combination of the LBP feature extraction technique with an SVM classifier (with a polynomial kernel function) is less accurate, with a recognition rate of 88.4%, but has decent performance on low-resolution images; its operation speed is also real-time. The TAN classifier had the worst recognition performance, at 73.22%.

    4.1 Improvement suggestions

Real-life emotional facial expressions are difficult to gather because they are short-lived and greatly affected by even slight context-based changes [47]. Furthermore, labelling the data is a difficult, time-consuming and expensive process [47]. One solution is to create datasets of acted emotions. However, some of the 6 basic emotions are difficult to elicit in a lab environment, which may lead to wrong labelling of the data [8]. Moreover, the facial expressions of acted emotions differ in intensity, duration and order of occurrence from the facial expressions of natural, spontaneous emotions that occur in daily-life situations [10, 17, 45]. Additionally, there are situations in real life where two or more expressions of natural, spontaneous emotions may be blended or occur sequentially without being clearly separated by a neutral expression. All of these difficulties make the creation of datasets a very challenging task, which strongly affects the progress of emotion recognition research. The lack of datasets that fully comply with the requirements very often leads researchers to use their own datasets, which have no homogeneity among them. As a result, direct comparison of results obtained on different test beds is impossible [19], and the effectiveness of the various methods cannot be evaluated objectively. Databases should be improved by including authentic, spontaneous facial expressions and by taking into account the context in which each expression was captured, to make labelling accurate.

Most of the time, a facial expression is not enough by itself to fully interpret an emotion, and more information is needed. Depending on the context, body gestures, voice and cultural dissimilarities, a facial expression can express intention, cognitive processes, physical effort or other interpersonal meanings [27]. A way to overcome this problem, and also to improve recognition rates, is to use additional inputs that give more information about the expressed emotion wherever possible. For example, an additional input could be the voice of a person during a conversation, or hand gestures in situations where no voice is present. The combination of different modalities would resemble the way humans simultaneously use different senses to recognize an expressed emotion [39, 47].

Another idea is to use the Facial Action Coding System (FACS) to classify facial actions prior to any interpretation attempt, instead of classifying facial expressions directly into basic emotional categories. FACS is the most widely accepted technique for measuring the facial muscle movements corresponding to different expressions. It is a framework for describing facial expressions and codes the 6 basic universal emotions as combinations of visually distinct facial muscular motions known as Action Units (AUs). FACS is also suitable for labelling datasets because the AUs are objective descriptors, independent of interpretation [16, 19].

    4.2 Future work

The ideal facial expression analysis system must perform all stages of the process automatically and in real time, and analyze facial actions regardless of context, culture, gender, age and so on. Furthermore, besides the type of the expression, it should also consider the intensity and dynamics of the facial actions.

    References

[1] J. J. J. Lien. Automatic Recognition of Facial Expressions Using Hidden Markov Models and Estimation of Expression Intensity. PhD thesis, Washington University, 1998.

[2] T. Ahonen, A. Hadid, and M. Pietikainen. Face recognition with local binary patterns. In Computer Vision - ECCV 2004, pages 469-481, 2004.

[3] N. Ambady and R. Rosenthal. Thin slices of expressive behavior as predictors of interpersonal consequences: A meta-analysis. Psychological Bulletin, 111(2):256, 1992.

[4] M. S. Bartlett, G. Littlewort, I. Fasel, and J. R. Movellan. Real time face detection and facial expression recognition: Development and applications to human computer interaction. In Computer Vision and Pattern Recognition Workshop, 2003 (CVPRW'03), volume 5, page 53. IEEE, 2003.

[5] M. S. Bartlett, G. Littlewort, M. Frank, C. Lainscsek, I. Fasel, and J. Movellan. Recognizing facial expression: Machine learning and application to spontaneous behavior. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), volume 2, pages 568-573. IEEE, June 2005.

[6] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 19(7):711-720, 1997.

[7] C. Chow and C. Liu. Approximating discrete probability distributions with dependence trees. Information Theory, IEEE Transactions on, 14(3):462-467, 1968.

[8] J. A. Coan and J. J. B. Allen. Handbook of Emotion Elicitation and Assessment. Oxford University Press, USA, 2007.

[9] I. Cohen. Facial expression recognition from video sequences: temporal and static modeling. Computer Vision and Image Understanding, 91(1-2):160-187, August 2003.

[10] J. F. Cohn and K. S. Schmidt. The timing of facial motion in posed and spontaneous smiles. In Proceedings of the 2nd International Conference on Active Media Technology (ICMAT 2003), pages 57-72, 2003.

[11] T. H. Cormen. Introduction to Algorithms. The MIT Press, 2001.

[12] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and J. G. Taylor. Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, 18(1):32-80, 2001.

[13] G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman, and T. J. Sejnowski. Classifying facial actions. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 21(10):974-989, 1999.

[14] P. Ekman. Strong evidence for universals in facial expressions: a reply to Russell's mistaken critique. Psychological Bulletin, 115(2):268-287, 1994.

[15] P. Ekman and W. V. Friesen. The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1(1):49-98, 1969.


[16] P. Ekman and W. V. Friesen. Investigator's Guide to the Facial Action Coding System. Palo Alto, 1978.

[17] P. Ekman and E. L. Rosenberg. What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). Oxford University Press, USA, 1997.

[18] I. A. Essa and A. P. Pentland. Coding, analysis, interpretation, and recognition of facial expressions. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 19(7):757-763, 1997.

[19] B. Fasel and J. Luettin. Automatic facial expression analysis: a survey. Pattern Recognition, 36(1):259-275, January 2003.

[20] N. Fragopanagos and J. G. Taylor. Emotion recognition in human-computer interaction. Neural Networks, 18(4):389-405, May 2005.

[21] N. Friedman, D. Geiger, and M. Goldszmidt. Bayesian network classifiers. Machine Learning, 29(2):131-163, 1997.

[22] W. Hu, T. Tan, L. Wang, and S. Maybank. A survey on visual surveillance of object motion and behaviors. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 34(3):334-352, 2004.

[23] D. Keltner and P. Ekman. Facial expression of emotion. In Handbook of Emotions.

[24] R. Koenen. MPEG-4 project overview. International Organisation for Standardisation, ISO/IEC JTC1/SC29/WG11, La Baule, 2000.

[25] M. Lades, J. C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R. P. Wurtz, and W. Konen. Distortion invariant object recognition in the dynamic link architecture. Computers, IEEE Transactions on, 42(3):300-311, 1993.

[26] A. Lanitis, C. J. Taylor, and T. F. Cootes. A unified approach to coding and interpreting face images. In Computer Vision, 1995. Proceedings, Fifth International Conference on, pages 368-373. IEEE, 1995.

[27] Stan Z. Li, Anil K. Jain, Ying-Li Tian, Takeo Kanade, and Jeffrey F. Cohn. Handbook of Face Recognition. Springer-Verlag, New York, 2005.


[28] G. Littlewort, M. S. Bartlett, I. Fasel, J. Susskind, and J. Movellan. Dynamics of facial expression extracted automatically from video. Image and Vision Computing, 24(6):615-625, 2006.

[29] A. Martinez. Face image retrieval using HMMs. In Content-Based Access of Image and Video Libraries, 1999 (CBAIVL'99), Proceedings, IEEE Workshop on, pages 35-39. IEEE, 1999.

[30] K. Mase. Recognition of facial expression from optical flow. Trans. IEICE, 74(10):3474-3483, 1991.

[31] C. Shan, S. Gong, and P. W. McOwan. Robust facial expression recognition using local binary patterns. In IEEE International Conference on Image Processing 2005, pages II-370. IEEE, 2005.

[32] A. Nefian and M. Hayes. Face recognition using an embedded HMM. In IEEE Conference on Audio and Video-based Biometric Person Authentication, pages 19-24, 1999.

[33] T. Ojala, M. Pietikainen, and D. Harwood. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In Pattern Recognition, 1994. Vol. 1 - Conference A: Computer Vision & Image Processing, Proceedings of the 12th IAPR International Conference on, volume 1, pages 582-585. IEEE, 1994.

[34] T. Ojala, M. Pietikainen, and D. Harwood. A comparative study of texture measures with classification based on featured distributions. Pattern Recognition, 29(1):51-59, 1996.

[35] T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24(7):971-987, 2002.

[36] N. Oliver, A. Pentland, and F. Berard. LAFTER: A real-time face and lips tracker with facial expression recognition. Pattern Recognition, 33(8):1369-1382, 2000.

[37] T. Otsuka and J. Ohya. Recognizing multiple persons' facial expressions using HMM based on automatic extraction of significant frames from image sequences. In Image Processing, 1997. Proceedings, International Conference on, volume 2, pages 546-549. IEEE, 1997.


[38] P. Ekman. Universals and cultural differences in facial expressions of emotion, 1971.

[39] Maja Pantic and Leon J. M. Rothkrantz. Toward an affect-sensitive multimodal human-computer interaction. Proceedings of the IEEE, 91(9), 2003.

[40] M. Rosenblum, Y. Yacoob, and L. S. Davis. Human expression recognition from motion using a radial basis function network architecture. Neural Networks, IEEE Transactions on, 7(5):1121-1138, 1996.

[41] L. Shen and L. Bai. AdaBoost Gabor feature selection for classification. In Proc. of Image and Vision Computing New Zealand, pages 77-83. Citeseer, 2004.

[42] M. Suwa, N. Sugie, and K. Fujimora. A preliminary note on pattern recognition of human emotional expression. In International Joint Conference on Pattern Recognition, pages 408-410, 1978.

[43] H. Tao and T. S. Huang. Connected vibrations: a modal analysis approach for non-rigid motion tracking. In Computer Vision and Pattern Recognition, 1998. Proceedings, 1998 IEEE Computer Society Conference on, pages 735-740. IEEE, 1998.

[44] N. Ueki, S. Morishima, H. Yamada, and H. Harashima. Expression analysis/synthesis system based on emotion space constructed by multilayered neural network. Systems and Computers in Japan, 25(13):95-107, 1994.

[45] M. F. Valstar, H. Gunes, and M. Pantic. How to distinguish posed from spontaneous smiles using geometric features. In Proceedings of the 9th International Conference on Multimodal Interfaces, pages 38-45. ACM, 2007.

[46] P. Viola and M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154, 2004.

[47] Zhihong Zeng, Maja Pantic, Glenn I. Roisman, and Thomas S. Huang. A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):39-58, January 2009.
