
Anticipation for Facial Animation

Jung-Ju Choi
Division of Media, Ajou University
[email protected]

Dong-Sun Kim
Division of Media, Ajou University
[email protected]

In-Kwon Lee
Dept. of Comp. Sci., Yonsei University
[email protected]

Abstract

According to traditional 2D animation techniques, anticipation makes an animation much more convincing and expressive. We present an automatic method for inserting anticipation effects into an existing facial animation. Our approach assumes that an anticipatory facial expression can be found within an existing facial animation if it is long enough. Vertices of the face model are classified into a set of components using principal components analysis applied directly to given key-framed and/or motion-captured facial animation data. The vertices in a single component have similar directions of motion in the animation. For each component, the animation is examined to find an anticipation effect for the given facial expression. One of these anticipation effects is selected as the best anticipation effect, namely the one that preserves the topology of the face model. The best anticipation effect is automatically blended with the original facial animation while preserving the entire duration of the animation.

Keywords: facial animation, anticipation, principal component analysis

1 Introduction

Since the introduction of facial animation into computer animation, a great deal of research has been done on photorealistic facial expression [1]. Facial animation is very important since audiences pay close attention to the faces of characters. In many cases, however, photorealistic facial animation does not satisfy audiences, who are not expecting this sort of realism in animations. So non-photorealistic approaches, already used in rendering, have recently been applied to animation [2, 3, 4]. Traditional 2D animation techniques are applicable to non-photorealistic 3D animation [5], and many animators conventionally produce 3D computer-based animations using those techniques [6].

There are several reasons non-photorealistic animations have more impact on audiences than photorealistic animations. Audiences tend to perceive an animation as a series of situations, not as a series of realistic pictures. Non-photorealistic animation, for example, can easily convey the core of a situation through the exaggeration of an action and the anticipation and reaction exhibited by characters. The characteristics of the human cognitive system must also be considered. Audiences expect a character to anticipate and to react before and after an action: the action by itself has little impact. Without anticipation an animation is uncomfortable and unnatural. A well-trained animator will insert appropriate anticipation before an action. However, training an animator is very time-consuming, and making an impressive animation is a delicate process even for a good animator.

This paper deals with a part of a broad subject: the application of the principles of traditional 2D animation to 3D animation. Our method of applying the anticipation effect to a 3D facial animation can be outlined as follows. A given key-framed and/or motion-captured facial animation is examined to extract the facial expression into which anticipation should be inserted. For efficiency, the vertices having similar directions of motion in the animation are collected into a component using principal components analysis (PCA), so that the resulting face model consists of several components classified in terms of vertex motion. For each component, the animation data is examined again to find the motion that can be used as anticipation for the given facial expression. From the candidate anticipatory motions of the components, the best motion is selected as the anticipation expression, namely the one that preserves the topology of the face model. Once the anticipation expression has been found, it is inserted into and blended with the given expression within the duration of the given expression.

2 Related research

Since Parke's pioneering work in facial animation [7, 8], there has been a great deal of work on facial animation, which has been competently surveyed [1, 9]. Most research on facial animation, however, has focused on photorealism rather than on applications of traditional 2D animation techniques.

Historically, traditional animation studios such as Disney have been developing 2D animation techniques since the 1930s, and many of these are applicable to 3D animation [5]. For instance, Opalach and Maddock [10] used multi-layer implicit surfaces for modeling and animating a dinosaur, incorporating various 2D animation effects. Rademacher [11] simulated the irregular deformation observed in 2D animation for 3D character animation: a 3D character is deformed at some key-frames, and in the remaining frames the deformations are calculated by interpolation. Bregler et al. [12] captured motion from old cartoon movies. Agarwala [2] suggested a semi-automatic method to convert a video of a real scene into a cartoon animation. Chenney et al. [3] applied a squash-and-stretch technique to represent the rigidity of an object in a physical simulation.

3 Characteristics of anticipation

Our method is based on the characteristics of anticipation, one of the traditional 2D animation techniques [6]. Most animators consider an action as a sequence of anticipation, the action proper, and a reaction [5]. Anticipation is the preparation for an action, and the reaction is the result of an action. Most human actions involve anticipation, since most people think of an action before doing it, and they get the energy for the action from the anticipation. In general, the bigger the motion, the bigger the anticipation. When a person shoots an arrow, for example, the bowstring must be drawn, the arrow is shot by releasing the string, and then the bowstring vibrates. Drawing the string is the anticipation, shooting the arrow is the action, and the vibration of the string is the reaction. Clearly, the string must be drawn further to shoot an arrow farther.

Anticipation also occurs in facial animation. When people look on in utter amazement, they open their mouths wide and widen their eyes. A well-trained animator might insert an opposite expression, with small eyes and a small mouth, between the neutral and the amazement expressions; in this case, the opposite expression is the anticipation. Inserting such an expression makes a facial animation much more expressive.

The anticipation of an action is usually a motion in the opposite direction to the action, and is relatively slow and small compared to the action. As a result, the action appears a bit faster and bigger than expected. These characteristics of anticipation can be applied to facial animation. A face consists of many features such as the eyes, eyebrows, nose, and mouth, and a facial expression is the composition of the motions of such features. If a facial expression with large eyes is part of the action, for example, then small eyes play the role of the anticipation. If the eyebrows move upward during the action, then they should move downward during the anticipation.

4 Producing anticipation

In this section, we present a detailed method to produce the anticipation effect for a given facial animation sequence, whether or not such anticipation is part of the original sequence. Our method automatically determines the facial expression into which anticipation should be inserted, and searches the input animation sequence to find an appropriate anticipation effect. The appropriate anticipation effect, if found, is inserted into and blended with the original facial expression.

4.1 Filtering the animation

If a facial animation is created by motion capture, it will contain a lot of noise, and filtering is necessary to facilitate the analysis of the animation data. A Gaussian filter is commonly used for smoothing animation data; in this paper, we use the method developed by Bruderlin and Williams [13].
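As a concrete illustration of this step, the following sketch low-pass filters every vertex channel with a plain Gaussian kernel, the simpler alternative mentioned above (not the Bruderlin-Williams filters the paper actually uses). The `(n_frames, n_vertices, 3)` array layout and the value of `sigma` are our assumptions, not something the paper specifies.

```python
# Minimal pre-filtering sketch, assuming the animation is a NumPy array
# `anim` of shape (n_frames, n_vertices, 3).
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smooth_animation(anim: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Low-pass filter each (vertex, coordinate) channel along the time axis."""
    # axis=0 is the frame axis, so every scalar channel is smoothed independently.
    return gaussian_filter1d(anim, sigma=sigma, axis=0)
```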

4.2 Creating components

A component is defined as a set of vertices that have a similar motion in a given face model, and usually corresponds to a facial feature such as an eye, the nose, or the mouth. There has been a lot of research on finding features in a face model [14, 15, 16]. Existing methods create components based on the positions of vertices in a polyhedral face model. As a result, the eyes and eyebrows, for example, may not end up in the same component even though they have the same motion, while the upper and lower lips (or the left and right parts of the mouth) may end up in a single component even though they have different motions. In this section, we develop a method to create components using PCA directly on the animation data, not on the face model.

PCA is a good mathematical tool for finding patterns in multi-dimensional data [17]. Applying PCA to the animation data produces a principal direction for each vertex in the face model, where the principal direction corresponds to the eigenvector with the maximum eigenvalue. For any two vertices in the face model, if the angle between their principal directions is within a certain tolerance, the two principal directions are considered similar, and the two vertices are collected into the same component. Eventually, all the vertices in the face model are partitioned into a set of components using a breadth-first search. Figure 1 shows the result of creating components from facial animation data using PCA.

Figure 1: Components created by PCA.
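A minimal sketch of this component-creation step follows, assuming the filtered animation is a `(n_frames, n_vertices, 3)` array and that the breadth-first search spreads along mesh edges given by an adjacency list `neighbors`. Both assumptions, and the 15° tolerance, are illustrative; the paper does not publish an implementation.

```python
from collections import deque
import numpy as np

def principal_directions(anim: np.ndarray) -> np.ndarray:
    """Per-vertex principal direction: the first PCA axis of the vertex's
    trajectory (eigenvector of the trajectory covariance with the largest
    eigenvalue)."""
    dirs = np.empty((anim.shape[1], 3))
    for v in range(anim.shape[1]):
        traj = anim[:, v, :] - anim[:, v, :].mean(axis=0)  # center the trajectory
        # Right singular vectors of the centered trajectory are the PCA axes.
        _, _, vt = np.linalg.svd(traj, full_matrices=False)
        dirs[v] = vt[0]
    return dirs

def create_components(dirs, neighbors, tol_deg=15.0):
    """Breadth-first grouping of vertices whose principal directions differ
    by less than the angular tolerance."""
    cos_tol = np.cos(np.radians(tol_deg))
    label = [-1] * len(dirs)
    n_comp = 0
    for seed in range(len(dirs)):
        if label[seed] != -1:
            continue
        label[seed] = n_comp
        queue = deque([seed])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                # abs() treats opposite eigenvector signs as the same axis.
                if label[w] == -1 and abs(np.dot(dirs[v], dirs[w])) >= cos_tol:
                    label[w] = n_comp
                    queue.append(w)
        n_comp += 1
    return label, n_comp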

4.3 Extracting expressions

Once a set of components for a facial animation has been created, we need to extract the facial expressions of each component, and then determine the starting and end frames of each facial expression. For each component, an animation graph can be constructed in terms of the principal axis computed by the method given in Section 4.2. In Figure 2, the vertical axis represents the principal axis of the motion of the corresponding component, and the zero axis represents the neutral position of the component. Cubic Hermite interpolation is used to extract the facial expression from the graph.

Figure 2: Example of an animation graph. (The graph marks, for two segments $g_i$ and $g_j$, the start/end frames $s_i, e_i$ and $s_j, e_j$, the extremum frames $t_i$ and $t_j$ with values $G(t_i)$ and $G(t_j)$, and the intervals $I_i$ and $I_j$.)
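The paper does not spell out how the scalar graph is formed; one plausible construction, sketched below, projects the component's mean displacement from a neutral pose onto its principal axis, so that $G(t) = 0$ at the neutral position. The `neutral` pose and the per-component vertex list are assumed inputs.

```python
import numpy as np

def animation_graph(anim, neutral, comp_vertices, axis):
    """Return G(t): per-frame scalar motion of one component along its
    principal axis, zero at the neutral position.

    anim:  (n_frames, n_vertices, 3) animation data
    neutral: (n_vertices, 3) neutral pose
    comp_vertices: indices of the component's vertices
    axis: unit principal direction of the component
    """
    disp = anim[:, comp_vertices, :] - neutral[comp_vertices]  # (n_frames, |C|, 3)
    return disp.mean(axis=1) @ axis  # average displacement projected onto the axis
```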

Let us assume that the input animation graph $G$, defined as a function of the frame number, has $n$ frames. We partition the graph $G$ into segments, each of which is wholly greater than (or equal to) zero, or wholly less than (or equal to) zero. Then $G = \bigcup g_i$, $1 \le i \le s$, where $g_i$ stands for the $i$th segment of the animation graph and $s$ is the number of segments. The starting and end frames of a segment $g_i$ are denoted by $s_i$ and $e_i$ respectively. The duration $I_i$ of the segment $g_i$ is defined by the interval $[s_i, e_i]$, and $g_i$ is said to be positive or negative if and only if $G(t) \ge 0$ or $G(t) \le 0$ for all $t \in I_i$, respectively. The segment $g_i$ has a horizontal tangent, with either a maximum or a minimum value, at a frame $t_i \in I_i$ according to whether the segment $g_i$ is positive or negative (see Figure 2). We consider the values of $G(t_i)$ and $g_i$ to represent the $i$th facial expression.

For each component, finding the zeros of the graph $G$ gives the starting frame $s_i$ and the end frame $e_i$ of a facial expression, which corresponds to the segment $g_i$ of the animation graph. The sign of $G(t)$ for $t \in (s_i, e_i)$ determines whether the segment $g_i$ is positive or negative. Furthermore, the zeros of the derivative of $G$ locate the horizontal tangent at the frame $t_i$, with a maximum or a minimum value depending on the sign of $g_i$. The result of extracting the expressions of a single component is shown in Figure 3.

Figure 3: Extracting expressions from the animation graph for a single component (magnitude along the principal axis plotted against the frame number; the marks indicate the start/end frames and the expressions).
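A sketch of the segmentation is given below; it works on the per-frame samples of $G$ rather than on the cubic Hermite interpolant the paper uses, so zero crossings are located only to frame accuracy.

```python
import numpy as np

def extract_expressions(G: np.ndarray):
    """Split G into sign-consistent segments and return (s_i, e_i, t_i):
    start frame, end frame, and extremum frame of each segment."""
    segments = []
    s = 0
    for t in range(1, len(G) + 1):
        # Close the segment at the last frame or at a sign change (a zero
        # crossing); exact zeros are grouped with the non-positive side.
        if t == len(G) or (G[t] > 0) != (G[s] > 0):
            seg = G[s:t]
            t_i = s + int(np.argmax(np.abs(seg)))  # frame of the horizontal tangent
            segments.append((s, t - 1, t_i))
            s = t
    return segments
```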

4.4 Finding anticipation expressions

For each extracted expression, or for any user-selected expression, the corresponding animation graph is examined to find an appropriate anticipation effect. If the given facial animation is long enough, we may be able to find an appropriate anticipation effect within the animation graph.

When an expression $g_i$ is selected for an anticipation effect, an appropriate anticipation effect $g_j$, taking place in an interval $I_j$, must be found. The sign of $g_j$ must be considered: if the expression $g_i$ is positive, the anticipation $g_j$ must be negative, and vice versa. Given two constant weights $w_1$ and $w_2$, the index $j$ of the anticipation effect for a facial expression $g_i$ is calculated by

$$j = \arg\min_{1 \le k \le s,\; k \ne i} \left\{ w_1\,\Delta G_{i,k} + w_2\,\Delta I_{i,k} \right\}, \qquad (1)$$

where

$$\Delta G_{i,k} = \big|\,|G(t_i)| - |G(t_k)|\,\big|, \qquad \Delta I_{i,k} = |I_i - I_k|,$$

subject to $G(t_i)G(t_k) < 0$, which makes the motion of the anticipation effect opposite to the subsequent motion. In Equation (1), $|\cdot|$ denotes the absolute value and $\arg$ returns the index of the minimizing argument. Equation (1) takes into account the similarity of the magnitudes of the two expressions, and of the time intervals they occupy, in determining the anticipation expression.
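Equation (1) translates almost directly into code. In the sketch below, `segs` is the list of `(s_i, e_i, t_i)` triples produced by the extraction step for one component and `G` is its sampled animation graph; reading the interval term as the difference of segment durations is our interpretation.

```python
import numpy as np

def find_anticipation(G, segs, i, w1=1.0, w2=1.0):
    """Index j of the candidate anticipation for expression i: the opposite-
    signed segment most similar in magnitude and duration (Equation 1)."""
    s_i, e_i, t_i = segs[i]
    best_j, best_cost = None, np.inf
    for k, (s_k, e_k, t_k) in enumerate(segs):
        if k == i or G[t_i] * G[t_k] >= 0:
            continue  # anticipation must move opposite to the action
        dG = abs(abs(G[t_i]) - abs(G[t_k]))   # magnitude similarity
        dI = abs((e_i - s_i) - (e_k - s_k))   # duration similarity
        cost = w1 * dG + w2 * dI
        if cost < best_cost:
            best_j, best_cost = k, cost
    return best_j
```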

Considering all the components for a given facial expression, many anticipation effects may be found at different time intervals, each corresponding to one component. Let $h$ be the number of components in the face model. Then there may be up to $h$ different anticipation effects for a given facial expression $G(t_i)$, denoted by $G(t_{j,1}), G(t_{j,2}), \ldots, G(t_{j,h})$. If we were to apply all these different anticipation effects directly to the corresponding components, we would get an abnormal result that violates the topology of the face model (see Figure 4). We need a way of selecting a single effect $t_{j,l}$, $1 \le l \le h$, that represents the best anticipation effect for all the components.

Figure 4: Examples of abnormal anticipation effects which violate the topology of a face model.

Based on our understanding of an audience's perception, we have developed a method for determining the best index $l$. Audiences, in general, pay attention to large motions when they watch an animation. The index $l$ can therefore be chosen to maximize the difference between the two expressions:

$$l = \arg\max_{1 \le k \le h} \left\{ |G(t_i) - G(t_{j,k})| \right\}. \qquad (2)$$

If we apply the expression at frame $t_{j,l}$ computed by Equation (2) as the best anticipation for a given expression at frame $t_i$, we preserve the topology of the face model. However, components other than the one indexed by $l$ are not guaranteed to have appropriate anticipation effects.

For a given expression, we instead consider the best anticipation to be the expression, over a certain time interval, that occurs most frequently across all the components. If we considered only the frequency, a very small motion might be selected as the best anticipation. Therefore, in addition to the frequency, we take into account the difference between the two expressions. Let the function $F_s(t)$ be defined by

$$F_s(t) = \begin{cases} 1, & \text{if } s - \varepsilon < t < s + \varepsilon \\ 0, & \text{otherwise} \end{cases}$$

where $\varepsilon$ is a small positive constant. The best anticipation effect is then computed by

$$l = \arg\max_{1 \le k \le h} \sum_{q=1}^{h} F_{t_{j,k}}(t_{j,q})\, \Delta G_{i,j_q}, \qquad (3)$$

where $\Delta G_{i,j_q} = \big|\,|G(t_i)| - |G(t_{j,q})|\,\big|$. This guarantees that most components have their own anticipation effects while preserving the topology of the face model.
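The voting rule of Equation (3) might be implemented as follows; `t_cand[k]` stands for the candidate frame $t_{j,k}$ of component $k$ and `G_comp[k]` for that component's graph, and evaluating $\Delta G_{i,j_q}$ on component $q$'s graph is our interpretation of the notation. Candidates vote for one another when they fall within `eps` frames, weighted by the magnitude difference, so a frequent and large anticipation wins.

```python
import numpy as np

def best_anticipation(G_comp, t_i, t_cand, eps=5):
    """Index l of the best anticipation effect (Equation 3)."""
    h = len(t_cand)
    scores = np.zeros(h)
    for k in range(h):
        for q in range(h):
            if abs(t_cand[q] - t_cand[k]) < eps:  # F_{t_{j,k}}(t_{j,q}) = 1
                # Magnitude difference between expression i and candidate q.
                scores[k] += abs(abs(G_comp[q][t_i]) - abs(G_comp[q][t_cand[q]]))
    return int(np.argmax(scores))
```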

4.5 Blending anticipation effects

Once the best anticipation has been found for a facial expression, it must be inserted in front of the given expression. But if we simply add the anticipation before the expression, the continuity of the whole sequence of expressions is broken and the duration of the expression becomes longer than before (see Figure 5(a) and (b)). From the characteristics of anticipation, the anticipatory motion is small and slow compared to the original motion. So we make the anticipation expression small and slow relative to the given expression, and blend it with the original expression, keeping the whole sequence continuous while retaining the entire duration of the original expression. As a result, the original motion looks faster, as already described in Section 3 (see Figure 5(c)). Experimental results suggest that the appropriate amplitude and time interval for the anticipation are 1/3 of those of the expression that follows.

Figure 5: Examples of blending the anticipation effect with a facial expression. (The panels show the extracted expression, the anticipation expression, and the blending.)
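A sketch of this blending step on a single component graph follows, applying the empirical 1/3 rule: the anticipation segment is resampled into the frames just before the expression, scaled to one third of the expression's amplitude and duration, and cross-faded with a window that vanishes at both ends so the curve stays continuous and the total length is unchanged. The cosine-free sine window is our choice; the paper does not specify the blend function.

```python
import numpy as np

def insert_anticipation(G, s_i, e_i, t_i, s_j, e_j):
    """Blend a scaled copy of the anticipation segment [s_j, e_j] into the
    frames just before the expression [s_i, e_i] with extremum frame t_i."""
    out = G.copy()
    dur = max(2, (e_i - s_i) // 3)                   # 1/3 of the duration
    # Resample the anticipation segment onto `dur` frames.
    src = np.interp(np.linspace(s_j, e_j, dur), np.arange(len(G)), G)
    peak = np.max(np.abs(src))
    if peak > 0:
        src = src * (np.abs(G[t_i]) / 3.0) / peak    # 1/3 of the amplitude
    start = max(0, s_i - dur)
    w = np.sin(np.linspace(0.0, np.pi, dur))         # 0 at both ends for continuity
    out[start:start + dur] = (1 - w) * out[start:start + dur] + w * src
    return out
```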

5 Results

A motion-captured facial animation of 200 frames was used as an example (see Figure 6). The number of components of the face model, computed by PCA of the animation data, was 41.

Figure 6: Some expressions extracted from a facial animation without anticipation: (a) at frame 30, (b) at frame 80, (c) at frame 130, and (d) at frame 180.

The first expression, occurring at frame 30, is selected for an anticipation effect. The anticipatory motions computed for several components are shown in Figure 7. Using the method described above, the best anticipation for this expression is found at frame 80, which is actually the second expression in the existing animation. Note that the best anticipatory expression is that of Figure 7(a) in this example, even though the expression of Figure 7(c) occurs more frequently than that of Figure 7(a). This is because Equation (3) takes into account the magnitude of the difference between two expressions as well as their frequency.

Figure 7: For the given expression, anticipation effects (a) for component F, (b) for components B and C, and (c) for components A, D, and E. The best anticipation effect for the given expression is (a).

If the third expression, around frame 130, is selected for inserting an anticipation effect, the anticipation effects corresponding to several components are shown in Figure 8, and the expression at frame 180 is selected as the best anticipation.

Figure 8: For the given expression, anticipation effects (a) for components B and C, (b) for component A, and (c) for components D, E, and F. The best anticipation effect for the given expression is (c).

We show anticipation effects inserted into an animation obtained by motion capture in Figure 9. In Figure 10, the same method is applied to a key-framed animation.

Figure 9: (a) Motion-captured facial animation without anticipation. (b) Facial animation with anticipation. For each box in (b), the left expression is the anticipation expression of the right one, which is given in the corresponding box of (a).

Figure 10: (a) Key-framed facial animation without anticipation. (b) Facial animation with anticipation. For each box in (b), the left expression is the anticipation expression of the right one, which is given in the corresponding box of (a).

6 Conclusion

We have presented an automatic method to produce anticipation effects for a facial animation. Using principal components analysis directly on the facial animation data, we classified the vertices of a face model into components, each of which represents a similar motion, and extracted the anticipation expressions by analyzing the animation graph of each component. Based on the characteristics of anticipation, we presented a method for finding the best anticipation effect within the existing animation graph, and a method for blending the best anticipation effect with the original expression, while preserving the topology of the face model and the duration of the animation as a whole. Making a facial animation with anticipation by hand is time-consuming and delicate. Anticipation is a well-known principle of traditional 2D animation, long said to be applicable to 3D animation; we have now shown how to incorporate it into 3D facial animation.

Anatomically, a vertex of a face is physically connected to a set of muscle bundles, and hence the motion of the vertex may not occur along a single direction. In most of our experiments, however, the muscle that primarily affects the motion of a vertex can be found by applying PCA directly to the animation data. When animators create anticipation for a given facial expression, they may consider not only the principal directions but also other details of the facial motion. In such cases, the result of our method differs somewhat from the animator's hand-made work. Qualitative and quantitative analysis of these details of facial motion would bring the result closer to the animator's work.

There are many other traditional principles of 2D animation, for example the overlapping technique. A facial expression consists of the motions of components, which do not move simultaneously. In future work we may be able to realize this technique by ordering and overlapping the time intervals occupied by the different components of an expression.

Acknowledgement

This research was supported by the University IT Research Center Project.

References

[1] F. I. Parke and K. Waters. Computer Facial Animation. A. K. Peters, 1996.

[2] A. Agarwala. SnakeToonz: A semi-automatic approach to creating cel animation from video. In Proc. of Non-Photorealistic Animation and Rendering, pages 139–148, 2002.


[3] S. Chenney, M. Pingel, R. Iverson, and M. Szymanski. Simulating cartoon style animation. In Proc. of Non-Photorealistic Animation and Rendering, pages 133–138, 2002.

[4] A. Kort. Computer aided inbetweening. In Proc. of Non-Photorealistic Animation and Rendering, pages 125–132, 2002.

[5] J. Lasseter. Principles of traditional animation applied to 3D computer graphics. In Proc. of SIGGRAPH '87, pages 35–44, 1987.

[6] R. Williams. The Animator's Survival Kit. Faber and Faber, 2001.

[7] F. I. Parke. Computer generated animation of faces. In Proc. of ACM Annual Conference, pages 451–457, August 1972.

[8] F. I. Parke. A Parametric Model for Human Faces. PhD thesis, University of Utah, 1974.

[9] J. Y. Noh and U. Neumann. A survey of facial modeling and animation techniques. Technical Report 99-705, University of Southern California, 1999.

[10] A. Opalach and S. Maddock. Disney effects using implicit surfaces. In Proc. of the 5th Eurographics Workshop on Animation and Simulation, 1994.

[11] P. Rademacher. View-dependent geometry. In Proc. of SIGGRAPH '99, pages 439–446, 1999.

[12] C. Bregler, L. Loeb, E. Chuang, and H. Deshpande. Turning to the masters: Motion capturing cartoons. ACM Transactions on Graphics (Proc. of SIGGRAPH 2002), 21(3):399–407, 2002.

[13] A. Bruderlin and L. Williams. Motion signal processing. In Proc. of SIGGRAPH '97, pages 97–104, 1997.

[14] Y. C. Lee, D. Terzopoulos, and K. Waters. Constructing physics-based facial models of individuals. In Proc. of Graphics Interface, pages 1–8, 1993.

[15] Y. C. Lee, D. Terzopoulos, and K. Waters. Realistic modeling for facial animation. In Proc. of SIGGRAPH '95, pages 55–62, 1995.

[16] C. J. Kuo, R. S. Huang, and T. G. Lin. Synthesizing lateral face from frontal facial image using anthropometric estimation. In Proc. of International Conference on Image Processing, volume 1, pages 133–136, 1997.


[17] I. T. Jolliffe. Principal Component Analysis. Springer, 1986.
