[Institution of Engineering and Technology 2013 Constantinides International Workshop on Signal Processing (CIWSP 2013) - London, UK (25 Jan. 2013)] 2013 Constantinides International

Splicing partial body motion video sequences for motion synthesis

William W.L. Ng and Daniel P.K. Lun* Centre for Signal Processing

Department of Electronic and Information Engineering The Hong Kong Polytechnic University, Hong Kong

*[email protected]

Abstract—Motion synthesis has become an important research area in computer animation. When synthesizing new motions, recent partial body parts motion synthesis methods allow two upper body motions to be swapped if they have similar lower body motion. Although such requirement can ensure natural motion, it limits the possibility of new motions that can be synthesized using this method. In this paper, we propose to relax this requirement by considering also lower body motions that have only similar functionality and regularity. To facilitate the similarity measurement, a new time alignment algorithm is proposed so that motion sequences which have similar regularity can be easily identified. Our results show that the proposed algorithm can significantly improve the reusability of motion data for motion synthesis.

I. INTRODUCTION

The technique of motion synthesis allows captured human motion data to be reused for different kinds of applications such as computer games, movies and simulation. In motion synthesis, human motion data are divided into a number of short motion sequences by detecting foot contacts [1][2]. The short motion sequences from different human motion data sets can then be combined temporally to generate a long stream of motion (e.g. using motion graphs) [3][4], or combined spatially to create high-fidelity motions (e.g. based on motion interpolation) [5]. Recently, it has been suggested that by decomposing whole body motion sequences into their upper and lower body parts, the reusability of motion sequences can be improved since there are more choices for motion synthesis [6][7]. However, splicing upper and lower body motion sequences back into a whole body one is not trivial. To ensure that the spliced motions are smooth and natural, it was suggested that an upper body motion sequence can be spliced with a new lower body motion sequence only if its original lower body motion sequence is similar to the new one. The similarity is measured by computing the Euclidean distance between the motion data of the two lower body motion sequences: the smaller the Euclidean distance, the closer the postures of the two lower bodies across time.

Although the above requirement can ensure natural synthesized motions, it limits the number of possible partial body motion sequences that can be spliced together and hence lowers the reusability of the motion data. To solve the problem, we suggest in this paper relaxing the condition by also considering lower body motions that have only similar functionality and regularity. More specifically, we consider a special kind of locomotion that has alternating leg motions and contacts with the ground. It covers the majority of human locomotion on the ground, such as walking, running, sneaking and marching. It is observed that the upper body motions of such locomotion can be swapped without significant visual artefact, provided that their respective lower body motions have a similar rhythm and are aligned in time. For instance, a punching upper body motion that was originally spliced with a normal walking motion can be spliced with a normal sneaking lower body motion without significant visual artefact. This is not the case, however, if it is spliced with a zombie walking lower body motion, since the rhythm is very different and the two motions are not aligned in time.

To evaluate the similarity of the rhythm of two lower body motions, we propose a new algorithm based on the vertical component of the centre of mass (CoM) of the lower body. It is observed that, for the kind of locomotion mentioned above, lower body motions having a similar rhythm will have a similar vertical CoM component across time. However, directly comparing the CoM of two lower bodies in motion can be difficult since they are usually not aligned in time. Besides, due to the condition of the ground surface, the CoM can have a large variation that does not truly reflect the difference in rhythm of the two lower body motions. The proposed algorithm performs a time alignment of the lower body motions in question based on the cross correlation between the two motion sequences. It iteratively shifts the samples in one sequence such that the cross correlation between the two given sequences is maximized. The resulting cross correlation value gives a good indication of the similarity of the rhythm of the two lower body motions. In addition, the resulting time alignment curve can be used to adjust the upper body motion sequence so that the condition mentioned above is fulfilled when it is spliced with a lower body motion sequence. Experimental results show that the proposed splicing algorithm increases the reusability of motion data by over 80% without introducing significant visual artefact to the synthesized motion sequences.

II. NEW TIME ALIGNMENT ALGORITHM

In this section, a new algorithm is proposed for splicing upper body and lower body motion sequences. To illustrate the algorithm, let us first formulate the problem more specifically. We denote the set of motions of interest as $\Gamma$ and assume that two motion sequences $\{m_a\} \in \Gamma$ and $\{m_b\} \in \Gamma$ belong to that set. We further denote the upper body and lower body motions of $m_a$ and $m_b$ as $\{m_a^U, m_a^L\}$ and $\{m_b^U, m_b^L\}$, respectively. It is then suggested that $m_a^U$ and $m_b^L$ can be spliced together if the following condition is fulfilled:

Condition 1: Given that $\{m_a\} \in \Gamma$ and $\{m_b\} \in \Gamma$, $\{m_a^U, m_b^L\}$ is feasible if $m_a^L$ and $m_b^L$ have a similar rhythm and are aligned in time.

Assume that the starting spatial orientations of $m_a$ and $m_b$ are found to be the same [8]; that is, $m_a$ and $m_b$ initially face the same direction. The proposed algorithm then examines whether $m_a^L$ and $m_b^L$ have a similar rhythm and whether there exists a time alignment operator $F$ such that $F\{m_a^L\}$ and $m_b^L$ are aligned in time. If both hold, a new motion $\{F\{m_a^U\}, m_b^L\}$ can be created without significant visual artefact. Achieving this is not trivial, however, since $m_a^L$ and $m_b^L$ can be very different in posture. Traditional techniques such as dynamic time warping, which compares two motion sequences based on the Euclidean distance of postures [3], cannot give useful results. In this paper, we propose to adopt the vertical component of the centre of mass (CoM) [1] to represent the behaviour of motions. The CoM is defined as a weighted sum of the joint positions of a human character,

$$
CoM(t) = \frac{1}{J} \sum_{k=1}^{J} w_k \, p_k(t) \tag{1}
$$

where $w_k$ is the weight of the kth joint, $p_k$ is the kth joint's 3-dimensional position, $t$ is the time and $J$ is the total number of joints. We denote the vertical component of the CoM of a lower body motion as CoMy. Several examples of CoMy are shown in Fig. 1. It can be seen that types of locomotion that have a similar rhythm (such as walk, run and sneak) usually have a similar CoMy, although they may not be well aligned in time (see the difference between walk and run, Fig. 1(a) and (b); the duration of run is much shorter than that of walk). Besides, due to the ground condition, there can be a bias added to CoMy (see Fig. 1(d), walking upstairs). Motions that differ in rhythm will have a different CoMy. It can be seen that the CoMy of zombie walk and drunk walk (see Fig. 1(e) and (f), respectively) differs considerably from that of run and walk.
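As an illustration of Eq. (1), the following minimal Python sketch computes the CoM of one frame and extracts its vertical component over a sequence of frames. The function names, the choice of index 1 as the vertical (y) axis and the joint weights used in any call are assumptions made for this sketch; they are not taken from the original implementation.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def center_of_mass(positions: Sequence[Vec3], weights: Sequence[float]) -> Vec3:
    """Weighted sum of the joint positions divided by the joint count J, as in Eq. (1)."""
    if len(positions) != len(weights) or not positions:
        raise ValueError("need one weight per joint and at least one joint")
    j = len(positions)
    return tuple(
        sum(w * p[axis] for w, p in zip(weights, positions)) / j
        for axis in range(3)
    )


def com_y(frames: Sequence[Sequence[Vec3]], weights: Sequence[float],
          up_axis: int = 1) -> List[float]:
    """Vertical CoM component for every frame (up_axis=1 is an assumption)."""
    return [center_of_mass(frame, weights)[up_axis] for frame in frames]
```

Calling `com_y` on the lower body joints only gives a one-dimensional curve per motion sequence, which is the quantity compared in the remainder of this section.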

If two motions are well aligned in time, we can directly compare their CoMy using their cross correlation as follows:

$$
xcorr\left(CoM_y^a, CoM_y^b\right) = \frac{\sum_{t=1}^{N} \left(CoM_y^a(t) - \overline{CoM_y^a}\right)\left(CoM_y^b(t) - \overline{CoM_y^b}\right)}{\sqrt{\sum_{t=1}^{N} \left(CoM_y^a(t) - \overline{CoM_y^a}\right)^2 \cdot \sum_{t=1}^{N} \left(CoM_y^b(t) - \overline{CoM_y^b}\right)^2}} \tag{2}
$$

where $\bar{x}$ is the mean of $x$, and $CoM_y^a$ and $CoM_y^b$ are the CoMy of $m_a$ and $m_b$, respectively. If the CoMy of two motions are similar and well aligned in time, the value in Eq. (2) will be high. However, as can be seen in Fig. 1, the CoMy of two motions, although they can be similar in shape, are often not aligned in time. To solve the problem, we propose the following iterative algorithm to align the CoMy:

1. Assume $CoM_y^a$ and $CoM_y^b$ have lengths of $N_a$ and $N_b$, respectively. Upsample $CoM_y^a$ by a factor of $m$ and $CoM_y^b$ by a factor of $n$ such that $m N_a = n N_b$. In this case, both CoMy have the same length.

2. For each $t$ such that $CoM_y^a(t)$ is non-zero, perform

$$
\max_{-\frac{m}{2} < s(t) \le \frac{m}{2}} xcorr\left( int\left( shift\left( CoM_y^a(t), s(t) \right) \right), int\left( CoM_y^b \right) \right) \tag{3}
$$

In Eq. (3), $shift(x(t), s(t))$ refers to shifting the sample $x(t)$ by an amount of $s(t)$ samples; $int(\cdot)$ refers to the interpolation operation using a spline function.

3. Repeat step 2 until there is no more update to $CoM_y^a$.

4. Downsample the resulting $CoM_y^a$ by a factor of $n$ so that it has the same length as the original $CoM_y^b$.
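Steps 1 to 4 can be sketched in Python as follows. This is a simplified illustration under several assumptions rather than the implementation used in the experiments: linear interpolation stands in for the spline, each original sample is treated as a knot whose position on the upsampled grid is offset by an integer $s$ with $-m/2 < s \le m/2$ relative to its starting position, the search is a greedy coordinate ascent that accepts only strict improvements, and the gradient constraint introduced next is omitted. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def xcorr(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross correlation of two equal-length sequences, Eq. (2)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def interp(knots: Sequence[float], values: Sequence[float], length: int) -> List[float]:
    """Linear interpolation on the grid 0..length-1 (stand-in for the spline).

    `knots` must be strictly increasing; values are held constant outside them.
    """
    out, k = [], 0
    for g in range(length):
        if g <= knots[0]:
            out.append(values[0])
        elif g >= knots[-1]:
            out.append(values[-1])
        else:
            while knots[k + 1] < g:
                k += 1
            frac = (g - knots[k]) / (knots[k + 1] - knots[k])
            out.append(values[k] + frac * (values[k + 1] - values[k]))
    return out


def align(source: Sequence[float], target: Sequence[float],
          max_iters: int = 50) -> Tuple[List[float], float]:
    """Return the aligned source (target length) and the upsampled-grid xcorr."""
    na, nb = len(source), len(target)
    length = na * nb // math.gcd(na, nb)      # step 1: common length m*Na = n*Nb
    m, n = length // na, length // nb
    b_up = interp([j * n for j in range(nb)], target, length)
    shifts = [0] * na

    def score(s: Sequence[int]) -> float:
        knots = [i * m + s[i] for i in range(na)]
        return xcorr(interp(knots, source, length), b_up)

    best = score(shifts)
    for _ in range(max_iters):                # step 3: repeat until no update
        updated = False
        for i in range(na):                   # step 2: per-sample shift search
            for s in range(-((m - 1) // 2), m // 2 + 1):
                trial = shifts[:i] + [s] + shifts[i + 1:]
                value = score(trial)
                if value > best + 1e-12:
                    shifts, best, updated = trial, value, True
        if not updated:
            break
    knots = [i * m + shifts[i] for i in range(na)]
    return interp(knots, source, length)[::n], best   # step 4: downsample by n
```

Because each knot moves by less than half the knot spacing, the knots never cross, and because only strict improvements are accepted, the returned correlation is never lower than the value before alignment.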

As shown in Eq. (3), each sample of $CoM_y^a$ is shifted within its neighbourhood to see whether doing so gives a higher correlation with $CoM_y^b$. The algorithm then iterates to search for the best $s(t)$ such that the cross correlation of $CoM_y^a$ and $CoM_y^b$ reaches its highest value. To reduce the chance that the iteration is trapped at a local maximum, another constraint is imposed: when searching for the best $s(t)$ in each iteration, only those $s(t)$ that reduce the difference in gradient between $CoM_y^a$ and $CoM_y^b$ are considered. More specifically, we only consider those $s(t)$ such that

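$$
\sum_{t=1}^{L} sign\left( \frac{d}{dt}\, int\left( shift\left( CoM_y^a(t), s(t) \right) \right) \right) \cdot sign\left( \frac{d}{dt}\, int\left( CoM_y^b(t) \right) \right) \ge \sum_{t=1}^{L} sign\left( \frac{d}{dt}\, int\left( CoM_y^a(t) \right) \right) \cdot sign\left( \frac{d}{dt}\, int\left( CoM_y^b(t) \right) \right) \tag{4}
$$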

In Eq. (4), the function $sign(x)$ returns 1 if $x$ is positive and -1 if it is negative. Eq. (4) ensures that $CoM_y^a$, after shifting, will have at least as many samples whose gradient has the same sign as that of $CoM_y^b$. In practice, we consider $m_a^L$ and $m_b^L$ to have a similar rhythm if $CoM_y^a$ and $CoM_y^b$ have an $xcorr$ value (after alignment) higher than a user-defined threshold $\lambda_0$. We then make use of the resulting time alignment curve to derive the operator $F$ and create a new motion sequence $\{F\{m_a^U\}, m_b^L\}$.
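The gradient constraint of Eq. (4) can be checked with a few lines of Python. The sketch below is illustrative only: it approximates the derivative by forward differences, returns 0 from `sign` for a zero gradient (a case the definition above does not cover), and uses hypothetical function names.

```python
from typing import Sequence


def sign(x: float) -> int:
    """1 for positive, -1 for negative; 0 for zero is an assumption of this sketch."""
    return (x > 0) - (x < 0)


def gradient_agreement(a: Sequence[float], b: Sequence[float]) -> int:
    """Sum over t of sign(da/dt) * sign(db/dt), using forward differences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(
        sign(a[t + 1] - a[t]) * sign(b[t + 1] - b[t])
        for t in range(len(a) - 1)
    )


def shift_is_admissible(shifted_a: Sequence[float],
                        current_a: Sequence[float],
                        b: Sequence[float]) -> bool:
    """Eq. (4): keep a candidate only if it does not reduce the gradient agreement."""
    return gradient_agreement(shifted_a, b) >= gradient_agreement(current_a, b)
```

In an alignment loop, a candidate shift would be evaluated for its cross correlation only when `shift_is_admissible` returns `True`, which discards shifts that increase the correlation while moving the two curves out of phase.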

Fig. 1. CoMy of motion sequences: (a) walk; (b) run; (c) sneak; (d) walk upstairs; (e) zombie walk; and (f) drunk walk. For all panels, the x-axis is time and the y-axis is CoMy.

III. EXPERIMENTS

We first present in Fig. 2 the time alignment results of the CoMy of different lower body motions using the proposed algorithm. As shown in Fig. 2, the original CoMy (dotted line) in most cases does not align with the target CoMy (solid line). Direct computation of their cross correlation is thus meaningless. The proposed algorithm tries to align the CoMy. For motions that have a similar rhythm (see Fig. 2(a) to (d)), the proposed algorithm can always align the original CoMy to a form (dashed line) close to the target one. For motions that have a different rhythm (see Fig. 2(e) and (f)), the aligned CoMy is still quite far from the target one.
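The sensitivity of Eq. (2) to misalignment can be illustrated with synthetic data. The Python sketch below uses hypothetical sinusoidal curves, not the CoMy data of Fig. 2, and the variable names are chosen for this example only. Two curves of identical shape that are a quarter period out of phase give a cross correlation close to 0, while the same curve shifted back into phase gives a value close to 1. A constant offset, such as a bias introduced by the ground condition, does not change the value because the mean is subtracted.

```python
import math
from typing import Sequence


def xcorr(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross correlation of two equal-length sequences, Eq. (2)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


n, period, lag = 40, 20, 5   # two full periods; lag is a quarter period
target = [math.sin(2 * math.pi * t / period) for t in range(n)]
misaligned = [math.sin(2 * math.pi * (t - lag) / period) for t in range(n)]
realigned = misaligned[lag:] + misaligned[:lag]   # circular shift back by `lag`
biased = [v + 0.1 for v in target]                # constant offset only

print(xcorr(target, misaligned))  # close to 0: same shape, wrong timing
print(xcorr(target, realigned))   # close to 1 once the timing matches
print(xcorr(target, biased))      # close to 1: the offset is removed by the mean
```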

TABLE I shows the cross correlation results between the CoMy of different lower body motions after applying the proposed time alignment algorithm. For motions that have a similar rhythm (e.g. run and walk, run and sneak), the cross correlation approaches 1 after time alignment. For those that differ greatly in rhythm (e.g. run and drunk walk, run and zombie walk), the cross correlation value is close to 0 after time alignment. Note that the proposed algorithm also performs very well for motions that have a similar rhythm but are conducted under different ground conditions (e.g. run and walk on slope, run and walk upstairs): the cross correlation value after alignment is well above 0.5. These results verify that the proposed time alignment algorithm can effectively identify motions that are similar in rhythm.

Fig. 2. CoMy of motion sequences after using the proposed time alignment algorithm: (a) run and walk; (b) sneak and walk; (c) run and walk on slope; (d) run and walk upstairs; (e) run and zombie walk; and (f) run and drunk walk. Each panel plots the target CoMy, the original CoMy and the aligned CoMy. For all panels, the x-axis is time and the y-axis is CoMy.

TABLE I. CROSS CORRELATION RESULTS

| $CoM_y^a$ | $CoM_y^b$ | Xcorr after time alignment |
|---|---|---|
| Run | Walk | 0.968 |
| Run | Sneak | 0.967 |
| Sneak | Walk | 0.966 |
| Run | Walk with punch | 0.983 |
| Run | Walk on slope | 0.791 |
| Run | Walk upstairs | 0.647 |
| Run | Drunk walk | 0.019 |
| Run | Zombie walk | 0.178 |
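To make the role of the user-defined threshold concrete, the short Python sketch below applies a threshold to the values of TABLE I. The value 0.5 used for the threshold is an assumption chosen only for illustration; the threshold actually used in the experiments is not stated here, and the names in the code are hypothetical.

```python
from typing import Dict, List, Tuple

# Cross correlation after time alignment, copied from TABLE I.
TABLE_I: Dict[Tuple[str, str], float] = {
    ("Run", "Walk"): 0.968,
    ("Run", "Sneak"): 0.967,
    ("Sneak", "Walk"): 0.966,
    ("Run", "Walk with punch"): 0.983,
    ("Run", "Walk on slope"): 0.791,
    ("Run", "Walk upstairs"): 0.647,
    ("Run", "Drunk walk"): 0.019,
    ("Run", "Zombie walk"): 0.178,
}

LAMBDA_0 = 0.5  # illustrative threshold, an assumption of this sketch


def similar_rhythm(value: float, threshold: float = LAMBDA_0) -> bool:
    """A pair is treated as similar in rhythm if its xcorr exceeds the threshold."""
    return value > threshold


feasible: List[Tuple[str, str]] = [
    pair for pair, value in TABLE_I.items() if similar_rhythm(value)
]
rejected: List[Tuple[str, str]] = [
    pair for pair, value in TABLE_I.items() if not similar_rhythm(value)
]
```

With this illustrative threshold, the six pairs with regular rhythm are accepted and the two pairs involving drunk walk and zombie walk are rejected, which matches the grouping discussed above.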

Fig. 3 shows three examples in which the partial body motions fulfil Condition 1 and are spliced together with time alignment using the proposed algorithm. In Fig. 3a, the upper body defence motion originally accompanies a lower body motion with uniform rhythm. It can thus be spliced with other locomotion that behaves similarly, such as walking (Fig. 3a) and sneaking (Fig. 3b), with negligible visual artefact. In Fig. 3c, the uniform upper body motion of running is shown to splice smoothly with the lower body motion of walking upstairs, although the latter is conducted under a different ground condition. This is because they have a similar rhythm. With the proposed time alignment algorithm, the swinging of the arms matches the legs while walking up the stairs. For all three examples, there is no visible artefact in the spliced motions. They verify the cross correlation results in TABLE I.

Fig. 3. Examples of synthesized motion fulfilling Condition 1. Each box represents a frame of the motion with respect to time. (a) Defence (upper body) and walking (lower body); (b) defence (upper body) and sneaking (lower body); and (c) running (upper body) and walking upstairs (lower body).

If a lower body motion is irregular in time, it is difficult to splice it with an upper body motion that originally had a regular lower body motion. Fig. 4 shows three examples of such cases. The first example in Fig. 4a attempts to splice a regular upper body motion of running with an irregular lower body motion of zombie walking. As their rhythms are so different, directly splicing them leads to unnatural motion even when the proposed alignment algorithm is used. For example, the 3rd and 4th boxes in Fig. 4a show that the arms swing before the next foot step. Fig. 4b shows a similar result, but the upper body motion is a regular punch. The third example in Fig. 4c demonstrates the result of splicing the upper body motion of running with the irregular lower body motion of drunk walking. The arms do not swing synchronously with the lower body motion, and thus the result looks unrealistic. The above results are consistent with the low cross correlation values of these cases shown in TABLE I.

Fig. 4. Examples of synthesized motion not fulfilling Condition 1. (a) Running (upper body) and zombie walking (lower body); (b) punching (upper body) and zombie walking (lower body); and (c) running (upper body) and drunk walking (lower body).

As the traditional approach allows upper body motions to be swapped only if they have similar lower body postures across time, the number of possible upper body motions that can be spliced with lower body motions is quite limited. The proposed algorithm addresses this problem through its relaxed requirement, which in turn improves the reusability of the motion data as more motion sequences can be synthesized. To show the improvement achieved by the proposed algorithm, a comparison study was conducted. It found that, with the traditional approach, swappable upper body motions account for only 26% of the motion sequences in our motion database. By relaxing the requirement as stated in Condition 1, the percentage of swappable upper body motions rises to 58%. For all these swappable upper body motions, a user perception test was performed to ensure that the resulting motion sequences are perceptually acceptable.

IV. CONCLUSION

For splicing partial body motion sequences, we have shown that the similarity in lower body motions is the key to synthesizing smooth and natural motions. However, the traditional approach, which requires lower body motions to have very similar postures, may be too restrictive and limits the possibility of splicing partial body motion sequences. We showed that, for common locomotion, similarity in rhythm may be enough for synthesizing smooth and natural motions. We then proposed a new algorithm to measure the similarity in rhythm of lower body motions. It iteratively aligns a lower body motion and then measures its cross correlation with the target lower body motion. The experimental results show that the new algorithm successfully identifies lower body motions having a similar rhythm. It is applicable to lower body motions that have different postures and timings and that may be performed under different ground conditions. The new algorithm greatly increases the reusability of motion data without sacrificing the quality of the synthesized motions.

ACKNOWLEDGEMENT

This work is fully supported by the Hong Kong Polytechnic University (grant no. 4-904V).

REFERENCES

[1] T. Kwon and S. Y. Shin, “Motion Modeling for On-line Locomotion Synthesis,” Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA’05), pp. 29-38, 2005.

[2] T. Kwon, T. S. Cho, S. I. Park and S. Y. Shin, “Two-Character Motion Analysis and Synthesis,” IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 3, pp. 707-720, 2008.

[3] L. Kovar, M. Gleicher, and F. Pighin, “Motion Graphs,” ACM Transactions on Graphics, vol. 21, issue 3, pp. 473-482, 2002.

[4] L. Zhao and A. Safonova, “Achieving Good Connectivity in Motion Graphs,” Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA’08), pp. 127-136, 2008.

[5] L. Kovar and M. Gleicher, “Automated Extraction and Parameterization of Motions in Large Data Sets,” ACM Transactions on Graphics, vol. 23, issue 3, pp. 559-568, 2004.

[6] R. Heck, L. Kovar, and M. Gleicher, “Splicing Upper-body Actions with Locomotion,” Comp. Graph. Forum, vol. 25, pp. 459-466, 2006.

[7] W. W. L. Ng, C. S. T. Choy, D. P. K. Lun and L. P. Chau, “Synchronized Partial-body Motion Graphs,” Proc. ACM SIGGRAPH ASIA 2010 Sketches, no. 28, 2010.

[8] L. Kovar and M. Gleicher, “Flexible Automatic Motion Blending with Registration Curves,” Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA’03), pp. 214-224, 2003.
