A New Fast Block Matching Algorithm for Video Files with Much Inter-frame Difference

Hyeonwoo Nam and Sungchae Lim

Department of Computer Science, Dongduk Women's University
23-1 Wolgok-Dong, Sungbuk-Gu, Seoul, Korea

{hwnam, sclim}@dongduk.ac.kr

Abstract: - For video coding, two performance factors must be considered: the search speed and the quality of the coded video. Since there is often a trade-off between these factors, it is not easy to develop an algorithm that is efficient in both respects. CDS and HEXBS have been proposed as two feasible video coding algorithms based on block matching motion estimation. Although they perform better than a naive full search algorithm, they cannot provide the best coding speed and image quality for video files with little inter-frame redundancy. To address this, we propose a new block matching algorithm that works with small cross and flat hexagon search patterns. Extensive performance evaluations show that our algorithm outperforms both CDS and HEXBS in terms of search speed and coded video quality. Using our algorithm, we can improve the search speed by up to 51%, with a PSNR (Peak Signal-to-Noise Ratio) difference of at most 0.7 dB.

Key-Words: - Motion estimation, Fast block matching algorithm, Video coding, Video compression

1 Introduction
In motion-compensated coding, motion estimation (ME) is used to express the differences between successive video frames, thereby reducing the overall data size required to code a video file. Motion estimation aims to code the pels (pixels) of a target frame efficiently by exploiting the image similarity hidden in the corresponding reference frame [1-11]. For fast motion estimation, block matching schemes are widely used [1, 11].

Block matching motion estimation divides the target image frame being encoded into equal-sized pixel blocks and determines, within the reference image frame, a best matching block for each of these image (or pixel) blocks. Here, the best match of an image block B in a target frame is the reference image block that is estimated to be the most similar to B. The correspondence between a target image block and its best match is generally expressed with a motion vector.

As a basic block matching algorithm for detecting best matches, a full search algorithm has been proposed [1]. Although this algorithm is well suited to efficient H/W implementation, it suffers from heavy use of CPU time. To provide a faster block matching speed, many variants have been proposed [2-11]. Among them, algorithms such as DS (Diamond Search) [11], CBHS (Center-biased Hybrid Search) [10], HEXBS (Hexagon-based Pattern Search) [7, 9], and CDS (Cross Diamond Search) [4] are widely accepted for their relatively fast search speed. The efficiency of these block matching algorithms largely depends on the shape and size of their search patterns. Since the best match is searched for by following the points of a search pattern while computing the block distortion measure (BDM) of each point with respect to the target image block, the search patterns employed strongly affect both the search speed and the image quality of coded videos.

In this paper, we propose a new algorithm for fast block-based ME. Our method uses two types of search patterns: a small cross search pattern and a flat hexagon search pattern. In the first step of the search, the small cross search pattern is applied to detect a best match in slowly changing video scenes. If a best match is not detected with the small cross search pattern alone, the flat hexagon pattern is then applied repeatedly. To show the performance benefits of the proposed algorithm, we conduct experimental evaluations against existing block matching algorithms. The experimental results show that the proposed algorithm outperforms the others in terms of search speed, while preserving the high image quality of the video files being coded.

Proceedings of the 6th WSEAS Int. Conf. on Systems Theory & Scientific Computation, Elounda, Greece, August 21-23, 2006 (pp58-63)

The rest of this paper is organized as follows. In Section 2, we summarize previous research on block matching algorithms. The proposed algorithm is presented in Section 3, and performance comparisons are given in Section 4. Lastly, we conclude this paper in Section 5.

2 Related Works
In general, a video stream has a large amount of temporal redundancy between its successive image frames, and block matching ME capitalizes on this redundancy. Block matching is a procedure that searches the reference frame for a best match with respect to an image block of the corresponding target frame. The best match for an image block B in a target frame is the image block that exists in the associated reference frame and has the minimum BDM against block B. The position of the best match is expressed with a motion vector of the form (x, y), where the integers x and y give the horizontal and vertical displacement of the best match from the top-left point of block B. By coding such motion vectors between reference frames and target frames, we can reduce the data size of video files [1-11].
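As a point of reference, the exhaustive (full search) form of block matching described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: frames are assumed to be lists of pixel rows, SAD is assumed as the BDM, and the names `sad`, `full_search`, `mb` (block size) and `w` (maximum displacement) are hypothetical.

```python
def sad(target, reference, x, y, dx, dy, mb):
    """SAD between the target block at (x, y) and the reference block at (x + dx, y + dy)."""
    return sum(
        abs(target[y + j][x + i] - reference[y + dy + j][x + dx + i])
        for j in range(mb)
        for i in range(mb)
    )


def full_search(target, reference, x, y, mb, w):
    """Examine every displacement in [-w, w] x [-w, w] and return the best one."""
    height, width = len(reference), len(reference[0])
    # Start from the zero motion vector so that ties favour (0, 0).
    best_mv, best_cost = (0, 0), sad(target, reference, x, y, 0, 0, mb)
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            # Skip candidates whose block would leave the reference frame.
            if not (0 <= x + dx <= width - mb and 0 <= y + dy <= height - mb):
                continue
            cost = sad(target, reference, x, y, dx, dy, mb)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv
```

Every candidate inside the window costs one BDM computation, which is the expense that the fast algorithms discussed in this section try to avoid.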

Fig. 1 depicts the distribution of motion vectors for six popular sample videos. In this figure, the peak at (0, 0) shows that the motion vector (0, 0) has the largest occurrence density. Here, motion vector (0, 0) means that the best match has the same frame coordinates as its target image block. In Fig. 1, more than 54 percent of motion vectors have the value (0, 0). Moreover, more than 71 percent of motion vectors belong to the set consisting of (0, 0), (-1, 0), (1, 0), (0, -1) and (0, 1), so that for most blocks the distance between a best match and its target image block is not greater than 2.
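The center-biased statistic discussed above can be measured on any list of motion vectors with a few lines of code. The sketch below is only illustrative: the function name `center_bias` is hypothetical, and the input is assumed to be a list of (x, y) tuples produced by some block matching run.

```python
from collections import Counter

# The zero vector and its four horizontal/vertical neighbours.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def center_bias(motion_vectors):
    """Return the share of (0, 0) vectors and the share of vectors in the 5-point cross."""
    counts = Counter(motion_vectors)
    total = len(motion_vectors)
    zero_share = counts[(0, 0)] / total
    cross_share = sum(counts[v] for v in CROSS) / total
    return zero_share, cross_share
```

A high cross share suggests that a small pattern around the origin will terminate the search early for most blocks.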

Based on this distribution characteristic of motion vectors, algorithms such as CDS and HEXBS have been proposed. The CDS algorithm uses the three types of search patterns depicted in Fig. 2 for fast block matching [4]. To search for a best match of an image block B with its top-left point at (x, y), CDS first places the center of the cross-shaped pattern (CSP) of Fig. 2(a) at (x, y) in the reference frame. If the pattern’s center has the minimum BDM among the nine points in this pattern, then CDS stops its search procedure and returns the motion vector (0, 0).

Fig. 1 Motion vector distribution of six sample videos. (Horizontal axes: motion vector components from -7 to 7; vertical axis: density of motion vector (%), from 0 to 60.)

Otherwise, the search patterns in Fig. 2(b) and Fig. 2(c) are used, which are called the large diamond-shaped pattern (LDSP) and the small diamond-shaped pattern (SDSP), respectively. Owing to space limitations, further details are omitted here.

Fig. 2 Search patterns used in the CDS algorithm: (a) CSP, (b) LDSP, (c) SDSP.

As a method similar to CDS, the HEXBS (Hexagon-based Search) algorithm has been proposed. The search patterns used in HEXBS are shown in Fig. 3. Using the large hexagon-shaped search pattern (LHSP) of Fig. 3(a), HEXBS repeatedly checks whether the center point of the LHSP has the minimum BDM, moving the pattern to the minimum BDM point whenever it does not. Once the center has the minimum BDM, the center point of the small hexagon-shaped search pattern (SHSP) of Fig. 3(b) is placed at the center of the LHSP. Then, the minimum BDM point among the five points of the SHSP is chosen as the motion vector [7, 9]. The CDS and HEXBS algorithms have a faster search speed than a full search algorithm, while preserving a high image quality.


Fig. 3 Search patterns used in the HEXBS algorithm: (a) LHSP, (b) SHSP.

3 Proposed Block Search Algorithm
3.1 Idea Sketch
From the distribution density of motion vectors in Fig. 1, we can see that the five motion vectors (0, 0), (0, 1), (0, -1), (1, 0) and (-1, 0) account for more than 74% of best matches. Based on this observation, we choose a small cross pattern with five points for the first search step. Besides these five points, six other points, namely (-2, 0), (-1, 1), (-1, -1), (1, 1), (1, -1), and (2, 0), have a high occurrence density. To use these six points as a search pattern, we employ a flat hexagon search pattern.

Our new search patterns are shown in Fig. 4. The pattern of Fig. 4(a) is the small cross search pattern, used to check the points (0, 0), (0, 1), (0, -1), (1, 0) and (-1, 0). The pattern of Fig. 4(b) is the flat hexagon search pattern, covering the six points (-2, 0), (-1, 1), (-1, -1), (1, 1), (1, -1), and (2, 0). By combining the two search patterns, we detect the best match quickly. Moreover, owing to the flat hexagon pattern of Fig. 4(b), our method performs better when coding video frames that contain frequently changing scenes or quickly moving objects. The pattern of Fig. 4(c) is the small hexagon search pattern, which is used to decide the best match at the last search step.

Fig. 4 Proposed search patterns in our algorithm: (a) SCSP, (b) FHSP, (c) SHSP.

The proposed algorithm is summarized as follows. At step 1, we center the small cross search pattern of Fig. 4(a) at (0, 0) and compute the BDMs of its five search points. If the center point has the minimum BDM, the algorithm immediately selects the center point as the motion vector and ends the search procedure. Otherwise, at step 2, the flat hexagon pattern of Fig. 4(b) is applied. After placing the center of the flat hexagon pattern on the minimum BDM point selected at step 1, we compute the BDMs of the points in the flat hexagon pattern. If the center of the flat hexagon is the minimum BDM point, then step 3 is executed; otherwise, the algorithm re-centers the flat hexagon pattern on the new minimum BDM point and repeats this step until the center point has the minimum BDM. At step 3, the center of the small hexagon of Fig. 4(c) is placed at the minimum BDM point found at step 2. After computing the BDMs of the small hexagon points, the minimum BDM point among them is selected as the motion vector.

3.2 Proposed Algorithm
In this subsection, we present the detailed algorithm of the proposed block matching method. As the BDM, we use the SAD (sum of absolute differences) used in [1, 10, 11]. The SAD value is computed using equation (1). In this equation, MB is the size of the image block to be coded. The notation I_t(x, y) represents the color of the pixel at (x, y) in the target frame, while I_{t-1}(x, y) is the color of the pixel in the reference frame. Here, (dx, dy) represents the displacement between the reference image block and the target image block.

$$\mathrm{SAD}(dx, dy) = \sum_{x=1}^{MB} \sum_{y=1}^{MB} \left| I_t(x, y) - I_{t-1}(x + dx, y + dy) \right| \tag{1}$$
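Equation (1) maps directly onto a double loop. The following sketch assumes that frames are stored as lists of pixel rows (so a pixel is read as `frame[row][column]`), that indices are 0-based, and that the caller keeps the displaced block inside the reference frame; the function name `sad` and its parameter names are illustrative only.

```python
def sad(target, reference, x, y, dx, dy, mb):
    """Sum of absolute differences for one candidate displacement (dx, dy).

    (x, y) is the top-left pixel of the mb-by-mb target block; the reference
    block is read at the same position shifted by (dx, dy).
    """
    total = 0
    for j in range(mb):        # vertical offset inside the block
        for i in range(mb):    # horizontal offset inside the block
            current = target[y + j][x + i]
            candidate = reference[y + dy + j][x + dx + i]
            total += abs(current - candidate)
    return total
```

A SAD of zero means that the displaced reference block reproduces the target block exactly; larger values indicate a poorer match.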

The details of the proposed block matching algorithm are given in Fig. 5. The algorithm returns a motion vector for the image block tBlock, which has its top-left point at (x, y) in the target frame.

In the algorithm, if the best match for tBlock has the same coordinates as tBlock, then the motion vector (0, 0) is returned. Additionally, like other algorithms [1-11], we limit the search to a maximum displacement of seven pixels in each direction. With this maximum search window, a best match is searched for within the range from -7 to 7 along both axes.


Algorithm Find-Best-Match(rFrame, tBlock, x, y)
Input: rFrame: given reference frame,
       tBlock: target block to be coded,
       (x, y): coordinates of the top-left pixel of tBlock.
Output: MV: the motion vector of tBlock.
(1) Place the center of the small cross search pattern at (x, y) of rFrame; let Points be the points in the small cross search pattern.
(2) Calculate the SAD of each point in Points.
(3) If the minimum SAD point is the center point of Points, then set MV to (0, 0) and exit.
(4) Let pMin be the minimum SAD point among Points. Place the center of the flat hexagon search pattern at the position of pMin.
(5) Let HPoints be the points in the flat hexagon search pattern; calculate the SAD of each point in HPoints.
(6) If the minimum SAD point is not the center point of HPoints, then move the center of the flat hexagon search pattern to that point and repeat from (5); otherwise, let pCenter be the center point of HPoints and perform the next step.
(7) Place the center of the small hexagon search pattern at the position of pCenter; let SHPoints be the points of the small hexagon search pattern.
(8) Calculate the SAD of each point in SHPoints; determine the minimum SAD point among SHPoints and let pMin be the point found.
(9) Let (d1, d2) be the frame coordinates of pMin; then set MV to (d1-x, d2-y) and exit.
End Algorithm.

Fig. 5 The proposed algorithm for a best match search.
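The steps of Fig. 5 can be sketched in runnable form as follows. This is an illustrative reading, not the authors' code. Several details are assumptions: frames are lists of pixel rows, the search works directly in displacement coordinates (equivalent to computing (d1-x, d2-y) at the end), candidates outside the frame or the search window are skipped, ties are resolved in favour of the pattern center, and the small hexagon pattern of Fig. 4(c), whose points are not listed in the text, is taken to be the center plus its four nearest neighbours. All function and variable names are hypothetical.

```python
# Pattern offsets as (dx, dy); the center comes first so that ties favour it.
SCSP = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]                      # Fig. 4(a)
FHSP = [(0, 0), (-2, 0), (-1, 1), (-1, -1), (1, 1), (1, -1), (2, 0)]   # Fig. 4(b)
SHSP = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]                      # Fig. 4(c), ASSUMED


def sad(target, reference, x, y, dx, dy, mb):
    """SAD of equation (1) for the displacement (dx, dy)."""
    return sum(
        abs(target[y + j][x + i] - reference[y + dy + j][x + dx + i])
        for j in range(mb)
        for i in range(mb)
    )


def best_point(target, reference, x, y, mb, w, center, pattern):
    """Return the minimum-SAD displacement among the pattern points around center."""
    height, width = len(reference), len(reference[0])
    best, best_cost = None, None
    for px, py in pattern:
        dx, dy = center[0] + px, center[1] + py
        if abs(dx) > w or abs(dy) > w:
            continue  # outside the search window
        if not (0 <= x + dx <= width - mb and 0 <= y + dy <= height - mb):
            continue  # block would leave the reference frame
        cost = sad(target, reference, x, y, dx, dy, mb)
        if best_cost is None or cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best


def find_best_match(target, reference, x, y, mb=16, w=7):
    """Small cross first, then repeated flat hexagon, then small hexagon."""
    center = (0, 0)
    p_min = best_point(target, reference, x, y, mb, w, center, SCSP)
    if p_min == center:
        return (0, 0)                      # steps (1)-(3): early termination
    center = p_min
    while True:                            # steps (4)-(6)
        p_min = best_point(target, reference, x, y, mb, w, center, FHSP)
        if p_min == center:
            break
        center = p_min
    # Steps (7)-(9): final refinement around the converged center.
    return best_point(target, reference, x, y, mb, w, center, SHSP)
```

Because a new center is accepted only when its SAD is strictly smaller than that of the current center, the flat hexagon loop cannot revisit a point and therefore terminates.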

4 Performance Evaluations
4.1 Experiment Environment
For the performance evaluation, we use four CIF video files (352 by 288 pixels), an SIF video file (352 by 240 pixels), and a QCIF video file (176 by 144 pixels). To show our performance advantages, we conduct several experimental comparisons with other algorithms, namely FS (Full Search), DS, CDS, and HEXBS, in terms of the search speed and the coded video’s quality. To run the block matching algorithms under evaluation, we use a PC server with a Pentium 4 CPU and 512 MB of memory.

As measures of the coded video’s quality, we employ two popular ones, MAD (Mean Absolute Difference) and PSNR (Peak Signal-to-Noise Ratio) [7, 9, 11]. MAD and PSNR values are obtained with equations (2) and (3), respectively.

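$$\mathrm{MAD} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} \left| O_t(x, y) - E_t(x, y) \right| \tag{2}$$

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \tag{3}$$

where

$$\mathrm{MSE} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ O_t(x, y) - E_t(x, y) \right]^2$$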
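Equations (2) and (3) can be evaluated with a short sketch such as the one below. It assumes 8-bit pixel values (peak 255, as in equation (3)) and frames stored as equally sized lists of pixel rows; the function names and the parameter names `o_frame` and `e_frame` (standing for the two frames compared by O_t and E_t) are illustrative.

```python
import math


def mad(o_frame, e_frame):
    """Mean absolute difference between two equally sized frames, equation (2)."""
    n_pixels = len(o_frame) * len(o_frame[0])
    total = sum(
        abs(o - e)
        for o_row, e_row in zip(o_frame, e_frame)
        for o, e in zip(o_row, e_row)
    )
    return total / n_pixels


def mse(o_frame, e_frame):
    """Mean squared error between two equally sized frames."""
    n_pixels = len(o_frame) * len(o_frame[0])
    total = sum(
        (o - e) ** 2
        for o_row, e_row in zip(o_frame, e_frame)
        for o, e in zip(o_row, e_row)
    )
    return total / n_pixels


def psnr(o_frame, e_frame, peak=255):
    """Peak signal-to-noise ratio in dB, equation (3); infinite for identical frames."""
    error = mse(o_frame, e_frame)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Since the MSE appears in the denominator of equation (3), a smaller error yields a larger PSNR, whereas a smaller error yields a smaller MAD.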

In the formulas above, M and N represent the sizes of the video frames along the horizontal and vertical axes, respectively. O_t(x, y) represents the color of the pixel at (x, y) in the reference frame, while E_t(x, y) is the color of the corresponding pixel in the target frame. According to the definitions of MAD and PSNR, a coded video yielding a lower MAD and a higher PSNR is said to have a better image quality. Throughout our experiments, we will show that our proposed algorithm provides a faster block search speed than other algorithms, while maintaining high image quality as measured by MAD and PSNR.

4.2 Experiment Results
Table 1 shows the experimental results concerning the search speed of the block matching algorithms. In this table, our proposed algorithm is denoted by FHEX (Flat-Hexagon Search Algorithm) for short. As stated before, three different video formats are evaluated for generality. In this table, the PN columns give the numbers of points at which the BDM needs to be computed to detect a best match. The SRATE columns give the improvement rate of each algorithm’s search speed with respect to FS (full search). For example, if an algorithm has a value of 20 in the SRATE column, then that algorithm is 20 times as fast as FS in the search for a best match.
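As a worked illustration of how PN and SRATE relate, the sketch below assumes that SRATE is the ratio of the number of points examined by full search to the PN of the algorithm in question, and that full search examines every displacement in the range from -7 to 7 along both axes. Both assumptions are one plausible reading of the description above, not a formula stated in the paper, and the PN value in the usage note is hypothetical.

```python
def full_search_points(w=7):
    """Number of candidate displacements in a window of +/- w pixels."""
    return (2 * w + 1) ** 2


def srate(pn, w=7):
    """Speed-up over full search, ASSUMED to be the ratio of BDM computations."""
    return full_search_points(w) / pn
```

Under these assumptions, full search evaluates 225 points per block, so a hypothetical algorithm averaging 11.25 points per block would have an SRATE of 20.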

As Table 1 shows, our algorithm provides the best search speed among the algorithms tested. The performance gain is largest in the case of Akiyo, where it reaches up to 240% compared with the DS algorithm. In addition, in the case of Football, which contains the largest inter-frame difference, our algorithm also achieves a better search speed. This efficiency of block searching mainly comes from the use of our flat hexagon search pattern.


Table 1 Comparisons of search speeds.

As Table 1 shows, the proposed algorithm provides the best coding speed. Since a fast search could adversely affect the coded image quality, it is important to ensure that a faster search procedure does not degrade the image quality of coded video files. Table 2 shows how well our algorithm preserves image quality. The MAD and PSNR columns in the table correspond to the image quality measures of equations (2) and (3), respectively. As the evaluation results show, the proposed algorithm provides nearly equivalent image quality compared with the other algorithms. From this, we can say that our fast coding speed does not impair the image quality of coded videos.

Table 2 Coded video's qualities in terms of MAD and PSNR.

Fig. 6 shows that the proposed algorithm runs efficiently for video files with much difference between successive frames, as well as for video files with less inter-frame difference. In Fig. 6(a), the CDS algorithm provides a faster search time than HEXBS in the case of Akiyo, which does not include quick scene changes and thus has less inter-frame difference. However, CDS does not retain this performance advantage in the experiment of Fig. 6(b). In this case, the HEXBS algorithm provides a better search time than CDS, because of the large inter-frame difference contained in the Football test video. From this, we can say that the CDS and HEXBS algorithms are sensitive to the degree of temporal redundancy hidden in video files. Unlike them, the proposed algorithm provides a better search performance in both the Football and Akiyo cases, and is thus robust to the degree of temporal redundancy.

(a) BDM computations for Akiyo frames. (b) BDM computations for Football frames. (Each panel plots PN against the frame number, from 0 to 100, for DS, CDS, HEXBS, and FHEX.)

Fig. 6 Comparisons of the BDM computing numbers with respect to the image frame sequence.

5 Conclusions
CDS and HEXBS have been proposed as two feasible algorithms for block matching motion estimation (ME). Compared with the naive full search algorithm, these algorithms offer a very high search speed while preserving the image quality of the coded video. Although they have such desirable characteristics, their performance advantages are rather sensitive to the inter-frame redundancy of the compressed video. That is, the CDS algorithm performs better than HEXBS for videos with less difference between video frames, while HEXBS outperforms CDS when coded frames differ greatly because of fast moving objects or frequent scene changes.

To overcome these shortcomings, we proposed a new block matching algorithm that is based on a small cross search pattern and a flat hexagon search pattern. The performance experiments show that our algorithm performs better than both the CDS and HEXBS algorithms in terms of the search speed and the coded video’s quality. In particular, our algorithm provides performance advantages regardless of the amount of temporal redundancy between video frames. Using our algorithm, we can improve the search speed by up to 51%, with a PSNR (Peak Signal-to-Noise Ratio) difference of at most 0.7 dB.

References:
[1] Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video, ISO/IEC 13818-2 (MPEG-2 Video), 2000.

[2] Tae-Gyoung Ahn, Yong Ho Moon, and Jae Ho Kim, "Fast full-search motion estimation based on multilevel successive elimination algorithm", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, No. 11, pp. 1265-1269, Nov. 2004.

[3] G. Calvagno, F. Fantozzi, R. Rinaldo, and A. Viareggio, "Model-based global and local motion estimation for videoconference sequences", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, No. 9, pp. 1156-1161, Sep. 2004.

[4] C. H. Cheung and L. M. Po, "A Novel Cross-Diamond Search Algorithm for Fast Block-Matching Motion Estimation", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 12, pp. 1168-1177, Dec. 2002.

[5] T. Sappasitwong, S. Aramvith, S. Jitapunkul, A. Tamtrakarn, P. Kitti-punyangam, and H. Kortrakulkij, "Adaptive asymmetric diamond search algorithm for block-based motion estimation", In Proc. of the International Conference on Digital Signal Processing, pp. 563-566, July 2002.

[6] T. Sappasitwong, S. Aramvith, S. Jitapunkul, A. Tamtrakarn, P. Kitti-punyangam, and H. Kortrakulkij, "Adaptive asymmetric diamond search algorithm for block-based motion estimation", In Proc. of the International Symposium on Video/Image Processing and Multimedia, pp. 16-19, June 2002.

[7] Ce Zhu, Xiao Lin, and Lap-Pui Chau, "Hexagon-Based Search Pattern for Fast Block Motion Estimation", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 5, pp. 349-355, May 2002.

[8] Weiguo Zheng, I. Ahmad, and Ming Lei Liou, "Adaptive motion search with elastic diamond for MPEG-4 video coding", In Proc. of the International Conference on Image Processing, pp. 377-380, Oct. 2001.

[9] Ce Zhu, Xiao Lin, Lap-Pui Chau, Keng-Pang Lim, Hock-Ann Ang, and Choo-Yin Ong, "A novel hexagon-based search algorithm for fast block motion estimation", In Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1593-1596, May 2001.

[10] Sung-Chul Shin, Hyun-ki Baik, Myong-Soon Park, and Dong Sam Ha, "A center-biased hybrid search method using plus search pattern for block motion estimation", In Proc. of the IEEE International Symposium on Circuits and Systems 2000, Geneva, Vol. 4, pp. 309-312, May 2000.

[11] Shan Zhu and Kai-Kuang Ma, "A New Diamond Search Algorithm for Fast Block-Matching Motion Estimation", IEEE Transactions on Image Processing, Vol. 9, No. 2, pp. 287-290, Feb. 2000.
