A hexagon-based fast search method for block motion estimation in video encoding
BACKGROUND OF THE INVENTION
The invention relates to a method for motion estimation, a method for encoding a video sequence, a motion estimation device and a video encoder.
In the field of video encoding, i.e. the encoding of a video sequence, the compression of video data has become very important for reducing the amount of data that must be transmitted and/or stored to represent a plurality of pictures at a quality which is sufficiently high for a user.
A very important factor with respect to video data compression is the motion estimation between subsequent pictures of the video sequence, which is used to extract motion information from the video sequence. The extracted motion information is used for the reduction of temporal redundancy between subsequent video pictures.
Block-matching motion estimation, which aims to exploit the strong temporal redundancy between successive frames, is widely applied in many motion-compensated video coding techniques and standards such as ISO MPEG-1/2/4 and ITU-T H.261/262/263/263+/263L. By partitioning a current frame into non-overlapping rectangular blocks/macroblocks of equal size, a block matching method attempts to find the block in a reference frame (a past or future frame) that best matches a predefined block in the current frame. Matching is performed by minimizing a matching criterion, which in most cases is the mean absolute error between this pair of blocks. The block in the reference frame moves inside a search window (region) centered on the position of the block in the current frame. The best matched block, i.e. the block producing the minimum distortion, is searched for within the search window (region) in the reference frame. The displacement of the current block with respect to the best matched reference block in the x and y directions constitutes the motion vector assigned to this current block.
However, motion estimation is computationally intensive and can consume up to 80% of the computational power of the encoder if a full search is used, i.e. if all possible candidate blocks within a predefined search window are evaluated exhaustively. Fast algorithms are therefore highly desirable to speed up the procedure significantly without sharply increasing the distortion.
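The cost of the full search can be illustrated with a minimal sketch. It assumes frames stored as lists of rows of integer sample values and uses the mean absolute error as the matching criterion; the function names and the parameters `bx`, `by`, `n` and `w` are hypothetical and chosen for illustration only.

```python
def mean_abs_error(cur, ref, bx, by, dx, dy, n):
    """Mean absolute error between the n x n block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, w):
    """Exhaustively test every displacement in [-w, w] x [-w, w]."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            # Skip candidate blocks that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            err = mean_abs_error(cur, ref, bx, by, dx, dy, n)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return (best[1], best[2]), best[0]


# Synthetic frames: the current 4 x 4 block at (6, 6) is a copy of the
# reference block displaced by (2, -1).
ref = [[x * x + 3 * y * y + x * y for x in range(16)] for y in range(16)]
cur = [[0] * 16 for _ in range(16)]
for y in range(4):
    for x in range(4):
        cur[6 + y][6 + x] = ref[5 + y][8 + x]

print(full_search(cur, ref, 6, 6, 4, 3))  # ((2, -1), 0.0)
```

With a window of ±w picture elements, up to (2w + 1)² candidate blocks are evaluated per block, which is the expense that the fast algorithms aim to avoid.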
Many computationally efficient variants have been developed, typical among which are the so-called three-step search, the new three-step search, the four-step search, the block-based gradient descent search and the diamond search algorithms.
In the block-based motion estimation, the search pattern with different shapes or sizes has a great impact on the reachable search speed and the resulting distortion performance.
On one hand, in the three-step search, the new three-step search, the four-step search and the block-based gradient descent search algorithms, square-shaped search patterns of different sizes are employed.
On the other hand, the diamond search algorithm, as described in [1] and [2], adopts a diamond-shaped search pattern, which has demonstrated faster processing time with marginally worse distortion in comparison with the three-step search, the new three-step search and the four-step search.
The search pattern used in the diamond search algorithm has a rectangular, diamond shape. Two different sizes of diamond are employed. The large diamond search pattern consists of nine checking points, of which eight checking points surround a center checking point. The small diamond search pattern consists of five checking points, of which four checking points surround a center checking point to compose the diamond shape.
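For reference, the two diamond patterns can be written as lists of offsets from the center checking point. The coordinates below follow the usual description of the diamond search in the literature, e.g. [1]; the constant names and the helper are illustrative only.

```python
# Offsets (dx, dy) from the center checking point, in picture elements.
LARGE_DIAMOND = [(0, 0),
                 (2, 0), (-2, 0), (0, 2), (0, -2),
                 (1, 1), (1, -1), (-1, 1), (-1, -1)]
SMALL_DIAMOND = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def city_block_radius(pattern):
    """Set of |dx| + |dy| values of all non-center checking points."""
    return {abs(dx) + abs(dy) for dx, dy in pattern if (dx, dy) != (0, 0)}


print(len(LARGE_DIAMOND), city_block_radius(LARGE_DIAMOND))  # 9 {2}
print(len(SMALL_DIAMOND), city_block_radius(SMALL_DIAMOND))  # 5 {1}
```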
However, there remains a need to improve the processing speed of motion estimation.
SUMMARY OF THE INVENTION
Therefore, it is an object of the invention to provide a method for block motion estimation which can be carried out with improved processing speed compared with the above-described diamond search algorithm.
The object is achieved with a method for block motion estimation for a picture block of a current picture with picture blocks, with reference to a reference picture, wherein the motion estimation is carried out in a search area with a hexagonal form in the reference picture preferably with respect to a picture block of the current picture and a plurality of reference picture blocks in the reference picture.
The motion estimation method is preferably used in the field of video encoding.
Furthermore, it is a further object of the invention to provide a motion estimation device which can carry out a motion estimation with improved processing speed compared with a motion estimation device which is arranged to carry out the motion estimation using the above-described diamond search algorithm.
The object is achieved with a motion estimation device for motion estimation for a picture block in a picture with picture blocks, with reference to a reference picture, preferably comprising a processor, which device or which processor is arranged in a way that the following steps can be carried out:
• a motion estimation in a search area in the reference picture is carried out, and
• the used search pattern in the motion estimation has a hexagonal form.
In an alternative motion estimation device, a separate unit for executing the individual step of the motion estimation and the involved search algorithm may be provided.
The motion estimation device is in particular suitable for the use in a video encoder.
The invention may be implemented using a special electronic circuit, i.e. in hardware, or using a computer program, i.e. in software.
The invention can generally be seen in a hexagon-based search algorithm in the block motion estimation in a sequence of pictures, i.e. a video sequence, where the search algorithm can achieve significant speed improvement over the diamond search algorithm with similar distortion performance.
Thus, the motion estimation may be processed with improved speed compared with the motion estimation using the diamond search algorithm.
In summary, the hexagon-based search scheme according to the invention may find any point in the motion field with fewer search points (also denoted as checking points) than the diamond search algorithm.
The larger the motion vector, i.e. the larger the motion between the two considered pictures, the more substantial the speed-up will be in terms of the number of checking points saved by the algorithm according to the invention compared with the diamond search algorithm.
The reference picture (reference frame) used for motion estimation may be a preceding or a following picture of the current picture in a sequence of pictures, i.e. in a video sequence.
Furthermore, coding information may be assigned to the picture elements of the current picture (current frame) and the reference picture. Coding information in this context may be luminance information or chrominance information, which is assigned to one or more picture elements in the respective picture.
According to this embodiment of the invention, the motion estimation is based on the coding information.
According to a further embodiment of the invention, the motion estimation comprises the step of determining a minimum block distortion between the picture block of the current picture and each considered candidate picture block of the plurality of candidate picture blocks of the reference picture in the search area.
According to a further embodiment of the invention, the search area comprises a plurality of checking points, each of which corresponds to a respective candidate picture block of the reference picture used for the motion estimation.
According to a further embodiment of the invention, each checking point corresponds to a block in the reference picture.
By this embodiment, the motion estimation is further simplified, thereby further improving the processing speed of the motion estimation.
According to a further embodiment of the invention, one checking point is arranged in the center of the search pattern and each further checking point is located at an endpoint of the search pattern.
Thus, the hexagon search area preferably comprises a checking point at each corner of the hexagon and one checking point in the center of the respective current search area.
According to this embodiment, the motion estimation is further improved in speed, especially when a second search pattern, which will be further described in detail, is moved in a way that the newly generated second search pattern contains only three new checking points, for which the calculation of the minimum block distortion with respect to the picture block of the current picture has to be provided.
This means that only three new block distortion calculations have to be carried out in the newly generated second search pattern.
Furthermore, the motion estimation may comprise the following steps:
• determining a minimum block distortion between the picture block of the current picture and each considered candidate picture block in the reference picture according to the search pattern,
• determining, whether the candidate picture block with the minimum block distortion is the candidate picture block corresponding to the center checking point in the search pattern,
• in case the candidate picture block with the minimum block distortion does not correspond to the center checking point of the search pattern,
- generating a second hexagonal search pattern around the checking point with the minimum block distortion,
- carrying out a new iteration of motion estimation in the second hexagonal search pattern, and
• in case the candidate picture block with the minimum block distortion is the candidate picture block corresponding to the center point of the search pattern, selecting this candidate picture block for further processing.
This embodiment of the invention with the iterative search algorithm leads to a further improved motion estimation result.
In order to further simplify the search, and thus the motion estimation, the search pattern and the second search pattern may have the same size, i.e. for example the same form and the same number of checking points.
According to a further embodiment of the invention, the motion estimation further comprises the following steps:
• generating a third search pattern in the first search pattern or in the second search pattern, in which the selected candidate picture block for further processing has been determined, the third search pattern being smaller in size than the search pattern and/or the second search pattern around the selected candidate checking point,
• determining a minimum block distortion between the picture block of the current picture and each considered candidate picture block of the plurality of candidate picture blocks of the reference picture in the third search pattern.
The wording "the third search pattern being smaller in size than the search pattern and/or the second search pattern" in this context means that the third search pattern includes fewer checking points than the search pattern and/or the second search pattern. In other words, fewer block distortion calculations have to be carried out, and the third search pattern is placed inside the search pattern and/or the second search pattern.
The used third search pattern in the reference picture may also have a hexagonal form.
The checking points in the third pattern area may be arranged in such a way that one checking point is in the center of the third search pattern, one half of the rest of the checking points of the third search pattern is located at an endpoint of the third search pattern and the other half of the rest of the checking points of the third search pattern is placed preferably in the middle between two endpoints of the smaller hexagon of the third search pattern.
According to a further embodiment of the invention, the search pattern and/or the second search pattern comprises/comprise seven checking points.
This special number of checking points and the respective arrangement of the checking point on the endpoints of the hexagon and the center of the hexagon has revealed a further improvement of speed in processing of the motion estimation.
According to a further embodiment of the invention, the third search pattern comprises five checking points: one checking point in the center of the third, smaller search pattern, two checking points located at two endpoints of the hexagon of the third search pattern around the center checking point, arranged on opposite sides, and two checking points on the edge of the hexagon in the middle of the longer edges of the hexagon, arranged on opposite sides, respectively.
The embodiments of the invention described above apply to the method of motion estimation, to the method for coding a video sequence, to the motion estimation device as well as to the video encoder.
A preferred embodiment of the invention will now be described with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a diagram illustrating an example of determining a search path finding a motion vector in a motion estimation algorithm according to a preferred embodiment of the invention;
Figures 2A and 2B show diagrams illustrating a larger hexagonal search pattern (Figure 2A) and a smaller hexagonal search pattern (Figure 2B) according to a preferred embodiment of the invention; and
Figure 3 shows a block diagram illustrating a video encoder according to a preferred embodiment of the invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT OF THE INVENTION
Fig.3 shows a video encoder 300 for encoding a digital video sequence 301.
The digital video sequence 301 is a sequence of subsequent digital pictures 302 (digital frames) comprising picture elements.
The picture elements of each picture 302 are grouped in picture blocks, each picture block comprising 8 x 8 picture elements or 16 x 16 picture elements.
In this context it should be pointed out that the invention is not restricted to a specific form of the picture blocks into which the picture elements are grouped; in particular, the invention is not restricted to a specific number of picture elements in a picture block.
Any form of picture block with any number of picture elements may be used in the motion estimation according to the invention.
Four picture blocks with luminance information and two picture blocks with chrominance information are grouped into one macroblock, respectively.
According to this preferred embodiment of the invention, each picture 302 comprises 16 x 16 macroblocks.
Luminance information and chrominance information are assigned to one picture element or to a plurality of picture elements (in the case of sub-sampling) as coding information.
In a selecting unit 303, each picture 302 is selected and it is determined whether it is to be coded in a so-called INTER-mode or in a so-called INTRA-mode.
If the selected picture is to be coded in the INTER-mode, i.e. in a differential mode, a motion estimation with reference to a reference picture is carried out in a motion estimation unit 304 for each macroblock in the current picture to be coded (denoted as first picture). The reference picture may be a preceding picture or a following picture of the first picture in the video sequence 301.
As a result of the motion estimation, a motion vector describing the displacement of the macroblock in the current picture compared with the selected best matched macroblock in the reference picture, is determined.
The coding mode 305 for the selected picture 302 and the determined motion vectors 306 are provided to and stored in a picture storage element 307.
The motion estimation will further be described in detail.
In the INTER-coding mode, the data 308 in the macroblocks (the coding information) is provided to a subtracting unit 309, where the data 310 of the macroblock from the reference picture, which macroblock has been matched and selected according to the motion estimation, is subtracted, thereby generating motion estimation error coefficients 311.
In the INTRA-coding mode, no subtraction and no motion estimation is provided. In the INTRA-coding mode, the entire luminance information and the entire chrominance information of the selected picture to be coded is coded, not only a differential signal as in the INTER-coding mode. In the INTRA-coding mode, INTRA-coefficients 311 are outputted by the subtracting unit 309.
The motion estimation error coefficients 311 or the INTRA coefficients 311 are provided to a transformation unit 312.
In the transformation unit 312, a discrete cosine transformation (DCT) is applied to the inputted motion estimation error coefficients 311 or the INTRA coefficients 311, thereby generating transformation coefficients 313, which are provided to a quantizing unit 314.
In the quantizing unit 314, the transformation coefficients 313 are quantized and scanned, thereby generating quantized transformation coefficients 315, which are provided to an entropy coding unit 316 and to an inverse quantizing unit 318.
In the entropy coding unit 316, a variable length encoding, such as a Lempel-Ziv encoding or a Huffman encoding, is performed on the quantized transformation coefficients 315. Furthermore, a run length encoding algorithm is performed on the variable length encoded quantized transformation coefficients.
The resulting coded data stream 317 is outputted by the video encoder 300 and, for example, transmitted to a video decoder unit.
Furthermore, in the inverse quantizing unit 318, the quantized transformation coefficients 315 are inversely quantized, thereby generating inverse quantized transformation coefficients 319, which are provided to an inverse transformation unit 320.
In the inverse transformation unit 320, an inverse discrete cosine transformation (IDCT) is performed on the inverse quantized transformation coefficients 319, thereby generating inverse transformed coefficients 321, which are provided to a first input 322 of an adder 323.
The data 310 of the macroblock from the reference picture, which macroblock has been matched and selected according to the provided motion estimation, is provided to a second input 324 of the adder 323, thus being added to the inverse transformed coefficients 321 in the INTER-mode, thereby generating a reconstructed signal 325.
The reconstructed signal 325 outputted by the adder 323 is stored in the picture storage element 307.
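The INTER-mode loop described above (subtraction, quantization, inverse quantization, addition) can be sketched as follows. This is a simplified illustration only: the DCT and IDCT stages are omitted, a uniform quantizer with a hypothetical step size `step` is assumed, and a block is represented as a flat list of sample values.

```python
def quantize(values, step):
    """Uniform quantization to integer levels (assumed quantizer)."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantization of the integer levels."""
    return [level * step for level in levels]


def code_block_inter(block, prediction, step):
    """Return the quantized residual and the reconstructed block."""
    # Subtracting unit: difference between current and matched block.
    residual = [b - p for b, p in zip(block, prediction)]
    levels = quantize(residual, step)
    # Adder: prediction plus the inversely quantized residual.
    restored = dequantize(levels, step)
    reconstruction = [p + r for p, r in zip(prediction, restored)]
    return levels, reconstruction


levels, recon = code_block_inter([100, 104, 98, 97], [101, 100, 99, 90], 4)
print(levels)  # [0, 1, 0, 2]
print(recon)   # [101, 104, 99, 98]
```

Storing the reconstructed block rather than the original keeps the reference picture in the encoder identical to the one a decoder can rebuild from the coded data.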
Now, the motion estimation algorithm according to the preferred embodiment of the invention will be described in detail.
It should be noted that the motion estimation algorithm now described is not restricted to the architecture of the above-described encoder 300.
The motion estimation algorithm may be applied to any other encoding architecture in which a motion estimation is needed, for example in an object-oriented encoding scheme like MPEG-4 or in a hybrid block-based and object-oriented encoding scheme.
Fig.2A shows a picture 200 with picture elements 201 and a large search pattern 202, in which the search during the motion estimation is performed.
The large search pattern 202 has the form of a hexagon, comprising seven checking points 203, 204, 205, 206, 207, 208, 209. The center checking point 203, which is placed in the center of the large search pattern 202, is surrounded by six endpoint checking points 204, 205, 206, 207, 208, 209, which are placed on the endpoints of the hexagon.
It should be noted that the hexagon can be rotated by 90 degrees.
Of the six endpoint checking points 204, 205, 206, 207, 208, 209 of the hexagon, two checking points 204, 207 are at a distance of two (measured in picture elements) from the center checking point 203, and the remaining four checking points 205, 206, 208, 209 are at a distance of √5 (the square root of 5) from the center checking point 203.
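The stated distances can be checked with a short sketch. It assumes that the two checking points at distance two lie on the horizontal axis; the orientation and the constant name are assumptions, and the pattern rotated by 90 degrees gives the same distances.

```python
import math

# Offsets (dx, dy) of the six endpoint checking points from the center,
# in picture elements (assumed horizontal orientation).
LARGE_HEXAGON_ENDPOINTS = [(2, 0), (-2, 0), (1, 2), (-1, 2), (1, -2), (-1, -2)]

for dx, dy in LARGE_HEXAGON_ENDPOINTS:
    print((dx, dy), round(math.hypot(dx, dy), 3))
# Two endpoints are at distance 2.0, four at distance 2.236 (square root of 5).
```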
Fig.2B shows the picture 200 with picture elements 201 and a small search pattern 210, in which the finer search during the motion estimation is performed as will be described later in detail.
The small search pattern 210 also has the form of a hexagon and comprises five checking points 211, 212, 213, 214, 215: a further center checking point 211, which is placed in the center of the small search pattern 210, and four further checking points 212, 213, 214, 215. Two of these, 212 and 214, are placed on endpoints of the hexagon on opposite sides. The remaining two, 213 and 215, are placed in the middle of the longer edges 216, 217 of the hexagon, also on opposite sides.
In other words, the further center checking point 211 is surrounded by the four further checking points 212, 213, 214, 215.
It should be noted that also this hexagon of the small search pattern 210 can be rotated by 90 degrees.
The four further checking points 212, 213, 214, 215 in the hexagon are at a distance of one (picture element) from the further center checking point 211.
With the two search patterns 202, 210, the search procedure is carried out as follows.
In the first step, the large search pattern 202 with the seven checking points 203, 204, 205, 206, 207, 208, 209, is used for search.
For each checking point of the seven checking points 203, 204, 205, 206, 207, 208, 209, the block distortion is determined by calculating, e.g., the sum of the squared differences.
It should be remarked that any other criterion for determining the block distortion may be used, for example the sum of the absolute differences.
After having calculated all seven distortions, the minimum one is selected and it is checked whether the minimum one corresponds to the center checking point 203.
If the optimum match is found at the center of the large search pattern 202, the method switches, in a second stage, to the small search pattern 210 including the four further checking points 212, 213, 214, 215 for the focused inner search.
As can be seen from Fig.1, the small search pattern 210 of Fig.2B is placed inside the large search pattern 202 in which the optimum match has been found at the center checking point. The small search pattern and the large search pattern have the same center checking point.
Otherwise, if the optimum match is found not to be at the center of the large search pattern 202, the search continues around the checking point with the corresponding minimum block distortion in a new large search pattern. Now, in the new large search pattern, the checking point with the corresponding minimum block distortion is at the center.
It should be noted that, as the large search pattern moves along the direction of decreasing distortion, only three new, non-overlapping checking points have to be evaluated as candidates each time, i.e. in each iteration.
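The overlap property can be verified by shifting the large pattern onto each of its six endpoints and counting the points not already covered. The sketch assumes the horizontally oriented hexagon offsets given below; the names are illustrative.

```python
# Offsets (dx, dy) of the large hexagon, center first (assumed orientation).
LARGE_HEXAGON = [(0, 0), (2, 0), (-2, 0), (1, 2), (-1, 2), (1, -2), (-1, -2)]


def new_points_after_move(pattern, move):
    """Checking points of the shifted pattern not covered by the old one."""
    old = set(pattern)
    shifted = {(x + move[0], y + move[1]) for x, y in pattern}
    return shifted - old


for move in LARGE_HEXAGON[1:]:
    fresh = new_points_after_move(LARGE_HEXAGON, move)
    print(move, len(fresh))  # 3 new checking points for every move
```

Four of the seven points of the shifted pattern coincide with points of the previous pattern, so their block distortions are already known.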
Fig.1 shows an example of the search path strategy leading to the motion vector (4, -4), where 20 (= 7 + 3 + 3 + 3 + 4) search points (checking points) are evaluated in five iterations sequentially.
As shown in Fig.1, the search begins within a first large search pattern 101 with seven first checking points 102, 103, 104, 105, 106, 107, 108, where the checking point 102 is located at the motion vector (0, 0). All seven checking points are marked as 1 in the circle point. The number "1" denotes the first iteration.
Each iteration includes the determination of the minimum block distortion as described above.
According to this embodiment, it is assumed that the checking point 106 has yielded the corresponding minimum block distortion in the first iteration. Note that the checking point 106 is not at the center of the first hexagon.
Thus, a second large search pattern 109 with seven second checking points 107, 102, 106, 110, 111, 112, 105 is formed, wherein the checking point 106 is the center checking point of the second large search pattern 109. Note that due to the overlapping of the first and second search patterns, only three new points (block matches) marked as 2 will be evaluated (they are the checking points 110, 111, 112) in the second iteration.
According to this embodiment, it is assumed that the checking point 112 of the second large search pattern 109 has yielded the corresponding minimum block distortion.
After having determined the minimum block distortion in the second large search pattern 109, since the corresponding checking point 112 with the minimum block distortion is not the center of the second large search pattern 109, a third large search pattern 113 is formed.
Thus, the third large search pattern 113 with seven checking points 111, 106, 107, 114, 115, 116, 112 is formed in the picture 200, wherein the checking point 112 is the center checking point of the third large search pattern 113. Again, only three new checking points, marked as 3, will be checked in the third iteration.
After having determined the minimum block distortion for the third large search pattern 113, since the corresponding checking point of the minimum block distortion is not the center of the third large search pattern 113, a fourth large search pattern 117 is formed.
According to this embodiment, it is assumed that the checking point 115 of the third large search pattern 113 has yielded the corresponding minimum block distortion.
Thus, the fourth large search pattern 117 with seven fourth-iteration checking points 115, 116, 112, 114, 118, 119, 120 is formed, wherein the checking point 115 is the center checking point of the fourth large search pattern 117. Again, only three new checking points (block matches), marked as 4, will be evaluated in the fourth iteration.
After having determined the minimum block distortion in the fourth large search pattern 117, since the corresponding checking point of the minimum block distortion is the center 115 of the fourth large search pattern 117, a small search pattern 125 is formed for the last iteration.
The small search pattern 125 consists of five checking points 115, 121, 122, 123, 124 and is formed inside the fourth large search pattern 117 in Fig.1. The checking point 115 (also the center of the fourth large search pattern 117) is the center checking point of the small search pattern 125.
Of the checking points 115, 121, 122, 123, 124 of the small search pattern 125, the block distortion has to be calculated only for the four new checking points 121, 122, 123, 124, marked as 5, in the fifth iteration.
Subsequently, the minimum of the calculated block distortions is determined, and the final motion vector corresponding to the minimum block distortion is obtained.
The motion estimation algorithm according to the preferred embodiment of the invention can be summarized in the following detailed steps.
Step 1) Starting:
The large hexagon pattern with 7 checking points (refer to Fig.2A) is centered at the defined coordinate position (0,0), the center of a predefined search window in the motion field. If the minimum block distortion checking point is found to be at the center of the hexagon pattern, proceed to Step 3 (Ending); otherwise, proceed to Step 2 (Searching).
Step 2) Searching:
With the minimum block distortion checking point in the previous search step as the center, a new large hexagon pattern is formed. Three new candidate checking points are checked, and the minimum block distortion checking point is again identified. If the minimum block distortion checking point is the center point of the newly formed hexagon pattern, then go to Step 3 (Ending); otherwise, repeat this step continuously.
Step 3) Ending:
Switch the search pattern from the large size of hexagon to the small size of hexagon (refer to Fig.2B). The four new checking points covered by the small hexagon pattern are evaluated to compare with the current minimum block distortion checking point. The new minimum block distortion checking point is the final solution of the motion estimation.
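The three steps can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the block distortion is supplied by the caller as a hypothetical function `distortion(dx, dy)`, the large hexagon is assumed to be horizontally oriented, the search window is bounded by a hypothetical `max_range`, and ties are resolved in favour of the center checking point.

```python
# Offsets (dx, dy) from the pattern center, center first (assumed layout).
LARGE_HEXAGON = [(0, 0), (2, 0), (-2, 0), (1, 2), (-1, 2), (1, -2), (-1, -2)]
SMALL_HEXAGON = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def hexagon_search(distortion, start=(0, 0), max_range=7):
    """Return (motion_vector, number_of_checking_points_evaluated)."""
    cache = {}

    def cost(point):
        if abs(point[0]) > max_range or abs(point[1]) > max_range:
            return float("inf")  # outside the search window
        if point not in cache:
            # Each checking point is evaluated at most once, so points
            # shared by successive patterns are not recomputed.
            cache[point] = distortion(*point)
        return cache[point]

    def best_of(center, pattern):
        points = [(center[0] + dx, center[1] + dy) for dx, dy in pattern]
        return min(points, key=cost)  # the first minimum wins: the center

    # Step 1 and Step 2: move the large hexagon until its center is best.
    center = start
    while True:
        best = best_of(center, LARGE_HEXAGON)
        if best == center:
            break
        center = best

    # Step 3: refine with the small pattern around the final center.
    return best_of(center, SMALL_HEXAGON), len(cache)


# Synthetic distortion with its minimum at the displacement (4, -4).
mv, evaluated = hexagon_search(lambda dx, dy: (dx - 4) ** 2 + (dy + 4) ** 2)
print(mv, evaluated)  # (4, -4) 20
```

For this synthetic distortion the sketch evaluates 7 + 3 + 3 + 3 + 4 = 20 checking points, the same count as given for the example of Fig.1. In an encoder, `distortion` would compute a measure such as the sum of absolute differences between the current block and the displaced reference block.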
The above process is applied to each block in a current frame for motion estimation. From the above procedure, it can easily be derived that the total number of search points per block is 7 + 3 × n + 4, where n is the number of executions of Step 2. The method according to the preferred embodiment of the invention attempts to find a sub-optimal motion vector using the least number of search points.
Compared with other known search patterns, the hexagon-based search pattern according to the preferred embodiment of the invention is found to be the most efficient in terms of minimizing the number of search points.
For example, the diamond search method, one of the most efficient algorithms known, requires 9 + M × n′ + 4 search points for each block, where M is either 5 or 3 depending on the search direction and n′ depends on the search distance. To find the same motion vector, n′ is always greater than or equal to the n of the method according to the invention. For small-motion image sequences, which imply small n or n′, the method according to the invention can generally save more than 15% of the search points used by the diamond search.
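The two search point counts can be compared with a small sketch. The function names are hypothetical; the diamond count takes one value of M (5 or 3) per move of the large diamond, as the text states that M depends on the search direction.

```python
def hexagon_points(n):
    """Search points of the hexagon-based search after n moves (Step 2)."""
    return 7 + 3 * n + 4


def diamond_points(moves):
    """Search points of the diamond search; `moves` lists M for each move."""
    if any(m not in (3, 5) for m in moves):
        raise ValueError("M must be 3 or 5")
    return 9 + sum(moves) + 4


def saving(hexagon, diamond):
    """Fraction of the diamond search points saved by the hexagon search."""
    return 1 - hexagon / diamond


print(hexagon_points(0), diamond_points([]))                      # 11 13
print(round(saving(hexagon_points(0), diamond_points([])), 3))    # 0.154
print(hexagon_points(3))                                          # 20
```

With no moves of the large pattern, the counts are 11 and 13, i.e. a saving of about 15.4%.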
Furthermore, for large-motion image sequences, many more search points can be saved using the method according to the invention, e.g. about 40% may be saved in some cases.
For the five motion vectors located at the zero point and its nearest four points at distance 1, the proposed hexagon-based search algorithm according to the preferred embodiment of the invention requires only 11 search points, compared with 13 search points for the diamond search algorithm.
More attractively, for large-motion image sequences, the method according to the preferred embodiment of the invention may use far fewer search points than the diamond search algorithm: around 40% (with an extreme of 45.5%) of the search points of the diamond search algorithm may be saved in certain cases. Equivalently, the speed improvement rate may be up to 83.3%.
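The relation between the two figures can be illustrated as follows, assuming that every search point costs the same and that the speed improvement rate is the number of saved search points relative to the number still evaluated. Both the definition and the fraction 5/11 (which rounds to 45.5%) are assumptions made for this illustration.

```python
def speed_improvement(saved_fraction):
    """Speed improvement rate for a given fraction of saved search points."""
    if not 0 <= saved_fraction < 1:
        raise ValueError("saved_fraction must be in [0, 1)")
    return saved_fraction / (1 - saved_fraction)


print(round(speed_improvement(5 / 11), 3))  # 0.833
print(round(speed_improvement(0.40), 3))    # 0.667
```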
The following publications are cited in this document:
[1] Shan Zhu, Kai-Kuang Ma, A new diamond search algorithm for fast block-matching motion estimation, IEEE Transactions on Image Processing, Vol. 9, No. 2, pp. 287 - 290, February 2000;
[2] Jo Yew Tham, Surendra Ranganath, Maitreya Ranganath, and Ashraf Ali Kassim, A novel unrestricted center-biased diamond search algorithm for block motion estimation, IEEE Transactions on Circuits & Systems for Video Technology, Vol. 8, No. 4, pp. 369 - 377, August 1998.