Summary of the invention
The object of the present invention is to provide an adaptive selection method for fast motion estimation based on motion complexity, which selects the search procedure according to a judgment of motion complexity and achieves higher coding efficiency at the same coding quality.
To achieve the above object, the technical solution adopted by the present invention is:
1) first, based on the temporal continuity of the video sequence, a frame-level judgment of motion complexity is made, using the motion complexity of the previous frame as the basis;
2) then, according to the result of the frame-level judgment, for frames that require further judgment, a macroblock-level judgment of motion complexity is made based on the spatial correlation of the video sequence;
3) finally, according to the frame-level and macroblock-level judgments of motion complexity, a motion estimation method is selected for the current macroblock and motion estimation is performed.
The frame-level judgment of the present invention comprises the following steps:
1) calculate the single-frame macroblock prediction error mb_SAD, i.e. the sum of absolute errors between the macroblocks of the previous frame and their best-matching macroblocks, where SAD_i is the sum of absolute differences (SAD) between the i-th macroblock and its best-matching macroblock, and N is the total number of macroblocks in the frame;
2) calculate the texture complexity of the previous frame: a 3 × 3 Sobel operator is used to compute the horizontal gradient Grad_H(i, j) and the vertical gradient Grad_V(i, j) of the gray value of the pixel I(i, j) at position (i, j), giving the gradient information value of that pixel:
gradient(i, j) = |Grad_H(i, j)| + |Grad_V(i, j)|
The gradient information gradient, which describes the texture complexity of the whole frame, is obtained from the per-pixel values gradient(i, j), where i and j are the horizontal and vertical coordinates of the pixel position, and I is the set of pixels of the frame;
3) make the frame-level judgment by comparing the single-frame macroblock prediction error with the texture complexity of the frame. If the texture complexity is greater than or equal to the single-frame macroblock prediction error, the macroblock prediction error is considered to come mainly from the texture of the image itself and the image motion is relatively simple, so the diamond search method is adopted; otherwise the image motion is considered relatively complex and the hierarchical motion estimation method may be needed, that is:
When gradient ≥ mb_SAD: flag_HME = 0, and the diamond search method is used;
When gradient < mb_SAD: flag_HME = 1, and the frame so judged proceeds to the macroblock-level judgment of motion complexity, based on the spatial correlation of the video sequence.
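The frame-level steps above can be sketched as follows. This is only an illustration under stated assumptions: the frame is a list of rows of gray values, the whole-frame quantities are taken as plain sums (the source formulas are not reproduced in this text), border pixels are skipped, and the function names frame_gradient and frame_level_flag are hypothetical.

```python
# Illustrative sketch only. Assumptions: plain sums for mb_SAD and gradient,
# border pixels skipped, standard 3x3 Sobel kernels.

SOBEL_H = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # horizontal gradient kernel
SOBEL_V = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # vertical gradient kernel


def frame_gradient(frame):
    """Accumulate |Grad_H| + |Grad_V| over the interior pixels of a frame."""
    rows, cols = len(frame), len(frame[0])
    total = 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            grad_h = grad_v = 0
            for di in range(3):
                for dj in range(3):
                    pixel = frame[i + di - 1][j + dj - 1]
                    grad_h += SOBEL_H[di][dj] * pixel
                    grad_v += SOBEL_V[di][dj] * pixel
            total += abs(grad_h) + abs(grad_v)
    return total


def frame_level_flag(prev_frame, prev_macroblock_sads):
    """Return flag_HME for the current frame from previous-frame statistics."""
    mb_sad = sum(prev_macroblock_sads)      # single-frame prediction error (assumed sum)
    gradient = frame_gradient(prev_frame)   # frame texture complexity (assumed sum)
    # 0: diamond search for the whole frame; 1: continue at macroblock level
    return 0 if gradient >= mb_sad else 1
```

Here prev_macroblock_sads would hold the SAD_i values left over from encoding the previous frame, so the decision costs one Sobel pass per frame and no extra block matching.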
The macroblock-level judgment of the present invention comprises the following steps:
1) form the motion vector prediction (MVP) set of the current macroblock: this set consists of the motion vectors of the spatially adjacent macroblocks (upper, lower, upper-left, upper-right), the motion vector of the macroblock at the same position in the previous frame, and the zero motion vector (0, 0);
2) calculate the motion complexity of the region containing the macroblock: this complexity is represented by diff_MVP, the difference among the predictors in the MVP set of the current macroblock. In the definition of diff_MVP, MVPnum is the number of available predictors in the MVP set of the current macroblock, (px, py) is the predictor in the MVP set with the minimum matching error, and (px_i, py_i) are the predictors in the MVP set other than (px, py);
3) obtain the adaptive threshold value_dis for the macroblock-level judgment: this threshold is the mean diff_MVP of the macroblocks that have used the diamond search method during encoding;
4) make the macroblock-level judgment using the motion complexity and the adaptive threshold obtained in the previous two steps:
When diff_MVP > value_dis, the macroblock region has complex motion, and the hierarchical motion estimation method is used;
When diff_MVP ≤ value_dis, the macroblock region has simple motion, and the diamond search method is used.
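The macroblock-level decision might look like the sketch below. The exact expression for diff_MVP is not reproduced in this text, so the code assumes one plausible form, the mean city-block distance between the best predictor (px, py) and the remaining predictors (px_i, py_i); that form and the function names are assumptions, not the method as claimed.

```python
# Illustrative sketch only. The diff_MVP expression used here (mean
# city-block distance to the best predictor) is an ASSUMED form.

def diff_mvp(best, others):
    """Spread of the MVP set around the predictor with minimum matching error."""
    if not others:
        return 0.0
    px, py = best
    total = sum(abs(pxi - px) + abs(pyi - py) for pxi, pyi in others)
    return total / len(others)


def select_search_method(best, others, value_dis):
    """Return "HME" for complex motion and "DS" for simple motion."""
    return "HME" if diff_mvp(best, others) > value_dis else "DS"
```

For example, with best predictor (1, 1), other predictors (5, 1) and (1, -3), and value_dis = 2.0, the assumed diff_MVP is 4.0 and the sketch selects HME.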
The present invention is a multi-mode selection method based on video content. Building on an analysis of various motion estimation methods, it selects the most suitable search method according to the motion of the current video content. The method first compares several classic motion estimation methods and exploits the complementarity in search performance that different methods show on video of different motion complexity; by judging the motion complexity of the video content, it selects a suitable method for motion estimation. Unnecessary search steps are thereby reduced, and motion estimation efficiency is improved while image quality is maintained.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings.
Traditional motion estimation methods apply the same search procedure to different videos. Although this can give good search results, it ignores the motion complexity of the video itself, so for video sequences whose motion intensity varies greatly the overall search is not optimal, and coding efficiency can still be improved.
The method proposed by the present invention is based on an analysis of two existing typical motion estimation methods: the diamond search method (DS, Diamond Search) and the hierarchical motion estimation method (HME, Hierarchical Motion Estimation). It exploits their complementarity in search performance and takes into account the motion complexity of the video content to select a suitable search method for the current macroblock, thereby shortening the overall motion estimation time and improving coding efficiency.
DS is a fast block-matching method based on search patterns; its search pattern approximates a circle and is therefore nearly isotropic. Starting from the initial point, the method expands outward step by step in a diamond shape according to the LDSP (Large Diamond-Shaped Pattern) template; once the best point of a step lies at the center of the template, a fine search is performed around that point according to the SDSP (Small Diamond-Shaped Pattern) template. The LDSP and SDSP templates and the search procedure are shown in Fig. 1 and Fig. 2.
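A minimal sketch of the DS procedure described above is given here. It assumes the commonly used 9-point LDSP and 5-point SDSP offsets (Fig. 1 is not reproduced in this text) and takes the matching error as a caller-supplied function cost(mv); the names diamond_search and cost are hypothetical.

```python
# Illustrative sketch only. Assumptions: the usual 9-point LDSP and 5-point
# SDSP offsets, and a start vector that lies inside the search range.

LDSP = [(0, 0), (0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, 0), (0, -1), (1, 0), (0, 1), (-1, 0)]


def diamond_search(cost, start=(0, 0), search_range=16):
    """Return the motion vector found by an LDSP walk followed by one SDSP step."""

    def best_around(center, pattern):
        candidates = [
            (center[0] + dx, center[1] + dy)
            for dx, dy in pattern
            if abs(center[0] + dx) <= search_range
            and abs(center[1] + dy) <= search_range
        ]
        # The center is listed first, so it wins ties and the walk terminates.
        return min(candidates, key=cost)

    center = start
    while True:
        best = best_around(center, LDSP)
        if best == center:          # minimum at the template center: stop expanding
            break
        center = best
    return best_around(center, SDSP)   # fine search with the small diamond
```

In an encoder, cost(mv) would typically evaluate the SAD of the current macroblock against the reference block displaced by mv; caching already evaluated points, which practical implementations do, is omitted for brevity.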
HME is a multi-resolution block-matching method. As shown in Fig. 3, the present invention uses averaging to construct a two-level image pyramid. From the bottom level upward the image resolution decreases: one pixel in the upper level corresponds to a 4*4 image region in the lower level, and its value is the mean of that region. The HME search is a multi-pass scheme: as shown in Fig. 4, the motion vector obtained on the low-resolution image serves as the initial value for motion estimation on the high-resolution image, which shortens the overall motion estimation time. In Fig. 3 and Fig. 4, a is the macroblock size at Level 1, K*a is the macroblock size at Level 0, and K is the scale factor between the macroblock sizes of the two levels.
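The two-level pyramid can be sketched as below, assuming the image is a list of rows of gray values whose dimensions are multiples of 4. Scaling the coarse motion vector by the same factor before refinement is an assumption about how the initial value is carried to Level 0; the function names are hypothetical.

```python
# Illustrative sketch only. Assumptions: image dimensions are multiples of
# `factor`, and the Level 1 motion vector is scaled by `factor` for Level 0.

def downsample_by_mean(image, factor=4):
    """Build the upper level: each pixel is the mean of a factor x factor block."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    upper = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block_sum = sum(
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            row.append(block_sum / (factor * factor))
        upper.append(row)
    return upper


def build_pyramid(image, factor=4):
    """Return [Level 0, Level 1], i.e. full resolution and its averaged version."""
    return [image, downsample_by_mean(image, factor)]


def initial_vector_for_level0(coarse_mv, factor=4):
    """Scale a Level 1 motion vector to use as the Level 0 search start."""
    return (coarse_mv[0] * factor, coarse_mv[1] * factor)
```

The cost of building Level 1 is paid once per frame, which is why the frame-level judgment tries to avoid it for frames with simple motion.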
To compare the search performance of DS and HME, let mbnum be the number of macroblocks in a frame, and let (px_i, py_i) and (x_i, y_i) be, respectively, the initial search position and the best position found for the i-th macroblock in the frame. The search complexity of a frame is defined as:
[formula 1]
Six video clips of different motion complexity were each tested 10 times. For each complexity value, the number of pixels taking part in macroblock matching (the number of matching points) and the motion estimation time were recorded. The number of matching points is used as a uniform measure because, in the HME method, matching one search point costs a different amount of computation on the low-resolution image than on the high-resolution image. As the search complexity increases, the difference between the search strategies of DS and HME is reflected in the set of comparison curves in Fig. 5.
The above analysis shows that the two methods are complementary. The DS method is simple and suits video sequences with gentle motion, but it is time-consuming for sequences with large motion changes. The HME method, by contrast, uses a coarse-to-fine localization process and can estimate large motion vectors effectively, but it must additionally build the pyramid structure, which is unnecessary for background blocks and blocks with small motion vectors. To overcome these limitations, the present invention proposes a fast motion estimation selection method based on motion complexity judgment.
As shown in Fig. 6, the present invention consists of two main steps: 1. a frame-level judgment of the motion complexity of the video image; 2. for frames that the frame-level judgment identifies as possibly requiring hierarchical motion estimation, a further macroblock-level judgment of motion complexity. According to the result of the motion complexity judgment, a suitable method is selected for motion estimation of the current macroblock.
The first step of the present invention is the frame-level judgment of the motion complexity of the video image. Based on the temporal continuity of the video sequence, the motion complexity of the previous frame is used as the basis for judging the current frame. In motion estimation, the prediction error of a macroblock, i.e. the residual, comes mainly from two sources: the texture of the image itself and the motion complexity. The more complex the image texture or the motion, the larger the macroblock prediction error. The single-frame macroblock prediction error is defined as the sum of absolute errors between the macroblocks of the frame and their best-matching macroblocks, that is:
[formula 2]
where SAD_i is the sum of absolute differences (SAD) between the i-th macroblock and its best-matching macroblock, and N is the total number of macroblocks in the frame.
The texture complexity of an image can be described by its gradient information. The present invention uses the following 3 × 3 Sobel operator
[formula 3]
to compute the horizontal gradient Grad_H(i, j) and the vertical gradient Grad_V(i, j) of the gray value of the pixel I(i, j) at position (i, j), giving the gradient information value of that pixel:
[formula 4]
gradient(i, j) = |Grad_H(i, j)| + |Grad_V(i, j)|
The gradient information gradient describing the texture complexity of the whole frame can then be obtained by the following formula:
[formula 5]
where i and j are the horizontal and vertical coordinates of the pixel position, and I is the set of pixels of the frame. The single-frame macroblock prediction error is compared with the texture complexity of the frame. If the texture complexity is greater than or equal to the single-frame macroblock prediction error, the macroblock prediction error is considered to come mainly from the texture of the image itself and the image motion is relatively simple, so the DS method is adopted; otherwise the image motion is considered relatively complex and the HME method may be needed, that is:
When gradient ≥ mb_SAD: flag_HME = 0, and the DS search is used;
When gradient < mb_SAD: flag_HME = 1, the image pyramid is built, and the frame so judged proceeds to the macroblock-level judgment of motion complexity, based on the spatial correlation of the video sequence.
The second step of the present invention is the macroblock-level judgment of motion complexity for frames that may use HME. Based on the spatial correlation of the video sequence, the motion vector prediction (MVP) set of the current macroblock consists of the motion vectors of the spatially adjacent macroblocks (upper, lower, upper-left, upper-right), the motion vector of the macroblock at the same position in the previous frame, and the motion vector (0, 0). The motion complexity of the macroblock region is judged by the difference among the predictors in the MVP set, denoted diff_MVP:
[formula 6]
where MVPnum is the number of available predictors in the MVP set, (px, py) is the predictor in the MVP set with the minimum matching error, and (px_i, py_i) are the predictors in the MVP set other than (px, py). The larger the difference, the more complex the motion of the macroblock. To judge the motion complexity of the macroblock region effectively, the present invention uses as threshold the mean diff_MVP (value_dis) of the macroblocks that have used the DS method during encoding, that is:
When diff_MVP > value_dis, the macroblock region has complex motion, and the HME method is used.
When diff_MVP ≤ value_dis, the macroblock region has simple motion, and the DS method is used.
When encoding of a video sequence starts, value_dis is initialized to zero; as the number of encoded frames increases, the value of value_dis gradually stabilizes.
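One way to maintain value_dis is a running mean that starts at zero, as sketched below. Treating the threshold as a cumulative mean over every macroblock coded with DS so far in the sequence is an assumption, and the class name AdaptiveThreshold and its method are hypothetical.

```python
# Illustrative sketch only. Assumption: value_dis is the cumulative mean of
# diff_MVP over all macroblocks coded with DS so far in the sequence.

class AdaptiveThreshold:
    """Running mean of diff_MVP over macroblocks that used diamond search."""

    def __init__(self):
        self.value_dis = 0.0   # initial threshold at the start of a sequence
        self._sum = 0.0
        self._count = 0

    def update(self, diff_mvp, method):
        """Record one coded macroblock; only DS macroblocks move the threshold."""
        if method == "DS":
            self._sum += diff_mvp
            self._count += 1
            self.value_dis = self._sum / self._count
        return self.value_dis
```

Because the mean is taken over macroblocks that used DS, macroblocks in frames with flag_HME = 0 would presumably also be passed to update, which lets the threshold rise from its initial zero.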
After the integer-pixel motion search above, half-pixel and quarter-pixel searches are carried out on the resulting motion vector to obtain the final motion vector. If the current macroblock is not the last macroblock of the frame, the macroblock-level judgment continues with the next macroblock; otherwise the frame-level judgment is made for the next frame.
To verify the effectiveness of the present invention, the following experiments were carried out on the jm86 source code. The coding structure is IPPP, the previous frame is used as the reference, the maximum search range is 16, and motion estimation uses the 16*16 macroblock mode. The test material consists of the standard sequence football.cif (90 frames) and two other clips cut from a football match, Football2.cif (100 frames) and Football3.cif (100 frames). These three sequences alternate between quasi-static and intense motion, so they test the adaptive selection performance of the new method effectively. Each sequence was tested with DS, HME, the frame-level adaptive method (FLA, Frame-Level Adaptive) and the two-level adaptive method (TLA, Two-Level Adaptive); to obtain an average motion estimation time, each clip was tested 10 times.
To verify the decision performance of the frame-level adaptive method, only the frame-level judgment of motion complexity was applied to the test sequences, i.e. all macroblocks in frames with flag_HME = 1 were searched with the HME method. The results are shown in Fig. 7. In the figure, the average number of matching points of the adaptive method for flag_HME = 0 is slightly lower than that of the DS method. This is mainly because the traditional DS method must handle all motion conditions and, to guarantee image quality, uses more than one initial search position, which adds extra search steps. The adaptive method has already judged the motion conditions in advance, so the original DS search strategy is simplified to a DS search with a single initial position. The frame-level decisions in the figure show that the adaptive threshold follows changes in image complexity and selects the search mode correctly.
DS, HME and the two-level adaptive method were compared in terms of average number of matching points and image quality; the results are shown in Table 1 and Table 2. The data in Table 1 show that the two-level adaptive method clearly reduces the average number of matching points and shortens the encoding time compared with the DS and HME methods. Note that the implementation of the new method was not optimized for factors such as memory and buffering, so the reduction in search points reflects the performance improvement of the method better than the reduction in running time does.
Table 1. Comparison of the DS, HME and two-level adaptive (TLA) methods
(a) Average number of matching points
(b) Motion estimation time
Table 2 gives the image quality comparison of the three methods; the two-level adaptive method achieves essentially the same image quality as the DS and HME methods.
Table 2. Image quality comparison of the DS, HME and two-level adaptive (TLA) methods
Table 3 compares the results of the two-level adaptive method and the frame-level adaptive method. The macroblock-level judgment is an effective complement to the frame-level judgment: it further reduces the number of search points and speeds up the search.
Table 3. Performance improvement of the two-level judgment over the frame-level judgment
(a) Average number of matching points
(b) Motion estimation time
As described above, the present invention provides a way of selecting the search procedure by judging motion complexity, and, following this idea, an adaptive selection method for fast motion estimation based on motion complexity. The method combines the respective advantages of the DS and HME methods: by making frame-level and macroblock-level judgments of the motion complexity of the video sequence, it selects a fast motion estimation method suited to the current macroblock. When motion is relatively simple, its search performance is essentially the same as that of the DS method; when motion complexity is high, its performance is close to that of the HME method. The encoding time is thus shortened effectively while image quality is maintained. It is worth noting that the decision thresholds for motion complexity depend on the video itself and are adaptive, so no manual interaction is needed.