CN101815218B

CN101815218B - Fast motion estimation video coding method based on macroblock characteristics

Info

Publication number: CN101815218B
Application number: CN 201010140709
Authority: CN
Inventors: 刘鹏宇; 贾克斌
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2010-04-02
Filing date: 2010-04-02
Publication date: 2012-02-08
Anticipated expiration: 2030-04-02
Also published as: CN101815218A

Abstract

The invention discloses a fast motion estimation video coding method based on macroblock characteristics, relating to the field of video compression coding. The method comprises the following steps: first, extracting the luminance information of the current macroblock to be coded from the original video data; second, designing a simple motion estimation search template for integer-pixel motion vector search and distributing the search points reasonably; then classifying the motion intensity of the current predicted macroblock according to a rate-distortion criterion, formulating an early-termination strategy for the motion search that matches the motion characteristics of the current predicted macroblock, dynamically adjusting the number of search layers and search points of the motion estimation, and adaptively selecting the corresponding motion estimation search criterion. Compared with the motion estimation search method adopted in the H.264 video coding standard, the method effectively accelerates the motion estimation search, reduces motion estimation time, keeps the bit-rate increase under strict control, maintains good reconstructed image quality, and thus achieves fast motion estimation coding.

Description

Fast motion estimation video coding method based on macroblock characteristics

Technical field

The present invention relates to the field of video compression coding, and designs and implements a fast video motion estimation coding method based on macroblock characteristics.

Background technology

H.264/AVC, the latest-generation video coding standard jointly developed by the ITU and ISO, offers better compression performance and error resilience, adapts to a variety of network environments, and has significant research and application value. However, the improvement in H.264 coding efficiency comes at the cost of high computational complexity at the encoder, which makes real-time use difficult.

Research and analysis show that the vast majority of the computational complexity of an H.264 encoder comes from inter prediction coding features that differ from earlier coding standards: inter coding modes with seven variable block sizes and shapes, namely 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4 (see Fig. 1); 1/4-pixel motion vectors; multiple reference frames; and the use of Lagrangian rate-distortion optimization (RDO) to select the optimal motion estimate and inter coding mode. Experimental analysis shows that motion estimation with multiple reference frames and all inter coding modes accounts for more than 85% of the total encoder computation, and motion estimation with a single reference frame still accounts for at least 60%. This large encoding-time overhead therefore makes it urgent to reduce the computational complexity of the motion estimation module.

Motion estimation (ME) is a core technique widely used in inter-frame video coding and the most fundamental and important method for removing temporal redundancy from video data. The quality of motion estimation directly affects coding efficiency and reconstructed image quality, and research on motion estimation algorithms has long attracted attention from both industry and academia.

Many motion estimation and motion compensation methods exist. The two main families are pixel-recursive algorithms (PRA) and block-matching algorithms (BMA). Compared with PRA, BMA greatly simplifies the problem; although its accuracy is lower, it tracks displacement well and is easy to implement, so it has been widely adopted. Current mainstream video coding standards (H.261/263/264, MPEG-1/2/4, etc.) generally use block-matching motion estimation.

The working principle of block-matching motion estimation is shown in Fig. 2. The video frame is divided into equal-sized M × N blocks, and all pixels within a block are assumed to undergo the same translational motion; the subscript t denotes time. The displacement of a block between times t-Δt and t is called the motion vector (MV). Motion estimation is the process of searching for the matching block of the current coding block within a given window Ω of the reference frame F_{t-Δt} and thereby finding the optimal motion vector. In block matching, a correlation function R(x, y) expresses the correlation between the current macroblock and the reference macroblock: the larger R(x, y) is, the stronger the correlation between the current block and the prediction block and the better the match. The sum of absolute differences (SAD) is commonly used to measure this correlation.

SAD (x, y) = MN \times MAD (x, y) = Σ_{m = 1}^{M} Σ_{n = 1}^{N} | f_{k} (m, n) - f_{k - 1} (m + x, n + y) |

Here f_{k}(m, n) denotes the luminance value of the current-frame macroblock, f_{k-1}(m + x, n + y) denotes the luminance value of the reference-frame macroblock, and M and N denote the size of the current macroblock. R(x, y) denotes the correlation function between the current macroblock and the reference macroblock; under this criterion, the correlation function of the current macroblock and the predicted macroblock is R(x, y) = 1/SAD(x, y).
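The SAD criterion above, and the full search (FS) that serves as the accuracy baseline, can be sketched as follows. This is a minimal illustration, not the encoder's implementation: frames are assumed to be plain 2-D lists of luminance values indexed as frame[m][n], and the names `sad`, `full_search`, `m0`, `n0`, and `w` are hypothetical.

```python
def sad(cur, ref, m0, n0, M, N, x, y):
    """SAD between the M x N block of `cur` at (m0, n0) and the block of
    `ref` displaced by (x, y), following the index pairing of the formula."""
    return sum(
        abs(cur[m0 + m][n0 + n] - ref[m0 + m + x][n0 + n + y])
        for m in range(M)
        for n in range(N)
    )


def full_search(cur, ref, m0, n0, M, N, w):
    """Evaluate every displacement in [-w, w] x [-w, w] that keeps the
    reference block inside the frame; return (best_sad, (x, y))."""
    best = None
    for x in range(-w, w + 1):
        for y in range(-w, w + 1):
            inside = (0 <= m0 + x and m0 + x + M <= len(ref)
                      and 0 <= n0 + y and n0 + y + N <= len(ref[0]))
            if not inside:
                continue
            cost = sad(cur, ref, m0, n0, M, N, x, y)
            if best is None or cost < best[0]:
                best = (cost, (x, y))
    return best
```

A window of half-width w costs up to (2w + 1)^2 SAD evaluations per block, which is why the fast algorithms discussed next try to visit far fewer candidates.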

In motion estimation, the matching criterion has little effect on estimation accuracy; to further reduce computational complexity,

Among existing block-matching motion estimation algorithms, the full search (FS) algorithm has the highest search accuracy, but its computational complexity is too high for real-time applications. Researchers have therefore proposed various fast block-matching motion estimation algorithms, such as the three-step search, the new three-step search, the four-step search, the diamond search, MVFAST, and PMVFAST. However, all of these fast algorithms lose search accuracy to varying degrees relative to FS, and their search speed still does not meet practical requirements well. It is therefore necessary to find more efficient block-matching motion estimation algorithms.

Motion estimation generally comprises two parts: integer-pixel motion estimation, and fractional-pixel motion estimation around the optimal point found by the integer-pixel search. Fast integer-pixel motion estimation in particular has received much attention, because traditional fractional-pixel motion estimation (e.g., 1/2 pixel) accounts for only a small share of the total motion estimation computation. The main recent result in optimizing integer-pixel motion estimation for H.264/AVC is UMHexagonS (Unsymmetrical-Cross Multi-Hexagon Search), proposed by J. M. Zhang et al. Compared with full search, this method reduces motion estimation computation by 90% at a comparable bit rate, with an average peak signal-to-noise ratio (PSNR) drop of less than 0.05 dB, so it maintains good rate-distortion performance and is one of the most effective fast search algorithms currently available. The H.264/AVC reference software has adopted UMHexagonS since version JM7.6. The algorithm combines several search algorithms, which greatly improves the effectiveness and robustness of prediction. It comprises four steps, each using a different search pattern (see Fig. 3). The basic steps are as follows:

Step 1: prediction of the initial search point.

Several prediction modes, including median prediction, upper-layer prediction, motion vector prediction from the corresponding block of the previous frame, and motion vector prediction from the neighboring reference frame, are used to obtain the predicted motion vector MV_{pred}, so that the initial search point lies closer to the optimal prediction point.

Step 2: unsymmetrical cross search.

This step relies on the general assumption that, in most natural image sequences, motion in the horizontal direction is much more intense than in the vertical direction, so an unsymmetrical cross search can be used to optimize the algorithm. As shown in Step 2 of Fig. 3, the horizontal search range of the unsymmetrical cross search is twice the vertical search range, and adjacent search points are two pixels apart. The best matching point serves as the initial search center for the next step.

Step 3: non-uniform multi-layer hexagon-grid search.

This step is completed in two stages.

Stage 1 (Step 3-1): centered on the best matching point determined in Step 2, a full search of the 5 × 5 integer-pixel points in the 4 × 4 region around it is performed, as shown in Step 3-1 of Fig. 3.

Stage 2 (Step 3-2): starting from the best matching point determined in Step 3-1, a non-uniform multi-layer hexagon-grid search is performed, as shown in Step 3-2 of Fig. 3. This search handles large and irregular motion, and the algorithm always performs 4 layers of non-uniform hexagon-grid search. Because the horizontal motion amplitude of natural image sequences is generally larger than the vertical, the algorithm uses a 16-point non-uniform hexagon as the basic search pattern, as shown in Fig. 4.

Step 4: extended symmetric hexagon search.

Step 4-1: a symmetric six-point hexagon search with a step size of 2 pixels is performed first. The rate-distortion cost of the center point is compared with those of the 6 surrounding vertices, and the point with the minimum rate-distortion cost becomes the center of the next search; this repeats until the center point itself has the minimum rate-distortion cost, as shown in Step 4-1 of Fig. 3.

Step 4-2: the step size is then reduced to 1 pixel and a diamond search is performed, repeating the procedure of Step 4-1 until the center point has the minimum rate-distortion cost. This point is the best matching point of the integer-pixel search, as shown in Step 4-2 of Fig. 3.

As described above, although UMHexagonS is much faster than full search, it applies the same exhaustive search strategy to all seven inter coding modes without considering the motion characteristics of the current predicted macroblock. Analysis of the characteristics of integer-pixel motion estimation in video coding, together with a large body of experimental statistics, shows that it still evaluates many search points, its search strategy is redundant, and its search rules and search templates leave considerable room for improvement in search speed. The shortcomings of UMHexagonS, the original fast integer-pixel motion estimation algorithm in H.264/AVC, are mainly the following:

(1) A single fixed motion estimation search method is applied to every predicted macroblock, without analyzing macroblock characteristics.

(2) The search radius is fixed. For regions with uniform, small-amplitude motion, an excessively large search radius does little to improve motion estimation performance but costs considerable computation time.

(3) The search points are redundant: all search points are traversed and evaluated regardless of the characteristics of the macroblock, which makes encoding time-consuming.

Summary of the invention

To effectively reduce the number of search points in motion estimation, speed up computation with multiple reference frames, and maintain or even improve motion search accuracy, the present invention starts from the motion characteristics of the predicted macroblock. Based on the characteristics of motion estimation in H.264/AVC inter coding, it designs an efficient and accurate search template for integer-pixel motion vector search and formulates a corresponding early-termination strategy for the motion search. It proposes a fast motion estimation method for H.264/AVC based on classification of motion vector characteristics, aiming to obtain a large reduction in computational complexity at a small cost in coding efficiency. Experimental results show that the method effectively accelerates the motion estimation search and reduces computation while maintaining the low bit rate and good reconstructed video quality of the original H.264 standard algorithm, achieving both low computational complexity and high motion estimation coding efficiency.

The specific technical solution of the present invention is a fast motion estimation video coding method based on macroblock characteristics. A motion estimation search template and a dynamic fast search method are designed according to the characteristics of the current predicted macroblock and the inter prediction coding mode. According to the motion intensity of the current predicted macroblock, combined with the result of inter prediction mode selection, the search range is adjusted dynamically and the corresponding motion estimation search criterion is selected adaptively, thereby achieving fast motion estimation predictive coding of inter macroblocks. The method is characterized in that it comprises the following steps:

Step 1: extract the luminance component values of the current predicted macroblock: taking the luminance information of the current predicted macroblock as the coding object, extract the luminance component information of the predicted macroblock from the current video frame;

Step 2: determine the motion estimation search template and distribute the search points: the template comprises four search layers; as the search radius increases and the search range expands, the number of search points per layer increases from the inside outward, the layers having 8, 8, 12, and 16 search points in turn;

Step 3: determine a high-accuracy initial search point: the Lagrangian rate-distortion optimization (RDO) function is used as the decision criterion for motion estimation to select the matching prediction block and motion vector that are optimal in the rate-distortion sense, so that the bits allocated to the motion vector and residual coding are minimized; selecting the optimal motion vector with the Lagrangian rate-distortion criterion can be described as:

J_{motion}(mv, ref | λ_{motion}) = SAD[s, r(ref, mv)] + λ_{motion} [R(mv - pred) + R(ref)]    (1)

In formula (1), J_{motion} is the rate-distortion cost RDcost_{motion} of the motion vector currently being predicted; s is the pixel value of the current macroblock; mv is the current motion vector and pred is the predicted vector; ref is the selected reference frame; r(ref, mv) is the pixel value of the reference macroblock; R is the number of bits consumed by differential coding of the motion vector, comprising the bits for coding the difference between the motion vector and its predicted value and the bits for coding the reference frame; λ_{motion} is the Lagrange multiplier; SAD (sum of absolute differences) is the sum of absolute differences between the pixels of the current block and the reference block:

SAD[s, r(ref, mv)] = Σ_{x = 1}^{B_{1}} Σ_{y = 1}^{B_{2}} | s(x, y) - r(x - m_{x}, y - m_{y}) |    (2)

In formula (2), B_{1} and B_{2} denote the number of horizontal and vertical pixels of the block, respectively; depending on the inter prediction mode, their values may be 16, 8, or 4; s(x, y) is the pixel value of the current macroblock; r(x, y) is the pixel value of the reference macroblock; m_{x} and m_{y} denote the horizontal and vertical displacements, respectively;
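As a rough illustration of formula (1), the sketch below combines a precomputed SAD with a rate term. The rate model is an assumption made only for this example: the motion vector difference is charged the length of its signed Exp-Golomb code, and the reference-frame bits are passed in by the caller. The names `se_bits`, `motion_cost`, `ref_bits`, and `lam` are hypothetical, and a real encoder may count bits differently.

```python
def se_bits(v):
    """Length in bits of the signed Exp-Golomb code for integer v
    (assumed rate model for one motion vector difference component)."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * ((code_num + 1).bit_length() - 1) + 1


def motion_cost(sad_value, mv, pred, ref_bits, lam):
    """J_motion = SAD + lambda_motion * (R(mv - pred) + R(ref))."""
    rate = se_bits(mv[0] - pred[0]) + se_bits(mv[1] - pred[1]) + ref_bits
    return sad_value + lam * rate
```

With this model a candidate that equals the predictor costs only 2 bits plus the reference-frame bits, so the Lagrangian term favors motion vectors close to the prediction when their SAD values are similar.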

Selecting the optimal mode with the Lagrangian rate-distortion criterion can be expressed as:

J_{mode}(s, c, MODE | λ_{mode}) = SSD(s, c, MODE | QP) + λ_{mode} × R(s, c, MODE | QP)    (3)

In formula (3), MODE denotes an inter coding mode of the current macroblock; J_{mode}(s, c, MODE | λ_{mode}) denotes the rate-distortion cost RDcost_{mode} under mode MODE; s is the original video signal; c is the reconstructed video signal after coding with mode MODE; λ_{mode} is the Lagrange multiplier; R(s, c, MODE | QP) is the total number of bits, depending on the mode and the quantization parameter, including the macroblock header, motion vectors, and all DCT block information; it is obtained by actually coding the block, so it is computationally expensive; QP is the coding quantization step size; SSD(s, c, MODE | QP) (sum of squared differences) is the sum of squared differences between the original signal and the reconstructed signal, that is:

SSD(s, c, MODE | QP) = Σ_{x = 1, y = 1}^{B_{1}, B_{2}} (S_{Y}[x, y] - C_{Y}[x, y, MODE | QP])^{2}
    + Σ_{x = 1, y = 1}^{B_{1}, B_{2}} (S_{U}[x, y] - C_{U}[x, y, MODE | QP])^{2}
    + Σ_{x = 1, y = 1}^{B_{1}, B_{2}} (S_{V}[x, y] - C_{V}[x, y, MODE | QP])^{2}    (4)

In formula (4), B_{1} and B_{2} denote the number of horizontal and vertical pixels of the block, respectively, and may take the values 16, 8, or 4; S_{Y}[x, y] denotes the luminance value of the source macroblock and C_{Y}[x, y, MODE | QP] denotes the luminance value of the reconstructed macroblock; S_{U}, S_{V} and C_{U}, C_{V} denote the corresponding chrominance values. Step 3 specifically comprises the following sub-steps:

Step 3.1: select the best motion vector (MV) using three prediction modes: the spatial-domain median prediction MV_{pred_space}; the upper-layer prediction MV_{pred_uplayer}, based on the multi-size macroblock partitioning used for motion estimation in the H.264 standard; and the time-domain reference-frame motion vector prediction MV_{pred_ref}:

(a) MV_{pred_space}: median prediction, which exploits spatial correlation and takes the median of the already-computed motion vectors of the left, upper, and upper-right neighboring blocks in the current frame;

(b) MV_{pred_uplayer}: upper-layer prediction, which exploits the multi-size macroblock partitioning of H.264 motion estimation; following the hierarchical search order mode 1: 16 × 16, mode 2: 16 × 8, mode 3: 8 × 16, mode 4: 8 × 8, mode 5: 8 × 4, mode 6: 4 × 8, mode 7: 4 × 4, it takes the already-computed motion vector of the co-located block of the upper layer, which is twice the size;

(c) MV_{pred_ref}: time-domain prediction, which exploits temporal correlation and scales the already-computed motion vector MV_{ref} of the current block in the preceding reference frame proportionally

where the time of the frame containing the current macroblock is t, and the time of the reference frame containing the predicted macroblock is t';

Step 3.2: the SAD value is strongly correlated with the predicted motion vector MV; the rate-distortion costs of the points indicated by MV_{pred_space}, MV_{pred_uplayer}, and MV_{pred_ref} are denoted pred_space_mincost, pred_uplayer_mincost, and pred_ref_mincost, respectively; the point indicated by the motion vector with the minimum rate-distortion cost is taken as the initial search point, which serves as the initial search center for the unsymmetrical cross search of the next step;
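Steps 3.1 and 3.2 can be sketched as below, under stated assumptions: motion vectors are (x, y) integer tuples, the median is taken component-wise, and `cost_fn` is a hypothetical callable that returns the rate-distortion cost of the point a vector indicates. The function names are illustrative and do not come from the patent.

```python
def median_mv(left, up, up_right):
    """Component-wise median of the three neighboring motion vectors."""
    xs = sorted(v[0] for v in (left, up, up_right))
    ys = sorted(v[1] for v in (left, up, up_right))
    return (xs[1], ys[1])


def pick_initial_point(candidates, cost_fn):
    """candidates maps a predictor name to its motion vector.
    Returns (name, mv, cost) for the predictor with the lowest cost."""
    costs = {name: cost_fn(mv) for name, mv in candidates.items()}
    best = min(costs, key=costs.get)
    return best, candidates[best], costs[best]
```

Keeping the predictor name alongside the chosen vector is convenient because Step 5.2 later needs to know which prediction mode produced the initial search point.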

Step 4: unsymmetrical cross search: let the horizontal search range be W; the vertical search range is half of it, W/2; adjacent search points are two pixels apart, so there are W/2 search points in the horizontal direction and W/4 in the vertical direction, for a total of W/2 + W/4 = 3W/4 candidate search points; the best matching point among these 3W/4 candidates serves as the initial search center for the subsequent non-uniform multi-layer hexagon-grid search;
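The candidate count of Step 4 can be checked with a short sketch. The exact offsets are an assumption chosen to reproduce the stated counts: points sit at odd offsets on both sides of the center, two pixels apart, and W is taken to be a multiple of 8. The name `cross_candidates` is hypothetical.

```python
def cross_candidates(center, W):
    """Candidate points of an unsymmetrical cross around `center`.
    Horizontal arm: W/2 points; vertical arm: W/4 points (assumed layout)."""
    cx, cy = center
    horizontal = [(cx + s * k, cy) for k in range(1, W // 2, 2) for s in (-1, 1)]
    vertical = [(cx, cy + s * k) for k in range(1, W // 4, 2) for s in (-1, 1)]
    return horizontal + vertical
```

For W = 16 this yields 8 horizontal and 4 vertical candidates, 12 in total, which matches 3W/4.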

Step 5: judge the motion intensity of the current macroblock: the current predicted macroblock is classified into one of three motion intensity levels: low, medium, or high; the motion intensity of the current predicted macroblock is judged as follows:

Step 5.1: compute the rate-distortion cost of the motion vector of the current predicted macroblock according to formula (1), denoted RDcost_{motion}, and the rate-distortion cost of the current predicted macroblock under mode MODE according to formula (3), denoted RDcost_{mode}; from these, select the minimum predicted rate-distortion cost, denoted RD_mincost, that is: RD_mincost = min{pred_space_mincost, pred_uplayer_mincost, pred_ref_mincost};

Step 5.2: determine the value of pred_mincost according to the motion vector prediction mode selected for the initial search point:

If the initial search point determined in Step 3 uses the motion vector of the time-domain prediction mode, then pred_mincost = pred_ref_mincost;

If the initial search point determined in Step 3 does not use the motion vector of the time-domain prediction mode, there are two further cases:

If the inter coding mode selected for the current predicted macroblock is the large 16 × 16 mode, then pred_mincost = pred_space_mincost;

If the inter coding mode selected for the current predicted macroblock is not the large 16 × 16 mode, then pred_mincost = pred_uplayer_mincost;

Step 5.3: compute the motion intensity adjustment factors γ and δ:

γ = \frac{Bsize[blocktype]}{pred_mincost^{2}} - α_{Radii1}[blocktype]

δ = \frac{Bsize[blocktype]}{pred_mincost^{2}} - α_{Radii2}[blocktype]    (5)

In formula (5), Bsize[blocktype] denotes the size of the current predicted macroblock and may take the values 16, 8, or 4;

The array α_{Radii1}[blocktype] is defined for the 7 inter macroblock partition modes as:

α_{Radii1}[1] = -0.23; α_{Radii1}[2] = -0.23; α_{Radii1}[3] = -0.23;

α_{Radii1}[4] = -0.25; α_{Radii1}[5] = -0.27; α_{Radii1}[6] = -0.27; α_{Radii1}[7] = -0.28;

The array α_{Radii2}[blocktype] is defined for the 7 inter macroblock partition modes as:

α_{Radii2}[1] = -2.39; α_{Radii2}[2] = -2.40; α_{Radii2}[3] = -2.40;

α_{Radii2}[4] = -2.41; α_{Radii2}[5] = -2.45; α_{Radii2}[6] = -2.45; α_{Radii2}[7] = -2.48;

Step 5.4: judge the motion intensity of the current predicted macroblock:

When RD_mincost < (1 + γ) × pred_mincost, the motion intensity of the current predicted macroblock is judged to be low;

When (1 + γ) × pred_mincost < RD_mincost < (1 + δ) × pred_mincost, the motion intensity of the current predicted macroblock is judged to be medium;

When RD_mincost > (1 + δ) × pred_mincost, the motion intensity of the current predicted macroblock is judged to be high;
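Steps 5.3 and 5.4 amount to two thresholds derived from pred_mincost. The sketch below uses the α values listed above; everything else is an assumption for illustration: `bsize` is supplied by the caller because the text does not say which of 16, 8, or 4 belongs to each partition mode, pred_mincost is assumed to be positive, and a cost exactly equal to a threshold (a case the text leaves open) is assigned to the higher class.

```python
ALPHA_RADII1 = {1: -0.23, 2: -0.23, 3: -0.23, 4: -0.25, 5: -0.27, 6: -0.27, 7: -0.28}
ALPHA_RADII2 = {1: -2.39, 2: -2.40, 3: -2.40, 4: -2.41, 5: -2.45, 6: -2.45, 7: -2.48}


def classify_motion(rd_mincost, pred_mincost, bsize, blocktype):
    """Return 'low', 'medium', or 'high' following formula (5) and Step 5.4."""
    base = bsize / pred_mincost ** 2
    gamma = base - ALPHA_RADII1[blocktype]
    delta = base - ALPHA_RADII2[blocktype]
    if rd_mincost < (1 + gamma) * pred_mincost:
        return "low"
    if rd_mincost < (1 + delta) * pred_mincost:
        return "medium"  # equality with the lower threshold lands here (assumption)
    return "high"        # equality with the upper threshold lands here (assumption)
```

Because both α arrays hold negative values, γ and δ are positive and γ < δ, so the two thresholds are always ordered. For pred_mincost = 100, bsize = 16, and blocktype 1, they come out near 123.16 and 339.16.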

Step 6: full 5 × 5 pixel search: after Step 5 has judged the motion intensity of the current search content, the full search of the 5 × 5 search points in the 4 × 4 region centered on the current predicted point is performed only when the motion intensity is low; when the motion intensity is medium or high, the 5 × 5 full search is skipped and the method proceeds directly to Step 7;

Step 7: non-uniform multi-layer hexagon-grid search: the number of motion estimation search layers is selected dynamically according to the motion intensity of the current macroblock; a non-uniform hexagon search is used as the basic search pattern, with at most 4 layers; when the motion intensity is low, only 2 layers are searched, each with 8 search points; when the motion intensity is medium, 3 layers are searched, with 8, 8, and 12 search points from the inside outward; when the motion intensity is high, all 4 layers are searched, with 8, 8, 12, and 16 search points from the inside outward; the best matching point determined in this step serves as the initial search center for the extended symmetric hexagon search of Step 8;

Step 8: extended symmetric hexagon search: the best matching point determined in Step 7 serves as the initial search center of this step; a two-stage search is used:

Stage 1: a symmetric six-point hexagon search with a step size of 2 pixels is performed first; the rate-distortion cost of the center point is compared with those of the 6 surrounding vertices; according to the rate-distortion criterion, the point with the minimum rate-distortion cost becomes the center of the next search, until the center point itself has the minimum rate-distortion cost;

Stage 2: the step size is then reduced to 1 pixel and a four-point diamond search is performed, repeating the procedure of Stage 1 until, according to the rate-distortion criterion, the center point has the minimum rate-distortion cost; this point is the final best matching point of the integer-pixel search;
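The two-stage refinement of Step 8 can be sketched as an iterative pattern search. The offsets are assumptions: the hexagon uses the commonly used six-point layout (±2, 0), (±1, ±2), and the diamond uses the four unit neighbors. `cost_fn` stands for any rate-distortion cost evaluated at a candidate point, and all names are hypothetical.

```python
HEXAGON = [(2, 0), (1, 2), (-1, 2), (-2, 0), (-1, -2), (1, -2)]  # assumed layout
DIAMOND = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def refine(center, cost_fn, pattern):
    """Move the center to the cheapest pattern point until the center wins."""
    best, best_cost = center, cost_fn(center)
    while True:
        neighbors = [(best[0] + dx, best[1] + dy) for dx, dy in pattern]
        cand = min(neighbors, key=cost_fn)
        cand_cost = cost_fn(cand)
        if cand_cost >= best_cost:
            return best, best_cost
        best, best_cost = cand, cand_cost


def extended_hexagon_search(start, cost_fn):
    """Stage 1: 2-pixel hexagon; Stage 2: 1-pixel diamond."""
    center, _ = refine(start, cost_fn, HEXAGON)
    return refine(center, cost_fn, DIAMOND)
```

Each move strictly lowers the cost, so the loop stops as soon as the center is at least as cheap as every pattern point. On a real cost surface this is a local minimum, which is why the quality of the starting point from the earlier steps matters.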

Step 9: write the motion estimation coding information to the file Out.264, and record the motion estimation time (Total ME time for sequence), the bit rate, and the peak signal-to-noise ratio (PSNR) in order to evaluate the quality of the coding algorithm.
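For reference, the PSNR figure recorded in Step 9 is conventionally computed from the mean squared error between the original and reconstructed luminance planes. The sketch below assumes 8-bit samples (peak value 255) stored as 2-D lists; the function name `psnr` is illustrative, and how the encoder itself reports PSNR is not specified here.

```python
import math


def psnr(orig, recon, peak=255):
    """PSNR in dB between two equally sized 2-D sample arrays."""
    count = sum(len(row) for row in orig)
    sq_err = sum((a - b) ** 2
                 for row_o, row_r in zip(orig, recon)
                 for a, b in zip(row_o, row_r))
    if sq_err == 0:
        return math.inf  # identical planes
    return 10 * math.log10(peak ** 2 / (sq_err / count))
```

A difference of less than 0.05 dB, as quoted for UMHexagonS against full search, is far below what this measure changes for even a one-level error on every sample.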

The core fast motion estimation search algorithm of the method is named NUMHexagonS (New-UMHexagonS) to distinguish it from, and ease comparison with, UMHexagonS. NUMHexagonS mainly comprises the following techniques:

1. Design of a more concise search template

1.1 Design principle 1

In Step 3-2, UMHexagonS performs the multi-layer non-uniform hexagon-grid search centered on the current predicted starting point, always using a non-uniform hexagon template with 16 search points on every layer, as shown in Step 3-2 of Fig. 3. A template with a fixed number of points per layer tends to distribute search points unreasonably. If 16 points are not wasteful on the innermost first layer, which has the smallest radius, then 16 points on the progressively larger second, third, and fourth layers must be too few. Conversely, if 16 points suffice on the outermost fourth layer, which has the largest radius, then 16 points on the innermost first layer must be wasteful. A better search template should increase the number of search points on each non-uniform hexagon layer as the search radius, and hence the search range, grows, rather than fixing the number of points per layer.

1.2 Technical application

The non-uniform hexagon search with a fixed 16 points per layer is replaced by one that adjusts the number of search points according to the radius, as shown in Fig. 5. Because the original UMHexagonS already predicts the starting point accurately, the best matching point is likely to lie near the current predicted point. To preserve the prediction accuracy of the original algorithm, 8 search points are assigned to each of the inner first and second layers; the number of points per layer then grows with the search radius, with 12 points on the third layer and 16 points on the fourth. Extensive experiments show that this search template maintains search accuracy while avoiding unnecessary search points and saving motion estimation time.

2. Adaptive selection of the number of search layers according to macroblock characteristics

2.1 Design principle 2

Once UMHexagonS enters Step 3-2 and performs the non-uniform multi-layer hexagon-grid search, it must complete all 4 layers before moving on to the next step, regardless of the motion intensity of the current predicted macroblock, as shown in Fig. 3. The 4 layers involve a considerable number of search points: 4 layers × 16 points/layer = 64 search points. Because UMHexagonS predicts the starting point accurately, the optimum is likely to be found early, so the points spent on the outer hexagon layers are unnecessary for regions with uniform, low-intensity motion. If the number of layers of the multi-layer non-uniform hexagon search were determined from the motion intensity of the search content each time, many search points would be saved, reducing the computational complexity and time of motion estimation.

2.2 Technical application

The fixed 4-layer non-uniform hexagon-grid search is replaced by one whose number of layers adapts dynamically to the motion intensity of the search content. First, the motion intensity of the current predicted macroblock is judged and classified into one of three levels: low, medium, or high. The number of layers of the non-uniform hexagon-grid search is then determined from the motion intensity, as shown in Fig. 5.

● When the motion intensity of the macroblock is low, only the first and second layers are searched, for 8 + 8 = 16 search points;

● When the motion intensity of the macroblock is medium, the first through third layers are searched, for 8 + 8 + 12 = 28 search points;

● When the motion intensity of the macroblock is high, all four layers are searched, for 8 + 8 + 12 + 16 = 44 search points.
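The layer selection above reduces to a small lookup. In this sketch the per-layer point counts and the layer counts come from the text, while the string labels and the names `LAYER_POINTS`, `LAYERS_BY_INTENSITY`, and `hexagon_layer_plan` are hypothetical.

```python
LAYER_POINTS = (8, 8, 12, 16)                       # points per layer, inner to outer
LAYERS_BY_INTENSITY = {"low": 2, "medium": 3, "high": 4}
UMHEXAGONS_POINTS = 4 * 16                          # fixed template of the original algorithm


def hexagon_layer_plan(intensity):
    """Search points per layer to visit for a given motion intensity."""
    return LAYER_POINTS[:LAYERS_BY_INTENSITY[intensity]]


for level in ("low", "medium", "high"):
    plan = hexagon_layer_plan(level)
    print(level, plan, sum(plan), "of", UMHEXAGONS_POINTS)
```

Even in the high-intensity case the template visits 44 points instead of 64, and a low-intensity macroblock visits only 16.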

Three, selective 5×5 pixel full search technique

3.1 Design principle three

In Step 3-1, the UMHexagonS algorithm takes the current predicted starting point as the center and performs a full search over 5×5 integer pixels in the surrounding 4×4 region, as in Step 3-1 of Fig. 3. Step 3-1 has 25 search points, so its computational cost is considerable. The original UMHexagonS algorithm performs the 5×5 full search because, after the highly accurate starting-point prediction, the current search point is probably already close to the optimal prediction point, so a 5×5 full search at this stage is very likely to find the best match within the 5×5 range. When the motion of the search content is intense, however, the probability that the best match lies near the current predicted point is low; a 5×5 full search is then no longer suitable and instead wastes a large number of search points. If the 5×5 full search could be enabled or skipped adaptively according to the motion intensity of the current search content, the number of search points would again be greatly reduced.

3.2 Technical application

After the motion intensity of the current search content has been judged, the 5×5 full search is performed only when the motion intensity is low, followed by the multi-layer non-uniform hexagonal grid search. When the motion intensity is medium or high, the 5×5 full search is skipped and the multi-layer non-uniform hexagonal grid search is performed directly.
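A minimal C sketch of this selective step is given below. It assumes the 5×5 full search visits every integer offset from -2 to +2 in both directions around the current predicted point (25 points, including the center); the names Offset and full_search_5x5_offsets are hypothetical.

```c
#include <assert.h>

typedef struct { int dx, dy; } Offset;

/* Fills out[] with the 25 offsets of a 5x5 full search when the motion
 * intensity is low; produces no candidates otherwise, so the caller goes
 * straight to the hexagonal grid search. Returns the number of offsets. */
int full_search_5x5_offsets(int intensity_is_low, Offset out[25])
{
    assert(out != 0);
    if (!intensity_is_low)
        return 0;               /* medium or high motion: skip this stage */

    int n = 0;
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            out[n].dx = dx;
            out[n].dy = dy;
            ++n;
        }
    }
    return n;
}
```

The caller would evaluate the rate-distortion cost at each returned offset and keep the cheapest one as the center for the next stage.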

The technical idea of the present invention is characterized as follows:

1. The present invention continues to use the rate-distortion criterion of the H.264/AVC standard algorithm and selects the motion vector with the optimal rate-distortion cost as the optimal motion estimation prediction vector. Using the three optimization techniques described in the present invention, motion estimation coding speed is increased substantially while the bit rate increases only slightly, motion estimation accuracy and video quality remain essentially unchanged, and the original bitstream structure is preserved.

2. The motion estimation search template of the method is designed by fixing the number of search points in each layer of the search pattern, which ensures a reasonable distribution of search points.

3. The motion estimation search scheme is divided into six steps. First step: highly accurate prediction of the initial search point. Second step: unsymmetrical cross search. Third step: calculation of the motion intensity of the current predicted macroblock. Fourth step: selective 5×5 pixel full search according to the motion intensity of the current predicted macroblock. Fifth step: non-uniform multi-layer hexagonal grid search according to the motion intensity of the current predicted macroblock. Sixth step: extended symmetric hexagon search.

In summary, through the above three techniques the method of the present invention simplifies the search template (as shown in Fig. 6), distributes the search points reasonably, and reduces the number of motion estimation search points. It retains the accurate starting-point prediction of the original standard method and takes full account of the characteristics of the current predicted macroblock: for macroblocks with low motion intensity it directly applies a non-uniform hexagonal grid search of only 2 layers, achieving fast prediction through early judgment; for macroblocks with high motion intensity it skips the 5×5 pixel full search and keeps the 4-layer non-uniform hexagonal grid search of the original standard. The motion characteristics of all macroblocks are thus taken into account, and the search strategy is selected dynamically according to motion intensity. This preserves motion estimation accuracy and coded video quality and strictly limits the bit rate increase, while significantly increasing motion estimation coding speed and saving motion estimation coding time.

The present invention has the following beneficial effects:

Extensive experimental statistics show that, compared with the UMHexagonS method most recently adopted in H.264, the method of the present invention, while maintaining motion estimation accuracy, strictly limits the bit rate increase, keeps the reconstructed video image quality essentially unchanged, preserves the original bitstream structure, effectively reduces the number of search points, significantly reduces motion estimation time, and increases inter prediction coding speed, thereby realizing fast motion estimation coding of inter macroblocks.

The present invention is explained in further detail below with reference to the drawings and the embodiment.

Description of drawings

Fig. 1 Inter macroblock partition modes in the standard inter prediction algorithm;

Fig. 2 Schematic diagram of block-matching motion estimation;

Fig. 3 Asymmetric hexagonal search template with 16 search points;

Fig. 4 Schematic diagram of the UMHexagonS (unsymmetrical-cross multi-hexagon-grid) search algorithm;

Fig. 5 Hierarchical non-uniform hexagonal grid search template of the present invention;

Fig. 6 Schematic diagram of the motion estimation search method of the present invention;

Fig. 7 Flow chart of the technical scheme of the present invention;

Fig. 8 Schematic diagram of motion vector median prediction;

Fig. 9 Schematic diagram of motion vector upper-layer prediction;

Fig. 10 Schematic diagram of motion vector temporal prediction;

Embodiment

To verify the effectiveness of the proposed method, test sequences with different characteristics were selected: the sequences Coastguard and Foreman, with relatively intense motion; the sequences Akiyo, Miss America and Mother and Daughter, with relatively gentle motion; and the sequence Mobile, with rich detail and smooth motion. The method of the present invention was compared statistically with the UMHexagonS motion estimation coding method adopted in H.264/AVC on three measures of video coding performance: motion estimation coding time, compressed bit rate, and peak signal-to-noise ratio (PSNR). The performance tests used the latest coding test model released by JVT, JM12.2, with the following experimental configuration: a host with a P4 2.8 CPU and 512 MB of memory; 100 frames encoded at a frame rate of 30 frames/s; an IPPP bitstream structure, that is, the first frame is coded as an I frame and the remaining frames as P frames; quantization parameter QP set to 28; CAVLC entropy coding; and 5 reference frames.

Because the method operates on the luminance component of the video sequence, in actual use a video sequence in YUV format is read first and its luminance component values are extracted; the encoder then calls the fast motion estimation coding module for inter macroblocks designed in the present invention to complete the actual video compression coding.

In the specific implementation, the following steps are carried out on a computer:

First step: read in the video sequence in YUV format according to the encoder configuration file encoder.cfg, and configure the encoder according to the parameters in the configuration file. These include the number of frames to be encoded, FramesToBeEncoded; the frame rate, FrameRate; the width and height of the original video, SourceWidth and SourceHeight; the output file name, OutputFile; the quantization parameter values, QPISlice and QPPSlice; the motion estimation search range, SearchRange; the motion estimation search mode, SearchMode; the number of reference frames allowed, NumberReferenceFrames; the switch that enables the rate-distortion cost function, RDOptimization; and the entropy coding type, SymbolMode;

Second step: read the luminance component values of the video sequence from the original YUV file, and take out, in order, the luminance component values of each macroblock to be encoded;
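The luminance extraction in the second step can be sketched in C as below. The sketch assumes a planar YUV 4:2:0 frame already loaded into memory, in which the Y plane occupies the first width × height bytes, and a frame size that is a multiple of 16; the function name extract_luma_mb is hypothetical and is not taken from the JM software.

```c
#include <assert.h>
#include <string.h>

#define MB_SIZE 16

/* Copies the 16x16 luma block of macroblock (mb_x, mb_y) out of a planar
 * YUV frame. Only the Y plane (the first width*height bytes) is read;
 * the chroma planes that follow it are ignored. */
void extract_luma_mb(const unsigned char *frame, int width, int height,
                     int mb_x, int mb_y,
                     unsigned char out[MB_SIZE][MB_SIZE])
{
    assert(frame != 0);
    assert(width % MB_SIZE == 0 && height % MB_SIZE == 0);
    assert(mb_x >= 0 && mb_x < width / MB_SIZE);
    assert(mb_y >= 0 && mb_y < height / MB_SIZE);

    /* Top-left luma sample of the requested macroblock. */
    const unsigned char *src =
        frame + (size_t)mb_y * MB_SIZE * (size_t)width + (size_t)mb_x * MB_SIZE;

    for (int row = 0; row < MB_SIZE; ++row)
        memcpy(out[row], src + (size_t)row * (size_t)width, MB_SIZE);
}
```

Macroblocks would be visited in raster order by looping mb_y over the rows and mb_x over the columns of the frame.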

Third step: determine the initial search point. Median prediction MV_{pred_space}, upper-layer prediction MV_{pred_uplayer} and temporal prediction MV_{pred_ref} are applied to the current predicted macroblock.

Median prediction MV_{pred_space}: using a median predictor, the median of the already obtained motion vectors (MV) of the left, upper and upper-right neighboring blocks of the current predicted macroblock is taken as the median prediction vector MV_{pred_space}.

Median prediction is a commonly used technique. Its theoretical basis is the strong spatial correlation between blocks, so the optimal motion vector of the current predicted macroblock can be predicted from the motion vectors of known neighboring blocks. As shown in Fig. 8, E is the current predicted macroblock, its left neighbor is A, its upper neighbor is B, and its upper-right neighbor is C. Median prediction is calculated as follows:

MV_{pred_space} = median(MV_A, MV_B, MV_C) (9)

When block A lies outside the picture or outside the GOB (slice group) boundary, MV_{pred_space} is replaced with (0, 0); when block C lies outside the picture or outside the GOB (slice group) boundary, MV_{pred_space} is replaced with the motion vector of block D, that is, MV_{pred_space} = MV_D.
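Formula (9) can be illustrated with the short C sketch below. It assumes the median is taken separately on the horizontal and vertical components, as is usual for motion vector median prediction; the boundary substitutions described above are left out, and the names MotionVector and median_predict are hypothetical.

```c
#include <assert.h>

typedef struct { int x, y; } MotionVector;

/* Median of three integers. */
static int median3(int a, int b, int c)
{
    int lo = a < b ? a : b;
    int hi = a < b ? b : a;
    if (c < lo) return lo;
    if (c > hi) return hi;
    return c;
}

/* Component-wise median of the motion vectors of neighbors A (left),
 * B (above) and C (above-right), as in formula (9). */
MotionVector median_predict(const MotionVector *a,
                            const MotionVector *b,
                            const MotionVector *c)
{
    assert(a != 0 && b != 0 && c != 0);
    MotionVector p;
    p.x = median3(a->x, b->x, c->x);
    p.y = median3(a->y, b->y, c->y);
    return p;
}
```

For example, neighbors (1, 5), (3, -2) and (2, 9) give the prediction (2, 5).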

Upper-layer prediction MV_{pred_uplayer}: this exploits the multiple macroblock partition sizes used for inter-frame motion estimation in the H.264/AVC standard. The search proceeds hierarchically from mode 1 (16×16) through mode 2 (16×8), mode 3 (8×16), mode 4 (8×8), mode 5 (8×4) and mode 6 (4×8) down to mode 7 (4×4), and takes the already obtained motion vector of the co-located block one level up (the up layer), which is twice the size, as shown in Fig. 9. For example, mode 5 (8×4) or mode 6 (4×8) is the upper layer of mode 7 (4×4), and mode 4 (8×8) is the upper layer of mode 5 (8×4) or mode 6 (4×8).

Temporal prediction MV_{pred_ref}: this exploits temporal correlation by taking the already obtained motion vector of the current block in the previous reference frame and scaling it proportionally. Because a video sequence is continuous, the motion vectors of the current predicted macroblock in different reference frames are also correlated. As shown in Fig. 10, suppose the current predicted macroblock lies in the frame at time t. When the best matching block of the current block is searched for in the earlier reference frame at time t', the motion vector of the current predicted macroblock in the reference frame at time t'+1, denoted MV_{ref}, can be used to estimate its motion vector in the reference frame at time t':

MV_{pred_ref} = MV_{ref} \times \frac{t - t'}{t - t' - 1} \quad (10)
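The scaling in formula (10) can be sketched in C as follows. Integer motion vector components with truncating integer division are an assumption made for the example, and the names MotionVector and temporal_predict are hypothetical.

```c
#include <assert.h>

typedef struct { int x, y; } MotionVector;

/* Temporal prediction as in formula (10).
 *   t      : time of the frame containing the current macroblock
 *   t_ref  : time t' of the reference frame being searched
 *   mv_ref : motion vector already found in the reference frame at t' + 1
 * The result is mv_ref scaled by (t - t') / (t - t' - 1). */
MotionVector temporal_predict(MotionVector mv_ref, int t, int t_ref)
{
    int num = t - t_ref;       /* t - t'     */
    int den = t - t_ref - 1;   /* t - t' - 1 */
    assert(den > 0);           /* a nearer reference frame at t' + 1 must exist */

    MotionVector p;
    p.x = mv_ref.x * num / den;
    p.y = mv_ref.y * num / den;
    return p;
}
```

With t = 10 and t' = 8, a vector (4, -2) found in the frame at time 9 is scaled by 2/1 to (8, -4).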

Like the motion vectors themselves, the corresponding SAD values are also strongly correlated. The rate-distortion costs of the points indicated by these three motion vector predictions are denoted Pred_space_mincost, Pred_uplayer_mincost and Pred_ref_mincost respectively, and the point indicated by the motion vector with the smallest rate-distortion cost is taken as the initial search point. The method of the present invention thus retains the highly accurate starting-point prediction of the UMHexagonS motion estimation algorithm adopted in the original H.264 standard, which brings the starting point closer to the position of the best motion estimation match. The best match point of this step serves as the initial search center of the unsymmetrical cross search in the next step;
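Choosing the starting point among the three predictions amounts to a minimum search over their rate-distortion costs, sketched below in C. The Candidate structure and the function best_candidate are hypothetical; the costs are assumed to have been computed beforehand.

```c
#include <assert.h>

typedef struct { int x, y; } MotionVector;

/* A predicted motion vector together with its rate-distortion cost. */
typedef struct {
    MotionVector mv;
    int cost;
} Candidate;

/* Returns the index of the candidate with the smallest cost.
 * On ties the earliest candidate is kept. */
int best_candidate(const Candidate *cands, int count)
{
    assert(cands != 0 && count > 0);
    int best = 0;
    for (int i = 1; i < count; ++i) {
        if (cands[i].cost < cands[best].cost)
            best = i;
    }
    return best;
}
```

The three entries would hold the median, upper-layer and temporal predictions with costs Pred_space_mincost, Pred_uplayer_mincost and Pred_ref_mincost.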

Fourth step: perform the unsymmetrical cross search. This step is based on the observation that in most natural image sequences horizontal motion is generally much stronger than vertical motion. In this implementation W=32, that is, a cross-shaped search window of size 32×16 is opened, with a spacing of two pixels between adjacent search points. The horizontal search range is 32, with 16 search points; the vertical search range is 16, with 8 search points; the unsymmetrical cross search therefore has 16+8=24 candidate search points in total. According to the rate-distortion criterion, formula (3) is used to calculate the rate-distortion cost of the motion vector at each of the 24 search points, and the point with the smallest rate-distortion cost is chosen as the best match point of this step. It serves as the initial search center for the non-uniform multi-layer hexagonal grid search in the following steps;
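One candidate layout that matches these counts is sketched below in C. Placing the points at odd offsets (±1, ±3, ...) from the center is an assumption made for the example: the text fixes only the spacing of two pixels and the number of points. The names Offset and cross_search_offsets are hypothetical.

```c
#include <assert.h>

typedef struct { int dx, dy; } Offset;

#define CROSS_MAX_POINTS 64

/* Generates the candidate offsets of an unsymmetrical cross of width w and
 * height w/2 with a spacing of two pixels between neighboring points.
 * For w = 32 this gives 16 horizontal and 8 vertical candidates.
 * Returns the number of offsets written to out[]. */
int cross_search_offsets(int w, Offset out[CROSS_MAX_POINTS])
{
    assert(out != 0);
    assert(w > 0 && w % 4 == 0 && w * 3 / 4 <= CROSS_MAX_POINTS);

    int n = 0;
    /* Horizontal arm: +-1, +-3, ... inside half the window width. */
    for (int step = 1; step < w / 2; step += 2) {
        out[n].dx = step;  out[n].dy = 0; ++n;
        out[n].dx = -step; out[n].dy = 0; ++n;
    }
    /* Vertical arm: half as long as the horizontal arm. */
    for (int step = 1; step < w / 4; step += 2) {
        out[n].dx = 0; out[n].dy = step;  ++n;
        out[n].dx = 0; out[n].dy = -step; ++n;
    }
    return n;
}
```

For W = 32 the function returns 24 offsets, matching the 16+8=24 candidate points above.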

Fifth step: judge the motion intensity of the current macroblock. According to the rate-distortion cost criterion, the current predicted macroblock is assigned to one of three classes: low motion intensity, medium motion intensity, or high motion intensity. The motion intensity of the current predicted macroblock is judged as follows:

The rate-distortion cost RDcost_Motion of the motion vector of the current predicted macroblock is calculated according to formula (3), and the rate-distortion cost RDcost_Mode of the current predicted macroblock under the given mode is calculated according to formula (5). The smallest predicted rate-distortion cost is selected from them and denoted RD_mincost, and formula (7) is used to judge the motion intensity of the current predicted macroblock:

The motion intensity adjustment factors γ and δ are defined in formula (7). The array variable α_Radii1 involved takes the following values:

AlphaRadii1[1] = -0.23f;

AlphaRadii1[2] = -0.23f;

AlphaRadii1[3] = -0.23f;

AlphaRadii1[4] = -0.25f;

AlphaRadii1[5] = -0.27f;

AlphaRadii1[6] = -0.27f;

AlphaRadii1[7] = -0.28f;

The array variable α_Radii2 involved takes the following values:

AlphaRadii2[1] = -2.39f;

AlphaRadii2[2] = -2.40f;

AlphaRadii2[3] = -2.40f;

AlphaRadii2[4] = -2.41f;

AlphaRadii2[5] = -2.45f;

AlphaRadii2[6] = -2.45f;
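The adjustment factors can be computed from these tables as sketched below in C, assuming the form given in the claims: γ = Bsize[blocktype] / pred_mincost² − α_Radii1[blocktype], and likewise δ with α_Radii2. The values of Bsize are not listed in this text, so the sketch takes Bsize[blocktype] as an argument; no value is listed for AlphaRadii2[7], so block types 1 to 6 only are accepted. The names AdjustFactors and adjust_factors are hypothetical.

```c
#include <assert.h>

/* Tables from the text, indexed by block type; index 0 is unused.
 * AlphaRadii2[7] is not listed in the text, so that table stops at 6. */
static const float AlphaRadii1[8] =
    { 0.0f, -0.23f, -0.23f, -0.23f, -0.25f, -0.27f, -0.27f, -0.28f };
static const float AlphaRadii2[7] =
    { 0.0f, -2.39f, -2.40f, -2.40f, -2.41f, -2.45f, -2.45f };

typedef struct { double gamma, delta; } AdjustFactors;

/* bsize        : Bsize[blocktype], supplied by the caller (assumption)
 * pred_mincost : predicted minimum rate-distortion cost, must be nonzero */
AdjustFactors adjust_factors(int blocktype, double bsize, double pred_mincost)
{
    assert(blocktype >= 1 && blocktype <= 6);
    assert(pred_mincost != 0.0);

    double ratio = bsize / (pred_mincost * pred_mincost);
    AdjustFactors f;
    f.gamma = ratio - (double)AlphaRadii1[blocktype];
    f.delta = ratio - (double)AlphaRadii2[blocktype];
    return f;
}
```

Because every table entry is negative, subtracting it raises both factors above the ratio term, and δ is always larger than γ for the same block type.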

Sixth step: perform the 5×5 pixel full search. After the motion intensity of the current search content has been judged in the fifth step, the method of the present invention performs the full search of 5×5 search points in the 4×4 region around the current predicted point only when the motion intensity is low. When the motion intensity is medium or high, the 5×5 full search is skipped and the method proceeds directly to the seventh step;

Seventh step: perform the non-uniform multi-layer hexagonal grid search, selecting the number of search layers dynamically according to the motion intensity of the current macroblock. The present invention uses the non-uniform hexagon search as the basic search pattern, with at most 4 search layers. The number of search layers is determined from the motion intensity of the current predicted macroblock judged in the fifth step:

Taking the innermost layer of the asymmetric hexagonal grid search in the original algorithm, that is, the first layer, as an example, the horizontal and vertical coordinates of the 16 search points on the first layer are set to:

static const int Big_Hexagon_x[16] = {0, -2, -4, -4, -4, -4, -4, -2, 0, 2, 4, 4, 4, 4, 4, 2};

static const int Big_Hexagon_y[16] = {4, 3, 2, 1, 0, -1, -2, -3, -4, -3, -2, -1, 0, 1, 2, 3};

If the motion intensity of the current predicted macroblock is low, only the inner 2 layers of the non-uniform hexagonal grid are searched, with 8 search points per layer, for a total of 16 search points; this saves 48 search points compared with the original algorithm;

In the new multi-layer asymmetric hexagonal grid search template, the horizontal and vertical coordinates of the 8 search points on the first layer are set to:

static int Big_Hexagon_x1[8] = {0, -4, -4, -4, 0, 4, 4, 4};

static int Big_Hexagon_y1[8] = {4, 2, 0, -2, -4, -2, 0, 2};

The horizontal and vertical coordinates of the 8 search points on the second layer are set to:

static int Big_Hexagon_x2[8] = {0, -4, -4, -4, 0, 4, 4, 4};

static int Big_Hexagon_y2[8] = {4, 2, 0, -2, -4, -2, 0, 2};

If the motion intensity of the current predicted macroblock is medium, 3 layers of the non-uniform hexagonal grid are searched from the inside out, with 8, 8 and 12 search points respectively, for a total of 28 search points; this saves 36 search points compared with the original algorithm;

The horizontal and vertical coordinates of the 12 search points on the third layer are set to:

static int Big_Hexagon_x3[12] = {0, -4, -4, -4, -4, -4, 0, 4, 4, 4, 4, 4};

static int Big_Hexagon_y3[12] = {4, 2, 1, 0, -1, -2, -4, -2, -1, 0, 1, 2};

If the motion intensity of the current predicted macroblock is high, 4 layers of the non-uniform hexagonal grid are searched, with 8, 8, 12 and 16 search points from the inside out, for a total of 44 search points; this saves 20 search points compared with the original algorithm;

The horizontal and vertical coordinates of the 16 search points on the fourth layer are set to:

static int Big_Hexagon_x4[16] = {0, -2, -4, -4, -4, -4, -4, -2, 0, 2, 4, 4, 4, 4, 4, 2};

static int Big_Hexagon_y4[16] = {4, 3, 2, 1, 0, -1, -2, -3, -4, -3, -2, -1, 0, 1, 2, 3};

The best match point determined by the asymmetric hexagonal grid search in this step serves as the initial search center of the extended symmetric hexagon search in the next step;
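How the per-layer templates combine into one candidate list can be sketched in C as below. The sketch assumes that layer k uses its template scaled by k, so that the layers form nested hexagons; the text lists the templates but does not state the scaling, so this is an illustration only. The second layer reuses the first-layer arrays because the text gives identical coordinates for both. The names Offset and hex_grid_offsets are hypothetical.

```c
#include <assert.h>

typedef struct { int dx, dy; } Offset;

/* Layer templates as listed in the text (layer 2 equals layer 1). */
static const int Big_Hexagon_x1[8]  = { 0, -4, -4, -4, 0, 4, 4, 4 };
static const int Big_Hexagon_y1[8]  = { 4, 2, 0, -2, -4, -2, 0, 2 };
static const int Big_Hexagon_x3[12] = { 0, -4, -4, -4, -4, -4, 0, 4, 4, 4, 4, 4 };
static const int Big_Hexagon_y3[12] = { 4, 2, 1, 0, -1, -2, -4, -2, -1, 0, 1, 2 };
static const int Big_Hexagon_x4[16] =
    { 0, -2, -4, -4, -4, -4, -4, -2, 0, 2, 4, 4, 4, 4, 4, 2 };
static const int Big_Hexagon_y4[16] =
    { 4, 3, 2, 1, 0, -1, -2, -3, -4, -3, -2, -1, 0, 1, 2, 3 };

#define HEX_MAX_POINTS 44

/* Writes the offsets of the first `layers` layers (2, 3 or 4) to out[].
 * Assumption: layer k is its template multiplied by k.
 * Returns the number of offsets: 16, 28 or 44. */
int hex_grid_offsets(int layers, Offset out[HEX_MAX_POINTS])
{
    static const int *const xs[4] =
        { Big_Hexagon_x1, Big_Hexagon_x1, Big_Hexagon_x3, Big_Hexagon_x4 };
    static const int *const ys[4] =
        { Big_Hexagon_y1, Big_Hexagon_y1, Big_Hexagon_y3, Big_Hexagon_y4 };
    static const int counts[4] = { 8, 8, 12, 16 };

    assert(out != 0);
    assert(layers >= 2 && layers <= 4);

    int n = 0;
    for (int k = 1; k <= layers; ++k) {
        for (int i = 0; i < counts[k - 1]; ++i) {
            out[n].dx = xs[k - 1][i] * k;
            out[n].dy = ys[k - 1][i] * k;
            ++n;
        }
    }
    return n;
}
```

The layer count passed in would be 2, 3 or 4 for low, medium or high motion intensity, respectively.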

Eighth step: perform the extended symmetric hexagon search, with the best match point determined in the previous step as the initial search center of this step. The search is divided into two stages:

First stage: a symmetric six-point hexagon search is first performed with a step size of 2 pixels. The rate-distortion costs of the center point and of the 6 corner points are compared, and according to the rate-distortion criterion the point with the smallest rate-distortion cost becomes the center of the next search; this is repeated until the center point has the smallest rate-distortion cost;

Second stage: the step size is then reduced, and a diamond-pattern search is performed with a step size of 1 pixel, repeating the search procedure of the first stage; according to the rate-distortion criterion, the search stops when the center point has the smallest rate-distortion cost. This point is the best integer-pixel match point determined by the method of the present invention;

The horizontal and vertical coordinates of the search points of the diamond pattern are set to:

static int Diamond_x[4] = {-1, 0, 1, 0};

static int Diamond_y[4] = {0, 1, 0, -1};
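The two-stage refinement can be sketched in C as follows. The six hexagon offsets are an assumption, since the text states only the step size of 2 pixels; the diamond offsets are the ones listed above. The cost function is passed in as a callback, demo_cost is a toy cost used only for illustration, and the iteration cap is a safeguard added for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x, y; } Point;
typedef int (*CostFn)(int x, int y);

/* Six-point hexagon with step 2 (coordinates assumed, not from the text). */
static const int Hexagon_x[6] = { 2, 1, -1, -2, -1, 1 };
static const int Hexagon_y[6] = { 0, 2, 2, 0, -2, -2 };
/* Diamond pattern with step 1, as listed in the text. */
static const int Diamond_x[4] = { -1, 0, 1, 0 };
static const int Diamond_y[4] = { 0, 1, 0, -1 };

/* Moves the center to the cheapest pattern point until the center itself
 * is the cheapest, or until max_iter moves have been made. */
static Point refine(Point c, const int *px, const int *py, int n,
                    CostFn cost, int max_iter)
{
    int best = cost(c.x, c.y);
    for (int it = 0; it < max_iter; ++it) {
        Point next = c;
        for (int i = 0; i < n; ++i) {
            int v = cost(c.x + px[i], c.y + py[i]);
            if (v < best) {
                best = v;
                next.x = c.x + px[i];
                next.y = c.y + py[i];
            }
        }
        if (next.x == c.x && next.y == c.y)
            break;              /* center is the minimum: stage finished */
        c = next;
    }
    return c;
}

/* Stage 1: hexagon, step 2. Stage 2: diamond, step 1. */
Point extended_hexagon_search(Point start, CostFn cost)
{
    assert(cost != 0);
    Point p = refine(start, Hexagon_x, Hexagon_y, 6, cost, 64);
    return refine(p, Diamond_x, Diamond_y, 4, cost, 64);
}

/* Toy cost with its minimum at (5, -3), for demonstration only. */
int demo_cost(int x, int y)
{
    return abs(x - 5) + abs(y + 3);
}
```

In an encoder the callback would return the rate-distortion cost of the candidate motion vector rather than the toy distance used here.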

Ninth step: end the motion estimation search, save the relevant motion estimation coding information in the output file out.264, and output the encoded bitstream.

The experimental results are shown in Table 1. Table 1 shows that, compared with the UMHexagonS motion estimation coding method adopted in the H.264/AVC standard, the method of the present invention increases the peak signal-to-noise ratio slightly, by 0.007 dB on average, with no loss of video quality; keeps the bit rate strictly controlled, with an average increase of 0.29%, which preserves the high compression ratio of the original standard algorithm; and greatly shortens the motion estimation coding time, saving 18.46% on average. The experimental statistics confirm the effectiveness of the present invention. The method of the present invention is also highly portable and can be combined with other fast video coding methods (such as inter mode selection methods) to further increase video coding speed.

Table 1. Comparison of motion estimation coding performance between the New-UMHexagonS algorithm of the present invention and the UMHexagonS algorithm

Claims

1. A fast motion estimation video coding method based on macroblock characteristics, in which a motion estimation search template and a dynamic fast search method are designed according to the characteristics of the current predicted macroblock and the inter prediction coding mode; according to the motion intensity of the current predicted macroblock, combined with the selection result of the inter prediction coding mode, the search range is adjusted dynamically and the corresponding motion estimation search criterion is selected adaptively, thereby realizing fast motion estimation predictive coding of inter macroblocks, characterized in that the method specifically comprises the following steps:

Step 3: determine a highly accurate initial search point: the Lagrangian rate-distortion optimization function is adopted as the decision criterion for motion estimation, and the best matching prediction block and the optimal motion vector in the rate-distortion sense are selected so that the bit allocation for the motion vector and residual coding is minimized; the selection of the optimal motion vector using the Lagrangian rate-distortion criterion can be described as:

J_{Motion}(mv, ref | λ_{Motion}) = SAD[s, r(ref, mv)] + λ_{Motion} [R(mv - pred) + R(ref)] \quad (1)

In formula (1), J_Motion is the rate-distortion cost RDcost_Motion of the currently predicted motion vector; s is the pixel value of the current macroblock; mv is the current motion vector and pred is the prediction vector; ref is the selected reference frame; r(ref, mv) is the pixel value of the reference macroblock; R is the number of bits consumed by differential coding of the motion vector, including the bits for coding the difference between the motion vector and its predicted value and the bits for coding the reference frame; λ_Motion is the Lagrange multiplier; SAD is the sum of absolute differences between the pixels of the current block and the reference block:

SAD[s, r(ref, mv)] = \sum_{x=1}^{B_{1}} \sum_{y=1}^{B_{2}} | s(x, y) - r(x - m_{x}, y - m_{y}) | \quad (2)
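Formula (2) can be sketched in C as below. The sketch assumes 8-bit samples, a current block stored row by row with B_1 samples per row, and a caller that keeps the displaced block inside the reference plane; the name block_sad and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Sum of absolute differences as in formula (2).
 *   s          : current block, b1 samples wide and b2 samples high
 *   ref        : reference luma plane, ref_stride samples per row
 *   (x0, y0)   : position of the block's top-left sample in the frame
 *   (mx, my)   : motion vector; the reference sample is r(x - mx, y - my)
 * The caller must ensure that all addressed reference samples exist. */
int block_sad(const unsigned char *s, int b1, int b2,
              const unsigned char *ref, int ref_stride,
              int x0, int y0, int mx, int my)
{
    assert(s != 0 && ref != 0);
    assert(b1 > 0 && b2 > 0 && ref_stride > 0);

    int sad = 0;
    for (int y = 0; y < b2; ++y) {
        for (int x = 0; x < b1; ++x) {
            int cur = s[y * b1 + x];
            int prd = ref[(y0 + y - my) * ref_stride + (x0 + x - mx)];
            sad += abs(cur - prd);
        }
    }
    return sad;
}
```

The rate-distortion cost of a candidate motion vector would then add the Lagrangian-weighted bit cost to this value.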

J_{Mode}(s, c, MODE | λ_{Mode}) = SSD(s, c, MODE | QP) + λ_{Mode} \times R(s, c, MODE | QP) \quad (3)

In formula (3), MODE denotes an inter coding mode of the current macroblock; J_Mode(s, c, MODE | λ_Mode) denotes the rate-distortion cost RDcost_Mode under the mode MODE; s is the original video signal; c is the reconstructed video signal after coding with the mode MODE; λ_Mode is the Lagrange multiplier; R(s, c, MODE | QP) is the total number of bits, depending on the mode and the quantization parameter, including the macroblock header information, the motion vectors and all DCT block information; it is obtained by actually coding the block, so its computational cost is high; QP is the coding quantization step; SSD(s, c, MODE | QP) is the sum of squared differences between the original signal and the reconstructed signal, that is:

SSD(s, c, MODE | QP) = \sum_{x=1, y=1}^{B_{1}, B_{2}} (S_{Y}[x, y] - C_{Y}[x, y, MODE | QP])^{2} + \sum_{x=1, y=1}^{B_{1}, B_{2}} (S_{U}[x, y] - C_{U}[x, y, MODE | QP])^{2} + \sum_{x=1, y=1}^{B_{1}, B_{2}} (S_{V}[x, y] - C_{V}[x, y, MODE | QP])^{2} \quad (4)

Step 3.1: use three predictive coding modes to choose the best motion vector (MV), namely the median prediction MV_{pred_space} based on the spatial domain; the upper-layer prediction MV_{pred_uplayer} based on the multiple macroblock partition sizes used for motion estimation in the H.264 standard; and the reference frame motion vector prediction MV_{pred_ref} based on the time domain;

(b) MV_{pred_uplayer}: upper-layer prediction, which exploits the multiple macroblock partition sizes used for motion estimation in H.264, following the hierarchical search order of mode 1: 16×16, mode 2: 16×8, mode 3: 8×16, mode 4: 8×8, mode 5: 8×4, mode 6: 4×8, mode 7: 4×4, and takes the already obtained motion vector of the co-located block one level up (the up layer), which is twice the size;

Step 5.1: the rate-distortion cost of the motion vector of the current predicted macroblock is calculated according to formula (1) and denoted RDcost_Motion, and the rate-distortion cost of the current predicted macroblock under the given mode is calculated according to formula (3) and denoted RDcost_Mode; the smallest predicted rate-distortion cost is selected and denoted RD_mincost, that is: RD_mincost = min{Pred_space_mincost, Pred_uplayer_mincost, Pred_ref_mincost};

Step 5.3: calculate the motion intensity adjustment factors γ and δ:

γ = \frac{Bsize[blocktype]}{pred\_mincost^{2}} - α_{Radii1}[blocktype]

δ = \frac{Bsize[blocktype]}{pred\_mincost^{2}} - α_{Radii2}[blocktype] \quad (5)

α_Radii1[1] = -0.23; α_Radii1[2] = -0.23; α_Radii1[3] = -0.23;

α_Radii2[1] = -2.39; α_Radii2[2] = -2.40; α_Radii2[3] = -2.40;

Step 5.4: judge the motion intensity of the current predicted macroblock:

Step 9: output the motion estimation coding information to the file Out.264, and record the motion estimation time, bit rate and peak signal-to-noise ratio in order to evaluate the quality of the coding algorithm.