CN101833768A

CN101833768A - Method and system for carrying out reliability classification on motion vector in video

Info

Publication number: CN101833768A
Application number: CN 200910119520
Authority: CN
Inventors: 韩博; 吴伟国; 陈正华
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2009-03-12
Filing date: 2009-03-12
Publication date: 2010-09-15

Abstract

The invention provides a method and a system for classifying the reliability of motion vectors in a video. The method comprises a block-by-block classification step, which performs motion vector reliability classification on each block obtained by partitioning a designated frame of an input video, and a frame-by-frame classification step, which optimizes the motion vector reliability classification within the designated frame according to the classifications of all blocks in that frame, so as to obtain the optimal classification of the motion vectors of all blocks in the designated frame. The accuracy of motion vector reliability classification in video can thereby be improved.

Description

Method and system for reliability classification of motion vectors in video

Technical field

The present application relates to the processing and analysis of video (moving images). Specifically, it relates to a method and system for classifying the reliability of motion vectors in a video, in which the motion information extracted from a video (in this application, motion vectors) is classified according to its reliability.

Background technology

The motion information of an image region extracted from video data is a key feature commonly used in video processing and analysis, and it is desirable that this information reflect the motion of the targets in the video as accurately as possible. However, a given set of pixels in a video is not necessarily unique in space, nor unchanging over time. Consequently, only part of the motion information obtained by existing methods is reliable, that is, able to reflect the real motion of the targets in the video; unreliable motion information has a negative effect on video processing and analysis. Here, "reliable" means that the motion accurately describes the change in position, between different video frames, of the physical target corresponding to the region (or, in animation, of the corresponding imaginary physical target).

With existing effective motion vector search methods (for example, a full search with a sufficiently large search range), the search results fall mainly into two cases: reliable and unreliable. Existing techniques for evaluating the reliability of motion vectors in video mainly use three kinds of information: the texture features of the image region, the error features of image-region matching, and the correlation between different regions. However, these techniques still have shortcomings in practice.

Summary of the invention

To remedy the shortcomings of the prior art in application, and to identify unreliable motion vectors or extract the corresponding reliable motion vectors, the present application proposes a new and more effective method for motion vector reliability classification.

In one embodiment of the invention, a method for classifying the reliability of motion vectors in a video is provided, comprising: a partitioning and search step of partitioning a designated frame of an input video into blocks and searching for the motion vector of a given block of the designated frame so as to generate block-matching errors for that block; a texture feature extraction step of extracting the texture features of the given block; and a block-by-block classification step of classifying the reliability of the motion vector of the given block according to the block-matching errors and texture features of that block, whereby the motion vector is assigned to one of three classes: reliable, mismatch, or multi-match.

In another embodiment of the invention, a system for classifying the reliability of motion vectors in a video is provided, comprising: a partitioning and search module for partitioning a designated frame of an input video into blocks and searching for the motion vector of a given block of the designated frame so as to generate block-matching errors for that block; a texture feature extraction module for extracting the texture features of the given block; and a block-by-block classification module for classifying the reliability of the motion vector of the given block according to the block-matching errors and texture features of that block, whereby the motion vector is assigned to one of three classes: reliable, mismatch, or multi-match.

In another embodiment of the invention, a method for classifying the reliability of motion vectors in a video is provided, comprising: a block-by-block classification step of performing motion vector reliability classification on each block obtained by partitioning a designated frame of an input video; and a frame-by-frame classification step of optimizing the motion vector reliability classification within the designated frame according to the motion vector reliability classification of each block in the designated frame, so as to obtain the optimal classification of the motion vectors of the blocks in the designated frame.

In another embodiment of the invention, a system for classifying the reliability of motion vectors in a video is provided, comprising: a block-by-block classification module for performing motion vector reliability classification on each block obtained by partitioning a designated frame of an input video; and a frame-by-frame classification module for optimizing the motion vector reliability classification within the designated frame according to the motion vector reliability classification of each block in the designated frame, so as to obtain the optimal classification of the motion vectors of the blocks in the designated frame.

In one embodiment of the invention, unreliable motion vectors are divided into mismatches and multi-matches according to the cause of their unreliability; introducing the mismatch and multi-match classes improves system performance in some applications (for example, video shot transition detection).

In yet another embodiment of the invention, the correlation of motion vectors between adjacent regions of the video image is used to improve the accuracy of motion vector reliability classification.

In yet another embodiment of the invention, a statistical method is used to optimize the motion vector reliability classification of an entire frame, yielding a globally optimal result for the frame and thereby improving the accuracy of motion vector reliability classification.

In yet another embodiment of the invention, the block-matching error and the texture strength within the image block are combined and improved as basic features, and the correlation of motion vectors between adjacent blocks in the image is used to obtain a globally optimal motion vector classification. This gives the classification very high accuracy. For example, for mismatched motion vectors, the recall of this embodiment (the ratio of the number of qualifying motion vectors found by the invention to the actual number of motion vectors satisfying the condition) and its precision (among the qualifying motion vectors found by the invention, the proportion whose reliability classification agrees with the actual reliability classification) can both exceed 97%; for reliable and multi-match motion vectors, recall and precision both exceed or approach 90%. A more accurate reliability classification can in turn further improve system performance in some applications (for example, global motion estimation in video).
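The recall and precision figures quoted above can be computed per class from paired lists of predicted and manually labeled classes. The following is a minimal sketch under that reading; the function name and the class strings are illustrative assumptions, not taken from the source.

```python
def recall_precision(predicted, actual, target):
    """Recall and precision for one reliability class.

    predicted, actual: equal-length sequences of class labels
    target: the class of interest (for example "mismatch")
    """
    true_pos = sum(1 for p, a in zip(predicted, actual)
                   if p == target and a == target)
    n_actual = sum(1 for a in actual if a == target)        # truly in the class
    n_predicted = sum(1 for p in predicted if p == target)  # assigned to the class
    recall = true_pos / n_actual if n_actual else 0.0
    precision = true_pos / n_predicted if n_predicted else 0.0
    return recall, precision
```

Calling the function once per class gives the three recall and precision pairs discussed in the text.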

Description of drawings

Fig. 1 is a flowchart of a method according to an embodiment of the invention.

Fig. 2 is a detailed flowchart of step 101 in Fig. 1.

Fig. 3 is a detailed flowchart of steps 102 and 103 in Fig. 1.

Fig. 4 (A) and (B) show two examples of the difference operators used in step 301 of Fig. 3, where A is a pair of 2 × 2 operators (horizontal and vertical) and B is a pair of 3 × 3 operators (horizontal and vertical).

Fig. 5 is a schematic diagram of the motion vectors corresponding to the block-matching errors used in step 302 of Fig. 3.

Fig. 6 is a detailed flowchart of step 104 in Fig. 1.

Fig. 7 (1) and (2) show examples of unreliable motion vectors.

Fig. 8 shows an example of directional texture feature extraction for image blocks.

Fig. 9 (A)-(C) show example results of motion vector reliability classification by a method according to an embodiment of the invention.

Figure 10 is a block diagram of a system according to an embodiment of the invention.

Figure 11 shows an example of a wipe transition between video shots.

Embodiment

The implementation of methods and systems according to specific embodiments of the invention is described below with reference to the drawings.

With existing effective motion vector search methods (for example, a full search with a sufficiently large search range), there are two main cases in which the search result is unreliable. The first is a mismatch: the target corresponding to the current image region does not exist in the reference frame, or has changed greatly (in color, texture, shape, size, and so on). The second is a multi-match: the reference frame contains several regions, not completely overlapping one another, that are each very close to the current image region, so that the region found by the search and the current image region do not correspond to the same physical target.

Fig. 7 shows examples of unreliable motion vectors, where (2) shows the current frame and the block partition of its central region, and (1) shows the reference frame and the motion search results for the corresponding blocks; the motion vectors of the blocks marked X are mismatches, and those of the blocks marked O are multi-matches.

Fig. 1 is a flowchart of a method according to an embodiment of the invention. As shown in Fig. 1, in step 101 (partitioning and search step), a designated frame of the input video (the current video frame) is partitioned into blocks, the motion vector (MV) of each block is searched for, and block-matching errors are generated. In step 102 (texture feature extraction step), the texture features of a given block (the current block) of the designated frame are extracted. In step 103 (block-by-block classification step), MV reliability classification is performed by block; that is, according to the block-matching errors and texture features of the given block, its motion vector is assigned to one of three classes: reliable, mismatch, or multi-match. Repeating steps 102 and 103 completes the MV reliability classification of every block of the current video frame. In step 104 (frame-by-frame classification step), the MV classification is optimized frame by frame; that is, the motion vector reliability classification is optimized within the designated frame according to the motion vector reliability classification of each block in that frame, so as to obtain the optimal classification of the motion vectors of the blocks in the designated frame.

Fig. 2 is a detailed flowchart of step 101 in Fig. 1. As shown in Fig. 2, in step 201 a lower-resolution grayscale image (for example, 352 × 288 or 352 × 240) is generated from the input video. A low resolution is chosen so that image regions reflecting clearly visible motion within a frame do not contain too many pixels, which improves computational efficiency. In step 202, the pixels in the central region of the current video frame are partitioned into blocks with a template of predetermined size (for example, 16 × 16 or 32 × 32). In step 203, for a given block of the current video frame, the reference frame corresponding to the current frame is searched for the closest block (that is, the reference-frame block with the minimum matching error to the given block) and its position, thereby determining the motion vector with the minimum block-matching error. In step 204, the block-matching errors between the given block of the current frame and the blocks adjacent to the block found in step 203 are calculated. Repeating steps 203 and 204 yields the block-matching errors of every block of the current video frame.
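Steps 201 and 202 can be sketched as follows, assuming the frame has already been resized to the low resolution and is stored as an RGB NumPy array. The BT.601 luma weights are a common choice for grayscale conversion; the function names and the `margin` parameter that defines the central region are hypothetical, since the source does not state how large the central region is.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB frame to grayscale with BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

def central_blocks(gray, block=16, margin=32):
    """Yield (top, left) corners of block x block tiles in the central region.

    margin (hypothetical) is the border, in pixels, excluded on every side so
    that a later motion search can look beyond each block.
    """
    height, width = gray.shape
    for top in range(margin, height - margin - block + 1, block):
        for left in range(margin, width - margin - block + 1, block):
            yield top, left
```

For a 352 × 288 frame with 16 × 16 blocks and a 32-pixel margin, this produces 14 rows of 18 blocks.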

Following the steps shown in Fig. 2, the video image is converted in resolution and partitioned into blocks (the block size may be, for example, 8 × 8, 16 × 16 or 32 × 32), and the motion vector of each block is searched for in the reference frame. The motion vector search result for a given block of the current video frame is the pixel position X of the block center at which the matching error is smallest. The block-matching errors of the motion vectors corresponding to the positions adjacent to this search result (the position X of minimum matching error), as shown in Fig. 5, are then calculated; these are the block-matching errors between each adjacent block and the given block. Fig. 5 is a schematic diagram of the motion vectors corresponding to the block-matching errors used in step 302 of Fig. 3. Every motion vector of a given image block can be represented as having a common starting point in the current frame, namely the center of that block, while its other end point is the center of a block in the reference frame. In Fig. 5, X corresponds to the motion vector with the minimum block-matching error within the motion search range; the other marks correspond to motion vectors at pixel positions adjacent to X (1 or 2 pixels away in the horizontal and vertical directions, and 1 pixel away in the diagonal directions).

The block-matching error can be the sum of squared differences (SSD) or the sum of absolute differences (SAD). For an image block (Block), SSD and SAD are defined below, where I(x, y) is the gray value of the image at pixel position (x, y) (the subscript current denotes the current frame and the subscript reference denotes the reference frame), H and V denote the horizontal and vertical directions, and MV_H and MV_V are the components of the block's motion vector in the H and V directions. As described above, a motion vector can be represented as pointing from the image block in the current frame to an image block in the reference frame, so the block-matching error between the current-frame block and each reference-frame block can be expressed as the SSD or SAD of the corresponding motion vector.

SSD(MV_{H}, MV_{V}) = \sum_{(x, y) \in Block} [I_{current}(x, y) - I_{reference}(x + MV_{H}, y + MV_{V})]^{2}

SAD(MV_{H}, MV_{V}) = \sum_{(x, y) \in Block} | I_{current}(x, y) - I_{reference}(x + MV_{H}, y + MV_{V}) |
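The two error measures, and the exhaustive search of step 203 that uses them, can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: frames are 2-D grayscale NumPy arrays indexed as [row, column], and the function names and the `radius` search range are hypothetical.

```python
import numpy as np

def block_errors(current, reference, top, left, mv_h, mv_v, block=16):
    """Return (SSD, SAD) between a current-frame block and its displaced reference block."""
    cur = current[top:top + block, left:left + block].astype(np.float64)
    ref = reference[top + mv_v:top + mv_v + block,
                    left + mv_h:left + mv_h + block].astype(np.float64)
    diff = cur - ref
    return float(np.sum(diff ** 2)), float(np.sum(np.abs(diff)))

def full_search(current, reference, top, left, block=16, radius=8):
    """Search every displacement within +/- radius; return (best_mv, min_ssd)."""
    height, width = reference.shape
    best_mv, best_ssd = (0, 0), float("inf")
    for mv_v in range(-radius, radius + 1):
        for mv_h in range(-radius, radius + 1):
            r, c = top + mv_v, left + mv_h
            if r < 0 or c < 0 or r + block > height or c + block > width:
                continue  # candidate block would leave the reference frame
            ssd, _ = block_errors(current, reference, top, left, mv_h, mv_v, block)
            if ssd < best_ssd:
                best_mv, best_ssd = (mv_h, mv_v), ssd
    return best_mv, best_ssd
```

The returned motion vector is ordered (MV_H, MV_V), matching the argument order of SSD and SAD in the formulas.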

For positions that are symmetric about the motion vector search result (the position X of minimum block-matching error in Fig. 5), such as A and A′, B and B′, C and C′, D and D′, E and E′, and F and F′ in Fig. 5, only the smaller block-matching error of each pair, and the motion vector of its position, need be kept. In this way only 7 block-matching errors are stored per image block, including the minimum block-matching error at X. More or fewer block-matching errors may of course be stored.
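The pairwise reduction described above might look like the following sketch. The offset layout is an assumption inferred from the description of Fig. 5 (1 or 2 pixels horizontally and vertically, 1 pixel diagonally); the figure itself is not reproduced here, so the mapping of offsets to the labels A to F is not claimed.

```python
# Assumed layout of Fig. 5: six symmetric pairs of (dx, dy) offsets around X.
SYMMETRIC_PAIRS = [
    ((1, 0), (-1, 0)), ((2, 0), (-2, 0)),
    ((0, 1), (0, -1)), ((0, 2), (0, -2)),
    ((1, 1), (-1, -1)), ((1, -1), (-1, 1)),
]

def reduce_symmetric(errors):
    """Keep, for each symmetric pair, the offset with the smaller error.

    errors maps an offset (dx, dy) relative to X to its block-matching error.
    Returns a list of (offset, error) with one entry per pair.
    """
    kept = []
    for first, second in SYMMETRIC_PAIRS:
        best = first if errors[first] <= errors[second] else second
        kept.append((best, errors[best]))
    return kept
```

The six kept values plus the minimum error at X give the 7 block-matching errors stored per block.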

Fig. 3 is a detailed flowchart of steps 102 and 103 in Fig. 1.

As shown in Fig. 3, in step 301 texture features are extracted from the current image block and its surrounding pixels to obtain directional texture features. Using an image difference operator (for example, A or B in Fig. 4), the texture strength R(θ) of the current image block (Block) is calculated in each direction θ according to the following formula, where θ is the angle between the texture and the horizontal direction (X) of the frame; θ need only take a few discrete values, in the range −90° to +90°. The maximum and minimum of the texture strength are recorded, together with the direction θ₀ at which R(θ) attains its minimum. Here f_X(x, y) and f_Y(x, y) denote the local differences in the horizontal (X) and vertical (Y) directions at pixel position (x, y), which is the position within the current image block of the operator center Z in Fig. 4.

R(\theta) = \frac{\sin^{2}\theta \sum_{(x, y) \in Block} f_{X}^{2}(x, y) + \cos^{2}\theta \sum_{(x, y) \in Block} f_{Y}^{2}(x, y) - 2 \sin\theta \cos\theta \sum_{(x, y) \in Block} f_{X}(x, y) f_{Y}(x, y)}{\left( \sum_{(x, y) \in Block} f_{X}^{2}(x, y) \right) \left( \sum_{(x, y) \in Block} f_{Y}^{2}(x, y) \right) - \left( \sum_{(x, y) \in Block} f_{X}(x, y) f_{Y}(x, y) \right)^{2}}

The two difference operators A and B shown in Fig. 4 each denote an operation in which the gray values of the image within a specific region are summed with weights: each number in the operator is the weight applied at the corresponding pixel position (a blank denotes a weight of 0), and Z marks the center of the operator. In each of (A) and (B), the left and right operators compute the horizontal and vertical differences respectively; the two operators in (A) are used as a pair, as are the two operators in (B).

Fig. 8 shows an example of directional texture feature extraction for image blocks. The upper part of Fig. 8 gives four examples of image block types, in order: a smooth image block, an image block with a unidirectional edge at 0° to the horizontal direction (X) of the frame, an image block with a unidirectional edge at 45° to the horizontal direction (X) of the frame, and an image block with rich texture. The lower part of Fig. 8 plots the directional texture strength function of these image blocks: the vertical axis is the texture strength of the image block and the horizontal axis is the angle θ between the texture and the horizontal direction (X) of the frame. The minimum of the texture strength corresponds to the principal texture direction of the image block, that is, θ₀ above. The four curves labeled (1), (2), (3) and (4) in the lower part of Fig. 8 give the values of the texture strength function R at different angles θ for the four image blocks in the upper part.
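The texture strength function R(θ) of step 301 can be evaluated numerically as in the sketch below. Two points are assumptions: central differences from `np.gradient` stand in for the Fig. 4 operators, whose weights are not reproduced in this text, and a small `eps` guards the denominator, which vanishes for a perfectly smooth block or a block with a single straight edge. The function names and the 15° angular step are hypothetical.

```python
import numpy as np

def texture_strength(block, thetas_deg, eps=1e-12):
    """Evaluate R(theta) for one grayscale block at the given angles (degrees)."""
    block = block.astype(np.float64)
    f_y, f_x = np.gradient(block)  # differences along rows (Y) and columns (X)
    s_xx = np.sum(f_x * f_x)
    s_yy = np.sum(f_y * f_y)
    s_xy = np.sum(f_x * f_y)
    det = max(s_xx * s_yy - s_xy ** 2, eps)  # denominator of R(theta)
    theta = np.deg2rad(np.asarray(thetas_deg, dtype=np.float64))
    num = (np.sin(theta) ** 2 * s_xx + np.cos(theta) ** 2 * s_yy
           - 2.0 * np.sin(theta) * np.cos(theta) * s_xy)
    return num / det

def texture_features(block, step_deg=15):
    """Return (R_min, R_max, theta_0), where theta_0 minimises R(theta)."""
    thetas = np.arange(-90, 90, step_deg)  # R has period 180 degrees
    r = texture_strength(block, thetas)
    return float(r.min()), float(r.max()), float(thetas[int(np.argmin(r))])
```

Sampling θ on the half-open range from −90° to +90° is sufficient because every term of R(θ) repeats with a period of 180°.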

In step 302, block-matching stability features are extracted from the block-matching errors obtained during the motion information search (steps 203 and 204 of Fig. 2), combined with the directional texture features, to obtain surface-fitting coefficients. Specifically, a quadric surface is fitted to the directional texture features and block-matching errors obtained above according to the following formulas, where c is set to the block-matching error of the motion vector search result and θ is set to the direction at which the texture strength R(θ) is minimal, that is, the principal texture direction. The block-matching errors SSD(MV_H, MV_V) of the adjacent blocks (the block-matching errors other than that of the search result; for example, the 6 other errors stored as described above) and the corresponding motion vector components MV_H and MV_V in the H and V (horizontal and vertical) directions are substituted into the formulas below, and the surface-fitting coefficients a and b are estimated by the least-squares method. The relative magnitudes of a, b and c indicate the stability of the block match: the smaller c is, the closer the current block is to the reference block; the larger a and b are, the faster the block-matching error grows as the motion vector departs from the optimum. A very small c together with very large a and b indicates that the match between the current block and the reference block is very stable.

aX'^{2} + bY'^{2} + c = SSD(MV_{H}, MV_{V})

\begin{cases} X' = X \cos\theta + Y \sin\theta \\ Y' = -X \sin\theta + Y \cos\theta \end{cases}

\begin{cases} X = MV_{H} - MV_{H}^{best} \\ Y = MV_{V} - MV_{V}^{best} \end{cases}

The second terms on the right-hand sides of the two equations in the second system above, MV_H^best and MV_V^best, are the components in the H and V (horizontal and vertical) directions of the best match found by the motion vector search algorithm, that is, of the motion vector with the minimum block-matching error.
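The least-squares estimate of a and b in step 302 can be sketched with `np.linalg.lstsq`. With c and θ fixed as described, the model is linear in a and b: each stored neighbor contributes one equation a·X′² + b·Y′² = SSD − c. The function name and argument layout are illustrative assumptions.

```python
import numpy as np

def fit_surface(samples, best_mv, theta_deg, c):
    """Least-squares estimate of a and b in a*X'^2 + b*Y'^2 + c = SSD.

    samples:   iterable of (mv_h, mv_v, ssd) for the neighbouring motion vectors
    best_mv:   (mv_h_best, mv_v_best), the motion vector search result
    theta_deg: principal texture direction in degrees
    c:         minimum block-matching error (SSD at best_mv)
    """
    data = np.asarray(list(samples), dtype=np.float64)
    x = data[:, 0] - best_mv[0]
    y = data[:, 1] - best_mv[1]
    theta = np.deg2rad(theta_deg)
    x_rot = x * np.cos(theta) + y * np.sin(theta)    # X'
    y_rot = -x * np.sin(theta) + y * np.cos(theta)   # Y'
    design = np.column_stack([x_rot ** 2, y_rot ** 2])
    target = data[:, 2] - c
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(a), float(b)
```

With 6 stored neighbors and 2 unknowns the system is overdetermined, which is why a least-squares solution is used rather than an exact one.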

In step 303, the reliability of the current block's motion information is classified from the directional texture features and surface-fitting coefficients by a classifier obtained through statistical learning. That is, a, b and c obtained above and the minimum and maximum of the texture strength serve as 5 features for evaluating motion information reliability, and the statistically learned classifier assigns the motion vector to one of three classes: reliable, mismatch, or multi-match. To obtain the classifier, a sufficient amount of video data is collected, the motion vectors of its image blocks are manually labeled with their reliability class to give the correct classification, and the corresponding classification features are calculated. The correspondence between reliability class and classification features is then compiled statistically to yield a mapping function, which is the statistically learned classifier.
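The source does not name the learning method, so the sketch below swaps in a nearest-centroid rule purely as a stand-in to show the data flow: five features (a, b, c, R_min, R_max) per block in, one of three class labels out. The class name, the feature scaling and any training values used with it are assumptions for illustration only.

```python
import numpy as np

class NearestCentroid:
    """Toy stand-in for the learned classifier: nearest class mean in scaled feature space."""

    def fit(self, features, labels):
        features = np.asarray(features, dtype=np.float64)
        labels = np.asarray(labels)
        # Standardise each feature so that large SSD values do not dominate.
        self.mean_ = features.mean(axis=0)
        self.scale_ = features.std(axis=0) + 1e-12
        scaled = (features - self.mean_) / self.scale_
        self.classes_ = np.unique(labels)
        self.centroids_ = np.stack(
            [scaled[labels == k].mean(axis=0) for k in self.classes_])
        return self

    def predict(self, features):
        scaled = (np.asarray(features, dtype=np.float64) - self.mean_) / self.scale_
        # Distance from every sample to every class centroid.
        dist = np.linalg.norm(scaled[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[np.argmin(dist, axis=1)]
```

Any classifier trained on manually labeled blocks could take the place of this class, provided it maps the same five features to the three reliability classes.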

Fig. 6 is a detailed flowchart of step 104 in Fig. 1.

As shown in Fig. 6, in step 601 a Markov random field (MRF) model is built, according to the probability distribution, from a large amount of video data with manually assigned motion information reliability classes (manually labeled correct classifications), giving an MRF model of the reliability classes of the motion information of all blocks in a frame. In other words, the Markov random field model is built from a data set of block-by-block motion information classifications.

In step 602, the energy function of a Gibbs random field (GRF) model is built from the block-by-block reliability classification of the motion information of all blocks in the frame, obtained by the flow of Fig. 3, and from the MRF model obtained in step 601.

In this process, the data set of results obtained in step 303 of Fig. 3 (the block-by-block reliability classification of the motion information of all blocks in the frame) serves as the observation g of the Markov random field, and f denotes a realization of the Markov random field, that is, the values of the reliability classes of all blocks in the frame.

The purpose of the flow of step 104 shown in Fig. 6 is to obtain the optimal classification of the motion vectors of the blocks in a frame. This requires the maximum a posteriori (MAP) realization of the Markov random field, which is the optimal reliability classification. To obtain it, Bayes' rule is used to convert the maximization of the posterior probability P(f|g) into the maximization of the product of the prior probability P(f) and the conditional probability P(g|f): P(f|g) ∝ P(f) P(g|f).

According to the Hammersley-Clifford theorem, a Markov random field is equivalent to a Gibbs random field. The prior probability of the Markov random field can therefore be expressed by a Gibbs distribution:

P(f) = Z^{-1} \exp\left[ -\sum_{c \in C} V_{c}(f_{p}, f_{q}) \right]

where V_c(f_p, f_q) is the potential energy of adjacent blocks p and q, Z is a normalization constant, f_p and f_q are the values of the reliability classification of image blocks p and q respectively, and C is the set of all pairs of adjacent blocks. As the derivation below shows, maximizing the posterior probability can be converted into minimizing the sum of two energy functions.

P(f) P(g | f) \propto \exp\left[ -\sum_{c \in C} V_{c}(f_{p}, f_{q}) \right] \prod_{p \in P} P(g_{p} | f_{p})

In the above formula, the conditional probability P(g_p | f_p) is the probability that the classification result of step 303 is g_p when the true (correct) value of the reliability classification is f_p. Taking the logarithm of both sides and maximizing it, the right-hand side becomes the expression below, and its maximization is converted into minimizing the sum of two energy functions.

-\sum_{c \in C} V_{c}(f_{p}, f_{q}) + \ln\left( \prod_{p \in P} P(g_{p} | f_{p}) \right)

= -\left\{ \frac{1}{2} \sum_{p \in P} \sum_{q \in N_{p}} V_{c}(f_{p}, f_{q}) - \sum_{p \in P} \ln P(g_{p} | f_{p}) \right\}

= -\left\{ E_{smoothness}(f) + E_{data}(f) \right\}

where N_p is the set of image blocks adjacent to image block p, P is the set of image blocks in the image, and E_smoothness(f) is the energy introduced by the difference between the relationship among the reliability classes of adjacent blocks in the optimized result and the statistical regularity. For example, suppose two adjacent image blocks have identical motion vectors. If statistics show that adjacent blocks with identical motion vectors are very likely both to be reliable, then the energy for classifying both as reliable in the optimized result is small, and the energy for the other cases is larger. E_data(f) is the energy introduced by the difference between the optimized result (the output of step 104) and the observations (the output of step 103), and can be calculated directly. The physical meaning is that the less probable a situation, the larger its energy; hence the energy is to be minimized.

According to the Markov property between adjacent blocks p and q, V_c(f_p, f_q) is defined by the following formula.

In this formula, the values of u_{p,q}, v_{p,q} and w_{p,q} depend on the probabilities of the values taken by f_p and f_q: u_{p,q} = −β ln(P_A), v_{p,q} = −β ln(P_B) and w_{p,q} = −β ln(P_C), where β is a constant and P_A, P_B and P_C are the probabilities that cases A, B and C occur. These probabilities depend on the difference between the motion information of the adjacent blocks, and statistical frequencies are used as the probability values. To capture the difference in motion information, the magnitude of the motion vector difference between adjacent blocks, ||MV_p − MV_q||, is divided into a finite number of segments, and P_A, P_B and P_C are then counted for each segment. The statistics are gathered per segment rather than per distinct value because samples with large motion vector differences are scarce; segmenting ensures that each segment contains enough samples for probability estimation.
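The per-segment statistics might be gathered as in the sketch below. It assumes each labeled pair of adjacent blocks has already been assigned to case A, B or C by the rule in the V_c formula, which this text does not spell out, and that the frequencies are normalized within each segment. The function name, the `bin_edges` values and the `floor` that avoids ln(0) are hypothetical.

```python
import numpy as np

def pair_potentials(mv_diffs, cases, bin_edges, beta=1.0, floor=1e-6):
    """Estimate u, v, w = -beta * ln(P_A), ln(P_B), ln(P_C) per segment.

    mv_diffs:  ||MV_p - MV_q|| for each labelled pair of adjacent blocks
    cases:     "A", "B" or "C" for each pair
    bin_edges: increasing interior boundaries of the segments
    Returns an array of shape (n_segments, 3) with columns u, v, w.
    """
    mv_diffs = np.asarray(mv_diffs, dtype=np.float64)
    cases = np.asarray(cases)
    segment = np.digitize(mv_diffs, bin_edges)  # segment index of every pair
    n_segments = len(bin_edges) + 1
    potentials = np.empty((n_segments, 3))
    for s in range(n_segments):
        in_seg = cases[segment == s]
        for k, name in enumerate("ABC"):
            # Statistical frequency of the case within this segment.
            freq = np.mean(in_seg == name) if in_seg.size else 0.0
            potentials[s, k] = -beta * np.log(max(freq, floor))
    return potentials
```

Wider segments at large motion vector differences keep the sample count per segment high enough for the frequencies to be meaningful.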

The above process therefore yields the energy function of the Gibbs random field model, E = E_smoothness(f) + E_data(f).
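The total energy can be evaluated for a candidate labeling as in the following sketch. It makes simplifying assumptions that are not from the source: blocks form a regular grid with 4-connected neighbors, and the pairwise potential is a fixed symmetric table indexed by the two class labels, whereas in the text it also depends on the motion vector difference of the two blocks.

```python
import numpy as np

def total_energy(f, g, pair_potential, likelihood):
    """E = E_smoothness + E_data for a label grid f given block-wise results g.

    f, g:           2-D integer arrays of class indices (one entry per block)
    pair_potential: K x K symmetric array, V_c(f_p, f_q) for neighbouring blocks
    likelihood:     K x K array, likelihood[g_p, f_p] = P(g_p | f_p), all > 0
    """
    f = np.asarray(f)
    g = np.asarray(g)
    pair_potential = np.asarray(pair_potential, dtype=np.float64)
    likelihood = np.asarray(likelihood, dtype=np.float64)
    # Each unordered neighbour pair is counted once, which equals the
    # half-weighted double sum over p and q in N_p for a symmetric V_c.
    e_smooth = (pair_potential[f[:, :-1], f[:, 1:]].sum()
                + pair_potential[f[:-1, :], f[1:, :]].sum())
    e_data = -np.log(likelihood[g, f]).sum()
    return float(e_smooth + e_data)
```

A labeling that agrees with the observations and with its neighbors has low energy; disagreement with either raises it.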

In step 603, an optimization method is used to solve for f by minimizing the above energy function of the GRF model, which gives the maximum a posteriori solution of the MRF. Conventional optimization methods, such as belief propagation or graph cuts, can be used to minimize the energy function, that is, to find the maximum a posteriori realization of the Markov model, and thereby obtain the globally optimal (optimal over the whole video frame) classification of the motion vectors of the blocks.
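As a compact illustration of energy minimization, the sketch below uses iterated conditional modes (ICM), a simpler greedy method swapped in for the belief propagation and graph cut methods named in the text. ICM only reaches a local minimum, so it should be read as a demonstration of the objective, not as a substitute for those methods. The grid layout, the 4-neighborhood and the label-only pairwise table are assumptions.

```python
import numpy as np

def icm(g, pair_potential, likelihood, sweeps=10):
    """Approximately minimise E_smoothness + E_data by iterated conditional modes.

    g:              2-D integer array of block-wise class indices (observations)
    pair_potential: K x K symmetric array of pairwise potentials
    likelihood:     K x K array, likelihood[g_p, f_p] = P(g_p | f_p), all > 0
    """
    g = np.asarray(g)
    pair_potential = np.asarray(pair_potential, dtype=np.float64)
    data_cost = -np.log(np.asarray(likelihood, dtype=np.float64))
    f = g.copy()  # start from the block-wise classification
    rows, cols = f.shape
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Cost of every candidate label for block (r, c).
                costs = data_cost[g[r, c], :].copy()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        costs += pair_potential[:, f[rr, cc]]
                best = int(np.argmin(costs))
                if best != f[r, c]:
                    f[r, c] = best
                    changed = True
        if not changed:
            break  # no label changed in a full sweep
    return f
```

In a small grid where one block disagrees with all of its neighbors, a strong enough pairwise penalty makes the isolated label flip to match them.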

Fig. 9 shows example results of motion vector reliability classification by a method according to an embodiment of the invention; the motion vectors of black blocks are mismatches, those of gray blocks are multi-matches, and those of white blocks are reliable. In Fig. 9, (A) is the reference frame, (B) is the result of motion vector reliability classification after the current frame is partitioned into 16 × 16 blocks, and (C) is the result after the current frame is partitioned into 32 × 32 blocks.

It should be noted that the input to the flow of step 104 shown in Fig. 6 (that is, the input to step 602) is not limited to the output of step 103. It may be the result of any other block-by-block MV reliability classification method, for example the reliability classes of the blocks in the designated frame obtained with an existing motion vector reliability classification method, that is, the two classes reliable and unreliable; it may also be a block-by-block MV reliability classification with more than three classes. Passing these per-block classification results to step 602 and following the flow of Fig. 6 likewise yields the globally optimal (optimal over the whole video frame) classification of the motion vectors of the blocks. In these cases, the only difference is that the probabilities underlying V_c(f_p, f_q) correspond to the values that f_p and f_q take under the MV reliability classification method used (reliable and unreliable, or the more than three classes); the rest of the process is as described above and is not repeated here.

Figure 10 is the system chart according to an embodiment of the invention.Described system comprises piecemeal and search module 1001, is used for designated frame (the being current video frame) piecemeal with input video, and to the physical block searching motion vector of described designated frame so that described physical block is generated block-matching error; Texture feature extraction module 1002 is used to extract the textural characteristics of described physical block (being current block); By block sort module 1003, be used for carrying out the MV reliability classification by piece, that is,, motion vector is divided into reliably, mates by mistake, mates more three classes according to the described block-matching error and the textural characteristics of described physical block; And sort module 1004 frame by frame, being used for carrying out frame by frame the MV classification optimizes, promptly, motion vector reliability classification according to each piece in the described designated frame, in described designated frame, carry out the optimization of motion vector reliability classification, so that obtain the optimal classification of the motion vector of each piece in the described designated frame.

Corresponding to the method shown in Figure 1, each module of the system according to one embodiment of the invention shown in Figure 10 can perform the functions of the steps shown in Figures 2, 3 and 6, which are not repeated here.

The method and system proposed in this application do not merely divide motion vectors into the two classes reliable and unreliable; they further divide unreliable motion vectors into the two types mismatch and multiple-match, thereby providing richer useful information. In the embodiments of the invention given in this application, block-wise reliability classification uses the features of the current image block and of the surroundings of its corresponding region in the reference frame. In addition, the motion and the motion reliability information of the blocks adjacent to the current block can be used, together with a statistical model, to globally optimize the motion information reliability classification of the whole frame. In the block-wise reliability classification, not only the block-matching stability features obtained during the motion vector search are used, but also the directional texture features of the current image block itself. The embodiments given in this application emphasize the overall scheme of motion vector reliability classification and its concrete implementation, the feature extraction methods required for reliability classification, and the global optimization of the classification results.

The method and system for classifying the reliability of motion vectors in a video proposed in this application are applicable to the processing and analysis of videos of various formats and contents, and their use is not restricted by the image blocking scheme or the motion search method.

The invention can be applied to fields such as the accurate extraction of video motion information, the segmentation of video objects, and the segmentation, retrieval and analysis of video. Several examples of applying the invention are described below, but the application of the invention is not limited to the fields of these examples.

1. Global camera motion parameter estimation

Estimating the global motion (operation) parameters of the camera is an important problem in video analysis. The estimation of these parameters generally depends on the motion vector information extracted from the video. Analysis shows that often only about half of the motion vectors obtained with existing methods are reliable, and the unreliable motion vectors act as noise in the parameter estimation and can seriously degrade the accuracy of the estimate. By using the invention, only reliable motion vectors are used and the influence of unreliable information is removed, so that more accurate global camera motion parameters are obtained.
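The idea of estimating global motion from reliable vectors only can be sketched as follows. This is an illustrative example, not the estimation method of the application: the pan-plus-zoom model u = a·x + tx, v = a·y + ty, the tuple layout of a block record, and the label string "reliable" are all assumptions.

```python
from typing import List, Tuple

# One record per block: (x, y, u, v, label). This layout is an assumption.
Block = Tuple[float, float, float, float, str]


def estimate_pan_zoom(blocks: List[Block]) -> Tuple[float, float, float]:
    """Least-squares fit of u = a*x + tx, v = a*y + ty using reliable blocks only."""
    pts = [(x, y, u, v) for x, y, u, v, label in blocks if label == "reliable"]
    if not pts:
        raise ValueError("no reliable motion vectors available")
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    mu = sum(p[2] for p in pts) / n
    mv = sum(p[3] for p in pts) / n
    # Closed-form solution of the normal equations for the zoom factor a.
    num = sum((x - mx) * (u - mu) + (y - my) * (v - mv) for x, y, u, v in pts)
    den = sum((x - mx) ** 2 + (y - my) ** 2 for x, y, u, v in pts)
    a = num / den if den > 0 else 0.0
    return a, mu - a * mx, mv - a * my


if __name__ == "__main__":
    blocks = [
        (0.0, 0.0, 2.0, -1.0, "reliable"),
        (10.0, 0.0, 3.0, -1.0, "reliable"),
        (0.0, 10.0, 2.0, 0.0, "reliable"),
        (10.0, 10.0, 3.0, 0.0, "reliable"),
        (5.0, 5.0, 50.0, 50.0, "mismatch"),  # ignored by the fit
    ]
    print(estimate_pan_zoom(blocks))
```

In this sketch the mismatch block with the large vector never enters the fit, which is the effect described above; a real system would likely use a richer motion model.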

2. Video shot transition detection

Shot transition detection is often the first step in processing and analysing edited video. Traditional methods based on histograms and texture features still do not achieve satisfactory detection results. Using the invention can significantly improve the accuracy of shot transition detection. By using the change over time of the proportion of each class of motion vectors in each frame, shot cuts can be detected more accurately. In addition, the spatial and temporal distribution of each class of motion vectors in the images and in the video, the global camera motion parameters estimated from reliable motion vectors, and the various motion statistics (in the spatial and temporal domains) weighted by motion vector reliability can help improve the detection of gradual shot transitions.
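A minimal sketch of using the per-frame class proportions for cut detection is given below. The decision rule (a sharp rise in the mismatch proportion relative to the previous frame), the threshold value, and the label names are assumptions chosen for illustration; the application does not specify a particular rule.

```python
from collections import Counter
from typing import Dict, List, Sequence

CLASSES = ("reliable", "mismatch", "multi_match")  # assumed label names


def class_ratios(labels: Sequence[str]) -> Dict[str, float]:
    """Proportion of each motion vector class among the blocks of one frame."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (counts[c] / total if total else 0.0) for c in CLASSES}


def detect_cuts(frames: List[Sequence[str]], jump: float = 0.5) -> List[int]:
    """Return indices i where the mismatch ratio rises by more than `jump`
    compared with frame i-1 (hypothetical rule and threshold)."""
    ratios = [class_ratios(f) for f in frames]
    cuts = []
    for i in range(1, len(ratios)):
        if ratios[i]["mismatch"] - ratios[i - 1]["mismatch"] > jump:
            cuts.append(i)
    return cuts


if __name__ == "__main__":
    steady = ["reliable"] * 9 + ["mismatch"]
    broken = ["reliable"] * 2 + ["mismatch"] * 8
    print(detect_cuts([steady, steady, broken, steady]))
```

Only the rise is flagged, so a single frame whose vectors span the cut produces one detection rather than two.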

3. Moving object region detection and segmentation

Moving object regions in a video can be detected and segmented according to the difference between the local motion vectors and the global motion model, but this requires the motion information to be reliable. In regions where a moving object changes rapidly, the motion vectors are often unreliable. By using the invention, such motion vectors can be classified as the mismatch type, and regions of this kind generally correspond to moving objects and their edges. The motion vectors of smooth regions are also often unreliable. By using the invention, such motion vectors can be classified as the multiple-match type, and excluding the influence of these regions allows moving object regions to be found more accurately.
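The selection logic described above can be sketched as follows. The purely translational global motion, the distance threshold `min_diff`, the dictionary layout, and the label names are illustrative assumptions, not details taken from the application.

```python
from typing import Dict, List, Tuple

BlockId = Tuple[int, int]  # (row, column) of a block; assumed indexing


def moving_object_blocks(
    vectors: Dict[BlockId, Tuple[float, float]],
    labels: Dict[BlockId, str],
    global_motion: Tuple[float, float],
    min_diff: float = 2.0,
) -> List[BlockId]:
    """Pick candidate moving-object blocks from classified motion vectors."""
    gx, gy = global_motion
    selected = []
    for block, (u, v) in vectors.items():
        label = labels[block]
        if label == "multi_match":
            continue  # smooth region: the vector carries no usable evidence
        if label == "mismatch":
            selected.append(block)  # typically a moving object or its edge
            continue
        # Reliable vector: compare with the (assumed translational) global motion.
        if ((u - gx) ** 2 + (v - gy) ** 2) ** 0.5 > min_diff:
            selected.append(block)
    return sorted(selected)


if __name__ == "__main__":
    vectors = {(0, 0): (1.0, 0.0), (0, 1): (1.0, 0.0), (1, 0): (6.0, 0.0),
               (1, 1): (9.0, 9.0), (2, 0): (0.0, 0.0)}
    labels = {(0, 0): "reliable", (0, 1): "reliable", (1, 0): "reliable",
              (1, 1): "mismatch", (2, 0): "multi_match"}
    print(moving_object_blocks(vectors, labels, (1.0, 0.0)))
```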

4. Video frame rate up-conversion (including interlaced-to-progressive conversion)

The purpose of video frame rate up-conversion is to estimate the temporally missing frame data from the video frames already obtained, and the purpose of interlaced-to-progressive conversion is to estimate the spatially missing line data from the image lines already obtained. Because in both cases the missing data are estimated from (part of) the data already obtained, the reliability of the motion information is very important. By using the invention, a region classified as mismatch can be divided into several smaller region units whose motion vectors are searched separately, thereby improving the precision of the motion vectors, recovering the missing data more accurately, and improving the quality of the output video.

5. Detection and matching of wipes between video shots

Shot transitions can be divided into two classes, cuts and gradual transitions, and a wipe is one kind of gradual transition. In a wipe, the image of the following shot grows gradually from a certain region according to some rule until it completely covers the image of the preceding shot. A wipe accompanied by a logo graphic entering and leaving the picture is called a graphic wipe. During a wipe, the motion vectors of the image blocks located at the boundary between the pictures of the preceding and following shots in each frame often belong to the mismatch type (as shown in Figure 11).

Figure 11 illustrates an example of wipes between video shots. The two series of frames in Figure 11 are the frames numbered 1, 2, 3, 4 and 5 in temporal order in transition A and transition B, respectively. Motion vector reliability classification is performed on the frames with corresponding sequence numbers in the two segments, and the positions outlined in black indicate blocks whose motion vectors are very likely to be mismatches. If A and B belong to the same type of wipe, then in corresponding frames the motion vectors of the blocks at corresponding positions have a high probability of both being mismatches.

Therefore, wipes can be detected according to the distribution, over several frames within a certain period, of the image blocks whose motion vectors are mismatches. Moreover, by using the information on whether the motion vector of the block at each position is a mismatch in each temporally corresponding frame of two transitions, it can be determined whether the two gradual transitions belong to the same type of wipe. This information is helpful for detecting replay shots in sports video.
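The comparison of two transitions can be sketched as below. The similarity measure (mean Jaccard overlap of the mismatch-block positions in temporally corresponding frames) is an assumption chosen for illustration; the application only states that the per-position mismatch information of corresponding frames is used.

```python
from typing import List, Set, Tuple

BlockId = Tuple[int, int]  # (row, column) of a block; assumed indexing


def jaccard(a: Set[BlockId], b: Set[BlockId]) -> float:
    """Overlap of two sets of mismatch-block positions (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def wipe_similarity(transition_a: List[Set[BlockId]],
                    transition_b: List[Set[BlockId]]) -> float:
    """Mean overlap between frame k of transition A and frame k of transition B."""
    n = min(len(transition_a), len(transition_b))
    if n == 0:
        return 0.0
    return sum(jaccard(transition_a[k], transition_b[k]) for k in range(n)) / n


if __name__ == "__main__":
    a = [{(0, 0), (1, 0)}, {(0, 1), (1, 1)}]  # mismatch blocks per frame
    b = [{(0, 0), (1, 0)}, {(0, 1), (1, 1)}]
    c = [{(0, 0)}, {(5, 5)}]
    print(wipe_similarity(a, b), wipe_similarity(a, c))
```

A high score suggests that the two transitions sweep the same block positions in the same order; the threshold above which two wipes count as the same type would have to be chosen empirically.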

This application describes only specific embodiments and implementations of the invention. Various improvements, variations, and other embodiments and implementations can be made on the basis of what is described in this application.

For example, in addition to the method according to one embodiment of the invention shown in Figure 1 and the system according to one embodiment of the invention shown in Figure 10, another embodiment of the method according to the invention may comprise only steps 101-103 shown in Figure 1, and another embodiment of the system according to the invention may comprise only the blocking and search module 1001, the texture feature extraction module 1002 and the block-wise classification module 1003 shown in Figure 10.

Yet another embodiment of the method according to the invention may comprise only steps 103 and 104 shown in Figure 1, or may further comprise step 101 or step 102; and yet another embodiment of the system according to the invention may comprise only the block-wise classification module 1003 and the frame-wise classification module 1004 shown in Figure 10, or may further comprise the blocking and search module 1001 or the texture feature extraction module 1002. The details are as described above and are not repeated here.

Claims

1. A method for classifying the reliability of motion vectors in a video, comprising:

a block-wise classification step of performing motion vector reliability classification on each block obtained by partitioning a designated frame of an input video; and

a frame-wise classification step of optimizing the motion vector reliability classification within the designated frame according to the motion vector reliability classification of each block in the designated frame, so as to obtain the optimal classification of the motion vector of each block in the designated frame.

2. The method according to claim 1, wherein the frame-wise classification step comprises:

establishing a Markov random field (MRF) model according to the probability distribution obtained from a large amount of video data containing manual motion vector reliability classification results;

taking the motion vector reliability classification of each block in the designated frame as the observed values of the Markov random field, and establishing the energy function of a Gibbs random field (GRF) model by means of the equivalence between the Markov random field (MRF) and the Gibbs random field (GRF); and

solving by minimizing the energy function of the GRF model, so as to obtain the optimal classification of the motion vector of each block in the designated frame.

3. The method according to claim 2, wherein the energy function is the sum of two energy terms, one of which is:

\frac{1}{2} \sum_{p \in P} \sum_{q \in N_{p}} V_{c}(f_{p}, f_{q})

where V_c(f_p, f_q) denotes the potential energy of the adjacent blocks p and q, f_p and f_q denote the values of the reliability classification results of blocks p and q respectively, N_p denotes the set of image blocks adjacent to image block p, and P denotes the set of image blocks in the image.

4. The method according to claim 2, wherein motion vectors are divided into three classes, namely reliable, mismatch and multiple-match, and, according to the Markov property between adjacent blocks p and q, V_c(f_p, f_q) is defined as shown in the following formula,

In the above formula, the values of u_{p,q}, v_{p,q} and w_{p,q} are all related to the probability of the values of f_p and f_q, and are respectively u_{p,q} = -β ln(P_A), v_{p,q} = -β ln(P_B) and w_{p,q} = -β ln(P_C), where β is a constant and the probability values P_A, P_B and P_C are the probabilities that situations A, B and C occur, respectively; these probabilities are related to the motion vector difference of the adjacent blocks, and statistical frequency values are used as the probability values.

5. The method according to claim 1, further comprising:

a blocking and search step of partitioning the designated frame of the input video into blocks and searching for the motion vector of a particular block of the designated frame so as to generate block-matching errors for the particular block;

wherein the block-wise classification step performs motion vector reliability classification on the particular block according to the block-matching errors of the particular block.

6. The method according to claim 1, further comprising:

a texture feature extraction step of extracting the texture features of the blocks;

wherein the block-wise classification step performs motion vector reliability classification on the blocks according to the texture features of the blocks.

7. The method according to claim 1, further comprising:

a blocking and search step of partitioning the designated frame of the input video into blocks and searching for the motion vector of a particular block of the designated frame so as to generate block-matching errors for the particular block; and

a texture feature extraction step of extracting the texture features of the particular block;

wherein the block-wise classification step performs motion vector reliability classification on the particular block according to the block-matching errors and the texture features of the particular block.

8. The method according to claim 5, wherein the blocking and search step comprises:

generating a lower-resolution grayscale image from the input video;

dividing the central region of the grayscale image of the designated frame into a plurality of blocks using a template of predetermined size;

for the particular block, searching the reference frame corresponding to the designated frame for the block having the minimum block-matching error with respect to the particular block and for its position, thereby determining the motion vector with the minimum block-matching error; and

calculating the block-matching errors between the particular block and the blocks adjacent to the block position corresponding to the minimum block-matching error.

9. The method according to claim 8, wherein:

the adjacent blocks are at a distance of 1 or 2 pixels in the horizontal and vertical directions, and 1 pixel in the diagonal directions, from the block position corresponding to the minimum block-matching error; and, for adjacent blocks located symmetrically about the block position corresponding to the minimum block-matching error, only the smaller block-matching error and the motion vector of its corresponding position are retained.

10. The method according to claim 6, wherein the texture feature extraction step comprises:

calculating, from the particular block and its surrounding pixels, the texture strength of the particular block in each direction, and obtaining the maximum and minimum values of the texture strength of the particular block and its principal texture direction.

11. The method according to claim 7, wherein the block-wise classification step comprises:

performing quadric surface fitting according to the block-matching errors and the texture features of the particular block to obtain surface fitting coefficients; and

using the texture features and the surface fitting coefficients as features for motion vector reliability evaluation, performing motion vector reliability classification on the particular block with a classifier obtained by statistical learning.

12. A system for classifying the reliability of motion vectors in a video, comprising:

a block-wise classification module for performing motion vector reliability classification on each block obtained by partitioning a designated frame of an input video; and

a frame-wise classification module for optimizing the motion vector reliability classification within the designated frame according to the motion vector reliability classification of each block in the designated frame, so as to obtain the optimal classification of the motion vector of each block in the designated frame.

13. The system according to claim 12, wherein the frame-wise classification module comprises:

means for establishing a Markov random field (MRF) model according to the probability distribution obtained from a large amount of video data containing manual motion vector reliability classification results;

means for taking the motion vector reliability classification of each block in the designated frame as the observed values of the Markov random field and establishing the energy function of a Gibbs random field (GRF) model by means of the equivalence between the Markov random field (MRF) and the Gibbs random field (GRF); and

means for solving by minimizing the energy function of the GRF model, thereby obtaining the optimal classification of the motion vector of each block in the designated frame.

14. The system according to claim 13, wherein the energy function is the sum of two energy terms, one of which is:

\frac{1}{2} \sum_{p \in P} \sum_{q \in N_{p}} V_{c}(f_{p}, f_{q})

15. The system according to claim 14, wherein motion vectors are divided into three classes, namely reliable, mismatch and multiple-match, and, according to the Markov property between adjacent blocks p and q, V_c(f_p, f_q) is defined as shown in the following formula,

16. The system according to claim 12, further comprising:

a blocking and search module for partitioning the designated frame of the input video into blocks and searching for the motion vector of a particular block of the designated frame so as to generate block-matching errors for the particular block;

wherein the block-wise classification module performs motion vector reliability classification on the particular block according to the block-matching errors of the particular block.

17. The system according to claim 12, further comprising:

a texture feature extraction module for extracting the texture features of the blocks;

wherein the block-wise classification module performs motion vector reliability classification on the blocks according to the texture features of the blocks.

18. The system according to claim 12, further comprising:

a blocking and search module for partitioning the designated frame of the input video into blocks and searching for the motion vector of a particular block of the designated frame so as to generate block-matching errors for the particular block; and

a texture feature extraction module for extracting the texture features of the particular block;

wherein the block-wise classification module performs motion vector reliability classification on the particular block according to the block-matching errors and the texture features of the particular block.

19. The system according to claim 16, wherein the blocking and search module comprises:

means for generating a lower-resolution grayscale image from the input video;

means for dividing the central region of the grayscale image of the designated frame into a plurality of blocks using a template of predetermined size;

means for searching, for the particular block, the reference frame corresponding to the designated frame for the block having the minimum block-matching error with respect to the particular block and for its position, thereby determining the motion vector with the minimum block-matching error; and

means for calculating the block-matching errors between the particular block and the blocks adjacent to the block position corresponding to the minimum block-matching error.

20. The system according to claim 19, wherein:

21. The system according to claim 17, wherein:

the texture feature extraction module calculates, from the particular block and its surrounding pixels, the texture strength of the particular block in each direction, and obtains the maximum and minimum values of the texture strength of the particular block and its principal texture direction.

22. The system according to claim 18, wherein the block-wise classification module comprises:

means for performing quadric surface fitting according to the block-matching errors and the texture features of the particular block to obtain surface fitting coefficients; and

means for performing motion vector reliability classification on the particular block with a classifier obtained by statistical learning, using the texture features and the surface fitting coefficients as features for motion vector reliability evaluation.