CN100486336C - Real time method for segmenting motion object based on H.264 compression domain - Google Patents

Real time method for segmenting motion object based on H.264 compression domain

Info

Publication number
CN100486336C
CN100486336C, CN200610116363A
Authority
CN
China
Prior art keywords
motion vector
motion
sigma
vector field
matrix
Prior art date
Legal status
Expired - Fee Related
Application number
CN 200610116363
Other languages
Chinese (zh)
Other versions
CN1960491A
Inventor
刘志
张兆杨
陆宇
Current Assignee
Shanghai University
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN 200610116363 priority Critical patent/CN100486336C/en
Publication of CN1960491A publication Critical patent/CN1960491A/en
Application granted granted Critical
Publication of CN100486336C publication Critical patent/CN100486336C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The only information the segmentation relies on is the motion vector field, uniformly sampled on a 4×4-block grid, extracted from the H.264 video stream. First, the motion vector fields of several consecutive frames are normalized and then iteratively back-projected to obtain an accumulated motion vector field in which the salient motion vectors are enhanced. Global motion compensation is then applied to the accumulated motion vector field, while a fast region-growing algorithm segments it into multiple regions according to motion similarity. With the invention, a computer with a 3.0 GHz CPU and 512 MB of memory can smoothly process CIF-format video sequences and play the video stream at 25 fps with good segmentation quality. The invention can also be used for moving object segmentation in the MPEG compressed domain.

Description

Real-time method for segmenting moving objects in the H.264 compressed domain
Technical field
The present invention relates to a real-time method for segmenting moving objects in the H.264 compressed domain. In contrast to existing methods, it avoids full decoding of the compressed video: only the motion vectors extracted by entropy decoding are used as the motion feature required for segmentation, so the amount of computation is small. The method is not limited to video sequences with a static background; for sequences with either a moving or a static background, the moving objects can be segmented quickly and reliably. Because the method uses only motion vector field information, it is equally applicable to moving object segmentation in the MPEG compressed domain.
Background art
Moving object segmentation is a necessary precondition for numerous content-based multimedia applications such as video indexing and retrieval, intelligent video surveillance, video editing and face recognition. Since MPEG-4 introduced content-based video coding, research on moving object segmentation has mostly concentrated on the pixel domain; only in recent years has segmentation in the compressed domain begun to attract attention. Compared with pixel-domain methods, performing moving object segmentation in the compressed domain is better suited to the needs of practical applications. In particular, most video sequences in practice are already compressed in some format, so segmenting directly in that compressed domain avoids fully decoding the compressed video. Moreover, the amount of data to be processed in the compressed domain is much smaller than in the pixel domain, so the computational load is small. In addition, the motion vectors and DCT coefficients extracted from the compressed video by entropy decoding alone can be used directly as the motion and texture features required for segmentation. Segmenting moving objects in the compressed domain is therefore fast, which overcomes the difficulty traditional pixel-domain methods have in meeting real-time requirements and makes the approach suitable for the many application scenarios that demand real-time operation.
Although compressed-domain moving object segmentation methods have been proposed, they are basically aimed at the MPEG-2 compressed domain. H.264 is the latest video coding standard, with roughly twice the coding efficiency of MPEG-2, and more and more applications are switching from MPEG-2 to H.264; yet research on moving object segmentation in the H.264 compressed domain remains scarce. Compared with the MPEG compressed domain, the DCT coefficients of I frames in the H.264 compressed domain cannot be used directly as texture features for segmentation, because the transform is applied to the spatial prediction residual of a block rather than to the original block. Hence, the only feature that can be used directly for moving object segmentation in the H.264 domain is the motion vector information. So far, only Zeng et al. have proposed a method in the H.264 domain, which segments moving objects from the sparse motion vector field with a block-based MRF model: each block is given a different label according to the magnitude of its motion vector, and the blocks belonging to the moving object are labelled by maximizing the posterior probability of the MRF. However, that method is only applicable to video sequences with a static background, its segmentation accuracy is not high, and its computational cost is considerable.
Summary of the invention
The purpose of the present invention is to provide a real-time method for segmenting moving objects in the H.264 compressed domain, in which the only information used for segmentation is the motion vector field, uniformly sampled on a 4×4-block grid, extracted from the H.264 compressed video. The method avoids full decoding of the compressed video and uses the motion vectors extracted by entropy decoding as the motion feature required for segmentation, so the computational load is small and the purpose of real-time moving object segmentation is achieved.
To achieve the above purpose, the concept of the present invention is as follows:
As shown in Fig. 1, motion vectors are extracted from the input H.264 compressed video stream and normalized, and iterative backward projection is then performed to obtain an accumulated motion vector field in which the significant motion information is clearly enhanced. Global motion compensation is then carried out and the accumulated motion vector field is divided into a plurality of regions according to motion similarity; a matching matrix is then used to represent the correlation between the segmented regions of the current frame and the projections of the previous frame's moving objects onto the current frame, and the moving objects are segmented on the basis of this matching matrix.
According to the above concept, the technical scheme of the present invention is:
A real-time method for segmenting moving objects in the H.264 compressed domain, characterized in that the motion vector fields of several consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field; global motion compensation is then applied to the accumulated motion vector field while a fast statistical region-growing algorithm divides it into a plurality of regions according to motion similarity; using these two results, the matching-matrix-based moving object segmentation method proposed by the present invention segments the moving objects, and it can effectively handle the various situations arising in a video sequence, such as the tracking and updating of objects, the merging and splitting of objects, and the appearance and disappearance of objects. The steps are:
A. Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;
B. Accumulated motion vector field: use the motion vector fields of several consecutive frames to perform iterative backward projection and obtain a more reliable accumulated motion vector field;
C. Global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;
D. Region segmentation: use a statistical region-growing method to divide the accumulated motion vector field into a plurality of regions with similar motion;
E. Object segmentation: segment the moving objects with the matching-matrix-based segmentation method.
The steps of the motion vector field normalization are:
(1) temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance;
(2) spatial normalization: assign the motion vector of each block larger than 4×4 directly to all the 4×4 blocks covered by that block.
The steps of the accumulated motion vector field are:
(1) use the motion vector fields of several frames after the current frame and back-project the motion vector field of each adjacent frame: the projected motion vector of the current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing; the scale factors are chosen as follows: if the total area of the overlapping regions is larger than half of the current block area, the scale factor of each projecting block is the area of its overlap with the current block divided by the total overlapping area of all projecting blocks with the current block; otherwise, the scale factor of each projecting block is the ratio of its overlapping area to the current block area;
(2) starting from the last frame, accumulate iteratively to obtain the accumulated motion vector field of the current frame.
The steps of the global motion compensation are:
(1) estimate the global motion vector field with a 6-parameter affine motion model:
1. Model parameter initialization: let $m=(m_1,m_2,m_3,m_4,m_5,m_6)$ be the parameter vector of the global motion model; the model parameters are initialized as $m^{(0)}=\begin{bmatrix}1 & 0 & \frac{1}{N}\sum_{i=1}^{N}mvx_i & 0 & 1 & \frac{1}{N}\sum_{i=1}^{N}mvy_i\end{bmatrix}^T$;
2. Outlier rejection: first compute, for the i-th block of the current frame with centre coordinates $(x_i,y_i)$, its estimated centre coordinates $(x_i',y_i')$ in the previous frame, $\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$; the deviation $(ex_i,ey_i)$ of the predicted motion vector from the original accumulated motion vector $(mvx_i,mvy_i)$ is then $ex_i=x_i'-x_i-mvx_i$, $ey_i=y_i'-y_i-mvy_i$; compute the prediction deviation $(ex_i,ey_i)$ of every 4×4 block with this formula, build the histogram of the squared deviation magnitudes $ex_i^2+ey_i^2$, and reject the motion vectors whose squared deviation magnitude falls in the largest 25% of the histogram;
3. Model parameter update: update the model parameters with the motion vectors remaining from the previous step and the Newton-Raphson method; the new parameter vector $m^{(l)}$ at iteration $l$ is defined as $m^{(l)}=m^{(l-1)}-H^{-1}b$, where the Hessian matrix $H$ and the gradient vector $b$ are computed as
$H=\begin{bmatrix}\sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i & 0 & 0 & 0\\ \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i & 0 & 0 & 0\\ \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1 & 0 & 0 & 0\\ 0 & 0 & 0 & \sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i\\ 0 & 0 & 0 & \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i\\ 0 & 0 & 0 & \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1\end{bmatrix}$
$b=\begin{bmatrix}\sum_{i\in R}x_iex_i & \sum_{i\in R}y_iex_i & \sum_{i\in R}ex_i & \sum_{i\in R}x_iey_i & \sum_{i\in R}y_iey_i & \sum_{i\in R}ey_i\end{bmatrix}^T$
where $R$ denotes the set of remaining blocks;
4. Termination condition: repeat steps 2 and 3 at most 5 times, and terminate the iteration early if either of the following two conditions is satisfied: (i) compute $m^{(l)}-m_{static}$, where $m_{static}=[1\ 0\ 0\ 0\ 1\ 0]^T$ is the global motion vector for a static camera; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends; (ii) compute the difference between $m^{(l)}$ and $m^{(l-1)}$; if the components $m_3$ and $m_6$ of this difference are smaller than 0.01 and the other components are smaller than 0.0001, the iteration ends;
5. Substitute the resulting global motion model parameter vector $m$ into $\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$ to obtain the estimated coordinates $(x_i',y_i')$ in the previous frame, and finally obtain the global motion vector field;
(2) compute the residual between the global motion vector field and the accumulated motion vector field for each 4×4 block.
The region segmentation uses a statistical region-growing algorithm to divide the accumulated motion vector field into a plurality of regions with similar motion; the steps are as follows:
(1) compute the motion dissimilarity measure of every pair of adjacent blocks in the 4-neighbourhood;
(2) sort all adjacent block pairs in increasing order of their motion dissimilarity measure;
(3) merge the adjacent block pair with the smallest motion dissimilarity measure and start the region-growing process there; at each growing step the two current blocks belong to two adjacent regions, and the criterion for merging these two regions is whether the difference of their average motion vectors is smaller than the threshold
$\Delta(R)=\frac{SR^2}{2Q|R|}\left(\min(SR,|R|)\log(1+|R|)+2\log 6wh\right)$,
where $SR$ is the dynamic range of the motion vectors, $|R|$ is the number of motion vectors contained in the region, $wh$ is the size of the motion vector field, and the parameter $Q$ controls the degree of division of the motion vector field, so that the motion vector field can be divided moderately into several regions with similar motion;
(4) compute the mean residual of each segmented region after global motion compensation;
(5) distinguish the most reliable background region from the regions where other objects may lie: among the segmented regions whose area is larger than 10% of the whole motion vector field, select the one with the smallest mean residual as the reliable background region and label it $R_0^t$; the remaining regions are regions where moving objects may exist, labelled $R_i^t$ ($i=1,2,\ldots,M$); finally the M object regions and 1 background region segmented from the current frame carry distinct labels, and the segmentation result is denoted $R^t$.
The object segmentation uses the moving object segmentation result already obtained for the previous frame (time t-1) to judge whether each segmented region of the current frame (time t) matches some object of the previous frame, and constructs the matching matrix from this; based on the matching matrix it identifies the tracking and updating of objects, the merging and splitting of objects, the appearance of new objects and the disappearance of old objects, and finally obtains the moving objects of the current frame. The steps are:
(1) use backward projection to obtain the projection region in the current frame (time t) of each object of the previous frame (time t-1): first label the N moving objects $O_i^{t-1}$ ($i=1,2,\ldots,N$) and the background object $O_0^{t-1}$ of the previous frame, then use backward projection to obtain the projection region of each previous-frame object in the current frame; specifically, the matched position in the previous frame of any block of the current accumulated motion vector field is obtained from the difference between the block's coordinates and its accumulated motion vector, and the object label of the block at that matched position in the previous frame is projected onto the current frame and labelled block by block, the result being denoted $Proj(O^{t-1})$;
(2) construct the matrix $CM_t$, which records the overlapping area of each segmented region with each object projection; the matrix $CMR_t$, which records the proportion of each segmented region that falls inside each object projection; and the matrix $CMC_t$, which records the proportion of each object projection that falls inside each segmented region; from the label images $R^t$ and $Proj(O^{t-1})$, the three matrices $CM_t$, $CMR_t$, $CMC_t$ of M+1 rows and N+1 columns are constructed; the value of any element $CM_t(i,j)$ is the number of pixels labelled $i$ in $R^t$ and labelled $j$ in $Proj(O^{t-1})$, i.e. the overlapping area of the segmented region $R_i^t$ and the object projection $Proj(O_j^{t-1})$; each element of row $i$ of $CMR_t$ is the proportion of the segmented region $R_i^t$ that falls inside each object projection, and each element of column $j$ of $CMC_t$ is the proportion of the projection of object $O_j^{t-1}$ that falls inside each segmented region;
(3) construct the matrix $CMM_t$, which expresses the degree of correlation between the segmented regions of the current frame and the object projections; $CMM_t$ records the correlation between $R^t$ and $Proj(O^{t-1})$ reflected by $CMR_t$ and $CMC_t$; $CMM_t$ is first set to the zero matrix of M+1 rows and N+1 columns; $CMR_t$ is then scanned row by row to find the position of each row maximum, and 1 is added to the element at the corresponding position of $CMM_t$; $CMC_t$ is then scanned column by column to find the position of each column maximum, and 2 is added to the element at the corresponding position of $CMM_t$; in the resulting matrix $CMM_t$ the rows correspond in turn to the background region $R_0^t$ and the moving regions $R_i^t$ ($i=1,2,\ldots,M$) of the current frame, and the columns correspond in turn to the background object $O_0^{t-1}$ and the moving objects $O_j^{t-1}$ ($j=1,2,\ldots,N$) of the previous frame; the possible values of each element are 0, 1, 2 and 3; any non-zero element $CMM_t(i,j)$ indicates that the segmented region $R_i^t$ and the object $O_j^{t-1}$ have a certain correlation, specifically:
1. $CMM_t(i,j)=1$ indicates that the segmented region $R_i^t$ largely belongs to the previous-frame object $O_j^{t-1}$;
2. $CMM_t(i,j)=2$ indicates that the previous-frame object $O_j^{t-1}$ is largely contained in the segmented region $R_i^t$;
3. $CMM_t(i,j)=3$ covers both of the above situations and indicates that $R_i^t$ and $O_j^{t-1}$ are very strongly correlated; a further comparison is then made: if $CMR_t(i,j)>CMC_t(i,j)$ then $CMM_t(i,j)=1$, otherwise $CMM_t(i,j)=2$; the value range of the finally generated $CMM_t$ is therefore 0, 1, 2;
(4) based on the matching matrix $CMM_t$, perform object segmentation for the five classes of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object; the matrix $CMM_t$ establishes the association between segmented regions and moving objects effectively and can handle the following five situations in a uniform way:
1. tracking and updating of a single object (1 → 1);
2. appearance of a new object (0 → 1);
3. merging of objects (m → 1);
4. splitting of an object (1 → m);
5. disappearance of an object (1 → 0).
Compared with the prior art, the present invention has the following outstanding features and advantages. The real-time moving object segmentation method based on the compressed domain provided by the invention operates on the H.264 video stream; the clear distinction from existing methods is that existing compressed-domain video object segmentation methods are mainly applicable to the MPEG domain, whereas the present invention is applicable not only to the H.264 compressed domain but equally to the MPEG compressed domain. Moreover, the invention is not limited to video sequences with a static background: whether the background is moving or static, the moving objects can be segmented quickly and reliably. In addition, the matching-matrix-based object segmentation method proposed by the invention can handle the various situations of video object motion in essentially real time, so the object segmentation quality is good and the method has wide applicability.
Description of drawings
Fig. 1 is the flow chart of the matching-matrix-based real-time method for segmenting moving objects in the H.264 compressed domain according to the present invention.
Fig. 2 is the block diagram of the motion vector field normalization and accumulation in Fig. 1.
Fig. 3 is the block diagram of the global motion compensation and region segmentation in Fig. 1.
Fig. 4 is the block diagram of the object segmentation in Fig. 1.
Fig. 5 shows the moving object segmentation results for typical frames of the sequence Coastguard (frames 4, 37, 61 and 208).
Fig. 6 shows the moving object segmentation results for typical frames of the sequence Mobile (frames 4, 43, 109 and 160).
Embodiment
An embodiment of the present invention is described in detail below with reference to the accompanying drawings:
The matching-matrix-based real-time method for segmenting moving objects in the H.264 compressed domain according to the present invention follows the flow chart shown in Fig. 1 and is implemented in software on a PC test platform with a 3.0 GHz CPU and 512 MB of memory; Fig. 5 and Fig. 6 show the simulation test results.
Referring to Fig. 1, the method enhances the significant motion information by normalizing and accumulating the motion vector field, then applies global motion compensation to the accumulated motion vector field, and uses the statistical region-growing algorithm together with the matching-matrix-based object segmentation method to segment the regions and the moving objects; the algorithm is simple, object segmentation is fast, and the segmentation quality is good.
The steps are:
(1) motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;
(2) accumulated motion vector field: use the motion vector fields of several consecutive frames to perform iterative backward projection and obtain a more reliable accumulated motion vector field;
(3) global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;
(4) region segmentation: use the statistical region-growing method to divide the accumulated motion vector field into a plurality of regions with similar motion;
(5) object segmentation: segment the moving objects with the matching-matrix-based segmentation algorithm.
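To make the data flow of these five steps concrete, the following Python sketch chains them over a stream of per-frame motion vector fields. The stage callables and their signatures are illustrative assumptions for exposition only; concrete sketches of the individual stages are given with the detailed steps below.

```python
from typing import Any, Callable, Iterable

def segment_stream(mv_fields: Iterable[Any],
                   accumulate: Callable, global_motion: Callable,
                   region_grow: Callable, match_objects: Callable,
                   group: int = 3):
    """Chain the five stages of Fig. 1 over already-normalized 4x4 motion vector fields.

    Yields (frame_index, object_label_map) once per accumulation group of `group` frames.
    """
    buffer, prev_objects = [], None
    for idx, mv in enumerate(mv_fields):
        buffer.append(mv)                              # output of step (1) for one frame
        if len(buffer) < group:
            continue
        acc = accumulate(buffer)                       # step (2): iterative backward projection
        m, gmv, residual = global_motion(acc)          # step (3): 6-parameter affine model
        regions = region_grow(acc)                     # step (4): statistical region growing
        prev_objects = match_objects(regions, residual, acc, prev_objects)  # step (5)
        yield idx, prev_objects
        buffer.clear()
```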
The motion vector field normalization of step (1) proceeds as follows:
1. temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance;
2. spatial normalization: assign the motion vector of each block larger than 4×4 directly to all the 4×4 blocks covered by that block.
The accumulation of the motion vector field of step (2) proceeds as follows:
1. use the motion vector fields of several frames after the current frame and back-project the motion vector field of each adjacent frame;
2. starting from the last frame, accumulate iteratively to obtain the accumulated motion vector field of the current frame.
The global motion compensation of step (3) proceeds as follows:
1. estimate the global motion vector field with a 6-parameter affine motion model;
2. compute the residual of each 4×4 block after global motion compensation.
The region segmentation of step (4) proceeds as follows:
1. compute the motion dissimilarity measure of every pair of adjacent blocks in the 4-neighbourhood;
2. sort all adjacent block pairs in increasing order of their motion dissimilarity measure;
3. merge the adjacent block pair with the smallest motion dissimilarity measure and start the region-growing process there;
4. compute the mean residual of each segmented region after global motion compensation;
5. distinguish the most reliable background region from the regions where other objects may lie.
The object segmentation of step (5) proceeds as follows:
1. use backward projection to obtain the projection region of each previous-frame object in the current frame;
2. construct the matrix $CM_t$, which records the overlapping area of each segmented region with each object projection; the matrix $CMR_t$, which records the proportion of each segmented region that falls inside each object projection; and the matrix $CMC_t$, which records the proportion of each object projection that falls inside each segmented region;
3. construct the matrix $CMM_t$, which expresses the degree of correlation between the segmented regions of the current frame and the object projections;
4. based on the matching matrix $CMM_t$, perform object segmentation for the five classes of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object.
The five steps of the present embodiment are further described below with reference to the overall flow chart (Fig. 1):
A. Motion vector field normalization:
As shown in Fig. 2, the motion vector of the current frame is divided by the number of frames between the current frame and its reference frame to obtain the temporal normalization, and the motion vector of each block of the current frame larger than 4×4 is assigned directly to all the 4×4 blocks covered by that block to obtain the spatial normalization.
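As an illustration of both normalizations, a minimal Python/NumPy sketch follows. It assumes the entropy decoder delivers, for each inter-coded partition, its position and size in 4×4-block units together with its motion vector and reference-frame distance; this data layout and the function name are assumptions for exposition, not part of the disclosed implementation.

```python
import numpy as np

def normalize_mv_field(blocks, field_h, field_w):
    """Build a dense 4x4-sampled motion vector field of shape (field_h, field_w, 2).

    blocks: iterable of (bx, by, bw, bh, mvx, mvy, ref_dist), where (bx, by) is the
    partition position and (bw, bh) its size, all in 4x4-block units, and ref_dist is
    the temporal distance (in frames) between the current frame and its reference frame.
    """
    mv_field = np.zeros((field_h, field_w, 2), dtype=np.float32)
    for bx, by, bw, bh, mvx, mvy, ref_dist in blocks:
        # Temporal normalization: divide by the current-to-reference frame distance.
        mv = np.array([mvx, mvy], dtype=np.float32) / max(ref_dist, 1)
        # Spatial normalization: assign the vector to every 4x4 block the partition covers.
        mv_field[by:by + bh, bx:bx + bw, :] = mv
    return mv_field
```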
B. Accumulated motion vector field:
As shown in Fig. 2, the motion vector fields of several frames after the current frame are used first, and the motion vector field of each adjacent frame is back-projected: the projected motion vector of the current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing; the scale factors are chosen as follows: if the total area of the overlapping regions is larger than half of the current block area, the scale factor of each projecting block is the area of its overlap with the current block divided by the total overlapping area of all projecting blocks with the current block; otherwise, the scale factor of each projecting block is the ratio of its overlapping area to the current block area. Then, starting from the last frame, the accumulation is performed iteratively to obtain the accumulated motion vector field of the current frame.
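A simplified sketch of this backward projection and accumulation is given below. It works on dense 4×4-sampled fields, computes the overlap of each displaced block with the grid cells it lands on, and applies the scale-factor rule described above; the iteration order over the buffered frames follows one reading of the text, and the crude handling of blocks that project outside the frame is a simplifying assumption.

```python
import numpy as np

BLOCK = 4  # 4x4 sampling grid, in pixels

def backproject(acc_next, mv_cur):
    """Project the accumulated field of the following frame back onto the current
    frame's 4x4 grid (overlap-weighted) and add the current frame's own vectors."""
    h, w, _ = mv_cur.shape
    weighted = np.zeros_like(mv_cur)               # sum of overlap-weighted projected vectors
    overlap_total = np.zeros((h, w), dtype=np.float32)
    for p in range(h):
        for q in range(w):
            mvx, mvy = acc_next[p, q]
            x0, y0 = q * BLOCK + mvx, p * BLOCK + mvy   # displaced block corner (pixels)
            gy0, gx0 = int(np.floor(y0 / BLOCK)), int(np.floor(x0 / BLOCK))
            for gy in (gy0, gy0 + 1):
                for gx in (gx0, gx0 + 1):
                    if not (0 <= gy < h and 0 <= gx < w):
                        continue
                    ox = max(0.0, min(x0 + BLOCK, (gx + 1) * BLOCK) - max(x0, gx * BLOCK))
                    oy = max(0.0, min(y0 + BLOCK, (gy + 1) * BLOCK) - max(y0, gy * BLOCK))
                    area = ox * oy
                    weighted[gy, gx] += area * acc_next[p, q]
                    overlap_total[gy, gx] += area
    # Scale-factor rule: normalize by the total overlap when it exceeds half the block
    # area, otherwise by the full block area.
    denom = np.where(overlap_total > BLOCK * BLOCK / 2, overlap_total, float(BLOCK * BLOCK))
    projected = weighted / np.maximum(denom, 1e-6)[..., None]
    return mv_cur + projected

def accumulate(mv_fields):
    """Iteratively accumulate fields [current, next, ..., last] onto the current frame."""
    acc = mv_fields[-1]
    for mv in reversed(mv_fields[:-1]):            # start from the last frame, move backwards
        acc = backproject(acc, mv)
    return acc
```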
C. Global motion compensation:
As shown in Fig. 3, a 6-parameter affine motion model is used to estimate the global motion vector field, and its difference from the accumulated motion vector field gives the residual of any block of the accumulated motion vector field after global motion compensation. The steps are:
(1) estimate the global motion vector field with the 6-parameter affine motion model:
1. Model parameter initialization:
Let $m=(m_1,m_2,m_3,m_4,m_5,m_6)$ be the parameter vector of the global motion model; the model parameters are initialized as $m^{(0)}=\begin{bmatrix}1 & 0 & \frac{1}{N}\sum_{i=1}^{N}mvx_i & 0 & 1 & \frac{1}{N}\sum_{i=1}^{N}mvy_i\end{bmatrix}^T$;
2. Outlier rejection:
First compute, for the i-th block of the current frame with centre coordinates $(x_i,y_i)$, its estimated centre coordinates $(x_i',y_i')$ in the previous frame:
$\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$.
The deviation $(ex_i,ey_i)$ of the predicted motion vector from the original accumulated motion vector $(mvx_i,mvy_i)$ is computed as $ex_i=x_i'-x_i-mvx_i$, $ey_i=y_i'-y_i-mvy_i$. The prediction deviation $(ex_i,ey_i)$ of every 4×4 block is computed with this formula, the histogram of the squared deviation magnitudes $ex_i^2+ey_i^2$ is then built, and the motion vectors whose squared deviation magnitude falls in the largest 25% of the histogram are rejected.
3. Model parameter update:
The model parameters are updated with the motion vectors remaining from the previous step and the Newton-Raphson method. The new parameter vector $m^{(l)}$ at iteration $l$ is defined as $m^{(l)}=m^{(l-1)}-H^{-1}b$, where the Hessian matrix $H$ and the gradient vector $b$ are computed as
$H=\begin{bmatrix}\sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i & 0 & 0 & 0\\ \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i & 0 & 0 & 0\\ \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1 & 0 & 0 & 0\\ 0 & 0 & 0 & \sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i\\ 0 & 0 & 0 & \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i\\ 0 & 0 & 0 & \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1\end{bmatrix}$
$b=\begin{bmatrix}\sum_{i\in R}x_iex_i & \sum_{i\in R}y_iex_i & \sum_{i\in R}ex_i & \sum_{i\in R}x_iey_i & \sum_{i\in R}y_iey_i & \sum_{i\in R}ey_i\end{bmatrix}^T$
where $R$ denotes the set of remaining blocks.
4. Termination condition: repeat steps 2 and 3 at most 5 times, and terminate the iteration early if either of the following two conditions is satisfied:
(i) compute $m^{(l)}-m_{static}$, where $m_{static}=[1\ 0\ 0\ 0\ 1\ 0]^T$ is the global motion vector for a static camera; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends;
(ii) compute the difference between $m^{(l)}$ and $m^{(l-1)}$; if the components $m_3$ and $m_6$ of this difference are smaller than 0.01 and the other components are smaller than 0.0001, the iteration ends.
5. Substitute the resulting global motion model parameter vector $m$ into $\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$ to obtain the estimated coordinates $(x_i',y_i')$ in the previous frame, and finally obtain the global motion vector field.
(2) compute the residual between the global motion vector field and the accumulated motion vector field for each 4×4 block.
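A compact NumPy sketch of this estimation follows. It uses block centres in pixel units as the coordinates $(x_i, y_i)$, implements the 25% rejection with a quantile rather than an explicit histogram, and applies the update $m \leftarrow m - H^{-1}b$ and the two early-termination tests described above; these simplifications and the function name are assumptions of the sketch, not the disclosed implementation.

```python
import numpy as np

def estimate_global_motion(acc_mv, block=4, max_iter=5):
    """Fit the 6-parameter affine model m=(m1..m6) to an accumulated MV field (H, W, 2)."""
    h, w, _ = acc_mv.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (xs * block + block / 2).ravel()           # block-centre coordinates in pixels
    y = (ys * block + block / 2).ravel()
    mvx, mvy = acc_mv[..., 0].ravel(), acc_mv[..., 1].ravel()

    # 1. Initialization: identity transform shifted by the mean motion vector.
    m = np.array([1.0, 0.0, mvx.mean(), 0.0, 1.0, mvy.mean()])
    m_static = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])

    for _ in range(max_iter):
        # 2. Prediction deviations and rejection of the worst 25% (outliers).
        ex = m[0] * x + m[1] * y + m[2] - x - mvx
        ey = m[3] * x + m[4] * y + m[5] - y - mvy
        keep = ex**2 + ey**2 <= np.quantile(ex**2 + ey**2, 0.75)

        # 3. Newton-Raphson (Gauss-Newton) update m <- m - H^{-1} b on the kept blocks.
        A = np.stack([x[keep], y[keep], np.ones(int(keep.sum()))], axis=1)
        H3 = A.T @ A                               # 3x3 block repeated on the diagonal of H
        Hm = np.block([[H3, np.zeros((3, 3))], [np.zeros((3, 3)), H3]])
        b = np.concatenate([A.T @ ex[keep], A.T @ ey[keep]])
        m_new = m - np.linalg.solve(Hm, b)

        # 4. Early termination: near-static camera, or negligible parameter change.
        d = np.abs(m_new - m)
        m = m_new
        if np.all(np.abs(m - m_static) < 0.01):
            break
        if d[2] < 0.01 and d[5] < 0.01 and np.all(np.delete(d, [2, 5]) < 1e-4):
            break

    # 5. Global motion vector of every block, and the per-block residual magnitude.
    gmv = np.stack([m[0] * x + m[1] * y + m[2] - x,
                    m[3] * x + m[4] * y + m[5] - y], axis=1).reshape(h, w, 2)
    residual = np.linalg.norm(acc_mv - gmv, axis=2)
    return m, gmv, residual
```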
D. Region segmentation:
As shown in Fig. 3, the present invention uses the statistical region-growing algorithm to segment the accumulated motion vector field. The steps are as follows:
(1) compute the motion dissimilarity measure of every pair of adjacent blocks in the 4-neighbourhood;
(2) sort all adjacent block pairs in increasing order of their motion dissimilarity measure;
(3) merge the adjacent block pair with the smallest motion dissimilarity measure and start the region-growing process there; at each growing step the two current blocks belong to two adjacent regions, and the criterion for merging these two regions is whether the difference of their average motion vectors is smaller than the threshold $\Delta(R)=\frac{SR^2}{2Q|R|}\left(\min(SR,|R|)\log(1+|R|)+2\log 6wh\right)$, where $SR$ is the dynamic range of the motion vectors, $|R|$ is the number of motion vectors contained in the region, $wh$ is the size of the motion vector field, and the parameter $Q$ controls the degree of division of the motion vector field, so that the motion vector field can be divided moderately into several regions with similar motion;
(4) compute the mean residual of each segmented region after global motion compensation;
(5) distinguish the most reliable background region from the regions where other objects may lie: among the segmented regions whose area is larger than 10% of the whole motion vector field, select the one with the smallest mean residual as the reliable background region and label it $R_0^t$; the remaining regions are regions where moving objects may exist, labelled $R_i^t$ ($i=1,2,\ldots,M$); finally the M object regions and 1 background region segmented from the current frame carry distinct labels, and the segmentation result is denoted $R^t$.
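A minimal union-find sketch of steps (1)–(3) of this region growing follows; the background-selection steps (4)–(5) are omitted. The value of Q, the use of the Euclidean norm as the motion dissimilarity measure, and the decision to require the average-vector difference to stay below the threshold Δ(R) of both candidate regions are assumptions made for exposition; the threshold formula itself is transcribed from the text.

```python
import numpy as np

def region_grow(acc_mv, Q=32):
    """Statistical-region-growing segmentation of an accumulated MV field (H, W, 2)
    into regions of similar motion; Q is an illustrative choice, not a value from the text."""
    h, w, _ = acc_mv.shape
    n = h * w
    mv = acc_mv.reshape(n, 2).astype(np.float64)
    SR = float(mv.max() - mv.min()) or 1.0         # dynamic range of the motion vectors
    wh = float(n)                                  # size of the motion vector field

    parent = np.arange(n)                          # union-find over 4x4 blocks
    size = np.ones(n, dtype=np.int64)
    mean = mv.copy()                               # running mean motion vector per region

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def delta(sz):                                 # per-region merging threshold Delta(R)
        return (SR**2 / (2.0 * Q * sz)) * (min(SR, sz) * np.log(1.0 + sz) + 2.0 * np.log(6.0 * wh))

    # Motion dissimilarity of every 4-neighbour pair, sorted in increasing order.
    pairs = []
    for yy in range(h):
        for xx in range(w):
            i = yy * w + xx
            if xx + 1 < w:
                pairs.append((np.linalg.norm(mv[i] - mv[i + 1]), i, i + 1))
            if yy + 1 < h:
                pairs.append((np.linalg.norm(mv[i] - mv[i + w]), i, i + w))
    pairs.sort(key=lambda p: p[0])

    for _, a, b in pairs:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        diff = np.linalg.norm(mean[ra] - mean[rb])
        # Merge when the difference of average motion vectors is below the threshold
        # of both regions (one reading of the single-threshold rule in the text).
        if diff < delta(size[ra]) and diff < delta(size[rb]):
            mean[ra] = (mean[ra] * size[ra] + mean[rb] * size[rb]) / (size[ra] + size[rb])
            size[ra] += size[rb]
            parent[rb] = ra

    return np.array([find(i) for i in range(n)]).reshape(h, w)
```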
E. Object segmentation
As shown in Fig. 4, the matching blocks between the two adjacent frames are found first, the moving objects of the previous frame are projected onto the current frame and labelled as object projections, the three matrices $CM_t$, $CMR_t$, $CMC_t$ of M+1 rows and N+1 columns are then constructed from the correlation between the object projections and the segmented regions of the current frame, the matching matrix $CMM_t$ is generated from $CMR_t$ and $CMC_t$, and the five classes of moving object situations are handled and segmented on the basis of this matching matrix. The steps are described in detail below:
(1) use backward projection to obtain the projection region in the current frame (time t) of each object of the previous frame (time t-1). First label the N moving objects $O_i^{t-1}$ ($i=1,2,\ldots,N$) and the background object $O_0^{t-1}$ of the previous frame, then use backward projection to obtain the projection region of each previous-frame object in the current frame: the matched position in the previous frame of any block of the current accumulated motion vector field is obtained from the difference between the block's coordinates and its accumulated motion vector, and the object label of the block at that matched position in the previous frame is projected onto the current frame and labelled block by block, the result being denoted $Proj(O^{t-1})$.
(2) construct the matrix $CM_t$, which records the overlapping area of each segmented region with each object projection; the matrix $CMR_t$, which records the proportion of each segmented region that falls inside each object projection; and the matrix $CMC_t$, which records the proportion of each object projection that falls inside each segmented region. From the label images $R^t$ and $Proj(O^{t-1})$, the three matrices $CM_t$, $CMR_t$, $CMC_t$ of M+1 rows and N+1 columns are constructed. The value of any element $CM_t(i,j)$ is the number of pixels labelled $i$ in $R^t$ and labelled $j$ in $Proj(O^{t-1})$, i.e. the overlapping area of the segmented region $R_i^t$ and the object projection $Proj(O_j^{t-1})$. Each element of row $i$ of $CMR_t$ is the proportion of the segmented region $R_i^t$ that falls inside each object projection, and each element of column $j$ of $CMC_t$ is the proportion of the projection of object $O_j^{t-1}$ that falls inside each segmented region.
(3) construct the matrix $CMM_t$, which expresses the degree of correlation between the segmented regions of the current frame and the object projections. $CMM_t$ records the correlation between $R^t$ and $Proj(O^{t-1})$ reflected by $CMR_t$ and $CMC_t$. $CMM_t$ is first set to the zero matrix of M+1 rows and N+1 columns; $CMR_t$ is then scanned row by row to find the position of each row maximum, and 1 is added to the element at the corresponding position of $CMM_t$; $CMC_t$ is then scanned column by column to find the position of each column maximum, and 2 is added to the element at the corresponding position of $CMM_t$. In the resulting matrix $CMM_t$ the rows correspond in turn to the background region $R_0^t$ and the moving regions $R_i^t$ ($i=1,2,\ldots,M$) of the current frame, and the columns correspond in turn to the background object $O_0^{t-1}$ and the moving objects $O_j^{t-1}$ ($j=1,2,\ldots,N$) of the previous frame. The possible values of each element are 0, 1, 2 and 3. Any non-zero element $CMM_t(i,j)$ indicates that the segmented region $R_i^t$ and the object $O_j^{t-1}$ have a certain correlation, specifically:
1. $CMM_t(i,j)=1$ indicates that the segmented region $R_i^t$ largely belongs to the previous-frame object $O_j^{t-1}$;
2. $CMM_t(i,j)=2$ indicates that the previous-frame object $O_j^{t-1}$ is largely contained in the segmented region $R_i^t$;
3. $CMM_t(i,j)=3$ covers both of the above situations and indicates that $R_i^t$ and $O_j^{t-1}$ are very strongly correlated; a further comparison is then made: if $CMR_t(i,j)>CMC_t(i,j)$ then $CMM_t(i,j)=1$, otherwise $CMM_t(i,j)=2$. The value range of the finally generated $CMM_t$ is therefore 0, 1, 2.
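The construction of $CM_t$, $CMR_t$, $CMC_t$ and $CMM_t$ can be written compactly in NumPy as below, assuming the region label image $R^t$ uses integer values 0..M (0 = background region) and the projection label image $Proj(O^{t-1})$ uses integer values 0..N (0 = background object); the function name is illustrative.

```python
import numpy as np

def build_matching_matrix(region_labels, proj_labels, M, N):
    """Build CM, CMR, CMC and the matching matrix CMM from the region label image R^t
    and the object-projection label image Proj(O^{t-1}), both of the same shape."""
    # CM(i, j): overlap area (pixel count) of region i and object projection j.
    CM = np.zeros((M + 1, N + 1), dtype=np.float64)
    np.add.at(CM, (region_labels.ravel(), proj_labels.ravel()), 1.0)

    # CMR: each row i gives the fraction of region i falling inside each projection.
    # CMC: each column j gives the fraction of projection j falling inside each region.
    CMR = CM / np.maximum(CM.sum(axis=1, keepdims=True), 1e-9)
    CMC = CM / np.maximum(CM.sum(axis=0, keepdims=True), 1e-9)

    CMM = np.zeros((M + 1, N + 1), dtype=np.int32)
    CMM[np.arange(M + 1), CMR.argmax(axis=1)] += 1   # row maxima of CMR -> +1
    CMM[CMC.argmax(axis=0), np.arange(N + 1)] += 2   # column maxima of CMC -> +2

    # Resolve the doubly-marked entries: 3 becomes 1 or 2 depending on which ratio wins.
    for i, j in zip(*np.where(CMM == 3)):
        CMM[i, j] = 1 if CMR[i, j] > CMC[i, j] else 2
    return CM, CMR, CMC, CMM
```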
(4) Based on the matching matrix $CMM_t$, object segmentation is performed for the five classes of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object. The matrix $CMM_t$ establishes the association between segmented regions and moving objects effectively and can handle the following five situations in a uniform way:
1. Tracking and updating of a single object (1 → 1): if row $i$ of $CMM_t$ has only one non-zero element $CMM_t(i,j)$ and column $j$ also has only this non-zero element $CMM_t(i,j)$, the segmented region $R_i^t$ is correlated only with the object $O_j^{t-1}$, and different strategies are taken according to the value of $CMM_t(i,j)$: if $CMM_t(i,j)=2$, the updating strategy is taken, i.e. the updated object is represented by the segmented region of the current frame, $O_j^t=R_i^t$; if $CMM_t(i,j)=1$, the tracking strategy is generally taken, i.e. the object of the current frame is represented by the projection of the previous-frame object, $O_j^t=Proj(O_j^{t-1})$; in addition, if the segmented region $R_i^t$ also satisfies the threshold condition $|R_i^t|>T_s$ and $ER_i^t>\alpha ER_0^t$, where $T_s=64$, $\alpha=1.5$, $|R_i^t|$ denotes the number of motion vectors contained in region $R_i^t$, $ER_i^t$ denotes the mean residual of region $R_i^t$ and $ER_0^t$ denotes the mean residual of the background, then $R_i^t$ is considered a reliable moving region and can be used to represent the moving object of the current frame, $O_j^t=R_i^t$.
2. Appearance of a new object (0 → 1): if row $i$ of $CMM_t$ has only one non-zero element, located in the first column with value 1, the segmented region $R_i^t$ lay within the background object $O_0^{t-1}$ of the previous frame and does not belong to any existing moving object; if $R_i^t$ also satisfies the threshold condition given in case 1, it is considered a newly appearing moving object, denoted $O_{M+1}^t=R_i^t$.
In both cases 1 and 2 above, the number of non-zero elements in the corresponding row of $CMM_t$ is 1. If row $i$ of $CMM_t$ has several non-zero elements, the segmented region $R_i^t$ may be correlated with several objects; in that case the objects of the previous frame are simply projected onto the current frame as the moving objects of the current frame, i.e. the objects are tracked, $O_j^t=Proj(O_j^{t-1})$.
3. Merging of objects (m → 1): if row $i$ of $CMM_t$ has two or more elements with value 2 outside the first column, two or more objects of the previous frame are largely contained in the new segmented region $R_i^t$, and $R_i^t$ represents the new object formed by their merging, denoted $O_{M+1}^t=R_i^t$; in this case the segmented region $R_i^t$ usually contains two or more objects that move very similarly and adjoin each other in space, so in the current frame they are segmented as one new merged object.
4. Splitting of an object (1 → m): if column $j$ of $CMM_t$ has two or more elements with value 1, the previous-frame object $O_j^{t-1}$ has split into several segmented regions in the current frame; even if these regions are not spatially adjacent, they are still considered to belong to the same object in the segmentation of the current frame, until regions that are not mutually adjacent in space yet carry the same object label exhibit different motions in some subsequent frame; these segmented regions are then given different object labels, denoted $O_{s_i}^t=R_{s_i}^t$, to realize a true object split.
5. Disappearance of an object (1 → 0): if column $j$ of $CMM_t$ has only one non-zero element, located in the first row with value 2, the projection of the previous-frame object $O_j^{t-1}$ falls within the background region $R_0^t$ of the current frame, and $O_j^{t-1}$ is considered to have disappeared in the current frame.
The five situations that may occur in the moving object segmentation of a video sequence can thus be handled effectively. However, when the scene changes considerably and the tracking strategy has been taken for all objects over several consecutive frames, i.e. every object is merely the projection of the corresponding object of the previous frame, this indicates that the correlation between the segmented regions of the current frame and the moving objects of the previous frame is very weak; it is then necessary to check, according to case 2, whether new objects have appeared, and the moving objects must be detected anew.
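A simplified sketch of how the matching matrix is read to detect the five situations is given below. It only classifies the events; the label updates (tracking versus updating, the reliability thresholds $T_s$ and $\alpha$, and the deferred splitting) are omitted, and a region correlated with several objects is reduced to a plain tracking event, so this is an interpretation for exposition rather than the full procedure.

```python
import numpy as np

def classify_cases(CMM):
    """Read the matching matrix CMM ((M+1) x (N+1), row 0 = background region,
    column 0 = background object) and list the detected events as
    (case_name, region_index, object_indices) tuples."""
    Mp1, Np1 = CMM.shape
    events = []
    for i in range(1, Mp1):                                    # moving regions R_i^t
        cols = np.nonzero(CMM[i])[0]
        if len(cols) == 1 and cols[0] == 0 and CMM[i, 0] == 1:
            events.append(("new_object", i, []))               # 0 -> 1
        elif len(cols) == 1 and cols[0] > 0 and np.count_nonzero(CMM[:, cols[0]]) == 1:
            events.append(("track_or_update", i, [int(cols[0])]))   # 1 -> 1
        elif np.count_nonzero(CMM[i, 1:] == 2) >= 2:
            merged = [int(j) for j in cols if j > 0 and CMM[i, j] == 2]
            events.append(("merge", i, merged))                # m -> 1
        elif len(cols) > 1:
            tracked = [int(j) for j in cols if j > 0]
            events.append(("track_only", i, tracked))          # weak correlation: keep projections
    for j in range(1, Np1):                                    # previous-frame objects O_j^{t-1}
        if np.count_nonzero(CMM[:, j] == 1) >= 2:
            events.append(("split", None, [j]))                # 1 -> m
        rows = np.nonzero(CMM[:, j])[0]
        if len(rows) == 1 and rows[0] == 0 and CMM[0, j] == 2:
            events.append(("vanish", None, [j]))               # 1 -> 0
    return events
```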
An example is given below for input video in the 352×288 CIF format. The MPEG-4 standard test sequences are encoded with the JM8.6 H.264 encoder to produce the H.264 compressed video used for testing. The H.264 encoder is configured as follows: Baseline Profile, IPPP structure, one I frame inserted every 30 frames, 3 reference frames, a motion estimation search range of ±32, a quantization parameter of 30, and 300 encoded frames. In the experiments an accumulated motion vector field is computed every 3 frames (the number of frames used in the motion vector accumulation), giving 100 accumulated motion vector fields in total with which the performance of the proposed moving object segmentation algorithm is tested. The region segmentation result is first obtained from the accumulated motion vector field, the moving objects of the previous frame are then projected onto the current frame, and on the basis of these two results the matching-matrix-based segmentation method segments the moving objects from the current frame.
The typical standard test sequences Coastguard and Mobile are used as input video; the experimental results are shown in Fig. 5 and Fig. 6 respectively. In both figures, the first column is the original picture of the current frame, the second column is the region segmentation result obtained from the accumulated motion vector field of the current frame, the third column is the projection region in the current frame of the moving objects of the previous frame, and the fourth column is the moving objects segmented from the current frame. The average processing time per frame is 38 ms, which meets the 25 fps real-time requirement of most applications. Since the proposed method actually performs moving object segmentation once every 3 frames, for a given original video sequence the corresponding moving objects can be segmented fully in real time during decoding; even if every frame is required to yield its moving objects, only object projection needs to be carried out for the remaining frames, whose computational cost is very small, so the moving objects can still be segmented in real time.
Experiment 1: the sequence Coastguard has obvious global motion; the camera first pans from right to left to track the small boat in the middle of the picture, and then moves from left to right to track the large boat entering from the left of the picture. Row 1 of Fig. 5 (frame 4 of the sequence) shows the camera tracking the motion of the small boat from right to left; row 2 of Fig. 5 (frame 37) shows the large boat, a new object, moving from left to right; row 3 of Fig. 5 (frame 61) shows the scene in which both moving objects, the large boat and the small boat, appear fully in the camera view; and row 4 of Fig. 5 (frame 208) shows the camera starting to track the motion of the large boat from left to right. As can be seen from the images in row 2 of Fig. 5, the segmentation of the accumulated motion vector field separates the regions where the two moving objects lie largely and quite accurately, and most of the background that conforms to the global motion model is included in one large segmented region; the white region is the most reliable background region after motion compensation. The accumulation and segmentation of the motion vector field adopted here are therefore effective, and a moderate segmentation result can be obtained using only motion vector information. Combining this with the projection regions of the previous frame's objects in the current frame shown in column 3, the matching-matrix-based moving object segmentation method segments the moving objects reliably and stably throughout the whole sequence, as shown in column 4.
Experiment 2: the sequence Mobile has more complicated global motion; besides camera panning and vertical movement, there is also obvious zooming motion in the first half of the sequence. The scene in row 1 of Fig. 6 (frame 4) contains 3 moving objects in total: a toy train pushes a ball along a track, and a wall calendar moves up and down intermittently, so the moving object segmentation is rather difficult. As can be seen from the segmentation results in Fig. 6, when a moving object stops moving, the proposed algorithm can still segment it through object projection, as with the ball in row 2 of Fig. 6 (frame 43) and the wall calendar in row 3 of Fig. 6 (frame 109). In addition, the experimental results in Fig. 6 show that the proposed algorithm handles the merging and splitting of moving objects well. In row 3 of Fig. 6 (frame 109), because the toy train pushes the ball with no gap between them, the two objects, which are closely adjacent in space and move identically, are considered to have merged into one object; in row 4 of Fig. 6 (frame 160), when a gap has appeared between the two objects and their motions are no longer identical, the two objects are divided into two regions, truly realizing the splitting of the two objects.

Claims (6)

1. A real-time method for segmenting moving objects in the H.264 compressed domain based on a matching matrix, characterized in that the motion vector fields of several consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field; global motion compensation is then applied to the accumulated motion vector field while a fast statistical region-growing algorithm divides it into a plurality of regions according to motion similarity; using these two results, the matching-matrix-based moving object segmentation method segments the moving objects, and it can effectively handle the various situations arising in a video sequence, such as the tracking and updating of objects, the merging and splitting of objects, and the appearance and disappearance of objects; the steps are as follows:
A. motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;
B. accumulated motion vector field: use the motion vector fields of several consecutive frames to perform iterative backward projection and obtain a more reliable accumulated motion vector field;
C. global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;
D. region segmentation: use a statistical region-growing method to divide the accumulated motion vector field into a plurality of regions with similar motion;
E. object segmentation: segment the moving objects with the matching-matrix-based segmentation method.
2. The real-time method for segmenting moving objects in the H.264 compressed domain based on a matching matrix according to claim 1, characterized in that the steps of the motion vector field normalization are: (1) temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance; (2) spatial normalization: assign the motion vector of each block larger than 4×4 directly to all the 4×4 blocks covered by that block.
3. The real-time method for segmenting moving objects in the H.264 compressed domain based on a matching matrix according to claim 1, characterized in that the steps of obtaining the accumulated motion vector field are: (1) use the motion vector fields of several frames after the current frame and back-project the motion vector field of each adjacent frame: the projected motion vector of the current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing, the scale factors being chosen as follows: if the total area of the overlapping regions is larger than half of the current block area, the scale factor of each projecting block is the area of its overlap with the current block divided by the total overlapping area of all projecting blocks with the current block; otherwise, the scale factor of each projecting block is the ratio of its overlapping area to the current block area; (2) starting from the last frame, accumulate iteratively to obtain the accumulated motion vector field of the current frame.
4. The real-time method for segmenting moving objects in the H.264 compressed domain based on a matching matrix according to claim 1, characterized in that the global motion compensation first estimates the global motion vector field with an affine motion model and then computes the residual of each 4×4 block after global motion compensation; the steps are as follows:
(1) estimate the global motion vector field with the 6-parameter affine motion model:
1. model parameter initialization: let $m=(m_1,m_2,m_3,m_4,m_5,m_6)$ be the parameter vector of the global motion model; the model parameters are initialized as $m^{(0)}=\begin{bmatrix}1 & 0 & \frac{1}{N}\sum_{i=1}^{N}mvx_i & 0 & 1 & \frac{1}{N}\sum_{i=1}^{N}mvy_i\end{bmatrix}^T$;
2. outlier rejection:
first compute, for the i-th block of the current frame with centre coordinates $(x_i,y_i)$, its estimated centre coordinates $(x_i',y_i')$ in the previous frame, $\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$; the deviation $(ex_i,ey_i)$ of the predicted motion vector from the original accumulated motion vector $(mvx_i,mvy_i)$ is then $ex_i=x_i'-x_i-mvx_i$, $ey_i=y_i'-y_i-mvy_i$; compute the prediction deviation $(ex_i,ey_i)$ of every 4×4 block with this formula, build the histogram of the squared deviation magnitudes $ex_i^2+ey_i^2$, and reject the motion vectors whose squared deviation magnitude falls in the largest 25% of the histogram;
3. model parameter update:
update the model parameters with the motion vectors remaining from the previous step and the Newton-Raphson method; the new parameter vector $m^{(l)}$ at iteration $l$ is defined as $m^{(l)}=m^{(l-1)}-H^{-1}b$, where the Hessian matrix $H$ and the gradient vector $b$ are computed as
$H=\begin{bmatrix}\sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i & 0 & 0 & 0\\ \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i & 0 & 0 & 0\\ \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1 & 0 & 0 & 0\\ 0 & 0 & 0 & \sum_{i\in R}x_i^2 & \sum_{i\in R}x_iy_i & \sum_{i\in R}x_i\\ 0 & 0 & 0 & \sum_{i\in R}x_iy_i & \sum_{i\in R}y_i^2 & \sum_{i\in R}y_i\\ 0 & 0 & 0 & \sum_{i\in R}x_i & \sum_{i\in R}y_i & \sum_{i\in R}1\end{bmatrix}$
$b=\begin{bmatrix}\sum_{i\in R}x_iex_i & \sum_{i\in R}y_iex_i & \sum_{i\in R}ex_i & \sum_{i\in R}x_iey_i & \sum_{i\in R}y_iey_i & \sum_{i\in R}ey_i\end{bmatrix}^T$
where $R$ denotes the set of remaining blocks;
4. termination condition: repeat steps 2 and 3 at most 5 times, and terminate the iteration early if either of the following two conditions is satisfied: (i) compute $m^{(l)}-m_{static}$, where $m_{static}=[1\ 0\ 0\ 0\ 1\ 0]^T$ is the global motion vector for a static camera; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends; (ii) compute the difference between $m^{(l)}$ and $m^{(l-1)}$; if the components $m_3$ and $m_6$ of this difference are smaller than 0.01 and the other components are smaller than 0.0001, the iteration ends;
5. substitute the resulting global motion model parameter vector $m$ into $\begin{bmatrix}x_i'\\y_i'\end{bmatrix}=\begin{bmatrix}m_1 & m_2\\m_4 & m_5\end{bmatrix}\begin{bmatrix}x_i\\y_i\end{bmatrix}+\begin{bmatrix}m_3\\m_6\end{bmatrix}$ to obtain the estimated coordinates $(x_i',y_i')$ in the previous frame, and finally obtain the global motion vector field;
(2) compute the residual between the global motion vector field and the accumulated motion vector field for each 4×4 block.
5. The real-time method for segmenting moving objects in the H.264 compressed domain based on a matching matrix according to claim 1, characterized in that the region segmentation uses the statistical region-growing algorithm to divide the accumulated motion vector field into a plurality of regions with similar motion; the steps are as follows:
(1) compute the motion dissimilarity measure of every pair of adjacent blocks in the 4-neighbourhood;
(2) sort all adjacent block pairs in increasing order of their motion dissimilarity measure;
(3) merge the adjacent block pair with the smallest motion dissimilarity measure and start the region-growing process there; at each growing step the two current blocks belong to two adjacent regions, and the criterion for merging these two regions is whether the difference of their average motion vectors is smaller than the threshold
$\Delta(R)=\frac{SR^2}{2Q|R|}\left(\min(SR,|R|)\log(1+|R|)+2\log 6wh\right)$, where $SR$ is the dynamic range of the motion vectors, $|R|$ is the number of motion vectors contained in the region, $wh$ is the size of the motion vector field, and the parameter $Q$ controls the degree of division of the motion vector field, so that the motion vector field can be divided moderately into several regions with similar motion;
(4) compute the mean residual of each segmented region after global motion compensation;
(5) distinguish the most reliable background region from the regions where other objects may lie: among the segmented regions whose area is larger than 10% of the whole motion vector field, select the one with the smallest mean residual as the reliable background region and label it $R_0^t$; the remaining regions are regions where moving objects may exist, labelled $R_i^t$ ($i=1,2,\ldots,M$); finally the M object regions and 1 background region segmented from the current frame carry distinct labels, and the segmentation result is denoted $R^t$.
6. The real-time H.264 compressed-domain moving-object segmentation method based on the matching matrix according to claim 1, characterized in that the object segmentation uses the moving-object segmentation result already obtained for the previous frame (time t-1) to judge whether each segmented region of the current frame (time t) matches some object of the previous frame, and constructs the matching matrix from these judgements; based on the matching matrix, the tracking and updating of objects, the merging of objects, the splitting of objects, the appearance of new objects and the disappearance of old objects are identified, finally yielding the moving objects of the current frame; the steps are as follows:
(1) Use backward projection to obtain the region onto which each object of the previous frame (time t-1) projects in the current frame (time t): first the N moving objects and the 1 background object of the previous frame are labeled, and then backward projection is applied to obtain the projected region of each previous-frame object in the current frame; specifically, the matched position of any block in the previous frame is obtained as the difference between the block's coordinates in the current-frame cumulative motion vector field and its corresponding cumulative motion vector, and the object label found at this matched position in the previous frame is assigned to the current-frame block, block by block, giving the projection label image;
(2) Construct the matrix CM_t, which expresses the overlapped area between each segmented region and each object projection; the matrix CMR_t, which expresses the proportion of each segmented region that falls inside each object projection; and the matrix CMC_t, which expresses the proportion of each object projection that falls inside each segmented region; from the segmentation label image and the projection label image, the three matrices CM_t, CMR_t and CMC_t, each with M+1 rows and N+1 columns, are constructed as follows: the element CM_t(i, j) is the number of pixels labeled i in the segmentation label image and labeled j in the projection label image, i.e. the overlapped area of segmented region i and object projection j; each element CMR_t(i, j) of row i of CMR_t is the proportion of segmented region i that falls inside object projection j; and each element CMC_t(i, j) of column j of CMC_t is the proportion of object projection j that falls inside segmented region i;
(3) Construct the matrix CMM_t, which expresses the degree of correlation between the segmented regions of the current frame and the object projections; CMM_t records the correlation information between the segmented regions and the object projections reflected by CMR_t and CMC_t; CMM_t is first initialized as a zero matrix with M+1 rows and N+1 columns; then CMR_t is scanned row by row to find the position of the maximum of each row, and 1 is added to the element of CMM_t at the corresponding position; then CMC_t is scanned column by column to find the position of the maximum of each column, and 2 is added to the element of CMM_t at the corresponding position; in the generated matrix CMM_t the rows correspond in turn to the background region and the moving regions (i = 1, 2, ..., M) of the current frame, and the columns correspond in turn to the background object and the moving objects (j = 1, 2, ..., N) of the previous frame; each element of the matrix may take the value 0, 1, 2 or 3; any non-zero element CMM_t(i, j) indicates that segmented region i and previous-frame object j are correlated to some degree, specifically:
1. CMM_t(i, j) = 1 shows that segmented region i largely belongs to previous-frame object j;
2. CMM_t(i, j) = 2 shows that previous-frame object j is largely contained in segmented region i;
3. CMM_t(i, j) = 3 covers both of the above situations at once and shows that segmented region i and object j are very strongly correlated; a further comparison is then made: if CMR_t(i, j) > CMC_t(i, j), then CMM_t(i, j) = 1; otherwise CMM_t(i, j) = 2; the finally generated CMM_t therefore takes values only in {0, 1, 2};
(4) Based on the matching matrix CMM_t, object segmentation is carried out for the five classes of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object; the matrix CMM_t effectively establishes the correspondence between the segmented regions and the moving objects, and it handles the following five situations in a uniform way (a code sketch of the matrix construction and its interpretation follows this claim):
1. tracking and updating of a single object (1 → 1);
2. appearance of a new object (0 → 1);
3. merging of objects (m → 1);
4. splitting of an object (1 → m);
5. disappearance of an object (1 → 0).
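The construction of the matching matrices in claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions: `seg_labels` is the current-frame segmentation label image with integer values 0 (background region) to M, `proj_labels` is the backward-projection label image with integer values 0 (background object) to N, both are NumPy arrays of the same shape, and the function name is illustrative.

```python
import numpy as np

def build_matching_matrices(seg_labels, proj_labels, M, N):
    """Build CM, CMR, CMC and CMM as described in claim 6.

    CM[i, j]  -- overlap area (pixel count) of segmented region i and object projection j
    CMR[i, j] -- fraction of region i falling inside projection j (each row sums to 1)
    CMC[i, j] -- fraction of projection j falling inside region i (each column sums to 1)
    CMM[i, j] -- correlation code in {0, 1, 2} after resolving the value 3
    """
    CM = np.zeros((M + 1, N + 1), dtype=np.int64)
    np.add.at(CM, (seg_labels.ravel(), proj_labels.ravel()), 1)   # joint label histogram

    CMR = CM / np.maximum(CM.sum(axis=1, keepdims=True), 1)       # row-normalised
    CMC = CM / np.maximum(CM.sum(axis=0, keepdims=True), 1)       # column-normalised

    CMM = np.zeros((M + 1, N + 1), dtype=np.int64)
    CMM[np.arange(M + 1), CMR.argmax(axis=1)] += 1                # row maxima of CMR -> +1
    CMM[CMC.argmax(axis=0), np.arange(N + 1)] += 2                # column maxima of CMC -> +2

    # Resolve entries equal to 3 by the comparison stated in the claim.
    for i, j in zip(*np.nonzero(CMM == 3)):
        CMM[i, j] = 1 if CMR[i, j] > CMC[i, j] else 2
    return CM, CMR, CMC, CMM
```

How the five situations are then read off CMM_t is only summarised in the claim; a natural reading (an assumption here, not a quotation of the patent) is that a region row with no non-zero entry signals a new object (0 → 1), an object column with no non-zero entry signals a disappeared object (1 → 0), several non-zero entries in one region row indicate a merge (m → 1), several non-zero entries in one object column indicate a split (1 → m), and a single mutual correspondence realises tracking and updating (1 → 1).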
CN 200610116363 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain Expired - Fee Related CN100486336C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610116363 CN100486336C (en) 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200610116363 CN100486336C (en) 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain

Publications (2)

Publication Number Publication Date
CN1960491A CN1960491A (en) 2007-05-09
CN100486336C true CN100486336C (en) 2009-05-06

Family

ID=38071950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610116363 Expired - Fee Related CN100486336C (en) 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain

Country Status (1)

Country Link
CN (1) CN100486336C (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101237581B (en) * 2008-02-29 2010-11-17 上海大学 H.264 compression domain real time video object division method based on motion feature
CN101320085B (en) * 2008-07-21 2012-07-25 哈尔滨工业大学 Ultra-broadband wall-through point target positioning and imaging method based on back-projection algorithm
EP2374278B1 (en) * 2008-12-19 2018-05-30 Thomson Licensing DTV Video coding based on global movement compensation
CN101719979B (en) * 2009-11-27 2011-08-03 北京航空航天大学 Video object segmentation method based on time domain fixed-interval memory compensation
CN102196259B (en) * 2010-03-16 2015-07-01 北京中星微电子有限公司 Moving object detection system and method suitable for compression domain
TWI678916B (en) 2010-04-13 2019-12-01 美商Ge影像壓縮有限公司 Sample region merging
CN106454371B (en) 2010-04-13 2020-03-20 Ge视频压缩有限责任公司 Decoder, array reconstruction method, encoder, encoding method, and storage medium
CN106067984B (en) 2010-04-13 2020-03-03 Ge视频压缩有限责任公司 Cross-plane prediction
ES2904650T3 (en) 2010-04-13 2022-04-05 Ge Video Compression Llc Video encoding using multitree image subdivisions
CN102123234B (en) * 2011-03-15 2012-09-05 北京航空航天大学 Unmanned airplane reconnaissance video grading motion compensation method
CN102333213A (en) * 2011-06-15 2012-01-25 夏东 H.264 compressed domain moving object detection algorithm under complex background
CN102917224B (en) * 2012-10-18 2015-06-17 北京航空航天大学 Mobile background video object extraction method based on novel crossed diamond search and five-frame background alignment
CN103198297B (en) * 2013-03-15 2016-03-30 浙江大学 Based on the kinematic similarity assessment method of correlativity geometric properties
CN104125430B (en) * 2013-04-28 2017-09-12 华为技术有限公司 Video moving object detection method, device and video monitoring system
CN104683803A (en) * 2015-03-24 2015-06-03 江南大学 Moving object detecting and tracking method applied to compressed domain
HK1203289A2 (en) * 2015-07-07 2015-10-23 香港生產力促進局 號 A method and a device for detecting moving object
CN108600749B (en) 2015-08-29 2021-12-28 华为技术有限公司 Image prediction method and device
CN105931274B (en) * 2016-05-09 2019-02-15 中国科学院信息工程研究所 A kind of quick object segmentation and method for tracing based on motion vector track
CN108574846B (en) * 2018-05-18 2019-03-08 中南民族大学 A kind of video compress domain method for tracking target and system
CN109389031B (en) * 2018-08-27 2021-12-03 浙江大丰实业股份有限公司 Automatic positioning mechanism for performance personnel
CN112990273B (en) * 2021-02-18 2021-12-21 中国科学院自动化研究所 Compressed domain-oriented video sensitive character recognition method, system and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Yanling et al., "A flexible and variable segmentation algorithm for moving images," 中国工程科学 (Engineering Science), Vol. 8, No. 5, 2006 *

Also Published As

Publication number Publication date
CN1960491A (en) 2007-05-09

Similar Documents

Publication Publication Date Title
CN100486336C (en) Real time method for segmenting motion object based on H.264 compression domain
CN100387061C (en) Video/audio signal processing method and video/audio signal processing apparatus
Zhang et al. Segmentation of moving objects in image sequence: A review
KR100901904B1 (en) Video content understanding through real time video motion analysis
US8285045B2 (en) Image analysis method, medium and apparatus and moving image segmentation system
Zhong et al. AMOS: an active system for MPEG-4 video object segmentation
Zhong et al. Video object model and segmentation for content-based video indexing
Wang et al. A confidence measure based moving object extraction system built for compressed domain
Chen et al. Moving region segmentation from compressed video using global motion estimation and Markov random fields
CN101167363A (en) Apparatus and method for processing video data
CN101103364A (en) Apparatus and method for processing video data
Di Stefano et al. Vehicle detection and tracking using the block matching algorithm
Morand et al. Scalable object-based video retrieval in hd video databases
CN101650830A (en) Compressed domain video lens mutation and gradient union automatic segmentation method and system
Porikli et al. Compressed domain video object segmentation
CN101237581B (en) H.264 compression domain real time video object division method based on motion feature
Porikli Real-time video object segmentation for MPEG-encoded video sequences
Francois et al. Depth-based segmentation
You et al. Moving object tracking in H.264/AVC bitstream
Marqués et al. Coding-oriented segmentation of video sequences
Benois-Pineau et al. Hierarchical segmentation of video sequences for content manipulation and adaptive coding
Zhou et al. Video object segmentation and tracking for content-based video coding
Tsagkatakis et al. A random projections model for object tracking under variable pose and multi-camera views
Gu et al. Tracking of multiple semantic video objects for internet applications
Chen et al. Progressive motion vector clustering for motion estimation and auxiliary tracking

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090506

Termination date: 20110921