CN101184242A

CN101184242A - Three-dimensional wavelet video coding algorithm based on multi-reference motion compensation

Info

Publication number: CN101184242A
Application number: CN 200710195424
Authority: CN
Inventors: 凃国防; 张磊; 张灿; 陈康; 吕述望
Original assignee: University of Chinese Academy of Sciences
Current assignee: Graduate School of CAS; University of Chinese Academy of Sciences
Priority date: 2007-11-28
Filing date: 2007-11-28
Publication date: 2008-05-21

Abstract

The invention relates to a three-dimensional wavelet video coding algorithm based on multi-reference motion compensation, in the field of wireless multimedia communications. The algorithm groups the frames of a video sequence dynamically according to the motion intensity of the sequence, and performs the one-dimensional temporal transform with one-dimensional multi-reference motion-compensated lifting temporal filtering. On the one hand, multiple reference frames are used for motion estimation and compensation; on the other hand, during motion compensation, for a given pixel x of the current frame, the mean value of the corresponding pixel in the reference frame and its surrounding pixels is used as the predicted value x'. These measures ensure the effectiveness of the temporal filtering and improve the coding efficiency of the whole system. After the motion-compensated temporal filtering, each frame undergoes a two-dimensional spatial wavelet transform using a lifting wavelet, and the transform coefficients are coded with the CCSDS two-dimensional standard coding algorithm; the coefficient code words and the motion vector code words are then organized together to form a quality-scalable code stream. The method can be applied to wireless and space video communications.

Description

3D wavelet video coding algorithm based on multi-reference motion compensation

Technical field

The present invention relates to video coding techniques in the field of wireless multimedia communications, and in particular to a 3D wavelet video coding algorithm based on multi-reference motion compensation.

Background technology

Compared with video coding based on the discrete cosine transform and with 2D wavelet video coding, 3D wavelet compression coding treats the time axis as a one-dimensional signal with certain statistical characteristics and decorrelates it with wavelets. It not only provides several kinds of scalability, such as frame rate and compression quality, but also, because it has no predictive-coding loop, avoids the error drift problem of predictive coding. This makes it well suited to scalable video coding for communication.

Typical 3D wavelet coding falls into two kinds. Coding without motion compensation does not take the temporal correlation between frames into account, so its coding efficiency suffers. Motion-compensated temporal filtering (MCTF), by contrast, does not apply the one-dimensional wavelet transform along the time direction to pixels at the same spatial position, but along the direction of the motion trajectory. Because pixels on the same motion trajectory are more strongly correlated, a temporal wavelet transform along the motion trajectory removes inter-frame correlation more effectively and improves coding efficiency. However, most current MCTF-based 3D wavelet coding algorithms use fixed grouping and single-frame motion estimation. These methods cannot match the motion intensity of the sequence well or detect accurate motion vectors, and therefore cannot remove the temporal redundancy of the sequence effectively.

To address the deficiencies of existing algorithms, the present invention proposes a scalable 3D wavelet video coding algorithm based on multi-reference motion compensation.

Summary of the invention

The 3D wavelet video coding algorithm based on multi-reference motion compensation groups the video sequence dynamically into groups of pictures (GOPs) according to the motion intensity of the sequence, performs the one-dimensional temporal transform with one-dimensional multi-reference motion-compensated lifting temporal filtering (MRMCLTF), and then applies a two-dimensional spatial wavelet transform to each frame. The coefficients are coded with the two-dimensional standard coding algorithm of the Consultative Committee for Space Data Systems (CCSDS) (CCSDS 122.0-B-1: Image Data Compression, Blue Book, Sept. 2005). Finally, the coefficient code words and the corresponding motion vector code words are organized together to form a quality-scalable code stream. The specific technical features are as follows:

1) Dynamic grouping

The dynamic grouping method uses the average motion vector to judge the motion intensity of the sequence and then groups the frames accordingly: each group contains fewer frames when the motion is intense and more frames when the motion is gentle.

2) Multi-reference motion-compensated lifting temporal filtering

Multi-reference motion estimation and compensation are used to perform the one-dimensional temporal wavelet transform. First, motion estimation is carried out against each of n reference frames. The first reference frame uses full search; for each of the remaining reference frames, the motion vector produced for the previous reference frame is used as the initial value, and the search is then confined to a small range around it to obtain an accurate motion vector. After motion estimation against the n frames is complete, n sets of motion vectors and their sums of absolute errors are available, and the motion vector with the minimum sum of absolute errors is selected as the final motion vector. When motion compensation is performed for any pixel x of the current frame, multi-pixel motion compensation is used: the weighted average of the corresponding pixel in the reference frame and its 8 neighbouring pixels serves as the predicted value x' of x. Finally, using the motion vectors obtained from the multi-frame motion estimation, the Haar lifting wavelet performs the one-dimensional temporal wavelet transform.

3) Wavelet coefficient coding

Each frame undergoes a two-dimensional spatial wavelet transform with the Daubechies 9/7 wavelet basis, and the coefficients are then coded with the CCSDS standard algorithm to obtain a quality-scalable code stream.

The beneficial effect of the 3D wavelet video coding algorithm based on multi-reference motion compensation is as follows: motion estimation and compensation with multiple reference frames yield more accurate motion information and predicted values, which improves the effectiveness of the temporal filtering and the coding efficiency of the whole system.

Description of drawings

Fig. 1: 3D wavelet coding structure based on multi-reference motion compensation

In Fig. 1: 1. video image sequence; 2. video frame grouping; 3. one GOP; 4. multi-reference motion-compensated temporal filtering; 5. GOP after one-dimensional temporal filtering; 6. Daubechies 9/7 two-dimensional spatial wavelet transform; 7. wavelet coefficients of all frames in the GOP; 8. CCSDS wavelet coding; 9. coded streams of all frames in the GOP; 10. code stream organization; 11. final coded stream.

Fig. 2: Motion-compensated temporal filtering structure (the detailed structure of 4 in Fig. 1)

In Fig. 2: 12. first-level temporal filtering; 13. four multi-reference frame motion estimations; 14. motion vectors of the first-level temporal filtering; 15. four Haar wavelet transforms; 16. four low-frequency L frames; 17. four high-frequency H frames; 18. second-level temporal filtering; 19. two multi-reference frame motion estimations; 20. motion vectors of the second-level temporal filtering; 21. two Haar wavelet transforms; 22. two low-frequency LL frames; 23. two high-frequency LH frames; 24. third-level temporal filtering; 25. one multi-reference frame motion estimation; 26. motion vectors of the third-level temporal filtering; 27. one Haar wavelet transform; 28. one low-frequency LLL frame; 29. one high-frequency LLH frame.

Fig. 3: Multi-reference frame motion estimation

In Fig. 3: 30. current frame; 31. first reference frame; 32. block-matching motion estimation; 33. second reference frame; 34. n-th reference frame; 35. motion vector and mean sum of absolute errors for the first reference frame; 36. motion vector and mean sum of absolute errors for the second reference frame; 37. motion vector and mean sum of absolute errors for the n-th reference frame; 38. multi-reference frame motion estimation results; 39. comparison of the mean sums of absolute errors; 40. decision unit; 41. final motion vector.

Fig. 3 is the detailed structure of the multi-reference frame motion estimation in 13, 19 or 25 of Fig. 2, described here for a given current frame 30. The inputs are the current frame 30 and the reference frames 31, 33 and 34; the output is the final motion vector 41.

Fig. 4: Multi-pixel motion compensation structure

In Fig. 4: 42. a pixel x in the current frame; 43. the corresponding pixel r; 44. the 8 pixels surrounding pixel r; 45. calculation of the weighted average of the 9 pixels; 46. predicted value x' of pixel x.

In Fig. 4, suppose that the current frame 30 yields the final motion vector 41 after multi-reference frame motion estimation and that this motion vector comes from reference frame 33; motion compensation is then carried out in reference frame 33.

Detailed description of the embodiments

The overall structure of 3D wavelet video coding based on multi-reference motion compensation is as follows:

As shown in Fig. 1, after the encoder receives the video image sequence 1, it first performs dynamic frame grouping to obtain a number of GOPs, such as GOP 3 in the video frame grouping 2. The multi-reference motion-compensated temporal filtering 4 applies three levels of one-dimensional temporal filtering to GOP 3 (its detailed structure is shown in Fig. 2). Each frame of the temporally filtered GOP 5 then undergoes a three-level wavelet transform using the Daubechies 9/7 two-dimensional spatial wavelet transform 6, giving the wavelet coefficients 7 of all frames in the GOP. The wavelet coefficients of each frame are coded with CCSDS wavelet coding 8 to obtain the coded streams 9 of all frames in the GOP, which are assembled into the complete coding structure in code stream organization 10; the final coded stream 11 is then output.

The implementation of each part is described below:

1) Dynamic grouping

The size of each GOP is determined by dynamic grouping. The first GOP is set to 8 frames. Spatial-domain block-matching motion estimation is used to obtain the motion vectors, and the mean absolute sum of motion vectors is calculated as follows:

$$MASMV = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left[ \, |mv_x(m, n)| + |mv_y(m, n)| \, \right] \qquad (1)$$

where MASMV is the mean absolute sum of motion vectors, M is the number of motion estimation blocks the image contains in the horizontal direction, and N is the number of motion estimation blocks it contains in the vertical direction. mv_x(m, n) and mv_y(m, n) are the motion offsets of block (m, n) in the x and y directions. The MASMV of the current group is used to decide the size of the next group, as follows:

$$GOP = \begin{cases} 4, & \text{if } T_0 \le MASMV \\ 8, & \text{if } T_1 < MASMV < T_0 \\ 16, & \text{if } T_1 \ge MASMV \end{cases} \qquad (2)$$

where T₀ and T₁ are decision thresholds. Suitable thresholds can be determined through extensive simulation experiments; the present invention takes T₀ = 3 and T₁ = 1.2.
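The grouping rule of equations (1) and (2) can be sketched as follows. This is a minimal illustration, not the inventors' implementation: the function names `masmv` and `next_gop_size` and the layout of the motion vector field (a grid of `(mv_x, mv_y)` tuples) are assumptions made for the example, while the thresholds are the values T₀ = 3 and T₁ = 1.2 given in the text.

```python
def masmv(mv_field):
    """Mean absolute sum of motion vectors, equation (1).

    mv_field is a grid (list of rows) of (mv_x, mv_y) block offsets.
    The grid layout is an assumption made for this sketch.
    """
    total = 0.0
    count = 0
    for row in mv_field:
        for mv_x, mv_y in row:
            total += abs(mv_x) + abs(mv_y)
            count += 1
    return total / count


def next_gop_size(value, t0=3.0, t1=1.2):
    """Size of the next GOP according to equation (2)."""
    if value >= t0:
        return 4      # intense motion: short group
    if value > t1:
        return 8      # moderate motion
    return 16         # gentle motion: long group


calm = [[(0, 0), (1, 0)], [(0, -1), (0, 0)]]    # MASMV = 2 / 4 = 0.5
busy = [[(4, -3), (2, 2)], [(-5, 1), (3, 0)]]   # MASMV = 20 / 4 = 5.0
print(next_gop_size(masmv(calm)), next_gop_size(masmv(busy)))  # prints 16 4
```

Because the decision for the next group uses the MASMV of the current group, the first GOP needs a fixed size, which the text sets to 8 frames.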

2) Multi-reference motion-compensated lifting temporal filtering

As shown in Fig. 1, the video frame grouping 2 produces several GOPs. Taking GOP 3 as an example, the multi-reference motion-compensated temporal filtering 4 proceeds as follows. As shown in Fig. 2, the one-dimensional temporal filtering (MCTF) is carried out in three levels. In the first-level temporal filtering 12, MCTF is applied to GOP 3: four multi-reference motion estimations 13 are performed first to obtain the motion vectors 14 of the first level, and these motion vectors are then used in four Haar wavelet transforms 15, which produce four low-frequency L frames 16 and four high-frequency H frames 17. Because the human eye is insensitive to high-frequency coefficients, the high-frequency frames can be discarded to reduce the frame rate; a single MCTF level, however, can only halve the rate. The four low-frequency L frames 16 therefore undergo second-level and third-level temporal filtering in 18 and 24 of Fig. 2, following the same procedure as the first-level temporal filtering 12. After the third level, one low-frequency LLL frame 28 and one high-frequency LLH frame 29 are obtained.
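The three-level structure of Fig. 2 can be illustrated with a small sketch. It makes several assumptions that the source does not state: motion alignment is left out (each frame pair is filtered pixel by pixel at the same position), the Haar lifting steps are written in the unnormalized form H = B - A, L = A + H/2, and frames are flat lists of samples. The names `haar_lift_pair` and `temporal_decompose` are hypothetical.

```python
def haar_lift_pair(a, b):
    """One Haar lifting step on two frames given as flat lists of samples."""
    high = [y - x for x, y in zip(a, b)]            # predict: H = B - A
    low = [x + h / 2.0 for x, h in zip(a, high)]    # update:  L = A + H/2
    return low, high


def temporal_decompose(frames, levels=3):
    """Split the low band repeatedly; returns (low frames, high bands per level).

    Assumes the GOP length is a power of two (4, 8 or 16 in the text).
    """
    low = list(frames)
    high_bands = []
    for _ in range(levels):
        if len(low) < 2:
            break  # e.g. a 4-frame GOP only supports two levels
        pairs = [haar_lift_pair(low[i], low[i + 1])
                 for i in range(0, len(low) - 1, 2)]
        high_bands.append([h for _, h in pairs])
        low = [l for l, _ in pairs]
    return low, high_bands


# Eight tiny 4-sample frames whose brightness rises by 10 per frame.
gop = [[float(10 * t + p) for p in range(4)] for t in range(8)]
lll, highs = temporal_decompose(gop)
print(len(lll), [len(band) for band in highs])   # prints 1 [4, 2, 1]
print(lll[0])                                    # prints [35.0, 36.0, 37.0, 38.0]
```

For an 8-frame GOP this reproduces the frame counts of Fig. 2: four H frames, two LH frames, one LLH frame and one LLL frame. In the motion-compensated version, the sample of frame A paired with each pixel of frame B would be taken along the motion trajectory rather than at the same position.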

3) Multi-reference frame motion estimation

The multi-reference motion-compensated temporal filtering described above uses multi-reference frame motion estimation. Taking one frame as an example, the process is shown in Fig. 3: block-matching motion estimation 32 is performed between the current frame 30 and, in turn, the first reference frame 31, the second reference frame 33, and so on up to the n-th reference frame 34, giving the multi-reference frame motion estimation results 38. These results comprise the motion vector and mean sum of absolute errors 35 for the first reference frame, the motion vector and mean sum of absolute errors 36 for the second reference frame, and so on up to the motion vector and mean sum of absolute errors 37 for the n-th reference frame. The first reference frame uses full search; for each of the remaining reference frames, the motion vector produced for the previous reference frame is used as the initial value, and the search is then confined to a small range around it to obtain an accurate motion vector. The mean sums of absolute errors are then compared in 39, and the decision unit 40 selects the motion vector with the minimum sum of absolute errors as the final motion vector 41.
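The search strategy of Fig. 3 can be sketched for a single block as follows. The code is a simplified illustration under stated assumptions: frames are 2D lists of integers with equal dimensions, the matching cost is the plain sum of absolute differences (SAD), motion vectors are written as `(dy, dx)`, and the search radii `full_radius` and `refine_radius` are hypothetical parameters, since the source only says that the refinement covers a small range.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block and its displaced match."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x]
                         - ref[top + dy + y][left + dx + x])
    return total


def search(cur, ref, top, left, size, center, radius):
    """Best (sad, (dy, dx)) within radius of center, skipping out-of-frame blocks."""
    best = None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            inside = (0 <= top + dy and top + dy + size <= len(ref)
                      and 0 <= left + dx and left + dx + size <= len(ref[0]))
            if not inside:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best


def multi_reference_estimate(cur, refs, top, left, size,
                             full_radius=4, refine_radius=1):
    """Return (reference index, motion vector, SAD) with the smallest SAD."""
    results = []
    start = (0, 0)
    for index, ref in enumerate(refs):
        # Full search for the first reference, small refinement afterwards.
        radius = full_radius if index == 0 else refine_radius
        cost, vector = search(cur, ref, top, left, size, start, radius)
        results.append((cost, index, vector))
        start = vector  # seed the next reference with this vector
    cost, index, vector = min(results)
    return index, vector, cost


def frame_with_square(row, col, value, height=8, width=8):
    """Test frame: zero background with a 2x2 square whose top-left is (row, col)."""
    frame = [[0] * width for _ in range(height)]
    for y in (row, row + 1):
        for x in (col, col + 1):
            frame[y][x] = value
    return frame


current = frame_with_square(3, 3, 100)
references = [frame_with_square(4, 4, 90), frame_with_square(4, 5, 100)]
print(multi_reference_estimate(current, references, top=2, left=2, size=4))
# prints (1, (1, 2), 0)
```

In this toy case the first reference matches at (1, 1) with a residual SAD of 40, the refinement around that vector finds an exact match at (1, 2) in the second reference, and the decision step therefore selects the second reference.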

4) Multi-pixel motion compensation

Multi-pixel motion compensation is used during motion compensation. Taking one pixel as an example, the detailed structure is shown in Fig. 4: for a pixel x 42 in the current frame, the corresponding pixel r 43 is located in the second reference frame 33 according to motion vector 41; the 8 pixels 44 surrounding pixel r are then found, and the weighted average 45 of the 9 pixels is calculated. The weighted average is given by

$$x' = \frac{1}{9} \sum_{j=0}^{8} \alpha_j r_j,$$

where α_j are the weights, and

$$\sum_{j=0}^{8} \alpha_j = 9.$$

Different weights can be selected according to the actual situation; the present invention takes α₀ = 3, α₁ = α₃ = α₆ = α₈ = 1/2, and α₂ = α₄ = α₅ = α₇ = 1. This finally yields the predicted value x' 46 of pixel x.
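A minimal sketch of the weighted prediction follows. The source gives the weight values but not the ordering of the neighbours, so the kernel layout is an assumption: r₀ is taken to be the matched pixel r and r₁ to r₈ its neighbours in raster order, which places the weight 1/2 on the four diagonal neighbours and the weight 1 on the four horizontal and vertical neighbours. Border clamping and the name `predict_pixel` are likewise assumptions.

```python
# Weights from the text: 3 for the matched pixel r, 1/2 and 1 for its neighbours.
# Assumed layout: raster-order neighbours, so the corners get 1/2 and the
# horizontal/vertical neighbours get 1. The nine weights sum to 9.
KERNEL = [
    [0.5, 1.0, 0.5],
    [1.0, 3.0, 1.0],
    [0.5, 1.0, 0.5],
]


def predict_pixel(ref, row, col, mv):
    """Predicted value x' for the current-frame pixel at (row, col).

    mv = (dy, dx) points at the matched pixel r in ref. Coordinates are
    clamped at the frame border (an assumption; the source does not say).
    """
    height, width = len(ref), len(ref[0])
    r_row, r_col = row + mv[0], col + mv[1]
    total = 0.0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            y = min(max(r_row + ky, 0), height - 1)
            x = min(max(r_col + kx, 0), width - 1)
            total += KERNEL[ky + 1][kx + 1] * ref[y][x]
    return total / 9.0


flat = [[50] * 5 for _ in range(5)]
print(predict_pixel(flat, 2, 2, (0, 1)))   # prints 50.0
```

Because the weights sum to 9 and the result is divided by 9, a flat region is predicted unchanged; elsewhere the prediction acts as a mild low-pass filter centred on the matched pixel.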

5) Two-dimensional spatial wavelet transform and scalable wavelet coding

After the one-dimensional temporal filtering, each temporally transformed frame undergoes a two-dimensional spatial wavelet transform with the Daubechies 9/7 lifting wavelet and is coded according to the CCSDS image coding standard. The resulting coefficient code words and the motion vector code words are then organized together to form the complete video coded stream.
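For orientation, one level of a one-dimensional 9/7 lifting transform can be sketched as below. The lifting coefficients are the commonly published factorization of the 9/7 filter pair and are not taken from the patent text or from the CCSDS document. The scaling convention, the mirrored boundary handling and the restriction to even-length signals are assumptions of this sketch, and the function names are hypothetical.

```python
# Commonly published 9/7 lifting coefficients (not quoted from the source).
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
ZETA = 1.149604398


def _predict(s, d, w):
    """d[i] += w * (s[i] + s[i+1]); the right neighbour is mirrored at the end."""
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += w * (s[i] + right)


def _update(s, d, w):
    """s[i] += w * (d[i-1] + d[i]); the left neighbour is mirrored at the start."""
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += w * (left + d[i])


def dwt97_forward(signal):
    """One analysis level; returns (low band, high band). Needs an even length."""
    s = [float(v) for v in signal[0::2]]
    d = [float(v) for v in signal[1::2]]
    _predict(s, d, ALPHA)
    _update(s, d, BETA)
    _predict(s, d, GAMMA)
    _update(s, d, DELTA)
    return [v * ZETA for v in s], [v / ZETA for v in d]


def dwt97_inverse(low, high):
    """Undo dwt97_forward by running the lifting steps backwards."""
    s = [v / ZETA for v in low]
    d = [v * ZETA for v in high]
    _update(s, d, -DELTA)
    _predict(s, d, -GAMMA)
    _update(s, d, -BETA)
    _predict(s, d, -ALPHA)
    out = [0.0] * (len(s) + len(d))
    out[0::2] = s
    out[1::2] = d
    return out


row = [10, 12, 14, 13, 11, 9, 8, 8]
low, high = dwt97_forward(row)
rebuilt = dwt97_inverse(low, high)
print(max(abs(a - b) for a, b in zip(row, rebuilt)) < 1e-9)   # prints True
```

A two-dimensional level would apply this to every row and then to every column of a frame, and the three-level transform described earlier would repeat the process on the low-low band.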

Claims

1. A 3D wavelet video coding algorithm based on multi-reference motion compensation, characterized in that: the video sequence is grouped dynamically according to its motion intensity; one-dimensional multi-reference motion-compensated lifting temporal filtering is used to perform the one-dimensional temporal transform; each frame then undergoes a two-dimensional spatial wavelet transform; and the two-dimensional CCSDS standard coding algorithm is used for coding, forming a quality-scalable code stream.

2. The 3D wavelet video coding algorithm based on multi-reference motion compensation according to claim 1, further characterized in that: the average motion vector is used to judge the motion intensity of the sequence, and the frames are then grouped according to that motion intensity, each group containing fewer frames when the motion is intense and more frames when the motion is gentle.

3. The 3D wavelet video coding algorithm based on multi-reference motion compensation according to claim 1, further characterized in that: multi-reference motion-compensated lifting temporal filtering is used to perform the one-dimensional temporal transform; multi-frame motion estimation is first carried out against each of n reference frames, the first reference frame using full search and each of the remaining reference frames using the motion vector produced for the previous reference frame as the initial value, followed by a search in a small range to obtain an accurate motion vector; after motion estimation against the n frames is complete, n sets of motion vectors and their sums of absolute errors are obtained, and the motion vector with the minimum sum of absolute errors is selected as the final motion vector; when motion compensation is performed for any pixel x of the current frame, multi-pixel motion compensation is used, namely the weighted average of the corresponding pixel in the reference frame and its 8 neighbouring pixels serves as the predicted value x' of x; finally, using the motion vectors obtained from the multi-frame motion estimation, the Haar lifting wavelet performs the one-dimensional temporal wavelet transform.

4. The 3D wavelet video coding algorithm based on multi-reference motion compensation according to claim 1, further characterized in that: each frame undergoes a two-dimensional spatial wavelet transform with the Daubechies 9/7 wavelet basis, and the coefficients are then coded with the CCSDS standard algorithm to obtain a quality-scalable code stream.