CN103188490A - Combination compensation mode in video coding process - Google Patents


Info

Publication number
CN103188490A
CN103188490A · CN201110458912A · CN 201110458912
Authority
CN
China
Prior art keywords
current block
motion vector
prediction
time domain
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110458912
Other languages
Chinese (zh)
Inventor
朱洪波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN 201110458912 priority Critical patent/CN103188490A/en
Publication of CN103188490A publication Critical patent/CN103188490A/en
Pending legal-status Critical Current

Abstract

The invention discloses an efficient motion compensation method for temporally predicted coding blocks in video coding. The method achieves bi-directional motion compensation by reusing the motion vector of the left or top neighboring block of the current coding block, so that no second motion vector needs to be transmitted; this reduces the bit rate and improves the efficiency of motion compensation.

Description

Merge compensation mode in video coding
Technical field
The invention belongs to the field of digital video compression, and relates specifically to temporally predictive motion-compensated coding of video signals.
Background technology
Digital video is obtained by sampling a continuous natural scene in both time and space. As shown in Figure 1, a digital video consists of a series of frames along the time axis; each frame represents a spatial sampling of the natural scene at a given instant and consists of a two-dimensional uniform grid of pixels. Each pixel is described by a set of numbers representing its brightness and color. In video coding, the most widely used representation is the YUV format, in which each pixel consists of one luminance component Y and two chrominance components U and V. The U and V components are usually downsampled by a factor of two both horizontally and vertically, so that every 4 neighboring pixels share one U and one V sample; this is the YUV 4:2:0 format.
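As an illustration of the 4:2:0 subsampling described above, the following minimal Python sketch (the function name is illustrative, not part of the disclosure) counts the samples in one frame:

```python
def yuv420_sample_counts(width, height):
    """Sample counts for one YUV 4:2:0 frame.

    Y is sampled at full resolution; U and V are each downsampled
    by 2 horizontally and vertically, so every 4 neighboring pixels
    share one U and one V sample.
    """
    y = width * height
    u = (width // 2) * (height // 2)
    v = (width // 2) * (height // 2)
    return y, u, v

# A 16x16 macroblock carries 256 Y samples but only 64 U and 64 V samples.
y, u, v = yuv420_sample_counts(16, 16)
```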
The most widely used video coding technique is block-based hybrid motion-compensated DCT video coding. As shown in Figure 2, the input frame is partitioned into 16 x 16 macroblocks, which are then encoded in order from left to right and top to bottom. For each input macroblock to be encoded, a prediction of the current block is first selected from a reconstructed frame and subtracted from the current block. The residual then undergoes DCT transform and quantization in turn; inverse quantization and inverse DCT yield the reconstructed macroblock, which is stored in the reconstructed frame sequence and used to generate prediction signals for subsequently encoded macroblocks. In practical prediction, a macroblock is often further divided into smaller 8 x 8 or 4 x 4 blocks for more accurate prediction.
In block-based hybrid motion-compensated video coding there are 3 different types of frames: I frames, P frames and B frames. In an I frame, only information from already-coded blocks within the current frame is used as the prediction of the current block. In a P frame, reconstructed frames whose display order precedes the current frame can also be used as the prediction of blocks to be encoded in the current frame. As shown in Figure 3, the frame displayed at time t is the current frame, and the black block is the current coding block. The frames displayed at times t-t0, t-2*t0 and t-3*t0 are reconstructed frames, in which the grey blocks with dotted borders are co-located with the current coding block. In a P frame, the frames displayed at t-t0, t-2*t0 and t-3*t0 may serve as the prediction of the current coding block. The motion estimation module searches the reconstructed frames near the co-located position for a block that matches the current block and uses it as the prediction. As shown in Figure 3, the current coding block uses motion vector MV0 to point to block BLK0 in the frame displayed at t-t0 as its prediction. The encoder writes MV0 into the bitstream, subtracts the prediction block from the current block to obtain the prediction residual, applies DCT transform and quantization in turn, writes the quantized coefficients into the bitstream, and after inverse quantization and inverse DCT adds the prediction back to obtain the reconstructed block, which is used for the prediction of subsequently coded blocks.
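The motion estimation step described above — searching a reconstructed frame near the co-located position for the best-matching block — can be sketched as an exhaustive SAD search. This is a minimal illustration with hypothetical names, not the encoder's actual implementation:

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(int) - block_b.astype(int)).sum())

def motion_search(cur, ref, bx, by, bs, rng):
    """Full search: find the displacement (dx, dy) within +/-rng that
    minimizes the SAD between the current block at (bx, by) of size bs
    and the displaced block in the reference frame."""
    h, w = ref.shape
    cur_blk = cur[by:by + bs, bx:bx + bs]
    best = (0, 0)
    best_cost = sad(cur_blk, ref[by:by + bs, bx:bx + bs])
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bs > w or y + bs > h:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur_blk, ref[y:y + bs, x:x + bs])
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost
```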
In a B frame, the current coding block may use reconstructed frames displayed before the current frame as prediction, and may also use reconstructed frames displayed after the current frame. As shown in Figures 4, 5, 6 and 7, the frame displayed at time t is the current frame, and the black block is the current coding block. The frames displayed at t-t0, t-2*t0 and t+t0 are reconstructed frames, in which the grey blocks with dotted borders are co-located with the current coding block; all of these frames may serve as the prediction of the current coding block. In Figure 4, motion vector MV0 points to the forward prediction block BLK0 of the current block. In Figure 5, motion vector MV1 points to the backward prediction block BLK1 of the current block. In Figure 6, motion vector MV points to the forward prediction block BLK0 while the opposite direction of MV points to the backward prediction block BLK1, and the average of BLK0 and BLK1 serves as the prediction of the current block. In Figure 7, motion vector MV0 points to the forward prediction block BLK0 and motion vector MV1 points to the backward prediction block BLK1, and the average of BLK0 and BLK1 serves as the prediction of the current block. In forward, backward and symmetric prediction, only one set of motion parameters (a motion vector and a reference picture) needs to be coded, whereas in the bi-predictive mode 2 sets of motion parameters must be coded.
In the theory of multi-hypothesis motion compensation, increasing the number of predictions improves prediction efficiency, but at the cost of coding more motion parameters. The invention describes a method of reusing motion parameters from neighboring blocks that increases prediction efficiency while paying a smaller cost for coding motion parameters.
Summary of the invention
The described temporal prediction method for coding P pictures comprises at least 3 temporal prediction modes: the unidirectional prediction mode, the left merge prediction mode and the top merge prediction mode. For each coding block in the current picture, the encoder evaluates the three modes separately and selects one of them as the candidate temporal prediction mode of the current block.
As shown in Figure 9, under the unidirectional prediction mode the current coding block points to its temporal prediction block through a single motion vector MV0. The encoder searches within a certain range for the best-matching block of the current block.
As shown in Figure 10, under the left merge prediction mode the temporal compensation of the current block is performed in two steps: one uses motion vector MV1 to point to block BLKR1, the other reuses the motion vector MVL of the left neighboring block to point to BLKL. The prediction of the current block is the arithmetic average of BLKR1 and BLKL. The encoder searches within a certain range for the optimal motion vector MV1 to obtain the best prediction of the current block.
As shown in Figure 11, under the top merge prediction mode the temporal compensation of the current block is performed in two steps: one uses motion vector MV2 to point to block BLKR2, the other reuses the motion vector MVT of the top neighboring block to point to BLKT. The prediction of the current block is the arithmetic average of BLKR2 and BLKT. The encoder searches within a certain range for the optimal motion vector MV2 to obtain the best prediction of the current block.
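Under both merge modes, the prediction is the arithmetic average of two compensated blocks. A minimal Python sketch of that averaging step (names are illustrative; the +1 implements integer rounding, matching the (A + B + 1)/2 term in the cost formulas of the embodiments):

```python
import numpy as np

def merge_prediction(blk_r, blk_n):
    """Arithmetic average of the two compensation signals.

    blk_r: block pointed to by the coded motion vector (MV1 or MV2).
    blk_n: block pointed to by the reused neighbor vector (MVL or MVT).
    Computed in a wider integer type to avoid overflow, with +1 for
    rounding to the nearest integer, then cast back to the input dtype.
    """
    a = blk_r.astype(np.int32)
    b = blk_n.astype(np.int32)
    return ((a + b + 1) >> 1).astype(blk_r.dtype)
```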
When the encoder decides among the above prediction modes, rate-distortion estimation is generally used. For the final coding mode decision, a simple rate-distortion cost comparison method may be used, or a more sophisticated one may be adopted. If the encoder has selected one of the 3 temporal modes, it must write information into the bitstream to distinguish which of the 3 it is. Whichever temporal prediction mode is adopted, the encoder obtains the prediction residual, applies DCT transform and quantization in turn to obtain the quantized coefficients, and writes them into the bitstream together with the single motion vector MVx (MV0 for the unidirectional prediction mode, MV1 for the left merge mode, MV2 for the top merge mode). After inverse quantization and inverse DCT, the prediction signal is added back to obtain the reconstruction of the current block, which is used for the prediction of subsequently coded blocks.
In a video decoder, the decoder first reads the coding mode of the current decoding block from the bitstream and determines whether it is one of the above 3 temporal prediction modes. If so, the decoder reads further information from the bitstream to obtain the single motion vector MVx and to determine which of the 3 modes it is. If it is the unidirectional temporal prediction mode, the prediction of the current block is obtained according to MVx. If it is the left merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVL of the left neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block. If it is the top merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVT of the top neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block.
Description of drawings
Fig. 1 is an example of digital video.
Fig. 2 is a block-based motion-compensated DCT video encoder.
Fig. 3 is motion compensation in a P picture.
Fig. 4 is forward motion compensation in a B picture.
Fig. 5 is backward motion compensation in a B picture.
Fig. 6 is symmetric motion compensation in a B picture.
Fig. 7 is bidirectional motion compensation in a B picture.
Fig. 8 is a schematic diagram of the current block and its left and top neighboring blocks in a P picture according to the present invention.
Fig. 9 is unidirectional (forward) prediction of the current block in a P picture.
Fig. 10 is left merge prediction of the current block in a P picture.
Fig. 11 is top merge prediction of the current block in a P picture.
Detailed description of the embodiments
In a video encoder, the positions of the current coding block in a P picture and its left and top neighboring blocks are as shown in Figure 8. The frame displayed at time t is the current frame, in which BLKC is the current coding block, black block BT is the top neighboring block of the current block, black block BL is the left neighboring block of the current block, and BT and BL are already-coded blocks. The frames displayed at t-t0, t-2*t0 and t-3*t0 are reference frames of the current P frame. In general P picture coding, a reference picture is not restricted to precede the current picture in time; it may also follow it. Block BT points to its prediction block BTP in the t-t0 frame through motion vector MVT; if the current block BLKC reuses the motion parameters of block BT, the prediction block of the current coding block is BLKT. Block BL points to its prediction block BLP in the t-2*t0 frame through motion vector MVL; if the current block BLKC reuses the motion parameters of block BL, the prediction block of the current coding block is BLKL. When coding or decoding the current block BLKC, blocks BL and BT have already been coded or decoded, so the motion parameters MVT and MVL are known. In the present invention, BLK_{i,j} denotes the pixel at coordinate (i, j) in block BLK, and Ω denotes the set of all coordinates of the current block.
In the present invention, the temporal prediction modes of the current block BLKC are divided into three kinds: the unidirectional prediction mode, the left merge prediction mode and the top merge prediction mode. At the encoder side, to obtain the best coding efficiency, the optimal one of the 3 coding modes must be selected as the coding mode of the current block BLKC. The unidirectional prediction mode is evaluated first. As shown in Figure 9, in the unidirectional prediction mode the encoder searches for MV0 in a reference frame of the current frame so as to minimize the following cost; the resulting MV0 is the motion vector of the unidirectional prediction mode.
JC_motion = Σ_{(i,j)∈Ω} abs(BLKC_{i,j} − BLKR0_{i,j}) + λ_motion × (rate_MV0 + rate_sglp)
where abs denotes the absolute value operation, λ_motion is the Lagrangian parameter, determined by the quantization parameter; rate_MV0 denotes the bit rate needed to code MV0, rate_sglp denotes the bit rate of signaling the unidirectional prediction mode, and block BLKR0 is the reference block pointed to by motion vector MV0.
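The cost above combines distortion (SAD) with a Lagrangian-weighted signaling rate. A minimal Python sketch of evaluating it for one candidate motion vector (the rate values passed in are placeholders, not derived from an actual entropy coder):

```python
import numpy as np

def rd_cost_unidirectional(cur_blk, ref_blk, lambda_motion, rate_mv0, rate_sglp):
    """JC_motion = SAD(current, reference) + lambda * (bits for MV0
    + bits for signaling the unidirectional mode)."""
    sad = int(np.abs(cur_blk.astype(int) - ref_blk.astype(int)).sum())
    return sad + lambda_motion * (rate_mv0 + rate_sglp)
```

The encoder would evaluate this cost for every candidate MV0 in the search range and keep the minimizer, as described above.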
As shown in Figure 10, in the left merge mode the prediction of the current block is the arithmetic average of blocks BLKL and BLKR1. Therefore, under the left merge mode the encoder searches for MV1 in the reference frame so as to minimize the following cost:
JL_motion = Σ_{(i,j)∈Ω} abs(BLKC_{i,j} − (BLKR1_{i,j} + BLKL_{i,j} + 1)/2) + λ_motion × (rate_MV1 + rate_lcbn)
where rate_lcbn is the bit rate of signaling the left merge mode, rate_MV1 is the bit rate of motion parameter MV1, and abs denotes the absolute value operation. Block BLKR1 is the reference block pointed to by motion vector MV1.
As shown in Figure 11, in the top merge mode the prediction of the current block is the arithmetic average of blocks BLKT and BLKR2. Therefore, under the top merge mode the encoder searches for MV2 in the reference frame so as to minimize the following cost:
JT_motion = Σ_{(i,j)∈Ω} abs(BLKC_{i,j} − (BLKR2_{i,j} + BLKT_{i,j} + 1)/2) + λ_motion × (rate_MV2 + rate_tcbn)
where rate_tcbn is the bit rate of signaling the top merge mode, rate_MV2 is the bit rate of motion parameter MV2, and abs denotes the absolute value operation. Block BLKR2 is the reference block pointed to by motion vector MV2.
After the three modes have been evaluated, one of them must be selected as the temporal prediction mode of the current block. In a simple encoder configuration, the values of JC_motion, JL_motion and JT_motion are compared directly, and the smallest determines the temporal prediction mode of the current block. A rate-distortion optimized method may also be adopted: for each mode, the residual is first obtained, then transformed, quantized, inverse-quantized and inverse-transformed to obtain the reconstruction error, and the total bit rate of entropy-coding the quantized coefficients is computed; the mode with the smallest rate-distortion cost is then selected. When a subsequent block needs to reference the motion vector of the current coding block during coding, the motion vector of the current block is defined as MVx.
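The simple encoder configuration described above reduces to picking the smallest of the three motion costs. A sketch with illustrative names:

```python
def select_temporal_mode(jc, jl, jt):
    """Simple configuration: choose the temporal prediction mode with
    the smallest motion cost among unidirectional (jc), left merge (jl)
    and top merge (jt)."""
    costs = {"unidirectional": jc, "left_merge": jl, "top_merge": jt}
    return min(costs, key=costs.get)
```

A rate-distortion optimized encoder would replace each of these costs with the full transform/quantize/entropy-code cost described above before comparing.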
In a practical encoder, besides the above three modes there may also be other modes such as the skip mode and intra coding modes, so after the above three-mode selection is finished, the block must also be compared against the other modes to select the best one. When the encoder selects one of the above three modes, it writes information into the bitstream to distinguish which prediction mode it is, and then codes the single motion parameter MVx (MV0 for the unidirectional prediction mode, MV1 for the left merge mode, MV2 for the top merge mode) and the residual information.
In a video decoder, the decoder reads information from the bitstream and determines whether the current block uses a temporal prediction mode. If so, it continues to read information from the bitstream to determine which of the 3 modes it is, then reads the motion parameter MVx and the residual information from the bitstream. If it is the left merge mode, the current block is motion-compensated twice, as shown in Figure 10: first a block BLKR1 is obtained according to MVx, then BLKL is obtained according to the motion vector MVL of the left neighboring block, and the prediction of the current block is the arithmetic average of BLKR1 and BLKL. If it is the top merge mode, the current block is motion-compensated twice, as shown in Figure 11: first a reference block BLKR2 is obtained according to MVx, then BLKT is obtained according to the motion vector MVT of the top neighboring block, and the prediction of the current block is the arithmetic average of BLKR2 and BLKT. The residual information is inverse-quantized and inverse-DCT-transformed, and adding the prediction yields the reconstruction of the current block. When a subsequent block needs to reference the motion vector of the current decoding block during decoding, the motion vector of the current block is defined as MVx.
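The decoder-side merge compensation described above can be sketched as follows; the (ref_index, dx, dy) motion-parameter layout and all names are assumptions for illustration only:

```python
import numpy as np

def decode_merge_block(ref_frames, mvx, mv_neighbor, bx, by, bs):
    """Decoder-side merge compensation: fetch the block addressed by the
    coded vector MVx and the block addressed by the reused neighbor
    vector (MVL or MVT), then return their rounded arithmetic average.

    Each motion parameter is a (ref_index, dx, dy) triple; (bx, by) is
    the block position and bs its size. Clipping/interpolation of
    out-of-range or sub-pel vectors is omitted for brevity.
    """
    def fetch(mv):
        ref_idx, dx, dy = mv
        ref = ref_frames[ref_idx]
        return ref[by + dy: by + dy + bs, bx + dx: bx + dx + bs].astype(np.int32)

    blk_a = fetch(mvx)          # compensation from the coded vector MVx
    blk_b = fetch(mv_neighbor)  # compensation from the reused neighbor vector
    return ((blk_a + blk_b + 1) >> 1).astype(ref_frames[0].dtype)
```

Adding the inverse-transformed residual to this prediction would then yield the reconstruction of the current block.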

Claims (8)

1. In a video encoder, for the current coding block in a P picture, one of three temporal prediction modes can be selected: the unidirectional temporal prediction mode, the left merge temporal prediction mode and the top merge temporal prediction mode. The encoder evaluates each mode by rate-distortion estimation, searching within pictures coded before the current picture in time, then selects the best mode by a rate-distortion cost function, and writes information into the bitstream to indicate which mode is used.
2. The method according to claim 1, characterized in that each of the above three modes writes one and only one motion vector MVx into the bitstream. When the unidirectional prediction mode is selected, the prediction of the current block is the block pointed to by MVx. When a merge prediction mode is selected: in the left merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVL of the left neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block; in the top merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVT of the top neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block.
3. The method according to claim 2, wherein when a subsequent block needs to reference the motion vector of the current block during coding, the motion vector of the current block is MVx.
4. The method according to claim 1, wherein in general P picture coding, a reference picture of the current P picture may precede or follow the current P picture in time.
5. In a video decoder, for the current decoding block in a P picture, one of three temporal prediction modes can be selected: the unidirectional temporal prediction mode, the left merge temporal prediction mode and the top merge temporal prediction mode. The decoder reads information from the bitstream and first determines whether the block uses one of the above 3 modes; if so, it reads further information from the bitstream to determine which of the 3 modes it is and to obtain the single motion vector MVx.
6. The method according to claim 5, characterized in that if the current block uses the unidirectional prediction mode, the prediction of the current block is the block pointed to by MVx. In the left merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVL of the left neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block. In the top merge mode, the current block is compensated twice, according to motion vector MVx and the motion vector MVT of the top neighboring block, and the arithmetic average of the two compensation signals is the prediction block of the current block.
7. The method according to claim 6, wherein when a subsequent block needs to reference the motion vector of the current block during decoding, the motion vector of the current block is MVx.
8. The method according to claim 5, wherein in general P picture decoding, a reference picture of the current P picture may precede or follow the current P picture in time.
CN 201110458912 2011-12-29 2011-12-29 Combination compensation mode in video coding process Pending CN103188490A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110458912 CN103188490A (en) 2011-12-29 2011-12-29 Combination compensation mode in video coding process


Publications (1)

Publication Number Publication Date
CN103188490A 2013-07-03

Family

ID=48679424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110458912 Pending CN103188490A (en) 2011-12-29 2011-12-29 Combination compensation mode in video coding process

Country Status (1)

Country Link
CN (1) CN103188490A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015010319A1 (en) * 2013-07-26 2015-01-29 北京大学深圳研究生院 P frame-based multi-hypothesis motion compensation encoding method
CN104769947A (en) * 2013-07-26 2015-07-08 北京大学深圳研究生院 P frame-based multi-hypothesis motion compensation encoding method
CN104769947B (en) * 2013-07-26 2019-02-26 北京大学深圳研究生院 A kind of more hypothesis motion compensation encoding methods based on P frame
CN113612994A (en) * 2016-03-15 2021-11-05 联发科技股份有限公司 Method and device for video coding and decoding with affine motion compensation
CN113612994B (en) * 2016-03-15 2023-10-27 寰发股份有限公司 Method for video coding and decoding with affine motion compensation
WO2019184556A1 (en) * 2018-03-29 2019-10-03 华为技术有限公司 Bidirectional inter-frame prediction method and device
CN112040244A (en) * 2018-03-29 2020-12-04 华为技术有限公司 Bidirectional interframe prediction method and device
CN112040244B (en) * 2018-03-29 2021-06-29 华为技术有限公司 Bidirectional interframe prediction method and device
US11350122B2 (en) 2018-03-29 2022-05-31 Huawei Technologies Co., Ltd. Bidirectional inter prediction method and apparatus
US11838535B2 (en) 2018-03-29 2023-12-05 Huawei Technologies Co., Ltd. Bidirectional inter prediction method and apparatus
US11924458B2 (en) 2018-03-29 2024-03-05 Huawei Technologies Co., Ltd. Bidirectional inter prediction method and apparatus


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130703